Finding Nash Equilibria of Two-Player Games
Bernhard von Stengel
Department of Mathematics, London School of Economics, London WC2A 2AE, United Kingdom. Email: [email protected]
February 9, 2021
Abstract
This paper is an exposition of algorithms for finding one or all equilibria of a bimatrix game (a two-player game in strategic form) in the style of a chapter in a graduate textbook. Using labeled "best-response polytopes", we present the Lemke-Howson algorithm that finds one equilibrium. We show that the path followed by this algorithm has a direction, and that the endpoints of the path have opposite index, in a canonical way using determinants. For reference, we prove that a number of notions of nondegeneracy of a bimatrix game are equivalent. The computation of all equilibria of a general bimatrix game, via a description of the maximal Nash subsets of the game, is canonically described using "complementary pairs" of faces of the best-response polytopes.

A bimatrix game is a two-player game in strategic form, specified by the two matrices of payoffs to the row player and column player. This article describes algorithms that find one or all Nash equilibria of such a game.

The game gives rise to two suitably labeled polytopes (described in Section 3), which help in finding its Nash equilibria. This geometric structure is also very informative and accessible, for example for the construction of 3 × […] $X$ and $Y$ into best-response regions, and the construction of $\tilde X$ and $\tilde Y$ in Section 4 and Figure 5, which extends $X \times Y$ with an "artificial equilibrium". In Section 5, we then give a more concise description using polytopes. Section 6 gives a canonical proof that the endpoints of Lemke-Howson paths have opposite index. The index is here defined in an elementary way using determinants (Definition 11). In Section 7, we show that a number of known definitions of nondegeneracy of a bimatrix game are in fact equivalent. Section 8 shows how to implement the Lemke-Howson algorithm by "complementary pivoting", even when the game is degenerate.
Section 9 describes the structure of Nash equilibria of a general bimatrix game.

An undergraduate text, in even more detailed style and avoiding advanced mathematical machinery, is von Stengel (2021). This article continues Chapter 9 of that book, with the proof of the direction of a Lemke-Howson path and the concept of the index of an equilibrium, and the detailed discussion of nondegeneracy.

Earlier expositions of this topic are von Stengel (2002), which gives additional historical references, and von Stengel (2007). Compared to these surveys, the following expository results are new:
• The definition of the index of a Nash equilibrium in a nondegenerate game, and the very canonical proof that opposite endpoints of Lemke-Howson paths have opposite index in Theorem 13. Essentially, this is a much more accessible version of the argument by Shapley (1974).
• The equivalent definitions of nondegeneracy in Theorem 14.
• A cleaner presentation of maximal Nash subsets, adapted from Avis, Rosenberg, Savani, and von Stengel (2010), in Proposition 17.
We use the following notation throughout. Let $(A, B)$ be an $m \times n$ bimatrix game, that is, $A$ and $B$ are $m \times n$ matrices of payoffs to the row player 1 and column player 2, respectively. This is a two-player game in strategic form (also called "normal form"), which is played by a simultaneous choice of a row $i$ by player 1 and column $j$ by player 2, who then receive the entries $a_{ij}$ of the matrix $A$, and $b_{ij}$ of $B$, as respective payoffs. The payoffs represent risk-neutral utilities, so when facing a probability distribution, the players want to maximize their expected payoff. These preferences do not depend on positive-affine transformations, so that $A$ and $B$ can be assumed to have nonnegative entries. In addition, as inputs to an algorithm they are assumed to be rationals or just integers.

All vectors are column vectors, so an $m$-vector $x$ (that is, an element of $\mathbb{R}^m$) is treated as an $m \times 1$ matrix with components $x_1, \ldots, x_m$. A scalar is treated as a $1 \times 1$ matrix. A mixed strategy $x$ for player 1 is a probability distribution on the rows of the game, written as an $m$-vector of probabilities. Similarly, a mixed strategy $y$ for player 2 is an $n$-vector of probabilities for playing the columns of the game. Let $\mathbf{0}$ be the all-zero vector and let $\mathbf{1}$ be the all-one vector of appropriate dimension. The transpose of any matrix $C$ is denoted by $C^\top$, so $\mathbf{1}^\top$ is the all-one row vector. Inequalities like $x \ge \mathbf{0}$ between two vectors hold for all components. Let $X$ and $Y$ be the mixed-strategy sets of the two players,
$$X = \{ x \in \mathbb{R}^m \mid x \ge \mathbf{0},\ \mathbf{1}^\top x = 1 \}, \qquad Y = \{ y \in \mathbb{R}^n \mid y \ge \mathbf{0},\ \mathbf{1}^\top y = 1 \}. \tag{1}$$
The support $\mathrm{supp}(z)$ of a mixed strategy $z$ is the set of pure strategies that have positive probability, so $\mathrm{supp}(z) = \{ k \mid z_k > 0 \}$.

A best response to the mixed strategy $y$ of player 2 is a mixed strategy $x$ of player 1 that maximizes his expected payoff $x^\top A y$. Similarly, a best response $y$ of player 2 to $x$ maximizes her expected payoff $x^\top B y$.
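The definitions above can be checked numerically on a small game. The following is a minimal sketch, not part of the original text: the matrices are taken to be those of the paper's $3 \times 2$ running example (an assumption, reconstructed here from the linear expressions that appear later in the text), and the helper names `support`, `row_payoffs`, `pure_best_responses` and `is_best_response` are hypothetical. The check for a best response compares $x^\top A y$ with the largest entry of $Ay$, which relies on the fact that a linear function on the simplex $X$ attains its maximum at a unit vector.

```python
from fractions import Fraction

# Assumed payoff matrices of the 3 x 2 running example (reconstructed values).
A = [[3, 3], [2, 5], [0, 6]]   # payoffs to player 1 (row player)
B = [[3, 2], [2, 6], [3, 1]]   # payoffs to player 2 (column player)

def support(z):
    """Pure strategies (numbered from 1) that have positive probability."""
    return {k + 1 for k, zk in enumerate(z) if zk > 0}

def row_payoffs(A, y):
    """Expected payoffs (Ay)_i to player 1 for each row i against y."""
    return [sum(a * yj for a, yj in zip(row, y)) for row in A]

def pure_best_responses(payoffs):
    """Pure strategies (numbered from 1) with maximum expected payoff."""
    best = max(payoffs)
    return {k + 1 for k, p in enumerate(payoffs) if p == best}

def is_best_response(A, x, y):
    """True if x maximizes x^T A y over X, using exact arithmetic."""
    payoffs = row_payoffs(A, y)
    return sum(xi * p for xi, p in zip(x, payoffs)) == max(payoffs)

y = (Fraction(1, 3), Fraction(2, 3))
x = (0, Fraction(1, 3), Fraction(2, 3))
print(row_payoffs(A, y), pure_best_responses(row_payoffs(A, y)))
print(support(x), is_best_response(A, x, y))
```

Exact `Fraction` arithmetic is used so that ties between expected payoffs are detected reliably, which floating-point comparison would not guarantee.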
A Nash equilibrium or just equilibrium is a pair $(x, y)$ of mixed strategies that are best responses to each other.

The following proposition states that a mixed strategy $x$ is a best response to an opponent strategy $y$ if and only if all pure strategies in its support are pure best responses to $y$. The same holds with the roles of the players exchanged.

Proposition 1 (Best response condition). Let $x$ and $y$ be mixed strategies of player 1 and 2, respectively. Then $x$ is a best response to $y$ if and only if for all $i = 1, \ldots, m$,
$$x_i > 0 \ \Rightarrow\ (Ay)_i = u = \max\{ (Ay)_k \mid k = 1, \ldots, m \}. \tag{2}$$

Proof. $(Ay)_i$ is the $i$th component of $Ay$, which is the expected payoff to player 1 when playing row $i$. Then, using $\mathbf{1}^\top x = 1$,
$$x^\top A y = \sum_{i=1}^m x_i (Ay)_i = \sum_{i=1}^m x_i \bigl(u - (u - (Ay)_i)\bigr) = u - \sum_{i=1}^m x_i \bigl(u - (Ay)_i\bigr).$$
So $x^\top A y \le u$ because $x_i \ge 0$ and $u - (Ay)_i \ge 0$ for all $i = 1, \ldots, m$, and $x^\top A y = u$ if and only if $x_i > 0$ implies $(Ay)_i = u$, as claimed.

Proposition 1 is useful in a number of respects. First, by definition, $x$ is a best response to $y$ if and only if $x^\top A y \ge \hat x^\top A y$ for all other mixed strategies $\hat x$ in $X$ of player 1, where $X$ is an infinite set. In contrast, (2) is a finite condition, which only concerns the pure strategies $i$ of player 1, which have to give maximum payoff $(Ay)_i$ whenever $x_i > 0$. For example, in the $3 \times 2$ game
$$A = \begin{pmatrix} 3 & 3 \\ 2 & 5 \\ 0 & 6 \end{pmatrix}, \qquad B = \begin{pmatrix} 3 & 2 \\ 2 & 6 \\ 3 & 1 \end{pmatrix}, \tag{3}$$
if $y = (\tfrac13, \tfrac23)^\top$, then $Ay = (3, 4, 4)^\top$. (From now on, we omit for brevity the transposition when writing down specific vectors, as in $y = (\tfrac13, \tfrac23)$.) Then player 1's pure best responses against $y$ are the second and third row, and $x = (x_1, x_2, x_3) \in X$ is a best response to $y$ if and only if $x_1 = 0$. In order for $x$ to be a best response against $y$, the pure best responses 2 and 3 can be played with arbitrary probabilities $x_2$ and $x_3$. (As part of an equilibrium these probabilities will here be unique in order to ensure the best response condition for the other player.) Second, as the proof of Proposition 1 shows, mixing cannot improve the payoff of a player (here of player 1), which is just a "weighted average" of the expected payoffs $(Ay)_i$ with the weights $x_i$ for the rows $i$. This payoff is maximal only if only the maximum pure-strategy payoffs $(Ay)_i$ have positive weight.

We denote by $\mathrm{bestresp}(z)$ the set of pure best responses of a player against a mixed strategy $z$ of the other player, so $\mathrm{bestresp}(y) \subseteq \{1, \ldots, m\}$ if $y \in Y$ and $\mathrm{bestresp}(x) \subseteq \{1, \ldots, n\}$ if $x \in X$. Then (2) states that $x$ is a best response to $y$ if and only if
$$\mathrm{supp}(x) \subseteq \mathrm{bestresp}(y). \tag{4}$$
This condition applies also to games with any finite number of players: If $s$ is any one of $N$ players who plays the mixed strategy $x^s$, with the tuple of the $N - 1$ mixed strategies of the other players denoted by $x^{-s}$, then $x^s$ is a best response against $x^{-s}$ if and only if
$$\mathrm{supp}(x^s) \subseteq \mathrm{bestresp}(x^{-s}). \tag{5}$$
The proof of Proposition 1 still applies, where instead of $(Ay)_i$ for a pure strategy $i$ of player $s$ one has to use the expected payoff to player $s$ when he uses strategy $i$ against the tuple $x^{-s}$ of mixed strategies of the other players. For more than two players, $N > 2$, that expected payoff involves products of the mixed strategy probabilities $x^r$ for the other players $r$ in $N - \{s\}$ and is therefore nonlinear. The resulting polynomial equations and inequalities make the structure and computation of Nash equilibria for such games much more complicated than for two players, where the expected payoffs $(Ay)_i$ are linear in the opponent's mixed strategy $y$. We consider only two-player games here.

Proposition 1 is used in algorithms that find Nash equilibria of the game. One such approach is to consider the different possible supports of mixed strategies. All pure strategies in the support must have maximum, and hence equal, expected payoff to that player. This leads to equations for the probabilities of the opponent's mixed strategy. In the above example (3), the mixed strategy $y = (\tfrac13, \tfrac23)$ has any $x = (0, x_2, x_3) \ge \mathbf{0}$ with $x_2 + x_3 = 1$ as a best response. In order for $y$ to be a best response against such an $x$, the two columns have to have maximal and hence equal payoff to player 2, that is, $2x_2 + 3x_3 = 6x_2 + x_3$, which has the unique solution $x_2 = \tfrac13$, $x_3 = \tfrac23$ and expected payoff $\tfrac83$ to player 2. Hence, $(x, y)$ is an equilibrium, which we denote for later reference by $(a, c)$,
$$(a, c) = \bigl((0, \tfrac13, \tfrac23),\ (\tfrac13, \tfrac23)\bigr). \tag{6}$$
Here the mixed strategy $y$ of player 2 is uniquely determined by the condition $2y_1 + 5y_2 = 0y_1 + 6y_2$ that the two bottom rows give equal expected payoff to player 1.

A second mixed equilibrium $(x, y)$ is given if the support of player 1's strategy consists of the first two rows, which gives the equation $3y_1 + 3y_2 = 2y_1 + 5y_2$ with the unique solution $y = (\tfrac23, \tfrac13)$ and thus $Ay = (3, 3, 2)$. With $x = (x_1, x_2, 0)$ the equal payoffs to player 2 for her two columns give the equation $3x_1 + 2x_2 = 2x_1 + 6x_2$ with unique solution $x = (\tfrac45, \tfrac15, 0)$. Then $(x, y)$ is an equilibrium, for later reference denoted by $(b, d)$,
$$(b, d) = \bigl((\tfrac45, \tfrac15, 0),\ (\tfrac23, \tfrac13)\bigr). \tag{7}$$
A third, pure-strategy Nash equilibrium of the game is $((1, 0, 0), (1, 0))$.

The support set $\{1, 3\}$ for the mixed strategy of player 1 does not lead to an equilibrium, for two reasons.
First, player 2 would have to play $y = (\tfrac12, \tfrac12)$ to make player 1 indifferent between row 1 and row 3. But then the vector of expected payoffs to player 1 is $Ay = (3, \tfrac72, 3)$, so that rows 1 and 3 give the same payoff to player 1 but not the maximum payoff for all rows. Second, player 2 needs to be indifferent between her two strategies (because player 1's best response to a pure strategy is unique and cannot have the support $\{1, 3\}$). The corresponding equation $3x_1 + 3x_3 = 2x_1 + x_3$ (together with $x_1 + x_3 = 1$) has the solution $x_1 = 2$, $x_3 = -1$, so $x$ is not a vector of probabilities.

In this "support testing" method, it normally suffices to consider supports of equal size for the two players. For example, in (3) it is not necessary to consider a mixed strategy $x$ of player 1 where all three pure strategies have positive probability, because player 1 would then have to be indifferent between all these. However, a mixed strategy $y$ of player 2 is already uniquely determined by equalizing the expected payoffs for two rows, and then the payoff for the remaining row is already different. This is the typical, "nondegenerate" case, according to the following definition.

Definition 2. A two-player game is called nondegenerate if no mixed strategy $z$ of either player of support size $k$ has more than $k$ pure best responses, that is, $|\mathrm{bestresp}(z)| \le |\mathrm{supp}(z)|$.

In a degenerate game, Definition 2 is violated, for example if there is a pure strategy that has two pure best responses. For the moment, we only consider nondegenerate games, where the players' equilibrium strategies have equal sized support, which is immediate from Proposition 1:

Proposition 3.
In any Nash equilibrium $(x, y)$ of a nondegenerate bimatrix game, $x$ and $y$ have supports of equal size.

Proof. Condition (4), and the analogous condition $\mathrm{supp}(y) \subseteq \mathrm{bestresp}(x)$, together with nondegeneracy (which gives the second and fourth inequality), give
$$|\mathrm{supp}(x)| \le |\mathrm{bestresp}(y)| \le |\mathrm{supp}(y)| \le |\mathrm{bestresp}(x)| \le |\mathrm{supp}(x)|$$
so we have equality throughout.

The "support testing" algorithm for finding equilibria of a nondegenerate bimatrix game considers any two equal-sized supports of a potential equilibrium, equalizes their payoffs $u$ and $v$, and then checks whether $x$ and $y$ are mixed strategies and $u$ and $v$ are maximal payoffs.

Algorithm 4 (Equilibria by support enumeration).
Input: An $m \times n$ bimatrix game $(A, B)$ that is nondegenerate.
Output: All Nash equilibria of the game.
Method: For each $k = 1, \ldots, \min\{m, n\}$ and each pair $(I, J)$ of $k$-sized sets of pure strategies for the two players, solve (with unknowns $x, v, y, u$) the equations $\sum_{i \in I} x_i b_{ij} = v$ for $j \in J$, $\sum_{i \in I} x_i = 1$, $\sum_{j \in J} a_{ij} y_j = u$ for $i \in I$, $\sum_{j \in J} y_j = 1$, and subsequently check that $x \ge \mathbf{0}$, $y \ge \mathbf{0}$, and that (2) holds for $x$ and analogously $y$. If so, output $(x, y)$.

The linear equations considered in this algorithm may not have solutions, which then means that there is no equilibrium for that support pair. Nonunique solutions can occur for degenerate games, which have underdetermined systems of linear equations for equalizing the opponent's expected payoffs (see Theorem 14(f) below).

Algorithm 4 can be improved because equal payoffs for the pure strategies in a potential equilibrium support do not imply that these payoffs are also optimal, for example against the mixed strategy $y = (\tfrac12, \tfrac12)$ in example (3). By using suitable linear inequalities, one can capture this additional condition automatically. This gives rise to "best-response polyhedra", which have equivalent descriptions via "best-response regions" and "best-response polytopes".

In this geometric approach, mixed strategies $x$ and $y$ are considered as points in the respective mixed strategy "simplex" $X$ or $Y$ in (1). We use the following notions from convex geometry. An affine combination of points $z^1, \ldots, z^k$ in some Euclidean space is of the form $\sum_{i=1}^k z^i \lambda_i$ where $\lambda_1, \ldots, \lambda_k$ are reals with $\sum_{i=1}^k \lambda_i = 1$. It is called a convex combination if $\lambda_i \ge 0$ for all $i$. A set of points is convex if it is closed under forming convex combinations. The convex hull of a set of points is the smallest convex set that contains all these points. Given points are affinely independent if none of these points is an affine combination of the others. A convex set has dimension $d$ if and only if it has $d + 1$, but no more, affinely independent points. A simplex is the convex hull of a set of affinely independent points. The $k$th unit vector has its $k$th component equal to one and all other components equal to zero. The mixed strategy simplex $X$ of player 1 in (1) is the convex hull of the $m$ unit vectors in $\mathbb{R}^m$ (and has dimension $m - 1$), and $Y$ is the convex hull of the $n$ unit vectors in $\mathbb{R}^n$ (and has dimension $n - 1$).

In the $3 \times 2$ game (3), $Y$ is the line segment that connects the unit vectors $(1, 0)$ and $(0, 1)$, whose convex combinations $(y_1, y_2)$ are the mixed strategies of player 2. The resulting expected payoffs to player 1 for his three pure strategies are given by $3y_1 + 3y_2$, $2y_1 + 5y_2$, and $0y_1 + 6y_2$. The maximum of these three linear expressions in $(y_1, y_2)$ defines the upper envelope of player 1's expected payoffs, shown in bold in Figure 1. This picture shows that row 1 is a best response if $y_2 \in [0, \tfrac13]$, row 2 is a best response if $y_2 \in [\tfrac13, \tfrac23]$, and row 3 is a best response if $y_2 \in [\tfrac23, 1]$. The sets of mixed strategies $y$ corresponding to these three intervals are labeled with the pure strategies 1, 2, 3. Point $d = (\tfrac23, \tfrac13)$ has two labels 1 and 2, which are the two pure best responses of player 1. Similarly, point $c = (\tfrac13, \tfrac23)$ has the two labels 2 and 3 as best responses. The picture shows also that for $y = (\tfrac12, \tfrac12)$ the two pure strategies 1 and 3 have equal expected payoff, but the label of this point $y$ is 2 because its (unique) best response, row 2, has higher payoff.

Figure 1
Upper envelope of expected payoffs to player 1, as a function of the mixed strategy $y$ of player 2, for the game (3).

We label the pure strategies of the two players uniquely by giving label $i$ to each row $i = 1, \ldots, m$, and label $m + j$ to each column $j = 1, \ldots, n$. In our $3 \times 2$ example, the two columns therefore have labels 4 and 5. Figure 2 shows the upper envelope of expected payoffs to player 2 as a function of the mixed strategy $x \in X$ of player 1; note that $X$ is a triangle. As found earlier in (6) and (7), for the points $a = (0, \tfrac13, \tfrac23)$ and $b = (\tfrac45, \tfrac15, 0)$ in $X$ both columns have equal expected payoffs to player 2. This is also the case for any convex combination of $a$ and $b$, that is, any point on the line segment that connects $a$ and $b$. This line segment is common to the two best-response regions that otherwise partition $X$, namely the best-response region for the first column (with label 4) that is the convex hull of the points $(1, 0, 0)$, $b$, $a$, and $(0, 0, 1)$, and the best-response region for the second column (with label 5) which is the convex hull of the points $(0, 1, 0)$, $a$, and $b$. Both regions are shown in Figure 2.

Figure 2
Perspective drawing of the upper envelope of expected payoffs to player 2, as a function of the mixed strategy $x$ of player 1, for the game (3).

Figure 3
The mixed strategy sets $X$ and $Y$ with labels of pure best responses of the other player, and own labels where a pure strategy has probability zero.

The two strategy sets $X$ and $Y$ with their subdivision into best-response regions for the pure strategies of the other player are now given additional labels at their boundaries. Namely, a point $x$ in $X$ gets label $i$ in $\{1, \ldots, m\}$ if $x_i = 0$, and a point $y$ in $Y$ gets label $m + j$ in $\{m + 1, \ldots, m + n\}$ if $y_j = 0$. That is, the "outside labels" correspond to a player's own pure strategies that are played with probability zero. Figure 3 shows this for the example (3). A point may have several labels of a player, if it has multiple best responses or more than one own strategy that has probability zero. For example, $x = (1, 0, 0)$ has the three labels 2, 3, 4. The points in $X$ that have three labels, and the points in $Y$ that have two labels, are marked as dots in Figure 3. Apart from the unit vectors that are the vertices (corners) of $X$ and $Y$, these are the points $a$ and $b$ in $X$ and $c$ and $d$ in $Y$. With these labels, an equilibrium is any completely labeled pair $(x, y)$, that is, every label in $\{1, \ldots, m + n\}$ is a label of $x$ or of $y$, as the next proposition asserts.

Proposition 5.
Let $(x, y) \in X \times Y$ for an $m \times n$ bimatrix game $(A, B)$. Then $(x, y)$ is a Nash equilibrium of $(A, B)$ if and only if $(x, y)$ is completely labeled.

Proof. A missing label would represent a pure strategy of either player that is not a pure best response but has positive probability, which is exactly what is not allowed in an equilibrium according to Proposition 1.

The advantage of this condition is that it is purely combinatorial and just depends on the labels but not on the exact position of the dots in the diagrams in Figure 3. There, because a completely labeled pair $(x, y)$ requires all five labels, three of these must be labels of $x$ and two must be labels of $y$, so it suffices to consider the finitely many points with these properties. In $Y$, there are only four points $y$ that have two labels. The first is $(1, 0)$, which has labels 1 and 5. There is indeed a point in $X$ which has the other labels 2, 3, 4, namely $(1, 0, 0)$, so $((1, 0, 0), (1, 0))$ is an equilibrium. Point $d = (\tfrac23, \tfrac13)$ in $Y$ has labels 1 and 2, and point $b$ in $X$ has the other labels 3, 4, 5, so $(b, d)$ is another equilibrium, in agreement with (7). Point $c = (\tfrac13, \tfrac23)$ in $Y$ has labels 2 and 3, and point $a$ in $X$ has the other labels 1, 4, 5, so $(a, c)$ is a third equilibrium, in agreement with (6). Finally, point $(0, 1)$ in $Y$ has labels 3 and 4, but there is no point in $X$ that has the remaining labels 1, 2, 5, so there is no equilibrium where player 2 plays $(0, 1)$. This suffices to identify all equilibria. (The remaining points $(0, 1, 0)$ and $(0, 0, 1)$ of $X$ have three labels, but neither of them has a corresponding point in $Y$ that has the other two labels.)

In the above example, no point in $X$ has more than three labels, and no point in $Y$ has more than two labels. In general, this is equivalent to the nondegeneracy of the game.

Proposition 6. An $m \times n$ bimatrix game is nondegenerate if and only if no $x$ in $X$ has more than $m$ labels, and no $y$ in $Y$ has more than $n$ labels.

Proof. Let $x \in X$. The labels of $x$ are the $|\mathrm{bestresp}(x)|$ pure best responses to $x$ and player 1's own strategies $i$ where $x_i = 0$, where the number of the latter is $m - |\mathrm{supp}(x)|$. So if the game is degenerate because $|\mathrm{bestresp}(x)| > |\mathrm{supp}(x)|$, this is equivalent to $|\mathrm{bestresp}(x)| + m - |\mathrm{supp}(x)| > m$, that is, $x$ having more than $m$ labels. Similarly, $y$ in $Y$ has more than $|\mathrm{supp}(y)|$ pure best responses if and only if $y$ has more than $n$ labels. If this is never the case, the game is nondegenerate.

We need further concepts about polyhedra and polytopes. A polyhedron $P$ in $\mathbb{R}^d$ is a set $\{ z \in \mathbb{R}^d \mid Cz \le q \}$ for some matrix $C$ and vector $q$. It is called full-dimensional if it has dimension $d$. It is called a polytope if it is bounded. A face of $P$ is a set $\{ z \in P \mid c^\top z = q_0 \}$ for some $c \in \mathbb{R}^d$ and $q_0 \in \mathbb{R}$ so that the inequality $c^\top z \le q_0$ is valid for $P$, that is, holds for all $z$ in $P$. A vertex of $P$ is the unique element of a 0-dimensional face of $P$. An edge of $P$ is a one-dimensional face of $P$. A facet of a $d$-dimensional polyhedron $P$ is a face of dimension $d - 1$. It can be shown that any nonempty face $F$ of $P$ can be obtained by turning some of the inequalities that define $P$ into equalities, which are then called binding inequalities. That is, $F = \{ z \in P \mid c_i^\top z = q_i,\ i \in I \}$, where $c_i^\top z \le q_i$ for $i \in I$ are some of the rows in $Cz \le q$. A facet is characterized by a single binding inequality $c_i^\top z \le q_i$ which is irredundant, that is (after omitting any equivalent inequality), the inequality cannot be omitted without changing the polyhedron; the vector $c_i$ is called the normal vector of the facet. A $d$-dimensional polyhedron $P$ is called simple if no point belongs to more than $d$ facets of $P$, which is true if there are no special dependencies between the facet-defining inequalities.

The subdivision of $X$ and $Y$ into best-response regions as shown in the example in Figure 3 can be nicely visualized for small games with up to four strategies per player, because then $X$ and $Y$ have dimension at most three. If the payoff matrix $A$ in the game $(A, B)$ has rows $a_1, \ldots, a_m$, then the best-response region for player 1's strategy $i$ is the set $\{ y \in Y \mid a_i y \ge a_k y,\ k = 1, \ldots, m \}$, which is a polytope since $Y$ is bounded. However, for general $m \times n$ games, the subdivision of $Y$ and $X$ into best-response regions has more structure by taking into account, as an additional dimension, the payoffs $u$ and $v$ to player 1 and 2. In Figure 1, the upper envelope of expected payoffs to player 1 is obtained by the smallest $u$ for the points $(y, u)$ in $Y \times \mathbb{R}$ so that $3y_1 + 3y_2 \le u$, $2y_1 + 5y_2 \le u$, $0y_1 + 6y_2 \le u$, or in general $Ay \le \mathbf{1}u$. Similarly, Figure 2 shows the smallest $v$ for $(x, v)$ in $X \times \mathbb{R}$ with $3x_1 + 2x_2 + 3x_3 \le v$ and $2x_1 + 6x_2 + x_3 \le v$, or in general $B^\top x \le \mathbf{1}v$. The best-response polyhedron of a player is the set of that player's mixed strategies together with the upper envelope of expected payoffs (and any larger payoffs) to the other player.
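As a small numerical illustration of the upper envelope for player 1 in the $3 \times 2$ example, the following sketch parametrizes $Y$ by $t = y_2$, so that $y = (1 - t, t)$, and finds the points where two rows tie on the upper envelope. The matrix entries are an assumption (reconstructed from the linear expressions in the text), the function names are hypothetical, and the approach only covers games with two columns.

```python
from fractions import Fraction
from itertools import combinations

# Assumed payoff matrix of player 1 in the 3 x 2 example (reconstructed values).
A = [[3, 3], [2, 5], [0, 6]]

def payoffs(A, t):
    """Expected payoffs (Ay)_i for the mixed strategy y = (1 - t, t)."""
    return [row[0] * (1 - t) + row[1] * t for row in A]

def envelope_breakpoints(A):
    """Values t = y_2 in [0, 1] where two rows tie on the upper envelope.

    Returns a sorted list of pairs (t, labels), where labels are the two
    tied rows, numbered from 1.
    """
    result = []
    for i, k in combinations(range(len(A)), 2):
        slope_i = A[i][1] - A[i][0]
        slope_k = A[k][1] - A[k][0]
        if slope_i == slope_k:
            continue  # parallel payoff lines have no single crossing point
        # Solve A[i][0] + slope_i * t = A[k][0] + slope_k * t for t.
        t = Fraction(A[k][0] - A[i][0], slope_i - slope_k)
        if not 0 <= t <= 1:
            continue
        p = payoffs(A, t)
        if p[i] == max(p):  # keep the tie only if it lies on the envelope
            result.append((t, {i + 1, k + 1}))
    return sorted(result, key=lambda item: item[0])

for t, labels in envelope_breakpoints(A):
    print("y =", (1 - t, t), "rows tied on the envelope:", sorted(labels))
```

Under the assumed matrix, rows 1 and 3 tie at $t = \tfrac12$, but that tie is discarded because row 2 gives a higher payoff there, which matches the observation in the text about the point $(\tfrac12, \tfrac12)$.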
The best-response polyhedra $\overline{P}$ and $\overline{Q}$ of players 1 and 2 are therefore
$$\overline{P} = \{ (x, v) \in \mathbb{R}^m \times \mathbb{R} \mid x \ge \mathbf{0},\ \mathbf{1}^\top x = 1,\ B^\top x \le \mathbf{1}v \}, \qquad \overline{Q} = \{ (y, u) \in \mathbb{R}^n \times \mathbb{R} \mid Ay \le \mathbf{1}u,\ y \ge \mathbf{0},\ \mathbf{1}^\top y = 1 \}. \tag{8}$$
Both polyhedra are defined by $m + n$ inequalities (and one additional equation). Whenever one of these inequalities is binding, we give it the corresponding label in $\{1, \ldots, m + n\}$. For example, if in the example (3) the inequality $3x_1 + 2x_2 + 3x_3 \le v$ of $\overline{P}$ is binding, that is, $3x_1 + 2x_2 + 3x_3 = v$, this means that the first pure strategy of player 2, which has label 4, is a best response. The best-response region with label 4 is therefore the facet of $\overline{P}$ for this binding inequality, projected to the mixed strategy set $X$ by ignoring the payoff $v$ to player 2, as seen in Figure 2. Facets of polyhedra are easier to deal with than subdivisions of mixed-strategy simplices into best-response regions.

The binding inequalities of any $(x, v)$ in $\overline{P}$ and $(y, u)$ in $\overline{Q}$ define labels as before, so that equilibria $(x, y)$ are again identified as completely labeled pairs in $X \times Y$. The corresponding payoffs $v$ and $u$ are then on the respective upper envelope (that is, smallest), for the following reason: For any $x$ in $X$, at least one component $x_i$ of $x$ is nonzero, so in an equilibrium label $i$ must appear as a best response to $y$, which means that the $i$th inequality in $Ay \le \mathbf{1}u$ is binding, that is, $(Ay)_i = u$, so $u$ is on the upper envelope of expected payoffs in $\overline{Q}$ as claimed; the analogous statement holds for any nonzero component $y_j$ of $y$ with label $m + j$.

The polyhedra $\overline{P}$ and $\overline{Q}$ in (8) can be simplified by eliminating the payoff variables $u$ and $v$, by defining the following polyhedra:
$$P = \{ x \in \mathbb{R}^m \mid x \ge \mathbf{0},\ B^\top x \le \mathbf{1} \}, \qquad Q = \{ y \in \mathbb{R}^n \mid Ay \le \mathbf{1},\ y \ge \mathbf{0} \}. \tag{9}$$
We want $P$ and $Q$ to be polytopes, which is equivalent to $v > 0$ and $u > 0$ for any $(x, v) \in \overline{P}$ and $(y, u) \in \overline{Q}$, according to the following proposition.

Proposition 7. Consider a bimatrix game $(A, B)$. Then $P$ in (9) is a polytope if and only if the best-response payoff to any $x$ in $X$ is always positive, and $Q$ in (9) is a polytope if and only if the best-response payoff to any $y$ in $Y$ is always positive.

Proof. We prove the statement for $Q$; the proof for $P$ is analogous. The best-response payoff to any mixed strategy $y$ is the maximum entry of $Ay$, so this is not always positive if and only if $Ay \le \mathbf{0}$ for some $y \in Y$. For such a $y$ we have $y \ge \mathbf{0}$, $y \ne \mathbf{0}$, and $y\alpha \in Q$ for any $\alpha \ge 0$, which shows that $Q$ is not bounded. Conversely, suppose the best-response payoff $u$ to any $y$ is always positive. Because $Y$ is compact and $\overline{Q}$ is closed, the minimum $u'$ of $\{ u \mid \exists y : (y, u) \in \overline{Q} \}$ exists, $u' > 0$, and $u \ge u'$ for all $(y, u)$ in $\overline{Q}$. Then the map
$$\overline{Q} \to Q - \{\mathbf{0}\}, \qquad (y, u) \mapsto y \cdot (1/u) \tag{10}$$
is a bijection with inverse $z \mapsto \bigl(z \cdot (1/\mathbf{1}^\top z),\ 1/\mathbf{1}^\top z\bigr)$ for $z \in Q - \{\mathbf{0}\}$. Here, $1/\mathbf{1}^\top z \ge u'$ and thus $\mathbf{1}^\top z \le 1/u'$, where $\mathbf{1}^\top z$ is the 1-norm $\sum_{j=1}^n |z_j|$ of $z$ (because $z \ge \mathbf{0}$), which proves that $Q$ is bounded and therefore a polytope.

As a sufficient condition that $v > 0$ and $u > 0$ hold for all $(x, v)$ in $\overline{P}$ and $(y, u)$ in $\overline{Q}$, we assume that
$$A \text{ and } B^\top \text{ are nonnegative and have no zero column} \tag{11}$$
(because then $B^\top x$ and $Ay$ are nonnegative and nonzero for any $x \in X$, $y \in Y$). We could simply assume $A > 0$ and $B > 0$, but it is useful to admit zero matrix entries (e.g. as in the identity matrix). Note that condition (11) is not necessary for positive best-response payoffs (which is still the case, for example, if the zero entry of $A$ in (3) is negative, as Figure 1 shows). By adding a suitable positive constant to all payoffs of a player, which preserves the preferences of that player, we can assume (11) without loss of generality.

With positive best-response payoffs, the polytope $P$ is obtained from $\overline{P}$ by dividing each inequality $\sum_{i=1}^m b_{ij} x_i \le v$ by $v$, which gives $\sum_{i=1}^m b_{ij} (x_i / v) \le 1$, and then treating $x_i / v$ as a new variable that is again called $x_i$ in $P$. Similarly, $\overline{Q}$ is replaced by $Q$ by dividing each inequality in $Ay \le \mathbf{1}u$ by $u$. In effect, we have normalized the expected payoffs on the upper envelope to be 1, and dropped the conditions $\mathbf{1}^\top x = 1$ and $\mathbf{1}^\top y = 1$ (so that $P$ and $Q$ have full dimension, unlike $\overline{P}$ and $\overline{Q}$). Conversely, nonzero vectors $x \in P$ and $y \in Q$ are multiplied by $v = 1/\mathbf{1}^\top x$ and $u = 1/\mathbf{1}^\top y$ to turn them into probability vectors. The scaling factors $v$ and $u$ are the expected payoffs to the other player.

Figure 4
The polytopes $P$ and $Q$ in (9) for the game (3). Vertices are shown as dots, and facet labels as circled numbers.

Similar to (10), the set $\overline{P}$ is in one-to-one correspondence with $P - \{\mathbf{0}\}$ with the map
$$\overline{P} \to P - \{\mathbf{0}\}, \qquad (x, v) \mapsto x \cdot (1/v). \tag{12}$$
These bijections are not linear, but are known as "projective transformations" (for a visualization see von Stengel (2002, Fig. 2.5)). They map lines to lines, and any binding inequality in $\overline{P}$ (respectively, $\overline{Q}$) corresponds to a binding inequality in $P$ (respectively, $Q$) and vice versa. Therefore, corresponding points have the same labels defined by the binding inequalities, which are some of the $m + n$ inequalities that define $P$ and $Q$ in (9), see Figure 4. An equilibrium is then a (re-scaled) completely labeled pair $(x, y) \in P \times Q - \{(\mathbf{0}, \mathbf{0})\}$ that has for each label $i$ the respective binding $i$th inequality in $x \ge \mathbf{0}$ or $Ay \le \mathbf{1}$, and for each label $m + j$ the respective binding $j$th inequality in $B^\top x \le \mathbf{1}$ or $y \ge \mathbf{0}$.

With assumption (11) and the polytopes $P$ and $Q$ in (9), an improved algorithm compared to Algorithm 4 is to find all completely labeled vertex pairs $(x, y)$ of $P \times Q$. A simple example of the resulting improvement is an $m \times$ […] $Q$, similar to Figure 1, has at most $m +$ […]. […] $m = n$, the maximum number of support pairs versus vertices to be tested changes from about $4^n$ to about $2.6^n$.

Lemke and Howson (1964) (LH) described an algorithm that finds one Nash equilibrium of a bimatrix game. It proves the existence of a Nash equilibrium for nondegenerate games, which can also be adapted to degenerate games. We first explain this algorithm following Shapley (1974). In the next section we describe it using the polytopes from the previous Section 3.

Consider a nondegenerate bimatrix game and Figure 3 for the example (3). The mixed-strategy simplices $X$ and $Y$ are subdivided into best-response regions which are labeled with the other player's best responses, and the facets of the simplices are labeled with the unplayed own pure strategies.
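The labeling of the polytopes $P$ and $Q$ in (9), and the search for completely labeled vertex pairs, can be sketched for the $3 \times 2$ example as follows. This is an illustrative brute-force sketch under stated assumptions, not the method of the paper: the payoff matrices are the reconstructed example values, all function names are hypothetical, and vertices are found naively by making $d$ of the $m + n$ inequalities binding and keeping the feasible solutions, which is only practical for very small games.

```python
from fractions import Fraction
from itertools import combinations

# Assumed payoff matrices of the 3 x 2 example (reconstructed values).
A = [[3, 3], [2, 5], [0, 6]]
B = [[3, 2], [2, 6], [3, 1]]
m, n = 3, 2

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if singular."""
    d = len(M)
    T = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(d):
        piv = next((r for r in range(c, d) if T[r][c] != 0), None)
        if piv is None:
            return None
        T[c], T[piv] = T[piv], T[c]
        p = T[c][c]
        T[c] = [v / p for v in T[c]]
        for r in range(d):
            factor = T[r][c]
            if r != c and factor != 0:
                T[r] = [vr - factor * vc for vr, vc in zip(T[r], T[c])]
    return [T[r][d] for r in range(d)]

def neg_unit(d, k):
    """Coefficients of -z_k, so that -z_k <= 0 encodes z_k >= 0."""
    return [-1 if i == k else 0 for i in range(d)]

# Each inequality is (coefficients, right-hand side, label): coefficients . z <= rhs.
P_INEQS = [(neg_unit(m, i), 0, i + 1) for i in range(m)] + \
          [([B[i][j] for i in range(m)], 1, m + j + 1) for j in range(n)]
Q_INEQS = [(A[i], 1, i + 1) for i in range(m)] + \
          [(neg_unit(n, j), 0, m + j + 1) for j in range(n)]

def labeled_vertices(ineqs, d):
    """Map each vertex (as a tuple) to the set of labels of its binding inequalities."""
    found = {}
    for chosen in combinations(ineqs, d):
        z = solve([c for c, _, _ in chosen], [q for _, q, _ in chosen])
        if z is None:
            continue
        slack = [q - sum(ci * zi for ci, zi in zip(c, z)) for c, q, _ in ineqs]
        if all(s >= 0 for s in slack):
            found[tuple(z)] = frozenset(
                lab for (_, _, lab), s in zip(ineqs, slack) if s == 0)
    return found

def equilibria():
    """Completely labeled vertex pairs of P x Q other than (0, 0), re-scaled."""
    all_labels = frozenset(range(1, m + n + 1))
    q_vertices = labeled_vertices(Q_INEQS, n)
    result = []
    for x, lx in labeled_vertices(P_INEQS, m).items():
        for y, ly in q_vertices.items():
            # any(x) and any(y) excludes the artificial pair (0, 0)
            if lx | ly == all_labels and any(x) and any(y):
                sx, sy = sum(x), sum(y)
                result.append((tuple(v / sx for v in x), tuple(v / sy for v in y)))
    return result

for x, y in equilibria():
    print(x, y)
```

Under the assumed matrices, the pairs found are the pure equilibrium $((1, 0, 0), (1, 0))$ and the two mixed equilibria denoted $(a, c)$ and $(b, d)$ in the text; the re-scaling divides each vertex by the sum of its components, as described after (11).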
These labels give rise to a graph that consists of finitely many vertices, joined by edges. A vertex is any point of $X$ that has $m$ labels. An edge of the graph for $X$ is a set of points defined by $m - 1$ labels […]; for example, $a$ and $b$ are joined by the edge with labels 4 and 5. As shown in Theorem 14(g) below, nondegeneracy implies that the faces of $P$ that are defined by $m$ and $m - 1$ labels are vertices and edges of $P$, which correspond to these graph vertices and edges. There are only finitely many sets with $m$ labels and therefore only finitely many vertices. Similarly, every vertex and edge of $Y$ is defined by $n$ and $n - 1$ labels, respectively.

We extend $X$ by adding another vertex $\mathbf{0}$ in $\mathbb{R}^m$ to obtain an extended graph $\tilde X$. The new vertex $\mathbf{0}$ has all labels $1, \ldots, m$, and is connected by an edge to each unit vector $e_i$ (which is a vertex of $X$), which has all labels $1, \ldots, m$ except $i$. One can also consider $\tilde X$ geometrically as the convex hull of $X$ and $\mathbf{0}$. This is an $m$-dimensional simplex in $\mathbb{R}^m$ with $X$ as one facet (subdivided and labeled as before) and $m$ additional facets $\{ x \in \tilde X \mid x_i = 0 \}$ for each $i = 1, \ldots, m$, with label $i$, which produces the described labels. However, only the graph structure of $\tilde X$ matters. In the same way, $Y$ is extended to $\tilde Y$ with an extra vertex $\mathbf{0}$ in $\mathbb{R}^n$ that has all labels $m + 1, \ldots, m + n$, which is connected by $n$ edges to the $n$ unit vectors in $Y$. The extended graphs $\tilde X$ and $\tilde Y$ are shown in Figure 5.

The point $(\mathbf{0}, \mathbf{0})$ of $\tilde X \times \tilde Y$ is completely labeled, but does not represent a mixed strategy pair. We call it the artificial equilibrium, which is the starting point of the LH algorithm. For the algorithm, one label $k$ in $\{1, \ldots, m, m + 1, \ldots, m + n\}$ is declared as possibly missing. The algorithm computes a path in $\tilde X \times \tilde Y$, which we first describe for our example and then in general.

Figure 5
Extension $\tilde X$ of $X$ and $\tilde Y$ of $Y$, each with an additional vertex $\mathbf{0}$. The artificial equilibrium $(\mathbf{0}, \mathbf{0})$ is completely labeled. The arrows show the LH (Lemke-Howson) path starting from $(\mathbf{0}, \mathbf{0})$ for missing label 2.

Figure 5 shows the LH path for missing label 2. The starting point is $(x, y) = ((0, 0, 0), (0, 0))$, where $x$ has labels 1, 2, 3 and $y$ has labels 4, 5. With label 2 allowed to be missing, we start by dropping label 2, which means changing $x$ along the unique edge that connects $(0, 0, 0)$ to $(0, 1, 0)$ (shown by an arrow in the figure), while keeping $y = (0, 0)$ fixed. The endpoint $x = (0, 1, 0)$ of that arrow has a new label 5 which is picked up. Because $x$ has three labels 1, 3, 5 and $y = (0, 0)$ has two labels 4, 5, the label 5 that has just been picked up in $\tilde X$ is now duplicate. Because $y$ no longer needs to have the duplicate label 5, the next step is to drop label 5 in $\tilde Y$, that is, change $y$ from $(0, 0)$ to $(0, 1)$ along the edge which has only label 4. At the end of that edge, $y = (0, 1)$ has labels 4 and 3, where label 3 has been picked up. The current point $(x, y) = ((0, 1, 0), (0, 1))$ therefore has duplicate label 3. Correspondingly, we can now drop label 3 in $\tilde X$, that is, move $x$ along the edge with labels 1 and 5 to point $a$, where label 4 is picked up. At the current point $(x, y) = (a, (0, 1))$, label 4 is duplicate. Next, we drop label 4 in $\tilde Y$ by moving $y$ along the edge with label 3 to reach point $c$, where label 2 is picked up. Because 2 is the missing label, the reached point $(x, y) = (a, c)$ is completely labeled. This is the equilibrium that is found as the endpoint of the LH path.

In general, the algorithm traces a path that consists of points $(x, y)$ in $\tilde X \times \tilde Y$ that have all labels except possibly label $k$. Because $(x, y)$ has at least $m + n - 1$ labels, one possibility is that $x$ is a vertex of $\tilde X$ (which has $m$ labels) and $y$ is a vertex of $\tilde Y$ (which has $n$ labels). If $(x, y)$ has all labels $1, \ldots, m + n$ then it is an equilibrium. If $(x, y)$ has all labels except label $k$, then $x$ and $y$ have exactly one label in common, which is the duplicate label. Alternatively, either $x$ has $m$ labels and is therefore a vertex of $\tilde X$ and $y$ has $n - 1$ labels and is on an edge of $\tilde Y$, or $x$ has $m - 1$ labels and is on an edge of $\tilde X$ and $y$ has $n$ labels and is therefore a vertex of $\tilde Y$. These two possibilities $\{x\} \times F$ with $y \in F$ for an edge $F$ of $\tilde Y$, or $E \times \{y\}$ with $x \in E$ for an edge $E$ of $\tilde X$, define the edges of the product graph $\tilde X \times \tilde Y$.
The vertices of this product graph are of the form $(x, y)$ where $x$ is a vertex of $\tilde X$ and $y$ is a vertex of $\tilde Y$. The LH algorithm generates a path in this product graph. The steps of the algorithm alternate between traversing an edge of $\tilde X$ while keeping a vertex of $\tilde Y$ fixed and vice versa.

The LH algorithm works because there is a unique next edge in every step, which for the start depends on the chosen missing label $k$. The algorithm starts from the artificial equilibrium $(x, y) = (\mathbf{0}, \mathbf{0})$, which is completely labeled. If the missing label $k$ is in $\{1, \dots, m\}$ then the unique start is to move $x$ along the edge in $\tilde X$ that connects $\mathbf{0}$ to the unit vector $e_k$, because this is the only edge that has all labels except $k$. If $k = m + j$ for $j$ in $\{1, \dots, n\}$ then the unique start is to move $y$ in $\tilde Y$ to $e_j$. After that, a new label is picked up which (unless it is $k$) is duplicate, and there is a unique edge in the other graph ($\tilde Y$ or $\tilde X$) where that duplicate label is dropped, to continue the path. If the label that is picked up is the missing label $k$ then the algorithm terminates at an equilibrium. This cannot be the artificial equilibrium because the edge that reaches the equilibrium would offer a second way to start, which is not the case (because any edge of $\tilde X \times \tilde Y$ that has all labels except $k$ could also be traversed in the other direction). Similarly, a vertex pair $(x, y)$ of $\tilde X \times \tilde Y$ cannot be re-visited because this would mean a second way to continue, which is also not the case. These two (excluded) possibilities are shown abstractly in Figure 6.

Figure 6: The LH algorithm cannot return to its starting point $(\mathbf{0}, \mathbf{0})$ or re-visit an earlier vertex pair $(x, y)$ because this would imply alternative choices for starting or continuing.

The LH algorithm can be started at any equilibrium, not just the artificial equilibrium $(\mathbf{0}, \mathbf{0})$. For example, in Figure 5, starting it with missing label 2 from the equilibrium $(a, c)$ that has just been found would simply traverse the path back to the artificial equilibrium. However, as shown in Figure 7, if started from the pure-strategy equilibrium $(x, y) = ((1,0,0), (1,0))$ for missing label 2, it proceeds as follows: Dropping label 2 in $\tilde X$ changes to $(b, y)$ where label 5 is picked up. Dropping the duplicate label 5 in $\tilde Y$ changes to $(b, d)$ where label 2 is picked up. This is the missing label, so the algorithm finds the equilibrium $(b, d)$. This has to be a new equilibrium because $(\mathbf{0}, \mathbf{0})$ and $(a, c)$ are connected by the unique LH path for missing label 2 to which there is no other access.
Figure 7: LH path for missing label 2 when started at the pure-strategy equilibrium $((1,0,0), (1,0))$, which leads to the equilibrium $(b, d)$.

Hence, we obtain the following important consequence.
Theorem 8. Any nondegenerate bimatrix game has an odd number of Nash equilibria.

Proof. Fix a missing label $k$. Then the artificial equilibrium $(\mathbf{0}, \mathbf{0})$ and all Nash equilibria are the unique endpoints of the LH paths for missing label $k$. The number of endpoints of these paths is even, exactly one of which is the artificial equilibrium, so the number of Nash equilibria is odd.

The LH paths for missing label $k$ are the sets of edges and vertices of $\tilde X \times \tilde Y$ that have all labels except possibly $k$. These may also form cycles, which have no endpoints. Such cycles may occur but do not affect the algorithm.

A different missing label may change how the artificial equilibrium and the Nash equilibria are "paired" as endpoints of each LH path for that missing label. For example, any pure Nash equilibrium is connected in two steps to the artificial equilibrium via a suitable missing label. Suppose the pure-strategy equilibrium is $(i, j)$. Choose $k = i$ as the missing label. Then the LH path first moves in $\tilde X$ to $(e_i, \mathbf{0})$ where the label that is picked up is $m + j$ because $j$ is the best response to $i$. The next step is then to $(e_i, e_j)$ where the algorithm terminates because the best response to $j$ is $i$, which is the missing label. In the above example in Figure 7, the pure-strategy equilibrium $((1,0,0), (1,0))$ can therefore be found via missing label 1 (or missing label 4, which corresponds to player 2's pure equilibrium strategy). As shown earlier, missing label 2 connects the artificial equilibrium to $(a, c)$, and therefore the LH path for missing label 2 when started from $((1,0,0), (1,0))$ necessarily leads to a third equilibrium. However, the "network" obtained by connecting equilibria via LH paths for different missing labels may still not connect all Nash equilibria directly or indirectly to the artificial equilibrium. An example due to Robert Wilson has been given by Shapley (1974, Fig. 3), which is a $3 \times 3$ game.

The LH algorithm does not require constructing $\tilde X$ and $\tilde Y$ in full (which would directly allow finding all Nash equilibria as completely labeled vertex pairs).
Rather, the alternate traversal of the edges of these graphs can be done in each step by a local "pivoting" operation, similar to the one known from the simplex algorithm for linear programming. We explain this in Section 8.

A convenient way to implement the LH algorithm uses the polytopes $P$ and $Q$ in (9) rather than the projections of the best-response polyhedra in (8) to $X$ and $Y$. The polytopes $P$ and $Q$ have the extra point $\mathbf{0}$, which is the only point not in correspondence to the polyhedra in (8) via a projective transformation as in (10). The extra point $(\mathbf{0}, \mathbf{0}) \in P \times Q$ is completely labeled and represents the artificial equilibrium where the LH algorithm starts.

We now consider a more general setting. A Linear Complementarity Problem or LCP is given by a $d \times d$ matrix $C$ and a vector $q \in \mathbb{R}^d$, where the problem is to find $z \in \mathbb{R}^d$ so that
\[ z \ge \mathbf{0}, \qquad w = q - Cz \ge \mathbf{0}, \qquad z^\top w = 0 \tag{13} \]
(the LCP literature usually writes $-M$ instead of $C$ and $n$ instead of $d$). In (13), because both $z$ and $w$ are nonnegative, the orthogonality condition $z^\top w = 0$ is equivalent to $z_i w_i = 0$ for all $i = 1, \dots, d$, which means that at least one of the variables $z_i$ and $w_i$ is zero; these variables are therefore also called complementary.

A geometric way to view an LCP is the following. Consider the polyhedron $S$ in $\mathbb{R}^d$ given by
\[ S = \{ z \in \mathbb{R}^d \mid z \ge \mathbf{0},\ Cz \le q \}. \tag{14} \]
For any $z \in S$, we say $z$ has label $i$ in $\{1, \dots, d\}$ if $z_i = 0$ or $(Cz)_i = q_i$, and call $z$ completely labeled if $z$ has all labels $1, \dots, d$. Clearly, $z$ is a solution to the LCP (13) if and only if $z \in S$ and $z$ is completely labeled.

In $S$, the $2d$ inequalities $z \ge \mathbf{0}$, $Cz \le q$ have the labels $1, \dots, d, 1, \dots, d$ (which means every label occurs twice), and $z$ in $S$ has label $i$ if one of the corresponding inequalities is binding. The labels of $S$ in (14) should be thought of as labeling the facets of $S$. We assume $S$ is nondegenerate, that is, no $z \in S$ has more than $d$ binding inequalities.
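As a small illustration of these definitions, the following Python sketch computes the labels and the number of binding inequalities of a point $z$ of $S$, and tests whether $z$ solves the LCP. The function names, and the $3 \times 3$ matrix `C` with `q` the all-one vector, are hypothetical choices made only for this example.

```python
from fractions import Fraction as F

def lcp_labels(C, q, z):
    """Return (labels, number of binding inequalities) of a point z of
    S = {z >= 0, Cz <= q}; raises ValueError if z is not in S."""
    d = len(q)
    Cz = [sum(C[i][j] * z[j] for j in range(d)) for i in range(d)]
    if any(v < 0 for v in z) or any(Cz[i] > q[i] for i in range(d)):
        raise ValueError("z is not in S")
    zero = [i for i in range(d) if z[i] == 0]        # binding z_i >= 0
    tight = [i for i in range(d) if Cz[i] == q[i]]   # binding (Cz)_i <= q_i
    labels = {i + 1 for i in zero} | {i + 1 for i in tight}
    return labels, len(zero) + len(tight)

def solves_lcp(C, q, z):
    """z in S solves the LCP exactly when it is completely labeled."""
    labels, _ = lcp_labels(C, q, z)
    return labels == set(range(1, len(q) + 1))

# Hypothetical data for illustration only.
C = [[0, 3, 0], [2, 2, 2], [4, 0, 0]]
q = [1, 1, 1]
for z in ([0, 0, 0], [F(1, 4), 0, 0], [F(1, 6), F(1, 3), 0]):
    print(z, lcp_labels(C, q, z), solves_lcp(C, q, z))
```

Exact rational arithmetic is used so that the tests for binding inequalities are not affected by rounding. In this assumed example every listed point has exactly $d = 3$ binding inequalities, and the point $(1/4, 0, 0)$ has label 3 twice and lacks label 1.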
As shown in Theorem 14(h) below, this is equivalent to the following conditions: $S$ is a simple polytope (no point is on more than $d$ facets), and no inequality can be omitted without changing $S$, unless it is never binding. Every facet therefore corresponds to a unique binding inequality, and has the corresponding label. Any edge of $S$ is defined by $d - 1$ facets, and any vertex by $d$ facets. Any point has the labels of the facets it lies on.

Consider an $m \times n$ bimatrix game $(A, B)$, which may be degenerate. Assume that $P$ and $Q$ in (9) are polytopes, if necessary by adding a constant to the payoffs (see Proposition 7). Then any Nash equilibrium of $(A, B)$ is given by a solution $z = (x, y) \in P \times Q = S$ with $z \ne \mathbf{0}$ to the LCP (13). That is, $d = m + n$ and $q = \mathbf{1} \in \mathbb{R}^{m+n}$, and
\[ C = \begin{bmatrix} 0 & A \\ B^\top & 0 \end{bmatrix} \tag{15} \]
where $0$ is an all-zero matrix (of size $m \times m$ and $n \times n$, respectively). The $m + n$ labels are exactly as described in Section 3, and correspond to unplayed pure strategies $i$ if $z_i = 0$ and best-response pure strategies $i$ if $(Cz)_i = q_i = 1$. As before, for $z = (x, y)$ the vectors $x$ and $y$ have to be re-scaled to represent mixed strategies. Moreover, $S$ is nondegenerate if and only if the game $(A, B)$ is nondegenerate, by Proposition 6.

We now study the LH algorithm without assuming the product structure $S = P \times Q$ for $S$, which simplifies the description. Let $S$ in (14) be a nondegenerate polytope so that $\mathbf{0}$ is a vertex of $S$. By nondegeneracy, when $z = \mathbf{0}$ then the remaining $d$ inequalities $Cz \le q$ are strict, that is, $\mathbf{0} < q$. We can therefore divide the $i$th inequality (that is, the $i$th row of $C$ and of $q$) by $q_i$ and thus assume $q = \mathbf{1}$. This polytope also has a game-theoretic interpretation.

Proposition 9.
Let
\[ S = \{ z \in \mathbb{R}^d \mid z \ge \mathbf{0},\ Cz \le \mathbf{1} \} \tag{16} \]
be a polytope with its $2d$ inequalities labeled $1, \dots, d, 1, \dots, d$. Then $z \in S - \{\mathbf{0}\}$ is completely labeled if and only if (with $z$ re-scaled as a mixed strategy) $(z, z)$ is a symmetric Nash equilibrium of the symmetric $d \times d$ game $(C, C^\top)$.

Proof. In the game $(C, C^\top)$, let $y$ be a mixed strategy of player 2, where the best-response payoff $\max_i (Cy)_i$ against $y$ is always positive because $S$ is a polytope (see Proposition 7 where this is stated for $Q$ instead of $S$). Re-scaling the best-response payoff against $y$ to 1 and re-scaling $y$ to $z$ gives the inequality $Cz \le \mathbf{1}$, where $z \ge \mathbf{0}$. By Proposition 1, $z$ has all labels $1, \dots, d$ if and only if $(y, y)$ is a Nash equilibrium of $(C, C^\top)$.

Hence, the equilibria of a bimatrix game $(A, B)$ correspond to the symmetric equilibria of the symmetric game $(C, C^\top)$ in (15). This "symmetrization" seems to be a folklore result, first stated for zero-sum games by Gale, Kuhn, and Tucker (1950).

We now express the LH algorithm in terms of computing a path of edges of the polytope $S$.

Proposition 10. Suppose $S$ in (16) is a nondegenerate polytope, with its $2d$ inequalities labeled $1, \dots, d, 1, \dots, d$. Then $S$ has an even number of completely labeled vertices, including $\mathbf{0}$.

Proof. This is a consequence of the LH algorithm applied to $S$. Fix a label $k$ in $\{1, \dots, d\}$ as allowed to be missing and consider the set of all points of $S$ that have all labels except possibly $k$. This defines a set of vertices and edges of $S$, which we call the missing-$k$ vertices (which may nevertheless also have label $k$) and edges. Any missing-$k$ vertex is either completely labeled (for example, $\mathbf{0}$), or has a duplicate label, say $\ell$. A completely labeled vertex $z$ is the endpoint of a unique missing-$k$ edge which is defined by the $d - 1$ facets that contain $z$ except for the facet with label $k$, by "moving away" from that facet. If the missing-$k$ vertex $z$ does not have label $k$, then it is the endpoint of two missing-$k$ edges, each obtained by moving away from one of the two facets with the duplicate label $\ell$. Hence, the missing-$k$ vertices and edges define a collection of paths and cycles, where the endpoints of the paths are the completely labeled vertices. Their total number is even because each path has two endpoints.

For the game $(C, C^\top)$ with
\[ C = [\,\dots\,] \tag{17} \]
the polytope $S$ in (16) is shown in Figure 8 in a suitable planar projection (where all facets are visible except for one facet of the form $z_i = 0$). Its two completely labeled vertices are $\mathbf{0}$ and $x$.

Figure 8: Left: LH path for missing label 1 for the polytope $S$ with $C$ as in (17). Right: Opposite orientation $-$ and $+$ of the labels 1, 2, 3 around the two completely labeled vertices $\mathbf{0}$ and $x$.

6 Endpoints of LH paths have opposite index
In this section we prove a stronger version of Proposition 10. Namely, the endpoints of an LH path will be shown to have opposite "signs" $-1$ and $+1$, which are independent of the missing label. This "sign" is called the index of a Nash equilibrium, which we define here in an elementary way using determinants. By convention, the artificial equilibrium has index $-1$. This implies that every nondegenerate bimatrix game has $r$ Nash equilibria of index $+1$ and $r - 1$ Nash equilibria of index $-1$, for some integer $r \ge 1$.

The right diagram in Figure 8 illustrates this. Around the artificial equilibrium $\mathbf{0}$ the labels 1, 2, 3 appear clockwise, which is a negative orientation and defines index $-1$, whereas around the Nash equilibrium $x$ they appear counterclockwise, which is a positive orientation and defines index $+1$. When the LH path for missing label 1 starts at $\mathbf{0}$, the unique edge with missing label 1 has label 2 on the left side of the path and label 3 on the right side of the path. As can be seen from the diagram, this holds for all missing-1 edges when following the path. The path terminates when it hits a facet with label 1, which is now in front of the edge of the path which has label 2 on the left and label 3 on the right, so the labels 1, 2, 3 have the opposite orientation to the starting point, where the facet with label 1 is behind the edge of the path. In addition, the LH path has a well-defined local direction that indicates where to go "forward" in order to reach the endpoint with index $+1$, even if one does not remember where one started: The forward direction has label 2 on the left and label 3 on the right.

We show these properties of the index for labeled polytopes $S$ as in Proposition 10 for general dimension $d$. Our argument substantially simplifies the proof by Shapley (1974), who first defined the index for Nash equilibria of bimatrix games.

Definition 11.
Consider a labeled nondegenerate polytope
\[ S = \{ z \in \mathbb{R}^d \mid c_j^\top z \le q_j \text{ for } j = 1, \dots, N \} \tag{18} \]
where each inequality $c_j^\top z \le q_j$ for $j = 1, \dots, N$ has some label in $\{1, \dots, d\}$. Consider a completely labeled vertex $x$ of $S$ where $\lambda(i)$ indicates the inequality $c_{\lambda(i)}^\top x = q_{\lambda(i)}$ that is binding for $x$ and has label $i$, for $i = 1, \dots, d$. Then the index of $x$ is defined as the sign of the following determinant (multiplied by $-1$ if $d$ is even):
\[ (-1)^{d+1} \operatorname{sign} \left| c_{\lambda(1)} \cdots c_{\lambda(d)} \right| . \tag{19} \]

A $d \times d$ matrix formed by $d$ linearly independent vectors in $\mathbb{R}^d$ has a nonzero determinant, but its sign is only well defined for a specific order of these vectors. For a vertex of a nondegenerate polytope as in (18), the normal vectors $c_j$ of its binding inequalities are linearly independent (see Theorem 14(e) below). When the vertex is completely labeled, we write down these normal vectors in the order of their labels, that is, for $j = \lambda(i)$ for $i = 1, \dots, d$, and consider the resulting $d \times d$ determinant in (19). The sign correction for even dimension $d$ is made for the following reason. For the polytope $S$ in (16) we write $z \ge \mathbf{0}$ as $-z \le \mathbf{0}$ so all inequalities go in the same direction as required in (18). For the completely labeled vertex $\mathbf{0}$ we thus obtain the determinant of the negative of the $d \times d$ identity matrix, which is $1$ if $d$ is even and $-1$ if $d$ is odd. In order to obtain a negative index for this artificial equilibrium, we therefore multiply the sign of the determinant with $(-1)^{d+1}$.

Figure 9: Opposite geometric orientation of adjacent vertices $x$ and $y$ as in Lemma 12 for $d = 3$. The four involved facets are shown with their normal vectors.

The next lemma states that "pivoting changes sign" in the following sense. "Pivoting" is the algebraic representation of moving from a vertex to an "adjacent" vertex along an edge. This means that one binding inequality is replaced by another. For any fixed order of the normal vectors of the binding inequalities, one of these vectors is thus replaced by another, which we choose to be the vector in first position. The lemma states that the corresponding determinants then have opposite sign; it is geometrically illustrated in Figure 9.
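The sign change under pivoting, and the resulting index of an endpoint, can be observed numerically. The following Python sketch follows the LH path on $S = \{z \ge \mathbf{0},\ Cz \le \mathbf{1}\}$ from $\mathbf{0}$ by moving along edges, and records the sign of the determinant of the binding normal vectors before and after each pivot. It assumes that $S$ is a nondegenerate polytope. The helper names and the $3 \times 3$ matrix `C` are hypothetical and serve only as an illustration; the determinant is taken of the matrix whose rows are the normal vectors, which equals the determinant with these vectors as columns.

```python
from fractions import Fraction as F

def det(rows):
    """Determinant by exact Gaussian elimination over Fractions."""
    M = [[F(v) for v in r] for r in rows]
    n, result = len(M), F(1)
    for c in range(n):
        p = next((r for r in range(c, n) if M[r][c] != 0), None)
        if p is None:
            return F(0)
        if p != c:
            M[c], M[p] = M[p], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return result

def solve(rows, rhs):
    """Solve the square system rows * v = rhs exactly (rows invertible)."""
    n = len(rows)
    T = [[F(v) for v in r] + [F(b)] for r, b in zip(rows, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if T[r][c] != 0)
        T[c], T[p] = T[p], T[c]
        piv = T[c][c]
        T[c] = [v / piv for v in T[c]]
        for r in range(n):
            if r != c:
                f = T[r][c]
                T[r] = [a - f * b for a, b in zip(T[r], T[c])]
    return [T[r][n] for r in range(n)]

def lh_path(C, k):
    """LH path for missing label k from 0; returns (endpoint, index, signs)."""
    d = len(C)
    # inequality j < d is -e_j^T z <= 0, inequality d+i is c_i^T z <= 1;
    # inequality j has label (j mod d) + 1
    normal = ([[F(-(r == j)) for r in range(d)] for j in range(d)]
              + [[F(v) for v in row] for row in C])
    bound = [F(0)] * d + [F(1)] * d
    dot = lambda a, b: sum(s * t for s, t in zip(a, b))
    sign = lambda t: (t > 0) - (t < 0)
    z, cols, p, signs = [F(0)] * d, list(range(d)), k - 1, []
    while True:
        N = [normal[j] for j in cols]
        before = sign(det(N))
        # direction that leaves only the inequality in position p
        v = solve(N, [F(-(r == p)) for r in range(d)])
        # ratio test over the non-binding inequalities that are approached
        steps = [((bound[j] - dot(normal[j], z)) / dot(normal[j], v), j)
                 for j in range(2 * d)
                 if j not in cols and dot(normal[j], v) > 0]
        alpha, enter = min(steps)
        z = [zi + alpha * vi for zi, vi in zip(z, v)]
        cols[p] = enter                          # pivot replaces position p
        signs.append((before, sign(det([normal[j] for j in cols]))))
        label = enter % d + 1
        if label == k:                           # missing label picked up
            return z, (-1) ** (d + 1) * signs[-1][1], signs
        dup = label - 1                          # position of duplicate label
        cols[p], cols[dup] = cols[dup], cols[p]  # exchange flips sign back

C = [[0, 3, 0], [2, 2, 2], [4, 0, 0]]            # hypothetical example matrix
endpoint, index, signs = lh_path(C, 1)
print(endpoint, index, signs)
```

The position `p` of the replaced normal vector is kept at $k - 1$ rather than moved to the first position; this is a design choice of the sketch, since replacing one column changes the sign of the determinant in the same way for any fixed position.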
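Lemma 12. Consider a nondegenerate polytope $S$ as in (18), and an edge defined by $d - 1$ binding inequalities $c_{\lambda(i)}^\top z = q_{\lambda(i)}$ for $i = 2, \dots, d$. Let $x$ and $y$ be the endpoints of this edge, with the additional binding inequality $c_{\lambda(1)}^\top x = q_{\lambda(1)}$ for $x$ and $c_{\lambda(0)}^\top y = q_{\lambda(0)}$ for $y$. Then
\[ \operatorname{sign} \left| c_{\lambda(1)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| = - \operatorname{sign} \left| c_{\lambda(0)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| \ne 0 . \tag{20} \]

Proof. Because $S$ is nondegenerate, $x$ and $y$ have exactly $d$ binding inequalities, so that the following conditions hold:
\[
\begin{array}{ll}
c_{\lambda(0)}^\top x < q_{\lambda(0)}, & c_{\lambda(0)}^\top y = q_{\lambda(0)}, \\
c_{\lambda(1)}^\top x = q_{\lambda(1)}, & c_{\lambda(1)}^\top y < q_{\lambda(1)}, \\
c_{\lambda(2)}^\top x = q_{\lambda(2)}, & c_{\lambda(2)}^\top y = q_{\lambda(2)}, \\
\quad\vdots & \quad\vdots \\
c_{\lambda(d)}^\top x = q_{\lambda(d)}, & c_{\lambda(d)}^\top y = q_{\lambda(d)}.
\end{array}
\tag{21}
\]
The $d + 1$ vectors $c_{\lambda(0)}, c_{\lambda(1)}, \dots, c_{\lambda(d)}$ are linearly dependent, so there are reals $\gamma_0, \gamma_1, \dots, \gamma_d$, not all zero, with
\[ \gamma_0 c_{\lambda(0)}^\top + \gamma_1 c_{\lambda(1)}^\top + \cdots + \gamma_d c_{\lambda(d)}^\top = \mathbf{0}^\top \tag{22} \]
where $\gamma_0 \ne 0$ and $\gamma_1 \ne 0$ because otherwise the normal vectors of the binding inequalities for $x$ or $y$ would be linearly dependent, which is not the case. Hence, by (22) and (21),
\[ 0 = \mathbf{0}^\top (y - x) = \gamma_0 c_{\lambda(0)}^\top (y - x) + \gamma_1 c_{\lambda(1)}^\top (y - x) \]
and therefore
\[ \frac{\gamma_0}{\gamma_1} = \frac{c_{\lambda(1)}^\top x - c_{\lambda(1)}^\top y}{c_{\lambda(0)}^\top y - c_{\lambda(0)}^\top x} > 0 . \tag{23} \]
By (22), the vector $c_{\lambda(0)} \gamma_0 + c_{\lambda(1)} \gamma_1$ is a linear combination of $c_{\lambda(2)}, \dots, c_{\lambda(d)}$, so that
\[ 0 = \left| (c_{\lambda(0)} \gamma_0 + c_{\lambda(1)} \gamma_1)\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| = \left| c_{\lambda(0)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| \gamma_0 + \left| c_{\lambda(1)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| \gamma_1 \tag{24} \]
and thus
\[ \left| c_{\lambda(1)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| = - \left| c_{\lambda(0)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} \right| \frac{\gamma_0}{\gamma_1} \tag{25} \]
which, together with (23), shows (20).

Theorem 13.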
Suppose $S$ in (16) is a nondegenerate polytope, with its $2d$ inequalities labeled $1, \dots, d, 1, \dots, d$. Then $S$ has an even number of completely labeled vertices. Half of these (including $\mathbf{0}$) have index $-1$, the other half index $+1$. The endpoints of any LH path have opposite index.

Proof. Let $S$ be described as in (18), so that $c_i = -e_i$ for $i = 1, \dots, d$ and $C^\top = [c_{d+1} \cdots c_{2d}]$. Consider some completely labeled vertex $x$ of $S$. Let the binding inequalities for $x$ be $c_{\lambda(i)}^\top x = q_{\lambda(i)}$ with label $i$ for $i = 1, \dots, d$. We consider the LH path with missing label 1 that starts at $x$, and show that the endpoint of that path has opposite index to $x$. Suppose that $x$ has negative index and that $d$ is odd, so that $| c_{\lambda(1)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} | < 0$. On the first edge of the LH path with missing label 1, the same inequalities as for $x$ are binding, except for the inequality $c_{\lambda(1)}^\top z \le q_{\lambda(1)}$. Let the endpoint of that edge be $y$, where now the inequality $c_{\lambda(0)}^\top y \le q_{\lambda(0)}$ is binding. This is the situation of Lemma 12, so $| c_{\lambda(0)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} | > 0$. If the binding inequality $c_{\lambda(0)}^\top y \le q_{\lambda(0)}$ has the missing label 1, the claim is proved, because then $y$ is the other endpoint of the LH path, and has positive index.

So suppose this is not the case, that is, the binding inequality $c_{\lambda(0)}^\top y \le q_{\lambda(0)}$ has a duplicate label $\ell$ in $\{2, \dots, d\}$. We now exchange columns 1 and $\ell$ in the matrix $[ c_{\lambda(0)}\ c_{\lambda(2)} \cdots c_{\lambda(d)} ]$, which changes the sign of its determinant, which is now
\[ \left| c_{\lambda(\ell)}\ c_{\lambda(2)} \cdots c_{\lambda(\ell-1)}\ c_{\lambda(0)}\ c_{\lambda(\ell+1)} \cdots c_{\lambda(d)} \right| \tag{26} \]
and again negative. Note that these are still the same normal vectors of the binding inequalities for $y$, except for the exchanged columns 1 and $\ell$; moreover, columns $2, \dots, d$ have labels $2, \dots, d$ in that order. The first column in (26) has the duplicate label $\ell$, and corresponds to the inequality $c_{\lambda(\ell)}^\top y \le q_{\lambda(\ell)}$ that is no longer binding when label $\ell$ is dropped for the next edge on the LH path. That is, (26) represents the same situation as the starting point $x$: The determinant is negative, columns $2, \dots, d$ have the correct labels, and the first column will be exchanged for a new column when traversing the next edge. The resulting determinant with the new first column has opposite sign by Lemma 12.
If the label that has been picked up is the missing label 1, then it is the endpoint of the LH path and the claim is proved. Otherwise we again exchange the first column with the column of the duplicate label, with the determinant going back to negative, and repeat, until the endpoint of the path is reached.

On any missing-1 vertex on the path, we can also identify the direction of the path by considering the two determinants obtained by exchanging the first column and the column with its duplicate label (in both cases, columns $2, \dots, d$ have the correct labels). The pivoting step (which determines the edge to be traversed) replaces the first column of the determinant. If it starts from a negative determinant, then the direction is towards the endpoint with positive index (for odd $d$, as in our description so far). If it starts from a positive determinant, then the direction is towards the endpoint with negative index.

Clearly, the analogous reasoning applies if the considered starting point $x$ of the LH path has positive index or if $d$ is even. Because the endpoints of the LH paths for missing label 1 have opposite index, half of these endpoints have index $-1$ and the other half have index $+1$, as claimed.

As concerns missing labels $k$ other than label 1, we can reduce this to the case $k = 1$ by exchanging the first and $k$th coordinate of $\mathbb{R}^d$ as the ambient space of $S$, and the first and $k$th row in the $2d$ inequalities $-z \le \mathbf{0}$ as well as in $Cz \le \mathbf{1}$. This double exchange of rows and columns does not change the signs of the determinant (19) of any completely labeled vertex, and the LH path for missing label $k$ becomes the LH path for missing label 1 where the preceding reasoning applies.

Figure 10 illustrates the proof of Theorem 13, where the right-hand side shows two columns that display the determinants with sign $-1$ and $+1$. The path starts at the vertex $\mathbf{0}$, which has the negative unit vectors as normal vectors of its binding inequalities. Dropping label 1 and picking up label 3 exchanges $-e_1$ with $c_3$, with the sign of the determinant changed from $-1$ to $+1$. This is the edge from $\mathbf{0}$ to $y$. The double-headed arrow shows the switch to the next line which exchanges the columns $c_3$ and $-e_3$ with the duplicate label 3, and brings the determinant back to $-1$, but still refers to the same point $y$.

Figure 10: Illustration of the proof of Theorem 13 with facets of $S$ in (16) shown by their normal vectors (whose subscripts are their labels), which are either the negative unit vectors $-e_1, -e_2, -e_3$, or $c_1^\top, c_2^\top, c_3^\top$ as the three rows of the matrix $C$ in (17). The determinants shown, with sign $-1$ on the left and $+1$ on the right, are:
$|{-e_1}\ {-e_2}\ {-e_3}|$ and $|c_3\ {-e_2}\ {-e_3}|$;
$|{-e_3}\ {-e_2}\ c_3|$ and $|c_2\ {-e_2}\ c_3|$;
$|{-e_2}\ c_2\ c_3|$ and $|{-e_3}\ c_2\ c_3|$;
$|c_3\ c_2\ {-e_3}|$ and $|c_1\ c_2\ {-e_3}|$.

The next step away from $y$ exchanges the column $-e_3$ with $c_2$ (it is always the first column that is being replaced), and so on. The last column that is found is $c_1$, which has the missing label 1, with a positive determinant $| c_1\ c_2\ {-e_3} |$ and hence positive index of the found Nash equilibrium.

Nondegeneracy of a bimatrix game $(A, B)$ is an important assumption for the algorithms that we have described so far. In Algorithm 4, which finds all equilibria by support enumeration, it ensures that the equations that define the mixed strategy probabilities for a given support pair have unique solutions. For the LH algorithm, it additionally ensures that the vertex pairs encountered on the LH path have a unique continuation, so that the path is well defined.

The following theorem states a number of equivalent conditions of nondegeneracy. Some of them have been stated only as sufficient conditions (but they are not stronger), for example condition (e) by van Damme (1991, p. 52) and Lemke and Howson (1964), or (g) by Krohn, Moltzahn, Rosenmüller, Sudhölter, and Wallmeier (1991) and, in slightly weaker form, by Shapley (1974). The purpose of this section is to state and prove the equivalence of these conditions, which has not been done in this completeness before. Much of the proof is straightforward linear algebra, but illustrative in this context, for example for the implication (d) ⇒ (e). We comment on the different conditions afterwards.

Theorem 14.
Let $(A, B)$ be an $m \times n$ bimatrix game so that $P$ and $Q$ in (9) are polytopes. Consider $C$ as in (15) where $d = m + n$ and $C^\top = [c_1 \cdots c_d]$, and the polytope $S$ in (16). As before, a point in $P$, $Q$, or $S$ has label $i$ in $\{1, \dots, d\}$ if the corresponding $i$th inequality is binding (in $S$ this can occur twice, for $z_i = 0$ or $c_i^\top z = 1$). Then the following are equivalent.

(a) $(A, B)$ is nondegenerate.

(b) No point in $P$ has more than $m$ labels, and no point in $Q$ has more than $n$ labels.

(c) The symmetric game $(C, C^\top)$ is nondegenerate.

(d) For no point $z$ in $S$ are more than $d$ of the inequalities $z \ge \mathbf{0}$ and $Cz \le \mathbf{1}$ binding.

(e) For every $z \in S$ the row vectors $e_i^\top$ and $c_j^\top$ for the binding inequalities of $z \ge \mathbf{0}$ and $Cz \le \mathbf{1}$ are linearly independent.

(f) Consider any $\hat x \in X$ and $\hat y \in Y$. Let $I = \operatorname{supp}(\hat x)$ and $J = \operatorname{bestresp}(\hat x)$, and let $B_{IJ}$ be the $|I| \times |J|$ submatrix of $B$ with entries $b_{ij}$ of $B$ for $i \in I$ and $j \in J$. Similarly, let $K = \operatorname{bestresp}(\hat y)$ and $L = \operatorname{supp}(\hat y)$, and let $A_{KL}$ be the corresponding $|K| \times |L|$ submatrix of $A$. Then the columns of $B_{IJ}$ are linearly independent, and the rows of $A_{KL}$ are linearly independent.

(g) Consider any $\hat x \in X$ and $\hat y \in Y$. Let $I$ be the set of labels of $\hat x$, let $J$ be the set of labels of $\hat y$, and let
\[
\begin{array}{l}
P(I) = \{ x \in P \mid x \text{ has at least all the labels in } I \}, \\
Q(J) = \{ y \in Q \mid y \text{ has at least all the labels in } J \}.
\end{array}
\tag{27}
\]
Then $P(I)$ has dimension $m - |I|$, and $Q(J)$ has dimension $n - |J|$.

(h) $P$ and $Q$ are simple polytopes, and for both polytopes any inequality that is redundant (that is, can be omitted without changing the polytope) is never binding.

(i) $P$ and $Q$ are simple polytopes, and any pure strategy of a player that is weakly dominated by or payoff equivalent to a different mixed strategy is strictly dominated.

Proof. We show the implication chain (a) ⇒ (b), . . . , (h) ⇒ (i), (i) ⇒ (a).

Assume (a), and consider any $y \in Q$.
If $Ay < \mathbf{1}$ then the only labels of $y$ are $m + j$ for $y_j = 0$ where $1 \le j \le n$, and there are at most $n$ of them. Hence, we can assume that at least one inequality of $Ay \le \mathbf{1}$ is binding, which corresponds to a best response (and hence label) of the mixed strategy $\bar y = y \frac{1}{\mathbf{1}^\top y}$. Via the projective map (10), $y$ and $\bar y$ have the same labels. By Proposition 6, $\bar y$ and therefore $y$ has no more than $n$ labels, as claimed. Similarly, no $x \in P$ has more than $m$ labels. This shows (b).

Assume (b); we show (c). For the game $(C, C^\top)$, the polytopes corresponding to (9) are
\[
\begin{array}{l}
P' = \{ x' \in \mathbb{R}^d \mid x' \ge \mathbf{0},\ C x' \le \mathbf{1} \}, \\
Q' = \{ y' \in \mathbb{R}^d \mid C y' \le \mathbf{1},\ y' \ge \mathbf{0} \},
\end{array}
\tag{28}
\]
so $P' = Q' = S$. By (15), for $z = (x, y)$ we have $z \ge \mathbf{0}$ and $Cz \le \mathbf{1}$ if and only if
\[ x \ge \mathbf{0}, \quad y \ge \mathbf{0}, \quad Ay \le \mathbf{1}, \quad B^\top x \le \mathbf{1}, \tag{29} \]
that is, $x \in P$ and $y \in Q$. By (b), $x$ has no more than $m$ labels and $y$ has no more than $n$ labels, so $z = (x, y)$ has no more than $m + n = d$ labels, and this holds correspondingly for any $x' \in P'$ and $y' \in Q'$ in (28). Therefore, $(C, C^\top)$ is nondegenerate.

Assume (c). The inequalities of the polytope $P'$ in (28) have unique labels $1, \dots, 2d$ (unlike $S$). No point in $P'$ has more than $d$ labels, and therefore no point in $S$ has more than $d$ binding inequalities. This shows (d).

Assume (d), and, to show (e), suppose for some $z \in S$ with $K = \{ i \mid z_i = 0 \}$ and $L = \{ j \mid c_j^\top z = 1 \}$ the row vectors $e_i^\top$ for $i \in K$ and $c_j^\top$ for $j \in L$ are linearly dependent; choose $z$ so that $|K| + |L|$ is maximal. By (d), $|K| + |L| \le d$. Let $U$ be the matrix with rows $e_i^\top$ for $i \in K$ and $c_j^\top$ for $j \in L$, which has row rank $r < |K| + |L| \le d$, and therefore only $r < d$ linearly independent columns. Hence, there is some nonzero $v \in \mathbb{R}^d$ so that $Uv = \mathbf{0}$. For $\alpha \in \mathbb{R}$ let $z_\alpha = z + v\alpha$. Then $(z_\alpha)_i = 0$ for $i \in K$ and $c_j^\top z_\alpha = 1$ for $j \in L$ because $Uv = \mathbf{0}$. For $\alpha = 0$, the inequalities $(z_\alpha)_i \ge 0$ for $i \notin K$ and $c_j^\top z_\alpha \le 1$ for $j \notin L$ are not binding, but maximizing $\alpha$ subject to these inequalities (which imply $z_\alpha \in S$) produces at least one further binding inequality because $S$ is bounded and $v \ne \mathbf{0}$. This contradicts the maximality of $|K| + |L|$. This proves (e).

Assume (e), and consider $\hat x, \hat y, I, J, K, L$ as defined in (f), with best-response payoff $u$ to player 1 and $v$ to player 2. Let $x = \hat x \frac{1}{v}$ and $y = \hat y \frac{1}{u}$ so that $x \in P$ and $y \in Q$ via (12) and (10). With $z = (x, y)$, the binding inequalities in $z \ge \mathbf{0}$ and $Cz \le \mathbf{1}$, that is, (29), are $x_i = 0$ for $i \notin I$ and $y_j = 0$ for $j \notin L$ and $a_i^\top y = 1$ for $i \in K$ where $A^\top = [a_1 \cdots a_m]$ and $b_j^\top x = 1$ for $j \in J$ where $B = [b_1 \cdots b_n]$. The corresponding row vectors, namely the unit vectors for the binding inequalities $x_i = 0$ ($i \notin I$) and $y_j = 0$ ($j \notin L$) and (as rows of $C$) $(\mathbf{0}^\top, a_i^\top)$ for $i \in K$ and $(b_j^\top, \mathbf{0}^\top)$ for $j \in J$, are linearly independent by assumption (e). This implies that the rows $a_{iL}^\top$ of $A_{KL}$ are linearly independent: suppose $\sum_{i \in K} \alpha_i a_{iL}^\top = \mathbf{0}^\top$ for some reals $\alpha_i$. Then with $\beta_j = - \sum_{i \in K} \alpha_i a_{ij}$ for $j \notin L$ we have $\sum_{j \notin L} \beta_j e_j^\top + \sum_{i \in K} \alpha_i a_i^\top = \mathbf{0}^\top$, which by linear independence of these row vectors is only the trivial linear combination, so $\alpha_i = 0$ for $i \in K$ as claimed. Similarly, the rows of $(B_{IJ})^\top$, that is, the columns of $B_{IJ}$, are linearly independent, as claimed in (f).

Assume (f), and consider $\hat x, \hat y, I, J$ as in (g). With the set $J$ of labels of $\hat y$, let
\[
\begin{array}{l}
K = J \cap \{1, \dots, m\}, \\
J' = \{ j \in \{1, \dots, n\} \mid m + j \in J \}, \\
L = \{ j \in \{1, \dots, n\} \mid m + j \notin J \},
\end{array}
\tag{30}
\]
that is, $K = \operatorname{bestresp}(\hat y)$ and $L = \operatorname{supp}(\hat y)$, and
\[ |J| = |K| + |J'| = |K| + n - |L| . \tag{31} \]
Let $A_{KL}$ be the submatrix of $A$ with entries $a_{ij}$ of $A$ for $i \in K$ and $j \in L$. We write $y \in Q$ as $y = (y_{J'}, y_L)$. Then
\[ y = (y_{J'}, y_L) \in Q(J) \iff y_{J'} = \mathbf{0}, \quad y_L \ge \mathbf{0}, \quad A_{KL}\, y_L = \mathbf{1} . \tag{32} \]
The $|K|$ equations $A_{KL}\, y_L = \mathbf{1}$ with $|L|$ variables are underdetermined, and we show that their solution set for all constraints in (32) has dimension $|L| - |K|$. By assumption (f), $A_{KL}$ has full row rank $|K|$, so there is an invertible $|K| \times |K|$ submatrix $A_{KK'}$ of $A_{KL}$, where we write $A_{KL} = [A_{KK'}\ A_{KL'}]$ and $y_L = (y_{K'}, y_{L'})$, so that the following are equivalent:
\[
\begin{array}{l}
A_{KL}\, y_L = \mathbf{1}, \\
A_{KK'}\, y_{K'} + A_{KL'}\, y_{L'} = \mathbf{1}, \\
y_{K'} = A_{KK'}^{-1} \mathbf{1} - (A_{KK'}^{-1} A_{KL'})\, y_{L'},
\end{array}
\tag{33}
\]
where $y_{L'}$ can be freely chosen subject to $y_L = (y_{K'}, y_{L'}) \ge \mathbf{0}$ to ensure (32). Let $\ell = |L'| = |L| - |K'| = |L| - |K| = n - |J|$ by (31). We claim that (32) and (33) imply that $Q(J)$ is a set of affine dimension $\ell$. By definition, this means that $Q(J)$ has $\ell + 1$, and no more, points $y^0, y^1, \dots, y^\ell$ that are affinely independent, or equivalently (as is easy to see) that
\[ \text{the } \ell \text{ points } y^1 - y^0, \dots, y^\ell - y^0 \text{ are linearly independent.} \tag{34} \]
Any $y^j \in Q(J)$ is by (32) of the form $y^j = (y^j_{J'}, y^j_{K'}, y^j_{L'})$, where $y^j_{J'} = \mathbf{0}$ and $y^j_{K'}$ is an affine function of $y^j_{L'}$ by (33). Hence, $y^j_{K'} - y^0_{K'}$ is a linear function of $y^j_{L'} - y^0_{L'} \in \mathbb{R}^\ell$, and there can be no more than $\ell$ linearly independent vectors $y^j - y^0$ in (34). We find such vectors as follows. Let $u$ be the best-response payoff to $\hat y$ and $y^0 = \hat y \frac{1}{u}$, and (assuming for simplicity that $L' = \{1, \dots, \ell\}$) let $y^j_{L'} = y^0_{L'} + e_j \varepsilon$ for $j \in L'$ and $\varepsilon > 0$, with $y^j_{K'}$ given by (33). Then $y^j \in Q(J)$ and for sufficiently small $\varepsilon$
\[ y^j_L = (y^j_{K'}, y^j_{L'}) > \mathbf{0}, \qquad a_i^\top y^j < 1 \quad (i \notin K) \tag{35} \]
because these strict inequalities hold (as "non-labels" of $\hat y$) for $j = 0$ and $y^j$ is by (33) a continuous function of its part $y^j_{L'}$ whose $j$th component is augmented by $\varepsilon$. Then $y^j_{L'} - y^0_{L'}$ are the scaled unit vectors $e_j \varepsilon$ which are linearly independent, which implies (34).
So $Q(J)$ has dimension $\ell = n - |J|$. Similarly, $P(I)$ has dimension $m - |I|$. This shows (g).

Assume (g). If, say, $Q$ was not simple, then some point $y$ of $Q$ would be on more than $n$ facets and have a set $J$ of more than $n$ labels. The corresponding set $Q(J)$ would have negative dimension and be the empty set, but contains $y$, a contradiction. So $Q$, and similarly $P$, is a simple polytope. Suppose some inequality of $Q$ is redundant, and that it is sometimes binding, with label $k$. This binding inequality therefore defines a nonempty face $F$ of $Q$. Consider the set $J$ of labels that all points in $F$ have, which includes $k$. Because the inequality is redundant, $Q(J)$ and $Q(J - \{k\})$ are the same set, but have different dimension by (g), a contradiction. The same applies to $P$. This shows (h).

Assume (h). We show that because $Q$ has no redundant inequality that is binding, player 1 has no pure strategy $i$ that is weakly dominated by or payoff equivalent to a different mixed strategy $x \in X$, and not strictly dominated. Suppose this was the case, that is,
\[ a_i^\top \le x^\top A, \qquad x_i = 0 \tag{36} \]
where $a_i^\top$ is the $i$th row of $A$. In (36) we can assume $x_i = 0$ by replacing $x$ with $x - e_i x_i$ and re-scaling because $x \ne e_i$. Then the $i$th inequality $a_i^\top y \le 1$ in $Ay \le \mathbf{1}$ is redundant, because it is implied by the other inequalities in $Ay \le \mathbf{1}$, since $y \ge \mathbf{0}$ implies $a_i^\top y \le x^\top A y \le x^\top \mathbf{1} = 1$. Because $i$ is not strictly dominated by some mixed strategy, it is not hard to show (see Lemma 3 of Pearce, 1984) that $i$ is the best response to some mixed strategy $\hat y \in Y$, with best-response payoff $u$, so $a_i^\top \hat y = u$. But then for $y = \hat y \frac{1}{u} \in Q$ the inequality $a_i^\top y \le 1$ is binding, which contradicts (h). The same reasoning applies to $P$ and player 2. This shows (i).

Finally, (i) implies (a) where we use Proposition 6. Any $\hat y$ in $Y$ with more than $n$ labels would, via (10), either define a point $y$ in $Q$ that is on more than $n$ facets so that $Q$ is not simple, or one of the labels would define the exact same facet as another and thus a duplicate pure strategy, or one of the labels $i$ would define a lower-dimensional face $F = \{ y \in Q \mid a_i^\top y = 1 \}$ as in the implication (g) ⇒ (h) which can be shown to imply (36) for some $x \in X$, all contradicting (i). The same applies for the other player.

In Theorem 14, condition (b) is very similar to Proposition 6, but applies to the labels of points in $P$ and $Q$ rather than $X$ and $Y$. Condition (f) (and similarly (e)) states full row rank of the best-response submatrix $A_{KL}$ of the payoff matrix $A$ to player 1 for the support $L$ and best-response set $K$ of a mixed strategy $\hat y \in Y$, and similarly for the other player. This uses the condition that $P$ and $Q$ are polytopes, namely positive best-response payoffs by Proposition 7. Otherwise, a nondegenerate game may have a payoff (sub)matrix that does not have full rank, such as $[\,\dots\,]$.

Condition (g) is about the dimension of the sets $P(I)$ and $Q(J)$ defined by sets of labels $I$ and $J$. These are the labels of some mixed strategies, which ensures that $P(I)$ and $Q(J)$ are not empty. The condition states that each extra label reduces the dimension by one. A singleton label set defines a facet of $P$ or $Q$. Condition (h) is also geometric, and is about the shape of the polytope (being simple) and about its description by linear inequalities.
For example, a duplicate strategy of player 1 and thus duplicate row of $A$ would not change the shape of $Q$, but affect its labels. Redundant inequalities are allowed as long as they do not define labels at all. In (i) these never-binding inequalities are strictly dominated strategies. Condition (36) states that the pure strategy $i$ of player 1 is weakly dominated by a different mixed strategy $x$, or payoff equivalent to it if $a_i^\top = x^\top A$.

8 Pivoting and handling degenerate games
As mentioned before the start of Section 5, the LH algorithm is a path-following method that can be implemented by certain algebraic operations. These are known as "pivoting" as used in the simplex algorithm for linear programming (see Dantzig, 1963, or, for example, Matoušek and Gärtner, 2007). We explain this using the letters $A, B, m, n$ that are standard in this context and do not refer to a bimatrix game.

Let $C$ be an $m \times d$ matrix and $q \in \mathbb{R}^m$, and consider, like in (14) (where we have assumed $m = d$) the polyhedron $S = \{ z \in \mathbb{R}^d \mid z \ge \mathbf{0},\ Cz \le q \}$. Then $z \in S$ if and only if there is some $w \in \mathbb{R}^m$ so that

$$Iw + Cz = q, \qquad w \ge \mathbf{0}, \qquad z \ge \mathbf{0}, \tag{37}$$

where $I$ is the $m \times m$ identity matrix. We write this more generally with $n = m + d$ and the $m \times n$ matrix $A = [I\ C]$ and $x = (w, z) \in \mathbb{R}^n$ as

$$Ax = q, \qquad x \ge \mathbf{0}. \tag{38}$$

Any $x \in \mathbb{R}^n$ that fulfills (38) is called feasible for these constraints. A linear program (LP) is the problem of maximizing a linear function $c^\top x$ subject to (38), for some $c \in \mathbb{R}^n$.

Let $A = [A_1 \cdots A_n]$. For any partition $B, N$ of $\{1, \ldots, n\}$ we write $A = [A_B\ A_N]$, $x = (x_B, x_N)$, $c^\top x = c_B^\top x_B + c_N^\top x_N$. We say $B$ is a basis of $A$ if $A_B$ is an invertible $m \times m$ matrix (which implies $|B| = m$ and $|N| = n - m = d$; this requires that $A$ has full row rank, e.g. if $I$ is part of $A$ as above). Then the following equations are equivalent for any $x \in \mathbb{R}^n$:

$$Ax = q, \qquad A_B x_B + A_N x_N = q, \qquad x_B = A_B^{-1} q - A_B^{-1} A_N x_N. \tag{39}$$

For the given basis $B$ and $x = (x_B, x_N)$, the last equation expresses how the "basic variables" $x_B$ depend on the "nonbasic variables" $x_N$ (so that $Ax = q$). The basic solution associated with $B$ is given by $x_N = \mathbf{0}$ and thus $x_B = A_B^{-1} q$. It is called feasible if $x_B \ge \mathbf{0}$.

Basic feasible solutions are the algebraic representations of the vertices of the polyhedron defined by (38), called $H$ in the following proposition; for the system (37) there is a bijection between $S$ and $H$ via $z \in S$ and $x = (w, z) \in H$ with $w = q - Cz$.

Proposition 15.
Let $H = \{ x \mid Ax = q,\ x \ge \mathbf{0} \}$ be a polyhedron where $A$ has full row rank. Then $x$ is a vertex of $H$ if and only if $x$ is a basic feasible solution to (38).

Proof. Let $x = (x_B, x_N)$ be a basic feasible solution with basis $B$ and consider the LP $\max_{x \in H} c^\top x$ for $c_B = \mathbf{0}$, $c_N = -\mathbf{1}$. Then clearly $c^\top x \le 0$ for all $x \in H$ (so this is a valid inequality for $H$) and $c^\top x = 0$ if and only if $c^\top x = c_N^\top x_N = 0$ (that is, $x_N = \mathbf{0}$), which means the only optimal solution is the basic solution $(x_B, x_N) = (A_B^{-1} q, \mathbf{0})$. Hence the face $\{ x \in H \mid c^\top x = 0 \}$ has only one point in it and is therefore a vertex. This shows every basic feasible solution is a vertex.

Conversely, suppose $\hat x$ is a vertex of $H$, that is, $\{ x \in H \mid c^\top x = q_0 \} = \{ \hat x \}$ where $c^\top x \le q_0$ is valid for $H$. Hence $\hat x$ is the unique optimal solution to the LP $\max_{x \in H} c^\top x$. If the LP has an optimal solution then it has a basic optimal solution (this can be shown similarly to the implication (d) $\Rightarrow$ (e) for Theorem 14), which equals $\hat x$.

In general, a vertex may correspond to several bases that represent the same basic feasible solution, namely when at least one basic variable is zero and can be replaced by some nonbasic variable. However, in a nondegenerate polyhedron the basis that represents a vertex is unique. This happens if and only if in any basic feasible solution $(x_B, x_N)$ to (38) with $x_N = \mathbf{0}$ we have $x_B > \mathbf{0}$, which can be shown to be equivalent to Theorem 14(e), for example, for the system (37). For the moment, we assume this nondegeneracy condition.

For a vertex $z$ of the polytope $S$ in (14), which corresponds to $x = (w, z) \in H$ with a basic feasible solution $x = (x_B, x_N)$, the binding inequalities of $z \ge \mathbf{0}$, $Cz \le \mathbf{1}$ correspond to the nonbasic variables $x_N = \mathbf{0}$ (because $x_B > \mathbf{0}$); these are exactly $d = |N|$ binding inequalities. We re-write (39) as

$$x_B = A_B^{-1} q - A_B^{-1} A_N x_N = A_B^{-1} q - \sum_{j \in N} A_B^{-1} A_j x_j =: \bar q - \sum_{j \in N} \bar A_j x_j \tag{40}$$

where $\bar q$ and $\bar A_j$ depend on the basis $B$. In the basic feasible solution, $x_N = \mathbf{0}$. In the LH algorithm as described in Section 5, the next vertex is found by allowing one of the binding inequalities to become non-binding. This means that in (40), one of the nonbasic variables $x_j$ for $j \in N$ is allowed to increase from zero to become positive. This variable is called the entering variable (about to "enter the basis"); all other nonbasic variables stay zero.
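The correspondence between bases and basic solutions in (39), and hence (by Proposition 15) between feasible bases and vertices, can be illustrated by a small sketch. It is not part of the original exposition: the function names `solve_exact`, `basic_solution` and `basic_feasible_solutions` are hypothetical, indices are 0-based, and exact rational arithmetic with Python's `fractions` module is an assumption made here to avoid rounding. The brute-force enumeration over all column sets is only meant for tiny examples.

```python
from fractions import Fraction
from itertools import combinations


def solve_exact(M, rhs):
    """Solve the square system M v = rhs exactly; return None if M is singular."""
    k = len(M)
    T = [[Fraction(a) for a in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(k):
        piv = next((r for r in range(col, k) if T[r][col] != 0), None)
        if piv is None:
            return None                      # columns are linearly dependent
        T[col], T[piv] = T[piv], T[col]
        p = T[col][col]
        T[col] = [a / p for a in T[col]]
        for r in range(k):
            if r != col and T[r][col] != 0:
                f = T[r][col]
                T[r] = [a - f * b for a, b in zip(T[r], T[col])]
    return [T[r][k] for r in range(k)]


def basic_solution(A, q, basis):
    """Basic solution for the column set `basis`: x_N = 0 and x_B = A_B^{-1} q.

    Returns None if A_B is not invertible, that is, `basis` is not a basis."""
    m, n = len(A), len(A[0])
    A_B = [[A[i][j] for j in basis] for i in range(m)]
    x_B = solve_exact(A_B, q)
    if x_B is None:
        return None
    x = [Fraction(0)] * n
    for j, value in zip(basis, x_B):
        x[j] = value
    return x


def basic_feasible_solutions(A, q):
    """All pairs (basis, x) where x is a basic solution with x >= 0."""
    m, n = len(A), len(A[0])
    result = []
    for basis in combinations(range(n), m):
        x = basic_solution(A, q, basis)
        if x is not None and min(x) >= 0:
            result.append((basis, x))
    return result


# Illustrative data (an assumption): S = {z >= 0, C z <= q} with
# C = [[1, 2], [3, 1]], q = (4, 6), written as A x = q with A = [I C].
A = [[1, 0, 1, 2],
     [0, 1, 3, 1]]
q = [4, 6]
for basis, x in basic_feasible_solutions(A, q):
    print(basis, [str(v) for v in x])
```

For this example four of the six bases are feasible, matching the four vertices of the quadrilateral $S$; the basis consisting of the second and third column gives a negative slack and is therefore not feasible.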
The current basic variables $x_B$ then change linearly as a function of $x_j$ according to the equation and constraint

$$x_B = \bar q - \bar A_j x_j \ge \mathbf{0}. \tag{41}$$

When in this equation $x_j > 0$ and $x_B > \mathbf{0}$, only $d - 1$ inequalities are binding, which defines an edge of $S$. Normally, for example if $S$ and thus $H$ is a polytope, this edge ends at another vertex which is obtained when one of the components $x_i$ of $x_B$ in (41) becomes zero when increasing $x_j$. Then $x_i$ is called the variable that leaves the basis, and the pivot step is to replace $B$ with $B \cup \{j\} - \{i\}$, which becomes the new basis that defines the new vertex. If the leaving variable $x_i$ is not unique, then at least one other basic variable $x_k$ becomes simultaneously zero with $x_i$, and is then a zero basic variable in the next basis, which means a degeneracy. Hence, for a nondegenerate polyhedron the leaving variable is unique.

The pivot step is an algebraic representation of the edge traversal. In (41), the leaving variable is determined by the constraints $\bar q_i - \bar a_{ij} x_j \ge 0$ for the components $\bar q_i$ of $\bar q$ and $\bar a_{ij}$ of $\bar A_j$, for $i \in B$. These impose a restriction on $x_j$ only if $\bar a_{ij} > 0$ (if $\bar A_j \le \mathbf{0}$ then $x_j$ in (41) can increase indefinitely, which would mean that $H$ is unbounded, which we assume is not the case). Hence, these constraints are equivalent to

$$\frac{\bar q_i}{\bar a_{ij}} \ge x_j \ge 0, \qquad \bar a_{ij} > 0, \quad i \in B. \tag{42}$$

The smallest of the ratios $\bar q_i / \bar a_{ij}$ in (42) thus determines how much $x_j$ can increase to maintain the condition $x_B \ge \mathbf{0}$ in (41). Finding that minimum is called the minimum ratio test. Moreover, that ratio is positive because the current basic feasible solution is given by $x_B = \bar q > \mathbf{0}$. The ratios in (42) have a unique minimum which determines the leaving variable.

Pivoting, the successive change from one basic feasible solution to another by exchanging one "entering" nonbasic variable for a unique "leaving" basic variable, thus represents a path of edges of the given polytope $S$. In the LH algorithm, the entering variable is chosen according to the following rule.

Algorithm 16 (Lemke-Howson with complementary pivoting).
Consider the system (37) with $q = \mathbf{1}$ as in (14).

1. Start with the basic feasible solution where $z = \mathbf{0}$, $w = q$. Choose one $k$ as missing label, which determines the first entering variable $z_k$.
2. In the pivot step, if the leaving variable is $w_k$ or $z_k$, output the current basic solution and stop. Otherwise, the leaving variable is $w_i$ or $z_i$ for $i \ne k$. Choose the complement of that variable ($z_i$ respectively $w_i$) as the new entering variable and repeat step 2.

This is the algebraic implementation of the LH algorithm. It ensures that for each $i \ne k$ at least one variable $w_i$ or $z_i$ is always nonbasic and represents a binding inequality, so that the traversed vertices and edges of $S$ have all labels except possibly $k$. Except for the endpoints of the computed path, both $w_k$ and $z_k$ are basic variables, which are positive throughout and correspond to the missing label.

Pivoting was originally invented by Dantzig (1963) for the simplex algorithm for solving an LP, where the entering variable is chosen so as to improve the current value of the objective function. This is given as $c^\top x = c_B^\top x_B + c_N^\top x_N$, and by expressing $x_B$ as a function of $x_N$ according to (39), any $x_j$ with a positive coefficient can serve as entering variable. The optimum of the LP is found when there is no such positive coefficient. Hence, the only difference between the LH and the simplex algorithm is the choice of the entering variable by the "complementarity rule" in step 2 above.

The LH algorithm, like the simplex algorithm, can be generalized to the degenerate case where basic feasible solutions may have zero basic variables. For that purpose, the right-hand side $q$ in (38) is perturbed by replacing it by $q + (\varepsilon, \varepsilon^2, \ldots, \varepsilon^m)^\top$ for some sufficiently small $\varepsilon > 0$. Then for any basis $B$, the corresponding basic solution $(x_B, x_N)$ is given by $x_N = \mathbf{0}$ and

$$x_B = A_B^{-1} q + A_B^{-1} (\varepsilon, \varepsilon^2, \ldots, \varepsilon^m)^\top \tag{43}$$

and it is feasible (that is, $x_B \ge \mathbf{0}$) if and only if

$$\text{the } m \times (1 + m) \text{ matrix } [A_B^{-1} q \ \ A_B^{-1}] \text{ is lexico-positive,} \tag{44}$$

that is, the first nonzero entry of each row of this matrix is positive. Note that $\bar q = A_B^{-1} q$ may have zero entries, but $A_B^{-1}$ cannot have an all-zero row, so (44) implies $x_B > \mathbf{0}$ for all sufficiently small positive $\varepsilon$ in (43), and thus nondegeneracy throughout. However, no actual perturbation is needed, because (44) is recognized solely from $A_B^{-1}$. Condition (44) is maintained by extending (42) to a "lexico-minimum ratio test", which determines the leaving variable uniquely (von Stengel, 2002, p. 1741). In that way, the LH algorithm proceeds uniquely even for a degenerate game, and terminates at a Nash equilibrium.

For an accurate computation of the LH steps, it is necessary to store the system (40) precisely, without the rounding errors that may occur in floating-point arithmetic. If the entries of the given bimatrix game are integers, then it is possible to store this linear system using only integers and a separate integer for the determinant of the current basis matrix $A_B$. This "integer pivoting" (see von Stengel, 2007, Section 3.5) avoids numerical errors by storing arbitrary-precision integers without the costly cancellation operations when adding fractions in rational arithmetic.

Complementary pivoting as described in Algorithm 16 has been generalized by Lemke (1965) to solve linear complementarity problems (13) for more general parameters $C$ and $q$. The system (37) is thereby extended by an additional matrix column and variable $z_0$. A first basic solution has $w = q$ and $z = \mathbf{0}$ and $z_0 = 0$, which fulfills the complementarity condition $z^\top w = 0$ but is not feasible if $q$ has negative components. Then $z_0$ enters the basis so as to obtain feasibility, with some $w_i$ as leaving variable. Then as in step 2 of Algorithm 16, the next entering variable is $z_i$, more generally the complement of the leaving variable, which is repeated until $z_0$ leaves the basis.
A number of conditions on $C$ can ensure that there is no "ray termination", that is, the "entering column" $\bar A_j$ in (41) has always at least one positive component (see Cottle, Pang, and Stone, 1992).

Most path-following methods that find an equilibrium of a two-player game can be encoded as special cases of Lemke's algorithm, such as Govindan and Wilson (2003). In von Stengel, van den Elzen, and Talman (2002) it is shown how to use it for mimicking the (linear) "Tracing Procedure" of Harsanyi and Selten (1988) that traces a path of best responses against a suitable convex combination of a "prior" mixed-strategy pair as starting point and the currently played strategies; it terminates when the weight of the prior (encoded by the variable $z_0$) becomes zero. Moreover, this algorithm can also be applied to more general strategy sets, such as the "sequence form" for extensive form games (von Stengel, 1996).

The LH algorithm finds (at least) one Nash equilibrium of a bimatrix game.
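The complementary pivoting of Algorithm 16 that finds such an equilibrium can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `lemke_howson` is hypothetical, labels are 0-based (so `k = 1` is label 2 in the 1-based numbering of the text), payoffs are assumed nonnegative with no zero column in $A$ or $B^\top$ so that the polytopes are bounded, and the game is assumed nondegenerate so that the plain minimum ratio test (42) suffices (no lexico-minimum ratio test is implemented). The tableau stores the system (37) with $C = \begin{bmatrix} 0 & A \\ B^\top & 0 \end{bmatrix}$, $z = (x, y)$ and $q = \mathbf{1}$, in exact rational arithmetic rather than the integer pivoting mentioned above.

```python
from fractions import Fraction


def lemke_howson(A, B, k=0):
    """Sketch of Algorithm 16 for an m x n bimatrix game (A, B).

    k in 0..m+n-1 is the missing label. Assumes a nondegenerate game with
    nonnegative payoffs and no zero column in A or B^T.
    Returns a mixed-strategy pair (x, y) as lists of Fractions."""
    m, n = len(A), len(A[0])
    d = m + n
    # C = [[0, A], [B^T, 0]] so that C z <= 1, z = (x, y) >= 0 describes P x Q
    C = [[Fraction(0)] * d for _ in range(d)]
    for i in range(m):
        for j in range(n):
            C[i][m + j] = Fraction(A[i][j])
            C[m + j][i] = Fraction(B[i][j])
    # Tableau rows [I  C | q]: columns 0..d-1 are w, columns d..2d-1 are z
    T = [[Fraction(int(r == c)) for c in range(d)] + C[r] + [Fraction(1)]
         for r in range(d)]
    basis = list(range(d))          # initially all w_i are basic, z = 0
    entering = d + k                # first entering variable z_k
    while True:
        # minimum ratio test (42) over rows with a positive entering entry
        rows = [r for r in range(d) if T[r][entering] > 0]
        row = min(rows, key=lambda r: T[r][-1] / T[r][entering])
        # pivot step: the entering variable becomes basic in this row
        p = T[row][entering]
        T[row] = [v / p for v in T[row]]
        for s in range(d):
            if s != row and T[s][entering] != 0:
                f = T[s][entering]
                T[s] = [a - f * b for a, b in zip(T[s], T[row])]
        leaving, basis[row] = basis[row], entering
        if leaving % d == k:        # w_k or z_k left the basis: stop
            break
        # complementarity rule: w_i <-> z_i
        entering = leaving + d if leaving < d else leaving - d
    z = [Fraction(0)] * d
    for r, var in enumerate(basis):
        if var >= d:
            z[var - d] = T[r][-1]
    x, y = z[:m], z[m:]
    sx, sy = sum(x), sum(y)
    return [v / sx for v in x], [v / sy for v in y]


# Illustrative 3 x 2 game (an assumption of this sketch)
A = [[3, 3], [2, 5], [0, 6]]
B = [[3, 2], [2, 6], [3, 1]]
x, y = lemke_howson(A, B, k=1)
print([str(v) for v in x], [str(v) for v in y])
```

For this example game and missing label `k = 1`, the path takes four pivot steps and ends at the mixed equilibrium $x = (0, \tfrac13, \tfrac23)$, $y = (\tfrac13, \tfrac23)$; with `k = 0` it ends after two steps at the pure equilibrium $x = (1, 0, 0)$, $y = (1, 0)$.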
All equilibria are found by Algorithm 4, which checks the possible support sets of an equilibrium. This can be improved by considering instead of these support sets the vertices of the labeled polytopes $P$ and $Q$ in (9).

A degenerate bimatrix game may have infinite sets of Nash equilibria. They can be described via maximal Nash subsets (Millham, 1974; Winkels, 1979; Jansen, 1981), called "sub-solutions" by Nash (1951). A Nash subset for $(A, B)$ is a nonempty product set $S \times T$ where $S \subseteq X$ and $T \subseteq Y$ so that every $(x, y)$ in $S \times T$ is an equilibrium of $(A, B)$; in other words, any two equilibrium strategies $x \in S$ and $y \in T$ are "exchangeable". The following proposition shows that a maximal Nash subset is just a pair of faces of $P$ and $Q$ that together have all labels $1, \ldots, m + n$.

Proposition 17.
Let $(A, B)$ be an $m \times n$ bimatrix game with polytopes $P$ and $Q$ in (9), and for $I, J \subseteq \{1, \ldots, m + n\}$ let $P(I)$ and $Q(J)$ be defined as in ( ). Then $(x, y) \in P \times Q - \{(\mathbf{0}, \mathbf{0})\}$, re-scaled to a mixed-strategy pair in $X \times Y$, is a Nash equilibrium if and only if for some $I$ and $J$ we have

$$(x, y) \in P(I) \times Q(J), \qquad I \cup J = \{1, \ldots, m + n\}. \tag{45}$$

Proof.
This follows from Proposition 5: (45) implies that $(x, y)$ is completely labeled and therefore a Nash equilibrium. Conversely, if $(x, y)$ is a Nash equilibrium and $I$ and $J$ are the sets of labels of $x$ and $y$ (this may increase the sets $I$ and $J$ when starting from (45)), then (45) holds.

In (45), $P(I)$ is the face of $P$ defined by the binding inequalities in $I$, and $Q(J)$ is the face of $Q$ defined by the binding inequalities in $J$. In a nondegenerate game, these faces are vertices of $P$ and $Q$. In general, they may be higher-dimensional faces such as edges. Usually, when the dimension of these faces is not too high, it is informative to describe them via the vertices of these faces, which are also vertices of $P$ or $Q$. They are usually called extreme equilibria.

Proposition 18 (Winkels, 1979; Jansen, 1981). Under the assumptions of Proposition 17, $(x, y)$ is, after re-scaling, a Nash equilibrium if and only if there is a set $U$ of vertices of $P - \{\mathbf{0}\}$ and a set $V$ of vertices of $Q - \{\mathbf{0}\}$ so that $x \in \operatorname{conv} U$ and $y \in \operatorname{conv} V$, and every $(u, v) \in U \times V$ is completely labeled.

Proof. $U$ and $V$ are just the vertices of $P(I)$ and $Q(J)$ in (45); see Avis, Rosenberg, Savani, and von Stengel (2010, Prop. 4).

Proposition 18 shows that the set of all Nash equilibria is completely described by the (finitely many) extreme Nash equilibria. Consider the bipartite graph $R$ on the vertices of $P - \{\mathbf{0}\}$ and $Q - \{\mathbf{0}\}$ whose edges are the completely labeled vertex pairs $(x, y)$, which are the extreme equilibria of $(A, B)$. The maximal "cliques" (maximal complete bipartite subgraphs) of $R$ of the form $U \times V$ then define the maximal Nash subsets $\operatorname{conv} U \times \operatorname{conv} V$, as in Proposition 18, whose union is the set of all Nash equilibria. Maximal Nash subsets may intersect, in which case their vertex sets intersect. The inclusion-maximal connected sets of Nash equilibria are the topological components.
An algorithm that outputs the extreme Nash equilibria, maximal Nash subsets, and components of a bimatrix game is described in Avis, Rosenberg, Savani, and von Stengel (2010) and available on the web at http://banach.lse.ac.uk (at the time of this writing for games of size up to 15 × 15, due to the typically exponential number of vertices that have to be checked).
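For very small games, the description of all equilibria via Propositions 17 and 18 can be illustrated by brute force. The following sketch is not the algorithm of Avis, Rosenberg, Savani, and von Stengel (2010); it simply enumerates the vertices of $P$ and $Q$ with their labels, pairs up completely labeled vertices as extreme equilibria, and then forms the maximal complete bipartite subgraphs. All function names are hypothetical, labels are 0-based, and the payoff matrices are assumed nonnegative with no zero column in $A$ or $B^\top$ so that $P$ and $Q$ are polytopes.

```python
from fractions import Fraction
from itertools import combinations


def solve_exact(M, rhs):
    """Solve the square system M v = rhs exactly; return None if M is singular."""
    k = len(M)
    T = [[Fraction(a) for a in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(k):
        piv = next((r for r in range(col, k) if T[r][col] != 0), None)
        if piv is None:
            return None
        T[col], T[piv] = T[piv], T[col]
        p = T[col][col]
        T[col] = [a / p for a in T[col]]
        for r in range(k):
            if r != col and T[r][col] != 0:
                f = T[r][col]
                T[r] = [a - f * b for a, b in zip(T[r], T[col])]
    return [T[r][k] for r in range(k)]


def labeled_vertices(G, h, labels):
    """Nonzero vertices of {v | G v <= h}, mapped to the labels of binding rows."""
    d = len(G[0])
    found = {}
    for rows in combinations(range(len(G)), d):
        v = solve_exact([G[r] for r in rows], [h[r] for r in rows])
        if v is None:
            continue
        slack = [h[r] - sum(g * c for g, c in zip(G[r], v)) for r in range(len(G))]
        if min(slack) >= 0 and any(v):       # feasible and not the zero vertex
            found[tuple(v)] = frozenset(labels[r] for r, s in enumerate(slack) if s == 0)
    return found


def normalize(v):
    total = sum(v)
    return tuple(c / total for c in v)


def extreme_equilibria(A, B):
    """Completely labeled vertex pairs of P and Q, re-scaled to mixed strategies."""
    m, n = len(A), len(A[0])
    labels = list(range(m + n))
    # P: x_i >= 0 has label i, (B^T x)_j <= 1 has label m + j
    GP = [[-int(i == r) for i in range(m)] for r in range(m)]
    GP += [[B[i][j] for i in range(m)] for j in range(n)]
    # Q: a_i y <= 1 has label i, y_j >= 0 has label m + j
    GQ = [list(A[i]) for i in range(m)]
    GQ += [[-int(j == c) for j in range(n)] for c in range(n)]
    VP = labeled_vertices(GP, [0] * m + [1] * n, labels)
    VQ = labeled_vertices(GQ, [1] * m + [0] * n, labels)
    full = set(labels)
    return [(normalize(x), normalize(y))
            for x, lx in VP.items() for y, ly in VQ.items() if lx | ly == full]


def maximal_nash_subsets(pairs):
    """Maximal bicliques (U, V) of the bipartite graph whose edges are `pairs`."""
    nbrs = {}
    for x, y in pairs:
        nbrs.setdefault(x, set()).add(y)
    closed = {frozenset(v) for v in nbrs.values()}
    grew = True
    while grew:                              # close neighbourhoods under intersection
        new = {a & b for a in closed for b in closed} - closed - {frozenset()}
        grew = bool(new)
        closed |= new
    return [({x for x in nbrs if V <= nbrs[x]}, set(V)) for V in closed]
```

A degenerate example such as the 1 × 2 game with `A = [[1, 1]]` and `B = [[1, 1]]` has two extreme equilibria that share the same strategy of player 1, and the sketch reports a single maximal Nash subset with one vertex in `U` and two vertices in `V`. The enumeration checks every choice of binding inequalities, so its running time grows exponentially with the size of the game.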
Acknowledgment
I thank Yannick Viossat for detailed comments.
References
Avis, D., G. D. Rosenberg, R. Savani, and B. von Stengel (2010). Enumeration of Nash equilibria for two-player games. Economic Theory
The Linear Complementarity Problem. Academic Press, Boston.
Dantzig, G. B. (1963). Linear Programming and Extensions. Princeton University Press, Princeton, NJ.
Gale, D., H. W. Kuhn, and A. W. Tucker (1950). On symmetric games. In: Contributions to the Theory of Games, Vol. I, edited by H. W. Kuhn and A. W. Tucker, volume 24 of Annals of Mathematics Studies, 81–87. Princeton University Press, Princeton, NJ.
Govindan, S. and R. Wilson (2003). A global Newton method to compute Nash equilibria. Journal of Economic Theory
Harsanyi, J. C. and R. Selten (1988). A General Theory of Equilibrium Selection in Games. MIT Press, Cambridge, MA.
Jansen, M. J. M. (1981). Maximal Nash subsets for bimatrix games. Naval Research Logistics Quarterly
𝑛 × 𝑛 bimatrix game. Games and Economic Behavior
Applied Mathematics and Computation
Management Science
Journal ofthe Society for Industrial and Applied Mathematics
Understanding and Using Linear Programming . Springer,Berlin.Millham, C. (1974). On Nash subsets of bimatrix games.
Naval Research Logistics Quarterly
Annals of Mathematics
Econometrica
Mathematical ProgrammingStudy 1: Pivoting and Extensions,
Stability and Perfection of Nash Equilibria. Springer-Verlag, Berlin, second edition.
von Stengel, B. (1996). Efficient computation of behavior strategies. Games and Economic Behavior
Handbook of Game Theory with Economic Applications, edited by R. J. Aumann and S. Hart, volume 3, 1723–1759. North-Holland, Amsterdam.
von Stengel, B. (2007). Equilibrium computation for two-player games in strategic and extensive form. In: Algorithmic Game Theory, edited by N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani, 53–78. Cambridge University Press, Cambridge, UK.
von Stengel, B. (2021). Game Theory Basics. Cambridge University Press, Cambridge, UK.
von Stengel, B., A. van den Elzen, and D. Talman (2002). Computing normal form perfect equilibria for extensive two-person games. Econometrica
Game Theory and Related Topics, edited by O. Moeschlin and D. Pallaschke, 137–148. North-Holland, Amsterdam.