Kuhn Poker with Cheating and Its Detection
Amanda Metzner and Daniel Zwillinger
BAE Systems FAST Labs, 600 District Ave., Burlington, MA 01803
Abstract: Poker is a multiplayer game of imperfect information and has been widely studied in game theory. Many popular variants of poker (e.g., Texas Hold’em and Omaha) at the edge of modern game theory research are large games. However, even toy poker games, such as Kuhn poker, can pose new challenges. Many Kuhn poker variants have been investigated: varying the number of players, initial pot size, and number of betting rounds. In this paper we analyze a new variant: Kuhn poker with cheating and cheating detection. We determine how cheating changes the players’ strategies and derive new analytical results.
Index Terms: game theory, Kuhn poker, cheating
I. INTRODUCTION
Poker is one of the most popular games studied by game theory researchers. The complex nature of the game provides challenging problems; however, its complexity and size also make poker a difficult game to study. Kuhn poker, a toy poker game, poses theoretical challenges at a much more manageable size. Multiple variants of Kuhn poker have been studied and their optimal strategies determined, varying the number of players, the number of betting rounds, the initial pot size, etc. [1]. A variant that has not been studied previously is the incorporation of cheating and its detection. What happens when one or both players cheat? How does cheating alter the players’ strategies?
II. PRELIMINARIES
A. Game Theory
Game theory is a theoretical framework for understanding social interactions among competing players [2]. In some regards, game theory is the science of strategy: the optimal decision-making of independent and competing agents in a strategic setting. The game itself serves as a model of interactive decision making among rational players.
Game theory terms:
• Players: the decision-makers in the game
• Strategy: a player’s plan of action
• Payoff: the payout a player receives from a particular outcome
• Information set: the information available to a player at a game decision point
In this paper we study Kuhn poker with the incorporation of cheating and its detection. The goal is to determine how cheating changes the players’ strategies.
Approved for public release; unlimited distribution. Not export controlled per ES-FL-102220-0153.
B. Kuhn Poker
The rules of Kuhn poker are simple. There are two players using a three-card deck (the cards are called K, Q, and J for King, Queen, and Jack). Each player antes $1 at the beginning of the game. Each player is dealt one card. Player 1 goes first. There is a single betting round, with a maximum of one bet of $1. If no player folds, then the player with the higher card wins (K beats Q and J, while Q beats J).
Kuhn poker is asymmetric: the first player is at a disadvantage, with an expected payoff [3] (under optimal play by both sides) of −1/18 dollars. Section IV-A explains this in more detail.
Note that optimal play of classical Kuhn poker incorporates bluffing. For example, in Kuhn poker (with no cheating), if player 1 is dealt a J then, with some probability, player 1 bets even though a J loses to whichever card player 2 holds. The reason is simple: if player 1 bets with a J, and player 2 has a Q, then player 2 may “think” player 1 has a K, and fold.
III. INCORPORATING CHEATING AND ITS DETECTION
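As a baseline for the variants introduced in this section, the classic rules of Section II-B can be written out explicitly. The following is a minimal sketch, not the authors' code; the history labels (`c` check, `b` bet, `f` fold, `k` call) and the function name are assumptions made for illustration.

```python
from itertools import permutations

# Card ranks: K beats Q and J, Q beats J.
RANK = {"J": 0, "Q": 1, "K": 2}

# The 6 equally likely ways to deal one card to each player.
DEALS = list(permutations("KQJ", 2))

# Terminal betting sequences (labels are hypothetical):
# c = check, b = bet, f = fold, k = call.
TERMINALS = ("cc", "bf", "bk", "cbf", "cbk")


def payoff_to_player1(card1, card2, history):
    """Net dollars won by player 1 (each player antes $1, at most one $1 bet)."""
    sign = 1 if RANK[card1] > RANK[card2] else -1
    if history == "cc":           # both check: showdown for the antes
        return sign
    if history == "bf":           # player 1 bets, player 2 folds
        return 1
    if history == "cbf":          # player 1 checks, player 2 bets, player 1 folds
        return -1
    if history in ("bk", "cbk"):  # a bet is called: showdown for ante plus bet
        return 2 * sign
    raise ValueError(f"not a terminal history: {history!r}")
```

Because the game is zero sum, the payoff to player 2 is the negative of the value returned here.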
There are many ways in which cheating could occur in Kuhn poker. For example, a player could shuffle the deck, or otherwise arrange it, so that the 6 possible two-card deals are not all equally likely. The cheating variation that we analyze has either player, or both players, look at the third (face-down) card.
If either player looks at the face-down card, then that player knows who has the best hand. For the cheating player, it is a game of complete information. Note that cheating alone will not cause a player to win a hand; they may still hold a losing hand (e.g., the J). Note also that bluffing against a cheating player will not be successful.
We analyzed several variations of this way of cheating:
1) Classic (no cheating by either player)
2) Player 1 (alone) cheats probabilistically
3) Player 2 (alone) cheats probabilistically
4) Players 1 and 2 both cheat probabilistically
5) Players 1 and 2 both cheat probabilistically and they both detect cheating probabilistically.
Probabilistic detection of cheating was also incorporated into the game. For example, “Only 10% of the time will player 1 determine if player 2 has cheated.” The reason for probabilistic detection is that there could be a cost associated with detection, so a player may only check for cheating some of the time. For a game that incorporates cheating detection, the rules differ slightly:
• If one player cheats and the other player detects it, then the round is played out as usual with betting and folding, and then the detector wins, independent of the cards dealt or how the game played out.
• If both players cheat and both players detect that the other is cheating, then the round is played out as usual with betting and folding, and then neither player wins, independent of the cards dealt or how the game played out. In this case the monies bet are returned.

TABLE I
OPTIMAL PLAY STRATEGIES FOR PLAYER 1 WHEN BOTH PLAYERS PLAY FAIRLY (NO CHEATING)
             initial play             response to a bet
card dealt   prob bet   prob check    prob call   prob fold
K            3a         1 − 3a        1           0
Q            0          1             1/3 + a     2/3 − a
J            a          1 − a         0           1

TABLE II
OPTIMAL PLAY STRATEGIES FOR PLAYER 2 WHEN BOTH PLAYERS PLAY FAIRLY (NO CHEATING)
             response to a check      response to a bet
card dealt   prob bet   prob check    prob call   prob fold
K            1          0             1           0
Q            0          1             1/3         2/3
J            1/3        2/3           0           1

IV. GAME COMPUTATION
A. Kuhn Poker Analytical Computation
For classical Kuhn poker there is a range of optimal strategies for player 1, all yielding the same optimal result [3]. While the strategy for player 2 is fixed, the optimal strategy for player 1 depends on a parameter “a”, which can have any value between 0 and 1/3, inclusive.
If player 1 is dealt a J, then player 1 bets (bluffs) with probability a; if player 1 is dealt a K, then player 1 bets with probability 3a. A complete description of both players’ optimal strategies is contained in Tables I and II; we call this the “fair strategy.”
Table III shows that the payoff to player 1 using the optimal strategy (for any value of “a”) is −1/18. Leftmost in Table III are the 6 equally likely ways that two cards can be dealt. Adjacent to that are the expected amounts, weighted by the 1/6 probability of the deal, that each player wins for that deal. We presume the first player never folds as their initial play, since there is no benefit to this. The result shows that player 1 expects to win −1/18 (that is, a loss) each play of the game, and the result does not depend on the value of “a.”

TABLE III
THE VALUE OF THE GAME TO PLAYER 1 WHEN BOTH PLAYERS ARE PLAYING FAIRLY
cards dealt
player 1   player 2   player 1 winnings   player 2 winnings
K          J          2/9 − a/6
K          Q          1/6 + a/6
Q          J          4/27 + a/9          1/27 − a/18
Q          K                              2/9 + a/6
J          K                              1/6 + a/6
J          Q          a/9                 1/6 − a/18
net winnings to player 1: −1/18

Games 2 and 3 (defined in the last section) have one player cheating and the other player unaware of this cheating. In this case the natural question is “How much does the cheating player benefit?” The analysis is shown in the following tables, where new optimal strategies were determined for the cheating player:
• Table IV, where player 2 cheats 100% of the time, and player 1 plays their fair strategy
• Table V, where player 1 cheats 100% of the time, and player 2 plays their fair strategy

TABLE IV
THE VALUE OF THE GAME TO PLAYER 1 WHEN PLAYER 2 CHEATS AND PLAYER 1 PLAYS FAIRLY
cards dealt
player 1   player 2   player 1 winnings   player 2 winnings
K          J          /
K          Q          /
Q          J          / + a /             / − a /
Q          K                              / + a /
J          K                              /
J          Q                              /
net winnings to player 1: − / − a /

TABLE V
THE VALUE OF THE GAME TO PLAYER 1 WHEN PLAYER 1 CHEATS AND PLAYER 2 PLAYS FAIRLY
cards dealt
player 1   player 2   player 1 winnings   player 2 winnings
K          J          /
K          Q          /
Q          J          /
Q          K                              /
J          K                              /
J          Q          /                   /
net winnings to player 1: /

When player 2 cheats, player 1 expects to win − / − a / (that is, a loss, for any value of “a”) on each play of the game. While the value of “a” did not make a difference in the fair game (with neither side cheating), it does make a difference when player 2 cheats. Hence, to minimize the loss against a possibly cheating opponent, player 1 should always play their optimal strategy with a = 0. When using a = 0, player 1 wins − / (that is, a loss). This is a change (loss) of / compared to the fair game.
Alternately, player 1 could cheat and player 2 could be unaware of this. In this case (Table V), player 1 now expects to win / (that is, a gain) each play of the game. This is a change (increase) of / compared to the fair game.
We conclude that player 2 is more motivated to cheat than player 1 is. That is, the benefit to player 2, when player 2 cheats, exceeds the benefit to player 1, when player 1 cheats.
B. Kuhn Poker Programmatic Computation
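As a small programmatic cross-check of the fair game before any solver is involved, the expected payoff to player 1 can be evaluated directly from the fair strategies. The sketch below is illustrative rather than the authors' computation: it assumes the standard equilibrium probabilities for classical Kuhn poker (player 1 bets K with probability 3a and J with probability a, and calls with Q with probability a + 1/3; player 2 follows the fixed strategy of Table II), and the function name `fair_value` is hypothetical. Exact rational arithmetic is used so that the independence from “a” is visible without rounding.

```python
from fractions import Fraction as F
from itertools import permutations

RANK = {"J": 0, "Q": 1, "K": 2}


def fair_value(a):
    """Expected payoff to player 1 when both players use their fair strategies."""
    a = F(a)
    # Player 1: probability of betting first, and of calling after check-bet.
    p1_bet = {"K": 3 * a, "Q": F(0), "J": a}
    p1_call = {"K": F(1), "Q": a + F(1, 3), "J": F(0)}
    # Player 2: probability of betting after a check, and of calling a bet.
    p2_bet = {"K": F(1), "Q": F(0), "J": F(1, 3)}
    p2_call = {"K": F(1), "Q": F(1, 3), "J": F(0)}

    total = F(0)
    for c1, c2 in permutations("KQJ", 2):
        sign = 1 if RANK[c1] > RANK[c2] else -1
        # Player 1 bets: player 2 calls (showdown for 2) or folds (player 1 wins 1).
        after_bet = p2_call[c2] * 2 * sign + (1 - p2_call[c2]) * 1
        # Player 1 checks, player 2 bets: player 1 calls or folds (loses 1).
        after_check_bet = p1_call[c1] * 2 * sign + (1 - p1_call[c1]) * (-1)
        # Player 1 checks: player 2 bets or checks (showdown for 1).
        after_check = p2_bet[c2] * after_check_bet + (1 - p2_bet[c2]) * sign
        total += F(1, 6) * (p1_bet[c1] * after_bet + (1 - p1_bet[c1]) * after_check)
    return total


for a in (F(0), F(1, 5), F(1, 3)):
    print(a, fair_value(a))
```

Under these assumptions every admissible value of “a” gives the same result, which is the behaviour described for Table III.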
The optimal strategy for Kuhn poker can also be solved programmatically using software tools such as Gambit [4]. Gambit is an open source library of game theory tools for the construction and analysis of finite extensive and strategic games. The Gambit GUI displays the extensive tree form of the game as shown in Figure 1. Some information about this tree:
• The player actions are shown on the tree edges.
• The top/root node in green is a chance node; it has 6 links representing the 6 possible card dealings.
• Player 1 actions are in red, player 2’s are in blue.
• Figure 1 displays the information sets (ISs) as dotted lines; these are the states that a player cannot distinguish between with the available knowledge.
• Kuhn poker has 24 decision nodes, of which 12 are distinct after IS consideration.
Fig. 1. Game tree representation in Gambit
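The node and information-set counts quoted for the tree can be reproduced by direct enumeration. This is a minimal sketch independent of Gambit; the history encoding (`""` for the start, `c` for a check, `b` for a bet) is an assumption made for illustration.

```python
from itertools import permutations

# Non-terminal betting histories and the player who acts there.
# "" = start, "c" = player 1 checked, "b" = player 1 bet, "cb" = check then bet.
ACTING_PLAYER = {"": 1, "c": 2, "b": 2, "cb": 1}

decision_nodes = []
information_sets = set()
for card1, card2 in permutations("KQJ", 2):      # the 6 deals below the chance node
    for history, player in ACTING_PLAYER.items():
        decision_nodes.append((card1, card2, history))
        # A player sees only their own card and the betting so far.
        own_card = card1 if player == 1 else card2
        information_sets.add((player, own_card, history))

print(len(decision_nodes), len(information_sets))
```

Nodes that share the same acting player, own card, and betting history collapse into one information set, which is why the second count is half of the first.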
After the game was created in extensive form, Gambit solved it. In the fair case (no cheating), the results are the same as the solution previously shown in Table I, with “a” equal to zero.
The way to implement our extensions to the classic version of Kuhn poker is to build on the tree in Figure 1. Specifically, a game in which only player 1 cheats probabilistically is shown in Figure 2. For this figure:
• P1C is the path from the root node when player 1 cheats (with probability p).
• P1N is the path from the root node when player 1 does not cheat (with probability 1 − p).
• Each boxed image at the bottom of the tree in Figure 2 represents all of Figure 1.
Fig. 2. Representation of Kuhn poker when player 1 cheats probabilistically
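For the non-adaptive setting of Section IV-A, where player 2 keeps the fixed fair strategy of Table II, the two branches of Figure 2 can be evaluated separately and then mixed with weights p and 1 − p. The sketch below makes that concrete under stated assumptions: player 2 uses the standard fair probabilities, a cheating player 1 sees both cards and picks the best action at every decision, and the fair branch is worth −1/18. The function names are hypothetical, and the numbers the sketch produces follow from these assumptions rather than being taken from the paper's tables.

```python
from fractions import Fraction as F
from itertools import permutations

RANK = {"J": 0, "Q": 1, "K": 2}

# Player 2's fair strategy (assumed standard values).
P2_BET = {"K": F(1), "Q": F(0), "J": F(1, 3)}    # bet after a check
P2_CALL = {"K": F(1), "Q": F(1, 3), "J": F(0)}   # call a bet

FAIR_VALUE = F(-1, 18)  # value of the fair game to player 1


def cheating_value_vs_fair_p2():
    """Player 1 sees both cards and best-responds to the fair player 2."""
    total = F(0)
    for c1, c2 in permutations("KQJ", 2):
        sign = 1 if RANK[c1] > RANK[c2] else -1
        # Option 1: bet. Player 2 calls with its fair probability, else folds.
        bet = P2_CALL[c2] * 2 * sign + (1 - P2_CALL[c2]) * 1
        # Option 2: check. Facing a bet, player 1 takes the better of call and fold.
        facing_bet = max(2 * sign, -1)
        check = P2_BET[c2] * facing_bet + (1 - P2_BET[c2]) * sign
        total += F(1, 6) * max(bet, check)
    return total


def mixed_value(p):
    """Value to player 1 when cheating with probability p against a fair player 2."""
    p = F(p)
    return p * cheating_value_vs_fair_p2() + (1 - p) * FAIR_VALUE


for p in (F(0), F(1, 2), F(1)):
    print(p, mixed_value(p))
```

The value is linear in p here only because player 2's strategy is held fixed; once player 2 adapts, as in the Gambit solutions discussed later, that linearity need not hold.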
There are lines drawn between the boxed images; these represent the IS connections. For example, player 2, who is not cheating, cannot determine from the information available to them whether they are on the left or right branch of the tree in Figure 2. Hence, the actions taken by player 2 when player 1 bets cannot differ between the two branches. Stated differently, player 2 does not know if player 1 is cheating or not.
The analogous game when only player 2 cheats is obtained by replacing the labels in Figure 2 from “P1C(p)” to “P2C(q)” (that is, player 2 cheats with probability q) and from “P1N(1 − p)” to “P2N(1 − q)”. To implement a game where both players 1 and 2 cheat probabilistically, we merely need to extend the height of the game tree one more level, as shown in Figure 3.
Fig. 3. Representation of Kuhn poker when players 1 and 2 both cheat probabilistically
In this case, the ISs are connected across all four sub-figures of the bottom layer, depending on what a player knows when it is their turn. Note that the same result is obtained if the player 2 options are followed by the player 1 options (as shown in Figure 3) or if the order is reversed and the first options are for player 1. Adding cheating detection is performed in the same manner as adding cheating.
Figure 4 adds cheating detection by player 1. In this case, player 1 detects cheating on the left branch (case “P1D”, which occurs with probability r) and does not on the right branch (case “P1F”, which occurs with probability 1 − r). Once again, there are horizontal lines connecting the ISs. Additionally, when we incorporated cheating detection, we also changed the game payoffs. We implemented the rule that a player caught cheating always lost, even if the other player had the lower card. When both players cheat and both are caught, it is a draw.
Fig. 4. Representation of Kuhn poker when both players cheat and player 1 can detect cheating
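The modified payoff rule can be sketched as a small settlement function applied after the round has been played out. This is an illustration under assumptions, not the authors' implementation: the text does not state how much the detector wins, so the sketch assumes the caught player forfeits whatever they put into the pot, and the function and parameter names are hypothetical.

```python
def settle(normal_payoff_p1, p1_in, p2_in,
           p1_cheats, p2_cheats, p1_detects, p2_detects):
    """Payoff to player 1 after applying the cheating-detection rules.

    normal_payoff_p1: what player 1 would have won under ordinary play.
    p1_in, p2_in: dollars each player put into the pot (ante plus any bet).
    Assumption: a caught cheater forfeits their own contribution to the pot.
    """
    p1_caught = p1_cheats and p2_detects
    p2_caught = p2_cheats and p1_detects
    if p1_caught and p2_caught:
        return 0                 # both caught: monies are returned
    if p2_caught:
        return p2_in             # player 1 is the detector and wins
    if p1_caught:
        return -p1_in            # player 2 is the detector and wins
    return normal_payoff_p1      # nobody caught: ordinary outcome stands


# Player 1 holds J, bets, and is called by player 2 holding K (ordinarily -2),
# but player 2 cheated and player 1 detected it.
example = settle(-2, 2, 2, False, True, True, False)
print(example)
```

Undetected cheating leaves the ordinary payoff unchanged, so the benefit of cheating in this sketch comes only through the better decisions made during play.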
Unsurprisingly, we can also create the tree representing the game when each player cheats probabilistically and each player can also probabilistically detect cheating by the other player, as shown in Figure 5. We solved this game using Gambit; the tree has approximately 1000 nodes when both cheating and cheating detection are incorporated.
Fig. 5. Representation of Kuhn poker when both players cheat and both players can detect cheating
We used Python to create the tree in Figure 5 and to specify all the information sets. Gambit determined the optimal game strategy and the corresponding payoffs, using the linear program algorithm.
The Gambit calculations for the payoff to player 1, when only one player cheats, are shown in Tables VI and VII. Unlike the analytical calculations shown in Section IV-A, the programmatic calculations include players adapting to cheating. For example, player 1 may not know that player 2 is cheating, but based on player 2’s actions, player 1 changes their strategy with respect to the fair play strategy. Therefore, the incorporation of cheating also influences the non-cheating player’s game play.
When player 2 cheats and player 1 adapts to the cheating, player 1 expects to win − / ; this is a change (loss) of / compared to the fair game. When player 1 cheats and player 2 adapts to the cheating, player 1 expects to win / ; this is a change (gain) of / compared to the fair game. In this case, when players adapt to cheating, player 1 is more motivated than player 2 to cheat. This finding is the opposite of what was seen in Section IV-A, where player 2 was more motivated to cheat.

TABLE VI
THE VALUE OF THE GAME TO PLAYER 1 WHEN PLAYER 2 CHEATS AND PLAYER 1 ADAPTS
cards dealt
player 1   player 2   player 1 winnings   player 2 winnings
K          J          1
K          Q          1
Q          J          /                   /
Q          K                              /
J          K                              1
J          Q                              1
net winnings to player 1: − /

TABLE VII
THE VALUE OF THE GAME TO PLAYER 1 WHEN PLAYER 1 CHEATS AND PLAYER 2 ADAPTS
cards dealt
player 1   player 2   player 1 winnings   player 2 winnings
K          J          1
K          Q          /
Q          J          1
Q          K                              1
J          K                              1
J          Q          /                   /
net winnings to player 1: /

V. RESULTS
We investigated the effects of cheating and its detection on the 5 games described in Section III. Game 1 is the classic, fair play, version of Kuhn poker. This version has been well studied, and our numerical results reproduce the known results for the parameter value a = 0.
Games 2–4 focus primarily on incorporating cheating into Kuhn poker. We analyzed these cases for cheating probabilities ranging from 0 (never cheat) to 1 (always cheat) for each player.
In this analysis, the game was fully adaptive; equivalently, each player knew the probability of the other player’s cheating. The computational results are shown in Figure 6, for which the axes are the probability that each player cheats and the height is the expected payoff to player 1.
There are several observations:
1) The payoff surface has several bi-linear patches; we have not yet been able to explain this analytically.
2) The 4 marked corners are special cases; the values below are the expected payoff to player 1:
a) Point 1: fair game, neither player cheats: −1/18
b) Point 2: player 1 cheats, player 2 does not: /
c) Point 3: player 2 cheats, player 1 does not: − /
d) Point 4: both players cheat: 0
3) When both players cheat (Point 4), the payoff surface is flat and equal to zero for a range of cheat values for both players 1 and 2. For example, when both players cheat 90% of the time, the expected payoff is zero. It is surprising that if either player changes their likelihood of cheating from 90% to 89% or 91%, there is no change in the payoff.
Fig. 6. The value of the game to player 1 when both players 1 and 2 are probabilistically cheating
Fig. 7. The value of the game to player 1 when both players are probabilistically detecting
Note that the payoff to player 1, when only one player cheats, is different from the payoff in Section IV-A. Section IV-A had one player cheating while the other naively played fairly. Here, there is full adaptation to the cheating (“I know that you are cheating p% of the time.”).
Game 5 incorporated both cheating and detecting cheating. We implemented multiple iterations of game 5, varying the probability of cheating and the probability of detecting cheating. The version of most interest was when both players always cheated and both players’ detection probabilities varied from 0 to 1. In this case, player 1 payoffs are shown in Figure 7. The axes represent the probability of each player detecting cheating and the payoffs are color coded (higher payoffs = yellow, lower payoffs = blue). The figure shows that player 1 benefits more from detecting cheating than player 2 does. Even in a fair game, player 1 is at a disadvantage by betting first. Therefore, player 1 has more to gain by detecting cheating.
VI. CONCLUSION
Kuhn poker is a toy poker game involving two players and three cards; it is zero sum. Each player is dealt one card, while the third card is face down. Classic Kuhn poker is a game of imperfect information, with each player knowing only their own card. In this paper we analyzed Kuhn poker when the game allowed cheating: one or both players peeked at the face-down card.
In the analytical analysis, we assumed only one player was cheating and the other player was using the “fair” (non-cheating) strategy; this is the non-adaptive approach. In this case player 2 was more motivated to cheat than player 1. We also analyzed cheating in a fully adaptive situation, where the non-cheating player knew how likely the cheating player was to cheat. In this case, player 1 was more motivated to cheat than player 2. The main results are (the values below are the expected payoff to player 1):
1) When neither player cheats: −1/18
2) When player 1 cheats (alone, non-adaptive): /
3) When player 1 cheats (alone, adaptive): /
4) When player 2 cheats (alone, non-adaptive): − /
5) When player 2 cheats (alone, adaptive): − /
6) If both players cheat “enough”: 0
The incorporation of cheating detection within Kuhn poker was also analyzed. Either player could probabilistically detect if the other player was cheating. In this case, when one player always cheats and the other player detects cheating probabilistically, player 1 gains more by detecting than player 2 does. This is due to the fact that Kuhn poker is asymmetric: player 1 is at a disadvantage by having to bet first.
REFERENCES
[1] Billingham, John. “Equilibrium solutions of three player Kuhn poker with N > 3 cards: A new numerical method using regularization and arc-length continuation.” arXiv:1802.04670. 2018.
[2] Ross, Don. “Game Theory.” The Stanford Encyclopedia of Philosophy (Winter 2019 Edition), Edward N. Zalta (ed.). https://plato.stanford.edu/archives/win2019/entries/game-theory. Accessed April 2020.
[3] Kuhn, H. W. “A simplified two-person poker.” Contributions to the Theory of Games. 1950. pp 97–103.
[4] “An Overview of Gambit.” An Overview of Gambit - Gambit 16.0.1 Documentation, gambitproject.readthedocs.io/en/latest/intro.html. Accessed May 2020.