Convergence of Bayesian Nash Equilibrium in Infinite Bayesian Games under Discretization

Abstract

We prove the existence of a Bayesian Nash equilibrium (BNE) in general-sum Bayesian games with continuous types and finite actions, under the conditions that the utility functions and the prior type distribution are continuous with respect to the players' types. Moreover, there exists a sequence of discretized Bayesian games whose BNE strategies converge weakly to a BNE strategy of the infinite Bayesian game. Our proof establishes a connection between the equilibria of the infinite Bayesian game and those of its finite approximations, which leads to an algorithm that constructs an $\varepsilon$-BNE of an infinite Bayesian game by discretizing the players' type spaces.

arXiv preprint [cs.GT]

Linan Huang and Quanyan Zhu
Department of Electrical and Computer Engineering, Tandon School of Engineering, New York University, Brooklyn, NY 11201 USA
Email: {lh2328, qz494}@nyu.edu

I. INTRODUCTION

Bayesian games [6] have found wide application in auctions [12], wireless networks [1], cybersecurity [7], [8], and robotic systems [9]. In these applications, it is natural to model incomplete information, such as the players' bids in auction theory, as a continuous random variable. However, existing computational techniques mainly target finite Bayesian games, where the action and type spaces are both finite. For Bayesian games with continuous types, the equilibrium is usually computed under restrictive assumptions. For example, [3] focuses on the single crossing condition, and the authors of [5] restrict the type distribution to be piecewise linear with some prior domain knowledge of a qualitative model. Iterative methods and learning have also been applied. The authors of [18] focus on piecewise uniform type distributions and payoffs that are linear functions of the players' types and actions, and they apply iterated best response to compute the BNE. The authors of [17] restrict each player's utility to be independent of the other players' types and develop a fictitious play algorithm to learn a pure-strategy equilibrium.

In this paper, we consider general Bayesian games with continuous types and prove the existence of BNE in these games. Compared with previous works (see, e.g., [14], [4]) that prove the existence of BNE in infinite Bayesian games, we further prove that there exists a sequence of discretized Bayesian games whose BNE strategies converge weakly to a BNE strategy of the infinite Bayesian game. Our proof further implies an algorithm to approximate the BNE of infinite Bayesian games by discretization. The convergence of equilibrium strategies under discretization or sampling has been shown in complete-information games with continuous actions [15], signaling games of certain classes [13], and infinite Bayesian Stackelberg games [11]. The authors of [2] define a new concept of constrained strategic equilibrium (CSE) for Bayesian games and propose sufficient conditions under which a sequence of CSEs converges toward a BNE. However, the convergence of BNE has not been shown in simultaneous-move Bayesian games with continuous types.

After a proper reformulation, we obtain the BNE in its distributional form, which enables us to adopt the key idea from [15]. Following an argument similar to that in [15], our results for two-player general-sum infinite Bayesian games can be directly extended to the $N$-player case. Since there exists a one-to-one mapping from any set with the cardinality of the continuum to the unit interval $[0,1]$ (i.e., $\mathbb{R}^n$ and $[0,1]$ have the same cardinality), we can directly extend the convergence theorem to any compact joint type space of higher dimensions.

II. BAYESIAN GAMES WITH CONTINUOUS TYPES

We consider the following Bayesian game
$$\Gamma := \langle \mathcal{X}, \mathcal{Y}, \Theta_1 \times \Theta_2, b(\cdot), \{\bar{u}^{x,y}(\cdot), \bar{v}^{x,y}(\cdot)\}_{x \in \mathcal{X}, y \in \mathcal{Y}} \rangle$$
with a compact joint type space $\Theta_1 \times \Theta_2$ and two finite action spaces $\mathcal{X} := \{x_1, \ldots, x_L\}$ and $\mathcal{Y} := \{y_1, \ldots, y_H\}$; i.e., the first and the second player have $L$ and $H$ actions to choose from and simultaneously take actions $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, respectively. The incomplete information of the game is represented by two single-dimensional continuous random variables $\tilde{\theta}_1 \in \Theta_1$, $\tilde{\theta}_2 \in \Theta_2$ whose joint distribution $b$ is assumed to be common knowledge and continuous over the joint type space $\Theta_1 \times \Theta_2$. We require the marginal distributions to be positive, i.e., $\bar{b}_i(\theta_i) := \int_{\Theta_j} b(\theta_i, \theta_j)\, d\theta_j > 0$, $\forall i \in \{1,2\}$, $\forall \theta_i \in \Theta_i$, and take $\Theta_1 = \Theta_2 = [0,1]$ without loss of generality. Player $i$ privately observes his type realization $\theta_i \in \Theta_i$ and knows that the other player $j$ has a type $\theta_j \in \Theta_j$ with probability density $b_i(\theta_j \mid \theta_i) := b(\theta_i, \theta_j)/\bar{b}_i(\theta_i) \in \mathbb{R}_0^+$. Then, $b_i$ is a valid conditional probability density, and we have $\int_0^1 b_i(\theta_j \mid \theta_i)\, d\theta_j = 1$, $\forall \theta_i \in \Theta_i$.

The utility functions $\bar{u}^{x,y}(\theta_1, \theta_2) \in \mathbb{R}_0^+$ and $\bar{v}^{x,y}(\theta_1, \theta_2) \in \mathbb{R}_0^+$ of the first and the second player, respectively, depend on the players' actions $x \in \mathcal{X}$, $y \in \mathcal{Y}$ and types $\theta_1 \in \Theta_1$, $\theta_2 \in \Theta_2$. We further assume that both players' utility functions are continuous over the joint type set $\Theta_1 \times \Theta_2$ for all actions $x \in \mathcal{X}$, $y \in \mathcal{Y}$. Since a continuous function on a compact metric space is bounded and uniformly continuous, both players' utility functions are bounded and uniformly continuous over the joint type set. Therefore, we can assume non-negative utility functions without loss of generality: boundedness guarantees that we can always add a sufficiently large constant to make them non-negative without changing the equilibrium policy.

The behavioral strategies $\sigma_1 : \Theta_1 \mapsto \Delta\mathcal{X}$ and $\sigma_2 : \Theta_2 \mapsto \Delta\mathcal{Y}$ of the first and the second player, respectively, map each player's type to a distribution over his action space. In particular, we denote by $\sigma_1(x \mid \theta_1) \in \mathbb{R}_0^+$ (resp. $\sigma_2(y \mid \theta_2) \in \mathbb{R}_0^+$) the probability of player 1 (resp. player 2) taking action $x \in \mathcal{X}$ (resp. action $y \in \mathcal{Y}$) when his type is $\theta_1 \in \Theta_1$ (resp. $\theta_2 \in \Theta_2$). Obviously, $\sum_{x \in \mathcal{X}} \sigma_1(x \mid \theta_1) = 1$, $\forall \theta_1 \in \Theta_1$, and $\sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2) = 1$, $\forall \theta_2 \in \Theta_2$. Define the two players' expected utilities under any strategy pair $(\sigma_1, \sigma_2)$ as
$$\begin{aligned}
r_1(\theta_1, \sigma_1, \sigma_2) &:= \int_0^1 b_1(\theta_2 \mid \theta_1) \sum_{x \in \mathcal{X}} \sigma_1(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2)\, d\theta_2, \\
r_2(\theta_2, \sigma_1, \sigma_2) &:= \int_0^1 b_2(\theta_1 \mid \theta_2) \sum_{x \in \mathcal{X}} \sigma_1(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1.
\end{aligned} \tag{1}$$
For player $i$ of type $\theta_i \in \Theta_i$, his best response strategy $\sigma_i^*(\cdot \mid \theta_i)$ with respect to the other player's strategy $\sigma_j$ belongs to a set $B_i(\theta_i, \sigma_j)$, i.e.,
$$\sigma_i^*(\cdot \mid \theta_i) \in B_i(\theta_i, \sigma_j) := \arg\max_{\sigma_i(\cdot \mid \theta_i)} r_i(\theta_i, \sigma_i, \sigma_j). \tag{2}$$
For any given policy $\sigma_j$ of the other player $j$, player $i$'s best response set $B_i(\theta_i, \sigma_j)$ under type $\theta_i$ is nonempty and contains a pure policy, as shown in Lemma 1 for player 1. An analogous statement holds for player 2.

Lemma 1 (Pure Policy in Best Response Set). If the second player's strategy $\sigma_2$ is common knowledge, then player 1's best response set $B_1(\theta_1, \sigma_2)$ under any $\theta_1 \in \Theta_1$ contains the following pure policy:
$$\arg\max_{x \in \mathcal{X}} \int_0^1 b_1(\theta_2 \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2)\, d\theta_2.$$

A strategy pair constitutes a BNE if the two strategies are best responses to each other; Definition 1 makes this precise.
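Lemma 1 can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it approximates the integral in Lemma 1 with a midpoint rule and returns the maximizing action index. The function name `pure_best_response`, the callables `sigma2`, `u_bar`, `b`, and the toy game at the bottom are all hypothetical and introduced only for illustration.

```python
import numpy as np

def pure_best_response(theta1, sigma2, u_bar, b, n_grid=200):
    """Approximate player 1's pure best response at type theta1 (Lemma 1).

    Assumed (hypothetical) interfaces:
      sigma2(theta2)       -> length-H probability vector of player 2
      u_bar(theta1, theta2) -> (L, H) payoff matrix of player 1
      b(theta1, theta2)    -> joint type density on [0, 1]^2
    """
    # Midpoint quadrature nodes for theta2 in [0, 1].
    theta2 = (np.arange(n_grid) + 0.5) / n_grid
    # Discretized conditional density b_1(theta2 | theta1), normalized to sum to 1.
    weights = np.array([b(theta1, t2) for t2 in theta2], dtype=float)
    weights /= weights.sum()
    # Expected payoff of each pure action x against sigma2.
    expected = sum(
        w * (np.asarray(u_bar(theta1, t2)) @ np.asarray(sigma2(t2)))
        for w, t2 in zip(weights, theta2)
    )
    return int(np.argmax(expected)), expected

# Toy game (illustrative only): uniform prior, player 2 mixes uniformly,
# action 0 pays theta1 and action 1 pays 0.5 regardless of y.
def toy_b(t1, t2):
    return 1.0

def toy_sigma2(t2):
    return np.array([0.5, 0.5])

def toy_u(t1, t2):
    return np.array([[t1, t1], [0.5, 0.5]])

high_action, _ = pure_best_response(0.8, toy_sigma2, toy_u, toy_b)
low_action, _ = pure_best_response(0.2, toy_sigma2, toy_u, toy_b)
```

In this toy game the best response switches from the second action to the first once $\theta_1$ exceeds $0.5$, which is the kind of type-dependent pure policy that Lemma 1 guarantees inside the best response set.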

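Definition 1 (Bayesian Nash Equilibrium). A strategy pair $(\sigma_1^*, \sigma_2^*)$ constitutes a BNE of the infinite Bayesian game $\Gamma$ if $\sigma_i^*(\cdot \mid \theta_i) \in B_i(\theta_i, \sigma_j^*)$, $\forall i, j \in \{1,2\}$, $i \neq j$, for almost every $\theta_1 \in \Theta_1$ and $\theta_2 \in \Theta_2$.

Footnote: The joint type space refers to the Cartesian product (denoted as $\times$) of each player $i$'s type space $\Theta_i$. Since the joint type space is compact, each $\Theta_i$ has to be compact.

Footnote: "Almost" in this context means that the probability of all types for which the strategy does not prescribe an optimal action is zero. For example, if player $i$'s strategies differ only at countably many points of $\Theta_i$, then they result in the same value of the integrals in (1).

Since $\bar{b}_i(\theta_i) > 0$, $\forall \theta_i \in \Theta_i$, $\forall i \in \{1,2\}$, Lemma 2 below shows that we can compute a BNE strategy pair $(\sigma_1^*, \sigma_2^*)$ through the integral form in (3) and (4); i.e., no player has a profitable deviation after he knows his private type if and only if he does not benefit from any deviation before knowing his type [6]:
$$\begin{aligned}
&\int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2 \\
&\quad = \max_{\sigma_1} \int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2,
\end{aligned} \tag{3}$$
and
$$\begin{aligned}
&\int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2 \\
&\quad = \max_{\sigma_2} \int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2.
\end{aligned} \tag{4}$$

Lemma 2 (BNE is equivalent to Nash Equilibrium). A strategy pair $(\sigma_1^*, \sigma_2^*)$ constitutes a BNE if and only if (3) and (4) hold.

Proof. The "only if" part (necessity) is straightforward, as (2) results in (3) and (4). To prove the "if" part (sufficiency), we show that if $(\sigma_1^*, \sigma_2^*)$ is not a BNE in the sense of Definition 1, then (3) and (4) cannot hold at the same time. As $(\sigma_1^*, \sigma_2^*)$ is not a BNE, there exist at least one player $i$ (assume the second player) and a measurable set $\hat{\Theta}_2 \subseteq \Theta_2$ of positive probability such that the player has a profitable deviation from $\sigma_2^*(\cdot \mid \theta_2)$ to an action $y_l \in \mathcal{Y}$ when $\theta_2 \in \hat{\Theta}_2$, i.e.,
$$\begin{aligned}
&\int_{\hat{\Theta}_2} \bar{b}_2(\theta_2) \left[ \int_0^1 b_2(\theta_1 \mid \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1)\, \bar{v}^{x,y_l}(\theta_1, \theta_2)\, d\theta_1 \right] d\theta_2 \\
&\quad > \int_{\hat{\Theta}_2} \bar{b}_2(\theta_2) \left[ \int_0^1 b_2(\theta_1 \mid \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 \right] d\theta_2.
\end{aligned}$$
Consider a strategy $\hat{\sigma}_2$ where $\hat{\sigma}_2(y \mid \theta_2) = \mathbf{1}_{\{y = y_l\}}$, $\forall y \in \mathcal{Y}$, $\theta_2 \in \hat{\Theta}_2$, and $\hat{\sigma}_2 = \sigma_2^*$, $\forall \theta_2 \notin \hat{\Theta}_2$; i.e., $\hat{\sigma}_2$ is identical to $\sigma_2^*$ except over the set $\hat{\Theta}_2$. Then, we know that
$$\begin{aligned}
&\int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \hat{\sigma}_2(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2 \\
&\quad = \int_{\hat{\Theta}_2} \bar{b}_2(\theta_2) \left[ \int_0^1 b_2(\theta_1 \mid \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \hat{\sigma}_2(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 \right] d\theta_2 \\
&\qquad + \int_{\Theta_2 \setminus \hat{\Theta}_2} \bar{b}_2(\theta_2) \left[ \int_0^1 b_2(\theta_1 \mid \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 \right] d\theta_2 \\
&\quad > \int_0^1\!\!\int_0^1 b(\theta_1, \theta_2) \sum_{x \in \mathcal{X}} \sigma_1^*(x \mid \theta_1) \sum_{y \in \mathcal{Y}} \sigma_2^*(y \mid \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)\, d\theta_1 d\theta_2,
\end{aligned}$$
which contradicts (4).

A. Equivalent Reformulation in Distributional Form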

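Since both integrands in (3) and (4) are non-negative, we can exchange the summation over actions and the integration over types according to Fubini's theorem. Define $u^{x,y}(\theta_1, \theta_2) := b(\theta_1, \theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2)$ and $v^{x,y}(\theta_1, \theta_2) := b(\theta_1, \theta_2)\, \bar{v}^{x,y}(\theta_1, \theta_2)$. Since a finite product of continuous functions is still continuous, $u^{x,y}$ and $v^{x,y}$ are both continuous over the joint set $\Theta_1 \times \Theta_2$. By assimilating the prior distribution of types into the players' utility functions, we can discretize the continuous type set uniformly, as shown in Section IV.

Footnote (to the proof of Lemma 2): Since the best response to any given policy contains a pure policy, as shown in Lemma 1, we can restrict the profitable deviation to an action without loss of generality.

Let us represent the first player's behavioral strategy $\sigma_1(x \mid \theta_1)$ as a function of $\theta_1$ parameterized by action $x$, i.e., $f^x(\theta_1)$. Then we can define a non-decreasing bounded function $F^x(\theta_1) := \int_0^{\theta_1} f^x(\tilde{\theta}_1)\, d\tilde{\theta}_1$ of $\theta_1$ parameterized by action $x$. Since $\sum_{x \in \mathcal{X}} f^x(\theta_1) = 1$, $\forall \theta_1 \in \Theta_1$, and $f^x(\theta_1) \geq 0$, $\forall \theta_1$, $\forall x \in \mathcal{X}$, we obtain $F^x(0) = 0$ for any $x \in \mathcal{X}$ and $\sum_{x \in \mathcal{X}} F^x(\theta_1) = \theta_1$, $\forall \theta_1 \in \Theta_1$, by Fubini's theorem. We use $\mathcal{F}^{\mathcal{X}}$ to denote the set of tuples $F^{\mathcal{X}} := \{F^x\}_{x \in \mathcal{X}}$ that satisfy the above conditions. Similarly, we can represent the second player's strategy $\sigma_2(y \mid \theta_2)$ as $g^y(\theta_2)$ and define $G^y(\theta_2) := \int_0^{\theta_2} g^y(\tilde{\theta}_2)\, d\tilde{\theta}_2$ as a non-decreasing bounded function of $\theta_2$. Analogously, we have $G^y(0) = 0$ for any $y \in \mathcal{Y}$ and $\sum_{y \in \mathcal{Y}} G^y(\theta_2) = \theta_2$, $\forall \theta_2 \in \Theta_2$. We use $\mathcal{G}^{\mathcal{Y}}$ to denote the set of tuples $G^{\mathcal{Y}} := \{G^y\}_{y \in \mathcal{Y}}$ that satisfy the above conditions.

Then, we can recast a BNE strategy pair $(F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, G^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$ in the following distributional form, where $\tilde{F}^{\mathcal{X}}$ and $\tilde{G}^{\mathcal{Y}}$ denote the strategies over which each player maximizes:
$$\begin{aligned}
\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 u^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, dG^y(\theta_2) &= \max_{\tilde{F}^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 u^{x,y}(\theta_1, \theta_2)\, d\tilde{F}^x(\theta_1)\, dG^y(\theta_2), \\
\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, dG^y(\theta_2) &= \max_{\tilde{G}^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}}} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, d\tilde{G}^y(\theta_2).
\end{aligned}$$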
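The passage from a behavioral strategy $f^x$ to its cumulative form $F^x$ can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the paper: the helper `cumulative_strategy` and the threshold strategy `threshold_f` are hypothetical names, and the integral is approximated with a midpoint rule on a uniform grid.

```python
import numpy as np

def cumulative_strategy(f, n_grid=1000):
    """Return a grid on [0, 1] and F[k, x] ~ integral of f^x over [0, grid[k]].

    f(theta) is assumed to return a length-L probability vector
    (the behavioral strategy at type theta).
    """
    grid = np.linspace(0.0, 1.0, n_grid + 1)
    mids = 0.5 * (grid[:-1] + grid[1:])            # midpoint of each cell
    vals = np.array([f(t) for t in mids])          # shape (n_grid, L)
    # Cumulative midpoint sums, with F(0) = 0 prepended.
    F = np.vstack([np.zeros(vals.shape[1]), np.cumsum(vals, axis=0) / n_grid])
    return grid, F

# Hypothetical pure threshold strategy: action 0 below 0.5, action 1 above.
def threshold_f(theta):
    return np.array([1.0, 0.0]) if theta < 0.5 else np.array([0.0, 1.0])

grid, F = cumulative_strategy(threshold_f)
row_sums = F.sum(axis=1)                 # should reproduce theta on the grid
is_monotone = bool(np.all(np.diff(F, axis=0) >= 0))
```

For this threshold strategy, $F^{x_1}(\theta_1) = \min(\theta_1, 0.5)$ and $F^{x_2}(\theta_1) = \max(\theta_1 - 0.5, 0)$, so each component is non-decreasing and the components sum to $\theta_1$, which are exactly the constraints defining $\mathcal{F}^{\mathcal{X}}$.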
Due to the difficulty of computing an exact BNE, it is common to consider an approximate equilibrium defined below.

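Definition 2 ($\varepsilon$-BNE). A strategy pair $(F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, G^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$ constitutes an $\varepsilon$-BNE if, for all deviations $(\tilde{F}^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, \tilde{G}^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$, the following holds:
$$\begin{aligned}
\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 u^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, dG^y(\theta_2) &\geq \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 u^{x,y}(\theta_1, \theta_2)\, d\tilde{F}^x(\theta_1)\, dG^y(\theta_2) - \varepsilon, \\
\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, dG^y(\theta_2) &\geq \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, d\tilde{G}^y(\theta_2) - \varepsilon.
\end{aligned}$$

III. EXTENSION OF HELLY'S SELECTION THEOREM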

Based on the reformulation of BNE in Section II-A, the players' strategies $F^{\mathcal{X}}$ and $G^{\mathcal{Y}}$ become $L$- and $H$-dimensional vectors of constrained functions, respectively. Thus, we extend the original Helly's selection theorem in the following lemma to fit vectors of functions with constraints.

Lemma 3 (Convergence on Countable Set). Consider the finite set $\mathcal{X} := \{x_1, \ldots, x_L\}$ and a sequence of functions $\{F_n^{\mathcal{X}}\}_{n \in \mathbb{Z}^+}$, where $F_n^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$ for each $n \in \mathbb{Z}^+$. Let $\mathcal{D} := \{\theta_1^1, \theta_1^2, \ldots\}$ be any countable subset of $\Theta_1$. Then there is a subsequence of $\{F_n^{\mathcal{X}}\}_{n \in \mathbb{Z}^+}$, i.e., $\{F_{n_k}^{\mathcal{X}}\}_{k=1}^{\infty}$, such that $\bar{F}^x(\theta_1) := \lim_{k \to \infty} F_{n_k}^x(\theta_1)$ exists for any $\theta_1 \in \mathcal{D}$, $x \in \mathcal{X}$. Moreover, the limit function satisfies $\bar{F}^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$.

Proof. With a slight abuse of notation, the vector $F_n^{\mathcal{X}}(\theta_1^d) := [F_n^{x_1}(\theta_1^d), \cdots, F_n^{x_L}(\theta_1^d)]$ at $\theta_1^d \in \mathcal{D}$, $d \in \{1, 2, \cdots\}$, $\forall n \in \mathbb{Z}^+$, belongs to a subset of $\mathbb{R}^L$, i.e., $\mathcal{F}^{\mathcal{X}}(\theta_1^d)$, that is closed and bounded. By the Bolzano–Weierstrass theorem, this subset is sequentially compact, so every sequence of points in it has a subsequence converging to a point in the subset. Thus, there exists a subsequence $F_{n_k^d}^{\mathcal{X}}(\theta_1^d)$ that converges to $\bar{F}^{\mathcal{X}}(\theta_1^d) \in \mathcal{F}^{\mathcal{X}}(\theta_1^d)$. We can then apply the standard diagonalization argument, repeatedly extracting a subsequence from the previous subsequence, to obtain a final subsequence $n_k$ for which $F_{n_k}^{\mathcal{X}}(\theta_1)$ converges to $\bar{F}^{\mathcal{X}}(\theta_1) \in \mathcal{F}^{\mathcal{X}}(\theta_1)$ for all $\theta_1 \in \mathcal{D}$. Note that $\bar{F}^x$ is non-decreasing with respect to $\theta_1 \in \mathcal{D}$ for each $x \in \mathcal{X}$, as the inequality is preserved in the limit; i.e., if $\theta_1^{d_1} < \theta_1^{d_2} \in \mathcal{D}$, then
$$\bar{F}^x(\theta_1^{d_1}) = \lim_{k \to \infty} F_{n_k}^x(\theta_1^{d_1}) \leq \lim_{k \to \infty} F_{n_k}^x(\theta_1^{d_2}) = \bar{F}^x(\theta_1^{d_2}).$$

Theorem 1 (Convergence on Compact Set). Consider $\Theta_1 := [0,1]$, the finite set $\mathcal{X} := \{x_1, \ldots, x_L\}$, and a sequence of functions $\{F_n^{\mathcal{X}}\}_{n \in \mathbb{Z}^+}$, where $F_n^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$ for each $n \in \mathbb{Z}^+$. Then, some subsequence of $\{F_n^{\mathcal{X}}\}_{n \in \mathbb{Z}^+}$, i.e., $\{F_{n_k}^{\mathcal{X}}\}_{k=1}^{\infty}$, converges pointwise to a non-decreasing bounded function $F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$, i.e., $F^x(\theta_1) = \lim_{k \to \infty} F_{n_k}^x(\theta_1)$, $\forall x \in \mathcal{X}$, $\forall \theta_1 \in \Theta_1$.

Proof. Let $\mathcal{D} := \mathbb{Q} \cap [0,1]$, where $\mathbb{Q}$ denotes the set of rational numbers; then $\mathcal{D}$ is countable. By Lemma 3, there exists a subsequence $\{F_{n_k}^{\mathcal{X}}\}_{k=1}^{\infty}$ that converges to $\bar{F}^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$ for $\theta_1 \in \mathcal{D}$. Next, we extend the function $\bar{F}^{\mathcal{X}}$ defined on the discrete set $\mathcal{D}$ to a function $F^{\mathcal{X}}$ defined over the continuous region by connecting the dots, i.e.,
$$F^x(\alpha) = \sup_{\beta \leq \alpha,\, \beta \in \mathcal{D}} \bar{F}^x(\beta), \quad \forall \alpha \in \Theta_1, \forall x \in \mathcal{X}.$$
Then, $F^{\mathcal{X}}(\theta_1) = \bar{F}^{\mathcal{X}}(\theta_1)$, $\forall \theta_1 \in \mathcal{D}$, and $F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$ is also element-wise non-decreasing with respect to $\theta_1 \in \Theta_1$, as $\alpha < \gamma$ leads to
$$F^x(\alpha) = \sup_{\beta \leq \alpha,\, \beta \in \mathcal{D}} \bar{F}^x(\beta) \leq \sup_{\beta \leq \gamma,\, \beta \in \mathcal{D}} \bar{F}^x(\beta) = F^x(\gamma), \quad \forall x \in \mathcal{X}.$$
Note that by connecting discrete dots, $F^{\mathcal{X}}(\theta_1)$ is right continuous and has at most countably many jumps at $\theta_1 \in \mathcal{D}$.

We first show that for each $x \in \mathcal{X}$, if $F^x$ is continuous at $\alpha \in \Theta_1$, then there exists a subsequence $n_k'$ of the subsequence $n_k$ such that $F^x(\alpha) = \lim_{k \to \infty} F_{n_k'}^x(\alpha)$. Fix $\epsilon > 0$. Since $F^x$, $\forall x \in \mathcal{X}$, is continuous at $\alpha \in \Theta_1$, we can choose $p, q \in \mathcal{D}$ with $\alpha \in (p, q)$ such that $F^x(q) - F^x(p) < \epsilon/2$, $\forall x \in \mathcal{X}$. Owing to the convergence on the countable set $\mathcal{D}$, we can pick $k$ sufficiently large such that $F_{n_k}^x(p) \in (F^x(p) - \epsilon/2, F^x(p) + \epsilon/2)$ and $F_{n_k}^x(q) \in (F^x(q) - \epsilon/2, F^x(q) + \epsilon/2)$ for all $x \in \mathcal{X}$. Then,
$$F_{n_k}^x(\alpha) \leq F_{n_k}^x(q) < F^x(q) + \epsilon/2 < F^x(p) + \epsilon \leq F^x(\alpha) + \epsilon, \quad \forall x \in \mathcal{X}.$$
Analogously, we can also obtain $F_{n_k}^x(\alpha) > F^x(\alpha) - \epsilon$, $\forall x \in \mathcal{X}$, which together show the convergence at $\alpha$.

Second, we show the convergence at a discontinuity point $\beta \in \Theta_1$. Since $F^{\mathcal{X}}$ is element-wise non-decreasing, its set of discontinuities is at most countable by Froda's theorem. Thus, Lemma 3 guarantees that we can select a convergent subsequence $n_k''$ from the subsequence $n_k'$ such that $F^x(\beta) = \lim_{k \to \infty} F_{n_k''}^x(\beta)$, $\forall x \in \mathcal{X}$. Combining the above two cases, we have found a subsequence, again denoted by $n_k$, that converges over the entire set $\Theta_1$, i.e., $F^x(\theta_1) = \lim_{k \to \infty} F_{n_k}^x(\theta_1)$, $\forall x \in \mathcal{X}$, $\forall \theta_1 \in \Theta_1$.

Next, we extend Helly's second theorem to a product of sets in Theorem 2.

Theorem 2.

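Let $u^{x,y}(\theta_1, \theta_2)$ be continuous over the joint set $[\alpha, \beta] \times C$ for each action $x \in \mathcal{X}$, $y \in \mathcal{Y}$, where $C$ is compact and $[\alpha, \beta] \subseteq \Theta_1$. Then, for each $y \in \mathcal{Y}$, $\sum_{x \in \mathcal{X}} \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1)$ converges to $\sum_{x \in \mathcal{X}} \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)$ uniformly in $\theta_2$.

Proof. For any $\epsilon > 0$, since $u^{x,y}$ is uniformly continuous over the joint set, we can choose $\delta > 0$ such that
$$|u^{x,y}(\theta_1, \theta_2) - u^{x,y}(\theta_1', \theta_2)| < \epsilon \tag{5}$$
for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$, $\theta_2 \in C$, $|\theta_1 - \theta_1'| < \delta$. Choose $\alpha = \theta_1^1 < \theta_1^2 < \cdots < \theta_1^D = \beta$ such that $F_n^x$ is continuous at each interior point $\theta_1^d$ and $\theta_1^{d+1} - \theta_1^d < \delta$ for all $x \in \mathcal{X}$, which can be done as $F_n^x$ has at most countably many discontinuities over the set $[\alpha, \beta]$. Define $u_d^{x,y}(\theta_2) := \min_{\theta_1^d \leq \theta_1 \leq \theta_1^{d+1}} u^{x,y}(\theta_1, \theta_2)$ and $S_n^{x,y}(\theta_2) := \sum_{d=1}^{D-1} \int_{\theta_1^d}^{\theta_1^{d+1}} u_d^{x,y}(\theta_2)\, dF_n^x(\theta_1)$, and let $S^{x,y}(\theta_2)$ denote the same sum with the limit $F^x$ in place of $F_n^x$. Then, we have
$$\int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1) \geq S_n^{x,y}(\theta_2). \tag{6}$$
Now by (5) and the monotonicity of $F_n^x$, we have $S_n^{x,y}(\theta_2) \geq \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1) - \epsilon\,(F_n^x(\beta) - F_n^x(\alpha))$, $\forall x \in \mathcal{X}$, $y \in \mathcal{Y}$. Then, using (6) and the fact that $F_n^x(\beta) - F_n^x(\alpha) \leq 1$, $\forall x \in \mathcal{X}$, we have
$$\left| S_n^{x,y}(\theta_2) - \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1) \right| < \epsilon, \quad \forall x \in \mathcal{X}, y \in \mathcal{Y}, \tag{7}$$
and the same bound holds with $F^x$ and $S^{x,y}$ in place of $F_n^x$ and $S_n^{x,y}$. Now, choose $N$ large enough such that, for each $d = 1, \cdots, D$ and each $n \geq N$,
$$|F_n^x(\theta_1^d) - F^x(\theta_1^d)| < \frac{\epsilon}{D}, \quad \forall x \in \mathcal{X}. \tag{8}$$
Then, we obtain $\int_{\theta_1^d}^{\theta_1^{d+1}} u_d^{x,y}(\theta_2)\, dF_n^x(\theta_1) = u_d^{x,y}(\theta_2)\,(F_n^x(\theta_1^{d+1}) - F_n^x(\theta_1^d))$, $\forall x \in \mathcal{X}$, $y \in \mathcal{Y}$, which, together with (8) applied at both endpoints, gives
$$\left| \int_{\theta_1^d}^{\theta_1^{d+1}} u_d^{x,y}(\theta_2)\, dF_n^x(\theta_1) - \int_{\theta_1^d}^{\theta_1^{d+1}} u_d^{x,y}(\theta_2)\, dF^x(\theta_1) \right| < \frac{2\epsilon}{D}\, |u_d^{x,y}(\theta_2)|, \quad \forall x \in \mathcal{X}, y \in \mathcal{Y}.$$
Since $u^{x,y}$ is continuous over a compact set, there exists a finite upper bound $M$ for $|u_d^{x,y}(\theta_2)|$. Therefore,
$$|S_n^{x,y}(\theta_2) - S^{x,y}(\theta_2)| < \sum_{d=1}^{D-1} \frac{2\epsilon}{D}\, |u_d^{x,y}(\theta_2)| \leq 2\epsilon M. \tag{9}$$
Combining (7), applied to both $F_n^x$ and $F^x$, with (9) through the triangle inequality, we have that $\forall \theta_2 \in C$, $n \geq N$,
$$\left| \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1) - \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1) \right| < (2M + 2)\,\epsilon, \quad \forall x \in \mathcal{X}, y \in \mathcal{Y},$$
and consequently,
$$\sum_{x \in \mathcal{X}} \left| \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1) - \int_\alpha^\beta u^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1) \right| < |\mathcal{X}| \cdot (2M + 2)\,\epsilon, \quad \forall y \in \mathcal{Y}.$$
Since $\epsilon$ is arbitrary and its coefficient $|\mathcal{X}| \cdot (2M + 2)$ is fixed for all $\theta_2 \in C$, the convergence is uniform in $\theta_2$ for each $y \in \mathcal{Y}$.

IV. DISCRETIZATION AND CONVERGENCE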

In this section, we provide a theoretical guarantee for approximating infinite Bayesian games by properly discretizing the type space and solving the resulting finite Bayesian games. The convergence of the BNE is guaranteed as long as the maximum interval length under the discretization scheme goes to zero when the number of intervals goes to infinity. For simplicity, we adopt the following uniform discretization scheme. We can also adopt other deterministic schemes, such as dichotomy, or stochastic schemes, such as sampling.

For any positive integer $n$ and action pair $x \in \mathcal{X}$, $y \in \mathcal{Y}$, define the level-$n$ approximations of the two players' utility functions $u^{x,y}$, $v^{x,y}$ as two $n \times n$ matrices $[u_{i,j}^{x,y,n}]_{i,j \in \{1, \cdots, n\}}$, $[v_{i,j}^{x,y,n}]_{i,j \in \{1, \cdots, n\}}$, respectively, where the $(i,j)$ elements are
$$u_{i,j}^{x,y,n} = u^{x,y}\!\left(\tfrac{i}{n}, \tfrac{j}{n}\right), \quad v_{i,j}^{x,y,n} = v^{x,y}\!\left(\tfrac{i}{n}, \tfrac{j}{n}\right). \tag{10}$$
Then, the level-$n$ discretized version of the infinite Bayesian game $\Gamma$ is denoted as
$$\Gamma_n = \langle \mathcal{X}, \mathcal{Y}, \bar{\Theta}_1^n \times \bar{\Theta}_2^n, b^n(\cdot), \{u_{i,j}^{x,y,n}, v_{i,j}^{x,y,n}\}_{i,j \in \{1, \cdots, n\},\, x \in \mathcal{X},\, y \in \mathcal{Y}} \rangle,$$
where the finite type set $\bar{\Theta}_i^n := \{\tfrac{1}{n}, \cdots, \tfrac{n}{n}\}$ contains the $n$ discrete types of player $i$. Since we have assimilated the prior type distribution $b(\cdot)$ into the players' utility functions $u^{x,y}$, $v^{x,y}$, the prior distribution of the discrete types is uniform, i.e., $b^n(\tfrac{i}{n}, \tfrac{j}{n}) = \tfrac{1}{n^2}$, $\forall i, j \in \{1, \cdots, n\}$. Let $s^{\mathcal{X},n} := (s_1^{\mathcal{X},n}, \cdots, s_n^{\mathcal{X},n})$ and $t^{\mathcal{Y},n} := (t_1^{\mathcal{Y},n}, \cdots, t_n^{\mathcal{Y},n})$ be a BNE of the level-$n$ discretized Bayesian game $\Gamma_n$, where the elements of $s_i^{\mathcal{X},n} := [s_i^{x_1,n}, s_i^{x_2,n}, \cdots, s_i^{x_L,n}]$ and $t_j^{\mathcal{Y},n} := [t_j^{y_1,n}, t_j^{y_2,n}, \cdots, t_j^{y_H,n}]$ are all non-negative for all $i, j \in \{1, \cdots, n\}$ and each sum to 1, i.e., $\sum_{l=1}^{L} s_i^{x_l,n} = 1$, $\sum_{h=1}^{H} t_j^{y_h,n} = 1$, $\forall i, j \in \{1, \cdots, n\}$. The existence of such a behavioral strategy pair $(s^{\mathcal{X},n}, t^{\mathcal{Y},n})$ is guaranteed [19] for any finite Bayesian game $\Gamma_n$.
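The uniform discretization in (10) is straightforward to sketch in code. The following is a minimal illustration under assumptions, not the authors' implementation: the function `discretize_game`, the callables `u_bar`, `v_bar`, `b`, and the toy payoffs are hypothetical. The sketch folds the prior into the payoffs, as in Section II-A, and samples the result on the grid $\{1/n, \ldots, n/n\}^2$.

```python
import numpy as np

def discretize_game(u_bar, v_bar, b, n):
    """Level-n uniform discretization following (10).

    Assumed (hypothetical) interfaces:
      u_bar(t1, t2), v_bar(t1, t2) -> (L, H) payoff matrices
      b(t1, t2)                    -> joint type density
    Returns the type grid, arrays U, V of shape (n, n, L, H) with
    U[i-1, j-1] = b(i/n, j/n) * u_bar(i/n, j/n), and the uniform prior.
    """
    types = np.arange(1, n + 1) / n
    U = np.array([[b(t1, t2) * np.asarray(u_bar(t1, t2)) for t2 in types]
                  for t1 in types])
    V = np.array([[b(t1, t2) * np.asarray(v_bar(t1, t2)) for t2 in types]
                  for t1 in types])
    # The prior is absorbed into U and V, so the discrete prior is uniform.
    prior = np.full((n, n), 1.0 / n**2)
    return types, U, V, prior

# Toy data (illustrative only): uniform density and a 2x2 zero-sum payoff.
def toy_b(t1, t2):
    return 1.0

def toy_u(t1, t2):
    return np.array([[t1 + t2, 0.0], [0.0, t1 * t2]])

def toy_v(t1, t2):
    return -toy_u(t1, t2)

types, U, V, prior = discretize_game(toy_u, toy_v, toy_b, n=4)
```

The resulting arrays are the input that a finite-game solver would need at level $n$; the choice of solver is left open here.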
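For any $x \in \mathcal{X}$, $y \in \mathcal{Y}$, and $n \in \mathbb{Z}^+$, define the non-decreasing right-continuous step functions
$$F_n^x(\theta_1) = \frac{1}{n} \sum_{i=1}^{\lfloor n\theta_1 \rfloor} s_i^{x,n}, \quad G_n^y(\theta_2) = \frac{1}{n} \sum_{j=1}^{\lfloor n\theta_2 \rfloor} t_j^{y,n}, \tag{11}$$
where $\lfloor n\theta \rfloor$ represents the greatest integer that is not greater than the value of $n\theta$. Obviously, $F_n^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}$ and $G_n^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}}$ for any $n \in \mathbb{Z}^+$.

Since player 2 has $H$ possible actions, we can divide the entire type space into at most $H$ disjoint subsets, i.e., $\Theta_2 = \cup_{h=1}^{H} \Theta_2^h$, $\Theta_2^h \cap \Theta_2^{h'} = \emptyset$, $\forall h \neq h'$, where player 2 chooses to take action $y_h \in \mathcal{Y}$ when his type $\theta_2$ belongs to $\Theta_2^h$, i.e., $g^{y_h}(\theta_2) = \mathbf{1}_{\{\theta_2 \in \Theta_2^h\}}$, $\forall h \in \{1, 2, \ldots, H\}$, $\forall \theta_2 \in \Theta_2$. Note that each subset $\Theta_2^h \subseteq \Theta_2$, $h \in \{1, \cdots, H\}$, does not need to be connected and can be empty.

Lemma 4.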

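The function $\sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\theta_1, \theta_2)\, d\theta_2$ is continuous over $\theta_1$ for any $x \in \mathcal{X}$.

Proof. Since $v^{x,y_h}(\theta_1, \theta_2)$ is continuous, and hence uniformly continuous, over the compact joint type space for any $x \in \mathcal{X}$, $y_h \in \mathcal{Y}$, we know that for any $\epsilon > 0$, however small, there exists some $\delta > 0$ such that for all $\theta_1 \in (\alpha - \delta, \alpha + \delta)$, $v^{x,y_h}(\theta_1, \theta_2) \in (v^{x,y_h}(\alpha, \theta_2) - \epsilon, v^{x,y_h}(\alpha, \theta_2) + \epsilon)$ for all $\theta_2 \in \Theta_2$. Based on the fact that $\sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} d\theta_2 \equiv 1$, we have
$$\sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\alpha, \theta_2)\, d\theta_2 - \epsilon < \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\theta_1, \theta_2)\, d\theta_2 < \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\alpha, \theta_2)\, d\theta_2 + \epsilon,$$
which proves the continuity in $\theta_1$.

Now, we are ready to prove our main result of equilibrium convergence in Theorem 3.

Theorem 3 (Convergence of BNE by Discretization). An infinite Bayesian game $\Gamma$ has at least one BNE pair $(F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, G^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$ in behavioral strategies. Moreover, there exists a sequence of discretized Bayesian games $\{\Gamma_{n_k}\}_{k \in \mathbb{Z}^+}$ such that $F^x(\theta_1) = \lim_{k \to \infty} F_{n_k}^x(\theta_1)$, $\forall \theta_1 \in \Theta_1$, $\forall x \in \mathcal{X}$, and $G^y(\theta_2) = \lim_{k \to \infty} G_{n_k}^y(\theta_2)$, $\forall \theta_2 \in \Theta_2$, $\forall y \in \mathcal{Y}$.

Proof. We prove the theorem by contradiction. According to Theorem 1, the sequence of mixed strategy pairs $(F_n^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, G_n^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$ has a subsequence $(F_{n_k}^{\mathcal{X}}, G_{n_k}^{\mathcal{Y}})$ that converges weakly to a pair of strategies $(F^{\mathcal{X}} \in \mathcal{F}^{\mathcal{X}}, G^{\mathcal{Y}} \in \mathcal{G}^{\mathcal{Y}})$. Suppose the strategy pair $(F^{\mathcal{X}}, G^{\mathcal{Y}})$ does not constitute a BNE. Then, at least one of the two strategies is not a best response against the other. We may assume that $G^{\mathcal{Y}}$ is not optimal against $F^{\mathcal{X}}$. The second player's expected utility under the limit pair $(F^{\mathcal{X}}, G^{\mathcal{Y}})$ in $\Gamma$ is
$$w := \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1)\, dG^y(\theta_2). \tag{12}$$
For each $n$, the second player's expected utility under the BNE of $\Gamma_n$ is
$$w_n := \frac{1}{n^2} \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \sum_{i=1}^{n} \sum_{j=1}^{n} v_{i,j}^{x,y,n}\, s_i^{x,n}\, t_j^{y,n} = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \int_0^1\!\!\int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF_n^x(\theta_1)\, dG_n^y(\theta_2). \tag{13}$$
Since $G^{\mathcal{Y}}$ is not an optimal response against $F^{\mathcal{X}}$ and Lemma 1 shows that the deviation can be a pure strategy without loss of generality, there exists a set division of $\Theta_2$, i.e., $\Theta_2^h$, $\forall h \in \{1, 2, \ldots, H\}$, such that the deviation strategy $\bar{g}^{y_h}(\theta_2) = \mathbf{1}_{\{\theta_2 \in \Theta_2^h\}}$, $\forall h \in \{1, 2, \ldots, H\}$, $\forall \theta_2 \in \Theta_2$, achieves an expected utility larger than $w$. Then, there exists $\epsilon > 0$ such that
$$\sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} \sum_{x \in \mathcal{X}} \int_0^1 v^{x,y_h}(\theta_1, \theta_2)\, dF^x(\theta_1)\, d\theta_2 \geq w + 4\epsilon.$$
Based on the continuity result in Lemma 4 and the convergence result in Theorem 2, for the set division $\{\Theta_2^h\}_{h \in \{1, \cdots, H\}}$, there exists $K_1$ such that if $k \geq K_1$, we have
$$\sum_{x \in \mathcal{X}} \int_0^1 \left[ \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\theta_1, \theta_2)\, d\theta_2 \right] dF_{n_k}^x(\theta_1) > \sum_{x \in \mathcal{X}} \int_0^1 \left[ \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}(\theta_1, \theta_2)\, d\theta_2 \right] dF^x(\theta_1) - \epsilon,$$
or equivalently,
$$\frac{1}{n_k} \sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} \left[ \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \theta_2\right) d\theta_2 \right] s_i^{x,n_k} > w + 3\epsilon. \tag{14}$$
Theorem 2 also guarantees that there exists $K_2$ such that if $k \geq K_2$,
$$\begin{aligned}
\sum_{x \in \mathcal{X}} \int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF_{n_k}^x(\theta_1) &< \sum_{x \in \mathcal{X}} \int_0^1 v^{x,y}(\theta_1, \theta_2)\, dF^x(\theta_1) + \epsilon, \quad \forall \theta_2 \in [0,1], \forall y \in \mathcal{Y}, \\
\sum_{y \in \mathcal{Y}} \int_0^1 v^{x,y}(\theta_1, \theta_2)\, dG_{n_k}^y(\theta_2) &< \sum_{y \in \mathcal{Y}} \int_0^1 v^{x,y}(\theta_1, \theta_2)\, dG^y(\theta_2) + \epsilon, \quad \forall \theta_1 \in [0,1], \forall x \in \mathcal{X}.
\end{aligned}$$
Thus, we obtain $w_{n_k} < w + 2\epsilon$, and from (14), we have
$$\frac{1}{n_k} \sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} \left[ \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \theta_2\right) d\theta_2 \right] s_i^{x,n_k} > w_{n_k} + \epsilon. \tag{15}$$
Owing to the continuity of $v^{x,y_h}(\tfrac{i}{n_k}, \theta_2)$ over $\theta_2$, the function is Riemann integrable over $\Theta_2^h$. Since we discretize the entire type set $\Theta_2$ uniformly, the length of each sub-interval of the partition is $\tfrac{1}{n_k}$. Thus, there exists $K_3$ such that if $k \geq K_3$,
$$\frac{1}{n_k} \sum_{h \in \{1, \cdots, H\}} \sum_{j:\, j/n_k \in \Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \tfrac{j}{n_k}\right) > \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \theta_2\right) d\theta_2 - \epsilon,$$
and so, since $\sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} s_i^{x,n_k} = n_k$,
$$\frac{1}{n_k} \sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} \left[ \sum_{h \in \{1, \cdots, H\}} \sum_{j:\, j/n_k \in \Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \tfrac{j}{n_k}\right) \right] s_i^{x,n_k} > \sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} \left[ \sum_{h \in \{1, \cdots, H\}} \int_{\Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \theta_2\right) d\theta_2 \right] s_i^{x,n_k} - n_k \cdot \epsilon.$$
Finally, dividing by $n_k$ and combining with (15), we know that
$$\frac{1}{n_k^2} \sum_{x \in \mathcal{X}} \sum_{i=1}^{n_k} \left[ \sum_{h \in \{1, \cdots, H\}} \sum_{j:\, j/n_k \in \Theta_2^h} v^{x,y_h}\!\left(\tfrac{i}{n_k}, \tfrac{j}{n_k}\right) \right] s_i^{x,n_k} > w_{n_k},$$
which leads to a contradiction: $t^{\mathcal{Y},n_k}$ was assumed to be an optimal response against $s^{\mathcal{X},n_k}$ in the finite Bayesian game $\Gamma_{n_k}$, yet the second player achieves a higher expected utility against $s^{\mathcal{X},n_k}$ if he adopts the pure strategy $\bar{t}^{\mathcal{Y},n_k}$ whose $j$-th element $\bar{t}_j^{\mathcal{Y},n_k}$ satisfies $\bar{t}_j^{y_{h'},n_k} = \mathbf{1}_{\{h' = h\}}$, $\forall h' \in \{1, \cdots, H\}$, if $\tfrac{j}{n_k} \in \Theta_2^h$. Therefore, $G^{\mathcal{Y}}$ is optimal against $F^{\mathcal{X}}$, and the strategy pair $(F^{\mathcal{X}}, G^{\mathcal{Y}})$ constitutes a BNE in behavioral strategies for the infinite Bayesian game $\Gamma$.

A. Algorithm to Compute $\varepsilon$-BNE of Infinite Bayesian Games

Although Theorem 3 proves the asymptotic convergence of BNE, there is no finite-step performance guarantee. There exist counterexamples (see, e.g., [18]) where the finite approximation of an infinite game leads to misleading results. Due to this pathology, we construct Algorithm 1 as follows to check whether an $\varepsilon$-BNE has been reached at some finite level $n$.

Algorithm 1: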

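Compute $\varepsilon$-BNE of infinite Bayesian game $\Gamma$

  Input the infinite Bayesian game $\Gamma$, the approximation accuracy $\varepsilon > 0$, and the maximum discretization level $K$;
  Initialize the discretization level $n = 1$;
  while $n < K$ do
    Discretize $\Gamma$ via (10) to obtain $\Gamma_n$;
    Solve $\Gamma_n$ to obtain the equilibrium strategy pair $(s^{\mathcal{X},n}, t^{\mathcal{Y},n})$;
    Obtain the level-$n$ approximated strategy pair $(F_n^{\mathcal{X}}, G_n^{\mathcal{Y}})$ for $\Gamma$ via (11);
    if $(F_n^{\mathcal{X}}, G_n^{\mathcal{Y}})$ constitutes an $\varepsilon$-BNE of $\Gamma$ in Definition 2 then
      Terminate;
    $n := n + 1$;
  end
  Output the $\varepsilon$-BNE strategy $(F_n^{\mathcal{X}}, G_n^{\mathcal{Y}})$ of the infinite Bayesian game $\Gamma$.

To compute the BNE of the finite Bayesian game in the step of Algorithm 1 that solves $\Gamma_n$, we can construct the following bilinear program $C_K$ (see Theorem 1 of [10]). Recall that the finite type set $\bar{\Theta}_i^n \subset \Theta_i$ contains the $n$ discrete types of player $i$.
$$\begin{aligned}
[C_K]: \max_{\sigma_1, \sigma_2, s_1, s_2} \quad & \sum_{\theta_1 \in \bar{\Theta}_1^n} \alpha_1(\theta_1)\, s_1(\theta_1) + \sum_{\theta_1 \in \bar{\Theta}_1^n} \alpha_1(\theta_1)\, \mathbb{E}_{\theta_2 \sim b_1(\cdot \mid \theta_1),\, x \sim \sigma_1,\, y \sim \sigma_2}[\bar{u}^{x,y}(\theta_1, \theta_2)] \\
& + \sum_{\theta_2 \in \bar{\Theta}_2^n} \alpha_2(\theta_2)\, s_2(\theta_2) + \sum_{\theta_2 \in \bar{\Theta}_2^n} \alpha_2(\theta_2)\, \mathbb{E}_{\theta_1 \sim b_2(\cdot \mid \theta_2),\, x \sim \sigma_1,\, y \sim \sigma_2}[\bar{v}^{x,y}(\theta_1, \theta_2)] \\
\text{s.t.} \quad (a)\ & \mathbb{E}_{\theta_1 \sim b_2(\cdot \mid \theta_2),\, x \sim \sigma_1}[\bar{v}^{x,y}(\theta_1, \theta_2)] \leq -s_2(\theta_2), \quad \forall \theta_2 \in \bar{\Theta}_2^n, \forall y \in \mathcal{Y}, \\
(b)\ & \sum_{x \in \mathcal{X}} \sigma_1(x \mid \theta_1) = 1, \quad \sigma_1(x \mid \theta_1) \geq 0, \quad \forall \theta_1 \in \bar{\Theta}_1^n, \\
(c)\ & \mathbb{E}_{\theta_2 \sim b_1(\cdot \mid \theta_1),\, y \sim \sigma_2}[\bar{u}^{x,y}(\theta_1, \theta_2)] \leq -s_1(\theta_1), \quad \forall \theta_1 \in \bar{\Theta}_1^n, \forall x \in \mathcal{X}, \\
(d)\ & \sum_{y \in \mathcal{Y}} \sigma_2(y \mid \theta_2) = 1, \quad \sigma_2(y \mid \theta_2) \geq 0, \quad \forall \theta_2 \in \bar{\Theta}_2^n.
\end{aligned} \tag{16}$$
Note that $\alpha_1(\theta_1)$, $\forall \theta_1 \in \bar{\Theta}_1^n$, and $\alpha_2(\theta_2)$, $\forall \theta_2 \in \bar{\Theta}_2^n$, are not decision variables and can be any strictly positive and finite numbers. Thus, we have the freedom to pick them properly to obtain a linear program rather than a bilinear program under certain conditions, as shown in Proposition 1.

Proposition 1 (Linear Program Reformulation). If there exist $m_i(\theta_i) > 0$, $\forall i \in \{1,2\}$, $\forall \theta_i \in \bar{\Theta}_i^n$, such that $m_2(\theta_2)\, \bar{u}^{x,y}(\theta_1, \theta_2) = -m_1(\theta_1)\, \bar{v}^{x,y}(\theta_1, \theta_2)$ holds for all $x \in \mathcal{X}$, $y \in \mathcal{Y}$, $\theta_1 \in \bar{\Theta}_1^n$, $\theta_2 \in \bar{\Theta}_2^n$, then we can pick $\alpha_i(\theta_i) = \bar{b}_i(\theta_i)/m_i(\theta_i) > 0$ to make $C_K$ a linear program.

Proof. It is straightforward to verify that the two bilinear terms always sum to 0, i.e.,
$$\sum_{\theta_1 \in \bar{\Theta}_1^n} \alpha_1(\theta_1)\, \mathbb{E}_{\theta_2 \sim b_1(\cdot \mid \theta_1),\, x \sim \sigma_1,\, y \sim \sigma_2}[\bar{u}^{x,y}(\theta_1, \theta_2)] + \sum_{\theta_2 \in \bar{\Theta}_2^n} \alpha_2(\theta_2)\, \mathbb{E}_{\theta_1 \sim b_2(\cdot \mid \theta_2),\, x \sim \sigma_1,\, y \sim \sigma_2}[\bar{v}^{x,y}(\theta_1, \theta_2)] \equiv 0,$$
for all feasible strategy pairs $\sigma_1$, $\sigma_2$, if we choose $\alpha_i(\theta_i) = \bar{b}_i(\theta_i)/m_i(\theta_i)$.

Note that the condition $m_i(\theta_i) = 1$, $\forall i \in \{1,2\}$, $\forall \theta_i \in \bar{\Theta}_i^n$, results in a zero-sum finite Bayesian game. Then, we can recast $C_K$ as a linear program by picking $\alpha_i(\theta_i) = \bar{b}_i(\theta_i)$, $\forall \theta_i \in \bar{\Theta}_i^n$, which coincides with the existing result in [16].

REFERENCES

[1] Khajonpong Akkarajitsakul, Ekram Hossain, and Dusit Niyato. Distributed resource allocation in wireless networks under uncertainty and application of Bayesian game.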

IEEE Communications Magazine, 49(8):120–127, 2011.
[2] Olivier Armantier, Jean-Pierre Florens, and Jean-Francois Richard. Approximation of Nash equilibria in Bayesian games. Journal of Applied Econometrics, 23(7):965–981, 2008.
[3] Susan Athey. Single crossing properties and the existence of pure strategy equilibria in games of incomplete information. Econometrica, 69(4):861–889, 2001.
[4] Oriol Carbonell-Nicolau and Richard P McLean. On the existence of Nash equilibrium in Bayesian games. Mathematics of Operations Research, 43(1):100–129, 2018.
[5] Sam Ganzfried and Tuomas Sandholm. Computing equilibria by incorporating qualitative models. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1, AAMAS '10, pages 183–190. International Foundation for Autonomous Agents and Multiagent Systems, 2010.
[6] John C Harsanyi. Games with incomplete information played by "Bayesian" players, I–III. Part I. The basic model. Management Science, 14(3):159–182, 1967.
[7] Linan Huang and Quanyan Zhu. Analysis and computation of adaptive defense strategies against advanced persistent threats for cyber-physical systems. In International Conference on Decision and Game Theory for Security, pages 205–226. Springer, 2018.
[8] Linan Huang and Quanyan Zhu. Adaptive strategic cyber defense for advanced persistent threats in critical infrastructure networks. ACM SIGMETRICS Performance Evaluation Review, 46(2):52–56, 2019.
[9] Linan Huang and Quanyan Zhu. Dynamic games of asymmetric information for deceptive autonomous vehicles. arXiv preprint arXiv:1907.00459, 2019.
[10] Linan Huang and Quanyan Zhu. A dynamic games approach to proactive defense strategies against advanced persistent threats in cyber-physical systems. Computers & Security, 89:101660, 2020.
[11] Christopher Kiekintveld, Janusz Marecki, and Milind Tambe. Approximation methods for infinite Bayesian Stackelberg games: Modeling distributional payoff uncertainty. In The 10th International Conference on Autonomous Agents and Multiagent Systems, Volume 3, pages 1005–1012, 2011.
[12] Vijay Krishna. Auction Theory. Academic Press, 2009.
[13] Alejandro M Manelli. The convergence of equilibrium strategies of approximating signaling games. Economic Theory, 7(2):323–335, 1996.
[14] Paul R Milgrom and Robert J Weber. Distributional strategies for games with incomplete information. Mathematics of Operations Research, 10(4):619–632, 1985.
[15] Guillermo Owen. Existence of equilibrium pairs in continuous games. International Journal of Game Theory, 5(2):97–105, 1976.
[16] J-P Ponssard and Sylvain Sorin. The LP formulation of finite zero-sum games with incomplete information. International Journal of Game Theory, 9(2):99–105, 1980.
[17] Zinovi Rabinovich, Victor Naroditskiy, Enrico H Gerding, and Nicholas R Jennings. Computing pure Bayesian-Nash equilibria in games with finite actions and continuous types. Artificial Intelligence, 195:106–139, 2013.
[18] Daniel M Reeves and Michael P Wellman. Computing best-response strategies in infinite games of incomplete information. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pages 470–478, 2004.
[19] Yoav Shoham and Kevin Leyton-Brown.