Infinite-Duration Poorman-Bidding Games
Guy Avni, Thomas A. Henzinger, Rasmus Ibsen-Jensen
IST Austria
Abstract
In two-player games on graphs, the players move a token through a graph to produce an infinite path, which determines the winner or payoff of the game. Such games are central in formal verification since they model the interaction between a non-terminating system and its environment. We study bidding games in which the players bid for the right to move the token. Bidding games with variants of first-price auctions were previously studied: in each round, the players simultaneously submit bids, the higher bidder moves the token, and, in Richman bidding, pays his bid to the other player, whereas in poorman bidding, he pays his bid to the “bank”. While reachability poorman games have been studied before, we present, for the first time, results on infinite-duration poorman games. A central quantity in these games is the ratio between the two players’ initial budgets. We show that the favorable properties of reachability poorman games extend to complex qualitative objectives such as parity, similarly to the Richman case: each vertex has a threshold value, which is a necessary and sufficient ratio with which a player can achieve a goal. Our most interesting results concern quantitative poorman games, namely mean-payoff poorman games, where we construct optimal strategies depending on the initial ratio. The crux of the proof shows that strongly-connected mean-payoff poorman games are equivalent to biased random-turn games. The equivalence in itself is interesting, because it does not hold for reachability poorman games and it is richer than the equivalence with uniform random-turn games that Richman bidding exhibits. We also solve the complexity problems that arise in poorman games.
This paper is a full version of [7]. This research was supported in part by the Austrian Science Fund (FWF) under grants S11402-N23 (RiSE/SHiNE), Z211-N23 (Wittgenstein Award), and M 2369-N33 (Meitner fellowship).

Two-player infinite-duration games on graphs are a central class of games in formal verification [4] and have deep connections to foundations of logic [43]. They are used to model the interaction between a system and its environment, and the problem of synthesizing a correct system then reduces to finding a winning strategy in a graph game [41]. Theoretically, they have been widely studied. For example, the problem of deciding the winner in a parity game is a rare problem that is in NP and coNP [27], not known to be in P, and for which a quasi-polynomial algorithm was only recently discovered [15].

A graph game proceeds by placing a token on a vertex in the graph, which the players move throughout the graph to produce an infinite path (“play”) $\pi$. The game is zero-sum and $\pi$ determines the winner or payoff. Two ways to classify graph games are according to the type of objectives of the players, and according to the mode of moving the token. For example, in reachability games, the objective of Player 1 is to reach a designated vertex $t$, and the objective of Player 2 is to avoid $t$. An infinite play $\pi$ is winning for Player 1 iff it visits $t$. The simplest mode of moving is turn based: the vertices are partitioned between the two players and whenever the token reaches a vertex that is controlled by a player, he decides how to move the token.

We study a new mode of moving in infinite-duration games, which is called bidding, and in which the players bid for the right to move the token.
The bidding mode of moving was introduced in [31, 32] for reachability games, where two variants of first-price auctions were studied: Each player has a budget, and before each move, the players submit sealed bids simultaneously, where a bid is legal if it does not exceed the available budget, and the higher bidder moves the token. The bidding rules differ in where the higher bidder pays his bid. In Richman bidding (named after David Richman), the higher bidder pays the lower bidder. In poorman bidding, which is the bidding rule that we focus on in this paper, the higher bidder pays the “bank”. Thus, the bid is deducted from his budget and the money is lost. Note that while the sum of budgets is constant in Richman bidding, in poorman bidding, the sum of budgets shrinks as the game proceeds. One needs to devise a mechanism that resolves ties in biddings, and our results are not affected by the tie-breaking mechanism that is used.

Bidding games naturally model decision-making settings in which agents need to invest resources in an ongoing manner. We argue that the modelling capabilities of poorman bidding exceed those of Richman bidding. Richman bidding is restricted to model “scrip” systems that use internal currency to avoid free riding and guarantee fairness. Poorman bidding, on the other hand, models a wider variety of settings since the bidders pay their bid to the auctioneer. We illustrate a specific application of infinite-duration poorman bidding in reasoning about ongoing stateful auctions, which we elaborate on in Section 4.6.
Example 1.
Consider a setting in which two buyers compete in an auction to buy $k \in \mathbb{N}$ goods that are “rented” for a specific time duration. For example, a webpage has $k$ ad slots, and each slot is sold for a fixed time duration, e.g., one day. At time point $1 \le i \le k$, good $i$ is put up for sale in a second-price auction, where the higher bidder pays the auctioneer and keeps the good for the fixed duration of time. We focus on the first buyer. Each good entails a reward for him, and we are interested in devising a bidding strategy that maximizes the long-run average of the rewards. For example, the simple case of a site with one ad slot is represented by the game that is depicted in Fig. 1, where the vertex $v_1$ represents the case that Player 1’s ad appears and $v_2$ represents the case that Player 2’s ad appears. Player 1’s goal is to maximize the long-run average time that his ad appears, which intuitively amounts to “staying” as much time as possible in $v_1$. Player 1’s goal is formally described as a mean-payoff objective, which we elaborate on below. Our results on mean-payoff poorman games allow us to construct an optimal strategy for the players.

Another advantage of poorman bidding over Richman bidding is that its definition generalizes easily to domains in which the restriction of a fixed sum of budgets is an obstacle. For example, in ongoing auctions as described in the example above, often a good is sold to multiple buyers with partial information of the budgets. These are two orthogonal concepts that have not been studied in bidding games and are both easier to define in poorman bidding than in Richman bidding.

A central quantity in bidding games is the ratio of the players’ initial budgets. Formally, let $B_i \in \mathbb{R}_{\ge 0}$, for $i \in \{1, 2\}$, be Player $i$’s initial budget. The total initial budget is $B = B_1 + B_2$ and Player $i$’s initial ratio is $B_i / B$. The first question that arises in the context of bidding games is a necessary and sufficient initial ratio for a player to guarantee winning.
For reachability games, it was shown in [31, 32] that such threshold ratios exist in every reachability Richman and poorman game: for every vertex $v$ there is a ratio $\mathrm{Th}(v) \in [0, 1]$ such that (1) if Player 1’s initial ratio exceeds $\mathrm{Th}(v)$, he can guarantee winning, and (2) if his initial ratio is less than $\mathrm{Th}(v)$, Player 2 can guarantee winning. (When the initial ratio of Player 1 is exactly $\mathrm{Th}(v)$, the winner of the game depends on how we resolve draws in biddings.) This is a central property of the game, which is a form of determinacy, and shows that no ties can occur.

An intriguing equivalence was observed in [31, 32] between random-turn games [39] and reachability bidding games, but only with Richman bidding. For $r \in [0, 1]$, the random-turn game that corresponds to a game $G$ w.r.t. $r$, denoted $\mathrm{RT}^r(G)$, is a special case of stochastic game [23]: rather than bidding for moving, in each round, independently, Player 1 is chosen to move with probability $r$ and Player 2 moves with the remaining probability of $1 - r$. Richman reachability games are equivalent to uniform random-turn games, i.e., with $r = 0.5$ (see Theorem 7 for a precise statement of the equivalence). For reachability poorman-bidding games, no such equivalence is known and it is unlikely to exist since there are (simple) finite poorman games with irrational threshold ratios. The lack of such an equivalence makes poorman games technically more complicated.

More interesting, from the synthesis and logic perspective, are infinite winning conditions, but they have only been studied in the Richman setting previously [6]. We show, for the first time, existence of threshold ratios in qualitative poorman games with infinite winning conditions such as parity. We show a linear reduction from poorman parity games to poorman reachability games, similarly to the proof in the Richman setting.
First, we show that in a strongly-connected game, one of the players wins with any positive initial ratio, thus the bottom strongly-connected components (BSCCs, for short) of the game graph can be partitioned into “winning” for Player 1 and “losing” for Player 1. Second, we construct a reachability poorman game in which each player tries to force the game to a BSCC that is winning for him.

Things get more interesting in mean-payoff poorman games, which are zero-sum quantitative games; an infinite play of the game is associated with a payoff which is Player 1’s reward and Player 2’s cost, thus we respectively refer to the players in a mean-payoff game as Max and Min. The central question in these games is: Given a value $c \in \mathbb{Q}$, what is the initial ratio that is necessary and sufficient for Max to guarantee a payoff of $c$? More formally, we say that $c$ is the value with respect to a ratio $r \in [0, 1]$ if for every $\epsilon > 0$, we have (1) when Max’s initial ratio is $r + \epsilon$, he can guarantee a payoff of at least $c$, and (2) intuitively, Max cannot hope for more: if Max’s initial ratio is $r - \epsilon$, then Min can guarantee a payoff of at most $c$.

Our most technically-involved contribution is a construction of optimal strategies in mean-payoff poorman games, which depend on the initial ratio $r \in [0, 1]$. The key component of the solution is a quantitative solution to strongly-connected games, which, similar to parity games, allows us to reduce general mean-payoff poorman games to reachability poorman games by reasoning about the BSCCs of the graph. Before describing our solution, let us highlight an interesting difference between Richman and poorman bidding. With Richman bidding, it is shown in [6] that a strongly-connected mean-payoff Richman-bidding game has a value that does not depend on the initial ratio and only on the structure of the game. It thus seems reasonable to guess that the initial ratio would not matter with poorman bidding as well.
We show, however, that this is not the case; the higher Max’s initial ratio is, the higher the payoff he can guarantee. We demonstrate this phenomenon with the following simple game. Technically, each vertex in the graph has a weight, and the payoff of an infinite play $\pi$ is defined as follows. The energy of a prefix $\pi^n$ of length $n$ of $\pi$, denoted $E(\pi^n)$, is the sum of the weights it traverses. The payoff of $\pi$ is $\liminf_{n \to \infty} E(\pi^n)/n$.

Example 2.
Consider the mean-payoff poorman game that is depicted in Figure 1. We take the viewpoint of Min in this example. We consider the case of $r = \frac{1}{2}$, and claim that the value with respect to $r = \frac{1}{2}$ is $0$. Suppose for convenience that Min wins ties. Note that the players’ choices upon winning a bid in the game are obvious, and the difficulty in devising a strategy is finding the right bids. Intuitively, Min copies Max’s bidding strategy. Suppose, for example, that Min starts with a budget of $1 + \epsilon$ and Max starts with $1$, for some $\epsilon > 0$. A strategy for Min that ensures a payoff of $0$ is based on a queue of numbers as follows: In round $i$, if the queue is empty Min bids $\epsilon \cdot 2^{-i}$, and otherwise the maximal number in the queue. If Min wins, he removes the maximal number from the queue (if non-empty), i.e., the number he has just bid. If Max wins, Min adds Max’s winning bid to the queue. For example, suppose Max’s first bid is $b_1 > \epsilon/2$; he wins since Min bids $\epsilon/2$, and Min adds $b_1$ to the empty queue. Min’s second bid is $b_1$. Suppose Max bids $b_2 > b_1$ in the second turn, thus he wins again. Min adds $b_2$ to the queue and bids $b_2$ in the third bidding. Suppose Max bids at most $b_2$, thus Min wins and removes $b_2$ from the queue. In the next bidding his bid is $b_1$.

[Figure 1: A mean-payoff game.]

[Figure 2: A second mean-payoff game.]

We make several observations. (1) Min’s strategy is legal: it never bids higher than the available budget. (2) The size of the queue is an upper bound on the energy; indeed, every bid in the queue corresponds to a Max winning bid that is not “matched” (the size is an upper bound since Min might win biddings when the queue is empty). (3) If Min’s queue fills, it will eventually empty. Indeed, if $b \in \mathbb{R}$ is in the queue, in order to keep $b$ in the queue, Max must bid at least $b$, thus eventually his budget runs out. Combining, since the energy is at most $0$ when the queue empties, Min’s strategy guarantees that the energy is at most $0$ infinitely often.
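The queue-based strategy of Min can be sanity-checked with a small simulation. The following is only an illustrative sketch under stated assumptions, not part of the formal development: it assumes that in the game of Figure 1 every bidding won by Max adds 1 to the energy and every bidding won by Min subtracts 1, that Min wins ties, and that upon winning Min removes from the queue the number he has just bid. The function name `simulate_min_queue` and the bid sequences fed to it are hypothetical. The two assertions inside the loop encode observations (1) and (2).

```python
from fractions import Fraction


def simulate_min_queue(max_bids, eps=Fraction(1, 10)):
    """Play Min's queue strategy against a given sequence of Max bids.

    Assumed setting (an illustration, not taken verbatim from the paper):
    Min starts with budget 1 + eps, Max starts with budget 1, Min wins ties,
    a Max win adds 1 to the energy and a Min win subtracts 1.
    Returns the final energy and the final content of the queue.
    """
    min_budget = 1 + eps
    max_budget = Fraction(1)
    queue = []
    energy = 0
    for i, wanted in enumerate(max_bids, start=1):
        max_bid = min(Fraction(wanted), max_budget)  # Max cannot overbid
        min_bid = max(queue) if queue else eps / 2**i
        assert min_bid <= min_budget  # observation (1): Min's bid is legal
        if min_bid >= max_bid:  # Min wins (ties included) and pays the bank
            min_budget -= min_bid
            if queue:
                queue.remove(min_bid)  # the matched Max bid leaves the queue
            energy -= 1
        else:  # Max wins, pays the bank, and Min records the winning bid
            max_budget -= max_bid
            queue.append(max_bid)
            energy += 1
        assert energy <= len(queue)  # observation (2): queue bounds energy
    return energy, queue
```

For instance, calling `simulate_min_queue` on the hypothetical bids 1/10, 2/10, 1/10 reproduces the pattern of the example: Max wins twice, Min wins the third bidding with the larger recorded bid, and one unmatched Max bid remains in the queue. Exact rational arithmetic is used so that the legality check is not blurred by rounding.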
Since we use $\liminf$ in the definition of the payoff, Min guarantees a non-positive payoff. Showing that Max can guarantee a non-negative payoff with an initial ratio of $\frac{1}{2} + \epsilon$ is harder, and a proof for the general case can be found in Section 4.

We show that the value $c$ decreases when Max’s initial ratio $r$ decreases. We set $r = \frac{1}{3}$. Suppose, for example, that Min’s initial budget is $2 + \epsilon$ and Max’s initial budget is $1$. We claim that Min can guarantee a payoff of $-1/3$. His strategy is similar to the one above, only that whenever Max wins with $b$, Min pushes $b$ to the queue twice. Observations (1)-(3) still hold. The difference is that now, since every Max win is matched by two Min wins, when the queue empties, the number of Min wins is at least twice as much as Max’s wins, and the claim follows.

This example shows the contrast between Richman and poorman bidding. When using Richman bidding, Min can guarantee a payoff of $0$ with every initial budget, and cannot guarantee $-\epsilon$, even with a ratio of $1 - \delta$, for any $\epsilon, \delta > 0$.

In order to solve strongly-connected mean-payoff poorman games, we identify the following equivalence with biased random-turn games. Consider a strongly-connected mean-payoff poorman game $G$ and a ratio $r \in [0, 1]$. Recall that $\mathrm{RT}^r(G)$ is the random-turn game in which Max is chosen with probability $r$ and Min with probability $1 - r$. Since $G$ is a mean-payoff game, the game $\mathrm{RT}^r(G)$ is a stochastic mean-payoff game. Its value, denoted $\mathrm{MP}(\mathrm{RT}^r(G))$, is the optimal expected payoff that the players can guarantee, and is known to exist [35]. For every $\epsilon > 0$, we show that when Max’s initial ratio is $r + \epsilon$, he can guarantee a payoff of $\mathrm{MP}(\mathrm{RT}^r(G))$, and he cannot do better: Min can guarantee a payoff of at most $\mathrm{MP}(\mathrm{RT}^r(G))$ with an initial ratio of $1 - r + \epsilon$. Thus, the value of $G$ w.r.t. $r$ equals $\mathrm{MP}(\mathrm{RT}^r(G))$.
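As a quick consistency check of this equivalence against Example 2, one can evaluate the random-turn value of the simple game directly. The sketch below rests on assumptions that are ours: it takes the game of Figure 1 to have one vertex of weight 1, to which Max moves whenever he is chosen, and one vertex of weight -1, to which Min moves whenever he is chosen. The helper name `random_turn_value` is hypothetical.

```python
from fractions import Fraction


def random_turn_value(r, w_max=1, w_min=-1):
    """Expected long-run average weight in the assumed two-vertex game.

    In every round, independently, Max is chosen with probability r and
    moves to the vertex of weight w_max; otherwise Min is chosen and moves
    to the vertex of weight w_min. The visited weights are then independent
    and identically distributed, so the mean payoff is their expectation.
    """
    r = Fraction(r)
    return r * w_max + (1 - r) * w_min
```

Under these assumptions the value is $2r - 1$, which gives $0$ for $r = \frac{1}{2}$ and $-\frac{1}{3}$ for $r = \frac{1}{3}$, in agreement with the two payoffs that Min guarantees in Example 2.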
One way to see this result is as a form of derandomization: we show that Max has a deterministic bidding strategy in $G$ that ensures a behavior that is similar to the random behavior of $\mathrm{RT}^r(G)$. We find this equivalence between the two models particularly surprising for two reasons. First, unlike with Richman bidding, an equivalence between random-turn games and reachability poorman games is unlikely to exist. Second, while Richman games are equivalent to uniform random-turn games, we are not aware of any known equivalences between bidding games and biased random-turn games, i.e., with $r \neq 0.5$.

Recall that a strongly-connected mean-payoff Richman-bidding game $G$ has a value $c$ that does not depend on the initial ratio. The value comes from an equivalence with uniform random-turn games [6]: the value $c$ of $G$ under Richman bidding equals the value of the uniform stochastic mean-payoff game $\mathrm{RT}^{0.5}(G)$. That is, with Richman bidding, Min can guarantee $c$ with an initial ratio of $\delta$, and cannot guarantee $c - \epsilon$ with an initial ratio of $1 - \delta$, for every $\epsilon, \delta > 0$. One interesting corollary is that the value of $G$ when viewed as a Richman game equals the value of $G$ when viewed as a poorman game with respect to the initial ratio $0.5$. We are not aware of previous such connections between the two bidding rules.

Finally, we address, for the first time, complexity issues in poorman games; namely, we study the problem of finding threshold ratios in poorman games. We show that for qualitative games, the corresponding decision problem is in PSPACE using the existential theory of the reals [16]. For mean-payoff games, the problem of finding the value of the game with respect to a given ratio is also in PSPACE for general games, and for strongly-connected games, we show the value can be found in NP and coNP, and even in P for strongly-connected games with out-degree 2.

Related work
As mentioned above, bidding games can model ongoing auctions, like the ones that are used in internet companies such as Google to sell advertisement slots [37]. Sequential auctions, which are also ongoing, have been well studied, e.g., [33, 45], and let us specifically point out [26, 44], which, similar to bidding games, study two-player sequential auctions with perfect information. Bidding games differ from these models in two important aspects: (1) bidding games are zero-sum games, and (2) the budgets that are used for bidding do not contribute to the utility and are only used to determine which player moves. Point (2) implies that bidding games are particularly appropriate to model settings in which the budget has little or no value, similar in spirit to the well-studied Colonel Blotto games [13]. A dynamic version of Colonel Blotto games called all-pay bidding games has been recently studied [10]. Non-zero-sum Richman-bidding games have been used to reason about ongoing negotiations [34].

Graph games are popular to reason about systems in formal methods [22] and about multi-agent systems in AI [3]. Bidding games extend the modelling capabilities of these games and allow reasoning about multi-process systems in which a scheduler accepts payment in return for priority. Blockchain technology is one example of such a setting. Simplifying the technology, a blockchain is a log of transactions issued by clients and maintained by miners. In order to write to the log, clients send their transactions and an offer for a transaction fee to a miner, who has freedom to decide transaction priority. We expect that a more precise modelling of such systems will assist in their verification against attacks, which is a problem of special interest since bugs can result in significant losses of money (see for example, [18] and a description of an attack http://bit.ly/2obzyE7 ). Note that poorman bidding models such settings better than Richman bidding since transaction fees are paid to the scheduler (the miners) rather than the other player. Richman bidding is appropriate when modelling “scrip systems” that use internal currency to prevent free-riding [28], and are popular in databases for example.

In this work, we show that mean-payoff poorman games are equivalent to biased random-turn games. Thus, there is a contrast with mean-payoff Richman games, which are equivalent to uniform random-turn games. To better understand these differences between the seemingly similar bidding rules, mean-payoff taxman games were studied in [9]. Taxman bidding rules, which were defined and studied in [31] for reachability objectives, span the spectrum between Richman and poorman bidding. They are parameterized by a constant $\tau \in [0, 1]$: portion $\tau$ of the winning bid is paid to the other player, and portion $1 - \tau$ to the bank. Thus, with $\tau = 1$ we obtain Richman bidding and with $\tau = 0$, we obtain poorman bidding. It was shown that the value of a mean-payoff taxman bidding game $G$ parameterized by $\tau$ and with initial ratio $r$ equals $\mathrm{MP}(\mathrm{RT}^{F(\tau, r)}(G))$, for $F(\tau, r) = \frac{r + \tau \cdot (1 - r)}{1 + \tau}$.

To the best of our knowledge, since their introduction, poorman games have not been studied.
Motivated by recreational games, e.g., bidding chess [12, 30], discrete bidding games with Richman bidding rules are studied in [24], where the money is divided into chips, so a bid cannot be arbitrarily small unlike the bidding games we study. Infinite-duration discrete bidding games with Richman bidding and various tie-breaking mechanisms have been studied in [1], where they were shown to be a largely determined sub-class of concurrent games.

A graph game is played on a directed graph $G = \langle V, E \rangle$, where $V$ is a finite set of vertices and $E \subseteq V \times V$ is a set of edges. The neighbors of a vertex $v \in V$, denoted $N(v)$, is the set of vertices $\{u \in V : \langle v, u \rangle \in E\}$, and we say that $G$ has out-degree 2 if for every $v \in V$, we have $|N(v)| = 2$. A path in $G$ is a finite or infinite sequence of vertices $v_1, v_2, \ldots$ such that for every $i \ge 1$, we have $\langle v_i, v_{i+1} \rangle \in E$.

Objectives
An objective $O$ is a set of infinite paths. In reachability games, Player 1 has a target vertex $v_R$ and an infinite path is winning for him if it visits $v_R$. In parity games each vertex has a parity index in $\{1, \ldots, d\}$, and an infinite path is winning for Player 1 iff the maximal parity index that is visited infinitely often is odd. We also consider games that are played on a weighted graph $\langle V, E, w \rangle$, where $w : V \to \mathbb{Q}$. Consider an infinite path $\pi = v_1, v_2, \ldots$. For $n \in \mathbb{N}$, we use $\pi^n$ to denote the prefix of length $n$ of $\pi$. We call the sum of weights that $\pi^n$ traverses the energy of the game, denoted $E(\pi^n)$. Thus, $E(\pi^n)$ is the sum of $w(v_j)$ over the vertices $v_j$ that $\pi^n$ traverses.
In a stochastic game the vertices of the graph are partitioned between two players and a nature player. As in turn-based games, whenever the game reaches a vertex of Player $i$, for $i = 1, 2$, he chooses how the game proceeds, and whenever the game reaches a vertex $v$ that is controlled by nature, the next vertex is chosen according to a probability distribution that depends only on $v$.

Consider a game $G = \langle V, E \rangle$. The random-turn game with ratio $r \in [0, 1]$ that is associated with $G$ is a stochastic game that intuitively simulates the fact that Player 1 chooses the next move with probability $r$ and Player 2 chooses with probability $1 - r$. Formally, we define $\mathrm{RT}^r(G) = \langle V_1, V_2, V_N, E, \Pr, w \rangle$, where each vertex in $V$ is split into three vertices, each controlled by a different player, thus for $\alpha \in \{1, 2, N\}$, we have $V_\alpha = \{v_\alpha : v \in V\}$, nature vertices simulate the fact that Player 1 chooses the next move with probability $r$, thus $\Pr[v_N, v_1] = r = 1 - \Pr[v_N, v_2]$, and reaching a vertex that is controlled by one of the two players means that he chooses the next move, thus $E = \{\langle v_\alpha, u_N \rangle : \langle v, u \rangle \in E \text{ and } \alpha \in \{1, 2\}\}$. When $G$ is weighted, then the weights of $v_1$, $v_2$, and $v_N$ equal that of $v$.

Fixing two strategies $f$ and $g$ for the two players in a stochastic game results in a Markov chain, which in turn gives rise to a probability distribution $D(f, g)$ over infinite sequences of vertices. A strategy $f$ is optimal w.r.t. an objective $O$ if it attains $\sup_f \inf_g \Pr_{\pi \sim D(f, g)}[\pi \in O]$. For the objectives we consider, it is well-known that optimal strategies exist, which are, in fact, positional; namely, strategies that only depend on the current position of the game and not on its history.

Definition 6. (Values) Let $r \in [0, 1]$. For a qualitative game $G$, the value of $\mathrm{RT}^r(G)$, denoted $\mathrm{val}(\mathrm{RT}^r(G))$, is the probability that Player 1 wins when he plays optimally.
For a mean-payoff game $G$, the mean-payoff value of $\mathrm{RT}^r(G)$, denoted $\mathrm{MP}(\mathrm{RT}^r(G))$, is the maximal expected payoff Max obtains when he plays optimally.

For qualitative objectives, poorman games have mostly similar properties to the corresponding Richman games, though they are technically more complicated. We start with reachability objectives, which were studied in [32, 31]. The objective they study is slightly different than ours and we call it double-reachability: both players have targets and the game ends once one of the targets is reached. As we show below, for our purposes, the variants are equivalent since there are no draws in finite-state double-reachability poorman and Richman games.

Consider a double-reachability game $G = \langle V, E, u_1, u_2 \rangle$, where, for $i = 1, 2$, the target of Player $i$ is $u_i$. In both Richman and poorman bidding, trivially Player 1 wins in $u_1$ with any initial budget and Player 2 wins in $u_2$ with any initial budget, thus $\mathrm{Th}(u_1) = 0$ and $\mathrm{Th}(u_2) = 1$. For $v \in V$, let $v^+, v^- \in N(v)$ be such that, for every $v' \in N(v)$, we have $\mathrm{Th}(v^-) \le \mathrm{Th}(v') \le \mathrm{Th}(v^+)$.

Theorem 7. [32, 31] Threshold ratios exist in reachability Richman and poorman games. Moreover, consider a double-reachability game $G = \langle V, E, u_1, u_2 \rangle$.
• In Richman bidding, for $v \in V \setminus \{u_1, u_2\}$, we have $\mathrm{Th}(v) = \frac{1}{2}\big(\mathrm{Th}(v^+) + \mathrm{Th}(v^-)\big)$, and it follows that $\mathrm{Th}(v) = \mathrm{val}(\mathrm{RT}^{0.5}(G, v))$ and that $\mathrm{Th}(v)$ is a rational number.
• In poorman bidding, for $v \in V \setminus \{u_1, u_2\}$, we have $\mathrm{Th}(v) = \mathrm{Th}(v^+) / \big(1 - \mathrm{Th}(v^-) + \mathrm{Th}(v^+)\big)$. There is a game $G$ and a vertex $v$ with an irrational $\mathrm{Th}(v)$.

Proof. The proof here is similar to [31] and is included for completeness, with a slight difference: unlike [31], which assumes that every vertex has a path to both targets, we also address the case where one of the targets is not reachable.
This will prove helpful when reasoning about infinite-duration bidding games. The Richman case is irrelevant for us and we leave it out.

We start with the two simpler claims. Assume that in a double-reachability poorman game $G$, for each vertex $v$, we have $\mathrm{Th}(v) = \mathrm{Th}(v^+) / \big(1 - \mathrm{Th}(v^-) + \mathrm{Th}(v^+)\big)$. We show a double-reachability poorman game with irrational threshold ratios. Consider the game with vertices $u_1$, $v_1$, $v_2$, and $u_2$, and edges $u_1 \leftarrow v_1 \leftrightarrow v_2 \rightarrow u_2$. The equation above gives $\mathrm{Th}(v_1) = \mathrm{Th}(v_2)/(1 + \mathrm{Th}(v_2))$ and $\mathrm{Th}(v_2) = 1/(2 - \mathrm{Th}(v_1))$. Solving, we get $\mathrm{Th}(v_2) = (\sqrt{5} - 1)/2$ and $\mathrm{Th}(v_1) = (3 - \sqrt{5})/2$, which are irrational.

Next, we show existence of threshold ratios in reachability poorman games by reducing them to double-reachability games. Consider a game $G = \langle V, E, u_1 \rangle$. Let $S \subseteq V$ be the set of vertices that have no path to $u_1$. Since Player 1 cannot win from any vertex in $S$, we have $\mathrm{Th}(v) = 1$ for every $v \in S$. Let $G' = \langle V', E', u_1, u_2 \rangle$ be the double-reachability game that is obtained from $G$ by setting $V' = V \setminus S$ and Player 2’s target $u_2$ to be a vertex in $S$. Consider a vertex $v \in V'$. We claim that $\mathrm{Th}(v)$ in $G'$ equals $\mathrm{Th}(v)$ in $G$. Indeed, if Player 1’s ratio exceeds $\mathrm{Th}(v)$ he can draw the game to $u_1$, and if Player 2’s ratio exceeds $1 - \mathrm{Th}(v)$ he can draw the game to $S$.

Finally, we show that every vertex in a double-reachability game has a threshold ratio. Consider a double-reachability poorman game $G = \langle V, E, u_1, u_2 \rangle$. It is shown in [31] that there exists a unique function $f : V \to [0, 1]$ that satisfies the following conditions: we have $f(u_1) = 0$ and $f(u_2) = 1$, and for every $v \in V \setminus \{u_1, u_2\}$, we have $f(v) = \frac{f(v^+)}{1 + f(v^+) - f(v^-)}$, where $v^+, v^- \in N(v)$ are the neighbors of $v$ that respectively maximize and minimize $f$, i.e., for every $v' \in N(v)$, we have $f(v^-) \le f(v') \le f(v^+)$.

We claim that for every $v \in V$, we have $\mathrm{Th}(v) = f(v)$.
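To make the poorman recurrence concrete, the following sketch approximates the function $f$ numerically for the four-vertex game with edges $u_1 \leftarrow v_1 \leftrightarrow v_2 \rightarrow u_2$. It is an illustration only: the function name `poorman_thresholds`, the dictionary encoding of the game, the starting value 0.5, and the use of a naive fixed-point iteration are our assumptions, and convergence of this iteration is assumed rather than proved (it does converge on this small game).

```python
import math


def poorman_thresholds(neighbors, target1, target2, rounds=200):
    """Iterate f(v) = f(v+) / (1 + f(v+) - f(v-)) from an arbitrary start.

    `neighbors` maps every non-target vertex to the list of its successors.
    `target1` gets the fixed value 0 and `target2` the fixed value 1.
    Convergence of this naive iteration is an assumption of the sketch.
    """
    f = {v: 0.5 for v in neighbors}
    f[target1], f[target2] = 0.0, 1.0
    for _ in range(rounds):
        new_f = dict(f)
        for v, succ in neighbors.items():
            hi = max(f[u] for u in succ)  # value at v+
            lo = min(f[u] for u in succ)  # value at v-
            new_f[v] = hi / (1 + hi - lo)
        f = new_f
    return f


# The game u1 <- v1 <-> v2 -> u2.
game = {"v1": ["u1", "v2"], "v2": ["v1", "u2"]}
th = poorman_thresholds(game, "u1", "u2")
```

On this game the iteration settles at `th["v1"]` close to $(3 - \sqrt{5})/2 \approx 0.382$ and `th["v2"]` close to $(\sqrt{5} - 1)/2 \approx 0.618$, the two roots obtained by solving the recurrence by hand.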
Our argument will be for Player 1 and duality gives an argument for Player 2. Suppose Player 1’s budget is $f(v) + \epsilon$ and Player 2’s budget is $1 - f(v)$, for some $\epsilon > 0$. Note that we implicitly assume that $f(v) < 1$. In case $f(v) = 1$ we do not show anything, but still, our dual strategy for Player 2 ensures that $u_2$ is visited, when the initial budget for Player 2 is positive. We describe a Player 1 strategy that forces the game to $u_1$.

Similar to [31], we divide Player 1’s budget ratio into his real budget and a slush fund. We will ensure the following invariants:
1. Whenever we are in state $v$, if $x$ is Player 1’s real budget and $y$ is Player 2’s budget, then $f(v) = x/(x + y)$.
2. Every time Player 2 wins a bidding the slush fund increases by a constant factor. Formally, there exists a constant $c > 1$, such that when $\epsilon_0$ is the initial slush fund and $\epsilon_i$ is the slush fund after Player 2 wins for the $i$-th time, we have that $\epsilon_i > c \cdot \epsilon_{i-1}$, for all $i \ge 1$.
Note that these invariants are satisfied initially.

We describe a Player 1 strategy. Consider a round in vertex $v$ in which Player 1’s real budget is $x'$, Player 2’s budget is $y'$ and the last time Player 2 won (or initially, in case Player 2 has not won yet) Player 1’s slush fund was $\epsilon'$. Player 1’s bid is $\Delta(v) \cdot x' + \delta_v \cdot \epsilon'$, where we define $\Delta(v)$ and $\delta_v$ below. Upon winning, Player 1 moves to $v^-$, i.e., to the neighbor that minimizes $f$, or, when $f(v) = 0$, he moves to a vertex closer to $u_1$. Upon winning, Player 1 pays $\Delta(v) \cdot x'$ from his real budget and $\delta_v \cdot \epsilon'$ from his slush fund.

For $v \in V \setminus \{u_1, u_2\}$, if $f(v) > 0$ and $f(v^-) < 1$, let $\Delta(v) = \frac{f(v) - f(v^-)}{f(v)(1 - f(v^-))}$ and otherwise, let $\Delta(v) = 0$.
Note that the second invariant indicates that Player 2 cannot win more than a finite number of times, since whenever he wins, the slush fund increases by a constant factor and the slush fund cannot exceed $1$, because then it would be bigger than the total budget. This in turn shows that eventually Player 1 wins $n$ times in a row, which ensures that the play reaches $u_1$.

We choose $\delta_v$, for $v \in V$, and show that our choice implies that Player 1’s strategy maintains the invariants above. Let $\Delta_{\min}$ be the smallest positive number such that $f(v) = \Delta_{\min}$ for some $v$, and $\Delta_{\min} = 1$ if $f(v) = 0$ for all $v \in V$. Let $\delta_1$ be $1$ and $\delta_i$ be such that $\sum_{j=1}^{i-1} \delta_j < \Delta_{\min}/2 \cdot \delta_i$, for all $i \in \{2, \ldots, |V|\}$. Also, let $\gamma$ be such that $\sum_{j=1}^{|V|} \delta_j < 1/\gamma$. For each state $v$ (such that $f(v) > 0$), consider that Player 1 wins all bids and let $\mathrm{dist}(v)$ be the number of bids before the play ends up in $u_1$ starting from $v$. When $f(v) = 0$, let $\mathrm{dist}(v)$ be the length of the shortest path from $v$ to $u_1$. Then, $\delta_v = \gamma \delta_i$, for $i = |V| - \mathrm{dist}(v)$.

In case Player 1 wins, his real budget becomes $x' - \Delta(v) x'$, and Player 2’s budget stays $y'$. In that case, Player 1’s new real budget ratio becomes $\frac{(1 - \Delta(v)) x'}{(1 - \Delta(v)) x' + y'} = f(v^-)$, and the invariants are thus satisfied. (His slush fund also decreases by $\delta_v \epsilon'$. We will not prove anything about the slush fund in this case, except noting that it stays positive.)

In case Player 2 wins, Player 1’s real budget stays $x'$ and Player 2’s budget is at most $y' - \Delta(v) x' - \delta_v \epsilon'$. By construction, if Player 2’s budget became $y' - \Delta(v) x'$, then Player 1’s budget ratio would become $\frac{x'}{x' + y' - \Delta(v) x'} = f(v^+)$, so even if Player 2 moves to $v^+$, Player 2 has paid $\delta_v \epsilon'$ too much for Player 1’s real budget ratio to be $f(v^+)$. Thus, the first invariant is satisfied.
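The two budget-ratio identities used in this case analysis can be checked mechanically. The sketch below is an illustrative sanity check under our own naming (`check_bid_invariant` is hypothetical): given values for $f(v^-)$ and $f(v^+)$, it computes $f(v)$ from the poorman recurrence and $\Delta(v)$ from its definition, normalizes the budgets so that $x/(x + y) = f(v)$, and compares the ratios after a Player 1 win and after a Player 2 win (ignoring the slush-fund part of the bid) with $f(v^-)$ and $f(v^+)$. Exact rational arithmetic is used so that equality can be tested directly. The sketch assumes $f(v^-) < 1$ and $f(v^+) > 0$.

```python
from fractions import Fraction


def check_bid_invariant(f_minus, f_plus):
    """Check both ratio identities for one vertex, given f(v-) and f(v+).

    Assumes 0 <= f_minus <= f_plus <= 1, f_minus < 1 and f_plus > 0.
    Returns True iff paying Delta(v) * x moves the real budget ratio to
    f(v-) when Player 1 wins and to f(v+) when Player 2 pays that amount.
    """
    f_minus, f_plus = Fraction(f_minus), Fraction(f_plus)
    f_v = f_plus / (1 + f_plus - f_minus)            # poorman recurrence
    delta = (f_v - f_minus) / (f_v * (1 - f_minus))  # Delta(v)
    x, y = f_v, 1 - f_v  # normalised budgets with x / (x + y) = f(v)
    after_p1_win = (1 - delta) * x / ((1 - delta) * x + y)
    after_p2_win = x / (x + y - delta * x)
    return after_p1_win == f_minus and after_p2_win == f_plus
```

For example, with $f(v^-) = 1/4$ and $f(v^+) = 3/4$ the recurrence gives $f(v) = 1/2$ and $\Delta(v) = 2/3$, and both identities hold exactly.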
Note that this also indicates that $f(v^+) \neq 1$ in this case, since otherwise Player 1’s budget ratio must be above $1$, indicating that Player 2’s budget is negative. When $f(v^+) > 0$, we can move $\delta_v \epsilon' f(v^+) / (1 - f(v^+)) \ge \delta_v \epsilon' \Delta_{\min}$ into the slush fund. When $f(v^+) = 0$, the new slush fund is $\delta_v \epsilon'$. Let $j$ be such that $\delta_j = \delta_v$. By construction of $\delta_v$, we have that since the last time Player 2 won a bidding (or since the start if Player 2 never won a bid before), we have subtracted at most $\epsilon' \sum_{i=j+1}^{|V|} \delta_i$ from the slush fund and now we have added $\delta_j \epsilon' \Delta_{\min}$. But $\delta_i$ was chosen such that $\sum_{i=j+1}^{|V|} \delta_i$ was below $\delta_v \Delta_{\min}/2$. Hence, on net we have added at least $\delta_v \epsilon' \Delta_{\min}/2$ to the previous content $\epsilon'$. Because $\delta_v$ and $\Delta_{\min}$ are constants, we have thus increased the slush fund by a constant factor. The invariants are thus satisfied in this case.

We continue to study poorman games with richer objectives.

Theorem 8.
Parity poorman games are linearly reducible to reachability poorman games. Specifically, threshold ratios exist in parity poorman games.

Proof.
The crux of the proof is to show that in a bottom strongly-connected component (BSCC, for short) of $\mathcal{G}$, one of the players wins with every initial budget. Thus, the threshold ratios for vertices in BSCCs are either $0$ or $1$. For the rest of the vertices, we construct a reachability game in which a player's goal is to reach a BSCC that is "winning" for him.

Formally, consider a strongly-connected parity poorman game $\mathcal{G} = \langle V, E, p \rangle$. We claim that there is $\alpha \in \{0, 1\}$ such that for every $v \in V$, we have $\mathrm{Th}(v) = \alpha$, i.e., when $\alpha = 0$, Player 1 wins with any positive initial budget, and similarly for $\alpha = 1$. Moreover, deciding which is the case is easy: let $v_{Max} \in V$ be the vertex with maximal parity index, then $\alpha = 0$ iff $p(v_{Max})$ is odd.

Suppose $p(v_{Max})$ is odd; the proof for an even $p(v_{Max})$ is dual. We prove the claim in two steps. First, following the proof of Theorem 7, when Player 1's initial budget is $\epsilon > 0$, he can draw the game to $v_{Max}$ once. Second, we show that Player 1 can reach $v_{Max}$ infinitely often when his initial budget is $\epsilon > 0$. Player 1 splits his budget into parts $\epsilon_1, \epsilon_2, \ldots$, where $\epsilon_i = \epsilon \cdot 2^{-i}$, for $i \geq 1$, thus $\sum_{i \geq 1} \epsilon_i = \epsilon$. Then, following the $i$-th visit to $v_{Max}$, he plays the strategy necessary to draw the game to $v_{Max}$ with initial budget $\epsilon_{i+1}$.

We turn to show the reduction from parity poorman games to double-reachability poorman games. Consider a parity poorman game $\mathcal{G} = \langle V, E, p \rangle$. Let $S \subseteq V$ be a BSCC in $\mathcal{G}$. We call $S$ winning for Player 1 if the vertex $v_{Max}$ with highest parity index in $S$ has odd $p(v_{Max})$. Dually, we call $S$ winning for Player 2 if $p(v_{Max})$ is even. Indeed, the claim above implies that for every $S$ that is winning for Player 1 and $v \in S$, we have $\mathrm{Th}(v) = 0$, and dually for Player 2.
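The classification of BSCCs and the geometric budget split used in the argument above can be illustrated with a minimal Python sketch. The function names `bscc_threshold` and `split_budget` are hypothetical helpers introduced only for this illustration; the sketch assumes the BSCCs have already been computed and are given by their parity indices.

```python
from fractions import Fraction


def bscc_threshold(priorities):
    """Threshold ratio shared by all vertices of one BSCC.

    `priorities` holds the parity indices p(v) of the vertices of the BSCC.
    Following the claim above: 0 if the maximal index is odd, 1 otherwise.
    """
    return 0 if max(priorities) % 2 == 1 else 1


def split_budget(eps, rounds):
    """First `rounds` parts of the split eps_i = eps * 2^(-i), for i >= 1."""
    eps = Fraction(eps)
    return [eps / 2**i for i in range(1, rounds + 1)]


parts = split_budget(Fraction(1, 10), 20)
# However many parts are used, they never add up to more than eps.
assert sum(parts) < Fraction(1, 10)
```

Exact rationals are used so that the partial sums of the split can be compared with the whole budget without rounding error.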
Let $\mathcal{G}'$ be a double-reachability poorman game that is obtained from $\mathcal{G}$ by setting the BSCCs that are winning for Player 1 in $\mathcal{G}$ to be his target in $\mathcal{G}'$ and the BSCCs that are winning for Player 2 in $\mathcal{G}$ to be his target in $\mathcal{G}'$. Similar to the proof of Theorem 7, we have that $\mathrm{Th}(v)$ in $\mathcal{G}$ equals $\mathrm{Th}(v)$ in $\mathcal{G}'$, and we are done.

This section consists of our most technically challenging contribution. We construct optimal strategies for the players in mean-payoff poorman games. The crux of the solution regards strongly-connected mean-payoff games, which we develop in the first three sub-sections.

Consider a strongly-connected game $\mathcal{G}$ and an initial ratio $r \in [0, 1]$. We claim that the value in $\mathcal{G}$ w.r.t. $r$ does not depend on the initial vertex. For a vertex $v$ in $\mathcal{G}$, recall that $\mathrm{MP}^r(\mathcal{G}, v)$ is the maximal payoff Max can guarantee when his initial ratio in $v$ is $r + \epsilon$, for every $\epsilon > 0$. We claim that for every vertex $u \neq v$ in $\mathcal{G}$, we have $\mathrm{MP}^r(\mathcal{G}, u) = \mathrm{MP}^r(\mathcal{G}, v)$. Indeed, as in Theorem 8, Max can play as if his initial ratio is $\epsilon/2$ and draw the game from $u$ to $v$, and from there play using an initial ratio of $r + \epsilon/2$. Since the energy that is accumulated until reaching $v$ is constant, it does not affect the payoff of the infinite play starting from $v$. We write $\mathrm{MP}^r(\mathcal{G})$ to denote the value of $\mathcal{G}$ w.r.t. $r$. We show the equivalence with random-turn games: the value $\mathrm{MP}^r(\mathcal{G})$ equals the value $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G}))$ of the random-turn mean-payoff game $\mathrm{RT}^r(\mathcal{G})$ in which Max chooses the next move with probability $r$ and Min with probability $1 - r$.

In this section we solve a simple game through which we demonstrate the ideas of the general case. Recall that in an energy game, Min wins a finite play if the sum of weights it traverses, a.k.a. the energy, is $0$, and Max wins an infinite play in which the energy stays positive throughout the play.

Lemma 9. [31] In the energy game that is depicted in Fig.
1, if the initial energy is $k \in \mathbb{N}$, then Max wins iff his initial ratio exceeds $\frac{k+2}{2k+2}$.

The first implication in Lemma 9 is the important one for us. It shows that Max can guarantee a payoff of $0$ with an initial ratio that exceeds $0.5$. Indeed, given an initial ratio of $0.5 + \epsilon$, Max plays as if the initial energy is $k \in \mathbb{N}$ such that $\frac{k+2}{2k+2} < 0.5 + \epsilon$. He thus keeps the energy bounded from below by $-k$, which implies that the payoff is non-negative.

We describe an alternative proof for the first implication in Lemma 9 whose ideas we will later generalize. We need several definitions. For $k \in \mathbb{N}$, let $S_k$ be the square of area $k^2$. In Fig. 3, we depict $S_5$. We split $S_k$ into unit-area boxes such that each of its sides contains $k$ boxes. A diagonal in $S_k$ splits it into a smaller black triangle and a larger white one. For $k \in \mathbb{N}$, we respectively denote by $t_k$ and $T_k$ the areas of the smaller black triangle and the larger white triangle of $S_k$. For example, we have $t_5 = 10$ and $T_5 = 15$, and in general $t_k = \frac{k(k-1)}{2}$ and $T_k = \frac{k(k+1)}{2}$.

Figure 3: The square $S_5$ with area $25$ and the sizes of some triangles.

Suppose the game starts with energy $\kappa \in \mathbb{N}$. We show that Max wins when his ratio exceeds $\frac{\kappa+2}{2\kappa+2}$, which equals $\frac{T_{\kappa+1}}{(\kappa+1)^2}$. For ease of presentation, it is convenient to assume that the players' ratios add up to $1 + \epsilon_0$, Max's initial ratio is $\frac{T_{\kappa+1}}{(\kappa+1)^2} + \epsilon_0$, and Min's initial ratio is $\frac{t_{\kappa+1}}{(\kappa+1)^2}$. For $j \geq 0$, we think of $\epsilon_j$ as Max's slush fund in the $j$-th round of the game, though its role here is somewhat less significant than in Theorem 7. Consider a play $\pi$. We think of changes in energy throughout $\pi$ and changes in budget ratio as representing two walks on two sequences. The energy sequence is $\mathbb{N}$ and the budget sequence is $\{t_k / S_k : k \in \mathbb{N}\}$, with the natural order in the two sets.
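The triangle areas and the identity between the bound of Lemma 9 and the box-counting ratio can be checked mechanically. The following Python sketch is only an illustration; the helper names `small_triangle`, `large_triangle`, `lemma9_ratio`, and `energy_for_surplus` are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def small_triangle(k):
    """Area t_k of the smaller triangle of the square S_k."""
    return k * (k - 1) // 2


def large_triangle(k):
    """Area T_k of the larger triangle of the square S_k."""
    return k * (k + 1) // 2


def lemma9_ratio(k):
    """The bound (k+2)/(2k+2) of Lemma 9 for initial energy k."""
    return Fraction(k + 2, 2 * k + 2)


def energy_for_surplus(eps):
    """Smallest k >= 1 with (k+2)/(2k+2) < 1/2 + eps, for eps > 0."""
    eps = Fraction(eps)
    k = 1
    while lemma9_ratio(k) >= Fraction(1, 2) + eps:
        k += 1
    return k


for k in range(1, 50):
    # The two triangles tile the square S_k.
    assert small_triangle(k) + large_triangle(k) == k * k
    # T_{k+1} / (k+1)^2 coincides with the bound of Lemma 9.
    assert Fraction(large_triangle(k + 1), (k + 1) ** 2) == lemma9_ratio(k)
```

For a surplus of $1/10$ over the ratio $0.5$, `energy_for_surplus(Fraction(1, 10))` searches for the energy level that Max can pretend to start from.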
We show a strategy for Max that maintains the invariant that whenever the energy is $k \in \mathbb{N}$, then Max's ratio is greater than $T_{k+1}/(k+1)^2$. That is, whenever Max wins a bidding, both sequences take a "step up" and when he loses, both sequences take a "step down".

We describe Max's strategy. Upon winning a bidding, Max proceeds to the vertex that increases the energy by one. We assume w.l.o.g. that upon winning, Min proceeds to the vertex that decreases the energy by one. The challenge is to find the right bids. Suppose the energy level is $k$ at the $j$-th round. Thus, Max and Min's ratios are respectively $T_{k+1}/(k+1)^2 + \epsilon_j$ and $t_{k+1}/(k+1)^2$. In other words, Min owns $t_{k+1}$ boxes and Max owns a bit more than $T_{k+1}$ boxes. Max's bid consists of two parts. Max bids $1/(k+1)^2 + \epsilon_j/2$, or in other words, a single box and half of his slush fund. We first show how the strategy maintains the invariant and then how it guarantees that an energy of $0$ is never reached.

Suppose first that Max wins the bidding. The total number of boxes decreases by one to $(k+1)^2 - 1$, his slush fund is cut by half, and Min's budget is unchanged. Thus, Max's ratio of the budget is more than $(T_{k+1} - 1)/\big((k+1)^2 - 1\big)$, which equals $T_{k+2}/(k+2)^2$. For example, let $k = 4$ and Max's ratio exceeds $\frac{T_5}{t_5 + T_5}$. Following a bidding win the energy increases to $k = 5$ and Max's ratio is more than $\frac{T_5 - 1}{t_5 + T_5 - 1} = \frac{15 - 1}{25 - 1} = \frac{14}{24} = \frac{21}{36} = \frac{T_6}{t_6 + T_6}$. In other words, we take a step up in both sequences.

The other case is when Min wins the bidding, the energy decreases by $1$, and we show that the budget sequence takes a step down. Since Max bids more than one box, and Min overbids, Min bids at least one box. Max's new ratio is more than $T_{k+1}/\big((k+1)^2 - 1\big) = T_k/k^2$, thus dually, both sequences take a step down. For example, again let $k = 4$ and Max's ratio exceeds $\frac{T_5}{t_5 + T_5}$. Upon losing a bidding, the energy decreases to $k = 3$ and Max's ratio is more than $\frac{15}{25 - 1} = \frac{15}{24} = \frac{10}{16} = \frac{T_4}{t_4 + T_4}$.

It is left to show that the energy never reaches $0$, thus the walk on the budget sequence never reaches the first element. Suppose the energy is $k = 1$ in the $j$-th round, thus according to the invariant, Max's ratio is $\frac{3}{4} + \epsilon_j$ and Min's ratio is $\frac{1}{4}$. Recall that Max bids $\frac{1}{(k+1)^2} + \epsilon_j/2$ at energy $k$. In particular, he bids $\frac{1}{4} + \epsilon_j/2$ at energy $1$, which exceeds Min's budget, thus Max necessarily wins the bidding, implying that the energy increases.

In an arbitrary strongly-connected game the bids in the different vertices cannot be the same. In this section we develop a technique to determine the "importance" of a node $v$, which we call its strength and which measures how high the bid should be in $v$ compared with the other vertices.

Consider a strongly-connected game $\mathcal{G} = \langle V, E, w \rangle$ and $r \in [0, 1]$. Recall that $\mathrm{RT}^r(\mathcal{G})$ is a random-turn game in which Max chooses the next move with probability $r$ and Min with probability $1 - r$. A positional strategy is a strategy that always chooses the same action (edge) in a vertex. It is well known that there exist optimal positional strategies for both players in stochastic mean-payoff games.

Consider two optimal positional strategies $f$ and $g$ in $\mathrm{RT}^r(\mathcal{G})$, for Min and Max, respectively. For a vertex $v \in V$, let $v^-, v^+ \in V$ be such that $v^- = f(v_{Min})$ and $v^+ = g(v_{Max})$. The potential of $v$, denoted $\mathrm{Pot}_r(v)$, is a known concept in probabilistic models and its existence is guaranteed [42]. We use the potential to define the strength of $v$, denoted $\mathrm{St}_r(v)$, which intuitively measures how much the potentials of the neighbors of $v$ differ. We assume w.l.o.g. that $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G})) = 0$ as otherwise we can decrease all weights by this value. Let $\nu \in \mathbb{Q}$ be such that $r = \frac{\nu}{\nu + 1}$.
The potential and strengths of $v$ are functions that satisfy the following:

\[ \mathrm{Pot}_r(v) = \frac{\nu \cdot \mathrm{Pot}_r(v^+) + \mathrm{Pot}_r(v^-)}{1 + \nu} + w(v) \quad \text{and} \quad \mathrm{St}_r(v) = \frac{\mathrm{Pot}_r(v^+) - \mathrm{Pot}_r(v^-)}{1 + \nu}. \]

There are optimal strategies for which $\mathrm{Pot}_r(v^-) \leq \mathrm{Pot}_r(v') \leq \mathrm{Pot}_r(v^+)$, for every $v' \in N(v)$, which can be found for example using the strategy iteration algorithm.

Consider a finite path $\pi = v_1, \ldots, v_n$ in $\mathcal{G}$. We intuitively think of $\pi$ as a play, where for every $1 \leq i < n$, the bid of Max in $v_i$ is $\mathrm{St}_r(v_i)$ and he moves to $v_i^+$ upon winning. Thus, if $v_{i+1} = v_i^+$, we say that Max won in $v_i$, and if $v_{i+1} \neq v_i^+$, we say that Max lost in $v_i$. Let $W(\pi)$ and $L(\pi)$ respectively be the indices in which Max wins and loses in $\pi$. We call Max's wins investments and his losses gains, where intuitively he invests in increasing the energy and gains a higher ratio of the budget whenever the energy decreases. Let $G(\pi)$ and $I(\pi)$ be the sum of gains and investments in $\pi$, respectively, thus $G(\pi) = \sum_{i \in L(\pi)} \mathrm{St}_r(v_i)$ and $I(\pi) = \sum_{i \in W(\pi)} \mathrm{St}_r(v_i)$. Recall that the energy of $\pi$ is $E(\pi) = \sum_{1 \leq i < n} w(v_i)$.

Lemma 10. Consider a strongly-connected game $\mathcal{G}$, a ratio $r = \frac{\nu}{1 + \nu} \in (0, 1)$ such that $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G})) = 0$, and a finite path $\pi$ in $\mathcal{G}$ from $v$ to $u$. Then, $\mathrm{Pot}_r(v) - \mathrm{Pot}_r(u) \leq E(\pi) + \nu \cdot G(\pi) - I(\pi)$.

Proof. We prove by induction on the length of $\pi$. For $n = 1$, the claim is trivial since both sides of the inequality are $0$. Suppose the claim is true for paths of length $n$ and we prove for paths of length $n + 1$. Let $\pi'$ be the suffix of $\pi$ starting from the second vertex. We distinguish between two cases. In the first case, Max wins in $v$, thus $\pi'$ starts from $v^+$. Note that since Max wins the first bidding, we have $G(\pi) = G(\pi')$ and $I(\pi) = \mathrm{St}_r(v) + I(\pi')$. Also, we have $E(\pi) = E(\pi') + w(v)$.
Combining these with the induction hypothesis, we have

\[ \begin{aligned} E(\pi) + \nu \cdot G(\pi) - I(\pi) &= -\mathrm{St}_r(v) + w(v) + E(\pi') + \nu \cdot G(\pi') - I(\pi') \\ &\geq -\mathrm{St}_r(v) + w(v) + \mathrm{Pot}_r(v^+) - \mathrm{Pot}_r(u) \\ &= \frac{\mathrm{Pot}_r(v^-) - \mathrm{Pot}_r(v^+) + (1 + \nu) \cdot \mathrm{Pot}_r(v^+)}{1 + \nu} + w(v) - \mathrm{Pot}_r(u) \\ &= \mathrm{Pot}_r(v) - \mathrm{Pot}_r(u). \end{aligned} \]

In the second case, Max loses the first bidding, thus $\pi'$ starts from some $v'$ with $\mathrm{Pot}_r(v') \geq \mathrm{Pot}_r(v^-)$, $I(\pi) = I(\pi')$, and $G(\pi) = G(\pi') + \mathrm{St}_r(v)$. We combine with the induction hypothesis to obtain the following:

\[ \begin{aligned} E(\pi) + \nu \cdot G(\pi) - I(\pi) &= \nu \cdot \mathrm{St}_r(v) + w(v) + E(\pi') + \nu \cdot G(\pi') - I(\pi') \\ &\geq \nu \cdot \mathrm{St}_r(v) + w(v) + \mathrm{Pot}_r(v') - \mathrm{Pot}_r(u) \\ &\geq \nu \cdot \mathrm{St}_r(v) + w(v) + \mathrm{Pot}_r(v^-) - \mathrm{Pot}_r(u) \\ &= \frac{\nu \cdot \mathrm{Pot}_r(v^+) - \nu \cdot \mathrm{Pot}_r(v^-) + (1 + \nu) \cdot \mathrm{Pot}_r(v^-)}{1 + \nu} + w(v) - \mathrm{Pot}_r(u) \\ &= \mathrm{Pot}_r(v) - \mathrm{Pot}_r(u). \end{aligned} \]

Example 11. Consider the game depicted in Fig. 2. Max always proceeds left and Min always proceeds right, so, for example, we have $v_2^+ = v_1$ and $v_2^- = v_3$. It is not hard to verify that $\mathrm{MP}(\mathrm{RT}^{2/3}(\mathcal{G})) = 0$ by finding the stationary distribution of $\mathrm{RT}^{2/3}(\mathcal{G})$. We have $P(v_1) = 6$, $P(v_2) = 3$, $P(v_3) = 0$, and $P(v_4) = -3$. Thus, the strengths are $\mathrm{St}(v_1) = 1$, $\mathrm{St}(v_2) = 2$, $\mathrm{St}(v_3) = 2$, and $\mathrm{St}(v_4) = 1$. Consider the path $\pi = v_3, v_2, v_1, v_1, v_2, v_3$ in which Max wins the first three bids and loses the last two, thus $G(\pi) = 1 + 2$ and $I(\pi) = 2 + 2 + 1 = 5$. We have $E(\pi) = -1$ since the last vertex does not contribute to the energy. The left-hand side of the expression in Lemma 10 is $0$, and the right-hand side is $-1 + 2 \cdot 3 - 5$.

4.3 Defining a richer budget sequence

In this section we generalize the ideas from Section 4.1 so that we can treat any strongly-connected graph and any initial ratio. Let $r = \frac{\nu}{1 + \nu}$. For the remainder of this section we fix Min's budget to $1$ and let Max's budget be $\nu$.
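The telescoping argument behind Lemma 10 can be checked numerically. The Python sketch below uses a hypothetical four-vertex chain game; its potentials, its successor maps `PLUS` and `MINUS`, and the value $\nu = 2$ are assumptions chosen for illustration and are not taken from the paper's figures. Weights are derived from the potential equation, so the random-turn value is $0$ by construction.

```python
from fractions import Fraction
import random

NU = Fraction(2)  # assumed: r = NU / (1 + NU) = 2/3
POT = {1: Fraction(6), 2: Fraction(3), 3: Fraction(0), 4: Fraction(-3)}
PLUS = {1: 1, 2: 1, 3: 2, 4: 3}    # v+ : where Max moves upon winning
MINUS = {1: 2, 2: 3, 3: 4, 4: 4}   # v- : where Min moves upon winning


def strength(v):
    """St(v) = (Pot(v+) - Pot(v-)) / (1 + nu)."""
    return (POT[PLUS[v]] - POT[MINUS[v]]) / (1 + NU)


def weight(v):
    """Weight that makes the potential equation hold with value 0."""
    return POT[v] - (NU * POT[PLUS[v]] + POT[MINUS[v]]) / (1 + NU)


def lemma10_sides(path):
    """Return (Pot(first) - Pot(last), E + nu * G - I) for a finite path."""
    steps = list(zip(path, path[1:]))
    energy = sum(weight(v) for v, _ in steps)
    invest = sum(strength(v) for v, nxt in steps if nxt == PLUS[v])
    gains = sum(strength(v) for v, nxt in steps if nxt != PLUS[v])
    return POT[path[0]] - POT[path[-1]], energy + NU * gains - invest


rng = random.Random(0)
for _trial in range(200):
    path = [rng.choice([1, 2, 3, 4])]
    for _step in range(30):
        v = path[-1]
        path.append(rng.choice([PLUS[v], MINUS[v]]))
    lhs, rhs = lemma10_sides(path)
    assert lhs <= rhs  # the inequality of Lemma 10
```

In this sketch every lost bidding leads exactly to $v^-$, so the two sides coincide; in a game where Min may move to a neighbor with a higher potential than $v^-$, only the inequality is expected.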
We find two sequences $\{\nu_x\}_{x > 0}$ and $\{\beta_x\}_{x > 0}$, which we refer to as the budget sequence, with properties on which we elaborate below. Max's bid depends on the position in the budget sequence as well as the strength of the vertex. We find it more convenient to normalize the strength.

Definition 12. (Normalized strength). Let $S = \max_v |\mathrm{St}_r(v)|$. The normalized strength of a vertex $v \in V$ is $\mathrm{nSt}_r(v) = \mathrm{St}_r(v)/S$.

Formally, when the token is placed on a vertex $v \in V$ and the position of the walk is $x$, then Max bids $\beta_x \cdot \mathrm{nSt}_r(v)$. Note that $\mathrm{nSt}_r(v) \in [0, 1]$, for all $v \in V$.

We describe the intuition of the construction. We think of Max's strategy as maintaining a position $x \in \mathbb{R}_{>0}$ on a walk, where his bidding strategy maintains the invariant that his ratio exceeds $\nu_x$. For example, in Section 4.1, the vertices have the same importance, thus their strength is $1$. For $k \in \mathbb{N}$, we have $\nu_k = T_{k+1}/(k+1)^2$ and $\beta_k = 1/(k+1)^2$, and whenever the position is $x = k$, Max's ratio exceeds $\nu_k$. We distinguish between two cases. Suppose first that $\nu \geq 1$. If Max wins the bidding in $v$, then the next position of the walk is $x + \mathrm{nSt}_r(v) \cdot \nu^{-1}$, and if Min wins the bidding, the next position is $x - \mathrm{nSt}_r(v)$. When $\nu < 1$, the next position when Max wins is $x + \mathrm{nSt}_r(v)$, and when he loses, the next position is $x - \mathrm{nSt}_r(v) \cdot \nu$. There are two complications when comparing with the proof in Section 4.1. First, while in Section 4.1 we always take one step when winning a bidding, here the number of steps taken at a vertex $v$ depends on the importance of $v$. Unlike that proof, a step of $s \in \mathbb{Q}$ does not necessarily correspond to a change of $s$ in the energy. Lemma 10 guarantees that steps in the walk even out with changes in energy at the end of cycles, which suffices for our purposes. Second, that proof addresses the case of $r = 1/2$ and here we consider general ratios.
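The update of the walk position can be written as a single function. This Python sketch is an illustration only: `next_position` is a hypothetical name, and the step sizes follow the form used in Lemma 14, namely $n \cdot \min(1, \nu^{-1})$ upwards upon a win and $n \cdot \min(1, \nu)$ downwards upon a loss, where $n$ stands for the normalized strength of the current vertex.

```python
from fractions import Fraction


def next_position(x, n, nu, max_wins):
    """Position of the walk after one bidding.

    x: current position, n: normalized strength in [0, 1],
    nu: Max's budget when Min's budget is normalized to 1.
    """
    x, n, nu = Fraction(x), Fraction(n), Fraction(nu)
    if max_wins:
        return x + n * min(Fraction(1), 1 / nu)
    return x - n * min(Fraction(1), nu)


# With nu = 1 (ratio 1/2) and strength 1, the walk moves by single steps,
# as in the simple game of Section 4.1.
assert next_position(4, 1, 1, True) == 5
assert next_position(4, 1, 1, False) == 3
```

When $\nu = 2$, a win moves the position up by half a step while a loss moves it down by a full step, which reflects that Max is expected to win twice as often as Min.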
When Max's initial ratio is $r$, winning a bidding is $r$-times more costly than winning a bidding for Min. This is illustrated in Example 2, where Min pushes a Max winning bid of $b$ on the queue twice.

We define the following budget sequence.

Definition 13. Let $r = \frac{\nu}{1 + \nu} > 0$ be an initial ratio. For $x > 0$, we define $\nu_x = \nu \left(1 + \frac{2}{x}\right)$ and $\beta_x = \frac{2 \cdot \min(1, \nu)}{x(x+1)}$.

The most important property of the sequences is maintaining the invariant between $x$ and the ratio $\nu_x$. Recall that Max's budget exceeds $\nu_x$ at position $x$ and Min's budget is $1$. Suppose Max's bid is $b$. Then, upon winning, Max's new budget is $\nu_x - b$, and upon losing and re-normalizing Min's budget to $1$, Max's new budget is at least $\nu_x / (1 - b)$. The following lemma shows that the invariant is maintained in both cases.

Lemma 14. For any $0 < x, \nu$ and $n \in [0, 1]$, if $x(x+1) > 2 \cdot n \cdot \min(1, \nu)$, we have

\[ \frac{\nu \left(1 + \frac{2}{x}\right)}{1 - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)}} \geq \nu \left(1 + \frac{2}{x - n \cdot \min(1, \nu)}\right) \quad \text{and} \quad \nu \left(1 + \frac{2}{x}\right) - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)} \geq \nu \left(1 + \frac{2}{x + n \cdot \min(1, \nu^{-1})}\right). \]

Proof. We start with the first claim and argue that $x(x+1) > 2 \cdot n \cdot \min(1, \nu)$ implies that $x > n \cdot \min(1, \nu)$. If $x > 1$, the latter follows directly from our assumptions on $n$ (and that $\min(1, \nu) \leq 1$). On the other hand, if $0 < x \leq 1$, the former can be written as $x c > n \cdot \min(1, \nu)$, for $c = \frac{x+1}{2} \leq 1$, which in particular implies that $x > n \cdot \min(1, \nu)$.

We have that

\[ \frac{\nu \left(1 + \frac{2}{x}\right)}{1 - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)}} = \nu \cdot \frac{\frac{x+2}{x}}{\frac{x(x+1) - 2 \cdot n \cdot \min(1, \nu)}{x(x+1)}} = \nu \cdot \frac{(x+2)(x+1)}{x(x+1) - 2 \cdot n \cdot \min(1, \nu)} > 0 \]

(since $x(x+1) > 2 \cdot n \cdot \min(1, \nu)$). Also,

\[ \nu \left(1 + \frac{2}{x - n \cdot \min(1, \nu)}\right) = \nu \cdot \frac{x - n \cdot \min(1, \nu) + 2}{x - n \cdot \min(1, \nu)} \]

(we have that $x - n \cdot \min(1, \nu) > 0$ from above). Thus,

\[ \begin{aligned} & \frac{\nu \left(1 + \frac{2}{x}\right)}{1 - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)}} \geq \nu \left(1 + \frac{2}{x - n \cdot \min(1, \nu)}\right) \\ \Leftrightarrow\ & \frac{(x+2)(x+1)}{x(x+1) - 2 \cdot n \cdot \min(1, \nu)} \geq \frac{x - n \cdot \min(1, \nu) + 2}{x - n \cdot \min(1, \nu)} \\ \Leftrightarrow\ & (x+2)(x+1)(x - n \cdot \min(1, \nu)) \geq (x - n \cdot \min(1, \nu) + 2)(x(x+1) - 2 \cdot n \cdot \min(1, \nu)) \\ \Leftrightarrow\ & (x+2)(x+1)(x - n \cdot \min(1, \nu)) - (x - n \cdot \min(1, \nu) + 2)(x(x+1) - 2 \cdot n \cdot \min(1, \nu)) \geq 0 \\ \Leftrightarrow\ & 2 n \min(1, \nu)(1 - n \min(1, \nu)) \geq 0. \end{aligned} \]

Note that $n$ and $\min(1, \nu)$ are in $[0, 1]$ and thus, the inequality is true, because each factor is $\geq 0$, and we are done.

We proceed to the second claim and show that for any $0 < x, \nu$ and $n \in [0, 1]$, we have

\[ \nu \left(1 + \frac{2}{x}\right) - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)} \geq \nu \left(1 + \frac{2}{x + n \cdot \min(1, \nu^{-1})}\right). \]

We have that

\[ \nu \left(1 + \frac{2}{x}\right) - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)} = \nu \cdot \frac{x+2}{x} - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)} = \frac{\nu (x+2)(x+1) - 2 \cdot n \cdot \min(1, \nu)}{x(x+1)}. \]

Also,

\[ \nu \left(1 + \frac{2}{x + n \cdot \min(1, \nu^{-1})}\right) = \nu \cdot \frac{x + n \cdot \min(1, \nu^{-1}) + 2}{x + n \cdot \min(1, \nu^{-1})}. \]

Thus,

\[ \begin{aligned} & \nu \left(1 + \frac{2}{x}\right) - \frac{2 \cdot n \cdot \min(1, \nu)}{x(x+1)} \geq \nu \left(1 + \frac{2}{x + n \cdot \min(1, \nu^{-1})}\right) \\ \Leftrightarrow\ & \frac{\nu (x+2)(x+1) - 2 \cdot n \cdot \min(1, \nu)}{x(x+1)} \geq \nu \cdot \frac{x + n \cdot \min(1, \nu^{-1}) + 2}{x + n \cdot \min(1, \nu^{-1})} \\ \Leftrightarrow\ & (\nu (x+2)(x+1) - 2 \cdot n \cdot \min(1, \nu))(x + n \cdot \min(1, \nu^{-1})) \geq \nu \cdot x(x+1) \cdot (x + n \cdot \min(1, \nu^{-1}) + 2) \\ \Leftrightarrow\ & (\nu (x+2)(x+1) - 2 \cdot n \cdot \min(1, \nu))(x + n \cdot \min(1, \nu^{-1})) - \nu \cdot x(x+1) \cdot (x + n \cdot \min(1, \nu^{-1}) + 2) \geq 0 \\ \Leftrightarrow\ & 2 n \min(1, \nu)(1 - n \min(1, \nu^{-1})) \geq 0. \end{aligned} \]

Note that $n$, $\min(1, \nu)$ and $\min(1, \nu^{-1})$ are in $[0, 1]$ and thus, the inequality is true, because each factor is $\geq 0$.

In this section we combine the ingredients developed in the previous sections to solve arbitrary strongly-connected mean-payoff games.

Theorem 15. Consider a strongly-connected mean-payoff poorman game $\mathcal{G}$ and a ratio $r \in [0, 1]$.
The value of $\mathcal{G}$ with respect to $r$ equals the value of the random-turn mean-payoff game $\mathrm{RT}^r(\mathcal{G})$ in which Max chooses the next move with probability $r$, thus $\mathrm{MP}^r(\mathcal{G}) = \mathrm{MP}(\mathrm{RT}^r(\mathcal{G}))$.

Proof. We assume w.l.o.g. that $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G})) = 0$ since otherwise we subtract this value from all weights. Also, the case where $r \in \{0, 1\}$ is easy since $\mathrm{RT}^r(\mathcal{G})$ is a graph and in $\mathcal{G}$, one of the players can win all biddings. Thus, we assume $r \in (0, 1)$. Recall that $\mathrm{MP}(\pi) = \liminf_{n \to \infty} \frac{E(\pi^n)}{n}$. We show a Max strategy that, when the game starts from a vertex $v \in V$ and with an initial ratio of $r + \epsilon$, guarantees that the energy is bounded below by a constant, which implies $\mathrm{MP}(\pi) \geq 0$.

Note that showing such a strategy for Max suffices to prove $\mathrm{MP}^r(\mathcal{G}) = 0$ since our definition for a payoff favors Min. Consider the game $\mathcal{G}'$ that is obtained from $\mathcal{G}$ by multiplying all weights by $-1$. We associate Min in $\mathcal{G}$ with Max in $\mathcal{G}'$, thus an initial ratio of $1 - r - \epsilon$ for Min in $\mathcal{G}$ is associated with an initial ratio of $r + \epsilon$ of Max in $\mathcal{G}'$. We have $\mathrm{MP}(\mathrm{RT}^{1-r}(\mathcal{G}')) = -\mathrm{MP}(\mathrm{RT}^r(\mathcal{G})) = 0$. Let $f$ be a Max strategy in $\mathcal{G}'$ that guarantees a non-negative payoff. Suppose Min plays in $\mathcal{G}$ according to $f$ and let $\pi$ be a play when Max plays some strategy. Since $f$ guarantees a non-negative payoff in $\mathcal{G}'$, we have $\limsup_{n \to \infty} E(\pi^n)/n \leq 0$ in $\mathcal{G}$, and in particular $\mathrm{MP}(\pi) = \liminf_{n \to \infty} E(\pi^n)/n \leq 0$.

Before we describe Max's strategy, we need several definitions. In Definition 13, we set $\nu_x = \nu \cdot (1 + 2/x)$, which clearly tends to $\nu$ from above. We can thus choose $\kappa \in \mathbb{N}$ such that Max's ratio is greater than $\nu_\kappa$. Suppose Max is playing according to the strategy we describe below and Min is playing according to some strategy. The play induces a walk on $\{\nu_x\}_{x \in \mathbb{Q}_{\geq 0}}$, which we refer to as the budget walk.
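The choice of $\kappa$ and the two inequalities of Lemma 14 on which the budget walk relies can be explored numerically. The Python sketch below is an illustration under the convention that Min's budget is normalized to $1$; the names `nu_x`, `beta_x`, `choose_kappa`, and `invariant_survives` are hypothetical helpers, not part of the paper.

```python
from fractions import Fraction


def nu_x(nu, x):
    """nu_x = nu * (1 + 2/x) from Definition 13."""
    return Fraction(nu) * (1 + Fraction(2) / Fraction(x))


def beta_x(nu, x):
    """beta_x = 2 * min(1, nu) / (x * (x + 1)) from Definition 13."""
    x = Fraction(x)
    return 2 * min(Fraction(1), Fraction(nu)) / (x * (x + 1))


def choose_kappa(nu, max_budget):
    """Smallest kappa >= 1 with nu_kappa < max_budget (needs max_budget > nu)."""
    if Fraction(max_budget) <= Fraction(nu):
        raise ValueError("Max's budget must exceed nu")
    kappa = 1
    while nu_x(nu, kappa) >= max_budget:
        kappa += 1
    return kappa


def invariant_survives(nu, x, n):
    """Check both inequalities of Lemma 14 at position x for strength n."""
    nu, x, n = Fraction(nu), Fraction(x), Fraction(n)
    bid = beta_x(nu, x) * n
    if bid >= 1:
        # Precondition x(x+1) > 2*n*min(1, nu) fails; Lemma 14 does not apply.
        return True
    down = x - n * min(Fraction(1), nu)
    up = x + n * min(Fraction(1), 1 / nu)
    lose_ok = nu_x(nu, x) / (1 - bid) >= nu_x(nu, down)
    win_ok = nu_x(nu, x) - bid >= nu_x(nu, up)
    return lose_ok and win_ok


for nu in (Fraction(1, 3), Fraction(1), Fraction(5, 2)):
    for k in range(1, 41):
        for j in range(0, 5):
            assert invariant_survives(nu, Fraction(k, 4), Fraction(j, 4))
```

Since $\nu_x$ tends to $\nu$ only at rate $2\nu/x$, a small surplus over $\nu$ translates into a large starting position $\kappa$.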
Max's strategy guarantees the following:

Invariant: Whenever the budget walk reaches an $x \in \mathbb{Q}$, then Max's ratio is greater than $\nu_x$.

The walk starts in $\kappa$ and the invariant holds initially due to our choice of $\kappa$. Suppose the token is placed on the vertex $v \in V$ and the position of the walk is $x$. Max bids $\mathrm{nSt}_r(v) \cdot \beta_x$, and he moves to $v^+$ upon winning. Suppose first that $\nu \geq 1$. If Max wins the bidding, then the next position of the walk is $x + \mathrm{nSt}_r(v) \cdot \nu^{-1}$, and if Min wins the bidding, the next position is $x - \mathrm{nSt}_r(v)$. When $\nu < 1$, the next position when Max wins is $x + \mathrm{nSt}_r(v)$, and when he loses, the next position is $x - \mathrm{nSt}_r(v) \cdot \nu$. Lemma 14 implies that in both cases the invariant is maintained.

Claim: For every Min strategy, the budget walk stays on positive positions and never reaches $x = 0$.

Suppose $\nu \geq 1$. Thus, when Max loses with a bid of $2n / (x(x+1))$, we step down $n$ steps. In order to reach $x = 0$, the position needs to be $x = n$. But then, Max's bid is $2n / (n(n+1)) \geq 1$, thus Max wins the bidding since Min's budget is $1$. Similarly, when $\nu < 1$, when the bid is $2n\nu / (x(x+1))$, we step down $n \cdot \nu$, and we need $x = n \cdot \nu$ to reach $x = 0$. Again, since $2n\nu / (n\nu(n\nu + 1)) \geq 1$, Max wins the bidding.

Claim: The strategy is legal; Max's bids never exceed his available budget.

Indeed, we have $2n \min(1, \nu) / (x(x+1)) \leq \nu (1 + 2/x)$, for every $0 \leq n \leq 1$ and $\nu > 0$ since $x > 0$.

Claim: The energy throughout a play is bounded from below. Formally, there exists a constant $c \in \mathbb{R}$ such that for every Min strategy and a finite play $\pi$, we have $E(\pi) \geq c$.

Consider a finite play $\pi$. We view $\pi$ as a sequence of vertices in $\mathcal{G}$. Recall that the budget walk starts at $\kappa$, that $G(\pi)$ and $I(\pi)$ represent sums of strengths of vertices, and that $S = \max_{v \in V} |\mathrm{St}_r(v)|$ and $\mathrm{nSt}_r(v) = \mathrm{St}_r(v)/S$. Suppose the budget walk reaches $x$ following the play $\pi$. Then, when $\nu \geq 1$, we have $x = \kappa - G(\pi)/S + I(\pi)/(\nu S)$.
Combining with $x \geq 0$, we have $-S \cdot \kappa \cdot \nu \leq -G(\pi) \cdot \nu + I(\pi)$. Let $P = \max_{u,v} \mathrm{Pot}_r(u) - \mathrm{Pot}_r(v)$. Re-writing Lemma 10, we obtain $-G(\pi) \cdot \nu + I(\pi) \leq E(\pi) + P$. Combining the two, we have $E(\pi) \geq -P - S \cdot \kappa \cdot \nu$. Similarly, when $\nu < 1$, we have $x = \kappa - G(\pi) \cdot \nu / S + I(\pi)/S$ and combining with Lemma 10, we obtain $E(\pi) \geq -P - S \cdot \kappa$, and we are done.

Remark 16. Richman vs poorman bidding. An interesting connection between poorman and Richman biddings arises from Theorem 15. Consider a strongly-connected mean-payoff game $\mathcal{G}$. For an initial ratio $r \in [0, 1]$, let $\mathrm{MP}^r_P(\mathcal{G})$ denote the value of $\mathcal{G}$ with respect to $r$ with poorman bidding. With Richman bidding [6], the value does not depend on the initial ratio; rather, it only depends on the structure of $\mathcal{G}$, and we can thus omit $r$ and use $\mathrm{MP}_R(\mathcal{G})$. Moreover, mean-payoff Richman-bidding games are equivalent to uniform random-turn games, thus $\mathrm{MP}_R(\mathcal{G}) = \mathrm{MP}(\mathrm{RT}^{0.5}(\mathcal{G}))$. Our results show that poorman games with initial ratio $0.5$ coincide with Richman games. Indeed, we have $\mathrm{MP}_R(\mathcal{G}) = \mathrm{MP}^{0.5}_P(\mathcal{G})$. To the best of our knowledge such a connection between the two bidding rules has not been identified before.

Remark 17. Energy poorman games. The proof technique in Theorem 15 extends to energy poorman games. Consider a strongly-connected mean-payoff game $\mathcal{G}$, and let $r \in [0, 1]$ such that $\mathrm{MP}^r(\mathcal{G}) = 0$. Now, view $\mathcal{G}$ as an energy poorman game. The proof of Theorem 15 shows that when Max's initial ratio is $r + \epsilon$, there exists an initial energy level from which he can win the game. On the other hand, when Max's initial ratio is $r - \epsilon$, Min can win the energy game from every initial energy. Indeed, consider the game $\mathcal{G}'$ that is obtained from $\mathcal{G}$ by multiplying all weights by $-1$.
Again, using Theorem 15 and associating Min with Max, Min can keep the energy level bounded from above, which allows him, similar to the qualitative case, to play a strategy in which he either wins or increases his ratio by a constant. Eventually, his ratio is high enough to win arbitrarily many times in a row and drop the energy as low as required.

Remark 18. A general budget sequence. The proof of Theorem 15 uses four properties of the "budget sequence" $\{\nu_x\}_{x \geq 0}$ and $\{\beta_x\}_{x \geq 0}$ that is defined in Definition 13: (1) the invariant between Max's ratio and $\nu_x$ is maintained (shown in Lemma 14), (2) the bids never exceed the available budget, (3) $\lim_{x \to \infty} \nu_x = \nu$, and (4) the walk never reaches $x = 0$. The existence of a budget sequence with these properties is shown in [9] for taxman bidding, which generalizes both Richman and poorman bidding: taxman bidding is parameterized with a constant $\tau \in [0, 1]$, where the higher bidder pays portion $\tau$ of his bid to the other player and portion $(1 - \tau)$ to the bank. Unlike that proof, we define an explicit budget sequence for poorman bidding.

We extend the solution in the previous sections to general graphs in a similar manner to the qualitative case; we first reason about the BSCCs of the graph and then construct an appropriate reachability game on the rest of the vertices.

Consider a mean-payoff poorman game $\mathcal{G} = \langle V, E, w \rangle$. Recall that, for $v \in V$, the ratio $\mathrm{Th}(v)$ is the necessary and sufficient initial ratio for Max to guarantee a non-negative payoff. Let $S_1, \ldots, S_k \subseteq V$ be the BSCCs of $\mathcal{G}$ and $S = \bigcup_{1 \leq i \leq k} S_i$. For $1 \leq i \leq k$, the mean-payoff poorman game $\mathcal{G}_i = \langle S_i, E|_{S_i}, w|_{S_i} \rangle$ is a strongly-connected game. We define $r_i \in [0, 1]$ as follows. If there is an $r \in [0, 1]$ such that $\mathrm{MP}^r(\mathcal{G}_i) = 0$, then $r_i = r$.
Otherwise, if for every $r$, we have $\mathrm{MP}^r(\mathcal{G}_i) > 0$, then $r_i = 0$, and if for every $r$, we have $\mathrm{MP}^r(\mathcal{G}_i) < 0$, then $r_i = 1$. By Theorem 15, for every $v \in S_i$, we have $\mathrm{Th}(v) = r_i$. We construct a generalized reachability game $\mathcal{G}'$ that corresponds to $\mathcal{G}$ by replacing every $S_i$ in $\mathcal{G}$ with a vertex $u_i$. Player 1 wins a path in $\mathcal{G}'$ iff it visits some $u_i$ and when it visits $u_i$, Player 1's ratio is at least $r_i$. It is not hard to generalize the proof of Theorem 7 to generalized reachability poorman games and obtain the following.

Theorem 19. The threshold ratios in a mean-payoff poorman game $\mathcal{G}$ coincide with the threshold ratios in the generalized reachability game that corresponds to $\mathcal{G}$.

In this section we show an application of mean-payoff poorman-bidding games in reasoning about auctions for online advertisements. A typical webpage has ad slots; e.g., in Google's search-results page, ads typically appear above or beside the "actual" search results. Different slots have different value depending on their positions; e.g., slots at the top of the page are typically seen first, thus generate more clicks and are more valuable. A large chunk of the revenue of companies like Google comes from auctions for allocating ad slots that they regularly hold between advertisement companies.

Consider the following auction mechanism. At each time point (e.g., each day), a slot is auctioned and the winner places an ad in the slot. It is common practice in auctions for online ads to hold second-price auctions; namely, the higher bidder sets the ad and pays the bid of the second-highest bidder to the auctioneer. Suppose there are $k \in \mathbb{N}$ ad slots. We take the viewpoint of an advertiser. The state of the webpage is given by $\bar{s} \in \{0, 1\}^k$, where an advertiser's ad appears in a slot $1 \leq i \leq k$ iff $s_i = 1$.
We assume that we are given a reward function $\rho : \{0, 1\}^k \to \mathbb{Q}$ that assigns the utility obtained from each state $\bar{s} \in \{0, 1\}^k$; e.g., the reward can be the expected revenue, which is the expected number of clicks on his ads times the expected revenue from each click. The utility for an infinite sequence $\bar{s}_1, \bar{s}_2, \ldots$ is the mean-payoff of $\rho(\bar{s}_1), \rho(\bar{s}_2), \ldots$. We are interested in finding an optimal bidding strategy in the ongoing auction under two simplifying assumptions: (1) the utility is obtained only from the ads and does not include the price paid for them, and (2) we assume two competitors and full information of the budgets. We obtain an optimal bidding strategy by finding an optimal strategy for Max in a mean-payoff poorman-bidding game. In Section 6, we discuss extensions of the bidding games that we study in this paper that are needed to weaken the two assumptions above.

As a simple example, the special case of one ad slot is modelled as the game in Fig. 1: in each turn the ad slot is auctioned, Max gets a reward of $1$ when his ad shows and a penalty of $-1$ when the competitor's ad is shown. We formalize the general case. Consider an ongoing auction with $k$ slots and a reward function $\rho$. We construct a mean-payoff poorman-bidding game $\mathcal{A}_{k,\rho} = \langle V, E, w \rangle$ as follows. We define $V = \{1, \ldots, k\} \times \{0, 1\}^k$. Consider $v = \langle \ell, \bar{s} \rangle \in V$, where $1 \leq \ell \leq k$ and $\bar{s} = \langle s_1, \ldots, s_k \rangle \in \{0, 1\}^k$. The vector $\bar{s}$ represents the state of the webpage following the previous bidding.
The slot that is auctioned at $v$ is $\ell$, thus the vertex $v$ has two neighbors $u_1 = \langle \ell_1, \bar{s}^1 \rangle$ and $u_2 = \langle \ell_2, \bar{s}^2 \rangle$ with $\ell_1 = \ell_2 = \ell + 1 \bmod k$. The state of the slots apart from the $\ell$-th slot stays the same, thus for every $i \neq \ell$, we have $s^1_i = s^2_i = s_i$. The vertex $u_1$ represents a Max win in the bidding and $u_2$ a Max loss, thus $s^1_\ell = 1$ and $s^2_\ell = 0$. Finally, the weight of $v$ is $\rho(\bar{s})$. Note that $\mathcal{A}_{k,\rho}$ is a strongly-connected mean-payoff poorman-bidding game.

Theorem 20. Consider a second-price ongoing auction with $k$ slots and a reward function $\rho$. An optimal strategy for Max in the poorman-bidding game $\mathcal{A}_{k,\rho}$ coincides with an optimal bidding strategy in the auction.

Proof. The only point that requires proof is that mean-payoff poorman-bidding games are equivalent to mean-payoff games with second-price auctions. Consider a strongly-connected mean-payoff game $\mathcal{G}$. Let $r \in (0, 1)$. Suppose Max's initial budget is $r + \epsilon$, for $\epsilon > 0$. Theorem 15 constructs a Max strategy $f$ that guarantees a payoff of at least $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G}))$ under poorman bidding rules. A close look at this strategy reveals that it ensures a payoff of at least $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G}))$ under second-price rules. Indeed, let $b$ be the Max bid prescribed by $f$ following a finite play. Then, if Max wins the bidding, his payment is at most $b$. On the other hand, if Min wins the bidding, he pays at least $b$. In both cases the invariant on Max's budget is maintained as in the proof of Theorem 15. Finally, a dual argument as in Theorem 15 shows that Min can guarantee a payoff of at most $\mathrm{MP}(\mathrm{RT}^r(\mathcal{G}))$ with second-price bidding rules. We thus conclude that the value of $\mathcal{G}$ under second-price bidding coincides with the value under poorman bidding, and we are done.

We can use Theorem 20 to answer questions of the form "can an advertiser guarantee that his ad shows at least half the time, in the long run?".
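As a concrete illustration of the construction of $\mathcal{A}_{k,\rho}$, the following Python sketch enumerates the vertices, their two successors, and their weights. It is a minimal sketch under stated assumptions: `build_auction_game` is a hypothetical name, slots are indexed from $0$ rather than from $1$ so that the modular increment is direct, and the reward function is passed as an ordinary Python callable on tuples.

```python
from itertools import product


def build_auction_game(k, rho):
    """Return (successors, weight) of the game for k slots and reward rho.

    A vertex is (slot, state) with a 0-based slot index and state in {0,1}^k.
    successors[v] = (vertex after a Max win, vertex after a Max loss).
    """
    successors, weight = {}, {}
    for slot, state in product(range(k), product((0, 1), repeat=k)):
        nxt = (slot + 1) % k
        won = state[:slot] + (1,) + state[slot + 1:]
        lost = state[:slot] + (0,) + state[slot + 1:]
        successors[(slot, state)] = ((nxt, won), (nxt, lost))
        weight[(slot, state)] = rho(state)
    return successors, weight


# One ad slot: reward 1 when Max's ad shows and -1 otherwise.
succ, w = build_auction_game(1, lambda s: 1 if s[0] else -1)
assert w[(0, (1,))] == 1 and w[(0, (0,))] == -1
```

The game has $k \cdot 2^k$ vertices, so this explicit enumeration is only practical for a small number of slots.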
Indeed, set $\rho(\bar{s}) = 1$ when the ad shows and $\rho(\bar{s}) = 0$ when it does not. Then, the payoff corresponds to the long-run average time that the ad shows.

Computational Complexity

We study the complexity of finding the threshold ratios in poorman games. We formalize this search problem as the following decision problem. Recall that threshold ratios in reachability poorman games may be irrational (see Theorem 7).

THRESH-BUD: Given a bidding game $\mathcal{G}$, a vertex $v$, and a ratio $r \in [0, 1] \cap \mathbb{Q}$, decide whether $\mathrm{Th}(v) \geq r$.

Theorem 21. For poorman parity games, THRESH-BUD is in PSPACE.

Proof. To show membership in PSPACE, we guess the optimal moves for the two players. To verify the guess, we construct a program of the existential theory of the reals that uses the relation between the threshold ratios that is described in Theorem 7. Deciding whether such a program has a solution is known to be in PSPACE [16]. Formally, given a parity poorman game $\mathcal{G} = \langle V, E, p \rangle$ and a vertex $v \in V$, we guess, for each vertex $u \in V$, two neighbors $u^+, u^- \in N(u)$. We construct the following program. For every vertex $u \in V$, we introduce a variable $x_u$, and we add constraints so that a satisfying assignment to $x_u$ coincides with the threshold ratio in $u$. Consider a BSCC $S$ of $\mathcal{G}$. Recall that the threshold ratios in $S$ are all either $0$ or $1$, and verifying which is the case can be done in linear time. Suppose the threshold ratios are $\alpha \in \{0, 1\}$. We add constraints $x_u = \alpha$, for every $u \in S$. For every vertex $u \in V$ that is not in a BSCC, we have constraints $x_u = \frac{x_{u^+}}{1 - x_{u^-} + x_{u^+}}$ and $x_{u^-} \leq x_{u'} \leq x_{u^+}$, for every $u' \in N(u)$. By Theorems 7 and 8, a satisfying assignment assigns to $x_u$ the ratio $\mathrm{Th}(u)$. We conclude by adding a final constraint $x_v \geq r$. Clearly, the program has a satisfying assignment iff $\mathrm{Th}(v) \geq r$, and we are done.

We continue to study mean-payoff games.

Theorem 22. For mean-payoff poorman games, THRESH-BUD is in PSPACE.
For strongly-connected games, it is in NP and coNP. For strongly-connected games with out-degree $2$, THRESH-BUD is in P.

Proof. To show membership in PSPACE, we proceed similarly to the qualitative case, and describe a nondeterministic polynomial-space algorithm that uses the existential theory of the reals to verify its guess. Given a game $\mathcal{G}$, we construct a program that finds, for each BSCC $S$ of $\mathcal{G}$, the threshold ratio for all the vertices in $S$. We then extend the program to propagate the threshold ratios to the rest of the vertices, similar to Theorem 19. For the NP and coNP bound, given a strongly-connected game $\mathcal{G}$ and a ratio $r \in [0,1]$, we construct $\mathsf{RT}^r(\mathcal{G})$ in linear time. Then, deciding whether $\mathsf{MP}(\mathsf{RT}^r(\mathcal{G})) \geq 0$ is known to be in NP and coNP.

The more challenging case is the solution for strongly-connected games with out-degree $2$. Consider such a game $\mathcal{G} = \langle V, E, w \rangle$ and $r \in [0,1]$. We construct an MDP $\mathcal{D}$ on the structure of $\mathcal{G}$ such that $\mathsf{MP}(\mathcal{D}) = \mathsf{MP}^r(\mathcal{G})$. Since finding $\mathsf{MP}(\mathcal{D})$ is known to be in P, the claim follows. When $r \geq \frac{1}{2}$, $\mathcal{D}$ is a max-MDP, and when $r < \frac{1}{2}$, it is a min-MDP. We describe the first case; the second is similar. We split every vertex $v \in V$ in three: $v_1 \in V_{Max}$ and $v_2, v_3 \in V_N$. Suppose $\{u_1, u_2\} = N(v)$. Intuitively, moving to $v_2$ means that Max prefers moving to $u_1$ over $u_2$. Thus, we have $\Pr[v_2, u_1] = r = 1 - \Pr[v_2, u_2]$ and $\Pr[v_3, u_1] = 1 - r = 1 - \Pr[v_3, u_2]$. It is not hard to see that $\mathsf{MP}(\mathcal{D}) = \mathsf{MP}^r(\mathcal{G})$.

We studied, for the first time, infinite-duration poorman-bidding games. Historically, poorman bidding has been studied less than Richman bidding, but the reason was technical difficulty, not lack of motivation. In practice, while the canonical use of Richman bidding is a richer notion of fairness, poorman-bidding games are more common since they model an ongoing investment from a bounded budget. We show the existence of threshold ratios for poorman games with qualitative objectives.
For mean-payoff poorman games, we construct optimal strategies with respect to the initial ratio of the budgets. We show an equivalence between mean-payoff poorman games and random-turn games, which, to the best of our knowledge, is the first such equivalence for poorman bidding. Unlike Richman bidding, for which an equivalence with random-turn games holds for reachability objectives, for poorman bidding no such equivalence is known. We thus find the equivalence we show here to be particularly surprising.

We expect the mathematical structure that we find for poorman bidding to be useful in adding to these games concepts that are important for modelling practical settings. For example, our modelling of ongoing auctions made two simplifying assumptions: (1) utility is only obtained from the weights in the graph, and (2) two companies compete for ads and there is full information on the companies' budgets. Relaxing both assumptions is an interesting direction for future work. Relaxing the second assumption requires an addition of two orthogonal concepts that were never studied in bidding games: multiple players and partial information regarding the budgets. Finally, the deterministic nature of bidding games is questionable for practical applications, and a study of probabilistic behavior is initiated in [8].

To the best of our knowledge, we show the first complexity upper bounds on finding threshold ratios in poorman games. We leave open the problem of improving the bounds we show; either improving the PSPACE upper bounds or showing non-trivial lower bounds, e.g., showing ETR-hardness. Since threshold ratios can be irrational, we conjecture that the problem is at least Sum-of-squares-hard. The complexity of finding threshold ratios in undirected reachability Richman-bidding games (a.k.a. "tug-of-war" games) was shown to be in P in [31], thereby solving the problem for uniform undirected random-turn games.
Recently, the solution was extended to undirected biased reachability random-turn games [40].

This work belongs to a line of works that transfer concepts and ideas between three areas with different takes on game theory: formal methods, algorithmic game theory [38], and AI. Examples of works in the intersection of these fields include logics for specifying multi-agent systems [3, 20, 36], studies of equilibria in games related to synthesis and repair problems [19, 17, 25, 2], non-zero-sum games in formal verification [21, 14], and applying concepts from formal methods to resource allocation games; e.g., network games with rich specifications [11] and efficient reasoning about very large games [5, 29].

References

[1] M. Aghajohari, G. Avni, and T. A. Henzinger. Determinacy in discrete-bidding infinite-duration games. In Proc. 30th CONCUR, volume 140 of LIPIcs, pages 20:1–20:17, 2019.
[2] S. Almagor, G. Avni, and O. Kupferman. Repairing multi-player games. In Proc. 26th CONCUR, pages 325–339, 2015.
[3] R. Alur, T. A. Henzinger, and O. Kupferman. Alternating-time temporal logic. J. ACM, 49(5):672–713, 2002.
[4] K. R. Apt and E. Grädel. Lectures in Game Theory for Computer Scientists. Cambridge University Press, 2011.
[5] G. Avni, S. Guha, and O. Kupferman. An abstraction-refinement methodology for reasoning about network games. In Proc. 26th IJCAI, pages 70–76, 2017.
[6] G. Avni, T. A. Henzinger, and V. Chonev. Infinite-duration bidding games. J. ACM, 66(4):31:1–31:29, 2019.
[7] G. Avni, T. A. Henzinger, and R. Ibsen-Jensen. Infinite-duration poorman-bidding games. In Proc. 14th WINE, volume 11316 of LNCS, pages 21–36. Springer, 2018.
[8] G. Avni, T. A. Henzinger, R. Ibsen-Jensen, and P. Novotný. Bidding games on Markov decision processes. In Proc. 13th RP, pages 1–12, 2019.
[9] G. Avni, T. A. Henzinger, and Ð. Žikelić. Bidding mechanisms in graph games. In Proc. 44th MFCS, volume 138 of LIPIcs, pages 11:1–11:13, 2019.
[10] G. Avni, R. Ibsen-Jensen, and J. Tkadlec. All-pay bidding games on graphs. In Proc. 34th AAAI, 2020.
[11] G. Avni, O. Kupferman, and T. Tamir. Network-formation games with regular objectives. Inf. Comput., 251:165–178, 2016.
[12] J. Bhatt and S. Payne. Bidding chess. Math. Intelligencer, 31:37–39, 2009.
[13] E. Borel. La théorie du jeu et les équations intégrales à noyau symétrique. Comptes Rendus de l'Académie, 173(1304–1308):58, 1921.
[14] T. Brihaye, V. Bruyère, J. De Pril, and H. Gimbert. On subgame perfection in quantitative reachability games. Logical Methods in Computer Science, 9(1), 2012.
[15] C. Calude, S. Jain, B. Khoussainov, W. Li, and F. Stephan. Deciding parity games in quasipolynomial time. In Proc. 49th STOC, 2017.
[16] J. F. Canny. Some algebraic and geometric computations in PSPACE. In Proc. 20th STOC, pages 460–467, 1988.
[17] K. Chatterjee. Nash equilibrium for upward-closed objectives. In Proc. 15th CSL, volume 4207 of Lecture Notes in Computer Science, pages 271–286. Springer, 2006.
[18] K. Chatterjee, A. K. Goharshady, and Y. Velner. Quantitative analysis of smart contracts. In Proc. 27th ESOP, pages 739–767, 2018.
[19] K. Chatterjee, T. A. Henzinger, and M. Jurdzinski. Games with secure equilibria. Theor. Comput. Sci., 365(1-2):67–82, 2006.
[20] K. Chatterjee, T. A. Henzinger, and N. Piterman. Strategy logic. Inf. Comput., 208(6):677–693, 2010.
[21] K. Chatterjee, R. Majumdar, and M. Jurdzinski. On Nash equilibria in stochastic games. In Proc. 13th CSL, pages 26–40, 2004.
[22] E. M. Clarke, T. A. Henzinger, H. Veith, and R. Bloem, editors. Handbook of Model Checking. Springer, 2018.
[23] A. Condon. On algorithms for simple stochastic games. In Proc. DIMACS, pages 51–72, 1990.
[24] M. Develin and S. Payne. Discrete bidding games. The Electronic Journal of Combinatorics, 17(1):R85, 2010.
[25] D. Fisman, O. Kupferman, and Y. Lustig. Rational synthesis. In Proc. 16th TACAS, pages 190–204, 2010.
[26] I. L. Gale and M. Stegeman. Sequential auctions of endogenously valued objects. Games and Economic Behavior, 36(1):74–103, 2001.
[27] M. Jurdzinski. Deciding the winner in parity games is in UP ∩ co-UP. Information Processing Letters, 68(3):119–124, 1998.
[28] I. A. Kash, E. J. Friedman, and J. Y. Halpern. Optimizing scrip systems: crashes, altruists, hoarders, sybils and collusion. Distributed Computing, 25(5):335–357, 2012.
[29] O. Kupferman and T. Tamir. Hierarchical network formation games. In Proc. 23rd TACAS, pages 229–246, 2017.
[30] U. Larsson and J. Wästlund. Endgames in bidding chess. Games of No Chance 5, 70, 2018.
[31] A. J. Lazarus, D. E. Loeb, J. G. Propp, W. R. Stromquist, and D. H. Ullman. Combinatorial games under auction play. Games and Economic Behavior, 27(2):229–264, 1999.
[32] A. J. Lazarus, D. E. Loeb, J. G. Propp, and D. Ullman. Richman games. Games of No Chance, 29:439–449, 1996.
[33] R. Paes Leme, V. Syrgkanis, and É. Tardos. Sequential auctions and externalities. In Proc. 23rd SODA, pages 869–886, 2012.
[34] R. Meir, G. Kalai, and M. Tennenholtz. Bidding games and efficient allocations. Games and Economic Behavior, 2018.
[35] J. Mertens and A. Neyman. Stochastic games. International Journal of Game Theory, 10(2):53–66, 1981.
[36] F. Mogavero, A. Murano, G. Perelli, and M. Y. Vardi. Reasoning about strategies: On the model-checking problem. ACM Trans. Comput. Log., 15(4):34:1–34:47, 2014.
[37] S. Muthukrishnan. Ad exchanges: Research issues. In Proc. 5th WINE, pages 1–12, 2009.
[38] N. Nisan, T. Roughgarden, E. Tardos, and V. Vazirani. Algorithmic Game Theory. Cambridge University Press, 2007.
[39] Y. Peres, O. Schramm, S. Sheffield, and D. Bruce Wilson. Random-turn hex and other selection games. The American Mathematical Monthly, 114(5):373–387, 2007.
[40] Y. Peres and Z. Sunic. Biased infinity Laplacian boundary problem on finite graphs. CoRR, abs/1912.13394, 2019. https://arxiv.org/abs/1912.13394.
[41] A. Pnueli and R. Rosner. On the synthesis of a reactive module. In Proc. 16th POPL, pages 179–190, 1989.
[42] M. L. Puterman. Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons, Inc., New York, NY, USA, 2005.
[43] M. O. Rabin. Decidability of second order theories and automata on infinite trees. Transactions of the AMS, 141:1–35, 1969.
[44] G. E. Rodriguez. Sequential auctions with multi-unit demands. The BE Journal of Theoretical Economics, 9(1), 2009.
[45] R. Weber. Multiple-object auctions. 113, 09 1981.