Stackelberg-Pareto Synthesis (Full Version)
Véronique Bruyère
Université de Mons (UMONS), Belgium
Jean-François Raskin
Université libre de Bruxelles (ULB), Belgium
Clément Tamines
Université de Mons (UMONS), Belgium
Abstract
In this paper, we study the framework of two-player Stackelberg games played on graphs in which Player 0 announces a strategy and Player 1 responds rationally with a strategy that is an optimal response. While it is usually assumed that Player 1 has a single objective, we consider here the new setting where he has several. In this context, after responding with his strategy, Player 1 gets a payoff in the form of a vector of Booleans corresponding to his satisfied objectives. Rationality of Player 1 is encoded by the fact that his response must produce a Pareto-optimal payoff given the strategy of Player 0. We study the Stackelberg-Pareto Synthesis problem which asks whether Player 0 can announce a strategy which satisfies his objective, whatever the rational response of Player 1. For games in which objectives are either all parity or all reachability objectives, we show that this problem is fixed-parameter tractable and NEXPTIME-complete. This problem is already NP-complete in the simple case of reachability objectives and graphs that are trees.
ACM Subject Classification
Software and its engineering → Formal methods; Theory of computation → Logic and verification; Theory of computation → Solution concepts in game theory
Keywords and phrases
Stackelberg non-zero sum games played on graphs, synthesis, parity objectives
Funding
This work is partially supported by the PDR project Subgame perfection in graph games (F.R.S.-FNRS), the ARC project Non-Zero Sum Game Graphs: Applications to Reactive Synthesis and Beyond (Fédération Wallonie-Bruxelles), the EOS project Verifying Learning Artificial Intelligence Systems (F.R.S.-FNRS and FWO), and the COST Action 16228 GAMENET (European Cooperation in Science and Technology).
Two-player zero-sum infinite-duration games played on graphs are a mathematical model used to formalize several important problems in computer science, such as reactive system synthesis. In this context, see e.g. [27], the graph represents the possible interactions between the system and the environment in which it operates. One player models the system to synthesize, and the other player models the (uncontrollable) environment. In this classical setting, the objectives of the two players are opposite, that is, the environment is adversarial.
Modelling the environment as fully adversarial is usually a bold abstraction of reality as it can be composed of one or several components, each of them having their own objective. In this paper, we consider the framework of Stackelberg games [32], a richer non-zero-sum setting, in which Player 0 (the system) called leader announces his strategy and then Player 1 (the environment) called follower plays rationally by using a strategy that is an optimal response to the leader's strategy. The goal of the leader is therefore to announce a strategy that guarantees him a payoff at least equal to some given threshold. In the specific case of Boolean objectives, the leader wants to see his objective being satisfied. The concept of leader and follower is also present in the framework of rational synthesis [19, 25] with the difference that this framework considers several followers, each of them with their own Boolean objective. In that case, rationality of the followers is modeled by assuming that the environment settles to an equilibrium (e.g. a Nash equilibrium) where each component (composing the environment) is considered to be an independent selfish individual, excluding cooperation scenarios between components or the possibility of coordinated rational multiple deviations.
Our work proposes a novel and natural alternative in which the single follower, modeling the environment, has several objectives that he wants to satisfy. After responding to the leader with his own strategy, Player 1 receives a vector of Booleans which is his payoff in the corresponding outcome. Rationality of Player 1 is encoded by the fact that he only responds in such a way to receive Pareto-optimal payoffs, given the strategy announced by the leader. This setting encompasses scenarios where, for instance, several components can collaborate and agree on trade-offs. The goal of the leader is therefore to announce a strategy that guarantees him to satisfy his own objective, whatever the response of the follower which ensures him a Pareto-optimal payoff. The problem of deciding whether the leader has such a strategy is called the Stackelberg-Pareto Synthesis problem (SPS problem).
Contributions.
In addition to the definition of the new setting, our main contributions are the following ones. We consider the general class of ω-regular objectives modelled by parity conditions and also consider the case of reachability objectives for their simplicity. We provide a thorough analysis of the complexity of solving the SPS problem for both objectives. Our results are interesting and singular both from a theoretical and practical point of view.
First, we show that the SPS problem is fixed-parameter tractable (FPT) for reachability objectives when the number of objectives of the follower is a parameter (Theorem 3) and for parity objectives when, in addition, the maximal priority used in each priority function is also a parameter of the complexity analysis (Theorem 4). These are important results as it is expected that, in practice, the number of objectives of the environment is limited to a few. To obtain these results, we develop a reduction from our non-zero-sum games to a zero-sum game in which the protagonist, called Prover, tries to show the existence of a solution to the problem, while the antagonist, called Challenger, tries to disprove it. We were unable to obtain FPT results with the classical tree automata techniques that are usually used in the literature for related problems, see e.g. [13, 29].
Second, we prove that the SPS problem is NEXPTIME-complete for both reachability and parity objectives (Theorems 9, 17 and 18), and that it is already NP-complete in the simple setting of reachability objectives and graphs that are trees (Theorem 15). To the best of our knowledge, this is the first NEXPTIME-completeness result for a natural class of games played on graphs. To obtain the hardness for NEXPTIME, we present a natural succinct version of the set cover problem that is complete for this class (Theorem 20), a result of potential independent interest. We then show how to reduce this problem to the SPS problem. Concerning the NEXPTIME-membership of the SPS problem, unfortunately, the zero-sum game used for our FPT results cannot be used directly. To obtain this result, we have shown that exponential size solutions exist for positive instances of the SPS problem and this allows us to design a nondeterministic exponential-time algorithm.
Related Work.
Rational synthesis is introduced in [19] for ω-regular objectives in a setting where the followers are cooperative with the leader, and later in [25] where they are adversarial. Precise complexity results for various ω-regular objectives are established in [13] for both settings. Those complexities differ from the ones of the problem studied in this paper. Indeed, for reachability objectives, adversarial rational synthesis is PSPACE-complete, while for parity objectives, its precise complexity is not settled (the problem is PSPACE-hard and in NEXPTIME). Extension to non-Boolean payoffs, like mean-payoff or discounted sum, is studied in [21, 22] in the cooperative setting and in [1, 18] in the adversarial setting.
When several players (like the followers) play with the aim to satisfy their objectives, several solution concepts exist such as Nash equilibrium [26], subgame perfect equilibrium [28], secure equilibria [11, 12], or admissibility [2, 5]. The constrained existence problem, close to the cooperative rational synthesis problem, is to decide whether there exists a solution concept such that the payoff obtained by each player is larger than some threshold. Let us mention [13, 30, 31] for results on the constrained existence for Nash equilibria and [6, 7, 29] for such results for subgame perfect equilibria. The interested reader can find more pointers to works on non-zero-sum games for reactive synthesis in [4, 8].
(Footnote, on the simplicity of reachability objectives: in the classical context of two-player zero-sum games, solving reachability games is in P whereas solving parity games is only known to be in NP ∩ co-NP, see e.g. [20].)
Structure.
The paper is structured as follows. In Section 2, we introduce the class of Stackelberg-Pareto games and the SPS problem. We prove in Section 3 that the SPS problem is in FPT for reachability and parity objectives. The complexity class of this problem is studied in Section 4 where we prove that it is NEXPTIME-complete and NP-complete in case of reachability objectives and graphs that are trees. In Section 5, we provide a conclusion and discuss future work.
This section introduces the class of two-player Stackelberg-Pareto games in which the first player has a single objective and the second has several. We present a decision problem on those games called the Stackelberg-Pareto Synthesis problem, which we study in this paper.
Game Arena. A game arena is a tuple $G = (V, V_0, V_1, E, v_0)$ where $(V, E)$ is a finite directed graph such that: (i) $V$ is the set of vertices and $(V_0, V_1)$ forms a partition of $V$ where $V_0$ (resp. $V_1$) is the set of vertices controlled by Player 0 (resp. Player 1), (ii) $E \subseteq V \times V$ is the set of edges such that each vertex $v$ has at least one successor $v'$, i.e., $(v, v') \in E$, and (iii) $v_0 \in V$ is the initial vertex. We call a game arena a tree arena if it is a tree in which every leaf vertex has itself as its only successor. A sub-arena $G'$ with a set $V' \subseteq V$ of vertices and initial vertex $v'_0 \in V'$ is a game arena defined from $G$ as expected.
Plays. A play in a game arena $G$ is an infinite sequence of vertices $\rho = v_0 v_1 \ldots \in V^\omega$ such that it starts with the initial vertex $v_0$ and $(v_j, v_{j+1}) \in E$ for all $j \in \mathbb{N}$. Histories in $G$ are finite sequences $h = v_0 \ldots v_j \in V^+$ defined similarly. A history is elementary if it contains no cycles. We denote by $\mathsf{Plays}_G$ the set of plays in $G$. We write $\mathsf{Hist}_G$ (resp. $\mathsf{Hist}_{G,i}$) for the set of histories (resp. histories ending with a vertex in $V_i$). We use the notations $\mathsf{Plays}$, $\mathsf{Hist}$, and $\mathsf{Hist}_i$ when $G$ is clear from the context. We write $\mathsf{Occ}(\rho)$ for the set of vertices occurring in $\rho$ and $\mathsf{Inf}(\rho)$ for the set of vertices occurring infinitely often in $\rho$.
Strategies. A strategy $\sigma_i$ for Player $i$ is a function $\sigma_i : \mathsf{Hist}_i \to V$ assigning to each history $hv \in \mathsf{Hist}_i$ a vertex $v' = \sigma_i(hv)$ such that $(v, v') \in E$. It is memoryless if $\sigma_i(hv) = \sigma_i(h'v)$ for all histories $hv, h'v$ ending with the same vertex $v \in V_i$. More generally, it is finite-memory if it can be encoded by a Moore machine $\mathcal{M}$ [20]. The memory size of $\sigma_i$ is the number of memory states of $\mathcal{M}$. In particular, $\sigma_i$ is memoryless when it has a memory size of one.
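The definitions of arenas, histories and memoryless strategies above can be illustrated by the following minimal Python sketch. It is not part of the paper's constructions: the class name `Arena`, its field names, and the helpers `is_history` and `outcome` are hypothetical, vertices are assumed to be sortable values such as integers, and memoryless strategies are assumed to be given as dictionaries from vertices to successors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arena:
    """Game arena G = (V, V0, V1, E, v0); all names here are illustrative."""
    v0_vertices: frozenset  # vertices controlled by Player 0
    v1_vertices: frozenset  # vertices controlled by Player 1
    edges: frozenset        # set of pairs (v, v')
    initial: object         # initial vertex v0

    def successors(self, v):
        return sorted(w for (u, w) in self.edges if u == v)

    def is_valid(self):
        """(V0, V1) is a partition of V and every vertex has a successor."""
        vertices = self.v0_vertices | self.v1_vertices
        return (not (self.v0_vertices & self.v1_vertices)
                and self.initial in vertices
                and all(u in vertices and w in vertices for (u, w) in self.edges)
                and all(self.successors(v) for v in vertices))


def is_history(arena, h):
    """A history is a non-empty finite path starting in the initial vertex."""
    return (len(h) > 0 and h[0] == arena.initial
            and all((h[j], h[j + 1]) in arena.edges for j in range(len(h) - 1)))


def outcome(arena, sigma0, sigma1):
    """Outcome of a profile of memoryless strategies, returned as a lasso
    (prefix, cycle) that represents the infinite play prefix . cycle^omega."""
    seen, path, v = {}, [], arena.initial
    while v not in seen:
        seen[v] = len(path)
        path.append(v)
        v = sigma0[v] if v in arena.v0_vertices else sigma1[v]
    k = seen[v]
    return path[:k], path[k:]
```

With memoryless strategies the outcome is always ultimately periodic, which is why a lasso suffices in this sketch; strategies with memory would require tracking the memory state as well.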
Given a strategy $\sigma_i$ of Player $i$, a play $\rho = v_0 v_1 \ldots$ is consistent with $\sigma_i$ if $v_{j+1} = \sigma_i(v_0 \ldots v_j)$ for all $j \in \mathbb{N}$ such that $v_j \in V_i$. Consistency is naturally extended to histories. We denote by $\mathsf{Plays}_{\sigma_i}$ (resp. $\mathsf{Hist}_{\sigma_i}$) the set of plays (resp. histories) consistent with $\sigma_i$. A strategy profile is a tuple $\sigma = (\sigma_0, \sigma_1)$ of strategies, one for each player. We write $\mathsf{out}(\sigma)$ for the unique play consistent with both strategies and we call it the outcome of $\sigma$.
Objectives. An objective for Player $i$ is a set of plays $\Omega \subseteq \mathsf{Plays}$. A play $\rho$ satisfies the objective $\Omega$ if $\rho \in \Omega$. In this paper, we focus on the two following ω-regular objectives. Let $T \subseteq V$ be a subset of vertices called a target set, the reachability objective $\mathsf{Reach}(T) = \{\rho \in \mathsf{Plays} \mid \mathsf{Occ}(\rho) \cap T \neq \emptyset\}$ asks to visit at least one vertex of $T$. Let $c : V \to \mathbb{N}$ be a function called a priority function which assigns an integer to each vertex in the arena, the parity objective $\mathsf{Parity}(c) = \{\rho \in \mathsf{Plays} \mid \min_{v \in \mathsf{Inf}(\rho)} c(v) \text{ is even}\}$ asks that the minimum priority visited infinitely often be even.
Stackelberg-Pareto Games. A Stackelberg-Pareto game (SP game) $\mathcal{G} = (G, \Omega_0, \Omega_1, \ldots, \Omega_t)$ is composed of a game arena $G$, an objective $\Omega_0$ for Player 0 and $t \geq 1$ objectives $\Omega_1, \ldots, \Omega_t$ for Player 1. In this paper, we focus on SP games where the objectives are either all reachability or all parity objectives and call such games reachability (resp. parity) SP games.
Payoffs in SP Games.
The payoff of a play $\rho \in \mathsf{Plays}$ corresponds to the vector of Booleans $\mathsf{pay}(\rho) \in \{0,1\}^t$ such that for all $i \in \{1, \ldots, t\}$, $\mathsf{pay}_i(\rho) = 1$ if $\rho \in \Omega_i$, and $\mathsf{pay}_i(\rho) = 0$ otherwise. Note that we do not include Player 0 when discussing the payoff of a play. Instead we say that a play $\rho$ is won by Player 0 if $\rho \in \Omega_0$ and we write $\mathsf{won}(\rho) = 1$, otherwise it is lost by Player 0 and we write $\mathsf{won}(\rho) = 0$. We write $(\mathsf{won}(\rho), \mathsf{pay}(\rho))$ for the extended payoff of $\rho$. Given a strategy profile $\sigma$, we write $\mathsf{won}(\sigma) = \mathsf{won}(\mathsf{out}(\sigma))$ and $\mathsf{pay}(\sigma) = \mathsf{pay}(\mathsf{out}(\sigma))$. For reachability SP games, since reachability objectives are prefix-dependent, we also define $\mathsf{won}(h)$ and $\mathsf{pay}(h)$ for a history $h \in \mathsf{Hist}$ as done for plays.
We introduce the following partial order on payoffs. Given two payoffs $p = (p_1, \ldots, p_t)$ and $p' = (p'_1, \ldots, p'_t)$ such that $p, p' \in \{0,1\}^t$, we say that $p'$ is larger than $p$ and write $p \leq p'$ if $p_i \leq p'_i$ for all $i \in \{1, \ldots, t\}$. Moreover, when it also holds that $p_i < p'_i$ for some $i$, we say that $p'$ is strictly larger than $p$ and we write $p < p'$. A subset of payoffs $P \subseteq \{0,1\}^t$ is an antichain if it is composed of pairwise incomparable payoffs with respect to $\leq$.
Stackelberg-Pareto Synthesis Problem.
Given a strategy $\sigma_0$ of Player 0, we consider the set of payoffs of plays consistent with $\sigma_0$ which are Pareto-optimal, i.e., maximal with respect to $\leq$. We write this set $P_{\sigma_0} = \max\{\mathsf{pay}(\rho) \mid \rho \in \mathsf{Plays}_{\sigma_0}\}$. Notice that it is an antichain. We say that those payoffs are $\sigma_0$-fixed Pareto-optimal and write $|P_{\sigma_0}|$ for the number of such payoffs. A play $\rho \in \mathsf{Plays}_{\sigma_0}$ is called $\sigma_0$-fixed Pareto-optimal if its payoff $\mathsf{pay}(\rho)$ is in $P_{\sigma_0}$.
The problem studied in this paper asks whether there exists a strategy $\sigma_0$ for Player 0 such that every play in $\mathsf{Plays}_{\sigma_0}$ which is $\sigma_0$-fixed Pareto-optimal satisfies the objective of Player 0. This corresponds to the assumption that given a strategy of Player 0, Player 1 will play rationally, that is, with a strategy $\sigma_1$ such that $\mathsf{out}((\sigma_0, \sigma_1))$ is $\sigma_0$-fixed Pareto-optimal. It is therefore sound to ask that Player 0 wins against such rational strategies.
▶ Definition 1.
Given an SP game, the Stackelberg-Pareto Synthesis problem (SPS problem) is to decide whether there exists a strategy $\sigma_0$ for Player 0 (called a solution) such that for each strategy profile $\sigma = (\sigma_0, \sigma_1)$ with $\mathsf{pay}(\sigma) \in P_{\sigma_0}$, it holds that $\mathsf{won}(\sigma) = 1$.
Figure 1. A reachability SP game.
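The notions of payoff, Pareto-optimality and solution defined above can be sketched in Python as follows. This is an illustrative sketch only: plays are assumed to be ultimately periodic and given as a lasso `(prefix, cycle)`, and the set of extended payoffs of the plays consistent with a fixed $\sigma_0$ is assumed to be already available as a finite collection (computing it from an arena is not shown). All function names are hypothetical.

```python
def reach(targets):
    """Reach(T) evaluated on a lasso (prefix, cycle)."""
    return lambda prefix, cycle: any(v in targets for v in prefix + cycle)


def parity(c):
    """Parity(c): the minimal priority seen infinitely often, i.e. on the
    cycle of the lasso, must be even. c maps vertices to priorities."""
    return lambda prefix, cycle: min(c[v] for v in cycle) % 2 == 0


def extended_payoff(play, omega0, omegas):
    """Return (won, pay) for a lasso play, Player 0's objective omega0 and
    the list omegas of the t objectives of Player 1."""
    prefix, cycle = play
    won = int(omega0(prefix, cycle))
    pay = tuple(int(om(prefix, cycle)) for om in omegas)
    return won, pay


def leq(p, q):
    """Componentwise order p <= q on payoffs."""
    return all(a <= b for a, b in zip(p, q))


def pareto_optimal(payoffs):
    """Maximal elements with respect to <=; the result is an antichain."""
    ps = set(payoffs)
    return {p for p in ps if not any(p != q and leq(p, q) for q in ps)}


def is_solution_on(extended_payoffs):
    """Condition of Definition 1, checked on the finite collection of
    extended payoffs (won, pay) realized by plays consistent with sigma_0."""
    ext = list(extended_payoffs)
    p_sigma = pareto_optimal(pay for _, pay in ext)
    return all(won == 1 for won, pay in ext if pay in p_sigma)
```

Working on extended payoffs rather than on plays is a design choice of the sketch: although there may be infinitely many consistent plays, there are at most $2^{t+1}$ extended payoffs.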
Witnesses.
Given a strategy $\sigma_0$ that is a solution to the SPS problem and any payoff $p \in P_{\sigma_0}$, for each play $\rho$ consistent with $\sigma_0$ such that $\mathsf{pay}(\rho) = p$ it holds that $\mathsf{won}(\rho) = 1$. For each $p \in P_{\sigma_0}$, we arbitrarily select such a play which we call a witness (of $p$). We denote by $\mathsf{Wit}_{\sigma_0}$ the set of all witnesses, of which there are as many as payoffs in $P_{\sigma_0}$. In the sequel, it is useful to see this set as a tree composed of $|\mathsf{Wit}_{\sigma_0}|$ branches. Additionally, for a given history $h \in \mathsf{Hist}$, we write $\mathsf{Wit}_{\sigma_0}(h)$ for the set of witnesses for which $h$ is a prefix, i.e., $\mathsf{Wit}_{\sigma_0}(h) = \{\rho \in \mathsf{Wit}_{\sigma_0} \mid h \text{ is prefix of } \rho\}$. Notice that $\mathsf{Wit}_{\sigma_0}(h) = \mathsf{Wit}_{\sigma_0}$ when $h = v_0$ and that $\mathsf{Wit}_{\sigma_0}(h)$ decreases as $h$ increases, until it contains a single value or becomes empty.
▶ Example 2.
Consider the reachability SP game with arena $G$ depicted in Figure 1 in which Player 1 has $t = 3$ objectives. The vertices of Player 0 (resp. Player 1) are depicted as circles (resp. squares); this convention is used throughout this paper. Every objective in the game is a reachability objective defined as follows: Ω_0 = Reach({ v , v }), Ω_1 = Reach({ v , v }), Ω_2 = Reach({ v }), Ω_3 = Reach({ v , v }). The extended payoff of plays reaching vertices from which they can only loop is displayed in the arena next to those vertices, and the extended payoff of play v v ( v v )^ω is (0 , (0 , , )). Consider the strategy $\sigma_0$ of Player 0 such that he chooses to always move to v from v . The set of payoffs of plays consistent with $\sigma_0$ is { (0 , , ) , (0 , , ) , (1 , , ) , (0 , , ) } and the set of those that are Pareto-optimal is $P_{\sigma_0}$ = { (1 , , ) , (0 , , ) }. Notice that play ρ = v v ( v )^ω is consistent with $\sigma_0$, has payoff (1 , , 0) and is lost by Player 0. Strategy $\sigma_0$ is therefore not a solution to the SPS problem. In this game, there is only one other memoryless strategy for Player 0, where he chooses to always move to v from v . One can verify that it is again not a solution to the SPS problem.
We can however define a finite-memory strategy $\sigma'_0$ such that $\sigma'_0$( v v v ) = v and $\sigma'_0$( v v v v v ) = v and show that it is a solution to the problem. Indeed, the set of $\sigma'_0$-fixed Pareto-optimal payoffs is $P_{\sigma'_0}$ = { (0 , , ) , (1 , , ) } and Player 0 wins every play consistent with $\sigma'_0$ whose payoff is in this set. A set $\mathsf{Wit}_{\sigma'_0}$ of witnesses for these payoffs is { v v v v v^ω , v v v v v v^ω } and is in this case the unique set of witnesses. This example shows that Player 0 sometimes needs memory in order to have a solution to the SPS problem.
In this section, we show that the SPS problem is in FPT for both cases of reachability and parity SP games. The details of our proof for each type of objective are provided separately in their own subsection. We refer the reader to [15] for the concept of fixed-parameter complexity.
▶ Theorem 3. Solving the SPS problem is in FPT for reachability SP games for parameter $t$ equal to the number of objectives of Player 1.
▶ Theorem 4. Solving the SPS problem is in FPT for parity SP games for parameters $t$ and the maximal priority according to each parity objective of Player 1.
In order to prove Theorem 3 and Theorem 4, we provide a reduction to a specific two-player zero-sum game, called the Challenger-Prover game (C-P game). This game is a zero-sum game played between Challenger (written $\mathcal{C}$) and Prover (written $\mathcal{P}$). We will show that Player 0 has a solution to the SPS problem in an SP game if and only if $\mathcal{P}$ has a winning strategy in the corresponding C-P game. In the latter game, $\mathcal{P}$ tries to show the existence of a strategy $\sigma_0$ that is solution to the SPS problem in the original game and $\mathcal{C}$ tries to disprove it. The C-P game is described independently of the objectives used in the SP game and its objective is described as such in a generic way. We later provide the proof of our FPT results by adapting it specifically for reachability and parity SP games.
Intuition on the C-P Game.
Without loss of generality, the SP games we consider in this section are such that each vertex in their arena has at most two successors. It can be shown (see Appendix A) that any SP game $\mathcal{G}$ with $n$ vertices can be transformed into an SP game $\bar{\mathcal{G}}$ with $O(n^2)$ vertices such that every vertex has at most two successors and Player 0 has a solution to the SPS problem in $\mathcal{G}$ if and only if he has a solution to the SPS problem in $\bar{\mathcal{G}}$.
Let $\mathcal{G}$ be an SP game. The C-P game $\mathcal{G}'$ is a zero-sum game associated with $\mathcal{G}$ that intuitively works as follows. First, $\mathcal{P}$ selects a set $P$ of payoffs which he announces as the set of Pareto-optimal payoffs $P_{\sigma_0}$ for the solution $\sigma_0$ to the SPS problem in $\mathcal{G}$ he is trying to construct. Then, $\mathcal{P}$ tries to show that there exists a set of witnesses $\mathsf{Wit}_{\sigma_0}$ in $G$ for the payoffs in $P$. After the selection of $P$ in $G'$, there is a one-to-one correspondence between plays in the arenas $G$ and $G'$ such that the vertices in $G'$ are augmented with a set $W$ which is a subset of $P$. Initially $W$ is equal to $P$ and after some history in $G'$, $W$ contains payoff $p$ if the corresponding history in $G$ is prefix of the witness with payoff $p$ in the set $\mathsf{Wit}_{\sigma_0}$ that $\mathcal{P}$ is building. In addition, the objective $\Omega_{\mathcal{P}}$ of $\mathcal{P}$ is such that he has a winning strategy $\sigma_{\mathcal{P}}$ in $\mathcal{G}'$ if and only if the set $P$ that he selected coincides with the set $P_{\sigma_0}$ for the corresponding strategy $\sigma_0$ in $\mathcal{G}$ and the latter strategy is a solution to the SPS problem in $\mathcal{G}$. A part of the arena of the C-P game for Example 2 with a highlighted winning strategy for $\mathcal{P}$ is illustrated in Figure 2.
Arena of the C-P Game.
The initial vertex $\bot$ belongs to $\mathcal{P}$. From this vertex, he selects a successor $(v_0, P, W)$ such that $W = P$ and $P$ is an antichain of payoffs which $\mathcal{P}$ announces as the set $P_{\sigma_0}$ for the strategy $\sigma_0$ in $G$ he is trying to construct. All vertices in plays starting with this vertex will have this same value for their $P$-component. Those vertices are either a triplet $(v, P, W)$ that belongs to $\mathcal{P}$ or $(v, P, (W_l, W_r))$ that belongs to $\mathcal{C}$. Given a play $\rho$ (resp. history $h$) in $G'$, we denote by $\rho_V$ (resp. $h_V$) the play (resp. history) in $G$ obtained by removing $\bot$ and keeping the $v$-component of every vertex of $\mathcal{P}$ in $\rho$ (resp. $h$), which we call its projection. (We suppose the reader familiar with the concept of zero-sum games, see e.g. [20].)
Figure 2. A part of the C-P game for Example 2, where $P$ consists of two payoffs.
• After history $hm$ such that $m = (v, P, W)$ with $v \in V_0$, $\mathcal{P}$ selects a successor $v'$ such that $(v, v') \in E$ and vertex $(v', P, W)$ is added to the play. This corresponds to Player 0 choosing a successor $v'$ after history $h_V v$ in $G$.
• After history $hm$ such that $m = (v, P, W)$ with $v \in V_1$, $\mathcal{P}$ selects a successor $(v, P, (W_l, W_r))$ with $(W_l, W_r)$ a partition of $W$. This corresponds to $\mathcal{P}$ splitting the set $W$ into two parts according to the two successors $v_l$ and $v_r$ of $v$. For the strategy $\sigma_0$ that $\mathcal{P}$ tries to construct and its set of witnesses $\mathsf{Wit}_{\sigma_0}$ he is building, he asserts that $W_l$ (resp. $W_r$) is the set of payoffs of the witnesses in $\mathsf{Wit}_{\sigma_0}(h_V v_l)$ (resp. $\mathsf{Wit}_{\sigma_0}(h_V v_r)$).
• From a vertex $(v, P, (W_l, W_r))$, $\mathcal{C}$ can select a successor $(v_l, P, W_l)$ or $(v_r, P, W_r)$ which corresponds to the choice of Player 1.
Formally, the game arena of the C-P game is the tuple $G' = (V', V'_{\mathcal{P}}, V'_{\mathcal{C}}, E', \bot)$ with
• $V'_{\mathcal{P}} = \{\bot\} \cup \{(v, P, W) \mid v \in V,\ P \subseteq \{0,1\}^t \text{ is an antichain and } W \subseteq P\}$,
• $V'_{\mathcal{C}} = \{(v, P, (W_l, W_r)) \mid v \in V_1,\ P \subseteq \{0,1\}^t \text{ is an antichain and } W_l, W_r \subseteq P\}$,
• $(\bot, (v, P, W)) \in E'$ if $v = v_0$ and $P = W$,
• $((v, P, W), (v', P, W)) \in E'$ if $v \in V_0$ and $(v, v') \in E$,
• $((v, P, W), (v, P, (W_l, W_r))) \in E'$ if $v \in V_1$ and $(W_l, W_r)$ is a partition of $W$,
• $((v, P, (W_l, W_r)), (v', P, W)) \in E'$ if ($v' = v_l$ and $W = W_l$) or ($v' = v_r$ and $W = W_r$).
In the definition of $E'$, if $v$ has a single successor $v'$ in $G$, it is assumed to be $v_l$ and $W_r$ is always equal to $\emptyset$. We use as a convention that given the two successors $v_i$ and $v_j$ of vertex $v$, $v_i$ is the left successor if $i < j$.
Objective of $\mathcal{P}$ in the C-P Game. Let us now discuss the objective $\Omega_{\mathcal{P}}$ of $\mathcal{P}$. The $W$-component of the vertices controlled by $\mathcal{P}$ has a size that decreases along a play $\rho$ in $G'$. We write $\lim_W(\rho)$ for the value of the $W$-component at the limit in $\rho$.
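The edge relation $E'$ of the C-P arena defined above can be sketched as a successor function in Python. This is only an illustration under stated assumptions: a Prover vertex is encoded as a triple `(v, P, W)` with `W` a `frozenset` of payoffs, a Challenger vertex as `(v, P, (Wl, Wr))` with a pair of `frozenset`s, `succ[v]` lists the one or two successors of `v` in $G$ with the left successor first, and the candidate antichains are passed in explicitly. The names `BOTTOM`, `partitions_in_two` and `successors` are hypothetical.

```python
from itertools import combinations

BOTTOM = "bottom"  # stands for the initial vertex of the C-P arena


def partitions_in_two(w):
    """All ordered pairs (Wl, Wr) with Wl and Wr disjoint and Wl | Wr == w."""
    items = sorted(w)
    for k in range(len(items) + 1):
        for left in combinations(items, k):
            wl = frozenset(left)
            yield wl, frozenset(w) - wl


def successors(m, v0, succ, player0, antichains):
    """Successors of vertex m in the C-P arena.
    v0: initial vertex of G; player0: set of vertices of Player 0;
    antichains: iterable of candidate antichains P (frozensets of payoffs)."""
    if m == BOTTOM:
        # Prover announces an antichain P and starts with W = P.
        return [(v0, P, P) for P in antichains]
    v, P, W = m
    if isinstance(W, tuple):
        # Challenger vertex: pick the left or the right successor of v.
        wl, wr = W
        out = [(succ[v][0], P, wl)]
        if len(succ[v]) == 2:
            out.append((succ[v][1], P, wr))
        return out
    if v in player0:
        # Prover mimics a move of Player 0; W is unchanged.
        return [(v2, P, W) for v2 in succ[v]]
    # Prover splits W between the successors of a Player 1 vertex.
    pairs = list(partitions_in_two(W))
    if len(succ[v]) == 1:
        pairs = [(wl, wr) for wl, wr in pairs if not wr]  # Wr must be empty
    return [(v, P, (wl, wr)) for wl, wr in pairs]
```

Generating successors on demand avoids materializing the whole arena, whose size is doubly exponential in $t$.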
Recall that with this $W$-component, $\mathcal{P}$ tries to construct a solution $\sigma_0$ to the SPS problem with associated sets $P_{\sigma_0}$ and $\mathsf{Wit}_{\sigma_0}$. Therefore, for him to win in the C-P game, $\lim_W(\rho)$ must be a singleton or empty in every play $\rho$ consistent with his strategy, and one of the following must hold:
• $\lim_W(\rho)$ is a singleton $\{p\}$ with $p$ the payoff of $\rho_V$ in $G$, showing that $\rho_V \in \mathsf{Wit}_{\sigma_0}$ is a correct witness for $p$. In addition, it must hold that $\mathsf{won}(\rho_V) = 1$ as $p \in P$ and as $\mathcal{P}$ wants $\sigma_0$ to be a solution.
• $\lim_W(\rho)$ is the empty set and either the payoff of $\rho_V$ belongs to $P_{\sigma_0}$ and $\mathsf{won}(\rho_V) = 1$, or the payoff of $\rho_V$ is strictly smaller than some payoff in $P_{\sigma_0}$.
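The requirements just listed amount to a predicate on the limit value of the $W$-component together with the extended payoff of the projected play. The following Python sketch illustrates it; the function name `prover_wins` and the encoding of payoffs as tuples and of sets as `frozenset`s are assumptions made for the illustration, and the three quantities are assumed to be given rather than computed from a play.

```python
def lt(p, q):
    """Strict componentwise order p < q on payoffs."""
    return p != q and all(a <= b for a, b in zip(p, q))


def prover_wins(P, lim_w, pay, won):
    """P: announced antichain; lim_w: limit of the W-component;
    (won, pay): extended payoff of the projected play in G."""
    # Singleton case: the play is a correct witness and is won by Player 0.
    witness_ok = (len(lim_w) == 1 and pay in lim_w
                  and pay in P and won == 1)
    # Empty case, payoff announced as Pareto-optimal: must be won.
    optimal_ok = (not lim_w) and pay in P and won == 1
    # Empty case, payoff strictly below some announced payoff.
    dominated_ok = (not lim_w) and any(lt(pay, p) for p in P)
    return witness_ok or optimal_ok or dominated_ok
```

For instance, a play with an empty limit whose payoff is incomparable to every element of `P` makes the predicate false, which is how an incorrect announcement of `P` gets rejected.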
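These three conditions verify that the sets $P = P_{\sigma_0}$ and $\mathsf{Wit}_{\sigma_0}$ are correct and that $\sigma_0$ is indeed a solution to the SPS problem in $\mathcal{G}$. They are generic as they do not depend on the actual objectives used in the SP game.
Let us give the formal definition of $\Omega_{\mathcal{P}}$. For an antichain $P$ of payoffs, we write $\mathsf{Plays}^P_{G'}$ for the set of plays in $G'$ which start with $\bot (v_0, P, P)$ and we define the following set
$$
\begin{aligned}
B_P = \big\{ \rho \in \mathsf{Plays}^P_{G'} \mid\ & (\lim\nolimits_W(\rho) = \{p\} \wedge \mathsf{pay}(\rho_V) = p \in P \wedge \mathsf{won}(\rho_V) = 1)\ \vee && (1)\\
& (\lim\nolimits_W(\rho) = \emptyset \wedge \mathsf{pay}(\rho_V) \in P \wedge \mathsf{won}(\rho_V) = 1)\ \vee && (2)\\
& (\lim\nolimits_W(\rho) = \emptyset \wedge \exists p \in P,\ \mathsf{pay}(\rho_V) < p) \big\}. && (3)
\end{aligned}
$$
Objective $\Omega_{\mathcal{P}}$ of $\mathcal{P}$ in $\mathcal{G}'$ is the union of $B_P$ over all antichains $P$. As the C-P game is zero-sum, objective $\Omega_{\mathcal{C}}$ equals $\mathsf{Plays}_{G'} \setminus \Omega_{\mathcal{P}}$. The following theorem holds.
▶ Theorem 5.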
Player 0 has a strategy $\sigma_0$ that is solution to the SPS problem in $\mathcal{G}$ if and only if $\mathcal{P}$ has a winning strategy $\sigma_{\mathcal{P}}$ from $\bot$ in the C-P game $\mathcal{G}'$.
Proof of Theorem 5.
Let us first assume that Player 0 has a strategy $\sigma_0$ that is solution to the SPS problem in $\mathcal{G}$. Let $P_{\sigma_0}$ be its set of $\sigma_0$-fixed Pareto-optimal payoffs and let $\mathsf{Wit}_{\sigma_0}$ be a set of witnesses. We construct the strategy $\sigma_{\mathcal{P}}$ from $\sigma_0$ such that
• $\sigma_{\mathcal{P}}(\bot) = (v_0, P, P)$ such that $P = P_{\sigma_0}$ (this vertex exists as $P_{\sigma_0}$ is an antichain),
• $\sigma_{\mathcal{P}}(hm) = (v', P, W)$ if $m = (v, P, W)$ with $v \in V_0$ and $v' = \sigma_0(h_V v)$,
• $\sigma_{\mathcal{P}}(hm) = (v, P, (W_l, W_r))$ if $m = (v, P, W)$ with $v \in V_1$ and for $i \in \{l, r\}$, $W_i = \{\mathsf{pay}(\rho) \mid \rho \in \mathsf{Wit}_{\sigma_0}(h_V v_i)\}$.
It is clear that given a play $\rho$ in $G'$ consistent with $\sigma_{\mathcal{P}}$, the play $\rho_V$ in $G$ is consistent with $\sigma_0$. Let us show that $\sigma_{\mathcal{P}}$ is winning for $\mathcal{P}$ from $\bot$ in $\mathcal{G}'$. Consider a play $\rho$ in $G'$ consistent with $\sigma_{\mathcal{P}}$. There are two possibilities. (i) $\rho_V$ is a witness of $\mathsf{Wit}_{\sigma_0}$ and by construction $\lim_W(\rho) = \{p\}$ with $p = \mathsf{pay}(\rho_V)$; thus $\mathsf{won}(\rho_V) = 1$ as $\sigma_0$ is a solution and $\rho_V$ is a witness. (ii) $\rho_V$ is not a witness and by construction $\lim_W(\rho) = \emptyset$; as $\sigma_0$ is a solution, then $p = \mathsf{pay}(\rho_V)$ is bounded by some payoff of $P_{\sigma_0}$ and in case of equality $\mathsf{won}(\rho_V) = 1$. Therefore $\rho$ satisfies the objective $B_P$ of $\Omega_{\mathcal{P}}$ since it satisfies condition (1) in case (i) and condition (2) or (3) in case (ii).
Let us now assume that $\mathcal{P}$ has a winning strategy $\sigma_{\mathcal{P}}$ from $\bot$ in $\mathcal{G}'$. Let $P$ be the antichain of payoffs chosen from $\bot$ by this strategy. We construct the strategy $\sigma_0$ from $\sigma_{\mathcal{P}}$ such that $\sigma_0(h_V v) = v'$ given $\sigma_{\mathcal{P}}(hm) = (v', P, W)$ with $m = (v, P, W)$ and $v \in V_0$. Notice that this definition makes sense since there is a unique history $hm$ ending with a vertex of $\mathcal{P}$ associated with $h_V v$, showing a one-to-one correspondence between those histories.
Let us show that $\sigma_0$ is a solution to the SPS problem with $P_{\sigma_0}$ being the set $P$. First notice that $P$ is not empty. Indeed let $\rho$ be a play consistent with $\sigma_{\mathcal{P}}$. As $\rho$ belongs to $\Omega_{\mathcal{P}}$ and in particular to $B_P$, one can check that $P \neq \emptyset$ by inspecting conditions (1) to (3).
Second notice that by definition of $E'$, if $((v, P, W), (v, P, (W_l, W_r))) \in E'$ with $W \neq \emptyset$, then either $W_l$ or $W_r$ is not empty. Therefore given any payoff $p \in P$, there is a unique play $\rho$ consistent with $\sigma_{\mathcal{P}}$ such that $\lim_W(\rho) = \{p\}$. By construction of $\sigma_0$ and as $\sigma_{\mathcal{P}}$ is winning, the play $\rho_V$ is consistent with $\sigma_0$, has payoff $p$, and is won by Player 0 (see (1)).
Let $\rho_V$ be a play consistent with $\sigma_0$ and $\rho$ be the corresponding play consistent with $\sigma_{\mathcal{P}}$. It remains to consider (2) and (3). These conditions indicate that $\rho_V$ has a payoff equal to or strictly smaller than a payoff in $P$ and that in case of equality $\mathsf{won}(\rho_V) = 1$. This shows that $P_{\sigma_0} = P$ and that $\sigma_0$ is a solution to the SPS problem. ◀
We now develop the proof of Theorem 3 which works by specializing the generic objective $\Omega_{\mathcal{P}}$ to handle reachability SP games. We extend the arena $G'$ of the C-P game such that its vertices keep track of the objectives of $\mathcal{G}$ which are satisfied along a play. Given an extended payoff $(w, p) \in \{0,1\} \times \{0,1\}^t$ and a vertex $v \in V$, we define the payoff update $\mathsf{upd}(w, p, v) = (w', p')$ such that
$$w' = 1 \iff w = 1 \text{ or } v \in T_0, \qquad p'_i = 1 \iff p_i = 1 \text{ or } v \in T_i, \quad \forall i \in \{1, \ldots, t\}.$$
We obtain the extended arena $G^*$ as follows: (i) its set of vertices is $V' \times \{0,1\} \times \{0,1\}^t$, (ii) its initial vertex is $\bot^* = (\bot, 0, (0, \ldots, 0))$, and (iii) $((m, w, p), (m', w', p'))$ with $m' = (v', P, W)$ or $m' = (v', P, (W_l, W_r))$ is an edge in $G^*$ if $(m, m') \in E'$ and $(w', p') = \mathsf{upd}(w, p, v')$.
We define the zero-sum game $\mathcal{G}^* = (G^*, \Omega^*_{\mathcal{P}})$ in which the three abstract conditions (1-3) detailed previously are encoded into the following Büchi objective by using the $(w, p)$-component added to vertices. We define $\Omega^*_{\mathcal{P}} = \mathsf{Büchi}(B^*)$ with
$$
\begin{aligned}
B^* = \big\{ (v, P, W, w, p) \in V^*_{\mathcal{P}} \mid\ & (W = \{p\} \wedge w = 1)\ \vee && (1')\\
& (W = \emptyset \wedge p \in P \wedge w = 1)\ \vee && (2')\\
& (W = \emptyset \wedge \exists p' \in P,\ p < p') \big\}. && (3')
\end{aligned}
$$
▶ Proposition 6.
Player 0 has a strategy $\sigma_0$ that is solution to the SPS problem in a reachability SP game $\mathcal{G}$ if and only if $\mathcal{P}$ has a winning strategy $\sigma^*_{\mathcal{P}}$ in $\mathcal{G}^*$.
The proof of this proposition is a consequence of Theorem 5. Using the one-to-one correspondence between plays in $G$ and plays in $G^*$ and the fact that $\mathcal{G}$ is a reachability SP game, the $(w, p)$-component in vertices of $G^*$ allows us to easily retrieve the extended payoff of a play in $G$. Indeed, in a play $\rho \in \mathsf{Plays}_{G^*}$, given the construction of $G^*$ and the payoff update function, it holds that from some point on the $W$- and $(w, p)$-components are constant. Therefore it holds that $w = \mathsf{won}(\rho_V)$, $p = \mathsf{pay}(\rho_V)$ and $W = \lim_W(\rho)$ for that play $\rho$. Moreover the $P$-component is constant along a play in $G^*$. It is direct to see that the plays $\rho$ in $G^*$ which visit infinitely often the set $B^*$, and therefore satisfy the Büchi objective $\Omega^*_{\mathcal{P}} = \mathsf{Büchi}(B^*)$, satisfy one of the three conditions (1-3) stated in Subsection 3.1. The converse is also true.
We now describe an FPT algorithm for deciding the existence of a solution to the SPS problem in a reachability SP game, thus proving Theorem 3.
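The payoff update and the membership test for the Büchi set $B^*$ used in the extended arena can be sketched in Python as follows. The sketch assumes that the reachability objectives are given by a target set `target0` for Player 0 and a list `targets` of $t$ target sets for Player 1; the names `upd`, `run_upd` and `in_b_star` are hypothetical, and `run_upd` simply folds the update along the vertices of a history starting from the all-zero extended payoff.

```python
def upd(w, p, v, target0, targets):
    """Payoff update: record which target sets contain the vertex v."""
    w2 = 1 if (w == 1 or v in target0) else 0
    p2 = tuple(1 if (pi == 1 or v in ti) else 0
               for pi, ti in zip(p, targets))
    return w2, p2


def run_upd(history, target0, targets):
    """Extended payoff (won, pay) accumulated along a history of G."""
    w, p = 0, tuple(0 for _ in targets)
    for v in history:
        w, p = upd(w, p, v, target0, targets)
    return w, p


def lt(p, q):
    """Strict componentwise order p < q on payoffs."""
    return p != q and all(a <= b for a, b in zip(p, q))


def in_b_star(P, W, w, p):
    """Membership of a Prover vertex (v, P, W, w, p) in the Büchi set."""
    return ((W == frozenset({p}) and w == 1)
            or (not W and p in P and w == 1)
            or (not W and any(lt(p, q) for q in P)))
```

Because the update is monotone, the `(w, p)`-component can change at most $t + 1$ times along a play, which is why it eventually stabilizes.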
Proof of Theorem 3. We describe the following FPT algorithm (for parameter $t$) for deciding the existence of a solution to the SPS problem in a reachability SP game $\mathcal{G}$ by using Proposition 6. First, we construct the zero-sum game $\mathcal{G}^*$. Its number $n$ of vertices is upper-bounded by $1 + |V| \cdot 2^{2^{t+1}} \cdot 2^{t+1} + |V| \cdot 2^{3 \cdot 2^t} \cdot 2^{t+1}$. Indeed, except the initial vertex, vertices are of the form either $(v, P, W, w, p)$ or $(v, P, (W_l, W_r), w, p)$ such that $P$, $W$, $W_l$ and $W_r$ are antichains of payoffs in $\{0,1\}^t$, and $(w, p)$ is an extended payoff. The construction of $\mathcal{G}^*$ is thus in FPT for parameter $t$. Second, by Proposition 6, deciding whether there exists a solution to the SPS problem in $\mathcal{G}$ amounts to deciding if $\mathcal{P}$ has a winning strategy from $\bot^*$ in $\mathcal{G}^*$. Since the objective $\Omega^*_{\mathcal{P}}$ of $\mathcal{P}$ in $\mathcal{G}^*$ is a Büchi objective, this game can be solved in $O(n^2)$ [10]. It follows that $\mathcal{G}^*$ can be solved in FPT for parameter $t$. ◀
We now turn to parity SP games and explain why solving the SPS problem in these games is in FPT, again by reduction to the C-P game. To this end, we first recall the notion of Boolean Büchi games.
Boolean Büchi games are zero-sum games which we use in our reduction to the C-P game for parity SP games. Given $m$ sets $T_1, \ldots, T_m$ such that $T_i \subseteq V$, $i \in \{1, \ldots, m\}$, and $\phi$ a Boolean formula over the set of variables $X = \{x_1, \ldots, x_m\}$, the Boolean Büchi objective $\mathsf{BooleanBüchi}(\phi, T_1, \ldots, T_m) = \{\rho \in \mathsf{Plays} \mid \rho \text{ satisfies } (\phi, T_1, \ldots, T_m)\}$ is the set of plays whose valuation of the variables in $X$ satisfies formula $\phi$. Given a play $\rho$, its valuation is such that $x_i = 1$ if and only if $\mathsf{Inf}(\rho) \cap T_i \neq \emptyset$ and $x_i = 0$ otherwise. That is, a play satisfies the objective if the Boolean formula describing sets to be visited infinitely often by a play is satisfied. We denote by $|\phi|$ the size of $\phi$, equal to the number of conjunctions and disjunctions in $\phi$. The following theorem on the fixed-parameter complexity of Boolean Büchi games is proved in [9].
▶ Theorem 7. Solving Boolean Büchi games is in FPT, with an algorithm in $O(2^M \cdot |\phi| + (M^M \cdot |V|)^5)$ time with $M = 2^m$ such that $m$ is the number of variables and $|\phi|$ is the size of $\phi$ in the Boolean Büchi objective [9].
Let $\mathcal{G} = (G, \Omega_0, \ldots, \Omega_t)$ be a parity SP game with parity objectives such that $\Omega_i = \mathsf{Parity}(c_i)$ for a priority function $c_i : V \to \mathbb{N}$. Let $G'$ be the arena of the C-P game. In the following, we construct a Boolean Büchi objective $\Omega'_{\mathcal{P}}$ for $\mathcal{P}$ such that the following proposition holds.
▶ Proposition 8.
Player 0 has a strategy $\sigma_0$ that is a solution to the SPS problem in $\mathcal{G}$ if and only if $\mathcal{P}$ has a winning strategy $\sigma_{\mathcal{P}}$ in $\mathcal{G}^* = (G', \Omega'_{\mathcal{P}})$.

Proof of Proposition 8. Let $G' = (V', V'_{\mathcal{P}}, V'_{\mathcal{C}}, E', \bot)$ be the arena of the C-P game presented in Section 3.1. Recall that the objective $\Omega_{\mathcal{P}}$ of this game is the union of the sets $B_P$ over all antichains $P$ such that $B_P$ is the disjunction of conditions (1-3). The idea of the proof is to translate this objective into a Boolean Büchi objective $\Omega'_{\mathcal{P}}$. We proceed step by step.

The required Boolean formula for defining $\Omega'_{\mathcal{P}}$ is equal to
$$\varphi = \bigvee_{P} \big( x_P \wedge (cond^P_1 \vee cond^P_2 \vee cond^P_3) \big)$$
such that the main disjunction is over all antichains $P$. The variable $x_P$ corresponds to the set $T_P = \{(v, P, W) \in V'_{\mathcal{P}}\}$. The valuation of $x_P$ is true for a given play if and only if the set $T_P$ is visited infinitely often and therefore $P$ is the antichain chosen by $\mathcal{P}$ in $G'$. Since the $P$-component is constant along a play, only one $x_P$ is valued as true for a given play. Let us now detail each subformula $cond^P_i$ that is the translation of condition ($i$) for $i \in \{1, 2, 3\}$.

Let us begin with the encoding of payoffs. Let $d_0, \ldots, d_t$ be such that $d_i$ is the maximal even priority appearing in $G$ according to priority function $c_i$ for objective $\Omega_i$ with $i \in \{0, \ldots, t\}$. First, we show that a parity objective $\mathsf{Parity}(c_i)$ from $\mathcal{G}$ can be encoded as a Boolean Büchi objective. Given the parity objective $\mathsf{Parity}(c_i)$, we construct the Boolean formula $parity_i$ over variables $\{x^i_0, x^i_1, \ldots, x^i_{d_i}\}$ such that
$$parity_i = x^i_0 \vee (x^i_2 \wedge \neg x^i_1) \vee \cdots \vee (x^i_{d_i} \wedge \neg x^i_{d_i - 1} \wedge \neg x^i_{d_i - 3} \wedge \cdots \wedge \neg x^i_1)$$
and for $j \in \{0, \ldots, d_i\}$, the set corresponding to variable $x^i_j$ is $T^i_j = \{(v, P, W) \in V'_{\mathcal{P}} \mid c_i(v) = j\}$. It is easy to show that the parity objective $\mathsf{Parity}(c_i)$ is satisfied if and only if the Boolean Büchi objective $\mathsf{BooleanBüchi}(parity_i, T^i_0, \ldots, T^i_{d_i})$ is satisfied.

Second, given a payoff $p = (p_1, \ldots, p_t)$ in $\mathcal{G}$, we consider the Boolean formula
$$payoff_p = C_1 \wedge \cdots \wedge C_t$$
such that $C_i = parity_i$ if $p_i = 1$ and $C_i = \neg parity_i$ otherwise. Clearly the projection $\rho_V$ of play $\rho$ realizes payoff $p$ if and only if $\rho$ satisfies the Boolean Büchi objective $\mathsf{BooleanBüchi}(payoff_p, T^1_0, \ldots, T^1_{d_1}, \ldots, T^t_0, \ldots, T^t_{d_t})$.

We now fix some antichain $P$. Let us detail subformula $cond^P_1$ encoding condition (1). For a payoff $p$, we define formula $single_p = x_p \wedge payoff_p \wedge parity_0$ such that $x_p$ corresponds to the set $\{(v, P, W) \in V'_{\mathcal{P}} \mid W = \{p\}\}$. Since at some point during a play the $W$-component stabilizes, and since a play $\rho$ which satisfies $payoff_p \wedge parity_0$ is such that $\mathsf{pay}(\rho_V) = p$ and $\mathsf{won}(\rho_V) = 1$, satisfying this formula corresponds exactly to satisfying condition (1) for some $p$. Formula $cond^P_1$ is thus the disjunction
$$cond^P_1 = \bigvee_{p \in P} single_p.$$

Let us shift to subformula $cond^P_2$ encoding condition (2). Using similar arguments, we define for payoff $p$ the formula $empty_p = x_\emptyset \wedge payoff_p \wedge parity_0$ such that $x_\emptyset$ corresponds to the set $\{(v, P, \emptyset) \in V'_{\mathcal{P}}\}$. It corresponds exactly to the set of plays $\rho$ such that $\lim_W(\rho) = \emptyset$, $\mathsf{pay}(\rho_V) = p$ and $\mathsf{won}(\rho_V) = 1$. Therefore
$$cond^P_2 = \bigvee_{p \in P} empty_p.$$

Finally, we define subformula $cond^P_3$ encoding condition (3). Let $P^{<}$ be the set containing every payoff $p'$ such that $\exists p \in P,\ p' < p$. We define
$$cond^P_3 = \bigvee_{p' \in P^{<}} smaller_{p'}$$
with $smaller_{p'}$ being the formula $x_\emptyset \wedge payoff_{p'}$.

Notice that the Boolean formula $\varphi$ constructed in this proof has a number $m$ of variables and a size $|\varphi|$ that only depend on $t$ and $d_i$, $i \in \{0, \ldots, t\}$. ◀

The previous construction can be used to provide the proof of Theorem 4.
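The encoding of a parity objective by the formula $parity_i$ can be sanity-checked on small cases. The following is only a minimal sketch: it assumes the min-parity convention (the smallest priority seen infinitely often is even), which is the reading consistent with the formula above, and the function names `parity_holds` and `parity_formula` are hypothetical helpers, not notation from the paper.

```python
def parity_holds(inf_priorities):
    """Assumed min-parity condition: the smallest priority seen
    infinitely often is even."""
    return min(inf_priorities) % 2 == 0


def parity_formula(inf_priorities, d):
    """Evaluate x_0 or (x_2 and not x_1) or ... or
    (x_d and not x_{d-1} and not x_{d-3} and ... and not x_1),
    where x_j is true iff priority j is seen infinitely often
    and d is the maximal even priority."""
    def x(j):
        return j in inf_priorities

    return any(
        x(j) and not any(x(odd) for odd in range(1, j, 2))
        for j in range(0, d + 1, 2)
    )
```

For instance, with `d = 4`, the set `{2, 3}` of priorities seen infinitely often satisfies both the parity condition and the formula, whereas `{1, 2}` satisfies neither.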
Proof of Theorem 4. We describe the following FPT algorithm for deciding the existence of a solution to the SPS problem in a parity SP game $\mathcal{G}$ by using Proposition 8. First, we construct the zero-sum game $\mathcal{G}^*$ of Proposition 8. Its number $n$ of vertices is upper-bounded by $1 + |V| \cdot 2^{2^{t+1}} + |V| \cdot 2^{3 \cdot 2^{t}}$, which is in $O(|V| \cdot f(t))$ with $f$ a computable function which only depends on $t$. Moreover, the number $m$ of variables and the size $|\varphi|$ of the Boolean formula $\varphi$ defining the Boolean Büchi objective of $\mathcal{G}^*$ depend only on parameters $t$ and $d_i$ for $i \in \{0, \ldots, t\}$. Therefore the construction of $\mathcal{G}^*$ is in FPT for these parameters. Deciding whether there exists a solution to the SPS problem in $\mathcal{G}$ amounts to deciding if $\mathcal{P}$ has a winning strategy from $\bot$ in $\mathcal{G}^*$. By Theorem 7, the latter Boolean Büchi game can be solved with an algorithm in $O(2^M \cdot |\varphi| + (M^M \cdot n)^5)$ time with $M = 2^m$. It follows that $\mathcal{G}^*$ can be solved in $O(2^M \cdot |\varphi| + (M^M \cdot |V| \cdot f(t))^5)$, which is in FPT for the announced parameters. ◀

In this section, we study the complexity class of the SPS problem and prove its NEXPTIME-completeness for both reachability and parity SP games.
We first show the membership to NEXPTIME of the SPS problem by providing a nondeterministic algorithm with time exponential in the size of the game $\mathcal{G}$. By size, we mean the number $|V|$ of its vertices and the number $t$ of objectives of Player 1.

▶ Theorem 9. The SPS problem is in NEXPTIME for reachability and parity SP games.

We show that the SPS problem is in NEXPTIME by proving that if Player 0 has a strategy which is a solution to the problem, then he has one which is finite-memory with at most an exponential number of memory states. This yields a NEXPTIME algorithm in which we nondeterministically guess such a strategy and check in exponential time that it is indeed a solution to the problem.

▶ Proposition 10.
Let $\mathcal{G}$ be a reachability SP game or a parity SP game. Let $\sigma_0$ be a solution to the SPS problem. Then there exists another solution $\tilde{\sigma}_0$ that is finite-memory and has a memory size exponential in the size of $\mathcal{G}$.

While the proof of Proposition 10 requires some specific arguments to treat both reachability and parity objectives, it is based on the following common principles.
- We start from a winning strategy $\sigma_0$ for the SPS problem and the objectives $\Omega_0, \Omega_1, \ldots, \Omega_t$ and consider a set of witnesses $\mathsf{Wit}_{\sigma_0}$ that contains one play for each element of the set $P_{\sigma_0}$ of $\sigma_0$-fixed Pareto-optimal payoffs.
- We start by showing the existence of a strategy $\hat{\sigma}_0$ constructed from $\sigma_0$, in which Player 0 follows $\sigma_0$ as long as the current consistent history is a prefix of at least one witness in $\mathsf{Wit}_{\sigma_0}$. Then, when a deviation from $\mathsf{Wit}_{\sigma_0}$ occurs, Player 0 switches to a punishing strategy. A deviation is a history that leaves the set of witnesses $\mathsf{Wit}_{\sigma_0}$ after a move of Player 1 (this is not possible by a move of Player 0). After such a deviation, $\hat{\sigma}_0$ systematically imposes that the consistent play either satisfies $\Omega_0$ or is not $\sigma_0$-fixed Pareto-optimal, i.e., it gives to Player 1 a payoff that is strictly smaller than the payoff of a witness in $\mathsf{Wit}_{\sigma_0}$. This makes the deviation irrational for Player 1. We show that this can be done, both for reachability and parity objectives, with at most exponentially many different punishing strategies, each having a size bounded exponentially in the size of the game. The strategy $\hat{\sigma}_0$ that we obtain is therefore composed of the part of $\sigma_0$ that produces $\mathsf{Wit}_{\sigma_0}$ and a punishment part whose size is at most exponential.
- Then, we show how to decompose each witness in $\mathsf{Wit}_{\sigma_0}$ into at most exponentially many sections that can, in turn, be compacted into finite elementary paths or lasso-shaped paths of polynomial length. As $\mathsf{Wit}_{\sigma_0}$ contains exactly $|P_{\sigma_0}|$ witnesses $\rho$, those compact witnesses $c\rho$ can be produced by a finite-memory strategy with an exponential size for both reachability and parity objectives. This allows us to construct a strategy $\tilde{\sigma}_0$ that produces the compact witnesses and acts as $\hat{\sigma}_0$ after any deviation. This strategy is a solution of the SPS problem and has an exponential size as announced.

We now develop the details of the construction of the strategies $\hat{\sigma}_0$ and $\tilde{\sigma}_0$. Figure 3 illustrates this construction. It is done in several steps to finally get the proof of Proposition 10. For the rest of this section, we fix an SP game $\mathcal{G}$ with objectives $\Omega_0, \Omega_1, \ldots, \Omega_t$, a strategy $\sigma_0$ that is a solution to the SPS problem, and a set of witnesses $\mathsf{Wit}_{\sigma_0}$ for the $\sigma_0$-fixed Pareto-optimal payoffs in $P_{\sigma_0}$, and we write $\Omega_{<}(P_{\sigma_0})$ for the set of plays whose payoff is strictly smaller than some payoff in $P_{\sigma_0}$.

Recall that to have a solution to the SPS problem, memory is sometimes necessary, as shown in Example 2.

Figure 3 The creation of strategies $\hat{\sigma}_0$ and $\tilde{\sigma}_0$ from a solution $\sigma_0$ with $\mathsf{Wit}_{\sigma_0} = \{\rho_1, \rho_2, \rho_3, \rho_4\}$.

Deviations and Punishing Strategies.
First, we define the set of deviations $\mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$ as follows:
$$\mathsf{Dev}(\mathsf{Wit}_{\sigma_0}) = \{hv \in \mathsf{Hist}_{\sigma_0} \mid \mathsf{Wit}_{\sigma_0}(h) \neq \emptyset \wedge \mathsf{Wit}_{\sigma_0}(hv) = \emptyset\}.$$
As explained above, a deviation is a history that leaves the set of witnesses $\mathsf{Wit}_{\sigma_0}$ (by a move of Player 1).

Second, we establish the existence of canonical forms for punishing strategies. We potentially need an exponential number of them for reachability objectives and a polynomial number of them for parity objectives. In both cases, each punishing strategy has a size which can be bounded exponentially. The existence of those strategies is a direct consequence of the two following lemmas.

▶ Lemma 11 (Parity). Let $v \in V$ be such that there exists $hv \in \mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$. Then there exists a finite-memory strategy $\sigma^{\mathsf{Pun}}_{v}$ such that for all deviations $hv \in \mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$, when Player 0 plays $\sigma^{\mathsf{Pun}}_{v}$ from $hv$, all consistent plays $\rho$ starting in $v$ are such that either $h\rho \in \Omega_0$ or $h\rho \in \Omega_{<}(P_{\sigma_0})$. The size of $\sigma^{\mathsf{Pun}}_{v}$ is at most exponential in the size of $\mathcal{G}$.

Proof of Lemma 11.
First, we note that, after a deviation $hv \in \mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$, if Player 0 continues to play the strategy $\sigma_0$ from $hv$, then all consistent plays $\rho$ are such that either $\rho \in \Omega_0$ or $\rho \in \Omega_{<}(P_{\sigma_0})$, as $\sigma_0$ is a solution to the SPS problem. Therefore, we know that Player 0 has a punishing strategy for all such deviations $hv$. Second, as parity objectives are prefix-independent, he can use one uniform strategy that only depends on $v$ (and not on $hv$). There exists such a strategy with finite memory that can be constructed as follows. We express the objective $\Omega_0 \cup \Omega_{<}(P_{\sigma_0})$ as an explicit Muller objective [23] for a zero-sum game played on the arena $G$ from initial vertex $v$. This objective is defined by the set $\{C \subseteq V \mid \exists \rho \text{ such that } \mathsf{Inf}(\rho) = C \wedge \rho \in \Omega_0 \cup \Omega_{<}(P_{\sigma_0})\}$. This exactly encodes the objective of Player 0 when he plays the punishing strategy after a deviation $hv$. It is well-known that in zero-sum explicit Muller games, there always exist finite-memory winning strategies with a size exponential in the number $|V|$ of vertices of the arena [16]. ◀

▶ Lemma 12 (Reachability). Let $v \in V$ and $(w, p) \in \{0,1\} \times \{0,1\}^t$ be such that there exists $hv \in \mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$ with $(\mathsf{won}(hv), \mathsf{pay}(hv)) = (w, p)$. Then there exists a finite-memory strategy $\sigma^{\mathsf{Pun}}_{(v,w,p)}$ such that for all deviations $hv \in \mathsf{Dev}(\mathsf{Wit}_{\sigma_0})$ with $(\mathsf{won}(hv), \mathsf{pay}(hv)) = (w, p)$, when Player 0 plays $\sigma^{\mathsf{Pun}}_{(v,w,p)}$ from $hv$, all consistent plays $\rho$ starting in $v$ are such that either $h\rho \in \Omega_0$ or $h\rho \in \Omega_{<}(P_{\sigma_0})$. The size of $\sigma^{\mathsf{Pun}}_{(v,w,p)}$ is at most exponential in the size of $\mathcal{G}$.

Proof of Lemma 12.
We follow the same reasoning as in the proof of Lemma 11, except that reachability objectives are not prefix-independent. We thus need to take into account the set of objectives $\Omega_i$ already satisfied along the history $hv$, which is recorded in $(w, p)$. The uniform finite-memory strategy $\sigma^{\mathsf{Pun}}_{(v,w,p)}$ that Player 0 can use from all deviations $hv$ such that $\mathsf{won}(hv) = w$ and $\mathsf{pay}(hv) = p$ is constructed as follows. First, notice that if $w = 1$, meaning that objective $\Omega_0$ is already satisfied, then Player 0 can use any memoryless strategy as punishing strategy. Second, if $w = 0$, as done in Subsection 3.2, we consider the extension of $G$ such that its vertices are of the form $(v', w', p')$ where the $(w', p')$-component keeps track of the objectives that have been satisfied so far and such that its initial vertex is equal to $(v, w, p)$. On this extended arena, we consider the zero-sum game with the objective $\Omega_0 \cup \Omega_{<}(P_{\sigma_0})$ encoded as the disjunction of a reachability objective ($\Omega_0$) and a safety objective ($\Omega_{<}(P_{\sigma_0})$). More precisely, in the extended game, Player 0 has the objective either to reach a vertex in the set $\{(v', w', p') \mid w' = 1\}$ or to stay forever within the set of vertices $\{(v', w', p') \mid \exists p'' \in P_{\sigma_0} : p' < p''\}$. It is known, see e.g. [9], that there always exist memoryless winning strategies for zero-sum games with an objective which is the disjunction of a reachability objective and a safety objective. Therefore, this is the case here for the extended game. In the original game, this yields a winning finite-memory strategy of exponential size. ◀

If we systematically change within $\sigma_0$ the behavior of Player 0 after a deviation from $\mathsf{Wit}_{\sigma_0}$, and use the punishing strategies as defined in the proofs of Lemmas 11 and 12, we obtain a new strategy $\hat{\sigma}_0$ that is a solution to the SPS problem. The total size of the punishing finite-memory strategies in $\hat{\sigma}_0$ is at most exponential in the size of $\mathcal{G}$. To obtain our results, it remains to show how to compact the plays in $\mathsf{Wit}_{\sigma_0}$. To that end, we study the histories and plays within $\mathsf{Wit}_{\sigma_0}$.

Compacting Witnesses.
We now show how to compact the set of witnesses so that they can be produced by a finite-memory strategy. Together with the punishing strategies, this will lead to a solution $\tilde{\sigma}_0$ to the SPS problem with a memory of exponential size. We first consider reachability objectives and explain later how to modify the construction for parity objectives.

Given a history $h$ that is a prefix of at least one witness in $\mathsf{Wit}_{\sigma_0}$, we call region, and denote by $\mathsf{Reg}(h)$, the tuple $\mathsf{Reg}(h) = (\mathsf{won}(h), \mathsf{pay}(h), \mathsf{Wit}_{\sigma_0}(h))$. We also use notation $R = (w, p, W)$ for a region. Given a witness $\rho = v_0 v_1 \ldots \in \mathsf{Wit}_{\sigma_0}$, we consider $\rho^* = (v_0, R_0)(v_1, R_1) \ldots$ such that each $v_j$ is extended with the region $R_j = (w_j, p_j, W_j) = \mathsf{Reg}(v_0 v_1 \ldots v_j)$. Similarly we define $h^*$ associated with any history $h$ that is a prefix of a witness. The following properties hold for a witness $\rho$ and its corresponding play $\rho^*$:
- for all $j \geq 0$, we have $w_j \leq w_{j+1}$, $p_j \leq p_{j+1}$, and $W_j \supseteq W_{j+1}$,
- the sequence $(w_j, p_j)_{j \geq 0}$ eventually stabilizes on $(w, p)$ equal to the extended payoff $(\mathsf{won}(\rho), \mathsf{pay}(\rho))$ of $\rho$,
- the sequence $(W_j)_{j \geq 0}$ eventually stabilizes on a set $W$ which is a singleton such that $W = \{\rho\}$.

Thanks to the previous properties, each $\rho \in \mathsf{Wit}_{\sigma_0}$ can be region decomposed into a sequence of paths $\pi[1]\pi[2] \cdots \pi[k]$ where the corresponding decomposition $\pi^*[1]\pi^*[2] \cdots \pi^*[k]$ of $\rho^*$ is such that for each $\ell$: (i) the region is constant along the path $\pi^*[\ell]$ and (ii) it is distinct from the region of the next path $\pi^*[\ell + 1]$ (if $\ell < k$). Each $\pi[\ell]$ is called a section of $\rho$, such that it is internal (resp. terminal) if $\ell < k$ (resp. $\ell = k$).

Notice that the number of regions that are traversed by $\rho$ is bounded by
$$(t + 2) \cdot |\mathsf{Wit}_{\sigma_0}|. \quad (4)$$
Indeed along $\rho$, the first two components $(w, p)$ of a region correspond to a monotonically increasing vector of $t + 1$ Boolean values (from $(0, (0, \ldots, 0))$ to $(1, (1, \ldots, 1))$ in the worst case), and the last component $W$ is a monotonically decreasing set of witnesses (from $\mathsf{Wit}_{\sigma_0}$ to $\{\rho\}$ in the worst case). So the number of regions traversed by a witness is bounded exponentially in the size of the game $\mathcal{G}$.

We have the following important properties for the sections of the witnesses of $\mathsf{Wit}_{\sigma_0}$.
- Let $\rho, \rho' \in \mathsf{Wit}_{\sigma_0}$, with region decompositions $\rho = \pi[1] \cdots \pi[k]$ and $\rho' = \pi'[1] \cdots \pi'[k']$, and let $h$ be the longest common prefix of $\rho$ and $\rho'$. Then there exists $k_1 < k, k'$ such that $h = \pi[1] \cdots \pi[k_1]$, $\pi[\ell] = \pi'[\ell]$ for all $\ell \in \{1, \ldots, k_1\}$ and $\pi[k_1 + 1] \neq \pi'[k_1 + 1]$. Therefore, when $\mathsf{Wit}_{\sigma_0}$ is seen as a tree, the branching structure of this tree is respected by the sections.
- Let $R = (w, p, W)$ be a region and consider the set of all histories $h$ such that $\mathsf{Reg}(h) = R$. Then all these histories are prefixes of each other and are prefixes of exactly $|W|$ witnesses (as $\mathsf{Wit}_{\sigma_0}(h) = W$ for each such $h$). Therefore, the branching structure of $\mathsf{Wit}_{\sigma_0}$ is respected by the sections such that the associated regions are all pairwise distinct. The latter property is called the region-tree structure of $\mathsf{Wit}_{\sigma_0}$.

We consider a compact version $c\mathsf{Wit}_{\sigma_0}$ of $\mathsf{Wit}_{\sigma_0}$ defined as follows:
- each internal section $\pi$ of $\mathsf{Wit}_{\sigma_0}$ is replaced by the elementary path $c\pi$ obtained by eliminating all the cycles of $\pi$. Each terminal section $\pi$ of $\mathsf{Wit}_{\sigma_0}$ is replaced by a lasso $c\pi = \pi'_1 (u \pi'_2)^\omega$ such that $u$ is a vertex, $\pi'_1 u \pi'_2$ is an elementary path, and $\pi'_1 u \pi'_2 u$ is a prefix of $\pi$.
- each witness $\rho$ of $\mathsf{Wit}_{\sigma_0}$ with region decomposition $\rho = \pi[1] \cdots \pi[k]$ is replaced by $c\rho = c\pi[1] \cdots c\pi[k]$ such that each $\pi[\ell]$ is replaced by $c\pi[\ell]$. Notice that as the region is constant inside the sections, the region decomposition of $c\rho$ coincides with the sequence of its $c\pi[\ell]$, $\ell \in \{1, \ldots, k\}$.

Therefore, by construction of the compact witnesses, the region-tree structure of $\mathsf{Wit}_{\sigma_0}$ is kept by the set $\{c\rho \mid \rho \in \mathsf{Wit}_{\sigma_0}\}$ and for each $c\rho \in c\mathsf{Wit}_{\sigma_0}$,
$$(\mathsf{won}(c\rho), \mathsf{pay}(c\rho)) = (\mathsf{won}(\rho), \mathsf{pay}(\rho)). \quad (5)$$

We then construct the announced strategy $\tilde{\sigma}_0$ that produces the set $c\mathsf{Wit}_{\sigma_0}$ of compact witnesses and after any deviation acts with the adequate punishing strategy (as mentioned in Lemma 12). More precisely, let $gv$ be such that $g$ is a prefix of a compact witness and $gv$ is not (Player 1 deviates from $c\mathsf{Wit}_{\sigma_0}$). Then by definition of the compact witnesses, there exists a deviation $hv$ such that $(\mathsf{won}(gv), \mathsf{pay}(gv)) = (\mathsf{won}(hv), \mathsf{pay}(hv)) = (w, p)$. Then from $gv$ Player 0 switches to the punishing strategy $\sigma^{\mathsf{Pun}}_{(v,w,p)}$.

▶ Lemma 13.
The strategy $\tilde{\sigma}_0$ is a solution to the SPS problem for reachability SP games and its size is bounded exponentially in the size of the game $\mathcal{G}$.

Proof of Lemma 13. Let us first prove that $\tilde{\sigma}_0$ is a solution to the SPS problem. (i) By (5), the set of extended payoffs of plays in $c\mathsf{Wit}_{\sigma_0}$ is equal to the set of extended payoffs of witnesses in $\mathsf{Wit}_{\sigma_0}$. This means that with $c\mathsf{Wit}_{\sigma_0}$, we keep the same set $P_{\sigma_0}$ and the objective $\Omega_0$ is satisfied along each compact witness. (ii) The punishing strategies used by $\tilde{\sigma}_0$ guarantee the satisfaction of the objective $\Omega_0 \cup \Omega_{<}(P_{\sigma_0})$ by Lemma 12. Therefore $\tilde{\sigma}_0$ is a solution to the SPS problem.

Let us now show that the memory size of $\tilde{\sigma}_0$ is bounded exponentially in the size of $\mathcal{G}$. (i) By Lemma 12, each punishing strategy used by $\tilde{\sigma}_0$ is of exponential size and the number of punishing strategies is exponential. (ii) To produce the compact witnesses, $\tilde{\sigma}_0$ keeps in memory the current region and produces in a memoryless way the corresponding compact section (which is an elementary path or a lasso). Thus the required memory size for producing $c\mathsf{Wit}_{\sigma_0}$ is the number of regions. By (4), every play in $c\mathsf{Wit}_{\sigma_0}$ traverses at most an exponential number of regions and there is an exponential number of such plays (equal to $|P_{\sigma_0}|$). ◀

We now switch to parity SP games and state the following lemma, whose proof follows the same line of arguments as those given for reachability objectives.

▶ Lemma 14. The strategy $\tilde{\sigma}_0$ is a solution to the SPS problem for parity SP games and its size is bounded exponentially in the size of the game $\mathcal{G}$.

Proof of Lemma 14. We highlight here the main differences from reachability SP games. As parity objectives are prefix-independent, we associate to each history $h$ of a play $\rho \in \mathsf{Wit}_{\sigma_0}$ a singleton $\mathsf{Reg}(h) = \mathsf{Wit}_{\sigma_0}(h)$ instead of the triplet $(\mathsf{won}(h), \mathsf{pay}(h), \mathsf{Wit}_{\sigma_0}(h))$ as in the case of reachability. This is because $(\mathsf{won}(h), \mathsf{pay}(h))$ does not make sense for prefix-independent objectives.

For the definition of the compact witnesses, we proceed as for reachability by removing cycles inside each section, with the exception of terminal sections. Given the terminal section $\pi[k]$ of a witness $\rho \in \mathsf{Wit}_{\sigma_0}$, we replace it by a lasso $c\pi[k] = \pi'_1 (\pi'_2)^\omega$ such that $c\pi[k]$ and $\pi[k]$ start at the same vertex, $\mathsf{Occ}(c\pi[k]) = \mathsf{Occ}(\pi[k])$, $\mathsf{Inf}(c\pi[k]) = \mathsf{Inf}(\pi[k])$, and $|\pi'_1 \pi'_2|$ is quadratic in $|V|$ [3, Proposition 3.1]. Therefore, by construction, the objectives $\Omega_i$ satisfied by a witness $\rho$ are exactly the same as for its corresponding compact play $c\rho$.

We then construct the strategy $\tilde{\sigma}_0$ that produces the set $c\mathsf{Wit}_{\sigma_0}$ of compact witnesses and after any deviation $gv$ from $c\mathsf{Wit}_{\sigma_0}$ acts with the adequate punishing strategy $\sigma^{\mathsf{Pun}}_{v}$ (as mentioned in Lemma 11). ◀

Lemmas 13 and 14 lead to Proposition 10. Using this proposition, we are now able to prove our result on the NEXPTIME-membership of reachability and parity SP games.
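The compaction of an internal section into an elementary path, used above to build the compact witnesses, can be illustrated as follows. This is only a minimal sketch: it represents a finite path as a Python list of vertices, and the name `eliminate_cycles` is a hypothetical helper that is not taken from the paper.

```python
def eliminate_cycles(path):
    """Return an elementary path (no repeated vertex) with the same first
    and last vertex as `path`, obtained by cutting out every cycle.
    Consecutive vertices of the result are consecutive somewhere in `path`,
    so the result is still a path of the same graph."""
    result = []
    position = {}  # vertex -> its index in `result`
    for v in path:
        if v in position:
            # v is revisited: drop the cycle closed by this repetition.
            cut = position[v]
            for u in result[cut + 1:]:
                del position[u]
            del result[cut + 1:]
        else:
            position[v] = len(result)
            result.append(v)
    return result
```

For example, `eliminate_cycles(["a", "b", "c", "b", "d"])` returns `["a", "b", "d"]`. Since an elementary path visits each vertex at most once, its length is bounded by the number $|V|$ of vertices, which is the polynomial bound used for internal sections.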
Proof of Theorem 9. We have established the existence of solutions to the SPS problem that use a finite memory bounded exponentially, both for reachability (Lemma 13) and for parity (Lemma 14) SP games. Let $\sigma_0$ be such a solution. As it is finite-memory, we can guess it as a Moore machine $\mathcal{M}$ with a set of memory states at most exponential in the size of $\mathcal{G}$.

Let us explain how to verify that the guessed strategy $\sigma_0$ is a solution to the SPS problem for parity objectives, i.e., every play in $\mathsf{Plays}_{\sigma_0}$ which is $\sigma_0$-fixed Pareto-optimal satisfies the objective $\Omega_0$ of Player 0. First, we construct the Cartesian product $G \times \mathcal{M}$ of the arena $G$ with the Moore machine $\mathcal{M}$, which is a graph whose infinite paths (starting from the initial vertex $v_0$ and the initial memory state) are exactly the plays consistent with $\sigma_0$. Second, to compute $P_{\sigma_0}$, we test for the existence of a play $\rho$ in $G \times \mathcal{M}$ with a given payoff $p = \mathsf{pay}(\rho)$, beginning with the largest possible payoff $p = (1, \ldots, 1)$ and finishing with the smallest possible one $p = (0, \ldots, 0)$. Each such test can be performed in polynomial time in the size of $G \times \mathcal{M}$ [17]. Third, to check that each Pareto-optimal play in $\mathsf{Plays}_{\sigma_0}$ satisfies $\Omega_0$, we test for each $p \in P_{\sigma_0}$ whether there exists a play that satisfies the objectives $\Omega_i$ such that $p_i = 1$ as well as the objective $\mathsf{Plays}_G \setminus \Omega_0$. As the complement of a parity objective is again a parity objective, we use again the polynomial algorithm of [17]. As a consequence we have a NEXPTIME algorithm for parity SP games.

The case of reachability SP games is solved similarly with the following two differences. Concerning the second step, the existence of a play in $G \times \mathcal{M}$ that satisfies an intersection of reachability objectives can be checked in polynomial time by first extending this graph with a Boolean vector in $\{0,1\}^t$ keeping track of the objectives of Player 1 already satisfied. Notice that the resulting graph is still of exponential size and that the intersection of reachability objectives becomes a single reachability objective. Concerning the third step, as the complement $\mathsf{Plays}_G \setminus \Omega_0$ of $\Omega_0$ is not a reachability objective, we rather remove the vertices of $G \times \mathcal{M}$ that contain an element of the target set $T_0$ before checking whether there exists a play that satisfies the objectives $\Omega_i$ such that $p_i = 1$ for a given $p \in P_{\sigma_0}$. ◀

Figure 4 The tree arena used in the reduction from the SC problem.

We now turn to the
NEXPTIME-hardness of the SPS problem. The proof of this result uses a reduction from the Succinct Set Cover problem. Before presenting this problem and the proof in the next section, we show that the SPS problem is already NP-complete in the simple setting of reachability objectives and arenas that are trees. We use a reduction from the Set Cover problem (SC problem), which is NP-complete [24].

▶ Theorem 15. The SPS problem is NP-complete for reachability SP games on tree arenas.

Notice that when the game arena is a tree, it is easy to design an algorithm for solving the SPS problem that is in NP. First, we nondeterministically guess a strategy $\sigma_0$ that can be assumed to be memoryless as the arena is a tree. Second, we apply a depth-first search algorithm from the root vertex which accumulates to leaf vertices the extended payoff of plays which are consistent with $\sigma_0$. Finally, we check that $\sigma_0$ is a solution.

Let us explain why the SPS problem is NP-hard on tree arenas by reduction from the SC problem. We recall that an instance of the SC problem is defined by a set $C = \{e_1, e_2, \ldots, e_n\}$ of $n$ elements, $m$ subsets $S_1, S_2, \ldots, S_m$ such that $S_i \subseteq C$ for each $i \in \{1, \ldots, m\}$, and an integer $k \leq m$. The problem consists in finding $k$ indexes $i_1, i_2, \ldots, i_k$ such that the union of the corresponding subsets equals $C$, i.e., $C = \bigcup_{j=1}^{k} S_{i_j}$.

Given an instance of the SC problem, we construct a game with an arena consisting of $n + k \cdot (m + 1) + 3$ vertices. The arena $G$ of the game is provided in Figure 4 and can be seen as two sub-arenas $G_1$ and $G_2$ reachable from the initial vertex $v_0$. The game is such that there is a solution to the SC problem if and only if Player 0 has a strategy from $v_0$ in $G$ which is a solution to the SPS problem. The game is played between Player 0 with reachability objective $\Omega_0$ and Player 1 with $n + 1$ reachability objectives. The objectives are defined as follows: $\Omega_0 = \mathsf{Reach}(\{v_2\})$, $\Omega_i = \mathsf{Reach}(\{e_i\} \cup \{S_j \mid e_i \in S_j\})$ for $i \in \{1, 2, \ldots, n\}$ and $\Omega_{n+1} = \mathsf{Reach}(\{v_2\})$. First, notice that every play in $G_1$ is consistent with any strategy of Player 0 and is lost by that player. It holds that for each $\ell \in \{1, 2, \ldots, n\}$, there is such a play with payoff $(p_1, \ldots, p_{n+1})$ such that $p_\ell = 1$ and $p_j = 0$ for $j \neq \ell$. These payoffs correspond to the elements $e_\ell$ we aim to cover in the SC problem. A play in $G_2$ visits $v_2$ and then a vertex $c$ from which Player 0 selects a vertex $S$. Such a play is always won by Player 0 and its payoff is $(p_1, \ldots, p_{n+1})$ such that $p_{n+1} = 1$ and $p_r = 1$ if and only if the element $e_r$ belongs to the set $S$. It follows that the payoff of such a play corresponds to a set of elements in the SC problem. The following proposition holds, and Theorem 15 follows as a consequence.

▶ Proposition 16.
There is a solution to an instance of the SC problem if and only if Player 0 has a strategy from $v_0$ in the corresponding SP game that is a solution to the SPS problem.

Proof of Proposition 16. First, let us assume that there is a solution to the SC problem. It holds that there exists a set of $k$ indexes $i_1, i_2, \ldots, i_k$ such that the union of the corresponding sets equals the set $C$ of elements we aim to cover. We define the strategy $\sigma_0$ as follows: $\sigma_0(v_0 v_2 c_j) = S_{i_j}$. Let us show that this strategy is a solution to the SPS problem by showing that any play with a $\sigma_0$-fixed Pareto-optimal payoff is won by Player 0. This amounts to showing that for every play in $G_1$ there is a play in $G_2$ with a strictly larger payoff. This is sufficient as it makes sure that the payoffs of plays in $G_1$ are not $\sigma_0$-fixed Pareto-optimal and as every play in $G_2$ is won by Player 0. Let $p = (p_1, \ldots, p_{n+1})$ be the payoff of a play in $G_1$. It holds that $p_\ell = 1$ for some $\ell \in \{1, 2, \ldots, n\}$ and $p_j = 0$ for $j \neq \ell$. This corresponds to the element $e_\ell$ in $C$. Since the $k$ indexes $i_1, i_2, \ldots, i_k$ are a solution to the SC problem, there exists some index $i_j$ such that $e_\ell \in S_{i_j}$. It also holds that the play $v_0 v_2 c_j (S_{i_j})^\omega$ is consistent with $\sigma_0$. Its payoff is $p' = (p'_1, \ldots, p'_{n+1})$ with $p'_\ell = 1$ since $e_\ell \in S_{i_j}$ and $p'_{n+1} = 1$. It follows that payoff $p'$ is strictly larger than $p$.

Now, let us assume that Player 0 has a strategy $\sigma_0$ from $v_0$ that is a solution to the SPS problem. We can show that the set of indexes $\{i_j \mid \sigma_0(v_0 v_2 c_j) = S_{i_j}, j \in \{1, \ldots, k\}\}$ is a solution to the SC problem. Since strategy $\sigma_0$ is a solution to the SPS problem, every payoff $p$ in $G_1$ is strictly smaller than some payoff $p'$ in $G_2$. It follows that in the SC problem, each element $e \in C$ corresponding to $p$ is contained in some set $S$ corresponding to $p'$. Since it also holds that $S \subseteq C$ for each set $S$, it follows that the union of the sets mentioned above is exactly $C$. ◀

Let us come back to regular game arenas and show the NEXPTIME-hardness result for both reachability and parity SP games. Each type of objective is studied in a dedicated subsection.

▶ Theorem 17. The SPS problem is NEXPTIME-hard for reachability SP games.

▶ Theorem 18. The SPS problem is NEXPTIME-hard for parity SP games.

The NEXPTIME-hardness is obtained thanks to the succinct variant of the SC problem presented below.
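Before turning to the succinct variant, the tree-arena reduction of Proposition 16 can be replayed on small instances. The sketch below is an illustration under stated assumptions, not the paper's construction verbatim: elements are numbered from 0, a memoryless strategy of Player 0 is represented by the tuple of subset indices chosen at the vertices $c_1, \ldots, c_k$, and all function names are hypothetical.

```python
from itertools import product


def payoff_g1(l, n):
    """Payoff of the play of G_1 ending in e_l: only objective l holds."""
    return tuple(1 if i == l else 0 for i in range(n)) + (0,)


def payoff_g2(subset, n):
    """Payoff of a play of G_2 ending in S: the elements of S plus objective n+1."""
    return tuple(1 if i in subset else 0 for i in range(n)) + (1,)


def strictly_smaller(p, q):
    return p != q and all(a <= b for a, b in zip(p, q))


def is_solution(choice, subsets, n):
    """choice[j] is the index of the subset picked by Player 0 at c_j.
    Plays in G_1 are lost by Player 0, so the strategy is a solution iff
    no payoff of G_1 is Pareto-optimal, i.e. each one is strictly smaller
    than the payoff of some play in G_2 consistent with the strategy."""
    won = [payoff_g2(subsets[i], n) for i in choice]
    lost = [payoff_g1(l, n) for l in range(n)]
    return all(any(strictly_smaller(p, q) for q in won) for p in lost)


def has_set_cover(subsets, n, k):
    """Brute-force SC check: k indexes whose subsets cover {0, ..., n-1}."""
    universe = set(range(n))
    return any(
        set().union(*(subsets[i] for i in c)) == universe
        for c in product(range(len(subsets)), repeat=k)
    )


def player0_has_solution(subsets, n, k):
    """Brute-force SPS check over all memoryless strategies of Player 0."""
    return any(
        is_solution(c, subsets, n)
        for c in product(range(len(subsets)), repeat=k)
    )
```

On the instance with subsets `[{0, 1}, {1, 2}, {2}]` over three elements, both `has_set_cover` and `player0_has_solution` are false for `k = 1` and true for `k = 2`, as Proposition 16 predicts. The enumeration is exponential in $k$ and only meant for toy instances.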
The Succinct Set Cover problem (SSC problem) is defined as follows. We are given a Conjunctive Normal Form (CNF) formula $\varphi = C_1 \wedge C_2 \wedge \cdots \wedge C_p$ over the variables $X = \{x_1, x_2, \ldots, x_m\}$ made up of $p$ clauses, each containing some disjunction of literals of the variables in $X$. The set of valuations of the variables $X$ which satisfy $\varphi$ is written $\llbracket \varphi \rrbracket$. We are also given an integer $k \in \mathbb{N}$ (encoded in binary) and another CNF formula $\psi = D_1 \wedge D_2 \wedge \cdots \wedge D_q$ over the variables $X \cup Y$ with $Y = \{y_1, y_2, \ldots, y_n\}$, made up of $q$ clauses. Given a valuation $val_Y : Y \to \{0,1\}$ of the variables in $Y$, called a partial valuation, we write $\psi[val_Y]$ for the CNF formula obtained by replacing in $\psi$ each variable $y \in Y$ by its valuation $val_Y(y)$. We write $\llbracket \psi[val_Y] \rrbracket$ for the valuations of the remaining variables $X$ which satisfy $\psi[val_Y]$. The SSC problem is to decide whether there exists a set $K = \{val_Y \mid val_Y : Y \to \{0,1\}\}$ of $k$ valuations of the variables in $Y$ such that the valuations of the remaining variables $X$ which satisfy the formulas $\psi[val_Y]$ include the valuations of $X$ which satisfy $\varphi$. Formally, we write this $\llbracket \varphi \rrbracket \subseteq \bigcup_{val_Y \in K} \llbracket \psi[val_Y] \rrbracket$.

We can show that this corresponds to a set cover problem succinctly defined using CNF formulas. The set $\llbracket \varphi \rrbracket$ of valuations of $X$ which satisfy $\varphi$ corresponds to the set of elements we aim to cover. Parameter $k$ is the number of sets that can be used to cover these elements. Such a set is described by a formula $\psi[val_Y]$, given a partial valuation $val_Y$, and its elements are the valuations of $X$ in $\llbracket \psi[val_Y] \rrbracket$. This is illustrated in the following example.

▶ Example 19.
Consider the CNF formula ϕ = ( x ∨ ¬ x ) ∧ ( x ∨ x ) over the variables X = { x , x , x } . The set of valuations of the variables which satisfy ϕ is ⟦ϕ⟧ = { (1 , , , (1 , , , (1 , , , (0 , , } . Each such valuation corresponds to one element we aim to cover. Consider the CNF formula ψ = ( y ∨ y ) ∧ ( x ∨ y ) ∧ ( x ∨ x ∨ y ) over the variables X ∪ Y with Y = { y , y } . Given the partial valuation val Y of the variables in Y such that val Y ( y ) = 0 and val Y ( y ) = 1, we get the CNF formula ψ [ val Y ] = (0 ∨ ∧ ( x ∨ ∧ ( x ∨ x ∨ X which satisfy ψ [ val Y ] are the elements contained in the set. In this case, these elements are ⟦ψ [ val Y ]⟧ = { (0 , , , (0 , , , (0 , , , (1 , , , (1 , , , (1 , , } . We can see that this set contains the elements { (1 , , , (1 , , , (1 , , , (0 , , } of ⟦ϕ⟧.

The following result is used in the proof of our NEXPTIME-hardness results and is of potential independent interest.

▶ Theorem 20.
The SSC problem is NEXPTIME-complete.
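To make the definition of the SSC problem concrete, here is a brute-force checker for tiny instances. It is a sketch under stated assumptions: a CNF formula is encoded as a list of clauses, each clause being a list of pairs `(variable, polarity)` with polarity 1 for a positive literal and 0 for a negated one, and the names `satisfies`, `models` and `ssc` are hypothetical. The enumeration takes exponential time and is in no way the algorithm behind the complexity result.

```python
from itertools import combinations, product


def satisfies(cnf, valuation):
    """A CNF holds if every clause has a literal made true by the valuation.
    The empty CNF is satisfied by every valuation."""
    return all(
        any(valuation[var] == polarity for var, polarity in clause)
        for clause in cnf
    )


def models(cnf, variables, fixed=None):
    """Valuations of `variables` (as bit tuples) satisfying `cnf`
    once the variables in `fixed` have been replaced by their values."""
    fixed = fixed or {}
    return {
        bits
        for bits in product((0, 1), repeat=len(variables))
        if satisfies(cnf, {**fixed, **dict(zip(variables, bits))})
    }


def ssc(phi, psi, xs, ys, k):
    """Is there a set K of k valuations of ys such that the models of phi
    are included in the union of the models of psi[val_Y], val_Y in K?"""
    target = models(phi, xs)
    covers = [
        models(psi, xs, dict(zip(ys, bits)))
        for bits in product((0, 1), repeat=len(ys))
    ]
    return any(
        target <= set().union(*chosen)
        for chosen in combinations(covers, k)
    )
```

As a usage example, take `xs = ["x1"]`, `ys = ["y1"]`, the empty formula `phi = []` and `psi = [[("x1", 1), ("y1", 1)], [("x1", 0), ("y1", 0)]]`, which encodes $(x_1 \vee y_1) \wedge (\neg x_1 \vee \neg y_1)$. Each partial valuation of $y_1$ covers exactly one of the two valuations of $x_1$, so `ssc(phi, psi, xs, ys, 1)` is false and `ssc(phi, psi, xs, ys, 2)` is true.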
Proof of Theorem 20.
It is easy to see that the SSC problem is in
NEXPTIME . We canshow that the SSC problem is
NEXPTIME -hard by reduction from the
Succinct DominatingSet problem (SDS problem) which is known to be
NEXPTIME-complete for graphs succinctly defined using CNF formulas [14]. An instance of the SDS problem is defined by a CNF formula $\theta$ over two sets of $n$ variables $X = \{x_1, x_2, \dots, x_n\}$ and $Y = \{y_1, y_2, \dots, y_n\}$ and an integer $k$ (encoded in binary). The formula $\theta$ succinctly defines an undirected graph in the following way. The set of vertices is the set of all valuations of the $n$ variables in $X$ (or of the $n$ variables in $Y$), of which there are $2^n$. Let $\mathit{val}_X$ and $\mathit{val}_Y$ be two such valuations, representing two vertices. Then, there is an edge between $\mathit{val}_X$ and $\mathit{val}_Y$ if and only if $\theta[\mathit{val}_X, \mathit{val}_Y]$ or $\theta[\mathit{val}_Y, \mathit{val}_X]$ is true. An instance of the SDS problem is positive if there exists a set $K = \{\mathit{val}_X^1, \mathit{val}_X^2, \dots, \mathit{val}_X^k\}$ of $k$ valuations of the variables in $X$, corresponding to $k$ vertices, such that all vertices in the graph are adjacent to a vertex in $K$. Formally, we write this $\left| \bigcup_{\mathit{val}_X \in K} \{\mathit{val}_Y \mid \theta[\mathit{val}_X, \mathit{val}_Y] \vee \theta[\mathit{val}_Y, \mathit{val}_X] \text{ is true}\} \right| = 2^n$.

The SDS problem can be reduced in polynomial time to the SSC problem as follows. We define the CNF formula $\phi$ over the set of variables $X$ such that the formula is empty. Therefore, the set $\llbracket \phi \rrbracket$ is equal to the $2^n$ valuations of the variables in $X$. We then define the CNF formula $\psi$ over the sets of variables $X$ and $Y$ such that it is the CNF equivalent to $\theta(X, Y) \vee \theta(Y, X)$. The latter formula has a size which is polynomial in the size of the CNF formula $\theta$ which defines the graph, since distributing the disjunction yields one clause per pair of clauses. We keep the same integer $k$. Then, it is direct to see that the instance of the SDS problem is positive if and only if the instance of the SSC problem is positive. Indeed, the instance of the SDS problem is positive if and only if there exists a set $K$ of $k$ valuations of the variables in $Y$ such that $\llbracket \phi \rrbracket \subseteq \bigcup_{\mathit{val}_Y \in K} \llbracket \psi[\mathit{val}_Y] \rrbracket$. ◀

Figure 5
The gadget $Q_{11}$.

Figure 6 The arena $G$ used in the reduction from the SSC problem.

We now describe in detail our reduction from the SSC problem, which allows us to show the NEXPTIME-hardness of solving the SPS problem in reachability SP games.
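The first ingredient of this reduction is a gadget $Q_k$ that must have polynomially many vertices (in the length of the binary encoding of $k$) and exactly $k$ paths from a vertex $\alpha$ to a vertex $\beta$. The sketch below shows one way to meet this requirement, assuming a hypothetical layout in which each bit $b_i = 1$ of $k$ contributes a chain of $i$ two-way choices, hence $2^i$ paths. The exact shape of the gadget in Figure 5 may differ, and all function and vertex names are illustrative.

```python
from functools import lru_cache

def ones_positions(k):
    """Indices i such that bit b_i of k equals 1 (b_0 is the least significant bit)."""
    return [i for i in range(k.bit_length()) if (k >> i) & 1]

def build_gadget(k):
    """Successor map of a DAG with exactly k paths from 'alpha' to 'beta'.

    Hypothetical layout: for each i with b_i = 1, a chain of i binary
    'diamonds' hangs between alpha and beta and contributes 2^i paths.
    """
    succ = {"alpha": [], "beta": []}
    for i in ones_positions(k):
        succ["alpha"].append(("entry", i, 0))
        for j in range(i):
            top, bottom = ("top", i, j), ("bottom", i, j)
            nxt = ("entry", i, j + 1)
            succ[("entry", i, j)] = [top, bottom]  # two-way choice
            succ[top] = [nxt]
            succ[bottom] = [nxt]
        succ[("entry", i, i)] = ["beta"]
    return succ

def count_paths(succ, src="alpha", dst="beta"):
    """Number of distinct paths from src to dst in the DAG `succ`."""
    @lru_cache(maxsize=None)
    def go(vertex):
        if vertex == dst:
            return 1
        return sum(go(w) for w in succ[vertex])
    return go(src)

gadget = build_gadget(11)
print(ones_positions(11))   # [0, 1, 3]
print(count_paths(gadget))  # 11
```

With this layout the number of vertices is quadratic in the number of bits of $k$, while the number of paths is $k$ itself, which is what allows $k$ to be given in binary.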
Gadget $Q_k$. Parameter $k$ can be represented in binary using $r = \lfloor \log_2(k) \rfloor + 1$ bits. It also holds that $k$ is the sum of at most $r$ powers of 2. Given the binary encoding $b_0 b_1 \dots b_{r-1}$ of $k$ such that $b_i \in \{0, 1\}$, let $\mathit{ones} = \{i \in \{0, \dots, r-1\} \mid b_i = 1\}$. It holds that $k = \sum_{i \in \mathit{ones}} 2^i$. Our gadget $Q_k$ is a graph with a polynomial number of vertices (in the length of the binary encoding of $k$) such that all these vertices belong to Player 1. For each $i \in \mathit{ones}$ there are $2^i$ different paths from the initial vertex $\alpha$ to vertex $\beta$. Therefore, it holds that in $Q_k$ there are $k$ different paths from vertex $\alpha$ to vertex $\beta$.

▶ Example 21.
Let $k = 11$. It can be represented in binary using $\lfloor \log_2(11) \rfloor + 1 = 4$ bits. The binary representation of 11 is 1011 and it can be obtained by the sum $2^3 + 2^1 + 2^0$. The gadget $Q_{11}$ is detailed in Figure 5.

Game Arena.
The arena $G$ of the game is provided in Figure 6. It can be viewed as three sub-arenas that can be reached from the initial vertex $v_0$. We call these sub-arenas $G_1$, $G_2$ and $G_3$. Sub-arena $G_3$ starts with the gadget $Q_k$ described previously. The number of vertices of this arena is polynomial in the number of clauses and variables in the formulas $\phi$ and $\psi$ and in the length of the binary encoding of the integer $k$. The game is played between Player 0 with reachability objective $\Omega_0$ and Player 1 with $t = 1 + 2m + p + q$ reachability objectives. The payoff of a play therefore consists in a single Boolean for objective $\Omega_1$, a vector of $2m$ Booleans for objectives $\Omega_{x_1}, \Omega_{\neg x_1}, \dots, \Omega_{x_m}, \Omega_{\neg x_m}$, a vector of $p$ Booleans for objectives $\Omega_{C_1}, \dots, \Omega_{C_p}$ and a vector of $q$ Booleans for objectives $\Omega_{D_1}, \dots, \Omega_{D_q}$.

Objectives.
Each objective $\Omega_i$ in the game is a reachability objective $\mathsf{Reach}(T_i)$ defined by a target set $T_i$ as follows. We later explain how these objectives are used in our reduction.

The objective $\Omega_0$ of Player 0 and objective $\Omega_1$ of Player 1 are $\mathsf{Reach}(\{v_2, \alpha\})$ where $\alpha$ is the initial vertex of gadget $Q_k$.

The target set for the objective $\Omega_{x_i}$ (resp. $\Omega_{\neg x_i}$) is the set of vertices $\{x_i\}$ (resp. $\{\neg x_i\}$) in $G_1$, $G_2$ and $G_3$.

The target set for the objective $\Omega_{C_i}$ with $i \in \{1, \dots, p\}$ is the set of vertices in $G_1$ and $G_3$ corresponding to the literals of $X$ which make up the clause $C_i$ in $\phi$. In addition, vertex $i_j$ in $G_2$ belongs to the target set of objective $\Omega_{C_\ell}$ for all $\ell \in \{1, \dots, p\}$ such that $\ell \neq j$.

The target set of objective $\Omega_{D_i}$ with $i \in \{1, \dots, q\}$ is the set of vertices in $G_3$ corresponding to the literals of $X$ and $Y$ which make up the clause $D_i$ in $\psi$. In addition, vertices $v_1$ and $v_2$ satisfy every objective $\Omega_{D_i}$ with $i \in \{1, \dots, q\}$.

Payoff of Plays in $G_1$ and $G_2$. In each sub-arena $G_1$ and $G_2$, for each variable $x_i \in X$, there is one choice vertex controlled by Player 1 which leads to $x_i$ and $\neg x_i$. These vertices have the next choice vertex as their successor, except for vertices $x_m$ and $\neg x_m$ which have a self loop. In $G_2$, there is also a vertex $v_2$ controlled by Player 1 with $p$ successors, each leading to the first choice vertex for the variables in $X$. The sub-arenas $G_1$ and $G_2$ are completely controlled by Player 1. The plays in the corresponding sub-arenas are therefore consistent with any strategy of Player 0.

The plays in $G_1$ do not satisfy objective $\Omega_0$ of Player 0 nor objective $\Omega_1$ of Player 1. A play in $G_1$ is of the form $v_1 \rightsquigarrow z_1 \rightsquigarrow z_2 \cdots \rightsquigarrow (z_m)^\omega$ where $z_i$ is either $x_i$ or $\neg x_i$. It follows that a play satisfies the objective $\Omega_{x_i}$ or $\Omega_{\neg x_i}$ for each $x_i \in X$. The vector of payoffs for these objectives corresponds to a valuation of the variables in $X$.
In addition, due to the way the objectives are defined, objective $\Omega_{C_i}$ is satisfied in a play if and only if clause $C_i$ of $\phi$ is satisfied by the valuation this play corresponds to. The objective $\Omega_{D_i}$ for $i \in \{1, \dots, q\}$ is satisfied in every play in $G_1$.

▶ Lemma 22.
The plays in $G_1$ are consistent with any strategy of Player 0. The payoffs of plays in $G_1$ are of the form $(0, \mathit{val}, \mathit{sat}(\phi, \mathit{val}), 1, \dots, 1)$ where $\mathit{val}$ is a valuation of the variables in $X$ expressed as a vector of payoffs for the objectives $\Omega_{x_1}$ to $\Omega_{\neg x_m}$ and $\mathit{sat}(\phi, \mathit{val})$ is the vector of payoffs for objectives $\Omega_{C_1}$ to $\Omega_{C_p}$ corresponding to that valuation. It holds that no plays in $G_1$ are won by Player 0.

The plays in $G_2$ satisfy the objectives $\Omega_0$ of Player 0 and $\Omega_1$ of Player 1. A play in $G_2$ is of the form $v_2 \, i_j \rightsquigarrow z_1 \rightsquigarrow z_2 \cdots \rightsquigarrow (z_m)^\omega$ where $z_\ell$ is either $x_\ell$ or $\neg x_\ell$. It follows that a play satisfies either the objective $\Omega_x$ or $\Omega_{\neg x}$ for each $x \in X$, which again corresponds to a valuation of the variables in $X$. The objective $\Omega_{D_i}$ for $i \in \{1, \dots, q\}$ is satisfied in every play in $G_2$. Compared to the plays in $G_1$, the difference lies in the objectives corresponding to clauses of $\phi$ which are satisfied. In any play in $G_2$, a vertex $i_j$ with $j \in \{1, \dots, p\}$ is first visited, satisfying all the objectives $\Omega_{C_\ell}$ with $\ell \in \{1, \dots, p\}$ and $\ell \neq j$. All but one objective corresponding to the clauses of $\phi$ are therefore satisfied.

▶ Lemma 23.
The plays in $G_2$ are consistent with any strategy of Player 0. The payoffs of plays in $G_2$ are of the form $(1, \mathit{val}, \mathit{vec}, 1, \dots, 1)$ where $\mathit{val}$ is a valuation of the variables in $X$ expressed as a vector of payoffs for objectives $\Omega_{x_1}$ to $\Omega_{\neg x_m}$ and $\mathit{vec}$ is a vector of payoffs for objectives $\Omega_{C_1}$ to $\Omega_{C_p}$ in which all of them except one are satisfied. It holds that all plays in $G_2$ are won by Player 0.

From the two previous lemmas, we can state the following lemma when considering the payoffs of plays in $G_1$ and $G_2$.

▶ Lemma 24.
For every play in $G_1$ which corresponds to a valuation of the variables in $X$ that does not satisfy $\phi$, there is a play in $G_2$ with a strictly larger payoff.

Proof of Lemma 24.
Let $\rho$ be a play in $G_1$ which corresponds to a valuation of the variables in $X$ that does not satisfy $\phi$. It follows that at least one objective, say $\Omega_{C_\ell}$, is not satisfied in $\rho$ as at least one clause of $\phi$ (clause $C_\ell$) is not satisfied by that valuation. Let us consider the play $\rho'$ in $G_2$ which visits vertex $i_\ell$ and afterwards visits the vertices corresponding to the same valuation of the variables in $X$ as $\rho$. By Lemmas 22 and 23, it follows that the payoff of $\rho'$ is strictly larger than that of $\rho$: we have $(0, \mathit{val}, \mathit{sat}(\phi, \mathit{val}), 1, \dots, 1) < (1, \mathit{val}, \mathit{vec}, 1, \dots, 1)$ because $\mathit{sat}(\phi, \mathit{val}) \leq \mathit{vec}$. ◀

The following lemma is a consequence of Lemma 24.

▶ Lemma 25.
The set of payoffs of plays in $G_1$ that are $\sigma_0$-fixed Pareto-optimal when considering $G_1 \cup G_2$, for any strategy $\sigma_0$ of Player 0, is equal to the set of payoffs of plays in $G_1$ whose valuation of $X$ satisfies $\phi$.

Proof of Lemma 25.
This property stems from the following observations. First, any play in $G_1$ which satisfies every objective $\Omega_{C_i}$ with $i \in \{1, \dots, p\}$, and therefore corresponds to a valuation of $X$ which satisfies $\phi$, has a payoff that is incomparable to every possible payoff in $G_2$. This is because such a play satisfies more objectives in $\Omega_{C_1}, \dots, \Omega_{C_p}$ than the plays in $G_2$ but does not satisfy objective $\Omega_1$ while the plays in $G_2$ do. Second, every other play in $G_1$ has a strictly smaller payoff than at least one play in $G_2$ due to Lemma 24 and its payoff is therefore not $\sigma_0$-fixed Pareto-optimal. ◀

Problematic Payoffs in $G_1$. The plays described in the previous lemma correspond exactly to the valuations of $X$ which satisfy $\phi$ and therefore to the elements we aim to cover in the SSC problem. They are $\sigma_0$-fixed Pareto-optimal when considering $G_1 \cup G_2$ and are lost by Player 0. All other $\sigma_0$-fixed Pareto-optimal payoffs in $G_1 \cup G_2$ are only realized by plays in $G_2$, which are all won by Player 0. It follows that in order for Player 0 to find a strategy $\sigma_0$ from $v_0$ that is a solution to the SPS problem, it must hold that those payoffs are not $\sigma_0$-fixed Pareto-optimal when considering $G_1 \cup G_2 \cup G_3$. Otherwise, a play consistent with $\sigma_0$ with a $\sigma_0$-fixed Pareto-optimal payoff is lost by Player 0. We call those payoffs problematic payoffs.

Creating Strictly Larger Payoffs in $G_3$. In order for Player 0 to find a strategy $\sigma_0$ which is a solution to the SPS problem, this strategy must be such that for each problematic payoff in $G_1$, there is a play in $G_3$ consistent with $\sigma_0$ and with a strictly larger payoff. Since the plays in $G_3$ are all won by Player 0, this would ensure that the strategy $\sigma_0$ is a solution to the problem. This corresponds in the SSC problem to selecting a series of sets in order to cover the valuations of $X$ which satisfy $\phi$.

Payoff of Plays in $G_3$. Sub-arena $G_3$ starts with gadget $Q_k$ whose vertices are controlled by Player 1.
Then, for each variable $y_i \in Y$, there is one choice vertex controlled by Player 0 which leads to $y_i$ and $\neg y_i$. These vertices have the next choice vertex as their successor, except for $y_n$ and $\neg y_n$ which lead to the first choice vertex for the variables in $X$. Each play in $G_3$ satisfies the objectives $\Omega_0$ of Player 0 and $\Omega_1$ of Player 1. A play in $G_3$ consistent with a strategy $\sigma_0$ is of the form $(\alpha \rightsquigarrow \cdots \rightsquigarrow \beta)(\hookrightarrow r_1 \hookrightarrow r_2 \dots \hookrightarrow r_n)(\rightsquigarrow z_1 \rightsquigarrow z_2 \cdots \rightsquigarrow (z_m)^\omega)$ where $r_i$ is either $y_i$ or $\neg y_i$ and $z_i$ is either $x_i$ or $\neg x_i$. Since only the choice vertices leading to $y$ or $\neg y$ for $y \in Y$ belong to Player 0, it holds that $(\hookrightarrow r_1 \hookrightarrow r_2 \dots \hookrightarrow r_n)$ is the only part of any play in $G_3$ which is directly influenced by $\sigma_0$. Since that part of a play comes after a history from $\alpha$ to $\beta$, of which there are $k$, and by definition of a strategy, this can be interpreted as choosing $k$ valuations of the variables in $Y$. After this, the play satisfies either the objective $\Omega_x$ or $\Omega_{\neg x}$ for each $x \in X$, which corresponds to a valuation of $X$. Due to the way the objectives are defined, the objective $\Omega_{C_i}$ (resp. $\Omega_{D_i}$) is satisfied if and only if clause $C_i$ of $\phi$ (resp. $D_i$ of $\psi$) is satisfied by the valuation the play corresponds to. In order to create a play with a payoff $r'$ that is strictly larger than a problematic payoff $r$, $\sigma_0$ must choose a valuation of $Y$ such that there exists a valuation of the remaining variables $X$ which together with this valuation of $Y$ satisfies $\psi$ and $\phi$ (since in $r$ every objective $\Omega_{C_i}$ for $i \in \{1, \dots, p\}$ and $\Omega_{D_i}$ for $i \in \{1, \dots, q\}$ is satisfied). Since the plays in $G_3$ also satisfy the objective $\Omega_1$ and plays in $G_1$ do not, this ensures that $r < r'$.

We can finally establish that our reduction is correct.

▶ Proposition 26.
Player 0 has a strategy $\sigma_0$ from $v_0$ in $G$ that is a solution to the SPS problem if and only if there is a solution to the corresponding instance of the SSC problem.

Proof of Proposition 26.
Let us assume that $\sigma_0$ is a solution to the SPS problem in $G$ and show that there is a solution to the SSC problem. Let $\mathit{val}_X$ be a valuation of the variables in $X$ which satisfies $\phi$. This valuation corresponds to a play in $G_1$ with a problematic payoff $r$. Since the objective of Player 0 is not satisfied in that play and since $\sigma_0$ is a solution to the SPS problem, it holds that $r$ is not $\sigma_0$-fixed Pareto-optimal. It follows that there exists a play in $G_3$ that is consistent with $\sigma_0$ and whose payoff is strictly larger than $r$. As described above, such a play corresponds to a valuation $\mathit{val}_Y$ of the variables in $Y$ such that $\mathit{val}_X \in \llbracket \psi[\mathit{val}_Y] \rrbracket$. Since this can be done for each $\mathit{val}_X \in \llbracket \phi \rrbracket$ and since there is a set $K$ of $k$ possible valuations $\mathit{val}_Y$ in $G_3$, it holds that $\llbracket \phi \rrbracket \subseteq \bigcup_{\mathit{val}_Y \in K} \llbracket \psi[\mathit{val}_Y] \rrbracket$.

Let us now assume that there is a solution to the SSC problem and show that we can construct a strategy $\sigma_0$ that is a solution to the SPS problem. Let $K$ be the set of $k$ valuations $\mathit{val}_Y$ of the variables in $Y$ which is a solution to the SSC problem. Since there are $k$ possible histories from $\alpha$ to $\beta$ in $G_3$, we define $\sigma_0$ such that the $n$ vertices $y_i$ or $\neg y_i$ for $i \in \{1, \dots, n\}$ visited after each history correspond to a valuation in $K$. We can now show that this strategy is a solution to the SPS problem. We do this by showing that each play $\rho$ with problematic payoff $r$ in $G_1$ has a strictly smaller payoff than that of some play $\rho'$ with payoff $r'$ in $G_3$. Such a payoff $r$ corresponds to a valuation $\mathit{val}_X \in \llbracket \phi \rrbracket$. Since $K$ is a solution to the SSC problem, it holds that there exists some valuation $\mathit{val}_Y \in K$ such that $\mathit{val}_X \in \llbracket \psi[\mathit{val}_Y] \rrbracket$. It follows, given the definition of $\sigma_0$, that there exists a play $\rho'$ in $G_3$ corresponding to that valuation $\mathit{val}_Y$ and which visits the vertices $x$ or $\neg x$ for each $x \in X$ such that it corresponds to the valuation $\mathit{val}_X$.
Given the properties mentioned before, the payoff $r'$ of this play is such that $r < r'$. ◀

The previous proof yields our result on the NEXPTIME-hardness of the SPS problem in reachability SP games.

Figure 7 The repeating structure used in the reduction from the SSC problem for parity SP games.
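The payoff structure established in Lemmas 22 to 25 for the reachability case can be checked mechanically on a small instance. The sketch below enumerates the payoffs of the plays in $G_1$ and $G_2$ and verifies that the Pareto-optimal payoffs realized in $G_1$ are exactly those of the valuations satisfying $\phi$. It assumes the formula of Example 19, taken to be $\phi = (x_1 \vee \neg x_2) \wedge (x_2 \vee x_3)$, with $q = 3$ clauses in $\psi$. The component order of the payoff vectors, the 0-based variable indices and all function names are choices made for this illustration.

```python
from itertools import product

# Clauses use 0-based indices: (i, 1) stands for x_{i+1}, (i, 0) for its negation.
PHI = [[(0, 1), (1, 0)], [(1, 1), (2, 1)]]  # (x1 or not x2) and (x2 or x3)
M, Q = 3, 3  # number of variables in X, number of clauses of psi

def clause_sat(clause, val):
    """1 if the valuation (tuple of bits) satisfies the clause, else 0."""
    return int(any(val[i] == pol for i, pol in clause))

def encode(val):
    """Payoffs for the objectives of x_1, not x_1, ..., x_m, not x_m."""
    out = []
    for bit in val:
        out += [bit, 1 - bit]
    return out

def payoffs_g1(phi, m, q):
    """Map each payoff (0, val, sat(phi, val), 1, ..., 1) of G1 to its valuation."""
    return {tuple([0] + encode(v) + [clause_sat(c, v) for c in phi] + [1] * q): v
            for v in product((0, 1), repeat=m)}

def payoffs_g2(phi, m, q):
    """Payoffs (1, val, vec, 1, ..., 1) of G2, where vec misses exactly clause j."""
    p = len(phi)
    return {tuple([1] + encode(v) + [int(ell != j) for ell in range(p)] + [1] * q)
            for v in product((0, 1), repeat=m) for j in range(p)}

def strictly_smaller(a, b):
    """Componentwise order on payoffs: a < b."""
    return a != b and all(x <= y for x, y in zip(a, b))

def pareto_optimal(payoffs):
    """Payoffs that are not strictly smaller than another payoff of the set."""
    return {a for a in payoffs if not any(strictly_smaller(a, b) for b in payoffs)}

g1 = payoffs_g1(PHI, M, Q)
g2 = payoffs_g2(PHI, M, Q)
optimal = pareto_optimal(set(g1) | g2)

# Valuations of the plays of G1 whose payoff is Pareto-optimal in G1 and G2.
problematic = {g1[a] for a in optimal if a in g1}
satisfying = {v for v in product((0, 1), repeat=M)
              if all(clause_sat(c, v) for c in PHI)}
print(problematic == satisfying)  # True
```

Since $G_1$ and $G_2$ are fully controlled by Player 1, the strategy of Player 0 plays no role in this check; it only matters once the plays of $G_3$ are added.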
We now provide the NEXPTIME-hardness result for parity SP games. The proof of this result follows the same ideas used in the proof for reachability SP games. It again uses a reduction from the SSC problem in which we construct an arena $G$, and its structure of three sub-arenas $G_1$, $G_2$, and $G_3$ is kept. We describe the main difficulties that we encounter when adapting this proof for parity objectives and how to overcome them by modifying each sub-arena $G_i$ into $G'_i$. The modified arena $G'$ is depicted in Figure 8.

Two Particular Objectives.
Remember that for the case of reachability SP games, the objective $\Omega_0$ of Player 0 and the first objective $\Omega_1$ of Player 1 were either always satisfied or always not satisfied in all plays of a given sub-arena $G_1$, $G_2$, or $G_3$. This property holds in the modified arena $G'$, with these objectives being expressed using parity conditions.

On the Choice of Valuations.
Recall that for reachability objectives, we encoded the choice of valuations for the variables $x_i \in X$ made by Player 1 by using sub-arena $G_1$ of Figure 6 (which was also reused as part of $G_2$ and $G_3$). A play in this sub-arena encodes a valuation by visiting one literal $l_i \in \{x_i, \neg x_i\}$ for each $x_i \in X$, thus satisfying the objectives $\Omega_{l_i}$, $i \in \{1, \dots, m\}$.

This simple schema cannot be reused in the case of parity objectives as they are prefix-independent objectives. Instead, we ask Player 1 to repeatedly produce the same choice along loops (from $v$ back to $v$) in the adapted gadget of Figure 7.

We associate with each variable $x_i \in X$ two parity objectives with priority function $c_{l_i}$ with $l_i \in \{x_i, \neg x_i\}$ defined as follows: $c_{l_i}(l_i) = 2$, $c_{l_i}(\neg l_i) = 1$ and $c_{l_i}(v) = 3$ for all the other vertices $v$. It is easy to see that plays in which the valuation changes infinitely many times (for example, visiting $x_i$ and $\neg x_i$ infinitely often) have a payoff strictly smaller than some other play which settles on a choice of valuation for each variable. Indeed, the payoff for objectives $\Omega_{x_i}$ and $\Omega_{\neg x_i}$ is $(0, 0)$ in the first case and $(1, 0)$ or $(0, 1)$ in the second. We say that a valuation is properly encoded if Player 1 eventually repeats the same choice along the loops to settle on a valuation.

Notice that the gadget of Figure 7 appears nearly identical as part of the three sub-arenas $G'_1$, $G'_2$, and $G'_3$ of Figure 8. In all of these sub-arenas, we define the values of each priority function $c_{l_i}$ with $l_i \in \{x_i, \neg x_i\}$ exactly as explained above.

On the Satisfied Clauses.
Recall that in the case of reachability SP games, we associated one reachability objective with each clause $C_i$ (resp. $D_i$) of $\phi$ (resp. $\psi$). As encoding valuations is made more complex by the prefix-independency of parity objectives, we also need to adapt the way we check which clauses are satisfied by a given valuation.

Figure 8 The arena $G'$ used in the reduction from the SSC problem for parity SP games.

Let us first explain our encoding for clauses $C_j$ of $\phi$. We associate one parity objective with priority function $c^{C_j}_l$ with each literal $l$ of each clause $C_j$. Therefore, if $C_j = l^{C_j}_1 \vee \dots \vee l^{C_j}_{n_{C_j}}$, there are $n_{C_j}$ parity objectives for clause $C_j$. Priority function $c^{C_j}_l$ is defined as follows for each vertex $l_i \in \{x_i, \neg x_i\}$ that appears in $G'_1$ and $G'_3$ (we will define it later for $G'_2$): $c^{C_j}_l(l_i) = 2$ if $l = l_i$ and $c^{C_j}_l(l_i) = 1$ if $l = \neg l_i$, and for all the other vertices $v$, we have $c^{C_j}_l(v) = 3$. This encoding of clauses has the following important property: given a valuation $\mathit{val}_X$ of the variables in $X$ properly encoded by Player 1, a clause $C_j$ is satisfied by $\mathit{val}_X$ if and only if the parity condition $\mathsf{Parity}(c^{C_j}_l)$ is satisfied for at least one of the literals $l$ of $C_j$. Thus with the proposed encoding with priority functions, there are several ways to observe that a clause $C_j$ is satisfied (the corresponding payoff is a non-null vector of $n_{C_j}$ Booleans).

In the sub-arena $G'_3$, we see a part resembling the gadget of Figure 7, however made of vertices $y_i$, $i \in \{1, \dots, n\}$. This part is related to the choice of a valuation of the variables in $Y$ made by Player 0 with respect to $\psi$. To encode clauses $D_j$ of $\psi$, we proceed exactly as we did previously with clauses $C_j$ of $\phi$. We associate one priority function $c^{D_j}_l$ with each literal $l$ of each clause $D_j$ (recall that such a literal uses both sets of variables $X$ and $Y$). We similarly define the values of $c^{D_j}_l$ for vertices of $G'_3$: for $l' \in \{x_i, \neg x_i \mid i \in \{1, \dots, m\}\} \cup \{y_i, \neg y_i \mid i \in \{1, \dots, n\}\}$, we define $c^{D_j}_l(l') = 2$ if $l = l'$ and $c^{D_j}_l(l') = 1$ if $l = \neg l'$, and for all the other vertices $v$ of $G'_3$, we define $c^{D_j}_l(v) = 3$. Notice that the definition of $c^{D_j}_l$ is given for $G'_3$ only. We will later give its definition for $G'_1$ and $G'_2$.
Modifications Needed on $G_1$. Let us now explain how to modify $G_1$ into $G'_1$. Remember that in the case of reachability SP games, the objective associated with each clause $D_j$ of $\psi$ is satisfied by all plays in $G_1$. Indeed the purpose of $G_1$ (in combination with $G_2$) is to isolate all the encodings of the valuations of $X$ that satisfy $\phi$ independently of $\psi$. In case of a positive instance of the SSC problem, the payoffs of these encodings are strictly smaller than that of some play in $G_3$, which has to satisfy all clauses of $\psi$ by definition of this problem.

We proceed similarly in $G'_1$. However, as there are several ways to satisfy $D_j$ in $\psi$ (at least one of its literals has to be satisfied), we let Player 0 choose which way to do it. This is encoded by the part of $G'_1$ made with vertices $D_j$, $j \in \{1, \dots, q\}$, controlled by Player 0, and their successors $l^{D_j}_1, \dots, l^{D_j}_{n_{D_j}}$ such that $D_j = l^{D_j}_1 \vee \dots \vee l^{D_j}_{n_{D_j}}$. Given a properly encoded $X$ valuation $\mathit{val}_X$ made by Player 1, Player 0 chooses a $Y$ valuation $\mathit{val}_Y$ such that if $\mathit{val}_X \in \llbracket \phi \rrbracket$, then $\mathit{val}_X \in \llbracket \psi[\mathit{val}_Y] \rrbracket$. Player 0 makes such a choice by selecting at least one literal $l^{D_j}$ of $D_j$, for each clause $D_j$ of $\psi$, such that $l$ is satisfied by the valuation made of $\mathit{val}_X$ and $\mathit{val}_Y$. This is encoded in $G'_1$ by defining the priority function $c^{D_j}_l$ such that $c^{D_j}_l(l) = 2$ and $c^{D_j}_l(v) = 3$ for all the other vertices $v$ of $G'_1$.

Modifications Needed on $G_2$. Recall that in the case of reachability SP games, the construction of $G_2$ ensures that the plays of $G_1$ whose payoff is not strictly smaller than the payoff of a play in $G_2$ are exactly the plays of $G_1$ that encode $X$ valuations $\mathit{val}_X$ such that $\mathit{val}_X \in \llbracket \phi \rrbracket$.
As a consequence, the objectives that are satisfied by the plays in $G_2$ are exactly (i) those associated with a valuation $\mathit{val}_X$, (ii) the objectives associated with all clauses $D_j$ of $\psi$, and (iii) the objectives associated with all clauses $C_j$ of $\phi$ except one.

We achieve the same requirement for parity SP games by using $G'_2$ in place of $G_2$. In this sub-arena, Player 1 first selects one clause $C$ in $\phi$ and then, in the selecting part of $G'_2$, the priority functions are defined as follows.

We use the priority functions $c_{l_i}$ with $l_i \in \{x_i, \neg x_i \mid i \in \{1, \dots, m\}\}$ as defined above for encoding $X$ valuations.

The priority functions $c^{D_j}_l$ are all defined such that the associated objective $\mathsf{Parity}(c^{D_j}_l)$ is satisfied.

Similarly the priority functions $c^{C_j}_l$ are defined such that the associated objective $\mathsf{Parity}(c^{C_j}_l)$ is satisfied, except for all priority functions $c^{C_j}_l$ such that $C_j = C$, for which this objective is not satisfied.

Modifications Needed on $G_3$. We have modified $G_1$ into $G'_1$ and $G_2$ into $G'_2$ such that the only plays in $G'_1$ with a Pareto-optimal payoff when considering $G'_1 \cup G'_2$, given any strategy of Player 0, are those that encode valuations $\mathit{val}_X \in \llbracket \phi \rrbracket$. In $G'_1$, after such a valuation chosen by Player 1, Player 0 indicates which valuation $\mathit{val}_Y$ to use such that $\mathit{val}_X \in \llbracket \psi[\mathit{val}_Y] \rrbracket$. He chooses this valuation $\mathit{val}_Y$ by indicating for each clause $D_j$ which literals of $D_j$ he has chosen such that the valuation made of $\mathit{val}_X$ and $\mathit{val}_Y$ satisfies $D_j$.

Let us now explain how to modify $G_3$ into $G'_3$. After each of the $k$ histories produced by gadget $Q_k$, both players have to choose some valuation for the variables that they control ($\mathit{val}_X$ for Player 1 and $\mathit{val}_Y$ for Player 0). In case of a positive instance of the SSC problem, Player 0 will be able to select one of the valuations $\mathit{val}_Y$ that he used in $G'_1$ such that $\mathit{val}_X \in \llbracket \psi[\mathit{val}_Y] \rrbracket$ whenever $\mathit{val}_X \in \llbracket \phi \rrbracket$.
It follows that plays in $G'_3$ have a larger payoff than the Pareto-optimal payoffs of $G'_1 \cup G'_2$. In case of a negative instance of the SSC problem, Player 0 will not be able to do so.

Clearly there exists a solution to the SPS problem in the modified arena $G'$ if and only if the instance of the SSC problem is positive.

We have introduced in this paper the class of two-player SP games and the SPS problem in those games. We provided a reduction from SP games to a two-player zero-sum game called the C-P game in order to provide FPT results on solving this problem. We then showed how the arena and the generic objective of this C-P game can be adapted to specifically handle reachability and parity SP games. This allowed us to prove that reachability (resp. parity) SP games are in FPT when the number $t$ of objectives of Player 1 (resp. when $t$ and the maximal priority according to each priority function in the game) is a parameter. We then turned to the complexity class of the SPS problem and provided a proof of its NEXPTIME-membership, which relied on showing that any solution to the SPS problem in a reachability or parity SP game can be transformed into a solution with an exponential memory. We provided a proof of the NP-completeness of the problem in the simple setting of reachability SP games played on tree arenas. We then came back to regular game arenas and provided the proof of the NEXPTIME-hardness of the SPS problem in reachability and parity SP games. This proof relied on a reduction from the SSC problem which we proved to be
NEXPTIME-complete, a result of potential independent interest.

In future work, we want to study other $\omega$-regular objectives as well as quantitative objectives such as mean-payoff in the framework of SP games and the SPS problem. It would also be interesting to study whether other works, such as rational synthesis, could benefit from the approaches used in this paper.

References

[1] Mrudula Balachander, Shibashis Guha, and Jean-François Raskin. Stackelberg mean-payoff games with a rationally bounded adversarial follower. CoRR, abs/2007.07209, 2020. URL: https://arxiv.org/abs/2007.07209, arXiv:2007.07209.
[2] Dietmar Berwanger. Admissibility in infinite games. In Wolfgang Thomas and Pascal Weil, editors, STACS 2007, 24th Annual Symposium on Theoretical Aspects of Computer Science, Aachen, Germany, February 22-24, 2007, Proceedings, volume 4393 of Lecture Notes in Computer Science, pages 188–199. Springer, 2007. doi:10.1007/978-3-540-70918-3_17.
[3] Patricia Bouyer, Romain Brenguier, Nicolas Markey, and Michael Ummels. Pure Nash equilibria in concurrent deterministic games. Log. Methods Comput. Sci., 11(2), 2015. doi:10.2168/LMCS-11(2:9)2015.
[4] Romain Brenguier, Lorenzo Clemente, Paul Hunter, Guillermo A. Pérez, Mickael Randour, Jean-François Raskin, Ocan Sankur, and Mathieu Sassolas. Non-zero sum games for reactive synthesis. In Adrian-Horia Dediu, Jan Janousek, Carlos Martín-Vide, and Bianca Truthe, editors, Language and Automata Theory and Applications - 10th International Conference, LATA 2016, Prague, Czech Republic, March 14-18, 2016, Proceedings, volume 9618 of Lecture Notes in Computer Science, pages 3–23. Springer, 2016. doi:10.1007/978-3-319-30000-9_1.
[5] Romain Brenguier, Jean-François Raskin, and Ocan Sankur. Assume-admissible synthesis. In Luca Aceto and David de Frutos-Escrig, editors, volume 42 of LIPIcs, pages 100–113. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015. doi:10.4230/LIPIcs.CONCUR.2015.100.
[6] Léonard Brice, Jean-François Raskin, and Marie van den Bogaard. Subgame-perfect equilibria in mean-payoff games. CoRR, abs/2101.10685, 2021. URL: https://arxiv.org/abs/2101.10685, arXiv:2101.10685.
[7] Thomas Brihaye, Véronique Bruyère, Aline Goeminne, Jean-François Raskin, and Marie van den Bogaard. The complexity of subgame perfect equilibria in quantitative reachability games. Log. Methods Comput. Sci., 16(4), 2020. URL: https://lmcs.episciences.org/6883.
[8] Véronique Bruyère. Computer aided synthesis: A game-theoretic approach. In Émilie Charlier, Julien Leroy, and Michel Rigo, editors, Developments in Language Theory - 21st International Conference, DLT 2017, Liège, Belgium, August 7-11, 2017, Proceedings, volume 10396 of Lecture Notes in Computer Science, pages 3–35. Springer, 2017. doi:10.1007/978-3-319-62809-7_1.
[9] Véronique Bruyère, Quentin Hautem, and Jean-François Raskin. Parameterized complexity of games with monotonically ordered omega-regular objectives. In Sven Schewe and Lijun Zhang, editors, volume 118 of LIPIcs, pages 29:1–29:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi:10.4230/LIPIcs.CONCUR.2018.29.
[10] Krishnendu Chatterjee and Monika Henzinger. Efficient and dynamic algorithms for alternating Büchi games and maximal end-component decomposition. J. ACM, 61(3):15:1–15:40, 2014. doi:10.1145/2597631.
[11] Krishnendu Chatterjee and Thomas A. Henzinger. Assume-guarantee synthesis. In Orna Grumberg and Michael Huth, editors, Tools and Algorithms for the Construction and Analysis of Systems, 13th International Conference, TACAS 2007, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2007 Braga, Portugal, March 24 - April 1, 2007, Proceedings, volume 4424 of Lecture Notes in Computer Science, pages 261–275. Springer, 2007. doi:10.1007/978-3-540-71209-1_21.
[12] Krishnendu Chatterjee, Thomas A. Henzinger, and Marcin Jurdzinski. Games with secure equilibria. Theor. Comput. Sci., 365(1-2):67–82, 2006. doi:10.1016/j.tcs.2006.07.032.
[13] Rodica Condurache, Emmanuel Filiot, Raffaella Gentilini, and Jean-François Raskin. The complexity of rational synthesis. In Ioannis Chatzigiannakis, Michael Mitzenmacher, Yuval Rabani, and Davide Sangiorgi, editors, volume 55 of LIPIcs, pages 121:1–121:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016. doi:10.4230/LIPIcs.ICALP.2016.121.
[14] Bireswar Das, Patrick Scharpfenecker, and Jacobo Torán. CNF and DNF succinct graph encodings. Inf. Comput., 253:436–447, 2017. doi:10.1016/j.ic.2016.06.009.
[15] R.G. Downey and M.R. Fellows. Parameterized Complexity. Monographs in Computer Science. Springer New York, 2012. URL: https://books.google.be/books?id=HyTjBwAAQBAJ.
[16] Stefan Dziembowski, Marcin Jurdzinski, and Igor Walukiewicz. How much memory is needed to win infinite games? In Proceedings, 12th Annual IEEE Symposium on Logic in Computer Science, Warsaw, Poland, June 29 - July 2, 1997, pages 99–110. IEEE Computer Society, 1997. doi:10.1109/LICS.1997.614939.
[17] E. Allen Emerson and Chin-Laung Lei. Modalities for model checking: Branching time logic strikes back. Sci. Comput. Program., 8(3):275–306, 1987. doi:10.1016/0167-6423(87)90036-0.
[18] Emmanuel Filiot, Raffaella Gentilini, and Jean-François Raskin. The adversarial Stackelberg value in quantitative games. In Artur Czumaj, Anuj Dawar, and Emanuela Merelli, editors, volume 168 of LIPIcs, pages 127:1–127:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2020. doi:10.4230/LIPIcs.ICALP.2020.127.
[19] Dana Fisman, Orna Kupferman, and Yoad Lustig. Rational synthesis. In Javier Esparza and Rupak Majumdar, editors, Tools and Algorithms for the Construction and Analysis of Systems, 16th International Conference, TACAS 2010, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2010, Paphos, Cyprus, March 20-28, 2010. Proceedings, volume 6015 of Lecture Notes in Computer Science, pages 190–204. Springer, 2010. doi:10.1007/978-3-642-12002-2_16.
[20] Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research [outcome of a Dagstuhl seminar, February 2001], volume 2500 of Lecture Notes in Computer Science. Springer, 2002. doi:10.1007/3-540-36387-4.
[21] Anshul Gupta and Sven Schewe. Quantitative verification in rational environments. In Amedeo Cesta, Carlo Combi, and François Laroussinie, editors, pages 123–131. IEEE Computer Society, 2014. doi:10.1109/TIME.2014.9.
[22] Anshul Gupta, Sven Schewe, and Dominik Wojtczak. Making the best of limited memory in multi-player discounted sum games. In Javier Esparza and Enrico Tronci, editors, Proceedings Sixth International Symposium on Games, Automata, Logics and Formal Verification, GandALF 2015, Genoa, Italy, 21-22nd September 2015, volume 193 of EPTCS, pages 16–30, 2015. doi:10.4204/EPTCS.193.2.
[23] Florian Horn. Explicit Muller games are PTIME. In Ramesh Hariharan, Madhavan Mukund, and V. Vinay, editors, IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2008, December 9-11, 2008, Bangalore, India, volume 2 of LIPIcs, pages 235–243. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2008. doi:10.4230/LIPIcs.FSTTCS.2008.1756.
[24] Richard M. Karp. Reducibility among combinatorial problems. In Raymond E. Miller and James W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 85–103. Plenum Press, New York, 1972. doi:10.1007/978-1-4684-2001-2_9.
[25] Orna Kupferman, Giuseppe Perelli, and Moshe Y. Vardi. Synthesis with rational environments. Ann. Math. Artif. Intell., 78(1):3–20, 2016. doi:10.1007/s10472-016-9508-8.
[26] John F. Nash. Equilibrium points in n-person games. In PNAS, volume 36, pages 48–49. National Academy of Sciences, 1950.
[27] Amir Pnueli and Roni Rosner. On the synthesis of a reactive module. In Conference Record of the Sixteenth Annual ACM Symposium on Principles of Programming Languages, Austin, Texas, USA, January 11-13, 1989, pages 179–190. ACM Press, 1989. doi:10.1145/75277.75293.
[28] Reinhard Selten. Spieltheoretische Behandlung eines Oligopolmodells mit Nachfrageträgheit. Zeitschrift für die gesamte Staatswissenschaft, 121:301–324 and 667–689, 1965.
[29] Michael Ummels. Rational behaviour and strategy construction in infinite multiplayer games. In S. Arun-Kumar and Naveen Garg, editors, FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science, 26th International Conference, Kolkata, India, December 13-15, 2006, Proceedings, volume 4337 of
Lecture Notes in Computer Science , pages 212–223.Springer, 2006. doi:10.1007/11944836\_21 . Michael Ummels. The complexity of Nash equilibria in infinite multiplayer games. InRoberto M. Amadio, editor,
Foundations of Software Science and Computational Structures,11th International Conference, FOSSACS 2008, Held as Part of the Joint European Conferenceson Theory and Practice of Software, ETAPS 2008, Budapest, Hungary, March 29 - April 6,2008. Proceedings , volume 4962 of
Lecture Notes in Computer Science , pages 20–34. Springer,2008. doi:10.1007/978-3-540-78499-9\_3 . Michael Ummels and Dominik Wojtczak. The complexity of Nash equilibria in limit-averagegames. In Joost-Pieter Katoen and Barbara König, editors,
CONCUR 2011 - ConcurrencyTheory - 22nd International Conference, CONCUR 2011, Aachen, Germany, September 6-9,2011. Proceedings , volume 6901 of
Lecture Notes in Computer Science , pages 482–496. Springer,2011. doi:10.1007/978-3-642-23217-6\_32 . Heinrich Freiherr von Stackelberg.
Marktform und Gleichgewicht . Wien und Berlin, J. Springer,Cambridge, MA, 1937.
A Useful Result on SP Games

▶ Proposition 27. Every parity (resp. reachability) SP game $\mathcal{G}$ with arena $G$ containing $n$ vertices can be transformed into a parity (resp. reachability) SP game $\bar{\mathcal{G}}$ with arena $\bar{G}$ containing at most $n^2$ vertices such that every vertex in $\bar{G}$ has at most $2$ successors, and Player 0 has a strategy $\sigma_0$ that is a solution to the SPS problem in $\mathcal{G}$ if and only if Player 0 has a strategy $\bar{\sigma}_0$ that is a solution to the SPS problem in $\bar{\mathcal{G}}$.

Proof of Proposition 27.
Let $\mathcal{G}$ be an SP game with arena $G$. Let us first describe the arena $\bar{G}$ of $\bar{\mathcal{G}}$. Let $v \in V$ be a vertex of $G$. Then $v$ is also a vertex of $\bar{G}$, it belongs to the same player, and, if $(v, v) \notin E$, it is the root of a complete binary tree with $\ell = |\{ v' \mid (v, v') \in E \}|$ leaves. Otherwise, $v$ has a self loop and its other successor is the root of such a tree with $\ell - 1$ leaves. The internal vertices of the tree (that is, the vertices that are neither the root $v$, nor the leaves) belong to the same player as $v$. Each leaf vertex $v'$ of this tree is such that $(v, v') \in E$, belongs to the same player as in $G$, and is again the root of its own tree. The initial vertex $v_0$ of $G$ remains unchanged in $\bar{G}$.

Since every vertex in $\bar{G}$ is part of a binary tree or has a self loop and a single other successor, it has at most two successors. Since $G$ is a game arena, this transformation is such that each vertex in $\bar{G}$ has at least one successor. It follows that $\bar{G}$ is a game arena containing the $n$ vertices $v \in V$ and at most $n - 1$ internal vertices for each of them, this bound corresponding to the case where $v \in V$ has $n$ successors in $G$. Hence the number of vertices in $\bar{G}$ is at most $n + n \cdot (n - 1) = n^2$.

If $\mathcal{G}$ is a reachability SP game, the target sets remain unchanged in $\bar{\mathcal{G}}$. For parity SP games, the priority function $c$ remains unchanged for vertices $v \in V$ and we define $c(v') = c(v)$ for $v' \in \bar{V} \setminus V$ such that $v'$ is an internal vertex of a tree whose root is $v$.

Let us now show that there is a solution to the SPS problem in $\mathcal{G}$ if and only if there is a solution in $\bar{\mathcal{G}}$. From each root $v$ of a tree in $\bar{G}$ (corresponding to a vertex $v$ of Player $i$ in $G$) there is a set of $\ell = |\{ v' \mid (v, v') \in E \}|$ different paths controlled by Player $i$, each leading to a distinct vertex $v'$. It follows that there exists a play $\rho = v_0 v_1 v_2 \ldots \in \mathsf{Plays}_{G}$ if and only if there exists a play $\rho' = v_0 a_1 \ldots a_n v_1 b_1 \ldots b_n v_2 \ldots \in \mathsf{Plays}_{\bar{G}}$ such that every vertex $a_i$ (resp. $b_i$) belongs to the same player as $v_0$ (resp. $v_1$), and so on. Given the way the objectives are defined, in the case of reachability or parity SP games, it holds that $\mathsf{pay}(\rho) = \mathsf{pay}(\rho')$ and $\mathsf{won}(\rho) = \mathsf{won}(\rho')$. Therefore, a strategy $\sigma_0$ that is a solution to the SPS problem in $\mathcal{G}$ can be transformed into a strategy $\bar{\sigma}_0$ that is a solution in $\bar{\mathcal{G}}$, and vice versa.
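The arena transformation used in the proof can be illustrated by a minimal Python sketch. It is not the authors' implementation: the function names `binarize` and `original_successors`, the dictionary encoding of the arena, and the tuple names of the fresh vertices are all assumptions made for illustration. Each vertex with more than two successors is expanded into a binary tree of fresh vertices owned by the same player, and a self loop is kept as a direct edge, as in the construction above. For a parity SP game, the priority of a fresh vertex `x` would be copied from `origin[x]`.

```python
def binarize(edges, owner):
    """Sketch of the arena transformation (hypothetical encoding).

    edges: dict mapping each vertex to the list of its successors.
    owner: dict mapping each vertex to its player (0 or 1).
    Returns (new_edges, new_owner, origin), where origin maps every
    vertex of the new arena to the original vertex it stems from.
    """
    new_edges = {v: [] for v in edges}
    new_owner = dict(owner)
    origin = {v: v for v in edges}

    def fresh(v):
        # Create a fresh internal vertex owned by the same player as v.
        aux = ("aux", v, len(new_edges))
        new_edges[aux] = []
        new_owner[aux] = owner[v]
        origin[aux] = v
        return aux

    def attach(node, v, targets):
        # Make `node` reach exactly `targets` through a binary tree.
        if len(targets) <= 2:
            new_edges[node].extend(targets)
            return
        mid = len(targets) // 2
        for half in (targets[:mid], targets[mid:]):
            if len(half) == 1:
                new_edges[node].append(half[0])
            else:
                child = fresh(v)
                new_edges[node].append(child)
                attach(child, v, half)

    for v, succs in edges.items():
        succs = list(dict.fromkeys(succs))  # drop duplicate edges
        if v in succs and len(succs) > 1:
            # Self loop: keep it, and route the other successors
            # through a single second successor.
            others = [u for u in succs if u != v]
            new_edges[v].append(v)
            if len(others) == 1:
                new_edges[v].append(others[0])
            else:
                root = fresh(v)
                new_edges[v].append(root)
                attach(root, v, others)
        else:
            attach(v, v, succs)
    return new_edges, new_owner, origin


def original_successors(new_edges, originals, v):
    """Original vertices reachable from v through fresh vertices only."""
    found, stack = set(), list(new_edges[v])
    while stack:
        u = stack.pop()
        if u in originals:
            found.add(u)
        else:
            stack.extend(new_edges[u])
    return found
```

On any input arena in which every vertex has at least one successor, one can then check the three properties used in the proof: every vertex of the new arena has one or two successors, the number of vertices is at most $n^2$, and `original_successors` returns, for each original vertex, exactly its successor set in the original arena.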