Subgame-perfect Equilibria in Mean-payoff Games

Abstract

In this paper, we provide an effective characterization of all the subgame-perfect equilibria in infinite duration games played on finite graphs with mean-payoff objectives. To this end, we introduce the notion of requirement, and the notion of negotiation function. We establish that the plays that are supported by SPEs are exactly those that are consistent with the least fixed point of the negotiation function. Finally, we show that the negotiation function is piecewise linear, and can be analyzed using the linear algebraic tool box. As a corollary, we prove the decidability of the SPE constrained existence problem, whose status was left open in the literature.


arXiv [cs.GT], Jan

Léonard Brice

ENS Paris-Saclay, France

Jean-François Raskin

Université libre de Bruxelles (ULB), Belgium

Marie Van Den Bogaard

Université Gustave Eiffel, LIGM, France


I. INTRODUCTION

The notion of Nash equilibrium (NE) is one of the most important and most studied solution concepts in game theory. A profile of strategies is an NE when no rational player has an incentive to change their strategy unilaterally, i.e. while the other players keep their strategies. Thus an NE models a stable situation. Unfortunately, it is well known that, in sequential games, NEs suffer from the problem of non-credible threats, see e.g. [16]. In those games, some NEs only exist when some players do not play rationally in subgames and so use non-credible threats to force the NE. This is why, in sequential games, the stronger notion of subgame-perfect equilibrium is used instead: a profile of strategies is a subgame-perfect equilibrium (SPE) if it is an NE in all the subgames of the sequential game. Thus SPE imposes rationality even after a deviation has occurred.

In this paper, we study sequential games that are infinite duration games played on graphs with mean-payoff objectives, and focus on SPEs. While NEs are guaranteed to exist in infinite duration games played on graphs with mean-payoff objectives, it is known that this is not the case for SPEs, see e.g. [17], [3]. We provide in this paper a constructive characterization of the entire set of SPEs, which allows us to decide, among others, the SPE existence problem. This problem was left open in previous contributions on the subject. More precisely, our contributions are described in the next paragraphs.

Contributions.

First, we introduce two important new notions that allow us to capture NEs, and more importantly SPEs, in infinite duration games played on graphs with mean-payoff objectives: the notion of requirement and the notion of negotiation function. (A large part of our results apply to the larger class of games with prefix-independent objectives. For the sake of readability of this introduction, we focus here on mean-payoff games, but the technical results in the paper usually cover broader classes of games.)

A requirement $\lambda$ is a function that assigns to each vertex $v \in V$ of a game graph a value in $\mathbb{R} \cup \{-\infty, +\infty\}$. The value $\lambda(v)$ represents a requirement on any play $\rho = \rho_0 \rho_1 \dots \rho_n \dots$ that traverses this vertex: if we want the player that controls the vertex $v$ to follow $\rho$ and to give up deviating from $\rho$, then the play must offer a payoff to this player that is at least $\lambda(v)$. An infinite play $\rho$ is $\lambda$-consistent if, for each player $i$, the payoff of $\rho$ for player $i$ is larger than or equal to the largest value of $\lambda$ on vertices occurring along $\rho$ and controlled by player $i$.

We first establish that if $\lambda$ maps a vertex $v$ to the largest value that the player that controls $v$ can secure against a fully adversarial coalition of the other players, i.e. if $\lambda(v)$ is the zero-sum worst-case value, then the set of plays that are $\lambda$-consistent is exactly the set of plays that are supported by an NE (Theorem 1).

As SPEs force players to play rationally in all subgames, we cannot rely on the zero-sum worst-case value to characterize them. Indeed, when considering the worst-case value, we allow adversaries to play fully adversarially after a deviation, and so potentially in an irrational way w.r.t. their own objective. In fact, in an SPE, a player is deterred from deviating when opposed by a coalition of rational adversaries. To characterize this relaxation of the notion of worst-case value, we rely on our notion of negotiation function.

The negotiation function operates from the set of requirements into itself. To understand the purpose of the negotiation function, let us consider its application to the requirement $\lambda$ that maps every vertex $v$ to the worst-case value as above. Now, we can naturally formulate the following question: given $v$ and $\lambda$, can the player that controls $v$ improve the value that they can ensure against all the other players if only plays that are consistent with $\lambda$ are proposed by the other players? In other words, can this player enforce a better value when playing against the other players if those players are not willing to give away their own worst-case value? Clearly, securing this worst-case value can be seen as a minimal goal for any rational adversary. So $\mathrm{nego}(\lambda)(v)$ returns this value. But now this reasoning can be iterated. One of the main contributions of this paper is to show that the least fixed point $\lambda^*$ of the negotiation function exactly characterizes the set of plays supported by SPEs (Theorem 2).

To turn this fixed point characterization of SPEs into algorithms, we additionally draw links between the negotiation function and two classes of zero-sum games, called abstract and concrete negotiation games (see Theorem 3). We show that the latter can be solved effectively and allow, given $\lambda$, to compute $\mathrm{nego}(\lambda)$ (Lemma 5). While solving concrete negotiation games allows us to compute $\mathrm{nego}(\lambda)$ for any requirement $\lambda$, and even if the function $\mathrm{nego}(\cdot)$ is monotone and Scott-continuous (Proposition 2), a direct application of the Kleene-Tarski fixed point theorem is not sufficient to obtain an effective algorithm to compute $\lambda^*$. Indeed, we give examples that require a transfinite number of iterations to converge to the least fixed point. To provide an algorithm to compute $\lambda^*$, we show that the function $\mathrm{nego}(\cdot)$ is piecewise linear and we provide an effective representation of this function (Theorem 4). This effective representation can then be used to extract all its fixed points, and in particular its least fixed point, using linear algebraic techniques. Finally, let us note that all our results are also shown to extend to $\varepsilon$-SPEs, which are quantitative relaxations of SPEs.

Related works.

Non-zero sum infinite duration games have attracted considerable attention in recent years, with applications targeting reactive synthesis problems. We refer the interested reader to the survey papers [1], [5] and their references for the relevant literature. We detail below contributions more closely related to the work presented here.

In [4], Brihaye et al. offer a characterization of NEs in quantitative games for cost-prefix-linear reward functions, based on the worst-case value. The mean-payoff is cost-prefix-linear. In their paper, the authors do not consider the stronger notion of SPE, which is the central solution concept studied in our paper.

In [18], Ummels proves that there always exists an SPE in games with $\omega$-regular objectives and defines algorithms based on tree automata to decide constrained SPE problems. Strategy logics, see e.g. [10], can be used to encode the concept of SPE in the case of $\omega$-regular objectives, with application to the rational synthesis problem [13] for instance. The mean-payoff reward function is not $\omega$-regular, and so the techniques defined there cannot be used in our setting. Furthermore, as already recalled above, see e.g. [20], [3], contrary to the $\omega$-regular case, SPEs in games with mean-payoff objectives may fail to exist.

In [3], Brihaye et al. introduce and study the notion of weak subgame-perfect equilibria, which is a weakening of the classical notion of SPE. This weakening is equivalent to the original SPE concept on reward functions that are continuous. This is the case, for example, for the quantitative reachability reward function. On the contrary, the mean-payoff cost function is not continuous, and the techniques used in [3], and generalized in [8], cannot be used to characterize SPEs for the mean-payoff reward function.

In [11], Flesch et al. show that the existence of $\varepsilon$-SPEs is guaranteed when the reward function is lower-semicontinuous, which is not the case of the mean-payoff reward function.

In [6], Bruyère et al. study secure equilibria, which are a refinement of NEs. Secure equilibria are not subgame-perfect and are, as classical NEs, subject to non-credible threats in sequential games.

In [2], Brihaye et al. solve the problem of the existence of SPEs on quantitative reachability games. Their techniques rely on the property that the quantitative reachability reward function is continuous, which implies that in that case weak SPEs and SPEs are equivalent. This is not the case for the mean-payoff reward function.

In [15], Noémie Meunier develops a method based on Prover-Challenger games to solve the problem of the existence of SPEs on games with a finite number of possible outcomes. This method is not applicable to the mean-payoff reward function, as the number of outcomes in this case is uncountably infinite.

Structure of the paper.

In Sect. 2, we introduce the necessary background. Sect. 3 defines the notion of requirement and the negotiation function. Sect. 4 contains the main technical contribution of the paper, which shows that the plays that are supported by an SPE are exactly those that are $\lambda^*$-consistent, where $\lambda^*$ is the least fixed point of the negotiation function. Sect. 5 draws a link between the negotiation function and negotiation games. Finally, Sect. 6 establishes that the negotiation function is effectively piecewise linear. All the detailed proofs of our results can be found in a clearly identified appendix, and a large number of examples are provided in the main part of the paper to illustrate the main ideas behind our new concepts and constructions.

II. BACKGROUND

In all that follows, we will use the word game for the infinite duration turn-based quantitative games on graphs with complete information.

Definition 1 (Game). A game is a tuple $G = (\Pi, V, (V_i)_{i \in \Pi}, E, \mu)$, where:
• $\Pi$ is a finite set of players;
• $(V, E)$ is a directed graph, whose vertices are sometimes called states and whose edges are sometimes called transitions, and in which for each $v \in V$, the set $\{w \in V \mid (v, w) \in E\}$ of the states directly accessible from $v$ is nonempty;
• $(V_i)_{i \in \Pi}$ is a partition of $V$, in which $V_i$ is the set of states controlled by player $i$;
• $\mu : V^\omega \to \mathbb{R}^\Pi$ is an outcome function.

For simplicity of writing, a transition $(v, w) \in E$ will often be written $vw$.

When $\mu$ is an outcome function and $i$ is a player, $\mu_i$ denotes the $i$-th component of $\mu$: if $\mu(\rho) = \bar{x} = (x_i)_{i \in \Pi}$, then $\mu_i(\rho) = x_i$. That quantity is the payoff of player $i$ in $\rho$.

Definition 2 (Initialized game). An initialized game is a tuple $(G, v_0)$, often written $G_{\restriction v_0}$, where $G$ is a game and $v_0 \in V$ is a state called initial state. Moreover, the game $G_{\restriction v_0}$ is well-initialized if any state of $G$ is accessible from $v_0$ in the graph $(V, E)$.

Definition 3 (Play). A play in a game $G$ is an infinite word $\rho = \rho_0 \rho_1 \dots \in V^\omega$ such that for all $n$, we have $\rho_n \rho_{n+1} \in E$. It is also a play in the initialized game $G_{\restriction \rho_0}$. The set of plays in the game $G$ (resp. the initialized game $G_{\restriction v_0}$) is denoted by $\mathrm{Plays}\,G$ (resp. $\mathrm{Plays}\,G_{\restriction v_0}$).

Remark. In the literature, the word outcome can be used to name plays, and the word payoff to name what we call here outcome. Here, the word payoff will be used to refer to outcomes seen from the point of view of a given player; in other words, an outcome will be seen as the collection of all players' payoffs.
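As an informal illustration of Definitions 1 to 3, the graph part of a game can be encoded as a small data structure. The sketch below is only one possible encoding: the class name `Game`, its method names, and the four-state arena are hypothetical, and the outcome function is left out.

```python
from dataclasses import dataclass


@dataclass
class Game:
    """Arena of a game in the sense of Definition 1 (outcome function omitted)."""
    owner: dict   # vertex -> player controlling it; encodes the partition (V_i)
    edges: set    # set of transitions (v, w)

    def successors(self, v):
        return {w for (u, w) in self.edges if u == v}

    def is_valid(self):
        # Every transition links known vertices, and every vertex has a successor.
        known = all(u in self.owner and w in self.owner for (u, w) in self.edges)
        return known and all(self.successors(v) for v in self.owner)

    def is_path(self, h):
        # A finite sequence of vertices that follows the transitions of the graph.
        return all(v in self.owner for v in h) and all(
            (h[k], h[k + 1]) in self.edges for k in range(len(h) - 1))

    def is_lasso_play(self, prefix, cycle):
        # The infinite word prefix . cycle^omega is a play (Definition 3)
        # iff prefix . cycle . cycle[0] is a path in the graph.
        return len(cycle) > 0 and self.is_path(list(prefix) + list(cycle) + [cycle[0]])

    def is_well_initialized(self, v0):
        # Every vertex is accessible from v0 (Definition 2).
        seen, stack = {v0}, [v0]
        while stack:
            for w in self.successors(stack.pop()):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen == set(self.owner)


# Hypothetical two-player arena; vertex and player names are illustrative.
arena = Game(
    owner={"a": "circle", "b": "square", "c": "circle", "d": "square"},
    edges={("a", "b"), ("b", "a"), ("a", "c"), ("b", "d"), ("c", "c"), ("d", "d")},
)
```

Only ultimately periodic plays are handled here, since they are the ones that admit a finite description as a prefix followed by a repeated cycle.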

Definition 4 (History). A history in a game $G$ is a finite prefix $h_0 \dots h_n$ of a play in $G$. If it is nonempty, it is also a history in the initialized game $G_{\restriction h_0}$. The set of histories in the game $G$ (resp. the initialized game $G_{\restriction v_0}$) is denoted by $\mathrm{Hist}\,G$ (resp. $\mathrm{Hist}\,G_{\restriction v_0}$).

Definition 5 (Strategy). Let $i$ be a player in an initialized game $G_{\restriction v_0}$. A strategy for player $i$ is a function $\sigma_i : \{hv \in \mathrm{Hist}\,G_{\restriction v_0} \mid v \in V_i\} \to V$ such that $v\sigma_i(hv)$ is an edge of $(V, E)$ for all $hv$.

A play $\rho$ is compatible with a strategy $\sigma_i$ if and only if $\rho_{n+1} = \sigma_i(\rho_0 \dots \rho_n)$ for all $n$ such that $\rho_n \in V_i$. A history $h$ is compatible with a strategy $\sigma_i$ if it is the prefix of a play that is compatible with $\sigma_i$.

Definition 6 (Strategy profile). Let $P \subseteq \Pi$ be a set of players in an initialized game $G_{\restriction v_0}$. A strategy profile for $P$ is a tuple $\bar\sigma_P = (\sigma_i)_{i \in P}$ where for each $i$, $\sigma_i$ is a strategy for player $i$ in $G_{\restriction v_0}$.

A complete strategy profile is a strategy profile for $\Pi$, the set of all the players in the considered game: then, it is simply written $\bar\sigma$. A play or a history is compatible with a strategy profile $\bar\sigma_P$ if it is compatible with every strategy $\sigma_i$ for $i \in P$.

In a strategy profile $\bar\sigma_P$, the domains of the $\sigma_i$ are pairwise disjoint. Therefore, we can consider $\bar\sigma_P$ as one function: for $hv \in \mathrm{Hist}\,G_{\restriction v_0}$ such that $v \in \bigcup_{i \in P} V_i$, we liberally write $\bar\sigma_P(hv)$ for $\sigma_i(hv)$ with $i$ such that $v \in V_i$. For any complete strategy profile $\bar\sigma$, there exists only one play in $G_{\restriction v_0}$ compatible with every $\sigma_i$, denoted by $\langle \bar\sigma \rangle_{v_0}$.

When $i$ is a player and when the context is clear, we will often write $-i$ for the set $\Pi \setminus \{i\}$. Then, a strategy profile for all the players except player $i$ will typically be written $\bar\sigma_{-i}$. We will often refer to $\Pi \setminus \{i\}$ as the environment against player $i$.

When $\bar\tau_P$ and $\bar\tau'_Q$ are two strategy profiles with $P \cap Q = \emptyset$, $(\bar\tau_P, \bar\tau'_Q)$ denotes the strategy profile $\bar\sigma_{P \cup Q}$ such that $\sigma_i = \tau_i$ for $i \in P$, and $\sigma_i = \tau'_i$ for $i \in Q$.

Before moving on to SPEs, let us recall the notion of Nash equilibrium.

Definition 7 (Nash equilibrium). Let $G_{\restriction v_0}$ be an initialized game. The strategy profile $\bar\sigma$ is a Nash equilibrium, or NE for short, in $G_{\restriction v_0}$, if and only if for each player $i$ and for every strategy $\sigma'_i$, called deviation of $\sigma_i$, we have the inequality:
$$\mu_i(\langle \sigma'_i, \bar\sigma_{-i} \rangle_{v_0}) \le \mu_i(\langle \bar\sigma \rangle_{v_0}).$$

To define SPEs, we need the notion of subgame.

Definition 8 (Subgame). Let $G = (\Pi, V, (V_i)_i, E, \mu)$ be a game, and let $hv$ be a history in $G$. The subgame of $G$ after $hv$, denoted by $G_{\restriction hv}$, is the initialized game $(\Pi, V, (V_i)_i, E, \mu_{\restriction hv})_{\restriction v}$, where:
$$\mu_{\restriction hv} : vV^\omega \to \mathbb{R}^\Pi, \quad v\rho \mapsto \mu(hv\rho).$$

Remark. A subgame is initialized, but defined from a game which is not: that is why $G_{\restriction v}$ denotes the game $G$ initialized in the state $v$, which is also the subgame of $G$ after the one-state history $v$.

Definition 9 (Substrategy). Let $G_{\restriction v_0}$ be an initialized game, $\sigma_i$ be a strategy for some player $i$, and $hv$ be a history in $G_{\restriction v_0}$. Then, the substrategy of $\sigma_i$ after $hv$, denoted by $\sigma_{i \restriction hv}$, is the strategy in the subgame $G_{\restriction hv}$:
$$\sigma_{i \restriction hv} : vh' \mapsto \sigma_i(hvh').$$

Definition 10 (Subgame-perfect equilibrium). Let $G_{\restriction v_0}$ be an initialized game. The strategy profile $\bar\sigma$ is a subgame-perfect equilibrium, or SPE for short, in $G_{\restriction v_0}$, if and only if for every history $h$ in $G_{\restriction v_0}$, the strategy profile $\bar\sigma_{\restriction h}$ is a Nash equilibrium in the subgame $G_{\restriction h}$.

The notion of subgame-perfect equilibrium can be seen as a refinement of Nash equilibrium: it is a stronger equilibrium, which excludes players resorting to non-credible threats.

Example 1. In the game represented in Figure 1, where the square state is controlled by player $\Box$ and the round states by player $\bigcirc$, if both players get the payoff 1 by reaching the state $d$ and the payoff 0 in the other cases, there are actually two NEs: one, in blue, where player $\Box$ goes to the state $b$ and then player $\bigcirc$ goes to the state $d$, and both win; and one, in red, where player $\Box$ goes to the state $c$ because player $\bigcirc$ was planning to go to the state $e$. However, only the blue one is an SPE, as moving from $b$ to $e$ is irrational for player $\bigcirc$ in the subgame $G_{\restriction ab}$.

An $\varepsilon$-SPE is a strategy profile which is almost an SPE, meaning that if a player deviates after some history, they will not be able to improve their payoff by more than a quantity $\varepsilon \ge 0$.

Definition 11 ($\varepsilon$-SPE). Let $G_{\restriction v_0}$ be an initialized game, and $\varepsilon \ge 0$. A strategy profile $\bar\sigma$ from $v_0$ is an $\varepsilon$-SPE if and only if for every history $hv$, for every player $i$, for every strategy $\sigma'_i$, we have:
$$\mu_i(\langle \bar\sigma_{-i \restriction hv}, \sigma'_{i \restriction hv} \rangle_v) \le \mu_i(\langle \bar\sigma_{\restriction hv} \rangle_v) + \varepsilon.$$

Remark. A $0$-SPE is an SPE.

Fig. 1. A game with two NEs and one SPE
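The difference between NEs and SPEs described in Example 1 can be checked by brute force on a small tree. The sketch below is a simplified, hypothetical version of the game of Figure 1: only the states a, b, c, d, e are kept, the states c, d and e are treated as terminal, and the names `CHOICES`, `OWNER` and `PAYOFF` are illustrative. Since the graph is a tree, a strategy can be given as one move per decision state.

```python
from itertools import product

# Simplified, hypothetical version of the game of Figure 1.
CHOICES = {"a": ["b", "c"], "b": ["d", "e"]}     # moves available at each decision state
OWNER = {"a": "square", "b": "circle"}           # player controlling each decision state
PAYOFF = {"d": {"square": 1, "circle": 1},       # both players win by reaching d
          "c": {"square": 0, "circle": 0},
          "e": {"square": 0, "circle": 0}}


def outcome(profile, start):
    """Terminal state reached from `start` when everybody follows `profile`."""
    state = start
    while state in CHOICES:
        state = profile[state]
    return state


def is_nash(profile, start):
    """No player can gain by changing their own choices, the others being fixed."""
    base = PAYOFF[outcome(profile, start)]
    for player in set(OWNER.values()):
        own = [s for s in CHOICES if OWNER[s] == player]
        for moves in product(*(CHOICES[s] for s in own)):
            deviation = {**profile, **dict(zip(own, moves))}
            if PAYOFF[outcome(deviation, start)][player] > base[player]:
                return False
    return True


def is_subgame_perfect(profile):
    """The profile must be an NE in the subgame rooted at every decision state."""
    return all(is_nash(profile, s) for s in CHOICES)


profiles = [dict(zip(CHOICES, moves)) for moves in product(*CHOICES.values())]
nash = [p for p in profiles if is_nash(p, "a")]
spe = [p for p in profiles if is_subgame_perfect(p)]
```

On this simplified tree, `nash` contains the two profiles described in Example 1, and `spe` keeps only the one where the move from b leads to d.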

In this article, we will focus on prefix-independent games, and in particular mean-payoff-inf games.

Definition 12 (Mean-payoff-inf game). A mean-payoff-inf game is a game $G = (\Pi, V, (V_i)_i, E, \mu)$, where $\mu$ is defined from a function $\pi : E \to \mathbb{Q}^\Pi$, called weight function, by, for each player $i$:
$$\mu_i : \rho \mapsto \liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \pi_i(\rho_k \rho_{k+1}).$$

In a mean-payoff-inf game, the weight given by the function $\pi$ represents the immediate reward that each action gives to each player. The final payoff of each player is their average payoff along the play, defined as the limit inferior over $n$ (since the limit may not be defined) of the average payoff after $n$ steps.

Definition 13 (Prefix-independent game). A game $G = (\Pi, V, (V_i)_i, E, \mu)$ is prefix-independent if for every history $h$ and for every play $\rho$, $\mu(h\rho) = \mu(\rho)$. We also say, in that case, that the outcome function $\mu$ is prefix-independent.

Remark. Mean-payoff-inf games are prefix-independent.

Before moving on to some examples, we recall a few classical results about two-player zero-sum games.
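As a side illustration of Definitions 12 and 13, the following sketch computes mean payoffs of ultimately periodic plays, i.e. plays of the form prefix · cycle^ω. For such plays the running averages converge, so the limit inferior is the average weight along the cycle and the prefix plays no role. The weight table `PI`, the player indices 0 and 1, and the function names are assumptions made for the example.

```python
from fractions import Fraction

# Hypothetical weight function pi on a two-state graph: each edge carries a
# pair (weight for player 0, weight for player 1).
PI = {("a", "a"): (0, 1), ("a", "b"): (2, 2), ("b", "a"): (2, 2), ("b", "b"): (1, 0)}


def vertex_at(prefix, cycle, k):
    """k-th vertex of the play prefix . cycle^omega."""
    if k < len(prefix):
        return prefix[k]
    return cycle[(k - len(prefix)) % len(cycle)]


def running_average(prefix, cycle, player, n):
    """Average weight for `player` over the first n transitions of the play."""
    total = sum(
        PI[(vertex_at(prefix, cycle, k), vertex_at(prefix, cycle, k + 1))][player]
        for k in range(n))
    return Fraction(total, n)


def mean_payoff(cycle, player):
    """Mean payoff of any play ending in cycle^omega.

    The running averages of an ultimately periodic play converge, so the
    limit inferior equals the average weight along the cycle.
    """
    edges = zip(cycle, cycle[1:] + cycle[:1])   # closing edge of the cycle included
    return Fraction(sum(PI[e][player] for e in edges), len(cycle))


cycle = list("aaabbb")
payoffs = (mean_payoff(cycle, 0), mean_payoff(cycle, 1))
```

Comparing `running_average(["b", "b"], cycle, 0, n)` with `mean_payoff(cycle, 0)` for growing `n` shows the influence of the prefix vanishing. Plays that are not ultimately periodic are out of reach of this sketch, and they are precisely the ones for which the limit inferior may differ from any cycle average.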

Definition 14 (Zero-sum game). A game $G = (\Pi, V, (V_i)_i, E, \mu)$ is zero-sum if for every play $\rho$, we have:
$$\sum_{i \in \Pi} \mu_i(\rho) = 0.$$

Definition 15 (Borelian game). A game $G = (\Pi, V, (V_i)_i, E, \mu)$ is Borelian if the function $\mu$, from the set $V^\omega$ equipped with the product topology to the Euclidean space $\mathbb{R}^\Pi$, is Borelian, i.e. if for any Borelian set $B \subseteq \mathbb{R}^\Pi$, the set $\mu^{-1}(B)$ is Borelian.

Proposition 1 (Determinacy of Borelian two-player games [14]). Let $G_{\restriction v_0} = (\{1, 2\}, V, (V_i)_i, E, \mu)_{\restriction v_0}$ be an initialized two-player zero-sum Borelian game, with a distinguished player $1$. Then, we have the following equality:
$$\sup_{\sigma_1} \inf_{\sigma_2} \mu_1(\langle \bar\sigma \rangle_{v_0}) = \inf_{\sigma_2} \sup_{\sigma_1} \mu_1(\langle \bar\sigma \rangle_{v_0}).$$

Definition 16 (Value of a zero-sum game). Let $G_{\restriction v_0} = (\{1, 2\}, V, (V_i)_i, E, \mu)_{\restriction v_0}$ be an initialized two-player zero-sum Borelian game, with a distinguished player $1$. Then, the quantity:
$$\sup_{\sigma_1} \inf_{\sigma_2} \mu_1(\langle \bar\sigma \rangle_{v_0}) = \inf_{\sigma_2} \sup_{\sigma_1} \mu_1(\langle \bar\sigma \rangle_{v_0})$$
is called value of $G_{\restriction v_0}$, and denoted by $\mathrm{val}(G_{\restriction v_0})$.

In the two following examples, we illustrate the problem of the existence of SPEs in mean-payoff games.

Example 2. Let $G$ be the mean-payoff-inf game of Figure 2, where for every edge, the left number is the weight for player $\bigcirc$, and the right number is the weight for player $\Box$. No weight is given for the edges $ac$ and $bd$ since they can be used only once, and therefore do not influence the final payoff.

As shown in [7], this game does not have any SPE, neither from the state $a$ nor from the state $b$. The idea is the following:
• The only NE plays from the state $b$ are the plays where player $\Box$ eventually leaves the cycle $ab$ and goes to $d$: if they stay in the cycle $ab$, then player $\bigcirc$ would be better off leaving it, and if she does, player $\Box$ would be better off leaving it before.
• From the state $a$, if player $\bigcirc$ knows that player $\Box$ will leave, she has no incentive to do it before: there is no NE where $\bigcirc$ leaves the cycle and $\Box$ plans to do it if ever she does not. Therefore, there is no SPE where $\bigcirc$ leaves the cycle.
• But then, after a history that terminates in $b$, player $\Box$ has actually no incentive to leave if player $\bigcirc$ never plans to do it afterwards: contradiction.

Throughout the remainder of the paper, we will show how to apply our method to that example.

Fig. 2. A game without SPE

Fig. 3. A game with an infinity of SPEs

Example 3.
Let us now study the game of Figure 3. Using techniques from [9], we can represent the outcomes of possible plays in that game as in Figure 4 (gray and blue areas).

Fig. 4. The outcomes of plays and SPE plays in the game of Figure 3

Following exclusively one of the three simple cycles $a$, $ab$, $b$ of the game graph during a play yields the outcomes $(0, 1)$, $(2, 2)$ and $(1, 0)$, respectively. By combining those cycles with well chosen frequencies, one can obtain any outcome in the convex hull of those three points.

Now, it is also possible to obtain the point $(0, 0)$ by using the properties of the limit inferior: it is for instance the outcome of the play:
$$a^{2} b^{4} a^{16} b^{256} \dots a^{2^{2^{n}}} b^{2^{2^{n+1}}} \dots$$
In fact, one can construct a play that yields any outcome in the convex hull of the four points $(0, 0)$, $(0, 1)$, $(2, 2)$ and $(1, 0)$.

We claim that the outcomes of SPE plays correspond to the entire blue area in Figure 4: there exists an SPE $\bar\sigma$ in $G_{\restriction a}$ with $\langle \bar\sigma \rangle_a = \rho$ if and only if $\mu_{\bigcirc}(\rho), \mu_{\Box}(\rho) \ge 1$.

That statement is a direct consequence of the results we show in the remaining sections, but let us give a first intuition: a play with such an outcome necessarily uses both states infinitely often. It is an NE play because none of the players can get a better payoff by looping forever on their state, and they can both force each other to follow that play, by threatening to loop forever on their state whenever they can. But such a strategy profile is clearly not an SPE.

It can be transformed into an SPE as follows: when a player deviates, say player $\Box$, then player $\bigcirc$ can punish him by looping on $a$, not forever, but a great number of times, until player $\Box$'s mean-payoff gets very close to $1$. Afterwards, both players follow again the play that was initially planned. Since that threat is temporary, it does not affect player $\bigcirc$'s payoff in the long term, but it really punishes player $\Box$ if he tries to deviate infinitely often.

For example, let us consider the play $(aaabbb)^\omega$, with outcome $(1, 1)$. An SPE which generates that play is the following: when player $\bigcirc$ has to play, by default, she loops twice on $a$, and when player $\Box$ has to play, he loops twice on $b$. But if player $\bigcirc$ goes to $b$ without having looped twice on $a$, then player $\Box$ stays in $b$ until $\bigcirc$'s payoff passes below $1 + \frac{1}{n}$, where $n$ is the number of times player $\bigcirc$ has already deviated; then and only then, he plays as it was planned. And if player $\Box$ deviates, player $\bigcirc$ punishes him the same way.

III. REQUIREMENTS AND NEGOTIATION

We will now see that SPEs are strategy profiles that respect some requirements about the payoffs, depending on the states they traverse. In this part, we develop the notions of requirement and negotiation.

A. Requirement

In the method we will develop further, we will need to analyze the players' behaviour when they have some requirement to satisfy. Intuitively, one can see requirements as rationality constraints for the players, that is, a threshold payoff value under which a player will not accept to follow a play.

In all that follows, $\overline{\mathbb{R}}$ denotes the set $\mathbb{R} \cup \{\pm\infty\}$.

Definition 17 (Requirement). A requirement on the game $G$ is a function $\lambda : V \to \overline{\mathbb{R}}$.

For a given state $v$, the quantity $\lambda(v)$ represents the minimal payoff that the player controlling $v$ will require in a play beginning in $v$.

Definition 18 ($\lambda$-consistency). Let $\lambda$ be a requirement on a game $G$. A play $\rho$ in the game $G$ is $\lambda$-consistent if and only if for all $n \in \mathbb{N}$, for $i \in \Pi$ such that $\rho_n \in V_i$, we have $\mu_i(\rho_n \rho_{n+1} \dots) \ge \lambda(\rho_n)$.

The set of the $\lambda$-consistent plays from a state $v$ is denoted by $\lambda\mathrm{Cons}(v)$.

Definition 19 (Satisfiability). A requirement $\lambda$ on the initialized game $G_{\restriction v_0}$ is satisfiable if and only if for each $v$ accessible from $v_0$, there exists a $\lambda$-consistent play in the game $G_{\restriction v}$.

Definition 20 ($\lambda$-rationality). Let $\lambda$ be a requirement on a mean-payoff-inf game $G$. Let $i \in \Pi$. A strategy profile $\bar\sigma_{-i}$ is $\lambda$-rational if and only if there exists a strategy $\sigma_i$ such that for every history $hv$ compatible with $\bar\sigma_{-i}$, the play $\langle \bar\sigma_{\restriction hv} \rangle_v$ is $\lambda$-consistent. We then say that the strategy $\sigma_i$ $\lambda$-rationalizes the strategy profile $\bar\sigma_{-i}$.

The set of $\lambda$-rational strategy profiles in $G_{\restriction v}$ is denoted by $\lambda\mathrm{Rat}(v)$.

Note that $\lambda$-rationality is a property of a strategy profile for all the players but one, player $i$. Intuitively: all players (including player $i$) have some requirement to satisfy. The players other than $i$ form a coalition: they collectively choose their strategy profile, and they propose a strategy to player $i$, so that if player $i$ eventually follows that strategy (i.e. possibly after a finite number of deviations), then every player has their requirements satisfied.

Remark. A requirement $\lambda$ is satisfiable in $G_{\restriction v_0}$ if and only if for some player $i$, there exists a $\lambda$-rational strategy profile $\bar\sigma_{-i}$ from $v_0$.

Finally, let us define a particular requirement: the vacuous requirement, which requires nothing.

Definition 21 (Vacuous requirement). In any game, the vacuous requirement, denoted by $\lambda_0$, is the requirement constantly equal to $-\infty$.

Remark. Any play is $\lambda_0$-consistent.

Fig. 5. The requirement $\lambda_1$ on the game of Example 2

B. Negotiation

We will show that SPEs in prefix-independent games are characterized by the fixed points of a function on requirements. That function can be seen as a negotiation: when a player has a requirement to satisfy, another player can hope for a better payoff than what they can secure in general, and therefore update their own requirement.

Definition 22 (Negotiation function). Let $G$ be a game. The negotiation function is the function that transforms any requirement $\lambda$ on $G$ into a requirement $\mathrm{nego}(\lambda)$ on $G$, such that for each $i \in \Pi$ and $v \in V_i$,
$$\mathrm{nego}(\lambda)(v) = \inf_{\bar\sigma_{-i} \in \lambda\mathrm{Rat}(v)} \sup_{\sigma_i} \mu_i(\langle \bar\sigma \rangle_v),$$
with the convention $\inf \emptyset = +\infty$.

Remark. The negotiation function has the following properties:
• The requirement $\lambda$ is not satisfiable from $v$ if and only if $\mathrm{nego}(\lambda)(v) = +\infty$.
• The negotiation function is monotone: if $\lambda \le \lambda'$ (for the pointwise order, i.e. if for each $v$, $\lambda(v) \le \lambda'(v)$), then $\mathrm{nego}(\lambda) \le \mathrm{nego}(\lambda')$.
• The negotiation function is non-decreasing: for every $\lambda$, we have $\lambda \le \mathrm{nego}(\lambda)$.

In the general case, the quantity $\mathrm{nego}(\lambda)(v)$ represents the worst-case value that the player controlling $v$ can ensure, assuming that the other players play $\lambda$-rationally.

Example 4. Let us consider the game of Example 2: we represented it again in Figure 5, with the requirement $\lambda_1 = \mathrm{nego}(\lambda_0)$, which is easy to compute since any strategy profile is $\lambda_0$-rational: for each $v$, $\lambda_1(v)$ is the classical worst-case value or antagonistic value of $v$, i.e. the best value the player controlling $v$ can enforce against a fully hostile environment.

Let us now compute the requirement $\lambda_2 = \mathrm{nego}(\lambda_1)$.
• From the state $c$, there exists exactly one $\lambda_1$-rational strategy profile for the opponent of the player controlling $c$, which is the empty strategy since that opponent never has to choose anything. Against that strategy, the best and the only payoff the player controlling $c$ can get is $1$, hence $\lambda_2(c) = 1$.
• For the same reasons, $\lambda_2(d) = 2$.
• From the state $b$, player $\bigcirc$ can force $\Box$ to get the payoff $2$ or less, with the strategy profile $\sigma_{\bigcirc} : h \mapsto c$. Such a strategy is $\lambda_1$-rational, rationalized by the strategy $\sigma_{\Box} : h \mapsto d$. Therefore, $\lambda_2(b) = 2$.
• Finally, from the state $a$, player $\Box$ can force $\bigcirc$ to get the payoff $2$ or less, with the strategy profile $\sigma_{\Box} : h \mapsto d$. Such a strategy is $\lambda_1$-rational, rationalized by the strategy $\sigma_{\bigcirc} : h \mapsto c$. But he cannot force her to get less than the payoff $2$, because she can force the access to the state $b$, and the only $\lambda_1$-consistent plays starting from $b$ are the plays of the form $(ba)^k b d^\omega$. Therefore, $\lambda_2(a) = 2$.

C. Steady negotiation

Often in what follows, we will need a game to be with steady negotiation, i.e. such that there always exists a worst $\lambda$-rational behaviour for the environment against a given player.

Definition 23 (Game with steady negotiation). A game $G$ is with steady negotiation if and only if for every player $i$, for every vertex $v$, and for every requirement $\lambda$ satisfiable from $v$, there exists a $\lambda$-rational strategy profile $\bar\sigma_{-i}$ from $v$ such that:
$$\inf_{\bar\sigma'_{-i} \in \lambda\mathrm{Rat}(v)} \sup_{\sigma'_i} \mu_i(\langle \bar\sigma' \rangle_v) = \sup_{\sigma_i} \mu_i(\langle \bar\sigma_{-i}, \sigma_i \rangle_v).$$

Remark. In particular, when a game is with steady negotiation, the infimum in the definition of negotiation is always reached.

It will be proved in Section V that mean-payoff-inf games are with steady negotiation.
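To make the notions of this section more concrete, the sketch below checks $\lambda$-consistency (Definition 18) on ultimately periodic plays of a mean-payoff-inf game. It relies on prefix-independence: every suffix of a play has the same payoffs as the play itself, so each visited vertex only has to be compared with the payoff of its owner. The encoding of the game of Example 2 used here (ownership of c and d, numeric weights) is an assumption chosen to agree with the payoffs discussed in Examples 2 and 4, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed encoding of the game of Example 2: the ownership of c and d and the
# weights are illustrative assumptions consistent with Examples 2 and 4.
OWNER = {"a": "circle", "b": "square", "c": "circle", "d": "square"}
WEIGHTS = {
    ("a", "b"): {"circle": 0, "square": 3},
    ("b", "a"): {"circle": 0, "square": 3},
    ("c", "c"): {"circle": 1, "square": 1},
    ("d", "d"): {"circle": 2, "square": 2},
}


def payoff(cycle, player):
    """Mean payoff of any play ending in cycle^omega: the average over the cycle."""
    edges = list(zip(cycle, cycle[1:] + cycle[:1]))
    return Fraction(sum(WEIGHTS[e][player] for e in edges), len(edges))


def is_consistent(lam, prefix, cycle):
    """Lambda-consistency (Definition 18) of the play prefix . cycle^omega.

    By prefix-independence, every suffix has the same payoffs as the whole
    play, so each visited vertex v only has to satisfy
    payoff(owner of v) >= lam[v].
    """
    return all(payoff(cycle, OWNER[v]) >= lam[v] for v in prefix + cycle)


LAMBDA_0 = {v: float("-inf") for v in OWNER}     # vacuous requirement
LAMBDA_1 = {"a": 1, "b": 2, "c": 1, "d": 2}      # worst-case values of Example 4
```

Under these assumptions, `is_consistent(LAMBDA_1, ["a"], ["c"])` and `is_consistent(LAMBDA_1, ["a", "b"], ["d"])` hold, whereas the play that stays forever in the cycle $ab$ is rejected because the player controlling $a$ only gets $0$ there. With `LAMBDA_0`, every play passes the check.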

D. Link with Nash equilibria

Requirements and the negotiation function are able to capture Nash equilibria. Indeed, if $\lambda_0$ is the vacuous requirement, then $\mathrm{nego}(\lambda_0)$ characterizes the plays that are supported by a Nash equilibrium (abbreviated as NE plays), in the following formal sense:

Theorem 1. Let $G$ be a game with steady negotiation. Then, a play $\rho$ in $G$ is an NE play if and only if $\rho$ is $\mathrm{nego}(\lambda_0)$-consistent.

Proof sketch: For a given state $v$, the value $\mathrm{nego}(\lambda_0)(v)$ is defined as the best payoff that the player controlling $v$ can ensure against any $\lambda_0$-rational strategy profile, that is, against any strategy profile: it is what is often called in the literature the antagonistic value or the worst-case value of $v$.

In a play that is not $\mathrm{nego}(\lambda_0)$-consistent, any player that does not have their requirement satisfied can deviate and ensure that requirement, and therefore has a profitable deviation.

Conversely, given a play that is $\mathrm{nego}(\lambda_0)$-consistent, we can construct an NE realizing that play, where all the players force each other to follow it, by threatening to play fully adversarially against whoever would deviate. The steady negotiation property guarantees the existence of a fully adversarial strategy profile.

See Appendix A for a detailed proof.

Example 5. Let us consider again the game of Example 2, with the requirement $\lambda_1$ given in Figure 5. The only $\lambda_1$-consistent plays in this game, starting from the state $a$, are $ac^\omega$, and $(ab)^k d^\omega$ with $k \ge 1$. One can check that those plays are exactly the NE plays in that game.

Fig. 6. The construction of the SPE $\bar\sigma$

In the following section, we will prove that, just as the requirement $\mathrm{nego}(\lambda_0)$ characterizes the NEs, the requirement that is the least fixed point of the negotiation function characterizes the SPEs.

IV. LINK BETWEEN NEGOTIATION AND SPEs

A. From negotiation fixed points to SPEs

The notion of negotiation will enable us to find the SPEs, but also more generally the $\varepsilon$-SPEs, in a game. For that purpose, we need the notion of $\varepsilon$-fixed points of a function.

Definition 24 ($\varepsilon$-fixed point). Let $\varepsilon \ge 0$, let $D$ be a finite set and let $f : \overline{\mathbb{R}}^D \to \overline{\mathbb{R}}^D$ be a mapping. A tuple $\bar{x} \in \overline{\mathbb{R}}^D$ is an $\varepsilon$-fixed point of $f$ if, writing $\bar{y} = f(\bar{x})$, we have $y_d \in [x_d - \varepsilon, x_d + \varepsilon]$ for each $d \in D$.

Remark. A $0$-fixed point is a fixed point.

We can now prove that in games with steady negotiation, the $\varepsilon$-fixed points $\lambda$ of the negotiation function are such that all $\lambda$-consistent plays are $\varepsilon$-SPE plays.

Lemma 1. Let $G_{\restriction v_0}$ be a well-initialized prefix-independent game with steady negotiation, and $\varepsilon \ge 0$. Let $\lambda$ be an $\varepsilon$-fixed point of the function $\mathrm{nego}$. Then, for every $\lambda$-consistent play $\xi$ starting in $v_0$, there exists an $\varepsilon$-SPE $\bar\sigma$ such that $\langle \bar\sigma \rangle_{v_0} = \xi$.

Proof sketch: Given a play $\xi$, the construction of $\bar\sigma$ can be represented as in Figure 6.

First, the play generated by $\bar\sigma$ is $\xi$. Then, whenever a player $i$ deviates from that play, the other players must punish them, and stay $\lambda$-rational: they follow the $\lambda$-rational strategy profile that minimizes player $i$'s payoff, whose existence is guaranteed by the steady negotiation property. Player $i$ plays a strategy that $\lambda$-rationalizes that strategy profile.

After a history where player $i$ has made serious mistakes, that is, choices that lower the best payoff they can ensure against a $\lambda$-rational environment, that environment must "reset" its strategy profile in order to be as hostile as it can be. Prefix-independence guarantees that those resets happen finitely many times.

We use the same construction if a second player $j$ deviates after finitely many deviations of player $i$, and so on.

See Appendix B for a detailed proof.

B. From SPEs to negotiation fixed points

Conversely, let us prove that every ε -SPE play is λ -consistent for some ε -ﬁxed point λ of the negotiation function. Lemma 2.

Let $G$ be a game, and let $\varepsilon \geq 0$. The negotiation function has a least $\varepsilon$-fixed point. A proof is given in Appendix D.
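The test of Definition 24 is easy to mechanize once a function is given as a callable. The following is a minimal sketch, assuming requirements are stored as dictionaries from states to floats (with `math.inf` standing for $+\infty$); the name `is_eps_fixed_point` and the shifted map used in the usage lines are illustrative assumptions, not part of the paper.

```python
import math
from typing import Callable, Dict, Hashable

Requirement = Dict[Hashable, float]


def is_eps_fixed_point(f: Callable[[Requirement], Requirement],
                       x: Requirement, eps: float) -> bool:
    """Check y_d in [x_d - eps, x_d + eps] for every coordinate d, where y = f(x)."""
    y = f(x)
    # With x_d = +inf the interval degenerates to {+inf}, so the all-infinite
    # tuple is accepted exactly when f maps it to itself.
    return all(x[d] - eps <= y[d] <= x[d] + eps for d in x)


# Hypothetical map that raises every coordinate by 0.5 (illustration only).
def shift_by_half(x: Requirement) -> Requirement:
    return {d: value + 0.5 for d, value in x.items()}


print(is_eps_fixed_point(shift_by_half, {"a": 1.0, "b": 2.0}, 0.5))
print(is_eps_fixed_point(shift_by_half, {"a": 1.0, "b": 2.0}, 0.25))
```

With $\varepsilon = 0$ the check reduces to ordinary fixed-point equality, in line with the remark that a $0$-fixed point is a fixed point.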

D. Theorem

The following theorem is an immediate consequence of the three previous lemmas, and sums up the link between the negotiation function and the SPEs, or $\varepsilon$-SPEs.

Theorem 2. Let $G_{\upharpoonright v_0}$ be an initialized prefix-independent game, and let $\varepsilon \geq 0$. Let $\lambda^*$ be the least $\varepsilon$-fixed point of the negotiation function. Let $\xi$ be a play starting in $v_0$. If there exists an $\varepsilon$-SPE $\bar\sigma$ such that $\langle \bar\sigma \rangle_{v_0} = \xi$, then $\xi$ is $\lambda^*$-consistent. The converse is true if $G$ is a game with steady negotiation.

Proof: First, let us recall that $\lambda^*$, the least $\varepsilon$-fixed point of the negotiation function, exists by Lemma 3.

If $\bar\sigma$ is an $\varepsilon$-SPE, then by Lemma 2, there exists an $\varepsilon$-fixed point $\lambda$ of the negotiation function such that all the plays generated by $\bar\sigma$ after some history are $\lambda$-consistent; in particular, the play $\xi$ is $\lambda$-consistent, and therefore $\lambda^*$-consistent since $\lambda^* \leq \lambda$.

Conversely, if $G$ is a game with steady negotiation, and if the play $\xi$ is $\lambda^*$-consistent, then by Lemma 1, there exists an $\varepsilon$-SPE $\bar\sigma$ such that $\langle \bar\sigma \rangle_{v_0} = \xi$.

In the following sections, we will develop a method to compute the negotiation function: we will prove that in the case of mean-payoff-inf games, it is actually a piecewise linear function, which makes it feasible to compute and express the set of its $\varepsilon$-fixed points, and therefore to find the least of them using classical linear algebraic techniques.

However, when one looks for a least fixed point, a usual method, under some continuity hypothesis, is to compute the limit of the iterations, by successive approximations based on the Kleene-Tarski theorem. We develop that possibility in the following subsection, confirming that the least fixed point of the negotiation function is indeed the limit of its iteration on the vacuous requirement $\lambda_0$, and explaining why it cannot always be used in practice.

[Fig. 7. A game where the negotiation function is not stationary]

E. An insufficient track: the negotiation sequence

We assume in this subsection that $G$ is a game such that the negotiation function is Scott-continuous, i.e. such that for every non-decreasing sequence $(\lambda_n)_n$ of requirements on $G$, we have:
$$\mathrm{nego}\left(\sup_n \lambda_n\right) = \sup_n \mathrm{nego}(\lambda_n).$$

The least fixed point of the negotiation function is then the limit of the negotiation sequence, defined as the sequence $(\lambda_n)_{n \in \mathbb{N}} = (\mathrm{nego}^n(\lambda_0))_n$.

Indeed, that sequence is non-decreasing; therefore it has a limit $\lambda$. By Scott-continuity, the equality $\lambda_{n+1} = \mathrm{nego}(\lambda_n)$ implies, when we take the suprema over $n$, that $\lambda$ is a fixed point of the negotiation function. If $\lambda^*$ is the least fixed point of the negotiation function, then $\lambda^* \leq \lambda$; on the other hand, $\lambda^* \geq \lambda$ by induction, since $\lambda^* \geq \lambda_0$ and, if $\lambda^* \geq \lambda_n$, then $\lambda^* = \mathrm{nego}(\lambda^*) \geq \mathrm{nego}(\lambda_n) = \lambda_{n+1}$ because the negotiation function is non-decreasing. Therefore, $\lambda^* = \lambda$.

In mean-payoff-inf games, in particular, the negotiation function is Scott-continuous:

Proposition 2. In mean-payoff-inf games, the negotiation function is Scott-continuous.
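The iteration described in this subsection can be sketched as a loop that stops as soon as two consecutive terms coincide. This is a minimal sketch under loud assumptions: the two maps below are hypothetical one-dimensional stand-ins for $\mathrm{nego}$ (they are not the negotiation function of any game), chosen so that one iteration becomes stationary and the other approaches its limit without ever reaching it. Exact rationals are used so that the stopping test is not blurred by floating-point rounding.

```python
from fractions import Fraction
from typing import Callable, List, Tuple


def iterate_until_stationary(f: Callable[[Fraction], Fraction],
                             start: Fraction,
                             max_steps: int) -> Tuple[List[Fraction], bool]:
    """Iterate f from start; report whether a term repeated within max_steps."""
    terms = [start]
    for _ in range(max_steps):
        nxt = f(terms[-1])
        if nxt == terms[-1]:
            return terms, True   # the sequence is stationary: limit reached
        terms.append(nxt)
    return terms, False          # budget exhausted without stabilizing


# Hypothetical stand-in whose iterates converge to 2 without reaching it.
def halving_map(x: Fraction) -> Fraction:
    return 1 + x / 2


# Hypothetical stand-in whose iterates reach their limit after finitely many steps.
def capped_map(x: Fraction) -> Fraction:
    return min(x + 1, Fraction(2))


terms, stationary = iterate_until_stationary(halving_map, Fraction(1), 20)
print(stationary, terms[-1])
terms, stationary = iterate_until_stationary(capped_map, Fraction(0), 20)
print(stationary, terms[-1])
```

The first call illustrates why a step budget is needed: without one, the loop would not terminate on a non-stationary sequence.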

A proof of Proposition 2 is given in Appendix I; note that it uses results that will be presented in Section V.

In many cases, the negotiation sequence is stationary, and in that case it is possible to compute its limit: as soon as a term is equal to the previous one, we know that the limit has been reached. But the negotiation sequence is not always stationary. The game of Figure 7, where each edge carries one weight per player (three labels in total), is a counter-example.

For all $n$, we have:
$$\lambda_n(a) = \lambda_n(b) = 2 - \frac{1}{2^{n-1}}.$$

Indeed, the game is symmetric, and the lowest payoff that can be proposed from the state $a$ to the player controlling $a$ is obtained by a combination of the cycles $ef$ and $f$ that has to satisfy the requirement of the player controlling $b$ in the state $b$, hence the following inductive equation:
$$\lambda_{n+1}(a) = 1 + \frac{1}{2} \lambda_n(a),$$
whose solution is the sequence given above. This sequence converges to $2$ but never reaches it. All the details of that statement, and a similar example with only two players, are given in Example 15, in Appendix K.

[Fig. 8. The play constructed by Prover and Challenger in the abstract negotiation game]

V. NEGOTIATION GAMES

We have now proved that SPEs are characterized by the requirements that are fixed points of the negotiation function; but we need to know how to compute, in practice, the quantity $\mathrm{nego}(\lambda)$ for a given requirement $\lambda$. In other words, we need an algorithm that computes, given a state $v_0$ controlled by a player $i$ in the game $G$, and given a requirement $\lambda$, which value player $i$ can ensure in $G_{\upharpoonright v_0}$ if the other players play $\lambda$-rationally. The concept of Prover-Challenger games, used for example in [15], gives us a tool for that purpose.

A. Abstract negotiation game

We first define an abstract negotiation game, which is conceptually simple but not directly usable for algorithmic purposes, because it is defined on an uncountably infinite state space.

Here is an intuitive definition of the abstract negotiation game $\mathrm{Abs}_{\lambda i}(G)_{\upharpoonright [v_0]}$ from a state $v_0$, a player $i$ and a requirement $\lambda$:
• player Prover proposes a $\lambda$-consistent play $\rho$ from $v_0$ (or loses, if she has no play to propose);
• either:
  – player Challenger accepts the play and the game terminates;
  – or he chooses an edge $\rho_k \rho_{k+1}$, with $\rho_k \in V_i$, from which he can make player $i$ deviate, using another edge $\rho_k w$ with $w \neq \rho_{k+1}$: then, the game starts again from $w$ instead of $v_0$;
• in the resulting play (either eventually accepted by Challenger, or constructed by infinitely many deviations), as represented in Figure 8, Prover wants player $i$'s payoff to be low, and Challenger wants it to be high.

That game gives us the basis of a method to compute $\mathrm{nego}(\lambda)$ from $\lambda$: if $\alpha$ is the maximal outcome that Challenger can ensure in $\mathrm{Abs}_{\lambda i}(G)_{\upharpoonright [v_0]}$, with $v_0 \in V_i$, then it is the maximal payoff that player $i$ can guarantee in $G_{\upharpoonright v_0}$ against a $\lambda$-rational environment. Hence the equality:
$$\mathrm{val}\left(\mathrm{Abs}_{\lambda i}(G)_{\upharpoonright [v_0]}\right) = \mathrm{nego}(\lambda)(v_0).$$

A proof of that statement, with a complete formalization of the abstract negotiation game, is presented in Appendix E.

[Fig. 9. The requirement $\lambda$ on the game of Example 2]
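Prover's proposals in the abstract game must be $\lambda$-consistent, which for an ultimately periodic play in a mean-payoff game can be checked directly: every state visited must offer its controller a payoff at least equal to the requirement on that state, and the payoff of a play $h c^\omega$ is the mean weight of the cycle $c$. The sketch below assumes this lasso representation; the toy game (states, owners and weights) is hypothetical and only serves to exercise the function.

```python
import math
from typing import Dict, Hashable, Sequence, Tuple

State = Hashable
Edge = Tuple[State, State]


def cycle_mean(cycle: Sequence[State],
               weight: Dict[Edge, Tuple[float, ...]], player: int) -> float:
    """Mean weight, for `player`, of the cycle repeated forever."""
    n = len(cycle)
    total = sum(weight[(cycle[k], cycle[(k + 1) % n])][player] for k in range(n))
    return total / n


def is_lambda_consistent(prefix: Sequence[State], cycle: Sequence[State],
                         owner: Dict[State, int],
                         weight: Dict[Edge, Tuple[float, ...]],
                         requirement: Dict[State, float]) -> bool:
    """Check the play prefix . cycle^omega against the requirement.

    By prefix-independence, the payoff seen from any visited state is the
    mean of the cycle, so one comparison per visited state suffices.
    """
    for state in list(prefix) + list(cycle):
        payoff = cycle_mean(cycle, weight, owner[state])
        if payoff < requirement.get(state, -math.inf):
            return False
    return True


# Hypothetical two-player toy game (player 0 owns a and c, player 1 owns b and d).
OWNER = {"a": 0, "b": 1, "c": 0, "d": 1}
WEIGHT = {("a", "b"): (0, 3), ("b", "a"): (0, 3),
          ("c", "c"): (1, 1), ("d", "d"): (2, 2)}

print(is_lambda_consistent(["a"], ["c"], OWNER, WEIGHT,
                           {"a": 1, "b": 2, "c": 1, "d": 2}))
```

The function does not verify that the prefix and cycle follow edges of the game; a fuller implementation would add that check.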

Example 6. Let us take the game of Example 2: in Figure 9, we wrote, in red, the requirement $\lambda_2 = \mathrm{nego}(\lambda_1)$, computed in Section III-B. Let us use the abstract negotiation game to compute the requirement $\lambda_3 = \mathrm{nego}(\lambda_2)$.

From the state $a$, Prover can propose the play $abd^\omega$, and the only deviation available to Challenger is going to $c$; he has of course no incentive to do it. Therefore, $\lambda_3(a) = 2$.

From the state $b$, whatever Prover proposes at first, Challenger can deviate and go to $a$. Then, from $a$, Prover cannot propose the play $ac^\omega$, which is not $\lambda_2$-consistent: her only possibility is to propose a play beginning with $ab$, and to let Challenger deviate once more. He can deviate infinitely often that way, and generate the play $(ba)^\omega$: therefore, $\lambda_3(b) = 3$.

The other states keep the same values.

Note that $\lambda_3$ is no longer satisfiable from $a$ or $b$, and therefore that if $\lambda_4 = \mathrm{nego}(\lambda_3)$, then $\lambda_4(a) = \lambda_4(b) = +\infty$. By the considerations on the negotiation sequence given in Section IV-E, this proves that the least fixed point of the negotiation function is not satisfiable, and therefore that there is no SPE in that game.

The interested reader will find other examples in Appendix K.

B. Concrete negotiation game

In the abstract negotiation game, Prover has to propose complete plays, on which we can make the hypothesis that they are $\lambda$-consistent. In practice, there will often be infinitely many such plays, and therefore the abstract game cannot be used directly for an algorithmic purpose. Instead, those plays can be given edge by edge, in a finite-state game. Its definition is more technical, but it can be shown to be equivalent to the abstract one.

Definition 25 (Concrete negotiation game). Let $G_{\upharpoonright v_0}$ be an initialized prefix-independent game, and let $\lambda$ be a requirement on $G$, with either $\lambda(V) \subseteq \mathbb{R}$, or $\lambda = \lambda_0$.

The concrete negotiation game of $G_{\upharpoonright v_0}$ for player $i$ is the two-player zero-sum game:
$$\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0} = \left(\{\mathbb{P}, \mathbb{C}\}, S, (S_\mathbb{P}, S_\mathbb{C}), \Delta, \nu\right)_{\upharpoonright s_0},$$
that, intuitively, mimics the abstract game introduced above:
• the states controlled by Prover are $S_\mathbb{P} = V \times 2^V$, where the state $s = (v, M)$ represents a current state $v$ on which Prover has to define the strategy profile, called the localization of the state $s$, and $M$ is the memory of $s$, which memorizes the states that have been traversed so far since the last deviation, in order to check the $\lambda$-consistency of the proposed play;
• the states controlled by Challenger are $S_\mathbb{C} = E \times 2^V$;
• there are three types of transitions, proposals, acceptations and deviations: $\Delta = \mathrm{Prop} \cup \mathrm{Acc} \cup \mathrm{Dev}$, where:
  – the proposals are transitions in which Prover proposes an edge to follow in the game $G$:
  $$\mathrm{Prop} = \left\{ (v, M)(vw, M) \;\middle|\; vw \in E,\ M \in 2^V \right\};$$
  – the acceptations are transitions in which Challenger accepts to follow the edge proposed by Prover (this is in particular his only possibility whenever that edge begins in a state that is not controlled by player $i$):
  $$\mathrm{Acc} = \left\{ (vw, M)(w, M \cup \{w\}) \;\middle|\; j \in \Pi,\ w \in V_j \right\}$$
  (note that the memory is updated);
  – the deviations are transitions in which Challenger refuses to follow the edge proposed by Prover, as he can if that edge begins in a state controlled by player $i$:
  $$\mathrm{Dev} = \left\{ (uv, M)(w, \{w\}) \;\middle|\; u \in V_i,\ w \neq v,\ uw \in E \right\}$$
  (the memory is erased, and only the new state the deviating edge leads to is memorized);
• the outcome function $\nu$ measures player $i$'s payoff, with a defeat condition if the constructed strategy profile is not $\lambda$-rational, that is to say if, after finitely many deviations of player $i$, it can generate a play which is not $\lambda$-consistent:
  – $\nu_\mathbb{C}(\eta) = +\infty$ if there exists $n$ such that no transition in the play $\eta_n \eta_{n+1} \dots$ is a deviation, and if there exists $j \in \Pi$ such that $\hat\mu_j(\eta) < 0$;
  – $\nu_\mathbb{C}(\eta) = \hat\mu_\star(\eta)$ otherwise;
  – and $\nu_\mathbb{P} = -\nu_\mathbb{C}$;
  where for each dimension $j \in \Pi$, $\hat\mu_j$ measures the difference between player $j$'s payoff and player $j$'s maximal requirement:
  $$\hat\mu_j\left((\rho_0, M_0)(\rho_0\rho'_0, M_0)(\rho_1, M_1) \dots\right) = \mu_j(\rho) - \limsup_n \max_{v \in M_n \cap V_j} \lambda(v),$$
  and where for the dimension $\star$, $\hat\mu_\star$ measures player $i$'s payoff:
  $$\hat\mu_\star\left((\rho_0, M_0)(\rho_0\rho'_0, M_0)(\rho_1, M_1) \dots\right) = \mu_i(\rho).$$
  The dimension $\star$ is called the main dimension, and each $j \in \Pi$ is a non-main dimension;
• and finally, $s_0 = (v_0, \{v_0\})$.

As in the abstract negotiation game, the goal of Prover is to find a $\lambda$-rational strategy profile that forces the worst possible payoff for player $i$, and the goal of Challenger is to find a possibly deviating strategy for player $i$ that gives them the highest possible payoff.

In the case of mean-payoff-inf games, the function $\hat\mu$ is a multi-mean-payoff function, which will enable us to compute the value of the concrete negotiation game.

Remark. In the case of a mean-payoff-inf game, each function $\hat\mu_j$ is the mean-payoff-inf function corresponding to the weight function $\hat\pi_j$ defined by, for $j \in \Pi$:
$$\hat\pi_j\left((v, M)(vw, M)\right) = 0, \qquad \hat\pi_j\left((uv, M)(w, N)\right) = 2\left(\pi_j(uw) - \max_{v_j \in M \cap V_j} \lambda(v_j)\right),$$
and:
$$\hat\pi_\star\left((v, M)(vw, M)\right) = 0, \qquad \hat\pi_\star\left((uv, M)(w, N)\right) = 2\pi_i(uw).$$

A play or a history in the concrete negotiation game has a projection in the game on which that negotiation game has been constructed, defined as follows:

Definition 26 (Projection of a history, of a play). Let $G$ be a prefix-independent game. Let $\lambda$ be a requirement and $i$ a player, and let $\mathrm{Conc}_{\lambda i}(G)$ be the corresponding concrete negotiation game. Let $H = (h_0, M_0)(h_0 h'_0, M_0) \dots (h_n h'_n, M_n)$ be a history in $\mathrm{Conc}_{\lambda i}(G)$: the projection of the history $H$ is the history in the game $G$:
$$\dot H = h_0 \dots h_n.$$
That definition is naturally extended to plays.
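As a small illustration of Definition 26, the projection simply keeps the localizations of the Prover states and forgets both the proposed edges and the memories. The encoding below is an assumption made for this sketch: a Prover state is a pair `(v, M)` with `v` a string, and a Challenger state is a pair `((u, w), M)` whose first component is a tuple.

```python
from typing import FrozenSet, List, Tuple, Union

ProverState = Tuple[str, FrozenSet[str]]
ChallengerState = Tuple[Tuple[str, str], FrozenSet[str]]
ConcreteState = Union[ProverState, ChallengerState]


def project_history(history: List[ConcreteState]) -> List[str]:
    """Keep the localization of each Prover state, in order."""
    return [loc for loc, _memory in history if not isinstance(loc, tuple)]


# Prover proposes ab from a, Challenger deviates to c, Prover proposes cc.
history = [
    ("a", frozenset({"a"})),
    (("a", "b"), frozenset({"a"})),
    ("c", frozenset({"c"})),
    (("c", "c"), frozenset({"c"})),
]
print(project_history(history))
```

Note that the projected history follows the deviation (it goes through `c`), not the edge that Prover had proposed.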

Remark. For a play $\eta$ where no transition is a deviation, we have $\hat\mu_j(\eta) \geq 0$ for each $j \in \Pi$ if and only if $\dot\eta$ is $\lambda$-consistent.

Although the construction is technically more complex, the concrete negotiation game is equivalent to the abstract one: the only differences are that the plays proposed by Prover are proposed edge by edge, and that their $\lambda$-consistency is not written in the rules of the game but in its outcome function.

Theorem 3. Let $G_{\upharpoonright v_0}$ be an initialized prefix-independent Borelian game. Let $\lambda$ be a requirement and $i$ a player. Then, we have:
$$\mathrm{val}\left(\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}\right) = \inf_{\bar\sigma_{-i} \in \lambda\mathrm{Rat}(v_0)} \sup_{\sigma_i} \mu_i\left(\langle \bar\sigma \rangle_{v_0}\right).$$
A proof can be found in Appendix F.

[Fig. 10. A concrete negotiation game]
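The three kinds of transitions of Definition 25 can be enumerated mechanically from the edge set of $G$. The following is a minimal sketch, assuming the same encoding as Prover states `(v, M)` and Challenger states `((u, v), M)` with memories stored as frozensets; the function name `successors`, the edge list and the choice of which states player $i$ controls are assumptions made for the illustration.

```python
from typing import FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]


def successors(state, edges: Set[Edge], controlled_by_i: Set[str]) -> List[tuple]:
    """List the (kind, target) transitions leaving a concrete-game state."""
    loc, memory = state
    if not isinstance(loc, tuple):
        # Prover state (v, M): she proposes any edge vw, the memory is unchanged.
        return [("prop", ((v, w), memory)) for (v, w) in sorted(edges) if v == loc]
    u, v = loc
    # Challenger state (uv, M): accepting moves to v and records it in the memory.
    moves = [("acc", (v, memory | {v}))]
    if u in controlled_by_i:
        # Deviating along another edge uw erases the memory, keeping only w.
        moves += [("dev", (w, frozenset({w})))
                  for (x, w) in sorted(edges) if x == u and w != v]
    return moves


# Hypothetical instance: a four-state graph in which player i controls only a.
EDGES = {("a", "b"), ("b", "a"), ("a", "c"), ("c", "c"), ("b", "d"), ("d", "d")}
V_I = {"a"}

start = ("a", frozenset({"a"}))
for kind, target in successors(start, EDGES, V_I):
    print(kind, target)
```

Exploring from `start` with this function yields a finite state space, since localizations range over $V \cup E$ and memories over subsets of $V$.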

Example 7. Let us consider again the game from Example 2. Figure 10 represents the game $\mathrm{Conc}_{\lambda i}(G)$ (with $\lambda(a) = 1$ and $\lambda(b) = 2$), where the dashed states are controlled by Challenger, and the other ones by Prover.

The dotted arrows indicate the deviations, and when a transition $ss'$ carries a label made of three numbers $x$, $y$ and $z$, $x$ and $y$ denote the values of $\hat\pi_j(ss')$ on the two non-main dimensions $j \in \Pi$, and $z$ denotes $\hat\pi_\star(ss')$. The transitions that are not labelled are either zero for the three coordinates, or meaningless since they cannot be used more than once.

The red arrows indicate a (memoryless) optimal strategy for Challenger. Against that strategy, the lowest outcome Prover can ensure is $2$. Therefore, $\mathrm{nego}(\lambda)(v_0) = 2$, in line with the abstract game in Example 6.

C. Solving the concrete negotiation game for the mean-payoff-inf case

First, let us note that mean-payoff-inf games are Borelian, and therefore satisfy the hypotheses of Theorem 3.

We now know that $\mathrm{nego}(\lambda)(v)$, for a given requirement $\lambda$, a given player $i$ and a given state $v \in V_i$, is the value of the concrete negotiation game $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright (v, \{v\})}$. Let us now show how, in the mean-payoff-inf case, that value can be computed.

Definition 27 (Memoryless strategy). A strategy $\sigma_i$ in a game $G$ is memoryless if for all vertices $v \in V_i$ and for all histories $h$ and $h'$, $\sigma_i(hv) = \sigma_i(h'v)$.

For any game $G$ and any memoryless strategy $\sigma_i$, $G[\sigma_i]$ denotes the graph induced by $\sigma_i$, that is the graph $(V, E')$, with:
$$E' = \{ vw \in E \mid v \notin V_i \text{ or } w = \sigma_i(v) \}.$$

For any finite set $D$ and any set $X \subseteq \mathbb{R}^D$, $\mathrm{Conv}\, X$ denotes the convex hull of $X$.

We can now prove that in the concrete negotiation game constructed from a mean-payoff-inf game, the player Challenger has an optimal strategy that is memoryless.

Lemma 4.

$\mathrm{SConn}(V, E)$ denotes the strongly connected components of $(V, E)$ (considered as subgraphs of $(V, E)$ or as subsets of $V$, depending on the context).

Lemma 5. Let $G_{\upharpoonright v_0}$ be an initialized mean-payoff-inf game, and let $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$ be its concrete negotiation game for some $\lambda$ and some $i$. Then, the value of the game $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$ is given by the formula:
$$\max_{\tau_\mathbb{C} \in \mathrm{ML}_\mathbb{C}(\mathrm{Conc}_{\lambda i}(G))} \ \min_{\substack{K \in \mathrm{SConn}(\mathrm{Conc}_{\lambda i}(G)[\tau_\mathbb{C}]) \\ \text{accessible from } s_0}} \mathrm{opt}(K),$$
where $\mathrm{opt}(K)$ is the minimal value $\nu_\mathbb{C}(\rho)$ for $\rho$ among the infinite paths in $K$.
• If $K$ contains a deviation, then Prover can choose the simple cycle of $K$ that minimizes player $i$'s payoff:
$$\mathrm{opt}(K) = \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega).$$
• If $K$ does not contain a deviation, then Prover must choose a combination of the simple cycles of $K$ that minimizes player $i$'s payoff while keeping the non-main dimensions above $0$:
$$\mathrm{opt}(K) = \min{}^{\star}\ \underset{c \in \mathrm{SC}(K)}{\mathrm{Conv}}\ \hat\mu(c^\omega).$$

A proof can be found in Appendix H.
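In the first case of Lemma 5, $\mathrm{opt}(K)$ is the least mean weight of a simple cycle of $K$ along the main dimension. Enumerating simple cycles is not needed for that quantity: since every cycle decomposes into simple cycles, it equals the minimum cycle mean of the component, which Karp's classical algorithm computes in $O(|V| \cdot |E|)$. The sketch below uses that standard technique as a stand-in (it is not taken from the paper), and assumes that $K$ is given with nodes numbered from $0$ and integer weights $\hat\pi_\star$ on its transitions.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

WeightedEdge = Tuple[int, int, int]  # (source, target, main-dimension weight)


def min_cycle_mean(num_nodes: int, edges: List[WeightedEdge]) -> Optional[Fraction]:
    """Karp's algorithm: minimum mean weight over the cycles of the graph."""
    inf = float("inf")
    # best[k][v]: least weight of a walk with exactly k edges ending in v,
    # starting anywhere (every node is a free starting point at k = 0).
    best = [[inf] * num_nodes for _ in range(num_nodes + 1)]
    best[0] = [0] * num_nodes
    for k in range(1, num_nodes + 1):
        for u, v, w in edges:
            if best[k - 1][u] != inf and best[k - 1][u] + w < best[k][v]:
                best[k][v] = best[k - 1][u] + w
    result: Optional[Fraction] = None
    for v in range(num_nodes):
        if best[num_nodes][v] == inf:
            continue  # no walk of length num_nodes ends here
        worst = max(Fraction(best[num_nodes][v] - best[k][v], num_nodes - k)
                    for k in range(num_nodes) if best[k][v] != inf)
        if result is None or worst < result:
            result = worst
    return result  # None when the graph has no cycle


# Two-transition cycle of a concrete game: a proposal (weight 0) followed by
# an acceptance of an edge whose weight for player i is 1 (hence 2 * 1).
print(min_cycle_mean(2, [(0, 1, 0), (1, 0, 2)]))
```

The second case of the lemma (no deviation in $K$) additionally constrains the non-main dimensions and calls for a linear program over the convex hull of the cycle payoffs, which this sketch does not cover.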

Corollary 1.

For each player $i$ and every state $v \in V_i$, the value $\mathrm{nego}(\lambda)(v)$ can be computed with the formula given in Lemma 5 applied to the game $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright (v, \{v\})}$.

Moreover, another corollary of that result is that there always exists a best play that Prover can choose, i.e. Prover has an optimal strategy; by Theorem 3, this is equivalent to saying that mean-payoff-inf games are games with steady negotiation.

Corollary 2.

Mean-payoff-inf games are games with steady negotiation.

VI. ANALYSIS OF THE NEGOTIATION FUNCTION IN MEAN-PAYOFF-INF GAMES

In this section, we will show that for the case of mean-payoff-inf games, the negotiation function is a piecewise linear function from the vector space of requirements into itself, which can therefore be computed and analyzed using classical linear algebra techniques. Then, it becomes possible to search for the fixed points or the $\varepsilon$-fixed points of such a function, and to decide the existence or not of SPEs or $\varepsilon$-SPEs in the game studied.

Theorem 4.

the minimum, along the dimension $\star$, of its intersection with the quadrant of the points that have positive coordinates along the non-main dimensions: that quantity evolves linearly with such translations, as illustrated in Figure 11.

See Appendix J for a detailed proof.

Example 8. Let $G$ be the game of Example 3, represented in Figure 3.

If a requirement $\lambda$ is represented by the tuple $(\lambda(a), \lambda(b))$, then the function $\mathrm{nego} : \mathbb{R}^2 \to \mathbb{R}^2$ can be represented by Figure 12, where in each of the regions delimited by the dashed lines, we wrote a formula for the couple $(\mathrm{nego}(\lambda)(a), \mathrm{nego}(\lambda)(b))$.

[Fig. 12. The negotiation function on the game of Example 3]

The orange area indicates the fixed points of the function, and the yellow area the other -fixed points.

Remember that, by Proposition 2, the negotiation function is Scott-continuous: that means that when we are exactly on a magenta line, the correct expression of $\mathrm{nego}$ is the one of the tile to the left (for a vertical line) or at the bottom (for a horizontal one).

[Fig. 13. The negotiation function on the game of Example 2]

Example 9. As a second example, let us consider the game of Example 2, represented in Figure 2.

First, let us note that if $\lambda(c) \leq 1$ and $\lambda(d) \leq 2$, then $\mathrm{nego}(\lambda)(c) = 1$ and $\mathrm{nego}(\lambda)(d) = 2$. We can therefore fix $\lambda(c) = 1$ and $\lambda(d) = 2$, and represent the requirements $\lambda$ by the tuples $(\lambda(a), \lambda(b))$, as in the previous example. Then, the negotiation function can be represented as in Figure 13.

One can check that there is no fixed point here, and even no -fixed point, except $(+\infty, +\infty)$.

CONCLUSION AND FUTURE WORKS

With the tools that we defined, we are now able to characterize effectively all the SPEs, and $\varepsilon$-SPEs, in a mean-payoff-inf game with arbitrarily many players. We do it by constructing a complete representation of the negotiation function using its associated concrete negotiation games, and then finding its least fixed point, or $\varepsilon$-fixed point, with classical linear algebraic tools. That algorithm also provides a solution to the (constrained) existence problem for SPEs in mean-payoff-inf games, which was left open in the literature.

If we are able to find a formula expressing the negotiation function, as in Lemma 5, for other classes of prefix-independent games with steady negotiation, for example using the concrete negotiation game, then it will also be possible, by Theorem 2, to characterize the SPEs and $\varepsilon$-SPEs in those games.

When we look for SPEs, i.e. $\varepsilon$-SPEs with $\varepsilon = 0$, another method is to compute the limit of the negotiation sequence, using the concrete negotiation game, or the abstract one if the game is simple enough. But such an algorithm does not always stop, since in some cases the negotiation sequence needs a transfinite number of steps to converge. Nevertheless, let us note that the sequence always converges after finitely many iterations in games where there exists only a finite number of possible outcomes; examples of such games are liminf and limsup games, or mean-payoff-inf and mean-payoff-sup games with only disjoint cycles. An algorithmic method was already defined in [15] for such cases: an open question is then which alternative has the better complexity, and perhaps how to define wider classes of games in which we know that the negotiation sequence stabilizes in a finite number of steps.

Finally, another open question is whether the notion of negotiation, and its link with the SPEs, can be generalized to games that are not prefix-independent, or that are not with steady negotiation.
REFERENCES

[1] Romain Brenguier, Lorenzo Clemente, Paul Hunter, Guillermo A. Pérez, Mickael Randour, Jean-François Raskin, Ocan Sankur, and Mathieu Sassolas. Non-zero sum games for reactive synthesis. In Language and Automata Theory and Applications - 10th International Conference, LATA 2016, Prague, Czech Republic, March 14-18, 2016, Proceedings, volume 9618 of Lecture Notes in Computer Science, pages 3–23. Springer, 2016.
[2] Thomas Brihaye, Véronique Bruyère, Aline Goeminne, Jean-François Raskin, and Marie van den Bogaard. The complexity of subgame perfect equilibria in quantitative reachability games. In CONCUR, volume 140 of LIPIcs, pages 13:1–13:16. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
[3] Thomas Brihaye, Véronique Bruyère, Noémie Meunier, and Jean-François Raskin. Weak subgame perfect equilibria and their application to quantitative reachability. In volume 41 of LIPIcs, pages 504–518. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2015.
[4] Thomas Brihaye, Julie De Pril, and Sven Schewe. Multiplayer cost games with simple Nash equilibria. In Logical Foundations of Computer Science, International Symposium, LFCS 2013, San Diego, CA, USA, January 6-8, 2013. Proceedings, volume 7734 of Lecture Notes in Computer Science, pages 59–73. Springer, 2013.
[5] Véronique Bruyère. Computer aided synthesis: A game-theoretic approach. In Developments in Language Theory - 21st International Conference, DLT 2017, Liège, Belgium, August 7-11, 2017, Proceedings, volume 10396 of Lecture Notes in Computer Science, pages 3–35. Springer, 2017.
[6] Véronique Bruyère, Noémie Meunier, and Jean-François Raskin. Secure equilibria in weighted games. In CSL-LICS, pages 26:1–26:26. ACM, 2014.
[7] Véronique Bruyère, Stéphane Le Roux, Arno Pauly, and Jean-François Raskin. On the existence of weak subgame perfect equilibria. CoRR, abs/1612.01402, 2016.
[8] Véronique Bruyère, Stéphane Le Roux, Arno Pauly, and Jean-François Raskin. On the existence of weak subgame perfect equilibria. In Foundations of Software Science and Computation Structures - 20th International Conference, FOSSACS 2017, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22-29, 2017, Proceedings, volume 10203 of Lecture Notes in Computer Science, pages 145–161, 2017.
[9] Krishnendu Chatterjee, Laurent Doyen, Herbert Edelsbrunner, Thomas A. Henzinger, and Philippe Rannou. Mean-payoff automaton expressions. In Paul Gastin and François Laroussinie, editors, CONCUR 2010 - Concurrency Theory, 21th International Conference, CONCUR 2010, Paris, France, August 31-September 3, 2010. Proceedings, volume 6269 of Lecture Notes in Computer Science, pages 269–283. Springer, 2010.
[10] Krishnendu Chatterjee, Thomas A. Henzinger, and Nir Piterman. Strategy logic. Inf. Comput., 208(6):677–693, 2010. doi:10.1016/j.ic.2009.07.004.
[11] János Flesch, Jeroen Kuipers, Ayala Mashiah-Yaakovi, Gijs Schoenmakers, Eilon Solan, and Koos Vrieze. Perfect-information games with lower-semicontinuous payoffs. Math. Oper. Res., 35(4):742–755, 2010.
[12] Eryk Kopczynski. Half-positional determinacy of infinite games. In Michele Bugliesi, Bart Preneel, Vladimiro Sassone, and Ingo Wegener, editors, Automata, Languages and Programming, 33rd International Colloquium, ICALP 2006, Venice, Italy, July 10-14, 2006, Proceedings, Part II, volume 4052 of Lecture Notes in Computer Science, pages 336–347. Springer, 2006.
[13] Orna Kupferman, Giuseppe Perelli, and Moshe Y. Vardi. Synthesis with rational environments. Ann. Math. Artif. Intell., 78(1):3–20, 2016. doi:10.1007/s10472-016-9508-8.
[14] Donald A. Martin. Borel determinacy. Annals of Mathematics, pages 363–371, 1975.
[15] Noémie Meunier. Multi-Player Quantitative Games: Equilibria and Algorithms. PhD thesis, Université de Mons, 2016.
[16] Martin J. Osborne. An introduction to game theory. Oxford Univ. Press, 2004.
[17] Eilon Solan and Nicolas Vieille. Deterministic multi-player Dynkin games. Journal of Mathematical Economics, 39(8):911–929, 2003.
[18] Michael Ummels. Rational behaviour and strategy construction in infinite multiplayer games. In FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science, 26th International Conference, Kolkata, India, December 13-15, 2006, Proceedings, volume 4337 of Lecture Notes in Computer Science, pages 212–223. Springer, 2006.
[19] Yaron Velner, Krishnendu Chatterjee, Laurent Doyen, Thomas A. Henzinger, Alexander Moshe Rabinovich, and Jean-François Raskin. The complexity of multi-mean-payoff and multi-energy games. Inf. Comput., 241:177–196, 2015.
[20] Nicolas Vieille and Eilon Solan. Deterministic multi-player Dynkin games. Journal of Mathematical Economics, 39(8):911–929, November 2003. URL: https://hal-hec.archives-ouvertes.fr/hal-00464953, doi:10.1016/S0304-4068(03)00021-1.

APPENDIX A
PROOF OF THEOREM 1

Theorem 1.

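Let $G$ be a game with steady negotiation. Then, a play $\rho$ in $G$ is an NE play if and only if $\rho$ is $\mathrm{nego}(\lambda_0)$-consistent.

Proof:
• Let $\bar\sigma$ be a Nash equilibrium in $G_{\upharpoonright v_0}$, for some state $v_0$, and let $\rho = \langle \bar\sigma \rangle_{v_0}$: let us prove that the play $\rho$ is $\mathrm{nego}(\lambda_0)$-consistent.

Let $k \in \mathbb{N}$, let $i \in \Pi$ be such that $\rho_k \in V_i$, and let us prove that $\mu_i(\rho_k \rho_{k+1} \dots) \geq \mathrm{nego}(\lambda_0)(\rho_k)$.

For any deviation $\sigma'_i$ of $\sigma_{i \upharpoonright \rho_0 \dots \rho_k}$, by definition of NEs, $\mu_i(\langle \bar\sigma_{-i \upharpoonright \rho_0 \dots \rho_k}, \sigma'_i \rangle_{\rho_k}) \leq \mu_i(\rho)$. Therefore:
$$\mu_i(\rho) \geq \sup_{\sigma'_i} \mu_i\left(\langle \bar\sigma_{-i \upharpoonright \rho_0 \dots \rho_k}, \sigma'_i \rangle_{\rho_k}\right),$$
hence:
$$\mu_i(\rho) \geq \inf_{\bar\tau_{-i}} \sup_{\tau_i} \mu_i\left(\langle \bar\tau_{-i \upharpoonright \rho_0 \dots \rho_k}, \tau_i \rangle_{\rho_k}\right),$$
i.e.:
$$\mu_i(\rho) \geq \mathrm{nego}(\lambda_0)(\rho_k).$$

• Let $\rho$ be a $\mathrm{nego}(\lambda_0)$-consistent play from a state $v_0$. Let us define a strategy profile $\bar\sigma$ such that $\langle \bar\sigma \rangle_{v_0} = \rho$, by:
  – $\langle \bar\sigma \rangle_{v_0} = \rho$;
  – for all histories of the form $\rho_0 \dots \rho_k v$ with $v \neq \rho_{k+1}$, let $i$ be the player controlling $\rho_k$. Since the game $G$ is with steady negotiation, the infimum:
  $$\inf_{\bar\tau_{-i} \in \lambda_0\mathrm{Rat}(\rho_k)} \sup_{\tau_i} \mu_i\left(\langle \bar\tau \rangle_{\rho_k}\right)$$
  is a minimum. Let $\bar\tau^k_{-i}$ be a $\lambda_0$-rational strategy profile from $\rho_k$ realizing that minimum, and let $\tau^k_i$ be some strategy from $\rho_k$ such that $\tau^k_i(\rho_k) = v$. Then, we define:
  $$\langle \bar\sigma_{\upharpoonright \rho_0 \dots \rho_k v} \rangle_v = \langle \bar\tau^k_{\upharpoonright \rho_k v} \rangle_v;$$
  – for every other history $h$, $\bar\sigma(h)$ is defined arbitrarily.

Let us prove that $\bar\sigma$ is an NE: let $\sigma'_i$ be a deviation of $\sigma_i$, let $\rho' = \langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}$ and let $\rho_0 \dots \rho_k$ be the longest common prefix of $\rho$ and $\rho'$. Let $v = \rho'_{k+1}$.

Then, we have:
$$\mu_i(\rho') \leq \sup_{\tau^k_i} \mu_i\left(\langle \bar\tau^k \rangle_{\rho_k}\right) = \mathrm{nego}(\lambda_0)(\rho_k),$$
and since $\rho$ is $\mathrm{nego}(\lambda_0)$-consistent, $\mathrm{nego}(\lambda_0)(\rho_k) \leq \mu_i(\rho)$, hence $\mu_i(\rho') \leq \mu_i(\rho)$.

APPENDIX B
PROOF OF LEMMA 1

Lemma 1.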

Let G ↾ v be a well-initialized preﬁx-independentgame with steady negotiation, and ε ≥ . Let λ be an ε -ﬁxedpoint of the function nego . Then, for every λ -consistent play ξ starting in v , there exists an ε -SPE ¯ σ such that h ¯ σ i v = ξ .Proof: • Particular case: if there exists v such that λ ( v ) = + ∞ . In that case, since the game G ↾ v is well-initialized,there is no λ -rational strategy proﬁle from v , and nego( λ )( v ) = + ∞ . Since ε is ﬁnite and since λ isan ε -ﬁxed point of the negotiation function, it followsthat λ ( v ) = + ∞ : in that case, there is no λ -consistentplay ξ from v , and then the proof is done. Therefore,for the rest of the proof, we assume that for all v , wehave λ ( v ) = + ∞ . As a consequence, since λ is an ε -ﬁxed point of the function nego , for all v , we have nego( λ )( v ) = + ∞ ; and so ﬁnally, for each such v , thereexists a λ -consistent play starting from v . • Preliminary result: a game with steady negotiation is alsowith subgame-steady negotiation.

Recall that a game with steady negotiation is a game suchthat for every requirement λ , for every player i and forevery state v , there exists a λ -rational strategy proﬁle ¯ τ v such that: sup τ vi µ i ( h ¯ τ v i v ) = inf ¯ τ − i ∈ λ Rat( v ) sup τ i µ i ( h ¯ τ i v ) is realized, i.e. there exists a worst λ -rational strategyproﬁle against player i from the state v , with regards toplayer i ’s payoff.Our goal in this part of the proof is to show that a gamethat is with steady negotiation is also with subgame-steady negotiation , that is to say, for every requirement λ , for every player i and for every state v , there exists a λ -rational strategy proﬁle ¯ τ v ∗− i such that for every history hw starting from v compatible with ¯ τ v ∗− i , we have: sup τ v ∗ i µ i ( h ¯ τ v ∗ ↾ hw i w ) = inf ¯ τ − i ∈ λ Rat( w ) sup τ i µ i ( h ¯ τ i w ) , i.e. there exists a λ -rational strategy proﬁle against player i from the state v , that is the worst with regards to player i ’s payoff in any subgame, in other words a subgame-worst strategy proﬁle.Let us construct inductively the strategy proﬁle ¯ τ v ∗− i andthe strategy τ v ∗ i λ -rationalizing it. We deﬁne them onlyon histories that are compatible with ¯ τ v ∗− i , since they canbe deﬁned arbitrarily on any other histories. We proceedby assembling the strategy proﬁles of the form ¯ τ w , andthe histories after which we follow a new ¯ τ w will becalled the resets of ¯ τ v ∗− i . – First, h ¯ τ v ∗ i v = h ¯ τ v i v : the one-state history v is thenthe ﬁrst reset of ¯ τ v ∗− i ; – then, for every history hw from v such that h iscompatible with ¯ τ v ∗− i and ends in V i , and such that w = τ v ∗ i ( h ) : let us write hw = h ′ uh ′′ so that h ′ u

14s the longest reset of ¯ τ v ∗− i among the preﬁxes of h ,and therefore so that the strategy proﬁle ¯ τ v ∗ ↾ h ′ u has beendeﬁned as equal to ¯ τ u over the preﬁxes of h ′′ until w .Then, we have: sup τ i µ i ( h ¯ τ w − i , τ i i w ) ≤ sup τ i µ i ( h ¯ τ u − i ↾ uh ′′ , τ i i w ) by preﬁx-independence of G and since by its deﬁni-tion, the strategy proﬁle ¯ τ w − i minimizes the quantity sup τ i µ i ( h ¯ τ w − i , τ i i w ) . Let us separate two cases. ∗ Suppose ﬁrst that: sup τ i µ i ( h ¯ τ w − i , τ i i w ) = sup τ i µ i ( h ¯ τ u − i ↾ uh ′′ , τ i i w ) . Then, h ¯ τ v ∗ ↾ hw i = h ¯ τ u ↾ uh ′′ i w : the coalition of play-ers against player i keeps following their strategyproﬁle so that player i will have no more than thepayoff they can ensure. ∗ Suppose now that: sup τ i µ i ( h ¯ τ w − i , τ i i w ) < sup τ i µ i ( h ¯ τ u − i ↾ uh ′′ , τ i i w ) . Then, h ¯ τ v ∗ ↾ hw i = h ¯ τ w i w : player i has done some-thing that lowers the payoff they can ensure, andtherefore the other players have to update theirstrategy proﬁle in order to enforce that new mini-mum.The history hw is a reset of ¯ τ v ∗− i .All the plays constructed are λ -consistent, hence ¯ τ v ∗− i is indeed λ -rationalized by τ v ∗ i .Let us now prove that τ v ∗ i is the subgame-worst λ -rational strategy proﬁle against player i . Let hw be ahistory starting in v compatible with ¯ τ v ∗− i , let υ i be astrategy from the state w , let η = h ¯ τ v ∗− i ↾ hw , υ i i w andlet us prove that: µ i ( η ) ≤ inf ¯ τ − i ∈ λ Rat( w ) sup τ i µ i ( h ¯ τ i w ) . Let us consider the sequence ( α n ) n ∈ N , deﬁned by: α n = inf ¯ τ − i ∈ λ Rat( η n ) sup τ i µ i ( h ¯ τ i η n ) . That sequence is non-increasing. Indeed, for all n : ∗ If η n ∈ V i , then no action of player i can improvethe payoff player i themself can secure against a λ -rational environment. ∗ If η n V i , then: η n +1 = ¯ τ v ∗− i ( hη . . . η n ) =¯ τ η k − i ( η k . . . 
η n ) for some k such that, by construc-tion of ¯ τ v ∗− i , α k = · · · = α n . Since the strategyproﬁle ¯ τ η k − i is deﬁned to realize the payoff α k = α n ,we have α n +1 = α n .Moreover, that sequence can only take a ﬁnite numberof values (at most card V ). Therefore, it is stationary:there exists n ∈ N such that ( α n ) n ≥ n is constant,and there are no resets of ¯ τ v ∗− i among the preﬁxes of η of length greater than n . Therefore, if we choose n minimal (i.e., n is theindex of the last reset in η ), then the play η n η n +1 . . . is compatible with the strategy proﬁle ¯ τ η n − i . Then, wehave: µ i ( η ) ≤ α n ≤ α , and: α = inf ¯ τ − i ∈ λ Rat( w ) sup τ i µ i ( h ¯ τ i w ) , which proves that ¯ τ v ∗ is the subgame-worst λ -strategyproﬁle against player i from the state w , and thereforethat the game G is a game with subgame-steadynegotiation. • Construction of ¯ σ . Let H = Hist G ↾ v . Let us construct inductively ¯ σ bydeﬁning all the plays h ¯ σ ↾ hv i v , for hv ∈ H , keepingthe hypothesis that at any step n , the set H n containsexactly the histories hv such that the play h ¯ σ ↾ hv i v hasbeen deﬁned, and that such a play is always λ -consistent:it will deﬁne a λ -rational strategy proﬁle, and we will thenprove it is an ε -SPE. – First, h ¯ σ i v = ξ , which satisﬁes the induction hypoth-esis. We remove then all the ﬁnite preﬁxes of ξ form H to obtain H . Note that the only history of length has been removed. – At the n -th step, with n > : let us choose hv ∈H n of minimal length, and therefore minimal for thepreﬁx order: the strategy proﬁle ¯ σ has been deﬁned onall the strict preﬁxes of hv , but not on hv itself, and v = ¯ σ ( h ) . Let then i be the player controlling the laststate of h (which exists since all the histories of H n have length at least ). 
Let $\bar\tau^{v*}_{-i}$ be a subgame-worst $\lambda$-rational strategy profile against player $i$ from $v$, whose existence has been proved in the previous point, and let $\tau^{v*}_i$ be a strategy rationalizing it.

Then, we define $\langle \bar\sigma_{\upharpoonright hv} \rangle_v = \langle \bar\tau^{v*} \rangle_v$, and inductively, for every history $h'w$ starting from $v$ and compatible with $\bar\sigma_{-i \upharpoonright hv}$ as it has been defined so far, we define $\langle \bar\sigma_{\upharpoonright hh'w} \rangle_w = \langle \bar\tau^{v*}_{\upharpoonright h'w} \rangle_w$. The strategy profile $\bar\sigma_{\upharpoonright hv}$ is then equal to $\bar\tau^{v*}$ on any history compatible with $\bar\tau^{v*}_{-i}$.

We remove all such histories from $H_n$ to obtain $H_{n+1}$. All the plays we built are $\lambda$-consistent, which was our induction hypothesis.

Since each step removes from $H_n$ a history of minimal length, and since there are finitely many histories of any given length, we have $\bigcap_n H_n = \emptyset$, and this process completely defines $\bar\sigma$.

• Such $\bar\sigma$ is an $\varepsilon$-SPE. Let $h^{(0)}w \in \mathrm{Hist}\, G_{\upharpoonright v_0}$, let $i \in \Pi$, let $\sigma'_i$ be a deviation of $\sigma_i$. Let $\rho = h^{(0)} \langle \bar\sigma_{\upharpoonright h^{(0)}w} \rangle_w$ and let $\rho' = h^{(0)} \langle \sigma'_{i \upharpoonright h^{(0)}w}, \bar\sigma_{-i \upharpoonright h^{(0)}w} \rangle_w$. We prove that $\mu_i(\rho') \le \mu_i(\rho) + \varepsilon$.

If $\rho'$ is compatible with $\sigma_i$, then $\rho' = \rho$ and the proof is immediate. If it is not, we let $huv$ denote the shortest prefix of $\rho'$ such that $u \in V_i$ and $v \neq \sigma_i(hu)$. The transition $uv$ can be considered as the first deviation of player $i$, but note that $hu$ can be either longer or shorter than $h^{(0)}$: player $i$ may have already deviated in $h^{(0)}$.

Be that as it may, the history $hu$ is a common prefix of the plays $\rho$ and $\rho'$, and if $\bar\tau^{v*}_{-i}$ denotes a subgame-worst $\lambda$-rational strategy profile against player $i$ from the state $v$, and if $\tau^{v*}_i$ is a strategy $\lambda$-rationalizing it, then $\bar\sigma_{\upharpoonright huv}$ has been defined as equal to $\bar\tau^{v*}$ on any history compatible with $\bar\sigma_{-i \upharpoonright huv}$.

– If $huv$ is a prefix of $\rho$: let $huh'w'$ be the longest common prefix of $\rho$ and $\rho'$. Necessarily, $w' \in V_i$.
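Then, by definition of $\bar\tau^{v*}_{-i}$, we have:
$$\mu_i(\rho') \le \inf_{\bar\tau_{-i} \in \lambda\mathrm{Rat}(w')} \sup_{\tau_i} \mu_i(\langle \bar\tau \rangle_{w'}) = \mathrm{nego}(\lambda)(w'),$$
and since $\lambda$ is an $\varepsilon$-fixed point of $\mathrm{nego}$:
$$\mu_i(\rho') \le \lambda(w') + \varepsilon.$$
On the other hand, the play $\langle \bar\sigma_{\upharpoonright h'w'} \rangle_{w'}$, which is a suffix of $\rho$, is $\lambda$-consistent, hence $\mu_i(\rho) \ge \lambda(w')$. Therefore, $\mu_i(\rho') \le \mu_i(\rho) + \varepsilon$.

– If $huv$ is not a prefix of $\rho$: then, $\rho = h \langle \bar\sigma_{\upharpoonright hu} \rangle_u$. Since $u \in V_i$, we have:
$$\mathrm{nego}(\lambda)(u) = \sup_{uv' \in E} \inf_{\bar\tau_{-i} \in \lambda\mathrm{Rat}(v')} \sup_{\tau_i} \mu_i(\langle \bar\tau \rangle_{v'}).$$
In particular, we have:
$$\inf_{\bar\tau_{-i} \in \lambda\mathrm{Rat}(v)} \sup_{\tau_i} \mu_i(\langle \bar\tau \rangle_v) \le \mathrm{nego}(\lambda)(u) \le \lambda(u) + \varepsilon.$$
Then, for the same reason as above, we know that:
$$\mu_i(\rho') \le \inf_{\bar\tau_{-i} \in \lambda\mathrm{Rat}(v)} \sup_{\tau_i} \mu_i(\langle \bar\tau \rangle_v).$$
Finally, since the suffix $\langle \bar\sigma_{\upharpoonright hu} \rangle_u$ of $\rho$ is $\lambda$-consistent, we have $\mu_i(\rho) \ge \lambda(u) \ge \mathrm{nego}(\lambda)(u) - \varepsilon \ge \mu_i(\rho') - \varepsilon$.

The strategy profile $\bar\sigma$ is an $\varepsilon$-SPE.

APPENDIX C
PROOF OF LEMMA 2

Lemma 2.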

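Let $G_{\upharpoonright v_0}$ be a well-initialized prefix-independent game, and let $\varepsilon \ge 0$. Let $\bar\sigma$ be an $\varepsilon$-SPE in $G_{\upharpoonright v_0}$. Then, there exists an $\varepsilon$-fixed point $\lambda$ of the negotiation function such that for every history $hv$ starting in $v_0$, the play $\langle \bar\sigma_{\upharpoonright hv} \rangle_v$ is $\lambda$-consistent.

Proof: Let us define the requirement $\lambda$ by, for each $i \in \Pi$ and $v \in V_i$:
$$\lambda(v) = \inf_{hv \in \mathrm{Hist}\, G_{\upharpoonright v_0}} \mu_i(\langle \bar\sigma_{\upharpoonright hv} \rangle_v).$$
Note that the set $\{ \mu_i(\langle \bar\sigma_{\upharpoonright hv} \rangle_v) \mid hv \in \mathrm{Hist}\, G_{\upharpoonright v_0} \}$ is never empty, since the game $G_{\upharpoonright v_0}$ is well-initialized.

Then, for every history $hv$ starting in $v_0$, the play $\langle \bar\sigma_{\upharpoonright hv} \rangle_v$ is $\lambda$-consistent. Let us prove that $\lambda$ is an $\varepsilon$-fixed point of $\mathrm{nego}$: let $i \in \Pi$, let $v \in V_i$, and let us assume towards contradiction (since the negotiation function is non-decreasing) that $\mathrm{nego}(\lambda)(v) > \lambda(v) + \varepsilon$, that is to say:
$$\inf_{\bar\tau_{-i} \in \lambda\mathrm{Rat}(v)} \sup_{\tau_i} \mu_i(\langle \bar\tau \rangle_v) > \inf_{hv \in \mathrm{Hist}\, G_{\upharpoonright v_0}} \mu_i(\langle \bar\sigma_{\upharpoonright hv} \rangle_v) + \varepsilon.$$
Then, since all the plays generated by the strategy profile $\bar\sigma$ are $\lambda$-consistent, and therefore since any strategy profile of the form $\bar\sigma_{-i \upharpoonright hv}$ is $\lambda$-rational, we have:
$$\inf_{hv} \sup_{\tau_i} \mu_i(\langle \bar\sigma_{-i \upharpoonright hv}, \tau_i \rangle_v) > \inf_{hv} \mu_i(\langle \bar\sigma_{\upharpoonright hv} \rangle_v) + \varepsilon.$$
Therefore, there exists a history $hv$ such that:
$$\sup_{\tau_i} \mu_i(\langle \bar\sigma_{-i \upharpoonright hv}, \tau_i \rangle_v) > \mu_i(\langle \bar\sigma_{\upharpoonright hv} \rangle_v) + \varepsilon,$$
which is impossible if the strategy profile $\bar\sigma$ is an $\varepsilon$-SPE. Therefore, there is no such $v$, and the requirement $\lambda$ is an $\varepsilon$-fixed point of the negotiation function.

APPENDIX D
PROOF OF LEMMA 3

Lemma 3.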

Let $G$ be a game, and let $\varepsilon \ge 0$. The negotiation function has a least $\varepsilon$-fixed point.

Proof: The following proof is a generalization of a classical proof of Tarski's fixed point theorem.

Let $\Lambda$ be the set of the $\varepsilon$-fixed points of the negotiation function. The set $\Lambda$ is not empty, since it contains at least the requirement $v \mapsto +\infty$. Let $\lambda^*$ be the requirement defined by:
$$\lambda^* : v \mapsto \inf_{\lambda \in \Lambda} \lambda(v).$$
For every $\varepsilon$-fixed point $\lambda$ of the negotiation function, we then have, for each $v$, $\lambda^*(v) \le \lambda(v)$, and $\mathrm{nego}(\lambda^*)(v) \le \mathrm{nego}(\lambda)(v)$ since $\mathrm{nego}$ is monotone; and therefore, $\mathrm{nego}(\lambda^*)(v) \le \lambda(v) + \varepsilon$.

As a consequence, we have:
$$\mathrm{nego}(\lambda^*)(v) \le \inf_{\lambda \in \Lambda} \lambda(v) + \varepsilon = \lambda^*(v) + \varepsilon.$$
The requirement $\lambda^*$ is an $\varepsilon$-fixed point of the negotiation function, and is therefore the least $\varepsilon$-fixed point of the negotiation function.

APPENDIX E
ABSTRACT NEGOTIATION GAME

Definition 28 (Abstract negotiation game). Let $G_{\upharpoonright v_0}$ be an initialized game, let $i \in \Pi$, and let $\lambda$ be a requirement on $G$. The abstract negotiation game of $G_{\upharpoonright v_0}$ for player $i$ with requirement $\lambda$ is the two-player zero-sum initialized game:
$$\mathrm{Abs}_{\lambda i}(G)_{\upharpoonright [v_0]} = (\{\mathbb{P}, \mathbb{C}\}, S, (S_{\mathbb{P}}, S_{\mathbb{C}}), \Delta, \nu)_{\upharpoonright [v_0]},$$
where:

• $\mathbb{P}$ denotes the player Prover and $\mathbb{C}$ the player Challenger;

• the states of $S_{\mathbb{C}}$ are written $[\rho]$, where $\rho$ is a $\lambda$-consistent play in $G$;

• the states of $S_{\mathbb{P}}$ are written $[hwv]$, where $hwv$ is a history in $G$, with $w \in V_i$, or $[v]$ with $v \in V$, plus two additional states $\top$ and $\bot$;

• the set $\Delta$ contains the transitions of the form:

– $[hv][v\rho]$, where $[hv] \in S_{\mathbb{P}}$ and $[v\rho] \in S_{\mathbb{C}}$ (Prover proposes a play);

– $[\rho][\rho_0 \dots \rho_n v]$, where $[\rho] \in S_{\mathbb{C}}$, $n \in \mathbb{N}$, $\rho_n \in V_i$, and $v \neq \rho_{n+1}$ (Challenger makes player $i$ deviate);

– $[\rho]\top$, where $[\rho] \in S_{\mathbb{C}}$ (Challenger accepts the proposed play);

– $\top\top$ (the game is over);

– $[hv]\bot$ (Prover has no more play to propose);

– $\bot\bot$ (the game is over).

• $\nu$ is the outcome function defined by, for all $\rho^{(0)}, \rho^{(1)}, \dots$, $h^{(1)}v_1, h^{(2)}v_2, \dots$, $k$, $H$:
$$\nu_{\mathbb{C}}\left([v_0][\rho^{(0)}][h^{(1)}v_1][\rho^{(1)}] \dots [h^{(k)}v_k][\rho^{(k)}]\top^\omega\right) = \mu_i\left(h^{(1)} \dots h^{(k)} \rho^{(k)}\right),$$
$$\nu_{\mathbb{C}}\left([v_0][\rho^{(0)}][h^{(1)}v_1][\rho^{(1)}] \dots [h^{(n)}v_n][\rho^{(n)}] \dots\right) = \mu_i\left(h^{(1)} h^{(2)} \dots\right),$$
$$\nu_{\mathbb{C}}(H\bot^\omega) = +\infty,$$
and by $\nu_{\mathbb{P}} = -\nu_{\mathbb{C}}$.

Remark.

If the game $G$ is Borelian, then so is the game $\mathrm{Abs}_{\lambda i}(G)$.

Proposition 3.

Let $G_{\upharpoonright v_0}$ be an initialized Borelian game, let $\lambda$ be a requirement on $G$ and let $i \in \Pi$. Then, choosing Challenger as distinguished player, the value of the game $\mathrm{Abs}_{\lambda i}(G)_{\upharpoonright [v_0]}$ is equal to the quantity:
$$\inf_{\bar\sigma_{-i} \in \lambda\mathrm{Rat}(v_0)} \sup_{\sigma_i} \mu_i(\langle \bar\sigma \rangle_{v_0}).$$

Proof:

Let $\alpha \in \mathbb{R}$, and let us prove that the following statements are equivalent:

1) there exists a strategy $\tau_{\mathbb{P}}$ such that for every strategy $\tau_{\mathbb{C}}$, $\nu_{\mathbb{C}}(\langle \bar\tau \rangle_{[v_0]}) < \alpha$;

2) there exists a $\lambda$-rational strategy profile $\bar\sigma_{-i}$ in the game $G_{\upharpoonright v_0}$ such that for every strategy $\sigma_i$, we have $\mu_i(\langle \bar\sigma \rangle_{v_0}) < \alpha$.

• (1) implies (2). Let $\tau_{\mathbb{P}}$ be such that for every strategy $\tau_{\mathbb{C}}$, $\nu_{\mathbb{C}}(\langle \bar\tau \rangle_{[v_0]}) < \alpha$.

In what follows, any history $h$ compatible with an already defined strategy profile $\bar\sigma_{-i}$ in $G_{\upharpoonright v_0}$ will be decomposed in:
$$h = v_0 h^{(0)} v_1 h^{(1)} \dots h^{(n-1)} v_n h^{(n)},$$
so that there exist plays $\rho^{(0)}, \dots, \rho^{(n-1)}, \eta$ and a history:
$$[v_0]\,[\rho^{(0)}]\,[v_1 h^{(1)} v_2] \dots [v_{n-1} h^{(n-1)} v_n]\,[v_n h^{(n)} \eta]$$
in the game $\mathrm{Abs}_{\lambda i}(G)$ compatible with $\tau_{\mathbb{P}}$: the existence and the uniqueness of that decomposition can be proved by induction. Intuitively, the history $h$ is cut into histories which are prefixes of plays that can be proposed by Prover.

Then, let us define inductively the strategy profile $\bar\sigma_{-i}$ by, for every $h$ such that $\bar\sigma_{-i}$ has been defined on the prefixes of $h$, and such that the last state of $h$ is not controlled by player $i$, $\bar\sigma_{-i}(h) = \eta_0$ with $\eta$ defined from $h$ as above.

Let us prove that $\bar\sigma_{-i}$ is the desired strategy profile.

– The strategy profile $\bar\sigma_{-i}$ is $\lambda$-rational. Let us define $\sigma_i$ so that for every history $hv$ compatible with $\bar\sigma_{-i}$, the play $\langle \bar\sigma_{\upharpoonright hv} \rangle_v$ is $\lambda$-consistent.

For any history:
$$h = v_0 h^{(0)} v_1 h^{(1)} \dots h^{(n-1)} v_n h^{(n)}$$
compatible with $\bar\sigma_{-i}$ and ending in $V_i$, let $\sigma_i(h) = \eta_0$ with $\eta$ corresponding to the decomposition of $h$, so that by induction:
$$\langle \bar\sigma_{\upharpoonright v_0 h^{(0)} v_1 h^{(1)} \dots h^{(n-1)} v_n} \rangle_{v_n} = v_n h^{(n)} \eta.$$
Let now $hv$ be a history in $G_{\upharpoonright v_0}$, and let us show that the play $\langle \bar\sigma_{\upharpoonright hv} \rangle_v$ is $\lambda$-consistent. If we decompose:
$$hv = v_0 h^{(0)} v_1 h^{(1)} \dots h^{(n-1)} v_n h^{(n)}$$
with the same definition of $\eta$ (note that the vertex $v$ is now included in the decomposition), then $\langle \bar\sigma_{\upharpoonright hv} \rangle_v = v\eta$, and by definition of the abstract negotiation game, $v_n h^{(n)} \eta$ is a $\lambda$-consistent play, and therefore so is $v\eta$.

– The strategy profile $\bar\sigma_{-i}$ keeps player $i$'s payoff under the value $\alpha$. Let $\sigma_i$ be a strategy for player $i$, and let $\rho = \langle \bar\sigma \rangle_{v_0}$. We want to prove that $\mu_i(\rho) < \alpha$.

Let us define two finite or infinite sequences $(\rho^{(k)})_{k \in K}$ and $(h^{(k)} v_k)_{k \in K}$, where $K = \{1, \dots, n\}$ or $K = \mathbb{N} \setminus \{0\}$, by setting $h^{(0)}$ equal to the empty history, and for every $k \in K$:
$$[\rho^{(k)}] = \tau_{\mathbb{P}}\left([v_0]\,[\rho^{(0)}] \dots [\rho^{(k-1)}]\,[h^{(k)} v_k]\right)$$
and so that for every $k$, the history $h^{(k)} v_k$ is the shortest prefix of $\rho$ that is not a prefix of $h^{(1)} \dots h^{(k-1)} \rho^{(k-1)}$ (or equivalently, the history $h^{(k)}$ is the longest common prefix of $\rho$ and $h^{(1)} \dots h^{(k-1)} \rho^{(k-1)}$).

Then, the length of the longest common prefix of $h^{(1)} \dots h^{(k-1)} \rho^{(k)}$ and $\rho$ increases with $k$, and the set $K$ is finite if and only if there exists $n$ such that $h^{(1)} \dots h^{(n-1)} \rho^{(n)} = \rho$.

In the infinite case, let:
$$\chi = [v_0]\,[\rho^{(0)}]\,[h^{(1)} v_1] \dots [\rho^{(k)}]\,[h^{(k)} v_k] \dots$$
The play $\chi$ is compatible with $\tau_{\mathbb{P}}$, hence $\nu_{\mathbb{C}}(\chi) < \alpha$, that is to say:
$$\mu_i\left(h^{(1)} h^{(2)} \dots\right) < \alpha,$$
i.e. $\mu_i(\rho) < \alpha$.

In the finite case, let:
$$\chi = [v_0]\,[\rho^{(0)}]\,[h^{(1)} v_1] \dots [\rho^{(n)}]\,\top^\omega.$$
For the same reason, $\nu_{\mathbb{C}}(\chi) < \alpha$, that is to say $\mu_i\left(h^{(1)} \dots h^{(n)} \rho^{(n)}\right) = \mu_i(\rho) < \alpha$.

• (2) implies (1). Let $\bar\sigma_{-i}$ be a $\lambda$-rational strategy profile keeping player $i$'s payoff below $\alpha$. Then, let $\sigma_i$ be a strategy $\lambda$-rationalizing $\bar\sigma_{-i}$. Let us define a strategy $\tau_{\mathbb{P}}$ for Prover in the abstract negotiation game.

Let $H = [v_0]\,[\rho^{(0)}]\,[h^{(1)} v_1]\,[\rho^{(1)}] \dots [h^{(n)} v_n]$ be a history in the abstract game, ending in $S_{\mathbb{P}}$.
Then, we define:
$$\tau_{\mathbb{P}}(H) = \left[\langle \bar\sigma_{\upharpoonright h^{(1)} \dots h^{(n)} v_n} \rangle_{v_n}\right].$$
If $H$ is a history ending in $\top$, then $\tau_{\mathbb{P}}(H) = \top$, and in the same way if $H$ ends in $\bot$, then $\tau_{\mathbb{P}}(H) = \bot$.

Let us show that $\tau_{\mathbb{P}}$ is the strategy we were looking for. Let $\chi$ be a play compatible with $\tau_{\mathbb{P}}$, and let us note that the state $\bot$ does not appear in $\chi$. Then, the play $\chi$ can only have two forms:

– If $\chi = [v_0]\,[\rho^{(0)}]\,[h^{(1)} v_1] \dots [\rho^{(n)}]\,\top^\omega$, then we have:
$$\rho^{(n)} = \langle \bar\sigma_{\upharpoonright h^{(1)} \dots h^{(n)} v_n} \rangle_{v_n},$$
and the history $h^{(1)} \dots h^{(n)} v_n$ in the game $G_{\upharpoonright v_0}$ is compatible with $\bar\sigma_{-i}$. By hypothesis, we have:
$$\mu_i\left(h^{(1)} \dots h^{(n)} \rho^{(n)}\right) < \alpha,$$
hence $\nu_{\mathbb{C}}(\chi) < \alpha$.

– If $\chi = [v_0]\,[\rho^{(0)}] \dots [h^{(n)} v_n]\,[\rho^{(n)}] \dots$, then the play $\rho = h^{(1)} h^{(2)} \dots$ is compatible with $\bar\sigma_{-i}$, and by hypothesis $\mu_i(\rho) < \alpha$, hence $\nu_{\mathbb{C}}(\chi) < \alpha$.

Remark.

Prover has a strategy to avoid $\bot$ if and only if $\lambda$ is satisfiable.

APPENDIX F
PROOF OF THEOREM 3

Theorem 3.

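Let $G_{\upharpoonright v_0}$ be an initialized prefix-independent Borelian game. Let $\lambda$ be a requirement and $i$ a player. Then, we have:
$$\mathrm{val}\left(\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}\right) = \inf_{\bar\sigma_{-i} \in \lambda\mathrm{Rat}(v_0)} \sup_{\sigma_i} \mu_i(\langle \bar\sigma \rangle_{v_0}).$$

Proof: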

First, let us define:
$$A = \left\{ \sup_{\sigma_i} \mu_i(\langle \bar\sigma \rangle_{v_0}) \;\middle|\; \bar\sigma_{-i} \in \lambda\mathrm{Rat}(v_0) \right\}$$
and:
$$B = \left\{ \sup_{\tau_{\mathbb{C}}} \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}) \;\middle|\; \tau_{\mathbb{P}} \right\} \setminus \{+\infty\}.$$
We prove our point if we prove that $A = B$.

• $B \subseteq A$. Let $\tau_{\mathbb{P}}$ be a strategy such that:
$$\sup_{\tau_{\mathbb{C}}} \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}) < +\infty,$$
and let $\bar\sigma$ be the strategy profile defined by $\bar\sigma(\dot{H}) = w$ for every history $H$ compatible with $\tau_{\mathbb{P}}$ (by induction, the localized projection is injective on the histories compatible with $\tau_{\mathbb{P}}$) with $\tau_{\mathbb{P}}(H) = (vw, \cdot)$, and arbitrarily defined on any other histories.

– The strategy profile $\bar\sigma_{-i}$ is $\lambda$-rational, rationalized by the strategy $\sigma_i$. Indeed, let us assume it is not. Then, there exists a history $h = h_0 \dots h_n$ in $G_{\upharpoonright v_0}$ compatible with $\bar\sigma_{-i}$ such that the play $\dot\rho = \langle \bar\sigma_{\upharpoonright h} \rangle_{h_n}$ is not $\lambda$-consistent. Then, let:
$$Hs = (h_0, M_0)\,(h_0 \bar\sigma(h_0), M_0) \dots (h_n, M_n)$$
be the only history in $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$ compatible with $\tau_{\mathbb{P}}$ such that $\dot{H} = h$.

Let $\tau_{\mathbb{C}}$ be a strategy constructing the history $h$, defined by:
$$\tau_{\mathbb{C}}(H_0 \dots H_{k-1}) = H_k$$
for every $k$, and:
$$\tau_{\mathbb{C}}(H'(vw, M)) = (w, M \cup \{w\})$$
for any other history $H'(vw, M)$.

Then, the play $\eta = \langle \bar\tau \rangle_{s_0}$ contains finitely many deviations (Challenger stops the deviations after having drawn the history $h$), and the play $\dot\eta = h_0 \dots h_{n-1} \dot\rho$ is not $\lambda$-consistent, i.e. there exists a dimension $j \in \Pi$ such that:
$$\mu_j(\dot\eta) - \max_{v \in M_n \cap V_j} \lambda(v) < 0,$$
i.e. $\hat\mu_j(\eta) < 0$, and therefore $\nu_{\mathbb{C}}(\rho) = \nu_{\mathbb{C}}(\eta) = +\infty$, which is false by hypothesis.

– Now, let us prove the equality:
$$\sup_{\sigma'_i} \mu_i(\langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}) = \sup_{\tau_{\mathbb{C}}} \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}).$$
For that purpose, let us prove the equality of sets:
$$\{ \mu_i(\langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}) \mid \sigma'_i \} = \{ \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}) \mid \tau_{\mathbb{C}} \}.$$

∗ Let $\tau_{\mathbb{C}}$ be a strategy for Challenger, and let $\rho = \langle \bar\tau \rangle_{s_0}$. Since $\nu_{\mathbb{C}}(\rho) \neq +\infty$ by hypothesis, we have $\nu_{\mathbb{C}}(\rho) = \hat\mu_\star(\rho) = \mu_i(\dot\rho)$, which is an element of the left-hand set.
∗ Conversely, if $\sigma'_i$ is a strategy for player $i$ and if $\eta = \langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}$, let $\tau_{\mathbb{C}}$ be a strategy such that for every $k$:
$$\tau_{\mathbb{C}}\left((\eta_0, \cdot)\,(\eta_0\,\cdot, \cdot) \dots (\eta_k\,\cdot, \cdot)\right) = (\eta_{k+1}, \cdot),$$
i.e. a strategy forcing $\eta$. Then, since $\nu_{\mathbb{C}}(\rho) \neq +\infty$ by hypothesis on $\tau_{\mathbb{P}}$, we have $\mu_i(\eta) = \nu_{\mathbb{C}}(\rho)$, which is an element of the right-hand set.

• $A \subseteq B$. Let $\bar\sigma_{-i}$ be a $\lambda$-rational strategy profile from $v_0$, rationalized by the strategy $\sigma_i$; let us define a strategy $\tau_{\mathbb{P}}$ by, for every history $H$ and for every $v \in V$:
$$\tau_{\mathbb{P}}(H(v, \cdot)) = \left(v\,\bar\sigma(\dot{H}v), \cdot\right).$$
Let us prove the equality:
$$\sup_{\sigma'_i} \mu_i(\langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}) = \sup_{\tau_{\mathbb{C}}} \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}).$$
For that purpose, let us prove the equality of sets:
$$\{ \mu_i(\langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}) \mid \sigma'_i \} = \{ \nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}) \mid \tau_{\mathbb{C}} \}.$$

– Let $\tau_{\mathbb{C}}$ be a strategy for Challenger, and let $\rho = \langle \bar\tau \rangle_{s_0}$. If $\nu_{\mathbb{C}}(\rho) = +\infty$, then $\dot\rho$ is compatible with $\bar\sigma$ and not $\lambda$-consistent after finitely many steps, which is impossible. Therefore, $\nu_{\mathbb{C}}(\langle \bar\tau \rangle_{s_0}) \neq +\infty$, and as a consequence we have $\nu_{\mathbb{C}}(\rho) = \hat\mu_\star(\rho) = \mu_i(\dot\rho)$, which is an element of the left-hand set.

– Conversely, if $\sigma'_i$ is a strategy for player $i$ and if $\eta = \langle \bar\sigma_{-i}, \sigma'_i \rangle_{v_0}$, let $\tau_{\mathbb{C}}$ be a strategy such that for all $k$:
$$\tau_{\mathbb{C}}\left((\eta_0, \cdot)\,(\eta_0\,\cdot, \cdot) \dots (\eta_k\,\cdot, \cdot)\right) = (\eta_{k+1}, \cdot),$$
i.e. a strategy forcing $\eta$. Then, either $\nu_{\mathbb{C}}(\rho) = +\infty$, and therefore $\eta$ is not $\lambda$-consistent, and is compatible with $\bar\sigma$ after finitely many steps, which is impossible; or $\mu_i(\eta) = \nu_{\mathbb{C}}(\rho)$, which is an element of the right-hand set.

APPENDIX G
PROOF OF LEMMA 4

Lemma 4.

Let $G$ be a mean-payoff-inf game, let $i$ be a player, let $\lambda$ be a requirement and let $\mathrm{Conc}_{\lambda i}(G)$ be the corresponding concrete negotiation game. There exists a memoryless strategy $\tau_{\mathbb{C}}$ such that for each state $s$:
$$\inf_{\tau_{\mathbb{P}}} \nu_{\mathbb{C}}(\langle \bar\tau \rangle_s) = \mathrm{val}\left(\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s}\right),$$
i.e. $\tau_{\mathbb{C}}$ is optimal for Challenger from every state.

Proof: The structure of that proof is inspired by the proof of Lemma 14 in [19].

Let $\alpha \in \mathbb{R}$, and let $\Phi$ be the set of the plays $\rho$ in $\mathrm{Conc}_{\lambda i}(G)$ such that:

• $\liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \left(-\hat\pi_\star(\rho_k \rho_{k+1})\right) \ge -\alpha$;

• and either:

– $\rho$ contains infinitely many deviations;

– or for each $j \in \Pi$, $\hat\mu_j(\rho) \ge 0$.

Note that the set of the plays $\rho$ such that $\mu_i(\dot\rho) \le \alpha$ could be defined almost the same way, but with a limit superior instead of the limit inferior.

By [12], if Challenger can falsify the objective $\Phi$, he can falsify it with a memoryless strategy, if $\Phi$ is prefix-independent and convex.

Convex objectives are defined as follows: the objective $\Phi$ is convex if for all $\rho, \eta \in \Phi$ and for any decomposition:
$$\rho_0 \dots \rho_{k_1} \dots \rho_{k_2} \dots$$
and:
$$\eta_0 \dots \eta_{\ell_1} \dots \eta_{\ell_2} \dots$$
such that:
$$\chi = \rho_0 \dots \rho_{k_1}\, \eta_0 \dots \eta_{\ell_1}\, \rho_{k_1+1} \dots \rho_{k_2}\, \eta_{\ell_1+1} \dots$$
is a play, we have $\chi \in \Phi$. Let then $\rho$, $\eta$ be two such plays with such a decomposition, and let us prove that $\chi \in \Phi$.

Let us write $\Phi = \Psi \cap (X \cup \Xi)$, where:

• $\Psi$ is the set of the plays $\rho$ such that:
$$\liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \left(-\hat\pi_\star(\rho_k \rho_{k+1})\right) \ge -\alpha;$$

• $X$ is the set of the plays containing infinitely many deviations;

• $\Xi$ is the set of the plays $\rho$ such that for each $j \in \Pi$, $\hat\mu_j(\rho) \ge 0$.

As shown in [19], a mean-payoff-inf objective is convex: therefore, we can already say that $\chi \in \Psi$. Let us now prove that $\chi \in X \cup \Xi$.

• If $\rho \in X$ or $\eta \in X$. Then, $\chi$ contains the deviations of $\rho$ and $\eta$, hence $\chi \in X$.

• If $\rho, \eta \in \Xi$.
Then, since mean-payoff-inf objectives are convex, $\chi \in \Xi$.

In both cases, $\chi \in X \cup \Xi$, so $\chi \in \Phi$: the objective $\Phi$ is convex.

Therefore, there exists a memoryless strategy $\tau_{\mathbb{C}}$ such that for every strategy $\tau_{\mathbb{P}}$, for each state $s$ from which Challenger has some strategy to falsify the objective $\Phi$, we have $\langle \bar\tau \rangle_s \notin \Phi$.

Let $s$ be a state from which Challenger can enforce an outcome $\nu_{\mathbb{C}}$ greater than $\alpha$. Then, since the limit inferior of a sequence is always less than or equal to its limit superior, Challenger can, from $s$, falsify the objective $\Phi$. Therefore, by definition of $\tau_{\mathbb{C}}$, for every strategy $\tau_{\mathbb{P}}$, we have $\langle \bar\tau \rangle_s \notin \Phi$. Let us prove that $\nu_{\mathbb{C}}(\langle \bar\tau \rangle_s) > \alpha$.

In other words, let us prove that for every infinite path $\rho$ from $s$ in the graph $\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]$, we have $\nu_{\mathbb{C}}(\rho) > \alpha$. Since $\rho \notin \Phi$, we have either $\rho \notin X \cup \Xi$ or $\rho \notin \Psi$. In the first case, we have $\nu_{\mathbb{C}}(\rho) = +\infty$, which ends the proof. In the second case, we have:
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \hat\pi_\star(\rho_k \rho_{k+1}) > \alpha.$$
We want to prove that $\nu_{\mathbb{C}}(\rho) > \alpha$, that is, since we assume $\rho \in X \cup \Xi$:
$$\hat\mu_\star(\rho) = \liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \hat\pi_\star(\rho_k \rho_{k+1}) > \alpha.$$
Here, the play $\rho$ is an infinite path in the graph $\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]$: by the description of the possible outcomes in a mean-payoff game given in [9], the mean-payoff-inf $\hat\mu_\star(\rho)$ is then at least the smallest mean-payoff-inf $\hat\mu_\star$ obtained by looping on a simple cycle $c$ of that graph accessible from the state $s$: intuitively, a play can be seen as a combination of those cycles. That is to say:
$$\hat\mu_\star(\rho) \ge \min_{\substack{c \in \mathrm{SC}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s}} \hat\mu_\star(c^\omega).$$
For each such cycle $c$, since $c^\omega$ is a play compatible with $\tau_{\mathbb{C}}$, we have:
$$\limsup_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \hat\pi_\star(c_k c_{k+1}) > \alpha,$$
where the indices are taken in $\mathbb{Z}/|c|\mathbb{Z}$, i.e.:
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \hat\pi_\star(c_k c_{k+1}) > \alpha,$$
and therefore:
$$\liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \hat\pi_\star(c_k c_{k+1}) > \alpha,$$
that is to say $\hat\mu_\star(c^\omega) > \alpha$, hence $\hat\mu_\star(\rho) > \alpha$.

APPENDIX H
PROOF OF LEMMA 5

Lemma 5.

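Let $G_{\upharpoonright v_0}$ be an initialized mean-payoff-inf game, and let $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$ be its concrete negotiation game for some $\lambda$ and some $i$. Then, the value of the game $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$ is given by the formula:
$$\max_{\tau_{\mathbb{C}} \in \mathrm{ML}_{\mathbb{C}}\left(\mathrm{Conc}_{\lambda i}(G)\right)} \; \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}(K),$$
where $\mathrm{opt}(K)$ is the minimal value $\nu_{\mathbb{C}}(\rho)$ for $\rho$ among the infinite paths in $K$.

• If $K$ contains a deviation, then Prover can simply choose the simple cycle of $K$ that minimizes player $i$'s payoff:
$$\mathrm{opt}(K) = \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega).$$

• If $K$ does not contain a deviation, then Prover must choose a combination of the simple cycles of $K$ that minimizes player $i$'s payoff while keeping the non-main dimensions above 0:
$$\mathrm{opt}(K) = \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega).$$

Proof: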

By Lemma 4, there exists a memoryless strategy $\tau_{\mathbb{C}}$ which is optimal for Challenger among all his possible strategies.

It follows from Theorem 3 that the highest value player $i$ can get against a hostile $\lambda$-rational environment is the minimal payoff of Challenger in a path in the graph $\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]$. For any such path $\rho$, there exists a strongly connected component $K$ of $\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]$ accessible from $s_0$ such that after a finite number of steps, $\rho$ is a play in $K$. The least payoff of Challenger in such a path, for a given $K$, is $\mathrm{opt}(K)$; let us prove that it is given by the desired formula.

There are, then, two cases to distinguish:

• If there is at least a deviation in $K$. Then, a play in $K$ can contain infinitely many deviations. Therefore, the outcomes $\nu_{\mathbb{C}}(\rho)$ of plays in $K$ are exactly the mean-payoff-infs $\hat\mu_\star(\rho)$ of plays in $K$, and possibly $+\infty$; and in particular, the lowest outcome Prover can get in $K$ is the quantity:
$$\min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega),$$
the least value of a simple cycle in $K$.

• If there is no deviation in $K$. Let us first introduce a notation: for any finite set $D$ and any set $X \subseteq \mathbb{R}^D$, $X^{\llcorner}$ denotes the set:
$$X^{\llcorner} = \left\{ \left(\min_{\bar y \in Y} y_d\right)_{d \in D} \;\middle|\; Y \subseteq X \text{ finite} \right\}.$$
For example, in $\mathbb{R}^2$, if $X$ is the blue area below, then $X^{\llcorner}$ is the union of the blue area and the gray area in Figure 14.

[Fig. 14. An example for the operator $\cdot^{\llcorner}$ (axes $x$ and $y$).]

Let us already note that for all $X \subseteq \mathbb{R}^{\Pi \cup \{\star\}}$,
$$\begin{aligned}
\min{}^{\star} X^{\llcorner} &= \min \left\{ x_\star \;\middle|\; \bar x \in X^{\llcorner},\ \forall j \in \Pi,\ x_j \ge 0 \right\} \\
&= \min \left\{ \min_{\bar y \in Y} y_\star \;\middle|\; Y \subseteq X \text{ finite},\ \forall \bar y \in Y,\ \forall j \in \Pi,\ y_j \ge 0 \right\} \\
&= \min \left\{ y_\star \;\middle|\; \bar y \in X,\ \forall j \in \Pi,\ y_j \ge 0 \right\} \\
&= \min{}^{\star} X.
\end{aligned}$$
Then, it has been proved in [9] that the set of possible values of $\hat\mu(\rho)$ for all plays $\rho$ in $K$ is exactly the set:
$$X = \left( \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) \right)^{\llcorner}.$$
Since all the plays in $K$ contain finitely many deviations (actually none), for every $\bar x = \hat\mu(\rho) \in X$, we have $\nu_{\mathbb{C}}(\rho) = +\infty$ if and only if there exists $j \in \Pi$ such that $x_j < 0$. Then, the lowest outcome Prover can get in $K$ is:
$$\min \{ x_\star \mid \bar x \in X,\ \forall j \in \Pi,\ x_j \ge 0 \},$$
that is to say:
$$\min{}^{\star} \left( \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) \right)^{\llcorner},$$
i.e. $\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega)$.

Theorem 3 then yields the desired formula.

APPENDIX I
PROOF OF PROPOSITION 2

Proposition 2.

In mean-payoff-inf games, the negotiation function is Scott-continuous.

Proof:

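Let $(\lambda_n)_n$ be a non-decreasing sequence of requirements on a mean-payoff-inf game $G$, and let $\lambda = \sup_n \lambda_n$. We want to prove that $\mathrm{nego}(\lambda) = \sup_n \mathrm{nego}(\lambda_n)$. Since the negotiation function is monotone, we already have $\mathrm{nego}(\lambda) \ge \sup_n \mathrm{nego}(\lambda_n)$. Let us prove that $\mathrm{nego}(\lambda) \le \sup_n \mathrm{nego}(\lambda_n)$.

Let $\delta > 0$: we want to find $n$ such that $\mathrm{nego}(\lambda_n)(v) \ge \mathrm{nego}(\lambda)(v) - \delta$ for each $v \in V$.

Let:
$$\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0} = (\{\mathbb{P}, \mathbb{C}\}, S, (S_{\mathbb{P}}, S_{\mathbb{C}}), \Delta, \nu)_{\upharpoonright s_0}$$
be the concrete negotiation game of $G$ for $\lambda$ and player $i$ controlling $v$, and let:
$$\mathrm{Conc}_{\lambda_n i}(G)_{\upharpoonright s_0} = (\{\mathbb{P}, \mathbb{C}\}, S, (S_{\mathbb{P}}, S_{\mathbb{C}}), \Delta, \nu')_{\upharpoonright s_0}$$
be the concrete negotiation game of $G$ for some requirement $\lambda_n$ in $v$. Let us note that both have the same underlying graph, and that the only difference lies in the weight functions $\hat\pi$ and $\hat\pi'$, on the non-main dimensions.

By Lemma 5, we have:
$$\mathrm{nego}(\lambda)(v) = \max_{\tau_{\mathbb{C}} \in \mathrm{ML}_{\mathbb{C}}\left(\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}\right)} \; \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}(K)$$
with:
$$\mathrm{opt}(K) = \begin{cases} \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega) & \text{if } K \text{ contains a deviation,} \\ \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) & \text{otherwise,} \end{cases}$$
and identically:
$$\mathrm{nego}(\lambda_n)(v) = \max_{\tau_{\mathbb{C}} \in \mathrm{ML}_{\mathbb{C}}\left(\mathrm{Conc}_{\lambda_n i}(G)_{\upharpoonright s_0}\right)} \; \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda_n i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}'(K)$$
with:
$$\mathrm{opt}'(K) = \begin{cases} \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega) & \text{if } K \text{ contains a deviation,} \\ \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) & \text{otherwise.} \end{cases}$$
Let $\tau_{\mathbb{C}}$ be a memoryless strategy for Challenger in the game $\mathrm{Conc}_{\lambda i}(G)_{\upharpoonright s_0}$; it can also be considered as a memoryless strategy in the game $\mathrm{Conc}_{\lambda_n i}(G)_{\upharpoonright s_0}$.

Let us now define:
$$\gamma_n = \sup_{v \in V} (\lambda(v) - \lambda_n(v)).$$
Then, the sequence $(\gamma_n)_n$ is non-increasing and converges to 0. Moreover, since $\lambda_n \le \lambda$ and the weights on the non-main dimensions decrease when the requirement increases, for each transition $st \in \Delta$ and each $j \in \Pi$, we have:
$$\hat\pi'_j(st) \in [\hat\pi_j(st),\ \hat\pi_j(st) + \gamma_n].$$
Let:
$$\Gamma_n = \left\{ \bar x \in \mathbb{R}^{\Pi \cup \{\star\}} \;\middle|\; x_\star = 0 \text{ and } \forall j \in \Pi,\ x_j \in [0, \gamma_n] \right\}.$$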
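Then, let $K$ be a strongly connected component of the graph $\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]$, without deviation, accessible from $s_0$; we have:
$$\operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) \subseteq \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) + \Gamma_n.$$
Let $R = \left\{ \bar x \in \mathbb{R}^{\Pi \cup \{\star\}} \mid \forall j \in \Pi,\ x_j \ge 0 \right\}$.

• If $\operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) \cap R = \emptyset$, since $\operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega)$ and $R$ are closed sets, if $\gamma_n$ is small enough, we have $\operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) \cap R = \emptyset$. Therefore, if:
$$\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) = +\infty,$$
then, for $n$ great enough:
$$\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) = +\infty.$$

• Otherwise, we have:
$$\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) \ge \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) - \gamma_n \max_{c, d \in \mathrm{SC}(K)} \sum_{\substack{j \in \Pi, \\ \hat\mu_j(c^\omega) > \hat\mu_j(d^\omega)}} \frac{\hat\mu_\star(c^\omega) - \hat\mu_\star(d^\omega)}{\hat\mu_j(c^\omega) - \hat\mu_j(d^\omega)}$$
and if $\gamma_n$ is small enough, we have:
$$\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) \ge \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) - \delta.$$

In both cases, we find that there exists $\gamma_n$ small enough, i.e. $n$ great enough, to ensure:
$$\min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu'(c^\omega) \ge \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) - \delta.$$
We can find such $n$ for each strongly connected component $K$ without deviation, and there exists a finite number of such components. Moreover, when $K$ is a strongly connected component with a deviation, the quantity:
$$\min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega)$$
is the same in $\mathrm{Conc}_{\lambda i}(G)$ and in $\mathrm{Conc}_{\lambda_n i}(G)$. Therefore, there exists $n \in \mathbb{N}$ such that:
$$\min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda_n i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}'(K) \ge \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}(K) - \delta.$$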

We find such $n$ for every memoryless strategy $\tau_{\mathbb{C}}$, and there exists a finite number of such strategies. Therefore, there exists $n \in \mathbb{N}$ such that:
$$\mathrm{nego}(\lambda_n)(v) \ge \mathrm{nego}(\lambda)(v) - \delta.$$
Finally, since there are finitely many states $v \in V$, we can conclude that there exists $n \in \mathbb{N}$ such that for each $v \in V$, we have:
$$\mathrm{nego}(\lambda_n)(v) \ge \mathrm{nego}(\lambda)(v) - \delta.$$
The negotiation function is Scott-continuous.

APPENDIX J
PROOF OF THEOREM 4

Theorem 4.

Let $G_{\upharpoonright v_0}$ be an initialized mean-payoff-inf game. Let us assimilate any requirement $\lambda$ on $G$ with finite values to the tuple $\bar\lambda = (\lambda(v))_{v \in V}$, element of the vector space of finite dimension $\mathbb{R}^V$. Then, for each player $i$ and each vertex $v_0 \in V_i$, the quantity $\mathrm{nego}(\lambda)(v_0)$ is a piecewise linear function of $\bar\lambda$, which can be effectively expressed and whose fixed points are computable.

Proof: By Lemma 5, we have the formula:
$$\mathrm{nego}(\lambda)(v_0) = \max_{\tau_{\mathbb{C}} \in \mathrm{ML}_{\mathbb{C}}\left(\mathrm{Conc}_{\lambda i}(G)\right)} \; \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } s_0}} \mathrm{opt}(K).$$
Let $\tau_{\mathbb{C}}$ be a memoryless strategy realizing the maximum above, and let $K$ be a strongly connected component realizing the minimum above. Let us prove that the quantity:
$$\mathrm{opt}(K) = \begin{cases} \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega) & \text{if } K \text{ contains a deviation,} \\ \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega) & \text{otherwise} \end{cases}$$
is the desired piecewise linear function of $\bar\lambda$.

When $K$ contains a deviation, the quantity:
$$\min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega)$$
is independent of $\lambda$, and the result is then immediate. Let us assume that $K$ does not contain any deviation, and let us prove that the quantity:
$$f(\lambda) = \min{}^{\star} \operatorname*{Conv}_{c \in \mathrm{SC}(K)} \hat\mu(c^\omega)$$
is a piecewise linear function of $\bar\lambda$.

Let $M$ be the common memory of the states of $K$ (it is common since $K$ does not contain deviations). We know that for each $j \in \Pi$ and for every cycle $c \in \mathrm{SC}(K)$, we have:
$$\hat\mu_j(c^\omega) = \mu_j(\dot{c}^\omega) - \max_{v \in V_j \cap M} \lambda(v).$$
Let $C = \{ \dot{c} \mid c \in \mathrm{SC}(K) \}$. Since there is no deviation in $K$, any cycle in $C$ is a simple cycle of $G$. Then, the quantity $f(\lambda)$ is the minimal $x_i$ for $\bar x$ in the set:
$$X = \operatorname*{Conv}_{c \in C} \mu(c^\omega) \cap \bigcap_{v \in M} \{ \bar x \mid x_j \ge \lambda(v) \text{ with } v \in V_j \}.$$
The set $X$ is a polyhedron: therefore, there exists a vertex $\bar x$ of that polyhedron which minimizes $x_i$ for $\bar x \in X$. That vertex is the intersection between a face of the greater polyhedron $P = \operatorname*{Conv}_{c \in C} \mu(c^\omega)$, and some of the hyperplanes $H_v$ (possibly zero), defined as the hyperplanes of equation $x_j = \lambda(v)$ for $j \in \Pi$ controlling $v$, such that $\lambda(v) = \max_{w \in M \cap V_j} \lambda(w)$.

Example. With three cycles and two players against player $i$, each controlling one vertex $v$ such that $\lambda(v) = 0$, the vertex $\bar x$ is the red point in Figure 15 and Figure 16.

[Fig. 15. The intersection between a 0-dimensional face and zero hyperplanes (axes $i$, $j$, $j'$).]

[Fig. 16. The intersection between a 2-dimensional face and two hyperplanes (axes $i$, $j$, $j'$).]

The set of vertices of the polyhedron $X$ is included in the finite set:
$$Y = \left\{ \bar y \in \mathbb{R}^{\Pi \cup \{\star\}} \;\middle|\; \begin{array}{l} \exists W \subseteq M,\ \exists D \subseteq C,\ \operatorname*{Conv}_{c \in D} \mu(c^\omega) \cap \bigcap_{w \in W} H_w = \{\bar y\} \\ \text{and } \forall j,\ \forall v \in M \cap V_j,\ y_j \ge \lambda(v) \end{array} \right\},$$
where the tuple $\bar y$ corresponding to the sets $W$ and $D$ is the intersection of the face of $P$ delimited by the values of the cycles of $D$, and the hyperplanes $H_v$ for $v \in W$. That set $Y$ is itself included in $X$.

We have, therefore:
$$f(\lambda) = \min_{\bar y \in Y} y_i.$$
Let now $\bar y \in Y$, and let $D \subseteq C$ and $W \subseteq M$ be the corresponding sets.

Let us choose $D$ and $W$ minimal, so that every player $j \in \Pi$ controls at most one state $w \in W$, and so that there exists only one decomposition:
$$\bar y = \sum_{c \in D} \alpha_c\, \mu(c^\omega)$$
with, for all $c$, $0 < \alpha_c < 1$, and $\sum_c \alpha_c = 1$.

Furthermore, $\bar y$ is the only such solution of the system of equations:
$$\forall j \in \Pi,\ \forall w \in W \cap V_j,\ y_j = \lambda(w).$$
Therefore, $\bar\alpha = (\alpha_c)_{c \in D}$ is the only solution of the system:
$$\begin{cases} \sum_{c \in D} \alpha_c = 1 \\ \forall j \in \Pi,\ \forall w \in W \cap V_j,\ \sum_{c \in D} \alpha_c\, \mu_j(c^\omega) = \lambda(w) \\ \forall c \in D,\ \alpha_c > 0. \end{cases}$$
Then, if $\oplus$ is a symbol and $A_{WD}$ is the matrix:
$$A_{WD} = \left( \begin{cases} 1 & \text{if } w = \oplus \\ \mu_j(c^\omega) & \text{else, with } w \in V_j \end{cases} \right)_{w \in W \cup \{\oplus\},\ c \in D}$$
then $A_{WD}$ is invertible and:
$$\bar\alpha = A_{WD}^{-1} \left( \begin{cases} 1 & \text{if } w = \oplus \\ \lambda(w) & \text{otherwise} \end{cases} \right)_{w \in W \cup \{\oplus\}},$$
with, for all $c \in D$, $\alpha_c > 0$.

Let us write:
$$\bar\beta^{\lambda}_W = \left( \begin{cases} 1 & \text{if } w = \oplus \\ \lambda(w) & \text{otherwise} \end{cases} \right)_{w \in W \cup \{\oplus\}}.$$
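As a small numeric illustration of the system on $\bar\alpha$ above, the following sketch solves it by hand in the simplest non-trivial case: two cycles $c_1, c_2$ and a single constrained vertex $w$ controlled by a player $j$. The function name `vertex_payoff` and all payoff values are hypothetical and chosen only for the example; they do not come from any game in the paper. The point is only that the weights $\alpha_c$, and therefore $y_i$, depend affinely on $\lambda(w)$.

```python
from fractions import Fraction


def vertex_payoff(mu_i, mu_j, lam_w):
    """Solve  a1 + a2 = 1,  a1*mu_j[0] + a2*mu_j[1] = lam_w  exactly.

    mu_i, mu_j: payoffs of players i and j on the two cycles (made-up data).
    lam_w:      requirement lambda(w) on the vertex w controlled by player j.
    Returns ((a1, a2), y_i) with y_i = a1*mu_i[0] + a2*mu_i[1].
    """
    # Determinant of the 2x2 matrix [[1, 1], [mu_j[0], mu_j[1]]].
    det = mu_j[1] - mu_j[0]
    if det == 0:
        raise ValueError("matrix is not invertible: no unique intersection point")
    a1 = Fraction(mu_j[1] - lam_w) / det
    a2 = Fraction(lam_w - mu_j[0]) / det
    # The point must be a strict convex combination of the two cycle values.
    if a1 <= 0 or a2 <= 0:
        raise ValueError("solution is not a strict convex combination")
    y_i = a1 * mu_i[0] + a2 * mu_i[1]
    return (a1, a2), y_i


# Hypothetical data: cycle c1 gives (i, j) the payoffs (3, 0), cycle c2 gives (0, 2).
for lam in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    alpha, y = vertex_payoff((3, 0), (0, 2), lam)
    print(lam, alpha, y)
```

With this made-up data, $y_i = 3 - \tfrac{3}{2}\lambda(w)$ on the range where both weights stay positive, which is the kind of affine piece the proof assembles.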
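We have, thus, $\bar\alpha = A_{WD}^{-1} \bar\beta^{\lambda}_W$.

Let us write, for each player $j$, $\bar\gamma_{jD} = (\mu_j(c^\omega))_{c \in D}$. Then, we can write:
$$y_i = \sum_c \alpha_c\, \mu_i(c^\omega) = {}^t\bar\gamma_{iD}\, \bar\alpha = {}^t\bar\gamma_{iD}\, A_{WD}^{-1}\, \bar\beta^{\lambda}_W.$$
Finally, if we write:
$$B_W = \left( \begin{cases} 1 & \text{if } w = v \\ 0 & \text{otherwise} \end{cases} \right)_{w \in W \cup \{\oplus\},\ v \in V}$$
and:
$$\bar\delta_W = \left( \begin{cases} 1 & \text{if } w = \oplus \\ 0 & \text{otherwise} \end{cases} \right)_{w \in W \cup \{\oplus\}}$$
we have $\bar\beta^{\lambda}_W = B_W \bar\lambda + \bar\delta_W$, and therefore:
$$y_i = {}^t\bar\gamma_{iD}\, A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W).$$
Conversely, the tuple $\bar y$ defined by, for each $j \in \Pi$,
$$y_j = {}^t\bar\gamma_{jD}\, A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W)$$
for given $W \subseteq M$ and $D \subseteq C$, is an element of the set $Y$ if and only if:

• the intersection $\operatorname*{Conv}_{c \in D} \mu(c^\omega) \cap \bigcap_{w \in W} H_w$ is a singleton, i.e. the matrix $A_{WD}$ is invertible (otherwise the matrix $A_{WD}^{-1}$ is not defined);

• $\bar y \in \operatorname*{Conv}_{c \in D} \mu(c^\omega)$, i.e. the tuple $\bar\alpha = A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W)$ has only non-negative coordinates (actually positive if $D$ is minimal);

• for each player $j$, for each vertex $v \in M \cap V_j$, we have $y_j \ge \lambda(v)$, i.e. ${}^t\bar\gamma_{jD}\, A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W) \ge \lambda(v)$.

Hence the formula:
$$\mathrm{nego}(\lambda)(v_0) = \max_{\tau_{\mathbb{C}} \in \mathrm{ML}\left(\mathrm{Conc}_{\lambda i}(G)\right)} \; \min_{\substack{K \in \mathrm{SConn}\left(\mathrm{Conc}_{\lambda i}(G)[\tau_{\mathbb{C}}]\right) \\ \text{accessible from } (v_0, \{v_0\})}} \begin{cases} \min_{c \in \mathrm{SC}(K)} \hat\mu_\star(c^\omega) & \text{if } K \text{ contains a deviation,} \\ \min S_K & \text{otherwise,} \end{cases}$$
where $S_K$ is the set of real numbers of the form:
$${}^t\bar\gamma_{iD}\, A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W)$$
such that:

• $W \subseteq M_K$;

• $D \subseteq C_K$;

• the matrix $A_{WD}$ is invertible;

• the tuple $A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W)$ has only positive coordinates;

• and for each $j \in \Pi$, for each $v \in M_K \cap V_j$, we have ${}^t\bar\gamma_{jD}\, A_{WD}^{-1} (B_W \bar\lambda + \bar\delta_W) \ge \lambda(v)$.

This is, indeed, the expression of a piecewise linear function.

APPENDIX K
SOME EXAMPLES OF NEGOTIATION SEQUENCES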

We gather in this section some examples that may interest the reader who wants a fuller view of the behaviour of the negotiation function on mean-payoff-inf games. For all of them, we computed the negotiation sequence, as defined in Section IV-E. For some of them, we only give the negotiation sequence; for the most important ones, we give a complete explanation of how we computed it, using the abstract negotiation game, as defined in Appendix E.

Example. Let us take again the game of Example 2, and let us give (in red) the values of $\lambda_1 = \mathrm{nego}(\lambda_0)$, which correspond to the antagonistic values.

[Figure: the game of Example 2, on the states $a$, $b$, $c$, $d$, with the requirement $\lambda_1$, equal to $(1, 2, 1, 2)$ on $(a, b, c, d)$.]

At the second step, let us execute the abstract game on the state $a$, with the requirement $\lambda_1$: whatever Prover proposes at first, Challenger has the possibility to deviate and to reach the state $b$. Then, Prover has to propose a $\lambda_1$-consistent play from the state $b$, i.e. a play in which the player controlling $b$ gets at least the payoff $2$: such a play necessarily ends in the state $d$, and gives the player controlling $a$ the payoff $2$.

The other states keep the same values.

[Figure: the same game with the requirement $\lambda_2$, equal to $(2, 2, 1, 2)$ on $(a, b, c, d)$.]

But then, at the third step, from the state $b$: whatever Prover proposes at first, Challenger can deviate to reach the state $a$. Then, Prover has to propose a $\lambda_2$-consistent play from $a$, i.e. a play in which the player controlling $a$ gets at least the payoff $2$: such a play necessarily ends in the state $d$, i.e. after possibly some prefix, Prover proposes the play $abd^\omega$. But then, Challenger can always deviate to go back to the state $a$; and the play which is thus created is $(ab)^\omega$, which gives the player controlling $b$ the payoff $3$.

[Figure: the same game with the requirement $\lambda_3$, equal to $(2, 3, 1, 2)$ on $(a, b, c, d)$.]

Finally, from the states $a$ and $b$, there exists no $\lambda_3$-consistent play, and therefore no $\lambda_3$-rational strategy profile.

[Figure: the same game with the requirement $\lambda_4$, for which $\lambda_4(a) = \lambda_4(b) = +\infty$.]

Hence, for all $n \geq 4$, $\lambda_n = \lambda_4$.

Example. In this example, we show a game that can be turned into a family of games, where the negotiation function needs as many steps as there are states to reach its limit: when the requirement changes in some state, it opens new possibilities from the neighbour states, and so on.

[Figures: the game, on the states $a$, $b$, $c$, $d$, $e$, with the successive requirements of the negotiation sequence, given on $(a, b, c, d, e)$: $(1, 0, 0, 0, 2)$, $(1, 1, 0, 2, 2)$, $(1, 1, 2, 2, 2)$, $(1, 2, 2, 2, 2)$ and $(2, 2, 2, 2, 2)$.]

The last of these requirements is a fixed point of the negotiation function.

Example. In all the previous examples, all the games whose underlying graphs were strongly connected contained SPEs. Here is an example of a game with a strongly connected underlying graph that does not contain SPEs.

[Figures: the game, on the states $a$, $b$, $c$, $d$, $e$, $f$, with the successive requirements of the negotiation sequence; the last one is equal to $+\infty$ in every state.]

Example. This example shows how a new requirement can emerge from the combination of several cycles.

Let $G$ be the following game:

[Figure: the game $G$, on the states $a$, $b$, $c$, $d$, $e$, $f$, $g$, with the requirement $\lambda_1$.]

At the first step, the requirement $\lambda_1$ captures the antagonistic values.

Then, from the state $c$, if the player controlling $c$ forces the access to the state $b$, then the player controlling $b$ must get at least their requirement in $b$: the worst play that can be proposed to the player controlling $c$ is then $(babc)^\omega$, whose payoff for that player becomes the new requirement in $c$.

From the state $f$, if the player controlling $f$ forces the access to the state $g$, then the worst play that can be proposed to them is $g^\omega$.

[Figure: the game $G$ with the updated requirement.]

Then, from the state $d$, if the player controlling $d$ forces the access to the state $c$, then the player controlling $c$ must get at least their requirement in $c$: the worst play that can be proposed to the player controlling $d$ is then $(cccd)^\omega$, whose payoff for that player becomes the new requirement in $d$.

At the same time, from the state $e$, the player controlling $e$ can now force the access to the state $f$: then, the worst play that can be proposed to them is $fg^\omega$.

[Figure: the game $G$ with the updated requirement.]


But then, from the state $c$, the player controlling $c$ can now force the access to the state $e$: then, the worst play that can be proposed to them is $efg^\omega$.

[Figure: the game $G$ with the updated requirement.]

And finally, from that point, if from the state $d$ the player controlling $d$ forces the access to the state $c$, then the player controlling $c$ must have at least their new requirement in $c$; and therefore, the worst play that can be proposed to the player controlling $d$ is now $(ccd)^\omega$, whose payoff for that player becomes the new requirement in $d$.

[Figure: the game $G$ with the final requirement.]

The requirement obtained at this last step is a fixed point of the negotiation function.

Example. We give here the details of the example given in Figure 15, in which the negotiation sequence is not stationary, and we provide a similar example with only two players.

[Figure: the game of Figure 15, on the states $a$, $b$, $c$, $d$, $e$, $f$.]

For each edge, the weights are given in a fixed order, one for each of the three players. Since all the weights of the player controlling $d$ and $f$ are equal to $0$, for all $n > 0$, we have $\lambda_n(d) = \lambda_n(f) = 0$. It follows that for all $n > 0$, we also have $\lambda_n(c) = \lambda_n(e) = 0$. Moreover, by symmetry of the game, we always have $\lambda_n(a) = \lambda_n(b)$. Therefore, to compute the negotiation sequence, it suffices to compute $\lambda_{n+1}(a)$ as a function of $\lambda_n(b)$, knowing that $\lambda_1(a) = \lambda_1(b) = 1$, and therefore that for all $n > 0$, $\lambda_n(a) = \lambda_n(b) \geq 1$.

From $a$, the worst play that the player controlling $b$ could propose to the player controlling $a$ would be a combination of the cycles $cd$ and $d$. But then, she will deviate to go to $b$, from which, if he proposes plays in the strongly connected component containing $c$ and $d$, she will always deviate and generate the play $(ab)^\omega$, and then get the corresponding payoff. Then, in order to give her a lower payoff, he has to go to the state $e$. Since she does not control any state in that strongly connected component, the play he proposes will be accepted: he will then propose the worst possible combination of the cycles $ef$ and $f$ for her, such that he gets at least his requirement $\lambda_n(b)$. The payoff $\lambda_{n+1}(a)$ is then the minimal value satisfying the system:
\[
\begin{cases}
\lambda_{n+1}(a) = x + 2(1 - x) \\
2(1 - x) \geq \lambda_n(b) \\
0 \leq x \leq 1
\end{cases}
\]
that is to say $\lambda_{n+1}(a) = 1 + \frac{\lambda_n(b)}{2} = 1 + \frac{\lambda_n(a)}{2}$, and by induction, for all $n > 0$:
\[
\lambda_n(a) = \lambda_n(b) = 2 - \frac{1}{2^{n-1}},
\]
which tends to $2$ but never reaches it.

That example might suggest that three players are needed to observe such a phenomenon. Actually, the existence of a player for whom all the plays are equivalent was useful to build a not too complicated example, but not necessary. Here is a variant of that example with only two players, slightly less intuitive, but where the sequences $(\lambda_n(a))_n$ and $(\lambda_n(b))_n$ are the same as previously:

[Figure: the two-player variant of the game, on the states $a$, $b$, $c$, $d$, $e$, $f$.]
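The recurrence $\lambda_{n+1}(a) = 1 + \lambda_n(a)/2$ obtained for the three-player example, with $\lambda_1(a) = 1$, can be checked numerically. The following Python sketch (the function names `negotiation_sequence` and `closed_form` are hypothetical, chosen for this illustration) iterates the recurrence with exact rationals and compares each term with the closed form $2 - 1/2^{n-1}$.

```python
from fractions import Fraction


def negotiation_sequence(steps):
    """Return [lambda_1(a), ..., lambda_steps(a)] for the recurrence
    lambda_{n+1}(a) = 1 + lambda_n(a) / 2, starting from lambda_1(a) = 1."""
    value = Fraction(1)
    sequence = [value]
    for _ in range(steps - 1):
        value = 1 + value / 2
        sequence.append(value)
    return sequence


def closed_form(n):
    """Closed form 2 - 1 / 2^(n-1) of lambda_n(a), for n >= 1."""
    return 2 - Fraction(1, 2 ** (n - 1))


for n, value in enumerate(negotiation_sequence(10), start=1):
    # Each term matches the closed form and stays strictly below 2.
    assert value == closed_form(n)
    assert value < 2
    print(n, value)
```

Every term stays strictly below $2$ while the gap $1/2^{n-1}$ halves at each step, which matches the claim that the sequence is not stationary.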