Approximating the Minimal Lookahead Needed to Win Infinite Games

Abstract

We present an exponential-time algorithm approximating the minimal lookahead necessary to win an ω-regular delay game.


arXiv [cs.FL]

Martin Zimmermann

University of Liverpool, Liverpool L69 3BX, United Kingdom

Games can be found in the standard toolkit for many areas of theoretical computer science and mathematics, e.g., set theory and logic, automata theory, and complexity theory. Here, we are concerned with two-player zero-sum games of infinite duration and perfect information. In their most abstract form, these are known as Gale-Stewart games [2], played between Player I and Player O in rounds $i \in \mathbb{N}$: In each round $i$, first Player I picks some letter $a_i$ from an alphabet $\Sigma_I$, then Player O picks a letter $b_i$ from an alphabet $\Sigma_O$. Thus, after $\omega$ rounds, they have produced an infinite sequence $w = \binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots$ of letters. Now, Player O wins such a play if $w$ satisfies some given winning condition, e.g., $w \in L$ for some given set $L$ of infinite words. In this work, we only consider ω-regular $L$ given by deterministic parity automata.

Note that here both players move in strict alternation and the automaton will process the sequence in this order. Hosch and Landweber introduced delay games, relaxing this rigid interaction by allowing Player O to delay her moves in order to obtain a lookahead on Player I's moves [5].

Hosch and Landweber proved that it is decidable whether Player O wins an ω-regular delay game with some bounded lookahead. In later work, Holtmann, Kaiser, and Thomas showed that such bounded lookahead is sufficient in the following sense: in an ω-regular delay game, Player O either wins with doubly-exponential lookahead or not at all (not even with unbounded lookahead) [4]. In subsequent work, an improved exponential upper bound and matching lower bounds have been proved [6].

Here, we consider the problem of determining the minimal lookahead that is sufficient for Player O to win an ω-regular delay game.
It is trivial to determine this value in doubly-exponential time by hardcoding the exponential lookahead into the game, thereby turning the delay game into a classical, i.e., delay-free, game (see [8], Section 3.1 for details). As the resulting games can be solved in doubly-exponential time, one obtains the minimal lookahead in doubly-exponential time by exhaustive search. However, this has to be contrasted with the ExpTime-hardness of determining whether Player O wins with some lookahead [6], the only known lower bound on the complexity of the optimization problem.

We present the first improvement over the naive algorithm for the lookahead optimization problem by presenting an exponential-time algorithm approximating the minimal lookahead within a factor of two. To this end, we show that the exponential-time algorithm for determining whether Player O wins for some lookahead can be refined into an approximation algorithm with an exponential running time. Due to the hardness result for the related decision problem, this is the best running time one can hope for (barring major surprises in complexity theory).

ω-regular Delay Games

Given an alphabet $\Sigma$, i.e., a non-empty finite set of letters, $\Sigma^*$ and $\Sigma^\omega$ denote the sets of finite and infinite words over $\Sigma$, respectively. Given a product alphabet $\Sigma_I \times \Sigma_O$, we write $\binom{a_0 a_1 a_2 \cdots}{b_0 b_1 b_2 \cdots}$ for the word $\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots$ with $a_i \in \Sigma_I$ and $b_i \in \Sigma_O$. Also, we use similar notation for finite words, provided they are of the same length. We denote the empty word by $\varepsilon$, the power set of a set $S$ by $2^S$, and the set of non-negative integers by $\mathbb{N}$.

Footnote: We refer to the introduction of [4] for a discussion of the history of delay games, including motivation and a connection to the uniformization of ω-regular relations by continuous functions.

A delay game (with constant lookahead) $\Gamma_k(L)$ consists of a lookahead $k \in \mathbb{N}$ and a winning condition $L \subseteq (\Sigma_I \times \Sigma_O)^\omega$.
It is played in rounds $i \in \mathbb{N}$ as follows: In round $0$, Player I picks letters $a_0 \cdots a_k$ from $\Sigma_I$, then Player O picks a letter $b_0$ from $\Sigma_O$. In round $i > 0$, Player I picks a letter $a_{k+i} \in \Sigma_I$ and then Player O picks a letter $b_i \in \Sigma_O$. After $\omega$ rounds, they have produced an outcome $\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots$. We say that the outcome is winning for Player O if it is in $L$.

A strategy for Player O is a mapping $\sigma\colon \Sigma_I^* \to \Sigma_O$. An outcome $\binom{a_0}{b_0}\binom{a_1}{b_1}\binom{a_2}{b_2}\cdots$ is consistent with $\sigma$ if $b_i = \sigma(a_0 \cdots a_{i+k})$ for all $i$. A strategy $\sigma$ is winning if every outcome that is consistent with $\sigma$ is winning for Player O. If Player O has a winning strategy for $\Gamma_k(L)$, then we say she wins $\Gamma_k(L)$.

Remark 1 ([4], Remark 3.3). If Player O wins $\Gamma_k(L)$, then she also wins $\Gamma_{k'}(L)$ for every $k' > k$.

In this work, we consider winning conditions $L$ recognized by deterministic parity automata (DPA) $\mathcal{A} = (Q, \Sigma, q_\iota, \delta, \Omega)$, where $Q$ is a finite set of states containing the initial state $q_\iota$, $\Sigma$ is the input alphabet, $\delta\colon Q \times \Sigma \to Q$ is the transition function, and $\Omega\colon Q \to \mathbb{N}$ is a coloring of the states. As usual, we extend $\delta$ to finite words by defining $\delta^*\colon Q \times \Sigma^* \to Q$ via $\delta^*(q, \varepsilon) = q$ and $\delta^*(q, wa) = \delta(\delta^*(q, w), a)$ for all $q \in Q$, $w \in \Sigma^*$, and $a \in \Sigma$. Given a word $\alpha = a_0 a_1 a_2 \cdots \in \Sigma^\omega$, the run of $\mathcal{A}$ on $\alpha$ is the unique sequence $q_0 q_1 q_2 \cdots$ of states given by $q_i = \delta^*(q_\iota, a_0 \cdots a_{i-1})$. A run $q_0 q_1 q_2 \cdots$ is accepting if $\limsup_{i \to \infty} \Omega(q_i)$, i.e., the maximal color occurring infinitely often, is even. The language $L(\mathcal{A})$ recognized by $\mathcal{A}$ is the set containing all words whose run is accepting.

It is known that one can determine in exponential time whether Player O wins a delay game for some lookahead $k$ if the winning condition $L$ is recognized by a DPA.

Proposition 1 ([6], Theorem 4.4). The following problem is ExpTime-complete: Given a DPA $\mathcal{A}$, does Player O win $\Gamma_k(L(\mathcal{A}))$ for some $k$?

Furthermore, there is an exponential upper bound on the lookahead necessary to win a delay game.

Proposition 2 ([6], Theorem 4.8). Let $\mathcal{A}$ be a DPA with $n$ states and $c$ colors, and define $k_{\max} = 2^{n^2 c + 1}$. Player O wins $\Gamma_k(L(\mathcal{A}))$ for some $k$ if, and only if, she wins $\Gamma_{k_{\max}}(L(\mathcal{A}))$.

Finally, the exponential upper bound on the necessary lookahead is tight.

Proposition 3 ([6], Theorem 3.2). For every $n > 1$, there is a language $L_n$ recognized by a DPA with $O(n)$ states and two colors such that Player O wins $\Gamma_k(L_n)$ for some $k$, but she does not win $\Gamma_{2^n}(L_n)$.

In this work, we consider the following problem: Given a DPA $\mathcal{A}$ over $\Sigma_I \times \Sigma_O$, determine the smallest $k$ such that Player O wins $\Gamma_k(L(\mathcal{A}))$ (or that there is no such $k$). Due to Proposition 2, the search space for the smallest such $k$ is bounded by $k_{\max} = 2^{n^2 c + 1}$.

Now, one can easily transform a game $\Gamma_k(L(\mathcal{A}))$ (for some fixed $k$) into an equivalent classical parity game (see, e.g., [3] for an introduction to parity games) encoding a queue of $k$ letters from $\Sigma_I$ implementing the lookahead ([8], Section 3.1). Thus, one can construct the equivalent parity game for each $k \le k_{\max}$ and determine the smallest $k$ such that Player O wins the resulting parity game. This is also the smallest $k$ such that Player O wins $\Gamma_k(L(\mathcal{A}))$.

However, if $k$ is exponential in the size of $\mathcal{A}$ (i.e., close to $k_{\max}$), then the resulting parity game is of doubly-exponential size in the size of $\mathcal{A}$, as one encodes a queue of exponential length. Due to Proposition 3, considering an exponential $k$ is, in general, unavoidable. Hence, the resulting algorithm has doubly-exponential running time.

In the following, we show that the minimal lookahead can be approximated within a factor of two in exponential time. As the related decision problem is ExpTime-hard (Proposition 1), one cannot do better than exponential time (barring major surprises in complexity theory).
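To make the automaton model from the preliminaries concrete, the following Python sketch encodes a DPA with dictionaries and decides acceptance for an ultimately periodic word (a finite prefix followed by a loop repeated forever). It is only an illustration under stated assumptions: the class `DPA`, its methods `run_from` and `accepts_lasso`, and the two-state example automaton built by `agreement_dpa` are hypothetical names introduced here and do not appear in the paper. Letters of the product alphabet are modelled as pairs `(a, b)`.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

State = Hashable
Letter = Hashable


class DPA:
    """Deterministic parity automaton (Q, Sigma, q_init, delta, Omega); illustrative encoding."""

    def __init__(self, initial: State,
                 delta: Dict[Tuple[State, Letter], State],
                 color: Dict[State, int]) -> None:
        self.initial = initial
        self.delta = delta    # total transition function Q x Sigma -> Q
        self.color = color    # coloring Omega: Q -> N

    def run_from(self, q: State, word: Sequence[Letter]) -> State:
        """Return delta*(q, word), the state reached from q after reading word."""
        for a in word:
            q = self.delta[(q, a)]
        return q

    def accepts_lasso(self, prefix: Sequence[Letter], loop: Sequence[Letter]) -> bool:
        """Decide acceptance of prefix . loop^omega (maximal color seen infinitely often is even)."""
        if not loop:
            raise ValueError("loop must be non-empty")
        q = self.run_from(self.initial, prefix)
        first_seen: Dict[State, int] = {}  # state at a loop boundary -> iteration index
        maxima: List[int] = []             # maximal color visited in each loop iteration
        while q not in first_seen:
            first_seen[q] = len(maxima)
            best = -1
            for a in loop:
                q = self.delta[(q, a)]
                best = max(best, self.color[q])
            maxima.append(best)
        # From iteration first_seen[q] onwards the run repeats forever.
        return max(maxima[first_seen[q]:]) % 2 == 0


def agreement_dpa() -> DPA:
    """Toy DPA over {0,1} x {0,1}: accepts iff a_i == b_i holds infinitely often."""
    letters = [(a, b) for a in (0, 1) for b in (0, 1)]
    delta = {(q, (a, b)): ("eq" if a == b else "ne")
             for q in ("eq", "ne") for (a, b) in letters}
    return DPA(initial="ne", delta=delta, color={"eq": 2, "ne": 1})
```

For instance, `agreement_dpa().accepts_lasso([], [(0, 1), (1, 1)])` holds because color 2 is visited in every repetition of the loop, whereas the word with prefix `[(0, 0)]` and loop `[(0, 1)]` is rejected because only color 1 occurs infinitely often.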

Given a DPA $\mathcal{A}$ over $\Sigma_I \times \Sigma_O$ and a $k > 0$, we define a game $G_k$ played between Player I and Player O with the following properties:

1. If Player O wins $\Gamma_k(L(\mathcal{A}))$, then she wins $G_k$ (Lemma 1).
2. If Player O wins $G_k$, then she wins $\Gamma_{2k-1}(L(\mathcal{A}))$ (Lemma 2).
3. Given $\mathcal{A}$ and $k \le 2^{n^2 c + 1}$, one can construct $G_k$ and determine its winner in exponential time in $n$, where $n$ and $c$ are the number of states and colors of $\mathcal{A}$ (Lemma 3).

Now, consider Algorithm 1.

Algorithm 1 Approximating the minimal lookahead necessary to win a delay game with winning condition $L(\mathcal{A})$, where $\mathcal{A}$ is a given DPA with $n$ states and $c$ colors
1: for $k = 1$ to $2^{n^2 c + 1}$ do
2:     if Player O wins $G_k$ then
3:         return $2k - 1$
4: return "Player O does not win for any lookahead $k$".

It is obvious that the algorithm runs in exponential time, as the calls in Line 2 can be executed in exponential time (Property 3) and the loop terminates after an exponential number of iterations (which can obviously be reduced to a polynomial number of iterations using binary search).

Now, fix an input $\mathcal{A}$. If Player O does not win $\Gamma_k(L(\mathcal{A}))$ for any $k$, then she does not win any $G_k$ (Property 2), i.e., the algorithm returns the correct output in Line 4. So, consider the case where Player O wins $\Gamma_k(L(\mathcal{A}))$ for some $k$. Further, let $k_{\mathrm{opt}}$ be the minimal $k$ such that Player O wins $\Gamma_k(L(\mathcal{A}))$. Proposition 2 yields $k_{\mathrm{opt}} \le 2^{n^2 c + 1}$. Hence, Player O wins $G_k$ for some $k \le 2^{n^2 c + 1}$ due to Property 1. We pick $k^*$ minimal with this property, i.e., the algorithm returns $2k^* - 1$. By Property 2, the lookahead $2k^* - 1$ is sufficient for Player O to win the delay game with winning condition $L(\mathcal{A})$. Finally, the algorithm indeed approximates the minimal lookahead within a factor of two: Due to Property 1, we have $k^* \le k_{\mathrm{opt}}$, which implies that the approximation ratio between the algorithm's output $2k^* - 1$ and $k_{\mathrm{opt}}$ is indeed bounded by two:
\[ \frac{2k^* - 1}{k_{\mathrm{opt}}} \le \frac{2k^*}{k_{\mathrm{opt}}} \le \frac{2k_{\mathrm{opt}}}{k_{\mathrm{opt}}} \le 2. \]
Altogether, we obtain our main result.
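The control flow of Algorithm 1 can be sketched in a few lines of Python. This is a minimal sketch under explicit assumptions: `wins_G` is a hypothetical oracle standing in for "construct $G_k$ and determine its winner" (Property 3), and `k_max` is the search bound from Proposition 2, passed in as a parameter rather than computed. Neither name comes from the paper, and the sketch performs the plain linear scan, not the binary search mentioned in the running-time argument.

```python
from typing import Callable, Optional


def approximate_min_lookahead(wins_G: Callable[[int], bool],
                              k_max: int) -> Optional[int]:
    """Linear-scan version of Algorithm 1.

    wins_G(k) is an assumed oracle answering whether Player O wins G_k.
    Returns 2k - 1 for the smallest k in 1..k_max with wins_G(k) true,
    or None if Player O wins no G_k in that range.
    """
    for k in range(1, k_max + 1):
        if wins_G(k):
            # By Property 2, lookahead 2k - 1 suffices in the delay game.
            return 2 * k - 1
    return None


def toy_oracle(k: int) -> bool:
    """Made-up oracle for illustration only: Player O wins G_k exactly for k >= 3."""
    return k >= 3
```

With the made-up oracle above, `approximate_min_lookahead(toy_oracle, 8)` returns `5`; with an oracle that always answers `False`, it returns `None`, which corresponds to Line 4 of Algorithm 1.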

Theorem 1. The following problem can be approximated within a factor of two in exponential time: Given a DPA $\mathcal{A}$, determine the smallest $k$ such that Player O wins $\Gamma_k(L(\mathcal{A}))$.

Note that we do not consider the computation of a strategy realizing the approximation, as the notion of finite-state strategies for delay games comes with some technical complications [8].

In the remainder of this section, we present the construction of $G_k$ and prove Properties 1, 2, and 3.

$G_k$

The construction of $G_k$ is a refinement of a similar game used to prove Proposition 1 [6, 8]. For a detailed explanation of the construction, we refer the reader to these works.

Throughout this section, we fix $\mathcal{A} = (Q, \Sigma_I \times \Sigma_O, q_\iota, \delta, \Omega)$, some $0 < k \le 2^{n^2 c + 1}$, and let $C = \Omega(Q)$ denote the set of colors of $\mathcal{A}$.

First, we modify the transition function of $\mathcal{A}$ so that it keeps track of the maximal color occurring along a (partial) run of $\mathcal{A}$. Formally, we define $\delta_T\colon (Q \times C) \times (\Sigma_I \times \Sigma_O) \to Q \times C$ via
\[ \delta_T\left((q, c), \binom{a}{b}\right) = \left(\delta\left(q, \binom{a}{b}\right), \max\left\{c, \Omega\left(\delta\left(q, \binom{a}{b}\right)\right)\right\}\right) \]
for all $q \in Q$, $c \in C$, and $\binom{a}{b} \in \Sigma_I \times \Sigma_O$.

Next, we project away the $\Sigma_O$-component of the letter and perform a power set construction by defining $\delta_P\colon 2^{Q \times C} \times \Sigma_I \to 2^{Q \times C}$ via
\[ \delta_P(S, a) = \bigcup_{(q,c) \in S} \bigcup_{b \in \Sigma_O} \left\{\delta_T\left((q, c), \binom{a}{b}\right)\right\} \]
for all $S \subseteq Q \times C$ and $a \in \Sigma_I$. We extend $\delta_P$ to finite words via $\delta_P^*(S, \varepsilon) = S$ and $\delta_P^*(S, wa) = \delta_P(\delta_P^*(S, w), a)$ for all $w \in \Sigma_I^*$ and $a \in \Sigma_I$.

Finally, for every non-empty $D \subseteq Q \times C$ and $w \in \Sigma_I^*$, we define the function $r^D_w\colon D \to 2^{Q \times C}$ via
\[ r^D_w(q, c) = \delta_P^*(\{(q, \Omega(q))\}, w) \]
for all $(q, c) \in D$. Note that the first argument of $\delta_P^*$ is $(q, \Omega(q))$ and not $(q, c)$, the argument of $r^D_w$.

Remark 2. $(q', c') \in r^D_w(q, c)$ if and only if there is a word $\binom{w}{w'}$ over $\Sigma_I \times \Sigma_O$, i.e., a word whose projection to $\Sigma_I$ is $w$, such that the run of $\mathcal{A}$ processing $\binom{w}{w'}$ from $q$ leads to $q'$ and has maximal color $c'$.

We call $w \in \Sigma_I^k$ a witness for a partial function $r\colon Q \times C \to 2^{Q \times C}$ if we have $r = r^{\mathrm{dom}(r)}_w$, where $\mathrm{dom}(r)$ denotes the domain of $r$. Note that we require a witness to have length $k$. Let $R$ be the set of all such functions that have a witness.

Now, we define $G_k$, which is played in rounds $i \in \mathbb{N}$ between Player I and Player O. In each round $i$, Player I has to pick some $r_i \in R$ and then Player O picks $(q_i, c_i) \in Q \times C$ subject to the following constraints:
• For Player I: $\mathrm{dom}(r_0) = \{(q_\iota, \Omega(q_\iota))\}$ and $\mathrm{dom}(r_i) = r_{i-1}(q_{i-1}, c_{i-1})$ for all $i > 0$.
• For Player O: $(q_i, c_i) \in \mathrm{dom}(r_i)$ for all $i \ge 0$.

A play of $G_k$ is a sequence $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$. It is winning for Player O if the sequence of colors satisfies the parity condition, i.e., if $\limsup_{i \to \infty} c_i$ is even.

A strategy $\sigma$ for Player O maps a sequence $r_0 (q_0, c_0) \cdots r_i$ to a pair $(q_i, c_i) \in \mathrm{dom}(r_i)$. A play $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$ is consistent with $\sigma$ if $(q_i, c_i) = \sigma(r_0 (q_0, c_0) \cdots r_i)$ for all $i \ge 0$. We say that $\sigma$ is a winning strategy for Player O if every play that is consistent with $\sigma$ is winning for her. Finally, Player O wins $G_k$ if she has a winning strategy.

Lemma 1.

If Player O wins $\Gamma_k(L(\mathcal{A}))$, then she wins $G_k$.

Proof. Let $\sigma$ be a winning strategy for Player O in $\Gamma_k(L(\mathcal{A}))$. We construct a winning strategy $\sigma'$ for Player O in $G_k$, which will simulate $\sigma$.

So, let $r_0 \in R$ be the first move of Player I. This has to be answered by Player O by picking $(q_0, c_0) = (q_\iota, \Omega(q_\iota))$, as this is the only legal move for her. Hence, we define $\sigma'(r_0) = (q_\iota, \Omega(q_\iota))$. Now, Player I picks some $r_1 \in R$.

We simulate this in $\Gamma_k(L(\mathcal{A}))$ as follows. Pick witnesses $w_0$ and $w_1$ for $r_0$ and $r_1$, respectively. If Player I uses $w_0 w_1$ during the first $k$ rounds of $\Gamma_k(L(\mathcal{A}))$, then $\sigma$ yields $k$ letters $w'_0 \in \Sigma_O^k$ as response. Thus, we are in the following situation for $i = 1$:
• In $G_k$, we have a play prefix $r_0 (q_0, c_0) \cdots (q_{i-1}, c_{i-1}) r_i$, and
• in $\Gamma_k(L(\mathcal{A}))$, Player I has picked $w_0 \cdots w_i$ and Player O has picked $w'_0 \cdots w'_{i-1}$, where each $w_j$ is a witness for $r_j$ (and thus is in $\Sigma_I^k$) and each $w'_j$ is in $\Sigma_O^k$.

Now, consider an arbitrary $i \ge 1$. Let $q_i$ be the unique state of $\mathcal{A}$ that is reached from $q_{i-1}$ by processing $\binom{w_{i-1}}{w'_{i-1}}$, and let $c_i$ be the maximal color on the induced run infix. We have $(q_i, c_i) \in r_{i-1}(q_{i-1}, c_{i-1})$, i.e., $(q_i, c_i)$ is a legal move for Player O in $G_k$ to extend the play prefix $r_0 (q_0, c_0) \cdots (q_{i-1}, c_{i-1}) r_i$. Accordingly, we define $\sigma'(r_0 (q_0, c_0) \cdots (q_{i-1}, c_{i-1}) r_i) = (q_i, c_i)$. Player I reacts by picking some $r_{i+1} \in R$, which has some witness $w_{i+1}$. In $\Gamma_k(L(\mathcal{A}))$, we let Player I pick the letters of $w_{i+1}$ during the next $k$ rounds, which yield $k$ letters $w'_i \in \Sigma_O^k$ determined by $\sigma$. Thus, we are in the situation above for $i + 1$, i.e., we have concluded the definition of $\sigma'$.

It remains to show that $\sigma'$ is indeed winning. Fix a play $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$ that is consistent with $\sigma'$, and let $\binom{w_0 w_1 w_2 \cdots}{w'_0 w'_1 w'_2 \cdots}$ be the play in $\Gamma_k(L(\mathcal{A}))$ constructed during the simulation. By construction, each $w_i$ is a witness of $r_i$. An induction shows that $q_{i+1}$ is the unique state of $\mathcal{A}$ reached when processing $\binom{w_i}{w'_i}$ when starting at $q_i$, and that $c_{i+1}$ is the maximal color encountered on this run infix. As the unique run of $\mathcal{A}$ on $\binom{w_0 w_1 w_2 \cdots}{w'_0 w'_1 w'_2 \cdots}$ is accepting (as it is, by construction, an outcome consistent with the winning strategy $\sigma$), we conclude that $\limsup_{i \to \infty} c_i$ is even. Hence, the play $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$ is winning for Player O. As the play was chosen arbitrarily, $\sigma'$ is indeed a winning strategy for Player O in $G_k$.

Lemma 2.

If Player O wins $G_k$, then she wins $\Gamma_{2k-1}(L(\mathcal{A}))$.

Proof. Let $\sigma'$ be a winning strategy for Player O in $G_k$. We construct a winning strategy $\sigma$ for Player O in $\Gamma_{2k-1}(L(\mathcal{A}))$, which will simulate $\sigma'$.

So, let Player I pick letters $a_0 \cdots a_{2k-1}$ in round $0$ and define $w_0 = a_0 \cdots a_{k-1}$ and $w_1 = a_k \cdots a_{2k-1}$. Furthermore, let $(q_0, c_0) = (q_\iota, \Omega(q_\iota))$, $r_0 = r^{\{(q_0, c_0)\}}_{w_0}$, and $r_1 = r^{r_0(q_0, c_0)}_{w_1}$. Then, $r_0 (q_0, c_0) r_1$ is a play prefix in $G_k$ that is consistent with $\sigma'$. Thus, we are in the following situation for $i = 1$:
• In $\Gamma_{2k-1}(L(\mathcal{A}))$, Player I has picked $w_0 \cdots w_i$ and Player O has picked $w'_0 \cdots w'_{i-2}$ (which is empty for $i = 1$), and
• in $G_k$, we have a play prefix $r_0 (q_0, c_0) \cdots (q_{i-1}, c_{i-1}) r_i$ that is consistent with $\sigma'$ and where each $w_j$ is a witness for $r_j$. Note that being a play prefix implies $(q_j, c_j) \in \mathrm{dom}(r_j) = r_{j-1}(q_{j-1}, c_{j-1})$.

Now, pick some arbitrary $i \ge 1$ and let $(q_i, c_i) = \sigma'(r_0 (q_0, c_0) \cdots (q_{i-1}, c_{i-1}) r_i)$. As $\sigma'$ is a strategy for $G_k$, we have again $(q_i, c_i) \in \mathrm{dom}(r_i) = r_{i-1}(q_{i-1}, c_{i-1})$. Furthermore, as $w_{i-1}$ is a witness for $r_{i-1}$, there is some $w'_{i-1} \in \Sigma_O^k$ such that $q_i$ is the unique state $\mathcal{A}$ reaches when processing $\binom{w_{i-1}}{w'_{i-1}}$ from $q_{i-1}$, and $c_i$ is the maximal color occurring in this run infix.

Now, we define $\sigma$ such that it picks the $k$ letters of $w'_{i-1}$ during the next $k$ rounds (independently of the choices of Player I). During these rounds, Player I again determines some $w_{i+1} \in \Sigma_I^k$, inducing $r_{i+1} = r^{r_i(q_i, c_i)}_{w_{i+1}}$. Then, we are in the situation above for $i + 1$, i.e., we have concluded the definition of $\sigma$.

It remains to show that $\sigma$ is winning. To this end, fix an outcome $\binom{w_0 w_1 w_2 \cdots}{w'_0 w'_1 w'_2 \cdots}$ that is consistent with $\sigma$, where each $w_i$ is in $\Sigma_I^k$ and each $w'_i$ is in $\Sigma_O^k$. Further, let $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$ be the play of $G_k$ constructed during the simulation. By construction, each $w_i$ is a witness of $r_i$.

As $r_0 (q_0, c_0) r_1 (q_1, c_1) r_2 (q_2, c_2) \cdots$ is consistent with $\sigma'$ by construction, it is winning for Player O, i.e., $\limsup_{i \to \infty} c_i$ is even. Now, an induction shows that $q_{i+1}$ is the unique state reached by $\mathcal{A}$ when processing $\binom{w_i}{w'_i}$ starting in $q_i$, and $c_{i+1}$ is the maximal color on this run infix. From these two properties, we conclude that the run of $\mathcal{A}$ on $\binom{w_0 w_1 w_2 \cdots}{w'_0 w'_1 w'_2 \cdots}$ is accepting, i.e., the outcome is winning for Player O. As the outcome was chosen arbitrarily, $\sigma$ is indeed a winning strategy for Player O in $\Gamma_{2k-1}(L(\mathcal{A}))$.

Lemma 3.

Given $\mathcal{A}$ and $k \le 2^{n^2 c + 1}$, one can construct $G_k$ and determine its winner in exponential time in $n$, where $n$ and $c$ are the number of states and colors of $\mathcal{A}$.

Proof. We argue that $G_k$ can be expressed as an arena-based parity game (see, e.g., [3] for a definition) of exponential size in $n$ with the same colors as $\mathcal{A}$. Such a game can be solved in exponential time in $n$ [1]. Thus, it remains to argue that one can construct the parity game in exponential time.

First, we argue that for each partial function $r\colon Q \times C \to 2^{Q \times C}$ one can construct a deterministic finite automaton recognizing the set of witnesses of $r$. The construction is based on a power set construction (mirroring the definition of $\delta_T$ and $\delta_P$) and a counter checking that only inputs of length $k$ are accepted. As there are only exponentially many such functions, one can effectively determine $R$, i.e., the set of functions whose associated automaton has a non-empty language, in exponential time.

Now, it is straightforward to construct a parity game $(V_I, V_O, E, v_\iota, \Omega')$ in a graph $(V_I \cup V_O, E)$ of exponential size with the following components:
• $V_I = \{v_\iota\} \cup (R \times (Q \times C))$: vertices of Player I, where $v_\iota$ is a fresh initial vertex.
• $V_O = R$: vertices of Player O.
• $E$ is the union of the following sets of edges:
  – $\{(v_\iota, r) \mid \mathrm{dom}(r) = \{(q_\iota, \Omega(q_\iota))\}\}$: initial moves of Player I, allowing him to pick some $r \in R$ with $\mathrm{dom}(r) = \{(q_\iota, \Omega(q_\iota))\}$.
  – $\{((r, (q, c)), r') \mid \mathrm{dom}(r') = r(q, c)\}$: non-initial moves of Player I, allowing him to pick some $r'$ satisfying $\mathrm{dom}(r') = r(q, c)$, where $r$ and $(q, c)$ were previously picked by the players.
  – $\{(r, (r, (q, c))) \mid (q, c) \in \mathrm{dom}(r)\}$: moves of Player O, allowing her to pick some $(q, c) \in \mathrm{dom}(r)$, where $r$ was previously picked by Player I.
• $\Omega'(v) = c$ if $v = (r, (q, c)) \in V_I$, and $\Omega'(v) = \min C$ otherwise.
Note that the color $\min C$ is neutral in the following sense: Whether a play is winning or not only depends on the colors of the vertices in $V_I \setminus \{v_\iota\}$, but not on vertices in $V_O \cup \{v_\iota\}$.

The resulting parity game implements exactly the rules of the abstract game $G_k$ and is therefore won by Player O if and only if she wins $G_k$.

We have presented an exponential-time algorithm approximating the minimal lookahead necessary to win a delay game. Here, we only considered the case of ω-regular winning conditions given by deterministic parity automata.

In the literature, several other types of winning conditions have been considered, e.g., quantitative parity [9] and (quantitative) Linear Temporal Logic [7]. For these types, one can also exhibit an approximation algorithm for the minimal lookahead that has the same complexity as an algorithm deciding the existence of some lookahead, using techniques very similar to those introduced here.

Unfortunately, the complexity of the exact optimization problem for games with winning conditions given by deterministic parity automata remains open. Let us conclude by mentioning another open problem on delay games: There is an exponential gap between the upper and lower bounds on the necessary lookahead in delay games with winning conditions given by deterministic Muller automata. The same is true for deciding whether Player O wins the game for some lookahead.

References

[1] Cristian S. Calude, Sanjay Jain, Bakhadyr Khoussainov, Wei Li, and Frank Stephan. Deciding parity games in quasipolynomial time. In Hamed Hatami, Pierre McKenzie, and Valerie King, editors, STOC 2017, pages 252–263. ACM, 2017.
[2] David Gale and Frank M. Stewart. Infinite games with perfect information. Annals of Mathematics, 28:245–266, 1953.
[3] Erich Grädel, Wolfgang Thomas, and Thomas Wilke, editors. Automata, Logics, and Infinite Games: A Guide to Current Research, volume 2500 of LNCS. Springer, 2002.
[4] Michael Holtmann, Lukasz Kaiser, and Wolfgang Thomas. Degrees of lookahead in regular infinite games. Log. Methods Comput. Sci., 8(3), 2012.
[5] Frederick A. Hosch and Lawrence H. Landweber. Finite delay solutions for sequential conditions. In Maurice Nivat, editor, ICALP 1972, pages 45–60. North-Holland, Amsterdam, 1972.
[6] Felix Klein and Martin Zimmermann. How much lookahead is needed to win infinite games? Log. Methods Comput. Sci., 12(3), 2016.
[7] Felix Klein and Martin Zimmermann. Prompt delay. In Akash Lal, S. Akshay, Saket Saurabh, and Sandeep Sen, editors, FSTTCS 2016, volume 65 of LIPIcs, pages 43:1–43:14. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2016.
[8] Sarah Winter and Martin Zimmermann. Finite-state strategies in delay games. Inf. Comput., 272:104500, 2020.
[9] Martin Zimmermann. Games with costs and delays. In