Universal Nash Equilibrium Strategies for Differential Games
arXiv [math.OC]
Yurii Averboukh ∗ February 6, 2018
Abstract
The paper is concerned with a two-player nonzero-sum differential game in the case when players are informed about the current position. We consider the game in control with guide strategies first proposed by Krasovskii and Subbotin. The construction of universal strategies is given both for the case of continuous and discontinuous value functions. The existence of a discontinuous value function is established. The continuous value function does not exist in the general case. In addition, we give an example of a smooth value function that is not a solution of the system of Hamilton–Jacobi equations.
Keywords. Nash equilibrium, nonzero-sum differential game, control with guide strategies.
AMS 2010 Subject Classification.
Introduction

The purpose of this paper is to study Nash equilibria for a two-player deterministic differential game in the case when the players are informed about the present position. We look for a universal equilibrium solution. The term 'universal Nash equilibrium strategies' means that the strategies provide a Nash equilibrium at any initial position. The notion of universality generalizes the notion of time consistency, and it is appropriate for the case when the players form their controls stepwise; generally speaking, in this case the notion of time consistency is not well-defined.

There are two approaches in the literature dealing with this problem (see [8] and the references therein). The first approach is close to the so-called Folk Theorem for repeated games and is based on the punishment strategy technique. This technique makes it possible to establish the existence of a Nash equilibrium at a given initial position in the framework of feedback strategies [14], [15] and in the framework of Friedman strategies [21]. The set of Nash equilibria at a given initial position is characterized in [12], [14]. The infinitesimal version of this characterization is derived in [2], [4]. In addition, each Nash equilibrium payoff at a given position corresponds to a pair of continuous functions; these functions are stable with respect to auxiliary zero-sum differential games, and their values at the initial position give the equilibrium payoffs.

∗ Institute of Mathematics and Mechanics UrB RAS & Ural Federal University, 16, S. Kovalevskaya str., Ekaterinburg, 620990, Russia, [email protected], [email protected]
Problem statement

Let us consider a two-player differential game with the dynamics
$$\dot x = f(t,x,u) + g(t,x,v),\quad t\in[0,T],\ x\in\mathbb{R}^n,\ u\in P,\ v\in Q. \tag{1}$$
Here $u$ and $v$ are controls of player I and player II, respectively. Payoffs are terminal: player I wants to maximize $\sigma_1(x(T))$, whereas player II wants to maximize $\sigma_2(x(T))$. We assume that the sets $P$ and $Q$ are compact and the functions $f$, $g$, $\sigma_1$, $\sigma_2$ are continuous. In addition, suppose that $f$ and $g$ are Lipschitz continuous with respect to the phase variable and satisfy the sublinear growth condition with respect to $x$.

Denote
$$\mathcal{U} := \{u\colon[0,T]\to P \text{ measurable}\},\qquad \mathcal{V} := \{v\colon[0,T]\to Q \text{ measurable}\}.$$
If $u\in\mathcal{U}$, $v\in\mathcal{V}$, then denote by $x(\cdot,t_0,x_0,u,v)$ the solution of the initial value problem
$$\dot x(t) = f(t,x(t),u(t)) + g(t,x(t),v(t)),\quad x(t_0) = x_0.$$

We assume that the players use control with guide strategies (CGS). In this case the control depends not only on the current position but also on a vector $w$, called a guide. The dimension of the guide can differ from $n$.

A control with guide strategy of player I is a triple $U = (u_1,\psi_1,\chi_1)$ such that, for some natural $m$, the function $u_1$ maps $[0,T]\times\mathbb{R}^n\times\mathbb{R}^m$ to $P$, the function $\psi_1$ maps $[0,T]\times[0,T]\times\mathbb{R}^n\times\mathbb{R}^m$ to $\mathbb{R}^m$, and $\chi_1$ is a function of $[0,T]\times\mathbb{R}^n$ with values in $\mathbb{R}^m$. The meaning of the functions $u_1$, $\psi_1$, and $\chi_1$ is the following. Let $w$ be an $m$-dimensional vector; further it denotes the state of the first player's guide. Player I computes the value of the variable $w$ using the rules given by the strategy $U$. The function $u_1(t_*,x_*,w)$ forms the control of player I; it depends on the current position $(t_*,x_*)$ and the current state of the guide $w$. The function $\psi_1(t^+,t_*,x_*,w)$ determines the state of the guide at time $t^+$ under the condition that at time $t_*$ the phase vector is equal to $x_*$ and the state of the guide is equal to $w$.
The function $\chi_1(t_0,x_0)$ determines the initial state of the guide.

Player I forms his control stepwise. Let $(t_0,x_0)$ be an initial position, and let $\Delta = \{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$. Suppose that player II chooses his control $v[\cdot]$ arbitrarily; he can also use his own CGS and form the control $v[\cdot]$ stepwise. Denote by $x_1[\cdot,t_0,x_0,U,\Delta,v[\cdot]]$ the solution $x_1[\cdot]$ of equation (1) with the initial condition $x_1[t_0]=x_0$ such that the control of player I is equal to $u_1(t_k,x_k,w_k)$ on $[t_k,t_{k+1})$. Here $x_k$ is the state of the system at time $t_k$, and $w_k$ is the state of the first player's guide; it is computed by the rule
$$w_k = \psi_1(t_k,t_{k-1},x_{k-1},w_{k-1})\quad\text{for } k=1,\dots,r,\qquad w_0 = \chi_1(t_0,x_0).$$

The control with guide strategy of player II is defined analogously. It is a triple $V = (v_2,\psi_2,\chi_2)$, where $v_2 = v_2(t_*,x_*,w)$, $\psi_2 = \psi_2(t^+,t_*,x_*,w)$, $\chi_2 = \chi_2(t_0,x_0)$; $(t_*,x_*)$ is a current position, $w$ denotes the guide of player II, and $(t_0,x_0)$ is an initial position. The motion generated by a strategy $V$, a partition $\Delta$ of the interval $[t_0,T]$, and a measurable control $u[\cdot]$ of player I is also constructed stepwise. Denote it by $x_2[\cdot,t_0,x_0,V,\Delta,u[\cdot]]$.

We assume that the Nash equilibrium is achieved when the players use the same partition. Let $\Delta = \{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$. Denote by $x^{(c)}[\cdot,t_0,x_0,U,V,\Delta]$ the solution $x[\cdot]$ of equation (1) with the initial condition $x[t_0]=x_0$ such that the control of player I is equal to $u_1(t_k,x_k,w^1_k)$ and the control of player II is equal to $v_2(t_k,x_k,w^2_k)$ on $[t_k,t_{k+1})$. Here $x_k$ denotes the state of the system at time $t_k$; $w^i_k$ is the state of the $i$-th player's guide at time $t_k$. Recall that $w^i_{k+1} = \psi_i(t_{k+1},t_k,x_k,w^i_k)$, $w^i_0 = \chi_i(t_0,x_0)$, $i=1,2$.

Definition 2.1.
Let $G\subset[0,T]\times\mathbb{R}^n$. A pair of control with guide strategies $(U^*,V^*)$ is said to be a Control with Guide Nash equilibrium on $G$ iff for all $(t_0,x_0)\in G$ the following inequalities hold:
$$\lim_{\delta\downarrow 0}\sup\{\sigma_1(x_2[T,t_0,x_0,V^*,\Delta,u[\cdot]]) : d(\Delta)\le\delta,\ u[\cdot]\in\mathcal{U}\} \le \lim_{\delta\downarrow 0}\inf\{\sigma_1(x^{(c)}[T,t_0,x_0,U^*,V^*,\Delta]) : d(\Delta)\le\delta\},$$
$$\lim_{\delta\downarrow 0}\sup\{\sigma_2(x_1[T,t_0,x_0,U^*,\Delta,v[\cdot]]) : d(\Delta)\le\delta,\ v[\cdot]\in\mathcal{V}\} \le \lim_{\delta\downarrow 0}\inf\{\sigma_2(x^{(c)}[T,t_0,x_0,U^*,V^*,\Delta]) : d(\Delta)\le\delta\}.$$

Continuous value function
In this section we assume that there exists a continuous function satisfying some viability conditions.
Let $(t_*,x_*)\in[0,T]\times\mathbb{R}^n$, $u_*\in P$, $v_*\in Q$. Define
$$\mathrm{Sol}_1(t_*,x_*;v_*) := \mathrm{cl}\{x(\cdot,t_*,x_*,u,v_*) : u\in\mathcal{U}\},$$
$$\mathrm{Sol}_2(t_*,x_*;u_*) := \mathrm{cl}\{x(\cdot,t_*,x_*,u_*,v) : v\in\mathcal{V}\},$$
$$\mathrm{Sol}(t_*,x_*) := \mathrm{cl}\{x(\cdot,t_*,x_*,u,v) : u\in\mathcal{U},\ v\in\mathcal{V}\}.$$
Here $\mathrm{cl}$ denotes the closure in the space of continuous vector functions on $[0,T]$. Note that the sets $\mathrm{Sol}_1(t_*,x_*;v_*)$, $\mathrm{Sol}_2(t_*,x_*;u_*)$, $\mathrm{Sol}(t_*,x_*)$ are compact.

Theorem 3.1.
Let a continuous function $(c_1,c_2)\colon[0,T]\times\mathbb{R}^n\to\mathbb{R}^2$ satisfy the following conditions:

(F1) $c_i(T,x) = \sigma_i(x)$, $i=1,2$;

(F2) for every $(t_*,x_*)\in[0,T]\times\mathbb{R}^n$, $u\in P$ there exists a motion $y(\cdot)\in\mathrm{Sol}_2(t_*,x_*;u)$ such that $c_1(t,y(t))\le c_1(t_*,x_*)$ for $t\in[t_*,T]$;

(F3) for every $(t_*,x_*)\in[0,T]\times\mathbb{R}^n$, $v\in Q$ there exists a motion $y(\cdot)\in\mathrm{Sol}_1(t_*,x_*;v)$ such that $c_2(t,y(t))\le c_2(t_*,x_*)$ for $t\in[t_*,T]$;

(F4) for every $(t_*,x_*)\in[0,T]\times\mathbb{R}^n$ there exists a motion $y^{(c)}(\cdot)\in\mathrm{Sol}(t_*,x_*)$ such that $c_i(t,y^{(c)}(t)) = c_i(t_*,x_*)$ for $t\in[t_*,T]$, $i=1,2$.

Then for each compact $G\subset[0,T]\times\mathbb{R}^n$ there exists a Control with Guide Nash equilibrium on $G$. The corresponding payoff of player $i$ is $c_i(t_0,x_0)$.

Note that conditions (F1)–(F4) were first derived in [3] as sufficient conditions for the function $(c_1,c_2)$ to provide a Nash equilibrium payoff at a given position in the framework of the Kleimenov approach. In those papers the obtained equilibria are not universal.

The proof of Theorem 3.1 is based on the Krasovskii–Subbotin extremal shift rule. Let $G\subset[0,T]\times\mathbb{R}^n$ be compact. Denote by $E$ the reachable set from $G$:
$$E := \{x(t,t_*,x_*,u,v) : (t_*,x_*)\in G,\ t\in[t_*,T],\ u\in\mathcal{U},\ v\in\mathcal{V}\}. \tag{2}$$
Put
$$K := \max\{\|f(t,x,u)+g(t,x,v)\| : t\in[0,T],\ x\in E,\ u\in P,\ v\in Q\}. \tag{3}$$
Let $L$ be a Lipschitz constant of the function $f+g$ on $[0,T]\times E\times P\times Q$, i.e., for all $t\in[0,T]$, $x',x''\in E$, $u\in P$, $v\in Q$
$$\|f(t,x',u)+g(t,x',v)-f(t,x'',u)-g(t,x'',v)\|\le L\|x'-x''\|.$$
Also, put
$$\varphi^*(\delta) := \sup\{\|f(t',x,u)+g(t',x,v)-f(t'',x,u)-g(t'',x,v)\| : t',t''\in[0,T],\ |t'-t''|\le\delta,\ x\in E,\ u\in P,\ v\in Q\}.$$
Note that $\varphi^*(\delta)\to 0$ as $\delta\to 0$.

Consider the auxiliary system
$$\dot s = h(t,s,\omega_1,\omega_2),\quad s\in\mathbb{R}^n,\ \omega_i\in\Omega_i. \tag{4}$$
Below we consider two cases:

(i) $\Omega_1 = P$, $\Omega_2 = Q$, $h = f+g$;

(ii) $\Omega_1 = P\times Q$, $\Omega_2 = \varnothing$, $h = f+g$.

Note that in both cases system (4) satisfies the Isaacs condition. Put $\beta := 2L$, $R := \max\{\|s'-s''\| : s',s''\in E\}$, $\varphi(\delta) := 4\varphi^*(\delta)R + 4K\delta$. The following lemma was proved by Krasovskii and Subbotin (see [17]).

Lemma 3.1.
Let $s_1,s_2\in\mathbb{R}^n$, $t_*\in[0,T]$, $\omega_1^*\in\Omega_1$, $\omega_2^*\in\Omega_2$ satisfy the following conditions:
$$\max_{\omega_1\in\Omega_1}\min_{\omega_2\in\Omega_2}\langle s_2-s_1, h(t_*,s_1,\omega_1,\omega_2)\rangle = \min_{\omega_2\in\Omega_2}\langle s_2-s_1, h(t_*,s_1,\omega_1^*,\omega_2)\rangle,$$
$$\min_{\omega_2\in\Omega_2}\max_{\omega_1\in\Omega_1}\langle s_2-s_1, h(t_*,s_2,\omega_1,\omega_2)\rangle = \max_{\omega_1\in\Omega_1}\langle s_2-s_1, h(t_*,s_2,\omega_1,\omega_2^*)\rangle.$$
If $s_1(\cdot)$ is a solution of the initial value problem $\dot s_1 = h(t,s_1,\omega_1^*,\omega_2(t))$, $s_1(t_*)=s_1$, and $s_2(\cdot)$ is a solution of the initial value problem $\dot s_2 = h(t,s_2,\omega_1(t),\omega_2^*)$, $s_2(t_*)=s_2$, for some measurable controls $\omega_1(\cdot)$ and $\omega_2(\cdot)$, then for all $t^+\in[t_*,T]$ the estimate
$$\|s_1(t^+)-s_2(t^+)\|\le\|s_1-s_2\|(1+\beta(t^+-t_*)) + \varphi(t^+-t_*)\cdot(t^+-t_*)$$
is fulfilled.

We assume that the $i$-th player's guide $w^i$ is a quadruple $(d^i,\tau^i,w^{i,(a)},w^{i,(c)})$. The variable $d^i\in\mathbb{R}$ describes an accumulated error, $\tau^i\in[0,T]$ is the previous time of control correction, $w^{i,(a)}\in\mathbb{R}^n$ is the punishment part of the guide, and $w^{i,(c)}\in\mathbb{R}^n$ is the consistent part of the guide. The whole dimension of the guide is $2n+2$.

For any $(t_*,x_*)\in[0,T]\times\mathbb{R}^n$, $u\in P$, $v\in Q$ choose and fix a motion $y(\cdot;t_*,x_*,u)$ satisfying condition (F2), a motion $y(\cdot;t_*,x_*,v)$ satisfying condition (F3), and a motion $y^{(c)}(\cdot;t_*,x_*)$ satisfying condition (F4).

Now let us define the strategies $U^*$ and $V^*$. Below we prove that the pair of strategies $(U^*,V^*)$ is a Control with Guide Nash equilibrium on $G$. First put $\chi_1(t_0,x_0)=\chi_2(t_0,x_0):=(0,t_0,x_0,x_0)$.

Let $(t,x)$ be a position, and let $w^i=(d^i,\tau^i,w^{i,(a)},w^{i,(c)})$ be the state of the $i$-th player's guide. Put
$$z^i := \begin{cases} w^{i,(c)}, & \|w^{i,(c)}-x\|\le d^i(1+\beta(t-\tau^i))+\varphi(t-\tau^i)(t-\tau^i),\\ w^{i,(a)}, & \text{otherwise}. \end{cases} \tag{5}$$
Let us consider two cases.

$i=1$. Choose a control $u^*$ by the rule
$$\max_{u\in P}\langle z^1-x, f(t,x,u)\rangle = \langle z^1-x, f(t,x,u^*)\rangle. \tag{6}$$
Further, let $v^*$ satisfy the condition
$$\min_{v\in Q}\langle z^1-x, g(t,x,v)\rangle = \langle z^1-x, g(t,x,v^*)\rangle. \tag{7}$$
Define $u_1(t,x,w^1):=u^*$. For $t^+>t$ put $\psi_1(t^+,t,x,w^1)$ equal to $w^{1+}=(d^{1+},\tau^{1+},w^{1,(a)+},w^{1,(c)+})$, where
$$d^{1+} := \|z^1-x\|,\quad \tau^{1+} := t,\quad w^{1,(a)+} := y(t^+;t,z^1,v^*),\quad w^{1,(c)+} := y^{(c)}(t^+;t,z^1).$$

$i=2$. Let a control $v^*$ be such that
$$\max_{v\in Q}\langle z^2-x, g(t,x,v)\rangle = \langle z^2-x, g(t,x,v^*)\rangle. \tag{8}$$
Choose $u^*$ satisfying the condition
$$\min_{u\in P}\langle z^2-x, f(t,x,u)\rangle = \langle z^2-x, f(t,x,u^*)\rangle. \tag{9}$$
Set $v_2(t,x,w^2):=v^*$. For $t^+>t$ put $\psi_2(t^+,t,x,w^2)$ equal to $w^{2+}=(d^{2+},\tau^{2+},w^{2,(a)+},w^{2,(c)+})$, where
$$d^{2+} := \|z^2-x\|,\quad \tau^{2+} := t,\quad w^{2,(a)+} := y(t^+;t,z^2,u^*),\quad w^{2,(c)+} := y^{(c)}(t^+;t,z^2).$$

Note that
$$c_j(t^+,w^{i,(c)+}) = c_j(t,z^i)\quad\text{for all } i,j=1,2, \tag{10}$$
$$c_2(t^+,w^{1,(a)+})\le c_2(t,z^1),\qquad c_1(t^+,w^{2,(a)+})\le c_1(t,z^2). \tag{11}$$
Below let $x^+$ denote the state of the system at time $t^+$.

Lemma 3.2.
Suppose that $z^1=z^2=z$. If players I and II use respectively the controls $u^*$ and $v^*$ on the interval $[t,t^+]$, then $w^{1,(c)+}=w^{2,(c)+}$ and
$$\|x^+-w^{i,(c)+}\|\le d^{i+}(1+\beta(t^+-\tau^{i+}))+\varphi(t^+-\tau^{i+})(t^+-\tau^{i+}).$$

Proof.
The controls $u^*$ and $v^*$ satisfy the condition
$$\max_{u\in P,\,v\in Q}\langle z-x, f(t,x,u)+g(t,x,v)\rangle = \langle z-x, f(t,x,u^*)+g(t,x,v^*)\rangle.$$
We apply Lemma 3.1 with $\Omega_1=P\times Q$, $\Omega_2=\varnothing$, $h=f+g$. If $x(\cdot)=x(\cdot,t,x,u^*,v^*)$ and $y^{(c)}(\cdot)=y^{(c)}(\cdot;t,z)$, then
$$\|x(t^+)-y^{(c)}(t^+)\|\le\|x-z\|(1+\beta(t^+-t))+\varphi(t^+-t)\cdot(t^+-t).$$
The definition of the strategies $U^*$ and $V^*$ yields that $w^{i,(c)+}=y^{(c)}(t^+)$ for $i=1,2$. By construction of the functions $\psi_i$, $i=1,2$, we have $\tau^{i+}=t$ and $d^{i+}=\|x-z\|$. This completes the proof of the lemma.

Lemma 3.3. If player I uses the control $u^*$ on the interval $[t,t^+]$, then
$$\|x^+-w^{1,(a)+}\|\le d^{1+}(1+\beta(t^+-\tau^{1+}))+\varphi(t^+-\tau^{1+})(t^+-\tau^{1+}).$$

Proof.
We apply Lemma 3.1 with $\Omega_1=P$, $\Omega_2=Q$, and $h=f+g$. The choice of $u^*$ (see (6)) and $v^*$ (see (7)) yields that the inequality
$$\|x(t^+)-y(t^+)\|\le\|x-z^1\|(1+\beta(t^+-t))+\varphi(t^+-t)\cdot(t^+-t)$$
holds with $x(\cdot)=x(\cdot,t,x,u^*,v)$ and $y(\cdot)=y(\cdot;t,z^1,v^*)$. Since $w^{1,(a)+}=y(t^+)$, $\tau^{1+}=t$, and $d^{1+}=\|x-z^1\|$, the conclusion of the lemma follows.

We need the following estimate. Let $\Delta=\{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$, and let $\{\gamma_k\}_{k=0}^r$ be a collection of numbers such that
$$\gamma_{k+1}\le\gamma_k(1+\beta(t_{k+1}-t_k))+\varphi(t_{k+1}-t_k)\cdot(t_{k+1}-t_k). \tag{12}$$
Then
$$\gamma_k\le[\gamma_0+(1+(t_k-t_0))\varphi(d(\Delta))]\exp(\beta(t_k-t_0)). \tag{13}$$

Proof of Theorem 3.1.
First let us show that for all $(t_0,x_0)\in G$ the following equality is valid:
$$c_j(t_0,x_0) = \lim_{\delta\downarrow 0}\inf\{\sigma_j(x^{(c)}[T,t_0,x_0,U^*,V^*,\Delta]) : d(\Delta)\le\delta\},\quad j=1,2. \tag{14}$$
Let $\Delta=\{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$. Denote the state of the system at time $t_k$ by $x_k$, and the state of the $i$-th player's guide by $w^i_k=(d^i_k,\tau_k,w^{i,(a)}_k,w^{i,(c)}_k)$. Also let $z^i_k$ be chosen by rule (5) at time $t_k$. We have that $\tau_0=t_0$ and $\tau_{k+1}=t_k$ for $k\ge 0$. Moreover, $z^1_0=w^{1,(c)}_0=w^{2,(c)}_0=z^2_0$. Hence, using Lemma 3.2 inductively, we get that
$$z^1_k=w^{1,(c)}_k=z^2_k=w^{2,(c)}_k,\qquad d^i_{k+1}=\|x_k-z^i_k\|, \tag{15}$$
and
$$\|x_{k+1}-z^i_{k+1}\|\le\|x_k-z^i_k\|(1+\beta(t_{k+1}-t_k))+\varphi(t_{k+1}-t_k)(t_{k+1}-t_k)$$
for all $k=0,\dots,r-1$. It follows from (13) that
$$\|x_r-z^i_r\|\le[\|x_0-z^i_0\|+(1+(t_r-t_0))\varphi(d(\Delta))]\exp(\beta(t_r-t_0)).$$
Since $z^i_0=x_0$, we obtain that
$$\|x_r-z^i_r\|\le\varkappa(\delta):=[(1+(t_r-t_0))\varphi(\delta)]\exp(\beta(t_r-t_0)), \tag{16}$$
where $\delta=d(\Delta)$. Note that $\varkappa(\delta)\to 0$ as $\delta\to 0$. Let $\phi_j$ be a modulus of continuity of the function $\sigma_j$ on the set $E$:
$$\phi_j(\gamma):=\sup\{|\sigma_j(x')-\sigma_j(x'')| : x',x''\in E,\ \|x'-x''\|\le\gamma\}.$$
We have that
$$|\sigma_j(x_r)-\sigma_j(z^i_r)|\le\phi_j(\varkappa(\delta)). \tag{17}$$
Since $z^i_k=w^{i,(c)}_k$, it follows from (10) that $c_j(t_{k+1},w^{i,(c)}_{k+1})=c_j(t_k,z^i_k)=c_j(t_k,w^{i,(c)}_k)$. Therefore, using condition (F1) we get
$$|\sigma_j(x^{(c)}[T,t_0,x_0,U^*,V^*,\Delta])-c_j(t_0,x_0)|\le\phi_j(\varkappa(\delta))$$
with $\delta=d(\Delta)$. Passing to the limit, we obtain equality (14).

Now let us show that for all $(t_0,x_0)\in G$
$$c_2(t_0,x_0)\ge\lim_{\delta\downarrow 0}\sup\{\sigma_2(x_1[T,t_0,x_0,U^*,\Delta,v[\cdot]]) : d(\Delta)\le\delta,\ v[\cdot]\in\mathcal{V}\}. \tag{18}$$
Let $\Delta=\{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$, and let $v[\cdot]$ be a control of player II. Denote the state of the system at time $t_k$ by $x_k$, and the state of the first player's guide by $w^1_k=(d^1_k,\tau_k,w^{1,(a)}_k,w^{1,(c)}_k)$. Also let $z^1_k$ be chosen by rule (5) at time $t_k$.

We claim that inequality (12) is valid with $\gamma_k=\|z^1_k-x_k\|$. Note that $\tau_{k+1}=t_k$ and $d^1_{k+1}=\|z^1_k-x_k\|$. If $z^1_{k+1}=w^{1,(c)}_{k+1}$, then inequality (12) holds by construction. If $z^1_{k+1}=w^{1,(a)}_{k+1}$, then using Lemma 3.3 we obtain that inequality (12) is fulfilled as well.

Therefore, we have inequality (13) with $\gamma_0=0$ and $\gamma_k=\|z^1_k-x_k\|$. Hence, $\|z^1_r-x_r\|\le\varkappa(d(\Delta))$. Consequently, inequality (17) is fulfilled for $i=1$, $j=2$.

It follows from (5), (10), and (11) that
$$c_2(t_{k+1},z^1_{k+1})\le c_2(t_k,z^1_k). \tag{19}$$
Condition (F1), inequality (19), and the equality $z^1_0=x_0$ yield $\sigma_2(z^1_r)=c_2(T,z^1_r)\le c_2(t_0,x_0)$. From this we conclude that
$$\sigma_2(x_1[T,t_0,x_0,U^*,\Delta,v[\cdot]])\le c_2(t_0,x_0)+\phi_2(\varkappa(\delta))$$
with $\delta=d(\Delta)$. Passing to the limit, we get inequality (18).

Analogously one can prove the inequality
$$c_1(t_0,x_0)\ge\lim_{\delta\downarrow 0}\sup\{\sigma_1(x_2[T,t_0,x_0,V^*,\Delta,u[\cdot]]) : d(\Delta)\le\delta,\ u[\cdot]\in\mathcal{U}\}. \tag{20}$$
Combining equality (14) and inequalities (18), (20), we conclude that the strategies $U^*$ and $V^*$ form the Control with Guide Nash equilibrium on $G$.
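The extremal shift rule that drives this proof can be illustrated numerically. The sketch below is a toy illustration only, not the paper's construction: the dynamics $f$, $g$, the reference motion standing in for $y^{(c)}$, and the two-point control grids are all assumptions. Both players aim the real motion at the current guide point (the situation of Lemma 3.2, case (ii) of Lemma 3.1), and the gap between the real motion and the guide shrinks as the partition is refined, in line with estimates (13) and (16).

```python
import numpy as np

T = 1.0

def f(t, x, u): return np.array([u, 0.0])        # assumed f: player I moves x_1
def g(t, x, v): return np.array([0.0, v])        # assumed g: player II moves x_2

def reference(t):                                 # assumed guide motion (stand-in for y^(c))
    return np.array([np.sin(t), np.cos(t) - 1.0])

def max_gap(r):
    """Stepwise motion on a uniform partition with r steps; both players apply
    the extremal shift toward the guide point z = reference(t_k)."""
    ts = np.linspace(0.0, T, r + 1)
    x = reference(0.0).copy()
    gap = 0.0
    for tk, tk1 in zip(ts[:-1], ts[1:]):
        s = reference(tk) - x                    # aiming direction z - x
        u = 1.0 if s[0] >= 0 else -1.0           # maximizes <z - x, f(t,x,u)> over {-1, 1}
        v = 1.0 if s[1] >= 0 else -1.0           # maximizes <z - x, g(t,x,v)> over {-1, 1}
        x = x + (tk1 - tk) * (f(tk, x, u) + g(tk, x, v))
        gap = max(gap, float(np.linalg.norm(x - reference(tk1))))
    return gap

gaps = [max_gap(r) for r in (10, 100, 1000)]
print(gaps)
```

Refining the partition by a factor of ten shrinks the maximal gap by roughly the same factor, matching the $O(d(\Delta))$ character of estimate (16) for these toy data.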
Moreover, the Nash equilibrium payoff of player $i$ at the position $(t_0,x_0)$ is $c_i(t_0,x_0)$.

Infinitesimal Form of Conditions (F1)–(F4)

Define
$$H_1(t,x,s):=\max_{u\in P}\min_{v\in Q}\langle s, f(t,x,u)+g(t,x,v)\rangle,$$
$$H_2(t,x,s):=\max_{v\in Q}\min_{u\in P}\langle s, f(t,x,u)+g(t,x,v)\rangle.$$

Proposition 3.1.
Conditions (F2) and (F3) are equivalent to the following one: the function $c_i$ is a viscosity supersolution of the equation
$$\frac{\partial c_i}{\partial t}+H_i(t,x,\nabla c_i)=0. \tag{21}$$
This proposition directly follows from [20, Theorem 6.4].

Further, define a modulus derivative at the position $(t,x)$ in the direction $w\in\mathbb{R}^n$ by the rule
$$d_{\mathrm{abs}}(c_1,c_2)(t,x;w):=\liminf_{\delta\downarrow 0,\ w'\to w}\frac{|c_1(t+\delta,x+\delta w')-c_1(t,x)|+|c_2(t+\delta,x+\delta w')-c_2(t,x)|}{\delta}.$$

Proposition 3.2.
Condition (F4) is valid if and only if for every $(t,x)\in[0,T]\times\mathbb{R}^n$
$$\inf_{w\in F(t,x)}d_{\mathrm{abs}}(c_1,c_2)(t,x;w)=0,$$
where $F(t,x):=\mathrm{co}\{f(t,x,u)+g(t,x,v) : u\in P,\ v\in Q\}$.

Proof.
Condition (F4) means that the graph of the function $(c_1,c_2)$ is viable under the differential inclusion
$$(\dot x,\dot J_1,\dot J_2)\in\mathrm{co}\{(f(t,x,u)+g(t,x,v),0,0) : u\in P,\ v\in Q\}.$$
One can rewrite this condition in the infinitesimal form [1, Theorem 11.1.3]: for $J_1=c_1(t,x)$, $J_2=c_2(t,x)$ and some $w\in\mathrm{co}\{f(t,x,u)+g(t,x,v) : u\in P,\ v\in Q\}$ the inclusion
$$(w,0,0)\in D\,\mathrm{gr}(c_1,c_2)(t,(x,J_1,J_2)) \tag{22}$$
holds. Here $D$ denotes the contingent derivative. It is defined in the following way. Let $\mathcal{G}\subset[0,T]\times\mathbb{R}^m$, let $\mathcal{G}[t]$ denote the section of $\mathcal{G}$ at $t$: $\mathcal{G}[t]:=\{y\in\mathbb{R}^m : (t,y)\in\mathcal{G}\}$, and let the symbol $\mathrm{d}$ denote the Euclidean distance between a point and a set. Following [1], set
$$D\mathcal{G}(t,y):=\Big\{h\in\mathbb{R}^m : \liminf_{\delta\to 0}\frac{\mathrm{d}(y+\delta h;\mathcal{G}[t+\delta])}{\delta}=0\Big\}.$$
Let $J_i=c_i(t,x)$. We have that $(w,Y_1,Y_2)\in D\,\mathrm{gr}(c_1,c_2)(t,(x,J_1,J_2))$ if and only if there exist sequences $\{w_k\}_{k=1}^\infty$ and $\{\delta_k\}_{k=1}^\infty$ such that $w=\lim_{k\to\infty}w_k$ and
$$Y_i=\lim_{k\to\infty}\frac{c_i(t+\delta_k,x+\delta_k w_k)-c_i(t,x)}{\delta_k}.$$
Therefore, condition (22) is equivalent to the condition $d_{\mathrm{abs}}(c_1,c_2)(t,x;w)=0$ for some $w\in\mathrm{co}\{f(t,x,u)+g(t,x,v) : u\in P,\ v\in Q\}$.

System of Hamilton–Jacobi equations

Let us show that Theorem 3.1 generalizes the method based on the system of Hamilton–Jacobi equations. It is well known that the solutions of the system of Hamilton–Jacobi equations provide Nash equilibria [5]. For any $s\in\mathbb{R}^n$ let $\hat u(t,x,s)$ satisfy the condition
$$\langle s, f(t,x,\hat u(t,x,s))\rangle=\max\{\langle s, f(t,x,u)\rangle : u\in P\},$$
and let $\hat v(t,x,s)$ satisfy the condition
$$\langle s, g(t,x,\hat v(t,x,s))\rangle=\max\{\langle s, g(t,x,v)\rangle : v\in Q\}.$$
Set
$$\mathcal{H}_i(t,x,s_1,s_2):=\langle s_i, f(t,x,\hat u(t,x,s_1))+g(t,x,\hat v(t,x,s_2))\rangle.$$
Consider the system of Hamilton–Jacobi equations
$$\frac{\partial\varphi_i}{\partial t}+\mathcal{H}_i(t,x,\nabla\varphi_1,\nabla\varphi_2)=0,\qquad \varphi_i(T,x)=\sigma_i(x),\quad i=1,2. \tag{23}$$

Proposition 3.3.
If the function $(\varphi_1,\varphi_2)$ is a classical solution of system (23), then it satisfies conditions (F1)–(F4).

Proof. Condition (F1) is obvious. Since $(\varphi_1,\varphi_2)$ is a solution of system (23), we have that
$$0=\frac{\partial\varphi_1(t,x)}{\partial t}+\max_{u\in P}\langle\nabla\varphi_1(t,x),f(t,x,u)\rangle+\langle\nabla\varphi_1(t,x),g(t,x,\hat v(t,x,\nabla\varphi_2(t,x)))\rangle$$
$$\ge\frac{\partial\varphi_1(t,x)}{\partial t}+\max_{u\in P}\langle\nabla\varphi_1(t,x),f(t,x,u)\rangle+\min_{v\in Q}\langle\nabla\varphi_1(t,x),g(t,x,v)\rangle=\frac{\partial\varphi_1(t,x)}{\partial t}+H_1(t,x,\nabla\varphi_1(t,x)).$$
The subdifferential of the smooth function $\varphi_1$ is equal to $D^-\varphi_1(t,x)=\{(\partial\varphi_1(t,x)/\partial t,\nabla\varphi_1(t,x))\}$. Therefore, $\varphi_1$ is a viscosity supersolution of equation (21) for $i=1$ [20, Definition (U4)]. This is equivalent to condition (F2). Condition (F3) is proved in the same way.

Further, for the smooth function $(\varphi_1,\varphi_2)$
$$d_{\mathrm{abs}}(\varphi_1,\varphi_2)(t,x;w)=\Big|\frac{\partial\varphi_1(t,x)}{\partial t}+\langle\nabla\varphi_1(t,x),w\rangle\Big|+\Big|\frac{\partial\varphi_2(t,x)}{\partial t}+\langle\nabla\varphi_2(t,x),w\rangle\Big|.$$
Substituting $w=f(t,x,\hat u(t,x,\nabla\varphi_1(t,x)))+g(t,x,\hat v(t,x,\nabla\varphi_2(t,x)))$ gives condition (F4).

Generally, there may exist a smooth function $(c_1,c_2)$ satisfying conditions (F1)–(F4) that is not a solution of the system of Hamilton–Jacobi equations.

Example 3.1. Consider the system
$$\dot x_1=-v,\qquad \dot x_2=2u+v. \tag{24}$$
Here $t\in[0,1]$, $u,v\in[-1,1]$. The purpose of the $i$-th player is to maximize $x_i(1)$. The function $(c_1^*,c_2^*)$ with $c_1^*(t,x_1,x_2)=x_1+(1-t)$, $c_2^*(t,x_1,x_2)=x_2+(1-t)$ satisfies conditions (F1)–(F4), but it is not a solution of the system of Hamilton–Jacobi equations (23). Moreover, $c_i^*(t,x)>\varphi_i(t,x)$ for some solutions $(\varphi_1,\varphi_2)$ of system (23).

Proof. First let us write down the system of Hamilton–Jacobi equations for the case under consideration. Denote $\partial\varphi_1/\partial x_j$ by $p_j$, and $\partial\varphi_2/\partial x_j$ by $q_j$. The variables $\hat u$ and $\hat v$ satisfy the conditions
$$\max_{u\in[-1,1]}2p_2u=2p_2\hat u,\qquad \max_{v\in[-1,1]}(-q_1+q_2)v=(-q_1+q_2)\hat v.$$
Hence the system of Hamilton–Jacobi equations (23) takes the form
$$\frac{\partial\varphi_1}{\partial t}-p_1\hat v+p_2(2\hat u+\hat v)=0,\qquad \frac{\partial\varphi_2}{\partial t}-q_1\hat v+q_2(2\hat u+\hat v)=0. \tag{25}$$
The boundary conditions are $\varphi_1(1,x_1,x_2)=x_1$, $\varphi_2(1,x_1,x_2)=x_2$.

The function $(c_1^*,c_2^*)$ satisfies conditions (F1)–(F4). Indeed, condition (F1) holds obviously. Condition (F2) is valid with $v=1$; analogously, condition (F3) is valid with $u=-1$. Condition (F4) is valid with $u=1$, $v=-1$: along the corresponding motion both $c_1^*$ and $c_2^*$ are constant.

On the other hand, the pair of functions $(c_1^*,c_2^*)$ does not satisfy the system of Hamilton–Jacobi equations. Indeed,
$$\partial c_1^*/\partial x_1=p_1=1,\quad \partial c_1^*/\partial x_2=p_2=0,\quad \partial c_2^*/\partial x_1=q_1=0,\quad \partial c_2^*/\partial x_2=q_2=1,\quad \partial c_1^*/\partial t=\partial c_2^*/\partial t=-1.$$
Therefore, $\hat v=1$. Substitution into the first equation of (25) leads to a contradiction.

Further, consider the functions $\varphi_1(t,x_1,x_2)=x_1-(1-t)$, $\varphi_2^\alpha(t,x_1,x_2)=x_2+(1+2\alpha)(1-t)$. Here $\alpha$ is a parameter from $[-1,1]$. If $\hat v=1$ and $\hat u=\alpha$, then $(\varphi_1,\varphi_2^\alpha)$ is a classical solution of system (25). We have that for $\alpha\in[-1,0)$ and $t<1$
$$c_1^*(t,x_1,x_2)>\varphi_1(t,x_1,x_2),\qquad c_2^*(t,x_1,x_2)>\varphi_2^\alpha(t,x_1,x_2).$$

A continuous function $(c_1,c_2)$ satisfying conditions (F1)–(F4) does not exist in the general case.

Example 3.2. Let the dynamics of the system be given by
$$\dot x=u,\quad t\in[0,1],\ x\in\mathbb{R},\ u\in[-1,1].$$
The purpose of the first player is to maximize $|x(1)|$. The second player is fictitious, and his purpose is to maximize $x(1)$. In this case there is no continuous function satisfying conditions (F1)–(F4).

Proof. Let a function $(c_1,c_2)\colon[0,1]\times\mathbb{R}\to\mathbb{R}^2$ satisfy conditions (F1)–(F4). Condition (F2) means that
$$c_1(t,x)\ge c_1\Big(t^+,x+\int_t^{t^+}u(\theta)\,d\theta\Big) \tag{26}$$
for any $u\in\mathcal{U}$, $t^+\in[t,1]$. Hence $c_1(t,x)\ge|x|+(1-t)$. Condition (F4) means that there exists a control $u^*$ such that
$$c_1(t,x)=\Big|x+\int_t^1 u^*(\tau)\,d\tau\Big|,\qquad c_2(t,x)=x+\int_t^1 u^*(\tau)\,d\tau. \tag{27}$$
This yields the inequality
$$c_1(t,x)\le\max_{u\in[-1,1]}|x+u(1-t)|=|x|+(1-t).$$
From this and (26) it follows that $c_1(t,x)=|x|+(1-t)$. Moreover, $u^*(\cdot)\equiv 1$ for $x\ge 0$, and $u^*(\cdot)\equiv -1$ for $x\le 0$. Hence, $c_2(t,x)=x+(1-t)$ for $x>0$ and $c_2(t,x)=x-(1-t)$ for $x<0$, so $c_2$ cannot be continuous at $x=0$.

Discontinuous value function

Theorem 4.1.
Assume that there exists an upper semicontinuous multivalued function $S\colon[0,T]\times\mathbb{R}^n\rightrightarrows\mathbb{R}^2$ with nonempty images satisfying the following conditions:

(S1) $S(T,x)=\{(\sigma_1(x),\sigma_2(x))\}$, $x\in\mathbb{R}^n$;

(S2) for all $(t,x)\in[0,T]\times\mathbb{R}^n$, $(J_1,J_2)\in S(t,x)$, $u\in P$ and $t^+\in[t,T]$ there exist a motion $y(\cdot)\in\mathrm{Sol}_2(t,x;u)$ and a pair $(J_1',J_2')\in S(t^+,y(t^+))$ such that $J_1\ge J_1'$;

(S3) for all $(t,x)\in[0,T]\times\mathbb{R}^n$, $(J_1,J_2)\in S(t,x)$, $v\in Q$ and $t^+\in[t,T]$ there exist a motion $y(\cdot)\in\mathrm{Sol}_1(t,x;v)$ and a pair $(J_1'',J_2'')\in S(t^+,y(t^+))$ such that $J_2\ge J_2''$;

(S4) for all $(t,x)\in[0,T]\times\mathbb{R}^n$, $(J_1,J_2)\in S(t,x)$ and $t^+\in[t,T]$ there exists a motion $y^{(c)}(\cdot)\in\mathrm{Sol}(t,x)$ such that $(J_1,J_2)\in S(t^+,y^{(c)}(t^+))$.

Then for any selector $(\hat J_1,\hat J_2)$ of the multivalued function $S$ and any compact set $G\subset[0,T]\times\mathbb{R}^n$ there exists a Control with Guide Nash equilibrium on $G$ such that the corresponding Nash equilibrium payoff at $(t_0,x_0)\in G$ is $(\hat J_1(t_0,x_0),\hat J_2(t_0,x_0))\in S(t_0,x_0)$.

Remark 4.1. Let $U^*$, $V^*$ be Nash equilibrium strategies constructed for the compact $G\subset[0,T]\times\mathbb{R}^n$ and the selector $(\hat J_1,\hat J_2)$. The value of $(\hat J_1,\hat J_2)$ may vary along the Nash trajectory $x^{c*}[\cdot]$, that is, a limit of step-by-step motions generated by $U^*$ and $V^*$.
However, it follows from Theorem 4.1 that for any intermediate time instant $\theta$ there exists a pair of Nash equilibrium strategies such that the corresponding Nash equilibrium payoff at $(\theta,x^{c*}[\theta])$ is equal to the value of $(\hat J_1,\hat J_2)$ at the initial position. Analogously, if $x_1[\cdot]$ is a limit of step-by-step motions generated by the strategy $U^*$ of player I and a control $v[\cdot]$ of player II, then for any intermediate time instant $\theta$ there exists a pair of Nash equilibrium strategies such that the corresponding Nash equilibrium payoff of player II at $(\theta,x_1[\theta])$ does not exceed the value of the function $\hat J_2$ at the initial position.

Proof of Theorem 4.1. To prove the theorem we modify the construction of the guide proposed in the proof of Theorem 3.1. We assume that the guide consists of the following components: $d\in\mathbb{R}$ is an accumulated error, $\tau\in\mathbb{R}$ is the previous time of correction, $w^{(a)}$ is a punishment part of the guide, $w^{(c)}$ is a consistent part of the guide, and $Y_1\in\mathbb{R}$, $Y_2\in\mathbb{R}$ are expected payoffs of the players.

Let $(t,x)\in[0,T]\times\mathbb{R}^n$ be a position, $t^+>t$, $(J_1,J_2)\in S(t,x)$, $u\in P$, $v\in Q$. Let a motion $y(\cdot)$ satisfy condition (S2); denote $b_2(t^+,t,x,J_1,J_2,u):=y(t^+)$. Analogously, let $y(\cdot)$ satisfy condition (S3); put $b_1(t^+,t,x,J_1,J_2,v):=y(t^+)$. Also, if $y^{(c)}(\cdot)$ satisfies condition (S4), then denote $b^c(t^+,t,x,J_1,J_2):=y^{(c)}(t^+)$.

First let us define the functions
$$\chi_1(t_0,x_0)=\chi_2(t_0,x_0):=(d_0,\tau_0,w^{(c)}_0,w^{(a)}_0,Y_{1,0},Y_{2,0})$$
by the following rule: $d_0:=0$, $\tau_0:=t_0$, $w^{(c)}_0=w^{(a)}_0:=x_0$, $Y_{1,0}:=\hat J_1(t_0,x_0)$, $Y_{2,0}:=\hat J_2(t_0,x_0)$.

Now we shall define the controls and transition functions of the guides. Let $t$ be a time instant. Assume that at time $t$ the state of the system is $x$, and the state of the $i$-th player's guide is $w^i=(d^i,\tau^i,w^{(a),i},w^{(c),i},Y_1^i,Y_2^i)$. Define $z^i$ by rule (5). Now let us consider the case of the first player.
Put
$$(Y_1^{1,+},Y_2^{1,+}):=\begin{cases}(Y_1^1,Y_2^1), & z^1=w^{(c),1},\\ (Y_1'',Y_2''), & z^1=w^{(a),1}.\end{cases}$$
Here $(Y_1'',Y_2'')$ is an element of $S(t,w^{(a),1})$ such that $Y_2''=\min\{J_2 : (J_1,J_2)\in S(t,w^{(a),1})\}$. Choose $u^*$ by rule (6), and $v^*$ by (7). As above, put $u_1(t,x,w^1):=u^*$; also set
$$\psi_1(t^+,t,x,w^1):=(d^+,\tau^+,w^{(a),1,+},w^{(c),1,+},Y_1^{1,+},Y_2^{1,+}),$$
where
$$d^+=\|z^1-x\|,\quad \tau^+=t,\quad w^{(a),1,+}=b_1(t^+,t,z^1,Y_1^{1,+},Y_2^{1,+},v^*),\quad w^{(c),1,+}=b^c(t^+,t,z^1,Y_1^{1,+},Y_2^{1,+}).$$

The case of the second player is considered in the same way. Put
$$(Y_1^{2,+},Y_2^{2,+}):=\begin{cases}(Y_1^2,Y_2^2), & z^2=w^{(c),2},\\ (Y_1',Y_2'), & z^2=w^{(a),2}.\end{cases}$$
Here $(Y_1',Y_2')$ is an element of $S(t,w^{(a),2})$ such that $Y_1'=\min\{J_1 : (J_1,J_2)\in S(t,w^{(a),2})\}$. Let $v^*$ satisfy condition (8), and let $u^*$ satisfy condition (9). Put $v_2(t,x,w^2):=v^*$. Further, set
$$\psi_2(t^+,t,x,w^2):=(d^+,\tau^+,w^{(a),2,+},w^{(c),2,+},Y_1^{2,+},Y_2^{2,+}),$$
where
$$d^+=\|z^2-x\|,\quad \tau^+=t,\quad w^{(a),2,+}=b_2(t^+,t,z^2,Y_1^{2,+},Y_2^{2,+},u^*),\quad w^{(c),2,+}=b^c(t^+,t,z^2,Y_1^{2,+},Y_2^{2,+}).$$

First let us prove that for any $(t_0,x_0)\in G$ the following equality is fulfilled:
$$\hat J_i(t_0,x_0)=\lim_{\delta\downarrow 0}\inf\{\sigma_i(x^{(c)}[T,t_0,x_0,U^*,V^*,\Delta]) : d(\Delta)\le\delta\},\quad i=1,2. \tag{28}$$
Let $\Delta=\{t_k\}_{k=0}^r$ be a partition of $[t_0,T]$, $d(\Delta)\le\delta$, and $x^c[\cdot]:=x^{(c)}[\cdot,t_0,x_0,U^*,V^*,\Delta]$. Extend the partition $\Delta$ by adding the element $t_{r+1}=t_r=T$. Denote $x_k:=x^c[t_k]$. Let us denote the state of the $i$-th player's guide at time $t_k$ by $w^i_k=(d^i_k,\tau_k,w^{(a),i}_k,w^{(c),i}_k,Y^i_{1,k},Y^i_{2,k})$. Let $z^i_k$ be the position chosen by rule (5) for the $i$-th player at time $t_k$.

It follows from Lemma 3.2 that the point $z^i_k$ is equal to $w^{(c),i}_k$. In addition, $w^{(c),1}_k=w^{(c),2}_k$, and the following inequality is valid:
$$\|x_k-w^{(c),i}_k\|\le\|x_{k-1}-z^i_{k-1}\|(1+\beta(t_k-\tau_{k-1}))+\varphi(t_k-\tau_{k-1})(t_k-\tau_{k-1}).$$
Applying this inequality sequentially and using the equality $z^i_0=x_0$, we get estimate (16) for $i=1,2$. Further, estimate (17) holds for $i=1,2$ and $j=1,2$. The choice of $z^i_k$ yields that $(Y^i_{1,k},Y^i_{2,k})=(Y^i_{1,k-1},Y^i_{2,k-1})$ and $(Y^i_{1,k},Y^i_{2,k})\in S(t_{k-1},z^i_{k-1})$ for $k=1,\dots,r+1$. Also, the construction of the function $\chi_i$ leads to the equality $(Y^i_{1,0},Y^i_{2,0})=(\hat J_1(t_0,x_0),\hat J_2(t_0,x_0))$. Hence,
$$(\hat J_1(t_0,x_0),\hat J_2(t_0,x_0))\in S(t_r,z^i_r)=\{(\sigma_1(z^i_r),\sigma_2(z^i_r))\}.$$
By (17) we conclude that equality (28) holds.

Now let us prove that for any position $(t_0,x_0)\in G$ the following inequality is fulfilled:
$$\hat J_2(t_0,x_0)\ge\lim_{\delta\downarrow 0}\sup\{\sigma_2(x_1[T,t_0,x_0,U^*,\Delta,v[\cdot]]) : d(\Delta)\le\delta,\ v[\cdot]\in\mathcal{V}\}. \tag{29}$$
As above, let $\Delta=\{t_k\}_{k=0}^r$ be a partition of the interval $[t_0,T]$, $d(\Delta)\le\delta$, and $x_1[\cdot]=x_1[\cdot,t_0,x_0,U^*,\Delta,v[\cdot]]$. We add the element $t_{r+1}=t_r=T$ to the partition $\Delta$. Denote $x_k:=x_1[t_k]$. Let us denote the state of the first player's guide at time $t_k$ by $w^1_k=(d^1_k,\tau_k,w^{(a),1}_k,w^{(c),1}_k,Y_{1,k},Y_{2,k})$. Further, let $z^1_k$ be the point chosen by rule (5) for the first player at time $t_k$.

The choice of $z^1_k$ (see (5)) and Lemma 3.3 yield the inequality
$$\|x_k-z^1_k\|\le\|x_{k-1}-z^1_{k-1}\|(1+\beta(t_k-t_{k-1}))+\varphi(t_k-t_{k-1})(t_k-t_{k-1}).$$
Applying this inequality sequentially and using the equality $z^1_0=x_0$, we get estimate (16) for $i=1$. Therefore, inequality (17) is fulfilled for $i=1$, $j=2$. In addition, $Y_{2,k}\le Y_{2,k-1}$. Indeed, if $z^1_k=w^{(c),1}_k$, then $(Y_{1,k},Y_{2,k})=(Y_{1,k-1},Y_{2,k-1})$. If $z^1_k=w^{(a),1}_k$, then the element $(Y_{1,k},Y_{2,k})$ is chosen so that $Y_{2,k}$ is the minimum of $\{J_2 : (J_1,J_2)\in S(t_{k-1},z^1_{k-1})\}$; by the construction we have $(Y_{1,k-1},Y_{2,k-1})\in S(t_{k-1},z^1_{k-1})$. Hence, using condition (S1) we obtain that
$$\hat J_2(t_0,x_0)\ge Y_{2,r+1}=\sigma_2(z^1_r). \tag{30}$$
Since inequality (17) is valid for $i=1$, $j=2$, estimate (30) yields inequality (29).

Analogously, we get that for any position $(t_0,x_0)\in G$ the inequality
$$\hat J_1(t_0,x_0)\ge\lim_{\delta\downarrow 0}\sup\{\sigma_1(x_2[T,t_0,x_0,V^*,\Delta,u[\cdot]]) : d(\Delta)\le\delta,\ u[\cdot]\in\mathcal{U}\} \tag{31}$$
is fulfilled. Equality (28) and inequalities (29), (31) mean that the pair of strategies $U^*$ and $V^*$ is a Nash equilibrium on $G$. Moreover, the Nash equilibrium payoff at the initial position $(t_0,x_0)\in G$ is equal to $(\hat J_1(t_0,x_0),\hat J_2(t_0,x_0))$.

Existence of Multivalued Value Function
In order to prove the existence of a multivalued function satisfying conditions (S1)–(S4) we consider an auxiliary discrete-time dynamical game. Let $N$ be a natural number, and let $\delta_N := T/N$ be a time step. We discretize $[0,T]$ by means of the uniform grid $\Delta^N := \{t_k^N\}_{k=0}^N$ with $t_k^N = k\delta_N$.

Consider the discrete-time control system
$$\xi^N(t_{k+1}^N) = \xi^N(t_k^N) + \delta_N[f(t_k^N, \xi^N(t_k^N), u(t_k^N)) + g(t_k^N, \xi^N(t_k^N), v(t_k^N))],$$
$$k = \overline{0, N-1},\quad u(t_k^N)\in P,\quad v(t_k^N)\in Q. \qquad (32)$$
Denote
$$\mathcal U_N := \{u : [0,T]\to P : u(t) = u_k^N \in P \text{ for } t\in[t_k^N, t_{k+1}^N)\},$$
$$\mathcal V_N := \{v : [0,T]\to Q : v(t) = v_k^N \in Q \text{ for } t\in[t_k^N, t_{k+1}^N)\}.$$
For $t_*\in\Delta^N$, $\xi_*\in\mathbb R^n$, $u\in\mathcal U_N$, and $v\in\mathcal V_N$ let $\xi^N(\cdot, t_*, \xi_*, u, v) : \Delta^N\cap[t_*,T]\to\mathbb R^n$ be the solution of initial value problem (32) with $\xi^N(t_*) = \xi_*$.

First, we shall estimate $\|\xi^N(t_+, t_*, \xi_*, u, v) - x(t_+, t_*, x_*, u, v)\|$. Let $G\subset[0,T]\times\mathbb R^n$ be a compact set of initial positions. Let $E'\subset\mathbb R^n$ be a compact set such that $x(t, t_*, x_*, u, v)\in E'$ and $\xi^N(t, t_*, x_*, u, v)\in E'$ for all natural $N$, $(t_*,x_*)\in G$, $t, t_*\in\Delta^N$, $u\in\mathcal U_N$, $v\in\mathcal V_N$. Set
$$K' := \max\{\|f(t,x,u) + g(t,x,v)\| : t\in[0,T],\ x\in E',\ u\in P,\ v\in Q\}.$$
Denote by $L'$ the Lipschitz constant of the function $f+g$ on $[0,T]\times E'\times P\times Q$: for all $t\in[0,T]$, $x', x''\in E'$, $u\in P$, $v\in Q$,
$$\|f(t,x',u) + g(t,x',v) - f(t,x'',u) - g(t,x'',v)\| \le L'\|x' - x''\|.$$
Further, set
$$\varphi'(\delta) := \sup\{\|f(t',x',u) - f(t'',x'',u)\| + \|g(t',x',v) - g(t'',x'',v)\| :$$
$$t',t''\in[0,T],\ x',x''\in E',\ |t'-t''|\le\delta,\ \|x'-x''\|\le K'\delta,\ u\in P,\ v\in Q\}.$$

Lemma 5.1.
If $t_*, t_+\in\Delta^N$, $t_+\ge t_*$, $(t_*,x_*), (t_*,\xi_*)\in G$, $u\in\mathcal U_N$, and $v\in\mathcal V_N$, then
$$\|x(t_+,t_*,x_*,u,v) - \xi^N(t_+,t_*,\xi_*,u,v)\| \le \|x_* - \xi_*\|\exp(2L'(t_+ - t_*)) + \varphi'(\delta_N)\exp(L'(t_+ - t_*)). \qquad (33)$$

Proof.
Let $m$ and $r$ be natural numbers such that $t_* = t_m^N$, $t_+ = t_r^N$. Denote $x(\cdot) := x(\cdot, t_*, x_*, u, v)$, $x_k := x(t_k^N, t_*, x_*, u, v)$, $\xi_k := \xi^N(t_k^N, t_*, \xi_*, u, v)$. We have that
$$x_{k+1} = x_k + \int_{t_k^N}^{t_{k+1}^N}[f(t,x(t),u_k) + g(t,x(t),v_k)]\,dt$$
$$= x_k + \delta_N[f(t_k^N,x_k,u_k) + g(t_k^N,x_k,v_k)] + \int_{t_k^N}^{t_{k+1}^N}[f(t,x(t),u_k) + g(t,x(t),v_k) - f(t_k^N,x_k,u_k) - g(t_k^N,x_k,v_k)]\,dt.$$
Here $u_k$ and $v_k$ denote the values of $u$ and $v$ on $[t_k^N, t_{k+1}^N)$, respectively. Further,
$$\|x(t) - x_k\| \le K'(t - t_k^N),\quad t\in[t_k^N, t_{k+1}^N].$$
Therefore, the following inequality is fulfilled:
$$\Big\|\int_{t_k^N}^{t_{k+1}^N}[f(t,x(t),u_k) + g(t,x(t),v_k) - f(t_k^N,x_k,u_k) - g(t_k^N,x_k,v_k)]\,dt\Big\| \le \delta_N\varphi'(\delta_N).$$
Hence,
$$\|x_{k+1} - x_k - \delta_N[f(t_k^N,x_k,u_k) + g(t_k^N,x_k,v_k)]\| \le \delta_N\varphi'(\delta_N). \qquad (34)$$
Further, we have
$$x_k + \delta_N[f(t_k^N,x_k,u_k) + g(t_k^N,x_k,v_k)] - \xi_{k+1} = x_k - \xi_k + \delta_N[f(t_k^N,x_k,u_k) + g(t_k^N,x_k,v_k) - f(t_k^N,\xi_k,u_k) - g(t_k^N,\xi_k,v_k)].$$
Consequently,
$$\|x_k + \delta_N[f(t_k^N,x_k,u_k) + g(t_k^N,x_k,v_k)] - \xi_{k+1}\| \le \|x_k - \xi_k\| + \delta_N L'\|x_k - \xi_k\|.$$
This inequality and estimate (34) yield that
$$\|x_{k+1} - \xi_{k+1}\| \le \|x_k - \xi_k\| + \delta_N L'\|x_k - \xi_k\| + \delta_N\varphi'(\delta_N).$$
Applying the last inequality sequentially we get inequality (33).

Now let us prove the existence of a function satisfying discrete-time analogs of conditions (S1)–(S4).
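The error estimate of Lemma 5.1 is easy to observe numerically. The following Python sketch is purely illustrative (the scalar dynamics $f$, $g$, the constant controls, and all numerical values are assumptions, not data from the paper): it integrates the discrete system (32) once with a single Euler step per grid interval and once with many substeps per interval as a proxy for the continuous motion $x(\cdot)$, driven by the same controls, and the gap shrinks as the step $\delta_N$ decreases.

```python
import math

# Toy scalar dynamics (illustrative assumption, not from the paper).
def f(t, x, u):
    return u * math.sin(x)

def g(t, x, v):
    return 0.5 * v

T = 1.0
U, V = 1.0, -1.0  # constant controls, so the same signal is used for every N

def integrate(N, sub):
    """Integrate on the grid t_k = k*T/N with `sub` Euler substeps per
    interval; sub=1 is exactly the discrete system (32), while a large
    `sub` approximates the continuous motion x(.) under the same controls."""
    dt = T / N
    x = 0.3  # initial state (illustrative)
    for k in range(N):
        h = dt / sub
        for j in range(sub):
            t = k * dt + j * h
            x += h * (f(t, x, U) + g(t, x, V))
    return x

def err(N):
    # distance between xi^N(T) and a fine approximation of x(T)
    return abs(integrate(N, 1) - integrate(N, 400))
```

Refining the grid from $N = 10$ to $N = 80$ shrinks `err(N)` roughly in proportion to $\delta_N$, in line with the $\varphi'(\delta_N)$ term of estimate (33) vanishing as $\delta_N\to 0$.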
Theorem 5.1.
For any natural $N$ there exists an upper semicontinuous multivalued function $Z^N : \Delta^N\times\mathbb R^n \rightrightarrows \mathbb R^2$ satisfying the following properties:

1. $Z^N(T,\xi) = \{(\sigma_1(\xi), \sigma_2(\xi))\}$;

2. for all $(t_*,\xi_*)\in\Delta^N\times\mathbb R^n$, $u\in P$, $(Y_1,Y_2)\in Z^N(t_*,\xi_*)$ and $t_+\in\Delta^N$, $t_+ > t_*$, there exist a control $v\in\mathcal V_N$ and a pair $(Y_1',Y_2')\in Z^N(t_+, \xi^N(t_+,t_*,\xi_*,u,v))$ such that $Y_1 \ge Y_1'$;

3. for all $(t_*,\xi_*)\in\Delta^N\times\mathbb R^n$, $v\in Q$, $(Y_1,Y_2)\in Z^N(t_*,\xi_*)$ and $t_+\in\Delta^N$, $t_+ > t_*$, there exist a control $u\in\mathcal U_N$ and a pair $(Y_1'',Y_2'')\in Z^N(t_+, \xi^N(t_+,t_*,\xi_*,u,v))$ such that $Y_2 \ge Y_2''$;

4. for all $(t_*,\xi_*)\in\Delta^N\times\mathbb R^n$, $(Y_1,Y_2)\in Z^N(t_*,\xi_*)$ and $t_+\in\Delta^N$, $t_+ > t_*$, there exist controls $u\in\mathcal U_N$ and $v\in\mathcal V_N$ such that $(Y_1,Y_2)\in Z^N(t_+, \xi^N(t_+,t_*,\xi_*,u,v))$.

Proof. In the proof we fix the number $N$ and omit the superindex $N$. Denote
$$f_k(z,u) := \delta f(t_k, z, u),\quad g_k(z,v) := \delta g(t_k, z, v).$$
The proof is by inverse induction on $k$. For $k = N$ put $Z(t_N, z) := \{(\sigma_1(z), \sigma_2(z))\}$.

Now let $k\in\overline{0, N-1}$. Assume that the values $Z(t_{k+1}, z), \ldots, Z(t_N, z)$ are constructed for all $z\in\mathbb R^n$. In addition, suppose that the functions $Z(t_{k+1},\cdot), \ldots, Z(t_N,\cdot)$ are upper semicontinuous. Define
$$\varsigma_i^{k+1}(z) := \min\{Y_i : (Y_1,Y_2)\in Z(t_{k+1}, z)\},\quad i = 1, 2.$$
It follows from the upper semicontinuity of the multivalued function $Z(t_{k+1},\cdot)$ that the functions $\varsigma_1^{k+1}$ and $\varsigma_2^{k+1}$ are lower semicontinuous. Set
$$W_k(z) := \bigcup_{u\in P,\, v\in Q} Z(t_{k+1}, \xi(t_{k+1}, t_k, z, u, v)),$$
$$\varrho_k^1(z) := \max_{u\in P}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v)), \qquad (35)$$
$$\varrho_k^2(z) := \max_{v\in Q}\min_{u\in P}\varsigma_2^{k+1}(\xi(t_{k+1}, t_k, z, u, v)). \qquad (36)$$

We claim that the multivalued function $W_k$ is upper semicontinuous. Indeed, let $z_l\to z_*$, and let $(Y_1^l, Y_2^l)\in W_k(z_l)$ be such that $(Y_1^l, Y_2^l)\to(Y_1^*, Y_2^*)$. We have that $(Y_1^l, Y_2^l)\in Z(t_{k+1}, \xi(t_{k+1}, t_k, z_l, u_l, v_l))$ for some $u_l\in P$, $v_l\in Q$. We can assume without loss of generality that $(u_l, v_l)\to(u_*, v_*)$. By the continuity of the functions $f_k$ and $g_k$ we get that $\xi(t_{k+1}, t_k, z_l, u_l, v_l) = z_l + f_k(z_l, u_l) + g_k(z_l, v_l) \to \xi(t_{k+1}, t_k, z_*, u_*, v_*)$ as $l\to\infty$. The upper semicontinuity of the multivalued function $Z(t_{k+1},\cdot)$ yields that $(Y_1^*, Y_2^*)\in Z(t_{k+1}, \xi(t_{k+1}, t_k, z_*, u_*, v_*))\subset W_k(z_*)$.

Now let us show that the functions $\varrho_k^i$ are lower semicontinuous. We give the proof only for the case $i = 1$. For a fixed $u\in P$ consider the function $z\mapsto\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v))$. We shall prove that this function is lower semicontinuous, i.e. for any $z_*$ the following inequality holds:
$$\liminf_{z\to z_*}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v)) \ge \min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_*, u, v)). \qquad (37)$$
Let $\{z_l\}_{l=1}^\infty$ be a minimizing sequence:
$$\liminf_{z\to z_*}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v)) = \lim_{l\to\infty}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_l, u, v)).$$
Let $v_l\in Q$ satisfy the condition
$$\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_l, u, v_l)) = \min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_l, u, v)).$$
Hence we have
$$\liminf_{z\to z_*}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v)) = \lim_{l\to\infty}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_l, u, v_l)). \qquad (38)$$
We can assume without loss of generality that the sequence $\{v_l\}$ converges to a control $v_*\in Q$. From the continuity of the function $\xi(t_{k+1}, t_k, \cdot, u, \cdot)$ and the lower semicontinuity of the function $\varsigma_1^{k+1}$ we obtain that
$$\lim_{l\to\infty}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_l, u, v_l)) \ge \varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_*, u, v_*)) \ge \min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z_*, u, v)).$$
This proves (37). Since the functions $z\mapsto\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v))$ are lower semicontinuous for each $u\in P$, the function
$$\varrho_k^1(z) = \max_{u\in P}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, z, u, v))$$
is lower semicontinuous.

Put
$$Z(t_k, z) := \{(Y_1, Y_2)\in W_k(z) : Y_i \ge \varrho_k^i(z),\ i = 1, 2\}. \qquad (39)$$
First, we shall prove that this set is nonempty. Let $z\in\mathbb R^n$. Let $u_*$ maximize the right-hand side of (35), and let $v_*$ maximize the right-hand side of (36). Choose $(Y_1, Y_2)\in Z(t_{k+1}, \xi(t_{k+1}, t_k, z, u_*, v_*))$. We have that $(Y_1, Y_2)\in W_k(z)$. Further,
$$\varrho_k^i(z) \le \varsigma_i^{k+1}(\xi(t_{k+1}, t_k, z, u_*, v_*)) \le Y_i.$$
Therefore, $(Y_1, Y_2)\in Z(t_k, z)$.

The upper semicontinuity of the function $Z(t_k,\cdot)$ follows from (39), the upper semicontinuity of the multivalued function $W_k$, and the lower semicontinuity of the functions $\varrho_k^i$.

Now let us show that the function $Z$ satisfies conditions 1–4 of the theorem. Note that conditions 1 and 4 are fulfilled by the construction. Let us prove conditions 2 and 3. Let $(t_*, \xi_*)\in\Delta^N\times\mathbb R^n$, $t_+\in\Delta^N$, $t_+ > t_*$, $u_*\in P$, $(Y_1, Y_2)\in Z(t_*, \xi_*)$. It suffices to consider the case $t_* = t_k$, $t_+ = t_{k+1}$.
By the construction of the function $Z$ we have that $Y_1 \ge \varrho_k^1(\xi_*)$. From the definition of the function $\varrho_k^1$ (see (35)) it follows that
$$Y_1 \ge \max_{u\in P}\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, \xi_*, u, v)) \ge \min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, \xi_*, u_*, v)).$$
Let $v_*\in Q$ be a control of player II such that
$$\min_{v\in Q}\varsigma_1^{k+1}(\xi(t_{k+1}, t_k, \xi_*, u_*, v)) = \varsigma_1^{k+1}(\xi(t_{k+1}, t_k, \xi_*, u_*, v_*)).$$
From the definition of the function $\varsigma_1^{k+1}$ we get that there exists a pair $(Y_1', Y_2')\in Z(t_{k+1}, \xi(t_{k+1}, t_k, \xi_*, u_*, v_*))$ such that $Y_1' = \varsigma_1^{k+1}(\xi(t_{k+1}, t_k, \xi_*, u_*, v_*))$. Consequently, $Y_1 \ge Y_1'$. Hence, condition 2 holds. Condition 3 is proved analogously.

Theorem 5.2.
There exists an upper semicontinuous multivalued function $S : [0,T]\times\mathbb R^n\rightrightarrows\mathbb R^2$ with nonempty images satisfying conditions (S1)–(S4).

The proof of Theorem 5.2 is given at the end of the section. First, for each $N$ define the multivalued function $S^N : [0,T]\times\mathbb R^n\rightrightarrows\mathbb R^2$ by the following rule:
$$S^N(t,x) := \begin{cases} Z^N(t_k^N, x), & t\in(t_{k-1}^N, t_k^N),\ k = \overline{1, N},\\ Z^N(t_k^N, x)\cup Z^N(t_{k+1}^N, x), & t = t_k^N,\ k = \overline{0, N-1},\\ Z^N(t_N^N, x), & t = T. \end{cases} \qquad (40)$$
The functions $S^N$ have closed graphs.

Denote $B(\nu) := \{x : \|x\|\le\nu\}$. For $\Sigma : [0,T]\times\mathbb R^n\rightrightarrows\mathbb R^2$ set
$$\mathrm{Gr}_\nu\Sigma := \{(t, x, Y_1, Y_2) : \|x\|\le\nu,\ (Y_1, Y_2)\in\Sigma(t,x)\}.$$
The sets $\mathrm{Gr}_\nu S^N$ are compact. Indeed,
$$M_{i,\nu} := \max\{|\sigma_i(x(T, t_*, x_*, u, v))| : t_*\in[0,T],\ \|x_*\|\le\nu,\ u\in\mathcal U,\ v\in\mathcal V\} < \infty.$$
We have that
$$\mathrm{Gr}_\nu S^N\subset[0,T]\times B(\nu)\times[-M_{1,\nu}, M_{1,\nu}]\times[-M_{2,\nu}, M_{2,\nu}].$$

Consider the Hausdorff distance between compact sets $A, B\subset[0,T]\times\mathbb R^n\times\mathbb R^2$:
$$h(A,B) := \max\Big\{\max_{(t,x,Y_1,Y_2)\in A}\mathrm d((t,x,Y_1,Y_2), B),\ \max_{(t,x,Y_1,Y_2)\in B}\mathrm d((t,x,Y_1,Y_2), A)\Big\}.$$
Here $\mathrm d((t,x,Y_1,Y_2), A)$ is the distance from the point $(t,x,Y_1,Y_2)$ to the set $A$ generated by the norm $\|(t,x,Y_1,Y_2)\| = |t| + \|x\| + |Y_1| + |Y_2|$. Since for any $\nu$ the set $[0,T]\times B(\nu+1)\times[-M_{1,\nu+1}, M_{1,\nu+1}]\times[-M_{2,\nu+1}, M_{2,\nu+1}]$ is compact, using [18, Theorem 4.18] we get that one can extract a convergent subsequence from the sequence $\{\mathrm{Gr}_{\nu+1}S^N\}_{N=1}^\infty$. Using the diagonal process we construct a subsequence $\{N_j\}$ such that for any $\nu$ there exists the limit
$$\lim_{j\to\infty}\mathrm{Gr}_{\nu+1}S^{N_j} = R_\nu.$$
One can choose the subsequence $\{N_j\}$ satisfying the property
$$h(\mathrm{Gr}_{\nu+1}S^{N_j}, R_\nu) \le 2^{-j}\quad\text{for } j\ge\nu.$$
Denote $\widetilde S_j := S^{N_j}$.

Lemma 5.2.
Let $(Y_1^l, Y_2^l)\in\widetilde S_{j_l}(t_l, x_l)$, $\|x_l\|\le\nu+1$, $(t_l, x_l)\to(t_*, x_*)$, $(Y_1^l, Y_2^l)\to(Y_1^*, Y_2^*)$ as $l\to\infty$. Then $(t_*, x_*, Y_1^*, Y_2^*)\in R_\nu$.

Proof. Consider the set $R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\}$. This set is closed. We claim that
$$h(\mathrm{Gr}_{\nu+1}\widetilde S_{j_l}, R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\})\to 0,\quad l\to\infty. \qquad (41)$$
Indeed, $\mathrm d((t,x,Y_1,Y_2), R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\})\le\mathrm d((t,x,Y_1,Y_2), R_\nu)$ for all $(t,x,Y_1,Y_2)\in\mathrm{Gr}_{\nu+1}\widetilde S_{j_l}$. Hence
$$\max_{(t,x,Y_1,Y_2)\in\mathrm{Gr}_{\nu+1}\widetilde S_{j_l}}\mathrm d((t,x,Y_1,Y_2), R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\})\to 0,\quad\text{as } l\to\infty. \qquad (42)$$
Further, the following convergence is valid:
$$\max_{(t,x,Y_1,Y_2)\in R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\}}\mathrm d((t,x,Y_1,Y_2), \mathrm{Gr}_{\nu+1}\widetilde S_{j_l})\to 0,\quad\text{as } l\to\infty.$$
This and (42) yield (41). Formula (41) means that
$$R_\nu\cup\{(t_*, x_*, Y_1^*, Y_2^*)\} = \lim_{l\to\infty}\mathrm{Gr}_{\nu+1}\widetilde S_{j_l} = R_\nu.$$
This completes the proof.

Lemma 5.3.
For $r > \nu$ the following equality holds:
$$R_r\cap([0,T]\times B(\nu)\times\mathbb R^2) = R_\nu\cap([0,T]\times B(\nu)\times\mathbb R^2).$$

Proof.
Let $(t, x, Y_1, Y_2)\in R_r$, $\|x\|\le\nu$, and $j\ge r$. There exists a quadruple $(\theta_j, y_j, \zeta_{1,j}, \zeta_{2,j})\in\mathrm{Gr}_{r+1}\widetilde S_j$ such that
$$|t - \theta_j| + \|x - y_j\| + |Y_1 - \zeta_{1,j}| + |Y_2 - \zeta_{2,j}| = \mathrm d((t, x, Y_1, Y_2), \mathrm{Gr}_{r+1}\widetilde S_j)\le 2^{-j}. \qquad (43)$$
Therefore, $\|x - y_j\|\le\mathrm d((t, x, Y_1, Y_2), \mathrm{Gr}_{r+1}\widetilde S_j)\le 2^{-j}$. We have that $\|y_j\|\le\|x\| + 2^{-j}\le\nu+1$. Therefore, $(\theta_j, y_j, \zeta_{1,j}, \zeta_{2,j})\in\mathrm{Gr}_{\nu+1}\widetilde S_j$. It follows from formula (43) and Lemma 5.2 that $(t, x, Y_1, Y_2)\in R_\nu$. Since the quadruple $(t, x, Y_1, Y_2)$ satisfies the condition $\|x\|\le\nu$, we conclude that
$$R_r\cap([0,T]\times B(\nu)\times\mathbb R^2)\subset R_\nu\cap([0,T]\times B(\nu)\times\mathbb R^2).$$
The opposite inclusion is proved in the same way.

Define the multivalued function $\bar S : [0,T]\times\mathbb R^n\rightrightarrows\mathbb R^2$ by the following rule: for $\|x\|\le\nu$,
$$\bar S(t,x) := \{(Y_1, Y_2) : (t, x, Y_1, Y_2)\in R_\nu\}.$$
Note that this definition is correct by virtue of Lemma 5.3. We have that $\mathrm{Gr}_\nu\bar S = R_\nu\cap([0,T]\times B(\nu)\times\mathbb R^2)$.

Proof of Theorem 5.2.
We shall show that the function $\bar S$ has nonempty images and satisfies conditions (S1)–(S4).

First we shall prove that the sets $\bar S(t,x)$ are nonempty. Let $\nu$ satisfy the condition $\|x\| < \nu$, and let $(Y_{1,j}, Y_{2,j})\in\widetilde S_j(t,x)$. Since $\widetilde S_j(t,x)\subset[-M_{1,\nu}, M_{1,\nu}]\times[-M_{2,\nu}, M_{2,\nu}]$, there exists a subsequence $\{(Y_{1,j_l}, Y_{2,j_l})\}_{l=1}^\infty$ converging to a pair $(Y_1^*, Y_2^*)$. By Lemma 5.2 we obtain that $(Y_1^*, Y_2^*)\in\bar S(t,x)$.

Now let us prove that the multivalued function $\bar S$ satisfies conditions (S1)–(S4). We begin with condition (S1). Let $x_*\in\mathbb R^n$. Choose $\nu$ such that the following conditions hold:

1. $x(t, T, x_*, u, v)\in B(\nu)$ for all $t\in[0,T]$, $u\in\mathcal U$, $v\in\mathcal V$;

2. all $z$ such that $x_* = \xi^N(T, t, z, u, v)$ for some natural $N$, $t\in\Delta^N$, $u\in\mathcal U_N$, $v\in\mathcal V_N$ belong to $B(\nu)$.

Let $K_\nu$ be defined by (3) for $E = B(\nu+1)$.

Let $N$ be a natural number, $t_*\in\Delta^N$, and $\xi_*\in B(\nu)$. By conditions 1 and 4 of Theorem 5.1 we have that if $(Y_1, Y_2)\in Z^N(t_*, \xi_*)$, then there exist $u\in\mathcal U_N$ and $v\in\mathcal V_N$ such that
$$Y_i = \sigma_i(\xi^N(T, t_*, \xi_*, u, v)),\quad i = 1, 2. \qquad (44)$$
We have the estimate
$$\|\xi_* - \xi^N(T, t_*, \xi_*, u, v)\|\le K_\nu(T - t_*). \qquad (45)$$

Let $(J_1, J_2)\in\bar S(T, x)$. This means that there exists a sequence $\{(t_j, x_j, Y_{1,j}, Y_{2,j})\}_{j=1}^\infty$ such that $(Y_{1,j}, Y_{2,j})\in\widetilde S_j(t_j, x_j) = S^{N_j}(t_j, x_j)$, and $t_j\to T$, $x_j\to x$, $Y_{i,j}\to J_i$ as $j\to\infty$. Let $\theta_j\in\Delta^{N_j}$ be such that $(Y_{1,j}, Y_{2,j})\in Z^{N_j}(\theta_j, x_j)$ and $t_j\in(\theta_j - \delta_{N_j}, \theta_j]$. Combining this, (44), and (45) we conclude that for any $j$ there exists $x_j'\in B(\nu)$ such that $\|x_j - x_j'\|\le K_\nu(T - t_j)$ and $Y_{i,j} = \sigma_i(x_j')$, $i = 1, 2$. We have that $x_j'\to x$ as $j\to\infty$. By the continuity of the functions $\sigma_i$ we obtain that
$$J_i = \lim_{j\to\infty}Y_{i,j} = \lim_{j\to\infty}\sigma_i(x_j') = \sigma_i(x).$$

Now we shall prove the fulfillment of condition (S2). Let $(t_*, x_*)\in[0,T]\times\mathbb R^n$, $(J_1, J_2)\in\bar S(t_*, x_*)$, $u\in P$, $t_+\in[t_*, T]$. We shall show that there exists $y(\cdot)\in\mathrm{Sol}(t_*, x_*, u)$ such that $J_1'\le J_1$ for some $(J_1', J_2')\in\bar S(t_+, y(t_+))$.

There exists a sequence $\{(t_j, x_j, Y_{1,j}, Y_{2,j})\}_{j=1}^\infty$ such that $(Y_{1,j}, Y_{2,j})\in\widetilde S_j(t_j, x_j) = S^{N_j}(t_j, x_j)$, and $t_j\to t_*$, $x_j\to x_*$, $Y_{i,j}\to J_i$ as $j\to\infty$. Let $\theta_j$ be an element of $\Delta^{N_j}$ such that $(Y_{1,j}, Y_{2,j})\in Z^{N_j}(\theta_j, x_j)$ and $t_j\in(\theta_j - \delta_{N_j}, \theta_j]$. Further, let $\tau_j$ be the least element of $\Delta^{N_j}$ such that $t_+\le\tau_j$.

By condition 2 of Theorem 5.1 for each $j$ there exist a control $v_j\in\mathcal V_{N_j}$ and a pair $(Y_{1,j}', Y_{2,j}')$ such that $(Y_{1,j}', Y_{2,j}')\in Z^{N_j}(\tau_j, \xi^{N_j}(\tau_j, \theta_j, x_j, u, v_j))\subset\widetilde S_j(\tau_j, \xi^{N_j}(\tau_j, \theta_j, x_j, u, v_j))$ and $Y_{1,j}'\le Y_{1,j}$. By Lemma 5.1 we have that
$$\|x(\tau_j, \theta_j, x_j, u, v_j) - \xi^{N_j}(\tau_j, \theta_j, x_j, u, v_j)\|\le\varphi'(\delta_{N_j})\exp(L'T).$$
We may extract a subsequence $\{j_l\}_{l=1}^\infty$ such that $\{x(\cdot, \theta_{j_l}, x_{j_l}, u, v_{j_l})\}_{l=1}^\infty$ converges to some motion $y(\cdot)$, and $\{(Y_{1,j_l}', Y_{2,j_l}')\}$ converges to some pair $(J_1', J_2')$. We have that $y(\cdot)\in\mathrm{Sol}(t_*, x_*, u)$. Lemma 5.2 gives the inclusion $(J_1', J_2')\in\bar S(t_+, y(t_+))$. We also have $J_1'\le J_1$. This completes the proof of condition (S2).

Conditions (S3) and (S4) are proved analogously.
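When the control sets are finite, the backward induction of Theorem 5.1 can be prototyped directly. The following Python sketch is a toy illustration (the scalar dynamics, terminal payoffs $\sigma_i$, grid size, and control sets are all assumptions, not data from the paper): it builds the value sets $Z^N(t_k, \cdot)$ on the tree of states reachable from one initial point, using the quantities $\varsigma_i^{k+1}$, $\varrho_k^i$ of (35)–(36) and the cut (39), and one can check that the resulting sets are nonempty, as the theorem asserts.

```python
from functools import lru_cache

# Toy data (illustrative assumptions): scalar state, finite control sets.
P = (-1.0, 1.0)
Q = (-1.0, 1.0)
N = 4
T = 1.0
DELTA = T / N

def step(z, u, v):
    # one step of the discrete system (32) with toy f = u, g = -0.5*v*z
    return z + DELTA * (u - 0.5 * v * z)

def sigma(z):
    # terminal payoff pair (sigma_1(z), sigma_2(z)), toy choice
    return (-(z - 1.0) ** 2, -(z + 1.0) ** 2)

@lru_cache(maxsize=None)
def Z(k, z):
    """Value sets Z^N(t_k, z) as a frozenset of payoff pairs, built by the
    backward induction (35)-(36) and the cut (39)."""
    if k == N:
        return frozenset({sigma(z)})
    # successor value sets Z(t_{k+1}, xi(t_{k+1}, t_k, z, u, v))
    nxt = {(u, v): Z(k + 1, round(step(z, u, v), 12)) for u in P for v in Q}

    def varsigma(i, s):  # componentwise minimum over a value set
        return min(Y[i] for Y in s)

    rho1 = max(min(varsigma(0, nxt[(u, v)]) for v in Q) for u in P)  # (35)
    rho2 = max(min(varsigma(1, nxt[(u, v)]) for u in P) for v in Q)  # (36)
    W = set().union(*nxt.values())
    return frozenset((Y1, Y2) for (Y1, Y2) in W if Y1 >= rho1 and Y2 >= rho2)
```

Calling `Z(0, 0.0)` returns a nonempty set of payoff pairs; the nonemptiness follows exactly as in the proof, since any pair reached via the maximizers of (35) and (36) survives the cut (39).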
In this paper Nash equilibria for differential games in the class of control with guide strategies are constructed on the basis of an upper semicontinuous multivalued function satisfying a boundary condition and certain viability conditions. The main result is that for any compact set of initial positions and any selector of the multivalued map it is possible to construct a Nash equilibrium such that the corresponding players' payoff is equal to the value of the given selector. The existence of a multivalued function satisfying the proposed conditions is also proved. If the upper semicontinuous multivalued function is replaced with a continuous function, then the construction of the strategies is simplified. However, in the general case the desired continuous function does not exist.

Only two-player nonzero-sum differential games with terminal payoffs and compact control spaces were considered. The results can be extended to games with payoffs equal to the sum of terminal and running parts by introducing new variables describing the running payoffs. Note that if the running payoff of each player does not depend on the control of the other one, then the players need only the information about the state variable to construct the Nash equilibrium control with guide strategies. The condition of compactness of the control spaces is essential, and the methods developed in the paper cannot be used for games with unbounded control spaces. (Such games were studied by Bressan and Shen in [6], [7] on the basis of BV solutions of PDEs.)

Future work includes the extension of the obtained results to games with many players and the stability analysis of the proposed conditions.