Imitation Dynamics with Payoff Shocks
PANAYOTIS MERTIKOPOULOS AND YANNICK VIOSSAT
Abstract.
We investigate the impact of payoff shocks on the evolution of large populations of myopic players that employ simple strategy revision protocols such as the "imitation of success". In the noiseless case, this process is governed by the standard (deterministic) replicator dynamics; in the presence of noise however, the induced stochastic dynamics are different from previous versions of the stochastic replicator dynamics (such as the aggregate-shocks model of Fudenberg and Harris, 1992). In this context, we show that strict equilibria are always stochastically asymptotically stable, irrespective of the magnitude of the shocks; on the other hand, in the high-noise regime, non-equilibrium states may also become stochastically asymptotically stable and dominated strategies may survive in perpetuity (they become extinct if the noise is low). Such behavior is eliminated if players are less myopic and revise their strategies based on their cumulative payoffs. In this case, we obtain a second-order stochastic dynamical system whose attracting states coincide with the game's strict equilibria and where dominated strategies become extinct (a.s.), no matter the noise level.

1. Introduction
Evolutionary game dynamics study the evolution of behavior in populations of boundedly rational agents that interact strategically. The most widely studied dynamical model in this context is the replicator dynamics: introduced in biology as a model of natural selection (Taylor and Jonker, 1978), the replicator dynamics also arise from models of imitation of successful individuals (Björnerstedt and Weibull, 1996; Schlag, 1998; Weibull, 1995) and from models of learning in games (Hofbauer et al., 2009; Mertikopoulos and Moustakas, 2010; Rustichini, 1999). Mathematically, they stipulate that the growth rate of the frequency of a strategy is proportional to the difference between the payoff of individuals playing this strategy and the mean payoff in the population. These payoffs are usually assumed deterministic: this is typically motivated by a large population assumption and the premise that, owing to the law of large numbers, the resulting mean field provides a good approximation of a more realistic but less tractable stochastic model. This approach makes sense when the stochasticity affecting payoffs is independent across individuals playing the same strategies, but it fails when the payoff shocks are aggregate, that is, when they affect all individuals playing a given strategy in a similar way. Such aggregate shocks are not uncommon. Bergstrom (2014) recounts the story of squirrels stocking nuts for the winter months: squirrels may stock a few or a lot of nuts, the latter leading to a higher probability of surviving a long winter but a higher exposure to predation. The unpredictable mildness or harshness of
the ensuing winter will then favor one of these strategies in an aggregate way (see also Robson and Samuelson, 2011, Sec. 3.1.1, and references therein). In traffic engineering, one might think of a choice of itinerary to go to work: fluctuations of traffic on some roads affect all those who chose them in a similar way. Likewise, in data networks, a major challenge occurs when trying to minimize network latencies in the presence of stochastic disturbances: in this setting, the travel time of a packet in the network does not depend only on the load of each link it traverses, but also on unpredictable factors such as random packet drops and retransmissions, fluctuations in link quality, excessive backlog queues, etc. (Bertsekas and Gallager, 1992).

Incorporating such aggregate payoff shocks in the biological derivation of the replicator dynamics leads to the stochastic replicator dynamics of Fudenberg and Harris (1992), later studied by (among others) Cabrales (2000), Imhof (2005) and Hofbauer and Imhof (2009). To study the long-run behavior of these dynamics, Imhof (2005) introduced a modified game where the expected payoff of a strategy is penalized by a term which increases with the variance of the noise affecting this strategy's payoff (see also Hofbauer and Imhof, 2009). Among other results, it was then shown that a) strategies that are iteratively (strictly) dominated in this modified game become extinct almost surely; and b) strict equilibria of the modified game are stochastically asymptotically stable.

In this biological model, noise is detrimental to the long-term survival of strategies: a strategy which is strictly dominant on average (i.e. in the original, unmodified game) but which is affected by shocks of substantially higher intensity becomes extinct almost surely.

Footnote: Supported in part by the French National Research Agency under grant no. GAGA–13–JS01–0004–01 and the French National Center for Scientific Research (CNRS) under grant no. PEPS–GATHERING–2014.
By contrast, in the learning derivation of the replicator dynamics, noise leads to a stochastic exponential learning model where only iteratively undominated strategies survive, irrespective of the intensity of the noise (Mertikopoulos and Moustakas, 2010); as a result, the frequency of a strictly dominant strategy converges to 1 almost surely. Moreover, strict Nash equilibria of the original game remain stochastically asymptotically stable (again, independently of the level of the noise), so the impact of the noise in the stochastic replicator dynamics of exponential learning is minimal when compared to the stochastic replicator dynamics with aggregate shocks.

In this paper, we study the effect of payoff shocks when the replicator equation is seen as a model of imitation of successful agents. As in the case of Imhof (2005) and Hofbauer and Imhof (2009), it is convenient to introduce a noise-adjusted game which reduces to the original game in the noiseless, deterministic regime. We show that: a) strategies that are iteratively strictly dominated in the modified game become extinct almost surely; and b) strict equilibria of the modified game are stochastically asymptotically stable. However, despite the formal similarity, our results are qualitatively different from those of Imhof (2005) and Hofbauer and Imhof (2009): in the modified game induced by imitation of success in the presence of noise, noise is not detrimental per se. In fact, in the absence of differences in expected payoffs, a strategy survives with a probability that does not depend on the variance of its payoffs: a strategy's survival probability is simply its initial frequency. Similarly, even if a strategy which is strictly dominant in expectation is subject to arbitrarily high noise, it will always survive with positive probability; by contrast, such strategies become extinct (a.s.) in the aggregate shocks model of Fudenberg and Harris (1992).
That said, the dynamics' long-term properties change dramatically if players are less "myopic" and, instead of imitating strategies based on their instantaneous payoffs, they base their decisions on the cumulative payoffs of their strategies over time. In this case, we obtain a second-order stochastic replicator equation which can be seen as a noisy version of the higher order game dynamics of Laraki and Mertikopoulos (2013). Thanks to this payoff aggregation mechanism, the noise averages out in the long run and we recover results that are similar to those of Mertikopoulos and Moustakas (2010): strategies that are dominated in the original game become extinct (a.s.) and strict Nash equilibria attract nearby initial conditions with arbitrarily high probability.

1.1. Paper Outline. The remainder of our paper is structured as follows: in Section 2, we present our model and we derive the stochastic replicator dynamics induced by imitation of success in the presence of noise. Our long-term rationality analysis begins in Section 3, where we introduce the noise-adjusted game discussed above and we state our elimination and stability results in terms of this modified game. In Section 4, we consider the case where players imitate strategies based on their cumulative payoffs and we show that the adjustment due to noise is no longer relevant. Finally, in Section 5, we discuss some variants of our core model related to different noise processes.

1.2. Notational conventions.
The real space spanned by a finite set S = {s_α}_{α=1}^{d+1} will be denoted by R^S and we will write {e_s}_{s∈S} for its canonical basis; in a slight abuse of notation, we will also use α to refer interchangeably to either s_α or e_α, and we will write δ_αβ for the Kronecker delta symbols on S. The set Δ(S) of probability measures on S will be identified with the d-dimensional simplex Δ = {x ∈ R^S : ∑_α x_α = 1 and x_α ≥ 0} of R^S and the relative interior of Δ will be denoted by Δ°; also, the support of p ∈ Δ(S) will be written supp(p) = {α ∈ S : p_α > 0}. For simplicity, if {S_k}_{k∈N} is a finite family of finite sets, we use the shorthand (α_k; α_{−k}) for the tuple (…, α_{k−1}, α_k, α_{k+1}, …) and we write ∑_α^k instead of ∑_{α∈S_k}. Unless mentioned otherwise, deterministic processes will be represented by lowercase letters, while their stochastic counterparts will be denoted by the corresponding uppercase letter. Finally, we will suppress the dependence of the law of a process X(t) on its initial condition X(0) = x, and we will write P instead of P_x.

2. The model
In this section, we recall a few preliminaries from the theory of population games and evolutionary dynamics, and we introduce the stochastic game dynamics under study.

2.1. Population games.
Our main focus will be games played by populations of nonatomic players. Formally, such games consist of a finite set of player populations N = {1, …, N} (assumed for simplicity to have unit mass), each with a finite set of pure strategies (or types) A_k = {α_{k,1}, α_{k,2}, …}, k ∈ N. During play, each player chooses a strategy and the state of each population is given by the distribution x_k = (x_{kα})_{α∈A_k} of players employing each strategy α ∈ A_k. Accordingly, the state space of the k-th population is the simplex X_k ≡ Δ(A_k) and the state space of the game is the product X ≡ ∏_k X_k.
The payoff to a player of population k ∈ N playing α ∈ A_k is determined by the corresponding payoff function v_{kα} : X → R (assumed Lipschitz). Thus, given a population state x ∈ X, the average payoff to population k will be

∑_α^k x_{kα} v_{kα}(x) = ⟨v_k(x) | x⟩,    (2.1)

where v_k(x) ≡ (v_{kα}(x))_{α∈A_k} denotes the payoff vector of the k-th population in the state x ∈ X. Putting all this together, a population game is then defined as a tuple G ≡ G(N, A, v) of nonatomic player populations k ∈ N, their pure strategies α ∈ A_k and the associated payoff functions v_{kα} : X → R.

In this context, we say that a pure strategy α ∈ A_k is dominated by β ∈ A_k if

v_{kα}(x) < v_{kβ}(x) for all x ∈ X,    (2.2)

i.e. the payoff of an α-strategist is always inferior to that of a β-strategist. More generally (and in a slight abuse of terminology), we will say that p_k ∈ X_k is dominated by p′_k ∈ X_k if

⟨v_k(x) | p_k⟩ < ⟨v_k(x) | p′_k⟩ for all x ∈ X,    (2.3)

i.e. when the average payoff of a small influx of mutants in population k is always greater when they are distributed according to p′_k rather than p_k (irrespective of the incumbent population state x ∈ X).

Finally, we will say that the population state x* ∈ X is at Nash equilibrium if

v_{kα}(x*) ≥ v_{kβ}(x*) for all α ∈ supp(x*_k) and for all β ∈ A_k, k ∈ N.    (NE)

In particular, if x* is pure (in the sense that supp(x*) is a singleton) and (NE) holds as a strict inequality for all β ∉ supp(x*_k), x* will be called a strict equilibrium.

Remark. Throughout this paper, we will be suppressing the population index k ∈ N for simplicity, essentially focusing on the single-population case. This is done only for notational clarity: all our results apply as stated to the multi-population model described in detail above.

2.2. Revision protocols.
A fundamental evolutionary model in the context of population games is provided by the notion of a revision protocol. Following Sandholm (2010, Chapter 3), it is assumed that each nonatomic player receives an opportunity to switch strategies at every ring of an independent Poisson alarm clock, and this decision is based on the payoffs associated to each strategy and the current population state. The players' revision protocol is thus defined in terms of the conditional switch rates ρ_αβ ≡ ρ_αβ(v, x) that determine the relative mass dx_αβ of players switching from α to β over an infinitesimal time interval dt:

dx_αβ = x_α ρ_αβ dt.    (2.4)

The population shares x_α are then governed by the revision protocol dynamics:

ẋ_α = ∑_β x_β ρ_βα − x_α ∑_β ρ_αβ,    (2.5)

with ρ_αα defined arbitrarily.

Footnote: Note that we are considering general payoff functions and not only multilinear (resp. linear) payoffs arising from asymmetric (resp. symmetric) random matching in finite N-person (resp. 2-person) games. This distinction is important as it allows our model to cover e.g. general traffic games as in Sandholm (2010).

Footnote: In other words, ρ_αβ is the probability of an α-strategist becoming a β-strategist up to normalization by the alarm clocks' rate.
In what follows, we will focus on revision protocols of the general "imitative" form

ρ_αβ(v, x) = x_β r_αβ(v, x),    (2.6)

corresponding to the case where a player imitates the strategy of a uniformly drawn opponent with probability proportional to the so-called conditional imitation rate r_αβ (assumed Lipschitz). In particular, one of the most widely studied revision protocols of this type is the "imitation of success" protocol (Weibull, 1995) where the imitation rate of a given target strategy is proportional to its payoff, i.e.

r_αβ(v, x) = v_β.    (2.7)

On account of (2.5), the mean evolutionary dynamics induced by (2.7) take the form:

ẋ_α = x_α [v_α(x) − ∑_β x_β v_β(x)],    (RD)

which is simply the classical replicator equation of Taylor and Jonker (1978).

The replicator dynamics have attracted significant interest in the literature and their long-run behavior is relatively well understood. For instance, Akin (1980), Nachbar (1990) and Samuelson and Zhang (1992) showed that dominated strategies become extinct under (RD), whereas the (multi-population) "folk theorem" of evolutionary game theory (Hofbauer and Sigmund, 2003) states that a) (Lyapunov) stable states are Nash; b) limits of interior trajectories are Nash; and c) strict Nash equilibria are asymptotically stable under (RD).
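As a quick sanity check, the passage from the revision protocol (2.5) with the imitation rate (2.7) to the replicator field (RD) can be verified numerically, and (RD) itself can be integrated with a simple Euler scheme. The sketch below (Python; the three-strategy payoff vector, step size, and horizon are our own illustrative choices, not taken from the text) does both: the protocol drift coincides with the replicator drift, and the strictly dominated strategy's share is driven to zero, as the elimination results quoted above predict.

```python
import numpy as np

def protocol_drift(x, v):
    """Mean dynamics (2.5) with rho_ab = x_b * v_b (imitation of success):
    xdot_a = sum_b x_b rho_ba - x_a sum_b rho_ab."""
    n = len(x)
    rho = lambda a, b: x[b] * v[b]
    return np.array([sum(x[b] * rho(b, a) for b in range(n))
                     - x[a] * sum(rho(a, b) for b in range(n))
                     for a in range(n)])

def replicator_drift(x, v):
    """Right-hand side of (RD): xdot_a = x_a (v_a - <v | x>)."""
    return x * (v - x @ v)

v = np.array([1.0, 2.0, 1.5])   # strategy 0 is strictly dominated by strategy 1
x = np.array([0.6, 0.2, 0.2])
# (2.5) + (2.6)/(2.7) reduce to (RD), since the shares sum to one:
assert np.allclose(protocol_drift(x, v), replicator_drift(x, v))

for _ in range(4000):           # Euler integration of (RD) up to T = 40
    x = x + 0.01 * replicator_drift(x, v)
print(x)  # the dominated strategy's share is driven to (almost) zero
```

The design choice here is deliberate: the protocol form makes the imitation story explicit (switch rates proportional to the target's share and payoff), while the replicator form is what one actually integrates.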
2.3. Payoff shocks and the induced dynamics.

Our main goal in this paper is to investigate the rationality properties of the replicator dynamics in a setting where the players' payoffs are subject to exogenous stochastic disturbances. To model these "payoff shocks", we assume that the players' payoffs at time t are of the form v̂_α(t) = v_α(x(t)) + ξ_α(t) for some zero-mean "white noise" process ξ_α. Then, in Langevin notation, the replicator dynamics (RD) become:

dX_α/dt = X_α [v̂_α − ∑_β X_β v̂_β] = X_α [v_α(X) − ∑_β X_β v_β(X)] + X_α [ξ_α − ∑_β X_β ξ_β],    (2.8)

or, in stochastic differential equation (SDE) form:

dX_α = X_α [v_α(X) − ∑_β X_β v_β(X)] dt + X_α [σ_α(X) dW_α − ∑_β X_β σ_β(X) dW_β],    (SRD)

where the diffusion coefficients σ_α : X → R (assumed Lipschitz) measure the intensity of the payoff shocks and the Wiener processes W_α are assumed independent. The stochastic dynamics (SRD) will constitute the main focus of this paper, so some remarks are in order:

Remark. With v and σ assumed Lipschitz, it follows that (SRD) admits a unique (strong) solution X(t) for every initial condition X(0) ∈ X. Moreover, since the drift and diffusion terms of (SRD) all vanish at the boundary bd(X) of X, standard arguments can be used to show that these solutions exist (a.s.) for all time, and that X(t) ∈ X° for all t ≥ 0 if X(0) ∈ X° (Khasminskii, 2012; Øksendal, 2007).

Remark. The independence assumption for the Wiener processes W_α can be relaxed without qualitatively affecting our analysis; in particular, as we shall see in the proofs of our results, the rationality properties of (SRD) can be formulated directly in terms of the quadratic (co)variation of the noise processes W_α. Doing so however would complicate the relevant expressions considerably, so, for clarity, we will retain this independence assumption throughout our paper.

Footnote: Modulo an additive constant which ensures that ρ is positive but which cancels out when it comes to the dynamics.
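The SDE (SRD) can also be simulated directly with a standard Euler–Maruyama scheme. The sketch below (Python; the two-strategy payoffs, noise intensities, step size, and seed are illustrative choices of ours, not from the text) implements the replicator drift plus the coupled diffusion term X_α[σ_α dW_α − ∑_β X_β σ_β dW_β]; with this payoff gap and these mild shocks, the dominated strategy is driven out along typical sample paths, in line with the elimination results of Section 3.

```python
import numpy as np

def srd_step(x, payoffs, sig, dt, rng):
    """One Euler-Maruyama step of (SRD) with independent Wiener increments dW_a."""
    dW = rng.normal(0.0, np.sqrt(dt), size=len(x))
    drift = x * (payoffs - x @ payoffs)                  # replicator drift
    noise = x * (sig * dW - x @ (sig * dW))              # coupled diffusion term
    x_new = np.clip(x + drift * dt + noise, 0.0, None)   # guard the simplex against
    return x_new / x_new.sum()                           # discretization error

rng = np.random.default_rng(1)
v = np.array([1.0, 2.0])        # strategy 1 strictly dominates strategy 0
sigma = np.array([0.3, 0.3])    # mild, state-independent payoff shocks
x = np.array([0.5, 0.5])
for _ in range(5000):           # horizon T = 50
    x = srd_step(x, v, sigma, dt=0.01, rng=rng)
print(x)  # the dominated strategy's share is driven toward 0
```

The clip-and-renormalize step is purely numerical: the exact solution stays in the simplex, but a discretized step can overshoot its boundary.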
Remark. The deterministic replicator dynamics (RD) are also the governing dynamics for the "pairwise proportional imitation" revision protocol (Schlag, 1998) where a revising agent imitates the strategy of a randomly chosen opponent only if the opponent's payoff is higher than his own, and he does so with probability proportional to the payoff difference. Formally, the conditional switch rate ρ_αβ under this revision protocol is:

ρ_αβ = x_β [v_β − v_α]_+,    (2.9)

where [x]_+ = max{x, 0} denotes the positive part of x. Accordingly, if the game's payoffs at time t are of the perturbed form v̂_α(t) = v_α(x(t)) + ξ_α(t) as before, (2.5) leads to the master stochastic equation:

Ẋ_α = ∑_β X_β X_α [v̂_α − v̂_β]_+ − X_α ∑_β X_β [v̂_β − v̂_α]_+
    = X_α ∑_β X_β {[v̂_α − v̂_β]_+ − [v̂_β − v̂_α]_+}
    = X_α ∑_β X_β (v̂_α − v̂_β)
    = X_α [v̂_α − ∑_β X_β v̂_β],    (2.10)

which is simply the stochastic replicator dynamics (2.8). In other words, (SRD) could also be interpreted as the mean dynamics of a pairwise imitation process with perturbed payoff comparisons as above.

2.4. Related stochastic models.
The replicator dynamics were first introduced in biology, as a model of frequency-dependent selection. They arise from the geometric population growth equation:

ż_α = z_α v_α,    (2.11)

where z_α denotes the absolute population size of the α-th genotype of a given species.

Footnote: The replicator equation (RD) is obtained simply by computing the evolution of the frequencies x_α = z_α / ∑_β z_β under (2.11).

Footnote: An important special case where it makes sense to consider correlated shocks is if the payoff functions v_α(x) are derived from random matchings in a finite game whose payoff matrix is subject to stochastic perturbations. This specific disturbance model is discussed in Section 5.

This biological model was also the starting point of Fudenberg and Harris (1992) who added aggregate payoff shocks to (2.11) based on the geometric Brownian model:

dZ_α = Z_α [v_α dt + σ_α dW_α],    (2.12)

where the diffusion process σ_α dW_α represents the impact of random, weather-like effects on the genotype's fitness (see also Cabrales, 2000; Hofbauer and Imhof, 2009;
Imhof, 2005). Itô's lemma applied to the population shares X_α = Z_α / ∑_β Z_β then yields the replicator dynamics with aggregate shocks:

dX_α = X_α [v_α(X) − ∑_β X_β v_β(X)] dt + X_α [σ_α dW_α − ∑_β σ_β X_β dW_β] − X_α [σ_α² X_α − ∑_β σ_β² X_β²] dt.    (2.13)

In a repeated game context, the replicator dynamics also arise from a continuous-time variant of the exponential weight algorithm introduced by Vovk (1990) and Littlestone and Warmuth (1994) (see also Sorin, 2009). In particular, if players follow the exponential learning scheme:

dy_α = v_α(x) dt,
x_α = exp(y_α) / ∑_β exp(y_β),    (2.14)

that is, if they play a logit best response to the vector of their cumulative payoffs, then the frequencies x_α follow (RD). Building on this, Mertikopoulos and Moustakas (2009, 2010) considered the stochastically perturbed exponential learning scheme:

dY_α = v_α(X) dt + σ_α(X) dW_α,
X_α = exp(Y_α) / ∑_β exp(Y_β),    (2.15)

where the cumulative payoffs are perturbed by the observation noise process σ_α dW_α. By Itô's lemma, we then obtain the stochastic replicator dynamics of exponential learning:

dX_α = X_α [v_α(X) − ∑_β X_β v_β(X)] dt + X_α [σ_α dW_α − ∑_β σ_β X_β dW_β] + ½ X_α [σ_α² (1 − 2X_α) − ∑_β σ_β² X_β (1 − 2X_β)] dt.    (2.16)

Besides their very distinct origins, a key difference between the stochastic replicator dynamics (SRD) and the stochastic models (2.13)/(2.16) is that there is no Itô correction term in the former. The reason for this is that in (2.13) and (2.16), the noise affects primarily the evolution of an intermediary variable (the absolute population sizes Z_α and the players' cumulative payoffs Y_α respectively) before being carried over to the evolution of the strategy shares X_α. By contrast, the payoff shocks that impact the players' revision protocol in (SRD) affect the corresponding strategy shares directly, so there is no intervening Itô correction.
Footnote: Khasminskii and Potsepun (2006) also considered a related evolutionary model with Stratonovich-type perturbations while, more recently, Vlasic (2012) studied the effect of discontinuous semimartingale shocks incurred by catastrophic, earthquake-like events.

Footnote: The intermediate variable y_α should be thought of as an evaluation of how good the strategy α is, and the formula for x_α as a way of transforming these evaluations into a strategy.
The pure noise case. To better understand the differences between our model and previous models of stochastic replicator dynamics, it is useful to consider the case of pure noise, that is, when the expected payoff of each strategy is equal to one and the same constant C: v_α(x) = C for all α ∈ A and for all x ∈ X. For simplicity, let us also assume that σ_α(x) is independent of the state of the population x. Eq. (2.12) then becomes a simple geometric Brownian motion of the form:

dZ_α = Z_α [C dt + σ_α dW_α],    (2.17)

which readily yields Z_α(t) = Z_α(0) exp((C − σ_α²/2) t + σ_α W_α(t)). The corresponding frequency X_α = Z_α / ∑_β Z_β will then be:

X_α(t) = X_α(0) exp(−½σ_α² t + σ_α W_α(t)) / ∑_β X_β(0) exp(−½σ_β² t + σ_β W_β(t)).    (2.18)

If σ_α ≠ 0, the law of large numbers yields −½σ_α² t + σ_α W_α(t) ∼ −½σ_α² t (a.s.). Therefore, letting σ_min = min_{α∈A} σ_α, it follows from (2.18) that strategy α ∈ A is eliminated if σ_α > σ_min and survives if σ_α = σ_min (a.s.). In particular, if all intensities are equal (σ_α = σ_min for all α ∈ A), then all strategies survive and the share of each strategy oscillates for ever, occasionally taking values arbitrarily close to 0 and arbitrarily close to 1. On the other hand, under the stochastic replicator dynamics of exponential learning for the pure noise case, (2.16) readily yields:

X_α(t) = X_α(0) exp(σ_α W_α(t)) / ∑_β X_β(0) exp(σ_β W_β(t)).    (2.19)

Therefore, for any value of the diffusion coefficients σ_α (and, in particular, even if some strategies are affected by noise much more than others), all pure strategies survive.

Our model behaves differently from both (2.13) and (2.16): in the pure noise case, for any value of the noise coefficients σ_α (as long as σ_α > 0 for all α), only a single strategy survives (a.s.), and strategy α survives with probability equal to X_α(0).
To see this, consider first the model with pure noise and only two strategies, α and β. Then, letting X(t) = X_α(t) (so X_β(t) = 1 − X(t)), we get:

dX(t) = X(t)(1 − X(t)) [σ_α dW_α − σ_β dW_β] = X(t)(1 − X(t)) σ dW(t),    (2.20)

where σ² = σ_α² + σ_β² and we have used the time-change theorem for martingales to write σ dW = σ_α dW_α − σ_β dW_β for some Wiener process W(t). This diffusion process can be seen as a continuous-time random walk on [0, 1] with step sizes that get smaller as X approaches {0, 1}. Thus, at a heuristic level, when X(t) starts close to X = 1 and takes one step to the left followed by one step to the right (or the opposite), the walk does not return to its initial position, but will approach 1 (of course, the same phenomenon occurs near 0). This suggests that the process should eventually converge to one of the vertices: indeed, letting f(x) = log x(1 − x), Itô's lemma yields

df(X) = (1 − 2X) σ dW − ½ [(1 − X)² + X²] σ² dt ≤ (1 − 2X) σ dW − ¼ σ² dt,    (2.21)

so, by Lemma A.1, we get lim_{t→∞} f(X(t)) = −∞ (a.s.), that is, lim_{t→∞} X(t) ∈ {0, 1}.

Footnote: Elimination is obvious; for survival, simply add ½σ_min² t to the exponents of (2.18) and recall that any Wiener process has lim sup_t W(t) > 0 and lim inf_t W(t) < 0 (a.s.).
More generally, consider the model with pure noise and n strategies. Then, computing d[log X_α(1 − X_α)] as above, we readily obtain lim_{t→∞} X_α(t) ∈ {0, 1} (a.s.), for every strategy α ∈ A with σ_α > 0. Since X_α is a martingale, we will have E[X_α(t)] = X_α(0) for all t ≥ 0, so X_α → 1 with probability X_α(0) and X_α(t) → 0 with probability 1 − X_α(0).

The above highlights two important differences between our model and the stochastic replicator dynamics of Fudenberg and Harris (1992). First, in our model, noise is not detrimental in itself: in the pure noise case, the expected frequency of a strategy remains constant, irrespective of the noise level; by contrast, in the model of Fudenberg and Harris (1992), the expected frequency of strategies affected by strong payoff noise decreases. Second, our model behaves in a somewhat more "unpredictable" way: for instance, in the model of Fudenberg and Harris (1992), when there are only two strategies with the same expected payoff, and if one of the strategies is affected by a stronger payoff noise, then it will be eliminated (a.s.); in our model, we cannot say in advance whether it will be eliminated or not.
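These contrasting pure-noise predictions are easy to check numerically. The sketch below (Python; the noise intensities, horizon, seed, and initial shares are our own illustrative choices) evaluates (i) the aggregate-shocks frequencies (2.18), where the noisier strategy dies out, (ii) the exponential-learning frequencies (2.19), where both strategies keep fluctuating, and (iii) a Monte Carlo batch of the two-strategy diffusion (2.20), where each run is absorbed near 0 or 1 and the fraction of runs ending near 1 is close to the initial share, as the martingale argument above predicts.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 0.01, 20_000                # horizon T = 200
t = dt * np.arange(1, n_steps + 1)
sigma = np.array([1.0, 0.2])              # strategy 0 is much noisier

# Two independent Wiener paths, shared by the closed-form models below.
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=(n_steps, 2)), axis=0)

def shares(logits):
    z = np.exp(logits - logits.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

# (2.18): aggregate shocks -- the -sigma^2 t / 2 drift kills the noisy strategy.
X_fh = shares(-0.5 * sigma**2 * t[:, None] + sigma * W)
# (2.19): exponential learning -- no drift, so both strategies keep oscillating.
X_el = shares(sigma * W)

# (2.20): pure-noise imitation dynamics dX = X (1 - X) s dW, many runs at once.
s, x0, runs = np.sqrt(np.sum(sigma**2)), 0.3, 2000
X = np.full(runs, x0)
for _ in range(n_steps):
    X += X * (1.0 - X) * s * rng.normal(0.0, np.sqrt(dt), size=runs)
    np.clip(X, 0.0, 1.0, out=X)           # numerical guard on [0, 1]

print(X_fh[-1])          # noisy strategy eliminated under aggregate shocks
print(np.mean(X > 0.5))  # close to x0: survival probability = initial share
```

Since X is (numerically) a bounded martingale, the absorption frequencies in the last line estimate P(X → 1) ≈ X(0), up to Monte Carlo and discretization error.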
3. Long-term rationality analysis
In this section, we investigate the long-run rationality properties of the stochastic dynamics (SRD); in particular, we focus on the elimination of dominated strategies and the stability of equilibrium play.

3.1. Elimination of dominated strategies.
We begin with the elimination of dominated strategies. Formally, given a trajectory of play x(t) ∈ X, we say that a pure strategy α ∈ A becomes extinct along x(t) if x_α(t) → 0 as t → ∞. More generally, following Samuelson and Zhang (1992), we will say that the mixed strategy p ∈ X becomes extinct along x(t) if min{x_α(t) : α ∈ supp(p)} → 0 as t → ∞; otherwise, we say that p survives.

Now, with a fair degree of hindsight, it will be convenient to introduce a modified game G_σ ≡ G_σ(N, A, v^σ) with payoff functions v^σ_α adjusted for noise as follows:

v^σ_α(x) = v_α(x) − ½ (1 − 2x_α) σ_α²(x).    (3.1)

Imhof (2005) introduced a similar modified game to study the long-term convergence and stability properties of the stochastic replicator dynamics with aggregate shocks (2.13) and showed that strategies that are dominated in this modified game are eliminated (a.s.) – cf. Remark 3.7 below. Our main result concerning the elimination of dominated strategies under (SRD) is of a similar nature:

Footnote: We are implicitly assuming here deterministic initial conditions, i.e. X(0) = x (a.s.) for some x ∈ X.

Footnote: If several strategies are unaffected by noise, that is, are such that σ_α = 0, then their relative shares remain constant (that is, if α and β are two such strategies, then X_α(t)/X_β(t) = X_α(0)/X_β(0) for all t ≥ 0). It follows from this observation and the above result that, almost surely, all these strategies are eliminated or all these strategies survive (and only them).

Footnote: In the pure noise case of the model of Fudenberg and Harris (1992), what remains constant is the expected number of individuals playing a strategy. A crucial point here is that this number may grow to infinity. What happens to strategies affected by large aggregate shocks is that with small probability, the total number of individuals playing this strategy gets huge, but with a large probability (going to 1), it gets small (at least compared to the number of individuals playing other strategies). This can be seen as a gambler's ruin phenomenon, which explains that even with a higher expected payoff than others (hence a higher expected subpopulation size), the frequency of a strategy may go to zero almost surely (see e.g. Robson and Samuelson, 2011, Sec. 3.1.1). This cannot happen in our model since noise is added directly to the frequencies (which are bounded).
Theorem 3.1.
Let X(t) be an interior solution orbit of the stochastic replicator dynamics (SRD). Assume further that p ∈ X is dominated by p′ ∈ X in the modified game G_σ. Then, p becomes extinct along X(t) (a.s.).

Remark. As a special case, if the (pure) strategy α ∈ A is dominated by the (pure) strategy β ∈ A, Theorem 3.1 shows that α becomes extinct under (SRD) as long as

v_β(x) − v_α(x) > ½ [σ_α²(x) + σ_β²(x)] for all x ∈ X.    (3.2)

In terms of the original game, this condition can be interpreted as saying that α is dominated by β by a margin no less than ½ max_x (σ_α²(x) + σ_β²(x)). Put differently, Theorem 3.1 shows that dominated strategies in the original, unmodified game become extinct provided that the payoff shocks are mild enough.

Proof of Theorem 3.1.
Following Cabrales (2000), we will show that p becomes extinct along X(t) by studying the "cross-entropy" function:

V(x) = D_KL(p, x) − D_KL(p′, x) = ∑_α (p_α log p_α − p′_α log p′_α) + ∑_α (p′_α − p_α) log x_α,    (3.3)

where D_KL(p, x) = ∑_α p_α log(p_α/x_α) denotes the Kullback–Leibler (KL) divergence of x with respect to p. By a standard argument (Weibull, 1995), p becomes extinct along X(t) if lim_{t→∞} D_KL(p, X(t)) = ∞; thus, with D_KL(p′, x) ≥ 0, it suffices to show that lim_{t→∞} V(X(t)) = ∞.

To that end, let Y_α = log X_α so that

dY_α = dX_α / X_α − ½ (dX_α)² / X_α²,    (3.4)

by Itô's lemma. Then, writing dS_α = X_α [σ_α dW_α − ∑_β X_β σ_β dW_β] for the martingale term of (SRD), we readily obtain:

(dS_α)² = X_α² [σ_α dW_α − ∑_β X_β σ_β dW_β] · [σ_α dW_α − ∑_γ X_γ σ_γ dW_γ] = X_α² [(1 − X_α)² σ_α² + ∑_{β≠α} X_β² σ_β²] dt,    (3.5)

where we have used the orthogonality conditions dW_β · dW_γ = δ_βγ dt. By the same token, we also get (dX_α)² = (dS_α)², and hence:

dY_α = (v_α − ⟨v | X⟩) dt − ½ [(1 − X_α)² σ_α² + ∑_{β≠α} X_β² σ_β²] dt + σ_α dW_α − ∑_β X_β σ_β dW_β.    (3.6)

Therefore, after some easy algebra, we obtain:

dV = ∑_α (p′_α − p_α) dY_α
   = ⟨v(X) | p′ − p⟩ dt − ½ ∑_α (p′_α − p_α)(1 − 2X_α) σ_α²(X) dt + ∑_α (p′_α − p_α) σ_α(X) dW_α
   = ⟨v^σ(X) | p′ − p⟩ dt + ∑_α (p′_α − p_α) σ_α(X) dW_α,    (3.7)

where we have used the fact that ∑_α (p′_α − p_α) = 0.

Now, since p is dominated by p′ in G_σ, we will have ⟨v^σ(x) | p′ − p⟩ ≥ m for some positive constant m > 0 and for all x ∈ X. Eq. (3.7) then yields:

V(X(t)) ≥ V(X(0)) + mt + ξ(t),    (3.8)

where ξ denotes the martingale part of (3.7), viz.

ξ(t) = ∑_α (p′_α − p_α) ∫_0^t σ_α(X(s)) dW_α(s).    (3.9)

Since σ(X(t)) is bounded and continuous (a.s.), Lemma A.1 shows that mt + ξ(t) ∼ mt as t → ∞, so the RHS of (3.8) escapes to ∞ as t → ∞. This implies lim_{t→∞} V(X(t)) = ∞ and our proof is complete. ∎

Theorem 3.1 is our main result concerning the extinction of dominated strategies under (SRD), so a few remarks are in order:
Remark . Theorem 3.1 is analogous to the elimination results of Imhof (2005,Theorem 3.1) and Cabrales (2000, Prop. 1A) who show that dominated strategiesbecome extinct under the replicator dynamics with aggregate shocks (2.13) if theshocks satisfy certain “tameness” requirements. On the other hand, Theorem 3.1should be contrasted to the corresponding results of Mertikopoulos and Moustakas(2010) who showed that dominated strategies become extinct under the stochasticreplicator dynamics of exponential learning (2.16) irrespective of the noise level (fora related elimination result, see also Bravo and Mertikopoulos, 2014). The crucialqualitative difference here lies in the Itô correction term that appears in the driftof the stochastic replicator dynamics: the Itô correction in (2.16) is “just right”with respect to the logarithmic variables Y α = log X α and this is what leads to theunconditional elimination of dominated strategies. On the other hand, even thoughthere is no additional drift term in (SRD) except for the one driven by the game’spayoffs, the logarithmic transformation Y α = log X α incurs an Itô correction whichis reflected in the definition of the modified payoff functions (3.1). Remark . A standard induction argument based on the rounds of eliminationof iteratively dominated strategies (see e.g. Cabrales, 2000 or Mertikopoulos andMoustakas, 2010) can be used to show that the only strategies that survive underthe stochastic replicator dynamics (SRD) must be iteratively undominated in themodified game G σ . Remark . Finally, it is worth mentioning that Imhof (2005) also establishes anexponential rate of extinction of dominated strategies under the stochastic repli-cator dynamics with aggregate shocks (2.13). Specifically, if α ∈ A is dominated,Imhof (2005) showed that there exist constants A, B > and A ′ , B ′ > such that X α ( t ) = o (cid:16) exp (cid:16) − At + B p t log log t (cid:17)(cid:17) (a.s.) , (3.10)and P [ X α ( t ) > ε ] ≤
$\tfrac{1}{2}\,\mathrm{erfc}\big[A' t^{1/2} + B' \log\varepsilon \cdot t^{-1/2}\big]$, (3.11)
provided that the noise coefficients of (2.13) satisfy a certain "tameness" condition. Following the same reasoning, it is possible to establish similar exponential decay rates for the elimination of dominated strategies under (SRD), but the exact expressions for the constants in (3.10) and (3.11) are more complicated, so we do not present them here.

Stability analysis of equilibrium play.
In this section, our goal will be to investigate the stability and convergence properties of the stochastic replicator dynamics (SRD) with respect to equilibrium play. Motivated by a collection of stability results that is sometimes called the "folk theorem" of evolutionary game theory (Hofbauer and Sigmund, 2003), we will focus on the following three properties of the deterministic replicator dynamics (RD):
(1) Limits of interior orbits are Nash equilibria.
(2) Lyapunov stable states are Nash equilibria.
(3) Strict Nash equilibria are asymptotically stable under (RD).
Of course, given the stochastic character of the dynamics (SRD), the notions of Lyapunov and asymptotic stability must be suitably modified. In this SDE context, we have:
Definition 3.2.
Let $x^\ast \in X$. We will say that:
(1) $x^\ast$ is stochastically Lyapunov stable under (SRD) if, for every $\varepsilon > 0$ and for every neighborhood $U$ of $x^\ast$ in $X$, there exists a neighborhood $U_1 \subseteq U$ of $x^\ast$ such that
$$\mathbb{P}\big(X(t) \in U \text{ for all } t \ge 0\big) \ge 1 - \varepsilon \quad\text{whenever } X(0) \in U_1. \tag{3.12}$$
(2) $x^\ast$ is stochastically asymptotically stable under (SRD) if it is stochastically stable and attracting: for every $\varepsilon > 0$ and for every neighborhood $U$ of $x^\ast$ in $X$, there exists a neighborhood $U_1 \subseteq U$ of $x^\ast$ such that
$$\mathbb{P}\Big(X(t) \in U \text{ for all } t \ge 0 \text{ and } \lim_{t\to\infty} X(t) = x^\ast\Big) \ge 1 - \varepsilon \quad\text{whenever } X(0) \in U_1. \tag{3.13}$$
For (SRD), we have:
Theorem 3.3.
Let $X(t)$ be an interior solution orbit of the stochastic replicator dynamics (SRD) and let $x^\ast \in X$.
(1) If $\mathbb{P}(\lim_{t\to\infty} X(t) = x^\ast) > 0$, then $x^\ast$ is a Nash equilibrium of the noise-adjusted game $G_\sigma$.
(2) If $x^\ast$ is stochastically Lyapunov stable, then it is also a Nash equilibrium of the noise-adjusted game $G_\sigma$.
(3) If $x^\ast$ is a strict Nash equilibrium of the noise-adjusted game $G_\sigma$, then it is stochastically asymptotically stable under (SRD).

Remark. By the nature of the modified payoff functions (3.1), strict equilibria of the original game $G$ are also strict equilibria of $G_\sigma$, so Theorem 3.3 implies that strict equilibria of $G$ are also stochastically asymptotically stable under the stochastic dynamics (SRD). The converse does not hold: if the noise coefficients $\sigma_\alpha$ are sufficiently large, (SRD) possesses stochastically asymptotically stable states that are not Nash equilibria of $G$. This is consistent with the behavior of (SRD) in the pure noise case that we discussed in the previous section: if $X(t)$ starts within $\varepsilon$ of a vertex of $X$ and there are no payoff differences, then $X(t)$ converges to this vertex with probability at least $1-\varepsilon$.

Remark. The condition for $\alpha$ to be a strict equilibrium of the modified game is that
$$v_\beta - v_\alpha < \tfrac{1}{2}\big(\sigma_\alpha^2 + \sigma_\beta^2\big) \quad\text{for all } \beta \neq \alpha, \tag{3.14}$$
where the payoffs and the noise coefficients are evaluated at the vertex $e_\alpha$ of $X$ (note the similarity with (3.2)). To provide some intuition for this condition, consider the case of only two pure strategies, $\alpha$ and $\beta$, and assume constant noise coefficients. Letting $X(t) = X_\beta(t)$ and proceeding as in (2.20), we get $dX = X(1-X)\,\big[(v_\beta - v_\alpha)\,dt - \sigma\,dW\big]$, where $\sigma^2 = \sigma_\alpha^2 + \sigma_\beta^2$ and $W$ is a rescaled Wiener process.
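Before passing to the discrete-time heuristic, note that this two-strategy diffusion is easy to probe directly. The sketch below runs a plain Euler–Maruyama discretization of $dX = X(1-X)[(v_\beta - v_\alpha)\,dt - \sigma\,dW]$ with illustrative (hypothetical) payoff and noise values; it is only a numerical sanity check, not part of the formal argument.

```python
import numpy as np

def simulate_two_strategy(dv, sigma, x0=0.5, dt=1e-3, T=20.0, seed=0):
    """Euler-Maruyama scheme for dX = X(1-X)[dv dt - sigma dW], where X is
    the population share of beta-strategists and dv = v_beta - v_alpha."""
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(int(T / dt)):
        x += x * (1 - x) * (dv * dt - sigma * np.sqrt(dt) * rng.standard_normal())
        x = min(max(x, 1e-12), 1 - 1e-12)  # guard against discretization overshoot
    return x

# Noiseless benchmark: with v_beta > v_alpha, the share of beta tends to 1.
print(simulate_two_strategy(dv=1.0, sigma=0.0))   # close to 1

# A dominated beta with moderate noise: the share is driven towards 0.
print(simulate_two_strategy(dv=-1.0, sigma=0.5))  # close to 0
```

The hard clipping is only a numerical safeguard for the discretization; the continuous-time process itself never leaves $(0, 1)$.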
Heuristically, a discrete-time counterpart of $X(t)$ is then provided by the random walk:
$$X(n+1) - X(n) = X(n)\big(1 - X(n)\big)\Big[(v_\beta - v_\alpha)\,\delta + \sigma \xi_n \sqrt{\delta}\Big], \tag{3.15}$$
where $\xi_n \in \{+1, -1\}$ is a zero-mean Bernoulli process, and the noise term is multiplied by $\sqrt{\delta}$ instead of $\delta$ because $dW\cdot dW = dt$. For small $X$ and $\delta$, a simple computation then shows that, in the event $\xi_{n+1} = -\xi_n$, we have:
$$X(n+2) - X(n) = 2\delta X(n)\Big[v_\beta - v_\alpha - \tfrac{1}{2}\sigma^2\Big] + o(\delta) + o(X(n)). \tag{3.16}$$
Since $\sigma^2 = \sigma_\alpha^2 + \sigma_\beta^2$, the bracket is negative (so $X_\alpha = 1 - X$ increases) if and only if condition (3.14) is satisfied. Thus, (3.14) may be interpreted as saying that when the discrete-time process $X(n)$ is close to $e_\alpha$ and the random noise term $\xi_n$ takes two successive steps in opposite directions, then the process ends up even closer to $e_\alpha$. On the other hand, if the opposite strict inequality holds, then this interpretation suggests that $\beta$ should successfully invade a population where most individuals play $\alpha$ – which, in turn, explains (3.2).

Proof of Theorem 3.3.
Contrary to the approach of Hofbauer and Imhof (2009), we will not employ the stochastic Lyapunov method (see e.g. Khasminskii, 2012), which requires calculating the infinitesimal generator of (SRD). Instead, motivated by the recent analysis of Bravo and Mertikopoulos (2014), our proof will rely on the "dual" variables $Y_\alpha = \log X_\alpha$ that were already used in the proof of Theorem 3.1.

Part 1. We argue by contradiction. Indeed, assume that $\mathbb{P}(\lim_{t\to\infty} X(t) = x^\ast) > 0$ but that $x^\ast$ is not Nash for the noise-adjusted game $G_\sigma$, so $v^\sigma_\alpha(x^\ast) < v^\sigma_\beta(x^\ast)$ for some $\alpha \in \operatorname{supp}(x^\ast)$, $\beta \in A$. On that account, let $U$ be a sufficiently small neighborhood of $x^\ast$ in $X$ such that $v^\sigma_\beta(x) - v^\sigma_\alpha(x) \ge m$ for some $m > 0$ and for all $x \in U$. Then, by (3.6), we get:
$$dY_\alpha - dY_\beta = [v_\alpha - v_\beta]\,dt - \tfrac{1}{2}\big[(1-2X_\alpha)\sigma_\alpha^2 - (1-2X_\beta)\sigma_\beta^2\big]\,dt + \sigma_\alpha\,dW_\alpha - \sigma_\beta\,dW_\beta, \tag{3.17}$$
so, if $X(t)$ is an interior orbit of (SRD) that converges to $x^\ast$, we will have:
$$dY_\alpha - dY_\beta \le -m\,dt - d\xi \quad\text{for all large enough } t > 0, \tag{3.18}$$
where $\xi$ denotes the martingale part of (3.17). Since the diffusion coefficients of (3.17) are bounded, Lemma A.1 shows that $mt + \xi(t) \sim mt$ for large $t$ (a.s.), so
$$\log\frac{X_\alpha(t)}{X_\beta(t)} \le \log\frac{X_\alpha(0)}{X_\beta(0)} - mt - \xi(t) \sim -mt \to -\infty \quad\text{(a.s.)} \tag{3.19}$$
as $t \to \infty$. This implies that $\lim_{t\to\infty} X_\alpha(t) = 0$, contradicting our original assumption that $X(t)$ stays in a small enough neighborhood of $x^\ast$ with positive probability (recall that $x^\ast_\alpha > 0$); we thus conclude that $x^\ast$ is a Nash equilibrium of the noise-adjusted game $G_\sigma$, as claimed.

(In the setting of the preceding remark, it is more probable for $X(n)$ to decrease rather than increase: $X(n+2) > X(n)$ with probability 1/4, i.e. if and only if $\xi_n$ takes two positive steps, while $X(n+2) < X(n)$ with probability 3/4.)

Part 2. Assume that $x^\ast$ is stochastically Lyapunov stable. Then, every neighborhood $U$ of $x^\ast$ admits an interior trajectory $X(t)$ that stays in $U$ for all time with positive probability.
The proof of Part 1 shows that this is only possible if $x^\ast$ is a Nash equilibrium of the modified game $G_\sigma$, so our claim follows.

Part 3. To show that strict Nash equilibria of $G_\sigma$ are stochastically asymptotically stable, let $x^\ast = (\alpha^\ast_1, \dots, \alpha^\ast_N) \in X$ be a strict equilibrium of $G_\sigma$. Then, suppressing the population index $k$ as before, let
$$Z_\alpha = Y_\alpha - Y_{\alpha^\ast}, \tag{3.20}$$
so that $X(t) \to x^\ast$ if and only if $Z_\alpha(t) \to -\infty$ for all $\alpha \in A^\ast \equiv A \setminus \{\alpha^\ast\}$. To proceed, fix some probability threshold $\varepsilon > 0$ and a neighborhood $U$ of $x^\ast$ in $X$. Since $x^\ast$ is a strict equilibrium of $G_\sigma$, there exists a neighborhood $U_1 \subseteq U$ of $x^\ast$ and some $m > 0$ such that
$$v^\sigma_{\alpha^\ast}(x) - v^\sigma_\alpha(x) \ge m \quad\text{for all } x \in U_1 \text{ and for all } \alpha \in A^\ast. \tag{3.21}$$
Let $M > 0$ be sufficiently large so that $X(t) \in U_1$ if $Z_\alpha(t) \le -M$ for all $\alpha \in A^\ast$; we will show that if $M$ is chosen suitably (in terms of $\varepsilon$) and $Z_\alpha(0) < -2M$, then $X(t) \in U_1$ for all $t \ge 0$ and $Z_\alpha(t) \to -\infty$ with probability at least $1 - \varepsilon$, i.e. $x^\ast$ is stochastically asymptotically stable.

To that end, take $Z_\alpha(0) \le -2M$ in (3.20) and define the first exit time:
$$\tau_U = \inf\{t > 0 : X(t) \notin U_1\}. \tag{3.22}$$
By applying (3.17), we then get:
$$dZ_\alpha = dY_\alpha - dY_{\alpha^\ast} = \big[v^\sigma_\alpha - v^\sigma_{\alpha^\ast}\big]\,dt - d\xi, \tag{3.23}$$
where the martingale term $d\xi$ is defined as in (3.17), taking $\beta = \alpha^\ast$. Hence, for all $t \le \tau_U$, we will have:
$$Z_\alpha(t) = Z_\alpha(0) + \int_0^t \big[v^\sigma_\alpha(X(s)) - v^\sigma_{\alpha^\ast}(X(s))\big]\,ds - \xi(t) \le -2M - mt - \xi(t). \tag{3.24}$$
By the time-change theorem for martingales (Øksendal, 2007, Cor. 8.5.4), there exists a standard Wiener process $\widetilde W(t)$ such that $\xi(t) = \widetilde W(\rho(t))$, where $\rho = [\xi, \xi]$ denotes the quadratic variation of $\xi$; as such, we will have $Z_\alpha(t) \le -M$ whenever $\widetilde W(\rho(t)) \ge -M - mt$. However, with $\sigma$ Lipschitz over $X$, we readily get $\rho(t) \le Kt$ for some positive constant $K > 0$, so it suffices to show that the hitting time
$$\tau = \inf\big\{t > 0 : \widetilde W(t) = -M - mt/K\big\} \tag{3.25}$$
is finite with probability not exceeding $\varepsilon$.
Indeed, if a trajectory of $\widetilde W(t)$ has $\widetilde W(t) \ge -M - mt/K$ for all $t \ge 0$, we will also have
$$\widetilde W(\rho(t)) \ge -M - m\rho(t)/K \ge -M - mt, \tag{3.26}$$
so $\tau_U$ is infinite for every trajectory of $\widetilde W$ with infinite $\tau$, hence $\mathbb{P}(\tau_U < +\infty) \le \mathbb{P}(\tau < +\infty)$. Lemma A.2 then shows that $\mathbb{P}(\tau < +\infty) = e^{-2Mm/K}$, so, if we take
$M > -(2m)^{-1} K \log\varepsilon$, we get $\mathbb{P}(\tau_U = \infty) \ge 1 - \varepsilon$. Conditioning on the event $\tau_U = +\infty$, Lemma A.1 applied to (3.24) yields
$$Z_\alpha(t) \le -2M - mt - \xi(t) \sim -mt \to -\infty \quad\text{(a.s.)}, \tag{3.27}$$
so $X(t) \to x^\ast$ with probability at least $1 - \varepsilon$, as was to be shown. (Simply note that $X_{\alpha^\ast} = \big(1 + \sum_{\beta\in A^\ast} \exp(Z_\beta)\big)^{-1}$.) □

Remark. As mentioned before, Hofbauer and Imhof (2009) state a similar "evolutionary folk theorem" in the context of single-population random matching games under the stochastic replicator dynamics with aggregate shocks (2.13). In particular, Hofbauer and Imhof (2009) consider the modified game:
$$v^\sigma_\alpha(x) = v_\alpha(x) - \tfrac{1}{2}\sigma_\alpha^2, \tag{3.28}$$
where $\sigma_\alpha$ denotes the intensity of the aggregate shocks in (2.13), and they show that strict Nash equilibria of this noise-adjusted game are stochastically asymptotically stable under (2.13). It is interesting to note that the adjustments (3.1) and (3.28) do not coincide: the payoff shocks affect the deterministic replicator equation (RD) in a different way than the aggregate shocks of (2.13). Heuristically, in the model of Fudenberg and Harris (1992), noise is detrimental because, for a given expected growth rate, noise almost surely lowers the long-term average geometric growth rate of the total number of individuals playing $\alpha$ by the quantity $\tfrac{1}{2}\sigma_\alpha^2$. In a geometric growth process, the quantities that matter (the proper fitness measures) are these long-term geometric growth rates, so the relevant payoffs are those of this modified game. In our model, noise is not detrimental, but if it is strong enough compared to the deterministic drift, then, with positive probability, it may lead to other outcomes than the deterministic model.
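The geometric-growth point above can be made concrete with a toy computation in the spirit of the discrete-time footnote: with a growth factor of 1.5 or 0.6 with equal probability (illustrative values), the arithmetic mean exceeds 1 while the geometric mean — which governs long-run growth — falls below 1.

```python
import math
import random

# Growth factor g = 1.5 or 0.6 with probability 1/2 each (illustrative values).
factors, p = (1.5, 0.6), 0.5

arithmetic = p * factors[0] + (1 - p) * factors[1]
geometric = math.exp(p * math.log(factors[0]) + (1 - p) * math.log(factors[1]))
print(arithmetic)  # 1.05: growth "on average"
print(geometric)   # ~0.949: the long-run fitness measure is below 1

# A long product of i.i.d. factors concentrates around geometric**n, so the
# process decays almost surely despite its positive expected growth rate.
random.seed(0)
z = 1.0
for _ in range(5000):
    z *= random.choice(factors)
print(z < 1e-10)   # True
```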
Instead, the assumptions of Theorems 3.1 and 3.3 should be interpreted as guaranteeing that the deterministic drift prevails. One way to see this is to note that if $\beta$ strictly dominates $\alpha$ in the original game and both strategies are affected by the same noise intensity ($\sigma_\alpha = \sigma_\beta = \sigma$), then $\beta$ need not dominate $\alpha$ in the modified game defined by (3.1), unless the payoff margin in the original game is always greater than $\sigma^2$.

Remark. It is also worth contrasting Theorem 3.3 to the unconditional convergence and stability results of Mertikopoulos and Moustakas (2010) for the stochastic replicator dynamics of exponential learning (2.16). As in the case of dominated strategies, the reason for this qualitative difference is the distinct origins of the perturbation process: the Itô correction in (2.16) is "just right" with respect to the dual variables $Y_\alpha = \log X_\alpha$, so a state $x^\ast \in X$ is stochastically asymptotically stable under (2.16) if and only if it is a strict equilibrium of the original game $G$.

4. The effect of aggregating payoffs
In this section, we examine the case where players are less "myopic" and, instead of using revision protocols driven by their instantaneous payoffs, they base their decisions on the cumulative payoffs of their strategies over time. Formally, focusing for concreteness on the "imitation of success" revision protocol (2.7), this amounts to considering conditional switch rates of the form:
$$\tilde\rho_{\alpha\beta} = x_\beta U_\beta, \tag{4.1}$$
where
$$U_\beta(t) = \int_0^t v_\beta(x(s))\,ds \tag{4.2}$$
denotes the cumulative payoff of strategy $\beta$ up to time $t$. (In a discrete time setting, if $Z(n+1) = g(n) Z(n)$ and $g(n) = k_i$ with probability $p_i$, what we mean is that the quantity that a.s. governs the long-term growth of $Z$ is not $\mathbb{E}(g) = \sum_i p_i k_i$, but $\exp(\mathbb{E}(\ln g)) = \prod_i k_i^{p_i}$.) In this case, (RD) becomes:
$$\dot x_\alpha = x_\alpha\Big[U_\alpha - \sum_\beta x_\beta U_\beta\Big], \tag{4.3}$$
and, as was shown by Laraki and Mertikopoulos (2013), the evolution of mixed strategy shares is governed by the (deterministic) second order replicator dynamics:
$$\ddot x_\alpha = x_\alpha\Big[v_\alpha(x) - \sum_\beta x_\beta v_\beta(x)\Big] + x_\alpha\Big[(\dot x_\alpha/x_\alpha)^2 - \sum_\beta x_\beta\,(\dot x_\beta/x_\beta)^2\Big]. \tag{RD$^2$}$$
As in the previous section, we are interested in the effects of random payoff shocks on the dynamics (RD$^2$). If the game's payoff functions are subject to random shocks at each instant in time, then these shocks will also be aggregated over time, leading to the perturbed cumulative payoff process:
$$\hat U_\alpha(t) = \int_0^t v_\alpha(X(s))\,ds + \int_0^t \sigma_\alpha(X(s))\,dW_\alpha(s). \tag{4.4}$$
Since $\hat U_\alpha$ is continuous (a.s.), we obtain the stochastic integro-differential dynamics:
$$\dot X_\alpha = X_\alpha\Big[U_\alpha(t) - \sum_\beta X_\beta(t) U_\beta(t)\Big] + X_\alpha\Big[\int_0^t \sigma_\alpha(X(s))\,dW_\alpha(s) - \sum_\beta \int_0^t X_\beta(s)\sigma_\beta(X(s))\,dW_\beta(s)\Big], \tag{4.5}$$
where, as in (SRD), we assume that the Brownian disturbances $W_\alpha(t)$ are independent. To obtain an autonomous SDE from (4.5), let $V_\alpha = \dot X_\alpha$ denote the growth rate of strategy $\alpha$.
Then, differentiating (4.5) yields:
$$dV_\alpha = X_\alpha\Big[\dot U_\alpha - \sum_\beta X_\beta \dot U_\beta\Big]\,dt \tag{4.6a}$$
$$\qquad + V_\alpha\Big[U_\alpha - \sum_\beta X_\beta U_\beta\Big]\,dt - X_\alpha \sum_\beta U_\beta V_\beta\,dt \tag{4.6b}$$
$$\qquad + V_\alpha\Big[\int_0^t \sigma_\alpha(X(s))\,dW_\alpha(s) - \sum_\beta \int_0^t X_\beta(s)\sigma_\beta(X(s))\,dW_\beta(s)\Big]\,dt \tag{4.6c}$$
$$\qquad + X_\alpha\Big[\sigma_\alpha(X)\,dW_\alpha - \sum_\beta \sigma_\beta(X) X_\beta\,dW_\beta\Big]. \tag{4.6d}$$
By (4.5), the sum of the first term of (4.6b) and (4.6c) is equal to $V_\alpha^2/X_\alpha\,dt$. Thus, using (4.2) we obtain:
$$dV_\alpha = X_\alpha\Big[v_\alpha(X) - \sum_\beta X_\beta v_\beta(X)\Big]\,dt + \frac{V_\alpha^2}{X_\alpha}\,dt - X_\alpha\sum_\beta U_\beta V_\beta\,dt + X_\alpha\Big[\sigma_\alpha(X)\,dW_\alpha - \sum_\beta \sigma_\beta(X) X_\beta\,dW_\beta\Big], \tag{4.7}$$
and, after summing over all $\alpha$ and solving for $X_\alpha\sum_\beta U_\beta V_\beta\,dt$, we get the second order SDE system:
$$dX_\alpha = V_\alpha\,dt$$
$$dV_\alpha = X_\alpha\Big[v_\alpha(X) - \sum_\beta X_\beta v_\beta(X)\Big]\,dt + X_\alpha\Big[V_\alpha^2/X_\alpha^2 - \sum_\beta V_\beta^2/X_\beta\Big]\,dt + X_\alpha\Big[\sigma_\alpha(X)\,dW_\alpha - \sum_\beta \sigma_\beta(X) X_\beta\,dW_\beta\Big]. \tag{4.8}$$
(Recall that $\sum_\alpha dV_\alpha = 0$ since $\sum_\alpha X_\alpha = 1$.)
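As an illustration of the dynamics just derived, the following sketch integrates the noiseless cumulative-payoff dynamics (4.3) with a forward Euler scheme (the payoff values are hypothetical). Since the cumulative payoff gap of a dominated strategy grows linearly in $t$, its log-share decays quadratically — noticeably faster than under the first order dynamics.

```python
import numpy as np

def simulate_cumulative(v, x0, dt=1e-3, T=5.0):
    """Forward Euler for x'_a = x_a [U_a - sum_b x_b U_b] with the
    cumulative payoffs U_a(t) = int_0^t v_a(x(s)) ds of (4.2)."""
    x = np.array(x0, dtype=float)
    U = np.zeros_like(x)
    for _ in range(int(T / dt)):
        U = U + v(x) * dt                    # aggregate payoffs over time
        x = x + x * (U - np.dot(x, U)) * dt  # imitation of long-term success
        x = np.clip(x, 1e-300, None)
        x = x / x.sum()                      # re-project onto the simplex
    return x

v = lambda x: np.array([0.0, 1.0])  # strategy 0 is dominated with margin m = 1
x_T = simulate_cumulative(v, [0.5, 0.5])
print(x_T[0])  # roughly exp(-T**2/2) in relative terms: negligible at T = 5
```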
By comparing the second order system (4.8) to (RD$^2$), we see that there is no Itô correction, just as in the first order case. By using similar arguments as in Laraki and Mertikopoulos (2013), it is then possible to show that the system (4.8) is well-posed, i.e. it admits a unique (strong) solution $X(t)$ for every interior initial condition $X(0) \in X^\circ$, $V(0) \in \mathbb{R}^A$, and this solution remains in $X^\circ$ for all time.

With this well-posedness result at hand, we begin by showing that (4.8) eliminates strategies that are dominated in the original game $G$ (instead of the modified game $G_\sigma$):

Theorem 4.1.
Let $X(t)$ be a solution orbit of the dynamics (4.5) and assume that $\alpha \in A$ is dominated by $\beta \in A$. Then, $\alpha$ becomes extinct (a.s.).

Proof. As in the proof of Theorem 3.1, let $Y_\alpha = \log X_\alpha$. Then, following the same string of calculations leading to (3.6), we get:
$$dY_\alpha - dY_\beta = [U_\alpha - U_\beta]\,dt + \Big[\int_0^t \sigma_\alpha(X)\,dW_\alpha - \int_0^t \sigma_\beta(X)\,dW_\beta\Big]\,dt. \tag{4.9}$$
Since $\alpha$ is dominated by $\beta$, there exists some positive constant $m > 0$ such that $v_\alpha - v_\beta \le -m$, and hence $U_\alpha(t) - U_\beta(t) \le -mt$. Furthermore, with $\sigma$ bounded and continuous on $X$, Lemma A.1 readily yields:
$$-mt + \Big[\int_0^t \sigma_\alpha(X)\,dW_\alpha - \int_0^t \sigma_\beta(X)\,dW_\beta\Big] \sim -mt \tag{4.10}$$
as $t \to \infty$. Accordingly, (4.9) becomes:
$$dY_\alpha - dY_\beta \le -mt\,dt + \theta(t)\,dt, \tag{4.11}$$
where the remainder function $\theta(t)$ corresponding to the drift term of (4.9) is sublinear in $t$. By integrating and applying Lemma A.1 a second time, we then obtain:
$$Y_\alpha(t) - Y_\beta(t) \le Y_\alpha(0) - Y_\beta(0) - \tfrac{1}{2}mt^2 + \int_0^t \theta(s)\,ds \sim -\tfrac{1}{2}mt^2 \quad\text{(a.s.)}. \tag{4.12}$$
We infer that $\lim_{t\to\infty} X_\alpha(t) = 0$ (a.s.), i.e. $\alpha$ becomes extinct along $X(t)$. □

Remark. In view of Theorem 4.1, we see that the "imitation of long-term success" protocol (4.1) provides more robust elimination results than (2.7) in the presence of payoff shocks: contrary to Theorem 3.1, there are no "small noise" requirements in Theorem 4.1.

Our next result provides the analogue of Theorem 3.3 regarding the stability of equilibrium play:
Theorem 4.2.
Let $X(t)$ be an interior solution orbit of the stochastic dynamics (4.5) and let $x^\ast \in X$. Then:
(1) If $\mathbb{P}(\lim_{t\to\infty} X(t) = x^\ast) > 0$, then $x^\ast$ is a Nash equilibrium of $G$.

(The reason however is different: in (SRD), there is no Itô correction because the noise is added directly to the dynamical system under study; in (4.8), there is no Itô correction because the noise is integrated over, so $X_\alpha$ is smooth and, hence, obeys the rules of ordinary calculus. Recall also that $\int_0^t \sigma_\alpha(X(s))\,dW_\alpha(s)$ is continuous, so the only Itô correction stems from random mutations. Finally, Theorem 4.1 actually applies to mixed dominated strategies as well, even iteratively dominated ones; the proof is a simple adaptation of the pure strategies case, so we omit it.)
Moreover, for every neighborhood $U$ of $x^\ast$ and for all $\varepsilon > 0$, we have:
(2) If $\mathbb{P}(X(t) \in U \text{ for all } t \ge 0) \ge 1 - \varepsilon$ whenever $X(0) \in U_1$ for some neighborhood $U_1 \subseteq U$ of $x^\ast$, then $x^\ast$ is a Nash equilibrium of $G$.
(3) If $x^\ast$ is a strict Nash equilibrium of $G$, there exists a neighborhood $U_1$ of $x^\ast$ such that:
$$\mathbb{P}\Big(X(t) \in U \text{ for all } t \ge 0 \text{ and } \lim_{t\to\infty} X(t) = x^\ast\Big) \ge 1 - \varepsilon, \tag{4.13}$$
whenever $X(0) \in U_1$.

Remark. Part 1 of Theorem 4.2 is in direct analogy with Part 1 of Theorem 3.3: the difference is that Theorem 4.2 shows that only Nash equilibria of the original game $G$ can be $\omega$-limits of interior orbits with positive probability. Put differently, if $x^\ast$ is a strict equilibrium of $G_\sigma$ but not of $G$, there is zero probability that (4.5) converges to $x^\ast$.

On the other hand, Parts 2 and 3 are not tantamount to stochastic stability (Lyapunov or asymptotic) under the autonomous SDE system (4.8). The difference here is that (4.8) is only well-defined in the interior $X^\circ$ of $X$, so it is not straightforward how to define the notion of (stochastic) stability for boundary points $x^\ast \in \mathrm{bd}(X)$; moreover, given that (4.8) is a second order system, stability should be stated in terms of the problem's entire phase space, including initial velocities (for a relevant discussion, see Laraki and Mertikopoulos, 2013). Instead, the stated stability conditions simply reflect the fact that the integro-differential dynamics (4.5) always start with initial velocity $V(0) = 0$, so this added complexity is not relevant.

Proof of Theorem 4.2.
We shadow the proof of Theorem 3.3.

Part 1. Assume that $\mathbb{P}(\lim_{t\to\infty} X(t) = x^\ast) > 0$ for some $x^\ast \in X$. If $x^\ast$ is not a Nash equilibrium of $G$, we will have $v_\alpha(x^\ast) < v_\beta(x^\ast)$ for some $\alpha \in \operatorname{supp}(x^\ast)$, $\beta \in A$. Accordingly, let $U$ be a sufficiently small neighborhood of $x^\ast$ in $X$ such that $v_\beta(x) - v_\alpha(x) \ge m$ for some $m > 0$ and for all $x \in U$. Since $X(t) \to x^\ast$ with positive probability, it also follows that $\mathbb{P}(X(t) \in U \text{ for all } t \ge 0) > 0$; hence, arguing as in (4.12) and conditioning on the positive probability event "$X(t) \in U$ for all $t \ge 0$", we get:
$$Y_\alpha(t) - Y_\beta(t) \sim -\tfrac{1}{2}mt^2 \quad\text{(conditionally a.s.)}. \tag{4.14}$$
This implies $X_\alpha(t) \to 0$, contradicting our original assumption that $X(t)$ stays in a small neighborhood of $x^\ast$. We infer that $x^\ast$ is a Nash equilibrium of $G$, as claimed.

Part 2. Simply note that the stability assumption of Part 2 implies that there exists a positive measure of interior trajectories $X(t)$ that remain in an arbitrarily small neighborhood of $x^\ast$ with positive probability. The proof then follows as in Part 1.

Part 3. Let $Z_\alpha = Y_\alpha - Y_{\alpha^\ast}$ be defined as in (3.20) and let $m > 0$ be such that $v_{\alpha^\ast}(x) - v_\alpha(x) \ge m$ for all $x$ in some sufficiently small neighborhood $U$ of $x^\ast$. Also, let $M > 0$ be sufficiently large so that $X(t) \in U$ if $Z_\alpha(t) \le -M$ for all $\alpha \in A^\ast$; as in the proof of Theorem 3.3, we will show that there is a suitable choice of $M$ such that $Z_\alpha(0) < -2M$ for all $\alpha \neq \alpha^\ast$ implies that $X(t) \in U$ for all $t \ge 0$ and $Z_\alpha(t) \to -\infty$ for all $\alpha \neq \alpha^\ast$ with probability at least $1 - \varepsilon$.

(Recall here that strict equilibria of $G$ are also strict equilibria of $G_\sigma$, but the converse need not hold.)
Indeed, by setting $\beta = \alpha^\ast$ in (4.9), we obtain:
$$dZ_\alpha = [U_\alpha - U_{\alpha^\ast}]\,dt + \Big[\int_0^t \sigma_\alpha(X)\,dW_\alpha - \int_0^t \sigma_{\alpha^\ast}(X)\,dW_{\alpha^\ast}\Big]\,dt, \tag{4.15}$$
so, recalling (4.4), for all $t \le \tau_U = \inf\{t > 0 : X(t) \notin U\}$, we will have:
$$Z_\alpha(t) \le -2M - \tfrac{1}{2}mt^2 + \int_0^t \theta(s)\,ds - \xi(t), \tag{4.16}$$
where $\xi$ denotes the martingale part of (4.15) and $\theta(t)$ is defined as in (4.11).

Now, let $W(t)$ be a Wiener process starting at the origin. We will show that if $M$ is chosen sufficiently large, then
$$\mathbb{P}\Big(M + \tfrac{1}{2}mt^2 \ge \int_0^t W(s)\,ds \text{ for all } t \ge 0\Big) \ge 1 - \varepsilon. \tag{4.17}$$
With a fair degree of hindsight, we note first that $\tfrac{1}{2}mt^2 + M \ge at^2 + bt + c$ for $a = m/4$, $b = \sqrt{Mm/2}$ and $c = M/2$, so it suffices to show that the hitting time $\tau = \inf\{t : \int_0^t W(s)\,ds = at^2 + bt + c\}$ is infinite with probability at least $1 - \varepsilon$. However, if $\tau < \infty$ and $W(t) < 2at + b$ for all $t \le \tau$, we would get the contradiction
$$\int_0^\tau W(s)\,ds < a\tau^2 + b\tau < a\tau^2 + b\tau + c, \tag{4.18}$$
since $c > 0$. Hence, the hitting time $\tau' = \inf\{t > 0 : W(t) = 2at + b\}$ satisfies $\tau'(\omega) \le \tau(\omega)$ for every trajectory $\omega$ of $W$ with $\tau(\omega) < \infty$. However, Lemma A.2 gives $\mathbb{P}[\tau' < \infty] = \exp(-4ab)$, hence:
$$\mathbb{P}(\tau < \infty) \le \mathbb{P}(\tau' < \infty) = \exp(-4ab) = \exp\big(-m\sqrt{Mm/2}\big), \tag{4.19}$$
i.e. $\mathbb{P}(\tau < \infty)$ can be made arbitrarily small by choosing $M$ large enough. We thus deduce that
$$\int_0^t W(s)\,ds \le \tfrac{1}{2}mt^2 + M \quad\text{for all } t \ge 0 \tag{4.20}$$
with probability no less than $1 - \varepsilon$.

Going back to (4.16), we see that $\int_0^t \theta(s)\,ds - \xi(t) - \tfrac{1}{2}mt^2$ remains below $M$ for all time with probability at least $1 - \varepsilon$ (simply use the probability estimate (4.17) and argue as in the proof of Theorem 3.3, recalling that the processes $W_\alpha$ are assumed independent). In turn, this shows that $\mathbb{P}(X(t) \in U \text{ for all } t \ge 0) \ge 1 - \varepsilon$, so, conditioning on this last event and letting $t \to \infty$ in (4.16), we obtain:
$$\mathbb{P}\Big(\lim_{t\to\infty} Z_\alpha(t) = -\infty \,\Big|\, X(t) \in U \text{ for all } t \ge 0\Big) = 1 \quad\text{for all } \alpha \neq \alpha^\ast. \tag{4.21}$$
We conclude that $X(t)$ remains in $U$ and $X(t) \to x^\ast$ with probability at least $1 - \varepsilon$, as was to be shown.
□

5. Discussion
In this section, we discuss some points that would have otherwise disrupted the flow of the main text:
5.1. Payoff shocks in bimatrix games.
Throughout our paper, we have worked with generic population games, so we have not made any specific assumptions on the payoff shocks either. On the other hand, if the game's payoff functions are obtained from some common underlying structure, then the resulting payoff shocks may also end up having a likewise specific form.

For instance, consider a basic (symmetric) random matching model where pairs of players are drawn randomly from a nonatomic population to play a symmetric two-player game with payoff matrix $V_{\alpha\beta}$, $\alpha, \beta = 1, \dots, n$. In this case, the payoff to an $\alpha$-strategist in the population state $x \in X$ will be of the form:
$$v_\alpha(x) = \sum_\beta V_{\alpha\beta} x_\beta. \tag{5.1}$$
Thus, if the entries of $V$ are disturbed at each $t \ge 0$ by some (otherwise independent) white noise process $\xi_{\alpha\beta}$, the perturbed payoff matrix $\hat V_{\alpha\beta} = V_{\alpha\beta} + \xi_{\alpha\beta}$ will result in the total payoff shock:
$$\xi_\alpha = \sum_\beta \xi_{\alpha\beta} x_\beta. \tag{5.2}$$
The stochastic dynamics (2.8) thus become:
$$dX_\alpha = X_\alpha\Big[\sum_\beta V_{\alpha\beta} X_\beta - \sum_{\beta,\gamma} V_{\beta\gamma} X_\beta X_\gamma\Big]\,dt + X_\alpha\Big[\sum_\beta \sigma_{\alpha\beta} X_\beta\,dW_{\alpha\beta} - \sum_{\beta,\gamma} \sigma_{\beta\gamma} X_\beta X_\gamma\,dW_{\beta\gamma}\Big], \tag{5.3}$$
where the Wiener processes $W_{\alpha\beta}$ are assumed independent.

To compare (5.3) with the core model (SRD), the same string of calculations as in the proof of Theorems 3.1 and 3.3 leads to the modified payoff functions: $v^\sigma_\alpha(x) = v_\alpha(x)\; -$
$\tfrac{1}{2}(1 - 2x_\alpha)\sum_\beta \sigma^2_{\alpha\beta}\, x_\beta^2$. (5.4)
It is then trivial to see that Theorems 3.1 and 3.3 still apply with respect to the modified game $G_\sigma$ with payoff functions defined as above; however, seeing as (5.4) is cubic in $x$ and considering the case of constant noise, these modified payoff functions no longer correspond to random matching in a modified bimatrix game.

5.2. Stratonovich-type perturbations.
Depending on the origins of the payoffshock process (for instance, if there are nontrivial autocorrelation effects that donot vanish in the continuous-time regime), the perturbed dynamics (SRD) couldinstead be written as a Stratonovich-type SDE (Kuo, 2006): ∂X α = X α h v α ( X ) − X β X β v β ( X ) i dt + X α h σ α ∂W α − X β X β σ β ∂W β i , (5.5)where ∂ ( · ) denotes Stratonovich integration. In this case, if M αβ = X α ( δ αβ − X β ) σ β denotes the diffusion matrix of (5.5), the Itô equivalent SDE correspondingto (5.5) will be: dX α = X α h v α ( X ) − X β X β v β ( X ) i dt + 12 X β,γ ∂M αβ ∂X γ M γβ dt + X α h σ α dW α − X β X β σ β dW β i . (5.6) For a general overview of the differences between Itô and Stratonovich integration, see vanKampen (1981); for a more specific account in the context of stochastic population growth models,the reader is instead referred to Khasminskii and Potsepun (2006) and Hofbauer and Imhof (2009).
Then, assuming that the shock coefficients $\sigma_\beta$ are constant, some algebra yields the following explicit expression for the Itô correction of (5.6):
$$\tfrac{1}{2}\sum_{\beta,\gamma} \frac{\partial M_{\alpha\beta}}{\partial X_\gamma}\, M_{\gamma\beta}\,dt = \tfrac{1}{2} X_\alpha\Big[\sigma_\alpha^2(1-X_\alpha)^2 + \sum_{\beta\neq\alpha}\sigma_\beta^2 X_\beta^2 - \sum_\beta \sigma_\beta^2 X_\beta(1-X_\beta)\Big]\,dt = \tfrac{1}{2} X_\alpha\Big[(1-2X_\alpha)\sigma_\alpha^2 - \sum_\beta X_\beta(1-2X_\beta)\sigma_\beta^2\Big]\,dt. \tag{5.7}$$
By substituting this correction back to (5.6), we see that the replicator dynamics with Stratonovich shocks (5.5) are equivalent to the (Itô) stochastic replicator dynamics of exponential learning (2.16). In this context, Mertikopoulos and Moustakas (2010) showed that the conclusions of Theorems 3.1 and 3.3 apply directly to the original, unmodified game $G$ under (2.16), so dominated strategies become extinct and strict equilibria are stochastically asymptotically stable under (5.5) as well. Alternatively, this can also be seen directly from the correction term (5.7), which cancels with that of (3.1).

5.3. Random strategy switches.
An alternative source of noise to the players’evolution under (RD) could come from random masses of players that switch strate-gies without following an underlying deterministic drift – as opposed to jumps witha well-defined direction induced by a revision protocol. To model this kind of “mu-tations”, we posit that the relative mass dX αβ of players switching from α to β overan infinitesimal time interval dt is governed by the SDE: dX αβ = X α ( ρ αβ dt + dM αβ ) , (5.8)where dM αβ denotes the (conditional) mutation rate from α to β .To account for randomness, we will assume that M αβ has unbounded variationover finite time intervals (contrary to the bounded variation drift term X α ρ αβ dt ).Moreover, for concreteness, we will focus on the imitative regime where ρ αβ = x β r αβ and the mutation processes M αβ follow a similar imitative pattern, namely dM αβ = X β dR αβ . The net change in the population of α -strategists will then be X β X β dM βα − X α X β dM αβ = X α X β X β dQ βα , (5.9)where dQ βα = dR βα − dR αβ describes the net influx of β -strategists in strategy α per unit population mass. Thus, assuming that the increments dQ βα are zero-mean,we will model Q as an Itô process of the form: dQ αβ = η αβ ( X ) dW αβ (5.10)where W αβ is an ordinary Wiener process and the diffusion coefficients η αβ : X → R reflect the intensity of the mutation process. In particular, the only assumptionsthat we need to make for W and η are that: dW αβ = − dW βα and η αβ = η βα for all α, β ∈ A and for all k ∈ N , (5.11)so that the net influx from α to β is minus the net influx from β to α ; except forthis “conservation of mass” requirement, we will assume that the processes dW αβ are otherwise independent. Thus, in the special case of the “imitation of success”revision protocol (2.7), we obtain the replicator dynamics with random mutations : dX α = X α h v α ( X ) − X β X β v β ( X ) i dt + X α X β = α X β η βα ( X ) dW βα . 
(5.12)
This equation differs from (SRD) in that the martingale term of (SRD) cannot be recovered from that of (5.12) without violating the symmetry conditions (5.11) that guarantee that there is no net transfer of mass across any pair of strategies $\alpha, \beta \in A$. Nonetheless, by repeating the same analysis as in the case of Theorems 3.1 and 3.3, we obtain the following proposition for the stochastic dynamics (5.12):
Let $X(t)$ be an interior solution orbit of the stochastic dynamics (5.12) and consider the noise-adjusted game $G_\eta$ with modified payoff functions:
$$v^\eta_\alpha(x) = v_\alpha(x) - \tfrac{1}{2}\sum_{\beta\neq\alpha} x_\beta^2\, \eta^2_{\beta\alpha}(x). \tag{5.13}$$
We then have:
(1) If $p \in X$ is dominated in $G_\eta$, then it becomes extinct under (5.12).
(2) If $\mathbb{P}(\lim_{t\to\infty} X(t) = x^\ast) > 0$ for some $x^\ast \in X$, then $x^\ast$ is a Nash equilibrium of $G_\eta$.
(3) If $x^\ast \in X$ is stochastically Lyapunov stable, then it is a Nash equilibrium of $G_\eta$.
(4) If $x^\ast \in X$ is a strict Nash equilibrium of $G_\eta$, then it is stochastically asymptotically stable under (5.12).

Proof. The proof is similar to that of Theorems 3.1 (for Part 1) and 3.3 (for Parts 2–4), so we omit it. □
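The Itô corrections appearing in this section reduce to finite-dimensional algebraic identities, so they can be spot-checked numerically. The sketch below verifies, at a randomly drawn state, (i) that the $\alpha$-dependent part of the quadratic variation of the bimatrix noise in (5.3) matches the correction term of (5.4), and (ii) the contraction formula (5.7) for constant noise; the dimensions and coefficient ranges are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
x = rng.dirichlet(np.ones(n))              # random interior population state

# --- Bimatrix shocks, cf. (5.3)-(5.4): the coefficient of dW_{mu,nu} in
# dX_alpha / X_alpha is sigma_{mu,nu} x_nu (delta_{alpha,mu} - x_mu).
sig = rng.uniform(0.1, 2.0, size=(n, n))
qv = np.empty(n)
for a in range(n):
    coeff = sig * x[None, :] * (np.eye(n)[a][:, None] - x[:, None])
    qv[a] = np.sum(coeff**2)
common = np.sum(x[:, None]**2 * sig**2 * x[None, :]**2)  # alpha-independent part
print(np.allclose(qv - common, (1 - 2 * x) * (sig**2 @ x**2)))  # True

# --- Stratonovich correction, cf. (5.7): contract dM_{ab}/dX_g with M_{gb}
# for M_{ab} = X_a (delta_ab - X_b) s_b and constant s.
s = rng.uniform(0.2, 1.5, size=n)
eye = np.eye(n)
M = x[:, None] * (eye - x[None, :]) * s[None, :]
dM = np.zeros((n, n, n))
for a in range(n):
    for b in range(n):
        for g in range(n):
            dM[a, b, g] = (eye[a, g] * (eye[a, b] - x[b]) - x[a] * eye[b, g]) * s[b]
contraction = np.einsum('abg,gb->a', dM, M)
closed_form = x * ((1 - 2 * x) * s**2 - np.dot(x, (1 - 2 * x) * s**2))
print(np.allclose(contraction, closed_form))  # True
```

Since both checks are exact algebraic identities, agreement holds to machine precision regardless of the sampled state.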
Appendix A. Auxiliary results from stochastic analysis
In this appendix, we state and prove two auxiliary results from stochastic analysisthat were used throughout the paper. Lemma A.1 is an asymptotic growth boundfor Wiener processes relying on the law of the iterated logarithm, while Lemma A.2is a calculation of the probability that a Wiener process starting at the origin hitsthe line a + bt in finite time. Both lemmas appear in a similar context in Bravoand Mertikopoulos (2014); we provide a proof here only for completeness and easeof reference. Lemma A.1.
Let $W(t) = (W_1(t), \dots, W_n(t))$, $t \ge 0$, be an $n$-dimensional Wiener process and let $Z(t)$ be a bounded, continuous process in $\mathbb{R}^n$. Then:
$$f(t) + \int_0^t Z(s)\cdot dW(s) \sim f(t) \quad\text{as } t \to \infty \ \text{(a.s.)}, \tag{A.1}$$
for any function $f\colon [0,\infty) \to \mathbb{R}$ such that $\lim_{t\to\infty} (t \log\log t)^{-1/2} f(t) = +\infty$.

Proof. Let $\xi(t) = \int_0^t Z(s)\cdot dW(s) = \sum_{i=1}^n \int_0^t Z_i(s)\,dW_i(s)$. Then, the quadratic variation $\rho = [\xi, \xi]$ of $\xi$ satisfies:
$$d[\xi,\xi] = d\xi\cdot d\xi = \sum_{i,j=1}^n Z_i Z_j \delta_{ij}\,dt \le M\,dt, \tag{A.2}$$
where $M = \sup_{t\ge 0} \lVert Z(t)\rVert^2 < +\infty$ (recall that $Z(t)$ is bounded by assumption). On the other hand, by the time-change theorem for martingales (Øksendal, 2007,
Corollary 8.5.4), there exists a Wiener process $\widetilde W(t)$ such that $\xi(t) = \widetilde W(\rho(t))$, and hence:
$$\frac{f(t) + \xi(t)}{f(t)} = 1 + \frac{\widetilde W(\rho(t))}{f(t)}. \tag{A.3}$$
Obviously, if $\lim_{t\to\infty} \rho(t) \equiv \rho(\infty) < +\infty$, $\widetilde W(\rho(\infty))$ is normally distributed, so $\widetilde W(\rho(t))/f(t) \to 0$ and there is nothing to show. Otherwise, if $\lim_{t\to\infty}\rho(t) = +\infty$, the quadratic variation bound (A.2) and the law of the iterated logarithm yield:
$$\frac{\big|\widetilde W(\rho(t))\big|}{f(t)} \le \frac{\big|\widetilde W(\rho(t))\big|}{\sqrt{2\rho(t)\log\log\rho(t)}} \times \frac{\sqrt{2Mt \log\log Mt}}{f(t)} \to 0 \quad\text{as } t\to\infty, \tag{A.4}$$
and our claim follows. □

Lemma A.2.
Let $W(t)$ be a standard one-dimensional Wiener process and consider the hitting time $\tau_{a,b} = \inf\{t > 0 : W(t) = a + bt\}$, $a, b \in \mathbb{R}$. Then:
$$\mathbb{P}(\tau_{a,b} < \infty) = \exp(-ab - |ab|). \tag{A.5}$$

Proof.
Let $\bar W(t) = W(t) - bt$ so that $\tau_{a,b} = \inf\{t > 0 : \bar W(t) = a\}$. By Girsanov's theorem (see e.g. Øksendal, 2007, Chap. 8), there exists a probability measure $Q$ such that a) $\bar W$ is a Brownian motion with respect to $Q$; and b) the Radon–Nikodym derivative of $Q$ with respect to $P$ satisfies
$$\frac{dQ}{dP}\bigg|_{\mathcal F_t} = \exp\big(-b^2 t/2 + bW(t)\big) = \exp\big(b^2 t/2 + b\bar W(t)\big), \tag{A.6}$$
where $\mathcal F_t$ denotes the natural filtration of $W(t)$. We then get
$$\mathbb{P}(\tau_{a,b} < t) = \mathbb{E}_P[\mathbb{1}(\tau_{a,b} < t)] = \mathbb{E}_Q\big[\mathbb{1}(\tau_{a,b} < t)\cdot \exp(-b^2 t/2 - b\bar W(t))\big] = \mathbb{E}_Q\big[\mathbb{1}(\tau_{a,b} < t)\cdot \exp(-b^2\tau_{a,b}/2 - b\bar W(\tau_{a,b}))\big] = \exp(-ab)\,\mathbb{E}_Q\big[\mathbb{1}(\tau_{a,b} < t)\cdot \exp(-b^2\tau_{a,b}/2)\big], \tag{A.7}$$
and hence:
$$\mathbb{P}(\tau_{a,b} < \infty) = \lim_{t\to\infty}\mathbb{P}(\tau_{a,b} < t) = \exp(-ab)\,\mathbb{E}_Q\big[\exp(-b^2\tau_{a,b}/2)\big] = \exp(-ab - |ab|), \tag{A.8}$$
where, in the last step, we used the expression $\mathbb{E}[\exp(-\lambda\tau_a)] = \exp(-|a|\sqrt{2\lambda})$ for the Laplace transform of the Brownian hitting time $\tau_a = \inf\{t > 0 : W(t) = a\}$ (Karatzas and Shreve, 1998). □

References
Akin, E., 1980: Domination or equilibrium. Mathematical Biosciences, 50(3-4), 239–250.
Bergstrom, T. C., 2014: On the evolution of hoarding, risk-taking, and wealth distribution in nonhuman and human populations. Proceedings of the National Academy of Sciences of the USA, 111(3), 10860–10867.
Bertsekas, D. P. and R. Gallager, 1992: Data Networks. 2nd ed., Prentice Hall, Englewood Cliffs, NJ.
Björnerstedt, J. and J. W. Weibull, 1996: Nash equilibrium and evolution by imitation. The Rational Foundations of Economic Behavior, K. J. Arrow, E. Colombatto, M. Perlman, and C. Schmidt, Eds., St. Martin's Press, New York, NY, 155–181.
Bravo, M. and P. Mertikopoulos, 2014: On the robustness of learning in games with stochastically perturbed payoff observations. http://arxiv.org/abs/1412.6565.
Cabrales, A., 2000: Stochastic replicator dynamics. International Economic Review, 41(2), 451–481.
Fudenberg, D. and C. Harris, 1992: Evolutionary dynamics with aggregate shocks. Journal of Economic Theory, 57(2), 420–441.
Hofbauer, J. and L. A. Imhof, 2009: Time averages, recurrence and transience in the stochastic replicator dynamics. The Annals of Applied Probability, 19(4), 1347–1368.
Hofbauer, J. and K. Sigmund, 2003: Evolutionary game dynamics. Bulletin of the American Mathematical Society, 40(4), 479–519.
Hofbauer, J., S. Sorin, and Y. Viossat, 2009: Time average replicator and best reply dynamics. Mathematics of Operations Research, 34(2), 263–269.
Imhof, L. A., 2005: The long-run behavior of the stochastic replicator dynamics. The Annals of Applied Probability, 15(1B), 1019–1045.
Karatzas, I. and S. E. Shreve, 1998: Brownian Motion and Stochastic Calculus. Springer-Verlag, Berlin.
Khasminskii, R. Z., 2012: Stochastic Stability of Differential Equations. 2nd ed., No. 66 in Stochastic Modelling and Applied Probability, Springer-Verlag, Berlin.
Khasminskii, R. Z. and N. Potsepun, 2006: On the replicator dynamics behavior under Stratonovich type random perturbations. Stochastics and Dynamics, 197–211.
Kuo, H.-H., 2006: Introduction to Stochastic Integration. Springer, Berlin.
Laraki, R. and P. Mertikopoulos, 2013: Higher order game dynamics. Journal of Economic Theory, 148(6), 2666–2695.
Littlestone, N. and M. K. Warmuth, 1994: The weighted majority algorithm. Information and Computation, 108(2), 212–261.
Mertikopoulos, P. and A. L. Moustakas, 2009: Learning in the presence of noise. GameNets '09: Proceedings of the 1st International Conference on Game Theory for Networks.
Mertikopoulos, P. and A. L. Moustakas, 2010: The emergence of rational behavior in the presence of stochastic perturbations. The Annals of Applied Probability, 20(4), 1359–1388.
Nachbar, J. H., 1990: Evolutionary selection dynamics in games. International Journal of Game Theory, 59–89.
Øksendal, B., 2007: Stochastic Differential Equations. 6th ed., Springer-Verlag, Berlin.
Robson, A. J. and L. Samuelson, 2011: The evolutionary foundations of preferences. Handbook of Social Economics, J. Benhabib, A. Bisin, and M. O. Jackson, Eds., North-Holland, Vol. 1, chap. 7, 221–310.
Rustichini, A., 1999: Optimal properties of stimulus-response learning models. Games and Economic Behavior, 230–244.
Samuelson, L. and J. Zhang, 1992: Evolutionary stability in asymmetric games. Journal of Economic Theory, 363–391.
Sandholm, W. H., 2010: Population Games and Evolutionary Dynamics. Economic Learning and Social Evolution, MIT Press, Cambridge, MA.
Schlag, K. H., 1998: Why imitate, and if so, how? A boundedly rational approach to multi-armed bandits. Journal of Economic Theory, 78(1), 130–156.
Sorin, S., 2009: Exponential weight algorithm in continuous time. Mathematical Programming, 116(1), 513–528.
Taylor, P. D. and L. B. Jonker, 1978: Evolutionary stable strategies and game dynamics. Mathematical Biosciences, 40(1-2), 145–156.
van Kampen, N. G., 1981: Itô versus Stratonovich. Journal of Statistical Physics, 24(1), 175–187.
Vlasic, A., 2012: Long-run analysis of the stochastic replicator dynamics in the presence of random jumps. http://arxiv.org/abs/1206.0344.
Vovk, V. G., 1990: Aggregating strategies. COLT '90: Proceedings of the 3rd Workshop on Computational Learning Theory, 371–383.
Weibull, J. W., 1995: Evolutionary Game Theory. MIT Press, Cambridge, MA.

(P. Mertikopoulos) CNRS (French National Center for Scientific Research), LIG, F-38000 Grenoble, France, and Univ. Grenoble Alpes, LIG, F-38000 Grenoble, France
E-mail address : [email protected] URL : http://mescal.imag.fr/membres/panayotis.mertikopoulos (Y. Viossat) PSL, Université Paris–Dauphine, CEREMADE UMR7534, Place duMaréchal de Lattre de Tassigny, 75775 Paris, France
E-mail address ::