Game of Variable Contributions to the Common Good under Uncertainty
H. Dharma Kwon ∗ March 31, 2019
Abstract
We consider a stochastic game of contribution to the common good in which the players have continuous control over the degree of contribution, and we examine the gradualism arising from the free rider effect. This game belongs to the class of variable concession games which generalize wars of attrition. Previously known examples of variable concession games in the literature yield equilibria characterized by singular control strategies without any delay of concession. However, these no-delay equilibria are in contrast to mixed strategy equilibria of canonical wars of attrition in which each player delays concession by a randomized time. We find that a variable contribution game with a single state variable, which extends the Nerlove-Arrow model, possesses an equilibrium characterized by regular control strategies that result in a gradual concession. This equilibrium naturally generalizes the mixed strategy equilibria from the canonical wars of attrition. Stochasticity of the problem accentuates the qualitative difference between a singular control solution and a regular control equilibrium solution. We also find that asymmetry between the players can mitigate the inefficiency caused by the gradualism.

Keywords: Nerlove-Arrow model, war of attrition, stochastic control game, free rider problem, gradualism

∗ Gies College of Business, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820; Kellogg School of Management, Northwestern University, Evanston, Illinois 60208. Email: [email protected]

1 Introduction
Many business or public policy decisions concern the free rider problem when contributing to a stock of common good (Hardin, 1968). It is well known that a free rider problem induces a wait-and-see approach among the individuals who are in a position to contribute to the common good (Tirole, 2017). The wait-and-see approach in turn results in underinvestment in the common good. Hence, it is important for decision makers and social planners to understand the game-theoretic implications of the free rider problem involving the common good. Industry examples of a free rider problem with the common good can be found in the context of generic advertisement for commodities. For instance, the advertising expenditures by Florida orange juice advertising programs not only benefit the Florida orange juice industry, but they also benefit non-Florida orange juice importers (Lee and Fairchild, 1988). In another example, it has been shown that a salmon promotion program conducted by Norway has benefited its international competitors, too (Kinnucan and Myrland, 2003). In these examples, the advertising expenditures of one agent contribute to the stock of the product's overall goodwill, "which summarizes the effects of current and past advertising outlays on demand" (Nerlove and Arrow, 1962). The stock of goodwill is the common good in the context of generic advertising because it benefits other agents even if they do not contribute to it. In this paper, we examine the game of variable contribution to the common good where the stock of common good evolves stochastically. In particular, we obtain the free rider effect on its Markov perfect equilibrium (MPE) and compare and contrast it to other games of concession.
We address the question of whether the equilibrium suffers from the gradualism of the players' contributions to the common good and, if so, whether the inefficiency arising from the gradualism can be mitigated.

One objective of the paper is to fill the gaps in the equilibrium characteristics of variable concession games in which the cost is linear in the contribution. The problem of contribution to common good belongs to the class of variable concession games in which the players can control the degree of concession. This class of games constitutes a significant generalization of the war of attrition. In the canonical war of attrition, each player can either continue the game or concede completely, and it typically yields a mixed strategy equilibrium in which the players delay their concession by a randomized time. The central question of this paper concerns the characteristics of the variable concession games. We can imagine three possibilities of equilibrium strategies: (1) singular control (lump-sum contribution) strategies without time delay, (2) singular control strategies with time delay, and (3) regular control strategies that lead to an equilibrium characterized by gradualism. In the current literature, the variable concession games thus far have resulted in type (1) equilibria with singular control strategies of immediate lump-sum concession. Type (2), if it exists, is closest to the mixed strategy equilibrium of the war of attrition, but it has not been found in the literature or in this paper. The other natural generalization of the mixed strategy time delay equilibrium to variable concession games is type (3), which has yet to be found in the current literature on variable concession games with linear cost.
Our paper shows that type (3) is found in a very simple game-theoretic and stochastic extension of the Nerlove-Arrow model of goodwill stock (Nerlove and Arrow, 1962; Sethi, 1977; Lon and Zervos, 2011).

In our model, two players are considering irreversible and costly contribution to the stock of common good. Each player can contribute any amount to the common good at any point in time, and the common good increases the flow profit to both players. The stock of common good evolves stochastically, and it tends to decline in time on average unless someone contributes to it, just as the stock of goodwill for a product depreciates in time without advertisement (Nerlove and Arrow, 1962). In this game, the strategy of each player is represented by the dynamic path of its cumulative contribution. We formulate the problem as a stochastic control game and utilize the well-established stochastic control theory. In order to find the equilibrium, we need to obtain the best responses, so we establish the verification theorem for the best response stochastic control.

This paper has three main contributions. First, we show that the model that we consider has a gradualist equilibrium characterized by regular control. This result is in contrast to the typical control solution: in a control problem with a linear cost structure, the single decision maker solution is characterized by singular control rather than regular control. Second, we find that stochasticity and asymmetry have significant impact on the equilibrium characteristics. In the deterministic game, both the singular control solution and the equilibrium solution exhibit a stable steady state, so an outsider may not be able to tell the difference between the two. In contrast, in the stochastic case, the two solutions exhibit markedly different behavior, and the difference is easier to observe in the empirical sense.
We also find that asymmetry between the players destabilizes the gradualist equilibrium, and the outcome is an asymmetric equilibrium with a singular control strategy adopted by at least one of the players. Hence, asymmetry can mitigate the inefficiency of the gradualist equilibrium. Third, the paper provides a mathematical framework to obtain an MPE of a stochastic game of variable concession involving both singular and regular control.

Although there are many equilibrium solution concepts, we limit our attention to MPE (Maskin and Tirole, 2001). An MPE is a subgame perfect equilibrium in which the players' actions are determined by the current value, but not by the past history, of the economically relevant state variable, and hence it is a key notion for analyzing a game.

Cooperative equilibrium concepts are beyond the scope of this paper. Coordinated plans of action do produce an efficient outcome, which will change the form of the solution; for instance, the singular control boundary will change. Although cooperation does happen between contributors to common good, it often requires prior coordination or bargaining, and we can still consider the non-cooperative MPE a baseline solution prior to coordination. For instance, in a Nash bargaining solution (Nash, 1950), the non-cooperative MPE outcome can serve as the disagreement point, and therefore it is still a meaningful reference point.

The paper contributes to the literature on variable concession games, which are an extension of the war of attrition (Maynard Smith, 1974). Typical attrition games under complete information possess mixed strategy equilibria with random time delays, both in the deterministic case (Hendricks et al., 1988) and in the stochastic case (Steg, 2015; Georgiadis et al., 2019). In contrast, the known examples of the game of variable concession exhibit singular control equilibria with no time delay.
One example is Cournot competition under declining demand when the firms can reduce the production capacity at a variable cost (Ghemawat and Nalebuff, 1990). The equilibrium strategy is to immediately reduce the capacity to the myopic Cournot equilibrium level through singular control. The stochastic generalization of the Cournot model also exhibits similar characteristics (Steg, 2012).

The paper also contributes to the literature on games of contribution to public goods. Fershtman and Nitzan (1991) examine a dynamic game of voluntary contribution to public goods. In their model, players continuously contribute to the stock of public goods over time. They obtain an equilibrium using a differential game approach and demonstrate that the free riding problem is acute without commitment. Their model is similar to ours, but they model a situation with costs that grow quadratically with the rate of contribution, so it is prohibitively costly to make a lump sum contribution. Since our model allows for a lump sum contribution due to the linear cost structure, the characteristics of the equilibria are very different, and it is difficult to compare their results to ours. Battaglini et al. (2014) examine a problem of dynamic free riding in which each individual allocates its endowment between private consumption and irreversible contribution to the public good. They study the implications of the irreversibility of their model and conclude that irreversibility can alleviate inefficiency of the equilibria. It is noteworthy that the equilibrium of their model involves lump sum contribution (singular control) strategies. Ferrari et al. (2017) examine a significantly generalized model with stochasticity to obtain its equilibria and study the effect of uncertainty and irreversibility of contribution to the public good. They also obtain equilibria characterized by lump sum contribution strategies.
In contrast to the literature on private consumption and contribution to the public good, our model does not incorporate consumption by the players.

Because our model assumes that the cost of contribution is linear in the magnitude of the improvement in the common good, we formulate it as a game-theoretic extension of monotone follower singular control problems with a single-dimensional state variable. In a similar vein, Lon and Zervos (2011) apply the singular control framework to the Nerlove-Arrow model of expenditure in the stock of goodwill. Recently, some work on game-theoretic study of singular control problems has emerged. Steg (2012) examines Cournot competition that leads to a singular control equilibrium. Kwon and Zhang (2015) examine a singular control game in the context of a market share competition in which a player's control is to negate his opponent's payoff. Ferrari et al. (2017) also analyze a model that incorporates game-theoretic singular control, but the model has a more complex structure as the players make consumption and contribution decisions at the same time.

The paper is organized as follows. In Section 2, we examine a game of variable contribution to the common good and show that it yields a regular control strategy equilibrium. In Section 3, we examine the impact of stochasticity and asymmetry between the players. In particular, we show that asymmetry eliminates the regular control equilibrium, thereby improving the efficiency. In Section 4, we discuss several aspects of the results that are worthy of note. In Section 5, we summarize the main results and implications of the paper and provide concluding remarks.
In this section, we present a game of variable contribution to the common good that results in aregular control equilibrium. We first present the model in Section 2.1, and then we examine thesingle decision maker case as a benchmark in Section 2.2. We construct the verification theorem forbest responses in Section 2.3 and obtain the regular control MPE in Section 2.4.
We consider a game between two players, each of whom receives a flow profit that depends on a common state variable. Either player can boost the common state variable at a cost by any amount at any point in time. The model is applicable to a number of industry examples. One example is a game between two manufacturers who share a common supplier. Each manufacturer can make a variable investment to boost the quality of the shared supplier, which in turn benefits the other manufacturer through spillover (Muthulingam and Agrawal, 2016; Kim et al., 2017). Another example is the game of irreversible and variable investment in the stock of goodwill (Nerlove and Arrow, 1962) through advertisement, such as in generic advertising on commodities (Lee and Fairchild, 1988; Kinnucan and Myrland, 2003).

We let the process Z = {Z_t : t ≥ 0} denote the stock of common good defined in the interval I = (a, b) ⊆ R on a filtered probability space (Ω, F, F_t, P) that satisfies the usual conditions. If I = R, for example, it is understood that a = −∞ and b = ∞. We assume that Z satisfies the following stochastic differential equation (SDE):

dZ_t = µ(Z_t) dt + σ(Z_t) dW_t + dξ^i_t + dξ^j_t,

where W = {W_t : t ≥ 0} is a Wiener process progressively measurable with respect to {F_t : t ≥ 0}. Here µ(·) is the drift term, which we interpret as the time-averaged rate of change of Z in the absence of control. In this paper, we assume µ(·) < 0 and σ(·) > 0. Each ξ^i = {ξ^i_t : t ≥ 0} is a non-decreasing càdlàg (right continuous with left limits) process controlled by player i and adapted to {F_t : t ≥ 0}. We interpret ξ^i_t as the cumulative contribution of player i to Z up to time t. Since each player i controls the process ξ^i, we say that ξ^i is player i's strategy, and ξ = (ξ^i, ξ^j) is the strategy profile. Throughout the paper, we let Ξ_i denote the set of all possible F_t-adapted control processes ξ^i.
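As a quick illustration (not from the paper), the state dynamics can be discretized with an Euler–Maruyama scheme, with the players' contribution increments added at each step. The constant drift and volatility functions, the function name, and all numerical values below are assumptions made for the sketch:

```python
import numpy as np

def simulate_state(mu, sigma, z0, dxi, dt=1e-3, seed=0):
    """Euler-Maruyama discretization of
    dZ_t = mu(Z_t) dt + sigma(Z_t) dW_t + d(xi^i + xi^j)_t,
    where dxi[n] >= 0 is the players' total contribution in step n."""
    rng = np.random.default_rng(seed)
    z = np.empty(len(dxi) + 1)
    z[0] = z0
    for n in range(len(dxi)):
        dw = rng.normal(0.0, np.sqrt(dt))
        z[n + 1] = z[n] + mu(z[n]) * dt + sigma(z[n]) * dw + dxi[n]
    return z

# constant negative drift and constant volatility, as in the paper's examples
# (the particular values are assumed)
z = simulate_state(mu=lambda x: -1.0, sigma=lambda x: 1.0,
                   z0=0.0, dxi=np.zeros(1000))
```

In the game, `dxi` would itself be generated by the players' strategies as functions of the current state; here it is supplied exogenously just to show the dynamics.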
We remark that ξ^i is composed of a continuous process and a discontinuous process as follows:

ξ^i_t − ξ^i_{t_0} = ∫_{t_0}^t dξ^{ci}_s + Σ_{s ∈ [t_0, t]} Δξ^i_s,

where ξ^{ci} is the continuous part of ξ^i, and Δξ^i_t = ξ^i_t − ξ^i_{t−} is the instantaneous jump in ξ^i at time t. Similarly, we can decompose the process Z_t into a continuous part Z^c_t and a discontinuous part ΔZ_t = Z_t − Z_{t−} = Δξ^i_t + Δξ^j_t.

Given a strategy profile ξ, player i's payoff is given by the following function:

V_i(x; ξ) = E_x[ ∫_0^∞ e^{−rt} π_i(Z_t) dt − ∫_0^∞ e^{−rt} k_i dξ^i_t ].

Here E_x[·] = E[· | Z_0 = x] is the conditional expectation operator given the initial condition Z_0 = x. The integrand π_i(·) is a non-decreasing function that represents the profit flow for player i, and k_i > 0 is the constant marginal cost of contribution for player i. Lastly, r > 0 is the discount rate.

Let X = {X_t : t ≥ 0} denote the uncontrolled process which satisfies dX_t = µ(X_t) dt + σ(X_t) dW_t.

Assumption 1 (i) µ(·) and σ(·) are Lipschitz continuous functions satisfying |µ(x)| + |σ(x)| ≤ δ(1 + |x|) for some constant δ > 0. (ii) {e^{−rτ}(X_τ)^− : τ is a stopping time, τ < ∞} is uniformly integrable for any initial value X_0 = x. Furthermore, lim_{t→∞} E_x[e^{−rt}(X_t)^−] = 0.

Assumption 1 (i) implies that the uncontrolled process X has a unique strong solution to the SDE. It also implies that σ(·) is locally bounded; this will be useful when we apply Dynkin's formula to the payoff function because the stochastic integral involving σ(Z_t) dW_t is a local martingale which possesses convenient properties (Chapter IV, Revuz and Yor, 1999). Assumption 1 (ii) ensures that the limiting behaviors of the process X are well-defined so that we can construct a verification theorem in Section 2.3.

Below we let C(I) denote the set of continuous functions defined on I.

Assumption 2 π_i(·) ∈ C(I) is strictly increasing and bounded from above, i.e., lim_{x→b} π_i(x) < π_M for some positive constant π_M.
Furthermore, it satisfies the absolute integrability condition E_x[ ∫_0^∞ |e^{−rt} π_i(X_t)| dt ] < ∞ for the uncontrolled process X.

Assumption 2 ensures that the payoff V_i(x; ξ) is well-defined and that the function

(Rπ_i)(x) := E_x[ ∫_0^∞ e^{−rt} π_i(X_t) dt ]   (1)

exists. The function (Rπ_i)(·) has the meaning of the payoff from perpetually keeping an uncontrolled process X. Later we establish that (Rπ_i)(·) is an element of the payoff function.

Next, we define the r-excessive characteristic operator (Alvarez, 2003):

A := (1/2) σ(x)² ∂²_x + µ(x) ∂_x − r.   (2)

We let ψ(·) and φ(·) respectively denote the two linearly independent increasing and decreasing fundamental solutions to the differential equation Aψ(x) = Aφ(x) = 0. Note that (Rπ_i)(·) satisfies the differential equation A(Rπ_i)(x) + π_i(x) = 0. We also define

q_i(x) = π_i(x) + A(k_i x) = π_i(x) + [µ(x) − rx] k_i,   (3)

(Rq_i)(x) = E_x[ ∫_0^∞ e^{−rt} q_i(X_t) dt ],

and assume the following properties of (Rπ_i)(·) and q_i(·):

Assumption 3 (i) (Rπ_i)(·) satisfies lim_{x→a} (Rπ_i)(x) < 0 and lim_{x→a} (Rπ_i)(x)/φ(x) = 0. (ii) There exists some x*_i ∈ I such that q_i(x) is strictly increasing for x < x*_i and strictly decreasing for x > x*_i. (iii) lim_{x→a} q_i(x) = −∞, lim_{x→a} [q_i(x)/ψ′(x)] exp( ∫^x 2µ(y)/σ(y)² dy ) = 0, lim_{x→b} [q_i(x)/φ′(x)] exp( ∫^x 2µ(y)/σ(y)² dy ) = 0, and lim_{x→b} (Rq_i)′(x)/φ′(x) > 0.

Assumption 3 serves as the sufficient condition for the unique optimal control solution to exist for the model examined in Section 2.2. Specifically, in the single decision maker's problem, the assumptions drive a solution with a singular control region of the form (a, θ) for some threshold θ.
Assumption 3 (i) ensures that the flow profit function π_i(·) is negative and well-behaved near a. To gain an intuitive understanding of the assumptions regarding q_i(·), we consider a special case of constant µ(x) = µ, in which case q′_i(x) = π′_i(x) − r k_i. Then Assumption 3 (ii) implies that π′_i(x) > r k_i for x < x*_i and π′_i(x) < r k_i for x > x*_i. This implies that it is optimal to boost ξ^i if and only if x < x*_i. Thus, the players have an incentive to boost ξ^i only if X_t falls below a threshold. Assumption 3 (iii) ensures that a unique threshold for boosting ξ^i exists for the model examined in Section 2.2.

In this subsection, we review the single decision maker problem as a benchmark and provide the optimal solution. This class of problems is extensively examined in the literature (Alvarez, 2001; Øksendal and Sulem, 2005; Lon and Zervos, 2011), but we reproduce it here because its solution will be utilized in the equilibrium solution of the game in the remainder of the paper.

Since there is only one decision maker, we drop the player index i for convenience throughout Section 2.2. The value function associated with a control policy ξ = {ξ_t : t ≥ 0} is given by

V(x; ξ) = E_x[ ∫_0^∞ e^{−rt} π(Z_t) dt − ∫_0^∞ e^{−rt} k dξ_t ].

The objective of the decision maker is to maximize V(·; ξ) with respect to ξ. This class of problems is known as the singular stochastic control monotone follower problems.

We first provide the sufficient condition (the optimality condition) for the optimal control solution. We let C^n(I) denote the set of functions defined on I that are n times continuously differentiable. Suppose that V(·) ∈ C²(I) satisfies the following conditions: (i) AV(x) + π(x) ≤ 0 and V′(x) − k ≤ 0 for all x ∈ I, and (ii) [AV(x) + π(x)][V′(x) − k] = 0 for all x ∈ I. Then V(x) coincides with the optimal solution sup_{ξ∈Ξ} V(x; ξ).
The proof of this sufficient condition is provided, for example, by Øksendal and Sulem (2005) and Lon and Zervos (2011), and so we will not reproduce it here.

Lemma 1
Under Assumptions 1–3, there exist a threshold θ ∈ I and a coefficient A such that the optimal solution V(·) ∈ C²(I) is given as follows:

V(x) = k(x − θ) + V(θ) for x < θ, and V(x) = (Rπ)(x) + A φ(x) for x ≥ θ.   (4)

Next, we provide the intuition behind the solution through a numerical example.

Example 1: We consider a problem with constant µ(x) = µ < 0 and σ(x) = σ > 0, and with π(x) = −exp(νx), where ν < 0 and (1/2)σ²ν² + µν − r <
0. Then it is straightforward to verify that φ(x) = exp(γ₋x) and ψ(x) = exp(γ₊x), where

γ± = (1/σ²)( −µ ± √(µ² + 2rσ²) ),

and that

(Rπ)(x) = 1/r − exp(νx)/β, where β = r − µν − (1/2)σ²ν².

Assumptions 1–3 are satisfied in this example, so that Lemma 1 applies. Furthermore, the optimal threshold θ and the coefficient A are given by

θ = (1/ν) ln[ kβγ₋ / (ν(ν − γ₋)) ], A = [k − (Rπ)′(θ)] / φ′(θ).

For the numerical illustration in Fig. 1, we set µ = −1, σ = 1, r = 1, and k =
1. In this case, the threshold θ is negative. Fig. 1 shows a simulated sample path of Z and ξ and the optimal value function V(x).

The optimal policy associated with the value function (4) is a singular control policy: to boost Z_t instantaneously up to θ whenever Z_t falls below θ. If the initial value Z_0 is below θ, then Z_t discontinuously jumps to θ at time 0, after which Z_t is continuous. The threshold θ functions as a reflecting boundary, as shown in Fig. 1. In the region below θ, the value function is linear in x with the slope k, as illustrated in the figure as well as expressed in (4). This is because (a, θ) is the singular control region in which it costs exactly k to boost Z by a unit.

Figure 1: Single decision maker solution. The dotted lines indicate the threshold θ.

Next, we return to the game-theoretic model introduced in the beginning of the section. Our goal is to construct the verification theorem for best responses, which will then be used to construct MPEs in the remainder of the paper.

In a conventional solution to the singular control problem as in Section 2.2, the optimal control process is decomposed as dξ^i_t = dξ^{li}_t + Δξ^i_t, where Δξ^i_t = ξ^i_t − ξ^i_{t−} represents the discontinuous evolution of the process ξ^i while ξ^{li} is a continuous process like a local time (Protter, 2003). The process ξ^{li}_t is not absolutely continuous with respect to Lebesgue measure and cannot be represented as an integral ∫_0^t u^i_s ds for any process u^i = {u^i_t : t ≥ 0} (Karatzas, 1983); for example, the sample path of ξ in Figure 1 does not possess well-defined time derivatives when ξ_t increases in time. Both components ξ^{li}_t and Δξ^i_t constitute the singular part of ξ^i_t.
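The reflecting behavior of the singular control policy, and the decomposition of ξ into an initial jump plus a local-time-like part, can be illustrated with a simple Euler scheme that projects the state back up to the boundary and accumulates the projection amounts into ξ. This is only an illustrative sketch; the function name and the numerical values (including the barrier level) are assumptions:

```python
import numpy as np

def simulate_singular(mu, sigma, z0, barrier, T=10.0, dt=1e-3, seed=1):
    """Euler scheme in which the state is pushed up to `barrier` whenever
    it would fall below it; the push sizes accumulate into the control xi."""
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    z = np.empty(n + 1)
    xi = np.zeros(n + 1)
    z[0] = max(z0, barrier)          # initial jump if z0 < barrier
    xi[0] = z[0] - z0
    for t in range(n):
        prop = z[t] + mu * dt + sigma * np.sqrt(dt) * rng.normal()
        push = max(barrier - prop, 0.0)   # singular part: only at the boundary
        z[t + 1] = prop + push
        xi[t + 1] = xi[t] + push
    return z, xi

# barrier value assumed for illustration (plays the role of theta)
z, xi = simulate_singular(mu=-1.0, sigma=1.0, z0=0.0, barrier=-1.72)
```

By construction the simulated path never falls below the barrier, and ξ is non-decreasing and increases only while Z sits at the barrier, mirroring the local-time description above.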
In general, however, a control process must also encompass a regular control process as follows:

dξ^i_t = u^i_t dt + dξ^{li}_t + Δξ^i_t,

where u^i_t ≥ 0 is adapted to (F_t). To apply the conventional stochastic control theory, we need to define the feasible space of u^i. More specifically, in order for the SDE dZ_t = µ(Z_t) dt + σ(Z_t) dW_t + Σ_{i=1,2} dξ^i_t to have a unique strong solution, we need to limit u^i within the class of u_i(t, x) that satisfies the two following conditions: (1) u_i(t, x) is Lipschitz continuous in x, and (2) |u_i(t, x)| ≤ δ(1 + |x|) for some constant δ. Let U be the set of functions u : (0, ∞) × I → R that satisfy these two conditions. Then we let Σ_i denote the set of F_t-adapted processes ξ^i_t that satisfy dξ^i_t = u_i(t, Z_t) dt + dξ^{li}_t + Δξ^i_t for some u_i ∈ U. We remark that if ξ^i ∈ Σ_i, then the SDE of Z,

dZ_t = [µ(Z_t) + Σ_{i=1,2} u_i(t, Z_t)] dt + σ(Z_t) dW_t + Σ_{i=1,2} (dξ^{li}_t + Δξ^i_t),

satisfies the sufficient condition for possessing a unique strong solution because u_i ∈ U. Note also that Σ_i is a proper subset of the feasible strategy space Ξ_i, which is the set of all possible F_t-adapted control processes ξ^i_t.

Let M_i be the set of player i's Markov control strategies ξ^i which depend only on the current value of Z_t. It means that ξ^i_t satisfies dξ^i_t = u_i(Z_t) dt + dξ^{li}_t + Δξ^i_t, where the singular control region is given as a subset of I. For instance, if the singular control region of player i is [α, β], then whenever Z_{t−} ∈ [α, β], ξ^i undergoes a jump Δξ^i_t = β − Z_{t−}, i.e., player i boosts Z up to β. Furthermore, dξ^{li}_t > 0 only when Z_t hits β.
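For concreteness, the closed-form objects of Example 1, whose threshold θ will reappear as the boundary of the control region in the equilibrium of Section 2.4, can be checked numerically. The parameter values below are assumed for illustration (µ = −1, σ = r = k = 1, and ν = −0.5, with ν chosen so that ½σ²ν² + µν − r < 0):

```python
import numpy as np

# assumed parameters for Example 1
mu, sigma, r, k, nu = -1.0, 1.0, 1.0, 1.0, -0.5

# gamma_± are the roots of (1/2) sigma^2 g^2 + mu g - r = 0,
# giving psi(x) = exp(gamma_+ x) and phi(x) = exp(gamma_- x)
disc = np.sqrt(mu**2 + 2.0 * r * sigma**2)
gam_plus = (-mu + disc) / sigma**2
gam_minus = (-mu - disc) / sigma**2

beta = r - mu * nu - 0.5 * sigma**2 * nu**2      # (R pi)(x) = 1/r - exp(nu x)/beta
theta = np.log(k * beta * gam_minus / (nu * (nu - gam_minus))) / nu

def Rpi_d1(x):   # (R pi)'(x)
    return -nu * np.exp(nu * x) / beta

def Rpi_d2(x):   # (R pi)''(x)
    return -nu**2 * np.exp(nu * x) / beta

# coefficient A chosen so that V'(theta) = k (smooth pasting)
A_coef = (k - Rpi_d1(theta)) / (gam_minus * np.exp(gam_minus * theta))
```

A useful consistency check is that the threshold formula is equivalent to the super-contact condition V″(θ) = 0, which the test below verifies numerically.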
By definition, an MPE is a subgame perfect equilibrium ξ = (ξ^i, ξ^j) that belongs to M_i × M_j.

For the purpose of obtaining an MPE ξ ∈ M_i × M_j, we can safely focus on obtaining the best response strategies ξ′_i ∈ Ξ_i in response to the opponent's Markov strategy ξ^j ∈ M_j, i.e., to obtain ξ^i ∈ Ξ_i such that V_i(x; ξ^i, ξ^j) = sup_{ξ′_i ∈ Ξ_i} V_i(x; ξ′_i, ξ^j). If the best response ξ^i happens to belong to M_i, then ξ = (ξ^i, ξ^j) is an MPE. Note that we do not, however, look for ξ^i ∈ M_i such that V_i(x; ξ^i, ξ^j) = sup_{ξ′_i ∈ M_i} V_i(x; ξ′_i, ξ^j); because M_i ⊂ Ξ_i, there may be another ξ′_i ∈ Ξ_i such that V_i(x; ξ′_i, ξ^j) > V_i(x; ξ^i, ξ^j), in which case (ξ^i, ξ^j) is not a Nash equilibrium.

Since we assume a Markov control process ξ^j ∈ M_j, we can partition the interval I into regions of discontinuous ξ^j_t and continuous ξ^j_t. Let C_j denote the open subset of I in which ξ^j_t evolves continuously and non-singularly in time, and let D_j = I \ C_j denote the singular control region where dξ^{lj}_t > 0 or Δξ^j_t > 0.

Theorem 1
Suppose Assumptions 1–3 hold. Assume player j's Markov strategy ξ^j ∈ M_j ∩ Σ_j satisfies dξ^j_t = u_j(Z_t) dt + dξ^{lj}_t + Δξ^j_t for some function u_j(·) ≥ 0. Suppose that there exist a function U(·) on I and some Markov strategy ξ*_i ∈ M_i ∩ Σ_i that satisfy the SDE

dξ*_{it} = u*_i(Z_t) dt + dξ^{*l}_{it} + Δξ*_{it}   (5)

for some u*_i(·) ≥ 0 and the following conditions:

(i) U(·) ∈ C²(C_j) ∩ C¹(I), and U(·) is non-decreasing and bounded from above. Furthermore, U′(x) = 0 in the interior of D_j, and U′(Z_t) dξ^{lj}_t = 0 for all t, where Z is the state process that evolves under the strategy profile (ξ*_i, ξ^j).

(ii) There is a function Ũ(x) ∈ C¹(I) such that Ũ(x) = U(x) for all x ∈ C_j and Ũ′(x) is bounded for x ∈ D_j.

(iii) max{ AU(x) + π_i(x) + u_j(x)U′(x) + v_i U′(x) − v_i k_i, U′(x) − k_i } ≤ 0 for all x ∈ C_j and any arbitrary v_i ≥ 0.

(iv) Let D_i = {x ∈ I : Δξ*_{it} > 0 whenever x = Z_{t−}} ∩ C_j be player i's singular control region within C_j and C_i = {x ∈ I : Δξ*_{it} = 0 whenever x = Z_{t−}} ∩ C_j. Then AU(x) + π_i(x) + u_j(x)U′(x) + u*_i(x)U′(x) − u*_i(x) k_i = 0 for all x ∈ C_i and U′(x) = k_i for all x ∈ D_i.

Then ξ*_i is the best response to ξ^j amongst all control processes that belong to Σ_i, i.e., V_i(x; ξ*_i, ξ^j) = sup_{ξ_i ∈ Σ_i} V_i(x; ξ_i, ξ^j).

Remark: Strictly speaking, Theorem 1 does not give sufficient conditions for the best response among the whole strategy space Ξ_i; it gives sufficient conditions for the best response among the limited space Σ_i. However, the strategy profiles that we obtain in Theorem 2 and Proposition 1 are proper MPE. This result is obtained because even though Theorem 1 is used to obtain (ξ^1, ξ^2) ∈ Σ_1 × Σ_2 that satisfies V_i(x; ξ^i, ξ^j) = sup_{ξ′_i ∈ Σ_i} V_i(x; ξ′_i, ξ^j) for both i = 1,
2, we can show that

V_i(x; ξ^i, ξ^j) = sup_{ξ′_i ∈ Ξ_i} V_i(x; ξ′_i, ξ^j)

by the optimality condition for singular control given in Section 2.2. The same is true if i and j are interchanged. Therefore, (ξ^1, ξ^2) is a proper MPE. The detail is provided in the proof of Theorem 2.

Next, we construct a regular control MPE. We define a regular control MPE as one with control strategies of the following form for both players i = 1, 2:

dξ^i_t = u_i(Z_t) dt,

where ξ^i ∈ Σ_i. In this subsection, we assume that the two players are symmetric, i.e., π_i = π_j = π and k_i = k_j = k.

Theorem 2
Suppose Assumptions 1–3 hold. Let V(·) denote the solution (4) to the single decision maker problem. Also define a symmetric strategy profile ξ = (ξ^1, ξ^2) with the regular control process dξ^i_t = u(Z_t) dt, where

u(x) = −(1/k)[AV(x) + π(x)] for x < θ, and u(x) = 0 for x ≥ θ.   (6)

Lastly, suppose u(·) ∈ U. Then ξ is an MPE with a symmetric payoff V_i(x; ξ) = V_j(x; ξ) = V(x).

Theorem 2 obtains an MPE completely characterized by regular control. Intuitively, both players exert gradual control u(·) if and only if Z is sufficiently low (less than θ). The regular control MPE is reminiscent of the mixed strategy delay equilibrium of the canonical war of attrition, in which both players gradually concede in the probabilistic sense through a Poisson process. Thus, we may consider the regular control strategy equilibrium a generalization of the mixed strategy delay equilibrium.

One notable characteristic of the regular control MPE is that the threshold of the control region θ is identical to the threshold of the singular control region in the single decision maker solution. It implies that the free rider effect does not shift the control threshold; instead, it drives gradualism.

Example
2: We now consider the game-theoretic extension of Example 1. For analytical tractability, the profit flow π(·) is modified as follows:

π(x) = −exp(νx) for x ≥ x_c, and π(x) = π(x_c) + ρ(x − x_c) for x < x_c,

where ρ > rk and x_c < θ are parameters that we specify below. The form of π(·) is modified so that |π(x)| does not grow faster than |x| for sufficiently large |x|. The modification is necessary to ensure that u_i(x) in equilibrium does not grow faster than |x| for large |x|, preserving the well-known sufficient conditions for existence and uniqueness of the strong solution to the SDE of Z.

In this case, the function (Rπ)(x) has a more complicated form. Define

f_1(x) = 1/r − exp(νx)/β, f_2(x) = ρµ/r² + π(x_c)/r + (ρ/r)(x − x_c),

so that A f_1(x) + π(x) = 0 for x > x_c and A f_2(x) + π(x) = 0 for x < x_c. Then

(Rπ)(x) = f_1(x) + c_1 φ(x) for x ≥ x_c, and (Rπ)(x) = f_2(x) + c_2 ψ(x) for x < x_c,

where c_1 and c_2 are chosen so that (Rπ)(x) is continuous and differentiable at x = x_c. Note that this modification of π(·) does not alter the single decision maker solution of Section 2.2 if x_c is sufficiently low (lower than θ and x*) and if ρ > rk so that Assumption 3 is satisfied.

From (6), we have

u(x) = 0 for x ≥ θ, and u(x) = −[AV(x) + π(x)]/k = −µ + (r/k)V(θ) + r(x − θ) − (1/k)π(x) for x < θ.   (7)

Note that |u(x)| < δ(1 + |x|) for some constant δ, so the unique strong solution to the SDE for Z exists.

Figure 2 illustrates a simulated sample path of Z and ξ^i and the rate of contribution u(·) as a function of Z, where x_c = −
10 and the other model parameters are set as in Figure 1, including the threshold θ. Z freely fluctuates below θ in the regular control equilibrium. In contrast, Z in the single decision maker's solution shown in Figure 1 never falls below θ because it is subject to singular control at θ. The sample path of ξ^i_t = ∫_0^t u(Z_s) ds is smooth and differentiable with respect to time in the regular control equilibrium, in contrast to the sample path of ξ in Figure 1. The rate u(·) gradually grows as Z decreases; this is because the players have a stronger incentive to control Z for lower values of Z.

Figure 2: Symmetric regular control equilibrium. The dotted lines indicate the threshold θ.

Remark: The MPE obtained in Theorem 2 exists as long as the solution shown in Lemma 1 exists. Hence, Assumptions 1–3 do not need to hold as long as the single decision maker control problem yields a single-threshold singular control solution.

In this section, we examine the implications of stochasticity and asymmetry on the regular control equilibrium that we obtain in Section 2.4. This is an important inquiry because most realistic situations possess stochastic state variables and heterogeneity among the players. We first examine the case of deterministic Z in Section 3.1 and contrast its equilibrium to that of the stochastic game. We find that the contrast between a single decision maker solution and an equilibrium is starker in the stochastic case. Then we examine an asymmetric game in Section 3.2 and find that a regular control MPE does not exist in an asymmetric game. Instead, asymmetry leads to asymmetric equilibria with singular control strategies.

To examine the impact of stochasticity, we will simply discuss the deterministic case of Example 2 and contrast it to the stochastic case.
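Before turning to the deterministic case, a sample path like the one in Figure 2 can be sketched numerically. All parameter values below are assumptions, and we additionally use the continuity of u at θ, which lets V(θ) be eliminated from (7), leaving u(x) = r(x − θ) − [π(x) − π(θ)]/k below θ; the linear modification of π below x_c is ignored since the path stays well above an x_c as low as −10:

```python
import numpy as np

# assumed parameters, consistent with the sketches for Example 1
mu, sigma, r, k, nu = -1.0, 1.0, 1.0, 1.0, -0.5
theta = -1.722   # threshold value assumed (from Example 1's formula)

def pi_flow(x):
    return -np.exp(nu * x)       # profit flow for x >= x_c

def u(x):
    """Equilibrium contribution rate from (7); continuity of u at theta
    eliminates V(theta): u(x) = r(x-theta) - (pi(x)-pi(theta))/k below theta."""
    if x >= theta:
        return 0.0
    return r * (x - theta) - (pi_flow(x) - pi_flow(theta)) / k

def simulate_equilibrium(z0, T=10.0, dt=1e-3, seed=2):
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    z = np.empty(n + 1)
    z[0] = z0
    for t in range(n):
        drift = mu + 2.0 * u(z[t])   # both symmetric players contribute at rate u
        z[t + 1] = z[t] + drift * dt + sigma * np.sqrt(dt) * rng.normal()
    return z

z = simulate_equilibrium(z0=0.0)
```

Unlike the reflected path of the single decision maker, this path is free to wander below θ; the contribution rate merely pushes it back up on average.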
Example 3: We revisit the model of Example 2 and set σ = 0 and x_c = −∞. Then the characteristic operator A = μ∂_x − r is a first-order differential operator, and there is only one fundamental solution φ(x) = exp(rx/μ) that satisfies Aφ(x) = 0. Furthermore,

(Rπ)(x) = −exp(νx)/(r − μν),

where we assume μν < r. The critical threshold of control is given by θ = (1/ν) ln(kr/|ν|). As in Example 2, u(·) is given by (7). It is straightforward to verify that lim_{x↑θ} u(x) = 0, lim_{x↑θ} u′(x) = 0, and u″(x) > 0 for x < θ, which implies that u′(x) < 0 for x < θ. Furthermore, there exists η < θ at which μ + 2u(η) = 0, so that dZ_t/dt < 0 if Z_t > η and dZ_t/dt > 0 if Z_t < η.

Lastly, we can show that Z_t asymptotically approaches η if Z_0 ≠ η. Define y_t = Z_t − η. For y_t sufficiently close to 0, we have

dy_t/dt = μ + 2u(y_t + η) = μ + 2u(η) + 2u′(η)y_t + O(y_t²) = 2u′(η)y_t + O(y_t²),

where we used the fact that μ + 2u(η) = 0 and u′(η) < 0; thus, |y_t| = |y_0| exp[2u′(η)t + O(y_0)t] for large t, which implies lim_{t→∞} y_t = 0. Hence, η is the steady state of Z.

Figure 3: Deterministic solutions (equilibrium solution vs. single DM solution).

Figure 3 illustrates Z_t as a function of t in the equilibrium, in which Z_t approaches the steady state η, and in the single decision maker's solution, in which Z_t follows the horizontal line Z = θ as soon as Z_t hits θ. Thus, θ is the steady state of the single decision maker solution. Hence, in the deterministic model, the behavior of Z_t exhibits very little qualitative difference between the single decision maker solution and the equilibrium of the game. In particular, an outside observer will not be able to tell the difference between the two solutions except that the steady state values differ. The difference in the steady state value of Z can simply be attributed to the free rider effect: the players are less willing to contribute to the common good, so the steady state is lower.

In contrast, in the presence of stochasticity (σ > 0), the behavior of Z_t is markedly different, as illustrated in Figures 1 and 2: in the single decision maker's case, Z_t is reflected off of the threshold θ, whereas in the equilibrium of the game, Z_t can assume any value although it tends to fluctuate around θ. We conclude that stochasticity induces qualitatively different behaviors between the two solutions and hence renders the regular control equilibrium observable to an outsider.

Next, we examine the impact of asymmetry between the two players. We first show that a regular control MPE is absent in an asymmetric game in Section 3.2.1, and then we construct the simplest class of asymmetric equilibria in Section 3.2.2 and demonstrate that they exhibit the key characteristics of a singular control solution.
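The convergence of the deterministic dynamics of Example 3 to a steady state η can be checked by direct integration. The rate function and parameter values below are assumed toy choices that merely reproduce the qualitative features (u = 0 above θ, u′ < 0 below θ, and μ + 2u(η) = 0 at some η < θ).

```python
mu, theta = -0.5, -2.0        # assumed toy parameters (mu < 0)
a = 0.8                       # slope of the toy control rate

def u(x):
    # Toy rate: zero above theta, increasing as x decreases.
    return max(0.0, a * (theta - x))

# Steady state: mu + 2*u(eta) = 0  =>  eta = theta + mu / (2*a).
eta = theta + mu / (2 * a)

z, dt = -5.0, 0.001           # start below eta
for _ in range(40_000):
    z += (mu + 2 * u(z)) * dt  # deterministic dynamics dZ/dt = mu + 2 u(Z)

print(abs(z - eta))           # tiny: Z_t has converged to eta
```

The approach is monotone and exponential at rate 2u′(η) < 0, matching the linearization above.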
Suppose that θ_i ≠ θ_j because of asymmetry (k_i ≠ k_j and/or π_i ≠ π_j), where θ_i is the unique solution to the equation

[k_i − (Rπ_i)′(θ_i)]/φ′(θ_i) = −(Rπ_i)″(θ_i)/φ″(θ_i).

For analytical tractability, we make the following additional assumption:

Assumption 4: k_i μ′(x) + π_i′(x) − rk_i ≠ 0 for almost every x ∈ I.

Assumption 4 ensures that k_i μ(x) + π_i(x) − rk_i x is never constant within any given non-empty interval. It gives a non-trivial structure to the Hamilton-Jacobi-Bellman (HJB) equation of the payoff function.

The following theorem establishes that there is no payoff function V_i(·; ξ) ∈ C²(I) associated with a regular control MPE.

Theorem 3: If θ_i ≠ θ_j, then there is no regular control MPE ξ = (ξ_i, ξ_j) ∈ Σ_i × Σ_j such that V_i(·; ξ) ∈ C²(I) for i = 1, 2.

The implication of Theorem 3 is that an MPE of an asymmetric game must involve singular control. Hence, the natural next step is to explore the form of such a singular control MPE in an asymmetric game, which is the goal of Section 3.2.2.
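For a concrete instance of the threshold equation, take constant μ and σ and the exponential profit π(x) = −exp(νx) of Example 2 (ignoring the x_c modification). Then φ(x) = exp(β₋x), where β₋ is the negative root of (σ²/2)β² + μβ − r = 0, and (Rπ)(x) = −exp(νx)/(r − μν − σ²ν²/2). The parameter values below are assumptions for illustration; the root is found by plain bisection on the cross-multiplied form [k − (Rπ)′(θ)]φ″(θ) + (Rπ)″(θ)φ′(θ) = 0.

```python
import math

# Assumed illustrative parameters (not the paper's calibration).
mu, sigma, r, k, nu = -0.5, 1.0, 0.5, 1.0, -0.5

beta_m = (-mu - math.sqrt(mu**2 + 2 * r * sigma**2)) / sigma**2  # negative root
denom = r - mu * nu - 0.5 * sigma**2 * nu**2                     # assumed positive

def G(t):
    """Cross-multiplied threshold equation (k - (Rpi)'(t)) phi''(t) + (Rpi)''(t) phi'(t),
    with phi(x) = exp(beta_m x) and (Rpi)(x) = -exp(nu x)/denom."""
    dR = -nu * math.exp(nu * t) / denom
    ddR = -nu**2 * math.exp(nu * t) / denom
    dphi = beta_m * math.exp(beta_m * t)
    ddphi = beta_m**2 * math.exp(beta_m * t)
    return (k - dR) * ddphi + ddR * dphi

lo, hi = -10.0, 5.0          # G changes sign on this bracket
for _ in range(100):         # plain bisection
    mid = 0.5 * (lo + hi)
    if G(lo) * G(mid) <= 0:
        hi = mid
    else:
        lo = mid
theta = 0.5 * (lo + hi)
print(theta)                 # a root near -0.54 for these parameters
```

Running the same computation with player-specific (k_i, ν_i) produces the distinct thresholds θ_i ≠ θ_j that trigger Theorem 3.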
Remark 1: Strictly speaking, Theorem 3 does not necessarily preclude the possibility of an equilibrium with payoffs V_i(·; ξ) that do not belong to C²(I), such as a general viscosity solution. In this paper, we limit ourselves to equilibria that produce classical solutions only and defer the possibility of an equilibrium with non-C²(I) viscosity solutions to future endeavors. We also remark that we do not attempt to exclude the possibility of an equilibrium ξ that does not belong to Σ_i × Σ_j. This possibility is beyond the scope of the paper because of the issue of the existence of the unique solution to the SDE for Z_t; this is a common restriction in stochastic control theory (Øksendal and Sulem, 2005).

Remark 2: Even in an asymmetric game, it is possible to have some combinations of k_i and π_i(·) such that θ_1 = θ_2. In this case, a regular control MPE is possible because Theorem 2 is applicable.

As a corollary, we can also exclude the possibility of an MPE in which there is a common regular control region (α, β) for some α > a and a no-control region [β, b). (We remark that the proof of Theorem 3 establishes that the regular control regions of the two players must coincide.) We remain agnostic about what happens in the region (a, α], however, except that (a, α] should contain singular control regions D_1 and D_2 if they exist. Again, we focus on an MPE with classical solutions, i.e., V_i(·; ξ) ∈ C²(I \ D_j).

Corollary 1: There is no MPE ξ ∈ Σ_i × Σ_j such that V_i(·; ξ) ∈ C²(I \ D_j) with a common regular control region (α, β) for some α > a and a no-control region [β, b).

Its proof essentially follows that of Theorem 3, so it is omitted.

3.2.2 Non-regular Control Asymmetric MPE
By virtue of Theorem 3, an asymmetric game allows no regular control MPE, so any MPE must involve some singular control by at least one player. Furthermore, Corollary 1 implies that the only possible equilibria are the ones with a singular control region [α, β] of one player and a no-control region (β, b). Our goal is to present the simplest class of such equilibria and compare its characteristics to those of the regular control MPE. Although our main focus is on strictly asymmetric games, we will keep our discourse general and implicitly include the case of a symmetric game.

As a candidate for an asymmetric MPE, we consider a strategy profile ξ in which player 1 exerts singular control in the interval [θ′, θ_1] and regular control in (a, θ′) for some θ′ < θ_1, while player 2 exerts regular control in (a, θ′). Furthermore, we define the regular control rate functions

u_1(x) = −(1/k_2)[A U_2(x) + π_2(x)] for x < θ′, and 0 otherwise, (8)
u_2(x) = −(1/k_1)[A U_1(x) + π_1(x)] for x < θ′, and 0 otherwise, (9)

where U_1(·), U_2(·) are given by

U_1(x) = V_1(x),
U_2(x) = B_2 φ(x) + (Rπ_2)(x) for x ≥ θ_1 ;  U_2(θ_1) for x ∈ (θ′, θ_1) ;  U_2(θ_1) + (x − θ′)k_2 for x ≤ θ′.

Here V_1(·) denotes the solution to the single decision maker problem given by (4), where k, θ, and π are replaced by k_1, θ_1, and π_1, and

B_2 = −(Rπ_2)′(θ_1)/φ′(θ_1), (10)
U_2(θ_1) = B_2 φ(θ_1) + (Rπ_2)(θ_1).

Note that the functions U_1(·), U_2(·) are actually the payoff functions associated with the proposed strategy profile ξ. In the region (θ_1, b), both U_1(·) and U_2(·) assume the form of a continuation region without control. In player 1's singular control region [θ′, θ_1], we have U_1′(x) = k_1 because player 1 expends the cost of singular control, while U_2′(x) = 0. In (a, θ′), both players expend cost in such a way that U_i′(x) = k_i.

By Theorem 1, to confirm that the strategy profile ξ is an MPE, we only need to verify the following sufficient conditions:

U_2′(x) ≤ k_2 for all x ∈ I \ {θ_1, θ′}, (11)
A U_2(x) + π_2(x) ≤ 0 for all x ∈ I \ {θ_1, θ′}, (12)
u_1(·) ∈ U, u_2(·) ∈ U. (13)

Proposition 1: Suppose that Assumptions 1–3 hold and (11)–(13) are satisfied. Then the strategy profile ξ is an MPE with the payoff functions given by V_1(x; ξ) = U_1(x) and V_2(x; ξ) = U_2(x).

Example 4: Recall Example 2 and set the model parameters (μ, σ, r, k_1 = k_2, ρ = 2, and x_c = −10) just as in Figure 2. Then θ_1 = θ_2, and any θ′ < θ_1 satisfies conditions (11) and (12). Hence, there is a continuum of asymmetric equilibria parameterized by θ′ ∈ (−∞, θ_1). Figure 4 illustrates a numerical example of the case θ′ = −6. Note that even if we set k_2 < k_1, the qualitative features of this asymmetric equilibrium continue to hold.

Figure 4: Sample path of Z_t in an asymmetric equilibrium (player 1's singular control region marked).

In particular, Figure 4 illustrates the case where Z_0 < θ′. Before Z_t hits θ′ for the first time, Z_t is subject to regular control by both players. Upon reaching θ′, Z_t is subject to singular control by player 1 and is boosted up to θ_1. Once Z_t enters the region [θ_1, ∞), the threshold θ_1 takes the role of a reflecting boundary for Z_t because of player 1's singular control strategy. If Z_t ∈ [θ_1, ∞), the equilibrium reduces to the single decision maker solution.

As illustrated by the example, asymmetric non-regular equilibria exist even for the symmetric game where k_1 = k_2 and π_1 = π_2. However, if the two players are identical to each other, the players are likely to be drawn to the symmetric equilibrium. For instance, suppose that the equilibrium shown in Figure 4 is the outcome of the symmetric game. Then player 1 eventually ends up being the only one contributing to the common good every time Z_t hits θ_1, so he will feel that the current equilibrium is unfair to him. Consequently, he will likely attempt to switch to a more equitable equilibrium. Thus, the symmetric regular MPE is the likely focal point equilibrium (Fudenberg and Tirole, 1991). In contrast, in an asymmetric game in which the two players have unequal thresholds θ_i ≠ θ_j, the only possible equilibrium is an asymmetric non-regular control one by virtue of Theorem 3.

One interesting question regards which player exerts singular control. In the numerical example above, we can fix k_1 = 1 and vary k_2 to see how that affects the equilibrium. It can be numerically verified that the MPE of Proposition 1 exists as long as k_2 ≥ 0.5383. If k_2 is less than the critical value 0.5383, then (11) is violated, so the MPE is not possible. Intuitively, if k_2 is sufficiently low, then player 2 has a strong incentive to exert singular control, and knowing this, player 1 would never exert singular control.
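The dynamics described around Figure 4 can be mimicked with a stylized simulation: regular control by both players below θ′, a one-shot boost to θ_1 when Z first reaches θ′, and reflection at θ_1 thereafter. All numbers and the combined rate function below are illustrative assumptions.

```python
import numpy as np

mu, sigma = -0.5, 1.0
theta1, theta_p = -1.0, -6.0        # thresholds theta_1 and theta' (assumed)
dt, n_steps = 0.01, 5_000
rng = np.random.default_rng(1)

def u_total(x):
    # Toy combined regular-control rate below theta'; positive and growing.
    return 1.0 + 2.0 * (theta_p - x)

z = -7.0                             # start below theta'
boosted = False
path = [z]
for _ in range(n_steps):
    if not boosted:
        z += (mu + u_total(z)) * dt + sigma * np.sqrt(dt) * rng.normal()
        if z >= theta_p:
            z, boosted = theta1, True   # player 1's singular boost to theta_1
    else:
        # After the boost, theta_1 acts as a reflecting boundary.
        z = max(theta1, z + mu * dt + sigma * np.sqrt(dt) * rng.normal())
    path.append(z)

print(boosted)  # the path reaches theta' and is boosted to theta_1
```

Once the path enters [θ_1, ∞), it never leaves it, which is the long-run behavior attributed to the asymmetric MPE above.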
In this case, the only equilibrium is the one in which player 2 exerts singular control. Thus, sufficiently high asymmetry induces the more efficient player to exert singular control in an equilibrium. If the asymmetry is modest, then either player can be the one who exerts singular control.

In summary, the MPE obtained by Proposition 1 is the simplest form of asymmetric equilibria. In these equilibria, Z_t eventually ends up in the region [θ_1, b) and is subject to the reflecting boundary at θ_1. Thus, in the long run, Z_t is subject to the singular control policy of player 1 just as in the single decision maker's case, and it is not plagued by the gradualism of the regular control MPE of Section 2.4. Therefore, we conclude that asymmetry reduces inefficiency.

In contrast to the singular control equilibria obtained in, for example, Ghemawat and Nalebuff (1990), Steg (2012), Battaglini et al. (2014), Ferrari et al. (2017), and Appendix B, a regular control equilibrium arises in our model. We speculate that we obtain a contrasting result because of the difference in the dimensionality of the state variable. In the model that we study, the state variable Z_t is one-dimensional; the control variables ξ_it and ξ_jt only add to Z_t, so they are not independent state variables that stand alone. In contrast, in the examples from the literature as well as the R&D spillover game analyzed in Appendix B, the state variables are multidimensional because the players' control variables are decoupled from the state variable. For instance, in the R&D game of Appendix B, the state variable is two-dimensional: (λ_1t, λ_2t), where λ_it is the current level of R&D effort of firm i. Consequently, the possibility of a subgame with λ_1t ≠ λ_2t is allowed, so the state variable is allowed to be asymmetric between the two players. However, as shown in Section 3.2.2, the regular control equilibrium of Theorem 2 hinges on the symmetry of the state variable between the two players so that they share the common regular control region (see the proof of Theorem 2). Thus, we anticipate that the emergence of a regular control equilibrium is driven by the single-dimensionality of the state variable.

We can straightforwardly generalize Theorem 2 to an N-player game for N > 2. Define u_N(·) as follows:

u_N(x) = −(1/(k(N − 1)))(A V(x) + π(x)) for x < θ ;  u_N(x) = 0 for x ≥ θ,

where V(·) is given by (4). Then it can be verified that a strategy profile ξ in which each player i exerts a regular control of dξ_it = u_N(Z_t)dt constitutes an MPE. This is because the HJB equation (condition (iii) of Theorem 1) for each player's payoff function continues to be satisfied when all N − 1 opponents exert regular control dξ_jt = u_N(Z_t)dt.

There exists a close analogy between the regular control equilibrium obtained in Theorem 2 and the mixed strategy equilibrium of a war of attrition. In the mixed strategy equilibrium of the canonical war of attrition, each player has control over the hazard rate of exit, which is analogous to the rate of regular control in our model. Similarly, a pure strategy concession (a deterministic concession) in a war of attrition is analogous to the singular control strategy in our model. The analogy between the two equilibrium solutions goes even further with the impact of asymmetry. In a stochastic extension of a war of attrition game, the mixed strategy MPEs disappear when the players' reward from concession is asymmetric (Georgiadis et al., 2019). This is analogous to our result that a completely regular control MPE disappears when the players are asymmetric.
Conclusions
We examine a stochastic game of variable contribution as a generalization of a war of attrition. In particular, we analyze a stochastic game-theoretic extension of the Nerlove-Arrow model, which possesses a novel MPE characterized by regular control. This finding is in contrast to the singular control equilibria possessed by variable concession games with multidimensional state variables. In the examples of singular control equilibria obtained in the literature, the free rider effect manifests in the value of the threshold of the control region, but the action of concession is immediate and not plagued by delay or gradualism. In contrast, the variable contribution game analyzed in Section 2 possesses a regular control equilibrium in which the free rider effect manifests in the gradualism of the players' actions.

We find that it is important to understand the effect of stochasticity on the game. The state variable Z exhibits qualitatively different behavior under a regular control MPE from that of a single decision maker solution. However, the difference almost disappears if the state variable is deterministic. We conclude that stochasticity renders the gradual MPE observable to an outsider.

We also examine the impact of asymmetry between the players and find that the regular control MPE is not possible under asymmetry. The implication of this finding is that the problem of inefficiency arising from the gradual regular control MPE is mitigated by asymmetry between the players. From a social planner's perspective, this result suggests that heterogeneity between agents should be cultivated or encouraged when there is a free rider problem with the agents' contributions to the common good.

The results of this paper and their implications warrant some related future research. First, it will be interesting to study multidimensional variable concession problems and see if they have a gradual regular control MPE, even though we speculate that they do not. Second, it will be fruitful to examine an extension of our model in which the players have private types and asymmetric information regarding the cost of contribution. In this case, there is inherent asymmetry between any two players, so the regular control equilibrium may exist only under very stringent conditions. Lastly, just as Wang (2009) finds empirical evidence of a delay in action in a war of attrition, it might be possible to find empirical evidence of gradualism in a regular control equilibrium for a contribution game with a free rider problem such as in generic advertising or investment in public goods.

References
Alvarez, L., J. Lempa. 2008. On the optimal stochastic impulse control of linear diffusions. SIAM Journal on Control and Optimization (2) 703–732.
Alvarez, L. H. R. 2001. Singular stochastic control, linear diffusions, and optimal stopping: A class of solvable problems. SIAM Journal on Control and Optimization (6) 1697–1710.
Alvarez, L. H. R. 2003. On the properties of r-excessive mappings for a class of diffusions. The Annals of Applied Probability.
Battaglini, M., S. Nunnari, T. R. Palfrey. 2014. Dynamic free riding with irreversible investments. The American Economic Review (9) 2858–2871.
Borodin, A., P. Salminen. 1996. Handbook of Brownian Motion – Facts and Formulae. Birkhäuser, Basel.
Ferrari, G., F. Riedel, J.-H. Steg. 2017. Continuous-time public good contribution under uncertainty: A stochastic control approach. Applied Mathematics & Optimization (3) 429–470.
Fershtman, C., S. Nitzan. 1991. Dynamic voluntary provision of public goods. European Economic Review (5) 1057–1067.
Fudenberg, D., J. Tirole. 1991. Game Theory. The MIT Press, Cambridge, MA.
Georgiadis, G., Y. Kim, H. D. Kwon. 2019. Equilibrium selection in the war of attrition under complete information. SSRN https://ssrn.com/abstract=3353450.
Ghemawat, P., B. Nalebuff. 1990. The devolution of declining industries. The Quarterly Journal of Economics (1) 167–186.
Hardin, G. 1968. The tragedy of the commons. Science (3859) 1243–1248.
Hendricks, K., A. Weiss, C. Wilson. 1988. The war of attrition in continuous time with complete information. International Economic Review (4) 663–680.
Karatzas, I. 1983. A class of singular stochastic control problems. Advances in Applied Probability (2) 225–254.
Karatzas, I., S. E. Shreve. 1998. Brownian Motion and Stochastic Calculus. 2nd ed. Springer, New York.
Kim, Y., H. D. Kwon, A. Agrawal. 2017. Strategic investment in shared suppliers with quality deterioration. UIUC Working Paper.
Kinnucan, H. W., Ø. Myrland. 2003. Free-rider effects of generic advertising: The case of salmon. Agribusiness (3) 315–324.
Kwon, H. D., H. Zhang. 2015. Game of singular stochastic control and strategic exit. Mathematics of Operations Research (4) 869–887.
Lee, J.-Y., G. F. Fairchild. 1988. Commodity advertising, imports and the free rider problem. Journal of Food Distribution Research (2) 36–42.
Lon, P. C., M. Zervos. 2011. A model for optimally advertising and launching a product. Mathematics of Operations Research (2) 363–376.
Maskin, E., J. Tirole. 2001. Markov perfect equilibrium: I. Observable actions. Journal of Economic Theory (2) 191–219.
Maynard Smith, J. 1974. Theory of games and the evolution of animal conflicts. Journal of Theoretical Biology.
Manufacturing & Service Operations Management (4) 525–544.
Nash, J. F. 1950. The bargaining problem. Econometrica (2) 155–162.
Nerlove, M., K. J. Arrow. 1962. Optimal advertising policy under dynamic conditions. Economica (114) 129–142.
Øksendal, B. 2003. Stochastic Differential Equations: An Introduction with Applications. 6th ed. Springer, Berlin.
Øksendal, B., A. Sulem. 2005. Applied Stochastic Control of Jump Diffusions. 2nd ed. Springer, Berlin.
Protter, P. E. 2003. Stochastic Integration and Differential Equations. Springer-Verlag, Berlin.
Revuz, D., M. Yor. 1999. Continuous Martingales and Brownian Motion, vol. 293. 3rd ed. Springer-Verlag, Berlin.
Sethi, S. P. 1977. Dynamic optimal control models in advertising: a survey. SIAM Review (4) 685–725.
Steg, J.-H. 2012. Irreversible investment in oligopoly. Finance and Stochastics (2) 207–224.
Steg, J.-H. 2015. Symmetric equilibria in stochastic timing games. Center for Mathematical Economics Working Papers (543).
Tirole, J. 2017. Economics for the Common Good. Princeton University Press, Princeton.
Wang, Z. 2009. (Mixed) strategy in oligopoly pricing: Evidence from gasoline price cycles before and under a timing regulation. Journal of Political Economy (6) 987–1030.

Mathematical Proofs
Proof of Lemma 1: Because the first and second derivatives of V(·) are continuous at θ, the coefficient A and the threshold θ must satisfy k = (Rπ)′(θ) + Aφ′(θ) and 0 = (Rπ)″(θ) + Aφ″(θ). The simultaneous equations are solved if we can find θ that satisfies

[k − (Rπ)′(θ)]/φ′(θ) = −(Rπ)″(θ)/φ″(θ). (A.1)

Below we show that there is a unique value of θ that satisfies this condition and that the resulting solution V(·) satisfies the optimality conditions for singular stochastic control.

As a first step, we show that the function F(x) := [k − (Rπ)′(x)]/φ′(x) achieves a unique global maximum at x = θ ∈ I. If this holds, then it is straightforward to verify that (A.1) is satisfied. Note that (Rq)′(x) = (Rπ)′(x) − k from the definition of q in (3) without the player index i, so F(x) = −(Rq)′(x)/φ′(x). From the theory of diffusive processes (Borodin and Salminen, 1996; Alvarez and Lempa, 2008), it is well-known that

(Rq)(x) = B⁻¹[φ(x)∫_a^x ψ(y)q(y)m′(y)dy + ψ(x)∫_x^b φ(y)q(y)m′(y)dy],

where B = [ψ′(x)φ(x) − φ′(x)ψ(x)]/S′(x), m′(x) = 2/[σ²(x)S′(x)], and S′(x) = exp(−∫^x 2μ(y)/σ²(y)dy). Differentiation yields

dF(x)/dx = −{2S′(x)/(σ²(x)[φ′(x)]²)} L(x), (A.2)

where

L(x) = −q(x)φ′(x)/S′(x) − r∫_x^b φ(y)q(y)m′(y)dy. (A.3)

By the definitions of m′(·) and S′(·), we can derive the equality d[φ′(x)/S′(x)]/dx = rφ(x)m′(x) by some algebra, from which we obtain

dL(x)/dx = −q′(x)φ′(x)/S′(x).

By Assumption 3 (ii), q′(x) > 0 for x ∈ (a, x*) and q′(x) < 0 for x ∈ (x*, b), so we have dL(x)/dx > 0 for x ∈ (a, x*) and dL(x)/dx < 0 for x ∈ (x*, b). From lim_{x→a} φ′(x)/S′(x) = −∞ (p. 19 of Borodin and Salminen (1996)) and lim_{x→a} q(x) = −∞ from Assumption 3 (iii), we have, for y sufficiently close to a,

L(y) − lim_{x→a} L(x) = lim_{x→a} ∫_x^y [dL(u)/du]du = lim_{x→a} ∫_x^y [−q′(u)]φ′(u)/S′(u)du > [φ′(y)/S′(y)] lim_{x→a} ∫_x^y [−q′(u)]du = ∞,

so that lim_{x→a} L(x) = −∞. Furthermore, because lim_{x→b} q(x)φ′(x)/S′(x) = 0, we have lim_{x→b} L(x) = 0. By the continuity of L(·) and the sign change of dL(x)/dx, it follows that L(θ) = 0 at a unique θ ∈ (a, x*). Since L(·) is monotonically increasing in the interval (a, x*), L(·) turns from negative to positive at θ. Combined with the behavior of L(x) in the limits x → a, b, we conclude that L(x) is negative in the interval (a, θ) and positive in the interval (θ, b). From (A.2), we also conclude that F(x) attains its global maximum at this unique point θ. Thus, (A.1) is also satisfied, which makes V(·) ∈ C²(I) if A = F(θ).

The next step is to prove that V(·) satisfies the sufficient conditions (i) and (ii) of the optimality.

(i) First, we show that A V(x) + π(x) ≤ 0 and V′(x) − k ≤ 0 for all x ∈ I. For x > θ, the form of V(·) guarantees the condition A V(x) + π(x) = 0. Furthermore, because F(x) < F(θ) = A for all x > θ, we have [k − (Rπ)′(x)]/φ′(x) < A, from which we obtain V′(x) = Aφ′(x) + (Rπ)′(x) < k for all x > θ.

For x < θ, we have V′(x) = k by the form of V(·). Also note that V″(θ) = 0, V′(θ) = k, and A V(x) + π(x) = 0 for x > θ, so that

lim_{x↓θ} A V(x) + π(x) = μ(θ)k − rV(θ) + π(θ) = 0.

For any x < θ, we have

A V(x) + π(x) − [lim_{y↓θ} A V(y) + π(y)] = μ(x)k − rV(x) + π(x) − [μ(θ)k − rV(θ) + π(θ)] = q(x) − q(θ) < 0

because q(·) increases in the interval (a, θ). Thus, A V(x) + π(x) < 0 for x < θ.

(ii) We just showed that A V(x) + π(x) = 0 for x ≥ θ and V′(x) − k = 0 for x ≤ θ. □

Proof of Theorem 1: To prove the theorem, it is sufficient to show that U(x) = V_i(x; ξ*_i, ξ_j) ≥ V_i(x; ξ_i, ξ_j) for any arbitrary strategy ξ_i of player i that satisfies dξ_it = u_i(t, Z_t)dt + dξ^c_it + Δξ_it. First, it is straightforward to verify that if ξ*_i is the best response, player i should not expend any cost to control Z within D_j because player j is already doing so in that region. Thus, one necessary condition for the payoff function of the best response is U′(x) = 0 for x ∈ D_j.

We consider an arbitrary strategy ξ_i of player i that satisfies dξ_it = u_i(t, Z_t)dt + dξ^c_it + Δξ_it for some arbitrary u_i(·) ∈ U. Let Z be the state process dictated by the given strategy profile (ξ_i, ξ_j). By conditions (i) and (iii), Ũ′(·) is a bounded function. Furthermore, due to Assumption 1 (i), e^{−rs}σ(Z_s)Ũ′(Z_s) is locally bounded. Hence, the process M_t := ∫₀^t e^{−rs}σ(Z_s)Ũ′(Z_s)dW_s is a continuous local martingale. By the definition of a continuous local martingale (Karatzas and Shreve, 1998), there exists a non-decreasing sequence {τ_n}_{n=1}^∞ of stopping times of {F_t}_{t≥0} such that lim_{n→∞} τ_n = ∞ a.s. and {M_{τ_n∧t} : t ∈ [0, ∞)} is a martingale for each n.
We first consider any x ∈ C_j as the initial point of Z. By the generalized Itô formula, we have

e^{−r(τ_n∧t)}Ũ(Z_{τ_n∧t}) = Ũ(x) + ∫₀^{τ_n∧t} e^{−rs} A Ũ(Z_s)ds + ∫₀^{τ_n∧t} e^{−rs} Ũ′(Z_s)[u_i(s, Z_s)ds + u_j(Z_s)ds + Σ_{l=1,2} dξ^c_{ls}] + Σ_{0≤s≤τ_n∧t} e^{−rs}[Ũ(Z_s) − Ũ(Z_{s−})] + M_{τ_n∧t}.

Taking the expectation of both sides and rearranging terms, we obtain

Ũ(x) = E_x[e^{−r(τ_n∧t)}Ũ(Z_{τ_n∧t})] − E_x[∫₀^{τ_n∧t} e^{−rs}(A Ũ(Z_s) + Ũ′(Z_s)u_j(Z_s) + Ũ′(Z_s)u_i(s, Z_s))ds] − E_x[∫₀^{τ_n∧t} e^{−rs}Ũ′(Z_s)Σ_{l=1,2} dξ^c_{ls} + Σ_{0≤s≤τ_n∧t} e^{−rs}(Ũ(Z_s) − Ũ(Z_{s−}))]. (A.4)

Here we use the fact that E_x[M_{τ_n∧t}] = 0 because M_{τ_n∧t} is a martingale.

Next, we re-express (A.4) as an inequality involving U(·) and ξ_i alone. Recall that Ũ(x) = U(x) whenever x ∈ C_j, so Ũ(Z_t) = U(Z_t) whenever Z_t is continuous in time. Thus, if Z_s ∈ C_j,

A Ũ(Z_s) + Ũ′(Z_s)[u_i(s, Z_s) + u_j(Z_s)] = A U(Z_s) + U′(Z_s)[u_i(s, Z_s) + u_j(Z_s)].

We also note that Ũ′(Z_s)dξ^c_{js} = 0 from condition (i), since dξ^c_{js} = 0 unless Z_s ∈ D_j, and Ũ′(Z_s) = U′(Z_s) if Z_s ∈ C_j. Also note that the process Z_t spends zero time within D_j, so Ũ(Z_s) − Ũ(Z_{s−}) = U(Z_s) − U(Z_{s−}) for all s > 0. Then we can re-express (A.4) as the following equality:

U(x) = E_x[e^{−r(τ_n∧t)}U(Z_{τ_n∧t})] − E_x[∫₀^{τ_n∧t} e^{−rs}(A U(Z_s) + U′(Z_s)u_j(Z_s) + U′(Z_s)u_i(s, Z_s))ds] − E_x[∫₀^{τ_n∧t} e^{−rs}U′(Z_s)Σ_{l=1,2} dξ^c_{ls} + Σ_{0≤s≤τ_n∧t} e^{−rs}(U(Z_s) − U(Z_{s−}))]. (A.5)

We note that A U(Z_s) + U′(Z_s)[u_i(s, Z_s) + u_j(Z_s)] ≤ k_i u_i(s, Z_s) − π_i(Z_s) because of condition (iii). Regarding player j's singular control, we have U(Z_{s−} + Δξ_{js}) − U(Z_{s−}) = 0 for Δξ_{js} > 0 because U′(x) = 0 within the interior of D_j. Lastly, we note that U′(Z_s)Σ_{l=1,2} dξ^c_{ls} ≤ k_i dξ^c_{is} because U′(x) ≤ k_i from condition (iii) and U′(Z_s)dξ^c_{js} = 0, and that U(Z_{s−} + Δξ_{is}) − U(Z_{s−}) ≤ k_i Δξ_{is} from U′(x) ≤ k_i. Combining all these facts, we obtain

U(x) ≥ E_x[e^{−r(τ_n∧t)}U(Z_{τ_n∧t})] + E_x[∫₀^{τ_n∧t} e^{−rs}(π_i(Z_s) − k_i u_i(s, Z_s))ds] − k_i E_x[∫₀^{τ_n∧t} e^{−rs}dξ^c_{is} + Σ_{0≤s≤τ_n∧t} e^{−rs}Δξ_{is}].

Since U(·) is non-decreasing and U′(x) ≤ k_i, we have (U(Z_t))⁻ ≤ k_i(Z_t)⁻ + C for some C > 0. For the uncontrolled process X_t, we have Z_t ≥ X_t for any control strategies taken by the players because the controls ξ_i and ξ_j always boost Z_t. It follows that (Z_t)⁻ ≤ (X_t)⁻ for all t ≥ 0 if X_0 = Z_0. By virtue of Assumption 1 (ii), {e^{−rτ}(U(Z_τ))⁻ : τ > 0} is uniformly integrable under any control strategy profile (ξ_i, ξ_j). Thus, we have

lim inf_{n→∞} E_x[e^{−r(τ_n∧t)}U(Z_{τ_n∧t})] = E_x[lim inf_{n→∞} e^{−r(τ_n∧t)}U(Z_{τ_n∧t})] = E_x[e^{−rt}U(Z_t)].

Hence,

U(x) ≥ lim inf_{n→∞} E_x[e^{−r(τ_n∧t)}U(Z_{τ_n∧t})] + lim inf_{n→∞} E_x[∫₀^{τ_n∧t} e^{−rs}(π_i(Z_s) − k_i u_i(s, Z_s))ds] − k_i lim inf_{n→∞} E_x[∫₀^{τ_n∧t} e^{−rs}dξ^c_{is} + Σ_{0≤s≤τ_n∧t} e^{−rs}Δξ_{is}] ≥ E_x[e^{−rt}U(Z_t)] + E_x[∫₀^t e^{−rs}(π_i(Z_s) − k_i u_i(s, Z_s))ds] − k_i E_x[∫₀^t e^{−rs}dξ^c_{is} + Σ_{0≤s≤t} e^{−rs}Δξ_{is}].

We note that U(Z_t) ≤ M for some M > 0. Since U′(·) is bounded from above by k_i by condition (iii), we have (U(Z_t))⁻ ≤ k_i(Z_t)⁻ + C for some constant C > 0. Recall that (Z_t)⁻ ≤ (X_t)⁻ for the uncontrolled process X_t and that lim_{t→∞} E_x[e^{−rt}(X_t)⁻] = 0. Hence, lim_{t→∞} E_x[e^{−rt}U(Z_t)] = 0, and we obtain

U(x) ≥ E_x[∫₀^∞ e^{−rs}(π_i(Z_s) − k_i u_i(s, Z_s))ds] − k_i E_x[∫₀^∞ e^{−rs}dξ^c_{is} + Σ_{0≤s<∞} e^{−rs}Δξ_{is}] = V_i(x; ξ_i, ξ_j).

Since ξ_i is an arbitrary strategy of player i that satisfies dξ_it = u_i(t, Z_t)dt + dξ^c_it + Δξ_it, we have proved that U(x) dominates all payoff functions that belong to the set {V_i(x; ξ_i, ξ_j) : ξ_i ∈ Σ_i}.

Lastly, we consider Z subject to the strategy profile (ξ*_i, ξ_j). Note that {e^{−rτ}U(Z_τ) : τ > 0} is uniformly integrable because U(·) is bounded from above and {e^{−rτ}(U(Z_τ))⁻ : τ > 0} is uniformly integrable. By condition (iv), it is straightforward to verify that all the weak inequalities above can be exactly replaced by equalities if ξ_i above is replaced by ξ*_i. Therefore, ξ*_i is the best response among Σ_i against ξ_j, and U(·) is the best payoff function of player i within the set {V_i(x; ξ_i, ξ_j) : ξ_i ∈ Σ_i}. □

Proof of Theorem 2: As a first step, we verify that V(x) in (4) and ξ given by the proposition satisfy the conditions (i)–(iv) of Theorem 1.

(i) Note that C_i = C_j = I, that V(·) ∈ C²(I), and that V(·) is non-decreasing. Thus, condition (i) is satisfied. Condition (ii) is not applicable because D_1 = D_2 = ∅.

(iii) and (iv) For x > θ, we have V′(x) ≤ k by the definition of V(·), and A V(x) + π(x) =
0, so condition (iii) is satisfied. For x ≤ θ, we have V′(x) = k and

A V(x) + π(x) + u(x)V′(x) + v[V′(x) − k] = 0 for any v ≥ 0

by the definition of u(·). Thus, condition (iii) is also satisfied for x ≤ θ. Because D_1 = D_2 = ∅, it also follows that condition (iv) is satisfied.

Thus, we have proved that ξ = (ξ_i, ξ_j) is a strategy profile that belongs to Σ = Σ_i × Σ_j and that V_i(x; ξ) = sup_{ζ_i∈Σ_i} V_i(x; ζ_i, ξ_j) = V(x) for both i = 1, 2. This does not necessarily mean that ξ is a Nash equilibrium. More precisely, we need to prove that ξ_i is the best response among Ξ_i. To prove it, we only need to show that V(x) = sup_{ζ_i∈Ξ_i} V_i(x; ζ_i, ξ_j), i.e., that V(·) is player i's optimal value function given ξ_j. We do so by showing that there exists a singular control strategy ξ*_i ∈ Ξ_i such that V(x) = V_i(x; ξ*_i, ξ_j) and that V(·) satisfies the optimality conditions for player i given in Section 2.2. Given ξ_j, the SDE of Z and its r-excessive characteristic operator are given by

dZ_t = [μ(Z_t) + u_j(Z_t)]dt + σ(Z_t)dW_t,  A^{ξ_j} = (σ²(x)/2)∂²_x + [μ(x) + u_j(x)]∂_x − r.

Consider ξ*_i with a singular control region D_i = (a, θ], which is consistent with the fact that V′(x) = k for all x ∈ D_i. First, note that A^{ξ_j}V(x) + π(x) = 0 and V′(x) ≤ k for all x ∈ I. Second, it follows that [A^{ξ_j}V(x) + π(x)][V′(x) − k] = 0 for all x ∈ I. Thus, all the conditions of the optimality are satisfied, and we conclude that V(x) = sup_{ζ_i∈Ξ_i} V_i(x; ζ_i, ξ_j). □

Proof of Theorem 3: Assume that there exists a regular control MPE with a payoff function V_i(·; ξ) ∈ C²(I). To prove Theorem 3, we establish the following two statements: (i) The control regions of both players must coincide. (ii) Player i's control region must be (a, θ_i). Given (i) and (ii), because of the assumption θ_i ≠ θ_j, we arrive at a contradiction and hence prove the theorem.

Below we employ Theorem 11.2.1 of Øksendal (2003) to prove (i) and (ii). For the theorem to be applicable, a few conditions have to be satisfied. First, V_i(·; ξ) ∈ C²(I) has to be satisfied, which is assumed above. Second, we need to have |A V_i(x; ξ)| < ∞ for all x ∈ I, which is satisfied because of Assumption 1. Lastly, we need to have |[u_i(x) + u_j(x)]V′_i(x; ξ)| < ∞ for all x ∈ I, which is satisfied because we limit ξ to Σ_i × Σ_j.
Thus, Theorem 11.2.1 of Øksendal (2003) is applicable.

(i) We first prove that the regular control regions $E_i = \{x \in I : u_i(x) > 0\}$ of both players must coincide, i.e., $E_1 = E_2$. Suppose that there exists a non-empty open set $F_i \subset E_i \setminus E_j$ such that $u_i(x) > 0 = u_j(x)$ for all $x \in F_i$. Note that $V_i'(x; \xi) \le k_i$; if there exists an interval in which $V_i'(x; \xi) > k_i$, then it behooves player $i$ to adopt a singular control strategy in this interval, which contradicts the assumption that the equilibrium is characterized only by regular control strategies. By Theorem 11.2.1 of Øksendal (2003), $V_i(x; \xi)$ must satisfy the following HJB equation in $F_i$:
$$\mathcal{A} V_i(x; \xi) + \pi_i(x) + u_i(x)[V_i'(x; \xi) - k_i] = 0.$$
Here we used the fact that $u_j(x) = 0$ in $F_i$. By the assumption that $u_i(x) > 0$ for $x \in F_i$, we must have $V_i'(x; \xi) = k_i$, so that $V_i(x; \xi) = v_0 + k_i x$ in $F_i$ for some constant $v_0$. Then $\mathcal{A} V_i(x; \xi) + \pi_i(x) = \mu(x) k_i - r[v_0 + k_i x] + \pi_i(x) = 0$ for $x \in F_i$, which cannot hold throughout a non-empty open interval. It follows that a non-empty open set $F_i \subset E_i \setminus E_j$ cannot exist for either $i$. Because $u_i(\cdot)$ is Lipschitz continuous, this implies $E_i = E_j$. For convenience, we let $E = E_i = E_j$ denote the common regular control region for the remainder of this proof.

(ii) By Theorem 11.2.1 of Øksendal (2003), $V_i(\cdot; \xi)$ must satisfy
$$\mathcal{A} V_i(z; \xi) + \pi_i(z) + u_j(z)\,\partial_z V_i(z; \xi) + u_i(z)[\partial_z V_i(z; \xi) - k_i] = 0, \qquad \text{(A.6)}$$
where $u_i(x) > 0$, $V_i'(x; \xi) = k_i$, and $u_j(x) > 0$ for $x \in E$, and $u_i(x) = u_j(x) = 0$ and $V_i'(x; \xi) < k_i$ must be satisfied for $x \notin E$. Furthermore, since $u_j(x) > 0$ for $x \in E$, in order for (A.6) to hold, $\mathcal{A} V_i(x; \xi) + \pi_i(x) < 0$ for $x \in E$. In summary, the most salient necessary conditions are $[\mathcal{A} V_i(x; \xi) + \pi_i(x)][V_i'(x; \xi) - k_i] = 0$, $\mathcal{A} V_i(x; \xi) + \pi_i(x) \le 0$, and $V_i'(x; \xi) \le k_i$. These conditions exactly coincide with the optimality conditions for a single-decision-maker singular stochastic control problem given in Section 2.2. By virtue of Lemma 1, there is a unique function $V_i^*(\cdot)$ given by (4), where $\theta$ and $\pi$ are replaced by $\theta_i$ and $\pi_i$, that satisfies these necessary conditions. Based on the form of $V_i^*(\cdot)$ given by (4), the regular control region is $E_i = (a, \theta_i)$.

From (ii), we conclude that the equilibrium is characterized by the players' control regions $E_i = (a, \theta_i)$ and $E_j = (a, \theta_j)$. However, as established by (i), the two control regions must coincide ($E_i = E_j$), which is not possible if $\theta_i \ne \theta_j$. Therefore, there is no regular control strategy equilibrium with associated payoff functions $V_i(\cdot; \xi) \in C^2(I)$.

Proof of Proposition 1: Note that $\xi_1$ continuously evolves in $C_1 = (a, \theta') \cup (\theta, b)$ and $\xi_2$ continuously evolves in $C_2 = I$ according to the strategy profile $\xi$. To prove the proposition, it is sufficient to verify that $U_1(\cdot)$ and $U_2(\cdot)$ with the strategy profile $\xi$ satisfy all the conditions of Theorem 1. The complete proof exactly parallels that of Theorem 2.

We first take $\xi_1$ as given and examine $U_2(\cdot)$. First, it is straightforward to verify that $U_2(\cdot)$ satisfies conditions (i) and (ii) of Theorem 1 because $C_2 = I$. Hence, we only need to verify (iii) and (iv). Because of the forms of $u_1(\cdot)$ and $u_2(\cdot)$ given in (8) and (9), we have $\mathcal{A} U_2(x) + \pi_2(x) + u_1(x)\, U_2'(x) = 0$ for $x < \theta'$. We also have $U_2'(x) \le k_2$ for all $x \in I$ and, in particular, $U_2'(x) = k_2$ for all $x < \theta$. Hence, conditions (iii) and (iv) are satisfied.

Next, we verify that $U_1(\cdot)$ given $\xi_2$ satisfies the conditions (i)-(iv) of Theorem 1.

(i) Note that $D_1 = [\theta', \theta]$ and that $U_1(\cdot)$ is constant in $D_1$. Furthermore, by the definition of $B_1$ in (10), we have $\lim_{x \downarrow \theta} U_1'(x) = 0$. Note that, by the nature of a singular control, $\xi^l_{1t}$ evolves only at the right-most boundary of the singular control region $D_1$. This implies that $U_1'(Z_t) = 0$ whenever $d\xi^l_{1t} > 0$, so that $U_1'(Z_t)\, d\xi^l_{1t} = 0$. Hence, (i) is satisfied.

(ii) The first derivative of $U_1(\cdot)$ is discontinuous (not defined) at $x = \theta'$, and its second derivative is in general not defined at $x = \theta$. However, given the first and second derivatives of $U_1(\cdot)$ near $\theta'$ and $\theta$, it is always possible to construct a function $\tilde{U}_1(\cdot) \in C^2(I)$ such that $\tilde{U}_1(x) = U_1(x)$ for all $x \in C_1 = (a, \theta') \cup (\theta, b)$ as long as $\theta' < \theta$.

(iii) For $x > \theta$, we have $\mathcal{A} U_1(x) + \pi_1(x) = 0$ by the construction of $U_1(\cdot)$, and $U_1'(x) \le k_1$ is satisfied so that $\bar{v}\,[U_1'(x) - k_1] \le 0$ for any $\bar{v} \ge 0$. Hence, condition (iii) is satisfied on $(\theta, b)$. For all $x < \theta'$, we have $\mathcal{A} U_1(x) + \pi_1(x) + u_2(x)\, k_1 = 0$ by the definition of $u_2(\cdot)$, and $U_1'(x) = k_1$. Thus, condition (iii) is satisfied for all $x \in C_1$. Furthermore, it is straightforward to verify (iv) from the forms of $u_i(\cdot)$.

B  R&D Game with Spillover
An example of variable contribution games is an R&D game with spillover, which often occurs in high-tech industries. The technological advances made by one firm most often spill over to another firm through various means, such as reverse engineering or leakage of information due to geographic proximity.

We consider two firms engaging in R&D to develop a new technology. For simplicity, we assume that the outcome of one firm's successful completion of R&D completely spills over to the other. (The assumption of complete spillover is not an essential one; partial spillover can be easily modeled, but it would complicate the analysis without altering the main insight.) We also assume that the two firms are not in direct competition with each other because they are in two separate markets, although they use the same technology. This model is an extension of an attrition game: each firm would rather that its opponent conduct the R&D. Thus, the R&D effort is subject to a free rider problem. Unlike the canonical attrition game, the two firms' levels of R&D effort are the state variables.

Let $R$ denote the reward to each firm from the new technology, irrespective of which firm develops it, and let $\lambda_{it} \ge 0$ denote the R&D effort level of firm $i \in \{1, 2\}$ at time $t$. The completion time of firm $i$'s R&D is an exponential random variable with the instantaneous arrival rate of $\lambda_{it}$. Hence, by the property of a Poisson process, the instantaneous arrival rate of the first completion of R&D is given by $\lambda_{1t} + \lambda_{2t}$. We model the cost of maintaining the effort level of $\lambda_{it}$ as $c \lambda_{it}^2 / 2$. To increase the effort level by $\Delta\lambda_{it} > 0$, firm $i$ has to spend $k\,\Delta\lambda_{it}$ for some $k > 0$. Each firm $i$ can increase $\lambda_{it}$ by any amount at any time but can never decrease it.

Let $(\lambda_{it})_{t \ge 0}$ denote the non-decreasing process of the effort level of firm $i$, and let $\hat{T}_i$ denote the random completion time of firm $i$'s R&D given the process $(\lambda_{it})_{t \ge 0}$. Given the strategy profile $\Lambda = ((\lambda_{1t})_{t \ge 0}, (\lambda_{2t})_{t \ge 0})$, firm $i$'s payoff at $t = 0$ is
$$V_i(\Lambda) = E\Big[ \int_0^{\hat{T}_1 \wedge \hat{T}_2} e^{-rt} \big( -\tfrac{c}{2} \lambda_{it}^2 \big)\, dt + e^{-r (\hat{T}_1 \wedge \hat{T}_2)} R - \int_0^{\hat{T}_1 \wedge \hat{T}_2} e^{-rt} k\, d\lambda_{it} \Big]. \qquad \text{(B.1)}$$
The goal of this section is to demonstrate the existence of a subgame perfect equilibrium characterized by singular control strategies. Hence, we focus on a symmetric equilibrium in which the firms immediately boost $\lambda_{it}$ to the equilibrium level. In particular, we show that there exists a symmetric subgame perfect equilibrium in which both firms immediately set their effort level at a unique value $\lambda^*$ given by
$$\lambda^* = \frac{r}{8k + 3c} \Big[ -(4k + c) + \sqrt{(4k + c)^2 + 2(8k + 3c)(R/r - k)} \Big], \qquad \text{(B.2)}$$
and maintain it until the end of the game. We can verify that this is an equilibrium by the first-order optimality condition for each player's best response. Here we assume that the reward is sufficiently large so that $(4k + c)^2 + 2(8k + 3c)(R/r - k) > 0$.

Proposition B.1
Suppose that the initial effort levels are given by $\lambda_{1,0} < \lambda^*$ and $\lambda_{2,0} < \lambda^*$. Then there exists a subgame perfect equilibrium in which both firms boost their effort levels up to $\lambda^*$ at time zero.

Proof: Suppose that firm 2's strategy is to boost the effort level to a level $\lambda_2$ at time zero and keep it at this level until the end of the game. Our goal is to obtain the best response of firm 1. Let $\lambda_{1,0}$ denote the initial effort level of firm 1. Given firm 2's strategy, firm 1's best response should be to similarly boost the effort level to some value $\lambda_1$ and keep it until the end of the game because of the Markov property of the Poisson process.

As a first step, we compute the following:
$$E\big[ \exp(-r\, \hat{T}_1 \wedge \hat{T}_2) \big] = \int_0^\infty \lambda_1 e^{-\lambda_1 t}\, e^{-rt} \Big( \int_t^\infty \lambda_2 e^{-\lambda_2 s}\, ds \Big) dt + \int_0^\infty \lambda_2 e^{-\lambda_2 t}\, e^{-rt} \Big( \int_t^\infty \lambda_1 e^{-\lambda_1 s}\, ds \Big) dt = \frac{\lambda_1 + \lambda_2}{r + \lambda_1 + \lambda_2}.$$
From (B.1) we obtain
$$V_1(\Lambda) = -\frac{c}{2r} \lambda_1^2 + \Big( R + \frac{c}{2r} \lambda_1^2 \Big) \frac{\lambda_1 + \lambda_2}{r + \lambda_1 + \lambda_2} - k(\lambda_1 - \lambda_{1,0}).$$
It follows that the first derivative is
$$\frac{\partial V_1(\Lambda)}{\partial \lambda_1} = \frac{c \lambda_1^2 / 2 + Rr - c \lambda_1 (r + \lambda_1 + \lambda_2)}{(r + \lambda_1 + \lambda_2)^2} - k = \frac{Rr - c \lambda_1 (r + \lambda_1/2 + \lambda_2) - k (r + \lambda_1 + \lambda_2)^2}{(r + \lambda_1 + \lambda_2)^2}.$$
Note that the numerator of the second expression is a concave function of $\lambda_1$, so the maximum value of $V_1(\Lambda)$ is achieved by a unique value of $\lambda_1$ that satisfies the first-order condition $\partial V_1(\Lambda) / \partial \lambda_1 = 0$. Assuming a symmetric equilibrium with $\lambda_1 = \lambda_2$ that solves the first-order equation $\partial V_i(\Lambda) / \partial \lambda_i = 0$, we obtain $\lambda_1 = \lambda^*$ given by (B.2). Therefore, immediately boosting the effort level to $\lambda^*$ is the best response to firm 2's strategy of $\lambda_2 = \lambda^*$.
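The algebra in the proof above can be checked mechanically. The following SymPy sketch (ours, not part of the paper; the symbol names and the numerical parameter values are assumptions) recomputes the first-order condition from firm 1's payoff and verifies that an effort level of the form (B.2) makes the symmetric first-order condition vanish.

```python
# Symbolic check of Proposition B.1: derive firm 1's first-order condition
# from its payoff and confirm that the candidate equilibrium level (B.2)
# satisfies the symmetric version of it.
import sympy as sp

r, c, k, R, lam1, lam2 = sp.symbols('r c k R lambda1 lambda2', positive=True)

# E[exp(-r (T1 ^ T2))] for constant rates: the minimum of two independent
# exponentials is exponential with rate lam1 + lam2.
F = (lam1 + lam2) / (r + lam1 + lam2)

# Firm 1's payoff for constant effort levels (initial level lambda_{1,0} = 0,
# which does not affect the first-order condition).
V1 = -c*lam1**2/(2*r) + (R + c*lam1**2/(2*r))*F - k*lam1

# First-order condition in lam1; its numerator should match the expression
# in the proof: Rr - c*lam1*(r + lam1/2 + lam2) - k*(r + lam1 + lam2)^2.
foc = sp.diff(V1, lam1)
num = sp.simplify(foc * (r + lam1 + lam2)**2)
target = R*r - c*lam1*(r + lam1/2 + lam2) - k*(r + lam1 + lam2)**2
assert sp.simplify(num - target) == 0

# Candidate symmetric equilibrium level, as in (B.2).
lam_star = r/(8*k + 3*c) * (-(4*k + c)
                            + sp.sqrt((4*k + c)**2 + 2*(8*k + 3*c)*(R/r - k)))

# Numerical spot check of the symmetric first-order condition
# (illustrative parameter values; any r, c, k > 0 with R > kr would do).
vals = {r: sp.Rational(1, 20), c: 1, k: 2, R: 10}
residual = target.subs([(lam1, lam_star), (lam2, lam_star)]).subs(vals)
assert abs(float(residual)) < 1e-9
print("(B.2) satisfies the symmetric first-order condition")
```

The first assertion confirms the derivative computed in the proof term by term; the second confirms that substituting $\lambda_1 = \lambda_2 = \lambda^*$ into the first-order condition gives zero, i.e., that $\lambda^*$ is the positive root of the symmetric quadratic.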