A variational characterization of the risk-sensitive average reward for controlled diffusions on R^d
Ari Arapostathis, Anup Biswas, Vivek S. Borkar, K. Suresh Kumar
Abstract.
We address the variational formulation of the risk-sensitive reward problem for non-degenerate diffusions on R^d controlled through the drift. We establish a variational formula on the whole space and also show that the risk-sensitive value equals the generalized principal eigenvalue of the semilinear operator. This can be viewed as a controlled version of the variational formulas for principal eigenvalues of diffusion operators arising in large deviations. We also revisit the average risk-sensitive minimization problem, and by employing a gradient estimate developed in this paper we extend earlier results to unbounded drifts and running costs.

Key words. principal eigenvalue, Donsker–Varadhan functional, risk-sensitive criterion
AMS subject classifications.
1. Introduction.
In this paper we consider the risk-sensitive reward maximization problem on R^d for diffusions controlled through the drift. The main objective is to derive a variational formulation for the risk-sensitive reward in the spirit of [2], which does so for discrete time problems on a compact state space, and to analyze the associated Hamilton–Jacobi–Bellman (HJB) equation. Since the seminal work of Donsker and Varadhan [18, 19], this problem has acquired prominence. The variational formula derived here can be viewed as a controlled version of the variational formulas for principal eigenvalues of diffusion operators arising in large deviations. For reversible diffusions, this formula can be viewed as an abstract Courant–Fischer formula [18]. For general diffusions, the correct counterpart in linear algebra is the Collatz–Wielandt formula for the principal eigenvalue of non-negative matrices [27, Chapter 8]. For its connection with the large deviations theory for finite Markov chains and an equivalent variational description, see [17].

There has been considerable interest in generalizing this theory to a natural class of nonlinear self-maps on positive cones of finite or infinite dimensional spaces. The first task is to establish the existence and, where possible, uniqueness of the principal eigenvalue and eigenvector (the latter modulo a scalar multiple as usual), that is, a nonlinear variant of the Perron–Frobenius theorem in the finite dimensional case and its generalization, the Krein–Rutman theorem, in Banach spaces. This theory is carried out in, e.g., [25, 29]. The next problem is to derive an abstract Collatz–Wielandt formula for the principal eigenvalue [1]. In bounded domains, a Collatz–Wielandt formula for the Dirichlet principal eigenvalue of a convex nonlinear operator is obtained in [10]. Our first objective coincides with this, albeit for Feynman–Kac operators arising in risk-sensitive control that we introduce later.
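For a concrete finite-dimensional illustration of the Collatz–Wielandt formula cited above, the following sketch (ours, not from the paper; the 2×2 matrix is an arbitrary choice) checks numerically that the Perron root of an irreducible non-negative matrix is pinched between the Collatz–Wielandt ratios, with equality at the Perron vector:

```python
import numpy as np

# Collatz-Wielandt characterization of the Perron root lambda of an
# irreducible non-negative matrix A:
#   lambda = inf_{x>0} max_i (Ax)_i/x_i = sup_{x>0} min_i (Ax)_i/x_i,
# with both extrema attained at the Perron eigenvector.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
lam = max(np.linalg.eigvals(A).real)      # Perron root, here (5+sqrt(33))/2

# Power iteration converges to the Perron vector; along the way the
# bounds min_i (Ax)_i/x_i <= lambda <= max_i (Ax)_i/x_i tighten.
x = np.ones(2)
for _ in range(100):
    x = A @ x
    x /= x.sum()
ratios = (A @ x) / x
print(ratios.min(), lam, ratios.max())
```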
For risk-sensitive reward processes, that is, the problem of maximizing the asymptotic growth rate for the risk-sensitive reward in discrete time problems, one can go a step further and give

∗ Department of Electrical and Computer Engineering, The University of Texas at Austin, EER 7.824, Austin, TX 78712 ([email protected]).
† Department of Mathematics, Indian Institute of Science Education and Research, Dr. Homi Bhabha Road, Pune 411008, India ([email protected]).
‡ Department of Electrical Engineering, Indian Institute of Technology, Powai, Mumbai 400076, India ([email protected]).
§ Department of Mathematics, Indian Institute of Technology, Powai, Mumbai 400076, India ([email protected]).
an explicit characterization of the principal eigenvalue as the solution of a concave maximization problem [2]. The objective of this article is to carry out this program for controlled diffusions.

At this juncture, it is worthwhile to underscore the difference between reward maximization and cost minimization problems with risk-sensitive criteria. Unlike the more classical criteria such as ergodic or discounted, they cannot be converted from one to the other by a sign flip. The cost minimization criterion, after a logarithmic transformation applied to its HJB equation, leads to the Isaacs equation for a zero-sum stochastic differential game [20]. An identical procedure applied to the reward maximization problem would lead to a team problem wherein the two agents seek to maximize the same payoff non-cooperatively. The latter in particular implies that their decisions at any time are conditionally independent given the state (more generally, the past history). Our approach leads to a concave maximization problem, an immense improvement with potential implications for possible numerical schemes. This does not seem possible for the cost minimization problem. Thus the complexity of the latter is much higher. Recently, a risk-sensitive maximization problem was also studied in [14] under a blanket geometric stability condition. In the present paper we do not impose any blanket stability on the controlled processes.

We first establish these results for reflected diffusions in a bounded domain, for which the nonlinear Krein–Rutman theorem of [29] paves the way. This is not so if the state space is all of R^d. Extension to the whole space turns out to be quite involved due to the lack of compactness. Even the well-posedness of the underlying nonlinear eigenvalue problem is rather delicate. Hence we proceed via the infinite volume limit of the finite volume problems.
This leads to an abstract Collatz–Wielandt formula and an abstract Donsker–Varadhan formula. More specifically, in Theorem 3.4 we show that the generalized eigenvalue of the semilinear operator is simple, and identify some useful properties of its eigenvector. We proceed to prove equality between the risk-sensitive value and the generalized principal eigenvalue in Theorem 3.6, which also establishes a verification of optimality criterion. The general result for the variational formula is in Proposition 4.1, followed by more specialized results in Theorems 4.10 and 4.12. In the process of deriving these results, we present some techniques that may have wider applicability. Most prominent of these is perhaps the gradient estimate in Lemma 4.5 for operators with measurable coefficients. Lastly, in section 5 we revisit the risk-sensitive minimization problem, and with the aid of Lemma 4.5 we improve the main result in [3] by extending it to unbounded drifts and running costs, under suitable growth conditions (see Assumption 5.1).

We summarize here the results concerning the variational formula on the whole space. We consider a controlled diffusion in R^d of the form

dX_t = b(X_t, ξ_t) dt + σ(X_t) dW_t,

defined on a complete probability space (Ω, F, P). The process W is a d-dimensional standard Wiener process independent of the initial condition X_0, and the control process {ξ_t}_{t≥0} lives in a compact metrizable space K. We impose a standard set of assumptions on the coefficients which guarantee existence and uniqueness of strong solutions under all admissible controls, namely, local Lipschitz continuity in x and at most affine growth of b and σ, and local non-degeneracy of a := σσ^T (see Assumption 3.1 (i)). But we do not impose any ergodicity assumptions on the controlled diffusion. The process {X_t}_{t≥0} could be transient.
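For intuition, trajectories of such a controlled diffusion can be generated by an Euler–Maruyama scheme. The one-dimensional drift, diffusion, control set K = [−1, 1], and Markov control below are illustrative assumptions of ours, not taken from the paper:

```python
import numpy as np

# Euler-Maruyama sketch of the controlled diffusion
#     dX_t = b(X_t, xi_t) dt + sigma(X_t) dW_t
# in one dimension under a stationary Markov control xi_t = v(X_t).
# The choices of b, sigma, K = [-1, 1], and v are illustrative; they
# satisfy the Lipschitz and affine-growth conditions of Assumption 3.1 (i).

def b(x, xi):
    return -x + xi                               # controlled drift

def sigma(x):
    return 1.0                                   # non-degenerate diffusion

def v(x):
    return float(np.clip(-0.5 * x, -1.0, 1.0))   # Markov control in K

def simulate(x0, T=10.0, dt=1e-3, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dw = rng.normal(scale=np.sqrt(dt))
        x[k + 1] = x[k] + b(x[k], v(x[k])) * dt + sigma(x[k]) * dw
    return x

path = simulate(1.0)
print(path[-1])
```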
Let c : R^d × K → R be a continuous running reward function, which is assumed bounded from above, and define the optimal risk-sensitive value J∗ by

J∗ := sup_{ {ξ_t}_{t≥0} } liminf_{T→∞} (1/T) log E[ e^{∫_0^T c(X_t, ξ_t) dt} ],

where the supremum is over all admissible controls, and E denotes the expectation operator. This problem is translated to an ergodic control problem for the operator A : C²(R^d) → C(R^d × K × R^d), defined by

(1.1)    A φ(x, ξ, y) := ½ trace( a(x) ∇²φ(x) ) + ⟨ b(x, ξ) + a(x) y, ∇φ(x) ⟩,

where ∇² denotes the Hessian, and a(x) = σ(x)σ^T(x), that seeks to maximize the average value of the functional

(1.2)    L(x, ξ, y) := c(x, ξ) − ½ |σ^T(x) y|²,  (x, ξ, y) ∈ R^d × K × R^d.

We first show that the generalized principal eigenvalue λ∗ (see (3.17)) of the maximal operator

(1.3)    G f(x) := ½ trace( a(x) ∇²f(x) ) + max_{ξ∈K} [ ⟨ b(x, ξ), ∇f(x) ⟩ + c(x, ξ) f(x) ]

is simple. An important hypothesis for this is that c − λ∗ is negative and bounded away from zero on the complement of some compact set (see Assumption 3.1 (iii)). This is always satisfied if −c is an inf-compact function (i.e., the sublevel sets {−c ≤ κ} are compact, or empty, in R^d × K for each κ ∈ R), or if c is a positive function vanishing at infinity and the process {X_t}_{t≥0} is recurrent under some stationary Markov control. Let the positive function Φ∗ ∈ C²(R^d), normalized as Φ∗(0) = 1 to render it unique, denote the principal eigenvector, that is, G Φ∗ = λ∗ Φ∗, and define ϕ∗ = log Φ∗. The function

(1.4)    H(x) := ½ |σ^T(x) ∇ϕ∗(x)|²,  x ∈ R^d,

plays a very important role in the analysis, and can be interpreted as an infinitesimal relative entropy rate (see section 4).
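The translation from the Feynman–Kac operator (1.3) to the ergodic pair (A, L) of (1.1)–(1.2) rests on completing the square in the variable y. A short derivation, supplied here for the reader's convenience (it is implicit in the text):

```latex
% Write f = e^{g}. Then \nabla f = f\,\nabla g and
% \nabla^2 f = f\left(\nabla^2 g + \nabla g\,\nabla g^{\mathsf T}\right), so
\frac{\mathcal G f}{f}
   = \tfrac{1}{2}\operatorname{trace}\!\left(a\,\nabla^2 g\right)
     + \tfrac{1}{2}\bigl|\sigma^{\mathsf T}\nabla g\bigr|^{2}
     + \max_{\xi\in K}\bigl[\langle b,\nabla g\rangle + c\bigr].
% On the other hand, for fixed x,
\max_{y\in\mathbb R^{d}}\Bigl[\langle a y,\nabla g\rangle
     - \tfrac{1}{2}\bigl|\sigma^{\mathsf T} y\bigr|^{2}\Bigr]
   = \tfrac{1}{2}\bigl|\sigma^{\mathsf T}\nabla g\bigr|^{2},
   \qquad \text{attained at } y = \nabla g,
% and therefore
\frac{\mathcal G f}{f}
   = \max_{\xi\in K}\,\max_{y\in\mathbb R^{d}}
     \bigl[\mathcal A g(x,\xi,y) + L(x,\xi,y)\bigr].
```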
To keep the notation simple, we define Z := R^d × K × R^d, and use the single variable z = (x, ξ, y) ∈ Z. Let P(Z) denote the set of probability measures on the Borel σ-algebra of Z, and M_A denote the set of infinitesimal ergodic occupation measures for the operator A, defined by

(1.5)    M_A := { μ ∈ P(Z) : ∫_Z A f(z) μ(dz) = 0 ∀ f ∈ C_c²(R^d) },

where C_c²(R^d) is the class of functions in C²(R^d) which have compact support. We also define

(1.6)    P∗(Z) := { μ ∈ P(Z) : ∫_Z H(x) μ(dx, dξ, dy) < ∞ },
         P◦(Z) := { μ ∈ P(Z) : ∫_Z L(z) μ(dz) > −∞ }.
Then, under the mild hypotheses of Assumption 3.1, we show in Proposition 4.1 that

(1.7)    J∗ = λ∗ = sup_{μ∈P∗(Z)} inf_{g∈C_c²(R^d)} ∫_Z ( A g(z) + L(z) ) μ(dz) = max_{μ∈M_A ∩ P∗(Z)} ∫_Z L(z) μ(dz).

We next specialize the results to the case where the diffusion matrix a is bounded and uniformly elliptic (see Assumption 4.4), and show in Theorem 4.10 that under any of the hypotheses of Assumption 4.7 we have M_A ∩ P◦(Z) ⊂ P∗(Z). This permits us to replace P∗(Z) with P(Z) and M_A ∩ P∗(Z) with M_A in the second and third equalities of (1.7), respectively. We note here that if a is bounded and uniformly elliptic, then Assumption 4.7 is satisfied when either −c is inf-compact, or ⟨b, x⟩⁻ has subquadratic growth, or |b|²/|c| is bounded.

We also show that if H |ϕ∗| is bounded (see Lemma 4.11 for explicit conditions on the parameters under which this holds), then we can commute the 'sup' and the 'inf' to obtain

J∗ = inf_{g∈C_c²(R^d)} sup_{μ∈P(Z)} ∫_Z ( A g(z) + L(z) ) μ(dz).

Also, in Theorem 4.12, we establish the variational formula over the class of functions in C²(R^d) whose partial derivatives up to second order have at most polynomial growth in |x|.

The standard Euclidean norm in R^d is denoted by |·|, and N stands for the set of natural numbers. The closure, the boundary and the complement of a set A ⊂ R^d are denoted by Ā, ∂A and A^c, respectively. We denote by τ(A) the first exit time of the process {X_t} from the set A ⊂ R^d, defined by

τ(A) := inf { t > 0 : X_t ∉ A }.

The open ball of radius r in R^d, centered at x ∈ R^d, is denoted by B_r(x), and B_r is the ball centered at 0. We let τ_r := τ(B_r), and ˘τ_r := τ(B_r^c). For a Borel space Y, P(Y) denotes the set of probability measures on its Borel σ-algebra. The term domain in R^d refers to a nonempty, connected open subset of the Euclidean space R^d.
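The identity J∗ = λ∗ in (1.7) has an elementary finite-state, discrete-time counterpart that can be checked numerically. The sketch below (ours; the chain, reward, and horizon are arbitrary choices) identifies the uncontrolled risk-sensitive growth rate with the log of the Perron root of the Feynman–Kac matrix:

```python
import numpy as np

# Finite-state, discrete-time analogue of J* = lambda*, without control:
# for a chain with transition matrix P and reward c, the growth rate
#     lim_{T->oo} (1/T) log E_x[ exp( sum_{t<T} c(X_t) ) ]
# equals the log of the Perron root of M = diag(e^c) P.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])
c = np.array([1.0, -0.5, 0.2])
M = np.diag(np.exp(c)) @ P

lam = np.log(max(np.linalg.eigvals(M).real))   # log of the Perron root

# (M^T 1)(x) = E_x[exp(sum_{t<T} c(X_t))]; iterate with rescaling for
# numerical stability and read off the exponential growth rate.
T = 4000
f = np.ones(3)
acc = 0.0
for _ in range(T):
    f = M @ f
    s = f.max()
    acc += np.log(s)
    f /= s
rate = (acc + np.log(f)) / T    # one estimate per initial state x
print(rate, lam)
```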
For a domain D ⊂ R^d, the space C^k(D) (C_b^k(D)) refers to the class of all real-valued functions on D whose partial derivatives up to order k exist and are continuous (and bounded). In addition, C_c^k(D) denotes the class of functions in C^k(D) that have compact support. The space L^p(D), p ∈ [1, ∞), stands for the Banach space of (equivalence classes of) measurable functions f satisfying ∫_D |f(x)|^p dx < ∞, and L^∞(D) is the Banach space of functions that are essentially bounded in D. The standard Sobolev space of functions on D whose generalized derivatives up to order k are in L^p(D), equipped with its natural norm, is denoted by W^{k,p}(D), k ≥ 0, p ≥ 1.

If X is a space of real-valued functions on Q, X_loc consists of all functions f such that fϕ ∈ X for every ϕ ∈ C_c^∞(Q), the space of smooth functions on Q with compact support. In this manner we obtain for example the space W^{2,p}_loc(Q).

We adopt the notation ∂_t := ∂/∂t, and for i, j ∈ N, ∂_i := ∂/∂x_i and ∂_ij := ∂²/(∂x_i ∂x_j), and use the standard summation rule that repeated subscripts and superscripts are summed from 1 through d.
2. The problem on a bounded domain.
In this section, we consider the risk-sensitive reward maximization with state dynamics given by a reflected diffusion on a bounded C² domain Q ⊂ R^d with co-normal direction of reflection. In particular, the dynamics are given by

(2.1)    dX_t = b(X_t, ξ_t) dt + σ(X_t) dW_t − γ(X_t) dη_t,

where η_t denotes the local time of the process X on the boundary ∂Q. The random processes in (2.1) live in a complete probability space (Ω, F, P). The process W = (W_t)_{t≥0} is a d-dimensional standard Wiener process independent of the initial condition X_0. The control process ξ = (ξ_t)_{t≥0} takes values in a compact, metrizable set K, and ξ_t(ω) is jointly measurable in (t, ω) ∈ [0, ∞) × Ω. The set of admissible controls
Ξ consists of the control processes ξ that are non-anticipative: for s < t, W_t − W_s is independent of

(2.2)    F_s := the completion of σ{X_0, ξ_r, W_r, r ≤ s} relative to (F, P).

Concerning the coefficients of the equation, we assume the following:
(i) The drift b is a continuous map from Q̄ × K to R^d, and Lipschitz in its first argument uniformly with respect to the second.
(ii) The diffusion matrix σ : Q̄ → R^{d×d} is continuously differentiable, its derivatives are Hölder continuous, and it is non-degenerate in the sense that the minimum eigenvalue of a(x) = [a^{ij}(x)] := σ(x)σ^T(x) on Q̄ is bounded away from zero.
(iii) The reflection direction γ = [γ_1(x), ..., γ_d(x)]^T : R^d → R^d is co-normal, that is, γ is given by

γ_i(x) = Σ_{j=1}^d a^{ij}(x) n_j(x),  x ∈ ∂Q,

where n(x) = [n_1(x), ..., n_d(x)]^T is the unit outward normal.

We let Ξ_sm denote the set of stationary Markov controls, that is, the set of Borel measurable functions v : R^d → K. Given ξ ∈ Ξ, the stochastic differential equation in (2.1) has a unique strong solution. The same is true for the class of Markov controls [8, Chapter 2]. Let P^x_ξ and E^x_ξ denote the probability measure and expectation operator on the canonical space of the process controlled under ξ ∈ Ξ, with initial condition X_0 = x.

Given a continuous reward function c : Q̄ × K → R, which is Lipschitz continuous in its first argument uniformly with respect to the second, the objective of the risk-sensitive reward problem is to maximize

(2.3)    J^x_ξ(c; Q) = liminf_{T→∞} (1/T) log E^x_ξ[ e^{∫_0^T c(X_t, ξ_t) dt} ],  x ∈ Q̄,

over all admissible controls ξ ∈ Ξ. We define

(2.4)    J^x_∗(c; Q) := sup_{ξ∈Ξ} J^x_ξ(c; Q),  x ∈ Q̄,  and  J_∗(c; Q) := sup_{x∈Q̄} J^x_∗(c; Q).

The solution of this problem shows that J^x_∗(c; Q) does not depend on x. We let

C²_γ(Q̄) := { f ∈ C²(Q̄) : ⟨∇f, γ⟩ = 0 on ∂Q },
and C²_{γ,+}(Q̄) denote its subspace consisting of nonnegative functions. For f ∈ C²(Q̄) and ξ ∈ K, we define

(2.5)    L_ξ f(x) := ½ trace( a(x) ∇²f(x) ) + ⟨ b(x, ξ), ∇f(x) ⟩,
         G f(x) := ½ trace( a(x) ∇²f(x) ) + max_{ξ∈K} [ ⟨ b(x, ξ), ∇f(x) ⟩ + c(x, ξ) f(x) ].

We summarize some results from [9] that are needed in Theorem 2.1 below. Without loss of generality we assume that 0 ∈ Q. Consider the operator S_t : C(Q̄) → C(Q̄), t ∈ R_+, defined by

S_t f(x) := sup_{ξ∈Ξ} E^x_ξ[ e^{∫_0^t c(X_s, ξ_s) ds} f(X_t) ].

The characterization of S_t is exactly analogous to [9, Theorem 3.2], which considers the minimization problem (see also [9, Remark 4.2]). Specifically, for each f ∈ C^{2+δ}_γ(Q̄) and T > 0, the quasi-linear parabolic p.d.e.

∂_t u(t, x) = G u(t, x) in (0, T] × Q,

with u(0, x) = f(x) for all x ∈ Q̄, and ⟨∇u(t, x), γ(x)⟩ = 0 for all (t, x) ∈ (0, T] × ∂Q, has a unique solution in C^{1+δ/2, 2+δ}([0, T] × Q̄). This solution has the stochastic representation u(t, x) = S_t f(x) for all (t, x) ∈ [0, T] × Q̄.

Following the analysis in [9] we obtain the following characterization of J_∗(c; Q) defined in (2.4).

Theorem
2.1. There exists a unique pair (ρ, V) ∈ R × C²_{γ,+}(Q̄) which solves

(2.6)    G V = ρ V in Q,  ⟨∇V, γ⟩ = 0 on ∂Q,  and V(0) = 1.

Also, S_t V(x) = e^{ρt} V(x) for (x, t) ∈ Q̄ × [0, ∞). In addition, we have J^x_∗(c; Q) = J_∗(c; Q) = ρ for all x ∈ Q̄, and

(2.7)    ρ = inf_{f∈C²_{γ,+}(Q̄), f>0} sup_{x∈Q} G f(x)/f(x) = sup_{f∈C²_{γ,+}(Q̄), f>0} inf_{x∈Q} G f(x)/f(x).

Proof.
Equation (2.7) is the result in [9, Lemma 2.1], while the other assertions follow from Lemma 4.5 and Remark 4.2 in [9].
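A minimal numerical illustration of the min-max characterization (2.7), in one dimension and without control (our own discretization sketch; the interval, mesh, and reward c(x) = cos x are arbitrary choices):

```python
import numpy as np

# Discretize G f = (1/2) f'' + c(x) f on Q = (-1, 1) with Neumann
# boundary conditions f'(+-1) = 0, using mirrored ghost nodes. The
# resulting matrix has positive off-diagonal entries, so the
# Collatz-Wielandt bounds of (2.7),
#     min_x G f / f  <=  rho  <=  max_x G f / f   (for any f > 0),
# hold, with equality at the principal eigenvector.
n = 400
x = np.linspace(-1.0, 1.0, n)
h = x[1] - x[0]

G = np.zeros((n, n))
for i in range(n):
    G[i, i] = -1.0 / h**2 + np.cos(x[i])
    if i > 0:
        G[i, i - 1] = 0.5 / h**2
    if i < n - 1:
        G[i, i + 1] = 0.5 / h**2
G[0, 1] = 1.0 / h**2               # Neumann: mirrored ghost node
G[n - 1, n - 2] = 1.0 / h**2

w, V = np.linalg.eig(G)
k = np.argmax(w.real)
rho, v = w[k].real, V[:, k].real
v *= np.sign(v[np.argmax(np.abs(v))])   # principal eigenvector, positive

ratios = (G @ v) / v                    # constant (= rho) at the optimum
print(rho, ratios.min(), ratios.max())
```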
Define

L(x, ξ, y) := c(x, ξ) − ½ |σ^T(x) y|²,  (x, ξ, y) ∈ Q̄ × K × R^d,

and an operator A : C²_γ(Q̄) → C(Q̄ × K × R^d) by

A φ(x, ξ, y) := ½ trace( a(x) ∇²φ(x) ) + ⟨ b(x, ξ) + a(x) y, ∇φ(x) ⟩.

It is important to note that if f ∈ C²_{γ,+}(Q̄) is a positive function and g = log f, then

G f(x)/f(x) = max_{ξ∈K} max_{y∈R^d} [ A g(x, ξ, y) + L(x, ξ, y) ].

Hence, by (2.7),

(2.8)    ρ = inf_{g∈C²_γ(Q̄)} sup_{x∈Q̄} sup_{ξ∈K, y∈R^d} ( A g(x, ξ, y) + L(x, ξ, y) )
           = sup_{g∈C²_γ(Q̄)} inf_{x∈Q̄} sup_{ξ∈K, y∈R^d} ( A g(x, ξ, y) + L(x, ξ, y) ).

We let

(2.9)    F(g, μ) := ∫_{Q̄×K×R^d} ( A g(x, ξ, y) + L(x, ξ, y) ) μ(dx, dξ, dy)

for g ∈ C²_γ(Q̄) and μ ∈ P(Q̄ × K × R^d). It is clear that (2.8) can be written as

(2.10)    ρ = inf_{g∈C²_γ(Q̄)} sup_{μ∈P(Q̄×K×R^d)} F(g, μ).

Let M_{A,Q} denote the class of infinitesimal ergodic occupation measures for the operator A, defined by

M_{A,Q} := { μ ∈ P(Q̄ × K × R^d) : ∫_{Q̄×K×R^d} A f dμ = 0 ∀ f ∈ C²_γ(Q̄) }.

Implicit in this definition is the requirement that ∫ |A f| dμ < ∞ for all f ∈ C²_γ(Q̄) and μ ∈ M_{A,Q}. We have the following result.

Theorem
2.2. It holds that

(2.11)    ρ = inf_{g∈C²_γ(Q̄)} sup_{μ∈P(Q̄×K×R^d)} F(g, μ) = sup_{μ∈P(Q̄×K×R^d)} inf_{g∈C²_γ(Q̄)} F(g, μ).

Moreover, P(Q̄ × K × R^d) may be replaced with M_{A,Q} in (2.11), and thus

ρ = sup_{μ∈M_{A,Q}} ∫_{Q̄×K×R^d} L(x, ξ, y) μ(dx, dξ, dy).

Proof.
The first equality in (2.11) follows by (2.10). We continue to prove the rest of the assertions. First note that

sup_{μ∈P(Q̄×K×R^d)} inf_{g∈C²_γ(Q̄)} F(g, μ) = ρ̂ := sup_{μ∈M_{A,Q}} ∫_{Q̄×K×R^d} L(x, ξ, y) μ(dx, dξ, dy),

because the infimum on the left hand side is −∞ for μ ∉ M_{A,Q}. It follows by (2.10) that ρ̂ ≤ ρ. Let v∗ be a measurable selector from the maximizer of (2.6), that is,

⟨ b(x, v∗(x)), ∇V(x) ⟩ + c(x, v∗(x)) V(x) = max_{ξ∈K} [ ⟨ b(x, ξ), ∇V(x) ⟩ + c(x, ξ) V(x) ].

With φ := log V, (2.6) takes the form

(2.12)    A φ(x, v∗(x), ∇φ(x)) + L(x, v∗(x), ∇φ(x)) = ρ.
The reflected diffusion with drift b(x, v∗(x)) + a(x)∇φ(x) is of course exponentially ergodic. Let η∗ denote its invariant probability measure. Then, (2.12) implies that

(2.13)    ∫_Q L(x, v∗(x), ∇φ(x)) η∗(dx) = ρ.

Let μ∗ ∈ P(Q̄ × K × R^d) be defined by

μ∗(dx, dξ, dy) := η∗(dx) δ_{v∗(x)}(dξ) δ_{∇φ(x)}(dy),

where δ_y denotes the Dirac mass at y. Then μ∗ is an ergodic occupation measure for the controlled reflected diffusion with drift b(x, ξ) + a(x)y, and thus μ∗ ∈ M_{A,Q}. Let g ∈ C²_γ(Q̄) be arbitrary. Then

F(g, μ∗) = ∫_{Q̄×K×R^d} L(x, ξ, y) μ∗(dx, dξ, dy) = ρ,

where the second equality follows by (2.13). Thus ρ̂ ≥ ρ, and since we have already asserted the reverse inequality, we must have equality. This establishes (2.11), and also proves the last assertion of the theorem.
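The defining property of an (infinitesimal) ergodic occupation measure, namely that the generator integrates to zero against it, has an elementary finite-state analogue, sketched below with an arbitrary 4-state generator of our own choosing (not from the paper):

```python
import numpy as np

# Finite-state analogue of the defining property of an infinitesimal
# ergodic occupation measure: if pi is the invariant distribution of a
# Markov-chain generator Q (pi Q = 0), then the "integral" of Q g
# against pi vanishes for every test function g.
rng = np.random.default_rng(1)
Q = rng.random((4, 4))                  # arbitrary positive rates
np.fill_diagonal(Q, 0.0)
np.fill_diagonal(Q, -Q.sum(axis=1))     # rows sum to zero

# invariant distribution: normalized left null vector of Q
w, V = np.linalg.eig(Q.T)
k = np.argmin(np.abs(w))
pi = np.abs(V[:, k].real)
pi /= pi.sum()

g = rng.standard_normal(4)              # arbitrary test function
print(pi @ (Q @ g))                     # numerically zero
```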
3. The risk-sensitive reward problem on R^d.
In this section we study the risk-sensitive reward maximization problem on R^d. We consider a controlled diffusion of the form

(3.1)    dX_t = b(X_t, ξ_t) dt + σ(X_t) dW_t.

All random processes in (3.1) live in a complete probability space (Ω, F, P). The control process {ξ_t}_{t≥0} lives in a compact metrizable space K. We approach the problem in R^d as a limit of Dirichlet or Neumann eigenvalue problems on balls B_r, r >
0. Differentiability of the matrix a can be relaxed here. Consider the eigenvalue problem on a ball B_r, with Neumann boundary conditions, and the reflection direction along the exterior normal n(x) to ∂B_r at x. The drift b : B̄_r × K → R^d is continuous, and Lipschitz in its first argument uniformly with respect to the second. The diffusion matrix a is Lipschitz continuous on B̄_r and non-degenerate. Let ρ_r denote the principal eigenvalue on B_r under Neumann boundary conditions of the operator G defined in (2.5). We refer to ρ_r as the Neumann eigenvalue on B_r. It follows from the results in [30] (see in particular Theorems 5.1, 6.6, and Proposition 7.1) that there exists a unique V_r ∈ C²(B_r) ∩ C^{0,1}(B̄_r), with V_r > 0 in B_r and V_r(0) = 1, solving

(3.2)    ½ trace( a(x) ∇²V_r(x) ) + max_{ξ∈K} [ ⟨ b(x, ξ), ∇V_r(x) ⟩ + c(x, ξ) V_r(x) ] = ρ_r V_r(x),

and ⟨∇V_r(x), n(x)⟩ = 0 on ∂B_r. We also refer the reader to [24, Theorem 12.1, p. 195].

We adopt the following structural hypotheses on the coefficients of (3.1) and the reward function c.

Assumption 3.1.
(i) The drift b : R^d × K → R^d is continuous, and for some constant C_R > 0 depending on R > 0, we have

|b(x, ξ) − b(y, ξ)| + ‖σ(x) − σ(y)‖ ≤ C_R |x − y|  ∀ x, y ∈ B_R, ∀ ξ ∈ K,

Σ_{i,j=1}^d a^{ij}(x) ζ_i ζ_j ≥ C_R^{-1} |ζ|²  ∀ (x, ζ) ∈ B_R × R^d,

(3.3)    |b(x, ξ)| + ‖σ(x)‖ ≤ C(1 + |x|)  ∀ (x, ξ) ∈ R^d × K,

where ‖σ‖ := (trace σσ^T)^{1/2} denotes the Hilbert–Schmidt norm of σ.
(ii) The reward function c : R^d × K → R is continuous and locally Lipschitz in its first argument uniformly with respect to ξ ∈ K, is bounded from above in R^d, and x ↦ max_{ξ∈K} |c(x, ξ)| has polynomial growth in |x|.
(iii) We assume that the Neumann eigenvalues ρ_n satisfy

(3.4)    ρ∗ := limsup_{n→∞} ρ_n > lim_{r→∞} sup_{(x,ξ)∈B_r^c×K} c(x, ξ).

Assumption 3.1 is enforced throughout the rest of the paper, unless mentioned otherwise. Part (i) of this assumption comprises the usual hypotheses that guarantee existence and uniqueness of strong solutions to (3.1) under any admissible control.
Remark. Assumption 3.1 (iii) holds in two important cases. First, when −c is inf-compact. In this case we have ρ∗ ≤ sup_{R^d×K} c and ρ∗ > −∞, since the Dirichlet eigenvalues, which are a lower bound for ρ∗, are increasing as a function of the domain [7, Lemma 2.1]. Second, when c is positive and vanishes at infinity, and under some stationary Markov control the process {X_t}_{t≥0} in (3.1) is recurrent. This can be established by comparing ρ_n with the Dirichlet eigenvalue on B_n (see subsection 3.2), and using [7, Theorems 2.6 and 2.7 (ii)]. For related studies concerning the class of running reward functions vanishing at infinity, albeit in the uncontrolled case, see [22, 23, 7, 10]. See also [4, Theorem 2.12], which studies the Collatz–Wielandt formula for the risk-sensitive minimization problem.

Recall that Ξ_sm denotes the set of stationary Markov controls. For v ∈ Ξ_sm, we use the simplifying notation b_v(x) := b(x, v(x)), c_v(x) := c(x, v(x)), and define L_v analogously.

We next review some properties of eigenvalues of linear and semilinear operators on R^d. For f ∈ C²(R^d) and ψ ∈ W^{2,d}_loc(R^d), define

(3.5)    L̃^ψ_ξ f := L_ξ f + ⟨ a∇ψ, ∇f ⟩,

with L_ξ as in (2.5). Let v ∈ Ξ_sm. Suppose that a positive function Ψ ∈ W^{2,d}_loc(R^d) and λ ∈ R solve the equation

(3.6)    L_v Ψ(x) + c_v(x) Ψ(x) = λ Ψ(x)  a.e. x ∈ R^d.

We refer to any such solution (Ψ, λ) as an eigenpair of the operator L_v + c_v, and we say that Ψ is an eigenvector with eigenvalue λ. Note that by eigenvector we always mean a positive function. Let ψ = log Ψ. We refer to the Itô stochastic differential equation

(3.7)    dX̃_t = ( b_v(X̃_t) + a(X̃_t) ∇ψ(X̃_t) ) dt + σ(X̃_t) dW_t

as the twisted
SDE, and to its solution as the twisted process corresponding to Ψ. Clearly L̃^ψ_v is the extended generator of (3.7). We define the generalized principal eigenvalue λ_v = λ_v(c_v) of the operator L_v + c_v by

(3.8)    λ_v := inf { λ ∈ R : ∃ φ ∈ W^{2,d}_loc(R^d), φ > 0, L_v φ + (c_v − λ)φ ≤ 0 a.e. in R^d }.

A principal eigenvector Ψ_v ∈ W^{2,d}_loc(R^d) is a positive solution of (3.6) with λ = λ_v. A principal eigenvector is also called a ground state, and we refer to the corresponding twisted SDE and twisted process as a ground state SDE and ground state process, respectively. Unlike what is common in criticality theory, our definition of a ground state does not require the minimal growth property of the principal eigenfunction (see [6]).

An easy calculation shows that any eigenpair (Ψ, λ) of L_v + c_v satisfies

(3.9)    L̃^ψ_v Ψ^{-1}(x) − c_v(x) Ψ^{-1}(x) = −λ Ψ^{-1}(x)  a.e. x ∈ R^d,

with ψ = log Ψ. In other words, (Ψ^{-1}, −λ) is an eigenpair of L̃^ψ_v − c_v. Note also that (ψ, λ) is a solution to the 'linear' eigenvalue equation

(3.10)    L̃^ψ_v ψ − ½|σ^T∇ψ|² + c_v = λ,

and that this equation can also be written as

(3.11)    L_v ψ + max_{y∈R^d} [ ⟨ay, ∇ψ⟩ − ½|σ^T y|² ] + c_v = λ.

An extensive study of generalized principal eigenvalues with applications to risk-sensitive control can be found in [3, 7]. In these papers, the 'potential' c_v is assumed to be bounded below in R^d, so the results cannot be quoted directly. It is not our intention to reproduce all these results for potentials which are bounded above, so we only focus on results that are needed later in this paper. We only quote results in [3, 7] which do not depend on the assumption that c_v is bounded below. Generally speaking, caution should be exercised with arguments in [3, 7] that employ the Fatou lemma. On the other hand, since c usually appears in the exponent, invoking Fatou's lemma hardly ever poses any problems.

Suppose that the twisted process in (3.7) is regular, that is, the solution exists for all times.
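As an aside, the twisted (ground-state) transform has a transparent finite-state analogue, which the following sketch checks numerically; the birth-death generator and potential are arbitrary illustrative choices of ours:

```python
import numpy as np

# Finite-state analogue of the twisted transform: for a chain generator
# Q and potential c, let (Psi, lam) be the principal eigenpair of
# Q + diag(c). Then
#     Qt f := Psi^{-1} (Q + diag(c) - lam I)(Psi f)
# is again a Markov generator (non-negative off-diagonal entries, zero
# row sums), mirroring the passage from (3.6) to the twisted SDE (3.7).
n = 6
Q = np.zeros((n, n))
for i in range(n):
    if i > 0:
        Q[i, i - 1] = 1.0               # death rate
    if i < n - 1:
        Q[i, i + 1] = 2.0               # birth rate
    Q[i, i] = -Q[i].sum()
c = np.sin(np.arange(n))                # arbitrary bounded potential

w, V = np.linalg.eig(Q + np.diag(c))
k = np.argmax(w.real)
lam, Psi = w[k].real, V[:, k].real
Psi *= np.sign(Psi[0])                  # principal eigenvector, positive

Qt = np.diag(1.0 / Psi) @ (Q + np.diag(c) - lam * np.eye(n)) @ np.diag(Psi)
print(np.abs(Qt.sum(axis=1)).max())     # row sums vanish
```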
Then, an application of [7, Lemma 2.3] shows that an eigenvector Ψ has the stochastic representation (semigroup property)

Ψ(x) = E^x_v[ e^{∫_0^t [c_v(X_s) − λ] ds} Ψ(X_t) ].

Recall that ˘τ_r denotes the first hitting time of the ball B_r, for r > 0. We need the following lemma.
Lemma
3.3. We assume only Assumption 3.1. The following hold.
(a) If (Ψ, λ) is an eigenpair of L_v + c_v under some v ∈ Ξ_sm, and the twisted process in (3.7) is exponentially ergodic, then we have the stochastic representation

(3.12)    Ψ(x) = E^x_v[ e^{∫_0^{˘τ_r} [c_v(X_s) − λ] ds} Ψ(X_{˘τ_r}) 1{˘τ_r < ∞} ]  ∀ x ∈ B̄_r^c, ∀ r > 0.

In addition, λ = λ_v, the generalized principal eigenvalue of L_v + c_v, and the ground state Ψ = Ψ_v is unique up to multiplication by a positive constant.
(b) Any eigenpair (Ψ, λ) ∈ W^{2,d}_loc(R^d) × R of L_v + c_v satisfying (3.12) is a principal eigenpair, and λ is a simple eigenvalue.

Proof. Combining the proof of [7, Theorem 2.2] with [7, Theorem 3.1], we deduce that for every r >
0, there exists a δ > 0 such that

(3.13)    E^x_v[ e^{∫_0^{˘τ_r} [c_v(X_s) − λ + δ] ds} 1{˘τ_r < ∞} ] < ∞,  x ∈ B_r^c.

Applying the Itô formula to (3.6) we obtain

(3.14)    Ψ(x) = E^x_v[ e^{∫_0^{t∧˘τ_r∧τ_n} [c_v(X_s) − λ] ds} Ψ(X_{t∧˘τ_r∧τ_n}) ] = J_1 + J_2 + J_3,  n > r,

where J_1, J_2, and J_3 denote the expectation on the right hand side restricted to the events {˘τ_r ≤ t ∧ τ_n}, {t < ˘τ_r ∧ τ_n}, and {τ_n ≤ t ∧ ˘τ_r}, respectively. For the first integral we have

lim_{n→∞} lim_{t→∞} J_1 = E^x_v[ e^{∫_0^{˘τ_r} [c_v(X_s) − λ] ds} Ψ(X_{˘τ_r}) 1{˘τ_r < ∞} ]

by monotone convergence. Note that the limit is also finite by (3.13). Let P̃^x_{ψ,v} and Ẽ^x_{ψ,v} denote the probability measure and expectation operator on the canonical space of the twisted process in (3.7) with initial condition X̃_0 = x. Next, using again the technique in [7, Theorem 2.2], we write

J_2 = e^{−δt} E^x_v[ e^{∫_0^{t∧˘τ_r∧τ_n} [c_v(X_s) − λ + δ] ds} Ψ(X_{t∧˘τ_r∧τ_n}) 1{t < ˘τ_r ∧ τ_n} ]
    ≤ e^{−δt} E^x_v[ e^{∫_0^{t∧˘τ_r∧τ_n} [c_v(X_s) − λ + δ] ds} Ψ(X_{t∧˘τ_r∧τ_n}) ]
    ≤ e^{−δt} Ψ(x) Ẽ^x_{ψ,v}[ e^{δ(t∧˘τ_r∧τ_n)} ]
    ≤ e^{−δt} Ψ(x) Ẽ^x_{ψ,v}[ e^{δ˘τ_r} ],

where in the second inequality we apply [7, Lemma 2.3]. Thus, J_2 vanishes as t → ∞. Concerning J_3, using monotone convergence, we obtain

(3.15)    lim_{t→∞} J_3 = E^x_v[ e^{∫_0^{τ_n} [c_v(X_s) − λ] ds} Ψ(X_{τ_n}) 1{τ_n < ˘τ_r} ] ≤ Ψ(x) P̃^x_{ψ,v}( τ_n < ˘τ_r ),

where the inequality follows from the proof of [7, Lemma 2.3]. In turn, the right-hand side of (3.15) vanishes as n → ∞, since the twisted process is geometrically ergodic. This completes the proof of (3.12).

Suppose that a positive φ ∈ W^{2,d}_loc(R^d) and λ̂ ≤ λ solve

L_v φ(x) + c_v(x) φ(x) ≤ λ̂ φ(x)  a.e. x ∈ R^d.

An application of Itô's formula and Fatou's lemma then shows that

(3.16)    φ(x) ≥ E^x_v[ e^{∫_0^{˘τ_r} [c_v(X_s) − λ̂] ds} φ(X_{˘τ_r}) 1{˘τ_r < ∞} ]  ∀ x ∈ B̄_r^c, ∀ r > 0.
Equations (3.12) and (3.16) imply that if we scale φ by multiplying it with a positive constant until it touches Ψ at one point from above, then the function φ/Ψ attains its minimum value of 1 at some point in B̄_r. A standard calculation shows that

L̃^ψ_v (φ/Ψ)(x) ≤ (λ̂ − λ)(φ/Ψ)(x).

Thus, φ/Ψ must equal a constant by the strong maximum principle, which implies that λ̂ = λ. This of course means that λ = λ_v. Uniqueness of Ψ_v is evident from the preceding argument. This completes the proof of part (a). Part (b) is evident from the preceding paragraph. This completes the proof.

We now turn to the eigenvalue problem on R^d. Recall the solution (V_r, ρ_r) of (3.2), the definition of ρ∗ in (3.4), and the definition of G in (1.3). We define

(3.17)    λ∗ := inf { λ ∈ R : ∃ φ ∈ W^{2,d}_loc(R^d), φ > 0, G φ − λφ ≤ 0 a.e. in R^d }.

Recall the definitions of A and L in (1.1) and (1.2). Note that if (Φ, λ) is an eigenpair of G, then, similarly to (3.11), we have

(3.18)    max_{ξ∈K} max_{y∈R^d} [ A ϕ(x, ξ, y) + L(x, ξ, y) ] = λ,

with ϕ = log Φ.

Theorem 3.4. There exists Φ∗ ∈ C²(R^d) satisfying

(3.19)    max_{ξ∈K} [ L_ξ Φ∗(x) + c(x, ξ) Φ∗(x) ] = ρ∗ Φ∗(x)  ∀ x ∈ R^d,

and the following hold:
(a) The function Φ∗^{-1} is inf-compact.
(b) If v∗ is an a.e. measurable selector from the maximizer of (3.19), then the diffusion with extended generator L̃^{ϕ∗}_{v∗}, as defined in (3.5), is exponentially ergodic and satisfies

(3.20)    L̃^{ϕ∗}_{v∗} Φ∗^{-1}(x) = ( c_{v∗}(x) − ρ∗ ) Φ∗^{-1}(x),

with ϕ∗ := log Φ∗.
(c) ρ∗ = λ∗.
(d) ρ_n → ρ∗ and V_n → Φ∗ as n → ∞ uniformly on compact sets, and the solution Φ∗ to (3.19) is unique up to a scalar multiple, and satisfies

(3.21)    Φ∗(x) ≥ E^x_v[ e^{∫_0^{˘τ_r} [c_v(X_s) − ρ∗] ds} Φ∗(X_{˘τ_r}) 1{˘τ_r < ∞} ]  ∀ x ∈ B̄_r^c,

for all r > 0, and for all v ∈ Ξ_sm, with equality if and only if v is an a.e. measurable selector from the maximizer in (3.19).

Proof.
Using Theorem 2.1 and (2.3)–(2.4), it follows that $\rho_n\le\sup_{\mathbb{R}^d\times K}c$, and this combined with Assumption 3.1 (iii) shows that $\{\rho_n\}$ converges along some subsequence $\{n_k\}_{k\in\mathbb{N}}\subset\mathbb{N}$ to $\rho_*$. Therefore, the convergence of $V_{n_k}$ along some further subsequence $\{n'_k\}\subset\{n_k\}$ to a $\Phi_*$ satisfying (3.19) follows as in the proof of [13, Lemma 2.1].
We now turn to part (a). Here, in fact, we show that $-\varphi_*$ has at least logarithmic growth in $|x|$. Let $\delta\in(0,1)$ be a constant such that $\rho_*-c(x,\xi)>\delta$ for all $x$ outside a compact subset of $\mathbb{R}^d$. Consider a function of the form $\phi(x)=\bigl(1+|x|^2\bigr)^{-\theta}$, with $\theta>0$. By (3.3), there exist $\theta>0$ and $r_\circ>0$ such that
\[
(3.22)\qquad
\max\biggl(L^\xi\phi(x)\,,\;\frac{\bigl|\sigma^{\mathsf T}(x)\,\nabla\phi(x)\bigr|^2}{\phi(x)}\biggr)\,\le\,\delta\,\phi(x)\qquad\forall\,(x,\xi)\in B_{r_\circ}^c\times K\,.
\]
We fix such a constant $\theta$. We restrict our attention to solutions $(V_n,\rho_n)$ of (3.2) over an increasing sequence in $\mathbb{N}$, also denoted as $\{n\}$, such that $\rho_n$ converges to $\rho_*$. It is clear then that we may enlarge the radius $r_\circ$, if needed, so that
\[
(3.23)\qquad
\rho_n-c(x,\xi)\,>\,\delta\qquad\forall\,(x,\xi)\in B_{r_\circ}^c\times K\,,\ \text{and}\ n\ge r_\circ\,.
\]
Next, let $\breve\chi\colon\mathbb{R}\to(0,\infty)$ be a convex function in $C^2(\mathbb{R})$ such that $\breve\chi(t)=t$ for $t\ge 2$, and $\breve\chi(t)$ is constant and positive for $t\le 1$. Such a function can be constructed by requiring, for example, that $\breve\chi''(t)=6(2-t)(t-1)$ for $t\in[1,2]$, which gives $\breve\chi(t)=-\frac{t^4}{2}+3t^3-6t^2+5t$ for $t\in[1,2]$ and $\breve\chi(1)=\frac32$. Note that $\breve\chi(t)-t\,\breve\chi'(t)\ge 0$ for all $t>0$. Define $\breve\chi_\epsilon(t):=\epsilon\,\breve\chi(t/\epsilon)$ for $\epsilon>0$. Then
\[
(3.24)\qquad
\breve\chi_\epsilon(t)-t\,\breve\chi'_\epsilon(t)\,\ge\,0\,,\quad\text{and}\quad
t\,\breve\chi''_\epsilon(t)\,=\,\tfrac{t}{\epsilon}\,\breve\chi''\bigl(\tfrac{t}{\epsilon}\bigr)\,\le\,\sup_{s>0}\,s\,\breve\chi''(s)\,<\,\infty\qquad\forall\,t>0\,.
\]
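For concreteness, the construction of $\breve\chi$ can be verified by integrating the stated second derivative twice (a sketch of that computation; the boundary values below confirm the $C^2$ gluing):

```latex
\begin{aligned}
&\breve\chi''(t) \,=\, 6(2-t)(t-1)\,, \qquad t\in[1,2]\,,\\
&\breve\chi'(t) \,=\, \int_1^t \breve\chi''(u)\,\mathrm{d}u \,=\, -2t^3+9t^2-12t+5\,,\\
&\breve\chi(t) \,=\, \tfrac32+\int_1^t \breve\chi'(u)\,\mathrm{d}u \,=\, -\tfrac{t^4}{2}+3t^3-6t^2+5t\,,\\
&\breve\chi'(1)=0\,,\quad \breve\chi'(2)=1\,,\quad \breve\chi''(1)=\breve\chi''(2)=0\,,\quad
\breve\chi(1)=\tfrac32\,,\quad \breve\chi(2)=2\,.
\end{aligned}
```

Hence $\breve\chi$ extends to a positive convex $C^2$ function equal to the constant $\frac32$ on $(-\infty,1]$ and to the identity on $[2,\infty)$, and the scaling $\breve\chi_\epsilon(t)=\epsilon\,\breve\chi(t/\epsilon)$ preserves these properties with thresholds $\epsilon$ and $2\epsilon$.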
Using (3.22)–(3.24), we obtain(3.25) L ξ ˘ χ ǫ (cid:0) φ ( x ) (cid:1) + (cid:0) c ( x, ξ ) − ρ n (cid:1) ˘ χ ǫ (cid:0) φ ( x ) (cid:1) ≤ − δ ˘ χ ǫ (cid:0) φ ( x ) (cid:1) + ˘ χ ′ ǫ (cid:0) φ ( x ) (cid:1) L ξ φ ( x ) + 12 ˘ χ ′′ ǫ (cid:0) φ ( x ) (cid:1) | σ T ( x ) ∇ φ ( x ) | ≤ − δ ˘ χ ǫ (cid:0) φ ( x ) (cid:1) + δφ ( x ) ˘ χ ′ ǫ (cid:0) φ ( x ) (cid:1) + 12 δ (cid:0) φ ( x ) (cid:1) ˘ χ ′′ ǫ (cid:0) φ ( x ) (cid:1) ≤ − δ ˘ χ ǫ (cid:0) φ ( x ) (cid:1) . For the last inequality in (3.25), we use the properties ˘ χ ǫ ( φ ) ≥ φ ˘ χ ′ ǫ ( φ ) and φ ˘ χ ′′ ǫ ( φ ) < χ ǫ ( φ ) ≥ φ and δ < 1. Note that, due to radialsymmetry, the support of ˘ χ ′ ǫ ◦ φ is a ball of the form B R ǫ , with ǫ R ǫ an nonincreasingcontinuous function with R ǫ → ∞ as ǫ ց 0. Recall the functions V n in (3.2). Select ǫ such that R ǫ = n > r ◦ . Scale V n until it touches ˘ χ ǫ ◦ φ at some point ˆ x from below.Here, ˘ χ ǫ ◦ φ denotes the composition of ˘ χ ǫ and φ . Let v n be a measurable selectorfrom the minimizer in (3.2), and define h n := ˘ χ ǫ ◦ φ − V n . Then, by (3.2) and (3.25),we have L v n h n ( x ) + (cid:0) c v n ( x ) − ρ n (cid:1) h n ( x ) < ∀ x ∈ R d , and h∇ h n , γ i = 0 on ∂B n , since the gradient of ˘ χ ǫ ◦ φ vanishes on ∂B R ǫ . It followsby the strong maximum principle that ˆ x cannot lie in the B n \ B r ◦ . Thus h n > x cannot lie on ∂B n either, without contradicting theHopf boundary point lemma. Thus ˆ x ∈ B r ◦ . This however shows by taking limits as ǫ ց 0, and employing the Harnack inequality which asserts that V n ( x ) ≤ C H V n ( y ) forall x, y ∈ B r ◦ for some constant C H , that Φ ∗ ≤ Cφ for some constant C . This provespart (a).Equation (3.20) follows by (3.9). Since Φ − ∗ is inf-compact and the right handside of (3.20) is negative and bounded away from zero outside a compact set byAssumption 3.1 (iii), the associated diffusion is ergodic [22, Theorem 4.1]. In turn, theFoster–Lyapunov equation in (3.20) shows that the diffusion is exponentially ergodic[28]. 
This proves part (b).
Moving to the proof of part (c), suppose that for some $\rho\le\rho_*$ and some positive $\phi\in W^{2,d}_{\mathrm{loc}}(\mathbb{R}^d)$ we have
\[
(3.26)\qquad
\max_{\xi\in K}\,\bigl[L^\xi\phi(x)+c(x,\xi)\,\phi(x)\bigr]\,\le\,\rho\,\phi(x)\,.
\]
Evaluating this equation at a measurable selector $v_*$ from the maximizer of (3.19), and following the argument in the proof of Lemma 3.3, we obtain $\rho=\rho_*$ and $\phi=\Phi_*$. This also shows that $\rho_*\ge\lambda_*$ by the definition in (3.17), and thus we have equality by (3.19).
In order to prove part (d), suppose that $\rho_n\to\rho\le\rho_*$ along some subsequence. Taking limits along perhaps a further subsequence, we obtain a positive function $\phi\in C^2(\mathbb{R}^d)$ that satisfies (3.26) with equality. Thus $\rho=\rho_*$ and $\phi=\Phi_*$ by part (c). The stochastic representation in (3.21) follows as in the proof of Lemma 3.3. This completes the proof.

In this section we first show that the problem in $\mathbb{R}^d$ can also be approached by using Dirichlet eigensolutions. The main result is Theorem 3.6, which establishes that $\rho_*$ equals the risk-sensitive value $J_*$, together with the usual verification of optimality criterion.
We borrow some results from [11, 12]. These can also be found in [3, Lemma 2.2], and are summarized as follows. Fix any $v\in\Xi_{\mathrm{sm}}$. For each $r\in(0,\infty)$ there exists a unique pair $(\Psi_{v,r},\lambda_{v,r})\in\bigl(W^{2,p}(B_r)\cap C(\bar B_r)\bigr)\times\mathbb{R}$, for any $p>d$, satisfying $\Psi_{v,r}>0$ on $B_r$, $\Psi_{v,r}=0$ on $\partial B_r$, and $\Psi_{v,r}(0)=1$, which solves
\[
(3.27)\qquad
L_v\Psi_{v,r}(x)+c_v(x)\,\Psi_{v,r}(x)\,=\,\lambda_{v,r}\,\Psi_{v,r}(x)\qquad\text{a.e. }x\in B_r\,.
\]
Moreover, the solution has the following properties:
(i) The map $r\mapsto\lambda_{v,r}$ is continuous and strictly increasing.
(ii) In its dependence on the function $c_v$, $\lambda_{v,r}$ is nondecreasing, convex, and Lipschitz continuous (with respect to the $L^\infty$ norm) with Lipschitz constant 1.
In addition, if $c_v\lneq c'_v$ then $\lambda_{v,r}(c_v)<\lambda_{v,r}(c'_v)$. We refer to $\lambda_{v,r}$ and $\Psi_{v,r}$ as the (Dirichlet) eigenvalue and eigenfunction, respectively, of the operator $L_v+c_v$ on $B_r$.
Recall the definition of $\mathcal G$ in (1.3). Based on the results in [31], there exists a unique pair $(\Psi_{*,r},\lambda_{*,r})\in\bigl(C^2(B_r)\cap C(\bar B_r)\bigr)\times\mathbb{R}$, satisfying $\Psi_{*,r}>0$ on $B_r$, $\Psi_{*,r}=0$ on $\partial B_r$, and $\Psi_{*,r}(0)=1$, which solves
\[
(3.28)\qquad
\mathcal G\,\Psi_{*,r}(x)\,=\,\lambda_{*,r}\,\Psi_{*,r}(x)\qquad\forall\,x\in B_r\,,
\]
and properties (i)–(ii) above hold for $\lambda_{*,r}$. Also recall the definitions of the generalized principal eigenvalues in (3.8) and (3.17), and $\rho_r$ defined in (3.2).

Lemma 3.5. The following hold:
(i) For $r>0$, we have $\lambda_{v,r}\le\lambda_{*,r}$ for all $v\in\Xi_{\mathrm{sm}}$, and $\lambda_{*,r}<\rho_r$.
(ii) $\lim_{r\to\infty}\lambda_{v,r}=\lambda_v$ for all $v\in\Xi_{\mathrm{sm}}$, and $\lim_{r\to\infty}\lambda_{*,r}=\lambda_*$.

Proof. Part (i) is a straightforward application of the strong maximum principle. By (2.5) and (3.28) we have
\[
(3.29)\qquad
L_v\Psi_{*,r}(x)+c_v(x)\,\Psi_{*,r}(x)\,\le\,\lambda_{*,r}\,\Psi_{*,r}(x)\qquad\text{a.e. }x\in B_r\,.
\]
Let $r'<r$, and suppose that $\lambda_{v,r'}\ge\lambda_{*,r}$. Scale $\Psi_{v,r'}$ so that it touches $\Psi_{*,r}$ at one point from below in $B_{r'}$. Then $\Psi_{*,r}-\Psi_{v,r'}$ is nonnegative, and by (3.27) and (3.29),
\[
L_v(\Psi_{*,r}-\Psi_{v,r'})-\bigl(c_v-\lambda_{*,r}\bigr)^{-}(\Psi_{*,r}-\Psi_{v,r'})
\,\le\,-\bigl(c_v-\lambda_{*,r}\bigr)^{+}(\Psi_{*,r}-\Psi_{v,r'})-\bigl(\lambda_{v,r'}-\lambda_{*,r}\bigr)\Psi_{v,r'}\,\le\,0\quad\text{on }B_{r'}\,.
\]
This however implies that $\Psi_{*,r}=\Psi_{v,r'}$ on $B_{r'}$, which is a contradiction. Hence $\lambda_{v,r'}<\lambda_{*,r}$ for all $r'<r$, and the inequality $\lambda_{v,r}\le\lambda_{*,r}$ follows by the continuity of $r\mapsto\lambda_{v,r}$. Following the same method, with $r'=r$, we obtain $\lambda_{*,r}<\rho_r$.
Part (ii) follows by [7, Lemma 2.2 (ii)].
Recall the definitions in (2.3) and (2.4), and let $J^x_\xi=J^x_\xi(c):=J^x_\xi(c;\mathbb{R}^d)$, and similarly for $J^x_*$ and $J_*$. Also, recall that
\[
J^x_v\,=\,J^x_v(c)\,=\,\liminf_{T\to\infty}\,\frac1T\,\log E^x_v\Bigl[\mathrm{e}^{\int_0^T c_v(X_t)\,\mathrm{d}t}\Bigr]\,,\qquad x\in\mathbb{R}^d\,,\ v\in\Xi_{\mathrm{sm}}\,.
\]
The theorem that follows concerns the equality $\lambda_*=J_*$. Recall the definition in (3.4).

Theorem 3.6. We have $\lambda_*=\rho_*=J_*$. In addition, $J^x_v=J_*$ if and only if $v$ is an a.e. measurable selector from the maximizer of (3.19).

Proof. We already have $\rho_*=\lambda_*$ from Theorem 3.4. This also gives $\rho_*\le J^x_{v_*}(c)\le J_*$. Choose $R>0$ such that $\rho_*>\sup_{B_R^c\times K}c$. This is possible by (3.4). Let $\delta>0$, and let $\chi$ be a smooth function that vanishes in $B_R$ and equals $1$ in $B_{R+1}^c$. Let $\Psi=\Phi_*+\varepsilon\chi$, and select $\varepsilon>0$ such that
\[
\varepsilon\,\bigl(\mathcal G\chi(x)-\rho_*\chi(x)\bigr)\,\le\,\delta\,\Phi_*(x)\qquad\forall\,x\in\bar B_{R+1}\,.
\]
This is clearly possible, since $\Phi_*$ is positive and
\[
\mathcal G\chi(x)-\rho_*\chi(x)\,=\,\max_{\xi\in K}\,\bigl(c(x,\xi)-\rho_*\bigr)\,\chi(x)\,\le\,0\qquad\forall\,x\in B_{R+1}^c\,.
\]
We have
\[
(3.30)\qquad
\mathcal G\Psi(x)-(\rho_*+\delta)\Psi(x)\,\le\,(\mathcal G-\rho_*)\Phi_*(x)+\varepsilon\,(\mathcal G-\rho_*)\chi(x)-\delta\,\Psi(x)\,\le\,0\qquad\forall\,x\in\mathbb{R}^d\,.
\]
Since $\Psi$ is bounded below away from zero, a standard use of Itô's formula and the Fatou lemma applied to (3.30) shows that $J^x_\xi\le\rho_*+\delta$ for all $\xi\in\Xi$. Since $\delta$ is arbitrary, this implies $\rho_*\ge J_*$, and hence we must have equality. This also shows that every a.e. measurable selector from the maximizer of (3.19) is optimal.
Next, for $v\in\Xi_{\mathrm{sm}}$, let $(\lambda_v,\Psi_v)$ be an eigenpair, obtained as a limit of Dirichlet eigenpairs $\{(\lambda_{v,n},\Psi_{v,n})\}_{n\in\mathbb{N}}$, with $\Psi_{v,n}(0)=1$, along some subsequence (see Lemma 3.5). Let $\nu\in[-\infty,\infty)$ be defined by
\[
\nu\,:=\,\lim_{r\to\infty}\,\sup_{(x,\xi)\in B_r^c\times K}\,c(x,\xi)\,.
\]
First suppose that $\lambda_v>\nu$. Then, using the argument in the preceding paragraph, together with the fact that $\lambda_v\le J^x_v$, we deduce that $\lambda_v=J^x_v$ for all $x\in\mathbb{R}^d$. Thus, if $v\in\Xi_{\mathrm{sm}}$ is optimal, we must have $\lambda_v=\rho_*$. This implies that we can select a ball $B$ such that
\[
\lambda_{v,n}-\sup_{(x,\xi)\in B^c\times K}\,c(x,\xi)\,>\,0\qquad\text{for all sufficiently large }n\,.
\]
Let $\breve\tau=\tau(B^c)$. By [3, Lemma 2.10 (i)], we have the stochastic representation
\[
\Psi_{v,n}(x)\,=\,E^x_v\Bigl[\mathrm{e}^{\int_0^{\breve\tau}[c_v(X_t)-\lambda_{v,n}]\,\mathrm{d}t}\,\Psi_{v,n}(X_{\breve\tau})\,\mathbf{1}_{\{\breve\tau<\tau_n\}}\Bigr]\qquad\forall\,x\in B_n\setminus\bar B\,.
\]
Next we show that $\Psi_v$ vanishes at infinity by using the argument in the proof of Theorem 3.4. The analysis is simpler here. Selecting the same function $\phi$ as in the proof of Theorem 3.4, there exists $R>0$ such that
\[
L_v\phi(x)+c_v(x)\,\phi(x)\,\le\,\lambda_v\,\phi(x)\qquad\forall\,x\in B_R^c\,.
\]
Since $\Psi_{v,n}(0)=1$, employing the Harnack inequality, we scale $\phi$ so that $\phi>\Psi_{v,n}$ on $B_R$ for all $n>R$. The strong maximum principle then shows that $\Psi_{v,n}<\phi$ on $\mathbb{R}^d$. Thus $\Psi_v^{-1}$ is inf-compact, which together with the Lyapunov equation $\widetilde L^{\psi_v}_v\Psi_v^{-1}=\bigl(c_v-\rho_*\bigr)\Psi_v^{-1}$ implies that the ground state process is exponentially ergodic. By Lemma 3.3, we then have
\[
(3.31)\qquad
\Psi_v(x)\,=\,E^x_v\Bigl[\mathrm{e}^{\int_0^{\breve\tau}[c_v(X_t)-\rho_*]\,\mathrm{d}t}\,\Psi_v(X_{\breve\tau})\,\mathbf{1}_{\{\breve\tau<\infty\}}\Bigr]\qquad\forall\,x\in\bar B^c\,.
\]
On the other hand, it holds that $L_v\Phi_*+c_v\Phi_*\le\rho_*\Phi_*$, which implies that
\[
(3.32)\qquad
\Phi_*(x)\,\ge\,E^x_v\Bigl[\mathrm{e}^{\int_0^{\breve\tau}[c_v(X_s)-\rho_*]\,\mathrm{d}s}\,\Phi_*(X_{\breve\tau})\,\mathbf{1}_{\{\breve\tau<\infty\}}\Bigr]\,.
\]
Comparing the functions in (3.31) and (3.32) using the strong maximum principle, as done in the proof of Lemma 3.3, we deduce that $\Psi_v=\Phi_*$. Thus $v$ is a measurable selector from the maximizer of (3.19).
It remains to address the case $\lambda_v\le\nu$. By [6, Corollary 3.2] there exists a positive constant $\delta$ such that $\lambda_v(c_v+\delta\mathbf{1}_B)>\nu$, and $\lambda_v(c_v+\delta\mathbf{1}_B)<\rho_*$. Thus, repeating the above argument, we obtain
\[
\rho_*\,>\,\lambda_v(c_v+\delta\mathbf{1}_B)\,=\,\liminf_{T\to\infty}\,\frac1T\,\log E^x_v\Bigl[\mathrm{e}^{\int_0^T[c_v(X_t)+\delta\mathbf{1}_B(X_t)]\,\mathrm{d}t}\Bigr]\,\ge\,J^x_v\qquad\forall\,x\in\mathbb{R}^d\,.
\]
Therefore, $v$ cannot be optimal. This completes the proof.

4. The variational formula on $\mathbb{R}^d$. In this section we establish the variational formula on $\mathbb{R}^d$. As mentioned in subsection 1.1, the function $H$ in (1.4) plays a very important role in the analysis.
To explain how this function arises, let $P^{x,t}_v$ denote the probability measure on the canonical path space $\{X_s : 0\le s\le t\}$ of the diffusion (3.1) under a control $v\in\Xi_{\mathrm{sm}}$, and $\widetilde P^{x,t}_v$ the analogous probability measure corresponding to the diffusion
\[
\mathrm{d}\widetilde X_t\,=\,\bigl(b_v(\widetilde X_t)+a(\widetilde X_t)\,\nabla\varphi_*(\widetilde X_t)\bigr)\,\mathrm{d}t+\sigma(\widetilde X_t)\,\mathrm{d}\widetilde W_t\,,
\]
with $\varphi_*$ as in Theorem 3.4. By the Cameron–Martin–Girsanov theorem we obtain
\[
\frac{\mathrm{d}P^{x,t}_v}{\mathrm{d}\widetilde P^{x,t}_v}\,=\,\exp\biggl(-\int_0^t\bigl\langle\nabla\varphi_*(\widetilde X_s),\sigma(\widetilde X_s)\,\mathrm{d}\widetilde W_s\bigr\rangle-\frac12\int_0^t\bigl|\sigma^{\mathsf T}(\widetilde X_s)\,\nabla\varphi_*(\widetilde X_s)\bigr|^2\,\mathrm{d}s\biggr)\,.
\]
The relative entropy, or Kullback–Leibler divergence, between $\widetilde P^{x,t}_v$ and $P^{x,t}_v$ then takes the form
\[
D_{\mathrm{KL}}\bigl(\widetilde P^{x,t}_v\,\big\|\,P^{x,t}_v\bigr)\,=\,-\int\log\biggl(\frac{\mathrm{d}P^{x,t}_v}{\mathrm{d}\widetilde P^{x,t}_v}\biggr)\,\mathrm{d}\widetilde P^{x,t}_v\,=\,\frac12\,\widetilde E^{x,t}_v\biggl[\int_0^t\bigl|\sigma^{\mathsf T}(\widetilde X_s)\,\nabla\varphi_*(\widetilde X_s)\bigr|^2\,\mathrm{d}s\biggr]\,.
\]
Dividing this by $t$, and letting $t\searrow 0$, we see that $H$ is the infinitesimal relative entropy rate.
Recall from subsection 1.1 the definition $\mathcal Z:=\mathbb{R}^d\times K\times\mathbb{R}^d$, and the use of the single variable $z=(x,\xi,y)\in\mathcal Z$ in the interest of notational simplicity. Also recall the definitions in (1.5) and (1.6), and those of $A$ and $L$ in (1.1) and (1.2). In analogy to (2.9), we define
\[
F(g,\mu)\,:=\,\int_{\mathcal Z}\bigl(Ag(z)+L(z)\bigr)\,\mu(\mathrm{d}z)\qquad\text{for }g\in C^2(\mathbb{R}^d)\text{ and }\mu\in\mathcal P(\mathcal Z)\,.
\]
The following result plays a central role in this paper.

Proposition 4.1. We have
\[
(4.1)\qquad
\rho_*\,=\,\max_{\mu\in\mathcal M_A\cap\mathcal P_*(\mathcal Z)}\,\int_{\mathcal Z}L(z)\,\mu(\mathrm{d}z)\,=\,\sup_{\mu\in\mathcal P_*(\mathcal Z)}\,\inf_{g\in C_c^2(\mathbb{R}^d)}\,F(g,\mu)\,.
\]
In addition, if $\mathcal M_A\cap\mathcal P_\circ(\mathcal Z)\subset\mathcal P_*(\mathcal Z)$, then $\mathcal P_*(\mathcal Z)$ may be replaced by $\mathcal P(\mathcal Z)$ in (4.1).

In the proof of Proposition 4.1 and elsewhere in the paper we use a cut-off function $\chi$ defined as follows (compare this with the function $\breve\chi$ in the proof of Theorem 3.4).
Definition 4.2. Let $\chi\colon\mathbb{R}\to\mathbb{R}$ be a smooth convex function such that $\chi(s)=s$ for $s\ge 0$, and $\chi(s)=-1$ for $s\le -2$. Then $\chi'$ and $\chi''$ are nonnegative, and the latter is supported on $(-2,0)$. It is clear that we can choose $\chi$ so that $\chi''<1$. We scale this function by defining $\chi_t(s):=-t+\chi(s+t)$ for $t\in\mathbb{R}$. Thus $\chi_t(s)=s$ for $s\ge -t$, and $\chi_t(s)=-t-1$ for $s\le -t-2$. Observe that if $-f$ is an inf-compact function, then $\chi_t(f)+t+1$ is compactly supported by the definition of $\chi$.

Proof of Proposition 4.1. We start with the first equality in (4.1). By (3.10), we have
\[
(4.2)\qquad
\widetilde L^{\varphi_*}_{v_*}\varphi_*(x)+c_{v_*}(x)-H(x)\,=\,\rho_*\,.
\]
As shown in Theorem 3.4, the twisted process $\widetilde X$ with extended generator $\widetilde L^{\varphi_*}_{v_*}$ is exponentially ergodic. Let $\eta_{v_*}$ denote its invariant probability measure. Since $|\varphi_*|\,\Phi_*$ vanishes at infinity, and $\Phi_*^{-1}$ is a Lyapunov function by (3.20), it then follows from (4.2), by using the Itô formula and applying [8, Lemma 3.7.2 (ii)], that
\[
(4.3)\qquad
\rho_*\,=\,\int_{\mathbb{R}^d}\bigl(c_{v_*}(x)-H(x)\bigr)\,\eta_{v_*}(\mathrm{d}x)\,=\,\int_{\mathbb{R}^d}L\bigl(x,v_*(x),\nabla\varphi_*(x)\bigr)\,\eta_{v_*}(\mathrm{d}x)\,.
\]
Next, we show that
\[
(4.4)\qquad
\rho_*\,\ge\,\int_{\mathcal Z}L(z)\,\mu(\mathrm{d}z)\qquad\forall\,\mu\in\mathcal M_A\cap\mathcal P_*(\mathcal Z)\,.
\]
We write (3.19) as
\[
\max_{\xi\in K}\,\Bigl[L^\xi\varphi_*(x)+\frac12\bigl|\sigma^{\mathsf T}(x)\,\nabla\varphi_*(x)\bigr|^2+c(x,\xi)\Bigr]\,=\,\rho_*\qquad\forall\,x\in\mathbb{R}^d\,,
\]
and use the identity
\[
L^\xi\varphi_*+\frac12\bigl|\sigma^{\mathsf T}\nabla\varphi_*\bigr|^2\,=\,L^\xi\varphi_*+\langle ay,\nabla\varphi_*\rangle+\frac12\bigl|\sigma^{\mathsf T}(y-\nabla\varphi_*)\bigr|^2-\frac12\bigl|\sigma^{\mathsf T}y\bigr|^2
\]
to obtain (compare with (3.18))
\[
(4.5)\qquad
A\varphi_*(x,\xi,y)+\frac12\bigl|\sigma^{\mathsf T}(x)\bigl(y-\nabla\varphi_*(x)\bigr)\bigr|^2+L(x,\xi,y)\,\le\,\rho_*\,.
\]
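The identity invoked above is simply a completion of squares: since $a=\sigma\sigma^{\mathsf T}$, we have $\langle ay,\nabla\varphi_*\rangle=\langle\sigma^{\mathsf T}y,\sigma^{\mathsf T}\nabla\varphi_*\rangle$, and therefore

```latex
\begin{aligned}
\langle ay,\nabla\varphi_*\rangle
+\tfrac12\bigl|\sigma^{\mathsf T}(y-\nabla\varphi_*)\bigr|^2
-\tfrac12\bigl|\sigma^{\mathsf T}y\bigr|^2
&= \langle\sigma^{\mathsf T}y,\sigma^{\mathsf T}\nabla\varphi_*\rangle
+\tfrac12\bigl|\sigma^{\mathsf T}y\bigr|^2
-\langle\sigma^{\mathsf T}y,\sigma^{\mathsf T}\nabla\varphi_*\rangle
+\tfrac12\bigl|\sigma^{\mathsf T}\nabla\varphi_*\bigr|^2
-\tfrac12\bigl|\sigma^{\mathsf T}y\bigr|^2\\[2pt]
&= \tfrac12\bigl|\sigma^{\mathsf T}\nabla\varphi_*\bigr|^2\,.
\end{aligned}
```

Adding $L^\xi\varphi_*$ to both sides recovers the identity, and (4.5) then follows since $A\varphi_*(x,\xi,y)$ augments $L^\xi\varphi_*(x)$ by the term $\langle a(x)y,\nabla\varphi_*(x)\rangle$, as in the derivation in the text.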
Using the function $\chi_t$ in Definition 4.2, the identity
\[
A\chi_t(\varphi_*)\,=\,\chi'_t(\varphi_*)\,A\varphi_*+\frac12\,\chi''_t(\varphi_*)\,\bigl|\sigma^{\mathsf T}\nabla\varphi_*\bigr|^2\,,
\]
and the definition of $H$, we obtain from (4.5) that
\[
(4.6)\qquad
A(\chi_t\circ\varphi_*)(x,\xi,y)-\chi''_t\bigl(\varphi_*(x)\bigr)\,H(x)
+\chi'_t\bigl(\varphi_*(x)\bigr)\Bigl(\frac12\bigl|\sigma^{\mathsf T}(x)\bigl(y-\nabla\varphi_*(x)\bigr)\bigr|^2+L(x,\xi,y)-\rho_*\Bigr)\,\le\,0\,.
\]
Let $\mu\in\mathcal M_A\cap\mathcal P_*(\mathcal Z)$, and without loss of generality assume that $\mu\in\mathcal P_\circ(\mathcal Z)$. The integral of the first term in (4.6) with respect to $\mu$ vanishes by the definition of $\mathcal M_A$. Thus, we have
\[
(4.7)\qquad
\int_{\mathcal Z}\chi'_t\bigl(\varphi_*(x)\bigr)\Bigl(\frac12\bigl|\sigma^{\mathsf T}(x)\bigl(y-\nabla\varphi_*(x)\bigr)\bigr|^2+L(x,\xi,y)-\rho_*\Bigr)\,\mu(\mathrm{d}x,\mathrm{d}\xi,\mathrm{d}y)
\,\le\,\int_{\mathbb{R}^d}\chi''_t\bigl(\varphi_*(x)\bigr)\,H(x)\,\eta(\mathrm{d}x)\,,
\]
with $\eta(\cdot)=\int_{K\times\mathbb{R}^d}\mu(\cdot,\mathrm{d}\xi,\mathrm{d}y)$. Since $\int H\,\mathrm{d}\eta<\infty$, then taking limits as $t\to\infty$ in (4.7), using dominated convergence together with the fact that $\chi''_t(s)\to 0$ as $t\to\infty$, we see that the right-hand side of (4.7) goes to $0$. Also, using Fatou's lemma and the fact that $\chi'_t(s)\to 1$ as $t\to\infty$, we obtain from (4.7) that
\[
(4.8)\qquad
\int_{\mathcal Z}\Bigl(\frac12\bigl|\sigma^{\mathsf T}(x)\bigl(y-\nabla\varphi_*(x)\bigr)\bigr|^2+L(x,\xi,y)\Bigr)\,\mu(\mathrm{d}x,\mathrm{d}\xi,\mathrm{d}y)\,\le\,\rho_*\,.
\]
This proves (4.4). Now, if we let
\[
\mu_*(\mathrm{d}x,\mathrm{d}\xi,\mathrm{d}y)\,:=\,\eta_{v_*}(\mathrm{d}x)\,\delta_{v_*(x)}(\mathrm{d}\xi)\,\delta_{\nabla\varphi_*(x)}(\mathrm{d}y)\,,
\]
then
\[
\int_{\mathcal Z}Af(z)\,\mu_*(\mathrm{d}z)\,=\,\int_{\mathbb{R}^d}\widetilde L^{\varphi_*}_{v_*}f(x)\,\eta_{v_*}(\mathrm{d}x)\,=\,0\qquad\forall\,f\in C_c^2(\mathbb{R}^d)\,,
\]
which implies that $\mu_*\in\mathcal M_A$. Then, the second equality in (4.3) can be written as
\[
(4.9)\qquad
\rho_*\,=\,\int_{\mathcal Z}L(z)\,\mu_*(\mathrm{d}z)\,,
\]
while the first equality in (4.3), together with the fact that $c$ is bounded above and $\rho_*$ is finite, implies that $\mu_*\in\mathcal P_*(\mathcal Z)$. Therefore, $\mu_*\in\mathcal M_A\cap\mathcal P_*(\mathcal Z)$, and the first equality in (4.1) now follows from (4.4) and (4.9).
We now turn to the proof of the second equality in (4.1). Note that if $\mu\notin\mathcal P_\circ(\mathcal Z)$ then $F(0,\mu)=-\infty$.
On the other hand, if $\mu\notin\mathcal M_A$ then, as also noted earlier, $\inf_{g\in C_c^2(\mathbb{R}^d)}F(g,\mu)=-\infty$. The remaining case is $\mu\in\mathcal M_A\cap\mathcal P_*(\mathcal Z)$, for which we have $F(g,\mu)=\int_{\mathcal Z}L(z)\,\mu(\mathrm{d}z)$, thus proving the equality.
The second statement of the proposition follows directly from the arguments used above.

Remark 4.3. The maximizer in (4.1) can be expressed as
\[
\mu(\mathrm{d}x,\mathrm{d}\xi,\mathrm{d}y)\,=\,\pi(\mathrm{d}x,\mathrm{d}\xi)\,\delta_{\nabla\varphi_*(x)}(\mathrm{d}y)\,,
\]
where $\delta_y$ denotes the Dirac mass at $y\in\mathbb{R}^d$, and $\pi(\mathrm{d}x,\mathrm{d}\xi)$ is an optimal ergodic occupation measure of the diffusion associated with the operator $A_*$ defined by
\[
A_*\phi(x,\xi)\,:=\,\frac12\operatorname{trace}\bigl(a(x)\,\nabla^2\phi(x)\bigr)+\bigl\langle b(x,\xi)+a(x)\,\nabla\varphi_*(x),\nabla\phi(x)\bigr\rangle
\]
for $(x,\xi)\in\mathbb{R}^d\times K$ and $\phi\in C^2(\mathbb{R}^d)$. We leave the verification of this assertion to the reader.
We continue our analysis by investigating conditions on the model parameters which imply that $\mathcal M_A\cap\mathcal P_\circ(\mathcal Z)\subset\mathcal P_*(\mathcal Z)$. We impose the following hypothesis on the matrix $a$.

Assumption 4.4. The matrix $a$ is bounded and has a uniform modulus of continuity on $\mathbb{R}^d$, and is uniformly non-degenerate in the sense that the minimum eigenvalue of $a$ is bounded away from zero on $\mathbb{R}^d$.

We start with the following lemma, which can be viewed as a generalization of [3, Lemma 3.3]. Assumption 3.1, which applies by default throughout the paper, need not be enforced in this lemma.

Lemma 4.5. Consider a linear operator in $\mathbb{R}^d$ of the form $L:=a^{ij}\partial_{ij}+b^i\partial_i+c$, and suppose that the matrix $a=\sigma\sigma^{\mathsf T}$ satisfies Assumption 4.4, and the coefficients $b$ and $c$ are locally bounded and measurable. Then there exists a constant $\widetilde C$ such that any strong positive solution $u\in W^{2,p}_{\mathrm{loc}}(\mathbb{R}^d)$, $p>d$, of the equation
\[
(4.10)\qquad
Lu(x)\,=\,0\quad\text{on }\mathbb{R}^d
\]
satisfies
\[
\frac{\bigl|\nabla u(x)\bigr|}{u(x)}\,\le\,\widetilde C\,\Bigl[1+\sup_{y\in B_2(x)}\Bigl(|b(y)|+\sqrt{|c(y)|}\Bigr)\Bigr]\qquad\forall\,x\in\mathbb{R}^d\,.
\]

Proof. We use scaling.
For any fixed x ∈ R d , with | x | ≥ 1, we define M x := 1 + sup x ∈ B ( x ) (cid:16) | b ( x ) | + p | c ( x ) | (cid:17) , and the scaled function ˜ u x ( y ) := u (cid:0) x + M − x y (cid:1) , y ∈ R d , A. ARAPOSTATHIS, A. BISWAS, V.S. BORKAR, AND K. SURESH KUMAR and similarly for the functions ˜ a x , ˜ b x , and ˜ c x . The equation in (4.10) then takesthe form(4.11) 12 ˜ a ijx ( y ) ∂ ij ˜ u x ( y ) + ˜ b ix ( y ) M x ∂ i ˜ u x ( y ) + ˜ c x ( y ) M x ˜ u x ( y ) = 0 on R d . It is clear from the hypotheses that the coefficients of (4.11) are bounded in theball B , with a bound independent of x , and that the modulus of continuity andellipticity constants of the matrix ˜ a x in B are independent of x . We follow theargument in [3, Lemma 3.3], which is repeated here for completeness. First, by theHarnack inequality [21, Theorem 9.1], there exists a positive constant C H independentof the point x chosen, such that ˜ u x ( y ) ≤ C H ˜ u x ( y ′ ) for all y, y ′ ∈ B . Let L := 12 ˜ a ijx ( y ) ∂ ij + ˜ b ix ( y ) M x ∂ i . By a well known a priori estimate [16, Lemma 5.3], there exists a constant C a , againindependent of x , such that,(4.12) (cid:13)(cid:13) ˜ u x (cid:13)(cid:13) W ,p ( B ) ≤ C a (cid:16)(cid:13)(cid:13) ˜ u x (cid:13)(cid:13) L p ( B ) + (cid:13)(cid:13) L ˜ u x (cid:13)(cid:13) L p ( B ) (cid:17) ≤ C a (cid:18) y ∈ B ˜ c x ( y ) M x (cid:19) (cid:13)(cid:13) ˜ u x (cid:13)(cid:13) L p ( B ) ≤ e C ˜ u x (0) , where in the last inequality, we used the Harnack property. Clearly then, the resultingconstant e C does not depend on x . Next, invoking Sobolev’s theorem, which assertsthe compactness of the embedding W ,p (cid:0) B ( x ) (cid:1) ֒ → C ,r (cid:0) B ( x ) (cid:1) , for p > d and r < − dp (see [16, Proposition 1.6]), and combining this with (4.12), we obtainsup y ∈ B (cid:12)(cid:12) ∇ ˜ u x ( y ) (cid:12)(cid:12) ≤ e C ˜ u x ( x )for some constant e C independent of x . Thus(4.13) |∇ ˜ u x (0) | ˜ u x (0) ≤ e C ∀ x ∈ B c . 
Using (4.13) and the identity ∇ u ( x ) = M x ∇ ˜ u x (0) for all x ∈ B c , we obtain (cid:12)(cid:12) ∇ u ( x ) (cid:12)(cid:12) u ( x ) = M x (cid:12)(cid:12) ∇ ˜ u x (0) (cid:12)(cid:12) ˜ u x (0) ≤ e C (cid:20) x ∈ B ( x ) (cid:16) | b ( x ) | + p | c ( x ) | (cid:17)(cid:21) ∀ x ∈ B c . Of course B ( x ) is arbitrary. The same is true with any radius, with perhaps adifferent constant. This completes the proof. Remark Assumption − c is inf-compact. VARIATIONAL FORMULA FOR RISK-SENSITIVE CONTROL b satisfies(4.14) max ( x,ξ ) ∈ B cr × K (cid:10) b ( x, ξ ) , x (cid:11) − | x | −−−→ r →∞ . (c) There exists a constant b C such that (compare this with [4, Theorem 3.1 (b)]) H ( x ) (cid:0) | ϕ ∗ ( x ) | (cid:1) (cid:0) | c ( x, ξ ) | (cid:1) ≤ b C ∀ ( x, ξ ) ∈ R d × K , where ϕ ∗ = log Φ ∗ , and Φ ∗ is as in Theorem 3.4. Remark | b | | c | is boundedimplies Assumption 4.7 (c). This is asserted by Lemma 4.5. See also Lemma 4.11 laterin this section.We have the following estimate concerning the growth of the function Φ ∗ in The-orem 3.4. This does not require the uniform ellipticity hypothesis in Assumption 4.4. Lemma Grant Assumption part (a) or (b) . Then there exists a function ζ : (0 , ∞ ) → (0 , ∞ ) , with lim r →∞ ζ ( r ) = ∞ , such that the solution Φ ∗ in (3.19) satisfies (4.15) (cid:12)(cid:12) log Φ ∗ ( x ) (cid:12)(cid:12) ≥ ζ ( r ) log (cid:0) | x | (cid:1) ∀ x ∈ B cr . Proof. We start with part (a). Let α : (0 , ∞ ) → (0 , ∞ ) be a strictly increasingfunction, satisfying α ( r ) → ∞ and α ( r ) r → r → ∞ , and(4.16) log α ( r ) ≥ log r − inf B cr | ϕ ∗ | / . This is always possible. A specific function satisfying these properties is given by α ( r ) := √ r + sup s ∈ (0 ,r ] (cid:18) s exp (cid:16) − inf B cr | ϕ ∗ | / (cid:17)(cid:19) . Let c be a constant such that (cid:12)(cid:12) L v ∗ (log | x | ) (cid:12)(cid:12) ≤ c for all | x | > 1. Such a constantexists since σ and b have at most linear growth in | x | by (3.3). 
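To see where such a constant comes from, a sketch (assuming, as seems intended, that the linear-growth condition (3.3) provides bounds of the form $\lVert a(x)\rVert\le C_1(1+|x|^2)$ and $|b(x,\xi)|\le C_1(1+|x|)$):

```latex
\nabla\log|x| \,=\, \frac{x}{|x|^2}\,,\qquad
\partial_{ij}\log|x| \,=\, \frac{\delta_{ij}}{|x|^2}-\frac{2\,x_i x_j}{|x|^4}\,,
\qquad\text{so}\qquad
L_{v_*}\bigl(\log|x|\bigr)
\,=\, \tfrac12\operatorname{trace}\bigl(a(x)\,\nabla^2\log|x|\bigr)
+\Bigl\langle b_{v_*}(x),\frac{x}{|x|^2}\Bigr\rangle\,.
```

For $|x|>1$ the trace term is bounded by a constant multiple of $\lVert a(x)\rVert/|x|^2\le 2C_1$, and the drift term by $C_1(1+|x|)/|x|\le 2C_1$, so $|L_{v_*}(\log|x|)|$ is indeed bounded by a suitable constant on $\{|x|>1\}$.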
We define(4.17) κ ( r ) := min (cid:18) √ r , c inf B cr × K (cid:12)(cid:12) c ( x, ξ ) − ρ ∗ (cid:12)(cid:12) / , inf B cr | ϕ ∗ | / (cid:19) . Since the functions − ϕ ∗ and − c are inf-compact, it is clear that κ ( r ) → ∞ as r → ∞ .Define the family of functions h r ( x ) := − κ ( r ) (cid:0) log | x | − log α ( r ) (cid:1) , r ≥ , x ∈ B cr . Note that for any g ∈ C ( R d ) we have(4.18) L ξ χ t ( g ) = χ ′ t ( g ) L ξ ( g ) + 12 χ ′′ t ( g ) (cid:12)(cid:12) σ T ∇ g (cid:12)(cid:12) . Thus, applying (4.18) and the bound (cid:12)(cid:12) L v ∗ (log | x | ) (cid:12)(cid:12) ≤ c , we obtain(4.19) e L ϕ ∗ v ∗ χ t (cid:0) h r ( x ) (cid:1) ≤ c κ ( r ) χ ′ t (cid:0) h r ( x ) (cid:1) + (cid:10) a ( x ) ∇ ϕ ∗ ( x ) , ∇ χ t (cid:0) h r ( x ) (cid:1)(cid:11) + 12 χ ′′ t (cid:0) h r ( x ) (cid:1) (cid:12)(cid:12) σ T ( x ) ∇ h r ( x ) (cid:12)(cid:12) ∀ x ∈ B cr . A. ARAPOSTATHIS, A. BISWAS, V.S. BORKAR, AND K. SURESH KUMAR Combining (4.2) and (4.19), and completing the squares, we have(4.20) e L ϕ ∗ v ∗ (cid:0) χ t ◦ h r − ϕ ∗ (cid:1) ( x ) ≤ c v ( x ) − ρ ∗ + c κ ( r ) χ ′ t (cid:0) h r ( x ) (cid:1) + 12 χ ′′ t (cid:0) h r ( x ) (cid:1) (cid:12)(cid:12) σ T ( x ) ∇ h r ( x ) (cid:12)(cid:12) + 12 (cid:12)(cid:12) σ T ( x ) ∇ χ t (cid:0) h r ( x ) (cid:1)(cid:12)(cid:12) − (cid:12)(cid:12) σ T ( x ) (cid:2) ∇ ϕ ∗ ( x ) − ∇ χ t (cid:0) h r ( x ) (cid:1)(cid:3)(cid:12)(cid:12) . Recall that χ ′ ≤ 1, and χ ′′ ≤ 1. Choose r large enough so that ϕ ∗ < − B cr . Itthen follows by the definitions in (4.16) and (4.17) that ϕ ∗ − χ t ◦ h r < ∂B r for all t ≥ 0. Also, for each t > 0, the difference ϕ ∗ − χ t ◦ h r is negative outside some compactset by the inf-compactness of − ϕ ∗ . Note also that |∇ h r | ≤ κ ( r ) r on B cr . Hence (3.3)and (4.17) imply that there exists r such the right-hand side of (4.20) is negative on B cr for all r > r and all t ≥ 0. 
An application of the strong maximum principle thenshows that ϕ ∗ < h r on B cr for all r > r .Now, note thatlog | x | α ( r ) ≥ 12 log (cid:0) | x | (cid:1) when | x | ≥ max (cid:0) , (cid:0) α ( r ) (cid:1) (cid:1) . Since α ( r ) is strictly increasing, the inequality (4.15) holds with ζ ( r ) := 12 κ (cid:16) α − (cid:0)p r / (cid:1)(cid:17) for all r ≥ (cid:0) α ( r ) (cid:1) . This completes the proof under Assumption 4.7 (a) .The proof under part (b) of the assumption is similar. The only difference is thathere we use the fact that m r := sup x ∈ B cr (cid:0) L v ∗ (log | x | ) (cid:1) − → t → ∞ , which isimplied by (4.14). Thus with ǫ > ρ ∗ − c > ǫ outside somecompact set, we choose κ ( r ) as κ ( r ) := min (cid:18) √ r , sup B cr × K ǫ √ m r , inf B cr | ϕ ∗ | / (cid:19) . The rest is completely analogous to the analysis above. This concludes the proof.The first part of the theorem which follows is quite technical, but identifiesa rather deep property of the ergodic occupation measures of the operator A . Itshows that under Assumptions 4.4 and 4.7 (a) or (b), or Assumption 4.7 (c), if sucha measure µ is feasible for the maximization problem, or in other words, it sat-isfies R Z L ( z ) µ (d z ) > −∞ , then it necessarily has “finite average” entropy, that is R H d µ < ∞ , or equivalently, it belongs in the class P ∗ ( Z ). The proof uses the methodof contradiction. We first show that if such a measure µ is not in the class P ∗ ( Z ),then the left hand side of (4.7) grows at a geometric rate as a function of t . Then weobtain a contradiction by evaluating the right-hand side of (4.7) using this geometricgrowth together with the bound in Lemma 4.9. Theorem i ) Under Assumptions and or (b) , or Assump-tion , we have M A ∩ P ◦ ( Z ) ⊂ P ∗ ( Z ) . This of course implies by Propo-sition that ρ ∗ = max µ ∈M A Z Z L ( z ) µ (d z ) = sup µ ∈P ( Z ) inf g ∈C c ( R d ) F ( g, µ ) . 
(ii) Let Assumption 4.4 hold, and suppose that
\[
(4.21)\qquad
\sup_{x\in\mathbb{R}^d}\,\frac{H(x)}{1+|\varphi_*(x)|}\,<\,\infty\,.
\]
Then
\[
(4.22)\qquad
\rho_*\,=\,\inf_{g\in C_c^2(\mathbb{R}^d)}\,\sup_{\mu\in\mathcal P(\mathcal Z)}\,F(g,\mu)\,.
\]

Proof. We first prove part (i) under Assumption 4.7 (a) or (b). We argue by contradiction. Let $\mu\in\mathcal M_A\cap\mathcal P_\circ(\mathcal Z)$, and suppose that $\mu\notin\mathcal P_*(\mathcal Z)$. As in the proof of Proposition 4.1, we let $\eta(\cdot)=\int_{K\times\mathbb{R}^d}\mu(\cdot,\mathrm{d}\xi,\mathrm{d}y)$. Let $I_1(t)$ and $I_2(t)$ denote the left- and the right-hand side of (4.7), respectively, and define
\[
I_3(t)\,:=\,\int_{\mathbb{R}^d}\chi'_t\bigl(\varphi_*(x)\bigr)\,H(x)\,\eta(\mathrm{d}x)\,.
\]
Then of course $I_3(t)\to\infty$ as $t\to\infty$ by the hypothesis. Expanding $I_1(t)$ we see that
\[
I_1(t)\,=\,I_3(t)-\int_{\mathcal Z}\chi'_t\bigl(\varphi_*(x)\bigr)\,\bigl\langle a(x)y,\nabla\varphi_*(x)\bigr\rangle\,\mathrm{d}\mu+\int_{\mathcal Z}\chi'_t\bigl(\varphi_*(x)\bigr)\,(c-\rho_*)\,\mathrm{d}\mu\,.
\]
Since $\int L\,\mathrm{d}\mu$ is finite, it follows that $\int_{\mathcal Z}|\sigma^{\mathsf T}y|^2\,\mathrm{d}\mu$ and $\int_{\mathcal Z}\max\{-c,0\}\,\mathrm{d}\mu$ are also finite. Moreover, the second assertion and the fact that $c$ is bounded above imply that $\int_{\mathcal Z}|c|\,\mathrm{d}\mu<\infty$. Thus, using the Cauchy–Schwarz inequality in the above display and the fact that $|\chi'_t|$ is bounded, we have
\[
(4.23)\qquad
\alpha_1(t)-\alpha_2(t)\,\sqrt{I_3(t)}+I_3(t)\,\le\,I_1(t)\,\le\,\alpha_1(t)+\alpha_2(t)\,\sqrt{I_3(t)}+I_3(t)
\]
for some constants $\alpha_1(t)$ and $\alpha_2(t)$ which are bounded in $t\in[0,\infty)$.
First suppose that over some sequence $t_n\to\infty$ we have $\frac{I_2(t_n)}{I_3(t_n)}\to\delta<1$ as $n\to\infty$. This implies by (4.23) that $\frac{I_2(t_n)}{I_1(t_n)}\to\delta$ as well. However, if this is the case, then the inequality
\[
\alpha_1(t_n)-\alpha_2(t_n)\,\sqrt{I_3(t_n)}+\Bigl(1-\frac{I_2(t_n)}{I_3(t_n)}\Bigr)\,I_3(t_n)\,\le\,0\,,
\]
which is implied by (4.7) and (4.23), contradicts the fact that $I_3(t)\to\infty$ as $t\to\infty$. Thus we must have $\liminf_{t\to\infty}\frac{I_2(t)}{I_3(t)}\ge 1$, and the same applies to the fraction $\frac{I_2(t)}{I_1(t)}$.
Define
\[
g_k\,:=\,\int_{\mathbb{R}^d}H(x)\,\mathbf{1}_{\{-2k<\varphi_*(x)<-2k+2\}}\,\eta(\mathrm{d}x)\,,\qquad k\in\mathbb{N}\,.
\]
We have $I_3(2n)\ge\sum_{k=1}^n g_k$ for $n\in\mathbb{N}$, by the definition of these quantities. Recall that $I_2(t)$ is defined as the right-hand side of (4.7).
Note then that, since $\chi''<1$, we have $I_2(2n)<\bar\delta\,g_{n+1}$ for some $\bar\delta<1$. Therefore, since $\liminf_{t\to\infty}\frac{I_2(t)}{I_3(t)}\ge 1$, there exists $n_0\in\mathbb{N}$ such that
\[
(4.24)\qquad
S_n\,:=\,\sum_{k=1}^n g_k\,\le\,g_{n+1}\qquad\forall\,n\ge n_0\,.
\]
Thus $S_{n+1}-S_n=g_{n+1}\ge S_n$, which implies that $S_{n+1}\ge 2S_n$. This of course means that $S_n$ diverges at a geometric rate in $n$, that is, $S_n\ge 2^{\,n-n_0}\,S_{n_0}$. Let $h$ denote the inverse of the map $y\mapsto\zeta(y)\log(1+y)$. Note that $H(x)\le C_0(1+|x|^p)$ for some positive constants $C_0$ and $p$ by Lemma 4.5 and the hypothesis that $c$ has polynomial growth in Assumption 3.1 (ii). Thus, by Lemma 4.9, we obtain
\[
g_n\,\le\,C_0\int_{\mathbb{R}^d}(1+|x|^p)\,\mathbf{1}_{\{-2n<\varphi_*(x)<-2n+2\}}\,\eta(\mathrm{d}x)
\,\le\,C_0\int_{\mathbb{R}^d}(1+|x|^p)\,\mathbf{1}_{\{\zeta(|x|)\log(1+|x|)<2n\}}\,\eta(\mathrm{d}x)
\,\le\,C_0\bigl(1+h(2n)^p\bigr)
\]
for all $n\in\mathbb{N}$. However, this implies from (4.24) that
\[
\log 2\,\le\,\limsup_{n\to\infty}\frac{\log S_n}{n}\,\le\,C'\,\limsup_{n\to\infty}\frac{\log h(2n)}{n}\,=\,C'\,\limsup_{k\to\infty}\frac{\log k}{\zeta(k)\log(1+k)}\,=\,0
\]
for some constant $C'$, and we reach a contradiction. Therefore, $\mathcal M_A\cap\mathcal P_\circ(\mathcal Z)\subset\mathcal P_*(\mathcal Z)$.
Moving on to the proof under Assumption 4.7 (c), we replace the function $\chi_t$ in Definition 4.2 by a function $\widetilde\chi_t$ defined as follows. For $t>0$, we let $\widetilde\chi_t$ be a convex $C^2(\mathbb{R})$ function such that $\widetilde\chi_t(s)=s$ for $s\ge -t$, and $\widetilde\chi_t(s)$ is constant for $s\le -t\mathrm{e}^2$. Then $\widetilde\chi'_t$ and $\widetilde\chi''_t$ are nonnegative. In addition, we select $\widetilde\chi_t$ so that $\widetilde\chi''_t(s)\le-\frac1s$ for $s\in[-t\mathrm{e}^2,-t]$ and $t>0$. This is always possible.
We follow the same analysis as inthe proof of Proposition 4.1, with the function ˜ χ t as chosen, and obtain(4.25) Z Z ˜ χ ′ t (cid:0) ϕ ∗ ( x ) (cid:1)(cid:16) (cid:12)(cid:12) σ T ( x ) (cid:0) y − ∇ ϕ ∗ ( x ) (cid:1)(cid:12)(cid:12) + L ( x, ξ, y ) − ρ ∗ (cid:17) µ (d x, d ξ, d y ) ≤ Z R d ˜ χ ′′ t (cid:0) ϕ ∗ ( x ) (cid:1) H ( x ) η (d x ) ≤ Z R d H ( x ) | ϕ ∗ ( x ) | A t ( x ) η (d x ) ≤ b C Z R d × K × R d | ϕ ∗ ( x ) || ϕ ∗ ( x ) | (cid:0) | c ( x, ξ ) | (cid:1) A t ( x ) µ (d x, d ξ, d y ) , where A t := { x : ϕ ∗ ( x ) ≤ − t } . The integral on the right-hand side of (4.25) vanishesas t → ∞ by the hypothesis that R c d µ > −∞ , so again we obtain (4.8) which impliesthe result. This completes the proof of part (i).We continue with part (ii). We use a C convex function ˆ χ t : R → R , for t ≥ χ t ( s ) = s for s ≤ − t , ˆ χ ′′ t ( s ) ≤ − s log | s | for s < − t , and ˆ χ t ( s ) = constant for s ≥ ˆ ζ ( t ), for some ˆ ζ ( t ) < − t . We let h t ( x ) = ˆ χ t (cid:0) ϕ ∗ ( x ) (cid:1) . We may translate ϕ ∗ so thatit is smaller than − R d . By (4.6), we have(4.26) A h t ( z ) + L ( z ) − ρ ∗ ≤ (cid:2) − ˆ χ ′ t (cid:0) ϕ ∗ ( x ) (cid:1)(cid:3)(cid:0) L ( z ) − ρ ∗ (cid:1) − ˆ χ ′ t (cid:0) ϕ ∗ ( x ) (cid:1)(cid:12)(cid:12) σ T ( x ) (cid:0) y − ∇ ϕ ∗ ( x ) (cid:1)(cid:12)(cid:12) + ˆ χ ′′ t (cid:0) ϕ ∗ ( x ) (cid:1) H ( x ) . We claim that given any ǫ > t > F ( h t , µ ) ≤ ρ ∗ + ǫ for all µ ∈ P ( Z ). This of course suffices to establish (4.22).By Assumption 3.1 (iii) there exists t > t ≥ t . Also, using the definition of ˆ χ , we VARIATIONAL FORMULA FOR RISK-SENSITIVE CONTROL χ ′′ t (cid:0) ϕ ∗ ( x ) (cid:1) H ( x ) ≤ H ( x ) | ϕ ∗ ( x ) | log | ϕ ∗ ( x ) | { x ∈ R d : ϕ ∗ ( x ) ≤ − t } −−−→ t →∞ − ϕ ∗ is inf-compact by Theorem 3.4. This proves theclaim, and completes the proof.There is a large class of problems which satisfy (4.21). 
It consists of equationswith | b | + | c | having at most linear growth in | x | and | x | − h b, x i − growing no fasterthan | c | . This fact is stated in the following lemma. Lemma Grant Assumption and suppose that sup ( x,ξ ) ∈ R d × K max (cid:18) h b ( x, ξ ) , x i − | x || c ( x, ξ ) | , | b ( x, ξ ) | + | c ( x, ξ ) | | x | (cid:19) < ∞ . Then (4.21) holds.Proof. We use the function χ t in Definition 4.2. Let ˜ r > ρ ∗ − c ( x, ξ ) > δ > B c ˜ r × K . Note that there exists a constant C such that e L ϕ ∗ v ∗ χ t (cid:0) ǫ (˜ r − | x | ) (cid:1) ≤ Cǫ (cid:0) | x | − h b v ∗ ( x ) , x i − + |∇ ϕ ∗ ( x ) | (cid:1) ∀ t > . Thus for some ǫ > e L ϕ ∗ v ∗ (cid:0) ϕ ∗ ( x ) − χ t (cid:0) ǫ (˜ r − | x | ) (cid:1)(cid:1) > ∀ x ∈ B c ˜ r , ∀ t > . An application of the strong maximum principle then shows that ϕ ∗ ( x ) ≤ ǫ (˜ r − | x | ) − .Therefore, using Lemma 4.5, we obtain (cid:12)(cid:12) ∇ ϕ ∗ ( x ) (cid:12)(cid:12) ≤ C ′ (1 + | x | ) ≤ C ′ (cid:0) r − ǫ − ϕ ∗ ( x ) (cid:1) ∀ x ∈ B c ˜ r , for some constant C ′ .We next present the variational formula over functions in C ( R d ) whose derivativesup to second order have at most polynomial growth in | x | . Let C pol ( R d ) denote thisspace of functions. Theorem Under Assumption alone, we have (4.27) ρ ∗ = inf g ∈C ( R d ) sup µ ∈P ( Z ) F ( g, µ ) . Under Assumptions and or (b) , we have (4.28) ρ ∗ = inf g ∈C pol ( R d ) sup µ ∈P ( Z ) F ( g, µ ) = sup µ ∈P ( Z ) inf g ∈C pol ( R d ) F ( g, µ ) . Proof. By (3.18) and (3.19) we havemax ξ ∈ K max y ∈ R d (cid:2) A ϕ ∗ ( x, ξ, y ) + L ( x, ξ, y ) (cid:3) = ρ ∗ . Since ϕ ∗ ∈ C ( R d ), this implies thatinf g ∈C ( R d ) sup µ ∈P ( Z ) F ( g, µ ) ≤ ρ ∗ . A. ARAPOSTATHIS, A. BISWAS, V.S. BORKAR, AND K. SURESH KUMAR On the other hand, by Theorem 3.4 (d), it follows that for any g ∈ C ( R d ) we havesup z ∈Z (cid:2) A g ( z ) + L ( z ) (cid:3) ≥ ρ ∗ , which then implies the converse inequalityinf g ∈C ( R d ) sup µ ∈P ( Z ) F ( g, µ ) ≥ ρ ∗ . 
This proves (4.27).

Concerning (4.28), the first equality follows as in the preceding paragraph, since $\varphi^* \in C^2_{\mathrm{pol}}(\mathbb R^d)$ by Assumptions 3.1 (i)–(ii) and 4.4, and Lemma 4.5. Turning now our attention to the second equality in (4.28), recall from the proof of Proposition 4.1 that $\eta_{v^*}$ denotes the invariant probability measure of $\widetilde{\mathcal L}^{\varphi^*}_{v^*}$. Under Assumption 4.7 (a) or (b), Lemma 4.9 shows that $(\Phi^*)^{-1}(x)$ grows faster in $|x|$ than any polynomial. Therefore, $\int_{\mathbb R^d}|x|^n\,\eta_{v^*}(\mathrm{d}x) < \infty$ for all $n\in\mathbb N$ by (3.20). Since $|\nabla\varphi^*(x)|$ has at most polynomial growth, and $b$ has at most linear growth, we obtain
$$\int_{\mathbb R^d}\bigl|\widetilde{\mathcal L}^{\varphi^*}_{v^*}f(x)\bigr|\,\eta_{v^*}(\mathrm{d}x) < \infty \qquad \forall\, f\in C^2_{\mathrm{pol}}(\mathbb R^d).\tag{4.29}$$
Continuing, if (4.29) holds, then it is standard to show, by employing a cut-off function, that
$$\int_{\mathbb R^d}\widetilde{\mathcal L}^{\varphi^*}_{v^*}f(x)\,\eta_{v^*}(\mathrm{d}x) = 0 \qquad \forall\, f\in C^2_{\mathrm{pol}}(\mathbb R^d).\tag{4.30}$$
Let $\mu^* \in \mathcal M_{\mathcal A}$ denote the ergodic occupation measure corresponding to $\eta_{v^*}$, that is,
$$\mu^*(\mathrm{d}x,\mathrm{d}\xi,\mathrm{d}y) = \eta_{v^*}(\mathrm{d}x)\,\delta_{v^*(x)}(\mathrm{d}\xi)\,\delta_{\nabla\varphi^*(x)}(\mathrm{d}y).$$
Equation (4.30) implies that
$$F(g,\mu^*) = \int_{\mathcal Z} L(z)\,\mu^*(\mathrm{d}z) = \rho^* \qquad \forall\, g\in C^2_{\mathrm{pol}}(\mathbb R^d).\tag{4.31}$$
Since
$$\sup_{\mu\in\mathcal P(\mathcal Z)}\ \inf_{g\in C^2_{\mathrm{pol}}(\mathbb R^d)} F(g,\mu) \le \inf_{g\in C^2_{\mathrm{pol}}(\mathbb R^d)}\ \sup_{\mu\in\mathcal P(\mathcal Z)} F(g,\mu),$$
the second equality in (4.28) then follows by (4.27) and (4.31).

5. The risk-sensitive cost minimization problem. Using Lemma 4.5, we can improve the main result in [3], which assumes bounded drift and running cost.

We say that a function $f \colon \mathcal X \to \mathbb R$ defined on a locally compact space $\mathcal X$ is coercive, or near-monotone, relative to a constant $\beta \in \mathbb R$ if there exists a compact set $\mathcal K$ such that $\inf_{\mathcal K^c} f > \beta$. Recall that an admissible control $\xi$ for (3.1) is a process $\xi_t(\omega)$ which takes values in $K$, is jointly measurable in $(t,\omega) \in [0,\infty)\times\Omega$, and is non-anticipative, that is, for $s < t$, $W_t - W_s$ is independent of $\mathcal F_s$ given in (2.2).
We let $\Xi$ denote the class of admissible controls, and $\mathbb E^\xi_x$ the expectation operator on the canonical space of the process under the control $\xi \in \Xi$, conditioned on the process $X$ starting from $x \in \mathbb R^d$ at $t = 0$.

Let $c \colon \mathbb R^d\times K \to \mathbb R$ be continuous, and Lipschitz continuous in its first argument uniformly with respect to the second. We define the risk-sensitive penalty by
$$\mathcal E^\xi_x = \mathcal E^\xi_x(c) := \limsup_{T\to\infty}\,\frac1T\,\log\mathbb E^\xi_x\Bigl[\mathrm e^{\int_0^T c(X_t,\xi_t)\,\mathrm{d}t}\Bigr], \qquad \xi\in\Xi,$$
and the risk-sensitive optimal values by
$$\mathcal E^*_x := \inf_{\xi\in\Xi}\,\mathcal E^\xi_x, \qquad\text{and}\qquad \mathcal E^* := \inf_{x\in\mathbb R^d}\,\mathcal E^*_x.$$
Let
$$\widehat{\mathcal G}f(x) := \tfrac12\operatorname{trace}\bigl(a(x)\nabla^2 f(x)\bigr) + \min_{\xi\in K}\bigl[\bigl\langle b(x,\xi),\nabla f(x)\bigr\rangle + c(x,\xi)f(x)\bigr], \qquad f\in C^2(\mathbb R^d),$$
and
$$\widehat\lambda^* = \widehat\lambda^*(c) := \inf\bigl\{\lambda\in\mathbb R : \exists\,\varphi\in W^{2,d}_{\mathrm{loc}}(\mathbb R^d),\ \varphi>0,\ \widehat{\mathcal G}\varphi - \lambda\varphi \le 0\ \text{a.e. in }\mathbb R^d\bigr\}.$$
We say that $\widehat\lambda^*$ is strictly monotone at $c$ on the right if $\widehat\lambda^*(c+h) > \widehat\lambda^*(c)$ for all non-trivial nonnegative functions $h$ with compact support.

Proposition 5.2 below improves [3, Proposition 1.1]. We first state the assumptions.

Assumption 5.1. (i) The drift $b$ and running cost $c$ satisfy, for some $\theta\in[0,1)$ and a constant $\kappa$, the bounds $|b(x,\xi)| \le \kappa\bigl(1+|x|^\theta\bigr)$ and $|c(x,\xi)| \le \kappa\bigl(1+|x|^\theta\bigr)$ for all $(x,\xi)\in\mathbb R^d\times K$.
(ii) The drift $b$ satisfies
$$\frac{1}{|x|^{2-\theta}}\,\max_{\xi\in K}\,\bigl\langle b(x,\xi),x\bigr\rangle^+ \xrightarrow[|x|\to\infty]{} 0.\tag{5.1}$$

Proposition 5.2. Grant Assumption 5.1, and suppose that $c$ is coercive relative to $\mathcal E^*$. Then the HJB equation
$$\min_{\xi\in K}\bigl[\mathcal L^\xi V^*(x) + c(x,\xi)V^*(x)\bigr] = \mathcal E^*\,V^*(x) \qquad \forall\, x\in\mathbb R^d\tag{5.2}$$
has a solution $V^*\in C^2(\mathbb R^d)$, satisfying $\inf_{\mathbb R^d}V^* > 0$, and the following hold:
(a) $\mathcal E^*_x = \mathcal E^* = \widehat\lambda^*$ for all $x\in\mathbb R^d$.
(b) Any $v\in\Xi_{\mathrm{sm}}$ that satisfies
$$\mathcal L^{v}V^*(x) + c\bigl(x,v(x)\bigr)V^*(x) = \min_{\xi\in K}\bigl[\mathcal L^\xi V^*(x) + c(x,\xi)V^*(x)\bigr] \quad \text{a.e. } x\in\mathbb R^d,\tag{5.3}$$
is stable, and is optimal, that is, $\mathcal E^v_x = \mathcal E^*$ for all $x\in\mathbb R^d$.
(c) It holds that
$$V^*(x) = \mathbb E^v_x\Bigl[\mathrm e^{\int_0^T [c(X_t,v(X_t)) - \mathcal E^*]\,\mathrm{d}t}\,V^*(X_T)\Bigr] \qquad \forall\,(T,x)\in\mathbb R_+\times\mathbb R^d,$$
for any $v\in\Xi_{\mathrm{sm}}$ that satisfies (5.3).
(d) If $\widehat\lambda^*$ is strictly monotone at $c$ on the right, then there exists a unique positive solution to (5.2), up to a multiplicative constant, and any optimal $v\in\Xi_{\mathrm{sm}}$ satisfies (5.3).

Proof. A modification of [3, Lemma 3.2] (e.g., applying Itô's formula to the function $f(x) = |x|^\theta$) shows that (5.1) implies that
$$\limsup_{t\to\infty}\,\frac1t\,\mathbb E^\xi_x\bigl[|X_t|^\theta\bigr] = 0 \qquad \forall\,\xi\in\Xi.$$
From this point on, the proof follows as in [3], using Lemma 4.5. Indeed, parts (a) and (b) follow from [3, Theorem 3.4] by using the above estimate and Lemma 4.5. Since $\inf_{\mathbb R^d}V^* > 0$, any minimizing selector is recurrent. Moreover, the twisted diffusion corresponding to the minimizing selector is regular. Thus part (c) follows from [3, Theorem 1.5]. In addition, the hypothesis in (d) implies that for any minimizing selector $v$, $\lambda_v = \widehat\lambda^*$ is right monotone at $c$, which, in turn, implies the simplicity of the principal eigenvalue by [3, Theorem 1.2]. This also implies the last claim by [3, Lemma 3.6].

Acknowledgements. The work of Ari Arapostathis was supported in part by the National Science Foundation through grant DMS-1715210, in part by the Army Research Office through grant W911NF-17-1-001, and in part by the Office of Naval Research through grant N00014-16-1-2956, which was approved for public release under DCN.

REFERENCES

[1] M. Akian, S. Gaubert, and R. Nussbaum, A Collatz–Wielandt characterization of the spectral radius of order-preserving homogeneous maps on cones, arXiv e-prints, 1112.5968 (2011), https://arxiv.org/abs/1112.5968.
[2] V. Anantharam and V. S. Borkar, A variational formula for risk-sensitive reward, SIAM J. Control Optim., 55 (2017), pp. 961–988, https://doi.org/10.1137/151002630.
[3] A. Arapostathis and A. Biswas, Infinite horizon risk-sensitive control of diffusions without any blanket stability assumptions, Stochastic Process. Appl., 128 (2018), pp. 1485–1524, https://doi.org/10.1016/j.spa.2017.08.001.
[4] A. Arapostathis and A. Biswas, A variational formula for risk-sensitive control of diffusions in R^d, SIAM J. Control Optim., 58 (2020), pp. 85–103, https://doi.org/10.1137/18M1218704.
[5] A. Arapostathis, A. Biswas, and V. S. Borkar, Controlled equilibrium selection in stochastically perturbed dynamics, Ann. Probab., 46 (2018), pp. 2749–2799, https://doi.org/10.1214/17-AOP1238.
[6] A. Arapostathis, A. Biswas, and D. Ganguly, Certain Liouville properties of eigenfunctions of elliptic operators, Trans. Amer. Math. Soc., 371 (2019), pp. 4377–4409, https://doi.org/10.1090/tran/7694.
[7] A. Arapostathis, A. Biswas, and S. Saha, Strict monotonicity of principal eigenvalues of elliptic operators in R^d and risk-sensitive control, J. Math. Pures Appl. (9), 124 (2019), pp. 169–219, https://doi.org/10.1016/j.matpur.2018.05.008.
[8] A. Arapostathis, V. S. Borkar, and M. K. Ghosh, Ergodic control of diffusion processes, vol. 143 of Encyclopedia of Mathematics and its Applications, Cambridge University Press, Cambridge, 2012, https://doi.org/10.1017/CBO9781139003605.
[9] A. Arapostathis, V. S. Borkar, and K. S. Kumar, Risk-sensitive control and an abstract Collatz–Wielandt formula, J. Theoret. Probab., 29 (2016), pp. 1458–1484, https://doi.org/10.1007/s10959-015-0616-x.
[10] S. N. Armstrong, The Dirichlet problem for the Bellman equation at resonance, J. Differential Equations, 247 (2009), pp. 931–955, https://doi.org/10.1016/j.jde.2009.03.007.
[11] H. Berestycki, L. Nirenberg, and S. R. S. Varadhan, The principal eigenvalue and maximum principle for second-order elliptic operators in general domains, Comm. Pure Appl. Math., 47 (1994), pp. 47–92, https://doi.org/10.1002/cpa.3160470105.
[12] H. Berestycki and L. Rossi, Generalizations and properties of the principal eigenvalue of elliptic operators in unbounded domains, Comm. Pure Appl. Math., 68 (2015), pp. 1014–1065.
[13] A. Biswas, An eigenvalue approach to the risk sensitive control problem in near monotone case, Systems Control Lett., 60 (2011), pp. 181–184, https://doi.org/10.1016/j.sysconle.2010.12.002.
[14] A. Biswas and S. Saha, Zero-sum stochastic differential games with risk-sensitive cost, Appl. Math. Optim., 81 (2020), pp. 113–140, https://doi.org/10.1007/s00245-018-9479-8.
[15] E. Chasseigne and N. Ichihara, Ergodic problems for viscous Hamilton–Jacobi equations with inward drift, SIAM J. Control Optim., 57 (2019), pp. 23–52, https://doi.org/10.1137/18M1179328.
[16] Y.-Z. Chen and L.-C. Wu, Second order elliptic equations and elliptic systems, vol. 174 of Translations of Mathematical Monographs, American Mathematical Society, Providence, RI, 1998. Translated from the 1991 Chinese original by Bei Hu.
[17] A. Dembo and O. Zeitouni, Large deviations: techniques and applications, vol. 38 of Applications of Mathematics, Springer-Verlag, New York, second ed., 1998, https://doi.org/10.1007/978-1-4612-5320-4.
[18] M. D. Donsker and S. R. S. Varadhan, On a variational formula for the principal eigenvalue for operators with maximum principle, Proc. Nat. Acad. Sci. U.S.A., 72 (1975), pp. 780–783, https://doi.org/10.1073/pnas.72.3.780.
[19] M. D. Donsker and S. R. S. Varadhan, On the principal eigenvalue of second-order elliptic differential operators, Comm. Pure Appl. Math., 29 (1976), pp. 595–621, https://doi.org/10.1002/cpa.3160290606.
[20] W. H. Fleming and W. M. McEneaney, Risk-sensitive control on an infinite time horizon, SIAM J. Control Optim., 33 (1995), pp. 1881–1915, https://doi.org/10.1137/S0363012993258720.
[21] D. Gilbarg and N. S. Trudinger, Elliptic partial differential equations of second order, vol. 224 of Grundlehren der Mathematischen Wissenschaften, Springer-Verlag, Berlin, second ed., 1983, https://doi.org/10.1007/978-3-642-61798-0.
[22] N. Ichihara, Criticality of viscous Hamilton–Jacobi equations and stochastic ergodic control, J. Math. Pures Appl. (9), 100 (2013), pp. 368–390, https://doi.org/10.1016/j.matpur.2013.01.005.
[23] N. Ichihara, The generalized principal eigenvalue for Hamilton–Jacobi–Bellman equations of ergodic type, Ann. Inst. H. Poincaré Anal. Non Linéaire, 32 (2015), pp. 623–650, https://doi.org/10.1016/j.anihpc.2014.02.003.
[24] O. A. Ladyzhenskaya and N. N. Ural'tseva, Linear and quasilinear elliptic equations, Translated from the Russian by Scripta Technica, Inc., Academic Press, New York–London, 1968.
[25] B. Lemmens and R. Nussbaum, Nonlinear Perron–Frobenius theory, vol. 189 of Cambridge Tracts in Mathematics, Cambridge University Press, Cambridge, 2012, https://doi.org/10.1017/CBO9781139026079.
[26] G. Metafune, D. Pallara, and A. Rhandi, Global properties of invariant measures, J. Funct. Anal., 223 (2005), pp. 396–424, https://doi.org/10.1016/j.jfa.2005.02.001.
[27] C. Meyer, Matrix analysis and applied linear algebra, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 2000, https://doi.org/10.1137/1.9780898719512.
[28] S. P. Meyn and R. L. Tweedie, Stability of Markovian processes. III. Foster–Lyapunov criteria for continuous-time processes, Adv. in Appl. Probab., 25 (1993), pp. 518–548, https://doi.org/10.2307/1427522.
[29] T. Ogiwara, Nonlinear Perron–Frobenius problem on an ordered Banach space, Japan. J. Math. (N.S.), 21 (1995), https://doi.org/10.4099/math1924.21.43.
[30] S. Patrizi, Principal eigenvalues for Isaacs operators with Neumann boundary conditions, NoDEA Nonlinear Differential Equations Appl., 16 (2009), pp. 79–107, https://doi.org/10.1007/s00030-008-7042-z.
[31]