The prelimit generator comparison approach of Stein's method
arXiv [math.PR], Feb
Anton Braverman
Kellogg School of Management, Northwestern University, Evanston, IL 60208, [email protected]
This paper uses the generator comparison approach of Stein's method to analyze the gap between steady-state distributions of Markov chains and diffusion processes. The "standard" generator comparison approach starts with the Poisson equation for the diffusion, and the main technical difficulty is to establish derivative bounds for the solution to the Poisson equation, known as Stein factor bounds. In this paper we propose starting with the Markov chain Poisson equation; we term this the prelimit approach. Although Stein factor bounds still must be established, they now correspond to the finite differences of the Markov chain Poisson equation solution rather than the derivatives of the solution to the diffusion Poisson equation. In certain cases, finite difference bounds are easier to obtain, for example, when the drift of the diffusion is not everywhere differentiable, or in the presence of a reflecting boundary condition. We use the M/M/1 ...
Key words: Stein method; generator comparison; Markov chain; prelimit; convergence rate; diffusion approximation
1. Introduction
Recent years have seen growing use of the generator comparison approach of Stein's method to establish rates of convergence for steady-state diffusion approximations of Markov chains. One very active area has been the study of queueing and service systems, e.g., Stolyar (2015), Gurvich (2014a), Braverman and Dai (2017), Braverman et al. (2016), Ying (2016, 2017), Dai and Shi (2017), Huang and Gurvich (2018), Feng and Shi (2018), Liu and Ying (2019), Braverman et al. (2020b), Braverman (2020), Braverman et al. (2020a). In the typical setup, one considers a parametric family of continuous-time Markov chains (CTMCs) $\{X(t)\}$ taking values in some discrete state space. This family is often termed the prelimit sequence. As the parameters tend to some asymptotic limit, the prelimit sequence converges to a limiting diffusion process $\{Y(t)\}$. In queueing, for example, the CTMC parameters are usually the arrival rate, number of servers, and service rate, and one common asymptotic regime is where the system utilization approaches one, also known as the heavy-traffic regime. Letting $X$ and $Y$ denote vectors having the stationary distribution of the CTMC and diffusion, respectively, the generator approach of Stein's method has been used to study the rates of convergence of $X$ to $Y$. The generator approach is attributed to Barbour (1988, 1990) and Götze (1991), which were the first papers to connect Stein's method to generators of diffusions and CTMCs.

Braverman: Prelimit generator comparison approach. Article submitted to Stochastic Systems; manuscript no.

The limiting factor in the generator comparison approach is the curse of dimensionality, because the distance between $X$ and $Y$ depends on the derivatives of the solution to the Poisson equation of the diffusion. When the diffusion is multidimensional, the Poisson equation is a second-order partial differential equation (PDE), and obtaining derivative bounds, also known as Stein factor bounds, becomes a challenge. The present paper is concerned with expanding the technical toolbox for getting multidimensional Stein factor bounds. Before discussing our contribution, let us examine this problem in detail.

Assume that $X$ takes values on the lattice $\delta\mathbb{Z}^d = \{\delta k : k \in \mathbb{Z}^d\}$ for some $\delta > 0$ and that $Y \in \mathbb{R}^d$. Let $G_X$ and $G_Y$ be the infinitesimal generators of the CTMC and diffusion, respectively. Suppose $G_X$ has the form

$$G_X f(\delta k) = \sum_{k' \in \mathbb{Z}^d} q_{k,k'} \big( f(\delta k') - f(\delta k) \big), \quad k \in \mathbb{Z}^d, \tag{1}$$

where $q_{k,k'}$ are the transition rates from state $x_k$ to $x_{k'}$.
Further suppose the diffusion generator has the form

$$G_Y f(x) = \sum_{i=1}^{d} b_i(x) \frac{\partial}{\partial x_i} f(x) + \frac{1}{2} \sum_{i,j=1}^{d} a_{ij}(x) \frac{\partial^2}{\partial x_i \partial x_j} f(x), \quad x \in \mathbb{R}^d,$$

where $f : \mathbb{R}^d \to \mathbb{R}$ is a twice continuously differentiable function and $b(x) = (b_1(x), \ldots, b_d(x))$ and $a(x) = (a_{ij}(x))_{i,j=1}^{d}$ are known as the drift and diffusion coefficient, respectively.

The generator approach works as follows. First, we choose a test function $h^* : \mathbb{R}^d \to \mathbb{R}$ and consider the Poisson equation

$$G_Y f_{h^*}(x) = \mathbb{E} h^*(Y) - h^*(x), \quad x \in \mathbb{R}^d. \tag{2}$$

We use the star superscript above to emphasize that the functions are defined on all of $\mathbb{R}^d$. The solution $f_{h^*}(x)$ is unique up to a constant and satisfies $\mathbb{E} G_X f_{h^*}(X) = 0$ under some mild conditions. Hence, we can take expected values with respect to $X$ in (2) to conclude that

$$\mathbb{E} h^*(Y) - \mathbb{E} h^*(X) = \mathbb{E} \big( G_Y f_{h^*}(X) - G_X f_{h^*}(X) \big). \tag{3}$$

In practice, the chosen $h^*(x)$ frequently belongs to

$$\mathrm{Lip}(1) = \{ h^* : \mathbb{R}^d \to \mathbb{R} : |h^*(x) - h^*(y)| \le |x - y| \text{ for all } x, y \in \mathbb{R}^d \},$$

and

$$d_{\mathrm{Lip}(1)}(X, Y) = \sup_{h^* \in \mathrm{Lip}(1)} \big| \mathbb{E} h^*(X) - \mathbb{E} h^*(Y) \big|$$

is known as the Wasserstein distance. The class $\mathrm{Lip}(1)$ is chosen both because it is simple to work with and because it is convergence determining; i.e., convergence in the Wasserstein distance implies convergence in distribution (see, for instance, Gibbs and Su (2002)).

Bounding the error on the right-hand side of (3) using Stein's method requires bounds on the derivatives of $f_{h^*}(x)$, and depending on the transition structure of the CTMC, it may also depend on moments of $X$. We refer to the former as "derivative bounds." Usually, the approximation $Y$ is such that (3) converges to zero at a rate of $\delta$, and to prove this it suffices to bound the second and third derivatives of $f_{h^*}(x)$. However, when one seeks approximations $Y$ with convergence rates faster than $\delta$, as was done in Braverman et al. (2020a), for example, one needs to bound fourth- and higher-order derivatives.

When $d = 1$, the explicit form of $f_{h^*}(x)$ is known and can be leveraged to get the derivative bounds via a brute-force approach. When $d > 1$, the Poisson equation is a second-order PDE, and the same kind of brute-force analysis cannot be carried out. Instead, one has to rely on the fact that

$$f_{h^*}(x) = \int_0^\infty \big( \mathbb{E}_{Y(0)=x} h^*(Y(t)) - \mathbb{E} h^*(Y) \big)\, dt, \quad x \in \mathbb{R}^d, \tag{4}$$

solves the Poisson equation, provided the quantity above is finite. See any one of Barbour (1990), Götze (1991), Gurvich (2014a), Mackey and Gorham (2016) for a proof of (4). One can then leverage (4) together with finite difference approximations to get the derivative bounds needed. For instance,

$$\frac{\partial}{\partial x_i} f_{h^*}(x) \approx \frac{f_{h^*}(x + \varepsilon e^{(i)}) - f_{h^*}(x)}{\varepsilon} = \frac{1}{\varepsilon} \int_0^\infty \big( \mathbb{E}_{Y(0)=x+\varepsilon e^{(i)}} h^*(Y(t)) - \mathbb{E}_{Y(0)=x} h^*(Y(t)) \big)\, dt. \tag{5}$$

There are a few ways to bound (5). Sometimes, the transient distribution of $Y(t)$ is known as a function of $Y(0)$, as in Barbour (1990), Götze (1991), Gan et al. (2017), Gan and Ross (2019), and Chen et al. (2019), but this is true only for a handful of special cases.

A more general approach is to use synchronous couplings of the diffusion: that is, initialize one diffusion process at $x$ and another process sharing the same Brownian motion, but starting at $x + \varepsilon e^{(i)}$. The bound then depends on the coupling time of the two diffusions. This idea was exploited heavily in Mackey and Gorham (2016) to study derivative bounds for overdamped Langevin diffusions with strongly concave drifts and later in Gorham et al. (2019), where the authors used a combination of synchronous couplings and reflection couplings studied in Eberle (2016) and Wang (2016) to establish derivative bounds for a class of fast-coupling diffusions. However, the results of both papers require non-trivial assumptions on the diffusion. For example, both papers require the drift to be $k$-strongly concave and everywhere differentiable, and their results fail to hold for a Lipschitz drift with only a single point of non-differentiability, such as the piecewise Ornstein-Uhlenbeck process used for approximating the many-server queue in Braverman et al. (2016). Apart from the differentiability assumption on the drift, the results of Mackey and Gorham (2016) and Gorham et al. (2019) hold only for diffusions on the entire space $\mathbb{R}^d$.
This excludes diffusions with reflecting boundary conditions, such as reflecting Brownian motions that appear as heavy-traffic limits for networks of single-server queueing systems.

A second approach to getting derivative bounds was proposed in Gurvich (2014a), where the author used a priori Schauder estimates from PDE theory to bound the derivatives of $f_{h^*}(x)$ in terms of $f_{h^*}(x)$ and $h^*(x)$. He then bounded $f_{h^*}(x)$ by a Lyapunov function satisfying an exponential ergodicity condition for the diffusion. This approach requires finding a Lyapunov function satisfying an exponential ergodicity condition, which typically requires significant effort, e.g., Dieker and Gao (2013), Gurvich (2014a). Furthermore, in the case of a diffusion with a reflecting boundary, the complexity of the PDE machinery used makes it nontrivial to trace how the a priori Schauder estimates depend on the primitives of the diffusion process.

Most recently, another approach to getting derivative bounds based on Bismut's formula from Malliavin calculus was proposed in Fang et al. (2018). The authors required the diffusion coefficient to be constant, and the assumptions imposed on the drift were similar to those in Mackey and Gorham (2016). While each of the four approaches discussed above has merits, each also has drawbacks, and none is universally applicable to all problems. For this reason, the problem of getting derivative bounds is often the bottleneck of the generator comparison approach. In this paper, we present a new approach to bounding the left-hand side of (3). Let us informally illustrate its main steps.

Fix a test function $h : \delta\mathbb{Z}^d \to \mathbb{R}$, defined only on the lattice $\delta\mathbb{Z}^d$ as opposed to $\mathbb{R}^d$ as before. Now, instead of (2), we consider the Poisson equation of the prelimit,

$$G_X f_h(\delta k) = \mathbb{E} h(X) - h(\delta k), \quad k \in \mathbb{Z}^d. \tag{6}$$

The solution to (6) is unique up to a constant. Furthermore, when adapted to continuous time, Proposition 7.1 of Asmussen (2003) states that a solution to (6) exists provided $\mathbb{E}|h(X)| < \infty$. We are tempted to proceed analogously to (3) by taking expected values with respect to $Y$, but we cannot do so because $G_X f_h(\delta k)$ is not defined on $\mathbb{R}^d \setminus \delta\mathbb{Z}^d$. We get around this by interpolating the discrete Poisson equation.
Namely, we introduce a spline $A$, which interpolates functions $f : \delta\mathbb{Z}^d \to \mathbb{R}$ and results in extended functions $Af : \mathbb{R}^d \to \mathbb{R}$. By applying $A$ to both sides of (6), we obtain the interpolated Poisson equation

$$A G_X f_h(x) = \mathbb{E} h(X) - A h(x), \quad x \in \mathbb{R}^d.$$

We may now take expected values with respect to $Y$ to arrive at

$$\mathbb{E} h(X) - \mathbb{E} A h(Y) = \mathbb{E} A G_X f_h(Y) = \mathbb{E} A G_X f_h(Y) - \mathbb{E} G_Y A f_h(Y). \tag{7}$$

We will see later that $\mathbb{E} G_Y A f_h(Y) = 0$ follows from Itô's lemma provided that $A f_h(x)$ satisfies some mild conditions. To ensure that the convergence of (7) to zero implies the convergence of $X$ to $Y$, we again need to ensure that $h(\delta k)$ belongs to a rich-enough class of functions. We describe some convergence-determining classes of grid-restricted test functions in Section 2.2. Lastly, to make the right-hand side of (7) comparable to (3), we want to interchange $A$ and $G_X$. We will see that this interchange is possible but results in some error; i.e.,

$$A G_X f_h(x) = G_X A f_h(x) + \text{error}.$$

After this interchange, the right-hand side of (7) becomes analogous to (3) in the sense that the derivatives of $f_{h^*}(x)$ that appear in (3) are replaced by corresponding derivatives of $A f_h(x)$. Our choice of $A$ will be such that the derivatives of $A f_h(x)$ correspond to finite differences of $f_h(\delta k)$, meaning that the problem of establishing derivative bounds is replaced by an analogous problem of bounding finite differences. We can bound the finite differences of $f_h(\delta k)$ by relying on the fact that

$$f_h(\delta k) = \int_0^\infty \big( \mathbb{E}_{X(0)=\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt, \quad k \in \mathbb{Z}^d, \tag{8}$$

is one solution to the Poisson equation and constructing synchronous couplings of the CTMC in a manner similar to the diffusion synchronous couplings used in Mackey and Gorham (2016) and Gorham et al. (2019). We discuss in Section 3 several ways to verify that (8) is well-defined.

For ease of reference, we refer to our approach as the prelimit generator comparison approach, or simply prelimit approach, and to the traditional approach based on (2) as the diffusion approach. The prelimit and diffusion approaches are in some sense parallel approaches with many conceptual similarities. If we choose $h^*(x)$ in (3) to equal $A h(x)$ from (7), we see that the right-hand sides of (3) and (7) are equal.
This means that any bound established via the diffusion approach should, in theory, be attainable via the prelimit approach, and vice versa. In practice, technical differences can make the prelimit approach more attractive for some models.

First, when working with models that have state-space collapse, i.e., when the dimension of the CTMC is higher than that of the diffusion, the prelimit approach does not require one to bound the so-called $\mathbb{E}|X_\perp|$, which is the distance between the stationary distribution of the CTMC and its projection onto the state-space collapse manifold. This is illustrated in more detail in Braverman (2021), a companion paper in which the prelimit approach is applied to the join-the-shortest-queue model. Second, the diffusion approach can suffer from what we call "misalignment of synchronous couplings," which can complicate the process of getting derivative bounds via synchronous couplings. We illustrate this issue in Section 4 using a simple example.

Apart from showing how fast $X$ converges to $Y$, the prelimit Poisson equation (6) can also be used to prove tightness of the family of steady-state CTMC distributions in various asymptotic regimes, such as heavy traffic in queueing. Tightness has become an important property since the seminal work of Gamarnik and Zeevi (2006), which initiated a wave of research into justifying steady-state diffusion approximations of queueing systems; see, for instance, Dai et al. (2014), Budhiraja and Lee (2009), Zhang and Zwart (2008), Katsuda (2010), Ye and Yao (2012), Tezcan (2008), Gamarnik and Stolyar (2012), and Gurvich (2014b). Roughly speaking, process-level convergence of the CTMC to a diffusion combined with tightness of the CTMC stationary distributions enables one to perform a limit-interchange argument to conclude convergence of steady-state distributions. The bottleneck is usually proving tightness, which has become synonymous with steady-state convergence.

We can use (6) to prove tightness as follows. If $x_\infty$ is the fluid equilibrium of the CTMC, we may choose $h(\delta k) = |\delta k - x_\infty|$ in the Poisson equation to see that

$$\mathbb{E}|X - x_\infty| = G_X f_h(x_\infty).$$

The right-hand side corresponds to the transition rates in the CTMC and typically contains differences of $f_h(\delta k)$ up to the second order. We give an example in Section 3.2. Proving tightness is therefore equivalent to getting first- and second-order difference bounds at the single point $x_\infty$.
In contrast, bounding the approximation error of $Y$ requires third-order difference bounds on the entire support of $Y$. This perspective highlights the extra work required to go from merely proving the fact of convergence to having convergence rates.

The idea of interpolating the discrete Poisson equation can be applied more broadly to the problem of comparing discrete and continuous distributions using Stein's method. To the authors' knowledge, anytime Stein's method has been invoked for a discrete-versus-continuous random variable comparison, the starting point has always been the differential equation for the continuous random variable. Furthermore, in most applications of the method, the starting point has been the differential/difference equation for the limiting distribution, whereas we start with the prelimit.

To summarize, we make two main technical contributions. The first is that we establish the existence of an interpolator $A$ that satisfies certain convenient properties. Theorem 1 contains the one-dimensional result, which is generalized to multiple dimensions in Theorem 3. The second contribution is that we describe the interchange error of $A$ with $G_X$, which is a necessary step to compare $G_X$ to a diffusion generator. Propositions 1 and 2 contain the one-dimensional results, while Propositions 3 and 4 are multidimensional generalizations. After presenting the general framework, we illustrate it using the single-server queueing system as an example. This paper is meant to be a gentle introduction to the prelimit approach as well as a chance to illustrate several techniques for obtaining difference bounds.

It is important to add that using CTMC synchronous couplings dates back to Barbour (1988), which was the first paper to connect Stein's method to Markov chains (although the author of that paper did not use the term "synchronous coupling"). In that work, the author viewed the Poisson distribution as the steady-state distribution of the infinite-server queue. Later, the application of Stein's method to birth-death processes received a thorough treatment in Brown and Xia (2001). A more recent example of using CTMC synchronous couplings can be found in Barbour et al. (2018a,b).

The remainder of the paper is structured as follows. In Section 2 we introduce the technical components of the prelimit approach. We then apply the prelimit approach to the M/M/1 ...
For any set $B \subset \mathbb{R}^d$, we let $\mathrm{Conv}(B)$ denote its convex hull. For any integer $k \ge 0$ and $B \subset \mathbb{R}^d$, we let $C^k(B)$ be the set of all $k$-times continuously differentiable functions $f : B \to \mathbb{R}$. We use $D = D([0, \infty), \mathbb{R})$ to denote the space of right-continuous functions with left limits mapping $[0, \infty)$ to $\mathbb{R}$. Given a stochastic process $\{Z(t)\} \in D$ and a functional $f : D \to \mathbb{R}$, we write $\mathbb{E}_x(f(Z))$ to denote $\mathbb{E}(f(Z) \mid Z(0) = x)$. We let $e \in \mathbb{R}^d$ be the vector whose elements all equal 1 and let $e^{(i)}$ be the element with 1 in the $i$th entry and zeros otherwise. We use $\mathbb{Z}$ to denote the set of integers and let $\mathbb{N} = \{0, 1, 2, \ldots\}$. For any $\delta > 0$ and $d > 0$, we let $\delta\mathbb{Z}^d = \{\delta k : k \in \mathbb{Z}^d\}$ and define $\delta\mathbb{N}^d$ similarly. For any function $f : \delta\mathbb{Z}^d \to \mathbb{R}$, we define the forward difference operator in the $i$th direction as

$$\Delta_i f(\delta k) = f\big(\delta(k + e^{(i)})\big) - f(\delta k), \quad k \in \mathbb{Z}^d,\ 1 \le i \le d,$$

and for $j \ge 0$, we define

$$\Delta_i^{j+1} f(\delta k) = \Delta_i^{j} f\big(\delta(k + e^{(i)})\big) - \Delta_i^{j} f(\delta k), \tag{9}$$

with the convention that $\Delta_i^{0} f(\delta k) = f(\delta k)$. For a vector $a \in \mathbb{N}^d$, we also let

$$\Delta^a f(\delta k) = \Delta_1^{a_1} \cdots \Delta_d^{a_d} f(\delta k),$$

and if $f : \mathbb{R}^d \to \mathbb{R}$, then

$$\frac{\partial^a}{\partial x^a} f(x) = \frac{\partial^{a_1}}{\partial x_1^{a_1}} \cdots \frac{\partial^{a_d}}{\partial x_d^{a_d}} f(x),$$

and we adopt the convention that $\frac{\partial^0}{\partial x^0} f(x) = f(x)$. For any $x \in \mathbb{R}^d$, we define $\|x\|_1 = \sum_{i=1}^{d} |x_i|$ and write $|x|$ to denote the Euclidean norm. Throughout the paper we will often use $C$ to denote a generic positive constant that may change from line to line and that will generally be independent of any parameters not explicitly specified. For a random variable $X$, we write $\mathrm{supp}(X)$ to denote the support of $X$.
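As a small illustration of the recursion (9), the following Python sketch computes iterated forward differences in the one-dimensional case $d = 1$. The function names are hypothetical and chosen only for this example; the second function uses the standard binomial expansion of an iterated difference as a cross-check.

```python
from math import comb


def forward_difference(f, k, j=1, delta=1.0):
    """j-th forward difference of f at the lattice point delta*k (case d = 1).

    Follows the recursion (9), with the convention that the 0-th
    difference is f itself.
    """
    if j == 0:
        return f(delta * k)
    # Delta^{j} f(delta k) = Delta^{j-1} f(delta (k+1)) - Delta^{j-1} f(delta k)
    return (forward_difference(f, k + 1, j - 1, delta)
            - forward_difference(f, k, j - 1, delta))


def forward_difference_binomial(f, k, j, delta=1.0):
    """Same quantity via sum_m (-1)^(j-m) * C(j, m) * f(delta*(k+m))."""
    return sum((-1) ** (j - m) * comb(j, m) * f(delta * (k + m))
               for m in range(j + 1))
```

For a smooth $f$, the $j$th difference is of order $\delta^j$: with $f(x) = x^2$, the second difference equals $2\delta^2$ at every lattice point. This is the scaling that reappears in the factor $\delta^{-a}$ of the derivative bounds later in the paper.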
2. The Prelimit Generator Comparison Approach
In this section, we work out the technical details of the prelimit approach. We begin by introducing the interpolation operator $A$ in Section 2.1. We follow this with a discussion of convergence-determining classes in Section 2.2. Then, we write the form of $A G_X f(x)$ in a manner that easily lends itself to analysis. Informally, we refer to this as interchanging $A$ with $G_X$. The interchange is not perfect and results in some error that depends on the domain of the CTMC. We treat unbounded domains in Section 2.3 and bounded domains in Section 2.4. To minimize notational burden, we restrict our discussion in Sections 2.1, 2.3, and 2.4 to one-dimensional CTMCs. In multiple dimensions, the results are analogous from a technical perspective, but may be harder to parse at first read. We therefore postpone the multidimensional discussion to the appendix, in which multidimensional interpolation is discussed in Appendix A, and multidimensional interchange is left to Appendix B.
The objective of this section is to state Theorem 1. Fix $\delta > 0$, and for $x \in \mathbb{R}$ define $k(x) = \lfloor x/\delta \rfloor$. Let $K \subset \mathbb{R}$ be a possibly unbounded interval and define

$$K_4 = \{ x \in K \cap \delta\mathbb{Z} : (x + 4\delta) \in K \cap \delta\mathbb{Z} \}.$$

For example, if $K = (-\infty, \infty)$, then $K_4 = \delta\mathbb{Z}$. Let $f : K \cap \delta\mathbb{Z} \to \mathbb{R}$ be the function we want to extend to the continuum. One may be familiar with the cubic spline, a staple of numerical analysis that can certainly extend the function. However, a cubic spline is insufficient for our purposes because we want the extension to be four-times differentiable almost everywhere. Instead, we use a spline composed of degree-7 polynomials. Define $Af(x) = P_{k(x)}(x)$, where

$$P_k(x) = \sum_{i=0}^{4} \alpha^{k}_{k+i}(x)\, f(\delta(k + i)), \quad x \in \mathbb{R}.$$

Each $P_k(x)$ is a degree-7 polynomial and is best understood as a weighted sum of $f(\delta k), \ldots, f(\delta(k + 4))$ with weights $\alpha^{k}_{k}(x), \ldots, \alpha^{k}_{k+4}(x)$. The precise form of $P_k(x)$ is distracting, so we state it in Appendix A. The following result summarizes the key properties we will need of $Af(x)$ and the weights $\alpha^{k}_{k+i}(x)$.

Theorem 1.
Given $f : K \cap \delta\mathbb{Z} \to \mathbb{R}$, the function

$$Af(x) = \sum_{i=0}^{4} \alpha^{k(x)}_{k(x)+i}(x)\, f(\delta(k(x) + i)), \quad x \in \mathrm{Conv}(K_4), \tag{10}$$

belongs to $C^3(\mathrm{Conv}(K_4))$ and is infinitely differentiable on $\mathrm{Conv}(K_4) \setminus K_4$. Furthermore,

$$Af(\delta k) = f(\delta k), \quad \delta k \in K_4, \tag{11}$$

and the derivatives of $Af(x)$ are bounded by the corresponding finite differences of $f(\delta k)$. Namely, there exists $C > 0$ independent of $x$, $f(\cdot)$, and $\delta$ such that

$$\Big| \frac{\partial^a}{\partial x^a} Af(x) \Big| \le C \delta^{-a} \max_{0 \le i \le 4 - a} \big| \Delta^a f(\delta(k(x) + i)) \big|, \quad x \in \mathrm{Conv}(K_4),\ 0 \le a \le 3, \tag{12}$$

and (12) also holds for $x \in \mathrm{Conv}(K_4) \setminus K_4$ when $a = 4$. Additionally, the weights $\{ \alpha^{k}_{k+i} : \mathbb{R} \to \mathbb{R} : k \in \mathbb{Z},\ i = 0, 1, 2, 3, 4 \}$ are degree-7 polynomials in $(x - \delta k)/\delta$ whose coefficients do not depend on $k$ or $\delta$. They satisfy

$$\alpha^{k}_{k}(\delta k) = 1, \quad \text{and} \quad \alpha^{k}_{k+i}(\delta k) = 0, \quad k \in \mathbb{Z},\ i = 1, 2, 3, 4, \tag{13}$$

$$\sum_{i=0}^{4} \alpha^{k}_{k+i}(x) = 1, \quad k \in \mathbb{Z},\ x \in \mathbb{R}, \tag{14}$$

and also the following translational invariance property:

$$\alpha^{k+j}_{k+j+i}(x + \delta j) = \alpha^{k}_{k+i}(x), \quad i, j, k \in \mathbb{Z},\ x \in \mathbb{R}. \tag{15}$$

Theorem 1 is proved in Appendix A and follows directly from the form of $P_k(x)$ stated there. From (12) we see that the reason $P_k(x)$ depends on $f(\delta k), \ldots, f(\delta(k + 4))$, as opposed to also depending on $f(\delta(k + 5))$, is that we want $\frac{\partial^a}{\partial x^a} Af(x)$ to be related to $\Delta^a f(\delta k(x))$ for $0 \le a \le 4$, and we do not care what happens beyond the fourth derivative. In theory, one can make $Af(x)$ as differentiable as is needed by using a higher-degree polynomial $P_k(x)$.

We mentioned in the introduction that when one uses the diffusion approach, $\mathrm{Lip}(1)$ is a commonly used convergence-determining class. In this section we discuss two convergence-determining classes of grid-valued functions that can be used with the prelimit approach. Lemma 1 below presents the main result of this section.

Recall our convention of using a star superscript to emphasize that a function is defined on the continuum. Given two random variables $U, V \in \mathbb{R}^d$ and a class of functions $\mathcal{H} = \{ h^* : \mathbb{R}^d \to \mathbb{R} \}$, we define

$$d_{\mathcal{H}}(U, V) = \sup_{h^* \in \mathcal{H}} \big| \mathbb{E} h^*(U) - \mathbb{E} h^*(V) \big|.$$

We already said that $\mathrm{Lip}(1)$ is a convergence-determining class because $d_{\mathrm{Lip}(1)}(U, V) \to 0$ implies that $U$ converges to $V$ in distribution. There are, of course, other convergence-determining classes. For instance, it was shown in Lemma 2.2 of Mackey and Gorham (2016) that if

$$\mathcal{H} = \mathcal{M} = \Big\{ h^* : \mathbb{R}^d \to \mathbb{R} : \Big| \frac{\partial^a}{\partial x^a} h^*(x) \Big| \le 1,\ 1 \le \|a\|_1 \le 3 \Big\},$$

then $d_{\mathcal{M}}(U, V) \to 0$ also implies convergence in distribution. Both $\mathrm{Lip}(1)$ and $\mathcal{M}$ are classes of functions defined on $\mathbb{R}^d$, but in the prelimit approach the function $h(\delta k)$ in (6) is defined only on the grid. To mimic the two classes, we define

$$\mathrm{dLip}(1) = \{ h : \delta\mathbb{Z}^d \to \mathbb{R} : |\Delta_j h(\delta k)| \le \delta,\ 1 \le j \le d,\ \delta k \in \delta\mathbb{Z}^d \},$$

$$\mathcal{M}_{\mathrm{disc}}(C) = \{ h : \delta\mathbb{Z}^d \to \mathbb{R} : |\Delta^a h(\delta k)| \le C \delta^{\|a\|_1},\ 1 \le \|a\|_1 \le 3,\ \delta k \in \delta\mathbb{Z}^d \}.$$

The following lemma relates $d_{\mathrm{Lip}(1)}(U, V)$ and $d_{\mathcal{M}}(U, V)$ to their grid-restricted counterparts. The lemma involves the multidimensional interpolator, which we have not yet formally introduced. However, that does not preclude an understanding of the lemma, which is proved in Section C.1.
Lemma 1.
Let $U \in \delta\mathbb{Z}^d$ and $V \in \mathbb{R}^d$ be two random vectors. For any $h^* : \mathbb{R}^d \to \mathbb{R}$, let $h : \delta\mathbb{Z}^d \to \mathbb{R}$ be the restriction of $h^*(x)$ to $\delta\mathbb{Z}^d$. Let $Ah : \mathbb{R}^d \to \mathbb{R}$ be defined by (10) when $d = 1$, and by (51) when $d > 1$. Then there exists a constant $C > 0$ such that

$$| \mathbb{E} h^*(U) - \mathbb{E} h^*(V) | \le | \mathbb{E} h(U) - \mathbb{E} Ah(V) | + C \delta \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} \Big| \frac{\partial}{\partial x_j} h^*(x) \Big|.$$

As a consequence, there exists a constant $C' > 0$ such that

$$d_{\mathrm{Lip}(1)}(U, V) \le \sup_{h \in \mathrm{dLip}(1)} | \mathbb{E} h(U) - \mathbb{E} Ah(V) | + C\delta,$$

$$d_{\mathcal{M}}(U, V) \le \sup_{h \in \mathcal{M}_{\mathrm{disc}}(C')} | \mathbb{E} h(U) - \mathbb{E} Ah(V) | + C\delta.$$
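To see how the second part of Lemma 1 can follow from the first for the Lipschitz class, it may help to spell out one step. The following is only a sketch, written under the extra assumption that $h^*$ is differentiable; the proof in Section C.1 may proceed differently. If $h^* \in \mathrm{Lip}(1)$ and $h$ is its restriction to the grid, then $h \in \mathrm{dLip}(1)$, and the derivative term in the first inequality is at most $C\delta$:

```latex
% Sketch (assumes h^* differentiable): the restriction of a 1-Lipschitz
% function to the grid lies in dLip(1), and its partial derivatives are
% bounded by 1.
\begin{align*}
|\Delta_j h(\delta k)|
  &= \big| h^*\big(\delta(k + e^{(j)})\big) - h^*(\delta k) \big|
   \le \big| \delta e^{(j)} \big| = \delta,
   \qquad 1 \le j \le d,\ k \in \mathbb{Z}^d, \\
C \delta \sup_{1 \le j \le d,\ x \in \mathbb{R}^d}
  \Big| \frac{\partial}{\partial x_j} h^*(x) \Big|
  &\le C \delta .
\end{align*}
```

Taking the supremum over $h^* \in \mathrm{Lip}(1)$ on both sides of the first inequality then yields the stated bound on $d_{\mathrm{Lip}(1)}(U, V)$.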
Assume $G_X f(\delta k)$ is defined for all $k \in \mathbb{Z}$; i.e., the CTMC lives on $\delta\mathbb{Z}$. The interchange result for $A$ and $G_X$ is given in Proposition 1 below. We then apply this result to characterize the approximation error between $X$ and its diffusion approximation $Y$ in (22).

Recall that $q_{\delta k, \delta k'}$ are the transition rates of our CTMC and define $\beta_\ell(\delta k) = q_{\delta k, \delta(k + \ell)}$ for $k, \ell \in \mathbb{Z}$. Then

$$G_X f(\delta k) = \sum_{k' \in \mathbb{Z}} q_{\delta k, \delta k'} \big( f(\delta k') - f(\delta k) \big) = \sum_{\ell \in \mathbb{Z}} \beta_\ell(\delta k) \big( f(\delta(k + \ell)) - f(\delta k) \big), \quad k \in \mathbb{Z}.$$

Fix $h : \delta\mathbb{Z} \to \mathbb{R}$ such that

$$G_X f_h(\delta k) = \mathbb{E} h(X) - h(\delta k), \quad k \in \mathbb{Z},$$

has a solution $f_h(\delta k)$. Since $A$ is a linear operator, we have

$$A G_X f_h(x) = A(\mathbb{E} h(X) - h)(x) = \mathbb{E} Ah(X) - Ah(x).$$

The following result says $A G_X f(x) = G_X A f(x) + \text{error}(x)$ and characterizes the error term. We prove it in Section B.1 by proving the multidimensional version, Proposition 3, there.

Proposition 1.
Fix $f : \delta\mathbb{Z} \to \mathbb{R}$ and assume that $G_X f(\delta k)$ is defined on all of $\delta\mathbb{Z}$. Assume also that

$$\sum_{\ell \in \mathbb{Z}} \big| \beta_\ell(\delta k) \big( f(\delta(k + \ell)) - f(\delta k) \big) \big| < \infty, \quad k \in \mathbb{Z}, \tag{16}$$

which is trivially satisfied when the number of possible transitions from each state is finite. Then for any $x \in \mathbb{R}$,

$$A G_X f(x) = \sum_{\ell \in \mathbb{Z}} A\beta_\ell(x) \big( Af(x + \delta\ell) - Af(x) \big) + \varepsilon(x). \tag{17}$$

The error satisfies

$$\varepsilon(x) = \sum_{\ell \in \mathbb{Z}} \sum_{i=0}^{4} \alpha^{k(x)}_{k(x)+i}(x) \Big( \beta_\ell\big(\delta(k(x) + i)\big) - A\beta_\ell(x) \Big) \times \Big( 1(\ell > 0) \sum_{j=0}^{i-1} \sum_{m=0}^{\ell-1} \Delta^2 f\big(\delta(k(x) + m + i)\big) - 1(\ell < 0) \sum_{j=0}^{i-1} \sum_{m=\ell}^{-1} \Delta^2 f\big(\delta(k(x) + m + i)\big) \Big). \tag{18}$$
Let us assume that our CTMC is such that $f_h(\delta k)$ satisfies (16). With the help of Proposition 1, we may derive the diffusion approximation $Y$ and characterize the approximation error. First, we apply Taylor expansion to $\big( Af_h(x + \delta\ell) - Af_h(x) \big)$ to get

$$A G_X f_h(x) = (Af_h)'(x)\, \delta \sum_{\ell \in \mathbb{Z}} \ell\, A\beta_\ell(x) + \frac{1}{2} (Af_h)''(x)\, \delta^2 \sum_{\ell \in \mathbb{Z}} \ell^2 A\beta_\ell(x) + \frac{1}{6} \delta^3 \sum_{\ell \in \mathbb{Z}} \ell^3 A\beta_\ell(x) (Af_h)'''(\xi_\ell(x)) + \varepsilon(x),$$

where $\xi_\ell(x)$ is some number between $x$ and $x + \delta\ell$. To approximate $X$, we set

$$b(x) = \delta \sum_{\ell \in \mathbb{Z}} \ell\, A\beta_\ell(x), \quad \text{and} \quad a(x) = \delta^2 \sum_{\ell \in \mathbb{Z}} \ell^2 A\beta_\ell(x), \quad x \in \mathbb{R},$$

and consider the random variable $Y$ whose density is given by

$$p(x) = \frac{\kappa}{a(x)} \exp\Big( \int_0^x \frac{2 b(y)}{a(y)}\, dy \Big), \quad x \in \mathbb{R},$$

where $\kappa$ is a normalizing constant that we assume to be finite. One may verify using integration by parts that

$$\mathbb{E}\, b(Y) f(Y) + \frac{1}{2} \mathbb{E}\, a(Y) f'(Y) = 0 \tag{19}$$

for any $f(x)$ for which the expectations above exist and for which $\lim_{x \to \pm\infty} f(x) \exp\big( \int_0^x \frac{2 b(y)}{a(y)}\, dy \big) = 0$. Another way to view $Y$ is as the stationary distribution of the diffusion process

$$Y(t) = Y(0) + \int_0^t b(Y(s))\, ds + \int_0^t \sqrt{a(Y(s))}\, dW(s), \tag{20}$$

where $\{W(t)\}$ is standard Brownian motion. The process above has generator

$$G_Y f(x) = b(x) f'(x) + \frac{1}{2} a(x) f''(x),$$

and Itô's lemma tells us that for any $f \in C^2(\mathbb{R})$,

$$\mathbb{E}_x f(Y(t)) - \mathbb{E}_x f(Y(0)) = \mathbb{E}_x \int_0^t G_Y f(Y(s))\, ds, \quad t > 0.$$
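The identity (19) can be checked numerically for a given drift and diffusion coefficient. The sketch below is an illustration only: the function name, the truncation interval, and the trapezoid-rule quadrature are assumptions made for this example and are not part of the paper's argument. It builds the unnormalized density from $b$ and $a$ and evaluates the left-hand side of (19).

```python
import math


def stein_identity_residual(b, a, f, df, lo=-10.0, hi=10.0, n=4000):
    """Approximate E[b(Y) f(Y)] + 0.5 * E[a(Y) f'(Y)] on [lo, hi].

    Y has density proportional to exp(int 2b/a) / a. The integral in the
    exponent starts at lo rather than 0, which only rescales kappa.
    """
    step = (hi - lo) / n
    xs = [lo + i * step for i in range(n + 1)]
    g = [2.0 * b(x) / a(x) for x in xs]
    # Cumulative trapezoid rule for the exponent.
    expo = [0.0]
    for i in range(1, n + 1):
        expo.append(expo[-1] + 0.5 * step * (g[i - 1] + g[i]))
    shift = max(expo)  # subtract the maximum to avoid overflow in exp
    w = [math.exp(e - shift) / a(x) for e, x in zip(expo, xs)]

    def integrate(vals):
        return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    mass = integrate(w)  # equals 1/kappa up to the rescaling above
    lhs = integrate([(b(x) * f(x) + 0.5 * a(x) * df(x)) * wi
                     for x, wi in zip(xs, w)])
    return lhs / mass
```

With $b(x) = -x$ and $a(x) = 2$, the density is standard normal and (19) reduces to the classical Stein identity $\mathbb{E} f'(Y) = \mathbb{E}\, Y f(Y)$, so `stein_identity_residual(lambda x: -x, lambda x: 2.0, math.sin, math.cos)` should be close to zero. The same check can be run with a piecewise-linear drift that is not differentiable at the origin.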
Provided that $\mathbb{E} f(Y)$ is finite, we can initialize $Y(0) \stackrel{d}{=} Y$ to get

$$0 = \mathbb{E} \Big[ \int_0^t G_Y f(Y(s))\, ds \,\Big|\, Y(0) \stackrel{d}{=} Y \Big].$$

If we further assume that $\mathbb{E} |G_Y f(Y)| < \infty$, then we can apply the Fubini-Tonelli theorem to interchange the integral and expectation above and conclude that

$$0 = \mathbb{E}\, G_Y f(Y), \tag{21}$$

which is just a restatement of (19). Now, provided that (21) holds with $Af_h(x)$ in place of $f(x)$ there, we get

$$\mathbb{E} Ah(X) - \mathbb{E} Ah(Y) = \mathbb{E} A G_X f_h(Y) - \mathbb{E} G_Y A f_h(Y) = \frac{1}{6} \delta^3\, \mathbb{E} \sum_{\ell \in \mathbb{Z}} \ell^3 A\beta_\ell(Y) (Af_h)'''(\xi_\ell(Y)) + \mathbb{E}\, \varepsilon(Y). \tag{22}$$

The bounds on $(Af_h)'''(x)$ from Theorem 1 imply

$$\frac{1}{6} \delta^3 \Big| \mathbb{E} \sum_{\ell \in \mathbb{Z}} \ell^3 A\beta_\ell(Y) (Af_h)'''(\xi_\ell(Y)) \Big| \le C \Big| \mathbb{E} \sum_{\ell \in \mathbb{Z}} \ell^3 A\beta_\ell(Y) \max_{0 \le i \le 1} \big| \Delta^3 f_h\big(\delta(k(\xi_\ell(Y)) + i)\big) \big| \Big|.$$

In other words, the term above depends on $\ell^3 A\beta_\ell(Y)$ and third-order differences of $f_h(\delta k)$. The second term in (22) is $\mathbb{E}\, \varepsilon(Y)$. We recall $\varepsilon(x)$ below for convenience:

$$\varepsilon(x) = \sum_{\ell \in \mathbb{Z}} \sum_{i=0}^{4} \alpha^{k(x)}_{k(x)+i}(x) \Big( \beta_\ell\big(\delta(k(x) + i)\big) - A\beta_\ell(x) \Big) \times \Big( 1(\ell > 0) \sum_{j=0}^{i-1} \sum_{m=0}^{\ell-1} \Delta^2 f\big(\delta(k(x) + m + i)\big) - 1(\ell < 0) \sum_{j=0}^{i-1} \sum_{m=\ell}^{-1} \Delta^2 f\big(\delta(k(x) + m + i)\big) \Big).$$

First, the fact that $\alpha^{k}_{k+i}(x)$ is a polynomial in $(x - \delta k)/\delta$ implies $\sup_{x \in \mathbb{R}} \big| \alpha^{k(x)}_{k(x)+i}(x) \big|$ is bounded by a constant independent of $\delta$, $x$, or any other parameters. Second, the fact that $0 \le i \le 4$ implies

$$\big| \beta_\ell\big(\delta(k(x) + i)\big) - A\beta_\ell(x) \big| = \big| A\beta_\ell\big(\delta(k(x) + i)\big) - A\beta_\ell(x) \big| \le 4\delta\, \big| (A\beta_\ell)'(\xi'_\ell(x)) \big|.$$
Therefore, provided that the transition rates $\beta_\ell(\cdot)$ of the CTMC do not vary too much, e.g., they are Lipschitz, the term above can be controlled, so bounding (22) comes down to bounding $\Delta^2 f_h(\delta k)$ and $\Delta^3 f_h(\delta k)$.
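The Taylor-expansion step of this section can be illustrated with a short Python sketch. It assumes state-independent transition rates, so that $A\beta_\ell = \beta_\ell$ by (14) and the interchange error $\varepsilon(x)$ vanishes; the function names and the particular rates are hypothetical. The sketch computes $b$ and $a$ from the rates and compares the CTMC generator with the diffusion generator on a smooth test function.

```python
def ctmc_generator(rates, f, x, delta):
    """G_X f(x) = sum_l beta_l * (f(x + delta*l) - f(x)).

    `rates` maps an integer jump size l to a constant rate beta_l
    (constant rates are an assumption made for this illustration).
    """
    return sum(beta * (f(x + delta * l) - f(x)) for l, beta in rates.items())


def drift_and_diffusion(rates, delta):
    """b = delta * sum_l l * beta_l and a = delta^2 * sum_l l^2 * beta_l."""
    b = delta * sum(l * beta for l, beta in rates.items())
    a = delta ** 2 * sum(l ** 2 * beta for l, beta in rates.items())
    return b, a


def diffusion_generator(b, a, df, d2f, x):
    """G_Y f(x) = b * f'(x) + 0.5 * a * f''(x) in one dimension."""
    return b * df(x) + 0.5 * a * d2f(x)
```

For the cubic $f(x) = x^3$, the third derivative is the constant 6, so the gap $G_X f(x) - G_Y f(x)$ equals $\frac{1}{6}\delta^3 \sum_\ell \ell^3 \beta_\ell \cdot 6 = \delta^3 \sum_\ell \ell^3 \beta_\ell$ exactly. This is the third-order remainder term that appears in (22).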
In this section we present Proposition 2, which is an interchange result when $G_X f(\delta k)$ is defined only for $k \in \mathbb{N}$, i.e., a domain with a left boundary. Domains with both left and right boundaries can be handled similarly.

To see why Proposition 1 is inadequate in the case of a bounded domain, consider the birth-death process defined by its generator
$$G_X f(\delta k) = \lambda \Delta f(\delta k) - \mu 1(k > 0) \Delta f(\delta(k-1)), \quad k \in \mathbb{N}. \tag{23}$$
This generator corresponds to the customer count, scaled by $\delta$, in a single-server queue where customers arrive according to a Poisson process with rate $\lambda$ and service times are exponentially distributed with rate $\mu$. Such a system is also known as the M/M/1 system, and $\rho = \lambda/\mu$ is the system load. In steady state, the customer count is geometrically distributed provided that $\rho < 1$. It is also well known that as $\rho \to 1$, the customer count can be approximated by an exponential random variable. This section is focused on obtaining an analog of (22) for the M/M/1 system.

Fix $f : \delta\mathbb{N} \to \mathbb{R}$ and consider $AG_X f(x)$, which is defined for $x \in [0, \infty)$. By repeating the proof of Proposition 1, one may check that
$$AG_X f(x) = \lambda \big( Af(x + \delta) - Af(x) \big) + \mu \big( Af(x - \delta) - Af(x) \big) + \varepsilon(x)$$
for $x \geq \delta$. However, contrary to Proposition 1, the above equality fails for $x \in [0, \delta)$ due to the fact that $Af(x) = \sum_{i=0}^{4} \alpha^{k(x)}_{k(x)+i}(x) f(\delta(k(x)+i))$ is not defined for $x < 0$, because $f(\delta k)$ is undefined for $k < 0$. Our restriction of $f(\delta k)$ to $k \geq 0$ is natural because for $f(\delta k)$ we intend to use the solution to the Poisson equation for the M/M/1 system, which is defined only on $\delta\mathbb{N}$.

We now describe an alternative to Proposition 1 for a general class of CTMCs defined on $\delta\mathbb{N}$. Consider the CTMC with generator
$$G_X f(\delta k) = \sum_{\ell \in \mathbb{Z}} \beta_\ell(\delta k) \big( f(\delta(k + \ell)) - f(\delta k) \big), \quad k \in \mathbb{N},$$
and let $X$ be distributed according to its stationary distribution. Fix $h : \delta\mathbb{N} \to \mathbb{R}$ with $E|h(X)| < \infty$. As discussed in the introduction, the equation
$$G_X f_h(\delta k) = E h(X) - h(\delta k), \quad k \in \mathbb{N},$$
has a finite solution $f_h(\delta k)$. We first extrapolate $f_h : \delta\mathbb{N} \to \mathbb{R}$ to all of $\delta\mathbb{Z}$ by letting
$$\widehat{f}_h(\delta k) = \sum_{i=0}^{4} \alpha^{k \vee 0}_{k \vee 0 + i}(\delta k)\, f_h\big(\delta(k \vee 0 + i)\big), \quad k \in \mathbb{Z}, \tag{24}$$
where the weights $\alpha^{k}_{k+i}(x)$ are as in Section 2.1. Next, we assume there exists some $L \in \mathbb{N}$ such that $\beta_\ell(\delta k) = 0$ for all $k \in \mathbb{N}$ if $\ell < -L$. In other words, we assume that the largest jump to the left is, at most, of size $L$. Let us define
$$\widetilde{A}\beta_\ell(x) = \sum_{i=0}^{4} \alpha^{k(x) \vee L}_{k(x) \vee L + i}(x)\, \beta_\ell\big(\delta(k(x) \vee L + i)\big), \quad x \in \mathbb{R},\ \ell \in \mathbb{Z}.$$
Note that $\widetilde{A}\beta_\ell(x) = A\beta_\ell(x)$ for $x \geq \delta L$, and for $x < \delta L$ it is again the extrapolation of the transition rates $\beta_\ell(\delta k)$ based on their values when $k \geq L$. In our M/M/1 example, $L = 1$, and the fact that $\sum_{i=0}^{4} \alpha^{k}_{k+i}(x) = 1$ implies $\widetilde{A}\beta_1(x) = \lambda$ and $\widetilde{A}\beta_{-1}(x) = \mu$ for all $x \in \mathbb{R}$, because $\beta_1(\delta k)$ and $\beta_{-1}(\delta k)$ are constant for all $k \geq 1$.

Proposition 2.
Assume that X ℓ ∈ Z | β ℓ ( δk )( f h ( δ ( k + ℓ )) − f h ( δk )) | < ∞ , k ∈ N , (25) which is trivially satisfied when the number of possible transitions from each state is finite.Then AG X f h ( x ) = X ℓ ∈ Z f Aβ ℓ ( x )( A b f h ( x + δℓ ) − A b f h ( x )) + e ε ( x ) + ε h ( x ) + ε f ( x ) , x ≥ where e ε ( x ) = X ℓ ∈ Z X i =0 α k ( x ) ∨ Lk ( x ) ∨ L + i ( x ) (cid:16) β ℓ (cid:0) δ ( k ( x ) ∨ L + i ) (cid:1) − f Aβ ℓ ( x ) (cid:17) × (cid:16) ℓ > i − X j =0 ℓ − X m =0 ∆ f h (cid:0) δ ( k ( x ) ∨ L + m + i ) (cid:1) raverman: Prelimit generator comparison approach Article submitted to
Stochastic Systems ; manuscript no. − ℓ < i − X j =0 − X m = ℓ ∆ f h (cid:0) δ ( k ( x ) ∨ L + m + i ) (cid:1)(cid:17) , | ε h ( x ) | ≤ (cid:0) x ∈ [0 , δL ) (cid:1) C ( L ) max ≤ m ≤ L (cid:12)(cid:12) ∆ h ( δm ) (cid:12)(cid:12) , | ε f ( x ) | ≤ (cid:0) x ∈ [0 , δL ) (cid:1) C ( L ) X ℓ ∈ Z (cid:12)(cid:12) f Aβ ℓ ( x ) (cid:12)(cid:12)(cid:16) max ≤ m ≤ L (cid:12)(cid:12) ∆ f h ( δm ) (cid:12)(cid:12) + max ≤ m ≤ L + ℓ +8 (cid:12)(cid:12) ∆ f h ( δm ) (cid:12)(cid:12) (cid:17) . Proposition 2 is proved in Section B.2, where we state and prove the multidimensionalversion. The error term e ε ( x ) resembles ε ( x ) from Proposition 1, while ε h ( x ) and ε f ( x )are new. The bounds on | ε f ( x ) | and | e ε ( x ) | are similar in that both depend on the finitedifferences of f h ( δk ). The bound on | ε h ( x ) | depends on the finite differences of h ( δk ) andcan be made small by assuming h ∈ M disc ( C ′ ) for some C ′ >
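0, which we know by Lemma 1 to be a convergence-determining class of functions.

To conclude this section, we derive a diffusion approximation for the M/M/1 system. Let $G_X$ be the M/M/1 generator in (23) and let $f_h(\delta k)$ solve the corresponding Poisson equation. Then Proposition 2 states that for $x \geq 0$,
$$\begin{aligned} AG_X f_h(x) &= \lambda \big( A\widehat{f}_h(x + \delta) - A\widehat{f}_h(x) \big) + \mu \big( A\widehat{f}_h(x - \delta) - A\widehat{f}_h(x) \big) + \widetilde{\varepsilon}(x) + \varepsilon_h(x) + \varepsilon_f(x) \\ &= \delta(\lambda - \mu)(Af_h)'(x) + \frac{1}{2}\delta^2(\lambda + \mu)(Af_h)''(x) \\ &\quad + \frac{1}{6}\delta^3 \big( \lambda (Af_h)'''(\xi_1(x)) - \mu (A\widehat{f}_h)'''(\xi_{-1}(x)) \big) + \widetilde{\varepsilon}(x) + \varepsilon_h(x) + \varepsilon_f(x). \end{aligned} \tag{26}$$
In the second equality, we used the fact that $A\widehat{f}_h(x) = Af_h(x)$ for $x \geq 0$; the sign of the $\mu$ term is the one produced by the third-order Taylor expansion of $A\widehat{f}_h(x - \delta)$ around $x$, in agreement with the factor $\ell^3$ in (22). The only term that we are not equipped to bound yet is $(A\widehat{f}_h)'''(x)$, which by Theorem 1 we know is bounded by third-order differences of $\widehat{f}_h(\delta k)$. To bound it, we need the following auxiliary result.

Lemma 2.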
Let b f h ( δk ) be as in (24) . There exists a constant C > independent of anyparameters such that for a = 0 , , , , , (cid:12)(cid:12) ∆ a b f h ( δk ) (cid:12)(cid:12) ≤ C (cid:0) | k ∧ | (cid:1) max ≤ i ≤ (cid:12)(cid:12) ∆ a f h (cid:0) δ (( k ∨
0) + i )) (cid:1)(cid:12)(cid:12) , k ∈ Z . raverman: Prelimit generator comparison approach
The multidimensional version of the lemma is restated in Section B.2 and proved in Section B.2.2. To derive the diffusion approximation, we use the first line on the right-hand side of (26). Since the M/M/1 system has a boundary at zero, our diffusion approximation is the reflected Brownian motion (RBM)
$$Y(t) = Y(0) + \delta(\lambda - \mu)t + \delta\sqrt{\lambda + \mu}\, W(t) + R(t), \tag{27}$$
where $R(t)$ is the unique, continuous, and non-decreasing process such that $Y(t) \geq 0$ for all $t \geq 0$, $R(0) = 0$, and $R(t)$ increases only at those times $t$ when $Y(t) = 0$. Theorem 2 in Harrison and Reiman (1981) provides a version of Itô's lemma for RBMs. Namely, for any $f \in C^{2}(\mathbb{R}_+)$,
$$E_x f(Y(t)) - E_x f(Y(0)) = E_x \Big[ \int_0^t \Big( \delta(\lambda - \mu) f'(Y(s)) + \frac{1}{2}\delta^2(\lambda + \mu) f''(Y(s)) \Big) ds + f'(0) R(t) \Big].$$
Let $Y$ be a random variable having the stationary distribution of this RBM. It is well known that $Y$ is exponentially distributed with mean $\frac{\delta(\lambda + \mu)}{2(\mu - \lambda)}$, which is positive because $\lambda < \mu$. Picking $f(x) = x$ and initializing $Y(0) \stackrel{d}{=} Y$ above yields $E\big( R(1) \,\big|\, Y(0) \stackrel{d}{=} Y \big) = \delta(\mu - \lambda)$, so for any function $f(x)$ such that $E f(Y) < \infty$ and
$$E \Big| \delta(\lambda - \mu) f'(Y) + \frac{1}{2}\delta^2(\lambda + \mu) f''(Y) \Big| < \infty,$$
we can invoke the Fubini-Tonelli theorem to conclude that
$$E \Big( \delta(\lambda - \mu) f'(Y) + \frac{1}{2}\delta^2(\lambda + \mu) f''(Y) \Big) + f'(0)\delta(\mu - \lambda) = 0. \tag{28}$$
Assume we know (28) is satisfied when $f(x) = Af_h(x)$, a fact that will be verified by Lemma 4 of the following section. Taking expected values with respect to $Y$ in (26), and using the fact that $AG_X f_h(x) = E h(X) - Ah(x)$, we arrive at
$$\begin{aligned} E h(X) - E Ah(Y) &= E\, AG_X f_h(Y) - \Big( E \Big( \delta(\lambda - \mu)(Af_h)'(Y) + \frac{1}{2}\delta^2(\lambda + \mu)(Af_h)''(Y) \Big) + (Af_h)'(0)\delta(\mu - \lambda) \Big) \\ &= \frac{1}{6}\delta^3 E \big( \lambda (Af_h)'''(\xi_1(Y)) - \mu (A\widehat{f}_h)'''(\xi_{-1}(Y)) \big) + E \big( \widetilde{\varepsilon}(Y) + \varepsilon_h(Y) + \varepsilon_f(Y) \big) - (Af_h)'(0)\delta(\mu - \lambda). \end{aligned} \tag{29}$$
We see that just as in the case of a CTMC with an unbounded domain, bounding the right-hand side requires bounds on the finite differences of $f_h(\delta k)$.
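Identity (28) can be sanity-checked for polynomial test functions, since an exponential random variable with mean $m$ has moments $E Y^n = n!\, m^n$. The sketch below is illustrative only: the function name `bar_residual` is hypothetical, and the mean is taken to be $m = \delta(\lambda+\mu)/(2(\mu-\lambda))$, the positive value implied by the drift and variance of the RBM in (27).

```python
from fractions import Fraction as F
from math import factorial

def bar_residual(coeffs, lam, mu, delta):
    """Left-hand side of (28) for the polynomial f(x) = sum_n coeffs[n] * x**n.

    Y is assumed exponential with mean m = delta*(lam+mu) / (2*(mu-lam)).
    """
    m = delta * (lam + mu) / (2 * (mu - lam))
    moment = lambda n: factorial(n) * m ** n          # E Y^n for exponential Y
    # E f'(Y) and E f''(Y), computed term by term from the moments of Y
    e_fp = sum(n * c * moment(n - 1) for n, c in enumerate(coeffs) if n >= 1)
    e_fpp = sum(n * (n - 1) * c * moment(n - 2) for n, c in enumerate(coeffs) if n >= 2)
    fp0 = coeffs[1] if len(coeffs) > 1 else 0         # f'(0)
    return (delta * (lam - mu) * e_fp
            + F(1, 2) * delta ** 2 * (lam + mu) * e_fpp
            + fp0 * delta * (mu - lam))

lam, mu, delta = F(9, 10), F(1), F(1, 10)
print(bar_residual([3, -2, 5, 1, 7], lam, mu, delta))  # exact rational residual
```

The residual vanishes for every polynomial because $\frac{1}{2}\delta^2(\lambda+\mu) = m\,\delta(\mu-\lambda)$, so the interior terms cancel in pairs and the boundary term $f'(0)\delta(\mu-\lambda)$ cancels the linear coefficient.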
3. Difference Bounds for the M/M/1 system

In this section we discuss various ways to establish bounds on the finite differences of the solution to the M/M/1 Poisson equation. In the introduction we claimed that
$$f_h(\delta k) = \int_0^\infty \big( E_{X(0) = \delta k} h(X(t)) - E h(X) \big) dt$$
solves the Poisson equation, so we now verify this fact.

Lemma 3. Consider a CTMC taking values on a set $E \subset \delta\mathbb{Z}^d$ with generator $G_X$ given in (1), and assume that
$$\int_0^\infty \big( E_{\delta k} h(X(t)) - E h(X) \big) dt \ \text{ is finite for all } \delta k \in E. \tag{30}$$
Then $f_h(\delta k) = \int_0^\infty \big( E_{\delta k} h(X(t)) - E h(X) \big) dt$ solves
$$G_X f_h(\delta k) = E h(X) - h(\delta k), \quad \delta k \in E.$$

Lemma 3 is proved by performing a first-step analysis on $\int_\varepsilon^\infty \big( E_{X(0) = \delta k} h(X(t)) - E h(X) \big) dt$ for small values of $\varepsilon$. It is relegated to Section C.2.

In practice, there are several ways to verify that (30) holds. One way is by showing that $\{X(t)\}$ is $h$-exponentially ergodic; i.e., $|E_{\delta k} h(X(t)) - E h(X)| \leq c_1 e^{-c_2 t}$ for some $c_1, c_2 > 0$. This is straightforward when $E$ is finite, but when $E$ is infinite, the usual way to prove this would be to find a Lyapunov function $V(\delta k)$ such that $G_X V(\delta k) \leq -cV(\delta k) + \bar{c}\,1(k \in K)$ for some compact set $K$ and some constants $c, \bar{c} > 0$. We refer the reader to Meyn and Tweedie (1993) for more on exponential ergodicity.
There is another way to verify (30) that is much closer to the spirit of this paper because it relies on finite difference bounds. First, note that
$$\Big| \int_0^\infty \big( E_{\delta k} h(X(t)) - E h(X) \big) dt \Big| \leq \int_0^\infty \sum_{j \in \mathbb{Z}^d} P(X = \delta j) \big| E_{\delta k} h(X(t)) - E_{\delta j} h(X(t)) \big| dt = \sum_{j \in \mathbb{Z}^d} P(X = \delta j) \int_0^\infty \big| E_{\delta k} h(X(t)) - E_{\delta j} h(X(t)) \big| dt,$$
where the last equality follows from the Fubini-Tonelli theorem. It is possible to use synchronous couplings to prove the right-hand side is finite. Let us illustrate this for the M/M/1 system. Let $\{X(t)\}$ be the CTMC and $X$ have the stationary distribution associated with the M/M/1 generator $G_X$ that we introduced in (23) of Section 2.4. Similarly, we let $Y$ have the stationary distribution of the RBM in (27) of the same section. Suppose we prove that for $h \in \mathrm{dLip}(1)$,
$$\int_0^\infty \big| E_{\delta(k+1)} h(X(t)) - E_{\delta k} h(X(t)) \big| dt \leq \frac{\delta(k+1)}{\mu - \lambda}, \quad k \in \mathbb{N}. \tag{31}$$
We can then use a telescoping sum and the triangle inequality to see that
$$\int_0^\infty \big| E_{\delta k} h(X(t)) - E_{\delta j} h(X(t)) \big| dt \leq \sum_{i = k \wedge j}^{k \vee j - 1} \int_0^\infty \big| E_{\delta(i+1)} h(X(t)) - E_{\delta i} h(X(t)) \big| dt \leq \sum_{i = k \wedge j}^{k \vee j - 1} \frac{\delta(i+1)}{\mu - \lambda} \leq \frac{\delta(k + j + 1)}{\mu - \lambda}(k + j),$$
and so
$$\Big| \int_0^\infty \big( E_{\delta k} h(X(t)) - E h(X) \big) dt \Big| \leq \sum_{j \in \mathbb{Z}} P(X = \delta j) \frac{\delta(k + j + 1)}{\mu - \lambda}(k + j).$$
The right-hand side is finite because $E X^2 < \infty$. Thus, (30) is satisfied for our M/M/1 system. We now bound the finite differences of $f_h(\delta k)$ with the help of synchronous couplings. The following result summarizes our bounds.

Lemma 4. For any $h \in \mathrm{dLip}(1)$, (31) holds and consequently, the function $f_h(\delta k)$ in (8) is well defined. Furthermore,
$$|\Delta f_h(\delta k)| \leq \frac{\delta(k+1)}{\mu - \lambda}, \quad \big| \Delta^2 f_h(\delta k) \big| \leq \frac{\delta}{\mu - \lambda}, \quad \text{and} \quad \big| \Delta^3 f_h(\delta k) \big| \leq \frac{2\delta}{\lambda}, \quad k \in \mathbb{N}. \tag{32}$$
We prove the first claim and establish the bound on $|\Delta f_h(\delta k)|$ in Section 3.1. The remaining two bounds are proved in Section 3.3, with the help of the discussion from Section 3.2. Let us now show how Lemma 4 can be combined with the theory developed in Section 2 to bound the approximation error between $Y$ and $X$.

Theorem 2. There exists a constant $C > 0$ such that for $\rho < 1$ and $h \in \mathrm{dLip}(1)$,
$$|E h(X) - E Ah(Y)| \leq C\delta \Big( 1 + \frac{1}{\rho} \Big).$$

Before proving the theorem, let us comment on the possible choices of $\delta$. It is well known that $X$ is well approximated by $Y$ when $\rho \to 1$. Recall that $E X = \delta\rho/(1 - \rho)$. Choosing $\delta = 1$, we see that even though $E X \to \infty$ as $\rho \to 1$, the approximation error $|E X - E Y|$ does not grow. However, because both random variables diverge, we cannot conclude that $X$ converges to $Y$. To ensure convergence of $X$ to $Y$, we recall that $E Y = \frac{\delta(\lambda + \mu)}{2(\mu - \lambda)}$. Choosing $\delta = (1 - \rho)$ ensures that $\{X\}_{\rho < 1}$ and $\{Y\}_{\rho < 1}$ are tight. Lemma 1 and Theorem 2 then imply that $X$ converges to $Y$ in distribution as $\rho \to 1$. As discussed in the introduction, tightness of the prelimit sequence is a sought-after property because, when combined with process-level convergence to some diffusion limit, tightness implies convergence of stationary distributions as well. We will discuss in Section 3.2.1 below how one can use the Poisson equation to establish tightness.
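The remark about $|E X - E Y|$ can be made concrete with the test function $h(x) = x$. The sketch below uses only the two means quoted in the text, $E X = \delta\rho/(1-\rho)$ and $E Y = \delta(\lambda+\mu)/(2(\mu-\lambda))$ (the latter written with $\mu - \lambda$ in the denominator so that it is positive); the function name `mean_gap` is hypothetical.

```python
from fractions import Fraction as F

def mean_gap(lam, mu, delta):
    """Return (E X, E Y, E Y - E X) for the scaled M/M/1 count X and the RBM limit Y."""
    rho = lam / mu
    ex = delta * rho / (1 - rho)                  # mean of delta * (geometric count)
    ey = delta * (lam + mu) / (2 * (mu - lam))    # mean of the exponential limit
    return ex, ey, ey - ex

for lam in (F(1, 2), F(9, 10), F(99, 100)):
    mu = F(1)
    rho = lam / mu
    print("delta = 1:      ", mean_gap(lam, mu, F(1)))
    print("delta = 1 - rho:", mean_gap(lam, mu, 1 - rho))
```

Algebraically the gap is $E Y - E X = \delta/2$ for every $\rho < 1$: with $\delta = 1$ both means diverge while the gap stays at $1/2$, and with $\delta = 1 - \rho$ both means stay bounded and the gap shrinks to zero.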
Proof of Theorem 2. Since $Y$ is exponentially distributed, a consequence of Lemma 4 is that
$$E \Big( \delta(\lambda - \mu)(Af_h)'(Y) + \frac{1}{2}\delta^2(\lambda + \mu)(Af_h)''(Y) \Big) + (Af_h)'(0)\delta(\mu - \lambda) = 0,$$
which is true because (28) is satisfied with $f(x) = Af_h(x)$ there. Consequently, (29) holds, which we recall below:
$$E h(X) - E Ah(Y) = \frac{1}{6}\delta^3 E \big( \lambda (Af_h)'''(\xi_1(Y)) - \mu (A\widehat{f}_h)'''(\xi_{-1}(Y)) \big) + E \big( \widetilde{\varepsilon}(Y) + \varepsilon_h(Y) + \varepsilon_f(Y) \big) - (Af_h)'(0)\delta(\mu - \lambda).$$
We now bound each term on the right side above in absolute value. Using (12) from Theorem 1 and Lemma 4,
$$\frac{1}{6}\delta^3 \lambda \big| (Af_h)'''(\xi_1(Y)) \big| \leq C\lambda \max_{0 \leq i \leq 1} \big| \Delta^3 f_h\big(\delta(k(\xi_1(Y)) + i)\big) \big| \leq C\delta, \qquad \big| (Af_h)'(0)\delta(\mu - \lambda) \big| \leq C\delta,$$
where $k(x) = \lfloor x/\delta \rfloor$. Similarly, but using also Lemma 2 and the fact that $\xi_{-1}(Y) \geq -\delta$,
$$\frac{1}{6}\delta^3 \mu \big| (A\widehat{f}_h)'''(\xi_{-1}(Y)) \big| \leq C\mu \max_{0 \leq i \leq 1} \big| \Delta^3 \widehat{f}_h\big(\delta(k(\xi_{-1}(Y)) + i)\big) \big| \leq C\frac{\delta}{\rho}.$$
Note that $\widetilde{\varepsilon}(Y) = 0$ because the transition rates $\beta_1(\delta k) = \lambda$ and $\beta_{-1}(\delta k) = \mu 1(k > 0)$ are constant for $k \geq 1$. Furthermore,
$$E \varepsilon_h(Y) \leq E \Big( 1\big( Y \in [0, \delta) \big)\, C \max_{0 \leq m \leq 1} \big| \Delta^4 h(\delta m) \big| \Big) \leq C\delta,$$
where in the last inequality we used the fact that $\Delta^4 h(\delta m)$ can be written in terms of $\Delta h(\delta(m + i))$ for $i = 0, 1, 2, 3$, and that $h \in \mathrm{dLip}(1)$. Lastly, $|\varepsilon_f(Y)| \leq C\delta/\rho$ follows from the bound on $|\varepsilon_f(x)|$ from Proposition 2 together with the facts that $\widetilde{A}\beta_1(x) = \lambda$, $\widetilde{A}\beta_{-1}(x) = \mu$, and
$$|\Delta^4 f_h(\delta k)| \leq |\Delta^3 f_h(\delta k)| + |\Delta^3 f_h(\delta(k+1))| \leq 4\delta/\lambda. \qquad \square$$

We now discuss three ways to bound the finite differences of $f_h(\delta k)$ and prove Lemma 4.

Synchronous couplings provide a way to bound $\Delta^a f_h(\delta k)$ that generalizes well even if the CTMC is multidimensional. We now illustrate them for the M/M/1 system.
Recall that $X(t)$ is the number of customers in the system at time $t \geq 0$, scaled by $\delta$. Let $\{X^{(0)}(t)\}$ be a copy of $\{X(t)\}$, and let $\{X^{(1)}(t)\}$ be another CTMC whose transitions we now define. We refer to $\{X^{(i)}(t)\}$ as system $i$. We set $X^{(1)}(0) = X^{(0)}(0) + \delta$, i.e., system 1 starts with one extra customer, and define the joint evolution of the two systems via the following table of transition rates.

Table 1. Transitions of the joint process $\{(X^{(1)}(t), X^{(0)}(t))\}$ in state $(x^{(1)}, x^{(0)})$.

  #   Rate                                   Transition
  1   $\lambda$                              $(x^{(1)} + \delta,\ x^{(0)} + \delta)$
  2   $\mu 1(x^{(0)} > 0)$                   $(x^{(1)} - \delta,\ x^{(0)} - \delta)$
  3   $\mu 1(x^{(1)} > 0,\ x^{(0)} = 0)$     $(x^{(1)} - \delta,\ x^{(0)})$
Let us describe the intuition behind this joint construction. Note that system 1 is also an M/M/1 system. Every customer present in system 0 at $t = 0$, and all newly arriving customers, have an identical counterpart in system 1. The only difference between the two systems is the extra customer initially present in system 1. One may think of this customer as a low-priority customer that gets served (in system 1) only when all other customers have been cleared. The two systems couple once this extra customer is served. We refer to systems 1 and 0 as a synchronous coupling because the two systems are driven by the same underlying stochastic processes, i.e., arrivals and services. We now bound $\Delta f_h(\delta k)$.

For $k \in \mathbb{N}$, define $\tau^{(i)}(\delta k) = \inf\{ t \geq 0 : X^{(i)}(t) = \delta k \}$. From the discussion above, we have
$$\begin{aligned} \Delta f_h(\delta k) &= \int_0^\infty \big( E_{\delta(k+1)} h(X(t)) - E_{\delta k} h(X(t)) \big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big( h(X^{(1)}(t)) - h(X^{(0)}(t)) \Big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Big( h(X^{(0)}(t) + \delta) - h(X^{(0)}(t)) \Big) \Big] dt. \end{aligned} \tag{33}$$
We emphasize that the last equality above is true because systems 0 and 1 always maintain a constant gap of a single customer until they couple. We bound $E_{X^{(0)}(0) = \delta k}\, \tau^{(1)}(0)$ by combining the Lyapunov function $V(\delta k) = k$ with Dynkin's formula. Observe that $V(\delta k)$ satisfies $G_X V(\delta k) = \lambda - \mu < 0$ for $k > 0$, which means that
$$E_{X^{(0)}(0) = \delta k}\, \tau^{(1)}(0) = \frac{E_{X^{(1)}(0) = \delta(k+1)} \big( V(X^{(1)}(0)) - V(X^{(1)}(\tau^{(1)}(0))) \big)}{\mu - \lambda} = \frac{k + 1}{\mu - \lambda}. \tag{34}$$
To see why the equality above is true, we refer the reader to the proof of Theorem 4.3.i of Meyn and Tweedie (1993) (which is a direct application of Dynkin's formula). Combining (34) and the fact that $h \in \mathrm{dLip}(1)$ with (33) proves
$$|\Delta f_h(\delta k)| \leq \frac{\delta(k+1)}{\mu - \lambda}, \quad k \in \mathbb{N}.$$
In fact, we have proved the stronger statement (31).
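The hitting-time formula in (34) can be cross-checked by first-step analysis: the candidate $T(j) = j/(\mu - \lambda)$ for the expected time an M/M/1 system with $j$ customers needs to empty should satisfy $\lambda (T(j+1) - T(j)) + \mu (T(j-1) - T(j)) = -1$ for $j \geq 1$, with $T(0) = 0$. The sketch below is a consistency check under that assumption, not a substitute for the Dynkin argument cited in the text; the function names are hypothetical.

```python
from fractions import Fraction as F

def hitting_time(j, lam, mu):
    """Candidate expected time for j customers to drain to 0: j / (mu - lam)."""
    return F(j) / (mu - lam)

def first_step_residual(j, lam, mu):
    """lam*(T(j+1)-T(j)) + mu*(T(j-1)-T(j)) + 1, which should vanish for j >= 1."""
    T = lambda n: hitting_time(n, lam, mu)
    return lam * (T(j + 1) - T(j)) + mu * (T(j - 1) - T(j)) + 1

lam, mu, delta, k = F(9, 10), F(1), F(1, 10), 3
print(all(first_step_residual(j, lam, mu) == 0 for j in range(1, 50)))
# System 1 starts with k + 1 customers, so the bound on |Delta f_h(delta*k)| is
# delta times the expected coupling time:
print(delta * hitting_time(k + 1, lam, mu))
```

The first-step equation has other solutions as well (adding a multiple of $(\mu/\lambda)^j - 1$ preserves it), so this check confirms consistency with (34) rather than uniqueness.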
It is straightforward to extend the coupling we constructed in the previous section to bound
$$\Delta^2 f_h(\delta k) = \int_0^\infty \big( E_{\delta(k+2)} h(X(t)) - 2E_{\delta(k+1)} h(X(t)) + E_{\delta k} h(X(t)) \big) dt.$$
In addition to systems 0 and 1, we let $\{X^{(2)}(t)\}$ represent system 2, which is an identical copy of system 1 with one additional low-priority customer. The relationship between the three systems is visualized in Figure 1, where we note that $X^{(2)}(0) = X^{(1)}(0) + \delta = X^{(0)}(0) + 2\delta$. The transitions of the joint chain are defined in Table 2 below.

Figure 1. The initial state of systems 0, 1, 2 when system 0 starts with 4 customers. Circles are customers common to all systems. The diamonds and squares represent the extra customers.

Table 2. Transitions of the joint process $\{(X^{(2)}(t), X^{(1)}(t), X^{(0)}(t))\}$ in state $(x^{(2)}, x^{(1)}, x^{(0)})$.

  #   Rate                                   Transition
  1   $\lambda$                              $(x^{(2)} + \delta,\ x^{(1)} + \delta,\ x^{(0)} + \delta)$
  2   $\mu 1(x^{(0)} > 0)$                   $(x^{(2)} - \delta,\ x^{(1)} - \delta,\ x^{(0)} - \delta)$
  3   $\mu 1(x^{(1)} > 0,\ x^{(0)} = 0)$     $(x^{(2)} - \delta,\ x^{(1)} - \delta,\ x^{(0)})$
  4   $\mu 1(x^{(2)} > 0,\ x^{(1)} = 0)$     $(x^{(2)} - \delta,\ x^{(1)},\ x^{(0)})$

Observe that systems 0 and 1 are identical for all $t \geq \tau^{(1)}(0)$. Proceeding similarly to (33), we see that
$$\begin{aligned} \Delta^2 f_h(\delta k) &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big( h(X^{(2)}(t)) - 2h(X^{(1)}(t)) + h(X^{(0)}(t)) \Big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^2 h(X^{(0)}(t)) \Big] dt + \int_0^\infty E_{X^{(1)}(0) = 0} \Big( h(X^{(2)}(t)) - h(X^{(1)}(t)) \Big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^2 h(X^{(0)}(t)) \Big] dt + \Delta f_h(0), \end{aligned} \tag{35}$$
where the second equality follows from the strong Markov property. Provided that $h \in \mathrm{dLip}(1)$ and $|\Delta^2 h(\delta k)| \leq \delta^2$, we combine the above with (34) and our bound on $|\Delta f_h(\delta k)|$ to conclude that
$$\big| \Delta^2 f_h(\delta k) \big| \leq \frac{\delta^2(k+1)}{\mu - \lambda} + \frac{\delta}{\mu - \lambda}. \tag{36}$$
Observe that the bound above is not the same as the bound on $|\Delta^2 f_h(\delta k)|$ in Lemma 4. Furthermore, going from (35) to (36) requires the additional assumption that $|\Delta^2 h(\delta k)| \leq \delta^2$, whereas Lemma 4 only assumes $h \in \mathrm{dLip}(1)$. In fact, the synchronous coupling approach typically requires stronger assumptions on $h(\delta k)$ than are necessary. The remaining bounds in Lemma 4 are proved using approaches that we discuss in Sections 3.2 and 3.3.

The jump from the second to the third difference is almost identical to the jump from the first to the second. We define system 3 as a copy of system 2 with yet another low-priority customer. We omit the transition rate table and simply illustrate the relationship of systems 0–3 in the figure below.

Figure 2. The initial state of systems 0, 1, 2, 3 when system 0 starts with 4 customers. The diamonds, squares, and stars represent the extra customers.

It follows that
$$\begin{aligned} \Delta^3 f_h(\delta k) &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big( h(X^{(3)}(t)) - 3h(X^{(2)}(t)) + 3h(X^{(1)}(t)) - h(X^{(0)}(t)) \Big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^3 h(X^{(0)}(t)) \Big] dt + \int_0^\infty E_{X^{(1)}(0) = 0} \Big( h(X^{(3)}(t)) - 3h(X^{(2)}(t)) + 2h(X^{(1)}(t)) \Big) dt \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^3 h(X^{(0)}(t)) \Big] dt + \Delta^2 f_h(0) - \Delta f_h(0) \\ &= \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^3 h(X^{(0)}(t)) \Big] dt + \int_0^\infty E_{X^{(0)}(0) = 0} \Big[ 1\big( t \leq \tau^{(1)}(0) \big) \Delta^2 h(X^{(0)}(t)) \Big] dt. \end{aligned} \tag{37}$$
The last equality follows from (35). Provided that $h \in \mathrm{dLip}(1)$, $|\Delta^2 h(\delta k)| \leq \delta^2$, and $|\Delta^3 h(\delta k)| \leq \delta^3$, we apply (34) to conclude that
$$\big| \Delta^3 f_h(\delta k) \big| \leq \frac{\delta^3(k+1)}{\mu - \lambda} + \frac{\delta^2}{\mu - \lambda}.$$
In the following section we discuss how the finite differences can be bounded by using the Poisson equation.
Perhaps the most obvious way to access the values of $\Delta^a f_h(\delta k)$ is through the Poisson equation. Recalling the M/M/1 generator in (23), the M/M/1 Poisson equation reads
$$E h(X) - h(\delta k) = \lambda \Delta f_h(\delta k) - \mu \Delta f_h(\delta(k-1)) = \lambda \Delta^2 f_h(\delta(k-1)) - (\mu - \lambda) \Delta f_h(\delta(k-1)), \quad k \geq 1.$$
Rearranging terms, we get
$$\begin{aligned} \Delta^2 f_h(\delta k) &= \frac{1}{\lambda} \big( E h(X) - h(\delta(k+1)) \big) + \frac{\mu - \lambda}{\lambda} \Delta f_h(\delta k), \\ \Delta^3 f_h(\delta k) &= \frac{1}{\lambda} \big( h(\delta(k+1)) - h(\delta(k+2)) \big) + \frac{\mu - \lambda}{\lambda} \Delta^2 f_h(\delta k), \quad k \geq 0. \end{aligned} \tag{38}$$
If $h(0) \neq 0$, we replace $h(\delta k)$ by $h(\delta k) - h(0)$, which has no effect on the solution $f_h(\delta k)$. Therefore, $h(0) = 0$ without loss of generality. Furthermore, if $h \in \mathrm{dLip}(1)$, then the above equations give automatic bounds on $|\Delta^2 f_h(\delta k)|$ and $|\Delta^3 f_h(\delta k)|$ provided that we can bound $|E h(X)| \leq E|X|$ and $|\Delta f_h(\delta k)|$. From our discussion on synchronous couplings, we know that $|\Delta f_h(\delta k)| \leq \delta \frac{k+1}{\mu - \lambda}$. Furthermore, it is well known that $X/\delta$ is geometrically distributed with mean $\rho/(1 - \rho) = \frac{\lambda}{\mu - \lambda}$, where $\rho = \lambda/\mu$, so
$$\big| \Delta^2 f_h(\delta k) \big| \leq \frac{\delta}{\mu - \lambda} + \frac{2\delta(k+1)}{\lambda}, \quad \text{and} \quad \big| \Delta^3 f_h(\delta k) \big| \leq \frac{\delta}{\lambda} + \frac{\mu - \lambda}{\lambda} \big| \Delta^2 f_h(\delta k) \big|. \tag{39}$$
Observe that the bounds above only require that $h \in \mathrm{dLip}(1)$, compared to the additional assumptions of the synchronous coupling bounds.

The problem with using the Poisson equation is that it requires the CTMC to be one-dimensional. When the CTMC is multidimensional, there is more than one second-order difference $\Delta_i \Delta_j f_h(\cdot)$ present in the Poisson equation and it is not possible to isolate a single second-order difference, say $\Delta_1^2 f(\delta k)$, in terms of first-order differences only. Therefore, on its own, the Poisson equation will not yield all the necessary high-order difference bounds. However, when combined with the synchronous coupling approach, it becomes a useful tool because it relates the high-order differences to each other. This idea is used in Braverman (2021).

To bound $\Delta^2 f_h(\delta k)$ via (38) we used a bound on $E|X|$. In the case of the M/M/1 system, the distribution of $X$ is known. However, obtaining a useful upper bound on $E|X|$ is harder for more complicated systems and usually involves using some kind of Lyapunov function. The Poisson equation provides another route to bound $E|X|$. Picking $h(\delta k) = |\delta k|$, the Poisson equation at $k = 0$ becomes
$$\lambda \big( f_h(\delta) - f_h(0) \big) = \lambda \Delta f_h(0) = E|X| = E X,$$
so (33) and (34) imply that $E X = \delta\lambda/(\mu - \lambda) = \delta\rho/(1 - \rho)$. Choosing $\delta$ to be any constant multiple of $1 - \rho$ ensures that $\{X\}_{\rho < 1}$ is tight.

The main takeaway is that the problem of tightness is equivalent to bounding $G_X f_h(\delta k)$ at a single point that corresponds to the fluid equilibrium of the CTMC. At the fluid equilibrium, $G_X f_h(\delta k)$ typically consists of second-order differences of $f_h(\delta k)$, unless the fluid equilibrium lies on the boundary of $\mathrm{supp}(X)$ as in our M/M/1 example.
There is a simple trick based on the strong Markov property that lets us bound $\Delta^a f_h(\delta k)$ for $a > 1$. Consider the coupled systems $X^{(1)}(t)$ and $X^{(0)}(t)$ defined in Section 3.1 and assume $h \in \mathrm{dLip}(1)$. The strong Markov property implies that for any $k \geq 1$,
$$\Delta f_h(\delta k) = \int_0^\infty E_{X^{(0)}(0) = \delta k} \Big[ 1\Big( t \leq \tau^{(0)}(\delta(k-1)) \Big) \Big( h(X^{(1)}(t)) - h(X^{(0)}(t)) \Big) \Big] dt + \Delta f_h(\delta(k-1)). \tag{40}$$
We bring $\Delta f_h(\delta(k-1))$ to the left-hand side and use the fact that $X^{(1)}(t) - X^{(0)}(t) = \delta$ for $t \leq \tau^{(0)}(\delta(k-1))$ to get
$$\big| \Delta^2 f_h(\delta(k-1)) \big| \leq \delta\, E_{X^{(0)}(0) = \delta k}\, \tau^{(0)}(\delta(k-1)).$$
Recall that the Lyapunov function $V(\delta k) = k$ satisfies $G_X V(\delta k) = \lambda - \mu$ for $k > 0$, so that
$$E_{X^{(0)}(0) = \delta k}\, \tau^{(0)}(\delta(k-1)) = \frac{V(\delta k) - E_{X^{(0)}(0) = \delta k} V\big( X^{(0)}(\tau^{(0)}(\delta(k-1))) \big)}{\mu - \lambda} = \frac{1}{\mu - \lambda}.$$
The equalities above follow from Dynkin's formula, just as in (34). We have thus proved the second difference bound in Lemma 4. We point out that the argument above does not assume that $|\Delta^2 h(\delta k)| \leq \delta^2$ like the synchronous coupling approach does, and it also does not require a bound on $E h(X)$ as was needed when we rearranged the Poisson equation.

We can bound $\Delta^3 f_h(\delta k)$ by repeating (40) with $\Delta^2 f_h(\delta k)$ instead of $\Delta f_h(\delta k)$. However, instead we recall (38) and arrive at
$$\big| \Delta^3 f_h(\delta k) \big| = \Big| \frac{1}{\lambda} \big( h(\delta(k+1)) - h(\delta(k+2)) \big) + \frac{\mu - \lambda}{\lambda} \Delta^2 f_h(\delta k) \Big| \leq \frac{2\delta}{\lambda},$$
which completes the proof of Lemma 4. We conclude this section by pointing out that in practice, one often uses a hybrid approach involving all the methods discussed in Sections 3.1, 3.2, and 3.3 to get the best possible bounds.
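The three bounds of Lemma 4 can be checked numerically on a concrete test function, because the M/M/1 Poisson equation determines the first differences recursively: $\lambda \Delta f_h(0) = E h(X) - h(0)$ and $\lambda \Delta f_h(\delta k) = E h(X) - h(\delta k) + \mu \Delta f_h(\delta(k-1))$ for $k \geq 1$. The sketch below is an illustration under stated assumptions: the test function $h(\delta k) = \delta \min(k, c)$ (which has increments bounded by $\delta$), the cap $c$, the truncation level, and the function name `lemma4_check` are all choices made here, not taken from the paper.

```python
from fractions import Fraction as F

def lemma4_check(lam, mu, delta, cap, n_states):
    """Check the three bounds in (32) for h(delta*k) = delta * min(k, cap)."""
    rho = lam / mu
    h = [delta * min(k, cap) for k in range(n_states)]
    # E h(X) = delta * sum_{j=1}^{cap} P(X/delta >= j), with X/delta geometric
    eh = delta * sum(rho ** j for j in range(1, cap + 1))
    d1 = []                                   # d1[k] = Delta f_h(delta*k)
    for k in range(n_states):
        prev = d1[-1] if k > 0 else F(0)
        d1.append((eh - h[k] + mu * prev) / lam)
    d2 = [b - a for a, b in zip(d1, d1[1:])]  # second differences
    d3 = [b - a for a, b in zip(d2, d2[1:])]  # third differences
    ok1 = all(abs(v) <= delta * (k + 1) / (mu - lam) for k, v in enumerate(d1))
    ok2 = all(abs(v) <= delta / (mu - lam) for v in d2)
    ok3 = all(abs(v) <= 2 * delta / lam for v in d3)
    return ok1, ok2, ok3

print(lemma4_check(F(9, 10), F(1), F(1, 10), cap=5, n_states=40))
```

Exact rational arithmetic is used on purpose: the forward recursion multiplies any rounding error by $\mu/\lambda > 1$ at every step, so a floating-point version drifts away from the true solution for large $k$.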
4. Misalignment of Diffusion Synchronous Couplings
We have presented the prelimit approach as a parallel to the diffusion approach for the purposes of bounding $|E h(X) - E h(Y)|$. As we have seen, the main challenge with each approach is to bound the differences/derivatives of the respective Poisson equation solution. In theory, any bound achievable using one approach should be achievable with the other. In practice, there are slight technical differences between working with a discrete-valued CTMC and a diffusion living on a continuum. In this section we illustrate one technical nuance that arises when we use synchronous couplings to bound the third derivatives of $f_{h^*}(x)$ in the diffusion approach. We term this the "misalignment of synchronous couplings".

Let us recall the generic diffusion process $\{Y(t)\}$ living on $\mathbb{R}$ that we defined in (20). We assume for simplicity that the diffusion coefficient $a(x) = a$ for all $x \in \mathbb{R}$ and define the synchronous couplings
$$Y^{(i)}(t) = Y^{(0)}(0) + i\varepsilon + \int_0^t b(Y^{(i)}(s))\,ds + \int_0^t \sqrt{a}\,dW(s), \quad i = 0, 1, 2, 3.$$
The four couplings start at different initial conditions but share the same Brownian motion. Since $f_{h^*}(x)$ is given by (4), it follows that
$$\frac{\partial^3}{\partial x^3} f_{h^*}(x) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^3} \int_0^\infty E_{Y^{(0)}(0) = x} \Big( h^*(Y^{(3)}(t)) - 3h^*(Y^{(2)}(t)) + 3h^*(Y^{(1)}(t)) - h^*(Y^{(0)}(t)) \Big) dt. \tag{41}$$
To show the integral on the right-hand side is finite, one must show that the synchronous couplings converge to one another and characterize the speed at which it happens. Furthermore, the integral must be of order $\varepsilon^3$ for the limit to exist. Let us consider this last point further.

Given a sufficiently differentiable function $g : \mathbb{R} \to \mathbb{R}$, we know that its derivatives can be approximated by finite differences. For instance, Taylor expansion tells us that
$$g'''(x) \approx \frac{g(x''') - 3g(x'') + 3g(x') - g(x)}{\varepsilon^3} \tag{42}$$
when $x''' = x + 3\varepsilon$, $x'' = x + 2\varepsilon$, and $x' = x + \varepsilon$. The precise spacing of $x, x', x'', x'''$ relative to each other is essential for the limit (as $\varepsilon \to 0$) of the right-hand side in (42) to exist. For example, if $x''' = x + 4\varepsilon$, then the numerator is now of order $\varepsilon$ instead of $\varepsilon^3$, and the right-hand side diverges to $\infty$ as $\varepsilon \to$
0. Therefore, one way to show that the integral in(41) is of order ε is to prove that the diffusion couplings maintain the appropriate spacingrelative to each other so that the integrand is of order ε for each t ≥ h ( x ) is smooth enough, the drift b ( x ) is four-times continuously differentiableand k -strongly concave, then (cid:12)(cid:12) h ∗ ( Y (3) ( t )) − h ∗ ( Y (2) ( t )) + 3 h ∗ ( Y (1) ( t )) − h ∗ ( Y (0) ( t )) (cid:12)(cid:12) ≤ ε Ce − kt/ (43) raverman: Prelimit generator comparison approach
almost surely, where the constant $C$ depends on $k$, $h(x)$ and $b(x)$. The above inequality then implies that
$$\lim_{\varepsilon \to 0} \frac{1}{\varepsilon^3} \int_0^\infty E_{Y^{(0)}(0) = x} \Big| h^*(Y^{(3)}(t)) - 3h^*(Y^{(2)}(t)) + 3h^*(Y^{(1)}(t)) - h^*(Y^{(0)}(t)) \Big| dt \leq C/k. \tag{44}$$
Similarly, (43) also holds for $d$-dimensional diffusions with constant diffusion coefficients. Unfortunately, if the assumptions on the drift are violated, e.g. the drift is only Lipschitz-continuous or the diffusion has a reflecting boundary, then (43) no longer holds because the diffusion couplings become misaligned. This misalignment complicates the problem of bounding (41) because one cannot use (43) anymore.

As an example, we now illustrate how this misalignment occurs in the RBM that approximates the M/M/1 system. Recall the RBM
$$Y(t) = Y(0) + \delta(\lambda - \mu)t + \delta\sqrt{\lambda + \mu}\,W(t) + R(t), \quad t \geq 0,$$
and let $Y$ be the random variable having its stationary distribution. It was shown in Harrison and Reiman (1981) that
$$R(t) = -\inf_{0 \leq s \leq t} \Big\{ Y(0) + \delta(\lambda - \mu)s + \delta\sqrt{\lambda + \mu}\,W(s) \Big\}.$$
We wish to bound the third derivative of
$$f_{h^*}(x) = \int_0^\infty \big( E_{Y(0) = x} h^*(Y(t)) - E h^*(Y) \big) dt, \quad x \geq 0.$$
For simplicity, we choose $h^*(x) = x$. Let us define the four coupled processes
$$Y^{(i)}(t) = Y^{(0)}(0) + i\varepsilon + \delta(\lambda - \mu)t + \delta\sqrt{\lambda + \mu}\,W(t) + R^{(i)}(t),$$
where
$$R^{(i)}(t) = -\inf_{0 \leq s \leq t} \Big\{ Y^{(i)}(0) + \delta(\lambda - \mu)s + \delta\sqrt{\lambda + \mu}\,W(s) \Big\}, \quad i = 0, 1, 2, 3. \tag{45}$$
We also define $D(t) = Y^{(3)}(t) - 3Y^{(2)}(t) + 3Y^{(1)}(t) - Y^{(0)}(t)$. It follows that
$$\frac{\partial^3}{\partial x^3} f_{h^*}(x) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon^3} \int_0^\infty E_{Y^{(0)}(0) = x} D(t)\,dt. \tag{46}$$
We define
$$\gamma_1 = \inf\{ t \geq 0 : Y^{(1)}(t) = 3\varepsilon/4 \}, \quad \gamma_2 = \inf\{ t \geq 0 : Y^{(1)}(t) = \varepsilon/4 \}.$$
We will prove at the end of this section that
$$D(t) \leq -\varepsilon/4, \quad \text{for } t \in [\gamma_1, \gamma_2]. \tag{47}$$
We see that (47) violates (43). Furthermore, the expected hitting time of a fixed level by a Brownian motion with drift is well known and implies that $E(\gamma_2 - \gamma_1) = \varepsilon/(2\delta(\mu - \lambda))$. Therefore, the integral in (46) equals
$$\frac{1}{\varepsilon^3} \int_0^\infty E_{Y^{(0)}(0) = x} \big( D(t) 1(t \in [\gamma_1, \gamma_2]) \big) dt + \frac{1}{\varepsilon^3} \int_0^\infty E_{Y^{(0)}(0) = x} \big( D(t) 1(t \notin [\gamma_1, \gamma_2]) \big) dt, \tag{48}$$
and the first term is bounded from above by $-(8\delta(\mu - \lambda)\varepsilon)^{-1}$, which diverges as $\varepsilon \to 0$. Hence, one cannot simply bound $|D(t)|$ and take the limit as $\varepsilon \to 0$, even though $\frac{\partial^3}{\partial x^3} f_{h^*}(x)$ is well defined and the right-hand side of (46) exists. By applying the strong Markov property trick, one can show that the second integral in (48) contains a positive term of order $1/\varepsilon$ that cancels out the first integral.

The main takeaway is that the misaligned synchronous couplings added extra complexity to the problem. In contrast, the analogous analysis using the prelimit approach in Section 3.1.3 was cleaner because the CTMC is restricted to the grid.

In the remainder of this section, we verify (47). By definition,
$$Y^{(i+1)}(t) - Y^{(i)}(t) = \varepsilon + R^{(i+1)}(t) - R^{(i)}(t), \quad i = 0, 1, 2,\ t \geq 0.$$
Now $R^{(i)}(t) = 0$ for $i = 1, 2, 3$ and $t < \inf\{ s \geq 0 : Y^{(1)}(s) = 0 \}$. This implies in particular that
$$Y^{(3)}(t) - Y^{(2)}(t) = Y^{(2)}(t) - Y^{(1)}(t) = \varepsilon, \quad t \in [\gamma_1, \gamma_2],$$
because $\gamma_2 < \inf\{ s \geq 0 : Y^{(1)}(s) = 0 \}$. Thus,
$$D(t) = -\varepsilon + Y^{(1)}(t) - Y^{(0)}(t) = R^{(1)}(t) - R^{(0)}(t) = -R^{(0)}(t), \quad t \in [\gamma_1, \gamma_2].$$
One can check that $R^{(0)}(\gamma_1) = \varepsilon/4$ using the form of $R^{(i)}(t)$ in (45). Similarly, $R^{(0)}(\gamma_2) = 3\varepsilon/4$. Since $R^{(0)}(t)$ is non-decreasing, we have $R^{(0)}(t) \in [\varepsilon/4, 3\varepsilon/4]$ when $t \in [\gamma_1, \gamma_2]$, which proves (47).
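The identity $D(t) = -R^{(0)}(t)$ behind (47) can be reproduced on a discretized path. The sketch below is an illustration only: it replaces the Brownian path by a deterministic, strictly decreasing driving sequence, and it computes the regulator with the standard one-sided Skorokhod form $R(t) = \max\big(0, -\min_{s \leq t} \text{free}(s)\big)$, which is an assumption made here so that $R(0) = 0$ and $R$ is non-decreasing. The function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import accumulate

def reflect(free_path):
    """Discrete one-sided Skorokhod map: returns (Y, R) with Y = free + R >= 0."""
    running_min = list(accumulate(free_path, min))
    R = [max(F(0), -m) for m in running_min]
    Y = [z + r for z, r in zip(free_path, R)]
    return Y, R

def third_difference(x, eps, driving):
    """D = Y3 - 3*Y2 + 3*Y1 - Y0 for four couplings sharing the path `driving`."""
    Ys, Rs = zip(*(reflect([x + i * eps + w for w in driving]) for i in range(4)))
    D = [y3 - 3 * y2 + 3 * y1 - y0 for y0, y1, y2, y3 in zip(*Ys)]
    return D, Rs[0]

eps = F(1)
driving = [-F(n, 8) for n in range(9)]         # common path, falls by eps/8 per step
D, R0 = third_difference(F(0), eps, driving)   # system 0 starts at the boundary
print([str(d) for d in D])                     # D equals -R0 until system 1 hits 0
D_far, _ = third_difference(F(5), eps, driving)
print(set(D_far))                              # no reflection: couplings stay aligned
```

In this example system 1 passes through the levels $3\varepsilon/4$ and $\varepsilon/4$ at steps 2 and 6, and between those steps $D$ stays at or below $-\varepsilon/4$, as (47) states. When all four systems start far from the boundary, no regulator is active and $D$ is identically zero for $h^*(x) = x$.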
5. Conclusion
In this paper we introduced the prelimit generator comparison approach and used the M/M/1 system as a working example. A natural question is whether the approach extends to the Kolmogorov distance. For two random variables $U, U'$, the Kolmogorov distance is defined as
$$d_K(U, U') = \sup_{z \in \mathbb{R}} \big| E\big( 1(U \geq z) \big) - E\big( 1(U' \geq z) \big) \big|.$$
It is well known (e.g., Braverman et al. (2016)) that the discontinuity in the test functions $1(\cdot \geq z)$ makes working with the Kolmogorov distance more difficult than the Wasserstein. Even though we deal with discrete functions and their interpolations, the issue with the discontinuity in $1(\cdot \geq z)$ will still come up in the difference bounds on $f_h(x)$. However, the authors believe that, with minor tweaks, the prelimit approach can be applied in the Kolmogorov distance setting.

Appendix A: The Polynomial $P_k(x)$ and Interpolation in Multiple Dimensions

In this section we prove Theorem 1. We then state and prove Theorem 3, which is a generalization of the theorem to multiple dimensions.
Proof of Theorem 1

Given $f : K \to \mathbb{R}$, for each $k \in \mathbb{Z}$ such that $\delta k \in K_4$ we define
\[
\begin{aligned}
P_k(x) ={}& f(\delta k) + \Big(\frac{x-\delta k}{\delta}\Big)\Big(\Delta - \frac{1}{2}\Delta^2 + \frac{1}{3}\Delta^3\Big) f(\delta k) \\
&+ \frac{1}{2}\Big(\frac{x-\delta k}{\delta}\Big)^2 \big(\Delta^2 - \Delta^3\big) f(\delta k) + \frac{1}{6}\Big(\frac{x-\delta k}{\delta}\Big)^3 \Delta^3 f(\delta k) \\
&- \frac{23}{3}\Big(\frac{x-\delta k}{\delta}\Big)^4 \Delta^4 f(\delta k) + \frac{41}{2}\Big(\frac{x-\delta k}{\delta}\Big)^5 \Delta^4 f(\delta k) \\
&- \frac{55}{3}\Big(\frac{x-\delta k}{\delta}\Big)^6 \Delta^4 f(\delta k) + \frac{11}{2}\Big(\frac{x-\delta k}{\delta}\Big)^7 \Delta^4 f(\delta k), \quad x \in \mathbb{R},
\end{aligned} \tag{49}
\]
where $\Delta f(\delta k) = f(\delta(k+1)) - f(\delta k)$. It is clear from (49) that $P_k(\delta k) = f(\delta k)$, implying (11). Furthermore, it is straightforward to verify that
\[
\frac{\partial^a}{\partial x^a} P_{k-1}(x)\Big|_{x=\delta k} = \frac{\partial^a}{\partial x^a} P_k(x)\Big|_{x=\delta k}, \quad \text{for } a = 0,1,2,3. \tag{50}
\]
The property above implies $Af(x) = P_{k(x)}(x) \in C^3(\mathrm{Conv}(K_4))$. Since $P_k(x) \in C^\infty(\mathbb{R})$, we know $Af(x)$ is infinitely differentiable on $\mathrm{Conv}(K_4) \setminus K_4$. The weights $\alpha^k_{k+i}(x)$ can be read off by combining the coefficients corresponding to $f(\delta(k+i))$ in (49). For example,
\[
\begin{aligned}
\alpha^k_k(x) ={}& 1 - \frac{11}{6}\Big(\frac{x-\delta k}{\delta}\Big) + \Big(\frac{x-\delta k}{\delta}\Big)^2 - \frac{1}{6}\Big(\frac{x-\delta k}{\delta}\Big)^3 \\
&- \frac{23}{3}\Big(\frac{x-\delta k}{\delta}\Big)^4 + \frac{41}{2}\Big(\frac{x-\delta k}{\delta}\Big)^5 - \frac{55}{3}\Big(\frac{x-\delta k}{\delta}\Big)^6 + \frac{11}{2}\Big(\frac{x-\delta k}{\delta}\Big)^7.
\end{aligned}
\]
It is straightforward to check that
\[
\sum_{i=0}^{4} \alpha^k_{k+i}(x) = 1, \quad \alpha^k_k(\delta k) = 1, \quad \text{and} \quad \alpha^k_{k+i}(\delta k) = 0 \ \text{for } 1 \leq i \leq 4.
\]
It is also clear that the weights are degree-7 polynomials in $(x-\delta k)/\delta$ whose coefficients do not depend on $k$ or $\delta$, i.e.,
\[
\alpha^k_{k+i}(x) = J_i\Big(\frac{x-\delta k}{\delta}\Big)
\]
for some polynomial $J_i(\cdot)$. A consequence of this is that for any $x \in \mathbb{R}$,
\[
\alpha^{k+j}_{k+j+i}(x+\delta j) = J_i\Big(\frac{x+\delta j-\delta(k+j)}{\delta}\Big) = J_i\Big(\frac{x-\delta k}{\delta}\Big) = \alpha^k_{k+i}(x), \quad j,k \in \mathbb{Z}, \ 0 \leq i \leq 4. \qquad \square
\]

We now generalize Theorem 1 and define an interpolation operator that can interpolate any function defined on $K \cap \delta\mathbb{Z}^d$ where $K \subset \mathbb{R}^d$ is convex. The interpolator is based on forward differences, but one could also use central or backward differences to accommodate different domain shapes. The following theorem summarizes the key properties we want from it.

Theorem 3.
Let $\{\alpha^k_{k+i} : \mathbb{R} \to \mathbb{R} : k \in \mathbb{Z},\ i = 0,1,2,3,4\}$ be as in Theorem 1 and suppose we are given a convex set $K \subset \mathbb{R}^d$ and a function $f : K \cap \delta\mathbb{Z}^d \to \mathbb{R}$. Letting $i = (i_1,\ldots,i_d) \in \mathbb{Z}^d$, we use the weights to define
\[
\begin{aligned}
Af(x) &= \sum_{i_d=0}^{4} \alpha^{k_d(x)}_{k_d(x)+i_d}(x_d) \cdots \sum_{i_1=0}^{4} \alpha^{k_1(x)}_{k_1(x)+i_1}(x_1)\, f(\delta(k(x)+i)) \\
&= \sum_{i_1,\ldots,i_d=0}^{4} \Big(\prod_{j=1}^{d} \alpha^{k_j(x)}_{k_j(x)+i_j}(x_j)\Big) f(\delta(k(x)+i)), \quad x \in \mathrm{Conv}(K_4),
\end{aligned} \tag{51}
\]
where $k(x) \in \mathbb{Z}^d$ is defined by $k_i(x) = \lfloor x_i/\delta \rfloor$, and
\[
K_4 = \{ x \in K \cap \delta\mathbb{Z}^d : \delta(k(x)+i) \in K \cap \delta\mathbb{Z}^d \text{ for all } 0 \leq i \leq 4e \}.
\]
Then $Af(x) \in C^3(\mathrm{Conv}(K_4))$ and is infinitely differentiable almost everywhere on $\mathrm{Conv}(K_4)$. Additionally,
\[
Af(\delta k) = f(\delta k), \quad \delta k \in K_4 \cap \delta\mathbb{Z}^d, \tag{52}
\]
and there exists a constant $C > 0$ independent of $f(\cdot)$, $x$, and $\delta$, such that
\[
\Big| \frac{\partial^a}{\partial x^a} Af(x) \Big| \leq C \delta^{-\|a\|_1} \max_{\substack{0 \leq i_j \leq 4-a_j \\ j=1,\ldots,d}} \big| \Delta_1^{a_1} \cdots \Delta_d^{a_d} f(\delta(k(x)+i)) \big|, \quad x \in \mathrm{Conv}(K_4), \tag{53}
\]
for $0 \leq \|a\|_1 \leq 3$, and (53) also holds when $\|a\|_1 = 4$ for almost all $x \in \mathrm{Conv}(K_4)$.

Note that for any $J \subset \{1,\ldots,d\}$ and $J^c = \{1,\ldots,d\} \setminus J$, we may rewrite (51) as
\[
Af(x) = \sum_{\substack{i_j=0 \\ j \in J^c}}^{4} \Big(\prod_{j \in J^c} \alpha^{k_j}_{k_j+i_j}(x_j)\Big) \Bigg( \sum_{\substack{i_j=0 \\ j \in J}}^{4} \Big(\prod_{j \in J} \alpha^{k_j}_{k_j+i_j}(x_j)\Big) f(\delta(k+i)) \Bigg). \tag{54}
\]
The representation in (54) will come in handy in a later section. Let us construct the multidimensional analog of $P_k(x)$ from (49) by defining
\[
F_k(x) = \sum_{i_d=0}^{4} \alpha^{k_d}_{k_d+i_d}(x_d) \cdots \sum_{i_1=0}^{4} \alpha^{k_1}_{k_1+i_1}(x_1)\, f(\delta(k+i)), \quad x \in \mathbb{R}^d, \ k \in K_4. \tag{55}
\]
Note that $Af(x)$ defined in (51) satisfies $Af(x) = F_{k(x)}(x)$ for $x \in \mathrm{Conv}(K_4)$. Furthermore, (13) of Theorem 1 implies (52). To prove Theorem 3, it remains to verify the smoothness of $Af(x)$ and (53).

For any $x \in \mathbb{R}^d$ and any set $J \subset \{1,\ldots,d\}$, we write $x_J$ to denote the vector whose $i$th element equals $x_i 1(i \in J)$. The following result is the multidimensional analog of (50) and is proved at the end of this section.

Lemma 5.
Fix $k \in K_4$, and for any $u \in [0,1]^d$, let $\Theta(u) = \{ i : u_i = 1 \}$ and $\Theta(u)^c = \{1,\ldots,d\} \setminus \Theta(u)$. Then for any $0 \leq \|a\|_1 \leq 3$,
\[
\frac{\partial^a}{\partial x^a} F_k(x)\Big|_{x=\delta(k+u)} = \frac{\partial^a}{\partial x^a} F_{k+e_{\Theta(u)}}(x)\Big|_{x=\delta(k+u)}. \tag{56}
\]
Furthermore, there exists a constant $C > 0$ independent of $f(\cdot)$, $k$, and $\delta$ such that for all $0 \leq \|a\|_1 \leq 4$ and all $x \in \mathrm{Conv}(K_4)$,
\[
\Big| \frac{\partial^a}{\partial x^a} F_k(x) \Big| \leq C \delta^{-\|a\|_1} \Bigg( \prod_{j=1}^{d} \Big(1 + \Big|\frac{x_j-\delta k_j}{\delta}\Big|\Big)^{7-a_j} \Bigg) \max_{\substack{0 \leq i_j \leq 4-a_j \\ j=1,\ldots,d}} \big| \Delta_1^{a_1} \cdots \Delta_d^{a_d} f(\delta(k+i)) \big|. \tag{57}
\]

The above lemma proves Theorem 3. Indeed, (56) implies $Af(x) \in C^3(\mathrm{Conv}(K_4))$, and since the weights $\alpha^k_{k+i}(x)$ belong to $C^\infty(\mathbb{R})$, we know $Af(x)$ is infinitely differentiable everywhere except at the points where the different $F_k(x)$ are glued together, i.e., on the set $\{ x \in \mathrm{Conv}(K_4) : x_i \in \delta\mathbb{Z} \text{ for some } i \in \{1,\ldots,d\} \}$, which has Lebesgue measure zero. Furthermore, (53) follows directly from (57). We now prove Lemma 5.

Proof of Lemma 5
Fix $k \in K_4$. We first prove (56). Let $j'$ be an element of $\Theta(u)$. From (54) it follows that
\[
\begin{aligned}
\frac{\partial^a}{\partial x^a} F_k(x)\Big|_{x=\delta(k+u)} ={}& \sum_{\substack{i_j=0 \\ j \in \{1,\ldots,d\}\setminus\{j'\}}}^{4} \Bigg( \prod_{j \in \{1,\ldots,d\}\setminus\{j'\}} \frac{\partial^{a_j}}{\partial x_j^{a_j}} \alpha^{k_j}_{k_j+i_j}(x_j)\Big|_{x_j=\delta(k_j+u_j)} \Bigg) \\
&\times \Bigg( \sum_{i_{j'}=0}^{4} \Big( \frac{\partial^{a_{j'}}}{\partial x_{j'}^{a_{j'}}} \alpha^{k_{j'}}_{k_{j'}+i_{j'}}(x_{j'})\Big|_{x_{j'}=\delta(k_{j'}+1)} \Big) f(\delta(k+i)) \Bigg).
\end{aligned}
\]
Combining (50) with the weighted sum representation of $P_k(x)$ implies
\[
\begin{aligned}
&\sum_{i_{j'}=0}^{4} \Big( \frac{\partial^{a_{j'}}}{\partial x_{j'}^{a_{j'}}} \alpha^{k_{j'}}_{k_{j'}+i_{j'}}(x_{j'})\Big|_{x_{j'}=\delta(k_{j'}+1)} \Big) f(\delta(k+i)) \\
&\quad = \sum_{i_{j'}=0}^{4} \Big( \frac{\partial^{a_{j'}}}{\partial x_{j'}^{a_{j'}}} \alpha^{k_{j'}+1}_{k_{j'}+1+i_{j'}}(x_{j'})\Big|_{x_{j'}=\delta(k_{j'}+1)} \Big) f(\delta(k+i+e_{j'})).
\end{aligned}
\]
Repeating the above procedure for all other elements of $\Theta(u)$, we see that
\[
\begin{aligned}
\frac{\partial^a}{\partial x^a} F_k(x)\Big|_{x=\delta(k+u)} ={}& \sum_{\substack{i_j=0 \\ j \in \Theta(u)^c}}^{4} \Bigg( \prod_{j \in \Theta(u)^c} \frac{\partial^{a_j}}{\partial x_j^{a_j}} \alpha^{k_j}_{k_j+i_j}(x_j)\Big|_{x_j=\delta(k_j+u_j)} \Bigg) \\
&\times \Bigg( \sum_{\substack{i_j=0 \\ j \in \Theta(u)}}^{4} \Big( \prod_{j \in \Theta(u)} \frac{\partial^{a_j}}{\partial x_j^{a_j}} \alpha^{k_j+1}_{k_j+1+i_j}(x_j)\Big|_{x_j=\delta(k_j+1)} \Big) f(\delta(k+i+e_{\Theta(u)})) \Bigg) \\
={}& \frac{\partial^a}{\partial x^a} F_{k+e_{\Theta(u)}}(x)\Big|_{x=\delta(k+u)},
\end{aligned}
\]
which proves (56). It remains to prove the bound on $\big|\frac{\partial^a}{\partial x^a} F_k(x)\big|$ in (57). We know
\[
\frac{\partial^a}{\partial x^a} F_k(x) = \sum_{i_1=0}^{4} \frac{\partial^{a_1}}{\partial x_1^{a_1}} \alpha^{k_1}_{k_1+i_1}(x_1) \cdots \sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)).
\]
By inspecting the form of the one-dimensional $P_k(\cdot)$ in (49), one can check that
\[
\sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)) = \delta^{-a_d} Q^{(d)}\Big(\frac{x_d-\delta k_d}{\delta}\Big),
\]
where $Q^{(d)}(\cdot)$ is a $(7-a_d)$th order polynomial whose coefficients are independent of $\delta$ and depend on $f(\delta(k+i))$ only through
\[
\Delta_d^{a_d} f\big(\delta(k+(i_1,\ldots,i_{d-1},0))\big), \ \Delta_d^{a_d} f\big(\delta(k+(i_1,\ldots,i_{d-1},1))\big), \ \ldots, \ \Delta_d^{a_d} f\big(\delta(k+(i_1,\ldots,i_{d-1},4-a_d))\big).
\]
This implies in particular that
\[
\Big| \sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)) \Big| \leq C \delta^{-a_d} \Big(1 + \Big|\frac{x_d-\delta k_d}{\delta}\Big|\Big)^{7-a_d} \max_{0 \leq i_d \leq 4-a_d} \big| \Delta_d^{a_d} f(\delta(k+i)) \big|.
\]
We now consider
\[
\sum_{i_{d-1}=0}^{4} \frac{\partial^{a_{d-1}}}{\partial x_{d-1}^{a_{d-1}}} \alpha^{k_{d-1}}_{k_{d-1}+i_{d-1}}(x_{d-1}) \Bigg( \sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)) \Bigg).
\]
When viewed as a one-dimensional function of $x_{d-1}$, the above is again a polynomial of order $(7-a_{d-1})$ that depends on the quantity inside the parentheses only through
\[
\Delta_{d-1}^{a_{d-1}} \Bigg( \sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)) \Bigg),
\]
with $i_{d-1} = 0,\ldots,4-a_{d-1}$. Hence,
\[
\begin{aligned}
&\Bigg| \sum_{i_{d-1}=0}^{4} \frac{\partial^{a_{d-1}}}{\partial x_{d-1}^{a_{d-1}}} \alpha^{k_{d-1}}_{k_{d-1}+i_{d-1}}(x_{d-1}) \Bigg( \sum_{i_d=0}^{4} \frac{\partial^{a_d}}{\partial x_d^{a_d}} \alpha^{k_d}_{k_d+i_d}(x_d)\, f(\delta(k+i)) \Bigg) \Bigg| \\
&\quad \leq C \delta^{-a_{d-1}-a_d} \Big(1 + \Big|\frac{x_{d-1}-\delta k_{d-1}}{\delta}\Big|\Big)^{7-a_{d-1}} \Big(1 + \Big|\frac{x_d-\delta k_d}{\delta}\Big|\Big)^{7-a_d} \max_{\substack{0 \leq i_d \leq 4-a_d \\ 0 \leq i_{d-1} \leq 4-a_{d-1}}} \big| \Delta_{d-1}^{a_{d-1}} \Delta_d^{a_d} f(\delta(k+i)) \big|.
\end{aligned}
\]
Repeating this argument along each of the remaining $d-2$ dimensions proves (57). $\square$

Appendix B: Interchange in Multiple Dimensions
In this section we generalize the interchange results to multiple dimensions. Section B.1 contains the result for unbounded domains, while Section B.2 considers the case when the domain is $\delta\mathbb{N}^d$.
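Before the general statements, it may help to see the one-dimensional interchange identity checked numerically: the interpolated generator $AG_X f(x)$ equals $\sum_\ell A\beta_\ell(x)(Af(x+\delta\ell) - Af(x))$ plus an explicit error term $\varepsilon(x)$. The Python sketch below does this in exact arithmetic under several assumptions that are not from the paper: $\delta = 1$, a hypothetical chain with two transition directions and made-up rates `beta`, an arbitrary test function `f`, and the fourth-difference coefficients of $P_k$ taken to be $-23/3$, $41/2$, $-55/3$, $11/2$. The helper names are illustrative.

```python
from fractions import Fraction as F
from math import floor

# Coefficients of t^4..t^7 multiplying the fourth difference in P_k (assumed).
C4 = (F(-23, 3), F(41, 2), F(-55, 3), F(11, 2))

def P(v, t):
    """P_k at t = (x - delta*k)/delta, given v = [f(delta*k), ..., f(delta*(k+4))]."""
    rows = [[F(a) for a in v]]
    for _ in range(4):                       # forward-difference table
        rows.append([b - a for a, b in zip(rows[-1], rows[-1][1:])])
    f0, d1, d2, d3, d4 = (r[0] for r in rows)
    low = f0 + t * (d1 - d2 / 2 + d3 / 3) + t**2 * (d2 - d3) / 2 + t**3 * d3 / 6
    return low + d4 * sum(c * t ** (n + 4) for n, c in enumerate(C4))

def A(g, x):
    """Interpolate a function g on the integers (delta = 1) at real x."""
    k = floor(x)
    return P([g(k + i) for i in range(5)], x - k)

def weights(t):
    """alpha^k_{k+i}(x), i = 0..4, obtained by feeding unit vectors to P."""
    return [P([int(j == i) for j in range(5)], t) for i in range(5)]

# Hypothetical data: jumps of size +1 and -1 with state-dependent rates.
beta = {1: lambda n: F(3), -1: lambda n: F(n * n + 1, 2)}
f = lambda n: F(n**3 - 2 * n)
Gf = lambda n: sum(b(n) * (f(n + l) - f(n)) for l, b in beta.items())

x = F(5, 2)
k, alpha = floor(x), weights(x - floor(x))

lhs = A(Gf, x)                                            # A G_X f(x)
main = sum(A(b, x) * (A(f, x + l) - A(f, x)) for l, b in beta.items())
eps = sum(                                                # error term epsilon(x)
    alpha[i] * (b(k + i) - A(b, x))
    * (f(k + l + i) - f(k + i) - (f(k + l) - f(k)))
    for l, b in beta.items() for i in range(5)
)
print(lhs == main + eps)
```

The identity only uses two properties of the weights, namely that they sum to one and that they are translation invariant, so the check passes for any choice of rates and test function; the constant-rate direction contributes nothing to `eps`.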
B.1. Unbounded Domains
In this section we prove Proposition 1 by proving the more general Proposition 3 stated below. Consider a CTMC living on $\mathbb{Z}^d$ with generator given by (1). Just as we did in Section 2.3, we define $\beta_\ell(\delta k) = q_{\delta k, \delta(k+\ell)}$, but this time, $k, \ell \in \mathbb{Z}^d$. Then
\[
G_X f(\delta k) = \sum_{\ell \in \mathbb{Z}^d} \beta_\ell(\delta k)\big( f(\delta(k+\ell)) - f(\delta k) \big), \quad k \in \mathbb{Z}^d.
\]

Proposition 3.
Fix $f : \delta\mathbb{Z}^d \to \mathbb{R}$ and assume that $G_X f(\delta k)$ is defined on all of $\delta\mathbb{Z}^d$. Assume also that
\[
\sum_{\ell \in \mathbb{Z}^d} \big| \beta_\ell(\delta k)\big( f(\delta(k+\ell)) - f(\delta k) \big) \big| < \infty, \quad k \in \mathbb{Z}^d, \tag{58}
\]
which is trivially satisfied when the number of possible transitions from each state is finite. For $x \in \mathbb{R}^d$ define $k(x) \in \mathbb{Z}^d$ by $k_i(x) = \lfloor x_i/\delta \rfloor$. Then
\[
AG_X f(x) = \sum_{\ell \in \mathbb{Z}^d} A\beta_\ell(x)\big( Af(x+\delta\ell) - Af(x) \big) + \varepsilon(x), \quad x \in \mathbb{R}^d, \tag{59}
\]
where
\[
\begin{aligned}
\varepsilon(x) ={}& \sum_{\ell \in \mathbb{Z}^d} \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j(x)}_{k_j(x)+i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k(x)+i)\big) - A\beta_\ell(x) \Big) \\
&\times \Big( f\big(\delta(k(x)+\ell+i)\big) - f\big(\delta(k(x)+i)\big) - \big( f\big(\delta(k(x)+\ell)\big) - f(\delta k(x)) \big) \Big).
\end{aligned} \tag{60}
\]

Before we prove Proposition 3, let us reconcile the forms of $\varepsilon(x)$ in (60) above and in (18) of Proposition 1. When $d = 1$, (60) equals
\[
\begin{aligned}
&\sum_{\ell \in \mathbb{Z}} \sum_{i=0}^{4} \alpha^{k(x)}_{k(x)+i}(x) \Big( \beta_\ell\big(\delta(k(x)+i)\big) - A\beta_\ell(x) \Big) \\
&\quad \times \Big( f\big(\delta(k(x)+\ell+i)\big) - f\big(\delta(k(x)+i)\big) - \big( f\big(\delta(k(x)+\ell)\big) - f\big(\delta k(x)\big) \big) \Big).
\end{aligned}
\]
Using a telescoping series, we see that if $\ell > 0$, then
\[
\begin{aligned}
&f\big(\delta(k(x)+\ell+i)\big) - f\big(\delta(k(x)+i)\big) - \big( f\big(\delta(k(x)+\ell)\big) - f\big(\delta k(x)\big) \big) \\
&\quad = \sum_{j=0}^{i-1} \big( \Delta f(\delta(k(x)+j+\ell)) - \Delta f(\delta(k(x)+j)) \big) = \sum_{j=0}^{i-1} \sum_{m=0}^{\ell-1} \Delta^2 f(\delta(k(x)+m+j)),
\end{aligned}
\]
and if $\ell < 0$, then
\[
f\big(\delta(k(x)+\ell+i)\big) - f\big(\delta(k(x)+i)\big) - \big( f\big(\delta(k(x)+\ell)\big) - f\big(\delta k(x)\big) \big) = -\sum_{j=0}^{i-1} \sum_{m=\ell}^{-1} \Delta^2 f(\delta(k(x)+m+j)).
\]
Therefore, Proposition 3 implies Proposition 1. When $d > 1$, it is also possible to write (60) as a telescoping series of second-order differences of $f(\delta k)$. We leave this as an exercise in algebra for the interested reader, because it is notationally messy.

Proof of Proposition 3
Fix $x \in \mathbb{R}^d$. We will write $k$ instead of $k(x)$ for convenience. Recall from Theorem 3 that for any function $f : \delta\mathbb{Z}^d \to \mathbb{R}$,
\[
Af(x) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) f\big(\delta(k+i)\big).
\]
It follows that
\[
\begin{aligned}
AG_X f(x) ={}& \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \sum_{\ell \in \mathbb{Z}^d} \beta_\ell\big(\delta(k+i)\big) \Big( f\big(\delta(k+\ell+i)\big) - f\big(\delta(k+i)\big) \Big) \\
={}& \sum_{\ell \in \mathbb{Z}^d} A\beta_\ell(x) \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \Big( f\big(\delta(k+\ell+i)\big) - f\big(\delta(k+i)\big) \Big) && (61) \\
&+ \sum_{\ell \in \mathbb{Z}^d} \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k+i)\big) - A\beta_\ell(x) \Big) \Big( f\big(\delta(k+\ell+i)\big) - f\big(\delta(k+i)\big) \Big). && (62)
\end{aligned}
\]
In the second equality, interchanging the summations is allowed by our assumption that $\sum_{\ell \in \mathbb{Z}^d} | \beta_\ell(\delta k)( f(\delta(k+\ell)) - f(\delta k) ) | < \infty$ and the Fubini-Tonelli theorem. Looking at (61), observe that for each $\ell \in \mathbb{Z}^d$,
\[
\begin{aligned}
&\sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \Big( f\big(\delta(k+\ell+i)\big) - f\big(\delta(k+i)\big) \Big) \\
&\quad = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) f\big(\delta(k+\ell+i)\big) - Af(x) \\
&\quad = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j+\ell_j}_{k_j+\ell_j+i_j}(x_j+\delta\ell_j) \Big) f\big(\delta(k+\ell+i)\big) - Af(x) \\
&\quad = Af(x+\delta\ell) - Af(x),
\end{aligned}
\]
where in the second equality we used the translation invariance property of the weights stated in (15) of Theorem 1. Moving on, we see that (62) equals
\[
\begin{aligned}
&\sum_{\ell \in \mathbb{Z}^d} \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k+i)\big) - A\beta_\ell(x) \Big) \\
&\qquad \times \Big( f\big(\delta(k+\ell+i)\big) - f\big(\delta(k+i)\big) - \big( f\big(\delta(k+\ell)\big) - f(\delta k) \big) \Big) \\
&\quad + \sum_{\ell \in \mathbb{Z}^d} \Big( f\big(\delta(k+\ell)\big) - f(\delta k) \Big) \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k+i)\big) - A\beta_\ell(x) \Big).
\end{aligned}
\]
The second line above equals zero because (14) of Theorem 1 implies $\sum_{i_1,\ldots,i_d=0}^{4} \big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \big) = 1$ and because $A\beta_\ell(x) = \sum_{i_1,\ldots,i_d=0}^{4} \big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \big) \beta_\ell\big(\delta(k+i)\big)$ by definition. Therefore, (62) equals $\varepsilon(x)$. $\square$

B.2. A Bounded Domain
In this section, we prove Proposition 2 by proving the more general Proposition 4 stated below. We focus on the case when the CTMC takes values in $\delta\mathbb{N}^d$. Assume that our CTMC has generator
\[
G_X f(\delta k) = \sum_{\ell \in \mathbb{Z}^d} \beta_\ell(\delta k)\big( f(\delta(k+\ell)) - f(\delta k) \big), \quad k \in \mathbb{N}^d.
\]
Fix $h : \delta\mathbb{N}^d \to \mathbb{R}$ such that
\[
G_X f_h(\delta k) = E h(X) - h(\delta k), \quad k \in \mathbb{N}^d,
\]
has a finite solution $f_h(\delta k)$. Assume there exists some vector $L \in \mathbb{N}^d$ such that $\beta_\ell(\delta k) = 0$ for all $k \in \mathbb{N}^d$ if $\ell_j < -L_j$ for some $1 \leq j \leq d$. The definition of $L$ means that $\ell \geq -L$ is a necessary condition for $\beta_\ell(\delta k)$ to be non-zero for some $k$, so
\[
G_X f(\delta k) = \sum_{\ell \geq -L} \beta_\ell(\delta k)\big( f(\delta(k+\ell)) - f(\delta k) \big), \quad k \in \mathbb{N}^d. \tag{63}
\]
Define $k(x)$ by $k_i(x) = \lfloor x_i/\delta \rfloor$. Recalling the form of the multidimensional interpolator from (51) of Theorem 3, we define
\[
\widetilde{A}\beta_\ell(x) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j(x) \vee L_j}_{k_j(x) \vee L_j + i_j}(x_j) \Big) \beta_\ell\big(\delta(k(x) \vee L + i)\big), \quad x \in \mathbb{R}^d, \ \ell \geq -L,
\]
where $k(x) \vee L$ is understood to be the element-wise maximum. Furthermore, define
\[
\hat{f}_h(\delta k) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee 0}_{k_j \vee 0 + i_j}(\delta k_j) \Big) f_h\big(\delta((k \vee 0) + i)\big), \quad k \in \mathbb{Z}^d,
\]
to be the extension of $f_h(\cdot)$ to all of $\delta\mathbb{Z}^d$. Let $J(k) = \{ j : k_j < L_j \}$ and $J(k)^c = \{1,\ldots,d\} \setminus J(k)$. Recall that $e$ is the vector of ones and that for any $x \in \mathbb{R}^d$ and any set $J \subset \{1,\ldots,d\}$, we write $x_J$ to denote the vector whose $i$th element equals $x_i 1(i \in J)$. The following generalizes Proposition 2.

Proposition 4.
Consider the CTMC defined by the generator in (63). Assume that
\[
\sum_{\ell \geq -L} \big| \beta_\ell(\delta k)\big( f_h(\delta(k+\ell)) - f_h(\delta k) \big) \big| < \infty, \quad k \in \mathbb{N}^d, \tag{64}
\]
which is trivially satisfied when the number of possible transitions from each state is finite. Then
\[
AG_X f_h(x) = \sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x)\big( A\hat{f}_h(x+\delta\ell) - A\hat{f}_h(x) \big) + \tilde{\varepsilon}(x) + \varepsilon_h(x) + \varepsilon_f(x), \quad x \in \mathbb{R}^d_+.
\]
Letting $k = k(x)$, the error terms satisfy
\[
\begin{aligned}
\tilde{\varepsilon}(x) ={}& \sum_{\ell \geq -L} \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k \vee L + i)\big) - \widetilde{A}\beta_\ell(x) \Big) \\
&\times \Big( f_h\big(\delta(k \vee L + \ell + i)\big) - f_h\big(\delta(k \vee L + i)\big) - \big( f_h\big(\delta(k \vee L + \ell)\big) - f_h(\delta(k \vee L)) \big) \Big),
\end{aligned}
\]
\[
|\varepsilon_h(x)| \leq 1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i \leq 4e \\ k \leq m \leq L \\ j \in J(k)}} \big| \Delta_j^4 h\big(\delta(k_{J(k)^c} + i + m_{J(k)})\big) \big|,
\]
\[
\begin{aligned}
|\varepsilon_f(x)| \leq{}& 1(J(k) \neq \emptyset)\, C(L,d) \sum_{\ell \geq -L} \big| \widetilde{A}\beta_\ell(x) \big| \Bigg( \max_{\substack{0 \leq i \leq 4e \\ k \leq m \leq L \\ j \in J(k)}} \big| \Delta_j^4 f_h\big(\delta(k_{J(k)^c} + i + m_{J(k)})\big) \big| \\
&+ \max_{\substack{0 \leq i \leq 8e \\ k+\ell \leq m \leq L+\ell \\ j \in J(k)}} \Big| \Delta_j^4 f_h\Big( \delta(k+\ell)_{J(k)^c} + \delta\big(m_{J(k)} \vee 0\big) + \delta i \Big) \Big| \Bigg).
\end{aligned}
\]

To see why Proposition 4 implies Proposition 2, note that $\{ J(k) \neq \emptyset \} = \{ 0 \leq k < L \}$ when $d = 1$, so the bounds on $|\varepsilon_h(x)|$ and $|\varepsilon_f(x)|$ in Proposition 4 imply the corresponding bounds in Proposition 2. To prove Proposition 4 we need the following two auxiliary results. They are proved in Sections B.2.1 and B.2.2, respectively.

Lemma 6.
Fix $L' \in \mathbb{N}^d$ and for $k' \in \mathbb{Z}^d$ define $J'(k') = \{ j : k'_j < L'_j \}$. There exists a constant $C(L',d) > 0$ such that for any function $f : \delta\mathbb{Z}^d \to \mathbb{R}$, any $x \in \mathbb{R}^d$, and any $k' \in \mathbb{Z}^d$ with $k' \geq -L'$,
\[
\begin{aligned}
&\Bigg| \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k'_j \vee L'_j}_{k'_j \vee L'_j + i_j}(x_j) \Big) f\big(\delta(k' \vee L' + i)\big) - \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k'_j}_{k'_j + i_j}(x_j) \Big) f\big(\delta(k'+i)\big) \Bigg| \\
&\quad \leq 1(J'(k') \neq \emptyset)\, C(L',d) \big( 1 + |x - \delta(k' \vee L')|/\delta \big)^{7d} \max_{\substack{0 \leq i \leq 4e \\ k' \leq m \leq L' \\ j \in J'(k')}} \big| \Delta_j^4 f\big(\delta(k'_{J'(k')^c} + i + m_{J'(k')})\big) \big|.
\end{aligned} \tag{65}
\]

Lemma 7.
Given $f : \delta\mathbb{N}^d \to \mathbb{R}$, define
\[
\hat{f}(\delta k) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee 0}_{k_j \vee 0 + i_j}(\delta k_j) \Big) f\big(\delta((k \vee 0) + i)\big), \quad k \in \mathbb{Z}^d.
\]
Then for any $a \in \mathbb{N}^d$ with $0 \leq \|a\|_1 \leq 4$,
\[
\big| \Delta^a \hat{f}(\delta k) \big| \leq C \big( 1 + |k \wedge 0| \big)^{7d} \max_{0 \leq i \leq 4e} \big| \Delta^a f\big(\delta((k \vee 0) + i)\big) \big|, \quad k \in \mathbb{Z}^d. \tag{66}
\]
Proof of Proposition 4
Throughout the proof we write $k$ instead of $k(x)$ for notational convenience. Assume that $x \geq \delta L$ or, equivalently, $J(k) = \emptyset$. Then
\[
AG_X f_h(x) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) \sum_{\ell \geq -L} \beta_\ell\big(\delta(k+i)\big) \Big( f_h\big(\delta(k+i+\ell)\big) - f_h\big(\delta(k+i)\big) \Big).
\]
The proof of Proposition 3 can be repeated to show that
\[
AG_X f_h(x) = \sum_{\ell \geq -L} A\beta_\ell(x)\big( Af_h(x+\delta\ell) - Af_h(x) \big) + \varepsilon(x),
\]
where $\varepsilon(x)$ is as in Proposition 3. From the definitions of $\tilde{\varepsilon}(x)$, $\widetilde{A}\beta_\ell(x)$, $A\hat{f}_h(x)$, and $A\hat{f}_h(x+\delta\ell)$, it follows that they equal $\varepsilon(x)$, $A\beta_\ell(x)$, $Af_h(x)$, and $Af_h(x+\delta\ell)$, respectively. Therefore,
\[
AG_X f_h(x) = \sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x)\big( A\hat{f}_h(x+\delta\ell) - A\hat{f}_h(x) \big) + \tilde{\varepsilon}(x), \quad J(k) = \emptyset.
\]
We now handle the more involved case when $J(k) \neq \emptyset$. Recall that
\[
Ah(x) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j+i_j}(x_j) \Big) h\big(\delta(k+i)\big)
\]
and define
\[
\widetilde{A}h(x) = \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) h\big(\delta(k \vee L + i)\big), \quad x \in \mathbb{R}^d_+. \tag{67}
\]
Setting $\varepsilon_h(x) = \widetilde{A}h(x) - Ah(x)$, we have
\[
AG_X f_h(x) = E h(X) - Ah(x) = E h(X) - \widetilde{A}h(x) + \varepsilon_h(x).
\]
Using Lemma 6 with $L'$ and $k'$ there being equal to $L$ and $k$, respectively, we get
\[
\begin{aligned}
|\varepsilon_h(x)| &\leq 1(J(k) \neq \emptyset)\, C(L,d) \big( 1 + |x - \delta(k \vee L)|/\delta \big)^{7d} \max_{\substack{0 \leq i \leq 4e \\ k \leq m \leq L \\ j \in J(k)}} \big| \Delta_j^4 h\big(\delta(k_{J(k)^c} + i + m_{J(k)})\big) \big| \\
&\leq 1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i \leq 4e \\ k \leq m \leq L \\ j \in J(k)}} \big| \Delta_j^4 h\big(\delta(k_{J(k)^c} + i + m_{J(k)})\big) \big|.
\end{aligned}
\]
The second inequality follows from the facts that $|x_j - \delta k_j|/\delta = |x_j - \delta k_j(x)|/\delta \leq 1$ for those $j$ such that $k_j \geq L_j$, and $|x_j - \delta L_j|/\delta \leq L_j$ for those $j$ where $k_j < L_j$ because $0 \leq x_j < \delta L_j$. This proves the bound on $|\varepsilon_h(x)|$ from the statement of the proposition.
Next, we observe that
\[
\begin{aligned}
E h(X) - \widetilde{A}h(x) &= \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) \Big( E h(X) - h\big(\delta(k \vee L + i)\big) \Big) \\
&= \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) G_X f_h\big(\delta(k \vee L + i)\big) \\
&= \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) \sum_{\ell \geq -L} \beta_\ell\big(\delta(k \vee L + i)\big) \Big( f_h\big(\delta(k \vee L + \ell + i)\big) - f_h\big(\delta(k \vee L + i)\big) \Big),
\end{aligned}
\]
where in the first equality we used (14) of Theorem 1; i.e., the weights sum to one. The above equals
\[
\begin{aligned}
&\sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x) \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) \Big( f_h\big(\delta(k \vee L + \ell + i)\big) - f_h\big(\delta(k \vee L + i)\big) \Big) && (68) \\
&+ \sum_{\ell \geq -L} \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) \Big( \beta_\ell\big(\delta(k \vee L + i)\big) - \widetilde{A}\beta_\ell(x) \Big) \\
&\qquad \times \Big( f_h\big(\delta(k \vee L + \ell + i)\big) - f_h\big(\delta(k \vee L + i)\big) \Big). && (69)
\end{aligned}
\]
By repeating the argument from the proof of Proposition 3 that we used to show that (62) equals $\varepsilon(x)$, one can check that (69) equals $\tilde{\varepsilon}(x)$. Lastly, we define
\[
\begin{aligned}
\varepsilon_f(x) ={}& \sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x) \Bigg( \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) f_h\big(\delta(k \vee L + \ell + i)\big) - A\hat{f}_h\big(x+\delta\ell\big) \Bigg) \\
&- \sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x) \Bigg( \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) f_h\big(\delta(k \vee L + i)\big) - A\hat{f}_h(x) \Bigg),
\end{aligned}
\]
so that (68) equals $\sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x)\big( A\hat{f}_h(x+\delta\ell) - A\hat{f}_h(x) \big) + \varepsilon_f(x)$, and arrive at
\[
AG_X f_h(x) = \sum_{\ell \geq -L} \widetilde{A}\beta_\ell(x)\big( A\hat{f}_h(x+\delta\ell) - A\hat{f}_h(x) \big) + \tilde{\varepsilon}(x) + \varepsilon_h(x) + \varepsilon_f(x), \quad x \in \mathbb{R}^d_+.
\]
It remains to verify the bound on $|\varepsilon_f(x)|$. Note that $A\hat{f}(x) = Af(x)$ when $x \geq 0$, so
\[
\begin{aligned}
&\Bigg| \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) f_h\big(\delta(k \vee L + i)\big) - A\hat{f}_h(x) \Bigg| \\
&\quad = \Bigg| \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) f_h\big(\delta(k \vee L + i)\big) - \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j}_{k_j + i_j}(x_j) \Big) f_h(\delta(k+i)) \Bigg| \\
&\quad \leq 1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i \leq 4e \\ k \leq m \leq L \\ j \in J(k)}} \big| \Delta_j^4 f_h\big(\delta(k_{J(k)^c} + i + m_{J(k)})\big) \big|,
\end{aligned}
\]
where the inequality follows from using Lemma 6 with $L' = L$ and $k' = k$. Similarly, for any $\ell \geq -L$,
\[
\begin{aligned}
&\Bigg| \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j \vee L_j}_{k_j \vee L_j + i_j}(x_j) \Big) f_h\big(\delta(k \vee L + \ell + i)\big) - A\hat{f}_h\big(x+\delta\ell\big) \Bigg| \\
&\quad = \Bigg| \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{(k_j+\ell_j) \vee (L_j+\ell_j)}_{(k_j+\ell_j) \vee (L_j+\ell_j) + i_j}(x_j+\delta\ell_j) \Big) \hat{f}_h\big(\delta((k+\ell) \vee (L+\ell) + i)\big) \\
&\qquad - \sum_{i_1,\ldots,i_d=0}^{4} \Big( \prod_{j=1}^{d} \alpha^{k_j+\ell_j}_{k_j+\ell_j+i_j}(x_j+\delta\ell_j) \Big) \hat{f}_h(\delta(k+\ell+i)) \Bigg|,
\end{aligned}
\]
where the equality follows from the translational invariance property (15) of Theorem 1 and the fact that $\hat{f}_h(\delta k) = f_h(\delta k)$ for $k \geq 0$. Using Lemma 6 with $L' = L+\ell \geq 0$ and $k' = k+\ell \geq -L'$, and noting that $J'(k') = \{ j : k_j+\ell_j < L_j+\ell_j \} = \{ j : k_j < L_j \} = J(k)$, we see that the quantity above is bounded by
\[
1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i \leq 4e \\ k+\ell \leq m \leq L+\ell \\ j \in J(k)}} \Big| \Delta_j^4 \hat{f}_h\big(\delta((k+\ell)_{J(k)^c} + i + m_{J(k)})\big) \Big|.
\]
Using Lemma 7, we bound the above by
\[
\begin{aligned}
&1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i, i' \leq 4e \\ k+\ell \leq m \leq L+\ell \\ j \in J(k)}} \Big| \Delta_j^4 f_h\Big( \delta\big( ((k+\ell)_{J(k)^c} + i + m_{J(k)}) \vee 0 \big) + \delta i' \Big) \Big| \\
&\quad \leq 1(J(k) \neq \emptyset)\, C(L,d) \max_{\substack{0 \leq i \leq 8e \\ k+\ell \leq m \leq L+\ell \\ j \in J(k)}} \Big| \Delta_j^4 f_h\Big( \delta(k+\ell)_{J(k)^c} + \delta\big(m_{J(k)} \vee 0\big) + \delta i \Big) \Big|.
\end{aligned}
\]
The inequality is true because $(k+\ell)_{J(k)^c} \geq 0$ by the definition of $J(k)$. $\square$

B.2.1. Proof of Lemma 6
Proving Lemma 6 requires yet another technical lemma.
Lemma 8.
Suppose $f : \delta\mathbb{Z} \to \mathbb{R}$ and recall that we introduced $P_k(x) = \sum_{i=0}^{4} \alpha^k_{k+i}(x) f\big(\delta(k+i)\big)$ in Section 2.1. For any $u, \bar{u} \in \mathbb{Z}$ with $u < \bar{u}$, there exists a constant $C = C(\bar{u}-u) > 0$ such that
\[
| P_{\bar{u}}(x) - P_u(x) | \leq C \sum_{j=0}^{\bar{u}-u} \big| \Delta^4 f(\delta(u+j)) \big|, \quad x \in [\delta u, \delta\bar{u}]. \tag{70}
\]

We now prove Lemma 6 and then prove Lemma 8.

Proof of Lemma 6
We used L ′ , k ′ , and J ′ ( k ′ ) in the statement of Lemma 6 to distinguish thevariables from L , k , and J ( k ) in that section. To keep this proof cleaner, we drop the primes anduse L , k , and J ( k ) to represent L ′ , k ′ and J ′ ( k ′ ), respectively. When d = 1, the bound followsimmediately from Lemma 8. Indeed, if − L ≤ k < L , then (cid:12)(cid:12)(cid:12) X i =0 α LL + i ( x ) f (cid:0) δ ( L + i ) (cid:1) − X i =0 α kk + i ( x ) h (cid:0) δ ( k + i ) (cid:1)(cid:12)(cid:12)(cid:12) ≤ C L − k X j =0 (cid:12)(cid:12) ∆ f ( δ ( k + j )) (cid:12)(cid:12) ≤ C ( L ) max k ≤ m ≤ L (cid:12)(cid:12) ∆ f ( δm ) (cid:12)(cid:12) follows from Lemma 8 with k in place of u and L in place of ¯ u .Proving the result when d > j ∈ J ( k ), and theonly added complication is the notational bookkeeping involved. To begin, note that X i ,...,i d =0 (cid:18) d Y j =1 α k j ∨ L j k j ∨ L j + i j ( x j ) (cid:19) f (cid:0) δ ( k ∨ L + i ) (cid:1) − X i ,...,i d =0 (cid:18) d Y j =1 α k j k j + i j ( x j ) (cid:19) f (cid:0) δ ( k + i ) (cid:1) raverman: Prelimit generator comparison approach
Article submitted to
Stochastic Systems ; manuscript no. X i j =0 j ∈ J ( k ) c (cid:18) Y j ∈ J ( k ) c α k j k j + i j ( x j ) (cid:19) × X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) α L j L j + i j ( x j ) (cid:17) f (cid:0) δ ( k ∨ L + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) α k j k j + i j ( x j ) (cid:17) f (cid:0) δ ( k + i ) (cid:1)(cid:19) . (71)Theorem 3 states that the weights α k j k j + i j ( x j ) are degree-7 polynomials in ( x j − δk j ) /δ and thereforethere exists a constant C > (cid:12)(cid:12)(cid:12) α k j k j + i j ( x j ) (cid:12)(cid:12)(cid:12) ≤ C (cid:0) | x j − δk j | /δ (cid:1) = C (cid:0) | x j − δ ( k j ∨ L j ) | /δ (cid:1) , j ∈ J ( k ) c , (72)We bound the interior sum in (71) by writing it as a telescoping series so that Lemma 8 can beapplied to each term in the series. First, we fix those elements i j for which j ∈ J ( k ) c . Next, we let J = | J ( k ) | be the size of J ( k ), and let η (1) < η (2) < . . . < η ( J ) ∈ { , . . . , d } be the elements of J ( k ).Define u (0) = k ∨ L, u ( J ) = k , and for 0 < ℓ < J , define u ( ℓ ) = ( u ( ℓ ) , . . . , u d ( ℓ )) by u j ( ℓ ) = k j , j J ( k ) ,k j , j ∈ J ( k ) and j ≤ η ( ℓ ) ,L j , j ∈ J ( k ) and j > η ( ℓ ) . It follows that X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) α L j L j + i j ( x j ) (cid:17) f (cid:0) δ ( k ∨ L + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) α k j k j + i j ( x j ) (cid:17) f (cid:0) δ ( k + i ) (cid:1)(cid:19) = X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) α u j (0) u j (0)+ i j ( x j ) (cid:17) f (cid:0) δ ( u (0) + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) α u j ( J ) u j ( J )+ i j ( x j ) (cid:17) f (cid:0) δ ( u ( J ) + i ) (cid:1)(cid:19) = J X ℓ =1 4 X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) α u j ( ℓ − u j ( ℓ − i j ( x j ) (cid:17) f (cid:0) δ ( u ( ℓ −
1) + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) (cid:17) f (cid:0) δ ( u ( ℓ ) + i ) (cid:1)(cid:19) . (73)Now fix some ℓ between 1 and J , and consider X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) α u j ( ℓ − u j ( ℓ − i j ( x j ) (cid:17) f (cid:0) δ ( u ( ℓ −
1) + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) (cid:17) f (cid:0) δ ( u ( ℓ ) + i ) (cid:1)(cid:19) = X i j =0 j ∈ J ( k ) (cid:18)(cid:16) Y j ∈ J ( k ) j = η ( ℓ ) α u j ( ℓ − u j ( ℓ − i j ( x j ) (cid:17) α u η ( ℓ ) ( ℓ − u η ( ℓ ) ( ℓ − i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ −
1) + i ) (cid:1) − (cid:16) Y j ∈ J ( k ) j = η ( ℓ ) α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) (cid:17) α u η ( ℓ ) ( ℓ ) u η ( ℓ ) ( ℓ )+ i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ ) + i ) (cid:1)(cid:19) . raverman: Prelimit generator comparison approach Article submitted to
Stochastic Systems ; manuscript no.
By definition, u j ( ℓ −
1) = u j ( ℓ ) for all j = η ( ℓ ). Also, u η ( ℓ ) ( ℓ −
1) = L η ( ℓ ) , and u η ( ℓ ) ( ℓ ) = k η ( ℓ ) .Therefore, the term above equals X i j =0 j ∈ J ( k ) j = η ( ℓ ) (cid:16) Y j ∈ J ( k ) j = η ( ℓ ) α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) (cid:17) × (cid:18) X i η ( ℓ ) =0 α L η ( ℓ ) L η ( ℓ ) + i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ −
1) + i ) (cid:1) − X i η ( ℓ ) =0 α k η ( ℓ ) k η ( ℓ ) + i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ ) + i ) (cid:1)(cid:19) . (74)Since α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) are degree-7 polynomials in ( x j − δu j ( ℓ )) /δ , and − L j ≤ k j < L j for all j ∈ J ( k ),there exists a constant C ( L ) > j ∈ J ( k ), (cid:12)(cid:12)(cid:12) α u j ( ℓ ) u j ( ℓ )+ i j ( x j ) (cid:12)(cid:12)(cid:12) ≤ C (cid:0) | x j − δu j ( ℓ ) | /δ (cid:1) ≤ C ( L ) (cid:0) | x j − δL j | /δ (cid:1) = C ( L ) (cid:0) | x j − δ ( k j ∨ L j ) | /δ (cid:1) . (75)To bound the interior sum in (74), we apply Lemma 8, with k η ( ℓ ) in place of u and L η ( ℓ ) in place of¯ u , to get (cid:12)(cid:12)(cid:12)(cid:12) X i η ( ℓ ) =0 α L η ( ℓ ) L η ( ℓ ) + i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ −
1) + i ) (cid:1) − X i η ( ℓ ) =0 α k η ( ℓ ) k η ( ℓ ) + i η ( ℓ ) ( x η ( ℓ ) ) f (cid:0) δ ( u ( ℓ ) + i ) (cid:1)(cid:12)(cid:12)(cid:12)(cid:12) ≤ C ( L ) L η ( ℓ ) − k η ( ℓ ) X m =0 (cid:12)(cid:12) ∆ η ( ℓ ) f (cid:0) δ ( u ( ℓ ) + i { ,...,d }\{ η ( ℓ ) } + me { η ( ℓ ) } ) (cid:1)(cid:12)(cid:12) ≤ C ( L ) max ≤ i ≤ ek ≤ m ≤ Lj ∈ J ( k ) (cid:12)(cid:12) ∆ j f (cid:0) δ ( k J ( k ) c + i + m J ( k ) ) (cid:1)(cid:12)(cid:12) . The last inequality is there to make things simpler by having a uniform upper bound that does notdepend on ℓ . Combining the upper bound above with (72) and (75) proves the result. (cid:3) Proof of Lemma 8
Let us first assume that $\bar u = u + 1$. We have already shown in (50) of the proof of Theorem 1 that
\[
\frac{\partial^v}{\partial x^v} P_{\bar u}(x) \Big|_{x=\delta \bar u} = \frac{\partial^v}{\partial x^v} P_{u}(x) \Big|_{x=\delta \bar u}, \quad \text{for } v = 0, 1, 2, 3.
\]
By performing a fourth-order Taylor expansion around the point $x = \delta \bar u$, we see that
\[
| P_{\bar u}(x) - P_u(x) | \le C \delta^{4} \Big( \sup_{x \in [\delta u, \delta \bar u]} \Big| \frac{\partial^4}{\partial x^4} P_{\bar u}(x) \Big| + \sup_{x \in [\delta u, \delta \bar u]} \Big| \frac{\partial^4}{\partial x^4} P_{u}(x) \Big| \Big), \quad x \in [\delta u, \delta \bar u].
\]
Since $\alpha^{k}_{k+i}(x)$ are degree-7 polynomials in $(x - \delta k)/\delta$ whose coefficients do not depend on $k$ or $\delta$, it follows that
\[
\Big| \frac{\partial^4}{\partial x^4} P_{\bar u}(x) \Big| \le C \delta^{-4} \Big( 1 + \Big| \frac{x - \delta \bar u}{\delta} \Big| \Big)^{3} \big| \Delta^4 f(\delta \bar u) \big|,
\qquad
\Big| \frac{\partial^4}{\partial x^4} P_{u}(x) \Big| \le C \delta^{-4} \Big( 1 + \Big| \frac{x - \delta u}{\delta} \Big| \Big)^{3} \big| \Delta^4 f(\delta u) \big|,
\]
so that
\[
| P_{\bar u}(x) - P_u(x) | \le C \big( \big| \Delta^4 f(\delta \bar u) \big| + \big| \Delta^4 f(\delta u) \big| \big), \quad x \in [\delta u, \delta \bar u].
\]
The general case $\bar u > u + 1$ follows from the triangle inequality:
\begin{align*}
| P_{\bar u}(x) - P_u(x) |
&\le \sum_{j=1}^{\bar u - u} \bigg| \sum_{i=0}^{4} \alpha^{u+j}_{u+j+i}(x)\, f\big(\delta(u+j+i)\big) - \sum_{i=0}^{4} \alpha^{u+j-1}_{u+j-1+i}(x)\, f\big(\delta(u+j-1+i)\big) \bigg| \\
&\le C \sum_{j=0}^{\bar u - u} \big| \Delta^4 f(\delta(u+j)) \big|. \qquad □
\end{align*}

B.2.2. Proof of Lemma 7
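The proof of Lemma 7 that follows converts finite differences on the lattice $\delta\mathbb{Z}$ into derivatives of a smooth function. As an informal numerical illustration (not taken from the paper), the Python sketch below implements the forward difference $\Delta^v$ and checks the classical mean-value property $\Delta^v g(\delta k) = \delta^v g^{(v)}(\xi)$ for some $\xi \in [\delta k, \delta(k+v)]$. It uses the test function $g = \exp$, chosen here only because every derivative of $\exp$ is again $\exp$. The helper names `forward_difference` and `mean_value_bounds` are hypothetical.

```python
import math


def forward_difference(g, delta, k, v):
    """v-th forward difference of g on the lattice delta*Z, evaluated at delta*k.

    Uses the binomial expansion
    Delta^v g(delta k) = sum_{i=0}^{v} (-1)^(v-i) C(v, i) g(delta (k + i)).
    """
    return sum((-1) ** (v - i) * math.comb(v, i) * g(delta * (k + i))
               for i in range(v + 1))


def mean_value_bounds(delta, k, v):
    """Bounds on Delta^v exp(delta k) implied by Delta^v g = delta^v g^(v)(xi).

    exp is increasing and equals all of its derivatives, so any xi in
    [delta k, delta (k + v)] yields these lower and upper bounds.
    """
    lower = delta ** v * math.exp(delta * k)
    upper = delta ** v * math.exp(delta * (k + v))
    return lower, upper


if __name__ == "__main__":
    # Illustrative parameter choices (assumptions, not values from the paper).
    for delta in (0.5, 0.1):
        for v in range(1, 5):
            d = forward_difference(math.exp, delta, 2, v)
            lo, hi = mean_value_bounds(delta, 2, v)
            print(f"delta={delta}, v={v}: {lo:.3e} <= {d:.3e} <= {hi:.3e}")
```

The sketch only covers the one-dimensional case with orders $1 \le v \le 4$; a multi-dimensional difference $\Delta^a$ would be obtained by applying the same operator coordinate by coordinate.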
Proof of Lemma 7
Note that b f ( δk ) = X i ,...,i d =0 (cid:18) d Y j =1 α k j ∨ k j ∨ i j ( δk j ) (cid:19) f ( δ ( k ∨ i )) , k ∈ Z d can be interpreted as the restriction of A b f ( x ) to x ∈ δ Z d . Since A b f ( x ) = F k ( x ) ∨ ( x ), where k ( x ) ∨ F k ( x ) is as in (55), the bound in (57) of Lemma 5 implies (cid:12)(cid:12) b f ( δk ) (cid:12)(cid:12) ≤ C | ( k ∨ | max ≤ i ≤ e | f ( δ ( k ∨ i )) | . Now assume 0 < k a k ≤ (cid:12)(cid:12) ∆ a b f ( δk ) (cid:12)(cid:12) ≤ Cδ k a k Z [ δk,δ ( k + a )] (cid:12)(cid:12)(cid:12) ∂ a ∂x a A b f ( x ) (cid:12)(cid:12)(cid:12) dx. (76)Then we can again use the fact that A b f ( x ) = F k ( x ) ∨ ( x ) together with (57) of Lemma 5 to concludethat the quantity above is bounded bysup x ∈ [ δk,δ ( k + a )] C (cid:18) d Y j =1 (cid:16) (cid:12)(cid:12)(cid:12) x j − δ ( k j ( x ) ∨ δ (cid:12)(cid:12)(cid:12)(cid:17) − a j (cid:19) max ≤ i j ≤ − a j j =1 ,...,d | ∆ a f ( δ (( k ( x ) ∨
0) + i )) |≤ C (cid:18) d Y j =1 (cid:0) | k j ∧ | (cid:1) − a j (cid:19) max ≤ i ≤ e − ak ≤ m ≤ k + a | ∆ a f ( δ (( m ∨
0) + i )) |≤ C (cid:0) | k ∧ | (cid:1) max ≤ i ≤ e | ∆ a f ( δ (( k ∨
0) + i )) | . Let us now verify (76). Suppose g : R → R is three times continuously differentiable with an abso-lutely continuous third derivative, and let b g : δ Z → R be the restriction of g ( x ) to δ Z . We first provethat for 1 ≤ v ≤
4,
\[
\Delta^v \hat g(\delta k) = \delta^{v} \int_0^{v\delta} c_v(u)\, \frac{\partial^v}{\partial x^v} g(\delta k + u)\, du, \tag{77}
\]
where $c_v(x)$ is a function such that $\sup_{x \in [0, \delta v]} |c_v(x)| \le C$ for some constant $C > 0$ independent of $k$, $g(x)$, and $\delta$. Suppose $v = 4$. Using Taylor expansion, we have
\begin{align*}
\Delta^4 \hat g(\delta k)
&= \Delta^3 \big( g(\delta(k+1)) - g(\delta k) \big) \\
&= \Delta^3 \Big( \delta g'(\delta k) + \frac{1}{2} \delta^2 g''(\delta k) + \frac{1}{6} \delta^3 g'''(\delta k) + \frac{1}{6} \int_0^{\delta} g^{(4)}(\delta k + u)(\delta - u)^3\, du \Big) \\
&= \Delta^3 \Big( \delta g'(\delta k) + \frac{1}{2} \delta^2 g''(\delta k) + \frac{1}{6} \delta^3 g'''(\delta k) \Big) \\
&\quad + \Delta^2 \Big( \frac{1}{6} \int_0^{\delta} g^{(4)}(\delta(k+1) + u)(\delta - u)^3\, du - \frac{1}{6} \int_0^{\delta} g^{(4)}(\delta k + u)(\delta - u)^3\, du \Big).
\end{align*}
One may continue to manipulate the right-hand side in a similar manner to reach the desired form in (77). For $1 \le v \le 3$, (77) is verified similarly. Now since $\Delta^a \hat f(\delta k) = \Delta_d^{a_d} \cdots \Delta_1^{a_1} \hat f(\delta k)$, we can apply (77) along each dimension $j$ where $a_j > 0$ to obtain (76). □

Appendix C: Proofs of Miscellaneous Technical Lemmas
C.1. Proof of Lemma 1
Proof of Lemma 1
Since $h^*(X) = h(X)$, the triangle inequality implies that
\[
\big| \mathbb{E} h^*(X) - \mathbb{E} h^*(Y) \big| \le \big| \mathbb{E} h(X) - \mathbb{E} Ah(Y) \big| + \big| \mathbb{E} Ah(Y) - \mathbb{E} h^*(Y) \big|.
\]
For $x \in \mathbb{R}^d$ let $k(x) \in \mathbb{Z}^d$ be defined by $k_i(x) = \lfloor x_i/\delta \rfloor$. Since $h^*(\delta k(x)) = h(\delta k(x)) = Ah(\delta k(x))$ and $|x_i - \delta k_i(x)| \le \delta$,
\begin{align*}
\big| \mathbb{E} Ah(Y) - \mathbb{E} h^*(Y) \big|
&\le \big| \mathbb{E} Ah(Y) - \mathbb{E} h^*(\delta k(Y)) \big| + \big| \mathbb{E} h^*(\delta k(Y)) - \mathbb{E} h^*(Y) \big| \\
&= \big| \mathbb{E} Ah(Y) - \mathbb{E} Ah(\delta k(Y)) \big| + \big| \mathbb{E} h^*(\delta k(Y)) - \mathbb{E} h^*(Y) \big| \\
&\le C\delta \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} \Big| \frac{\partial}{\partial x_j} Ah(x) \Big| + C\delta \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} \Big| \frac{\partial}{\partial x_j} h^*(x) \Big|.
\end{align*}
Using the bound in (53) from Theorem 3, it follows that
\[
\sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} \Big| \frac{\partial}{\partial x_j} Ah(x) \Big|
\le C\delta^{-1} \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} | \Delta_j h(\delta k(x)) |
= C\delta^{-1} \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} | \Delta_j h^*(\delta k(x)) |
\le C \sup_{\substack{1 \le j \le d \\ x \in \mathbb{R}^d}} \Big| \frac{\partial}{\partial x_j} h^*(x) \Big|.
\]
This proves the first claim. The other two claims follow by observing that if $h^* \in \mathrm{Lip}(1)$, then the mean-value theorem implies $h \in \mathrm{dLip}(1)$, and if $h^* \in \mathcal{M}$, then (77) can be used to show that $h \in \mathcal{M}_{\mathrm{disc}}(C')$ for some $C' > 0$. □

C.2. Proof of Lemma 3
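The proof in this subsection shows that $g(\delta k) = \int_0^\infty \big( \mathbb{E}_{\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt$ solves the Poisson equation $G_X g(\delta k) = \mathbb{E} h(X) - h(\delta k)$. As a hedged sanity check, the following Python sketch evaluates that integral numerically for a small birth-death chain and compares both sides. Everything concrete here is an assumption made for illustration only: the three-state chain and its rates, the choice $\delta = 1$, the test function $h(k) = k^2$, the RK4 integrator, the truncation horizon, and all function names.

```python
def build_generator(birth, death):
    """Rate matrix of a birth-death chain on {0, ..., n-1}; rows sum to zero."""
    n = len(birth)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if i + 1 < n:
            Q[i][i + 1] = birth[i]
        if i > 0:
            Q[i][i - 1] = death[i]
        Q[i][i] = -sum(Q[i])  # diagonal is still 0.0 here, so this sums the off-diagonals
    return Q


def stationary_distribution(birth, death):
    """Stationary law from detailed balance: pi[i+1] * death[i+1] = pi[i] * birth[i]."""
    w = [1.0]
    for i in range(len(birth) - 1):
        w.append(w[-1] * birth[i] / death[i + 1])
    total = sum(w)
    return [x / total for x in w]


def matvec(Q, u):
    return [sum(q * x for q, x in zip(row, u)) for row in Q]


def poisson_solution(Q, h, pi, horizon=30.0, dt=0.005):
    """Approximate g = int_0^horizon (E_k h(X(t)) - E h(X)) dt with RK4.

    u(t) = E_k h(X(t)) - E h(X) solves u' = Q u, and g' = u with g(0) = 0.
    """
    mean_h = sum(p * x for p, x in zip(pi, h))
    u = [x - mean_h for x in h]
    g = [0.0] * len(h)
    for _ in range(int(round(horizon / dt))):
        k1 = matvec(Q, u)
        u2 = [a + 0.5 * dt * b for a, b in zip(u, k1)]
        k2 = matvec(Q, u2)
        u3 = [a + 0.5 * dt * b for a, b in zip(u, k2)]
        k3 = matvec(Q, u3)
        u4 = [a + dt * b for a, b in zip(u, k3)]
        k4 = matvec(Q, u4)
        # RK4 update of the running integral g uses the stage values of u.
        g = [gi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for gi, a, b, c, d in zip(g, u, u2, u3, u4)]
        u = [a + dt / 6.0 * (b + 2 * c + 2 * d + e)
             for a, b, c, d, e in zip(u, k1, k2, k3, k4)]
    return g, mean_h


def poisson_residual(birth, death, h):
    """Largest absolute gap between G g and E h(X) - h over all states."""
    Q = build_generator(birth, death)
    pi = stationary_distribution(birth, death)
    g, mean_h = poisson_solution(Q, h, pi)
    Gg = matvec(Q, g)
    return max(abs(a - (mean_h - x)) for a, x in zip(Gg, h))


if __name__ == "__main__":
    birth = [1.0, 0.8, 0.0]  # birth[i]: rate i -> i+1 (last entry unused)
    death = [0.0, 1.5, 2.0]  # death[i]: rate i -> i-1 (first entry unused)
    print(poisson_residual(birth, death, h=[0.0, 1.0, 4.0]))
```

The residual printed by the script should be close to zero, up to time-discretization and truncation error. A finite state space is used so that the integral converges exponentially fast; the lemma itself concerns a chain on $\delta\mathbb{Z}^d$.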
Proof of Lemma 3
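For convenience, define
\[
g(\delta k) = \int_0^\infty \big( \mathbb{E}_{\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt, \quad k \in \mathbb{Z}^d.
\]
We now show that $g(\delta k)$ solves the Poisson equation. For $\varepsilon > 0$ and $k \in \mathbb{Z}^d$, let $J_\varepsilon(k)$ be the number of jumps made by $\{X(t)\}$ in the interval $[0, \varepsilon]$ given $X(0) = \delta k$. Also, let $r = \sum_{k' \in \mathbb{Z}^d} q_{k,k'}$. Since the inter-jump times of $\{X(t)\}$ are exponentially distributed, it follows that
\[
\mathbb{P}(J_\varepsilon(k) = j) =
\begin{cases}
1 - r\varepsilon + o(\varepsilon), & j = 0, \\
r\varepsilon + o(\varepsilon), & j = 1, \\
o(\varepsilon), & j > 1,
\end{cases}
\]
where $o(\varepsilon)$ is a quantity such that $o(\varepsilon)/\varepsilon \to 0$ as $\varepsilon \to 0$. By considering the jumps made on $[0, \varepsilon]$, we see that
\begin{align*}
\int_\varepsilon^\infty \big( \mathbb{E}_{X(0)=\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt
&= (1 - r\varepsilon) \int_\varepsilon^\infty \big( \mathbb{E}_{X(\varepsilon)=\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt \\
&\quad + \varepsilon \sum_{k' \in \mathbb{Z}^d} q_{k,k'} \int_\varepsilon^\infty \big( \mathbb{E}_{X(\varepsilon)=\delta k'} h(X(t)) - \mathbb{E} h(X) \big)\, dt + o(\varepsilon).
\end{align*}
Therefore,
\begin{align*}
g(\delta k)
&= \int_0^\varepsilon \big( \mathbb{E}_{\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt + \int_\varepsilon^\infty \big( \mathbb{E}_{X(0)=\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt \\
&= \int_0^\varepsilon \big( \mathbb{E}_{\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt + (1 - r\varepsilon)\, g(\delta k) + \varepsilon \sum_{k' \in \mathbb{Z}^d} q_{k,k'}\, g(\delta k') + o(\varepsilon).
\end{align*}
Subtracting $g(\delta k)$ from both sides, dividing by $\varepsilon$, and letting $\varepsilon \to 0$ (so that $\frac{1}{\varepsilon} \int_0^\varepsilon \big( \mathbb{E}_{\delta k} h(X(t)) - \mathbb{E} h(X) \big)\, dt \to h(\delta k) - \mathbb{E} h(X)$), we conclude that
\[
G_X g(\delta k) = \mathbb{E} h(X) - h(\delta k). \qquad □
\]

Acknowledgments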
The author would like to thank Han Liang Gan for stimulating discussions during the early stages of this work, as well as Robert Bray and Shane Henderson for providing feedback on early drafts.