Dickman approximation in simulation, summations and perpetuities
arXiv preprint [math.PR]
Chinmoy Bhattacharjee, Larry Goldstein ∗ November 26, 2018
Abstract
The generalized Dickman distribution $\mathcal{D}_\theta$ with parameter $\theta > 0$ is the unique solution of the distributional equality $W =_d W^*$, where
$$W^* =_d U^{1/\theta}(W+1), \qquad (1)$$
with $W$ non-negative with probability one, $U \sim \mathcal{U}[0,1]$ independent of $W$, and $=_d$ denoting equality in distribution. Members of this family appear in number theory, stochastic geometry, perpetuities and the study of algorithms. We obtain bounds in Wasserstein type distances between $\mathcal{D}_\theta$ and the distribution of
$$W_n = \frac{1}{n}\sum_{k=1}^n Y_k B_k \qquad (2)$$
where $B_1,\dots,B_n, Y_1,\dots,Y_n$ are independent with $B_k \sim \mathrm{Ber}(1/k)$, $E[Y_k] = k$, $\mathrm{Var}(Y_k) = \sigma_k^2$, and provide an application to the minimal directed spanning tree in $\mathbb{R}^2$; we also obtain such bounds when the Bernoulli variables in (2) are replaced by Poissons. We also give simple proofs and provide bounds with optimal rates for the Dickman convergence of weighted sums, arising in probabilistic number theory, of the form
$$S_n = \frac{1}{\log(p_n)}\sum_{k=1}^n X_k \log(p_k)$$
where $(p_k)_{k\ge1}$ is an enumeration of the prime numbers in increasing order and $X_k$ is geometric with parameter $1 - 1/p_k$, Bernoulli with success probability $1/(1+p_k)$, or Poisson with mean $\lambda_k$.

In addition, we broaden the class of generalized Dickman distributions by studying the fixed points of the transformation
$$s(W^*) =_d U^{1/\theta} s(W+1)$$
generalizing (1), which allows the use of non-identity utility functions $s(\cdot)$ in Vervaat perpetuities. We obtain distributional bounds for recursive methods that can be used to simulate from this family.

MSC 2010 subject classifications: Primary 60F05, 60E99, 91B16.

Key words and phrases: weighted Bernoulli sums, delay equation, primes, utility, distributional approximation.

* This work was partially supported by NSA grant H98230-15-1-0250.

1 Introduction
The Dickman distribution $\mathcal{D}$ first made its appearance in [16] in the context of number theory, for counting the number of integers below a fixed threshold whose prime factors lie below a given upper bound; see the more recent work [27] for a readable explanation of how the Dickman distribution arises there. Members from the broader class of generalized Dickman distributions $\mathcal{D}_\theta$ for $\theta > 0$, of which $\mathcal{D} = \mathcal{D}_1$, have since been used to approximate counts in logarithmic combinatorial structures, including permutations and partitions in [6], and more generally for the quasi-logarithmic class considered in [7], for the weighted sum of edges connecting vertices to the origin in minimal directed spanning trees in [29], and for certain weighted sums of independent random variables in [28]. Simulation of the generalized Dickman distribution has been considered in [15], and in connection with the Quickselect sorting algorithm in [24] and [20].

Following [20], for a given $\theta > 0$ and a non-negative random variable $W$, define the $\theta$-Dickman bias distribution of $W$ by
$$W^* =_d U^{1/\theta}(W+1), \qquad (3)$$
where $U \sim \mathcal{U}[0,1]$ is independent of $W$, and $=_d$ denotes equality in distribution. Though the density of $\mathcal{D}_\theta$ can presently be given only by specifying it somewhat indirectly as a certain solution to a differential delay equation, it is well known [15] that the distributions $\mathcal{D}_\theta$ are characterized by satisfying $W^* =_d W$ uniquely, that is, $\mathcal{D}_\theta$ is the unique fixed point of the distributional transformation (3). Indeed, this property is the basis for simulating from this family using the recursion
$$W_{n+1} = U_n^{1/\theta}(W_n + 1) \quad \text{for } n \ge 0, \text{ with } W_0 = 0, \qquad (4)$$
where $U_m, m \ge 0$ are independent $\mathcal{U}[0,1]$ random variables and $U_n$ is independent of $W_n$; see [15].

Generally, distributional characterizations and their associated transformations, such as (3), provide an additional avenue to study distributions and their approximation, and have been considered for the normal [22], the exponential [26], and various other distributions that may be less well known, such as one arising in the study of the degrees of vertices in certain preferential attachment graphs; see [25].

In the following, $D_\theta$ will denote a $\mathcal{D}_\theta$ distributed random variable, where the subscript may be dropped when equal to 1. In [21], the upper bound
$$d_1(W, D_\theta) \le (1+\theta)\, d_1(W, W^*) \qquad (5)$$
for the Wasserstein distance between a non-negative random variable $W$ and $\mathcal{D}_\theta$ was proved via Stein's method, where
$$d_1(X,Y) = \sup_{h \in \mathrm{Lip}_1} |Eh(X) - Eh(Y)| \qquad (6)$$
with
$$\mathrm{Lip}_\alpha = \{h : |h(x) - h(y)| \le \alpha|x-y|\} \quad \text{for } \alpha \ge 0. \qquad (7)$$
We also apply the fact that alternatively one can write
$$d_1(X,Y) = \inf E|X - Y|, \qquad (8)$$
where the infimum is over all joint distributions having the given $X, Y$ marginals. The infimum is achieved for variables taking values in any Polish space, see e.g. [30], and so in particular for those that are real valued. For notational simplicity we write $d_1(X,Y)$, say, for $d_1(\mathcal{L}(X), \mathcal{L}(Y))$, where $\mathcal{L}(\cdot)$ stands for the distribution, or law, of a random variable. In [21], inequality (5) was used to derive a bound on the quality of the Dickman approximation for the running time of the Quickselect algorithm.

Here our aim is twofold. First, in Section 2 we study the approximation of sums that converge in distribution to the Dickman, for instance, those of the form
$$W_n = \frac{1}{n}\sum_{k=1}^n Y_k B_k, \qquad (9)$$
where $\{B_1,\dots,B_n, Y_1,\dots,Y_n\}$ are independent, $B_k$ is a Bernoulli random variable with success probability $1/k$, and $Y_k$ is non-negative with $EY_k = k$ and $\mathrm{Var}(Y_k) = \sigma_k^2$ for all $k = 1,\dots,n$. The most well known case is the one where $Y_k = k$ a.s., for which
$$W_n = \frac{1}{n}\sum_{k=1}^n k B_k. \qquad (10)$$
Sums of this type arise, for instance, in the analysis of the Quickselect algorithm for finding the $m$th smallest of a list of $n$ distinct numbers, see [24] (also [21]), and for the sum of positions of records in a uniformly random permutation (see [32]). To state the result we will apply to such sums, we first define the Wasserstein-2 metric
$$d_{1,1}(X,Y) = \sup_{h \in \mathcal{H}_{1,1}} |Eh(Y) - Eh(X)| \qquad (11)$$
where, for $\alpha \ge 0$ and $\beta \ge 0$,
$$\mathcal{H}_{\alpha,\beta} = \{h : h \in \mathrm{Lip}_\alpha, h' \in \mathrm{Lip}_\beta\}, \qquad (12)$$
with $\mathrm{Lip}_\alpha$ given in (7). The work [3] obtains a bound of the form $C\sqrt{\log n}/n$ between $W_n$ in (10) and $\mathcal{D}$ in a metric weaker than $d_{1,1}$ in (11), requiring test functions to be three times differentiable, and with the constant $C$ unspecified. The following theorem provides a more general result that in the specific case of (10) yields a bound in the stronger metric $d_{1,1}$ with a small, explicit constant.

Theorem 1.1.
Let $W_n$ be as in (9) and $D$ a standard Dickman random variable. Then, with the metric $d_{1,1}$ in (11),
$$d_{1,1}(W_n, D) \le \frac{3}{4n} + \frac{1}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k,$$
and in particular, if $Y_k = k$ a.s., that is, for $W_n$ as in (10),
$$d_{1,1}(W_n, D) \le \frac{3}{4n}. \qquad (13)$$

From the first bound given by the theorem, speaking asymptotically, we see that $W_n$ in (9) converges to $D$ in distribution whenever $\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k = o(n^2)$. In particular, weak convergence to the Dickman distribution occurs if $\sigma_k = O(k^{1-\epsilon})$ for some $\epsilon > 0$. In Section 2 we provide an application of Theorem 1.1 to minimal directed spanning trees in $\mathbb{R}^2$.

We also show the following related result for a weighted sum of independent Poisson variables. For $\lambda > 0$, let $\mathcal{P}(\lambda)$ denote a Poisson random variable with mean $\lambda$.

Theorem 1.2.
For $\theta > 0$, let $\{P_1,\dots,P_n, Y_1,\dots,Y_n\}$ be independent with $P_k \sim \mathcal{P}(\theta/k)$ and $Y_k$ non-negative with $EY_k = k$ and $\mathrm{Var}(Y_k) = \sigma_k^2$, for all $k = 1,\dots,n$. Then
$$W_n = \frac{1}{n}\sum_{k=1}^n Y_k P_k \qquad (14)$$
satisfies
$$d_{1,1}(W_n, D_\theta) \le \frac{\theta}{4n} + \frac{\theta}{n}\sum_{k=1}^n \frac{\sigma_k}{k} + \frac{\theta}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k, \qquad (15)$$
and in particular, in the case $Y_k = k$ a.s.,
$$W_n = \frac{1}{n}\sum_{k=1}^n k P_k \quad \text{satisfies} \quad d_{1,1}(W_n, D_\theta) \le \frac{\theta}{4n}. \qquad (16)$$

Similar to the weighted sum of Bernoullis in (9), we have weak convergence to the Dickman distribution if $\sigma_k = O(k^{1-\epsilon})$ for some $\epsilon > 0$.

We say $X \sim \mathrm{Geom}(p)$ if $P(X = m) = (1-p)^m p$ for $m \ge 0$. Let $(p_k)_{k\ge1}$ be an enumeration of the prime numbers in increasing order and let $\Omega_n$ denote the set of all positive integers having no prime factor larger than $p_n$. Let $X_1,\dots,X_n$ be independent with $X_k \sim \mathrm{Geom}(1-1/p_k)$ for $1 \le k \le n$, and let $\Pi_n$ be the distribution of $M_n$ given by
$$M_n = \prod_{k=1}^n p_k^{X_k} \quad \text{and} \quad S_n = \frac{\log M_n}{\log(p_n)} = \frac{1}{\log(p_n)}\sum_{k=1}^n X_k \log(p_k). \qquad (17)$$
One can specify (see e.g. [27]) $\Pi_n$ by
$$\Pi_n(m) = \frac{1}{\pi_n m} \quad \text{for } m \in \Omega_n$$
with normalizing constant necessarily satisfying $\pi_n = \sum_{m \in \Omega_n} 1/m$. Distributional convergence of $S_n$ to the standard Dickman distribution was proved in [27]. In Theorem 1.3 below, we provide a $(\log n)^{-1}$ convergence rate in the Wasserstein-2 norm.

Theorem 1.3.
For $D$ a standard Dickman random variable and $S_n$ as in (17) with $X_1,\dots,X_n$ independent variables with $X_k \sim \mathrm{Geom}(1-1/p_k)$, we have
$$d_{1,1}(S_n, D) \le \frac{C}{\log n}$$
for some universal constant $C$. Moreover, the order is not improvable.

Next consider the distribution $\Pi'_n$ over $\Omega'_n$, the set of square-free integers with largest prime factor less than or equal to $p_n$, with $\Pi'_n(m)$ proportional to $1/m$ for all $m \in \Omega'_n$. Then $M_n = \prod_{k=1}^n p_k^{X_k}$ has distribution $\Pi'_n$ when $X_k \sim \mathrm{Ber}(1/(1+p_k))$ and are independent (see e.g. [12]). That $S_n = \log M_n/\log(p_n)$ converges in distribution to the standard Dickman was proved in [12], and very recently a $(\log\log n)/(\log n)$ rate was provided in [3] in a metric defined as a supremum over a class of three times differentiable functions. We provide the improved $(\log n)^{-1}$ convergence rate in the stronger Wasserstein-2 norm.

Theorem 1.4.
For $D$ a standard Dickman random variable and $S_n$ as in (17) with $X_1,\dots,X_n$ independent variables with $X_k \sim \mathrm{Ber}(1/(1+p_k))$, we have
$$d_{1,1}(S_n, D) \le \frac{C}{\log n}$$
for some universal constant $C$. Moreover, the order is not improvable.

In Examples 2.1 and 2.2 we also provide such bounds when the $X_k$'s in (17) are distributed as Poisson random variables with parameters $\lambda_k$ given by certain functions of $p_k$. For our results in probabilistic number theory, we closely follow the arguments in [3].

In Section 3 we consider the connection between the class of Dickman distributions and perpetuities. By approaching from the view of utility, we extend the scope of the Dickman distributions past the currently known class. The recursion (4) was interpreted by Vervaat, see [40], as the relation between the values of a perpetuity at two successive times. In particular, during the $n$th time period a deposit of some fixed value, scaled to be unity, is added to the value of an asset. During that time period, the asset's value is discounted by a multiplicative factor in $[0,1]$ distributed as $U^{1/\theta}$. The generalized Dickman distributions arise as fixed points of this recursion, that is, solutions to $W^* =_d W$ where $W^*$ is given in (3).

Measuring the value of an asset directly by its monetary value corresponds to the case where the utility function $s(\cdot)$ of an asset is taken to be the identity. We consider the generalization of (4) to
$$s(W_{n+1}) = U_n^{1/\theta} s(W_n + 1). \qquad (18)$$
In [9], see also the translation [10], Daniel Bernoulli argued that utility should be given as a concave function of the value of an asset, typically justified by observing that receiving one unit of currency would be of more value to an individual who has very few resources than to one who has resources in abundance, see [17]. We may then interpret (18) in a manner similar to (4), but now in terms of utility. Again, during the $n$th time period, a constant value, scaled to be one, is added to an asset.
Then, at time $n+1$, the utility of the asset is given by some discount factor applied to the incremented utility of the asset. When $s(\cdot)$ is invertible, as for the most common Vervaat perpetuities, one can now gain insight into their long term behavior by studying fixed points of the transformation
$$W^* =_d s^{-1}(U^{1/\theta} s(W+1)). \qquad (19)$$
Theorem 3.3 in Section 3 shows that under mild and natural conditions on the utility function $s(\cdot)$ the transformation (19) has a unique fixed point, whose law we say has the $(\theta,s)$-Dickman distribution, denoted here by $\mathcal{D}_{\theta,s}$. As the identity function $s(x) = x$ recovers the class of generalized Dickman distributions, this extended class strictly contains them. The parameter $\theta > 0$ plays the same role for $\mathcal{D}_{\theta,s}$ as it does for $\mathcal{D}_\theta$, in particular in its appearance in the distributional bounds for simulation using recursive schemes. Theorem 3.4 generalizes the bound (5) of [21] to the $\mathcal{D}_{\theta,s}$ family, providing the inequality
$$d_1(W, D_{\theta,s}) \le (1-\rho)^{-1} d_1(W^*, W) \qquad (20)$$
with a parameter $\rho$ given by a bound on an integral involving $\theta$ and $s(\cdot)$, see (66) and (67). We apply (20) to assess the quality of the recursive scheme
$$W_{n+1} = s^{-1}(U_n^{1/\theta} s(W_n + 1)) \quad \text{for } n \ge 0, \text{ with } W_0 = 0, \qquad (21)$$
for the simulation of variables having the $\mathcal{D}_{\theta,s}$ distribution. Simulation by these means for the $\mathcal{D}_\theta$ family was considered in [15], though no bounds on its accuracy were provided. An algorithmic method for the exact simulation from the $\mathcal{D}_\theta$ family was given in [18] with bounds on the expected running time. In brief, the method in [18] depends on the use of a multigamma coupler as an update function for the kernel $K(x,\cdot) := \mathcal{L}(U^{1/\theta}(x+1))$, and on finding a dominating chain so that one can simulate from its stationary distribution, a shifted geometric distribution in this case.
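As a concrete illustration, the recursions (4) and (21) are straightforward to implement. The following minimal sketch uses the identity utility to recover (4), and the choice $s(x)=\sqrt{x}$ (with $s^{-1}(y)=y^2$) purely as a hypothetical example of a concave utility; neither choice beyond the identity is prescribed by the text.

```python
import random

def dickman_iterates(theta, steps, s=lambda x: x, s_inv=lambda y: y, seed=None):
    """Run W_{n+1} = s^{-1}(U_n^{1/theta} * s(W_n + 1)) with W_0 = 0.

    With s the identity this is recursion (4), whose iterates converge in
    distribution to the generalized Dickman D_theta; a non-identity utility
    s gives the generalized scheme (21).
    """
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        u = rng.random()
        w = s_inv(u ** (1.0 / theta) * s(w + 1.0))
    return w

# Monte Carlo sanity check: the standard Dickman (theta = 1) has mean 1,
# since W =_d U(W + 1) gives EW = (EW + 1)/2.
samples = [dickman_iterates(theta=1.0, steps=50, seed=i) for i in range(20000)]
print(sum(samples) / len(samples))  # should be close to 1

# Illustrative only: the hypothetical utility s(x) = sqrt(x), s^{-1}(y) = y^2.
print(dickman_iterates(theta=1.0, steps=50, s=lambda x: x ** 0.5,
                       s_inv=lambda y: y * y, seed=0))
```

By the geometric-rate bound discussed below ((73) of Corollary 3.2), a few dozen iterations already place the iterates very close to the target law, so `steps=50` suffices for this sketch.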
To extend this approach to the more general family $\mathcal{D}_{\theta,s}$, one would consider the kernel $K(x,\cdot) := \mathcal{L}(s^{-1}(U^{1/\theta} s(x+1)))$, and though one can generalize the multigamma coupler for use as an update function for this kernel, finding a suitable dominating chain in this generality may not be straightforward.

The efficacy of a simpler recursive scheme for simulation from this family is addressed in (73) of Corollary 3.2, where we show that the iterates generated by (21) obey the inequality
$$d_1(W_n, D_{\theta,s}) \le (1-\rho)^{-1}\left(\frac{\theta}{\theta+1}\right)^n E[s^{-1}(U^{1/\theta})],$$
and thus exhibit exponentially fast convergence. In Section 3.3 we present some instances from the family $\mathcal{D}_{\theta,s}$ that arise as limiting distributions for perpetuities when taking our utilities $s(\cdot)$ from those studied in economics.

We obtain our results by extensions of [20] for the Stein's method framework for the Dickman distribution. The application of Stein's method, as unveiled in [38] and further developed in [39], begins with a characterizing equation for a given target distribution. Such a characterization is then used as the basis to form a Stein equation, which is usually a difference or differential equation involving test functions in a class corresponding to a desired probability metric, such as the class of $\mathrm{Lip}_1$ functions for the Wasserstein distance in (6). One key step of the method requires bounds on the smoothness of solutions over the given class of test functions. For a modern treatment of Stein's method, see [14] and [33].

Theorem 1.4 improves on results of [3]. That work applies a different version of Stein's method, and in particular does not consider any form of the Stein equation, such as (22) or (24). Consequently [3] does not obtain bounds on a Stein solution for any Dickman case, as is achieved here in Theorems 4.7 and 4.9. Indeed, it is noted in [2] that this last step can be an 'extremely difficult problem'.

In [20] the Stein equation used for the $\mathcal{D}_\theta$ family was of the integral type
$$g(x) - \mathcal{A}_{x+1} g = h(x) - E[h(D_\theta)] \qquad (22)$$
where the averaging operator $\mathcal{A}_x g$ was given by
$$\mathcal{A}_x g = \begin{cases} g(0) & \text{for } x = 0, \\ \dfrac{\theta}{x^\theta}\displaystyle\int_0^x g(u)\, u^{\theta-1}\,du & \text{for } x > 0. \end{cases}$$
To handle the $\mathcal{D}_{\theta,s}$ family, over the range $x > 0$ we generalize this operator to
$$\mathcal{A}_x g = \frac{1}{t(x)}\int_0^x g(u)\, t'(u)\,du, \qquad (23)$$
where $t(x) = s^\theta(x)$. Smoothness bounds for solutions of (22), with $\mathcal{A}_x$ as in (23) and $D_\theta$ replaced by $D_{\theta,s}$, are given in Theorem 4.7 in Section 4 for a wide range of functions $s(\cdot)$. This generalization requires significant extensions of existing methods.

Use of the Stein equation (22) is appropriate when the variable $W$ of interest can be coupled to some $W^*$ with its $\theta$-Dickman bias distribution. However, such direct couplings appear elusive for all our examples in Section 2, including in particular those in probabilistic number theory, and a different approach is needed. To handle these new examples we consider instead a new Stein equation, of differential-delay type, given by
$$(x/\theta) f'(x) + f(x) - f(x+1) = h(x) - E[h(D_\theta)]. \qquad (24)$$
To apply the method, uniform bounds on the smoothness of the solution $f(\cdot)$ over test functions $h(\cdot)$ in some class $\mathcal{H}$ are required; we achieve such bounds for the class $\mathcal{H}_{1,1}$ in Theorem 4.9 in Section 4.

Throughout the paper, for a real-valued measurable function $f(\cdot)$ on a domain $S \subset \mathbb{R}$, $\|f\|_\infty$ denotes its essential supremum norm defined by
$$\|f\|_\infty = \operatorname*{ess\,sup}_{x \in S} |f(x)| = \inf\{b \in \mathbb{R} : m(\{x : f(x) > b\}) = 0\}, \qquad (25)$$
where $m$ denotes Lebesgue measure on $\mathbb{R}$. For any real valued function defined on $A \subset S$ we define its supremum norm on $A$ by
$$\|f\|_A = \sup_{x \in A} |f(x)|. \qquad (26)$$
Unless otherwise specifically noted, integration will be with respect to $m$, which for simplicity will be denoted by, say, $dv$ when the variable of integration is $v$.

This work is organized as follows.
We focus on sums, such as the Bernoulli and Poisson weighted sums in (9) and (14), and sums arising in probabilistic number theory as in (17), in Section 2. We focus on perpetuities, with examples, in Section 3, and in Section 4 we prove smoothness bounds on the two types of Stein solutions considered here.

We will prove Theorems 1.1 and 1.2, starting with a simple application of the former, in Section 2.1, and then provide the proofs of Theorems 1.3 and 1.4, in probabilistic number theory, in Section 2.2. In this section we deal with the form (24) of the Stein equation. That is, in the proofs of Theorems 1.1, 1.2 and 2.1, we take a fixed $\theta > 0$ and $h \in \mathcal{H}_{1,1}$, the function class defined in (12), and let $f \in \mathcal{H}_{\theta,\theta/2}$ be the solution of the Stein equation (24) that is guaranteed by Theorem 4.9. Substituting our $W_n$ of interest for $x$ in (24) and taking expectation yields
$$E[h(W_n)] - E[h(D_\theta)] = E[(W_n/\theta) f'(W_n) - (f(W_n+1) - f(W_n))]. \qquad (27)$$

We begin with a simple application of Theorem 1.1 to the minimal directed spanning tree, or MDST, following [11], first pausing to describe the construction of the MDST.

For two points $(u_1,v_1)$ and $(u_2,v_2)$ in $\mathbb{R}^2$, we write $(u_1,v_1) \preceq (u_2,v_2)$ if $u_1 \le u_2$ and $v_1 \le v_2$, and write $(u_1,v_1) \not\preceq (u_2,v_2)$ otherwise. For any set of points $\mathcal{V}$ in $\mathbb{R}^2$, we say $(u,v) \in \mathcal{V}$ is a minimal point, or sink, of $\mathcal{V}$ if $(a,b) \not\preceq (u,v)$ for all $(a,b) \in \mathcal{V}$ with $(a,b) \ne (u,v)$. For $n \in \mathbb{N}$, consider a set of $n+1$ distinct points $\mathcal{V} = \{(a_i,b_i), 0 \le i \le n\}$ in $[0,1] \times [0,1]$ with $(a_0,b_0) = (0,0)$. Let $\mathcal{E}$ be the set of directed edges $(a_i,b_i) \to (a_j,b_j)$ with $i \ne j$ and $(a_i,b_i) \preceq (a_j,b_j)$. Since $(0,0) \preceq (a_i,b_i)$ for all $i = 1,\dots,n$, the edge set $\mathcal{E}$ contains all the directed edges $(a_0,b_0) \to (a_i,b_i)$ with $i \ne 0$. Let $\mathcal{G}$ be the collection of all graphs $G$ with vertex set $G_V = \mathcal{V}$ and edge set $G_E \subseteq \mathcal{E}$ such that for any $1 \le j \le n$, there exists a directed path from $(a_0,b_0)$ to $(a_j,b_j)$ with each edge in $G_E$.
We define an MDST on $\mathcal{V}$ as any graph $T \in \mathcal{G}$ that minimizes $\sum_{e \in G_E} |e|$, where $|e|$ denotes the Euclidean length of the edge $e$. Clearly $T$ is a tree, and it need not be unique.

Now let $\mathcal{P}$ be a random collection of $n$ points uniformly and independently placed in the unit square $[0,1]^2$ in $\mathbb{R}^2$. In this random setting, the MDST on the point set $\mathcal{V} = \mathcal{P} \cup \{(0,0)\}$ is uniquely defined almost surely, see [11]. By relabeling the points according to the size of their $x$-coordinate, without loss of generality we may let the points in $\mathcal{P}$ be $(X_1,Y_1),\dots,(X_n,Y_n)$ where $Y_1,\dots,Y_n$ are independent $\mathcal{U}[0,1]$ random variables, also independent of $X_1,\dots,X_n$, where $0 < X_1 < X_2 < \cdots < X_n < 1$ are the order statistics of $n$ independent $\mathcal{U}[0,1]$ variables.

Though the origin is the unique minimal point of $\mathcal{V}$, the usual set of interest is the collection of minimal points of $\mathcal{P}$, which has size at least one. For $i = 1,\dots,n$, observe that $(X_i,Y_i)$ is a minimal point of $\mathcal{P}$ if and only if $Y_j > Y_i$ for all $j < i$. One much studied quantity in this context is the sum $S_n$ of the $\alpha$th powers of the Euclidean distances between the minimal points of the process and the origin for some $\alpha > 0$; the work [29] shows that $S_n$ converges to $\mathcal{D}_{1/\alpha}$ in distribution as $n$ tends to infinity.

The lower record times $R_1, R_2, \dots$ of the height process $Y_1,\dots,Y_n$ are also studied, see [11], and are defined by letting $R_1 = 1$, and for $i > 1$,
$$R_i = \begin{cases} \infty & \text{if } Y_j \ge Y_{R_{i-1}} \text{ for all } j > R_{i-1}, \text{ or if } R_{i-1} \ge n, \\ \min\{j > R_{i-1} : Y_j < Y_{R_{i-1}}\} & \text{otherwise.} \end{cases}$$
In terms of these record times, the collection of the $k(n)$ minimal points inside the unit square is given by $(X_{R_i}, Y_{R_i})$ for $i = 1,\dots,k(n)$. We claim that the scaled sum of lower record times
$$W_n = \frac{1}{n}\sum_{i=1}^{k(n)} R_i \qquad (28)$$
can be approximated by the Dickman distribution $\mathcal{D}$ in the Wasserstein-2 metric in (11) to within the bound specified by inequality (13) of Theorem 1.1. Indeed, for $1 \le k \le n$, letting
$$B_k = \mathbf{1}(k \in \{R_1,\dots,R_{k(n)}\}),$$
we have that $\sum_{i=1}^{k(n)} R_i = \sum_{k=1}^n k B_k$. As Lemma 2.1 of [11] shows that $B_1,\dots,B_n$ are independent with $B_k \sim \mathrm{Ber}(1/k)$ for $1 \le k \le n$, Theorem 1.1 yields the claimed bound for the Dickman approximation of (28).

We now present the proof of our first main result.

Proof of Theorem 1.1:
Let $W_n$ be as in (9) and take $\theta = 1$ in (27). Letting $W_n^{(k)} = W_n - \frac{Y_k}{n}B_k$, evaluating the first term on the right hand side of (27) yields
$$E[W_n f'(W_n)] = E\left[\frac{1}{n}\sum_{k=1}^n Y_k B_k f'(W_n)\right] = \frac{1}{n}\sum_{k=1}^n E\left[Y_k B_k f'\!\left(W_n^{(k)} + \frac{Y_k}{n}B_k\right)\right]$$
$$= \frac{1}{n}\sum_{k=1}^n E\left[Y_k f'\!\left(W_n^{(k)} + \frac{Y_k}{n}\right)\right] P(B_k = 1) = \frac{1}{n}\sum_{k=1}^n E\left[\frac{Y_k}{k} f'\!\left(W_n^{(k)} + \frac{Y_k}{n}\right)\right].$$
The right hand side of (27) is therefore the expectation of
$$\frac{1}{n}\sum_{k=1}^n \frac{Y_k}{k} f'\!\left(W_n^{(k)} + \frac{Y_k}{n}\right) - \int_0^1 f'(W_n + u)\,du$$
$$= \frac{1}{n}\sum_{k=1}^n \frac{Y_k}{k}\left(f'\!\left(W_n^{(k)} + \frac{Y_k}{n}\right) - f'\!\left(W_n^{(k)} + \frac{k}{n}\right)\right) + \frac{1}{n}\sum_{k=1}^n \left(\frac{Y_k}{k} f'\!\left(W_n^{(k)} + \frac{k}{n}\right) - f'\!\left(W_n^{(k)} + \frac{k}{n}\right)\right)$$
$$+ \frac{1}{n}\sum_{k=1}^n \left(f'\!\left(W_n^{(k)} + \frac{k}{n}\right) - f'\!\left(W_n + \frac{k}{n}\right)\right) + \left(\frac{1}{n}\sum_{k=1}^n f'\!\left(W_n + \frac{k}{n}\right) - \int_0^1 f'(W_n + u)\,du\right). \qquad (29)$$
Using that $f \in \mathcal{H}_{1,1/2}$, and hence in particular that $f'(\cdot)$ is Lipschitz, applying the Cauchy-Schwarz inequality to the first difference on the right hand side of (29), we find that the expectation of that term is bounded by
$$\frac{\|f''\|_\infty}{n^2}\sum_{k=1}^n E\left[\frac{|Y_k|}{k}|Y_k - k|\right] \le \frac{1}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k.$$
The expectation of the second difference is zero, as $E[Y_k] = k$ and $Y_k$ is independent of $W_n^{(k)}$. For the expectation of the third difference, noting that $E[Y_k B_k] = 1$, we similarly obtain the bound
$$\frac{\|f''\|_\infty}{n}\sum_{k=1}^n E|W_n^{(k)} - W_n| \le \frac{1}{2n}\sum_{k=1}^n E\left[\frac{Y_k B_k}{n}\right] = \frac{1}{2n}.$$
For the final difference in (29), using again the Lipschitz property of $f'(\cdot)$, almost surely
$$\left|\frac{1}{n}\sum_{k=1}^n f'\!\left(W_n + \frac{k}{n}\right) - \int_0^1 f'(W_n + u)\,du\right| \le \sum_{k=1}^n \int_{(k-1)/n}^{k/n} \left|f'(W_n + k/n) - f'(W_n + u)\right| du$$
$$\le \frac{1}{2}\sum_{k=1}^n \int_{(k-1)/n}^{k/n} (k/n - u)\,du = \frac{1}{2}\sum_{k=1}^n \frac{1}{2n^2} = \frac{1}{4n}.$$
Combining these three bounds yields, via (27) with $\theta = 1$, that
$$|E[h(W_n)] - E[h(D)]| \le \frac{3}{4n} + \frac{1}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k.$$
Taking the supremum over $h \in \mathcal{H}_{1,1}$ and recalling the definition of the metric $d_{1,1}$ in (11) now yields the theorem. The final claim (13) holds as $\sigma_k = 0$ when $Y_k = k$ a.s.

We turn now to the proof of our next main result, proceeding along the same lines as in the proof of Theorem 1.1. We first recall the well known Stein identity for the Poisson distribution, see e.g. [13], that $P \sim \mathcal{P}(\lambda)$ if and only if
$$E[P g(P)] = \lambda E[g(P+1)] \qquad (30)$$
for all functions $g(\cdot)$ on the non-negative integers for which the expectation of either side exists.

Proof of Theorem 1.2:
Consider equation (27) with $W_n$ as in (14) and $h(\cdot)$ an arbitrary function in $\mathcal{H}_{1,1}$, and let $f \in \mathcal{H}_{\theta,\theta/2}$ be the solution of (24) guaranteed by Theorem 4.9. For $k = 1,\dots,n$ set $W_n^{(k)} = W_n - Y_k P_k/n$. Using that $P_1,\dots,P_n, Y_1,\dots,Y_n$ are independent with $P_k \sim \mathcal{P}(\theta/k)$ and (30) for the second equality, letting $S_k = \{Y_j, j \in \{1,\dots,n\},\, P_j, j \in \{1,\dots,n\}\setminus\{k\}\}$, we have
$$E[(W_n/\theta) f'(W_n)] = \frac{1}{\theta n}\sum_{k=1}^n E\left[Y_k\, E[P_k f'(W_n^{(k)} + Y_k P_k/n) \mid S_k]\right]$$
$$= \frac{1}{n}\sum_{k=1}^n E\left[\frac{Y_k}{k}\, E[f'(W_n^{(k)} + Y_k P_k/n + Y_k/n) \mid S_k]\right] = \frac{1}{n}\sum_{k=1}^n E\left[\frac{Y_k}{k} f'(W_n + Y_k/n)\right].$$
Thus, via (27), we obtain
$$E[h(W_n)] - E[h(D_\theta)] = E[(W_n/\theta) f'(W_n) - (f(W_n+1) - f(W_n))]$$
$$= E\left[\frac{1}{n}\sum_{k=1}^n \frac{Y_k}{k} f'\!\left(W_n + \frac{Y_k}{n}\right) - \int_0^1 f'(W_n + u)\,du\right]$$
$$= E\left[\frac{1}{n}\sum_{k=1}^n \left(\frac{Y_k}{k} f'\!\left(W_n + \frac{Y_k}{n}\right) - f'\!\left(W_n + \frac{k}{n}\right)\right)\right] + E\left[\frac{1}{n}\sum_{k=1}^n f'\!\left(W_n + \frac{k}{n}\right) - \int_0^1 f'(W_n + u)\,du\right]. \qquad (31)$$
Now, for the second term in (31), since $f \in \mathcal{H}_{\theta,\theta/2}$, as for this same term that appears in the proof of Theorem 1.1, we have almost surely that
$$\left|\frac{1}{n}\sum_{k=1}^n f'(W_n + k/n) - \int_0^1 f'(W_n + u)\,du\right| \le \frac{\theta}{4n}. \qquad (32)$$
Now we write the first term in (31) as the expectation of
$$\frac{1}{n}\sum_{k=1}^n \left[\frac{Y_k}{k} f'\!\left(W_n + \frac{Y_k}{n}\right) - \frac{Y_k}{k} f'\!\left(W_n + \frac{k}{n}\right)\right] + \frac{1}{n}\sum_{k=1}^n \left[\frac{Y_k}{k} f'\!\left(W_n + \frac{k}{n}\right) - f'\!\left(W_n + \frac{k}{n}\right)\right]. \qquad (33)$$
As in the proof of Theorem 1.1, recalling that $f \in \mathcal{H}_{\theta,\theta/2}$, the expectation of the first term in (33) is bounded by
$$\frac{\|f''\|_\infty}{n^2}\sum_{k=1}^n E\left[\frac{|Y_k|}{k}|Y_k - k|\right] \le \frac{\theta}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k.$$
The expectation of the second term in (33) can be bounded by
$$\frac{\|f'\|_\infty}{n}\sum_{k=1}^n E\left|\frac{Y_k}{k} - 1\right| \le \frac{\theta}{n}\sum_{k=1}^n \frac{\sigma_k}{k}.$$
Assembling the bounds on the terms arising from (31), consisting of (32) and the two inequalities above, we obtain
$$|E[h(W_n)] - E[h(D_\theta)]| \le \frac{\theta}{4n} + \frac{\theta}{n}\sum_{k=1}^n \frac{\sigma_k}{k} + \frac{\theta}{2n^2}\sum_{k=1}^n \frac{1}{k}\sqrt{\sigma_k^2 + k^2}\,\sigma_k.$$
Taking the supremum over $h \in \mathcal{H}_{1,1}$ and applying definition (11) completes the proof of (15). The inequality in (16) follows by observing that $\sigma_k = 0$ when $Y_k = k$ a.s.

Let $(p_k)_{k\ge1}$ be an enumeration of the prime numbers in increasing order. Let $(X_k)_{k\ge1}$ be a sequence of independent integer valued random variables and let
$$S_n = \frac{1}{\log(p_n)}\sum_{k=1}^n X_k \log(p_k) \quad \text{for } n \ge 1. \qquad (34)$$
Weak convergence of $S_n$ to the Dickman distribution in the cases where the $X_k$'s are distributed as geometric and Bernoulli variables is well known in probabilistic number theory, and [3] recently provided a rate of convergence in the Bernoulli case. We give bounds in a stronger metric and remove a logarithmic factor from their rate. We also prove such bounds when the $X_k$'s are distributed as geometric or Poisson with parameters given by certain functions of $p_k$. For our results in this area, we rely heavily on the techniques in the proof of Lemma 2.3 of [3]; in particular, the identity (35) below, without remainder, is due to [3]. We begin with the following abstract theorem.

Theorem 2.1. Let $S$ be a non-negative random variable with finite variance such that, for some constant $\mu$ and a random variable $T$ satisfying $P(S + T = 0) = 0$,
$$E[S\phi(S)] = \mu E[\phi(S+T)] + R_\phi \quad \text{for all } \phi \in \mathrm{Lip}_{1/2}, \qquad (35)$$
where the constant $R_\phi$ may depend on $\phi(\cdot)$. Then
$$d_{1,1}(S, D) \le |\mu - 1| + \frac{1}{2}\inf_{(T,U)} E|T - U| + \sup_{\phi \in \mathrm{Lip}_{1/2}} |R_\phi|, \qquad (36)$$
where $D$ is a standard Dickman random variable, and the infimum is over all couplings $(T,U)$ of $T$ and $U \sim \mathcal{U}[0,1]$ constructed on the same space as $S$, with $U$ independent of $S$.

Remark 2.1.
We note the connection between the relation in (35) and size biasing, where for a non-negative random variable $S$ with finite mean $\mu$, we say $S^s$ has the $S$-size biased distribution when
$$E[S\phi(S)] = \mu E[\phi(S^s)]$$
for all functions $\phi(\cdot)$ for which these expectations exist. In particular, when $R_\phi$ in (35) is zero for all $\phi \in \mathrm{Lip}_{1/2}$, we obtain that $S^s =_d S + T$; for an application which requires the remainder, see Lemma 2.2. Additionally, Section 4.3 of [6] shows that the standard Dickman $\mathcal{D}$ is the unique non-negative solution of the distributional equality $W^s =_d W + U$, where $U$ is $\mathcal{U}[0,1]$ and independent of $W$. Hence, the error term comparing $T$ and $U$ in Theorem 2.1 is natural.

Proof of Theorem 2.1: We first show that the set of couplings over which the infimum is taken in (36) is non-empty. Note that the case when $S$ is identically zero is trivial, since one can take $\mu = 0$, $T = 0$ and $R_\phi = 0$ for all $\phi \in \mathrm{Lip}_{1/2}$. For a nontrivial $S$, let $\mu = E[S]$, and let $S^s$ and $U$ be constructed on the same space as $S$, independently of $S$, with $S^s$ having the $S$-size biased distribution and $U \sim \mathcal{U}[0,1]$. Taking $T = S^s - S$, identity (35) is satisfied with $R_\phi = 0$ for all $\phi \in \mathrm{Lip}_{1/2}$, and the pair $(T,U)$ satisfies the conditions required of the infimum in the theorem.

Invoking Theorem 4.9 with $\theta = 1$, for any given $h \in \mathcal{H}_{1,1}$ there exists a function $f(\cdot)$ satisfying $\|f'\|_{(0,\infty)} \le 1$ and $\|f''\|_{(0,\infty)} \le 1/2$ such that
$$E[h(S)] - E[h(D)] = E[Sf'(S) + f(S) - f(S+1)].$$
Now consider $\mu$ and $T$ satisfying (35), with $(T,U)$ constructed on the same space as $S$, with $U \sim \mathcal{U}[0,1]$ and independent of $S$. Then, using $P(S+T = 0) = 0$, which allows us to apply the bounds of Theorem 4.9 over $(0,\infty)$, using the mean value theorem for the second inequality, and recalling definitions (25) and (26), we obtain
$$|E[h(S)] - E[h(D)]| = |E[Sf'(S) - f'(S+U)]| = |E[\mu f'(S+T) - f'(S+U) + R_{f'}]|$$
$$\le |E[\mu f'(S+T) - f'(S+T)]| + |E[f'(S+T) - f'(S+U)]| + |R_{f'}|$$
$$\le \|f'\|_{(0,\infty)}|\mu - 1| + \|f''\|_{(0,\infty)} E|T-U| + |R_{f'}| \le |\mu - 1| + \frac{1}{2}E|T-U| + |R_{f'}|.$$
Taking the infimum over couplings $(T,U)$ satisfying the conditions of the theorem yields
$$|E[h(S)] - E[h(D)]| \le |\mu - 1| + \frac{1}{2}\inf_{(T,U)} E|T-U| + |R_{f'_h}|,$$
where we have written $f = f_h$ to emphasize the dependence of $f(\cdot)$ on $h(\cdot)$. Taking the supremum over $h \in \mathcal{H}_{1,1}$, first on the right and then on the left, now yields the result upon applying definition (11).

Now we will demonstrate a few applications of Theorem 2.1. In all these examples, the conditions that the variance of $S$ is finite and that $S + T > 0$ almost surely are easily verified.

For $n \ge 1$, let $\Omega_n$ denote the set of integers with no prime factor larger than $p_n$, and let $\Pi_n$ be the distribution on $\Omega_n$ with mass function
$$\Pi_n(m) = \frac{1}{\pi_n m} \quad \text{for } m \in \Omega_n,$$
where $\pi_n = \sum_{m \in \Omega_n} 1/m$ is the normalizing factor. One can check, see e.g. Proposition 1 in [27], that $M_n = \prod_{k=1}^n p_k^{X_k}$ has distribution $\Pi_n$, where $X_k \sim \mathrm{Geom}(1-1/p_k)$ are independent for $1 \le k \le n$; we remind the reader that we write $X \sim \mathrm{Geom}(p)$ when $P(X = m) = (1-p)^m p$ for $m \ge 0$. For $n \ge 1$, the random variable $S_n$ as in (34) is therefore given by
$$S_n = \frac{1}{\log(p_n)}\sum_{k=1}^n X_k \log(p_k) = \frac{\log M_n}{\log(p_n)}. \qquad (37)$$
Taking the mean, we find
$$\mu_n = E[S_n] = \frac{1}{\log(p_n)}\sum_{k=1}^n \frac{\log(p_k)}{p_k - 1}. \qquad (38)$$
Now define the random variable $I$, taking values in $\{1,\dots,n\}$ and independent of $S_n$, with mass function
$$P(I = k) = \frac{\log(p_k)}{(p_k - 1)\log(p_n)\,\mu_n} \quad \text{for } k \in \{1,\dots,n\}. \qquad (39)$$
The next lemma very closely follows the arguments in Lemmas 3 and 5 of [3] and is included here only for completeness. In the proof we will use the statement, equivalent [23] to the prime number theorem, that $\lim_{n\to\infty} p_n/(n\log n) = 1$, and Rosser's theorem [34], to respectively yield that
$$\log p_n = \log n + O(\log\log n) \quad \text{and} \quad p_k > k\log k. \qquad (40)$$
We will also use the following stronger version of Mertens' theorem, see [19]: for $j \ge 1$, and with $\gamma$ the Euler constant,
$$\sum_{k=1}^j \log(p_k)/p_k = \log(p_j) + R_j \quad \text{with} \quad \lim_{j\to\infty} R_j = -\gamma - \sum_{k=1}^\infty \frac{\log(p_k)}{(p_k-1)p_k} = -1.33\ldots \qquad (41)$$

Lemma 2.2. Let $S_n$ be as in (37) with $X_1,\dots,X_n$ independent with $X_k \sim \mathrm{Geom}(1-1/p_k)$, $\mu_n$ as in (38), $I$ with distribution given in (39) and independent of $S_n$, and let
$$T_n = \frac{\log(p_I)}{\log(p_n)} \quad \text{and} \quad R_{n,\phi} = \frac{1}{\log(p_n)}\sum_{k=1}^n \frac{\log(p_k)}{p_k-1}\, E\left[X_k\left(\phi\!\left(S_n + \frac{\log(p_k)}{\log(p_n)}\right) - \phi(S_n)\right)\right].$$
Then
$$E[S_n\phi(S_n)] = \mu_n E[\phi(S_n + T_n)] + R_{n,\phi} \quad \text{for all } \phi \in \mathrm{Lip}_{1/2}.$$
Moreover,
$$\sup_{\phi \in \mathrm{Lip}_{1/2}} |R_{n,\phi}| = O\!\left(\frac{1}{\log^2 n}\right) \quad \text{and} \quad \mu_n - 1 = O\!\left(\frac{1}{\log n}\right),$$
and there exists a coupling between $U \sim \mathcal{U}[0,1]$ and $T_n$, with $U$ independent of $S_n$, such that
$$E|T_n - U| = O\!\left(\frac{1}{\log n}\right).$$

Proof.
It is easily verified that for $X \sim \mathrm{Geom}(p)$,
$E[g(X)] = \frac{1-p}{p} E[g(X+1) - g(X)]$  (42)
for all functions $g(\cdot)$ for which these expectations exist, and which satisfy $g(0) = 0$. Let $S_n^{(k)} = S_n - X_k \log(p_k)/\log(p_n)$. Since $X_k \sim \mathrm{Geom}(1 - 1/p_k)$, specializing (42) to the case $g(x) = x\varphi(S_n^{(k)} + x\log(p_k)/\log(p_n))$, conditioning on $S_n^{(k)}$ in the second equality and using the independence of $I$ and $S_n$ in the last, for $\varphi \in \mathrm{Lip}_{1/2}$ we have
$E[S_n \varphi(S_n)] = \frac{1}{\log p_n} \sum_{k=1}^n \log(p_k)\, E\left[X_k \varphi\left(S_n^{(k)} + \frac{X_k \log p_k}{\log p_n}\right)\right]$
$= \frac{1}{\log p_n} \sum_{k=1}^n \frac{\log(p_k)(1/p_k)}{1 - 1/p_k}\, E\left[(X_k + 1)\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right) - X_k \varphi(S_n)\right]$
$= \frac{1}{\log p_n} \sum_{k=1}^n \frac{\log p_k}{p_k - 1}\, E\left[\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right) + X_k\left(\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right) - \varphi(S_n)\right)\right]$
$= \mu_n \sum_{k=1}^n \frac{\log p_k}{(p_k - 1)\log(p_n)\,\mu_n}\, E\left[\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right)\right] + R_{n,\varphi}$
$= \mu_n \sum_{k=1}^n P(I = k)\, E\left[\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right)\right] + R_{n,\varphi} = \mu_n E[\varphi(S_n + T_n)] + R_{n,\varphi}$,
proving the first claim.

Next, using the mean value theorem and that $\|\varphi'\|_\infty \le 1/2$,
$|R_{n,\varphi}| = \left|\frac{1}{\log p_n} \sum_{k=1}^n \frac{\log p_k}{p_k - 1} E\left[X_k\left(\varphi\left(S_n + \frac{\log p_k}{\log p_n}\right) - \varphi(S_n)\right)\right]\right| \le \frac{1}{2\log p_n} \sum_{k=1}^n \frac{\log^2 p_k}{(p_k - 1)\log p_n}\, EX_k = \frac{1}{2\log^2 p_n} \sum_{k=1}^n \frac{\log^2 p_k}{(p_k - 1)^2} = O\left(\frac{1}{\log^2 n}\right)$,  (43)
where in the last step we have used the second relation in (40) to lower bound $p_n$ by $n$, and, again by (40), that
$\sum_{k=1}^\infty \frac{\log^2 p_k}{(p_k - 1)^2} \le C \sum_{k \ge 2} \frac{\log^2 k}{(k - 1)^2} < \infty$,
where we have used the first relation there to upper bound $\log p_k$ by $C\log k$ for some positive constant in the numerator, and the second one again to lower bound $p_k$ by $k$ in the denominator. As the final sum in (43) does not depend on $\varphi(\cdot)$, the bound is uniform over all $\varphi \in \mathrm{Lip}_{1/2}$.

The proof of the remainder of the lemma closely follows Lemma 5 of [3]. Using (41), we obtain
$\sum_{k=1}^j \frac{\log p_k}{p_k - 1} = \log(p_j) + \sum_{k=1}^j \frac{\log p_k}{(p_k - 1)p_k} + R_j = \log(p_j) + O(1)$,  (44)
where in the second sum we have used both relations in (40) to obtain $\frac{\log p_k}{p_k(p_k - 1)} = O\left(\frac{1}{k^2 \log k}\right)$. Thus, using (44), that $p_n > n$ via (40), and recalling $\mu_n$ in (38), we obtain
$\mu_n - 1 = \frac{1}{\log p_n}\left(\sum_{k=1}^n \frac{\log p_k}{p_k - 1} - \log p_n\right) = O\left(\frac{1}{\log n}\right)$.  (45)

To prove the last claim, we sketch the coupling construction of $(U, I)$ in Lemma 5 of [3], with $I$ a function of the uniform $U \sim \mathcal{U}[0,1]$, independent of $X_1, \ldots, X_n$. For $j = 0, 1, \ldots, n$, set
$F_j = \sum_{k=1}^j P(I = k) = \frac{1}{\mu_n \log p_n} \sum_{k=1}^j \frac{\log p_k}{p_k - 1}$,
and define the random variable $I$ by $I = j$ if $F_{j-1} \le U < F_j$. Clearly $I$ is independent of $X_1, \ldots, X_n$, since it only depends on $U$. When $I = j$, using that $|u - c|$ is a convex function of $u$ for any constant $c$ for the equality, deterministically we have
$\left|U - \frac{\log p_I}{\log p_n}\right| \le \sup_{u \in [F_{j-1}, F_j)} \left|u - \frac{\log p_j}{\log p_n}\right| = \max\left\{\left|F_{j-1} - \frac{\log p_j}{\log p_n}\right|, \left|F_j - \frac{\log p_j}{\log p_n}\right|\right\}$.  (46)
Now, using (40), (44) and (45), with (45) implying that $\mu_n \to 1$ as $n \to \infty$, we have
$F_j - \frac{\log p_j}{\log p_n} = \frac{1}{\log p_n}\left(\sum_{k=1}^j \frac{\log p_k}{p_k - 1} - \log p_j\right) - \frac{1}{\log p_n}(1 - \mu_n^{-1})\sum_{k=1}^j \frac{\log p_k}{p_k - 1} = O\left(\frac{1}{\log n}\right) - \frac{\mu_n - 1}{\mu_n \log p_n}(\log p_j + O(1)) = O\left(\frac{1}{\log n}\right)$.
Also, since $\mu_n \to 1$, we have
$P(I = j) = F_j - F_{j-1} = \frac{1}{\mu_n \log p_n}\,\frac{\log p_j}{p_j - 1} = O\left(\frac{1}{j\log n}\right)$.  (47)
Thus, by subtracting and adding $F_j$, we obtain
$F_{j-1} - \frac{\log p_j}{\log p_n} = O\left(\frac{1}{j\log n}\right) + O\left(\frac{1}{\log n}\right) = O\left(\frac{1}{\log n}\right)$,
and hence, on the event $I = j$, from (46) we have
$\left|U - \frac{\log p_I}{\log p_n}\right| = O\left(\frac{1}{\log n}\right)$.  (48)
Now, using (47) and (48) we obtain
$E\left|U - \frac{\log p_I}{\log p_n}\right| = \sum_{j=1}^n P(I = j)\, E\left[\left|U - \frac{\log p_I}{\log p_n}\right| \,\Big|\, I = j\right] = O\left(\sum_{j=1}^n \frac{1}{j\log n}\cdot\frac{1}{\log n}\right) = O\left(\frac{1}{\log n}\right)$,
thus proving the final claim.

Proof of Theorem 1.3:
The upper bound follows directly from Theorem 2.1 upon invoking Lemma 2.2. Next we show that the order of the bound is optimal. From (45) and (44), we have
$\mu_n - 1 = \frac{1}{\log p_n}\left(\sum_{k=1}^n \frac{\log p_k}{(p_k - 1)p_k} + R_n\right)$,
and by the second display in (41) we obtain
$\lim_{n \to \infty}\left(\sum_{k=1}^n \frac{\log p_k}{(p_k - 1)p_k} + R_n\right) = -\gamma$.
As $\log p_n = O(\log n)$ by (40), $|\mu_n - 1|$ is at least of order $1/\log n$. Since the function $h(x) = x$ is in $\mathcal{H}_{1,1}$, by (11) we have that $d_{1,1}(S_n, D) \ge |Eh(S_n) - Eh(D)| = |\mu_n - 1|$. Hence $d_{1,1}(S_n, D)$ is at least of order $1/\log n$.

For our next example, for $n \ge 1$ let $\Omega'_n$ denote the set of square-free integers whose largest prime factor is less than or equal to $p_n$, and let $\Pi'_n$ denote the distribution on $\Omega'_n$ with mass function
$\Pi'_n(m) = \frac{1}{\pi'_n m}$ for $m \in \Omega'_n$, where $\pi'_n = \sum_{m \in \Omega'_n} 1/m$
is the normalizing factor. We again consider $S_n$ as in (37), here for $M_n = \prod_{k=1}^n p_k^{X_k}$ where $X_k \sim \mathrm{Ber}(1/(1+p_k))$ are independent for $1 \le k \le n$. One can check, see e.g. [12], that $M_n \sim \Pi'_n$. Following [3], let
$\mu_n = E[S_n] = \frac{1}{\log p_n} \sum_{k=1}^n \frac{\log p_k}{1 + p_k}$.  (49)
The following lemma combines Lemmas 3 and 5 of [3]. By following tightly the same lines of argument in [3], the bounds we obtain in (52) and (53) are $O(1/\log n)$, whereas [3] claims only the order $O(\log\log n/\log n)$.

Lemma 2.3.
Let $S_n$ be as in (37) with $X_1, \ldots, X_n$ independent with $X_k \sim \mathrm{Ber}(1/(1+p_k))$. With $\mu_n$ as given in (49), let the random variable $I$ take values in $\{1, \ldots, n\}$ with mass function
$P(I = k) = \frac{\log p_k}{(1 + p_k)\log(p_n)\,\mu_n}$ for $k \in \{1, \ldots, n\}$,
and be independent of $X_1, \ldots, X_n$. For
$T_n = \frac{\log p_I}{\log p_n} - \frac{X_I \log p_I}{\log p_n}$,  (50)
we have
$E[S_n \varphi(S_n)] = \mu_n E[\varphi(S_n + T_n)]$ for all $\varphi \in \mathrm{Lip}_{1/2}$.  (51)
Moreover,
$|\mu_n - 1| = O\left(\frac{1}{\log n}\right)$ and $E\left|\frac{X_I \log p_I}{\log p_n}\right| = O\left(\frac{1}{\log n}\right)$,  (52)
and there exists a coupling between a random variable $U \sim \mathcal{U}[0,1]$ and $I$ with $U$ independent of $S_n$ such that
$E\left|U - \frac{\log p_I}{\log p_n}\right| = O\left(\frac{1}{\log n}\right)$.  (53)

Proof. The proof of (51) is exactly the same as in Lemma 3 of [3], and one can follow the lines of argument in [3] to prove the second claim in (52). The proofs of the other two claims are similar to those of the corresponding results in Lemma 2.2, noting that the orders in the bounds do not change if we replace $p_k - 1$ by $p_k + 1$; we omit the computation.

Proof of Theorem 1.4:
The upper bound follows directly from Theorem 2.1 upon invoking Lemma 2.3 with $R_\varphi = 0$ for all $\varphi \in \mathrm{Lip}_{1/2}$, and noting that with $T_n$ and $U$ as in (50) and (53) respectively,
$E|T_n - U| \le E\left|\frac{X_I \log p_I}{\log p_n}\right| + E\left|U - \frac{\log p_I}{\log p_n}\right| = O\left(\frac{1}{\log n}\right)$,
using (52) and (53) on these two terms, respectively. Finally, that the upper bound is of optimal order follows as in the proof of Theorem 1.3.

We also prove that these types of convergence results hold for $S_n$ given in (37) when $X_k \sim \mathrm{Poi}(\lambda_k)$, $k \ge 1$, for positive rate sequences $(\lambda_k)_{k \ge 1}$. Here we take $\mu_n$ equal to the mean of $S_n$,
$\mu_n = \frac{1}{\log p_n} \sum_{k=1}^n \lambda_k \log p_k$ and $P(I = k) = \frac{\lambda_k \log p_k}{\log(p_n)\,\mu_n}$ for $k \in \{1, \ldots, n\}$,  (54)
with $I$ independent of $S_n$. Under this framework, we have the following construction of a variable having the size bias distribution of $S_n$.

Lemma 2.4. For a sequence of positive real numbers $(\lambda_k)_{1 \le k \le n}$ and independent random variables $X_1, \ldots, X_n$ with $X_k \sim \mathrm{Poi}(\lambda_k)$, let
$S_n = \frac{1}{\log p_n} \sum_{k=1}^n X_k \log p_k$.
For $\mu_n$ as in (54) and $T_n = \log(p_I)/\log(p_n)$, where $I$ is distributed as in (54) and is independent of $S_n$, we have
$E[S_n \varphi(S_n)] = \mu_n E[\varphi(S_n + T_n)]$ for all $\varphi \in \mathrm{Lip}_{1/2}$.

Proof. Using (30) in the second equality, for $S_n^{(k)} = S_n - X_k \log(p_k)/\log(p_n)$,
$E[S_n \varphi(S_n)] = \frac{1}{\log p_n} \sum_{k=1}^n \log(p_k)\, E[X_k \varphi(S_n^{(k)} + X_k \log(p_k)/\log(p_n))]$
$= \frac{1}{\log p_n} \sum_{k=1}^n \log(p_k)\,\lambda_k\, E[\varphi(S_n^{(k)} + (X_k + 1)\log(p_k)/\log(p_n))]$
$= \frac{1}{\log p_n} \sum_{k=1}^n \log(p_k)\,\lambda_k\, E[\varphi(S_n + \log(p_k)/\log(p_n))]$
$= \mu_n \sum_{k=1}^n P(I = k)\, E[\varphi(S_n + \log(p_k)/\log(p_n))] = \mu_n E[\varphi(S_n + T_n)]$,
where in the last step we have used that $I$ is independent of $S_n$.

We now present two applications of Lemma 2.4 with notation and assumptions as there.

Example 2.1. Let $\lambda_k = 1/(1 + p_k)$. As the means of the $X_k$ variables are the same here as in Lemma 2.3, $\mu_n$ and the distribution of $I$ also correspond. Taking $U \sim \mathcal{U}[0,1]$ independent of $S_n$, and coupling $I$ and $U$ similarly as in Lemma 2.3, we have that
$|\mu_n - 1| = O\left(\frac{1}{\log n}\right)$ and $E\left|U - \frac{\log p_I}{\log p_n}\right| = O\left(\frac{1}{\log n}\right)$.
Now, by Theorem 2.1 and Lemma 2.4 we obtain
$d_{1,1}(S_n, D) \le \frac{C}{\log n}$
for some universal constant $C$. One may show that the order of this bound is optimal by arguing as in the proof of Theorem 1.3.
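The rate in Example 2.1 can be seen numerically. The sketch below (the prime cutoff and sample size are illustrative choices, not from the paper) simulates $S_n$ with $X_k \sim \mathrm{Poi}(1/(1+p_k))$ and compares its empirical mean with $E[D] = 1$; since the error decays only like $1/\log n$, the mean is still visibly below $1$ at practical sizes.

```python
# Numerical sketch for Example 2.1 (prime cutoff and sample size are
# illustrative): simulate S_n = (1/log p_n) * sum_k X_k log p_k with
# X_k ~ Poisson(1/(1+p_k)); E[S_n] -> E[D] = 1 only at rate 1/log n.
import math
import random

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return [i for i, is_p in enumerate(sieve) if is_p]

def poisson(lam, rng):
    """Poisson sampling by inversion; adequate for the small rates used here."""
    u, k, term = rng.random(), 0, math.exp(-lam)
    acc = term
    while u > acc:
        k += 1
        term *= lam / k
        acc += term
    return k

def sample_S(primes, rng):
    log_pn = math.log(primes[-1])
    return sum(poisson(1.0 / (1.0 + p), rng) * math.log(p) for p in primes) / log_pn

rng = random.Random(0)
primes = primes_up_to(10_000)   # n = 1229 primes, p_n = 9973
mean = sum(sample_S(primes, rng) for _ in range(1_000)) / 1_000
print(round(mean, 2))  # noticeably below the limit 1, reflecting the 1/log n rate
```

The gap between the printed value and $1$ shrinks only logarithmically in the prime cutoff, matching the optimality discussion above.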
Example 2.2. Let $p_0 = 1$ and $\lambda_k = 1 - \log(p_{k-1})/\log(p_k)$ for $k \ge 1$. Then clearly $\mu_n = 1$. Now, to obtain a coupling $(T_n, U)$, we take $U \sim \mathcal{U}[0,1]$ independent of $S_n$, and define
$I = k$ if $\frac{\log p_{k-1}}{\log p_n} \le U < \frac{\log p_k}{\log p_n}$ for $1 \le k \le n$.
Then by construction we have
$P(I = k) = \frac{\lambda_k \log p_k}{\log(p_n)\,\mu_n}$ for $1 \le k \le n$.
Conditioning on $I$, we have
$E|T_n - U| = \sum_{k=1}^n P(I = k)\, E\left(\left|\frac{\log p_k}{\log p_n} - U\right| \,\Big|\, I = k\right) \le \sum_{k=1}^n P(I = k)\left|\frac{\log p_{k-1}}{\log p_n} - \frac{\log p_k}{\log p_n}\right|$.
Now using that $p_k/p_{k-1} \le 2$ by Bertrand's postulate (see e.g. [31]) for all $k \ge 1$, we obtain
$E|T_n - U| \le \frac{\log 2}{\log p_n}$.
Hence from Theorem 2.1 with $\mu_n = 1$ and $R_\varphi = 0$ for all $\varphi \in \mathrm{Lip}_{1/2}$, we have
$d_{1,1}(S_n, D) \le \frac{\log 2}{2\log p_n} \le \frac{C}{\log n}$
for some universal constant $C$.

Following the distribution of a draft of this manuscript, [5] pointed out that the approach in [4] may be used to obtain bounds in the Wasserstein-1 metric for some results in this section.

3 The $D_{\theta,s}$ family, simulations and distributional bounds

In this section we develop the extension of the generalized Dickman distribution to the $D_{\theta,s}$ family for $\theta > 0$ and suitable utility functions $s : [0,\infty) \to [0,\infty)$. As detailed in the Introduction, the recursion (4) associated with the $D_\theta$ family can be interpreted as giving the successive values of a Vervaat perpetuity under the assumption that the utility function is the identity. More generally, with utility function $s(\cdot)$, one obtains the recursion
$s(W_{n+1}) = U_n^{1/\theta} s(W_n + 1)$ for $n \ge 0$,  (55)
where $U_n$, $n \ge 0$ are i.i.d. with the $\mathcal{U}[0,1]$ distribution, $U_n$ is independent of $W_n$, and $W_0$ has some given initial distribution. In Section 3.1, under Condition 3.1 below on $s(\cdot)$, we prove Theorem 3.3, which shows that the distributional fixed points $D_{\theta,s}$ of (55) exist and are unique. When $s(\cdot)$ is invertible, as it is under Condition 3.1 below, we may write (55) as
$W_{n+1} = s^{-1}\left(U_n^{1/\theta} s(W_n + 1)\right)$ for $n \ge 0$.  (56)

In Section 3.2, we provide distributional bounds for approximation of the $D_{\theta,s}$ distribution. Using direct coupling, Corollary 3.1 gives a bound on how well the utility $s(W_n)$ in (55) approximates the utility of its limit $D_{\theta,s}$. Next, Theorem 3.4 extends the main Wasserstein bound (5) of [21] to
$d_1(W, D_{\theta,s}) \le (1 - \rho)^{-1} d_1(W^*, W)$ where $W^* =_d s^{-1}\left(U^{1/\theta} s(W + 1)\right)$  (57)
for $U \sim \mathcal{U}[0,1]$ independent of $W$. The constant $\rho$ is defined in (67) as a uniform bound on an integral involving $(\theta, s)$ given by (66). However, [8] shows that this quantity can be interpreted in terms of the Markov chain (56) and its properties connected to those of its transition operator $(Ph)(x) = E\left[h\left(s^{-1}\left(U^{1/\theta} s(x + 1)\right)\right)\right]$ in this, and some more general, cases. In particular, for $h \in \mathrm{Lip}_1$, $\rho$ is a bound on the essential supremum norm of the derivative of the transition operator. Though linear stochastic recursions are ubiquitous and are well known to be highly tractable, this special class of Markov chains, despite its non-linear transitions, seems also amenable to deeper analysis.

We apply the inequality (57) in Corollary 3.2 to obtain a bound on the Wasserstein distance between the iterates $W_n$ of (56) and $D_{\theta,s}$. Finally, in Section 3.3, we give a few examples of some new distributions that arise as a result of utility functions that appear in the economics literature.

3.1 The $D_{\theta,s}$ distribution

In the following we use the terms increasing and decreasing in the non-strict sense. Let $\le_{st}$ denote inequality between random variables in the stochastic order.

Lemma 3.1.
Let $\theta > 0$ and $s : [0,\infty) \to [0,\infty)$ satisfy
$s(x + 1) \le s(x) + 1$ for all $x \ge 0$,  (58)
let $W_0$ be a given non-negative random variable, and let $\{W_n, n \ge 0\}$ be generated by recursion (55). Then
$s(W_{n+1}) \le U_n^{1/\theta}(s(W_n) + 1)$ for all $n \ge 0$.  (59)
If in addition $s(W_0) \le_{st} D_\theta$, then
$s(W_n) \le_{st} D_\theta$ for all $n \ge 0$.  (60)

Proof.
By applying (55) and (58) for the equality and inequality respectively, we have
$s(W_{n+1}) = U_n^{1/\theta} s(W_n + 1) \le U_n^{1/\theta}(s(W_n) + 1)$,
hence the claim (59) holds, and when $s(W_n) \le_{st} D_\theta$ then
$U_n^{1/\theta}(s(W_n) + 1) \le_{st} U_n^{1/\theta}(D_\theta + 1) =_d D_\theta$,
where for the final equality we have used that $D_\theta$ is fixed by the Dickman bias transformation (3), and taken $U_n$ independent of $D_\theta$. Induction then shows that the claim (60) holds for all $n \ge 0$, as it holds by assumption for $n = 0$.

Theorem 3.3, showing the existence and uniqueness of the fixed point $D_{\theta,s}$ to (19), requires the following condition to hold on the utility function $s(\cdot)$.

Condition 3.1.
The function $s : [0,\infty) \to [0,\infty)$ is continuous, strictly increasing with $s(0) = 0$ and $s(1) = 1$, and satisfies
$s(x + 1) \le s(x) + 1$ for all $x \ge 0$  (61)
and
$|s(x + 1) - s(y + 1)| \le |s(x) - s(y)|$ for all $x, y \ge 0$.  (62)

The following result shows that the choice of the starting distribution in (55) has a vanishing effect asymptotically, as measured in the $d_1$ Wasserstein norm.
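Before turning to the contraction property, it may help to see the recursion (55) in executable form. The minimal sketch below (iteration depth and number of draws are illustrative choices) iterates $W_{n+1} = s^{-1}(U_n^{1/\theta} s(W_n + 1))$ with the identity utility, for which the fixed point is the generalized Dickman $D_\theta$ with mean $\theta$:

```python
# Sketch of recursion (55)/(56): W_{n+1} = s^{-1}(U_n^{1/theta} * s(W_n + 1)).
# With the identity utility s(x) = x this is the Vervaat-perpetuity recursion
# whose unique fixed point is the generalized Dickman D_theta, with mean theta.
# (Iteration depth and number of draws are illustrative choices.)
import random

def iterate_W(theta, s, s_inv, n_steps, rng, w0=0.0):
    """Run n_steps of W <- s^{-1}(U^{1/theta} s(W + 1)) from W_0 = w0."""
    w = w0
    for _ in range(n_steps):
        u = rng.random() ** (1.0 / theta)
        w = s_inv(u * s(w + 1.0))
    return w

theta = 1.0
identity = lambda x: x
rng = random.Random(1)
# The bias of W_n decays geometrically (at rate theta/(theta+1) per step),
# so a modest number of steps suffices.
draws = [iterate_W(theta, identity, identity, 40, rng) for _ in range(20_000)]
m_hat = sum(draws) / len(draws)
print(round(m_hat, 2))  # should be close to theta = 1
```

For a general invertible utility, only `s` and `s_inv` change; this is exactly the simulation scheme whose accuracy the bounds of Section 3.2 quantify.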
Lemma 3.2.
Let $\theta > 0$ and Condition 3.1 be in force. Let $W_0$ and $V_0$ be given non-negative random variables such that the means of $s(W_0)$ and $s(V_0)$ are finite. For $n \ge 1$, let $s(V_n)$ and $s(W_n)$ have distributions as specified in (55). Then $s(W_n)$ and $s(V_n)$ have finite mean for all $n \ge 0$, and
$d_1(s(W_n), s(V_n)) \le \left(\frac{\theta}{\theta + 1}\right)^n d_1(s(W_0), s(V_0))$ for all $n \ge 0$.  (63)

Proof. By (59) of Lemma 3.1, the existence of $E[s(W_n)]$ implies the existence of $E[s(W_{n+1})]$. Now induction and the assumption that $E[s(W_n)]$ is finite for $n = 0$ prove the expectation is finite for all $n \ge 0$.

We prove (63) by induction; it clearly holds for $n = 0$. Assuming it holds for some $n \ge 0$, let the joint distribution of $(s(V_n), s(W_n))$ achieve the infimum in (8). Then, independently constructing $U_n \sim \mathcal{U}[0,1]$ on the same space as $s(V_n)$ and $s(W_n)$, the pair $s(W_{n+1}), s(V_{n+1})$ given by (55) are defined on the same space, have the desired marginals, and satisfy
$|s(V_{n+1}) - s(W_{n+1})| \le U_n^{1/\theta}|s(V_n) - s(W_n)|$.
Hence, by the independence of $s(W_n)$ and $s(V_n)$ from $U_n$ and definition (8) of the $d_1$ metric, we obtain
$d_1(s(W_{n+1}), s(V_{n+1})) \le E[U_n^{1/\theta}|s(V_n) - s(W_n)|] = \frac{\theta}{\theta + 1} E|s(V_n) - s(W_n)|$,
and applying the induction hypothesis, we obtain (63).

Define the generalized inverse of an increasing function $s : [0,\infty) \to [0,\infty)$ as
$s^{-}(x) = \inf\{y : s(y) \ge x\}$  (64)
with the convention that $\inf \emptyset = \infty$. In particular, for $X$ a random variable, we consider $s^{-}(X)$ as a random variable taking values in the extended real line. When writing the stochastic order relation $V \le_{st} W$ between two extended-valued random variables, we mean that $P(V \ge t) \le P(W \ge t)$ holds for all $t$ in the extended real line. Note that $s^{-}(\cdot)$ and $s^{-1}(\cdot)$ coincide on the range of $s(\cdot)$ when $s(\cdot)$ is continuous and strictly increasing.

Theorem 3.3.
Let $\theta > 0$ and $s(\cdot)$ satisfy Condition 3.1. Then there exists a unique distribution $\mathcal{D}_{\theta,s}$ for a random variable $D_{\theta,s}$ such that $s(D_{\theta,s})$ has finite mean and satisfies $D_{\theta,s} =_d D^*_{\theta,s}$, with $D^*_{\theta,s}$ given by (19). In addition, $D_{\theta,s} \le_{st} s^{-}(D_\theta)$.

Proof. Generate a sequence $W_n$, $n \ge 0$ by (56) with initial value $W_0 = 0$. We first prove that a distributional fixed point to the transformation (19) exists by showing the existence of a distribution $\mathcal{D}_{\theta,s}$ and a subsequence $(n_k)_{k \ge 1}$ such that
$W_{n_k} \to_d D_{\theta,s}$ and $W_{n_k+1} \to_d D_{\theta,s}$ as $k \to \infty$, and $W_{n_k+1} =_d W^*_{n_k}$.  (65)
By Lemma 3.1 and the fact that $s(W_0) = s(0) = 0$, we have $s(W_n) \le_{st} D_\theta$ for all $n \ge 0$. As $0 \le s(W_n) \le_{st} D_\theta$, the sequence $s(W_n)$, $n \ge 0$ is tight, so there exists a subsequence along which $s(W_{n_k}) \to_d E_{\theta,s}$ for some distribution $E_{\theta,s}$. As $s(\cdot)$ is invertible, $W_{n_k} \to_d D_{\theta,s}$ where $D_{\theta,s} =_d s^{-1}(E_{\theta,s})$, proving the first claim in (65). As weak limits preserve stochastic order, $E_{\theta,s} \le_{st} D_\theta$ and hence $D_{\theta,s} \le_{st} s^{-}(D_\theta)$, as $s(\cdot)$ increasing implies that $s^{-}(\cdot)$ given by (64) is also increasing. The last claim of the theorem is shown.

Let the sequence $V_n$, $n \ge 0$ be given as $W_n$ is in (56), with initial value $V_0 =_d W_1$ and $V_0$ independent of $U_n$, $n \ge 0$. Note that $s(V_0)$ has finite mean by Lemma 3.2, and hence (63) may be invoked to conclude that $d_1(s(W_n), s(V_n)) \to 0$ as $n \to \infty$. As $s(W_{n_k}) \to_d E_{\theta,s}$, we have $s(V_{n_k}) \to_d E_{\theta,s}$, hence $V_{n_k} \to_d D_{\theta,s}$. As $V_0 =_d W_1$, we have $V_n =_d W_{n+1}$, implying $W_{n_k+1} \to_d D_{\theta,s}$. The second claim in (65) is shown. The third claim holds by (56) and by definition (57) of the Dickman bias transform.

By the first claim in (65) and the continuity of $s(\cdot)$ and $s^{-1}(\cdot)$, letting $U \sim \mathcal{U}[0,1]$ be independent of $D_{\theta,s} \sim \mathcal{D}_{\theta,s}$ and $W_{n_k}$, as $k \to \infty$ we have
$W^*_{n_k} =_d s^{-1}\left(U^{1/\theta} s(W_{n_k} + 1)\right) \to_d s^{-1}\left(U^{1/\theta} s(D_{\theta,s} + 1)\right) =_d D^*_{\theta,s}$.
Hence, letting $k \to \infty$ in the third relation of (65), we obtain $D_{\theta,s} =_d D^*_{\theta,s}$, showing that $D_{\theta,s}$ is a fixed point of the Dickman bias transformation (57).

Now let $W$ and $V$ be any two fixed points of the transformation such that $s(W)$ and $s(V)$ have finite mean. Then the distributions of $s(W_n)$ and $s(V_n)$ do not depend on $n$, and (63) yields
$d_1(s(W), s(V)) = d_1(s(W_n), s(V_n)) \to 0$ as $n \to \infty$.
Hence $s(W) =_d s(V)$, and applying $s^{-1}$ we conclude $W =_d V$; the fixed point is unique.

3.2 $D_{\theta,s}$ approximation and simulations

In this section we study the accuracy of recursive methods to approximately sample from the $D_{\theta,s}$ family, starting with the following simple corollary to Lemma 3.2 that gives a bound on how well the utility $s(W_n)$, satisfying the recursion (55), approximates the long term utility of the fixed point.

Corollary 3.1.
Let $\theta > 0$ and Condition 3.1 be in force. Then $s(W_n)$ given by (55) satisfies
$d_1(s(W_n), s(D_{\theta,s})) \le \left(\frac{\theta}{\theta + 1}\right)^n d_1(s(W_0), s(D_{\theta,s}))$ for all $n \ge 0$.

Proof. The result follows from (63) of Lemma 3.2 by taking $V_0 =_d D_{\theta,s}$ and noting that $D_{\theta,s}$ is fixed by the transformation (57), so that $s(V_n) =_d s(D_{\theta,s})$ for all $n$.

Corollary 3.1 depends on the direct coupling in Lemma 3.2, which constructs the variables $s(W_n)$ and $s(V_n)$ on the same space. Theorem 3.4 below gives a bound for when a non-negative random variable $W$ is used to approximate the distribution of $D_{\theta,s}$. Though direct coupling can still be used to obtain bounds such as those in Theorem 3.4 for the $D_\theta$ family, doing so is no longer possible for the more general $D_{\theta,s}$ family, as iterates of (56) can no longer be written explicitly when $s(\cdot)$ is non-linear. Theorem 3.4 below provides a Wasserstein bound between $D_{\theta,s}$ and $W$ assuming certain natural conditions on the function $s(\cdot)$.

For $\theta > 0$, suppressed in the notation, and $x > 0$ such that $s'(x)$ exists, let
$I(x) = \frac{\theta s'(x)}{s^{\theta+1}(x)} \int_0^x s^\theta(v)\, dv$.  (66)
For $S \subset [0,\infty)$, we say a function $f : [0,\infty) \to [0,\infty)$ is locally absolutely continuous on $S$ if it is absolutely continuous when restricted to any compact sub-interval of $S$. Unless otherwise stated, local absolute continuity will mean over the domain of $f(\cdot)$.

Theorem 3.4. Let $\theta > 0$ and $s : [0,\infty) \to [0,\infty)$ satisfying Condition 3.1 be locally absolutely continuous on $[0,\infty)$ and such that $E[D_{\theta,s}] < \infty$. With $I(\cdot)$ as in (66), if there exists $\rho \in [0,1)$ such that
$\|I\|_\infty \le \rho$,  (67)
then for any non-negative random variable $W$ with finite mean,
$d_1(W, D_{\theta,s}) \le (1 - \rho)^{-1} d_1(W^*, W)$.  (68)
In the special case $s(x) = x$, $\|I\|_\infty = \theta/(\theta + 1) \in [0,1)$, and one may take $\rho$ equal to this value.

Remark 3.1.
Note that $E[s^{-}(D_\theta)] < \infty$ implies $E[D_{\theta,s}] < \infty$, as $D_{\theta,s} \le_{st} s^{-}(D_\theta)$ by Theorem 3.3.

Remark 3.2. By a simple argument, similar to the one in Section 3 of [21], for $\theta > 0$ and $s : [0,\infty) \to [0,\infty)$ satisfying Condition 3.1, (71) below and $E[D_{\theta,s}] < \infty$, for any non-negative random variable $W$ with finite mean we have
$d_1(W, D_{\theta,s}) \le (1 + \theta)\, d_1(W^*, W)$,
so that (68) holds with $\rho = \theta/(\theta + 1)$.

The use of Stein's method in Theorem 3.4 does not require that $s(\cdot)$ satisfy (71), but does need $s(\cdot)$ to be locally absolutely continuous. In addition, the alternative approach in [21] has no scope for improvement in terms of finding the best constant $\rho$; Example 3.2 presents a case where taking $\rho = \theta/(\theta + 1)$ is not optimal. Theorem 3.7 below gives a verifiable criterion by which one can show when the canonical choice $\rho = \theta/(\theta + 1)$ is not improvable.

We will prove Theorem 3.4 using Stein's method in Section 4. Here, we provide the following corollary applicable for the simulation of $D_{\theta,s}$ distributed random variables. Note that when $s(\cdot)$ is strictly increasing and continuous, for $W$ independent of $U \sim \mathcal{U}[0,1]$ the transform $W^*$ as given by (19) satisfies
$W^* =_d s^{-1}\left(U^{1/\theta} s(W + 1)\right) \le W + 1$.  (69)

Corollary 3.2.
Let $s : [0,\infty) \to [0,\infty)$ be as in Theorem 3.4 and let $\{W_n, n \ge 0\}$ be generated by (56) with $W_0$ non-negative and $EW_0 < \infty$, independent of $\{U_n, n \ge 0\}$. If $\rho \in [0,1)$ exists satisfying (67), then
$d_1(W_n, D_{\theta,s}) \le (1 - \rho)^{-1} d_1(W_{n+1}, W_n)$.  (70)
Moreover, if $s(\cdot)$ satisfies
$|s^{-1}(a s(x)) - s^{-1}(a s(y))| \le a|x - y|$ for $a \in [0,1]$ and $x, y \ge 0$,  (71)
then
$d_1(W_n, D_{\theta,s}) \le (1 - \rho)^{-1}\left(\frac{\theta}{\theta + 1}\right)^n d_1(W_1, W_0)$.  (72)
When $W_0 = 0$,
$d_1(W_n, D_{\theta,s}) \le (1 - \rho)^{-1}\left(\frac{\theta}{\theta + 1}\right)^n E[s^{-1}(U^{1/\theta})]$,  (73)
and in the particular case of the generalized Dickman $D_\theta$ family,
$d_1(W_n, D_\theta) \le \theta\left(\frac{\theta}{\theta + 1}\right)^n$.  (74)

Proof.
Identity (56), the inequality in (69) and induction show that $W_n \le W_0 + n$, and hence $EW_n < \infty$, for all $n \ge 0$. Inequality (70) now follows from Theorem 3.4, noting from (19) that $W^*_n =_d W_{n+1}$ for all $n \ge 0$.

Next, for $n \ge 1$, take $W'_{n-1}$ and $V'_n$ independent of $U_n$ such that $W'_{n-1} =_d W_{n-1}$, $V'_n =_d W_n$ and $E|V'_n - W'_{n-1}| = d_1(W_n, W_{n-1})$. Now, letting $W''_n = s^{-1}(U_n^{1/\theta} s(W'_{n-1} + 1))$ and $V''_{n+1} = s^{-1}(U_n^{1/\theta} s(V'_n + 1))$, we have $W''_n =_d W_n$ and $V''_{n+1} =_d W_{n+1}$. Thus, using (8) followed by (71), we have
$d_1(W_{n+1}, W_n) \le E|V''_{n+1} - W''_n| = E|s^{-1}(U_n^{1/\theta} s(V'_n + 1)) - s^{-1}(U_n^{1/\theta} s(W'_{n-1} + 1))| \le E[U_n^{1/\theta}|V'_n - W'_{n-1}|] = \frac{\theta}{\theta + 1}\, d_1(W_n, W_{n-1})$.
Induction now yields
$d_1(W_{n+1}, W_n) \le \left(\frac{\theta}{\theta + 1}\right)^n d_1(W_1, W_0)$,
and applying (70) we obtain (72).

Inequality (73) now follows from (72), noting in this case, using $s(1) = 1$, that $(W_0, W_1) = (0, s^{-1}(U_0^{1/\theta}))$, and (74) is now achieved from (73) by taking $\rho$ to be $\theta/(\theta + 1)$, as provided by Theorem 3.4 when $s(x) = x$.

In the remainder of this subsection, in Lemma 3.6 we present some general and easily verifiable conditions on $s(\cdot)$ for the satisfaction of (71), and in Theorem 3.7 ones under which the integral bound $\|I\|_\infty \le \rho$ in (67) holds with $\rho \in [0,1)$.

Condition 3.2. The function $s : [0,\infty) \to [0,\infty)$ is continuous at $0$, strictly increasing with $s(0) = 0$ and $s(1) = 1$, and concave.
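The integral quantity $I(x)$ of (66) that controls $\rho$ is easy to evaluate numerically. The sketch below (quadrature resolution and test points are illustrative choices) checks that $I(x)$ is identically $\theta/(\theta+1)$ for the identity utility, and that it stays below $\theta/(\theta+1)$ for a concave utility such as the scaled exponential of Example 3.1 below, as Theorem 3.7 below guarantees.

```python
# Numerical sketch of I(x) from (66) (quadrature resolution and test points
# are illustrative): I(x) = theta * s'(x) * s(x)^{-(theta+1)} * int_0^x s(v)^theta dv.
import math

def I_of_x(theta, s, s_prime, x, steps=20_000):
    """Midpoint-rule evaluation of the integral quantity I(x)."""
    h = x / steps
    integral = sum(s((i + 0.5) * h) ** theta for i in range(steps)) * h
    return theta * s_prime(x) * integral / s(x) ** (theta + 1)

theta = 1.0
# Identity utility: I(x) = theta/(theta+1) = 0.5 for every x.
print(round(I_of_x(theta, lambda v: v, lambda v: 1.0, 3.7), 3))  # -> 0.5

# Scaled exponential utility (as in Example 3.1), alpha = 2 illustrative.
alpha = 2.0
c = 1.0 - math.exp(-alpha)
s = lambda v: (1.0 - math.exp(-alpha * v)) / c
sp = lambda v: alpha * math.exp(-alpha * v) / c
print(all(I_of_x(theta, s, sp, x) <= theta / (theta + 1.0)
          for x in (0.1, 0.5, 1.0, 2.0, 5.0)))  # -> True
```

Such a check is a quick way to propose a valid $\rho < 1$ for a candidate utility before invoking the bounds (68) and (70).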
Lemma 3.5. If a function $f : [0,\infty) \to [0,\infty)$ is increasing, continuous at $0$ and locally absolutely continuous on $(0,\infty)$, then it is locally absolutely continuous on its domain.

Proof. Since $f(\cdot)$ is absolutely continuous on any compact subset of $(0,\infty)$, by continuity of $f(\cdot)$ at $0$, for $0 < \epsilon \le x < \infty$, using absolute continuity on $[\epsilon, x]$ in the second equality and monotone convergence in the third, we have
$f(x) - f(0) = \lim_{\epsilon \downarrow 0}(f(x) - f(\epsilon)) = \lim_{\epsilon \downarrow 0} \int_\epsilon^x f'(v)\, dv = \int_0^x f'(v)\, dv$.
Hence $f(\cdot)$ is locally absolutely continuous on its domain.

Lemma 3.6. If $s : [0,\infty) \to [0,\infty)$ satisfies Condition 3.2, then it is locally absolutely continuous on $[0,\infty)$, satisfies Condition 3.1 and
$|s^{-1}(a s(y)) - s^{-1}(a s(x))| \le a|y - x|$ for all $x, y \ge 0$ and $a \in [0,1]$.  (75)

Proof. First, since $s(\cdot)$ is concave, it is locally absolutely continuous on $(0,\infty)$. Thus, by Lemma 3.5, $s(\cdot)$ is locally absolutely continuous on its domain. Next we show $s(\cdot)$ is subadditive, that is, that
$s(x + y) \le s(x) + s(y)$ for $x, y \ge 0$.  (76)
Taking $x, y \ge 0$, we may assume both $x$ and $y$ are non-zero, as (76) is trivial otherwise since $s(0) = 0$. By concavity,
$\frac{y}{x+y} s(0) + \frac{x}{x+y} s(x + y) \le s(x)$ and $\frac{x}{x+y} s(0) + \frac{y}{x+y} s(x + y) \le s(y)$.
Since $s(0) = 0$, adding these two inequalities yields (76). Taking $y = 1$ and using $s(1) = 1$, we obtain (61). Next, the local absolute continuity and concavity of $s(\cdot)$ on $[0,\infty)$ imply that it is almost everywhere differentiable on this domain, with $s'(\cdot)$ decreasing almost everywhere. Thus for $x \ge y \ge 0$, we have
$s(x+1) - s(x) = \int_x^{x+1} s'(u)\, du \le \int_x^{x+1} s'(u + y - x)\, du = \int_y^{y+1} s'(u)\, du = s(y+1) - s(y)$,
which together with the fact that $s(\cdot)$ is increasing implies (62). Hence $s(\cdot)$ satisfies Condition 3.1.

Lastly, we show that $s(\cdot)$ satisfies (75). Since $s(0) = 0$, the inequality is trivially satisfied for $a = 0$, so fix some $a \in (0,1]$, and note the claim is trivial when $x = y$; without loss, let $0 \le x < y$. The inverse function $r(\cdot) = s^{-1}(\cdot)$ is continuous at zero and convex on the range $S$ of $s(\cdot)$, a possibly unbounded convex subset of $[0,\infty)$ that includes the origin. Letting $u = s(x)$ and $v = s(y)$, as $s(\cdot)$, and hence $r(\cdot)$, are strictly increasing and $x \ne y$, inequality (75) may be written
$r(av) - r(au) \le a(r(v) - r(u))$, or equivalently $\frac{r(av) - r(au)}{av - au} \le \frac{r(v) - r(u)}{v - u}$,  (77)
where all arguments of $r(\cdot)$ in (77) lie in $S$, it being a convex set containing $\{0, u, v\}$.

The second inequality in (77) follows from the following slightly more general one, valid for any convex function $r : [0,\infty) \to [0,\infty)$ which is continuous at $0$, by virtue of its local absolute continuity and a.e. derivative $r'(\cdot)$ being increasing: if $(u_1, v_1)$ and $(u_2, v_2)$ are such that $u_i < v_i$, $u_1 \le u_2$ and $v_1 \le v_2$, and all these values lie in the range of $r(\cdot)$, then
$\frac{r(v_1) - r(u_1)}{v_1 - u_1} = \frac{1}{v_1 - u_1}\int_{u_1}^{v_1} r'(w)\, dw = \int_0^1 r'(u_1 + (v_1 - u_1)w)\, dw \le \int_0^1 r'(u_2 + (v_2 - u_2)w)\, dw = \frac{1}{v_2 - u_2}\int_{u_2}^{v_2} r'(w)\, dw = \frac{r(v_2) - r(u_2)}{v_2 - u_2}$,
as one easily has that $u_1 + (v_1 - u_1)w \le u_2 + (v_2 - u_2)w$ for all $w \in [0,1]$.

When $s(\cdot)$ is nice enough, we can actually say more about the constant $\rho$ in (67) of Theorem 3.4.

Theorem 3.7.
Assume that $\theta > 0$ and $s : [0,\infty) \to [0,\infty)$ is concave and continuous at $0$. Then with $I(x)$ as given in (66),
$\|I\|_\infty \le \frac{\theta}{\theta + 1}$.  (78)
If moreover $s(\cdot)$ is strictly increasing with $s(0) = 0$ and $\lim_{n \to \infty} s'(x_n) < \infty$ for some sequence of distinct real numbers $x_n \downarrow 0$ in the domain of $s'(\cdot)$, then
$\|I\|_\infty = \frac{\theta}{\theta + 1}$.  (79)

Proof.
Since $s(\cdot)$ is concave and continuous at $0$, it is locally absolutely continuous with $s'(\cdot)$ decreasing almost everywhere on $[0,\infty)$. Since $u^{\theta+1}$ is Lipschitz on any compact interval, by composition $s^{\theta+1}(\cdot)$ is absolutely continuous on $[0,x]$ for any $x \ge 0$, and thus for almost every $x$,
$\frac{(\theta+1) I(x)}{\theta} = \frac{(\theta+1) s'(x)}{s^{\theta+1}(x)}\int_0^x s^\theta(v)\, dv \le \frac{1}{s^{\theta+1}(x)}\int_0^x (\theta+1) s^\theta(v) s'(v)\, dv = \frac{s^{\theta+1}(x) - s^{\theta+1}(0)}{s^{\theta+1}(x)} \le 1$,
proving (78).

To prove the second claim, first note that $0 < \lim_{n \to \infty} s'(x_n) < \infty$, the existence of the limit and the second inequality holding by assumption, and the first inequality holding as $s(\cdot)$ is strictly increasing and $s'(\cdot)$ is decreasing almost everywhere. Thus, in the second equality using a version of the Stolz–Cesàro theorem [37] adapted to accommodate $s^{\theta+1}(x_n)$ decreasing to zero,
$\lim_{n \to \infty} I(x_n) = \theta \lim_{n \to \infty} s'(x_n) \lim_{n \to \infty} \frac{\int_0^{x_n} s^\theta(v)\, dv}{s^{\theta+1}(x_n)} = \theta \lim_{n \to \infty} s'(x_n) \lim_{n \to \infty} \frac{\int_{x_{n+1}}^{x_n} s^\theta(v)\, dv}{s^{\theta+1}(x_n) - s^{\theta+1}(x_{n+1})} = \theta \lim_{n \to \infty} s'(x_n) \lim_{n \to \infty} \frac{\int_{x_{n+1}}^{x_n} s^\theta(v)\, dv}{(\theta+1)\int_{x_{n+1}}^{x_n} s^\theta(v) s'(v)\, dv} = \frac{\theta}{\theta+1}\cdot\frac{\lim_{n \to \infty} s'(x_n)}{\lim_{n \to \infty} s'(x_n)} = \frac{\theta}{\theta+1}$,
where for the last equality we note that, since $s'(\cdot)$ is decreasing,
$s'(x_n) \int_{x_{n+1}}^{x_n} s^\theta(v)\, dv \le \int_{x_{n+1}}^{x_n} s^\theta(v) s'(v)\, dv \le s'(x_{n+1}) \int_{x_{n+1}}^{x_n} s^\theta(v)\, dv$ and $\lim_{n \to \infty} s'(x_{n+1}) = \lim_{n \to \infty} s'(x_n) \in (0, \infty)$.
Hence
$\|I\|_\infty \ge \frac{\theta}{\theta+1}$,
which together with (78) proves (79).

The bound (74) of Corollary 3.2 is obtained by specializing results for the $D_{\theta,s}$ family, proven using the tools of Stein's method, to the case where $s(x) = x$. For this special case, letting $V_j = U_j^{1/\theta}$ for $j \ge 0$, the iterates of the recursion (56), starting at $W_0 = 0$, can be written explicitly as
$W_n = \sum_{k=0}^{n-1} \prod_{j=k}^{n-1} V_j$,
allowing one to obtain bounds using direct coupling. Interestingly, the results obtained by both methods agree, as seen as follows. First, we show $W_n =_d Y_n$ where
$Y_n = \sum_{k=0}^{n-1} \prod_{j=0}^{k} V_j$, and $Y_\infty \sim \mathcal{D}_\theta$ where $Y_\infty = \sum_{k=0}^{\infty} \prod_{j=0}^{k} V_j$.
The first claim is true since for every $n \ge 1$,
$(V_0, \ldots, V_{n-1}) =_d (V_{n-1}, \ldots, V_0)$.
For the second claim, note that the limit $Y_\infty$ exists almost everywhere and has finite mean by monotone convergence. Now, using definition (3), with $U_{-1} \sim \mathcal{U}[0,1]$ independent of $U_0, U_1, \ldots$ and setting $V_{-1} = U_{-1}^{1/\theta}$, we have
$Y^*_\infty = U_{-1}^{1/\theta}(Y_\infty + 1) = V_{-1}\left(\sum_{k=0}^\infty \prod_{j=0}^k V_j + 1\right) = \sum_{k=0}^\infty \prod_{j=-1}^k V_j + V_{-1} = \sum_{k=-1}^\infty \prod_{j=-1}^k V_j =_d \sum_{k=0}^\infty \prod_{j=0}^k V_j = Y_\infty$.
Hence $Y_\infty \sim \mathcal{D}_\theta$. As $(Y_n, Y_\infty)$ is a coupling of a variable with the $W_n$ distribution to one with the $\mathcal{D}_\theta$ distribution, by (8) we obtain
$d_1(W_n, D_\theta) = d_1(Y_n, Y_\infty) \le E|Y_\infty - Y_n| = E\left[\sum_{k=n}^\infty \prod_{j=0}^k V_j\right] = \sum_{k=n}^\infty \left(\frac{\theta}{\theta+1}\right)^{k+1} = \theta\left(\frac{\theta}{\theta+1}\right)^n$,
in agreement with (74).

3.3 Examples

We now consider three new distributions that arise as special cases of the $D_{\theta,s}$ family. Expected Utility (EU) theory has long been considered an acceptable paradigm for decision making under uncertainty by researchers in both economics and finance, see e.g. [17]. To obtain tractable solutions to many problems in economics, one often restricts the EU criterion to a certain class of utility functions, which includes in particular the ones in Examples 3.1 and 3.3. In these two examples we apply the bounds provided in Corollary 3.2 for the simulation of the limiting distributions these functions give rise to via the recursion (56) with, say, $W_0 = 0$. For each example we will verify Condition 3.2, implying Condition 3.1 by Lemma 3.6, and hence existence and uniqueness of $D_{\theta,s}$.

Example 3.1.
The exponential utility function $u(x) = 1 - e^{-\alpha x}$ is the only model, up to linear transformations, exhibiting constant absolute risk aversion (CARA), see [17]. Since utility is unique up to linear transformations, we consider its scaled version
$s_\alpha(x) = \frac{1 - e^{-\alpha x}}{1 - e^{-\alpha}}$ for $x \ge 0$,
characterized by a parameter $\alpha > 0$. Clearly $s_\alpha(\cdot)$ is continuous at $0$, strictly increasing with $s_\alpha(0) = 0$ and $s_\alpha(1) = 1$, and concave. Since $\lim_{x \downarrow 0} s'_\alpha(x) = \alpha(1 - e^{-\alpha})^{-1} \in (0, \infty)$ for all $\theta > 0$, by (79) of Theorem 3.7 one can take $\rho$ to be $\theta/(\theta+1)$ and not strictly smaller, and (73) of Corollary 3.2 yields
$d_1(W_n, D_{\theta,s_\alpha}) \le \theta\left(\frac{\theta}{\theta+1}\right)^{n-1}$ for all $n \ge 1$,
using that $0 \le s_\alpha^{-1}(U^{1/\theta}) \le s_\alpha^{-1}(1) = 1$ almost surely.

Letting $W_\alpha \sim \mathcal{D}_{\theta,s_\alpha}$, it is easy to verify that
$s_\alpha(W_\alpha) =_d U^{1/\theta} s_\alpha(W_\alpha + 1) = U^{1/\theta}(1 + e^{-\alpha} s_\alpha(W_\alpha))$.
Using this identity, that Theorem 3.3 gives $0 \le s_\alpha(W_\alpha) \le_{st} D_\theta$ for all $\alpha > 0$, and that $\lim_{\alpha \downarrow 0} s_\alpha(x) = x$ for all $x \ge 0$, one can show that $W_\alpha$ converges to $D_\theta$ as $\alpha \downarrow 0$. Hence, now setting $s_0(x) = x$, the family of models $\mathcal{D}_{\theta,s_\alpha}$, $\alpha \ge 0$, is parameterized by a tuneable value $\alpha \ge 0$ that may be chosen depending on a desired level of risk aversion, including the canonical $\alpha = 0$ case where utility is linear.
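Example 3.1 can be simulated directly, since $s_\alpha^{-1}(y) = -\log(1 - y(1 - e^{-\alpha}))/\alpha$ is available in closed form. Taking expectations in the identity $s_\alpha(W_\alpha) =_d U^{1/\theta}(1 + e^{-\alpha} s_\alpha(W_\alpha))$ above, with $U$ independent of $W_\alpha$, gives $E[s_\alpha(W_\alpha)] = \theta/(\theta + 1 - \theta e^{-\alpha})$, which the sketch below verifies by iterating (56) ($\alpha$ and the sample sizes are illustrative choices, not from the paper):

```python
# Sampling from D_{theta,s_alpha} of Example 3.1 via recursion (56), using the
# closed-form inverse s_alpha^{-1}(y) = -log(1 - y(1 - e^{-alpha}))/alpha.
# Taking expectations in s_alpha(W) =_d U^{1/theta}(1 + e^{-alpha} s_alpha(W))
# gives E[s_alpha(W)] = theta/(theta + 1 - theta*e^{-alpha}); alpha and the
# sample sizes below are illustrative choices.
import math
import random

theta, alpha = 1.0, 2.0
c = 1.0 - math.exp(-alpha)
s = lambda x: (1.0 - math.exp(-alpha * x)) / c
s_inv = lambda y: -math.log(1.0 - y * c) / alpha

def draw(rng, steps=60):
    """Approximate draw from D_{theta,s_alpha} by iterating (56) from W_0 = 0."""
    w = 0.0
    for _ in range(steps):
        u = rng.random() ** (1.0 / theta)
        w = s_inv(u * s(w + 1.0))
    return w

rng = random.Random(2)
m = sum(s(draw(rng)) for _ in range(20_000)) / 20_000
target = theta / (theta + 1.0 - theta * math.exp(-alpha))
print(round(m, 3), round(target, 3))  # empirical vs. exact mean of s_alpha(W)
```

By Corollary 3.2 the distributional error of such iterates decays geometrically in the number of steps, so the burn-in above is more than sufficient.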
Here we show how standard Vervaat perpetuity models can be seen to assume an implicit concave utility function, and how uncertainty in these utilities can be accommodated using the new families we introduce. Indeed, letting $\theta = 1$ in (18) and then $s_\theta(x) = x^\theta$, $\theta \in (0, 1]$, it is easy to see that $\mathcal{D}_{1, s_\theta} = \mathcal{D}_\theta$. To model situations where these utilities are themselves subject to uncertainty, we may let $A$ be a random variable supported in $(0, 1]$ and consider the mixture $s(x) = E[s_A(x)]$.

More formally, for some $0 < a \le 1$, let $\mu$ be a probability measure on the interval $(0, a]$, and define
\[
s(x) = \int_0^a s_\alpha(x) \, d\mu(\alpha).
\]
Since $0 < a \le 1$, each $s_\alpha(\cdot)$ is concave and satisfies Condition 3.2, and hence so does $s(\cdot)$. By (78) of Theorem 3.7, for the family $\mathcal{D}_{\theta,s}$ one can take $\rho = \theta/(\theta+1)$.

Fix $l > 0$. For $x \ge l$, note that $\partial x^\alpha/\partial x = \alpha x^{\alpha-1} \le \alpha l^{\alpha-1}$, which is bounded and hence $\mu$-integrable on $[0, a]$. Thus by dominated convergence, since $l > 0$ is arbitrary, we obtain
\[
s'(x) = \int_0^a \frac{\partial x^\alpha}{\partial x} \, d\mu(\alpha) = \int_0^a \alpha x^{\alpha-1} \, d\mu(\alpha) \quad \text{for all } x > 0. \tag{80}
\]
Now note that for $a < 1$, $\lim_{x \downarrow 0} s'(x)$ diverges to infinity, and hence (79) of Theorem 3.7 cannot be invoked. We show, in fact, that one may obtain a bound better than $\theta/(\theta+1)$ in this case.

Taking $\theta = 1$ and computing $I(x)$ directly from (66), using (80) for the first equality and Fubini's theorem for the second, we have
\[
I(x) = \frac{\big[\int_0^a \alpha x^{\alpha-1} \, d\mu(\alpha)\big]\big[\int_0^x \int_0^a v^\alpha \, d\mu(\alpha) \, dv\big]}{\big[\int_0^a x^\alpha \, d\mu(\alpha)\big]^2}
= \frac{\big[\int_0^a \alpha x^{\alpha-1} \, d\mu(\alpha)\big]\big[\int_0^a \frac{x^{\alpha+1}}{\alpha+1} \, d\mu(\alpha)\big]}{\big[\int_0^a x^\alpha \, d\mu(\alpha)\big]^2}
= \frac{\int_0^a \int_0^a \frac{\alpha}{\beta+1} x^{\alpha+\beta} \, d\mu(\alpha) \, d\mu(\beta)}{\big[\int_0^a x^\alpha \, d\mu(\alpha)\big]^2}
= \frac{\int_0^a \int_0^a \frac{1}{2}\big(\frac{\alpha}{\beta+1} + \frac{\beta}{\alpha+1}\big) x^{\alpha+\beta} \, d\mu(\alpha) \, d\mu(\beta)}{\int_0^a \int_0^a x^{\alpha+\beta} \, d\mu(\alpha) \, d\mu(\beta)}
\le \sup_{\alpha, \beta \in [0,a]} \frac{1}{2}\Big(\frac{\alpha}{\beta+1} + \frac{\beta}{\alpha+1}\Big).
\]
Taking $0 \le \alpha \le \beta \le a$, the reverse case being handled similarly, the simple fact that $(\beta - \alpha)^2 \le \beta - \alpha$ for $0 \le \alpha \le \beta \le 1$ shows that
\[
\frac{1}{2}\Big(\frac{\alpha}{\beta+1} + \frac{\beta}{\alpha+1}\Big) \le \frac{\beta}{\beta+1} \le \frac{a}{a+1},
\]
and hence one can take $\rho = a/(a+1)$. Note that when $a = 1/2$, say, we obtain the upper bound $\rho = 1/3$, whereas the bound (78) of Theorem 3.7 gives $1/2$ when $\theta = 1$. Taking $\mu$ to be unit mass at $1$ yields $\rho = 1/2$, which recovers the bound on $\rho$ for the standard Dickman distribution derived in [21], and as given in Theorem 3.4, for the value $\theta = 1$.

Example 3.3.
The logarithm $u(x) = \log x$ is another commonly used utility function, as it exhibits constant relative risk aversion (CRRA), which often simplifies many problems encountered in macroeconomics and finance; see [17]. Applying a shift to make it non-negative, let
\[
s(x) = \log(x+1)/\log 2 \quad \text{for } x \ge 0.
\]
Clearly $s(\cdot)$ satisfies Condition 3.2. To apply Corollary 3.2 it remains to compute an upper bound $\rho$ on the integral in (66). Now since $\lim_{x \downarrow 0} s'(x) < \infty$, by (79) of Theorem 3.7 we may take $\rho = \theta/(\theta+1)$. Noting $s^{-1}(x) = 2^x - 1$, and simulating from this distribution by the recursion
\[
W_{n+1} = (W_n + 2)^{U_n^{1/\theta}} - 1 \quad \text{for } n \ge 0
\]
with initial value $W_0 = 0$, inequality (73) of Corollary 3.2 yields
\[
d_1(W_n, \mathcal{D}_{\theta,s}) \le \theta \Big(\frac{\theta}{\theta+1}\Big)^{n-1} \quad \text{for all } n \ge 1,
\]
using that $0 \le s^{-1}(U^{1/\theta}) = 2^{U^{1/\theta}} - 1 \le 1$ almost surely.

In this section we turn to proving Theorem 4.7, from which Theorem 3.4 readily follows. We develop the necessary tools building on [20]. For notational simplicity, in this section, given $(\theta, s)$, let $t(x) = s^\theta(x)$ for all $x \ge$
$0$. \tag{81}

Throughout this section $t : [0, \infty) \to [0, \infty)$ will be strictly increasing, and hence almost everywhere differentiable by Lebesgue's Theorem (see e.g. Section 6.2 of [35]), inducing the measure $\nu$ satisfying $d\nu/dv = t'(v)$ on $[0, \infty)$, where $dv$ is Lebesgue measure. For $h \in L^1([0, a], \nu)$ for some $a >$
$0$, define the averaging operator
\[
A_x h = \frac{1}{t(x)} \int_0^x h(v) t'(v) \, dv \quad \text{for } x \in (0, a], \qquad \text{and} \qquad A_0 h = h(0) \quad (\text{recalling } t(0) = 0). \tag{82}
\]

Lemma 4.1.
Let $t : [0, \infty) \to [0, \infty)$ be a strictly increasing function. If $h \in L^1([0, a], \nu)$ for some $a > 0$, then $f(x) = A_x h$ satisfies
\[
\frac{t(x)}{t'(x)} f'(x) + f(x) = h(x) \quad \text{a.e. on } (0, a]. \tag{83}
\]
Conversely, if in addition $t(\cdot)$ is locally absolutely continuous on $[0, \infty)$ with $t(0) = 0$, and $f \in \bigcup_{\alpha \ge 0} \mathrm{Lip}_\alpha$, then the function $h(\cdot)$ defined by (83) is in $L^1([0, a], \nu)$ for all $a > 0$ and
\[
f(x) = A_x h \quad \text{for all } x \in (0, \infty). \tag{84}
\]

Proof.
The first claim follows from the definition (82) of $A_x h$ by differentiation. For the second claim, noting that the case $\alpha = 0$ is trivial, fix $\alpha >$
$0$. Since $t(\cdot)$ is locally absolutely continuous and increasing, for any $a > 0$,
\[
\Big| \int_0^a h(v) t'(v) \, dv \Big| \le \int_0^a \big( t(v)|f'(v)| + |f(v)| t'(v) \big) \, dv \le \alpha a t(a) + (|f(0)| + \alpha a) t(a) < \infty,
\]
and hence $h \in L^1([0, a], \nu)$ for all $a >$
$0$. Now note that the function $f(x)t(x)$ is locally absolutely continuous on $[0, \infty)$, since both $f(\cdot)$ and $t(\cdot)$ are locally absolutely continuous and, for any compact $C \subset (0, \infty)$, the function $g(u, v) = uv$ is Lipschitz on $f(C) \times t(C)$. Thus, for $x >$
$0$, we have
\[
A_x h = \frac{1}{t(x)} \int_0^x h(v) t'(v) \, dv = \frac{1}{t(x)} \int_0^x \big( t(v) f'(v) + t'(v) f(v) \big) \, dv = \frac{1}{t(x)} f(x) t(x) = f(x),
\]
using $t(0) = 0$. $\square$

Lemma 4.2. Let $t : [0, \infty) \to [0, \infty)$ be given by $t^{1/\theta}(\cdot) = s(\cdot)$ for $s(\cdot)$ a strictly increasing, locally absolutely continuous function on $[0, \infty)$ with $s(0) = 0$. Then $t(\cdot)$ is also locally absolutely continuous on $[0, \infty)$. Moreover, for $W$ a non-negative random variable and $W^*$ with distribution as in (19), for $h \in \bigcap_{a \in S} L^1([0, a], \nu)$ where $S$ is the support of $W + 1$,
\[
E[h(W^*)] = E[A_{W+1} h] \tag{85}
\]
whenever either expectation above exists, and letting $f(x) = A_x h$ for all $x \in S$,
\[
E\Big[\frac{t(W^*)}{t'(W^*)} f'(W^*) + f(W^*)\Big] = E[f(W+1)], \tag{86}
\]
when the expectation of either side exists.

Proof. Since $s(\cdot)$ is locally absolutely continuous on $[0, \infty)$ and the function $u^\theta$ is Lipschitz on any compact subset of $(0, \infty)$, we have that $t(\cdot)$ is locally absolutely continuous on $(0, \infty)$, and hence the first claim of the lemma follows by Lemma 3.5.

Next, as $A_x h$ exists for all $x \in S$ for any $h(\cdot)$ satisfying the hypotheses of the lemma, and $W^* \le_{st} W + 1$ by (69), the averages $A_{W+1} h$ and $A_{W^*} h$ both exist. Now let the expectation on the left hand side of (85) exist. Using (19) and (81) for the first equality and applying the change of variable $v = u t(W+1)$ in the resulting integral, we obtain
\[
E[h(W^*)] = E\big[h\big(t^{-1}[U t(W+1)]\big)\big] = E \int_0^1 h\big(t^{-1}[u t(W+1)]\big) \, du = E\Big[\frac{1}{t(W+1)} \int_0^{t(W+1)} h(t^{-1}(v)) \, dv\Big] = E\Big[\frac{1}{t(W+1)} \int_0^{W+1} h(w) t'(w) \, dw\Big] = E[A_{W+1} h],
\]
where in the second to last equality we have applied the change of variable $t(w) = v$ and the fact that $t(0) = 0$.
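The averaging identity (85) can be checked numerically in the simplest special case. Below is a minimal sketch, assuming $\theta = 1$ and $s(x) = x$, so that $t(x) = x$, $W^* = U(W+1)$ conditionally on $W$, and $A_x h = x^{-1}\int_0^x h(v)\,dv$; the test function $h$ and the conditioning value $w$ are arbitrary illustrative choices, and both sides are computed by elementary quadrature rather than simulation.

```python
# Sanity check of E[h(W*)] = E[A_{W+1} h] from (85), conditionally on W = w,
# in the special case theta = 1, s(x) = x.  Illustrative choices only.

def trapezoid(f, a, b, n=100000):
    """Trapezoidal rule for the integral of f over [a, b]."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step

def h(x):
    return x * x  # an arbitrary smooth test function

w = 1.5  # condition on W = w; both sides then equal (w+1)^2 / 3

# Left side of (85): E[h(U * (w + 1))] with U ~ Uniform[0, 1]
lhs = trapezoid(lambda u: h(u * (w + 1.0)), 0.0, 1.0)

# Right side of (85): A_{w+1} h = (1/(w+1)) * integral_0^{w+1} h(v) dv
rhs = trapezoid(h, 0.0, w + 1.0) / (w + 1.0)

print(lhs, rhs)  # both close to (2.5)^2 / 3 = 2.0833...
```

Both quadratures agree to within numerical error, matching the closed form $(w+1)^2/3$ obtained by reading the display above in either direction.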
When the expectation on the right hand side of (85) exists, we apply the same argument, reading the display above from right to left.

To prove the second claim of the lemma, by an argument similar to the one at the start of Section 3 of [20], the distribution of $U^{1/\theta} s(W+1)$ is absolutely continuous with respect to Lebesgue measure, with density, say, $p(\cdot)$. By a simple change of variable, we obtain that $W^*$ has density
\[
p_{W^*}(x) = p(s(x)) s'(x) \quad \text{almost everywhere},
\]
and hence the distribution of $W^*$ is also absolutely continuous with respect to Lebesgue measure. Thus by (83),
\[
E\Big[\frac{t(W^*)}{t'(W^*)} f'(W^*) + f(W^*)\Big] = E[h(W^*)],
\]
and (86) follows from the first claim. $\square$

For an a.e. differentiable function $f(\cdot)$, let
\[
D_t f(x) = \frac{t(x)}{t'(x)} f'(x) + f(x) - f(x+1). \tag{87}
\]
Note that if $f(x) = A_x g$ for some $g(\cdot)$, then under the conditions of Lemma 4.1, by (83) we may write (87) as
\[
D_t f(x) = g(x) - A_{x+1} g \quad \text{almost everywhere}. \tag{88}
\]
Condition 3.1 is assumed in some of the following statements to assure that the distribution $\mathcal{D}_{\theta,s}$ exists uniquely. The proof of the next lemma is omitted, as it follows using Lemmas 4.1 and 4.2, similar to the proof of Lemma 3.2 in [20].

Lemma 4.3.
Let $\theta > 0$ and $s(\cdot)$ satisfy Condition 3.1. If $s(\cdot)$ is locally absolutely continuous on $[0, \infty)$, then
\[
E[h(\mathcal{D}_{\theta,s})] = E[A_{\mathcal{D}_{\theta,s}+1} h] \quad \text{and} \quad E[D_t f(\mathcal{D}_{\theta,s})] = 0,
\]
for all $h(\cdot) \in \bigcap_{a \in (0,\infty)} L^1([0, a], \nu)$ and $f(\cdot) \in \bigcup_{\alpha \ge 0} \mathrm{Lip}_\alpha$ for which $E[D_t f(\mathcal{D}_{\theta,s})]$ exists, respectively.

The second claim of the lemma and (87) suggest the Stein equation
\[
\frac{t(x)}{t'(x)} f'(x) + f(x) - f(x+1) = h(x) - E[h(\mathcal{D}_{\theta,s})], \tag{89}
\]
which via (88) may be rewritten as
\[
g(x) - A_{x+1} g = h(x) - E[h(\mathcal{D}_{\theta,s})] \tag{90}
\]
whenever $g(\cdot)$ is such that $A_x g$ exists for all $x$ and $f(x) = A_x g$.

To prove Theorem 3.4, we first need to identify a set of broad sufficient conditions on $t(\cdot)$ under which we can find a nice solution $g(\cdot)$ to (90) when $h \in \mathrm{Lip}_{1,0}$, where, suppressing dependence on $\theta$ and $s(\cdot)$ for notational simplicity, for $\alpha >$
$0$ we let
\[
\mathrm{Lip}_{\alpha,0} = \big\{ h : [0, \infty) \to \mathbb{R} : h \in \mathrm{Lip}_\alpha, \; E[h(\mathcal{D}_{\theta,s})] = 0 \big\}. \tag{91}
\]
We note that the integral $I(x)$ in (66) can be written as the one appearing in (93) below when $t(x) = s^\theta(x)$ as in (81). Also note that, by Lemma 4.2, if $s(\cdot)$ is strictly increasing with $s(0) = 0$, then local absolute continuity of one of $s(\cdot)$ and $t(\cdot)$ implies that of the other. Hence, given that either one is locally absolutely continuous on $[0, \infty)$, as any continuous function $h : [0, \infty) \to \mathbb{R}$ is bounded on $[0, a]$ for all $a \ge$
$0$, we have $h \in \bigcap_{a > 0} L^1([0, a], \nu)$. As the integrability of $h(\cdot)$ can thus be easily verified, it will not be given further mention.

Lemma 4.4.
Let $t : [0, \infty) \to [0, \infty)$ be a strictly increasing and locally absolutely continuous function on $[0, \infty)$. If $h(\cdot)$ is absolutely continuous on $[0, a]$ for some $a > 0$ with a.e. derivative $h'(\cdot)$, then with $A_x h$ as in (82),
\[
(A_x h)' = \frac{t'(x)}{t(x)^2} \int_0^x h'(u) t(u) \, du \quad \text{a.e. on } (0, a]. \tag{92}
\]
If there exists some $\rho \in [0, \infty)$ such that $\operatorname{ess\,sup}_{x > 0} I(x) \le \rho$, where
\[
I(x) = \frac{t'(x)}{t(x)^2} \int_0^x t(u) \, du, \tag{93}
\]
then $A_x h \in \mathrm{Lip}_{\alpha\rho}$ on $[0, \infty)$ whenever $h \in \mathrm{Lip}_\alpha$ for some $\alpha \ge 0$.

Proof. For the first claim, first assume $h(0) = 0$. Using Fubini's theorem in the third equality and then the local absolute continuity of $t(\cdot)$, for $x \in (0, a]$ we obtain
\[
A_x h = \frac{1}{t(x)} \int_0^x h(v) t'(v) \, dv = \frac{1}{t(x)} \int_0^x \int_0^v t'(v) h'(u) \, du \, dv = \frac{1}{t(x)} \int_0^x \int_u^x t'(v) h'(u) \, dv \, du = \frac{1}{t(x)} \int_0^x h'(u) \big[ t(x) - t(u) \big] \, du, \tag{94}
\]
and differentiation yields (92).

To handle the case where $h(0)$ is not necessarily equal to zero, letting $h_0(x) = h(x) - h(0)$, the result follows by noting that $h_0'(\cdot) = h'(\cdot)$ and, by the absolute continuity of $t(\cdot)$, that $(A_x h_0)' = (A_x h)'$.

For the final claim, using (92) and (93), for every $x$ for which $I(x) \le \rho$ and $t'(x)$ exists, we obtain
\[
|(A_x h)'| = \Big| \frac{t'(x)}{t(x)^2} \int_0^x h'(u) t(u) \, du \Big| \le \|h'\|_\infty \frac{t'(x)}{t(x)^2} \int_0^x t(u) \, du \le \alpha \rho. \tag{95}
\]
As $t(\cdot)$ is locally absolutely continuous, $A_x h$, as seen from the first equality in (94), is a ratio of two locally absolutely continuous functions. For any fixed compact subset $C$ of $(0, \infty)$, since $u(x) := \int_0^x h(v) t'(v) \, dv$ is continuous, $u(C)$ is also compact and hence bounded. Also, since $t(\cdot)$ is strictly increasing with $t(0) \ge 0$, $t(C)$ is bounded away from $0$. Hence the function $f(u, v) = u/v$ restricted to $u(C) \times t(C)$ is Lipschitz, implying that $A_x h$ is absolutely continuous on $C$.
Thus it follows that $A_x h \in \mathrm{Lip}_{\alpha\rho}$, as only $x$ values in a set of measure zero have been excluded in (95). $\square$

Remark 4.1. If $\theta > 0$ and $t$ is given by $t(\cdot) = s^\theta(\cdot)$ for $s(\cdot)$ concave and continuous at zero, then $\|I\|_\infty \le \theta/(\theta+1)$ by Theorem 3.7. Hence $\rho \in [0, 1)$ always exists for such choices of $t$.

Lemmas 4.5, 4.6 and Theorem 4.7 generalize Lemmas 3.5, 3.6 and Theorem 3.1 in [20] for the generalized Dickman distribution; their proofs follow closely those in [20] and hence are omitted.
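Remark 4.1 can be illustrated numerically. The sketch below uses the illustrative choices $\theta = 2$ and the concave function $s(x) = \log(1+x)/\log 2$ (so $t = s^\theta$, as in (81)), and evaluates $I(x)$ of (93) by elementary quadrature on a grid, checking that it never exceeds $\theta/(\theta+1) = 2/3$; the grid and quadrature resolution are arbitrary.

```python
import math

# Check of Remark 4.1 for an illustrative concave s: I(x) <= theta/(theta+1),
# where I(x) = t'(x)/t(x)^2 * integral_0^x t(u) du as in (93).

theta = 2.0

def s(x):
    return math.log1p(x) / math.log(2.0)

def s_prime(x):
    return 1.0 / ((1.0 + x) * math.log(2.0))

def t(x):
    return s(x) ** theta

def t_prime(x):
    return theta * s(x) ** (theta - 1.0) * s_prime(x)

def integral_t(x, n=2000):
    """Trapezoidal approximation of integral_0^x t(u) du (note t(0) = 0)."""
    step = x / n
    total = 0.5 * (t(0.0) + t(x))
    for i in range(1, n):
        total += t(i * step)
    return total * step

def I(x):
    return t_prime(x) / t(x) ** 2 * integral_t(x)

xs = [0.1 * k for k in range(1, 201)]  # grid over (0, 20]
max_I = max(I(x) for x in xs)
print(max_I, theta / (theta + 1.0))  # max_I stays below 2/3
```

The maximum is attained near $x = 0$, where $s$ is nearly linear and $I(x)$ approaches $\theta/(\theta+1)$, consistent with the bound of Theorem 3.7 being attained in the linear case.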
Lemma 4.5.
Let $\theta > 0$ and $s(\cdot)$ satisfy Condition 3.1. Moreover, assume that $\mu = E[\mathcal{D}_{\theta,s}]$ exists. Then with $\mathrm{Lip}_{\alpha,0}$ as in (91), for any $\alpha > 0$,
\[
\sup_{h \in \mathrm{Lip}_{\alpha,0}} |h(0)| = \alpha\mu. \tag{96}
\]
To define iterates of the averaging operator on a function $h(\cdot)$, let $A^0_{x+1} h = h(x)$ and $A^n_{x+1} = A_{x+1}(A^{n-1}_{\bullet+1})$ for $n \ge 1$, and for a class of functions $\mathcal{H}$ let $A^n_{x+1}(\mathcal{H}) = \{A^n_{x+1} h : h \in \mathcal{H}\}$ for $n \ge 0$.

Lemma 4.6.
Let $s(\cdot)$ satisfy Condition 3.1 and be locally absolutely continuous on $[0, \infty)$. If there exists $\rho \in [0, \infty)$ such that (93) holds, then for all $\theta > 0$, $\alpha \ge 0$ and $n \ge 0$,
\[
A^n_{x+1}(\mathrm{Lip}_{\alpha,0}) \subset \mathrm{Lip}_{\alpha\rho^n, 0}.
\]
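The contraction behind Lemma 4.6 can be observed numerically in the generalized Dickman case $t(x) = x^\theta$. The sketch below takes $\theta = 1$, where $A_x f = x^{-1}\int_0^x f(v)\,dv$ and $\rho = \theta/(\theta+1) = 1/2$, and estimates the Lipschitz constant of $x \mapsto A_{x+1}h$ by finite differences for the illustrative $\mathrm{Lip}_1$ function $h(x) = |x-1|$; centering $h$ by $E[h(\mathcal{D}_{\theta,s})]$ only shifts it by a constant and does not affect Lipschitz constants, so it is omitted here.

```python
# One application of the shifted averaging operator to a Lip_1 function
# should produce a Lip_{1/2} function when theta = 1 (rho = theta/(theta+1)).

def h(x):
    return abs(x - 1.0)  # an illustrative Lip_1 test function

def A(x, f, n=4000):
    """Trapezoidal approximation of A_x f = (1/x) * integral_0^x f(v) dv."""
    step = x / n
    total = 0.5 * (f(0.0) + f(x))
    for i in range(1, n):
        total += f(i * step)
    return total * step / x

def lipschitz_estimate(g, xs):
    """Largest slope of g between consecutive grid points."""
    vals = [g(x) for x in xs]
    return max(abs(vals[i + 1] - vals[i]) / (xs[i + 1] - xs[i])
               for i in range(len(vals) - 1))

def g1(x):
    return A(x + 1.0, h)  # one application of the shifted operator

xs = [0.05 * k for k in range(0, 201)]  # grid on [0, 10]
lip1 = lipschitz_estimate(g1, xs)
print(lip1)  # roughly at most 1/2
```

Here the estimated Lipschitz constant comes out just below $1/2$, as the lemma predicts for $n = 1$; further iterates contract by another factor of $\rho$ each, though nesting the quadratures quickly becomes expensive.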
In the following, by replacing $h(x)$ by $h(x) - E[h(\mathcal{D}_{\theta,s})]$ when handling the Stein equations (89) and (90), without loss of generality we may assume that $E[h(\mathcal{D}_{\theta,s})] = 0$. For a given function $h \in \mathrm{Lip}_{\alpha,0}$ for some $\alpha \ge$
$0$, let $h^{(\star k)}(x) = A^k_{x+1} h$ for $k \ge 0$, and let
\[
g(x) = \sum_{k \ge 0} h^{(\star k)}(x) \quad \text{and} \quad g_n(x) = \sum_{k=0}^n h^{(\star k)}(x). \tag{97}
\]
Also recall definition (26), that for any $a \ge 0$ and function $f(\cdot)$, $\|f\|_{[0,a]} = \sup_{x \in [0,a]} |f(x)|$.

Theorem 4.7.
Let $s(\cdot)$ satisfy Condition 3.1 and be locally absolutely continuous on $[0, \infty)$. Further assume that $\mu = E[\mathcal{D}_{\theta,s}]$ exists. If there exists $\rho \in [0, 1)$ such that (93) holds, then for all $a \ge 0$ and $h \in \mathrm{Lip}_{1,0}$ we have
\[
\|h^{(\star k)}\|_{[0,a]} \le (\mu + a)\rho^k, \tag{98}
\]
$g_n \in \mathrm{Lip}_{(1-\rho^{n+1})/(1-\rho),0}$, and $g(\cdot)$ given by (97) is a $\mathrm{Lip}_{1/(1-\rho),0}$ solution to (90).

Proof of Theorem 3.4: The proof follows by arguing as in the proof of Theorem 1.3 of [20], with the final claim obtained by applying Theorem 3.7 to $s(x) = x$; we omit the details. $\square$

In the remainder of this section we specialize to the case of the generalized Dickman distribution, where for some $\theta > 0$, $t(x) = x^\theta$, $d\nu/dv = \theta v^{\theta-1}$, and the Stein equation (89) becomes
\[
(x/\theta) f'(x) + f(x) - f(x+1) = h(x) - E[h(\mathcal{D}_\theta)]. \tag{99}
\]
Note that the function $s(x) = x$ trivially satisfies Condition 3.1. For notational simplicity, in what follows, let $\rho_i = \theta/(\theta+i)$ for $i \in \{1, 2\}$.

Lemma 4.8.
For non-negative $\alpha$ and $\beta$, let $\mathcal{H}_{\alpha,\beta}$ be as in (12). For every $\theta > 0$, if $h \in \mathcal{H}_{\alpha,\beta}$ then $A_x h \in C^2[(0, \infty)]$ and both $A_x h$ and $A_{x+1} h$ are elements of $\mathcal{H}_{\alpha\rho_1, \beta\rho_2}$.

Proof. Take $h \in \mathcal{H}_{\alpha,\beta}$. Since $h \in \mathrm{Lip}_\alpha$, by Lemmas 4.6 and 4.4, $h(\cdot)$ is $\nu$-integrable on any interval of the form $[0, a]$ for all $a > 0$, $A_x h \in \mathrm{Lip}_{\alpha\rho_1}$, and
\[
(A_x h)' = \frac{\theta}{x^{\theta+1}} \int_0^x h'(v) v^\theta \, dv \quad \text{and} \quad (A_x h)'' = \frac{\theta}{x^{\theta+1}} \Big[ h'(x) x^\theta - \frac{\theta+1}{x} \int_0^x h'(v) v^\theta \, dv \Big] \quad \text{for } x > 0.
\]
As $h' \in \mathrm{Lip}_\beta$, the function $A_x h$ is twice continuously differentiable on $(0, \infty)$, proving the first claim. Since
\[
x^\theta = \frac{\theta+1}{x} \int_0^x v^\theta \, dv,
\]
we have
\[
(A_x h)'' = \frac{\theta(\theta+1)}{x^{\theta+2}} \Big[ \int_0^x \big( h'(x) - h'(v) \big) v^\theta \, dv \Big].
\]
Using $h' \in \mathrm{Lip}_\beta$ now yields
\[
|(A_x h)''| \le \frac{\theta(\theta+1)}{x^{\theta+2}} \Big[ \int_0^x |h'(x) - h'(v)| v^\theta \, dv \Big] \le \frac{\beta\theta(\theta+1)}{x^{\theta+2}} \Big[ \int_0^x (x - v) v^\theta \, dv \Big] = \frac{\beta\theta(\theta+1)}{x^{\theta+2}} \cdot \frac{x^{\theta+2}}{(\theta+1)(\theta+2)} = \frac{\beta\theta}{\theta+2} = \beta\rho_2.
\]
Since both $A_x h$ and $(A_x h)'$ are continuous at $0$ and belong to $C[(0, \infty)]$, we obtain $A_x h \in \mathcal{H}_{\alpha\rho_1, \beta\rho_2}$. The final claim is a consequence of the fact that $A_{x+1} h$ is a left shift of $A_x h$. $\square$

Theorem 4.9.
For every $\theta > 0$ and $h \in \mathcal{H}_{1,1}$, there exists a solution $f \in \mathcal{H}_{\theta, \theta/2}$ to (99) with $\|f'\|_{(0,\infty)} \le \theta$ and $\|f''\|_{(0,\infty)} \le \theta/2$.

Proof. Take $h \in \mathcal{H}_{1,1}$. By replacing $h(\cdot)$ by $h - E[h(\mathcal{D}_\theta)]$ we may assume $E[h(\mathcal{D}_\theta)] = 0$. Clearly $s(x) = x$ satisfies Condition 3.1, and $E[\mathcal{D}_\theta] = \theta$ (see e.g. [15]). Also, by Theorem 3.4, $\rho = \rho_1$ satisfies (93). For $h \in \mathrm{Lip}_{1,0}$, Theorem 4.7 shows that $g(\cdot)$ given by (97) is a $\mathrm{Lip}_{1/(1-\rho_1),0}$ solution to (90). Since $g(\cdot)$ is Lipschitz, we have $g \in \bigcap_{a > 0} L^1([0, a], \nu)$, and hence $f(x) = A_x g$ is a solution to (99) by the equivalence of (89) and (90). Now for $a >$
$0$, for any function $h \in L^1([0, a], \nu)$,
\[
\|A_\bullet h\|_{[0,a]} = \sup_{x \in [0,a]} |A_x h| \le \sup_{x \in [0,a]} \frac{1}{x^\theta} \int_0^x |h(v)| \theta v^{\theta-1} \, dv \le \|h\|_{[0,a]}. \tag{100}
\]
Let
\[
g_n(x) = \sum_{k=0}^n h^{(\star k)}(x) \quad \text{and} \quad f_n(x) = A_x g_n.
\]
Since $g_n \in \mathrm{Lip}_{(1-\rho_1^{n+1})/(1-\rho_1)}$ by Theorem 4.7, it is $\nu$-integrable over $[0, a]$. Now using (100), the triangle inequality and (98) of Theorem 4.7, noting $E[\mathcal{D}_\theta] = \theta$, we have
\[
\|f - f_n\|_{[0,a]} = \|A_\bullet g - A_\bullet g_n\|_{[0,a]} \le \|g - g_n\|_{[0,a]} \le \sum_{k \ge n+1} \|h^{(\star k)}\|_{[0,a]} \le (\theta + a) \sum_{k \ge n+1} \rho_1^k = (\theta + a) \frac{\rho_1^{n+1}}{1 - \rho_1}.
\]
Letting $n \to \infty$, we obtain
\[
f(x) = \sum_{n \ge 0} A_x h^{(\star n)}.
\]
Lemma 4.8 and induction imply that $A_x h^{(\star n)} \in C^2[(0, \infty)]$ and $A_x h^{(\star n)} \in \mathcal{H}_{\rho_1^{n+1}, \rho_2^{n+1}}$ for all $n \ge 0$, so that
\[
\|(A_x h^{(\star n)})'\|_{(0,\infty)} \le \rho_1^{n+1} \quad \text{and} \quad \|(A_x h^{(\star n)})''\|_{(0,\infty)} \le \rho_2^{n+1}. \tag{101}
\]
Thus, for any $a >$
$0$, on the interval $(0, a]$, $f_n'(x) = \sum_{k=0}^n (A_x h^{(\star k)})'$ and $f_n''(x) = \sum_{k=0}^n (A_x h^{(\star k)})''$ converge uniformly to the corresponding infinite sums, noting that by (101) the infinite sums are absolutely summable. Thus we obtain (see e.g. Theorem 7.17 in [36])
\[
f'(x) = \lim_{n \to \infty} f_n'(x) \quad \text{and} \quad f''(x) = \lim_{n \to \infty} f_n''(x) \quad \text{for all } x \in (0, a].
\]
Hence, again using (101), with $\|\cdot\|_{(0,\infty)}$ the supremum norm defined as in (26),
\[
\|f'\|_{(0,\infty)} \le \sum_{n \ge 0} \rho_1^{n+1} = \frac{\rho_1}{1 - \rho_1} = \theta \quad \text{and} \quad \|f''\|_{(0,\infty)} \le \sum_{n \ge 0} \rho_2^{n+1} = \frac{\rho_2}{1 - \rho_2} = \frac{\theta}{2}.
\]
Finally, since $f(\cdot)$ and $f'(\cdot)$ are differentiable everywhere on $(0, \infty)$ with bounded derivative, they are absolutely continuous on $(0, \infty)$. Also, both $f(\cdot)$ and $f'(\cdot)$ are continuous at $0$, since by definition $f(0) = A_0 g = g(0) = \lim_{x \downarrow 0} f(x)$ and $f'(0) = \lim_{x \downarrow 0} f'(x)$. Now noting that if a function is absolutely continuous on $(0, \infty)$ with bounded derivative and continuous at $0$ then it is Lipschitz, we obtain $f \in \mathcal{H}_{\theta, \theta/2}$. $\square$

Remark 4.2.
The reasoning in the proof of Theorem 4.9 holds for more general $t(\cdot)$, and depends on the specific form $t(x) = x^\theta$ only when invoking Lemma 4.8.

Remark 4.3.
In contrast to the bound $\|f''\| \le 2\|h'\|$ (see e.g. (2.12) of [14]) for the solution of the Stein equation in the normal case, one cannot uniformly bound the second derivatives of the solutions $f(\cdot)$ of (99) in Theorem 4.9 assuming only a Lipschitz condition on the test functions $h(\cdot)$ in a class $\mathcal{H}$. For $b > 0$, let
\[
h(x) = \begin{cases} 0 & x \le b \\ x - b & x > b. \end{cases}
\]
Clearly $h \in \mathrm{Lip}_1$. Taking $\theta = 1$ and $s(x) = x$, the function $g(\cdot)$ as in (97), with $h(\cdot)$ replaced by $\bar{h}(\cdot) = h(\cdot) - E[h(\mathcal{D}_1)]$, is Lipschitz and solves (90) by Theorem 4.7; hence $f(x) = A_x g$ solves (99). Arguing as in the proof of Theorem 4.9 to interchange $A_x$ and the infinite sum, $f(\cdot)$ is given by
\[
f(x) = \sum_{k \ge 0} A_x\big[A^k_{\bullet+1}(\bar{h})\big]. \tag{102}
\]
Consider the term $k = 0$ in the sum (102). Directly, one may verify that
\[
A_x h = \begin{cases} 0 & x \le b \\ \frac{(x-b)^2}{2x} & x > b, \end{cases} \qquad (A_x h)' = \begin{cases} 0 & x \le b \\ \frac{1}{2}\big(1 - (b/x)^2\big) & x > b, \end{cases} \tag{103}
\]
and
\[
(A_x h)'' = \begin{cases} 0 & x \le b \\ \frac{b^2}{x^3} & x > b, \end{cases} \tag{104}
\]
so in particular
\[
\lim_{x \downarrow b} (A_x \bar{h})'' = \lim_{x \downarrow b} \big(A_x h - E h(\mathcal{D}_1)\big)'' = \lim_{x \downarrow b} (A_x h)'' = 1/b, \tag{105}
\]
which is not bounded as $b \downarrow 0$.

From (103) and (104) respectively, we have that $(A_{x+1}\bar{h})' \le 1/2$ and $(A_{x+1}\bar{h})'' \le b^2/(x+1)^3 \le b^2$ on $(0, \infty)$, and hence $A_{x+1}\bar{h} \in \mathcal{H}_{\alpha,\beta}$ with $\alpha = 1/2$ and $\beta = b^2$. By Lemma 4.8 with $\rho_1 = 1/2$ and $\rho_2 = 1/3$, we have $A^k_{\bullet+1}(\bar{h}) \in \mathcal{H}_{\alpha/2^{k-1}, \beta/3^{k-1}}$ for $k \ge 1$. Hence, again by Lemma 4.8,
\[
A_x\big[A^k_{\bullet+1}(\bar{h})\big] \in \mathcal{H}_{\alpha/2^k, \beta/3^k} \quad \text{on } (0, \infty) \text{ for } k \ge 1. \tag{106}
\]
Summing and substituting the values of $\alpha$ and $\beta$, we obtain
\[
\sum_{k \ge 1} A_x\big[A^k_{\bullet+1}(\bar{h})\big] \in \mathcal{H}_{1/2, b^2/2}. \tag{107}
\]
From (102), (105) and (107), we find that $f''(x)$ may be made arbitrarily large on a set of positive measure by choosing $b > 0$ sufficiently small.

Remark 4.4.
Shortly after a draft of this manuscript was posted, as a special case of their work on infinitely divisible laws, Arras and Houdré proved smoothness bounds in [1] for a solution to the standard Dickman Stein equation of the form
\[
x t(x) - \int_0^1 t(x+u) \, du = h(x) - E h(\mathcal{D}_1); \tag{108}
\]
this equation corresponds to (99) upon identifying $t(\cdot)$ and $f'(\cdot)$. Lemma 5.2 in [1] shows that when $h(\cdot)$ is in the class $\mathcal{H}_0 = \{h : \|h\|_\infty \le 1, \|h'\|_\infty \le 1, h'(\cdot) \text{ is continuous}\}$, then there exists a solution $t(\cdot)$ to (108) with $\|t'\|_\infty$ uniformly bounded. The proof of Theorem 2.1 requires a uniform bound on $f'(\cdot)$ over $(0, \infty)$ to control the coefficient of $|\mu - 1|$ in (36). As no such bound is provided in [1], in the case $\mu = 1$ one can argue as for Theorem 2.1 to produce a version of it for the metric induced by $\mathcal{H}_0$. As neither class $\mathcal{H}_0$ nor $\mathcal{H}_{1,1}$ in (12) contains the other, the first class requiring the test functions to be uniformly bounded and the second requiring their derivatives to be Lipschitz, the resulting metrics they induce are incomparable.

References

[1] Arras, B. and Houdré, C. (2017). On Stein's Method for Infinitely Divisible Laws With Finite First Moment. https://arxiv.org/abs/1712.10051.

[2] Arras, B., Mijoule, G., Poly, G. and Swan, Y. (2016). Distances between probability distributions via characteristic functions and biasing. https://arxiv.org/abs/1605.06819v1.

[3] Arras, B., Mijoule, G., Poly, G. and Swan, Y. (2017). A new approach to the Stein-Tikhomirov method: with applications to the second Wiener chaos and Dickman convergence. https://arxiv.org/abs/1605.06819.

[4] Arratia, R. (2002). On the amount of dependence in the prime factorization of a uniform random integer.
Contemporary Combinatorics, 10, 29-91.

[5] Arratia, R. (2017). Personal communication.

[6] Arratia, R., Barbour, A. and Tavaré, S. (2003). Logarithmic Combinatorial Structures: A Probabilistic Approach. EMS Monographs in Mathematics. European Mathematical Society (EMS), Zürich.

[7] Barbour, A. and Nietlispach, B. (2011). Approximation by the Dickman distribution and quasi-logarithmic combinatorial structures. Electron. J. Probab., 16, 880-902.

[8] Baxendale, P. (2017). Personal communication.

[9] Bernoulli, D. (1738). Specimen theoriae novae de mensura sortis. Comentarii Academiae Scientiarum Imperiales Petropolitanae, 5, 175-192.

[10] Bernoulli, D. (1954). Exposition of a new theory on the measurement of risk. Econometrica: Journal of the Econometric Society, 23-36.

[11] Bhatt, A. and Roy, R. (2004). On a random directed spanning tree.
Adv. Appl. Prob., 36(01), 19-42.

[12] Cellarosi, F. and Sinai, Y. G. (2013). Non-standard limit theorems in number theory. Prokhorov and Contemporary Probability Theory, Springer, Berlin, 197-213.

[13] Chen, L.H.Y. (1975). Poisson approximation for dependent trials. Ann. Prob., 534-545.

[14] Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2010). Normal Approximation by Stein's Method. Springer.

[15] Devroye, L. and Fawzi, O. (2010). Simulating the Dickman distribution. Statistics and Probability Letters, 80(03), 242-247.

[16] Dickman, K. (1930). On the frequency of numbers containing prime factors of a certain relative magnitude. Ark. Mat. Astr. Fys., 22(10), 1-14.

[17] Eeckhoudt, L., Gollier, C. and Schlesinger, H. (2005). Economic and Financial Decisions under Risk. Princeton University Press.

[18] Fill, J. and Huber, M. (2010). Perfect simulation of Vervaat perpetuities.
Electron. J. Probab., 15(04), 96-109.

[19] Finch, S. R. (2003). Mathematical Constants. Cambridge University Press.

[20] Goldstein, L. (2017). Non asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm. https://arxiv.org/abs/1703.00505v1.

[21] Goldstein, L. (2017). Non asymptotic distributional bounds for the Dickman approximation of the running time of the Quickselect algorithm. https://arxiv.org/abs/1703.00505.

[22] Goldstein, L. and Reinert, G. (1997). Stein's method and the zero bias transformation with application to simple random sampling. Ann. Appl. Prob., 7(04), 935-952.

[23] Hardy, G. H. and Wright, E. M. (1979). An Introduction to the Theory of Numbers. Oxford University Press.

[24] Hwang, H. K. and Tsai, T. H. (2002). Quickselect and the Dickman function. Combinatorics, Probability & Computing, 11(04), 353-371.

[25] Peköz, E., Röllin, A. and Ross, N. (2013). Degree asymptotics with rates for preferential attachment random graphs. Ann. Appl. Prob., 23(03), 1188-1218.

[26] Peköz, E. and Röllin, A. (2011). New rates for exponential approximation and the theorems of Rényi and Yaglom. Ann. Prob., 587-608.

[27] Pinsky, R. (2016). A natural probabilistic model on the integers and its relation to Dickman-type distributions and Buchstab's function. https://arxiv.org/abs/1606.02965.

[28] Pinsky, R. (2016). On the strange domain of attraction to generalized Dickman distributions for sums of independent random variables. https://arxiv.org/abs/1611.07207.

[29] Penrose, M. D. and Wade, A. R. (2004). Random minimal directed spanning trees and Dickman-type distributions.
Adv. Appl. Prob., 36(03), 691-714.

[30] Rachev, S. T. (1991). Probability Metrics and the Stability of Stochastic Models (Vol. 269). John Wiley & Sons Ltd.

[31] Ramanujan, S. (1919). A proof of Bertrand's postulate. Journal of the Indian Mathematical Society, 11(181-182), 27.

[32] Rényi, A. (1962). Théorie des éléments saillants d'une suite d'observations. Ann. Fac. Sci. Univ. Clermont-Ferrand, 8, 7-13.

[33] Ross, N. (2011). Fundamentals of Stein's method. Probability Surveys, 8, 210-293.

[34] Rosser, B. (1939). The n-th prime is greater than n log n. Proceedings of the London Mathematical Society, 2(1), 21-44.

[35] Royden, H. L. and Fitzpatrick, P. (1988). Real Analysis (Vol. 198, No. 8). New York: Macmillan.

[36] Rudin, W. (1964). Principles of Mathematical Analysis (Vol. 3). New York: McGraw-Hill.

[37] Stolz, O. (1885). Vorlesungen über allgemeine Arithmetik: Nach den neueren Ansichten (Vol. 1). BG Teubner.

[38] Stein, C. (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. Proc. Sixth Berkeley Symp. Math. Stat. Prob., 583-602.

[39] Stein, C. (1986). Approximate Computation of Expectations. Institute of Mathematical Statistics, Hayward, CA.

[40] Vervaat, W. (1979). On a stochastic difference equation and a representation of non-negative infinitely divisible random variables. Adv. Appl. Prob., 11(04), 750-783.
IMSV, Universität Bern, Switzerland
E-mail address : [email protected] Department of Mathematics, University of Southern California, Los Angeles, USA
E-mail address : [email protected]@math.usc.edu