A probabilistic approach to the Erdős–Kac theorem for additive functions
LOUIS H. Y. CHEN, ARTURO JARAMILLO, XIAOCHUAN YANG
Abstract.
We present a new perspective for assessing the rates of convergence to the Gaussian and Poisson distributions in the Erdős–Kac theorem for additive arithmetic functions $\psi$ of a random integer $J_n$ uniformly distributed over $\{1,\dots,n\}$. Our approach is probabilistic, working directly on spaces of random variables without any use of Fourier analytic methods, and our $\psi$ is more general than those considered in the literature. Our main results are (i) bounds on the Kolmogorov distance and Wasserstein distance between the distribution of the normalized $\psi(J_n)$ and the standard Gaussian distribution, and (ii) bounds on the Kolmogorov distance and total variation distance between the distribution of $\psi(J_n)$ and a Poisson distribution under mild additional assumptions on $\psi$. Our results generalize the existing ones in the literature.

1. Introduction
1.1. Overview.
The present manuscript aims to provide new probabilistic perspectives for the understanding of additive arithmetic functions; namely, mappings $\psi:\mathbb{N}\to\mathbb{R}$ satisfying the identity $\psi(jk)=\psi(j)+\psi(k)$ whenever $j$ and $k$ are co-prime. Many of the functions of interest to number theorists are of this type, for example the prime factor counting functions $\omega$ and $\Omega$, as well as the logarithm of any multiplicative function, such as the sum of powers of divisors, Euler's totient function, the Möbius function and Mangoldt's function (see [26, Section 2.2] for details). The behavior of such functions is typically analyzed by counting the number of integers within a large interval whose image under the action of $\psi$ lies in a given subset of $\mathbb{R}$. Probabilistically, this procedure is equivalent to describing the action of $\psi$ on a random variable $J_n$ with uniform distribution over $\{1,\dots,n\}$. The aim of this paper is to study the asymptotic behavior (as $n\to\infty$) of the law of $\psi(J_n)$, with specific emphasis on the case where $\psi$ is chosen to be the function that counts the number of distinct prime factors. It is worth mentioning that, although the probabilistic perspective for addressing this type of problem has been in use for a long time (see Section 1.2 for details), only suboptimal descriptions of $\psi(J_n)$ have been obtained probabilistically, whereas the sharp ones have been derived by non-trivial number theoretical tools, such as Perron's formula, Dirichlet series and estimates on the Riemann zeta function. The relevance of the present manuscript comes not only from the main new results per se (which, broadly speaking, can be described as "sharp Gaussian and Poisson approximations for standardized and non-standardized versions of $\psi(J_n)$"), but also from the nature of the perspective, which, up to an estimation on the function $\pi(n)$ that counts the number of primes smaller than or equal to $n$, relies almost entirely on probabilistic arguments and leads to conclusions as sharp as the ones obtained by number theoretical tools, with a higher level of generality.

In order to set up an appropriate context for stating our main results, to be presented in detail in Section 2, we first review briefly some of the current literature related to limit theorems for $\psi(J_n)$. Our approach will be presented as part of this literature, in the subsection "The conditioned independence approach". For convenience, we will assume that all the random variables throughout are defined on a common probability space $(\Omega,\mathcal{F},\mathbb{P})$.

Date: February 11, 2021.
2020 Mathematics Subject Classification.
Key words and phrases. Erdős–Kac theorem, Stein's method, additive functions, normal approximation, Poisson approximation.

1.2. The case of the prime factor counting function.
Denote by $\mathcal{P}$ the set of prime numbers and by $[n]$ the set $\{1,\dots,n\}$. One of the most important instances of additive functions for which the law of $\psi(J_n)$ can be successfully approximated is the case where $\psi$ is taken to be the mapping $\omega:\mathbb{N}\to\mathbb{N}$, defined by
$$\omega(k) := |\{p\in\mathcal{P}\,;\ p \text{ divides } k\}|. \qquad (1.1)$$
The value of $\omega(k)$ represents the number of prime divisors of a given integer $k\in\mathbb{N}$ without accounting for multiplicity. For instance, the value of $\omega(54)=\omega(2\times 3^{3})$ is equal to two, since the only two primes that divide 54 are two and three.

Classical Erdős–Kac theorem
The study of the distributional properties of $\omega(J_n)$ began with the influential paper [11] by Paul Erdős and Mark Kac, where it was shown that the normalized random variables
$$Z_n = Z_n^{\omega} := \frac{\omega(J_n)-\log\log(n)}{\sqrt{\log\log(n)}} \qquad (1.2)$$
converge in distribution towards a standard Gaussian random variable $N$. Since the publication of this result (nowadays known as the Erdős–Kac theorem), many improvements and developments on this topic have been considered. Among them is the paper [5] by Billingsley, where the problem was addressed probabilistically by using the method of moments and the decomposition
$$\omega(J_n) = \sum_{p\in\mathcal{P}\cap[n]} \mathbb{1}(p \text{ divides } J_n), \qquad (1.3)$$
which simplifies the analysis of $\omega(J_n)$ due to the fact that the summands on the right hand side can be shown to be asymptotically independent.

The asymptotic Gaussianity of $Z_n$ raises the question of whether the associated convergence in distribution could be quantitatively assessed with respect to a suitable probability metric, such as the Kolmogorov distance $d_K$ or the 1-Wasserstein distance $d_1$, defined as
$$d_K(X,Y) = \sup_{z\in\mathbb{R}} \big|\mathbb{P}[X\le z]-\mathbb{P}[Y\le z]\big|$$
and
$$d_1(X,Y) = \sup_{h\in\mathrm{Lip}_1} \big|\mathbb{E}[h(X)]-\mathbb{E}[h(Y)]\big|,$$
where $\mathrm{Lip}_1$ is the family of Lipschitz functions with Lipschitz constant at most one. For this purpose, the idea of decomposing $\omega(J_n)$ as a sum of random variables exhibiting a "weak stochastic dependence" is of great relevance from a probabilistic point of view, as it brings the problem of studying $Z_n$ into the widely developed line of research of limit theorems for weakly dependent sums of random variables; an area for which the powerful machinery of characteristic functions and Stein's method is available. This idea has been exploited by many authors (see for instance [22], [20], [16], [13], [1], [3], [14] and [15]), who have used a variety of techniques to find bounds for $d_K(Z_n,N)$.
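The decomposition (1.3) is easy to probe numerically. The sketch below (function and variable names are illustrative, not from the paper) computes $\omega(k)$ as a sum of divisibility indicators and compares the empirical mean of $\omega(J_n)$ with the Erdős–Kac centering $\log\log(n)$:

```python
from math import log

def primes_upto(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

def omega(k, primes):
    # decomposition (1.3): omega(k) = sum over primes p of 1(p divides k)
    return sum(1 for p in primes if p <= k and k % p == 0)

n = 2000
P = primes_upto(n)
assert omega(54, P) == 2  # 54 = 2 * 3^3 has two distinct prime factors

# Empirical mean of omega(J_n) for J_n uniform on {1,...,n}; the centering
# log log n captures its first-order behavior (up to a bounded constant).
mean_omega = sum(omega(k, P) for k in range(1, n + 1)) / n
loglog = log(log(n))
print(mean_omega, loglog)
```

The residual gap between the two printed values is of constant order (Mertens' constant), consistent with the fact that only the normalization by $\sqrt{\log\log(n)}$ matters in (1.2).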
Next we present a brief summary of the main contributions to this topic.

LeVeque's conjecture
The first assessment of the Kolmogorov distance between $Z_n$ and $N$ was presented in the paper [20] by LeVeque, where it was shown that
$$d_K(Z_n,N) \le C\,\frac{\log\log\log(n)}{\sqrt{\log\log(n)}},$$
for some constant $C>0$ independent of $n$. In the same paper, it was also conjectured that the optimal rate was of the order $\log\log(n)^{-1/2}$; a claim that was subsequently shown to be true in the paper [22] by Rényi and Turán. The approach presented in [22] relied on a careful study of the characteristic function of $\omega(J_n)$, based on Perron's formula, Dirichlet series, manipulations on contour integrals for analytic functions and some estimates on the Riemann zeta function $\zeta$ around the vertical strip $\{z\in\mathbb{C}\,;\ \Re(z)=1\}$.

The Stein's method perspective
Although the solution to LeVeque's conjecture presented by Rényi and Turán in [22] is quite ingenious and beautiful, it is likewise highly non-trivial from a probabilistic point of view. Moreover, up to this day, most of the approaches for obtaining bounds on $d_K(Z_n,N)$ (even those leading to suboptimal rates) are based on the analysis of the characteristic function of $\omega(J_n)$, which requires deep and complicated number-theoretic manipulations. One of the alternative perspectives proposed in recent years is the one presented in the paper [13] by Harper, where techniques from Stein's method for weakly dependent random variables were used to prove that the truncated version of (1.3),
$$V_n := \sum_{p\in\mathcal{P}\cap[n^{\frac13(\log\log(n))^{-3}}]} \mathbb{1}(p \text{ divides } J_n),$$
satisfies $d_K(V_n, M_n) \le C\,\log\log(n)^{-1/2}$, where $M_n$ is a Poisson random variable with intensity $\log\log(n)$ and $C>0$ is independent of $n$. The Poisson approximation approach presented in [13] possesses two important features: on the one hand, modulo a suitable estimation for the error of approximating $Z_n$ with $\log\log(n)^{-1/2}(V_n-\log\log(n))$, it provides an elementary approach for obtaining a bound of the type
$$d_K(Z_n,N) \le C\,\log\log\log(n)\,\log\log(n)^{-1/2}, \qquad (1.4)$$
where $C>0$. On the other hand, the Poisson approximation of $V_n$ is a phenomenon of great interest on its own, as the discrete nature of the Poisson distribution intuitively fits better that of $V_n$.

The idea of truncating the number of terms in (1.3) was previously explored by Kubilius in [16], who proved, among other things, a bound of the form (1.4) by means of an approximation of $V_n$ with a sum of fully independent random variables. This result was subsequently sharpened by many authors (see [2], [10], [25]) and it is up to this day a very useful tool for analyzing the law of $\psi(J_n)$ from a probabilistic perspective. Both Harper's and Kubilius' approaches are very simple from a probabilistic point of view, but they have the disadvantage that the main contribution to the error in the estimation of $d_K(Z_n,N)$ comes from approximating the law of $\omega(J_n)$ with $V_n$, and not from the approximation of $V_n$ with either a Poisson random variable (as in [13]) or with a sum of independent random variables (as in [16]). Thus, every analysis of the law of $\omega(J_n)$ that is based on a description of $V_n$, regardless of the level of accuracy of the approximation of the law of $V_n$, can only lead to a bound of the form $d_K(Z_n,N) \le C\,\log\log\log(n)\,\log\log(n)^{-1/2}$, which has a strictly slower asymptotic decay than the one from LeVeque's conjecture.
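The effect of such a truncation can be probed numerically. The sketch below (the truncation level $y$ and all names are illustrative choices, not the theoretical truncation used in [13]) computes the exact law of a truncated prime-divisor count for $J_n$ uniform on $[1,n]$, and its total variation distance to a Poisson law of matching intensity:

```python
from math import exp, factorial

def primes_upto(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

n, y = 10 ** 5, 30
small_primes = primes_upto(y)           # truncation: keep only primes <= y
lam = sum(1 / p for p in small_primes)  # matching Poisson intensity

# Exact distribution of V = #{p <= y : p divides J_n}, J_n uniform on [1, n]
counts = {}
for k in range(1, n + 1):
    v = sum(1 for p in small_primes if k % p == 0)
    counts[v] = counts.get(v, 0) + 1
dist_v = {v: c / n for v, c in counts.items()}

poisson = lambda j: exp(-lam) * lam ** j / factorial(j)
ks = range(0, max(dist_v) + 10)
d_tv = 0.5 * sum(abs(dist_v.get(j, 0.0) - poisson(j)) for j in ks)
print(f"lambda = {lam:.3f}, approx d_TV to Poisson = {d_tv:.3f}")
```

The distance is moderate here because the smallest primes contribute indicators with non-negligible success probabilities; it shrinks as the truncation window moves away from the small primes.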
The mod-$\phi$ convergence perspective

Recent developments in number theory have led to a much better understanding of the characteristic function of $\omega(J_n)$ (see for instance [26, Chapter III.4]), which has served as a starting point for the heuristics of the papers [14], [15] and [3], where the powerful tool of mod-Gaussian and mod-Poisson convergence was developed and successfully applied to the analysis of the asymptotic law of $\omega(J_n)$. This type of technique, which we will refer to in the sequel as mod-$\phi$ convergence (to avoid the specification of the Gaussian and Poissonian nature), aims to describe distributional properties of a given collection of random variables $\{X_n\}_{n\in\mathbb{N}}$ by analyzing the quotient
$$\frac{\mathbb{E}[e^{i\lambda X_n}]}{\mathbb{E}[e^{i\lambda U_n}]}, \qquad (1.5)$$
where $U_n$ is a random variable whose distribution is either Gaussian or Poisson. To exemplify the nature of (1.5), consider the case where $U_n$ is a standard Gaussian random variable and $X_n$ is an infinitely divisible random variable with unit Gaussian component and characteristic function
$$\mathbb{E}[e^{i\lambda X_n}] = e^{i\mu_n\lambda - \frac{\lambda^2}{2} + \int_{\mathbb{R}}(e^{i\lambda x}-1-\mathbb{1}_{\{|x|<1\}}\,i\lambda x)\,\Pi_n(dx)},$$
where $\mu_n\in\mathbb{R}$ and $\Pi_n$ is a Lévy measure. For this instance, the quotient in (1.5) takes the form
$$\frac{\mathbb{E}[e^{i\lambda X_n}]}{\mathbb{E}[e^{i\lambda U_n}]} = e^{i\mu_n\lambda + \int_{\mathbb{R}}(e^{i\lambda x}-1-\mathbb{1}_{\{|x|<1\}}\,i\lambda x)\,\Pi_n(dx)},$$
which is the characteristic function of the non-Gaussian part of $X_n$. In this sense, we can think of (1.5) as a quantity that describes the part of the characteristic function of $X_n$ that is not standard Gaussian (respectively, Poissonian). One should remark, however, that for a more general random variable $X_n$, the quotient $\mathbb{E}[e^{i\lambda X_n}]/\mathbb{E}[e^{i\lambda U_n}]$ might not be the characteristic function of a random variable, which is an important observation to take into account when applying the heuristic above.
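As a numerical illustration of the quotient (1.5) in the mod-Poisson setting, one can compare the exact characteristic function of $\omega(J_n)$ with that of a Poisson variable of intensity $\sum_{p\le n}1/p$; the quotient stays of bounded modulus as $n$ grows. This is only a sketch with illustrative names and parameters:

```python
import cmath

def primes_upto(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

def mod_poisson_quotient(n, theta):
    primes = primes_upto(n)
    w = [0] * (n + 1)                 # w[k] = omega(k), built by sieving
    for p in primes:
        for m in range(p, n + 1, p):
            w[m] += 1
    lam = sum(1 / p for p in primes)
    # E[e^{i theta omega(J_n)}] for J_n uniform on {1,...,n}
    phi_omega = sum(cmath.exp(1j * theta * w[k]) for k in range(1, n + 1)) / n
    # characteristic function of Poisson(lam) at theta
    phi_poisson = cmath.exp(lam * (cmath.exp(1j * theta) - 1))
    return phi_omega / phi_poisson

for n in (10 ** 3, 10 ** 4):
    q = mod_poisson_quotient(n, 1.0)
    print(n, abs(q))  # modulus of the quotient (1.5) stays of order one
```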
Naturally, if the law of $U_n$ remains constant as $n$ varies, then the convergence of (1.5) towards one implies the convergence in distribution of $X_n$ to that constant law. However, a more complex phenomenology might appear in the case where $U_n$ varies and the aforementioned quotient converges to a non-constant limit. This idea was explored by Barbour, Kowalski and Nikeghbali, where the distance between the laws of $X_n$ and a suitable Poisson random variable was described in terms of the regularity properties of the limit of (1.5) as a function of $\lambda$. These results were then applied to the case $X_n := \omega(J_n)$, for which a lot of information on the characteristic function of $\omega(J_n)$ was available, leading, among other things, to the following remarkable result (see [3, Theorem 7.2]).

Theorem 1.1. There exists a constant $C>0$ such that
$$d_{TV}(\omega(J_n), M_n) \le C\,\log\log(n)^{-1/2}, \qquad (1.6)$$
where $M_n$ is a Poisson random variable with intensity parameter $\log\log(n)$ and $d_{TV}(X,Y)$ denotes the total variation distance between two random variables $X$ and $Y$; namely,
$$d_{TV}(X,Y) := \sup_{A\in\mathcal{B}(\mathbb{R})} \big|\mathbb{P}[X\in A]-\mathbb{P}[Y\in A]\big|,$$
where $\mathcal{B}(\mathbb{R})$ denotes the collection of Borel subsets of $\mathbb{R}$.

Notice that as a corollary of (1.6), one obtains an alternative proof of LeVeque's bound. It is worth mentioning that in [3], sharper approximations of the law of $\omega(J_n)$ were obtained by means of Poisson–Charlier signed measures. The downside of applying the mod-$\phi$ convergence approach for proving LeVeque's conjecture is that prior knowledge of the characteristic function of $\omega(J_n)$ is required, which, as in the paper of Rényi and Turán [22], can only be obtained by means of analytic techniques from number theory.

The size-biased permutation approach
In the paper [1] by Arratia, an alternative probabilistic approach for studying the divisibility properties of $J_n$ was proposed. The methodology consists in constructing a coupling of $J_n$ together with a partial product $T_n$ of size-biased permuted random primes, in such a way that the total variation distance between $J_n$ and $T_nP_n$ is small, where $P_n$ is a suitable random prime (see [1, Section 1.2]). Manipulations on the law of the size-biased permutation become tractable after introducing a suitable point process with amenable independence properties (see [1, Section 3.5] for details).

Using the aforementioned ideas, it was shown in [1, Theorem 3] that if $d_{id}$ denotes the insertion-deletion distance
$$d_{id}\Big(\prod_{p\in\mathcal{P}} p^{\alpha_p},\ \prod_{p\in\mathcal{P}} p^{\beta_p}\Big) := \sum_{p\in\mathcal{P}} |\alpha_p-\beta_p|,$$
for $\prod_{p\in\mathcal{P}} p^{\alpha_p}, \prod_{p\in\mathcal{P}} p^{\beta_p}\in\mathbb{N}$, and $d_{1,id}$ the associated 1-Wasserstein distance with respect to $d_{id}$, whose action on the laws of random variables $X,Y$ is given by
$$d_{1,id}(X,Y) = \sup\big\{|\mathbb{E}[h(X)]-\mathbb{E}[h(Y)]|\,;\ |h(x)-h(y)|\le d_{id}(x,y) \text{ for all } x,y\in\mathbb{N}\big\},$$
then
$$\lim_{n\to\infty} d_{1,id}\Big(J_n,\ \prod_{p\in\mathcal{P}\cap[n]} p^{\xi_p}\Big) = 2, \qquad (1.7)$$
where the $\xi_p$ are independent geometric random variables with $\mathbb{P}[\xi_p=k] = p^{-k}(1-p^{-1})$, for $k\ge 0$. This identity can be combined with classical results on sums of independent random variables in order to obtain a bound similar to that of LeVeque's conjecture, but measured with respect to the 1-Wasserstein distance $d_1$. More precisely, it can be shown that (1.7) implies the existence of a constant $C>0$ such that
$$d_1(Z_n, N) \le \frac{C}{\sqrt{\log\log(n)}}. \qquad (1.8)$$
It is worth mentioning that although it is not clear how to use (1.7) to get bounds of the type (1.8) with respect to the Kolmogorov distance $d_K$, the two-step approximation scheme of Arratia (approximating $J_n$ with $T_n$ and then $\omega(T_n)$ with a Gaussian distribution) does inspire us to find a transparent divisibility structure in the intermediate step, a counterpart of his well elaborated $T_n$. By doing so, we manage to find sharp bounds for both the Kolmogorov distance $d_K(Z_n,N)$ and the Wasserstein distance $d_1(Z_n,N)$ (see Section 2 for details).

The function
$\Omega$

Another instance for which the law of $\psi(J_n)$ can be suitably approximated is the case where $\psi$ is the prime factor counting function with multiplicity, defined by
$$\Omega(k) := \sum_{p\in\mathcal{P}\cap[k]} \max\{\alpha\ge 0\,;\ p^{\alpha} \text{ divides } k\}. \qquad (1.9)$$
Unlike $\omega$, this function is not strongly additive (meaning that $\Omega(p^{\alpha})$ doesn't necessarily coincide with $\Omega(p)$, for $p\in\mathcal{P}$ and $\alpha\in\mathbb{N}$). However, most of the results related to the state of the art on the asymptotic distribution of $\omega(J_n)$ are also valid for $\Omega(J_n)$. In particular, the results presented by Rényi and Turán in [22] and those presented by Barbour, Kowalski and Nikeghbali in [3] provide a bound of the type
$$d_K\Big(\frac{\Omega(J_n)-\log\log(n)}{\sqrt{\log\log(n)}},\ N\Big) \le C\,\log\log(n)^{-1/2},$$
for some $C>0$. Moreover, the Poisson approximation (and Poisson–Charlier approximation) presented in [3, Theorem 6.2] establishes the bound
$$d_{TV}(\Omega(J_n), M_n) \le C\,\log\log(n)^{-1/2}, \qquad (1.10)$$
where $C>0$ is independent of $n$ and $M_n$ is a Poisson random variable with parameter $\log\log(n)$. The paper [13] doesn't explicitly state a Poisson approximation for $\Omega(J_n)$, although it is clear that the ideas from [13] can be easily adapted to obtain a bound of the type $d_K(\tilde V_n, M_n) \le C\,\log\log(n)^{-1/2}$, where
$$\tilde V_n := \sum_{p\in\mathcal{P}\cap[1,\,n^{\frac13(\log\log(n))^{-3}}]} \max\{\alpha\ge 0\,;\ p^{\alpha} \text{ divides } J_n\}.$$
The interested reader is also encouraged to see [13, Section 5] for an analysis of both $\omega(J_n)$ and $\Omega(J_n)$ via exchangeable pairs. However, one should keep in mind that this approach leads to results strictly coarser than those from [13, Section 4].

1.3. Erdős–Kac theorem for general additive functions.
The broad range of approaches and ideas nowadays available for addressing the classical Erdős–Kac theorem and LeVeque's conjecture naturally raises the question of whether such techniques can be adapted to describe the asymptotic distribution of $\psi(J_n)$ for a more general additive function $\psi$. Although the convergence in distribution (without assessment of its Kolmogorov distance) has been known since the paper [11], the adaptation of the proofs of the optimal bounds obtained in the papers [22], [3], [14] and [15] to the general additive function case is a surprisingly difficult task. This is mainly due to the fact that all the estimations of $d_K(Z_n,N)$ based on the use of characteristic functions rely on the identity
$$\mathbb{E}[e^{i\lambda\omega(J_n)}] = e^{\log\log(n)(e^{i\lambda}-1)}\Big\{\Gamma(e^{i\lambda})^{-1}\prod_{q\in\mathcal{P}}\Big(1+\frac{e^{i\lambda}}{q-1}\Big)\Big(1-\frac1q\Big)^{e^{i\lambda}} + O(\log(n)^{-1})\Big\},$$
which doesn't necessarily hold when $\omega$ is replaced by $\psi$. Other perspectives, such as the one by Harper in [13] and Kubilius in [16], are quite versatile and extend easily to additive functions, but as mentioned before, they do not provide an optimal rate of convergence in the case of the prime factor counting functions $\omega$ and $\Omega$. This motivates the development of an alternative probabilistic tool that allows one to optimally estimate
$$d_K\big(\sigma_n^{-1}(\psi(J_n)-\mu_n),\ N\big), \qquad (1.11)$$
where $\mu_n\in\mathbb{R}$ and $\sigma_n>0$ are normalizing constants for which $\sigma_n^{-1}(\psi(J_n)-\mu_n)$ converges in law to $N$. The main goal of this paper consists in addressing the aforementioned problem from a probabilistic point of view, relying as little as possible on sophisticated number theoretical arguments. In the sequel, we will refer to this methodology as "the conditioned independence approach".

The conditioned independence approach
In the paper [13] by Harper, it is mentioned that the decomposition (1.3), expressing $\omega(J_n)$ as a sum of weakly dependent random variables, suggests the use of the theory of Stein's method for estimating (1.11). This idea partially influences our approach, as we follow a Stein's method perspective as well. However, instead of viewing $\psi(J_n)$ as a sum of weakly dependent random variables, we will use a two-step approximation strategy, similar in spirit to that of [1, Section 1.2], to show that, firstly, $\psi(J_n)$ is close in Kolmogorov distance to $\psi(H_n)$, where $H_n$ is a random variable supported on $\{1,\dots,n\}$ and taking the value $k$ with probability proportional to $k^{-1}$. We then carry out a Stein's method analysis of the variables $\psi(H_n)$, which, surprisingly, is a considerably simpler task due to a key identity in law (see Theorem 3.6) which expresses the law of $\psi(H_n)$ as a sum of fully independent random variables, conditioned on a suitable explicit event. As one can expect, the conditioned independence provides a much easier framework for applying probabilistic techniques, in comparison with the case of general weakly dependent random variables. We utilize this neat structure to embed the underlying randomness of $\psi(H_n)$ into a Poisson space, which remarkably facilitates the application of Stein's method, in virtue of the celebrated "Mecke formula". This embedding procedure is entirely different from previous approaches and consists of comparing the behavior of $\psi(H_n)$ with that of a random variable of the form $\tilde\psi(H_n)$, where $\tilde\psi$ is an additive arithmetic function characterized by the identity $\tilde\psi(p^{\alpha}) = \alpha\psi(p)$, valid for all $p\in\mathcal{P}$ and $\alpha\in\mathbb{N}$.
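A minimal sketch of this approximating function: for $\psi=\omega$ (so $\psi(p)=1$), the function $\tilde\psi$ with $\tilde\psi(p^{\alpha})=\alpha\psi(p)$ is precisely $\Omega$, and it is additive over arbitrary (not only co-prime) products. Names below are illustrative:

```python
def prime_valuations(k):
    """p-adic valuations alpha_p(k) via trial division: k = prod p^alpha_p(k)."""
    vals, d = {}, 2
    while d * d <= k:
        while k % d == 0:
            vals[d] = vals.get(d, 0) + 1
            k //= d
        d += 1
    if k > 1:
        vals[k] = vals.get(k, 0) + 1
    return vals

def psi(k):        # psi = omega: psi(p^a) = psi(p) = 1 (strongly additive)
    return len(prime_valuations(k))

def psi_tilde(k):  # tilde-psi(p^a) = a * psi(p): here tilde-psi = Omega
    return sum(prime_valuations(k).values())

# psi is additive only over co-prime factors ...
assert psi(6 * 35) == psi(6) + psi(35)   # gcd(6, 35) = 1
assert psi(4 * 6) != psi(4) + psi(6)     # 24 = 2^3 * 3: psi(24) = 2, not 3
# ... while psi_tilde is additive over arbitrary products:
assert psi_tilde(4 * 6) == psi_tilde(4) + psi_tilde(6)  # 4 = 2 + 2
```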
We would like to refer the reader to [1] for a Poissonian embedding of the randomness of the prime decomposition of $J_n$, based not on independent random variables conditioned on a certain constraint, but rather on a beautiful parallelism with uniform random permutations. It is interesting to remark that the approximating function $\tilde\psi$ that we utilize doesn't satisfy the property $\tilde\psi(p^{\alpha})=\tilde\psi(p)$ (namely, it is not strongly additive), in contrast with the classic heuristic that additive functions are easier to study when we approximate them with strongly additive functions (this is the case of the analysis of $\Omega(J_n)$, which is obtained from properties of $\omega(J_n)$).

Before elaborating further on the details of this methodology, we will introduce some notation and establish basic assumptions on $\psi$. We will write $\mathbb{N}_0$ to denote the set of natural numbers including zero, namely $\mathbb{N}_0 := \mathbb{N}\cup\{0\}$. The probability law of a random variable $X$ will be denoted by $\mathcal{L}(X)$. For a given $p\in\mathcal{P}$, consider the $p$-adic valuation function $\alpha_p:\mathbb{N}\to\mathbb{N}_0$, defined as the unique mapping satisfying the prime factorization
$$k = \prod_{p\in\mathcal{P}} p^{\alpha_p(k)}, \qquad (1.12)$$
for all $k\in\mathbb{N}$. It is plain that if $k\le n$, one only needs to consider $p\in\mathcal{P}_n$ in the factorization, where $\mathcal{P}_n := \mathcal{P}\cap[n]$. Let $\{\xi_p\}_{p\in\mathcal{P}}$ be a collection of independent geometric random variables with
$$\mathbb{P}[\xi_p=k] = (1-p^{-1})p^{-k}, \quad \text{for all } k\in\mathbb{N}_0 \text{ and } p\in\mathcal{P}.$$
Our main results require the following assumptions.

(H1) The function $\psi$ restricted to $\mathcal{P}$ is bounded. Namely, $c_1 := \sup_{p\in\mathcal{P}} |\psi(p)| < \infty$.

(H2) Suppose
$$c_2 := \sum_{p\in\mathcal{P}} \frac{\mathbb{E}[\psi(p^{\xi_p+2})^2]^{1/2}}{p} < \infty,$$
which can be shown to be equivalent to
$$\sum_{p\in\mathcal{P}}\sum_{k\ge 2} \frac{\psi(p^k)^2}{p^k} < \infty.$$
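For instance, for $\psi=\omega$ one has $\psi(p^k)=1$ for every prime power, so $c_1=1$, and the series in the second form of (H2) reduces to $\sum_p\sum_{k\ge 2}p^{-k} = \sum_p \frac{1}{p(p-1)}$, which converges. A quick numerical check (illustrative sketch):

```python
def primes_upto(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

# (H2) for psi = omega: sum over p of sum_{k>=2} psi(p^k)^2 / p^k
#   = sum over p of p^{-2} / (1 - 1/p) = sum over p of 1 / (p (p - 1)).
partial = sum(1.0 / (p * (p - 1)) for p in primes_upto(10 ** 5))
print(partial)  # partial sums are bounded (full series is about 0.77)
assert partial < 0.8  # (H2) holds for omega
```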
Since the type of results that we are seeking are asymptotic as $n$ approaches infinity, we will assume in the sequel that $n\ge 21$. We use the notation $\bar\zeta(s)$ to denote Riemann's zeta function minus 1, namely $\bar\zeta(s) := \sum_{k\ge 2} k^{-s}$, which serves as an upper bound for the series $\sum_{p\in\mathcal{P}} p^{-s}$ with some specific choices of $s$.

2. Main Results
In this section we present our main results. Let $\mu_n$ and $\sigma_n>0$ be defined by
$$\mu_n = \sum_{p\in\mathcal{P}_n} \psi(p)\,p^{-1}(1-p^{-1})^{-1} \quad\text{and}\quad \sigma_n^2 = \sum_{p\in\mathcal{P}_n} \psi(p)^2\,p^{-1}(1-p^{-1})^{-2}. \qquad (2.1)$$
We will assume without loss of generality that $\psi$ is not identically zero and that $n\ge 27$ is sufficiently large so that $\sigma_n$ is strictly positive. Our main results are the following bounds.

Theorem 2.1. Suppose that $\psi$ satisfies (H1) and (H2). Then, provided that $\sigma_n \ge c_1+c_2$,
$$d_K\Big(\frac{\psi(J_n)-\mu_n}{\sigma_n},\ N\Big) \le \frac{\kappa_1}{\sigma_n} + \frac{\kappa_2}{\sigma_n^2} + \kappa_3\,\frac{\log\log(n)}{\log(n)}, \qquad (2.2)$$
and
$$d_1\Big(\frac{\psi(J_n)-\mu_n}{\sigma_n},\ N\Big) \le \frac{\kappa_4}{\sigma_n} + \kappa_5\,\frac{\log\log(n)}{\log(n)}, \qquad (2.3)$$
where $N$ is a standard Gaussian random variable and
$$\kappa_1 := 65c_1+66c_2, \quad \kappa_2 := 726c_1+116c_1c_2, \quad \kappa_3 := 67, \quad \kappa_4 := 106c_1+2c_2, \quad \kappa_5 := 49. \qquad (2.4)$$

Remark 2.2.
As one can observe from the proof of Theorem 2.1 (to be presented in Sections 4–6), the use of the normalizations $\mu_n$ and $\sigma_n$ is quite natural from a probabilistic perspective, as they represent the mean and variance of an approximating sum of independent random variables. However, one should keep in mind that any choice of asymptotically equivalent normalizing constants leads to an equivalent version of Theorem 2.1, provided that we suitably modify the bounds. More precisely, let $m_n$ and $s_n$ be another pair of normalizing constants. Then it is simple to check that
$$d_K\Big(\frac{\psi(J_n)-m_n}{s_n},\ N\Big) \le d_K\Big(\frac{\psi(J_n)-\mu_n}{\sigma_n},\ N\Big) + d_K(N, W),$$
where $W$ is a normal random variable with mean $(\mu_n-m_n)/s_n$ and variance $\sigma_n^2/s_n^2$. Similarly, the relation $d_1(aX+c, aY+c) = |a|\,d_1(X,Y)$, for $a,c\in\mathbb{R}$ and arbitrary random variables $X,Y$, implies
$$d_1\Big(\frac{\psi(J_n)-m_n}{s_n},\ N\Big) \le \frac{\sigma_n}{s_n}\,d_1\Big(\frac{\psi(J_n)-\mu_n}{\sigma_n},\ N\Big) + \frac{\sigma_n}{s_n}\,d_1(N, W'),$$
where $W'$ is a normal random variable with mean $(m_n-\mu_n)/\sigma_n$ and variance $s_n^2/\sigma_n^2$. The additional error term in both bounds is simple to estimate; see Lemma B.5.

Remark 2.3.
We work out the concrete example $\psi=\omega$ with normalizing constants $m_n = s_n^2 = \log\log(n) \le \mu_n \le \sigma_n^2$. By Theorem 2.1, Lemma B.5 and Mertens' formulas (the two-sided bounds (3.7)), we obtain a bound of the form
$$d_K\Big(\frac{\omega(J_n)-\log\log(n)}{\sqrt{\log\log(n)}},\ N\Big) \le \frac{C}{\sqrt{\log\log(n)}},$$
with an explicit constant $C$, which gives a quantitative version of LeVeque's conjecture. We leave the extension of the above argument to $\psi=\Omega$ and bounds in $d_1$ to the interested reader.

As a byproduct of our analysis, we will obtain as well analogous theorems for the case where $J_n$ is replaced by a random variable supported on $[n]$ and taking the value $k$ with probability proportional to $k^{-1}$, for $k\in\mathbb{N}$.

Theorem 2.4.
Suppose that $\psi$ satisfies (H1) and (H2). Let $\{H_n\}_{n\ge 1}$ be a sequence of random variables supported on $\mathbb{N}\cap[n]$, with
$$\mathbb{P}[H_n=k] = \frac{1}{L_nk}, \quad \text{for } k\in\{1,\dots,n\},$$
where $L_n := \sum_{j=1}^n \frac1j$. Then, provided that $\sigma_n \ge c_1+c_2$,
$$d_K\Big(\frac{\psi(H_n)-\mu_n}{\sigma_n},\ N\Big) \le \frac{\gamma_1}{\sigma_n} + \frac{\gamma_2}{\sigma_n^2}, \qquad (2.5)$$
and
$$d_1\Big(\frac{\psi(H_n)-\mu_n}{\sigma_n},\ N\Big) \le \frac{\gamma_3}{\sigma_n}, \qquad (2.6)$$
where $N$ is a standard Gaussian random variable and
$$\gamma_1 := 32c_1+33c_2, \quad \gamma_2 := 363c_1+58c_1c_2, \quad \gamma_3 := 105c_1+2c_2. \qquad (2.7)$$

For $\mathbb{N}_0$-valued additive functions, the following Poissonian approximations can be proved.

Theorem 2.5.
Assume that $\psi:\mathbb{N}\to\mathbb{N}_0$ and
$$\lambda_n := \sum_{p\in\mathcal{P}_n} \psi(p)\,p^{-1} > 0, \quad n\in\mathbb{N}. \qquad (2.8)$$
Let $M_n$ be a Poisson random variable with intensity $\lambda_n$ and suppose that $\psi$ satisfies (H1) and (H2). Then,
$$d_{TV}(\psi(H_n), M_n) \le \frac{\tilde\gamma_1}{\sqrt{\lambda_n}} + \frac{\tilde\gamma_2}{\lambda_n} + \frac{2c_1}{\lambda_n}\sum_{p\in\mathcal{P}_n}\frac{|\psi(p)-1|}{p}, \qquad (2.9)$$
where $\tilde\gamma_1 := 17c_1+2c_2$ and $\tilde\gamma_2 := 2c_1+8c_1c_2+4c_2$. In particular, if $\psi(p)=1$ for all $p\in\mathcal{P}$, then
$$d_{TV}(\psi(H_n), M_n) \le \frac{17+2c_2}{\sqrt{\lambda_n}} + \frac{6c_2}{\lambda_n}. \qquad (2.10)$$

Theorem 2.6.
Suppose that $\psi:\mathbb{N}\to\mathbb{N}_0$ satisfies (H1), (H2) and (2.8).

(i) We have
$$d_K(\psi(J_n), M_n) \le \frac{\tilde\kappa_1}{\sqrt{\lambda_n}} + \frac{\tilde\kappa_2}{\lambda_n} + \frac{4c_1}{\lambda_n}\sum_{p\in\mathcal{P}_n}\frac{|\psi(p)-1|}{p} + \kappa_3\,\frac{\log\log(n)}{\log(n)}, \qquad (2.11)$$
where $\tilde\kappa_1 := 51c_1+6c_2+1$, $\tilde\kappa_2 := 7c_1+24c_1c_2+12c_2+2c_1$, and $\kappa_3 := 67$ is as in (2.4).

(ii) Suppose in addition that $\psi(p)=1$ for all $p\in\mathcal{P}$. Then
$$d_{TV}(\psi(J_n), M_n) \le \frac{18+2c_2}{\sqrt{\lambda_n}} + \frac{6c_2}{\lambda_n} + \kappa_3\,\frac{\log\log(n)}{\log(n)}.$$

Remark 2.7.
Provided that the bounds from Theorems 2.5 and 2.6 are of order $\lambda_n^{-1/2}$, we can obtain an alternative approach for proving Theorems 2.1 and 2.4: one can first approximate the law of $\psi(J_n)$ (respectively $\psi(H_n)$) with a Poisson distribution, and then the normalized Poisson distribution with a standard Gaussian law. However, one should keep in mind that for a large family of additive arithmetic functions $\psi$, the term
$$\frac{1}{\lambda_n}\sum_{p\in\mathcal{P}_n}\frac{|\psi(p)-1|}{p}$$
might not converge to zero, which prevents us from obtaining Gaussian approximations from Theorems 2.5 and 2.6.

Remark 2.8.
Up to a multiplicative constant independent of $n$, the bound from Theorem 2.6 implies the one presented in [3, Theorem 7.2]. One should observe that Remarks 2.2 and 2.3 apply as well to Theorems 2.4–2.6, for a suitable modification of the upper bounds appearing therein.

The rest of the paper is organized as follows. In Section 3 we present some useful preliminaries on number theoretical results, divisibility properties of $\mathcal{L}(J_n)$, Stein's method and integration by parts for Poisson functionals. In Sections 4–6 we present the proofs of Theorem 2.4 and Theorem 2.1. The Poisson approximation results from Theorems 2.5 and 2.6 are proved in Section 7. Finally, in the appendix we state and prove a generalized version of Lemma 3.6 (whose elementary version plays a fundamental role in our methodology), as well as some useful estimates.

3. Auxiliary results
3.1. Elementary results from number theory.
We present the number theoretic results that will be required for our computations. With the exception of the estimations on the prime counting function $\pi$, all of these results can be proved fairly easily.

Prime counting function inequalities
Denote by $\pi:[1,\infty)\to\mathbb{N}_0$ the prime counting function, defined by $\pi(x) := |\mathcal{P}\cap[1,x]|$. The existence of infinitely many primes implies that $\pi(n)$ converges to infinity as $n\to\infty$. There have been many efforts to address the highly non-trivial task of describing as sharply as possible the asymptotic behavior of this function. The interested reader is referred to the book [26] for a historical compendium of some of the most popular methods that have been used to achieve this goal. In this manuscript, we will use two results related to this problem, which we state next: for every $n>1$, we have that
$$\pi(n) \le 1.26\,\frac{n}{\log(n)}.$$
Furthermore, if $n\ge 17$, then
$$\pi(n) \ge \frac{n}{\log(n)}, \qquad (3.1)$$
and if $n\ge 229$,
$$\Big|\pi(n)-\int_2^n\frac{dt}{\log(t)}\Big| \le 0.2795\,\frac{n}{\log(n)^{3/4}}\exp\Big\{-\sqrt{\frac{\log(n)}{6.455}}\Big\} \le \frac{C\,n}{\log(n)^3}, \qquad (3.2)$$
where $C$ is an absolute constant and in the last inequality we used the fact that $x^{4.5}e^{-x} \le 9.7$ for $x\ge 0$. In particular, since $\big|\int_2^n\frac{dt}{\log(t)}-\frac{n}{\log(n)}-\frac{n}{\log(n)^2}\big| \le \frac{C\,n}{\log(n)^3}$,
$$\Big|\pi(n)-\frac{n}{\log(n)}-\frac{n}{\log(n)^2}\Big| \le \frac{C\,n}{\log(n)^3}. \qquad (3.3)$$
The proofs of (3.1) and (3.2) can be found in [23] and [27], respectively.

Rosser and Schoenfeld inequalities
We will require as well suitable bounds for $\prod_{p\in\mathcal{P}_n}(1-p^{-1})$. The bound that we will use was first presented in the paper [23] (see as well [26, page 17]). There exists a function $g:\mathbb{R}_+\to\mathbb{R}$, with $|g(n)\log(n)|\le 1$ for all $n\in\mathbb{N}$, such that
$$\prod_{p\in\mathcal{P}_n}(1-p^{-1}) = \frac{e^{-\gamma}}{\log(n)}\,e^{g(n)},$$
where $\gamma\approx 0.577$ is Euler's constant. In addition, we have that
$$\prod_{p\in\mathcal{P}_n}(1-p^{-1}) > \frac{e^{-\gamma}}{\log(n)}\big(1-\log(n)^{-1}\big), \qquad (3.4)$$
see [26, page 17].
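The asymptotics above are easy to probe numerically: the product $\prod_{p\le n}(1-1/p)$, multiplied by $e^{\gamma}\log(n)$, approaches 1. A short check (illustrative sketch, with a hard-coded value of Euler's constant):

```python
from math import exp, log

def primes_upto(n):
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p:: p] = [False] * len(sieve[p * p:: p])
    return [p for p, is_p in enumerate(sieve) if is_p]

GAMMA = 0.5772156649015329  # Euler's constant

n = 10 ** 5
prod = 1.0
for p in primes_upto(n):
    prod *= 1.0 - 1.0 / p
ratio = prod * exp(GAMMA) * log(n)
print(ratio)  # tends to 1 as n grows
assert abs(ratio - 1.0) < 0.1
```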
Divisibility probabilities for $J_n$

Throughout this manuscript, we will repeatedly use the fact that the probability that a given positive integer $d\in\mathbb{N}$ divides the random variable $J_n$ can be expressed as
$$\mathbb{P}[d \text{ divides } J_n] = \frac1n\sum_{k=1}^n \mathbb{1}(d \text{ divides } k) = \frac1n\Big\lfloor\frac{n}{d}\Big\rfloor. \qquad (3.5)$$

Mertens' formulas
It is a well-known, elementary result from number theory that sums of the form $\sum_{p\in\mathcal{P}_n}\frac{\log(p)}{p}$ and $\sum_{p\in\mathcal{P}_n}\frac1p$, with $n\in\mathbb{N}$, can be easily estimated as described below. Such results are attributed to Franz Mertens:
$$\log(n)-2 \le \sum_{p\in\mathcal{P}_n}\frac{\log(p)}{p} < \log(n). \qquad (3.6)$$
The proof is given in [26, page 14]. In addition, there exists a constant $C>0$, with $C\approx 0.26$, such that
$$\log\log(n)+C-\frac{2}{\log(n)} \le \sum_{p\in\mathcal{P}_n}\frac1p \le \log\log(n)+C+\frac{2}{\log(n)}. \qquad (3.7)$$
For a proof, see for instance [26, page 15].

3.2. Approximating $\mathcal{L}(J_n)$ with $\mathcal{L}(H_n)$. In this section we present a link between the probability laws of the random variables $J_n$ and $H_n$. To achieve this, we will make use of ideas that are close in spirit to those from [1, Sections 1.2 and 3.6]. Let $\{H_n\}_{n\ge 1}$ be given as before. Let $\{Q(k)\}_{k\ge 1}$ be a sequence of random variables defined on $(\Omega,\mathcal{F},\mathbb{P})$, independent of $(J_n, H_n)$ and satisfying the property that $Q(k)$ has uniform distribution over the set
$$\mathcal{P}^*_k := \{1\}\cup\mathcal{P}_k. \qquad (3.8)$$
Namely, $\mathbb{P}[Q(k)=j] = \frac{1}{\pi(k)+1}$, for $j\in\{1\}\cup\mathcal{P}_k$.

Lemma 3.1.
Let $J_n$, $H_n$ and $\{Q(k)\}_{k\ge 1}$ be as before. Then, for $n\ge 21$, we have
$$d_{TV}\big(J_n,\ H_n\,Q(\lfloor n/H_n\rfloor)\big) \le 61\,\frac{\log\log n}{\log n}.$$
For each m ∈ [ n ], one has P [ H n Q ( n/H n ) = m ] = X p ∈P ∗ n p | m P h H n = m/p, Q ( ⌊ np/m ⌋ ) = p i = 1 nL n X p ∈P ∗ n p | m np/m π ( np/m ) . Then, | P [ H n Q ( n/H n ) = m ] − P [ J n = m ] | ≤ nL n | X p ∈P ∗ n p | m (log( np/m ) − − L n | + KnL n (1 + ω ( m )) , where K := sup x ≥ (cid:12)(cid:12)(cid:12) x π ( x ) − (log x − (cid:12)(cid:12)(cid:12) . Therefore, setting s ( m ) := Q p | m p , one has d TV ( J n , H n Q ( n/H n )) = 12 n X m =1 | P [ H n Q ( n/H n ) = m ] − P [ J n = m ] | ≤ R + R , (3.9) where R := 12 nL n | n X m =1 (1 + ω ( m ))(log( n/m ) −
1) + log s ( m ) − L n | ,R := K nL n n X m =1 (1 + ω ( m )) . To bound R , we use the fact that s ( m ) ≤ log( n ) < L n for all m ≤ n , to deduce that R ≤ L n E [ | log( n/J n ) − | ] + 12 L n E [ | ω ( J n )(log( n/J n ) − | ] + 12 L n | E [ L n − log s ( J n )] | . By applying Cauchy-Schwarz inequality to the first two terms, we deduce that R is boundedfrom above by R , + R , , where R , := 12 L n ( E [ ω ( J n ) ] + 1) E [ | log( n/J n ) − | ] R , := 12 L n | E [ L n − log s ( J n )] | . The term R , can be bounded by using the integral approximation E [(log( n/J n ) − ] = 1 + 1 n n − X m =1 log( n/m ) − n n − X m =1 log( n/m ) ≤ n − X m =1 Z mnm − n log(1 /x ) dx − n − X m =1 Z m +1 nmn log(1 /x ) dx ≤ Z n log(1 /x ) dx, where the last step follows from the fact that R log( x ) dx = 2 and R log(1 /x ) dx = 1. Thus,using the condition n ≥
21, we obtain E [(log( n/J n ) − ] ≤ . . As a consequence, by (B.3), we have R , ≤ nL n . On the other hand, it is easy to see by integral approximation thatlog( n ) − n Z n log( x ) dx ≤ n n X m =1 log( m ) ≤ n Z n +11 log( x ) dx = n + 1 n log( n + 1) − , which, combined with the relation log( n + 1) ≤ L n ≤ log( n ) + 1, leads to | L n − E [log( J n )] | ≤ . PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 15
On the other hand, E [log( J n ) − log( s ( J n ))] = X p ∈P n E [( α p ( J n ) − ( α p ( J n ) ≥ p )= X p ∈P n E [( α p ( J n ) − + ] log( p ) ≤ X p ∈P n log( p ) X k ≥ P [ α p ( J n ) ≥ k ] = X p ∈P n log( p ) p (1 + (1 − p − ) − ) ≤ , so that R , ≤ L n . We thus conclude that R ≤ . n ) L n ≤ . n )log( n ) . (3.10)To bound R , we use (B.3) as well as the condition n ≥
21, to show that12 n n X m =1 (1 + ω ( m )) ≤
12 log log( n )(1 + 2 . n ) ) ≤ .
63 log log( n ) , which leads to R ≤ . KL n log log( n ) ≤ . K log log( n )log( n ) . (3.11)It thus remains to bound K . To this end, we notice that by (3.1) and (3.3), for every x ≥ (cid:12)(cid:12)(cid:12) x π ( x ) − (log( x ) − (cid:12)(cid:12)(cid:12) ≤ log( x ) x (cid:12)(cid:12)(cid:12) x − (1 + π ( x ))(log( x ) − (cid:12)(cid:12)(cid:12) ≤ log( x ) x (cid:12)(cid:12)(cid:12) x − (1 + x log( x ) + x log( x ) )(log( x ) − (cid:12)(cid:12)(cid:12) + 184log( x )= log( x ) x (cid:12)(cid:12)(cid:12) x log( x ) − log( x ) (cid:12)(cid:12)(cid:12) + 184log( x ) , so that (cid:12)(cid:12)(cid:12) x π ( x ) − (log( x ) − (cid:12)(cid:12)(cid:12) ≤ x ) ≤ . . For 1 ≤ x ≤ (cid:12)(cid:12)(cid:12)(cid:12) x π ( x ) − (log( x ) − (cid:12)(cid:12)(cid:12)(cid:12) ≤ (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) x x log( x ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) + | log( x ) − | ≤ x ≥
17 and (cid:12)(cid:12)(cid:12)(cid:12) x π ( x ) − (log( x ) − (cid:12)(cid:12)(cid:12)(cid:12) ≤ x + | log( x ) − | ≤ , for x ≤
17, we obtain (cid:12)(cid:12)(cid:12)(cid:12) x π ( x ) − (log( x ) − (cid:12)(cid:12)(cid:12)(cid:12) ≤ . Consequently, K ≤ . R ≤ . n )log( n ) . (3.12)The result follows from (3.9), (3.10) and (3.12). (cid:3) The following lemma will be useful when studying the Poisson approximations for ψ ( J n ). Lemma 3.2.
Let H n and { Q ( k ) } k ≥ be as before. Then, for n ≥ P [ Q ( n/H n ) divides H n ] ≤ . n )log( n ) . Proof.
Define T n := P [ Q ( n/H n ) divides H n ], and let P ∗ m be given as in (3.8). Recall that d TV ( X, X ′ ) ≤ P [ X = X ′ ] for any coupling of ( X, X ′ ) defined on (Ω , F , P ). As a consequence, d TV ( ω ( H n Q ( n/H n )) , ω ( H n ) + 1) ≤ T n ≤ P [ Q ( n/H n ) divides H n , H n ≤ n/
2] + P [ H n ≥ n/ . The second term is bounded by . n ) due to (B.8) and the condition n ≥
21. Consequently, T n ≤ . n ) + 1 L n n/ X k =1 k (1 + π ( ⌊ n/k ⌋ )) X p ∈P ∗⌊ n/k ⌋ ( p | k ) ≤ . n ) + 1 L n n/ X k =1 k (1 + π ( ⌊ n/k ⌋ )) (cid:0) X p ∈P k ( p | k ) (cid:1) . Thanks to (3.1), for all x ≥
2, we have π ( x ) ≥ . x/ log( x ). Hence, T n ≤ . n ) + 1 . nL n n X k =1 X p ∈P k ( p | k ) log( n/k ) ≤ . n ) + 1 . L n E [ ω ( J n ) log( n/J n )] . By an integral comparison, we can easily show that E [log( n/J n ) ] ≤ n n X m =1 log( n/m ) ≤ Z log(1 /x ) dx = 2 . Combining this inequality with (B.3) and Cauchy-Schwarz inequality, we thus get T n ≤ . n ) + 5 log log( n ) L n ≤ . n )log( n ) , where in the last inequality we used the condition n ≥ (cid:3) Stein’s method for normal approximation.
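The characterizing identity at the heart of the method described below, $\mathbb{E}[f'(N) - N f(N)] = 0$ for standard Gaussian $N$, can be illustrated numerically; the sketch below (our own illustration, standard library only, with the arbitrary choice $f(x) = \sin(x)$) approximates this expectation by a midpoint Riemann sum against the Gaussian density:

```python
# Sketch: numerical illustration of the Gaussian Stein identity
#   E[f'(N) - N f(N)] = 0,   N standard Gaussian,
# for the (arbitrarily chosen) test function f(x) = sin(x).
import math

def gaussian_density(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def stein_expectation(f, f_prime, a=-12.0, b=12.0, steps=200_000):
    """Approximate E[f'(N) - N f(N)] by a midpoint rule on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += (f_prime(x) - x * f(x)) * gaussian_density(x) * h
    return total

value = stein_expectation(math.sin, math.cos)  # should be ~0
```

Both $\mathbb{E}[\cos N]$ and $\mathbb{E}[N \sin N]$ equal $e^{-1/2}$, so the difference vanishes; the quadrature and Gaussian-tail errors are far below $10^{-5}$.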
The so-called “Stein’s method” is a collection of probabilistic techniques that allow one to assess the distance between two probability distributions by means of differential operators. It was first introduced in the pathbreaking paper [24] by Charles Stein, for obtaining Gaussian approximations.

The basic idea of Stein’s method for Gaussian approximations consists in noticing that if $N$ is a random variable with standard Gaussian distribution and $f : \mathbb{R} \to \mathbb{R}$ is an absolutely continuous function satisfying $\mathbb{E}[|f'(N)|] < \infty$, then $\mathbb{E}[\mathcal{A}[f](N)] = 0$, where $\mathcal{A}$ is the so-called “Stein characterizing operator”, defined over the set of differentiable functions, which maps $f$ to $\mathcal{A}[f]$ given by $\mathcal{A}[f](x) := f'(x) - x f(x)$. Then, at a heuristic level, if $F$ is a random variable with the property that $\mathbb{E}[\mathcal{A}[f](F)]$ is close to zero for a large class of absolutely continuous functions $f$, then $F$ must be close (in some meaningful probabilistic sense) to $N$.

This heuristic can be formalized quite beautifully by considering a test function $h : \mathbb{R} \to \mathbb{R}$ and solving for $f$ in Stein’s equation
$$\mathcal{A}[f](x) = h(x) - \mathbb{E}[h(N)]. \qquad (3.13)$$
This way, if $f_h$ denotes the solution of (3.13), then for every family of functions $\mathcal{K}$ such that (3.13) has a solution $f_h$ for all $h \in \mathcal{K}$,
$$d_{\mathcal{K}}(F, N) := \sup_{h \in \mathcal{K}} |\mathbb{E}[h(F)] - \mathbb{E}[h(N)]| = \sup_{h \in \mathcal{K}} |\mathbb{E}[\mathcal{A}[f_h](F)]|. \qquad (3.14)$$
Naturally, in order to find sharp bounds for the right-hand side of (3.14), we need knowledge of the regularity properties of the solutions $f_h$, for $h \in \mathcal{K}$. The next lemmas provide some of these properties for the case where $\mathcal{K} = \{\mathbf{1}_{(-\infty, z]} ; z \in \mathbb{R}\}$ and the case where $\mathcal{K}$ is the class of Lipschitz functions with Lipschitz constant at most 1.

Lemma 3.3.
[8, Lemma 2.3] If $h_z = \mathbf{1}_{(-\infty, z]}$ for some $z \in \mathbb{R}$, then (3.13) has a solution $f_z$ satisfying
$$\sup_{x \in \mathbb{R}} |f_z(x)| \le \frac{\sqrt{2\pi}}{4}, \qquad \sup_{x \in \mathbb{R}} |f'_z(x)| \le 1,$$
and, for all $u, v, w \in \mathbb{R}$,
$$|(w+u) f_z(w+u) - (w+v) f_z(w+v)| \le (|u| + |v|)\Big(|w| + \frac{\sqrt{2\pi}}{4}\Big). \qquad (3.15)$$
Moreover, $x \mapsto x f_z(x)$ is non-decreasing and $|x f_z(x)| \le 1$ for all $x, z \in \mathbb{R}$.

Lemma 3.4. [8, Lemma 2.4] For each $h$ Lipschitz continuous with Lipschitz constant at most 1, the equation (3.13) has a solution $f_h$ satisfying
$$\sup_{x \in \mathbb{R}} |f_h(x)| \le 2, \qquad \sup_{x \in \mathbb{R}} |f'_h(x)| \le \sqrt{2/\pi}, \qquad \sup_{x \in \mathbb{R}} |f''_h(x)| \le 2.$$

Stein’s method for Poisson approximation.
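The Poisson analogue of the Gaussian characterizing identity, $\mathbb{E}[\lambda f(M+1) - M f(M)] = 0$ for $M \sim \mathrm{Poisson}(\lambda)$ (introduced below), can be checked by direct summation; in this sketch the choices $f(k) = 1/(k+1)$ and $\lambda = 3$ are arbitrary:

```python
# Sketch: direct check of the Poisson Stein identity
#   E[lambda * f(M + 1) - M * f(M)] = 0,   M ~ Poisson(lambda),
# by summing against the Poisson probability mass function.
import math

def poisson_pmf(k, lam):
    # log-space evaluation avoids overflow of factorial for large k
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def stein_poisson(f, lam, cutoff=200):
    return sum((lam * f(k + 1) - k * f(k)) * poisson_pmf(k, lam)
               for k in range(cutoff + 1))

value = stein_poisson(lambda k: 1.0 / (k + 1), lam=3.0)  # should be ~0
```

The identity follows from reindexing $\mathbb{E}[M f(M)] = \sum_{k \ge 1} k f(k) e^{-\lambda}\lambda^k/k! = \lambda\, \mathbb{E}[f(M+1)]$, and the truncation error at cutoff 200 is negligible.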
The aforementioned ideas can also be applied when the target distribution is Poisson. The first work in this direction is the paper [7] by Chen, where the methodology was introduced and applied in the context of sums of independent, not necessarily identically distributed Bernoulli random variables. A classic reference on Poisson approximation by Stein’s method is the book [4] by Barbour et al. (the reader is also referred to the more recent references [6] and [12]). For approximations towards a Poisson random variable $M$ with parameter $\lambda$, the corresponding Stein operator becomes $\mathcal{H}_\lambda[f](x) := \lambda f(x+1) - x f(x)$, and the associated Stein equation is
$$\mathcal{H}_\lambda[f](x) = h(x) - \mathbb{E}[h(M)]. \qquad (3.16)$$
The idea for obtaining bounds on $d_{TV}(X, M)$, where $X$ is a random variable supported on the non-negative integers, consists in considering a test function $h(k) = \mathbf{1}(k \in B)$ and estimating $|\mathbb{E}[h(X) - h(M)]|$ via bounds on $|\mathbb{E}[\mathcal{H}_\lambda[f](X)]|$, where $f$ is the solution of (3.16). This task is achieved by means of the following result.

Lemma 3.5. [4, page] If $h(k) = \mathbf{1}(k \in B)$ for some $B \subset \mathbb{N}$, then (3.16) has a solution $f_h$ satisfying
$$\sup_{x \in \mathbb{N}} |f_h(x)| \le 1 \wedge \lambda^{-1/2} \quad \text{and} \quad \sup_{x \in \mathbb{N}} |f_h(x+1) - f_h(x)| \le (1 - e^{-\lambda})\lambda^{-1}.$$

Conditional independence of prime factorizations.
In this section we present the key ingredient of our approach: a result exhibiting a conditional independence structure for the factors of the prime factorization of the random variable $H_n$. In Appendix A, we will show that this type of phenomenon extends to a much more general family of probability distributions (see Remark 3.7).

Our starting point is the well-known relation between the $p$-adic valuation of $J_n$, given by $\alpha_p(J_n)$, and geometric random variables. Let $\{\xi_p\}_{p \in \mathcal{P}}$ be a family of independent geometric random variables with
$$\mathbb{P}[\xi_p = k] = p^{-k}(1 - p^{-1}).$$
Then, for any $i \in \mathbb{N}$ and $k_1, \dots, k_i \in \mathbb{N}_0 = \mathbb{N} \cup \{0\}$, one has
$$\mathbb{P}[\alpha_{p_1}(J_n) \ge k_1, \dots, \alpha_{p_i}(J_n) \ge k_i] \to \mathbb{P}[\xi_{p_1} \ge k_1, \dots, \xi_{p_i} \ge k_i]$$
as $n \to \infty$, where $p_1, \dots, p_i$ are the first $i$ primes. To see this, one simply notices that
$$\bigcap_{j=1}^{i} \{\alpha_{p_j}(J_n) \ge k_j\} = \bigcap_{j=1}^{i} \{p_j^{k_j} \text{ divides } J_n\} = \Big\{\prod_{j=1}^{i} p_j^{k_j} \text{ divides } J_n\Big\},$$
and then applies (3.5). This hinges on the intuition that different $p$-adic valuations of $J_n$ become closer to independent as $n$ grows to infinity. Although this asymptotic independence is our main guiding principle for drawing interesting probabilistic conclusions, it is not directly applicable for obtaining non-asymptotic bounds. One of the new probabilistic inputs of this paper is the following non-asymptotic conditional independence structure of $\alpha_p(H_n)$.

Theorem 3.6.
Suppose that $n \ge 21$, and let $\{\xi_p\}_{p \in \mathcal{P}}$ and $L_n$ be given as before. Define the event
$$A_n := \Big\{\prod_{p \in \mathcal{P}_n} p^{\xi_p} \le n\Big\}, \qquad (3.17)$$
as well as the random vector $\vec{C}(n) := (\alpha_p(H_n);\ p \in \mathcal{P}_n)$. Then
$$\mathbb{P}[A_n] = L_n \prod_{p \le n} (1 - p^{-1}) \ge \frac{1}{2}, \qquad (3.18)$$
and
$$\mathcal{L}(\vec{C}(n)) = \mathcal{L}(\vec{\xi}(n) \mid A_n), \qquad (3.19)$$
where $\vec{\xi}(n) := (\xi_p;\ p \in \mathcal{P}_n)$. In particular,
$$\mathcal{L}(\psi(H_n)) = \mathcal{L}\Big(\sum_{p \in \mathcal{P}_n} \psi(p^{\xi_p}) \,\Big|\, A_n\Big). \qquad (3.20)$$

Remark 3.7.
As one can observe from the proof presented below, the heuristic explanation of why the above result holds comes from the fact that the probability mass function of the geometric distribution transforms “products over sets of primes” into “sums over sets of integers”, which allows us to use the multiplicative property of characteristic functions to our advantage. One can thus naturally ask whether Theorem 3.6 can be extended to more general families of distributions. This task can indeed be carried out without difficulty, provided that we impose a multiplicativity condition on the underlying random variable, as we explain in Appendix A. For the purposes of this manuscript, we mostly require knowledge of $\mathcal{L}(H_n)$, so in this section we only handle the case of the harmonic distribution and leave its generalized version, Proposition A.1, as an available tool for future related problems.

Proof of Theorem 3.6.
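Before giving the formal argument, the statement can be verified exactly for a small $n$ by direct enumeration. The sketch below (exact rational arithmetic; our own illustration with $n = 20$, not part of the proof) enumerates all exponent tuples with $\prod_p p^{c_p} \le n$ under the independent geometrics, and checks the product formula for $\mathbb{P}[A_n]$ as well as that the conditional law of the tuple matches the harmonic law of $H_n$:

```python
# Sketch: exact verification of the identities of Theorem 3.6 for n = 20.
# We enumerate every tuple (c_p)_{p <= n} with prod_p p^{c_p} <= n, where the
# xi_p are independent geometrics, P[xi_p = k] = p^{-k} (1 - 1/p), and compare
# P[A_n] with L_n * prod_{p <= n} (1 - 1/p), L_n = sum_{k <= n} 1/k.
from fractions import Fraction

n = 20
primes = [p for p in range(2, n + 1) if all(p % q for q in range(2, p))]

def tuple_probs(idx=0, value=1):
    """Yield (prod_p p^{c_p}, P[xi = c]) over all tuples c with product <= n."""
    if idx == len(primes):
        yield value, Fraction(1)
        return
    p = primes[idx]
    weight = Fraction(p - 1, p)  # (1 - 1/p)
    c, power = 0, 1              # exponent c_p and p^{c_p}
    while value * power <= n:
        for m, prob in tuple_probs(idx + 1, value * power):
            yield m, prob * weight * Fraction(1, power)
        c, power = c + 1, power * p

prob_A_n = sum(prob for _, prob in tuple_probs())
L_n = sum(Fraction(1, k) for k in range(1, n + 1))
expected = L_n
for p in primes:
    expected *= Fraction(p - 1, p)   # (3.18): P[A_n] = L_n * prod (1 - 1/p)

conditional = {}                      # law of prod_p p^{c_p} given A_n
for m, prob in tuple_probs():
    conditional[m] = conditional.get(m, Fraction(0)) + prob / prob_A_n
harmonic = {m: Fraction(1, m) / L_n for m in range(1, n + 1)}  # law of H_n
```

The equality of the two dictionaries is exactly the bijection between exponent tuples and $\{1, \dots, n\}$ exploited in the proof below.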
Consider a fixed vector $\vec{\lambda} = (\lambda_p;\ p \in \mathcal{P}_n) \in \mathbb{R}^{\pi(n)}$. For a given $n \in \mathbb{N}$, define the set
$$\mathcal{K}_n := \Big\{(c_p;\ p \in \mathcal{P}_n) \in \mathbb{N}_0^{\pi(n)}\ ;\ \prod_{p \in \mathcal{P}_n} p^{c_p} \le n\Big\}, \qquad (3.21)$$
consisting of the tuples of non-negative integers $c_p$, indexed by the primes $p$ belonging to $\mathcal{P}_n$ and satisfying the condition $\prod_{p \in \mathcal{P}_n} p^{c_p} \le n$. Observe that the prime factorization theorem induces a natural bijective correspondence between the sets $\mathcal{K}_n$ and $\{1, \dots, n\}$. Let $f : \mathbb{N}_0^{\pi(n)} \to \mathbb{R}$ be bounded. The bijection allows us to write
$$\mathbb{E}[f(\vec{\xi}(n)) \mathbf{1}(A_n)] = \sum_{\vec{c} = (c_p;\, p \in \mathcal{P}_n) \in \mathcal{K}_n} f(\vec{c})\, \mathbb{P}[\xi_p = c_p \text{ for all } p \in \mathcal{P}_n] = \sum_{\vec{c} \in \mathcal{K}_n} f(\vec{c}) \prod_{p \in \mathcal{P}_n} (1 - p^{-1}) \prod_{p \in \mathcal{P}_n} p^{-c_p} = \sum_{k=1}^{n} f(\alpha_p(k),\ p \in \mathcal{P}_n)\, k^{-1} \prod_{p \in \mathcal{P}_n} (1 - p^{-1}) = \mathbb{E}[f(\alpha_p(H_n),\ p \in \mathcal{P}_n)]\, L_n \prod_{p \in \mathcal{P}_n} (1 - p^{-1}).$$
Letting $f(\vec{c}) = 1$ for all $\vec{c} \in \mathbb{N}_0^{\pi(n)}$, we deduce that
$$\mathbb{P}[A_n] = L_n \prod_{p \le n} (1 - p^{-1}), \qquad (3.22)$$
which in addition gives $\mathbb{E}[f(\vec{\xi}(n)) \mid A_n] = \mathbb{E}[f(\alpha_p(H_n),\ p \in \mathcal{P}_n)]$, hence implying (3.19). To prove the inequality (3.18), we use the identities (3.22) and (3.4), as well as the facts that $L_n \ge \log(n+1)$ and $n \ge 21$, to obtain
$$\mathbb{P}[A_n] \ge e^{-\gamma}\big(1 - \log(21)^{-2}\big) \ge \frac{1}{2}.$$
Finally, we notice that identity (3.20) follows easily from (3.19), since
$$\mathcal{L}(\psi(H_n)) = \mathcal{L}\Big(\sum_{p \in \mathcal{P}_n} \psi\big(p^{\alpha_p(H_n)}\big)\Big) = \mathcal{L}\Big(\sum_{p \in \mathcal{P}_n} \psi(p^{\xi_p}) \,\Big|\, A_n\Big). \qquad \square$$

Poisson embedding and linear approximation.
Another key idea for proving Theorem 2.4 consists in regarding the law of $\psi(H_n)$ as the distribution of a suitable functional of a Poisson point process. In view of Theorem 3.6, this task can be carried out simply by viewing the $\xi_p$’s as functionals of a Poisson point process. To achieve this, we define the (discrete) ambient space $\mathbb{X} := \{(p, k) : p \in \mathcal{P},\ k \in \mathbb{N}\}$ and consider a Poisson process $\eta$, defined on $\mathbb{X}$, with intensity measure $\lambda : \mathbb{X} \to \mathbb{R}_+$ given by
$$\lambda(p, k) = \frac{1}{k p^k}, \qquad \text{for all } p \in \mathcal{P},\ k \in \mathbb{N}.$$
Using an elementary manipulation of characteristic functions, one can easily show that if $\xi$ is a geometric random variable satisfying $\mathbb{P}[\xi = k] = (1 - \rho)\rho^k$ for $k \in \mathbb{N}_0$ and $\rho \in (0, 1)$, then
$$\xi \stackrel{\mathrm{Law}}{=} \sum_{k \ge 1} k\, M_\rho(k),$$
where the random variables $M_\rho(k)$, indexed by $k \in \mathbb{N}$, are independent Poisson with parameters $\rho^k / k$, respectively. Applying this result to the variables $\xi_p$, we obtain the identity in law
$$(\xi_p,\ p \in \mathcal{P}) \stackrel{\mathrm{Law}}{=} \Big(\sum_{k \in \mathbb{N}} k\, \eta(p, k),\ p \in \mathcal{P}\Big). \qquad (3.23)$$
Taking (3.23) into consideration, we will assume in the sequel that
$$\xi_p := \sum_{k \in \mathbb{N}} k\, \eta(p, k) \qquad (3.24)$$
for every $p \in \mathcal{P}$. Observe that the additivity of $\psi$ implies that
$$\psi\Big(\prod_{p \in \mathcal{P}_n} p^{\xi_p}\Big) = \sum_{p \in \mathcal{P}_n} \psi(p^{\xi_p}), \qquad (3.25)$$
which induces a natural dependence of the law of $\psi(H_n)$ on the underlying Poisson process $\eta$ via Theorem 3.6. However, for computational simplicity, we will instead use the identity $\mathbf{1}(\xi_p = 1) = \xi_p - \xi_p \mathbf{1}(\xi_p \ge 2)$ to write
$$\psi\Big(\prod_{p \in \mathcal{P}_n} p^{\xi_p}\Big) = Y_n + R_n, \qquad (3.26)$$
where
$$Y_n := \sum_{p \in \mathcal{P}_n} \psi(p)\, \xi_p \quad \text{and} \quad R_n := \sum_{p \in \mathcal{P}_n} \big(\psi(p^{\xi_p}) - \psi(p)\, \xi_p\big)\, \mathbf{1}(\xi_p \ge 2). \qquad (3.27)$$
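The Poisson embedding (3.23) can be sanity-checked through probability generating functions: for a geometric $\xi$ with parameter $\rho$ (here $\rho$ plays the role of $p^{-1}$), the PGF of $\sum_{k \ge 1} k M_\rho(k)$ telescopes to the geometric PGF. A minimal numerical sketch (the values $\rho = 0.5$, $z = 0.7$ and the cutoff are arbitrary choices):

```python
# Sketch: generating-function check of the embedding
#   xi  =(law)=  sum_{k >= 1} k * M_rho(k),   M_rho(k) ~ Poisson(rho^k / k),
# for a geometric xi with P[xi = k] = (1 - rho) rho^k, k >= 0.
import math

def geometric_pgf(z, rho):
    return (1 - rho) / (1 - rho * z)

def poisson_sum_pgf(z, rho, cutoff=200):
    # log E[z^{sum_k k M(k)}] = sum_k (rho^k / k)(z^k - 1), truncated
    log_pgf = sum((rho**k / k) * (z**k - 1) for k in range(1, cutoff + 1))
    return math.exp(log_pgf)

rho, z = 0.5, 0.7
difference = abs(geometric_pgf(z, rho) - poisson_sum_pgf(z, rho))
```

Analytically, $\sum_k \rho^k z^k / k - \sum_k \rho^k / k = -\log(1 - \rho z) + \log(1 - \rho)$, so the exponential equals $(1-\rho)/(1-\rho z)$ exactly.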
The decomposition (3.26) will be of great help in future computations, due to the fact that $Y_n$ has a compound Poisson distribution, while $R_n$ is an error term satisfying
$$\mathbb{E}[|R_n|] \le c_1 + 2 c_2, \qquad (3.28)$$
due to Lemma B.1 in the Appendix.

3.7. Integration by parts for linear functionals of $\eta$. As is usually the case in Stein’s method, a suitable integration by parts formula greatly simplifies computations. This task can be addressed by using the Poisson integral structure of $Y_n$ and the following integration by parts formula, obtained as an easy consequence of Mecke’s formula (or from Palm theory for Poisson processes). In what follows, we write $\mu(\rho) := \int_{\mathbb{X}} \rho(x)\, \mu(dx)$ for any positive measure $\mu$ on $\mathbb{X}$ and function $\rho : \mathbb{X} \to \mathbb{R}$.

Lemma 3.8.
Write $\widetilde{\eta}(\rho) = \eta(\rho) - \lambda(\rho)$ for the compensated Poisson integral of a kernel function $\rho \in L^1(\mathbb{X}, d\lambda) \cap L^2(\mathbb{X}, d\lambda)$. Let $G := G(\eta)$ be square-integrable and $\sigma(\eta)$-measurable. Then
$$\mathbb{E}[\widetilde{\eta}(\rho)\, G(\eta)] = \int_{\mathbb{X}} \rho(x)\, \mathbb{E}[D_x G(\eta)]\, \lambda(dx), \qquad (3.29)$$
where $D_x G(\eta) := G(\eta + \delta_x) - G(\eta)$.

Proof.
Mecke’s equation [18, Theorem 4.1] states that for $\rho$ and $G$ as above,
$$\mathbb{E}[\eta(\rho)\, G(\eta)] = \mathbb{E}\Big[\int_{\mathbb{X}} \rho(x)\, G(\eta)\, \eta(dx)\Big] = \int_{\mathbb{X}} \mathbb{E}[\rho(x)\, G(\eta + \delta_x)]\, \lambda(dx).$$
Subtracting $\mathbb{E}[\lambda(\rho)\, G(\eta)]$ from both sides, one arrives at
$$\mathbb{E}[\widetilde{\eta}(\rho)\, G(\eta)] = \int_{\mathbb{X}} \mathbb{E}[\rho(x)(G(\eta + \delta_x) - G(\eta))]\, \lambda(dx) = \int_{\mathbb{X}} \rho(x)\, \mathbb{E}[D_x G(\eta)]\, \lambda(dx),$$
as required. □

Remark 3.9. (i) The lemma above is in fact a duality formula: the compensated Poisson measure applied to a deterministic function is the dual operation of the difference operator $D$, which is customarily called the Kabanov–Skorohod integral. We stress that the duality (or integration by parts) formula on the Poisson space holds for more general Poisson functionals, and reduces to our case when applied to linear ones. We refer the interested reader to the monographs [21, 18] for more identities of this kind, and to Last, Peccati and Schulte [19], Döbler and Peccati [9] and the recent work [17] for more striking applications of duality formulas for normal approximation on the Poisson space.
(ii) The integrability condition is satisfied automatically for each $\rho_n$ and $\varrho_n$ utilized in the proofs, as they are supported on subsets of $\mathbb{X}$ with finite $\lambda$-measure.

Proof of Theorem 2.4: Wasserstein bound
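Before the proof, the integration by parts formula of Lemma 3.8, which drives the whole argument, can be sanity-checked in the simplest setting where $\mathbb{X}$ is a single point; then $\eta$ is just a Poisson count $N$, $\rho \equiv 1$, and (3.29) reads $\mathbb{E}[(N - \lambda) G(N)] = \lambda\, \mathbb{E}[G(N+1) - G(N)]$. A minimal sketch (the choices $G(k) = \sin(k)$ and $\lambda = 3$ are arbitrary):

```python
# Sketch: Lemma 3.8 on a one-point space, where it reduces to
#   E[(N - lambda) G(N)] = lambda * E[G(N + 1) - G(N)],  N ~ Poisson(lambda),
# checked by direct summation of the Poisson mass function.
import math

def poisson_pmf(k, lam):
    # log-space evaluation avoids overflow of factorial for large k
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def lhs(G, lam, cutoff=200):
    return sum((k - lam) * G(k) * poisson_pmf(k, lam) for k in range(cutoff + 1))

def rhs(G, lam, cutoff=200):
    return lam * sum((G(k + 1) - G(k)) * poisson_pmf(k, lam) for k in range(cutoff + 1))

difference = abs(lhs(math.sin, 3.0) - rhs(math.sin, 3.0))
```

Here adding one atom ($\eta + \delta_x$) corresponds to shifting $N$ to $N + 1$, which is exactly the difference operator $D_x$ of the lemma.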
This section is devoted to proving equation (2.6). Recall that, by Theorem 3.6, we have $\mathcal{L}(\psi(H_n)) = \mathcal{L}(Y_n + R_n \mid A_n)$, where $Y_n$ and $R_n$ are given by (3.27) and $A_n$ by (3.17). Define $W_n$, $\widetilde{W}_n$ by
$$W_n := \sigma_n^{-1}(Y_n - \mu_n) \quad \text{and} \quad \widetilde{W}_n := \sigma_n^{-1}\Big(\sum_{p \in \mathcal{P}_n} \psi(p^{\xi_p}) - \mu_n\Big), \qquad (4.1)$$
so that the law of the underlying approximating sequence
$$Z_n = Z_n^{\psi} := \sigma_n^{-1}(\psi(H_n) - \mu_n) \qquad (4.2)$$
can be written as
$$\mathcal{L}(Z_n) = \mathcal{L}(\widetilde{W}_n \mid A_n) = \mathcal{L}(W_n + \sigma_n^{-1} R_n \mid A_n). \qquad (4.3)$$
Let $I_n$ denote the indicator of $A_n$. From (4.3), (3.18) and the definition of the 1-Wasserstein distance, it follows that
$$d(Z_n, \mathcal{L}(W_n \mid A_n)) \le \sigma_n^{-1}\, \mathbb{E}[|R_n| \mid A_n] \le 2\sigma_n^{-1}\, \mathbb{E}[|R_n| I_n],$$
which by the triangle inequality and (3.28) implies that
$$d(Z_n, N) \le d(\mathcal{L}(W_n \mid A_n), N) + \sigma_n^{-1}(4 c_2 + 2 c_1). \qquad (4.4)$$
We have thus reduced the problem to bounding $d(\mathcal{L}(W_n \mid A_n), N)$. This task can be achieved by implementing Stein’s bound (3.14) for the conditional law $\mathcal{L}(W_n \mid A_n)$ and then bounding from above the quantity
$$|\mathbb{E}[\mathcal{A}[f](W_n) \mid A_n]| = \mathbb{P}[A_n]^{-1}\, |\mathbb{E}[f(W_n) W_n I_n] - \mathbb{E}[f'(W_n) I_n]| \le 2\, |\mathbb{E}[f(W_n) W_n I_n] - \mathbb{E}[f'(W_n) I_n]|, \qquad (4.5)$$
where $f$ is the solution to Stein’s equation (3.13) with respect to a test function $h \in \mathrm{Lip}(1)$, satisfying the uniform bounds in Lemma 3.4.
Step I
In order to bound (4.5), we first find a suitable integration by parts formula for $W_n$. Recall from Section 3.6 that $\xi_p = \xi_p(\eta) = \sum_{k \ge 1} k\, \eta(p, k)$. Thus, we can write
$$W_n = \frac{1}{\sigma_n} \sum_{p \in \mathcal{P}_n} \sum_{k \ge 1} \psi(p)\, k \Big(\eta(p, k) - \frac{1}{k p^k}\Big) = \widetilde{\eta}(\rho_n), \quad \text{where} \quad \rho_n(p, k) := \sigma_n^{-1}\, k\, \psi(p)\, \mathbf{1}(p \in \mathcal{P}_n). \qquad (4.6)$$
From the definition of the difference operator $D_x$, one can easily check that if $F_1$ and $F_2$ are $\sigma(\eta)$-measurable, then
$$D_x(F_1 F_2) = F_1 D_x F_2 + F_2 D_x F_1 + D_x F_1\, D_x F_2. \qquad (4.7)$$
Thus, applying Lemma 3.8 to $G = f(W_n) I_n$, we can write
$$\mathbb{E}[f(W_n) W_n I_n] = \mathbb{E}[\widetilde{\eta}(\rho_n) f(\widetilde{\eta}(\rho_n)) I_n] = J_n + \varepsilon_n, \qquad (4.8)$$
where
$$J_n := \int_{\mathbb{X}} \rho_n(x)\, \mathbb{E}[I_n\, D_x(f(\widetilde{\eta}(\rho_n)))]\, \lambda(dx), \qquad \varepsilon_n := \int_{\mathbb{X}} \rho_n(x)\, \mathbb{E}[(f(\widetilde{\eta}(\rho_n)) + D_x f(\widetilde{\eta}(\rho_n)))\, D_x I_n]\, \lambda(dx).$$
Next we describe separately the behavior of each term on the right-hand side of (4.8).
Step II
First we analyze J n . To this end, we use Taylor’s formula to write | D x ( f ( e η ( ρ n ))) − f ′ ( e η ( ρ n )) ρ n ( x ) | = | f ( e η ( ρ n ) + ρ n ( x )) − f ( e η ( ρ n )) − f ′ ( e η ( ρ n )) ρ n ( x ) |≤ k f ′′ k ∞ | ρ n ( x ) | ≤ p /π | ρ n ( x ) | , where the last inequality follows from Lemma 3.4. It is clear that W n is standardized by ourchoice of µ n and σ n , and consequently k ρ n k L ( X ,dλ ) = V ar[ W n ] = 1 by moment formula forPoisson integrals [18, Lemma 12.2], yielding | J n − E [ I n f ′ ( W n )] | = (cid:12)(cid:12)(cid:12) J n − Z X ρ n ( x ) E [ I n f ′ ( e η ( ρ n ))] λ ( dx ) (cid:12)(cid:12)(cid:12) ≤ p /π Z X | ρ n ( x ) | λ ( dx ) . From (4.6) and Lemma B.4, we have Z X | ρ n ( x ) | λ ( dx ) = 1 σ n X p ∈P n X k ≥ | ψ ( p ) | k p k ≤ c σ n X p ∈P n | ψ ( p ) | (1 + p − ) p (1 − p − ) = c σ n X p ∈P n | ψ ( p ) | Var[ ξ p ](1 + p − )1 − p − ≤ c σ n X p ∈P n Var[ ψ ( p ) ξ p ] = 3 c σ n . We thus conclude that | J n − E [ I n f ′ ( W n )] | ≤ p /πc σ n . (4.9) Step III
Next we handle the term ε n . We have | ε n | ≤ k f k ∞ Z X | ρ n ( x ) | E [ | D x I n | ] λ ( dx ) ≤ Z X | ρ n ( x ) | E [ | D x I n | ] λ ( dx ) , (4.10)where the last inequality follows from Lemma 3.4. Hence, it suffices to consider E [ | D x I n | ].We notice that | D x I n | ∈ { , } , more precisely, setting x = ( p , k ) ∈ P n × N , we have D x I n = ( p k Y p ∈P n p P ∞ k =1 kη ( p,k ) ≤ n ) − ( Y p ∈P n p P ∞ k =1 kη ( p,k ) ≤ n )= − ( Y p ∈P n p ξ p ∈ [ p − k n, n ]) Hence, D x I n = 0 implies I n = 0. Applying Theorem 3.6 and Lemma B.3, we have E [ | D x I n | ] = E [ I n | D x I n | ] = P [ A n ] E [ | D x I n || A n ] ≤ P [ Y p ∈P n p ξ p ∈ [ p − k n, n ] | A n ] = P [ H n ≥ np − k ] ≤ k log( p )log( n ) . Plugging this estimate back to (4.10) and applying Lemma B.4, we obtain | ε n | ≤ X p ∈P n X k ≥ σ n | ψ ( p ) | log( p )log( n ) kp k ≤ c σ n X p ∈P n log( p ) p log( n ) ≤ c σ n , (4.11)where the last inequality follows from Merten’s formula (3.6). Inequality (2.6) follows from(4.4), (4.5), (4.8), (4.9) and (4.11).5. Proof of Theorem 2.4: Kolmogorov bound
Now we proceed with the proof of (2.5). Let $W_n$, $\widetilde{W}_n$ and $Z_n$ be given as in (4.1) and (4.2). It is not evident that $\mathcal{L}(\widetilde{W}_n \mid A_n)$ and $\mathcal{L}(W_n \mid A_n)$ are as close in the Kolmogorov distance as they are in the Wasserstein distance, so we have to implement Stein’s method directly for $\mathcal{L}(Z_n) = \mathcal{L}(\widetilde{W}_n \mid A_n)$. We will see, however, that the decomposition (4.3) is still relevant in order to apply our integration by parts formula, Lemma 3.8.

It suffices to bound $|\mathbb{E}[\mathcal{A}[f](Z_n)]|$, for $f$ given as the solution to the Stein equation associated to a test function $h$ of the form $h(x) := \mathbf{1}(x \le z)$, for $z \in \mathbb{R}$. By Theorem 3.6, we have that
$$|\mathbb{E}[\mathcal{A}[f](Z_n)]| = \mathbb{P}[A_n]^{-1}\, |\mathbb{E}[f(\widetilde{W}_n) \widetilde{W}_n I_n] - \mathbb{E}[f'(\widetilde{W}_n) I_n]| \le 2\, |\mathbb{E}[f(\widetilde{W}_n) \widetilde{W}_n I_n] - \mathbb{E}[f'(\widetilde{W}_n) I_n]| \le \frac{\sqrt{2\pi}}{2}\, \sigma_n^{-1}\, \mathbb{E}[|R_n|] + 2\, |\mathbb{E}[f(\widetilde{W}_n) W_n I_n] - \mathbb{E}[f'(\widetilde{W}_n) I_n]|, \qquad (5.1)$$
where we have used (4.3) and Lemma 3.3 for the last inequality. Observe that by relation (3.28),
$$\frac{\sqrt{2\pi}}{2}\, \mathbb{E}[|R_n|] \le 1.26\, c_1 + 2.51\, c_2, \qquad (5.2)$$
so it suffices to estimate the second term on the right-hand side of (5.1). As before, we split the rest of the proof into several steps.

Step I
In order to handle the second term in (5.1), we apply Lemma 3.8 and (4.7) to obtain E [ W n f ( ˜ W n ) I n ] = Z X ρ n ( x ) E [ D x ( f ( ˜ W n ) I n )] λ ( dx ) = T n + ε n, , (5.3) PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 25 where T n := Z X ρ n ( x ) E [ D x ( f ( ˜ W n )) I n ] λ ( dx ) , (5.4) ε n, := Z X ρ n ( x ) E [( f ( ˜ W n ) + D x ( f ( ˜ W n )) D x I n )] λ ( dx ) . Notice that Lemma 3.3, together with the argument leading to (4.11), yields | ε n, | ≤ k f k ∞ Z X ρ n ( x ) E [ | D x I n | ] λ ( dx ) ≤ . c σ n , (5.5)so that we are left to show that T n and E [ f ′ ( ˜ W n ) I n ] are close. Step II
In view of the form of T n , it helps to rewrite E [ f ′ ( ˜ W n ) I n ] as follows E [ f ′ ( ˜ W n ) I n ] = Z X ρ n ( x ) E [ ρ n ( x ) f ′ ( ˜ W n ) I n ] λ ( dx ) , where we used the fact that R X ρ n dλ = 1. On the other hand, by Taylor’s formula, D x ( f ( ˜ W n )) − f ′ ( ˜ W n ) D x ˜ W n = D x ˜ W n Z ( f ′ ( ˜ W n + tD x ˜ W n ) − f ′ ( ˜ W n )) dt. Therefore, approximating D x ( f ( ˜ W n )) by D x ˜ W n f ′ ( ˜ W n ) in (5.4), and then D x ˜ W n by ρ n ( x ),we have T n − E [ f ′ ( ˜ W n ) I n ] = ε n, + ε n, , (5.6)where ε n, := Z Z X ρ n ( x ) E [ D x ˜ W n ( f ′ ( ˜ W n + tD x ˜ W n ) − f ′ ( ˜ W n )) I n ] λ ( dx ) dt,ε n, := Z X ρ n ( x ) E [( D x ˜ W n − ρ n ( x )) f ′ ( ˜ W n ) I n ] λ ( dx ) . To bound both terms, we have to understand D x ˜ W n . Since ˜ W n is a non-linear functional of η , the quantity D x ˜ W n would be random, in contrast to D x W n . Write x = ( p, k ) ∈ P n × N ,we have D x ˜ W n = ˜ W n ( η + δ x ) − ˜ W n ( η ) = σ − n ( ψ ( p ξ p + k ) − ψ ( p ξ p )) . (5.7)Indeed, the additional Dirac mass δ x that is added to η will affect only one summand in thedefinition of ˜ W n , more precisely, for all q ∈ P n , ξ q ( η + δ x ) = X j ≥ j ( η + δ x )( q, j ) = ( k + P j ≥ jη ( q, j ) = k + ξ p , if p = q, P j ≥ jη ( p, j ) = ξ p , if p = q. (5.8) Step III
We first bound the error ε n, . By Lemma 3.3 and the explicit form of ρ n and λ , we have | ε n, | ≤ Z X | ρ n ( x ) | E [ | D x ˜ W n − ρ n ( x ) | ] λ ( dx ) (5.9)= 1 σ n X p ∈P n X k ≥ | ψ ( p ) | p k E [ | ψ ( p ξ p + k ) − ψ ( p ξ p ) − kψ ( p ) | ] ≤ c σ n X p ∈P n p E [ | ψ ( p ξ p +1 ) − ψ ( p ξ p ) − ψ ( p ) | ]+ c σ n X p ∈P n X k ≥ p k E [ | ψ ( p ξ p + k ) − ψ ( p ξ p ) − kψ ( p ) | ] =: ε n, , + ε n, , . Notice that the term inside the expectation of ε n, , vanishes under the event ξ p = 0. Thus, ε n, , ≤ c σ − n ( δ + δ + δ ) with δ = X p ∈P n p E [ | ψ ( p ξ p +1 ) | ( ξ p ≥ ,δ = X p ∈P n p E [ | ψ ( p ξ p ) | ( ξ p ≥ , (5.10) δ = X p ∈P n | ψ ( p ) | p P [ ξ p ≥ . By the memoryless property L ( ξ p | ξ p ≥ k ) = L ( ξ p + k ) for all p ∈ P and k ∈ N , as well asthe Cauchy-Schwarz inequality, we have δ = X p ∈P n E [ | ψ ( p ξ p +2 ) | ] p ≤ ¯ ζ (2) / c . (5.11)The same argument leads to δ = X p ∈P n E [ | ψ ( p ξ p +1 ) | ] p ≤ ¯ ζ (2) c + X p ∈P n E [ | ψ ( p ξ p +1 ) | ( ξ p ≥ p ≤ ¯ ζ (2) c + X p ∈P n E [ | ψ ( p ξ p +2 )] p ≤ ¯ ζ (2) c + ¯ ζ (4) / c , (5.12)and δ ≤ ¯ ζ (2) c , yielding ε n, , ≤ σ n (cid:0) ζ (2) c + ( ¯ ζ (2) + ¯ ζ (4) ) c c (cid:1) ≤ σ n (cid:0) . c + 1 . c c (cid:1) . (5.13) PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 27
To bound ε n, , , we write ε n, , ≤ c σ − n ( δ ′ + δ ′ + δ ′ ) where δ ′ = X p ∈P X k ≥ p k E [ | ψ ( p ξ p + k ) | ] ,δ ′ = X p ∈P X k ≥ p k E [ | ψ ( p ξ p ) | ] , (5.14) δ ′ = X p ∈P X k ≥ p k k | ψ ( p ) | . Notice that by the identity L ( ξ p + k ) = L ( ξ p | ξ p ≥ k ), δ ′ = X p ∈P X k ≥ E [ | ψ ( p ξ p ) | ( ξ p ≥ k )] ≤ X p ∈P E [ | ψ ( p ξ p ) | ξ p ( ξ p ≥ X p ∈P p E [ | ψ ( p ξ p +2 ) | ( ξ p + 2)] ≤ c (cid:16) X p ∈P E [( ξ p + 2) ] p (cid:17) / ≤ (11 ¯ ζ (2)) / c . (5.15)where we used the Cauchy-Schwarz inequality and Lemma B.4 for the two inequalities. Thesame argument implies δ ′ = X p ∈P (1 − p − ) − p E [ | ψ ( p ξ p ) | ( ( ξ p = 1) + ( ξ p ≥ ≤ ¯ ζ (3) c + 2 X p ∈P E [ | ψ ( p ξ ) | ( ξ p ≥ p = ¯ ζ (3) c + 2 X p ∈P E [ | ψ ( p ξ p +2 ) | ] p ≤ ¯ ζ (3) c + 2 ¯ ζ (6) / c , (5.16)and δ ′ ≤ ζ (2) c , yielding ε n, , ≤ σ n [( ¯ ζ (3) + 6 ¯ ζ (2)) c + ((11 ¯ ζ (2)) + 2 ¯ ζ (6) ) c c ] ≤ σ n (4 . c + 3 c c ) . (5.17)Combining (5.13) and (5.17), we obtain | ε n, | ≤ σ n (5 . c + 4 . c c ) . (5.18) Step IV
It remains to bound ε n, . Applying Stein’s equation f ′ ( x ) = xf ( x ) + ( x ≤ z ) − P [ N ≤ z ]leads to ε n, = Z Z ρ n ( x ) E [ D x ˜ W n (( ˜ W n + tD x ˜ W n ) f ( ˜ W n + tD x ˜ W n ) − ˜ W n f ( ˜ W n )) I n ] λ ( dx ) dt + Z Z ρ n ( x ) E [ D x ˜ W n ( ( ˜ W n + tD x ˜ W n ≤ z ) − ( W n ≤ z )) I n ] λ ( dx ) dt Since x xf ( x ) and x
7→ − ( x ≤ z ) are non-decreasing functions, one has for h ∈ R , t ∈ (0 ,
1) that h · (( x + h ) f ( x + h ) − xf ( x )) ≥ h · (( x + th ) f ( x + th ) − xf ( x )) ≥ , and h · ( ( x ≤ z ) − ( x + h ≤ z )) ≥ h · ( ( x ≤ z ) − ( x + th ≤ z )) ≥ . Applying these inequalities with h = D x ˜ W n and x = ˜ W n , we obtain | ε n, | ≤ Z X | ρ n ( x ) | E [ D x ˜ W n (( W n + D x ˜ W n ) f ( ˜ W n + D x ˜ W n ) − ˜ W n f ( ˜ W n ))] λ ( dx )+ Z X | ρ n ( x ) | E [ D x ˜ W n ( ( W n ≤ z ) − ( W n + D x ˜ W n ≤ z ))] λ ( dx )Observe that ( ˜ W n + D x ˜ W n ) f ( ˜ W n + D x ˜ W n ) − ˜ W n f ( ˜ W n ) = D x ( ˜ W f ( ˜ W )) , ( ˜ W n ≤ z ) − ( ˜ W n + D x ˜ W n ≤ z ) = − D x ( ( ˜ W n ≤ z )) , leading to the bound | ε n, | ≤ Z X | ρ n ( x ) | E [ D x ˜ W n D x ( ˜ W f ( ˜ W ) − ( ˜ W n ≤ z ))] λ ( dx ) = ε n, , + ε n, , where ε n, , := Z X | ρ n ( x ) | ρ n ( x ) E [ D x ( ˜ W n f ( ˜ W n ) − ( ˜ W n ≤ z ))] λ ( dx ) ,ε n, , := Z X | ρ n ( x ) | E [( D x ˜ W n − ρ n ( x )) D x ( ˜ W n f ( ˜ W n ) − ( ˜ W n ≤ z ))] λ ( dx ) . Applying Lemma 3.8 gives ε n, , = E [˜ η ( | ρ n | ρ n )( ˜ W f ( ˜ W ) − ( ˜ W n ≤ z ))] , so that by the fact that | xf ( x ) | ≤ x ∈ R in Lemma 3.3, one has | ε n, , | ≤ E [ | ˜ η ( | ρ n | ρ n ) | ] ≤ V ar[˜ η ( | ρ n | ρ n )] / = 3 (cid:16) Z X | ρ n ( x ) | λ ( dx ) (cid:17) / = 3 σ n (cid:16) X p ∈P n X k ≥ k | ψ ( p ) | p k (cid:17) / ≤ c σ n (cid:16) X p ∈P n ψ ( p ) p (cid:17) / ≤ c σ n X p ∈P n ψ ( p ) p (1 − p − ) − ≤ c σ n . (5.19)On the other hand, using | xf ( x ) − yf ( y ) | ≤ | ( x ≤ z ) − ( y ≤ z ) | ≤ x, y ∈ R ,we have | ε n, , | ≤ Z X | ρ n ( x ) | E [ | D x ˜ W n − ρ n ( x ) | ] λ ( dx ) . By (5.9), this integral can be handled the same way as ε n, , leading to the inequality | ε n, , | ≤ σ n (16 . c + 12 . c c ) (5.20) PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 29
Hence, by (5.19) and (5.20) | ε n, , | ≤ σ n (49 . c + 24 . c c )so that | ε n, | ≤ σ n (175 . c + 24 . c c ) . (5.21)Relations (5.6), (5.18) and (5.21) lead to | T n − E [ f ′ ( ˜ W n ) I n ] ≤ σ n (181 . c + 28 . c c ) . (5.22)Relations (5.1),(5.2), (5.3), (5.5) and (5.22) lead to the desired Kolmogorov bound.6. Proof of Theorem 2.1
Combining Theorem 2.4 with Lemma 3.1, we deduce that d K (cid:18) ψ ( J n ) − µ n σ n , N (cid:19) ≤ d K (cid:18) ψ ( H n Q ( n/H n )) − µ n σ n , ψ ( H n ) − µ n σ n (cid:19) (6.1)+ γ σ n + γ σ n + 61 log log( n )log( n ) , where N is a random variable with standard Gaussian distribution and γ , γ , are given asin (2.7). Define T n := P (cid:20) ψ ( H n Q ( n/H n )) − µ n σ n ≤ x (cid:21) − P (cid:20) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n ≤ x (cid:21) , and notice that | T n | ≤ | P (cid:20) ψ ( H n Q ( n/H n )) − µ n σ n ≤ x and Q ( n/H n ) H n (cid:21) − P (cid:20) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n ≤ x and Q ( n/H n ) H n (cid:21) | + 2 P [ Q ( n/H n ) divides H n ] . By the additivity of ψ , the absolute value in the right is equal to zero, and thus, | T n | ≤ . n )log( n ) , where the last inequality follows from Lemma 3.2. Thus, by condition (H1) , for every x ∈ R , (cid:12)(cid:12)(cid:12)(cid:12) P (cid:20) ψ ( H n Q ( n/H n )) − µ n σ n ≤ x (cid:21) − P (cid:20) ψ ( H n ) − µ n σ n ≤ x (cid:21) (cid:12)(cid:12)(cid:12)(cid:12) ≤ P (cid:20) ψ ( H n ) − µ n σ n ∈ [ x, x + σ − n c ] (cid:21) ∨ P (cid:20) ψ ( H n ) − µ n σ n ∈ [ x − σ − n c , x ] (cid:21) + 6 . n )log( n ) . By a further application of Theorem 2.4, (cid:12)(cid:12)(cid:12)(cid:12) P (cid:20) ψ ( H n Q ( n/H n )) − µ n σ n ≤ x (cid:21) − P (cid:20) ψ ( H n ) − µ n σ n ≤ x (cid:21) (cid:12)(cid:12)(cid:12)(cid:12) ≤ . n )log( n ) + sup x ∈ R P (cid:2) N ∈ [ x, x + σ − n c ] (cid:3) + γ σ n + γ σ n ≤ . n )log( n ) + 1 √ πσ n c + γ σ n + γ σ n , (6.2)where the last inequality follows from an application of the mean value theorem to the stan-dard Gaussian density. 
The Kolmogorov bound (2.2) then follows from (6.1) and (6.2).To show (2.3), we use Theorem 2.4, to get d (cid:18) ψ ( J n ) − µ n σ n , N (cid:19) ≤ d (cid:18) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n , ψ ( H n ) − µ n σ n (cid:19) (6.3)+ d (cid:18) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n , ψ ( J n ) − µ n σ n (cid:19) + γ σ n , where γ is given by (2.7). The first term can be bounded in the following way d (cid:18) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n , ψ ( H n ) − µ n σ n (cid:19) ≤ σ n E [ | ψ ( Q ( n/H n )) | ] ≤ c σ n . (6.4)To bound the term T n := d (cid:18) ψ ( H n ) + ψ ( Q ( n/H n )) − µ n σ n , ψ ( J n ) − µ n σ n (cid:19) , we use Dobrushin’s theorem, to guarantee the existence of a random variable Y n such that Y n Law = ψ ( H n ) + ψ ( Q ( n/H n )) and d TV ( ψ ( J n ) , ψ ( H n ) + ψ ( Q ( n/H n ))) = P [ ψ ( J n ) = Y n ] . In principle, Y n might not be defined over (Ω , F , P ), but we will assume that it is, at thecost of extending the underlying probability space, if necessary. On the other hand, for anyrandom variables X, Y , the bound d ( X, Y ) ≤ E [ | e X − e Y | ] holds where e X and e Y are equalin law to X and Y respectively. From here it follows that T n ≤ σ n E [ | ψ ( J n ) − Y n | ] = 1 σ n E [ | ψ ( J n ) − Y n | ( Y n = ψ ( J n ))] ≤ P [ Y n = ψ ( J n )] σ n ( k ψ ( J n ) k L ( Z ) + k ψ ( H n ) + ψ ( Q ( n/H n )) k L ( Z ) ) , where the last step follows from Cauchy-Schwarz inequality. By Lemma 3.1, P [ Y n = ψ ( J n )] = d TV ( ψ ( H n ) + ψ ( Q ( n/H n )) , ψ ( J n )) ≤ d TV ( ψ ( H n Q ( n/H n )) , ψ ( J n )) + d TV ( ψ ( H n ) + ψ ( Q ( n/H n )) , ψ ( H n Q ( n/H n ))) ≤
61 log log( n )log( n ) + d TV ( ψ ( H n ) + ψ ( Q ( n/H n )) , ψ ( H n Q ( n/H n ))) .
To handle the remaining term, we observe that d TV ( ψ ( H n ) + ψ ( Q ( n/H n )) , ψ ( H n Q ( n/H n ))) ≤ P [ Q ( n/H n ) divides H n ] ≤ . n )log( n ) , so that P [ Y n = ψ ( J n )] ≤ . n )log( n ) . From the previous analysis it follows that T n ≤ . (cid:18) log log n log n (cid:19) ( σ − n k ψ ( J n ) k L ( Z ) + σ − n k ψ ( H n ) k L ( Z ) + σ − n k ψ k P ) . (6.5)Combining (6.5) with Lemma B.2, we obtain T n ≤ log log( n ) log( n ) (24 . σ − n c + 60 . σ − n c ) , which by the condition σ n ≥ c + c ), gives T n ≤ . n ) log( n ) . Relation (2.3) follows from (6.3), (6.4) and (6.5).7.
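The Gaussian statement just established can be illustrated empirically. The sketch below is not part of the argument: it samples J_n uniformly from {1, ..., n}, standardizes ω(J_n) by log log n (a stand-in for the paper's µ_n and σ_n), and measures the Kolmogorov distance to N(0,1). The values of n and the sample size are arbitrary choices, and the distance is visibly nonzero because σ_n is only about 1.6 at this scale.

```python
# Monte Carlo illustration (not used in any proof) of the Erdős–Kac normal
# approximation for omega(J_n), the number of distinct prime factors of a
# uniform random integer J_n in {1, ..., n}. n, the sample size, and the
# centering/scaling by log log n are illustrative choices.
import math
import random

def omega(m):
    """Number of distinct prime factors of m, by trial division."""
    count, d = 0, 2
    while d * d <= m:
        if m % d == 0:
            count += 1
            while m % d == 0:
                m //= d
        d += 1
    if m > 1:
        count += 1
    return count

def kolmogorov_distance_to_normal(samples):
    """sup over sample points of |empirical CDF - standard normal CDF|."""
    xs = sorted(samples)
    N = len(xs)
    dist = 0.0
    for i, x in enumerate(xs):
        phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
        dist = max(dist, abs((i + 1) / N - phi), abs(i / N - phi))
    return dist

random.seed(0)
n = 10**6
mu_n = math.log(math.log(n))      # stands in for the paper's mu_n
sigma_n = math.sqrt(mu_n)         # stands in for the paper's sigma_n
samples = [(omega(random.randint(1, n)) - mu_n) / sigma_n for _ in range(5000)]
d_K = kolmogorov_distance_to_normal(samples)
print(f"Kolmogorov distance to N(0,1): {d_K:.3f}")
```

Since the rate in Theorem 2.4 is of order 1/σ_n, the measured distance shrinks only very slowly as n grows, which is consistent with the log log n scaling in the bounds above.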
Proof of Theorem 2.6
Harmonic case.
Recall that L ( ψ ( H n )) = L ( Y n + R n | A n ), where Y n and R n are givenby (3.27). Set ˜ Y n = Y n + R n . In this section, we show that the Poisson approximation occursfor the non-standardized ψ ( H n ) as long as ψ is equal to 1 for a sufficiently large proportionof p ∈ P .Suppose that h : N → R is an indicator function of a subset of N . Let f be the solution to(3.16) with intensity λ n , namely, h ( k ) − E [ h ( M n )] = λ n f ( k + 1) − kf ( k ) . Notice that | E [ h ( ψ ( H n ))] − E [ h ( M n )] | = | E [ λ n f ( ψ ( H n ) + 1) − ψ ( H n ) f ( ψ ( H n ))] | = P [ A n ] − | E [( λ n f ( ˜ Y n + 1) − ˜ Y n f ( ˜ Y n )) I n ] |≤ | E [( λ n f ( ˜ Y n + 1) − ( Y n + R n ) f ( ˜ Y n )) I n ] | . Using Lemma 3.5 and the bound (3.28) for E [ | R n | ], we get | E [ h ( ψ ( H n ))] − E [ h ( M n )] | ≤ | λ n E [ f ( ˜ Y n + 1) I n ] − E [ Y n f ( ˜ Y n )) I n ] | + λ − n ( c + 2 c ) . (7.1)Define the kernel ̺ n : X → R by ̺ n ( p, k ) := kψ ( p ) ( p ≤ n ) for p ∈ P and k ∈ N . Recall that by (3.24), Y n = η ( ̺ n ) and E [ η ( ̺ n )] = λ n by our choice of λ n . Thus, by Mecke’s equation [18, Theorem 4.1] and (5.8) E [ Y n f ( ˜ Y n ) I n ] = E [ η ( ̺ n ) f ( ˜ Y n ) I n ] = Z ρ n ( x ) E [ f ( ˜ Y n ( η + δ x )) I n ( η + δ x )] λ ( dx )where for each x = ( p, k ) ∈ P n × N , we have I n ( η + δ x ) = ( p k Y q ∈P n \{ p } q ξ q ≤ n ) , ˜ Y n ( η + δ x ) = ψ ( p k + ξ p ) + X q ∈P n \{ p } ψ ( p ξ q ) . (7.2)Therefore, E [ Y n f ( ˜ Y n ) I n ] − λ n E [ f ( ˜ Y n + 1)] = ǫ n, + ǫ n, , (7.3)where ǫ n, := Z ̺ n ( x ) E [( f ( ˜ Y n ( η + δ x )) − f ( ˜ Y n + 1)) I n ( η + δ x )] λ ( dx ) ,ǫ n, := Z ̺ n ( x ) E [ f ( ˜ Y n + 1)( I n ( η + δ x ) − I n )] λ ( dx ) . We have by the first half of Lemma 3.5 and (4.10), (4.11) that | ǫ n, | ≤ √ λ n Z | ̺ n ( x ) | E [ | D x I n | ] λ ( dx ) ≤ c √ λ n . 
(7.4)By the second half of Lemma 3.5 and (7.2), | ǫ n, | ≤ λ n Z | ̺ n ( x ) | E [ | ˜ Y n ( η + δ x ) − ( ˜ Y n + 1) | ] λ ( dx )= 1 λ n X p ∈P n X k ≥ | ψ ( p ) | p k E [ | ψ ( p ξ p + k ) − ψ ( p ξ p ) − | ]= 1 λ n X p ∈P n − p − p | ψ ( p ) || ψ ( p ) − | + 1 λ n X p ∈P n | ψ ( p ) | p E [ | ψ ( p ξ p +1 ) − ψ ( p ξ p ) − | ( ξ p ≥ λ n X p ∈P n X k ≥ | ψ ( p ) | p k E [ | ψ ( p ξ p + k ) − ψ ( p ξ p ) − | ] =: ǫ n, , + ǫ n, , + ǫ n, , . This decomposition is totally parallel to our way of obtaining (5.18). Recalling (5.10), (5.11)and (5.12), we have ǫ n, , ≤ c λ n (cid:0) δ + δ + X p ∈P n p − P [ ξ p ≥ (cid:1) ≤ c λ n [ ¯ ζ (2) / c + ¯ ζ (2) c + ¯ ζ (4) / c + ¯ ζ (2)] ≤ λ n (0 . c + 1 . c c + 0 . c ) . PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 33
Similarly, recalling (5.14), (5.15) and (5.16), we have ǫ n, , ≤ c λ n ( δ ′ + δ ′ + X p ∈P n X k ≥ p k ) ≤ c λ n ((11 ¯ ζ (2)) / c + ¯ ζ (3) c + 2 ¯ ζ (6) / c + 2 ¯ ζ (2)) ≤ λ n (0 . c + 3 c c + 1 . c ) , yielding | ǫ n, | ≤ λ n X p ∈P n | ψ ( p ) − | p + 1 λ n [1 . c + 4 . c c + 2 c ] . (7.5)Combining (7.1), (7.3), (7.4) and (7.5) gives the desired bound.7.2. Uniform case: the total variation bound.
We first prove the total variation boundunder the additional assumption that ψ ( p ) = 1 for all p ∈ P . For the Kolmogorov boundwithout this assumption, we follow the same strategy and prove it in the next subsection.By the triangle inequality, d TV ( ψ ( J n ) , M n ) ≤ η + η + η + η . (7.6)where η := d TV ( ψ ( J n ) , ψ ( H n Q ( n/H n ))) η := d TV ( ψ ( H n Q ( n/H n )) , ψ ( H n ) + 1) η := d TV ( ψ ( H n ) + 1 , M + 1) η := d TV ( M + 1 , M ) . Notice that Lemma 3.1, η ≤
61 log log( n )log( n ) , (7.7)and the bound η ≤ ˜ γ √ λ n + ˜ γ λ n + 2 c X p ∈P n | ψ ( p ) − | p (7.8)can be obtained from (2.9). For handling η , we use Stein’s equation for the Poisson distri-bution with parameter λ n . By taking M n as the target distribution, we consider λ E [ f ( M n + 2)] − E [( M n + 1) f ( M n + 1)]= λ E [ g ( M n + 1)] − E [ M n g ( M n )] − E [ g ( M n )] = − E [ g ( M )] = − E [ f ( M + 1)] , where we have used the condition g ( k ) = f ( k +1) for any k ∈ N , as well as the characterizingequation for the Poisson random variable M n . Therefore, by the uniform bound in Lemma3.5, one has η = d TV ( M n + 1 , M n ) ≤ √ λ n . (7.9)It remains to handle η . Define D n = { Q ( n/H n ) divides H n } . Notice the inclusion { ψ ( H n Q ( n/H n )) = ψ ( H n ) + 1 } ⊂ D n . By additivity, the assumption ψ ( p ) = 1 and Lemma 3.2, η ≤ P [ D n ] ≤ . n )log( n ) . The desired bound follows immediately.7.3.
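In the special case ψ = ω (so that ψ(p) = 1 for every prime, as assumed in this subsection), the total variation bound can be sanity-checked numerically: compute the exact law of ω(J_n) with a sieve and compare it with a Poisson law. The sketch below takes λ_n = Σ_{p ≤ n} 1/p, a natural choice consistent with the harmonic intensities used earlier; n and the tolerance are illustrative assumptions.

```python
# Illustrative check (not part of the proof) of the Poisson approximation for
# omega(J_n) when psi(p) = 1 for all primes: compare the exact distribution of
# omega(J_n), J_n uniform on {1,...,n}, with Poisson(lambda_n),
# lambda_n = sum_{p <= n} 1/p. The value of n is an arbitrary choice.
import math

n = 10**5

# Sieve of smallest prime factors.
spf = list(range(n + 1))
for p in range(2, int(n**0.5) + 1):
    if spf[p] == p:                       # p is prime
        for m in range(p * p, n + 1, p):
            if spf[m] == m:
                spf[m] = p

def omega(k):
    """Number of distinct prime factors of k, via the sieve."""
    c = 0
    while k > 1:
        p = spf[k]
        c += 1
        while k % p == 0:
            k //= p
    return c

counts = {}
for k in range(1, n + 1):
    w = omega(k)
    counts[w] = counts.get(w, 0) + 1

lam = sum(1.0 / p for p in range(2, n + 1) if spf[p] == p)
kmax = max(counts)
pois = [math.exp(-lam) * lam**j / math.factorial(j) for j in range(kmax + 1)]
tv = 0.5 * (sum(abs(counts.get(j, 0) / n - pois[j]) for j in range(kmax + 1))
            + (1.0 - sum(pois)))          # Poisson mass beyond kmax
print(f"lambda_n = {lam:.3f}, d_TV = {tv:.3f}")
```

The distance is moderate rather than tiny, in line with the O(1/sqrt(λ_n)) rates appearing in the bounds of this section.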
Uniform case: the Kolmogorov bound.
Due to the relation d K ( X, Y ) ≤ d TV ( X, Y ) for arbitrary random variables
X, Y , we have the bound d K ( ψ ( J n ) , M n ) ≤ η + η ′ + η + η , where η ′ := d K ( ψ ( H n Q ( n/H n )) , ψ ( H n ) + 1) . It remains to handle η ′ . Let z ∈ R be given. Then, T := | P [ ψ ( H n Q ( n/H n )) ≤ z ] − P [ ψ ( H n ) + 1 ≤ z ] |≤ | P [ ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z, D cn ] − P [ ψ ( H n ) + 1 ≤ z, D cn ] | + P [ D n ] . Let A ∆ B denote the symmetric difference of two given subsets A, B ∈ Z . Observe that thefirst term in the right-hand side is bounded by P [ { ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z, D cn } ∆ { ψ ( H n ) + 1 ≤ z, D cn } ] ≤ P [ { ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z }\{ ψ ( H n ) + 1 ≤ z } ]+ P [ { ψ ( H n ) + 1 ≤ z }\{ ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z } ] . Moreover, by (H1) , we have the inclusions { ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z }\{ ψ ( H n ) + 1 ≤ z } ⊂ { ψ ( H n ) ∈ [ z − , z + c ∨ }{ ψ ( H n ) + 1 ≤ z }\{ ψ ( H n ) + ψ ( Q ( n/H n )) ≤ z } ⊂ { ψ ( H n ) ∈ [ z − c ∨ , z + 1] } , and consequently, T ≤ P [ z − c ∨ ≤ ψ ( H n ) ≤ z + c ∨
1] + P [ D n ] ≤ d TV ( ψ ( H n ) , M n ) + 2 P [ z − c ∨ ≤ M n ≤ z + c ∨
1] + P [ D n ] . Combining Theorem 2.5 with Lemma 3.2, we thus obtain the bound T ≤ P [ z − c ∨ ≤ M n ≤ z + c ∨
1] + 2˜ γ √ λ n + 2˜ γ λ n + 4 c X p ∈P n | ψ ( p ) − | p + 6 . n )log( n ) . Finally, using the fact that the Poisson distribution is unimodal (which implies that theprobability of the atoms of M n is bounded by λ λnn λ n ! e − λ n ), as well as Stirling’s formula, we getthe estimate P [ z − c ∨ ≤ M n ≤ z + c ∨ ≤ c ∨ √ πλ n From here we conclude that η ≤ γ √ λ n + 1 λ n (cid:0) γ + 4( c ∨ √ π (cid:1) + 4 c X p ∈P n | ψ ( p ) − | p + 6 . n )log( n ) . (7.10)Relation (2.11) follows from (7.7)-(7.10). PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 35
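The Poisson arguments of this section rest on the characterizing (Stein) equation for the Poisson law, λ E[f(M + 1)] = E[M f(M)] for M ~ Poisson(λ), which is also the one-dimensional instance of the Mecke equation invoked earlier. A direct numerical verification by summation, with λ and the test function f chosen arbitrarily:

```python
# Direct check of the characterizing equation of the Poisson distribution:
#   lambda * E[f(M + 1)] = E[M f(M)]   for M ~ Poisson(lambda).
# lambda and f below are arbitrary illustrative choices.
import math

def poisson_expectation(lam, g, kmax=100):
    """E[g(M)] for M ~ Poisson(lam), truncating the (negligible) tail."""
    return sum(g(k) * math.exp(-lam) * lam**k / math.factorial(k)
               for k in range(kmax))

lam = 2.7
f = lambda k: 1.0 / (1.0 + k)     # any bounded test function works here
lhs = lam * poisson_expectation(lam, lambda k: f(k + 1))
rhs = poisson_expectation(lam, lambda k: k * f(k))
print(lhs, rhs)
```

The two sides agree up to truncation error; replacing f by the solution of Stein's equation (3.16) is exactly how the bounds on η above are produced.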
Appendix A. Generalization of Proposition 3.6
Next we present an extension of Proposition 3.6. Although Proposition 3.6 suffices for proving the main results of the manuscript, Proposition A.1 below shows that not only does the law of H n satisfy a relation of the type (3.19), but so does any random variable supported in N whose probability distribution has a suitable multiplicative property. Proposition A.1.
Suppose that n ≥ . Let ϑ : N → R_+ be a non-negative multiplicative function (i.e. ϑ(mn) = ϑ(m)ϑ(n) whenever m and n are co-prime) satisfying Σ_{k=0}^∞ ϑ(p^k) < ∞ for all p ∈ P. Let H_n^ϑ be a random variable defined on (Ω, F, P) and supported in N ∩ [1, n], with probability distribution given by P[H_n^ϑ = k] = ϑ(k)/L_n^ϑ, for k = 1, ..., n and L_n^ϑ := Σ_{k=1}^n ϑ(k). Consider a family {ξ_p^ϑ}_{p∈P} of independent random variables defined on (Ω, F, P), with P[ξ_p^ϑ = k] = ν_p ϑ(p^k), for some ν_p ≥ 0 satisfying Σ_{k=0}^∞ ν_p ϑ(p^k) = 1. Define the event

A_n^ϑ := { ∏_{p∈P_n} p^{ξ_p^ϑ} ≤ n },

as well as the random vector C^ϑ(n) := (α_p(H_n^ϑ); p ∈ P_n). Then

P(A_n^ϑ) = L_n^ϑ ∏_{p∈P_n} ν_p, (A.1)

and

L(C^ϑ(n)) = L(ξ^ϑ(n) | A_n^ϑ), (A.2)

where ξ^ϑ(n) := (ξ_p^ϑ; p ∈ P_n). Remark A.1.
By choosing ν_p = (1 − p^{-1}) and ϑ(m) = 1/m, we obtain Proposition 3.6 as a corollary of Proposition A.1. Proof.
Consider a fixed vector λ = (λ_p; p ∈ P_n) ∈ R^{π(n)} and define K_n by (3.21). As in the proof of Proposition 3.6, the prime factorization theorem allows us to write

E[e^{i Σ_{p∈P_n} λ_p ξ_p^ϑ} 1(A_n^ϑ)]
= Σ_{(c_p; p∈P_n)∈K_n} exp{ i Σ_{p∈P_n} λ_p c_p } P[ξ_p^ϑ = c_p for all p ∈ P_n]
= Σ_{(c_p; p∈P_n)∈K_n} exp{ i Σ_{p∈P_n} λ_p α_p(∏_{θ∈P_n} θ^{c_θ}) } P[ξ_p^ϑ = c_p for all p ∈ P_n]
= Σ_{(c_p; p∈P_n)∈K_n} exp{ i Σ_{p∈P_n} λ_p α_p(∏_{θ∈P_n} θ^{c_θ}) } ( ∏_{p∈P_n} ν_p )( ∏_{p∈P_n} ϑ(p^{c_p}) )
= Σ_{(c_p; p∈P_n)∈K_n} exp{ i Σ_{p∈P_n} λ_p α_p(∏_{θ∈P_n} θ^{c_θ}) } ( ∏_{p∈P_n} ν_p ) ϑ( ∏_{p∈P_n} p^{c_p} )
= Σ_{k=1}^n exp{ i Σ_{p∈P_n} λ_p α_p(k) } ϑ(k) ∏_{p∈P_n} ν_p
= E[ exp{ i Σ_{p∈P_n} λ_p α_p(H_n^ϑ) } ] L_n^ϑ ∏_{p∈P_n} ν_p.

Relations (A.1) and (A.2) then follow analogously to the proof of Proposition 3.6. □
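Proposition A.1 can be verified exactly, in rational arithmetic, for small n in the harmonic specialization of Remark A.1 (ν_p = 1 − p^{-1}, ϑ(m) = 1/m, so that the ξ_p are geometric). The sketch below enumerates all exponent tuples with ∏ p^{ξ_p} ≤ n; the value n = 30 is an arbitrary choice.

```python
# Exact check (small n, exact rationals) of Proposition A.1 in the harmonic
# case theta(m) = 1/m, nu_p = 1 - 1/p: P(A_n) = L_n * prod_p nu_p, and the
# law of prod_p p^{xi_p} given A_n is the harmonic law of H_n.
from fractions import Fraction

n = 30
primes = [p for p in range(2, n + 1) if all(p % q for q in range(2, p))]
nu = {p: Fraction(p - 1, p) for p in primes}

# law[k] = probability that the tuple (xi_p) realizes product k and A_n holds.
law = {}
def dfs(i, prod, weight):
    if i == len(primes):
        law[prod] = law.get(prod, Fraction(0)) + weight
        return
    p = primes[i]
    pc = 1
    while prod * pc <= n:                 # only tuples inside A_n contribute
        dfs(i + 1, prod * pc, weight * nu[p] * Fraction(1, pc))
        pc *= p

dfs(0, 1, Fraction(1))
p_An = sum(law.values())

L_n = sum(Fraction(1, k) for k in range(1, n + 1))
prod_nu = Fraction(1)
for p in primes:
    prod_nu *= nu[p]

assert p_An == L_n * prod_nu              # relation (A.1), exactly
for k in range(1, n + 1):                 # relation (A.2): harmonic law given A_n
    assert law[k] / p_An == Fraction(1, k) / L_n
print("Proposition A.1 verified exactly for n =", n)
```

The enumeration works because the prime factorization theorem matches each admissible exponent tuple with exactly one integer k ≤ n, which is the same bijection driving the proof above.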
Appendix B. Technical lemmas
In this section, we prove some technical lemmas that were repeatedly used throughout the manuscript.
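As a concrete illustration of Lemma B.1 below (a numerical sketch only, not used in any proof; the value of n is arbitrary), E[ω(J_n)] can be computed exactly with a sieve and compared with log log n. The observed gap is close to the Mertens constant, roughly 0.26.

```python
# Numerical illustration of the flavor of (B.1): E[omega(J_n)] stays within a
# bounded distance of log log n. The choice n = 10**6 is arbitrary.
import math

n = 10**6
sieve = bytearray([1]) * (n + 1)
sieve[0] = sieve[1] = 0
for p in range(2, int(n**0.5) + 1):
    if sieve[p]:
        sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))

# E[omega(J_n)] = (1/n) * sum_{p <= n} floor(n/p), since exactly floor(n/p)
# integers in {1, ..., n} are divisible by the prime p.
e_omega = sum(n // p for p in range(2, n + 1) if sieve[p]) / n
gap = e_omega - math.log(math.log(n))
print(f"E[omega(J_n)] = {e_omega:.4f}, gap to log log n = {gap:.4f}")
```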
Lemma B.1.
Let ω, Ω be the prime counting functions defined by (1.1) and (1.9). Then,for all n ≥ , | E [ ω ( J n )] − log log( n ) | ≤ . , (B.1) | E [Ω( J n )] − log log( n ) | ≤ . , (B.2) E [ ω ( J n ) ] ≤ . n ) . (B.3) E [ ω ( H n ) ] ≤ . n ) . (B.4) Proof.
By using the representation ω ( J n ) = X p ∈P n ( α p ( J n ) ≥ , as well as identity (3.5), we can write | E [ ω ( J n )] − X p ∈P n p | ≤ X p ∈P n | P [ p divides J n ] − p | ≤ |P n | n ≤ . n ) , (B.5)where the last inequality follows from the fact that π ( n ) ≤ . n log( n ) . Relation (B.1), followsfrom (B.5), (3.7) and the condition n ≥ E [Ω( J n )] = X p ∈P n E [ α p ( J n )] = X p ∈P n ∞ X k =1 P ( α p ( J n ) ≥ k ) = X p ∈P n ∞ X k =1 P ( p k divides J n ) . PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 37
Hence, using (3.5), (3.7) and the second inequality in (B.5), we have | E [Ω( J n )] − log log( n ) | ≤ .
262 + 3 . n + X p ∈P n X k ≥ P ( p k divides J n ) ≤ .
262 + 3 . n + 2 ¯ ζ (2) ≤ . , where the last inequality follows from the condition n ≥
21. To show (B.3), we notice thatby (3.5), E [ ω ( J n ) ] = X p,q ∈P n P [ p, q divide J n ] = X p,q ∈P n p = q P [ pq divides J n ] + X p ∈P n P [ p divides J n ] . Therefore, by (3.7), we conclude that E [ ω ( J n ) ] ≤ X p,q ∈P n p = q pq + X p ∈P n p ≤ (cid:18) X p ∈P n p (cid:19)(cid:18) X p ∈P n p (cid:19) ≤ (log log( n ) + 2)(log log( n ) + 1) ≤ log log( n ) + 3 log log( n ) + 2 ≤ . n ) , (B.6)as required.To show (B.4), we write E [ ω ( H n ) ] = X p,q ∈P n p = q P [ pq divides H n ] + X p ∈P n P [ p divides H n ] . We can easily show that for every m ∈ N , P [ m divides H n ] = 1 L n X ≤ k ≤ n k ( m divides k ) = 1 L n X ≤ j ≤ n/m jm ≤ m , (B.7)and thus, by (B.6), E [ ω ( H n ) ] ≤ X p,q ∈P n p = q pq + X p ∈P n p ≤ . n ) , as required. (cid:3) Lemma B.2.
Let ψ be a general additive function subject to (H1) and (H2) . Then E [ ψ ( H n ) ] ≤ c log log( n ) + 13 . c . E [ ψ ( J n ) ] ≤ c log log( n ) + 13 . c . Proof.
We first write E [ ψ ( H n ) ] = X p ∈P n E [ ψ ( p α p ( H n ) ) ] + X p = q ∈P n E [ ψ ( p α p ( H n ) ) ψ ( p α p ( H n ) )] . Notice that by (B.7) and P ( ξ p = k ) = (1 − p − ) p − k , we have E [ ψ ( p α p ( H n ) ) ] = ∞ X k =1 P ( α p ( H n ) = k ) ψ ( p k ) ≤ ∞ X k =1 P ( p k divides H n ) ψ ( p k ) ≤ ∞ X k =1 P ( ξ p = k ) ψ ( p k ) = 2 E [ ψ ( p ξ p ) ] . Similarly, we see that for p = q , E [ ψ ( p α p ( H n ) ) ψ ( p α p ( H n ) )] ≤ ∞ X k,ℓ =1 P ( α p ( H n ) = k, α q ( H n ) = ℓ ) | ψ ( p k ) ψ ( q ℓ ) |≤ ∞ X k,ℓ =1 P ( p k q ℓ divides H n ) | ψ ( p k ) ψ ( q ℓ ) |≤ (cid:16) ∞ X k =1 ψ ( p k ) p k (cid:17)(cid:16) ∞ X k =1 ψ ( q k ) q k (cid:17) ≤ E [ | ψ ( p ξ p ) | ] E [ | ψ ( q ξ q ) | ] . Therefore, we have E [ ψ ( H n ) ] ≤ (cid:16) X p ∈P n E [ | ψ ( p ξ p ) | ] (cid:17) . We notice that by using (3.5) in place of (B.7) we also have E [ ψ ( J n ) ] ≤ (cid:16) X p ∈P n E [ | ψ ( p ξ p ) | ] (cid:17) . To bound each of the summands, we infer from the fact L ( ξ p | ξ p ≥
2) = L (2 + ξ p ) that E [ | ψ ( p ξ p ) | ] = (1 − p − ) p − | ψ ( p ) | + p − E [ | ψ ( p ξ p ) | ] . One concludes that E [ ψ ( H n ) ] ∨ E [ ψ ( H n ) ] ≤ (cid:16) . c X p ∈P n p − + X p ∈P n Ψ( p ) p (cid:17) ≤ c log log( n ) + 8 ¯ ζ (2) c , where we have used (3.7) for bounding the first series and Cauchy-Schwarz’s inequality forthe second. The proof is now complete. (cid:3) Lemma B.1. E [ X p ∈P | ψ ( p ) | ξ p ( ξ p ≥ < ζ (2) c ≤ c E [ X p ∈P n | ψ ( p ξ p ) | ( ξ p ≥ ≤ ¯ ζ (2) / c ≤ c PROBABILISTIC APPROACH TO THE ERD ¨OS-KAC THEOREM FOR ADDITIVE FUNCTIONS 39
Proof.
By the identity L ( ξ p | ξ p ≥
2) = L ( ξ p + 2), we have E [ X p ∈P n | ψ ( p ) | ξ p ( ξ p ≥ ≤ c X p ∈P P [ ξ p ≥ E [2 + ξ p ]= c X p ∈P p − (2 + p − (1 − p − ) − ) ≤ ζ (2) c . by (H1). For the other term E [ X p ∈P n | ψ ( p ξ p ) | ( ξ p ≥ X p ∈P n E [ | ψ ( p ξ p +2 ) | ] P [ ξ p ≥
2] = X p ∈P n E [ | ψ ( p ξ p +2 ) | ] p ≤ (cid:16) X p ∈P p (cid:17) / (cid:16) X p ∈P E [ | ψ ( p ξ p +2 ) | ] p (cid:17) / ≤ ¯ ζ (2) / c by (H2). (cid:3) Lemma B.3.
For any θ ∈ [ n ] and n ≥
21, we have P [ H n > nθ − ] ≤ θL n . (B.8) Proof.
The inequality is trivial when n ≤ θ , as in such instance the right hand side isbounded from below by n/ n )+1 ≥
1, due to the condition n ≥
21. Thus, we can assumewithout loss of generality that n > θ . Notice that for all k ≥ n X i = k i ≤ Z nk − x dx = log( n ) − log( k − , yielding P [ H n > nθ − ] = n X k = ⌊ nθ − ⌋ +1 kL n ≤ L n (log( n ) − log( (cid:4) nθ − (cid:5) )) ≤ L n log (cid:18) nnθ − − (cid:19) ≤ L n log(2 θ ) ≤ θL n , where the one but last inequality follows from the fact that n > θ . (cid:3) Lemma B.4.
For every p ≥
2, we have that ∞ X k =1 kp − k = p − (1 − p − ) − ≤ p (B.9) ∞ X k =1 k p − k = p − (1 − p − ) − (1 + p − ) ≤ p (B.10) ∞ X k =1 k p − k = p − (1 + 4 p + p )(1 − p − ) − ≤ p . (B.11) Proof.
The result easily follows from the fact that if G has geometric distribution with P [ G = k ] = θ (1 − θ ) k − , then its moment generating function is given by E [ e λG ] = θ − (1 − θ ) e λ . The result is thus obtained by multiplying the both sides of (B.9)-(B.11) by (1 − p − ), takingthe first two derivatives in E [ e λG ] and evaluating at λ = 0 and θ = 1 − p − . (cid:3) Lemma B.5.
Let N be a standard normal random variable and W be normal with mean µ and variance σ . Then d K ( W, N ) ≤ | σ − | + √ π µ,d ( W, N ) ≤ √ π | σ − | + 2 µ. Proof.
By integration by parts, one sees that σ E [ f ′ ( W )] = E [( W − µ ) f ( W )]for all f : R → R with k f k ∞ + k f ′ k ∞ < ∞ , yielding | E [ W f ( W )] − E [ f ′ ( W )] | ≤ k f ′ k ∞ | σ − | + k f k ∞ µ. The result follows from Lemmas 3.3 and 3.4. (cid:3)
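The effect quantified by Lemma B.5 can be checked numerically: the Kolmogorov distance between N(µ, σ²) and N(0, 1) is small when |σ² − 1| and µ are small. The sketch below only tests the loose inequality d_K ≤ |σ² − 1| + µ, which is weaker than the lemma's bound; the grid and the parameter choices are illustrative assumptions.

```python
# Numerical illustration of Lemma B.5: comparing N(mu, sigma^2) with N(0,1)
# in Kolmogorov distance. We verify a deliberately loose bound
#   d_K <= |sigma^2 - 1| + mu
# on a grid, for a few small illustrative values of mu and |sigma - 1|.
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kolmogorov_gaussians(mu, sigma, grid=4001, lo=-8.0, hi=8.0):
    """Approximate sup_x |P[W <= x] - P[N <= x]| for W ~ N(mu, sigma^2)."""
    best = 0.0
    for i in range(grid):
        x = lo + (hi - lo) * i / (grid - 1)
        best = max(best, abs(Phi((x - mu) / sigma) - Phi(x)))
    return best

for mu, sigma in [(0.1, 1.05), (0.2, 0.9), (0.05, 1.2)]:
    d = kolmogorov_gaussians(mu, sigma)
    assert d <= abs(sigma**2 - 1.0) + mu
print("loose Gaussian comparison bound verified")
```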
Acknowledgments. This research is supported by FNR Grant R-AGR-3410-12-Z (MISSILe) from the University of Luxembourg and partially supported by Grant R-146-000-230-114 from the National University of Singapore.
References [1] R. Arratia. On the amount of dependence in the prime factorization of a uniform random integer. In
Contemporary combinatorics , volume 10 of
Bolyai Soc. Math. Stud. , pages 29–91. J´anos Bolyai Math.Soc., Budapest, 2002.[2] M. B. Barban and A. I. Vinogradov. On the number-theoretic basis of probabilistic number theory.
Dokl. Akad. Nauk SSSR , 154:495–496, 1964.[3] A. D. Barbour, E. Kowalski, and A. Nikeghbali. Mod-discrete expansions.
Probab. Theory Related Fields, 158(3-4):859–893, 2014. [4] A. D. Barbour, L. Holst, and S. Janson.
Poisson approximation . Oxford studies in probability. Oxford,England, 1992.[5] Patrick Billingsley. On the central limit theorem for the prime divisor functions.
Amer. Math. Monthly, 76:132–139, 1969. [6] Sourav Chatterjee, Persi Diaconis, and Elizabeth Meckes. Exchangeable pairs and Poisson approximation.
Probability Surveys , 2, 12 2004.[7] Louis H. Y. Chen. Poisson approximation for dependent trials.
Ann. Probability , 3(3):534–545, 1975.[8] Louis H. Y. Chen, Larry Goldstein, and Qi-Man Shao.
Normal approximation by Stein’s method . Prob-ability and its Applications (New York). Springer, Heidelberg, 2011.[9] D¨obler, Christian; Peccati, Giovanni. The fourth moment theorem on the Poisson space. Ann. Probab.46 (2018), no. 4, 1878–1916.
[10] P. D. T. A. Elliott.
Probabilistic number theory. I , volume 239 of
Grundlehren der MathematischenWissenschaften [Fundamental Principles of Mathematical Science] . Springer-Verlag, New York-Berlin,1979. Mean-value theorems.[11] P. Erd¨os and M. Kac. The Gaussian law of errors in the theory of additive number theoretic functions.
Amer. J. Math. , 62:738–742, 1940.[12] Torkel Erhardsson.
Stein’s method for Poisson and compound Poisson approximation , pages 61–113. 042005.[13] Adam J. Harper. Two new proofs of the Erd¨os-Kac theorem, with bound on the rate of convergence, byStein’s method for distributional approximations.
Math. Proc. Cambridge Philos. Soc. , 147(1):95–114,2009.[14] Jean Jacod, Emmanuel Kowalski, and Ashkan Nikeghbali. Mod-Gaussian convergence: new limit theo-rems in probability and number theory.
Forum Math. , 23(4):835–873, 2011.[15] Emmanuel Kowalski and Ashkan Nikeghbali. Mod-Poisson convergence in probability and number the-ory.
Int. Math. Res. Not. IMRN , (18):3549–3587, 2010.[16] J. Kubilius.
Probabilistic methods in the theory of numbers . Translations of Mathematical Monographs,Vol. 11. American Mathematical Society, Providence, R.I., 1964.[17] Rapha¨el Lachieze-Rey, Giovanni Peccati, Xiaochuan Yang. Quantitative two-scale stabilisation on thePoisson space, preprint.[18] G¨unter Last and Mathew Penrose. Lectures on the Poisson process. Institute of Mathematical StatisticsTextbooks, 7. Cambridge University Press, Cambridge, 2018. xx+293 pp.[19] Last, G¨unter; Peccati, Giovanni; Schulte, Matthias. Normal approximation on Poisson spaces: Mehler’sformula, second order Poincar´e inequalities and stabilization. Probab. Theory Related Fields 165 (2016),no. 3-4, 667–723.[20] Wm. J. LeVeque. On the size of certain number-theoretic functions.
Trans. Amer. Math. Soc. , 66:440–463, 1949.[21] Ivan Nourdin and Giovanni Peccati. Normal approximations with Malliavin calculus. From Stein’smethod to universality. Cambridge Tracts in Mathematics, 192. Cambridge University Press, Cam-bridge, 2012. xiv+239 pp.[22] A. R´enyi and P. Tur´an. On a theorem of Erd¨os-Kac.
Acta Arith. , 4:71–84, 1958.[23] J. Barkley Rosser and Lowell Schoenfeld. Approximate formulas for some functions of prime numbers.
Illinois J. Math. , 6:64–94, 1962.[24] Charles Stein. A bound for the error in the normal approximation to the distribution of a sum ofdependent random variables. In
Proceedings of the Sixth Berkeley Symposium on Mathematical Statisticsand Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. II: Probability theory , pages 583–602, 1972.[25] G´erald Tenenbaum. Crible d’´Eratosth`ene et mod`ele de Kubilius. In
Number theory in progress, Vol. 2(Zakopane-Ko´scielisko, 1997) , pages 1099–1129. de Gruyter, Berlin, 1999.[26] G´erald Tenenbaum.
Introduction to analytic and probabilistic number theory , volume 163 of
GraduateStudies in Mathematics . American Mathematical Society, Providence, RI, third edition, 2015. Translatedfrom the 2008 French edition by Patrick D. F. Ion.[27] Tim Trudgian. Updating the error term in the prime number theorem.
Ramanujan J. , 39(2):225–234,2016.
Louis H. Y. Chen: Department of Mathematics, National University of Singapore, BlockS17, 10 Lower Kent Ridge Road, Singapore 119076.
Email address : [email protected] Arturo Jaramillo & Xiaochuan Yang: Mathematics Research Unit, Universit´e du Luxem-bourg, Maison du Nombre 6, Avenue de la Fonte, L-4364 Esch-sur-Alzette, Luxembourg.,Department of Mathematics National University of Singapore Block S17, 10 Lower KentRidge Road Singapore 119076.
Email address : [email protected]