arXiv [math.ST]

Sample variance of rounded variables

J. An a)
Korea Astronomy and Space Science Institute
(Dated: 18 February 2021)

If the rounding errors are assumed to be distributed independently from the intrinsic distribution of the random variable, the sample variance $s^2$ of the rounded variable is given by the sum of the true variance $\sigma^2$ and the variance of the rounding errors (which is equal to $w^2/12$, where $w$ is the size of the rounding window). Here the exact expressions for the sample variance of the rounded variables are examined, and it is also discussed when the simple approximation $s^2=\sigma^2+w^2/12$ can be considered valid. In particular, if the underlying distribution $f$ belongs to a family of symmetric normalizable distributions such that $f(x)=\sigma^{-1}F(u)$ where $u=(x-\mu)/\sigma$, and $\mu$ and $\sigma^2$ are the mean and variance of the distribution, then the rounded sample variance scales like $s^2-(\sigma^2+w^2/12)\sim\sigma\Phi'(\sigma)$ as $\sigma\to\infty$, where $\Phi(\tau)=\int_{-\infty}^{\infty}du\,e^{iu\tau}F(u)$ is the characteristic function of $F(u)$. It follows that, roughly speaking, the approximation is valid for a slowly-varying symmetric underlying distribution with its variance sufficiently larger than the size of the rounding unit.

I. INTRODUCTION
Most real-world data are recorded only as rounded figures with a fixed number of significant digits. Strictly speaking, this rounding introduces additional systematic uncertainties, which must be properly accounted for in order to infer the properties of the intrinsic distribution of the measured quantities. Naively, assuming that there is neither intrinsic uncertainty nor systematic bias, the differences between the true value and the rounded reported value are expected to be distributed evenly over a window of the size of the reporting unit.

In particular, if the variable value $x$ is rounded to an integer multiple of the measurement unit, $nw$ (where $w$ is the measurement unit and $n$ is an integer), then $(n+\delta-1/2)w\le x<(n+\delta+1/2)w$ or $(n+\delta-1/2)w<x\le(n+\delta+1/2)w$. Here the constant $\delta\in[-1/2,1/2]$ specifies the rounding method (e.g., $\delta=1/2$ for rounding down, $\delta=-1/2$ for rounding up, and $\delta=0$ for rounding to the nearest), and the rounding error ($\rho=nw-x$) follows the rectangular distribution:
$$P(\rho)=\begin{cases}w^{-1}&\text{for }-(1/2+\delta)w<\rho<(1/2-\delta)w,\\0&\text{otherwise},\end{cases}\tag{1}$$
where the distribution at the boundary is determined by the chosen convention. However, provided that $x$ is a real variable in a continuous distribution, the boundaries constitute a null-measure set and so the specific choice does not affect the following discussion. For a random variable $x$, the mean of the rounded values is (with $\bar x$ being the true mean of $x$)
$$\overline{nw}=\overline{x+\rho}=\bar x+\int_{-(1/2+\delta)w}^{(1/2-\delta)w}\frac{\rho}{w}\,d\rho=\bar x-\delta w,\tag{2}$$
while the variance is
$$s^2=\overline{(nw-\overline{nw})^2}=\overline{(x+\rho)^2}-(\bar x+\bar\rho)^2=\sigma^2+\overline{\rho^2}-\bar\rho^2+2\left(\overline{x\rho}-\bar x\bar\rho\right),\tag{3}$$
where $\sigma^2=\overline{x^2}-\bar x^2$ is the variance of $x$, with the variance of the rounding errors given by
$$\overline{\rho^2}-\bar\rho^2=\int_{-(1/2+\delta)w}^{(1/2-\delta)w}\frac{\rho^2}{w}\,d\rho-(\delta w)^2=\frac{w^2}{12}.\tag{4}$$
In other words, provided that the distribution of $x$ does not affect the rounding (as in $\overline{x\rho}=\bar x\bar\rho$), the standard deviation of the rounded values is simply a quadrature sum of the true underlying standard deviation and that of the rounding errors, and the true standard deviation may be estimated from the variance of the rounded values via
$$\sigma=\left(s^2-\frac{w^2}{12}\right)^{1/2}.\tag{5}$$
However, this result is valid only in an "on average" sense. That is to say, the underlying distribution of the variable can technically affect the rounding, but for an arbitrary unspecified distribution the expected value of "$\overline{x\rho}-\bar x\bar\rho$" should be zero, and the reported error tends to the quadrature sum of the random error and the rounding error ($\sigma_\rho/w=1/\sqrt{12}\approx0.289$).

a) Electronic mail: [email protected]

II. THEORY
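As a numerical aside on the elementary estimate of equation (5) in the Introduction, the following is a minimal Monte Carlo sketch, not part of the original analysis. The function names, the seed, the sample size, and the test values ($\sigma=2$, $w=1$, mean 0.37) are illustrative assumptions; only the correction $\sigma=(s^2-w^2/12)^{1/2}$ comes from the text.

```python
import math
import random

def corrected_sigma(s, w):
    """Equation (5): remove the rounding variance w**2/12 from the sample variance."""
    return math.sqrt(s * s - w * w / 12.0)

def simulate_rounded_std(sigma, w, n_samples=200_000, seed=1):
    """Sample standard deviation of normal draws rounded to the nearest multiple of w."""
    rng = random.Random(seed)
    # delta = 0 convention: n = floor(x/w + 1/2); the mean 0.37 is an arbitrary choice.
    rounded = [w * math.floor(rng.gauss(0.37, sigma) / w + 0.5)
               for _ in range(n_samples)]
    mean = sum(rounded) / n_samples
    return math.sqrt(sum((r - mean) ** 2 for r in rounded) / (n_samples - 1))

s = simulate_rounded_std(sigma=2.0, w=1.0)
sigma_est = corrected_sigma(s, 1.0)
print(f"rounded s = {s:.4f}, corrected sigma = {sigma_est:.4f} (true sigma = 2)")
```

With the dispersion twice the rounding unit, the corrected estimate should agree with the true value to within the sampling noise of the simulation.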
Suppose that $f(x)$ is a probability distribution of a real random variable $x$ with
$$\int_{-\infty}^{\infty}dx\,f(x)=1,\qquad\int_{-\infty}^{\infty}dx\,x f(x)=\mu,\qquad\int_{-\infty}^{\infty}dx\,(x-\mu)^2f(x)=\int_{-\infty}^{\infty}dx\,x^2f(x)-\mu^2=\sigma^2.\tag{6}$$
Next consider rounding off the measured value of the variable such that, with a fixed constant $\delta\in[-1/2,1/2]$ and the measurement unit $w$, the value $x$ is read off as an integer multiple of the unit, i.e. $nw$, where $n=\lfloor x/w+1/2-\delta\rfloor$ or $n=\lceil x/w-1/2-\delta\rceil$, with $\lfloor x\rfloor$ and $\lceil x\rceil$ being the integer floor and ceiling of $x$. Then the (discrete) distribution of the reported integer $n$ for the rounded value is found to be
$$F_n=\int_{(n+\delta-1/2)w}^{(n+\delta+1/2)w}dx\,f(x).\tag{7}$$
This distribution is properly normalized: that is,
$$\sum_{n=-\infty}^{\infty}F_n=\int_{-\infty}^{\infty}dx\,f(x)=1,\tag{8}$$
and so we can find the mean and the variance of the rounded variables by calculating
$$m=\bar n w=w\sum_{n=-\infty}^{\infty}nF_n,\qquad s^2=\overline{(nw-m)^2}=\overline{n^2}\,w^2-m^2=w^2\left[\sum_{n=-\infty}^{\infty}n^2F_n-\left(\sum_{n=-\infty}^{\infty}nF_n\right)^2\right].\tag{9}$$
For some distributions $f(x)$, the associated discrete distribution $F_n$ as well as the mean $m$ and the standard deviation $s$ of the rounded variable can be computed analytically. However, the calculations become quite tedious even for many simple distributions, and for most distributions, including important examples such as the normal distribution, the computations can only be done numerically. Instead, here we try to analyze the problem more generally. Henceforth we also assume $w=1$, since the generalizations to an arbitrary $w$ are trivial.

A. characteristic function
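Before working with the characteristic function, equations (7)-(9) can be evaluated by brute force for any distribution whose cumulative distribution function is available. The sketch below does this for a normal distribution. The helper names, the truncation of the sum to $|n|\le60$, and the parameter values are assumptions made for illustration; the formulas for $F_n$, $m$ and $s^2$ are those of equations (7) and (9).

```python
import math

def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def rounded_moments(cdf, w=1.0, delta=0.0, n_min=-60, n_max=60):
    """Return (sum of F_n, m, s^2) from equations (7)-(9), truncating the sums over n."""
    total = mean_n = mean_n2 = 0.0
    for n in range(n_min, n_max + 1):
        # Equation (7): probability that x falls in the n-th rounding bin.
        F_n = cdf((n + delta + 0.5) * w) - cdf((n + delta - 0.5) * w)
        total += F_n
        mean_n += n * F_n
        mean_n2 += n * n * F_n
    # Equation (9): mean and variance of the rounded value n*w.
    return total, w * mean_n, w * w * (mean_n2 - mean_n ** 2)

mu, sigma = 0.3, 1.0
total, m, s2 = rounded_moments(lambda x: normal_cdf(x, mu, sigma))
print(f"sum F_n = {total:.12f}, m = {m:.9f}, s^2 = {s2:.9f}")
print(f"naive expectation: m = {mu:.9f}, s^2 = {sigma**2 + 1.0/12.0:.9f}")
```

The truncation range only needs to cover the bins where the distribution has non-negligible mass; for heavier-tailed distributions it would have to be widened.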
First let us introduce the characteristic function $\varphi(t)$ of the distribution $f(x)$: namely,
$$\varphi(t)=\int_{-\infty}^{\infty}dx\,e^{itx}f(x).\tag{10}$$
The derivatives of the characteristic function then result in
$$\varphi^{(n)}(t)=i^n\int_{-\infty}^{\infty}dx\,x^ne^{itx}f(x);\qquad\varphi^{(n)}(0)=i^n\int_{-\infty}^{\infty}dx\,x^nf(x),\tag{11}$$
and so $\varphi(0)=1$, $\varphi'(0)=i\mu$ and $\varphi''(0)=-(\sigma^2+\mu^2)$. We can also define the shifted characteristic function:
$$\tilde\varphi(t)=e^{-it\mu}\varphi(t)=\int_{-\infty}^{\infty}dx\,e^{it(x-\mu)}f(x)=\int_{-\infty}^{\infty}d\varepsilon\,e^{it\varepsilon}f(\mu+\varepsilon),$$
$$\tilde\varphi'(t)=e^{-it\mu}\left[\varphi'(t)-i\mu\varphi(t)\right];\qquad\tilde\varphi''(t)=e^{-it\mu}\left[\varphi''(t)-2i\mu\varphi'(t)-\mu^2\varphi(t)\right].\tag{12}$$
Then $\tilde\varphi(0)=\varphi(0)=1$, $\tilde\varphi'(0)=\varphi'(0)-i\mu\varphi(0)=0$, and $\tilde\varphi''(0)=\varphi''(0)-2i\mu\varphi'(0)-\mu^2\varphi(0)=-\sigma^2-\mu^2+2\mu^2-\mu^2=-\sigma^2$. In other words, the Maclaurin series coefficients of $\tilde\varphi(t)$ result in the sequence of the central moments, whereas those of $\varphi(t)$ result in the moments about the origin. Furthermore, if the distribution is symmetric about its mean $\mu$, as in $f(\mu+\varepsilon)=f(\mu-\varepsilon)$ for any $\varepsilon\in\mathbb R$, then
$$\tilde\varphi(t)=\int_{-\infty}^{\infty}d\varepsilon\,e^{it\varepsilon}f(\mu-\varepsilon)=\int_{-\infty}^{\infty}d\varepsilon\,e^{i(-t)\varepsilon}f(\mu+\varepsilon)=\tilde\varphi(-t);\tag{13}$$
and also $\tilde\varphi^{(n)}(t)=(-1)^n\tilde\varphi^{(n)}(-t)$. That is to say, the shifted characteristic function of a symmetric distribution is an even function. The converse also holds in that, if the characteristic function is in the form $\varphi(t)=e^{it\mu}\tilde\varphi(t)$ with an even function such that $\tilde\varphi(t)=\tilde\varphi(-t)$, the distribution must be symmetric about the mean $\mu$.

B. distribution of rounded values
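The properties of the shifted characteristic function $\tilde\varphi(t)$ established in the preceding subsection (namely $\tilde\varphi(0)=1$, $\tilde\varphi''(0)=-\sigma^2$, and evenness for a symmetric distribution) can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a trapezoidal quadrature of equation (12) on a truncated interval, a normal density as the test case, and hypothetical names and parameter values chosen only for this example.

```python
import cmath
import math

def shifted_cf(pdf, mu, t, half_width=30.0, steps=6000):
    """Trapezoidal estimate of the integral of exp(i*t*eps) * f(mu + eps) over eps."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps + 1):
        eps = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cmath.exp(1j * t * eps) * pdf(mu + eps)
    return total * h

MU, SIGMA = 0.7, 1.5  # illustrative mean and standard deviation

def normal_pdf(x):
    return math.exp(-0.5 * ((x - MU) / SIGMA) ** 2) / (SIGMA * math.sqrt(2.0 * math.pi))

phi0 = shifted_cf(normal_pdf, MU, 0.0)
phi_plus = shifted_cf(normal_pdf, MU, 0.8)
phi_minus = shifted_cf(normal_pdf, MU, -0.8)
step = 1e-3  # finite-difference step for the second derivative at t = 0
second_derivative = (shifted_cf(normal_pdf, MU, step) - 2.0 * phi0
                     + shifted_cf(normal_pdf, MU, -step)) / step ** 2

print("shifted cf at 0:", phi0.real)
print("|cf(t) - cf(-t)|:", abs(phi_plus - phi_minus))
print("second derivative at 0:", second_derivative.real, "vs -sigma^2 =", -SIGMA ** 2)
```

The finite-difference second derivative is only accurate to the order of the squared step, so the comparison with $-\sigma^2$ is approximate by construction.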
The characteristic function may also be inverted to recover the distribution via the inverse Fourier transform: that is,
$$f(x)=\frac{1}{2\pi}\int_{-\infty}^{\infty}dt\,e^{-itx}\varphi(t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}dt\,e^{it(\mu-x)}\tilde\varphi(t).\tag{14}$$
Inserting this into equation (7), we find the expression for the discrete distribution $F_n$ of the rounded variable in terms of the characteristic function $\varphi(t)$: namely,
$$F_n=\frac{1}{2\pi}\int_{n+\delta-1/2}^{n+\delta+1/2}dx\int_{-\infty}^{\infty}dt\,e^{-itx}\varphi(t)=\frac{1}{2\pi}\int_{-\infty}^{\infty}dt\,\varphi(t)\int_{n+\delta-1/2}^{n+\delta+1/2}dx\,e^{-itx}$$
$$=\frac{1}{2\pi}\int_{-\infty}^{\infty}dt\,\varphi(t)\,\mathrm{sinc}\!\left(\frac t2\right)e^{-it(n+\delta)}=\frac{1}{2\pi}\int_{-\infty}^{\infty}dt\,\tilde\varphi(t)\,\mathrm{sinc}\!\left(\frac t2\right)e^{it(\mu-\delta-n)}.\tag{15}$$
Here $\mathrm{sinc}(x)=x^{-1}\sin x$ for $x\ne0$ and $\mathrm{sinc}(0)=1$. In addition we can also define the characteristic function of $F_n$. Since $F_n$ is a discrete distribution, its characteristic function is given by
$$\Phi_t=\sum_{n=-\infty}^{\infty}e^{itn}F_n=\frac{1}{2\pi}\int_{-\infty}^{\infty}d\tau\,\varphi(\tau)\,\mathrm{sinc}\!\left(\frac\tau2\right)e^{-i\tau\delta}\sum_{n=-\infty}^{\infty}e^{i(t-\tau)n}$$
$$=\sum_{k=-\infty}^{\infty}\varphi(t+2\pi k)\,\mathrm{sinc}\!\left(\frac t2+\pi k\right)e^{-i(t+2\pi k)\delta}=\sum_{k=-\infty}^{\infty}\tilde\varphi(t+2\pi k)\,\mathrm{sinc}\!\left(\frac t2+\pi k\right)e^{i(t+2\pi k)(\mu-\delta)}.\tag{16}$$
Here we have used the Fourier series representation of the so-called Dirac comb distribution: namely,
$$\frac{1}{2\pi}\sum_{n=-\infty}^{\infty}e^{in(t-\tau)}=\sum_{k=-\infty}^{\infty}\delta(t-\tau+2\pi k).\tag{17}$$
Then the derivative of $\Phi_t$ is found to be
$$\frac{d\Phi_t}{dt}=\sum_{k=-\infty}^{\infty}\left\{\left[\varphi'(t+2\pi k)-i\delta\varphi(t+2\pi k)\right]\mathrm{sinc}\!\left(\frac t2+\pi k\right)+\varphi(t+2\pi k)\frac{d}{dt}\mathrm{sinc}\!\left(\frac t2+\pi k\right)\right\}e^{-i(t+2\pi k)\delta},\tag{18}$$
while the second-order derivative is given by
$$\frac{d^2\Phi_t}{dt^2}=\sum_{k=-\infty}^{\infty}\Big\{\left[\varphi''(t+2\pi k)-2i\delta\varphi'(t+2\pi k)-\delta^2\varphi(t+2\pi k)\right]\mathrm{sinc}\!\left(\frac t2+\pi k\right)$$
$$\qquad+2\left[\varphi'(t+2\pi k)-i\delta\varphi(t+2\pi k)\right]\frac{d}{dt}\mathrm{sinc}\!\left(\frac t2+\pi k\right)+\varphi(t+2\pi k)\frac{d^2}{dt^2}\mathrm{sinc}\!\left(\frac t2+\pi k\right)\Big\}e^{-i(t+2\pi k)\delta}.\tag{19}$$
Here, for the sake of clarity, we have not yet introduced the explicit forms for the derivatives of the sinc function,
$$\frac{d}{dt}\mathrm{sinc}\!\left(\frac t2+\pi k\right)=\frac12\,\frac{\cos(t/2+\pi k)-\mathrm{sinc}(t/2+\pi k)}{t/2+\pi k},$$
$$\frac{d^2}{dt^2}\mathrm{sinc}\!\left(\frac t2+\pi k\right)=-\frac14\,\mathrm{sinc}\!\left(\frac t2+\pi k\right)-\frac12\,\frac{\cos(t/2+\pi k)-\mathrm{sinc}(t/2+\pi k)}{(t/2+\pi k)^2}.\tag{20}$$
Next, given that
$$\frac{d^k\Phi_t}{dt^k}=i^k\sum_{n=-\infty}^{\infty}n^ke^{itn}F_n\quad\Rightarrow\quad\sum_{n=-\infty}^{\infty}n^kF_n=\frac{1}{i^k}\left.\frac{d^k\Phi_t}{dt^k}\right|_{t=0},\tag{21}$$
we can find that
$$m=\sum_{n=-\infty}^{\infty}nF_n=\frac1i\left.\frac{d\Phi_t}{dt}\right|_{t=0}=\mu-\delta+S_1,$$
$$s^2=\sum_{n=-\infty}^{\infty}(n-m)^2F_n=-\left.\frac{d^2\Phi_t}{dt^2}\right|_{t=0}-m^2=\sigma^2+\frac1{12}-2S_2-S_1^2,\tag{22}$$
where
$$S_1=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k\varphi(2\pi k)}{2\pi ik}e^{-2\pi ik\delta}=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k\tilde\varphi(2\pi k)}{2\pi ik}e^{2\pi ik(\mu-\delta)},\tag{23}$$
and
$$S_2=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k}{2\pi k}\left[\varphi'(2\pi k)-i\mu\varphi(2\pi k)-\frac{\varphi(2\pi k)}{2\pi k}\right]e^{-2\pi ik\delta}=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k}{2\pi k}\left[\tilde\varphi'(2\pi k)-\frac{\tilde\varphi(2\pi k)}{2\pi k}\right]e^{2\pi ik(\mu-\delta)}.\tag{24}$$
Here we have used the fact that $\mathrm{sinc}(\pi k)=(\pi k)^{-1}\sin(\pi k)=0$ for $k\in\mathbb Z-\{0\}$ as well as
$$\left.\frac{d}{dt}\mathrm{sinc}\!\left(\frac t2+\pi k\right)\right|_{t=0}=\begin{cases}\dfrac{(-1)^k}{2\pi k}&k\in\mathbb Z-\{0\}\\[1ex]0&k=0,\end{cases}\tag{25}$$
and
$$\left.\frac{d^2}{dt^2}\mathrm{sinc}\!\left(\frac t2+\pi k\right)\right|_{t=0}=\begin{cases}\dfrac{(-1)^{k+1}}{2(\pi k)^2}&k\in\mathbb Z-\{0\}\\[1ex]-\dfrac1{12}&k=0.\end{cases}\tag{26}$$
Equations (22) indeed reproduce the results expected from the elementary arguments given in the introduction, with the proviso that the infinite sums $S_1$ and $S_2$ in equations (23) and (24) are negligible. In other words, if one considers only the $k=0$ term, then $m=\mu-\delta$ and $s^2=\sigma^2+1/12$. This corresponds to treating $F_n$ as a continuous distribution over real $n$: if the infinite sum $\sum_{n=-\infty}^{\infty}e^{i(t-\tau)n}$ in equation (16) is replaced with the integral $\int_{-\infty}^{\infty}e^{i(t-\tau)n}\,dn$, the Dirac comb $\sum_{k=-\infty}^{\infty}\delta(t-\tau+2\pi k)$ is replaced by a single Dirac delta, $2\pi\delta(t-\tau)$ up to the normalization of equation (17). That is to say, the naive expectation that $m=\mu-\delta$ and $s^2=\sigma^2+1/12$ may be considered as the approximation in the limit of continuous $F_n$.

C. symmetric distribution

If $M\in\mathbb Z$ is the integer to which the mean $\mu$ is rounded, then $\mu\in[M+\delta-1/2,M+\delta+1/2]$ and $\chi=\mu-\delta-M\in[-1/2,1/2]$. Since $M$ and $k$ are integers and $\mu-\delta=M+\chi$, we find
$$S_1=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k\tilde\varphi(2\pi k)}{2\pi ik}e^{2\pi ik\chi};\qquad S_2=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{(-1)^k}{2\pi k}\left[\tilde\varphi'(2\pi k)-\frac{\tilde\varphi(2\pi k)}{2\pi k}\right]e^{2\pi ik\chi},\tag{27}$$
which are basically the Fourier series expressions of $S_1(\chi)$ and $S_2(\chi)$ for $\chi\in[-1/2,1/2]$. Since $\tilde\varphi(-t)=\tilde\varphi(t)$ and $\tilde\varphi'(-t)=-\tilde\varphi'(t)$ for a symmetric distribution, these can be further reduced to the real ones:
$$S_1=\sum_{k=1}^{\infty}\frac{(-1)^k\tilde\varphi(2\pi k)}{\pi k}\sin(2\pi k\chi);$$
$$S_2=\sum_{k=1}^{\infty}\frac{2\cdot(-1)^k}{2\pi k}\left[\tilde\varphi'(2\pi k)-\frac{\tilde\varphi(2\pi k)}{2\pi k}\right]\cos(2\pi k\chi)=\sum_{k=1}^{\infty}2\cdot(-1)^k\left.\frac{d}{dt}\left(\frac{\tilde\varphi(t)}{t}\right)\right|_{t=2\pi k}\cos(2\pi k\chi)\tag{28}$$
if $f(x)$ is symmetric about its mean.

Next, consider the family of the distributions sharing the common normalized form; namely,
$$f(x)=\frac1\sigma F\!\left(\frac{x-\mu}{\sigma}\right)\tag{29}$$
where $F(u)$ is a fixed non-negative function such that $\int_{-\infty}^{\infty}du\,F(u)=1$, $\int_{-\infty}^{\infty}du\,uF(u)=0$, and $\int_{-\infty}^{\infty}du\,u^2F(u)=1$. Then $\varphi(t)=e^{i\mu t}\Phi(\sigma t)$ and $\tilde\varphi(t)=\Phi(\sigma t)$, where
$$\Phi(\tau)=\int_{-\infty}^{\infty}du\,e^{iu\tau}F(u)\tag{30}$$
is the characteristic function of the normalized distribution. Here $f(x)$ is a symmetric distribution if and only if $F(u)$ and $\Phi(\tau)$ are even functions: $F(-u)=F(u)$ and $\Phi(-\tau)=\Phi(\tau)$. In the limit of $\sigma=0$, which formally corresponds to a Dirac delta distribution, $\delta(u)$, we then have $\tilde\varphi(t)=\Phi(0)=1$ and $\tilde\varphi'(t)=\sigma\Phi'(\sigma t)=0$, and so
$$\lim_{\sigma\to0}S_1=\sum_{k=1}^{\infty}\frac{(-1)^k}{\pi k}\sin(2\pi k\chi)=-\chi;\qquad\lim_{\sigma\to0}S_2=\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{2(\pi k)^2}\cos(2\pi k\chi)=\frac1{24}-\frac{\chi^2}{2},\tag{31}$$
for $\chi\in[-1/2,1/2]$. Then it follows that $m=\mu-\delta-\chi=M$ and $s^2=\sigma^2+1/12-(1/12-\chi^2)-(-\chi)^2=\sigma^2=0$, as expected (i.e. every sample point is rounded to the same integer).

Now suppose that $\Phi(\tau)$ admits an asymptotic expansion;
$$\Phi(\tau)\simeq\frac{1}{|\tau|^s}\sum_{p=0}^{\infty}\frac{\Phi_{\infty,p}}{\tau^p}\qquad(\tau\to\infty)\tag{32}$$
with a constant $s>0$ (this exponent is not to be confused with the sample standard deviation). Then it follows from equation (28) that
$$S_1\simeq\sum_{p=0}^{\infty}\frac{\Phi_{\infty,p}}{(2\sigma)^{p+s}}\sum_{k=1}^{\infty}\frac{(-1)^k\sin(2\pi k\chi)}{(\pi k)^{p+s+1}};\qquad S_2\simeq\sum_{p=0}^{\infty}\frac{(p+s+1)\Phi_{\infty,p}}{2(2\sigma)^{p+s}}\sum_{k=1}^{\infty}\frac{(-1)^{k+1}\cos(2\pi k\chi)}{(\pi k)^{p+s+2}}.\tag{33}$$
Here the inner sums on $k$ converge absolutely for $s>0$ (and for integer $s$ are actually reducible to the Bernoulli polynomials), given that
$$\left|\sum_{k=1}^{\infty}\frac{(-1)^k\sin(2\pi k\chi)}{(\pi k)^{p+s+1}}\right|\le\sum_{k=1}^{\infty}\frac{|(-1)^k\sin(2\pi k\chi)|}{(\pi k)^{p+s+1}}\le\sum_{k=1}^{\infty}\frac{1}{(\pi k)^{p+s+1}}=\frac{\zeta(p+s+1)}{\pi^{p+s+1}},\tag{34}$$
where $\zeta(z)$ is the Riemann zeta function, and similarly
$$\left|\sum_{k=1}^{\infty}\frac{(-1)^{k+1}\cos(2\pi k\chi)}{(\pi k)^{p+s+2}}\right|\le\frac{\zeta(p+s+2)}{\pi^{p+s+2}}.\tag{35}$$
If $\chi=1/2$, then $\cos(2\pi k\chi)=\cos(\pi k)=(-1)^k$ for any integer $k$ and so the last bound is actually sharp. By contrast, the first bound is not sharp but it suffices for our purposes here. Since $\lim_{z\to\infty}\zeta(z)=1$ and $\zeta(z)$ is finite for $z>1$, the coefficients in equation (33) are bounded and the series provide the asymptotic expansions of $S_1$ and $S_2$ as $\sigma\to\infty$. Also it follows that, if $\lim_{\tau\to\infty}d\ln|\Phi(\tau)|/d\ln|\tau|=-s<0$, then $S_1\sim\sigma^{-s}\to0$ and $S_2\sim\sigma^{-s}\to0$ as $\sigma\to\infty$, as well as $m=\mu-\delta+O(\sigma^{-s})$ and $s^2=\sigma^2+1/12+O(\sigma^{-s})$.

As a concrete example, consider the bilateral exponential (Laplace) distribution of the mean $\mu$ and the variance $\sigma^2$:
$$f(x)=\frac{1}{\sqrt2\,\sigma}\exp\!\left(-\frac{\sqrt2}{\sigma}|x-\mu|\right),\tag{36}$$
which is easily normalizable so that
$$F(u)=\frac{e^{-\sqrt2|u|}}{\sqrt2},\qquad\Phi(\tau)=\left(1+\frac{\tau^2}{2}\right)^{-1}.\tag{37}$$
Here we find $\Phi(\tau)=-\sum_{k=1}^{\infty}(-2/\tau^2)^k\simeq2/\tau^2$ as $\tau\to\infty$ and so it should be $m=\mu-\delta+O(\sigma^{-2})$ and $s^2=\sigma^2+1/12+O(\sigma^{-2})$. In fact for this case, we know the analytic forms for $S_1(\chi)$ and $S_2(\chi)$. That is to say, let us consider the odd function for $\chi\in[-1/2,1/2]$ given by
$$S_1(\chi)=\sum_{k=1}^{\infty}\frac{2^kB_{2k+1}(1/2+\chi)}{(2k+1)!\,\sigma^{2k}}=\frac{\sinh(\sqrt2\chi/\sigma)}{2\sinh[1/(\sqrt2\sigma)]}-\chi,\tag{38}$$
where $B_n(z)$ is the Bernoulli polynomial. Then we find that
$$\int_{-1/2}^{1/2}d\chi\,S_1(\chi)\sin(2\pi k\chi)=\frac{(-1)^k}{2\pi k\,[1+2(\sigma\pi k)^2]}\tag{39}$$
for a positive integer $k$. It follows that the first infinite sum in equation (28) with $\tilde\varphi(t)=\Phi(\sigma t)=[1+(\sigma t)^2/2]^{-1}$ is the Fourier (sine) series expansion for $S_1(\chi)$ in equation (38). In other words, if we sample the random variable $x$ distributed according to equation (36) and round it to an integer $n$ such that $n+\delta-1/2\le x<n+\delta+1/2$ or $n+\delta-1/2<x\le n+\delta+1/2$ with a fixed $\delta\in[-1/2,1/2]$, the mean of $n$ is
$$m=\mu-\delta+S_1=M+\frac{\sinh(\sqrt2\chi/\sigma)}{2\sinh[1/(\sqrt2\sigma)]}\simeq\mu-\delta-\frac{\chi(1-4\chi^2)}{12\sigma^2}+O(\sigma^{-4}),\tag{40}$$
where $M$ is the integer to which $\mu$ is rounded and $\chi=\mu-\delta-M$. Similarly we can also establish that the second infinite sum in equation (28) with the same $\tilde\varphi(t)$ is a Fourier series representation of
$$S_2(\chi)=\sum_{k=1}^{\infty}\frac{2^kB_{2k+2}(1/2+\chi)}{(2k+2)(2k)!\,\sigma^{2k}}=\frac{\sigma^2}{2}+\frac1{24}-\frac{\chi^2}{2}+\frac{\chi\sinh(\sqrt2\chi/\sigma)}{2\sinh[1/(\sqrt2\sigma)]}-\frac{\cosh[1/(\sqrt2\sigma)]\cosh(\sqrt2\chi/\sigma)}{4\sinh^2[1/(\sqrt2\sigma)]},\tag{41}$$
and so the variance of the rounded integers sampled over the random variables with the distribution in equation (36) is
$$s^2=\frac{\cosh[1/(\sqrt2\sigma)]\cosh(\sqrt2\chi/\sigma)}{2\sinh^2[1/(\sqrt2\sigma)]}-\frac{\sinh^2(\sqrt2\chi/\sigma)}{4\sinh^2[1/(\sqrt2\sigma)]}\simeq\sigma^2+\frac1{12}-\left(\frac7{240}-\frac{\chi^2}{2}+\chi^4\right)\frac{1}{2\sigma^2}+O(\sigma^{-4}).\tag{42}$$

III. EXPECTATION VALUES FOR UNSPECIFIED MEAN
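The closed forms obtained for the Laplace example at the end of the preceding section, equations (40) and (42), can be cross-checked against a direct summation over the bin probabilities $F_n$ and against the truncated Fourier sum of equation (28). The following is a sketch under stated assumptions: the function names, the truncation limits, and the parameter values ($\mu=2.3$, $\sigma=0.8$, $\delta=0.1$) are illustrative choices, and $w=1$ as in the text.

```python
import math

def laplace_cdf(x, mu, sigma):
    """CDF of the Laplace distribution of equation (36): mean mu, variance sigma**2."""
    a = math.sqrt(2.0) / sigma
    if x < mu:
        return 0.5 * math.exp(a * (x - mu))
    return 1.0 - 0.5 * math.exp(-a * (x - mu))

def direct_moments(mu, sigma, delta, n_span=80):
    """Mean and variance of the rounded integer n summed directly over F_n (w = 1)."""
    centre = math.floor(mu - delta + 0.5)
    mean_n = mean_n2 = 0.0
    for n in range(centre - n_span, centre + n_span + 1):
        F_n = (laplace_cdf(n + delta + 0.5, mu, sigma)
               - laplace_cdf(n + delta - 0.5, mu, sigma))
        mean_n += n * F_n
        mean_n2 += n * n * F_n
    return mean_n, mean_n2 - mean_n ** 2

def closed_form_moments(mu, sigma, delta):
    """Closed forms of equations (40) and (42)."""
    M = math.floor(mu - delta + 0.5)       # integer to which mu is rounded
    chi = mu - delta - M                   # offset in [-1/2, 1/2)
    h = 1.0 / (math.sqrt(2.0) * sigma)
    a = math.sqrt(2.0) / sigma
    m = M + math.sinh(a * chi) / (2.0 * math.sinh(h))
    s2 = (math.cosh(h) * math.cosh(a * chi) / (2.0 * math.sinh(h) ** 2)
          - math.sinh(a * chi) ** 2 / (4.0 * math.sinh(h) ** 2))
    return m, s2

def fourier_S1(chi, sigma, k_max=20000):
    """Truncated first sum of equation (28) with the Laplace characteristic function."""
    return sum((-1) ** k * math.sin(2.0 * math.pi * k * chi)
               / (math.pi * k * (1.0 + 2.0 * (math.pi * sigma * k) ** 2))
               for k in range(1, k_max + 1))

mu, sigma, delta = 2.3, 0.8, 0.1
m_direct, s2_direct = direct_moments(mu, sigma, delta)
m_closed, s2_closed = closed_form_moments(mu, sigma, delta)
m_fourier = mu - delta + fourier_S1(0.2, sigma)  # chi = mu - delta - 2 = 0.2
print(m_direct, m_closed, m_fourier)
print(s2_direct, s2_closed, sigma ** 2 + 1.0 / 12.0)
```

The three values of the mean should agree closely, and the last line shows how far the exact variance sits from the naive $\sigma^2+1/12$ when $\sigma$ is comparable to the rounding unit.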
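The results so far have concerned the distributions with a known fixed mean. Here instead we consider the cases of unspecified means. That is to say, let us calculate the expectation values for the (difference to the true) mean and the variance of the rounded variables averaged over distributions with all possible means. In practice, this is achieved by averaging over $\chi\in[-1/2,1/2]$ and so $\langle m-\mu\rangle=-\delta+\langle S_1\rangle$ and $\langle s^2\rangle=\sigma^2+1/12-2\langle S_2\rangle-\langle S_1^2\rangle$, where $\langle S_1\rangle=\int_{-1/2}^{1/2}d\chi\,S_1(\chi)$ and so on. However, we have $\langle e^{2\pi ik\chi}\rangle=\langle\sin(2\pi\chi k)\rangle=\langle\cos(2\pi\chi k)\rangle=0$ for any non-zero integer $k$ and thus $\langle S_1\rangle=\langle S_2\rangle=0$. As for $\langle S_1^2\rangle$, let us first note $\langle S_1^2\rangle\ne\langle S_1\rangle^2$. Rather from equation (27),
$$\langle S_1^2\rangle=\sum_{\substack{k,p=-\infty\\k,p\ne0}}^{\infty}\frac{(-1)^{k+p}\tilde\varphi(2\pi k)\tilde\varphi(2\pi p)}{(2\pi i)^2kp}\langle e^{2\pi i(k+p)\chi}\rangle=\sum_{\substack{k=-\infty\\k\ne0}}^{\infty}\frac{\tilde\varphi(2\pi k)\tilde\varphi(-2\pi k)}{(2\pi k)^2}=\sum_{k=1}^{\infty}\frac{|\tilde\varphi(2\pi k)|^2}{2(\pi k)^2},\tag{43}$$
where we have used the fact that $\tilde\varphi(-t)$ is the complex conjugate of $\tilde\varphi(t)$ for any real $f(x)$ (see eq. 12). The same result for the symmetric distributions may also be derived from equation (28) given $\langle\sin^2(2\pi\chi k)\rangle=1/2$ and $\langle\sin(2\pi\chi k)\sin(2\pi\chi p)\rangle=0$ for $k\ne p$. Consequently
$$\langle m\rangle=\mu-\delta,\qquad\langle s^2\rangle=\sigma^2+\frac1{12}-\sum_{k=1}^{\infty}\frac{|\tilde\varphi(2\pi k)|^2}{2(\pi k)^2},\tag{44}$$
with $\lim_{\sigma\to0}\langle S_1^2\rangle=\zeta(2)/(2\pi^2)=1/12$ (given $\tilde\varphi(t)=1$ for $\sigma=0$) and $\langle s^2\rangle=\sigma^2=0$ for $\sigma=0$. If $\Phi(\tau)$ is given by the same function admitting the asymptotic expansion of equation (32), we find
$$\langle S_1^2\rangle\simeq\sum_{p=0}^{\infty}\frac{\zeta(p+2s+2)\left(\sum_{r=0}^{p}\Phi_{\infty,p-r}\Phi_{\infty,r}\right)}{2^{(p+2s)+1}\,\pi^{(p+2s+2)}\,\sigma^{(p+2s)}},\tag{45}$$
and so $\langle S_1^2\rangle\sim\sigma^{-2s}\to0$ and $\langle s^2\rangle=\sigma^2+1/12+O(\sigma^{-2s})$ (also $0\le\langle s^2\rangle\le\sigma^2+1/12$) as $\sigma\to\infty$ for $\Phi(\tau)\sim\tau^{-s}$ as $\tau\to\infty$. That is to say, $\langle s^2\rangle$ typically tends to the limiting value $\lim_{\sigma\to\infty}(\langle s^2\rangle-\sigma^2)=1/12$ about twice as fast as the individual $s^2$ does. For example, with the bilateral exponential distribution given in equation (36), we specifically have
$$\langle S_1^2\rangle=\sum_{k=1}^{\infty}\frac{1}{2(\pi k)^2[1+2(\pi k\sigma)^2]^2}=\sum_{k=1}^{\infty}\frac{1}{8(\pi k)^6\sigma^4}\sum_{p=0}^{\infty}\frac{(-1)^p(p+1)}{2^p(\pi k\sigma)^{2p}}=\sum_{p=0}^{\infty}\frac{(-1)^p(p+1)}{\sigma^{2(p+2)}}\frac{\zeta(2p+6)}{(\sqrt2\pi)^{2p+6}}$$
$$=\sum_{p=0}^{\infty}\frac{(p+1)B_{2p+6}}{(2p+6)!}\frac{2^{p+2}}{\sigma^{2(p+2)}}=\sigma^2+\frac1{12}-\frac{2+3\sqrt2\,\sigma\sinh(\sqrt2/\sigma)}{16\sinh^2[1/(\sqrt2\sigma)]}\tag{46}$$
where $B_k$ is the Bernoulli number, and so
$$\langle s^2\rangle=\frac{2+3\sqrt2\,\sigma\sinh(\sqrt2/\sigma)}{16\sinh^2[1/(\sqrt2\sigma)]}=\sigma^2+\frac1{12}-\frac1{\sigma^4}\sum_{p=0}^{\infty}\frac{(p+1)B_{2p+6}}{(2p+6)!}\frac{2^{p+2}}{\sigma^{2p}};\tag{47}$$
that is, $\langle s^2\rangle\simeq\sigma^2+1/12-1/(7560\sigma^4)+O(\sigma^{-6})$, which contrasts with the $O(\sigma^{-2})$ remainder of equation (42).

A. distributions with a compact support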
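The $\chi$-averaged Laplace result of equations (46) and (47) above can be reproduced by averaging the fixed-mean variance of equation (42) numerically. The sketch below is illustrative only: the midpoint-rule average, the number of quadrature points, the function names, and the test values of $\sigma$ are assumptions, while the closed forms are those quoted in equations (42) and (47).

```python
import math

def laplace_rounded_variance(chi, sigma):
    """Equation (42): variance of the rounded Laplace variable at offset chi (w = 1)."""
    h = 1.0 / (math.sqrt(2.0) * sigma)
    a = math.sqrt(2.0) / sigma
    return (math.cosh(h) * math.cosh(a * chi) / (2.0 * math.sinh(h) ** 2)
            - math.sinh(a * chi) ** 2 / (4.0 * math.sinh(h) ** 2))

def chi_average(func, n_points=4000):
    """Midpoint-rule average of func(chi) over chi in [-1/2, 1/2]."""
    return sum(func(-0.5 + (j + 0.5) / n_points) for j in range(n_points)) / n_points

def averaged_variance_closed_form(sigma):
    """Equation (47): the chi-averaged variance in closed form."""
    h = 1.0 / (math.sqrt(2.0) * sigma)
    return ((2.0 + 3.0 * math.sqrt(2.0) * sigma * math.sinh(math.sqrt(2.0) / sigma))
            / (16.0 * math.sinh(h) ** 2))

for sigma in (0.5, 1.0, 3.0):
    numeric = chi_average(lambda chi: laplace_rounded_variance(chi, sigma))
    closed = averaged_variance_closed_form(sigma)
    remainder = sigma ** 2 + 1.0 / 12.0 - closed
    leading = 1.0 / (7560.0 * sigma ** 4)  # leading term of the remainder
    print(f"sigma={sigma}: numeric={numeric:.10f} closed={closed:.10f} "
          f"remainder={remainder:.3e} leading={leading:.3e}")
```

For $\sigma$ of a few rounding units the remainder is already close to its leading term $1/(7560\sigma^4)$, which illustrates how much faster the averaged variance approaches $\sigma^2+1/12$ than the fixed-mean variance does.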
Suppose that $f(x)$ is
$$f(x)=\frac{1}{2\sqrt3\,\sigma}\qquad(\mu-\sqrt3\sigma\le x\le\mu+\sqrt3\sigma),\tag{48}$$
i.e. the uniform distribution over a compact interval, the normalized form of which is
$$F(u)=\frac{1}{2\sqrt3}\quad(-\sqrt3\le u\le\sqrt3),\qquad\Phi(\tau)=\int_{-\sqrt3}^{\sqrt3}\frac{e^{iu\tau}\,du}{2\sqrt3}=\frac{\sin(\sqrt3\tau)}{\sqrt3\tau}.\tag{49}$$
We then find for the compact uniform distribution that
$$\langle S_1^2\rangle=\sum_{k=1}^{\infty}\frac{\sin^2(2\sqrt3\pi\sigma k)}{2(\pi k)^2(2\sqrt3\pi\sigma k)^2}=\sum_{k=1}^{\infty}\frac{1-\cos(4\sqrt3\pi\sigma k)}{48\sigma^2(\pi k)^4}=\frac{1}{48\sigma^2}\left[\frac{\zeta(4)}{\pi^4}+\frac{B_4(\xi)}{3}\right]=\frac{\xi^2(1-\xi)^2}{144\sigma^2}\tag{50}$$
where $\xi=2\sqrt3\sigma-\lfloor2\sqrt3\sigma\rfloor\in[0,1)$ is the fractional part of "$2\sqrt3\sigma$ (which is the width of the support)". Here we have used the Fourier series expansion of the Bernoulli polynomial (for $0\le\xi\le1$) of the even order
$$B_{2n}(\xi)=\frac{(-1)^{n+1}(2n)!}{2^{2n-1}}\sum_{k=1}^{\infty}\frac{\cos(2\pi k\xi)}{(\pi k)^{2n}},\tag{51}$$
with $\zeta(4)=\pi^4/90$ and $B_4(\xi)=\xi^2(\xi-1)^2-1/30$. That is to say, while the remainder $\langle S_1^2\rangle=\sigma^2+1/12-\langle s^2\rangle$ falls off "on average" like $\sim\sigma^{-2}$ as $\sigma\to\infty$, the actual behavior includes a periodic modulation superimposed on the asymptotic scale-free decay. This is due to the compact support of the underlying distribution: note that a compact distribution $F(u)$ typically results in an oscillatory $\Phi(\tau)$, and $\langle S_1^2\rangle$ is basically the sum on the regular sampling of the latter. The resulting modulation of $\langle S_1^2\rangle$ may be understood as a sort of interference pattern between the width of the compact support and the unit intervals for the rounded integer values. However, unless the variance of the underlying continuous distribution $f(x)$ is known a priori, the averaged asymptotic behavior of $\langle S_1^2\rangle$ as $\sigma\to\infty$ may be used to estimate $\sigma$ from $s$ in practice within a reasonable accuracy (provided $s\gg1$). In particular, we may average $\langle S_1^2\rangle$ in equation (50) over $\xi\in[0,1)$ and get $\langle S_1^2\rangle=\zeta(4)/(48\sigma^2\pi^4)=(4320\sigma^2)^{-1}$. For a general distribution with a compact support, one may obtain the averaged asymptotic behavior for $\langle S_1^2\rangle$ in equation (43) by assuming that any sum of the form $\sum_k\sin(a\sigma k)/k^n$ or $\sum_k\cos(a\sigma k)/k^n$ (where $a$ is a fixed real constant) also vanishes on average.

In principle we can also calculate $S_1$ and $S_2$ first, and subsequently average them over the proper interval. For the uniform distribution in equation (48), equation (28) results in
$$S_1=\sum_{k=1}^{\infty}\frac{(-1)^k\sin(2\sqrt3\sigma\pi k)}{2\sqrt3\sigma(\pi k)^2}\sin(2\pi k\chi)=\sum_{k=1}^{\infty}\frac{\cos(2\pi\Delta_-k)-\cos(2\pi\Delta_+k)}{4\sqrt3\sigma(\pi k)^2}=\frac{B_2(\Delta_-)-B_2(\Delta_+)}{4\sqrt3\sigma}=-\frac{\lambda\zeta}{\sqrt3\sigma},\tag{52}$$
where $\Delta_\pm=\mu-\delta+1/2\pm\sqrt3\sigma-m_\pm$ is the fractional part of $\mu\pm\sqrt3\sigma-\delta+1/2$ and $m_\pm=\lfloor1/2+\mu-\delta\pm\sqrt3\sigma\rfloor$ is the integer to which the upper/lower limit of the compact support (i.e. $\mu\pm\sqrt3\sigma$) is rounded. In addition, $\lambda=(\Delta_++\Delta_--1)/2$ and $\zeta=(\Delta_+-\Delta_-)/2$ (here $\zeta$ without an argument denotes this offset, not the Riemann zeta function). Also used are $(-1)^k\sin(2\pi k\chi)=\sin[2\pi k(1/2+\mu-\delta)]$ for any integer $k$ given $\chi=\mu-\delta-M$ with an integer $M$, and $B_2(x)=x^2-x+1/6$. Similarly,
$$S_2=\sum_{k=1}^{\infty}(-1)^k\left[\frac{\cos(2\sqrt3\sigma\pi k)}{2(\pi k)^2}-\frac{\sin(2\sqrt3\sigma\pi k)}{2\sqrt3\sigma(\pi k)^3}\right]\cos(2\pi k\chi)=\frac{B_2(\Delta_-)+B_2(\Delta_+)}{4}+\frac{B_3(\Delta_-)-B_3(\Delta_+)}{6\sqrt3\sigma}$$
$$=\frac{\lambda^2+\zeta^2}{2}-\frac1{24}-\left(\lambda^2+\frac{\zeta^2}{3}-\frac1{12}\right)\frac{\zeta}{\sqrt3\sigma},\tag{53}$$
further utilizing $B_3(x)=x^3-3x^2/2+x/2$:
$$\sum_{k=1}^{\infty}\frac{\sin(2\pi kx)}{(\pi k)^{2n+1}}=\frac{(-1)^{n+1}2^{2n}}{(2n+1)!}B_{2n+1}(x)\qquad(0\le x\le1).\tag{54}$$
Here note $\lim_{\sigma\to\infty}2S_2=\lambda^2+\zeta^2-1/12$ and so $s^2=\sigma^2-\lambda^2-\zeta^2+1/6+O(\sigma^{-1})$ even though $\Phi(\tau)=\mathrm{sinc}(\sqrt3\tau)\sim\tau^{-1}$ as $\tau\to\infty$. Technically this is not a counter-example of the prior discussion following equation (32), since $\mathrm{sinc}(x)$ does not actually have a proper asymptotic expansion as $x\to\infty$ in the strict sense. In fact, we observe that, while the asymptotic behavior of $S_1$ as $\sigma\to\infty$ follows that of $\Phi(\tau)$ as $\tau\to\infty$, the behavior of $S_2$ actually traces $\tau\Phi'(\tau)$ instead. With an oscillatory $\Phi(\tau)$ due to $F(u)$ on a compact support, $\tau\Phi'(\tau)$ can indeed be much larger than $\Phi(\tau)$ even if $\Phi(\tau)$ is bounded by an asymptotically decaying envelope, and so $s^2-\sigma^2$ does not necessarily tend to $1/12$ unless $\tau\Phi'(\tau)\to0$ as $\tau\to\infty$. By contrast, the formula for $\langle s^2\rangle$ in equation (44) only involves $|\tilde\varphi(2\pi k)|^2$ and we thus expect the asymptotic behavior of the remainder $\sigma^2+1/12-\langle s^2\rangle$ to trace that of $|\Phi(\tau)|^2$ in general.

In order to average the expressions in equations (52) and (53) over $\chi\in[-1/2,1/2]$, we first observe that $\zeta=\xi/2\ge0$ or $\zeta=(\xi-1)/2<0$ (where $0\le\xi<1$ is the fractional part of $2\sqrt3\sigma$). Next, $\Delta_\pm=\lambda\pm\zeta+1/2\in[0,1)$ implies $\lambda\in[-1/2+\zeta,1/2-\zeta)=[(\xi-1)/2,(1-\xi)/2)$ for $\zeta\ge0$ or $\lambda\in[-1/2-\zeta,1/2+\zeta)=[-\xi/2,\xi/2)$ for $\zeta<0$. Finally, $\lambda=\chi\pm1/2$ or $\chi$ depending on the parity of $m_++m_-$, and we find that $\chi\in[-1/2,1/2]$ at fixed $\sigma$ (and consequently $\xi$ is also fixed) then maps to the union of those two allowed intervals on $\lambda$. Consequently, averaging over $\chi\in[-1/2,1/2]$ is achieved through summing two integrals on $\lambda$: viz.
$$\langle S_1^2\rangle=\int_{\frac{\xi-1}{2}}^{\frac{1-\xi}{2}}d\lambda\,S_1^2\Big|_{\zeta=\frac\xi2}+\int_{-\frac\xi2}^{\frac\xi2}d\lambda\,S_1^2\Big|_{\zeta=\frac{\xi-1}{2}}=\frac{1}{6\sigma^2}\left[\xi^2\int_0^{\frac{1-\xi}{2}}d\lambda\,\lambda^2+(\xi-1)^2\int_0^{\frac\xi2}d\lambda\,\lambda^2\right]$$
$$=\frac{1}{144\sigma^2}\left[\xi^2(1-\xi)^3+(\xi-1)^2\xi^3\right]=\frac{\xi^2(1-\xi)^2}{144\sigma^2},\tag{55}$$
which recovers equation (50). Since $S_1$ is an odd function of $\lambda$, it is immediately obvious that $\langle S_1\rangle=0$. As for $\langle S_2\rangle$, the two respective integrals for $\zeta=\xi/2\ge0$ and $\zeta=(\xi-1)/2<0$ combine to give $\langle S_2\rangle=0$. Since $\chi$ (i.e. the offset of the mean from an integer value) determines both $\Delta_\pm$ (i.e. the offsets of the boundary points of the support from integer values) once $\sigma$ (which specifies the width of the support) is fixed, $\Delta_+$ and $\Delta_-$ are not in fact independent from each other. Nevertheless it is still formally possible to consider $S_1^2$ in equation (52) as a function of the pair of independent variables $(\Delta_+,\Delta_-)\in[0,1)^2$ and average $S_1^2$ over this whole rectangular domain. Following the coordinate transform $(\Delta_+,\Delta_-)\to(\lambda,\zeta)$, the resulting average is shown to be identical to further averaging the $\chi$-average of equation (55) over $\zeta\in(-1/2,1/2)$ or equivalently $\xi\in[0,1)$. That is to say, if one assumes that both upper and lower boundaries of the compact support are randomly placed relative to the integer values (and independent from each other), the resulting expectation value (averaged over all possible such placements) recovers only the "slow" asymptotic decay behavior of $\langle s^2\rangle$ for the rounded random variables on a compact support, while averaging off the "fast" modulation due to the interference between the width of the support and the integer signposts.

IV. NORMAL DISTRIBUTIONS
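Before the normal case is treated, the compact-support result of equation (50), namely $\sigma^2+1/12-\langle s^2\rangle=\xi^2(1-\xi)^2/(144\sigma^2)$ for the uniform distribution, can be checked by computing the rounded variance exactly from bin overlaps and averaging it over the offset of the mean. This is a sketch under stated assumptions: the function names, the midpoint-rule average with 5000 points, and the test values of $\sigma$ are illustrative choices, with $w=1$ and $\delta=0$.

```python
import math

def uniform_rounded_variance(mu, sigma, delta=0.0):
    """Exact variance of the rounded integer for the uniform distribution of eq. (48)."""
    half = math.sqrt(3.0) * sigma
    lo, hi = mu - half, mu + half
    n_lo = math.floor(lo - delta + 0.5)    # bin containing the lower edge
    n_hi = math.floor(hi - delta + 0.5)    # bin containing the upper edge
    mean_n = mean_n2 = 0.0
    for n in range(n_lo, n_hi + 1):
        # F_n is the fraction of the support overlapping the n-th rounding bin.
        overlap = min(hi, n + delta + 0.5) - max(lo, n + delta - 0.5)
        F_n = max(0.0, overlap) / (hi - lo)
        mean_n += n * F_n
        mean_n2 += n * n * F_n
    return mean_n2 - mean_n ** 2

def averaged_variance(sigma, n_points=5000):
    """Midpoint-rule average of the rounded variance over chi in [-1/2, 1/2]."""
    return sum(uniform_rounded_variance(-0.5 + (j + 0.5) / n_points, sigma)
               for j in range(n_points)) / n_points

def predicted_average(sigma):
    """sigma^2 + 1/12 minus the remainder of equation (50)."""
    xi = 2.0 * math.sqrt(3.0) * sigma
    xi -= math.floor(xi)                   # fractional part of the support width
    return sigma ** 2 + 1.0 / 12.0 - xi ** 2 * (1.0 - xi) ** 2 / (144.0 * sigma ** 2)

for sigma in (0.3, 0.9, 2.0):
    print(sigma, averaged_variance(sigma), predicted_average(sigma))
```

Because the remainder depends on the fractional part $\xi$ of the support width, scanning $\sigma$ over a fine grid with these functions would display the periodic modulation discussed in the preceding subsection.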
Finally we would like to consider the case of $f(x)$ being the Gaussian normal distribution:
$$f(x)=\frac{1}{(2\pi)^{1/2}\sigma}\exp\!\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],\tag{56}$$
or in the standard form
$$F(u)=\frac{e^{-u^2/2}}{\sqrt{2\pi}},\qquad\Phi(\tau)=e^{-\tau^2/2}.\tag{57}$$
Then with $\tilde\varphi(t)=\Phi(\sigma t)$, we have
$$S_1=\sum_{k=1}^{\infty}(-1)^k\sin(2\pi k\chi)\frac{e^{-2(\pi\sigma k)^2}}{\pi k};$$
$$S_2=\sum_{k=1}^{\infty}\frac{(-1)^k\cos(2\pi k\chi)}{2\pi^2}\left.\frac{d}{d\kappa}\left(\frac{e^{-2(\pi\sigma\kappa)^2}}{\kappa}\right)\right|_{\kappa=k}=\sum_{k=1}^{\infty}(-1)^{k+1}\cos(2\pi k\chi)\frac{1+4(\pi\sigma k)^2}{2(\pi k)^2}e^{-2(\pi\sigma k)^2}.\tag{58}$$
Thanks to the super-exponential decay $\propto e^{-2(\pi\sigma k)^2}$ in $k$ (NB: the $k=2$ term relative to the $k=1$ term is $\sim e^{-6(\pi\sigma)^2}$: if $\sigma=1$, note $e^{-6\pi^2}\approx2\times10^{-26}$!), these sums (which are actually in the form of the Jacobi theta function and its antiderivatives) converge extremely quickly and are completely dominated by their respective first terms unless $\sigma\ll1$. Alternatively we may construct a more formal argument by bracketing the infinite sums following the integral convergence test. In particular, we first observe that
$$|S_1|\le\sum_{k=1}^{\infty}\frac{e^{-2(\pi\sigma k)^2}}{\pi k}=I_1,\qquad|S_2|\le\sum_{k=1}^{\infty}\frac{1+4(\pi\sigma k)^2}{2(\pi k)^2}e^{-2(\pi\sigma k)^2}=I_2\tag{59}$$
but the summands now are strictly decreasing positive functions of $k\ge1$. Hence the integral convergence test indicates
$$\frac{E_1(2\pi^2\sigma^2)}{2\pi}\le I_1\le\frac{e^{-2(\pi\sigma)^2}}{\pi}+\frac{E_1(2\pi^2\sigma^2)}{2\pi},\qquad\frac{e^{-2(\pi\sigma)^2}}{2\pi^2}\le I_2\le\frac{2(\pi\sigma)^2+1}{\pi^2}e^{-2(\pi\sigma)^2},\tag{60}$$
where $E_1(x)=\int_1^{\infty}dt\,e^{-tx}/t$ is the analytic exponential integral. Note that $I_1$ for $\sigma=0$, which results in the harmonic series, actually diverges and so the first bounds are only valid for $\sigma>0$, but it has been already shown that $S_1=-\chi$ if $\sigma=0$. Given the asymptotic behavior $e^xE_1(x)\sim\sum_{k=0}^{\infty}(-1)^kk!/x^{k+1}$ as $x\to\infty$, the first bounds may also be replaced by purely elementary functions. In particular, for $x>0$, we find
$$e^xE_1(x)=\int_1^{\infty}e^{x(1-t)}\frac{dt}{t}<\int_1^{\infty}dt\,e^{-x(t-1)}=\frac1x,\tag{61}$$
i.e. $E_1(x)<e^{-x}/x$ for $x>0$, and so it follows that
$$|S_1|\le I_1\le\left[1+\frac{1}{4(\pi\sigma)^2}\right]\frac{e^{-2(\pi\sigma)^2}}{\pi}.\tag{62}$$
That is to say, for a sufficiently large $\sigma$, both sums $I_1$ and $I_2$ are completely dominated by their respective first terms, and $S_1\sim O(e^{-2(\pi\sigma)^2})$ and $S_2\sim O(\sigma^2e^{-2(\pi\sigma)^2})$ as $\sigma\to\infty$. In conclusion, the mean $m$ and the variance $s^2$ of the rounded variables drawn from the normal distribution (of the mean $\mu$ and the variance $\sigma^2$) behave like
$$m\simeq\mu-\delta+O\!\left(e^{-2(\pi\sigma)^2}\right);\qquad s^2\simeq\sigma^2+\frac1{12}+O\!\left(\sigma^2e^{-2(\pi\sigma)^2}\right),\tag{63}$$
as $\sigma\to\infty$, but the remainder for most practical purposes can be safely ignored provided $\sigma\gtrsim1$ (NB: $e^{-2\pi^2}/\pi\approx8.5\times10^{-10}$).

As for the expectation value averaged over $\mu$ at fixed $\sigma$, if we consider the sum
$$\langle S_1^2\rangle=\sum_{k=1}^{\infty}\frac{e^{-4(\pi\sigma k)^2}}{2(\pi k)^2},\tag{64}$$
the integral test is still applicable:
$$\int_1^{\infty}\frac{e^{-4(\pi\sigma x)^2}\,dx}{2(\pi x)^2}\le\langle S_1^2\rangle\le\frac{e^{-4(\pi\sigma)^2}}{2\pi^2}+\int_1^{\infty}\frac{e^{-4(\pi\sigma x)^2}\,dx}{2(\pi x)^2}.\tag{65}$$
While here the integral is technically reducible to the incomplete gamma function (or an expression involving the error function), it is sufficient for our purpose to note
$$e^a\int_1^{\infty}\frac{e^{-ax^2}\,dx}{x^2}=\frac12\int_0^{\infty}\frac{e^{-at}\,dt}{(t+1)^{3/2}}<\frac12\int_0^{\infty}e^{-at}\,dt=\frac{1}{2a}\tag{66}$$
for $a>0$. Hence it follows that, for $\sigma>0$,
$$\langle S_1^2\rangle=\sigma^2+\frac1{12}-\langle s^2\rangle\le\left[1+\frac{1}{8(\pi\sigma)^2}\right]\frac{e^{-4(\pi\sigma)^2}}{2\pi^2};\tag{67}$$
$$\therefore\ \langle s^2\rangle\simeq\sigma^2+\frac1{12}+O\!\left(e^{-4(\pi\sigma)^2}\right)\tag{68}$$
with an even faster-decaying (cf. $e^{-(2\pi)^2}/(2\pi^2)\approx3.6\times10^{-19}$) remainder term. In summary, the rounding errors for normally distributed random variables can for most practical applications be considered as independent from the intrinsic dispersion, unless the intrinsic dispersion itself is considerably smaller than the rounding unit.

A. R. Tricker, "Effects of Rounding on the Moments of a Probability Distribution," The Statistician, 381-390 (1984)
F. W. J. Olver, D. W. Lozier, R. F. Boisvert, C. W. Clark, "NIST Handbook of Mathematical Functions" (Cambridge Univ. Press, Cambridge, 2010) (http://dlmf.nist.gov)