Bayesian estimate of the zero-density frequency of a Cs fountain
D Calonico, F Levi, L Lorini and G Mana
INRIM - Istituto Nazionale di Ricerca Metrologica, Str. delle Cacce 91, 10135 Torino, Italy
E-mail: [email protected]
Abstract.
Caesium fountain frequency-standards realize the second in the International System of Units with a relative uncertainty approaching $10^{-\ldots}$. Among the main contributions to the accuracy budget, cold collisions play an important role because of the atomic density shift of the reference atomic transition. This paper describes an application of the Bayesian analysis of the clock frequency to estimate the density shift and describes how the Bayes theorem allows the a priori knowledge of the sign of the collisional coefficient to be rigorously embedded into the analysis. As an application, data from the INRIM caesium fountain are used and the Bayesian and orthodox analyses are compared. The Bayes theorem allows the orthodox uncertainty to be reduced by 28% and proves to be an important tool in primary frequency-metrology.
Submitted to:
Metrologia
PACS numbers: 02.50.Cw, 02.50.Tt, 06.20.Dk, 07.05.Kf, 06.30.Ft
1. Introduction
Atomic fountains depend on laser cooling of caesium atoms down to a temperature of 1 µK or an even lower value; this, together with the use of a single microwave cavity to implement the Ramsey separated-field spectroscopy, allowed the accuracy of primary frequency standards to be improved by more than one order of magnitude. Today, fountains realize the second with a relative accuracy ranging from $3 \times 10^{-\ldots}$ to $10 \times 10^{-\ldots}$ [1, 2, 3, 4, 5, 6, 7, 8] and, in the last ten years, allowed the uncertainty of the International Atomic Time (TAI) unit to be reduced from $1 \times 10^{-\ldots}$ to $5 \times 10^{-\ldots}$.

In Cs fountain clocks, cold collisions occur between the ultra-cold atoms with consequent perturbation of the atomic energy levels and shift of the atomic reference frequency. This shift is proportional to the atom density and it must be carefully evaluated to correct the clock frequency. Since it is a major component of the uncertainty budget, it has been the subject of many theoretical and experimental studies [9, 10]. It depends on the collisional dynamics during the atom ballistic flight, which is related to the way the atom cloud is prepared before launch. When an atom cloud, captured into a magneto-optical trap, is launched, its high initial density limits the collision energy and the density shift is strongly dependent on the temperature […] µK and 10 µK range and that the atomic sample is almost an even quantum mixture of the hyperfine eigenstates $|F = 3, m_F = 0\rangle$ and $|F = 4, m_F = 0\rangle$. In this case, the density shift has a negligible dependence on the mixture ratio and it is linearly dependent on density through a negative coefficient [9].

In a previous paper [11], we used the Bayesian inference [12, 13] to estimate the collisional coefficient, consistently with the constraint of a negative value. This estimate can be used to extrapolate the fountain frequency to the zero-density value. In the present paper we illustrate how to perform the extrapolation, given two or more frequency measurements at different atomic densities, irrespective of the value of the collisional coefficient, but consistently with its negative sign. From a mathematical viewpoint, the problem is to find the best-fit line through two or more points, with the constraint that the regression coefficient is negative.

Our goal is twofold. First, we want to illustrate a non-trivial application of the Bayes theorem and the relevant data analysis. Second, we want to assess the viability of a density-shift estimation consistent with the constraint of a negative collisional coefficient and to test its performance. A Bayesian approach is important because the shift has the same magnitude as the measurement noise. Therefore, despite the fact that the shift is a linear function of the atom density having a negative regression coefficient, measurement results having the low-density frequency lower than the high-density one are relatively common. A linear extrapolation to zero density from these data is clearly meaningless. On the contrary, a Bayesian inference makes it possible to deal with this situation, thus avoiding physical absurdity and, consequently, reducing the extrapolation uncertainty.

After the problem statement, in section 3.1, the Bayesian solution is given, first for two frequency measurements and, then, for a list of frequency measurements. Finally, in section 3.5, the Bayesian analysis is applied to data collected in a test measurement. All the symbolic and numerical calculations have been performed with the aid of Mathematica [14].
2. Experimental techniques
The density shift is commonly evaluated by means of a differential measurement approach. The fountain is operated alternating low with high atom density and the frequency of a hydrogen maser is measured using the fountain in the two configurations. By virtue of its frequency stability, in both short and medium time scales, the maser is used as a flywheel oscillator: the comparison of the two measured frequencies cancels the maser frequency and allows the density shift to be evaluated. As reported in [9], if the ratio between the atomic density and the total number of detected atoms is assumed constant, we can state that the density shift is proportional to the number of detected atoms. The differential measurement provides a collisional coefficient which can be used to extrapolate the frequency data to zero density.

The actual experimental practice in the various laboratories differs in the way the atom clouds of different densities are prepared and in the durations of the fountain operation at low and high density. The extrapolation techniques of the […]

$$\hat y = y_1 - \frac{y_2 - y_1}{x_2 - x_1}\, x_1 , \qquad (1)$$

where $\hat y$ is the sought zero-density frequency, $y_1$ and $y_2$ are the frequencies in low- and high-density conditions, and $x_1$ and $x_2$ are the numbers of atoms detected at the low and high density, respectively. The extrapolation is carried out for each $(y_1, y_2)$ pair. The total duration of each run is 27000 s; this ensures a good rejection of systematic effects (fluctuations of the hydrogen-maser frequency, magneto-optical trap, and atom detection efficiency) which could bias the frequency extrapolation.

By neglecting the $\sigma_x$ uncertainty of the atom-density measurements, an acceptable omission provided $\sigma_x \ll \sigma_y / |y_2 - y_1|$, the uncertainty of (1) is

$$\hat\sigma = \frac{\sqrt{x_1^2 + x_2^2}}{x_2 - x_1}\, \sigma_y , \qquad (2)$$

where $\sigma_y$ is the uncertainty of the frequency measurements. Deviations from a linear relationship and an imperfect rejection of long-term effects are estimated to contribute to the uncertainty by 20% of the density-shift value. Here, these non-statistical contributions to the uncertainty have not been considered, though they are part of the total uncertainty budget of atomic clocks. Besides, since the statistical contribution is of the same order of magnitude as the shift, model errors do not contribute significantly.
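As a minimal sketch of the orthodox extrapolation, equations (1) and (2) can be evaluated for a single data pair as follows. The function name `orthodox_extrapolation` and the numerical values are illustrative assumptions, not taken from the measurement data; Python is used here only for illustration, whereas the paper's own calculations were carried out with Mathematica.

```python
import math


def orthodox_extrapolation(x1, y1, x2, y2, sigma_y):
    """Zero-density frequency (1) and its uncertainty (2) from one data pair.

    x1, x2: numbers of detected atoms at low and high density
    y1, y2: frequencies measured at low and high density
    sigma_y: common uncertainty of the two frequency measurements
    """
    if x1 == x2:
        raise ValueError("the two atom numbers must differ")
    slope = (y2 - y1) / (x2 - x1)   # slope of the line through the two points
    y_hat = y1 - slope * x1         # intercept, equation (1)
    sigma_hat = math.sqrt(x1**2 + x2**2) / abs(x2 - x1) * sigma_y  # equation (2)
    return y_hat, sigma_hat


# Illustrative values only (hypothetical, in units of sigma_y).
y_hat, sigma_hat = orthodox_extrapolation(1.0, 0.0, 3.0, -1.0, 1.0)
print(y_hat, sigma_hat)
```

Note that the sketch applies no sign constraint: a pair with $y_2 > y_1$ is extrapolated downwards exactly as a physical pair is extrapolated upwards.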
3. Zero density extrapolation
In the simplest case, we want to perform a linear regression analysis of two clock frequencies, at low and high atom densities, given the prior knowledge that the slope of the sought line is negative. In particular, we are interested in the value of the intercept of the regression line. The data are assumed normally distributed about $a x_i + b$ with the same standard deviation $\sigma_y$, that is,

$$P_\nu(y_i | a, b) = \frac{1}{\sqrt{2\pi}\,\sigma_y} \exp\left[ -\frac{(y_i - a x_i - b)^2}{2\sigma_y^2} \right] . \qquad (3)$$

[…]

$$P_\nu(y_1, y_2 | a, b) = \frac{1}{2\pi\sigma_y^2} \exp\left[ -\frac{(y_1 - a x_1 - b)^2 + (y_2 - a x_2 - b)^2}{2\sigma_y^2} \right] . \qquad (4)$$

The question is how to account for the $a < 0$ […] $b$.

Here and in the following, we will use the notation $P_r(r_i | s_j)$ to indicate the probability density that the quantity $r$ has the particular value $r = r_i$ if the parameter $s$ in the probability distribution has the particular value $s = s_j$. Irrelevant conditionals, such as $x_i$ and $\sigma_y$ in (3), will be dropped. The Bayesian approach takes account of $a < 0$ […] $P_{ab}(a, b)$ of the regression coefficient and intercept values before the measurement result is known; in the absence of any additional information let us assume that

$$P_{ab}(a, b) = \vartheta(-a) , \qquad (5)$$

where $\vartheta(\cdot)$ is the Heaviside function. According to the Bayes theorem the post-data probability density of the $a$ and $b$ values is

$$P_{ab}(a, b | y_1, y_2) \propto P_\nu(y_1, y_2 | a, b)\, P_{ab}(a, b) \propto \exp\left[ -\frac{(y_1 - a x_1 - b)^2 + (y_2 - a x_2 - b)^2}{2\sigma_y^2} \right] \vartheta(-a) , \qquad (6)$$

where a meaningless normalization coefficient has been omitted. This probability density embeds both the $a < 0$ […] $a$ is integrated out from (6) by marginalization. Hence [14],

$$P_b(b | y_1, y_2) = \int_{-\infty}^{0} P_{ab}(a, b | y_1, y_2)\, \mathrm{d}a \propto \exp\left[ -\frac{(b - \hat y)^2}{2\hat\sigma^2} \right] \mathrm{erfc}\left[ -\frac{b - \tilde y}{\sqrt{2}\,\tilde\sigma} \right] , \qquad (7)$$

where $\hat y$ and $\hat\sigma^2$ are the intercept of the line through the data and its variance, given by (1) and (2),

$$\tilde y = \frac{x_1 y_1 + x_2 y_2}{x_1 + x_2} \qquad (8)$$

is the mean of the measured frequencies weighed by the relevant atomic densities,

$$\tilde\sigma = \frac{\sqrt{x_1^2 + x_2^2}}{x_1 + x_2}\, \sigma_y \qquad (9)$$

is the uncertainty of $\tilde y$, and $\mathrm{erfc}(\cdot)$ is the complementary error function. The post-data probability density is shown in Fig. 1. The Gaussian factor is the probability density we obtain by a classical analysis: it is the probability density of the orthodox extrapolation, having $\hat y$ mean and $\hat\sigma^2$ variance. The erfc factor, originating from the application of the Bayes theorem and from marginalization, takes account of the $a < 0$ […] $\tilde y$.

Figure 1. Probability density of the intercept values; $x_2/x_1 = 3$, solid (red) is $(y_2 - y_1)/\sigma_y = -4$, dashed (blue) is $(y_2 - y_1)/\sigma_y = 0$, dotted (black) is $(y_2 - y_1)/\sigma_y = +4$.

The probability density (7) is the result of the Bayesian analysis. To convert it into a single numerical estimate, a loss associated with the estimate error must be specified. The optimal estimator minimizes the expected loss over (7). A loss proportional to the squared or absolute errors indicates the mean or the median, respectively; a constant loss indicates the most probable value. Confidence intervals are easily expressed by integrating (7) to obtain the cumulative distribution function.

Figure 2 shows the mean and standard deviation of the zero-density frequency. In the limit case when $(y_2 - y_1)/\sigma_y \ll 0$, the mean tends to the $\hat y$ intercept of the line through the data, but with larger uncertainty. This is explained by observing that, via marginalization, the Bayesian analysis takes account of all the possible regression coefficients. The understanding of the $(y_2 - y_1)/\sigma_y \gg 0$ […]

$$\lim_{(y_2 - y_1)/\sigma_y \to \infty} P_b(b | y_1, y_2) = \frac{1}{\sqrt{\pi}\,\sigma_y} \exp\left[ -\frac{(b - \bar y)^2}{\sigma_y^2} \right] , \qquad (10)$$

where $\bar y = (y_1 + y_2)/2$. Therefore, the intercept mean approaches the sample mean. In this case, the data are inconsistent with the a priori knowledge on $a$, but we did not allow room for such inconsistency when formulating the problem and, though at least one of the data is clearly wrong, we do not know which it is. Consequently, the best we can do is to fit the data with a line satisfying the $a < 0$ […] $a = 0$; the Bayesian inference is slightly greater because it accounts for all the $a < 0$ […] $(y_2 - y_1)/\sigma_y \gg 0$, the extrapolation uncertainty is smaller than the uncertainty of the extrapolation when $(y_2 - y_1)/\sigma_y \ll 0$. The reason is that, when $(y_2 - y_1)/\sigma_y \to \infty$, we did not question the $\sigma_y/\sqrt{2}$ […] $(y_2 - y_1)/\sigma_y \approx 0$, the mean is always greater than $y_1$ and smoothly connects the $\hat y$ and $\bar y$ asymptotes.

When $(y_2 - y_1)/\sigma_y \approx 0$, the Bayesian analysis seems to overestimate the zero-density frequency, which, from an orthodox analysis, is expected to be quite near $y_1$. The supposed overestimation is due to the scarce information delivered by the data and the use of (5): with a uniformly negative regression coefficient, a vertical regression is, a priori, as probable as a horizontal one and the Bayes theorem accounts for both possibilities. In appendix A, we give the result of a Monte Carlo simulation which confirms that (5) describes correctly the pre-data probability density, when no information, apart from $a < 0$, is available. To summarize the results, we used the mean and the standard deviation as calculated from (7). However, when $(y_2 - y_1)/\sigma_y \approx$ […] $\{x_1 = 1, y_1 = 0\}$. In this case, the marginal probability density of the best-fit line intercept is the improper distribution

$$P_b(b | x_1) \propto \mathrm{erfc}\left( -\frac{b}{\sqrt{2}\,\sigma_y} \right) , \qquad (11)$$

where roughly all the frequencies greater than the measured value are equally probable. However, as an increasing number of data become available, the post-data distribution is dominated by the likelihood function; $P_{ab}(a, b)$ becomes irrelevant, and we are led to the same conclusion irrespective of the prior probability density.

Figure 2. Mean and standard deviation of the zero-density frequency, given $\{x_1 = 1, y_1\}$ and $\{x_2 = 3, y_2\}$. The left and right sides of the diagram correspond to physical and unphysical values of the collisional coefficient, respectively. The coloured area indicates the extrapolation uncertainty. The straight lines (asymptotic limits) are $b = y_1 - (y_2 - y_1)/2$ and $b = (y_1 + y_2)/2$. […] $\{1, y_1\}$ and $\{3, y_2\}$ pairs are observed three times.

To extend the previous analysis to $N$ measurement pairs $\{x_i, y_i\}$, we must rewrite the joint probability density of the data as

$$P_\nu(\mathbf{y} | a, b) = \prod_{i=1}^{N} N(y_i | a x_i + b, \sigma_i^2) , \qquad (12)$$

where $\mathbf{y}$ is the vector of the measured frequencies, $N(y_i | \nu_i, \sigma_i^2)$ is a normal distribution with $\nu_i$ mean and $\sigma_i^2$ variance, and we assumed the data independent and identically distributed. Hence, the joint post-data probability density of the regression coefficient and zero-density frequency is

$$P_{ab}(a, b | \mathbf{y}) \propto P_\nu(\mathbf{y} | a, b)\, \vartheta(-a) \propto \exp\left[ -\frac{1}{2} \begin{pmatrix} a - \hat a & b - \hat y \end{pmatrix} C_{ab}^{-1} \begin{pmatrix} a - \hat a \\ b - \hat y \end{pmatrix} \right] \vartheta(-a) , \qquad (13)$$

where $\hat a$, $\hat y$, and $C_{ab}$ are the least-squares estimates and covariance matrix of $a$ and $b$. Finally, in the same way as in (7), the parameter $a$ is integrated out of the problem by marginalization. There is no additional insight in integrating (13) analytically, but it must be noted that, when $\hat a/\sigma_a \ll 0$, where $\sigma_a$ is the least-squares uncertainty of $\hat a$, then $E(b) \to \hat y$, where $E(b)$ is the $b$ mean over the $P_b(b | x_i, y_i)$ distribution. On the contrary, when $\hat a/\sigma_a \gg 0$, then $E(b) \to \bar y$, where $\bar y$ is the sample average of the data. In the general case, the smaller the uncertainty of the least-squares line is, the closer is the extrapolated frequency $E(b)$ to $\hat y$, if $\hat a < 0$, or to $\bar y$, if $\hat a > 0$. This is shown in Fig. 2 for the particular case when the same $\{1, y_1\}$ and $\{3, y_2\}$ pairs are observed three times.

We can also process the data pairs sequentially. In this case, we start with (6) based on the first data pair. This probability density substitutes for (5) as the pre-data probability density in the analysis of the second data pair. When this procedure is repeated up to the last pair, we obtain the same post-data distribution as that obtained with the one-step approach. To better understand the Bayesian analysis, let us consider three identical data pairs, where $(y_2 - y_1)/\sigma_y$ is close to zero. By starting with (6), given the first data pair, and by repeating the analysis given the second and, then, given the third, we note that the Bayesian extrapolation changes each time a new pair is made available. This result is in striking contrast with the orthodox analysis where, if we use three identical data pairs, the extrapolation uncertainty is reduced, but the extrapolated value remains the same.

This apparent paradox is explained by observing that, when applying the Bayes theorem for the first time, the only piece of information is that the regression coefficient value is between zero and minus infinity with equal probability. Hence, the extrapolation is biased towards a frequency value much greater than $y_1$. In the second iteration, additional information is available, namely the result of the first measurement. Both pieces of information are synthesized in the post-data probability density (6), which substitutes for the pre-data probability density (5) and limits extrapolation to a neighbourhood of the classical one. In each subsequent application of the Bayes theorem, we update the pre-data probability density, thus further reducing the discrepancy between orthodox and Bayesian extrapolations.
The Bayes theorem, by prescribing that the regression coefficient is negative and that all its negative values, including those arbitrarily large, have the same prior probability, infers that the classical extrapolation statistically underestimates the zero-density frequency. However, when several measurement results are available, and all are consistent with $a < 0$, the Bayesian extrapolation approaches the classical one.

In the absence of informative data, the sensitivity of the Bayes theorem to prior information, synthesized in the pre-data probability density of the measurand, may be disappointing. This could discourage the use of Bayesian methods, in view of an apparent lack of objectivity. However, a seminal paper by R. T. Cox [15] demonstrates that, in order to make consistent inferences, it is necessary to resort to the Bayes theorem. If we give it up, because we are averse to making an estimate that depends on the prior information, we put our results at the risk of contradictions.

Given its good stability during time intervals from hours to weeks, the hydrogen maser is a common choice as a frequency flywheel for the differential measurements. It is also used as a transfer oscillator when the fountain is used to evaluate the TAI time unit. However, the hydrogen-maser frequency drifts linearly; this drift is usually evaluated and removed by means of orthodox statistical techniques. Since the drift is stable for long time intervals, we can use the knowledge accumulated in the past fountain runs to carry out a full Bayesian simultaneous evaluation of the drift and zero-density frequency, with the use of a two-dimensional linear regression.

This requires a slight modification of the sampling distribution of the data, which are therefore assumed normally distributed about $a x_i + b + c t_i$, where $x_i$ and $t_i$ are respectively the density and epoch related to the frequency value $y_i$, $a$ is the collisional coefficient, $c$ is the hydrogen maser drift and $b$ is the frequency value extrapolated to zero density and the epoch $t = 0$. Hence, the joint probability density of the data becomes

$$P_\nu(\mathbf{y} | a, b, c) = \prod_{i=1}^{N} N(y_i | a x_i + b + c t_i, \sigma_i^2) . \qquad (14)$$

The drift of the hydrogen maser frequency is known; therefore, its pre-data probability density is

$$P_c(c) = N(c | \tilde c, \sigma_c^2) , \qquad (15)$$

where $\tilde c \pm \sigma_c$ is its pre-data estimate.
Finally, the joint post-data probability density of the model parameters is

$$P_{abc}(a, b, c | \mathbf{y}) \propto \prod_{i=1}^{N} N(y_i | a x_i + b + c t_i, \sigma_i^2)\, N(c | \tilde c, \sigma_c^2)\, \vartheta(-a) \qquad (16)$$

and the post-data probability densities of each individual parameter, irrespective of the others, are obtained by marginalization. Then, the marginal post-data probability density of the zero-density frequency is

$$P_b(b | \mathbf{y}) = \iint_{-\infty}^{+\infty} P_{abc}(a, b, c | \mathbf{y})\, \mathrm{d}a\, \mathrm{d}c . \qquad (17)$$
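For the two-point case, the marginal density (7) and its first two moments can be checked numerically. The following is a minimal sketch, assuming $0 < x_1 < x_2$ and a plain Riemann sum on a uniform grid; the function names, the grid size and the grid span are illustrative choices, not part of the original analysis.

```python
import math


def posterior_b(b, x1, y1, x2, y2, sigma_y):
    """Unnormalised marginal density (7) of the intercept b (assumes 0 < x1 < x2)."""
    root = math.sqrt(x1**2 + x2**2)
    y_hat = (x2 * y1 - x1 * y2) / (x2 - x1)      # equation (1), rearranged
    sigma_hat = root / (x2 - x1) * sigma_y       # equation (2)
    y_tilde = (x1 * y1 + x2 * y2) / (x1 + x2)    # equation (8)
    sigma_tilde = root / (x1 + x2) * sigma_y     # equation (9)
    gauss = math.exp(-((b - y_hat) ** 2) / (2 * sigma_hat**2))
    return gauss * math.erfc(-(b - y_tilde) / (math.sqrt(2) * sigma_tilde))


def mean_and_std(x1, y1, x2, y2, sigma_y, n=20001, span=12.0):
    """Mean and standard deviation of (7) from a Riemann sum over n grid points."""
    y_hat = (x2 * y1 - x1 * y2) / (x2 - x1)
    y_tilde = (x1 * y1 + x2 * y2) / (x1 + x2)
    sigma_hat = math.sqrt(x1**2 + x2**2) / (x2 - x1) * sigma_y
    lo = min(y_hat, y_tilde) - span * sigma_hat
    hi = max(y_hat, y_tilde) + span * sigma_hat
    step = (hi - lo) / (n - 1)
    grid = [lo + i * step for i in range(n)]
    weights = [posterior_b(b, x1, y1, x2, y2, sigma_y) for b in grid]
    norm = sum(weights)
    mean = sum(b * w for b, w in zip(grid, weights)) / norm
    var = sum((b - mean) ** 2 * w for b, w in zip(grid, weights)) / norm
    return mean, math.sqrt(var)


# The three cases of Figure 1: x2/x1 = 3 and (y2 - y1)/sigma_y = -4, 0, +4.
for delta in (-4.0, 0.0, 4.0):
    print(delta, mean_and_std(1.0, 0.0, 3.0, delta, 1.0))
```

For strongly negative $(y_2 - y_1)/\sigma_y$ the computed mean should be close to $\hat y$, and for strongly positive values it should lie slightly above $\bar y$, in line with the asymptotes discussed for Fig. 2.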
4. Frequency extrapolation in a Cs fountain
Table 1 records measurement results and the relevant uncertainties, collected during a TAI unit evaluation run of the IT-CsF1 fountain at INRIM. The density values have been so scaled that the mean low density is unitary; the frequency values have been given in units of the $\sigma_\nu = 3.\ldots \times 10^{-\ldots}\, \nu_{\rm Cs}$ uncertainty, where $\nu_{\rm Cs} = 9\,192\,631\,770$ Hz, and have been so shifted that the mean low-density frequency is one. The same data are shown in Fig. 3.

Some of the low- and high-density data pairs in Table 1 lie on a positive-slope line, in agreement with Gaussian dispersion of the frequency data. By the orthodox approach, the prior information is not taken into account and these positively-sloped data contribute to the best-fit line with the same weight as the others. It is not so when a Bayesian analysis is made. By applying the model described in 3.5 to the data in Table 1, the marginal post-data probability density of the zero-density frequency is given by (17) and it is shown in Fig. 4, where we used the $\tilde c = (0.\ldots \pm \ldots)\, \sigma_\nu$/day pre-data estimate of the hydrogen maser drift.

Table 1. Clock frequency vs. atom density. Measurements have been performed alternately at low and high densities at equal 0.313 day intervals; time increases from top to bottom and, then, from left to right. $\rho_{\rm low}$ and $\nu_{\rm low}$ are the mean low-density and low-density frequency values, $\sigma_\nu = 3.\ldots \times 10^{-\ldots}\, \nu_{\rm Cs}$ is the mean uncertainty of frequency data. Columns: $x/\rho_{\rm low}$ and $(y - \nu_{\rm low})/\sigma_\nu$. [Table entries not recoverable.]

Figure 3. Results of low (red squares) and high (blue bullets) density frequency measurements. The hydrogen-maser drift has been removed and the line is the intersection of the best-fit plane with $t = 0$. [Axes: $\rho/\rho_{\rm low}$; $(\nu - \nu_{\rm low})/\sigma_\nu$.]

After the relevant marginalization, the mean and standard deviation of the hydrogen maser drift, collisional coefficient, and zero-density frequency have been calculated and are shown in Table 2 [14]. Since, in general, the probability densities of these quantities are not Gaussian, the mathematical definitions of the mean and variance, for instance,

$$\mathrm{E}(b) = \int_{-\infty}^{+\infty} \xi\, P_b(\xi | \mathbf{y})\, \mathrm{d}\xi \qquad (18)$$

and

$$\mathrm{var}(b) = \int_{-\infty}^{+\infty} \left[ \xi - \mathrm{E}(b) \right]^2 P_b(\xi | \mathbf{y})\, \mathrm{d}\xi , \qquad (19)$$

[…] $a x_i + b + c t_i$ plane.

When comparing the Bayesian and the orthodox frequency extrapolation, we observe that the Bayes theorem allows the orthodox uncertainty to be reduced by 28%. This is in agreement with the best usage of the prior information, which significantly reduces the variability of the regression coefficient. In terms of original relative frequency units, the Bayesian analysis allows the extrapolation uncertainty to be reduced from $9.\ldots \times 10^{-\ldots}\, \nu_{\rm Cs}$ to $7.\ldots \times 10^{-\ldots}\, \nu_{\rm Cs}$. We also checked that the use of a more conservative pre-data estimate of the hydrogen maser drift, e.g., $(0.\ldots \pm \ldots)\, \sigma_\nu$/day, does not significantly affect the post-data probability density.

Table 2. Comparison of classical and Bayesian estimates of the hydrogen maser drift, collisional coefficient, and zero-density frequency. $\sigma_\nu = 3.\ldots \times 10^{-\ldots}\, \nu_{\rm Cs}$ is the mean uncertainty of frequency data.

                          classical analysis          Bayesian analysis
H maser drift             (0.… ± …) σ_ν/day           (0.… ± …) σ_ν/day
collisional coefficient   (−… ± …) σ_ν/ρ_low          (−… ± …) σ_ν/ρ_low
zero-density frequency    ν_low + (0.… ± …) σ_ν       ν_low + (0.… ± …) σ_ν

A final consideration concerns the conceptual interpretation of classical and Bayesian results. When summarized by the orthodox best estimate and uncertainty, the available information about the zero-density frequency is synthesized by the realization of a random variable (the extrapolated frequency value) and by a measure of the width of the distribution from which this realization has been drawn (e.g., the variance). On the contrary, the Bayesian analysis synthesizes the information by a measure of the measurand-distribution location and by a coverage interval containing the value of the zero-density frequency with stated probability.
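A coverage interval with stated probability can be read off the cumulative distribution of any tabulated post-data density. The sketch below is a hypothetical helper, not the procedure used for Table 2: it assumes a uniform grid and an equal-tailed interval (one possible convention among several), and it is demonstrated on a standard normal density rather than on the fountain data.

```python
import math


def coverage_interval(grid, density, probability=0.95):
    """Lower bound, median and upper bound of an equal-tailed coverage interval.

    grid: uniformly spaced abscissas; density: unnormalised density values.
    The cumulative distribution is approximated by a running Riemann sum.
    """
    total = sum(density)
    tail = (1.0 - probability) / 2.0
    levels = {"lower": tail, "median": 0.5, "upper": 1.0 - tail}
    found = {}
    cumulative = 0.0
    for b, p in zip(grid, density):
        cumulative += p / total
        for name, level in levels.items():
            # Record the first grid point at which each level is reached.
            if name not in found and cumulative >= level:
                found[name] = b
    return found["lower"], found["median"], found["upper"]


# Demonstration on a standard normal density (illustrative, not fountain data).
grid = [-8.0 + 0.001 * i for i in range(16001)]
density = [math.exp(-0.5 * b * b) for b in grid]
lower, median, upper = coverage_interval(grid, density, 0.95)
print(lower, median, upper)
```

The same helper could be applied to a tabulated version of (7) or (17); for skewed densities such as (7) the equal-tailed interval is generally not the shortest one.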
5. Conclusions
When extrapolating the frequency of a caesium-fountain clock to zero atom-density, the Bayes theorem makes it possible to take account of a negative correlation between […] $y_1 > y_2$ constraint.

We have considered the accuracy evaluation of the INRIM Cs fountain during standard operation; the uncertainty contribution due to atomic density shift is reduced by 28%, from $9.\ldots \times 10^{-\ldots}$ to $7.\ldots \times 10^{-\ldots}$. In this application, the frequencies are averaged classically over 21000 s (low-density configuration) and 6000 s (high-density configuration); the Bayesian analysis is then applied to the resulting set of data. Since these averages do not account for a negative collisional coefficient, future work will be aimed at investigating the optimal split of the 27000 s block into shorter ones to reduce the influence of the classical averaging.

Figure 4. Post-data probability density of the zero-density extrapolated frequency. [Axes: $(\nu - \nu_{\rm low})/\sigma_\nu$; probability density.]

Appendix A. Inverse Monte Carlo simulation
The probability density of the zero-density frequency (7) was also calculated numerically by inverse Monte Carlo simulation. In direct Monte Carlo simulation a random list of measurement results, having a fixed measurand value, is generated by repetitions of a numerical experiment. On the contrary, in inverse simulation a random list of measurand values, having a fixed measurement result, is generated.

A brute-force approach exemplifies the simulation procedure. Let the measured values of the low- and high-density frequencies be fixed, for example, $y_1 = 0$ and $y_2 = -1$ […] $\sigma_y = 1$. In the inverse simulation, $y_1$ and $y_2$ are sampled from $N(y_i | a x_i + b, 1)$ […] $y_1 = 0$ and $y_2 = -1$, that is, if they happen to be identical to the wanted measurement results, the sampled zero-density frequency $b$ is appended to the Monte Carlo list.

This procedure is thoroughly inefficient. A short-cut is to observe that any arbitrary $\{y_1, y_2\}$ pair, sampled from $N(y_i | a x_i + b, 1)$ […] $a$ and $b$, can be mapped into the wanted $\{0, -1\}$ pair by

$$y_i \to y_i - \frac{(1 + y_2 - y_1)(x_i - x_1)}{x_2 - x_1} - y_1 . \qquad (A.1)$$

The same result would have been obtained if the data pair had been sampled from $N(y_i | \nu_i, 1)$, where

$$\nu_i = \left( a - \frac{1 + y_2 - y_1}{x_2 - x_1} \right) x_i + b + \frac{1 + y_2 - y_1}{x_2 - x_1}\, x_1 - y_1 . \qquad (A.2)$$

Hence, given any $\{y_1, y_2\}$ sampled from $N(y_i | a x_i + b, 1)$ […]

$$b \to b + \frac{1 + y_2 - y_1}{x_2 - x_1}\, x_1 - y_1 , \qquad (A.3)$$

which is obtained by setting $x_i = 0$ in (A.2). Provided the relevant collisional coefficient in (A.2), $a - (1 + y_2 - y_1)/(x_2 - x_1)$, is negative, as requested, (A.3) is appended to the Monte Carlo list; otherwise it is rejected.

A Mathematica script illustrating the inverse Monte Carlo simulation is appended below; the simulation results are shown in Fig. 5: the Monte Carlo frequencies agree with the marginal probability density of the zero-density frequency predicted by (7). We are unable to find out if and where the uniformity of the pre-data distribution has been used in the simulation; therefore, this result confirms that (5) indicates correctly the absence of any prior information, apart from the sign.

(* Inverse Monte Carlo Simulation *)
np = 100000; […]

Figure 5. Monte Carlo histogram of the zero-density frequency, given $\{x_1 = 1, y_1 = 0\}$ and $\{x_2 = 3, y_2 = -1\}$. Solid line is the theoretical prediction (7).

Appendix B. Sufficient statistics for the zero-density frequency and collisional coefficient
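Referring back to the inverse Monte Carlo simulation of Appendix A, the short-cut based on (A.2) and (A.3) can be sketched in Python as follows. This is not the Mathematica script of the paper, which is not reproduced here: the function names and the sampling ranges chosen for $a$ and $b$ are assumptions made for illustration. The Monte Carlo mean is compared with the mean of (7) computed by a Riemann sum.

```python
import math
import random

X1, X2 = 1.0, 3.0     # densities of the target data, as in Figure 5
Y1, Y2 = 0.0, -1.0    # target measurement results, with sigma_y = 1


def inverse_monte_carlo(n_samples, seed=1):
    """Zero-density frequencies compatible with the target pair {0, -1}."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        # Assumed sampling ranges for the line parameters (hypothetical).
        a = rng.uniform(-5.0, 0.0)
        b = rng.uniform(-5.0, 5.0)
        y1 = rng.gauss(a * X1 + b, 1.0)
        y2 = rng.gauss(a * X2 + b, 1.0)
        # Slope correction mapping {y1, y2} onto {0, -1}, as in (A.1).
        k = (1.0 + y2 - y1) / (X2 - X1)
        if a - k < 0.0:                       # slope of (A.2) must be negative
            accepted.append(b + k * X1 - y1)  # mapped intercept, equation (A.3)
    return accepted


def posterior_mean(n=20001, span=12.0):
    """Mean of density (7) for the target data, by a Riemann sum."""
    root = math.sqrt(X1**2 + X2**2)
    y_hat = (X2 * Y1 - X1 * Y2) / (X2 - X1)
    sigma_hat = root / (X2 - X1)
    y_tilde = (X1 * Y1 + X2 * Y2) / (X1 + X2)
    sigma_tilde = root / (X1 + X2)
    num = den = 0.0
    for i in range(n):
        b = y_hat + span * sigma_hat * (2.0 * i / (n - 1) - 1.0)
        p = math.exp(-((b - y_hat) ** 2) / (2 * sigma_hat**2))
        p *= math.erfc(-(b - y_tilde) / (math.sqrt(2) * sigma_tilde))
        num += b * p
        den += p
    return num / den


samples = inverse_monte_carlo(100000)
mc_mean = sum(samples) / len(samples)
theory_mean = posterior_mean()
print(len(samples), mc_mean, theory_mean)
```

In this sketch the mapped slope and intercept depend only on the sampled noise and not on the sampled $a$ and $b$, so the choice of sampling ranges should not affect the resulting list; this is consistent with the remark in Appendix A about the role of the pre-data distribution.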
The post-data probability density (13) holds because $\hat a$ and $\hat y$, the slope and intercept of the best-fit line through the data, are sufficient statistics for the collisional coefficient and zero-density frequency. To demonstrate this, let the sampling distribution (12) be written as

$$P_\nu(\mathbf{y} | \boldsymbol\beta) \propto \exp\left[ -\frac{1}{2} (\mathbf{y} - A\boldsymbol\beta)^T C_y^{-1} (\mathbf{y} - A\boldsymbol\beta) \right] , \qquad (B.1)$$

where $\mathbf{y}$ is the vector of the measured frequencies, $\boldsymbol\beta$ is the vector of the unknowns, $A$ is the design matrix, and $C_y$ the covariance matrix. The exponent of (B.1), $\chi(\boldsymbol\beta)$, is a quadratic form in $\boldsymbol\beta$; hence

$$\chi(\boldsymbol\beta) = \mathrm{const.} - \frac{1}{2} (\boldsymbol\beta - \hat{\boldsymbol\beta})^T C_\beta^{-1} (\boldsymbol\beta - \hat{\boldsymbol\beta}) , \qquad (B.2)$$

where $\hat{\boldsymbol\beta} = C_\beta A^T C_y^{-1} \mathbf{y}$ is the least-squares estimate of $\boldsymbol\beta$ and $C_\beta = (A^T C_y^{-1} A)^{-1}$ its covariance matrix. Finally, by leaving out the terms independent of $\boldsymbol\beta$, which are unessential in (13),

$$P_\nu(\mathbf{y} | \boldsymbol\beta) \propto \exp\left[ -\frac{1}{2} (\boldsymbol\beta - \hat{\boldsymbol\beta})^T C_\beta^{-1} (\boldsymbol\beta - \hat{\boldsymbol\beta}) \right] . \qquad (B.3)$$

References
[1] Levi F, Calonico D, Lorini L, Godone A 2006 IEN-CsF1 primary frequency standard at INRIM: accuracy evaluation and TAI calibrations Metrologia […]
[2]–[7] […] uncertainty level Metrologia […] et al. IEEE Trans on Instr and Meas […] Metrologia […] 139-148
[8] Takayuki K, Fukuyama Y, Koga Y, Abe K 2004 Preliminary evaluation of the Cs atomic fountain frequency standard at NMIJ/AIST IEEE Trans. IM […]
[9] […] Cs Fountain Clocks Phys. Rev. Lett. […]
[10] […] Phys. Rev. Lett. […], 153002
[11] Calonico D, Levi F, Lorini L and Mana G 2008 Bayesian inference of a negative quantity from positive measurement results Metrologia […]
[12] […] Data Analysis: a Bayesian Tutorial (Oxford: Oxford University Press)
[13] Jaynes E T 2003 Probability Theory: the Logic of Science (Cambridge: Cambridge University Press)
[14] Wolfram Research, Inc. 2008 Mathematica Edition: Version 7.0 (Champaign, Illinois: Wolfram Research, Inc.)
[15] Cox R T 1946 Probability, Frequency, and Reasonable Expectation Am. Jour. Phys. 14 […]