A Parameterization Invariant Approach to the Statistical Estimation of the CKM Phase α
Robin D. Morris
USRA-RIACS, 444 Castro St, Suite 320, Mountain View, CA 94306, USA. E-mail: [email protected]
Johann Cohen-Tanugi
Stanford Linear Accelerator Center, 2575 Sand Hill Rd, Mailstop 0029, Menlo Park, CA 94025, USA, and Laboratoire de Physique Théorique et Astroparticules, CNRS/IN2P3 et Université Montpellier 2, Place Eugène Bataillon, F-34095 Montpellier Cedex, France. E-mail: [email protected]
Abstract:
In contrast to previous analyses, we demonstrate a Bayesian approach to the estimation of the CKM phase α that is invariant to parameterization. We also show that, in addition to computing the marginal posterior in a Bayesian manner, the distribution must also be interpreted from a subjective Bayesian viewpoint. Doing so gives a very natural interpretation to the distribution. We also comment on the effect of removing information about B^{00}.

Keywords:
Statistical Methods, CP violation, B-Physics.

Contents

1. Introduction
2. Mirror Solutions in a Simple 2D Problem
3. Extracting the CKM Phase α
4. Conclusions
A. Reparameterization invariance of the marginal posterior pdf over α
B. Parameterizing the CKM Phase α Problem
B.1 The Pivk-LeDiberder Parameterization
B.2 The Explicit Solution Parameterization
B.3 The 1i Parameterization
1. Introduction
A number of papers have been published recently that form a lively debate about the nature of inference in particle physics in general, and about the extraction of the CKM phase α from measured branching ratios and asymmetries in particular (see e.g. [1] and references therein for theoretical motivations and recent experimental results).

The first paper, Charles et al. [2], proposed several different parameterizations of the CKM phase α problem and showed, in their formulation, that different parameterizations resulted in different posterior marginal distributions for α. These different distributions were held to be the result of using flat priors in the different parameterizations. Charles et al. also claimed that the interpretation of p(α) did not correctly identify the 8 known mirror solutions to the CKM phase α problem, and provided a simple 2-dimensional problem which they claimed showed similar features.

Charles et al. is a criticism of the approach taken by the UTfit collaboration [3], and Bona et al. replied in [4]. In that paper the emphasis is shifted from full distributions over α to 95% probability regions, which are shown to be very similar to the 95% confidence intervals given in Charles et al. Bona et al. also note that the identification of the 8 modes in the 1-CL plot of Charles et al. is not robust to slight changes in the values of the observables, and that, in practice, there is plenty of information regarding the hadronic amplitudes which can (and should) be used to remove some of the degeneracy.

Charles et al. replied in [5], criticizing the change of emphasis from p(α) to 95% probability intervals as being an admission that the approach of Bona et al. has significant dependence on the parameterization chosen. They also repeated their criticism that the Bayesian marginal posterior p(α) does not show the expected 8-fold ambiguity.

A paper by Botella and Nebot [6] took another approach, noting that some parameterizations used in the analysis of the CKM phase α problem are inadequate if they go beyond the minimal Gronau and London assumptions [7]. In particular, the "modulus and argument" (MA) and "real and imaginary" (RI) parameterizations of Charles et al. were shown to not uniquely identify α in the parameterization, leading to the leaking of spurious information into p(α). Botella and Nebot identified which parameterizations do not suffer from this problem. They also, however, concentrated on probability regions, though they came tantalizingly close to giving the correct Bayesian interpretation of p(α) in their appendices C and E.

In this paper we show how to perform a Bayesian analysis of the problem that results in the same p(α) for any parameterization. We also show how regarding p(α) as a Bayesian subjective distribution, i.e. one that describes our state of knowledge, allows it to be correctly interpreted in a straightforward manner: it is not sufficient just to use Bayes' Theorem to perform the computation; the result of that computation must also be interpreted from the Bayesian perspective.

We begin by reconsidering the simple 2-dimensional problem with mirror solutions of Charles et al., as it is illustrative of some of the main points we wish to make.
2. Mirror Solutions in a Simple 2D Problem
The problem, from section VIII of [2], is presented as "a theory predicts the expressions of two observables X and Y as functions of the two parameters α and µ":
\[
X = (\alpha + \mu)^2, \qquad Y = \mu^2, \tag{2.1}
\]
where "an experiment has measured the observables from a Gaussian sample of events" with the results:
\[
X = 1.00 \pm 0.07, \qquad Y = 1.00 \pm 0.07. \tag{2.2}
\]
In terms of the assumed physics, only α is of interest.

It is important even at this early stage of the analysis to be clear regarding what is considered an "observable", what is considered a "parameter", and what is meant by saying that an observable has a distribution, or that a parameter has a distribution. Observables are expected to have values that vary with different experimental data sets, and saying that an observable has a distribution quantifies the uncertainty due to a particular data set. Saying that a parameter has a distribution is a Bayesian concept, indicating that there is actually a true, fixed, value, and that the distribution represents our state of knowledge regarding what that value might be.

This distinction is often somewhat artificial, however. Typically the quantities labeled as observables are not actually observed directly; instead they are themselves inferred from observed data. Different data sets will give different distributions over the observables and, consequently in the Bayesian framework, different distributions over the parameters. In equation (2.2), for example, the means and variances for X and Y are the summary results of a particular data set.

The standard approach to computing a joint Bayesian posterior distribution for α and µ is to use equations (2.1) and (2.2) to define a likelihood, and then to combine it with a prior, p(α, µ), on α, µ, giving
\[
p_{\alpha,\mu}(\alpha,\mu|d) \propto \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left(-\frac{[(\alpha+\mu)^2-\bar X]^2}{2\sigma_X^2}-\frac{[\mu^2-\bar Y]^2}{2\sigma_Y^2}\right)p(\alpha,\mu), \tag{2.3}
\]
where d denotes the experimental data and X̄, Ȳ, σ_X and σ_Y are derived by considering the full expression for the likelihood over the individual measurements. They are all functions of d.

(Footnote: This simple form of the likelihood is a result of the assumed Gaussian errors. In general, it will not be expressible in terms of summary statistics. Conditioning explicitly on the data, d_i, i = 1...N_d, of Charles et al.'s "Gaussian sample of events", gives
\[
p(x|d) \propto p(x) \prod_{i=1}^{N_d} \frac{1}{\sqrt{2\pi}\,\sigma_e}\exp\left(-\frac{(x-d_i)^2}{2\sigma_e^2}\right).
\]
It is well known that the product of two Gaussians has variance less than either of the two. As a consequence p(d|x) becomes steadily more peaked as more data is collected (N_d increases). The prior p(x) does not change. Thus, contrary to what is claimed in Charles et al., it is often simple to show that "the relative prior dependence of the posterior distribution is reduced as the statistical information from the measured data is increased".)

This formulation is subject to the standard criticism that different parameterizations require different priors: if, for example, we were to parameterize the problem by α, µ′ where µ′ = µ², then clearly flat priors on µ and µ′ will result in different posterior distributions [8]. The discussion of observables and parameters above motivates an alternative Bayesian analysis, one that results in a posterior distribution that is invariant to the parameterization chosen. In this analysis we first use the observed data to obtain a posterior distribution over X and Y. This requires a prior on the observables, and yields
\[
p_{X,Y}(x,y|d) \propto \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left(-\frac{(x-\bar X)^2}{2\sigma_X^2}-\frac{(y-\bar Y)^2}{2\sigma_Y^2}\right)p(x,y). \tag{2.4}
\]
Placing priors in the space of observables is reasonable: it is here that the experimenter will typically have good prior knowledge, prior knowledge that determined the design of the experiment.

The physical parameters of interest, α, µ, are related to X, Y by the deterministic relationships in equation (2.1). The distribution p_{α,µ}(α,µ|d) is thus computed by the change of variables rule. When the posterior for α, µ is computed in this way, the general result in Appendix A can be used to show that the resulting posterior marginal distribution, p_α(α|d), is invariant with respect to the chosen parameterization of the other variables (in this case, µ).
Figure 1: Left: the posterior distribution p_{α,µ}(α, µ|d); Right: the marginal distribution p_α(α|d).

Changing variables gives
\[
p_{\alpha,\mu}(\alpha,\mu|d) \propto p_{X,Y}(x(\alpha,\mu), y(\alpha,\mu)|d)\left|\frac{\partial(X,Y)}{\partial(\alpha,\mu)}\right|, \tag{2.5}
\]
resulting in
\[
p_{\alpha,\mu}(\alpha,\mu|d) \propto \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left(-\frac{[(\alpha+\mu)^2-\bar X]^2}{2\sigma_X^2}-\frac{[\mu^2-\bar Y]^2}{2\sigma_Y^2}\right)|\mu(\alpha+\mu)| \tag{2.6}
\]
on the assumption of a flat prior p(x, y), and where the factor of 4 is removed because of the multiple solutions. This is plotted in figure 1.

Comparing equation (2.6) with equation (2.3), it is clear that this transformation-of-variables formulation is equivalent to using the prior p(α, µ) ∝ |µ(α + µ)|. In this problem it is straightforward to show that the Jeffreys prior [9], given by √|I(α, µ)| where I() is the Fisher information matrix, is also proportional to |µ(α + µ)|. The Jeffreys prior is the prior that is invariant to transformation of the variables. Thus, computing a posterior p_{X,Y}(x, y|d) using a uniform prior on X and Y, followed by a transformation of variables to give p_{α,µ}(α, µ|d), is equivalent to using a Jeffreys prior on α, µ.

While figure 1 (left) looks very similar in projection to figure 5 in [2], note, however, that the modes of p_{α,µ}(α, µ) are not located at the values of α that are found by substituting the mean values X̄ and Ȳ into equations (2.1). They are shifted because of the presence of the term |µ(α + µ)| in the expression for p_{α,µ}(α, µ) in equation (2.6), coming from the determinant of the Jacobian of the transformation from X, Y to α, µ. In this case the displacement of the modes is small; it is not visible in figure 1. This need not be the case in general, and indeed is not the case for the CKM phase α problem. See section 3.

The simplest way to form the marginal distribution p(α) is to generate samples from the distributions of X and Y, to transform these samples into samples of α and µ, and then to plot a histogram of the samples of α [10]. In this case we generate samples x_i ← N(1.00, 0.07) and y_i ← N(1.00, 0.07), i = 1...N, for some suitably large N, and from each pair (x_i, y_i) we find the four solutions for (α_i, µ_i), namely
\[
\alpha_i = \epsilon_x\sqrt{x_i} - \epsilon_y\sqrt{y_i}, \qquad \mu_i = \epsilon_y\sqrt{y_i}, \tag{2.7}
\]
where ε_x = ±1 and ε_y = ±1. Each of the four (α_i, µ_i) pairs is given weight 1/4.

(Footnote: Note, however, that with finite probability some of the samples x_i and/or y_i will be negative, resulting in imaginary values for µ_i and/or complex values for α_i. This is not a problem with probability theory. What it indicates is that the Gaussian distributions in equation (2.2) are only approximations to the true distributions of X and Y.)

In the right panel of figure 1 we plot the marginal distribution p_α(α|d), which is very similar to figure 6 (bottom) from Charles et al. In their discussion of this figure, Charles et al. state that "if α and µ are fundamental physics parameters, Nature can only accommodate a single pair of values", and criticize the Bayesian approach by saying that the marginal p_α(α|d) only has 3 peaks, with the peak at zero being higher than the other two. This is an incorrect interpretation of the distribution. The distribution is in fact exactly right when interpreted as a Bayesian subjective distribution, as representing our state of knowledge. Nature has chosen one of the four modes visible in the joint distribution p_{α,µ}(α, µ). We do not know which one. On the basis of our knowledge, there are two chances out of four that Nature has chosen α ≈ 0, so our state of knowledge is exactly that α ≈ 0 with probability 1/2, α ≈ −2 with probability 1/4, and α ≈ 2 with probability 1/4. This is precisely what is shown by the distribution in the right panel of figure 1, where the central mode has twice the area of each of the other two modes.

This simple problem has illustrated two of the key points we wish to make, namely that the posterior distribution must be interpreted in a subjective Bayesian manner, and that the posterior distribution in this type of problem can be found by putting priors in the space of observables, and then using the transformation of variables rule to compute the distribution over the parameters derived from the observables. The simple problem is not rich enough to clearly demonstrate that this approach also leads to posterior distributions for α which are independent of the parameterization chosen. To do this, we turn now to the full CKM phase α problem.
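For concreteness, the sampling scheme just described can be sketched in a few lines of Python. This is our own illustration, not the authors' code; it assumes the Gaussian summaries of equation (2.2) and simply discards the occasional negative draws discussed in the footnote above.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000
x = rng.normal(1.00, 0.07, N)        # draws of the observable X, eq. (2.2)
y = rng.normal(1.00, 0.07, N)        # draws of the observable Y
ok = (x > 0) & (y > 0)               # negative draws have no real solution

# The four sign choices of equation (2.7); each branch carries weight 1/4,
# which is automatic when all branches contribute equally to one histogram.
samples = [ex * np.sqrt(x[ok]) - ey * np.sqrt(y[ok])
           for ex in (+1.0, -1.0) for ey in (+1.0, -1.0)]
alpha = np.concatenate(samples)

# Histogramming `alpha` approximates p_alpha(alpha|d): a central mode near 0
# with mass ~1/2, and modes near -2 and +2 with mass ~1/4 each.
hist, edges = np.histogram(alpha, bins=200, density=True)
```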
3. Extracting the CKM Phase α

There are six observable parameters involved in the CKM phase α problem: three CP-averaged branching fractions, B^{+−}, B^{+0} and B^{00}; the direct CP asymmetries C^{+−} and C^{00}; and the B⁰B̄⁰ mixing-induced CP asymmetry, S^{+−}. These have been recently measured by the B-factory experiments BaBar and Belle [1, 11].

The general formula for the branching ratio of a 2-body decay of a meson B can be found in [12] (eqs. 38.16 and 38.17). Specializing to a final state of light mesons, and averaging over CP-eigenstates, yields:
\[
\mathcal{B}^{ij} = \frac{\tau_{B^{i+j}}}{16\pi M_B \hbar}\,\frac{|A^{ij}|^2 + |\bar A^{ij}|^2}{2}
\]
\[
C^{ij} = \frac{|A^{ij}|^2 - |\bar A^{ij}|^2}{|A^{ij}|^2 + |\bar A^{ij}|^2}
\]
\[
S^{+-} = \frac{2\,\mathrm{Im}(\bar A^{+-} A^{+-*})}{|A^{+-}|^2 + |\bar A^{+-}|^2}.
\]

The decay amplitudes can be parameterized in a number of ways. Here we will consider three parameterizations: the Pivk-LeDiberder (PLD) and Explicit Solution (ES) parameterizations considered in Charles et al., and the so-called 1i parameterization from Botella and Nebot. These vary in how they parameterize A^{ij} and Ā^{ij}, but all include α explicitly as one of the parameters. Details of the parameterizations are given in appendix B.

Denote the parameterizations as (α, φ_PLD), (α, φ_ES) and (α, φ_1i), where φ_PLD denotes the other five parameters of the PLD parameterization, and similarly for φ_ES and φ_1i. Denote by O the set of six observables, B^{+−}, B^{00}, B^{+0}, C^{+−}, C^{00} and S^{+−}. Then we have
\[
O = f(\alpha, \phi_{PLD}) = g(\alpha, \phi_{ES}) = h(\alpha, \phi_{1i}),
\]
where the functional forms of f(), g() and h() can be derived from the parameterizations given in appendix B. Table 1 gives the values for the observables and their uncertainty that are used in this work. (Footnote: The values for the observables given in Table 1 are those used in [2], as we wish to compare our method with theirs. Subsequent improved measurements result in the distributions only having four modes. See Appendix B of [6].) Using a uniform prior in the space of observables, these define a multivariate Gaussian posterior, p(O|d), where d is the experimental data.

Table 1: World average values for the observables, from [2].

Observable    B^{+−}             B^{+0}             B^{00}
Mean ± std    (5. ± .) × 10⁻⁶    (5. ± .) × 10⁻⁶    (1. ± .) × 10⁻⁶

Observable    C^{+−}    C^{00}    S^{+−}
Mean ± std    −. ± .    −. ± .    −. ± .

Using the change-of-variables formulation gives
\[
p_{PLD}(\alpha, \phi_{PLD}) = p(f(\alpha, \phi_{PLD})|d)\,|J_f|,
\]
and the marginal distribution for α is given by
\[
p_{PLD}(\alpha) = \int_{\phi_{PLD}} p(f(\alpha, \phi_{PLD})|d)\,|J_f|\,d\phi_{PLD}. \tag{3.1}
\]
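In practice the marginal (3.1) is approximated by drawing observable sets from p(O|d) and mapping each draw through the inversion of the chosen system. A schematic of that pipeline follows; this is a sketch of our own rather than the authors' code. The function name is hypothetical, the observable means and errors are passed in as arrays, and the inversion routine (for instance the PLD inversion sketched in appendix B.1) is supplied by the caller.

```python
import numpy as np

def sample_alpha_marginal(obs_mean, obs_std, invert, n=100_000, seed=0):
    """Monte Carlo approximation of the marginal p(alpha|d) of eq. (3.1).

    obs_mean, obs_std: central values and errors of the six observables
    (Table 1). invert: maps one draw of the observables to the list of its
    mirror solutions for alpha, raising ValueError for unphysical draws.
    """
    rng = np.random.default_rng(seed)
    alphas = []
    for _ in range(n):
        o = rng.normal(obs_mean, obs_std)  # draw from the Gaussian posterior
        try:
            alphas.extend(invert(*o))      # each of the k solutions gets weight 1/k
        except ValueError:
            continue                       # the Gaussian is only an approximation
    return np.array(alphas)                # histogram to approximate p(alpha|d)
```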
Figure 2: The first three plots show the marginal posterior distributions for α under the PLD, ES and 1i parameterizations, generated by inverting the systems. The short vertical red lines on the top left plot indicate the central values obtained by [2]. Legends indicate the tuple of signs corresponding to each mode. The final plot is of samples generated from the PLD parameterization using the tempered transitions MCMC scheme. Binning in α is identical for all figures. In the first three plots, a sample of 100000 sets of observables is drawn, and the choice of signs, as indicated by the legends, allows each mode to be determined separately. As a result, the sum histograms have 800000 non-independent entries. The fourth histogram is of size 100000.

Similarly,
\[
p_{1i}(\alpha) = \int_{\phi_{1i}} p(h(\alpha, \phi_{1i})|d)\,|J_h|\,d\phi_{1i}. \tag{3.2}
\]
In appendix A we show that under reasonable conditions these marginal distributions are identical, i.e. that the marginal posterior distribution for α is independent of the chosen parameterization. This should not be surprising: the same information on the same observables gives the same information about the same physical parameter.

In figure 2 we plot histograms representing the three marginal posterior distributions. The samples were generated by sampling the observables and inverting the systems. (Footnote: If we choose to use non-flat priors on the observables, then we can generate samples representing the distribution p(α, φ) by generating samples from the observables, weighting each sample by the prior, and then re-sampling the set of weighted samples to give samples from the posterior. See [10] for details.) As expected, the three histograms are essentially identical. We also show a histogram of samples generated using the PLD parameterization and a Markov chain Monte Carlo algorithm [13]. As expected, this histogram is the same as the others. It is included to demonstrate that our approach is not restricted to cases where the system can be inverted. Care must be taken in choosing the MCMC scheme, as the distribution is strongly multimodal. We used the tempered transitions scheme of [14], which successfully sampled the 8 modes of the distribution.

If we consider the modulus-and-argument (MA) parameterization [3, equation 4], we will not, however, recover the same distribution for α. The determinant of the Jacobian for the MA system is identically zero for α = 0, and this results in a spurious zero in probability at α = 0, the remainder of the distribution being identical to our figure 2. This adds to the discussion in [6] that the MA parameterization, by going beyond the minimal Gronau and London assumptions, is unsuited to the analysis of this problem.

The histograms generated by inverting the systems are clearly composed of 8 modes, one for each of the 8 solutions. (There are two modes that overlap almost totally.) By construction, each of these modes has equal probability mass (= 1/8), even though they are different shapes; the heights and widths vary, but the area beneath each mode is the same. (Footnote: In this case, and in the 2d problem in section 2, it is known by construction that each mode contains the same proportion of the total probability: 1/4 for each mode in the 2d problem and 1/8 for the CKM phase α problem. In general, however, this may not be known in advance. A numerical search routine with random restarts can be used to locate the modes, and the Hessian, H, at each mode can be computed. (Often this will be computed as a by-product of the numerical optimization.) The probability volume in each mode can be approximated by p(θ̂)/√det(H/2π), where θ̂ are the parameters at the mode [16]. Alternatively, samples generated without knowing how many modes are present (e.g. by using the tempered transitions MCMC scheme) can be clustered, and the number of samples in each cluster gives a measure of the probability volume in that mode.) Each possible solution for α has a different uncertainty (due to the complex relationship between α and the observables), but each mode has equal probability to be the one chosen by Nature. (Footnote: The reader is reminded that we are reconsidering the case discussed in Charles et al. A complete analysis of the CKM phase α problem would include additional information which would break the symmetry [15].)

The final marginal distribution is the sum of these 8 modes, which is plotted as the dotted line. This shows a large peak around α = 140° and a number of smaller peaks. Again, this distribution correctly describes our state of knowledge: there are 2 of the 8 modes near α = 140° and, because we don't know which mode Nature has chosen, there are thus 2 chances out of 8 that α ≈ 140°. A peak accounting for a single mode has only 1 chance out of 8, so it has half the area of the peak at α = 140°. This accurately represents our state of knowledge about α.

Also shown on figure 2 are short vertical lines marking the values of α that are found when the mean values for the observables are transformed into the different parameterizations. Again, it comes as no great surprise that the mean of the distribution of the inputs is not transformed to the mean of the distribution of the output, especially when the uncertainty on some of the variables is of the same order as the value itself, and the system of equations is highly nonlinear. (Footnote: We note, however, that as the variances of the observables are reduced, the mean values remaining fixed, the modes do converge to the values given by inverting the mean values.) This also naturally explains why there is still finite probability density at α = 0°/180°.
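The Laplace estimate of per-mode probability mass mentioned in the footnote above can be sketched as follows. This is our own illustration under stated assumptions: `neg_log_post` is a caller-supplied function returning −log p (up to a constant), and the quasi-Newton inverse Hessian returned by BFGS stands in for the exact H⁻¹ at the mode.

```python
import numpy as np
from scipy.optimize import minimize

def mode_mass(neg_log_post, x0):
    """Laplace estimate of the (unnormalized) probability mass in one mode.

    x0 is a starting point in the basin of the mode, e.g. from a random
    restart; BFGS then locates the mode and approximates the inverse
    Hessian of -log p there.
    """
    res = minimize(neg_log_post, x0, method="BFGS")
    # p(theta_hat) / sqrt(det(H / 2*pi)) == p(theta_hat) * sqrt(det(2*pi*H^-1))
    return np.exp(-res.fun) * np.sqrt(np.linalg.det(2 * np.pi * res.hess_inv))
```

Relative masses then follow by normalizing the estimates over all modes located by the random restarts.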
Figure 3: Left: posterior distribution for the '1i' parameterization when C^{00} and B^{00} are uniformly sampled in [−1, +1] and [0, 20 B̄^{00}], respectively. Sampling is identical to figure 2. Right: joint distribution of B^{00} and C^{00} implied by the observations and the assumption of isospin symmetry.

As the methodology presented in this work relies on the one-to-one relationship (up to discrete ambiguities) between the observables {B^{+−}, B^{+0}, B^{00}, C^{+−}, C^{00}, S^{+−}} and the underlying isospin amplitude representation, the analysis of the case when B^{00} and C^{00} are not measured is not in general possible once the system has been inverted. For instance, although the PLD representation presents the very appealing feature that α appears in the system (B.2) only in the expressions for B^{00} and C^{00}, and therefore cannot be determined when the latter are not measured, this feature is not obvious anymore in the inverted system (B.3). This is equivalent to the fact, already mentioned in Botella and Nebot (section C.1), that {B^{00}, C^{00}} are algebraically constrained by any set of measurements {B^{+−}, B^{+0}, C^{+−}, S^{+−}} and the assumption of isospin symmetry. As noted by Botella and Nebot, sampling C^{00} uniformly between −1 and +1, and B^{00} between 0 and B^{00}_max, results in a distribution that is much flatter than those shown in figure 2. This distribution does not, however, become flat as B^{00}_max → ∞, because ultimately the shape of the underlying single-mode distributions will be driven by the algebraic constraints from the isospin assumption and by the error propagation from the measured observables. As an illustration, we show in figure 3 the result of the '1i' parameterization for B^{00}_max = 20 B̄^{00}. Increasing the upper bound on B^{00} will not change the final distribution, but will result in more samples being thrown away as incompatible with the constraints on the system. Figure 3 (right) shows a histogram of the samples of B^{00} and C^{00} that were retained. It shows the probabilistic constraints on B^{00} and C^{00} due to the observations and the assumption of isospin symmetry.
4. Conclusions
In the debate concerning the analysis of the CKM phase α problem we have contributed two important points. The first is a formulation of the problem that is invariant to the choice of parameterization. The second is the correct interpretation of the posterior marginal distribution for α as a representation of our state of knowledge.

In the CKM phase α problem the relationships between the parameters of the model and the observables are deterministic. In this case the appropriate statistical technique for finding the distribution over the model parameters is the transformation of variables. This gives us a distribution over the model parameters that summarizes our state of knowledge. It does not, and cannot, tell us if our model is true or false. We have no way of knowing the actual mechanisms of the external universe. We can only generate models of the universe and use data to cast light on these models. However "true" we may think our models are today, better models will certainly be developed tomorrow. The scientific method is composed of the cycle of model formulation, testing against observations, and model revision and development. Bayesian statistics provides many tools to facilitate this process.

Acknowledgments
RDM is supported by the NASA AISR program. The authors thank Stéphane T'Jampens, Roger Barlow and Louis Lyons for helpful discussions.
A. Reparameterization invariance of the marginal posterior pdf over α

We consider a system of N random variables X_i (i = 1...N), which are related to a set of N observables O_i as O_i = f_i(X). We also assume that it is possible to reparameterize the variables X_i into a set Y_i so that X_1 = Y_1 = α, Y_i = φ_i(X), and O_i = g_i(Y). Within the Bayesian framework, we consider a dataset d used to estimate the observables, which yields the posterior pdf p_O(o|d). Under the further hypothesis that f, g and φ are invertible, we can write the marginal posterior on α using the parameterization Y as:
\[
p^Y_\alpha(\alpha|d) = \int\!\cdots\!\int p_O(o|d)\,|J_g|\,dy_2\ldots dy_N \qquad \text{since } p_Y(y|d) = p_O(o|d)\,|J_g| \tag{A.1}
\]
\[
= \int\!\cdots\!\int p_O(o|d)\,|J_g|\,|J_\phi|\,dx_2\ldots dx_N \tag{A.2}
\]
\[
= \int\!\cdots\!\int p_O(o|d)\,|J_f|\,dx_2\ldots dx_N \qquad \text{since } |J_f| = |J_g|\,|J_\phi| \text{ by the chain rule} \tag{A.3}
\]
\[
= p^X_\alpha(\alpha|d) \qquad \text{since } p_X(x|d) = p_O(o|d)\,|J_f|, \tag{A.4}
\]
proving that the marginal posterior on α is parameterization invariant. Thus, if a Bayesian analysis has been performed on the dataset d so that the posterior pdf on the observables is known, the marginal posterior on α obtained by the change of variables Y_i = φ_i(X) is invariant under reparameterization of the N − 1 variables X_i, i = 2...N.
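The key step is the Jacobian chain rule |J_f| = |J_g||J_φ|. A small symbolic check of this identity on the toy problem of section 2 (our own illustration, with ν = µ² playing the role of the reparameterized variable):

```python
import sympy as sp

alpha = sp.symbols('alpha')
mu, nu = sp.symbols('mu nu', positive=True)

# O = f(alpha, mu): the toy model of equation (2.1)
Jf = sp.Matrix([(alpha + mu)**2, mu**2]).jacobian([alpha, mu]).det()

# O = g(alpha, nu) with the reparameterization nu = mu**2
Jg = sp.Matrix([(alpha + sp.sqrt(nu))**2, nu]).jacobian([alpha, nu]).det()

# Y = phi(X): (alpha, mu) -> (alpha, mu**2)
Jphi = sp.Matrix([alpha, mu**2]).jacobian([alpha, mu]).det()

# |J_f| = |J_g||J_phi| on the mu > 0 branch
assert sp.simplify(Jf - Jg.subs(nu, mu**2) * Jphi) == 0
```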
B. Parameterizing the CKM Phase α Problem

We give details here of the three parameterizations: the Pivk-LeDiberder (PLD), the Explicit Solution (ES) and the 1i parameterizations.
B.1 The Pivk-LeDiberder Parameterization
PLD introduces six parameters, α, α_eff, µ, a, ā, ∆, via
\[
A^{+-} = \mu a, \qquad \bar A^{+-} = \mu\bar a\,e^{2i\alpha_{eff}},
\]
\[
A^{+0} = \mu e^{i(\Delta-\alpha)}, \qquad \bar A^{+0} = \mu e^{i(\Delta+\alpha)}, \tag{B.1}
\]
\[
A^{00} = \mu e^{i(\Delta-\alpha)}\left(1 - \frac{a}{\sqrt2}e^{-i(\Delta-\alpha)}\right), \qquad \bar A^{00} = \mu e^{i(\Delta+\alpha)}\left(1 - \frac{\bar a}{\sqrt2}e^{-i(\Delta+\alpha-2\alpha_{eff})}\right),
\]
which results in
\[
\mathcal{B}^{+-} = C\tau_{B^0}\mu^2\,\frac{a^2+\bar a^2}{2}
\]
\[
\mathcal{B}^{00} = \frac{C\tau_{B^0}\mu^2}{2}\left(2 + \frac{a^2+\bar a^2}{2} - \sqrt2\left(a\cos(\Delta-\alpha) + \bar a\cos(\Delta+\alpha-2\alpha_{eff})\right)\right)
\]
\[
\mathcal{B}^{+0} = C\tau_{B^+}\mu^2
\]
\[
C^{+-} = \frac{a^2-\bar a^2}{a^2+\bar a^2} \tag{B.2}
\]
\[
C^{00} = \frac{(a^2-\bar a^2)/2 - \sqrt2\left(a\cos(\Delta-\alpha) - \bar a\cos(\Delta+\alpha-2\alpha_{eff})\right)}{2 + (a^2+\bar a^2)/2 - \sqrt2\left(a\cos(\Delta-\alpha) + \bar a\cos(\Delta+\alpha-2\alpha_{eff})\right)}
\]
\[
S^{+-} = \frac{2a\bar a}{a^2+\bar a^2}\sin 2\alpha_{eff},
\]
where C = (16π M_B ħ)^{−1}. This system can be solved to give
\[
\mu^2 = \frac{\mathcal{B}^{+0}}{C\tau_{B^+}}
\]
\[
a^2 = K(1 + C^{+-})
\]
\[
\bar a^2 = K(1 - C^{+-})
\]
\[
\sin 2\alpha_{eff} = \frac{S^{+-}}{\sqrt{1-(C^{+-})^2}} \equiv \sin s \tag{B.3}
\]
\[
\cos(\Delta-\alpha) = \frac{(1+C^{+-})K - 2K\frac{\mathcal{B}^{00}}{\mathcal{B}^{+-}}(1+C^{00}) + 2}{2\sqrt{2K(1+C^{+-})}} \equiv \cos t
\]
\[
\cos(\Delta+\alpha-2\alpha_{eff}) = \frac{(1-C^{+-})K - 2K\frac{\mathcal{B}^{00}}{\mathcal{B}^{+-}}(1-C^{00}) + 2}{2\sqrt{2K(1-C^{+-})}} \equiv \cos u,
\]
where we define K = (B^{+−}/B^{+0})(τ_{B⁺}/τ_{B⁰}), and s, t and u as in the final three equations. The fourth equation yields 2α_eff = s or 2α_eff = π − s. The final two equations yield ∆ − α = εt and ∆ + α − 2α_eff = ε′u, respectively, where ε, ε′ = ±1. Combining these gives 2α = εt + ε′u + s or 2α = εt + ε′u + π − s (the signs being arbitrary) as the 8 solutions corresponding to each set of values of the observables.
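A sketch of this inversion as code (our own illustration; the function name and the default lifetime values, in picoseconds, are assumptions for the example). Draws for which an arcsine or arccosine argument falls outside [−1, 1], or a square root argument goes negative, raise ValueError and correspond to unphysical sets of observables:

```python
import math

def alpha_solutions(Bpm, Bp0, B00, Cpm, C00, Spm,
                    tau_B0=1.53, tau_Bp=1.64):
    """All 8 mirror solutions for alpha from the inverted PLD system (B.3)."""
    K = (Bpm / Bp0) * (tau_Bp / tau_B0)
    s = math.asin(Spm / math.sqrt(1.0 - Cpm**2))          # s = 2*alpha_eff
    t = math.acos(((1 + Cpm)*K - 2*K*(B00/Bpm)*(1 + C00) + 2)
                  / (2*math.sqrt(2*K*(1 + Cpm))))
    u = math.acos(((1 - Cpm)*K - 2*K*(B00/Bpm)*(1 - C00) + 2)
                  / (2*math.sqrt(2*K*(1 - Cpm))))
    return [0.5*(eps*t + eps_p*u + s2)
            for s2 in (s, math.pi - s)      # the 2*alpha_eff ambiguity
            for eps in (+1, -1)             # sign of Delta - alpha
            for eps_p in (+1, -1)]          # sign of Delta + alpha - 2*alpha_eff
```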
B.2 The Explicit Solution Parameterization

The Explicit Solution (ES) parameterization [17] begins with the same parameters as the PLD parameterization, and then defines
\[
c = \cos\phi, \quad \phi = \alpha - \Delta, \qquad \bar c = \cos\bar\phi, \quad \bar\phi = \alpha + \Delta - 2\alpha_{eff},
\]
and also s = sin φ, s̄ = sin φ̄. Using the identity 2α = 2α_eff + φ + φ̄ allows the following solution to be derived:
\[
\tan\alpha = \frac{\sin(2\alpha_{eff})\,\bar c + \cos(2\alpha_{eff})\,\bar s + s}{\cos(2\alpha_{eff})\,\bar c - \sin(2\alpha_{eff})\,\bar s + c}
\]
\[
\sin(2\alpha_{eff}) = \frac{S^{+-}}{\sqrt{1-(C^{+-})^2}}, \qquad \cos(2\alpha_{eff}) = \pm\sqrt{1 - \sin^2(2\alpha_{eff})}
\]
\[
c = \sqrt{\frac{\tau_{B^+}}{\tau_{B^0}}}\;\frac{\frac{\tau_{B^0}}{\tau_{B^+}}\mathcal{B}^{+0} + \mathcal{B}^{+-}(1+C^{+-})/2 - \mathcal{B}^{00}(1+C^{00})}{\sqrt{2\,\mathcal{B}^{+-}\mathcal{B}^{+0}(1+C^{+-})}}
\]
\[
\bar c = \sqrt{\frac{\tau_{B^+}}{\tau_{B^0}}}\;\frac{\frac{\tau_{B^0}}{\tau_{B^+}}\mathcal{B}^{+0} + \mathcal{B}^{+-}(1-C^{+-})/2 - \mathcal{B}^{00}(1-C^{00})}{\sqrt{2\,\mathcal{B}^{+-}\mathcal{B}^{+0}(1-C^{+-})}} \tag{B.4}
\]
\[
s = \pm\sqrt{1-c^2}, \qquad \bar s = \pm\sqrt{1-\bar c^2},
\]
where the 8 solutions in the range [0, π] are apparent from the three arbitrary signs.

B.3 The 1i Parameterization
Botella and Nebot introduce the following parameterization:
\[
A^{+-} = \sqrt2\,e^{-i\alpha}\,T^{1/2}(T + iP), \qquad \bar A^{+-} = \sqrt2\,e^{+i\alpha}\,T^{1/2}(T - iP),
\]
\[
A^{+0} = e^{-i\alpha}\,T^{1/2}, \qquad \bar A^{+0} = e^{+i\alpha}\,T^{1/2},
\]
\[
A^{00} = e^{-i\alpha}\,T^{1/2}(1 - T - iP), \qquad \bar A^{00} = e^{+i\alpha}\,T^{1/2}(1 - T + iP),
\]
and writing T and P in terms of real and imaginary parts allows the system of equations for the observables to be inverted in terms of α, T^{1/2}, T_r, T_i, P_r, P_i, in the following way:
\[
T^{1/2} = \sqrt{\frac{\mathcal{B}^{+0}}{\tau_{B^+} C}}
\]
\[
T_r = \frac{2\mathcal{B}^{+0}\tau_{B^0} + (\mathcal{B}^{+-} - 2\mathcal{B}^{00})\tau_{B^+}}{4\mathcal{B}^{+0}\tau_{B^0}}
\]
\[
P_i = \frac{(2\mathcal{B}^{00}C^{00} - \mathcal{B}^{+-}C^{+-})\tau_{B^+}}{4\mathcal{B}^{+0}\tau_{B^0}}
\]
\[
(T_i + P_r)^2 = \frac{\mathcal{B}^{+-}\tau_{B^+}}{2\mathcal{B}^{+0}\tau_{B^0}}(1 + C^{+-}) - (T_r - P_i)^2
\]
\[
(T_i - P_r)^2 = \frac{\mathcal{B}^{+-}\tau_{B^+}}{2\mathcal{B}^{+0}\tau_{B^0}}(1 - C^{+-}) - (T_r + P_i)^2
\]
\[
\alpha = \arctan\left(\frac{\pm\sqrt{b^2 + a^2 - c^2} + a}{c - b}\right),
\]
with a = T_i² − P_i² + T_r² − P_r², b = 2P_iT_i + 2P_rT_r, and c = S^{+−} B^{+−} τ_{B⁺} / (2 τ_{B⁰} B^{+0}); the final equation is the solution for tan α of the relation a sin 2α − b cos 2α = c.
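A quick numerical check (our own illustration) that the two arctan branches indeed solve a sin 2α − b cos 2α = c:

```python
import math

def alpha_from_abc(a, b, c):
    # The two branches of the quadratic in tan(alpha); the remaining
    # sign ambiguities enter through the (T_i +/- P_r) equations above.
    r = math.sqrt(b*b + a*a - c*c)
    return [math.atan((sgn*r + a) / (c - b)) for sgn in (+1, -1)]

# self-check with arbitrary illustrative values
a, b, c = 1.3, -0.4, 0.7
for x in alpha_from_abc(a, b, c):
    assert abs(a*math.sin(2*x) - b*math.cos(2*x) - c) < 1e-9
```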
References

[1] B. Aubert et al. [BaBar Collaboration], Phys. Rev. D 76 (2007) 091102, arXiv:0707.2798.
[2] J. Charles, A. Höcker, H. Lacker, F.R. Le Diberder and S. T'Jampens, hep-ph/0607246, 22 July 2006.
[3] M. Bona et al. [UTfit Collaboration], J. High Energy Phys. 07 (2005) 028.
[4] M. Bona et al. [UTfit Collaboration], Phys. Rev. D 76 (2007) 014015, hep-ph/0701204.
[5] J. Charles, A. Höcker, H. Lacker, F.R. Le Diberder and S. T'Jampens, hep-ph/0703073, March 2007.
[6] F.J. Botella and M. Nebot, arXiv:0704.0174, April 2007.
[7] M. Gronau and D. London, Phys. Rev. Lett. 65 (1990) 3381-3384.
[8] H. Prosper, "Bayesian Analysis", Proceedings of the Workshop on Confidence Limits, CERN, 17-18 January 2000; hep-ph/0006356, June 2000.
[9] R.E. Kass and L. Wasserman, JASA 91, 435, pp. 1343-1370 (1996).
[10] A.F.M. Smith and A.E. Gelfand, The American Statistician 46, 2, pp. 84-88 (1992).
[11] H. Ishino et al. [Belle Collaboration], Phys. Rev. Lett. 98 (2007) 211801, hep-ex/0608035.
[12] W.-M. Yao et al., J. Phys. G 33 (2006) 1.
[13] C.P. Robert and G. Casella, Monte Carlo Statistical Methods, Springer, 2nd edition (2004).
[14] R. Neal, Statistics and Computing 6, 4, pp. 353-366 (1996).
[15] J. Charles et al. [CKMfitter Group], Eur. Phys. J. C 41 (2005) 1, hep-ph/0406184.
[16] D.S. Sivia with J. Skilling, Data Analysis: A Bayesian Tutorial, 2nd edition, Oxford University Press, 2006.
[17] M. Pivk and F.R. Le Diberder, Eur. Phys. J. C 39 (2005) 397-409, hep-ph/0406263.