[PDF] Bayesian approach for g-mode detection, or how to restrict our imagination

Abstract

Nowadays, g-mode detection is based upon a priori theoretical knowledge. By doing so, detection becomes more restricted to what we can imagine. De facto, the universe of possibilities is made narrower. Such an approach is pertinent for Bayesian statisticians. Examples of how Bayesian inferences can be applied to spectral analysis and helioseismic power spectra are given. Our intention is not to give the full statistical framework (much too ambitious) but to provide an appetizer for going further in the direction of a proper Bayesian inference, especially for detecting gravity modes.

Full PDF

aa r X i v : . [ a s t r o - ph ] N ov Astron. Nachr. / AN , No. 88, 789 – 793 (2007) /

DOI please set DOI!

Bayesian approach for g-mode detection, or how to restrict our imagi-nation

T. Appourchaux ,⋆ Institut d’Astrophysique Spatiale, UMR 8617, Universit´e Paris-Sud, Bˆatiment 121, 91045 Orsay Cedex, FranceReceived 9 October 2007Published online later

Key words

Sun – statistics – Bayes – p modes – g modesNowadays, g-mode detection is based upon a priori theoretical knowledge. By doing so, detection becomes more restrictedto what we can imagine.

De facto , the universe of possibilities is made narrower. Such an approach is pertinent for Bayesianstatisticians. Examples of how Bayesian inferences can be applied to spectral analysis and helioseismic power spectra aregiven. Our intention is not to give the full statistical framework (much too ambitious) but to provide an appetizer for goingfurther in the direction of a proper Bayesian inference, especially for detecting gravity modes. c (cid:13) Since the beginning of helioseismology, the detection ofg modes has been the most challenging quest in our ﬁeld.There were claims of g-mode detection (Delache & Scherrer1983; Thomson et al. 1995), none of which were conﬁrmed.Since the conception of the SOHO mission, one of the goalsof this mission was to detect g modes. In 1997, follow-ing the lack of g-mode detection by SOHO experimenters,the Phoebus group was formed, with the aim of detectingg modes. The group set an upper limit to the g-mode am-plitude of 10 mm/s at 200 µ Hz (Appourchaux et al. 2000b).Since then, this lower limit has been even decreased downto about 4.5 mm.s − for a 10-year observation with a sin-glet, and down to 1.5 mm.s − for a multiplet (Elsworth et al.2006).Over the years, the Phoebus group has developed severaltechniques for g-mode detection. Some of which are basedupon having no knowledge of the structure and dynamicsof the Sun. There are also other techniques using a priori knowledge of the Sun such as using rotational splitting pat-terns (Elsworth et al. 2006, and references therein). Recentg-mode detection claims rely upon a new detection tech-nique derived from the asymptotic properties of g-mode pe-riods derived from theoretical models (Garc´ıa et al. 2007).Although these approaches look promising, they are all ba-sed on what we believe we know about the Sun. From thatpoint of view, it is time that we turn to an approach that hasbeen talked about a lot: a Bayesian approach to statisticalinference.First, I will present a short introduction to what I believeI understand about a Bayesian approach versus a frequentistapproach. In the second section, I will show simple exam- ⋆ Corresponding author: e-mail: [email protected] ples of this can be applied to Fourier analysis. In the thirdsection, the Bayesian approach is applied to well known ex-ample in helioseismology. I then conclude and give a tenta-tive roadmap for the future.

The controversy between Bayesians and Frequentists is re-lated to subjective versus objective probabilities. A frequen-tist thinks that the laws of physics are deterministic , while aBayesian ascribes a belief that the laws of physics are trueor operational . The subjective approach to probability wasﬁrst coined by De Finetti (1937), the reading of which isextremely enlightening. For the rest of us, the difference inviews can be summarized by this quote from the Wikipediaencyclopedia:

Whereas a frequentist and a Bayesian mightboth assign probability to the event of getting a head whena coin is tossed, only a Bayesian might assign probability to personal belief in the proposition that there was lifeon Mars a billion years ago, without intending to assert any-thing about any relative frequency. In short frequentists as-sign probability to measurable events that can be inﬁnitelymeasured, while Bayesians assign probability to events thatcannot be measured, like the outcome of sport-related betsfor instance. The Bayesian approach is then related to whatReverend Bayes would have understood as the degree of be-lief . The application of Bayes’ theorem is then referred to as

Bayesian inference . The theorem of Bayes (1763) is known to any kid going tohigh school. It relates the probability of an event A giventhe occurrence of an event B to the probability of the event c (cid:13)

90 T.Appourchaux: Bayesian approach for helioseismology

B given the occurence of the event A, and the probability ofoccurrence of the event A and B. P (A | B) = P (B | A) P (A) P (B) (1)For example, the probability of having rain given the pres-ence of clouds is related to the probability of having cloudsgiven the presence of rain by Eq. (1). The term prior prob-ability is given to P (A) (probability of having rain in gen-eral). The term likelihood is given to P (B | A) (probabilityof having clouds given the presence of rain) . The term pos-terior probability is given to P (A | B) (probability of havingrain given the presence of clouds). The term normalizationconstant is given to P (B) (probability of having clouds ingeneral).This theorem can then be transposed to anything that weknow, or any information known a priori : P (Ω | D , I) = P (Ω | I) P (D | Ω , I) P (D | I) (2)where Ω are the observables for which we seek the posteriorprobability , D is the observed data set, and I is the informa-tion. The prior probability of the observables is given by P (Ω | I) : this is the way to quantify our belief about what weseek. In theory, it looks quite simple but in practice the derivationof Eq (2) could be somewhat complicated. This is especiallydifﬁcult when there are a subset of observables ( Ω b ) thatare not known and need to be eliminated. In that case, itis required to integrate (or marginalize) over the unwanted parameters as follows: P (Ω a | D , I) = Z Ω b P (Ω a , Ω b | D , I)dΩ b (3)Replacing Eq. (2) in Eq. (3), we get: P (Ω a | D , I) = Z Ω b P (Ω a , Ω b | I) P (D | Ω a , Ω b , I) P (D | I) dΩ b (4)with the prior probability expressed using the product ruleas: P (Ω a , Ω b | I) = P (Ω b | Ω a , I) P (Ω a | I) (5)Equation (4) can then be integrated provided that the priorprobability is assigned. If the parameters Ω a and Ω b are supposed not to be cor-related, the prior probability P (Ω a , Ω b | I) (Eq. 5) can thensimply be expressed as: P (Ω a , Ω b | I) = P (Ω a | I) P (Ω b | I) (6)The prior probability will then express what we believe weknow (or not) about the parameters. The most obvious priorprobability is the one that is uniformly distributed over somerange of the parameters Ω b of interest. The choice of the prior is related to the amount of information at our disposal.The role of the prior and its impact on the posterior proba-bility should ideally be as small as possible. Sivia & David(1994) show the impact of various priors on the outcomeof a Bayesian analysis. The ideas developed in this articleare also used by Thierry Toutain (private communication)for inferring high frequency p-mode splitting hampered bymode blending (Appourchaux et al. 2000a).This discussion of the role of the prior is beyond thescope of this article. Here I am just touching the tip of theiceberg. I encourage the reader to seek other sources of in-formation (Jaynes 1987). As soon as the posterior probability is known, we can derivean estimate of the parameter using: < Ω a > = Z Ω a P (Ω a | D , I)dΩ a (7)with the following rms error: σ Ω a = Z (Ω a − < Ω a > ) P (Ω a | D , I)dΩ a (8)This latter expression is used in the next section for the caseof spectral analysis. Bretthorst (1988), using a Bayesian approach to the analy-sis of the time series of a pure sine wave embedded in noisehaving a gaussian distribution, demonstrated that the poste-rior probability for the angular frequency ω of the sine wavecan be written as: P ( ω | D , σ, I) ∝ e C ( ω ) σ (9)where D are the data ( d i taken at time t i ), σ is the rms valueof the noise assumed to be known, and C ( ω ) is the so-calledSchuster periodogram given by: C ( ω ) = 1 N | N X i =1 d i e jωt i | (10)This periodogram that is today called the Discrete FourierTransform was ﬁrst derived by Schuster (1897). Follow-ing this approach, Jaynes (1987) demonstrated using Eq. (8)that for a pure sine wave of amplitude A , the rms error onthe frequency ν = ω/ π is given by: δν = √ π σA √ N T (11)where T is the observing time and N is the number of sam-ples taken. This formula is the same as given by Cuypers(1987) and derived by Koen (1999). This equation showsthat the Rayleigh criterion ( /T ) for frequency resolution isvery pessimistic compared to the precision with which fre-quencies can be measured. This feature provided by puresine wave will be fully used by the CoRoT Data Analysis c (cid:13) stron. Nachr. / AN (2007) 791 Team for classical stellar pulsators as outlined by Appourchaux et al.(2006).This revisitation of spectral analysis by Bretthorst is ex-tremely useful when one wants to understand how to applya Bayesian analysis to helioseismology.

Bayesian inference has recently been used in asteroseismol-ogy by Brewer et al. (2007) and applied to several stars (Bedding et al.2007; Carrier et al. 2007). Unfortunately, the assumptionsabout the stochastic nature of the mode excitation are com-pletely ignored in the formulation of the Bayesian inference,even though simulated spectra do integrate the randomnessof the excitation.The stochastic nature of the mode excitation has beenknown since (Woodard 1984). It leads to the formulation byDuvall & Harvey (1986) of the ﬁtting of the p-mode spec-trum by Maximum Likelihood Estimation. In our case, I usethis formulation in a very similar manner but instead of ex-pressing the posterior probability of, say, the frequency of amode, as a function of the samples taken in time, I will usethe samples taken in frequency. This is because the stochas-tic nature of the excitation of the harmonic oscillator, rep-resenting an eigenmode, prevents us from separating the in-strumental noise and the solar noise. This separation, as wewill see later on, is only possible in the Fourier spectrum.We apply Eq. (2) to the simple example of a single modestochastically excited. In our case, I can then write: P (Ω s | D , I) = P (Ω s | I) P (D | Ω s , I) P (D | I) (12)with Ω s = ( ν , Γ , A, B ) , where ν is the mode frequency, Γ is the mode linewidth, A is the mode amplitude and B isthe noise. The prior probability is given by: P (Ω s | I) = P ( ν | I) P (Γ | I) P ( A | I) P ( B | I) (13)where here I assume that the information about ν has norelation to the information we have about Γ , A and B , andalso for all other pairs of parameters. This is not quite cor-rect for A and Γ but this is the choice of the prior that Imade here. With the assumption of stochastic excitation ofthe modes, it is known that the power spectrum of the eigen-mode is a χ with 2 d.o.f with a mean given essentially bythe mode proﬁle plus noise Duvall & Harvey (1986). Thisassumption is sufﬁcient for deriving the likelihood as: P (D | Ω s , I) = N Y i =1 S ( ν i ) e − siS ( νi ) (14)where N is the number of samples used in the power spec-trum, ν i is the frequency at sample i , s i is the observedpower spectrum at frequency ν i , and S ( ν ) is the mean pow-er spectrum. In the case of single mode with no correlationwith the solar background, I can write: S ( ν ) = A (cid:16) ν − ν )Γ (cid:17) + B (15) Here I assumed, unlike Nigam et al. (1998), that the modehas no asymmetry. Replacing Eqs. (13) and (14) in Eq. (12),the posterior probability can then be expressed as: P (Ω s | D , I) ∝ P (Ω s | I) N Y i =1 S ( ν i ) e − siS ( νi ) (16)If I assume that we know the mode amplitude A , the modelinewidth Γ , the noise B , I obtain after marginalization over these latter variables the following: P ( ν | D , I) ∝ P ( ν | I) N Y i =1 S ( ν i ) e − siS ( νi ) (17)This equation is almost the same as the likelihood used forﬁtting helioseismic power spectra by maximization. For testing the Bayesian inference, I used almost 10 years ofcoeval SOHO data from the LOI and GOLF instrumentsfor testing the Bayes inference. I applied Eq. (17) to theGOLF data for which we know modes have been detected,and to the LOI data for which modes have not been detectedbelow 2000 µ Hz. It is well known that GOLF using solarradial velocities can detect more easily modes below 2000 µ Hz. So the Bayesian approach to the LOI data is here idealbecause we know that modes exist in this frequency range,but they are almost impossible to detect directly. In addition,solar models predict that low frequency modes are ratherinsensitive to surface effects. Therefore the theoretical un-certainty provided by the model of the atmosphere is herealleviated, and the frequency of the mode is only boundedby the uncertainties in the model of the internal structure ofthe Sun.Figures 1 to 3 show results for three typical cases. Fig-ure 1 is for a case where modes are easily detected in both instruments. The error bars are typical of what you can alsoget using MLE estimators. Figure 2 shows a very interest-ing case where I can detect with the LOI instrument a modeat a very low frequency close to 1500 µ Hz. In that case theerror bar on the frequency is quite large. Figure 3 shows thedetection of a mode below 1000 µ Hz with the GOLF in-strument that was also detected by Chaplin et al. (2002). Onthe other hand, there is no detection in the LOI because the posterior probability is now commensurate with prior prob-ability that stated that the mode frequency was uniformlydistributed over that speciﬁc range of 2 µ Hz. In that lattercase, the rms error of the frequency is compatible with afrequency uniformly distributed over 2 µ Hz; the rms in thatcase being close to / √ .Although the results for low frequency low degree pmodes demonstrated that the Bayesian approach could beuseful, the application in the g-mode range lead to resultsfor GOLF similar to those of the LOI for the p modes (See We assumed that the prior probabilities are Dirac δ distributions Luminosity Oscillations Imager, Appourchaux et al. (1997) Global Oscillations at Low Frequency, Gabriel et al. (1997) c (cid:13)

92 T.Appourchaux: Bayesian approach for helioseismology

Fig. 1 (Top) Power spectra smoothed over 1 µ Hz (315 bins) of the GOLF and LOI spectra for an l = 0 mode. (Bottom) Posterior probability for GOLF and LOI data assuming an amplitude of 1 and a noise of 1, and a linewidth of 1 µ Hz. Themean and rms error of the frequency as derived from Eqs. (7) and (8) are indicated on the top of the diagrams.

Fig. 2 (Top) Power spectra smoothed over 0.2 µ Hz (63 bins) of the GOLF and LOI spectra for an l = 0 mode. (Bottom) Posterior probability for GOLF and LOI data assuming an amplitude of 1 a noise of 1, and a linewidth of so 0.2 µ Hz. Themean and rms error of the frequency as derived from Eqs. (7) and (8) are indicated on the top of the diagrams.

Fig. 3 (Top) Power spectra smoothed over 0.01 µ Hz (13 bins) of the GOLF and LOI spectra for an l = 0 mode. (Bottom) Posterior probability for GOLF and LOI data assuming an amplitude of 5 and 0.2, respectively; a noise of 1, and a linewidth0.01 µ Hz. The mean and rms error of the frequency as derived from Eqs. (7) and (8) are indicated on the top of the diagrams. c (cid:13) stron. Nachr. / AN (2007) 793 Figure 3). It is quite clear that additional information needsto be included before concluding for the g-mode range. TheBayesian approach would ideally be suited to the recentclaims of g-mode detection made by Garc´ıa et al. (2007).

The Bayesian approach to the analysis of solar power spec-tra seems extremely promising. One must not forget thatmany aspects have been neglected such as marginalization.On this latter aspect, either one is fond of integration andderives the integrals of Eq. (4), or one uses the so-calledMarkov Chain Monte Carlo (MCMC) algorithm as used byGregory (2005). So far, I have not delved into the subjectbut I sense that the MCMC algorithms will need to be un-derstood before we can proceed to the proper application ofthe Bayesian approach to g-mode detection.Last but not least, this paper could be very far froma proper and accurate description of Bayesian inference.Here I wanted to attract the helioseismic reader to a ﬁeldthat has recently been fast developing in astrophysics. I hopethis paper will trigger the interest of the reader for a genuinetreatment of the Bayesian inference.

Acknowledgements.

I beneﬁted from useful discussions with T.Toutain, F. Baudin, N. Barbey, P. Boumier and N. Aghanim; andfrom a careful reading by J. Leibacher. I also had great pleasure inreading Guy Demoment’s course on

Mod´elisation des incertitutes,inf´erence logique, et traitement des donn´ees exp´erimentales givenat Universit´e Paris Sud - Orsay. SOHO is a mission of internationalcollaboration between ESA and NASA. This paper is the result ofideas and discussions started in the framework of the InternationalSpace Science Institute whose support is greatly acknowledge.

References

Appourchaux, T., Andersen, B., Fr¨ohlich, C., et al. 1997, Sol.Phys., , 27Appourchaux, T., Berthomieu, G., Michel, E., et al. 2006, Dataanalysis tools for the seismology programme (The CoRoT Mis-sion (Eds) M. Fridlung, A. Baglin, J. Lochard and L. Conroy,ESA Publications Division, ESA Spec. Publ. 1306), 377Appourchaux, T., Chang, H.-Y., Gough, D., & Sekii, T. 2000a,MNRAS, , 365Appourchaux, T., Fr¨ohlich, C., Andersen, B., et al. 2000b, ApJ, , 401Bayes, T. 1763, Philosophical Transactions of the Royal Societyof London, , 370Bedding, T. R., Kjeldsen, H., Arentoft, T., et al. 2007, ApJ, ,1315Bretthorst, L. 1988, Bayesian spectrum analysis and parameterestimation (Springer-Verlag, Berlin, available electronically atbayes.wustl.edu/glb/book.pdf)Brewer, B. J., Bedding, T. R., Kjeldsen, H., & Stello, D. 2007,ApJ, , 551Carrier, F., Kjeldsen, H., Bedding, T. R., et al. 2007, A&A, ,1059Chaplin, W. J., Elsworth, Y., Isaak, G. R., et al. 2002, MNRAS, , 979Cuypers, J. 1987, Medelingen van de Koninklijke Academie voorWetenschappen, Letteren en Schone Kunsten van Belg¨ıe, Jg. 49Nr. 3 (Brussel: Paleis der Academi¨een), 21De Finetti, B. 1937, Annales de l’Institut Henri Poincar´e, , 1, 1Delache, P. & Scherrer, P. H. 1983, Nature, , 651Duvall, T. L., Jr. & Harvey, J. W. 1986, in Seismology of the Sunand the Distant Stars, 105Elsworth, Y. P., Baudin, F., Chaplin, W., et al. 2006, in ESA Spe-cial Publication, Vol. 624, Proceedings of SOHO 18/GONG2006/HELAS I, Beyond the spherical Sun, electronic versionGabriel, A. H., Charra, J., Grec, G., et al. 1997, Sol. Phys., ,207Garc´ıa, R. A., Turck-Chi`eze, S., Jim´enez-Reyes, S. J., et al. 2007,Science, , 1591Gregory, P. C. 2005, ApJ, , 1198Jaynes, E. T. 1987, in Maximum entropy and Bayesian spectralanalysis and estimation problems, ed. C. Smith & G. Erickson(D.Reidel, Dordrecht-The Netherlands), 1Koen, C. 1999, MNRAS, , 769Nigam, R., Kosovichev, A. G., Scherrer, P. H., & Schou, J. 1998,ApJ Letters, , L115Schuster, A. 1897, Proc. Royal Soc. of London, , 455Sivia, D. S. & David, W. 1994, Acta Crystallographica, A50 , 703Thomson, D. J., Maclennan, C. G., & Lanzerotti, L. J. 1995, Na-ture, , 139Woodard, M. 1984, PhD thesis, University of California, SanDiego c (cid:13)(cid:13)