[PDF] Power law relating 10.7 cm flux to sunspot number

Abstract

To investigate the relation between observations of the 10.7 cm flux and the international sunspot number so that a physical unit may be ascribed to historical records, both polynomial and power law models are developed giving the radio flux as a function of sunspot number and vice versa. Bayesian data analysis is used to estimate the model parameters and to discriminate between the models. The effect on the parameter uncertainty and on the relative evidence of normalizing the measure of fit is investigated. The power law giving flux as a function of sunspot number is found to be the most plausible model and may be used to estimate the radio flux from historical sunspot observations.

Full PDF

aa r X i v : . [ a s t r o - ph . S R ] S e p Power law relating 10.7 cm ﬂux to sunspot number

Robert W. JohnsonAbstract

To investigate the relation between observa-tions of the 10.7 cm ﬂux and the international sunspotnumber so that a physical unit may be ascribed to his-torical records, both polynomial and power law mod-els are developed giving the radio ﬂux as a function ofsunspot number and vice versa . Bayesian data anal-ysis is used to estimate the model parameters and todiscriminate between the models. The eﬀect on the pa-rameter uncertainty and on the relative evidence of nor-malizing the measure of ﬁt is investigated. The powerlaw giving ﬂux as a function of sunspot number is foundto be the most plausible model and may be used to es-timate the radio ﬂux from historical sunspot observa-tions.

Keywords

That a relation exists between the 2800 MHz 10.7cm solar radio ﬂux observed by ground stations andthe sunspot number as deﬁned by Wolf has long beenknown (Covington 1969; Hathaway et al. 2002). Thecorrespondence of the 10.7 cm ﬂux with other indicatorsof solar activity as well as mechanisms for its origin arediscussed by Tapping and Detracey (1990). That solarmagnetic activity correlates with various geophysicalprocesses is now well established (Labitzke and van Loon1997; Svensmark 1998; Johnson 2009, 2010), and thesunspot number provides our longest continuous recordof its level. Putting the sunspot number onto a footingwith physical units is of intrinsic interest to the solartheorist.

Robert W. JohnsonAlphawave Research, Atlanta, GA 30238, USA

Building a mathematical model to describe the re-lation between two quantities of physical interest is apopular pastime, and deciding whether to accept or re-ject a model based on a quality of ﬁt parameter is oftendone. However, the essential question is not “how welldoes this model ﬁt the data” but rather “how muchbetter does this model ﬁt the data relative to anothermodel.” If a single model is all that is available, its qual-ity of ﬁt is irrelevant, as no better idea has presenteditself. In Bayesian analysis (Sivia 1996), it is the ratioof the integrated evidence evaluated at the parametersof best ﬁt which determines the relative plausibility ofthe models under consideration. After evaluating thebest ﬁtting parameters, we will compare their evidenceratios to determine the most plausible model consis-tent with the data. The nonlinearity inherent in thedeﬁnition of the Wolf index proves particularly hard tomodel.

Often when comparing two independent sets of mea-surements, the choice of which data to use for ab-scissa and which for ordinate is not unambiguous. Herewe will consider polynomial and power law modelseach with three parameters relating the internationalsunspot number provided by the World Data Center forthe Sunspot Index, Belgium (SIDC-team 2008), to theadjusted Penticton/Ottawa 2800 MHz solar ﬂux pro-vided by the National Research Council of Canada andavailable through the National Geophysical Data Cen-ter, NOAA, USA. The adjusted ﬂux compensates forvariation in the earth-sun distance. These data sets donot quote variance values, which must then be set tounity for equal weighting of each data value.Previous investigators have usually selected a poly-nomial model for the relation between the solar ﬂux

Table 1

Previous models available online: F ZH and F ZH are from Zhao and Han (2008) and F IPS and R IPS are fromthe IPS unit of the Australian Bureau of MeteorologyModel Parameters F ZH = 60 . . R D F ZH = 65 . . R D + 3 . × − R D − . × − R D F IPS = 67 . . R D + 3 . × − R D − . × − R D R IPS = 1 . F B − . × − F B + 1 . × − F B , F B = F D − . F Z H R D (a)1 4 16 63 2505078122192300 F Z H R D (b)1 4 16 63 2505078122192300 F I PS R D (c)1 4 16 63 2505078122192300 R I PS F D (d)50 78 122192300141663250 Fig. 1

Comparison of polynomial models available onlineusing yearly data values F D and sunspot number R D . The subscript D willbe used to distinguish data values from model values.Zhao and Han (2008) consider both a linear ﬁt anda cubic ﬁt for F ( R D ) using annual values for 1947–2005, and the Ionospheric Prediction Service (IPS) unitof the Australian Bureau of Meteorology (Thompson2010) gives cubic equations for both F ( R D ) and R ( F D )using monthly values from 1947–1990. The radioﬂux is expressed in solar ﬂux units (sfu) equal to10 − W / m / Hz. These model equations, written in aform comparable to that which we will investigate, aredisplayed in Table 1.A graphical comparison of these models using theyearly data values for 1947–2008 is given in Figure 1.We see that the linear model is not capable of match-ing the data at low activity levels, and shortly beyondthe region displayed the cubic models for F ( R D ) in-ﬂect downwards, implying a saturation of radio ﬂux atextreme levels of solar magnetic activity. The corre-sponding inverse relation R IP S ( F D ) does not so inﬂect F Z H R D (a)1 4 16 63 2505078122192300 F Z H R D (b)1 4 16 63 2505078122192300 F I PS R D (c)1 4 16 63 2505078122192300 R I PS F D (d)50 78 122192300141663250 Fig. 2

Comparison of polynomial models available onlineusing monthly data values and is dominated by the cubic term at high ﬂux lev-els. These remarks hold as well for the monthly valuesshown in Figure 2.

Our implementation of Bayesian data analysis drawsprimarily on the text by Sivia (1996). The essential fea-ture which takes it beyond simple regression is the useof a non-uniform prior in appropriate circumstances.Using the language of conditional probabilities (Durrett1994), we write “the probability of A given B underconditions I ” asprob( A | B ; I ) ≡ p ( A | I B ) ≡ p AB (1)when the background information I is unchanging. Thechoice of prior (D’Agostini 1998) represents one’s back-ground knowledge on the likely distribution of a pa-rameter x ∈ [ x , x ] before analysis of the current set of data. A non-uniform prior p xf arises naturally inmany contexts, often representing a prior which is uni-form over a change of variables x → F for some inte-grable function f ( x ) = dF/dx , with normalization p xf =∆ − x f ( x ) for ∆ x ≡ R x x f ( x ) dx such that R x x p xf dx = 1.Besides the uniform prior f − x = 1, one commonly en-counters the Jeﬀreys prior f − x = x uniform over log x and the Cauchy distribution f − x = 1 + x uniform overarctan x .3.1 Parameter estimationOne states Bayes’ theorem in the context of parameterestimation as p X D = p X p D X /p D , (2)reading “the evidence for parameters X given data D equals the prior for X times the likelihood for D given X divided by the chance of measuring D ”. What wecall “the evidence” is often called “the posterior”, asthe normalization constant p D aﬀecting neither param-eter estimation nor model selection is sometimes called“evidence”; both “prior” and “likelihood” have theirusual meaning. The logarithm (base e ) of Equation 2reads L E = L P + L L + D , where the ﬁnal term isa constant equal to − log p D . For independent data D = { D t } indexed by t with Gaussian noise σ , thelikelihood factors as p DX = Q t (2 πσ t ) − / exp( − χ t / χ t ≡ [ M t ( X ) − D t ] /σ t is the weighted residualof the model M , so that L L has one term proportionalto the measure of ﬁt χ ≡ P t χ t and another which isconstant. With the deﬁnition of the merit function interms of the model parameters, − L X − L DX = − X x ∈ X log p x + 12 χ + σ , (3)the problem becomes one of nonlinear global optimiza-tion (Press et al. 1992), seeking a unique solution tothe equation ∇ X L E = 0. Short of evaluating the meritfunction over the entire prior range, one must rely on in-tuition and luck to varying degrees. One’s intuition, en-coded in the form and domain of the prior functions p X ,contributes to the gradient of the log evidence ∇ X L E in the limit of poor data ∇ X L L →

0, thereby improvingthe chances of success.Getting slightly ahead of ourselves, let us remarkhere that the traditional deﬁnition of the measure ofﬁt χ is the unnormalized sum of weighted residualssquared. Recognizing that χ represents the varianceof the data relative to the model, we believe that the normalized sum of weighted residuals is a more appro-priate measure of ﬁt, which one deﬁnes as e χ ≡ X t [ M t ( X ) − D t ] d t , (4)with the normalized weights d t ≡ σ − t / P t σ − t play-ing the role of the discrete measure factor. For N t data values with unit variance σ t ≡

1, the normalizedmeasure of ﬁt reduces to e χ = χ /N t . In the contin-uum limit P → R the measure factor is made appar-ent e χ = R [ M ( t ) − D ( t )] dt , and the normalization isrequired so that the measure of ﬁt is not dependentupon the sampling rate—for an inﬁnite or continuousdata set, the unnormalized χ must be inﬁnite for anymodel which does not perfectly match the data. Re-placing M t with a single parameter model given by theweighted mean of the data D = P t D ( t )( σ − t / P t σ − t )reveals the relationship between the measure of ﬁt andthe variance of the data vector e χ D = σ D . The nor-malization has no eﬀect on the location of the maxi-mum likelihood solution X L while inﬂuencing the rela-tive weighting of likelihood and prior in the expressionfor the evidence, thereby shifting the maximal evidencesolution X E for non-uniform priors; it also aﬀects thewidth of the error bars assigned to the parameter val-ues, as exp( − e χ /

2) = [exp( − χ / /N t .3.2 Model selectionGiven a single model, all one can do is estimate its bestﬁtting parameters—the quality of ﬁt is irrelevant be-yond its role in the likelihood p DX and its evidence maybe normalized to unity. However, faced with a choice ofmodels, Bayes’ theorem allows one to compute their ev-idence ratio R ABE , which reduces to the likelihood ratio R ABE ≡ p A D p B D = p A p D A /p D p B p D B /p D → p D A p D B ≡ R ABL , (5)where p A = p B indicates no prior preference for eithermodel. The null hypothesis of “no relation” is sup-ported only when one can deﬁne a noise model, as thelikelihood cannot be computed for a model B given onlythat M t ( B ) = M t ( A ). The likelihood for each modelis the unnormalized integral of the evidence for its pa-rameters, p D M = Z p D , X M d X = Z p X M p DX ,M d X , (6)and may be identiﬁed as the “integrated probabilitybump” over the model parameters X . There is an un-fortunate confusion of nomenclature in the literaturebecause p D M appears both in the position of chance in Table 2

Summary of prior functions f x and domains[ x , x ] for the various models MM x f − x Yearly Monthly x x x x F B A A C C -0.01 0.01 -0.01 0.01 F B A A

C C R B A A C C -0.01 0.01 -0.01 0.01 R B A A

C C

Equation (2) and in the position of likelihood in Equa-tion (5).Under the quadratic approximation, generally ac-ceptable when the evidence is not severely truncated bythe prior range, one can evaluate the integral analyti-cally to write the negative logarithm of the likelihoodas − L D M ≈ χ + X k log f − k − X k log p π/h k ∆ k ! , (7)for X indexed by k and { h k } the eigenvalues of the in-verse of the variance matrix for the parameters Q k h k =det Σ − X , where the ﬁrst two terms are the value of themerit function evaluated at its minimum and the re-mainder comprise the Occam factor accounting for theratio of the width of the evidence Σ X to the prior vol-ume { ∆ k } . An additional parameter must provide notjust a better ﬁt but a signiﬁcantly better ﬁt in orderfor its plausibility to increase. With several models tochoose from, the one with the lowest value of − L DM isdeemed the most plausible, with the preference factorgiven by the exponential of the diﬀerence between the(negative) log evidence for each. With two functional forms, polynomial and power law,and an arbitrariness to the selection of abscissa andordinate, we consider a total of four models, two for F ( R D ) and two for R ( F D ). As the 10.7 cm ﬂux isobserved never to fall below some background level ∼ B for the background level inall models. Parameter A will be an amplitude, andparameter C will be either another amplitude or the Table 3

Yearly analysis results for best ﬁtting parameters B , A , and C , their standard deviations σ , the quality of ﬁt ∇ L E and − L E , and the evidence ratio R E , for both theunnormalized and normalized log likelihood − L L − L L M B A C ∇ L E R E σ B σ A σ C − L E χ F .

87 0 .

835 0 . − . .

31 0 .

009 0 . . F .

98 0 .

582 1 . − . × .

39 0 .

021 0 . e χ F .

01 0 .

830 0 . − .

62 4.422 .

46 0 .

072 0 . . F .

54 0 .

550 1 . − . .

02 0 .

148 0 . . χ R .

75 1 . − . − . .

25 0 .

010 0 . . R .

01 0 .

402 1 . − . × .

27 0 .

011 0 . e χ R .

63 1 . − . − .

47 13.21 .

98 0 .

077 0 . . R .

30 0 .

388 1 . − . .

04 0 .

083 0 . . exponent in the power law. Speciﬁcally, we considerthe three parameter models given by F = B + AR D + CR D , (8) F = B + ( AR D ) C , (9) R = A ( F D − B ) + C ( F D − B ) , (10) R = A − ( F D − B ) /C , (11)where F D and R D are the data selected for the abscissaand the form of R is chosen to compare directly its pa-rameters with those of F . We will be neglecting anyinﬂuence from a lag between the solar ﬂux and sunspotnumbers (Wilson et al. 1987; Sparavigna 2008). Upona visual inspection of the normalized monthly data se-ries, any lag appears to be negligible at that temporalresolution.We summarize our use of priors in Table 2. A uni-form prior is assigned to B whose domain is adjusted formodel R , which requires B ≤ min { F D } . The Cauchydistribution serves as the prior for the amplitudes ofthe polynomial models, and for the power law modelsthe Jeﬀreys prior is taken for A and C . The Jeﬀreysand Cauchy priors share the property that they maybe used equally for the forward and inverse models of F and R . As p /A | dA − /dA | = p A ∝ A − , one maysubstitute ˜ A = A − to write p ˜ A ∝ ˜ A − , and similarlyfor the Cauchy prior. Our results are not inﬂuencedgreatly by the choice of priors, indicating that the ﬁt isdriven primarily by the likelihood. F R D (a)1 4 16 63 2505078122192300 F R D (b)1 4 16 63 2505078122192300 R F D (c)50 78 122192300141663250 R F D (d)50 78 122192300141663250 Fig. 3

Comparison of the best ﬁtting solutions to theyearly data values using the normalized measure of ﬁt e χ X E is headed by ∇ L E ≡ log |∇ X L E ( X E ) | and the negative log of theintegrated evidence by − L E . The normalization of themeasure of ﬁt is indicated in the ﬁrst column, and theevidence ratio R E is in the last column. As B ( e χ , R ) iswithin 3 σ B of its upper limit, a numerical evaluation ofits integrated evidence is necessary, which diﬀers fromthe approximate value by only a few percent.We see that the various models give slightly diﬀerentestimates for the background radio ﬂux B . The polyno-mial models F and R return a value of about 63 sfu,while the power law model values are higher, around 65sfu for F and 67 sfu for R . These remarks hold foreither normalization of the measure of ﬁt. We comparein Figure 3 the model solutions using the normalizedmeasure of ﬁt for all four models—the solutions for theunnormalized measure of ﬁt are visually indistinguish-able. Compared to Figure 1, one can see that the powerlaw F provides with three parameters a quality of ﬁt onpar with a polynomial of four parameters and does notsuﬀer from inﬂection problems at high levels of solaractivity. Polynomial models are notorious for havingdiﬃculties with extrapolation.While the solution location X E is not greatly in-ﬂuenced by the choice of χ or e χ in L L , the width Table 4

Monthly analysis results to compare with Table 3 − L L M B A C ∇ L E R E σ B σ A σ C − L E χ F .

72 0 .

900 0 . − . .

09 0 .

003 0 . . F .

72 0 .

686 1 . − . × .

12 0 .

007 0 . e χ F .

87 0 .

895 0 . − .

37 4.552 .

49 0 .

071 0 . . F .

36 0 .

645 1 . − . .

19 0 .

178 0 . . χ R .

42 1 . − . − . ∼ .

06 0 .

002 0 . R .

50 0 .

281 1 . − . .

00 0 .

001 0 . e χ R .

35 1 . − . − . .

67 0 .

061 0 . .

51 5 . R .

50 0 .

279 1 . − . × − .

01 0 .

028 0 . . of the marginal error bars is greater when using thenormalized variance. This change in the width of theevidence has a strong impact on the evaluation of its in-tegral through the Occam factor in Equation (7). Con-sequently, the evidence ratio R E indicating the prefer-ence factor for the power law over the polynomial modelis vastly diﬀerent for the two choices of L L . With suchsimilarity in the model solutions X E ( χ ) and X E ( e χ ),it is hard for us to countenance a preference factor onthe order of 10 or even 10 . Using the normalizedmodel variance e χ gives a preference factor ∼

10 for thepower law models. Furthermore, it seems reasonable toexpect the variance of the background estimate B foreach model to be on the order of the variance betweenthe models, as is found when using e χ .4.2 Monthly analysisRepeating the analysis using the monthly data values,we ﬁnd the results shown in Table 4. While the assess-ment of the models for F ( R D ) is consistent with that ofthe yearly data, here we ﬁnd that the power law modelfor R ( F D ) is not to be preferred. The reason is becausethe background parameter B is very tightly constrainedto a value just below the minimum of the abscissa data F D . One might consider a modiﬁcation of the model sothat R ( F D < B ) = 0; however, such approach posesdiﬃculties with the analytic evaluation of the gradientof the log likelihood. The estimates of the backgroundfor the models F ( R D ) are lower compared to those fromthe yearly data, while those for R are about the same,as are the remainder of the parameters. We display themodel solutions for the monthly data in Figure 4.As we are most interested in ascribing to the his-torical record of sunspot activity a physical unit based F R D (a)1 4 16 63 2505078122192300 F R D (b)1 4 16 63 2505078122192300 R F D (c)50 78 122192300141663250 R F D (d)50 78 122192300141663250 Fig. 4

Comparison of the best ﬁtting solutions to themonthly data values using the normalized measure of ﬁt e χ on the solar radio ﬂux, the consistency in preferencefor F to F indicates the power law model functionmay be used for either yearly or monthly analysis. Us-ing ψ and µ to indicate the yearly and monthly solu-tions, we are tempted to compare boxes of apples toapples by looking at the diﬀerence between L E ( F ψ )and L E ( F µ ), made possible through the use of the nor-malized measure of ﬁt e χ . Reading the values fromthe tables, one can state that F ﬁts the yearly databetter than the monthly data by a factor of aboutexp(133 − . ∼ . Continuing the analogy tomodels for apples and bananas, one ﬁnds that F ψ ﬁtsbetter than R ψ by a factor exp(45 − . ∼ The primary diﬃculty the models face is in relating theinternational sunspot number derived from the originalWolf index to the physical ﬂux measurements of theS-component oscillation at small magnitudes. It stemsfrom the behavior of R D ≡ k i (10 g + s ) for small valuesof spot and group numbers s and g . The Wolf indexhas a jump from 0 to 11 for the ﬁrst spot observed, andwhile the modern discontinuity is reduced slightly bythe international reduction coeﬃcient k i , it still repre-sents a signiﬁcant source of nonlinearity.An interesting feature of Bayesian model selection isthat the log evidence, Equation (7), contains factors for both the quality of ﬁt and the error bars on the param-eters given by the determinant of the inverse variancematrix. The consequence is that for models with a sim-ilar measure of ﬁt and prior volume, probability theoryactually prefers the one with the larger error bars. Thereason is because a greater range of its parameter spaceyields a model consistent with the data.Concluding, we have considered various models ofthree parameters relating the 10.7 cm solar radio ﬂuxto the international sunspot number. The parametersfound using maximal evidence are consistent with thosegiven by other investigators. Model selection using theevidence ratio indicates that the power law determiningthe solar ﬂux from the sunspot number is most consis-tent with the yearly data values. That model may beused to ascribe to the historical sunspot record a valuein solar ﬂux units. Acknowledgements

Sunspot data provided by theSIDC-team, World Data Center for the Sunspot In-dex, Royal Observatory of Belgium, Monthly Reporton the International Sunspot Number, online catalogueof the sunspot index, 1947–2008. Penticton/Ottawa2800 MHz Solar Flux data provided by the NationalResearch Council of Canada and available through theNational Geophysical Data Center, NOAA, Boulder,Colorado, USA .

References

A. E. Covington. Solar radio emission at 10.7cm.

J. RoyalAstron. Soc. Canada , 63:125, 1969.G. D’Agostini. Jeﬀreys priors versus experienced physi-cist priors - arguments against objective Bayesian the-ory.

ArXiv Physics e-prints , November 1998. URL http://arxiv.org/abs/physics/9811045 . Bayesian Statistics6: Proceedings of the Sixth Valencia International Meet-ing (Oxford Science Publications).Richard Durrett.

The Essentials of Probability . DuxburyPress, A Division of Wadsworth, Inc., Belmont, Califor-nia, USA, 1994.D. H. Hathaway, R. M. Wilson, and E. J. Reichmann. GroupSunspot Numbers: Sunspot Cycle Characteristics.

Sol.Phys. , 211:357–370, December 2002.Robert W. Johnson. Edge adapted wavelets, solar mag-netic activity, and climate change.

Astrophysics andSpace Science , pages 4–+, January 2010. doi: 10.1007/s10509-009-0249-6. URL .Robert W. Johnson. Enhanced wavelet analysis of so-lar magnetic activity with comparison to global tem-perature and the Central England Temperature record.

Journal of Geophysical Research (Space Physics) , 114(A05105), may 2009. doi: 10.1029/2009JA014172. URL .K. Labitzke and H. van Loon. The signal of the 11-year sunspot cycle in the upper troposphere-lower strato-sphere.

Space Science Reviews , 80(3):393–410, May 1997.doi: 10.1023/A:1004907126955. URL .W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P.Flannery.

Numerical Recipes . CUP, Cambridge, England,1992.SIDC-team. The International Sunspot Number.

MonthlyReport on the International Sunspot Number, online cat-alogue , 2008.D. S. Sivia.

Data Analysis: a Bayesian Primer . OUP, Ox-ford, England, 1996.Amelia Sparavigna. Recurrence plots of sunspots, solar ﬂuxand irradiance, 2008. URL http://arxiv.org/abs/0804.1941 .arXiv:0804.1941.Henrik Svensmark. Inﬂuence of cosmic rays on earth’s cli-mate.

Phys. Rev. Lett. , 81(22):5027–5030, Nov 1998. doi:10.1103/PhysRevLett.81.5027.K. F. Tapping and B. Detracey. The origin of the 10.7 CMﬂux.

Sol. Phys. , 127:321–332, June 1990. doi: 10.1007/BF00152171.Richard Thompson. The sun and solar activity – the tencentimetre solar radio ﬂux. IPS - Radio and Space Ser-vices, Bureau of Meteorology, Australia, 2010. URL .Robert M. Wilson, Douglas Rabin, and Ronald L. Moore.10.7-cm solar radio ﬂux and the magnetic complexity ofactive regions.

Sol. Phys. , 111(2):279–285, September1987. doi: 10.1007/BF00148520. URL .Juan Zhao and Yan-Ben Han. Historical dataset reconstruc-tion and a prediction method of solar 10.7cm radio ﬂux.