Error formulae for the energy-dependent cross-spectrum
MMNRAS , 1–13 (2019) Preprint 5 September 2019 Compiled using MNRAS L A TEX style file v3.0
Error formulae for the energy-dependent cross-spectrum
Adam Ingram (cid:63) Department of Physics, Astrophysics, University of Oxford, Denys Wilkinson Building, Keble Road, Oxford OX1 3RH, UK
Accepted 2019 August 28. Received 2019 August 20; in original form 2019 June 28.
ABSTRACT
I present analytic error formulae for the energy-dependent cross-spectrum and rmsspectrum, which are Fourier statistics widely used to probe the rapid X-ray variabil-ity observed from accreting compact objects. The new formulae cover the modulus,phase, real and imaginary parts of the cross-spectrum, and are valid for any valueof intrinsic coherence between variability in different energy bands. I show that ex-isting error formulae (including that for the phase lag), which are valid for a singlecross-spectrum or power spectrum, lead to over-fitting when applied to the energy-dependent cross-spectrum - which consists of cross-spectra between individual energychannels and a common reference band. I also introduce an optimal, unbiased wayto define the reference band and an accurate way to calculate the intrinsic coherencebetween energy bands. I find that the traditional use of the old formulae has likely hada rather benign impact on the literature, but recommend the use of the new formulaein future wherever appropriate, since they are more accurate and are no harder toimplement than existing error estimates. A code to implement the new error formulaeon observational data is available online.
Key words: methods: data analysis – methods: statistical – X-rays: general
Accreting compact objects exhibit rapid variability in theirX-ray radiation that can be exploited in order to probe themotion of matter in strong gravitational fields and measureintrinsic properties of the compact object itself. The proper-ties of this variability are best studied in the Fourier domain,since typically the signal to noise ratio for a given variabilitytimescale of interest is low but an observation contains manycycles of that timescale. The most widely used Fourier statis-tics are the power spectrum and the cross-spectrum (van derKlis et al. 1987; van der Klis 1989). The cross-spectrum isthe product of the Fourier transform of one time series withthe conjugate of the Fourier transform of another, and thepower spectrum is the special case whereby the two timeseries are the same. It follows from the convolution theoremthat the cross-spectrum is the Fourier transform of the cross-correlation function and the power spectrum is the Fouriertransform of the auto-correlation function. The modulus andphase of the cross-spectrum respectively indicate the ampli-tude of correlated variability and the phase lag between thetwo time series as a function of Fourier frequency.Fourier domain studies of black hole X-ray binaries andactive galactic nuclei (AGN) reveal strong X-ray variabilityover several decades in Fourier frequency (e.g. van der Klis2006; McHardy et al. 2006) that is highly coherent across (cid:63)
E-mail: [email protected] different energy channels (Vaughan & Nowak 1997; Nowaket al. 1999; Uttley et al. 2014). Phase lag measurements us-ing the cross-spectrum reveal that, at low Fourier frequen-cies ( ν (cid:46) M / M (cid:12) ; where M is the mass of the black hole)hard photons lag soft photons. This behaviour is generallyattributed to inward propagation of mass accretion rate fluc-tuations (e.g. Lyubarskii 1997; Kotov et al. 2001; Ingram &van der Klis 2013). At higher frequencies, the phase lags be-tween energy bands are thought to instead be dominated bylight-crossing lags between directly observed rays and thosethat have been reprocessed by the accretion disc ( reverber-ation lags ; Fabian et al. 2009; De Marco et al. 2013; Uttleyet al. 2014; De Marco & Ponti 2016).A particularly powerful tool is provided by the energy-dependent cross-spectrum , which is a series of cross-spectra,each between the flux from a different energy channel anda common reference band. This enables the phase lag andcorrelated variability amplitude to be plotted as a func-tion of photon energy for different frequency ranges. Re-lated to this is the rms spectrum (rms standing for ‘rootmean squared’), which is derived from the power spectrumof each energy channel integrated over a given frequencyrange. These tools have led to several breakthroughs, includ-ing discovery of an iron K feature in the lag energy spectrumof AGN (Zoghbi et al. 2012) and X-ray binaries (De Marcoet al. 2017; Kara et al. 2019) that is indicative of reverbera-tion and the discovery of a quasi-periodic modulation of theiron line centroid energy indicative of relativistic precession © a r X i v : . [ a s t r o - ph . H E ] S e p Adam Ingram (Ingram et al. 2016). In future, the same concept applied toX-ray polarimeters will enable searches for variability in X-ray polarization degree and angle on timescales far shorterthan the minimum exposure time required for detection ofpolarization (Ingram & Maccarone 2017). As the field ma-tures, the focus is now shifting from discovering qualitativefeatures to measuring physical properties of the accretionflow and/or compact object by fitting theoretical models.For instance, Mastroserio et al. (2019) recently used the X-ray reverberation model reltrans (Ingram et al. 2019) tomeasure the mass of the black hole in Cygnus X-1. It istherefore important to use accurate error estimates for theenergy-dependent cross-spectrum in order to properly assessmeasurement uncertainties of model parameters.Derivations for the error on a single power spectrum anda single cross-spectrum are presented in Bendat & Piersol(2010, originally published in 1971). However, these formu-lae cannot generally be used for the energy-dependent cross-spectrum because they do not account for the use of a com-mon reference band for all subject energy channels, and theydo not account for intrinsic correlations between variabil-ity in different energy channels. This means that use of theBendat & Piersol (2010) formulae for the energy-dependentcross-spectrum or the rms spectrum results in over-fitting.Here, I present analytic formulae for the error on the energy-dependent cross-spectrum and rms spectrum. These formu-lae are valid for any value of intrinsic coherence between en-ergy channels and I demonstrate that they work even for abroad reference band constructed by summing the flux fromall energy channels. In Section 2 I present the formulae, inSection 3 I test the formulae with a Monte Carlo simulationand in Section 4 I present the derivations of the formulae.The logic of this structure arises from the derivation mak-ing use of asymptotic limits of the Monte Carlo simulation.In Section 6 I discuss the context in which the new formu-lae should be used, the optimum reference band and theeffect on the literature of historical use of the old formulae.The conclusions are presented in Section 7. A code to im-plement the new error formulae on observational data withcomprehensive usage instructions is available at https://bitbucket.org/adingram/cross_spec_code/downloads/
In this section, I first quote the familiar error formulae fora single power spectrum and a single cross-spectrum. I thenpresent new error formulae for the rms spectrum and theenergy dependent cross-spectrum. These formulae take intoaccount correlations between energy bands and the use of asingle reference band for multiple cross-spectra. All formu-lae quoted here assume that the observed light curves can bemodelled as the sum of a signal and a noise component, suchthat the signal and noise contributions are completely un-correlated with one another. This is almost always the casein X-ray astronomy, but is not generally valid (e.g. opticallight curves that are heavily affected by atmospheric noise).I first introduce the nomenclature I will use throughout.
Imagine we measure the stochastically varying flux from anastronomical source in N e subject bands and one referenceband for a time T . Typically the N e subject band light curveseach represent the count rate in a different energy channel,and that is the context I adopt throughout this paper. Notehowever, that these bands could equally be the count ratefor different ranges of modulation angle measured by a po-larimeter such as the Imaging X-ray Polarimarey Explorer ( IXPE ; Weisskopf et al. 2016; Ingram & Maccarone 2017).We could imagine the reference band being the zeroth energychannel, or alternatively it could be the sum of some or allof the subject bands. I will explore both scenarios here, andultimately demonstrate that the two are equivalent after avery simple adjustment. The underlying power spectrum ofthe reference band, averaged over the k th Fourier frequencyrange ν k − ∆ ν k / to ν k + ∆ ν k / , is P r ( k ) . The equivalent forthe n th subject band is P s ( k , n ) , and the intrinsic coherence between the n th subject band and the reference band forthe k th frequency range is γ ( k , n ) . If these underlying prop-erties remain constant over the time T ( stationarity ), thepower spectra measured from individual segments of thelight curves will be realizations of these underlying powerspectra plus a noise contribution due to photon countingstatistics.We can estimate the underlying properties by averagingover N realizations. Here, the averaging can be over differentsegments and over different Fourier frequencies in the k th frequency range (i.e. ν k − ∆ ν k ≤ ν ≤ ν k + ∆ ν k ), and so N isthe product of the number of segments in the observationand the number of frequencies in the range (see e.g. van derKlis 1989; Uttley et al. 2014). Denoting the j th realizationof the Fourier transforms of the flux in the reference bandand the n th subject band as R j ( k ) and S j ( k , n ) respectively,our estimates for the underlying power spectra are ˜ P r ( k ) = (cid:104)| R j ( k )| (cid:105) ≡ N N (cid:213) j = | R j ( k )| (1) ˜ P s ( k , n ) = (cid:104)| S j ( k , n )| (cid:105) ≡ N N (cid:213) j = | S j ( k , n )| . (2)The expectation values of these estimates are E { ˜ P r ( k )} = P r ( k ) + P r , noise ( k ) (3) E { ˜ P s ( k , n )} = P s ( k , n ) + P s , noise ( k , n ) , (4)where P r , noise ( k ) and P s , noise ( k , n ) are expectation values forthe noise contribution in the k th frequency range. Hereafter,a tilde will continue to denote an estimate of a variable.In the same way, we can estimate the cross-spectrumbetween the n th subject band and the reference band. Forsubject bands that are not one of the channels summed inorder to create the reference band light curve, the estimatedcross-spectrum is ˜ G ( k , n ) = (cid:104) S j ( k , n ) R ∗ j ( k )(cid:105) . (5)For subject bands that are included in the reference band,the estimated cross-spectrum is ˜ G ( k , n ) = (cid:104) S j ( k , n ) R ∗ j ( k )(cid:105) − N( k , n ) , (6)where N( k , n ) accounts for the trivial correlation between the MNRAS , 1–13 (2019) ross-spectrum errors noise component of the n th subject band and the noise com-ponent of the same subject band’s contribution to the refer-ence band. The expression for N( k , n ) is derived in the follow-ing sub-section. It is common in the literature to avoid thistrivial correlation by subtracting the subject band countsfrom the reference band, resulting in each subject band be-ing crossed with a slightly different reference band (Uttleyet al. 2014). I will demonstrate that this rather unsatisfyingpractice is not at all required, and that equation (6) providesa far more elegant way to deal with a broad reference band.Equations (5) and (6) both have an expectation value of E { ˜ G ( k , n )} = E (cid:8) Re [ ˜ G ( k , n )] + i Im [ ˜ G ( k , n )] (cid:9) = G ( k , n ) ≡ Re [ G ( k , n )] + i Im [ G ( k , n )] = γ ( k , n ) S ( k , n ) R ∗ ( k ) , (7)where R ( k ) and S ( k , n ) are the underlying Fourier transformsof the reference and subject bands, such that | R ( k )| = P r ( k ) and | S ( k , n )| = P s ( k , n ) . Note that here, following Bendat &Piersol (2010), the underlying cross-spectrum is defined as G ( k , n ) = γ ( k , n ) S ( k , n ) R ∗ ( k ) . This is to ensure that the un-derlying cross-spectrum contains information on how wellthe light curves correlate with each other as well as thephase lag between them, plus it is mathematically conve-nient for the estimate of the cross-spectrum to tend to theunderlying cross-spectrum as N tends to infinity. The co-herence is a measure of how well different realizations ofthe cross-spectrum line up on the complex plane (see e.g.Nowak et al. 1999 for an intuitive discussion). For N (cid:38) ,all estimates are Gaussian distributed to a very good ap-proximation (Huppenkothen & Bachetti 2018). This is thelimit considered throughout this paper. The most importantnomenclature used throughout this paper is summarized inTable 1. Here, I derive the function N( k , n ) that accounts for the triv-ial correlation that the subject band flux has to its owncontribution to the reference band. The correct expressiondepends on whether the absolute or fractional rms normal-ization is applied. Defining the j th realization of the referenceband Fourier transform in the k th frequency range as R j ( k ) = N e (cid:213) n = S j ( k , n ) , (8)it is straight forward to show that the expectation value of (cid:104) S j ( k , n ) R ∗ j ( k )(cid:105) is E {(cid:104) S j ( k , n ) R ∗ j ( k )(cid:105)} = (cid:104)| S j ( k , n )| (cid:105) + (cid:213) m (cid:44) n S j ( k , n ) S ∗ j ( k , m ) . (9)The first term on the right hand side includes a noise term(equation 4) therefore, if ˜ G ( n , k ) and P s , noise ( k , n ) are bothdefined in absolute rms normalization (see e.g. Section 2 ofIngram & van der Klis 2013), then N( k , n ) = P s , noise ( k , n ) . (10)In the case of pure photon counting noise (i.e. the noise isPoisson distributed with no background subtraction, instru-mental dead time etc), then P s , noise ( k , n ) = µ s ( n ) (van derKlis 1989; Uttley et al. 2014), where µ s ( n ) is the mean countrate in the n th subject band. Symbol Definition P r Reference band power spectrum P s Subject band power spectrum R Fourier transform of the reference band fluxtime series S Fourier transform of the subject band flux timeseries µ R Reference band mean count rate µ s Subject band mean count rate P r , noise Reference band noise contribution P s , noise Subject band noise contribution G Cross-spectrum: G = Re [ G ] + i Im [ G ] = | G | e i φ γ Intrinsic squared coherence g Raw squared coherence b Bias term N See Section 2.2Q Underlying value of quantity Q ˜ Q Estimate of quantity QQ j j th realization of quantity Q (cid:104) Q j (cid:105) N (cid:205) Nj = Q j dQ σ uncertainty on quantity Q Table 1.
Summary of the meaning of the most frequently usedsymbols in this paper.
The cross-spectrum can be converted from absoluteto fractional rms normalization by dividing both sides by µ s ( n ) µ R , where µ R is the mean count rate in the referenceband. The power can be converted from absolute to frac-tional rms normalization by dividing by µ s ( n ) . Therefore, thefunction that accounts for the trivial correlations becomes N( k , n ) = µ s ( n ) µ R P s , noise ( k , n ) . (11)For pure photon counting noise, this becomes N( k , n ) = / µ R , which happens to be equal to P r , noise in fractional rmsnormalization. The error on the reference band power spectrum is (e.g. vander Klis 1989; Bendat & Piersol 2010) d ˜ P r ( k ) = ˜ P r ( k )√ N , (12)and extension to one of the subject band power spectra istrivial. This error estimate is appropriate if one wishes tofit a model for the frequency dependence of a single powerspectrum, in that the expected value of reduced χ for thecorrect model is unity. Bendat & Piersol (2010) show that MNRAS000
The cross-spectrum can be converted from absoluteto fractional rms normalization by dividing both sides by µ s ( n ) µ R , where µ R is the mean count rate in the referenceband. The power can be converted from absolute to frac-tional rms normalization by dividing by µ s ( n ) . Therefore, thefunction that accounts for the trivial correlations becomes N( k , n ) = µ s ( n ) µ R P s , noise ( k , n ) . (11)For pure photon counting noise, this becomes N( k , n ) = / µ R , which happens to be equal to P r , noise in fractional rmsnormalization. The error on the reference band power spectrum is (e.g. vander Klis 1989; Bendat & Piersol 2010) d ˜ P r ( k ) = ˜ P r ( k )√ N , (12)and extension to one of the subject band power spectra istrivial. This error estimate is appropriate if one wishes tofit a model for the frequency dependence of a single powerspectrum, in that the expected value of reduced χ for thecorrect model is unity. Bendat & Piersol (2010) show that MNRAS000 , 1–13 (2019)
Adam Ingram the error for the real and imaginary parts of a single cross-spectrum is d Re [ ˜ G ( k )] = (cid:115) ˜ P r ( k ) ˜ P s ( k ) + ( Re [ ˜ G ( k )]) − ( Im [ ˜ G ( k )]) Nd Im [ ˜ G ( k )] = (cid:115) ˜ P r ( k ) ˜ P s ( k ) − ( Re [ ˜ G ( k )]) + ( Im [ ˜ G ( k )]) N . (13)Again, these error estimates are appropriate if one wishes tofit a model for the frequency dependence of a single cross-spectrum, as in e.g. Rapisarda et al. (2017b).One may alternatively wish to fit a model for modulus, | ˜ G ( k )| , and phase, tan ˜ φ ( k ) = Im [ ˜ G ( k )]/ Re [ ˜ G ( k )] , of the cross-spectrum. In this case, the Bendat & Piersol (2010) formulaeare d ˜ φ ( k ) = (cid:115) − ˜ g ( k ) g ( k ) N (14) d | ˜ G ( k )| = (cid:115) ˜ P r ( k ) ˜ P s ( k ) N , (15)where ˜ φ ( k ) is expressed in radians and ˜ g ( k ) is the raw coher-ence , estimated as ˜ g ( k ) = | ˜ G ( k )| − ˜ b ( k ) ˜ P r ( k ) ˜ P s ( k ) , (16)and ˜ b ( k ) is the bias term, ˜ b ( k ) = ˜ P r ( k ) ˜ P s ( k ) − γ [ ˜ P r ( k ) − P r , noise ( k )][ ˜ P s ( k ) − P s , noise ( k )] N . (17)This bias term comes from the modulus of a quantity beingpositive definite, and therefore a noisy measurement of | ˜ G | will be biased towards values slightly larger than | G | . Sub-tracting off the bias term corrects for this bias. We see fromequation (17) that ˜ b depends on the intrinsic coherence,which is not known a priori . Usually, the bias term is calcu-lated assuming γ = (Vaughan & Nowak 1997; Uttley et al.2014), but note that a more general estimate can be madeby setting up an iteration loop. I use such an iteration loopon Rossi X-ray Timing Explorer ( RXTE ) data in section 5.For large N , ˜ b becomes very small and it is practical to set ˜ b = , which I do in sections 3 and 4 in which N = is al-ways used. Since the estimate of the bias term is itself noisy,including it even for large N can lead to erroneous estima-tions of negative ˜ g . I therefore recommend always setting ˜ b = for N ≥ . If we instead calculate N e cross-spectra with the intentionof fitting a model for the energy dependence of the cross-spectrum ( the energy-dependent limit ), we must appreciatethat: i) we have used the same reference band for each cross-spectrum, and ii) the subject bands are correlated with oneanother. Using the Bendat & Piersol (2010) formulae willtherefore lead to over-fitting (i.e. the expectation value of re-duced χ is less than unity), because these formulae accountfor uncertainties in the reference band that are the same for each of the N e cross-spectra. These reference band uncer-tainties therefore contribute a systematic error in that theycontribute an uncertainty on the normalization of the model,but they do not contribute to the dispersion of measure-ments for different subject bands. In the energy-dependentlimit, the error on the real part, imaginary part and modulusof the cross-spectrum are all the same, given by d Re [ ˜ G ( n )] = d Im [ ˜ G ( n )] = d | ˜ G ( n )| = (cid:115) ˜ P r N (cid:20) ˜ P s ( n ) − | ˜ G ( n )| − ˜ b ( n ) ˜ P r − P r , noise (cid:21) , (18)where all dependencies on frequency range have been left asimplicit for brevity (and will continue to be for the remainderof this section). I present the derivation of this equation inSection 4.1. The error on the phase is d ˜ φ ( n ) = (cid:115) ˜ P r N (cid:20) ˜ P s ( n )| ˜ G ( n )| − ˜ b ( n ) − P r − P r , noise (cid:21) , (19)and I present the derivation of this equation in Section 4.2.In Section 3, I demonstrate that these formulae work usinga Monte Carlo simulation.It is common to define alternative statistics derivedfrom the cross-spectrum such as the complex covariance , ˜ C ( n ) ≡ ˜ G ( n ) (cid:113) ∆ ν /( ˜ P r − P r , noise ) (Mastroserio et al. 2018)and the covariance spectrum , | ˜ C ( n )| (Wilkinson & Uttley2009; Uttley et al. 2014). The error on these quantitiescan be trivially obtained by multiplying equation (18) by (cid:113) ∆ ν /( ˜ P r − P r , noise ) . Finally, if we calculate the power spectrum for each sub-ject band with the intention of fitting an energy dependentmodel, we would again suffer from over-fitting if we used theBendat & Piersol (2010) error estimate (equation 12). Thisis because the subject bands are correlated with one an-other, whereas using equation (12) implicitly assumes thatthe subject bands are completely independent of one an-other. Instead, the correct error estimate is d ˜ P s = (cid:115) [ − ˜ γ ( n )] ˜ P ( n ) + P , noise ( n ) + P sub ( n ) P s , noise ( n ) N , (20)where ˜ P sub ( n ) ≡ ˜ P s ( n ) − P s , noise ( n ) and the estimate of theintrinsic coherence is ˜ γ ( n ) = | ˜ G ( n )| − ˜ b ( n )( ˜ P r − P r , noise ) P sub ( n ) . (21)Note that for γ = , the new formula simplifies dramaticallyto the old formula (equation 12 with subscript r changedto subscript s). I present a derivation of equation (20) inSection 4.3 and demonstrate that is works with a simulationin Section 3.It is common to define the rms spectrum , ˜ σ ( n ) ≡ (cid:112) ∆ ν ˜ P sub ( n ) . Depending on the normalization used for theFourier transform, this can either be used as a measure ofthe fractional or absolute rms variability amplitude (see e.g. MNRAS , 1–13 (2019) ross-spectrum errors Ingram & van der Klis 2013 or Uttley et al. 2014 for a dis-cussion on rms normalization). Using standard error prop-agation, it is straightforward to show that the error on therms spectrum is d ˜ σ ( n ) = (cid:115) [ − ˜ γ ( n )] ˜ σ ( n ) + σ ( n ) + σ ( n ) ˜ σ noise ( n ) N ˜ σ ( n ) , (22)where ˜ σ s , noise ( n ) ≡ (cid:112) ∆ ν P noise ( n ) . In this section, I use Monte Carlo simulations to demonstratethat the formulae all work, in that they give the expected χ values when the simulation data are fit back with thecorrect model. I first assume that the reference band is en-tirely separate from the subject bands, before additionallysimulating the case whereby the reference band is the sumof all the subject bands in order to demonstrate that thetwo cases are the same.Let us start by defining a complex Gaussian randomvariable X jk , where j denotes the j th realization and k the k th frequency range. The subscripts therefore denote thatnew values of the real and imaginary parts of X jk are drawnfor each frequency range of each realization. Although thesimulation only considers a single frequency range, it is in-structive to carry around explicit k dependencies in some ofthe formulae. The real and imaginary parts of X jk are bothindependent random variables drawn from a Gaussian dis-tribution with a mean of zero and a variance of / . Thismeans that the expectation value of | X jk | is unity. Defininganother independent complex random variable A jk in ex-actly the same way, we can use these ‘seed’ complex randomvariables to define the j th realization of the Fourier transformof the reference band flux as R j ( k ) = R ( k ) X jk + (cid:113) P r , noise ( k ) A jk , (23)where P r ( k ) = | R ( k )| . This is essentially the algorithm ofTimmer & Koenig (1995). It is fairly straight forward toshow that the expectation value of (cid:104)| R j ( k )| (cid:105) is P r ( k ) + P r , noise ( k ) , as is required (equation 3). This is because X jk and A jk are uncorrelated, and so the expectation value ofe.g. (cid:104) X jk A ∗ jk (cid:105) is zero.To simulate the n th subject band Fourier transform, wecan define a further two seed complex random variables, with j th realization Y jkn and B jkn . These have exactly the sameproperties as X jk and A jk , except the subscript n denotesthat a new value of Y jkn and B jkn is drawn for each subjectband, whereas the same values of X jk and A jk are used forthe j th realization of every subject band. The j th realizationof the Fourier transform of the flux in the n th reference bandfor the k th frequency range is S j ( k , n ) = S ( k , n ) (cid:20) γ ( k , n ) X jk + (cid:113) − γ ( k , n ) Y jkn (cid:21) + (cid:113) P s , noise ( k , n ) B jkn , (24)where P s ( n , k ) = | S ( n , k )| . We can again check thatthis has the required properties: E {(cid:104)| S j ( n , k )| (cid:105)} = P s ( n , k ) + P s , noise ( n , k ) and E {(cid:104) S j ( n , k ) R ∗ j ( k )(cid:105)} = G ( n , k ) = γ ( n , k )| S ( n , k )|| R ( k )| . Because new values of X jk and A jk arenot drawn for different subject bands, this simulation cap-tures the correlations between subject bands created by us-ing a common reference band.We can now use this Monte Carlo simulation to test theerror formulae presented in Section 2. I average ensemble av-eraged quantities over N = realizations and calculate thecross-spectrum for N e = subject band channels, consider-ing only one frequency range (I will therefore leave frequencydependencies as implicit for the remainder of this section). Iuse arbitrarily chosen input parameters: γ ( n ) = . , R = . , P r , noise = . , S ( n ) = . + i . and P s , noise ( n ) = . to cal-culate N e simulated cross-spectral estimates, ˜ G ( n ) , and N e estimates of the rms, ˜ σ ( n ) . Since the chosen input underlyingquantities do not depend on energy channel n , any scatter inthe simulated cross-spectra will be due purely to statisticalfluctuations. Fig. 1 (left) shows the simulation output. Topto bottom panels show respectively the real, imaginary, mod-ulus and phase of the cross-spectrum, and the error bars arecalculated using the formulae presented in Section 2.4. Thered lines are best-fitting constants. It is important to notethat these red lines are not exactly equal to the input mod-els. This is because of the systematic error introduced by thereference band uncertainty. It is possible to incorporate thisuncertainty into the uncertainty of the model normalization,since we know the errors on the reference band power spec-trum. However, most of the time the model normalization isnot of much physical interest.The error bars in Fig. 1 (left) describe the scatter ofthe data well, resulting in reduced χ values of ∼ unity. Ifurther test the error formulae by running 50,000 such sim-ulations, each time with a different random seed, and mea-suring the same four χ values for each run. Fig. 1 (right)shows histograms of these 50,000 χ values (black steppedlines) for the real part, imaginary part, modulus and phaseof the cross-spectrum (top to bottom). These histograms aredescribed very well by the χ probability density functionwith N e − = degrees of freedom (red lines), indicatingthat the error formulae are correct. The grey stepped linesresult when the Bendat & Piersol (2010) error formulae areused instead of the new formulae presented here. Clearly,these formulae result in smaller reduced χ values than thosepredicted by the χ probability density function, indicatingover-fitting. Corresponding results for the rms spectrum areshown in Fig. 2. Again we see that the new formula (blackstepped lines) gives the expected answer, whereas the Ben-dat & Piersol (2010) formula results in (albeit very mild)over-fitting .The simulation results presented thus far all assumethat the reference band is completely separate from the sub-ject bands. I additionally ran an alternative set of simula-tions with the reference band instead consisting of the sumof all the subject bands. The results of this simulation areshown in Fig. 3, demonstrating that the new error formulaepresented in this paper are also valid in the case wherebythe reference band light curve is created by summing someor all of the subject band light curves. Note that the Bendat & Piersol (2010) formulae work very wellfor simulations of a single frequency dependent power and cross-spectrum, as expected.MNRAS000
Adam Ingram the error for the real and imaginary parts of a single cross-spectrum is d Re [ ˜ G ( k )] = (cid:115) ˜ P r ( k ) ˜ P s ( k ) + ( Re [ ˜ G ( k )]) − ( Im [ ˜ G ( k )]) Nd Im [ ˜ G ( k )] = (cid:115) ˜ P r ( k ) ˜ P s ( k ) − ( Re [ ˜ G ( k )]) + ( Im [ ˜ G ( k )]) N . (13)Again, these error estimates are appropriate if one wishes tofit a model for the frequency dependence of a single cross-spectrum, as in e.g. Rapisarda et al. (2017b).One may alternatively wish to fit a model for modulus, | ˜ G ( k )| , and phase, tan ˜ φ ( k ) = Im [ ˜ G ( k )]/ Re [ ˜ G ( k )] , of the cross-spectrum. In this case, the Bendat & Piersol (2010) formulaeare d ˜ φ ( k ) = (cid:115) − ˜ g ( k ) g ( k ) N (14) d | ˜ G ( k )| = (cid:115) ˜ P r ( k ) ˜ P s ( k ) N , (15)where ˜ φ ( k ) is expressed in radians and ˜ g ( k ) is the raw coher-ence , estimated as ˜ g ( k ) = | ˜ G ( k )| − ˜ b ( k ) ˜ P r ( k ) ˜ P s ( k ) , (16)and ˜ b ( k ) is the bias term, ˜ b ( k ) = ˜ P r ( k ) ˜ P s ( k ) − γ [ ˜ P r ( k ) − P r , noise ( k )][ ˜ P s ( k ) − P s , noise ( k )] N . (17)This bias term comes from the modulus of a quantity beingpositive definite, and therefore a noisy measurement of | ˜ G | will be biased towards values slightly larger than | G | . Sub-tracting off the bias term corrects for this bias. We see fromequation (17) that ˜ b depends on the intrinsic coherence,which is not known a priori . Usually, the bias term is calcu-lated assuming γ = (Vaughan & Nowak 1997; Uttley et al.2014), but note that a more general estimate can be madeby setting up an iteration loop. I use such an iteration loopon Rossi X-ray Timing Explorer ( RXTE ) data in section 5.For large N , ˜ b becomes very small and it is practical to set ˜ b = , which I do in sections 3 and 4 in which N = is al-ways used. Since the estimate of the bias term is itself noisy,including it even for large N can lead to erroneous estima-tions of negative ˜ g . I therefore recommend always setting ˜ b = for N ≥ . If we instead calculate N e cross-spectra with the intentionof fitting a model for the energy dependence of the cross-spectrum ( the energy-dependent limit ), we must appreciatethat: i) we have used the same reference band for each cross-spectrum, and ii) the subject bands are correlated with oneanother. Using the Bendat & Piersol (2010) formulae willtherefore lead to over-fitting (i.e. the expectation value of re-duced χ is less than unity), because these formulae accountfor uncertainties in the reference band that are the same for each of the N e cross-spectra. These reference band uncer-tainties therefore contribute a systematic error in that theycontribute an uncertainty on the normalization of the model,but they do not contribute to the dispersion of measure-ments for different subject bands. In the energy-dependentlimit, the error on the real part, imaginary part and modulusof the cross-spectrum are all the same, given by d Re [ ˜ G ( n )] = d Im [ ˜ G ( n )] = d | ˜ G ( n )| = (cid:115) ˜ P r N (cid:20) ˜ P s ( n ) − | ˜ G ( n )| − ˜ b ( n ) ˜ P r − P r , noise (cid:21) , (18)where all dependencies on frequency range have been left asimplicit for brevity (and will continue to be for the remainderof this section). I present the derivation of this equation inSection 4.1. The error on the phase is d ˜ φ ( n ) = (cid:115) ˜ P r N (cid:20) ˜ P s ( n )| ˜ G ( n )| − ˜ b ( n ) − P r − P r , noise (cid:21) , (19)and I present the derivation of this equation in Section 4.2.In Section 3, I demonstrate that these formulae work usinga Monte Carlo simulation.It is common to define alternative statistics derivedfrom the cross-spectrum such as the complex covariance , ˜ C ( n ) ≡ ˜ G ( n ) (cid:113) ∆ ν /( ˜ P r − P r , noise ) (Mastroserio et al. 2018)and the covariance spectrum , | ˜ C ( n )| (Wilkinson & Uttley2009; Uttley et al. 2014). The error on these quantitiescan be trivially obtained by multiplying equation (18) by (cid:113) ∆ ν /( ˜ P r − P r , noise ) . Finally, if we calculate the power spectrum for each sub-ject band with the intention of fitting an energy dependentmodel, we would again suffer from over-fitting if we used theBendat & Piersol (2010) error estimate (equation 12). Thisis because the subject bands are correlated with one an-other, whereas using equation (12) implicitly assumes thatthe subject bands are completely independent of one an-other. Instead, the correct error estimate is d ˜ P s = (cid:115) [ − ˜ γ ( n )] ˜ P ( n ) + P , noise ( n ) + P sub ( n ) P s , noise ( n ) N , (20)where ˜ P sub ( n ) ≡ ˜ P s ( n ) − P s , noise ( n ) and the estimate of theintrinsic coherence is ˜ γ ( n ) = | ˜ G ( n )| − ˜ b ( n )( ˜ P r − P r , noise ) P sub ( n ) . (21)Note that for γ = , the new formula simplifies dramaticallyto the old formula (equation 12 with subscript r changedto subscript s). I present a derivation of equation (20) inSection 4.3 and demonstrate that is works with a simulationin Section 3.It is common to define the rms spectrum , ˜ σ ( n ) ≡ (cid:112) ∆ ν ˜ P sub ( n ) . Depending on the normalization used for theFourier transform, this can either be used as a measure ofthe fractional or absolute rms variability amplitude (see e.g. MNRAS , 1–13 (2019) ross-spectrum errors Ingram & van der Klis 2013 or Uttley et al. 2014 for a dis-cussion on rms normalization). Using standard error prop-agation, it is straightforward to show that the error on therms spectrum is d ˜ σ ( n ) = (cid:115) [ − ˜ γ ( n )] ˜ σ ( n ) + σ ( n ) + σ ( n ) ˜ σ noise ( n ) N ˜ σ ( n ) , (22)where ˜ σ s , noise ( n ) ≡ (cid:112) ∆ ν P noise ( n ) . In this section, I use Monte Carlo simulations to demonstratethat the formulae all work, in that they give the expected χ values when the simulation data are fit back with thecorrect model. I first assume that the reference band is en-tirely separate from the subject bands, before additionallysimulating the case whereby the reference band is the sumof all the subject bands in order to demonstrate that thetwo cases are the same.Let us start by defining a complex Gaussian randomvariable X jk , where j denotes the j th realization and k the k th frequency range. The subscripts therefore denote thatnew values of the real and imaginary parts of X jk are drawnfor each frequency range of each realization. Although thesimulation only considers a single frequency range, it is in-structive to carry around explicit k dependencies in some ofthe formulae. The real and imaginary parts of X jk are bothindependent random variables drawn from a Gaussian dis-tribution with a mean of zero and a variance of / . Thismeans that the expectation value of | X jk | is unity. Defininganother independent complex random variable A jk in ex-actly the same way, we can use these ‘seed’ complex randomvariables to define the j th realization of the Fourier transformof the reference band flux as R j ( k ) = R ( k ) X jk + (cid:113) P r , noise ( k ) A jk , (23)where P r ( k ) = | R ( k )| . This is essentially the algorithm ofTimmer & Koenig (1995). It is fairly straight forward toshow that the expectation value of (cid:104)| R j ( k )| (cid:105) is P r ( k ) + P r , noise ( k ) , as is required (equation 3). This is because X jk and A jk are uncorrelated, and so the expectation value ofe.g. (cid:104) X jk A ∗ jk (cid:105) is zero.To simulate the n th subject band Fourier transform, wecan define a further two seed complex random variables, with j th realization Y jkn and B jkn . These have exactly the sameproperties as X jk and A jk , except the subscript n denotesthat a new value of Y jkn and B jkn is drawn for each subjectband, whereas the same values of X jk and A jk are used forthe j th realization of every subject band. The j th realizationof the Fourier transform of the flux in the n th reference bandfor the k th frequency range is S j ( k , n ) = S ( k , n ) (cid:20) γ ( k , n ) X jk + (cid:113) − γ ( k , n ) Y jkn (cid:21) + (cid:113) P s , noise ( k , n ) B jkn , (24)where P s ( n , k ) = | S ( n , k )| . We can again check thatthis has the required properties: E {(cid:104)| S j ( n , k )| (cid:105)} = P s ( n , k ) + P s , noise ( n , k ) and E {(cid:104) S j ( n , k ) R ∗ j ( k )(cid:105)} = G ( n , k ) = γ ( n , k )| S ( n , k )|| R ( k )| . Because new values of X jk and A jk arenot drawn for different subject bands, this simulation cap-tures the correlations between subject bands created by us-ing a common reference band.We can now use this Monte Carlo simulation to test theerror formulae presented in Section 2. I average ensemble av-eraged quantities over N = realizations and calculate thecross-spectrum for N e = subject band channels, consider-ing only one frequency range (I will therefore leave frequencydependencies as implicit for the remainder of this section). Iuse arbitrarily chosen input parameters: γ ( n ) = . , R = . , P r , noise = . , S ( n ) = . + i . and P s , noise ( n ) = . to cal-culate N e simulated cross-spectral estimates, ˜ G ( n ) , and N e estimates of the rms, ˜ σ ( n ) . Since the chosen input underlyingquantities do not depend on energy channel n , any scatter inthe simulated cross-spectra will be due purely to statisticalfluctuations. Fig. 1 (left) shows the simulation output. Topto bottom panels show respectively the real, imaginary, mod-ulus and phase of the cross-spectrum, and the error bars arecalculated using the formulae presented in Section 2.4. Thered lines are best-fitting constants. It is important to notethat these red lines are not exactly equal to the input mod-els. This is because of the systematic error introduced by thereference band uncertainty. It is possible to incorporate thisuncertainty into the uncertainty of the model normalization,since we know the errors on the reference band power spec-trum. However, most of the time the model normalization isnot of much physical interest.The error bars in Fig. 1 (left) describe the scatter ofthe data well, resulting in reduced χ values of ∼ unity. Ifurther test the error formulae by running 50,000 such sim-ulations, each time with a different random seed, and mea-suring the same four χ values for each run. Fig. 1 (right)shows histograms of these 50,000 χ values (black steppedlines) for the real part, imaginary part, modulus and phaseof the cross-spectrum (top to bottom). These histograms aredescribed very well by the χ probability density functionwith N e − = degrees of freedom (red lines), indicatingthat the error formulae are correct. The grey stepped linesresult when the Bendat & Piersol (2010) error formulae areused instead of the new formulae presented here. Clearly,these formulae result in smaller reduced χ values than thosepredicted by the χ probability density function, indicatingover-fitting. Corresponding results for the rms spectrum areshown in Fig. 2. Again we see that the new formula (blackstepped lines) gives the expected answer, whereas the Ben-dat & Piersol (2010) formula results in (albeit very mild)over-fitting .The simulation results presented thus far all assumethat the reference band is completely separate from the sub-ject bands. I additionally ran an alternative set of simula-tions with the reference band instead consisting of the sumof all the subject bands. The results of this simulation areshown in Fig. 3, demonstrating that the new error formulaepresented in this paper are also valid in the case wherebythe reference band light curve is created by summing someor all of the subject band light curves. Note that the Bendat & Piersol (2010) formulae work very wellfor simulations of a single frequency dependent power and cross-spectrum, as expected.MNRAS000 , 1–13 (2019)
Adam Ingram R e G I m G | G | φ Channel Number (n) 0.020.04 d P / d χ ReG d P / d χ ImG d P / d χ |G| d P / d χ χ /d.o.f. φ Figure 1.
Left:
Example of simulated data with error bars calculated using the analytic expressions presented in this paper (black), alongwith the best-fitting model (red lines). The top two panels show the real and imaginary parts of the energy-dependent cross-spectrum,whereas the bottom two panels show the modulus and phase angle (the latter is in radians). The input model does not depend on channelnumber and the input parameters are arbitrarily chosen: γ = . , R = . , P r , noise = . , S ( n ) = . + i . , P s , noise ( n ) = . , N = . Right:
Probability distribution functions created by simulating 50,000 data sets (the left panel is an example of one of these data sets) andcalculating the χ of the best-fitting model for each one (stepped lines). The black histograms are calculated using the error formulaederived in this paper, whereas the grey lines are instead calculated using the Bendat & Piersol formulae. The red lines depict the χ probability distribution function with degrees of freedom ( N e = energy channels and one free model parameter). It is clear that thenew formulae reproduce the theoretical distribution very well, whereas use of the old formulae leads to over-fitting. . . σ Channel Number (n) 0.5 1 1.5 2 . . . . P / d χ χ /d.o.f. Figure 2.
Left:
Simulations of an RMS spectrum with error bars calculated using equation (22) (black), along with the best-fittingmodel (red lines).
Right:
Probability distributions of the χ value calculated for each of 50,000 such simulations. Black and grey steppedlines respectively correspond to errors calculated using equation (22) and the Bendat & Piersol equivalent. The input parameters are thesame as those used for Fig. 1. MNRAS , 1–13 (2019) ross-spectrum errors d P / d χ ReG d P / d χ ImG d P / d χ |G| d P / d χ χ /d.o.f. φ Figure 3.
The same as Fig. 1 except now the reference bandin the simulation is the sum of all the subject bands. The errorformulae derived here therefore also apply in the case whereby thereference band light curve is a sum of some or all subject bandlight curves.
I use asymptotic limits of the Monte Carlo simulationto derive analytic estimates for the errors on the energy-dependent cross-spectrum. As discussed in the previous sub-section, the Monte Carlo simulations use 4 seed complexGaussian random variables: X jk , A jk , Y jkn and B jkn . Here,subscript j represents the j th realization, subscript k the k th frequency range and subscript n the n th energy channel. Cru-cially, new values of X jk and A jk are drawn only for everyfrequency range of every realization (i.e. every permutationof j and k ), whereas new values of Y jkn and B jkn are drawnfor every new frequency range and channel of every realiza-tion (i.e. every permutation of j , k and n ).This subtlety introduces two definitions of expecta-tion value: one associated with averaging over all frequencyranges and the other associated with averaging over all en-ergy channels. Taking (cid:104)| X jk | (cid:105) ≡ N N (cid:213) j = | X jk | (25)as an example of a random variable whose asymptotic prop-erties do not depend on k or n , the former and latter defini-tions of expectation are E k {(cid:104)| X jk | (cid:105)} = N f N f (cid:213) k = (cid:104)| X jk | (cid:105) , (26) and E n {(cid:104)| X jk | (cid:105)} = N e N e (cid:213) n = (cid:104)| X jk | (cid:105) . (27)Consequently, there are two different definitions of variance:that associated with changing k and that associated withchanging n Var k {(cid:104)| X jk | (cid:105)} = E k (cid:26)(cid:16) (cid:104)| X jk | (cid:105) (cid:17) (cid:27) − (cid:16) E k {(cid:104)| X jk | (cid:105)} (cid:17) (28) Var n {(cid:104)| X jk | (cid:105)} = E n (cid:26)(cid:16) (cid:104)| X jk | (cid:105) (cid:17) (cid:27) − (cid:16) E n {(cid:104)| X jk | (cid:105)} (cid:17) . (29)Hereafter, I shall refer to the former and latter cases respec-tively as the k variance and the n variance . The k varianceof (cid:104)| X jk | (cid:105) is Var k {(cid:104)| X jk | (cid:105)} = N , (30)which comes from the formula for the standard error onthe mean of a Gaussian distributed random variable. The n variance, on the other hand, is Var n {(cid:104)| X jk | (cid:105)} = , (31)because the same value of X jk is used for all energy channels,for a given realization and frequency range.In the Bendat & Piersol (2010) limit (i.e. the calculationof a single power spectrum or cross-spectrum) the k varianceis the relevant quantity, whereas in the energy-dependentlimit, the n variance is instead what matters. The followingasymptotic limits will also be important for the derivationsin the coming sub-sections Var n {| Y jkn | } = N (32) Var n { Re [ X jk Y ∗ jkn ]} = N (33) Var n { Re [ B jkn Y ∗ jkn ]} = N , (34)and random variables with the same subscripts can triviallybe interchanged. Since the expectation value of any deter-ministic quantity is simply itself, the above relations enableme to derive all error formulae. For instance, Var n {| S ( k , n )| (cid:104)| Y jkn | (cid:105)} = | S ( k , n )| N . (35) From equations (5), (23) and (24), we see that the real partof the estimated cross-spectrum is equal to Re [ ˜ G ( k , n )] = Re [ G ( k , n )](cid:104)| X jk | (cid:105) + (cid:113) − γ ( k , n ) Re [ S ( k , n ) R ∗ ( k )(cid:104) Y jkn X ∗ jk (cid:105)] + (cid:113) P s , noise ( k , n ) P r , noise ( k ) Re [(cid:104) B jkn A ∗ jk (cid:105)] + γ ( k , n ) (cid:113) P r , noise ( k ) Re [ S ( k , n )(cid:104) X jk A ∗ jk (cid:105)] + (cid:113) − γ ( k , n ) (cid:113) P r , noise ( k ) Re [ S ( k , n )(cid:104) Y jkn A ∗ jk (cid:105)] + (cid:113) P s , noise ( k , n ) Re [ R ∗ ( k )(cid:104) B jkn X ∗ jk (cid:105)] . (36) MNRAS000
I use asymptotic limits of the Monte Carlo simulationto derive analytic estimates for the errors on the energy-dependent cross-spectrum. As discussed in the previous sub-section, the Monte Carlo simulations use 4 seed complexGaussian random variables: X jk , A jk , Y jkn and B jkn . Here,subscript j represents the j th realization, subscript k the k th frequency range and subscript n the n th energy channel. Cru-cially, new values of X jk and A jk are drawn only for everyfrequency range of every realization (i.e. every permutationof j and k ), whereas new values of Y jkn and B jkn are drawnfor every new frequency range and channel of every realiza-tion (i.e. every permutation of j , k and n ).This subtlety introduces two definitions of expecta-tion value: one associated with averaging over all frequencyranges and the other associated with averaging over all en-ergy channels. Taking (cid:104)| X jk | (cid:105) ≡ N N (cid:213) j = | X jk | (25)as an example of a random variable whose asymptotic prop-erties do not depend on k or n , the former and latter defini-tions of expectation are E k {(cid:104)| X jk | (cid:105)} = N f N f (cid:213) k = (cid:104)| X jk | (cid:105) , (26) and E n {(cid:104)| X jk | (cid:105)} = N e N e (cid:213) n = (cid:104)| X jk | (cid:105) . (27)Consequently, there are two different definitions of variance:that associated with changing k and that associated withchanging n Var k {(cid:104)| X jk | (cid:105)} = E k (cid:26)(cid:16) (cid:104)| X jk | (cid:105) (cid:17) (cid:27) − (cid:16) E k {(cid:104)| X jk | (cid:105)} (cid:17) (28) Var n {(cid:104)| X jk | (cid:105)} = E n (cid:26)(cid:16) (cid:104)| X jk | (cid:105) (cid:17) (cid:27) − (cid:16) E n {(cid:104)| X jk | (cid:105)} (cid:17) . (29)Hereafter, I shall refer to the former and latter cases respec-tively as the k variance and the n variance . The k varianceof (cid:104)| X jk | (cid:105) is Var k {(cid:104)| X jk | (cid:105)} = N , (30)which comes from the formula for the standard error onthe mean of a Gaussian distributed random variable. The n variance, on the other hand, is Var n {(cid:104)| X jk | (cid:105)} = , (31)because the same value of X jk is used for all energy channels,for a given realization and frequency range.In the Bendat & Piersol (2010) limit (i.e. the calculationof a single power spectrum or cross-spectrum) the k varianceis the relevant quantity, whereas in the energy-dependentlimit, the n variance is instead what matters. The followingasymptotic limits will also be important for the derivationsin the coming sub-sections Var n {| Y jkn | } = N (32) Var n { Re [ X jk Y ∗ jkn ]} = N (33) Var n { Re [ B jkn Y ∗ jkn ]} = N , (34)and random variables with the same subscripts can triviallybe interchanged. Since the expectation value of any deter-ministic quantity is simply itself, the above relations enableme to derive all error formulae. For instance, Var n {| S ( k , n )| (cid:104)| Y jkn | (cid:105)} = | S ( k , n )| N . (35) From equations (5), (23) and (24), we see that the real partof the estimated cross-spectrum is equal to Re [ ˜ G ( k , n )] = Re [ G ( k , n )](cid:104)| X jk | (cid:105) + (cid:113) − γ ( k , n ) Re [ S ( k , n ) R ∗ ( k )(cid:104) Y jkn X ∗ jk (cid:105)] + (cid:113) P s , noise ( k , n ) P r , noise ( k ) Re [(cid:104) B jkn A ∗ jk (cid:105)] + γ ( k , n ) (cid:113) P r , noise ( k ) Re [ S ( k , n )(cid:104) X jk A ∗ jk (cid:105)] + (cid:113) − γ ( k , n ) (cid:113) P r , noise ( k ) Re [ S ( k , n )(cid:104) Y jkn A ∗ jk (cid:105)] + (cid:113) P s , noise ( k , n ) Re [ R ∗ ( k )(cid:104) B jkn X ∗ jk (cid:105)] . (36) MNRAS000 , 1–13 (2019)
Adam Ingram
Using equations (31)-(34), the n variance of the six terms onthe right hand side (RHS) of equation (36) is Var n { term 1 } = (37) Var n { term 2 } = [ − γ ( k , n )]| S ( k , n )| | R ( k )| N (38) Var n { term 3 } = P s , noise ( k , n ) P r , noise ( k ) N (39) Var n { term 4 } = (40) Var n { term 5 } = [ − γ ( k , n )]| S ( k , n )| P r , noise ( k ) N (41) Var n { term 6 } = P s , noise ( k , n )| R ( k )| N . (42)In the limit of N (cid:38) explored here, all six terms are Gaus-sian distributed and therefore the real part of the cross-spectrum is also Gaussian distributed. Therefore, the vari-ance of the real part of the cross-spectrum is the sum of thevariances of the six terms, giving Var n { Re [ ˜ G ( n )]} = ˜ P r N (cid:104) [ − γ ( n )] ˜ P s ( n ) + γ ( n ) P s , noise ( n ) (cid:105) , (43)where the k dependence has been left as implicit for brevity(and will continue to be hereafter, except for the seed ran-dom variables). The expression for the error on Re [ ˜ G ( n )] quoted in equation (18) results from taking the square rootof the above equation and re-arranging.The imaginary part of the cross-spectrum can also beexpanded into six terms, simply by taking equation (36) andsubstituting Re for Im . The n variances of these six terms areexactly the same as for the real part (equations 37-42), andtherefore d Im [ ˜ G ( n )] = d Re [ ˜ G ( n )] .We can use the same logic to derive the Bendat & Pier-sol (2010) formulae for real and imaginary parts of the cross-spectrum. In this case, there is only one value of n , and wewish to know the k variance, which is the same as the n variance for terms 2, 3, 5 and 6. The k variance of term 4is Var k { term 4 } = γ ( n )| S ( n )| P r , noise /( N ) for both the realand imaginary parts of the cross-spectrum. For term 1, itis ( Re [ G ( n )]) / N for the real part and ( Im [ G ( n )]) / N for theimaginary part. Summing the variance of the six terms, re-arranging and taking the square root then leads to the Ben-dat & Piersol (2010) error formulae (equations 13). Since the errors on the real and imaginary parts of theenergy-dependent cross-spectrum are equal to one another,we can trivially write d | ˜ G ( n )| = d Im [ ˜ G ( n )] = d Re [ ˜ G ( n )] . Thisis simple to appreciate if we picture error contours on thecomplex plane. The distribution is a Gaussian in the real andimaginary axes, and therefore the error contours will simplybe concentric circles if the errors on the real and imaginaryaxes are the same. Therefore, the width of the distributionis the same in any direction on the complex plane.Using geometrical arguments, Bendat & Piersol (2010)show that the variance of the phase angle can be written as Var { ˜ φ } = | G | (cid:110) ( Re [ G ]) Var { Im [ ˜ G ]}− [ G ] Im [ G ] Cov { Re [ ˜ G ] , Im [ ˜ G ]} + ( Im [ G ]) Var { Re [ ˜ G ]} (cid:111) , (44)and this is valid both for the k and n variance. Here, thecovariance term is Cov { Re [ ˜ G ( n )] , Im [ ˜ G ( n )]} = E { Re [ ˜ G ( n )] Im [ ˜ G ( n )]}− E { Re [ ˜ G ( n )]} E { Im [ ˜ G ( n )]} . (45)We can calculate Re [ ˜ G ( n )] Im [ ˜ G ( n )] by taking equation (36)and multiplying it by the equivalent expression for the imag-inary part of the cross-spectrum to get Re [ ˜ G ( n )] Im [ ˜ G ( n )] = Re [ G ( n )] Im [ G ( n )]((cid:104)| X jk | (cid:105)) +
35 more terms . (46)Substituting all 36 terms on the RHS of equation (46) intoequation (45) at first appears a rather daunting task. Wecan however take a shortcut by first considering the k co-variance, which we already know to be Cov k { Re [ ˜ G ] , Im [ ˜ G ]} = Re [ G ] Im [ G ]/ N (Bendat & Piersol 2010). We can see thatthis is equal to the k covariance of the first term on theRHS of equation (46). We can therefore conclude that theother 35 terms contribute zero k covariance. Therefore, allwe need to do to calculate the covariance term in the energy-dependent limit is evaluate the n variance of the first termon the RHS of equation (46), which is zero. Therefore, Cov n { Re [ ˜ G ] , Im [ ˜ G ]} = .With the covariance term equal to zero, and the vari-ances of the real and imaginary parts equal to one another,equation (44) dramatically simplifies to Var n { ˜ φ ( n )} = Var n { Re [ ˜ G ( n )]}| G ( n )| , (47)and it is simple to take the square root and re-arrange toarrive at equation (19). Substituting equation (24) into equation (2) and multiplyingout of the brackets gives ˜ P s ( n ) = γ ( n )| S ( n )| (cid:104)| X jk | (cid:105) + . (48)For the first term, the n variance is zero, whereas for theother 8 terms, the n variance is equal to the k variance.Therefore, using a similar trick to the previous sub-section,we can simply start from the Bendat & Piersol (2010) for-mula for the variance on a single power spectrum, Var k { ˜ P s } = ˜ P s / N , and subtract the k variance of the first term in equa-tion (48), which is Var k { γ ( n )| S ( n )| (cid:104)| X j | (cid:105)} = γ ( n )| S ( n )| N . (49)This gives Var n { ˜ P s ( n )} = (cid:2) | S ( n )| + P s , noise (cid:3) − γ ( n )| S ( n )| N , (50)which is straight forward to manipulate into equation (20). MNRAS , 1–13 (2019) ross-spectrum errors − − − . . R e [ G ] Frequency (Hz)
Figure 4.
Example of the real part of the cross-spectrum for
RXTE
PCA data from Cygnus X-1 (black points). The referenceband is .
14 keV ≤ E ≤ .
34 keV and the subject band is .
34 keV ≤ E ≤ .
64 keV . The error bars are calculated from the Bendat &Piersol formulae (equation 13). The black solid and red dashedlines show the errors calculated directly from the standard erroron the mean and with the Bendat & Piersol formula respectively.
I demonstrate the use of the new formulae on an
RXTE observation of Cygnus X-1 in the hard state (obsid 10238-01-06-00). This observation has been used for a numberof spectral-timing studies (e.g. Gilfanov et al. 2000; Mas-troserio et al. 2018; Mahmoud & Done 2018). Most signif-icantly for the purposes of this paper, Mastroserio et al.(2019) fit the X-ray reverberation model reltrans (Ingramet al. 2019) to the real and imaginary parts of the energy-dependent cross-spectrum for 10 frequency ranges (plus thetime-average spectrum) in order to yield a black hole massmeasurement, but the best fitting model had a reduced χ of . . It turns out that this over-fitting occurred becausethe error bars were calculated in the single cross-spectrumlimit relevant to the Bendat & Piersol (2010) formulae, whenthe error formulae derived in this paper would instead havebeen appropriate.I follow the data reduction procedure described in Mas-troserio et al. (2018) to obtain N e = light curves each withtime bin sizes of dt = / s (the best resolution possible forthis particular binned mode). I ensemble average the powerand cross spectra over 46 segments, each of s duration,resulting in a useful exposure of . ks. I apply geometricre-binning to the power and cross spectra, such that the k th frequency bin is averaged over N f ( k ) ≤ c k Fourier frequen-cies. Here, N f is an integer and c ≥ is a constant. There-fore, for large k the width of the frequency bins increaseslogarithmicaly with frequency. An accurate estimate of thenoise component in the reference band and all subject bandsis required. Since the noise is significantly affected by deadtime in this data set (due to the high count rate), and thereis no frequency range below the Nyquist frequency that isdominated by noise (in which case the noise level could havebeen determined empirically), it is necessary to use a deadtime model. I use the Zhang et al. (1995) dead time model ofthe Proportional Counter Array (PCA). I ignore very large
105 20 . . r a t i o Energy (keV)
Figure 5.
Example of the real part of the energy-dependent cross-spectrum for
RXTE
PCA data from Cygnus X-1 (black points) inthe frequency range . − . Hz, plotted as a ratio to an ab-sorbed power-law model folded around the PCA response matrix.The black points have error bars calculated from the Bendat &Piersol formulae, and the red points use the new formula derivedin this paper. The error bars calculated from the old formula areclearly too large, whereas the new error bars properly reflect thescatter seen in the data. events, which only become important for frequencies muchhigher than the Nyquist frequency of this data set (whichis Hz). Specifically, I use equation 4 from Nowak et al.(1999) with very large event rate R vle = , PCA dead time τ d = µ s, time bin size t b = dt = / s and the numberof proportional counter units (PCUs) is 5. This is a fairlygeneral formula, and so in principle can be applied for otherX-ray telescopes as long as the dead time is known, or theNyquist frequency is high enough to fit the model to a noisedominated frequency range (see e.g. Bult 2017 for the caseof NuSTAR ).In order to calculate the bias term, ˜ b (equation 17), Iset up an iteration loop. The loop is initialized by calculating ˜ b in the γ = limit, and this is then used to calculate ˜ γ (equation 16). This value of ˜ γ is then used to calculate anew value of ˜ b , which in turn is used to calculate anothernew value of ˜ γ and so on until convergence is achieved.This typically takes only a few steps, and the results turnout to be very similar to simply calculating ˜ b in the γ = limit because the coherence is very close to unity for mostfrequencies of most bands. For some very low count rateenergy bands (with E (cid:38) keV well out of the calibratedrange of the PCA), ˜ b is larger than | ˜ G | due to the largeuncertainties involved in making the estimates. Wheneverthis is the case, I simply set ˜ b = . This does not affect anyof the bands used for the analysis.The black points in Fig. 4 depict the real part of thecross-spectrum between a subject band consisting of channel10 ( . - . keV) and a reference band consisting of chan-nels 1-9 ( . - . keV). I use fractional rms normalization,re-bin using a geometric binning constant of c = . andthe error bars are calculated for the single cross-spectrumlimit (i.e. equation 13), since this is a single cross-spectrumplotted against frequency. The black line depicts the error MNRAS000
PCA data from Cygnus X-1 (black points) inthe frequency range . − . Hz, plotted as a ratio to an ab-sorbed power-law model folded around the PCA response matrix.The black points have error bars calculated from the Bendat &Piersol formulae, and the red points use the new formula derivedin this paper. The error bars calculated from the old formula areclearly too large, whereas the new error bars properly reflect thescatter seen in the data. events, which only become important for frequencies muchhigher than the Nyquist frequency of this data set (whichis Hz). Specifically, I use equation 4 from Nowak et al.(1999) with very large event rate R vle = , PCA dead time τ d = µ s, time bin size t b = dt = / s and the numberof proportional counter units (PCUs) is 5. This is a fairlygeneral formula, and so in principle can be applied for otherX-ray telescopes as long as the dead time is known, or theNyquist frequency is high enough to fit the model to a noisedominated frequency range (see e.g. Bult 2017 for the caseof NuSTAR ).In order to calculate the bias term, ˜ b (equation 17), Iset up an iteration loop. The loop is initialized by calculating ˜ b in the γ = limit, and this is then used to calculate ˜ γ (equation 16). This value of ˜ γ is then used to calculate anew value of ˜ b , which in turn is used to calculate anothernew value of ˜ γ and so on until convergence is achieved.This typically takes only a few steps, and the results turnout to be very similar to simply calculating ˜ b in the γ = limit because the coherence is very close to unity for mostfrequencies of most bands. For some very low count rateenergy bands (with E (cid:38) keV well out of the calibratedrange of the PCA), ˜ b is larger than | ˜ G | due to the largeuncertainties involved in making the estimates. Wheneverthis is the case, I simply set ˜ b = . This does not affect anyof the bands used for the analysis.The black points in Fig. 4 depict the real part of thecross-spectrum between a subject band consisting of channel10 ( . - . keV) and a reference band consisting of chan-nels 1-9 ( . - . keV). I use fractional rms normalization,re-bin using a geometric binning constant of c = . andthe error bars are calculated for the single cross-spectrumlimit (i.e. equation 13), since this is a single cross-spectrumplotted against frequency. The black line depicts the error MNRAS000 , 1–13 (2019) Adam Ingram on Re [ ˜ G ] estimated directly from the standard error aroundthe mean. For each Fourier frequency, this is calculated asthe standard deviation of the 46 estimates of the real part ofthe cross-spectrum (i.e. one for each light curve segment) di-vided by √ . The errors of all the Fourier frequencies in eachbroad frequency bin are then averaged in quadrature. Thered dashed line is the error instead calculated from equation(13). We see that this gives the same answer, as expected.This indicates that the use of independent Gaussian ran-dom variables in the derivation of the error formulae doesnot render them inaccurate for use on data from accretingcompact objects, even though we know that the real dataexhibit non-linear variability properties not captured by theTimmer & Koenig (1995) algorithm (Uttley et al. 2005; Mac-carone et al. 2011).Fig. 5 shows the real part of the energy-dependent cross-spectrum in the frequency range . - . Hz. The ref-erence band is now the full PCA band (i.e. the sum of all64 subject bands), the normalization is now absolute rmsand the re-binning constant is now c = . . Note that, be-cause each subject band is included in the reference band,a noise contribution, N( n ) , has been subtracted from eachsubject band (see Section 2.2). This noise contribution is al-ways very small compared to the signal because there aremany more counts in the reference band than in any givensubject band. The data are plotted as the ratio to an ab-sorbed power law model ( tbabs with N h = × cm − and the Wilms et al. 2000 abundances, Γ = . ) that hasbeen folded around the PCA response matrix. The black andred points respectively use the error formula for the singlecross-spectrum (equation 13) and energy-dependent (equa-tion 18) limit. We see that the error bars are clearly too largein the former case, whereas they are commensurate with thespread of the data around broad trends in the latter case. Itis worth noting that this is an extreme example chosen forillustration. For higher frequencies of the real part and allfrequencies of the imaginary part, the discrepancy betweenthe two formulae is much less striking. Since the error bars inMastroserio et al. (2019) were calculated from the standarderror on the mean (i.e. the Bendat & Piersol 2010 limit), thisexplains the mild over-fitting exhibited by their analysis: theover-fitting will have been fairly severe for some spectra inthe simultaneous fit, with little to no over-fitting for otherspectra. Fortunately, changing the error formulae is unlikelyto have a large systematic effect on the best fitting param-eters, because the low frequency cross-spectrum (which ismost affected by over-fitting) and the time-averaged spec-trum predicted by the reverberation model are very similarto one another, and the time-averaged spectrum is very wellconstrained by the data. In particular, the black hole massis most sensitive to the signal at higher frequencies. The un-certainties on the model parameters may however have beenover-estimated.As a final test of the formulae, I simultaneously fit rel-trans to the real and imaginary parts of the − keVenergy-dependent cross-spectrum in the frequency range . − . Hz only, using xspec version 12.9 (Arnaud1996). This is not indented as a means to constrain systemparameters, but rather as an exercise to test whether a rea-sonable χ can be achieved using the new error formulae.The result is shown in Fig. 6, with the real and imaginaryparts of the energy-dependent cross-spectrum plotted in the E R e [ G ( E ) ]
105 20 − E I m [ G ( E ) ] Energy (keV)
Figure 6.
Example fit of the X-ray reverberation model rel-trans to the . − . Hz energy-dependent cross-spectrum.Real and imaginary parts are in the top and bottom panels re-spectively. The data are unfolded around the best fitting model( xspec command iplot eeuf ). The reduced χ is 42.9/36, whichcorresponds to a null-hypothesis probability of . top and bottom panels respectively. The data are unfoldedaround the best fitting model ( xspec command iplot eeuf )and the model is plotted with a higher resolution than thedata for clarity ( xspec command iplot eemo ). To obtainthis fit, I started from the Model 3 parameters of Mastrose-rio et al. (2019). I then varied the ‘continuum’ parameters α , γ , φ A and φ B whilst keeping the other parameters fixed. Thisis necessary because I use a different reference band to Mas-troserio et al. (2019), and I use the cross-spectrum insteadof the complex covariance, meaning that the continuum pa-rameters have slightly different definitions. I then addition-ally thaw the parameters h , r in , Γ , log ξ , A Fe and /B . Thenew best fitting values of these parameters are consistentwith the Mastroserio et al. (2019) parameters within σ confidence. The resulting reduced χ is . / , which cor-responds to a null-hypothesis probability of , indicatinga formally acceptable fit.The Fortran code used for this section can be down-loaded from https://bitbucket.org/adingram/cross_spec_code/downloads/ . I have presented error formulae for the energy-dependentcross-spectrum and rms spectrum, and shown that these for-mulae are still valid when a single broad reference band isused. Here, I discuss the results of this work.
MNRAS , 1–13 (2019) ross-spectrum errors I have shown that a broad reference band can be used tocalculate the energy-dependent cross-spectrum, as long asa noise contribution is subtracted from the real part of thecross-spectrum for subject bands that are part of the refer-ence band. For photon counting noise, the noise contributiontends to be very small compared with the signal, since thecount rate in the broad reference band will always be muchgreater than that in an individual subject band. It is com-mon in the literature to avoid trivial correlations betweenthe subject and reference bands by subtracting the currentsubject band from the broad reference band, meaning thateach subject band is crossed with a slightly different refer-ence band. This only introduces a small bias, again becausethe reference band count rate is much larger than the sub-ject band count rate, but the method suggested here is farmore elegant and introduces no bias at all as long as thenoise level is well known.With freedom to define the reference band however wesee fit, we can ask the question: what is the optimal referenceband? In the most general case, we can define the referenceband light curve as a weighted sum of all the subject bandlight curves. We could then define the weightings in order tominimize e.g. (cid:205) N e n d | ˜ G ( n )|/| ˜ G ( n )| (see equation 18). However,this would become a rather involved process, and to a goodapproximation signal to noise is more or less optimized bymaximizing the reference band count rate. Therefore, formost practical applications, it is optimal for the referenceband to be the sum of all the subject bands. A possibleexception is if the phase normalization of the model thatwill be fit to the data is calculated self-consistently, in whichcase it is optimal to restrict the reference band only to well-calibrated energy channels (see Mastroserio et al. 2018 andIngram et al. 2019 for a detailed discussion). For the case of a model being fit to the frequency depen-dence of a single power and/or cross-spectrum (e.g. Rapis-arda et al. 2017a,b), the Bendat & Piersol (2010) formu-lae should be used to calculate error bars. For the case ofa model being fit to the energy dependence of the powerand/or cross-spectrum (including the time / phase lags) ina single frequency range (e.g. Cackett et al. 2014; Chainakunet al. 2016), the new formulae presented in this work shouldbe used to calculate error bars. Similarly, if a model is beingfit to the energy-dependent power and/or cross-spectrum fora number of frequency ranges, and the model normalizationis a free parameter for different frequency ranges , the newformulae should be used. This is because, in this case, the fitto the energy-dependent cross-spectrum for each frequencyrange is completely independent of the fit to the energy-dependent cross-spectrum in every other frequency range.The formula for a single frequency range therefore applies.This is most certainly the case for any frequency resolvedspectroscopy studies, where spectral models are fit to theenergy-dependent rms or covariance spectrum for differentfrequency ranges (e.g. Revnivtsev et al. 1999; Gilfanov et al.2000; Axelsson et al. 2013; Peille et al. 2015; Axelsson &Done 2018). This is also case for the reltrans model, sincein this model the amplitude of the energy-dependent cross- spectrum in the frequency range centered on frequency ν isset by the normalization parameter α ( ν ) , which is free to takeany value for each frequency range in the fit . The new for-mulae should also be used for the X-ray polarimetry-timingmethod of Ingram & Maccarone (2017), meaning that scopefor detecting X-ray polarization QPOs is a little more posi-tive than the analysis presented in that paper suggests.However, if a model predicts the frequency and energydependence of the cross-spectrum and/or power spectra,then the Bendat & Piersol (2010) formulae should be used.This is because the normalization of the energy-dependentmodel is now not free to take any value that minimizes χ fora given frequency range. This is the case for the propagatingmass accretion rate model propfluc (e.g. Ingram & Done2012; Ingram & van der Klis 2013; Rapisarda et al. 2016),which has so far only been applied to a single cross-spectrumand the power spectra of two bands, but could in principlebe extended to multiple energy bands. This would also bethe case if the reltrans model were changed such that afrequency dependent model for α ( ν ) were implemented (e.g.a zero-centered Lorentzian). I have extensively compared the formulae derived here withthe Bendat & Piersol (2010) formulae, and discussed thelimits in which each set of equations should be used. It isimportant to also discuss other formulae in the literaturethat cover slightly different limits.
The error on the rms spectrum has previously been derivedin the γ = limit, first empirically from Monte Carlo sim-ulations by Vaughan et al. (2003) and later analytically byWilkinson (2011). Taking equation (22) from this paper andsetting γ = yields an equation almost identical to the γ = limit formula quoted in Uttley et al. (2014) (their equation14), except their equation has a in the denominator in-stead of a . This turns out to result from a very subtlemistake made when converting from the time domain excessvariance considered in Vaughan et al. (2003) (equation 11therein) and Wilkinson (2011) to the frequency domain rmsconsidered in Uttley et al. (2014). The variance in the earliertwo papers is calculated in the time domain for N t time bins(this is called N in Vaughan et al. 2003 and Wilkinson 2011,but I have re-named it in an attempt to avoid confusion).This corresponds, via Parseval’s theorem, to the integral ofthe power spectrum over all N t / positive Fourier frequen-cies. In the notation of Uttley et al. (2014), the power spec-tra are averaged over K frequency bins and M light curvesegments, such that the total number of realizations beingaveraged over is N = K M . The analysis of Vaughan et al.(2003) essentially uses K = N t / and M = (integratingover all frequency ranges but doing no ensemble averagingover light curve segments). Therefore, the formula for the Although note that in a plot of best fitting α against frequency,the error bars will look too small because the uncertainty in α contributed by the reference band is effectively treated by thenew formulae as a systematic error rather than a statistical error.MNRAS000
The error on the rms spectrum has previously been derivedin the γ = limit, first empirically from Monte Carlo sim-ulations by Vaughan et al. (2003) and later analytically byWilkinson (2011). Taking equation (22) from this paper andsetting γ = yields an equation almost identical to the γ = limit formula quoted in Uttley et al. (2014) (their equation14), except their equation has a in the denominator in-stead of a . This turns out to result from a very subtlemistake made when converting from the time domain excessvariance considered in Vaughan et al. (2003) (equation 11therein) and Wilkinson (2011) to the frequency domain rmsconsidered in Uttley et al. (2014). The variance in the earliertwo papers is calculated in the time domain for N t time bins(this is called N in Vaughan et al. 2003 and Wilkinson 2011,but I have re-named it in an attempt to avoid confusion).This corresponds, via Parseval’s theorem, to the integral ofthe power spectrum over all N t / positive Fourier frequen-cies. In the notation of Uttley et al. (2014), the power spec-tra are averaged over K frequency bins and M light curvesegments, such that the total number of realizations beingaveraged over is N = K M . The analysis of Vaughan et al.(2003) essentially uses K = N t / and M = (integratingover all frequency ranges but doing no ensemble averagingover light curve segments). Therefore, the formula for the Although note that in a plot of best fitting α against frequency,the error bars will look too small because the uncertainty in α contributed by the reference band is effectively treated by thenew formulae as a systematic error rather than a statistical error.MNRAS000 , 1–13 (2019) Adam Ingram error on the rms spectrum quoted in Uttley et al. (2014)(equation 14 therein) must be divided by √ in order to becorrect in the γ = limit. The error on the covariance spectrum in the γ = limit isalso quoted in Uttley et al. (2014) (equation 15 therein). Thisformula was derived by Wilkinson & Uttley (2009) usingequations from Bartlett (1955). We can compare this withformula derived in this work by taking equation (18) andsetting γ = to get d | ˜ C ( n )| = (cid:118)(cid:116) σ r σ ( n ) + σ , noise σ ( n ) N σ r , (51)where σ r ≡ ∆ ν [ ˜ P r − P r , noise ] and σ , noise ≡ ∆ ν P r , noise . Thisformula is similar to equation 15 in Uttley et al. (2014),except their equation has an extra term in the numera-tor: σ ( n ) σ , noise in the notation of this paper. This ex-tra term arises because the formulae in Bartlett (1955) do account for the signal from the reference band beingthe same for each subject band, but don’t account for thenoise in the reference band being the same for every sub-ject band. However, there is only one reference band lightcurve used to make the energy-dependent cross-spectrum,with only one signal contribution and one noise contribution.Equation 15 from Uttley et al. (2014) can be reproducedby using Var j { term 4 } = γ ( n )| S ( n )| P r , noise /( N ) in place of Var j { term 4 } = (see equation 40) in the derivation for theerror on the real and imaginary parts of the cross-spectrumpresented in Section 4.1, and finally setting γ = . It seems possible that the use of incorrect error formulaein the literature has led to many erroneous results. This isnot the case though - the conclusions of previous studies areunlikely to be significantly changed by adoption of the newerror formulae (see e.g. Section 5), since the new formulaeare in the most part subtle corrections to the old formulae.That said, it is clearly optimal to use the new formulae whereappropriate in future, since they are formally correct and noharder to implement than the old formulae.There are many studies of lag versus energy spectra inthe literature that use the Bendat & Piersol (2010) errorformula (e.g. Kara et al. 2013, 2016; Cackett et al. 2014;Chainakun et al. 2016). Although this formula gives errorbars that are slightly too large, Fig. 1 (bottom panel) showsthat the resulting over-fitting is very subtle indeed, andtherefore should not be a cause for concern. In contrast, anyfrequency resolved spectroscopy studies that employed theBendat & Piersol (2010) error formulae will have sufferedfrom severe over-fitting (see the third panel down of Fig.1). However, most if not all frequency resolved spectroscopystudies use either the Vaughan et al. (2003) formula for theerror on the rms in the γ = limit or the Wilkinson & Ut-tley (2009) formula for the error on the covariance in the γ = limit. As discussed in the previous sub-section, thereis a small mistake in the version of the Vaughan et al. (2003) formula quoted in Uttley et al. (2014), but this is only a fac-tor of √ and it is not clear whether it has propagated intothe literature at all. The previous sub-section also discussesan extra term in the Wilkinson & Uttley (2009) formulathat technically should not be included, but in practice isvery small if a broad reference band is used. Therefore, theerror formulae applied in previous studies (e.g. Wilkinson& Uttley 2009; Axelsson et al. 2014) can all be consideredapproximately correct, although in future I recommend theuse of the new formulae presented here. I have derived new error formulae for the energy-dependentcross-spectrum and rms spectrum that are valid for anyvalue of intrinsic coherence. I present errors for the energy-dependent cross-spectrum in terms of real and imaginaryparts (equation 18) and in terms of modulus (equation 18)and phase (equation 19). The error on the rms spectrum isequation (22).I have shown that it is optimal to use a broad referenceband, constructed by summing the flux from many subjectbands. In this case, trivial correlations arise between the ref-erence band and the subject bands that are included in thereference band. It is common in the literature to avoid thesetrivial correlations by subtracting the current subject bandcount rate from the reference band, leading to a slightly dif-ferent reference band being used for each subject band. HereI show that this is not necessary. Instead, the same broadreference band should be used for each subject band. If thecurrent subject band is included in the reference band, thenthe trivial correlations can be accounted for by subtracting anoise contribution from the real part of the cross-spectrum.Finally, I have discussed the general form of the biasterm that arises when calculating the modulus of the cross-spectrum (equation 17). In order to accurately calculate theintrinsic coherence, it is necessary to set up an iteration loop,since the intrinsic coherence and bias depend on one another.I find for an example of hard state Cygnus X-1 data that thisiteration loop converges quickly and the measured intrinsiccoherence is similar to that measured using the γ = limitof the bias term (which has been employed in the literaturein the past). This is mainly because the intrinsic coherenceis very close to unity for the observation considered.A Fortran code that employs the new error for-mulae, a broad reference band and uses an iterationloop to properly calculate the intrinsic coherence canbe downloaded from https://bitbucket.org/adingram/cross_spec_code/downloads/ . This code was used for theCygnus X-1 example discussed in Section 5. Comprehensiveuse instructions can be found in the download repository. ACKNOWLEDGEMENTS
I thank Michiel van der Klis, Edward Nathan, GuglielmoMastroserio and Phil Uttley for valuable discussions. I ac-knowledge support from the Royal Society, and useful com-ments from the anonymous referee that improved the paper.
MNRAS , 1–13 (2019) ross-spectrum errors REFERENCES
Arnaud K. A., 1996, in G. H. Jacoby & J. Barnes ed., Astronom-ical Society of the Pacific Conference Series Vol. 101, Astro-nomical Data Analysis Software and Systems V. pp 17–+Axelsson M., Done C., 2018, MNRAS, 480, 751Axelsson M., Hjalmarsdotter L., Done C., 2013, MNRAS, 431,1987Axelsson M., Done C., Hjalmarsdotter L., 2014, MNRAS, 438,657Bartlett M. S., 1955, An introduction to stochastic processes.Cambridge University Press, 1955Bendat J. S., Piersol A. G., 2010, Random Data: Analysis andmeasurement procedures; 4th edition. Wiley, 2010Bult P., 2017, ApJ, 837, 61Cackett E. M., Zoghbi A., Reynolds C., Fabian A. C., Kara E.,Uttley P., Wilkins D. R., 2014, MNRAS, 438, 2980Chainakun P., Young A. J., Kara E., 2016, MNRAS, 460, 3076De Marco B., Ponti G., 2016, preprint, ( arXiv:1605.00765 )De Marco B., Ponti G., Cappi M., Dadina M., Uttley P., CackettE. M., Fabian A. C., Miniutti G., 2013, MNRAS, 431, 2441De Marco B., et al., 2017, MNRAS, 471, 1475Fabian A. C., et al., 2009, Nature, 459, 540Gilfanov M., Churazov E., Revnivtsev M., 2000, MNRAS, 316,923Huppenkothen D., Bachetti M., 2018, ApJS, 236, 13Ingram A., Done C., 2012, MNRAS, 419, 2369Ingram A. R., Maccarone T. J., 2017, MNRAS, 471, 4206Ingram A., van der Klis M. v. d., 2013, MNRAS, 434, 1476Ingram A., van der Klis M., Middleton M., Done C., AltamiranoD., Heil L., Uttley P., Axelsson M., 2016, MNRAS, 461, 1967Ingram A., Mastroserio G., Dauser T., Hovenkamp P., vander Klis M., Garc´ıa J. A., 2019, arXiv e-prints, p.arXiv:1906.08310Kara E., Fabian A. C., Cackett E. M., Uttley P., Wilkins D. R.,Zoghbi A., 2013, MNRAS, 434, 1129Kara E., Alston W. N., Fabian A. C., Cackett E. M., Uttley P.,Reynolds C. S., Zoghbi A., 2016, MNRAS, 462, 511Kara E., et al., 2019, Nature, 565, 198Kotov O., Churazov E., Gilfanov M., 2001, MNRAS, 327, 799Lyubarskii Y. E., 1997, MNRAS, 292, 679Maccarone T. J., Uttley P., van der Klis M., Wijnands R. A. D.,Coppi P. S., 2011, MNRAS, 413, 1819Mahmoud R. D., Done C., 2018, preprint, ( arXiv:1803.04811 )Mastroserio G., Ingram A., van der Klis M., 2018, MNRAS, 475,4027Mastroserio G., Ingram A., van der Klis M., 2019, arXiv e-prints,p. arXiv:1906.08266McHardy I. M., Koerding E., Knigge C., Uttley P., Fender R. P.,2006, Nature, 444, 730Nowak M. A., Vaughan B. A., Wilms J., Dove J. B., BegelmanM. C., 1999, ApJ, 510, 874Peille P., Barret D., Uttley P., 2015, ApJ, 811, 109Rapisarda S., Ingram A., Kalamkar M., van der Klis M., 2016,MNRAS, 462, 4078Rapisarda S., Ingram A., van der Klis M., 2017a, MNRAS, 469,2011Rapisarda S., Ingram A., van der Klis M., 2017b, MNRAS, 472,3821Revnivtsev M., Gilfanov M., Churazov E., 1999, A&A, 347, L23Timmer J., Koenig M., 1995, A&A, 300, 707Uttley P., McHardy I. M., Vaughan S., 2005, MNRAS, 359, 345Uttley P., Cackett E. M., Fabian A. C., Kara E., Wilkins D. R.,2014, A&ARv, 22, 72Vaughan B. A., Nowak M. A., 1997, ApJ, 474, L43Vaughan S., Edelson R., Warwick R. S., Uttley P., 2003, MNRAS,345, 1271Weisskopf M. C., et al., 2016, in Society of Photo-Optical In- strumentation Engineers (SPIE) Conference Series. p. 990517,doi:10.1117/12.2235240Wilkinson T., 2011, PhD thesis, University of SouthamptonWilkinson T., Uttley P., 2009, MNRAS, 397, 666Wilms J., Allen A., McCray R., 2000, ApJ, 542, 914Zhang W., Jahoda K., Swank J. H., Morgan E. H., Giles A. B.,1995, ApJ, 449, 930Zoghbi A., Fabian A. C., Reynolds C. S., Cackett E. M., 2012,MNRAS, 422, 129van der Klis M., 1989, in ¨Ogelman H., van den Heuvel E. P. J.,eds, NATO Advanced Science Institutes (ASI) Series C Vol.262, NATO Advanced Science Institutes (ASI) Series C. p. 27van der Klis M., 2006, Advances in Space Research, 38, 2675van der Klis M., Hasinger G., Stella L., Langmeier A., vanParadijs J., Lewin W. H. G., 1987, ApJ, 319, L13This paper has been typeset from a TEX/L A TEX file prepared bythe author.MNRAS000