A cautionary tale in fitting galaxy rotation curves with Bayesian techniques: does Newton's constant vary from galaxy to galaxy?
Pengfei Li, Federico Lelli, Stacy McGaugh, James Schombert, Kyu-Hyun Chae
AAstronomy & Astrophysics manuscript no. TestGN © ESO 2021February 1, 2021 L etter to the E ditor A cautionary tale in fitting galaxy rotation curves with Bayesiantechniques: does Newton’s constant vary from galaxy to galaxy?
Pengfei Li (cid:63) , Federico Lelli , Stacy McGaugh , James Schombert , and Kyu-Hyun Chae Department of Astronomy, Case Western Reserve University, Cleveland, OH 44106, USA School of Physics and Astronomy, Cardi ff University, Queens Buildings, The Parade, Cardi ff , CF24 3AA, UK Department of Physics, University of Oregon, Eugene, OR 97403, USA Department of Physics and Astronomy, Sejong University, 209 Neungdong-ro Gwangjin-gu, Seoul 05006, Republic of KoreaReceived xxx; accepted xxx
ABSTRACT
The application of Bayesian techniques to astronomical data is generally non-trivial because the fitting parameters can be stronglydegenerated and the formal uncertainties are themselves uncertain. An example is provided by the contradictory claims over thepresence or absence of a universal acceleration scale (g † ) in galaxies based on Bayesian fits to rotation curves. To illustrate thesituation, we present an analysis in which the Newtonian gravitational constant G N is allowed to vary from galaxy to galaxy whenfitting rotation curves from the SPARC database, in analogy to g † in the recently debated Bayesian analyses. When imposing flatpriors on G N , we obtain a wide distribution of G N which, taken at face value, would rule out G N as a universal constant with highstatistical confidence. However, imposing an empirically motivated log-normal prior returns a virtually constant G N with no sacrificein fit quality. This implies that the inference of a variable G N (or g † ) is the result of the combined e ff ect of parameter degeneraciesand unavoidable uncertainties in the error model. When these e ff ects are taken into account, the SPARC data are consistent with aconstant G N (and constant g † ). Key words. galaxies: dwarf — galaxies: irregular — galaxies: kinematics and dynamics — galaxies: spiral — dark matter
1. Introduction
Bayesian analysis has been extensively implemented in as-tronomical studies due to the frequent occurrence of multi-parameter problems. A major philosophy of Bayesian inferenceis to incorporate reliable priors to break the parameter degen-eracies. This is best achieved by imposing physically motivatedpriors which, however, are not always readily available. The situ-ation becomes even more di ffi cult when the input formal uncer-tainties are themselves uncertain, as is often the case for het-erogeneous collections of data. Thus, any formal outcome ofBayesian analysis should be interpreted with a proper appreci-ation for both the strengths and limitations of the source data.One example is the claim of “absence of a fundamental ac-celeration scale in galaxies" by Rodrigues et al. (2018a) andthe follow-up work by Marra et al. (2020). They fitted rota-tion curves of disk galaxies from the SPARC database (Lelliet al. 2016) using the radial acceleration relation (RAR, Mc-Gaugh et al. 2016; Lelli et al. 2017). The RAR is a tight em-pirical relation between the observed kinematic acceleration g obs and the baryonic gravitational field g bar with a characteristicscale g † (cid:39) − m s − . Its asymptotic behaviors agree with thepredictions of modified Newtonian dynamics (MOND, Milgrom1983), hence the RAR can be interpreted as a viable realizationof MOND only if the empirical scale g † is a universal constant.Rodrigues et al. (2018a) tested the universality of g † by fit-ting the RAR to individual galaxies imposing flat priors on three (cid:63) Email: [email protected], [email protected] Spitzer Photometry and Accurate Rotation Curves. All data are avail-able at astroweb.case.edu/SPARC . fitting parameters (acceleration scale g † , galaxy distance, andstellar mass-to-light ratio) and taking formal uncertainties liter-ally for each and every rotation curve. They obtained a wide dis-tribution of g † and concluded that a universal acceleration scaleis ruled out. After substantial criticism about their methodologyand interpretation of data (McGaugh et al. 2018; Kroupa et al.2018; Cameron et al. 2020), they presented a follow-up analysisin Marra et al. (2020), in which they follow Li et al. (2018) inadopting physically motivated Gaussian priors on distance, in-clination, and stellar mass-to-light ratio. They persist in using aflat prior on g † , resulting in severe parameter degeneracies. Sim-ilar work was presented by Chang & Zhou (2019) using a verybroad, uninformative prior on g † .The mere existence and tightness of the RAR and of the bary-onic Tully-Fisher relation (BTFR, e.g., Lelli et al. 2019) suggestan empirically motivated Gaussian prior centered around 10 − m s − . There cannot be more scatter in g † than there is in theraw forms of these relations prior to any rotation-curve fitting,yet this is exactly what is inferred in analyses where a broadprior on g † is assumed. Degeneracy can cause this parameter tovary widely with other parameters without providing a meaning-ful improvement in fit quality. This was already pointed out inLi et al. (2018), where we concluded that rotationally supportedgalaxies are consistent with a single value of g † : there is no valueadded in allowing it to vary. The same argument applies to pres-sure supported galaxies (Chae et al. 2020a) and when the MONDexternal-field e ff ect is taken into account (Chae et al. 2020b).Comparing the posterior distribution functions of g † from in-dividual galaxy fits implies that one can fully trust the formaluncertainties in each and every case. The observational uncer- Article number, page 1 of 4 a r X i v : . [ a s t r o - ph . GA ] J a n & A proofs: manuscript no. TestGN tainties of SPARC galaxies are surely sensible on average; forexample, the mean expected scatter in the RAR from error prop-agation is comparable with the observed scatter (e.g., Lelli et al.2017). However, taking formal uncertainties of each and everyrotation curve literally is not recommended because they wereculled from multiple sources (Lelli et al. 2016). More generally,we can measure the Doppler shift in di ff erent parts of galaxieswith high accuracy, but dynamical analysis demands knowledgeof the circular velocity of a test particle in the equilibrium gravi-tational potential. The gas rotational velocity is arguably the bestpossible proxy to the circular velocity, but uncertainties in someindividual galaxies (e.g., due to non-circular motions or out-of-equilibrium dynamics) can never be fully under control.To further illustrate how a blind application of Bayesianstatistics with uninformative priors could mislead us, we presenta “reductio ad absurdum” using Newton’s gravitational constant G N as a free parameter in a similar fashion as g † . Choosing G N has a key advantage: since it is widely accepted as a con-stant, comparing the Bayesian inference with the expected valueprovides a clear evaluation on the approach of Rodrigues et al.(2018a) with respect to that of Li et al. (2018). In Section 2, wereproduce the results by Rodrigues et al. (2018a) using similardata and priors but for G N instead of g † . In Section 3, we repeatthe analysis of Li et al. (2018) exploring di ff erent priors on G N .We discuss our results in Section 4.
2. Challenging the constancy of Newtoniangravitational constant
In analogy to previous works (Li et al. 2018; Rodrigues et al.2018a; Chang & Zhou 2019; Marra et al. 2020), we fit the ro-tation curves from the SPARC database (Lelli et al. 2016) usingthe RAR. The SPARC database consists of 175 late-type galax-ies with accurate H I / H α rotation curves and mass modeling fromSpitzer photometry at 3.6 µ m. The gravitational contributions ofdi ff erent baryonic components (gas disk, stellar disk, and bulgeif applicable) are represented as the circular velocity of a testparticle ( V gas , V disk , and V bulge , respectively). The total rotationvelocity due to baryonic contributions V bar is then the quadraticsum of these components: V = V gas | V gas | + Υ disk V + Υ bulge V , (1)where Υ disk and Υ bulge are the stellar mass-to-light ratios for disksand bulges with the fiducial values Υ disk = . Υ bulge = . obs to that due to baryons g bar :g obs = g bar − e − √ g bar / g † , (2)where g obs = V / R , g bar = V / R , and g † is the only freeparameter. This relation was established by analyzing the datapoints from 153 galaxies in a statistical sense. We aim to testthe constancy of G N , so we fit Eq. 2 to rotation curves fixingg † = . × − m s − and varying G N from galaxy to galaxy.We also tried to vary g † and G N simultaneously, obtaining sim-ilar results. We show as an example the case that has the samenumber of fitting parameters as in Rodrigues et al. (2018a).We set the likelihood function as L = e − χ with χ given by χ = (cid:88) R [g obs (R) − g RAR (R)] δ g (R) , (3) Galaxy l og G N k p c k m s − M − fl Fiducial value1 σ region2 σ region3 σ region Fig. 1.
Posterior probability distribution of Newton’s gravitational con-stant G N for the galaxies in the SPARC database (Lelli et al. 2016).Following a similar analysis by Rodrigues et al. (2018a) about the ac-celeration scale g † , galaxies are ordered for increasing value of G N andflat priors are imposed on galaxy distances, stellar mass-to-light ratios,and G N (see text for details). Black dots show the maximum of the pos-terior probability, together with 1 σ , 2 σ and 3 σ credible regions fromthe Python package GetDist. The expected value of G N is representedby the dashed line. Among the selected 101 galaxies, 34 are incompat-ible with the expected G N at the 3 σ level. Taken at the face value, thiswould imply that the gravitational constant is not constant. where g RAR is the expected centripetal acceleration from Eq. 2and δ g obs = obs δ V obs R is the uncertainty on the observed accel-eration. We impose a flat prior on G N with G N > † . Following the procedure of Rodrigueset al. (2018a), we also impose flat priors on Υ (cid:63) with a toleranceof a factor of two, though this is not consistent with stellar pop-ulation synthesis models (Schombert et al. 2019).Galaxy distance a ff ects the fit quality as it is directly relatedto the contributions of the baryonic components. When the dis-tance D is changed to D (cid:48) , the galaxy radius R and the contribu-tion of each component V (cid:48) k will transform according to R (cid:48) = R D (cid:48) D , V (cid:48) k = V k (cid:114) D (cid:48) D , (4)where k denotes gas, disk, or bulge. Rodrigues et al. (2018a)allow galaxy distance to vary freely within 20% of their obser-vational values. This prior does not reflect actual observationalerrors. Distances measured through Cepheids or the tip mag-nitude of the red giant branch are highly accurate, so a 20%variation could be larger than 4 σ . The Hubble flow method ismuch less accurate; distances measured using this method couldhave errors up to 30%, so the imposed 20% free range is evenwithin the 1 σ region. Rodrigues et al. (2018a) also ignored theuncertainty on disk inclination. Possible outer asymmetries andwarps in the gas disk could mislead the determination of incli-nations, as reflected in their uncertainties, which would in turna ff ect the observed rotation velocities. Despite the shortcomingsof the methodology of Rodrigues et al. (2018a), we choose to fitthe same parameters (except G N instead of g † ) and impose thesame priors to achieve the most direct comparison.We map the posterior distributions of the fitting param-eters using the standard a ffi ne-invariant ensemble sampler inthe Python package emcee (Foreman-Mackey et al. 2013). TheMarkov Chains are initialized with 200 random walkers and iter-ated for 2000 steps after 1000 burnt-in iterations. We record the Article number, page 2 of 4engfei Li et al.: Does Newton’s constant vary from galaxy to galaxy? -1 χ ν C D F Gaussian priors on G N Flat priors on G N Fixed G N log( G N × ) [kpc km s − M − fl ] N u m b e r o f P o i n t s G N × (kpc km s − M − fl ) N u m b e r o f P o i n t s Fig. 2.
Left: cumulative distribution functions of χ ν fixing G N (red line) or imposing empirical lognormal priors (blue line) and flat priors (blackline) on G N . The three cases show indistinguishable fit qualities. Right: histograms of the maximum-likelihood G N . The dark and light bluehistograms correspond to lognormal and flat priors on G N , respectively. The inset panel shows a zoom-in distribution for the Gaussian prior,switched to linear scale for a better view. The actual value of G N is indicated by red, dashed lines. The Gaussian prior returns a tight distributionof G N without decreasing the fit quality: this confirms the constancy of G N as expected. The same argument has been applied for the constancy ofg † in Li et al. (2018). best-fit parameters that maximize the probability function for ev-ery galaxy.Figure 1 shows the posterior probability distribution of G N .Though we analyze all the SPARC galaxies, we only include101 of them in Figure 1 based on the following quality cuts: weremoved galaxies with quality flag Q = ◦ , or reduced χ ν > .
0. This is a sim-ilar quality cut as adopted in Rodrigues et al. (2018a), and itretains a similar number of galaxies. Figure 1 shows that the pos-terior probability distribution of G N presents a wide distributionspanning ∼ σ , 2 σ , and 3 σ credi-ble regions (including 68.3%, 95.4% and 99.7% of the posteriorprobability, respectively) from the output of “getMargeStats" inthe Python package GetDist . It turns out that 34 galaxies outof 101 are incompatible with the expected G N . This is similarto Figure 1 of Rodrigues et al. (2018a), showing a wide distribu-tion of a (which we call g † to distinguish empirical free parame-ters from physical constants). Rodrigues et al. (2018a) found that31% of the galaxies are rejected from the global best-fit g † at 3 σ level. Hence, they infer the “absence of a fundamental accelera-tion scale in galaxies”. We find 34% of the galaxies are excludedfrom the expected value of G N at 3 σ level. Following their argu-ment, we reach the conclusion of the absence of a fundamentalNewtonian gravitational constant, ruling out Newtonian gravityand General Relativity.There are two interconnected problems here: (i) flat priorscan give G N too much freedom to go astray, and (ii) formal un-certainties on circular velocities (as well as distances and inclina-tions) are themselves uncertain, so the confidence regions fromthe posterior distribution functions of individual galaxies have tobe taken with a grain of salt. With highly accurate observationaldata (e.g. the orbital data of the planets in the solar system), onewould expect that flat priors would return consistent values of G N . For observational data like galaxy rotation curves, there canoccur cases where parameters co-vary wildly given the imper-fect nature of the input data and their uncertainties (e.g. warpeddisks, complex non-circular motions, etc.). Since flat prior does https://getdist.readthedocs.io not provide any constraints against these e ff ects, the fitting pa-rameters can go astray out of reasonable regions.
3. Confirming the constancy of Newtoniangravitational constant
To further demonstrate the degeneracy problem, we compare fitqualities imposing di ff erent priors on G N . Similar work has beendone in Li et al. (2018) for g † . In that paper, we found that impos-ing flat and Gaussian priors on g † result in significantly di ff erentdistributions of its maximum likelihood value, but essentially re-turn the same fit quality. We concluded that adding g † as a fittingparameter does not improve fit quality, thus the data are con-sistent with a single value of g † . The validity of this argumentshould be straightforward, but given the recent claims of Chang& Zhou (2019) and Marra et al. (2020), we repeat the analysisof Li et al. (2018) varying G N .In Li et al. (2018), we include disk inclination as a fittingparameter as it a ff ects the observational rotation velocities andtheir uncertainties. When the inclination is adjusted from i to i (cid:48) , the observational rotation velocities and the uncertainties willtransform as V (cid:48) obs = V obs sin i sin i (cid:48) ; δ V (cid:48) obs = δ V obs sin i sin i (cid:48) , (5)where V obs and δ V obs are observational velocities and uncer-tainties, respectively. We also impose more realistic priors thanRodrigues et al. (2018a) on the fitting parameters: a lognormalprior on the stellar mass-to-light ratio with a standard deviationof 0.1 dex inspired by stellar population synthesis model (e.g.,Schombert et al. 2019), and Gaussian priors on galaxy distanceand disk inclination with the standard deviations given by theirobservational uncertainties. The same priors on these nuisance,galactic parameters are imposed here.For G N we follow two di ff erent approaches: (1) we imposean uninformative flat prior within [10 − , 10 − ] kpc km s − M − (cid:12) which is known to give unrealiable results (as shown in the pre-vious section), and (2) we use a lognormal prior following theso-called “empirical Bayes approach”. The lognormal prior is Article number, page 3 of 4 & A proofs: manuscript no. TestGN centered around the expected value G N = . × − kpc km s − M − (cid:12) , while its standard deviation is empirically motivatedusing the BTFR (Lelli et al. 2019). The BTFR is mathematicallyequivalent to the low-acceleration portion of the RAR (see Sect.7.1 in Lelli et al. 2017) and can be expressed as:log( V f ) = .
25 log( M b ) + .
25 log( g † G N / X ) , (6)where V f is the flat rotation velocity, M b is the total baryonicmass, and X is a factor of order unity that accounts for the cylin-drical geometry of disks (McGaugh et al. 2018). Fitting the datafrom Lelli et al. (2019) in log-log space with LTS_LINEFIT(Cappellari et al. 2013), we find that the best-fit slope is indistin-guishable from 0.25 and the vertical intrinsic scatter is 0.02 dexalong V f . This sets a very hard upper limit for the scatter on G N or g † given that galaxy-to-galaxy variations in X may also con-tribute. Any intrinsic variation in log( G N ) or log( g † ) just cannotbe larger than 0.02 dex. Thus, we consider a standard deviationof 0.02 dex for the lognormal prior on G N . We stress that G N and g † enter in Eq. 6 in the same fashion, so the same argument canbe applied to g † analogously to Li et al. (2018).The left panel of Figure 2 shows the cumulative distributionfunction (CDF) of the reduced χ for both flat and lognormal pri-ors on G N . We stress that χ ν is merely used as a first-order statis-tics to assess the overall quality of di ff erent Bayesian fits, so thearguments against its use outlined in Rodrigues et al. (2018b)do not apply. Moreover, we are not comparing individual galaxyfits, but their average, cumulative behavior that is more robustagainst the occasional overestimate or underestimate of formaluncertainties. Unsurprisingly, the results shown here are similarto Figures 6 and 7 in Li et al. (2018), though we are varying G N rather than g † . The CDF shows that imposing flat and lognormalpriors essentially give the same fit quality. For reference, we alsoinclude the results with fixed G N (red solid line), which shows aslightly better CDF of χ ν despite it has one less fitting parameter.This occurs because the rotation-velocity residuals are compara-ble, so the smaller number of fitting parameters f for the samenumber N of data points improves the value of χ ν = χ / ( N − f ).Therefore, there is no added value in varying G N .An important thing to note about the CDF in Figure 2 is thatthere are too many galaxies with χ ν that is too high and also toomany for which it is too low relative to the expected distributionfor χ ν . If we take the cases with high χ ν at face value, then the fitsare formally rejected at high confidence. However, we have alsomade fits to the same data with dark matter halos (Li et al. 2020),and obtain CDF with the same structure for all of the halo typesconsidered (NFW, pseudo-isothermal, Burkert, Einasto, DC14,coreNFW, and Lucky13). The same galaxies have χ ν that are toohigh for all models of any type. Taking this at face value, nei-ther dark matter nor MOND can explain the data. Rather thanconclude that all conceivable models are excluded, we infer thatthe uncertainties are underestimated in these cases. Similarly, theuncertainties for galaxies with very low χ ν have likely been over-estimated. This is a common occurrence in astronomy, wherethere is often considerable error in the uncertainties.The right panel of Figure 2 shows the di ff erent distributionsfor the maximum-likelihood G N . The lognormal prior gives avery tight distribution around the fiducial value. It appears as asingle column in log space and can be resolved by zooming-inin linear scale. This is essentially consistent with a single valueof G N . In contrast, the flat prior results in a wide distributionspanning ∼ ff erent number of parameters and imposing di ff er-ent priors on the galactic parameters. It suggests that the wideposterior distribution is not specific to our case, but a general conclusion of flat priors on G N or g † . Following the same argu-ment made in Li et al. (2018) about g † , we conclude that G N isindeed constant in galaxies. If insteadi, we follow the reasoningof Rodrigues et al. (2018b), we would conclude that G N is notconstant, but varies from galaxy to galaxy, just as they concludefor g † . In essence, they believe that there is a meaningful di ff er-ence between galaxies on one side of the histogram from thoseon the other, while we do not.
4. Discussion and Conclusion
In this paper, we present an inference that Newton’s constantvaries as a “reductio ad absurdum”. This highlights the dangersof degeneracies in parameter estimation from Bayesian fits whenthe input data are complex and their formal uncertainties can-not be taken too literally. This is motivated by the recent worksof Rodrigues et al. (2018a), Chang & Zhou (2019), and Marraet al. (2020), which claim that the acceleration scale g † in theempirical RAR varies from galaxy to galaxy. This conclusionis misleading; if we apply the same logic to G N , we infer thatthis fundamental constant varies from galaxy to galaxy. Simi-larly, their conclusion that MOND is ruled out at high statisticalsignificance also applies to fits with dark matter halos (Li et al.2020): all conceivable models are ruled out if we take the errorbars literally.While the arguments presented here are specific to rotation-curve fits of disk galaxies, they teach us a general lesson aboutthe application of broad priors in Bayesian analyses of astronom-ical data, where there is often considerable uncertainty in the un-certainties. The blind use of Bayesian statistics without properlyconsidering the parameter degeneracy and the uncertain natureof formal errors can lead to absurd conclusions. We use G N ingalaxies as an example, as it provides a straightforward compar-ison to the works by Rodrigues et al. (2018a), Chang & Zhou(2019), and Marra et al. (2020). The same issues could arise inother astronomical data sets. Acknowledgements.
We thank Harry Desmond for useful discussions. This workwas supported in part by NASA ADAP grant 80NSSC19k0570. K.-H. C. wassupported in part by the National Research Foundation of Korea (NRF) grantfunded by the Korea government (MSIT) (No. NRF-2019R1F1A1062477).
References