How Analytic Choices Can Affect the Extraction of Electromagnetic Form Factors from Elastic Electron Scattering Cross Section Data
Scott K. Barcus, Douglas W. Higinbotham, and Randall E. McClellan
Jefferson Lab, Newport News, VA 23606
Scientists often try to incorporate prior knowledge into their regression algorithms, such as a particular analytic behavior or a known value at a kinematic endpoint. Unfortunately, there is often no unique way to make use of this prior knowledge, and thus, different analytic choices can lead to very different regression results from the same set of data. To illustrate this point in the context of the proton electromagnetic form factors, we use the Mainz elastic data with its 1422 cross section points and 31 normalization parameters. Starting with a complex unbound non-linear regression, we will show how the addition of a single theory-motivated constraint removes an oscillation from the magnetic form factor and shifts the extracted proton charge radius. We then repeat both regressions using the same algorithm, but with a rebinned version of the Mainz dataset. These examples illustrate how analytic choices, such as the function that is being used or even the binning of the data, can dramatically affect the results of a complex regression. These results also demonstrate why it is critical when using regression algorithms to have either a physical model in mind or a firm mathematical basis.
Keywords: form factors; proton radius; confirmation bias; regression; robust methods
I. INTRODUCTION
Silberzahn et al. [1] point out that there is often little appreciation for how different analytic strategies can affect a reported result. In this work, we illustrate how analytic choices can impact the extraction of the electromagnetic form factors and the associated charge radii from electron scattering data. These extractions are frequently done with complex non-linear regression algorithms and tend to make use of prior information about the limiting behavior of the electromagnetic form factors to help constrain the values of experimental normalization parameters. Also, while many tend to look at all regressions as being the same, in fact there are different types of regressions, such as descriptive, predictive, and explanatory.

A descriptive model is used to capture the features of a dataset in a compact manner without reliance on an underlying theory. A predictive model is any statistical model which tries to generalize beyond the data that is being fitted. Finally, explanatory modeling takes a theory-based model and tests that model's hypothesis by applying it to data. Further details about these differences can be found in Ref. [2]. Though the type of regression model being developed is not always clearly stated, it is yet another choice that affects how scientists design their regression algorithms.
II. PROTON ELASTIC SCATTERING
There has been renewed interest in proton elastic scattering data due to muonic hydrogen Lamb shift results which determined the charge radius of the proton to be 0.84078(39) fm [3, 4], a result in stark contrast to the CODATA-2014 recommended value of 0.8751(61) fm [5]. This systematic difference was known as the proton radius puzzle [6–8]. We will show that determining the proton's charge radius is highly dependent on the analytic choices made when selecting a model to describe the world data.

In the plane-wave Born approximation, the cross section for elastic electron scattering on a proton is given by:

\sigma = \sigma_{\rm Mott} \times \left[ \frac{G_E^2(Q^2) + \tau\, G_M^2(Q^2)}{1 + \tau} + 2\tau\, G_M^2(Q^2) \tan^2\!\left(\frac{\theta}{2}\right) \right],   (1)

where \sigma_{\rm Mott} is the Mott cross section, G_E and G_M are the electric and magnetic Sachs form factors respectively, \tau = Q^2/(4 m_p^2), Q^2 = 4 E_{\rm Beam} E' \sin^2(\theta/2), E_{\rm Beam} is the energy of the electron beam, E' is the energy of the outgoing electron, \theta is the scattering angle of the outgoing electron, and m_p is the mass of the proton.

The proton charge radius, r_p, is extracted from the cross sections by determining the slope of the electric form factor, G_E, in the limit of four-momentum transfer, Q^2, approaching zero [9]:

r_p \equiv \left( -6 \left.\frac{dG_E(Q^2)}{dQ^2}\right|_{Q^2=0} \right)^{1/2}.   (2)

Since the scattering data are measured at finite Q^2, an extrapolation is required to extract the charge radius. A purely mathematical fit to the scattering data would be a descriptive model and would generally only be valid in the region of the data, making extrapolation risky. When extrapolating back to Q^2 = 0 it is desirable to use a predictive model. This requires extra care, such as adding physics considerations to the model and/or mathematical requirements to keep the fit well behaved (e.g. not diverging in the region) and not unduly complex.
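As a concrete illustration of Eqs. (1) and (2), the Born cross section and the radius extraction can be evaluated numerically. The following is a minimal sketch, not part of the original analysis; the function names and the use of hbar*c = 0.19733 GeV·fm to convert the slope to a radius in fm are our own conventions:

```python
import numpy as np

M_P = 0.9383     # proton mass [GeV]
HBARC = 0.19733  # hbar*c [GeV fm], converts GeV^-1 to fm

def q2_elastic(e_beam, theta):
    """Q^2 = 4 E E' sin^2(theta/2) for elastic ep scattering, with the
    outgoing electron energy E' fixed by two-body kinematics."""
    s2 = np.sin(theta / 2.0) ** 2
    e_prime = e_beam / (1.0 + 2.0 * e_beam * s2 / M_P)
    return 4.0 * e_beam * e_prime * s2

def born_ratio(q2, theta, g_e, g_m):
    """Eq. (1): the elastic cross section in units of the Mott cross section."""
    tau = q2 / (4.0 * M_P ** 2)
    return (g_e ** 2 + tau * g_m ** 2) / (1.0 + tau) \
        + 2.0 * tau * g_m ** 2 * np.tan(theta / 2.0) ** 2

def charge_radius_fm(slope_at_zero):
    """Eq. (2): r_p = sqrt(-6 dG_E/dQ^2 at Q^2 = 0); slope in GeV^-2, r_p in fm."""
    return np.sqrt(-6.0 * slope_at_zero) * HBARC
```

For example, the standard dipole form factor of Eq. (7) has slope -2/0.71 GeV^-2 at Q^2 = 0, which gives the familiar dipole radius of about 0.81 fm.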
Authors have taken many different approaches to this extraction, yielding different outcomes [10–20].

III. REGRESSIONS
To illustrate how analytic choice can strongly affect the extracted radius, we first use the Mainz dataset of 1422 cross section points along with its 31 normalization parameters [21]. As noted in the work of Bernauer et al. [22], knowledge of the absolute value of cross sections is limited by the determination of the absolute luminosity, which in turn is limited by the uncertainty of the target thickness and beam current. In order to compensate for these uncertainties, normalization parameters were introduced to the original fits of this data, constrained only by the value of the charge and magnetic form factors in the limit of Q^2 = 0. While there is no debate about the value of the form factors at Q^2 = 0, how the data at finite Q^2 are connected to the known endpoint brings a model dependence to the analysis that is not easily understood.

These parameters are taken in combinations to link sets of data together, with the final value of each cross section point defined by:

\sigma_{\rm exp} = \sigma_p \cdot {\rm normA}_p \cdot {\rm normB}_p,   (3)

where normA_p and normB_p are the two normalization parameters associated with that data point. A complete list of the 31 different normalization parameters, N_j, that are taken in 34 unique combinations for the 1422 points, is shown in Table I. Further details of how these parameters connect to each of the 1422 cross section points can be found in the supplemental material of Bernauer et al. [22].

TABLE I. The 34 different combinations of the 31 normalization parameters, N_j, found in Ref. [22] which link the data together, along with the number of data points and the Q^2 range of each dataset.

Energy    Spec.  normA  normB  Points  Q^2 Range [GeV^2]
180 MeV   B      N      N      106     0.0038 – 0.0129
          B      N      N       41     0.0101 – 0.0190
          A      N      -      102     0.0112 – 0.0658
          B      N      N       19     0.0190 – 0.0295
          C      N      N       38     0.0421 – 0.0740
          C      N      N       17     0.0740 – 0.0834
315 MeV   B      N      N      104     0.0111 – 0.0489
          A      N      N       38     0.0430 – 0.1391
          A      N      -       40     0.0479 – 0.1441
          C      N      N       62     0.1128 – 0.2131
450 MeV   B      N      N       77     0.0152 – 0.0572
          B      N      N       52     0.0572 – 0.1175
          A      N      -       42     0.0586 – 0.2663
          B      N      N       17     0.0589 – 0.0851
          A      N      N       36     0.0670 – 0.2744
          C      N      N       50     0.2127 – 0.3767
          A      N      -        2     0.2744 – 0.2744
585 MeV   B      N      N       41     0.0255 – 0.0433
          B      N      N       47     0.0433 – 0.1110
          A      N      -       27     0.0590 – 0.0964
          B      N      N       21     0.0920 – 0.1845
          A      N      -       37     0.0964 – 0.4222
          C      N      N       20     0.3340 – 0.5665
720 MeV   B      N      N       47     0.0711 – 0.1564
          A      N      -       46     0.1835 – 0.6761
          C      N      N       28     0.6536 – 0.7603
          B      N      N       27     0.2011 – 0.2520
          C      N      N       37     0.4729 – 0.7474
          B      N      N       36     0.1294 – 0.2435
855 MeV   B      N      N       35     0.3263 – 0.4378
          C      N      N       31     0.7300 – 0.9772
          A      N      N       32     0.3069 – 0.5011
          A      N      -       13     0.5274 – 0.7656
          B      N      N       54     0.0868 – 0.3263

For our example of how analytic choice can affect the outcome, we first fit the Mainz dataset with an unbound complex non-linear regression with the form factors parameterized in terms of polynomials:

G_{E,{\rm polynomial}}(a_{Ei}, Q^2) = 1 + \sum_{i=1}^{n} a_{Ei} Q^{2i}   and   (4)

G_{M,{\rm polynomial}}(a_{Mi}, Q^2) = \mu_p \left( 1 + \sum_{i=1}^{n} a_{Mi} Q^{2i} \right),   (5)

where \mu_p is the magnetic moment of the proton and n is the order of the polynomial. This is one of the many form factor parameterizations described in Ref. [22] that has been used for extracting the proton radius from the Mainz dataset.

For these regressions, we perform a weighted least squares minimization with a \chi^2 function defined as follows:

\chi^2 = \sum_{p=1}^{p_{\rm max}} \left( \frac{\sigma_{\rm Model}(E_p, \theta_p) - \sigma_p \cdot {\rm normA}_p \cdot {\rm normB}_p}{\Delta\sigma_p \cdot {\rm normA}_p \cdot {\rm normB}_p} \right)^2,   (6)

where for each data point p there is a cross section, \sigma_p, with energy E_p, angle \theta_p, and normalization parameters normA and normB as shown in Table I. As was done in the Mainz fits, the normalization parameters are allowed to float freely.

We repeat this same regression adding one requirement: that the terms of the polynomial have successively alternating signs. This makes the polynomial more closely resemble a "completely monotone" function and is referred to as a bound regression. A true completely monotone function, f, would possess derivatives, f^{(n)}, of all orders such that (-1)^n f^{(n)}(x) \geq 0 for x > 0. Before performing the full regressions, we first fit G_E and G_M and use the resultant normalization parameters as initialization parameters in the more complex regressions.

In Table II and Table III we show the results of fitting with both the unbound and bound regressions for the 1422 Mainz cross section points and the rebinned 658
Mainz cross section points respectively. The regression results and residuals are shown graphically in Fig. 1 and Fig. 2 for the 1422 Mainz cross section points and the rebinned 658 Mainz cross section points respectively. For clarity we divide \sigma_{\rm exp} by \sigma_{\rm dipole}, where \sigma_{\rm dipole} is simply Eq. 1 with standard dipole form factors:

G_{E,{\rm dipole}}(Q^2) = \left( 1 + \frac{Q^2}{0.71\ {\rm GeV}^2} \right)^{-2}   and   (7)

G_{M,{\rm dipole}}(Q^2) = \mu_p \left( 1 + \frac{Q^2}{0.71\ {\rm GeV}^2} \right)^{-2}.   (8)

The fits of the unbound and bound regressions clearly differ significantly for both the original and the rebinned Mainz cross section points, but the residuals of each fit are quite similar. The reason the locations of the cross section data points differ between the unbound and bound regressions in Fig. 1 and Fig. 2 is that the choice of regression model shifts the 31 normalization parameters to maintain agreement with our prior knowledge of the values of the electromagnetic form factors in the limit of Q^2 = 0. The magnitude of the individual normalization shifts is a few tenths of a percent, which is much smaller than the precision with which absolute normalizations can be determined, but it can have a clear effect on the results, as the point-to-point uncertainties are also just a few tenths of a percent.

TABLE II. The values of the polynomial terms for the unbound and bound regressions of the 1422 cross section points following the notation of Eq. 4 and 5. If one wishes to interpret the charge form factor slope term (i = 1) in terms of a charge radius using Eq. 2, one finds the unbound fit gives a charge radius of 0.882 fm while the bound fit gives a charge radius of 0.854 fm.

IV. MODEL SELECTION
For a fixed number of fit parameters, the unbound regressions presented in this work will always have a total \chi^2 equal to or lower than a bound regression, as shown in Fig. 3 and Fig. 4, which plot the total \chi^2 and charge radius given by the unbound and bound regressions of the 1422 Mainz cross section points and the 658 rebinned cross sections respectively. Since adding parameters will always either decrease or keep total \chi^2 the same, \chi^2 by itself is not a valid model selection criterion. More appropriate model selection techniques include using an F-test for nested models that are linear in terms [12, 26], or model selection methods like the Akaike Information Criterion (AIC) [27] or the Bayesian Information Criterion (BIC) [28], which can be used with non-nested non-linear models (see Ref. [29] for more details).

Since the regressions herein are non-linear, we use the AIC and BIC to determine the most appropriate number
FIG. 1. The 1422 Mainz cross section points plotted vs. Q^2 for the six different incident beam energies and fit with unbound and bound polynomials. The gray points were analyzed using an unbound eleventh order polynomial regression in G_E and G_M, while the black points used a bound eleventh order polynomial regression constrained to alternating term signs. The systematic difference in the location of the points is due to how the 31 normalization parameters in the fit change based on the choice of using either the unbound or bound functions in the regression. While the mean values are clearly different for these fits, the residuals of the fits to their respective functions are quite similar.
FIG. 2. The 658 rebinned Mainz cross section points plotted vs. Q^2 for the six different incident beam energies and fit with unbound and bound polynomials. The gray points were analyzed using an unbound eleventh order polynomial regression in G_E and G_M, while the black points used a bound eleventh order polynomial regression constrained to alternating term signs. The systematic difference in the location of the points is due to how the 31 normalization parameters in the fit change based on the choice of using either the unbound or bound functions in the regression. While the mean values again differ for the two fits, the residuals of the fits to their respective functions are quite similar.

TABLE III. The values of the polynomial terms for the unbound and bound regressions of the 658 cross section points of rebinned data [13] following the notation of Eq. 4 and 5. If one wishes to interpret the charge form factor slope term (i = 1) in terms of a charge radius using Eq. 2, one finds the unbound fit gives a charge radius of 0.863 fm while the bound fit gives a radius of 0.845 fm.

of parameters for these regressions. These statistical criteria, along with the frequently quoted \chi^2 per degree of freedom, \chi^2/{\rm df} [30], are defined as follows:

\chi^2 = \sum_{i=1}^{N} \left( \frac{{\rm data}_i - {\rm model}_i}{\sigma_i} \right)^2,   (9)

\chi^2/{\rm df} = \chi^2 / (N - N_{\rm var}),   (10)

{\rm AIC} = N \log(\chi^2/N) + 2 N_{\rm var},   (11)

{\rm BIC} = N \log(\chi^2/N) + \log(N)\, N_{\rm var},   (12)

where N is the number of data points, data_i and \sigma_i are the measured values and their estimated uncertainties respectively, model_i is the model value, and N_{\rm var} is the number of model parameters.
Starting from the lowest order fits, at first the value of these criteria will decrease as a parameter is added, indicating an underfitting of the data, while eventually the criteria will start to increase as parameters are added, indicating an overfitting of the data; thus, with these criteria the model with the lowest AIC or BIC value should be selected as most appropriate. For the 1422 original Mainz data points, we find with both AIC and BIC that the most appropriate of the bound fits is the 7th order fit with a \chi^2/{\rm df} of 1.21, while for the unbound fits the 10th order fit with a \chi^2/{\rm df} of 1.14 is most appropriate, as previously found [21, 22]. For the 658 rebinned Mainz data points, we find with both AIC and BIC that the most appropriate of the bound fits is the 7th order fit with a \chi^2/{\rm df} of 0.865, while for the unbound fits the 9th order fit with a \chi^2/{\rm df} of 0.830 is most appropriate. Ideally a criterion would have been defined prior to the analysis of the data; since we are doing a re-analysis, we are presenting two of the most common techniques so one can judge how even the choice of criterion can affect the outcome of the analysis. Further details about model selection techniques can be found in [31].
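The criteria of Eqs. (9)–(12) are straightforward to compute for each candidate polynomial order. The following is a minimal sketch under our own naming conventions, not code from the original analysis:

```python
import numpy as np

def selection_criteria(data, sigma, model, n_var):
    """Return chi^2, chi^2/df, AIC, and BIC (Eqs. 9-12) for one candidate model,
    given measured values, their uncertainties, model values, and the number of
    model parameters."""
    n = len(data)
    chi2 = float(np.sum(((data - model) / sigma) ** 2))
    return {
        "chi2": chi2,
        "chi2_df": chi2 / (n - n_var),
        "aic": n * np.log(chi2 / n) + 2 * n_var,
        "bic": n * np.log(chi2 / n) + np.log(n) * n_var,
    }
```

Scanning polynomial orders and keeping the fit with the smallest AIC or BIC reproduces the selection procedure described above; note that BIC penalizes extra parameters more strongly than AIC once log(N) > 2, i.e. for more than about seven data points.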
FIG. 3. Total \chi^2 vs. the number of fit parameters in G_E and G_M using Eq. 4 and 5 for both the unbound and bound polynomial regressions of the full 1422 point Mainz dataset. Total \chi^2 will decrease as parameters are added, but at some point no significant improvement will be made, where significance is defined using the statistical criteria. With both AIC and BIC, the most appropriate bound fits are the 7th order, while for the unbound descriptive fits the most appropriate order is 10th.

One should also keep in mind whether one is trying to do a descriptive fit of the data or, by adding physical constraints, building a predictive or explanatory model of the data [2]. One must also keep in mind that none of these model selection techniques will prevent the use of completely inappropriate functions, nor do they ensure that the best type of function has been selected. As noted in Ref. [30, 32], it is essential to plot the fit functions and residuals to ensure a reasonable regression, as \chi^2 minimization alone is insufficient, as illustrated in Appendix B.

As the changes we have presented in these four fits are larger than the statistical parameter uncertainties, we have limited ourselves to a discussion of the shifts of the mean values of the points. For non-linear regressions such as these, statistical bootstrapping, which makes use of sampling with replacement, can be used to find the statistical parameter uncertainties [25].

V. RESULTS
In Fig. 5 and Fig. 6, we show the individual electric and magnetic form factors obtained from the unbound and
FIG. 4. Total \chi^2 vs. the number of fit parameters in G_E and G_M using Eq. 4 and 5 for both the unbound and bound polynomial regressions of the 658 points of the rebinned Mainz dataset [13]. For these fits, by AIC and BIC, 7th order is the most appropriate for the bound regression while for the unbound 9th order is the most appropriate.

bound regressions for the 1422 Mainz cross section points. In Fig. 5 we see that G_M for both the unbound and bound fits remains well behaved at high Q^2. However, for G_E both the unbound and bound fits begin to diverge at high Q^2. This is due to the dominance of the magnetic form factor in the cross section at high Q^2, which leads the electric form factor to become unconstrained in this region. Due to the use of a high-order polynomial regression, along with the unconstrained nature of G_E in the high Q^2 region, the divergence of G_E is to be expected.

In Fig. 6 the ratios of the electromagnetic form factors to the standard dipole are shown. This ratio reveals that the unbound G_E ratio has a small oscillation and the unbound G_M ratio has a large oscillation, whereas the bound G_E ratio has mostly removed the oscillation and the bound G_M ratio has almost no oscillation. The unbound ratios are descriptive models of the scattering data without any physics considerations, but the bound ratios are more akin to predictive models, as the terms alternate sign as one would expect from chiral effective field theory. By adding this one physical constraint the oscillations are removed as the model becomes more predictive.

Though it is beyond the range of the data used in the regression, the results of regressions like these are frequently used to extract the charge radius of the proton by using Eq. 2 to relate the fit function to the charge radius of the proton. Ideally, these extractions would use a predictive model to fit the data, as finding the charge radius requires extrapolating beyond the range of the experimental data.
For the case of a polynomial regression, this is simply:

r_p = (-6\, a_{E1})^{1/2}.   (13)

Using Eq. 2 one finds a charge radius of 0.882 fm from the unbound regression and 0.854 fm from the bound regression of the original Mainz cross section points (Table II), and a charge radius of 0.863 fm from the unbound regression and 0.845 fm from the bound regression of the rebinned Mainz data (Table III). These results show that even just rebinning the data can shift the result of a high-order polynomial regression significantly.

In the end, the radii extracted from the more descriptive unbound regressions are closer to the CODATA-2014 value, while the radii extracted from the more physically justifiable and predictive bound regressions are closer to the muonic results as well as the most recent atomic result [33]. With freedom to make analytic choices that so strongly affect the results, there is the potential for unconscious confirmation bias, and for researchers to select and report the regressions that confirm their expectations [34, 35].

FIG. 5. The electromagnetic form factors vs. Q^2 from the unbound and bound polynomial regressions of the 1422 Mainz cross sections [22]. For these kinematics, as Q^2 gets large, the cross sections are dominated by G_M and the G_E form factor becomes unconstrained, so the divergence of G_E at high Q^2 is to be expected from a high-order polynomial.

VI. CONCLUSIONS
We have shown that small changes in analytic functions and binning choices applied to a complex non-linear regression can result in significantly different results. In particular, using the Mainz dataset of elastic cross section points to extract a proton charge radius, we have shown results consistent with the CODATA-2014 value when using high-order unbounded polynomial fits and values close to the muonic results when using bounded polynomial regressions. Thus, by simply trying different functions, limits, and bounds, one can easily extrapolate to different results, which can lead to confirmation bias and/or inappropriate rejection of certain results. Enrico Fermi noted that these types of problems should be addressed using either a firm mathematical basis or a physical model [36]. Ideally, regression models should be carefully developed prior to obtaining experimental data, as was done by the PRad collaboration [37, 38]; otherwise, one must be exceedingly careful to avoid confirmation bias through the rigorous use of model selection techniques [34, 35].

FIG. 6. The ratio of the extracted electromagnetic form factors to the standard dipole vs. Q^2 for the 1422 Mainz cross sections. The oscillations in the unbound magnetic form factor, and to a lesser extent the electric form factor, go away once the terms of the fit functions are forced to alternate sign. The smooth magnetic form factor also results in a smaller extracted proton charge radius (0.854 fm vs. 0.882 fm).

Thus, while one can argue that the bounded non-linear regression is the more physical function, it would be more appropriate to approach the analysis such that the analytic choices do not so strongly affect the results.
To do this, one can either fit only lower Q^2 data, where fewer free parameters are required [11, 12, 29, 37, 39–44] and the results are not sensitive to the magnetic form factor, as shown explicitly in Ref. [29]; or, as Fermi preferred, use a physical model, such as that of Bernard et al. [45] or Alarcón and Weiss [24], to constrain the fits [16, 17, 46, 47]. There are also physically motivated functions such as rational functions [12, 48], continued fractions [11, 49, 50], or the z-expansion fits [13, 51], though these still require model selection techniques to determine the appropriate number of regression parameters. We hope to have illustrated that by using extremely complex non-linear regressions and deep searches, one can find nearly any radius in a wide range of radii from a single dataset [52]. To quote Nobel laureate Ronald H. Coase, "if you torture the data long enough, it will confess."

ACKNOWLEDGMENTS
Many thanks to Franziska Hagelstein and Vladimir Pascalutsa for their questions about normalization parameters that prompted this work. Thanks to Dave Meekins for his many useful comments and discussions. Thanks to Nigel Tufnel for suggesting going to eleven. And thanks to Marcy Stutzman and Simon Širca for their editorial comments. This work was supported by the U.S. Department of Energy contract DE-AC05-06OR23177, under which Jefferson Science Associates operates the Thomas Jefferson National Accelerator Facility.
Appendix A: Robust Regressions
Ordinary least squares regression (OLSR) is one of the most commonly employed techniques used to fit a given model and its parameters to a dataset. However, OLSR is commonly misunderstood and misapplied by researchers. For OLSR, fit parameters are determined by minimizing the square of the differences between real-world data and model predictions. This is known as a \chi^2 minimization. Eq. A1 shows this minimization for N data points with M fit parameters:

\chi^2 \equiv \sum_{i=1}^{N} \left( \frac{y_i - y(x_i | a_1, a_2, ..., a_M)}{\sigma_i} \right)^2.   (A1)

Here y_i are the measured data values, y(x_i | a_1, a_2, ..., a_M) are the values given by the model with fit parameters a_1 to a_M when evaluated at the x_i of the measured data, and \sigma_i are the uncertainties on each measured data point.

While OLSR via \chi^2 minimization is often a useful initial method for checking the 'goodness' of a fit, it can fail if the dataset being fit does not meet certain conditions. OLSR is based on the core assumptions that the errors are random variables that are normally distributed, the errors are uncorrelated with each other, and the errors are homoscedastic, which is to say they have the same variance. Unfortunately, these assumptions often do not hold true in the case of real-world data. When OLSR's assumptions are not met, such as when the dataset has significant outliers, OLSR is not sufficient for fitting the data and can yield misleading results. Even a single outlier can skew the results of an OLSR, pulling the fit away from the data's true behavior [25, 53]. To avoid these pitfalls, robust methods such as robust least squares regression (RLSR) should be used instead of OLSR techniques.

To avoid outliers having too much influence over a fit, we desire a method by which outliers can be identified and then re-weighted such that they do not skew the overall fit. The least squares minimization found in Eq. A1 can be generalized to Eq. A2 by introducing the function \rho(z) [54].
OLSR is then simply the case where \rho(z) = z. Many functions can be used for \rho(z) to introduce robustness, but for the following examples the 'soft loss' (soft_l1) function given in Eq. A3 was selected and implemented using the Python package SciPy [54–56]:

\chi^2 \equiv \sum_{i=1}^{N} \rho_i(z),  where  z = \left( \frac{y_i - y(x_i | a_1, a_2, ..., a_M)}{\sigma_i} \right)^2,   (A2)

\rho(z) = 2\left( \sqrt{1 + z} - 1 \right).   (A3)

With soft loss, as z_i gets larger, the magnitude of \rho_i(z) is increasingly reduced with respect to OLSR. A RLSR with soft loss essentially re-weights the outliers of a dataset, decreasing their influence when fitting. Note that if a dataset meets all of the above assumptions inherent to OLSR (i.e. errors are normally distributed, uncorrelated, and have the same variance), then OLSR and RLSR techniques should both produce the same fit results, since the dataset, by definition, does not contain excessive outliers.

A simple example reproduced from a classic statistics paper [32] is shown in Fig. 7. The dataset has a single clear outlier which pulls the fit considerably away from the bulk of the data when using OLSR. However, when RLSR with soft loss is used to fit the data, the influence of the outlier is greatly reduced, and the fit returns to the bulk of the data, yielding superior results.
FIG. 7. This example data, reproduced from a classic statistics paper [32], shows how an ordinary least squares regression, OLSR, is easily pulled away from the true trend of the data, while a robust least squares regression, RLSR, is only weakly affected by the outlying data point.
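The comparison in Fig. 7 can be reproduced with scipy.optimize.least_squares, whose loss="soft_l1" option implements the \rho(z) of Eq. A3. The synthetic line and outlier below are illustrative stand-ins, not the data of Fig. 7:

```python
import numpy as np
from scipy.optimize import least_squares

def residuals(params, x, y, sigma):
    """Weighted residuals (y_i - model_i)/sigma_i for a straight-line model."""
    m, b = params
    return (y - (m * x + b)) / sigma

# Synthetic line y = 2x + 1 with one gross outlier (illustrative data).
x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0
y[-1] += 30.0  # single outlier at the largest x
sigma = np.ones_like(x)

# OLSR uses the default rho(z) = z; RLSR uses the soft loss of Eq. A3.
olsr = least_squares(residuals, [1.0, 0.0], args=(x, y, sigma))
rlsr = least_squares(residuals, [1.0, 0.0], args=(x, y, sigma), loss="soft_l1")
```

The robust fit recovers a slope near the true value of 2, while the ordinary fit is dragged toward the outlier, mirroring the behavior in Fig. 7.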
For a real-world example using RLSR, we can study the full Initial State Radiation (ISR) dataset found in the supplemental material of Ref. [57, 58]. Fig. 8 shows the results of two regressions of the ISR proton electric form factor data using the theory model of Alarcón and Weiss [24], with the proton radius as its single free parameter. The light gray curve uses an OLSR to fit the dataset and the dark curve uses a RLSR with soft loss. There is a clear separation between the results of the two regressions, with the RLSR fitting the higher Q^2 data better. The OLSR finds a proton radius of 0.874 fm, and the RLSR finds a significantly smaller radius of 0.844 fm. Again, the purely analytic choice of the regression type significantly influences the fit results. Further, the fact that the OLSR and the RLSR fit results differ significantly is evidence that there are outliers in the dataset that are not following a normal distribution; otherwise the OLSR and RLSR fits would have better agreement.
FIG. 8. Proton electric form factor data taken from the Initial State Radiation dataset found in the supplemental material of Ref. [57]. The uncertainties are calculated by summing the listed statistical uncertainties with the systematic uncorrelated uncertainties in quadrature. The theoretical model used for the regressions is the model of Alarcón and Weiss [24], with only one free parameter. These regressions give a proton radius of 0.874 fm for the OLSR and 0.844 fm for the RLSR with soft loss.
Appendix B: Anscombe’s Quartet
With the power of modern computing, one can be tempted to blindly assume the results of a regression are correct if the χ²/df is close to unity; but this can be extremely misleading [59]. To illustrate this point we use the Anscombe quartet [32], though, as nuclear physicists tend to use χ²/df instead of R², we have taken the 1973 example problem and added uncertainties to the points as shown in Table IV.

When fit with a linear function, these four sets of (x_i, y_i, dy_i) values give the same statistical quantities to three significant figures: mean, variance, χ², χ²/df, etc. So if one fails to make graphical checks, one can be completely fooled into thinking the fits are all equally good; but by simply graphing (see Fig. 9) one can see that only the first set of data is distributed in an ideal way around the fit function.

TABLE IV. Four datasets of (x_i, y_i, dy_i) values adapted from [32].

FIG. 9. Graphs of the four sets of data from Table IV. When fit with a linear function, y = mx + b, all four sets of data give the same slope (m = 0.50), the same intercept (b = 3.00), and the same χ²/df of 1.00; yet clearly these four sets of data are not the same.

Dataset two clearly has a curved residual yet has exactly the same mean, errors, and χ² as the first fit. This suggests that the fitter should likely add a quadratic term to the regression as well as check that the uncertainties have been correctly reported.

Dataset three illustrates the effect of an outlier on the regression. Of course, a scientist should not simply throw out an outlier. Instead one should report on the outlier's influence on the result. For example, in dataset three it would be worth noting that if the outlier is removed, the data exactly follow a line of y = 4 + 0.345x, and that that measurement should be repeated.

Dataset four looks unsatisfactory, since all the information about the slope comes from one observation. This is very different from dataset one, where any one point can be removed and one will obtain nearly the same result. Thus, for dataset four it should be pointed out that a single observation plays a critical role in the result.

[1] R. Silberzahn et al., Advances in Methods and Practices in Psychological Science, 337 (2018).
[2] G. Shmueli, Statist. Sci., 289 (2010).
[3] R. Pohl et al., Nature, 213 (2010).
[4] A. Antognini et al., Science, 417 (2013).
[5] P. J. Mohr, D. B. Newell, and B. N. Taylor, Rev. Mod. Phys., 035009 (2016), arXiv:1507.07956 [physics.atom-ph].
[6] R. Pohl, R. Gilman, G. A. Miller, and K. Pachucki, Ann. Rev. Nucl. Part. Sci., 175 (2013), arXiv:1301.0905 [physics.atom-ph].
[7] C. E. Carlson, Prog. Part. Nucl. Phys., 59 (2015), arXiv:1502.05314 [hep-ph].
[8] G. A. Miller, (2018), arXiv:1809.09635 [physics.atom-ph].
[9] G. A. Miller, Phys. Rev. C99, 035202 (2019), arXiv:1812.02714 [nucl-th].
[10] M. Horbatsch and E. A. Hessels, Phys. Rev.
C93, 015204 (2016), arXiv:1509.05644 [nucl-ex].
[11] K. Griffioen, C. Carlson, and S. Maddox, Phys. Rev. C93, 065207 (2016), arXiv:1509.06676 [nucl-ex].
[12] D. W. Higinbotham, A. A. Kabir, V. Lin, D. Meekins, B. Norum, and B. Sawatzky, Phys. Rev. C93, 055207 (2016), arXiv:1510.01293 [nucl-ex].
[13] G. Lee, J. R. Arrington, and R. J. Hill, Phys. Rev. D92, 013013 (2015), arXiv:1505.01489 [hep-ph].
[14] K. M. Graczyk and C. Juszczak, Phys. Rev. C90, 054334 (2014), arXiv:1408.0150 [hep-ph].
[15] I. T. Lorenz and Ulf.-G. Meißner, Phys. Lett. B737, 57 (2014), arXiv:1406.2962 [hep-ph].
[16] M. Horbatsch, E. A. Hessels, and A. Pineda, Phys. Rev. C95, 035203 (2017), arXiv:1610.09760 [nucl-th].
[17] J. M. Alarcón, D. W. Higinbotham, C. Weiss, and Z. Ye, Phys. Rev. C99, 044303 (2019), arXiv:1809.06373 [hep-ph].
[18] J. M. Alarcón, D. W. Higinbotham, and C. Weiss, (2020), arXiv:2002.05167 [hep-ph].
[19] S. Zhou, P. Giuliani, J. Piekarewicz, A. Bhattacharya, and D. Pati, Phys. Rev. C99, 055202 (2019), arXiv:1808.05977 [nucl-th].
[20] M. Mihovilovič, D. W. Higinbotham, M. Bevc, and S. Širca, Front. in Phys., 36 (2020), arXiv:2003.03816 [nucl-ex].
[21] J. C. Bernauer et al. (A1), Phys. Rev. Lett., 242001 (2010), arXiv:1007.5076 [nucl-ex].
[22] J. C. Bernauer et al. (A1), Phys. Rev. C90, 015206 (2014), arXiv:1307.6227 [nucl-ex].
[23] M. Merkle, in Analytic Number Theory, Approximation Theory, and Special Functions, edited by G. V. Milovanović and M. T. Rassias (Springer-Verlag New York, New York, 2012) Chap. 12, pp. 347–364, arXiv:1211.0900.
[24] J. M. Alarcón and C. Weiss, Phys. Lett. B784, 373 (2018), arXiv:1803.09748 [hep-ph].
[25] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes 3rd Edition: The Art of Scientific Computing, 3rd ed. (Cambridge University Press, New York, NY, USA, 2007).
[26] S. Širca, Probability for Physicists, Graduate Texts in Physics (Springer International Publishing, 2016).
[27] H. Akaike, IEEE Transactions on Automatic Control, 716 (1974).
[28] G. Schwarz, Ann. Statist., 461 (1978).
[29] D. W. Higinbotham, P. Giuliani, R. E. McClellan, S. Širca, and X. Yan, (2018), arXiv:1812.05706 [physics.data-an].
[30] R. Andrae, T. Schulze-Hartung, and P. Melchior, arXiv e-prints, arXiv:1012.3754 (2010), arXiv:1012.3754 [astro-ph.IM].
[31] E. Wit, E. v. d. Heuvel, and J.-W. Romeijn, Statistica Neerlandica, 217 (2012).
[32] F. J. Anscombe, The American Statistician, 17 (1973).
[33] N. Bezginov, T. Valdez, M. Horbatsch, A. Marsman, A. C. Vutha, and E. A. Hessels, Science, 1007 (2019).
[34] J. J. Koehler, Organizational Behavior and Human Decision Processes, 28 (1993).
[35] J. P. A. Ioannidis, PLOS Medicine (2005), 10.1371/journal.pmed.0020124.
[36] F. Dyson, Nature, 297 EP (2004).
[37] X. Yan, D. W. Higinbotham, D. Dutta, H. Gao, A. Gasparian, M. A. Khandaker, N. Liyanage, E. Pasyuk, C. Peng, and W. Xiong, Phys. Rev. C98, 025204 (2018), arXiv:1803.01629 [nucl-ex].
[38] W. Xiong et al., Nature, 147 (2019).
[39] L. N. Hand, D. G. Miller, and R. Wilson, Rev. Mod. Phys., 335 (1963).
[40] J. J. Murphy, Y. M. Shin, and D. M. Skopik, Phys. Rev. C9, 2125 (1974), [Erratum: Phys. Rev. C10, 2111 (1974)].
[41] F. Borkowski, G. G. Simon, V. H. Walther, and R. D. Wendling, Z. Phys. A275, 29 (1975).
[42] G. G. Simon, C. Schmitt, F. Borkowski, and V. H. Walther, Nucl. Phys. A333, 381 (1980).
[43] F. Hagelstein and V. Pascalutsa, Phys. Lett. B797, 134825 (2019), arXiv:1812.02028 [nucl-th].
[44] T. B. Hayward and K. A. Griffioen, (2018), arXiv:1804.09150 [nucl-ex].
[45] V. Bernard, H. W. Fearing, T. R. Hemmert, and Ulf.-G. Meißner, Nucl. Phys. A635, 121 (1998), [Erratum: Nucl. Phys. A642, 563 (1998)], arXiv:hep-ph/9801297 [hep-ph].
[46] G. Höhler, E. Pietarinen, I. Sabba Stefanescu, F. Borkowski, G. G. Simon, V. H. Walther, and R. D. Wendling, Nucl. Phys. B114, 505 (1976).
[47] I. T. Lorenz, Ulf.-G. Meißner, H. W. Hammer, and Y. B. Dong, Phys. Rev. D91, 014023 (2015), arXiv:1411.1704 [hep-ph].
[48] J. J. Kelly, Phys. Rev. C70, 068202 (2004).
[49] I. Sick, Phys. Lett. B576, 62 (2003), arXiv:nucl-ex/0310008 [nucl-ex].
[50] I. T. Lorenz, H. W. Hammer, and Ulf.-G. Meißner, Eur. Phys. J. A48, 151 (2012), arXiv:1205.6628 [hep-ph].
[51] R. J. Hill and G. Paz, Phys. Rev. D82, 113005 (2010), arXiv:1008.4619 [hep-ph].
[52] D. W. Higinbotham, (2019), 10.5281/zenodo.2566639.
[53] D. Draper, Statistical Science, 239 (1988).
[54] B. Triggs, P. F. McLauchlan, R. I. Hartley, and A. W. Fitzgibbon, in Vision Algorithms: Theory and Practice, edited by B. Triggs, A. Zisserman, and R. Szeliski (Springer Berlin Heidelberg, Berlin, Heidelberg, 2000) pp. 298–372.
[55] E. Jones, T. Oliphant, P. Peterson, et al., "SciPy: Open source scientific tools for Python," (2001–), [Online; accessed 9/26/2019].
[56] T. E. Oliphant, Computing in Science & Engineering, 10 (2007).
[57] M. Mihovilovič et al., (2019), arXiv:1905.11182 [nucl-ex].
[58] M. Mihovilovič et al., Phys. Lett. B771, 194 (2017), arXiv:1612.06707 [nucl-ex].
[59] S. Širca and M. Horvat,