Non-uniqueness in quasar absorption models and implications for measurements of the fine structure constant
Chung-Chi Lee, John K. Webb, Dinko Milaković, Robert F. Carswell
MNRAS, 1–8 (2021). Preprint 24 February 2021. Compiled using MNRAS LaTeX style file v3.0
DAMTP, Centre for Mathematical Sciences, University of Cambridge, Cambridge CB3 0WA, UK
Clare Hall, University of Cambridge, Herschel Rd, Cambridge CB3 9AL
Institute for Fundamental Physics of the Universe, Via Beirut 2, 34151 Grignano TS, Italy
Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge CB3 0HA, UK
ABSTRACT
High resolution spectra of quasar absorption systems provide the best constraints on temporal or spatial changes of fundamental constants in the early universe. An important systematic that has never before been quantified concerns model non-uniqueness; the absorption component structure is generally complicated, comprising many blended lines. This characteristic means any given system can be fitted equally well by many slightly different models, each having a different value of $\alpha$, the fine structure constant. We use AI Monte Carlo modelling to quantify non-uniqueness and describe how it accounts for previously unexplained scatter seen in the majority of published measurements. Extensive supercomputer calculations are reported, revealing new systematic effects that guide future analyses: (i) systematic errors significantly increase if line broadening models are turbulent but are minimised if gas temperature is included as a free parameter; modelling quasar absorption systems using turbulent broadening should be avoided and compound broadening is preferable. (ii) The general overfitting tendency of AICc dramatically increases non-uniqueness and hence the overall error budget on estimates of $\alpha$ variations. The newly introduced Spectral Information Criterion (SpIC) statistic is more suitable and substantially decreases non-uniqueness compared to AICc-based models.

Key words: Cosmology: cosmological parameters; Methods: data analysis, numerical, statistical; Techniques: spectroscopic; Quasars: absorption lines; Line: profiles; Abundances
Generalised theories of varying fundamental constants (Barrow & Lip 2012) motivate high-precision searches for new physics using new facilities like ESPRESSO on the VLT (Pepe et al. 2021). Varying constants constitute one of the main scientific drivers for the forthcoming ELT (e.g. Marconi et al. 2016; Tamai et al. 2018). Instrumentation improvements, better wavelength calibration methods based around laser frequency combs (Milaković et al. 2020; Probst et al. 2020), and ever-increasing data quality necessitate a new look at old analysis methods; approximations made when data quality was lower may be inadequate today.

One such approximation arises in the context of modelling quasar absorption systems. Minimising the number of free model parameters is appealing, both to reduce degeneracy between parameters and because it generally yields a smaller uncertainty estimate on the parameter or parameters of interest. However, in the interests of objectivity, best-fit models should be free from human decision-making (Lee et al. 2021) and the number of model parameters should be information criterion-based (Webb et al. 2020).

Nevertheless, even with fully objective and reproducible methodologies, model ambiguity is unavoidable; the $\chi^2$-parameter space may contain multiple local minima and more than one set of model parameters may provide a statistically acceptable fit to the data. This property generates an additional uncertainty on interesting parameters, over and above the usual "statistical" covariance matrix uncertainty derived at the minimum $\chi^2$. Additional uncertainties of this sort increase data scatter, sometimes allowed for by "$\sigma_{\rm rand}$" in statistical surveys of the fine structure constant, $\alpha$, at high redshift (e.g. Webb et al. 2011).

For varying $\alpha$ searches using quasar absorption systems, it is necessary to assume a broadening mechanism for each redshift component in an absorption complex. In this paper, we use an extensive suite of supercomputer simulations to study in detail how the assumed broadening mechanism and the choice of information criterion influence the parameter error budget and model ambiguity.
In (Lee et al. 2021) we introduced an artificial intelligence approach to modelling absorption line systems (ai-vpfit), a development built on the first attempt at such a code, described in (Bainbridge & Webb 2017a,b). ai-vpfit merges a genetic algorithm, a Monte Carlo process, and a well-established code, vpfit (Carswell & Webb 2014, 2020), to form a fully automated and unbiased modelling process. The statistical error returned by vpfit is derived from the Hessian diagonal, based on a parabolic model for the $\chi^2$-parameter space. It has been shown that the vpfit error estimates are reasonable (e.g. King et al. 2009). Nevertheless, the parabolic model cannot, of course, map out $\chi^2$ to reveal potential multiple minima. When multiple minima are present, the scatter we see amongst a distribution of ai-vpfit models should be larger than the statistical error.

The Monte Carlo aspect of ai-vpfit (trial absorption line positions are placed randomly) means that each independent model is constructed differently, i.e. parent models for each generation are not the same between independent ai-vpfit models. This slight procedural variation is analogous to the slightly different approaches different human modellers would take, presented with the same problem. The main difference is, of course, that a human may require weeks to carry out the same task that ai-vpfit carries out in hours. This procedural variation in turn means that all (or most) multiple minima present in $\chi^2$-parameter space should be revealed in the calculations reported here, given enough independent ai-vpfit runs, as would happen in a Markov-Chain Monte Carlo approach (King et al. 2009).

If the line broadening model is wrong (as is often likely to be the case for turbulent or thermal broadening models), it would be expected that a broader range of best-fit models can be found. To give a simple example, if a turbulent model is imposed on an observed line that is in reality thermally broadened, the model $b$ for that line could potentially be too small (depending on atomic mass and other things). The likely consequence is that additional components will be needed for that particular line in order to obtain a good fit. These additional spurious velocity components can then velocity-shift lines of components that are important for $\Delta\alpha/\alpha$ and hence perturb the optimal $\Delta\alpha/\alpha$ measurement. Therefore, the model non-uniqueness problem should be greater if we use a turbulent or a thermal model, compared to compound broadening.

Further, in a previous paper (Webb et al. 2020), we introduced a new information criterion (SpIC) aimed specifically at spectroscopic modelling, to address weaknesses identified in modelling procedures using the widely used information criteria, AICc and BIC. The finding of that work was that SpIC "bridged the gap" between the overfitting and underfitting tendencies of AICc and BIC respectively. With the availability of a system such as ai-vpfit, the considerations above motivate a more detailed exploration of $\chi^2$-parameter space using the various broadening models as well as using different information criteria.

The astronomical data we use for this study have already been described in (Milaković et al. 2021). We refer the reader to that paper for details. Since the present work requires considerable computing power (models contain a reasonably large number of free parameters), we select a subset of the complex system at $z_{\rm abs} = [\ldots]$, $[\ldots] < z_{\rm abs} < [\ldots]$.
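The contrast between the overfitting tendency of AICc and the underfitting tendency of BIC noted above can be illustrated numerically. The following is a minimal sketch, assuming the standard textbook forms of the two criteria for a least-squares fit with Gaussian errors; the function names and all numbers are hypothetical, and the SpIC definition (Webb et al. 2020) is deliberately not reproduced here.

```python
import math


def aicc(chi2: float, n_params: int, n_data: int) -> float:
    """Corrected Akaike criterion: chi2 + 2k + 2k(k+1)/(n-k-1) = chi2 + 2kn/(n-k-1)."""
    k, n = n_params, n_data
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n_data > n_params + 1")
    return chi2 + 2.0 * k * n / (n - k - 1)


def bic(chi2: float, n_params: int, n_data: int) -> float:
    """Bayesian information criterion: chi2 + k ln(n)."""
    return chi2 + n_params * math.log(n_data)


if __name__ == "__main__":
    # Hypothetical situation: one extra absorption line (3 more parameters)
    # lowers chi2 by 8 in a fit to 1000 pixels that already has 100 parameters.
    n_data = 1000
    chi2_before, k_before = 950.0, 100
    chi2_after, k_after = chi2_before - 8.0, k_before + 3

    d_aicc = aicc(chi2_after, k_after, n_data) - aicc(chi2_before, k_before, n_data)
    d_bic = bic(chi2_after, k_after, n_data) - bic(chi2_before, k_before, n_data)

    # A negative change means the criterion prefers the model with the extra line.
    print(f"change in AICc: {d_aicc:+.2f}")
    print(f"change in BIC:  {d_bic:+.2f}")
```

In this hypothetical case the extra line is accepted under AICc and rejected under BIC, which is the sense in which the two criteria sit on either side of a criterion designed for spectroscopic data.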
For pure thermal broadening, the width of an absorption line depends only on the cloud temperature $T$ and atomic mass, $m$. However, for sufficiently low cloud temperatures, line broadening can be dominated by bulk motions in the cloud. More generally, we may find that both processes apply, such that line widths depend on three parameters, $T$, $m$, and an additional turbulence parameter. Here we adopt the usual assumption that the observed line profile is Voigt, so that thermal and turbulent contributions can be combined in quadrature, and thus study the following three cases:

1. Fully turbulent broadening; all atomic species at the same redshift share the same $b$-parameter, $b_{\rm turb}$,
2. Pure thermal broadening, $b_{\rm th} = \sqrt{2kT/m} = 0.129\sqrt{T/m}$ km s$^{-1}$ (with $T$ in K and $m$ in atomic mass units),
3. Compound broadening, $b^2 = b_{\rm turb}^2 + b_{\rm th}^2$.

Non-linear least squares algorithms such as vpfit (which is incorporated into ai-vpfit) require carefully-set stopping criteria to terminate the minimisation process. The usual approach is to set $\Delta\chi^2/\chi^2 = (\chi^2_n - \chi^2_{n-1})/\chi^2_{n-1}$ (where $n$ indicates the iteration number) to be smaller than some suitable threshold, such that any parameter changes between the final two iterations are well below their corresponding $1\sigma$ uncertainties. For the calculations described in this paper, we set $\Delta\chi^2/\chi^2 = [\ldots] \times 10^{-[\ldots]}$. This ensures a very small increment between successive values of $\Delta\alpha/\alpha$, certainly far below its statistical uncertainty. For HE0515-4414, this stopping criterion corresponds to $\Delta\chi^2 \lesssim [\ldots].75$, which maps (empirically) to a change $|\Delta\alpha/\alpha| \lesssim [\ldots] \times 10^{-[\ldots]}$, approximately an order of magnitude below the estimated statistical uncertainty on $\Delta\alpha/\alpha$. As a precaution, we set the requirement that the stopping criterion must be met on three consecutive iterations before terminating the fit. We can thus be confident that each of the ai-vpfit models corresponds closely to a real local minimum in $\chi^2$ space.

The high performance supercomputing facility used for the calculations in this paper is OzSTAR at Swinburne University in Australia (https://supercomputing.swin.edu.au/ozstar/). The computing time required for automated modelling using a procedure such as ai-vpfit is significant. For this reason, several components of the code have been parallelised to run simultaneously across multiple processors. Even with parallelisation, very approximately 80,000 computing hours were required on the OzSTAR supercomputer facility to obtain the results presented in this paper. We generate a total of about 600 models. The data are fitted independently 100 times for each combination of 3 line broadening models and 2 information criteria, AICc and SpIC (Webb et al. 2020), which are used to select the fittest model at each generation. Both ICs have been incorporated into the ai-vpfit code (the SpIC modification to ai-vpfit is not described in (Lee et al. 2021) because that enhancement was added subsequently). On average, each model thus requires 135 hours processing time. Parallelisation (we typically used 5 CPUs per calculation) then means the average model calculation time is 27 hours.

Interestingly, the different line broadening models require quite different total computing times. Again very approximately, the times required for turbulent/thermal/compound models are 35/35/12 hours. Note that compound broadening models require substantially less time to compute.
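The three broadening cases and the repeated stopping test described above can be sketched as follows. This is a minimal illustration, not vpfit or ai-vpfit code: the function names are hypothetical, the tolerance is left as a user-supplied argument, and the absolute value in the convergence test is an assumption made so that the sign convention of the difference does not matter.

```python
import math

K_BOLTZMANN = 1.380649e-23            # J / K
ATOMIC_MASS_UNIT = 1.66053906660e-27  # kg


def b_thermal(temperature_k: float, mass_amu: float) -> float:
    """Thermal b-parameter sqrt(2kT/m) in km/s for T in K and m in amu."""
    speed_m_s = math.sqrt(2.0 * K_BOLTZMANN * temperature_k / (mass_amu * ATOMIC_MASS_UNIT))
    return speed_m_s / 1000.0


def b_compound(b_turb_km_s: float, temperature_k: float, mass_amu: float) -> float:
    """Compound b-parameter: turbulent and thermal terms added in quadrature."""
    return math.hypot(b_turb_km_s, b_thermal(temperature_k, mass_amu))


def converged(chi2_history, tol: float, n_consecutive: int = 3) -> bool:
    """True if the fractional chi2 change stayed below tol on the last n_consecutive iterations."""
    if len(chi2_history) < n_consecutive + 1:
        return False
    recent = chi2_history[-(n_consecutive + 1):]
    return all(abs(new - old) / old < tol for old, new in zip(recent, recent[1:]))


if __name__ == "__main__":
    # Illustrative values only: a shared turbulent term of 3 km/s at T = 10^4 K.
    # The thermal term, and hence the total width, differs between species.
    for species, mass in (("Mg", 24.305), ("Fe", 55.845)):
        print(species, round(b_thermal(1.0e4, mass), 3), round(b_compound(3.0, 1.0e4, mass), 3))
```

The turbulent case corresponds to setting the temperature to zero, so that every species shares the same width, while the thermal case corresponds to setting the turbulent term to zero.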
We have not explored the shorter computing time of compound models in quantitative detail, but the likely explanations are that (a) compound models descend more rapidly to the best-fit solution (consistent with the expectation of having a steeper $\chi^2$ gradient for the more physically plausible model) and (b) slightly fewer velocity components are generally required for compound models (again, because the model is physically correct, unlike turbulent or thermal models, fewer spurious components are needed to achieve a good fit).

The distribution of models we get is unbiased in that each model is constructed using a genetic procedure in which first guesses are generated using a Monte Carlo method (Lee et al. 2021). As we will shortly see, a new source of uncertainty is revealed, over and above the simple "statistical" uncertainty derived from the Hessian diagonals at the best-fit solution.

Figures 1, 2, and 3 illustrate the results obtained. The HE0515-4414 absorption system has been fitted with ai-vpfit about 100 times each, using turbulent, thermal, and compound line broadening models and using the two information criteria, SpIC (Webb et al. 2020) and AICc (i.e. making a total of $\sim 600$ independent models). Each panel in these Figures shows the distribution of best-fit $\Delta\alpha/\alpha$ values. Each point corresponds to a different ai-vpfit model. Since the model construction process is Monte Carlo based (Lee et al. 2021), the observed distribution of points reveals multiple minima, if present.

In Figure 1, panels (a) and (d) reveal something quite remarkable. These panels correspond to turbulent line broadening, which is the model frequently assumed when modelling quasar absorption systems. Whilst a turbulent model is recognised as an approximation, it has previously been assumed that it does not introduce any bias in estimating $\Delta\alpha/\alpha$ and that it does not add any additional uncertainty to the measurement. The turbulent broadening panels illustrate that both assumptions are wrong. Consider again panel (a): the AICc models fall into two well-separated clumps, one at around $\Delta\alpha/\alpha \approx -1$, the other around $\Delta\alpha/\alpha \approx -[\ldots]$ (in units of $10^{-[\ldots]}$). The upper clump is populated by around 60% of the best-fit models whilst the lower clump is populated by around 40%. If using AICc, we do not know, a priori, which of these is correct, i.e. a single model, produced interactively by a human, may suffer from the same ambiguity. Moreover, the overall spread in $\Delta\alpha/\alpha$ is substantially larger than the statistical uncertainty returned by the Hessian diagonal. These results are both surprising and somewhat shocking, given that many measurements in the literature are based on the process just outlined. In fact, Figure 1 shows that the worst possible assumption, at least for the absorption system considered in this paper, is turbulent broadening. The "non-uniqueness" problem is far less pronounced for thermal broadening, although compound broadening is clearly the preferable model.

More generally, if the $\chi^2$-parameter space contained one single global minimum and was otherwise structureless, all ai-vpfit models would be identical. Figure 1 shows that they are not, which demonstrates that real multiple local minima exist. Each panel shows the simple mean and range for $\Delta\alpha/\alpha$ (the range is determined empirically by the $\pm 34$% range over the sample of $\sim 100$ models in each case). The mean statistical uncertainty using AICc is $\langle\sigma_s\rangle = [\ldots] \times 10^{-[\ldots]}$ for the turbulent case, compared with $0.[\ldots] \times 10^{-[\ldots]}$ for the compound broadening model. The corresponding numbers for SpIC are smaller but show the same trend. These numbers already demonstrate convincingly that SpIC is preferable to AICc and that the appropriate broadening model is compound, not turbulent and not thermal.

Prior to a code such as ai-vpfit (Lee et al. 2021), the calculations performed would have been impractical. However, we now learn something interesting: for both SpIC and AICc, the scatter in the turbulent samples is large compared to the statistical error. This is of considerable concern because the systematic error associated with model non-uniqueness appears (in the HPC calculations) to be considerably larger than the statistical error. This may be interpreted in two possible ways: either (a) turbulent models represent the data badly and hence generate multiple $\chi^2$-parameter space minima, or (b) the Monte Carlo nature of the ai-vpfit modelling process (i.e. placing trial lines randomly within the spectral segment being fitted) may not emulate a human process and may itself generate multiple $\chi^2$-parameter space minima that are particular to ai-vpfit. If the explanation is simply (a), we may expect to see the same effect in published $\Delta\alpha/\alpha$ samples that were fitted using a turbulent model. Although both ai-vpfit and a human interactive modeller will tend to target strong but unsaturated features earlier and refine the model with weaker features later, there is no "correct" ordering in which an absorption complex model should be constructed. Any clumping in $\chi^2$ space is therefore not an artificial aspect of the ai-vpfit Monte Carlo process, which itself removes any possible subjectivity.

In Figure 2 we show how the different broadening models and the two different information criteria impact on the required number of model parameters.
A clear trend is seen; SpIC requires fewer model parameters. Given the ways in which AICc and SpIC are defined, this is expected. We refer to (Webb et al. 2020) for a more detailed discussion on this point.

Figure 2 reveals something else rather interesting; AICc and SpIC generate quite different solutions for the HE0515-4414 absorption system, irrespective of the broadening method. The AICc compound broadening model results in a mean number of parameters of roughly 120 whereas the SpIC compound broadening model produces about 100. This means the AICc models, on average, require around 7 more absorption lines across the complex. Closer inspection of the results shows that these divide approximately equally into heavy element components and interlopers.

Figure 3 reinforces the above, as we can see that, on average, the overall best-fit $\chi^2$ values are slightly smaller for AICc compared to SpIC, as expected. Both information criteria provide "statistically acceptable" model fits, i.e. the normalised values of $\chi^2$ are around unity. Trying to distinguish between the AICc and SpIC models using $\chi^2$ is not reliable because spectral processing procedures from raw data to one dimensional spectrum create weak small-scale pixel to pixel correlations. This means that the spectral error array is only an approximation and small departures of the normalised $\chi^2$ from unity are not easy to interpret.

The blue hollow circles in Fig. 4 illustrate $\Delta\alpha/\alpha$ vs the normalised chi-squared for 98 model fits using AICc. These models are for compound line broadening. The red crosses show 100 fits using SpIC. The AICc distribution appears to fall into three (or possibly four) clumps: the most populated lies in the approximate range $0 < \Delta\alpha/\alpha < [\ldots] \times 10^{-[\ldots]}$. Two others lie in the approximate ranges $-[\ldots] \times 10^{-[\ldots]} < \Delta\alpha/\alpha < -[\ldots] \times 10^{-[\ldots]}$ and $[\ldots] < \Delta\alpha/\alpha < -[\ldots] \times 10^{-[\ldots]}$.
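The empirical range quoted in each panel can be reproduced along the following lines. This is a sketch under stated assumptions: the 68% range is taken to be the interval between the 16th and 84th percentiles of the sample, the function name is hypothetical, and the numbers are synthetic, not results from this paper.

```python
import statistics


def scatter_summary(daa_values, sigma_stat_values):
    """Compare the empirical 68% scatter of best-fit values with the mean statistical error.

    daa_values        : best-fit Delta alpha / alpha, one entry per independent model
    sigma_stat_values : statistical uncertainty of each model (same units)
    """
    mean = statistics.fmean(daa_values)
    # 99 cut points; index 15 is the 16th percentile and index 83 the 84th.
    cuts = statistics.quantiles(daa_values, n=100, method="inclusive")
    lower, upper = cuts[15], cuts[83]
    mean_sigma_stat = statistics.fmean(sigma_stat_values)
    half_range = 0.5 * (upper - lower)
    return {
        "mean": mean,
        "lower": lower,
        "upper": upper,
        "mean_sigma_stat": mean_sigma_stat,
        # A ratio well above 1 signals scatter in excess of the statistical error.
        "scatter_over_stat": half_range / mean_sigma_stat,
    }


if __name__ == "__main__":
    # Synthetic two-clump sample: 60 models near -1 and 40 near -3 (arbitrary units).
    sample = [-1.0] * 60 + [-3.0] * 40
    errors = [0.25] * 100
    print(scatter_summary(sample, errors))
```

Because the range is computed from the distribution of best-fit values alone, it does not include the statistical uncertainty of any individual model, which is why it can be read as an additional contribution from model non-uniqueness.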
Figure 1. $\Delta\alpha/\alpha$ (in units of $10^{-[\ldots]}$) for both information criteria, AICc and SpIC, for the absorption system towards HE0515-4414. Models in the top row were derived using AICc. The bottom row corresponds to SpIC. Each hollow circle corresponds to one ai-vpfit model. The error bars plotted are from the vpfit Hessian diagonal. Within each panel we show the simple mean $\Delta\alpha/\alpha$ and its 68% range (as super- and sub-scripts) for the different line broadening models. The 68% range does not include the statistical uncertainty $\sigma_s$, i.e. the uncertainty returned by the vpfit Hessian diagonal, so the range illustrated can be considered to represent an additional systematic error associated with model non-uniqueness, over and above the statistical uncertainty. Also within each panel we show the mean statistical uncertainty $\langle\sigma_s\rangle$. These quantities are explained in Section 3.

Table 1. Statistical details for AICc models A1, A2, A3, as illustrated in Figs. A1, A2, and A3.

Model:                                         A1              A2                 A3
$n_p$                                          13              13                 13
$\Delta\alpha/\alpha$ ($10^{-[\ldots]}$)       $-[\ldots] \pm [\ldots]$    $-[\ldots] \pm [\ldots].60$    $4.[\ldots] \pm [\ldots]$
$\chi^2_\nu$                                   $[\ldots]$      $[\ldots]$         $[\ldots]$
As can be seen, the spread in $\Delta\alpha/\alpha$ is $\sim [\ldots] \times 10^{-[\ldots]}$, approximately 6 times larger than the statistical (vpfit) uncertainty on each individual point (Fig. 3). The blue circles thus emphasise that for AICc models specifically, non-uniqueness is a highly significant issue, even when the correct line broadening model is used. To see this further, Figure 5 shows three example models, drawn ad hoc from each of the three clumps. All three models produce very similar $\chi^2$ values and visually indistinguishable models and normalised residuals (details are shown in Table 1 and all transitions modelled are shown in Figures A1, A2, and A3). If modelling is carried out interactively by a human, it is unclear where in the $\Delta\alpha/\alpha$-$\chi^2_\nu$ plane the fit might end up. Of course it is possible that an interactive modeller might be tempted (either consciously or otherwise) to manually adjust parameters towards the solution that gives $\Delta\alpha/\alpha$ closest to zero.

The main conclusions of our study are:

(1) The availability of a fully automated and unbiased procedure such as ai-vpfit reveals (as expected, but not previously demonstrated) that the $\chi^2$ space may contain multiple local minima. A human modeller cannot readily distinguish between a "real" and "fake" minimum. Multiple independent unbiased models are needed.
(2) The effect described above is especially prominent for models derived using turbulent broadening and AICc, but increased scatter can also be seen for thermal broadening models.
(3) The extensive HPC calculations reported here compellingly show that when solving for $\Delta\alpha/\alpha$, one should only consider the more physically plausible compound line broadening and that this is best done in conjunction with the SpIC information criterion and not using AICc.

Fitting turbulent models necessarily generates or enhances model non-uniqueness, adding a substantial additional random uncertainty to $\Delta\alpha/\alpha$. This is a likely source of excess scatter seen in previous samples of $\Delta\alpha/\alpha$ measurements. (The excess scatters shown in Figures 1, 2, and 3, over and above the statistical uncertainty, are comparable in magnitude and hence at least partially, and perhaps fully, account for excess scatter seen in the models of (King et al. 2012), parameterised using $\sigma_{\rm rand}$.)

Figure 2. $\Delta\alpha/\alpha$ (in units of $10^{-[\ldots]}$) vs. $N_p$, the number of free model parameters, for each set of $\sim 100$ models in each panel for the absorption system towards HE0515-4414. The plots illustrate that SpIC requires fewer model parameters compared to AICc and that SpIC suffers less from model non-uniqueness. The panels correspond to those in Figure 1, i.e. the upper row is for AICc and the lower row is for SpIC.

Figures 1, 2, and 3, panels (a), show how independent models clump into several groups, illustrating the multiple minima in $\chi^2$ space. As model complexity increases, absorption components which may be intrinsically single require additional components for a satisfactory fit. Each time this happens, the best-fit $\Delta\alpha/\alpha$ jumps to a new value. This process forms the groupings. Figure 2 (a) shows that the scatter (or number of groupings) increases with increasing $N_p$ (i.e. increasing model complexity). This is unsurprising; overfitting the data creates multiple $\Delta\alpha/\alpha$ solutions. Figures 2 (d) and (f) show that even if a human interactive modeller adopts a highly parsimonious approach, i.e. fits the minimum number of parameters possible, one still cannot completely avoid the non-uniqueness problem when fitting either turbulent or thermal models. The small number of points at $N_p \approx 90$, in Figure 2 (d), scatter more widely than expected on the basis of their statistical errors. The same is apparent in Figure 2 (f) at $N_p \approx [\ldots]$.

The preferred $\Delta\alpha/\alpha$ is not necessarily the same for the different line broadening mechanisms. The higher concentration of points in Figures 1 (a) and (d) sits around $-[\ldots] < \Delta\alpha/\alpha < -[\ldots]$ (in units of $10^{-[\ldots]}$). On the other hand, in Figures 1 (b) and (e), points are concentrated at positive $\Delta\alpha/\alpha$. Given the far higher scatter in panels (a) and (d), the obvious inference is that turbulent models actually bias the result, even if weaker $\chi^2$ minima are avoided. This means that turbulent broadening is the worst possible choice.

Figures 1, 2, and 3 show beyond reasonable doubt that SpIC achieves far more consistent results than does AICc. The reason is that SpIC more heavily penalises weak lines, so close blends occur less frequently, such that fewer $\chi^2_{\rm min}$ are found.
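The groupings discussed above are identified by eye in the figures. One simple way to make such a grouping reproducible, offered here purely as an illustration and not as the method used in this work, is to sort the best-fit values and start a new group wherever the gap between neighbours exceeds a chosen threshold. The function name, the threshold, and the sample values below are all hypothetical.

```python
def group_by_gaps(values, min_gap):
    """Split values into groups separated by gaps larger than min_gap.

    Returns a list of groups, each a sorted list, ordered from smallest to largest.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    groups = [[ordered[0]]]
    for value in ordered[1:]:
        if value - groups[-1][-1] > min_gap:
            groups.append([value])      # gap is large: open a new group
        else:
            groups[-1].append(value)    # gap is small: same group
    return groups


if __name__ == "__main__":
    # Hypothetical best-fit values (arbitrary units) from independent model runs.
    best_fits = [-3.1, -2.9, -1.0, -1.2, -0.9, -3.0, -1.1]
    for group in group_by_gaps(best_fits, min_gap=1.0):
        fraction = len(group) / len(best_fits)
        print(f"{len(group)} models ({fraction:.0%}) between {group[0]} and {group[-1]}")
```

The threshold would in practice need to be chosen relative to the statistical uncertainty of the individual models, since gaps smaller than that uncertainty cannot be distinguished from ordinary statistical scatter.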
Figure 3. $\Delta\alpha/\alpha$ (in units of $10^{-[\ldots]}$) vs. the normalised chi-squared, $\chi^2/N_{\rm dof}$, where $N_{\rm dof} = N_d - N_p$, the difference between the total number of data points and the number of model parameters, for the absorption system towards HE0515-4414. All panels correspond to Figures 1 and 2. This figure shows that AICc generally produces a slightly smaller value of $\chi^2/N_{\rm dof}$, as expected, and that both information criteria produce statistically acceptable models (i.e. $\chi^2/N_{\rm dof} \approx 1$).
Figure 4. $\Delta\alpha/\alpha$ for HE0515-4414 from AICc ai-vpfit models (blue circles) and SpIC (red crosses) as a function of $\chi^2$. The line broadening is compound. This figure corresponds to panels (b) and (e) in Figures 1, 2, and 3. The overfitting properties of AICc generate the $\chi^2$ sub-structure and model non-uniqueness dominates the overall $\Delta\alpha/\alpha$ uncertainty.

Figure 5. Three example ai-vpfit models (using AICc) for a segment of the absorption complex towards HE0515-4414. These are compound broadening fits (i.e. the lines are broadened by both thermal and turbulent Doppler motions). Each row of tick marks illustrates the component positions for each of the three models and the corresponding best-fit $\Delta\alpha/\alpha$ values (more detailed figures for each model are given in the Appendix). Model A1 has $\Delta\alpha/\alpha = (-[\ldots] \pm [\ldots]) \times 10^{-[\ldots]}$, model A2 has $\Delta\alpha/\alpha = (-[\ldots] \pm [\ldots]) \times 10^{-[\ldots]}$, and model A3 has $\Delta\alpha/\alpha = ([\ldots] \pm [\ldots]) \times 10^{-[\ldots]}$.
ACKNOWLEDGEMENTS
We are grateful for supercomputer time using OzSTAR at the Centre for Astrophysics and Supercomputing at Swinburne University of Technology. CCL thanks the Royal Society for a Newton International Fellowship during the early stages of this work. JKW thanks the John Templeton Foundation, the Department of Applied Mathematics and Theoretical Physics, the Institute of Astronomy, and Clare Hall at Cambridge University for hospitality and support.
DATA AVAILABILITY
The observational data were collected at the European Southern Observatory under ESO programme 102.A-0697(A) and are available through the ESO archive. The ai-vpfit models can be obtained from the authors on request.
REFERENCES
Bainbridge M. B., Webb J. K., 2017a, Universe, 3, 34
Bainbridge M. B., Webb J. K., 2017b, MNRAS, 468, 1639
Barrow J. D., Lip S. Z. W., 2012, Phys. Rev. D, 85, 023514
Carswell R. F., Webb J. K., 2014, VPFIT: Voigt profile fitting program, Astrophysics Source Code Library (ascl:1408.015)
Carswell R. F., Webb J. K., 2020, Bob Carswell's homepage, https://people.ast.cam.ac.uk/~rfc/
King J. A., Mortlock D. J., Webb J. K., Murphy M. T., 2009, Mem. Soc. Astron. Italiana, 80, 864
King J. A., Webb J. K., Murphy M. T., Flambaum V. V., Carswell R. F., Bainbridge M. B., Wilczynska M. R., Koch F. E., 2012, MNRAS, 422, 3370
Lee C.-C., Webb J. K., Carswell R. F., Milaković D., 2021, arXiv e-prints, p. arXiv:2008.02583
Marconi A., et al., 2016, in Evans C. J., Simard L., Takami H., eds, Ground-based and Airborne Instrumentation for Astronomy VI. SPIE, pp 676–687, doi:10.1117/12.2231653
Milaković D., Pasquini L., Webb J. K., Lo Curto G., 2020, MNRAS, 493, 3997
Milaković D., Lee C.-C., Carswell R. F., Webb J. K., Molaro P., Pasquini L., 2021, MNRAS, 500, 1
Pepe F., et al., 2021, A&A, 645, A96
Probst R. A., et al., 2020, Nature Astronomy, 4, 603
Tamai R., Koehler B., Cirasuolo M., Biancat-Marchet F., Tuti M., Gonzáles Herrera J. C., 2018, in Marshall H. K., Spyromilio J., eds, Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series Vol. 10700, Ground-based and Airborne Telescopes VII. p. 1070014, doi:10.1117/12.2309515
Webb J. K., King J. A., Murphy M. T., Flambaum V. V., Carswell R. F., Bainbridge M. B., 2011, Phys. Rev. Lett., 107, 191101
Webb J. K., Lee C.-C., Carswell R. F., Milaković D., 2020, arXiv e-prints, p. arXiv:2009.08336
APPENDIX A: THREE AI-VPFIT MODELS WITH COMPOUND BROADENING
The three figures shown here relate to panels (b) of Figures 1, 2, and 3. The line broadening mechanism is compound but the model selection criterion is AICc. Comparing panels (b) with panels (e) shows that independent ai-vpfit models derived using the latter exhibit far less scatter and hence that the overfitting tendency of AICc generates unnecessary model ambiguity. This problem is avoided when SpIC is used. Here we illustrate three models spanning the range in $\Delta\alpha/\alpha$ shown in panels (b).

Figure A1. Demonstration of the non-uniqueness problem. The model is fitted to HE0515-4414 using compound broadening and AICc. In this model, $\Delta\alpha/\alpha = -2.08 \times 10^{-5}$.

Figure A2. Same as Fig. A1 with another random seed. In this model, $\Delta\alpha/\alpha = -6.67 \times 10^{-6}$.

Figure A3. Same as Fig. A1 with another random seed. In this model, $\Delta\alpha/\alpha = 4.58 \times 10^{-6}$.

This paper has been typeset from a TEX/LaTeX file prepared by the author.