[PDF] Leading hadronic contribution to the muon magnetic moment from lattice QCD

Abstract

We compute the leading order hadronic vacuum polarization (LO-HVP) contribution to the anomalous magnetic moment of the muon, ( g μ −2) , using lattice QCD. Calculations are performed with four flavors of 4-stout-improved staggered quarks, at physical quark masses and at six values of the lattice spacing down to 0.064~fm. All strong isospin breaking and electromagnetic effects are accounted for to leading order. The infinite-volume limit is taken thanks to simulations performed in volumes of sizes up to 11~fm. Our result for the LO-HVP contribution to ( g μ −2) has a total uncertainty of 0.8\%. Compared to the result of the dispersive approach for this contribution, ours significantly reduces the tension between the standard model prediction for ( g μ −2) and its measurement.

Full PDF

LLeading hadronic contribution to the muon magnetic moment from lattice QCD Sz. Borsanyi , Z. Fodor , , , , , ∗ , J. N. Guenther , C. Hoelbling , S. D. Katz , L. Lellouch , T. Lippert , , K. Miura , , , L. Parato , K. K. Szabo , , F. Stokes , B. C. Toth , Cs. Torok , L. Varnhorst Department of Physics, University of Wuppertal, D-42119 Wuppertal, Germany J¨ulich Supercomputing Centre, Forschungszentrum J¨ulich, D-52428 J¨ulich, Germany Institute for Theoretical Physics, E¨otv¨os University, H-1117 Budapest, Hungary University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA Pennsylvania State University, Department of Physics, State College, PA 16801, USA Department of Physics, University of Regensburg, Regensburg D-93053, Germany Aix Marseille Univ, Universit´e de Toulon, CNRS, CPT, Marseille, France Helmholtz Institute Mainz, D-55099 Mainz, Germany Kobayashi-Maskawa Institute for the Origin of Particles and the Universe, Nagoya University, Nagoya a r X i v : . [ h e p - l a t ] A ug he standard model of particle physics describes the vast majority of experiments and observations involving elementary particles. Any deviation from its predictions would be a sign of new, fundamental physics. One long-standing discrepancy concerns the anomalous magnetic moment of the muon, ( g µ − , a measure of the magnetic ﬁeld surrounding that particle. Indeed, standard model predictions for ( g µ − , reviewed in [1], exhibit disagreement with the measurement [2] that is tightly scattered around . standard deviations. Today, theory and measurement errors are comparable. However, a new experiment is underway at Fermilab and another is planned at J-PARC, both aiming to reduce the measurement’s error by a factor of four. On the theory side, the dominant source of error is the leading-order, hadronic vacuum polarization (LO-HVP) contribution. To fully leverage the upcoming measurements, it is critical to check the prediction for this contribution with independent methods and to reduce its uncertainties. The most precise, model-independent determinations currently rely on dispersive techniques, combined with measurements of the cross-section for electron-positron annihilation into hadrons [3–6]. Here we use ab initio simulations in quantum chromodynamics and quantum electrodynamics to compute the LO-HVP contribution with suﬃcient precision to discriminate between the measurement of ( g µ − and the dispersive predictions. Surprisingly our result, [( g µ − / LO − HVP = 708 . . × − , eliminates the need to invoke new physics to explain the measurement of ( g µ − . Moreover, the methods used and developed here will allow further increases in precision, as more powerful computers become available. The muon is an ephemeral sibling of the electron. It is 207 times more massive, but has the same electric charge and spin. Similarly to the electron, it behaves like a tiny magnet, characterized by a magnetic moment. This quantity is proportional to the spin and charge of the muon, and inversely proportional to twice its mass. Dirac’s relativistic quantum mechanics predicts that the constant of proportionality, g µ , should be 2. However, in a relativistic quantum ﬁeld theory such as the standard model, this prediction receives small corrections due to quantum, vacuum ﬂuctuations. These corrections are called the anomalous magnetic moment and are quantiﬁed by ( g µ − / . They were measured to an exquisite 0.54 ppm at the Brookhaven National Laboratory in the early 2000s [2], and have been calculated with a comparable precision (see [7] for a recent review). At this level of precision, all of the interactions of the standard model contribute. The leading contributions are electromagnetic and described by quantum electrodynamics (QED), but the one that dominates the theory error is induced by the strong interaction and requires solving the highly non-linear leading-order, hadronic vacuum polarization (LO-HVP), which describes how the propagation of a virtual photon is modiﬁed by the presence of quark and gluon ﬂuctuations in the vacuum. Here we compute this LO-HVP contribution to ( g µ − / , denoted by a LO − HVP µ , using ab initio simulations in QCD and QED. QCD is a generalized version of QED. The Euclidean Lagrangian for this theory is L = 1 / (4 e ) F µν F µν + / (2 g ) Tr G µν G µν + (cid:80) f ¯ ψ f [ γ µ ( ∂ µ + iq f A µ + iB µ ) + m f ] ψ f , where γ µ are the Dirac-matrices, f runs over the ﬂavors of quarks, the m f are their masses and the q f are their charges in units of the electron charge e . Moreover, F µν = ∂ µ A ν − ∂ ν A µ and G µν = ∂ µ B ν − ∂ ν B µ + [ B µ , B ν ] and g is the QCD coupling constant. In electrodynamics, the gauge potential A µ is a real valued ﬁeld, whereas in QCD, B µ is a × Hermitian matrix ﬁeld. The diﬀerent “ﬂavors” of quarks are represented by independent fermionic ﬁelds, ψ f . These ﬁelds have an additional “color” index in QCD, which runs from 1 to 3. In the present work, we include both QED and QCD, as well as four non-degenerate quark ﬂavors (up, down, strange and charm), in a lattice formulation taking into account all dynamical eﬀects. We also consider the tiny contributions of the bottom and top quarks, as discussed in the Supplementary Information. We compute a LO − HVP µ in the so-called time-momentum representation [8], which relies on the following, zero three-momentum, two-point function in Euclidean time t : G ( t ) = 13 (cid:88) µ =1 , , (cid:90) d x (cid:104) J µ ( (cid:126)x, t ) J µ (0) (cid:105) , (1)where eJ µ is the quark electromagnetic current with J µ = ¯ uγ µ u − ¯ dγ µ d − ¯ sγ µ s + ¯ cγ µ c . u, d, s and c are the up, down, strange and charm quark ﬁelds and the angle brackets stand for the QCD+QED expectation value to order e . It is convenient to decompose G ( t ) into light, strange, charm and disconnected components, which have very diﬀerent statistical and systematic uncertainties. Integrating the one- photon-irreducible ( γ I ) part of the two-point function (1) yields the LO-HVP contribution to the magnetic moment of the muon [8–11]: a LO − HVP µ = α (cid:90) ∞ dt K ( t ) G γ I ( t ) , (2)with the weight function, K ( t ) = (cid:90) ∞ dQ m µ ω (cid:18) Q m µ (cid:19) (cid:20) t − Q sin (cid:18) Qt (cid:19)(cid:21) , (3)3nd where ω ( r ) = [ r + 2 − (cid:112) r ( r + 4)] / (cid:112) r ( r + 4) , α is the ﬁne structure constant in the Thomson limit and m µ is the muon mass. Since we consider only the LO-HVP contribution, for brevity we drop the superscript and multiply the result by , ie. a µ stands for a LO − HVP µ × throughout this work. The subpercent precision, that we are aiming for, represents a huge challenge for lattice QCD. To reach that goal, we have to address four critical issues: A. scale determination; B. noise reduction; C. QED and strong-isospin breaking; D. inﬁnite-volume and continuum extrapolations. We discuss these one by one. ad A. The quantity a µ depends on the muon mass. When computing (2) on the lattice, m µ has to be converted into lattice units, am µ , where a is the lattice spacing. A relative error of the lattice spacing propagates into about a twice as large a relative error on a µ , so that a has to be determined with a few permil precision. We use the mass of the Ω baryon, M Ω = 1672 . MeV [1], to set the lattice spacing. We also use the w -scale from [12], in order to deﬁne an isospin decomposition of our observables. Though w can be measured with sub-permil precision on the lattice, it is inaccessible experimentally. In this work we determine the physical value of w including QED and strong-isospin-breaking eﬀects: w , ∗ = 0 . fm, where the ﬁrst error is statistical, the second is systematic and the third is the total error. In total we reach a relative accuracy of four permil, which is better than the error of the previous best determination of [13], whose value agrees with ours. There the pion-decay-constant was used as experimental input and the isospin-breaking eﬀects were only included as an estimate. ad B. Our result for a µ is obtained as an integral over the conserved current-current correlation function, from zero to inﬁnite time separation, as shown in Equation (2). For large separations the correlator is quite noisy. This noise manifests itself as a statistical error in a µ . To reach the desired accuracy on a µ , one needs high-precision at every step. Over 20,000 conﬁgurations were accumulated for our 27 ensembles on L ≈ fm lattices. In addition, we also include a lattice with L ≈ fm (see point D). The most important improvement over our earlier a µ determination in [14] is the extensive use of analysis techniques based on the lowest eigenmodes of the Dirac operator, see eg. [15–18]. About an order of magnitude accuracy-gain can be reached using this technique for a µ [19, 20]. ad C. The precision needed cannot be reached with pure, isopin-symmetric QCD. Thus, we include QED eﬀects and allow the up and down quarks to have diﬀerent masses. These eﬀects are included both in the scale determination and in the current-current correlators. Note that the separation of isospin symmetric and isospin breaking contributions requires a convention, which we discuss in detail in the Supplementary Information. Strong-isospin breaking is implemented by taking derivatives of QCD+QED isospin-symmetric conﬁgurations [21]. Note that the ﬁrst derivative of the fermionic determinant vanishes. We also implement derivatives with respect to the electric charge [22]. It is useful to distinguish between the electric charge in the fermionic determinant, e s or sea electric-charge, and in the observables, e v or valence electric-charge. The complete list of graphs that should be evaluated are shown in Figure 1 with our numerical results for them. The ﬁnal observable is given as a Taylor-expansion around the isospin-symmetric, physical-mass point with zero sea and valence charges. Instead of the quark masses, we use the pseudoscalar meson masses of pions and kaons, which can be determined with high precision. With the expansion coeﬃcients, we extrapolate in the charges, in the strong-isospin breaking parameter and in the lattice spacing and interpolate in the quark masses to the phyiscal point. Thus, we obtain a µ and its statistical and systematic uncertainties. ad D. The standard wisdom for lattice calculations is that M π L > should be taken, where M π is the mass of the pion and L is the spatial extent of the lattice. Unfortunately, this is not satisfactory in the present case: a µ is far more sensitive to L than other quantities, such as hadron masses, and large volumes are needed to reach permil accuracy. For less volume-sensitive quantities in this work, we use well-established results to determine the ﬁnite-volume corrections on the pion-decay constant [23] and on charged hadron masses [24–26]. Leading-order chiral perturbation theory [27] or two-loop, partially- quenched chiral perturbation theory [20, 28] for a µ help, but the non-perturbative, leading-order, large- L expansion of [29] indicates that those approaches still lead to systematic eﬀects which are larger than the accuracy that we are aiming for. In addition to the inﬁnite-volume extrapolation, the continuum extrapolation is also diﬃcult. This is connected to the taste-symmetry breaking of staggered fermions, which we use in this work. We determine the ﬁnite-volume corrections and/or improve the continuum extrapolations by three means: i. we work out the full two-loop, ﬁnite-volume, staggered chiral perturbation theory corrections for a µ ; ii. we apply a modiﬁed Lellouch-L¨uscher-Gounaris-Sakurai technique to calculate the necessary contributions; iii. most importantly we carry out a full lattice simulation on an L ≈ fm lattice, with highly-suppressed taste violations and with physical, taste-averaged pion masses. Combining all of these ingredients we obtain, as a ﬁnal result, a µ = 708 . . . . . The ﬁrst, statistical error comes mostly from the noisy, large-distance region of the current-current correlator. The second, systematic error is dominated by the continuum extrapolation and the ﬁnite-size eﬀect computa- accuracy of . %. In Figure 2 we show the continuum extrapolation of the light, connected component of the isospin symmetric part of a µ , which gives the dominant contribution to a µ . Figure 3 compares our result with previous lattice computations and also with results from the R-ratio method. In principle, one can reduce the uncertainty of our result by combining our lattice correlator, G ( t ) , with the one obtained from the R-ratio method, in regions of Euclidean time where the latter is more precise [19]. We do not do so here because there is a tension between our result and those obtained by the R-ratio method, as can be seen in Figure 3. For the total, LO-HVP contribution to a µ , our result is . σ , . σ and . σ larger than the R-ratio results of a µ = 694 . . [3], a µ = 692 . . [4] and a µ = 692 . . [5, 6], respectively. This tension requires further investigation. Note that all three R-ratio determinations are based on the same experimental data set and thus they are strongly correlated. As a ﬁrst step in that direction, it is instructive to consider a modiﬁed observable, where the correlator G ( t ) is restricted to a ﬁnite interval by a smooth window function [19]. This observable, which we denote by a µ, win , is obtained much more readily than a µ on the lattice. Its shorter-distance nature makes it signiﬁcantly less susceptible to statistical noise and to ﬁnite-volume eﬀects. Moreover, in the case of staggered fermions, it has reduced discretization artefacts. This is shown in Figure 4, where the light, connected component of a µ, win is plotted as a function of a . Because the determination of this quantity does not require overcoming many of the challenges described above, other lattice groups have obtained it with errors comparable to ours [19, 20]. This allows for a sharper benchmarking of our calculation of this challenging, light-quark contribution that dominates a µ . On average, our a light µ, win is only 1.2 σ away from those of [19, 20]. Moreover, a µ, win can be computed in the R-ratio approach, and we do so using the data set courteously given to us by the authors of [4]. However, here we ﬁnd a . σ tension with our lattice result. To conclude, when combined with the other standard model contributions (see eg. [3, 4]), our result for the leading-order hadronic contribution to the anomalous magnetic moment of the muon, a LO − HVP µ = 708 . . × − , eliminates the longstanding discrepancy between experiment and theory. However, as discussed above and can be seen in Figure 3, our lattice result exhibits a tension with the R-ratio determinations of [3–6]. Obviously, our ﬁndings should be conﬁrmed –or refuted– by other collaborations using other discretizations of QCD. Those investigations are underway.

Acknowledgments.

We thank J. Charles, A. El-Khadra, M. Hoferichter, F. Jegerlehner, C. Lehner,

M. Knecht, A. Kronfeld and E. de Rafael for informative discussions. We thank J. Bailey, W. Lee and

6. Sharpe for correspondence on staggered chiral perturbation theory. Special thanks goes to A. Keshavarzi for providing us cross-section data and for useful discussions. The computations were performed on

JUQUEEN, JURECA, JUWELS and QPACE at Forschungszentrum J¨ulich, on SuperMUC and SuperMUC-

NG at Leibniz Supercomputing Centre in M¨unchen, on Hazel Hen and HAWK at the High Performance

Computing Center in Stuttgart, on Turing and Jean Zay at the Institute for Development and Resources in

Intensive Scientiﬁc Computing (IDRIS) in Orsay, on Marconi in Roma and on GPU clusters in Wuppertal and Budapest. We thank the Gauss Centre for Supercomputing, PRACE and GENCI (grant 52275) for awarding us computer time on these machines. This project was partially funded by the DFG grant

SFB/TR55, by the BMBF Grant No. 05P18PXFCA, by the Hungarian National Research, Development and Innovation Oﬃce grant KKP126769 and by the Excellence Initiative of Aix-Marseille University -

A*MIDEX (ANR-11-IDEX-0001-02), a French “Investissements d’Avenir” program, through the “Chaire d’Excellence” initiative and the OCEVU “Laboratoire d’Excellence” (ANR-11-LABX-0060).

References

1. Tanabashi, M. et al.

Review of Particle Physics.

Phys. Rev.

D98,

2. Bennett, G. W. et al.

Final Report of the Muon E821 Anomalous Magnetic Moment Measurement at BNL.

Phys. Rev.

D73,

3. Davier, M., Hoecker, A., Malaescu, B. & Zhang, Z. A new evaluation of the hadronic vacuum polarisation contributions to the muon anomalous magnetic moment and to α ( m ) . Eur. Phys. J. C

241 (2020).

4. Keshavarzi, A., Nomura, D. & Teubner, T. g − of charged leptons, α ( M Z ) , and the hyperﬁne splitting of muonium. Phys. Rev.

D101,

5. Colangelo, G., Hoferichter, M. & Stoﬀer, P. Two-pion contribution to hadronic vacuum polarization.

JHEP

006 (2019).

6. Hoferichter, M., Hoid, B.-L. & Kubis, B. Three-pion contribution to hadronic vacuum polarization.

JHEP

137 (2019).

7. Aoyama, T. et al.

The anomalous magnetic moment of the muon in the Standard Model. arXiv: (June 2020). trong isospin-breaking connected light connected strange connected charm disconnected634.6(2.7)(3.7) 53.393(89)(68) 14.6(0)(1) -13.15(1.28)(1.29)0.11(4) bottom; higher order;perturbative Etc.Finite-size effects disconnected-4.63(54)(69) ×a μ LO-HVP = 708.7(2.8) stat (4.5) sys [5.3] tot

QEDisospin-breaking:valence Isospin symmetric connected disconnectedconnected disconnected connecteddisconnectedconnected -0.55(15)(11) -0.047(33)(23)0.011(24)(14)-1.27(40)(33)-0.0095(86)(99)0.42(20)(19) 6.59(63)(53)

QEDisospin-breaking: seaQEDisospin-breaking:mixed isospin-symmetricisospin-breaking18.7(2.5)0.0(0.1)

Figure 1: List of the contributions to a µ , including examples of the corresponding Feynman diagrams.Solid lines are quarks and curly lines are photons. Gluons are not shown explicitly, and internal quarkloops, only if they are attached to photons. Dots represent coordinates in position space, a box indicatesthe mass insertion relevant for strong-isospin breaking. The numbers give our result for each contribution,they correspond to our “reference” system size given by L ref = 6 . fm spatial and T ref = 9 . fmtemporal lattice extents. We also explicitly compute the ﬁnite-size corrections that must be added tothese results, these are given separately in the lower right panel. The ﬁrst error is the statistical and thesecond is the systematic uncertainty; except for the contributions where only a single, total error is given.8

540 560 580 600 620 640 6600.000 0.005 0.010 0.015 0.020 [ a µ li gh t ] i s o a [fm ] NNLOSSLGS-winNNLO-winNLOnone

Figure 2: Continuum extrapolation of the isospin-symmetric light connected component of a µ , denotedby [ a light µ ] iso . The data points are obtained on lattices of sizes L ≈ fm. The diﬀerent colors/symbolscorrespond to diﬀerent types of improvement procedures: “none” stands for applying no improvement;“NLO” and “NNLO” refer to improvements based on the next-to-leading and the next-to-next-to-leadingorders of ﬁnite-volume, staggered chiral perturbation theory; “SLLGS” is an approach based on experi-mental input parameterized by a Gounaris-Sakurai model combined with the Lellouch-L¨uscher formalism(see the Supplementary Information for details). The two methods labeled with ’win’ are used to obtainthe ﬁnal results of the paper. The lines show ﬁts using linear and quadratic ﬁts in a with varying numberof lattice spacings in the ﬁt. 9 HHKS’19KNT’19DHMZ’19BMWc’17RBC’18ETM’19FHM’19Mainz’19BMWc’20 660 680 700 720 740 10 × a LO-HVP µ latticeR-ratio no new physics Figure 3: Comparison of recent results for the leading-order, hadronic vacuum polarization contributionto the anomalous magnetic moment of the muon (see [7] for a recent review). Green squares are latticeresults: this work’s result, denoted by BMWc’20 and represented by a ﬁlled symbol at the top of the ﬁgure,is followed by Mainz’19 [30], FHM’19 [31], ETM’19 [32], RBC’18 [19] and our earlier work BMWc’17[14]. Red circles are obtained using the R-ratio method from DHMZ’19 [3], KNT’19 [4] and CHHKS’19[5, 6]; these results use the same experimental data as input. The blue shaded region is the value that a LO − HVP µ would have to have to explain the experimental measurement of ( g µ − , assuming no newphysics. 10

198 200 202 204 206 208 210 212 214 R -r a t i o / l a tt i c e R B C ’ A ub i n ’ B M W c ’ [ a li gh t µ , w i n ] i s o w / o i m p r o v e m en t N L O S X P T i m p r o v e m e n t a [fm ] Figure 4: Continuum extrapolation of the isospin-symmetric, light, connected component of the windowobservable a µ, win , denoted by [ a light µ, win ] iso . The data points are extrapolated to the inﬁnite-volume limit.Two diﬀerent ways to perform the continuum extrapolations are shown: one without improvement, andanother with corrections from next-to-leading-order staggered chiral perturbation theory. In both cases thelines show linear and quadratic ﬁts in a with varying number of lattice spacings in the ﬁt. The continuumextrapolated result is shown with the results from other lattice groups, RBC’18 [19] and Aubin’19 [20].Also plotted is our R-ratio-based determination, obtained using the experimental data compiled by theauthors of [4] and our lattice results for the non light connected contributions. This value, denoted by’R-ratio/lattice’, is . σ smaller than our pure lattice result for [ a light µ, win ] iso .11. Bernecker, D. & Meyer, H. B. Vector Correlators in Lattice QCD: Methods and applications. Eur.

Phys. J.

A47,

148 (2011).

9. Lautrup, B. E., Peterman, A. & de Rafael, E. Recent developments in the comparison between theory and experiments in quantum electrodynamics.

Phys. Rept.

10. De Rafael, E. Hadronic contributions to the muon g-2 and low-energy QCD.

Phys. Lett.

B322,

11. Blum, T. Lattice calculation of the lowest order hadronic contribution to the muon anomalous magnetic moment.

Phys. Rev. Lett.

12. Borsanyi, S. et al.

High-precision scale setting in lattice QCD.

JHEP

010 (2012).

13. Dowdall, R., Davies, C., Lepage, G. & McNeile, C. Vus from pi and K decay constants in full lattice

QCD with physical u, d, s and c quarks.

Phys. Rev. D

14. Borsanyi, S. et al.

Hadronic vacuum polarization contribution to the anomalous magnetic moments of leptons from ﬁrst principles.

Phys. Rev. Lett.

15. Neﬀ, H., Eicker, N., Lippert, T., Negele, J. W. & Schilling, K. On the low fermionic eigenmode dominance in QCD on the lattice.

Phys. Rev.

D64,

16. Giusti, L., Hernandez, P., Laine, M., Weisz, P. & Wittig, H. Low-energy couplings of QCD from current correlators near the chiral limit.

JHEP

013 (2004).

17. DeGrand, T. A. & Schaefer, S. Improving meson two point functions in lattice QCD.

Comput.

Phys. Commun.

18. Shintani, E. et al.

Covariant approximation averaging.

Phys. Rev.

D91,

19. Blum, T. et al.

Calculation of the hadronic vacuum polarization contribution to the muon anomalous magnetic moment.

Phys. Rev. Lett.

20. Aubin, C. et al.

Light quark vacuum polarization at the physical point and contribution to the muon g − . Phys. Rev. D

21. De Divitiis, G. M. et al.

Isospin breaking eﬀects due to the up-down mass diﬀerence in Lattice

QCD.

JHEP

124 (2012).

22. De Divitiis, G. M. et al.

Leading isospin breaking eﬀects on the lattice.

Phys. Rev.

D87, (2013).

Nucl. Phys.

B721,

24. Davoudi, Z. & Savage, M. J. Finite-Volume Electromagnetic Corrections to the Masses of Mesons,

Baryons and Nuclei.

Phys. Rev.

D90,

25. Borsanyi, S. et al.

Ab initio calculation of the neutron-proton mass diﬀerence.

Science

26. Fodor, Z. et al.

Quantum electrodynamics in ﬁnite volume and nonrelativistic eﬀective ﬁeld theories.

Phys. Lett.

B755,

27. Aubin, C. et al.

Finite-volume eﬀects in the muon anomalous magnetic moment on the lattice.

Phys. Rev.

D93,

28. Bijnens, J. & Relefors, J. Vector two-point functions in ﬁnite volume using partially quenched chiral perturbation theory at two loops.

JHEP

114 (2017).

29. Hansen, M. T. & Patella, A. Finite-volume eﬀects in ( g − HVP,LO µ . arXiv: (2019).

30. Gerardin, A. et al.

The leading hadronic contribution to ( g − µ from lattice QCD with N f = 2 + 1 ﬂavours of O( a ) improved Wilson quarks. Phys. Rev.

D100,

31. Davies, C. T. H. et al.

Hadronic-Vacuum-Polarization Contribution to the Muon’s Anomalous

Magnetic Moment from Four-Flavor Lattice QCD.

Phys. Rev.

D101,

32. Giusti, D., Lubicz, V., Martinelli, G., Sanﬁlippo, F. & Simula, S. Electromagnetic and strong isospin- breaking corrections to the muon g − from Lattice QCD+QED. Phys. Rev.

D99, upplementary Information Leading hadronic contribution to the muon magnetic momentfrom lattice QCD

Sz. Borsanyi , Z. Fodor , , , , , ∗ , J. N. Guenther , C. Hoelbling , S. D. Katz , L. Lellouch , T. Lippert , ,K. Miura , , , L. Parato , K. K. Szabo , , F. Stokes , B. C. Toth , Cs. Torok , L. Varnhorst Department of Physics, University of Wuppertal, D-42119 Wuppertal, Germany J¨ulich Supercomputing Centre, Forschungszentrum J¨ulich, D-52428 J¨ulich, Germany Institute for Theoretical Physics, E¨otv¨os University, H-1117 Budapest, Hungary University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA Pennsylvania State University, Department of Physics, State College, PA 16801, USA Department of Physics, University of Regensburg, Regensburg D-93053, Germany Aix Marseille Univ, Universit´e de Toulon, CNRS, CPT, Marseille, France Helmholtz Institute Mainz, D-55099 Mainz, Germany Kobayashi-Maskawa Institute for the Origin of Particles and the Universe, Nagoya University, Nagoya464-8602, Japan

Contents action and gauge ensembles 32 The action and gauge ensembles 63 Overlap action 94 Hadron mass measurements 125 Path integral and expectation values 176 Isospin breaking: decomposition 197 Isospin breaking: dynamical QED 208 Isospin-breaking: w -scale 219 Isospin breaking: hadron masses 2310 Current propagator (cid:104) J J (cid:105) a µ (cid:104) J J (cid:105) (cid:104)

J J (cid:105) a µ a light µ and a disc µ w , M ss and ∆ M M Ω scale-setting 6723 Results for a µ and its various contributions 6924 Result for a µ, win a µ, win .940.960.981.001.02 ( M K - M π ) / ( ph ys ) .

94 0 .

96 0 .

98 1 .

00 1 . ( M K - M π ) / ( ph ys ) M π /(phys) .

94 0 .

96 0 .

98 1 .

00 1 . M π /(phys) .

94 0 .

96 0 .

98 1 .

00 1 . M π /(phys)physomegaw Figure 1: Position of the ensembles in the plane of the hadron mass combinations of Equation(1). These correspond approximately to the light and strange quark masses. The lattice spacings are a = 0 . , . , . , . , . and . fm, respectively. The corresponding β gaugecouplings are indicated at the top of each panel. action and gauge ensembles The main part of the simulation eﬀort was carried out using the lattice action. This discretizationis deﬁned through the use of the tree-level Symanzik gauge action [2] and a one-link staggered fermionaction with four steps of stout smearing [3]. The smearing parameter was set to ρ = 0 . .We have chosen six gauge coupling parameters, β = 6 /g , as shown in Table 1. All of these ensembleswere generated using 2+1+1 dynamical ﬂavors with no isospin breaking. The charm mass is set by its ratioto the strange mass, m c /m s = 11 . , which comes from the spectroscopy of the pseudoscalar charmedmesons in the continuum limit worked out in [4]. This value is within one per-cent of the latest FLAGaverage [5]. The light and strange quark masses are chosen to scatter around a “physical point” deﬁnedby the pseudoscalar masses M π and M K and the mass of the Omega baryon, M Ω , as follows: M π M = (cid:34) M π M − (cid:35) ∗ , M K − M π M = (cid:34) M K χ M − (cid:35) ∗ . (1)where ∗ denotes the experimental value and M K χ ≡ (cid:0) M K + + M K − M π + (cid:1) . (2)The latter quantity is designed to be approximately proportional to the strange quark mass with a vanishingleading order sensitivity to strong-isospin breaking.In Equation (1), the mass of the Omega baryon plays the role of the scale setting variable. It could,in principle, be replaced by any other dimensionful quantity that satisﬁes the criteria: a) moderate quark3 a [ fm ] L × T m s m s /m l conf . . ×

64 0 . .

899 9043 . . ×

96 0 . .

038 3150 . .

939 5160 . .

183 5040 . .

038 5220 . .

038 2153 . . ×

84 0 . .

843 5100 . .

400 5050 . .

479 5070 . .

852 3853 . . ×

96 0 . .

500 5100 . .

205 1900 . .

205 4360 . .

007 15030 . .

893 5003 . . ×

128 0 . .

679 5060 . .

502 5120 . .

512 10010 . .

679 3270 . .

738 14500 . .

502 5004 . . ×

144 0 . .

634 4460 . .

124 5510 . .

634 22480 . .

124 10000 . .

263 9850 . .

695 1750

Table 1: List of ensembles with gauge coupling, lattice spacing at the physical point, lattice size,quark masses and number of conﬁgurations. Two diﬀerent lines with the same parameters means thatthe ensembles were generated in two diﬀerent streams.4 -2 -1 a s c a li n g ( M π ,t - M π ) / ( M e V ) a[fm] Figure 2: Taste multiplet of staggered pions as a function of lattice spacing, both for the andthe action. We label the meson operators by Roman numbers as in [1]. The root-mean-square pionmass corresponds approximately to the operator VIII, drawn with a solid line.mass dependence, b) precisely determined in a lattice simulation, c) known experimental value to anaccuracy better than a permil level. The mass M Ω satisﬁes all three criteria, see Section 4 for moredetails. In this work we also use the w -scale [6], which is derived from the Wilson-ﬂow of the gaugeﬁelds [7]. The main motivation for this scale setting is to deﬁne an isospin decomposition, see Section 6.The w -scale readily satisﬁes both the a) and b) criteria but, alas, it is deﬁned in terms of observablesin Euclidean space, not by any experiment. In order to use w for scale setting, we ﬁrst determine itsphysical value from our simulations, using the accurate M Ω scale as an input. This is described in Section21. Evidently whenever we use the w scale setting, both the statistical and the systematic error of w ,as well as the statistical correlation, will be accounted for. As a by-product of this procedure, we give aphysical value for w , including dynamical QED eﬀects, for the ﬁrst time in the literature.Our main analysis is based on the 27 ensembles shown in Table 1. In Figure 1 we show the “landscape”for each of our lattice spacings: we plot the ensembles in a plane where the x and y axes give the relativedeviation of the light and strange quark masses from their physical value. These are deﬁned by the hadronicobservables and their experimental values in Equation (1). The simulation parameters are chosen in away that makes interpolation to the physical point possible. This “bracketing” feature is not available foreach lattice spacing, but only if all lattice spacings are considered together. This is not a problem, since inour analyses we apply global ﬁts with all ensembles included. In Figure 1 each ensemble is represented bytwo points, corresponding to the M Ω and w scale settings. Although we determined the physical valueof w using M Ω itself, the mass ratios vary with the choice of scale setting. This is because there arediscretization eﬀects in the w M Ω product on the lattice. Notice that the ﬁner the lattices, the smallerthe diﬀerence in the respective mass ratios.We also measured the taste violation for all six lattice spacings. The result is shown in Figure 2. Theplot shows the mass-squared diﬀerence between a non-Goldstone pion and the Goldstone pion as a functionof the lattice spacing. The diﬀerence shows a behavior that resembles a in the range of our smallestthree lattice spacings. This is much faster than the α s a [8], where α s is the strong coupling constantat the lattice cutoﬀ scale. The faster falloﬀ is most probably due to higher order terms of the type α ns a with n > . At the smallest lattice spacing the root-mean-square pion mass is about m π, RMS = 155

MeV.In Figure 3 we show the topological-charge history in a run on our ﬁnest lattice. The charge Q wascomputed using the standard discretization of the topological charge density at a Wilson-ﬂow time of τ ,5 Q β =4.0126, a=0.064fm Figure 3: History of topological charge Q , deﬁned from the Wilson-ﬂow in a run at the physicalpoint. The lattice spacing is about a = 0 . fm. β a [ fm ] m s m s /m l L × T conf M π [ MeV ]0 . .

112 0 . .

971 56 ×

84 7709 10496 ×

96 96233 .

728 56 ×

84 8173 12196 ×

96 813

Table 2: List of ensembles with gauge coupling, lattice spacing, quark masses, lattice size, numberof conﬁgurations and Goldstone pion mass. These masses are chosen so that they bracket the point wherea certain taste-average pion mass has the physical value of the pion mass (see text). At that point theGoldstone-pion mass is M π = 110 MeV.which was set to have a smearing radius of about √ τ ≈ . fm. The integrated autocorrelation time of Q is found to be trajectories. action and gauge ensembles A major systematic eﬀect of the ensembles is related to their box size, which is about L ≈ fm.To obtain the inﬁnite-volume result it is desirable to extend the data set with a signiﬁcantly larger box.A large box is computationally only aﬀordable with a large lattice spacing. On coarse lattices, however,taste violations make the pions too heavy, which completely distorts the ﬁnite-volume behavior and makesa ﬁnite-size study pointless.We introduce here a new staggered action, called , to drastically reduce the taste violation. Thisrequires a gauge action that heavily suppresses ultraviolet ﬂuctuations and a fermion action with a moreagressive smearing than . As a result, taste splitting is reduced by an order of magnitude. Addi-tionally, we lower the Goldstone-pion mass below the physical value, ensuring that the heavier tastes arecloser to the physical pion mass. The topological susceptibility is particularly sensitive to taste violations.We show here that the observed reduced taste violation is paired with having the topological susceptibilitymuch closer to the continuum than in at the same lattice spacing. We use this action togenerate the lattices for our ﬁnite-size study.The gauge ensembles use the same action both for the valence and the sea quarks. They have6 Q β =0.73, a=0.112fm Figure 4: History of topological charge, deﬁned from the Wilson-ﬂow in a run at the physical point.The lattice spacing is about a = 0 . fm and the strange-to-light-quark mass ratio is m s /m l = 33 . . χ [f m - ] m l /m l,phys XPT4stout-sym,0.112fm0.064fm4hex-dbw2,0.112fm,32x6456x84

Figure 5: Topological susceptibility as a function of the quark mass with the action at a = 0 . fmlattice spacing. Also plotted are results at the physical point and with two diﬀerent lattice spacings.The grey band is the prediction of leading order chiral perturbation theory, with parameters taken from[9]. The simulation with the lightest quark mass ( m l = 0 . · m phys ) has a topological susceptibility,that is about the same as in the continuum at the physical point.7he following characteristics: • n f = 2 + 1 ﬂavors of one-link staggered quarks with four steps of HEX smeared [10] gauge links, • DBW2 gauge action [11], which diﬀers from the Symanzik gauge action used in only in thecoeﬃcient of the × Wilson-loop.The rationale for this choice is to drastically reduce the ultraviolet ﬂuctuations in the gauge conﬁgurations,since these directly impact the size of the taste violation.Four steps of HEX smearing suppresses the ultraviolet ﬂuctuations much more than four steps of stoutsmearing, and does so without increasing the locality range of the smearing procedure. Though highernumbers of HEX smearing steps would also have been possible, the increasing cost of the smearing andthe marginal improvement in the taste violations make an even higher number of steps less practical.The DBW2 gauge action suppresses taste violations even more than the Symanzik gauge action,as shown eg. in [12]. However, it slows down the decorrelation of the topological charge towards thecontinuum limit much more dramatically than other gauge actions. We use the action only atlattice spacings where suﬃcient tunnelings in the topological charge Q are observed. This includes thelattice spacing where we carry out the ﬁnite-volume study of the hadronic vacuum polarization. Figure4 shows the history of the charge Q in one of these runs. Again, Q is computed using the standarddiscretization of the topological charge density at a Wilson-ﬂow time of τ , which was set to have asmearing radius of about √ τ ≈ . fm. The integrated autocorrelation time of Q is found to be trajectories.In exploring the parameter space of the action we carried out n f = 3 simulations at ﬁve diﬀerent β values in the range β = 0 . . . . . , corresponding to lattice spacings a ≈ . . . . . fm. Note thatsuch small β values are typical with the DBW2 gauge action. The quark mass was tuned to the vicinityof the three ﬂavor symmetric point where the quark mass equals the physical value of (2 m l + m s ) . Thelattice sizes were × . Members of the pion taste multiplet are shown as a function of the latticespacing in Figure 2, together with data. The taste violations are an order of magnitudesmaller than that of the action. At a lattice spacing of a = 0 . fm, is as good as at a = 0 . fm, which is the ﬁnest lattice spacing available.This reduced taste violation is also reﬂected in the topological susceptibility. Non-chiral actions,including staggered fermions, typically show large discretization errors in this quantity. Figure 5 shows n f = 2 + 1 simulations at β = 0 . , a = 0 . fm, with the physical strange-quark mass. Thelight-quark mass was varied from . to . times its physical value, set by the Goldstone-pion mass. Thelattice sizes are × and × . Results obtained with the action at the physical point arealso given in the plot. The result at the same lattice spacing is oﬀ by an order of magnitude fromthe continuum expectation. On the other hand the data closely follows the continuum curve almostdown to the chiral limit, and is as good as the result at the ﬁnest lattice spacing available.For the ﬁnite-size study of the hadronic vacuum polarization we work at β = 0 . and m s = 0 . .This choice corresponds to a = 0 . fm and about a physical strange-quark mass. The ensemblesgenerated are listed in Table 2. We have two diﬀerent volumes with the same parameters. We will referto the smaller volume as the “reference volume”. It has a spatial and time extent of L ref = 6 . fm and T ref = 32 L ref . (3)This geometry corresponds approximately to the geometry of the lattices in the data set. Thelarger lattice extents will be denoted by L big and T big and are given as L big = T big = 10 . fm . (4)We also use two light-quark-mass values, so that we can bracket the physical point. Here, diﬀerentlyfrom above and also from the data set, we set the physical point, not with the Goldstone-pionmass, rather with a prescription that takes into account taste violations. Such a choice is advantegeous for8 a [ fm ] L × T m s m s /m l conf . . ×

48 0 . .

899 7163 . . ×

56 0 . .

843 8873 . . ×

64 0 . .

500 11103 . . ×

80 0 . .

512 5590 . .

738 3644 . . ×

96 0 . .

634 3390 . .

263 264

Table 3: List of ensembles used in a crosscheck with valence overlap quarks. The columns aregauge coupling, lattice spacing at the physical point, lattice size, quark masses and number of conﬁgura-tions. β m l m ov Z V . . . . . . . . . . . . . . . . Table 4: Staggered light-quark mass, matched overlap quark mass and vector renormalization constantfor diﬀerent lattice spacings.studying ﬁnite-size eﬀects, which depend strongly on the masses of the pions. The precise deﬁnition of thetaste-average pion mass will be given in the section on ﬁnite-size eﬀects, Section 17. For the Goldstonepion this prescription gives M π = 110 MeV. Let us note here also that the topological susceptibility, withsuch a choice, is about the same as in the continuum limit at the physical point, as shown in Figure 5.The number of conﬁgurations saved is also given in Table 2. They are separated by 10 unit-length RHMCtrajectories.

In order to crosscheck our results for staggered valence quark artefacts, including the normalization ofthe vector current, we compute a light µ, win in a mixed action setup, with overlap valence quarks on gaugebackgrounds generated with the staggered action. We work at the isospin-symmetric point in thiscrosscheck.For the sea quarks we use the staggered action and generate conﬁgurations with L ≈ fm boxsizes and at ﬁve lattice spacings, given in Table 3. The parameters are chosen to match the parametersof a subset of the L ≈ fm lattices of the data set given in Table 1. Though there are signiﬁcantﬁnite-size eﬀects in a box of L ≈ fm for an observable like a light µ , these are much less severe forthe window observable a light µ, win , which is our target here. Our setup is appropriate for crosschecking thecontinuum extrapolation and also the normalization of the vector current.For the valence quarks we use the overlap fermion formulation [13]. In particular, the overlap Diracoperator D ov is constructed from the sign function of the Wilson Dirac operator D W as D ov = m W [sgn( γ D W ) + 1] , (5)where the Wilson operator has a mass of − m W . We choose m W = 1 . and use the Zolotarev approximationof the sign function. The gauge ﬁelds undergo two steps of HEX smearing [14]. The overlap mass m ov isintroduced as D ov ( m ov ) = (cid:0) − m ov (cid:1) D ov + m ov . (6)9 .400.410.420.430.440.450.460.470.480.490.50 0 T/4 T/2 3T/4 Tt + ζ (t)- ζ (t)[ ζ (t)- ζ (T-t)]/2 Figure 6: Ratio of three-point and two-point functions ζ ( t ) as function of the current insertion time t .This ratio is used to deﬁne the local vector current renormalization factor as Z V = [ ζ ( T / − ζ (3 T / − .The plot shows data from a β = 3 . ensemble, T = 56 .This version of the operator has been extensively used in previous thermodynamical studies [15, 16]. Weapply O ( a ) improvement to the overlap propagator by transforming each instance of D − ( m ov ) as D − ( m ov ) → (cid:0) − D ov /m W (cid:1) D − ( m ov ) . (7)At the same time we also compute the propagators with staggered quarks, using the noise reductiontechnique as on the large volume ensembles.We set the overlap quark mass by matching the staggered and overlap pion masses. For this purpose,we compute overlap pion masses at four values of the quark mass m ov = 0 . , . , . , . andinterpolate the pion mass squared using the form M π, ov ( m ov ) = Am B ov + Cm (8)with A , B and C ﬁt parameters. This form can capture a possible quenched chiral logarithm [17] typicalin mixed action setups. Our matching condition is to set the root-mean-square staggered pion mass equalto the overlap pion mass. Using the RMS pion is more advantageous than using the Goldstone-pion mass,since in the latter case we face drastic increase in the statistical error on our coarsest lattices. For thematched overlap quark masses we get the values given in Table 4.Our determination of a light µ, win proceeds in a similar way as in the staggered-on-staggered case, describedin Sections 10 and 11. A major diﬀerence is that we use the local vector current in the overlap case. Wethus need to compute the current renormalization constant Z V , which we get by measuring the electriccharge of the pion. For this we compute the ratio of three-point and two-point functions: ζ ( t ) = (cid:104) P ( T / V ( t ) ¯ P (0) (cid:105)(cid:104) P ( T /

2) ¯ P (0) (cid:105) , (9)where the pseudoscalar density P and the local vector current V µ are given in terms of valence overlap10ermion ﬁelds ψ and ψ as: P ( t ) = (cid:88) (cid:126)x ( ¯ ψ γ ψ )( (cid:126)x, t ) , ¯ P ( t ) = (cid:88) (cid:126)x ( ¯ ψ γ ψ )( (cid:126)x, t ) , V µ ( t ) = (cid:88) (cid:126)x ( ¯ ψ γ µ ψ )( (cid:126)x, t ) . (10)In Figure 6 we show the ratio ζ ( t ) as a function of the timeslice of the current insertion t . In the case of aconserved current, ζ ( t ) = for t < T / and ζ ( t ) = − otherwise. The renormalization factor should bedeﬁned, so that the ζ -ratio for the renormalized current Z V V equals to at some physical distance, forwhich we take T / . We deﬁne the renormalization factor as Z V = [ ζ ( T / − ζ (3 T / − , which includesa trivial symmetrization in time. The values for the diﬀerent ensembles are given in Table 4.11ange . . . . . . . . . . kaon . . . . . . . . . . ss . . . . . . . . . . Table 5: Fit ranges for extracting pseudoscalar masses on isospin symmetric ensembles.four-state ﬁt GEVP ﬁt β N

Wptl N range t a t b range . . . .

17 6 . . .

14 4 7 6 . . . . . . .

19 6 . . .

16 4 7 6 . . . . . . .

20 7 . . .

17 4 7 7 . . . . . . .

23 8 . . .

20 4 9 8 . . . . . . .

28 10 . . .

24 6 9 9 . . . . . . .

30 12 . . .

30 6 11 11 . . . Table 6: Parameters used for obtaining the Ω mass: number of Wuppertal and gauge-link smearing stepsin the Ω operator; ﬁt ranges Pseudoscalar mass measurements

The pseudoscalar propagators are computed with random wall sources and point sinks, using an operatorcorresponding to the pseudo-Goldstone taste. To extract the mass and the decay constant we performeda correlated cosh[ M ( t − T / ﬁt, using suﬃciently late time slices to allow for this simple form. Toestimate systematic errors, we selected two ﬁt windows which are given for the diﬀerent pseudoscalarsin Table 5. For the kaons we selected the even/odd slices for the ﬁrst/second ﬁt window, respectively.These ﬁt ranges were chosen by performing a Kolmogorov-Smirnov test, which ascertains whether the ﬁtqualities in all of the ﬁts on the ensembles of Table 1 follow a uniform distribution.

Omega propagators

To extract the mass of the positive-parity, ground-state Ω baryon, a number of diﬀerent operators areavailable in the staggered formalism. First, there are two operators from the pioneering work of Goltermanand Smit [18]. To label these operators we use the convention of [1]: Ω VI ( t ) = (cid:88) x k even (cid:15) abc [ S χ a S χ b S χ c − S χ a S χ b S χ c + S χ a S χ b S χ c ] ( x ) , (11) Ω XI ( t ) = (cid:88) x k even (cid:15) abc [ S χ a S χ b S χ c ]( x ) . (12)Here, χ a ( x ) is the strange-quark ﬁeld with color index a . The operator S µ performs a symmetric, gauge-covariant shift in direction µ , while S µν ≡ S µ S ν . Both Ω VI and Ω XI couple to two diﬀerent tastesof the Ω baryon, which become degenerate in the continuum limit. At ﬁnite lattice spacing however,there is a splitting between the two tastes. In principle they could be disentangled by carrying out ananalysis involving the correlators of both Ω VI and Ω XI and also their cross terms. Later, Bailey successfullyconstructed an operator which only couples to a single taste [19]. To achieve this, two additional (valence)strange quarks are introduced. In other words, the strange-quark ﬁeld gets an additional “ﬂavor” index:12 .1131.1141.1151.1161.1171.1181.1191.1201.1211.122 8 10 12 14 16 18 Ω VI M Ω e ff e c t i v e m a ss t/a

8 10 12 14 16 18 Ω XI t/a

8 10 12 14 16 18 Ω Ba t/a Figure 7: Eﬀective mass of the ground state of the Ω baryon in lattice units on our coarsest ensemblewith β = 3 . . Results with three diﬀerent staggered operators, Ω VI , Ω XI and Ω Ba are shown. Thehorizontal lines and the shaded regions represent the ﬁt values and the errors obtained with a four-stateﬁt, Equation (15), using range χ -values including the contributionof the priors are . , . and . for degrees of freedom. The dashed lines are the eﬀective massescomputed from the ﬁtted functions. χ aα with α = 1 , , . The operator is then given as Ω Ba ( t ) = [2 δ α δ β δ γ − δ α δ β δ γ − δ α δ β δ γ + ( . . . β ↔ γ . . . )] ·· (cid:88) x k even (cid:15) abc [ S χ aα S χ bβ S χ cγ − S χ aα S χ bβ S χ cγ + S χ aα S χ bβ S χ cγ ] ( x ) . (13)The mass of this state becomes degenerate with the above two taste partners in the continuum limit.We investigated the diﬀerence between these three operators on an ensemble with large statistics. At β = 3 . , corresponding to our coarsest lattice spacing, we generated about 3000 conﬁgurations inaddition to the statistics listed in Table 1. Note that only the Ω operators were measured on theseextra conﬁgurations. The eﬀective masses for the above three operators are shown in Figure 7. In theasymptotic regime we see deviations below 0.1%, which gives an estimate of the taste violation. Weexpect that these will get smaller as we go to ﬁner lattice spacings, as it does for pions. In this work wechose the Ω VI operator for our scale setting measurements. This is justiﬁed, since typical statistical andsystematic errors on our ensembles are around 0.1%, and thus cover the taste-violation eﬀects estimatedhere.As is usual with staggered fermions, these propagators have an oscillating contribution, correspondingto negative parity states. There are also excited states for both parities in the propagators. We suppressthe excited state by a number of Wuppertal smearing steps [20] applied equally to the source and sink.To avoid mixing between time slices, we deﬁne the smearing to act in the spatial directions only. Sincestaggered baryon operators are deﬁned on a coarse lattice with spacing a , our implementation of theWuppertal smearing connects only sites that reside on the same sublattice with a lattice spacing. In thisway there is no interference with the spin-taste structure of the operators. The action of the smearing13 .780 0.785 0.790 0.795 0.800 0.805 0.810 0.8150.043194 : 30.2050.040750 : 28.0070.039130 : 26.8930.043194 : 28.500 β =3.8400 m s : m s /m l aM Ω Figure 8: The mass of the Ω baryon in lattice units extracted for four ensembles with β = 3 . , whichbracket the physical point. We compare four methods. The deviation between them is used to estimatethe systematic error (see text for details).operator ˆ W on a vector v x with coeﬃcient σ is given as: [ ˆ W v ] x = (1 − σ ) v x + σ (cid:88) µ =1 , , (cid:16) U µ,x U µ,x + µ v x +2 µ + U , † µ,x − µ U , † µ,x − µ v x − µ (cid:17) , (14)where the U [ U ] parallel transporters are inserted to keep the recipe gauge invariant: no gauge ﬁxingis needed. These parallel transporters are built from a smeared version of the underlying gauge ﬁeldconﬁguration U , by applying a number of three dimensional stout smearing steps N . For this we use thesame ρ = 0 . parameter that we also have in the link smearing of the Dirac operator. Note that the U links are also needed to build the baryon operator, as they appear in the S µ shifts. The smearing operationin Equation (14) is iterated N Wptl times on both source and sink sides on a point vector with a coeﬃcientof σ = 0 . . In Table 6 we list the number of smearing steps for each lattice spacing. The smearing radiicorresponding to the number of smearing steps approximately follow the change in the lattice spacing.That way the smearing radius in physical units is kept constant. Note that we are discussing a threedimensional smearing here: this only eﬀects the overlap of the mass eigenstates with the operator andleaves the masses invariant.To enhance the signal, we calculate the Ω propagator from 256 diﬀerent source ﬁelds on each gaugeconﬁguration listed in Table 1. For each source ﬁeld we select a random time slice, which, in turn, is pop-ulated with eight independent Z random point sources at (0 , , , ( L/ , , , . . . and ( L/ , L/ , L/ .This formation of sources is usually called a grid source. We also randomize the center of the grid source. Omega mass determination: four-state ﬁts

Our model for the propagator is a four-state ﬁt function h , with two positive and two negative paritystates: h ( t, A, M ) = A h + ( M , t ) + A h − ( M , t ) + A h + ( M , t ) + A h − ( M , t ) (15)14here the h + ( M, t ) = e − Mt + ( − t − e − M ( T − t ) and h − ( M, t ) = − h + ( M, T − t ) (16)functions describe the time dependence of the positive and negative parity states, see eg. Equation (123)of [21]. Here M and A are the mass and amplitude of the ground state. Our χ function is deﬁned asa sum of the correlated χ of the model h and a prior term: χ ( A, M ) = (cid:88) i,k [ h ( t i , A, M ) − H i ] Cov − ik [ h ( t k , A, M ) − H k ] + χ ( M ) , (17)where H i is the value of the hadron propagator on the time slice t i and Cov ik stands for the covariancebetween H i and H k . A prior term was introduced to stabilize the ﬁt, containing priors on the massesexcept for the ground state. The concrete form is: χ ( M ) = (cid:88) s =1 (cid:18) M s /M − µ s δµ s (cid:19) , (18)where the prior parameters are set as follows: s µ s · MeV δµ s MeV .

102 2250

MeV .

103 2400

MeV . The prior for the negative parity ground state, s = 1 , is motivated by the recent observation from theBelle collaboration [22]. The excited states, s = 2 , , have not been discovered in experiments so far, sotheir priors follow from the quark model [23]. The existence of these undiscovered states is also motivatedby lattice thermodynamics below the chiral transition [24, 25].The range of time slices, that are included in the χ , were chosen by an optimization on the coarsestlattice, β = 3 . . As already mentioned, we have around 4000 conﬁgurations there, which is aboutfour times larger than on the ensembles at other lattice spacings. In the ﬁt range [7 . . . we obtainedthe mass with a relative precision of . and with ﬁt quality of Q = 0 . : M Ω , VI = 1 . inlattice units. The priors did not impose a signiﬁcant pull on the result, ie. the ﬁnal values of the ﬁtparameters were well within the prior widths. A diﬀerent ﬁt range [6 . . . resulted in a change in M Ω within a small fraction of the statistical error. These results reassure us that the excited state eﬀects aresmaller than the statistical error with these two ﬁt ranges. On the other ensembles, with lesser statistics,we used these two ﬁt ranges, keeping their values in physical units approximately constant upon changingthe lattice spacing. The exact ﬁt ranges used are given in Table 6 in lattice units.In Figure 8 we show a comparison of the Ω masses obtained with the various ﬁts on four ensemblesat β = 3 . . Besides the four-state ﬁt with two diﬀerent ﬁt ranges, see Table 6, we also show a valuefrom a “combined” ﬁt. Here we combine the correlators from all of the ensembles at a given β and applyto them a common four-state ﬁt. Assuming that all excited state masses are a linear function of the barestrange mass for a given β , one can ﬁt this linear dependence across the ensembles along with the stillindependent ground state masses. This reduces the number of ﬁt parameters and results in more stableﬁts. The bare light-mass dependence of the excited states can be safely ignored, even for the ground statethese eﬀects are barely signiﬁcant. The result of this combined ﬁt agrees well with the ones obtained fromthe individual ﬁt ranges. We do not use this combined ﬁt in our ﬁnal analyses though, it would introducecorrelations between ensembles, making the analysis procedure more complicated.15 mega mass determination: GEVP method In addition to the above four-state ﬁt to the Ω propagator we also used a mass extraction procedureproposed in [26], which is based on the Generalized Eigenvalue Problem (GEVP). The method has theadvantage of not using priors. We ﬁrst apply a folding transformation to the original hadron propagator H t : H t → (cid:26) [ H t + ( − t +1 H T − t ] 0 < t < T H t t = 0 or t = T (19)Then we construct a matrix for each time slice t : H ( t ) =  H t +0 H t +1 H t +2 H t +3 H t +1 H t +2 H t +3 H t +4 H t +2 H t +3 H t +4 H t +5 H t +3 H t +4 H t +5 H t +6  (20)For a given t a and t b let λ ( t a , t b ) be an eigenvalue and v ( t a , t b ) an eigenvector solution to this × generalized eigenvalue problem: H ( t a ) v ( t a , t b ) = λ ( t a , t b ) H ( t b ) v ( t a , t b ) . (21)Here we select the smallest eigenvalue λ and use the corresponding eigenvector v to project out the groundstate: v + ( t a , t b ) H ( t ) v ( t a , t b ) , (22)which then can be ﬁtted to a simple exp( − M t ) type function. This assumes that backward propagatingstates are negligible between t a and t b as well as in the range used to ﬁt Equation (22). The parity partnerstates inherent in the staggered formulation appear as excited states, that give large contributions to thecorrelation functions (22) constructed with an eigenvalue λ with non-minimal absolute value. In thatcase, both the correlation function (22) and the eigenvalue λ exhibit oscillating signs. In the case of thecorrelation function, this oscillation occurs as a function of t , and in the case of λ as a function of t b − t a .See [27] for details of the variational method with staggered fermions.The tuneable parameters of the procedure are t a and t b for specifying the GEVP, as well as the ﬁtrange for the exp ﬁtting in the last step; they are given in the last three columns of Table 6. Similar tothe pion mass analysis these parameters were chosen by performing a Kolmogorov-Smirnov test across allthe ensembles. In Figure 8 we show the ﬁt results obtained with this GEVP procedure for ensembles at β = 3 . . They are in good agreement with the four-state ﬁt values.The mass extracted using the GEVP gives a third M Ω value for each ensemble, beside the results withthe four-state ﬁt procedure with the two ﬁt ranges. We will use the deviation between these three valuesas a systematic error in the Ω mass determination. Finite size corrections

In order to determine the ﬁnite-volume corrections for the pseudoscalar masses, M π ( L ) − M π ( ∞ ) , anddecay constants, f π ( L ) − f π ( ∞ ) , we use the chiral perturbation theory based formulae of Reference [28].Our pion masses are very close to the physical point, where one obtains a relative correction of . % forthe mass and a relative correction of . % for the decay constant. The mass of the kaon also receivesa correction due to the ﬁnite volume. However, this correction is so small, and any uncertainty related toit is so subdominant, that we ignore it.We also take into account the eﬀect of the ﬁnite time extent T in the decay constants, both for pionsand kaons, assuming that they are free particles. This is obtained by noting that the T -dependence of the16ree particle propagator is given by cosh[ M ( t − T / / sinh( M T / . Therefore, we ﬁt our propagators tothe form A cosh[ M ( t − T / / sinh( M T / and extract the decay constant from the amplitude A .The ﬁnite-size eﬀects on the Ω mass is estimated from next-to-leading order, three-ﬂavor, heavy-baryon chiral perturbation theory. See [29] for the corresponding formulas. To this order the pions giveno contribution to the ﬁnite-size eﬀects, but only the kaons and the eta do. As a result, the ﬁnite-sizecorrection is so tiny that it can be safely neglected. Our staggered path integral includes four ﬂavors of quarks, f = { u, d, s, c } , gluon ﬁelds U and photonﬁelds A and is given by: Z = (cid:90) [ dU ] exp( − S g [ U ]) (cid:90) [ dA ] exp( − S γ [ A ]) (cid:89) f det M / [ V U exp( ieq f A ) , m f ] . (23)The ensemble speciﬁc deﬁnition of the gauge action S g is given in Sections 1 and 2. The photon integralmeasure [ dA ] and action S γ are deﬁned in the QED L scheme [30]. The one-hop staggered matrix in abackground ﬁeld W µ can be written as M [ W, m ] = D [ W ] + m = (cid:88) µ D µ [ W µ ] + m, (24)where D µ is the covariant diﬀerentiation in the µ direction involving W and its adjoint W † together withthe obligatory staggered phases. In the path integral the fermions are coupled to a gauge ﬁeld that is aproduct of the exponentiated photon ﬁeld and of the smeared gluon gauge ﬁeld V U . Our smearing recipesare given in Sections 1 and 2. The photon ﬁeld is not smeared. q f ∈ { + , − , − , + } stand for thequark electric charges in units of the positron charge e , m f for the quark masses and α = e / (4 π ) . We usethe notation δm ≡ m d − m u for the diﬀerence in the up and down quark masses and m l ≡ ( m u + m d ) for their average. To simplify later formulas we also introduce the notations M f ≡ M [ V U e ieq f A , m f ] and dets[ U, A ; { m f } , { q f } , e ] ≡ (cid:89) f det M / f , (25)where the latter is the product of all fermion determinants.In this work, isospin-breaking is implemented by taking derivatives with respect to the isospin-breakingparameters and by measuring the so obtained derivative operators on isospin-symmetric conﬁgurations[31]. A diﬀerent approach would be to generate conﬁgurations at non-zero values of the isospin breakingparameters and use the same operators as at zero isospin breaking, see eg. [32]. We choose the formerapproach in this work, so as to optimally distribute the computing resources among the various isospin-breaking contributions.We introduce a set of notations for isospin-symmetric observables and their isospin-breaking derivatives.Consider the observable X ( e, δm ) , which is a function of e and δm . Then we deﬁne X ≡ X (0 , , X (cid:48) m ≡ m l ∂X∂δm (0 , , X (cid:48) ≡ ∂X∂e (0 , , X (cid:48)(cid:48) ≡ ∂ X∂e (0 , . (26)The isospin-breaking derivatives are denoted by prime(s) and an index. The mass derivative has theindex m , it requires no renormalization, since δm and m l have the same renormalization factor at zeroelectromagnetic coupling. The electric charge derivatives have a single digit index: or . Below, wealso deﬁne electric charge derivatives with two-digit indices. We take into account only leading-orderisospin-breaking in this work, so no higher derivatives are needed.In the case of the fermion determinant, the isospin-symmetric value is denoted by dets . The strong-17sospin-breaking of dets is zero at leading order: dets (cid:48) m = 0 , (27)since dets is symmetric under the exchange u ↔ d . The electromagnetic derivatives are dets (cid:48) dets = (cid:88) f q f (cid:0) M − f D [ iAV U ] (cid:1) , dets (cid:48)(cid:48) dets = 12 (cid:34)(cid:18) dets (cid:48) dets (cid:19) − (cid:88) f q f (cid:0) M − f D [ A V U ] (cid:1) − (cid:88) f q f (cid:0) M − f D [ iAV U ] M − f D [ iAV U ] (cid:1)(cid:35) , (28)where Tr is trace over color and spacetime indices and the argument of the D operator is a × complexmatrix valued ﬁeld, eg. A V U has components A µ,x [ V U ] µ,x . The implementation of these derivatives isgiven in Section 7.We also make a distinction between the electric charge in the fermion determinant and in the operatorthat we measure. We call the former sea electric charge and denote it by e s , the latter is the valenceelectric charge and is denoted by e v . For an observable X that depends on both the valence and seacharges, X ( e v , e s ) , the second order electric charge derivatives are deﬁned as follows: X (cid:48)(cid:48) ≡ ∂ X∂e v (0 , , X (cid:48)(cid:48) ≡ ∂ X∂e v ∂e s (0 , , X (cid:48)(cid:48) ≡ ∂ X∂e s (0 , . (29)For functions that depend on either e v or e s , but not on both, we use the single digit notations of Equation(26).The expectation value of an operator O is calculated by inserting O [ U, A ] into the integrand of the pathintegral of Equation (23) and normalizing the integral by Z . Here we consider operators whose photonﬁeld dependence arises entirely from the photon-quark interaction, ie. O = O [ U, e v A ] . The expectationvalue of this operator depends on δm , e v and e s , and the isospin expansion can be written as: (cid:104) O (cid:105) = [ (cid:104) O (cid:105) ] + e v (cid:104) O (cid:105) (cid:48)(cid:48) + e v e s (cid:104) O (cid:105) (cid:48)(cid:48) + e s (cid:104) O (cid:105) (cid:48)(cid:48) + δmm l (cid:104) O (cid:105) (cid:48) m . (30)Here, the individual terms can be expressed as expectation values obtained with the isospin-symmetricpath integral, which we denote by (cid:104) . . . (cid:105) . The concrete expressions are:isospin-symmetric: [ (cid:104) O (cid:105) ] = (cid:104) O (cid:105) qed valence-valence: (cid:104) O (cid:105) (cid:48)(cid:48) = (cid:104) O (cid:48)(cid:48) (cid:105) qed sea-valence: (cid:104) O (cid:105) (cid:48)(cid:48) = (cid:28) O (cid:48) dets (cid:48) dets (cid:29) qed sea-sea: (cid:104) O (cid:105) (cid:48)(cid:48) = (cid:28) O dets (cid:48)(cid:48) dets (cid:29) − (cid:104) O (cid:105) (cid:28) dets (cid:48)(cid:48) dets (cid:29) strong-isospin-breaking: (cid:104) O (cid:105) (cid:48) m = (cid:104) O (cid:48) m (cid:105) (31)In the derivation of these expressions we use (cid:68) dets (cid:48) dets (cid:69) = 0 . In Table 7 we give an overview of theisospin-breaking derivatives for the observables that are computed in this paper.Note that Equation (30) is an expansion in bare parameters and not what we consider a decompo-sition into isospin-symmetric and isospin breaking parts. The latter involves derivatives with respect torenormalized observables and our prescription for that is given in Section 6. There is no need to introducea renormalized electromagnetic coupling though: its running is an O ( e ) eﬀect, ie. beyond the leadingorder isospin approximation that we consider here. 18 X (cid:48)(cid:48) X (cid:48)(cid:48) X (cid:48)(cid:48) X (cid:48) m Section M Ω , M π χ , M K χ (cid:88) (cid:88) (cid:88) - 9 ∆ M K , ∆ M (cid:88) (cid:88) - (cid:88) w - - (cid:88) - 8 (cid:104) J J (cid:105) -light (cid:88) (cid:88) (cid:88) (cid:88) (cid:104) J J (cid:105) -strange (cid:88) (cid:88) (cid:88) - 14 (cid:104)

J J (cid:105) -disc. (cid:88) (cid:88) (cid:88) (cid:88)

For various purposes it is useful to decompose the observables into isospin-symmetric and isospin-breakingparts. This requires a matching of the isospin symmetric and full theories, in which we specify a set ofobservables that must be equal in both theories. Of course, diﬀerent sets will lead to diﬀerent decompo-sitions, which is commonly referred to as scheme dependence. Only the sum of the components, ie. theresult in the full theory, is scheme independent.A possible choice for the observables are the Wilson-ﬂow–based w scale and the masses of mesonsbuilt from an up/down/strange and an anti-up/down/strange quark, M uu / M dd / M ss . These mesons aredeﬁned by taking into account only the quark-connected contributions in their two-point functions [33].Their masses are practical substitutes for the quark masses. Also, they are neutral and have no magneticmoment, so they are a reasonable choice for an isospin decomposition. These masses cannot be measuredin experiments, but have a well deﬁned continuum limit and thus a physical value can be associated tothem.According to partially-quenched chiral perturbation theory coupled to photons [34], the combination M π χ ≡ ( M uu + M dd ) (32)equals the neutral pion mass, M π χ = M π , up to terms that are beyond leading order in isospin breaking.Since such terms are beyond the accuracy needed in this work, we use the experimental value of theneutral pion mass as the physical value of M π χ . Furthermore the diﬀerence, ∆ M ≡ M dd − M uu (33)is a measure of strong-isospin-breaking not aﬀected by electromagnetism. According to [34], ∆ M =2 B δm is valid up to eﬀects that are beyond leading order in isospin breaking, at least around the physicalpoint. Here, B is the two-ﬂavor chiral condensate parameter. For the determination of the physicalvalues of w , M ss and ∆ M , see Section 21.For the decomposition we start with the QCD+QED theory and parameterize our observable (cid:104) O (cid:105) withthe quantities deﬁned above: (cid:104) O (cid:105) ( M π χ w , M ss w , Lw , ∆ M w , e ) . (34)Here, the continuum limit is assumed. We can isolate the electromagnetic part by switching oﬀ theelectromagnetic coupling, while keeping the other parameters ﬁxed: (cid:104) O (cid:105) qed ≡ e · ∂ (cid:104) O (cid:105) ∂e (cid:12)(cid:12)(cid:12)(cid:12) M πχ w ,M ss w , Lw , ∆ Mw ,e =0 . (35)19 a [ fm ] L × T m s m s /m l conf . . ×

48 0 . .

899 71648 ×

64 0 . .

899 3003 . . ×

56 0 . .

843 8873 . . ×

64 0 . .

500 11100 . .

205 10720 . .

007 10360 . .

893 1035

Table 8: List of ensembles used for computing dynamical QED eﬀects with gauge coupling, latticespacing at the physical point, lattice size, quark masses and number of conﬁgurations.The strong-isospin-breaking part is given by the response to the ∆ M parameter: (cid:104) O (cid:105) sib ≡ (∆ M w ) · ∂ (cid:104) O (cid:105) ∂ (∆ M w ) (cid:12)(cid:12)(cid:12)(cid:12) M πχ w ,M ss w , Lw , ∆ Mw =0 ,e =0 , (36)and the isospin-symmetric part is just the remainder: (cid:104) O (cid:105) iso ≡ (cid:104) O (cid:105) ( M π χ w , M ss w , Lw , , . (37)One can also deﬁne the decomposition at a ﬁnite lattice spacing, for which w in lattice units can beadditionally ﬁxed. In doing so the isospin symmetric part (cid:104) O (cid:105) iso has to be distinguished from the value ofthe observable at the bare isospin-symmetric point [ (cid:104) O (cid:105) ] .In this work we use the above deﬁnitions for the isospin decomposition; a similar scheme was putforward in [35]. A diﬀerent scheme would be to keep the renormalized quark masses and the strongcoupling constant ﬁxed as QED is turned on [36]. In case of light quark observables the two schemessupposed to agree well. This can be justiﬁed by the smallness of the electromagnetic part of the neutralpion in the scheme of [36]. Reference [35] found (cid:15) π = 0 . , where (cid:15) π is the parameter that measuresthe size of the electromagnetic contribution in the neutral pion mass. In comparison the same quantityfor the charged pion was found (cid:15) π + = 1 . in [35]. In the isospin expansion of an observable (cid:104) O (cid:105) (see Equation (30)) we refer to the e s dependent terms asdynamical QED contributions.The sea-valence contribution is given by Equation (31) as (cid:104) O (cid:105) (cid:48)(cid:48) = (cid:28)(cid:28) O (cid:48) dets (cid:48) dets (cid:29) A (cid:29) U . (38)Here we made explicit that the path integral is carried out over two gauge ﬁelds: the index A of theexpectation value means averaging over free photon ﬁelds with the action S γ . The rest of the path integralweight is contained in the gluon expectation value, labeled with index U . The trace over coordinate andcolor space in the ﬁrst derivative of the fermion determinant (see Equation (28)) is computed exactlyin the low-lying eigenmode space of the Dirac operator and with random vectors in the complement.According to our ﬁndings, the noise in this term overwhelmingly stems from the random sources, andnot from the gauge ﬁelds. For each U ﬁeld we generate one A ﬁeld and on this ( U, A ) gauge ﬁeld pairwe use about random vectors to estimate the ﬁrst derivative dets (cid:48) / dets . The ﬁrst derivative of theobservable is estimated by a ﬁnite diﬀerence O (cid:48) ≈ e v ( O + − O − ) , see eg. Section 9.20he sea-sea contribution is given by (cid:104) O (cid:105) (cid:48)(cid:48) = (cid:28) [ O − (cid:104) O (cid:105) U ] (cid:28) dets (cid:48)(cid:48) dets (cid:29) A (cid:29) U , (39)where the A -average of the second derivative of the determinant can be done independently from theobservable. This is especially useful, since the noise in this term is dominated by ﬂuctuations in the photonﬁeld. On each U conﬁguration we use about photon ﬁelds, and on each photon ﬁeld, 12 randomsources to estimate the second derivative dets (cid:48)(cid:48) / dets .For both contributions we apply the Truncated Solver Method [37, 38]: the matrix inverters are runwith a reduced precision most of the time, and the resulting small bias is corrected using occasional,high-precision inversions.In this work we compute dynamical QED eﬀects on a dedicated set of ensembles using the action. We have three lattice spacings, with box sizes around L = 3 fm with T = 2 L . Additionally, onthe coarsest lattice there is also an ensemble with an L = 6 fm box. Table 8 gives the parameters ofthese ensembles, together with the number of conﬁgurations. The chosen parameters in these dedicatedruns match the parameters of selected ensembles. For the observables that we consider, we seeno signiﬁcant diﬀerence in the size of dynamical QED contributions between the two diﬀerent volumes.For the volume dependence of the hadron masses, see Section 9 and Figure 9. The L = 3 fm volumesimulations need about an order of magnitude less computer time for the same precision. Therefore, onthe ﬁner lattices we performed simulations in the smaller volume only. w -scale In this section we derive a formula that gives the electromagnetic correction of the w -scale [6]. Thestarting point is the operator W τ [ U ] , which is the logarithmic derivative of the gauge-action density alongthe gradient ﬂow [7]: W τ [ U ] ≡ d ( τ E [ U, τ ]) d log τ , (40)where τ is the gradient ﬂow time and E is a suitable discretization of the gluonic gauge action density.The expectation value of this operator deﬁnes the w -scale via (cid:68) W τ = w ( e ) (cid:69) = 0 . . (41)Since W τ [ U ] is a pure-gauge observable, it neither depends on the valence charge nor on fermion masses:the only isospin-breaking dependence in Equation (41) comes from the electric sea charge. The derivativeswith respect to δm and the valence electric charge are zero. The expansion of the expectation value isgiven by Equations (30) and (31): (cid:104) W τ (cid:105) = (cid:104) W τ (cid:105) + e s (cid:28) ( W τ − (cid:104) W τ (cid:105) ) dets (cid:48)(cid:48) dets (cid:29) (42)We also have to expand the w -scale: w ( e s ) = w + e s δw (43)and the W operator: W τ = w ( e s ) = W τ = w + e s · w δw · dWdτ (cid:12)(cid:12)(cid:12)(cid:12) τ = w (44)21 .009.259.509.7510.0010.25 e M ’’ [ M e V ] fv-corr qed val-valqed val-val-10-50510 e M ’’ [ M e V ] qed val-sea (x1000)-2-1012 3 fm 6 fm e M ’’ [ M e V ] qed sea-sea (x10) Figure 9: Volume dependence of various electromagnetic contributions to the π + mass. For the valence-valence contribution M (cid:48)(cid:48) we apply an inﬁnite-volume correction given by Equation (53). The valence-sea M (cid:48)(cid:48) and sea-sea M (cid:48)(cid:48) contributions are multiplied by and on the plot, respectively. The resultsare obtained with the action at β = 3 . .isospin component meson M . . . . . . . . . . . . . . . . . . . . M (cid:48)(cid:48) . . . . . . . . . . . . . . . . . . . . M (cid:48) m . . . . . . . . . . − − M (cid:48)(cid:48) , M (cid:48)(cid:48) . . . . . . . . . . . . . . . . . . . . Table 9: Plateau ﬁt ranges for diﬀerent isospin-breaking components, for mesons and for the Ω baryon.Here, w denotes the value of the w -scale at the isospin-symmetric point, which of course satisﬁes (cid:68) W τ = w (cid:69) = 0 . (45)From these we obtain the following formula for the electromagnetic correction: δw = − (cid:34) √ τ (cid:28) dW τ dτ (cid:29) − (cid:28) ( W τ − (cid:104) W τ (cid:105) ) dets (cid:48)(cid:48) dets (cid:29) (cid:35) τ = w (46)Section 7 gives details on the fermion-determinant derivative computations. On our coarsest latticespacing we computed the electromagnetic correction in two diﬀerent volumes, L = 3 fm and L = 6 fm(see Table 8 for the ensemble parameters). We obtained for this correction δw = − . on the smalland δw = − . on the large lattice, which are in perfect agreement. Even the isospin-symmetricvalues show no signiﬁcant ﬁnite-size eﬀect: we have w = 1 . on the small and w = 1 . onthe large lattice. In our analyses, we use the isospin symmetric w measured on the large lattices listed inTable 1; whereas for δw we use the small volume ensembles listed in Table 8.22 Isospin breaking: hadron masses

In this section we describe the procedure to obtain the isospin-breaking derivatives of a hadron mass M .On certain ensembles we measure the hadron propagator H at four diﬀerent values of isospin breaking: H , H + , H − , H δm . (47)The ﬁrst is measured at the isospin-symmetric point, the second/third with valence electric charge e v = ±√ πα ∗ and zero quark mass diﬀerence δm = 0 , and the fourth with e v = 0 and δm = 2 m l − r r (cid:12)(cid:12) r =0 . .These allow to calculate ﬁnite diﬀerences with respect to e v and δm , whereas the e s derivatives can becalculated exactly using the formulae in Equation (31). In these measurements both gluon and photonﬁelds are ﬁxed to Coulomb gauge, for the former the gauge ﬁxing procedure is applied after smearing.We use the notation M [ (cid:104) H (cid:105) ] for the mass that is extracted from the hadron-propagator expectationvalue (cid:104) H (cid:105) . At the isospin-symmetric point we have M = M [ (cid:104) H (cid:105) ] . (48)The QED, sea-sea, isospin-breaking component of the propagator, (cid:104) H (cid:105) (cid:48)(cid:48) , is given by Equation (31). Thenthe derivative of the mass can be obtained by application of the chain rule: M (cid:48)(cid:48) = δ M [ H ] δH (cid:12)(cid:12)(cid:12)(cid:12) (cid:104) H (cid:105) (cid:104) H (cid:105) (cid:48)(cid:48) = δ M [ H ] δH (cid:12)(cid:12)(cid:12)(cid:12) (cid:104) H (cid:105) (cid:28) ( H − (cid:104) H (cid:105) ) dets (cid:48)(cid:48) dets (cid:29) , (49)where we use ∂ (cid:104) H (cid:105) /∂e s = 0 at the isospin-symmetric point. In the case of the valence-valence QEDcomponent we can build the derivative as a ﬁnite diﬀerence: M (cid:48)(cid:48) ≈ e v ( M [ (cid:104) H + (cid:105) ] + M [ (cid:104) H − (cid:105) ] − M [ (cid:104) H (cid:105) ]) = 1 e v (cid:0) M [ (cid:104) H + + H − (cid:105) ] − M [ (cid:104) H (cid:105) ] (cid:1) . (50)Here we used (cid:104) H + (cid:105) = (cid:104) H − (cid:105) = (cid:104) H + + H − (cid:105) . Working with the average propagator has the advantagethat the O ( e v ) noise is absent in its expectation value [39]. Equation (50) gives an approximation that isvalid up to, and including, O ( e v ) terms. We have conﬁrmed that these terms are on the order of a fewpercent relative to M (cid:48)(cid:48) with our choice of e v . There is an analogous formula for the strong-isospin-breakingcomponent: M (cid:48) m ≈ m l δm ( M [ (cid:104) H δm (cid:105) ] − M [ (cid:104) H (cid:105) ]) . (51)The QED valence-sea derivative is available in a mixed form: the derivation with respect to e s is exact,but with respect to e v , it is a ﬁnite diﬀerence. Applying these to M [ (cid:104) H (cid:105) ] we get: M (cid:48)(cid:48) ≈ e v (cid:20) δ M [ H ] δH ∂ (cid:104) H (cid:105) ∂e s (cid:12)(cid:12)(cid:12)(cid:12) e s =0 − ( e v → − e v ) (cid:21) = δ M [ H ] δH (cid:12)(cid:12)(cid:12)(cid:12) (cid:104) H + + H − (cid:105) (cid:28) H + − H − e v dets (cid:48) dets (cid:29) . (52)We now specify M [ H ] , i.e. the way to extract the mass from the hadron propagator. Since we areinterested in the small isospin breaking eﬀects here, the choice of the mass extraction procedure is notcrucial. We utilize a procedure based on eﬀective masses instead of ﬁtting the propagator to multipleexponentials. This has the advantage over the standard ﬁtting procedure that the derivatives δ M δH canbe computed easily, in particular they can be given in closed analytic form. Using two/four propagatorpoints, an eﬀective-mass value and its diﬀerential can be given in analytical form for mesons/baryons [40].Then we ﬁt a constant to the plateau of the eﬀective mass. We choose two plateau ranges, so that asystematic error can be associated with ﬁnding the plateau. The ranges are given in Table 9.In this work the strong-isospin-breaking M (cid:48) m and valence-valence contributions M (cid:48)(cid:48) are evaluated on conﬁgurations with box sizes around L = 6 fm. For the sea contributions, M (cid:48)(cid:48) and M (cid:48)(cid:48) , we use23 −→ C light C strange C charm X C conn ( m l , C conn ( m s , C conn ( m c , X (cid:48) (cid:2) ∂∂e C conn (cid:3) ( m l , − (cid:2) ∂∂e C conn (cid:3) ( m s , (cid:2) ∂∂e C conn (cid:3) ( m c , X (cid:48) (cid:104) ∂ ∂e C conn (cid:105) ( m l , (cid:104) ∂ ∂e C conn (cid:105) ( m s , (cid:104) ∂ ∂e C conn (cid:105) ( m c , X (cid:48) m − m l (cid:104) ∂∂m l C conn (cid:105) ( m l , − − Table 10: Isospin symmetric value X and isospin-breaking derivatives X (cid:48) , X (cid:48) , X (cid:48) m of various observables X , namely the light, strange and charm connected contractions of the current propagator, in terms of theconnected vector meson contraction and its derivatives. See Equation (59) and (61) for the deﬁnitions.also the action, but on smaller volumes, L = 3 fm (see Section 7 for details of these ensembles).In the presence of the electromagnetic interaction, hadron masses have a ﬁnite-size eﬀect that ispower-like in the size of the box. In some cases it can be much larger than the exponentially-suppressed,ﬁnite-size eﬀect related to the strong interaction. For the QED L photon action, the eﬀect in the ﬁrst twoorders depends on the hadron only through its electric charge Q and mass M and is known analytically[32, 41]: M ( L ) − M = − ( Qe ) c π (cid:20) L + 2 M L + O ( L − ) (cid:21) with c = 2 . . . . . (53)The ﬁrst two orders of this formula can be used to correct for electromagnetic ﬁnite-size eﬀects. Remaining O ( L − ) eﬀects are beyond the precision of this work and are neglected. Since, for charged hadrons, seaeﬀects are typically much smaller than the valence-valence contribution, we use the universal ﬁnite-sizeformula (53) to correct the valence-valence component and apply no correction to the rest. We cancorroborate this choice by looking at the diﬀerent isospin breaking components of the charged pion masson two diﬀerent volumes on our coarsest lattice, as shown in Figure 9. The corrected M (cid:48)(cid:48) values almostagree on the two volumes. On ﬁner lattices we use the results of the L = 6 fm runs, correcting themwith Equation (53). In the case of the sea contributions the uncorrected M (cid:48)(cid:48) and M (cid:48)(cid:48) data are consistenton the two volumes. On ﬁner lattices we use the small-volume runs to estimate the electromagnetic seaeﬀects without correcting for ﬁnite-volume eﬀects.

10 Current propagator (cid:104)

J J (cid:105)

In this section we consider in detail the deﬁnition and decomposition of the current propagator: (cid:104) J µ,x J ¯ µ, ¯ x (cid:105) , (54)where eJ µ is the quark electromagnetic current. In the continuum limit this propagator can be obtained bycoupling the quarks to an external photon ﬁeld A ext µ and building the second diﬀerential with respect to thisﬁeld. In our lattice regularization, we use this prescription to deﬁne the current propagator. Speciﬁcallythe partition function in the presence of an external photon ﬁeld is given by: Z [ A ext ] ≡ (cid:90) . . . dets[ U, A + A ext ; { q f } , { m f } , e ] . (55)24he current propagator is then deﬁned as the following second diﬀerential: (cid:104) J µ,x J ¯ µ, ¯ x (cid:105) ≡ e δ log ZδA ext µ,x δA ext¯ µ, ¯ x (cid:12)(cid:12)(cid:12)(cid:12) A ext =0 . (56)The so deﬁned propagator satisﬁes current conservation on both the source and sink sides. To computeit, we need the ﬁrst and second derivatives of the fermion matrix at zero external ﬁeld: δδA ext µ,x (cid:12)(cid:12)(cid:12)(cid:12) M f = eq f · D µ [ iP x V U e ieq f A ] ,δ δA ext µ,x δA ext¯ µ, ¯ x (cid:12)(cid:12)(cid:12)(cid:12) M f = − e q f · δ µ ¯ µ · D µ [ P x P ¯ x V U e ieq f A ] . (57)From these we get the current propagator as follows: (cid:104) J µ,x J ¯ µ, ¯ x (cid:105) = (cid:42)(cid:88) f q f C conn µ,x, ¯ µ, ¯ x ( m f , eq f ) + C disc µ,x, ¯ µ, ¯ x − (cid:88) f q f (cid:0) M − f D µ [ P x P ¯ x V U e ieq f A ] δ µ ¯ µ (cid:1)(cid:43) (58)where the connected vector meson contraction is deﬁned as C conn µ,x, ¯ µ, ¯ x ( m, e ) ≡ −

14 Tr (cid:0) M − D µ [ iP x V U e ieA ] M − D ¯ µ [ iP ¯ x V U e ieA ] (cid:1) (59)and the disconnected contraction as C disc µ,x, ¯ µ, ¯ x ≡ (cid:88) f, ¯ f q f q ¯ f I µ,x ( m f , eq f ) I ¯ µ, ¯ x ( m ¯ f , eq ¯ f ) with I µ,x ( m, e ) ≡

14 Tr (cid:0) M − D µ [ iP x V U e ieA ] (cid:1) . (60)In these formulas, Tr is the trace over color and spacetime indices and the P x projection operator clearsthe components of a vector on all sites except for x . Here, the fermion matrix M is understood withmass m and on a gauge background V U e ieA . The M f notation, deﬁned in Equation (28), stands forthe fermion matrix with m f mass and eq f charge. Due to gauge invariance, (cid:104) I µ,x (cid:105) = 0 . Equation (58)is our master formula for the current propagator. In the following we decompose it into several pieces.There are three terms. First is the connected contribution, second is the disconnected contribution andthe third is a contact term. This last one gives no contribution to the observables that we are interestedin and it will be omitted from now on. To obtain the expansion in Equation (30), we have to calculatethe Wick-contractions C conn and C disc at the isospin-symmetric point and also their isospin-breakingderivatives.It is common to split the connected part, (cid:80) f q f C conn ( m f , eq f ) , into the contributions of individualﬂavors: C light ≡ C conn ( m u , e ) + C conn ( m d , − e ) ,C strange ≡ C conn ( m s , − e ) ,C charm ≡ C conn ( m c , e ) , (61)where we suppressed Lorentz indices and coordinates for simplicity. The isospin-limit components of these,as deﬁned in Section 5, in terms of C conn and their derivatives are given in Table 10. For the disconnectedcontribution we give here the formulas for the isospin-symmetric point and for the strong-isospin-breakingterm: C disc0 = [ I µ,x ( m l , − I µ,x ( m s ,

0) + 2 I µ,x ( m c , I µ,x ... → I ¯ µ, ¯ x ... ] , [ C disc ] (cid:48) m = − m l ∂∂m l C disc0 (62)25he detailed implementation of these quantities will be given in Sections 12 and 14. From these we thenconstruct the total expectation value as shown in Equations (30) and (31).It is also common to split the propagator at the isospin-symmetric point into isospin singlet and tripletparts: [ (cid:104) J J (cid:105) ] = (cid:104) J J (cid:105) I =0 + (cid:104) J J (cid:105) I =1 . These are given by (cid:104) J J (cid:105) I =1 ≡ (cid:2) (cid:104) C light (cid:105) (cid:3) = (cid:10) C conn ( m l , (cid:11) , (cid:104) J J (cid:105) I =0 ≡ (cid:2) (cid:104) C light + C strange + C charm + C disc (cid:105) (cid:3) == (cid:10) C conn ( m l ,

0) + C conn ( m s ,

0) + C conn ( m c ,

0) + C disc0 (cid:11) . (63)Finally, we introduce the notation G ( t ) for the zero-momentum timelike current propagator withaveraged Lorentz indices: G ( t ) ≡ (cid:88) (cid:126)x,µ =1 , , (cid:104) J µ,t,(cid:126)x J µ, + J µ,T − t,(cid:126)x J µ, (cid:105) . (64)This is the lattice version of the propagator given in Equation (1) of the main text. G ( t ) can also bedecomposed into connected terms of diﬀerent ﬂavors and a disconnected part. Note that the imaginaryparts of these quantities are zero due to the gauge averaging.

11 Anomalous magnetic moment a µ In this section we provide the deﬁnition for the central observable of the paper: the leading-order hadronicvacuum polarization (LO-HVP) contribution to the anomalous magnetic moment of the muon, a LO − HVP µ .Furthermore, we detail the decomposition of a LO − HVP µ . Since we consider only the LO-HVP contribution,we drop the superscript and multiply the result by , ie. a µ stands for a LO − HVP µ × throughout thiswork.The renormalized scalar hadronic vacuum polarization function (HVP) can be extracted from the zeromomemtum current propagator G ( t ) as [42]: Π( Q ) ≡ T/ (cid:88) t =0 (cid:20) t − aQ ) sin (cid:18) aQt (cid:19)(cid:21) G ( t ) (65)where t and G ( t ) are given in lattice units here and a is the lattice spacing. This formula corresponds toa Fourier transformation followed by a division by Q , including an explicit removal of: 1. a pure ﬁnite-volume eﬀect and 2. the ultraviolet divergence. Renormalization is performed on shell such that Π(0) = 0 .While, in a ﬁnite volume, Π( Q ) is only formally deﬁned at momenta with components that are integermutilples of either π/L or π/T , we use (65) to analytically continue Π( Q ) to any real values of Q .It is worth noting that this approach is related to the time-moment approach of [43]. Time moments canalso be used as input in various approximants that were put forward in [44, 45]. These are based on theapplication of Mellin–Barnes techniques. They converge rapidly with the number of moments retained[45] and also allow for a systematic matching to perturbation theory at short distances.In this work we compute all hadronic O ( e ) eﬀects in the vacuum polarization, including ones thatare reducible under cutting a photon line. Like G ( t ) , Π( Q ) can also be decomposed into the connectedcontributions of various quark ﬂavors and a disconnected contribution (Section 10).The LO-HVP contribution to the anomalous magnetic moment of the muon is computed from theone-photon irreducible part of Π( Q ) , denoted by Π γ I ( Q ) , using the following integral [46]: a µ = 10 α (cid:90) ∞ dQ m µ ω (cid:18) Q m µ (cid:19) Π γI ( Q ) (66)26here ω ( r ) is given after Equation (3) of the main paper, α is the ﬁne structure constant renormalized inthe Thomson-limit, and m µ is the mass of the muon. The diﬀerence between having Π( Q ) and Π γI ( Q ) in the integral in (66) is the one-photon-reducible ( γR ) contribution denoted by a γ R µ . It is an O ( e ) eﬀect that is included in the higher-order hadronic vacuum polarization (HO-HVP) contribution. Thiseﬀect is tiny compared to a µ and has already been computed on the lattice, as discussed in our ﬁnal resultsection, Section 23.We partition the momentum integral in Equation (66) by cutting it into two contributions at a mo-mentum Q max . Below Q max we use the lattice and above that perturbation theory. In the two parts thevacuum polarization is renormalized diﬀerently: it is renormalized to zero at Q = 0 on the lattice andat Q = Q max in perturbation theory. This requires the introduction of an extra term, that accounts forthis diﬀerence. The lattice part is then splitup into the contributions of diﬀerent ﬂavors. In detail, ourpartitioning takes the following form: a µ = a light µ + a strange µ + a charm µ + a disc µ + a pert µ − a γ R µ . (67)Here, the connected light contribution is given as a light µ = 10 α (cid:34)(cid:90) Q dQ m µ ω (cid:18) Q m µ (cid:19) Π light ( Q ) + Π light ( Q ) (cid:90) ∞ Q dQ m µ ω (cid:18) Q m µ (cid:19)(cid:35) , (68)and similarly for the other ﬂavors and the disconnected contributions. The second term accounts for thediﬀerence in the lattice and perturbative renormalization points, as explained above. Using Equation (65)we can express all of these contributions as a weighted sum of the corresponding current propagator. Forthe connected light term we have, for instance: a light µ = 10 α T/ (cid:88) t =0 K ( t ; aQ max , am µ ) G light ( t ) , (69)where the kernel K ( t ; aQ max , am µ ) is given by Equations (65) and (68). It depends on the gauge ensembleonly through the lattice spacing. The perturbative contribution is given by a pert µ = 10 α (cid:90) ∞ Q dQ m µ ω (cid:18) Q m µ (cid:19) (cid:2) Π pert ( Q ) − Π pert ( Q ) (cid:3) . (70)In Reference [47] we demonstrated on our data set that switching to the perturbative calculationcan be safely done for Q (cid:38) GeV , ie. from this point on a µ does not depend on the choice of Q max .In this work we use Q = 3 GeV . The perturbative part for this choice was computed in [47] and isgiven in Section 23, where the ﬁnal result for a µ is put together.We also consider a modiﬁcation of Equation (65), in which the current propagator is restricted to acertain region in time, from t to t . To achieve this, we multiply the propagator by a smooth windowfunction [48] W ( t ; t , t ) ≡ Θ( t ; t , ∆) − Θ( t ; t , ∆) with Θ( t ; t (cid:48) , ∆) ≡ + tanh[( t − t (cid:48) ) / ∆] (71)or equivalently we replace the weight factor as K ( t ) → K ( t ) W ( t ) . We will focus on a particular windowdeﬁned in Reference [48], with parameters t = 0 . fm, t = 1 . fm and ∆ = 0 . fm. The correspondingcontribution to the magnetic moment of the muon is denoted by a LO − HVP µ, win and for brevity we use a µ, win = a LO − HVP µ, win × . We can do the same partitioning as we did with a µ in Equation (67). We will use thosenotations extended by a win subscript. The contributions of the bottom and the top are not shown explicitely here, they will be added in our ﬁnal result section,Section 23.

520 540 560 580 600 620 640 660 680 2 2.5 3 3.5 4 [ a µ li gh t ] t c [fm]random sourcelow modes+solver truncation Figure 10: Comparison of a conventional random source based technique, as we applied it in our earlierwork [47], and a low mode utilizing technique of this work on a β = 3 . ensemble for the caseof [ a light µ ] upper and lower bounds (see Section 13).

12 Noise reduction techniques

In this section we consider quantities at the isospin-symmetric point; noise reduction techniques for theisospin-breaking part are discussed in Section 14. For the strange and charm connected contributions, C strange0 and C charm0 , and for the disconnected contribution C disc0 we use the same measurements thatare presented in our previous work [47]. A new measurement procedure is implemented for the lightconnected component C light0 . It is used to reanalyze the old conﬁgurations and make measurements onnew ensembles. This plays a key role in reducing the ﬁnal statistical error in a µ .The technique utilizes the lowest eigenmodes of the fermion matrix; for an early work with low eigen-modes, see [49]. The way in which we use these modes here is essentially the same as in [50], whereit is called Low Mode Substitution. In the space orthogonal to these modes, the computational eﬀort isreduced considerably by applying imprecise (aka. sloppy) matrix inversions. This is called the TruncatedSolver Method [37] or All Mode Averaging [38]. Here we describe the technique for the connected partof the current propagator. The same technique was applied recently for magnetic moment computationsin [51] also.We consider the connected propagator of Equation (59) for timelike separation, and perform an aver-aging over the source positions, together with a zero spatial-momentum projection at the sink: C ( t, ¯ t ) ≡ L (cid:88) (cid:126) ¯ x,(cid:126)x,µ =1 , , C conn µ,x,µ, ¯ x ( m l ,

0) = − L (cid:88) µ =1 , , ReTr (cid:2) D µ,t M − D µ, ¯ t M − (cid:3) , (72)where D µ,t = (cid:80) (cid:126)x D µ [ iP x U ] is an operator that performs a symmetric, gauge-covariant shift on a vector v x : [ D µ,t v ] x = (cid:40) x = t : iη µ,x (cid:16) U µ,x v x + µ + U † µ,x − µ v x − µ (cid:17) x (cid:54) = t : 0 (73)where η µ,x are the Dirac-gamma matrices in the staggered representation. We use the simplifying notation28 = D µ,t and D = D µ, ¯ t in the following. In Equation (72), we apply the real part because the imaginarypart vanishes anyway after averaging over gauge conﬁgurations.Using the lowest eigenmodes of M we split the quark propagator into an eigenvector part and into itsorthogonal complement, denoted by “e” and ”r”, respectively: M − = M − e + M − r , (74) M − e = (cid:88) i λ i v i v † i and M − r = M − (cid:32) − (cid:88) i v i v † i (cid:33) , (75)where v i /λ i is the i -th eigenvector/eigenvalue of the operator M . For the projection we used a modiﬁedversion of the symmetric Krylov-Schur algorithm described in [52]. Correspondingly, C splits into threeterms: C = C ee + C re + C rr , (76)with eigen-eigen, rest-eigen and rest-rest contributions: C ee = − L ReTr (cid:2) D M − e D M − e (cid:3) , (77) C re = − L ReTr (cid:2) D M − r D M − e + D M − e D M − r (cid:3) , (78) C rr = − L ReTr (cid:2) D M − r D M − r (cid:3) , (79)where an average over µ is assumed but not shown explicitely. The beneﬁt of this decomposition isthat the trace in the eigen-eigen part can be calculated exactly, and is thus equivalent to calculatingthe propagator with all possible sources in position space. This is the main ingredient for the noisereduction. Though no extra inversions are needed in this part, it has to be optimized carefully, sincethere is a double sum over the eigenmodes, where each term is a scalar product v † i D v j . In the rest-eigen part we have terms v † i D M − r D v i and also terms where D and D are exchanged. Therefore, thispart is only a single sum over the eigenmodes, and each term involves one matrix inversion. Note thatthese inversions are preconditioned by the eigenvectors, so they need much less iterations than standardinversions. Additionally, we speed up the inversions by running them with a reduced precision, and forsome randomly selected eigenvectors we correct for the small bias by adding the diﬀerence between ahigh precision solver and the reduced precision one [37, 38]. Finally, the rest-rest part is evaluated usingrandom source vectors ξ : we calculate ξ † D M − r D M − r ξ , which requires two inversions per random source.The reduced precision inverter technique is used here too.As an example we give here the algorithm parameters for one of the ensembles at β = 3 . : modes of the even-odd preconditioned Dirac operator are projected; the high precision inversionhas − accuracy; the reduced precision inverter is capped at conjugate gradient iterations; thebias correction is calculated with a frequency of / and random sources are chosen for the rest-rest term. With these choices, the eigen-eigen part is the dominant source of the error, and since it isalready evaluated with all possible sources, we have reached the limit where the noise comes from theﬂuctuations between gauge conﬁgurations. Using this technique we achieve a factor of ﬁve improvementin the statistical error compared to our previous work [47]. There we applied a random source techniquesimilar to the one that we now use for the rest-rest term. The number of random sources was perconﬁguration. The comparison of the result with the old and new techniques is shown in Figure 10.The number of projected eigenmodes is around for all lattice spacings in the L = 6 fm boxes.The number of projected modes has to be scaled with the physical four-volume, to keep the magnitude ofthe eigen-eigen part constant. On the large lattice with L = 11 fm we project eigenmodes.29 [ a µ li gh t ] ( t c )-( l o ,t = f m ) t c [fm] hiloavg Figure 11: Upper and lower bounds on [ a light µ ] as a function of t c , ie. the upper limit of the timeintegration in Equation (69). The lower bound value at . fm is subtracted and the plot shows the resultof a combined ﬁt of all ensembles that is evaluated at a = 0 . fm (see text). Such plots are used toset the value of t c on our ensembles. We use t c = 4 . fm everywhere.

13 Upper and lower bounds on (cid:104)

J J (cid:105)

In the case of the light and disconnected contributions to the current propagator, the signal deterioratesquickly as distance is increased. To calculate the HVP, a sum over time of the propagator has to beperformed, as Equation (69) shows. As was suggested in [53, 54], we introduce a cut in time t c , beyondwhich the propagator is replaced by upper and lower bounds, thereby reducing the statistical noise. Ourestimate is given by the average of the bounds at a t c where the two bounds meet. In this section onlyisospin-symmetric quantities are considered.Bounds are derived from the assumption that the current propagator is a sum of exponentials withpositive coeﬃcients. In the case of staggered fermions, where opposite parity states with oscillatingcoeﬃcients give also a contribution, the assumption is only satisﬁed after some distance and within thestatistical error. On our ensembles this is usually the case beyond about t ∼ . fm.For the light connected propagator at the isospin-symmetric point the bounds express the positivity(lower bound) and that the propagator should decay faster than the exponential of two pions (upperbound). They are given as ≤ G light ( t ) ≤ G light ( t c ) ϕ ( t ) ϕ ( t c ) , (80)where ϕ ( t ) = exp( − E π t ) . For E π we use the energy of two non-interacting pions with the smallestnon-zero lattice momentum π/L . The larger the t c the better the upper bound, but it comes with morestatistical noise.The exponential decay above assumes an inﬁnite time extent, T = ∞ . We incorporate the eﬀects ofa ﬁnite- T using next-to-leading-order chiral perturbation theory. There the exponential decay with thetwo-pion energy gets replaced by the following cosh -type form: exp( − E π t ) −→ cosh[ E π ( t − T / E π T / − . (81)30sospin component t c [ fm ] X = G light t c [ fm ] X = G disc X (cid:48) m . . X (cid:48)(cid:48) . . X (cid:48)(cid:48) , X (cid:48)(cid:48) . . Table 11: Cuts in time for diﬀerent isospin breaking components of the light and disconnected propagator.For each component we use two diﬀerent cuts: the one that is given in the Table, and another that is . fm larger.This is shown in detail in Section 15, where the above replacement corresponds to (cid:15) → (cid:15) T in Equations(104) and (108). In this case an appropriate upper bound on the propagator is given by using the righthand side of Equation (81) as the bounding function ϕ ( t ) . We use this form of ϕ ( t ) to compute the upperbounds in this work.To obtain a suitable t c on the ensembles, we combine the propagators of all ensembles in asingle analysis. The reason for this is to use all available statistics to analyze the behavior of the boundson t c . The a µ results of the ensembles depend on the pion and kaon masses and the lattice spacing. Theﬁrst two dependencies can be safely eliminated if we consider only the tail of the propagator. (Remember,we are working close to the physical point.) For this we subtracted the value of the lower bound at . fmfrom both bounds on each ensemble. The lattice-spacing dependence was taken into account by makinga continuum extrapolation for this “subtracted” a µ at each value of t c . The result of the continuum ﬁtfor the upper and lower bounds on [ a light µ ] is shown in Figure 11, in which the a function used in thecontinuum extrapolation is evaluated at our ﬁnest lattice spacing, a = 0 . fm. The two bounds meetaround . fm. Here the statistical error of the average covers the central value of both bounds. We willuse this average at t c = 4 . fm as our estimate for [ a light µ ] on the ensembles. Variation of t c in the plateau region had negligible eﬀect on the result. Note, that this value of t c is much larger thanthe one that we used in our previous work, t c = 3 . fm. This improvement is made possible by the noisereduction technique of Section 12.In the case of the isospin-symmetric disconnected propagator the bounds are ≤ − G disc ( t ) ≤ G light ( t c ) ϕ ( t ) ϕ ( t c ) + G strange ( t ) + G charm ( t ) . (82)Since the strange and charm terms fall oﬀ much faster than the light and disconnected one, their con-tribution does not change the value of t c obtained. We use the same measurements for G disc0 as in ourprevious work [47], and take the average of the bounds at a single cut value: t c = 2 . fm. This choice isin accordance with our ﬁndings in [47], that the variation in t c within the plateau of the average boundhas a negligible eﬀect on the result.

14 Isospin-breaking eﬀects in (cid:104)

J J (cid:105)

In this section we describe the procedure that we use to compute the isospin-breaking corrections to thecurrent propagator. We consider the contribution of the light and strange quarks only. The charm quarkcontribution was computed on the lattice in [55].We start with the connected contributions (see Table 10). The electric derivatives in those formulas, X (cid:48) and X (cid:48)(cid:48) , are measured by ﬁnite diﬀerences, as in the case of the isospin-breaking in hadron masses.Speciﬁcally, we compute the following contractions C conn ( m s , , C conn ( m s , + e ∗ ) , C conn ( m s , − e ∗ ) (83)31 [ a µ li gh t ] ’’ extrapolationmeasurement2.53.03.54.04.55.05.5 1 3 5 7 9 11strong isospin breakingstrong isospin breaking [ a µ li gh t ] ’ m κ = m val /m sea Figure 12: Extrapolation procedure to obtain the electromagnetic and strong-isospin-breaking correctionsto the light connected contribution. Measurements are performed with valence over sea quark mass ratiosof κ = { , , , , } . We use the three lightest κ values to obtain the value at κ = 1 with a linearextrapolation shown in the Figure. The data on the plot is taken from our coarsest lattice, correspondingto β = 3 . .for the strange quark and C conn ( κm l , , C conn ( κm l , + e ∗ ) , C conn ( κm l , − e ∗ ) (84)for the light quark, where e ∗ is the physical value of the electric coupling. From these the ﬁnite-diﬀerenceapproximators of the ﬁrst and second derivatives can be built in the standard way. In the light quark case,the G conn ( m l ) propagators are noisy. Instead of the low-mode technique of Section 12, we use a simplerapproach to reduce the noise, which is suﬃciently accurate. We perform computations with valence quarkmasses that are some multiple κ of the sea quark mass, m l , and then implement a chiral extrapolationto the target point at κ = 1 . We measure the contractions in Equation (84) at ﬁve diﬀerent values of κ and use the three lightest of these, κ = 3 , , , to perform a linear extrapolation gauge-conﬁgurationby gauge-conﬁguration. Figure 12 shows the result of this chiral extrapolation procedure on our coarsestensemble for the case of [ a light µ ] (cid:48)(cid:48) . A quadratic extrapolation including all ﬁve κ values gives a result thatis consistent with the κ = 3 , , linear extrapolation within statistical uncertainty.For the strong-isospin derivative of the connected contraction [ C light ] (cid:48) m (see last line of Table 10) weimplement directly the operator corresponding to the mass derivative. Again we use a chiral extrapolationfrom non-physical valence quark masses with κ = 3 , , , similar to the case of the electromagneticderivative. This extrapolation procedure is also plotted in Figure 12 for [ a light µ ] (cid:48) m .Finally the sea-valence and sea-sea electromagnetic derivatives of the connected part, [ a light µ ] (cid:48)(cid:48) / and [ a strange µ ] (cid:48)(cid:48) / , are measured on the set of ensembles that are dedicated to dynamical QED eﬀects (see Table8). There we combine the low-mode, noise-reduction technique with the mass extrapolation procedureabove to reach suﬃcient accuracy.In order to further reduce the noise in the light quark isospin-breaking corrections, when computingquantities like a µ , we set the propagator to zero after some cut t c . The value of t c is obtained by lookingfor a plateau in the derivative of a light µ as a function of t c . For the diﬀerent isospin-breaking derivativesthese values are given in Table 11. To assess the systematic error of the procedure, we also use a secondvalue of t c that is . fm larger. Let us note here that, when computing various derivatives of a µ , not32 N L N N N LO NLO / N √ NLO

NNLO

Table 12: Solutions to the power counting formula of Reference [56] for the current-current correlator.For an NNLO calculation, diagrams of dimension D ≤ will contribute. Each solution describes diagramsof dimension D having N L loops, along with some speciﬁed number of vertices from the Lagrangiansof each order denoted by N , , . The number of vertices from the leading order Lagrangians N is onlylimited by the number of loops.only do we have to consider the derivatives of the propagator, but also those of the lattice scale, whichenters in the weighting function K ( t ; Q max a, m µ a ) (see Equation (69)).Now let us turn to the disconnected contribution . The basic operator for this measurement isthe trace of the quark propagator, I ( m, e ) of Equation (60). This is computed with the help of thelow-lying eigenmodes of the Dirac operator, in a similar way to the calculation of the connected diagram C conn ( m, e ) , described in Section 12. In this case, the computation is technically much simpler, since in I ( m, e ) there is only one quark propagator under the trace, whereas in C conn ( m, e ) there are two. Theeigenvectors depend on the electric coupling, and we compute them for each value of e that we need. Forthe electromagnetic derivative it is useful to rewrite the current (cid:88) f = { u,d,s } q f I ( m f , q f e ) = I ( m l , e ) − I ( m l , − e ) − I ( m s , − e ) (85)using a Taylor expansion around e = 0 in the following way: (cid:88) f = { u,d,s } q f I ( m f , q f e ) = − I ( m l ,

0) + 2 I ( m l , e ) + I ( m l , − e ) − I ( m s , − e ) + O ( e ) . (86)The advantage of this form is that we can compute the ﬁrst and second derivatives without having tocompute the traces with ± e charge. For the strong isospin derivative, where only the ﬁrst derivativeis needed, it suﬃces to compute only one additional trace at a slightly diﬀerent mass than m l . We use . · m l . Altogether we measure traces at the following masses and electric couplings: I ( m l , , I ( m s , ,I (0 . · m l , ,I ( m l , + e ∗ ) , I ( m l , − e ∗ ) , I ( m s , + e ∗ ) , I ( m s , − e ∗ ) , (87)from which all the isospin-breaking derivatives can be constructed using ﬁnite diﬀerences. All these tracesare measured using the same set of random vectors, so that the random noise does not wash out thesignal in the diﬀerences. 33 In this section we consider the current propagator in staggered chiral perturbation theory (SXPT) to next-to-next-to-leading order (NNLO). The goal of this eﬀort is to describe the taste violation and ﬁnite-sizeeﬀects in our lattice simulations. We work in the isospin-symmetric limit throughout this section.In continuum chiral perturbation theory and in momentum space, the NNLO contribution was com-puted in [57, 58], and the ﬁnite-volume corrections are given in [59]. A coordinate space computationincluding ﬁnite-volume eﬀects was given recently in [51]. In staggered chiral perturbation theory, onlythe next-to-leading order (NLO) has been computed [60]. Here we work out the following order and givethe ﬁnal result in the coordinate space representation. This order requires the NLO contributions to thestaggered chiral Lagrangian, which is given in [61]. This Lagrangian has already been used to computepseudoscalar meson masses to NLO [56]. This result will be an important ingredient here: just as incontinuum chiral perturbation theory, the current propagator to NNLO can be considerably simpliﬁed ifone writes it in terms of masses including NLO corrections.The SXPT Lagrangian relevant to our computation is given by L = L + L ,LS + L + L ,SV + L + L , (88)with L , , the standard continuum LO, NLO and NNLO Lagrangians of Gasser and Leutwyler [62]and Bijnens, Colangelo and Ecker [63]. We denote S i = (cid:82) dx L i the corresponding actions, i ∈{

2; 2 , LS ; 4; 4 , SV ; 5; 6 } . We use the standard SXPT power counting scheme of Reference [64], wherebythe LO contributions are O ( p ) ≈ O ( m q ) ≈ O ( a ) , where p stands for a derivative operation, m q is thelight quark mass and a is the lattice spacing. According to this counting there is a LO staggered correction L ,LS given by the Lagrangian of Lee and Sharpe [64] and a NLO staggered correction L ,SV given bySharpe and van de Water [61]. There is also a staggered speciﬁc contribution between NLO and NNLO, L . More details on these terms are given below. There are further O ( a ) terms in the Lagrangian, which–diﬀerently from the staggered corrections– are invariant under the continuum taste symmetry. We arenot giving them explicitly here; their eﬀect is to change the low-energy constants of the theory by O ( a ) amounts – at least to the order that we consider here.A general power counting formula is provided in the Appendix of [56]. Under this scheme, we saythat a diagram M ( p, m q , a ) has dimension D if it scales as M ( p, m q , a ) → λ D M ( p, m q , a ) undera rescaling of the external momenta, quark masses and lattice spacing by p → √ λp , m q → λm q , and a → λa . In Table 12, we enumerate the contributions that are required for our NNLO computationunder this counting scheme. Note that the LO contribution is zero.The ﬁeld variables are denoted by φ . They describe all the ﬂavors and tastes of staggered pions. Theyare expressed as linear combinations of the product of the Hermitian generators of the U (4) taste groupand of the U ( N ) replicated ﬂavor group, ie. φ ≡ (cid:80) αa φ αa ξ α T a . Here the taste index α runs over the 16element set { , µ , µν, µ, I } with µ < ν . A possible representation for the taste generators can be builtfrom the ξ µ Dirac-matrices and the × identity matrix as: ξ α ∈ { ξ , ξ µ = iξ µ ξ , ξ µν = iξ µ ξ ν , ξ µ , ξ I = 1 } (89)For the U ( N ) generators T a we use the generalized Gell-Mann matrices and the a index runs from to N − . We work with N f degenerate ﬂavors and rooting is implemented with the replica trick, ie. N = N f N r , with N r → at the end of the computation. In most cases, φ appears in an exponential formin the Lagrangian, U = exp( iφ/F ) , where F is the pion decay constant in the chiral limit. Traces are takenover both the taste and replica-ﬂavor indices. Generators are normalized as tr( ξ α T a · ξ β T b ) = 2 δ ab δ αβ .The current propagator is obtained by incorporating an external Hermitian vector ﬁeld v µ = v † µ in theLagrangian, setting v µ = QA µ (where Q is the charge matrix) and taking the second order functional34erivative of the partition function with respect to A µ : (cid:104) J µ ( x ) J ¯ µ (¯ x ) (cid:105) ≡ δ log Z [ v ν = QA ν ] δA µ ( x ) δA ¯ µ (¯ x ) (cid:12)(cid:12)(cid:12)(cid:12) A =0 = (cid:28) δ S [ v ν = QA ν ] δA µ ( x ) δ S [ v ν = QA ν ] δA ¯ µ (¯ x ) − δ S [ v ν = QA ν ] δA µ ( x ) δA ¯ µ (¯ x ) (cid:29) (90)up to vanishing disconnected terms. In the following we take the current to operate as a taste singlet, so Q is proportional to the identity matrix in taste space, that is Q = Q a ξ I T a . This restriction is discussedin more detail at the end of the computation. In ﬂavor space, Q is diagonal but non-singlet. Theﬁeld-strength tensor of the electromagnetic vector potential is denoted by F µν = ∂ µ A ν − ∂ ν A µ .In the ﬁrst term of Equation (90), we will refer to the derivative δ S δA µ as current term. The secondterm in Equation (90) gives rise to contact terms proportional to δ ( x − ¯ x ) and derivatives thereof. Theanomalous magnetic moment considered in this work is obtained by integrating the propagator with akernel function, that behaves as ( x − ¯ x ) for small diﬀerences. This eliminates all contact terms with lessthan four derivatives. The smallest order they can enter is therefore O ( p ) , since the two photon ﬁeldsadd an extra p to the counting. Leading-order contributions

The O ( p ) and O ( m q ) terms are given by the standard Euclidean chiral Lagrangian, with an additionalmass term for the taste-ﬂavor singlet L = F (cid:0) D µ U D µ U † (cid:1) − F M tr (cid:0) U + U † (cid:1) + m

12 (tr φ ) , (91)where M is the tree-level Goldstone-boson mass and D µ = ∂ µ U − i [ v µ , U ] is the covariant derivativeincluding the vector ﬁeld. Note that the external vector ﬁeld is O ( p ) in the chiral power counting scheme.The singlet meson mass has to be sent to inﬁnity at the end of the computation ( m → ∞ ). Thefunctional derivatives described above give the current couplings δ S δA µ ( x ) = i Q [ ∂ µ φ, φ ]) − i F tr ([ Q, ∂ µ φ ] φφφ − Qφ [ ∂ µ φ, φ ] φ ) + O ( φ ) . (92)The leading order staggered terms have been described by Lee and Sharpe [64] and generalized tomultiple ﬂavors by Aubin and Bernard [65]. We write these terms as L ,LS . Since these are O ( a ) , theycan contribute to our NNLO calculation in diagrams with up to two loops, so when expanding in φ , weconsider terms up to and including O ( φ ) . The O ( φ ) terms can be absorbed into the O ( φ ) mass termfrom L , providing taste-dependent corrections to the tree-level mass. There are also extra terms for theﬂavor singlets of each taste, shifting these masses separately from the other ﬂavor components. Since L ,LS does not depend on the external ﬁeld, there are no contributions to the current terms. Next-to-leading order contributions

The O ( p ) , O ( p m q ) , and O ( m q ) vertices have been described by Gasser and Leutwyler [62]. We denotethis Lagrangian by L , and use the standard notation for the low-energy constants. According to Table 12they contribute to the calculation in diagrams with up to one loop, so when expanding in φ , we onlyconsider terms of O ( φ ) . These terms can mostly be absorbed into redeﬁnitions of the coeﬃcients forthe mass term, and a rescaling of the ﬁeld variable φ . Such a rescaling cannot aﬀect the ﬁnal result,since we are computing the correlation function of an external ﬁeld. The rescaling of the ﬁeld variableabsorbs the contributions to the current term from the terms in the Lagrangian that are proportional to35he low-energy constants L and L and only the following term remains: δ S δA µ ( x ) = 2 iF L tr ( Q∂ ν [ ∂ µ φ, ∂ ν φ ]) + O ( φ ) . (93)The staggered speciﬁc NLO contributions involve O ( a p ) , O ( a m q ) , and O ( a ) terms, which havebeen described by Sharpe & van de Water [61]. As NLO terms, these can only contribute at one loop,so we only consider their expansion up to O ( φ ) . Similar to L ,LS , most of the terms of L ,SV can beabsorbed into the LO mass as taste-dependent corrections and by ﬁeld rescaling. There are three groupsof terms that cannot be treated this way:1. In the ﬁrst group we have O ( a p ) terms that violate the remnant SO (4) taste symmetry of the LOaction, for example (cid:80) µ (cid:104) ∂ µ φξ µ ∂ µ φξ µ (cid:105) . The net eﬀect of these terms on the current propagator isonly to change the mass that appears in the pion propagators by SO (4) violating terms. In our latticesimulations these SO (4) violations are tiny: on our coarsest lattice the pion mass splittings, withinthe SO (4) multiplets, are about times smaller than the splittings between diﬀerent multiplets.We will set them to zero in our SXPT computation.2. In the second group we have O ( a p ) terms that depend explicitly on the external ﬁeld, and notthrough the covariant derivatives. Reference [61] calls these “extra source-terms”. In principle,these can contribute to the current terms. However, they are all proportional to the commutator ofthe vector ﬁeld with a taste matrix ξ α , and since we take our vector current to be a taste singlet,such commutators vanish. An example for such a term is (cid:80) µ,ν (cid:104) [ A µ , ξ ν ] φξ ν ∂ µ φ (cid:105) .3. Finally, there might be taste-symmetry-breaking terms containing the ﬁeld-strength tensor F µν . Aspurion analysis, similar to the one in Section III of Reference [60], indicates that no such terms areallowed in the L ,SV Lagrangian.

Contributions from beyond next-to-leading order

In SXPT, it is possible to construct a chiral Lagrangian between NLO and NNLO, as discussed by Bailey,Kim and Lee [56]. We denote this Lagrangian L . In it, terms might arise from the dimension-8 ordimension-9 Lagrangians in the Symanzik eﬀective theory, that are either O ( a p ) or O ( a ) . In order tocontribute to our calculation, vertices at this order would need to appear in a tree-level diagram (seeTable 12, where we denote it N √ NLO). As a result, the only terms that could contribute to the currentpropagator would be contact terms. However, such terms require the square of the vector ﬁeld A µ ≈ O ( p ) in order to have a non-zero second functional derivative and therefore they must be O ( p ) . We concludethat L gives no contribution in this calculation.Finally, we come to the NNLO Lagrangian. Similarly to the Bailey-Kim-Lee Lagrangian discussedabove, NNLO terms can only contribute to tree-level diagrams through contact terms at O ( φ ) . Thecontinuum terms are described by Bijnens, Colangelo and Ecker [63], and we write them as L . As wealready mentioned, in our observables only contact terms that are at least O ( p ) can contribute, and theonly such term is: δ S δA µ ( x ) δA ¯ µ (¯ x ) = C tr( Q ) ( δ µ ¯ µ ∂ − ∂ µ ∂ ¯ µ ) ∂ δ ( x − ¯ x ) + O ( φ ) . (94)The O ( a ) taste-violating contributions at NNLO must similarly be contact terms at O ( φ ) . Asmentioned above, such terms must be at least O ( p ) in order to contribute to the a µ . However, O ( a p ) terms can enter only beyond NNLO, so they are not relevant in our calculation.36 nﬁnite- L and T result Performing the computations with the Lagrangians from the previous subsections we obtain the full currentpropagator. As we mentioned in the beginning, it is also necessary to compute the NLO mass shift δM α for each taste arising from the terms not absorbed into the tree-level mass M α .To arrive to the propagator that is used in our computation, we take Equation (90), apply a spatialintegral over x and assume that µ, ¯ µ are spatial. With all the terms enumerated above we end up with: (cid:90) d x (cid:104) J µ ( (cid:126)x, t ) J ¯ µ (0) (cid:105) == (cid:88) α (cid:90) d p (2 π ) N Q (cid:34) F L E p,α − N F (cid:88) β G ,β + δM α ddM α (cid:35) e − E p,α t E p,α p µ p ¯ µ ++ (cid:88) α,β (cid:90) d p (2 π ) d r (2 π ) N Q F e − E r,β t E p,α − e − E p,α t E r,β E p,α E r,β ( E p,α − E r,β ) p µ r ¯ µ ( (cid:126)p · (cid:126)r ) + contact terms , (95)where we deﬁne the relativistic energy of a free particle with α taste as E p,α = (cid:112) M α + (cid:126)p . Thesummations mean a sum over sixteen tastes of the ﬂavor non-singlet pions, α ∈ { , µ , µν, µ, I } . Theﬂavor-singlet pseudoscalars only contribute to the δM α mass-shift terms. These can be transformed awayby switching to the NLO mass everywhere in the formula, ie. applying the shift M α → M α + δM α . Thischanges the result by eﬀects that are higher order than the NNLO considered here. The non-singlet chargesquared is deﬁned as Q = Q a Q a − Q Q . Since the result is proportional to Q , the ﬂavor-singlet partof the current gives no contribution at this order.We use dimensional regularization at scale µ and the MS scheme to work with the ultraviolet divergentloop integrals. Speciﬁcally, the one-loop integral G ,β is given by: G ,β = (cid:90) d p (2 π ) E p,β = M β π (cid:18) log M β µ + R (cid:19) (96)where R contains the divergence isolated by the MS prescription, see eg. [66]. Seemingly, there is also asingularity in the integrand of the double loop integral in the second line of Equation (95), when (cid:126)p = (cid:126)r .This singularity is superﬁcial, since there is a zero in the numerator which cancels it. It is useful towork with the terms of the numerator separately, in which case the separated terms are singular. Thesesingularities have to be regulated. We do this by adding an + iη into the denominator. The full expressionhas to be smooth as η → , and we take this limit at the end of the computation.The only low-energy constant, that appears in the result, is L from the continuum Gasser-LeutwylerLagrangian L ,GL . The ultraviolet divergences coming from the single loop integral G ,β and the doubleloop integral on the second line of Equation (95) can all be absorbed into L . This procedure deﬁnes arenormalized L r , and also a scheme-independent L in the standard way [66].The only contact term that aﬀects our observables is the one proportional to C ; we will ignore thisterm here, since it has no eﬀect on the ﬁnite volume and taste-splitting corrections.We now move to our speciﬁc case of two degenerate light quarks with rooting, so we set N f = 2 , N r = , L = − l and Q = ( + σ ) ⊗ in ﬂavor-replica space, which give Q = N r / . One of thetwo momentum integrals in Equation (95) can be performed analytically, and we also average over theLorentz indices to arrive to our ﬁnal expression: G ( t ) ≡ (cid:88) µ =1 , , (cid:90) d x (cid:104) J µ ( (cid:126)x, t ) J µ (0) (cid:105) = 13 (cid:48) (cid:88) α (cid:90) d p (2 π ) e − E p,α t E p,α p (cid:34) F (cid:48) (cid:88) β Γ( p ; M α , M β ) (cid:35) , (97)where p = (cid:126)p and the summation symbol (cid:80) (cid:48) α = (cid:80) α stands for averaging over the taste index α .The ﬁrst part in the square-bracket is the well-known NLO expression of [60], ie. the taste average of the37ontinuum NLO result. The second part of the square-bracket contains the NNLO correction and is givenexplicitly as Γ( p ; M α , M β ) = p + M α π (cid:18) l − log M β M (cid:19) + 5( p + M α )36 π − M β π + − p + M α − M β π (cid:113) − xx arcsin √ x if x < (cid:113) x − x log( √ x + √ x − if x > with x = p + M α M β . (98)The scheme independent l is introduced with the Goldstone pion mass M as l = − π (cid:18) l + log M µ + R (cid:19) . (99)Equations (97) and (98) reproduce the continuum NNLO result of [51], if we set the masses of all tastesto M .When we use the above result later in this paper, we take the ﬂavor non-singlet pion masses from thesimulations and ﬁx the remaining two parameters as: F = 92 . MeV and l = 16 as in [51]. Finite- L eﬀects The above computation can also be carried out in ﬁnite volume. In the continuum case, this was alreadydone in [51]. Here we generalize those formulas in the presence of taste violations. We use the sametechniques that were applied there, and we also correct that computation. A detailed derivation will notbe given here, we just brieﬂy describe the main strategy and give the results.The current propagator G ( t ) in ﬁnite volume can be obtained from the one in inﬁnite volume byreplacing the momentum integrals with sums. In the inﬁnite-volume expression, Equation (95), there areterms with single and double integrals. The ﬁnite-size correction for those with a single integral can beobtained from a Poisson summation formula, formally: L (cid:88) p − (cid:90) d p (2 π ) = (cid:90) d p (2 π ) (cid:88) n (cid:54) =0 e i(cid:126)n(cid:126)pL ≡ (cid:88)(cid:90) p,n (cid:54) =0 , (100)whereas for double integrals the correction involves more terms: L (cid:88) p L (cid:88) r − (cid:90) d p (2 π ) (cid:90) d r (2 π ) = (cid:88)(cid:90) p,n (cid:54) =0 (cid:88)(cid:90) r,m (cid:54) =0 + (cid:88)(cid:90) p,n (cid:54) =0 (cid:90) r + (cid:90) p (cid:88)(cid:90) r,m (cid:54) =0 , (101)where we also introduced the notation (cid:82) p ≡ (cid:82) d p (2 π ) . In some cases the momentum integrals can be moreeasily performed if the line of integration is deformed into the complex plane. Speciﬁcally we need the38ollowing integrals: I [ f ] ≡ (cid:88)(cid:90) p,n (cid:54) =0 f ( p ) = 12 π ∞ (cid:88) n =1 ν n nL (cid:90) ∞ dp p sin( npL ) f ( p ) I ( M α ) ≡ − (cid:88)(cid:90) p,n (cid:54) =0 E p,α = − M α π ∞ (cid:88) n =1 ν n nL (cid:90) ∞ e − ynM α L ydy (cid:112) y − I ( p ; M α , M β ) ≡ (cid:88) s = ± (cid:88)(cid:90) r,m (cid:54) =0 r E r,β ( E r,β − E p,α + isη ) = M β π ∞ (cid:88) m =1 ν m mL (cid:34)(cid:90) ∞ e − ymM β L y dy (cid:112) y − y + x −

1) ++ π x − √ x (cid:40) exp( − m √ − xM β L ) if x < m √ x − M β L ) if x > (cid:35) with x = p + M α M β . (102)Here f ( p ) is an arbitrary integrable function, ν ξ = (cid:80) (cid:126)n = ξ and the η > regulator was introducedaccording to our earlier discussion. In the I integral, the second term in the square-bracket comes frompoles at r = ( p + M α − M β ) /M β . This pole term was dropped in [51] by saying that the pole can beshifted outside a complex contour. We give the result as the ﬁnite- minus inﬁnite-volume diﬀerence andwe split it into ﬁve terms, one NLO and four NNLO: G ( t ; L ) − G ( t ; ∞ ) = ∆ G ( t ) NLO + (cid:88) i =1 ∆ G ( t ) NNLO ,i . (103)To keep the formulas simple we introduce the notation (cid:15) ( p ; M α , t ) ≡ e − E p,α t E p,α p . (104)Our result for the ﬁnite-volume correction is then: ∆ G ( t ) NLO = 13 (cid:48) (cid:88) α I [ (cid:15) ( M α , t )] F ∆ G ( t ) NNLO , = 13 (cid:48) (cid:88) α,β I [ (cid:15) ( M α , t )Γ( M α , M β )] F ∆ G ( t ) NNLO , = 13 (cid:48) (cid:88) α,β I [ (cid:15) ( M α , t )] · I ( M β ) F ∆ G ( t ) NNLO , = 13 (cid:48) (cid:88) α,β (cid:90) p (cid:15) ( p ; M α , t ) (cid:2) I ( M β ) + I ( p ; M α , M β ) (cid:3) F ∆ G ( t ) NNLO , = 13 (cid:48) (cid:88) α,β I [ (cid:15) ( M α , t ) I ( M α , M β )] . (105)The prefactors arise from the Lorentz-index averaging and (cid:80) (cid:48) α,β = (cid:80) α,β . In the continuum theseformulas agree with the ones in [51], up to the pole contribution that aﬀects the ∆ G ( t ) NNLO , and ∆ G ( t ) NNLO , terms. The largest NNLO term by far is ∆ G ( t ) NNLO , , and has the same order of magnitudeas the NLO term on our lattices. 39 inite- T eﬀects Until now we assumed that the temporal extent of the lattice is inﬁnite, T = ∞ . The correctionsintroduced by a ﬁnite T can also be computed in SXPT. The integrals over the time component of themomenta become sums, which are then related to integrals via Poisson’s summation formula. Formally: (cid:90) dp π −→ T (cid:88) p = (cid:90) dp π (cid:88) n e ip n T . (106)We give here the result of this procedure for three integrals that appear in the current-current correlator,in a one-loop and in a two-loop diagram, respectively: (cid:90) dp π e ip t p + E p −→ cosh [ E p ( t − T / E p sinh( E p T / (cid:90) dp π dq π e ip t p + E e iq t q + E −→ (cid:18) cosh [ E p ( t − T / E p sinh( E p T / (cid:19) = cosh [2 E p ( t − T / E p [cosh( E p T ) − (cid:90) dt (cid:48) (cid:90) dp π dq π dr π ds π e ip ( t − t (cid:48) ) p + E p e iq ( t − t (cid:48) ) q + E p e ir t (cid:48) r + E r e is t (cid:48) s + E r −→ (107) E r E p cosh [2 E r ( t − T / E p T ) − { p ↔ r } + ( E p − E r ) [ E r sinh( E p T ) + E p sinh( E r T ) + E p E r T ]16 E p E r ( E p − E r ) [cosh( E p T ) −

1] [cosh( E r T ) − The computation proceeds as in the case of ﬁnite- L . As in Equation (104), we introduce new notationsto keep the formulas simple: (cid:15) T ( p ; M α , t ) ≡ cosh [2 E p,α ( t − T / E p,α [cosh( E p,α T ) − p , σ T ( p ; M α ) ≡ p E p,α [cosh( E p,α T ) − . (108)For the correction due to ﬁnite L and T we get: G ( t ; L, T ) − G ( t ; ∞ , ∞ ) = (109) (cid:34) (cid:48) (cid:88) α (cid:90) p (cid:15) ( p ; M α , t ) (cid:35) (cid:15) → (cid:15) T − (cid:15) + (cid:34) (cid:48) (cid:88) α (cid:90) p (cid:15) ( p ; M α , t ) 1 F (cid:48) (cid:88) β Γ( p ; M α , M β ) (cid:35) (cid:15) → (cid:15) T − σ T − (cid:15) + (cid:34) ∆ G ( t ) NLO (cid:35) (cid:15) → (cid:15) T + (cid:34) (cid:88) i =1 ∆ G ( t ) NNLO ,i (cid:35) (cid:15) → (cid:15) T − σ T + T F  (cid:48) (cid:88) α (cid:88)(cid:90) p,n σ T ( p ; M α )  + 19 F  (cid:48) (cid:88) α (cid:88)(cid:90) p,n σ T ( p ; M α )   (cid:48) (cid:88) α (cid:88)(cid:90) p,n σ T ( p ; M α ) 1 − e − E p,α T E p,α  − F  (cid:48) (cid:88) α (cid:88)(cid:90) p,n (cid:15) T ( p ; M α , t )   (cid:48) (cid:88) α (cid:88)(cid:90) p,n E p,α ( e E p,α T −  + 118 F (cid:48) (cid:88) α (cid:88)(cid:90) p,n (cid:2) (cid:15) T ( p ; M α , t ) − σ T ( p , M α ) (cid:3) (cid:48) (cid:88) β (cid:88) s = ± (cid:88)(cid:90) r,m r E r,β ( E r,β − E p,α + isη ) 1 − e − E r,β T cosh( E r,β T ) − T at inﬁnite L , ie. G ( t ; ∞ , T ) − G ( t ; ∞ , ∞ ) . The rest ofthe formula gives the correction due to ﬁnite L at ﬁnite T , ie. G ( t ; L, T ) − G ( t ; ∞ , T ) . The terms inthe second line are obtained by replacing (cid:15) by (cid:15) T or by (cid:15) T − σ T in Equations (105) from the previoussubsection. The ﬁrst and second lines give most of the contribution to the ﬁnite-size eﬀect for the casesof interest in this work. The rest, from the third to the sixth line, are genuine ﬁnite- T corrections in thatthey vanish at inﬁnite T . They are all NNLO and are negligible for the lattices considered here. Taste non-singlet contributions to the current

Here we explore brieﬂy to what extent our previous assumption, that the vector current is a taste singlet,is justiﬁed. For this purpose it is useful to work with the valence staggered fermion ﬁelds χ . The latticecurrent propagator, that we introduced in Section 10, can be derived from the following zero-spatial-momentum operator: (cid:88) (cid:126)x (cid:0) χ x U µ,x η µ,x χ x + µ + χ x + µ U † µ,x η µ,x χ x (cid:1) , (110)where the η µ,x phase is the staggered representation of the Dirac matrix and we drop the ﬂavor index forsimplicity. Equation (110) is just the conserved vector current of the staggered fermion action.Staggered bilinears can be assigned with a spin ⊗ taste structure, see [67] for a modern treatment. Inthe case of the operator in (110), two such assignments can be given: γ µ ⊗ and γ µ γ γ ⊗ ξ ξ . (111)The conserved current couples to states of both types. The ﬁrst corresponds to the taste-singlet vectorcurrent, the case which we have fully covered previously. The second is a taste non-singlet pseudovector .Its correlator gives the characteristic contribution to the staggered propagator which oscillates in timewith a factor ( − x . The observables that we consider in this paper are obtained by integrating thepropagator over the whole time range, or at least over some physical distance. In the continuum limitsuch an integration completely eliminates the oscillating contribution. At ﬁnite lattice spacing it gives an O ( a ) suppression compared to the taste singlet vector contribution.It is possible to compute the pseudovector propagator in chiral perturbation theory. This requires theinclusion of an external antisymmetric tensor source t µν . This has been done to NNLO in [68], which haschosen t µν ≈ O ( p ) for the chiral counting of the tensor ﬁeld. With this choice a vertex with the tensorﬁeld ﬁrst appears in the NLO Lagrangian. The one-loop contribution to the tensor-tensor propagator iszero, due to charge conjugation and parity invariance. The two-loop contribution has two NLO verticesand is thus O ( p ) . This, combined with the suppression explained in the previous paragraph, gives O ( a p ) , which is well beyond the order to which we work.As explained in [68], there is an ambiguity in the chiral-counting assignment of t µν . A closely relatedfact is that the tensor-tensor propagator is renormalization scheme dependent, since there is no conserva-tion law for the tensor current. Even if we used a counting t µν ≈ O (1) , the oscillating contribution wouldstill be beyond the NNLO to which we work.Based on these arguments, we ignore the taste non-singlet contribution to the current in our SXPTcomputation.

16 Lellouch-L¨uscher-Gounaris-Sakurai model

In this section we describe a phenomenological model that we use to make predictions for ﬁnite-volumecorrections in Section 17. We also use it to correct for taste-breaking eﬀects in the I = 1 contribution to Pseudovectors are to be distinguished from axial-vectors, the latter have γ µ γ spin structure. Model for the pion form factor

As shown in [42], in inﬁnite volume the Euclidean, current-current correlation function is a Laplacetransform of the corresponding spectral function, ρ ( E ) : G ( t ) = (cid:90) ∞ dEE e − E | t | ρ ( E ) , (112)for t (cid:54) = 0 . Here, G ( t ) is deﬁned in Equation (1) of the main paper and E is the center-of-mass energy.For large | t | , G ( t ) is dominated by the low end of the spectrum, which is governed by two-pion,scattering states. In addition, phenomenology indicates that two-pion states, up to E = 1 . GeV [69,70], are responsible for over 70% of the total a µ and over 85% of the I = 1 contribution computed inthis work. Thus, the contribution of two-pion states should not only provide a good description of thelong-distance behavior of the current-current correlator, important for understanding ﬁnite-volume eﬀects,but also a reasonable model for this correlator at all distances relevant for the determination of a µ . Now,the two-pion contribution to the spectral function is given by (see eg. [71]) ρ ( E ) | ππ = 16 π (cid:18) kE (cid:19) | F π ( k ) | , (113)where E = 2 (cid:112) M π + k , with k the magnitude of the pions’ back-to-back momenta in the center-of-massframe, and F π ( k ) is the timelike, pion, electromagnetic form factor.A good phenomenological description of the pion form factor and the corresponding π - π scatteringphase shift is given by the Gounaris-Sakurai (GS) parametrization [72]. In the context of estimating ﬁnite-volume eﬀects in g µ − , it was used ﬁrst in [73]. The GS parametrization describes well the experimentalspectral function in the I = 1 channel from threshold to E around GeV, thus covering the very important ρ -resonance contribution. This parametrization is given by: F π ( k ) = M (0) M ( E ) − E − iM ρ Γ( E ) , (114)with the energy-dependent width Γ( E ) = Γ ρ (cid:18) kk ρ (cid:19) (cid:18) M ρ E (cid:19) , (115)where k ρ = (cid:113) M ρ − M π , and the energy-dependent mass squared: M ( E ) = M ρ + Γ ρ M ρ k ρ (cid:20) k [ h ( E ) − h ( M ρ )] − (cid:2) E − M ρ (cid:3) k ρ M ρ h (cid:48) ( M ρ ) (cid:21) . (116)Here, the pion-loop function is h ( E ) = 2 kπE log E + 2 k M π (117)and h (cid:48) ( E ) is its derivative with respect to E . To complete the model, we determine the width of the ρ in terms of the ρ - ππ coupling, g , to leading order in an eﬀective theory where the ρ and π are pointlike42articles. It is straightforward to show that Γ ρ = g π k ρ M ρ . (118)From these expressions for F π ( k ) , the phase-shift is simply obtained using Watson’s theorem: δ ( k ) = arg F π ( k ) . (119) G GS ( t ) will denote the inﬁnite-volume correlator given by Equations (112), (113), (114) and (118).The GS model has two free parameters: g and M ρ . Since our simulations are performed very nearthe physical mass point, we ﬁx these parameters to their physical value, neglecting their sub-percentuncertainties: M ρ = 775 MeV is the mass of the ρ meson from [74], and g = 5 . is obtained fromEquation (118), using the width of ρ and the mass of π ± , also from [74]. Model for ﬁnite-volume eﬀects

In a ﬁnite spatial volume of size L × L × L , the two-pion spectrum is discrete because of momentumquantization, and the spectral representation of the current-current correlator becomes a sum, instead ofan integral, over two-pion states. Thus, the large-t behavior of the corresponding, ﬁnite-volume correlationfunction, G ( t ; L ) , can be written as: G ( t ; L ) | t |→∞ −→ (cid:88) n> | (cid:126)A n | e − E n | t | , (120)where n labels the energy eigenstates, in order of increasing energy. Below the four-pion, inelastic thresh-old, the energy of state n is given by E n = 2 (cid:112) M π + k n , with k n determined by the inﬁnite-volume, I G ( J P C ) = 1 + (1 −− ) , π - π scattering phase shift, δ ( k ) , through L¨uscher’s formula [75, 76]: φ ( q n ) + δ ( k n ) = nπ, n = 1 , , . . . , (121)where q = kL/ π and φ ( q ) is given in [75] (see also [77]). The amplitude of the n -th state (cid:126)A n isproportional to (cid:104) | (cid:126)J I =1 (0) | n (cid:105) , where (cid:126)J I =1 collects the spatial components of the isospin I = 1 contributionto the quark electromagnetic current deﬁned after Equation (1) of the main paper. This amplitude isdetermined by the phase-shift and by the timelike, pion, electromagnetic form factor, through a Lellouch-L¨uscher (LL) equation [78–80] | (cid:126)A n | = [ qφ (cid:48) ( q ) + kδ (cid:48) ( k )] − k = k n k n πE n | F π ( k n ) | , (122)where the primes indicate a derivative of the function with respect to its argument. We assume thatEquations (121) and (122) are also approximately true above the inelastic threshold, because the ρ decaysalmost exclusively into two pions [74]. Equations (119) and (120) then deﬁne the Lellouch-L¨uscher-Gounaris-Sakurai (LLGS) model for the ﬁnite-volume current correlator. It will be denoted G LLGS ( t ; L ) .It is important to note that the GS parametrization for ρ ( E ) | ππ , obtained from Equations (114) and(118), decreases as ( E log E ) − for large E and becomes smaller than that of free pions after the ρ peak,when E (cid:38) . GeV. Thus, in the sums over two-pion states in ﬁnite-volume, one can reasonably neglectterms for which E n is greater than . GeV. In the reference volume with L = L ref = 6 . fm, used inSection 17, this corresponds to n = 8 for the Goldstone.In Section 17, we use this LLGS model to compute the ﬁnite-volume correction to the I = 1 con-tribution to a µ . It is determined in the continuum limit for the reference volume L ref and is given by43ntegrating the diﬀerence of the inﬁnite and ﬁnite-volume correlators: a GS µ ( L = ∞ ) − a LLGS µ ( L ref ) = 10 α (cid:90) ∞ dt K ( t ) (cid:2) G GS ( t ) − G LLGS ( t ; L ref ) (cid:3) . (123)In Section 17, this diﬀerence is compared to the results of a dedicated lattice study of ﬁnite-volume eﬀectsin a µ . The good agreement represents a strong validation of the model. Model for taste violations

Here we generalize the LLGS model, for the ﬁnite-volume correlation function, to include the lattice spacingeﬀects arising from taste breaking. Indeed, the dominant, taste-breaking eﬀects in a µ are expected tobe those associated with the two-pion spectrum: these states give the dominant contribution to a µ andthe masses of pions are signiﬁcantly aﬀected by taste breaking on coarser lattices. Our conserved, quarkelectromagnetic current couples, not only to two-Goldstone-pion states, but also to ﬁfteen additional pairsof more massive taste partners of the pion. We label the masses of these states with M τ , where the τ index runs over the 16 element set τ ∈ { , µ , µν, µ, I } with µ < ν .In a description where the pions are free, corresponding to NLO staggered XPT, the two-pion stateshave energies, E (0) n,τ = 2 (cid:112) M τ + k n , with k n = | (cid:126)n | (2 π/L ) , n = (cid:126)n and (cid:126)n ∈ Z . In the interacting case,we make the assumption that two-pion-state energies have the same set of taste copies, but with themomentum, k n,τ , given by the L¨uscher quantization condition of Equation (121). We further assumethat a similar conclusion holds for the amplitudes, i.e. that they satisfy Equation (122) with k n,τ givenby Equation (121). Thus we model the long-distance behavior of the correlator in a ﬁnite spatial volumeand at ﬁnite lattice spacing as: G ( t ; L, a ) | t |→∞ −→ (cid:88) n> (cid:88) τ | (cid:126)A n,τ | e − E n,τ | t | , (124)which we denote G SLLGS ( t ; L, a ) . In the formula, the lattice spacing dependence arises from the splittingsin the pion spectrum. In implementing this model, we assume that all taste-partner pairs of pions couple tothe physical, lightest, ρ taste, with the same coupling g . Thus, we implicitly assume that the dependenceof Γ ρ and of δ ( k ) on M τ is mostly kinematic, an assumption which is borne out by simulations [81].All of these choices guarantee that our model has the correct continuum limit. Also, we keep states upto n = 8 , even for the more massive taste partners. However, we only apply the model to simulationsin which the physical ρ has suﬃcent phase space to decay into all sixteen, taste-partner pion pairs. Thisexcludes only our coarset simulation, with β = 3 . .We use the SLLGS model to correct taste-breaking eﬀects in the I = 1 contribution to a µ , simulationby simulation. This signiﬁcantly reduces the a -dependence of the dominant contribution to a µ , allowingfor a more precise determination of its continuum limit. As shown in Section 18, the optimal way in whichto apply taste-breaking corrections is to consider these corrections in diﬀerent time-windows. Thus, foreach simulation we compute, a LLGS µ, win ( L ref ) − a SLLGS µ, win ( L, a ) = 10 α (cid:90) ∞ dt K ( t ) W ( t ; t , t ) (cid:2) G LLGS ( t ; L ref ) − G SLLGS ( t ; L, a ) (cid:3) , (125)where the window function, W ( t ; t , t ) , is deﬁned in Equation (71) and L is the size of the lattice forthat simulation. The lattice spacing dependence enters in SLLGS from the taste breaking in the pionmasses M τ and these are also taken from the simulation. This correction is applied as an additive shifton the measured a µ . 44 nclusion of ﬁnite- T eﬀects In this section, we generalize the LLGS and SLLGS models to include the eﬀects of the ﬁnite-time extent, T , of the lattice. In our simulations, gauge-boson ﬁelds obey periodic boundary conditions in time andquark ﬁelds, antiperiodic ones. Finite- T eﬀects are expected to be smaller than those due to ﬁnite L ,because T ≥ L in our simulations and because they occur only in a single spacetime direction.The LLGS model describes the contributions to the current-current correlator of two-pion states, inthe presence of interactions. Thus, in studying ﬁnite- T eﬀects, we consider terms in which each currentcouples only to two pions . The three largest contributions are:1. The Euclidean propagation of interacting π + π − states between times and t > , with energies E n . These contributions fall-oﬀ exponentially as e − E n t .2. The Euclidean propagation of interacting π − π + states from t to T , which increases as e − E n ( T − t ) as t approaches T / .3. The contribution of a single pion that wraps around the time direction. This is at the same order indecreasing exponentials of T as the previous contribution. At any given time, there is only a singlepion on the lattice. Therefore, the energy of this state is that of a free pion with at least one unitof momentum, i.e. E (0) n / (cid:112) n (2 π/L ) + M π , with n ≥ the norm-squared of a three vectorin Z . This contribution is constant in t and proportional to e − E (0) n T/ . Moreover, since the pioncouples to the current without recoiling, the matrix element describing this coupling is also that ofthe free pion theory. This contribution appears with a factor of , because it can be caused by thepropagation of either a π + or a π − . Altogether, this contribution has the same form as it does atNLO in XPT.The next order in decreasing exponentials of T is a term of three pions propagating from to t and asingle pion from t to T . We have estimated this contribution and ﬁnd that it contributes to a µ at alevel that is orders of magnitude smaller than our statistical error. Thus, we neglect it, as well as allhigher-order winding terms and, to describe ﬁnite- T eﬀects, we replace G LLGS ( t ; L ) by G LLGS ( t ; L, T ) ≡ G LLGS ( t ; L ) + 13 (cid:88) n> | (cid:126)A n | e − E n ( T − t ) + 23 (cid:88) n> | (cid:126)A (0) n | e − E (0) n T/ , (126)with the free pion amplitude squared, L | (cid:126)A (0) n | = 4 ν n n (2 π/LE (0) n ) , and keeping only states up to n = 8 ,as above. Here, ν n counts the number of vectors of Z that have norm squared n .Using similar arguments, the inclusion of taste-violations is straightforward. One merely performs thereplacement E n → E n,τ and averages over the sixteen tastes. This yields G SLLGS ( t ; L, T, a ) .Now, using the above correlators, it is straightforward to generalize Equation (123) to also includeﬁnite- T corrections. We obtain: a GS µ ( L = ∞ , T = ∞ ) − a LLGS µ ( L ref , T ref )= 10 α (cid:34)(cid:90) ∞ dt K ( t ) G GS ( t ) − (cid:90) T ref / dt K ( t ) G LLGS ( t ; L ref , T ref ) (cid:35) . (127)Similarly, for each simulation, we can estimate taste-breaking eﬀects on an T × L lattice, through: a LLGS µ, win ( L ref , T ref ) − a SLLGS µ, win ( L, T, a ) = 10 α (cid:34)(cid:90) T ref / dt K ( t ) W ( t ; t , t ) G LLGS ( t ; L ref , T ref ) There are also terms in which, four, six, eight, etc. pions couple to the currents. Compared to the two pion terms,these are exponentially suppressed in time but also suppressed by the small coupling of four or more pions to the ρ , whichdominates the energy range that we are modelling here.

640 650 660 670 680 690 700 3 3.5 4 4.5 5 5.5 [ a µ li gh t ] t c [fm] 96x96,hi96x96,lo56x84,hi56x84,lo Figure 13: Upper and lower bounds on the light isospin-symmetric component of a µ . The results shownhere are obtained with the action on two diﬀerent volumes at a = 0 . fm lattice spacing and M π = 121 MeV Goldstone-pion mass. We also have another simulation with M π = 104 MeV mass. Fromthese two we interpolate to M π = 110 MeV. This value ensures that a particular average of pion tastesis ﬁxed to the physical value of the pion mass (see text). − (cid:90) T/ dt K ( t ) W ( t ; t , t ) G SLLGS ( t ; L, T, a ) (cid:35) , (128)thus generalizing Equation (125) to ﬁnite T .

17 Finite-size eﬀects in a µ Finite-size eﬀects on a µ were the largest source of uncertainty in our previous work [47]. In this section wepresent the computation of these eﬀects in a systematic way, which includes dedicated lattice simulations,chiral perturbation theory and phenomenological models. The concrete goal of this section is to providea single number that is to be added to the continuum-extrapolated lattice result obtained in a referencebox, which is deﬁned by a spatial extent of L ref = 6 . fm and a temporal extent of T ref = L ref .First we concentrate on the ﬁnite-size eﬀect of the isospin-symmetric part. Section 6 details ourisospin decomposition. The isospin-breaking part will be discussed later in the last subsection. Theisospin-symmetric part can be further decomposed into an I = 0 and an I = 1 channel. From these the I = 1 is supposed to give the majority of the ﬁnite-size eﬀect. We focus on the I = 1 ﬁrst, and give anestimate of the I = 0 contribution later. According to Equation (63) the I = 1 result is given by the (cid:0) (cid:1) ’th of the connected light contribution.We perform dedicated lattice simulations with two diﬀerent lattice geometries: one on a × latticewith the reference box size and another on a large × lattice with box size L = L big = 10 . fm and T = T big = L big . Since taste violations severely distort the ﬁnite-size behavior, we designed a new actionwith highly-suppressed taste breaking for these computations. The details of the action and thesimulation parameters are given in Section 2. Our strategy is then to compute the ﬁnite-size correction46s the following sum: a µ ( ∞ , ∞ ) − a µ ( L ref , T ref ) == [ a µ ( L big , T big ) − a µ ( L ref , T ref )] + [ a µ ( ∞ , ∞ ) − a µ ( L big , T big )] XPT . (129)The ﬁrst diﬀerence on the right hand side is taken from the dedicated simulations. The seconddiﬀerence is expected to be much smaller than the ﬁrst and is taken from a non-lattice approach: chiralperturbation theory.We consider three non-lattice approaches for both diﬀerences on the right hand side of Equation(129). In case of the ﬁrst diﬀerence, they will be compared to our simulations. The ﬁrst is chiralperturbation theory (XPT), discussed in detail in Section 15. The second is the Lellouch-Luscher-Gounaris-Sakurai model (LLGS), with details in Section 16. In this approach we compute values for the referencebox only, and not for the large box. This is because L big is relatively large and one would have to dealwith a large number of states, which is not practical in that approach. The third approach is that ofHansen and Patella (HP) [82], who use a generic ﬁeld theory framework to relate the ﬁnite-size eﬀectto the electromagnetic form factor of the pion, the latter being determined on the lattice. Note thattheir ﬁrst published result does not include eﬀects that are of order e −√ M π L . These can be signiﬁcantand have been added later [83]. Though the latter version also includes ﬁnite- T eﬀects, it does so withassumptions that are not applicable to our lattices, where T < L . Therefore we use the HP approachhere in the inﬁnite- T limit. Results with the action

We compute the ﬁrst diﬀerence in Equation (129) using dedicated simulations with the action.First we describe the way in which we ﬁx the physical point in these simulations. For this purpose, it isinstructive to look at the inﬂuence of taste violations on the ﬁnite-size eﬀect in NNLO staggered chiralperturbation theory (SXPT). The necessary formulas are given in Section 15. We apply them to variouscases that are described below. The following numbers are obtained for the ﬁnite-size eﬀect:NNLO SXPT results for → continuum a µ ( L big , T big ) − a µ ( L ref , T ref ) 15 . . . . The ﬁrst number gives the continuum prediction, which is about 2% of the total a µ . The second numberstands for the action at a lattice spacing of a = 0 . fm. Here, most of the pion tastes are tooheavy to play any role in the ﬁnite-size behavior. According to SXPT the ﬁnite-size eﬀect is practicallynon-existent there. The action has much suppressed taste violations, and the corresponding number,the third in the table, is already much closer to the continuum. Until now the Goldstone pion mass is setto the physical value M π = M π , ∗ . This pion is the lightest of the sixteen pions in the taste multiplet.We can get much closer to the size of the continuum ﬁnite-size eﬀect if we use Goldstone-pion massesbelow M π , ∗ . For example one can set a taste-averaged pion mass to the physical value.In NLO SXPT the slope of the hadronic vacuum polarization is proportional to (cid:80) α M − π,α . Thismotivates to use the harmonic-mean-square (HMS), deﬁned by M − π, HMS ≡ (cid:88) α M − π,α , to average over the tastes. Setting M π, HMS = M π , ∗ requires lowering the Goldstone-pion mass to M π = 110 MeV. With this choice, the ﬁnite-size eﬀect is of the same size on the lattice and in thecontinuum in NNLO SXPT. This gives the fourth number in the table. This choice results in muchsmaller lattice artefacts than the usual setting with the Goldstone-pion, at least for an observable like theﬁnite-size eﬀect.To generate the data set, we performed simulations with two diﬀerent Goldstone pion masses: M π = 104 MeV and

MeV. To set the physical point as described above, we perform an interpolation47rom these two pion masses to M π = 110 MeV.To compute a light µ from the current propagator in our simulations we use the upper and lowerbounds described in Section 13. The results are plotted in Figure 13 for the M π = 121 MeV simulationpoint. The bounds meet at around . fm and . fm on the small and large volumes, respectively. Atthese distances we take the average of the two bounds as an estimate for a light µ . The results are given inthe table below: M π in → MeV

MeV

MeV a light µ (56 ×

84) 685 . .

7) 668 . .

0) 679 . . a light µ (96 ×

96) 710 . .

9) 684 . .

7) 701 . . In the last column we also give the interpolated value at the physical point, using the HMS averagedpion-mass prescription deﬁned above.We only have one lattice spacing with the action, so no proper continuum extrapolation of theﬁnite-size eﬀect can be done. We estimate the cutoﬀ eﬀect of the result by comparing the total a µ withthe action at this single lattice spacing to the continuum extrapolated lattice result, both inthe L ref volume. The result is about 7% larger than the continuum value. Therefore we reduce themeasured ﬁnite-size eﬀect by 7%, and assign a 7% uncertainty to this correction step. For the diﬀerencewe get a µ ( L big , T big ) − a µ ( L ref , T ref ) = 18 . . stat (1 . cont . (130)The result is obtained from the a light µ numbers from above including a multiplication by the (cid:0) (cid:1) chargefactor. The ﬁrst error is statistical, the second is an estimate of the cutoﬀ eﬀect. Results from non-lattice approaches

The table below collects the ﬁnite-size eﬀect computed in various non-lattice approaches:NLO XPT NNLO XPT LLGS HP a µ ( L big , T big ) − a µ ( L ref , T ref ) 11 . . . − a µ ( L big , ∞ ) − a µ ( L ref , ∞ ) 11 . . . . As we mentioned before, the LLGS approach was not used in the large box. The LLGS numbers inthe table are actually a diﬀerence of the LLGS prediction for a µ ( ∞ , ∞ ) − a µ ( L ref , T ref ) and the residualﬁnite-size eﬀects of the big lattice taken from NNLO XPT. We also give results for the case of inﬁnitetime extent. We see that, according to the models, the ﬁnite- T eﬀect is much smaller than the ﬁnite- L eﬀect.The diﬀerent models give a ﬁnite-size eﬀect of similar size that agrees well with the lattice determinationof Equation (130). Only the NLO result diﬀers by about σ ’s. The fact that NLO chiral perturbationtheory underestimates the ﬁnite-size eﬀect was already shown in [84], at a non-physical pion mass. Usingphysical pion mass, a dedicated ﬁnite-volume study was carried out in [85]. It reaches the same conclusionas we do, albeit with larger errors.The good agreement for the ﬁnite-size eﬀect of the reference box, between the models and the lattice,gives us conﬁdence that the models can be used to reliably compute the very small, residual, ﬁnite-sizeeﬀect of the large box. We get: NLO XPT NNLO XPT HP a µ ( ∞ , ∞ ) − a µ ( L big , T big ) 0 . . − a µ ( ∞ , ∞ ) − a µ ( L big , ∞ ) 1 . . . For an inﬁnite-time extent the NNLO XPT and HP approaches agree nicely. As a ﬁnal value for the largebox ﬁnite-size eﬀect we take the NNLO XPT result including the ﬁnite- T eﬀects: a µ ( ∞ , ∞ ) − a µ ( L big , T big ) = 0 . . big , (131)48 [ m ∆ Π ( L ) ] e - [...] no r m a li z ed b y [ π ] - mLm=m pole (e)=m pole (0)m=m MSbar (e)=m

MSbar (0) α /(mL) Figure 14: Electromagnetic part of the ﬁnite-size eﬀect for scalar QED. Shown is the dimensionless doublediﬀerence for the slope of the vacuum polarization, ie. m [∆Π ( L, e ) − ∆Π ( L, , normalized by thefree ﬁeld value m Π ( ∞ ,

0) = [480 π ] − (see text). Two diﬀerent deﬁnitions for the electromagnetic partare shown, deﬁned by matching with two diﬀerent masses, m pole and m MS . A blue curve shows an orderof magnitude estimate of the ﬁnite-size eﬀect from [86].where the uncertainty is an estimate of higher-order eﬀects, given here by the diﬀerence of the NNLO andNLO values.Until now we have been discussing the I = 1 ﬁnite-size eﬀects. Here we make an estimation ofthe I = 0 channel using XPT. The long-distance behavior of the I = 0 channel is dominated by threepions. A neccessary photon-pion-pion-pion vertex arises from the Wess-Zumino-Witten term in the chiralLagrangian, which is NLO [66]. The lowest order diagram involves two such vertices and two loops, thusthe contribution is N LO. From the NLO and NNLO values for a µ ( ∞ , ∞ ) − a µ ( L ref , T ref ) we estimate thesize of the N LO term, from which we take . . I=0 as our estimate for the I = 0 ﬁnite-size eﬀect. Finite-size eﬀect in the isospin-breaking contributions

A comprehensive study of the electromagnetic ﬁnite-size eﬀects on the current propagator has appearedrecently [86]. The authors conclude that if all particles, except the photon, are treated in inﬁnite volume,then the ﬁnite-size eﬀects are of order α/ ( M π L ) . In practice, however, when all particles reside in theﬁnite box, the usual exponential ﬁnite-size eﬀects become dominant over their electromagnetic counterpartsuppressed by α . In this case it is useful to separate the electromagnetic contributions from the isospin-symmetric part. The QED part exhibits an α/ ( M π L ) behavior. The isospin-symmetric part will havean exponential suppression governed by the neutral-pion mass, exp( − M π L ) . These isospin-symmetriceﬀects are sizeable and discussed earlier in this Section. A subtle point here is the deﬁnition of theelectromagnetic contribution or equivalently the matching of QCD+QED to QCD.It is instructive to study, in a simple model, the role of matching in the size of ﬁnite-volume eﬀects.For this purpose we carry out lattice simulations in scalar QED. Only quenched QED is implemented,since dynamical QED eﬀects enter at order O ( e ) . We perform two sets of simulations:1. First we perform simulations in QED with a bare scalar mass m = 0 . , a coupling α = 1 / and in L boxes in the range L = 16 . . . . The mass of the charged scalar boson extracted fromthe propagator and extrapolated to inﬁnite volume is m pole ( e ) = 0 . . One can also deﬁne a49enormalized mass using the MS prescription [86], and we ﬁnd m MS ( e ) = 0 . .2. We also perform simulations without QED, which is just the free scalar ﬁeld theory, in the same boxsizes and at two bare values of the mass, m = 0 . and . . We use these two values toperform the interpolations that are necessary for the diﬀerent matching conditions. Note that evenin the free case, the pole mass is slightly diﬀerent from the bare mass due to lattice artefacts [87].We use m MS (0) = m pole (0) .We choose a simple observable, the slope of the vacuum polarization function Π ( L, e ) ≡ d Π( Q ) dQ (cid:12)(cid:12)(cid:12) Q =0 .This is built from the conserved current of the scalar ﬁeld theory, which is given as j µ,x /i = φ † x + µ e ieA µ,x φ x − φ † x e − ieA µ,x φ x + µ . (132)From the measured Π , we build the diﬀerence ∆Π ( L, e ) ≡ Π (32 , e ) − Π ( L, e ) and investigate its L dependence. To deﬁne the QED part, we compute the diﬀerence between ∆Π ( L, e ) and ∆Π ( L, . Thiscan be done in diﬀerent ways, depending on how the two theories, with and without QED, are matched.One way is to use m pole to match the theories, another is to do the same with the m MS . The resultsare shown in Figure 14. As one can see, the MS matching leads to the expected α/ ( mL ) behavior, butpole-mass matching leaves the electromagnetic part with a much larger ﬁnite-size eﬀect.In QCD+QED we can deﬁne matching schemes similar to the pole and MS mass matchings of scalarQED. The pole mass is the measured mass of the charged particle, whereas the MS mass is the parameterthat appears in the renormalized Lagrangian. In QCD+QED the analogue of m pole is the charged-pionmass, and that of m MS is the renormalized quark mass. If we were to base our matching on M π + , the ﬁnite-size eﬀects would be much larger than the expected α/ ( M π L ) . Whereas if we use the renormalized quarkmasses in the matching, we expect to see the α/ ( M π L ) ﬁnite-volume behavior in the electromagneticpart. In our scheme, as introduced in Section 6, we keep the neutral-pion mass, M π , ﬁxed instead ofthe charged-pion mass M π + . This should be very close to a scheme where the renormalized quark massesare kept ﬁxed. As such, we expect the QED corrections to a µ to exhibit an α/ ( M π L ) behavior in ourscheme too.The ﬁnite-size eﬀect of O [ α/ ( M π L ) ] from the electromagnetic part is very small compared to theprecision of our study. The ﬁnite-size eﬀect of the strong-isospin-breaking part must also be small: it isactually exactly zero in NLO chiral perturbation theory. In our reference box L ref = 6 . an α/ ( M π L ref ) relative correction corresponds to a ﬁnite-size eﬀect of . in a µ . We will use a value of . . qed as anestimate of the ﬁnite-size eﬀect of the isospin-breaking in our reference box. Final result

For our ﬁnal result for the ﬁnite-size eﬀect of the reference box, we add the numbers in Equations (130)and (131), as well as the estimates from the conclusions of the preceeding two subsections, giving: a µ ( ∞ , ∞ ) − a µ ( L ref , T ref ) = 18 . . stat (1 . cont (0 . big (0 . I =0 (0 . qed [2 . . (133)the ﬁrst error is the statistical uncertainty of our computation, the second is an estimate of the cutoﬀ eﬀects, the third is the uncertainty of the residual ﬁnite-size eﬀect of the “big” lattice, the fourthis a XPT estimate of the I = 0 ﬁnite size eﬀect, the ﬁfth is an estimate of the isospin-breaking eﬀects.The last, total error in the square-brackets is the ﬁrst ﬁve added in quadrature. The vast majority of theﬁnite-size eﬀect is obtained using the lattice computation; for the rest we apply analytic methods.These methods have been validated by the lattice computation: for the majority contribution they givevalues that are consitent with the lattice. 50

540 560 580 600 620 640 6600.000 0.005 0.010 0.015 0.020 [ a µ li gh t ] a [fm ] NNLONLOnoneSLLGS-winNNLO-win

Figure 15: Continuum extrapolation of [ a light µ ] after applying diﬀerent types of taste violation corrections.NLO and NNLO refer to the orders of staggered chiral perturbation theory, there is no correction atLO. SLLGS corresponds to the model presented in Section 16. The dashed lines show the continuumextrapolations used in our ﬁnal result.

18 Continuum extrapolation of a light µ and a disc µ In this section we investigate diﬀerent methods to correct cutoﬀ eﬀects related to taste violations, andargue for two particular approaches that are used to obtain the ﬁnal result of the paper. We focus onobservables at the isospin-symmetric point here, since lattice artefacts in the isospin-breaking parts havenegligible eﬀect on the ﬁnal uncertainties.The connected light and disconnected components of the current propagator show signiﬁcant cutoﬀeﬀects with staggered fermions. They arise due to the well-known taste violation at ﬁnite lattice spacing.Not only are they signiﬁcant, but they appear to decrease much faster than O ( a ) as the continuum limitis approached as shown in Figure 2 for the taste violations in the pion spectrum. These two facts togethersuggest that the standard a continuum extrapolation from commonly used lattice spacings may not besuﬃcient.There is a subtle problem in connection with taste violations and ﬁnite-size eﬀects. On coarse lattices,ﬁnite-size eﬀects are largely suppressed, since most of the pion taste partners are heavy. The ﬁnite-sizeeﬀects increase gradually for ﬁner lattices, but even on our ﬁnest lattice, only about half of the expectedeﬀect is present, as discussed in Section 17. A good correction for cutoﬀ eﬀects should also restore theﬁnite-volume dependence that is expected in the continuum limit.Figure 15 shows the continuum extrapolation of [ a light µ ] . Here we consider four linear ﬁts, by leavingout zero, one, two or three of the coarsest lattice spacings from the ﬁt, these are shown by dashed lines onthe plot. In our ﬁnal analysis, described in Section 23, we also include quadratic ﬁts in a , and take intoaccount several other systematics together with isospin-breaking eﬀects. These lead to several thousandsof ﬁts, from which the ﬁnal value for a light µ is obtained. The four ﬁts in this section serve as an illustration.In Figure 15 diﬀerent symbols correspond to diﬀerent approaches to remove taste violations. Fromthe four continuum-extrapolatied values we then make a weighted average using the Akaike InformationCriterion (AIC) weights of the ﬁts, as discussed in Section 20. The following results are obtained:correction type → none NLO SXPT NNLO SXPT NNLO SXPT-win SLLGS-win [ a light µ ] ( L ref , T ref ) 650(3) 643(2) 633(2) 641(2) 635(3) [ a µ li gh t ] i n w i ndo w [t ... t + . f m ] F i ne ( . f m ) - C oa r s e ( . f m ) t [fm] SLLGSNNLONLO4stout Figure 16: [ a light µ ] computed with a sliding window: the window starts at t and ends . fm later.The plot shows the diﬀerence between a ﬁne and a coarse lattice, the volumes are L = 6 . fm and L = 6 . fm. The black squares with errors are obtained from the simulation. The colored curves arethe predictions of NLO and NNLO staggered chiral perturbation theory and the SLLGS model. They arecomputed at the parameters (pion mass, taste violation, volume) of the simulations.The diﬀerent abbreviations and the role of the reference box, given by spatial and time extents L ref =6 . fm and T ref = L ref , are explained below. In the case of a ﬁt without any correction , the ﬁtqualities are bad and the continuum extrapolated values diﬀer signiﬁcantly. The AIC weighted result isdominated by the ﬁt with the three ﬁnest points.Much better ﬁt qualities compared to the uncorrected case can be obtained if we apply a correction,obtained from staggered chiral perturbation theory (SXPT), to the lattice results before the continuumextrapolation. Section 15 provides the necessary formulas. Speciﬁcally, each data point in a box with size L and T and a lattice with spacing a gets an additive shift as [ a light µ ] ( L, T, a ) → [ a light µ ] ( L, T, a ) + (cid:2) a XPT µ ( L, T ) − a SXPT µ ( L, T, a ) (cid:3) + (cid:2) a XPT µ ( L ref , T ref ) − a XPT µ ( L, T ) (cid:3) . (134)The ﬁrst additive correction removes taste violation artefacts. It vanishes as the lattice spacing goesto zero, and therefore provides a valid continuum extrapolation procedure. Note, however, that on ourlattice spacings, the correction changes the value of the a -continuum extrapolation. This is because itbehaves much more like ∼ a than a pure a . The second additive correction in Equation (134) correctsfor the small diﬀerences between the volumes in our diﬀerent simulations by shifting them to a commonsystem size given by the reference box. The ﬁnite-size eﬀect of the reference box is computed in dedicatedsimulations in Section 17. There is a charge factor (cid:0) (cid:1) in front of both correction terms. It is required,because the XPT results of Section 15 correspond to the I = 1 contribution, not to the light one.The correction with NLO SXPT already makes ﬁt qualities good, except for the case when all latticespacings are included. It also decreases the slope of the continuum extrapolation by a factor of twocompared to the unimproved case and reduces the variation in the extrapolated values. The AIC weightedresult is signiﬁcantly smaller than the one obtained without corrections.With one order higher, using NNLO SXPT, all ﬁt qualities are good and the sensitivity to leaving out the This corresponds to the LO of staggered chiral perturbation theory. F i t qua li t y i n w i ndo w [t ... t + . f m ] t [fm]NNLONLOnone Figure 17: Fit qualities of continuum-extrapolation ﬁts for [ a light µ ] as a function of t , where thepropagator is restricted to a window [ t , t + 0 . fm ] . Diﬀerent colors correspond to corrections obtainedusing diﬀerent orders of staggered chiral perturbation theory. LO corresponds to no correction at all.Diﬀerent symbols with the same color correspond to diﬀerent number of the coarse lattice spacingsignored in the ﬁt: ﬁlled/half-ﬁlled/empty for zero/one/two.coarse lattices is below the statistical errors. However, this order seems to “overdo” the improvement: itturns the extrapolation function, which was increasing towards the continuum limit, to one that approachesit from above. As a result the continuum extrapolated value is further reduced. Also we see no signs ofconvergence: the NNLO correction is more than twice as large as the NLO.More can be learned about cutoﬀ eﬀects by looking at the propagator G light0 instead of [ a light µ ] . Toget rid of the staggered oscillations we slide over the propagator with a smoothing window and computethe [ a light µ ] in those windows. We use the window function from Equation (71) with a step width of ∆ = 0 . fm and a width of t − t = 0 . fm. We then compute the diﬀerence between a coarse and aﬁne lattice, β = 3 . and . . The lattice result is plotted in Figure 16 as a function of t , togetherwith the curves obtained from NLO, NNLO SXPT and SLLGS. These are evaluated at the parametersof the ensembles. We see that NLO SXPT reproduces the cutoﬀ eﬀect only for distances larger thanabout . fm, whereas the NNLO/SLLGS starts to work already at about . / . fm. Another importantobservation is that both NNLO and SLLGS fail badly in the region below ca. . fm, there the NLO ismuch closer to the actual cutoﬀ eﬀect. This is not surprising, since both NNLO and SLLGS have a t → behaviour which does not agree with QCD’s ∼ t − .We can corroborate these ﬁndings by comparing the ﬁt qualities of the a -linear continuum extrap-olations for the sliding windows. For each window, starting at t and ending at t = t + 0 . fm, weperform continuum extrapolations with the LO, NLO and NNLO corrections. The ﬁt qualities of these as afunction of t are shown in Figure 17. The diﬀerent colors correspond to diﬀerent corrections, whereas thediﬀerent symbols represent diﬀerent number of coarse lattices dropped in the ﬁt (zero, one or two). Below t ≈ . fm the statistical error is so small that none of the ﬁts have a good ﬁt quality. Above this there isa short range where the uncorrected ﬁts are good. Then, in the range . (cid:46) t (cid:46) . fm, acceptable ﬁtsare obtained using the NLO correction, while the other ﬁts are much worse. Between . (cid:46) t (cid:46) . fmthe NNLO corrected ﬁts have the best quality, as the NNLO corrected data is almost perfectly linear in a . This agrees with the ﬁndings in Figure 16, that above t ≈ . fm the NNLO describes the cutoﬀeﬀects well. Above t ≈ . fm the statistical errors become large, and all ﬁts are good. The picture isvery similar if we replace the NNLO correction with the one obtained from the SLLGS model.53 [ a µ d i sc ] a [fm ] noneNLONNLONNLO-winSLLGS-win Figure 18: Continuum extrapolation of [ a disc µ ] after applying diﬀerent types of taste violation corrections.NLO and NNLO refer to the orders of staggered chiral perturbation theory, there is no correction atLO. SLLGS corresponds to the model presented in Section 16. The dashed lines show the continuumextrapolations that are used in our ﬁnal result.These ﬁndings explain why the continuum extrapolation of [ a light µ ] using the NNLO correction seemsto overdo the improvement, as seen in Figure 15. For windows with t (cid:46) . fm, the NNLO largelyoverestimates the size of the cutoﬀ eﬀect. This might not come as a surprise, since the chiral expansion isnot expected to work at short distances. This motivates us to apply the NNLO correction only in windowswhere it provides good ﬁt qualities. Accordingly we propose to use the correction [ a light µ ] ( L, T, a ) → [ a light µ ] ( L, T, a ) + (cid:2) a NLO − XPT µ, win1 ( L ref , T ref ) − a NLO − SXPT µ, win1 ( L, T, a ) (cid:3) ++ (cid:2) a NNLO − XPT µ, win2 ( L ref , T ref ) − a NNLO − SXPT µ, win2 ( L, T, a ) (cid:3) , (135)ie. apply the NLO correction in some window-1 and the NNLO in a complementary window-2. In Figure15 we show the continuum extrapolation obtained by this type of correction with red color and withthe label “NNLO-win”. The window limits were set based on the ﬁt qualities in Figure 17: window-1corresponds to the range [0 . . . . . fm, and window-2 to the range above . fm. No correction isapplied for t < . fm. Just as in the case of the NNLO correction over the full t -range, the ﬁt qualitiesare good even when including the coarsest ensemble, and the sensitivity to dropping coarse ensembles issmall. The important diﬀerence to the NNLO ﬁt is that this improved correction also gives a good ﬁt inthe short-intermediate time range.We use the taste-violation correction in Equation (135) to improve the continuum limit in our ﬁnalanalysis. Two choices for the window-1 limits are taken: [0 . . . . . fm and [0 . . . . . fm. The variationin the result due to this choice is included in our systematic error.Analogously we can set up taste violation corrections with the SLLGS model, replacing the NNLOSXPT used in window-2 with SLLGS and keeping the rest of the analysis the same. This gives the bluecurves and points in Figure 15, labeled by “SLLGS-win”. The AIC weighted result for the continuumextrapolation is somewhat smaller than in the case of NNLO-win. We add the diﬀerence between SLLGS-win and NNLO-win to the systematic error associated with the continuum-extrapolation procedure.A similar investigation can be carried out in the case of [ a disc µ ] . There we apply corrections as inEquations (134) or (135), but with a charge factor (cid:0) − (cid:1) instead of (cid:0) (cid:1) . For example the analogoue of54quation (134) is: [ a disc µ ] ( L, T, a ) → [ a disc µ ] ( L, T, a ) − (cid:2) a XPT µ ( L ref , T ref ) − a SXPT µ ( L, T, a ) (cid:3) . (136)The resulting continuum extrapolations are shown in Figure 18. The unimproved data points show severelattice artefacts, and the NNLO SXPT seems to overdo the improvement again. On the other hand,applying the NNLO SXPT only at large distances results in almost ﬂat continuum extrapolations.55 In this section we describe the procedure that is used to obtain the physical values of a µ . Two typesof ﬁt functions are introduced, Type-I and Type-II, which diﬀer in their input parameters. In Type-I ﬁtsthese are experimentally measurable quantities. In Type-II ﬁts the inputs are observables that are notdirectly accessible in experiments. Type-II ﬁts are needed to implement the separation of observables intoisospin-symmetric contribution and isospin-breaking corrections that is described in Section 6. We closethe section by presenting an alternative ﬁt procedure. Type-I ﬁts

In the case of Type-I ﬁts we parameterize the quark-mass and electric-charge dependence of an observable Y around the physical point and for small isospin breaking with a linear function f : Y = f ( { X } ; A, B, . . . ) ≡ A + BX l + CX s + DX δm + EX vv + F X vs + GX ss . (137)The X l , X s , . . . are called independent variables of the ﬁt function, though they can be (statistically)correlated. The A, B, . . . are called the ﬁt coeﬃcients. The Type-I ﬁts have the feature that theirindependent variables { X } are quantities that are experimentally measurable. Here the X l and X s variables describe the deviation from the physical light and strange mass X l = M π M − (cid:20) M π M (cid:21) ∗ , X s = M K χ M − (cid:34) M K χ M (cid:35) ∗ (138)with ∗ denoting the experimental value. No higher orders in X l or X s are needed, since we work close tothe physical point. The remaining X variables measure the distance from the isospin-symmetric limit X δm = ∆ M K M , X vv = e v , X vs = e v e s , X ss = e s , (139)where e v and e s are the valence and sea electric charges, respectively. Higher-order isospin-breaking termsare not considered in this work. The meson masses are deﬁned as M K χ ≡ (cid:0) M K + M K + − M π + (cid:1) , ∆ M K ≡ M K − M K + . (140)In case of the neutral pion we use the combination M π χ ≡ (cid:0) M uu + M dd (cid:1) , (141)(142)where the masses of mesons uu and dd are obtained from contractions involving connected diagrams only.It can be shown in partially-quenched chiral perturbation theory coupled to photons [34], that M π = M π χ up to terms that are second order in isospin breaking.The coeﬃcients A, B, . . . in Equation (137) are speciﬁc to the observable Y . They can depend on56he lattice spacing, and also on the X variables deﬁned above, in particular we use: A = A + A a + A a ,B = B + B a ,C = C + C a ,D = D + D a + D a + D l X l + D s X s ,E = E + E a + E a + E l X l + E s X s ,F = F + F a ,G = G + G a . (143)The lattice spacing a is deﬁned in a so-called mass-dependent, scale-setting scheme: for any ensemble, a is given as the ratio of the Ω mass measured in lattice units divided by its experimental value. In the A, D, E coeﬃcients we use both linear and quadratic dependecies in a ; all other depedencies are assumedto be linear. Depending on the ﬁt qualities, some of these parameters will be set to zero.The parameters A , A , A , B , . . . can be determined by performing a ﬁt for suﬃciently many en-sembles that scatter around the physical point. The physical value of Y can then be obtained from thisﬁt as Y ∗ = A + D [ X δm ] ∗ + ( E + F + G ) · e ∗ , (144)ie. by setting the independent variables X to their physical values, including setting the valence and seaelectric charges to the physical value of the coupling e ∗ . The value e ∗ is related to the experimental valueof the ﬁne structure constant as e ∗ = √ πα ∗ . This choice is valid up to second order in isospin-breaking.As described in Section 5, isospin-breaking corrections are obtained by measuring derivatives withrespect to the δm , e s and e v parameters. These can be incorporated into the above procedure by derivinga system of coupled equations: one by taking Equation (137) at the isospin-symmetric point, and theother four by applying the isospin breaking derivatives, see Equations (26) and (29). We then ﬁnd thefollowing ﬁve equations: [ Y ] = [ A + BX l + CX s ] [ Y ] (cid:48) m = [ DX δm ] (cid:48) m [ Y ] (cid:48)(cid:48) = [ A + BX l + CX s + DX δm ] (cid:48)(cid:48) + [ E ] [ Y ] (cid:48)(cid:48) = [ A + BX l + CX s + DX δm ] (cid:48)(cid:48) + [ F ] [ Y ] (cid:48)(cid:48) = [ A + BX l + CX s + DX δm ] (cid:48)(cid:48) + [ G ] (145)where various isospin components of the coeﬃcients A, B, . . . have to be included, eg. the isospinsymmetric value of E is given by: [ E ] = E + E [ a ] + E [ a ] + E l [ X l ] + E s [ X s ] . (146)The ﬁrst line in (145) parameterizes the isospin-symmetric data, and is the only equation that dependson the A parameter. The next equation describes strong-isospin-breaking, where the electromagneticcoeﬃcients E, F, G trivially drop out. B and C are also absent here, since they depend symmetrically onthe u and d quarks. This equation is the main constraint for D . The ﬁnal three equations are the electricderivatives; they constrain the E, F and G coeﬃcients.Note that the derivatives in Equation (145) are with respect to the bare parameters. The strong-isospin-breaking derivative [ . . . ] (cid:48) m deﬁnes a renormalized observable, but the electric charge derivatives donot. This is due to the fact that the electric charge changes the running of the quark masses and the57attice spacing. However, diﬀerences like [ Y ] (cid:48)(cid:48) − [ A + BX l + CX s + DX δm ] (cid:48)(cid:48) , (147)which actually appear in (145), are free of divergences. When preparing plots to illustrate the continuumextrapolation, the electric derivatives will always refer to such renormalized combinations. Type-II ﬁts

We introduce a second type of parametrization, called Type-II, in order to obtain the isospin decompositiondescribed in Section 6. Type-II ﬁts use the w -scale for scale setting and are deﬁned through: Y = f ( { ˜ X } ; ˜ A, ˜ B, . . . ) ≡ ˜ A + ˜ B ˜ X l + ˜ C ˜ X s + ˜ D ˜ X δm + ˜ E ˜ X vv + ˜ F ˜ X vs + ˜ G ˜ X ss , (148)where the independent variables of the ﬁt function are deﬁned as ˜ X l = M π χ w − [ M π χ w ] ∗ , ˜ X s = M ss w − [ M ss w ] ∗ , ˜ X δm = ∆ M w , ˜ X vv = X vv , ˜ X vs = X vs , ˜ X ss = X ss , (149)with ∆ M = M dd − M uu . Some of the ˜ X variables contain w , M ss and ∆ M , that cannot be measuredexperimentally. The physical values of these quantities have to be determined from a Type-I ﬁt of Equation(137) ﬁrst. ˜ A , ˜ B , . . . in general depend on hadron masses and on the lattice spacing, analogously to thedependencies in Equation (143). Here the lattice spacing is deﬁned through w : it is the physical valueof w divided by the one measured in lattice units. The ﬁt procedure is also completely analogous to theone described above, including the coupled equations for the diﬀerent isospin components. The isospindecomposition can be obtained from the Type-II ﬁt coeﬃcients as [ Y ] iso = ˜ A , [ Y ] sib = ˜ D [∆ M w ] ∗ , [ Y ] qed = ( ˜ E + ˜ F + ˜ G ) · e ∗ . (150)One can also decompose the electromagnetic contribution further to valence-valence, valence-sea andsea-sea parts: [ Y ] qed − vv = ˜ E e ∗ , [ Y ] qed − sv = ˜ F e ∗ , [ Y ] qed − ss = ˜ G e ∗ . (151)The two ﬁt types, Type-I in Equation (137) and Type-II in Equation (148) have to yield the same physicalvalue Y ∗ within error bars. This was always the case for the observables considered here. Later, when wediscuss the ﬁts, it will be obvious from the text which parametrization we are working with, so we dropthe ˜ from the coeﬃcients of the Type-II ﬁts for simplicity. Correlations

In both parametrizations we have to work with a system of equations such as (145), where the unknownparameters are contained in

A, B, . . . . To obtain these we perform a ﬁt taking [ Y ] and the isospinderivatives from several ensembles. The [ Y ] , [ Y ] (cid:48) m and [ Y ] (cid:48)(cid:48) components are measured on the same L ≈ fm ensembles of Table 1, and they are therefore correlated. One also has to take into accountthe correlation between the sea quark derivatives [ Y ] (cid:48)(cid:48) and [ Y ] (cid:48)(cid:48) that are measured on the L ≈ fmensembles of Table 8. These correlations have to be properly included in the ﬁt. Also, we have to take intoaccount the correlation of Y and the independent variables { X } , including the lattice spacing. Speciﬁcallywe compute and minimize the following function to determine the ﬁt parameters A , . . . : χ = (cid:88) i,j ( Y i − f i ) Cov − ij ( Y j − f j ) . (152)58ere the sums run over all ensembles and Y i ( f i ) are the values of the observable (function) on ensemble i . The matrix Cov ij is the statistical covariance of the residuals Y i − f i , computed as Cov ij = (cid:104) ( Y i − f i ) − ( Y i − f i ) (cid:105) (cid:104) ( Y j − f j ) − ( Y j − f j ) (cid:105) , (153)where we denote the statistical average with an overline. Using the jackknife samples this can be obtainedas: Cov ij = N J − N J N J (cid:88) J =1 (cid:104)(cid:16) Y ( J ) i − f ( J ) i (cid:17) − (cid:16) Y (0) i − f (0) i (cid:17)(cid:105) (cid:104) . . . i → j . . . (cid:105) , (154)where an upper index ( J ) means that the quantity is computed on the J -th jackknife sample and J = 0 stands for the average over all jackknife samples. The minimization of the χ -function yields non-linearequations for the parameters, since the Cov matrix depends on them too. To solve the minimizationproblem numerically, we ﬁrst guess the minimum by ignoring the parameter dependence of the

Cov matrix. In all cases this was already a good starting point, which is related to the fact, that the errorson Y are typically much larger than on X . This guessing can be iterated and after a few iterations weswitch to Newton’s method to accelerate the convergence. Alternative ﬁt procedure

In addition to the previously described ﬁt procedure we also use an alternative approach, in which isospincorrections are included in a diﬀerent way. The idea is to use the ﬁt function, eg. the Type-I function inEquation (137), directly without working with the isospin breaking derivatives of that function. For thispurpose we create new, “virtual” ensembles in addition to the already existing isospin-symmetric ones.These virtual ensembles have an isospin breaking with one or more of the e s , e v and ( m d − m u ) /m l parameters set to non-vanishing values, close but not necessarily exactly to their physical values. Theobservables on the virtual ensembles are computed using the isospin-symmetric values and isopsin breakingderivatives measured on the original ensembles. For the global ﬁt we use the original ensembles togetherwith these newly created ones. Since the virtual ensembles were created from the original isospin-symmetricensembles, there are strong correlations between them. Computing the covariance matrix is similar toEquation (154), but now the indices i, j run over all ensembles, including the newly created ones.We used this technique, with the Type-I ﬁt function, to compute the quantities w , M ss and ∆ M .In all three cases the results had similar uncertainties as the original approach, presented earlier in thissection, and for which the results can be found in Section 21. Also, the central values agreed within theirsystematic uncertainty in the two approaches.

20 Uncertainty estimation

Calculation of statistical errors

We use the jackknife method to calculate the statistical errors. To suppress the auto-correlation betweendata from subsequent conﬁgurations we introduce a blocking procedure. It is very convenient to use anequal number of blocks for all ensembles. In this work we use N J = 48 blocks. With this choice we havetypically 100 trajectories or more in a block, which is much larger than the autocorrelation time of thetopological charge (around on our ﬁnest ensembles). For the blocks we apply the delete-one principle,resulting in N J jackknife samples plus the full sample.We keep the correlation between all quantities calculated from the same ensemble. For simplicity, wematch the jackknife samples between ensembles, too. This means that each global ﬁt using all ensemblesat the same time is performed N J + 1 times. The covariance matrix is calculated only for the main sample,there is no need for the errors on the correlations here.59 .000.100.200.300.400.500.600.700.800.901.00 12000 12500 13000 13500 14000 14500

2x combined error – M [MeV ]CDF of syst weightsCDF of stat and syst weights Figure 19: Cumulative distribution functions (CDF) of ∆ M = M dd − M uu values. The red curve showsthe CDF of about 3k diﬀerent analyses obtained from their corresponding AIC weights. At a given pointalong the curve the horizontal band is the statistical error of the analysis at that point. The blue curveshows a CDF corresponding to a combined distribution of the AIC weights and the Gaussian distributionsof the statistical errors. This latter curve is obtained from Equation (157) with λ = 1 . We use the medianand the width of this curve to deﬁne the central value and the total error. For the separation of the totalerror into a statistical and a systematic part we also use the CDF with λ = 2 . Estimation of systematic errors

Throughout the chain of analyses many choices are made, ranging from ﬁt windows and mass extractions,through the various Ans¨atze for the large-time behavior of the

J J correlator, to the various parametriza-tions of a global ﬁt. We call the global ﬁt with a speciﬁc set of such choices an analysis. Each choice of k possible options introduces a factor k in the total number of analyses, which already includes a factor of N J + 1 corresponding to the statistical sampling. Here we describe the procedure to derive a systematicerror coming from the ambiguity of these choices. We follow closely the strategy introduced by us in [32]and also extend it by a new method to separate statistical and systematic errors.For a target observable y we build a histogram from the diﬀerent analyses. Each analysis gets a weightassigned. This weight is given by the Akaike Information Criterion (AIC). The AIC is derived from theKullback-Leibler divergence, which measures the distance of the ﬁt function from the true distribution ofthe points. A derivation of the formula can be found in Section 11 of [32]. Here we use a slightly modiﬁedversion of the AIC: AIC ∼ exp (cid:2) − (cid:0) χ + 2 n par − n data (cid:1)(cid:3) , (155)where the χ , the number of ﬁt parameters n par and the number of data points n data describe the globalﬁt. The ﬁrst two terms in the exponent correspond to the standard AIC, the last term is introduced toweight ﬁts with diﬀerent number of ensembles; this happens when we apply cuts in the lattice spacing.It can be derived from the Kullback-Leibler divergence and arises from the ﬁrst term in Equation (S41)of [32]. In case of normally distributed errors this term can be computed and one obtains n data , whichthen leads to Equation (155) that we use in this paper.The analyses diﬀering only in the parametrization of the ﬁt function, or in a cut in the lattice spacing,are weighted with their AIC weights; in the directions corresponding to other systematic variations, a ﬂat60eighting is applied. Finally, the weights are normalized in such a way, that their sum over all analysesequals 1.Let w i denote the weight of the i -th analysis for a quantity y , with (cid:80) i w i = 1 . We interpret thisweight as a probability. The statistical uncertainties can be included by noting that, due to the centrallimit theorem, they follow a Gaussian distribution N ( y ; m i , σ i ) with a central value m i and a standarddeviation σ i . These parameters are given by the jackknife average and the jackknife error calculated fromthe jackknife samples in the i -th analysis. We then deﬁne a joint probability distribution function of y ,including both statistical and systematic uncertainties, as: (cid:88) i w i N ( y ; m i , σ i ) . (156)In the following we work with the cumulative distribution function (CDF): P ( y ; λ ) = (cid:90) y −∞ dy (cid:48) (cid:88) i w i N ( y (cid:48) ; m i , σ i √ λ ) . (157)Here, for later use, we introduce a parameter λ that rescales the statistical error.The median of the CDF is our choice for the central value of y and its total error is given by the 16%and 84% percentiles of the CDF: σ ≡ (cid:2) ( y − y ) (cid:3) with P ( y ; 1) = 0 . , P ( y ; 1) = 0 . . (158)One could deﬁne a systematic error by evaluating the 16% and 84% percentiles of the P ( y ; 0) function,since here the choice λ = 0 erases the statistical contribution to the distribution. However, P ( y ; 0) isa sum of step functions (shown in red in Figure 19), making the percentiles a function that has jumps,which makes the deﬁnition of the systematic error highly sensitive to the value of the percentile chosen.Here we make a more robust choice for the systematic error. First we demand, that: σ + σ ≡ σ . (159)Now, let us note that the rescaling of each jackknife error σ i with a factor λ is expected to increase thetotal squared statistical error with the same factor: λσ + σ ≡ (cid:2) (˜ y − ˜ y ) (cid:3) with P (˜ y ; λ ) = 0 . , P (˜ y ; λ ) = 0 . . (160)Equations (158), (159) and (160) then provide a deﬁnition for separate statistical and systematic errors.If the λ is not too small, then the joint CDF is smooth and has no sudden jumps, see Figure 19, and theprocedure is insensitive to the choice of λ . We use λ = 2 in our error estimations.To understand the composition of the systematic error we calculate the error budget for all importantquantities in the following way. Imagine that the full analysis uses 9 values of lattice spacing cuts, and weare interested in the corresponding systematic error. We ﬁrst determine 9 total errors for each possiblecuts. From these we construct a second CDF, which is a sum of 9 Gaussians as in Equation (157), with i = 1 . . . , the σ i being the total error and m i the average of the 16 and 84 percentiles of the ﬁts withthe i -th cut, and the w i the sum of the weights of those ﬁts. From this CDF we derive the systematicerror as done above for the original CDF, which is our result for the systematic error corresponding tothe 9 cuts. We remark that the systematic errors are correlated within one error budget, distorting thequadratic sum of the components, that ought to sum up to the full systematic error. Error propagation

Here we describe the way to propagate errors to consecutive analysis steps. Such a case occurs when weperform a Type-II ﬁt using the physical values of w , M ss and ∆ M that were determined in a Type-I ﬁt.61he statistical errors are taken into account by keeping the jackknife samples throughout the wholeanalysis and computing the statistical error only in the end, ie. after the Type-II ﬁt. For the systematicerror there are certain analysis choices, like hadron mass ﬁt ranges, that are shared between the Type-Iand Type-II ﬁts. We carry these over as we do with the jackknife samples.There are also systematics that are independent in the two types of ﬁts. For those, one would like tocombine all of the corresponding analyses of the Type-I ﬁt with all of those of the Type-II ﬁt. The numberof individual analyses can already be several thousand for each type of ﬁt, and by mixing each analysisin the ﬁrst step with each analysis in the second step, the total number of analyses would easily reach amillion. These many combinations are unnecessary, since they include many bad ﬁts with tiny weights.In our approach we select N I results from the Type-I ﬁt by an “importance sampling”: we uniformlysplit the probability interval [0 , into N I bins and, for each bin, take whichever individual ﬁt correspondsto the midpoint of that bin. This produces a list of N I Type-I analyses, sampled according to theirimportance. This selection is then the input into the Type-II ﬁt, and the total number of analyses in thesecond step will only get multiplied by a factor of N I instead of several thousands. We choose N I = 8 inour analyses. We ascertained that when using these N I = 8 ﬁts, both the statistical and the systematicerrors are approximately the same as when considering all ﬁts.62 w , M ss and ∆ M In this section we describe brieﬂy the details of the global ﬁts that are used to obtain the physical values of w , M ss and ∆ M from the experimental values of hadron masses, including the mass of the Ω baryon. Inall three cases we use the Type-I ﬁt function of Equation (137), which can be related to isospin-breakingderivatives as described in Equation (145). The set of parameters, that are used in these ﬁts, can be readoﬀ from Table 13. For a given observable some of the parameters are included in all ﬁts, some never, andthere are also some that are either included or excluded. A systematic error is associated with the latterand is given in the Table.In the case of w , the observable we ﬁt is Y = w M Ω . Since w M Ω is symmetric under u ↔ d exchange, no leading order strong-isospin-breaking terms can appear. Thus we can set the strong-isospin-breaking coeﬃcient ( D ) to zero.To account for the systematic error due to the diﬀerent continuum extrapolations we apply both linearand quadratic functions in the isospin-symmetric component, also we skip zero/one/two/three of thecoarsest lattice spacings in the linear and zero/one/two lattice spacings in the quadratic ﬁts. For the tinyvalence QED component only linear ﬁts are applied, with zero/one/two skips; for the even smaller seaQED contributions we have either constant or linear ﬁt with all lattice spacings.The systematic error of the hadron mass ﬁts is taken into account by 24 diﬀerent combinations of theﬁt ranges: three for the M Ω mass, two for the pseudoscalars, two for the isospin breaking of the M Ω andtwo for the isospin breaking of the pseudoscalars. The pseudoscalar ﬁt ranges are given in Table 5. Forthe Ω mass we use two ﬁt ranges from the four-state ﬁt and one from the GEVP procedure that are givenin Table 6. The ﬁt ranges for the isospin breaking components can be found in Table 9. To account forthe experimental error on M Ω we carry out the analysis with two diﬀerent experimental values: one thatcorresponds to the central value plus the experimental error; the other with this error subtracted.Altogether, these yield a total of 129024 ﬁts. When the diﬀerent analyses are combined into ahistogram to determine the systematic error, the results from diﬀerent ﬁt functions or lattice spacing cutsare weighted with the Akaike Information Criterion, the rest with ﬂat weighting. We obtain [ w ] ∗ = 0 . fm , (161)where the ﬁrst error is statistical, and the second is systematic, the third is the total error; we reach arelative precision of . %. The split up of the error into diﬀerent sources can be found in Table 13. InFigure 20 we show the various isospin components of w M Ω against the lattice spacing squared togetherwith the diﬀerent continuum extrapolations. For the electric derivatives we took the deﬁnition in Equation(147). Our result (161) is in good agreement with earlier four-ﬂavor determinations: w = 0 . fmof [88] and w = 0 . (cid:0) +15 − (cid:1) fm of [89]. In these works the isospin-breaking eﬀects were only estimated,whereas in our case they are fully accounted for.The same procedure is used for M ss as for w . We actually work with Y = ( M ss /M Ω ) instead of M ss /M Ω , since the ﬁt qualities are much better in the ﬁrst case. The 129024 diﬀerent ﬁts give [ M ss ] ∗ = 689 . MeV , (162)with statistical, systematic and total errors as above. The error budget can be found in the second columnof Table 13 and the continuum extrapolations for M ss are shown in Figure 21.Finally we also carry out the analysis for Y = ∆ M /M with ∆ M = M dd − M uu . Since thisobservable has no isospin-symmetric part, the A , B and C coeﬃcients are set to zero. Also, since this isan isospin splitting eﬀect, no electromagnetic sea-sea eﬀects can contribute, so the ﬁt function becomes: ∆ M M = D (cid:18) ∆ M K M (cid:19) + Ee v + F e v e s (163)Diﬀerently from the ﬁts earlier we use a -quadratic ﬁts also in the D and E coeﬃcients. We apply four63 [fm] M ss [MeV] ∆ M [MeV ]median 0.17236 689.89 13170total error 70 (0.4%) 49 (0.07%) 420 (3.2%)statistical error 29 28 320systematic error 63 40 270 M π /M K /M ss ﬁt < M π /M K /M ss ﬁt QED 4 2 140 M Ω ﬁt 16 4 0 M Ω ﬁt QED 9 < M Ω experimental 5 1 < A on/oﬀ on on oﬀ A on/oﬀ on on oﬀ A on/oﬀ 60 38 oﬀ B on/oﬀ 2 3 oﬀ B on/oﬀ oﬀ oﬀ oﬀ C on/oﬀ on on oﬀ C on/oﬀ 9 5 oﬀ D on/oﬀ oﬀ oﬀ on D on/oﬀ oﬀ oﬀ on D on/oﬀ oﬀ oﬀ 40 D l on/oﬀ oﬀ oﬀ 40 D s on/oﬀ oﬀ oﬀ 40 E on/oﬀ on on on E on/oﬀ 18 2 on E on/oﬀ oﬀ oﬀ 40 E l on/oﬀ 7 < E s on/oﬀ 10 2 70 F on/oﬀ on on on F on/oﬀ < < G on/oﬀ on on oﬀ G on/oﬀ 2 3 oﬀTable 13: Physical values and error budgets for w , M ss and ∆ M . The errors are to be understood onthe last digits of the central value, as usual. Both statistical and systematic uncertainties of these Type-Iﬁts are propagated to the Type-II ﬁts. The systematic uncertainties below the dashed line are propagatedby choosing N I = 8 representative ﬁts, as described in the text.64 w [fm] × w [fm] × E-0.06-0.030.000.030.06 qed sea-val, e F-0.50-0.250.000.250.500.000 0.005 0.010 0.015 0.020qed sea-sea, e G a [fm ] Figure 20: Continuum extrapolations of the contributions to w M Ω . From top to bottom: isospin-symmetric, electromagnetic valence-valence, sea-valence and sea-sea component. The results are multi-plied by / [ M Ω ] ∗ . For the deﬁnitions of the components see Equations (145) and (147). The electricderivatives are multiplied by e ∗ . Dashed lines are continuum extrapolations corresponding to the latticespacing dependent part of the A , E , F and G coeﬃcients. They are illustrative examples from our severalthousand ﬁts. Only the lattice spacing dependence is shown: the data points are moved to the physicallight and strange quark mass using the X l and X s dependent terms in the ﬁt. This adjustment varies fromﬁt to ﬁt, the red datapoints are obtained in an a -linear ﬁt to all ensembles. If in a ﬁt the adjusted pointsdiﬀered signiﬁcantly from the red points, we show them with grey color. The ﬁnal result is obtained froma weighted histogram of the several thousand ﬁts. 65 isospin symm, A M [MeV ]M [MeV ] -950-900-850-800-750 qed val-val, e E-40-20 0 20 40 qed sea-val, e F-200-100 0 100 2000.000 0.005 0.010 0.015 0.020qed sea-sea, e G a [fm ] Figure 21: Continuum extrapolations of the contributions to M ss . Plotted is the ratio M ss /M multipliedby [ M Ω ] ∗ . Other details as in Figure 20. 66 .02.12.22.3 ∆ M / ∆ M K2 , D400060008000 qed val-val, e E ∆ M ∆ M [ M e V ] -100-500501000.000 0.005 0.010 0.015 0.020qed sea-val, e F [ M e V ] a [fm ] Figure 22: Continuum extrapolations of the contributions to ∆ M . From top to bottom: [∆ M ] (cid:48) m / [∆ M K ] (cid:48) m , electromagnetic valence-valence and sea-valence components of ∆ M /M . Theelectric derivatives are multiplied by [ e M ] ∗ . Other details as in Figure 20.lattice spacing cuts by skipping zero/one/two/three of the coarsest lattices for linear ﬁts and three cutsfor quadratic ﬁts with zero/one/two skips. Other systematics were treated as in the above ﬁts. Altogetherwe have 3328 ﬁts, which give a central value with statistical, systematic and total errors as: [∆ M ] ∗ = 13170(320)(270)[420] MeV . (164)The corresponding error budget can be found in Table 13. In Figure 22 we show continuum extrapolationsfor the ∆ M / ∆ M K ratio and the valence-valence and sea-valence electric derivatives; these correspondto the D , E and F coeﬃcients in the ﬁt function.

22 Alternatives to the M Ω scale-setting To check our determination of the physical value of w , we consider two other scale-setting quantities:the pion decay constant f π and the Wilson-ﬂow scale t . Both approaches are independent from thesystematics of the M Ω mass determination. We also investigate a eﬀects by changing the deﬁnitionof w by lattice artefacts. We do not consider f π and t in our ﬁnal analysis, because their relation toexperiments is indirect. We work in the isospin-symmetric limit throughout this section.In current lattice simulations it is common to use the pion decay constant for scale setting. Thisobservable however is well deﬁned only in the absence of electromagnetism, and thus useful only insimulations in the isospin-symmetric point. It is possible to connect the experimental decay rate of thepion to an isospin-symmetric pion decay constant, f π . Current state-of-the-art uses a chiral perturbationtheory based approach, which yields f π = 130 . MeV [74]. There are also computations underwayto determine f π [90–92] on the lattice. In these approaches the isospin-symmetric point is deﬁned usingrenormalized quark masses, which is diﬀerent from our hadronic scheme in Section 6. When turning on theelectromagnetic interaction, our scheme keeps certain neutral hadron masses and w constant, contraryto the one used in the f π -based scheme, where the renormalized quark masses and the strong couplingare ﬁxed.Here we carry out an analysis to determine an isospin-symmetric value of [ w ] isoq using the above f π [f m ] a [fm ] [w ] isoq from t [w ] iso from M Ω [w ] isoq from f π Figure 23: Continuum extrapolation of the isospin-symmetric value of w using three diﬀerent inputs: t from the lattice work [89], M Ω from experiment [74] and f π from a combination of chiral perturbationtheory and experiment [74]. The dashed lines are quadratic and cubic functions of a in case of t , andlinear and quadratic otherwise. The colored shaded regions around a = 0 . fm correspond to theuncertainty in the input quantity. The horizontal grey shaded region is our ﬁnal w determination fromEquation (161). Note, there is a diﬀerence in the deﬁnition of the isospin-symmetric point in the diﬀerentinputs.as input. We introduce the notation isoq to emphasize the diﬀerence from our deﬁnition of the isospin-symmetric value [ w ] iso = [ w ] ∗ . To obtain [ w ] isoq we also need a pion and kaon mass that is puriﬁedfrom isospin-breaking eﬀects. For these we take M π = 134 . MeV and M K = 494 . MeV [9].The ﬁt procedure is similar to the Type-I ﬁts that we performed before for w M Ω . The physical pointis given by the f π , M π and M K values above. Since we work with the isospin-symmetric component, onlythe A , B and C coeﬃcients of Equation (137) are kept. We apply both linear and quadratic ﬁts in a ,with the ususal cuts in the lattice spacing. Figure 23 shows representative ﬁts from this analysis, withgood ﬁt qualities. The continuum extrapolated values are consistent with our [ w ] ∗ from Equation (161).However the spread between the diﬀerent continuum extrapolations is smaller, since the curvature of w f π in a is smaller than in w M Ω .Another way to determine w is to take the t -scale, also deﬁned from the Wilson-ﬂow, as input.This determination basically computes the w /t ratio. For the physical value of t we use [ t ] isoq =0 . (cid:0) +8 − (cid:1) fm from [89], which has a precision of about . . The same analysis is carried out asbefore, with the diﬀerence that now we also include cubic ﬁts in a , since the data shows a very strongcurvature and the linear ﬁts have a bad quality. Figure 23 shows representative ﬁts, giving continuumvalues consistent with using M Ω as input, Equation (161).Finally we show here a method to determine [ w ] iso , which is also based on M Ω as an input parameter,but uses the idea of a t-shift in the Wilson ﬂow [93]. The main reason for this analysis is to determinewhether the strong quadratic upward trend in w for small lattice spacings, see top panel of Figure 20,is a genuine cutoﬀ eﬀect? Indeed, the Wilson ﬂow is known to have a transient for small ﬂow times.Although the aﬀected region shrinks as one approaches the continuum limit, the eﬀect might be sizableparticularly if we want to reach an accuracy on the few per-mil level.The t-shift in the Wilson-ﬂow replaces (cid:104) t E ( t ) (cid:105) with (cid:104) t E ( t + sa ) (cid:105) , which is essentially applying theﬂow on a smeared gauge ﬁeld (pre-smearing). It can be interpreted as an improved operator for the energydensity. Obviously, in the continuum limit ﬂows with or without t-shifts are the same. We measured a68 w ] i s o [f m ] a [fm ] w t [s ]/t [s ]w Figure 24: Continuum extrapolation of the isospin symmetric value of w using M Ω . Two w deﬁnitions,the standard one (red circles), and another modiﬁed by the t [ s ] /t [ s ] ratio (green squares) are shown.The ratio approaches 1 in the continuum limit.combination, w · t ( s ) /t ( s ) , which obviously gives back w in the continuum limit. Clearly, thiscombination has a diﬀerent lattice spacing dependence than the original w , determined in the previousSection. There are several s , s choices, which eliminate the strong a behaviour, the upward turningof w for small lattice spacings. They do so without changing the result in the continuum limit withinerrors. This ﬁnding indicates that the upward trend is indeed a cutoﬀ eﬀect related to the Wilson ﬂowand can be removed by modiﬁed operators of the Wilson ﬂow. As an illustration we show s = 0 . and s = 0 . in Figure 24.The latter procedure reduces the coeﬃcient of the a term in the continuum extrapolation and, as aconsequence, could also reduce the error on w . Since the t-shift method is a somewhat unconventionalway to determine w , we leave it as an illustration of how cutoﬀ eﬀects can play a role and we quote ouroriginal w , Equation (161), with the larger error as our ﬁnal result.

23 Results for a µ and its various contributions In this section we present results for the strange, light and disconnected components of a µ in the continuumand inﬁnite-volume limits. We perform the two limits in two separate steps. We introduce a reference boxwith spatial extent L ref = 6 . fm and time extent T ref = L ref . These correspond approximately to thesize of our boxes in the ensemble set. In this reference box we perform the continuum extrapolationfor each ﬂavor component. The ﬁnite-size eﬀect of the reference box is then added in the second step. Forthis we prepared dedicated lattice simulations, including a large box of size L big = T big = 10 . fm, asdiscussed in Section 17. The simulations give the diﬀerence a µ ( L big , T big ) − a µ ( L ref , T ref ) , which are in goodagreement with non-lattice estimates. For the tiny residual ﬁnite-size eﬀect, a µ ( ∞ , ∞ ) − a µ ( L big , T big ) ,the predictions of the non-lattice approaches are taken.In this section we use both Type-I and Type-II parametrizations from Section 19 to perform the globalﬁts. They give compatible results for the observables, however in most cases the Type-I results are moreprecise. This can be partly explained by the relatively large error on the w value, Equation (161), whichis caused by the strong curvature of w M Ω at small lattice spacings. We will take the ﬁnal result fromthe Type-I ﬁt, and use the Type-II ﬁt to perform the isospin decomposition. To get the isospin-symmetric69alue we take the total result from the Type-I ﬁt and substract the isospin-breaking contributions obtainedfrom the Type-II ﬁt.Type-II ﬁts require the physical values of w , M ss and ∆ M as input. These were determined inthe previous Section. From those ﬁts we keep the 24 diﬀerent possibilites related to the hadron massdeterminations. The remaining systematic variations of these are represented by N I = 8 suitably chosenﬁt combinations, as discussed in Section 20.Some of the ﬁt parameters are included in all ﬁts, some never used, and there are also ones that areincluded in half of the ﬁts and excluded in the other half. Which of these options is applied for a givenparameter is decided by looking at the inﬂuence of the parameter on the ﬁt result. The options chosencan be read oﬀ from Table 14. Connected strange contribution

The strange contribution to the connected component of a µ , denoted by a strange µ , is obtained from thestrange ﬂavor term of the connected contractions C strange given in Equations (59) and (61). Its isospin-symmetric component, as well as its electromagnetic isospin-breaking derivatives are given in Table 10.The propagator is then summed over space to project to zero momentum and in time with a weight factor: a strange µ = 10 α T/ (cid:88) t =0 K ( t ; aQ max , am µ ) 16 (cid:88) (cid:126)x,µ =1 , , (cid:68) C strange µ,t,(cid:126)x ; µ, + C strange µ,T − t,(cid:126)x ; µ, (cid:69) , (165)see Equations (64) and (69). Strong-isospin-breaking does not enter in a strange µ , so the D coeﬃcient canbe set to zero in the Type-I ﬁt function. The systematic error estimation was carried out as in the casesof w and M ss in Section 21. The diﬀerences are, that E is always kept and C is not used. Altogetherwe have 32256 ﬁts: the continuum limit and ﬁt-form-related variations are weighted with AIC, the restwith a ﬂat distribution. In the continuum limit we get a strange µ ( L ref , T ref ) = 53 . , (166)with statistical, systematic and total errors. The result is obtained in a ﬁnite box and the ﬁnite-sizecorrection term will be added in a later step. The error budget for the total a strange µ is given in Table 14.We also perform a Type-II ﬁt, from which we obtain the diﬀerent isospin contributions in Table 15. Thecorresponding continuum extrapolations are shown in Figure 25. Connected light contribution

The contribution of the light ﬂavors to the connected part of a µ , denoted by a light µ , is given by replacing thestrange contraction C strange with the connected light quark contraction C light in Equation (165). In theisospin-symmetric part a bounding procedure is applied on the propagator to reduce the noise as discussedin Section 13. In the isospin-breaking parts we apply a cut in time, beyond which the propagator is setto zero, see Section 14. Two diﬀerent cuts, given in Table 11, are used to estimate the correspondingsystematic error.As explained in detail in Section 18, the continuum extrapolation is carried out by ﬁrst applying acorrection to a light µ on each ensemble. This is necessary in order to remove large cutoﬀ eﬀects relatedto taste violations. Two diﬀerent procedures are used for this purpose: one based on NNLO staggeredchiral perturbation theory (Section 15) and another based on the Lellouch-L¨uscher-Gounaris-Sakurai model(Section 16). Both of these provide a good description of the lattice artefacts for the long distance partof the propagator. For shorter distances the lattice artefacts are much smaller in relative size. There agood description of the discretization eﬀects is given by either NLO staggered chiral perturbation theoryor by no improvement at all. In particular we use no improvement below t = 0 . fm, NLO SXPT between t and t = 1 . fm and either NNLO SXPT or SLLGS above t . To assess the systematics we also use70 strange µ ( L ref , T ref ) a light µ ( L ref , T ref ) a disc µ ( L ref , T ref ) median 53.379 640.3 -18.37total error 111 (0.2%) 4.4 (0.7%) 1.58 (8.6%)statistical error 89 2.6 1.15systematic error 67 3.6 1.09 M π /M K /M ss ﬁt 5 < M π /M K /M ss ﬁt QED 3 0.1 < M Ω ﬁt 56 0.3 0.04 M Ω ﬁt QED 2 0.1 < M Ω experimental 5 0.1 0.01 t c in Table 11 – 0.3 0.23NNLO SXPT vs SLLGS – 3.7 0.02window borders – 1.1 0.04Continuum limit (beta cuts) 47 0.5 0.85 A on/oﬀ on on on A on/oﬀ on on on A on/oﬀ 26 0.1 oﬀ B on/oﬀ 11 on 0.10 B on/oﬀ oﬀ oﬀ oﬀ C on/oﬀ on on on C on/oﬀ oﬀ oﬀ oﬀ D on/oﬀ oﬀ on on D on/oﬀ oﬀ on on D on/oﬀ oﬀ oﬀ oﬀ D l on/oﬀ oﬀ 0.2 0.02 D s on/oﬀ oﬀ on oﬀ E on/oﬀ on on on E on/oﬀ on < E on/oﬀ oﬀ oﬀ oﬀ E l on/oﬀ 3 < E s on/oﬀ < F on/oﬀ on on on F on/oﬀ 1 < G on/oﬀ on on on G on/oﬀ 3 < a µ . The errors are to be understood on the last digits of the central value, as usual. Theresults correspond to a box size L ref = 6 . fm and T ref = L ref . a strange µ ( L ref , T ref ) a light µ ( L ref , T ref ) a disc µ ( L ref , T ref ) total 53.379(89)(67) 640.3(2.6)(3.6) -18.37(1.15)(1.09)iso 53.393(89)(68) 634.6(2.7)(3.7) -13.15(1.28)(1.29)qed -0.0136(86)(76) -0.92(34)(43) -0.58(14)(10)qed-vv -0.0086(42)(41) -1.28(40)(33) -0.55(15)(11)qed-sv -0.0014(11)(14) -0.0080(85)(98) 0.011(24)(14)qed-ss -0.0031(76)(69) 0.42(20)(19) -0.047(33)(23)sib – 6.59(63)(53) -4.63(54)(69)Table 15: Continuum extrapolated results for the diﬀerent components of the strange, light and discon-nected contributions to a µ . The results correspond to a box size L ref = 6 . fm and T ref = L ref .71trange, a strange µ ( L ref , T ref ) a light µ ( L ref , T ref ) a disc µ ( L ref , T ref ) -18.37(1.15)(1.09) this work, Equation (168)ﬁnite-size, a µ ( ∞ , ∞ ) − a µ ( L ref , T ref ) [ a charm µ ] iso [ a charm µ ] qed a disc µ < a bottom µ a pert µ − a γ R µ -0.321(11) [95], Table IITable 16: List of contributions to a µ , ie. the leading order hadronic vacuum polarization contribution tothe muon anomalous magnetic moment multiplied by .a ( t , t ) pair, where both values are . fm larger. The corresponding errors can be found in the errorbudget of Table 14. This improvement is only applied to the isospin-symmetric component.There is a small variation in the size of the lattices between diﬀerent ensembles. We correct for this byadding a shift, computed from the ﬁnite- L and T versions of NNLO SXPT or the SLLGS model, in such away that the results correspond to the same box size L = L ref and T = T ref . This is done simultaneouslywith the correction for the cutoﬀ eﬀects, see Equation (134).The above variations in the analysis procedure yield 258048 Type-I ﬁts. We apply the usual weighting,in particular the NNLO SXPT and SLLGS improved results are given the same weight. For the total lightconnected contribution we get a light µ ( L ref , T ref ) = 640 . . . . , (167)with statistical, systematic and total errors. The used ﬁt parameters and the error budget is given in Table14. Performing analogous Type-II ﬁts we get the breakup into individual isospin contributions, given inTable 15. The corresponding continuum extrapolations are shown in Figure 26. Disconnected contribution

The disconnected contribution, denoted by a disc µ , is obtained using Equation (165) with C disc , given inEquation (60), instead of C strange . In our previous work, at the isospin-symmetric point [47], we computedthe eﬀect of charm quarks on a disc µ at the coarsest lattice spacing, and found that it changes the resultby a value much smaller than the statistical error. Thus, we perform the current disconnected analysiswithout taking into account valence charm quarks.The same analysis procedure is applied as in the case of a light µ : we use upper and lower bounds for theisospin-symmetric part, a cut in time for the isospin breaking components and we improve the continuumlimit either with NNLO SXPT or SLLGS. The NNLO SXPT results are almost completely ﬂat as a functionof a , as such the quadratic ﬁts give a curvature that is consistent with zero. We therefore use only a -linear continuum extrapolations. A further diﬀerence is that we have one less lattice spacing as in thecase of a light µ . We end up with 27648 ﬁts in total and a result of a disc µ ( L ref , T ref ) = − . . . . . (168)The error budget is given in Table 14. The Type-II ﬁt gives the individual contributions in Table 15, andthe continuum extrapolations are shown in Figure 27.72 inite-size eﬀects and other contributions Beside the three contributions that we have presented until now, there are a number of smallers onesthat have to be added to get the ﬁnal value on a µ . These are listed in Table 16 together with sourceindications. Several of these have a size that is much smaller than our accuracy. They are given here forcompleteness. We discuss them, now, one by one.The previously presented results correspond to the reference box. Its ﬁnite-size eﬀect is computedin Section 17. This is the fourth entry in Table 16, which already includes the tiny contribution of theelectromagnetic ﬁnite-size eﬀects.The contribution of the connected charm quark, a charm µ , was computed in the isospin-symmetric limitby many groups. Here we use our own result from [47]. An upper bound on the small eﬀect of the charmon a disc µ was given in [47]. We use the value as an error here. The result was obtained using a singlelattice spacing. The even smaller isospin-breaking corrections on a charm µ was computed in [55].Until now we have considered four quark ﬂavors. Obviously, the contributions of the remaining ﬂavorshave to be added. For the bottom quark contribution there is a lattice determination available, [94],whose value we use here. The top can be safely neglected at our level of precision.The a µ in this work involves an integration in momentum up to Q = 3 GeV , as discussed inSection 11. The integration from this value to inﬁnity can be computed in perturbation theory. We usethe value given in [47].As mentioned in Section 11, we compute the current propagator with all O ( e ) eﬀects included. Inthis result the one-photon-reducible ( γR ) contribution belongs to the higher order HVP. This term hasto be subtracted if we are interested in the leading order HVP. A recent lattice determination of this termcan be found in [95]. The corresponding diagram was labeled by the letter ’c’ in their Figure 2.Summing all the contributions gives a µ = 708 . . . . , (169)which is our ﬁnal result for the LO-HVP contribution. The ﬁrst error is statistical. It includes the statisticalerrors of the strange, light and disconnected contributions. The latter two are the dominant ones. Allother uncertainties are added in quadrature, and this is given as the second, systematic error. Its majorsources are the ﬁnite-size eﬀect estimation and the continuum extrapolations. The last error in bracketsis the combined error, which corresponds to a relative precision of . .73 a µ strange a µ strange -0.04-0.03-0.02-0.010.00 qed val-val, e E-0.006-0.0030.0000.0030.006 qed sea-val, e F-0.04-0.020.000.020.040.000 0.005 0.010 0.015 0.020qed sea-sea, e G a [fm ] Figure 25: Continuum extrapolations of the contributions to a strange µ ( L ref , T ref ) . For more explanationssee the caption of Figure 20. 74 a µ light a µ light SLLGS-winNNLO-win2468 strong ib, [w ∆ M ] D-3-2-10 qed val-val, e E-0.10-0.050.000.050.10 qed sea-val, e F-1.0-0.50.00.51.00.000 0.005 0.010 0.015 0.020a [fm ]qed sea-sea, e G Figure 26: Continuum extrapolations of the contributions to of a light µ ( L ref , T ref ) . For more explanationssee the caption of Figure 20. In case of the light contribution we perform the analysis with two diﬀerentimprovement techniques, the corresponding data sets are denoted by the red and blue colors.75 a µ disc a µ disc NNLO-winSSLGS-win-6-5-4-3-2-10 strong ib, [w ∆ M ] D-1.5-1.0-0.50.0 qed val-val, e E-0.10-0.050.000.050.10 qed sea-val, e F-0.10-0.050.000.050.100.000 0.005 0.010 0.015 0.020a [fm ]qed sea-sea, e G Figure 27: Continuum extrapolations of the contributions to of a disc µ ( L ref , T ref ) . For more explanationssee the caption of Figure 20. In case of the light contribution we perform the analysis with two diﬀerentimprovement techniques, the corresponding data sets are denoted by the red and blue colors.76

198 200 202 204 206 208 210 212 214 R -r a t i o / l a tt i c e R B C ’ A ub i n ’ B M W c ’ [ a li gh t µ , w i n ] i s o w / o i m p r o v e m en t N L O S X P T i m p r o v e m e n t a [fm ] Figure 28: Continuum extrapolation of [ a light µ, win ] iso . Two types of improvements are shown: one whereNLO SXPT is used to improve the lattice result and another where no improvement is performed. Foreach, some continuum extrapolations are shown as illustration with dashed lines. They include ﬁts linearand quadratic in a and also where diﬀerent numbers of coarse lattices are skipped in the ﬁt. The datapoints on the plot are corrected for light and strange quark mass eﬀects, this adjustment is diﬀerent fromﬁt to ﬁt. The red points belong to a quadratic ﬁt to all ensembles. Our ﬁnal value in the continuum limitcomes from a histogram of several thousand ﬁts and is given by the ﬁlled red circle in the left panel. Theresults are corrected for ﬁnite-size eﬀects using Equation (173). Other lattice computations are shownwith a green box [51] and a blue triangle [48]. A value computed from the R-ratio method is also given(see text for details).

24 Result for a µ, win The work [48] deﬁned a particularly useful observable a µ, win , in which the current propagator is restrictedto a time window [ t , t ] , using a smooth weight function W ( t ; t , t ) . See Section 11 for the deﬁnitionof W . The advantage of a µ, win over a µ is that, by choosing an appropriate window, the calculation canbe made much less challenging on the lattice than for the full a µ . Here we will be interested in thewindow between t = 0 . fm and t = 1 . fm, ie. in an intermediate time range. By this choice weeliminate both the short-distance region, where large cutoﬀ eﬀects are present, and the long-distanceregion, where the statistical uncertainties, taste violations and ﬁnite-size eﬀects are large. Because thedetermination of a µ, win does not require overcoming many of the challenges described in the main paper,other lattice groups have obtained this quantity with errors comparable to ours [48, 51]. This allows fora sharper benchmarking of our calculation. At the same time a µ, win can also be computed using thephenomenological approach. This is done in Section 25. Therefore, a µ, win is also a powerful tool tocompare the results of lattice and phenomenological computations.To compute a µ, win on the lattice we performed similar global ﬁts that were used to get a µ in Section23. A major diﬀerence is that, in case of the light connected contribution we improve the continuum limitdiﬀerently. While for a µ we use corrections from either NNLO SXPT or the SLLGS model, here for a µ, win we take either NLO SXPT or no improvement at all. For this window observable these choices are moreappropriate, as discussed in Section 18, where we compare the diﬀerent improvements in diﬀerent windows.A further diﬀerence compared to the a µ ﬁt procedure is that no cuts are applied on the propagator intime; the window function suppresses the propagator for distances beyond t = 1 . fm.The results for the strange, light and disconnected contributions and diﬀerent isospin breaking correc-77 strange µ, win ( L ref , T ref ) a light µ, win ( L ref , T ref ) a disc µ, win ( L ref , T ref ) median 27.170 207.03 -1.260total error 30 (0.1%) 1.38 (0.7%) 0.097 (7.7%)statistical error 28 0.19 0.037systematic error 13 1.37 0.089 M π /M K /M ss ﬁt 1 < M π /M K /M ss ﬁt QED 1 < M Ω ﬁt 8 0.02 0.038 M Ω ﬁt QED < < < M Ω experimental 2 0.01 0.001Continuum limit (beta cuts) 5 0.98 0.044none vs NLO SXPT – 1.28 0.066 A on/oﬀ on on on A on/oﬀ on on on A on/oﬀ < B on/oﬀ 3 on on B on/oﬀ oﬀ oﬀ 0.010 C on/oﬀ on on on C on/oﬀ 1 0.11 0.045 D on/oﬀ oﬀ on on D on/oﬀ oﬀ on on D on/oﬀ oﬀ oﬀ oﬀ D l on/oﬀ oﬀ 0.01 oﬀ D s on/oﬀ oﬀ on oﬀ E on/oﬀ on on on E on/oﬀ on on on E on/oﬀ oﬀ oﬀ oﬀ E l on/oﬀ < E s on/oﬀ on oﬀ oﬀ F on/oﬀ on on on F on/oﬀ 1 < G on/oﬀ on on on G on/oﬀ 2 0.01 oﬀTable 17: Continuum extrapolated results and error budget for the strange, light and disconnectedcontributions to a µ, win . The errors are to be understood on the last digits of the central value, as usual.The results correspond to a box size of L ref = 6 . fm and T ref = L ref . a strange µ, win ( L ref , T ref ) a light µ, win ( L ref , T ref ) a disc µ, win ( L ref , T ref ) total 27.170(28)(13) 207.03(0.19)(1.37) -1.260(37)(89)iso 27.175(28)(13) 206.19(0.20)(1.37) -0.906(42)(90)qed -0.0050(35)(37) 0.082(46)(34) -0.115(17)(07)qed-vv -0.0018(14)(10) 0.067(30)(21) -0.114(18)(05)qed-sv -0.00142(60)(30) -0.0222(91)(52) 0.0013(23)(00)qed-ss -0.0018(31)(38) 0.033(30)(39) -0.0027(26)(22)sib – 0.757(40)(16) -0.238(10)(06)Table 18: Continuum extrapolated results for the diﬀerent isospin components of the strange, lightand disconnected contributions to a µ, win . The results correspond to a box size of L ref = 6 . fm and T ref = L ref . 78ions are summarized in Table 18 and the error budget for the total values and the ﬁt parameters used, inTable 17. The largest source of error is the continuum extrapolation of the light connected component.However, it is still much smaller than the typical size of uncertainties in the full a µ determination. InFigure 28 we plot the continuum extrapolation of the isospin-symmetric component of a µ, win , both forthe unimproved and the NLO SXPT corrected results.The comparison of lattice results for [ a light µ, win ] iso is particularly interesting, because it allows to benchmarkthe leading, light-quark contribution to a µ through a quantity that can be computed precisely withoutresorting to highly-advanced techniques. Using our result obtained in the reference box from Table 18and correcting for ﬁnite-size eﬀects (see later) we get: [ a light µ, win ] iso = 206 . . . . , (170)with statistical, systematic and total uncertainties. This result is . σ smaller than [ a light µ, win ] iso = 207 . . of Aubin’19 [51]. Here we use the continuum-extrapolated value, which Aubin’19 obtain from their twoﬁnest lattices in the upper panel of their Figure 7, because its errors bars cover the results of the othercontinuum extrapolations that they consider. Compared with [ a light µ, win ] iso = 202 . . of RBC’18 [48] ourresult is . σ larger. These two comparisons yield an average deviation of 1.2 σ . These two lattice resultsare also shown in Figure 28.Additionally, we also made an analysis of the charm quark contribution. The total a charm µ was obtainedin our previous work [47]. Here we perform a Type-II ﬁt for a charm µ, win . Only the isospin-symmetric componentis used and we obtain the following result: [ a charm µ, win ] iso = 2 . . (171)Here the error is the systematic uncertainty: the statistical is an order of magnitude smaller. The isospinbreaking of the charm should be well below the uncertainties of the ﬁt. See Table 16 for the case of a charm µ . Furthermore, in our dedicated ﬁnite-size study with the action we compute the diﬀerence ofthe light contribution between the “big” and reference boxes and obtain a light µ, win ( L big , T big ) − a light µ, win ( L ref , T ref ) = 0 . , (172)where the error is statistical. Applying the same procedure for a µ, win as we did for a µ in the ﬁnite-sizestudy of Section 17, we get for the ﬁnite-size eﬀect: a µ, win ( ∞ , ∞ ) − a µ, win ( L ref , T ref ) = 0 . . (173)The ﬁrst error is statistical, the second is an estimate of the cutoﬀ eﬀect of the action. We ﬁndthat the ﬁnite- T eﬀects are even less important than in the case of the total a µ , where they were alreadymuch smaller than the ﬁnite- L eﬀects.Summing up these contributions we get a µ, win = 236 . . . . (lattice) , (174)where the ﬁrst error is statistical, the second is systematic and the third in the square brackets is the ﬁrsttwo added in quadrature. Further contributions, that are listed in Table 16, should have an eﬀect muchsmaller than the uncertainties of this result.The value can be directly compared to the one obtained from the R-ratio method in Section 25: a µ, win = 229 . . (R-ratio) , (175)which is smaller than the lattice result by . σ or . %. We can also derive an R-ratio result for theisospin-symmetric light contribution. From the value in Equation (175) we subtract the lattice results for79 [ a li gh t µ , w i n ] ( L = f m ) a [fm ]4stout-on-4stoutoverlap-on-4stout Figure 29: Continuum extrapolation of [ a light µ, win ] . The two datasets correspond to the staggered-on-staggered and the overlap-on-staggered simulations.all contributions, except for [ a light µ, win ( L ref , T ref )] iso and its ﬁnite-size correction. We get: [ a light µ, win ] iso = 200 . . (R-ratio & lattice) . (176)This value is compared in Figure 28 to continuum and inﬁnite-volume extrapolated lattice results fromthis work and from other lattice groups. Crosscheck with overlap fermions

We perform a crosscheck of the above results with a mixed action formulation: overlap valence and staggered sea quarks. Our goal is to provide more evidence that the continuum extrapolationand the current renormalization are done correctly in the case. The overlap fermion action, thematching and the current renormalization are described at length in Section 3. Our target is the isospinsymmetric value of the light connected window observable [ a light µ, win ] . We use lattices of size L ≈ fm inthis crosscheck, the related ﬁnite-volume eﬀects are about ﬁve percent.For the measurement of the overlap current propagator we use 512 random wall sources per conﬁg-uration, the trace over color indices is performed randomly. For the staggered current propagator weuse the same noise reduction technique as on the L ≈ fm lattices. In the overlap case we ﬁnd, thatthe coarsest lattice, corresponding to β = 3 . , is outside the a -scaling region. Figure 29 shows theresults together with continuum extrapolations, they are in good agreement for the two formulations. Thescenario, in which replacing staggered by overlap fermions removes the discrepancy between the R-ratioand the lattice result, seems improbable. In order for this to happen, the overlap result would have to getmore than σ smaller than we determine it here.

25 Phenomenological determination of a µ, win The purpose of this section is to describe the computation of the phenomenological result for a µ, win , whichwe compare with the corresponding lattice result in Section 24. For this we use the R-ratio from e + e − collision experiments and the corresponding covariance matrix from the work of KNT18 [70], courteously80ef. a µ KNT18 [70] 693.26(2.46)KNT19 [96] 692.78(2.42)DHMZ17 [97] 693.1(3.4)DHMZ19 [69] 693.9(4.0)CHHKS19 [98, 99] 692.3(3.3)Table 19: Recent phenomenological determinations of a µ . KNT stands for Keshavarzi, Nomura andTeubner; DHMZ for Davier, Hoecker, Malaescu and Zhang; CHHKS for Colangelo, Hoferichter, Hoid,Kubis and Stoﬀer.given to us by Keshavarzi, Nomura and Teubner.As Table 19 shows, there is a sizeable diﬀerence in the uncertainties on a µ in the recent phenomeno-logical literature. While the KNT19 result has an error of 2.42, DHMZ19 gives a 65% larger error of 4.0.Since we do not want to risk overstating possible diﬀerences between the phenomenological and latticeapproaches, we extend the error estimates of KNT18 [70] by two additional sources, bringing them muchcloser to those of DHMZ. a. The ﬁrst source of uncertainty is related to a tension between the two e + e − experiments, KLOE andBaBar, which have the smallest uncertainties in the window from . to . GeV center-of-mass energy.These two experiments exhibit, for the pion-pion channel in that window, a close to σ discrepancy or a . % relative diﬀerence [100]. Note that the overlapping energy region of the pion-pion channel for KLOEand BaBar ( . − . GeV) provides about 70% of the total a µ . The discrepancy, fully accounted forin DHMZ19 [69], has a strong impact on the discrepancy between the measurement of g µ − and theorypredictions based on the R-ratio.In order to address this discrepancy, we follow the prescription of the Particle Data Group (PDG)for similar tensions between experimental results [74]. After calculating the weighted average of all theexperimental results in the . − . GeV energy range [100], the PDG prescription tells us to adjust theerror by a factor of S = [ χ / ( N − / , where N is the number of experiments (in our case N = 5 ). Thisyields an uncertainty of . instead of the . of [70]. This uncertainty is far less than half the diﬀerencebetween BaBar and KLOE, because the other, less precise experiments dilute the discrepancy. We includethis increased error estimate as (1 . − . ) / = (1 . ππ , where “ ππ ” denotes the uncertainty comingfrom the tension between experiments in the pion-pion channel. Note that the other experimental channelsmay have similar uncertainties, which would increase the error further.It is worth pointing out that hadronic τ decays can be used, in principle, to provide an independentmeasurement of the spectral function in this important low-energy region, as ﬁrst proposed in [101] andupdated in [102–104]. However, this requires controlling isospin-breaking corrections [105–109], which isa challenge and has led to putting this approach aside in the last few years. b. Another possible source of uncertainty comes from the way in which the dispersive integral of theexperimental data for the various, ﬁnal-state channels is performed, including correlations. KNT [70, 96]use a trapezoidal rule and argue that the error resulting from this choice is negligible. They also takeinto account correlations in systematic uncertainties within the same experiment and between diﬀerentexperiments, as well as within and between diﬀerent channels, over extended ranges of center-of-massenergy. On the other hand, DHMZ [69] limit the eﬀects of these correlations to small energy bins anduse splines for integrating the data, correcting for biases if necessary. The end result is that, despite usingthe same experimental input, the two teams ﬁnd results for the various channels which diﬀer, more oftenoutside the error bars of [70] than of [69]. To account for this when using the correlation matrices from[70], we follow a suggestion put forward at the last “Muon g-2 Theory Initiative” meeting [110]. We add,to results obtained with these correlations, an uncertainty obtained by summing, in quadrature, half thediﬀerences of the individual channels. This gives an additional error of (2 . int , where “ int ” stands forthe uncertainty related to the integration and correlation procedures.81hus, using the well known dispersive integral (see eg. Section 5 of [111]) a µ = 10 (cid:16) αm µ π (cid:17) (cid:90) ∞ s th dss R ( s ) ˆ K ( s ) , (177)the experimental R-ratio data set of [70] and the perturbative R-ratio from the rhad package [112], weobtain, a µ = 693 . . stat (1 . ππ (2 . int [3 . , where the ﬁrst error reproduces the one given in [70],while the second and third errors are computed above. The last error, in brackets, is the quadraticallycombined error of the ﬁrst three. It is larger than the one of KNT and is closer to the error of DHMZ17,but still a bit smaller than that of DHMZ19. This enlarged error, as well as the most recent value of . from DHMZ, reduce a bit the strong tension between the measurement of g µ − and the theorypredictions based on the R-ratio method.Having checked that we are able to reproduce well known R-ratio results, we repeat the whole procedurefor a µ, win of Equation (71). For this observable, high statistical precision is easier to reach on the latticeand the continuum and inﬁnite-volume extrapolations are less diﬃcult. In addition, we expect that theR-ratio method yields a similar relative error for a µ, win as for a µ . These facts make a µ, win a convenientobservable to compare the two approaches and, eventually, to combine them for improved overall precision[48].To determine a µ, win from R-ratio data, we transform the latter to Euclidean coordinate space by aLaplace transform [42], where a weighted integral with weight function K ( t ) W ( t ) has to be performed,as described in Section 11. One ends up with an integral as in Equation (177), but the kernel ˆ K replacedby ˆ K win : ˆ K win ( s ) = 3 s / m µ (cid:90) ∞ dt e −√ st K ( t ) W ( t ; t , t ) , (178)where K ( t ) is given by Equation (3) of the main paper. The window parameters t = 0 . fm, t = 1 . fmand ∆ = 0 . fm are the same as in Section 24. We proceed with the computation of the s -integral asin the case of a µ . In particular, we include the ππ and int errors as follows: a. Repeating the R-ratio method, with the ˆ K win kernel, gives the same relative diﬀerence betweenthe KLOE and BaBar results, ie. . , as with the original kernel function ˆ K . Carrying out the PDGprocedure for adjusting errors, we obtain a value of . for the additional “ ππ ” error. b. Since we do not have the contributions of the individual experimental channels in our window forboth the KNT and the DHMZ frameworks, we simply scale down the “ int ” error of the full a µ . Thus,instead of . , we obtain . as an “ int ” uncertainty.Putting the above components together and also including the tiny perturbative contribution from[112], we obtain a µ, win = 229 . . stat (0 . ππ (0 . int [1 . . (179)The last error, in brackets, is all errors added in quadrature. This value is compared with our lattice resultin Section 24.

26 Consequences for electroweak precision observables?

In this section we investigate the claim, put forward by Crivellin, Hoferichter, Manzari and Montull(CHMM) [113], that the result of this paper may lead to a signiﬁcant tension with electroweak preci-sion ﬁts for the hadronic contribution to the running of the electromagnetic coupling, ∆ α (5)had ( M Z ) . Here“ (5) ” means that we consider ﬁve active ﬂavors and M Z is the mass of the Z -boson. CHMM’s analysis isbased on the earlier work of [114] that studies the impact, on precision electroweak ﬁts, of increasing theHVP contribution to reproduce the measured value of ( g µ − / . That work has been recently updated82 ∆ α x KNT18+rhadlattice incl. bottom-0.50.00.51.01.52.0 0...1 1...10 10...100[GeV ] 100...1000 1000...M [Crivellin:2020zul] [ ∆ α - ∆ α K N T ] x proj( ∞ )proj(1.94 GeV) Figure 30: Hadronic contribution to the running of the electromagnetic coupling for Euclidean momenta.For each bin we show ∆ α (5)had ( − q ) − ∆ α (5)had ( − q ) , where q / max is the lower/upper end of the bin.The conventional value of ∆ α (5)had ( M Z ) can be obtained by summing the results of all bins and performinga rotation to Minkowswki space (see text). The upper panel shows results obtained from the experimentalR-ratio of KNT18 [70] and perturbation theory [112], as well as our lattice results. The bottom panelcompares the results with the R-ratio as a baseline. The “reference point” scenario of CHMM [113](“proj( ∞ )”) is shown with crosses. Their “proj( . GeV)” scenario is shown with bursts.and expanded in [115]. In contrast, [116] shows that the shift in ∆ α (5)had ( M Z ) , induced by the sole knowl-edge of the diﬀerence between the lattice and the R-ratio prediction for a µ , is at least nine times smallerthan the error in the electroweak ﬁt determination of ∆ α (5)had ( M Z ) , given in [113] and in Equation (180)below.Indeed, the authors of [113] perform a global ﬁt to electroweak observables and obtain ∆ α (5)had ( M Z ) = 270 . . × − (electroweak) . (180)Note that this value is somewhat smaller, both in value and in uncertainty, than the latest result of theGﬁtter group [117]. The same observable can also be obtained from the experimental R-ratio [69, 96],eg. using the KNT19 result [96] as do CHMM: ∆ α (5)had ( M Z ) = 276 . . × − (R-ratio) , (181)which is . σ higher than the electroweak-ﬁt value.CHMM then consider a variety of scenarios to estimate the possible value of ∆ α (5)had ( M Z ) from theresults given in the present paper. In their “reference point” scenario (“proj ( ∞ ) ”), they assume that therelative diﬀerence between the R-ratio and our result for a µ corresponds to an energy-independent rescalingin the e + e − → hadrons spectral function for all center-of-mass energies, from threshold to inﬁnity. Thus,they obtain ∆ α (5)had ( M Z ) = 276 . . × − × . . × − (CHMM) , (182)which deviates by . σ from the electroweak-ﬁt value. This leads CHMM to conclude that the result of83 light Π strange Π charm Π disc ∞ − ref Π bottom Π (5) Π(1) 355 . .

3) 41 . .

1) 17 . . − . .

6) 2 . .

2) 0 . .

1) 412 . . − Π(1) 363 . .

2) 67 . .

3) 96 . . − . .

2) 0 . .

0) 2 . .

5) 530 . . Table 20: Continuum extrapolated lattice results for the HVP. We give separately the light, strange, charmand disconnected contributions and also the ﬁnite-size eﬀects as computed from the simulations.The bottom quark contribution was obtained from the work [94]. The last column is the sum of all.the present paper, while removing the discrepancy with the experimental determination of a µ , may createa new one, now with electroweak precision measurements.We now take a closer look at the claim of CHMM and point out problems with their assumption. Wewill not study the running of α all the way up to M Z here, because that would take us signiﬁcantly beyondthe scope of the present study of a µ . However, without too much eﬀort we can investigate the runningof α in the Euclidean regime up to scales accessible in our lattice computation. The Euclidean runningof the coupling is obtained from the Euclidean HVP as ∆ α (5)had ( − q ) = e Π (5) ( q ) , (183)where Π (5) is the ﬁve-ﬂavor HVP given, with the notations of Section 11, by: Π (5) = Π light + Π strange + Π charm + Π disc + Π bottom . (184)Here we compute Π (5) both on the lattice and from the R-ratio.To compute the HVP on the lattice we apply the same Type-II ﬁt procedure as we use for thedetermination of a µ and a µ, win in the previous Sections. We correct for ﬁnite-size eﬀects by computingthem in our simulations. For the bottom quark we combine the ﬁrst four moments of Π bottom ,determined by the HPQCD collaboration [94], into a Pad´e approximation. For the purpose of the presentdiscussion it suﬃces to consider two observables: Π (5) (1) the value of Π (5) at q = 1 GeV and thediﬀerence in Π (5) at q = 10 GeV and GeV , denoted by Π (5) (10) − Π (5) (1) . Our ﬁnal continuumextrapolated results are given in Table 20.In the case of Π (5) (10) − Π (5) (1) we have departed from our standard procedure in two details. First,one must note that the full dataset gives bad ﬁt qualities in the continuum extrapolation, especially forthe charm contribution. This is because the precision of our results is orders of magnitude better thaneg. for a µ . Thus, we ﬁt only a subset of about 50 conﬁgurations, maximally spaced along the simulationchain. This increases the statistical error and leads to acceptable ﬁt qualities. Also, we observe latticeartefacts that are much larger than for the other observables in the paper. In the charm case they are onthe level of %. To reﬂect this in the uncertainty of the continuum extrapolation, we use ﬁt functionsthat are quadratic polynomials in a and also apply a ﬂat weighting of the results of diﬀerent proceduresin our determination of the systematic error.To compute the HVP from the R-ratio we apply a dispersion integral (see eg. Section 3 of [111]) Π (5) ( q ) = q π (cid:90) ∞ s th ds R ( s ) s ( s + q ) (185)to the R-ratio data set of [70]. Uncertainties are computed from the covariance matrix of the data. Forenergies above the range of this data set we use the perturbative result from the rhad package [112].In Figure 30 we show HVP diﬀerences corresponding to ﬁve energy bins, starting at zero and endingat the scale M Z . The ﬁrst bin gives the diﬀerence of the HVP between 1 and 0 GeV , the second between10 and 1 GeV and so on. The running to the scale M Z can then be obtained by summing the values inthe ﬁve bins. In the top panel we show the result obtained from the R-ratio. For the ﬁrst two bins, we If we were able to run in the Euclidean up to M Z , the conversion to timelike M Z can be computed in perturbationtheory and the resulting correction is signiﬁcantly smaller than the present error bars on ∆ α (5)had ( M Z ) , as shown eg. in [118]. to GeV , we see a diﬀerence between thelattice and the R-ratio determinations, that is about . in ∆ α (5)had . This corresponds to a relative deviationof approximately . , which is about the same as we have in the total a µ . This fact is not surprisingbecause over of a µ comes from this spacelike region of momenta. In the second bin, corresponding tothe energy range from to GeV , the lattice and R-ratio results already agree. For the bins with largerenergies we show no lattice results: discretization errors are too large to allow a controlled continuumextrapolation.In Figure 30 we also show two scenarios from CHMM. The ﬁrst is their “reference point” projection(“proj( ∞ )”) described above. There, the R-ratio and lattice results are assumed to have the same . relative discrepancy in all energy bins up to M Z , as observed in the ﬁrst bin. The diﬃculty with thisassumption is that CHMM extrapolates over two orders of magnitude in energy, while using data onlyfrom the ﬁrst bin. In addition, our lattice result in the second energy bin is already in clear disagreementwith their hypothesis, and invalidates their estimate of Equation (182).The second of their scenarios considered here is the one in which they assume that the . rescalingof the spectral function only applies to center-of-mass energies below . GeV (“proj( . GeV)”). Thissecond scenario agrees much better with our lattice results, as can be seen in the ﬁrst two bins. Iffuture lattice calculations conﬁrm that the agreement holds in the remaining bins, the tension implied byour lattice calculation of a µ on ∆ α (5)had would be below . σ , signiﬁcantly smaller than the . σ of the“reference point” scenario of [113] and only slightly larger than the . σ already observed with the R-ratioresult of Equation (181). References

1. Ishizuka, N., Fukugita, M., Mino, H., Okawa, M. & Ukawa, A. Operator dependence of hadronmasses for Kogut-Susskind quarks on the lattice.

Nucl. Phys.

B411,

Commun. Math. Phys. [Erratum: Commun. Math. Phys.98,433(1985)], 59 (1985).3. Morningstar, C. & Peardon, M. J. Analytic smearing of SU(3) link variables in lattice QCD.

Phys.Rev.

D69, et al.

Precise Charm to Strange Mass Ratio and Light Quark Masses from FullLattice QCD.

Phys. Rev. Lett. et al.

FLAG Review 2019. arXiv: (2019).6. Borsanyi, S. et al.

High-precision scale setting in lattice QCD.

JHEP

010 (2012).7. Luscher, M. Properties and uses of the Wilson ﬂow in lattice QCD.

JHEP

071 (2010).8. Lepage, G. P. Flavor symmetry restoration and Symanzik improvement for staggered quarks.

Phys.Rev.

D59, et al.

Review of lattice results concerning low-energy particle physics.

Eur. Phys. J.

C77,

112 (2017).10. Capitani, S., Durr, S. & Hoelbling, C. Rationale for UV-ﬁltered clover fermions.

JHEP

Phys. Rev.

D54,

Phys. Rev.

D67,

Phys. Lett.

B417,

Comput. Phys. Commun. et al.

Calculation of the axion mass based on high-temperature lattice quantum chro-modynamics.

Nature et al.

QCD thermodynamics with dynamical overlap fermions.

Phys. Lett.

B713,

Phys. Rev.

D46,

Nucl. Phys.

B255,

Phys. Rev.

D75, et al.

Nonsinglet Axial Vector Couplings of the Baryon Octet in Lattice QCD.

Phys.Lett.

B227, et al.

Nonperturbative QCD Simulations with 2+1 Flavors of Improved StaggeredQuarks.

Rev. Mod. Phys. et al.

Observation of an Excited Ω − Baryon.

Phys. Rev. Lett.

Phys. Rev.

D34. [AIP Conf. Proc.132,267(1985)], 2809 (1986).24. Bazavov, A. et al.

Additional Strange Hadrons from QCD Thermodynamics and Strangeness Freeze-out in Heavy Ion Collisions.

Phys. Rev. Lett. et al.

Constraining the hadronic spectrum through QCD thermodynamics on the lattice.

Phys. Rev.

D96,

AIP Conf. Proc.

Phys. Rev.

D91,

Nucl. Phys.

B721, et al.

SU(2) and SU(3) chiral perturbation theory analyses on baryon masses in 2+1ﬂavor lattice QCD.

Phys. Rev.

D80,

Prog. Theor. Phys. et al.

Leading isospin breaking eﬀects on the lattice.

Phys. Rev.

D87, et al.

Ab initio calculation of the neutron-proton mass diﬀerence.

Science et al.

Isospin splittings in the light baryon octet from lattice QCD and QED.

Phys.Rev. Lett.

Phys. Rev.

D75, et al.

QED eﬀects in the pseudoscalar meson sector.

JHEP

093 (2016).36. Gasser, J., Rusetsky, A. & Scimemi, I. Electromagnetic corrections in hadronic processes.

Eur. Phys.J.

C32,

Comput. Phys. Commun.

Phys. Rev.

D88,

Phys. Rev.

D76,

Phys. Rev.

D80,

Phys. Rev.

D90,

Eur.Phys. J.

A47,

148 (2011).43. Chakraborty, B. et al.

Strange and charm quark contributions to the anomalous magnetic momentof the muon.

Phys. Rev. D g µ − . Phys. Lett. B g µ − . Phys. Rev. D

Phys. Rev. Lett. et al.

Hadronic vacuum polarization contribution to the anomalous magnetic momentsof leptons from ﬁrst principles.

Phys. Rev. Lett. et al.

Calculation of the hadronic vacuum polarization contribution to the muon anomalousmagnetic moment.

Phys. Rev. Lett.

Phys. Rev.

D64, et al.

Overlap Valence on 2+1 Flavor Domain Wall Fermion Conﬁgurations with Deﬂationand Low-mode Substitution.

Phys. Rev.

D82, et al.

Light quark vacuum polarization at the physical point and contribution to the muon g − . Phys. Rev. D

ACM Trans. Math. Software

RBRC Workshop on Lattice Gauge Theories (2016).

54. Borsanyi, S. et al.

Slope and curvature of the hadronic vacuum polarization at vanishing virtualityfrom lattice QCD.

Phys. Rev.

D96, g − from Lattice QCD+QED. Phys. Rev.

D99,

Phys. Rev.

D85,

Nucl. Phys.

B447,

Nucl. Phys.

B568,

JHEP

114 (2017).870. Aubin, C. & Blum, T. Calculating the hadronic vacuum polarization and leading hadronic contribu-tion to the muon anomalous magnetic moment with improved staggered quarks.

Phys. Rev.

D75,

Phys. Rev.

D71,

Nucl. Phys.

B250, p . JHEP

Phys.Rev.

D60,

Phys. Rev.

D68,

Adv. Nucl. Phys.

277 (2003).67. Follana, E. et al.

Highly improved staggered quarks on the lattice, with applications to charmphysics.

Phys. Rev.

D75,

JHEP

078 (2007).69. Davier, M., Hoecker, A., Malaescu, B. & Zhang, Z. A new evaluation of the hadronic vacuumpolarisation contributions to the muon anomalous magnetic moment and to α ( m ) . Eur. Phys. J.C

241 (2020).70. Keshavarzi, A., Nomura, D. & Teubner, T. Muon g − and α ( M Z ) : a new data-based analysis. Phys. Rev.

D97,

Phys. Rept.

Phys. Rev. Lett.

Phys. Rev.

D88, et al.

Review of Particle Physics.

Phys. Rev.

D98,

Nucl. Phys.

B364,

Nucl. Phys.

B354,

Flavor physics and lattice quantum chromodynamics in Modern perspectives in lat-tice QCD: Quantum ﬁeld theory and high performance computing. Proceedings, InternationalSchool, 93rd Session, Les Houches, France, August 3-28, 2009 (2011), 629–698. arXiv: .78. Lellouch, L. & Luscher, M. Weak transition matrix elements from ﬁnite volume correlation functions.

Commun. Math. Phys.

Nucl. Phys.

B619,

Phys. Rev. Lett. ππ scattering using N f = 2 + 1 Wilson improved quarks with massesdown to their physical values.

PoS

LATTICE2014,

079 (2015).882. Hansen, M. T. & Patella, A. Finite-volume eﬀects in ( g − HVP,LO µ . arXiv: (2019).83. Hansen, M. T. & Patella, A. Finite-volume and thermal eﬀects in the leading-HVP contribution tomuonic ( g − . arXiv: (Apr. 2020).84. Giusti, D., Sanﬁlippo, F. & Simula, S. Light-quark contribution to the leading hadronic vacuumpolarization term of the muon g − from twisted-mass fermions. Phys. Rev.

D98, g − with 2+1 ﬂavor lattice QCD. arXiv: (2019).86. Bijnens, J. et al. Electromagnetic ﬁnite-size eﬀects to the hadronic vacuum polarization.

Phys. Rev.D

Quantum ﬁelds on a lattice isbn : 9780521599177, 9780511879197(Cambridge University Press, 1997).88. Dowdall, R., Davies, C., Lepage, G. & McNeile, C. Vus from pi and K decay constants in full latticeQCD with physical u, d, s and c quarks.

Phys. Rev. D et al.

Gradient ﬂow and scale setting on MILC HISQ ensembles.

Phys. Rev. D et al.

QED Corrections to Hadronic Processes in Lattice QCD.

Phys. Rev.

D91, et al.

First lattice calculation of the QED corrections to leptonic decay rates.

Phys. Rev.Lett. et al.

Light-meson leptonic decay rates in lattice QCD+QED.

Phys. Rev.

D100,

JHEP

137 (2014).94. Colquhoun, B., Dowdall, R. J., Davies, C. T. H., Hornbostel, K. & Lepage, G. P. Υ and Υ (cid:48) LeptonicWidths, a bµ and m b from full lattice QCD. Phys. Rev.

D91,

Phys. Rev.

D98, g − of charged leptons, α ( M Z ) , and the hyperﬁnesplitting of muonium. Phys. Rev.

D101, g − and α ( M Z ) using newesthadronic cross-section data. Eur. Phys. J.

C77,

827 (2017).98. Colangelo, G., Hoferichter, M. & Stoﬀer, P. Two-pion contribution to hadronic vacuum polarization.

JHEP

006 (2019).99. Hoferichter, M., Hoid, B.-L. & Kubis, B. Three-pion contribution to hadronic vacuum polarization.

JHEP

137 (2019).100. Anastasi, A. et al.

Combination of KLOE σ (cid:0) e + e − → π + π − γ ( γ ) (cid:1) measurements and determinationof a π + π − µ in the energy range . < s < . GeV . JHEP

173 (2018).101. Alemany, R., Davier, M. & Hocker, A. Improved determination of the hadronic contribution to themuon (g-2) and to alpha (M(z)) using new data from hadronic tau decays.

Eur. Phys. J. C et al. The Discrepancy Between tau and e+e- Spectral Functions Revisited and theConsequences for the Muon Magnetic Anomaly.

Eur. Phys. J. C

Eur. Phys. J. C [Erratum: Eur.Phys.J.C 72, 1874 (2012)],1515 (2011).104. Davier, M., H¨ocker, A., Malaescu, B., Yuan, C.-Z. & Zhang, Z. Update of the ALEPH non-strangespectral functions from hadronic τ decays. Eur. Phys. J. C

Phys. Lett. B

JHEP

002 (2002).107. Jegerlehner, F. & Szafron, R. ρ − γ mixing in the neutral channel pion form factor F eπ and its rolein comparing e + e − with τ spectral functions. Eur. Phys. J. C τ decays for ( g − µ fromLattice QCD. PoS

LATTICE2018,

135 (2018).109. Miranda, J. & Roig, P. New τ -based evaluation of the hadronic contribution to the vacuum po-larization piece of the muon anomalous magnetic moment. arXiv: (July2020).110. https://indico.fnal.gov/event/21626/session/2/contribution/68/material/slides/2.pdf.111. Jegerlehner, F. The Anomalous Magnetic Moment of the Muon. Springer Tracts Mod. Phys. pp.1–693 (2017).112. Harlander, R. V. & Steinhauser, M. rhad: A Program for the evaluation of the hadronic R ratio inthe perturbative regime of QCD.

Comput. Phys. Commun. ( g − µ versus global electroweak ﬁts. arXiv: (Mar. 2020).114. Passera, M., Marciano, W. & Sirlin, A. The Muon g-2 and the bounds on the Higgs boson mass. Phys. Rev. D g -2 and ∆ α connection. arXiv: (June 2020).116. De Rafael, E. On Constraints Between ∆ α had ( M Z ) and ( g µ − HVP . arXiv: (June 2020).117. Haller, J. et al.

Update of the global electroweak ﬁt and constraints on two-Higgs-doublet models.

Eur. Phys. J. C

675 (2018).118.

Theory report on the 11th FCC-ee workshop (2019). arXiv:1905.05078 [hep-ph]