Automatic differentiation for error analysis
Alberto Ramos 𝑎,∗
𝑎 Instituto de Física Corpuscular (IFIC), CSIC-Universitat de Valencia, 46071 Valencia, Spain
E-mail: [email protected]
We present
ADerrors.jl, a software package for linear error propagation and analysis of Monte Carlo data. Although the focus is on data analysis in Lattice QCD, where estimates of the observables have to be computed from Monte Carlo samples, the software also deals with variables with uncertainties, either correlated or uncorrelated. Thanks to automatic differentiation techniques, linear error propagation is performed exactly, even in iterative algorithms (i.e. errors in parameters of non-linear fits). In this contribution we present an overview of the capabilities of the software, including access to uncertainties in fit parameters and dealing with correlated data. The software, written in Julia, is available for download and use at https://gitlab.ift.uam-csic.es/alberto/aderrors.jl. IFIC/20-55
Tools for High Energy Physics and Cosmology - TOOLS2020
2-6 November, 2020
Institut de Physique des 2 Infinis (IP2I), Lyon, France

∗ Speaker

© Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0). https://pos.sissa.it/
1. Data analysis in Lattice QCD
Lattice QCD is nowadays able to provide precise results for many quantities of phenomenological interest. A key ingredient in providing solid results is the analysis of lattice QCD data. Lattice QCD is based on discretizing space-time in a hypercubic grid of spacing 𝑎. The usual computation of QCD observables requires the evaluation of path integrals. On the lattice these observables are computed as integrals over very many dimensions. The biggest appeal of the method is that these integrals can be estimated using large-scale computer simulations and Monte Carlo methods. The usual workflow in lattice QCD analysis is the following:

Production of configurations A representative ensemble of the discretized lattice QCD action is generated using large scale computer simulations. These are usually called configurations.

Measurements of primary observables Several observables of interest, like hadron masses (𝑎𝑀_{p,Ω,...}), decay constants (𝑎𝑓_{𝜋,𝐾,𝐷,...}), form factors, etc. are measured in lattice units on the configurations generated in the previous step. These are usually called primary observables.

Data analysis The observable of interest is generically a function of these primary observables. These are called derived observables.

The challenges in data analysis are, on one hand, to control the systematic uncertainties associated with the different extrapolations (to the continuum 𝑎 → 0, to the infinite volume limit, etc.) required to produce a final answer. On the other hand we have to correctly account for the statistical uncertainties of the results. This includes a correct estimate of errors taking into account the autocorrelation of data measured in a Monte Carlo chain. We also have to take into account the correlation of observables measured on the same configurations. It is important to note that the functions that define the derived observables commonly involve iterative algorithms (cf. section 3).

In this proceedings contribution we present a freely available analysis code,
ADerrors.jl (https://gitlab.ift.uam-csic.es/alberto/aderrors.jl), that aims to simplify the data analysis. It combines the determination of statistical uncertainties using the Γ-method [1–3] with automatic differentiation techniques for error propagation [4]. We think that some of the techniques that are used for error propagation are potentially useful beyond the lattice QCD community.

The software is written in Julia (https://julialang.org). The package is not in the general registry. Still, one can use the package manager to install it. ADerrors.jl also depends on BDIO.jl, which is also not registered and should be installed beforehand. In summary, the steps to have a fully functional working environment are:
Code (Julia)
julia> import Pkg
julia> Pkg.add(url="https://gitlab.ift.uam-csic.es/alberto/bdio.jl")
julia> Pkg.add(url="https://gitlab.ift.uam-csic.es/alberto/aderrors.jl")
2. A calculator with uncertainties
At the center of the package
ADerrors.jl is the data type uwreal. Generally speaking, a uwreal data type contains the value and the different sources of uncertainty of a variable. Here we list the main concepts to work with these variables:

• Calling var = uwreal([x, dx], "TAG") will define var as a uwreal variable with central value x and uncertainty dx. The user-provided "TAG" allows one to keep track of the different contributions to the uncertainty of a variable.

• One can perform normal operations with uwreal data types, as if they were normal real variables: var2 = 2.0*sin(var)/var will propagate errors from var to var2.

• Calling uwerr(var) will determine the total uncertainty in var.

• value(var) and err(var) return, respectively, the central value and total error of var.

• Calling details(var) will give the contribution to the total error of var from each source.

Here we show an example demonstrating that ADerrors.jl can be used as a “calculator with uncertainties” that automatically takes care of error propagation and correlations.
Code (Julia)
julia> x = uwreal([12.31, 0.23], "Experiment A")
12.31 (Error not available ... maybe run uwerr)
julia> y = uwreal([4.22, 0.12], "Experiment B")
4.22 (Error not available ... maybe run uwerr)
julia> z = x + y
16.53 (Error not available ... maybe run uwerr)
julia> uwerr(z)
julia> println(z)
16.53 +/- 0.25942243542145693
julia> details(z)
julia> zero_error = sin(z) - (sin(x)*cos(y) + cos(x)*sin(y))
-6.661338147750939e-16 (Error not available ... maybe run uwerr)
julia> uwerr(zero_error)
julia> println(zero_error)
-6.661338147750939e-16 +/- 1.7281005654554058e-16
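As a quick cross-check of the transcript above (plain Julia, independent of the package): for z = x + y with uncorrelated inputs, linear propagation gives σ_z = sqrt(σ_x² + σ_y²), which reproduces the uncertainty quoted by uwerr.

```julia
# Independent cross-check of the error printed for z = x + y above:
# for uncorrelated x, y linear propagation gives
# sigma_z = sqrt(sigma_x^2 + sigma_y^2)
sigma_z = sqrt(0.23^2 + 0.12^2)
println(sigma_z)   # matches the 0.2594... quoted by ADerrors.jl
```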
In many practical situations we have several variables with uncertainties that are not independent. The covariance between these variables is given, and has to be taken into account in order to perform error propagation correctly. The method cobs(central_values, covariance, "TAG") allows one to import correlated data as a vector of uwreal. One can operate with the elements of this vector naturally. Let us see an example

Code (Julia)
julia> avg = [16.26, 0.12, -0.0038];
julia> Mcov = [ 0.478071  -0.176116    0.0135305;
               -0.176116   0.0696489  -0.00554431;
                0.0135305 -0.00554431  0.000454180];
julia> p = cobs(avg, Mcov, "Correlated data")
3-element Array{uwreal,1}:
 16.26 (Error not available ... maybe run uwerr)
 0.12 (Error not available ... maybe run uwerr)
 -0.0038 (Error not available ... maybe run uwerr)
julia> uwerr.(p);
julia> p
3-element Array{uwreal,1}:
 16.26 +/- 0.6914267857119798
 0.12 +/- 0.2639107803785211
 -0.0038 +/- 0.021311499243366245
julia> cov(p)
3×3 Array{Float64,2}:
  0.478071   -0.176116    0.0135305
 -0.176116    0.0696489  -0.00554431
  0.0135305  -0.00554431  0.00045418
julia> z = p[1] + p[2] + sin(p[3]);
julia> uwerr(z)
julia> z
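What uwerr(z) computes for z = p[1] + p[2] + sin(p[3]) can be checked by hand with plain Julia (a sketch, not part of the package): linear propagation contracts the Jacobian of z with the given covariance matrix.

```julia
using LinearAlgebra

# Hand-made linear propagation for z = p1 + p2 + sin(p3):
# sigma_z^2 = J' * Cov * J with Jacobian J = (1, 1, cos(p3))
avg  = [16.26, 0.12, -0.0038]
Mcov = [ 0.478071  -0.176116    0.0135305;
        -0.176116   0.0696489  -0.00554431;
         0.0135305 -0.00554431  0.000454180]
J = [1.0, 1.0, cos(avg[3])]
sigma_z = sqrt(dot(J, Mcov * J))
println(sigma_z)   # total error of z, including all correlations
```

Note that the off-diagonal entries of Mcov reduce the naive (uncorrelated) error substantially.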
3. Iterative algorithms
In many practical situations our derived observables are defined by an iterative procedure. This is the case of fit parameters: being functions of the data that require an iterative procedure to be determined (i.e. the minimization of the 𝜒² function), they make cumbersome the determination of the derivatives required for linear error propagation.
ADerrors.jl uses automatic differentiation to perform exact linear error propagation. Errors can be naively propagated even in iterative algorithms. As a concrete example, imagine that we are interested in finding the root of a non-linear function

𝑓(𝑥) = 𝑎 cos(𝑏 sin(𝑥)) − 𝑥 .  (1)

If the quantities 𝑎, 𝑏 have uncertainties, these will propagate into an uncertainty in the root 𝑥★. Since 𝑥★ is defined by

𝑥★ = Find the root of 𝑓(𝑥) ,  (2)

what error propagation needs is the derivative of "Find the root of 𝑓(𝑥)". The following program performs the job using ADerrors.jl
Code (Julia)
julia> a = uwreal([1.34, 0.12], "Data 01")
julia> b = uwreal([1.34, 0.12], "Data 02")
julia> x0 = uwreal([0.5, 0.5], "Initial position")
julia> while true
           val = a*cos(b*sin(x0)) - x0
           der = -a*b*sin(b*sin(x0))*cos(x0) - 1.0
           x1 = x0 - val/der
           if abs(value(x0) - value(x1)) < 1.0E-10
               break
           else
               x0 = x1
           end
       end
julia> uwerr(x1)
julia> details(x1)
0.7838003744331717 +/- 0.056992589665847124

Note that the while loop is just a straightforward application of Newton's method. Automatic differentiation approaches the problem of computing the derivative of "Find the root of 𝑓(𝑥)" by just computing the derivative of each expression of Newton's algorithm. (In fact ADerrors.jl uses the Julia package ForwardDiff.jl [5] (https://github.com/JuliaDiff/ForwardDiff.jl/) for this purpose.) The position of the root 𝑥★ has an uncertainty that has been propagated from the uncertainties of the original variables 𝑎, 𝑏.

Note that we also assigned an error to the initial position 𝑥₀ where the iteration starts. This uncertainty does not contribute to the final uncertainty of the position of the root. This is just a manifestation of the fact that it does not matter what we choose as starting point for our Newton method, we will end up finding the same root.
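To see how forward-mode automatic differentiation differentiates "Find the root of 𝑓(𝑥)", here is a minimal self-contained sketch with a hand-rolled dual number (the names Dual and newton_root are ours, for illustration only; ADerrors.jl delegates this machinery to ForwardDiff.jl):

```julia
# Minimal forward-mode AD sketch (hypothetical helper types, not the
# ADerrors.jl implementation): a dual number carries a value v and a
# derivative d, and every operation propagates both.
struct Dual
    v::Float64   # value
    d::Float64   # derivative w.r.t. the chosen input
end
Dual(c::Real) = Dual(float(c), 0.0)
Base.:+(x::Dual, y::Dual) = Dual(x.v + y.v, x.d + y.d)
Base.:-(x::Dual, y::Dual) = Dual(x.v - y.v, x.d - y.d)
Base.:-(x::Dual)          = Dual(-x.v, -x.d)
Base.:*(x::Dual, y::Dual) = Dual(x.v*y.v, x.d*y.v + x.v*y.d)
Base.:/(x::Dual, y::Dual) = Dual(x.v/y.v, (x.d*y.v - x.v*y.d)/y.v^2)
Base.sin(x::Dual) = Dual(sin(x.v),  cos(x.v)*x.d)
Base.cos(x::Dual) = Dual(cos(x.v), -sin(x.v)*x.d)

# Newton iteration for the root of f(x) = a*cos(b*sin(x)) - x; running it
# on duals differentiates every step of the algorithm automatically.
function newton_root(a::Dual, b::Dual; x0 = Dual(0.5), tol = 1.0e-12)
    x = x0
    for _ in 1:100
        val = a*cos(b*sin(x)) - x
        der = -(a*b*sin(b*sin(x))*cos(x)) - Dual(1.0)
        xnew = x - val/der
        abs(xnew.v - x.v) < tol && return xnew
        x = xnew
    end
    return x
end

# Seed a with derivative 1 to obtain d x*/d a in the .d field
xstar = newton_root(Dual(1.34, 1.0), Dual(1.34))
```

The derivative so obtained agrees with implicit differentiation of 𝑓(𝑥★) = 0, which gives ∂𝑥★/∂𝑎 = cos(𝑏 sin 𝑥★) / (1 + 𝑎𝑏 sin(𝑏 sin 𝑥★) cos 𝑥★); linear error propagation then combines such derivatives with the uncertainties of 𝑎 and 𝑏.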
One of the most common situations in data analysis is fitting data to a model. This is done by finding the values of the parameters 𝑝ᵢ that minimize the function

𝜒²(𝑝ᵢ; 𝑑ₐ) ,  𝑝ᵢ (𝑖 = 1, . . . , 𝑁_parm) ,  𝑑ₐ (𝑎 = 1, . . . , 𝑁_data) ,  (3)

where 𝑑ₐ are the data that carry uncertainties (i.e. Monte Carlo data from a lattice simulation, or experimental data). These data might be correlated, and the determination of the uncertainties in the fit parameters has to take these correlations into account.

The previous section shows that ADerrors.jl can solve this problem by just propagating errors in each step of the minimization routine (i.e. Levenberg-Marquardt, see an explicit example in [4]). This is cumbersome because it would require coding this algorithm in such a way that uwreal data types are accepted as input.

Reference [4] proposes an alternative based on the determination of the Hessian of the 𝜒² function at its minimum. This allows ADerrors.jl to rely on external (and efficient) libraries to perform the minimization of the 𝜒², and delegate the error propagation to the computation of the Hessian at the minimum once it is found. This Hessian can be determined exactly using automatic differentiation techniques, allowing exact error propagation in fit parameters.

Let us here show the main characteristics of data fitting using ADerrors.jl (the complete example is available in appendix A). First we minimize the 𝜒² with an efficient library

Code (Julia)
julia> fit = optimize(xx -> lm(xx, value.(uwy)), [1.0, 1.0],
                      LevenbergMarquardt(), autodiff = :forward)
julia> println(fit)
Results of Optimization Algorithm
 * Algorithm: LevenbergMarquardt
 * Minimizer: [1.19727262030669, -0.2938846629270645]
 * Sum of squares at Minimum: 19.590325
 * Iterations: 7
 * Convergence: true
 * |x - x'| < 1.0e-08: false
 * |f(x) - f(x')| / |f(x)| < 1.0e-08: true
 * |g(x)| < 1.0e-08: false
 * Function Calls: 8
 * Gradient Calls: 7
 * Multiplication Calls: 21

Once the minimum of the 𝜒² is found, ADerrors.jl is able to determine the uncertainties in the fit parameters. We get the fit parameters as a vector of uwreal from the routine

fit_error(csq, fitp_central_values, data)

where csq is the 𝜒²(𝑝ᵢ, 𝑑ₐ) function (as defined in equation (3)), fitp_central_values are the central values of the fit parameters (i.e. the minimum of the 𝜒²), and data is the data whose errors have to be propagated to the fit parameters.

Note that we obtain the fit parameters as an array of uwreal data. This means that we can operate with fit parameters as if they were normal variables, and even use them as input for other fits. We can also obtain the breakdown of the error in fit parameters (i.e. how much each original source of uncertainty contributes to the uncertainty in fit parameters).

Code (Julia)
julia> fitp, chiexp = fit_error(csq, fit.minimizer, uwy)
julia> uwerr.(fitp)
julia> println("Fit results: chi^2 / chi^2_exp: ", fit.ssr, " / ", chiexp)
julia> for i in 1:length(fitp)
           print("Fit Parameter "*string(i)*" (exact value "*string(exact[i])*"): ")
           details(fitp[i])
       end
Fit results: chi^2 / chi^2_exp: 19.59032480880994 / 18.000000000000004
Fit Parameter 1 (exact value ...
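The idea behind fit_error can be summarized as follows (a sketch of the standard derivation, following the logic of [4]; notation ours). At the minimum the gradient of the 𝜒² vanishes for any value of the data, so differentiating the stationarity condition with respect to the data gives the exact derivatives needed for linear propagation:

```latex
\left.\frac{\partial \chi^2}{\partial p_i}\right|_{p^\star(d)} = 0
\quad\Longrightarrow\quad
\sum_j H_{ij}\,\frac{\partial p_j^\star}{\partial d_a}
  + \frac{\partial^2 \chi^2}{\partial p_i\,\partial d_a} = 0 ,
\qquad
H_{ij} = \frac{\partial^2 \chi^2}{\partial p_i\,\partial p_j} ,
```

so that ∂𝑝★ᵢ/∂𝑑ₐ = −(𝐻⁻¹)ᵢⱼ ∂²𝜒²/∂𝑝ⱼ∂𝑑ₐ, with both second derivatives evaluated exactly by automatic differentiation at the minimum found by the external minimizer.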
Figure 1: Monte Carlo histories of 𝑥² and 𝑥⁴ for a random walk in the interval [−1, 1]. Clearly the two histories are correlated.
4. Dealing with Monte Carlo data
Up to this point we have shown how
ADerrors.jl works with variables with uncertainties, even when they are correlated. But the most common situation in Lattice QCD analysis is that the data to be analyzed come from a Monte Carlo simulation. In this case one cannot reduce the variable to a single value +/- error, because when several observables measured on the same ensemble are combined, the correlation has to be computed from the Monte Carlo history.
ADerrors.jl deals with this case transparently: uwreal variables can also contain full Monte Carlo histories. Let us first generate some Monte Carlo data (see figure 1)
Code (Julia)
julia> nsamp = 500;
julia> eta = randn(nsamp);
julia> x = Vector{Float64}(undef, nsamp);
julia> x[1] = 0.0;
julia> for i in 2:nsamp
           x[i] = x[i-1] + 0.2*eta[i]
           if abs(x[i]) > 1.0
               x[i] = x[i-1]
           end
       end

The vector x[:] contains samples of the uniform distribution U[−1, 1]. We can determine the expected values, for example, of E[𝑋²] or E[𝑋⁴]. Let us show how these Monte Carlo data can be input as uwreal data types and check that the variables are correlated.

Code (Julia)
julia> xp2 = uwreal(x.^2, "Random walk ensemble in [-1,1]")
0.3533602504472119 (Error not available ... maybe run uwerr)
julia> xp4 = uwreal(x.^4, "Random walk ensemble in [-1,1]")
0.2197572322981959 (Error not available ... maybe run uwerr)
julia> cov([xp2, xp4])
2×2 Array{Float64,2}:
 0.0007523017604391568  0.0005551868103090107
 0.0005551868103090107  0.00039631731910912886
Crucially, the ensemble tag ("Random walk ensemble in [-1,1]" in this case) is the same for both observables. This tells ADerrors.jl that the measurements of both observables have been performed on the same ensemble and makes it possible to determine the covariance of the data from the Monte Carlo histories. We can determine functions of these expected values. Correlations are extracted from the Monte Carlo histories and automatically taken into account. These correlations are crucial to obtain correct error estimates of derived observables that depend on both variables xp2, xp4, as the following example shows

Code (Julia)
julia> z = xp4 / xp2^2;
julia> uwerr(z)
julia> println(z)
1.7764683477541687 +/- 0.12751045462709376
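For orientation, a minimal sketch of the Γ-method that uwerr implements on such histories is shown below (plain Julia; the name gamma_error and the fixed summation window W are our simplifications, the real implementation uses the automatic windowing procedure of [1]):

```julia
using Statistics

# Simplified Γ-method error of a mean (sketch only; ADerrors.jl implements
# the full automatic-windowing procedure of Wolff [1]).
function gamma_error(a::Vector{Float64}, W::Int)
    N    = length(a)
    abar = mean(a)
    # autocovariance Γ(t) for t = 0, ..., W
    G = [sum((a[i] - abar)*(a[i+t] - abar) for i in 1:N-t)/(N - t) for t in 0:W]
    # integrated autocorrelation time with a fixed window W
    tauint = 0.5 + sum(G[2:end]) / G[1]
    # error of the mean, enhanced by 2*tauint w.r.t. the naive estimate
    err = sqrt(2.0 * tauint * G[1] / N)
    return abar, err, tauint
end
```

For uncorrelated data τ_int → 1/2 and the naive error sqrt(Γ(0)/N) is recovered; for the random-walk history above τ_int is much larger, which is why uwerr quotes an error well above the naive one.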
5. Conclusions
In this proceedings contribution we have presented
ADerrors.jl. Using automatic differentiation, this software package performs exact linear error propagation even in quantities that are defined via iterative algorithms, like fit parameters. ADerrors.jl keeps track of the origin of the error in each variable, making it possible to obtain how much each ensemble has contributed to the uncertainty in a particular variable.

The benefit of data analysis within this framework is that it is robust: if your analysis code determines the central values correctly, the exact nature of automatic differentiation ensures that errors are propagated correctly.

Some other features of the software include

• Exact linear error propagation in fit parameters, integrals and roots of non-linear functions.

• Handles data from any number of ensembles (i.e. simulations with different parameters).

• Handles correlated data, with error estimates automatically including the correlations.

• Support for replicas (i.e. several runs with the same simulation parameters).

• Irregular MC measurements are handled transparently.

We refer the interested readers to the original documentation of the package, https://gitlab.ift.uam-csic.es/alberto/aderrors.jl, and the on-line tutorial, https://ific.uv.es/~alramos/docs/ADerrors/tutorial/.

Acknowledgments
I am grateful to C. Pena for his help preparing the material of this talk. The author received support from the Generalitat Valenciana via the genT program (CIDEGENT/2019/040).
A. Error propagation in fits
Code (Julia)
using ADerrors, Distributions, LeastSquaresOptim, Printf

npt = 20
x = 2.0*rand(npt)
@. model(x) = 1.2*exp(-x*0.3)
dy = 0.05 ./ (x .+ model(x))
y = model(x) + dy .* randn(npt)

uwy = Vector{uwreal}(undef, npt)
for i in 1:npt
    xfmt = @sprintf("%6.4f", x[i])
    uwy[i] = uwreal([y[i], dy[i]], "Data for point at x="*string(xfmt))
end
uwerr.(uwy)
for i in 1:npt
    println("dt ", x[i], " ", uwy[i])
end

function fit_definitions(x, dy)
    function lmfit(prm, y)
        res = Vector{eltype(prm)}(undef, length(y))
        for i in 1:length(y)
            res[i] = (y[i] - prm[1]*exp(prm[2]*x[i])) / dy[i]
        end
        return res
    end
    chisq(prm, data) = sum(lmfit(prm, data) .^ 2)
    return lmfit, chisq
end

lm, csq = fit_definitions(x, err.(uwy))
fit = optimize(xx -> lm(xx, value.(uwy)), [1.0, 1.0],
               LevenbergMarquardt(), autodiff = :forward)
println(fit)

fitp, chiexp = fit_error(csq, fit.minimizer, uwy)
uwerr.(fitp)
exact = [1.2, -0.3]
println("Fit results: chi^2 / chi^2_exp: ", fit.ssr, " / ", chiexp)
for i in 1:length(fitp)
    print("Fit Parameter "*string(i)*" (exact value "*string(exact[i])*"): ")
    details(fitp[i])
end

References

[1]
ALPHA Collaboration, U. Wolff, "Monte Carlo errors with less errors," Comput. Phys. Commun. (2004) 143–153, arXiv:hep-lat/0306017.

[2] ALPHA Collaboration, S. Schaefer, R. Sommer, and F. Virotta, "Critical slowing down and error analysis in lattice QCD simulations," Nucl. Phys. B845 (2011) 93–119, arXiv:1009.5228.

[3] F. Virotta, Critical slowing down and error analysis of lattice QCD simulations. PhD thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2012.

[4] A. Ramos, "Automatic differentiation for error analysis of Monte Carlo data," Comput. Phys. Commun. (2019) 19–35, arXiv:1809.01289.

[5] J. Revels, M. Lubin, and T. Papamarkou, "Forward-mode automatic differentiation in Julia," arXiv:1607.07892 [cs.MS] (2016). https://arxiv.org/abs/1607.07892