[PDF] Physics-related epistemic uncertainties in proton depth dose simulation

Abstract

A set of physics models and parameters pertaining to the simulation of proton energy deposition in matter are evaluated in the energy range up to approximately 65 MeV, based on their implementations in the Geant4 toolkit. The analysis assesses several features of the models and the impact of their associated epistemic uncertainties, i.e. uncertainties due to lack of knowledge, on the simulation results. Possible systematic effects deriving from uncertainties of this kind are highlighted; their relevance in relation to the application environment and different experimental requirements are discussed, with emphasis on the simulation of radiotherapy set-ups. By documenting quantitatively the features of a wide set of simulation models and the related intrinsic uncertainties affecting the simulation results, this analysis provides guidance regarding the use of the concerned simulation tools in experimental applications; it also provides indications for further experimental measurements addressing the sources of such uncertainties.

Full PDF

aa r X i v : . [ phy s i c s . c o m p - ph ] J u l Physics-related epistemic uncertainties in protondepth dose simulation

Maria Grazia Pia, Marcia Begalli, Anton Lechner, Lina Quintieri, and Paolo Saracco

Abstract — A set of physics models and parameters pertainingto the simulation of proton energy deposition in matter areevaluated in the energy range up to approximately 65 MeV, basedon their implementations in the Geant4 toolkit. The analysisassesses several features of the models and the impact of theirassociated epistemic uncertainties, i.e. uncertainties due to lackof knowledge, on the simulation results. Possible systematiceffects deriving from uncertainties of this kind are highlighted;their relevance in relation to the application environment anddifferent experimental requirements are discussed, with emphasison the simulation of radiotherapy set-ups. By documentingquantitatively the features of a wide set of simulation models andthe related intrinsic uncertainties affecting the simulation results,this analysis provides guidance regarding the use of the concernedsimulation tools in experimental applications; it also providesindications for further experimental measurements addressingthe sources of such uncertainties.

Index Terms — Monte Carlo, simulation, Geant4, hadron ther-apy.

I. I

NTRODUCTION T HE simulation of the energy deposited by protons inmatter is relevant to various experimental applications;radiotherapeutical applications exploit its peculiar pattern priorto stopping, exhibiting the characteristic “Bragg peak”, todeliver a well localized dose to the tumor area [1].Several applications of general purpose Monte Carlo sys-tems, like MCNP [2]–[4], GEANT 3 [5], Geant4 [6], [7],SHIELD-HIT [8], [9], FLUKA [10], [11] and PHITS [12]are documented in the literature concerning this topic, inhadron therapy as well as in other ﬁelds. A variety of physicsoptions - theoretical models, evaluated data compilations andvalues of relevant physical parameters - is available in theseMonte Carlo codes to model the electromagnetic and nuclearinteractions of protons, and of their secondary products. Whilethe software implementations are speciﬁc to each Monte Carlocode, the underlying physics modeling approaches and datacompilations are often common to various simulation systems.Some of the physics models and parameters used in thesimulation of proton interactions with matter are affected byepistemic uncertainties [13], i.e uncertainties due to lack of

Manuscript received March 30, 2010.M. G. Pia and P. Saracco are with INFN Sezione di Genova, Via Dode-caneso 33, I-16146 Genova, Italy (phone: +39 010 3536328, fax: +39 010313358, e-mail: [email protected], [email protected]).M. Begalli is with State University of Rio de Janeiro, Rio de Janeiro, Brazil(email: [email protected]. Lechner is with the Atomic Institute of the Austrian Universities, ViennaUniversity of Technology, Vienna, Austria and CERN, Geneva, Switzerland(email: [email protected]).L. Quintieri is with INFN Laboratori Nazionali di Frascati, Frascati, Italy(e-mail: [email protected]). knowledge. They may originate from various sources [14],such as incomplete understanding of fundamental physicsprocesses, or practical inability to treat them thoroughly,non-existent or conﬂicting experimental data for a physicalparameter or model, or the application of a physics modelbeyond the experimental conditions in which its validity hasbeen demonstrated (e.g. at lower or higher energies, or withdifferent target materials).The role of epistemic uncertainties in the software veriﬁca-tion and validation process has been the object of research inthe context of simulation based on deterministic methods [13];these investigations are motivated by the rigorous risk analysisrequired by some sensitive applications. Limited attention hasbeen devoted so far to the role of epistemic uncertainties inMonte Carlo simulation in particle and nuclear physics, andrelated experimental ﬁelds. This paper addresses this topicby illustrating it in a concrete experimental use case: thesimulation of proton depth dose to water for radiotherapyapplications. The simulation conﬁguration involves a realisticmodel of a therapeutical proton beam line and beam energiesof approximately 65 MeV; this use case is representativeof experimental environments for the treatment of ocularmelanoma [15]–[18].Due to their intrinsic nature, related to lack of knowledge,epistemic uncertainties are difﬁcult to quantify [19]. Althoughthe characterization of epistemic uncertainty contributions isneeded for many of the issues that feed the reliability modelof complex systems, there is no generally accepted methodof measuring epistemic uncertainties and their contribution toreliability estimations. A variety of mathematical formalisms[20] has been developed for this purpose; nevertheless, someof the techniques adopted in the context of deterministic sim-ulations, like interval analysis and applications of Dempster-Shafer theory of evidence [21], are not always directly applica-ble in identical form to the treatment of epistemic uncertaintiesin Monte Carlo simulations.Sensitivity analysis [19] is a tool for exploring how differentuncertainties, including epistemic ones, inﬂuence the modeloutput [22]. This approach is adopted in the study describedhere: the paper identiﬁes a set of epistemic uncertainties inphysics modeling pertinent to the problem domain, documentstheir impact on the simulation results, and discusses theirpotentiality to produce systematic effects in relation to thecharacteristics of the application environment.Typically, in statistical analyses epistemic uncertainty isrepresented as a set of discrete possible or plausible choices(e.g. model choices) [23]. The environment for this kind ofstudy has been realized in the context of a Geant4-based application; the characteristic of Geant4 as a toolkit, encom-passing a wide variety of physics models, along with thefeature of polymorphism characterizing the object orientedprogramming paradigm, allow the conﬁguration of the sim-ulation with a large number of different physics options inthe same software environment. This versatility makes Geant4a convenient playground to evaluate the effects of a numberof physics alternatives on the experimental use case understudy. The sensitivity analysis documented in this paper, whichexamines the response of the system to a wide set of modelingapproaches, plays a conceptually similar role to the intervalanalysis method applied in deterministic simulation, whereparameters subject to epistemic uncertainties are varied withinbounds. As mentioned in the previous paragraph, sensitivityanalysis as applied to the context described in the followingsections contributes to identify and quantify possible system-atic effects in the simulation; it cannot infer anything about thevalidity of any of the physics models, for which experimentaldata would be needed.The physics modeling options available in Geant4 for theuse case under study are broadly representative of the bodyof knowledge in the problem domain; the analysis describedin this paper, albeit performed in the context of a speciﬁcMonte Carlo system, provides elements of interest for othersimulation environments as well.Epistemic uncertainties are in principle reducible [13]; theanalysis documented in this paper identiﬁes areas whereexperimental measurements could reduce them, and highlightstheir requirements of accuracy and other features to improvethe knowledge embedded in the current simulation models.Preliminary assessments relevant to this study were reportedin [24].II. G

EANT BASED SIMULATION OF PROTON DEPTH - DOSEPROFILES

The Geant4 toolkit includes various physics modellingoptions relevant to the application domain considered in thispaper; they concern stopping powers, multiple scattering, crosssections and ﬁnal state models of elastic and inelastic nuclearinteractions of the primary protons, as well as a varietyof electromagnetic and hadronic models for the secondaryparticles resulting from proton interactions with matter.Several Geant4-based simulations of proton therapy set-ups, like [25]- [35], have documented satisfactory agreementwith experimental depth dose measurements in various con-ﬁgurations of beam lines and detectors. Some open issuesremain, which are generally shared by simulations based onother Monte Carlo systems as well, due to common physicsmodeling approaches in the codes and similar practices in theexperimental domain.There is evidence in the literature of different featuresproduced by Geant4 physics models in the energy range below100 MeV [36]–[40]; the discrimination of their accuracy ismade difﬁcult by the still incomplete software validation,which is often hindered by the limited availability of exper-imental data, or their controversial characteristics. The sameproblem affects other Monte Carlo codes as well. Furthermore, some shortcomings are present in the comparisons of protontherapy simulations with experimental data.The physics conﬁguration used in the simulation and theMonte Carlo kernel version are either undocumented, orincompletely documented, in a number of references; thereforeit is not always possible to relate the results documented in theliterature to the physics options and software implementationswhich produced them, thus hindering the reproducibility of theresults. The comparison between simulated and experimentaldata is limited to qualitative appraisal in most cases; the lackof statistical analysis in many articles prevents the readerfrom appraising the signiﬁcance of the compatibility betweensimulation and experimental data, and the relative merits ofthe simulation models in comparison to measurements takenin different experimental conﬁgurations and characterized bydifferent uncertainties.It is a feature of proton therapy simulations that the beamparameters, namely the beam energy and energy spread, arenot known in typical experimental set-ups of this domain withsufﬁcient accuracy to base the simulation on their nominalvalues [27]. The precise knowledge of the beam energy isnot critical to clinical applications; in that context what isof interest is the proton range, which can be measured veryaccurately. In common practice, the beam parameters inputto the simulation are adjusted so that the simulation resultsbest ﬁt the measured depth dose proﬁle [25], [27], [30],[33], [41], [42]; this procedure affects the signiﬁcance offurther comparisons between simulated and experimental data.Experimental techniques have been developed to measure theenergy of a proton beam from radiotherapy accelerators withgreater precision [43], but they are not commonly exploited inconnection with simulation validation studies.The comparison of simulated and experimental Bragg peakproﬁles is also sensitive to the normalization procedures,which are often applied in experimental practice to relatesimulated and measured data; this topic is discussed in detailin [44].These shortcomings do not severely affect current clinicalpractice, where Monte Carlo simulation plays a role of veriﬁ-cation and optimization. More demanding requirements aboutthe predictive capabilities of the simulation may derive fromnew perspectives, such as the use of Monte Carlo simulationin treatment planning, which is the object of ongoing research[45]–[48], or other disciplines, like radiation protection.Epistemic uncertainties undermine the predictive role ofsimulation software; by identifying and quantifying them, theanalysis described in the following sections is propaedeutic tofurther experimental and theoretical efforts to reduce them, orat least to control their effects.III. O

VERVIEW OF RELEVANT PHYSICS MODELS

A brief overview of the Geant4 physics models relevantto the use case addressed in this paper is given below tofacilitate the understanding of the results presented in thefollowing sections. In the context of Geant4, particle inter-actions with matter are represented by processes, which canbe implemented through different models [6]. Similarities with the modeling approaches in other Monte Carlo codesare discussed; they show that the body of knowledge, aswell as knowledge gaps, are largely shared across a varietyof simulation systems. The information summarized in thissection is necessarily succinct; further details can be found inthe cited references.

A. Electromagnetic interactions

Geant4 electromagnetic package [49] includes two packagesrelevant to the experimental use case considered:

Standard and

Low Energy .The proton ionisation process in the

Low Energy electro-magnetic package [50], [51] is articulated through models [52]based on the free electron gas model [53] at lower energy( < ICRU49 , Ziegler77 , Ziegler85 and

Ziegler2000 implement electronicand nuclear stopping powers based on [54]–[57]; their differentenergy ranges of applicability are documented in the respectivereferences. The conﬁguration of the hadron ionisation processis identiﬁed in the following sections through the selected pa-rameterisation model. The process also deals with the emissionof δ -rays.The Geant4 Low Energy electromagnetic package includesprocesses [58], [59] to handle the secondary particles resultingfrom proton interactions: electrons, photons and ions. Modelsfor electrons and photons are based on data libraries (EEDL[60] and EPDL97 [61]) and on analytical formulations origi-nally developed for the Penelope [62] Monte Carlo code; bothoptions account for the atomic relaxation [63] following theprimary processes. Models based on the parameterisations in[55], [56], and on [64] are available for ions.The main features of the Geant4

Standard electromagneticpackage are documented in [65]. More recently a model [68]for proton ionisation based on [54] was implemented in thispackage; this physical approach is the same as the one alreadypresent in the

Low Energy package. The handling of energyloss ﬂuctuations is implemented in this package; it is basedon the model adopted in GEANT 3 [66] and was updated inGeant4 8.3 [67].An assessment of Geant4 electromagnetic processes againstthe NIST (United States National Institute of Standards)reference data can be found in [36]. The validation of Geant4low energy electromagnetic processes against precision mea-surements of electron energy deposition is documented in [69].The

Standard package includes implementations [70], [71]of the multiple scattering process. In the early Geant4 ver-sions a generic process (

G4MultipleScattering ), based onthe Lewis [72] theory, was applicable to any charged par-ticles; it has been complemented by a process special-ized for hadrons (

G4hMultipleScattering ) in Geant4 8.2, andone speciﬁc to electrons (

G4eMultipleScattering ) in Geant49.3. The specialized multiple scattering processes are in-tended to replace the generic one in future Geant4 releases[73]. These processes can be conﬁgured with various mod-els (e.g.

G4UrbanMscModel90 , G4UrbanMscModel92 and

G4UrbanMscModel93 ), which involve some empirical param-eters [74]. Early implementations of Geant4 multiple scatteringwere validated against experimental muon data at low [75],[76] and high [77] energy; [77] showed better accuracy ofGeant4 multiple scattering model (as implemented in Geant41.0) with respect to the GEANT 3 model based on Moli`eretheory. Recent studies, like [69], [78], have highlighted is-sues in the evolution of energy deposition patterns involvingelectron-photon interactions simulated with different Geant4versions; the observed effects have been ascribed to modiﬁ-cations to Geant4 multiple scattering algorithm. The literatureconcerning recent evolutions of Geant4 multiple scattering isfocused on issues related to the dependence of the simulationon transport step size and parameters of the algorithm [67],[68], [74]. Io the best of the authors’ knowledge, the validationof Geant4 proton multiple scattering is not documented inliterature yet for the energy range relevant to this study.The Geant4

Standard and

Low Energy electromagneticpackages are concerned by the same epistemic uncertaintiesaffecting electromagnetic physics modeling relevant to the usecase studied in this paper, which are analyzed in sections V-Band V-C. Differences in the outcome of simulations usingthe two packages may derive from features speciﬁc to eachpackage, which are not associated with epistemic uncertaintiesin the physics domain pertinent to the use case examined inthis paper; they could be due to numerical features, like thenumber of bins in the look-up tables used in the simulationand their interpolation, or algorithms and modelling choicesspeciﬁc to each package. A complete documentation andanalysis of different features of Geant4

Standard and

LowEnergy electromagnetic packages is outside the scope of thispaper.Other Monte Carlo codes used for hadron therapy sim-ulation adopt similar approaches for stopping power calcu-lation at high and low energies. At intermediate energies,stopping powers based on ICRU 49 Report are implemented inSHIELD-HIT; an improvement to include them in FLUKA isdocumented in [79], but it does not appear to be released yet inthe current version of FLUKA (FLUKA-2008) at the time ofwriting this paper. PHITS handles proton ionization accordingto the SPAR code [80], while MCNP uses an energy weightedaverage between the high and low energy calculations [81],which adopt the same methodology as in SPAR. PHITS [82],SHIELD-HIT [42] and MCNPX do not model δ -ray emission.GEANT development was frozen with the 3.21 version in1994; the code is no longer supported, but it is still used forhadron therapy developments [83]. GEANT 3 simulated protonenergy loss based on the Bethe-Bloch equation and dealt with δ -ray production from ionization.MCNP multiple Coulomb scattering treats soft and hardinteractions separately [84]: soft collisions are described usinga continuous scattering approximation; a small number of hardcollisions are simulated directly. Multiple scattering is basedon Moli`ere theory in FLUKA and PHITS [85]; details ofthe algorithm in FLUKA are described in [86]. SHIELD-HITsimulates multiple scattering on the basis of a gaussian model[42], which gives the correlated value of the angular deviationand lateral displacement of the scattered particle. GEANT 3 provided two options for multiple scattering simulation, re-spectively based on a Gaussian approximation and on Moli`eretheory [5]. B. Hadronic interactions

The Geant4 hadronic package addresses the complexity ofnuclear interactions through a software framework [87]. Thebaseline design can accommodate multiple implementationsof cross sections and ﬁnal state models associated with aprocess, which are either complementary in their energy rangecoverage or alternative in their modelling approach; processesand models are meant to be handled polymorphically throughtheir respective base classes.

1) Elastic scattering:

Geant4 includes variouselastic scattering processes:

G4HadronElasticProcess , G4UHadronElasticProcess , G4QElastic and

G4WHadronElasticProcess ; the latter was released inGeant4 9.3 with the purpose [73] of allowing models forelastic scattering to be treated in a similar way to inelasticmodels. A

G4DiffuseElastic [88] process was also releasedin Geant4 9.3; the energy range of its applicability is notexplicitly speciﬁed in [88], but, since this process appearsapplied at energies of 1 GeV and above in the associatedreference, one is led to assume that it is pertinent to higherenergies than those relevant to the use case studied in thispaper. This inference manifests an epistemic uncertainty inthe applicability domain of this process.The

G4HadronElasticProcess class of the hadronic pro-cesses package handles cross section and ﬁnal state calculationaccording to the software design of [87].It can be conﬁgured with the

G4HadronElasticDataSet class, derived from

G4VCrossSectionDataSet and includedin the Geant4 hadronic cross sections package, which im-plements total elastic scattering cross sections derived fromGHEISHA [89].The scattering can be conﬁgured through several mod-els, such as

G4LElastic , G4ElasticCascadeInterface and

G4HadronElastic . G4LElastic , included in the hadronic mod-els package, is based on GHEISHA algorithms reengi-neered in Geant4; it is not meant to conserve energyand momentum on an event-by-event basis, but only onaverage.

G4ElasticCascadeInterface , identiﬁed in the fol-lowing as “Bertini elastic”, is included in the cascade package of the cascade package of hadronic models ;it derives from

G4IntraNuclearTransportModel and imple-ments a model based on the INUCL [90] code. The

G4CascadeElasticInterface class in the same package activatesboth elastic and inelastic interactions.

G4HadronElastic [91],included in the coherent elastic package of hadronic models ,combines elements originally developed for CHIPS (ChiralInvariant Phase Space) [92], [93] with other modelling ap-proaches; it aggregates a

G4VQCrossSection object belongingto CHIPS and a

G4ElasticHadrNucleusHE model.The

G4QElastic process, known as “CHIPSelastic”, delegates the calculation of cross sectionsto the

G4QElasticCrossSection class, derived from

G4VQCrossSection , and implements its own scattering algorithm. All the related classes are included in the chiral inv phase space package of hadronic models .The

G4UHadronElasticProcess process [91], included inthe coherent elastic package of hadronic models , is meant tobe conﬁgured with the dedicated

G4HadronElastic model; thisconﬁguration is referred to in the following as “U-elastic”.The limited documentation in the literature of Geant4 elasticscattering models and other codes does not facilitate theappreciation of their characteristics, nor the identiﬁcation ofthe experimental data with respect to which some of thesimulation models, especially those implementing parameter-isations, may have been calibrated. Although improvementsto Geant4 elastic scattering modeling are mentioned in [40],hardly any validation of the available models is documentedin the literature in the energy range relevant to the use caseunder study. The use of these models in the simulation is asource of epistemic uncertainty, due to the lack of knowledgeof their accuracy in the energy range pertinent to this study.

2) Non-elastic interactions:

Inelastic hadron-nucleus scat-tering is handled in Geant4 through processes speciﬁc to eachparticle type. Processes for protons, neutrons, deuteron, tritonand α particles are relevant to this study.Total inelastic cross sections derived from GHEISHA [89]are available in Geant4 through G4HadronInelasticDataSet for all hadrons, α particles, deuteron and triton; alternativeimplementations based on [94], [95] are available for someenergy ranges and target materials. Cross sections describ-ing neutron-nucleus scattering with higher precision below20 MeV are available in G4NeutronHPInelasticData . Special-ized cross sections, based on [96]- [101], are available forions.Parameterised and theory-driven [102] models of nuclearinelastic scattering are available in Geant4 for protons andother particles, concerning the energy range pertinent to thisstudy.Geant4 Low Energy Parameterised models (LEP), originat-ing from GHEISHA [89], handle protons, neutrons, pions, α particles, deuterons and tritons.The CHIPS G4QInelastic inelastic process [92], [93] imple-mented in Geant4 is applicable to hadron inelastic scatteringin the energy range pertinent to this study.Various options of theory-driven models describe the phasesof intranuclear transport, preequilibrium and nuclear deexci-tation in Geant4. Other Monte Carlo codes used in protontherapy applications use a similar multi-stage approach tosimulate proton inelastic interactions. The primary protonenergy in this study lies at the edge of what is commonlyconsidered as the range of transition between intranuclearcascade and preequilibrium models.Geant4 includes three cascade models, each one with furtherassociated models describing the lower energy phases: they areknown as Binary [103], Bertini [104], [105] and Li`ege [106]cascade models.The Binary cascade model [103] adopts a hybrid approachbetween a classical cascade code and a quantum molecular dy-namics model. It handles the preequilibrium phase through thePrecompound model (

G4PreCompoundModel ) [107], whoseimplementation is based on Grifﬁn’s exciton model [108], [109]; this model can be activated in a simulation applicationeither through the Binary cascade model or as an independentmodel. The nuclear deexcitation associated with the Precom-pound model can be conﬁgured with various options: theyinclude an evaporation model based on Weisskopf-Ewing’s[110], [111] theory, exploiting Dostrovsky’s computationalapproach [112], the Generalized Evaporation Model (GEM)[113] (also used by PHITS), and the optional activation of theFermi break-up [107].A similar approach is adopted in SHIELD-HIT, whosedefault model considers a fast cascade stage, which bringsthe interaction between the projectile and target to a sequenceof binary collisions [114]; this stage is followed by preequi-librium [115] and deexcitation of residual nuclei, with Fermibreak up of light nuclei and evaporation.The Geant4 Bertini Cascade implementation [104], [105] isa reengineering of the INUCL code [90], which is based onBertini’s approach [116] to intranuclear transport; it handlesthe preequilibrium phase based on Grifﬁn’s exciton model andthe evaporation phase based on Weisskopf’s and Dostrovsky’sapproach. The preequilibrium part of INUCL is based onthe Cascade Exciton Model (CEM) [117], which is one ofthe options of MCNPX for proton transport and is alsoimplemented in SHIELD (as well as in other codes) [118].A cascade model based on Bertini’s scheme, derived fromthe HETC [119] implementation in LAHET [120], is availablein MCNPX to handle protons, besides the ISABEL [121],[122] and CEM options. The HETC model was also interfacedto GEANT 3 through CALOR [123].The INCL Intranuclear Cascade [106] model, better knownas Li`ege Cascade, has been reengineered in Geant4 togetherwith the associated ABLA evaporation model [124]; it was ﬁrstreleased in Geant4 version 9.1 [125] and further improved inGeant4 9.3. INCL is included [126] in the LAHET [120] codesystem used by MCNPX. Although, according to [125], INCLis meant for energies above 200 MeV, satisfactory applicationsat energies of a few tens MeV are reported in the literature[127]–[129].The simulation of low energy proton interactions in PHITSis based [130] on the MCNP4C and NMTC [131] codes; thelatter incorporates Bertini’s cascade model [116] for nucleonand meson transport.FLUKA handles inelastic scattering through PEANUT[132] in the energy range relevant to the use case under study;it involves a sequence of intranuclear cascade followed bypreequilibrium and deexcitation. The preequilibrium modelis based on the formalism developed by Blann [133] withsome modiﬁcations [134]; the evaporation model is basedon Weisskopf-Ewing’s approach [135]; Fermi break-up ismodeled for light nuclei [135].Hadronic interactions were not handled by GEANT 3; theirtreatment was delegated to external packages (GHEISHA,CALOR and an early version of FLUKA) interfaced to it.Limited documentation regarding the validation of Geant4inelastic scattering models relevant to the use case of thisstudy is available in the literature. Some comparisons withexperimental data are reported in [38], [39], [40], [91], [103],[136], [137]: although most of the results shown in these references are not directly related to the use case investigatedin this paper, they demonstrate the ongoing validation effortsin this domain.

C. Epistemic uncertainties in the simulation models

Epistemic uncertainties in the physics simulation modelsarise from various sources.In some cases, the lack of knowledge concerns the valueof a physical parameter: this is the case, for instance, forthe mean water ionization potential, for which various values,originating from experimental measurements or theoreticalcalculations, are documented in the literature.Other sources of uncertainty are associated with values, usedin the simulation, deriving from parameterisations, or ﬁts toexperimental data (which may be inspired by theoretical mo-tivations): in the present study this concerns, for instance, thecross sections of nuclear elastic and inelastic interactions, andproton stopping powers. In these cases the uncertainties derivefrom the measurements themselves, the criteria by which thedata are selected for the ﬁt, and from the parameterisation orﬁtting process.Some models may embed parameters or, more generally,features, which are adjusted in the software implementationaccording to empirical procedures: from calibration with re-spect to experimental data to educated guesses, in the absenceof pertinent measurements. This is the case, for instance, forthe Geant4 multiple scattering model and forsome hadronicinteraction models. In this respect, and also for models basedon parameterisations or ﬁts to experimental data, an importantissue is the distinction between the calibration and validationof simulation models; for the reader’s convenience, these con-cepts pertaining to simulation epistemology are brieﬂy recalledhere. Calibration is the process of improving the agreement ofa code calculation with respect to a chosen set of benchmarksthrough the adjustment of parameters implemented in the code[19]; in the Monte Carlo simulation jargon this process is alsoknown as “tuning”. Software validation is deﬁned in the IEEEStandard for Software Veriﬁcation and Validation [138]. Thisgeneric deﬁnition is adapted to speciﬁc application domainswith some slight variants; regarding simulation, validation isusually intended as the process of determining the degree towhich a model is an accurate representation of the real worldfrom the perspective of its intended uses, or of conﬁrmingthat the predictions of a code adequately represent measuredphysical phenomena [19], [139].The limited documentation in the literature of the calibrationof the physics models implemented in Monte Carlo codesdoes not facilitate the understanding whether some of thecomparisons with experimental data reported in the literaturedocument calibration results and their experimental bench-marks, or model validation.The use of a simulation code for predictive purposes outsidethe scope of its validation necessitates extrapolation beyondthe understanding gained strictly from experimental validationdata. This type of uncertainty in our inference is primarilyepistemic.Regarding the energy range of nuclear interactions relevantto the use case considered in this paper, a long debate has

TABLE IP

ROTON PHYSICS MODELLING OPTIONS IN THE SIMULATION

Physics domain Option Process class Model class

Proton ICRU49 G4hLowEnergyIonisation G4hICRU49pstopping Ziegler77 G4hLowEnergyIonisation G4hZiegler1977ppowers Ziegler85 G4hLowEnergyIonisation G4hZiegler1985pZiegler2000 G4hLowEnergyIonisation G4hSRIM2000pMultiple scattering Generic multiple scattering G4MultipleScattering G4UrbanMscModel92Hadron multiple scattering G4hMultipleScattering G4UrbanMscModel90Hadronic LEP G4HadronElasticProcess G4LElasticelastic U-elastic G4UHadronElasticProcess G4HadronElasticscattering Bertini-elastic G4HadronElasticProcess G4ElasticCascadeInterfaceCHIPS-elastic G4QElasticHadronic inelastic Default G4ProtonInelasticProcess G4HadronInelasticDataSetcross sections Wellisch [94] G4ProtonInelasticProcess G4ProtonInelasticCrossSectionHadronic LEP G4ProtonInelasticProcess G4LEProtonInelasticinelastic Precompound G4ProtonInelasticProcess G4PreCompoundModelscattering Precompound-GEM G4ProtonInelasticProcess G4PreCompoundModel, G4ExcitationHandlerPrecompound-Fermi break-up G4ProtonInelasticProcess G4PreCompoundModel, G4ExcitationHandlerBinary cascade G4ProtonInelasticProcess G4BinaryCascadeBertini cascade G4ProtonInelasticProcess G4CascadeInterfaceLi`ege G4ProtonInelasticProcess G4InclAblaCascadeInterfaceCHIPS-inelastic G4QInelastic been going on for decades in the literature about differenttheoretical preequilibrium models, respectively based on theso-called “exciton” and “hybrid” approaches [140]. Thesediscussions involve subtle theoretical arguments; however, ithas been acknowledged that, whichever theoretical approachis chosen to model equilibrium emission, the effects of overlysimple or untested assumptions can be compensated by meansof other uncorrelated phenomenological parameterisation ofthe model. This ongoing theoretical debate is not expected tobe relevant to the use case addressed in this study, since thedifferences of the various theoretical approaches are expectedto affect mainly exclusive channels [141], with negligibleeffects on the resulting deposited energy, which is the objectof this paper. The sources of epistemic uncertainties in thesimulation reside in the phenomenological features of thenuclear model implementations, rather than in the choice oftheoretical approach to preequilibrium modeling.The analysis described in the following sections identiﬁessources of epistemic uncertainties in the physics domain of thesimulation and evaluates systematic effects on the simulationoutcome associated with them.IV. S

OFTWARE CONFIGURATION

A. Simulation

The simulation application includes components responsiblefor the conﬁguration of the geometry and materials of theexperimental set-up, the generation of the proton beam, theselection of the physics features to be used in the particletransport, the collection of relevant observables in dedicatedobjects, and the control of the user’s interaction with Geant4kernel at various stages of the execution.The geometry conﬁguration encompasses a realistic modelof a proton therapy beam line and a volume, placed at theexit of it, where the energy deposited by the proton beam isscored. The beam line model exploits the code of the geometryand material composition of a real-life proton therapy facility[18], which is publicly released in the Geant4 hadrontherapy [30] example; the implementation of the beam line geometryas in Geant4 8.1 was used for all the simulation productions.The scoring volume consists of 4 cm cube, ﬁlled with water;its size is adequate to contain the formation of the Braggpeak of energy loss produced by the proton beam. Thisvolume is deﬁned “sensitive” [6] in Geant4 terms. A readoutgeometry, longitudinally segmented in 200 µm thick slices, issuperimposed to the mass geometry of the sensitive volume;the longitudinal segmentation determines the resolution of thesimulation in the location of the Bragg peak, and mimics theresolution of typical measurements with ionization chambersin experimental practice. The energy deposit proﬁle is scoredthrough Geant4 hits objects. The ﬁgures of longitudinal energydeposition proﬁles included in the following sections showthe energy deposited in each slice of the readout geometry byprimary protons and secondary particles, integrated over allthe events in the simulation run.Primary protons are generated according to a Gaussian dis-tribution with 63.95 MeV mean energy and 300 keV standarddeviation, unless differently speciﬁed in the following sections.A variety of physics options is conﬁgured by means ofa software design exploiting Geant4 G4VModularPhysicsList and

G4VPhysicsConstructor as base classes. A class derivedfrom Geant4 kernel’s

G4VModularPhysicsList is responsiblefor driving the physics selection in the simulation application:it assembles a physics conﬁguration by means of subclasses of

G4VPhysicsConstructor , each one responsible for instantiatingthe physics processes and models pertinent to a particle type(or a group of particles, like ions) and an interaction type(electromagnetic, hadronic elastic or hadronic inelastic).The proton physics options and corresponding Geant4classes evaluated in this study are summarized in Table I.For convenience, a subset of

G4VPhysicsConstructor sub-classes, corresponding to the selection in Table II, has beendeﬁned as a reference conﬁguration for this study.In all the productions the interactions of secondary particlesare simulated as in Table II, unless differently speciﬁed in thefollowing sections.

TABLE IIR

EFERENCE PHYSICS CONFIGURATION IN THE SIMULATION

Particles Physics selection

Electrons Low energy electromagnetic package,Photons library-based modelsPositrons Standard electromagnetic packageProtons

G4hLowEnergyIonization , ICRU49 parameterisations

G4UHadronElastic process,

G4HadronElastic model

G4ProtonInelasticProcess , Precompound modelInelastic cross sections as in [94]Neutrons

G4UHadronElastic process,

G4HadronElastic model

G4NeutronInelasticProcess , Precompound model

G4HadronCaptureProcess , LEP model

G4HadronFissionProcess , LEP modelInelastic cross sections as in [94]Deuteron

G4hLowEnergyIonization , ICRU49 parameterisations (scaled)Triton

G4UHadronElastic process,

G4HadronElastic model α Inelastic process speciﬁc to each particle, LEP modelsTripathi and Shen cross sectionsCharged

G4MultipleScattering process

Options regarding the water ionization potential can beintroduced in the simulation by setting the value of thisparameter through the public interface of the

G4Material class.The secondary particle production threshold is set at 1 µ m(in range) in the water volume and 10 µ m in the passivecomponents of the experimental set-up. The results presentedin the following sections are generated with 10 µ m maximumstep size; the step limit is set approximately an order ofmagnitude smaller than the thickness of the slices in thereadout geometry to ensure adequate precision in scoring thelongitudinal energy deposition proﬁle. B. Simulation production

Simulated data were produced by instantiating various com-binations of physics constructor objects in the simulationapplication corresponding to the options listed in Table I.The simulation production concerned a set of representativephysics conﬁgurations, each differing from the reference oneof Table II by one modeling feature only; this strategy allowedthe unambigous attribution of the differences observed in eachsimulation to the physics feature speciﬁcally under investi-gation. One million events were generated for each physicsconﬁguration.Data samples corresponding to all the options listed in TableI were produced with Geant4 9.3, the latest version availableat the time of writing this paper. Data samples correspondingto a subset of physics conﬁgurations were also produced withthree other Geant4 versions: 8.1p02, 9.1 and 9.2p03. Versions8.1p02 and 9.1 were involved in the validation of electronenergy deposition in [69]; Geant4 8.1 was previously usedfor the assessement of some Geant4 physics models in [35](subject to the same beam settings adjustments as in [27]),while Geant4 9.1 is still widely used in production mode byvarious experiments. Geant4 9.2p03 is an updated version ofGeant4 9.2 including corrections; it was released two monthslater than Geant4 9.3.The simulations were run on Intel Core2 Duo ProcessorsE6420, equipped with 2.13 GHz CPU and 4 GB RAM,Scientiﬁc Linux 4 operating system and gcc 3.4.6. compiler.

C. Data analysis

Data analysis objects were handled in the simulation appli-cation code through AIDA [142] abstract interfaces; the iAIDA[143] implementation was used.The results of the various simulation conﬁgurations werecompared to the outcome of the reference one. No normaliza-tion was performed on the Bragg peak proﬁles to be compared(unless explicity stated in the following sections): thereforethe distributions, which originate from the same number ofprimary protons in all the simulation conﬁgurations, allow theappreciation not only of differences in shape and peak location,but also of absolute effects, like the proton acceptance in thesensitive volume and the total energy deposited in it.Within a given Geant4 version, any observed differencein the results is to be ascribed to intrinsic effects of thephysics models activated in the simulation. To the best ofthe authors’ knowledge, the Geant4 transport kernel did notmodify its behavior through the versions considered in thisstudy; differences observed in comparisons across differentGeant4 versions reﬂect the evolution of the physics modelimplementations.The differences associated with the various simulation con-ﬁgurations were quantitatively estimated by means of statis-tical methods; the comparisons exploited non-parametric, un-binned goodness-of-ﬁt tests available in the Statistical Toolkit[144], [145]. Different algorithms were applied to each testcase to evaluate possible systematic effects on the resultsdue to particular features of the tests: the Anderson-Darling[146], [147], Cramer-von Mises [148]–[150] and Kolmogorov-Smirnov [151], [152] tests. The null hypothesis in the testprocess assumed the equivalence between the distributionssubject to comparison. The critical value for the rejection ofthe null hypothesis was set at 0.05, unless differently speciﬁed;in the following sections the expression “95% conﬁdencelevel” is used to indicate 0.05 signiﬁcance of the test.Goodness-of-ﬁt tests compared the whole longitudinal dis-tributions of energy deposition, as well as the distributionscorresponding to the left and right branches of the longitudinalproﬁles, i.e. at penetration depths respectively up to andbeyond the peak position, to ascertain the compatibility ofthe data samples in detail.The comparison results mentioned in the following sectionsconcern the left branch of the energy deposition proﬁles, unlessdifferently speciﬁed. Due to the mathematical features of thetest statistics and empirical distribution functions, the leftbranch of the proﬁles provides the most sensitive test case tohighlight differences in the distributions subject to comparison.V. R

ESULTS

The following sections summarize the main results of theanalysis of the various Geant4 physics modelling options; theyconcern the longitudinal energy deposition proﬁle.The lateral energy deposition pattern is also of interest toproton therapy, and to other experimental applications as well;related major sources of epistemic uncertainties in the simula-tion are the models of multiple scattering and nuclear elasticscattering, and secondary particle production from nuclear

Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 1. Bragg peak proﬁle resulting from electromagnetic interactionsonly (blue dotted curve), electromagnetic interactions and hadronic elasticscattering (solid red curve), electromagnetic, hadronic elastic and inelasticscattering (dashed black curve). The proﬁles were produced with the Geant49.3 physics conﬁguration listed in Table II and subsets of it. The Bragg peakis from one million primary protons, generated with h E i = 63 . MeVmean energy and σ = 300 keV standard deviation incident on water; theplot shows the deposited energy in each slice of the longitudinally segmentedreadout geometry, integrated over all the generated events. interactions. The data samples produced for the study of thelongitudinal energy deposition proﬁle are of inadequate sizeto draw any statistically signiﬁcant conclusions concerning theeffects of epistemic uncertainties on the lateral energy distri-bution patterns; their quantiﬁcation at comparable signiﬁcancelevel would require simulation samples at least two orders ofmagnitude larger for the analysis of the lateral distribution ofenergy deposition than for the longitudinal one. Such a largescale simulation production in a realistic experimental use casewas beyond the practical reach of the limited computationalresources available to the authors; it should be pursued onceadequate computing means are available.Unless differently speciﬁed in the text, the results wereproduced with Geant4 version 9.3. A. Shaping the longitudinal energy deposition

Electromagnetic, hadronic elastic and inelastic interactionscontribute to a different extent to shaping the energy depositionas a function of penetration depth; hadronic interactions areknown [28], [29] to be relevant to the simulation of theproton Bragg peak. The relative contribution was evaluated byactivating partial and full components of the reference physicsconﬁguration in the simulation: electromagnetic interactionsonly, electromagnetic interactions and hadronic elastic scatter-ing, and the full set, including hadronic inelastic processes, asin Table II. Fig. 1 shows the longitudinal energy depositionresulting from the various contributions: the peak depth and theoverall pattern of energy deposit are dominated by the elec-tromagnetic component, while elastic and inelastic hadronicinteractions contribute to modify its shape. Other options ofGeant4 electromagnetic and hadronic models result in similarapportioning among the various physics contributions.

Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 2. Longitudinal energy deposition due to protons (dark shaded area)and electrons (light shaded area); the thick line represents the total energydeposited by all particle species. The proﬁles were produced with the Geant49.3 physics conﬁguration listed in Table II; the electron production thresholdwas equivalent to 250 eV in water. The Bragg peak is from one millionprimary protons with h E i = 63 . MeV and σ = 300 keV incident onwater; the plot shows the deposited energy in each slice of the longitudinallysegmented readout geometry, integrated over all the generated events. The energy deposited in the sensitive volume derives fromprotons, electrons and other particles; the contribution fromthe latter amounts to less than 1%.The relevance of electrons’ contribution to the depositedenergy is related to the electron production threshold set in thesimulation application; the results described in this paper wereproduced with a threshold equivalent to 250 eV in the sensitivewater volume, that is the lowest energy recommended foruse with Geant4 library-based electron and photon processes,and corresponds to the setting in the validation study of[69]. The resulting contribution of secondary electrons to thelongitudinal energy deposit proﬁle is illustrated in Fig. 2.The accuracy of the secondary electron simulation con-tributes to the overall accuracy of the energy depositionderiving from primary protons; epistemic uncertainties in theelectron simulation models may affect the results. For Geant4,the validation of the longitudinal energy deposition of elec-trons in an energy range relevant to this study is documentedin [69].Low-energy electrons are of great importance in therapeu-tical particle beams, since they are very powerful in causinglethal damage to the cells [153]; the accuracy of δ -ray simula-tion is especially important for studies of the biological effectsof proton irradiation.The passage of primary particles through the beam lineaffects the acceptance, i.e. the fraction of primary particlesreaching the sensitive volume, and the characteristics of theparticles entering it.The spectrum of protons at the entrance of the sensitivevolume, after traversing the beam line, is shown in Fig. 3.The proton interactions in the beam line shift the mean of theenergy distribution to a lower value than the nominal beamenergy of 63.95 MeV and broaden its width, originally of 300keV: the peak part of the spectrum in Fig. 3 is well ﬁt (with Energy (MeV) E n t r i es Fig. 3. Proton energy spectrum at the entrance of the sensitive water volume,after traversing the beam line; the primary beam was modelled according toa gaussian distribution, with h E i = 63 . MeV and σ = 300 keV. p-value 1) by a gaussian distribution with 59.823 MeV meanand 376 keV width (standard deviation). The energy spectrumof the protons entering the sensitive volume is characterizedby an extended low energy tail, which affects the plateau ofthe energy deposition proﬁle in the water volume: low energyprotons stop at lower depth in the water volume, producing alocalized large energy deposit typical of the Bragg peak. Thiseffect is highlighted in Fig. 4, which compares the energydeposition proﬁle resulting from the proton spectrum shapedby the beam line to the one produced by the same spectrum,where the tail was suppressed. To suppress the tail, primaryprotons at the entrance of the sensitive volume, whose energydiffered by more than three standard deviations from the59.823 MeV peak value, were not tracked further. A datasample consisting of 280000 primary protons entering thesensitive volume was used for producing the energy depositionproﬁle in Fig. 4; this sample is larger than the ones shown inother ﬁgures to better expose the effects of the low energyproton tail.In the plateau at lower penetration depth the differencebetween the two curves amounts to more than 15%, whilethe shape of the peak is hardly affected. This feature affectsparameters used in clinical practice to evaluate the quality ofthe irradiation, like the peak to entrance ratio. This analysisshows that imprecise knowledge of the beam line geometryand materials can affect the energy deposited in the sensitivevolume; for accurate simulation of the energy deposition in thesensitive water volume, not only accurate modelling of particleinteractions in water is important, but also in the materials ofthe beam transport line.In experimental practice, the features of the particle spec-trum should be taken into account in the choice of the optimaltechnique for the validation of simulation models: for instance,Faraday-cup based dosimetry is more sensitive to the energydistribution of the proton beam than ionization chambers orcalorimeters [154]; the presence of a small admixture oflow-energy scattered protons can lead to signiﬁcant errors inabsorbed dose determination with Faraday cup. [154]- [156]. Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 4. Energy deposition versus penetration depth resulting from the wholespectrum of particles entering the water phantom (solid line) and producedonly by the peak portion of the spectrum of Fig. 3 (dashed line), excludingthe low energy tail. The peak portion of the primary proton spectrum at theentrance has 59.823 MeV mean energy and 376 keV standard deviation;the ﬁgure shows the deposited energy in each slice of the longitudinallysegmented readout geometry, integrated over 280000 primary protons enteringthe sensitive water volume.

The acceptance values calculated with the various physicsmodel combinations listed in Table I are all compatible withinthe statistical uncertainties. Further details about the effects ofphysics models, and their associated epistemic uncertainties,on the determination of the acceptance in the sensitive volumeare examined in section V-G.

B. Water mean ionisation potential

Knowledge of the mean excitation energy of a mediumis needed to calculate the energy loss of a charged particlepenetrating that medium; various theoretical calculations andexperimental measurements are documented in the literatureconcerning this parameter. The value (75 ± eV) recommendedin ICRU Report 49 [54] is commonly used in Monte Carlocodes (e.g. Geant4, FLUKA, MCNPX). Nevertheless, valuesdiffering from this reference have recently been proposed:among them, 80.8 ± eV in [157], based on theoretical andexperimental considerations, 61.77 eV in [158], 79.7 ± ± ICRU49 stopping powermodel was utilized. The lack of consensus about the value ofthis parameter corresponds to an epistemic uncertainty in thesimulation.The effect of the uncertainty of the water ionization potentialwas estimated by performing simulations with values of 75eV (as in the reference physics conﬁguration), 67.2 eV and80.8 eV; apart from this feature, the application activated thephysics conﬁguration summarized in Table II. The longitudinalenergy deposition proﬁles corresponding to different valuesare shown in Fig. 5. A small shift in the depth of the peakis visible; the 67.2 eV and 80.8 eV settings displace the peakto the adjacent readout geometry slices with respect to the Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 5. Bragg peak proﬁle resulting from different water ionisation potentialsand proton beam energies: 75 eV [54] (solid black curve), 67.2 eV (dotted bluecurve), 80.8 eV (dashed red curve) with proton beam energy of 63.95 MeV,and 80.8 eV (dot-dashed green curve) with proton beam energy of 63.65 MeV.The Bragg peak is from one million primary protons with h E i = 63 . MeVand σ = 300 keV incident on water; the plot shows the deposited energy ineach slice of the longitudinally segmented readout geometry, integrated overall the generated events. depth resulting from the water ionization potential set at 75 eV.As described in section IV-A, the resolution in the Braggpeak location achievable in the simulation is 200 µm , whichcorresponds to the longitudinal segmentation of the readoutgeometry.Similar effects were also observed in simulations with theSHIELD-HIT code [161] and with FLUKA [162]; [35] reportsapproximately 1 mm shift between Geant4-based Bragg peaksimulations of 85.6 and 209.2 MeV proton beams, respectivelywith 75 eV and 70.9 eV water mean ionization potential(however, without specifying the longitudinal resolution of thedeposited energy collection).In experimental practice, the ionization potential is usuallytreated as a free parameter in the simulation, which is adjustedto improve the match between experimental and simulateddata (e.g. [163], [164], [166], [167]). It is worth noting thatdifferent optimal values of this parameter were identiﬁed inthe literature to best match the respective experimental data.The experimental environment typical of therapeutical beamlines provides limited insight into this simulation feature,due to the common practice of empirically determining theoptimal beam parameters based on the comparison betweensimulated and observed depth dose proﬁles, as discussed insection I. A test was performed to investigate this issue: twosimulations were executed with different, but equally plausiblesettings - respectively with 63.95 proton beam energy and 75eV ionization potential, and with 63.65 MeV proton beamenergy and 80.8 eV ionization potential; the beam energy shiftis compatible with typical uncertainties of the experimentalenvironment under study. The resulting longitudinal energydeposition proﬁles, shown in Fig. 5, are practically undistin-guishable: the peak positions coincide, and the goodness ofﬁt tests mentioned in section IV-C conﬁrm their compatibilityat 90% conﬁdence level. The comparison with experimental data, whose beam energy is not known a priori with adequateprecision, would not be capable of discriminating the accuracyof such distributions.The systematic effects highlighted by this analysis are rele-vant only when the simulation is expected to play a predictiverole. In common applications, where the simulation is usedonly for veriﬁcation purpose, the empirical adjudstments ofthe water mean ionization potential and of the proton beamparameters mask any potential systematic effects. The exper-imental discrimination of the simulation accuracy resultingfrom different water ionization potential values would requireprecise knowledge of the beam parameters and accurate mea-surements of the energy deposition as a function of depth,such that displacements of the Bragg peak smaller than 200 µm , associated with the value of this parameter, could beappreciated. C. Proton stopping powers

Compilations of proton stopping powers are available in[54], [55], [56], and in the SRIM [168] code. Despite thewide body of experimental data, theoretical calculations andempirical models of proton stopping powers, no consensus hasyet been achieved on deﬁnitely established values. Evaluationsof empirical and theoretical stopping power models reportedin the literature [169], [170] are limited to a few elementsand compounds; they highlight differences among the variouscompilations. According to these analyses, more recent stop-ping power models do not necessarily correspond to improvedaccuracy; some models describe the stopping powers for somematerials well, but appear less accurate for other materials.Due to this controversial situation, proton stopping powersare a source of epistemic uncertainty in the simulation results.A study was performed to evaluate the effects on the Braggpeak proﬁle related to different stopping power models imple-mented in Geant4.The simulations were performed with physics settings as inTable II, apart from conﬁguring the low energy hadron ion-ization process with various stopping power parameterisationmodels. The energy range of application was set accordingto the recommendations of the respective references for the

ICRU49 , Ziegler77 and

Ziegler85 parameterisation models;lacking speciﬁc documentation about the applicability of the

Ziegler2000 parameterisation, this model was applied up to2 MeV, as for

ICRU49 .The results are illustrated in Fig. 6: the various protonstopping power models produce slightly different energy de-position proﬁles; the

Ziegler77 and

Ziegler85 models producealmost identical results. The peaks associated with alterna-tive proton stopping power models are located in adjacentlongitudinal readout geometry slices with respect to the peakproduced by the reference conﬁguration of Table II, including

ICRU49 stopping powers; as described in section IV-A, thelongitudinal readout segmentation is 200 µ m. Apart from theshift in the peak position, the shapes of the energy depositionproﬁles are statistically compatible at 90% conﬁdence levelaccording to all the goodness of ﬁt tests mentioned in sectionIV-C. Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 6. Energy deposition as a function of depth resulting from differentproton stopping power models:

ICRU49 (solid black curve),

Ziegler77 (dash-dotted green curve)

Ziegler85 (dashed red curve) and

Ziegler2000 (dottedblue curve); the proﬁles correspondign to the

Ziegler77 and

Ziegler85 modelsare barely distinguishable in the plot, due to the great similarity of theirsimulation results. The Bragg peak is from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water; the plot showsthe deposited energy in each slice of the longitudinally segmented readoutgeometry, integrated over all the generated events. Similarly to the discussion in the previous section con-cerning the water mean ionization potential, the epistemicuncertainty related to the stopping power model used inthe simulation turns into systematic effects only when thesimulation is required to play a predictive role; otherwise,as shown in the previous case, a small adjustment of theproton beam energy, compatible with typical experimentaluncertainties, would shift the energy deposition proﬁles de-riving from different stopping power models into statisticallyequivalent distributions. These considerations suggest thattypical proton therapy experimental environments would notbe sensitive to the differences of stopping power models, norwould they be adequate to discriminate their accuracy. All theavailable stopping power models appear equally suitable tothat simulation environment; in this respect, one can observethat the use of different Geant4 models has been reportedwith satisfactory agreement against experimental data, despitethe fact that they determine different Bragg peak depths: forinstance,

Ziegler2000 in [31] and

ICRU49 in [27], [30], [33].If predictive capabilities are required from the simulation,higher precision experimental measurements would be neces-sary to discriminate the accuracy of the existing models withthe capability of appreciating shifts in the peak depth smallerthan 200 µm .This context should be taken into account when consideringcomparative evaluations of the accuracy of simulation models:the procedure of empirically adjusting the parameters in thesimulation, based on a selected physics conﬁguration, to bestﬁt the experimental data is prone to bias further comparisonsof other simulation models with the same data. The estimateof the relative accuracy of alternative physics models wouldrequire the capability of comparing the simulation outcometo measurements, without privileging any of the models to Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 7. Energy deposition proﬁles associated with various proton elasticscattering options: ”U-elastic” (solid black curve), Bertini (blue dashed curve),LEP (green dot-dashed curve) and CHIPS (red dotted curve). The Bragg peaksare from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water; the plot shows the deposited energy in each slice of thelongitudinally segmented readout geometry, integrated over all the generatedevents. constrain any simulation parameters. D. Hadronic elastic scattering

Four elastic scattering modeling options available in Geant4were compared: the

G4UHadronElasticProcess with the

G4HadronElastic model, the

G4HadronElasticProcess pro-cess with the

G4LElastic (LEP) or

G4ElasticCascadeInterface (Bertini elastic) models, and the CHIPS

G4QElastic process.The simulation application activated the physics conﬁgura-tion as in Table II for all other features apart from protonelastic scattering. The longitudinal energy deposition proﬁlesresulting from the various simulation conﬁgurations are shownin Fig. 7. The distribution of the relative difference of theenergy deposited in each longitudinal slice of the sensitivevolume with respect to the outcome from the reference con-ﬁguration of Table II is shown in Fig. 8 for the various options;the differences are mostly comprised within ± Relative difference (%) N u m b e r o f e n t r i es Fig. 8. Relative difference of the energy deposited in the longitudinalslices of the sensitive water volume for various elastic scattering conﬁgu-rations, with respect to the results of the conﬁguration as in Table II with

G4UHadronElasticProcess and the

G4HadronElastic model: Bertini (solidblack histogram), LEP (blue dashed histogram) and CHIPS (red dottedhistogram) elastic scattering. The relative difference is calculated in each sliceof the longitudinal segmentation of the readout geometry associated with thesensitive volume. The energy deposition derives from one million primaryprotons with h E i = 63 . MeV and σ = 300 keV incident on water. was applied to evaluate this hypothesis. The resulting p-valueis smaller than 0.001 for all the test cases; therefore onecan infer some systematic effects associated with the choiceof the elastic scattering model in the simulation. It is worthremarking that the conclusion drawn from the Wald-Wolfowitztest does not contradict the result of the goodness-of-ﬁt tests:the two types of tests, respectively evaluating the differencesbetween two distributions in terms of sign and of distance,are complementary. A feature of the energy deposition proﬁleshinting at systematic differences is visible in the vicinity ofthe Bragg peak in Fig. 7, where alternative elastic scatteringoptions appear associated with sequences of energy depositionconsistently larger or smaller than those deriving from theconﬁguration of Table II encompassing the ”U-elastic” elasticscattering model.For the use case under study, the small differences exhibitedby the various simulation models look compatible with theexperimental resolution typical of the application domain(of the order of 2-2.5% [154]); therefore, the peculiaritiesof the models do not affect the outcome of the simulationsigniﬁcantly. Based on these results, one can conclude that atthe present stage all the Geant4 elastic scattering options areequivalent for the use case considered in this study. Validationagainst experimental data concerning the energy range andtarget materials pertinent to this use case would strengthenthe predictive reliability of the simulation. E. Hadronic inelastic cross sections

The proton total inelastic cross sections are an importantparameter in the simulation of therapeutical proton beams,since they determine the amount of proton loss from theprimary beam.

TABLE IIIP-

VALUE OF GOODNESS - OF - FIT TESTS COMPARING HADRONIC ELASTICSCATTERING OPTIONS

Version Range Model Kolmogorov Anderson CramerSmirnov Darling von Mises9.3 Bertini 1 1 1Whole LEP 1 1 1CHIPS 0.997 0.997 0.999Left Bertini 1 1 1branch LEP 1 1 1CHIPS 0.996 0.982 0.999Right Bertini 1 0.986 0.972branch LEP 1 0.986 0.972CHIPS 1 0.986 0.9729.1 Left Bertini 1 1 1branch LEP 0.996 0.989 18.1 Left Bertini 1 0.999 0.997branch LEP 1 1 1

Two conﬁgurations of cross sections wereevaluated in this study: those implemented in

G4HadronInelasticDataSet , originating from GHEISHA,and those implemented in

G4ProtonInelasticCrossSection and

G4NeutronInelasticCrossSection , respectively for protonsand neutrons, covering the energy range above 6.8 MeV.Apart from this feature, all the other physics options in thesimulation were set as in Table II.Both cross sections derive from parameterisations of exper-imental data; it is not clear whether the comparisons availablein the literature concern the calibration of the parameterisationwith experimental data, or represent the cross section modelvalidation.No signiﬁcant dependence on the cross section options isobserved regarding the proton acceptance in the sensitive watervolume, which is affected by the interactions to which primaryprotons are subject in the materials of the beam line.The two sets of cross sections determine some differencein the occurrence of the proton inelastic scattering processassociated with them in the sensitive water volume. Conﬁdenceintervals for this quantity were calculated, using Student’s t distribution, based on the simulation sample activating thecross sections as in [94]. The 99% conﬁdence interval for themean value of hadronic inelastic scattering occurrences liesbetween 1688 and 1849, when one million primary protonsare generated, while the number of occurrences with theGHEISHA-like cross section data set is 1654; this valueis signiﬁcantly different from the mean number of inelasticscattering occurrences determined by the cross sections of[94].Nevertheless, the effect of this difference on the longitudinalenergy deposition appears negligible. The distribution of therelative differences of the energy deposition proﬁles associatedwith the two options is shown in Fig. 9; it is consistent withtypical experimental uncertainties in hadron therapy practice.The longitudinal energy deposition proﬁles resulting from thetwo cross section options, with other physics settings as inTable II, are compatible at 90% conﬁdence level according toall the aforementioned goodness-of-ﬁt-tests.Therefore one can conclude that the characteristics of thetwo hadronic cross section data sets do not affect the simula- Relative difference (%) N u m b e r o f e n t r i es Fig. 9. Percent relative difference of the energy deposition proﬁles resultingfrom the activation of

G4HadronInelasticDataSet hadronic inelastic cross sec-tions or

G4ProtonInelasticCrossSection and

G4NeutronInelasticCrossSection ;all the other simulation options are identical in the two cases, and set as inTable II. The energy deposition derives from one million primary protonswith h E i = 63 . MeV and σ = 300 keV incident on water. The relativedifference is calculated in each slice of the longitudinal segmentation of thereadout geometry associated with the sensitive volume. tion of the proton depth dose proﬁles in the use case consideredin this study. F. Hadronic inelastic scattering models

Several alternative hadronic inelastic scattering modelingoptions were evaluated: the Precompound model, the Bertiniand Li`ege cascade models, the LEP parameterised model andthe CHIPS model. In addition, a few conﬁguration options ofthe nuclear deexcitation phase, accessible through the interfaceof the

G4ExcitationHandler class instantiated by the Precom-pound model, were evaluated together with the Precompoundmodel: the generalized evaporation (GEM) model replacingthe default evaporation one, and the activation of the Fermibreak-up. The Precompound model was evaluated both as astandalone model and as invoked by the Binary cascade model.Proton and neutron interactions were handled consistently ineach simulation conﬁguration by activating the same hadronicinelastic model option for both particles; all the other physicsfeatures were set as in Table II.The longitudinal energy deposition proﬁles produced bythe various hadronic inelastic models in Geant4 9.3 appearvisually undistinguishable; therefore no related ﬁgure is shownin this paper.The energy deposition proﬁle produced with the GEMevaporation model closely resembles the one deriving fromthe default evaporation model instantiated in connection withthe Precompound model, as seen in Fig. 10; this observationis conﬁrmed by the results of the goodness-of-ﬁt tests in TableIV with 0.1 signiﬁcance. Differences related to the use of thetwo models were visible with previous Geant4 versions, asshown in Fig. 10; the GEM implementation was subject toimprovements in Geant4 9.3 [73].The distributions of the secondary particles produced inassociation with the two evaporation models look consistent,

Relative difference N u m b e r o f e n t r i es Fig. 10. Relative difference of longitudinal energy deposition proﬁlesassociated with the GEM evaporation model, with respect to the “referencephysics conﬁguration”: simulation based on Geant4 9.3 (solid line) and 9.1(dashed line). The observed systematic effect is related to the correction ofa software feature in Geant4 9.3. The energy deposition derives from onemillion primary protons with h E i = 63 . MeV and σ = 300 keV incidenton water. The relative difference is calculated in each slice of the longitudinalsegmentation of the readout geometry associated with the sensitive volume. compatible with statistical ﬂuctuations; the secondary protondistributions are shown in Fig. 11 as an example. The lackof visible effects does not necessarily mean that these twomodels are characterized by identical features; rather, it showsthat the use case under study is not sensitive to their possibledifference.From this analysis one can conclude that the evaporationmodel options are equivalent for the simulation of the longi-tudinal energy deposition; as documented in section VI, theGEM model is computationally faster than the default one inthe application under study.The evaporation process of nuclear deexcitation is basedon the hypothesis that the excitation energy is high andapproximately equally distributed among the nucleons: thisassumption is justiﬁed for heavy nuclei, but it is not applicableto the water target considered in this use case. It is generallyaccepted that the Fermi break-up represents a more appropriatetheoretical description of the nuclear deexcitation process forlight nuclei: in MCNPX and FLUKA the Fermi break-upis applied to nuclei with atomic mass up to 17, whereasevaporation is applied to heavier nuclei; in Geant4 it is notinvoked by default in the deexcitation of light nuclei, althoughthe public interface of G4ExcitationHandler allows users tomodify the default settings.The effects of the Fermi break-up were evaluated by activat-ing it in the simulation for nuclei with atomic number smallerthan 10; they are visible in the spectrum of the producedsecondaries, with respect to those produced by nuclear de-excitation proceeding through evaporation. From a theoreticalperspective, the application of an evaporation model to thedeexcitation of light nuclei is expected to overestimate theproduction of secondary protons in the lower energy range;this effect is indeed observed in Fig. 11. The activation of theFermi break up affects the longitudinal energy deposition, with Energy (MeV) E n t r i es Fig. 11. Energy spectrum of secondary protons produced with differentconﬁgurations of the Precompound model: default conﬁguration as in Ta-ble II (black circles), conﬁguration with GEM evaporation (red squares),conﬁguration activating Fermi break up (blue triangles) and conﬁgurationactivating the Binary Cascade model (white crosses), which in turn invokesthe Precompound model to handle the preequilibrium phase. The secondaryspectra derive from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. Relative difference (%) N u m b e r o f e n t r i es Fig. 12. Relative difference of the longitudinal energy deposition proﬁlederiving from the activation of the Fermi break-up, with respect to the proﬁlederiving from the default conﬁguration of the Precompound model; all theother settings are as in Table II. The energy deposition derives from onemillion primary protons with h E i = 63 . MeV and σ = 300 keV incidenton water. The relative difference is calculated in each slice of the longitudinalsegmentation of the readout geometry associated with the sensitive volume. respect to that resulting from the default nuclear deexcitationsettings: their relative difference, shown in Fig. 12, exhibits anasymmetric distribution shifted towards negative values. Thiseffect hints at a systematic contribution of the Fermi break-upto decrease the longitudinal energy deposition; nevertheless,the observed differences are consistent with typical uncertain-ties in experimental proton therapy practice.A similar asymmetry in the longitudinal energy depositiondifference is observed when the preequilibrium phase is associ-ated with intranuclear transport; this effect is shown in Fig. 13,which concerns two simulations involving the Precompoundmodel, respectively as an independent model and invoked Relative difference (%) N u m b e r o f e n t r i es Fig. 13. Relative difference of the longitudinal energy deposition proﬁlederiving the activation of the Precompound model either as a standalonemodel, or as invoked by the Binary cascade model. The energy depositionderives from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. The relative difference is calculated in eachslice of the longitudinal segmentation of the readout geometry associated withthe sensitive water volume. through the Binary Cascade model. The transition between thecascade process of intranuclear transport and the preequilib-rium is determined by empirical considerations [103], whichare speciﬁc to each software implementation: for instance, inGeant4 Binary cascade model cascading continues as long asthere are particles above a 70 MeV kinetic energy threshold[103] (along with other conditions required by the algorithm),while a smooth transition around 50 MeV is implementedin FLUKA [10]. The empirical features of the algorithmcorrespond to lack of knowledge from physical principles todetermine the transition between the two r´egimes; the analysisshows that this epistemic uncertainty can be a source ofsystematic effects, such as the bias of the distribution in Fig.13. This effect, which is of a magnitude comparable to typicalexperimental uncertainties in hadron therapy measurements,could be signiﬁcant in use cases where precise predictivepower is expected from the Monte Carlo simulation.No such asymmetries, with respect to simulating the pree-quilibrium phase only (as in the Precompound model), areobserved with two other conﬁgurations involving intranuclearcascade models - the Bertini and Li`ege ones. It is worthremarking that the Li`ege model does not involve a preequilib-rium phase at all, while the Bertini cascade model does. Theapparent absence of consistent trends related to the adoptedphysical approach (modeling intranuclear cascade, preequilib-rium and their interplay) suggests that the observed behavior ofthe code may be inﬂuenced by other implementation details ontop of the basic physics modeling approach; this considerationadds further complexity to the effort of identifying the sourcesof epistemic uncertainty in the simulation, which is a necessarystep towards their reduction or their control.The relative differences of the energy deposition proﬁlesconcerning other hadronic inelastic models with respect to theoutcome deriving from the Precompound one are illustrated inFig. 14. Relative difference (%) N u m b e r o f e n t r i es Fig. 14. Relative difference of the energy deposited in the longitudinalslices of the sensitive water volume associated with various hadronic inelasticoptions, with respect to the conﬁguration of Table II encompassing thePrecompound model: Bertini (solid black histogram), LEP (blue dashedhistogram), Li`ege (green dot-dashed histogram) and CHIPS (red dottedhistogram). The energy deposition derives from one million primary protonswith h E i = 63 . MeV and σ = 300 keV incident on water. All the ﬁnal state models of hadronic inelastic scatteringproduce statistically compatible results at 90% conﬁdencelevel in the considered use case, as shown in Table IV.The Wald-Wolfowitz test concerning the difference with re-spect to the reference physics conﬁguration results in a p-valuesmaller than 0.001 for all the considered modeling options,with the exception of the comparison involving the Li`egecascade model, for which the p-value is 0.360. These resultssuggest the presence of some systematic effects due to thechoice of the hadronic inelastic models in the simulation; someasymmetries and bias with respect to zero are indeed visible inthe distributions in Fig. 14, namely the one concerning the LEPinelastic model, apart from the previously discussed effects inFig. 12 and 13.Like the results discussed in section V-D, this result suggeststhat the selection of the hadronic inelastic model activatedin the simulation can be source of systematic effects. Nev-ertheless, the differences concerning the longitudinal energydeposition patterns appear compatible with typical experi-mental uncertainties in proton therapy dosimetry; thereforethe systematic effects identiﬁed by the Wald-Wolfowitz testwould be negligible in that software application context. Theycould become relevant in use cases where higher accuracy isdemanded.The implementation of the Geant4 Precompound model wassubject to improvements [38] in Geant4 9.2. Nevertheless,these modiﬁcations do not appear to have affected the featuresof the longitudinal energy deposition pattern signiﬁcantly,since the goodness-of-ﬁt tests in Table IV show that the energydeposition proﬁles associated with this model were compatiblewith those deriving from other hadronic inelastic models inprevious Geant4 versions, as well as in the 9.3 one. Thisremark is relevant to previous applications of the Precompoundmodel to the use case under study, which are archived in theliterature.

Energy (MeV) E n t r i es Fig. 15. Secondary proton spectrum resulting from various hadronic inelasticmodels: Precompound (black circles), Bertini (red squares), LEP (whitecrosses), Li`ege (blue triangles) and CHIPS (green stars). The secondaryparticle spectrum derives from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. Nevertheless, despite their similarity at determining thelongitudinal energy deposition proﬁle, some hadronic inelasticmodels exhibit very different characteristics regarding the sec-ondary particles they generate: the secondary proton, neutronand α particle spectra are shown in Fig. 15 to 17. Radiotherapyapplications can be affected by secondary particles within thetarget volume and outside it, both laterally and beyond thedistal edge of the Bragg peak [155]; concerns for the risksdue to secondary particles in proton therapy are discussedin the literature [173]. The analysis documented in the pre-vious paragraphs shows that the different secondary particleproduction patterns do not produce signiﬁcant effects on thelongitudinal energy deposition proﬁle; the quantiﬁcation ofpossible effects on the lateral energy deposition pattern wouldrequire substantially larger data samples, which were notachievable with the limited computational resources availableto the authors in the course of the project documented in thispublication, and is outside the scope of this paper.The identiﬁcation of actual systematic effects related tohadronic inelastic models, and their quantitative estimate,would require experimental measurements with adequate ac-curacy to discriminate not only the features of the energydeposition distribution, but also the characteristics of thesecondary particles they produce. G. Multiple Coulomb scattering

The conﬁguration of proton multiple scattering simulationin a Geant4 application involves the selection of the multiplescattering process and models to be activated, and settingsome parameters used by the multiple scattering algorithm.Default options are provided in Geant4 kernel for the modeland parameters associated to multiple scattering processes;they are summarized in Table V for a set of recent Geant4versions.The analysis evaluated whether recent evolutions of thecode between Geant4 8.1 and 9.3 versions, which involve TABLE IVP-

VALUE OF GOODNESS - OF - FIT TESTS COMPARING HADRONIC INELASTIC SCATTERING OPTIONS

Geant4 Test Hadronic Kolmogorov Anderson Cramerversion range model Smirnov Darling von Mises9.3

Bertini 1 1 1LEP 0.954 0.988 0.984Whole Li`ege 1 1 1CHIPS 1 1 1GEM 1 1 1Fermi break-up 1 1 1Binary 0.954 0.938 0.973Bertini 1 1 1Left LEP 0.945 0.961 0.979branch Li`ege 1 1 1CHIPS 1 1 1GEM 1 1 1Fermi break-up 1 1 1Binary 0.945 0.858 0.962Bertini 1 0.986 0.972Right LEP 1 0.986 0.972branch Li`ege 1 0.986 0.972CHIPS 1 0.986 0.972GEM 1 0.986 0.972Fermi break-up 1 0.986 0.972Binary 1 0.986 0.972

Left Bertini 0.981 0.901 0.980branch LEP 0.945 0.949 0.937

Left Bertini 1 1 1branch LEP 0.996 0.814 0.847TABLE VD

EFAULT SETTINGS OF RELEVANT MULTIPLE SCATTERING PROCESSES

Geant4 Process Range Step Lateral skin Geometry Modelversion Factor Limit Displacement Factor8.1 G4MultipleScattering 0.02 1 Urban9.1 G4MultipleScattering 0.02 1 1 0 2.5 Urban9.2p0.3 G4MultipleScattering 0.02 1 1 3 2.5 Urban9.3 G4MultipleScattering 0.04 1 1 3 2.5 Urban929.3 G4hMultipleScattering 0.2 0 1 3 2.5 Urban90 different model and parameter settings, could be the sourceof systematic effects in Geant4-based simulations for protontherapy applications, originating from epistemic uncertain-ties in the simulation model. Two issues were addressed:the effects related to different multiple scattering processes,

G4MultipleScattering and

G4hMultipleScattering , and thosedue to changes in the

G4MultipleScattering process since the8.1 release. It should be remarked that in the following analysisthe behaviour associated with multiple scattering may resultnot only directly from the implementation of the two abovementioned classes, but also from behavior inherited from theirbase classes or acquired through aggregation of, or dependenceon, other classes, as determined by the software design ofGeant4 multiple scattering domain. The results are reportedfor Geant4 versions 8.1p02, 9.1, 9.2p03, and 9.3.Two multiple scattering processes,

G4hMultipleScattering and

G4MultipleScattering , are applicable to protons in Geant49.3. The former was ﬁrst introduced in Geant4 8.2 to pro-vide faster simulation of hadron transport; the latter com-plies with an earlier class interface and is planned to bewithdrawn from later releases. In Geant4 9.3 the com-mon base class

G4VMultipleScattering accounts for publicmember functions formerly speciﬁc to

G4MultipleScattering . G4hMultipleScattering can be conﬁgured to acquire equiva- lent behavior to

G4MultipleScattering by applying the samesettings (model and parameters) as in

G4MultipleScattering listed in Table V. For convenience, the conﬁguration of

G4hMultipleScattering equivalent to

G4MultipleScattering isstill indicated as

G4MultipleScattering in the following.Only the default settings listed in Table V were tested;the large effects related to epistemic uncertainties observedin this limited interval analysis, which are documented in thefollowing, suggest that this complex problem domain wouldbeneﬁt from a larger-scale dedicated study beyond the scopeof this paper.To acquire sound evidence of effects associated with multi-ple scattering settings, the comparisons were performed overﬁve physics conﬁgurations: the set of processes and models(apart from multiple scattering) as in Table II, and variantsof it consisting of “LEP” and “Bertini” inelastic scatteringtogether with “U-elastic” elastic scattering, and “LEP” and“Bertini” elastic scattering together with the Precompoundhadronic inelastic model. Common effects observed in suchan extended set of test cases could be reasonably associatedwith the multiple scattering domain, excluding their possibleorigin from intrinsic features of a single physics conﬁguration.The resulting longitudinal energy deposition distributionsassociated with proton multiple scattering options are shown Energy (MeV) E n t r i es Fig. 16. Secondary neutron spectrum resulting from various hadronicinelastic models: Precompound (black circles), Bertini (red squares), LEP(white crosses), Li`ege (blue triangles) and CHIPS (green stars). The secondaryparticle spectrum derives from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. Energy (MeV) E n t r i es Fig. 17. Secondary α particle spectrum resulting from various hadronicinelastic models: Precompound (black circles), Bertini (red squares), LEP(white crosses) and CHIPS (green stars); no α particles appear to be producedby the Geant4 implementation of the Li`ege model. The secondary particlespectrum derives from one million primary protons with h E i = 63 . MeVand σ = 300 keV incident on water. in Fig. 18. The two multiple scattering processes producevisibly different longitudinal energy deposition proﬁles; theextent of the differences can be quantitatively appraised inFig. 19, which shows the variation of the longitudinal energydeposition proﬁles simulated with G4hMultipleScattering inGeant4 9.3 with respect to equivalent simulations performedwith

G4MultipleScattering in Geant4 9.3, 9.1 and 8.1p02. Theplot shows results produced with the Bertini hadronic inelasticoption; similar results are obtained with the other physicsconﬁgurations subject to comparison. The energy depositionproﬁle of Geant4 9.2p03 is not shown in Fig. 19 to avoidclogging the plot with many curves; it lies in between theproﬁles produced by Geant4 9.3 with

G4hMultipleScattering and Geant4 9.1.The results of goodness-of-ﬁt tests comparing longitudinal

Depth (mm) D e po s i t e d e n e r g y ( G e V ) Fig. 18. Bragg peak proﬁle resulting from different multiple scatteringprocesses and Geant4 versions:

G4hMultipleScattering (black, thick solidline) in Geant4 9.3,

G4MultipleScattering in Geant4 9.3 (dashed, green line),Geant4 9.2p03 (pink, thin solid line), Geant4 9.1 (dotted, blue line) and Geant48.1p02 (dash-dotted, red line) The same physics conﬁguration was activated inall the simulations, apart from the multiple scattering setting. The Bragg peakis from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water; the plot shows the deposited energy in each slice of thelongitudinally segmented readout geometry, integrated over all the generatedevents. Relative difference (%) N u m b e r o f e n t r i es Fig. 19. Relative difference of the energy deposited in the longitudinal slicesof the sensitive water volume associated with different multiple scatteringprocesses and Geant4 versions; the difference is calculated in each longitudinalslice of the readout geometry with respect to a reference conﬁgurationwith

G4hMultipleScattering in Geant4 9.3 for identical conﬁgurations with

G4MultipleScattering in Geant4 9.3 (solid black histogram), 9.1 (dashedblue histogram), 9.2p03 (dotted pink histogram) and 8.1p02 (dash-dotted redhistogram) versions. The reference conﬁguration is as in Table II, except forthe hadronic inelastic scattering option (Bertini instead of Precompound); thisreplacement is due to the greater stability of the Bertini code across the variousGeant4 versions, nevertheless all other physics conﬁgurations produce similarresults. The energy deposition derives from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. energy deposition distributions associated with either protonmultiple scattering process are summarized in Table VI. Thelongitudinal energy deposition distributions associated with G4hMultipleScattering are incompatible at 99.9% conﬁdencelevel with those produced by Geant4 versions 8.1p02 and9.3 with

G4MultipleScattering , and, apart from one test caseinvolving the Kolmogorov-Smirnov test, at 99% conﬁdencelevel with those produced by Geant4 9.1. Regarding the com-parison with the proﬁles generated with Geant4 9.2p03, theAnderson-Darling test rejects the hypothesis of compatibilitywith the proﬁles produced with

G4hMultipleScattering at 95%conﬁdence level, the Kolmogorov-Smirnov does not rejectit, while the Cramer-von Mises rejects it in two physicsconﬁgurations and does not reject it in the other three conﬁgu-ration. The Anderson-Darling and Cramer-von Mises tests areconsidered more powerful than the Kolmogorov-Smirnov test;the Anderson-Darling test is especially sensitive to fat tails[176].

G4MultipleScattering was responsible of the multiplescattering process in these earlier Geant4 versions.It is worth mentioning that [35] shows comparisons ofexperimental and simulated proton energy deposition pro-ﬁles normalized to the number of protons in the beam; thereported simulations were performed with Geant4 8.1p01version. The small plots in logarithmic scale and the limi-tation of the comparisons to qualitative appraisal prevent thereader from understanding whether the different behaviour of

G4hMultipleScattering , and of the

G4MultipleScattering classin later Geant4 versions, with respect to the multiple scatteringimplementation of Geant4 8.1, would affect the compatibilitywith the experimental data of [35].Visible differences are also observed in Fig. 20 con-cerning the energy deposition proﬁles associated with

G4MultipleScattering settings over the various versions. Thetotal energy deposition shown in Fig. 21 exhibits evidentdifferences associated with the various settings.The Geant4 Low Energy Electromagnetic package, used inall the simulations, was subject to conﬁguration and ChangeManagement discipline [174] based on the Uniﬁed SoftwareDevelopment Process [175] framework until Geant4 release9.1; the adopted software process ensured that the softwareof this package relevant to the use case under study didnot undergo modiﬁcations between the 8.1p02 and 9.1 pro-duction versions, which could alter its physical behaviour.The same implementations of the low energy electromagneticprocesses were used in the simulations based on Geant4 9.2and 9.3; therefore, it is plausible that variations observed inthe simulation productions based on different Geant4 versionsare associated with evolutions in other Geant4 domains. Theextent of the differences observed when comparing two Geant4versions appears to be approximately the same over all thehadronic physics conﬁgurations activated in the simulation;since the occurrence of coherent modiﬁcations to all theGeant4 hadronic elastic and inelastic scattering implemen-tations is not likely, this observation suggests that coherentdifferences would derive from modiﬁcations either to Geant4transport kernel or to the multiple scattering domain, whichare common to all the simulations. Major changes to Geant4kernel have not been documented over the considered versions;

Relative difference (%) N u m b e r o f e n t r i es Fig. 20. Relative difference of the energy deposited in the longitudinal slicesof the sensitive water volume associated with the

G4MultipleScattering inGeant4 9.2p03, with respect to identical simulation settings in Geant4 9.3(solid black histogram), Geant4 9.1 (dashed blue histogram), and Geant48.1p02 (dotted red histogram). The reference conﬁguration is as in TableII, except for the hadronic inelastic scattering option (Bertini instead ofPrecompound); this replacement is due to the greater stability of the Bertinicode across the various Geant4 versions, nevertheless all other physicsconﬁgurations produce similar results. The energy deposition derives fromone million primary protons with h E i = 63 . MeV and σ = 300 keVincident on water. the multiple scattering domain, which was subject to evolution,appears the most likely source of the observed discrepancies.Their origin is probably from epistemic uncertainties in thesimulation models, whose validation in the energy rangerelevant to this use case is scarcely documented in literature.The differences concerning multiple scattering settings inthe various Geant4 versions are signiﬁcant. The 99.9% and99% conﬁdence intervals for the mean value of the total energydeposition deriving from G4hMultipleScattering in Geant4 9.3are shown in Fig. 21; the values deriving from Geant4 versions8.1p02, 9.1 and 9.2p03 fall outside the 99.9% conﬁdenceinterval.The results of goodness-of-ﬁt tests are reported in TableVI. In most test cases the longitudinal energy depositiondistributions produced with Geant4 9.3 are incompatible withthose produced with Geant4 9.1 at 95% conﬁdence level;in a few cases the test statistic results in p-values close tothe critical region for 0.05 signiﬁcance. The null hypoth-esis of equivalence of the distributions subject to test isnot rejected, with the same signiﬁcance, in the comparisonsinvolving Geant4 9.3 and 8.1p02 versions. This quantitativeresult is consistent with the qualitative appraisal of Fig. 18,where the energy deposition proﬁle deriving from Geant4 9.3appears closer to the one produced with the earlier 8.1p02version. The energy deposition proﬁles produced with Geant49.2p03 are incompatible with those produced with Geant49.3 (with

G4MultipleScattering ) and Geant4 8.1p02 with 0.05signiﬁcance, while the goodness-of-ﬁt tests fail to reject thehypothesis of compatibility with the proﬁles produced withGeant4 9.1 with 0.05 signiﬁcance. The longitudinal energydeposition distributions produced with Geant4 9.1 and 8.1p02are incompatible with 0.05 signiﬁcance. e V ) ene r g y ( G e V ) po s i t ed ene r g y ( G e V ) T o t a l depo s i t ed ene r g y ( G e V ) T o t a l depo s i t ed ene r g y ( G e V ) UelasticLiège Uelastic CHIPS Uelastic Bertini Uelastic LEP UelasticPrecomp Uelastic PrecompGEM BertiniPrecomp LEP Precomp CHIPS Precomp T o t a l depo s i t ed ene r g y ( G e V ) Hadronic physics models

UelasticLiège Uelastic CHIPS Uelastic Bertini Uelastic LEP UelasticPrecomp Uelastic PrecompGEM BertiniPrecomp LEP Precomp CHIPS Precomp T o t a l depo s i t ed ene r g y ( G e V ) Hadronic physics models

Fig. 21. Total energy deposition in the sensitive volume associated withvarious Geant4 versions and physics conﬁgurations: Geant4 9.3 (red squares),Geant4 9.2p03 (pink diamonds), Geant4 9.1 (blue circles) and Geant4 8.1p02(green triangles); the ﬁlled symbols correspond to simulations activatingthe

G4MultipleScattering multiple scattering process, while the empty onescorrespond to the activation of

G4hMultipleScattering . The upper and lowerlines of the horizontal axis identify respectively the hadronic elastic andinelastic scattering model in each simulation conﬁguration; the other physicsoptions, apart from the multiple scattering under test, were as in Table II. Thedashed and dotted lines represent respectively the 99.9% and 99% conﬁdenceintervals for the mean value of the total deposited energy over variousGeant4 9.3 physics conﬁgurations associated with

G4hMultipleScattering .The total energy deposition derives from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. The simulations with the two multiple scattering processesand with different Geant4 versions produce a signiﬁcantlydifferent total energy deposition in the sensitive volume.The results are shown in Fig. 21; the dashed and dottedlines in the plot represent respectively the 99.9% and 99%conﬁdence intervals for the average energy deposition with

G4hMultipleScattering over all Geant4 9.3 physics conﬁgura-tions.The absolute value of the energy deposition is relevant toapplications where knowledge of the actual dose released to atarget is critical, like oncological treatment planning, radiationprotection or radiation damage estimate. The observed differ-ences would be signiﬁcant in use cases where the simulationhas a predictive role: differences greater than 10% in the dosereleased to a patient, like the effects observed with the variousmultiple scattering implementations released in Geant4, wouldbe important in clinical applications. These use cases wouldnot be limited to the bio-medical application domain; forinstance, the use of Monte Carlo simulation to study thedamage to electronic components exposed to radiation wouldrequire precise estimate of the released dose.The average energy deposition per proton in the sensitivevolume, and the ratio between the energy deposited at the peaklocation and at the entrance of the sensitive volume are approx-imately the same for all the physics conﬁgurations and Geant4versions, as illustrated in Fig. 22 and 23. The 95% conﬁdenceinterval for the mean value deriving from Geant4 9.3 with

G4hMultipleScattering is shown in the ﬁgures to appreciatequantitatively the spread of the results. These observations

57 057.257.457.657.858.058.2 depo s i t ed ene r g y ( G e V ) A v e r age depo s i t ed ene r g y ( G e V ) Hadronic physics models

Fig. 22. Average energy deposition per proton entering the sensitive volumecalculated with various Geant4 versions and physics conﬁgurations: Geant49.3 (red squares), Geant4 9.2p03 (pink diamonds), Geant4 9.1 (blue circles)and Geant4 8.1p02 (green triangles); the ﬁlled symbols correspond to simu-lations activating the

G4MultipleScattering multiple scattering process, whilethe empty ones correspond to the activation of

G4hMultipleScattering . Theupper and lower lines of the horizontal axis identify respectively the hadronicelastic and inelastic scattering model in each simulation conﬁguration; theother physics options, apart from the multiple scattering under test, wereas in Table II. The dotted lines represent the 95% conﬁdence interval forthe mean value of the various Geant4 9.3 physics conﬁgurations associatedwith

G4hMultipleScattering . The average energy deposition derives from onemillion primary protons with h E i = 63 . MeV and σ = 300 keV incidenton water. suggest that the detailed features of the energy deposition inthe water volume are insensitive to the physics options selectedin the simulation, including multiple scattering, and to theevolutions of Geant4 software.The acceptance, i.e. the fraction of protons reaching thesensitive volume, out of all the primary generated ones, isplotted in Fig. 24 for different physics conﬁgurations andGeant4 versions. Various sources can affect it: inelastic nuclearreactions, which remove protons from the beam prior toreaching the sensitive volume, and nuclear elastic and multipleCoulomb scattering, which modify the protons’ direction alongwith their passage through matter.The acceptance appears roughly constant in Fig. 24 forthe various hadronic models, within the set of simulationsassociated with a given multiple scattering option and Geant4version; therefore, the features of these models can be ex-cluded as a source of signiﬁcant differences. Complementarytests, whose results are not reported in Fig. 24, show that theacceptance is not signiﬁcantly sensitive to alternative stoppingpower models either. The multiple scattering algorithm appearsthe most probable source of the observed differences.Fig. 21 and 24 suggest a correlation between the totalenergy deposited in the sensitive volume and the acceptance.This effect was evaluated by means of Pearson’s correlationcoefﬁcient [177]; the null hypothesis consists of assuming nocorrelation between these quantities. The correlation coefﬁ-cient, calculated over all physics conﬁgurations and Geant4versions examined in this study, is 0.965; the null hypothesisis rejected with 0.0001 signiﬁcance. On the other hand, no e r g y o s i t ed ene r g y n c e depo s i t ed ene r g y ea k / en t r an c e depo s i t ed ene r g y P ea k / en t r an c e depo s i t ed ene r g y P ea k / en t r an c e depo s i t ed ene r g y Hadronic physics models P ea k / en t r an c e depo s i t ed ene r g y Hadronic physics models

Fig. 23. Ratio of the energy deposition at the Bragg peak location and atthe entrance of the sensitive volume, deriving from various Geant4 versionsand physics conﬁgurations: Geant4 9.3 (red squares), Geant4 9.2p03 (pinkdiamonds), Geant4 9.1 (blue circles) and Geant4 8.1p02 (green triangles); theﬁlled symbols correspond to simulations activating the

G4MultipleScattering multiple scattering process, while the empty ones correspond to the activationof

G4hMultipleScattering . The upper and lower lines of the horizontal axisidentify respectively the hadronic elastic and inelastic scattering model in eachsimulation conﬁguration; the other physics options, apart from the multiplescattering under test, were as in Table II. The dashed lines represent the 95%conﬁdence interval for the mean value of the various Geant4 9.3 physicsconﬁgurations associated with

G4hMultipleScattering . The energy depositionderives from one million primary protons with h E i = 63 . MeV and σ =300 keV incident on water. correlation of the total energy deposition is observed with theaverage energy deposited per proton, nor with the peak overentrance ratio: the corresponding correlation coefﬁcients are0.120 and 0.151; these values lead to not rejecting the nullhypothesis with 0.1 signiﬁcance.These results hint that the observed discrepancies in thelongitudinal energy deposit distributions are related to effectsdue to multiple scattering in the beam line, rather than tophysics modeling effects in the water volume. Geant4 multi-ple scattering implementation encompasses various empiricalparameters [74], whose settings are characterized by epistemicuncertainties; presumably, the observed effects are associatedwith different angular distribution (including backscattering)and lateral displacement of the scattered particle implementedin the various Geant4 multiple scattering options and versions,and the variations of empirical parameters governing thealgorithm.This ﬁnding stresses the importance of accurately modelingthe beam line geometry and material composition for accuratecalculation of the energy deposited in the sensitive volume. Italso highlights the importance of correctly simulating particleinteractions not only in the sensitive parts of the experimentalset-up, but also in its passive components, since the latterappear to be responsible for signiﬁcant systematic effects onthe energy deposited in the sensitive volume.In hadron therapy practice, proton depth dose proﬁles areusually normalized to a reference value (at the peak or atthe entrance of the sensitive volume), or to the integral of c e ( % ) A cc ep t an c e ( % ) A cc ep t an c e ( % ) A cc ep t an c e ( % ) A cc ep t an c e ( % ) Hadronic physics models A cc ep t an c e ( % ) Hadronic physics models

Fig. 24. Percentage of primary protons (acceptance) reaching the sensitivevolume deriving from various Geant4 versions and physics conﬁgurations:Geant4 9.3 (red squares), Geant4 9.2p03 (pink diamonds), Geant4 9.1 (bluecircles) and Geant4 8.1p02 (green triangles); the ﬁlled symbols correspond tosimulations activating the

G4MultipleScattering multiple scattering process,while the empty ones correspond to the activation of

G4hMultipleScattering .The upper and lower lines of the horizontal axis identify respectively thehadronic elastic and inelastic scattering model in each simulation conﬁgu-ration; the other physics options, apart from the multiple scattering undertest, were as in Table II. The dashed and dotted lines represent respectivelythe 99.9% and 99% conﬁdence intervals for the mean value of the variousGeant4 9.3 physics conﬁgurations of associated with

G4hMultipleScattering .The simulation involves from one million primary protons with h E i = 63 . MeV and σ = 300 keV incident on water. the dose, as discussed in [44]. The same goodness-of-ﬁt testsreported in Table VI were performed after normalizing theenergy deposition proﬁles to the total energy collected inthe sensitive volume: they failed to reject the null hypothesisof compatibility with 0.1 signiﬁcance, which was rejected inthe comparison of the original (non normalized) distributions.This analysis demonstrates that normalized distributions areinsensitive to the large differences exhibited by the variousmodels on an absolute scale. Therefore, comparisons like theone in Fig. 7 of [67], concerning an experimental Bragg peakand one simulated with Geant4 9.0, both normalized to 1, areof limited usefulness to clarify the issues that emerged in theprevious analysis.Further tests were performed, activating the speciﬁc G4eMultipleScattering process for electron multiple scattering,released in Geant4 9.3; no effects were observed on thelongitudinal energy deposition.The authors found only limited documentation in the liter-ature concerning the experimental validation of proton mul-tiple scattering implementations in recent Geant4 versions;the comparisons with experimental data reported in [178]concern muons and electrons and are not pertinent to theuse case object of this investigation, which concerns protons.Therefore, this process is a source of epistemic uncertainty inthe simulation; the analysis described in this paper shows thatthis uncertainty could determine large systematic effects incritical use cases. Further experimental measurements wouldbe useful for the validation of Geant4 multiple scatteringin use cases similar to the one considered in this paper; inparticular, experimental data suitable to clarify the interplay between energy deposition measurements and features of themultiple scattering algorithm would be beneﬁcial. Ideally,the experiment should be able to measure effects relatedto backscattering and lateral displacement, which could beresponsible for the discrepancies in the proton acceptance ob-served with the various algorithm implementations examinedin this paper.VI. C OMPUTATIONAL PERFORMANCE

The extensive survey of physics models and parametersrelevant to the problem domain documented in the previoussections provides guidance for Geant4-based simulations con-cerning similar use cases. The computational performance ofthe available physics options is a relevant parameter in simu-lation applications, especially considering that some previousanalyses demonstrate the equivalence of some of them regard-ing the physical features they produce. Therefore the analysisis complemented here by some information on the associatedcomputational performance in the use case described in thispaper, which can be useful to experimentalists in their Geant4-based applications.The results reported in Table VII, related to Geant4 9.3version, show the average simulation time per primary gener-ated event in each physics conﬁguration; they derive from theproductions for the analysis described in the previous sections.The content of Table VII should not be considered as mea-surements of Geant4 computational performance in absoluteterms: the application code contained analysis features, such asﬁlling a large number of histograms, which added an additionalburden to the execution with respect to the time strictly neededfor particle transport; moreover no effort was invested inthe optimization of the application code. However, since allthe simulations reported in that table were run on identicalhardware and platforms, the measured execution times areinteresting for relative comparisons of the computational per-formance of the various physics conﬁgurations in the use caseobject of this study.The results reported in Table VII involve the

ICRU49 proton stopping power model; simulations involving the

Ziegler77 , Ziegler85 and

Ziegler2000 models are slightlyslower. Simulations involving

G4hMultipleScattering requireapproximately 5% more CPU time than those involving

G4MultipleScattering ; however, the larger acceptance asso-ciated with this multiple scattering model requires longercomputations to track a greater number of particles in thesensitive volume.It is worth remarking that accounting for nuclear interac-tions in the simulation application described in this paper in-creases the computational time consumption by approximately57%, with respect to considering electromagnetic interactionsonly.Based on Table VII, one can observe that the hadronicelastic scattering models exhibit similar computational per-formance, with the exception of the CHIPS model, whichis signiﬁcantly slower; among the hadronic inelastic models,the Li`ege cascade and the LEP ones are faster than the otheroptions.

TABLE VIIA

VERGE

CPU

TIME PER PRIMARY GENERATED EVENT IN VARIOUSPHYSICS CONFIGURATIONS

Hadronic elastic Hadronic inelastic CPU time (ms)

Bertini-elastic Precompound 254.0 ± ± ± ± ± ± ± ± ± ± ± VII. C

ONCLUSION

A number of epistemic uncertainties have been identiﬁed ina survey of Geant4 physics models pertinent to the simulationof proton depth dose, which broadly represent the variety ofapproaches to describe proton interactions with matter in theenergy range up to approximately 100 MeV.In the electromagnetic domain, the epistemic uncertaintiesaffecting the value of the water mean ionization potential andproton stopping powers derive from lack of consensus amongvarious theoretical and experimental references documentedin the literature; they generate signiﬁcant systematic effectson the longitudinal pattern of energy deposit in the sensitivevolume, namely on the depth of the Bragg peak.The epistemic uncertainties affecting the hadronic compo-nents of the simulation are related to the intrinsic differencesof the modeling approaches and empirical parameters theycontain; the limited validation of the models, and the uncleardistinction between the processes of calibration and validationin the few published comparisons with experimental data, arethe main sources of such uncertainties. Their effects on thelongitudinal energy deposit are comparable with experimentaluncertainties typical of proton therapy; the largest differencesconcern secondary particle spectra. A signiﬁcant effect wasobserved in relation to the mode of nuclear deexcitation; inthis respect, there is a consensus towards modeling it throughFermi break-up for light nuclei and evaporation for heavierones. This approach is implemented in some Monte Carlocodes (e.g. MCNP and FLUKA), while it is not adoptedby default in Geant4; users of this code would beneﬁt fromimplementing appropriate settings in their Geant4-based appli-cations to activate Fermi break-up for the deexcitation of lightnuclei, if their simulation use cases are prone to be affectedby the systematic effects highlighted in this study.The analysis shows how the sensitivity of the simulationto epistemic uncertainties cannot be determined in absoluteterms, rather it depends on the experimental application envi-ronment. The relatively large differences in the Bragg peakproﬁle associated with the set of electromagnetic optionsare practically irrelevant in clinical practice, which toleratesadjustments of the beam parameters to reproduce a refer-ence proton range. However, these differences are relevantto applications where a predictive role is expected from thesimulation, such as Monte Carlo based treatment planning TABLE VIP-

VALUE OF GOODNESS - OF - FIT TESTS COMPARING LONGITUDINAL ENERGY DEPOSITION PROFILES DERIVING FROM VARIOUS G EANT VERSIONS ANDPHYSICS CONFIGURATIONS

Compared Physics Kolmogorov Anderson Cramersoftware versions conﬁguration Smirnov Darling von Mises9.3 (G4hMultipleScattering) Uelastic Bertini < . < . < . (G4MultipleScattering) Uelastic LEP < . < . < . Uelastic Precompound < . < . < . Bertini Precompound < . < . < . LEP Precompound < . < . < . (G4hMultipleScattering) Uelastic Bertini 0.219 0.046 0.110 (G4MultipleScattering) Uelastic LEP 0.074 0.013 0.047Uelastic Precompound 0.170 0.037 0.111Bertini Precompound 0.054 0.012 0.044LEP Precompound 0.098 0.019 0.057 (G4hMultipleScattering) Uelastic Bertini 0.009 0.001 0.004 (G4MultipleScattering) Uelastic LEP 0.002 < . < . (G4hMultipleScattering) Uelastic Bertini < . < . < . (G4MultipleScattering) Uelastic LEP < . < . < . Uelastic Precompound < . < . < . Bertini Precompound < . < . < . LEP Precompound < . < . < . (G4MultipleScattering) Uelastic Bertini 0.006 < . (G4MultipleScattering) Uelastic LEP 0.001 < . < . < . < . < . (G4MultipleScattering) Uelastic Bertini 0.277 0.051 0.113 (G4MultipleScattering) Uelastic LEP 0.039 0.011 0.043Uelastic Precompound 0.054 0.012 0.040Bertini Precompound 0.039 0.007 0.028LEP Precompound 0.130 0.030 0.071 (G4MultipleScattering) Uelastic Bertini 0.803 0.505 0.722 (G4MultipleScattering) Uelastic LEP 0.277 0.119 0.270Uelastic Precompound 0.515 0.232 0.475Bertini Precompound 0.219 0.072 0.179LEP Precompound 0.426 0.150 0.297 (G4MultipleScattering) Uelastic Bertini 0.426 0.135 0.286 (G4MultipleScattering) Uelastic LEP 0.709 0.281 0.395Uelastic Precompound 0.884 0.418 0.548Bertini Precompound 0.709 0.324 0.478LEP Precompound 0.426 0.269 0.516 (G4hMultipleScattering) Uelastic Bertini < . < . < . (G4MultipleScattering) Uelastic LEP < . < . < . Uelastic Precompound < . < . < . Bertini Precompound < . < . < . LEP Precompound < . < . < . (G4MultipleScattering) Uelastic Bertini 0.020 0.004 0.016 (G4MultipleScattering) Uelastic LEP 0.001 < . < . < . < . < . LEP Precompound 0.006 < . systems, currently the object of active research, or radiationprotection. The different secondary particle spectra derivingfrom the range of available hadronic options do not affectthe main parameter of clinical interest, i.e. the depth dosedistribution, but they are relevant to other aspects of radiationexposure.By far the largest effects of physics-related epistemic un-certainties in the simulation of proton depth dose are observedin relation to modeling multiple scattering in the beam line.However, even these effects are relevant only to use caseswhere the simulation is invested with predictive role regardingthe absolute dose released to the target; otherwise, commonpractices, like the normalization of the simulated dose to a ref- erence value, would mask the epistemic uncertainty associatedwith the empirical parameters used to model this process.The analysis also highlights the importance of a knowledgeof the whole simulation system regarding the effects visible inthe sensitive volume. Interactions in the beam line affect thespectrum of the protons reaching the sensitive volume and thedose released to it; lack of knowledge of construction details ofthe beam line, or epistemic uncertainties in modeling particleinteractions in the passive components of the system, are proneto bias the simulation outcome.The results documented in this paper about the differentobservables produced by Geant4 physics options identifysome experimental requirements for the discrimination of their features and their validation. Experimental measurements ofadequate accuracy could reduce the epistemic uncertaintiesevidenced in the electromagnetic domain; relevant data couldderive either from a thorough survey of the existing literature,or from new, dedicated measurements. In this respect, itis worthwhile to recall the valuable reference role for thevalidation of electron simulation played by the high precisionmeasurements of [179] and [180], which were originallymotivated by the validation of the ITS (Integrated Tiger Series)[181] Monte Carlo code; similar measurements concerningprotons would be useful to reduce epistemic uncertainties.The sensitivity analysis documented in the previous sectionsalso provides guidance to design meaningful test cases forinclusion in the test process of Monte Carlo systems. Theidentiﬁcation of distributions which expose distinctive featuresof the physics models, as well as of others, which are prone tohide them, is especially useful to designing test cases relevantto monitoring the effects of changes in some critical parts ofthe code.The analysis presented in this paper is a ﬁrst attempt atestimating quantitatively the impact of epistemic uncertain-ties on the considered use case; further reﬁnements wouldcontribute to better understanding the problem. So far, theanalysis has considered each source of epistemic uncertaintyindividually; nevertheless, it would be worthwhile to evalu-ate their combinations, since several systematic contributionscould accumulate their effects to bias the ﬁnal simulationresult. More reﬁned treatments, e.g. based on the theory ofevidence, could shed additional light on the problem; thesemethods would be especially useful if practical constraintshinder the availability of further experimental measurementsto reduce the current uncertainties.The identiﬁcation of the epistemic uncertainties embeddedin a large-scale simulation code is far from trivial; designmethods facilitating their identiﬁcation at early stages of thesoftware development, and their management in sensitivityanalyses, would be beneﬁcial. To the best of the authors’knowledge, this issue has not been studied yet in the contextof Monte Carlo simulation; techniques like aspect orientedprogramming could provide useful paradigms to address it,and the inclusion of epistemic uncertainties in the traceabilityprocess, in the context of a rigorous software process dis-cipline, would effectively support their handling in complexsoftware systems.Although this paper illustrates the problem of epistemicuncertainties in a speciﬁc simulation use case, the issue itaddresses goes beyond the limited application domain con-sidered in this initial study. Regarding the simulation oflow energy proton interactions, the epistemic uncertaintiesdiscussed in this paper and their effects are likely to affectother experimental domains as well: from the exposure ofelectronic components and astronauts to the space radiationenvironment, to the problem of radiation monitoring at particleaccelerators.More generally, the issue of identifying and quantifyingepistemic uncertainties, and their contribution to the overallreliability of simulation systems, permeates all Monte Carloapplication domains. Monte Carlo simulation - not only for particle transport in detectors, but also for event generators -is expected to play a critical role in the physics analysis ofLHC data, which involves energies not yet covered by anyexperimental measurements in controlled laboratory environ-ments; the development of sound methods and tools to dealwith the epistemic uncertainties embedded in LHC simulationsoftware appears a major task for the coming years in supportof LHC physics results.A CKNOWLEDGMENT

The authors thank Andreas Pfeiffer for his signiﬁcanthelp with data analysis tools throughout the project; KatsuyaAmako, Sergio Bertolucci, Luciano Catani, Gloria Corti, An-drea Dotti, Gunter Folger, Simone Giani, Vladimir Grichine,Aatos Heikkinen, Alexander Howard, Vladimir Ivanchenko,Mikhail Kossov, Vicente Lara, Katia Parodi, Alberto Ribon,Takashi Sasaki, Vladimir Uzhinsky and Hans-Peter Wellischfor valuable discussions, and Anita Hollier for proofreadingthe manuscript.CERN Library’s support has been essential to this study;the authors are especially grateful to Tullio Basaglia.INFN Genova Computing Service (Alessandro Brunengo,Mirko Corosu, Paolo Lantero and Francesco Safﬁoti) providedhelpful technical assistance with the simulation production.The authors do not intend to express criticism, nor praiseregarding any of the Monte Carlo codes mentioned in thispaper; the purpose of the paper is limited to documentingtechnical results. R

EFERENCES[1] W. P. Levin, H. Kooy, J. S. Loefﬂer, and T. F. DeLaney, “Proton beamtherapy ”,

Brit. J. Cancer , vol. 93, pp. 849 854, 2005.[2] X-5 Monte Carlo Team, “MCNP – A General Monte Carlo N-ParticleTransport Code, Version 5”, Los Alamos National Laboratory ReportLA-UR-03-1987, 2003, revised 2005.[3] R. A. Forster et al., ‘ ‘MCNP Version 5”,

Nucl. Instrum. Meth. B , vol.213, pp. 82-86, 2004.[4] J. S. Hendricks et al., “MCNPX, Version 26c”, Los Alamos NationalLaboratory Report LA-UR-06-7991, 2006.[5] ”GEANT Detector Description and Simulation Tool”, CERN ProgramLibrary Long Writeup W5013, 1995.[6] S. Agostinelli et al., “Geant4 - a simulation toolkit”

Nucl. Instrum.Meth. A , vol. 506, no. 3, pp. 250-303, 2003.[7] J. Allison et al., “Geant4 Developments and Applications”

IEEE Trans.Nucl. Sci. , vol. 53, no. 1, pp. 270-278, 2006.[8] A. V. Dementyev and N. M. Sobolevsky, ”SHIELD universal MonteCarlo hadron transport code: scope and applications”,

Radiat. Meas. ,vol. 30, no. 5, pp. 553-557, 1999.[9] I. Gudowska, P. Andreo, and N. M .Sobolevsky, “Secondary ParticleProduction in Tissue-like and Shielding Materials for Light and HeavyIons Calculated with the Monte-Carlo Code SHIELD-HIT”,

J. Radiat.Res. , vol. 43 Suppl., pp. 93-97, 2002.[10] G. Battistoni et al., “The FLUKA code: description and benchmarking”,

AIP Conf. Proc. , vol. 896, pp. 31-49, 2007.[11] A. Ferrari et al., “Fluka: a multi-particle transport code”, Report CERN-2005-010, INFN/TC-05/11, SLAC-R-773, Geneva, Oct. 2005.[12] H. Iwase, K. Niita, and T. Nakamura, “Development of general-purposeparticle and heavy ion transport Monte Carlo code”,

J. Nucl. Sci.Technol. , vol. 39, no. 11, pp. 1142-1151, 2002.[13] W. L. Oberkampf, S. M. DeLand, B. M. Rutherford, K. V. Diegert,and K. F. Alvin, “Error and uncertainty in modeling and simulation”,

Reliab. Eng. Syst. Safety , vol. 75, no. 3, pp. 333357, 2002.[14] W. L. Oberkampf, T. G. Trucano, and C. Hirsch, “Veriﬁcation, val-idation, and predictive capability in computational engineering andphysics”,

Appl. Mech. Rev. , vol. 57, no. 5, pp. 345-384, 2004. [15] J. H´erault, N. Iborra, B. Serrano, and P. Chauvel, “Monte Carlosimulation of a protontherapy platform devoted to ocular melanoma”, Med. Phys. , vol. 32, pp. 910-919 2005.[16] D. E. Bonnett, A. Kacperek, M. A. Sheen, R. Goodall, and T. E. Saxton,“The 62 MeV proton beam for the treatment of ocular melanoma atClatterbridge”,

Brit. J. Radiol. , vol. 66, pp. 907-914, 1993.[17] S. H¨ocht et al., “Proton Therapy of Uveal Melanomas in Berlin”,

Strahl.Onk. , vol. 180, no. 7, pp. 419-424, 2004.[18] G. A. P. Cirrone et al., “A 62 MeV proton beam for the treatment ofocular melanoma at Laboratori Nazionali del Sud-INFN (CATANIA)”,in

IEEE Trans. Nucl. Sci. , vol. 51, no. 3, pp. 860-865, 2004.[19] T. G. Trucano, L. P. Swiler, T. Igusa, W. L. Oberkampf, and M. Pilch,“Calibration, validation, and sensitivity analysis: Whats what”,

Reliab.Eng. Syst. Safety , vol. 91, no. 10-11, pp. 1331-1357, 2006.[20] J. C. Helton, “Alternative representations of epistemic uncertainty”,

Reliab. Eng. Syst. Safety , vol. 84, no. 1-3, pp. 1-10, 2004.[21] G. Shafer, “A Mathematical Theory of Evidence”, Princeton UniversityPress, 1976.[22] A. Saltelli, K. Chan, and M. Scott M, editors, “Mathematical andstatistical method for sensitivity analysis”, New York: Wiley; 2001.[23] K. Diegert, S. Klenke, G. Novotny, R. Paulsen, M. Pilch, and T.Trucano, “Toward a More Rigorous Application of Margins andUncertainties within the Nuclear Weapons Life Cycle - A SandiaPerspective”, Sandia National Laboratories Report No. SAND2007-6219, Albuquerque, 2007.[24] A. Lechner and M. G. Pia, “Analysis of Geant4 simulations of protondepth dose proﬁles for radiotherapy applications”, in

IEEE Nucl. Sci.Symp. Conf. Rec. 2008 ,pp. 875-882, 2008.[25] H. Paganetti and B. Gottschalk, “Test of GEANT3 and GEANT4nuclear models for 160 MeV protons stopping in CH ”, Med. Phys. ,vol. 30, no. 7, pp. 1926-1931, 2003.[26] M. Fippel and M. Soukup, “A Monte Carlo dose calculation algorithmfor proton therapy”

Med. Phys. , vol. 31, no. 8, pp. 2263-2273, 2004.[27] H. Paganetti, H. Jiang, S.-Y. Lee, and H. M. Kooy, “Accurate MonteCarlo simulations for nozzle design, commissioning and quality assur-ance for a proton radiation therapy facility”

Med. Phys. , vol. 31, no.7, pp. 2107-2118, 2004.[28] A. B. Rosenfeld, A. J. Wroe, I. M. Cornelius, M. Reinhard, and D.Alexiev, “Analysis of inelastic interactions for therapeutic proton beamsusing Monte Carlo simulation”,

IEEE Trans. Nucl. Sci. , vol. 51, no. 6,pp. 3019-3025, 2004.[29] A. J. Wroe, I. M. Cornelius, and A. B. Rosenfeld, “The role ofnonelastic reactions in absorbed dose distributions from therapeuticproton beams in different medium”,

Med. Phys. , vol. 32, no. 1, pp.37-41, 2005.[30] G. A. P. Cirrone et al., “Implementation of a New Monte CarloGEANT4 Simulation Tool for the Development of a Proton TherapyBeam Line and Veriﬁcation of the Related Dose Distributions”,

IEEETrans. Nucl. Sci. , vol. 52, no. 1, pp. 262-265, 2005.[31] T. Aso et al., “Veriﬁcation of the Dose Distributions With GEANT4Simulation for Proton Therapy”,

IEEE Trans. Nucl. Sci. , vol. 52, no.4, pp. 896-901, 2005.[32] A. Kim, J. W. Kim, I. Hahn, N. Schreuder, and J. Farr , “Simulationsof Therapeutic Proton Beam Formation with GEANT4”

J. Kor. Phys.Soc. , vol. 47, no. 2, pp 197-201, Aug. 2005.[33] C. Baker, D. Shipley, H. Palmans, and A. Kacperek, “Monte carlomodelling of a clinical proton beam-line for the treatment of oculartumours”,

Nucl. Instrum. Meth. A , vol. 562, pp. 1005-1008, 2006.[34] G. A. P. Cirrone et al., “Validation of Geant4 Physics Models for theSimulation of the Proton Bragg Peak”, in

IEEE 2006 Nucl. Sci. Symp.Conf. Rec. , pp. 788-792.[35] C. Zacharatou Jarlskog and H. Paganetti, “Physics settings for usingthe Geant4 Toolkit in proton therapy”,

IEEE Trans. Nucl. Sci. , vol. 55,no. 3, pp. 1018-1025, Jun. 2008.[36] K. Amako et al., “Comparison of Geant4 electromagnetic physicsmodels against the NIST reference data”,

IEEE Trans. Nucl. Sci. , vol.52, no. 4, pp. 910-918, 2005.[37] M. P. W. Chin and N. M. Spyrou, Non-convergence of Geant4hadronic models for 10 and 30 MeV protons in 18O and 14N AppliedRadiation and Isotopes Volume 67, Issue 3, March 2009, Pages 406-414 Proceedings of the ﬁrst international conference on biomedicalapplications of high energy ion beams[38] J. M. Quesada, M. A. Cortes, A. Howard, G. Folger, and V. N.Ivanchenko, “Improvements of preequilibrium and evaporation modelsin Geant4”, in

IEEE Nucl. Sci. Symp. Conf. Rec. , pp. 847-849, 2008. [39] V. N. Ivanchenko and A. Ivantchenko, “Testing suite for validation ofGeant4 hadronic generators”,

J. Phys.: Conf. Ser. , vol. 119, pp. 032026,2008.[40] J. Apostolakis et al., “Progress in hadronic physics modelling inGeant4”,

J. Phys.: Conf. Ser. , vol. 160, pp. 012073, 2009.[41] W. Newhauser, N. Koch, S. Hummel, M. Ziegler, and U. Titt, “MonteCarlo simulations of a nozzle for the treatment of ocular tumours withhigh-energy proton beams”,

Phys. Med. Biol. , vol. 50, pp. 5229-5249,2005.[42] I. Gudowska, N. Sobolevsky, P. Andreo, D, Belkic and A. Brahme,“Ion beam transport in tissue-like media using the Monte Carlo codeSHIELD-HIT”,

Phys. Med. Biol. , vol. 49, pp. 1933-1958, 2004.[43] F. D. Brooks, D. T. L. Jones, C. C. Bowley, J. E. Symons, A. Bufﬂer,and M. S. Allie, “Energy spectra in the NAC proton therapy beam”,

Radiat. Prot. Dosim. , vol. 70, pp. 477-480, 1997.[44] A. Lechner and M. G. Pia, “Effect of normalization algorithms on theanalysis of Bragg peak proﬁles”,

IEEE Trans. Nucl. Sci. , vol. 55, no.6, pp. 3544-3549, 2008.[45] C. Agodi et al., “The INFN TPS project”,

Nuovo Cimento C , vol. 31,no. 1, pp. 99-108, 2008.[46] N. Reynaert et al., “Monte Carlo treatment planning for photon andelectron beams ”,

Radiat. Phys. Chem. , vol 76, no. 4, pp. 643-686,2007.[47] N. Tilly, “Monte Carlo calculations for proton therapy planning, is itworth the effort?”,

Radiotherapy Oncol. , vol 76, Suppl. 2, pp. S17-S18,2005.[48] S. Webb, “The contribution, history, impact and future of physics inmedicine ”,

Acta Oncol. , vol. 48, no. 2, pp. 169-177, 2009.[49] S. Chauvie et al., “Geant4 electromagnetic physics”, in

Proc. MonteCarlo Conference, Lisbon , Nov. 2000.[50] S. Chauvie, G. Depaola, V. Ivanchenko, F. Longo, P. Nieminen andM. G. Pia, “Geant4 Low Energy Electromagnetic Physics”, in

Proc.Computing in High Energy and Nuclear Physics , Beijing, China, pp.337-340, 2001.[51] S. Chauvie et al., “Geant4 Low Energy Electromagnetic Physics”, in

Conf. Rec. 2004 IEEE Nucl. Sci. Symp. , N33-165.[52] S. Giani, V. N. Ivanchenko, G. Mancinelli, P. Nieminen, M. G. Pia,and L. Urban, “Geant4 simulation of energy losses of slow hadrons”,

INFN/AE-99/20 , Frascati, 1999.[53] J. Lindhard and M. Scharff, “Energy loss in matter by fast particles oflow charge”,

Mat. Fys. Medd. Dan. Vid. Selsk. , vol. 27, no. 15, pp. 1,1953.[54] M. J. Berger et al., “Stopping Powers and Ranges for Protons andAlpha Particles”,

ICRU Report 49

INFN/AE-99/18 , Frascati, 1999.[59] S. Giani, V. N. Ivanchenko, G. Mancinelli, P. Nieminen, M. G. Piaand L. Urban, “Geant4 simulation of energy losses of ions”,

INFN/AE-99/21 , Frascati, 1999.[60] S. T. Perkins et al., “Tables and Graphs of Electron-Interaction CrossSections from 10 eV to 100 GeV Derived from the LLNL EvaluatedElectron Data Library (EEDL)”, UCRL-50400 Vol. 31, 1997.[61] D. Cullen et al., “EPDL97, the Evaluated Photon Data Library”, UCRL-50400, Vol. 6, Rev. 5, 1997.[62] J. Baro, J. Sempau, J. M. Fern´andez-Varea, and F. Salvat, “PENELOPE,an algorithm for Monte Carlo simulation of the penetration and energyloss of electrons and positrons in matter”,

Nucl. Instrum. Meth. B , vol.100, no. 1, pp. 31-46, 1995.[63] S. Guatelli, A. Mantero, B. Mascialino, P. Nieminen, and M. G. Pia,“Geant4 Atomic Relaxation”,

IEEE Trans. Nucl. Sci. , vol. 54, no. 3,pp. 585-593, 2007.[64] M. J. Berger et al., “Stopping Powers and Ranges for Protons andAlpha Particles”,

ICRU Report 49 , Bethesda, 1993.[65] H. Burkhardt et al., “Geant4 Standard Electromagnetic Package”, in

Proc. 2005 Conf. on Monte Carlo Method: Versatility Unbounded in aDynamic Computing World , Am. Nucl. Soc., USA, 2005.[66] K. Lassila-Perini and L. Urban, “Geant4 Standard ElectromagneticPackage”,

Nucl. Instrum. Meth. A , vol. 362, no. 2-3, pp. 416-422, 1995. [67] J. Apostolakis et al., “The performance of the Geant4 Standard EMpackage for LHC and other applications”, J. Phys. Conf. Series , vol.119, pp. 1-10, 2008.[68] J. Apostolakis et al., “Geometry and physics of the Geant4 toolkit forhigh and medium energy applications”,

Radiat. Phys. Chem. , vol. 78,no. 10, pp. 859-873, 2009.[69] A. Lechner, M.G. Pia, M. Sudhakar “Validation of Geant4 low energyelectromagnetic processes against precision measurements of electronenergy deposit”,

IEEE Trans. Nucl. Sci. , vol. 56, no. 2, pp. 398-416,2009.[70] L. Urban, “Multiple scattering model in Geant4”, CERN-OPEN-2002-070, Geneva, Switzerland, 2002.[71] L. Urban, “A model of multiple scattering in Geant4”, CERN-OPEN-2006-077, Geneva, Switzerland, 2006.[72] H. W. Lewis, “Multiple Scattering in an Inﬁnite Medium”,

Phys. Rev. ,vol. 78, pp. 526-529, 1950.[73] Geant4 9.3 Release Notes, Online. Available:http://cern.ch/geant4/support/ReleaseNotes4.9.3.html[74] S. Elles, V. N. Ivanchenko, M. Maire and L. Urban, “Geant4 and Fanocavity: where are we?”,

J. Phys. Conf. Series , vol. 102, pp. 1-8, 2009.[75] R. Nartallo et al., “Radiation environment induced degradation onChandra and implications for XMM”, ESA Report, Esa/estec/tos-em/00-015/RN, Noordwijk, 2000.[76] W. J. Murray, “MuScat report - the muon scattering experiment”,

J.Phys. G , vol. 29, pp. 1595-1600, 2003.[77] P. Arce and M. Wadhwa, “Deviation in matter of 45 GeV muons inGEANT3 and GEANT4. A comparison with L3 data”, CMS NOTE2000/016, CERN, Geneva, 2000.[78] C. Rovelli, “Validation of the simulation of the CMS electromagneticcalorimeter using data”, in

IEEE Nucl. Sci. Symp. Conf. Rec. 2008 , pp.2792-2794, 2008.[79] K. Parodi and S. Squarcia, “Improvement of low-energy stopping poweralgorithms in the FLUKA simulation program”,

Nucl. Instr. Meth. A ,vol. 456, no. 3, p. 352-368, 2001.[80] T. W. Armstrong and K. C. Chandler, “Stopping powers and rangesfor muons, charged pions, protons, and heavy ions”,

Nucl. Instr. Meth. ,vol. 113, no. 2, 15, pp. 313-314, 1973.[81] T. Goorley, R. E. Prael, and H. Grady, “Veriﬁcation of Stopping Powersfor Proton Transport in MCNP5”, Los Alamos National LaboratoryReport LA-UR-03-4446, 2003.[82] H. Nose, Y. Kase, N. Matsufuji, and T. Kanai, “Field size effect ofradiation quality in carbon therapy using passive method”,

Med. Phys. ,vol. 36, no. 3, pp. 870-875, 2009.[83] F. Bourhaleb et al., “A treatment planning code for inverse planningand 3D optimization in hadrontherapy”,

Comp. Biol. Med. , vol38, no.9, pp. 990-999, 2008.[84] H. G. Hughes et al., “MCNP5 for proton radiography”,

Radiat. Prot.Dos. , vol. 116, no. 1-4, pp. 109-112, 2005.[85] K. Niita, T. Sato, H. Iwase, H. Nose, H. Nakashima, and L. Sihver,“PHITS - a particle and heavy ion transport code system”,

Radiat.Meas. , vol. 41, no. 9-10, pp. 1080-1090, 2006.[86] A. Ferrari, P. R. Sala, R. Guaraldi and F. Padoani, “An improvedmultiple scattering model for charged particle transport”,

Nucl. Instr.Meth. B , vol. 71, no. 4, pp. 412-426, 2002.[87] J. P. Wellisch, “ Hadronic shower models in Geant4 - the frameworks”,

Comp. Phys. Comm. , vol. 140, no. 1, pp. 65-75, 2001.[88] V. Grichine, “Geant4 hadron elastic diffuse model”,

Comp. Phys.Comm. , vol. 181, pp. 921-927, 2010.[89] H. S. Fesefeldt, “The simulation of hadronic showers: physics andapplications”, PITHA-85-02, RWTH Aachen, Sep. 1985.[90] G. A. Lobov, N. V. Stepanov, A. A. Sibirtsev, and Yu. V. Trebukhovskii,“Statistical Simulation of Hadron and Light-Nuclei Interactions withNuclei. Intranuclear Cascade Model”, ITEP Preprint No. ITEP-91,Moscow, 1983.[91] D. H. Wright et al., “Recent Developments and Validations in Geant4Hadronic Physics” in

AIP Conf. Proc. , vol. 867, pp. 479-486, 2006.[92] M. Kossov, “Chiral-invariant phase space model”,

Eur. Phys. J. A , vol.14, no. 3, pp. 265-269, 2002.[93] P. V. Degtyarenko, M. V. Kossov, and H.-P. Wellisch, “Chiral invariantphase space event generator”

Eur. Phys. J. A , vol. 9, no. 3, pp. 411-420,2003.[94] H. P. Wellisch and D. Axen, “Total reaction cross section calculationsin proton-nucleus scattering”,

Phys. Rev. C , vol. 54, pp. 1329-1332,1996.[95] V. S. Barashenkov and V. M. Maltsev “Cross sections for elementaryparticle interactions”

Fortschr. Phys. , vol. 9, pp. 549-611, 1961. [96] S. Kox et al., “Trends of total reaction cross sections for heavy ioncollisions in the intermediate energy range”,

Phys. Rev. C , vol. 35, pp.1678-1691, 1987.[97] W. Q. Shen et al., “Total reaction cross section for heavy-ion collisionsand its relation to the neutron excess degree of freedom”,

Nucl. Phys.A , vol. 491, no. 1, pp. 130-146, 1989.[98] R. K. Tripathi, F. A. Cucinotta, and J. W. Wilson, “Universal Parame-terization of Absorption Cross Sections”, NASA Technical Paper 3621,Langley Research Center, Hampton, Virginia, Jan. 1997.[99] R. K. Tripathi, F. A. Cucinotta, and J. W. Wilson, “Accurate universalparameterization of absorption cross sections”,

Nucl. Instrum. Meth. B ,vol. 117, no. 4, pp. 347-349, 1996.[100] R. K. Tripathi, F. A. Cucinotta, and J.W. Wilson, “ Universal Pa-rameterization of Absorption Cross Sections - Light Systems”, NASATechnical Paper 1999-209726, 1999.[101] R. K. Tripathi, F. A. Cucinotta, and J.W. Wilson, “ Universal Param-eterization of Absorption Cross Sections III - Light Systems”,

Nucl.Instrum. Meth. B , vol. 155, no. 4, pp. 349-356, 1999.[102] N. Amelin, “Physics and algorithms of the hadronic Monte-Carlo eventgenerators: notes for a developer” Report CERN-IT-99-006, CERN,Geneva, 1999.[103] G. Folger, V.N. Ivanchenko, and J.P. Wellisch, “The Binary Cascade”,

Eur. Phys. J. A , vol. 21, pp. 407-417, 2004.[104] A. Heikkinen, N. Stepanov, Nikita, and J. P. Wellisch, “Bertini intra-nuclear cascade implementation in Geant4”, in

Proc. of CHEP 2003Int. Conf. Comp.in High En. and Nucl. Phys. , La Jolla, 2003.[105] A. Heikkinen, “Implementing the Bertini intra-nuclear-cascade in theGeant4 hadronic framework”, in

The Monte Carlo Method: VersatilityUnbounded in a Dynamic Computing World , on CD-ROM, Am. Nucl.Soc., 2005.[106] A. Boudard, J. Cugnon, S. Leray, and C. Volant, “Intranuclear cascademodel for a comprehensive description of spallation reaction data”,

Phys. Rev. C , vol. 66, no. 4, pp. 044615, 2002.[107] V. Lara and J. P. Wellisch, “Pre-equilibrium and equilibrium decays inGeant4”, in

Proc. CHEP 2000 Int. Conf. on Computing in High Energyand Nuclear Physics , Padova, 2000.[108] J. J. Grifﬁn, “Statistical Model of Intermediate Structure”,

Phys. Rev.Lett. , vol. 17, pp. 478-481, 1966.[109] J. J. Grifﬁn, “Statistical Model of Intermediate Structure, erratum”,“

Phys. Lett. B , vol. 19, pp. 57-57, 1967.[110] V. Weisskopf, “Statistics and Nuclear Reactions”,

Phys. Rev. , vol. 52,pp. 295 - 303, 1937.[111] V. F. Weisskopf and D. H. Ewing, “On the Yield of Nuclear Reactionswith Heavy Elements”,

Phys. Rev. , vol. 57, pp. 472 - 485, 1940.[112] I. Dostrovsky, Z. Fraenkel, and G. Friedlander, “Monte Carlo Calcu-lations of Nuclear Evaporation Processes. III. Applications to Low-Energy Reactions”

Phys. Review , vol. 116, pp. 683-702, 1959.[113] S. Furihata, “Statistical analysis of light fragment production frommedium energy proton-induced reactions”,

Nucl. Instr. Meth. B , vol.171, no. 3, pp. 251-258, 2000.[114] V. D. Toneev and K. K. Gudima, “Particle emission in light and heavyion reactions”,

Nucl. Phys. A , vol. 400, pp. 173-189, 1983.[115] K. K. Gudima, S. G. Mashnik, and V. D. Toneev, “Cascade-excitonmodel of nuclear reactions”

Nucl. Phys. A , vol. 401, pp. 329-361, 1983.[116] H. W. Bertini, “Low-Energy Intranuclear Cascade Calculation”,

Phys.Rev. , vol. 131, pp. 1801-1821, 1963.[117] S. G. Mashnik and V. D. Toneev, “MODEX - the Program forCalculation of the Energy Spectra of Particles Emitted in the Reactionsof Pre-Equilibrium and Equilibrium Statistical Decays”,

Comm. JINR,P4-8417 , Dubna, 1974 .[118] S. G. Mashnik and A. J. Sierk, ”Improved Cascade-Exciton Model ofNuclear Reactions”, Los Alamos National Laboratory Report LA-UR-98-5999, 1998.[119] K. C. Chandler and T. W. Armstrong, ”Operating instructions forthe high-energy nucleon-meson transport code HETC”, Oak RidgeNational Laboratory Report ORNL-4744, 1972.[120] R. E. Prael and H. Lichtenstein, “User Guide to LCS: the LAHET CodeSystem ”, Los Alamos National Laboratory Report LA-UR-89-3014,1989.[121] Y. Yariv and Z. Fraenkel, ”Intranuclear cascade calculation of high-energy heavy-ion interactions”,

Phys. Rev. C , vol. 20, pp. 2227-2243,1979.[122] Y. Yariv and Z. Fraenkel, ”Intranuclear cascade calculation of highenergy heavy ion collisions: Effect of interactions between cascadeparticles”,

Phys. Rev. C , vol. 24, pp. 488-494, 1981. [123] C. Zeitnitz and T. Gabriel, ”The GEANT-CALOR interface and bench-mark calculations of ZEUS test calorimeters”, Nucl. Instr. Meth. A , vol.349, no. 1, pp. 106-111, 1994.[124] J. J. Gaimard and K. H. Schmidt, “A reexamination of the abrasion-ablation model for the description of the nuclear fragmentation reac-tion”,

Nucl. Phys. A , vol. 531, no. 3-4, pp. 709-745, 2001.[125] A Heikkinen, P. Kaitaniemi and A. Boudard, “Implementation of INCLcascade and ABLA evaporation codes in Geant4”

J. Phys.: Conf. Ser. ,vol. 119, pp. 032024, 2008.[126] J. Cugnon et al., “The INCL model for spallation reactions below 10GeV”,

Adv. Space Res. , vol. 40, no. 9, pp. 1332-1338, 2007.[127] T. Aoust et al., “Recent extensions of the INCL4 model for spallationreactions”,

Nucl. Instr. Meth. A , vol. 562, no. 2, pp. 810-813, 2006.[128] S. Pedoux, J. Cugnon, A. Boudard, J. C. David and S. Leray, “Exten-sion of INCL4 between 2 and 15 GeV ”,

Adv. Space Res. , vol. 44, no.8, pp. 926-933, 2009.[129] Y. Yariv et al., “Intra-nuclear cascade models at low energy?”, in

Proc.2007 Int. Conf. Nucl. Data Sci. Technol. , Nice, EDP Sciences, 2008.[130] T. Sato, L. Sihver, H. Iwase, H. Nakashima and K. Niita, “Simulationsof an accelerator-based shielding experiment using the particle andheavy-ion transport code system PHITS”,

Adv. Space Res. , vol. 35, no.2, pp. 208-213, 2003.[131] W. A. Coleman and T. W. Armstrong, “Nucleon-meson transport codeNMTC”, Oak Ridge National Laboratory Report ORNL-4606, 1970.[132] A. Fass`o, A. Ferrari, J. Ranft and P. R. Sala, “FLUKA: performancesand applications in the intermediate energy range”, in:

Proc. AEN/NEASpecialists’ Meeting on Shielding Aspects of Accelerators, Targets andIrradiation Facilities , OECD Documents, Paris, pp. 287-304, 1995.[133] M. Blann, “Hybrid model for pre-equilibrium decay in nuclear reac-tions”,

Phys. Rev. Lett. , vol. 27, pp. 337-340, 1971.[134] A. Ferrari and P. R. Sala, “The Physics of High Energy Reactions”,in

Proc. Workshop Nucl. React. Data Nucl. React. Phys., Design andSafety , A. Gandini, G. Re?o eds., Trieste, 1998.[135] F. Ballarini et al., “Nuclear Models in FLUKA: Present Capabilities,Open Problems, and Future Improvements”,

AIP Conf. Proc. , vol. 769,pp. 1197-1202, 2005.[136] T. Koi et al., “Validation of Hadronic Models in Geant4 ”,

AIP Conf.Proc. , vol. 896, pp. 21-30, 2007.[137] J. P. Wellisch, “Geant4 hadronic physics status and validation for largeHEP detectors”, in

Proc. of CHEP 2003 Int. Conf. on Comp. in HighEn. and Nucl. Phys. , La Jolla, 2003.[138] IEEE Computer Society, “IEEE Standard for Software Veriﬁcation andValidation”, IEEE Std 1012-2004, Jun. 2005.[139] United States Department of Energy, “Advanced Simulation and Com-puting Program Plan”, Sandia National Laboratories Report SAND2004-4607PP, Albuquerque, 2004.[140] J. Bisplinghoff, “Conﬁguration mixing in preequilibrium reactions: Anew look at the hybrid-exciton controversy”,

Phys. Rev. C , vol. 33, pp.1569-1580, 1986.[141] A. J. Koning and M. C. Duijvestijn, “A global pre-equilibrium analysisfrom 7 to 200 MeV based on the optical model potential”,

Nucl. Phys.A , vol. 744, pp. 15-76, 2004.[142] G. Barrand, P. Binko, M. Donszelmann, A. Johnson, and A. Pfeiffer,“Abstract interfaces for data analysis: component architecture for dataanalysis tools”, in

Proc. of CHEP Int. Conf. on Computing in HighEnergy and Nucl. Phys. , pp. 215-218, Science Press, Beijing, 2001.[143] A. Pfeiffer, “iAIDA - an implementation of AIDA in C++”. Online.Available: http://iaida.dynalias.net/[144] G. A. P. Cirrone et al., “A Goodness-of-Fit Statistical Toolkit”,

IEEETrans. Nucl. Sci. , vol. 51, no. 5, pp. 2056-2063, 2004.[145] B. Mascialino, A. Pfeiffer, M. G. Pia, A. Ribon, and P. Viarengo, “Newdevelopments of the Goodness-of-Fit Statistical Toolkit”,

IEEE Trans.Nucl. Sci. , vol. 53, no. 6, pp. 3834-3841, 2006.[146] T. W. Anderson and D. A. Darling, “Asymptotic theory of certaingoodness of ﬁt criteria based on stochastic processes”,

Anls. Ma. St. ,vol. 23, pp. 193-212, 1952.[147] T. W. Anderson and D. A. Darling, “A test of goodness of ﬁt”,

JASA ,vol. 49, pp. 765-769, 1954.[148] H. Cram´er, “On the composition of elementary errors. Second paper:statistical applications”,

Skand. Aktuarietidskr. , vol. 11, pp. 13-74, pp.141-180, 1928.[149] R. von Mises, “Wahrscheinliehkeitsrechnung und ihre Anwendung inder Statistik und theoretischen Physik”, Leipzig: F. Duticke, 1931.[150] M. Fisz, “On a result by M. Rosenblatt concerning the von Mises-Smirnov test”,

Anls. Ma. St. , vol. 31, pp. 427-429, 1960.[151] A. N. Kolmogorov, “Sulla determinazione empirica di una legge didistribuzione”,

Gior. Ist. Ital. Attuari , vol. 4, pp. 83-91, 1933. [152] N. V. Smirnov, “On the estimation of the discrepancy between empir-ical curves of distributions for two independent samples”,

Bull. Math.Univ. Moscou , 1939.[153] J. Kempe and A. Brahme, “Energy-range relation and mean energyvariation in therapeutic particle beams”,

Med. Phys. , vol. 35, no.1, pp.159-170, 2008.[154] P. M. DeLuca Jr., A. Wambersie, and G. Whitmore, “Prescribing,Recording, and Reporting Proton-Beam Therapy: Dosimetry”,

J. ICRU ,vol. 7, no. 2, pp. 49-81, 2007.[155] P. M. DeLuca, Jr., A. Wambersie, and G. Whitmore, “Prescribing,Recording, and Reporting Proton-Beam Therapy: Beam Delivery andProperties”,

J. ICRU , vol. 7, no. 2, pp. 29-48, 2007.[156] L. J. Verhey et al., “The determination of absorbed dose in a protonbeam for purposes of charged-particle radiation therapy”,

Radiat. Res. ,vol. 79, pp. 34-54, 1979.[157] H. Paul, “The mean ionization potential of water, and its connectionto the range of energetic carbon ions in water”,

Nucl. Instrum. Meth.B , vol. 255, pp. 435-437, 2007.[158] R. Schulte et al., “Conceptual Design of a Proton Computed Tomog-raphy System for Applications in Proton Radiation Therapy ”,

IEEETrans. Nucl. Sci. , vol. 51, no. 3, pp. 866-872, 2004.[159] H. Bichsel and T. Hiraoka, “Energy loss of 70 MeV protons inelements”,

Nucl. Instrum. Meth. B , vol. 66, pp. 345-351, 1992.[160] M. Dingfelder, “Cross section calculations in condensed media:charged particles in liquid water”,

Radiat. Prot. Dosim. , vol. 99, no.1-4, pp. 23 - 28, 2002.[161] P. Andreo, “On the clinical spatial resolution achievable with protonsand heavier charged particle radiotherapy beams”,

Phys. Med. Biol. ,vol. 54, pp. N205-N215, 2009.[162] J. Soltani-Nabipour1, D. Sardari, G. Cata-Danil, “Sensitivity of theBragg peak curve to the average ionization potential of the stoppingmedium”,

Rom. J. Phys. , vol. 54, no. 3-4, pp. 321-330, 2009.[163] F. Sommerer, K. Parodi, A Ferrari, K. Poljanc, W. Enghardt, and H.Aiginger, “Investigating the accuracy of the FLUKA code for transportof therapeutic ion beams in matter”,

Phys. Med. Biol. , vol. 51, pp.4385-4398, 2006.[164] M. J. van Goethem et al., “Geant4 simulations of proton beam transportthrough a carbon or beryllium degrader and following a beam line”,

Phys. Med. Biol. , vol. 54, pp. 5831-5846, 2009.[165] Y. Kumazaki, T. Akagi, T. Yanou, D. Suga, Y. Hishikawa, and T.Teshima, “Determination of the mean excitation energy of water fromproton beam ranges”,

Radiat. Meas. , vol. 42, pp. 1683-1691, 2007.[166] H. Paganetti, H. Jiang, K. Parodi, R. Slopsema, and M. Engelsman,“Clinical implementation of full Monte Carlo dose calculation in protonbeam therapy”,

Phys. Med. Biol , vol. 53, pp. 4825-4853, 2008.[167] I. Pshenichnov, A. Botvina, I. Mishustin, and W. Greiner, “Nu-clear fragmentation reactions in extended media studied with Geant4toolkit”, submitted to

Nucl. Instrum. Meth. B , Online. Available:http://arxiv.org/pdf/0911.2017.[168] J. F. Ziegler, J. P. Biersack, and M. D. Ziegler, “SRIM The Stoppingand Range of Ions in Matter”, Ed.: SRIM Co., Chester, MD, 2008.[169] H. Paul and A. Schinner, “Judging the reliability of stopping powertables and programs for protons and alpha particles using statisticalmethods”,

Nucl. Instrum. Meth. B , vol. 227, 461-470, 2005.[170] H. Paul, “A comparison of recent stopping power tables for lightand medium-heavy ions with experimental data, and applications toradiotherapy dosimetry”,

Nucl. Instrum. Meth. B , vol. 247, 167-172,2006.[171] A. Wald and J. Wolfowitz, “An exact test for randomness in the non-parametric case, based on serial correlation,”

Ann. Math. Stat. , vol. 14,pp. 378-388, 1943.[172] B. Gottschalk, R. Platais and H. Paganetti, “Nuclear interactions of for160 MeV protons stopping in copper: a test of Monte Carlo nuclearmodels”,

Med. Phys. , vol. 26, no. 12, pp. 2597-2601, Dec. 1999.[173] C. Zacharatou Jarlskog and H. Paganetti, “Risk of Developing Sec-ond Cancer From Neutron Dose in Proton Therapy as Function ofField Characteristics, Organ, and Patient Age”,

Int. J. Radiat. Oncol.Biol.Phys. , vol. 72, no. 1, pp. 228-235, 2008.[174] S. Guatelli et al., “Experience with Software Process in PhysicsProjects”, in

Conf. Rec. 2004 IEEE Nucl. Sci. Symp. , N40-8.[175] I. Jacobson, J. Booch, and J. Rumbaugh, “The Uniﬁed SoftwareDevelopment Process”, Ed: Addison-Wesley, Boston, 1999.[176] M. A. Stephens, “EDF statistics for goodness-of-ﬁt and some compar-isons”

J. Amer. Statist. Assoc. , vol. 69, pp. 730-737, 1974.[177] K. Pearson, “Mathematical Contributions to the Theory of Evolution.III. Regression, Heredity, and Panmixia”,