An information geometry approach for robustness analysis in uncertainty quantification of computer codes
Clément GAUCHY∗, Jérôme STENGER, Roman SUEUR, Bertrand IOOSS†
EDF R&D, Département PRISME, 6 Quai Watier, 78401 Chatou, France
Institut de Mathématiques de Toulouse, 31062 Toulouse, France

August 10, 2020
Abstract
Robustness analysis is an emerging field in the domain of uncertainty quantification. It consists of analysing the response of a computer model with uncertain inputs to the perturbation of one or several of its input distributions. Thus, a practical robustness analysis methodology should rely on a coherent definition of a distribution perturbation. This paper addresses this issue by exposing a rigorous way of perturbing densities. The proposed methodology is based on the Fisher distance on manifolds of probability distributions. A numerical method to calculate perturbed densities in practice is presented. This method comes from Lagrangian mechanics and consists of solving an ordinary differential equation system. This perturbation definition is then used to compute quantile-oriented robustness indices. The resulting Perturbed-Law based Indices (PLI) are illustrated on several numerical models. This methodology is also applied to an industrial study (simulation of a loss of coolant accident in a nuclear reactor), where several tens of the model physical parameters are uncertain with limited knowledge concerning their distributions.
Keywords:
Computer experiments, Density perturbation, Fisher metric, Importance sampling, Quantile, Sensitivity analysis

∗ Present affiliation: DES/ISAS - Service d'études mécaniques et thermiques (SEMT), CEA Saclay, 91191 Gif-sur-Yvette, France
† Corresponding author: [email protected]

Introduction
During the last decades, two major trends in industrial and research practices have led to a rise in importance of uncertainty quantification (UQ) methodologies (de Rocquigny et al., 2008; Smith, 2014; Ghanem et al., 2017). The first is the replacement of full-scale physical experiments, considered costly and difficult to implement, by numerical models. This choice raises the issue of a potential mismatch between computer codes and the physical reality they aim to simulate. The second trend consists in accounting for the risks in an increasing number of industrial activities, which implies that those risks should be evaluated from a quantitative point of view. In both situations, the quantification of uncertainties can be conducted by considering the uncertain inputs of the computer code, represented by a function G(·), as a vector of random variables X = (X_1, ..., X_d). The most widespread approach consists of running G(·) with different combinations of inputs, in accordance with their range of variation, in order to study the related uncertainty on the output Y = G(X_1, ..., X_d) or to estimate a specific quantity of interest (QoI). A QoI is a statistical quantity derived from Y, e.g. a performance criterion such as the mean of Y, or a risk criterion such as a high-level quantile of Y.

As an example, the nuclear industry faces major issues as facilities age and regulatory authorities' requirements strengthen (Bucalossi et al., 2010; Mousseau and Williams, 2017). For example, the operators have to study the "Loss of Coolant Accident" (LOCA) resulting in a break on the primary loop of pressurized water nuclear reactors. This scenario can be simulated using system thermal-hydraulic computer codes, which include tens of physical parameters such as condensation or heat transfer coefficients (Mazgaj et al., 2016; Sanchez-Saez et al., 2018).
Yet, the values of these parameters are known with a limited precision (Larget, 2019), as they are calculated by way of other quantities measured via small-scale physical experiments. Some other variables are only observed during periodic inspections, such as the characteristics of pumps in hydraulic systems.

Various methods coming from the UQ domain are useful for considering these uncertainties in the system safety analysis. First of all, some methods aim at improving the exploration of the input domain X by using specific designs of experiments, such as space-filling designs (Fang et al., 2006). Such a design allows one to cover the input domain as evenly as possible with a fixed number of code runs, and to limit unexplored areas as much as possible. For the estimation of some specific QoI, such as a probability of threshold exceedance by the output or an α-order quantile of the output, Monte Carlo type methods are often preferred. In particular, accelerated Monte Carlo methods (e.g. importance sampling or subset simulation) target the most informative areas of X in the sampling algorithm, in order to estimate the QoI while controlling its estimation error (Morio and Balesdent, 2016). As a preliminary or concomitant stage, global sensitivity analysis is also essential in order to eliminate non-influential parameters and to rank influential parameters according to their impact on the QoI (Iooss and Lemaître, 2015; Iooss and Marrel, 2019).

All these approaches are useful to deal with the existence of uncertainties in applied problems. However, industrial (e.g. nuclear facilities) operators face the difficulty of justifying their risk assessment methodologies, not merely by providing simulation results. Such a justification has to demonstrate that the computed values overestimate the actual risks, which most of the time cannot be calculated.
This principle of conservatism, which can be easily implemented when dealing with very simple monotonic physical models, can be hard to adapt to computer codes simulating complex and non-monotonic physical phenomena. It is also not always straightforward to apply this principle when implementing UQ methods based on a set of computer experiments providing a whole range of values for the output quantity Y.

To address this issue, the new UQ branch of robustness analysis has emerged during the recent years in the field of sensitivity analysis. It consists of evaluating the impact of the choice of the inputs' distributions and, more precisely, of analyzing the QoI variations with respect to this choice. A first solution would consider a whole set of input laws and analyse the related output distributions. For global sensitivity analysis, Hart and Gremaud (2019) use "optimal perturbations" of the probability density functions to analyze the robustness of the variance-based sensitivity indices (called Sobol indices (Sobol, 1993)). Meynaoui et al. (2019) and Chabridon et al. (2018) propose approaches to deal with the so-called second-level uncertainty, i.e. uncertainty on the parameters of the input distributions. Another approach, called optimal uncertainty quantification, avoids specifying the input probability distributions, turning the problem into the definition of constraints on moments (Owhadi et al., 2013; Stenger et al., 2019). This solution is out of the scope of the present work, which considers that the initial input probability distribution, as defined by the user, is of practical importance.

In practical engineering uncertainty quantification studies, input distributions are truncated, as they correspond to physical parameters with known domains of validity. It is therefore natural to assume no uncertainty on the support of the input random variables. In this paper, we also assume their mutual independence.
Keeping in mind that our goal is to directly deal with the input distributions (without considering second-level uncertainty), one particularly interesting solution has been proposed in the context of reliability-oriented sensitivity analysis by Lemaître (2014) (see also Lemaître et al. (2015); Sueur et al. (2016)) with the so-called Perturbed-Law based Indices (PLI). A density perturbation consists of replacing the density f_i of one input X_i by a perturbed one f_{iδ}, where δ ∈ R represents a shift of a moment (e.g. the mean or the variance). Amongst all densities with a mean or variance shifted by a value δ, f_{iδ} is defined as the one minimizing the Kullback-Leibler divergence from f_i. This method has been applied with a probability of failure (Iooss and Le Gratiet, 2019; Perrin and Defaux, 2019), a quantile (Sueur et al., 2017; Larget, 2019) and a superquantile (Iooss et al., 2020; Larget and Gautier, 2020) as the QoI.

However, this method is not fully satisfactory. Indeed, the minimal Kullback-Leibler divergence can significantly vary between different input distributions, or even between two different parameters of the same density, so that some densities are more perturbed than others. Moreover, some distributions do not have defined moments. As in Perrin and Defaux (2019), an isoprobabilistic operator can be applied to transform all the input random variables into centered normalized Gaussian ones. It makes perturbations comparable when applied in this standard space, but it remains difficult to translate this interpretation back into the initial physical space, which is the one of interest for the practitioners. Note that another type of robustness analysis has been proposed in quantitative finance by Cont et al. (2010). These authors investigate whether the estimated QoI is sensitive to a small perturbation of the empirical distribution function.
For this purpose, they define the robustness of a QoI as its continuity with respect to the Prokhorov distance on the set of integrable random variables.

The goal of this paper is to propose a novel approach for perturbing probability distributions. It relies on density perturbation based on the Fisher distance (Costa et al., 2012) as a measure of dissimilarity between the initial density f_i and the perturbed one f_{iδ}. This distance defines a geometry on spaces of probability measures called information geometry (Nielsen, 2013). The statistical interpretation of the Fisher distance provides an equivalence between perturbations of non-homogeneous quantities, and consequently a coherent framework for robustness analysis. To present this approach, we first review the existing density perturbation methods in Section 2. Section 3 is then dedicated to the description of our method and the discussion of our numerical tools. Section 4 illustrates our methodology of density perturbation on the practical robustness index PLI. An analytical application and an industrial case study are presented in Section 5. The last section gives conclusions and some research perspectives.

The method of Lemaître et al. (2015) was later called PLI by Sueur et al. (2016), as it is based on the idea of perturbing the inputs' densities. It aims at providing a practical counterpart to the general idea of analyzing the output QoI of a model, in a UQ framework, when one or several parameters of the input probabilistic model (considered as the reference one) are changed. This can be seen as a way to take into account an "error term" one could add to an imperfectly known input distribution.
To build a perturbed distribution f_{iδ} from a distribution f_i, the approach of Lemaître et al. (2015) is non-parametric. It is mainly designed to analyze perturbations of the most common characteristics of input laws, which are the mean and the variance. To illustrate it in the case of a mean perturbation, we assume the random variable X_i ~ f_i has mean E[X_i] = µ. By definition, the perturbed density will have mean µ + δ. But this is obviously not sufficient to fully determine the perturbed law, and especially to explicitly access the value of f_{iδ} on the whole domain of X_i. Amongst all densities with a mean equal to µ + δ, f_{iδ} is defined as the solution of the minimization problem

f_{i\delta} = \arg\min_{\pi \in \mathcal{P} \ \text{s.t.}\ \mathbb{E}_\pi[X_i] = \mathbb{E}_{f_i}[X_i] + \delta} \mathrm{KL}(\pi \,\|\, f_i),    (1)

where P is the set of all probability measures absolutely continuous with respect to f_i. This approach basically consists of perturbing the chosen parameter while changing the initial model as little as possible. With this definition, "changing" the model is understood as an increase of entropy, the Kullback-Leibler divergence between two densities π and f being

\mathrm{KL}(\pi \,\|\, f) = \int \log\left(\frac{\pi(x)}{f(x)}\right) \pi(x)\, dx.    (2)

This method can be applied to higher-order moments (for instance moments of order 2, to define variance perturbations) and, more generally, to constraints that can be expressed as a function of the perturbed density, such as quantiles (Lemaître, 2014). Notice that, in the case of an initial Gaussian distribution, the perturbed distribution remains Gaussian with a mean shift of δ.

In the general case, this method has several drawbacks. First of all, the likelihood ratio between f_{iδ} and f_i might not have an analytic form, which leads to numerical difficulties. Moreover, this method requires defined moments for the initial density. Finally, the main difficulty concerns the interpretation of the results obtained from this PLI method.
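As recalled above, for a Gaussian initial density the KL-minimizing mean-shift perturbation is again Gaussian with mean shifted by δ, so the divergence (2) has the closed form KL(f_{iδ} ‖ f_i) = δ²/(2σ²). The following sketch (illustrative, not from the paper; the quadrature grid is our choice) checks this closed form against a direct numerical evaluation of (2):

```python
import math

def kl_gauss_mean_shift(delta, sigma):
    """Closed-form KL( N(mu+delta, sigma^2) || N(mu, sigma^2) ) = delta^2 / (2 sigma^2)."""
    return delta**2 / (2.0 * sigma**2)

def kl_numeric(delta, sigma, mu=0.0, n=20001, half_width=12.0):
    """Numerical KL(pi || f) = int log(pi/f) pi dx, trapezoid rule on a wide grid."""
    def pdf(x, m):
        return math.exp(-0.5 * ((x - m) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    lo, hi = mu - half_width * sigma, mu + half_width * sigma + delta
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        pi_x, f_x = pdf(x, mu + delta), pdf(x, mu)  # pi = perturbed, f = initial
        if pi_x > 0 and f_x > 0:
            w = 0.5 if k in (0, n - 1) else 1.0     # trapezoid endpoint weights
            total += w * pi_x * math.log(pi_x / f_x) * h
    return total

print(kl_gauss_mean_shift(0.5, 1.0))      # 0.125
print(kl_numeric(0.5, 1.0))               # ≈ 0.125
```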
Indeed, each uncertain input of the UQ model is perturbed with a range of δ values. To interpret the QoI shifts resulting from these perturbations, a clear understanding of the physical meaning of each perturbation is necessary. Low interpretability of the perturbed density can appear for some physical parameters, e.g. for uncertainties on the state of the system coming from a variability of the quantity throughout the operating process. In this case, the probability distribution of the uncertain quantity can be regarded in terms of relative frequency of occurrence. But it can be more difficult when it comes to constant physical parameters known with a limited accuracy.

We recall that all input random variables are assumed mutually independent. Nonetheless, the effect of perturbations can be considered only for each variable individually and in absolute terms (as a same δ shift might have completely different impacts for different input densities). This methodology thus makes it difficult to compare the relative impact of perturbations between different inputs.

2.2 Standard space transformation

To interpret the δ shift on the input distribution, and especially to allow a comparison between inputs according to the impact of a same perturbation on the QoI, an equivalence criterion between inputs is required. An idea developed by Perrin and Defaux (2019) consists of applying perturbations in the so-called standard space (instead of the initial physical space), in which all input laws are identical, making all perturbations equivalent. Finally, the perturbed densities are obtained by applying the inverse of the transformation used to map the inputs into the standard space.

In the case of independent inputs, the required distribution transformation is a simple inverse probability transform.
Given a random vector X with cumulative distribution function F, the transform is the random vector S = Φ^{-1}(F(X)), where Φ is the cumulative distribution function of the standard Gaussian distribution N(0, I_d). Consequently, S follows a standard Gaussian distribution whatever the initial distribution F. In the Kullback-Leibler divergence minimization framework (see Section 2.1), a perturbation of the mean simply consists of a mean shift without changing the standard deviation. Hence this leads to an analytical expression for the perturbed density f_{iδ}, thanks to the change of variables formula (Stirzaker, 2003, p. 318):

f_\delta(x) = e^{\delta \Phi^{-1}(F(x)) - \delta^2/2}\, f(x).    (3)

This simple formula makes the perturbed density and the likelihood ratio easy to compute. However, similar perturbations in the standard space imply very different ones in the physical space, according to the initial distribution. As an example, Figure 1 depicts the Kullback-Leibler divergences (approximated with Simpson's rule (Abramowitz and Stegun, 1964)) between each of two particular distributions (the triangular T(−1, 0, 1) and the uniform U[−1, 1]) and its perturbed version.

The Kullback-Leibler divergence can be interpreted as the power of a hypothesis test with null hypothesis "X_i follows the distribution f_i" and alternative hypothesis "X_i follows the distribution f_{iδ}" (Eguchi and Copas, 2006). For this reason, it seems to be an appropriate tool to measure how far a perturbed density is from its initial reference, and thus to provide a formal counterpart to the vague idea of "uncertainty on the distribution". It is especially well suited to this purpose.
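A minimal sketch of formula (3), assuming a uniform initial law U[−1, 1] (an illustrative choice, not one of the paper's cases); `statistics.NormalDist` supplies Φ and Φ^{-1}, and a midpoint rule checks that the perturbed function is still a probability density:

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian: STD.cdf is Phi, STD.inv_cdf is Phi^{-1}

def perturbed_density(x, delta, f, F):
    """Eq. (3): f_delta(x) = exp(delta * Phi^{-1}(F(x)) - delta^2/2) * f(x)."""
    return math.exp(delta * STD.inv_cdf(F(x)) - 0.5 * delta**2) * f(x)

# Illustrative initial law: uniform U[-1, 1] (f and F below are our assumptions).
f = lambda x: 0.5
F = lambda x: (x + 1.0) / 2.0

# Midpoint rule (avoids the endpoints, where Phi^{-1} diverges): total mass of f_delta.
n, delta = 50_001, 0.5
h = 2.0 / n
mass = sum(perturbed_density(-1.0 + (k + 0.5) * h, delta, f, F) * h for k in range(n))
print(f"total mass of the perturbed density: {mass:.4f}")  # close to 1.0
```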
The triangular distribution T(−1, 0, 1) is parametrized by its minimum a, mode b and maximum c.

Figure 1: Kullback-Leibler divergence between the initial distribution and the perturbed one, for a range of perturbation levels δ, for the triangular T(−1, 0, 1) and the uniform U[−1, 1] distributions.

To allow intuitive understanding of the consequences of these perturbations on the output distribution, it is necessary to base our perturbation method on a metric which allows at the same time to compare perturbations on different parameters of the same distribution and on different inputs of the UQ model. In particular, it should not depend on the representation of the input distribution, which means being independent of the parametrization. The Fisher distance has all these advantages. It is based on the local scalar product induced by the Fisher information matrix in a given parametric space, and defines a Riemannian geometry on the corresponding set of probability measures, as on any Riemannian manifold with its associated metric. Consider the parametric density family S = {f_θ, θ ∈ Θ ⊂ R^r}. We recall that every input variable represents a physical parameter with a known domain of validity; therefore, for all θ in Θ, the support of f_θ is assumed fixed. The metric associated to the coordinate function θ, called the Fisher (or Fisher-Rao) metric, is defined as

I(\theta) = \mathbb{E}\left[\nabla_\theta \log f_\theta(X) \left(\nabla_\theta \log f_\theta(X)\right)^T\right],

where I(θ) is the Fisher information matrix evaluated at θ for this statistical model. The Fisher information, well known for instance in optimal design, Bayesian statistics and machine learning, is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ of the distribution of X. The Fisher information matrix defines the following local inner product in S, for u ∈ R^r and v ∈ R^r:

\langle u, v \rangle_\theta = u^T I(\theta)\, v.
(4)

Given two distributions f_{θ_0} and f_{θ_1} in the manifold S, a path from f_{θ_0} to f_{θ_1} is a piecewise smooth map q : [0, 1] → Θ satisfying q(0) = θ_0 and q(1) = θ_1. Its length l(q) satisfies the following equation:

l(q) = \int_0^1 \sqrt{\langle \dot q(t), \dot q(t) \rangle_{q(t)}}\, dt,    (5)

where q̇ is the derivative of q. Likewise, the energy E(q) of a path is defined by the equation:

E(q) = \frac{1}{2} \int_0^1 \langle \dot q(t), \dot q(t) \rangle_{q(t)}\, dt.    (6)

The distance between f_{θ_0} and f_{θ_1}, called the Fisher distance, is defined as the minimal length over the set of paths from f_{θ_0} to f_{θ_1}, denoted by P(f_{θ_0}, f_{θ_1}):

d_F(f_{\theta_0}, f_{\theta_1}) = \inf_{q \in P(f_{\theta_0}, f_{\theta_1})} l(q).    (7)

The path γ minimizing this length, or equivalently minimizing the energy, is called a geodesic (Costa et al., 2012). The specific choice of the Fisher information matrix as the Riemannian metric matrix leads to a very interesting statistical interpretation, as shown in Amari (1985, p. 27). It is directly related to the Cramér-Rao lower bound (Rao, 1945), which states that, for any unbiased estimator θ̂ of θ, the covariance matrix Var(θ̂) is bounded below by I(θ)^{-1}. This means that the Fisher information is the maximum amount of information about the value of a parameter one can extract from a given sample. More formally, under some regularity conditions (given by Newey and McFadden (1994, Theorem 3.3)), if x_1, ..., x_n are n independent observations distributed according to a density f_θ, the maximum likelihood estimator θ̂_n of θ converges weakly to a normal law with mean θ and covariance I(θ)^{-1}/n. The density of θ̂_n, denoted by p(θ̂_n; θ), writes

p(\hat\theta_n; \theta) = \frac{\sqrt{\det(n I(\theta))}}{(2\pi)^{r/2}} \exp\left(-\frac{n\, (\hat\theta_n - \theta)^T I(\theta) (\hat\theta_n - \theta)}{2}\right).
(8)

When n is large, this probability density depends on θ̂_n through the quadratic form n(θ̂_n − θ)^T I(θ)(θ̂_n − θ), which involves the local inner product defined in equation (4). Therefore, the Fisher distance between two distributions with parameters θ and θ′ can be constructed as a measure of the risk of confusion between them. In other words, the Fisher distance between two distributions f_θ and f_{θ′} represents the separability of the two distributions by a finite sample of independent observations sampled from the f_θ distribution.

We illustrate the Fisher distance on a simple example. Consider the statistical manifold of univariate normal distributions S = {N(µ, σ), (µ, σ) ∈ R × R*₊}. The Fisher information matrix has the analytical form (Costa et al., 2012):

I(\mu, \sigma) = \begin{pmatrix} 1/\sigma^2 & 0 \\ 0 & 2/\sigma^2 \end{pmatrix}.    (9)

We can apply the change of coordinates φ: (µ, σ) → (µ/√2, σ), so that the related geometry is the hyperbolic geometry of the Poincaré half-plane (Stillwell, 1997), in which the geodesics and the distance between two normal distributions are known analytically (Costa et al., 2012). Geometrically, the geodesics are the vertical lines and the half-circles centered on the line σ = 0.

Further details on the interpretation of information geometry can be found in Costa et al. (2012). Figure 2 shows the position of four Gaussian distributions in the (µ/√2, σ) half-plane. It is clear that the Gaussian distributions C and D are more difficult to distinguish than the distributions A and B, although in both cases the KL divergence is the same. The hyperbolic geometry induced by the Fisher information provides a representation in accordance with this intuition. Indeed, the two dashed curves are the geodesics between points A and B, and between points C and D, respectively. We observe that the Fisher distance between A and B is greater than the distance between C and D. This illustrates how information geometry provides a proper framework to measure statistical dissimilarities in a space of probability measures.

The Fisher distance provides a satisfactory grounding for our notion of density perturbation. We define a perturbation of a density f to be of magnitude δ if the Fisher distance
Figure 2: Representation of four Gaussian distributions in the parameter space (left) and their respective densities (right). Although KL(A ‖ B) = KL(C ‖ D), it is easier to distinguish A from B than C from D. The dashed curved lines are two geodesics in the (µ/√2, σ) plane with different lengths.

between f and the perturbed density f_δ is equal to δ. The set of all perturbations of f at level δ is then the Fisher sphere of radius δ centered at f, whichever parameter the perturbation is applied to. This implies that, in this framework, we do not consider one specific perturbed distribution but an infinite set of probability densities. The next section is dedicated to the development of a numerical method to compute the Fisher spheres of radius δ centered at f.

As detailed in Section 3.1, geodesics are defined as the solution of a minimization problem. More specifically, a geodesic is a path with minimal length or energy (denoted E). Given a smooth map q : [0, 1] → S, we have

E(q) = \frac{1}{2} \int_0^1 \langle \dot q(t), \dot q(t) \rangle_{q(t)}\, dt.    (10)

In the following, we denote L(t, q, q̇) = ½ ⟨q̇(t), q̇(t)⟩_{q(t)}; L is called the Lagrangian of the system. The energy of a path can be rewritten as

E(q) = \int_0^1 L(t, q, \dot q)\, dt.    (11)

A necessary condition for the path q to minimize the energy E is to satisfy the Euler-Lagrange equation (see Gelfand and Fomin (2012) for details):

\frac{\partial L}{\partial q} = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot q}\right).    (12)

We denote p = ∂L/∂q̇ and obtain, by differentiation of the quadratic form L(t, q, q̇) = ½ q̇^T I(q) q̇, that p = I(q) q̇ and q̇ = I^{-1}(q) p. Then, inspired by Lagrangian mechanics (Arnold, 1997, p. 65), the Hamiltonian H(p, q) defined by

H(p, q) = p^T \dot q - L(t, q, \dot q) = p^T I^{-1}(q)\, p - \frac{1}{2} \dot q^T I(q)\, \dot q = \frac{1}{2} p^T I^{-1}(q)\, p    (13)

is constant whenever q is a geodesic. Eq.
(13) is derived from the Euler-Lagrange equation and implies that (p, q) follows a system of ordinary differential equations (ODE) called Hamilton's equations:

\dot q = \frac{\partial H}{\partial p} = I^{-1}(q)\, p, \qquad \dot p = -\frac{\partial H}{\partial q} = \frac{\partial L(t, q, I^{-1}(q) p)}{\partial q}.    (14)

The objective is to determine every geodesic q satisfying q(0) = θ and d_F(f, q(1)) = δ; this corresponds to computing the Fisher sphere centered at f_θ with radius δ. The only degree of freedom left to fully solve the ODE system (14) is the initial velocity p(0). Notice that the Hamiltonian is equal to the kinetic energy, as p = I(q) q̇. As the Hamiltonian is constant on a geodesic, we have for all t:

\frac{1}{2} \langle \dot q(t), \dot q(t) \rangle_{q(t)} = k,    (15)

where k is non-negative. The length of q is therefore equal to

\int_0^1 \sqrt{\langle \dot q(t), \dot q(t) \rangle_{q(t)}}\, dt = \sqrt{2k},    (16)

so that δ = √(2k). Therefore, Eq. (13) rewrites:

\delta = \sqrt{2k} \iff p^T I^{-1}(q)\, p = \delta^2.    (17)

Taking equation (17) at the initial state t = 0, we can determine all the initial velocities such that d_F(q(0), q(1)) = δ. These velocities are needed to solve the ODE system (14) and compute the geodesics.

Generally, computing the geodesic between two given distributions is a challenging problem; methods relying on shooting algorithms have been developed for that purpose. Our framework sidesteps this problem, as we compute the entire Fisher sphere. In the next section, we focus on numerical methods for computing geodesics by solving the system of ODE (14). These methods are illustrated by computing Fisher spheres in the Gaussian manifold S = {N(µ, σ), (µ, σ) ∈ R × R*₊}.

The Hamilton equations (14) are solved with numerical approximation methods. Figure 3 illustrates our numerical resolution method in the Gaussian case, that is when
S = {N(µ, σ), (µ, σ) ∈ R × R*₊}. In order to solve (14), we compare two different numerical methods, namely the explicit Euler algorithm and the Adams-Moulton algorithm. We recall that in the Gaussian case we have an exact analytical expression of the Fisher sphere, detailed in Costa et al. (2012). The Fisher sphere is centered at N(0, 1) with radius δ = 1. Notice that there is no observable difference between the two methods in Figure 3. Hence, a better way to estimate the numerical error is required. We recall that the Hamiltonian value is conserved along the geodesics. Therefore, it is possible to quantify the performance of the numerical approximation by computing the value

\Delta(t) = \frac{H(p(t), q(t)) - H(p(0), q(0))}{H(p(0), q(0))}

for t ∈ [0, 1], for each geodesic q computed with our numerical methods.
Figure 3: Geodesics in the Gaussian information geometry computed with the explicit Euler and Adams-Moulton methods. The radius δ is equal to 1.

Figure 4 displays the value of ∆(t) for t ∈ [0, 1] for one arbitrary geodesic shown in Figure 3. The relative error for the Adams-Moulton method is negligible, while the maximum relative error for the explicit Euler scheme is around 0.3%. Hence, in the Gaussian case the Adams-Moulton scheme is preferred. Nevertheless, some instabilities have been observed in practice, mainly due to the truncation of the distribution support, which impairs the conservation of the Hamiltonian. Symplectic methods (Amari and Nagaoka, 2000; Leimkuhler and Reich, 2005), and more particularly the symplectic Euler algorithm, could help to address this problem by keeping the Hamiltonian constant. This will be the subject of a future work. Moreover, the truncation can lead to other numerical errors when the radius δ is too large. Indeed, the normalization factor of some truncated distributions can become smaller than the machine precision.
Figure 4: Relative variation ∆ of the Hamiltonian along a geodesic for two different numerical schemes.
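On the Gaussian manifold, the Fisher matrix (9) gives Hamilton's equations (14) in closed form, which makes the explicit Euler scheme easy to sketch. The following is a minimal illustration under our own choices of step size, starting point N(0, 1) and initial velocity on the level set (17); it is not the code used for Figure 3:

```python
import math

def fisher_inv(q):
    """Inverse Fisher matrix for N(mu, sigma): I^{-1}(mu, sigma) = diag(sigma^2, sigma^2/2)."""
    mu, sigma = q
    return (sigma**2, sigma**2 / 2.0)

def hamiltonian(p, q):
    """H(p, q) = 1/2 p^T I^{-1}(q) p, conserved along geodesics (eq. 13)."""
    a, b = fisher_inv(q)
    return 0.5 * (a * p[0] ** 2 + b * p[1] ** 2)

def euler_geodesic(q0, p0, n_steps=10_000, t_end=1.0):
    """Explicit Euler integration of Hamilton's equations (14) on the Gaussian manifold."""
    dt = t_end / n_steps
    (mu, sigma), (pm, ps) = q0, p0
    for _ in range(n_steps):
        # qdot = I^{-1}(q) p ;  pdot = -dH/dq  (dH/dmu = 0, so pm stays constant)
        dmu = sigma**2 * pm
        dsigma = (sigma**2 / 2.0) * ps
        dps = -(sigma * pm**2 + (sigma / 2.0) * ps**2)
        mu, sigma, ps = mu + dt * dmu, sigma + dt * dsigma, ps + dt * dps
    return (mu, sigma), (pm, ps)

# Start at N(0, 1); pick p(0) on the level set p^T I^{-1}(q) p = delta^2 (eq. 17), delta = 1.
delta, phi = 1.0, math.pi / 3.0
q0, p0 = (0.0, 1.0), (delta * math.cos(phi), delta * math.sqrt(2.0) * math.sin(phi))
h0 = hamiltonian(p0, q0)
q1, p1 = euler_geodesic(q0, p0)
drift = abs(hamiltonian(p1, q1) - h0) / h0
print(q1, drift)  # endpoint on the Fisher sphere of radius ~1; Hamiltonian drift stays small
```

Varying the angle phi over [0, 2π) sweeps the initial velocities of equation (17) and hence traces out the whole Fisher sphere, as described in the text.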
Application to Perturbed-Law based Indices
The UQ robustness analysis explained in Section 1 and Section 2 aims at quantifying the impact of a lack of knowledge on an input distribution in the UQ of model outputs. In Section 3, a coherent formal definition of density perturbation has been proposed. We now illustrate the interest of this solution for the definition of a practical robustness analysis methodology. Analyzing the effect of perturbing an input density first requires defining an index which summarizes this effect on the QoI.
A PLI aims to measure the impact of the modification of an input density on the QoI, such as a quantile or a threshold exceedance probability of the model output (Lemaître et al., 2015; Sueur et al., 2016). In the following, we focus on a quantile of order α, which is often used in practical applications as a risk measure (Mousseau and Williams, 2017; Delage et al., 2018; Larget, 2019).

Given the random vector X = (X_1, ..., X_d) ∈ X of our d independent uncertain input variables, G(·) our numerical model and Y = G(X) ∈ R the model output, the quantile of order α of Y is

q^\alpha = \inf\{t \in \mathbb{R},\ F_Y(t) \ge \alpha\},    (18)

where F_Y is the cumulative distribution function of the random variable Y. In order to compute the i-th PLI, we change the density f_i of X_i into a density f_{iδ}, where δ ∈ R₊ is the perturbation level, and consider the perturbed quantile

q^\alpha_{i\delta} = \inf\{t \in \mathbb{R},\ F_{Y,i\delta}(t) \ge \alpha\},    (19)

where F_{Y,iδ} is the cumulative distribution function of the output corresponding to the input variable X_i sampled from f_{iδ}. The PLI is then simply defined as the relative change in the output quantile generated by the perturbation:

S_{i\delta} = \frac{q^\alpha_{i\delta} - q^\alpha}{q^\alpha}.    (20)

This definition slightly differs from the one proposed in previous studies (Lemaître et al., 2015; Sueur et al., 2017). Indeed, after several applications of the PLI, it has been found more convenient to directly compute the relative variation of the quantile when submitted to a density perturbation; the interpretation is then straightforward.

In many applications, for instance in nuclear safety exercises, the computer models are costly in terms of CPU time and memory. Only a limited number N of code runs is then available for the estimation of all the PLIs. We then have a sample Y_N = {y^(n)}_{1≤n≤N} of N outputs of the model from a sample X_N = {X^(n) = (x_1^(n), ..., x_d^(n))}_{1≤n≤N} of N independent realizations of X.
The estimation of the quantile is based on the empirical quantile estimator q̂^α_N = inf{t ∈ R, F̂^N_Y(t) ≥ α}, where

\widehat F^N_Y(t) = \frac{1}{N} \sum_{n=1}^{N} \mathbb{1}(y^{(n)} \le t)

is the empirical estimator of the cumulative distribution function of Y. In order to estimate the perturbed quantile q̂^α_{N,iδ} from the same sample X_N, we use the so-called reverse importance sampling mechanism from Hesterberg (1996) (see the online supplementary material) to compute F̂^N_{Y,iδ} (Delage et al., 2018):

\widehat F^N_{Y,i\delta}(t) = \frac{\sum_{n=1}^{N} L_i^{(n)}\, \mathbb{1}(y^{(n)} \le t)}{\sum_{n=1}^{N} L_i^{(n)}},    (21)

with L_i^{(n)} = f_{iδ}(x_i^{(n)}) / f_i(x_i^{(n)}) the likelihood ratio. The estimator of the PLI is then

\widehat S_{N,i\delta} = \frac{\widehat q^\alpha_{N,i\delta} - \widehat q^\alpha_N}{\widehat q^\alpha_N}.    (22)

As presented in Section 3, the Fisher sphere of radius δ centered at the initial input distribution f_i, denoted by ∂B_F(f_i, δ) = {g, d_F(f_i, g) = δ}, provides a coherent definition of perturbed distributions. This means that we do not consider one specific perturbation at level δ, but a whole set of perturbed distributions ∂B_F(f_i, δ). Over this set, we compute the maximum S^+_{iδ} and the minimum S^-_{iδ} of the PLI over all distributions in ∂B_F(f_i, δ):

S^+_{i\delta} = \max_{g \in \partial B_F(f_i, \delta)} S_i(g),    (23)

S^-_{i\delta} = \min_{g \in \partial B_F(f_i, \delta)} S_i(g),    (24)

where S_i(g) is the PLI with g as the perturbed density for the variable X_i. Among all perturbed distributions at level δ, we thus investigate the ones that deviate the quantile the most from its original value. These two quantities S^+_{iδ} and S^-_{iδ} measure the robustness of the numerical code, taking into account the uncertainties tainting the input distributions.

In this section, we investigate some theoretical aspects of the PLI estimator Ŝ_{N,iδ}.
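The estimators (21)-(22) can be sketched as follows, assuming the toy model Y = G(X) = X with a single standard Gaussian input and a hypothetical mean-shift perturbation N(0, 1) → N(m, 1), whose likelihood ratio exp(m x − m²/2) is known in closed form (all names below are illustrative):

```python
import math
import random

def weighted_quantile(ys, ws, alpha):
    """Quantile of the self-normalised weighted empirical CDF (eq. 21)."""
    pairs = sorted(zip(ys, ws))
    total = sum(w for _, w in pairs)
    acc = 0.0
    for y, w in pairs:
        acc += w
        if acc / total >= alpha:
            return y
    return pairs[-1][0]

def pli_mean_shift(sample, alpha, shift):
    """PLI estimator (eq. 22) for a standard Gaussian input whose mean is shifted.
    Likelihood ratio of N(shift, 1) vs N(0, 1): exp(shift * x - shift^2 / 2)."""
    ws = [math.exp(shift * x - 0.5 * shift**2) for x in sample]
    q = weighted_quantile(sample, [1.0] * len(sample), alpha)  # unperturbed quantile
    q_pert = weighted_quantile(sample, ws, alpha)              # reverse-IS quantile
    return (q_pert - q) / q

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]  # toy model: Y = G(X) = X
print(pli_mean_shift(xs, 0.95, 0.3))  # roughly 0.3 / q^0.95 with q^0.95 ≈ 1.645
```

Note that the same output sample is reused for every perturbation level, which is the point of the reverse importance sampling mechanism: no new code runs are needed.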
As it is based on the quantile estimators, we first focus on the asymptotic properties of the estimator $(\widehat{q}^\alpha_N, \widehat{q}^\alpha_{N,i\delta})$. Detailed proofs of the following results are reported in Appendix A.

Theorem 1.
Suppose that $F_Y$ is differentiable at $q^\alpha = F_Y^{-1}(\alpha)$ with $F'_Y(q^\alpha) > 0$, and that $F_{Y,i\delta}$ is differentiable at $q^\alpha_{i\delta} = F_{Y,i\delta}^{-1}(\alpha)$ with $F'_{Y,i\delta}(q^\alpha_{i\delta}) > 0$. We denote
$$\Sigma = \begin{pmatrix} \sigma_i^2 & \tilde{\theta}_i \\ \tilde{\theta}_i & \tilde{\sigma}_{i\delta}^2 \end{pmatrix}$$
with
$$\sigma_i^2 = \frac{\alpha(1-\alpha)}{F'_Y(q^\alpha)^2}, \qquad (25)$$
$$\tilde{\sigma}_{i\delta}^2 = \frac{\mathbb{E}\left[\left(\frac{f_{i\delta}(X_i)}{f_i(X_i)}\right)^2 \left(\mathbb{1}(G(X) \leq q^\alpha_{i\delta}) - \alpha\right)^2\right]}{F'_{Y,i\delta}(q^\alpha_{i\delta})^2},$$
$$\tilde{\theta}_i = \frac{\mathbb{E}\left[\frac{f_{i\delta}(X_i)}{f_i(X_i)}\, \mathbb{1}(G(X) \leq q^\alpha)\, \mathbb{1}(G(X) \leq q^\alpha_{i\delta})\right] - \alpha\, \mathbb{E}\left[\mathbb{1}(G(X) \leq q^\alpha_{i\delta})\right]}{F'_Y(q^\alpha)\, F'_{Y,i\delta}(q^\alpha_{i\delta})}.$$
Suppose that the matrix $\Sigma$ is invertible and that $\mathbb{E}\left[\left(\frac{f_{i\delta}(X_i)}{f_i(X_i)}\right)^2\right] < +\infty$. Then
$$\sqrt{N} \left( \begin{pmatrix} \widehat{q}^\alpha_N \\ \widehat{q}^\alpha_{N,i\delta} \end{pmatrix} - \begin{pmatrix} q^\alpha \\ q^\alpha_{i\delta} \end{pmatrix} \right) \xrightarrow{\mathcal{L}} \mathcal{N}(0, \Sigma).$$

The PLI $S_{i\delta}$ is a straightforward transformation of the couple $(q^\alpha, q^\alpha_{i\delta})^T$. To obtain the almost sure convergence of $\widehat{S}_{N,i\delta}$ to $S_{i\delta}$, it suffices to apply the continuous-mapping theorem to the function $s(x, y) = \frac{y - x}{x}$.

Theorem 2.
Given the assumptions of Theorem 1, we have
$$\sqrt{N}\,(\widehat{S}_{N,i\delta} - S_{i\delta}) \xrightarrow{\mathcal{L}} \mathcal{N}(0,\, d_s^T \Sigma\, d_s) \quad \text{with} \quad d_s = \begin{pmatrix} -q^\alpha_{i\delta}/(q^\alpha)^2 \\ 1/q^\alpha \end{pmatrix}. \qquad (26)$$
Notice that the asymptotic variance relies on the initial and perturbed quantiles of order $\alpha$, which are precisely what we want to estimate. Hence, Theorem 2 cannot be used directly for building asymptotic confidence intervals. However, these convergence properties are important for the credibility and acceptance of the method. In practice, the estimation error can be measured by bootstrapping (Efron, 1979).

As already discussed, in practical applications the computer model is often costly and cannot be reevaluated. The main limitation of the previously exposed estimator $\widehat{S}_{N,i\delta}$ then arises from the finite size of the available sample. Therefore, at a certain level of perturbation, there might not be enough sample points to correctly compute the perturbed quantile (and its confidence interval). One of the key issues of the methodology is to determine how far the input distribution should be perturbed. We propose to adapt the empirical criterion from Iooss et al. (2020) in order to establish a maximal perturbation level $\delta_{\max}$: the number of points in the output sample $\mathcal{Y}_N$ smaller or larger than the $\delta$-perturbed quantile has to be sufficient. A value of $N_Y = 10$ has been chosen (from several numerical tests) as the smallest admissible number of such points for computing the PLI-quantile. As soon as a distribution on the Fisher sphere violates this criterion, the corresponding radius is taken as $\delta_{\max}$.

The estimation of the quantities of interest $S^+_{i\delta}$ and $S^-_{i\delta}$ is summarized as follows:

• Choose a level of perturbation $\delta$, an input number $i \in \{1, \ldots, d\}$, and sample $K$ points on the Fisher sphere of radius $\delta$ centered at $f_i$, using the numerical method of Section 4.2.

• For each $\{f^{(k)}_{i\delta}\}_{1 \leq k \leq K}$ sampled on the Fisher sphere, estimate $q^{\alpha,(k)}_{i\delta}$ using the reverse importance sampling technique based on the sample $\mathcal{X}_N$.
Verify that the number of points in the output sample below or above the perturbed quantile $q^{\alpha,(k)}_{i\delta}$ satisfies the stopping criterion $N_Y$. Then, compute the PLI estimator $\widehat{S}^{(k)}_{N,i\delta}$.

• The estimators $\widehat{S}^+_{N,i\delta}$ and $\widehat{S}^-_{N,i\delta}$ of the quantities of interest $S^+_{i\delta}$ and $S^-_{i\delta}$ are taken as the maximal and minimal values of the PLIs sampled on the Fisher sphere, $\{\widehat{S}^{(k)}_{N,i\delta}\}_{1 \leq k \leq K}$.

We emphasize that this restriction only applies to expensive computer models. Indeed, the bootstrap variance of the quantile estimated with reverse importance sampling tends to become very large, as illustrated in Iooss et al. (2020), because the likelihood ratio punctually explodes. Thus, when dealing with a cheap code, one can directly resample from the perturbed distribution in order to estimate the output quantile; in this situation, there is no limiting perturbation level $\delta_{\max}$.

The code for the new version of the PLI, called OF-PLI (for Optimal Fisher-based PLI) in the following, is available at https://github.com/JeromeStenger/PLI-Technometrics. In future works, bootstrap confidence intervals will provide valuable additional information on the OF-PLI; they are not pictured in the following applications as they require further investigations at this stage. The code for computing the old version of the PLI, called E-PLI (for Entropy-based PLI) in the following, is available in the sensitivity package of the R software.

The PLI, as defined in the previous sections, allows to assess to what extent the output quantile can be impacted by an error of magnitude $\delta$ in the characterization of an input distribution. In the next subsection, we compare on a toy example the newly introduced methodology (OF-PLI) to the previous one (E-PLI).
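The estimation procedure summarized above can be condensed into a short loop. Here `sample_on_fisher_sphere` is a stand-in for the numerical method of Section 4.2: for this sketch we simply assume it returns K perturbed (mean, std) pairs for a standard Gaussian input, an illustrative first-order approximation rather than the paper's actual Fisher-sphere sampler; the exponential toy model is also an assumption.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def weighted_quantile(y, w, alpha):
    order = np.argsort(y)
    y, w = np.asarray(y)[order], np.asarray(w)[order]
    return y[np.searchsorted(np.cumsum(w) / np.sum(w), alpha)]

def sample_on_fisher_sphere(delta, K, rng):
    """Stand-in for Section 4.2: (mean, std) perturbations of size ~delta
    around N(0, 1), a first-order approximation of the Fisher sphere."""
    angles = rng.uniform(0.0, 2.0 * np.pi, K)
    return [(delta * np.cos(a), 1.0 + delta * np.sin(a) / np.sqrt(2.0))
            for a in angles]

def of_pli_bounds(x, y, alpha, delta, K, n_y_min=10, rng=None):
    """Max/min PLI over K perturbed densities, with the N_Y stopping criterion."""
    rng = rng if rng is not None else np.random.default_rng(0)
    q = np.quantile(y, alpha)
    plis = []
    for mu, sigma in sample_on_fisher_sphere(delta, K, rng):
        L = normal_pdf(x, mu, sigma) / normal_pdf(x, 0.0, 1.0)  # likelihood ratios
        q_k = weighted_quantile(y, L, alpha)
        if np.sum(y > q_k) < n_y_min:  # too few points above: criterion violated
            continue
        plis.append((q_k - q) / q)
    return max(plis), min(plis)

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 5000)  # nominal input sample
y = np.exp(x)                   # illustrative scalar model G
s_plus, s_minus = of_pli_bounds(x, y, alpha=0.9, delta=0.2, K=50)
print(f"S+ = {s_plus:+.3f}, S- = {s_minus:+.3f}")
```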
Moreover, as the PLI is based on a change in the input distribution, it differs from global sensitivity measures (Iooss and Lemaître, 2015), which evaluate the effect of the input variability for a fixed probabilistic model. To study the potential coherences and divergences between the two approaches, we compare Sobol' indices and OF-PLI results in Section 5.2 on an analytical model. In the third subsection, we illustrate the use of the OF-PLI as a support in the nuclear safety analysis of a pressurized water nuclear reactor.

The Ishigami function (Ishigami and Homma, 1990) is a classical test case for uncertainty and sensitivity analysis methods, in particular because it exhibits strong non-linearity and non-monotonicity. In this section, we apply the methodology introduced in Section 4.3 to estimate the OF-PLI and compare our results to the E-PLI. The Ishigami function takes three normally distributed, zero-mean input random variables $(X_1, X_2, X_3)$ and writes
$$G(x_1, x_2, x_3) = \sin(x_1) + 7 \sin^2(x_2) + 0.1\, x_3^4 \sin(x_1). \qquad (27)$$
We intend to evaluate the impact of a perturbed input distribution on the 95%-quantile. In this simple example, where the function is cheap to evaluate, we do not use the reverse importance sampling estimator of the quantile proposed in Section 4.3. We rather draw new samples of size $N = 2000$ directly from the perturbed input distributions in order to compute the perturbed output quantile. We chose a number of $K = 100$ trajectories over each Fisher sphere for computing the minimum and maximum of the OF-PLI, $\widehat{S}^+_{N,i\delta}$ and $\widehat{S}^-_{N,i\delta}$, for perturbation levels $\delta$ varying in $[0, \delta_{\max}]$.

The OF-PLI results are depicted in Figure 5. It appears that the third input has the most impact, in particular for shifting the quantile to the right. On the other hand, the second input has more impact for perturbing the quantile to the left.
Our results are consistent with the well-known behavior of the Ishigami function, both in terms of the non-linearity of the model and of the primary influence of the third input.

Figure 5: Minimum and maximum of the OF-PLI over the Fisher sphere, estimated from $K = 100$ trajectories, for increasing perturbation levels $\delta$.

Figure 6: Value of the OF-PLI $\widehat{S}_{N,i\delta}$ (red line) for the third input of the Ishigami model ($N = 100$, $i = 3$) over a Fisher sphere, as a function of the mean $\mu$ and variance $\sigma^2$ of the perturbed Gaussian distribution.

Figure 7: Computation of the E-PLI. Left: perturbation of the mean of the Gaussian distribution. Right: perturbation of the variance of the Gaussian distribution.
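The Ishigami test case above is easy to reproduce. The sketch below evaluates Eq. (27) on Gaussian inputs and estimates the 95%-quantile by direct resampling under a mean shift of the third input; the unit input variance and the chosen shift of +0.5 are illustrative assumptions.

```python
import numpy as np

def ishigami(x1, x2, x3):
    """Ishigami function (Eq. 27)."""
    return np.sin(x1) + 7.0 * np.sin(x2) ** 2 + 0.1 * x3 ** 4 * np.sin(x1)

rng = np.random.default_rng(3)
N, alpha = 2000, 0.95

# Nominal inputs: independent centered Gaussians (unit variance assumed here).
X = rng.normal(0.0, 1.0, (N, 3))
q = np.quantile(ishigami(*X.T), alpha)

# Direct resampling under a perturbed third input (mean shift +0.5, illustrative).
Xp = X.copy()
Xp[:, 2] = rng.normal(0.5, 1.0, N)
q_pert = np.quantile(ishigami(*Xp.T), alpha)

print(f"q95 = {q:.2f}, perturbed q95 = {q_pert:.2f}, PLI = {(q_pert - q) / q:+.3f}")
```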
The model of interest concerns a flooded river simulation, which is especially useful for assessing the risk of submergence of a dike protecting industrial sites located near a river. To this purpose, we use a model implementing a simplified version of the 1D hydro-dynamical Saint-Venant equations. This model computes $H$, the maximal annual water level of the river, from four parameters $Q$, $K_s$, $Z_m$ and $Z_v$, which are considered uncertain:
$$H = \left( \frac{Q}{300\, K_s \sqrt{(Z_m - Z_v)/5000}} \right)^{0.6}. \qquad (28)$$
The inputs are modeled as random variables with the associated truncated distributions given in Table 1 (Iooss and Lemaître, 2015).

Table 1: Input variables of the flood model with their associated probability distributions.

Input n°  Name  Description              Probability distribution  Truncation
1         Q     Maximal annual flowrate  Gumbel G(1013, 558)       [500, 3000]
2         K_s   Strickler coefficient    Normal N(30, 7.5)         [15, +∞)
3         Z_v   River downstream level   Triangular T(50)          [49, 51]
4         Z_m   River upstream level     Triangular T(55)          [54, 56]

It gives a total cost of $N = 6 \times \ldots$ model runs and a standard deviation of the indices' estimation error smaller than $10^{-\ldots}$. As shown in Table 2, in the central part of the distribution (conventional Sobol' indices), we observe that the variable $Q$ is clearly more influential than the variable $K_s$, whereas $Z_v$ and $Z_m$ appear to have almost no influence on the output. From the target Sobol' indices, we observe that, in the extreme part of the distribution (close to the 95%-quantile), $Q$ and $K_s$ have the same total effect (due to a strong interaction between them in making the output exceed the threshold).

Table 2: Sobol' indices estimates of the flood model inputs.

Inputs                             Q      K_s    Z_v    Z_m
First-order Sobol' indices         0.713  0.254  0.006  0.006
Total Sobol' indices               0.731  0.271  0.008  0.008
First-order target Sobol' indices  0.242  0.125  0.002  0.002
Total target Sobol' indices        0.867  0.739  0.119  0.121

We compute the OF-PLI (w.r.t. a quantile of order $\alpha = 0.95$) for the flood model inputs with the methodology of Section 4.3, for Fisher spheres of increasing radius $\delta \in [0, 1.4]$ with step 0.1. The spheres are respectively centered on the distributions of Table 1. On each of these spheres, we compute the OF-PLI for $K = 100$ different perturbed distributions, using a sample of $N = 2000$ points distributed according to the initial distribution. The maximal radius $\delta_{\max} = 1.4$ is reached for the variable $Q$: at perturbation levels $\delta > 1.4$, fewer than $N_Y = 10$ sample points remain above the maximal perturbed quantile. Figure 8 depicts how the Fisher sphere centered on the distribution of the variable $Q$ deforms and how the perturbed densities spread around the initial distribution. Figures 8b and 8c indicate that the maximal value of the OF-PLI is obtained by putting weight on the right-hand side of the distribution tail (the distributions minimizing and maximizing the OF-PLI are colored green and blue). This behavior was predictable here, as the river height is an increasing function of the river flowrate (see Eq. (28)). However, this analysis can give substantial information in a real-world engineering study. At last, one can observe (Fig. 8a) that the Fisher spheres flatten against the boundary of the parameters' domain. This characteristic is peculiar to each probability distribution; for instance, it never happens for the non-truncated normal distribution.

The results of the OF-PLI, displayed in Figure 9, confirm those of the target Sobol' indices (see Table 2): the variables 3 and 4, corresponding to $Z_v$ and $Z_m$, are much less influential on the output quantile of level $\alpha = 0.95$ than the variables 1 and 2, corresponding to $Q$ and $K_s$. Moreover, perturbations of $Q$ and $K_s$ seem to have comparable effects on the 95%-quantile of $H$, although they have significantly different contributions to the output variance. On the other hand, compared to target Sobol' indices, the OF-PLI provide more informative results through their evolution as a function of $\delta$. This clearly shows how a lack of knowledge on an input uncertainty can have a low or high impact on the value of a risk measure. In conclusion, this example confirms the interest of the OF-PLI, as it conveys complementary information compared to existing sensitivity indices. Notice that the flat parts visible on some curves are due to approximation errors attributed to the low number of sample points $N$ and the high quantile level (0.95).
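The flood model of Eq. (28) and the truncated distributions of Table 1 are simple to simulate; the sketch below does so with truncation by rejection. The parameter values follow the classical flood benchmark of Iooss and Lemaître (2015), including the channel width B = 300 m and length L = 5000 m, which are assumptions of this sketch.

```python
import numpy as np

def flood_height(Q, Ks, Zv, Zm, B=300.0, L=5000.0):
    """Maximal annual water level H of Eq. (28)."""
    return (Q / (B * Ks * np.sqrt((Zm - Zv) / L))) ** 0.6

def truncated(draw, low, high, size, rng):
    """Sample by rejection from a base sampler `draw`, truncated to [low, high]."""
    out = np.empty(0)
    while out.size < size:
        x = draw(rng, size)
        out = np.concatenate([out, x[(x >= low) & (x <= high)]])
    return out[:size]

rng = np.random.default_rng(4)
N = 20_000
Q = truncated(lambda r, n: r.gumbel(1013.0, 558.0, n), 500.0, 3000.0, N, rng)
Ks = truncated(lambda r, n: r.normal(30.0, 7.5, n), 15.0, np.inf, N, rng)
Zv = rng.triangular(49.0, 50.0, 51.0, N)
Zm = rng.triangular(54.0, 55.0, 56.0, N)

H = flood_height(Q, Ks, Zv, Zm)
print(f"95%-quantile of H: {np.quantile(H, 0.95):.2f} m")
```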
Figure 8: Analysis of the Fisher-metric-based perturbation of the truncated Gumbel distribution of the variable $Q$ (see Table 1). (a) Deformation of the Fisher sphere for increasing radius $\delta$ (from 0.1 to 1.4). (b) and (c): densities over Fisher spheres of two different radii; the initial distribution and the densities achieving the minimum and maximum of the OF-PLI (Argmin/Argmax) are highlighted.

Figure 9: Maximum and minimum estimated values of the OF-PLI, $\widehat{S}^+_{N,i\delta}$ and $\widehat{S}^-_{N,i\delta}$, for the different variables of the flood model.

This industrial application concerns the study of the peak cladding temperature (PCT) of fuel rods in case of a loss of coolant accident caused by an intermediate-size break in the primary loop (IB-LOCA) of a nuclear pressurized water reactor. According to operation rules, this temperature must remain below a threshold to prevent any deterioration of the reactor state. The thermal-hydraulic transient caused by this accidental scenario is simulated with the CATHARE2 code (Geffraye et al., 2011), providing a temperature profile throughout time for the surface of the nuclear core assemblies (Mazgaj et al., 2016). The thermal-hydraulic model involves boundary and initial conditions, and many physical parameters (heat transfer coefficients, friction coefficients, etc.) whose exact values are unknown. The probability distributions of these inputs can be obtained from data, from expert knowledge, or recovered by solving inverse problems on an experimental database (Baccou et al., 2019).

The input uncertainties are propagated through this model, and the UQ objective consists of estimating a high-order quantile of the PCT (model output). This $\alpha$-quantile is interpreted as a pessimistic estimate of the PCT.
Like any scientific approach, this methodology is based on hypotheses, and regulatory authorities require the impact of these hypotheses on the exhibited results to be evaluated. Indeed, nuclear power operators are required to conduct studies in such a way as to ensure that the actual risks are overestimated. By this "conservatism principle", they are bound to choose the most pessimistic assumption each time a modeling decision has to be made. In deterministic studies, this simply consists of taking the most penalizing values for each of the input variables; the resulting computation is then supposed to simulate a worst-case scenario for the examined risk. It is, however, not straightforward to implement such a principle when the numerical code is complex, with interactions between inputs and non-monotonic effects of inputs. It is even harder to extend this rationale to a UQ framework aiming to represent all potential scenarios with their related occurrence plausibility. Recent works (Larget, 2019) have shown that the E-PLI can be useful to support a discussion on the choice of the input distributions.

In our case, we study a reduced-scale mock-up of a pressurized water reactor with 7 uncertain inputs given in Table 3 (Delage et al., 2018). To compute the OF-PLI, an input-output sample of size $N = 1000$ is available, coming from a space-filling design of experiments (Fang et al., 2006) (whose points in $[0,1]^d$ have been transformed to follow the inputs' probability distributions). More precisely, a Latin Hypercube Sample minimizing the $L^2$-centered discrepancy criterion (Jin et al., 2005) has been used. The OF-PLI (with respect to a quantile of order $\alpha = 0.95$) will then be estimated without any additional code run (see Section 4.1).

Table 3: The 7 uncertain inputs of the IB-LOCA model and their probability distributions: one uniform input, four truncated log-normal LN(0, 0.76) inputs, one further truncated log-normal input, and one truncated normal input (the precise parameter values are given in Delage et al. (2018)).

Figure 10 presents the maximum and minimum values of our two estimators $\widehat{S}^+_{N,i\delta}$ and $\widehat{S}^-_{N,i\delta}$. We compute Fisher spheres with radii $\delta$ sampled uniformly; on each sphere, $K = 100$ perturbed densities are sampled. The OF-PLIs are finally estimated on the 1000-sized dataset. The stopping criterion of Section 4.3 gives a maximal admissible OF-PLI of 4%; this value is determined from the maximal admissible quantile such that there are at least $N_Y = 10$ sample points above it. Actually, one can see that the largest $\widehat{S}^+_{N,i\delta}$ is close to this maximal admissible value.

Figure 10: Bootstrap mean of the maximum and minimum of the OF-PLI $S_{i\delta}$ for the CATHARE2 code. The confidence intervals are not shown for the sake of clarity.

Studies previously conducted on the same application (Delage et al., 2018) lead to similar results concerning the most influential inputs on the quantile of the PCT: strong impact of variables 3 and 4, and weak influence of variables 1, 2 and 5. In comparison with these studies, based on the standard space transformation, our information-geometry perturbation methodology leads to a reduced evaluated influence of variable 7. In fact, as it is the only Gaussian distribution, the reverse transformation from the standard space to the physical one operates differently for this input than for the others. Finally, according to the values of $\widehat{S}^+_{N,3\delta}$ and $\widehat{S}^+_{N,7\delta}$, the variables 3 and 7 appear to be the most influential inputs on the quantile of the PCT. This behavior, which was not observed with the standard space transformation, is probably due to the fact that the standard space approach allows perturbing only one of the probability distribution parameters (for example the expected value). On the contrary, our estimator corresponds to the maximal quantile deviation over a whole set of equivalent perturbations.
This shows two main advantages of our newly developed methodology: it prevents the interpretation bias induced by the standard space transformation, and it allows for an exhaustive exploration of density perturbations at a given level $\delta$.

Based on the Fisher distance, we have defined an original methodology to perturb input probability distributions, in the particular case of mutually independent input random variables. The Fisher information is an intrinsic characteristic of a probability measure and, in particular, does not depend on a specific chosen parametric representation. This fundamental property makes it the proper mathematical tool to compare perturbations of different uncertain physical inputs of a computer model, but also of different parameters of the same input distribution. It is even possible to get rid of all references to a parametric sub-domain of the set of probability measures on $\mathcal{X}$, as a non-parametric extension of the Fisher distance is proposed by Holbrook et al. (2017). However, this last perspective is limited by practical issues, as it is supposed to rely on a finite-dimensional representation of the densities, for example by means of a projection onto an orthonormal basis of the probability space. This implies truncating the infinite sum of the projections of a given probability onto all elements of the basis. This approximation will then be poor for probabilities that are very different from those of the chosen basis. This fact shows that, in practice, it is not easy to eliminate the reference to a particular parametric model, even in a non-parametric framework.

Nevertheless, based on the PLI, our method provides useful information on the most influential uncertainties regarding the distributions of the input variables, the so-called "epistemic uncertainties".
This is crucial not only for making decisions concerning further research programs aiming at gaining better knowledge about these variables, but also for bringing strong supporting arguments to operators' safety demonstrations. Indeed, we argue that this methodology is adequate for uncertainty studies with poorly reliable input distribution identification, or when an improved level of robustness is demanded regarding the choice of the input distributions. In the target application (nuclear licensing), our aim is not only to exhibit safety margin values for the simulated accidents, but also to prove that the methodology as a whole does not induce any risk of underestimating these values. Hence we do not only look for a worst-case assessment method, but for a more global understanding of how a potential error on an input's distribution affects the output. In that perspective, a practical option to increase the conservatism of UQ studies is to replace one or several input distributions by penalized deterministic values or by penalized versions of the distributions themselves. This nevertheless implies justifying the choice of the variables for which this penalization is done (see, e.g., Larget and Gautier (2020)).

Further investigations are still to be completed, as this method increases the numerical complexity and the computational time compared to the previous method of Lemaître et al. (2015). Indeed, several Monte Carlo loops are needed to compute the maximal and minimal PLI over the Fisher spheres. There is ongoing work on improving the estimation of the maximum and the minimum of the PLI on a Fisher sphere. There are also known numerical issues with the reverse importance sampling strategy, as the likelihood ratio tends to explode, as well as the confidence intervals.
Moreover, the method consists in sampling trajectories over the Fisher sphere, but one could benefit from a more advanced strategy by optimizing the PLI directly over the sphere, for instance via gradient descent along this manifold. The crucial problem of probabilistic dependencies between inputs should also be explored, in order to extend our framework to the non-independent-input case; works in robustness analysis dealing with dependent inputs can be found for instance in Pesenti et al. (2019). Moreover, using a distance in a complex space such as the space of probability density functions, instead of a moment perturbation, makes our methodology harder to interpret from a physicist's perspective. Thus, it is crucial to clearly define the statistical interpretation of the Fisher distance, i.e. its link with statistical test theory. Last but not least, the numerical difficulties illustrated in Section 3.3 prevent us from having a complete degree of freedom on the value of $\delta$.

A Proof of Theorem 1
We study the consistency and asymptotic normality of specific M- and Z-estimators in order to establish the proof. We suppose this theory is known, so that the details can be kept to the bare minimum; further background can be found in Chapters 5.2 and 5.3 of Van der Vaart (2000). Given a sample $(X^{(n)})_{n \in \{1,\ldots,N\}}$, where $X$ is a $d$-dimensional random vector, we define
$$\eta = \frac{\alpha}{1 - \alpha},$$
$$m_\theta(x) = -(G(x) - \theta)\, \mathbb{1}(G(x) \leq \theta) + \eta\, (G(x) - \theta)\, \mathbb{1}(G(x) > \theta),$$
$$M_N(\theta_1, \theta_2) = \frac{1}{N} \sum_{n=1}^N \left[ m_{\theta_1}(X^{(n)}) + \frac{f_{i\delta}(X^{(n)}_i)}{f_i(X^{(n)}_i)}\, m_{\theta_2}(X^{(n)}) \right],$$
$$\hat{\theta}_N = \arg\max_{(\theta_1, \theta_2)} M_N(\theta_1, \theta_2). \qquad (29)$$
$\hat{\theta}_N$ is defined such that its two components correspond respectively to the estimators $\hat{q}^\alpha_N$ and $\hat{q}^\alpha_{N,i\delta}$ of the quantile and of the perturbed quantile. The map $\theta \mapsto \nabla_\theta M_N(\theta)$, with $\theta = (\theta_1, \theta_2)^T$, has two non-decreasing components (it is a sum of non-decreasing maps). Now, by definition of $\hat{\theta}_N$ and concavity of $M_N(\theta)$, it holds that $\nabla_\theta M_N(\hat{\theta}_N) = 0$. Furthermore, we have that
$$\nabla_\theta M_N(\theta) \xrightarrow{P} \big( (1+\eta) F_Y(\theta_1) - \eta,\; (1+\eta) F_{Y,i\delta}(\theta_2) - \eta \big)^T,$$
since $\bar{L}_N = \frac{1}{N} \sum_{n=1}^N \frac{f_{i\delta}(X^{(n)}_i)}{f_i(X^{(n)}_i)} \xrightarrow{P} \mathbb{E}\left[\frac{f_{i\delta}(X_i)}{f_i(X_i)}\right] = 1$, and this limit is a strictly increasing function. Therefore, the assumptions of Lemma 5.10 in (Van der Vaart, 2000, p. 47) are satisfied, proving the consistency of the estimator: $\hat{\theta}_N \xrightarrow{P} (q^\alpha, q^\alpha_{i\delta})^T$.

The asymptotic normality is studied via the map $\bar{m}_\theta(x) = m_{\theta_1}(x) + \frac{f_{i\delta}(x_i)}{f_i(x_i)}\, m_{\theta_2}(x)$, which is Lipschitz in the variable $\theta$ with Lipschitz constant $h(x) = \max(1, \eta) \left( 1 + \frac{f_{i\delta}(x_i)}{f_i(x_i)} \right)$. The function $h$ belongs to $L^2$ if $\mathbb{E}\left[\left(\frac{f_{i\delta}(X_i)}{f_i(X_i)}\right)^2\right] < +\infty$. The map $\bar{m}_\theta$ is also differentiable at $\theta_0 = \arg\max_\theta \mathbb{E}[\bar{m}_\theta(X)]$, with gradient
$$\nabla_\theta \bar{m}_\theta(x) = \left( (1+\eta)\, \mathbb{1}(G(x) \leq \theta_1) - \eta,\; \frac{f_{i\delta}(x_i)}{f_i(x_i)} \big( (1+\eta)\, \mathbb{1}(G(x) \leq \theta_2) - \eta \big) \right)^T. \qquad (30)$$
Moreover, the map $\theta \mapsto \mathbb{E}[\bar{m}_\theta(X)]$ admits the following Hessian:
$$V_{\theta} = \begin{pmatrix} (1+\eta) F'_Y(q^\alpha) & 0 \\ 0 & (1+\eta) F'_{Y,i\delta}(q^\alpha_{i\delta}) \end{pmatrix}, \qquad (31)$$
which is symmetric positive definite whenever $F'_Y(q^\alpha) > 0$ and $F'_{Y,i\delta}(q^\alpha_{i\delta}) > 0$. Hence, Theorem 5.23 in (Van der Vaart, 2000, p. 53) applies, which proves the asymptotic normality of the estimator $(\hat{q}^\alpha_N, \hat{q}^\alpha_{N,i\delta})^T$.

References
Abramowitz, M. and Stegun, I. (Editors) (1974), Handbook of Mathematical Functions, Dover Publications, Inc., New York.

Amari, S. (1985), Differential-Geometrical Methods in Statistics, Springer, New York.

Amari, S. and Nagaoka, H. (2000), Methods of Information Geometry, Oxford University Press.

Arnold, V. I. (1997), Mathematical Methods of Classical Mechanics (Graduate Texts in Mathematics, Vol. 60), Springer.

Baccou, J., Zhang, J., Fillion, P., Damblin, G., Petruzzi, A., Mendizábal, R., Reventós, F., Skorek, T., Couplet, M., Iooss, B., Oh, D., and Takeda, T. (2019), "Development of good practice guidance for quantification of thermal-hydraulic code model input uncertainty," Nuclear Engineering and Design, 354, 110173.

Bucalossi, A., Petruzzi, A., Kristof, M., and D'Auria, F. (2010), "Comparison between best-estimate-plus-uncertainty methods and conservative tools for nuclear power plant licensing," Nuclear Technology, 172, 29-47.

Chabridon, V., Balesdent, M., Bourinet, J.-M., Morio, J., and Gayton, N. (2018), "Reliability-based sensitivity estimators of rare event probability in the presence of distribution parameter uncertainty," Reliability Engineering & System Safety, 178, 164-178.

Cont, R., Deguest, R., and Scandolo, G. (2010), "Robustness and sensitivity analysis of risk measurement procedures," Quantitative Finance, 10, 593-606.

Costa, S. I. R., Santos, S. A., and Strapasson, J. E. (2012), "Fisher information distance: a geometrical reading," Discrete Applied Mathematics, 197, 59-69.

de Rocquigny, E., Devictor, N., and Tarantola, S. (Editors) (2008), Uncertainty in industrial practice, Wiley.

Delage, T., Sueur, R., and Iooss, B. (2018), "Robustness analysis of epistemic uncertainties propagation studies in LOCA assessment thermal-hydraulic model," in Proceedings of the ANS Best Estimate Plus Uncertainty International Conference (BEPU 2018), Lucca, Italy.

Efron, B. (1979), "Bootstrap Methods: Another Look at the Jackknife," The Annals of Statistics, 7, 1-26.

Eguchi, S. and Copas, J. (2006), "Interpreting Kullback-Leibler Divergence with the Neyman-Pearson Lemma," Journal of Multivariate Analysis, 97, 2034-2040.

Fang, K.-T., Li, R., and Sudjianto, A. (2006), Design and modeling for computer experiments, Chapman & Hall/CRC.

Geffraye, G., Antoni, O., Farvacque, M., Kadri, D., Lavialle, G., Rameau, B., and Ruby, A. (2011), "CATHARE2 V2.5_2: A single version for various applications," Nuclear Engineering and Design, 241, 4456-4463.

Gelfand, I. and Fomin, S. (2012), Calculus of Variations, Dover Publications.

Ghanem, R., Higdon, D., and Owhadi, H. (Editors) (2017), Springer Handbook on Uncertainty Quantification, Springer.

Hart, J. and Gremaud, P. (2019), "Robustness of the Sobol' indices to marginal distribution uncertainty," SIAM/ASA Journal on Uncertainty Quantification, 7, 1224-1244.

Hesterberg, T. (1996), "Estimates and confidence intervals for importance sampling sensitivity analysis," Mathematical and Computer Modelling, 23, 79-85.

Holbrook, A., Lan, S., Streets, J., and Shahbaba, B. (2017), "The nonparametric Fisher geometry and the chi-square process density prior," arXiv e-prints, arXiv:1707.03117.

Iooss, B. and Le Gratiet, L. (2019), "Uncertainty and sensitivity analysis of functional risk curves based on Gaussian processes," Reliability Engineering and System Safety, 187, 58-66.

Iooss, B. and Lemaître, P. (2015), "A review on global sensitivity analysis methods," in Meloni, C. and Dellino, G. (editors), Uncertainty management in Simulation-Optimization of Complex Systems: Algorithms and Applications, Springer.

Iooss, B. and Marrel, A. (2019), "Advanced methodology for uncertainty propagation in computer experiments with large number of inputs," Nuclear Technology, 205, 1588-1606.

Iooss, B., Vergès, V., and Larget, V. (2020), "BEPU robustness analysis via perturbed-law based sensitivity indices," in Accepted to the ANS Best Estimate Plus Uncertainty International Conference (BEPU 2020), Giardini Naxos, Italy, URL https://hal.archives-ouvertes.fr/hal-02864053.

Ishigami, T. and Homma, T. (1990), "An importance quantification technique in uncertainty analysis for computer models," in Proceedings of the First International Symposium on Uncertainty Modeling and Analysis, IEEE Computer Society Press.

Jin, R., Chen, W., and Sudjianto, A. (2005), "An efficient algorithm for constructing optimal design of computer experiments," Journal of Statistical Planning and Inference, 134, 268-287.

Larget, V. (2019), "How to bring conservatism to a BEPU analysis," in NURETH-18, Portland, USA.

Larget, V. and Gautier, M. (2020), "Increasing conservatism in BEPU IB LOCA safety studies using complementary and industrially cost effective statistical tools," in Accepted to the ANS Best Estimate Plus Uncertainty International Conference (BEPU 2020), Giardini Naxos, Italy.

Leimkuhler, B. and Reich, S. (2005), Simulating Hamiltonian Dynamics, Cambridge University Press.

Lemaître, P. (2014), Analyse de sensibilité en fiabilité des structures - Sensitivity analysis in structural reliability, Thèse de l'Université Bordeaux I.

Lemaître, P., Sergienko, E., Arnaud, A., Bousquet, N., Gamboa, F., and Iooss, B. (2015), "Density modification based reliability sensitivity analysis," Journal of Statistical Computation and Simulation, 85, 1200-1223.

Marrel, A. and Chabridon, V. (2020), "Statistical developments for target and conditional sensitivity analysis: application on safety studies for nuclear reactor," Preprint, https://hal.archives-ouvertes.fr/hal-02541142.

Mazgaj, P., Vacher, J.-L., and Carnevali, S. (2016), "Comparison of CATHARE results with the experimental results of cold leg intermediate break LOCA obtained during ROSA-2/LSTF test 7," EPJ Nuclear Sciences & Technology, 2.

Meynaoui, A., Marrel, A., and Laurent, B. (2019), "New statistical methodology for second level global sensitivity analysis," working paper or preprint, URL https://hal.archives-ouvertes.fr/hal-02019412.

Morio, J. and Balesdent, M. (2016), Estimation of rare event probabilities in complex aerospace and other systems, Woodhead Publishing.

Mousseau, V. and Williams, B. (2017), "Uncertainty quantification in a regulatory environment," in Ghanem, R., Higdon, D., and Owhadi, H. (editors), Springer Handbook on Uncertainty Quantification, Springer.

Newey, W. K. and McFadden, D. (1994), "Large sample estimation and hypothesis testing," in Handbook of Econometrics, volume 4, chapter 36, Elsevier, 2111-2245.

Nielsen, F. (2013), "Cramér-Rao Lower Bound and Information Geometry," in Bhatia, R., Rajan, C. S., and Singh, A. I. (editors), Connected at Infinity II: A Selection of Mathematics by Indians, Hindustan Book Agency, Gurgaon.

Owhadi, H., Scovel, C., Sullivan, T. J., McKerns, M., and Ortiz, M. (2013), "Optimal Uncertainty Quantification," SIAM Review, 55, 271-345.

Perrin, G. and Defaux, G. (2019), "Efficient estimation of reliability-oriented sensitivity indices," Journal of Scientific Computing, 80.

Pesenti, S., Millossovich, P., and Tsanakas, A. (2019), "Cascade Sensitivity Measures," Available at SSRN.

Prieur, C. and Tarantola, S. (2017), "Variance-Based Sensitivity Analysis: Theory and Estimation Algorithms," in Ghanem, R., Higdon, D., and Owhadi, H. (editors), Springer Handbook on Uncertainty Quantification, Springer.

Rao, C. (1945), "Information and the accuracy attainable in the estimation of statistical parameters," Bulletin of the Calcutta Mathematical Society, 37.

Saltelli, A. and Tarantola, S. (2002), "On the relative importance of input factors in mathematical models: Safety assessment for nuclear waste disposal," Journal of the American Statistical Association, 97, 702-709.

Sanchez-Saez, F., Sànchez, A., Villanueva, J., Carlos, S., and Martorell, S. (2018), "Uncertainty analysis of large break loss of coolant accident in a pressurized water reactor using non-parametric methods," Reliability Engineering and System Safety, 174, 19-28.

Smith, R. (2014), Uncertainty quantification, SIAM.

Sobol, I. (1993), "Sensitivity estimates for non linear mathematical models," Mathematical Modelling and Computational Experiments, 1, 407-414.

Sobol', I. (2001), "Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates," Mathematics and Computers in Simulation, 55, 271-280.

Stenger, J., Gamboa, F., Keller, M., and Iooss, B. (2019), "Optimal Uncertainty Quantification of a risk measurement from a thermal-hydraulic code using Canonical Moments," International Journal for Uncertainty Quantification, 10, 35-53.

Stillwell, J. (1997), Numbers and Geometry (Undergraduate Texts in Mathematics), Springer.

Stirzaker, D. (2003), Elementary Probability, Cambridge University Press.

Sueur, R., Bousquet, N., Iooss, B., and Bect, J. (2016), "Perturbed-Law based sensitivity indices for sensitivity analysis in structural reliability," in Proceedings of the 8th International Conference on Sensitivity Analysis of Model Output (SAMO 2016), Le Tampon, Réunion Island, France.

Sueur, R., Iooss, B., and Delage, T. (2017), "Sensitivity analysis using perturbed-law based indices for quantiles and application to an industrial case," in Proceedings of the 10th International Conference on Mathematical Methods in Reliability (MMR 2017), Grenoble, France.

Van der Vaart, A. W. (2000), Asymptotic Statistics, Cambridge University Press.