Developments and applications of Shapley effects to reliability-oriented sensitivity analysis with correlated inputs
Marouane Il Idrissi a , Vincent Chabridon a,b , Bertrand Iooss a,b,c,d a EDF Lab Chatou, 6 Quai Watier, 78401 Chatou, France b SINCLAIR AI Lab., Saclay, France c Institut de Mathématiques de Toulouse, 31062 Toulouse, France d Corresponding Author - Email: [email protected] - Phone: +33130877969
Abstract
Reliability-oriented sensitivity analysis methods have been developed for understanding the influence of model inputs relative to events characterizing the failure of a system (e.g., a threshold exceedance of the model output). In this field, target sensitivity analysis focuses primarily on capturing the influence of the inputs on the occurrence of such a critical event. This paper proposes new target sensitivity indices, based on the Shapley values and called "target Shapley effects", allowing for interpretable influence measures of each input in the case of dependence between the inputs. Two algorithms (a Monte Carlo sampling one, and a given-data algorithm) are proposed for the estimation of these target Shapley effects based on the ℓ2 norm. Additionally, the behavior of these target Shapley effects is theoretically and empirically studied through various toy-cases. Finally, applications on two realistic use-cases (a river flood model and a COVID-19 epidemiological model) are discussed.

Keywords: sensitivity analysis, reliability analysis, Sobol' indices, Shapley effects, input correlation
1. Introduction
Nowadays, numerical models are intensively used in all industrial and scientific disciplines to describe physical phenomena (e.g., systems of ordinary differential equations in ecosystem modeling, finite element models in structural mechanics, finite volume schemes in computational fluid dynamics) in order to design, analyze or optimize various processes and systems. In addition to this tremendous growth in computational modeling and simulation, the identification and treatment of the multiple sources of uncertainties has become an essential task from the early design stage to the whole system life cycle. As an example, such a task is crucial in the management of complex systems such as those encountered in energy exploration and production [1] and in sustainable resource development [2].

In addition, the emergence of global sensitivity analysis (GSA) of model outputs played a fundamental role in the development and enhancement of these numerical models (see, e.g., [3, 4] for recent reviews). Mathematically, if the model inputs (resp. output) are denoted by X (resp. Y) and the model is written G(·), such as

Y = G(X), (1)

GSA aims at understanding the behavior of Y with respect to (w.r.t.) X = (X_1, ..., X_d)^⊤, the vector of d inputs. GSA has been intensively used as a versatile tool to achieve various goals: for instance, quantifying the relative importance of inputs regarding their influence on the output (a.k.a. "ranking"), identifying the most influential inputs among a large number of inputs (a.k.a. "screening") or analyzing the input-output code behavior [5, 6].

When the complex systems are critical or need to be highly safe, numerical models can also be of great help for risk and reliability assessment [7]. Indeed, to track potential failures of a system (which could lead to dramatic environmental, human or financial consequences), numerical models allow to simulate its behavior far from its nominal one (see, e.g., [8] in flood hazard assessment).
Analytical or experimental approaches are often out of the question here. Based on these simulations, the tail behavior of the output distribution can be studied and typical risk measures can be estimated [9]. Among others, the probability that the output Y exceeds a given threshold value t ∈ R, given by P(Y > t) and often called a failure probability, is widely used in many applications. When {Y > t} is a rare event (i.e., associated with a very low failure probability), advanced sampling-based or approximation-based techniques [10] are required to properly estimate the failure probability. In this very specific context, dedicated sensitivity analysis methods have been developed, especially in the structural reliability community (see, e.g., [11, 12, 13]). In such a framework, called reliability-oriented sensitivity analysis (ROSA) [14, 15], the idea is to provide importance measures dedicated to the problem of rare event estimation.

To make it more formal, standard GSA methods mostly focus on quantities of interest (QoI) characterizing the central part of the output distribution (e.g., the variance for Sobol' indices [16], the entire distribution for moment-independent indices [17]), while ROSA methods focus on risk measures and their associated practical difficulties (e.g., costly to estimate, inducing a conditioning on the distributions, non-trivial interpretation of the indices). Following [18], ROSA methods can be analyzed regarding the type of study they consider, i.e., according to the following two categories:

• target sensitivity analysis (TSA) aims at catching the influence of the inputs (considering their entire input domain) on the occurrence of the failure event.
Basically, this implies considering the random variable defined by the indicator function 1_{G(X)>t} of the failure domain;

• conditional sensitivity analysis aims at studying the influence of the inputs on the conditional distribution of the output Y | {G(X) > t}, i.e., exclusively within the critical domain. By Eq. (1), a conditioning also appears on the inputs' domain.

Various indices have been proposed to tackle these two types of studies (see, e.g., [19, 13, 15, 20]). The present paper is dedicated to ROSA (under the assumption that the QoI is a failure probability) and focuses on a TSA study. However, a new issue for TSA is addressed here: the possible statistical dependence between the inputs.

Indeed, most of the common GSA methods (and it is similar for the ROSA ones) have been developed under the assumption of independent inputs. As an example, the well-known Sobol' indices [16], which rely on the so-called functional analysis of variance (ANOVA) and Hoeffding decomposition [21], can be directly interpreted as shares of the output variance that are due to each input and combination of inputs (called "interactions"), as long as the inputs are independent.

When the inputs are dependent, the inputs' correlations dramatically alter the interpretation of the Sobol' indices. To handle this issue, several approaches have been investigated in the literature. For instance, [22] proposed to estimate indices for groups of correlated inputs. However, this approach does not allow quantifying the influence of individual inputs. Amongst other similar works, [23, 24] proposed to extend the functional ANOVA decomposition to a more general one (e.g., taking the covariance into account). However, the indices obtained with these approaches can be negative, which limits their practical use due to interpretability issues.
Parallel to this, other works (see, e.g., [25, 26]) considered a Gram–Schmidt procedure to decorrelate the inputs and proposed to estimate two kinds of contributions for each variable (an uncorrelated one and a correlated one). These works finally resulted in the proposition of a set of four Sobol' indices (instead of the two standard ones, which are the first-order index and the total index in the independent case) which enable to fully capture the correlation effects in the GSA [27]. Despite this achievement, this approach remains difficult to implement in practice (see [28] for extensive studies). Finally, the VARS approach [29] (allowing a thorough analysis of the input-output relationships) can handle inputs' correlation but is out of the scope of the present work, which only focuses on variance-based sensitivity indices, directly computed from the numerical model.

Recently, another research track has been developed by considering another type of indices: the Shapley effects. The initial formulation originates from the "Shapley values" developed in the field of Game Theory [30, 31]. The underlying idea is to fairly distribute both gains and costs to multiple players working cooperatively. By analogy with the GSA framework, the inputs can be seen as the players, while the overall process can be seen as attributing shares of the output variability to the inputs. Considering the variance of the output in a GSA formulation leads to the so-called "Shapley effects" proposed by [32]. In the same vein, [33, 34, 28] bridge the gap between Sobol' indices and Shapley effects while illustrating the usefulness of these new indices to handle correlated inputs in the GSA framework.

Thus, the present work tries to extend the use of Shapley effects to the ROSA context. To sum up, the idea is to provide a ROSA index enabling to perform TSA (i.e., capturing the influence of the inputs on a risk measure, typically a failure probability here) under the constraint of dependent inputs.
Moreover, this work relies on the use of recent promising results and numerical tools (both in the field of TSA [35] and of Shapley effects' estimation [36]).

The outline of this paper is the following. Section 2 is devoted to a pedagogical introduction of the statistical dependence issues for variance-based sensitivity indices, which can be solved by Shapley effects. Section 3 presents a new formulation of TSA based on Shapley effects leading to target Shapley effects, while Section 4 develops two algorithms for their estimation. Section 5 provides illustrations on simple toy-cases which give analytical expressions of the target Shapley effects, allowing to deeply study their behavior. Section 6 applies these new sensitivity indices to two use-cases: a simplified model of a river flood and an epidemiological model related to the COVID-19 disease. Finally, Section 7 gives conclusions and research perspectives.

All along the paper, the mathematical notation E(·) (resp. V(·)) will represent the expectation (resp. variance) operator.

2. Variance-based sensitivity analysis with dependent inputs: the Shapley solution

While devoted to computer experiments, GSA has close connections with multivariate data analysis and statistical learning [37, 38]. Indeed, in all these topics, one important issue is often to provide a weight to some variables (the inputs) w.r.t. their impact on other variables (the outputs). Depending on the domain, such a weight can be either called a "sensitivity index" or an "importance measure". A very convenient way is to base these weights on the ANOVA (analysis of variance) decomposition [37, 16] of the output variance. Indeed, such a decomposition provides a natural sharing of the output variance in shares due to each input. The principle of the "variance-based sensitivity indices" [5] consists then in understanding how to separate the contribution of each X_i in the variance of Y.
However, due to potential statistical dependencies between inputs, this sharing cannot be directly performed. Starting from a simple case such as the linear model, chosen for pedagogical purposes, this section provides a reminder on this topic while illustrating the great interest of Shapley effects in practice.

2.1. The linear model case

In this section, the aim is to quantify the relative importance of d scalar inputs X_j (j = 1, ..., d) by fitting on a data sample (coming from the model Eq. (1)) a linear regression model so as to predict a scalar output Y:

Y(X) = Σ_{j=0}^d β_j X_j + ε, (2)

where X_0 = 1, β = (β_0, ..., β_d)^⊤ ∈ R^{d+1} is the effects vector and ε ∈ R the model's error, of variance σ². If a sample of inputs and outputs (X_n, Y_n) = (X_1^(i), ..., X_d^(i), Y^(i))_{i=1,...,n} is available (with n > d), the Ordinary Least Squares method (see, e.g., [37]) can easily be used to estimate the parameters β and σ² in the linear regression model in Eq. (2). Moreover, one obtains the predictor Ŷ(x*) of Y at any prediction point x*. An important validation metric of this model is the classical coefficient of determination given by:

R²_{Y(X)} = Σ_{i=1}^n [Ŷ(X^(i)) − Ȳ]² / Σ_{i=1}^n [Y^(i) − Ȳ]², (3)

where Ȳ is the output empirical mean. R²_{Y(X)} represents the percentage of output variability explained by the linear regression model of Y on X. Finally, from Eq. (2), the variance decomposition expresses as:

V(Y) = Σ_{j=1}^d β_j² V(X_j) + 2 Σ_{k>j} β_j β_k Cov(X_j, X_k) + σ². (4)

In the specific case of independent inputs, the covariance terms cancel and the standard ANOVA (i.e., V(Y) = Σ_j β_j² V(X_j) + σ²) is obtained. Then, global sensitivity indices, called Standardized Regression Coefficients (SRC), can be directly computed:

SRC_j = β_j √(V(X_j) / V(Y)). (5)

The estimation of the SRC is made by replacing the terms in Eq. (5) by their estimates.
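As a quick illustration of Eq. (5), the SRC can be estimated by ordinary least squares on simulated data. The sketch below (Python) uses an assumed toy model with two independent Gaussian inputs; the coefficients and sample size are illustrative only, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model: Y = 2*X1 + X2 + noise, independent standard Gaussian
# inputs, so V(Y) = 4 + 1 + 0.25 = 5.25 and the SRC are known in advance.
n = 10_000
X = rng.normal(size=(n, 2))
Y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Ordinary least squares with an intercept column (X0 = 1 in Eq. (2)).
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)

# SRC_j = beta_j * sqrt(V(X_j) / V(Y))  (Eq. (5)).
src = beta[1:] * np.sqrt(X.var(axis=0) / Y.var())
print(src)             # roughly [0.87, 0.44]
print((src ** 2).sum())  # close to the R^2 of the fit (about 0.95 here)
```

With independent inputs, the squared SRC sum up to (approximately) the coefficient of determination, as stated in the text.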
Interestingly, this metric for relative importance is signed (thanks to the regression coefficient sign), giving the sense of variation of the output w.r.t. each input. Moreover, SRC_j² represents a share of variance and the sum of all the SRC_j² approaches R² (i.e., the amount of variance explained by the linear model). Note that, in a perfect linear regression model (i.e., without any random error term ε), SRC_j is equal to the linear Pearson's correlation coefficient between X_j and Y (denoted by ρ(X_j, Y)). Note also that the ANOVA and SRC extend to the functional ANOVA and Sobol' indices in the general (non-linear model) case (see Appendix A).

When the inputs are dependent, the main concern is to allocate the covariance terms in Eq. (4) to the various inputs. In this case, the Partial Correlation Coefficient (PCC) has been promoted in GSA [39, 5] as a substitute to the SRC, in order to cancel the effects of the other inputs when allocating the weight of one input X_j in the variance of Y:

PCC_j = ρ(X_j − X̂_{−j}, Y − Ŷ_{−j}), (6)

where X_{−j} is the vector of all the d inputs except X_j, X̂_{−j} is the prediction of the linear model expressing X_j w.r.t. X_{−j}, and Ŷ_{−j} is the prediction of the linear model of Y w.r.t. X_{−j}. However, the PCC is not a proper sensitivity index of the input. Indeed, it consists in measuring the linear correlation between Y and X_j by fixing X_{−j}, and is then a measure of the linearity (and not the importance) between the output and one input.

Instead of controlling the other inputs X_{−j} as done in the PCC, the Semi-Partial Correlation Coefficient (SPCC) quantifies the proportion of the output variance explained by X_j after removing the information brought by X_{−j} (on X_j) [40]:

SPCC_j = ρ(X_j − X̂_{−j}, Y). (7)

The SPCC can also be expressed by using the relation SPCC_j² = R²_{Y(X)} − R²_{Y(X_{−j})}, which clearly shows that the SPCC gives the additional explanatory power of the input X_j in the linear regression model of Y on X.
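The PCC and SPCC of Eqs. (6)-(7) can both be obtained from the residuals of auxiliary regressions. A minimal sketch (Python, assumed toy model with two correlated Gaussian inputs; `ols_residuals` is a hypothetical helper, not from the paper):

```python
import numpy as np

def ols_residuals(Z, y):
    """Residuals of the OLS regression of y on the columns of Z (with intercept)."""
    A = np.column_stack([np.ones(len(y)), Z])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return y - A @ beta

rng = np.random.default_rng(0)
n = 50_000
# Assumed toy model: Gaussian inputs with correlation 0.7, Y = X1 + X2 + noise.
L = np.linalg.cholesky(np.array([[1.0, 0.7], [0.7, 1.0]]))
X = rng.normal(size=(n, 2)) @ L.T
Y = X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=n)

j, mj = 0, [1]                           # input X1 and the remaining inputs X_{-1}
rx = ols_residuals(X[:, mj], X[:, j])    # X_j - hat(X)_{-j}
ry = ols_residuals(X[:, mj], Y)          # Y   - hat(Y)_{-j}

pcc = np.corrcoef(rx, ry)[0, 1]          # Eq. (6)
spcc = np.corrcoef(rx, Y)[0, 1]          # Eq. (7)
```

For this toy model, one can check numerically that spcc² matches the R² increment R²_{Y(X)} − R²_{Y(X_{−j})} of the text, while pcc only measures the residual linearity.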
However, the SPCC of highly correlated inputs will be small, despite their "real" explanatory power on the output. This aspect seems to be the main drawback of the SPCC and probably explains its lack of popularity for GSA purposes.

In order to give an intuitive view of the multicollinearity issue (i.e., multiple linear regression with correlated inputs), we use Venn diagrams (see Figure 1), by considering two inputs X_1 and X_2 and one output Y. From Figure 1, the coefficient of determination can be written as:

R²_{Y(X_1,X_2)} = (a + b + c) / (a + b + c + σ²), (8)

where a + b + c + σ² is equal to the variance of Y and a + b + c represents the part of the variance explained by the regression model (with b = 0 in the uncorrelated case). In this elementary example, the previously introduced sensitivity indices are given by [41]:

SRC_1² = (a + b) / (a + b + c + σ²), SRC_2² = (c + b) / (a + b + c + σ²),
PCC_1² = a / (a + σ²), PCC_2² = c / (c + σ²),
SPCC_1² = a / (a + b + c + σ²), SPCC_2² = c / (a + b + c + σ²). (9)

Thus, one can understand the limitations of the SRC, PCC and SPCC when correlation is present: the variance share which comes from the correlation between inputs (i.e., the b value in Figure 1, right) is allocated twice with the SRC but not allocated at all with the SPCC, while the PCC does not represent any variance sharing.

Figure 1: Inspired from [41]. Illustration scheme of the effect of two inputs X_1 and X_2 on an output variable Y when they are: uncorrelated (left) or correlated (right).

The three problems above can be solved by using another sensitivity index which finds a way to partition the R² among the d inputs: the LMG [42, 43] (acronym based on the authors' names, i.e., "Lindeman - Merenda - Gold") uses sequential sums of squares from the linear model and obtains an overall measure by averaging over all orderings of inputs. Mathematically, let u be a subset of indices in the set of all subsets of {1, ..., d} and X_u = (X_j : j ∈ u) a group of inputs.
The LMG is based on the measure of the elementary contribution of any given variable X_j to a given subset model Y(X_u), by the increase in R² that results from adding that predictive variable to the regression model:

LMG_j = (1/d!) Σ_{π ∈ permutations of {1,...,d}} [ R²_{Y(X_{v∪{j}})} − R²_{Y(X_v)} ], (10)

with v the set of indices entered before j in the order π. In Eq. (10), the sum is performed over all the permutations of {1, ..., d}. For the case of two inputs (see Figure 1), one can easily show that:

LMG_1 = (a + b/2) / (a + b + c + σ²), LMG_2 = (c + b/2) / (a + b + c + σ²). (11)

Then, in the LMG framework, the R²_{Y(X_1,X_2)} has been perfectly shared into two parts, with an equitable distribution of the b term between X_1 and X_2.

This allocation principle exactly corresponds to the application of the Shapley values [30] on the linear model. This attribution method has been primarily used in cooperative game theory, allowing for a cooperative allocation of resources between players based on their collective production (see Appendix B for a more formal definition). The Shapley values solution consists in fairly distributing both gains and costs to several actors working in coalition. In situations where the contributions of each actor are unequal, it ensures that each actor gains as much as or more than they would have from acting independently. Now, if the actors are identified with a set of inputs and the value assigned to each coalition is identified with the explanatory power of the subset of model inputs composing the coalition, one obtains the LMG in Eq. (10).

2.2. Shapley effects
In the general case, with no hypothesis on the form of the model G(·) (Eq. (1)), variance-based sensitivity indices have been extensively developed [16, 5] and applied for GSA of complex models (see, e.g., [44]). Indeed, in the independent inputs' case, this framework allows a variance decomposition of the model output in different shares (called "Sobol' indices") induced by each input and each possible interaction between inputs in the model (see Appendix A for the theoretical details).

For the dependent inputs' case, ideas coming from game theory and Shapley values (as for the LMG in Eq. (10)) have recently been introduced [32] in order to measure input influence by:

Sh_j = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} [ val(A ∪ {j}) − val(A) ], (12)

where val(A) is the so-called value function (which is, somehow, a cost function) assigned to a subset A ∈ P_d of inputs, P_d is the set of all possible subsets of {1, ..., d}, {−j} denotes the set of indices {1, ..., d} \ {j} and |A| is the cardinality of A.

For GSA purposes, [32] proposes to use the so-called "closed Sobol' indices" as the value function:

val(A) = S^clos_A = V( E[ G(X) | X_A ] ) / V( G(X) ), (13)

where X_A denotes the subset of inputs selected by the indices in A (X_A = (X_i)_{i∈A}). The attribution properties of the Shapley values applied to this particular cost function lead to the definition of the Shapley effects:

Sh_j = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( S^clos_{A∪{j}} − S^clos_A ). (14)

These Shapley effects allow for a quantification of influence for each input, which intrinsically takes into account both interaction and dependence. Moreover, two important properties of the Shapley effects (Sh_j, j = 1, ..., d) allow for an easy interpretation: they sum up to one and are non-negative. They allow for input ranking, in terms of influence, by allocating to each input a percentage of the model output's variance.
These indices have been extensively studied in [33, 34]. An alternate way of defining the Shapley effects has been proposed, by taking the following cost function:

val(A) = E[ V( G(X) | X_Ā ) ] / V( G(X) ), (15)

where Ā = {1, ..., d} \ A, which leads to the equivalent definition of the Shapley effects in Eq. (14). This result allows for alternate estimation methods, as outlined in [45].

In order to illustrate the allocation system of the Shapley effects, one can first consider a model with three inputs X = (X_1, X_2, X_3)^⊤. From Eq. (14), one gets:

Sh_1 = (1/3) S^clos_1 + (1/6) [ (S^clos_{1,2} − S^clos_2) + (S^clos_{1,3} − S^clos_3) ] + (1/3) (S^clos_{1,2,3} − S^clos_{2,3}).

In the case where the three inputs are independent, one obtains:

Sh_1 = S_1 + (1/2) S_{1,2} + (1/2) S_{1,3} + (1/3) S_{1,2,3},

where one can notice that the Shapley values quantification (in the independent case) consists of the initial Sobol' index of the studied input, plus an equal share of the interaction effects between all the involved inputs. However, if dependence between inputs is assumed, this behavior cannot be clearly illustrated, except in the linear case (see Subsection 2.1).

The Shapley increment (S^clos_{A∪{j}} − S^clos_A) can be interpreted as being a quantification of the residual effects of the input j in relation to the subset of variables A. If S^clos_A is believed to contain the initial effects of A, plus their interaction effects, and any effects due to the dependence of the inputs in A, then the Shapley increment quantifies the initial effect of the input j, its interaction effect with A, and the effects due to its dependence with the inputs in A.
Then, the Shapley attribution system weights all these residual effects, in order to equally redistribute the interaction or dependence effects between the involved inputs, in the same fashion as the LMG in the linear model case, as depicted in Subsection 2.1.

It is important to note that the above-mentioned behavior of the Shapley effects cannot be verified for any type of dependence structure, due to the lack of a univocal functional decomposition in the case of dependent inputs. However, this behavior has been highlighted in analytical cases in [34] (mainly for Gaussian inputs with a linear correlation structure).
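To make the attribution mechanism concrete, the subset sums of Eqs. (12)-(14) can be coded directly. The sketch below (Python) uses an assumed toy example, a linear model with Gaussian inputs, for which the closed Sobol' indices of Eq. (13) admit a closed form; `shapley_effects` and `closed_sobol` are hypothetical helper names, not from the paper.

```python
import numpy as np
from itertools import combinations
from math import comb

def shapley_effects(val, d):
    """Shapley attribution (Eq. (12)) of a value function over subsets of inputs."""
    sh = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            for A in combinations(others, size):
                incr = val(frozenset(A) | {j}) - val(frozenset(A))
                sh[j] += incr / (d * comb(d - 1, size))
    return sh

def closed_sobol(beta, Sigma):
    """Closed Sobol' indices (Eq. (13)) for Y = beta^T X with X ~ N(0, Sigma)."""
    VY = beta @ Sigma @ beta
    def val(A):
        if not A:
            return 0.0
        A = sorted(A)
        Ac = [k for k in range(len(beta)) if k not in A]
        # For Gaussian inputs, E[Y | X_A] is linear in X_A.
        w = beta[A] + np.linalg.solve(Sigma[np.ix_(A, A)],
                                      Sigma[np.ix_(A, Ac)] @ beta[Ac])
        return w @ Sigma[np.ix_(A, A)] @ w / VY
    return val

beta = np.array([1.0, 1.0])
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])  # correlated inputs
sh = shapley_effects(closed_sobol(beta, Sigma), 2)
print(sh)  # symmetric model: [0.5, 0.5], summing to one
```

With equal coefficients the two effects are equal by symmetry; replacing beta by (2, 1) increases the first effect, but the pair still sums to one, illustrating the two interpretation properties stated above.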
3. Reliability-oriented Shapley effects for target sensitivity analysis
When focusing on complex systems, one often needs to prevent possible critical events, which have a low probability of occurring but are associated with a failure of the system. Such a failure might have possible dramatic consequences regarding various factors (e.g., human, environmental, financial). Such a task is covered by the fields of reliability assessment and risk analysis [7, 8]. Mathematically, such a problem implies to focus on a risk measure computed from the tail of the output distribution [9]. Performing sensitivity analysis in such a context requires dedicated tools, which have been gathered by various authors under the denomination of "reliability-oriented sensitivity analysis" (ROSA) (see, e.g., [15, 46, 20]). A large panel of ROSA methods has been proposed in the structural reliability community such as, for example, several variance-based approaches (see, e.g., [47, 13, 15, 48]) and moment-independent approaches (see, e.g., [49, 19, 46]). From the GSA community, several extensions have also been proposed to handle risk or reliability measures. Among others, one can mention, for instance, the contrast-based indices proposed by [50] as a versatile tool to handle several types of QoI and then studied by [51, 52] for quantile-oriented formulations, the quantile-based global sensitivity measures [53] or, finally, other indices related to dependence measures [18, 20].

In the context of reliability assessment, a typical risk measure is the failure probability given by:

p_t^Y := P(Y > t) = P(G(X) > t) = E[ 1_{G(X)>t}(X) ] = E[ 1_{F_t}(X) ], (16)

where t represents a certain scalar threshold value characterizing the state of the system. Typically, the event {Y > t} denotes the failure event. As for F_t, it represents the input failure domain, i.e., F_t := {x | G(x) > t}.
Thus, performing a ROSA study induces a few challenges: firstly, the variable of interest here is no longer Y, but can be, for instance, a binary event whose occurrence is characterized by the indicator function 1_{F_t}(X); secondly, this event is likely to be a "rare event" associated with a low failure probability which might be difficult to estimate in practice [10]; thirdly, the type of study one desires to perform has to be reinterpreted regarding the new QoI. Regarding this last point, [18] proposes to focus on two types of studies when dealing with critical events: the first one, called target sensitivity analysis (TSA), aims at catching the influence of the inputs on the occurrence of the failure event, while the second one, called "conditional sensitivity analysis", aims at studying the influence of the inputs once the threshold value has been reached (i.e., within the failure domain). The present paper is dedicated to ROSA (under the assumption that the QoI is a failure probability given by Eq. (16)) and focuses on a TSA study.

To illustrate this in plain text, one can use the example of the modeling of the water level in a river protected by a dyke. From the traditional GSA point of view, the central question would be "Which inputs influence the water level?", while in the TSA paradigm, one focuses more on the question "Which inputs influence the occurrence of a specific flood level?". Note that this particular example is more specifically studied in Subsection 6.1.

In the case of independent inputs, a first category of sensitivity indices dedicated to TSA is the "target Sobol' indices", whose first formulation has been proposed by [19]. Similarly, one uses here the closed Sobol' index (see Appendix A) as follows:

T-S_A = V( E[ 1_{F_t}(X) | X_A ] ) / V( 1_{F_t}(X) ), (17)

where V(1_{F_t}(X)) = p_t^Y (1 − p_t^Y). Several estimation schemes for these indices have been proposed in the rare event context [13, 15, 54].
To illustrate the behavior of this type of basic index, one can consider the additive model given by Y = X_1 + X_2 + X_3, with X = (X_1, X_2, X_3)^⊤ three independent standard Gaussian random variables. The left plot of Figure 2 represents the probability density function (pdf) of Y, with four different values for the threshold t (t = 0, 2, 4, 6), corresponding to four different failure probability levels. The right plot of Figure 2 presents the different values of T-S_A. Note that, for this specific example, the second-order indices verify T-S_{1,2} = T-S_{1,3} = T-S_{2,3}. Moreover, one can remark that the more t is restrictive or loose (i.e., the failure probability being "close" to 0 or 1), the more the third-order closed Sobol' index for TSA increases, indicating high interaction effects between the three inputs. Note that the understanding of this behavior falls under the conditional sensitivity analysis paradigm, which is out of the scope of this paper. However, the acknowledgment of this phenomenon remains important for better understanding the behavior of the new indices proposed further in the paper.
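For this additive Gaussian example, the conditional failure probability given X_1 is available in closed form, so the first-order index of Eq. (17) can be checked numerically. A minimal sketch (Python; `norm_sf` is a hand-rolled standard normal survival function, and the threshold value is one of the four used in Figure 2):

```python
import numpy as np
from math import erf, sqrt

def norm_sf(x):
    """Survival function of the standard normal distribution."""
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

rng = np.random.default_rng(0)
t = 2.0

# Failure probability p_t = P(X1 + X2 + X3 > t), with Y ~ N(0, 3).
p = norm_sf(t / sqrt(3.0))

# Conditional failure probability given X1 = x: P(X2 + X3 > t - x),
# with X2 + X3 ~ N(0, 2). The variance of this quantity over X1, divided
# by p(1 - p), is the first-order target Sobol' index of X1 (Eq. (17)).
x1 = rng.normal(size=50_000)
cond = np.array([norm_sf((t - x) / sqrt(2.0)) for x in x1])
ts1 = cond.var() / (p * (1.0 - p))
```

By symmetry the three first-order indices coincide, and since all indices are non-negative and sum to at most one, each first-order index is bounded by 1/3; the gap to 1 is carried by the interaction terms discussed above.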
Figure 2: Probability density function of the output with four different threshold values (left) and the related target Sobol' indices (right) for Y being the sum of three independent Gaussian random variables.

Another category of sensitivity indices dedicated to TSA gathers the "moment-independent" ROSA indices. Among others, one can mention the two indices proposed by [49], which are given by:

η_A = (1/2) E[ | p_t^Y − p_t^{Y|X_A} | ], (18a)
δ_A = (1/2) E[ ( p_t^Y − p_t^{Y|X_A} )² ], (18b)

where p_t^{Y|X_A} denotes the conditional failure probability when X_A is fixed. Note that, if the first index does not require any independence assumption for quantifying input influence, it is known to be difficult to estimate in practice [46]. As for the second one, it is simply proportional to the target Sobol' index given in Eq. (17). Note that an extension of the δ_A index has been proposed in [55] for correlated inputs. It relies on an orthogonalization strategy similar to the one proposed by [26]. However, as mentioned previously, this tends to increase the number of indices to estimate so as to properly interpret the results.

The following section develops the distance-based TSA indices, highlighting their links with existing TSA indices and proposing new TSA indices inspired from the Shapley framework presented in Subsection 2.2. As recalled by several authors [50, 18], one can notice that V(E[Y | X_A]) = E[ (E[Y | X_A] − E[Y])² ]. This equality can be interpreted as the expected squared distance between two expectations, and thus allows one to apprehend Sobol' indices (see Eq. (13)) as distance-based indices. Such a general point of view has been adopted by [50] to provide a generalization of the Sobol' indices using contrast functions. By applying a similar idea in the TSA paradigm, one can extend the standard T-S_A and η_A indices to more general cases using statistical distances.
Thus, one can define the distance-based TSA index, relative to a subset of inputs A ∈ P_d, as follows:

T-S^D_A = E[ D(p^Y_t, p^{Y|X_A}_t) ] / E[ D(p^Y_t, p^{Y|X}_t) ]    (19)

where D(·,·) denotes any distance function. Links can be made between these indices and the ones previously presented, through the use of specific distance functions. For example, by choosing the distance derived from the ℓ¹ norm (i.e., the absolute difference), one can find that the corresponding distance-based TSA index is proportional to the η_A index, that is:

T-S^{ℓ¹}_A = E[ |E[F_t(X)] − E[F_t(X) | X_A]| ] / E[ |F_t(X) − E[F_t(X)]| ] = ( 2 / E[ |F_t(X) − E[F_t(X)]| ] ) η_A.    (20)

Moreover, if using the distance derived from the ℓ² norm (i.e., the squared difference), one can remark that the related distance-based TSA index is equal to the closed Sobol' index for TSA, as defined in Eq. (17):

T-S^{ℓ²}_A = T-S_A = V( E[F_t(X) | X_A] ) / V( F_t(X) ).    (21)

This framework of distance-based TSA indices allows for quantifying the influence of inputs (or subsets of inputs), but the previously introduced indices are not suited to the case of dependent inputs. However, this framework produces relevant candidate cost functions for the Shapley setting, as presented in the following.

(D)-target Shapley effects

In this subsection, a novel family of TSA indices is proposed, namely the (D)-target Shapley effects. As mentioned above, these indices are built by taking the distance-based TSA indices defined in Eq. (19) as cost functions in a Shapley attribution procedure (see Eq. (12)). For a specific input j ∈ {1, ..., d}, its (D)-target Shapley effect can be defined as:

T-Sh^D_j = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( T-S^D_{A∪{j}} − T-S^D_A )    (22)

where {−j} = {1, ..., d} \ {j}. The main property allowing for a clear interpretation of the (D)-target Shapley effects is the following:

Property 1 ((D)-target Shapley effects decomposition). Let A ∈ P_d and val(A) = T-S^D_A. For any distance function D(·,·), the following property holds:

Σ_{j=1}^{d} T-Sh^D_j = 1.    (23)

It is important to note that this decomposition property does not rely on any independence assumption on the inputs of the numerical model. However, in order to ensure a meaningful interpretation of these indices (e.g., as a percentage of a meaningful statistic in terms of input importance), one needs to ensure that the T-Sh^D_j are non-negative for all j = 1, ..., d.

By choosing the cost function equal to T-S^{ℓ¹}_A (i.e., D(x, y) = |x − y|), one can define the (ℓ¹)-target Shapley effect associated to a variable j ∈ {1, ..., d} as:

T-Sh^{ℓ¹}_j = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( T-S^{ℓ¹}_{A∪{j}} − T-S^{ℓ¹}_A ).    (24)

These indices are non-negative (see the proof in Appendix C.1), which allows the (ℓ¹)-target Shapley effects to be interpreted as the percentage of the mean absolute deviation of the indicator function (i.e., E[ |F_t(X) − E[F_t(X)]| ]) that one can allocate to each input X_j, j ∈ {1, ..., d}.

By choosing T-S^{ℓ²}_A as the cost function in the Shapley framework (i.e., D(x, y) = (x − y)²), the (ℓ²)-target Shapley effect associated to the variable j ∈ {1, ..., d} can be defined as:

T-Sh^{ℓ²}_j = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( T-S^{ℓ²}_{A∪{j}} − T-S^{ℓ²}_A ).    (25)

Being also non-negative (see the proof in Appendix C.2), they can be interpreted as the percentage of the variance of the indicator function allocated to the input X_j, j ∈ {1, ..., d}. Moreover, using Eq. (21), by analogy with Eqs. (13) and (15), if one chooses to define the cost function as:

val(A) = T-E_A := E[ V(F_t(X) | X_Ā) ] / V( F_t(X) )    (26)

with Ā = {1, ..., d} \ A, then one has an equivalent way of defining the (ℓ²)-target Shapley effect. In the following, by analogy with T-S^{ℓ²}_A = T-S_A and to simplify the terminology and notations, the (ℓ²)-target Shapley effect T-Sh^{ℓ²}_j will be called the target Shapley effect and denoted by T-Sh_j:

T-Sh_j := T-Sh^{ℓ²}_j.    (27)
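The attribution formula of Eq. (22) is straightforward to implement for any cost function val(·). The following Python sketch (illustrative only; the paper's own implementation relies on R, and the cost values below are hypothetical) computes the Shapley attribution of a generic normalized cost function and checks the sum-to-one decomposition of Property 1 on a toy example with an interaction between two inputs.

```python
from itertools import combinations
from math import comb

def shapley_from_cost(val, d):
    """Shapley attribution (Eq. (22)) of a cost function val over subsets of {0,...,d-1}.

    If val(empty set) = 0 and val(full set) = 1, the returned effects sum to one
    (Property 1), whatever the dependence structure hidden behind val.
    """
    effects = []
    for j in range(d):
        others = [i for i in range(d) if i != j]
        phi = 0.0
        for size in range(d):
            for A in combinations(others, size):
                A = frozenset(A)
                # Shapley weight: (1/d) * binom(d-1, |A|)^(-1)
                w = 1.0 / (d * comb(d - 1, len(A)))
                phi += w * (val(A | {j}) - val(A))
        effects.append(phi)
    return effects

def toy_val(A):
    # Hypothetical normalized cost: additive shares 0.2, 0.3, 0.1 for inputs
    # 0, 1, 2, plus a 0.4 interaction active when both inputs 0 and 1 are in A.
    base = 0.2 * (0 in A) + 0.3 * (1 in A) + 0.1 * (2 in A)
    return base + 0.4 * ((0 in A) and (1 in A))

effects = shapley_from_cost(toy_val, 3)
print(effects)        # ≈ [0.4, 0.5, 0.1]: the 0.4 interaction is split equally
print(sum(effects))   # ≈ 1.0, as stated by Property 1
```

The allocation is symmetric in the shared term: the 0.4 interaction is split equally between inputs 0 and 1, which is precisely the behavior exploited by the (D)-target Shapley effects for interaction and dependence effects.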
4. Estimation methods and practical implementation of target Shapley effects
The estimation of the target Shapley effects of Eq. (25) can be split into two steps:

• Step 1: estimation of the conditional elements, i.e., the estimation of the T-S_A or T-E_A for all A ∈ P_d;
• Step 2: an aggregation procedure, i.e., the computation of the T-Sh_j by plugging in the estimates of Step 1.

This procedure, introduced in [45] for the estimation of Shapley effects, relies on a Monte Carlo estimation of the conditional elements. It requires the knowledge of, and the ability to sample from, the marginal distributions of the inputs, that is, P_{X_A} for every non-empty subset A ⊆ {1, ..., d}, as well as from all the conditional distributions P_{X_Ā | X_A}. Additionally, one also needs to be able to evaluate the model G(·), which is mostly the case in the context of uncertainty quantification of numerical computer models (ignoring the potential difficulties related to the cost of a single evaluation of G(·)) [1].

In order to estimate a conditional element T-S_A, one needs to draw several i.i.d. samples:

• an i.i.d. sample of size N drawn from P_X and denoted by (X^(1), ..., X^(N));
• another i.i.d. sample of size N_v drawn from P_{X_A} and denoted by (X_A^(1), ..., X_A^(N_v));
• for each element X_A^(i), i = 1, ..., N_v, a corresponding sample of size N_p drawn from P_{X_Ā | X_A} given that X_A = X_A^(i), denoted by (X̃_i^(1), ..., X̃_i^(N_p)).

Then, the Monte Carlo estimator of T-S_A can be defined as:

T-Ŝ_{A,MC} = [ Σ_{i=1}^{N_v} ( (1/N_p) Σ_{j=1}^{N_p} F_t(X̃_i^(j), X_A^(i)) − p̂^Y_t )² ] / [ (N_v − 1) p̂^Y_t (1 − p̂^Y_t) ]    (28)

with

p̂^Y_t = (1/N) Σ_{i=1}^{N} F_t(X^(i)).    (29)

Finally, the aggregation procedure gives:

T-Ŝh_{j,MC} = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( T-Ŝ_{A∪{j},MC} − T-Ŝ_{A,MC} ).    (30)

Thus, one gets that T-Ŝh_{j,MC} is a consistent estimator of T-Sh_j. Algorithm 1 provides a detailed description of how to implement this estimator in practice. This estimation method requires N + d! × (d − 1) × N_v × N_p calls to the numerical model G(·). As expected, this first estimation method can become quite expensive in practice. Moreover, numerical models usually encountered in industrial studies can be costly to evaluate, which can become a strong limitation for the use of this method.

Algorithm 1: Target Shapley effects estimation by a Monte Carlo procedure.
Input: G, t, d, N, N_v, N_p, simJoint, simMarginal, simConditional
Output: (T-Ŝh_{j,MC})_{j=1,...,d}
/* Sample from the joint distribution */
(X^(1), ..., X^(N)) ← simJoint(N)
/* Estimate the failure probability */
p̂^Y_t ← (1/N) Σ_{i=1}^{N} 1_{G(X^(i)) > t}
/* For every subset of inputs */
for A ∈ P_d do
    /* Sample from the marginal distribution */
    (X_A^(1), ..., X_A^(N_v)) ← simMarginal(A, N_v)
    for i = 1, ..., N_v do
        /* Sample from the conditional distribution given the element of the marginal sample */
        (X̃_i^(1), ..., X̃_i^(N_p)) ← simConditional(A, N_p, X_A^(i))
    /* Compute the conditional element */
    T-Ŝ_A ← [ Σ_{i=1}^{N_v} ( (1/N_p) Σ_{j=1}^{N_p} F_t(X̃_i^(j), X_A^(i)) − p̂^Y_t )² ] / [ (N_v − 1) p̂^Y_t (1 − p̂^Y_t) ]
/* Aggregation step */
for j = 1, ..., d do
    T-Ŝh_{j,MC} ← 0
    for A ⊆ {−j} do
        /* Apply the Shapley weights to every computed increment */
        T-Ŝh_{j,MC} += (1/d) binom(d−1, |A|)^{−1} ( T-Ŝ_{A∪{j}} − T-Ŝ_A )

Another algorithm has been proposed in [45], leveraging an equivalent definition of the Shapley attribution system as an arithmetic mean over all the d! permutations of {1, ..., d}. In the same fashion as in Eq. (10), it writes:

T-Ŝh_j = (1/m) Σ_{π ∈ Π_m} ( T-Ŝ_{v∪{j},MC} − T-Ŝ_{v,MC} )    (31)

with v being the set of indices preceding j in the order π, and Π_m a set of randomly chosen permutations of {1, ..., d}. In Eq. (31), the sum is not performed over all the permutations of {1, ..., d} but only over m randomly chosen ones. By sampling m < d! permutations, one can drive the computational cost of this algorithm down to N + m × (d − 1) × N_v × N_p calls to G(·), for a less precise, but still convergent, estimator.

A "given-data" estimation method has been introduced in [36] to get the Shapley effects. This method can be seen as an extension of the Monte Carlo estimator when only a single i.i.d. input-output sample is available. It is appropriate when the input distributions are not known or when the numerical model G(·) is no longer available.
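Before turning to this given-data approach, the sampling-based procedure of Eqs. (28)–(30) can be sketched as follows. This is a minimal Python illustration (not the paper's R implementation) on a toy case with d = 3 independent standard Gaussian inputs and Y = X_1 + X_2 + X_3; independence makes the conditional distributions P_{X_Ā|X_A} coincide with the marginals, so the conditional sampler reduces to marginal sampling. All names and sample sizes are illustrative.

```python
from itertools import combinations
from math import comb
import numpy as np

rng = np.random.default_rng(0)
d, t = 3, 1.0
G = lambda X: X.sum(axis=-1)                   # toy model: Y = X1 + X2 + X3
F = lambda X: (G(X) > t).astype(float)         # indicator of the critical event

N, Nv, Np = 10_000, 200, 200
p_hat = F(rng.standard_normal((N, d))).mean()  # failure probability, Eq. (29)

def ts_mc(A):
    """Monte Carlo estimator of T-S_A, Eq. (28); A is a tuple of input indices."""
    if len(A) == 0:
        return 0.0
    Abar = [i for i in range(d) if i not in A]
    cond_means = np.empty(Nv)
    for i in range(Nv):                        # outer loop: one draw of X_A...
        x = np.empty((Np, d))
        x[:, list(A)] = rng.standard_normal(len(A))
        x[:, Abar] = rng.standard_normal((Np, len(Abar)))  # ...Np conditional draws
        cond_means[i] = F(x).mean()
    return ((cond_means - p_hat) ** 2).sum() / ((Nv - 1) * p_hat * (1 - p_hat))

ts = {A: ts_mc(A) for k in range(d + 1) for A in combinations(range(d), k)}

tsh = []                                       # aggregation step, Eq. (30)
for j in range(d):
    phi = 0.0
    for k in range(d):
        for A in combinations([i for i in range(d) if i != j], k):
            Aj = tuple(sorted(A + (j,)))
            phi += (ts[Aj] - ts[A]) / (d * comb(d - 1, k))
    tsh.append(phi)
print(tsh)   # roughly (1/3, 1/3, 1/3) by symmetry, summing to about 1
```

With dependent inputs, only the conditional sampler changes; for Gaussian inputs it can rely on the conditioning formulas recalled in Section 5.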
The main idea behind this given-data method is to replace the exact samples from the conditional distributions P_{X_Ā | X_A} by approximate ones, based on a non-parametric nearest-neighbor procedure.

Let (X^(1), ..., X^(N)) be an i.i.d. sample of the inputs X and A ∈ P_d \ {∅, [1:d]}. Let k^N_A(l, n) be the index such that X_A^{(k^N_A(l,n))} is the n-th closest element to X_A^(l) in (X_A^(1), ..., X_A^(N)). Note that, if two observations are at an equal distance from X_A^(l), then one of the two is selected uniformly at random. Finally, one can define an estimator of the equivalent cost function defined in Eq. (26):

T-Ê_{A,KNN} = [ Σ_{l=1}^{N} (1/(N_s − 1)) Σ_{i=1}^{N_s} ( F_t(X^{(k^N_A(l,i))}) − (1/N_s) Σ_{h=1}^{N_s} F_t(X^{(k^N_A(l,h))}) )² ] / [ N p̂^Y_t (1 − p̂^Y_t) ]    (32)

where p̂^Y_t is the empirical mean of F_t(X) over the i.i.d. sample. Under some mild assumptions, [36] showed that this estimator asymptotically converges towards T-E_A. With these estimates of the conditional elements, one can then define the following plug-in estimator:

T-Ŝh_{j,KNN} = (1/d) Σ_{A ⊆ {−j}} binom(d−1, |A|)^{−1} ( T-Ê_{A∪{j},KNN} − T-Ê_{A,KNN} ).    (33)

Algorithm 2 describes the procedure for this given-data estimator. This method is less computationally expensive (in terms of model evaluations) than the Monte Carlo sampling-based method, since no model evaluation other than the ones in the i.i.d. sample is required to produce estimates of the target Shapley effects. Since the samples of the conditional and marginal distributions are approximated by a non-parametric procedure, this method also reduces the possible input modeling error (e.g., in the context of ill-defined input distributions), at the cost of less accurate estimates. Another constraint is due to the fact that the input-output sample has to be i.i.d.,
which prevents its use with, for instance, advanced designs of computer experiments.

In [36], a random-permutation algorithm, homologous to Eq. (31), has also been developed, which allows for reducing the overall complexity of the method; for the sake of conciseness, it is not developed in this paper.

The algorithms described in the preceding subsections have been implemented in the sensitivity
R package [56]. More precisely, the shapleyPermEx() (sampling-based algorithm) and sobolshap_knn() (given-data algorithm) functions can be directly used for the estimation of the target Shapley effects. In the applications of Section 6, only the sobolshap_knn() function will be used, for numerical tractability. Appendix D provides some minimal code examples for the implementation of the aforementioned algorithms, along with their random-permutation variants [36].

All further results can be accessed on a GitLab platform¹, along with the data used in the following sections. R code files are available, with explicit code and all custom-made functions, in order to reproduce the analyses presented in this paper. The procedures for the theoretical approximations of Section 5 are made available, along with the data-simulation functions for the flood case of Subsection 6.1. The two datasets used in Subsection 6.2 are also available. Finally, all the figures can be reproduced by simply re-running the different RMarkdown files in the aforementioned GitLab repository.

¹ https://gitlab.com/milidris/review l2tse

Algorithm 2: Target Shapley effects estimation by a nearest-neighbor procedure.
Input: X, Y, t
Output: (T-Ŝh_{j,KNN})_{j=1,...,d}
/* Estimate the failure probability from the observed outputs */
p̂^Y_t ← (1/N) Σ_{i=1}^{N} 1_{Y^(i) > t}
/* For every subset of inputs */
for A ∈ P_d do
    /* Extract the sample of X_A (the columns of X indexed by A) */
    X_A ← (X_i^(j))_{j ∈ A, i=1,...,N}
    for i = 1, ..., N do
        /* For each row i of X_A, find the N_s nearest rows in X_A */
        (X̃_i^{A,(j)})_{j=1,...,N_s} ← KNN(X_A^(i), X, N_s)
    /* Compute the conditional element */
    T-Ê_A ← [ Σ_{l=1}^{N} (1/(N_s − 1)) Σ_{i=1}^{N_s} ( F_t(X̃_l^{A,(i)}) − (1/N_s) Σ_{h=1}^{N_s} F_t(X̃_l^{A,(h)}) )² ] × ( N p̂^Y_t (1 − p̂^Y_t) )^{−1}
/* Aggregation step */
for j = 1, ..., d do
    T-Ŝh_{j,KNN} ← 0
    for A ⊆ {−j} do
        /* Apply the Shapley weights to every computed increment */
        T-Ŝh_{j,KNN} += (1/d) binom(d−1, |A|)^{−1} ( T-Ê_{A∪{j}} − T-Ê_A )
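Algorithm 2 can be condensed into a short Python sketch (brute-force neighbor search, numpy only). It is an illustration rather than a transcription of the R function sobolshap_knn(): it uses the equivalent cost function val(A) = 1 − Ê[V(F_t | X_A)]/V̂(F_t), which coincides with T-S_A by the law of total variance, and the final normalization mimics the "forced to sum to one" convention used in Section 6 for deterministic models.

```python
from itertools import combinations
from math import comb
import numpy as np

rng = np.random.default_rng(1)
d, Ns, t, N = 3, 10, 1.0, 2000
X = rng.standard_normal((N, d))      # single i.i.d. input sample...
Y = X.sum(axis=1)                    # ...with its observed outputs (toy model)
Ft = (Y > t).astype(float)
p_hat = Ft.mean()

def cost(A):
    """val(A) = 1 - E[V(F_t|X_A)] / V(F_t), the conditional variance being
    estimated with the Ns nearest neighbors in the X_A coordinates (Eq. (32))."""
    if len(A) == 0:
        return 0.0
    XA = X[:, list(A)]
    sq = (XA ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (XA @ XA.T)   # pairwise squared distances
    knn = np.argsort(d2, axis=1)[:, :Ns]                 # indices k_A^N(l, 1..Ns)
    cond_var = Ft[knn].var(axis=1, ddof=1)               # inner empirical variances
    return 1.0 - cond_var.sum() / (N * p_hat * (1 - p_hat))

val = {A: cost(A) for k in range(d + 1) for A in combinations(range(d), k)}

tsh = np.zeros(d)                                        # plug-in aggregation, Eq. (33)
for j in range(d):
    for k in range(d):
        for A in combinations([i for i in range(d) if i != j], k):
            Aj = tuple(sorted(A + (j,)))
            tsh[j] += (val[Aj] - val[A]) / (d * comb(d - 1, k))
tsh /= tsh.sum()      # deterministic model: forced to sum to one, as in Section 6
print(tsh)            # roughly (1/3, 1/3, 1/3) for this symmetric toy case
```

No model call is needed beyond the initial (X, Y) sample, which is the whole point of the given-data approach.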
5. Analytical results using a linear model with Gaussian inputs
To illustrate the behavior of the proposed indices, a first toy-case involving a linear model and multivariate Gaussian inputs is presented. The main advantage is that analytical results can be derived for the marginal distributions of all subsets of inputs, their conditional distributions, and the distribution of the output given a subset of inputs. Moreover, analytical formulas can be obtained for both the target Sobol' indices and the target Shapley effects.

Let (β_0, β) = (β_0, β_1, ..., β_d)^⊤ ∈ R^{d+1}, μ = (μ_1, ..., μ_d)^⊤ ∈ R^d and Σ ∈ M_d(R) a full-rank symmetric (d × d) square matrix. Assume that X ~ N_d(μ, Σ), and that the model output writes

Y = β_0 + β^⊤ X.    (34)

Then, one has Y ~ N( β_0 + β^⊤ μ, β^⊤ Σ β ) and, for any A ∈ P_d, (Y | X_A = x_A) ~ N( μ̃_A, Σ̃_A ) with

μ̃_A = β_0 + β_A^⊤ x_A + β_Ā^⊤ ( μ_Ā + Σ_{Ā,A} Σ_{A,A}^{−1} (x_A − μ_A) ),
Σ̃_A = β_Ā^⊤ ( Σ_{Ā,Ā} − Σ_{Ā,A} Σ_{A,A}^{−1} Σ_{A,Ā} ) β_Ā.

Moreover, one also needs to recall that

(X_Ā, X_A)^⊤ ~ N_d( (μ_Ā, μ_A)^⊤, Σ_A ),   with   Σ_A = [ Σ_{Ā,Ā}  Σ_{Ā,A} ; Σ_{A,Ā}  Σ_{A,A} ],

the blocks having sizes (d−|A|) × (d−|A|), (d−|A|) × |A|, |A| × (d−|A|) and |A| × |A|, respectively. These results lead to numerical approximations of the theoretical values of T-Sh_j for all j = 1, ..., d using standard multidimensional integration tools. Here, the function adaptIntegrate() from the cubature package of the R software has been used, with a fixed error tolerance set to 10^{−…}. This allows for studying basic scenarios in order to validate the behavior of the target Shapley effects.

In the following, the independent inputs' case is first studied w.r.t. the threshold t. Then, a case involving linear correlation between inputs (driven by a parameter ρ) is studied accordingly. Finally, a last case is considered, where an exogenous input is added: such a case appears as soon as an extra input is correlated to another one without being explicitly involved in the model G(·).
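The Gaussian conditioning formulas above translate directly into code. The following Python helper (an illustrative sketch with hypothetical names, numpy only) computes the parameters μ̃_A and Σ̃_A of (Y | X_A = x_A) through the Schur complement of the covariance block Σ_{A,A}.

```python
import numpy as np

def cond_output_params(beta0, beta, mu, Sigma, A, xA):
    """Mean and variance of (Y | X_A = xA) for Y = beta0 + beta^T X, X ~ N_d(mu, Sigma)."""
    d = len(mu)
    A = list(A)
    Ab = [i for i in range(d) if i not in A]          # complement subset Abar
    K = Sigma[np.ix_(Ab, A)] @ np.linalg.inv(Sigma[np.ix_(A, A)])
    mu_cond = mu[Ab] + K @ (xA - mu[A])               # E[X_Abar | X_A = xA]
    S_cond = Sigma[np.ix_(Ab, Ab)] - K @ Sigma[np.ix_(A, Ab)]   # Schur complement
    m = beta0 + beta[A] @ xA + beta[Ab] @ mu_cond     # mu_tilde_A
    v = beta[Ab] @ S_cond @ beta[Ab]                  # Sigma_tilde_A
    return m, v

# Independent standard Gaussian case: condition on X_1 = 2 for Y = X1 + X2 + X3
m, v = cond_output_params(0.0, np.ones(3), np.zeros(3), np.eye(3), [0], np.array([2.0]))
print(m, v)   # 2.0 and 2.0: the remaining variance comes from X2 + X3
```

Such a helper is what a Gaussian conditional sampler for Algorithm 1 would rely on, and it is also the building block of the numerical approximations of the theoretical target Shapley effects used below.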
The first toy-case can be specified by:

X = (X_1, X_2, X_3)^⊤ ~ N_3(0, I_3),   Y = Σ_{i=1}^{3} X_i.    (35)

In this scenario, the three inputs are equally important in terms of defining Y, but they should also be equally important for the variable of interest F_t(X), as assessed by the target Sobol' indices defined in Eq. (17).

From [19, 57], one can easily deduce that the first-order (FO) closed target Sobol' indices are all equal to each other. Thus, one has:

T-S^FO := T-S_1 = T-S_2 = T-S_3 = V( Φ((t − X)/√2) ) / V( F_t(X) ),    (36)

while the second-order (SO) closed target Sobol' indices are given by:

T-S^SO := T-S_{1,2} = T-S_{1,3} = T-S_{2,3} = V( Φ(t − X′) ) / V( F_t(X) )    (37)

where Φ(·) is the standard Gaussian cumulative distribution function (cdf), X ~ N(0, 1) and X′ ~ N(0, 2). The third-order (TO) closed index is:

T-S^TO := T-S_{1,2,3} = 1.    (38)

From Eqs. (36), (37) and (38), and from Property 1, one can deduce that:

T-Sh_1 = T-Sh_2 = T-Sh_3 = 1/3.    (39)

Additionally, as the inputs are independent, interpreting the non-closed (interaction-wise) target Sobol' indices is meaningful; denoting them with a subscript "int", they are equal to:

T-S^FO_int = T-S^FO    (40)
T-S^SO_int = T-S^SO − 2 T-S^FO_int    (41)
T-S^TO_int = T-S^TO − ( 3 T-S^FO_int + 3 T-S^SO_int ).    (42)

These results are illustrated in Figure 3. One can remark that studying the indicator variable F_t(X) instead of the model output Y leads to interaction effects between the inputs, while the target Shapley effects remain constant for all threshold values t. Such a behavior was expected. However, it highlights the fact that the target Shapley effects do not report the interaction effects as clearly as target Sobol' indices would (by considering all the various orders). In some sense, they summarize the information into a single index which focuses only on correlation effects.
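The closed-form expressions above can be cross-checked numerically. The sketch below (Python, illustrative; the quadrature in the paper is performed with the R package cubature) compares the closed form V(Φ((t − X)/√2)) of Eq. (36) with a direct nested Monte Carlo estimate of V(E[F_t(X) | X_1]).

```python
import numpy as np
from math import erf, sqrt

Phi = np.vectorize(lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0))))  # standard normal cdf

rng = np.random.default_rng(2)
t = 1.0

# Closed form of Eq. (36): V(Phi((t - X)/sqrt(2))) with X ~ N(0, 1)
x = rng.standard_normal(200_000)
v_closed = Phi((t - x) / np.sqrt(2.0)).var()

# Direct nested Monte Carlo estimate of V(E[F_t(X) | X_1])
x1 = rng.standard_normal(2_000)                        # outer draws of X_1
x23 = rng.standard_normal((2_000, 500, 2))             # inner draws of (X_2, X_3)
cond_mean = ((x1[:, None] + x23.sum(axis=2)) > t).mean(axis=1)
v_direct = cond_mean.var()

print(v_closed, v_direct)   # the two estimates agree up to Monte Carlo error
# By symmetry and Property 1, T-Sh_1 = T-Sh_2 = T-Sh_3 = 1/3 whatever t.
```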
Figure 3: Target Shapley effects (left) and target Sobol' indices (right) for the linear model with standard independent multivariate Gaussian inputs, w.r.t. t.

The behavior of the target Shapley effects is now studied when a linear dependence is added to the inputs. Since Property 1 still holds without any condition on the dependence structure of the input variables, these indices remain relevant while conveying a practical ease of interpretation. The following model is used:

X = (X_1, X_2, X_3)^⊤ ~ N_3(0, Σ_ρ),   Y = Σ_{i=1}^{3} X_i,    (43)

where Σ_ρ is the identity matrix except for a correlation coefficient ρ between X_2 and X_3, with −1 < ρ < 1. In this scenario, one has:

T-S_1 = V( Φ( (t − X) / √(2(1+ρ)) ) ) / V( F_t(X) ),    (44)
T-S_2 = T-S_3 = V( Φ( (t − (1+ρ)X) / √(2−ρ²) ) ) / V( F_t(X) ),    (45)
T-S_{1,2} = T-S_{1,3} = V( Φ( (t − X′) / √(1−ρ²) ) ) / V( F_t(X) ),    (46)
T-S_{2,3} = V( Φ(t − X′′) ) / V( F_t(X) )    (47)

where X ~ N(0, 1), X′ ~ N(0, 1 + (1+ρ)²) and X′′ ~ N(0, 2(1+ρ)).

From these results, one can directly remark that T-Sh_2 = T-Sh_3. Note that the values of the target Shapley effects can also be obtained by combinations of target Sobol' indices (see Eq. (25)). These results are illustrated in Figure 4. For fixed threshold values t, the target Shapley effects of the correlated inputs X_2 and X_3 increase when ρ increases. This is an expected behavior since, in this case:

V( F_t(X) ) = Φ( t / √(3+2ρ) ) ( 1 − Φ( t / √(3+2ρ) ) ),    (48)

and subsequently, for a fixed t, the variance of the variable of interest grows with ρ, as illustrated in Figure 5. This increase in variance due to the correlation between X_2 and X_3 is then allocated through T-Sh_2 and T-Sh_3, which increase with ρ. On the other hand, T-Sh_1 decreases accordingly, to accommodate Property 1.

In Figure 4, the behavior of the indices w.r.t. ρ can also be clearly seen, as T-Sh_1 is predominantly above T-Sh_2 and T-Sh_3 when ρ is negative, and below them when it is positive. This can be explained by the fact that X_2 and X_3 cancel each other out when their correlation is negative, thus lowering the value of T-S_{2,3} below T-S_{1,2} and T-S_{1,3}, automatically increasing T-Sh_1 in accordance with Property 1. On the other hand, for positive values of ρ, T-S_{2,3} is higher than T-S_{1,2} and T-S_{1,3}, which in turn corresponds to T-Sh_1 being lower than T-Sh_2 = T-Sh_3.

Figure 4: Evolution, w.r.t. ρ and for various threshold values, of the target Shapley effects of correlated Gaussian standard inputs in a linear model.

In this use-case, inspired by [57], the following model is considered:

X = (X_1, X_2, X_3, X_4)^⊤ ~ N_4(0, Σ_ρ),   Y = X_1 + 6 X_2 + 4 X_3,    (49)

where X_4 is an exogenous input, correlated (with correlation coefficient ρ) to X_2, which should be the most important variable in terms of variance; Σ_ρ is the identity matrix apart from this correlation. The threshold is fixed at t = 16. This scenario enables one to check that the target Shapley effects allow for quantifying the importance of X_4 through its correlation with an endogenous input, even though it does not appear directly in the model. Indeed, from the results given in Figure 6, one can remark that T-Sh_4 increases when ρ goes to either 1 or −1, despite the fact that X_4 is not used directly in the model G(·).
6. Applications
In this section, two models related to real phenomena, both including dependent random inputs, are studied in the context of TSA.
The target Shapley effects are first computed on a simplified model of a river flood [57, 6]. The aim of this model is to study the behavior of the river's water level, by comparison with a fixed dyke height. After a strong simplification of the one-dimensional Saint-Venant equation (with uniform and constant flow rate), the maximal annual water level h is modeled as

h = ( Q / ( B K_s √( (Z_m − Z_v) / L ) ) )^{0.6},    (50)

while the model output writes

Y = Z_v + h.    (51)

Figure 5: Variance of F_t(X) w.r.t. ρ and t for correlated Gaussian standard inputs with a linear model.

Figure 6: Target Shapley effects for the Gaussian linear model with an exogenous input, w.r.t. ρ.

The inputs' modeling and threshold value are described in Table 1. The problem is of dimension d = 6. In the present TSA study, the variable of interest is 1_{G(X) > t}(X), with t being the dyke height, fixed to t = 54.… m; the failure probability of reference (estimated with 10^… samples) is equal to p^Y_t = 4.… × 10^{−…}.

As in [24], three pairs of inputs are assumed to be linearly dependent: Q and K_s with ρ(Q, K_s) = 0.…, Z_v and Z_m with ρ(Z_v, Z_m) = 0.…, and L and B with ρ(L, B) = 0.3. The aim of this use-case is to assess the interest of the target Shapley effects in a complex environment. From [24], it can be shown that, from a GSA standpoint (using a generalized variance decomposition for dependent variables), the two most influential inputs on the annual water level are Q, the maximal annual flow rate, and Z_v, the river downstream level.

Table 1: Input variables and distributions for the flood model.

Input   Description                       Unit       Distribution
Q       maximal annual flow rate          m³·s⁻¹     Gumbel(1013, …), truncated
K_s     Strickler friction coefficient    –          Normal(30, 7) truncated to [15, +∞)
Z_v     river downstream level            m          Triangular(49, …, …)
Z_m     river upstream level              m          Triangular(54, …, …)
L       length of the river stretch       m          Triangular(4990, …, …)
B       river width                       m          Triangular(295, …, …)
t       dyke height (threshold)           m          fixed to 54.…

A Monte Carlo sample of N = 2 × 10^… input realizations is drawn (note that the correlations are injected following the algorithm proposed by [58]). This procedure leads to the computation of N model output values. Figure 7 represents the estimated target Shapley effects on the flood case, using the nearest-neighbor procedure depicted in Subsection 4.2, with an arbitrary number of neighbors set to N_s = 2. Moreover, 300 repetitions of the simulation and estimation procedure give access to confidence intervals on the estimates (represented by boxplots in Figure 7). From these TSA results, one can notice that Q is granted an influence of 24.3%, K_s of 22.6% and Z_v of around 16.7%, with non-negligible effects for Z_m, L and B. This was expected, due to the interactions induced by the considered TSA variable of interest. This example illustrates the ability of the target Shapley effects to quantify the relative importance of input variables in a complex model presenting statistical dependence between them.
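The simplified water-level model of Eqs. (50)–(51) is easy to transcribe. The Python sketch below (the paper's own scripts are in R, and the input values used here are illustrative central values, since some distribution parameters of Table 1 are truncated in the present text) evaluates the output at a nominal point.

```python
import numpy as np

def flood_model(Q, Ks, Zv, Zm, L, B):
    """Maximal annual water level h (Eq. (50)) and model output Y = Zv + h (Eq. (51))."""
    h = (Q / (B * Ks * np.sqrt((Zm - Zv) / L))) ** 0.6
    return Zv + h

# Nominal evaluation (illustrative values close to the distribution modes of Table 1)
Y = flood_model(Q=1013.0, Ks=30.0, Zv=50.0, Zm=55.0, L=5000.0, B=300.0)
print(Y)   # about 52.1 m, below a dyke height of about 54 m
```

Sampling the six inputs from the distributions of Table 1 (with the stated correlations) and thresholding Y at the dyke height directly provides the given-data (X, Y) sample used by the nearest-neighbor estimator.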
Figure 7: Estimated target Shapley effects for the flood case.

6.2. A COVID-19 epidemiological model

In 2020, the COVID-19 crisis raised major questions about the usefulness of epidemic modeling for providing insights to public policy decision makers. [59] have taken this example to insist on the essential use of GSA on such models, which claim to predict the potential consequences of intervention policies. A first study has been proposed by [60], in the context of COVID-19 in Italy, to assess the sensitivity of model outputs such as quarantined, recovered or dead people to the inputs driving intervention policies. Another GSA has been performed in [61] in the French context of the first COVID-19 wave. By using data coming from this last analysis (thanks to the authors' agreement), the goal of this section is to demonstrate how TSA can help characterize the influence of various uncertain parameters on a real-scale model.

The deterministic compartmental model developed in [61] (also presented in [62]) is representative of the COVID-19 French epidemic (from March to May), taking into account the asymptomatic individuals, the testing strategies, the hospitalized individuals, and people going to the Intensive Care Unit (ICU). Based on several assumptions, it consists of a system of 10 ordinary differential equations, which are not developed here for the sake of conciseness (see [62, 61] for more information).

Table 2 presents the 20 input parameters with their prior distributions (chosen from literature studies), which form the inputs X, assumed to be independent of each other. For the present study, our variable of interest, which is a particular model output, writes

U^p_max = max_{v ∈ time range} { U_v(X) }    (52)

where U_v is the number of hospitalized patients in ICU at time v. Note that the "p" in U^p_max stands for "prior", as this quantity corresponds to the variable of interest before any calibration w.r.t. the available data.

In [61], after a first screening step allowing to discard non-influential inputs, the model is calibrated on real data by using a Bayesian calibration technique. After this step, the selected remaining inputs are

X_sel = (p_a, N_a, N_s, R_0, t_0, μ, N, I^-_0)^⊤    (53)

and their distributions are obtained from a sample given by the calibration process. The non-influential inputs are fixed to their nominal values and the posterior variable of interest becomes

U_max = max_{v ∈ time range} { U_v(X_sel) }    (54)

with U_max being the maximum number of hospitalized people in ICU (who need special medical care) over the studied temporal range. The central question of this study is to determine which inputs influence the event of the country experiencing a shortage of ICU beds during the time period. For that purpose, one can
For that purpose, one canintroduce a threshold t , which represents the total number of ICU beds in the country, which is23nput Description Prior distribution p a Conditioned on being infected, the probability of having light symp-toms or no symptoms U (0 . , . p H Conditioned on being mild/severely ill, the probability of needing hos-pitalization ( H or U ) U (0 . , . p U Conditioned on going to hospital, the probability of needing ICU U (0 . , . p HD Conditioned on being hospitalized but not in ICU, the probability ofdying U (0 . , . p UD Conditioned on being admitted to ICU, the probability of dying U (0 . , . N a If asymptomatic, number of days until recovery U (8 , N s If symptomatic, number of days until recovery without hospital U (8 , N IH If severe symptomatic, number of days until hospitalization U (8 , N HD If in H , number of days until death U (15 , N UD If in ICU, number of days until death U (8 , N HR If hospitalized but not in ICU, the number of days until recovery U (15 , N UR If in ICU, number of days until recovery U (15 , R Basic reproducing number U (3 , . t Starting date of epidemics (in 2020) U (01 / , / µ Decaying rate for transmission (after social distanciation and lock-down) U (0 . , . N Date of effect of social distanciation and lockdown U (20 , λ Type-1 testing rate U (1 e − , e − p HU Conditioned on being hospitalized in H , the probability of needingICU U (0 . , . N HU If in H , number of days until ICU U (1 , I − Number of infected undetected at the start of epidemics U (1 , Table 2: Model inputs and their prior distribution. H is the number of hospitalized individuals with severe symptoms. U is the number of hospitalized individuals in ICU. assumed to be constant during the studied time period. The new variable of interest would then be { U Pmax >t } ( X ) for the full compartmental model (preliminary study) and { U max >t } ( X sel ) for the modelwith selected inputs (post-calibration study). 
Two input-output samples of size n = 5000 are available. The first one (preliminary study) includes all the inputs following their prior distributions (see Table 2) and the corresponding output U^p_max of the compartmental model. The second one (post-calibration study) is composed of a sample of X_sel after the Bayesian calibration and the corresponding output U_max of the compartmental model, with the non-selected inputs fixed to their nominal values.

Five different thresholds are studied on U^p_max, from 5 × 10^… up to 2 × 10^…, with the corresponding proportions of output samples in a failure state decreasing from 58.…% down to 2.2%. This allows one to illustrate the behavior of the target Shapley effects when the failure probability decreases. The threshold of 6300 has been chosen for U_max, with 10.9% of the output samples being above this threshold. Figure 8 illustrates the two different thresholds and the corresponding estimated failure probabilities on the histograms of both outputs.

The target Shapley effects have been estimated using a variant of the estimation scheme presented in Subsection 4.2, with a fixed number of random permutations and with a number of neighbors set to 3, following the rule-of-thumb guideline of [36], due to the sheer complexity of this estimation algorithm. Since the compartmental model is deterministic, the target Shapley effects have been forced to sum to one.

Figure 8: Illustration of the thresholds on the histograms of U^p_max (left; threshold 50000, failure proportion 22.04%) and U_max (right; threshold 6300, failure proportion 10.9%).

Figure 9 presents the main results for U^p_max, the red dotted line being the average influence of an input in the case of similar importance (i.e., 1/20). One can remark that, for less restrictive thresholds (i.e., thresholds for which the failure probability is high), the input N, the effective date of lockdown/social distancing measures, seems to be the most influential, reaching more than 50% of the TSA variable of interest's variance. However, as the threshold becomes more and more restrictive (i.e., as the failure probability becomes lower and lower), the effect of N decreases and the effects of the other inputs increase accordingly, reaching what seems to be an equilibrium at the value 1/20. This behavior can be explained by two main reasons:

• as outlined in Subsection 3.1, the nature of a restrictive TSA variable of interest induces high interaction between the inputs;
• the Shapley allocation system, when applied to the variance as a production value, redistributes the interaction effects equally between all inputs (there is no correlation between inputs in this prior study).

One can argue that, as soon as t becomes very restrictive, the combined interaction effects outweigh the effect of N itself, and since these effects are equitably distributed among all the inputs, their shares tend to go towards 1/20.

Figure 9: Target Shapley effects for 1_{U^p_max > t}(X) for different thresholds.

For the post-calibration study, some selected inputs in X_sel are linearly correlated (see Figure 10, top left). This is typically the case for N and μ, with an estimated correlation coefficient ρ̂(N, μ) = 0.…, and for R_0 and N, with an estimated correlation coefficient ρ̂(N, R_0) = −0.66. This correlation structure does not allow one to get interpretable Sobol' indices, as outlined in Section 2, which motivates the use of Shapley-inspired indices. The Shapley effects and the target Shapley effects of X_sel for U_max have been computed using the nearest-neighbor procedure, with a fixed number of neighbors of 3, and forced to sum to one because of the deterministic nature of the model.

One can remark that N_a, the number of days until recovery, seems to be the most important input in explaining the maximum number of ICU patients over the studied time range, with a Shapley effect of around 35% of the output variance. The inputs p_a, N_s, R_0 and N seem to present average effects, while t_0, μ and I^-_0 seem to be less influential, with around 5% of explained variance each.

However, focusing on the event of an ICU bed shortage, one can remark that the target Shapley effect of N_a is lower (around 22%), while the influence of N is higher (around 15%) than their respective Shapley effects. Moreover, t_0, μ and I^-_0 present higher TSA effects, i.e., slightly under 10%, due to the interactions induced by the indicator function. One can also remark that the influence of N_s is higher than the one of R_0 in the TSA setting, which was the other way around for the Shapley effects. This would indicate that N_s, the number of days until recovery for a symptomatic patient without hospitalization, influences the event of a bed shortage more than the basic reproducing number of the virus, R_0.
Figure 10: Input correlation matrix (top left), Shapley effects for U_max (top right) and target Shapley effects (bottom) for 1_{U_max > t}(X_sel).

7. Conclusion

This paper proposes a set of new indices adapted to target sensitivity analysis while being able to handle correlated inputs. The goal is to quantify the importance of the inputs on the occurrence of a critical failure event of the system under study. The proposed indices are based on a cooperative Shapley procedure, which allocates the effects of interaction and correlation equally between all the inputs, in the same manner as the Shapley effects do in global sensitivity analysis. Thus, a general class of distance-based indices is proposed, namely the (D)-target Shapley effects, and some relevant properties are highlighted. Depending on the choice of the distance D, well-known preexisting indices can be used as cost functions in the Shapley formulation. Therefore, these indices allow for the allocation, among the different inputs, of shares of several dispersion statistics (e.g., mean absolute deviation for the ℓ¹ case, variance for the ℓ² case). This produces easily usable indices in practice, as they can directly be interpreted as percentages of the dispersion statistic allocated to each input. This versatile procedure produces input importance measures according to a specific metric, driven by the choice of the distance.

In particular, the (ℓ²)-target Shapley effects (called target Shapley effects for simplicity), which represent percentages of variance, have been studied more intensively and two dedicated estimation methods have been proposed. These particular indices have then been applied, analyzed and discussed through simple Gaussian toy-cases.
Finally, two real-world use-cases have been studied: the modeling of a river flood and the problem of ICU bed shortage during the COVID-19 pandemic. The indices prove able to detect influential inputs in the presence of correlated inputs. For target sensitivity analysis, such a tool is valuable and can be used as a complement to more standard procedures. The clear advantage is that only one set of indices is required by an analyst in order to produce easily interpretable and meaningful insights. Moreover, the proposed indices can be estimated in a given-data context, which makes them suitable for real applications for which no computer model is available. However, the major drawbacks of the approach are mainly related to the target aspect of the analysis. Indeed, as the event gets rarer, all the inputs tend to be influential, and making a clear distinction between interaction and correlation effects becomes difficult.

A first perspective could be to improve the estimation strategies. The sampling-based method could benefit from a better sampling scheme, such as importance sampling, as described in [63], which could reduce the estimator's variance. Recent results from [64] using copulas also show promising tracks towards an efficient estimation of Shapley effects. Moreover, adapting recent results from [35], which link the target Sobol' indices and the squared mutual information [65], should allow for other given-data estimation possibilities. Another method, based on a random-forest given-data procedure, explored in [66] in the context of quantile-oriented importance measure estimation, could also yield promising results if transposed to a reliability-oriented setting.

Finally, even if the Shapley attribution system allows one to deal with input statistical dependencies, a finer decomposition is lacking in order to quantify the origin of each effect (e.g., statistical dependence and interaction).
Further works should use the recent developments in [67] in order to quantify interaction effects, with a transposition to the target sensitivity analysis setting. Finally, it has been shown in [68] that the Shapley attribution system is equivalent to a maximum entropy distribution (i.e., uniform) over all possible orderings of inputs (the Shapley weights). Developments towards other forms of allocation systems, where the allocation is driven by data, could also open a path for further developments.

Acknowledgments
We are grateful to Sébastien Da Veiga, Clémentine Prieur and Fabrice Gamboa for helpful discussions and for having provided the dataset of the COVID-19 model.
References

[1] E. De Rocquigny, N. Devictor, S. Tarantola, Uncertainty in industrial practice: a guide to quantitative uncertainty management, Wiley, 2008.
[2] K. Beven, Environmental Modelling: An Uncertain Future?, CRC Press, 2008.
[3] F. Pianosi, K. Beven, J. Freer, J. Hall, J. Rougier, D. Stephenson, T. Wagener, Sensitivity analysis of environmental models: A systematic review with practical workflow, Environmental Modelling & Software 79 (2016) 214–232.
[4] S. Razavi, A. Jakeman, A. Saltelli, C. Prieur, B. Iooss, E. Borgonovo, E. Plischke, S. Lo Piano, T. Iwanaga, W. Becker, S. Tarantola, J. Guillaume, J. Jakeman, H. Gupta, N. Melillo, G. Rabiti, V. Chabridon, Q. Duan, X. Sun, S. Smith, R. Sheikholeslami, N. Hosseini, M. Asadzadeh, A. Puy, S. Kucherenko, H. Maier, The future of sensitivity analysis: An essential discipline for systems modelling and policy making, Environmental Modelling and Software, In press (104954) (2020).
[5] A. Saltelli, M. Ratto, T. Andres, F. Campolongo, J. Cariboni, D. Gatelli, M. Saisana, S. Tarantola, Global Sensitivity Analysis. The Primer, Wiley, 2008.
[6] B. Iooss, P. Lemaître, A review on global sensitivity analysis methods, in: C. Meloni, G. Dellino (Eds.), Uncertainty management in Simulation-Optimization of Complex Systems: Algorithms and Applications, Springer, 2015, pp. 101–122.
[7] M. Lemaire, A. Chateauneuf, J.-C. Mitteau, Structural Reliability, ISTE Ltd & John Wiley & Sons, Inc., 2009.
[8] Y. Richet, V. Bacchi, Inversion algorithm for civil flood defense optimization: Application to two-dimensional numerical model of the Garonne river in France, Frontiers in Environmental Science 7 (2019) 160.
[9] R. T. Rockafellar, J. O. Royset, Engineering Decisions under Risk Averseness, ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering 1 (2) (2015) 1–12.
[10] J. Morio, M. Balesdent, Estimation of Rare Event Probabilities in Complex Aerospace and Other Systems: A Practical Approach, Woodhead Publishing, Elsevier, 2015.
[11] Y.-T. Wu, Computational Methods for Efficient Structural Reliability and Reliability Sensitivity Analysis, AIAA Journal 32 (8) (1994) 1717–1723.
[12] S. Song, Z. Lu, H. Qiao, Subset simulation for structural reliability sensitivity analysis, Reliability Engineering and System Safety 94 (2) (2009) 658–665.
[13] P. Wei, Z. Lu, W. Hao, J. Feng, B. Wang, Efficient sampling methods for global reliability sensitivity analysis, Computer Physics Communications 183 (2012) 1728–1743.
[14] V. Chabridon, Reliability-oriented sensitivity analysis under probabilistic model uncertainty – Application to aerospace systems, Ph.D. thesis, Université Clermont Auvergne (2018).
[15] G. Perrin, G. Defaux, Efficient Evaluation of Reliability-Oriented Sensitivity Indices, Journal of Scientific Computing (2019) 1–23.
[16] I. M. Sobol, Sensitivity estimates for nonlinear mathematical models, Mathematical Modelling and Computational Experiments 1 (1993) 407–414.
[17] E. Borgonovo, A new uncertainty importance measure, Reliability Engineering & System Safety 92 (6) (2007) 771–784.
[18] H. Raguet, A. Marrel, Target and conditional sensitivity analysis with emphasis on dependence measures, Working paper (2018, URL https://arxiv.org/abs/1801.10047).
[19] L. Li, Z. Lu, F. Jun, W. Bintuan, Moment-independent importance measure of basic variable and its state dependent parameter solution, Structural Safety 38 (2012) 40–47.
[20] A. Marrel, V. Chabridon, Statistical developments for target and conditional sensitivity analysis: application on safety studies for nuclear reactor, Preprint HAL, hal-02541142v2 (2020, URL https://hal.archives-ouvertes.fr/hal-02541142v2).
[21] W. Hoeffding, A class of statistics with asymptotically normal distribution, The Annals of Mathematical Statistics 19 (3) (1948) 293–325.
[22] J. Jacques, C. Lavergne, N. Devictor, Sensitivity analysis in presence of model uncertainty and correlated inputs, Reliability Engineering & System Safety 91 (2006) 1126–1134.
[23] G. Li, H. Rabitz, P. Yelvington, O. Oluwole, F. Bacon, C. Kolb, J. Schoendorf, Global sensitivity analysis for systems with independent and/or correlated inputs, Journal of Physical Chemistry 114 (2010) 6022–6032.
[24] G. Chastaing, F. Gamboa, C. Prieur, Generalized Hoeffding-Sobol decomposition for dependent variables - Application to sensitivity analysis, Electronic Journal of Statistics 6 (2012) 2420–2448.
[25] C. Xu, G. Z. Gertner, Uncertainty and sensitivity analysis for models with correlated parameters, Reliability Engineering & System Safety 93 (2008) 1563–1573.
[26] T. Mara, S. Tarantola, Variance-based sensitivity indices for models with dependent inputs, Reliability Engineering & System Safety 107 (2012) 115–121.
[27] T. Mara, S. Tarantola, P. Annoni, Non-parametric methods for global sensitivity analysis of model output with dependent inputs, Environmental Modeling & Software 72 (2015) 173–183.
[28] N. Benoumechiara, K. Elie-Dit-Cosaque, Shapley effects for sensitivity analysis with dependent inputs: bootstrap and kriging-based algorithms, ESAIM: Proceedings and Surveys 65 (2019) 266–293.
[29] N. Do, S. Razavi, Correlation effects? A major but often neglected component in sensitivity and uncertainty analysis, Water Resources Research 56 (e2019WR025436) (2020).
[30] L. S. Shapley, A value for n-person games, in: H. Kuhn, A. W. Tucker (Eds.), Contributions to the Theory of Games, Volume II, Annals of Mathematics Studies, Princeton University Press, Princeton, NJ, 1953, Ch. 17, pp. 307–317.
[31] M. Osborne, A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
[32] A. B. Owen, Sobol' indices and Shapley value, SIAM/ASA Journal of Uncertainty Quantification 2 (2014) 245–251.
[33] A. B. Owen, C. Prieur, On Shapley value for measuring importance of dependent inputs, SIAM/ASA Journal of Uncertainty Quantification 5 (2017) 986–1002.
[34] B. Iooss, C. Prieur, Shapley effects for Sensitivity Analysis with correlated inputs: Comparisons with Sobol' Indices, Numerical Estimation and Applications, International Journal for Uncertainty Quantification 9 (5) (2019) 493–514.
[35] A. Spagnol, Kernel-based sensitivity indices for high-dimensional optimization problems, Ph.D. thesis, Ecole des Mines de Saint-Etienne (2020).
[36] B. Broto, F. Bachoc, M. Depecker, Variance reduction for estimation of Shapley effects and adaptation to unknown input distribution, SIAM/ASA Journal on Uncertainty Quantification 8 (2) (2020) 693–716.
[37] R. Christensen, Linear models for multivariate, time series and spatial data, Springer-Verlag, 1990.
[38] T. Hastie, R. Tibshirani, J. Friedman, The elements of statistical learning, Springer, 2002.
[39] J. Helton, J. Johnson, C. Salaberry, C. Storlie, Survey of sampling-based methods for uncertainty and sensitivity analysis, Reliability Engineering & System Safety 91 (2006) 1175–1209.
[40] J. Johnson, J. LeBreton, History and use of relative importance indices in organizational research, Organizational Research Methods 7 (2004) 238–257.
[41] L. Clouvel, Uncertainty quantification of the fast flux calculation for a PWR vessel, Ph.D. thesis, Université Paris-Saclay (2019).
[42] R. H. Lindeman, P. F. Merenda, R. Z. Gold, Introduction to bivariate and multivariate analysis, Scott Foresman and Company, Glenview, IL, 1980.
[43] U. Grömping, Relative importance for linear regression in R: the Package relaimpo, Journal of Statistical Software 17 (2006) 1–27.
[44] J. Nossent, P. Elsen, W. Bauwens, Sobol' sensitivity analysis of a complex environmental model, Environmental Modelling & Software 26 (12) (2011) 1515–1525.
[45] E. Song, B. Nelson, J. Staum, Shapley effects for global sensitivity analysis: Theory and computation, SIAM/ASA Journal on Uncertainty Quantification 4 (1) (2016) 1060–1083.
[46] P. Derennes, J. Morio, F. Simatos, Simultaneous estimation of complementary moment independent and reliability-oriented sensitivity measures, Mathematics and Computers in Simulation 182 (2021) 721–737.
[47] J. Morio, Extreme quantile estimation with nonparametric adaptive importance sampling, Simulation Modelling Practice and Theory 27 (2012) 76–89.
[48] V. Chabridon, M. Balesdent, G. Perrin, J. Morio, J.-M. Bourinet, N. Gayton, Mechanical Engineering Under Uncertainties, Wiley - ISTE Ltd, 2020, Ch. 'Global reliability-oriented sensitivity analysis under distribution parameter uncertainty', pp. 1–43.
[49] L. Cui, Z. Lu, X. Zhao, Moment-independent importance measure of basic random variable and its probability density evolution solution, Science China Technical Sciences 53 (10) (2010) 1138–1145.
[50] J.-C. Fort, T. Klein, N. Rachdi, New sensitivity analysis subordinated to a contrast, Communications in Statistics - Theory and Methods 45 (15) (2016) 4349–4364.
[51] T. Browne, J.-C. Fort, B. Iooss, L. Le Gratiet, Estimate of quantile-oriented sensitivity indices, HAL, hal-01450891, version 1 (2017).
[52] V. Maume-Deschamps, I. Niang, Estimation of quantile oriented sensitivity indices, Statistics and Probability Letters 134 (2018) 122–127.
[53] S. Kucherenko, S. Song, L. Wang, Quantile based global sensitivity measures, Reliability Engineering and System Safety 185 (2019) 35–48.
[54] L. Li, I. Papaioannou, D. Straub, Global reliability sensitivity estimation based on failure samples, Structural Safety 81 (2019) 101871.
[55] L. Li, Z. Lu, C. Chen, Moment-independent importance measure of correlated input variable and its state dependent parameter solution, Aerospace Science and Technology 48 (2016) 281–290.
[56] B. Iooss, S. Da Veiga, A. Janon, G. Pujol, sensitivity: Global Sensitivity Analysis of Model Outputs, R package version 1.23.1 (2020). URL https://CRAN.R-project.org/package=sensitivity
[57] P. Lemaitre, Analyse de sensibilité en fiabilité des structures (in English), Ph.D. thesis, Université de Bordeaux (2014).
[58] E. Schumann, Generating correlated uniform variates (2009). URL http://comisef.wikidot.com/tutorial:correlateduniformvariates
[59] A. Saltelli, G. Bammer, I. Bruno, E. Charters, M. D. Fiore, et al., Five ways to ensure that models serve society: a manifesto (short comments), Nature 582 (2020) 482–484.
[60] X. Lu, E. Borgonovo, Is time to intervention in the COVID-19 outbreak really important? A global sensitivity analysis approach, Preprint (2020, arXiv:2005.01833).
[61] S. Da Veiga, F. Gamboa, C. Prieur, B. Iooss, Basics and trends in sensitivity analysis, SIAM, In revision, 2021.
[62] S. Da Veiga, Calibration and sensitivity analysis of a COVID-19 epidemics model, Meeting AppliBUGS (Applications du Bayesian Unified Group of Statisticians), December 2020. URL http://genome.jouy.inra.fr/applibugs/applibugs.rencontres.html

Appendix A. ANOVA and Sobol' indices
In the general non-linear case, as for the ANOVA of the linear model case (see Subsection 2.1), the idea is to find a general decomposition of the output variance. This can be done through the decomposition of a function with finite variance ($L^2$ property), called the Hoeffding decomposition [21], which allows to rewrite $G(X)$ as a sum of centered components related to each possible subset of inputs. For example, in the case of a model with three inputs $X = (X_1, X_2, X_3)$, $G(X)$ can be decomposed into the following components:

$$G(X) = G_{\emptyset} \qquad \text{(mean behavior)}$$
$$+ \, G_{\{1\}}(X_1) + G_{\{2\}}(X_2) + G_{\{3\}}(X_3) \qquad \text{(first-order)}$$
$$+ \, G_{\{1,2\}}(X_1, X_2) + G_{\{1,3\}}(X_1, X_3) + G_{\{2,3\}}(X_2, X_3) \qquad \text{(second-order)}$$
$$+ \, G_{\{1,2,3\}}(X). \qquad \text{(third-order)}$$

Moreover, if the inputs are assumed to be independent, each term is orthogonal to the others and writes

$$G_A(x_A) = \sum_{B \subseteq A} (-1)^{|A|-|B|} \, \mathbb{E}\left[ G(X) \mid X_B = x_B \right] \tag{A.1}$$

where $A \in \mathcal{P}_d$ is a subset of indices, $\mathcal{P}_d$ is the set of all possible subsets of $\{1, \dots, d\}$, $|A|$ is the cardinality of $A$ and $X_A$ denotes the subset of inputs selected by the indices in $A$ (i.e., $X_A = (X_i)_{i \in A}$). Then, the Hoeffding decomposition is unique and leads to a variance decomposition called "functional ANOVA":

$$V[G(X)] = \sum_{A \in \mathcal{P}_d, \, A \neq \emptyset} V[G_A(X_A)]. \tag{A.2}$$

This leads to the definition of the Sobol' indices [16]:

$$S_A = \frac{V[G_A(X_A)]}{V[G(X)]} = \frac{\sum_{B \subseteq A} (-1)^{|A|-|B|} \, V\left(\mathbb{E}\left[G(X) \mid X_B\right]\right)}{V[G(X)]}. \tag{A.3}$$

The sum of the Sobol' indices over all subsets of inputs $A \in \mathcal{P}_d$ being equal to one, they can be directly interpreted as the percentage of the output variance due to each subset of inputs [16, 5].
The Sobol' indices of order higher than one can be interpreted as quantifying the share of variance due to the interaction effects, induced by the structure of the model $G(\cdot)$, between the selected subset of inputs.

Another useful sensitivity index is the closed Sobol' index [16], which writes

$$S^{\mathrm{clos}}_A = \sum_{B \subseteq A} S_B = \frac{V\left(\mathbb{E}\left[G(X) \mid X_A\right]\right)}{V[G(X)]}. \tag{A.4}$$

In the independent setting, it can be interpreted as the percentage of variability induced by all the variables in a selected subset and their interactions. Figure A.11 provides an illustration of the Sobol' indices and the closed Sobol' indices for a model with three inputs. Each Venn diagram represents the variance of the output, with a representation of each of the two Sobol' indices presented above. While this representation is useful in the GSA context, it relies on the assumption of independence between the inputs.

Figure A.11: Sobol' indices (left) and closed Sobol' indices (right).
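As an illustration, the closed Sobol' index of Eq. (A.4) can be estimated with a standard pick-freeze Monte Carlo scheme. The sketch below is not one of the paper's algorithms: it assumes independent standard Gaussian inputs and a hypothetical model $G(x) = x_1 + 2x_2 + x_1 x_3$, chosen so that the exact index of $\{X_1, X_3\}$ is $(1+1)/6 = 1/3$.

```python
import numpy as np

def pick_freeze_closed_sobol(G, d, A, n=100_000, rng=None):
    """Pick-freeze estimate of S^clos_A = V(E[G(X)|X_A]) / V(G(X)),
    assuming independent standard Gaussian inputs (illustrative only)."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, d))
    Xp = rng.standard_normal((n, d))
    Xp[:, A] = X[:, A]              # "freeze" the inputs in A, resample the others
    y, yp = G(X), G(Xp)
    # Under independence, Cov(G(X), G(X')) with X'_A = X_A equals V(E[G(X)|X_A])
    return np.cov(y, yp)[0, 1] / np.var(y, ddof=1)

# Hypothetical model: Y = X1 + 2 X2 + X1*X3, so V(Y) = 1 + 4 + 1 = 6
G = lambda x: x[:, 0] + 2.0 * x[:, 1] + x[:, 0] * x[:, 2]

# Closed index of {X1, X3} covers X1 and the X1*X3 interaction: (1+1)/6 = 1/3
print(pick_freeze_closed_sobol(G, d=3, A=[0, 2], n=200_000, rng=0))  # close to 1/3
```

Freezing all the inputs ($A = \{1, 2, 3\}$) recovers an index of exactly one, which is a quick sanity check of the implementation.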
Appendix B. Axioms of Shapley values
Consider a game with $d$ players, and let $\mathrm{val}(A) \in \mathbb{R}$ be the cost function quantifying the production value of a coalition (i.e., a set of players) $A \in \mathcal{P}_d$, under the assumption that $\mathrm{val}(\emptyset) = 0$. The Shapley value $\phi_j = \phi_j(\mathrm{val})$, $j = 1, \dots, d$, attributed to each player can be defined by the following set of axioms:

1. (Efficiency) $\sum_{j=1}^{d} \phi_j = \mathrm{val}(\{1, \dots, d\})$, meaning that the sum of the allocated values has to be equal to the value produced by the cooperation of all the players.
2. (Symmetry) If $\mathrm{val}(A \cup \{i\}) = \mathrm{val}(A \cup \{j\})$ for all $A \in \mathcal{P}_d$, then $\phi_i = \phi_j$, meaning that if two players allow for the same contribution to every coalition, their attributions should be the same.
3. (Dummy) If $\mathrm{val}(A \cup \{i\}) = \mathrm{val}(A)$ for all $A \in \mathcal{P}_d$, then $\phi_i = 0$, meaning that if a player does not contribute to the production of resources for any coalition, he should not be attributed any resources.
4. (Additivity) If $\mathrm{val}$ and $\mathrm{val}'$ have Shapley values $\phi$ and $\phi'$ respectively, then the game with cost function $\mathrm{val} + \mathrm{val}'$ has Shapley values $\phi_j + \phi'_j$ for $j \in \{1, \dots, d\}$.

These four axioms guarantee a cooperative allocation of $\mathrm{val}(\{1, \dots, d\})$. The unique attribution method that satisfies these four axioms is the Shapley value [31], defined by:

$$\phi_j = \frac{1}{d} \sum_{A \subseteq \{-j\}} \binom{d-1}{|A|}^{-1} \left( \mathrm{val}(A \cup \{j\}) - \mathrm{val}(A) \right), \quad j = 1, \dots, d \tag{B.1}$$

where $\{-j\} = \{1, \dots, d\} \setminus \{j\}$. One can additionally remark that $\phi_j(\mathrm{val})$ is a linear operator, meaning that for any constant $c \in \mathbb{R}$, $\phi_j(c \times \mathrm{val}) = c \times \phi_j(\mathrm{val})$.

Appendix C. Mathematical proofs
Appendix C.1. Positivity of the $(\ell^1)$-target Shapley effects

Let $A \subseteq \{1, \dots, d\} \setminus \{j\}$, for $j \in \{1, \dots, d\}$. In order to show that the $(\ell^1)$-target Shapley effects are positive, one needs to prove that:

$$\text{T-S}^{\ell^1}_{A \cup \{j\}} \geq \text{T-S}^{\ell^1}_{A}. \tag{C.1}$$

In [49], it was shown that the following property holds:

$$\eta_{A \cup \{j\}} \geq \eta_{A} \tag{C.2}$$

with $\eta_A$ being defined in Eq. (18). From the definition of $\text{T-S}^{\ell^1}_{A}$,

$$\text{T-S}^{\ell^1}_{A} = 2 \, \mathbb{E}\left[ \left| F_t(X) - \mathbb{E}\left[F_t(X)\right] \right| \right] \eta_A, \tag{C.3}$$

one immediately gets the property (C.1).

Appendix C.2. Positivity of the $(\ell^2)$-target Shapley effects

Let $X = (X_1, \dots, X_d) \in \mathbb{R}^d$ be a real-valued random vector admitting a probability measure $P_X$ on the usual real measurable space. Let $L^2(P_X)$ be the functional space such that, for a measurable function $f$, $\|f\|^2_{L^2} := \int_{\mathbb{R}^d} f(x)^2 \, dP_X(x) < +\infty$. Let $G(\cdot) \in L^2$ be the studied numerical model, and denote by $Y = G(X)$ the model output (or $Y = \mathbb{1}_{G(X) > t}(X)$ the TSA variable of interest, without loss of generality). Let $A \subseteq \{1, \dots, d\} \setminus \{j\}$ be the indices of the subset of inputs $X_A$ and $j \in \{1, \dots, d\}$. In order to show that $\text{T-Sh}_j \geq$
$0$, one needs to prove that:

$$\text{T-S}_{A \cup \{j\}} - \text{T-S}_{A} \geq 0 \iff V\left(\mathbb{E}\left[Y \mid X_A\right]\right) \leq V\left(\mathbb{E}\left[Y \mid X_{A \cup \{j\}}\right]\right). \tag{C.5}$$

From the Pythagorean theorem, one has:

$$\|Y\|^2_{L^2} = \left\|\mathbb{E}\left[Y \mid X_A\right]\right\|^2_{L^2} + \left\|Y - \mathbb{E}\left[Y \mid X_A\right]\right\|^2_{L^2}, \tag{C.6}$$

which is equivalent to

$$\mathbb{E}\left[Y^2\right] = \mathbb{E}\left[\left(\mathbb{E}\left[Y \mid X_A\right]\right)^2\right] + \mathbb{E}\left[\left(Y - \mathbb{E}\left[Y \mid X_A\right]\right)^2\right]. \tag{C.7}$$

By subtracting $\left(\mathbb{E}[Y]\right)^2$ from both sides of the equality, one obtains:

$$V\left(\mathbb{E}\left[Y \mid X_A\right]\right) = V(Y) - \left\|Y - \mathbb{E}\left[Y \mid X_A\right]\right\|^2_{L^2}. \tag{C.8}$$

By using the formula $\mathbb{E}\left[Y \mid X_A\right] = \operatorname{argmin}_{Z \in \sigma(X_A)} \|Y - Z\|_{L^2}$, with $\sigma(X_A)$ being the space of $X_A$-measurable random variables in $L^2$, we deduce that $\left\|Y - \mathbb{E}\left[Y \mid X_A\right]\right\|_{L^2} \geq \left\|Y - \mathbb{E}\left[Y \mid X_{A \cup \{j\}}\right]\right\|_{L^2}$ since $\sigma(X_A) \subseteq \sigma(X_{A \cup \{j\}})$. This leads to

$$V(Y) - \left\|Y - \mathbb{E}\left[Y \mid X_A\right]\right\|^2_{L^2} \leq V(Y) - \left\|Y - \mathbb{E}\left[Y \mid X_{A \cup \{j\}}\right]\right\|^2_{L^2}. \tag{C.9}$$

Finally, from Eq. (C.8) and Eq. (C.9), we obtain

$$V\left(\mathbb{E}\left[Y \mid X_A\right]\right) \leq V\left(\mathbb{E}\left[Y \mid X_{A \cup \{j\}}\right]\right) \tag{C.10}$$

which concludes the proof.
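The monotonicity stated by (C.10) can be checked numerically on a toy case. The sketch below is purely illustrative: it assumes the hypothetical model $Y = X_1 + X_2$ with independent standard Gaussian inputs, for which the conditional expectations are known in closed form ($\mathbb{E}[Y \mid X_1] = X_1$ and $\mathbb{E}[Y \mid X_1, X_2] = Y$), and compares the two variances.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
y = x1 + x2                      # hypothetical model Y = X1 + X2

# Closed-form conditional expectations under independence:
# E[Y | X1] = X1  and  E[Y | X1, X2] = Y itself
v_A = np.var(x1)                 # V(E[Y | X_A])          with A = {1}
v_Aj = np.var(y)                 # V(E[Y | X_{A U {j}}])  with A U {j} = {1, 2}

assert v_A <= v_Aj               # Eq. (C.10): conditioning on more inputs cannot decrease it
print(round(v_A, 2), round(v_Aj, 2))
```

Here the theoretical values are $V(\mathbb{E}[Y \mid X_1]) = 1$ and $V(\mathbb{E}[Y \mid X_1, X_2]) = 2$, so the inequality holds strictly.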
Appendix D.1. Monte Carlo sampling estimator

# Total and marginal simulation functions
# Xall draws n samples of the full Gaussian input vector
# (mu and Covmat denote its mean vector and covariance matrix)
Xall <- function(n) mvtnorm::rmvnorm(n, mean = mu, sigma = Covmat)
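The appendix above gives the estimators in R. As a complementary, purely illustrative sketch (not the paper's algorithm), the same ingredients can be combined in Python for the independent-input case: a pick-freeze Monte Carlo estimate of the cost function $\mathrm{val}(A) = V(\mathbb{E}[Y \mid X_A])/V(Y)$ of Eq. (A.4) is plugged into the exact Shapley formula (B.1). The Gaussian-input assumption and the hypothetical linear model $G$ are choices made for this example only.

```python
import numpy as np
from itertools import combinations
from math import comb

def closed_sobol(G, d, A, n=200_000, rng=None):
    """Pick-freeze estimate of val(A) = V(E[G(X)|X_A]) / V(G(X)),
    assuming independent standard Gaussian inputs (illustrative)."""
    if not A:
        return 0.0                          # val(emptyset) = 0 by convention
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, d))
    Xp = rng.standard_normal((n, d))
    Xp[:, list(A)] = X[:, list(A)]          # freeze the inputs in A
    y, yp = G(X), G(Xp)
    return np.cov(y, yp)[0, 1] / np.var(y, ddof=1)

def shapley_effects(G, d, n=200_000, seed=0):
    """Exact Shapley aggregation (Eq. (B.1)) of Monte Carlo cost values."""
    cost = {A: closed_sobol(G, d, A, n, seed)
            for s in range(d + 1) for A in combinations(range(d), s)}
    phi = np.zeros(d)
    for j in range(d):
        for A, c in cost.items():
            if j not in A:
                Aj = tuple(sorted(A + (j,)))
                phi[j] += (cost[Aj] - c) / comb(d - 1, len(A))
    return phi / d

# Hypothetical additive model: exact Shapley effects are (1, 4, 9)/14
G = lambda x: x[:, 0] + 2.0 * x[:, 1] + 3.0 * x[:, 2]
print(shapley_effects(G, d=3))   # approaches [1, 4, 9]/14 for this additive model
```

By the efficiency axiom, the estimated effects sum (up to Monte Carlo noise) to $\mathrm{val}(\{1, \dots, d\}) = 1$, so each effect reads directly as a share of the output variance. The exact coalition enumeration costs $2^d$ cost evaluations, which is why the paper's algorithms rely on more scalable sampling schemes.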