EXPLAINABLE AI WITHOUT INTERPRETABLE MODEL

A PREPRINT

Kary Främling
Department of Computing Science, Umeå University, MIT-huset, 901 87 Umeå, Sweden
[email protected]

September 30, 2020

ABSTRACT
Explainability has been a challenge in AI for as long as AI has existed. With the recently increased use of AI in society, it has become more important than ever that AI systems be able to explain the reasoning behind their results also to end-users, in situations such as being eliminated from a recruitment process or having a bank loan application refused by an AI system. Especially if the AI system has been trained using Machine Learning, it tends to contain too many parameters for them to be analysed and understood, which has caused such systems to be called ‘black-box’ systems. Most Explainable AI (XAI) methods are based on extracting an interpretable model that can be used for producing explanations. However, the interpretable model does not necessarily map accurately to the original black-box model. Furthermore, the understandability of interpretable models for an end-user remains questionable. The notions of Contextual Importance and Utility (CIU) presented in this paper make it possible to produce human-like explanations of black-box outcomes directly, without creating an interpretable model. Therefore, CIU explanations map accurately to the black-box model itself. CIU is completely model-agnostic and can be used with any black-box system. In addition to feature importance, the utility concept that is well known in Decision Theory provides a new dimension to explanations compared to most existing XAI methods. Finally, CIU can produce explanations at any level of abstraction and using different vocabularies and other means of interaction, which makes it possible to adjust explanations and interaction according to the context and to the target users.
1 Introduction

Explainability has been a challenge in AI for as long as AI has existed. Shortliffe et al. pointed out already in 1975 that ‘It is our belief, therefore, that a consultation program will gain acceptance only if it serves to augment rather than replace the physician’s own decision making processes.’ [1]. The system described in that paper was MYCIN, an expert system capable of advising physicians who request advice regarding the selection of appropriate antimicrobial therapy for hospital patients with bacterial infections. Great emphasis was put on the interaction with the end-user, in this case a skilled physician.

With the recently increased use of AI in society, it has become more important than ever that AI systems should be able to explain the reasoning behind their results also to end-users, in situations such as being eliminated from a recruitment process or having a bank loan application refused. Meanwhile, many XAI researchers have pointed out that current XAI research rarely takes ‘normal’ end-users truly into consideration. For instance, Miller et al. illustrate the phenomenon in their article entitled ‘Explainable AI: Beware of Inmates Running the Asylum’, expressing the tendency that current XAI methods mainly help AI researchers to understand their own results and models [2]. Many XAI researchers also point out that most XAI work uses only the researchers’ intuition of what constitutes a ‘good’ explanation, while ignoring the vast and valuable bodies of research in philosophy, psychology, and cognitive science on how people define, generate, select, evaluate, and present explanations [3][4]. Another domain that seems neglected in current XAI work is Decision Theory and related sub-domains such as Multiple Criteria Decision Making (MCDM) [5]. Decision Theory is tightly connected with the other mentioned domains because methods of Decision Theory are intended to produce Decision Support Systems (DSS) that are understood and used by humans when taking decisions. Decision Theory and MCDM provide clear definitions of what is meant by the importance of an input, as well as what is the utility of a given input value towards the outcome of the DSS. A simple linear DSS model is the weighted sum, where a numerical weight expresses the importance of an input and a numerical score expresses the utility of different possible input values for different outcomes of the DSS, i.e. how good or favorable a value is.
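As a concrete reference point for this linear case, such a weighted-sum DSS can be written as follows (a minimal formulation for illustration; the symbols w_i and u_i are chosen here and are not taken from the cited works):

y_j = \sum_{i=1}^{M} w_i \, u_i(x_i), \qquad \sum_{i=1}^{M} w_i = 1, \qquad u_i(x_i) \in [0, 1]

where w_i is the importance (weight) of input i and u_i(x_i) is the utility of the value x_i for the output y_j.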
Contextual Importance and Utility (CIU) extends this linear definition of importance and utility towards non-linear models such as those produced by typical ML methods. In many (or most) real-life situations, the importance of an input and the utility of different input values change depending on the values of other inputs. For instance, the outdoor temperature has a great importance for a person’s comfort level as long as the person is outdoors. When the person goes inside, the situation (context) changes and the outdoor temperature then only has an indirect (if any) importance for the person’s comfort level. Regarding utility, both a very cold and a very warm outdoor temperature might be good or bad depending on the context. For instance, a temperature well below zero tends to be uncomfortable when wearing a T-shirt, whereas a warm temperature is uncomfortable when wearing winter clothes. The utility of different temperature values changes when adding or removing clothes and, vice versa, the utility of different clothes changes when the temperature changes.

After this Introduction, Section 2 goes through the most relevant state-of-the-art of XAI methods. Section 3 presents the formal definition of CIU. Experimental results are shown in Section 4. Open questions and future research are presented in Section 5, followed by conclusions in Section 6.
2 State of the art

There does not seem to be a clear agreement in XAI literature on the meaning of the terms interpretable versus explainable. For the rest of this paper, interpretable model will be used to signify models whose behaviour humans can understand to some extent, such as rules or linear models. Explanation will be used to signify what is actually presented to a user for a specific prediction or outcome.

XAI methods can be classified into the categories model explanation, outcome explanation and model inspection according to [6]. Model explanation signifies providing a global explanation of the black-box model through an interpretable and transparent model. This model should be able to mimic the entire behavior of the black-box and it should also be understandable by humans. Rule extraction methods and estimation of global feature importance are examples of model explanation methods, as are decision trees, attention models, etc. Outcome explanation consists in providing an explanation of the outcome of the black-box for a specific instance (or context) and can therefore be considered local. It is not required to explain the underlying logic of the entire black-box but only the reason for the outcome on a specific input instance. Model inspection is not truly a XAI category; it mainly refers to how model or outcome explanations are presented to users (visually or textually, for instance) for understanding the black-box model or its outcome.

Most (or all) current outcome explanation methods are so-called post-hoc methods, i.e. they require creating an intermediate interpretable model to provide explanations. The Local Interpretable Model-agnostic Explanations (LIME) method presented in 2016 [7] might be considered a cornerstone regarding post-hoc outcome explanation. LIME belongs to the family of additive feature attribution methods [8] that are based on the assumption that a locally linear model representing the gradient around the current context is sufficient for outcome explanation purposes. Other methods that belong to the same family are for instance Shapley values, DeepLIFT and Layer-Wise Relevance Propagation [8].

A major challenge of all methods that use an intermediate interpretable model (the ‘explanation model’ in [8]) is to what extent the interpretable model actually corresponds to the black-box model. A rising concern among XAI researchers is that current XAI methods themselves tend to be black-boxes whose behaviour is as difficult to understand as that of the explained AI black-boxes, which makes it challenging to assess to what extent XAI explanations can be trusted. Furthermore, it is not evident whether a gradient-based, locally linear model is adequate or accurate for interpreting or explaining black-box behaviour. CIU differs radically from the existing state-of-the-art in XAI because CIU does not create or use an intermediate interpretable model.
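For reference, the additive feature attribution family [8] shares the generic explanation-model form

g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i

where z' \in \{0, 1\}^M indicates the presence of simplified input features and \phi_i is the attribution assigned to feature i (notation as in [8]).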
3 Contextual Importance and Utility

The underlying idea behind CIU is to use a similar approach to explanation as humans do when explaining or justifying a decision to other humans. In a XAI context, the explainer is a (X)AI system that justifies or explains its decisions or actions and the explainee is a human (one or many) that is the target of the explanation [3]. Human explainers tend to identify what were the most important aspects that influenced their decision and start their explanation with them. Human explainers also adapt the abstraction level and vocabulary used in the explanation to their expectations about what is best understood and accepted by the explainee. It is generally not enough to explain only the taken decision; it is also often necessary to justify why another decision was not taken instead.

CIU was initially developed in a MCDM context [9]. In MCDM, importance and utility concepts are clearly defined. The Analytic Hierarchy Process (AHP) [10], originally developed in the 1970’s, seems to have become the most popular MCDM method in research and practice [11]. AHP is essentially based on a weighted sum, where the global output can be broken into intermediate concepts in a hierarchical manner. The importance of different criteria (features, inputs of the model) is expressed by numeric weights. The utility expresses how good, favorable or typical a value is for the output of the model. For a car selection problem, importance and utility can be used for giving explanations such as ‘This car is good because it has a good size, decent performance and a reasonable price, which are very important features’, where words indicating utilities are underlined and only the most important features are presented. The use of a linear model makes the meaning of importance and utility quite understandable to humans, as illustrated in Figure 1a. Rule-based systems, as well as classification trees for instance, are a way of overcoming the linearity limitation but tend to lead to step-wise models as illustrated in Figure 1b. Non-linear models such as neural nets can learn smooth and non-linear functions as illustrated in Figure 1c. Even though CIU can deal with all three kinds of models, the focus here is on the kind of non-linear functions in Figure 1c. We will begin the formal definition of CIU by providing a set of definitions.
Definition 1 (Black-box model). A black-box model is a mathematical transformation f that maps inputs x to outputs y according to y = f(x).

Definition 2 (Context). A Context C defines the input values x that describe the current situation or instance to be explained.

Definition 3 (Pre-defined output range). The value range [absmin_j, absmax_j] that an output y_j can take by definition.

In classification tasks, the Pre-defined output range is typically [0, 1]. In regression tasks the minimum and maximum output values present in a training set used for Machine Learning can usually be used as an estimate of [absmin_j, absmax_j].

Definition 4 (Set of studied inputs for CIU). The index set {i} defines the indices of inputs x for which CIU is calculated.

Definition 5 (Estimated output range). [Cmin_j(C, {i}), Cmax_j(C, {i})] is the range of values that an output y_j can take in the Context C when modifying the values of inputs x_{i}.

The values used for the inputs x_{i} should be ‘representative’ or realistic within the Context C. The meaning of ‘representative’ is discussed further down in this paper. We are now ready to provide the first definition of Contextual Importance, using a Pre-defined output range, followed by the definition of Contextual Utility.
Definition 6 (Contextual Importance). Contextual Importance CI_j(C, {i}) is a numeric value that expresses to what extent variations in one or several inputs {i} affect the value of an output j of a black-box model f, according to

CI_j(C, \{i\}) = \frac{Cmax_j(C, \{i\}) - Cmin_j(C, \{i\})}{absmax_j - absmin_j}    (1)

Figure 1: Examples of (a) linear (weighted sum), (b) rule-based and (c) non-linear models, plotted as the output y over two inputs x1 and x2.
Figure 2: Illustration of the calculation of CI and CU for the non-linear model in Figure 1c. Left panel: y as a function of x1 with constant x2 = 0.2 (Cmin = 0.02, Cmax = 0.52, out = 0.178). Right panel: y as a function of x2 with constant x1 = 0.1 (Cmin = 0.158, Cmax = 0.658, out = 0.178). In both panels absmin = 0.0 and absmax = 1.0.
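As a worked example based on the values reported in Figure 2 and Equation (1):

CI(C, \{1\}) = \frac{0.52 - 0.02}{1.0 - 0.0} = 0.5, \qquad CI(C, \{2\}) = \frac{0.658 - 0.158}{1.0 - 0.0} = 0.5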
Definition 7 (Contextual Utility). Contextual Utility CU_j(C, {i}) is a numeric value that expresses to what extent the current input values C are favorable for the output y_j(C) of a black-box model, according to

CU_j(C, \{i\}) = \frac{y_j(C) - Cmin_j(C, \{i\})}{Cmax_j(C, \{i\}) - Cmin_j(C, \{i\})}    (2)

CI and CU are illustrated in Figure 2 for the non-linear function in Figure 1c. With C = (0.1, 0.2), CI(C, {1}) = 0.5 and CI(C, {2}) = 0.5, which signifies that both inputs are exactly as important for the output value. For the utilities, CU(C, {1}) = 0.316 and CU(C, {2}) = 0.04, so even though the x2 value is higher than the x1 value, the utility of the x1 value is higher than the utility of the x2 value for the result y.

The estimation of the range [Cmin_j(C, {i}), Cmax_j(C, {i})] is the only part of CIU that requires more than one y = f(x) calculation. It is also the most critical part of CIU for producing explanations that truly correspond to and explain the behaviour of the black-box. Even though it might be possible to calculate or estimate [Cmin_j(C, {i}), Cmax_j(C, {i})] directly for some models, that is not the case for generic black-box models. One possible approach is to generate a Set of representative input vectors.

Definition 8 (Set of representative input vectors). S(C, {i}) is an N × M matrix, where M is the length of x and N is a parameter that gives the number of input vectors to generate for obtaining an adequate estimate of the Estimated output range [Cmin_j(C, {i}), Cmax_j(C, {i})].

A simple way to construct S(C, {i}) is to set all input vectors in S(C, {i}) to C and then replace the values of inputs {i} with random values from a pre-defined value range that may be different for every input x_i (a minimal sketch of this approach is shown after the list below). N is the only adjustable parameter of CIU and needs to be determined based on the complexity of the function learned by the model. More efficient approaches than random values certainly exist, but that remains a topic of future research. Furthermore, random values do not guarantee that the generated input vectors are ‘representative’. They might even result in input vectors that are impossible in reality. There is also a risk of having input vectors that are not even close to the examples that were included in the training set of the black-box. This challenge can be addressed at least in the following ways:

1. Use a black-box model that has some guarantees that [Cmin_j(C, {i}), Cmax_j(C, {i})] does not go out-of-bounds even with ‘non-representative’ input vectors. In a classification task, for instance, ‘non-representative’ input vectors are not a problem for models whose outputs do not go under zero and do not go over one under any conditions.

2. Eliminate or correct input vectors that are impossible in reality or that are too far from those included in the training set. One way of doing this could be to remove all rows in S(C, {i}) that are too far from any example in the training set. One example of non-realistic input vectors that are straightforward to correct is if there are one-hot encoded inputs, where only one of the concerned inputs is allowed to be TRUE in every input vector.
3. Use ‘non-representative’ input vectors on purpose for potentially detecting inconsistencies in the learned model.
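The following is a minimal, hypothetical sketch of the sampling-based estimation of CI and CU (Equations 1 and 2) in Python; the function and variable names are illustrative rather than taken from any existing CIU implementation, and predict stands for any black-box f that returns one output value per input vector:

import numpy as np

def ciu(predict, C, i_indices, in_ranges, absmin=0.0, absmax=1.0, N=1000):
    """Estimate Contextual Importance (CI) and Contextual Utility (CU).

    predict   : callable mapping an (N, M) array to an (N,) array of one output.
    C         : 1-D array of length M, the context (instance) to explain.
    i_indices : list of input indices {i} whose influence is studied.
    in_ranges : sequence of (min, max) sampling ranges, one per input.
    absmin, absmax : pre-defined output range (Definition 3).
    N         : number of sampled input vectors (Definition 8).
    """
    C = np.asarray(C, dtype=float)
    # Set of representative input vectors S(C, {i}): copies of C where the
    # studied inputs are replaced by random values from their value ranges.
    S = np.tile(C, (N, 1))
    for i in i_indices:
        low, high = in_ranges[i]
        S[:, i] = np.random.uniform(low, high, size=N)
    y_samples = predict(S)
    cmin, cmax = y_samples.min(), y_samples.max()
    y_C = predict(C.reshape(1, -1))[0]
    ci = (cmax - cmin) / (absmax - absmin)                       # Equation (1)
    cu = (y_C - cmin) / (cmax - cmin) if cmax > cmin else 0.5    # Equation (2)
    return ci, cu

For instance, with a toy non-linear model similar in spirit to Figure 1c, ciu(f, C=[0.1, 0.2], i_indices=[0], in_ranges=[(0, 1), (0, 1)]) would return estimates of CI(C, {1}) and CU(C, {1}).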
Now that we have studied how to estimate CI and CU of one or more inputs {i} for any output y_j, we will introduce the notion of Intermediate Concept.

Definition 9 (Intermediate Concept). An Intermediate Concept names a given set of inputs {i}.

As defined by Equations (1) and (2), CIU can be estimated for any set of inputs {i}. Intermediate Concepts make it possible to specify vocabularies that can be used for producing explanations on any level of abstraction. Different names can be used for the same Intermediate Concept (as well as for input features) and the concept name used can be changed according to the current context and the target explainee(s). In addition to using Intermediate Concepts for explaining y values, Intermediate Concept values can be explained using more specific Intermediate Concepts or input features. The following defines Generalized Contextual Importance for explaining Intermediate Concepts.

Definition 10 (Generalized Contextual Importance).

CI_j(C, \{i\}, \{I\}) = \frac{Cmax_j(C, \{i\}) - Cmin_j(C, \{i\})}{Cmax_j(C, \{I\}) - Cmin_j(C, \{I\})}    (3)

where {I} is the set of input indices that correspond to the Intermediate Concept that we want to explain and {i} ⊆ {I}.

Equation (3) is similar to Equation (1) when {I} is the set of all inputs, i.e. the range [absmin_j, absmax_j] has been replaced by the range [Cmin_j(C, {I}), Cmax_j(C, {I})]. Equation (2) for CU does not change with the introduction of Intermediate Concepts. In other words, Equation (3) allows the explanation of the outputs y_j as well as the explanation of any Intermediate Concept that leads to y_j.
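Continuing the earlier hypothetical sketch, Generalized Contextual Importance can be obtained by normalising with the output range estimated for the concept’s input set {I} instead of the pre-defined output range (again illustrative code, not an existing API):

def generalized_ci(predict, C, i_indices, I_indices, in_ranges, N=1000):
    # CI of inputs {i} relative to the Intermediate Concept defined by {I},
    # following Equation (3). With the default pre-defined range [0, 1], the
    # CI values returned by ciu() equal the estimated output ranges, so their
    # ratio gives Equation (3).
    ci_i, _ = ciu(predict, C, i_indices, in_ranges, N=N)
    ci_I, _ = ciu(predict, C, I_indices, in_ranges, N=N)
    return ci_i / ci_I if ci_I > 0 else 0.0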
4 Experimental results

Experimental results are shown for two well-known benchmark data sets: Iris flowers and Boston Housing. Iris is a classification task, whereas Boston Housing is a regression task. This choice of simple but well-known data sets signifies that it is relatively easy to understand the learned models and also to assess what ‘correct’ explanations might look like.

Bar plot visualisations are used in this paper for illustrating CIU. The length of a bar corresponds to the CI value. A configurable threshold value of CU_neutral = 0.5 has been used for dividing the CU range [0, 1] into ‘unfavorable’ and ‘favorable’ ranges. A red-yellow-green color scale visualises the CU value, where CU in [0, CU_neutral] gives a continuous transition from red to yellow and CU in [CU_neutral, 1] gives a continuous transition from yellow to dark green. Future analysis and validation with more data sets and real-life applications will be needed in order to assess whether CU_neutral needs to be adjusted in practice. A set S(C, {i}) with N = 1000 has been used for all results reported here, which gives a negligible calculation time using RStudio Version 1.2 on a MacBook Pro from 2017 with a 2.8 GHz Quad-Core Intel Core i7 processor and 16 GB 2133 MHz LPDDR3 memory.

The neural net described in [12] was used for the Iris data set; it has the useful property of converging towards the mean output value when the input values go towards infinity. Therefore the range [Cmin_j(C, {i}), Cmax_j(C, {i})] should remain within reasonable bounds. A specific test instance Iris_test is studied that is quite a typical Virginica, with values C = (7, …) for the input features ‘Sepal Length’, ‘Sepal Width’, ‘Petal Length’, ‘Petal Width’. The trained neural network gives us y = (0.012, 0.158, 0.830) for the three output classes ‘Setosa’, ‘Versicolor’ and ‘Virginica’, so it is clearly a Virginica. Table 1 shows the corresponding CIU values for Iris_test.

Some questions that could be asked are ‘Why is it a Virginica?’ but also ‘Why is it not a Versicolor or a Setosa?’. Figure 3a shows bar plot explanations for the three Iris classes. It is clear that Iris_test is not a Setosa because none of the features is typical for a Setosa and modifying any of the values will not change the situation. On the other hand, all features are typical for a Virginica. Petal Length is clearly the most important feature for the classification of Iris_test. Figure 4 shows how the output value (estimated class probability) changes for Versicolor and Virginica as a function of the four input features. These graphs confirm that Petal Length is the feature that discriminates Versicolor and Virginica the most from each other.
(a) All four inputs versus Iris class. (b) Intermediate Concepts ‘Sepal size and shape’ and ‘Petal size and shape’ versus Iris class. (c) CIU of ‘Petal width’ and ‘Petal length’ versus Intermediate Concept ‘Petal size and shape’.
Figure 3: CIU bar plot visualisations for the Iris task. Bar lengths correspond to CI values. CU values are visualised using a continuous red-yellow-green color palette.

Table 1: CIU values for Iris classes versus input.

Iris class (y_j)         Setosa (0.012)      Versicolor (0.158)   Virginica (0.830)
Input feature            CI      CU          CI      CU           CI      CU
Sepal Length             0.067   0.000       0.242   0.015        0.309   0.990
Sepal Width              0.044   0.130       0.234   0.130        0.278   0.880
Petal Length             0.314   0.015       0.640   0.008        0.638   0.995
Petal Width              0.061   0.087       0.388   0.302        0.448   0.729
Sepal size and shape     0.087   0.030       0.320   0.104        0.399   0.910
Petal size and shape     0.408   0.023       0.895   0.189        0.903   0.809
All inputs               0.869   0.013       0.927   0.110        0.920   0.886

For showing the use of Intermediate Concepts, a small vocabulary was developed. The vocabulary specifies that ‘Sepal size and shape’ is the combination of the features ‘Sepal Length’ and ‘Sepal Width’, and that ‘Petal size and shape’ is the combination of the features ‘Petal Length’ and ‘Petal Width’. When studying the results using the Intermediate Concepts ‘Sepal size and shape’ and ‘Petal size and shape’, we get the bar plot explanation in Figure 3b.

Finally, Figure 3c answers questions such as ‘Why is Petal size and shape not so typical for Versicolor?’ and ‘Why is Petal size and shape typical for Virginica?’. These bar plots express what can also be observed in the 3D graphs of Figure 4, where we can see that the combination of ‘Petal Length’ and ‘Petal Width’ could be even more typical for Virginica than it is for Iris_test.
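A vocabulary of this kind is straightforward to represent programmatically. The sketch below (hypothetical names, reusing the generalized_ci function from the earlier sketch) shows how the two Intermediate Concepts could be declared and explained:

# Hypothetical Iris vocabulary: concept name -> indices of the concerned inputs.
iris_vocabulary = {
    "Sepal size and shape": [0, 1],   # Sepal Length, Sepal Width
    "Petal size and shape": [2, 3],   # Petal Length, Petal Width
}

# CI of 'Petal Length' within the Intermediate Concept 'Petal size and shape'
# for the Virginica output (Equation 3), assuming that predict_virginica,
# iris_test and iris_ranges are defined as in the earlier sketch:
# ci_petal_length = generalized_ci(predict_virginica, C=iris_test,
#                                  i_indices=[2],
#                                  I_indices=iris_vocabulary["Petal size and shape"],
#                                  in_ranges=iris_ranges)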
Figure 4: Output y_j as a function of the input values for Versicolor (left) and Virginica (right), including 3D surfaces over ‘Petal Length’ and ‘Petal Width’. The red dot shows the input and output values for Iris_test.

A gradient boosting model was used for the Boston Housing data set. It learned the mapping from the 13 input variables to the median value (medv) of owner-occupied homes in $1000’s. The resulting CIU bar plots are shown in Figure 5 for two instances.
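For a regression output such as medv, the Pre-defined output range of Definition 3 can be estimated from the training targets before calling the same sampling-based estimator. A hypothetical sketch, where boston_y stands in for the real training targets and gbm_predict, boston_instance and boston_ranges are assumed to exist:

import numpy as np

# Stand-in for the medv training targets; only the min/max are used below.
boston_y = np.random.uniform(5.0, 50.0, size=506)

# Pre-defined output range for the regression output (Definition 3),
# estimated from the minimum and maximum training targets.
absmin_medv, absmax_medv = float(boston_y.min()), float(boston_y.max())

# The same estimator as before can then be applied, e.g. for one feature:
# ci, cu = ciu(gbm_predict, boston_instance, i_indices=[12],
#              in_ranges=boston_ranges, absmin=absmin_medv, absmax=absmax_medv)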
5 Open questions and future research

Two simple benchmark data sets were used in this paper for reasons of illustration and to enable human readers to assess the validity of the results. Experience from more data sets and use cases might lead to further extensions of CIU. For instance, use cases involving one-hot coding will require using Intermediate Concepts for aggregating the concerned one-hot inputs into one single explainable feature.

CIU provides many topics for future research. For instance, is it always better to use CI values as such or to normalise them to one? What ways of visualising CIU are best understood by humans? Research is ongoing in these directions, but how to best interact with human explainees is a vast domain. Research has also been initiated for using CIU together with Reinforcement Learning and Unsupervised Learning.
Figure 5: CIU for two Boston Housing data set instances. Bar lengths correspond to CI values for the input features crim, zn, indus, chas, nox, rm, age, dis, rad, tax, ptratio, black and lstat.
6 Conclusions

CIU is a model-agnostic method that allows producing explanations from any black-box model (no matter how ‘black’ it is or not), without producing an intermediate interpretable model. Therefore CIU does not have the same challenges of black-box model fidelity as most other XAI methods do. Compared to other outcome explanation methods, CIU allows for more flexibility in how explanations can be produced and presented to explainees, due to the possibility of applying CIU to sets of features and to Intermediate Concepts. Intermediate Concepts enable the use of different vocabularies depending on the context and on the explainee. The Contextual Utility concept also allows producing explanations in a more human-like way than other XAI methods.

By not using an intermediate interpretable model, CIU does not fit into any of the existing categories presented by major XAI survey articles. CIU has only one adjustable parameter, i.e. the number of samples in S(C, {i}), which it might be possible to eliminate or automate in the future. Therefore, CIU establishes a new category of XAI methods that will hopefully help to solve at least some of the many challenges that AI and XAI are currently facing.

References
[1] Edward H. Shortliffe, Randall Davis, Stanton G. Axline, Bruce G. Buchanan, C. Cordell Green, and Stanley N. Cohen. Computer-based consultations in clinical therapeutics: Explanation and rule acquisition capabilities of the MYCIN system. Computers and Biomedical Research, 8(4):303–320, 1975.

[2] Tim Miller, Piers Howe, and Liz Sonenberg. Explainable AI: Beware of inmates running the asylum. In IJCAI 2017 Workshop on Explainable Artificial Intelligence (XAI), 2017.

[3] Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38, February 2019.

[4] M. Westberg, A. Zelvelder, and A. Najjar. A historical perspective on cognitive science and its influence on XAI research. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 11763 LNAI:205–219, 2019.

[5] Ralph Keeney and Howard Raiffa. Decisions with Multiple Objectives: Preferences and Value Trade-Offs. Cambridge University Press, 1976.

[6] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM Computing Surveys (CSUR), 51(5):93, 2018.

[7] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. Why should I trust you?: Explaining the predictions of any classifier, 2016.

[8] Scott M. Lundberg and Su-In Lee. A unified approach to interpreting model predictions. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems 30, pages 4765–4774. Curran Associates, Inc., 2017.

[9] Kary Främling. Modélisation et apprentissage des préférences par réseaux de neurones pour l’aide à la décision multicritère. PhD thesis, INSA de Lyon, March 1996.

[10] Thomas L. Saaty. Decision Making for Leaders: The Analytic Hierarchy Process for Decisions in a Complex World. RWS Publications, Pittsburgh, Pennsylvania, 1999.

[11] Sylvain Kubler, Jérémy Robert, William Derigent, Alexandre Voisin, and Yves Le Traon. A state-of-the-art survey & testbed of fuzzy AHP (FAHP) applications. Expert Systems with Applications, 65:398–422, 12 2016.

[12] Kary Främling and Didier Graillot. Extracting Explanations from Neural Networks. In