A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations
Barry Smyth* and Mark T. Keane
Insight SFI Centre for Data Analytics, VistaMilk SFI Research Centre,
School of Computer Science, University College Dublin, Dublin, Ireland
{barry.smyth, mark.keane}@ucd.ie
* Contact Author
Abstract
Counterfactual explanations provide a potentially significant solution to the Explainable AI (XAI) problem, but good, "native" counterfactuals have been shown to rarely occur in most datasets. Hence, the most popular methods generate synthetic counterfactuals using "blind" perturbation. However, such methods have several shortcomings: the resulting counterfactuals (i) may not be valid data-points (they often use features that do not naturally occur), (ii) may lack the sparsity of good counterfactuals (if they modify too many features), and (iii) may lack diversity (if the generated counterfactuals are minimal variants of one another). We describe a method designed to overcome these problems, one that adapts "native" counterfactuals in the original dataset to generate sparse, diverse synthetic counterfactuals from naturally occurring features. A series of experiments is reported that systematically explores parametric variations of this novel method on common datasets, to establish the conditions for optimal performance.
1 Introduction

Imagine your research paper has been reviewed by an AI model that provides post-hoc explanations for its decisions. The system might explain a rejection using a counterfactual statement: "if the paper was written more clearly and the evaluation used more datasets, then it would have been accepted". This proposes an alternative outcome (acceptance) if two aspects of the paper had been different (clearer writing and a stronger evaluation). This explanation is just one of many possible alternative counterfactuals for explaining a rejection. For example, another counterfactual might emphasise different features of the paper (related work and technical detail), with the statement: "if the paper had included a better treatment of related work and provided a clearer technical account of the main algorithm, then it would have been accepted". While we hope that neither of these criticisms will be applied to the present work, they show how useful counterfactuals can be in providing a causal focus for how an alternative outcome could be achieved [Miller, 2019; Byrne, 2019], rather than merely explaining why a particular outcome was achieved [Doyle et al., 2004; Kenny and Keane, 2019]. Accordingly, in recent years, there has been an explosion of research on how counterfactuals can be used in Explainable AI (XAI) [Adadi and Berrada, 2018; Verma et al., 2020] and algorithmic recourse [Karimi et al., 2020].

From a machine learning perspective, counterfactuals correspond to the nearest unlike neighbours (NUNs) of a target problem to be explained [Dasarathy, 1994; McKenna and Smyth, 2000; Doyle et al., 2004]. NUNs can be cast as native counterfactuals, but not every NUN will make a good counterfactual: a NUN may not be similar enough to the target problem to be a useful explanation, or it may involve too many feature differences to be easily understood (i.e., lacking sparsity [Wachter et al., 2017]); it has been argued, for psychological reasons, that good counterfactuals should have no more than 2 feature differences [Keane and Smyth, 2020]. Also, it might not be possible to find a NUN that emphasises the right sort of differences, those that are actionable in practice. Indeed, [Keane and Smyth, 2020] have shown that good, native counterfactuals are rare (95% of datasets examined had <1% of natives with ≤2 feature differences).
Hence, most popular methods generate synthetic counterfactuals, rather than trying to find them in the dataset. As such, we distinguish between the generation of two distinct classes of synthetic counterfactuals:

• Endogenous counterfactuals are generated from naturally occurring feature values; that is, they use feature values that exist in other (native) instances.
• Exogenous counterfactuals are generated in ways that are not guaranteed to use naturally occurring feature values; e.g., they may rely on interpolations of existing feature values.

This distinction is important because it has implications for the plausibility and actionability of generated counterfactuals; e.g., an explanation saying that a property needs 2.32 bedrooms for a higher sale-price is neither plausible nor actionable. Though most current counterfactual generation techniques produce exogenous counterfactuals, here we focus on endogenous counterfactual methods, adapting "native" counterfactuals (NUNs in the original dataset) to generate plausible contrastive explanations. Recently, [Keane and Smyth, 2020] (henceforth, KS20) reported an endogenous nearest-neighbour algorithm that generates counterfactuals from the existing features of a target problem and a suitable nearby NUN (when one exists). The present paper generalises this technique and shows that this new method delivers significant improvements in counterfactual quality, across many datasets, on key evaluation metrics.
2 Related Work

As AI systems become more widespread in our everyday lives, the need for fairness [Russell et al., 2017], transparency [Larsson and Heintz, 2020], and explainability is becoming more important [Adadi and Berrada, 2018; Guidotti et al., 2018; Kusner and Loftus, 2020; Muhammad et al., 2016]; indeed, some governmental regulations (such as GDPR) now call for mandatory explanations of AI-based decisions [Goodman and Flaxman, 2017]. At the same time, those machine learning approaches that have proven to be so effective in real-world tasks (e.g., deep neural networks) appear to be among the most difficult to explain [Gunning, 2017]. One approach to this problem is to cast such black-box models as white-box ones and then to use the latter to explain the former; for example, using post-hoc feature-based explanations (e.g., as in LIME [Ribeiro et al., 2016]) or example-based explanations (e.g., as in twin-systems [Kenny and Keane, 2019; Gilpin et al., 2018]). Counterfactual explanations have emerged as another popular post-hoc option as they (are argued to) offer psychological, technical, and legal benefits [Miller, 2019; Byrne, 2019; Wachter et al., 2017; Mittelstadt et al., 2019]. In this literature, there is a consensus that good counterfactuals should be:

• Available: for a majority of target problems encountered, giving a high degree of explanation coverage.
• Similar: maximally similar to the target problem, to be understandable to end-users.
• Sparse: differing in as few features as possible from the target problem, to be easily interpreted.
• Plausible: containing feature values, or combinations of feature values, that make sense (e.g., preferably from known instances).
• Diverse: using different features, to provide explanations from different perspectives (when multiple alternatives are required).

Native counterfactuals – such as existing NUNs – are plausible by definition, but natives that are similar and sparse are rare [Keane and Smyth, 2020]. Hence, most counterfactual methods generate synthetic, exogenous counterfactuals. The seminal work of [Wachter et al., 2017] proposed an optimisation approach that generates a new counterfactual p′ for a target problem p by perturbing the features of p, until a class change occurs, in a manner that minimises d(p, p′), the distance between p and p′. While this approach can generate a p′ that is very similar to p, these "blind" perturbation methods can still generate counterfactuals that lack sparsity [McGrath et al., 2018], offer limited diversity [Mothilal et al., 2020], and can involve invalid, out-of-distribution data-points [Laugel et al., 2019].
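For concreteness, this kind of objective is often written in the following schematic form (our paraphrase of the loss in [Wachter et al., 2017]; here y′ denotes the desired counterfactual class and λ trades off validity against proximity, neither symbol being defined elsewhere in this paper):

$$ p' = \arg\min_{p'} \; \lambda \,\big(M(p') - y'\big)^2 + d(p, p') $$

The first term pushes the model's output for p′ towards the desired class, while the second keeps p′ close to p; nothing in the objective itself keeps p′ on the data manifold, which is the root of the out-of-distribution problem noted above.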
Hence, [Dandl et al., 2020] have proposed modifications to the loss function to minimise the number of different features between p and p′ (diffs(p, p′)). And [Mothilal et al., 2020] have extended the optimisation function to deal with diversity, so that, for a given p, the set of counterfactuals produced minimises the distance and feature differences within the set, while maximising the range of features changed across the set. However, to deal with the out-of-distribution problem, others have explored fundamentally different endogenous approaches, in which known instances are exploited more directly to generate good counterfactuals [Keane and Smyth, 2020; Laugel et al., 2019; Poyiadzi et al., 2020].

Notably, [Keane and Smyth, 2020] (KS20) proposed an endogenous technique for generating counterfactuals by adapting native counterfactuals in the dataset. They defined a good counterfactual to be one with ≤2 feature differences with respect to a target problem, p. If a good native counterfactual did not exist for p, then KS20 reused the feature differences from a good native counterfactual that existed for some other problem q similar to p. The differences between q and its good native counterfactual (q′) serve as a template for generating the new (good) counterfactual for p, using only existing, naturally occurring feature values. Even though good native counterfactuals are rare, because they can be adapted in many different ways, KS20 showed their technique could generate counterfactuals that were very similar to target problems, while remaining sparse (i.e., ≤2 feature differences) and plausible. However, although KS20 demonstrated the feasibility of an endogenous approach (on a small set of parametric variations), it was limited to generating singleton counterfactuals, thereby limiting coverage, similarity, and diversity.

Using KS20 as a starting point, the present work describes a novel endogenous method that significantly extends this previous work. First, it proposes a more elegant and general k-NN approach, capable of generating superior counterfactual candidates from the k nearest native counterfactuals, rather than just a single nearest counterfactual. Second, this new method is capable of generating diverse sets of good counterfactuals. Third, in systematic tests varying key parameters, it is shown to consistently and significantly improve counterfactual coverage and similarity, when compared to a KS20-baseline, across a respectable number of datasets.

3 Generating Good Counterfactuals

The work of [Keane and Smyth, 2020] (KS20) assumed an explanation context in which an underlying decision model (M; e.g., a deep learner) was making predictions to be explained by a twinned counterfactual generator using k-NN (see [Kenny and Keane, 2019]). Their method aims to generate good counterfactuals (≤2 feature differences) to explain the prediction of a target problem, p. Given a set of training cases/instances, I, the approach relies on the reuse of an existing good (native) counterfactual, represented as a so-called explanation case (XC). An explanation case, xc, contains a target problem instance, x, and a nearby NUN, x′, with no more than d = 2 feature differences; see Equations 1–5:

$$\mathit{NUN}(x, x') \iff \mathit{class}(x) \neq \mathit{class}(x') \tag{1}$$
$$\mathit{matches}(x, x') = \{ f \in x \mid x.f \approx x'.f \} \tag{2}$$
$$\mathit{diffs}(x, x') = \{ f \in x \mid x.f \not\approx x'.f \} \tag{3}$$
$$xc_d(x, x') \iff \mathit{NUN}(x, x') \land |\mathit{diffs}(x, x')| \leq d \tag{4}$$
$$XC_d = \{ xc_d(x, x') \;\forall\; x, x' \in I \} \tag{5}$$

Each explanation case, xc_d, is associated with a set of match features, whose values are (approximately) equal in x and x′, and a set of difference features, whose values are different. (For convenience, we drop the subscript d without loss of generality.) Thus, an xc acts as a template for generating new counterfactuals, by identifying features that can be adapted (the difference features) and those which should remain invariant (the match features).
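As a minimal sketch of how the explanation-case base of Equations 1–5 might be materialised, consider the following Python fragment. The representation (numeric feature dictionaries paired with a class label) and all names are our own illustrative assumptions, not the authors' implementation:

    from itertools import permutations

    def diffs(x, y, tol=1e-6):
        # Difference features (Eq. 3): values that do not (approximately) match.
        return {f for f in x if abs(x[f] - y[f]) > tol}

    def build_xc_base(instances, d=2):
        # Explanation cases (Eqs. 1, 4, 5): ordered pairs of instances from
        # different classes with no more than d feature differences.
        # Each instance is a (feature_dict, class_label) tuple.
        return [(x, cx, x2, cx2)
                for (x, cx), (x2, cx2) in permutations(instances, 2)
                if cx != cx2 and len(diffs(x, x2)) <= d]

Note that this brute-force pass over instance pairs is quadratic in |I|; it is enough to convey the definitions, though an indexed nearest-neighbour search would be preferred in practice.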
To generate a good counterfactual for some target problem, p, the KS20 method first identifies the nearest explanation case, xc, based on the similarity between p and the target problem of xc, namely xc.x, and then constructs a new counterfactual, cf, from the feature values of p and xc.x′ (the good counterfactual for x). The values of match features in p are transferred to cf, along with the values of the difference features from x′; see lines 14–18 in Algorithm 1. If the predicted class of cf differs from the class of p (class(cf) ≠ class(p)), then cf is considered to be a valid (good) counterfactual for p. If class(cf) = class(p), then an alternative counterfactual can be generated using the values of xc's difference features from the nearest like neighbours of x′.

3.1 Reusing the k Nearest Explanation Cases

In essence, KS20 proposed a 1-NN approach to counterfactual generation, as a single nearest explanation case was used as a template for the counterfactual. It is natural to consider a k-NN extension, where the k nearest explanation cases are reused, each providing a different set of counterfactual candidates (see Algorithm 1). Briefly, given a target problem, p, a set of k nearest explanation cases is selected (lines 5–7), based on the similarity between p and their target problems (xc.x), using a Euclidean distance metric with feature values scaled to unit variance. For each nearest xc, we generate a set of candidate counterfactuals based on the features of its NUN or good counterfactual (xc.x′) and the like neighbours of xc.x′ (lines 8–13). A counterfactual is generated for these NUNs (lines 14–18) and validated by comparing the predicted class to the NUN class (lines 19–20); for further discussion of this validation step, which differs from KS20, see Section 3.2. Thus, the time complexity of generating a set of counterfactuals for a given p is O(km), where m is the average number of like neighbours of the NUNs in explanation cases.

Given: p, target problem; I, training instances; XC_d, explanation cases for d; k, number of XCs to be reused; M, underlying (classification) model.
Output: cfs, valid counterfactuals for p.

     1  def gen-kNN-CFs(p, I, XC_d, k, M):
     2      xcs ← getXCs(p, XC_d, k)
     3      cfs ← { genCFs(p, xc, I, M) | xc ∈ xcs }
     4      return cfs
     5  def getXCs(p, XC, k):
     6      XC′ ← { xc ∈ XC | class(xc.x) = class(p) }
     7      return sort(XC′, key = sim(xc.x, p))[:k]
     8  def genCFs(p, xc, I, M):
     9      nun ← xc.x′
    10      nuns ← {nun} ∪ { i ∈ I | class(i) = class(nun) }
    11      cfs ← { genCF(p, n) | n ∈ nuns }
    12      cfs ← { cf | cf ∈ cfs ∧ validateCF(cf, xc, M) }
    13      return sort(cfs, key = sim(cf, p))
    14  def genCF(p, nun):
    15      m ← { p.f | f ∈ matches(p, nun) }
    16      d ← { nun.f | f ∈ diffs(p, nun) }
    17      cf ← m ∪ d
    18      return cf if cf ≠ p
    19  def validateCF(cf, xc, M):
    20      return M(cf) = class(xc.x′)

Algorithm 1: Generating counterfactuals by reusing the k nearest explanation cases to a target problem, p.
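A runnable Python rendering of Algorithm 1 might look as follows; it reuses the diffs helper and the (x, class(x), x′, class(x′)) explanation-case tuples from the earlier sketch, and assumes that model is any callable returning a class label and that feature values have already been scaled. Again, this is our own sketch rather than the authors' code:

    import numpy as np

    def matches(x, y, tol=1e-6):
        # Match features (Eq. 2): values that (approximately) agree.
        return {f for f in x if abs(x[f] - y[f]) <= tol}

    def dist(x, y):
        # Euclidean distance over (pre-scaled) feature dictionaries.
        return float(np.linalg.norm([x[f] - y[f] for f in x]))

    def get_xcs(p, p_class, xcs, k):
        # Lines 5-7: the k XCs whose target problem is nearest to p.
        same = [xc for xc in xcs if xc[1] == p_class]
        return sorted(same, key=lambda xc: dist(xc[0], p))[:k]

    def gen_cf(p, nun):
        # Lines 14-18: match-feature values from p, difference-feature
        # values from the NUN.
        cf = dict(p)
        cf.update({f: nun[f] for f in diffs(p, nun)})
        return cf if cf != p else None

    def gen_cfs(p, xc, instances, model):
        # Lines 8-13: candidates from the XC's NUN and its like neighbours,
        # kept only if they pass the same-class test (lines 19-20).
        _, _, x2, x2_class = xc
        nuns = [x2] + [i for i, c in instances if c == x2_class]
        cands = (gen_cf(p, n) for n in nuns)
        valid = [cf for cf in cands if cf is not None and model(cf) == x2_class]
        return sorted(valid, key=lambda cf: dist(cf, p))

    def gen_knn_cfs(p, p_class, instances, xcs, k, model):
        return [cf for xc in get_xcs(p, p_class, xcs, k)
                for cf in gen_cfs(p, xc, instances, model)]

The O(km) complexity noted above is visible here: the outer loop visits k explanation cases and the inner loop visits the like neighbours of each NUN.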
This new method is a desirable extension of KS20 because it promises better coverage, plausibility, and diversity. On coverage, generating more counterfactual candidates greatly improves the chances of producing valid counterfactuals and, therefore, should increase the fraction of target problems that can be explained with a good counterfactual. On plausibility, this approach has the potential to generate counterfactuals that are even more similar to the target problem than those associated with the nearest explanation case. Finally, on diversity, since different explanation cases may rely on different combinations of match/difference features, the resulting counterfactuals should draw from a more diverse set of difference features.

3.2 Validating Counterfactuals

In a multi-class setting (n > 2 classes), there are at least two ways to validate a candidate counterfactual, cf. One approach is to look for a class change, so that cf is considered valid if its predicted class (M(cf)) differs from p's class, as mentioned above and used by KS20. Another option is to confirm that cf has the same class as the NUN used to produce it (M(cf) = class(xc.x′)). In this work we use this latter, same class, approach (lines 19–20 in Algorithm 1) because it provides a stronger validation test than the weaker class change approach used by KS20; it is harder for a cf to pass the same class test because there is only a single valid class, whereas any one of n − 1 classes will work in the case of KS20's class change. The stronger same class test is also more appealing because it constrains the cf to remain within the vicinity of the NUN class used to produce it (class(xc.x′)), thereby ensuring greater plausibility.
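The two tests are easy to contrast in code (a minimal sketch in the style of the earlier fragments; model is again any callable returning a class label):

    def validate_class_change(cf, p_class, model):
        # Weaker test (KS20): any of the n - 1 other classes will do.
        return model(cf) != p_class

    def validate_same_class(cf, nun_class, model):
        # Stronger test (this work): cf must land in the NUN's own class,
        # keeping it within the vicinity of the class used to produce it.
        return model(cf) == nun_class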
4 Evaluation

We evaluate the counterfactuals produced by the k-NN approach using 10 well-known ML datasets (see the legends of Figures 1 and 2), with varying numbers of classes, features, and instances, in comparison to 2 baselines, and using 3 evaluation metrics (coverage, distance, and diversity). The evaluation was implemented in Python 3.8, on a 72-core Intel Xeon (512GB RAM, 1x 1TB SSD, 6x 10TB SATA RAID, 2x NVIDIA 2080 TI), with a total run-time of approximately 60 hours.

4.1 Methodology

A form of 10-fold cross-validation is used to evaluate the newly generated counterfactuals, by selecting 10% of the training instances at random to use as test/target problems. Then, the XC case-base is built from a subset of the XCs that are available from the remaining instances; we use at most 2x as many XCs as there are test problems. Finally, any remaining instances, which are not part of any selected XCs, are used to train the underlying classifier; in this case we use a gradient boosted classifier [Friedman, 2002] (the SciKitLearn implementation with a deviance loss function, a learning rate of 0.1, and 100 boosting stages), which was found to be capable of generating sufficiently accurate classification performance across the available datasets, but obviously any alternative classifier could be considered.

We use the k-NN technique to generate good counterfactuals, varying k and d. Since this is a form of endogenous counterfactual generation, we use two variants of the (endogenous) KS20 approach as baselines: (i) a retrieval only 1NN variant, which generates a counterfactual from a single XC (and its single NUN), and (ii) a retrieval & adaptation variant, which also uses a single XC, but considers nearby neighbours of the XC's NUN as a source of extra difference features (KS20's two-step method). This latter variant is equivalent to our k-NN approach with k = 1; note, KS20 found the retrieval & adaptation variant to be superior to the retrieval only variant.

Generated counterfactuals are evaluated using 3 different metrics, averaging across the test cases and folds (as this is an endogenous technique, the out-of-distribution metrics sometimes used in evaluating exogenous techniques are not germane):

• Test Coverage: the fraction of test problems that can be associated with at least one good counterfactual, to assess explanatory coverage.
• Relative Distance: the ratio of the distance (inverse similarity) between the closest cf produced and p, and the distance between xc.x and xc.x′ from the XC used to generate cf; a relative distance < 1 means the cf is closer to p than xc.x was to xc.x′. This is our proxy measure for plausibility.
• Feature Diversity: the fraction of unique difference features in the counterfactuals; thus, a diversity of 0.1 means that 10% of all features appeared as difference features in the counterfactuals generated. Note, the retrieval only baseline always has the same diversity as the retrieval & adaptation variant, as a single XC is reused (one set of differences), and so we do not separately report its diversity.
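To make these metrics concrete, they might be computed as follows (our own sketch; results maps each test-problem id to its list of valid counterfactuals, diff_sets collects the difference-feature sets used, and dist is the scaled Euclidean distance from the earlier sketches):

    def test_coverage(results):
        # Fraction of test problems with at least one good counterfactual.
        return sum(1 for cfs in results.values() if cfs) / len(results)

    def relative_distance(cf, p, xc):
        # dist(cf, p) / dist(xc.x, xc.x'); < 1 means cf is closer to p
        # than the reused XC's problem was to its NUN.
        x, _, x2, _ = xc
        return dist(cf, p) / dist(x, x2)

    def feature_diversity(diff_sets, n_features):
        # Fraction of all features appearing as difference features.
        used = set().union(*diff_sets) if diff_sets else set()
        return len(used) / n_features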
4.2 Results

Figures 1 and 2 show the results for a range of values of k, with d = 2 and d = 3. Performance on each dataset is represented as a separate line graph, with statistical significance encoded as follows. If the difference between two successive points (for a given dataset) is statistically valid (p < .05) then the points are connected by a solid line, otherwise they are connected by a dashed line; for coverage we use a z-test of proportions and for relative distance and diversity we use a t-test. Separately, if a marker is filled then the difference between its value and the baseline is statistically significant (also for p < .05). Notice that the x-axis is not strictly linear, to provide a greater level of detail for smaller values of k.

[Figure 1: Counterfactual evaluation results for d ≤ 2: (a) counterfactual coverage, (b) mean relative distance, and (c) counterfactual diversity, along with the relative improvements (d) compared to the baseline as appropriate.]

In Figure 1(a) we can see how the ability to produce good counterfactuals increases with k up to a point, which depends on the number of available XCs for each dataset. In all datasets, coverage for k > 1 is significantly greater than the KS20-baseline, and coverage increases to more than 80% of target problems in all cases for a large enough k. On average, the current k-NN approach is able to increase coverage by almost a factor of 2, compared to the KS20 baseline, as indicated by the relative improvement values for coverage in Figure 1(d); the approximate values for k shown indicate when this maximum coverage is achieved.

In Figure 1(b) we see that these coverage improvements also offer statistically significant reductions in relative distance, for increasing k, this time compared with the retrieval only baseline, since it offers better relative distance than the retrieval & adaptation variant. Thus, by using additional explanation cases, even those that are further away from the test/target problem, we can generate valid counterfactuals that are even closer to the target. The increase in relative distance for the retrieval & adaptation variant, compared with the retrieval only variant, is due to its significant increase in coverage, which means there are far more valid counterfactuals participating in the relative distance calculations. Once again, in Figure 1(d) we show a relative improvement (decrease) in these distances (compared with the baseline): on average there is a 3x decrease in relative distance. This is usually achieved for a larger value of k than the best coverage, which highlights the benefits of continuing the search beyond an initial valid counterfactual.

Finally, the diversity results are presented in Figure 1(c), showing significant improvements in feature diversity with increasing k, although not every dataset produces counterfactuals with high levels of diversity. For example, in Figure 1(c) the counterfactuals produced for Credit only include 25% of the available features, so most features do not serve as difference features. On the other hand, the counterfactuals produced for Abalone include over 80% of features as their difference features, while datasets such as Auto-MPG, Cleveland, and Glass achieve more moderate levels of diversity, with 45% to 50% feature participation. Nevertheless, these are considerable improvements (3x) compared to the diversity of the baseline approach, as per Figure 1(d).

[Figure 2: Counterfactual evaluation results for d ≤ 3: (a) counterfactual coverage, (b) mean relative distance, and (c) counterfactual diversity, along with the relative improvements (d) compared to the baseline as appropriate.]

The d = 3 results in Figure 2, though not discussed in detail, show similar trends: coverage and diversity increase with k, while relative distance decreases, and in all cases the best results for k-NN demonstrate significant improvements over the KS20-baselines.

5 Conclusions

Counterfactuals are playing an increasingly important role in Explainable AI research because they can be more causally informative than alternative, factual forms of explanation. However, useful native counterfactuals – those that are similar to a target problem but differ in only a few features – can be rare in many real-world settings, leading some researchers to propose exogenous techniques for generating synthetic counterfactuals [Wachter et al., 2017; Dandl et al., 2020; Mothilal et al., 2020]. While such exogenous techniques have shown that synthetic counterfactuals can be produced, they often rely on features that cannot be guaranteed to occur naturally, which may limit their explanatory utility for end-users. In response, other researchers have advanced alternative endogenous techniques for generating counterfactuals, based on features that naturally occur among the instances of a dataset. The main contribution of this work is a novel approach to counterfactual generation that significantly advances previous proposals on endogenous counterfactual generation, arguably in a more elegant manner. The second main contribution lies in its systematic and extensive comparative testing of endogenous techniques, to demonstrate the optimal parameters for current and previous methods across a wide range of benchmark datasets.

As with any research, there are limitations that invite future directions. We have focused on classification tasks, but in principle the approach should be equally applicable to prediction tasks. The current evaluation focuses on a like-for-like comparison with endogenous counterfactual generation methods. Further comparisons are planned, to compare exogenous and endogenous techniques more directly.
Perhaps more importantly, as is the case with other counterfactual techniques, though we have provided an offline analysis of counterfactual quality, we have not yet evaluated the counterfactuals produced in situ, as part of a real live-user explanation setting. This will be an important part of future research, as the utility of any counterfactual generation technique will depend critically on the nature of the counterfactuals produced and their informativeness as explanations to "real" end-users. But, to be positive, the current tests identify the optimal versions of these methods to be used in such future user studies.

References

[Adadi and Berrada, 2018] Amina Adadi and Mohammed Berrada. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access, 6:52138–52160, 2018.
[Byrne, 2019] Ruth M.J. Byrne. Counterfactuals in Explainable Artificial Intelligence (XAI). In IJCAI-19, pages 6276–6282, 2019.
[Dandl et al., 2020] Susanne Dandl, Christoph Molnar, Martin Binder, and Bernd Bischl. Multi-objective counterfactual explanations. arXiv preprint arXiv:2004.11165, 2020.
[Dasarathy, 1994] Belur V. Dasarathy. Minimal consistent set (MCS) identification for optimal nearest neighbor decision systems design. IEEE Trans. on Systems, Man, and Cybernetics, 24(3):511–517, 1994.
[Doyle et al., 2004] Dónal Doyle, Pádraig Cunningham, Derek Bridge, and Yusof Rahman. Explanation oriented retrieval. In EWCBR, pages 157–168. Springer, 2004.
[Friedman, 2002] Jerome H. Friedman. Stochastic gradient boosting. Computational Statistics & Data Analysis, 38(4):367–378, 2002.
[Gilpin et al., 2018] Leilani H. Gilpin, David Bau, Ben Z. Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. Explaining explanations. In DSAA, pages 80–89. IEEE, 2018.
[Goodman and Flaxman, 2017] Bryce Goodman and Seth Flaxman. European Union regulations on algorithmic decision-making and a "right to explanation". AI Magazine, 38(3):50–57, 2017.
[Guidotti et al., 2018] Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models. ACM Computing Surveys, 51(5):1–42, 2018.
[Gunning, 2017] David Gunning. Explainable artificial intelligence (XAI). DARPA, Web, 2(2), 2017.
[Karimi et al., 2020] Amir-Hossein Karimi, Julius von Kügelgen, Bernhard Schölkopf, and Isabel Valera. Algorithmic recourse under imperfect causal knowledge. NIPS, 33, 2020.
[Keane and Smyth, 2020] Mark T. Keane and Barry Smyth. Good counterfactuals and where to find them. In ICCBR, pages 163–178. Springer, 2020.
[Kenny and Keane, 2019] Eoin M. Kenny and Mark T. Keane. Twin-systems to explain artificial neural networks using case-based reasoning. In IJCAI-19, Macao, 10–16 August 2019, pages 2708–2715, 2019.
[Kusner and Loftus, 2020] Matt J. Kusner and Joshua R. Loftus. The long road to fairer algorithms. Nature, 578:34–36, 2020.
[Larsson and Heintz, 2020] Stefan Larsson and Fredrik Heintz. Transparency in artificial intelligence. Internet Policy Review, 9(2), 2020.
[Laugel et al., 2019] Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, and Marcin Detyniecki. The dangers of post-hoc interpretability. In IJCAI-19, pages 2801–2807. AAAI Press, 2019.
[McGrath et al., 2018] Rory McGrath, Luca Costabello, Chan Le Van, Paul Sweeney, Farbod Kamiab, Zhao Shen, and Freddy Lecue. Interpretable credit application predictions with counterfactual explanations. In NIPS Workshop on Challenges and Opportunities for AI in Financial Services, 2018.
[McKenna and Smyth, 2000] Elizabeth McKenna and Barry Smyth. Competence-guided case-base editing techniques. In EWCBR, pages 186–197. Springer, 2000.
[Miller, 2019] Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267:1–38, 2019.
[Mittelstadt et al., 2019] Brent Mittelstadt, Chris Russell, and Sandra Wachter. Explaining explanations in AI. In Proc. of the Conference on Fairness, Accountability, and Transparency, pages 279–288, 2019.
[Mothilal et al., 2020] Ramaravind K. Mothilal, Amit Sharma, and Chenhao Tan. Explaining machine learning classifiers through diverse counterfactual explanations. In Proc. of the Conference on Fairness, Accountability, and Transparency, pages 607–617, 2020.
[Muhammad et al., 2016] Khalil Ibrahim Muhammad, Aonghus Lawlor, and Barry Smyth. A live-user study of opinionated explanations for recommender systems. In IUI, pages 256–260, 2016.
[Poyiadzi et al., 2020] Rafael Poyiadzi, Kacper Sokol, Raul Santos-Rodriguez, Tijl De Bie, and Peter Flach. FACE: Feasible and actionable counterfactual explanations. In Proc. of the AAAI/ACM Conference on AI, Ethics, and Society, pages 344–350, 2020.
[Ribeiro et al., 2016] Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?". In Proc. of the ACM SIGKDD, pages 1135–1144, 2016.
[Russell et al., 2017] Chris Russell, Matt J. Kusner, Joshua Loftus, and Ricardo Silva. When worlds collide: Integrating different counterfactual assumptions in fairness. In NIPS, pages 6414–6423, 2017.
[Verma et al., 2020] Sahil Verma, John Dickerson, and Keegan Hines. Counterfactual explanations for machine learning: A review. arXiv:2010.10596, 2020.
[Wachter et al., 2017] Sandra Wachter, Brent Mittelstadt, and Chris Russell. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harvard Journal of Law & Technology, 31(2):841–887, 2018.