Inverse Classification for Comparison-based Interpretability in Machine Learning
Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Xavier Renard, Marcin Detyniecki
Sorbonne Universités, UPMC Univ Paris 06, CNRS, LIP6 UMR 7606, 4 place Jussieu, 75005 Paris, France
AXA – Data Innovation Lab, 48 rue Carnot, 92150 Suresnes, France
Polish Academy of Science, IBS PAN, Warsaw, Poland
[email protected]
Abstract
In the context of post-hoc interpretability, this paper addresses the task of explaining the prediction of a classifier, considering the case where no information is available, neither on the classifier itself, nor on the processed data (neither the training nor the test data). It proposes an instance-based approach whose principle consists in determining the minimal changes needed to alter a prediction: given a data point whose classification must be explained, the proposed method consists in identifying a close neighbor classified differently, where the closeness definition integrates a sparsity constraint. This principle is implemented using observation generation in the Growing Spheres algorithm. Experimental results on two datasets illustrate the relevance of the proposed approach, which can be used to gain knowledge about the classifier.
Introduction
Bringing transparency to machine learning models is nowadays a crucial task. However, the complexity of today's best-performing models, as well as the subjectivity of and lack of consensus over the notion of interpretability, make it difficult to address.

Over the past few years, multiple approaches have been proposed to bring interpretability to machine learning, relying on intuitions about what 'interpretable' means and what kind of explanations would help a user understand a model or its predictions. Existing categorizations of interpretability (Bibal 2016; Kim and Doshi-Velez 2017; Doshi-Velez and Kim 2017; Biran and Cotton 2017) usually distinguish approaches mainly on the characteristics of these explanations: given a classifier to be interpreted, in-model interpretability relies on modifying its learning process to make it simpler (see for instance Abdollahi et al. 2016). Other approaches consist in building a new, simpler model to replace the original classifier (Lakkaraju and Rudin 2017; Angelino et al. 2017). On the contrary, post-hoc interpretability focuses on building an explainer system using the results of the classifier to be explained.

In this work, we propose a post-hoc approach that aims at explaining a single prediction of a model through comparison. In particular, given a classifier and an observation to be interpreted, we focus on finding the closest possible observation belonging to a different class. Explaining through particular examples has been shown in cognitive and teaching sciences to facilitate the learning process of a user (see e.g. Watson et al. (2008)). This is especially relevant in cases where the classifier decision to explain is complex and other interpretability approaches cannot provide meaningful explanations. Another motivation for our approach lies in the fact that in many applications of machine learning today, no information about the original classifier or existing data is made available to the end-user, making model- and data-agnostic interpretability approaches essential.

To address these issues we propose Growing Spheres, a generative approach that locally explores the input space of a classifier to find its decision boundary. It has the specificity of not relying on any existing data other than the observation to be interpreted to find the minimal change needed to alter its associated prediction.

The paper is organized as follows: we first present some existing approaches for post-hoc interpretability and how they relate to the one proposed in this paper. Then, we describe the proposed comparison-based approach as well as its formalization and motivations. We then describe the Growing Spheres algorithm. Finally, we illustrate the method through two real-world applications and analyze how it can be used to gain information about a classifier.
Post-hoc Interpretability
Post-hoc interpretability approaches aim at explaining the behavior of a classifier around particular observations to let the user understand their associated predictions, generally disregarding what the actual learning process of the model might be. Post-hoc interpretability of results has received a lot of interest recently (see for instance Kim and Doshi-Velez (2017)), especially as black-box models such as deep neural networks and ensemble models are being more and more used for classification despite their complexity. This section briefly reviews the main existing approaches, depending on the hypotheses that are made about available inputs and on the forms the explanations take. These two axes of discussion, which obviously overlap, provide a good framework for the motivations of our approach.

Available Inputs

Let us consider the case of a physician using a diagnostic tool. It is natural to speculate that (s)he does not have any information about the machine learning model used to make disease predictions, nor may (s)he have any idea about what patients were used to train it. This raises the question of what knowledge (about the machine learning model and the training or other data) an end-user has, and hence what inputs a post-hoc explainer should use.

Several approaches rely specifically on the knowledge of the algorithm used to make predictions, taking advantage of the classifier structure to generate explanations (Barbella et al. 2009; Hendricks et al. 2016). However, in other cases, no information about the prediction model is available (the model might be accessible only through an API or a software, for instance). This highlights the necessity of having model-agnostic interpretability methods that can explain predictions without making any hypotheses on the classifier (Baehrens et al. 2009; Adler et al. 2017; Ribeiro, Singh, and Guestrin 2016). These approaches, sometimes called sensitivity analyses, generally try to analyze how the classifier locally reacts to small perturbations. For instance, Baehrens et al. (2010) approximate the classifier with Parzen windows to calculate the local gradient of the model and understand what features locally impact the class change.
Forms of Explanations
Beyond the differences regarding their inputs, the variety of existing methods also comes from the lack of consensus regarding the definition, and a fortiori the formalization, of the very notion of interpretability. Depending on the task performed by the classifier and the needs of the end-user, explaining a result can take multiple forms. Interpretability approaches hence rely on the following assumptions to design explanations:
1. The explanations should be an accurate representation of what the classifier is doing.
2. The explanations should be easily understood by the user.
Feature importances (Baehrens et al. 2009; Ribeiro, Singh, and Guestrin 2016), binary rules (Turner 2016) or visualizations (Krause, Perer, and Bertini 2016), for instance, give different insights about predictions without any knowledge on the classifier. The LIME approach (Ribeiro, Singh, and Guestrin 2016) linearly approximates the local decision boundary of a classifier and calculates the linear coefficients of this approximation to give local feature importances, while Hendricks et al. (2016) identify class-discriminative properties that justify predictions and generate sentences to explain image classification.

In this paper, we consider the case of instance-based approaches, which bring interpretability by comparing an observation to relevant neighbors (Mannino and Koushik 2000; Štrumbelj, Kononenko, and Robnik-Šikonja 2009; Martens and Provost 2014; Kabra, Robie, and Branson 2015). These approaches use other observations (from the train set, from the test set or generated ones) as explanations to bring transparency to a prediction of a black-box classifier.

One of the motivations for instance-based approaches lies in the fact that in some cases the two objectives 1 and 2 mentioned above are contradictory and cannot both be reached in a satisfying way. In these complex situations, finding examples is an easier and more accurate way to describe the classifier behavior than trying to force a specific inappropriate explanation representation, which would result in incomplete, useless or misleading explanations for the user. As an illustration, Baehrens et al. (2010) discuss how their approach based on Parzen windows does not succeed well in providing explanations for individual predictions that are at the boundaries of the training data, giving explanation vectors (gradients) actually pointing in the (wrong) opposite direction from the decision boundary. Comparison with observations from the other class would probably make more sense in such a case and give more useful insights. Existing instance-based approaches moreover often rely on having some prior knowledge, be it about the machine learning model, the train dataset, or other labelled instances. For instance, Kabra et al. (2015) try to identify which train observations have the highest direct influence over a single prediction.

Comparison-based Interpretability
In this section, we motivate the proposed approach in the light of the two axes of discussion presented in the previous section.
Explaining by Comparing
Knowledge about the classifier or the data is an asset that existing methods can use to create the explanations they desire. However, the democratization of machine learning implies that in many of today's use cases, the end-user of an explainer system does not have access to any of this knowledge, making such approaches unrealistic. In this context, the need for a comparison-based interpretability tool that does not rely on any prior knowledge, including any existing data, constitutes one of the main motivations for our work.

Due to its highly subjective nature, interpretability in machine learning sometimes looks to cognitive sciences for a justification for building explanations (when it does not, it relies on intuitive ideas about what interpretable means). Although not mentioned by the previously cited instance-based approaches, it must be underlined that learning through examples also possesses a strong justification in cognitive and teaching sciences (Decyk 1994; Watson and Shipman 2008; Mvududu and Kanyongo 2011; van Gog, Kester, and Paas 2011). For instance, Watson et al. (2008) show through experiments that generated examples help students 'see' abstract concepts that they had trouble understanding with more formal explanations. Driven by this cognitive justification and the need to have a tool that can be used when the available information is scarce, we propose an instance-based approach relying on comparison between the observation to be interpreted and relevant neighbors.

Principle of the Proposed Approach
In order to interpret a prediction through comparison, we propose to focus on finding an observation belonging to the other class and answer the question: 'Considering an observation and a classifier, what is the minimal change we need to apply in order to change the prediction of this observation?'. This problem is similar to inverse classification (Mannino and Koushik 2000), but we apply it to interpretability.

Explaining how to change a prediction can help the user understand what the model considers as locally important. However, compared to feature importances, which are often built to have some kind of statistical robustness, this approach does not claim to bring any causal knowledge. On the contrary, it gives local insights disregarding the global behavior of the model and thus differs from other interpretability approaches. For instance, Ribeiro et al. (2016) evaluate their method LIME by looking at how faithful to the global model the local explainer is. However, despite not providing any causal information, the proposed approach provides the exact values needed to change the prediction class, which is also very helpful to the user.

Furthermore, it is important to note that our primary goal here is to give insights about the classifier, not the reality it is approximating. This approach thus aims at understanding a prediction regardless of whether the classifier is right or wrong, or whether or not the observations generated as explanations are absurd. This characteristic is shared with adversarial machine learning (Tygar 2011; Szegedy et al. 2014; Goodfellow, Shlens, and Szegedy 2015), which relates to our approach since it aims at 'fooling' a classifier by generating close variations of original data in order to change their predictions. These adversarial examples rely on exploiting weaknesses of classifiers such as their sensitivity to unknown data, and are usually generated using some knowledge of the classifier (such as its loss function). The approach we propose also relies on generating observations that might not be realistic, but without any knowledge about the classifier whatsoever and for the purpose of interpretability.
Finding the Closest Enemy
For simplification purposes, we propose a formalization of the proposed approach for binary classification. However, it can be applied to multiclass classification.

Let us consider a problem where a classifier f maps some input space X of dimension d to an output space Y = {−1, 1}, and suppose that no information is available about this classifier. Suppose all features are scaled to the same range. Let x = (x_i)_{i≤d} ∈ X be the observation to be interpreted and f(x) ∈ Y its associated prediction. The goal of the proposed instance-based approach is to explain x through another observation e ∈ X. The final form of explanation is the difference vector e − x. In particular, we focus on finding an observation e belonging to a different class than x, i.e. such that f(e) ≠ f(x). For simplification purposes, we call ally an observation assigned to the same class as x by the classifier, and enemy an observation assigned to the other class.

Recalling objective 1 mentioned earlier, the final explanation e − x we are looking for should be an accurate representation of what the classifier is doing. This is why we decide to transform this problem into a minimization problem by defining the function c : X × X → R+ such that c(x, e) is the cost of moving from observation x to enemy e. Using this notation, we focus on solving the following minimization problem:

e* = arg min_{e ∈ X} { c(x, e) | f(e) ≠ f(x) }    (1)

The difficulty of defining the cost function c comes from the fact that, despite the classifier being designed to learn and optimize some specific loss function, the considered black-box hypothesis compels us to choose a different metric. Thus, we define c as:

c(x, e) = ||x − e||_0 + γ ||x − e||_2    (2)

with ||x − e||_0 = Σ_{i≤d} 1_{x_i ≠ e_i} the number of coordinates where x and e differ, γ ∈ R+ the weight associated with the vector sparsity, and ||.||_2 the Euclidean norm.

Following Štrumbelj et al. (2009), we choose to use the l2 norm of the vector e − x as a component of the cost function to measure the proximity between e and x. However, recalling objective 2, we need to make sure that this cost function guarantees a final explanation that can be easily read by the user. In this regard, we consider that human users intuitively find explanations of small dimension to be simpler. Hence, we decide to integrate vector sparsity, measured by the l0 norm, as another component of the cost function c and combine it with the l2 norm as a weighted sum.

Because the cost function c is discontinuous and because of the hypotheses made (black-box classifier and no existing data), solving problem (1) is difficult. Hence, we choose to minimize the two components of the cost function sequentially using Growing Spheres, a two-step heuristic approach that approximates the solution of this problem.
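To make the cost function of Equation (2) concrete, here is a minimal Python (numpy) sketch, assuming observations are 1-D arrays with features scaled to the same range; the function name and default weight are ours, for illustration only:

    import numpy as np

    def cost(x, e, gamma=1.0):
        # Equation (2): l0 sparsity term plus gamma times the l2 distance.
        l0 = np.count_nonzero(x != e)   # number of features changed
        l2 = np.linalg.norm(x - e)      # Euclidean distance
        return l0 + gamma * l2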
Growing Spheres
In order to solve the problem defined in Equation (1), the proposed approach Growing Spheres uses instance generation without relying on existing data. Thus, considering an observation to interpret, we ignore in which direction the closest classifier boundary might be. In this context, a greedy approach to find the closest enemy is to explore the input space X by generating instances in all possible directions until the decision boundary of the classifier is crossed, thus minimizing the l2 component of our metric. This step is detailed in the next part, Generation. Then, in order to make the difference vector of the closest enemy sparse, we simplify it by reducing the number of features used when moving from x to e (thus minimizing the l0 component of the cost function and generating the final solution e*), as explained in the Feature Selection part. An illustration of the two steps of Growing Spheres is given in Figure 1.

Figure 1: Illustration of Growing Spheres. The red circle represents the observation to interpret, the plus signs observations generated by Growing Spheres (blue for allies, black for enemies). The white plus is the final enemy e* used to generate explanations.

Generation
The generation step of Growing Spheres is detailed in Algorithm 1. Its main idea is to generate observations in the feature space in l2-spherical layers around x until an enemy is found. For two positive numbers a0 and a1, we define the (a0, a1)-spherical layer SL around x as:

SL(x, a0, a1) = { z ∈ X : a0 ≤ ||x − z||_2 ≤ a1 }

To generate uniformly over these subspaces, we use the YPHL algorithm (Harman and Lacko 2010), which generates observations uniformly distributed over the surface of the unit sphere. We then draw U[a0, a1]-distributed values and use them to rescale the distances between the generated observations and x. As a result, we obtain observations distributed over SL(x, a0, a1).

The first step of the algorithm consists in generating uniformly n observations in the l2-ball of radius η and center x, which corresponds to SL(x, 0, η) (line 1 of Algorithm 1), with n and η hyperparameters of the algorithm. In case this initial generation step already contains enemies, we need to make sure that the algorithm did not miss the closest decision boundary. This is done by updating the value of the initial radius, η ← η/2, and repeating the initial step until no enemy is found in the initial ball SL(x, 0, η) (lines 2 to 5). If no enemy is found in SL(x, 0, η), we update a0 and a1 using η, generate over SL(x, a0, a1), and repeat this process until the first enemy has been found (as detailed in lines 6 to 11). In the end, Algorithm 1 returns the l2-closest generated enemy e to the observation to be interpreted x (as represented by the black plus in Figure 1). Once this is done, we focus on making the associated explanation as easy to understand as possible through feature selection.
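As an illustration of this generation scheme, a minimal numpy sketch (the function name is ours): directions are drawn uniformly on the unit sphere by normalizing Gaussian samples, in the spirit of Harman and Lacko (2010), then rescaled by U[a0, a1]-distributed distances:

    import numpy as np

    def generate_in_layer(x, a0, a1, n):
        # Draw n observations in the spherical layer SL(x, a0, a1).
        d = x.shape[0]
        directions = np.random.normal(size=(n, d))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)  # unit sphere
        radii = np.random.uniform(a0, a1, size=(n, 1))                   # U[a0, a1] distances
        return x + radii * directions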
Algorithm 1 Growing Spheres generation
Require: f : X → {−1, 1} a binary classifier
Require: x ∈ X an observation to be interpreted
Require: hyperparameters η, n
Ensure: enemy e
1: Generate (z_i)_{i≤n} uniformly in SL(x, 0, η)
2: while ∃ e ∈ (z_i)_{i≤n} such that f(e) ≠ f(x) do
3:    η = η/2
4:    Update (z_i)_{i≤n} by generating uniformly in SL(x, 0, η)
5: end while
6: Set a0 = η, a1 = 2η
7: Generate (z_i)_{i≤n} uniformly in SL(x, a0, a1)
8: while ∄ e ∈ (z_i)_{i≤n} such that f(e) ≠ f(x) do
9:    a0 = a1, a1 = a1 + η
10:   Generate (z_i)_{i≤n} uniformly in SL(x, a0, a1)
11: end while
12: Return e, the l2-closest generated enemy to x

Feature Selection

Let e be the closest enemy found by Algorithm 1. Our second objective is to minimize the l0 component of the cost function c(x, e) defined in Equation (2). This means that we are looking to maximize the sparsity of the vector e − x subject to f(e) ≠ f(x). To do this, we consider again a naive heuristic based on the idea that the smallest coordinates of e − x might be less relevant locally regarding the classifier decision boundary and should thus be the first ones to be ignored. The feature selection algorithm we use is detailed in Algorithm 2.

Algorithm 2 Feature selection
Require: f : X → {−1, 1} a binary classifier
Require: x ∈ X the observation to be interpreted
Require: e ∈ X such that f(e) ≠ f(x), the solution of Algorithm 1
Ensure: enemy e*
1: Set e′ = e
2: while f(e′) ≠ f(x) do
3:    e* = e′
4:    i = arg min_{j ∈ [1:d], e′_j ≠ x_j} |e′_j − x_j|
5:    Update e′_i = x_i
6: end while
7: Return e*

The final explanation provided to interpret the observation x and its associated prediction is the vector x − e*, with e* the final enemy identified by the algorithms (represented by the white plus in Figure 1).
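Putting the two steps together, a hypothetical Python sketch of Algorithms 1 and 2, built on the generate_in_layer helper above; here f stands for any black-box predict function mapping a 2-D array of observations to labels, and all names are ours:

    import numpy as np

    def growing_spheres(f, x, eta, n):
        # Algorithm 1 sketch: grow l2-spherical layers around x until an enemy appears.
        fx = f(x.reshape(1, -1))[0]
        z = generate_in_layer(x, 0.0, eta, n)          # line 1
        while np.any(f(z) != fx):                      # lines 2-5: shrink the initial ball
            eta /= 2.0
            z = generate_in_layer(x, 0.0, eta, n)
        a0, a1 = eta, 2.0 * eta                        # line 6
        z = generate_in_layer(x, a0, a1, n)
        while not np.any(f(z) != fx):                  # lines 8-11: expand layer by layer
            a0, a1 = a1, a1 + eta
            z = generate_in_layer(x, a0, a1, n)
        enemies = z[f(z) != fx]
        return enemies[np.argmin(np.linalg.norm(enemies - x, axis=1))]  # l2-closest

    def feature_selection(f, x, e):
        # Algorithm 2 sketch: reset the smallest moves of e back to x while the class flips.
        fx = f(x.reshape(1, -1))[0]
        e_prime, e_star = e.copy(), e.copy()
        while f(e_prime.reshape(1, -1))[0] != fx:
            e_star = e_prime.copy()
            moved = np.where(e_prime != x)[0]
            i = moved[np.argmin(np.abs(e_prime[moved] - x[moved]))]  # smallest move first
            e_prime[i] = x[i]
        return e_star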
Experiments

The aforementioned difficulties of working with interpretability make it often impossible to evaluate approaches and compare them to one another. Some of the existing approaches (Baehrens et al. 2009; Ribeiro, Singh, and Guestrin 2016; Doshi-Velez and Kim 2017) rely on surveys for evaluation, asking users questions to measure the extent to which the explanations help them perform their final task, in order to assess some notion of explanation quality. However, creating reproducible research in machine learning requires defining mathematical proxies for explanation quality. In this context, we present illustrative examples of the proposed approach applied to news and image classification. In particular, we analyze how the explanations given by Growing Spheres can help a user gain knowledge about a problem or identify weaknesses of a classifier. Additionally, we check that the explanations can be easily read by a user by measuring the sparsity of the explanations found.

Feature                                           Move
Min. shares of referenced articles in Mashable    +2016
Avg. keyword (max. shares)                        +913
Table 1: Output examples of Growing Spheres for Article 1, predicted to be not popular by RF
Application for News Popularity Prediction
We apply our method to explain the predictions of a random forest classifier on the news popularity dataset (Fernandes, Vinagre, and Cortez 2015). Given 58 numerical features created from 39644 online news articles from the website Mashable, the task is to predict whether said articles have been shared more than 1400 times or not. Features for instance encode information about the format and content of the articles, such as the number of words in the title, a measure of the content subjectivity, or the popularity of the keywords used. We split the dataset and train a random forest classifier (RF) on 70% of the data. We use a grid search to look for the best hyperparameters of RF (number of trees) and test it on the rest of the data (0.70 final AUC score). We use γ = 1 to define the cost function c and set the hyperparameters of Algorithm 1 to η = 0. and n = 10000.
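As a hypothetical end-to-end usage of the sketches above with scikit-learn — the data below is a synthetic stand-in for the Mashable features, and the η value is purely illustrative:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.uniform(size=(1000, 58))          # stand-in for the 58 scaled news features
    y = (X[:, 0] + X[:, 1] > 1).astype(int)   # stand-in for the 1400-shares label

    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

    x = X_test[0]                                          # observation to be interpreted
    e = growing_spheres(rf.predict, x, eta=0.1, n=10000)   # eta: illustrative value
    e_star = feature_selection(rf.predict, x, e)
    moves = e_star - x                                     # signed feature moves, as in Tables 1-2
    print(np.flatnonzero(moves), moves[np.flatnonzero(moves)])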
Illustrative Example

We apply Growing Spheres to two random observations from the test set (one from each class). For instance, let us consider the case of an article entitled 'The White House is Looking for a Few Good Coders' (Article 1). This article is predicted to be not popular by RF. The explanation vector given by Growing Spheres for this prediction has 2 non-null coordinates, which can be found in Table 1: among the articles referenced in Article 1, the least popular of them would need to have 2016 more shares in order to change the prediction of the classifier. Additionally, the keywords used in Article 1 are each associated with several articles using them. For each keyword, the most popular of these articles would need to have 913 more shares in order to change the prediction. In other words, Article 1 would be predicted to be popular by RF if the references and the keywords it uses were more popular themselves.

On the contrary, as presented in Table 2, these same features would need to be reduced for Article 2, entitled "'Intern' Magazine Expands Dialogue on Unpaid Work Experience" and predicted to be popular, to change class. Additionally, the feature 'text subjectivity score' (a score between 0 and 1) would need to be reduced by 0.03, indicating that a slightly more objective point of view from the author would lead to Article 2 being predicted as not popular.

Feature                                           Move
Avg. keyword (max. shares)                        -911
Min. shares of referenced articles in Mashable    -3557
Text subjectivity                                 -0.03
Table 2: Output examples of Growing Spheres for Article 2, predicted to be popular by RF

Figure 2: Sparsity distribution over the news test dataset. Reading: '30% of the observations of our test dataset have explanations that use 5 features or less'.
Sparsity Evaluation
In order to check whether the proposed approach fulfills its goal of finding explanations that can be easily understood by the user, we evaluate the global sparsity of the explanations generated for this problem. We measure sparsity as the number of non-zero coordinates of the explanation vector, ||x − e*||_0. Figure 2 shows the smoothed cumulative distribution of this value for all 11893 test data points. We observe that the maximum value over the whole test dataset is 17, meaning that each observation of the test dataset only needs to change 17 coordinates or fewer in order to cross the decision boundary. Moreover, 80% of them only need to move in 9 directions or fewer, that is, 15% of the features only. This shows that the proposed method indeed achieves sparsity in order to make explanations more readable. It is important to note that this does not mean that we only need 17 features to explain all the observations, since nothing guarantees that different explanations use the same features. This experiment gives an illustration of how this method can be used to gain knowledge on article popularity prediction.
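A sketch of how such a sparsity profile can be computed with the functions above (again with an illustrative η):

    def sparsity_profile(f, X_test, eta=0.1, n=10000):
        # l0 norm of each explanation vector over a test set.
        counts = []
        for x in X_test:
            e = growing_spheres(f, x, eta=eta, n=n)
            e_star = feature_selection(f, x, e)
            counts.append(np.count_nonzero(e_star - x))
        return np.array(counts)

    # Fraction of observations explained with at most k feature changes:
    # np.mean(sparsity_profile(rf.predict, X_test) <= k)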
Applications to Digit Classification

Another application of this approach is to gain some understanding of how the model behaves in order to improve it. We use the MNIST handwritten digits database (LeCun et al. 1998) and apply Growing Spheres to the binary classification problem of recognizing the digits 8 and 9. The filtered dataset contains 11800 instances of 784 features (28 by 28 pictures of digits). We use a support vector machine classifier (SVM) with an RBF kernel and parameter C = 15. We train the model on 70% of the data and test it on the rest (0.98 AUC score). We use the same values for γ and the hyperparameters of Algorithm 1 as in the first experiment.

Figure 3: Output example from the application of Growing Spheres for two instances. Example of original instance x (left column), its closest enemy e* (center) and the explanation vector x − e* (right). A white pixel indicates a 0 value, black a 1.

Illustrative Example
Given a picture of an 8 (Figure 3), our goal is to understand how, according to the classifier, we could transform this 8 into a 9 (and reciprocally), in order to get a sense of what parts of the image are considered important. Our intuition would be that 'closing the bottom loop' of a 9 should be the most influential change needed to make a 9 become an 8, and hence the features provoking a class change should include pixels found in the bottom-left area of the digits. Output examples interpreting a 9 prediction and an 8 prediction are shown in Figure 3.

Looking at Figure 3, the first thing we observe confirms our intuition: a good proportion of the non-null coordinates of the explanation vector are pixels located in the bottom-left part of the digits (as seen in the right-column pictures). Hence, we can see when comparing the left and center pictures that Growing Spheres found the closest enemies of the original observation by either opening (top example) or closing (bottom example) the bottom part of the digits. However, we also note that some pixels of the explanation vectors are much harder to understand, such as the ones located in the top right corner of the explanation image, for instance. This was to be expected since, as mentioned earlier, our method is trying to understand the classifier's decision, not the reality it is approximating. In this case, the fact that the classifier apparently considers these pixels to be influential in the classification of these digits could be evidence of an inaccuracy of the learned boundary.

Finally, we note that the closest enemies found by Growing Spheres (pictures in the center) are in both cases not proper 8 and 9 digits. Especially in the bottom example, a human observer would still probably identify the center digit as a noised version of the original 9 instead of an 8. Thus, despite achieving high accuracy and having learned that bottom-left pixels are important to turn a 9 into an 8 and reciprocally, the classifier still fails to capture the actual concepts making digits recognizable to a human.

We also check the sparsity of our approach over the whole test set (3528 instances). Once again, our method seems to generate sparse explanations, since 100% of the test dataset predictions can be interpreted with explanations of at most 62 features (representing 7.9% of the total features).
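For completeness, a hypothetical sketch reproducing panels in the spirit of Figure 3 with scikit-learn and matplotlib; the dataset fetching, the split and the η value are our assumptions, not the paper's exact setup:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import fetch_openml
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    # 8-vs-9 subset of MNIST, pixels scaled to [0, 1].
    X, y = fetch_openml("mnist_784", return_X_y=True, as_frame=False)
    mask = (y == "8") | (y == "9")
    X, y = X[mask] / 255.0, y[mask]
    X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=0)

    svm = SVC(kernel="rbf", C=15).fit(X_train, y_train)

    x = X_test[0]
    e = growing_spheres(svm.predict, x, eta=0.1, n=10000)   # eta: illustrative value
    e_star = feature_selection(svm.predict, x, e)

    # Original instance, closest enemy, and explanation vector, as in Figure 3.
    fig, axes = plt.subplots(1, 3)
    for ax, img, title in zip(axes, [x, e_star, x - e_star],
                              ["x", "closest enemy e*", "explanation x - e*"]):
        ax.imshow(img.reshape(28, 28), cmap="gray_r")
        ax.set_title(title)
    plt.show()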
Conclusion and Future Work
The proposed post-hoc interpretability approach provides explanations of a single prediction through the comparison of its associated observation with its closest enemy. In particular, we introduced a cost function taking into account the sparsity of the explanations, and described its implementation Growing Spheres, which answers this problem when no information is available about the classifier nor existing data. We showed that this approach provides insights about the classifier through two applications. In the first one, Growing Spheres allowed us to gain meaningful information about features that were locally relevant in news popularity prediction. The second application highlighted both strengths and weaknesses of the support vector machine used for digit classification, illustrating what concepts were learned by the classifier. Furthermore, we also checked that the explanations provided by the proposed approach are indeed sparse.

Besides collaborating with experts of industrial domains for explanation validation, outlooks for our work include focusing on the constraints imposed on the Growing Spheres algorithm. In numerous real-world applications, the final goal of the user may be such that it would be useless for him to have explanations using specific features. For instance, a business analyst using a model predicting whether or not a specific customer is going to make a purchase would ideally want an explanation based on features that he can leverage. In this context, forbidding the algorithm to generate explanations in specific areas of the input space or using specific features is a promising direction for future work.
References

[Abdollahi and Nasraoui 2016] Abdollahi, B., and Nasraoui, O. 2016. Explainable restricted Boltzmann machines for collaborative filtering. In ICML Workshop on Human Interpretability in Machine Learning (WHI).

[Adler et al. 2017] Adler, P.; Falk, C.; Friedler, S. A.; Rybeck, G.; Scheidegger, C.; Smith, B.; and Venkatasubramanian, S. 2017. Auditing black-box models for indirect influence. In Proceedings of the IEEE International Conference on Data Mining (ICDM).

[Angelino et al. 2017] Angelino, E.; Larus-Stone, N.; Alabi, D.; Seltzer, M.; and Rudin, C. 2017. Learning certifiably optimal rule lists. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.

[Baehrens et al. 2010] Baehrens, D.; Schroeter, T.; Harmeling, S.; Kawanabe, M.; Hansen, K.; and Müller, K.-R. 2010. How to explain individual classification decisions. Journal of Machine Learning Research 11:1803-1831.

[Barbella et al. 2009] Barbella, D.; Benzaid, S.; Christensen, J.; Jackson, B.; Qin, X. V.; and Musicant, D. 2009. Understanding support vector machine classifications via a recommender system-like approach. In Proceedings of the International Conference on Data Mining.

[Bibal and Frénay 2016] Bibal, A., and Frénay, B. 2016. Interpretability of machine learning models and representations: an introduction. In ESANN 2016 Proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 77-82.

[Biran and Cotton 2017] Biran, O., and Cotton, C. 2017. Explanation and justification in machine learning: A survey. In International Joint Conference on Artificial Intelligence Workshop on Explainable Artificial Intelligence (IJCAI-XAI).

[Decyk 1994] Decyk, B. N. 1994. Using examples to teach concepts. In Changing College Classrooms: New Teaching and Learning Strategies for an Increasingly Complex World. 39-63.

[Doshi-Velez and Kim 2017] Doshi-Velez, F., and Kim, B. 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint, 1-12.

[Fernandes, Vinagre, and Cortez 2015] Fernandes, K.; Vinagre, P.; and Cortez, P. 2015. A proactive intelligent decision support system for predicting the popularity of online news. In Lecture Notes in Computer Science, volume 9273, 535-546.

[Goodfellow, Shlens, and Szegedy 2015] Goodfellow, I. J.; Shlens, J.; and Szegedy, C. 2015. Explaining and harnessing adversarial examples. In International Conference on Learning Representations.

[Harman and Lacko 2010] Harman, R., and Lacko, V. 2010. On decompositional algorithms for uniform sampling from n-spheres and n-balls. Journal of Multivariate Analysis 101:2297-2304.

[Hendricks et al. 2016] Hendricks, L. A.; Akata, Z.; Rohrbach, M.; Donahue, J.; Schiele, B.; and Darrell, T. 2016. Generating visual explanations. In Lecture Notes in Computer Science, volume 9908, 3-19.

[Kabra, Robie, and Branson 2015] Kabra, M.; Robie, A.; and Branson, K. 2015. Understanding classifier errors by examining influential neighbors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3917-3925.

[Kim and Doshi-Velez 2017] Kim, B., and Doshi-Velez, F. 2017. Interpretable machine learning: The fuss, the concrete and the questions. In ICML Tutorial on Interpretable Machine Learning.

[Krause, Perer, and Bertini 2016] Krause, J.; Perer, A.; and Bertini, E. 2016. Using visual analytics to interpret predictive machine learning models. In ICML Workshop on Human Interpretability in Machine Learning (WHI), 106-110.

[Lakkaraju and Rudin 2017] Lakkaraju, H., and Rudin, C. 2017. Learning cost-effective and interpretable treatment regimes. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics.

[LeCun et al. 1998] LeCun, Y.; Bottou, L.; Bengio, Y.; and Haffner, P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11):2278-2324.

[Mannino and Koushik 2000] Mannino, M. V., and Koushik, M. V. 2000. The cost-minimizing inverse classification problem: a genetic algorithm approach. Decision Support Systems 29(3):283-300.

[Martens and Provost 2014] Martens, D., and Provost, F. 2014. Explaining data-driven document classifications. MIS Quarterly 38(1):73-99.

[Mvududu and Kanyongo 2011] Mvududu, N., and Kanyongo, G. Y. 2011. Using real life examples to teach abstract statistical concepts. Teaching Statistics 33.

[Ribeiro, Singh, and Guestrin 2016] Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2016. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '16, 1135-1144.

[Štrumbelj, Kononenko, and Robnik-Šikonja 2009] Štrumbelj, E.; Kononenko, I.; and Robnik-Šikonja, M. 2009. Explaining instance classifications with interactions of subsets of feature values. Data and Knowledge Engineering 68(10):886-904.

[Szegedy et al. 2014] Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; and Fergus, R. 2014. Intriguing properties of neural networks. In International Conference on Learning Representations.

[Turner 2016] Turner, R. 2016. A model explanation system. In IEEE International Workshop on Machine Learning for Signal Processing (MLSP).

[Tygar 2011] Tygar, J. D. 2011. Adversarial machine learning. IEEE Internet Computing 15:4-6.

[van Gog, Kester, and Paas 2011] van Gog, T.; Kester, L.; and Paas, F. 2011. Effects of worked examples, example-problem, and problem-example pairs on novices' learning. Contemporary Educational Psychology 36.

[Watson and Shipman 2008] Watson, A., and Shipman, S. 2008. Using learner generated examples to introduce new concepts. Educational Studies in Mathematics 69:97-109.