Fibres of Failure: Classifying errors in predictive processes
Leo Carlsson, Gunnar Carlsson, Mikael Vejdemo-Johansson

Abstract
We describe Fibres of Failure (FiFa), a method to classify failure modes of predictive processes using the Mapper algorithm from Topological Data Analysis. Our method uses Mapper to build a graph model of input data stratified by prediction error. Groupings found in high-error regions of the Mapper model then provide distinct failure modes of the predictive process. We demonstrate FiFa on misclassifications of MNIST images with added noise, and demonstrate two ways to use the failure mode classification: either to produce a correction layer that adjusts predictions by similarity to the failure modes; or to inspect members of the failure modes to illustrate and investigate what characterizes each failure mode.
1. Introduction
In recent years the interest in transparent, interpretable and explainable models in machine learning has grown dramatically, with dedicated workshops at NIPS 2016 (Wilson et al., 2016), NIPS 2017 (Tosi et al., 2017; Wilson et al., 2017) and ICML 2017 (Varshney et al., 2017), as well as attention from grant agencies (Gunning, 2016).

The approaches to interpretable models go in several distinct directions: producing sparse models (Hara & Maehara, 2016; Wisdom et al., 2016; Hayete et al., 2016; Tansey et al., 2017), visualization techniques (Smilkov et al., 2016; Selvaraju et al., 2016; Thiagarajan et al., 2016; Gallego-Ortiz & Martel, 2016; Krause et al., 2016; Zrihem et al., 2016; Handler et al., 2016), hybrid models (Krakovna & Doshi-Velez, 2016; Reing et al., 2016), input data segmentation (Samek et al., 2016; Hechtlinger, 2016; Thiagarajan et al., 2016), and model diagnostics with or without blackbox interpretation layers (Lundberg & Lee, 2016; Vidovic et al.). We introduce Fibres of Failure, a method that draws on topological data analysis to produce model diagnostics through a classification of prediction failure modes in feature space. Our method relates to both the input data segmentation and the model diagnostics directions of research by finding and classifying input regions that behave unexpectedly or erroneously as compared to what the model is designed to predict.

Noisy input as well as adversarial learning has been used to motivate and to generate examples and insights for interpretability (Kindermans et al., 2016). We will use the same basic idea to illustrate our method: by studying prediction failures on MNIST images with added noise.

(KTH Royal Institute of Technology; Ayasdi Inc.; Stanford University; CUNY College of Staten Island. Correspondence to: Leo Carlsson <[email protected]>, Mikael Vejdemo-Johansson <[email protected]>. Preliminary work. Under review. Copyright 2018 by the author(s).)
2. Related work
One interpretability method with a large impact on the field, LIME (Ribeiro et al., 2016c), inspects single instances by perturbing the input and tracing how predictions change with the perturbation. Other interpretability methods focus more closely on aggregates of inputs, such as TreeView (Thiagarajan et al., 2016), which visualizes deep neural networks by first clustering neurons by activation patterns, then clustering these groups by prediction labels, and finally training a predictor to predict the meta-clusters from the input data directly.

The FiFa method builds on Mapper, an algorithm from Topological Data Analysis that constructs a graph (or simplicial complex) model of arbitrary data. Mapper has had success in a wide range of application areas, from medical research studying cancer, diabetes, asthma and many more topics (Nicolau et al., 2011; Li et al., 2015; Hinks et al., 2016; Schneider et al., 2016), genetics and phenotype studies (Romano et al., 2014; Carlsson, 2017; Cámara, 2017; Savir et al., 2017; Bowman et al., 2008), to hyperspectral imaging, material science, sports and politics (Duponchel, 2018a;b; Lee et al., 2017; Lum et al., 2013). Of particular note for our approach are the contributions on cancer, diabetes and fragile X syndrome (Nicolau et al., 2011; Romano et al., 2014; Li et al., 2015), where Mapper was used to extract new subgroups from a segmentation of the input space.

Our results build on two fundamental concepts: viewing predictive models as functions, and therefore usable as input to Mapper, and the Mapper technique for producing intrinsic graph models of arbitrary data sets.

As a running illustration in this paper we will be looking at how a CNN trained on the MNIST dataset fails when encountering noisy images derived from MNIST. The influence of noise on learning algorithm performance has been studied before: Zhou et al. (2017) and Dodge & Karam (2016) found a dramatic increase in error rates with increased image distortion, confirming our choice of illustrative test case. Adversarial learning is another method that has proven successful at deteriorating performance for trained networks (Cisse et al., 2017a; Yuan et al., 2018; Moosavi-Dezfooli et al., 2016; Chen et al., 2018).

A lot of work has been done on making deep networks more robust against perturbations: both against noise deterioration and against adversarial manipulation (Fawzi et al., 2016; Hein & Andriushchenko, 2017; Noh et al., 2017; Cisse et al., 2017b; Huang et al., 2015; Tramèr et al., 2018).
3. Proposed method
The proposed method, Fibres of Failure (FiFa), takes a different approach from the related work. We do not intend to modify deep neural network models; rather, we create classifiers on top of the model that recognize specific types of faulty predictions (failure modes) from a deep learning model trained to recognize MNIST images.

3.1. Mapper

Mapper (Singh et al., 2007) is an algorithm that constructs a graph (more generally a simplicial complex) model for a point cloud data set. The graph is constructed systematically from some well defined input data. It was defined in (Singh et al., 2007), and has been shown to have great utility in the study of various kinds of data sets (as described in Section 2). It can be viewed as a method of unsupervised analysis of data, in the same way as principal component analysis, multidimensional scaling, and projection pursuit can, but it is more flexible than any of these methods. Comparisons of the method with standard methods in the context of hyperspectral imaging have been documented in (Duponchel, 2018a;b).

In topological language, Mapper starts with the choice of a collection of continuous filter functions and an open cover over their range. The fibres, or preimages, of this open cover produce an open cover on the data space, which can be refined using connected components. Doing this with a fine enough cover and non-degenerate filter functions produces a good cover in the sense of the nerve lemma (Hatcher, 2002), so the nerve complex is homotopy equivalent with the data source.

An open cover here is almost, but not quite, the same thing as a partition. In order to track connectivity information, the partition cannot be allowed to become disconnected – that would miss parts of the space, and introduce artificial disconnects. The open cover most cleanly translates into a "fattened" partition, or a partition with overlaps between adjacent parts.

In more detail, and using a more data-focused and less topological description, Mapper proceeds by the following steps. We let X (the dataset) be a finite metric space.

1. Select arbitrary functions f_1, f_2, ..., f_k : X → R. We call these filter functions; they encode a separation of data points. In practice, the number k is usually 1, 2, or 3. Common filter functions are statistically meaningful quantities such as the values of a density estimator or centrality measure, or outputs from a machine learning algorithm such as PCA or MDS, or a variable used in defining the data set.

2. For each of the functions, pick parameters to produce an overlapping partition of R: a number N_i of partitions and a proportion of overlap 0 < p < 1.

3. For each function f_i, let a_i and b_i denote the minimum and maximum values taken by f_i, and construct an open cover of the interval J_i = [a_i, b_i] by introducing N_i subintervals
   J_{is} = [a_i + (s − 1 − p/2)Δ_i, a_i + (s + p/2)Δ_i] ⊆ J_i,
   where Δ_i = (b_i − a_i)/N_i and 1 ≤ s ≤ N_i.

4. Construct a (likely overlapping) partition of X by letting each
   U_{s_1,...,s_k} = f_1^{−1}(J_{1s_1}) ∩ ··· ∩ f_k^{−1}(J_{ks_k}), with 1 ≤ s_i ≤ N_i,
   be a part in the partition.

5. Apply some clustering algorithm to each U_{s_1,...,s_k} to decompose it into disjoint sets U_{s_1,...,s_k;j}. For our experiments, we use single linkage clustering with a heuristic for cutoff based on the histogram of distances in U, described in detail in (Singh et al., 2007).

6. Construct a graph by setting the vertices to σ = (s_1, ..., s_k; j) and connecting σ to σ′ with an edge precisely when U_σ ∩ U_σ′ ≠ ∅. For the simplicial complex version, vertices [σ_0, ..., σ_d] span a simplex when their joint intersection is non-empty.

See (Singh et al., 2007; Carlsson, 2009) for more details.

Mapper has several implementations available: Python Mapper (Müllner & Babu, 2013), Kepler Mapper (Saul & van Veen, 2017) and TDAmapper are all open source, while Ayasdi Inc. provides a commercial implementation of the algorithm. For our work we are using the Ayasdi implementation of Mapper.

3.1.1. MAPPER ON PREDICTION FAILURE
The filters in the Mapper function have the effect of ensuring separation of features in the data that are separated by the filter functions themselves. Step one of FiFa specifically uses a Mapper analysis with prediction error as one of the filter functions. By including prediction error this way, the FiFa algorithm guarantees that any groups that are extracted are homogeneous with respect to prediction failure, and thus usable as a failure mode designation. We name a Mapper model with prediction failure as a filter a FiFa model.

Subgroups of the FiFa model with tight connectivity in the graph structure and with homogeneous and large average prediction failure per component cluster provide a classification of failure modes. These can be selected either manually, or using a community detection algorithm.

When selecting failure modes manually, a visualization such as in Figure 2 is most helpful. Here, flares (tightly connected subgraphs emanating from a core, such as Group 40) or tightly connected components, loosely connected to surrounding parts of the graph, are the most compelling characterizations of a good failure mode subgroup.

Once failure modes have been identified, one way to use the identification is to add a correction layer to the predictive process. Use a classifier to recognize input data similar to a known failure mode, and adjust the predictive process output according to the behavior of the failure mode in available training data.

3.3.1. TRAIN CLASSIFIERS
For our illustrative examples, we demonstrate several "one vs rest" binary classifier ensembles, where each classifier is trained to recognize one of the failure modes (extracted subgroups) from the Mapper graph. We demonstrate performance of FiFa for model correction using Linear SVM, Logistic Regression, and Naïve Bayes classifiers.

(TDAmapper: http://cran.r-project.org/web/packages/TDAmapper. Ayasdi: http://ayasdi.com.)

3.3.2. EVALUATE BIAS
A classifier trained on a failure mode may well capture larger parts of test data than expected. As long as the space identified as a failure mode has consistent bias, it remains useful for model correction: by evaluating the bias in data captured by a failure mode classifier we can calibrate the correction layer.

3.3.3. ADJUST MODEL
The actual correction on new data is a type of ensemble model, and has flexibility in how to reconcile the bias prediction with the original model prediction, or even how to reconcile several bias predictions with each other. For our example in this paper we choose to override the CNN prediction with the observed ground truth of the failure mode from the training data used to create the classifier. For regression tasks we have also used the average of the failure mode training group as an offset to subtract from the model prediction.
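The two adjustment strategies just described can be sketched as follows. All names here are ours: `mode_classifiers` stands for any objects with a `predict` method returning 1 for members of the corresponding failure mode, and the stubs below are not the paper's trained classifiers.

```python
import numpy as np

def correct_classification(base_preds, features, mode_classifiers, mode_truths):
    """Override the base model's prediction with the failure mode's
    training ground truth wherever a failure-mode classifier claims
    the input (the strategy used for the CNN example)."""
    corrected = np.array(base_preds, copy=True)
    for clf, truth in zip(mode_classifiers, mode_truths):
        member = np.asarray(clf.predict(features)) == 1
        corrected[member] = truth
    return corrected

def correct_regression(base_preds, features, mode_classifiers, mode_biases):
    """Regression variant: subtract the failure mode's mean
    prediction error instead of overriding the prediction."""
    corrected = np.array(base_preds, dtype=float)
    for clf, bias in zip(mode_classifiers, mode_biases):
        member = np.asarray(clf.predict(features)) == 1
        corrected[member] -= bias
    return corrected
```

How overlapping failure-mode claims are reconciled (here: last classifier wins) is a design choice of the ensemble; the paper leaves this flexible.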
Identifying distinct failure modes and giving examples of these is valuable for model inspection and debugging. Statistical methods, such as Kolmogorov-Smirnov testing, can provide measures of how influential any one feature is in distinguishing one group from another, and can give notions of what characterizes any one failure mode from other parts of input space. With examples and distinguishing features in hand, we can go back to the original model design and evaluate how to adapt the model to handle the failure modes better.

Much of the work in interpretability for machine learning provides tools to inspect examples, and for providing a model explanation for a specific example. These work well in conjunction with FiFa to find explanations for the identified failure modes.
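The Kolmogorov-Smirnov ranking used for feature inspection can be sketched as a plain NumPy illustration; the function names are ours, and this is a stand-in for, not a reproduction of, the paper's testing pipeline.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.abs(cdf_a - cdf_b).max())

def rank_features_by_ks(mode_acts, ref_acts):
    """Rank feature columns (e.g. Dense-128 activations) by how
    strongly they separate a failure mode from a reference group."""
    scores = np.array([ks_statistic(mode_acts[:, i], ref_acts[:, i])
                       for i in range(mode_acts.shape[1])])
    return np.argsort(scores)[::-1], scores
```

The top-ranked columns are the candidate "distinguishing activations" to inspect, e.g. via saliency maps as in Section 4.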
4. Experiments
In order to evaluate the FiFa method we have trained a CNN classifier on the MNIST data set, created prediction failures by adding noise to the data, and gone through the FiFa pipeline for the resulting erroneous predictions. With distinct failure modes extracted, we then illustrate both a quantitative and a qualitative approach to handling the output from FiFa: on the one hand we adjust predictions using classifiers trained on recognizing each failure mode and measure the improvement in classification on the resulting ensemble approach; on the other hand we compare several failure modes that misclassify versions of the same digit (the digit 5) in different ways.

[Figure 1: diagram of the CNN topology: 28x28x1 input → Conv2D (26x26x32) → Conv2D (24x24x64) → Max pooling (12x12x64) → Dropout 25% (12x12x64) → Flatten (9216) → Dense (128) → Dropout 50% (128) → Dense (10) → Softmax.]

Figure 1. The topology for the CNN model. The numbers display the dimension of each layer in the model. The abbreviations, such as Conv2D, describe the specific transformations performed between layers in the model. The activation function for the classification layer was Softmax and for the other layers ReLU. The optimizer used was Adadelta.
We created a CNN model with the topology shown in Figure 1. The network topology and parameters were chosen arbitrarily, with the only condition that it performs well on the original MNIST data set. The activation function was Softmax for the classification layer and ReLU for all other layers. The optimizer was Adadelta with learning rate 1.0 and default values for ρ and ε. We trained the model on 60,000 clean MNIST training images and tested it on 10,000 clean MNIST images through 12 epochs. The accuracy on the test set of 10,000 clean MNIST images was 99.05%. We created 10,000 corrupt MNIST images using 25% random binary flips on the clean test images [source for code]. The accuracy on the corrupt MNIST images was 40.45%.

To create the Mapper graph we used the following:
Filters:
Principal Component 1, probability of Predicted digit, probability of Ground truth digit, and Ground truth digit. Our measure of predictive error is the probability of the Ground truth digit. By including the Ground truth digit itself we separate the model on ground truth, guaranteeing that any one failure mode has a consistent ground truth that can be used for corrections.
Metric:
Variance Normalized Euclidean
Variables:
Instances:
We randomly shuffled the data from the 10,000 clean and 10,000 corrupt images that were used to test the CNN model, and split the 20,000 instances into 5 training sets of size 16,000 each and 5 test sets of size 4,000 each. The training sets were used to create 5 Mapper graphs. This is in order to perform 5-fold cross validation on the classifiers.

The use of probabilities for the predicted and ground truth digits as filters guarantees that Mapper separates regions of correct predictions from those of wrong predictions. After all, these probabilities are measures of error for the CNN model. We purposely omitted the activations from the Dense-10 layer as input variables because of the direct reference to the probabilities for both the ground truth digit and the predicted digit.

The following variables were included in the analysis but were not used to create the FiFa model:

- 10 activations from the Dense-10 layer, which consists of the probabilities for each digit, 0-9.
- 784 pixel values representing the flattened MNIST image of size 28x28x1.
- prediction by the CNN model, ground truth digit, corrupt or original data (binary), correct or incorrect prediction (binary), probability of the Predicted digit (highest value of the Dense-10 layer), and probability of the ground truth digit.

Hence, the total number of variables in our analysis was 10272.

To extract failure modes from the FiFa model we used a supervised community detection method to find groups of approximately constant prediction error. In the Mapper implementation we are using, a grouping method based on Agglomerative Hierarchical Clustering (AHCL) (Edwards & Cavalli-Sforza, 1965; Murtagh & Contreras, 2012) and Louvain Modularity (Blondel et al., 2008) is included. As supervision, a function on the data is chosen – for FiFa, choose the measure of prediction error. The difference in means of the supervision function produces a graph edge weighting: edges are weighted as "strong" if they have similar supervision function values, and "weak" if the supervision function values are different. With the graph weighting in place, hierarchical clustering produces a clustering tree, using the weighted edges to generate a graph metric to cluster over. Finally, Louvain modularity identifies an optimal graph partition from the clustering tree.

From the partitioned groups, we retain as failure modes those groups that have at least 15 data points and less than 99.05% correct predictions, which is the accuracy of the CNN model on the original MNIST test data.

We trained classifiers in a one vs. rest scheme on each group in the 5 folds of data that were used to create the 5 Mapper graphs. We used the following types of classifiers, with varying parameters shown in square brackets:
Linear-SVM: Loss function: squared hinge. Penalty function: ℓ2. Regularization parameter C = [0.001, 0.01, 0.1, 1, 10, 100, 1000].

Logistic Regression: Penalty function: ℓ2. Regularization parameter C = [0.001, 0.01, 0.1, 1, 10, 100, 1000].

Naïve Bayes: Gaussian Naïve Bayes using class priors for each group in the training data set.

We used the parameters from each best performing classifier to train new models. This time, we evaluated each model on a second test data set, called 'Corrupt', which consisted of 10,000 new corrupt images using 25% binary flips on the original MNIST test dataset. Hence, we used the same noise setup as for the corrupt images used to test the CNN model.

For the test data sets, we evaluated to what extent each classifier predicted member points with the same ground truth digit as that of the group the model was trained on. As we trained the classifiers on groups containing many wrong predictions, it is expected that the classifiers will classify member points with wrong predictions on the test data sets. Hence, we offset the predicted digits with the ground truth digit of the group each classifier was trained on. We attempt to exploit the consistent bias of the classifiers to improve the accuracy of the now combined CNN and classifier ensemble.
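The one-vs-rest training scheme with cross-validated regularization can be sketched with scikit-learn; the helper name and the parameter grid shown here are illustrative stand-ins, not the paper's exact setup.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

def fit_mode_classifiers(X, mode_masks, Cs=(0.01, 0.1, 1.0, 10.0)):
    """One-vs-rest scheme: one binary classifier per failure mode,
    separating the mode's members from all other points, with the
    regularization parameter C tuned by 5-fold cross validation."""
    classifiers = []
    for mask in mode_masks:  # boolean membership mask over rows of X
        y = np.asarray(mask).astype(int)
        search = GridSearchCV(
            LinearSVC(loss="squared_hinge", penalty="l2"),
            {"C": list(Cs)}, cv=5)
        search.fit(X, y)
        classifiers.append(search.best_estimator_)
    return classifiers
```

Each fitted classifier's positive predictions are then offset to the ground truth digit of its failure mode, as described above.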
The following parameters were chosen for the three classifiers we evaluated as model correction layers:
Linear-SVM: C = 1.0 (chosen as highest accuracy in a 5-fold cross validation).

Logistic Regression: C = 1000 (chosen as highest accuracy in a 5-fold cross validation).

Naïve Bayes: Gaussian Naïve Bayes using priors induced from the data.

The average number of data points in all failure mode groups in the 5 folds was 4937 of the total 16,000. The average number of clean data points in all groups in the 5 folds was 10.4, accounting for a fraction of 0.21% of the 4937 data points. This also means that the failure mode groups encompass roughly 62% of all corrupt data points in the training set. The number of failure modes (extracted subgroups) in each fold was 41, 41, 41, 41, and 37, respectively.

Table 1 shows the accuracy on the two test data sets using the CNN with and without FiFa. The linear-SVM classifiers performed best on both data sets, with an improvement of 6.43%pt on the 5-fold cross validation test sets and 19.33%pt on the 'Corrupt' data.

For the qualitative analysis, we chose to focus on four groups with digit 5 as the ground truth digit: Group 50, which is not one of the failure mode groups, and Groups 30, 40, and 47, all part of the total 39 failure mode groups. The locations of each group are shown in Figure 2. The distribution of predicted probabilities for each label is shown in Figure 4: group 30 is the group with the highest probability for the digit 5, while 40 and 47 are more focused on 8, 2, and 3. All three groups favor digit 8, as their mean probabilities are between 0.5-0.9.

[Figure 2: Mapper graph with the circled groups G50, G30, G47, and G40.]

Figure 2. The Mapper graph for the CNN on the MNIST dataset, colored by probability of predicting the ground truth digit. The colorbar is for interpreting the values of the coloring. The circled nodes and edges are the groups Group 30, Group 40, Group 47, and Group 50. The 5-fold Mapper graphs are shown in the Supplement.

We compared these three failure modes with the non-failure Group 50 and extracted the 5 activations with the highest KS-values from the Dense-128 layer; see Figure 1. To illustrate the differences between the three failure modes regarding the activations, we have provided a selection of saliency maps (Simonyan et al., 2013) for all images considered as true members of each of the three failure mode groups. These were all produced using the keras-vis Python package.

Figure 3 shows a selection of noisy images and their saliency maps for some of the activations with the highest KS-values within the Dense-128 layer. The two leftmost image pairs were selected based on visually clear saliency maps with respect to digits. The two rightmost were selected based on the most unclear/noisy saliency maps. The full collection of saliency maps for these groups can be found in our supplemental material.

            % clean    Corrupt
 (4 000)    49.7%      (10 000)
CNN         69.40%   48.53%   41.14%
CNN+LR      75.45%
CNN+NB      73.33%   6.85%    48.34%

Table 1. Performance of the CNN as compared to the CNN with FiFa-driven improvements, both on the average of the 5 folds of test data and on entirely corrupted test data. The improvements by each classifier ensemble are for the best performing parameters. In addition to prediction accuracy, we also report the average proportion of uncorrupted (clean) data points in the 5-fold test data set as well as in the predicted data points by each classifier ensemble. Bold face marks the best performances (highest accuracy; lowest percentage of clean digits caught). Noticeably, linear classifiers perform well, producing an almost 20%pt increase in accuracy on corrupted data while imposing corrections on almost no clean images.

The activations 24 and 81, present in all three groups, display activity that is consistent with an activation detecting features of the digit 5, while the activations 89 and 99 correspond closer to an activation for the digit 3, and 119, 122 and 124 correspond to activations for the digit 8. In particular, in the last three groups, noise that closes loops in a written 5 tends to have high saliency.

In Table 2 we show the percentage of blank saliency maps, indicating that an activation is missing completely for a particular input.
5. Discussion and Conclusion
For the quantitative approach to handling failure modes we could see significant improvement even using quite simplistic classifiers for constructing a correction layer: an increase of almost 20%pt, avoiding corrections on almost all uncorrupted images, was seen with both of the linear separation methods, logistic regression and SVM.

On the qualitative side, an inspection of the saliency maps (see Figure 3 for a selection of particularly illustrative maps, and the supplementary material for a full collection) showed us that the groups were distinguished from group 50, which contains correctly predicted digits 5, in network activity either by an activation tuned to detecting 5s, or by an activation that often looked for closing loops and found them in the added noise. Blank saliency maps were common for the 5-detecting neurons, as can be seen in Table 2, overwhelmingly so for the groups 40 and 47 where correct predictions were rare, and much less commonly in group 30 where, as can be seen in Figure 4, a correct prediction still came with significant strength in the softmax layer.

[Figure 3: rows of noisy images with their saliency maps. Row labels: Activation 24 (one row each for groups 30, 40 and 47); Activation 81 (one row each for groups 30, 40 and 47); Group 30, activations 89 and 124; Group 40, activations 89, 99 and 119; Group 47, activations 89 and 122.]

Figure 3. Example noisy images and saliency maps for activations in the penultimate dense layer for the three main failure modes identified for noisy 5s. The two leftmost images were chosen as the most clear saliency maps with respect to digits. The two rightmost were selected based on unclear/noisy saliency maps. All saliency maps are from images classified as members of the respective failure mode group.

Group 30
Neuron   24*    33*    81*    89    124
%Blank   36.1%  26.2%  60.7%  0%    8.2%

Group 40
Neuron   24*    81*    89    99    119
%Blank   82.2%  91.8%  0%    4.1%  17.9%

Group 47
Neuron   24*    49*    81*    89    122
%Blank   70.6%  54.3%  84.3%  0.5%  3.6%

Table 2. The percentage of blank (all zero) saliency maps for each of the 5 neurons with the highest absolute KS-values (compared to group 50) in the Dense-128 layer. The starred neuron numbers are the neurons qualitatively identified as encoding digit 5. We observe that the neurons encoding digit 5 have predominantly larger percentages of blank saliency maps.

[Figure 4: distributions of Softmax output against Ground Truth for the failure modes.]

Figure 4. The failure modes for a ground truth of 5. We see the distributions of predictions for the three failure modes: only group 30 attaches any significant likelihood to the digit 5 at all, while all three favor 8. For group 40, the digits 2 and 3 are also commonly suggested, while this happens somewhat more rarely in groups 30 and 47.

Using FiFa on a CNN-based MNIST digit classifier that had to cope with severely corrupted MNIST images, we were able to find 39 distinct failure modes based on activations in the antepenultimate and penultimate layers of our CNN model. When inspecting the digit 5 in particular, we found that the three identified failure modes could be distinguished from the well-behaved parts of input space by specific activations that seemed to code for features corresponding closely to the kinds of misclassifications that were observed.

In addition to inspecting examples, we explored the addition of a correction layer to the CNN model. The failure modes act as seeds for training a classifier. The classifier can assign new data to a known failure mode, so that the correction layer can adjust for known behaviour of that failure mode. For regression models, our suggestion would be to treat the prediction error as bias, and subtract the mean prediction error for the identified failure mode from the model prediction. In the CNN on corrupted MNIST example we use to illustrate the methodology, we impose the ground truth digit from which the identified failure mode emerged as a replacement prediction. By doing this, we could observe up to a 19.33%pt improvement in prediction accuracy on corrupted data while accidentally including only 0.32% of uncorrupted observations in the correction groups.
The percentage of clean data is in good accordance with that in the failure mode groups: 0.21%.

FiFa is generically applicable. While developing the method we have used it to analyze an energy-based regression model used to predict temperatures in electric arc steel furnaces. In that application, we found failure modes that consistently over-predicted and under-predicted by a consistent number of degrees Celsius. Adjusting the regression by the mean prediction error of the failure group provided significant improvement in the energy model, and a qualitative analysis of the failure modes uncovered metallurgically important observations about material composition related to high prediction error.

The FiFa method picks out high prediction error regions from the input space of an arbitrary predictive process, and classifies failure modes that are internally similar but that have significant separation either in the predictive behaviour of the process or in the distance measure of input space. Having identified failure modes, we can view them as witnesses for misbehaviour in different ways, and produce correspondingly different developments of the predictive process. On the one hand, a failure mode witnesses a region of input space with local bias in the predictive process, and we can correct specifically for that bias by classifying new data as belonging to that failure mode (or not) and correcting predictions for the failure mode members. On the other hand, the failure mode is a witness for some coherent collection of predictive failures. By inspecting features of input space that distinguish these from other parts of input space we can gain insights about types of failure that could be handled by adjusting the design of the predictive process itself.

References
Blondel, Vincent D, Guillaume, Jean-Loup, Lambiotte, Re-naud, and Lefebvre, Etienne. Fast unfolding of communi-ties in large networks.
Journal of Statistical Mechanics:Theory and Experiment , 2008(10):P10008, 2008.Bowman, G.R., Huang, X., Yao, Y., Sun, J., Carlsson, G.,Guibas, L.J., and Pande, V.S. Structural insight into rnahairpin folding intermediates.
JACS Communications , pp. ibres of Failure
American Mathemat-ical Society , 46(2):255–308, 2009.Carlsson, Gunnar. The shape of biomedical data.
CurrentOpinion in Systems Biology , 1, 2017.Chen, Sen, Xue, Minhui, Fan, Lingling, Hao, Shuang, Xu,Lihua, Zhu, Haojin, and Li, Bo. Automated poisoningattacks and defenses in malware detection systems: Anadversarial machine learning approach.
Computers andSecurity , 73:326––344, 2018.Cisse, Moustapha, Adi, Yossi, Neverova, Natalia, andKeshet, Joseph. Houdini: Fooling deep structured visualand speech recognition models with adversarial examples.In
Advances in Neural Information Processing Systems30 , 2017a.Cisse, Moustapha, Bojanowski, Piotr, Grave, Edouard,Dauphin, Yann, and Usunier, Nicolas. Parseval networks:Improving robustness to adversarial examples. In
Pro-ceedings of the 34th International Conference on Ma-chine Learning , 2017b.C´amara, Pablo G. Topological methods for genomics:Present and future direction.
Current Opinion in Sys-tems Biology , 1:95–101, 2017.Dodge, Samuel and Karam, Lina. Understanding how im-age quality affects deep neural networks. In . IEEE, June 2016.Duponchel, Ludovic. Exploring hyperspectral imaging datasets with topological data analysis.
Analytica ChimicaActa , 1000:123–131, 2018a.Duponchel, Ludovic. When remote sensing meets topologi-cal data analysis.
Journal of Spectral Imaging , 2018b.Edwards, Anthony WF and Cavalli-Sforza, L Luka. Amethod for cluster analysis.
Biometrics , pp. 362–375,1965.Fawzi, Alhussein, Moosavi-Dezfooli, Seyed-Mohsen, andFrossard, Pascal. Robustness of classifiers: from adversar-ial to random noise. In
Advances in Neural InformationProcessing Systems 29 , 2016.Gallego-Ortiz, Cristina and Martel, Anne L. Inter-preting extracted rules from ensemble of trees: Ap-plication to computer-aided diagnosis of breast MRI. arXiv:1606.08288 [cs, stat] , June 2016. URL http://arxiv.org/abs/1606.08288 . WHI 2016 (ICMLWorkshop). Gunning, David. Explainable artificial intelligence (XAI).DARPA Broad Agency Announcement DARPA-BAA-16-53, 2016.Handler, Abram, Blodgett, Su Lin, and O’Connor, Bren-dan. Visualizing textual models with in-text and word-as-pixel highlighting. arXiv:1606.06352 [cs, stat] ,June 2016. URL http://arxiv.org/abs/1606.06352 . WHI 2016 (ICML Workshop).Hara, Satoshi and Maehara, Takanori. Finding AlternateFeatures in Lasso. arXiv:1611.05940 [stat] , Novem-ber 2016. URL http://arxiv.org/abs/1611.05940 . NIPS 2016 InterpretML.Hatcher, Allen.
Algebraic Topology. Cambridge University Press, 2002.
Hayete, Boris, Valko, Matthew, Greenfield, Alex, and Yan, Raymond. MDL-motivated compression of GLM ensembles increases interpretability and retains predictive power. arXiv:1611.06800 [stat], November 2016. URL http://arxiv.org/abs/1611.06800. NIPS 2016 InterpretML.
Hechtlinger, Yotam. Interpretation of Prediction Models Using the Input Gradient. arXiv:1611.07634 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07634. NIPS 2016 InterpretML.
Hein, Matthias and Andriushchenko, Maksym. Formal guarantees on the robustness of a classifier against adversarial manipulation. In Advances in Neural Information Processing Systems 30, 2017.
Hinks, T. S., Brown, T., Lau, L. C., Rupani, H., Barber, C., Elliott, S., Ward, J. A., Ono, J., Ohta, S., Izuhara, K., Djukanović, R., Kurukulaaratchy, R. J., Chauhan, A., and Howarth, P. Multidimensional endotyping in patients with severe asthma reveals inflammatory heterogeneity in matrix metalloproteinases and chitinase 3-like protein 1. J. Allergy Clin. Immunol., 138(1), 2016.
Huang, Ruitong, Xu, Bing, Schuurmans, Dale, and Szepesvári, Csaba. Learning with a strong adversary. November 2015.
Kindermans, Pieter-Jan, Schütt, Kristof, Müller, Klaus-Robert, and Dähne, Sven. Investigating the influence of noise and distractors on the interpretation of neural networks. arXiv:1611.07270 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07270. NIPS 2016 InterpretML.
Krakovna, Viktoriya and Doshi-Velez, Finale. Increasing the Interpretability of Recurrent Neural Networks Using Hidden Markov Models. arXiv:1611.05934 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.05934. NIPS 2016 InterpretML.
Krause, Josua, Perer, Adam, and Bertini, Enrico. Using Visual Analytics to Interpret Predictive Machine Learning Models. arXiv:1606.05685 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.05685. WHI 2016 (ICML Workshop).
Lee, Yongjin, Barthel, Senja D., Dlotko, Pawel, Moosavi, S. Mohamad, Hess, Kathryn, and Smit, Berend. Quantifying similarity of pore-geometry in nanoporous materials. Nature Communications, 2017.
Li, Li, Cheng, Wei-Yi, Glicksberg, Benjamin S., Gottesman, Omri, Tamler, Ronald, Chen, Rong, Bottinger, Erwin P., and Dudley, Joel T. Identification of type 2 diabetes subgroups through topological analysis of patient similarity. Science Translational Medicine, 7(311), 2015.
Lum, Pek Y., Singh, Gurjeet, Lehman, Alan, Ishkanov, Tigran, Vejdemo-Johansson, Mikael, Alagappan, Muthu, Carlsson, John, and Carlsson, Gunnar. Extracting insights from the shape of complex data using topology. Scientific Reports, 3, February 2013. ISSN 2045-2322. doi: 10.1038/srep01236.
Lundberg, Scott and Lee, Su-In. An unexpected unity among methods for interpreting model predictions. arXiv:1611.07478 [cs], November 2016. URL http://arxiv.org/abs/1611.07478. NIPS 2016 InterpretML.
Moosavi-Dezfooli, Seyed-Mohsen, Fawzi, Alhussein, and Frossard, Pascal. DeepFool: a simple and accurate method to fool deep neural networks. November 2016.
Murtagh, Fionn and Contreras, Pedro. Algorithms for hierarchical clustering: an overview.
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(1):86–97, 2012.
Müllner, Daniel and Babu, Aravindakshan. Python Mapper: An open-source toolchain for data exploration, analysis and visualization, 2013. URL http://danifold.net/mapper.
Nicolau, Monica, Levine, Arnold J., and Carlsson, Gunnar. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. PNAS, 108:7265–7270, 2011.
Noh, Hyeonwoo, You, Tackgeun, Mun, Jonghwan, and Han, Bohyung. Regularizing deep neural networks by noise: Its interpretation and optimization. In Advances in Neural Information Processing Systems 30, 2017.
Phillips, Richard L., Chang, Kyu Hyun, and Friedler, Sorelle A. Interpretable Active Learning. arXiv:1708.00049 [cs, stat], July 2017. URL http://arxiv.org/abs/1708.00049. WHI 2017 (ICML Workshop).
Reing, Kyle, Kale, David C., Steeg, Greg Ver, and Galstyan, Aram. Toward Interpretable Topic Discovery via Anchored Correlation Explanation. arXiv:1606.07043 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.07043. ICML 2016 Workshop.
Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. Model-Agnostic Interpretability of Machine Learning. arXiv:1606.05386 [cs, stat], June 2016a. URL http://arxiv.org/abs/1606.05386. WHI 2016 (ICML Workshop).
Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. Nothing Else Matters: Model-Agnostic Explanations By Identifying Prediction Invariance. arXiv:1611.05817 [cs, stat], November 2016b. URL http://arxiv.org/abs/1611.05817. NIPS 2016 InterpretML.
Ribeiro, Marco Tulio, Singh, Sameer, and Guestrin, Carlos. "Why Should I Trust You?": Explaining the Predictions of Any Classifier. arXiv:1602.04938 [cs, stat], February 2016c. URL http://arxiv.org/abs/1602.04938.
Romano, David, Nicolau, Monica, Quintin, Eve-Marie, Mazaika, Paul K., Lightbody, Amy A., Hazlett, Heather Cody, Piven, Joseph, Carlsson, Gunnar, and Reiss, Allan L. Topological methods reveal high and low functioning neuro-phenotypes within fragile X syndrome.
Human Brain Mapping, 35:4904–4915, 2014.
Samek, Wojciech, Montavon, Grégoire, Binder, Alexander, Lapuschkin, Sebastian, and Müller, Klaus-Robert. Interpreting the Predictions of Complex ML Models by Layer-wise Relevance Propagation. arXiv:1611.08191 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.08191. NIPS 2016 InterpretML.
Saul, Nathaniel and van Veen, Hendrik Jacob. MLWave/kepler-mapper: 186f (version 1.0.1), 2017. URL http://doi.org/10.5281/zenodo.1054444.
Savić, Aleksandar, Toth, Gergely, and Duponchel, Ludovic. Topological data analysis (TDA) applied to reveal pedogenetic principles of European topsoil system. Science of the Total Environment, 586(2):1091–1100, 2017.
Schneider, David S., Torres, Brenda Y., Oliveira, Jose Henrique M., Tate, Ann Thomas, Rath, Poonam, and Cumnock, Katherine. Tracking resilience to infections by mapping disease space. PLOS Biology, 14(6), 2016.
Selvaraju, Ramprasaath R., Das, Abhishek, Vedantam, Ramakrishna, Cogswell, Michael, Parikh, Devi, and Batra, Dhruv. Grad-CAM: Why did you say that? arXiv:1611.07450 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07450. NIPS 2016 InterpretML.
Simonyan, Karen, Vedaldi, Andrea, and Zisserman, Andrew. Deep inside convolutional networks: Visualising image classification models and saliency maps. arXiv preprint arXiv:1312.6034, 2013.
Singh, Gurjeet, Mémoli, Facundo, and Carlsson, Gunnar. Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition. In SPBG, pp. 91–100, 2007. URL http://comptop.stanford.edu/preprints/mapperPBG.pdf.
Singh, Sameer, Ribeiro, Marco Tulio, and Guestrin, Carlos. Programs as Black-Box Explanations. arXiv:1611.07579 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07579. NIPS 2016 InterpretML.
Smilkov, Daniel, Thorat, Nikhil, Nicholson, Charles, Reif, Emily, Viégas, Fernanda B., and Wattenberg, Martin. Embedding Projector: Interactive Visualization and Interpretation of Embeddings. arXiv:1611.05469 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.05469. NIPS 2016 InterpretML.
Tansey, Wesley, Thomason, Jesse, and Scott, James G. Interpretable Low-Dimensional Regression via Data-Adaptive Smoothing. arXiv:1708.01947 [stat], August 2017. URL http://arxiv.org/abs/1708.01947. WHI 2017 (ICML Workshop).
Thiagarajan, Jayaraman J., Kailkhura, Bhavya, Sattigeri, Prasanna, and Ramamurthy, Karthikeyan Natesan. TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning. arXiv:1611.07429 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07429. NIPS 2016 InterpretML.
Tosi, Alessandra, Vellido, Alfredo, and Alvarez, Mauricio (eds.).
Transparent and Interpretable Machine Learning in Safety Critical Environments, NIPS 2017 Workshop, 2017.
Tramèr, Florian, Kurakin, Alexey, Papernot, Nicolas, Goodfellow, Ian, Boneh, Dan, and McDaniel, Patrick. Ensemble adversarial training: Attacks and defenses. International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=rkZvSe-RZ. Accepted as poster.
Varshney, Kush, Weller, Adrian, Kim, Been, and Malioutov, Dmitry (eds.). Human Interpretability in Machine Learning, ICML 2017 Workshop, 2017.
Vidovic, Marina M.-C., Görnitz, Nico, Müller, Klaus-Robert, and Kloft, Marius. Feature Importance Measure for Non-linear Learning Algorithms. arXiv:1611.07567 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07567. NIPS 2016 InterpretML.
Whitmore, Leanne S., George, Anthe, and Hudson, Corey M. Mapping chemical performance on molecular structures using locally interpretable explanations. arXiv:1611.07443 [physics, stat], November 2016. URL http://arxiv.org/abs/1611.07443. NIPS 2016 InterpretML.
Wilson, Andrew Gordon, Kim, Been, and Herlands, William (eds.). Interpretable Machine Learning for Complex Systems, NIPS 2016 Workshop, 2016.
Wilson, Andrew Gordon, Yosinski, Jason, Simard, Patrice, Caruana, Rich, and Herlands, William (eds.). Interpretable ML Symposium, NIPS 2017 Workshop, 2017.
Wisdom, Scott, Powers, Thomas, Pitton, James, and Atlas, Les. Interpretable Recurrent Neural Networks Using Sequential Sparse Recovery. arXiv:1611.07252 [cs, stat], November 2016. URL http://arxiv.org/abs/1611.07252. NIPS 2016 InterpretML.
Yuan, Xiaoyong, He, Pan, Zhu, Qile, Bhat, Rajendra Rana, and Li, Xiaolin. Adversarial examples: Attacks and defenses for deep learning. January 2018.
Zhou, Yiren, Song, Sibo, and Cheung, Ngai-Man. On classification of distorted images with deep convolutional neural networks. arXiv, 2017.
Zrihem, Nir Ben, Zahavy, Tom, and Mannor, Shie. Visualizing Dynamics: from t-SNE to SEMI-MDPs. arXiv:1606.07112 [cs, stat], June 2016. URL http://arxiv.org/abs/1606.07112.