Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Grégoire Montavon is active.

Publication


Featured research published by Grégoire Montavon.


Journal of Chemical Theory and Computation | 2013

Assessment and Validation of Machine Learning Methods for Predicting Molecular Atomization Energies

Katja Hansen; Grégoire Montavon; Franziska Biegler; Siamac Fazli; Matthias Rupp; Matthias Scheffler; O. Anatole von Lilienfeld; Alexandre Tkatchenko; Klaus-Robert Müller

The accurate and reliable prediction of properties of molecules typically requires computationally intensive quantum-chemical calculations. Recently, machine learning techniques applied to ab initio calculations have been proposed as an efficient approach for describing the energies of molecules in their given ground-state structure throughout chemical compound space (Rupp et al. Phys. Rev. Lett. 2012, 108, 058301). In this paper we outline a number of established machine learning techniques and investigate the influence of the molecular representation on the methods' performance. The best methods achieve prediction errors of 3 kcal/mol for the atomization energies of a wide variety of molecules. Rationales for this performance improvement are given, together with pitfalls and challenges encountered when applying machine learning approaches to the prediction of quantum-mechanical observables.
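The role of the molecular representation can be made concrete with a small sketch. Below is a minimal, hypothetical pipeline assuming the Coulomb-matrix eigenspectrum representation of the cited Rupp et al. (2012) paper and kernel ridge regression with a Laplacian kernel, one of the model classes assessed in this line of work; the kernel width `sigma` and regularizer `lam` are illustrative placeholders, not the paper's tuned values.

```python
# Minimal sketch (not the paper's exact pipeline): kernel ridge regression
# on a Coulomb-matrix descriptor in the spirit of Rupp et al. (2012).
import numpy as np

def coulomb_matrix(Z, R, n_max):
    """Coulomb matrix for nuclear charges Z ([n]) and positions R ([n, 3]),
    zero-padded to n_max atoms."""
    n = len(Z)
    C = np.zeros((n_max, n_max))
    for i in range(n):
        for j in range(n):
            if i == j:
                C[i, j] = 0.5 * Z[i] ** 2.4  # diagonal: atomic self-interaction term
            else:
                C[i, j] = Z[i] * Z[j] / np.linalg.norm(R[i] - R[j])
    return C

def descriptor(Z, R, n_max):
    """Sorted Coulomb-matrix eigenvalues: a permutation-invariant representation."""
    return np.sort(np.linalg.eigvalsh(coulomb_matrix(Z, R, n_max)))[::-1]

def krr_fit(X, y, sigma=100.0, lam=1e-8):
    """Fit kernel ridge regression with a Laplacian kernel; returns dual weights."""
    K = np.exp(-np.abs(X[:, None, :] - X[None, :, :]).sum(-1) / sigma)
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_test, sigma=100.0):
    K = np.exp(-np.abs(X_test[:, None, :] - X_train[None, :, :]).sum(-1) / sigma)
    return K @ alpha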


PLOS ONE | 2015

On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation

Sebastian Bach; Alexander Binder; Grégoire Montavon; Frederick Klauschen; Klaus-Robert Müller; Wojciech Samek

Understanding and interpreting the classification decisions of automated image classification systems is of high value in many applications, as it allows one to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods solve a plethora of tasks very successfully, they mostly have the disadvantage of acting as a black box, providing no information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that makes it possible to visualize the contributions of single pixels to predictions for kernel-based classifiers over bag-of-words features and for multilayered neural networks. These pixel contributions can be visualized as heatmaps and provided to a human expert, who can not only intuitively verify the validity of the classification decision, but also focus further analysis on regions of potential interest. We evaluate our method for classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, the MNIST handwritten digits data set, and the pre-trained ImageNet model available as part of the Caffe open source package.
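To make the pixel-wise decomposition idea concrete, here is a minimal sketch of one relevance backpropagation step through a fully connected layer, using the epsilon-stabilized propagation rule associated with LRP. It is a simplified single-layer illustration (bias terms omitted), not the paper's full procedure.

```python
# Hedged sketch of one LRP backward step through a dense layer: a holds the
# layer's input activations, W its weights ([in, out]), and R_out the
# relevance arriving from the layer above; eps stabilizes small denominators.
import numpy as np

def lrp_dense(a, W, R_out, eps=1e-6):
    z = a @ W                                  # pre-activations z_k = sum_j a_j w_jk
    z = z + eps * np.where(z >= 0, 1.0, -1.0)  # epsilon rule: push z away from zero
    s = R_out / z                              # relevance per unit of pre-activation
    return a * (s @ W.T)                       # R_j = a_j * sum_k w_jk * s_k
```

Applying such a step layer by layer, from the output back to the input, yields the pixel-wise heatmaps described in the abstract.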


New Journal of Physics | 2013

Machine learning of molecular electronic properties in chemical compound space

Grégoire Montavon; Matthias Rupp; Vivekanand V. Gobre; Alvaro Vazquez-Mayagoitia; Katja Hansen; Alexandre Tkatchenko; Klaus-Robert Müller; O. Anatole von Lilienfeld

The combination of modern scientific computing with electronic structure theory can lead to an unprecedented amount of data amenable to intelligent data analysis for the identification of meaningful, novel and predictive structure–property relationships. Such relationships enable high-throughput screening for relevant properties in an exponentially growing pool of virtual compounds that are synthetically accessible. Here, we present a machine learning model, trained on a database of ab initio calculation results for thousands of organic molecules, that simultaneously predicts multiple electronic ground- and excited-state properties. The properties include atomization energy, polarizability, frontier orbital eigenvalues, ionization potential, electron affinity and excitation energies. The machine learning model is based on a deep multi-task artificial neural network, exploiting the underlying correlations between various molecular properties. The input is identical to ab initio methods, i.e. nuclear charges and Cartesian coordinates of all atoms. For small organic molecules, the accuracy of such a “quantum machine” is similar, and sometimes superior, to modern quantum-chemical methods, at negligible computational cost.
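As a rough illustration of the multi-task idea, the sketch below wires a shared feed-forward body to one linear output per property, so all tasks learn from the same hidden representation. The layer widths, activation function, and descriptor size are assumptions for illustration, not the paper's architecture.

```python
# Minimal multi-task MLP sketch (illustrative, not the paper's network).
import numpy as np

rng = np.random.default_rng(0)

def init(sizes):
    """Random weights for a fully connected net with layer widths `sizes`."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Shared tanh hidden layers; the final linear layer emits one value per
    property (atomization energy, polarizability, ...), so all tasks share
    the intermediate representation and its learned correlations."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# hypothetical sizes: a 400-dim molecular descriptor, two hidden layers,
# and 15 jointly predicted electronic properties
params = init([400, 800, 800, 15])
```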


IEEE Transactions on Neural Networks | 2017

Evaluating the Visualization of What a Deep Neural Network Has Learned

Wojciech Samek; Alexander Binder; Grégoire Montavon; Sebastian Lapuschkin; Klaus-Robert Müller

Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the “importance” of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
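The region-perturbation protocol can be sketched in a few lines: repeatedly occlude the image region ranked most relevant by the heatmap and track the decline of the classifier score; a faster decline indicates a heatmap that ranked truly important regions first. The patch size, noise fill, and step count below are illustrative assumptions, `predict` stands for any scalar-score classifier, and a single-channel image is assumed.

```python
# Hedged sketch of heatmap evaluation by region perturbation.
import numpy as np

def perturbation_curve(image, heatmap, predict, patch=9, steps=50, rng=None):
    rng = rng or np.random.default_rng(0)
    img = image.copy()
    h, w = heatmap.shape
    # rank non-overlapping patches by the total relevance they contain
    centers = [(i, j) for i in range(0, h - patch, patch)
                      for j in range(0, w - patch, patch)]
    centers.sort(key=lambda c: -heatmap[c[0]:c[0]+patch, c[1]:c[1]+patch].sum())
    scores = [predict(img)]
    for (i, j) in centers[:steps]:
        img[i:i+patch, j:j+patch] = rng.uniform(size=(patch, patch))  # occlude
        scores.append(predict(img))
    return np.array(scores)  # the area over this curve compares heatmap methods
```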


Pattern Recognition | 2017

Explaining nonlinear classification decisions with deep Taylor decomposition

Grégoire Montavon; Sebastian Lapuschkin; Alexander Binder; Wojciech Samek; Klaus-Robert Müller

Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems such as image recognition. Although these methods perform impressively well, they have a significant disadvantage, the lack of transparency, limiting the interpretability of the solution and thus the scope of application in practice. Especially DNNs act as black boxes due to their multilayer nonlinear structure. In this paper we introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network classification decision into contributions of its input elements. Although our focus is on image classification, the method is applicable to a broad set of input data, learning tasks and network architectures. Our method, called deep Taylor decomposition, efficiently utilizes the structure of the network by backpropagating the explanations from the output to the input layer. We evaluate the proposed method empirically on the MNIST and ILSVRC data sets.

Highlights

A novel method to explain nonlinear classification decisions in terms of input variables is introduced.
The method is based on Taylor expansions and decomposes the output of a deep neural network in terms of input variables.
The resulting deep Taylor decomposition can be applied directly to existing neural networks without retraining.
The method is tested on two large-scale neural networks for image classification: BVLC CaffeNet and GoogLeNet.
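For ReLU networks, the deep Taylor framework leads to simple propagation rules; the sketch below shows the z+ rule for one fully connected layer, where only positive-weight contributions redistribute relevance. It is a single-layer illustration (bias terms omitted), not the complete decomposition.

```python
# Hedged sketch of the z+ rule for a dense ReLU layer.
import numpy as np

def ztaylor_dense(a, W, R_out):
    Wp = np.maximum(W, 0.0)   # keep only positive weights
    z = a @ Wp + 1e-9         # positive pre-activations (small term avoids /0)
    s = R_out / z
    return a * (s @ Wp.T)     # R_j = a_j * sum_k w_jk^+ * R_k / z_k
```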


arXiv: Machine Learning | 2012

Deep Boltzmann Machines and the Centering Trick

Grégoire Montavon; Klaus-Robert Müller

Deep Boltzmann machines are in theory capable of learning efficient representations of seemingly complex data. Designing an algorithm that effectively learns the data representation can be subject to multiple difficulties. In this chapter, we present the “centering trick” that consists of rewriting the energy of the system as a function of centered states. The centering trick improves the conditioning of the underlying optimization problem and makes learning more stable, leading to models with better generative and discriminative properties.
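A minimal sketch of what "centering" means for a single pair of layers is given below, assuming the common form in which offsets are subtracted inside the interaction term; the offsets are typically initialized to the mean activations of each layer, and the exact treatment of the bias terms in the paper may differ from this simplification.

```python
# Hedged sketch of a centered energy for one visible/hidden layer pair,
# assuming E(x, h) = -(x - beta)^T W (h - gamma) - b^T x - c^T h.
import numpy as np

def centered_energy(x, h, W, b, c, beta, gamma):
    """States enter the interaction term only relative to their offsets
    beta and gamma, which improves the conditioning of learning."""
    return -(x - beta) @ W @ (h - gamma) - b @ x - c @ h
```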


Digital Signal Processing | 2018

Methods for Interpreting and Understanding Deep Neural Networks

Grégoire Montavon; Wojciech Samek; Klaus-Robert Müller

This paper provides an entry point to the problem of interpreting a deep neural network model and explaining its predictions. It is based on a tutorial given at ICASSP 2017. As a tutorial paper, the set of methods covered here is not exhaustive, but sufficiently representative to discuss a number of questions in interpretability, technical challenges, and possible applications. The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which we provide theory, recommendations, and tricks to make the most efficient use of it on real data.


IEEE Signal Processing Magazine | 2013

Analyzing Local Structure in Kernel-Based Learning: Explanation, Complexity, and Reliability Assessment

Grégoire Montavon; Mikio L. Braun; Tammo Krueger; Klaus-Robert Müller

Over the last decade, nonlinear kernel-based learning methods have been widely used in the sciences and in industry for solving, e.g., classification, regression, and ranking problems. While their users are more than happy with the performance of this powerful technology, there is an emerging need to additionally gain a better understanding of both the learning machine and the data analysis problem to be solved. Opening the nonlinear black box, however, is a notoriously difficult challenge. In this review, we report on a set of recent methods that can be universally used to make kernel methods more transparent. In particular, we discuss relevant dimension estimation (RDE), which allows one to assess the underlying complexity and noise structure of a learning problem and thus to distinguish between scenarios of high or low noise and high or low complexity. Moreover, we introduce a novel local technique based on RDE for quantifying the reliability of the learned predictions. Finally, we report on techniques that can explain individual nonlinear predictions. In this manner, our novel methods not only help to gain further knowledge about the nonlinear signal processing problem itself, but also broaden the general usefulness of kernel methods in practical signal processing applications.
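The RDE idea can be sketched roughly as follows: expand the labels in the kernel's eigenbasis and estimate how many leading components carry signal before the coefficients fall to a noise floor. The two-variance likelihood criterion below is a simplified stand-in for the published estimator, not its exact form.

```python
# Hedged sketch of relevant dimension estimation (simplified criterion).
import numpy as np

def relevant_dimension(K, y):
    """K: centered kernel matrix [n, n]; y: labels [n]."""
    eigvals, U = np.linalg.eigh(K)
    order = np.argsort(eigvals)[::-1]      # components sorted by eigenvalue
    s = (U[:, order].T @ y) ** 2           # label energy per kernel component
    n = len(y)
    best_d, best_ll = 1, -np.inf
    for d in range(1, n):                  # model: signal in first d, noise after
        v1 = s[:d].mean() + 1e-12
        v2 = s[d:].mean() + 1e-12
        ll = -0.5 * (d * np.log(v1) + (n - d) * np.log(v2))
        if ll > best_ll:                   # pick the split with the best fit
            best_d, best_ll = d, ll
    return best_d
```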


Archive | 2016

Layer-Wise Relevance Propagation for Deep Neural Network Architectures

Alexander Binder; Sebastian Bach; Grégoire Montavon; Klaus-Robert Müller; Wojciech Samek

We present the application of layer-wise relevance propagation to several deep neural networks, such as the BVLC reference net and GoogLeNet, trained on the ImageNet and MIT Places datasets. Layer-wise relevance propagation is a method to compute scores for image pixels and image regions that quantify the impact of a particular image region on the classifier's prediction for one particular test image. We demonstrate the impact of different parameter settings on the resulting explanation.


PLOS ONE | 2017

What is relevant in a text document?: An interpretable machine learning approach

Leila Arras; Franziska Horn; Grégoire Montavon; Klaus-Robert Müller; Wojciech Samek

Text documents can be described by a number of abstract concepts such as semantic category, writing style, or sentiment. Machine learning (ML) models have been trained to automatically map documents to these abstract concepts, making it possible to annotate text collections far larger than a human could process in a lifetime. Beyond predicting a text's category accurately, it is also highly desirable to understand how and why the categorization takes place. In this paper, we demonstrate that such understanding can be achieved by tracing the classification decision back to individual words using layer-wise relevance propagation (LRP), a recently developed technique for explaining the predictions of complex non-linear classifiers. We train two word-based ML models, a convolutional neural network (CNN) and a bag-of-words SVM classifier, on a topic categorization task, and adapt the LRP method to decompose the predictions of these models onto words. The resulting scores indicate how much individual words contribute to the overall classification decision. This enables one to distill relevant information from text documents without an explicit semantic information extraction step. We further use the word-wise relevance scores to generate novel vector-based document representations which capture semantic information. Based on these document vectors, we introduce a measure of model explanatory power and show that, although the SVM and CNN models perform similarly in terms of classification accuracy, the latter exhibits a higher level of explainability, which makes it more comprehensible for humans and potentially more useful for other applications.
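For the linear bag-of-words case, the decomposition onto words is exact and needs no propagation machinery: the classifier score is a sum of per-word terms. The sketch below shows this base case (the CNN requires the LRP adaptation described in the paper); function and variable names are illustrative.

```python
# Hedged illustration: a linear bag-of-words classifier's score decomposes
# exactly onto words, score = sum_i w_i x_i + b, so each word's relevance
# is simply its weight times its count.
import numpy as np

def word_relevances(counts, weights, vocab):
    """counts/weights: arrays over the vocabulary; returns {word: contribution}."""
    contrib = counts * weights
    return {w: r for w, r in zip(vocab, contrib) if r != 0.0}
```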

Collaboration


Dive into Grégoire Montavon's collaborations.

Top Co-Authors

Klaus-Robert Müller, Technical University of Berlin
Franziska Horn, Technical University of Berlin
Mikio L. Braun, Technical University of Berlin
Katja Hansen, Technical University of Berlin
Franziska Biegler, University of Western Ontario