Alexander Binder
Singapore University of Technology and Design
Publications
Featured research published by Alexander Binder.
PLOS ONE | 2015
Sebastian Bach; Alexander Binder; Grégoire Montavon; Frederick Klauschen; Klaus-Robert Müller; Wojciech Samek
Understanding and interpreting the classification decisions of automated image classification systems is of high value in many applications, as it allows one to verify the reasoning of the system and provides additional information to the human expert. Although machine learning methods solve a plethora of tasks very successfully, in most cases they have the disadvantage of acting as a black box, providing no information about what made them arrive at a particular decision. This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers. We introduce a methodology that makes it possible to visualize the contributions of single pixels to the predictions of kernel-based classifiers over Bag-of-Words features and of multilayered neural networks. These pixel contributions can be visualized as heatmaps and provided to a human expert, who can not only intuitively verify the validity of the classification decision but also focus further analysis on regions of potential interest. We evaluate our method on classifiers trained on PASCAL VOC 2009 images, synthetic image data containing geometric shapes, and the MNIST handwritten digits data set, as well as on the pre-trained ImageNet model available as part of the Caffe open source package.
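The heart of the pixel-wise decomposition is a backward relevance pass through the trained model. As a minimal sketch, the epsilon-stabilized relevance rule for a single fully connected layer in NumPy; the variable names are illustrative, not the paper's notation:

```python
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """Redistribute the relevance R_out of a layer's outputs onto its inputs
    (illustrative sketch, dense layer only).

    a     -- input activations, shape (d_in,)
    W     -- weights, shape (d_in, d_out)
    b     -- biases, shape (d_out,)
    R_out -- relevance of the output neurons, shape (d_out,)
    """
    z = a @ W + b                                  # pre-activations z_j
    z = z + eps * np.where(z >= 0, 1.0, -1.0)      # stabilizer keeps |z_j| > 0
    s = R_out / z                                  # per-output relevance ratio
    return a * (W @ s)                             # R_i = a_i * sum_j w_ij * R_j / z_j
```

Applying such a rule layer by layer from the output back to the input yields the pixel-level heatmap; up to the stabilizer term, the total relevance is conserved across layers.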
IEEE Transactions on Neural Networks and Learning Systems | 2017
Wojciech Samek; Alexander Binder; Grégoire Montavon; Sebastian Lapuschkin; Klaus-Robert Müller
Deep neural networks (DNNs) have demonstrated impressive performance in complex machine learning tasks such as image classification or speech recognition. However, due to their multilayer nonlinear structure, they are not transparent, i.e., it is hard to grasp what makes them arrive at a particular classification or recognition decision, given a new unseen data sample. Recently, several approaches have been proposed enabling one to understand and interpret the reasoning embodied in a DNN for a single test image. These methods quantify the “importance” of individual pixels with respect to the classification decision and allow a visualization in terms of a heatmap in pixel/input space. While the usefulness of heatmaps can be judged subjectively by a human, an objective quality measure is missing. In this paper, we present a general methodology based on region perturbation for evaluating ordered collections of pixels such as heatmaps. We compare heatmaps computed by three different methods on the SUN397, ILSVRC2012, and MIT Places data sets. Our main result is that the recently proposed layer-wise relevance propagation algorithm qualitatively and quantitatively provides a better explanation of what made a DNN arrive at a particular classification decision than the sensitivity-based approach or the deconvolution method. We provide theoretical arguments to explain this result and discuss its practical implications. Finally, we investigate the use of heatmaps for unsupervised assessment of the neural network performance.
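The region perturbation idea can be sketched as follows. This is a simplified illustration that assumes a hypothetical `model_score` helper returning the target-class score, square patches, and image values in [0, 1]; the authors' protocol perturbs non-overlapping regions in a stricter order:

```python
import numpy as np

def aopc(model_score, image, heatmap, num_regions=100, patch=9, seed=0):
    """Area over the perturbation curve: perturb the locations the heatmap
    ranks most relevant first and accumulate the drop in the class score."""
    rng = np.random.default_rng(seed)
    img = image.copy()
    base = model_score(img)
    half = patch // 2
    # pixel coordinates sorted by relevance, most relevant first
    order = np.column_stack(np.unravel_index(
        np.argsort(heatmap, axis=None)[::-1], heatmap.shape))
    drops = []
    for y, x in order[:num_regions]:
        y0, y1 = max(0, y - half), min(img.shape[0], y + half + 1)
        x0, x1 = max(0, x - half), min(img.shape[1], x + half + 1)
        img[y0:y1, x0:x1] = rng.uniform(size=img[y0:y1, x0:x1].shape)
        drops.append(base - model_score(img))
    return float(np.mean(drops))   # higher AOPC = more faithful heatmap
```

A heatmap that truly highlights the evidence for the decision makes the score collapse quickly under this procedure, which is what the paper measures.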
Pattern Recognition | 2017
Grégoire Montavon; Sebastian Lapuschkin; Alexander Binder; Wojciech Samek; Klaus-Robert Müller
Nonlinear methods such as Deep Neural Networks (DNNs) are the gold standard for various challenging machine learning problems such as image recognition. Although these methods perform impressively well, they have a significant disadvantage: a lack of transparency, which limits the interpretability of the solution and thus the scope of application in practice. DNNs in particular act as black boxes due to their multilayer nonlinear structure. In this paper we introduce a novel methodology for interpreting generic multilayer neural networks by decomposing the network's classification decision into contributions of its input elements. Although our focus is on image classification, the method is applicable to a broad set of input data, learning tasks and network architectures. Our method, called deep Taylor decomposition, efficiently utilizes the structure of the network by backpropagating the explanations from the output to the input layer. We evaluate the proposed method empirically on the MNIST and ILSVRC data sets. Highlights: A novel method to explain nonlinear classification decisions in terms of input variables is introduced. The method is based on Taylor expansions and decomposes the output of a deep neural network in terms of input variables. The resulting deep Taylor decomposition can be applied directly to existing neural networks without retraining. The method is tested on two large-scale neural networks for image classification: BVLC CaffeNet and GoogleNet.
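For ReLU layers with non-negative input activations, deep Taylor decomposition leads to the so-called z+ redistribution rule, in which only positive weights propagate relevance. A minimal NumPy sketch for one dense layer, with illustrative names:

```python
import numpy as np

def deep_taylor_zplus(a, W, R_out, eps=1e-9):
    """z+ rule for a ReLU layer: a has shape (d_in,) with a >= 0,
    W has shape (d_in, d_out), R_out has shape (d_out,)."""
    Wp = np.maximum(W, 0.0)           # keep only the positive weights
    z = a @ Wp + eps                  # z_j^+ = sum_i a_i * w_ij^+
    s = R_out / z
    return a * (Wp @ s)               # conserves relevance: sum R_in = sum R_out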
Computer Vision and Image Understanding | 2013
Alexander Binder; Wojciech Samek; Klaus-Robert Müller; Motoaki Kawanabe
In this paper we propose a novel biased random sampling strategy for image representation in Bag-of-Words models. We evaluate its impact on the feature properties and the ranking quality for a set of semantic concepts, and show that it improves the performance of classifiers in image annotation tasks and increases the correlation between kernels and labels. As a second contribution, we propose a method called Output Kernel Multi-Task Learning (MTL) to improve ranking performance by transferring information between classes. The main advantages of output kernel MTL are that it permits asymmetric information transfer between tasks and scales to training sets of several thousand images. We give a theoretical interpretation of the method and show that the learned contributions of source tasks to target tasks are semantically consistent. Both strategies are evaluated on the ImageCLEF PhotoAnnotation dataset. Our best visual result, which used the MTL method, was ranked first by mean Average Precision (mAP) among the purely visual submissions in the ImageCLEF 2011 PhotoAnnotation Challenge. Our multi-modal submission achieved the first rank by mAP among all submissions in the same competition.
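The biased sampling idea can be illustrated as follows: instead of placing local features uniformly or on a regular grid, locations are drawn with probability proportional to a per-pixel weight map. The choice of weight map (e.g. a saliency or gradient-magnitude map) is an assumption here, and the function name is hypothetical:

```python
import numpy as np

def biased_sample_keypoints(weight_map, n_points, seed=0):
    """Draw keypoint locations with probability proportional to a
    non-negative per-pixel weight map; returns (n_points, 2) coordinates."""
    rng = np.random.default_rng(seed)
    p = weight_map.ravel().astype(float)
    p /= p.sum()
    idx = rng.choice(p.size, size=n_points, replace=False, p=p)
    ys, xs = np.unravel_index(idx, weight_map.shape)
    return np.stack([ys, xs], axis=1)
```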
International Conference on Indoor Positioning and Indoor Navigation | 2013
Daniel Becker; Jens Einsiedler; Bernd Schäufele; Alexander Binder; Ilja Radusch
Vehicular positioning technologies enable a broad range of applications and services such as navigation systems, driver assistance systems and self-driving vehicles. However, Global Navigation Satellite Systems (GNSS) do not work in enclosed areas such as parking garages. For these scenarios, a wide range of indoor positioning technologies are available, both inside the vehicle (internal) and based on infrastructure (external). Building on our previous work, we use off-the-shelf network video cameras to detect the position of moving vehicles within the parking garage across multiple non-overlapping camera views. To use this system as a positioning source for vehicles, detected positions need to be transmitted to the communication endpoint in the correct vehicle. The key problem is the association of an externally observed position with the endpoint in the corresponding vehicle. State-of-the-art tracking-by-detection techniques can differentiate multiple camera-detected vehicles, but the generated tracks are anonymous and cannot inherently be associated with the corresponding vehicle. To bridge this gap, we present a tracking-by-identification solution that analyzes vehicle movement patterns from multiple vehicle sensor modalities and compares them with camera-detected tracks to identify the track with the best correlation. The presented approach is based on Kalman filters and is suitable for real-time operation. Test results show that a correct and robust association between endpoints and camera-detected tracks is achieved and that occurring identity switches can be resolved.
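A heavily simplified sketch of the association step: match the speed profile reported by a vehicle's own sensors against candidate camera tracks by normalized correlation and pick the best match. The paper's actual pipeline is Kalman-filter-based and runs in real time, so treat this only as an illustration of the matching criterion; the data layout is assumed:

```python
import numpy as np

def associate_endpoint(sensor_speed, camera_tracks):
    """camera_tracks maps track id -> speed profile aligned in time with
    sensor_speed; returns the best-matching track id and all scores."""
    def ncc(u, v):  # normalized cross-correlation at lag zero
        u = (u - u.mean()) / (u.std() + 1e-9)
        v = (v - v.mean()) / (v.std() + 1e-9)
        return float(np.mean(u * v))
    scores = {tid: ncc(np.asarray(sensor_speed), np.asarray(v))
              for tid, v in camera_tracks.items()}
    return max(scores, key=scores.get), scores
```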
PLOS ONE | 2012
Alexander Binder; Shinichi Nakajima; Marius Kloft; Christina Müller; Wojciech Samek; Ulf Brefeld; Klaus-Robert Müller; Motoaki Kawanabe
Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques allow one to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, 1-norm regularized MKL variants are often observed to be outperformed by an unweighted sum kernel. The main contributions of this paper are the following: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks from the application domain of computer vision. We provide insights on the benefits and limits of non-sparse MKL and compare it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. Data sets (kernel matrices) and further information are available at http://doc.ml.tu-berlin.de/image_mkl/ (accessed 25 June 2012).
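Non-sparse (lp-norm regularized) MKL can be sketched as an alternating procedure: fit an SVM on the weighted kernel sum, then update the kernel weights in closed form. The sketch below follows that scheme under simplifying assumptions (binary labels, scikit-learn's precomputed-kernel SVC) and is not the solver used in the paper:

```python
import numpy as np
from sklearn.svm import SVC

def lp_norm_mkl(kernels, y, p=2.0, C=1.0, n_iter=20):
    """kernels: list of (n, n) PSD Gram matrices; y: labels in {-1, +1}.
    Returns the learned kernel weights and the final SVM."""
    M = len(kernels)
    beta = np.full(M, M ** (-1.0 / p))              # uniform start, ||beta||_p = 1
    for _ in range(n_iter):
        K = sum(b * Km for b, Km in zip(beta, kernels))
        svm = SVC(C=C, kernel="precomputed").fit(K, y)
        sv, coef = svm.support_, svm.dual_coef_.ravel()   # alpha_i * y_i
        # squared margin-norm contribution ||w_m||^2 of each kernel
        w2 = np.array([b**2 * coef @ Km[np.ix_(sv, sv)] @ coef
                       for b, Km in zip(beta, kernels)])
        beta = (w2 + 1e-12) ** (1.0 / (p + 1))
        beta /= (beta ** p).sum() ** (1.0 / p)      # renormalize onto the lp-ball
    return beta, svm
```

For p -> 1 the mixture becomes sparse; larger p keeps the weights spread out, which is the non-sparse regime the paper found advantageous on these benchmarks.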
Conference on Image and Video Retrieval | 2009
Motoaki Kawanabe; Shinichi Nakajima; Alexander Binder
In order to achieve good performance in object classification problems, it is necessary to combine information from various image features. Because large-margin classifiers are constructed from similarity measures between samples called kernels, finding appropriate feature combinations boils down to designing good kernels from a set of candidates, for example positive mixtures of predetermined base kernels. There are several ways to determine the mixing weights of multiple kernels: (a) uniform weights, (b) a brute-force search over a validation set and (c) multiple kernel learning (MKL). MKL is theoretically and technically very attractive, because it learns the kernel weights and the classifier simultaneously based on the margin criterion. However, we often observe that the support vector machine (SVM) with the average kernel works at least as well as MKL. In this paper we propose, as an alternative, a two-step approach: first, the kernel weights are determined by optimizing the kernel-target alignment score, and then the combined kernel is used by a standard SVM with a single kernel. Experimental results on the VOC 2008 data set [8] show that our simple procedure outperforms the average kernel and MKL.
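Step one of the two-step approach scores kernels by kernel-target alignment. The sketch below uses a simplified variant that weights each base kernel by its individual alignment; the paper optimizes the alignment of the mixture directly:

```python
import numpy as np

def kernel_target_alignment(K, y):
    """Alignment <K, yy^T>_F / (||K||_F * ||yy^T||_F) for labels in {-1, +1}."""
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

def alignment_weights(kernels, y):
    """Score each base kernel by its target alignment and normalize into
    mixing weights; the combined kernel sum_k w_k K_k then goes into a
    standard single-kernel SVM (step two)."""
    a = np.array([kernel_target_alignment(K, y) for K in kernels])
    a = np.clip(a, 0.0, None)          # keep the mixture positive
    return a / a.sum()
```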
Archive | 2016
Alexander Binder; Sebastian Bach; Grégoire Montavon; Klaus-Robert Müller; Wojciech Samek
We present the application of layer-wise relevance propagation to several deep neural networks, such as the BVLC reference network and GoogleNet, trained on the ImageNet and MIT Places datasets. Layer-wise relevance propagation is a method to compute scores for image pixels and image regions denoting the impact of each region on the classifier's prediction for a particular test image. We demonstrate the impact of different parameter settings on the resulting explanation.
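One family of parameter settings studied in this context is the alpha-beta rule, which redistributes positive and negative pre-activation contributions separately; alpha is the free parameter whose choice changes the explanation. A minimal sketch for a dense layer, with illustrative names:

```python
import numpy as np

def lrp_alphabeta(a, W, R_out, alpha=2.0, eps=1e-9):
    """LRP alpha-beta rule with beta = alpha - 1, so relevance is conserved.
    a: (d_in,) inputs, W: (d_in, d_out) weights, R_out: (d_out,) relevance."""
    beta = alpha - 1.0
    Z = a[:, None] * W                       # per-connection contributions z_ij
    Zp, Zn = np.maximum(Z, 0.0), np.minimum(Z, 0.0)
    Rp = Zp @ (R_out / (Zp.sum(axis=0) + eps))   # positive share
    Rn = Zn @ (R_out / (Zn.sum(axis=0) - eps))   # negative share
    return alpha * Rp - beta * Rn
```

With alpha = 1 (beta = 0) only excitatory evidence is propagated; larger alpha mixes in counter-evidence, which is the kind of trade-off the parameter study examines.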
International Conference of the IEEE Engineering in Medicine and Biology Society | 2013
Wojciech Samek; Alexander Binder; Klaus-Robert Müller
Combining information from different sources is a common way to improve classification accuracy in Brain-Computer Interfacing (BCI). For instance, in small-sample settings it is useful to integrate data from other subjects or sessions in order to improve the estimation quality of the spatial filters or the classifier. Since data from different subjects may show large variability, it is crucial to weight the contributions according to their importance. Many multi-subject learning algorithms determine the optimal weighting in a separate step using heuristics, without ensuring that the selected weights are optimal with respect to classification. In this work we apply Multiple Kernel Learning (MKL) to this problem. MKL has been widely used for feature fusion in computer vision and allows one to learn the classifier and the optimal weighting simultaneously. We compare the MKL method to two baseline approaches and investigate the reasons for the performance improvement.
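In this setting one kernel is typically built per data source, and MKL then learns one weight per source jointly with the classifier instead of fixing the weights heuristically. A sketch assuming CSP-style spatial filters and log band-power features, which is illustrative rather than the paper's exact feature pipeline:

```python
import numpy as np

def subject_kernels(trials, filter_banks):
    """One kernel per source subject. trials: (n_trials, channels, time);
    each entry of filter_banks: (channels, n_filters) spatial filters
    estimated from one subject's data."""
    kernels = []
    for Ws in filter_banks:
        proj = np.einsum('ck,nct->nkt', Ws, trials)   # spatially filtered trials
        feats = np.log(proj.var(axis=2) + 1e-12)      # log band-power features
        kernels.append(feats @ feats.T)               # linear kernel per source
    return kernels
```

These per-source Gram matrices are exactly the input an MKL solver expects, so the weighting of subjects and the classifier are learned in one optimization.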
Workshop on Applications of Computer Vision | 2011
Motoaki Kawanabe; Alexander Binder; Christina Müller; Wojciech Wojcikiewicz
Automatic annotation of images is a challenging task in computer vision because of the “semantic gap” between high-level visual concepts and image appearances. User tags attached to images can provide further information to bridge this gap, even though they are partially uninformative and misleading. In this work, we investigate multi-modal visual concept classification based on visual features and user tags via kernel-based classifiers. A key issue is how to construct kernels between sets of tags. We deploy Markov random walks on graphs of key tags to incorporate co-occurrence between them. This procedure acts as a smoothing of tag-based features. Our experimental results on the ImageCLEF2010 PhotoAnnotation benchmark show that our proposed method outperforms both a baseline relying solely on visual information and a recently published state-of-the-art approach.
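The smoothing step can be sketched as a few steps of a random walk on the tag co-occurrence graph; how the co-occurrence matrix is built and how many steps are taken are assumptions here, not the paper's exact settings:

```python
import numpy as np

def random_walk_smooth(tag_features, cooccurrence, t=3):
    """Diffuse each image's tag vector over the tag graph.
    tag_features: (n_images, n_tags) binary tag indicators;
    cooccurrence: (n_tags, n_tags) tag co-occurrence counts."""
    P = cooccurrence / (cooccurrence.sum(axis=1, keepdims=True) + 1e-12)
    F = tag_features.astype(float)
    for _ in range(t):
        F = F @ P        # one random-walk step: mass flows to co-occurring tags
    return F             # smoothed features, ready for a kernel such as F @ F.T
```

After smoothing, an image tagged only “beach” also gains weight on related tags like “sea” if they frequently co-occur, which mitigates sparse and noisy user tagging.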