
Publication


Featured research published by Kerstin Bunte.


Neural Networks | 2012

Limited Rank Matrix Learning, discriminative dimension reduction and visualization

Kerstin Bunte; Petra Schneider; Barbara Hammer; Frank-Michael Schleif; Thomas Villmann; Michael Biehl

We present an extension of the recently introduced Generalized Matrix Learning Vector Quantization algorithm. In the original scheme, adaptive square matrices of relevance factors parameterize a discriminative distance measure. We extend the scheme to matrices of limited rank, corresponding to low-dimensional representations of the data. This allows us to incorporate prior knowledge of the intrinsic dimension and to reduce the number of adaptive parameters efficiently. In particular, for very high-dimensional data, limiting the rank can reduce computation time and memory requirements significantly. Furthermore, two- or three-dimensional representations constitute an efficient visualization method for labeled data sets. The identification of a suitable projection is not treated as a pre-processing step but as an integral part of the supervised training. Several real-world data sets serve as an illustration and demonstrate the usefulness of the suggested method.
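
To make the limited-rank scheme concrete, here is a minimal numpy sketch of the rectangular relevance distance and the GLVQ-style cost it enters; the single-prototype-per-class setup, the matrix shapes and all names are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

def liram_distance(x, w, omega):
    """Discriminative distance d(x, w) = ||Omega (x - w)||^2, where Omega has
    shape (M, N) with M << N, i.e. a relevance matrix of limited rank M."""
    diff = omega @ (x - w)
    return diff @ diff

def glvq_cost(x, label, prototypes, proto_labels, omega):
    """GLVQ-style cost mu = (d+ - d-) / (d+ + d-) for a single sample, where
    d+ (d-) is the distance to the closest correct (incorrect) prototype."""
    d = np.array([liram_distance(x, w, omega) for w in prototypes])
    d_plus = d[proto_labels == label].min()
    d_minus = d[proto_labels != label].min()
    return (d_plus - d_minus) / (d_plus + d_minus)

# Toy setup: 10-dimensional data, rank limited to M = 2.
rng = np.random.default_rng(0)
omega = rng.normal(size=(2, 10))        # limited-rank relevance matrix
prototypes = rng.normal(size=(2, 10))   # one prototype per class (assumption)
proto_labels = np.array([0, 1])
x, y = rng.normal(size=10), 0
print(glvq_cost(x, y, prototypes, proto_labels, omega))
# The same omega doubles as a 2-D visualization: plot (omega @ x) per sample.
```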


IEEE Transactions on Neural Networks | 2010

Regularization in Matrix Relevance Learning

Petra Schneider; Kerstin Bunte; Han Stiekema; Barbara Hammer; Thomas Villmann; Michael Biehl

In this paper, we present a regularization technique to extend recently proposed matrix learning schemes in learning vector quantization (LVQ). These learning algorithms extend the concept of adaptive distance measures in LVQ to the use of relevance matrices. In general, metric learning can display a tendency towards oversimplification in the course of training. An overly pronounced elimination of dimensions in feature space can degrade performance and may lead to instabilities in training. We focus on matrix learning in generalized LVQ (GLVQ). Extending the cost function by an appropriate regularization term prevents this unfavorable behavior and can help to improve the generalization ability. The approach is first tested and illustrated on artificial model data. Furthermore, we apply the scheme to benchmark classification data sets from the UCI Machine Learning Repository. We demonstrate the usefulness of regularization also in the case of rank-limited relevance matrices, i.e., matrix learning with an implicit, low-dimensional representation of the data.
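
A minimal sketch of one such regularizer, assuming a log-determinant penalty on Omega Omega^T, which diverges as the relevance matrix approaches singularity; the learning rate, the toy data and the helper names are assumptions for illustration.

```python
import numpy as np

def penalty(omega, alpha):
    """Log-determinant regularizer -(alpha/2) * ln det(Omega Omega^T); it
    diverges as Omega becomes singular, discouraging the elimination of
    feature-space dimensions during training."""
    sign, logdet = np.linalg.slogdet(omega @ omega.T)
    return -0.5 * alpha * logdet

def regularized_step(omega, grad_cost, alpha, lr=0.01):
    """One descent step on the regularized cost; the penalty's gradient
    w.r.t. Omega is -alpha * (Omega Omega^T)^{-1} Omega."""
    grad_penalty = -alpha * np.linalg.solve(omega @ omega.T, omega)
    return omega - lr * (grad_cost + grad_penalty)

# Toy check: with a zero data gradient, the penalty alone inflates the
# small singular values of Omega instead of letting them collapse.
rng = np.random.default_rng(1)
omega = rng.normal(size=(2, 5)) * 0.1
for _ in range(100):
    omega = regularized_step(omega, np.zeros_like(omega), alpha=0.1)
print(np.linalg.svd(omega, compute_uv=False), penalty(omega, 0.1))
```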


Pattern Recognition | 2011

Learning effective color features for content based image retrieval in dermatology

Kerstin Bunte; Michael Biehl; Marcel F. Jonkman; Nicolai Petkov

We investigate the extraction of effective color features for a content-based image retrieval (CBIR) application in dermatology. Effectiveness is measured by the rate of correct retrieval of images from four color classes of skin lesions. We employ and compare two different methods to learn favorable feature representations for this special application: limited rank matrix learning vector quantization (LiRaM LVQ) and a large margin nearest neighbor (LMNN) approach. Both methods use labeled training data and provide a discriminant linear transformation of the original features, potentially to a lower dimensional space. The extracted color features are used to retrieve images from a database by a k-nearest neighbor search. We perform a comparison of retrieval rates achieved with extracted and original features for eight different standard color spaces. We achieved significant improvements in every examined color space. The increase of the mean correct retrieval rate lies between 10% and 27% in the range of k=1-25 retrieved images, and the correct retrieval rate lies between 64% and 84%. We present explicit combinations of RGB and CIE-Lab color features corresponding to healthy and lesion skin. LiRaM LVQ and the computationally more expensive LMNN give comparable results for large values of the LMNN method parameter κ (κ ≥ 25), while LiRaM LVQ outperforms LMNN for smaller values of κ. We conclude that feature extraction by LiRaM LVQ leads to considerable improvement in color-based retrieval of dermatologic images.
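
Once a transform has been learned, the retrieval step is a k-nearest-neighbor search in the transformed feature space, roughly as in this sketch; the 18-dimensional features, the random stand-in for the learned matrix and the function names are assumptions.

```python
import numpy as np

def retrieve(query, database, omega, k=5):
    """Return indices of the k nearest database images in the transformed
    color-feature space; omega is the linear transform learned by, e.g.,
    LiRaM LVQ or LMNN from labeled lesion images."""
    q = omega @ query
    db = database @ omega.T
    dist = np.linalg.norm(db - q, axis=1)
    return np.argsort(dist)[:k]

def retrieval_rate(indices, query_label, db_labels):
    """Fraction of retrieved images from the same lesion class as the query."""
    return np.mean(db_labels[indices] == query_label)

# Toy usage: 18-dimensional color features, four lesion color classes.
rng = np.random.default_rng(2)
database = rng.normal(size=(100, 18))
db_labels = rng.integers(0, 4, size=100)
omega = rng.normal(size=(3, 18))        # random stand-in for a learned matrix
query, query_label = database[0], db_labels[0]
hits = retrieve(query, database, omega, k=10)
print(retrieval_rate(hits, query_label, db_labels))
```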


Neurocomputing | 2010

Adaptive local dissimilarity measures for discriminative dimension reduction of labeled data

Kerstin Bunte; Barbara Hammer; Axel Wismüller; Michael Biehl

Due to the tremendous growth of electronic data, both in the number of samples and in their dimensionality, dimension reduction and visualization of high-dimensional data have become key problems of data mining. Since embedding in lower dimensions necessarily entails a loss of information, methods to explicitly control the information kept by a specific dimension reduction technique are highly desirable. The incorporation of supervised class information constitutes an important special case; the aim is to preserve and potentially enhance the discrimination of classes in lower dimensions. In this contribution we use an extension of prototype-based local distance learning, which results in a nonlinear discriminative dissimilarity measure for a given labeled data manifold. The learned local distance measure can serve as the basis for other unsupervised dimension reduction techniques that take neighborhood information into account. We show the combination of different dimension reduction techniques with a discriminative similarity measure learned by an extension of learning vector quantization (LVQ) and their behavior under different parameter settings. The methods are introduced and discussed in terms of artificial and real-world data sets.
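
A loose sketch of how such a locally adaptive dissimilarity could be evaluated: each prototype carries its own matrix, and the matrix of the prototype closest to the first argument is used. The selection rule and all names are simplifying assumptions, not the paper's exact construction.

```python
import numpy as np

def local_dissimilarity(x, y, prototypes, omegas):
    """Evaluate ||Omega_j (x - y)||^2 with the matrix of the prototype j
    closest to x; the piecewise choice makes the learned measure nonlinear
    over the data manifold (and, note, generally asymmetric in x and y)."""
    j = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    diff = omegas[j] @ (x - y)
    return diff @ diff

# The resulting dissimilarity matrix can replace the Euclidean distance in
# any neighborhood-based embedding (Isomap, SNE, ...).
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 8))
prototypes = rng.normal(size=(3, 8))
omegas = rng.normal(size=(3, 8, 8))     # stand-ins for learned local matrices
D = np.array([[local_dissimilarity(a, b, prototypes, omegas) for b in X]
              for a in X])
print(D.shape)                          # (50, 50)
```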


Neural Computation | 2012

A general framework for dimensionality-reducing data visualization mapping

Kerstin Bunte; Michael Biehl; Barbara Hammer

In recent years, a wealth of dimension-reduction techniques for data visualization and preprocessing has been established. Nonparametric methods require additional effort for out-of-sample extensions, because they provide only a mapping of a given finite set of points. In this letter, we propose a general view on nonparametric dimension reduction based on the concept of cost functions and properties of the data. Based on this general principle, we transfer nonparametric dimension reduction to explicit mappings of the data manifold such that direct out-of-sample extensions become possible. Furthermore, this concept offers the possibility of investigating the generalization ability of data visualization to new data points. We demonstrate the approach based on a simple global linear mapping, as well as prototype-based local linear mappings. In addition, we can bias the functional form according to given auxiliary information. This leads to explicit supervised visualization mappings with discriminative properties comparable to state-of-the-art approaches.
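
To make the principle concrete, the sketch below trains an explicit linear mapping y = Ax by gradient descent on a stand-in cost (metric-MDS stress rather than the cost functions treated in the letter), after which out-of-sample extension is just a matrix product. The cost choice, the finite-difference gradient and the toy sizes are assumptions.

```python
import numpy as np

def embed(X, A):
    """Explicit linear visualization mapping y = A x; the framework equally
    admits local linear or other parametric forms."""
    return X @ A.T

def stress(A, X, D):
    """Stand-in cost: metric-MDS stress between the input distances D and
    the distances of the mapped points."""
    Y = embed(X, A)
    Dy = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    return ((Dy - D) ** 2).mean()

def numeric_grad(f, A, eps=1e-5):
    """Finite-difference gradient: adequate for a sketch, slow in practice."""
    g = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        A1, A2 = A.copy(), A.copy()
        A1[idx] += eps
        A2[idx] -= eps
        g[idx] = (f(A1) - f(A2)) / (2 * eps)
    return g

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 5))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
A = rng.normal(size=(2, 5)) * 0.1
for _ in range(200):
    A -= 0.01 * numeric_grad(lambda M: stress(M, X, D), A)
# Out-of-sample extension is now trivial: embed(new_points, A).
```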


Neurocomputing | 2011

Neighbor embedding XOM for dimension reduction and visualization

Kerstin Bunte; Barbara Hammer; Thomas Villmann; Michael Biehl; Axel Wismüller

We present an extension of the Exploratory Observation Machine (XOM) for structure-preserving dimensionality reduction. Based on minimizing the Kullback-Leibler divergence of neighborhood functions in data and image spaces, this Neighbor Embedding XOM (NE-XOM) creates a link between fast sequential online learning, known from topology-preserving mappings, and principled direct divergence optimization approaches. We quantitatively evaluate our method on real-world data using multiple embedding quality measures. In this comparison, NE-XOM offers a competitive trade-off between high embedding quality and low computational expense, which motivates its further use in real-world settings throughout science and engineering.
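
The following is a deliberately loose finite-difference sketch of the underlying idea, sequential per-point updates descending the KL divergence between Gaussian neighborhood functions in data and embedding space; these are not the NE-XOM update rules, and the neighborhood widths, learning rate and schedule are assumptions.

```python
import numpy as np

def neighborhood(dist_sq, sigma):
    """Row-normalized Gaussian neighborhood function."""
    p = np.exp(-dist_sq / (2 * sigma ** 2))
    np.fill_diagonal(p, 0.0)
    return p / p.sum(axis=1, keepdims=True)

def kl_rows(P, Q, eps=1e-12):
    """Sum of row-wise Kullback-Leibler divergences KL(P_i || Q_i)."""
    return np.sum(P * (np.log(P + eps) - np.log(Q + eps)))

rng = np.random.default_rng(5)
X = rng.normal(size=(40, 6))
Y = rng.normal(size=(40, 2)) * 0.01     # 2-D embedding, updated sequentially
P = neighborhood(((X[:, None] - X[None, :]) ** 2).sum(-1), sigma=1.0)

def cost(Y):
    Dy = ((Y[:, None] - Y[None, :]) ** 2).sum(-1)
    return kl_rows(P, neighborhood(Dy, sigma=1.0))

# Sequential online updates: pick one point at a time and move it along a
# finite-difference estimate of the KL gradient.
eps, lr = 1e-4, 0.05
for step in range(2000):
    i = rng.integers(len(Y))
    g = np.zeros(2)
    for c in range(2):
        Y1, Y2 = Y.copy(), Y.copy()
        Y1[i, c] += eps
        Y2[i, c] -= eps
        g[c] = (cost(Y1) - cost(Y2)) / (2 * eps)
    Y[i] -= lr * g
print(cost(Y))
```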


PLOS ONE | 2013

Analysis of Flow Cytometry Data by Matrix Relevance Learning Vector Quantization

Michael Biehl; Kerstin Bunte; Petra Schneider

Flow cytometry is a widely used technique for the analysis of cell populations in the study and diagnosis of human diseases. It yields large amounts of high-dimensional data, the analysis of which would clearly benefit from efficient computational approaches aiming at automated diagnosis and decision support. This article presents our analysis of flow cytometry data in the framework of the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukemia (AML) Challenge, 2011. In the challenge, example data was provided for a set of 179 subjects, comprising healthy donors and 23 cases of AML. The participants were asked to provide predictions with respect to the condition of 180 patients in a test set. We extracted feature vectors from the data in terms of single marker statistics, including characteristic moments, median and interquartile range of the observed values. Subsequently, we applied Generalized Matrix Relevance Learning Vector Quantization (GMLVQ), a machine learning technique which extends standard LVQ by an adaptive distance measure. Our method achieved the best possible performance with respect to the diagnoses of test set patients. The extraction of features from the flow cytometry data is outlined in detail, the machine learning approach is discussed and classification results are presented. In addition, we illustrate how GMLVQ can provide deeper insight into the problem by allowing the relevance of specific markers and features for the diagnosis to be inferred.
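
The feature extraction step can be pictured as follows: for each marker, a handful of summary statistics is computed over the per-cell values and concatenated across markers. The exact statistics and the toy data below are assumptions modeled on the description above, not the challenge pipeline.

```python
import numpy as np
from scipy import stats

def marker_features(values):
    """Summary statistics of one marker's per-cell values for one subject:
    low-order moments plus median and interquartile range (the exact
    feature list used in the challenge entry may differ)."""
    q1, med, q3 = np.percentile(values, [25, 50, 75])
    return np.array([values.mean(), values.std(),
                     stats.skew(values), stats.kurtosis(values),
                     med, q3 - q1])

def subject_vector(cells):
    """Concatenate per-marker statistics into one feature vector."""
    return np.concatenate([marker_features(col) for col in cells.T])

# Toy subject: 10,000 cells x 6 fluorescence markers.
rng = np.random.default_rng(6)
cells = rng.gamma(shape=2.0, scale=1.0, size=(10_000, 6))
x = subject_vector(cells)               # 6 markers x 6 statistics = 36 features
print(x.shape)
# Vectors like x would then be classified with GMLVQ as described above.
```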


Neurocomputing | 2012

Stochastic neighbor embedding (SNE) for dimension reduction and visualization using arbitrary divergences

Kerstin Bunte; Sven Haase; Michael Biehl; Thomas Villmann

We present a systematic approach to the mathematical treatment of the t-distributed stochastic neighbor embedding (t-SNE) and the stochastic neighbor embedding (SNE) method. This allows an easy adaptation of the methods or an exchange of their respective modules. In particular, the divergence which measures the difference between probability distributions in the original and the embedding space can be treated independently of other components, such as the similarity of data points or the data distribution. We focus on the extension to different divergences and propose a general framework based on the consideration of Fréchet derivatives. In this way, the general approach can be adapted to user-specific needs.
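
A sketch of the modularity the framework affords: the divergence is treated as a plug-in and descended with respect to the embedding. Where the paper derives analytic gradients via Fréchet derivatives, a finite-difference gradient stands in here so that any divergence can be exchanged without new derivations; sizes, learning rate and names are illustrative.

```python
import numpy as np

def q_matrix(Y):
    """Student-t joint probabilities of the embedding, as in t-SNE."""
    w = 1.0 / (1.0 + ((Y[:, None] - Y[None, :]) ** 2).sum(-1))
    np.fill_diagonal(w, 0.0)
    return w / w.sum()

def kl(P, Q, eps=1e-12):
    return np.sum(P * np.log((P + eps) / (Q + eps)))

def reverse_kl(P, Q, eps=1e-12):
    return np.sum(Q * np.log((Q + eps) / (P + eps)))

def embed(P, divergence, n_iter=100, lr=1.0, seed=0, eps=1e-5):
    """Descend an arbitrary divergence D(P, Q(Y)) by finite differences, so
    the divergence is an exchangeable plug-in module (the paper obtains
    analytic gradients via Frechet derivatives instead)."""
    rng = np.random.default_rng(seed)
    Y = rng.normal(size=(P.shape[0], 2)) * 1e-2
    for _ in range(n_iter):
        G = np.zeros_like(Y)
        for idx in np.ndindex(Y.shape):
            Y1, Y2 = Y.copy(), Y.copy()
            Y1[idx] += eps
            Y2[idx] -= eps
            G[idx] = (divergence(P, q_matrix(Y1))
                      - divergence(P, q_matrix(Y2))) / (2 * eps)
        Y -= lr * G
    return Y

# Joint input probabilities from a Gaussian kernel, then two embeddings
# that differ only in the plugged-in divergence:
rng = np.random.default_rng(7)
X = rng.normal(size=(20, 4))
w = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1))
np.fill_diagonal(w, 0.0)
P = w / w.sum()
Y_kl, Y_rkl = embed(P, kl), embed(P, reverse_kl)
```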


Artificial Intelligence in Medicine | 2012

Texture feature ranking with relevance learning to classify interstitial lung disease patterns

Markus B. Huber; Kerstin Bunte; Mahesh B. Nagarajan; Michael Biehl; Lawrence A. Ray; Axel Wismüller

OBJECTIVE The generalized matrix learning vector quantization (GMLVQ) is used to estimate the relevance of texture features in their ability to classify interstitial lung disease patterns in high-resolution computed tomography images. METHODOLOGY After a stochastic gradient descent, the GMLVQ algorithm provides a discriminative distance measure of relevance factors, which can account for pairwise correlations between different texture features and their importance for the classification of healthy and diseased patterns. 65 texture features were extracted from gray-level co-occurrence matrices (GLCMs). These features were ranked and selected according to their relevance obtained by GMLVQ and, for comparison, according to a mutual information (MI) criterion. The classification performance for different feature subsets was calculated for a k-nearest-neighbor (kNN) classifier, a random forests classifier (RanForest), and support vector machines with a linear and a radial basis function kernel (SVMlin and SVMrbf). RESULTS For all classifiers, feature sets selected by the GMLVQ relevance ranking achieved significantly better classification performance (p<0.05) than sets selected by the MI approach for many feature-subset sizes. For kNN, RanForest, and SVMrbf, some of these feature subsets performed significantly better than the set consisting of all features (p<0.05). CONCLUSION While this approach estimates the relevance of single features, future refinements of GMLVQ should include the pairwise correlations in the feature ranking, e.g. to reduce the redundancy of two equally relevant features.
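
The relevance-based ranking amounts to reading off the diagonal of Lambda = Omega^T Omega and keeping the top-k features, roughly as sketched below; the random stand-in for a trained matrix and the helper names are assumptions.

```python
import numpy as np

def relevance_ranking(omega):
    """Rank features by the diagonal of Lambda = Omega^T Omega; larger
    diagonal entries contribute more to the GMLVQ distance measure."""
    relevance = np.diag(omega.T @ omega)
    return np.argsort(relevance)[::-1], relevance

def top_k_subset(X, omega, k):
    """Keep only the k most relevant texture features."""
    order, _ = relevance_ranking(omega)
    return X[:, order[:k]], order[:k]

# Toy usage with 65 GLCM texture features, matching the setting above.
rng = np.random.default_rng(8)
X = rng.normal(size=(200, 65))
omega = rng.normal(size=(65, 65)) * 0.1  # stand-in for a trained GMLVQ matrix
X_sel, selected = top_k_subset(X, omega, k=10)
print(selected)
# The subsets would then be scored with kNN, random forests or SVM classifiers.
```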


International Symposium on Neural Networks | 2012

Large margin linear discriminative visualization by Matrix Relevance Learning

Michael Biehl; Kerstin Bunte; Frank-Michael Schleif; Petra Schneider; Thomas Villmann

We suggest and investigate the use of Generalized Matrix Relevance Learning (GMLVQ) in the context of discriminative visualization. This prototype-based, supervised learning scheme parameterizes an adaptive distance measure in terms of a matrix of relevance factors. By means of a few benchmark problems, we demonstrate that the training process yields low-rank matrices which can be used efficiently for the discriminative visualization of labeled data. Comparisons with well-known standard methods illustrate the flexibility and discriminative power of the novel approach. The mathematical analysis of GMLVQ shows that the corresponding stationarity condition can be formulated as an eigenvalue problem with one or several strongly dominating eigenvectors. We also study the inclusion of a penalty term which enforces non-singularity of the relevance matrix and can be used to efficiently control the role of higher-order eigenvalues.
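
A minimal sketch of the visualization step suggested by that analysis: eigendecompose Lambda = Omega^T Omega and project the data onto the dominating eigenvectors. The random stand-in matrix and the toy data are assumptions; a trained GMLVQ matrix would show the low-rank spectrum described above.

```python
import numpy as np

def discriminative_projection(omega, n_dims=2):
    """Leading eigenvectors of Lambda = Omega^T Omega; after training,
    Lambda is dominated by one or a few eigendirections, which serve as
    discriminative visualization axes."""
    eigvals, eigvecs = np.linalg.eigh(omega.T @ omega)   # ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order[:n_dims]], eigvals[order]

rng = np.random.default_rng(9)
omega = rng.normal(size=(10, 10)) * 0.1  # stand-in for a trained GMLVQ matrix
X = rng.normal(size=(300, 10))
axes, spectrum = discriminative_projection(omega)
Y = X @ axes                             # 2-D discriminative view of the data
print(spectrum[:3])
```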

Collaboration


Top co-authors include Peter Tino (University of Birmingham) and Wiebke Arlt (University of Birmingham).