Andrej Gisbrecht
Bielefeld University
Publications
Featured research published by Andrej Gisbrecht.
Neurocomputing | 2015
Andrej Gisbrecht; Alexander Schulz; Barbara Hammer
Novel non-parametric dimensionality reduction techniques such as t-distributed stochastic neighbor embedding (t-SNE) lead to a powerful and flexible visualization of high-dimensional data. One drawback of non-parametric techniques is their lack of an explicit out-of-sample extension. In this contribution, we propose an efficient extension of t-SNE to a parametric framework, kernel t-SNE, which preserves the flexibility of basic t-SNE but enables explicit out-of-sample extensions. We compare kernel t-SNE to standard t-SNE on benchmark data sets, in particular addressing the generalization ability of the mapping to novel data. In the context of large data sets, this procedure enables us to train a mapping on a fixed-size subset only and to map all data afterwards in linear time. We demonstrate that this technique yields satisfactory results also for large data sets, provided that information missing due to the small size of the subset is compensated by auxiliary information such as class labels, which can be integrated into kernel t-SNE based on the Fisher information.
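As an illustration of the out-of-sample idea, the sketch below embeds a small training subset with standard t-SNE and then fits a normalized Gaussian kernel regression from input space to the embedding, so novel points can be mapped in linear time. This is a minimal sketch, not the authors' reference implementation; the toy data, bandwidth heuristic, and least-squares fit are illustrative assumptions.

```python
# Kernel t-SNE style out-of-sample mapping (hedged sketch): embed a training
# subset with t-SNE, then learn a normalized Gaussian kernel map into the
# embedding so new points are mapped in linear time.
import numpy as np
from sklearn.manifold import TSNE
from sklearn.metrics.pairwise import euclidean_distances

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 10))   # training subset (hypothetical data)
X_new = rng.normal(size=(50, 10))      # novel points to map afterwards

Y_train = TSNE(n_components=2, random_state=0).fit_transform(X_train)

def normalized_kernel(A, B, sigma):
    """Gaussian kernel with rows normalized to sum to one."""
    K = np.exp(-euclidean_distances(A, B) ** 2 / (2 * sigma ** 2))
    return K / K.sum(axis=1, keepdims=True)

sigma = np.median(euclidean_distances(X_train, X_train))  # bandwidth heuristic
K_train = normalized_kernel(X_train, X_train, sigma)
# Solve K_train @ alpha ~ Y_train in the least-squares sense.
alpha, *_ = np.linalg.lstsq(K_train, Y_train, rcond=None)

Y_new = normalized_kernel(X_new, X_train, sigma) @ alpha  # out-of-sample map
```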
Neurocomputing | 2011
Andrej Gisbrecht; Bassam Mokbel; Barbara Hammer
The generative topographic mapping (GTM) has been proposed as a statistical model to represent high-dimensional data by a distribution induced by a sparse lattice of points in a low-dimensional latent space, such that visualization, compression, and data inspection become possible. The formulation as a generative statistical model has the benefit that relevant parameters of the model can be determined automatically based on an expectation maximization scheme. Further, the model offers great flexibility, such as a direct out-of-sample extension and the possibility to obtain different degrees of granularity of the visualization without additional training. The original GTM is restricted to data points in a given Euclidean vector space. Often, data are not explicitly embedded in a Euclidean vector space; rather, only pairwise dissimilarities of the data can be computed, i.e. the relations between data points are given rather than the data vectors themselves. We propose a method which extends GTM to relational data and which achieves a sparse representation, in latent space, of data characterized by pairwise dissimilarities. The method, relational GTM, is demonstrated on several benchmarks.
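A core ingredient of relational prototype models is that prototypes can be handled implicitly as convex combinations of data points, so prototype-to-point distances follow from the dissimilarity matrix alone. The sketch below illustrates this standard relational trick (an assumption based on the relational-learning literature, not code from the paper) and verifies it on toy squared-Euclidean dissimilarities:

```python
# Relational trick: prototypes are implicit convex combinations
# w_j = sum_i alpha[j, i] * x_i, and squared prototype-to-point distances
# follow from the dissimilarity matrix D alone via
#   d^2(w_j, x_i) = (D @ alpha_j)_i - 0.5 * alpha_j^T @ D @ alpha_j
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))                        # toy vectorial data
D = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared Euclidean dissimilarities

n_prototypes = 9
alpha = rng.random((n_prototypes, len(X)))
alpha /= alpha.sum(axis=1, keepdims=True)            # rows are convex coefficients

dist2 = alpha @ D - 0.5 * np.einsum('ji,ik,jk->j', alpha, D, alpha)[:, None]

# For squared Euclidean D the identity is exact: compare against explicit prototypes.
W = alpha @ X
assert np.allclose(dist2, ((W[:, None, :] - X[None, :, :]) ** 2).sum(-1))
```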
International Journal of Neural Systems | 2012
Andrej Gisbrecht; Bassam Mokbel; Frank-Michael Schleif; Xibin Zhu; Barbara Hammer
Prototype-based learning offers an intuitive interface to inspect large quantities of electronic data in supervised or unsupervised settings. Recently, many techniques have been extended to data described by general dissimilarities rather than Euclidean vectors, so-called relational data settings. Unlike their Euclidean counterparts, these techniques have quadratic time complexity due to the underlying quadratic dissimilarity matrix and are thus already infeasible for medium-sized data sets. The contribution of this article is twofold: on the one hand, we propose a novel supervised prototype-based classification technique for dissimilarity data based on popular learning vector quantization (LVQ); on the other hand, we transfer a linear-time approximation technique, the Nyström approximation, to this algorithm and to an unsupervised counterpart, the relational generative topographic mapping (GTM). In this way, methods with linear time and space complexity result. We evaluate the techniques on three examples from the biomedical domain.
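The Nyström approximation avoids materializing the full quadratic matrix by working with a thin slice of it. The following minimal sketch (with an RBF kernel and uniformly random landmarks as illustrative assumptions) shows the basic construction:

```python
# Nyström approximation sketch: represent an n x n kernel matrix through an
# n x m slice and its m x m landmark block, so storage and matrix-vector
# products become linear in n.
import numpy as np

rng = np.random.default_rng(2)
n, m = 1000, 50                        # n samples, m landmarks
X = rng.normal(size=(n, 3))
landmarks = rng.choice(n, size=m, replace=False)

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K_nm = rbf(X, X[landmarks])            # n x m slice, linear in n
K_mm = K_nm[landmarks]                 # m x m landmark block
# Full matrix is approximated as K ~ K_nm @ pinv(K_mm) @ K_nm.T, but never
# materialized; e.g. a matrix-vector product costs O(n m):
v = rng.normal(size=n)
Kv_approx = K_nm @ (np.linalg.pinv(K_mm) @ (K_nm.T @ v))
```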
Neurocomputing | 2015
Andrej Gisbrecht; Frank-Michael Schleif
Domain-specific (dis-)similarity or proximity measures, used e.g. in alignment algorithms for sequence data, are popular for analyzing complicated data objects and for covering domain-specific data properties. Without an underlying vector space, these data are given as pairwise (dis-)similarities only. The few available methods for such data focus largely on similarities and do not scale to large datasets. Kernel methods are very effective for metric similarity matrices, also at large scale, but costly transformations are necessary when starting from non-metric (dis-)similarities. We propose an integrative combination of Nyström approximation, potential double centering, and eigenvalue correction to obtain valid kernel matrices at costs linear in the number of samples. The proposed approach makes effective kernel methods accessible for such data. Experiments with several larger (dis-)similarity datasets show that the proposed method achieves much better runtime performance than the standard strategy while keeping competitive model accuracy. The main contribution is an efficient and accurate technique to convert (potentially non-metric) large-scale dissimilarity matrices into approximated positive semi-definite kernel matrices at linear costs.
Highlights:
- We propose a linear-time and memory-efficient approach for converting low-rank dissimilarity matrices to similarity matrices and vice versa.
- Our approach is applicable to proximities obtained from non-metric proximity measures (indefinite kernels, non-standard dissimilarity measures).
- The presented approach also comprises a generalization of Landmark MDS and is in general more accurate and flexible than Landmark MDS.
- We provide, as a core element of the approach, an alternative derivation of the Nyström approximation together with a convergence proof (also for indefinite kernels) that was not given in the workshop paper.
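The two correction ingredients combined with Nyström can be sketched on a small full matrix for clarity (the paper's contribution is performing these steps inside the Nyström approximation at linear cost; the toy data and the "clip" correction below are illustrative assumptions, with flipping or shifting the spectrum as known alternatives):

```python
# Double centering turns a dissimilarity matrix into a similarity matrix;
# eigenvalue clipping then enforces positive semi-definiteness.
import numpy as np

rng = np.random.default_rng(3)
n = 40
D = rng.random((n, n))
D = 0.5 * (D + D.T)                      # symmetric toy dissimilarities
np.fill_diagonal(D, 0.0)

J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
S = -0.5 * J @ D @ J                     # double centering: dissim -> sim

w, V = np.linalg.eigh(S)
K = V @ np.diag(np.clip(w, 0.0, None)) @ V.T   # clip negative eigenvalues -> PSD kernel
```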
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery | 2015
Andrej Gisbrecht; Barbara Hammer
In this overview, commonly used dimensionality reduction techniques for data visualization and their properties are reviewed. The focus lies on an intuitive understanding of the underlying mathematical principles rather than on detailed algorithmic pipelines. Important mathematical properties of the techniques are summarized in tabular form. The behavior of representative techniques is demonstrated on three benchmarks, followed by a short discussion of how to quantitatively evaluate these mappings. In addition, three currently active research topics are addressed: how to devise dimensionality reduction techniques for complex non-vectorial data sets, how to easily shape dimensionality reduction techniques according to the user's preferences, and how to devise models that are suited for big data sets. WIREs Data Mining Knowl Discov 2015, 5:51–73. doi: 10.1002/widm.1147
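On the question of quantitative evaluation, one widely used proxy for mapping quality is neighborhood preservation. A brief sketch using scikit-learn's trustworthiness score (one of several possible measures; quality curves additionally sweep the neighborhood size) might look as follows, with the digits data as a stand-in benchmark:

```python
# Compare two representative mappings by how well they preserve
# neighborhoods of the original data.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, trustworthiness

X, _ = load_digits(return_X_y=True)
for name, Y in [("PCA", PCA(n_components=2).fit_transform(X)),
                ("t-SNE", TSNE(n_components=2, random_state=0).fit_transform(X))]:
    print(name, trustworthiness(X, Y, n_neighbors=12))
```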
SIMBAD'13: Proceedings of the Second International Conference on Similarity-Based Pattern Recognition | 2013
Frank-Michael Schleif; Andrej Gisbrecht
Domain-specific (dis-)similarity or proximity measures, employed e.g. in alignment algorithms in bioinformatics, are often used to compare complex data objects and to cover domain-specific data properties. Lacking an underlying vector space, the data are given as pairwise (dis-)similarities. The few available methods for such data do not scale well to very large data sets. Kernel methods easily deal with metric similarity matrices, also at large scale, but costly transformations are necessary when starting from non-metric (dis-)similarities. We propose an integrative combination of Nyström approximation, potential double centering, and eigenvalue correction to obtain valid kernel matrices at linear costs. Accordingly, effective kernel approaches become accessible for these data. Evaluation on several larger (dis-)similarity data sets shows that the proposed method achieves much better runtime performance than the standard strategy while keeping competitive model accuracy. Our main contribution is an efficient linear technique to convert (potentially non-metric) large-scale dissimilarity matrices into approximated positive semi-definite kernel matrices.
Neurocomputing | 2012
Xibin Zhu; Andrej Gisbrecht; Frank-Michael Schleif; Barbara Hammer
Recently, diverse high-quality prototype-based clustering techniques have been developed which can directly deal with data sets given by general pairwise dissimilarities rather than standard Euclidean vectors. Examples include affinity propagation, relational neural gas, and relational generative topographic mapping. Owing to the size of the dissimilarity matrix, these techniques scale quadratically with the size of the training set, such that training becomes prohibitive for large data volumes. In this contribution, we investigate two different linear-time approximation techniques, patch processing and the Nyström approximation. We apply these approximations to several representative clustering techniques for dissimilarities, where possible, and compare the results on diverse data sets.
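Patch processing streams the data in fixed-size chunks and carries a prototype summary from one chunk to the next, keeping memory constant. The sketch below illustrates the idea with k-means as a stand-in (the paper applies it to relational clustering techniques, where prototypes are represented via dissimilarities):

```python
# Patch processing sketch: cluster each patch together with the prototypes
# retained from the previous patch, so the full data set is never held at once.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
X = rng.normal(size=(5000, 8))
patch_size, k = 500, 10
prototypes = np.empty((0, X.shape[1]))

for start in range(0, len(X), patch_size):
    patch = np.vstack([prototypes, X[start:start + patch_size]])
    km = KMeans(n_clusters=k, n_init=4, random_state=0).fit(patch)
    prototypes = km.cluster_centers_      # summary carried to the next patch
```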
Neurocomputing | 2011
Andrej Gisbrecht; Barbara Hammer
Generative topographic mapping (GTM) provides a flexible statistical model for unsupervised data inspection and topographic mapping. Since it yields an explicit mapping of a low-dimensional latent space to the observation space and an explicit formula for the constrained Gaussian mixture model induced thereby, it offers diverse functionalities including clustering, dimensionality reduction, topographic mapping, and the like. However, it shares the property of most unsupervised tools that noise in the data cannot be recognized as such and, in consequence, is visualized in the map. The framework of visualization based on auxiliary information and, more specifically, the framework of learning metrics as introduced in [14,21] constitutes an elegant way to shape the metric according to the auxiliary information at hand, such that distance-based approaches display only those aspects which are relevant for a given classification task. Here we introduce the concept of relevance learning into GTM such that the metric is shaped according to auxiliary class labels. Relying on the prototype-based nature of GTM, efficient realizations of this paradigm are developed and compared on several benchmarks to state-of-the-art supervised dimensionality reduction techniques.
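In its simplest diagonal form, the learning-metrics idea scales each input dimension by a label-derived relevance weight so that distances emphasize class-relevant directions. The sketch below is a deliberate simplification (the paper shapes the GTM metric via auxiliary label information; the mutual-information weighting here is an illustrative assumption, not the paper's method):

```python
# Diagonal relevance metric: weight each dimension by a relevance score
# estimated from class labels, then measure weighted squared distances.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import mutual_info_classif

X, y = load_iris(return_X_y=True)
relevance = mutual_info_classif(X, y, random_state=0)
relevance /= relevance.sum()                     # normalized relevance profile

def relevance_distance(a, b, lam=relevance):
    """Weighted squared Euclidean distance d_lambda(a, b)."""
    return np.sum(lam * (a - b) ** 2)

print(relevance_distance(X[0], X[1]))
```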
17th International Conference on Information Visualisation | 2013
Andrej Gisbrecht; Barbara Hammer; Bassam Mokbel; Alexander Sczyrba
We investigate the potential of modern nonlinear dimensionality reduction techniques for interactive cluster detection in bioinformatics applications. We demonstrate that recent non-parametric techniques such as t-distributed stochastic neighbor embedding (t-SNE) allow a cluster identification which is superior to direct clustering of the original data or to cluster detection based on classical parametric dimensionality reduction approaches. Non-parametric approaches, however, display quadratic complexity, which makes them unsuitable for interactive use. As a speedup, we propose kernel t-SNE, a fast parametric counterpart of t-SNE.
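The comparison described here can be prototyped in a few lines: cluster once in the original space and once in a 2-D t-SNE embedding, then score both against known labels. The digits data below stands in for the bioinformatics sets, so this is an illustrative sketch rather than the paper's experiment:

```python
# Compare direct clustering with clustering after a t-SNE embedding,
# scored against ground-truth labels via the adjusted Rand index.
from sklearn.datasets import load_digits
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.metrics import adjusted_rand_score

X, y = load_digits(return_X_y=True)
direct = KMeans(n_clusters=10, n_init=4, random_state=0).fit_predict(X)
Y = TSNE(n_components=2, random_state=0).fit_transform(X)
embedded = KMeans(n_clusters=10, n_init=4, random_state=0).fit_predict(Y)
print("direct:", adjusted_rand_score(y, direct),
      "after t-SNE:", adjusted_rand_score(y, embedded))
```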
International Symposium on Neural Networks | 2012
Andrej Gisbrecht; Bassam Mokbel; Barbara Hammer
t-distributed stochastic neighbor embedding (t-SNE) constitutes a nonlinear dimensionality reduction technique which is particularly suited to visualize high-dimensional data sets with intrinsic nonlinear structure. A major drawback, however, is its quadratic complexity, which makes the technique infeasible for large data sets or for online application in an interactive framework. In addition, since the technique is non-parametric, it offers no direct way to extend the mapping to novel data points. In this contribution, we propose an extension of t-SNE to an explicit mapping. In the limit, it reduces to standard non-parametric t-SNE, while offering a feasible nonlinear embedding function for other parameter choices. We evaluate the performance of the technique when trained on only a small subset of the given data. It turns out that its generalization ability is good when evaluated with the standard quality curve. Further, in many cases it reaches a quality approximating that of t-SNE trained on the full data set, although only 10% of the data are used for training. This opens the way towards efficient nonlinear dimensionality reduction techniques as required in interactive settings.
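The 10%-subset experiment can be mimicked as follows, with a normalized Gaussian kernel map standing in for the paper's explicit mapping and trustworthiness standing in for the quality curve (both are illustrative assumptions, not the paper's exact setup):

```python
# Train the embedding on a 10% subset, map the held-out 90% through a
# kernel-regression map, and score the held-out embedding quality.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE, trustworthiness
from sklearn.metrics.pairwise import euclidean_distances

X, _ = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
idx = rng.permutation(len(X))
train, rest = idx[: len(X) // 10], idx[len(X) // 10:]

Y_train = TSNE(n_components=2, random_state=0).fit_transform(X[train])
sigma = np.median(euclidean_distances(X[train]))
K = np.exp(-euclidean_distances(X[train]) ** 2 / (2 * sigma ** 2))
K /= K.sum(axis=1, keepdims=True)
alpha, *_ = np.linalg.lstsq(K, Y_train, rcond=None)   # fit the explicit map

K_rest = np.exp(-euclidean_distances(X[rest], X[train]) ** 2 / (2 * sigma ** 2))
K_rest /= K_rest.sum(axis=1, keepdims=True)
print(trustworthiness(X[rest], K_rest @ alpha, n_neighbors=12))
```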