Sebastian Mika
Technical University of Berlin
Publications
Featured research published by Sebastian Mika.
IEEE Transactions on Neural Networks | 2001
Klaus-Robert Müller; Sebastian Mika; Gunnar Rätsch; Koji Tsuda; Bernhard Schölkopf
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples of successful kernel-based learning methods. We first give a short background on Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel-based learning in supervised and unsupervised scenarios, including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.
IEEE Workshop on Neural Networks for Signal Processing | 1999
Sebastian Mika; Gunnar Rätsch; Jason Weston; Bernhard Schölkopf; Klaus-Robert Müller
A non-linear classification technique based on Fisher's discriminant is proposed. The main ingredient is the kernel trick, which allows the efficient computation of the Fisher discriminant in feature space. The linear classification in feature space corresponds to a (powerful) non-linear decision function in input space. Large-scale simulations demonstrate the competitiveness of our approach.
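The kernel trick described in this abstract can be illustrated with a minimal NumPy sketch of a two-class kernel Fisher discriminant. This is not the paper's implementation: the RBF kernel, regularization constant mu, and toy ring data are illustrative choices.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between the row vectors of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_fit(X, y, gamma=1.0, mu=1e-3):
    """Two-class kernel Fisher discriminant: Fisher's criterion is evaluated
    in the kernel-induced feature space, so everything is expressed through
    the Gram matrix (the kernel trick). Returns expansion coefficients alpha
    over the training points and a decision threshold."""
    K = rbf_kernel(X, X, gamma)
    n = len(y)
    m = [K[:, y == c].mean(axis=1) for c in (0, 1)]  # class means in alpha-space
    # Within-class scatter N, regularized by mu*I to keep it invertible
    N = mu * np.eye(n)
    for c in (0, 1):
        Kc = K[:, y == c]
        lc = Kc.shape[1]
        N += Kc @ (np.eye(lc) - np.full((lc, lc), 1.0 / lc)) @ Kc.T
    # For two classes the Rayleigh coefficient is maximized by N^{-1}(m1 - m0)
    alpha = np.linalg.solve(N, m[1] - m[0])
    proj = K @ alpha
    thresh = 0.5 * (proj[y == 0].mean() + proj[y == 1].mean())
    return alpha, thresh

def kfd_predict(alpha, thresh, X_train, X_test, gamma=1.0):
    return (rbf_kernel(X_test, X_train, gamma) @ alpha > thresh).astype(int)

# Toy problem no linear discriminant can solve: two concentric rings
rng = np.random.default_rng(0)
t = rng.uniform(0, 2 * np.pi, 80)
r = np.where(np.arange(80) < 40, 1.0, 3.0)
X = np.c_[r * np.cos(t), r * np.sin(t)] + 0.1 * rng.normal(size=(80, 2))
y = (np.arange(80) >= 40).astype(int)

alpha, thresh = kfd_fit(X, y, gamma=0.5)
acc = (kfd_predict(alpha, thresh, X, X, gamma=0.5) == y).mean()
```

Although the decision function is linear in the feature space, in input space it separates the two rings, which no linear discriminant on the raw coordinates could do.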
IEEE Transactions on Neural Networks | 1999
Bernhard Schölkopf; Sebastian Mika; Christopher J. C. Burges; Phil Knirsch; Klaus-Robert Müller; Gunnar Rätsch; Alexander J. Smola
This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use it to reduce the computational complexity of SV decision functions; second, we combine it with the Kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.
International Conference on Machine Learning | 2004
Ji Hun Ham; Daniel D. Lee; Sebastian Mika; Bernhard Schölkopf
We interpret several well-known algorithms for dimensionality reduction of manifolds as kernel methods. Isomap, graph Laplacian eigenmap, and locally linear embedding (LLE) all utilize local neighborhood information to construct a global embedding of the manifold. We show how all three algorithms can be described as kernel PCA on specially constructed Gram matrices, and illustrate the similarities and differences between the algorithms with representative examples.
German Conference on Bioinformatics | 2000
Alexander Zien; Gunnar Rätsch; Sebastian Mika; Bernhard Schölkopf; Thomas Lengauer; Klaus-Robert Müller
MOTIVATION: To extract protein sequences from nucleotide sequences, an important step is to recognize the points at which protein-coding regions start. These points are called translation initiation sites (TIS). RESULTS: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines to this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g., ESTScan) could profit from advanced TIS recognition.
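The idea of engineering a kernel around the candidate start site can be sketched in a much simplified form: a hypothetical position-weighted match kernel that upweights positions near the candidate codon, combined with a kernel nearest-mean rule on synthetic sequences. The motif, weight profile, and classifier are all illustrative assumptions; this is not the paper's locality-improved kernel or its SVM.

```python
import numpy as np

rng = np.random.default_rng(2)
ALPH = np.array(list("ACGT"))
L, CENTER = 30, 12            # sequence length; candidate start position

def random_seq():
    return "".join(rng.choice(ALPH, L))

def positive_seq():
    # Plant a Kozak-like consensus around the candidate site (hypothetical motif)
    s = list(random_seq())
    s[CENTER - 2:CENTER + 4] = list("CCATGG")
    return "".join(s)

# Position weights: matches near the candidate site count more than distant ones
w = np.exp(-0.1 * np.abs(np.arange(L) - CENTER))

def kernel(s, t):
    # Weighted count of per-position matches between two sequences
    return sum(wi for wi, a, b in zip(w, s, t) if a == b)

pos = [positive_seq() for _ in range(40)]
neg = [random_seq() for _ in range(40)]

def score(x):
    # Kernel nearest-mean rule: similarity to positives minus negatives
    return np.mean([kernel(x, p) for p in pos]) - np.mean([kernel(x, q) for q in neg])

thresh = 0.5 * (np.mean([score(p) for p in pos]) + np.mean([score(q) for q in neg]))

test_pos = [positive_seq() for _ in range(20)]
test_neg = [random_seq() for _ in range(20)]
acc = (np.mean([score(x) > thresh for x in test_pos]) +
       np.mean([score(x) <= thresh for x in test_neg])) / 2
```

The weighting encodes the prior that biologically informative positions cluster around the candidate site; random mismatches far from it contribute little to the kernel value.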
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2003
Sebastian Mika; Gunnar Rätsch; Jason Weston; Bernhard Schölkopf; Alexander J. Smola; Klaus-Robert Müller
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher's discriminant and oriented PCA using support vector kernel functions. Extensive simulations show the utility of our approach.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2002
Gunnar Rätsch; Sebastian Mika; Bernhard Schölkopf; Klaus-Robert Müller
We show via an equivalence of mathematical programs that a support vector (SV) algorithm can be translated into an equivalent boosting-like algorithm and vice versa. We exemplify this translation procedure for a new algorithm: one-class leveraging, starting from the one-class support vector machine (1-SVM). This is a first step toward unsupervised learning in a boosting framework. Building on so-called barrier methods known from the theory of constrained optimization, it returns a function, written as a convex combination of base hypotheses, that characterizes whether a given test point is likely to have been generated from the distribution underlying the training data. Simulations on one-class classification problems demonstrate the usefulness of our approach.
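One special case of the one-class setting is easy to sketch: when all expansion coefficients are equal (the nu = 1 limit of the 1-SVM), the decision function reduces to a Parzen-window score. A minimal NumPy illustration follows; the kernel width and the 5% threshold quantile are arbitrary choices, and this is the 1-SVM starting point rather than the boosting-style algorithm the paper derives.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 2))      # training sample from the underlying distribution

def one_class_score(z, X, gamma=0.5):
    """Characterize whether z looks like it was generated from the training
    distribution. With all expansion coefficients equal, the one-class
    decision function is a Parzen-window score: the mean RBF-kernel
    similarity of z to the training points."""
    d2 = ((X - z) ** 2).sum(axis=1)
    return np.exp(-gamma * d2).mean()

# Threshold chosen so that ~5% of the training data falls outside
scores = np.array([one_class_score(x, X) for x in X])
threshold = np.quantile(scores, 0.05)

inlier  = one_class_score(np.array([0.0, 0.0]), X) > threshold
outlier = one_class_score(np.array([6.0, 6.0]), X) > threshold
```

The leveraging view replaces the fixed kernel expansion with a convex combination of base hypotheses built up iteratively, but the role of the score, characterizing membership in the high-density region, is the same.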
Journal of Chemical Information and Modeling | 2009
Katja Hansen; Sebastian Mika; Timon Schroeter; Andreas Sutter; Antonius Ter Laak; Thomas Steger-Hartmann; Nikolaus Heinrich; Klaus-Robert Müller
Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and SDF) together with their biological activity. Three commercial tools (DEREK, MultiCASE, and an off-the-shelf Bayesian machine learner in Pipeline Pilot) are compared with four noncommercial machine learning implementations (Support Vector Machines, Random Forests, k-Nearest Neighbors, and Gaussian Processes) on the new benchmark data set.
Journal of Machine Learning Research | 2001
Alexander J. Smola; Sebastian Mika; Bernhard Schölkopf; Robert C. Williamson
Many settings of unsupervised learning can be viewed as quantization problems: the minimization of the expected quantization error subject to some restrictions. This allows the use of tools such as regularization from the theory of (supervised) risk minimization for unsupervised learning. This setting turns out to be closely related to principal curves, the generative topographic map, and robust coding. We explore this connection in two ways: (1) we propose an algorithm for finding principal manifolds that can be regularized in a variety of ways; and (2) we derive uniform convergence bounds and hence bounds on the learning rates of the algorithm. In particular, we give bounds on the covering numbers which allow us to obtain nearly optimal learning rates for certain types of regularization operators. Experimental results demonstrate the feasibility of the approach.
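The simplest instance of the quantization view is k-means (Lloyd's algorithm), whose empirical quantization error is non-increasing over iterations. The sketch below is plain k-means on toy data, not the regularized principal-manifold algorithm the paper proposes.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Lloyd's algorithm viewed as quantization: each step cannot increase
    the empirical quantization error (mean squared distance to the nearest
    codebook vector)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]       # initial codebook
    errors = []
    for _ in range(iters):
        d2 = ((X[:, None] - C[None, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        errors.append(d2[np.arange(len(X)), assign].mean())
        for j in range(k):                             # move codes to cell means
            if (assign == j).any():
                C[j] = X[assign == j].mean(axis=0)
    return C, errors

rng = np.random.default_rng(4)
X = np.r_[rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))]
C, errors = kmeans(X, k=4)
```

Replacing the finite codebook with a parameterized manifold and adding a regularization term on its smoothness turns this objective into the principal-manifold setting the paper analyzes.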
Journal of Computer-aided Molecular Design | 2007
Timon Schroeter; Anton Schwaighofer; Sebastian Mika; Antonius Ter Laak; Detlev Suelzle; Ursula Ganzer; Nikolaus Heinrich; Klaus-Robert Müller
We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate way to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble-based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how faithfully the individual error bars represent the actual prediction error.
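The Gaussian Process error bars mentioned here can be sketched in a few lines: the predictive standard deviation grows for queries far from the training data, which is what makes it usable as a DOA estimate. The kernel, length scale, noise level, and 1-D toy data are illustrative assumptions, not the paper's solubility models.

```python
import numpy as np

def rbf(A, B, ls=1.0):
    # Unit-amplitude RBF kernel between 1-D inputs
    return np.exp(-0.5 * ((A[:, None] - B[None, :]) / ls) ** 2)

def gp_predict(Xtr, ytr, Xte, ls=1.0, noise=0.1):
    """GP regression: predictive mean and standard deviation. The predictive
    std serves as an error bar that grows as a query moves away from the
    training data."""
    K = rbf(Xtr, Xtr, ls) + noise ** 2 * np.eye(len(Xtr))
    Ks = rbf(Xte, Xtr, ls)
    mean = Ks @ np.linalg.solve(K, ytr)
    var = 1.0 + noise ** 2 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)
    return mean, np.sqrt(np.maximum(var, 0.0))

rng = np.random.default_rng(5)
Xtr = rng.uniform(-3, 3, 40)
ytr = np.sin(Xtr) + 0.1 * rng.normal(size=40)

_, std_in  = gp_predict(Xtr, ytr, np.array([0.0]))    # inside the data range
_, std_out = gp_predict(Xtr, ytr, np.array([10.0]))   # far outside it
```

A query inside the data gets a tight error bar; one far outside reverts to the prior standard deviation, flagging that the model is being extrapolated beyond its domain of applicability.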