Gunnar Rätsch
ETH Zurich
Publications
Featured research published by Gunnar Rätsch.
IEEE Transactions on Neural Networks | 2001
Klaus-Robert Müller; Sebastian Mika; Gunnar Rätsch; Koji Tsuda; Bernhard Schölkopf
This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples of successful kernel-based learning methods. We first give a short background on Vapnik-Chervonenkis theory and kernel feature spaces and then proceed to kernel-based learning in supervised and unsupervised scenarios, including practical and algorithmic considerations. We illustrate the usefulness of kernel algorithms by discussing applications such as optical character recognition and DNA analysis.
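As an illustration of the kernel-based learning this abstract surveys, here is a minimal kernel PCA sketch in numpy. The Gaussian (RBF) kernel and all function names are assumptions chosen for illustration, not code from the paper:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), computed for all pairs of rows
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def kernel_pca(X, n_components=2, gamma=1.0):
    # Centre the kernel matrix in feature space, then eigendecompose it;
    # the top eigenvectors give the nonlinear principal components.
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one   # feature-space centering
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                           # projections of training points

X = np.random.RandomState(0).randn(30, 5)
Z = kernel_pca(X, n_components=2, gamma=0.5)
print(Z.shape)  # (30, 2)
```

The same kernel matrix machinery underlies the SVM and kernel Fisher discriminant cases discussed in the paper: only the objective solved on top of `K` changes.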
ieee workshop on neural networks for signal processing | 1999
Sebastian Mika; Gunnar Rätsch; Jason Weston; Bernhard Schölkopf; K.R. Mullers
A non-linear classification technique based on Fisher's discriminant is proposed. The main ingredient is the kernel trick, which allows the efficient computation of the Fisher discriminant in feature space. The linear classification in feature space corresponds to a (powerful) non-linear decision function in input space. Large-scale simulations demonstrate the competitiveness of our approach.
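A minimal numpy sketch of the kernel Fisher discriminant described above, assuming an RBF kernel and a small ridge regularizer; all names are hypothetical and this is not the authors' implementation:

```python
import numpy as np

def kfd(K, y, reg=1e-3):
    # Kernel Fisher discriminant: solve N alpha = (m_+ - m_-), where the
    # class means and within-class scatter are expressed through the
    # kernel matrix K; reg regularizes the ill-conditioned scatter matrix.
    pos, neg = (y == 1), (y == -1)
    m_diff = K[:, pos].mean(axis=1) - K[:, neg].mean(axis=1)
    N = reg * np.eye(len(y))
    for idx in (pos, neg):
        Kc = K[:, idx]
        n = idx.sum()
        N += Kc @ (np.eye(n) - np.ones((n, n)) / n) @ Kc.T
    return np.linalg.solve(N, m_diff)

# Toy use: two Gaussian blobs, RBF kernel
rng = np.random.RandomState(1)
X = np.vstack([rng.randn(20, 2) + 2.0, rng.randn(20, 2) - 2.0])
y = np.array([1] * 20 + [-1] * 20)
sq = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
K = np.exp(-0.5 * sq)
alpha = kfd(K, y)
proj = K @ alpha                      # f(x_i) up to a bias term
b = -(proj[y == 1].mean() + proj[y == -1].mean()) / 2
acc = np.mean(np.sign(proj + b) == y)
```

The linear projection `K @ alpha` in feature space is exactly the non-linear decision function in input space that the abstract refers to.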
IEEE Transactions on Neural Networks | 1999
Bernhard Schölkopf; Sebastian Mika; Christopher J. C. Burges; Phil Knirsch; Klaus-Robert Müller; Gunnar Rätsch; Alexander J. Smola
This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use them to reduce the computational complexity of SV decision functions; second, we combine them with the Kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.
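For a Gaussian kernel, the approximate-preimage problem the abstract discusses admits a simple fixed-point iteration. This is an illustrative numpy version under that assumption, with hypothetical names, not the paper's implementation:

```python
import numpy as np

def rbf(X, z, gamma=1.0):
    # k(x_i, z) = exp(-gamma * ||x_i - z||^2) for every row of X
    return np.exp(-gamma * np.sum((X - z) ** 2, axis=1))

def preimage(X, coeffs, gamma=1.0, n_iter=100):
    # Approximate preimage of the feature-space vector
    # Psi = sum_i coeffs[i] * Phi(x_i) under a Gaussian kernel, via
    # z <- sum_i coeffs[i] k(x_i, z) x_i / sum_i coeffs[i] k(x_i, z)
    z = X[np.argmax(np.abs(coeffs))].copy()   # start at the heaviest point
    for _ in range(n_iter):
        w = coeffs * rbf(X, z, gamma)
        s = w.sum()
        if abs(s) < 1e-12:                    # degenerate weights: give up
            break
        z = (w[:, None] * X).sum(axis=0) / s
    return z

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
# Expansion supported on the tight cluster only: the preimage lands inside it
z = preimage(X, np.array([1.0, 1.0, 1.0, 0.0]), gamma=1.0)
```

In kernel PCA denoising, `coeffs` would come from projecting a noisy point onto the leading principal components; the preimage is then the denoised point in input space.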
Machine Learning | 2001
Gunnar Rätsch; Takashi Onoda; Klaus-Robert Müller
Recently, ensemble methods like AdaBoost have been applied successfully to many problems, while seemingly defying the problem of overfitting. AdaBoost rarely overfits in the low-noise regime; however, we show that it clearly does so for higher noise levels. Central to the understanding of this fact is the margin distribution. AdaBoost can be viewed as a constrained gradient descent in an error function with respect to the margin. We find that AdaBoost asymptotically achieves a hard margin distribution, i.e. the algorithm concentrates its resources on a few hard-to-learn patterns that are, interestingly, very similar to support vectors. A hard margin is clearly a sub-optimal strategy in the noisy case, and regularization, in our case a “mistrust” in the data, must be introduced in the algorithm to alleviate the distortions that single difficult patterns (e.g. outliers) can cause to the margin distribution. We propose several regularization methods and generalizations of the original AdaBoost algorithm to achieve a soft margin. In particular, we suggest (1) regularized AdaBoost_Reg, where the gradient descent is done directly with respect to the soft margin, and (2) regularized linear and quadratic programming (LP/QP-) AdaBoost, where the soft margin is attained by introducing slack variables. Extensive simulations demonstrate that the proposed regularized AdaBoost-type algorithms are useful and yield competitive results for noisy data.
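A minimal AdaBoost sketch with threshold stumps shows the weight update whose concentration on hard-to-learn patterns the abstract analyzes. This is toy code for the original algorithm, not the authors' regularized variants:

```python
import numpy as np

def fit_adaboost(X, y, n_rounds=10):
    # Minimal AdaBoost with axis-aligned threshold stumps.  The exponential
    # weight update shifts mass onto misclassified patterns, which is what
    # drives the hard-margin behaviour discussed in the paper.
    n = len(y)
    w = np.ones(n) / n
    model = []
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):                 # search all stumps
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] > thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign, pred)
        err, j, thr, sign, pred = best
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)       # learner weight
        w = w * np.exp(-alpha * y * pred)           # re-weight patterns
        w = w / w.sum()
        model.append((alpha, j, thr, sign))
    return model

def predict(model, X):
    f = np.zeros(len(X))
    for alpha, j, thr, sign in model:
        f += alpha * sign * np.where(X[:, j] > thr, 1, -1)
    return np.sign(f)

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])
model = fit_adaboost(X, y, n_rounds=5)
acc = np.mean(predict(model, X) == y)
```

The regularized variants proposed in the paper modify exactly the re-weighting step, introducing a mistrust term so that single outliers cannot dominate `w`.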
international conference on artificial neural networks | 1997
Klaus-Robert Müller; Alexander J. Smola; Gunnar Rätsch; Bernhard Schölkopf; Jens Kohlmorgen; Vladimir Vapnik
Support Vector Machines are used for time series prediction and compared to radial basis function networks. We make use of two different cost functions for Support Vectors: training with (i) an ε-insensitive loss and (ii) Huber's robust loss function, and discuss how to choose the regularization parameters in these models. Two applications are considered: data from (a) a noisy (normal and uniform noise) Mackey-Glass equation and (b) the Santa Fe competition (set D). In both cases Support Vector Machines show an excellent performance. In case (b) the Support Vector approach improves the best known result on the benchmark by 29%.
Proceedings of the National Academy of Sciences of the United States of America | 2009
Kenneth L. McNally; Kevin L. Childs; Regina Bohnert; Rebecca M. Davidson; Keyan Zhao; Victor Jun Ulat; Georg Zeller; Richard M. Clark; Douglas R. Hoen; Thomas E. Bureau; Renee Stokowski; Dennis G. Ballinger; Kelly A. Frazer; D. R. Cox; Badri Padhukasahasram; Carlos Bustamante; Detlef Weigel; David J. Mackill; Richard Bruskiewich; Gunnar Rätsch; C. Robin Buell; Hei Leung; Jan E. Leach
Rice, the primary source of dietary calories for half of humanity, is the first crop plant for which a high-quality reference genome sequence from a single variety was produced. We used resequencing microarrays to interrogate 100 Mb of the unique fraction of the reference genome for 20 diverse varieties and landraces that capture the impressive genotypic and phenotypic diversity of domesticated rice. Here, we report the distribution of 160,000 nonredundant SNPs. Introgression patterns of shared SNPs revealed the breeding history and relationships among the 20 varieties; some introgressed regions are associated with agronomic traits that mark major milestones in rice improvement. These comprehensive SNP data provide a foundation for deep exploration of rice diversity and gene–trait relationships and their use for future rice improvement.
Nature | 2011
Xiangchao Gan; Oliver Stegle; Jonas Behr; Joshua G. Steffen; Philipp Drewe; Katie L. Hildebrand; Rune Lyngsoe; Sebastian J. Schultheiss; Edward J. Osborne; Vipin T. Sreedharan; André Kahles; Regina Bohnert; Géraldine Jean; Paul S. Derwent; Paul J. Kersey; Eric J. Belfield; Nicholas P. Harberd; Eric Kemen; Christopher Toomajian; Paula X. Kover; Richard M. Clark; Gunnar Rätsch; Richard Mott
Genetic differences between Arabidopsis thaliana accessions underlie the plant’s extensive phenotypic variation, and until now these have been interpreted largely in the context of the annotated reference accession Col-0. Here we report the sequencing, assembly and annotation of the genomes of 18 natural A. thaliana accessions, and their transcriptomes. When assessed on the basis of the reference annotation, one-third of protein-coding genes are predicted to be disrupted in at least one accession. However, re-annotation of each genome revealed that alternative gene models often restore coding potential. Gene expression in seedlings differed for nearly half of expressed genes and was frequently associated with cis variants within 5 kilobases, as were intron retention alternative splicing events. Sequence and expression variation is most pronounced in genes that respond to the biotic environment. Our data further promote evolutionary and functional studies in A. thaliana, especially the MAGIC genetic reference population descended from these accessions.
PLOS Computational Biology | 2008
Asa Ben-Hur; Cheng Soon Ong; Sören Sonnenburg; Bernhard Schölkopf; Gunnar Rätsch
The increasing wealth of biological data coming from a large variety of platforms and the continued development of new high-throughput methods for probing biological systems require increasingly more sophisticated computational approaches. Putting all these data in simple-to-use databases is a first step; but realizing the full potential of the data requires algorithms that automatically extract regularities from the data, which can then lead to biological insight. Many of the problems in computational biology are in the form of prediction: starting from prediction of a gene's structure, prediction of its function, interactions, and role in disease. Support vector machines (SVMs) and related kernel methods are extremely good at solving such problems [1]–[3]. SVMs are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data [2], [4]–[6]. The simplest form of a prediction problem is binary classification: trying to discriminate between objects that belong to one of two categories—positive (+1) or negative (−1). SVMs use two key concepts to solve this problem: large margin separation and kernel functions. The idea of large margin separation can be motivated by classification of points in two dimensions (see Figure 1). A simple way to classify the points is to draw a straight line and call points lying on one side positive and on the other side negative. If the two sets are well separated, one would intuitively draw the separating line such that it is as far as possible away from the points in both sets (see Figures 2 and 3). This intuitive choice captures the idea of large margin separation, which is mathematically formulated in the section Classification with Large Margin.
Figure 1. A linear classifier separating two classes of points (squares and circles) in two dimensions. The decision boundary divides the space into two sets depending on the sign of f(x) = 〈w,x〉 + b. The grayscale level represents the value of the discriminant function f(x): dark for low values and a light shade for high values.
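The discriminant f(x) = 〈w,x〉 + b from the figure caption, as a minimal sketch with purely illustrative toy weights:

```python
import numpy as np

def f(w, b, X):
    # Discriminant f(x) = <w, x> + b; sign(f) is the predicted class and
    # |f| grows with distance from the decision boundary f(x) = 0.
    return X @ w + b

w = np.array([1.0, -1.0])
b = 0.0
X = np.array([[2.0, 0.0],    # positive side
              [0.0, 2.0],    # negative side
              [1.0, 1.0]])   # exactly on the boundary
values = f(w, b, X)          # [2.0, -2.0, 0.0]
labels = np.sign(values)     # [1.0, -1.0, 0.0]
```

Large margin separation then amounts to choosing `w` and `b` so that the training points' |f(x)| values, scaled by ‖w‖, are as large as possible.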
german conference on bioinformatics | 2000
Alexander Zien; Gunnar Rätsch; Sebastian Mika; Bernhard Schölkopf; Thomas Lengauer; Klaus-Robert Müller
MOTIVATION In order to extract protein sequences from nucleotide sequences, an important step is to recognize the points at which protein-coding regions start. These points are called translation initiation sites (TIS). RESULTS The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines to this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition.
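As an illustration of encoding prior sequence knowledge in a kernel, here is a simple spectrum (k-mer) kernel. Note this is only a stand-in example: the paper engineers a different, locality-aware kernel tailored to TIS recognition.

```python
from collections import Counter

def spectrum_kernel(s, t, k=3):
    # Inner product of k-mer count vectors: two sequences are similar if
    # they share many length-k substrings, a crude form of the local
    # sequence-composition knowledge a TIS kernel can encode.
    cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
    ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
    return sum(cs[m] * ct[m] for m in cs)

# Shared 3-mers: ATG (2 x 2), GCA, CAT
print(spectrum_kernel("ATGGCATG", "ATGCATGA", k=3))  # 6
```

Because this is an inner product in the space of k-mer counts, it is a valid kernel and can be plugged directly into an SVM for sequence classification.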
Proceedings of the National Academy of Sciences of the United States of America | 2008
Sascha Laubinger; Timo Sachsenberg; Georg Zeller; Wolfgang Busch; Jan U. Lohmann; Gunnar Rätsch; Detlef Weigel
The processing of Arabidopsis thaliana microRNAs (miRNAs) from longer primary transcripts (pri-miRNAs) requires the activity of several proteins, including DICER-LIKE1 (DCL1), the double-stranded RNA-binding protein HYPONASTIC LEAVES1 (HYL1), and the zinc finger protein SERRATE (SE). It has been noted before that the morphological appearance of weak se mutants is reminiscent of plants with mutations in ABH1/CBP80 and CBP20, which encode the two subunits of the nuclear cap-binding complex. We report that, like SE, the cap-binding complex is necessary for proper processing of pri-miRNAs. Inactivation of either ABH1/CBP80 or CBP20 results in decreased levels of mature miRNAs accompanied by apparent stabilization of pri-miRNAs. Whole-genome tiling array analyses reveal that se, abh1/cbp80, and cbp20 mutants also share similar splicing defects, leading to the accumulation of many partially spliced transcripts. This is unlikely to be an indirect consequence of improper miRNA processing or other mRNA turnover pathways, because introns retained in se, abh1/cbp80, and cbp20 mutants are not affected by mutations in other genes required for miRNA processing or for nonsense-mediated mRNA decay. Taken together, our results uncover dual roles in splicing and miRNA processing that distinguish SE and the cap-binding complex from specialized miRNA processing factors such as DCL1 and HYL1.