Karsten M. Borgwardt | Researchain

Publication

Featured research published by Karsten M. Borgwardt.

Nature | 2011

Spontaneous epigenetic variation in the Arabidopsis thaliana methylome

Claude Becker; Jörg Hagmann; Jonas Müller; Daniel Koenig; Oliver Stegle; Karsten M. Borgwardt; Detlef Weigel

Heritable epigenetic polymorphisms, such as differential cytosine methylation, can underlie phenotypic variation. Moreover, wild strains of the plant Arabidopsis thaliana differ in many epialleles, and these can influence the expression of nearby genes. However, to understand their role in evolution, it is imperative to ascertain the emergence rate and stability of epialleles, including those that are not due to structural variation. We have compared genome-wide DNA methylation among 10 A. thaliana lines, derived 30 generations ago from a common ancestor. Epimutations at individual positions were easily detected, and close to 30,000 cytosines in each strain were differentially methylated. In contrast, larger regions of contiguous methylation were much more stable, and the frequency of changes was in the same low range as that of DNA mutations. Like individual positions, the same regions were often affected by differential methylation in independent lines, with evidence for recurrent cycles of forward and reverse mutations. Transposable elements and short interfering RNAs have been causally linked to DNA methylation. In agreement, differentially methylated sites were farther from transposable elements and showed less association with short interfering RNA expression than invariant positions. The biased distribution and frequent reversion of epimutations have important implications for the potential contribution of sequence-independent epialleles to plant evolution.

International Conference on Data Mining | 2005

Shortest-path kernels on graphs

Karsten M. Borgwardt; Hans-Peter Kriegel

Data mining algorithms are facing the challenge to deal with an increasing number of complex objects. For graph data, a whole toolbox of data mining algorithms becomes available by defining a kernel function on instances of graphs. Graph kernels based on walks, subtrees and cycles in graphs have been proposed so far. As a general problem, these kernels are either computationally expensive or limited in their expressiveness. We try to overcome this problem by defining expressive graph kernels which are based on paths. As the computation of all paths and longest paths in a graph is NP-hard, we propose graph kernels based on shortest paths. These kernels are computable in polynomial time, retain expressivity and are still positive definite. In experiments on classification of graph models of proteins, our shortest-path kernels show significantly higher classification accuracy than walk-based kernels.
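The idea of comparing graphs through their shortest paths can be sketched in a few lines. This is only an illustration under simplifying assumptions that the abstract does not state: the graphs are undirected, unweighted and unlabeled, all-pairs distances come from the Floyd-Warshall algorithm, and two shortest paths count as matching when they have the same length. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

INF = float("inf")


def floyd_warshall(n, edges):
    """All-pairs shortest-path lengths for an undirected, unweighted graph."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def path_length_histogram(n, edges):
    """Count connected node pairs by the length of their shortest path."""
    dist = floyd_warshall(n, edges)
    return Counter(
        dist[i][j] for i, j in combinations(range(n), 2) if dist[i][j] < INF
    )


def shortest_path_kernel(g1, g2):
    """Number of shortest-path pairs (one from each graph) with equal length.

    Assumption: an equality (delta) comparison on path lengths only.
    Each graph is given as a tuple (number_of_nodes, edge_list).
    """
    h1 = path_length_histogram(*g1)
    h2 = path_length_histogram(*g2)
    return sum(count * h2[length] for length, count in h1.items())


path = (3, [(0, 1), (1, 2)])
triangle = (3, [(0, 1), (1, 2), (0, 2)])
print(shortest_path_kernel(path, triangle))
```

Because the value is an inner product of two path-length histograms, this simplified variant is positive definite, and the cubic-time Floyd-Warshall step keeps the whole computation polynomial.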

Intelligent Systems in Molecular Biology | 2005

Protein function prediction via graph kernels

Karsten M. Borgwardt; Cheng Soon Ong; Stefan Schönauer; S. V. N. Vishwanathan; Alexander J. Smola; Hans-Peter Kriegel

Motivation: Computational approaches to protein function prediction infer protein function by finding proteins with similar sequence, structure, surface clefts, chemical properties, amino acid motifs, interaction partners or phylogenetic profiles. We present a new approach that combines sequential, structural and chemical information into one graph model of proteins. We predict functional class membership of enzymes and non-enzymes using graph kernels and support vector machine classification on these protein graphs.

Results: Our graph model, derivable from protein sequence and structure only, is competitive with vector models that require additional protein information, such as the size of surface pockets. If we include this extra information into our graph model, our classifier yields significantly higher accuracy levels than the vector models. Hyperkernels allow us to select and to optimally combine the most relevant node attributes in our protein graphs. We have laid the foundation for a protein function prediction system that integrates protein information from various sources efficiently and effectively.

Availability: More information available via www.dbs.ifi.lmu.de/Mitarbeiter/borgwardt.html.

Nature | 2008

The genome of the simian and human malaria parasite Plasmodium knowlesi

Arnab Pain; Ulrike Böhme; Andrew Berry; Karen Mungall; Robert D. Finn; Andrew P. Jackson; T. Mourier; J. Mistry; E. M. Pasini; Martin Aslett; S. Balasubrammaniam; Karsten M. Borgwardt; Karen Brooks; Celine Carret; Tim Carver; Inna Cherevach; Tracey Chillingworth; Taane G. Clark; M. R. Galinski; Neil Hall; D. Harper; David Harris; Heidi Hauser; A. Ivens; C. S. Janssen; Thomas M. Keane; N. Larke; S. Lapp; M. Marti; S. Moule

Plasmodium knowlesi is an intracellular malaria parasite whose natural vertebrate host is Macaca fascicularis (the ‘kra’ monkey); however, it is now increasingly recognized as a significant cause of human malaria, particularly in southeast Asia. Plasmodium knowlesi was the first malaria parasite species in which antigenic variation was demonstrated, and it has a close phylogenetic relationship to Plasmodium vivax, the second most important species of human malaria parasite (reviewed in ref. 4). Despite their relatedness, there are important phenotypic differences between them, such as host blood cell preference, absence of a dormant liver stage or ‘hypnozoite’ in P. knowlesi, and length of the asexual cycle (reviewed in ref. 4). Here we present an analysis of the P. knowlesi (H strain, Pk1(A+) clone) nuclear genome sequence. This is the first monkey malaria parasite genome to be described, and it provides an opportunity for comparison with the recently completed P. vivax genome and other sequenced Plasmodium genomes. In contrast to other Plasmodium genomes, putative variant antigen families are dispersed throughout the genome and are associated with intrachromosomal telomere repeats. One of these families, the KIRs, contains sequences that collectively match over one-half of the host CD99 extracellular domain, which may represent an unusual form of molecular mimicry.

Cell | 2016

1,135 Genomes Reveal the Global Pattern of Polymorphism in Arabidopsis thaliana

Carlos Alonso-Blanco; Jorge Andrade; Claude Becker; Felix Bemm; Joy Bergelson; Karsten M. Borgwardt; Jun Cao; Eunyoung Chae; Todd M. Dezwaan; Wei Ding; Joseph R. Ecker; Moises Exposito-Alonso; Ashley Farlow; Joffrey Fitz; Xiangchao Gan; Dominik Grimm; Angela M. Hancock; Stefan R. Henz; Svante Holm; Matthew Horton; Mike Jarsulic; Randall A. Kerstetter; Arthur Korte; Pamela Korte; Christa Lanz; Cheng-Ruei Lee; Dazhe Meng; Todd P. Michael; Richard Mott; Ni Wayan Muliyati

Arabidopsis thaliana serves as a model organism for the study of fundamental physiological, cellular, and molecular processes. It has also greatly advanced our understanding of intraspecific genome variation. We present a detailed map of variation in 1,135 high-quality re-sequenced natural inbred lines representing the native Eurasian and North African range and recently colonized North America. We identify relict populations that continue to inhabit ancestral habitats, primarily in the Iberian Peninsula. They have mixed with a lineage that has spread to northern latitudes from an unknown glacial refugium and is now found in a much broader spectrum of habitats. Insights into the history of the species and the fine-scale distribution of genetic diversity provide the basis for full exploitation of A. thaliana natural variation through integration of genomes and epigenomes with molecular and non-molecular phenotypes.

International Conference on Machine Learning | 2007

Supervised feature selection via dependence estimation

Le Song; Alexander J. Smola; Arthur Gretton; Karsten M. Borgwardt; Justin Bedo

We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximise such dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can be approximated using a backward-elimination algorithm. We demonstrate the usefulness of our method on both artificial and real world datasets.
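A rough sketch of backward elimination driven by HSIC is given below. It rests on assumptions the abstract does not specify: a linear kernel on both features and labels, the biased empirical estimate tr(KHLH)/(m-1)^2, and removal of one feature per step. The function names are hypothetical and this should not be read as the authors' implementation.

```python
def linear_gram(rows):
    """Gram matrix of a linear kernel over a list of feature vectors."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]


def center(K):
    """Double-center a symmetric Gram matrix, i.e. compute HKH."""
    m = len(K)
    row_mean = [sum(r) / m for r in K]
    total_mean = sum(row_mean) / m
    return [
        [K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(m)]
        for i in range(m)
    ]


def hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (m - 1)^2."""
    m = len(K)
    Kc = center(K)
    trace = sum(Kc[i][j] * L[i][j] for i in range(m) for j in range(m))
    return trace / (m - 1) ** 2


def backward_elimination(X, y, keep):
    """Greedily drop the feature whose removal leaves HSIC highest.

    Assumption: one feature is removed per iteration until `keep` remain.
    """
    L = linear_gram([[label] for label in y])
    selected = list(range(len(X[0])))

    def score_without(feature):
        cols = [c for c in selected if c != feature]
        return hsic(linear_gram([[x[c] for c in cols] for x in X]), L)

    while len(selected) > keep:
        selected.remove(max(selected, key=score_without))
    return selected


# Feature 0 copies the label; feature 1 is unrelated to it.
X = [[-1.0, 0.5], [-1.0, -0.5], [1.0, 0.5], [1.0, -0.5]]
y = [-1.0, -1.0, 1.0, 1.0]
print(backward_elimination(X, y, keep=1))
```

With a linear kernel the criterion reduces to a sum of squared feature-label covariances, so the example is deliberately simple; nonlinear kernels would be substituted in `linear_gram` to capture richer dependence.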

The Plant Cell | 2012

Arabidopsis defense against Botrytis cinerea: chronology and regulation deciphered by high-resolution temporal transcriptomic analysis

Oliver P. Windram; Priyadharshini Madhou; Stuart McHattie; Claire Hill; Richard Hickman; Emma J. Cooke; Dafyd J. Jenkins; Christopher A. Penfold; Laura Baxter; Emily Breeze; Steven John Kiddle; Johanna Rhodes; Susanna Atwell; Daniel J. Kliebenstein; Youn-sung Kim; Oliver Stegle; Karsten M. Borgwardt; Cunjin Zhang; Alex Tabrett; Roxane Legaie; Jonathan D. Moore; Bärbel Finkenstädt; David L. Wild; A. Mead; David A. Rand; Jim Beynon; Sascha Ott; Vicky Buchanan-Wollaston; Katherine J. Denby

The authors generated a high-resolution time series of Arabidopsis thaliana gene expression following infection with the fungal pathogen Botrytis cinerea. Computational analysis of this large data set identified the timing of specific processes and regulatory events in the host plant and showed a role for the transcription factor TGA3 in the defense response against the fungal pathogen.

Transcriptional reprogramming forms a major part of a plant’s response to pathogen infection. Many individual components and pathways operating during plant defense have been identified, but our knowledge of how these different components interact is still rudimentary. We generated a high-resolution time series of gene expression profiles from a single Arabidopsis thaliana leaf during infection by the necrotrophic fungal pathogen Botrytis cinerea. Approximately one-third of the Arabidopsis genome is differentially expressed during the first 48 h after infection, with the majority of changes in gene expression occurring before significant lesion development. We used computational tools to obtain a detailed chronology of the defense response against B. cinerea, highlighting the times at which signaling and metabolic processes change, and identify transcription factor families operating at different times after infection. Motif enrichment and network inference predicted regulatory interactions, and testing of one such prediction identified a role for TGA3 in defense against necrotrophic pathogens. These data provide an unprecedented level of detail about transcriptional changes during a defense response and are suited to systems biology analyses to generate predictive models of the gene regulatory networks mediating the Arabidopsis response to B. cinerea.

Data Mining and Knowledge Discovery | 2007

Future trends in data mining

Hans-Peter Kriegel; Karsten M. Borgwardt; Peer Kröger; Alexey Pryakhin; Matthias Schubert; Arthur Zimek

Over recent years data mining has been establishing itself as one of the major disciplines in computer science with growing industrial impact. Undoubtedly, research in data mining will continue and even increase over coming decades. In this article, we sketch our vision of the future of data mining. Starting from the classic definition of “data mining”, we elaborate on topics that — in our opinion — will set trends in data mining.

International Conference on Data Mining | 2006

Pattern Mining in Frequent Dynamic Subgraphs

Karsten M. Borgwardt; Hans-Peter Kriegel; Peter Wackersreuther

Graph-structured data is becoming increasingly abundant in many application domains. Graph mining aims at finding interesting patterns within this data that represent novel knowledge. While current data mining deals with static graphs that do not change over time, coming years will see the advent of an increasing number of time series of graphs. In this article, we investigate how pattern mining on static graphs can be extended to time series of graphs. In particular, we are considering dynamic graphs with edge insertions and edge deletions over time. We define frequency in this setting and provide algorithmic solutions for finding frequent dynamic subgraph patterns. Existing subgraph mining algorithms can be easily integrated into our framework to make them handle dynamic graphs. Experimental results on real-world data confirm the practical feasibility of our approach.

International Conference on Machine Learning | 2007

A dependence maximization view of clustering

Le Song; Alexander J. Smola; Arthur Gretton; Karsten M. Borgwardt

We propose a family of clustering algorithms based on the maximization of dependence between the input variables and their cluster labels, as expressed by the Hilbert-Schmidt Independence Criterion (HSIC). Under this framework, we unify the geometric, spectral, and statistical dependence views of clustering, and subsume many existing algorithms as special cases (e.g. k-means and spectral clustering). Distinctive to our framework is that kernels can also be applied on the labels, which can endow them with particular structures. We also obtain a perturbation bound on the change in k-means clustering.
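One way to see how k-means can arise as a dependence-maximization special case is the standard identity checked numerically below. The sketch assumes a linear kernel K on the inputs and a label kernel L with L[i][j] = 1/n_c when points i and j share cluster c (0 otherwise); neither choice is spelled out in the abstract, and the centering step of HSIC is omitted for brevity. Under these assumptions, the within-cluster sum of squares equals tr(K) - tr(KL), so maximizing tr(KL) over labelings minimizes the k-means objective.

```python
from collections import Counter


def gram(points):
    """Linear-kernel Gram matrix of the input points."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]


def label_kernel(labels):
    """Assumed label kernel: 1/cluster_size for same-cluster pairs, else 0."""
    sizes = Counter(labels)
    return [[1.0 / sizes[a] if a == b else 0.0 for b in labels] for a in labels]


def dependence(points, labels):
    """tr(K L), the uncentered dependence score of a labeling."""
    K, L = gram(points), label_kernel(labels)
    n = len(points)
    return sum(K[i][j] * L[i][j] for i in range(n) for j in range(n))


def within_cluster_ss(points, labels):
    """Classic k-means objective: squared distances to cluster means."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        mean = [sum(coord) / len(members) for coord in zip(*members)]
        total += sum(
            sum((x - m) ** 2 for x, m in zip(p, mean)) for p in members
        )
    return total


points = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
trace_K = sum(sum(x * x for x in p) for p in points)
for labels in ([0, 0, 1, 1], [0, 1, 0, 1]):
    # The two printed values agree for every labeling.
    print(within_cluster_ss(points, labels), trace_K - dependence(points, labels))
```

The labeling that groups nearby points has the larger dependence score and the smaller within-cluster sum of squares, which is the sense in which the two objectives coincide here.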