Publication


Featured research published by Oliver Kohlbacher.


Nature | 2013

Charting a dynamic DNA methylation landscape of the human genome

Michael J. Ziller; Hongcang Gu; Fabian Müller; Julie Donaghey; Linus T.-Y. Tsai; Oliver Kohlbacher; Philip L. De Jager; Evan D. Rosen; David A. Bennett; Bradley E. Bernstein; Andreas Gnirke; Alexander Meissner

DNA methylation is a defining feature of mammalian cellular identity and is essential for normal development. Most cell types, except germ cells and pre-implantation embryos, display relatively stable DNA methylation patterns, with 70–80% of all CpGs being methylated. Despite recent advances, we still have a limited understanding of when, where and how many CpGs participate in genomic regulation. Here we report the in-depth analysis of 42 whole-genome bisulphite sequencing data sets across 30 diverse human cell and tissue types. We observe dynamic regulation for only 21.8% of autosomal CpGs within a normal developmental context, most of which are distal to transcription start sites. These dynamic CpGs co-localize with gene regulatory elements, particularly enhancers and transcription-factor-binding sites, which allow identification of key lineage-specific regulators. In addition, differentially methylated regions (DMRs) often contain single nucleotide polymorphisms associated with cell-type-related diseases as determined by genome-wide association studies. The results also highlight the general inefficiency of whole-genome bisulphite sequencing, as 70–80% of the sequencing reads across these data sets provided little or no relevant information about CpG methylation. To demonstrate further the utility of our DMR set, we use it to classify unknown samples and identify representative signature regions that recapitulate major DNA methylation dynamics. In summary, although in theory every CpG can change its methylation state, our results suggest that only a fraction does so as part of coordinated regulatory programs. Therefore, our selected DMRs can serve as a starting point to guide new, more effective reduced representation approaches to capture the most informative fraction of CpGs, as well as further pinpoint putative regulatory elements.


BMC Bioinformatics | 2008

OpenMS – An open-source software framework for mass spectrometry

Marc Sturm; Andreas Bertsch; Clemens Gröpl; Andreas Hildebrandt; Rene Hussong; Eva Lange; Nico Pfeifer; Ole Schulz-Trieglaff; Alexandra Zerck; Knut Reinert; Oliver Kohlbacher

Background: Mass spectrometry is an essential analytical technique for high-throughput analysis in proteomics and metabolomics. The development of new separation techniques, precise mass analyzers and experimental protocols is a very active field of research. This leads to more complex experimental setups yielding ever increasing amounts of data. Consequently, analysis of the data is currently often the bottleneck for experimental studies. Although software tools for many data analysis tasks are available today, they are often hard to combine with each other or not flexible enough to allow for rapid prototyping of a new analysis workflow. Results: We present OpenMS, a software framework for rapid application development in mass spectrometry. OpenMS has been designed to be portable, easy-to-use and robust while offering a rich functionality ranging from basic data structures to sophisticated algorithms for data analysis. This has already been demonstrated in several studies. Conclusion: OpenMS is available under the Lesser GNU Public License (LGPL) from the project website at http://www.openms.de.


Nucleic Acids Research | 2005

Specificity prediction of adenylation domains in nonribosomal peptide synthetases (NRPS) using transductive support vector machines (TSVMs)

Christian Rausch; Tilmann Weber; Oliver Kohlbacher; Wolfgang Wohlleben; Daniel H. Huson

We present a new support vector machine (SVM)-based approach to predict the substrate specificity of subtypes of a given protein sequence family. We demonstrate the usefulness of this method on the example of aryl acid-activating and amino acid-activating adenylation domains (A domains) of nonribosomal peptide synthetases (NRPS). The residues of gramicidin synthetase A that are 8 Å around the substrate amino acid and corresponding positions of other adenylation domain sequences with 397 known and unknown specificities were extracted and used to encode this physico-chemical fingerprint into normalized real-valued feature vectors based on the physico-chemical properties of the amino acids. The SVM software package SVMlight was used for training and classification, with transductive SVMs to take advantage of the information inherent in unlabeled data. Specificities for very similar substrates that frequently show cross-specificities were pooled to the so-called composite specificities and predictive models were built for them. The reliability of the models was confirmed in cross-validations and in comparison with a currently used sequence-comparison-based method. When comparing the predictions for 1230 NRPS A domains that are currently detectable in UniProt, the new method was able to give a specificity prediction in an additional 18% of the cases compared with the old method. For 70% of the sequences both methods agreed, for <6% they did not, mainly on low-confidence predictions by the existing method. None of the predictive methods could infer any specificity for 2.4% of the sequences, suggesting completely new types of specificity.


Nucleic Acids Research | 2011

NRPSpredictor2-a web server for predicting NRPS adenylation domain specificity

Marc Röttig; Marnix H. Medema; Kai Blin; Tilmann Weber; Christian Rausch; Oliver Kohlbacher

The products of many bacterial non-ribosomal peptide synthetases (NRPS) are highly important secondary metabolites, including vancomycin and other antibiotics. The ability to predict substrate specificity of newly detected NRPS Adenylation (A-) domains by genome sequencing efforts is of great importance to identify and annotate new gene clusters that produce secondary metabolites. Prediction of A-domain specificity based on the sequence alone can be achieved through sequence signatures or, more accurately, through machine learning methods. We present an improved predictor, based on previous work (NRPSpredictor), that predicts A-domain specificity using Support Vector Machines on four hierarchical levels, ranging from gross physicochemical properties of an A-domain’s substrates down to single amino acid substrates. The three more general levels are predicted with an F-measure better than 0.89 and the most detailed level with an average F-measure of 0.80. We also modeled the applicability domain of our predictor to estimate for new A-domains whether they lie in the applicability domain. Finally, since there are also NRPS that play an important role in natural products chemistry of fungi, such as peptaibols and cephalosporins, we added a predictor for fungal A-domains, which predicts gross physicochemical properties with an F-measure of 0.84. The service is available at http://nrps.informatik.uni-tuebingen.de/.
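The F-measure values quoted in this abstract combine precision and recall into a single score. The following is a minimal sketch of that metric, assuming the standard F-beta definition with beta = 1 (the abstract does not state which variant was used). The function name `f_measure` and the example counts are hypothetical and are not taken from NRPSpredictor2.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """Return the F-beta score from true-positive, false-positive and
    false-negative counts. beta=1.0 gives the harmonic mean of precision
    and recall (an assumption; the abstract does not specify beta)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        # No correct predictions at all: define the score as zero.
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one specificity class.
score = f_measure(tp=8, fp=2, fn=2)
```

For a multi-class predictor, one common choice is to compute this score per class and average the results, which may be what "average F-measure" refers to here.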


Bioinformatics | 2006

MultiLoc: prediction of protein subcellular localization using N-terminal targeting sequences, sequence motifs and amino acid composition

Annette Höglund; Pierre Dönnes; Torsten Blum; Hans-Werner Adolph; Oliver Kohlbacher

Motivation: Functional annotation of unknown proteins is a major goal in proteomics. A key annotation is the prediction of a protein's subcellular localization. Numerous prediction techniques have been developed, typically focusing on a single underlying biological aspect or predicting a subset of all possible localizations. An important step is taken towards emulating the protein sorting process by capturing and bringing together biologically relevant information, and addressing the clear need to improve prediction accuracy and localization coverage. Results: Here we present a novel SVM-based approach for predicting subcellular localization, which integrates N-terminal targeting sequences, amino acid composition and protein sequence motifs. We show how this approach improves the prediction based on N-terminal targeting sequences, by comparing our method TargetLoc against existing methods. Furthermore, MultiLoc performs considerably better than comparable methods predicting all major eukaryotic subcellular localizations, and shows better or comparable results to methods that are specialized on fewer localizations or for one organism. Availability: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc/
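Amino acid composition, one of the three feature types named in this abstract, can be illustrated with a short sketch. It assumes the simplest encoding, a 20-dimensional vector of relative residue frequencies. The name `amino_acid_composition` and the example sequence are hypothetical, and MultiLoc's actual feature encoding may differ.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list of relative frequencies, one per standard
    amino acid, in the order given by AMINO_ACIDS. Non-standard symbols
    (for example 'X') are ignored."""
    counts = {aa: 0 for aa in AMINO_ACIDS}
    total = 0
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
            total += 1
    if total == 0:
        # Empty or fully non-standard input: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence; real inputs would be full protein sequences.
features = amino_acid_composition("AACD")
```

A vector like this could be passed to any classifier that accepts fixed-length numeric input, such as an SVM.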


eLife | 2014

Sequence co-evolution gives 3D contacts and structures of protein complexes

Thomas A. Hopf; Charlotta Schärfe; João Garcia Lopes Maia Rodrigues; Anna G. Green; Oliver Kohlbacher; Chris Sander; Alexandre M. J. J. Bonvin; Debora S. Marks

Protein–protein interactions are fundamental to many biological processes. Experimental screens have identified tens of thousands of interactions, and structural biology has provided detailed functional insight for select 3D protein complexes. An alternative rich source of information about protein interactions is the evolutionary sequence record. Building on earlier work, we show that analysis of correlated evolutionary sequence changes across proteins identifies residues that are close in space with sufficient accuracy to determine the three-dimensional structure of the protein complexes. We evaluate prediction performance in blinded tests on 76 complexes of known 3D structure, predict protein–protein contacts in 32 complexes of unknown structure, and demonstrate how evolutionary couplings can be used to distinguish between interacting and non-interacting protein pairs in a large complex. With the current growth of sequences, we expect that the method can be generalized to genome-wide elucidation of protein–protein interaction networks and used for interaction predictions at residue resolution. DOI: http://dx.doi.org/10.7554/eLife.03430.001


Cell | 2013

Transcriptional and epigenetic dynamics during specification of human embryonic stem cells

Casey A. Gifford; Michael J. Ziller; Hongcang Gu; Cole Trapnell; Julie Donaghey; Alexander M. Tsankov; Alex K. Shalek; David R. Kelley; Alexander A. Shishkin; Robbyn Issner; Xiaolan Zhang; Michael J. Coyne; Jennifer L. Fostel; Laurie Holmes; Jim Meldrim; Mitchell Guttman; Charles B. Epstein; Hongkun Park; Oliver Kohlbacher; John L. Rinn; Andreas Gnirke; Eric S. Lander; Bradley E. Bernstein; Alexander Meissner

Differentiation of human embryonic stem cells (hESCs) provides a unique opportunity to study the regulatory mechanisms that facilitate cellular transitions in a human context. To that end, we performed comprehensive transcriptional and epigenetic profiling of populations derived through directed differentiation of hESCs representing each of the three embryonic germ layers. Integration of whole-genome bisulfite sequencing, chromatin immunoprecipitation sequencing, and RNA sequencing reveals unique events associated with specification toward each lineage. Lineage-specific dynamic alterations in DNA methylation and H3K4me1 are evident at putative distal regulatory elements that are frequently bound by pluripotency factors in the undifferentiated hESCs. In addition, we identified germ-layer-specific H3K27me3 enrichment at sites exhibiting high DNA methylation in the undifferentiated state. A better understanding of these initial specification events will facilitate identification of deficiencies in current approaches, leading to more faithful differentiation strategies as well as providing insights into the rewiring of human regulatory programs during cellular transitions.


Nucleic Acids Research | 2010

YLoc—an interpretable web server for predicting subcellular localization

Sebastian Briesemeister; Jörg Rahnenführer; Oliver Kohlbacher

Predicting subcellular localization has become a valuable alternative to time-consuming experimental methods. Major drawbacks of many of these predictors are their lack of interpretability and the fact that they do not provide an estimate of the confidence of an individual prediction. We present YLoc, an interpretable web server for predicting subcellular localization. YLoc uses natural language to explain why a prediction was made and which biological property of the protein was mainly responsible for it. In addition, YLoc estimates the reliability of its own predictions. YLoc can, thus, assist in understanding protein localization and in location engineering of proteins. The YLoc web server is available online at www.multiloc.org/YLoc.


BMC Bioinformatics | 2009

MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction

Torsten Blum; Sebastian Briesemeister; Oliver Kohlbacher

Background: Knowledge of subcellular localization of proteins is crucial to proteomics, drug target discovery and systems biology since localization and biological function are highly correlated. In recent years, numerous computational prediction methods have been developed. Nevertheless, there is still a need for prediction methods that show more robustness and higher accuracy. Results: We extended our previous MultiLoc predictor by incorporating phylogenetic profiles and Gene Ontology terms. Two different datasets were used for training the system, resulting in two versions of this high-accuracy prediction method. One version is specialized for globular proteins and predicts up to five localizations, whereas a second version covers all eleven main eukaryotic subcellular localizations. In a benchmark study with five localizations, MultiLoc2 performs considerably better than other methods for animal and plant proteins and comparably for fungal proteins. Furthermore, MultiLoc2 performs clearly better when using a second dataset that extends the benchmark study to all eleven main eukaryotic subcellular localizations. Conclusion: MultiLoc2 is an extensive high-performance subcellular protein localization prediction system. By incorporating phylogenetic profiles and Gene Ontology terms, MultiLoc2 yields higher accuracies compared to its previous version. Moreover, it outperforms other prediction systems in two benchmark studies. MultiLoc2 is available as a user-friendly and free web service at: http://www-bs.informatik.uni-tuebingen.de/Services/MultiLoc2.


Genome Biology | 2009

Simultaneous alignment of short reads against multiple genomes

Korbinian Schneeberger; Jörg Hagmann; Stephan Ossowski; Norman Warthmann; Sandra Gesing; Oliver Kohlbacher; Detlef Weigel

Genome resequencing with short reads generally relies on alignments against a single reference. GenomeMapper supports simultaneous mapping of short reads against multiple genomes by integrating related genomes (e.g., individuals of the same species) into a single graph structure. It constitutes the first approach for handling multiple references and introduces representations for alignments against complex structures. Demonstrated benefits include access to polymorphisms that cannot be identified by alignments against the reference alone. Download GenomeMapper at http://1001genomes.org.

Collaboration


Dive into Oliver Kohlbacher's collaborations.

Top Co-Authors

Knut Reinert

Free University of Berlin

Jens Krüger

University of Tübingen

Marc Sturm

University of Tübingen

Richard Grunzke

Dresden University of Technology
