Publication


Featured research published by Richard Röttger.


Nature Methods | 2015

Comparing the Performance of Biomedical Clustering Methods

Christian Wiwie; Jan Baumbach; Richard Röttger

Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.


Nucleic Acids Research | 2012

CoryneRegNet 6.0—Updated database content, new analysis methods and novel features focusing on community demands

Josch Pauling; Richard Röttger; Andreas Tauch; Vasco Azevedo; Jan Baumbach

Post-genomic analysis techniques such as next-generation sequencing have produced vast amounts of data about microorganisms, including genetic sequences, their functional annotations and gene regulatory interactions. The latter are genetic mechanisms that control a cell's characteristics, for instance, pathogenicity as well as survival and reproduction strategies. CoryneRegNet is the reference database and analysis platform for corynebacterial gene regulatory networks. In this article we introduce the updated version 6.0 of CoryneRegNet and describe the updated database content, which includes 6352 corynebacterial regulatory interactions compared with 4928 interactions in release 5.0 and 3235 regulations in release 4.0. We also demonstrate how we support the community by integrating analysis and visualization features for transiently imported custom data, such as gene regulatory interactions. Furthermore, with release 6.0, we provide easy-to-use functions that allow the user to submit data for persistent storage with the CoryneRegNet database. Thus, it offers important options to its users in terms of community demands. CoryneRegNet is publicly available at http://www.coryneregnet.de.


Nucleic Acids Research | 2014

Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering

Peng Sun; Nora K. Speicher; Richard Röttger; Jiong Guo; Jan Baumbach

The explosion of biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as ‘simultaneous clustering’ or ‘co-clustering’, has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: ‘Bi-Force’. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expression data. Brief. Bioinform., 14:279–292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de.


IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2012

How Little Do We Actually Know? On the Size of Gene Regulatory Networks

Richard Röttger; Ulrich Rückert; Jan Taubert; Jan Baumbach

The National Center for Biotechnology Information (NCBI) recently announced the availability of whole genome sequences for more than 1,000 species, and the number of sequenced individual organisms is growing. Ongoing improvement of DNA sequencing technology will further contribute to this, enabling large-scale evolution and population genetics studies. However, the availability of sequence information is only the first step in understanding how cells survive, reproduce, and adjust their behavior. The genetic control behind organized development and adaptation of complex organisms still remains widely undetermined. One major molecular control mechanism is transcriptional gene regulation. The direct juxtaposition of the total number of sequenced species to the handful of model organisms with known regulations is surprising. Here, we investigate how little we even know about these model organisms. We aim to predict the sizes of the whole-organism regulatory networks of seven species. In particular, we provide statistical lower bounds for the expected number of regulations. For Escherichia coli we estimate at most 37 percent of the expected gene regulatory interactions to be already discovered, 24 percent for Bacillus subtilis, and <3 percent for human. We conclude that even for our best researched model organisms we still lack substantial understanding of fundamental molecular control mechanisms, at least on a large scale.


Bioinformatics | 2013

Density parameter estimation for finding clusters of homologous proteins--tracing actinobacterial pathogenicity lifestyles.

Richard Röttger; Prabhav Kalaghatgi; Peng Sun; Siomar de Castro Soares; Vasco Azevedo; Tobias Wittkop; Jan Baumbach

MOTIVATION: Homology detection is a long-standing challenge in computational biology. To tackle this problem, typically all-versus-all BLAST results are coupled with data partitioning approaches resulting in clusters of putative homologous proteins. One of the main problems, however, has been widely neglected: all clustering tools need a density parameter that adjusts the number and size of the clusters. This parameter is crucial but hard to estimate without gold standard data at hand. Developing a gold standard, however, is a difficult and time-consuming task. Having a reliable method for detecting clusters of homologous proteins between a huge set of species would open opportunities for better understanding the genetic repertoire of bacteria with different lifestyles.
RESULTS: Our main contribution is a method for identifying a suitable and robust density parameter for protein homology detection without a given gold standard. Therefore, we study the core genome of 89 actinobacteria. This allows us to incorporate background knowledge, i.e. the assumption that a set of evolutionarily closely related species should share a comparably high number of evolutionarily conserved proteins (emerging from phylum-specific housekeeping genes). We apply our strategy to find genes/proteins that are specific for certain actinobacterial lifestyles, i.e. different types of pathogenicity. The whole study was performed with transitivity clustering, as it only requires a single intuitive density parameter and has been shown to be well applicable for the task of protein sequence clustering. Note, however, that the presented strategy generally does not depend on our clustering method but can easily be adapted to other clustering approaches.
AVAILABILITY: All results are publicly available at http://transclust.mmci.uni-saarland.de/actino_core/ or as Supplementary Material of this article.
CONTACT: [email protected]
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


2009 First International Workshop on Near Field Communication | 2009

All-I-Touch as Combination of NFC and Lifestyle

Fabian Kneissl; Richard Röttger; Uwe Sandner; Jan Marco Leimeister; Helmut Krcmar

For this paper we developed the concept and implemented a fully working prototype of a snowboarder community platform based on Near Field Communication. All-I-Touch is a service which provides product information at the point of sale and additionally connects the user with his social community in Facebook. Through this combination it is possible to increase incentives for the end user to use the service as well as the snowboard manufacturer to equip his products with NFC tags. The end user benefits from meaningful product information which is enriched with comments from his friends or arbitrary users through the connection to Facebook. This information is more likely to represent an independent impression of the product than the manufacturer's description and thus it is more valuable. On the other hand, manufacturers profit from All-I-Touch as a tool for Viral Marketing and benefit from a highly relevant target group. To enhance the attractiveness of the Facebook application we extended All-I-Touch to include Pieces, Places and People. Thus, the user can update his Facebook profile page instantly with information like where he has been and whom he met - and all that just by holding a mobile phone over a tag!


Annals of Human Genetics | 2016

Differentially Methylated Genomic Regions in Birth-Weight Discordant Twin Pairs

Mubo Chen; Jan Baumbach; Fabio Vandin; Richard Röttger; Eudes Barbosa; Mingchui Dong; Morten Frost; Lene Christiansen; Qihua Tan

Poor nutrition during critical growth phases may alter the structural and physiologic development of vital organs thus “programming” the susceptibility to adult‐onset diseases and disease‐related health conditions. Epigenome‐wide association studies have been performed in birth‐weight discordant twin pairs to find evidence for such “programming” effects, but no significant results emerged. We further investigated this issue using a new computational approach: Instead of probing single genomic sites for significant alterations in epigenetic marks, we scan for differentially methylated genomic regions. Whole genome DNA methylation levels were measured in whole blood from 150 pairs of adult identical twins discordant for birth‐weight. Intrapair differential DNA methylation was associated with qualitative (large or small) and quantitative (percentage) birth‐weight discordance at each genomic site using regression models adjusting for age and sex. Based on the regression results, genomic regions with consistent alteration patterns of DNA methylation were located and tested for significant robustness using computational permutation tests. This yielded an interesting genomic region on chromosome 1, which is significantly differentially methylated for quantitative birth‐weight discordance. The region covers two genes (TYW3 and CRYZ) both reportedly associated with metabolism. We conclude that prenatal conditions for birth‐weight discordance may result in persistent epigenetic modifications potentially affecting even adult health.


Briefings in Functional Genomics | 2014

On the limits of computational functional genomics for bacterial lifestyle prediction

Eudes Barbosa; Richard Röttger; Anne-Christin Hauschild; Vasco Azevedo; Jan Baumbach

We review the level of genomic specificity regarding actinobacterial pathogenicity. As they occupy various niches in diverse habitats, one may assume the existence of lifestyle-specific genomic features. We include 240 actinobacteria classified into four pathogenicity classes: human pathogens (HPs), broad-spectrum pathogens (BPs), opportunistic pathogens (OPs) and non-pathogenic (NP). We hypothesize: (H1) Pathogens (HPs and BPs) possess specific pathogenicity signature genes. (H2) The same holds for OPs. (H3) Broad-spectrum and exclusively HPs cannot be distinguished from each other because of an observation bias, i.e. many HPs might yet be unclassified BPs. (H4) There is no intrinsic genomic characteristic of OPs compared with pathogens, as small mutations are likely to play a more dominant role to survive the immune system. To study these hypotheses, we implemented a bioinformatics pipeline that combines evolutionary sequence analysis with statistical learning methods (Random Forest with feature selection, model tuning and robustness analysis). Essentially, we present orthologous gene sets that computationally distinguish pathogens from NPs (H1). We further show a clear limit in differentiating OPs from both NPs (H2) and pathogens (H4). HPs may also not be distinguished from bacteria annotated as BPs based only on a small set of orthologous genes (H3), as many HPs might as well target a broad range of mammals but have not been annotated accordingly. In conclusion, we illustrate that even in the post-genome era and despite next-generation sequencing technology, our ability to efficiently deduce real-world conclusions, such as pathogenicity classification, remains quite limited.


Internet Mathematics | 2011

Extension and Robustness of Transitivity Clustering for Protein–Protein Interaction Network Analysis

Tobias Wittkop; Sven Rahmann; Richard Röttger; Sebastian Böcker; Jan Baumbach

Partitioning biological data objects into groups such that the objects within the groups share common traits is a longstanding challenge in computational biology. Recently, we developed and established transitivity clustering, a partitioning approach based on weighted transitive graph projection that utilizes a single similarity threshold as density parameter. In previous publications, we concentrated on the graphical user interface and on concrete biomedical application protocols. Here, we contribute the following theoretical considerations: (1) We provide proofs that the average similarity between objects from the same cluster is above the user-given threshold and that the average similarity between objects from different clusters is below the threshold. (2) We extend transitivity clustering to an overlapping clustering tool by integrating two new approaches. (3) We demonstrate the power of transitivity clustering for protein-complex detection. We evaluate our approaches against others by utilizing gold-standard data that was previously used by Brohée et al. for reviewing existing bioinformatics clustering tools. The extended version of this article is available online at http://transclust.mpi-inf.mpg.de.


Scientific Reports | 2015

Massive fungal biodiversity data re-annotation with multi-level clustering

Duong Vu; Szaniszlo Szoke; Christian Wiwie; Jan Baumbach; Gianluigi Cardinali; Richard Röttger; Vincent Robert

With the availability of newer and cheaper sequencing methods, genomic data are being generated at an increasingly fast pace. In spite of the high degree of complexity of currently available search routines, the massive number of sequences available virtually prohibits quick and correct identification of large groups of sequences sharing common traits. Hence, there is a need for clustering tools for automatic knowledge extraction enabling the curation of large-scale databases. Current sophisticated approaches on sequence clustering are based on pairwise similarity matrices. This is impractical for databases of hundreds of thousands of sequences as such a similarity matrix alone would exceed the available memory. In this paper, a new approach called MultiLevel Clustering (MLC) is proposed which avoids a majority of sequence comparisons, and therefore, significantly reduces the total runtime for clustering. An implementation of the algorithm allowed clustering of all 344,239 ITS (Internal Transcribed Spacer) fungal sequences from GenBank utilizing only a normal desktop computer within 22 CPU-hours whereas the greedy clustering method took up to 242 CPU-hours.

Collaboration


Dive into Richard Röttger's collaborations.

Top Co-Authors

Jan Baumbach
University of Southern Denmark

Vasco Azevedo
Universidade Federal de Minas Gerais

Artur Silva
Federal University of Pará

Qihua Tan
University of Southern Denmark

Eudes Barbosa
Universidade Federal de Minas Gerais

Sandeep Tiwari
Universidade Federal de Minas Gerais

Syed Shah Hassan
Universidade Federal de Minas Gerais