Publication


Featured research published by Jan Baumbach.


BMC Bioinformatics | 2011

clusterMaker: a multi-algorithm clustering plugin for Cytoscape

John H. Morris; Leonard Apeltsin; Aaron M. Newman; Jan Baumbach; Tobias Wittkop; Gang Su; Gary D. Bader; Thomas E. Ferrin

Background: In the post-genomic era, the rapid increase in high-throughput data calls for computational tools capable of integrating data of diverse types and facilitating recognition of biologically meaningful patterns within them. For example, protein-protein interaction data sets have been clustered to identify stable complexes, but scientists lack easily accessible tools to facilitate combined analyses of multiple data sets from different types of experiments. Here we present clusterMaker, a Cytoscape plugin that implements several clustering algorithms and provides network, dendrogram, and heat map views of the results. The Cytoscape network is linked to all of the other views, so that a selection in one is immediately reflected in the others. clusterMaker is the first Cytoscape plugin to implement such a wide variety of clustering algorithms and visualizations, including the only implementations of hierarchical clustering, dendrogram plus heat map visualization (tree view), k-means, k-medoid, SCPS, AutoSOME, and native (Java) MCL. Results: Results are presented in the form of three scenarios of use: analysis of protein expression data using a recently published mouse interactome and a mouse microarray data set of nearly one hundred diverse cell/tissue types; the identification of protein complexes in the yeast Saccharomyces cerevisiae; and the cluster analysis of the vicinal oxygen chelate (VOC) enzyme superfamily. For scenario one, we explore functionally enriched mouse interactomes specific to particular cellular phenotypes and apply fuzzy clustering. For scenario two, we explore the prefoldin complex in detail using both physical and genetic interaction clusters. For scenario three, we explore the possible annotation of a protein as a methylmalonyl-CoA epimerase within the VOC superfamily. Cytoscape session files for all three scenarios are provided in the Additional Files section. Conclusions: The Cytoscape plugin clusterMaker provides a number of clustering algorithms and visualizations that can be used independently or in combination for analysis and visualization of biological data sets, and for confirming or generating hypotheses about biological function. Several of these visualizations and algorithms are only available to Cytoscape users through the clusterMaker plugin. clusterMaker is available via the Cytoscape plugin manager.


Bioinformatics | 2006

Graph-based analysis and visualization of experimental results with ONDEX

Jacob Köhler; Jan Baumbach; Jan Taubert; Michael Specht; Andre Skusa; Alexander Rüegg; Christopher J. Rawlings; Paul J. Verrier; Stephan Philippi

MOTIVATION Assembling the relevant information needed to interpret the output from high-throughput, genome scale, experiments such as gene expression microarrays is challenging. Analysis reveals genes that show statistically significant changes in expression levels, but more information is needed to determine their biological relevance. The challenge is to bring these genes together with biological information distributed across hundreds of databases or buried in the scientific literature (millions of articles). Software tools are needed to automate this task which at present is labor-intensive and requires considerable informatics and biological expertise. RESULTS This article describes ONDEX and how it can be applied to the task of interpreting gene expression results. ONDEX is a database system that combines the features of semantic database integration and text mining with methods for graph-based analysis. An overview of the ONDEX system is presented, concentrating on recently developed features for graph-based analysis and visualization. A case study is used to show how ONDEX can help to identify causal relationships between stress response genes and metabolic pathways from gene expression data. ONDEX also discovered functional annotations for most of the genes that emerged as significant in the microarray experiment, but were previously of unknown function.


Nucleic Acids Research | 2010

AltAnalyze and DomainGraph: analyzing and visualizing exon expression data

Dorothea Emig; Nathan Salomonis; Jan Baumbach; Thomas Lengauer; Bruce R. Conklin; Mario Albrecht

Alternative splicing is an important mechanism for increasing protein diversity. However, its functional effects are largely unknown. Here, we present our new software workflow composed of the open-source application AltAnalyze and the Cytoscape plugin DomainGraph. Both programs provide an intuitive and comprehensive end-to-end solution for the analysis and visualization of alternative splicing data from Affymetrix Exon and Gene Arrays at the level of proteins, domains, microRNA binding sites, molecular interactions and pathways. Our software tools include easy-to-use graphical user interfaces, rigorous statistical methods (FIRMA, MiDAS and DABG filtering) and do not require prior knowledge of exon array analysis or programming. They provide new methods for automatic interpretation and visualization of the effects of alternative exon inclusion on protein domain composition and microRNA binding sites. These data can be visualized together with affected pathways and gene or protein interaction networks, allowing a straightforward identification of potential biological effects due to alternative splicing at different levels of granularity. Our programs are available at http://www.altanalyze.org and http://www.domaingraph.de. These websites also include extensive documentation, tutorials and sample data.


Journal of Biotechnology | 2008

The GlxR regulon of the amino acid producer Corynebacterium glutamicum: In silico and in vitro detection of DNA binding sites of a global transcription regulator

Thomas A. Kohl; Jan Baumbach; Britta Jungwirth; Alfred Pühler; Andreas Tauch

The glxR (cg0350) gene of Corynebacterium glutamicum ATCC 13032 encodes a DNA-binding transcription regulator of the CRP/FNR protein family. Five genomic DNA regions known to be bound by GlxR provided the seed information for DNA binding site discovery by expectation maximization and Gibbs sampling approaches. The detection of additional motifs in the genome sequence of C. glutamicum was performed with a position weight matrix and a profile hidden Markov model, both deduced from the initial motif discovery. A combined iterative search for GlxR binding sites revealed 201 potential operator sequences. The interaction of purified GlxR protein with 51 selected binding sites was demonstrated in vitro by performing electrophoretic mobility shift assays with double-stranded 40-mer oligonucleotides. Considering potential operon structures and the genomic organization of C. glutamicum, the expression of 53 transcription units comprising 96 genes may be controlled directly by GlxR. The DNA binding site of GlxR is apparently specified by the consensus sequence TGTGANNTANNTCACA. Integration of the data into the transcriptional regulatory network model of C. glutamicum revealed a high connectivity of the deduced regulatory interactions and suggested that GlxR controls at least (i) sugar uptake, glycolysis, and gluconeogenesis, (ii) acetate, lactate, gluconate, and ethanol metabolism, (iii) aromatic compound degradation, (iv) aerobic and anaerobic respiration, (v) glutamate uptake and nitrogen assimilation, (vi) fatty acid biosynthesis, (vii) deoxyribonucleotide biosynthesis, (viii) the cellular stress response, and (ix) resuscitation.
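As a rough illustration of what searching for the reported consensus involves, the sketch below scans a sequence for exact matches to TGTGANNTANNTCACA, treating N as any base. This is a simplification and not the authors' method: the abstract describes a position weight matrix and a profile hidden Markov model, which score imperfect matches rather than requiring exact ones. The function names and the toy sequence are hypothetical.

```python
import re

# Consensus quoted in the abstract; N stands for any of A, C, G, T.
CONSENSUS = "TGTGANNTANNTCACA"


def consensus_to_regex(consensus):
    """Translate a consensus over A/C/G/T/N into a regular expression."""
    return "".join("[ACGT]" if base == "N" else base for base in consensus)


def find_sites(sequence, consensus=CONSENSUS):
    """Return 0-based start positions of all exact consensus matches.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from the C. glutamicum genome.
toy = "ccgaTGTGAcgTAagTCACAttg"
print(find_sites(toy))  # [4]
```

Because the consensus equals its own reverse complement, scanning one strand is enough for exact matching in this sketch.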


Nature Methods | 2010

Partitioning biological data with transitivity clustering.

Tobias Wittkop; Dorothea Emig; Sita Lange; Sven Rahmann; Mario Albrecht; John H. Morris; Sebastian Böcker; Jens Stoye; Jan Baumbach

(Supplementary Table 1). The average bead displacement and the ratio between correspondence candidates and true correspondences is a quantitative measure of the reconstruction success, which is crucial for automatic validation of registration results in long time-lapse recordings. The beads can be removed optically or computationally from the sample (Supplementary Methods). We applied the bead-based registration framework to SPIM recordings of early Drosophila melanogaster embryos, which are very challenging samples for multiview reconstruction owing to the scattering of the yolk that severely limits the overlap between views. We imaged Drosophila embryos expressing ubiquitous HisYFP from five and seven views in an extended time-lapse recording covering early embryonic development. We registered each time point separately and then registered all time points to each other compensating for minor drift during image acquisition (Supplementary Methods). We combined content-based fusion with nonlinear blending [5] to compensate for brightness differences at boundaries between views (Supplementary Fig. 4). The reconstructed multiview acquisition of the specimen showed, in contrast to the single view, comparable lateral and axial resolution (Fig. 1e,f). We never imaged the anterior and posterior poles of the embryo with full lateral resolution in this acquisition, and yet the cells were clearly distinguishable, demonstrating the precision of the multiview reconstruction (Fig. 1g–i). In the middle of the specimen, the resolution was lower because only some views contributed high-content information whereas other views were blocked by the yolk (Fig. 1h). The reconstructed time-lapse recording provided an unprecedented four-dimensional view of Drosophila embryogenesis (Supplementary Videos 4 and 5). The bead-based registration framework is sample-independent (Supplementary Data and Supplementary Fig. 5) and enables fully unguided registration without prior knowledge of the arrangement of the views (Supplementary Video 4). The software outperforms intensity-based registration approaches [6,7] in terms of precision and speed, enabling accurate registration of large, multiview acquisitions in minutes (Supplementary Data, Supplementary Fig. 6 and Supplementary Table 1). The run time of the bead-based registration framework is comparable to the time it takes to acquire the multiview data, and thus, to our knowledge, it is currently the only solution allowing robust, real-time registration of time-lapse SPIM recordings. Moreover, the bead-based registration framework is applicable to other optical sectioning microscopy techniques (Supplementary Fig. 7 and Supplementary Video 6), considerably expanding the possible applications in biology. We provide our bead-based registration algorithm to the bioimaging community as an open-source plugin for Fiji (Supplementary Fig. 8 and Supplementary Software; http://pacific.mpi-cbg.de/wiki/index.php/SPIM_Registration).

1. Huisken, J., Swoger, J., Del Bene, F., Wittbrodt, J. & Stelzer, E.H. Science 305, 1007–1009 (2004).
2. Huisken, J. & Stainier, D.Y. Development 136, 1963–1975 (2009).
3. Lindeberg, T. J. Appl. Stat. 21, 224–270 (1994).
4. Fischler, M.A. & Bolles, R.C. Commun. ACM 24, 381–395 (1981).
5. Preibisch, S., Saalfeld, S. & Tomancak, P. Bioinformatics 25, 1463–1465 (2009).
6. Preibisch, S., Rohlfing, T., Hasak, M.P. & Tomancak, P. SPIE Medical Imaging 2008 (eds. Reinhardt, J.M. & Pluim, J.P.W.) 6914, 69140E-69140E-8 (2008).
7. Swoger, J. et al. Opt. Express 15, 8029–8042 (2007).


PLOS ONE | 2011

Evidence for Reductive Genome Evolution and Lateral Acquisition of Virulence Functions in Two Corynebacterium pseudotuberculosis Strains

Jeronimo C. Ruiz; Vívian D'Afonseca; Artur Silva; Amjad Ali; Anne Cybelle Pinto; Anderson Rodrigues dos Santos; Aryanne A. M. C. Rocha; Débora O. Lopes; Fernanda Alves Dorella; Luis G. C. Pacheco; Marcília Pinheiro da Costa; Meritxell Zurita Turk; Núbia Seyffert; Pablo M. R. O. Moraes; Siomar de Castro Soares; Sintia Almeida; Thiago Luiz de Paula Castro; Vinicius Augusto Carvalho de Abreu; Eva Trost; Jan Baumbach; Andreas Tauch; Maria Paula Cruz Schneider; John Anthony McCulloch; Louise Teixeira Cerdeira; Rommel Thiago Jucá Ramos; Adhemar Zerlotini; Anderson J. Dominitini; Daniela M. Resende; Elisângela Monteiro Coser; Luciana Márcia Oliveira

Background: Corynebacterium pseudotuberculosis, a Gram-positive, facultative intracellular pathogen, is the etiologic agent of the disease known as caseous lymphadenitis (CL). CL mainly affects small ruminants, such as goats and sheep; it also causes infections in humans, though rarely. This species is distributed worldwide, but it has the most serious economic impact in Oceania, Africa and South America. Although C. pseudotuberculosis causes major health and productivity problems for livestock, little is known about the molecular basis of its pathogenicity. Methodology and Findings: We characterized two C. pseudotuberculosis genomes (Cp1002, isolated from goats; and CpC231, isolated from sheep). Analysis of the predicted genomes showed high similarity in genomic architecture, gene content and genetic order. When C. pseudotuberculosis was compared with other Corynebacterium species, it became evident that this pathogenic species has lost numerous genes, resulting in one of the smallest genomes in the genus. Other differences that could be part of the adaptation to pathogenicity include a lower GC content, of about 52%, and a reduced gene repertoire. The C. pseudotuberculosis genome also includes seven putative pathogenicity islands, which contain several classical virulence factors, including genes for fimbrial subunits, adhesion factors, iron uptake and secreted toxins. Additionally, all of the virulence factors in the islands have characteristics that indicate horizontal transfer. Conclusions: These particular genome characteristics of C. pseudotuberculosis, as well as its acquired virulence factors in pathogenicity islands, provide evidence of its lifestyle and of the pathogenicity pathways used by this pathogen in the infection process. All genomes cited in this study are available in the NCBI Genbank database (http://www.ncbi.nlm.nih.gov/genbank/) under accession numbers CP001809 and CP001829.


BMC Bioinformatics | 2007

CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks

Jan Baumbach

Background: Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results: Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. Like the previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparisons of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. Therefore, we added stimulon data directly into the database, but also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service based access to up-to-date gene annotation data from GenDB. Conclusion: Release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of procaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at http://www.CoryneRegNet.DE.


BMC Bioinformatics | 2007

Large scale clustering of protein sequences with FORCE – A layout based heuristic for weighted cluster editing

Tobias Wittkop; Jan Baumbach; Francisco P. Lobo; Sven Rahmann

Background: Detecting groups of functionally related proteins from their amino acid sequence alone has been a long-standing challenge in computational genome research. Several clustering approaches, following different strategies, have been published to attack this problem. Today, new sequencing technologies provide huge amounts of sequence data that has to be efficiently clustered with constant or increased accuracy, at increased speed. Results: We advocate that the model of weighted cluster editing, also known as transitive graph projection, is well-suited to protein clustering. We present the FORCE heuristic that is based on transitive graph projection and clusters arbitrary sets of objects, given pairwise similarity measures. In particular, we apply FORCE to the problem of protein clustering and show that it outperforms the most popular existing clustering tools (Spectral clustering, TribeMCL, GeneRAGE, Hierarchical clustering, and Affinity Propagation). Furthermore, we show that FORCE is able to handle huge datasets by calculating clusters for all 192 187 prokaryotic protein sequences (66 organisms) obtained from the COG database. Finally, FORCE is integrated into the corynebacterial reference database CoryneRegNet. Conclusion: FORCE is an applicable alternative to existing clustering algorithms. Its theoretical foundation, weighted cluster editing, can outperform other clustering paradigms on protein homology clustering. FORCE is open source and implemented in Java. The software, including the source code, the clustering results for COG and CoryneRegNet, and all evaluation datasets are available at http://gi.cebitec.uni-bielefeld.de/comet/force/.
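To make the weighted cluster editing objective more concrete, here is a minimal sketch of how the cost of a candidate clustering could be computed. It assumes the common formulation in which two objects are connected when their similarity exceeds a threshold, and in which deleting or inserting an edge costs the absolute difference between similarity and threshold. It only evaluates a given partition and is not the FORCE heuristic itself, which searches for a low-cost partition. The function name, the default similarity of 0 for unlisted pairs, and the toy data are hypothetical.

```python
from itertools import combinations


def editing_cost(similarity, clusters, threshold):
    """Cost of editing the thresholded similarity graph into disjoint cliques.

    similarity: dict mapping frozenset({u, v}) to a similarity score;
                pairs that are not listed are assumed to have similarity 0.
    clusters:   list of sets that together partition all objects.
    threshold:  pairs with similarity above this value count as edges.
    """
    label = {obj: i for i, members in enumerate(clusters) for obj in members}
    cost = 0.0
    for u, v in combinations(sorted(label), 2):
        weight = similarity.get(frozenset((u, v)), 0.0) - threshold
        same_cluster = label[u] == label[v]
        if same_cluster and weight < 0:
            cost += -weight  # a missing edge has to be inserted
        elif not same_cluster and weight > 0:
            cost += weight   # an existing edge has to be deleted
    return cost


# Hypothetical toy similarities on four objects.
sim = {
    frozenset(("a", "b")): 9,
    frozenset(("a", "c")): 8,
    frozenset(("b", "c")): 3,
    frozenset(("c", "d")): 7,
}
print(editing_cost(sim, [{"a", "b", "c"}, {"d"}], threshold=5))  # 4.0
print(editing_cost(sim, [{"a", "b"}, {"c", "d"}], threshold=5))  # 3.0
```

In this toy case the second partition is cheaper, so a cluster editing method would prefer it over the first.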


BMC Genomics | 2006

CoryneRegNet: an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks.

Jan Baumbach; Karina Brinkrolf; Lisa F. Czaja; Sven Rahmann; Andreas Tauch

Background: The application of DNA microarray technology in post-genomic analysis of bacterial genome sequences has allowed the generation of huge amounts of data related to regulatory networks. This data along with literature-derived knowledge on regulation of gene expression has opened the way for genome-wide reconstruction of transcriptional regulatory networks. These large-scale reconstructions can be converted into in silico models of bacterial cells that allow a systematic analysis of network behavior in response to changing environmental conditions. Description: CoryneRegNet was designed to facilitate the genome-wide reconstruction of transcriptional regulatory networks of corynebacteria relevant in biotechnology and human medicine. During the import and integration process of data derived from experimental studies or literature knowledge CoryneRegNet generates links to genome annotations, to identified transcription factors and to the corresponding cis-regulatory elements. CoryneRegNet is based on a multi-layered, hierarchical and modular concept of transcriptional regulation and was implemented by using the relational database management system MySQL and an ontology-based data structure. Reconstructed regulatory networks can be visualized by using the yFiles JAVA graph library. As an application example of CoryneRegNet, we have reconstructed the global transcriptional regulation of a cellular module involved in SOS and stress response of corynebacteria. Conclusion: CoryneRegNet is an ontology-based data warehouse that allows a pertinent data management of regulatory interactions along with the genome-scale reconstruction of transcriptional regulatory networks. These models can further be combined with metabolic networks to build integrated models of cellular function including both metabolism and its transcriptional regulation.


Nature Methods | 2015

Comparing the Performance of Biomedical Clustering Methods

Christian Wiwie; Jan Baumbach; Richard Röttger

Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
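For readers unfamiliar with cluster validity indices, the sketch below computes one widely used external index, the Rand index, which compares a clustering with a reference labeling. The abstract does not say which 13 indices were used, so the choice of index is an assumption made purely for illustration, and the code is not taken from ClustEval. The function name and the toy labelings are hypothetical.

```python
from itertools import combinations


def rand_index(predicted, reference):
    """Fraction of object pairs on which two labelings agree.

    Both arguments map each object to a cluster label. A pair counts as
    an agreement if it is grouped together in both labelings or kept
    apart in both.
    """
    agreements = 0
    total = 0
    for u, v in combinations(sorted(predicted), 2):
        together_predicted = predicted[u] == predicted[v]
        together_reference = reference[u] == reference[v]
        if together_predicted == together_reference:
            agreements += 1
        total += 1
    # With fewer than two objects there are no pairs to disagree on.
    return agreements / total if total else 1.0


# Hypothetical toy labelings of four objects.
predicted = {"a": 0, "b": 0, "c": 1, "d": 1}
reference = {"a": 0, "b": 0, "c": 0, "d": 1}
print(rand_index(predicted, reference))  # 0.5
```

A value of 1.0 means the two labelings induce the same grouping of every pair; evaluating many parameter sets, as the abstract describes, amounts to repeating such a computation for each resulting clustering.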

Collaboration


Dive into Jan Baumbach's collaboration.

Top Co-Authors

Qihua Tan

University of Southern Denmark

Vasco Azevedo

Universidade Federal de Minas Gerais

Sven Rahmann

University of Duisburg-Essen

Artur Silva

Federal University of Pará

Nicolas Alcaraz

University of Southern Denmark

Tobias Wittkop

Buck Institute for Research on Aging
