Publication


Featured research published by Lex Overmars.


Briefings in Bioinformatics | 2013

Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?

Wouter G. Touw; Jumamurat R. Bayjanov; Lex Overmars; Lennart Backus; Jos Boekhorst; Michiel Wels; Sacha A. F. T. van Hijum

In the Life Sciences, ‘omics’ data are increasingly generated by different high-throughput technologies. Often only the integration of these data allows uncovering biological insights that can be experimentally validated or mechanistically modelled, i.e. sophisticated computational approaches are required to extract the complex non-linear trends present in omics data. Classification techniques allow training a model based on variables (e.g. SNPs in genetic association studies) to separate different classes (e.g. healthy subjects versus patients). Random Forest (RF) is a versatile classification algorithm suited for the analysis of these large data sets. In the Life Sciences, RF is popular because RF classification models have a high prediction accuracy and provide information on the importance of variables for classification. For omics data, variables or conditional relations between variables are typically important for a subset of samples of the same class. For example, within a class of cancer patients certain SNP combinations may be important for a subset of patients that have a specific subtype of cancer, but not important for a different subset of patients. These conditional relationships can in principle be uncovered from the data with RF, as they are implicitly taken into account by the algorithm during the creation of the classification model. This review details some RF properties that, to the best of our knowledge, are rarely or never used, and that allow maximizing the biological insights that can be extracted from complex omics data sets using RF.
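
As a rough illustration of how a tree-based classifier can rank variables, the sketch below computes the decrease in Gini impurity produced by a single split. Summing such decreases over the trees of a forest is one common way RF implementations report variable importance; the abstract does not say which measure the review discusses, so treat this choice as an assumption. The function names and the toy SNP data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(values, labels, threshold):
    """Impurity decrease obtained by splitting samples at `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical toy data: genotype at two SNPs (0/1) for six subjects.
labels = ["patient", "patient", "patient", "healthy", "healthy", "healthy"]
snp_a = [1, 1, 1, 0, 0, 0]  # separates the two classes perfectly
snp_b = [1, 0, 1, 0, 1, 0]  # only weakly informative

decrease_a = gini_decrease(snp_a, labels, threshold=0)
decrease_b = gini_decrease(snp_b, labels, threshold=0)
```

In this toy case the split on `snp_a` removes all impurity while the split on `snp_b` removes very little, so `snp_a` would be ranked as the more important variable.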


Extremophiles | 2014

Microbial diversity and biogeochemical cycling in soda lakes

Dimitry Y. Sorokin; Tom Berben; Emily Denise Melton; Lex Overmars; Charlotte D. Vavourakis; Gerard Muyzer

Soda lakes contain high concentrations of sodium carbonates resulting in a stable elevated pH, which provide a unique habitat to a rich diversity of haloalkaliphilic bacteria and archaea. Both cultivation-dependent and -independent methods have aided the identification of key processes and genes in the microbially mediated carbon, nitrogen, and sulfur biogeochemical cycles in soda lakes. In order to survive in this extreme environment, haloalkaliphiles have developed various bioenergetic and structural adaptations to maintain pH homeostasis and intracellular osmotic pressure. The cultivation of a handful of strains has led to the isolation of a number of extremozymes, which allow the cell to perform enzymatic reactions at these extreme conditions. These enzymes potentially contribute to biotechnological applications. In addition, microbial species active in the sulfur cycle can be used for sulfur remediation purposes. Future research should combine both innovative culture methods and state-of-the-art ‘meta-omic’ techniques to gain a comprehensive understanding of the microbes that flourish in these extreme environments and the processes they mediate. Coupling the biogeochemical C, N, and S cycles and identifying where each process takes place on a spatial and temporal scale could unravel the interspecies relationships and thereby reveal more about the ecosystem dynamics of these enigmatic extreme environments.


BMC Genomics | 2011

Comparative analyses imply that the enigmatic sigma factor 54 is a central controller of the bacterial exterior

Christof Francke; Tom Groot Kormelink; Yanick Hagemeijer; Lex Overmars; Vincent Sluijter; Roy Moezelaar; Roland J. Siezen

Background: Sigma-54 is a central regulator in many pathogenic bacteria and has been linked to a multitude of cellular processes like nitrogen assimilation and important functional traits such as motility, virulence, and biofilm formation. Until now it has remained obscure whether these phenomena and the control by Sigma-54 share an underlying theme. Results: We have uncovered the commonality by performing a range of comparative genome analyses. A) The presence of Sigma-54 and its associated activators was determined for all sequenced prokaryotes. We observed a phylum-dependent distribution that is suggestive of an evolutionary relationship between Sigma-54 and lipopolysaccharide and flagellar biosynthesis. B) All Sigma-54 activators were identified and annotated. The relation with phosphotransfer-mediated signaling (TCS and PTS) and the transport and assimilation of carboxylates and nitrogen-containing metabolites was substantiated. C) The function annotations that were represented within the genomic context of all genes encoding Sigma-54, its activators and its promoters were analyzed for intra-phylum representation and inter-phylum conservation. Promoters were localized using a straightforward scoring strategy that was formulated to identify similar motifs. We found clear, highly represented and conserved genetic associations with genes that concern the transport and biosynthesis of the metabolic intermediates of exopolysaccharides, flagella, lipids, lipopolysaccharides, lipoproteins and peptidoglycan. Conclusion: Our analyses directly implicate Sigma-54 as a central player in the control over the processes that involve the physical interaction of an organism with its environment, as in the colonization of a host (virulence) or the formation of biofilm.


PLOS ONE | 2013

Classification of the Adenylation and Acyl-Transferase Activity of NRPS and PKS Systems Using Ensembles of Substrate Specific Hidden Markov Models

Barzan I. Khayatt; Lex Overmars; Roland J. Siezen; Christof Francke

There is a growing interest in the non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) of microbes, fungi and plants because they can produce bioactive peptides such as antibiotics. The ability to identify the substrate specificity of the enzymes' adenylation (A) and acyl-transferase (AT) domains is essential to rationally deduce or engineer new products. We here report on a Hidden Markov Model (HMM)-based ensemble method to predict the substrate specificity at high quality. We collected a new reference set of experimentally validated sequences. An initial classification based on alignment and Neighbor Joining was performed in line with most of the previously published prediction methods. We then created and tested single substrate-specific HMMs and found that their use improved the correct identification significantly for A as well as for AT domains. A major advantage of the use of HMMs is that it abolishes the dependency on multiple sequence alignment and residue selection that hampers the alignment-based clustering methods. Using our models we obtained a high prediction quality for the substrate specificity of the A domains, similar to two recently published tools that make use of HMMs or Support Vector Machines (NRPSsp and NRPS predictor2, respectively). Moreover, replacement of the single substrate-specific HMMs by ensembles of models caused a clear increase in prediction quality. We argue that the superiority of the ensemble over the single model is caused by the way substrate specificity evolves for the studied systems. It is likely that this also holds true for other protein domains. The ensemble predictor has been implemented in a simple web-based tool that is available at http://www.cmbi.ru.nl/NRPS-PKS-substrate-predictor/.


PLOS ONE | 2012

Transcriptomes reveal genetic signatures underlying physiological variations imposed by different fermentation conditions in Lactobacillus plantarum.

Peter A. Bron; Michiel Wels; Roger S. Bongers; Hermien van Bokhorst-van de Veen; Anne Wiersma; Lex Overmars; Maria L. Marco; Michiel Kleerebezem

Lactic acid bacteria (LAB) are utilized widely for the fermentation of foods. In the current post-genomic era, tools have been developed that explore genetic diversity among LAB strains aiming to link these variations to differential phenotypes observed in the strains investigated. However, these genotype-phenotype matching approaches fail to assess the role of conserved genes in the determination of physiological characteristics of cultures by environmental conditions. This manuscript describes a complementary approach in which Lactobacillus plantarum WCFS1 was fermented under a variety of conditions that differ in temperature, pH, as well as NaCl, amino acid, and O2 levels. Samples derived from these fermentations were analyzed by full-genome transcriptomics, paralleled by the assessment of physiological characteristics, e.g., maximum growth rate, yield, and organic acid profiles. A data-storage and -mining suite designated FermDB was constructed and exploited to identify correlations between fermentation conditions and industrially relevant physiological characteristics of L. plantarum, as well as the associated transcriptome signatures. Finally, integration of the specific fermentation variables with the transcriptomes enabled the reconstruction of the gene-regulatory networks involved. The fermentation-genomics platform presented here is a valuable complementary approach to earlier described genotype-phenotype matching strategies which allows the identification of transcriptome signatures underlying physiological variations imposed by different fermentation conditions.


BMC Genomics | 2013

MGcV: the microbial genomic context viewer for comparative genome analysis

Lex Overmars; Robert Kerkhoven; Roland J. Siezen; Christof Francke

Background: Conserved gene context is used in many types of comparative genome analyses. It is used to provide leads on gene function, to guide the discovery of regulatory sequences, but also to aid in the reconstruction of metabolic networks. We present the Microbial Genomic context Viewer (MGcV), an interactive, web-based application tailored to strengthen the practice of manual comparative genome context analysis for bacteria. Results: MGcV is a versatile, easy-to-use tool that renders a visualization of the genomic context of any set of selected genes, genes within a phylogenetic tree, genomic segments, or regulatory elements. It is tailored to facilitate laborious tasks such as the interactive annotation of gene function, the discovery of regulatory elements, or the sequence-based reconstruction of gene regulatory networks. We illustrate that MGcV can be used in gene function annotation by visually integrating information on prokaryotic genes, like their annotation as available from NCBI, with other annotation data such as Pfam domains, sub-cellular location predictions and gene-sequence characteristics such as GC content. We also illustrate the usefulness of the interactive features that allow the graphical selection of genes to facilitate data gathering (e.g. upstream regions, IDs or annotation) in the analysis and reconstruction of transcription regulation. Moreover, putative regulatory elements and their corresponding scores or data from RNA-seq and microarray experiments can be uploaded, visualized and interpreted in (ranked) comparative context maps. The ranked maps allow the interpretation of predicted regulatory elements and experimental data in light of each other. Conclusion: MGcV advances the manual comparative analysis of genes and regulatory elements by providing fast and flexible integration of gene-related data combined with straightforward data retrieval. MGcV is available at http://mgcv.cmbi.ru.nl.


Microbial Biotechnology | 2011

Reconstruction of the regulatory network of Lactobacillus plantarum WCFS1 on basis of correlated gene expression and conserved regulatory motifs.

Michiel Wels; Lex Overmars; Christof Francke; Michiel Kleerebezem; Roland J. Siezen

Gene regulatory networks can be reconstructed by combining transcriptome data from many different experiments to elucidate relations between the activity of certain transcription factors and the genes they control. To obtain insight into the regulatory network of Lactobacillus plantarum, microarray transcriptome data from more than 70 different experimental conditions were combined and the expression profiles of the transcriptional units (TUs) were compared. The TUs that displayed correlated expression were used to identify putative cis‐regulatory elements by searching the upstream regions of the TUs for conserved motifs. Predicted motifs were extended and refined by searching for motifs in the upstream regions of additional TUs with correlated expression. In this way, cis‐acting elements were identified for 41 regulons consisting of at least four TUs (correlation > 0.7). This set of regulons included the known regulons of CtsR and LexA, but also several novel ones encompassing genes with coherent biological functions. Visualization of the regulons and their connections revealed a highly interconnected regulatory network. This network contains several subnetworks that encompass genes of correlated biological function, such as sugar and energy metabolism, nitrogen metabolism and stress response.
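
The first step described above, finding TUs with correlated expression, could be sketched as follows. This is a minimal sketch that assumes the Pearson coefficient; the abstract gives the cutoff (correlation > 0.7) but does not name the correlation measure. The TU names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlated_pairs(profiles, cutoff=0.7):
    """Return TU pairs whose expression profiles correlate above `cutoff`."""
    return [
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if pearson(profiles[a], profiles[b]) > cutoff
    ]


# Hypothetical expression values for three TUs across five conditions.
profiles = {
    "TU_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TU_2": [2.1, 3.9, 6.2, 8.0, 9.8],
    "TU_3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
pairs = correlated_pairs(profiles)
```

Here only `TU_1` and `TU_2` pass the cutoff. In a workflow like the one in the abstract, the upstream regions of such correlated TUs would then be searched for shared motifs, a step this sketch does not cover.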


BMC Genomics | 2012

Comparative genome analysis of central nitrogen metabolism and its control by GlnR in the class Bacilli

Tom Groot Kormelink; Eric Koenders; Yanick Hagemeijer; Lex Overmars; Roland J. Siezen; Willem M. de Vos; Christof Francke

Background: The assimilation of nitrogen in bacteria is achieved through only a few metabolic conversions between alpha-ketoglutarate, glutamate and glutamine. The enzymes that catalyze these conversions are glutamine synthetase, glutaminase, glutamate dehydrogenase and glutamine alpha-ketoglutarate aminotransferase. In low-GC Gram-positive bacteria the transcriptional control over the levels of the related enzymes is mediated by four regulators: GlnR, TnrA, GltC and CodY. We have analyzed the genomes of all species belonging to the taxonomic families Bacillaceae, Listeriaceae, Staphylococcaceae, Lactobacillaceae, Leuconostocaceae and Streptococcaceae to determine the diversity in central nitrogen metabolism and reconstructed the regulation by GlnR. Results: Although we observed a substantial difference in the extent of central nitrogen metabolism in the various species, the basic GlnR regulon was remarkably constant and appeared not affected by the presence or absence of the other three main regulators. We found a conserved regulatory association of GlnR with glutamine synthetase (glnRA operon), and the transport of ammonium (amtB-glnK) and glutamine/glutamate (i.e. via glnQHMP, glnPHQ, gltT, alsT). In addition, less-conserved associations were found with, for instance, glutamate dehydrogenase in Streptococcaceae, purine catabolism and the reduction of nitrite in Bacillaceae, and aspartate/asparagine deamination in Lactobacillaceae. Conclusions: Our analyses imply GlnR-mediated regulation in constraining the import of ammonia/amino-containing compounds and the production of intracellular ammonia under conditions of high nitrogen availability. Such a role fits with the intrinsic need for tight control of ammonia levels to limit futile cycling.


Bioinformatics | 2015

CiVi: circular genome visualization with unique features to analyze sequence elements

Lex Overmars; Sacha A. F. T. van Hijum; Roland J. Siezen; Christof Francke

We have developed CiVi, a user-friendly web-based tool to create custom circular maps to aid the analysis of microbial genomes and sequence elements. Sequence-related data such as gene name, COG class, PFAM domain, GC%, and subcellular location can be comprehensively viewed. Quantitative gene-related data (e.g. expression ratios or read counts) as well as predicted sequence elements (e.g. regulatory sequences) can be uploaded and visualized. CiVi accommodates the analysis of genomic elements by allowing a visual interpretation in the context of: (i) their genome-wide distribution, (ii) provided experimental data and (iii) the local orientation and location with respect to neighboring genes. CiVi thus enables both experts and non-experts to conveniently integrate public genome data with the results of genome analyses in circular genome maps suitable for publication. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. Availability and implementation: CiVi is freely available at http://civi.cmbi.ru.nl.


PLOS ONE | 2013

Reduce manual curation by combining gene predictions from multiple annotation engines, a case study of start codon prediction

T. Ederveen; Lex Overmars; Sacha A. F. T. van Hijum

Nowadays, prokaryotic genomes are sequenced faster than their gene annotations can be manually curated. Automated genome annotation engines (AGEs) provide users a straightforward and complete solution for predicting open reading frame (ORF) coordinates and function. For many labs, the use of AGEs is therefore essential to decrease the time necessary for annotating a given prokaryotic genome. However, it is not uncommon for AGEs to provide different and sometimes conflicting predictions. Combining multiple AGEs might allow for more accurate predictions. Here we analyzed the ab initio ORF calling performance of different AGEs based on curated genome annotations of eight strains from different bacterial species with GC% ranging from 35–52%. We present a case study which demonstrates a novel way of comparative genome annotation, using combinations of AGEs in a pre-defined order (or path) to predict ORF start codons. The order of AGE combinations is from high to low specificity, where the specificity is based on the eight genome annotations. For each AGE combination we are able to derive a so-called projected confidence value, which is the average specificity of ORF start codon prediction based on the eight genomes. The projected confidence enables estimating the likelihood of a correct prediction for a particular ORF start codon by a particular AGE combination, pinpointing ORFs whose start codons are notoriously difficult to predict. We correctly predict start codons for 90.5±4.8% of the genes in a genome (based on the eight genomes) with an accuracy of 81.1±7.6%. Our consensus-path methodology allows a marked improvement over majority voting (9.7±4.4%), and with an optimal path, ORF start prediction sensitivity is gained while maintaining a high specificity.
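
To make the contrast between majority voting and a path of AGE combinations more concrete, here is a small sketch. It reflects one possible reading of the abstract, not the authors' implementation: the path is assumed to be an ordered list of engine combinations, each paired with a projected confidence, and the first combination whose engines all agree supplies the start position. Engine names, start positions and confidence values are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Start position predicted by a strict majority of engines, else None."""
    start, votes = Counter(predictions.values()).most_common(1)[0]
    return start if votes > len(predictions) / 2 else None


def consensus_path(predictions, path):
    """Walk `path` in order and return (start, confidence) for the first
    engine combination whose members all predict the same start position."""
    for engines, confidence in path:
        if all(name in predictions for name in engines):
            starts = {predictions[name] for name in engines}
            if len(starts) == 1:
                return starts.pop(), confidence
    return None, 0.0


# Hypothetical start-codon positions predicted for one ORF by four engines.
predictions = {"engine_A": 120, "engine_B": 120, "engine_C": 150, "engine_D": 150}

# Hypothetical path, ordered from high to low projected confidence.
path = [
    (("engine_A", "engine_B", "engine_C"), 0.95),
    (("engine_C", "engine_D"), 0.90),
    (("engine_A",), 0.70),
]

vote = majority_vote(predictions)
start, confidence = consensus_path(predictions, path)
```

In this toy case the four engines split evenly, so majority voting yields no call, whereas the path reaches a combination that agrees and reports the confidence attached to it.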

Collaboration

Top Co-Authors

Roland J. Siezen, Radboud University Nijmegen
Dimitry Y. Sorokin, Delft University of Technology
Michiel Wels, Radboud University Nijmegen
Tanja Woyke, Joint Genome Institute
Michiel Kleerebezem, Wageningen University and Research Centre
Tom Berben, University of Amsterdam