Erwin Datema
Wageningen University and Research Centre
Publication
Featured research published by Erwin Datema.
Nature | 2011
X. Xu; S.K. Pan; S.F. Cheng; B. Zhang; Christian W. B. Bachem; J.M. de Boer; T.J.A. Borm; Bjorn Kloosterman; H.J. van Eck; Erwin Datema; Aska Goverse; R.C.H.J. van Ham; Richard G. F. Visser
Potato (Solanum tuberosum L.) is the world’s most important non-grain food crop and is central to global food security. It is clonally propagated, highly heterozygous, autotetraploid, and suffers acute inbreeding depression. Here we use a homozygous doubled-monoploid potato clone to sequence and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade. We also sequenced a heterozygous diploid clone and show that gene presence/absence variants and other potentially deleterious mutations occur frequently and are a likely cause of inbreeding depression. Gene family expansion, tissue-specific expression and recruitment of genes to new pathways contributed to the evolution of tuber development. The potato genome sequence provides a platform for genetic improvement of this vital crop.
PLOS Genetics | 2012
Pierre J. G. M. de Wit; Ate van der Burgt; B. Ökmen; I. Stergiopoulos; Kamel A. Abd-Elsalam; Andrea Aerts; Ali H. Bahkali; H. Beenen; Pranav Chettri; Murray P. Cox; Erwin Datema; Ronald P. de Vries; Braham Dhillon; Austen R. D. Ganley; S.A. Griffiths; Yanan Guo; Richard C. Hamelin; Bernard Henrissat; M. Shahjahan Kabir; Mansoor Karimi Jashni; Gert H. J. Kema; Sylvia Klaubauf; Alla Lapidus; Anthony Levasseur; Erika Lindquist; Rahim Mehrabi; Robin A. Ohm; Timothy J. Owen; Asaf Salamov; Arne Schwelm
We sequenced and compared the genomes of the Dothideomycete fungal plant pathogens Cladosporium fulvum (Cfu) (syn. Passalora fulva) and Dothistroma septosporum (Dse) that are closely related phylogenetically, but have different lifestyles and hosts. Although both fungi grow extracellularly in close contact with host mesophyll cells, Cfu is a biotroph infecting tomato, while Dse is a hemibiotroph infecting pine. The genomes of these fungi have a similar set of genes (70% of gene content in both genomes are homologs), but differ significantly in size (Cfu >61.1-Mb; Dse 31.2-Mb), which is mainly due to the difference in repeat content (47.2% in Cfu versus 3.2% in Dse). Recent adaptation to different lifestyles and hosts is suggested by diverged sets of genes. Cfu contains an α-tomatinase gene that we predict might be required for detoxification of tomatine, while this gene is absent in Dse. Many genes encoding secreted proteins are unique to each species and the repeat-rich areas in Cfu are enriched for these species-specific genes. In contrast, conserved genes suggest common host ancestry. Homologs of Cfu effector genes, including Ecp2 and Avr4, are present in Dse and induce a Cf-Ecp2- and Cf-4-mediated hypersensitive response, respectively. Strikingly, genes involved in production of the toxin dothistromin, a likely virulence factor for Dse, are conserved in Cfu, but their expression differs markedly with essentially no expression by Cfu in planta. Likewise, Cfu has a carbohydrate-degrading enzyme catalog that is more similar to that of necrotrophs or hemibiotrophs and a larger pectinolytic gene arsenal than Dse, but many of these genes are not expressed in planta or are pseudogenized. Overall, comparison of their genomes suggests that these closely related plant pathogens had a common ancestral host but since adapted to different hosts and lifestyles by a combination of differentiated gene content, pseudogenization, and gene regulation.
Microbial Cell Factories | 2012
Jurgen F. Nijkamp; Marcel van den Broek; Erwin Datema; Stefan de Kok; Lizanne Bosman; Marijke A. H. Luttik; Pascale Daran-Lapujade; Wanwipa Vongsangnak; Jens Nielsen; Wilbert H. M. Heijne; Paul Klaassen; Chris J. Paddon; Darren M. Platt; Peter Kötter; Roeland C. H. J. van Ham; Marcel J. T. Reinders; Jack T. Pronk; Dick de Ridder; Jean-Marc Daran
Saccharomyces cerevisiae CEN.PK 113-7D is widely used for metabolic engineering and systems biology research in industry and academia. We sequenced, assembled, annotated and analyzed its genome. Single-nucleotide variations (SNV), insertions/deletions (indels) and differences in genome organization compared to the reference strain S. cerevisiae S288C were analyzed. In addition to a few large deletions and duplications, nearly 3000 indels were identified in the CEN.PK113-7D genome relative to S288C. These differences were overrepresented in genes whose functions are related to transcriptional regulation and chromatin remodelling. Some of these variations were caused by unstable tandem repeats, suggesting an innate evolvability of the corresponding genes. Besides a previously characterized mutation in adenylate cyclase, the CEN.PK113-7D genome sequence revealed a significant enrichment of non-synonymous mutations in genes encoding for components of the cAMP signalling pathway. Some phenotypic characteristics of the CEN.PK113-7D strains were explained by the presence of additional specific metabolic genes relative to S288C. In particular, the presence of the BIO1 and BIO6 genes correlated with a biotin prototrophy of CEN.PK113-7D. Furthermore, the copy number, chromosomal location and sequences of the MAL loci were resolved. The assembled sequence reveals that CEN.PK113-7D has a mosaic genome that combines characteristics of laboratory strains and wild-industrial strains.
Plant Journal | 2014
Saulo Alves Aflitos; Elio Schijlen; Hans de Jong; Dick de Ridder; Sandra Smit; Richard Finkers; Jun Wang; Gengyun Zhang; Ning Li; Likai Mao; Freek T. Bakker; Rob Dirks; Timo M. Breit; Barbara Gravendeel; Henk Huits; Darush Struss; Ruth Swanson-Wagner; Hans van Leeuwen; Roeland C. H. J. van Ham; Laia Fito; Laetitia Guignier; Myrna Sevilla; Philippe Ellul; Eric Ganko; Arvind Kapur; Emannuel Reclus; Bernard de Geus; Henri van de Geest; Bas te Lintel Hekkert; Jan C. van Haarst
We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.
The Plant Genome | 2009
Lukas A. Mueller; R.M. Klein Lankhorst; S. D. Tanksley; R.M. Peters; M.J. van Staveren; Erwin Datema; Mark Fiers; R.C.H.J. van Ham; Dóra Szinay; J.H.S.G.M. de Jong
The genome of tomato (Solanum lycopersicum L.) is being sequenced by an international consortium of 10 countries (Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy, and the United States) as part of the larger “International Solanaceae Genome Project (SOL): Systems Approach to Diversity and Adaptation” initiative. The tomato genome sequencing project uses an ordered bacterial artificial chromosome (BAC) approach to generate a high‐quality tomato euchromatic genome sequence for use as a reference genome for the Solanaceae and euasterids. Sequence is deposited at GenBank and at the SOL Genomics Network (SGN). Currently, there are around 1000 BACs finished or in progress, representing more than a third of the projected euchromatic portion of the genome. An annotation effort is also underway by the International Tomato Annotation Group. The expected number of genes in the euchromatin is ∼40,000, based on an estimate from a preliminary annotation of 11% of finished sequence. Here, we present this first snapshot of the emerging tomato genome and its annotation, a short comparison with potato (Solanum tuberosum L.) sequence data, and the tools available for researchers to exploit this new resource. In the future, whole‐genome shotgun techniques will be combined with the BAC‐by‐BAC approach to cover the entire tomato genome. The high‐quality reference euchromatic tomato sequence is expected to be near completion by 2010.
Genetics | 2008
Xiaomin Tang; Dóra Szinay; Chunting Lang; M.S. Ramanna; Edwin van der Vossen; Erwin Datema; René Klein Lankhorst; Jan de Boer; Sander A. Peters; Christian Bachem; Willem J. Stiekema; Richard G. F. Visser; Hans de Jong; Yuling Bai
Ongoing genomics projects of tomato (Solanum lycopersicum) and potato (S. tuberosum) are providing unique tools for comparative mapping studies in Solanaceae. At the chromosomal level, bacterial artificial chromosomes (BACs) can be positioned on pachytene complements by fluorescence in situ hybridization (FISH) on homeologous chromosomes of related species. Here we present results of such a cross-species multicolor cytogenetic mapping of tomato BACs on potato chromosomes 6 and vice versa. The experiments were performed under low hybridization stringency, while blocking with Cot-100 was essential in suppressing excessive hybridization of repeat signals in both within-species FISH and cross-species FISH of tomato BACs. In the short arm we detected a large paracentric inversion that covers the whole euchromatin part with breakpoints close to the telomeric heterochromatin and at the border of the short arm pericentromere. The long arm BACs revealed no deviation in the colinearity between tomato and potato. Further comparison between tomato cultivars Cherry VFNT and Heinz 1706 revealed colinearity of the tested tomato BACs, whereas one of the six potato clones (RH98-856-18) showed minor putative rearrangements within the inversion. Our results present cross-species multicolor BAC–FISH as a unique tool for comparative genetic studies across Solanum species.
Chromosome Research | 2008
Song Bin Chang; Tae Jin Yang; Erwin Datema; Joke J.F.A. van Vugt; Ben Vosman; Anja G. J. Kuipers; Marie Meznikova; Dóra Szinay; René Klein Lankhorst; E. Jacobsen; Hans de Jong
This paper presents a bird’s-eye view of the major repeats and chromatin types of tomato. Using fluorescence in-situ hybridization (FISH) with Cot-1, Cot-10 and Cot-100 DNA as probes we mapped repetitive sequences of different complexity on pachytene complements. Cot-100 was found to cover all heterochromatin regions, and could be used to identify repeat-rich clones in BAC filter hybridization. Next we established the chromosomal locations of the tandem and dispersed repeats with respect to euchromatin, nucleolar organizer regions (NORs), heterochromatin, and centromeres. The tomato genomic repeats TGRII and TGRIII appeared to be major components of the pericentromeres, whereas the newly discovered TGRIV repeat was found mainly in the structural centromeres. The highly methylated NOR of chromosome 2 is rich in [GACA]4, a microsatellite that also forms part of the pericentromeres, together with [GA]8, [GATA]4 and Ty1-copia. Based on the morphology of pachytene chromosomes and the distribution of repeats studied so far, we now propose six different chromatin classes for tomato: (1) euchromatin, (2) chromomeres, (3) distal heterochromatin and interstitial heterochromatic knobs, (4) pericentromere heterochromatin, (5) functional centromere heterochromatin and (6) nucleolar organizer region.
BMC Plant Biology | 2008
Erwin Datema; Lukas A. Mueller; Robert M. Buels; James J. Giovannoni; Richard G. F. Visser; Willem J. Stiekema; Roeland C. H. J. van Ham
Background: Tomato (Solanum lycopersicon) and potato (S. tuberosum) are two economically important crop species, the genomes of which are currently being sequenced. This study presents a first genome-wide analysis of these two species, based on two large collections of BAC end sequences representing approximately 19% of the tomato genome and 10% of the potato genome. Results: The tomato genome has a higher repeat content than the potato genome, primarily due to a higher number of retrotransposon insertions in the tomato genome. On the other hand, simple sequence repeats are more abundant in potato than in tomato. The two genomes also differ in the frequency distribution of SSR motifs. Based on EST and protein alignments, potato appears to contain up to 6,400 more putative coding regions than tomato. Major gene families such as cytochrome P450 mono-oxygenases and serine-threonine protein kinases are significantly overrepresented in potato, compared to tomato. Moreover, the P450 superfamily appears to have expanded spectacularly in both species compared to Arabidopsis thaliana, suggesting an expanded network of secondary metabolic pathways in the Solanaceae. Both tomato and potato appear to have a low level of microsynteny with A. thaliana. A higher degree of synteny was observed with Populus trichocarpa, specifically in the region between 15.2 and 19.4 Mb on P. trichocarpa chromosome 10. Conclusion: The findings in this paper present a first glimpse into the evolution of Solanaceous genomes, both within the family and relative to other plant species. When the complete genome sequences of these species become available, whole-genome comparisons and protein- or repeat-family specific studies may shed more light on the observations made here.
Plant Journal | 2009
Sander A. Peters; Erwin Datema; Dóra Szinay; Marjo J. van Staveren; Elio Schijlen; Jan C. van Haarst; Thamara Hesselink; Marleen H. C. Abma-Henkens; Yuling Bai; Hans de Jong; Willem J. Stiekema; René Klein Lankhorst; Roeland C. H. J. van Ham
We studied the physical and genetic organization of chromosome 6 of tomato (Solanum lycopersicum) cv. Heinz 1706 by combining bacterial artificial chromosome (BAC) sequence analysis, high-information-content fingerprinting, genetic analysis, and BAC-fluorescent in situ hybridization (FISH) mapping data. The chromosome positions of 81 anchored seed and extension BACs corresponded in most cases with the linear marker order on the high-density EXPEN 2000 linkage map. We assembled 25 BAC contigs and eight singleton BACs spanning 2.0 Mb of the short-arm euchromatin, 1.8 Mb of the pericentromeric heterochromatin and 6.9 Mb of the long-arm euchromatin. Sequence data were combined with their corresponding genetic and pachytene chromosome positions into an integrated map that covers approximately a third of the chromosome 6 euchromatin and a small part of the pericentromeric heterochromatin. We then compared physical length (Mb), genetic (cM) and chromosome distances (µm) for determining gap sizes between contigs, revealing relative hot and cold spots of recombination. Through sequence annotation we identified several clusters of functionally related genes and an uneven distribution of both gene and repeat sequences between heterochromatin and euchromatin domains. Although a greater number of the non-transposon genes were located in the euchromatin, the highly repetitive (22.4%) pericentromeric heterochromatin displayed an unexpectedly high gene content of one gene per 36.7 kb. Surprisingly, the short-arm euchromatin was relatively rich in repeats as well, with a repeat content of 13.4%, yet the ratio of Ty3/Gypsy and Ty1/Copia retrotransposable elements across the chromosome clearly distinguished euchromatin (2:3) from heterochromatin (3:2).
BMC Bioinformatics | 2008
Mark Fiers; Ate van der Burgt; Erwin Datema; Joost C. W. de Groot; Roeland C. H. J. van Ham
Background: Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results: We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web-based graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system, tracks what data enters the system and determines what jobs must be scheduled for execution; and 3) the Executor, which searches for scheduled jobs and executes these on a compute cluster. Conclusion: The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high-throughput, flexible bioinformatics pipelines.
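The split between a Scheduler that tracks incoming data and decides which jobs to queue, and an Executor that picks up queued jobs and runs them, can be illustrated with a minimal Python sketch. This is not Cyrille2's code or API. Every name here (`Job`, `Scheduler`, `Executor`, `register`, `run_pending`, `run_demo`) is hypothetical, the GUI is omitted, and the "compute cluster" is replaced by plain in-process function calls.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List, Optional, Set, Tuple


@dataclass
class Job:
    """One unit of work: run the tool for `step` on `data` (hypothetical type)."""
    step: str
    data: str


@dataclass
class Scheduler:
    """Tracks data entering the pipeline and queues the jobs it implies."""
    # Maps a step name to the step that consumes its output.
    # A step with no entry is treated as the final step.
    next_step: Dict[str, str]
    queue: Deque[Job] = field(default_factory=deque)
    seen: Set[Tuple[str, str]] = field(default_factory=set)

    def register(self, step: str, data: str) -> None:
        """Queue a job for new data; data already seen for this step is skipped."""
        key = (step, data)
        if key in self.seen:
            return
        self.seen.add(key)
        self.queue.append(Job(step, data))


@dataclass
class Executor:
    """Looks for scheduled jobs and runs them (here: locally, one at a time)."""
    tools: Dict[str, Callable[[str], str]]

    def run_pending(self, scheduler: Scheduler) -> List[str]:
        finished: List[str] = []
        while scheduler.queue:
            job = scheduler.queue.popleft()
            result = self.tools[job.step](job.data)
            following: Optional[str] = scheduler.next_step.get(job.step)
            if following is None:
                finished.append(result)  # output of the final step
            else:
                # New data entered the system, so the scheduler decides on new jobs.
                scheduler.register(following, result)
        return finished


def run_demo() -> List[str]:
    """Two-step toy pipeline: clean a sequence string, then report its length."""
    scheduler = Scheduler(next_step={"clean": "count"})
    executor = Executor(tools={
        "clean": lambda s: s.strip().upper(),
        "count": lambda s: f"{s}:{len(s)}",
    })
    scheduler.register("clean", " acgt ")
    scheduler.register("clean", " acgt ")  # duplicate input, ignored
    return executor.run_pending(scheduler)
```

Keeping the bookkeeping (what has been seen, what runs next) in the scheduler and the tool invocation in the executor mirrors the separation the abstract describes. In this sketch the executor could be swapped for one that submits jobs to a cluster without touching the scheduling logic.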