Publication


Featured research published by Dan Bolser.


Nature | 2012

Analysis of the bread wheat genome using whole-genome shotgun sequencing

Rachel Brenchley; Manuel Spannagl; Matthias Pfeifer; Gary L. A. Barker; Rosalinda D’Amore; Alexandra M. Allen; Neil McKenzie; Melissa Kramer; Arnaud Kerhornou; Dan Bolser; Suzanne Kay; Darren Waite; Martin Trick; Ian Bancroft; Yong Gu; Naxin Huo; Ming-Cheng Luo; Sunish K. Sehgal; Bikram S. Gill; Sharyar Kianian; Olin D. Anderson; Paul J. Kersey; Jan Dvorak; W. Richard McCombie; Anthony Hall; Klaus F. X. Mayer; Keith J. Edwards; Michael W. Bevan; Neil Hall

Bread wheat (Triticum aestivum) is a globally important crop, accounting for 20 per cent of the calories consumed by humans. Major efforts are underway worldwide to increase wheat production by extending genetic diversity and analysing key traits, and genomic resources can accelerate progress. But so far the very large size and polyploid complexity of the bread wheat genome have been substantial barriers to genome analysis. Here we report the sequencing of its large, 17-gigabase-pair, hexaploid genome using 454 pyrosequencing, and comparison of this with the sequences of diploid ancestral and progenitor genomes. We identified between 94,000 and 96,000 genes, and assigned two-thirds to the three component genomes (A, B and D) of hexaploid wheat. High-resolution synteny maps identified many small disruptions to conserved gene order. We show that the hexaploid genome is highly dynamic, with significant loss of gene family members on polyploidization and domestication, and an abundance of gene fragments. Several classes of genes involved in energy harvesting, metabolism and growth are among expanded gene families that could be associated with crop productivity. Our analyses, coupled with the identification of extensive genetic variation, provide a resource for accelerating gene discovery and improving this major crop.


Genome Research | 2017

An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations

Bernardo Clavijo; Luca Venturini; Christian Schudoma; Gonzalo Garcia Accinelli; Gemy Kaithakottil; Jonathan Wright; Philippa Borrill; George Kettleborough; Darren Heavens; Helen D. Chapman; James Lipscombe; Tom Barker; Fu-Hao Lu; Neil McKenzie; Dina Raats; Ricardo H. Ramirez-Gonzalez; Aurore Coince; Ned Peel; Lawrence Percival-Alwyn; Owen Duncan; Josua Trösch; Guotai Yu; Dan Bolser; Guy Naamati; Arnaud Kerhornou; Manuel Spannagl; Heidrun Gundlach; Georg Haberer; Robert Davey; Christine Fosker

Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome with a scaffold N50 of 88.8 kb that has a high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known and identified one novel genome rearrangements. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.


Bioinformatics | 2013

EDAM: an ontology of bioinformatics operations, types of data and identifiers, topics and formats

Jon C. Ison; Matúš Kalaš; Inge Jonassen; Dan Bolser; Mahmut Uludag; Hamish McWilliam; James Malone; Rodrigo Lopez; Steve Pettifer; Peter Rice

Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications, data schemas, datasets and publications within bioinformatics. EDAM applies to organizing and finding suitable tools and data and to automating their integration into complex applications or workflows. It includes over 2200 defined concepts and has successfully been used for annotations and implementations. Availability: The latest stable version of EDAM is available in OWL format from http://edamontology.org/EDAM.owl and in OBO format from http://edamontology.org/EDAM.obo. It can be viewed online at the NCBO BioPortal and the EBI Ontology Lookup Service. For documentation and license please refer to http://edamontology.org. This article describes version 1.2 available at http://edamontology.org/EDAM_1.2.owl.


Methods in Molecular Biology | 2016

Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

Dan Bolser; Daniel M. Staines; Emily Pritchard; Paul J. Kersey

Ensembl Plants (http://plants.ensembl.org) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided include genome sequence, gene models, functional annotation, and polymorphic loci. Additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene (http://www.gramene.org) and transPLANT (http://www.transplantdb.org) projects.


Nucleic Acids Research | 2016

Tools and data services registry: a community effort to document bioinformatics resources

Jon Ison; Kristoffer Rapacki; Hervé Ménager; Matúš Kalaš; Emil Rydza; Piotr Jaroslaw Chmura; Christian Anthon; Niall Beard; Karel Berka; Dan Bolser; Tim Booth; Anthony Bretaudeau; Jan Brezovsky; Rita Casadio; Gianni Cesareni; Frederik Coppens; Michael Cornell; Gianmauro Cuccuru; Kristian Davidsen; Gianluca Della Vedova; Tunca Doğan; Olivia Doppelt-Azeroual; Laura Emery; Elisabeth Gasteiger; Thomas Gatter; Tatyana Goldberg; Marie Grosjean; Björn Grüning; Manuela Helmer-Citterich; Hans Ienasescu

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for the task at hand. Here we present a community-driven curation effort, supported by ELIXIR—the European infrastructure for biological information—that aspires to a comprehensive and consistent registry of information about bioinformatics resources. The sustainable upkeep of this Tools and Data Services Registry is assured by a curation effort driven by and tailored to local needs, and shared amongst a network of engaged partners. As of November 2015, the registry includes 1785 resources, with depositions from 126 individual registrations including 52 institutional providers and 74 individuals. With community support, the registry can become a standard for dissemination of information about bioinformatics resources: we welcome everyone to join us in this common endeavour. The registry is freely available at https://bio.tools.


Nucleic Acids Research | 2012

The SEQanswers wiki: a wiki database of tools for high-throughput sequencing analysis

Jing-Woei Li; Keith Robison; Marcel Martin; Andreas Sjödin; Matthew D. Young; Eric C. Olivares; Dan Bolser

Recent advances in sequencing technology have created unprecedented opportunities for biological research. However, the increasing throughput of these technologies has created many challenges for data management and analysis. As the demand for sophisticated analyses increases, the development time of software and algorithms is outpacing the speed of traditional publication. As technologies continue to be developed, methods change rapidly, making publications less relevant for users. The SEQanswers wiki (SEQwiki) is a wiki database that is actively edited and updated by the members of the SEQanswers community (http://SEQanswers.com/). The wiki provides an extensive catalogue of tools, technologies and tutorials for high-throughput sequencing (HTS), including information about HTS service providers. It has been implemented in MediaWiki with the Semantic MediaWiki and Semantic Forms extensions to collect structured data, providing powerful navigation and reporting features. Within 2 years, the community has created pages for over 500 tools, with approximately 400 literature references and 600 web links. This collaborative effort has made SEQwiki the most comprehensive database of HTS tools anywhere on the web. The wiki includes task-focused mini-reviews of commonly used tools, and a growing collection of more than 100 HTS service providers. SEQwiki is available at: http://wiki.SEQanswers.com/.


Nucleic Acids Research | 2012

MetaBase—the wiki-database of biological databases

Dan Bolser; Pierre-Yves Chibon; Nicolas Palopoli; Sungsam Gong; Daniel Jacob; Victoria Dominguez Del Angel; Dan Swan; Sebastian Bassi; Virginia González; Prashanth Suravajhala; Seungwoo Hwang; Paolo Romano; Robert Edwards; Bryan Bishop; John Eargle; Timur Shtatland; Nicholas J. Provart; Dave Clements; Daniel P. Renfro; Daeui Bhak; Jong Bhak

Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project.


Plant and Cell Physiology | 2015

Triticeae resources in Ensembl Plants.

Dan Bolser; Arnaud Kerhornou; Brandon Walts; Paul J. Kersey

Recent developments in DNA sequencing have enabled the large and complex genomes of many crop species to be determined for the first time, even those previously intractable due to their polyploid nature. Indeed, over the course of the last 2 years, the genome sequences of several commercially important cereals, notably barley and bread wheat, have become available, as well as those of related wild species. While still incomplete, comparison with other, more completely assembled species suggests that coverage of genic regions is likely to be high. Ensembl Plants (http://plants.ensembl.org) is an integrative resource organizing, analyzing and visualizing genome-scale information for important crop and model plants. Available data include reference genome sequence, variant loci, gene models and functional annotation. For variant loci, individual and population genotypes, linkage information and, where available, phenotypic information are shown. Comparative analyses are performed on DNA and protein sequence alignments. The resulting genome alignments and gene trees, representing the implied evolutionary history of the gene family, are made available for visualization and analysis. Driven by the case of bread wheat, specific extensions to the analysis pipelines and web interface have recently been developed to support polyploid genomes. Data in Ensembl Plants is accessible through a genome browser incorporating various specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These interfaces are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests and pollinators, facilitating the study of the plant in its environment.


Nucleic Acids Research | 2018

Ensembl Genomes 2018: an integrated omics infrastructure for non-vertebrate species.

Paul J. Kersey; James E. Allen; Alexis Allot; Matthieu Barba; Sanjay Boddu; Bruce J. Bolt; Denise R. Carvalho-Silva; Mikkel Christensen; Paul Davis; Christoph Grabmueller; Navin Kumar; Zicheng Liu; Thomas Maurel; Ben Moore; Mark D. McDowall; Uma Maheswari; Guy Naamati; Victoria Newman; Chuang Kee Ong; Michael Paulini; Helder Pedro; Emily Perry; Matthew Russell; Helen Sparrow; Electra Tapanari; Kieron Taylor; Alessandro Vullo; Gareth Williams; Amonida Zadissia; Andrew Olson

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including genome sequence, gene models, transcript sequence, genetic variation, and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments and expansions. These include the incorporation of almost 20 000 additional genome sequences and over 35 000 tracks of RNA-Seq data, which have been aligned to genomic sequence and made available for visualization. Other advances since 2015 include the release of the database in Resource Description Framework (RDF) format, a large increase in community-derived curation, a new high-performance protein sequence search, additional cross-references, improved annotation of non-protein-coding genes, and the launch of pre-release and archival sites. Collectively, these changes are part of a continuing response to the increasing quantity of publicly-available genome-scale data, and the consequent need to archive, integrate, annotate and disseminate these using automated, scalable methods.


PLOS ONE | 2014

De Novo Transcriptome Assembly and Analyses of Gene Expression during Photomorphogenesis in Diploid Wheat Triticum monococcum

Samuel E. Fox; Matthew Geniza; Mamatha Hanumappa; Sushma Naithani; Christopher M. Sullivan; Justin Preece; Vijay K. Tiwari; Justin Elser; Jeffrey M. Leonard; Abigail Sage; Cathy Gresham; Arnaud Kerhornou; Dan Bolser; Fiona M. McCarthy; Paul J. Kersey; Gerard R. Lazo; Pankaj Jaiswal

Background: Triticum monococcum (2n) is a close ancestor of T. urartu, the A-genome progenitor of cultivated hexaploid wheat, and is therefore a useful model for the study of components regulating photomorphogenesis in diploid wheat. In order to develop genetic and genomic resources for such a study, we constructed genome-wide transcriptomes of two Triticum monococcum subspecies, the wild winter wheat T. monococcum ssp. aegilopoides (accession G3116) and the domesticated spring wheat T. monococcum ssp. monococcum (accession DV92) by generating de novo assemblies of RNA-Seq data derived from both etiolated and green seedlings. Principal Findings: The de novo transcriptome assemblies of DV92 and G3116 represent 120,911 and 117,969 transcripts, respectively. We successfully mapped ∼90% of these transcripts from each accession to barley and ∼95% of the transcripts to T. urartu genomes. However, only ∼77% transcripts mapped to the annotated barley genes and ∼85% transcripts mapped to the annotated T. urartu genes. Differential gene expression analyses revealed 22% more light up-regulated and 35% more light down-regulated transcripts in the G3116 transcriptome compared to DV92. The DV92 and G3116 mRNA sequence reads aligned against the reference barley genome led to the identification of ∼500,000 single nucleotide polymorphism (SNP) and ∼22,000 simple sequence repeat (SSR) sites. Conclusions: De novo transcriptome assemblies of two accessions of the diploid wheat T. monococcum provide new empirical transcriptome references for improving Triticeae genome annotations, and insights into transcriptional programming during photomorphogenesis. The SNP and SSR sites identified in our analysis provide additional resources for the development of molecular markers.

Collaboration


Dive into Dan Bolser's collaborations.

Top Co-Authors

Paul J. Kersey
European Bioinformatics Institute

Cathy Gresham
Mississippi State University

Gerard R. Lazo
Agricultural Research Service

Justin Elser
Oregon State University