Publication


Featured research published by Robert Davey.


Genome Research | 2017

An improved assembly and annotation of the allohexaploid wheat genome identifies complete families of agronomic genes and provides genomic evidence for chromosomal translocations

Bernardo Clavijo; Luca Venturini; Christian Schudoma; Gonzalo Garcia Accinelli; Gemy Kaithakottil; Jonathan Wright; Philippa Borrill; George Kettleborough; Darren Heavens; Helen D. Chapman; James Lipscombe; Tom Barker; Fu-Hao Lu; Neil McKenzie; Dina Raats; Ricardo H. Ramirez-Gonzalez; Aurore Coince; Ned Peel; Lawrence Percival-Alwyn; Owen Duncan; Josua Trösch; Guotai Yu; Dan Bolser; Guy Namaati; Arnaud Kerhornou; Manuel Spannagl; Heidrun Gundlach; Georg Haberer; Robert Davey; Christine Fosker

Advances in genome sequencing and assembly technologies are generating many high-quality genome sequences, but assemblies of large, repeat-rich polyploid genomes, such as that of bread wheat, remain fragmented and incomplete. We have generated a new wheat whole-genome shotgun sequence assembly using a combination of optimized data types and an assembly algorithm designed to deal with large and complex genomes. The new assembly represents >78% of the genome, has a scaffold N50 of 88.8 kb, and shows high fidelity to the input data. Our new annotation combines strand-specific Illumina RNA-seq and Pacific Biosciences (PacBio) full-length cDNAs to identify 104,091 high-confidence protein-coding genes and 10,156 noncoding RNA genes. We confirmed three known genome rearrangements and identified one novel rearrangement. Our approach enables the rapid and scalable assembly of wheat genomes, the identification of structural variants, and the definition of complete gene models, all powerful resources for trait analysis and breeding of this key global crop.
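
The scaffold N50 quoted in this abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of that definition follows. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not code or data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80: 100 + 80 = 180 covers half of 300
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why the abstract reports fidelity to the input data separately.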


Genome Research | 2009

Repetitive sequence variation and dynamics in the ribosomal DNA array of Saccharomyces cerevisiae as revealed by whole-genome resequencing

Stephen A. James; Michael J.T. O'Kelly; David M. Carter; Robert Davey; Alexander van Oudenaarden; Ian N. Roberts

Ribosomal DNA (rDNA) plays a key role in ribosome biogenesis, encoding genes for the structural RNA components of this important cellular organelle. These genes are vital for efficient functioning of the cellular protein synthesis machinery and as such are highly conserved and normally present in high copy numbers. In the baker's yeast Saccharomyces cerevisiae, there are more than 100 rDNA repeats located at a single locus on chromosome XII. Stability and sequence homogeneity of the rDNA array are essential for function, and this is achieved primarily by the mechanism of gene conversion. Detecting variation within these arrays is extremely problematic due to their large size and repetitive structure. In an attempt to address this, we have analyzed over 35 Mbp of rDNA sequence obtained from whole-genome shotgun sequencing (WGSS) of 34 strains of S. cerevisiae. Contrary to expectation, we find that significant rDNA sequence variation exists within individual genomes. Many of the detected polymorphisms are not fully resolved. For this type of sequence variation, we introduce the term partial single nucleotide polymorphism, or pSNP. Comparative analysis of the complete data set reveals that different S. cerevisiae genomes possess different patterns of rDNA polymorphism, with much of the variation located within the rapidly evolving nontranscribed intergenic spacer (IGS) region. Furthermore, we find that strains known to have either structured or mosaic/hybrid genomes can be distinguished from one another based on rDNA pSNP number, indicating that pSNP dynamics may provide a reliable new measure of genome origin and stability.
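
To make the pSNP idea concrete: when reads from all rDNA repeats pile up on one reference position, a variant carried by only some repeat copies shows up as an intermediate non-reference fraction. The Python sketch below classifies a single position on that basis. It is an illustration only and not the authors' detection method; the function name `classify_site` and the `low` and `high` cut-offs are assumptions chosen for the example.

```python
from collections import Counter


def classify_site(bases, reference, low=0.05, high=0.95):
    """Classify one pileup position by its non-reference base fraction.

    bases:     string of base calls observed at the position
    reference: the reference base at the position
    low, high: hypothetical cut-offs, not values taken from the paper
    """
    counts = Counter(bases)
    depth = sum(counts.values())
    if depth == 0:
        return "no coverage"
    non_ref_fraction = (depth - counts.get(reference, 0)) / depth
    if non_ref_fraction < low:
        return "reference"
    if non_ref_fraction > high:
        return "fully resolved SNP"
    # Variant present in some, but not all, repeat copies.
    return "pSNP"


# Hypothetical pileup: 70 reads support A (reference), 30 support G.
print(classify_site("A" * 70 + "G" * 30, reference="A"))  # pSNP
```

Real data would also need base-quality filtering and a coverage threshold so that sequencing errors are not counted as low-frequency variants.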


Frontiers in Genetics | 2013

Sequencing quality assessment tools to enable data-driven informatics for high throughput genomics.

Richard M. Leggett; Ricardo H. Ramirez-Gonzalez; Bernardo Clavijo; Darren Waite; Robert Davey

The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC). Unlike other sequencing centers that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform Quality Control (QC) bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.


Bioinformatics | 2015

NanoOK: Multi-reference alignment analysis of nanopore sequencing data, quality and error profiles

Richard M. Leggett; Darren Heavens; Mario Caccamo; Matthew D. Clark; Robert Davey

Motivation: The Oxford Nanopore MinION sequencer, currently in pre-release testing through the MinION Access Programme (MAP), promises long reads in real-time from an inexpensive, compact, USB device. Tools have been released to extract FASTA/Q from the MinION base calling output and to provide basic yield statistics. However, no single tool yet exists to provide comprehensive alignment-based quality control and error profile analysis, something that is extremely important given the speed with which the platform is evolving. Results: NanoOK generates detailed tabular and graphical output plus an in-depth multi-page PDF report including error profile, quality and yield data. NanoOK is multi-reference, enabling detailed analysis of metagenomic or multiplexed samples. Four popular Nanopore aligners are supported and it is easily extensible to include others. Availability and implementation: NanoOK is open-source software, implemented in Java with supporting R scripts. It has been tested on Linux and Mac OS X and can be downloaded from https://github.com/TGAC/NanoOK. A VirtualBox VM containing all dependencies and the DH10B read set used in this article is available from http://opendata.tgac.ac.uk/nanook/. A Docker image is also available from Docker Hub; see the program documentation at https://documentation.tgac.ac.uk/display/NANOOK. Supplementary information: Supplementary data are available at Bioinformatics online.


F1000Research | 2014

The Open Science Peer Review Oath

Jelena Aleksic; Adrian Alexa; Teresa K. Attwood; Neil Chue Hong; Martin Dahlö; Robert Davey; Holger Dinkel; Konrad U. Förstner; Ivo Grigorov; Jean-Karim Hériché; Leo Lahti; Daniel MacLean; Michael Markie; Jenny Molloy; Maria Victoria Schneider; Camille Scott; Richard Smith-Unna; Bruno Vieira

One of the foundations of the scientific method is to be able to reproduce experiments and corroborate the results of research that has been done before. However, with the increasing complexities of new technologies and techniques, coupled with the specialisation of experiments, reproducing research findings has become a growing challenge. Clearly, scientific methods must be conveyed succinctly, and with clarity and rigour, in order for research to be reproducible. Here, we propose steps to help increase the transparency of the scientific method and the reproducibility of research results: specifically, we introduce a peer-review oath and accompanying manifesto. These have been designed to offer guidelines to enable reviewers (with the minimum friction or bias) to follow and apply open science principles, and support the ideas of transparency, reproducibility and ultimately greater societal impact. Introducing the oath and manifesto at the stage of peer review will help to check that the research being published includes everything that other researchers would need to successfully repeat the work. Peer review is the lynchpin of the publishing system: encouraging the community to consciously (and conscientiously) uphold these principles should help to improve published papers, increase confidence in the reproducibility of the work and, ultimately, provide strategic benefits to authors and their institutions.


BMC Bioinformatics | 2016

CerealsDB 3.0: expansion of resources and data integration

Paul A. Wilkinson; Mark O. Winfield; Gary L. A. Barker; Simon Tyrrell; Xingdong Bian; Alexandra M. Allen; Amanda J. Burridge; Jane A. Coghill; Christy Waterfall; Mario Caccamo; Robert Davey; Keith J. Edwards

Background: The increase in human populations around the world has put pressure on resources, and as a consequence food security has become an important challenge for the 21st century. Wheat (Triticum aestivum) is one of the most important crops in human and livestock diets, and the development of wheat varieties that produce higher yields, combined with increased resistance to pests and resilience to changes in climate, has meant that wheat breeding has become an important focus of scientific research. In an attempt to facilitate these improvements in wheat, plant breeders have employed molecular tools to help them identify genes for important agronomic traits that can be bred into new varieties. Modern molecular techniques have ensured that the rapid and inexpensive characterisation of SNP markers and their validation with modern genotyping methods has produced a valuable resource that can be used in marker assisted selection. CerealsDB was created as a means of quickly disseminating this information to breeders and researchers around the globe. Description: CerealsDB version 3.0 is an online resource that contains a wide range of genomic datasets for wheat that will assist plant breeders and scientists to select the most appropriate markers for use in marker assisted selection. CerealsDB includes a database which currently contains in excess of a million putative varietal SNPs, of which several hundreds of thousands have been experimentally validated. In addition, CerealsDB also contains new data on functional SNPs predicted to have a major effect on protein function and we have constructed a web service to encourage data integration and high-throughput programmatic access. Conclusion: CerealsDB is an open access website that hosts information on SNPs that are considered useful for both plant breeders and research scientists. The recent inclusion of web services designed to federate genomic data resources allows the information on CerealsDB to be more fully integrated with the WheatIS network and other biological databases.


F1000Research | 2013

StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics

Ricardo H. Ramirez-Gonzalez; Richard M. Leggett; Darren Waite; Anil Thanki; Nizar Drou; Mario Caccamo; Robert Davey

Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge to understand its quality and to understand how the quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages.
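
The kind of cross-run query described in this abstract can be sketched with Python's built-in `sqlite3` module. Everything below is hypothetical: the `run_metric` table, its columns, the helper `mean_metric`, and the sample rows are assumptions made for illustration and do not reflect StatsDB's actual schema or API.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a made-up run-metrics table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE run_metric ("
        " instrument TEXT, barcode TEXT, run_date TEXT,"
        " metric TEXT, value REAL)"
    )
    rows = [
        ("sequencer_A", "barcode_X", "2013-05-02", "gc_percent", 41.0),
        ("sequencer_A", "barcode_X", "2013-05-20", "gc_percent", 43.0),
        ("sequencer_A", "barcode_Y", "2013-05-20", "gc_percent", 55.0),
        ("sequencer_B", "barcode_X", "2013-05-21", "gc_percent", 60.0),
        ("sequencer_A", "barcode_X", "2013-03-01", "gc_percent", 30.0),
    ]
    conn.executemany("INSERT INTO run_metric VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def mean_metric(conn, metric, instrument, barcode, since):
    """Average one metric for a barcode on an instrument since an ISO date.

    Callers pass plain values; the SQL and table layout stay hidden here,
    which is the abstraction the abstract attributes to an API layer.
    """
    row = conn.execute(
        "SELECT AVG(value) FROM run_metric"
        " WHERE metric = ? AND instrument = ? AND barcode = ?"
        " AND run_date >= ?",
        (metric, instrument, barcode, since),
    ).fetchone()
    return row[0]  # None when no rows match


conn = build_demo_db()
print(mean_metric(conn, "gc_percent", "sequencer_A", "barcode_X", "2013-05-01"))  # 42.0
```

Storing dates as ISO 8601 strings keeps the "within the last month" filter a plain string comparison, and the parameterised query avoids building SQL from user input.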


Nature Plants | 2017

Data management and best practice for plant science

Sabina Leonelli; Robert Davey; Elizabeth Arnaud; Geraint Parry; Ruth Bastow

Plant research produces data in a profusion of types and scales, and in ever-increasing volume. What are the challenges and opportunities presented by data management in contemporary plant science? And how can researchers make efficient and fruitful use of data management tools and strategies?


Bioinformatics | 2007

MPP: a microarray-to-phylogeny pipeline for analysis of gene and marker content datasets

Robert Davey; George M. Savva; Jo Dicks; Ian N. Roberts

MPP is a Java application, encompassing both new and established algorithms, for the analysis of gene and marker content datasets arising from high-throughput microarray techniques. MPP analyses flat file output from microarray experiments to determine the probability of the presence or absence of genes or markers within a genome. MPP can construct gene or marker content datasets for a number of genomes and can use the data to estimate an evolutionary tree or network. Results from gene content analyses may be validated by comparing them to known gene contents. MPP was initially developed to analyse data derived from comparative genome hybridization (CGH) microarray experiments in fungi and bacteria. It has recently been adapted to analyse retrotransposon-based insertion polymorphism (RBIP) marker scores derived from tagged microarray marker (TAM) experiments in pea. New analytical procedures may be added easily to MPP as plugins in order to increase the scope of the software. Availability: MPP source code, executables and online help are available at http://cbr.jic.ac.uk/dicks/software/


ACS Synthetic Biology | 2017

Leaf LIMS: A Flexible Laboratory Information Management System with a Synthetic Biology Focus

Thomas Craig; Richard Holland; Rosalinda D’Amore; James Johnson; Hannah V. McCue; Anthony West; Valentin Zulkower; Hille Tekotte; Yizhi Cai; Daniel Swan; Robert Davey; Christiane Hertz-Fowler; Anthony Hall; Mark X. Caddick

This paper presents Leaf LIMS, a flexible laboratory information management system (LIMS) designed to address the complexity of synthetic biology workflows. At the project's inception there was a lack of a LIMS designed specifically to address synthetic biology processes, with most systems focused on either next generation sequencing or biobanks and clinical sample handling. Leaf LIMS implements integrated project, item, and laboratory stock tracking, offering complete sample and construct genealogy, materials and lot tracking, and modular assay data capture. Hence, it enables highly configurable task-based workflows and supports data capture from project inception to completion. As such, in addition to supporting synthetic biology, it is ideal for many laboratory environments with multiple projects and users. The system is deployed as a web application through Docker and is provided under a permissive MIT license. It is freely available for download at https://leaflims.github.io.

Collaboration


Dive into Robert Davey's collaborations.

Top Co-Authors


Javier Herrero

University College London
