Publication


Featured research published by Richard M. Leggett.


Genome Research | 2011

Assemblathon 1: A competitive assessment of de novo short read assembly methods

Dent Earl; Keith Bradnam; John St. John; Aaron E. Darling; Dawei Lin; Joseph Fass; Hung On Ken Yu; Vince Buffalo; Daniel R. Zerbino; Mark Diekhans; Ngan Nguyen; Pramila Ariyaratne; Wing-Kin Sung; Zemin Ning; Matthias Haimel; Jared T. Simpson; Nuno A. Fonseca; Inanc Birol; T. Roderick Docking; Isaac Ho; Daniel S. Rokhsar; Rayan Chikhi; Dominique Lavenier; Guillaume Chapuis; Delphine Naquin; Nicolas Maillet; Michael C. Schatz; David R. Kelley; Adam M. Phillippy; Sergey Koren

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.


Plant Journal | 2013

Resistance gene enrichment sequencing (RenSeq) enables reannotation of the NB‐LRR gene family from sequenced plant genomes and rapid mapping of resistance loci in segregating populations

Florian Jupe; Kamil Witek; Walter Verweij; Jadwiga Śliwka; Leighton Pritchard; Graham J. Etherington; Daniel MacLean; Peter J. A. Cock; Richard M. Leggett; Glenn J. Bryan; Linda Cardle; Ingo Hein; Jonathan D. G. Jones

Summary: RenSeq is an NB-LRR (nucleotide binding-site leucine-rich repeat) gene-targeted, Resistance gene enrichment and sequencing method that enables discovery and annotation of pathogen resistance gene family members in plant genome sequences. We successfully applied RenSeq to the sequenced potato Solanum tuberosum clone DM, and increased the number of identified NB-LRRs from 438 to 755. The majority of these identified R gene loci reside in poorly or previously unannotated regions of the genome. Sequence and positional details on the 12 chromosomes have been established for 704 NB-LRRs and can be accessed through a genome browser that we provide. We compared these NB-LRR genes and the corresponding oligonucleotide baits with the highest sequence similarity and demonstrated that ∼80% sequence identity is sufficient for enrichment. Analysis of the sequenced tomato S. lycopersicum ‘Heinz 1706’ extended the NB-LRR complement to 394 loci. We further describe a methodology that applies RenSeq to rapidly identify molecular markers that co-segregate with a pathogen resistance trait of interest. In two independent segregating populations involving the wild Solanum species S. berthaultii (Rpi-ber2) and S. ruiz-ceballosii (Rpi-rzc1), we were able to apply RenSeq successfully to identify markers that co-segregate with resistance towards the late blight pathogen Phytophthora infestans. These SNP identification workflows were designed as easy-to-adapt Galaxy pipelines.


F1000Research | 2015

MinION Analysis and Reference Consortium: Phase 1 data release and analysis

Camilla L. C. Ip; Matthew Loose; John R. Tyson; Mariateresa de Cesare; Bonnie L. Brown; Miten Jain; Richard M. Leggett; David Eccles; Vadim Zalunin; John M. Urban; Paolo Piazza; Rory Bowden; Benedict Paten; Solomon Mwaigwisya; Elizabeth M. Batty; Jared T. Simpson; Terrance P. Snutch; Ewan Birney; David Buck; Sara Goodwin; Hans J. Jansen; Justin O'Grady; Hugh E. Olsen; MinION Analysis

The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, and additional experiments in Phase 2 of E. coli from MARC are already underway to identify ways to improve and enhance MinION performance.


Bioinformatics | 2014

NextClip: an analysis and read preparation tool for Nextera long mate pair libraries

Richard M. Leggett; Bernardo Clavijo; Leah Clissold; Matthew D. Clark; Mario Caccamo

Summary: Illumina's recently released Nextera Long Mate Pair (LMP) kit enables production of jumping libraries of up to 12 kb. The LMP libraries are an invaluable resource for carrying out complex assemblies and other downstream bioinformatics analyses such as the characterization of structural variants. However, LMP libraries are intrinsically noisy and, to maximize their value, post-sequencing data analysis is required. Standardizing laboratory protocols and the selection of sequenced reads for downstream analysis are non-trivial tasks. NextClip is a tool for analyzing reads from LMP libraries, generating a comprehensive quality report and extracting good quality trimmed and deduplicated reads. Availability and implementation: Source code, user guide and example data are available from https://github.com/richardmleggett/nextclip/.


Virology | 2013

Metagenomic study of the viruses of African straw-coloured fruit bats: Detection of a chiropteran poxvirus and isolation of a novel adenovirus

Kate S. Baker; Richard M. Leggett; Nicholas Bexfield; Mark Alston; Gordon M. Daly; Shawn Todd; Mary Tachedjian; Clare Holmes; Sandra Crameri; Lin-Fa Wang; Jonathan L. Heeney; Richard Suu-Ire; Paul Kellam; Andrew A. Cunningham; J. L. N. Wood; Mario Caccamo; Pablo R. Murcia

Viral emergence as a result of zoonotic transmission constitutes a continuous public health threat. Emerging viruses such as SARS coronavirus, hantaviruses and henipaviruses have wildlife reservoirs. Characterising the viruses of candidate reservoir species in geographical hot spots for viral emergence is a sensible approach to develop tools to predict, prevent, or contain emergence events. Here, we explore the viruses of Eidolon helvum, an Old World fruit bat species widely distributed in Africa that lives in close proximity to humans. We identified a great abundance and diversity of novel herpes and papillomaviruses, described the isolation of a novel adenovirus, and detected, for the first time, sequences of a chiropteran poxvirus closely related to Molluscum contagiosum. In sum, E. helvum display a wide variety of mammalian viruses, some of them genetically similar to known human pathogens, highlighting the possibility of zoonotic transmission.


Proceedings of the National Academy of Sciences of the United States of America | 2013

Coiled-coil protein Scy is a key component of a multiprotein assembly controlling polarized growth in Streptomyces

Neil A. Holmes; John Walshaw; Richard M. Leggett; Kate A. Dalton; Gillespie; Andrew M. Hemmings; B Gust; Gabriella H. Kelemen

Polarized growth in eukaryotes requires polar multiprotein complexes. Here, we establish that selection and maintenance of cell polarity for growth also requires a dedicated multiprotein assembly in the filamentous bacterium, Streptomyces coelicolor. We present evidence for a tip organizing center and confirm two of its main components: Scy (Streptomyces cytoskeletal element), a unique bacterial coiled-coil protein with an unusual repeat periodicity, and the known polarity determinant DivIVA. We also establish a link between the tip organizing center and the filament-forming protein FilP. Interestingly, both deletion and overproduction of Scy generated multiple polarity centers, suggesting a mechanism wherein Scy can both promote and limit the number of emerging polarity centers via the organization of the Scy-DivIVA assemblies. We propose that Scy is a molecular “assembler,” which, by sequestering DivIVA, promotes the establishment of new polarity centers for de novo tip formation during branching, as well as supporting polarized growth at existing hyphal tips.


Frontiers in Genetics | 2013

Sequencing quality assessment tools to enable data-driven informatics for high throughput genomics.

Richard M. Leggett; Ricardo H. Ramirez-Gonzalez; Bernardo Clavijo; Darren Waite; Robert Davey

The processes of quality assessment and control are an active area of research at The Genome Analysis Centre (TGAC). Unlike other sequencing centers that often concentrate on a certain species or technology, TGAC applies expertise in genomics and bioinformatics to a wide range of projects, often requiring bespoke wet lab and in silico workflows. TGAC is fortunate to have access to a diverse range of sequencing and analysis platforms, and we are at the forefront of investigations into library quality and sequence data assessment. We have developed and implemented a number of algorithms, tools, pipelines and packages to ascertain, store, and expose quality metrics across a number of next-generation sequencing platforms, allowing rapid and in-depth cross-platform Quality Control (QC) bioinformatics. In this review, we describe these tools as a vehicle for data-driven informatics, offering the potential to provide richer context for downstream analysis and to inform experimental design.


F1000Research | 2017

MinION Analysis and Reference Consortium: Phase 2 data release and analysis of R9.0 chemistry

Miten Jain; John R. Tyson; Matthew Loose; Camilla L. C. Ip; David Eccles; Justin O'Grady; Sunir Malla; Richard M. Leggett; Ola Wallerman; Hans J. Jansen; Vadim Zalunin; Ewan Birney; Bonnie L. Brown; Terrance P. Snutch; Hugh E. Olsen

Background: Long-read sequencing is rapidly evolving and reshaping the suite of opportunities for genomic analysis. For the MinION in particular, as both the platform and chemistry develop, the user community requires reference data to set performance expectations and maximally exploit third-generation sequencing. We performed an analysis of MinION data derived from whole genome sequencing of Escherichia coli K-12 using the R9.0 chemistry, comparing the results with the older R7.3 chemistry. Methods: We computed the error-rate estimates for insertions, deletions, and mismatches in MinION reads. Results: Run-time characteristics of the flow cell and run scripts for R9.0 were similar to those observed for R7.3 chemistry, but with an 8-fold increase in bases per second (from 30 bps in R7.3 and SQK-MAP005 library preparation, to 250 bps in R9.0) processed by individual nanopores, and less drop-off in yield over time. The 2-dimensional (“2D”) N50 read length was unchanged from the prior chemistry. Using the proportion of alignable reads as a measure of base-call accuracy, 99.9% of “pass” template reads from 1-dimensional (“1D”) experiments were mappable and ~97% from 2D experiments. The median identity of reads was ~89% for 1D and ~94% for 2D experiments. The total error rate (miscall + insertion + deletion) decreased for 2D “pass” reads from 9.1% in R7.3 to 7.5% in R9.0 and for template “pass” reads from 26.7% in R7.3 to 14.5% in R9.0. Conclusions: These Phase 2 MinION experiments serve as a baseline by providing estimates for read quality, throughput, and mappability. The datasets further enable the development of bioinformatic tools tailored to the new R9.0 chemistry and the design of novel biological applications for this technology. Abbreviations: K: thousand, Kb: kilobase (one thousand base pairs), M: million, Mb: megabase (one million base pairs), Gb: gigabase (one billion base pairs).
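As a quick check on the figures quoted in the abstract above, the following Python sketch recomputes the throughput fold change and the relative drop in total error rate between R7.3 and R9.0. It uses only the numbers stated in the abstract; the helper names `fold_change` and `relative_reduction` are hypothetical, chosen for illustration, and are not part of any MARC tooling.

```python
def fold_change(old: float, new: float) -> float:
    """Ratio of the new value to the old value."""
    return new / old


def relative_reduction(old: float, new: float) -> float:
    """Fraction by which the value decreased relative to the old value."""
    return (old - new) / old


# Figures taken from the abstract (bases per second per nanopore, error rates in %).
throughput_fold = fold_change(30, 250)
err_2d = relative_reduction(9.1, 7.5)
err_template = relative_reduction(26.7, 14.5)

print(f"Throughput fold change: {throughput_fold:.2f}x")
print(f"2D 'pass' total error reduced by {err_2d:.1%}")
print(f"Template 'pass' total error reduced by {err_template:.1%}")
```

Under this reading, the "8-fold" increase corresponds to 250 / 30, and the improvement in total error rate is proportionally much larger for template reads than for 2D reads.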


Bioinformatics | 2015

NanoOK: Multi-reference alignment analysis of nanopore sequencing data, quality and error profiles

Richard M. Leggett; Darren Heavens; Mario Caccamo; Matthew D. Clark; Robert Davey

Motivation: The Oxford Nanopore MinION sequencer, currently in pre-release testing through the MinION Access Programme (MAP), promises long reads in real-time from an inexpensive, compact, USB device. Tools have been released to extract FASTA/Q from the MinION base calling output and to provide basic yield statistics. However, no single tool yet exists to provide comprehensive alignment-based quality control and error profile analysis, something that is extremely important given the speed with which the platform is evolving. Results: NanoOK generates detailed tabular and graphical output plus an in-depth multi-page PDF report including error profile, quality and yield data. NanoOK is multi-reference, enabling detailed analysis of metagenomic or multiplexed samples. Four popular Nanopore aligners are supported and it is easily extensible to include others. Availability and implementation: NanoOK is open-source software, implemented in Java with supporting R scripts. It has been tested on Linux and Mac OS X and can be downloaded from https://github.com/TGAC/NanoOK. A VirtualBox VM containing all dependencies and the DH10B read set used in this article is available from http://opendata.tgac.ac.uk/nanook/. A Docker image is also available from Docker Hub; see the program documentation at https://documentation.tgac.ac.uk/display/NANOOK. Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Genomics | 2014

Reference-free SNP detection: dealing with the data deluge.

Richard M. Leggett; Daniel MacLean

Reference-free SNP detection, that is, identifying SNPs between samples directly from comparison of primary sequencing data with other primary sequencing data and not with a pre-assembled reference genome, is an emergent and potentially disruptive technology that is beginning to open up new vistas in variant identification and to reveal new applications in non-model organisms and metagenomics. The modern, efficient data structures these tools use enable researchers with a reference sequence to sample many more individuals with lower computing storage and processing overhead. In this article we discuss the technologies and tools implementing reference-free SNP detection and the potential impact on studies of genetic variation in model and non-model organisms, metagenomics, and personal genomics and medicine.

Collaboration


Dive into Richard M. Leggett's collaborations.

Top Co-Authors

Justin O'Grady

University College London

Camilla L. C. Ip

Wellcome Trust Centre for Human Genetics
