Publication


Featured research published by Thomas D. Otto.


BMC Genomics | 2012

A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers

Michael A. Quail; Miriam Smith; Paul Coupland; Thomas D. Otto; Simon R. Harris; Thomas Richard Connor; Anna Bertoni; Harold Swerdlow; Yong Gu

Background: Next generation sequencing (NGS) technology has revolutionized genomic and genetic research. The pace of change in this area is rapid with three major new sequencing platforms having been released in 2011: Ion Torrent’s PGM, Pacific Biosciences’ RS and the Illumina MiSeq. Here we compare the results obtained with those platforms to the performance of the Illumina HiSeq, the current market leader. In order to compare these platforms, and get sufficient coverage depth to allow meaningful analysis, we have sequenced a set of 4 microbial genomes with mean GC content ranging from 19.3 to 67.7%. Together, these represent a comprehensive range of genome content. Here we report our analysis of that sequence data in terms of coverage distribution, bias, GC distribution, variant detection and accuracy. Results: Sequence generated by Ion Torrent, MiSeq and Pacific Biosciences technologies displays near perfect coverage behaviour on GC-rich, neutral and moderately AT-rich genomes, but a profound bias was observed upon sequencing the extremely AT-rich genome of Plasmodium falciparum on the PGM, resulting in no coverage for approximately 30% of the genome. We analysed the ability to call variants from each platform and found that we could call slightly more variants from Ion Torrent data compared to MiSeq data, but at the expense of a higher false positive rate. Variant calling from Pacific Biosciences data was possible but higher coverage depth was required. Context specific errors were observed in both PGM and MiSeq data, but not in that from the Pacific Biosciences platform. Conclusions: All three fast turnaround sequencers evaluated here were able to generate usable sequence. However, there are key differences between the quality of that data and the applications it will support.
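
The abstract above characterises genomes by their mean GC content (19.3 to 67.7%). As a minimal sketch of how such a figure can be computed, and not the authors' own pipeline, the Python below calculates the GC percentage of a sequence and of fixed-size windows along it. The function names `gc_content` and `windowed_gc` are hypothetical and chosen only for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the GC percentage of a DNA sequence.

    Only A, C, G and T are counted, so ambiguous bases such as N
    do not distort the result.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


def windowed_gc(seq: str, window: int) -> list:
    """Return GC percentages for consecutive non-overlapping windows.

    A trailing fragment shorter than `window` is ignored.
    """
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, window)
    ]


if __name__ == "__main__":
    print(gc_content("ATGC"))          # half of the bases are G or C
    print(windowed_gc("AAAAGGGG", 4))  # one AT-only and one GC-only window
```

Windowed values like these are one simple way to relate local base composition to coverage; the window size is a free parameter and would need tuning for real data.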


GigaScience | 2013

Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species

Keith Bradnam; Joseph Fass; Anton Alexandrov; Paul Baranay; Michael Bechner; Inanc Birol; Sébastien Boisvert; Jarrod Chapman; Guillaume Chapuis; Rayan Chikhi; Hamidreza Chitsaz; Wen Chi Chou; Jacques Corbeil; Cristian Del Fabbro; Roderick R. Docking; Richard Durbin; Dent Earl; Scott J. Emrich; Pavel Fedotov; Nuno A. Fonseca; Ganeshkumar Ganapathy; Richard A. Gibbs; Sante Gnerre; Élénie Godzaridis; Steve Goldstein; Matthias Haimel; Giles Hall; David Haussler; Joseph Hiatt; Isaac Ho

Background: The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results: In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions: Many current genome assemblers produced useful assemblies, containing a significant representation of their genes and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another.


BMC Genomics | 2012

Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes.

Samuel O. Oyola; Thomas D. Otto; Yong-ping Gu; Gareth Maslen; Magnus Manske; Susana Campino; Daniel J. Turner; Bronwyn MacInnis; Dominic P. Kwiatkowski; Harold Swerdlow; Michael A. Quail

Background: Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. Results: We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions, which involve a PCR additive (TMAC), produce amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. Conclusion: We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extreme of base composition. This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material.


Bioinformatics | 2009

ABACAS: algorithm-based automatic contiguation of assembled sequences

Samuel A. Assefa; Thomas M. Keane; Thomas D. Otto; Chris Newbold; Matthew Berriman

Summary: Due to the availability of new sequencing technologies, we are now increasingly interested in sequencing closely related strains of existing finished genomes. Recently a number of de novo and mapping-based assemblers have been developed to produce high quality draft genomes from new sequencing technology reads. New tools are necessary to take contigs from a draft assembly through to a fully contiguated genome sequence. ABACAS is intended as a tool to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence. The input to ABACAS is a set of contigs which will be aligned to the reference genome, ordered and orientated, visualized in the ACT comparative browser, and optimal primer sequences are automatically generated. Availability and Implementation: ABACAS is implemented in Perl and is freely available for download from http://abacas.sourceforge.net Contact: [email protected]
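
The ordering and orientation step described in the summary above can be illustrated with a short sketch. This is not the ABACAS code (which is written in Perl). It assumes that each contig has already been aligned to the reference by some external aligner and reduced to a single best hit. The `Hit` record, its fields and the function `order_and_orient` are hypothetical names introduced only for this example.

```python
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """Best alignment of one contig to the reference (assumed precomputed)."""
    contig: str     # contig identifier
    ref_start: int  # leftmost reference coordinate covered by the contig
    strand: str     # "+" if the contig aligns forward, "-" if reversed


_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def order_and_orient(
    contigs: Dict[str, str], hits: List[Hit]
) -> List[Tuple[str, str]]:
    """Order contigs by reference position and flip reverse-strand ones.

    Contigs without a hit are left out; a real tool would report them
    separately.
    """
    placed = []
    for hit in sorted(hits, key=lambda h: h.ref_start):
        seq = contigs[hit.contig]
        if hit.strand == "-":
            seq = reverse_complement(seq)
        placed.append((hit.contig, seq))
    return placed


if __name__ == "__main__":
    contigs = {"c1": "AAAC", "c2": "GGTT"}
    hits = [Hit("c1", 500, "-"), Hit("c2", 10, "+")]
    for name, seq in order_and_orient(contigs, hits):
        print(name, seq)
```

Sorting by the start coordinate is the simplest possible placement rule; overlapping or repeated hits would need extra handling that this sketch deliberately omits.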


Genome Biology | 2010

Improving draft assemblies by iterative mapping and assembly of short reads to eliminate gaps

Isheng J. Tsai; Thomas D. Otto; Matthew Berriman

Advances in sequencing technology allow genomes to be sequenced at vastly decreased costs. However, the assembled data frequently are highly fragmented with many gaps. We present a practical approach that uses Illumina sequences to improve draft genome assemblies by aligning sequences against contig ends and performing local assemblies to produce gap-spanning contigs. The continuity of a draft genome can thus be substantially improved, often without the need to generate new data.


PLOS Neglected Tropical Diseases | 2012

A systematically improved high quality genome and transcriptome of the human blood fluke Schistosoma mansoni.

Anna V. Protasio; Isheng J. Tsai; A. K. Babbage; Sarah Nichol; Martin Hunt; Martin Aslett; Nishadi De Silva; Giles S. Velarde; Timothy J. C. Anderson; Richard Clark; Claire Davidson; Gary P. Dillon; Nancy Holroyd; Philip T. LoVerde; Christine Lloyd; Jacquelline McQuillan; Guilherme Oliveira; Thomas D. Otto; Sophia J. Parker-Manuel; Michael A. Quail; R. Alan Wilson; Adhemar Zerlotini; David W. Dunne; Matthew Berriman

Schistosomiasis is one of the most prevalent parasitic diseases, affecting millions of people in developing countries. Amongst the human-infective species, Schistosoma mansoni is also the most commonly used in the laboratory and here we present the systematic improvement of its draft genome. We used Sanger capillary and deep-coverage Illumina sequencing from clonal worms to upgrade the highly fragmented draft 380 Mb genome to one with only 885 scaffolds and more than 81% of the bases organised into chromosomes. We have also used transcriptome sequencing (RNA-seq) from four time points in the parasite's life cycle to refine gene predictions and profile their expression. More than 45% of predicted genes have been extensively modified and the total number has been reduced from 11,807 to 10,852. Using the new version of the genome, we identified trans-splicing events occurring in at least 11% of genes and identified clear cases where it is used to resolve polycistronic transcripts. We have produced a high-resolution map of temporal changes in expression for 9,535 genes, covering an unprecedented dynamic range for this organism. All of these data have been consolidated into a searchable format within the GeneDB (www.genedb.org) and SchistoDB (www.schistodb.net) databases. With further transcriptional profiling and genome sequencing increasingly accessible, the upgraded genome will form a fundamental dataset to underpin further advances in schistosome research.


Genome Research | 2011

Chromosome and gene copy number variation allow major structural change between species and strains of Leishmania

Matthew B. Rogers; James D. Hilley; Nicholas J. Dickens; Jon Wilkes; Paul A. Bates; Daniel P. Depledge; David J. Harris; Yerim Her; Pawel Herzyk; Hideo Imamura; Thomas D. Otto; Mandy Sanders; Kathy Seeger; Jean-Claude Dujardin; Matthew Berriman; Deborah F. Smith; Christiane Hertz-Fowler; Jeremy C. Mottram

Leishmania parasites cause a spectrum of clinical pathology in humans ranging from disfiguring cutaneous lesions to fatal visceral leishmaniasis. We have generated a reference genome for Leishmania mexicana and refined the reference genomes for Leishmania major, Leishmania infantum, and Leishmania braziliensis. This has allowed the identification of a remarkably low number of genes or paralog groups (2, 14, 19, and 67, respectively) unique to one species. These were found to be conserved in additional isolates of the same species. We have predicted allelic variation and find that in these isolates, L. major and L. infantum have a surprisingly low number of predicted heterozygous SNPs compared with L. braziliensis and L. mexicana. We used short read coverage to infer ploidy and gene copy numbers, identifying large copy number variations between species, with 200 tandem gene arrays in L. major and 132 in L. mexicana. Chromosome copy number also varied significantly between species, with nine supernumerary chromosomes in L. infantum, four in L. mexicana, two in L. braziliensis, and one in L. major. A significant bias against gene arrays on supernumerary chromosomes was shown to exist, indicating that duplication events occur more frequently on disomic chromosomes. Taken together, our data demonstrate that there is little variation in unique gene content across Leishmania species, but large-scale genetic heterogeneity can result through gene amplification on disomic chromosomes and variation in chromosome number. Increased gene copy number due to chromosome amplification may contribute to alterations in gene expression in response to environmental conditions in the host, providing a genetic basis for disease tropism.
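
The abstract above states that short read coverage was used to infer ploidy and gene copy numbers. The sketch below shows one common way such an inference can work and is not the authors' method: each chromosome's median read depth is scaled against the genome-wide median. It assumes that the typical chromosome is disomic, and the function name `estimate_chromosome_copy_number` and the parameter `baseline_ploidy` are hypothetical.

```python
from statistics import median
from typing import Dict, List


def estimate_chromosome_copy_number(
    depths: Dict[str, List[float]], baseline_ploidy: int = 2
) -> Dict[str, int]:
    """Estimate chromosome copy number from per-position read depth.

    Assumption: the median chromosome carries `baseline_ploidy` copies,
    so a chromosome with 1.5x the typical depth is called trisomic.
    """
    # Median depth per chromosome is robust to local peaks and dropouts.
    chrom_medians = {name: median(values) for name, values in depths.items()}
    # The median across chromosomes stands in for the "normal" depth.
    genome_median = median(chrom_medians.values())
    if genome_median == 0:
        raise ValueError("no coverage: cannot normalise read depth")
    return {
        name: round(baseline_ploidy * value / genome_median)
        for name, value in chrom_medians.items()
    }


if __name__ == "__main__":
    depths = {
        "chr1": [20, 22, 18],
        "chr2": [30, 31, 29],  # about 1.5x the typical depth
        "chr3": [20, 19, 21],
    }
    print(estimate_chromosome_copy_number(depths))
```

If most chromosomes in a sample are themselves aneuploid, the disomic baseline no longer holds, and the estimates need an independent anchor such as allele frequency profiles.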


Molecular Microbiology | 2010

New insights into the blood-stage transcriptome of Plasmodium falciparum using RNA-Seq

Thomas D. Otto; Daniel Wilinski; Sammy Assefa; Thomas M. Keane; Louis R Sarry; Ulrike Böhme; Jacob Lemieux; Bart Barrell; Arnab Pain; Matthew Berriman; Chris Newbold; Manuel Llinás

Recent advances in high‐throughput sequencing present a new opportunity to deeply probe an organism's transcriptome. In this study, we used Illumina‐based massively parallel sequencing to gain new insight into the transcriptome (RNA‐Seq) of the human malaria parasite, Plasmodium falciparum. Using data collected at seven time points during the intraerythrocytic developmental cycle, we (i) detect novel gene transcripts; (ii) correct hundreds of gene models; (iii) propose alternative splicing events; and (iv) predict 5′ and 3′ untranslated regions. Approximately 70% of the unique sequencing reads map to previously annotated protein‐coding genes. The RNA‐Seq results greatly improve existing annotation of the P. falciparum genome with over 10% of gene models modified. Our data confirm 75% of predicted splice sites and identify 202 new splice sites, including 84 previously uncharacterized alternative splicing events. We also discovered 107 novel transcripts and expression of 38 pseudogenes, with many demonstrating differential expression across the developmental time series. Our RNA‐Seq results correlate well with DNA microarray analysis performed in parallel on the same samples, and provide improved resolution over the microarray‐based method. These data reveal new features of the P. falciparum transcriptional landscape and significantly advance our understanding of the parasite's red blood cell‐stage transcriptome.


PLOS Pathogens | 2011

Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus.

Taisei Kikuchi; James A. Cotton; Jonathan J. Dalzell; Koichi Hasegawa; Natsumi Kanzaki; Paul McVeigh; Takuma Takanashi; Isheng J. Tsai; Samuel A. Assefa; Peter J. A. Cock; Thomas D. Otto; Martin Hunt; Adam J. Reid; Alejandro Sanchez-Flores; Kazuko Tsuchihara; Toshiro Yokoi; Mattias C. Larsson; Johji Miwa; Aaron G. Maule; Norio Sahashi; John T. Jones; Matthew Berriman

Bursaphelenchus xylophilus is the nematode responsible for a devastating epidemic of pine wilt disease in Asia and Europe, and represents a recent, independent origin of plant parasitism in nematodes, ecologically and taxonomically distinct from other nematodes for which genomic data is available. As well as being an important pathogen, the B. xylophilus genome thus provides a unique opportunity to study the evolution and mechanism of plant parasitism. Here, we present a high-quality draft genome sequence from an inbred line of B. xylophilus, and use this to investigate the biological basis of its complex ecology which combines fungal feeding, plant parasitic and insect-associated stages. We focus particularly on putative parasitism genes as well as those linked to other key biological processes and demonstrate that B. xylophilus is well endowed with RNA interference effectors, peptidergic neurotransmitters (including the first description of ins genes in a parasite), stress response and developmental genes and has a contracted set of chemosensory receptors. B. xylophilus has the largest number of digestive proteases known for any nematode and displays expanded families of lysosome pathway genes, ABC transporters and cytochrome P450 pathway genes. This expansion in digestive and detoxification proteins may reflect the unusual diversity in foods it exploits and environments it encounters during its life cycle. In addition, B. xylophilus possesses a unique complement of plant cell wall modifying proteins acquired by horizontal gene transfer, underscoring the impact of this process on the evolution of plant parasitism by nematodes. Together with the lack of proteins homologous to effectors from other plant parasitic nematodes, this confirms the distinctive molecular basis of plant parasitism in the Bursaphelenchus lineage. The genome sequence of B. xylophilus adds to the diversity of genomic data for nematodes, and will be an important resource in understanding the biology of this unusual parasite.


Genome Biology | 2013

REAPR: a universal tool for genome assembly evaluation.

Martin Hunt; Taisei Kikuchi; Mandy Sanders; Chris Newbold; Matthew Berriman; Thomas D. Otto

Methods to reliably assess the accuracy of genome sequence data are lacking. Currently completeness is only described qualitatively and mis-assemblies are overlooked. Here we present REAPR, a tool that precisely identifies errors in genome assemblies without the need for a reference sequence. We have validated REAPR on complete genomes or de novo assemblies from bacteria, malaria and Caenorhabditis elegans, and demonstrate that 86% and 82% of the human and mouse reference genomes are error-free, respectively. When applied to an ongoing genome project, REAPR provides corrected assembly statistics allowing the quantitative comparison of multiple assemblies. REAPR is available at http://www.sanger.ac.uk/resources/software/reapr/.

Collaboration

Top Co-Authors

Matthew Berriman, Wellcome Trust Sanger Institute
Mandy Sanders, Wellcome Trust Sanger Institute
Arnab Pain, King Abdullah University of Science and Technology
Ulrike Böhme, Wellcome Trust Sanger Institute
Julian C. Rayner, Wellcome Trust Sanger Institute
Michael A. Quail, Wellcome Trust Sanger Institute
Dominic P. Kwiatkowski, Wellcome Trust Sanger Institute
Wim Degrave, Laboratory of Molecular Biology