Jacqueline A. Keane
Wellcome Trust Sanger Institute
Publications
Featured research published by Jacqueline A. Keane.
Bioinformatics | 2015
Andrew J. Page; Carla Cummins; Martin Hunt; Vanessa K. Wong; Sandra Reuter; Matthew T. G. Holden; Maria Fookes; Daniel Falush; Jacqueline A. Keane; Julian Parkhill
Summary: A typical prokaryote population sequencing study can now consist of hundreds or thousands of isolates. Interrogating these datasets can provide detailed insights into the genetic structure of prokaryotic genomes. We introduce Roary, a tool that rapidly builds large-scale pan genomes, identifying the core and accessory genes. Roary makes construction of the pan genome of thousands of prokaryote samples possible on a standard desktop without compromising on the accuracy of results. Using a single CPU, Roary can produce a pan genome consisting of 1000 isolates in 4.5 hours using 13 GB of RAM, with further speedups possible using multiple processors. Availability and implementation: Roary is implemented in Perl and is freely available under an open source GPLv3 license from http://sanger-pathogens.github.io/Roary. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
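The core/accessory split mentioned in the abstract can be illustrated with a minimal Python sketch. It is not Roary's implementation (Roary is written in Perl and first has to cluster genes across isolates, which is the expensive step and is not shown). The function name `split_pan_genome`, the input layout, and the `core_fraction` value of 0.99 are assumptions made for illustration and are not taken from the abstract.

```python
from typing import Dict, List, Set, Tuple


def split_pan_genome(
    presence: Dict[str, Set[str]],
    n_isolates: int,
    core_fraction: float = 0.99,  # illustrative threshold (assumption)
) -> Tuple[List[str], List[str]]:
    """Split gene clusters into core and accessory lists.

    presence maps a gene cluster name to the set of isolates carrying it.
    A cluster counts as core when it occurs in at least core_fraction
    of all isolates; everything else is accessory.
    """
    if n_isolates <= 0:
        raise ValueError("n_isolates must be positive")
    core: List[str] = []
    accessory: List[str] = []
    for gene in sorted(presence):
        fraction = len(presence[gene]) / n_isolates
        (core if fraction >= core_fraction else accessory).append(gene)
    return core, accessory


if __name__ == "__main__":
    # Hypothetical presence/absence data for three isolates.
    example = {
        "gyrA": {"iso1", "iso2", "iso3"},
        "recA": {"iso1", "iso2", "iso3"},
        "blaTEM": {"iso1"},
    }
    print(split_pan_genome(example, n_isolates=3))
```

The pan genome is the union of both lists; lowering `core_fraction` tolerates genes missing from a few isolates because of assembly or annotation gaps.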
Nucleic Acids Research | 2015
Nicholas J. Croucher; Andrew J. Page; Thomas Richard Connor; Aidan Delaney; Jacqueline A. Keane; Stephen D. Bentley; Julian Parkhill; Simon R. Harris
The emergence of new sequencing technologies has facilitated the use of bacterial whole genome alignments for evolutionary studies and outbreak analyses. These datasets, of increasing size, often include examples of multiple different mechanisms of horizontal sequence transfer resulting in substantial alterations to prokaryotic chromosomes. The impact of these processes demands rapid and flexible approaches able to account for recombination when reconstructing isolates’ recent diversification. Gubbins is an iterative algorithm that uses spatial scanning statistics to identify loci containing elevated densities of base substitutions suggestive of horizontal sequence transfer while concurrently constructing a maximum likelihood phylogeny based on the putative point mutations outside these regions of high sequence diversity. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistically parameterized models of bacterial evolution, and achieves convergence in only a few hours on alignments of hundreds of bacterial genome sequences. Gubbins is appropriate for reconstructing the recent evolutionary history of a variety of haploid genotype alignments, as it makes no assumptions about the underlying mechanism of recombination. The software is freely available for download at github.com/sanger-pathogens/Gubbins, implemented in Python and C and supported on Linux and Mac OS X.
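The idea of flagging loci with elevated densities of base substitutions can be illustrated with a deliberately simplified fixed-window count in Python. This is not the spatial scanning statistic that Gubbins uses, and it does not build a phylogeny; the function name `dense_windows`, the window size, and the `min_snps` cutoff are all hypothetical choices made for this sketch.

```python
from typing import List, Tuple


def dense_windows(
    snp_positions: List[int],
    genome_length: int,
    window: int = 1000,   # hypothetical window size in bases
    min_snps: int = 5,    # hypothetical cutoff for "elevated" density
) -> List[Tuple[int, int, int]]:
    """Return (start, end, count) for each non-overlapping window that
    holds at least min_snps substitutions. Positions are 0-based."""
    if genome_length <= 0 or window <= 0:
        raise ValueError("genome_length and window must be positive")
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for pos in snp_positions:
        if not 0 <= pos < genome_length:
            raise ValueError(f"position {pos} is outside the genome")
        counts[pos // window] += 1
    return [
        (i * window, min((i + 1) * window, genome_length), count)
        for i, count in enumerate(counts)
        if count >= min_snps
    ]


if __name__ == "__main__":
    positions = [10, 20, 30, 40, 50, 1500, 2999]
    print(dense_windows(positions, genome_length=3000))
```

In a recombination-aware workflow, substitutions inside the flagged windows would be set aside and the phylogeny built from the remaining sites, which is the general strategy the abstract describes.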
Nature Genetics | 2015
Vanessa K. Wong; Stephen Baker; Derek Pickard; Julian Parkhill; Andrew J. Page; Nicholas A. Feasey; Robert A. Kingsley; Nicholas R. Thomson; Jacqueline A. Keane; F X Weill; David J. Edwards; Jane Hawkey; Simon R. Harris; Alison E. Mather; Amy K. Cain; James Hadfield; Peter J. Hart; Nga Tran Vu Thieu; Elizabeth J. Klemm; Dafni A. Glinos; Robert F. Breiman; Conall H. Watson; Samuel Kariuki; Melita A. Gordon; Robert S. Heyderman; Chinyere K. Okoro; Jan Jacobs; Octavie Lunguya; W. John Edmunds; Chisomo L. Msefula
The emergence of multidrug-resistant (MDR) typhoid is a major global health threat affecting many countries where the disease is endemic. Here whole-genome sequence analysis of 1,832 Salmonella enterica serovar Typhi (S. Typhi) identifies a single dominant MDR lineage, H58, that has emerged and spread throughout Asia and Africa over the last 30 years. Our analysis identifies numerous transmissions of H58, including multiple transfers from Asia to Africa and an ongoing, unrecognized MDR epidemic within Africa itself. Notably, our analysis indicates that H58 lineages are displacing antibiotic-sensitive isolates, transforming the global population structure of this pathogen. H58 isolates can harbor a complex MDR element residing either on transmissible IncHI1 plasmids or within multiple chromosomal integration sites. We also identify new mutations that define the H58 lineage. This phylogeographical analysis provides a framework to facilitate global management of MDR typhoid and is applicable to similar MDR lineages emerging in other bacterial species.
Genome Biology | 2015
Martin Hunt; Nishadi De Silva; Thomas D. Otto; Julian Parkhill; Jacqueline A. Keane; Simon R. Harris
The assembly of DNA sequence data is undergoing a renaissance thanks to emerging technologies capable of producing reads tens of kilobases long. Assembling complete bacterial and small eukaryotic genomes is now possible, but the final step of circularizing sequences remains unsolved. Here we present Circlator, the first tool to automate assembly circularization and produce accurate linear representations of circular sequences. Using Pacific Biosciences and Oxford Nanopore data, Circlator correctly circularized 26 of 27 circularizable sequences, comprising 11 chromosomes and 12 plasmids from bacteria, the apicoplast and mitochondrion of Plasmodium falciparum and a human mitochondrion. Circlator is available at http://sanger-pathogens.github.io/circlator/.
Cell Host & Microbe | 2014
Tu Anh N. Pham; Simon Clare; David Goulding; Julia Maryam Arasteh; Mark D. Stares; Hilary P. Browne; Jacqueline A. Keane; Andrew J. Page; Natsuhiko Kumasaka; Leanne Kane; Lynda Mottram; Katherine Harcourt; Christine Hale; Mark J. Arends; Daniel J. Gaffney; Gordon Dougan; Trevor D. Lawley
Summary: Our intestinal microbiota harbors a diverse microbial community, often containing opportunistic bacteria with virulence potential. However, mutualistic host-microbial interactions prevent disease by opportunistic pathogens through poorly understood mechanisms. We show that the epithelial interleukin-22 receptor IL-22RA1 protects against lethal Citrobacter rodentium infection and chemical-induced colitis by promoting colonization resistance against an intestinal opportunistic bacterium, Enterococcus faecalis. Susceptibility of Il22ra1−/− mice to C. rodentium was associated with preferential expansion and epithelial translocation of pathogenic E. faecalis during severe microbial dysbiosis and was ameliorated with antibiotics active against E. faecalis. RNA sequencing analyses of primary colonic organoids showed that IL-22RA1 signaling promotes intestinal fucosylation via induction of the fucosyltransferase Fut2. Additionally, administration of fucosylated oligosaccharides to C. rodentium-challenged Il22ra1−/− mice attenuated infection and promoted E. faecalis colonization resistance by restoring the diversity of anaerobic commensal symbionts. These results support a model whereby IL-22RA1 enhances host-microbiota mutualism to limit detrimental overcolonization by opportunistic pathogens.
Bioinformatics | 2015
Martin Hunt; Astrid Gall; Swee Hoe Ong; Jacqui Brener; Bridget Ferns; Philip J. R. Goulder; Eleni Nastouli; Jacqueline A. Keane; Paul Kellam; Thomas D. Otto
Motivation: An accurate genome assembly from short read sequencing data is critical for downstream analysis, for example allowing investigation of variants within a sequenced population. However, assembling sequencing data from virus samples, especially RNA viruses, into a genome sequence is challenging due to the combination of viral population diversity and extremely uneven read depth caused by amplification bias in the inevitable reverse transcription and polymerase chain reaction amplification process of current methods. Results: We developed a new de novo assembler called IVA (Iterative Virus Assembler) designed specifically for read pairs sequenced at highly variable depth from RNA virus samples. We tested IVA on datasets from 140 sequenced samples from human immunodeficiency virus-1 or influenza-virus-infected people and demonstrated that IVA outperforms all other virus de novo assemblers. Availability and implementation: The software runs under Linux, has the GPLv3 licence and is freely available from http://sanger-pathogens.github.io/iva. Contact: [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.
bioRxiv | 2016
Andrew J. Page; Nishadi De Silva; Martin Hunt; Michael A. Quail; Julian Parkhill; Simon R. Harris; Thomas D. Otto; Jacqueline A. Keane
The rapidly reducing cost of bacterial genome sequencing has led to its routine use in large-scale microbial analysis. Though mapping approaches can be used to find differences relative to the reference, many bacteria are subject to constant evolutionary pressures resulting in events such as the loss and gain of mobile genetic elements, horizontal gene transfer through recombination and genomic rearrangements. De novo assembly is the reconstruction of the underlying genome sequence, an essential step to understanding bacterial genome diversity. Here we present a high-throughput bacterial assembly and improvement pipeline that has been used to generate nearly 20 000 annotated draft genome assemblies in public databases. We demonstrate its performance on a public data set of 9404 genomes. We find all the genes used in multi-locus sequence typing schema present in 99.6 % of assembled genomes. When tested on low-, neutral- and high-GC organisms, more than 94 % of genes were present and completely intact. The pipeline has been proven to be scalable and robust with a wide variety of datasets without requiring human intervention. All of the software is available on GitHub under the GNU GPL open source license.
bioRxiv | 2016
Andrew J. Page; Ben Taylor; Aidan Delaney; Jorge Soares; Torsten Seemann; Jacqueline A. Keane; Simon R. Harris
Rapidly decreasing genome sequencing costs have led to a proportionate increase in the number of samples used in prokaryotic population studies. Extracting single nucleotide polymorphisms (SNPs) from a large whole genome alignment is now a routine task, but existing tools have failed to scale efficiently with the increased size of studies. These tools are slow, memory inefficient and are installed through non-standard procedures. We present SNP-sites which can rapidly extract SNPs from a multi-FASTA alignment using modest resources and can output results in multiple formats for downstream analysis. SNPs can be extracted from an 8.3 GB alignment file (1842 taxa, 22 618 sites) in 267 seconds using 59 MB of RAM and 1 CPU core, making it feasible to run on modest computers. It is easy to install through the Debian and Homebrew package managers, and has been successfully tested on more than 20 operating systems. SNP-sites is implemented in C and is available under the open source license GNU GPL version 3.
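The task described in the abstract, extracting variable sites from a multi-FASTA alignment, can be sketched in a few lines of Python. This is not the SNP-sites implementation (which is written in C and engineered for low memory use); it loads the whole alignment into memory. The function names and the choice to ignore gap (`-`) and `N` characters when deciding whether a column varies are assumptions made for this illustration.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {sequence name: sequence}."""
    records: Dict[str, str] = {}
    name = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def variable_sites(alignment: Dict[str, str], ignore: str = "-N") -> List[int]:
    """Return 0-based columns where more than one base is observed,
    not counting the characters listed in ignore (an assumption)."""
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    skipped = set(ignore)
    return [
        i for i in range(length)
        if len({s[i] for s in seqs} - skipped) > 1
    ]


def snp_alignment(alignment: Dict[str, str], sites: List[int]) -> Dict[str, str]:
    """Reduce each sequence to the given columns only."""
    return {name: "".join(seq[i] for i in sites) for name, seq in alignment.items()}


if __name__ == "__main__":
    aln = parse_fasta(">s1\nACGTAC\n>s2\nACGTTC\n>s3\nAC-TAC\n")
    sites = variable_sites(aln)
    print(sites, snp_alignment(aln, sites))
```

For alignments of the size quoted in the abstract, a streaming, column-wise approach would be needed instead of holding every sequence in a Python dictionary.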
bioRxiv | 2017
Martin Hunt; Alison E. Mather; Leonor Sánchez-Busó; Andrew J. Page; Julian Parkhill; Jacqueline A. Keane; Simon R. Harris
Antimicrobial resistance (AMR) is one of the major threats to human and animal health worldwide, yet few high-throughput tools exist to analyse and predict the resistance of a bacterial isolate from sequencing data. Here we present a new tool, ARIBA, that identifies AMR-associated genes and single nucleotide polymorphisms directly from short reads, and generates detailed and customizable output. The accuracy and advantages of ARIBA over other tools are demonstrated on three datasets from Gram-positive and Gram-negative bacteria, with ARIBA outperforming existing methods.
Nature Communications | 2016
Vanessa K. Wong; Stephen Baker; Thomas Richard Connor; Derek Pickard; Andrew J. Page; Jayshree Dave; Niamh Murphy; Richard Holliman; Armine Sefton; Michael Millar; Zoe A. Dyson; Gordon Dougan; Kathryn E. Holt; Julian Parkhill; Nicholas A. Feasey; Robert A. Kingsley; Nicholas R. Thomson; Jacqueline A. Keane; F X Weill; Simon Le Hello; Jane Hawkey; David J. Edwards; Simon R. Harris; Amy K. Cain; James Hadfield; Peter J. Hart; Nga Tran Vu Thieu; Elizabeth J. Klemm; Robert F. Breiman; Conall H. Watson
The population of Salmonella enterica serovar Typhi (S. Typhi), the causative agent of typhoid fever, exhibits limited DNA sequence variation, which complicates efforts to rationally discriminate individual isolates. Here we utilize data from whole-genome sequences (WGS) of nearly 2,000 isolates sourced from over 60 countries to generate a robust genotyping scheme that is phylogenetically informative and compatible with a range of assays. These data show that, with the exception of the rapidly disseminating H58 subclade (now designated genotype 4.3.1), the global S. Typhi population is highly structured and includes dozens of subclades that display geographical restriction. The genotyping approach presented here can be used to interrogate local S. Typhi populations and help identify recent introductions of S. Typhi into new or previously endemic locations, providing information on their likely geographical source. This approach can be used to classify clinical isolates and provides a universal framework for further experimental investigations.