Sophia David
Wellcome Trust Sanger Institute
Publication
Featured research published by Sophia David.
mBio | 2015
Claire E. Turner; James E Abbott; Theresa Lamagni; Matthew T. G. Holden; Sophia David; Mike Jones; Androulla Efstratiou; Shiranee Sriskandan
ABSTRACT Group A Streptococcus (GAS) genotype emm89 is increasingly recognized as a leading cause of disease worldwide, yet factors that underlie the success of this emm type are unknown. Surveillance identified a sustained nationwide increase in emm89 invasive GAS disease in the United Kingdom, prompting longitudinal investigation of this genotype. Whole-genome sequencing revealed a recent dramatic shift in the emm89 population with the emergence of a new clade that increased to dominance over previous emm89 variants. Temporal analysis indicated that the clade arose in the early 1990s but abruptly increased in prevalence in 2008, coinciding with an increased incidence of emm89 infections. Although standard variable typing regions (emm subtype, tee type, sof type, and multilocus sequence typing [MLST]) remained unchanged, uniquely the emergent clade had undergone six distinct regions of homologous recombination across the genome compared to the rest of the sequenced emm89 population. Two of these regions affected known virulence factors, the hyaluronic acid capsule and the toxins NADase and streptolysin O. Unexpectedly, and in contrast to the rest of the sequenced emm89 population, the emergent clade-associated strains were genetically acapsular, rendering them unable to produce the hyaluronic acid capsule. The emergent clade-associated strains had also acquired an NADase/streptolysin O locus nearly identical to that found in emm12 and modern emm1 strains but different from the rest of the sequenced emm89 population. The emergent clade-associated strains had enhanced expression of NADase and streptolysin O. The genome remodeling in the new clade variant and the resultant altered phenotype appear to have conferred a selective advantage over other emm89 variants and may explain the changes observed in emm89 GAS epidemiology. IMPORTANCE Sudden upsurges or epidemic waves are common features of group A streptococcal disease. 
Although the mechanisms behind such changes are largely unknown, they are often associated with an expansion of a single genotype within the population. Using whole-genome sequencing, we investigated a nationwide increase in invasive disease caused by the genotype emm89 in the United Kingdom. We identified a new clade variant that had recently emerged in the emm89 population after having undergone several core genomic recombination-related changes, two of which affected known virulence factors. An unusual finding of the new variant was the loss of the hyaluronic acid capsule, previously thought to be essential for causing invasive disease. A further genomic adaptation in the NADase/streptolysin O locus resulted in enhanced production of these toxins. Recombination-related genome remodeling is clearly an important mechanism in group A Streptococcus that can give rise to more successful and potentially more pathogenic variants.
Journal of Clinical Microbiology | 2016
Sophia David; Massimo Mentasti; Rediat Tewolde; Martin Aslett; Simon R. Harris; Baharak Afshar; Anthony Underwood; Norman K. Fry; Julian Parkhill; Timothy G. Harrison
ABSTRACT Sequence-based typing (SBT), analogous to multilocus sequence typing (MLST), is the current “gold standard” typing method for investigation of legionellosis outbreaks caused by Legionella pneumophila. However, as common sequence types (STs) cause many infections, some investigations remain unresolved. In this study, various whole-genome sequencing (WGS)-based methods were evaluated according to published guidelines, including (i) a single nucleotide polymorphism (SNP)-based method, (ii) extended MLST using different numbers of genes, (iii) determination of gene presence or absence, and (iv) a kmer-based method. L. pneumophila serogroup 1 isolates (n = 106) from the standard “typing panel,” previously used by the European Society for Clinical Microbiology Study Group on Legionella Infections (ESGLI), were tested together with another 229 isolates. Over 98% of isolates were considered typeable using the SNP- and kmer-based methods. Percentages of isolates with complete extended MLST profiles ranged from 99.1% (50 genes) to 86.8% (1,455 genes), while only 41.5% produced a full profile with the gene presence/absence scheme. Replicates demonstrated that all methods offer 100% reproducibility. Indices of discrimination range from 0.972 (ribosomal MLST) to 0.999 (SNP based), and all values were higher than that achieved with SBT (0.940). Epidemiological concordance is generally inversely related to discriminatory power. We propose that an extended MLST scheme with ∼50 genes provides optimal epidemiological concordance while substantially improving the discrimination offered by SBT and can be used as part of a hierarchical typing scheme that should maintain backwards compatibility and increase discrimination where necessary. This analysis will be useful for the ESGLI to design a scheme that has the potential to become the new gold standard typing method for L. pneumophila.
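The abstract above reports indices of discrimination without giving the formula. A minimal sketch follows, assuming the index is Simpson's index of diversity in the form commonly used for typing methods (Hunter and Gaston), D = 1 - Σ n_j(n_j - 1) / (N(N - 1)). The function name and the toy type labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def discrimination_index(type_labels):
    """Simpson's index of diversity for a list of type assignments.

    Assumed formula (Hunter-Gaston form): the probability that two
    isolates drawn without replacement have different types.
    """
    n = len(type_labels)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(type_labels)
    # Ordered pairs of isolates that share a type.
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical toy panel of six isolates typed by two schemes.
coarse_scheme = ["ST1", "ST1", "ST1", "ST47", "ST47", "ST23"]
fine_scheme = ["a", "b", "c", "d", "d", "e"]

print(round(discrimination_index(coarse_scheme), 3))
print(round(discrimination_index(fine_scheme), 3))
```

A scheme that splits the same isolates into more, smaller groups yields a value closer to 1, which is the sense in which the SNP-based method is described as more discriminatory than SBT.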
Scientific Reports | 2016
Thomas Crellen; Fiona Allan; Sophia David; Caroline Durrant; Thomas Huckvale; Nancy Holroyd; Aidan M. Emery; David Rollinson; David M. Aanensen; Matthew Berriman; Joanne P. Webster; James A. Cotton
Schistosoma mansoni is a parasitic fluke that infects millions of people in the developing world. This study presents the first application of population genomics to S. mansoni based on high-coverage resequencing data from 10 global isolates and an isolate of the closely-related Schistosoma rodhaini, which infects rodents. Using population genetic tests, we document genes under directional and balancing selection in S. mansoni that may facilitate adaptation to the human host. Coalescence modeling reveals the speciation of S. mansoni and S. rodhaini as 107.5–147.6KYA, a period which overlaps with the earliest archaeological evidence for fishing in Africa. Our results indicate that S. mansoni originated in East Africa and experienced a decline in effective population size 20–90KYA, before dispersing across the continent during the Holocene. In addition, we find strong evidence that S. mansoni migrated to the New World with the 16–19th Century Atlantic Slave Trade.
PLOS Genetics | 2017
Sophia David; Leonor Sánchez-Busó; Simon R. Harris; Pekka Marttinen; Christophe Rusniok; Carmen Buchrieser; Timothy G. Harrison; Julian Parkhill
Legionella pneumophila is an environmental bacterium and the causative agent of Legionnaires’ disease. Previous genomic studies have shown that recombination accounts for a high proportion (>96%) of diversity within several major disease-associated sequence types (STs) of L. pneumophila. This suggests that recombination represents a potentially important force shaping adaptation and virulence. Despite this, little is known about the biological effects of recombination in L. pneumophila, particularly with regards to homologous recombination (whereby genes are replaced with alternative allelic variants). Using newly available population genomic data, we have disentangled events arising from homologous and non-homologous recombination in six major disease-associated STs of L. pneumophila (subsp. pneumophila), and subsequently performed a detailed characterisation of the dynamics and impact of homologous recombination. We identified genomic “hotspots” of homologous recombination that include regions containing outer membrane proteins, the lipopolysaccharide (LPS) region and Dot/Icm effectors, which provide interesting clues to the selection pressures faced by L. pneumophila. Inference of the origin of the recombined regions showed that isolates have most frequently imported DNA from isolates belonging to their own clade, but also occasionally from other major clades of the same subspecies. This supports the hypothesis that the possibility for horizontal exchange of new adaptations between major clades of the subspecies may have been a critical factor in the recent emergence of several clinically important STs from diverse genomic backgrounds. However, acquisition of recombined regions from another subspecies, L. pneumophila subsp. fraseri, was rarely observed, suggesting the existence of a recombination barrier and/or the possibility of ongoing speciation between the two subspecies. Finally, we suggest that multi-fragment recombination may occur in L. pneumophila, whereby multiple non-contiguous segments that originate from the same molecule of donor DNA are imported into a recipient genome during a single episode of recombination.
Clinical Infectious Diseases | 2017
Sophia David; Baharak Afshar; Massimo Mentasti; Christophe Ginevra; Isabelle Podglajen; Simon R. Harris; Victoria J. Chalker; Sophie Jarraud; Timothy G. Harrison; Julian Parkhill
Summary Whole-genome sequencing can be used to support or refute suspected links between hospital water systems and Legionnaires’ disease cases. However, caveats regarding the interpretation of genomic data from Legionella pneumophila are described that should be considered in future investigations.
Eurosurveillance | 2017
Susanne Schjørring; Marc Stegger; Charlotte Kjelsø; Berit Lilje; Jette Marie Bangsborg; Randi Føns Petersen; Sophia David; Søren A. Uldum
Between July and November 2014, 15 community-acquired cases of Legionnaires’ disease (LD), including four with Legionella pneumophila serogroup 1 sequence type (ST) 82, were diagnosed in Northern Zealand, Denmark. An outbreak was suspected. No ST82 isolates were found in environmental samples and no external source was established. Four putative-outbreak ST82 isolates were retrospectively subjected to whole genome sequencing (WGS) followed by phylogenetic analyses with epidemiologically unrelated ST82 sequences. The four putative-outbreak ST82 sequences fell into two clades, which were separated by ca 1,700 single nucleotide polymorphisms (SNPs) when recombination regions were included but only by 12 to 21 SNPs when these were removed. A single putative-outbreak ST82 isolate sequence segregated in the first clade. The other three clustered in the second clade, where all included sequences had < 5 SNP differences between them. Intriguingly, this clade also comprised epidemiologically unrelated isolate sequences from the UK and Denmark dating back as early as 2011. The study confirms that recombination plays a major role in L. pneumophila evolution. On the other hand, strains belonging to the same ST can have only a few SNP differences despite being sampled over both large timespans and geographic distances. These are two important factors to consider in outbreak investigations.
Infection Control and Hospital Epidemiology | 2016
O. Colin Stine; Shana Burrowes; Sophia David; J. Kristie Johnson; Mary-Claire Roghmann
OBJECTIVE: To define how often methicillin-resistant Staphylococcus aureus (MRSA) is spread from resident to resident in long-term care facilities using whole-genome sequencing. DESIGN: Prospective cohort study. SETTING: A long-term care facility. PARTICIPANTS: Elderly residents in a long-term care facility. METHODS: Cultures for MRSA were obtained weekly from multiple body sites from residents with known MRSA colonization over 12-week study periods. Simultaneously, cultures to detect MRSA acquisition were obtained weekly from 2 body sites in residents without known MRSA colonization. During the first 12-week cycle on a single unit, we sequenced 8 MRSA isolates per swab for 2 body sites from each of 6 residents. During the second 12-week cycle, we sequenced 30 MRSA isolates from 13 residents with known MRSA colonization and 3 residents who had acquired MRSA colonization. RESULTS: MRSA isolates from the same swab showed little genetic variation between isolates with the exception of isolates from wounds. The genetic variation of isolates between body sites on an individual was greater than that within a single body site with the exception of 1 sample, which had 2 unrelated strains among the 8 isolates. In the second cycle, 10 of 16 residents colonized with MRSA (63%) shared 1 of 3 closely related strains. Of the 3 residents with newly acquired MRSA, 2 residents harbored isolates that were members of these clusters. CONCLUSIONS: Point prevalence surveys with whole-genome sequencing of MRSA isolates may detect resident-to-resident transmission more accurately than routine surveillance cultures for MRSA in long-term care facilities. Infect Control Hosp Epidemiol 2016;37:685-691.
mBio | 2015
Claire E. Turner; Theresa Lamagni; Matthew T. G. Holden; Sophia David; Michael D. Jones; Androulla Efstratiou; Shiranee Sriskandan
We are pleased that Friaes et al. have elected to examine emm89 strains in Portugal (1); it was our hope that other groups would evaluate emm89 strains from elsewhere, using either whole-genome sequencing (WGS) or the PCR method that we described in our report (2). Subsequent to our publication (2), Zhu et al. used WGS to identify variants of the nga-slo promoter among emm89 isolates in the United States, Finland, and Iceland (3) indicative of the new emergent emm89 clade in these countries. Although they did not use WGS, Friaes et al. demonstrated that it is likely that the emergent emm89 clade that we identified in the United Kingdom and elsewhere is also present in Portugal (1). Friaes and colleagues have noted that the focus on different aspects of emm89 strains in the various studies (2–4) makes it hard to determine if the same clade had disseminated in different regions. In our original report, we aimed to provide a single comprehensive description of all six regions of recombination that characterize this emergent emm89 clade (2). Two of these regions, the nga-slo locus (region 2) and the hasABC capsule locus (region 6), were highlighted because of their phenotypic significance; it was beyond the scope of our paper to undertake in-depth analysis of all of the regions. We did, however, clearly demonstrate enhanced NGA-SLO toxin expression and stated that this could be due to a single polymorphism or several polymorphisms within the nga-slo locus and promoter or elsewhere in the genome. Indeed, it may be misleading to focus on a single region or multiple regions of recombination without understanding of the collective impact of all of the changes. In …
bioRxiv | 2018
Tommi Mäklin; Teemu Kallonen; Sophia David; Ben Pascoe; Guillaume Méric; David M. Aanensen; Edward J. Feil; Samuel K. Sheppard; Jukka Corander; Antti Honkela
Traditional 16S ribosomal RNA sequencing and whole-genome shotgun metagenomics can determine the composition of bacterial communities on genus level and species level but high-resolution inference on the strain level is challenging due to close relatedness between strain genomes. We present the mSWEEP pipeline for identifying and estimating relative abundances of bacterial strains from plate sweeps of enrichment cultures. mSWEEP uses a database of biologically grouped sequence assemblies as a reference and achieves ultra-fast mapping and accurate inference using pseudoalignment, Bayesian probabilistic modeling, and a control for false positive results. We use sequencing data from the major human pathogens Campylobacter jejuni, Campylobacter coli, Klebsiella pneumoniae and Staphylococcus epidermidis to demonstrate that mSWEEP significantly outperforms previous state-of-the-art in strain quantification and detection accuracy. The introduction of mSWEEP opens up a new field of plate sweep metagenomics and facilitates investigation of bacterial cultures composed of mixtures of organisms at differing levels of variation.
Microbial Genomics | 2018
Sophia David; Massimo Mentasti; Sandra Lai; Lalita Vaghji; Derren Ready; Victoria J. Chalker; Julian Parkhill
The diversity of Legionella pneumophila populations within single water systems is not well understood, particularly in those unassociated with cases of Legionnaires’ disease. Here, we performed genomic analysis of 235 L. pneumophila isolates obtained from 28 water samples in 13 locations within a large occupational building. Despite regular treatment, the water system of this building is thought to have been colonized by L. pneumophila for at least 30 years without evidence of association with Legionnaires’ disease cases. All isolates belonged to one of three sequence types (STs), ST27 (n=81), ST68 (n=122) and ST87 (n=32), all three of which have been recovered from Legionnaires’ disease patients previously. Pairwise single nucleotide polymorphism differences amongst isolates of the same ST were low, ranging from 0 to 19 in ST27, from 0 to 30 in ST68 and from 0 to 7 in ST87, and no homologous recombination was observed in any lineage. However, there was evidence of horizontal transfer of a plasmid, which was found in all ST87 isolates and only one ST68 isolate. A single ST was found in 10/13 sampled locations, and isolates of each ST were also more similar to those from the same location compared with those from different locations, demonstrating spatial structuring of the population within the water system. These findings provide the first insights into the diversity and genomic evolution of a L. pneumophila population within a complex water system not associated with disease.
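The pairwise SNP ranges quoted above (for example, 0 to 19 within ST27) can be illustrated with a short sketch. It assumes the isolates are already available as equal-length aligned sequences and that ambiguous or gap characters are skipped. The abstract does not describe the pipeline actually used, so the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignored="N-"):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has an ignored character
    (assumed here to be 'N' or a gap) are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignored and b not in ignored
    )


def pairwise_snp_range(alignment):
    """Return (min, max) SNP distance over all pairs of isolates."""
    if len(alignment) < 2:
        raise ValueError("need at least two isolates")
    distances = [
        snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    ]
    return min(distances), max(distances)


# Hypothetical toy alignment for isolates of one sequence type.
toy_alignment = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACGAACGA",
    "iso4": "ACGTACGT",
}

print(pairwise_snp_range(toy_alignment))
```

Computing the range separately for each ST, and separately for isolates grouped by sampling location, gives a simple way to check the kind of spatial structuring the abstract describes.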