James B. Pettengill
Center for Food Safety and Applied Nutrition
Publication
Featured research published by James B. Pettengill.
BMC Microbiology | 2013
Andrea R. Ottesen; Antonio González Peña; James R. White; James B. Pettengill; Cong Li; Sarah Allard; Steven L. Rideout; Marc-Antoine Allard; Thomas Hill; Peter Evans; Errol Strain; Steven M. Musser; Rob Knight; Eric R. Brown
Background: Research to understand and control microbiological risks associated with the consumption of fresh fruits and vegetables has examined many environments in the farm to fork continuum. An important data gap that remains poorly studied, however, is the baseline description of microflora that may be associated with plant anatomy either endemically or in response to environmental pressures. Specific anatomical niches of plants may contribute to persistence of human pathogens in agricultural environments in ways we have yet to describe. Tomatoes have been implicated in outbreaks of Salmonella at least 17 times during the years spanning 1990 to 2010. Our research seeks to provide a baseline description of the tomato microbiome and possibly identify whether or not there is something distinctive about tomatoes or their growing ecology that contributes to persistence of Salmonella in this important food crop.
Results: DNA was recovered from washes of epiphytic surfaces of tomato anatomical organs: leaves, stems, roots, flowers and fruits of Solanum lycopersicum (BHN602), grown at a site in close proximity to commercial farms previously implicated in tomato-Salmonella outbreaks. DNA was amplified for targeted 16S and 18S rRNA genes and sheared for shotgun metagenomic sequencing. Amplicons and metagenomes were used to describe “native” bacterial microflora for diverse anatomical parts of Virginia-grown tomatoes.
Conclusions: Distinct groupings of microbial communities were associated with different tomato plant organs and a gradient of compositional similarity could be correlated to the distance of a given plant part from the soil. Unique bacterial phylotypes (at 95% identity) were associated with fruits and flowers of tomato plants. These include Microvirga, Pseudomonas, Sphingomonas, Brachybacterium, Rhizobiales, Paracoccus, Chryseomonas and Microbacterium. The most frequently observed bacterial taxa across aerial plant regions were Pseudomonas and Xanthomonas. Dominant fungal taxa that could be identified to genus with 18S amplicons included Hypocrea, Aureobasidium and Cryptococcus. No definitive presence of Salmonella could be confirmed in any of the plant samples, although 16S sequences suggested that closely related genera were present on leaves, fruits and roots.
PLOS ONE | 2013
Marc W. Allard; Yan Luo; Errol Strain; James B. Pettengill; Ruth Timme; Charles Y. Wang; Cong Li; Christine E. Keys; Jie Zheng; Robert Stones; Mark R. Wilson; Steven M. Musser; Eric W. Brown
Facile laboratory tools are needed to augment identification in contamination events to trace the contamination back to the source (traceback) of Salmonella enterica subsp. enterica serovar Enteritidis (S. Enteritidis). Understanding the evolution and diversity within and among outbreak strains is the first step towards this goal. To this end, we collected 106 new S. Enteritidis isolates within S. Enteritidis Pulsed-Field Gel Electrophoresis (PFGE) pattern JEGX01.0004 and close relatives, and determined their genome sequences. Sources for these isolates spanned food, clinical and environmental farm sources collected during the 2010 S. Enteritidis shell egg outbreak in the United States along with closely related serovars, S. Dublin, S. Gallinarum biovar Pullorum and S. Gallinarum. Despite the highly homogeneous structure of this population, S. Enteritidis isolates examined in this study revealed thousands of SNP differences and numerous variable genes (n = 366). Twenty-one of these genes from the lineages leading to outbreak-associated samples had nonsynonymous changes (causing amino acid changes) and five genes are putatively involved in known Salmonella virulence pathways. While chromosome synteny and genome organization appeared to be stable among these isolates, genome size differences were observed due to variation in the presence or absence of several phages and plasmids, including phage RE-2010, phage P125109, plasmid pSEEE3072_19 (similar to pSENV), plasmid pOU1114 and two newly observed mobile plasmid elements pSEEE1729_15 and pSEEE0956_35. These differences produced modifications to the assembled bases for these draft genomes in the size range of approximately 4.6 to 4.8 Mbp, with S. Dublin being larger (∼4.9 Mbp) and S. Gallinarum smaller (4.55 Mbp) when compared to S. Enteritidis. Finally, we identified variable S. Enteritidis genes associated with virulence pathways that may be useful markers for the development of rapid surveillance and typing methods, potentially aiding in traceback efforts during future outbreaks involving S. Enteritidis PFGE pattern JEGX01.0004.
Genome Biology and Evolution | 2014
Maria Hoffmann; Shaohua Zhao; James B. Pettengill; Yan Luo; Steven R. Monday; Jason Abbott; Sherry Ayers; Hediye Nese Cinar; Tim Muruvanda; Cong Li; Marc W. Allard; Jean M. Whichard; Jianghong Meng; Eric W. Brown; Patrick F. McDermott
Salmonella enterica subsp. enterica serovar Heidelberg (S. Heidelberg) is one of the top serovars causing human salmonellosis. Recently, an antibiotic-resistant strain of this serovar was implicated in a large 2011 multistate outbreak resulting from consumption of contaminated ground turkey that involved 136 confirmed cases, with one death. In this study, we assessed the evolutionary diversity of 44 S. Heidelberg isolates using whole-genome sequencing (WGS) generated by the 454 GS FLX (Roche) platform. The isolates, including 30 with nearly indistinguishable (one band difference) XbaI pulsed-field gel electrophoresis patterns (JF6X01.0032, JF6X01.0058), were collected from various sources between 1982 and 2011 and included nine isolates associated with the 2011 outbreak. Additionally, we determined the complete sequence for the chromosome and three plasmids from a clinical isolate associated with the 2011 outbreak using the Pacific Biosciences (PacBio) system. Using single-nucleotide polymorphism (SNP) analyses, we were able to distinguish highly clonal isolates, including strains isolated at different times in the same year. The isolates from the recent 2011 outbreak clustered together with a mean SNP variation of only 17 SNPs. The S. Heidelberg isolates carried a variety of phages, such as prophage P22, P4, lambda-like prophage Gifsy-2, and the P2-like phage which carries the sopE1 gene; virulence genes, including 62 pathogenicity and 13 fimbrial markers; and resistance plasmids of the incompatibility (Inc)I1, IncA/C, and IncHI2 groups. Twenty-one strains contained an IncX plasmid carrying a type IV secretion system. On the basis of the recent and historical isolates used in this study, our results demonstrated that, in addition to providing detailed genetic information for the isolates, WGS can identify SNP targets that can be utilized for differentiating highly clonal S. Heidelberg isolates.
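A figure such as "mean SNP variation" within an outbreak cluster can be read as an average pairwise SNP difference among isolates. The abstract does not say how the value was computed, so the following is only a minimal sketch under that assumption. The function names and the toy SNP strings are hypothetical and are not taken from the study.

```python
from itertools import combinations
from typing import Dict


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length SNP strings differ."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_snp_distance(snps: Dict[str, str]) -> float:
    """Average the SNP distance over every unordered pair of isolates."""
    pairs = list(combinations(snps.values(), 2))
    if not pairs:
        raise ValueError("need at least two isolates")
    return sum(snp_distance(a, b) for a, b in pairs) / len(pairs)


# Toy input (hypothetical): three isolates scored at four variable sites.
toy_cluster = {"iso_a": "AAAA", "iso_b": "AAAT", "iso_c": "AATT"}
print(mean_pairwise_snp_distance(toy_cluster))
```

A small mean relative to the distances between clusters is what lets SNP analysis separate isolates that look identical by PFGE.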
Genome Biology and Evolution | 2013
Ruth Timme; James B. Pettengill; Marc W. Allard; Errol Strain; Rodolphe Barrangou; Chris Wehnes; JoAnn S. Van Kessel; Jeffrey S. Karns; Steven M. Musser; Eric W. Brown
The enteric pathogen Salmonella enterica is one of the leading causes of foodborne illness in the world. The species is extremely diverse, containing more than 2,500 named serovars that are designated for their unique antigen characters and pathogenicity profiles; some are known to be virulent pathogens, while others are not. Questions regarding the evolution of pathogenicity, significance of antigen characters, diversity of clustered regularly interspaced short palindromic repeat (CRISPR) loci, among others, will remain elusive until a strong evolutionary framework is established. We present the first large-scale S. enterica subsp. enterica phylogeny inferred from a new reference-free k-mer approach of gathering single nucleotide polymorphisms (SNPs) from whole genomes. The phylogeny of 156 isolates representing 78 serovars (102 were newly sequenced) reveals two major lineages, each with many strongly supported sublineages. One of these lineages is the S. Typhi group, which is well nested within the phylogeny. Lineage-through-time analyses suggest there have been two instances of accelerated rates of diversification within the subspecies. We also found that antigen characters and CRISPR loci reveal different evolutionary patterns than that of the phylogeny, suggesting that a horizontal gene transfer or possibly a shared environmental acquisition might have influenced the present character distribution. Our study also shows the ability to extract reference-free SNPs from a large set of genomes and then to use these SNPs for phylogenetic reconstruction. This automated, annotation-free approach is an important step forward for bacterial disease tracking and in efficiently elucidating the evolutionary history of highly clonal organisms.
PeerJ | 2015
Steve Davis; James B. Pettengill; Yan Luo; Justin Payne; Al Shpuntoff; Hugh Rand; Errol Strain
The analysis of next-generation sequence (NGS) data is often a fragmented step-wise process. For example, multiple pieces of software are typically needed to map NGS reads, extract variant sites, and construct a DNA sequence matrix containing only single nucleotide polymorphisms (i.e., a SNP matrix) for a set of individuals. The management and chaining of these software pieces and their outputs can often be a cumbersome and difficult task. Here, we present CFSAN SNP Pipeline, which combines into a single package the mapping of NGS reads to a reference genome with Bowtie2, processing of those mapping (BAM) files using SAMtools, identification of variant sites using VarScan, and production of a SNP matrix using custom Python scripts. We also introduce a Python package (CFSAN SNP Mutator) that when given a reference genome will generate variants of known position against which we validate our pipeline. We created 1,000 simulated Salmonella enterica sp. enterica Serovar Agona genomes at 100× and 20× coverage, each containing 500 SNPs, 20 single-base insertions and 20 single-base deletions. For the 100× dataset, the CFSAN SNP Pipeline recovered 98.9% of the introduced SNPs and had a false positive rate of 1.04 × 10⁻⁶; for the 20× dataset 98.8% of SNPs were recovered and the false positive rate was 8.34 × 10⁻⁷. Based on these results, CFSAN SNP Pipeline is a robust and accurate tool that is among the first to combine into a single executable the myriad steps required to produce a SNP matrix from NGS data. Such a tool is useful to those working in an applied setting (e.g., food safety traceback investigations) as well as for those interested in evolutionary questions.
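The final step described above, reducing per-sample sequences to a matrix that contains only variable sites, can be illustrated in a few lines. This is a minimal sketch and not the CFSAN SNP Pipeline's own code: it assumes the samples have already been reduced to equal-length sequences in reference coordinates, and it keeps only sites where every sample has an unambiguous base (a "core" matrix). The function name `build_snp_matrix` and the toy input are hypothetical.

```python
from typing import Dict, List, Tuple

VALID_BASES = set("ACGT")


def build_snp_matrix(aligned: Dict[str, str]) -> Tuple[List[int], Dict[str, str]]:
    """Return (0-based variable positions, per-sample SNP strings).

    A site is kept only if every sample has an unambiguous base there
    and at least two different bases are observed across the samples.
    """
    if not aligned:
        return [], {}
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    length = lengths.pop()

    positions = []
    for i in range(length):
        column = {seq[i].upper() for seq in aligned.values()}
        if column <= VALID_BASES and len(column) > 1:
            positions.append(i)

    matrix = {
        name: "".join(seq[i].upper() for i in positions)
        for name, seq in aligned.items()
    }
    return positions, matrix


# Toy input (hypothetical): the last site is dropped because s2 has an N there.
samples = {"ref": "ACGTACGT", "s1": "ACGTTCGT", "s2": "ACCTACGN"}
positions, matrix = build_snp_matrix(samples)
print(positions, matrix)
```

Dropping sites with any ambiguous call is the strictest choice; a matrix that tolerates some missing data would relax the `column <= VALID_BASES` check.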
PLOS ONE | 2013
Guojie Cao; Jianghong Meng; Errol Strain; Robert Stones; James B. Pettengill; Shaohua Zhao; Patrick F. McDermott; Eric W. Brown; Marc W. Allard
Salmonella Newport has ranked in the top three Salmonella serotypes associated with foodborne outbreaks from 1995 to 2011 in the United States. In the current study, we selected 26 S. Newport strains isolated from diverse sources and geographic locations and then conducted 454 shotgun pyrosequencing procedures to obtain 16–24× coverage of high quality draft genomes for each strain. Comparative genomic analysis of 28 S. Newport strains (including 2 reference genomes) and 15 outgroup genomes identified more than 140,000 informative SNPs. A resulting phylogenetic tree consisted of four sublineages and indicated that S. Newport had a clear geographic structure. Strains from Asia were divergent from those from the Americas. Our findings demonstrated that analysis using whole genome sequencing data resulted in a more accurate picture of phylogeny compared to that using single genes or small sets of genes. We selected loci around the mutS gene of S. Newport to differentiate distinct lineages, including those between invH and mutS genes at the 3′ end of Salmonella Pathogenicity Island 1 (SPI-1), ste fimbrial operon, and Clustered, Regularly Interspaced, Short Palindromic Repeats (CRISPR) associated-proteins (cas). These genes in the outgroup genomes held high similarity with either S. Newport Lineage II or III at the same loci. S. Newport Lineages II and III have different evolutionary histories in this region and our data demonstrated genetic flow and homologous recombination events around mutS. The findings suggested that S. Newport Lineages II and III diverged early in the serotype evolution and have evolved largely independently. Moreover, we identified genes that could delineate sublineages within the phylogenetic tree and that could be used as potential biomarkers for trace-back investigations during outbreaks. Thus, whole genome sequencing data enabled us to better understand the genetic background of pathogenicity and evolutionary history of S. Newport and also provided additional markers for epidemiological response.
PeerJ | 2014
James B. Pettengill; Yan Luo; Steven Davis; Yi Chen; Narjol Gonzalez-Escalona; Andrea R. Ottesen; Hugh Rand; Marc W. Allard; Errol Strain
Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to construct a SNP (single nucleotide polymorphism) matrix (reference-based and reference-free), and (3) phylogenetic inference method (FastTreeMP, GARLI, and RAxML). We carried out these analyses on 194 whole genome sequences representing 107 unique Salmonella enterica subsp. enterica ser. Montevideo strains. Reference-based approaches for identifying SNPs produced trees that were significantly more similar to one another than those produced under the reference-free approach. Topologies inferred using a core matrix (i.e., no missing data) were significantly more discordant than those inferred using a non-core matrix that allows for some missing data. However, allowing for too much missing data likely results in a high false discovery rate of SNPs. When analyzing the same SNP matrix, we observed that the more thorough inference methods implemented in GARLI and RAxML produced more similar topologies than FastTreeMP. Our results also confirm that reproducibility varies among NGS platforms where the MiSeq had the lowest number of pairwise differences among replicate runs. Our investigation into the robustness of clustering patterns illustrates the importance of carefully considering how data from different platforms are combined and analyzed. We found clear differences in the topologies inferred, and certain methods performed significantly better than others for discriminating between the highly clonal organisms investigated here. 
The methods supported by our results represent a preliminary set of guidelines and a step towards developing validated standards for clustering based on whole genome sequence data.
PLOS ONE | 2013
Andrea R. Ottesen; Antonio Gonzalez; Rebecca L. Bell; Caroline Arce; Steven L. Rideout; Marc W. Allard; Peter Evans; Errol Strain; Steven M. Musser; Rob Knight; Eric W. Brown; James B. Pettengill
The ability to detect a specific organism from a complex environment is vitally important to many fields of public health, including food safety. For example, tomatoes have been implicated numerous times as vehicles of foodborne outbreaks due to strains of Salmonella but few studies have ever recovered Salmonella from a tomato phyllosphere environment. Precision of culturing techniques that target agents associated with outbreaks depends on numerous factors. One important factor to better understand is which species co-enrich during enrichment procedures and how microbial dynamics may impede or enhance detection of target pathogens. We used a shotgun sequence approach to describe taxa associated with samples pre-enrichment and throughout the enrichment steps of the Bacteriological Analytical Manual (BAM) protocol for detection of Salmonella from environmental tomato samples. Recent work has shown that during efforts to enrich Salmonella (Proteobacteria) from tomato field samples, Firmicute genera are also co-enriched and at least one co-enriching Firmicute genus (Paenibacillus sp.) can inhibit and even kill strains of Salmonella. Here we provide a baseline description of microflora that co-culture during detection efforts and the utility of a bioinformatic approach to detect specific taxa from metagenomic sequence data. We observed that uncultured samples clustered together with distinct taxonomic profiles relative to the three cultured treatments (Universal Pre-enrichment broth (UPB), Tetrathionate (TT), and Rappaport-Vassiliadis (RV)). There was little consistency among samples exposed to the same culture media, suggesting significant microbial differences in starting matrices or stochasticity associated with enrichment processes. Interestingly, Paenibacillus sp. (Salmonella inhibitor) was significantly enriched from uncultured to cultured (UPB) samples. Also of interest was the sequence-based identification of a number of sequences as Salmonella despite indication by all media that samples were culture negative for Salmonella. Our results substantiate the nascent utility of metagenomic methods to improve both biological and bioinformatic pathogen detection methods.
BMC Infectious Diseases | 2015
Jacob Moran-Gilad; Vitali Sintchenko; Susanne Karlsmose Pedersen; William J. Wolfgang; James B. Pettengill; Errol Strain; Rene S. Hendriksen
The advent of next-generation sequencing (NGS) has revolutionised public health microbiology. Given the potential impact of NGS, it is paramount to ensure standardisation of ‘wet’ laboratory and bioinformatic protocols and promote comparability of methods employed by different laboratories and their outputs. Therefore, one of the ambitious goals of the Global Microbial Identifier (GMI) initiative (http://www.globalmicrobialidentifier.org/) has been to establish a mechanism for inter-laboratory NGS proficiency testing (PT). This report presents findings from the survey recently conducted by Working Group 4 among GMI members in order to ascertain NGS end-use requirements and attitudes towards NGS PT. The survey identified the high professional diversity of laboratories engaged in NGS-based public health projects and the wide range of capabilities within institutions, at a notable range of costs. The priority pathogens reported by respondents reflected the key drivers for NGS use (high burden disease and ‘high profile’ pathogens). The performance of and participation in PT was perceived as important by most respondents. The wide range of sequencing and bioinformatics practices reported by end-users highlights the importance of standardisation and harmonisation of NGS in public health and underpins the use of PT as a means to assuring quality. The findings of this survey will guide the design of the GMI PT program in relation to the spectrum of pathogens included, testing frequency and volume as well as technical requirements. The PT program for external quality assurance will evolve and inform the introduction of NGS into clinical and public health microbiology practice in the post-genomic era.
Journal of Bacteriology | 2012
Ruth Timme; Marc W. Allard; Yan Luo; Errol Strain; James B. Pettengill; Charles Wang; Cong Li; Christine E. Keys; Jie Zheng; Robert Stones; Mark R. Wilson; Steven M. Musser; Eric W. Brown
Salmonella enterica subsp. enterica serovar Enteritidis is a common food-borne pathogen, often associated with shell eggs and poultry. Here, we report draft genomes of 21 S. Enteritidis strains associated with or related to the U.S.-wide 2010 shell egg recall. Eleven of these genomes were from environmental isolates associated with the egg outbreak, and 10 were reference isolates from previous years, unrelated to the outbreak. The whole-genome sequence data for these 21 human pathogen strains are being released in conjunction with the newly formed 100K Genome Project.