Sarah K. Highlander | Researchain

Publication

Featured research published by Sarah K. Highlander.

Applied and Environmental Microbiology | 2013

Development of a Dual-Index Sequencing Strategy and Curation Pipeline for Analyzing Amplicon Sequence Data on the MiSeq Illumina Sequencing Platform

James J. Kozich; Sarah L. Westcott; Nielson T. Baxter; Sarah K. Highlander; Patrick D. Schloss

ABSTRACT Rapid advances in sequencing technology have changed the experimental landscape of microbial ecology. In the last 10 years, the field has moved from sequencing hundreds of 16S rRNA gene fragments per study using clone libraries to the sequencing of millions of fragments per study using next-generation sequencing technologies from 454 and Illumina. As these technologies advance, it is critical to assess the strengths, weaknesses, and overall suitability of these platforms for the interrogation of microbial communities. Here, we present an improved method for sequencing variable regions within the 16S rRNA gene using Illumina's MiSeq platform, which is currently capable of producing paired 250-nucleotide reads. We evaluated three overlapping regions of the 16S rRNA gene that vary in length (i.e., V34, V4, and V45) by resequencing a mock community and natural samples from human feces, mouse feces, and soil. By titrating the concentration of 16S rRNA gene amplicons applied to the flow cell and using a quality score-based approach to correct discrepancies between reads used to construct contigs, we were able to reduce error rates by as much as two orders of magnitude. Finally, we reprocessed samples from a previous study to demonstrate that large numbers of samples could be multiplexed and sequenced in parallel with shotgun metagenomes. These analyses demonstrate that our approach can provide data that are at least as good as that generated by the 454 platform while providing considerably higher sequencing coverage for a fraction of the cost.
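The quality score-based correction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `merge_overlap`, the `min_delta` threshold of 6, and the use of `N` for unresolved positions are assumptions made for illustration, and the two reads are assumed to be already aligned over an overlap of equal length (with the reverse read reverse-complemented).

```python
def merge_overlap(fwd, fwd_q, rev, rev_q, min_delta=6):
    """Build a consensus over the aligned overlap of two paired reads.

    fwd, rev     : base strings of equal length (rev already reverse-complemented)
    fwd_q, rev_q : per-base Phred quality scores as lists of ints
    min_delta    : hypothetical threshold; a disagreement is resolved only if
                   one base outscores the other by at least this much
    """
    if not (len(fwd) == len(fwd_q) == len(rev) == len(rev_q)):
        raise ValueError("reads and quality lists must all have the same length")

    consensus = []
    for b1, q1, b2, q2 in zip(fwd, fwd_q, rev, rev_q):
        if b1 == b2:
            consensus.append(b1)       # reads agree: keep the base
        elif q1 - q2 >= min_delta:
            consensus.append(b1)       # forward base is clearly more reliable
        elif q2 - q1 >= min_delta:
            consensus.append(b2)       # reverse base is clearly more reliable
        else:
            consensus.append("N")      # ambiguous: mark as unresolved
    return "".join(consensus)


if __name__ == "__main__":
    # Third position disagrees; the forward base wins on quality (30 vs 10).
    print(merge_overlap("ACGT", [30, 30, 30, 30], "ACCT", [30, 30, 10, 30]))
```

The design choice sketched here is to prefer an explicit ambiguity code over guessing when the two quality scores are close, so that downstream filtering can discard contigs with too many unresolved positions.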

Genome Research | 2011

Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons

Brian J. Haas; Dirk Gevers; Ashlee M. Earl; Mike Feldgarden; Doyle V. Ward; Georgia Giannoukos; Dawn Ciulla; Diana Tabbaa; Sarah K. Highlander; Erica Sodergren; Barbara A. Methé; Todd Z. DeSantis; Joseph F. Petrosino; Rob Knight; Bruce Birren

Bacterial diversity among environmental samples is commonly assessed with PCR-amplified 16S rRNA gene (16S) sequences. Perceived diversity, however, can be influenced by sample preparation, primer selection, and formation of chimeric 16S amplification products. Chimeras are hybrid products between multiple parent sequences that can be falsely interpreted as novel organisms, thus inflating apparent diversity. We developed a new chimera detection tool called Chimera Slayer (CS). CS detects chimeras with greater sensitivity than previous methods, performs well on short sequences such as those produced by the 454 Life Sciences (Roche) Genome Sequencer, and can scale to large data sets. By benchmarking CS performance against sequences derived from a controlled DNA mixture of known organisms and a simulated chimera set, we provide insights into the factors that affect chimera formation such as sequence abundance, the extent of similarity between 16S genes, and PCR conditions. Chimeras were found to reproducibly form among independent amplifications and contributed to false perceptions of sample diversity and the false identification of novel taxa, with less-abundant species exhibiting chimera rates exceeding 70%. Shotgun metagenomic sequences of our mock community appear to be devoid of 16S chimeras, supporting a role for shotgun metagenomics in validating novel organisms discovered in targeted sequence surveys.

Science | 2010

A catalog of reference genomes from the human microbiome.

Karen E. Nelson; George M. Weinstock; Sarah K. Highlander; Kim C. Worley; Heather Huot Creasy; Jennifer R. Wortman; Douglas B. Rusch; Makedonka Mitreva; Erica Sodergren; Asif T. Chinwalla; Michael Feldgarden; Dirk Gevers; Brian J. Haas; Ramana Madupu; Doyle V. Ward; Bruce Birren; Richard A. Gibbs; Barbara A. Methé; Joseph F. Petrosino; Robert L. Strausberg; Granger Sutton; Owen White; Richard Wilson; Scott Durkin; Michelle G. Giglio; Sharvari Gujja; Clint Howarth; Chinnappa D. Kodira; Nikos C. Kyrpides; Teena Mehta

News from the Inner Tube of Life: A major initiative by the U.S. National Institutes of Health to sequence 900 genomes of microorganisms that live on the surfaces and orifices of the human body has established standardized protocols and methods for such large-scale reference sequencing. By combining previously accumulated data with new data, Nelson et al. (p. 994) present an initial analysis of 178 bacterial genomes. The sampling so far barely scratches the surface of the microbial diversity found on humans, but the work provides an important baseline for future analyses. Standardized protocols and methods are being established for large-scale sequencing of the microorganisms living on humans. The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, previously unidentified (“novel”) polypeptides that had both unmasked sequence length greater than 100 amino acids and no BLASTP match to any nonreference entry in the nonredundant subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (~97%) were unique. In addition, this set of microbial genomes allows for ~40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic data sets. In addition, the associated metrics and standards used by our group for quality assurance are presented.

Clinical Chemistry | 2009

Metagenomic Pyrosequencing and Microbial Identification

Joseph F. Petrosino; Sarah K. Highlander; Ruth Ann Luna; Richard A. Gibbs; James Versalovic

BACKGROUND The Human Microbiome Project has ushered in a new era for human metagenomics and high-throughput next-generation sequencing strategies. CONTENT This review describes evolving strategies in metagenomics, with a special emphasis on the core technology of DNA pyrosequencing. The challenges of microbial identification in the context of microbial populations are discussed. The development of next-generation pyrosequencing strategies and the technical hurdles confronting these methodologies are addressed. Bioinformatics-related topics include taxonomic systems, sequence databases, sequence-alignment tools, and classifiers. DNA sequencing based on 16S rRNA genes or entire genomes is summarized with respect to potential pyrosequencing applications. SUMMARY Both the approach of 16S rDNA amplicon sequencing and the whole-genome sequencing approach may be useful for human metagenomics, and numerous bioinformatics tools are being deployed to tackle such vast amounts of microbiological sequence diversity. Metagenomics, or genetic studies of microbial communities, may ultimately contribute to a more comprehensive understanding of human health, disease susceptibilities, and the pathophysiology of infectious and immune-mediated diseases.

Science | 2009

Genome Project Standards in a New Era of Sequencing

Patrick Chain; Darren Grafham; Robert S. Fulton; Michael Fitzgerald; Jessica B. Hostetler; Donna M. Muzny; J. Ali; Bruce W. Birren; David Bruce; Christian Buhay; James R. Cole; Yan Ding; Shannon Dugan; Dawn Field; George M Garrity; Richard A. Gibbs; Tina Graves; Cliff Han; Scott H. Harrison; Sarah K. Highlander; Philip Hugenholtz; H. M. Khouri; Chinnappa D. Kodira; Eugene Kolker; Nikos C. Kyrpides; D. Lang; Alla Lapidus; S. A. Malfatti; Victor Markowitz; T. Metha

More detailed sequence standards that keep up with revolutionary sequencing technologies will aid the research community in evaluating data. For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole-genome sequencing that requires reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker “draft”; however, these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and has contributed to many wasted hours. Exponential leaps in raw sequencing capability and greatly reduced prices have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The result is an ever-widening gap between drafted and finished genomes that only promises to continue (see the figure, page 236); hence, there is an urgent need to distinguish good from poor data sets.

Gastroenterology | 2011

Gastrointestinal Microbiome Signatures of Pediatric Patients With Irritable Bowel Syndrome

Delphine M. Saulnier; Kevin Riehle; Toni Ann Mistretta; Maria Alejandra Diaz; Debasmita Mandal; Sabeen Raza; Erica M. Weidler; Xiang Qin; Cristian Coarfa; Aleksandar Milosavljevic; Joseph F. Petrosino; Sarah K. Highlander; Richard A. Gibbs; Susan V. Lynch; Robert J. Shulman; James Versalovic

BACKGROUND & AIMS The intestinal microbiomes of healthy children and pediatric patients with irritable bowel syndrome (IBS) are not well defined. Studies in adults have indicated that the gastrointestinal microbiota could be involved in IBS. METHODS We analyzed 71 samples from 22 children with IBS (pediatric Rome III criteria) and 22 healthy children, ages 7-12 years, by 16S ribosomal RNA gene sequencing, with an average of 54,287 reads/stool sample (average 454 read length = 503 bases). Data were analyzed using phylogenetic-based clustering (Unifrac), or an operational taxonomic unit (OTU) approach using a supervised machine learning tool (randomForest). Most samples were also hybridized to a microarray that can detect 8741 bacterial taxa (16S rRNA PhyloChip). RESULTS Microbiomes associated with pediatric IBS were characterized by a significantly greater percentage of the class γ-proteobacteria (0.07% vs 0.89% of total bacteria, respectively; P < .05); 1 prominent component of this group was Haemophilus parainfluenzae. Differences highlighted by 454 sequencing were confirmed by high-resolution PhyloChip analysis. Using supervised learning techniques, we were able to classify different subtypes of IBS with a success rate of 98.5%, using limited sets of discriminant bacterial species. A novel Ruminococcus-like microbe was associated with IBS, indicating the potential utility of microbe discovery for gastrointestinal disorders. A greater frequency of pain correlated with an increased abundance of several bacterial taxa from the genus Alistipes. CONCLUSIONS Using 16S metagenomics by PhyloChip DNA hybridization and deep 454 pyrosequencing, we associated specific microbiome signatures with pediatric IBS. These findings indicate the important association between gastrointestinal microbes and IBS in children; these approaches might be used in diagnosis of functional bowel disorders in pediatric patients.

BMC Microbiology | 2007

Subtle genetic changes enhance virulence of methicillin resistant and sensitive Staphylococcus aureus

Sarah K. Highlander; Kristina G. Hulten; Xiang Qin; Huaiyang Jiang; Shailaja Yerrapragada; Edward O. Mason; Yue Shang; Tiffany M. Williams; Régine M Fortunov; Yamei Liu; Okezie Igboeli; Joseph F. Petrosino; Madhan R. Tirumalai; Akif Uzman; George E. Fox; Ana Maria Cardenas; Donna M. Muzny; Lisa Hemphill; Yan Ding; Shannon Dugan; Peter R Blyth; Christian Buhay; Huyen Dinh; Alicia Hawes; Michael Holder; Christie Kovar; Sandra L. Lee; Wen Liu; Lynne V. Nazareth; Qiaoyan Wang

BACKGROUND Community acquired (CA) methicillin-resistant Staphylococcus aureus (MRSA) increasingly causes disease worldwide. USA300 has emerged as the predominant clone causing superficial and invasive infections in children and adults in the USA. Epidemiological studies suggest that USA300 is more virulent than other CA-MRSA. The genetic determinants that render virulence and dominance to USA300 remain unclear. RESULTS We sequenced the genomes of two pediatric USA300 isolates: one CA-MRSA and one CA-methicillin susceptible (MSSA), isolated at Texas Children's Hospital in Houston. DNA sequencing was performed by Sanger dideoxy whole genome shotgun (WGS) and 454 Life Sciences pyrosequencing strategies. The sequence of the USA300 MRSA strain was rigorously annotated. In USA300-MRSA, 2658 chromosomal open reading frames were predicted and 3.1 and 27 kilobase (kb) plasmids were identified. USA300-MSSA contained a 20 kb plasmid with some homology to the 27 kb plasmid found in USA300-MRSA. Two regions found in USA300-MRSA were absent in USA300-MSSA. One of these carried the arginine deiminase operon that appears to have been acquired from S. epidermidis. The USA300 sequence was aligned with other sequenced S. aureus genomes and regions unique to USA300 MRSA were identified. CONCLUSION USA300-MRSA is highly similar to other MRSA strains based on whole genome alignments and gene content, indicating that the differences in pathogenesis are due to subtle changes rather than to large-scale acquisition of virulence factor genes. The USA300 Houston isolate differs from another sequenced USA300 strain isolate, derived from a patient in San Francisco, in plasmid content and a number of sequence polymorphisms. Such differences will provide new insights into the evolution of pathogens.

Journal of Bacteriology | 2004

Complete Genome Sequence of Rickettsia typhi and Comparison with Sequences of Other Rickettsiae

Michael P. McLeod; Xiang Qin; Sandor E. Karpathy; Jason Gioia; Sarah K. Highlander; George E. Fox; Thomas Z. McNeill; Huaiyang Jiang; Donna M. Muzny; Leni S. Jacob; Alicia Hawes; Erica Sodergren; Rachel Gill; Jennifer Hume; Maggie Morgan; Guangwei Fan; Anita G. Amin; Richard A. Gibbs; Chao Hong; Xue Jie Yu; David H. Walker; George M. Weinstock

Rickettsia typhi, the causative agent of murine typhus, is an obligate intracellular bacterium with a life cycle involving both vertebrate and invertebrate hosts. Here we present the complete genome sequence of R. typhi (1,111,496 bp) and compare it to the two published rickettsial genome sequences: R. prowazekii and R. conorii. We identified 877 genes in R. typhi encoding 3 rRNAs, 33 tRNAs, 3 noncoding RNAs, and 838 proteins, 3 of which are frameshifts. In addition, we discovered more than 40 pseudogenes, including the entire cytochrome c oxidase system. The three rickettsial genomes share 775 genes: 23 are found only in R. prowazekii and R. typhi, 15 are found only in R. conorii and R. typhi, and 24 are unique to R. typhi. Although most of the genes are colinear, there is a 35-kb inversion in gene order, which is close to the replication terminus, in R. typhi, compared to R. prowazekii and R. conorii. In addition, we found a 124-kb R. typhi-specific inversion, starting 19 kb from the origin of replication, compared to R. prowazekii and R. conorii. Inversions in this region are also seen in the unpublished genome sequences of R. sibirica and R. rickettsii, indicating that this region is a hot spot for rearrangements. Genome comparisons also revealed a 12-kb insertion in the R. prowazekii genome, relative to R. typhi and R. conorii, which appears to have occurred after the typhus (R. prowazekii and R. typhi) and spotted fever (R. conorii) groups diverged. The three-way comparison allowed further in silico analysis of the SpoT split genes, leading us to propose that the stringent response system is still functional in these rickettsiae.

PLOS ONE | 2012

Evaluation of 16S rDNA-based community profiling for human microbiome research

Doyle V. Ward; Dirk Gevers; Georgia Giannoukos; Ashlee M. Earl; Barbara A. Methé; Erica Sodergren; Michael Feldgarden; Dawn Ciulla; Diana Tabbaa; Cesar Arze; Elizabeth L. Appelbaum; Leigh Aird; Scott Anderson; Tulin Ayvaz; Edward A. Belter; Monika Bihan; Toby Bloom; Jonathan Crabtree; Laura Courtney; Lynn K. Carmichael; David J. Dooling; Rachel L. Erlich; Candace N. Farmer; Lucinda Fulton; Robert S. Fulton; Hongyu Gao; John Gill; Brian J. Haas; Lisa Hemphill; Otis Hall

The Human Microbiome Project will establish a reference data set for analysis of the microbiome of healthy adults by surveying multiple body sites from 300 people and generating data from over 12,000 samples. To characterize these samples, the participating sequencing centers evaluated and adopted 16S rDNA community profiling protocols for ABI 3730 and 454 FLX Titanium sequencing. In the course of establishing protocols, we examined the performance and error characteristics of each technology, and the relationship of sequence error to the utility of 16S rDNA regions for classification- and OTU-based analysis of community structure. The data production protocols used for this work are those used by the participating centers to produce 16S rDNA sequence for the Human Microbiome Project. Thus, these results can be informative for interpreting the large body of clinical 16S rDNA data produced for this project.

Genome Biology | 2008

Large scale variation in Enterococcus faecalis illustrated by the genome analysis of strain OG1RF

Agathe Bourgogne; Danielle A. Garsin; Xiang Qin; Kavindra V. Singh; Jouko Sillanpää; Shailaja Yerrapragada; Yan Ding; Shannon Dugan-Rocha; Christian Buhay; Hua Shen; Guan Chen; Gabrielle Williams; Donna M. Muzny; Arash Maadani; Kristina A. Fox; Jason Gioia; Lei Chen; Yue Shang; Cesar A. Arias; Sreedhar R. Nallapareddy; Meng Zhao; Vittal P. Prakash; Shahreen Chowdhury; Huaiyang Jiang; Richard A. Gibbs; Barbara E. Murray; Sarah K. Highlander; George M. Weinstock

BACKGROUND Enterococcus faecalis has emerged as a major hospital pathogen. To explore its diversity, we sequenced E. faecalis strain OG1RF, which is commonly used for molecular manipulation and virulence studies. RESULTS The 2,739,625 base pair chromosome of OG1RF was found to contain approximately 232 kilobases unique to this strain compared to V583, the only publicly available sequenced strain. Almost no mobile genetic elements were found in OG1RF. The 64 areas of divergence were classified into three categories. First, OG1RF carries 39 unique regions, including 2 CRISPR loci and a new WxL locus. Second, we found nine replacements where a sequence specific to V583 was substituted by a sequence specific to OG1RF. For example, the iol operon of OG1RF replaces a possible prophage and the vanB transposon in V583. Finally, we found 16 regions that were present in V583 but missing from OG1RF, including the proposed pathogenicity island, several probable prophages, and the cpsCDEFGHIJK capsular polysaccharide operon. OG1RF was more rapidly but less frequently lethal than V583 in the mouse peritonitis model and considerably outcompeted V583 in a murine model of urinary tract infections. CONCLUSION E. faecalis OG1RF carries a number of unique loci compared to V583, but the almost complete lack of mobile genetic elements demonstrates that this is not a defining feature of the species. Additionally, OG1RF's effects in experimental models suggest that mediators of virulence may be diverse between different E. faecalis strains and that virulence is not dependent on the presence of mobile genetic elements.
