J. Paul Brooks
Virginia Commonwealth University
Publication
Featured research published by J. Paul Brooks.
BMC Microbiology | 2015
J. Paul Brooks; David J. Edwards; Michael Harwich; Maria C. Rivera; Jennifer M. Fettweis; Myrna G. Serrano; Robert Reris; Nihar U. Sheth; Bernice Huang; Philippe H. Girerd; Jerome F. Strauss; Kimberly K. Jefferson; Gregory A. Buck
Background: Characterizing microbial communities via next-generation sequencing is subject to a number of pitfalls involving sample processing. The observed community composition can be a severe distortion of the quantities of bacteria actually present in the microbiome, hampering analysis and threatening the validity of conclusions from metagenomic studies. We introduce an experimental protocol using mock communities for quantifying and characterizing bias introduced in the sample processing pipeline. We used 80 bacterial mock communities comprised of prescribed proportions of cells from seven vaginally-relevant bacterial strains to assess the bias introduced in the sample processing pipeline. We created two additional sets of 80 mock communities by mixing prescribed quantities of DNA and PCR product to quantify the relative contribution to bias of (1) DNA extraction, (2) PCR amplification, and (3) sequencing and taxonomic classification for particular choices of protocols for each step. We developed models to predict the “true” composition of environmental samples based on the observed proportions, and applied them to a set of clinical vaginal samples from a single subject during four visits. Results: We observed that using different DNA extraction kits can produce dramatically different results but bias is introduced regardless of the choice of kit. We observed error rates from bias of over 85% in some samples, while technical variation was very low at less than 5% for most bacteria. The effects of DNA extraction and PCR amplification for our protocols were much larger than those due to sequencing and classification. The processing steps affected different bacteria in different ways, resulting in amplified and suppressed observed proportions of a community. When predictive models were applied to clinical samples from a subject, the predicted microbiome profiles were better reflections of the physiology and diagnosis of the subject at the visits than the observed community compositions. Conclusions: Bias in 16S studies due to DNA extraction and PCR amplification will continue to require attention despite further advances in sequencing technology. Analysis of mock communities can help assess bias and facilitate the interpretation of results from environmental samples.
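As a rough illustration of the mock-community idea in the abstract above, observed read proportions can be compared with the prescribed proportions taxon by taxon. The sketch below is not the authors' model: the fold-change summary, the function names, and the taxon labels are assumptions made for illustration only, and it assumes every prescribed proportion is greater than zero.

```python
def proportions(counts):
    """Convert raw read counts per taxon into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: count / total for taxon, count in counts.items()}


def bias_summary(prescribed, observed_counts):
    """Compare observed proportions with the prescribed mock-community mix.

    prescribed: dict of taxon -> prescribed proportion (assumed > 0).
    observed_counts: dict of taxon -> read count after processing.
    A fold change above 1 means the taxon is amplified by the pipeline;
    below 1 means it is suppressed.
    """
    observed = proportions(observed_counts)
    summary = {}
    for taxon, expected in prescribed.items():
        seen = observed.get(taxon, 0.0)
        summary[taxon] = {
            "prescribed": expected,
            "observed": seen,
            "fold_change": seen / expected,
        }
    return summary


if __name__ == "__main__":
    # Hypothetical two-taxon mock community mixed in equal proportions.
    mix = {"taxon_A": 0.5, "taxon_B": 0.5}
    reads = {"taxon_A": 900, "taxon_B": 100}
    for taxon, row in bias_summary(mix, reads).items():
        print(taxon, row)
```

Running the same comparison on communities mixed from cells, from DNA, and from PCR product would, in principle, attribute the distortion to individual processing steps.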
PLOS ONE | 2012
Patricio S. La Rosa; J. Paul Brooks; Elena Deych; Edward L. Boone; David J. Edwards; Qin Wang; Erica Sodergren; George M. Weinstock; William D. Shannon
This paper presents new biostatistical methods for the analysis of microbiome data based on a fully parametric approach using all the data. The Dirichlet-multinomial distribution allows the analyst to calculate power and sample sizes for experimental design, perform tests of hypotheses (e.g., compare microbiomes across groups), and to estimate parameters describing microbiome properties. The use of a fully parametric model for these data has the benefit over alternative non-parametric approaches such as bootstrapping and permutation testing, in that this model is able to retain more information contained in the data. This paper details the statistical approaches for several tests of hypothesis and power/sample size calculations, and applies them for illustration to taxonomic abundance distribution and rank abundance distribution data using HMP Jumpstart data on 24 subjects for saliva, subgingival, and supragingival samples. Software for running these analyses is available.
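To make the Dirichlet-multinomial model mentioned above concrete, the following minimal sketch evaluates its log probability mass function for one sample of taxon counts using the standard closed form. It is not the software referred to in the abstract; the function name and the example parameter values are assumptions for illustration.

```python
from math import exp, lgamma, log


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts per taxon.
    alpha: positive concentration parameters, one per taxon.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient: n! / prod(x_j!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Normalising ratio: Gamma(A) / Gamma(n + A)
    log_norm = lgamma(a) - lgamma(n + a)
    # Per-taxon terms: Gamma(x_j + alpha_j) / Gamma(alpha_j)
    log_terms = sum(lgamma(x + al) - lgamma(al) for x, al in zip(counts, alpha))
    return log_coef + log_norm + log_terms


if __name__ == "__main__":
    # Hypothetical sample with three taxa and illustrative parameters.
    sample = [12, 5, 3]
    params = [2.0, 1.0, 0.5]
    print(dirichlet_multinomial_logpmf(sample, params))
```

Summing this quantity over the samples in a group gives a log-likelihood that could be maximised over `alpha`, which is the kind of parametric fit on which hypothesis tests and power calculations can be built.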
Microbiology | 2014
Jennifer M. Fettweis; J. Paul Brooks; Myrna G. Serrano; Nihar U. Sheth; Philippe H. Girerd; David J. Edwards; Jerome F. Strauss; Kimberly K. Jefferson; Gregory A. Buck
Women of European ancestry are more likely to harbour a Lactobacillus-dominated microbiome, whereas African American women are more likely to exhibit a diverse microbial profile. African American women are also twice as likely to be diagnosed with bacterial vaginosis and are twice as likely to experience preterm birth. The objective of this study was to further characterize and contrast the vaginal microbial profiles in African American versus European ancestry women. Through the Vaginal Human Microbiome Project at Virginia Commonwealth University, 16S rRNA gene sequence analysis was used to compare the microbiomes of vaginal samples from 1268 African American women and 416 women of European ancestry. The results confirmed significant differences in the vaginal microbiomes of the two groups and identified several taxa relevant to these differences. Major community types were dominated by Gardnerella vaginalis and the uncultivated bacterial vaginosis-associated bacterium-1 (BVAB1) that were common among African Americans. Moreover, the prevalence of multiple bacterial taxa that are associated with microbial invasion of the amniotic cavity and preterm birth, including Mycoplasma, Gardnerella, Prevotella and Sneathia, differed between the two ethnic groups. We investigated the contributions of intrinsic and extrinsic factors, including pregnancy, body mass index, diet, smoking and alcohol use, number of sexual partners, and household income, to vaginal community composition. Ethnicity, pregnancy and alcohol use correlated significantly with the relative abundance of bacterial vaginosis-associated species. Trends between microbial profiles and smoking and number of sexual partners were observed; however, these associations were not statistically significant. These results support and extend previous findings that there are significant differences in the vaginal microbiome related to ethnicity and demonstrate that these differences are pronounced even in healthy women.
BMC Systems Biology | 2010
Seth B. Roberts; Christopher M. Gowen; J. Paul Brooks; Stephen S Fong
Background: Microorganisms possess diverse metabolic capabilities that can potentially be leveraged for efficient production of biofuels. Clostridium thermocellum (ATCC 27405) is a thermophilic anaerobe that is both cellulolytic and ethanologenic, meaning that it can directly use the plant sugar, cellulose, and biochemically convert it to ethanol. A major challenge in using microorganisms for chemical production is the need to modify the organism to increase production efficiency. The process of properly engineering an organism is typically arduous. Results: Here we present a genome-scale model of C. thermocellum metabolism, iSR432, for the purpose of establishing a computational tool to study the metabolic network of C. thermocellum and facilitate efforts to engineer C. thermocellum for biofuel production. The model consists of 577 reactions involving 525 intracellular metabolites, 432 genes, and a proteomic-based representation of a cellulosome. The process of constructing this metabolic model led to suggested annotation refinements for 27 genes and identification of areas of metabolism requiring further study. The accuracy of the iSR432 model was tested using experimental growth and by-product secretion data for growth on cellobiose and fructose. Analysis using this model captures the relationship between the reduction-oxidation state of the cell and ethanol secretion and allowed for prediction of gene deletions and environmental conditions that would increase ethanol production. Conclusions: By incorporating genomic sequence data, network topology, and experimental measurements of enzyme activities and metabolite fluxes, we have generated a model that is reasonably accurate at predicting the cellular phenotype of C. thermocellum and establish a strong foundation for rational strain design. In addition, we are able to draw some important conclusions regarding the underlying metabolic mechanisms for observed behaviors of C. thermocellum and highlight remaining gaps in the existing genome annotations.
BMC Genomics | 2012
Jennifer M. Fettweis; Myrna G. Serrano; Nihar U. Sheth; Carly M Mayer; Abigail L. Glascock; J. Paul Brooks; Kimberly K. Jefferson; Gregory A. Buck
Background: The application of next-generation sequencing to the study of the vaginal microbiome is revealing the spectrum of microbial communities that inhabit the human vagina. High-resolution identification of bacterial taxa, minimally to the species level, is necessary to fully understand the association of the vaginal microbiome with bacterial vaginosis, sexually transmitted infections, pregnancy complications, menopause, and other physiological and infectious conditions. However, most current taxonomic assignment strategies based on metagenomic 16S rDNA sequence analysis provide at best a genus-level resolution. While surveys of 16S rRNA gene sequences are common in microbiome studies, few well-curated, body-site-specific reference databases of 16S rRNA gene sequences are available, and no such resource is available for vaginal microbiome studies. Results: We constructed the Vaginal 16S rDNA Reference Database, a comprehensive and non-redundant database of 16S rDNA reference sequences for bacterial taxa likely to be associated with vaginal health, and we developed STIRRUPS, a new method that employs the USEARCH algorithm with a curated reference database for rapid species-level classification of 16S rDNA partial sequences. The method was applied to two datasets of V1-V3 16S rDNA reads: one generated from a mock community containing DNA from six bacterial strains associated with vaginal health, and a second generated from over 1,000 mid-vaginal samples collected as part of the Vaginal Human Microbiome Project at Virginia Commonwealth University. In both datasets, STIRRUPS, used in conjunction with the Vaginal 16S rDNA Reference Database, classified more than 95% of processed reads to a species-level taxon using a 97% global identity threshold for assignment. Conclusions: This database and method provide accurate species-level classifications of metagenomic 16S rDNA sequence reads that will be useful for analysis and comparison of microbiome profiles from vaginal samples. STIRRUPS can be used to classify 16S rDNA sequence reads from other ecological niches if an appropriate reference database of 16S rDNA sequences is available.
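The threshold-based assignment described in the abstract above can be pictured with a small sketch. This is not the STIRRUPS implementation and does not call USEARCH: it assumes that some alignment step has already produced, for each read, a list of (taxon, identity) hits against a reference database, and the function names and example data are hypothetical.

```python
def assign_taxon(hits, threshold=0.97):
    """Return the best-matching taxon, or None if no hit meets the threshold.

    hits: list of (taxon_name, global_identity) pairs, identity in [0, 1].
    """
    if not hits:
        return None
    taxon, identity = max(hits, key=lambda hit: hit[1])
    return taxon if identity >= threshold else None


def fraction_classified(reads_to_hits, threshold=0.97):
    """Fraction of reads that receive a taxon at the given identity threshold."""
    if not reads_to_hits:
        return 0.0
    assigned = sum(
        1 for hits in reads_to_hits.values()
        if assign_taxon(hits, threshold) is not None
    )
    return assigned / len(reads_to_hits)


if __name__ == "__main__":
    # Hypothetical hits for three reads; names and identities are made up.
    example = {
        "read_1": [("Lactobacillus crispatus", 0.99), ("Lactobacillus iners", 0.93)],
        "read_2": [("Gardnerella vaginalis", 0.96)],
        "read_3": [],
    }
    for read, hits in example.items():
        print(read, assign_taxon(hits))
    print(fraction_classified(example))
```

Raising or lowering `threshold` trades the share of reads classified against confidence in each species-level call.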
Operations Research | 2011
J. Paul Brooks
In the interest of deriving classifiers that are robust to outlier observations, we present integer programming formulations of Vapnik's support vector machine (SVM) with the ramp loss and hard margin loss. The ramp loss allows a maximum error of 2 for each training observation, while the hard margin loss calculates error by counting the number of training observations that are in the margin or misclassified outside of the margin. SVM with these loss functions is shown to be a consistent estimator when used with certain kernel functions. In computational studies with simulated and real-world data, SVM with the robust loss functions ignores outlier observations effectively, providing an advantage over SVM with the traditional hinge loss when using the linear kernel. Despite the fact that training SVM with the robust loss functions requires the solution of a quadratic mixed-integer program (QMIP) and is NP-hard, while traditional SVM requires only the solution of a continuous quadratic program (QP), we are able to find good solutions and prove optimality for instances with up to 500 observations. Solution methods are presented for the new formulations that improve computational performance over industry-standard integer programming solvers alone.
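The three loss functions named in the abstract above can be written down per observation as follows. This is a minimal sketch of the losses only, not of the integer programming formulations; it assumes labels in {-1, +1} and a real-valued classifier score, and the treatment of a score lying exactly on the margin boundary is an assumption.

```python
def hinge_loss(label, score):
    """Traditional SVM hinge loss: grows without bound for outliers."""
    return max(0.0, 1.0 - label * score)


def ramp_loss(label, score):
    """Hinge loss capped at 2, so a single outlier contributes at most 2."""
    return min(2.0, hinge_loss(label, score))


def hard_margin_loss(label, score):
    """1 if the observation is in the margin or misclassified, else 0."""
    return 1.0 if label * score < 1.0 else 0.0


if __name__ == "__main__":
    # A positive observation scored far on the wrong side acts as an outlier.
    for score in (2.0, 0.5, -0.5, -10.0):
        print(
            score,
            hinge_loss(+1, score),
            ramp_loss(+1, score),
            hard_margin_loss(+1, score),
        )
```

The bounded penalties are what limit the influence of outliers, and they are also what make the training problem non-convex, which is consistent with the need for mixed-integer programming noted in the abstract.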
Annals of Operations Research | 2010
J. Paul Brooks; Eva K. Lee
Classification is concerned with the development of rules for the allocation of observations to groups, and is a fundamental problem in machine learning. Much of the previous work on classification models investigates two-group discrimination. Multi-category classification is less often considered due to the tendency of generalizations of two-group models to produce misclassification rates that are higher than desirable. Indeed, producing “good” two-group classification rules is a challenging task for some applications, and producing good multi-category rules is generally more difficult. Additionally, even when the “optimal” classification rule is known, inter-group misclassification rates may be higher than tolerable for a given classification model. We investigate properties of a mixed-integer programming based multi-category classification model that allows for the pre-specification of limits on inter-group misclassification rates. The mechanism by which the limits are satisfied is the use of a reserved judgment region, an artificial category into which observations are placed whose attributes do not sufficiently indicate membership in any particular group. The method is shown to be a consistent estimator of a classification rule with misclassification limits, and performance on simulated and real-world data is demonstrated.
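A reserved judgment region can be illustrated with a simple decision rule. The rule below is one common form of such a classifier and is assumed here for illustration; it is not taken from the abstract. Each group's prior-weighted density is reduced by penalties on the other groups' densities, and the observation is left unclassified when no group scores above zero. The penalty weights are hypothetical inputs, and how they would be chosen to meet the misclassification limits is outside this sketch.

```python
def classify_with_reserved_judgment(densities, priors, penalties):
    """Return a group index, or None for the reserved judgment region.

    densities[h]: estimated density of the observation under group h.
    priors[h]: prior probability of group h.
    penalties[i][h]: non-negative weight penalising the assignment of an
        observation from group i to group h (hypothetical inputs).
    """
    groups = range(len(densities))
    best_group, best_value = None, 0.0  # reserved judgment scores 0
    for h in groups:
        value = priors[h] * densities[h] - sum(
            penalties[i][h] * densities[i] for i in groups if i != h
        )
        if value > best_value:
            best_group, best_value = h, value
    return best_group


if __name__ == "__main__":
    priors = [0.5, 0.5]
    penalties = [[0.0, 1.0], [1.0, 0.0]]
    # Clear evidence for group 0, then an ambiguous observation.
    print(classify_with_reserved_judgment([0.9, 0.1], priors, penalties))
    print(classify_with_reserved_judgment([0.5, 0.5], priors, penalties))
```

With all penalties set to zero the rule reduces to picking the group with the largest prior-weighted density, so larger penalties are what enlarge the reserved judgment region.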
American Journal of Obstetrics and Gynecology | 2015
Matthew Josiah Allen-Daniels; Myrna G. Serrano; Lindsey P. Pflugner; Jennifer M. Fettweis; Melissa A. Prestosa; Vishal N. Koparde; J. Paul Brooks; Jerome F. Strauss; Roberto Romero; Tinnakorn Chaiworapongsa; David A. Eschenbach; Gregory A. Buck; Kimberly K. Jefferson
OBJECTIVE Microbial invasion of the amniotic cavity is associated with spontaneous preterm labor and adverse pregnancy outcome, and Mycoplasma hominis often is present. However, the pathogenic process by which M hominis invades the amniotic cavity and gestational tissues, often resulting in chorioamnionitis and preterm birth, remains unknown. We hypothesized that strains of M hominis vary genetically with regards to their potential to invade and colonize the amniotic cavity and placenta. STUDY DESIGN We sequenced the entire genomes of 2 amniotic fluid isolates and a placental isolate of M hominis from pregnancies that resulted in preterm births and compared them with the previously sequenced genome of the type strain PG21. We identified genes that were specific to the amniotic fluid/placental isolates. We then determined the microbial burden and the presence of these genes in another set of subjects from whom samples of amniotic fluid had been collected and were positive for M hominis. RESULTS We identified 2 genes that encode surface-located membrane proteins (Lmp1 and Lmp-like) in the sequenced amniotic fluid/placental isolates that were truncated severely in PG21. We also identified, for the first time, a microbial gene of unknown function that is referred to in this study as gene of interest C that was associated significantly with bacterial burden in amniotic fluid and the risk of preterm delivery in patients with preterm labor. CONCLUSION A gene in M hominis was identified that is associated significantly with colonization and/or infection of the upper reproductive tract during pregnancy and with preterm birth.
Clinics in Laboratory Medicine | 2014
Bernice Huang; Jennifer M. Fettweis; J. Paul Brooks; Kimberly K. Jefferson; Gregory A. Buck
Deep sequence analysis of the vaginal microbiome is revealing an unexpected complexity that was not anticipated as recently as several years ago. The lack of clarity in the definition of a healthy vaginal microbiome, much less an unhealthy vaginal microbiome, underscores the need for more investigation of these phenomena. Some clarity may be gained by the careful analysis of the genomes of the specific bacteria in these women. Ongoing studies will clarify this process and offer relief for women with recurring vaginal maladies and hope for pregnant women to avoid the experience of preterm birth.
PLOS ONE | 2014
Jennifer M. Fettweis; Myrna G. Serrano; Bernice Huang; J. Paul Brooks; Abigail L. Glascock; Nihar U. Sheth; Jerome F. Strauss; Kimberly K. Jefferson; Gregory A. Buck
Humans are colonized by thousands of bacterial species, but it is difficult to assess the metabolic and pathogenic potential of the majority of these because they have yet to be cultured. Here, we characterize an uncultivated vaginal mycoplasma tightly associated with trichomoniasis that was previously known by its 16S rRNA sequence as “Mnola.” In this study, the mycoplasma was found almost exclusively in women infected with the sexually transmitted pathogen Trichomonas vaginalis, but rarely observed in women with no diagnosed disease. The genomes of four strains of this species were reconstructed using metagenome sequencing and assembly of DNA from four discrete mid-vaginal samples, one of which was obtained from a pregnant woman with trichomoniasis who delivered prematurely. These bacteria harbor several putative virulence factors and display unique metabolic strategies. Genes encoding proteins with high similarity to potential virulence factors include two collagenases, a hemolysin, an O-sialoglycoprotein endopeptidase and a feoB-type ferrous iron transport system. We propose the name “Candidatus Mycoplasma girerdii” for this potential new pathogen.