Bob Mau
University of Wisconsin-Madison
Publication
Featured research published by Bob Mau.
Nature | 2001
Nicole T. Perna; Guy Plunkett; Valerie Burland; Bob Mau; Jeremy D. Glasner; Debra J. Rose; George F. Mayhew; Peter S. Evans; Jason Gregor; Heather A. Kirkpatrick; György Pósfai; Jeremiah D. Hackett; Sara Klink; Adam Boutin; Ying Shao; Leslie Miller; Erik J. Grotbeck; N. Wayne Davis; Alex Lim; Eileen T. Dimalanta; Konstantinos Potamousis; Jennifer Apodaca; Thomas S. Anantharaman; Jieyi Lin; Galex Yen; David C. Schwartz; Rodney A. Welch; Frederick R. Blattner
The bacterium Escherichia coli O157:H7 is a worldwide threat to public health and has been implicated in many outbreaks of haemorrhagic colitis, some of which included fatalities caused by haemolytic uraemic syndrome. Close to 75,000 cases of O157:H7 infection are now estimated to occur annually in the United States. The severity of disease, the lack of effective treatment and the potential for large-scale outbreaks from contaminated food supplies have propelled intensive research on the pathogenesis and detection of E. coli O157:H7 (ref. 4). Here we have sequenced the genome of E. coli O157:H7 to identify candidate genes responsible for pathogenesis, to develop better methods of strain detection and to advance our understanding of the evolution of E. coli, through comparison with the genome of the non-pathogenic laboratory strain E. coli K-12 (ref. 5). We find that lateral gene transfer is far more extensive than previously anticipated. In fact, 1,387 new genes encoded in strain-specific clusters of diverse sizes were found in O157:H7. These include candidate virulence factors, alternative metabolic capacities, several prophages and other new functions—all of which could be targets for surveillance.
PLOS ONE | 2010
Aaron E. Darling; Bob Mau; Nicole T. Perna
Background: Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms.
Methodology/Principal Findings: We describe a new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss (flux). We demonstrate that the new method can accurately align regions conserved in some, but not all, of the genomes, an important case not handled by our previous work. The method uses a novel alignment objective score called a sum-of-pairs breakpoint score, which facilitates accurate detection of rearrangement breakpoints when genomes have unequal gene content. We also apply a probabilistic alignment filtering method to remove erroneous alignments of unrelated sequences, which are commonly observed in other genome alignment methods. We describe new metrics for quantifying genome alignment accuracy which measure the quality of rearrangement breakpoint predictions and indel predictions. The new genome alignment algorithm demonstrates high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental gain and loss. We apply the new algorithm to a set of 23 genomes from the genera Escherichia, Shigella, and Salmonella. Analysis of whole-genome multiple alignments allows us to extend the previously defined concepts of core- and pan-genomes to include not only annotated genes, but also non-coding regions with potential regulatory roles. The 23 enterobacteria have an estimated core-genome of 2.46 Mbp conserved among all taxa and a pan-genome of 15.2 Mbp. We document substantial population-level variability among these organisms driven by segmental gain and loss. Interestingly, much variability lies in intergenic regions, suggesting that the Enterobacteriaceae may exhibit regulatory divergence.
Conclusions: The multiple genome alignments generated by our software provide a platform for comparative genomic and population genomic studies. Free, open-source software implementing the described genome alignment approach is available from http://gel.ahabs.wisc.edu/mauve.
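As a rough illustration of how core- and pan-genome sizes can be read off a whole-genome alignment, the sketch below sums block lengths. It assumes the alignment has already been reduced to blocks, each with a length in base pairs and the set of genomes it occurs in; the `Block` type and the function name are hypothetical and are not part of the Mauve software.

```python
from typing import FrozenSet, Iterable, List, Tuple

# Hypothetical representation: (block length in bp, genomes that contain the block).
Block = Tuple[int, FrozenSet[str]]


def core_and_pan_size(blocks: List[Block], genomes: Iterable[str]) -> Tuple[int, int]:
    """Return (core size, pan size) in bp for a set of alignment blocks.

    Core: total length of blocks present in every genome.
    Pan: total length of blocks present in at least one genome.
    """
    all_genomes = frozenset(genomes)
    core = sum(length for length, present in blocks if all_genomes <= present)
    pan = sum(length for length, present in blocks if present)
    return core, pan


# Toy data, not taken from the paper.
example_blocks: List[Block] = [
    (100, frozenset({"A", "B", "C"})),  # shared by all three genomes
    (50, frozenset({"A", "B"})),        # missing from C
    (30, frozenset({"C"})),             # unique to C
]
example_sizes = core_and_pan_size(example_blocks, ["A", "B", "C"])
```

Because blocks are counted by sequence length and not by annotated gene, non-coding regions contribute to both totals, which is the extension of the core- and pan-genome concepts that the abstract describes.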
Infection and Immunity | 2003
J. Wei; Marcia B. Goldberg; Valerie Burland; Malabi M. Venkatesan; Wen Deng; G. Fournier; George F. Mayhew; Guy Plunkett; Debra J. Rose; Aaron E. Darling; Bob Mau; Nicole T. Perna; Shelley M. Payne; L. J. Runyen-Janecky; Shiguo Zhou; David C. Schwartz; Frederick R. Blattner
We determined the complete genome sequence of Shigella flexneri serotype 2a strain 2457T (4,599,354 bp). Shigella species cause >1 million deaths per year from dysentery and diarrhea and have a lifestyle that is markedly different from those of closely related bacteria, including Escherichia coli. The genome exhibits the backbone and island mosaic structure of E. coli pathogens, albeit with much less horizontally transferred DNA and lacking 357 genes present in E. coli. The strain is distinctive in its large complement of insertion sequences, with several genomic rearrangements mediated by insertion sequences, 12 cryptic prophages, 372 pseudogenes, and 195 S. flexneri-specific genes. The 2457T genome was also compared with that of a recently sequenced S. flexneri 2a strain, 301. Our data are consistent with Shigella being phylogenetically indistinguishable from E. coli. The S. flexneri-specific regions contain many genes that could encode proteins with roles in virulence. Analysis of these will reveal the genetic basis for aspects of this pathogenic organism's distinctive lifestyle that have yet to be explained.
Bioinformatics | 2009
Anna I. Rissman; Bob Mau; Bryan S. Biehl; Aaron E. Darling; Jeremy D. Glasner; Nicole T. Perna
Summary: Mauve Contig Mover provides a new method for proposing the relative order of contigs that make up a draft genome based on comparison to a complete or draft reference genome. A novel application of the Mauve aligner and viewer provides an automated reordering algorithm coupled with a powerful drill-down display allowing detailed exploration of results. Availability: The software is available for download at http://gel.ahabs.wisc.edu/mauve. Supplementary information: Supplementary data are available at Bioinformatics online and http://gel.ahabs.wisc.edu.
Journal of Bacteriology | 2008
Tim Durfee; Richard Nelson; Schuyler F. Baldwin; Guy Plunkett; Valerie Burland; Bob Mau; Joseph F. Petrosino; Xiang Qin; Donna M. Muzny; Mulu Ayele; Richard A. Gibbs; Bálint Csörgo; György Pósfai; George M. Weinstock; Frederick R. Blattner
Escherichia coli DH10B was designed for the propagation of large insert DNA library clones. It is used extensively, taking advantage of properties such as high DNA transformation efficiency and maintenance of large plasmids. The strain was constructed by serial genetic recombination steps, but the underlying sequence changes remained unverified. We report the complete genomic sequence of DH10B by using reads accumulated from the bovine sequencing project at Baylor College of Medicine and assembled with DNAStar's SeqMan genome assembler. The DH10B genome is largely colinear with that of the wild-type K-12 strain MG1655, although it is substantially more complex than previously appreciated, allowing DH10B biology to be further explored. The 226 mutated genes in DH10B relative to MG1655 are mostly attributable to the extensive genetic manipulations the strain has undergone. However, we demonstrate that DH10B has a 13.5-fold higher mutation rate than MG1655, resulting from a dramatic increase in insertion sequence (IS) transposition, especially IS150. IS elements appear to have remodeled genome architecture, providing homologous recombination sites for a 113,260-bp tandem duplication and an inversion. DH10B requires leucine for growth on minimal medium due to the deletion of leuLABCD and harbors both the relA1 and spoT1 alleles causing both sensitivity to nutritional downshifts and slightly lower growth rates relative to the wild type. Finally, while the sequence confirms most of the reported alleles, the sequence of deoR is wild type, necessitating reexamination of the assumed basis for the high transformability of DH10B.
Journal of Computational and Graphical Statistics | 1997
Bob Mau; Michael A. Newton
Using a stochastic model for the evolution of discrete characters among a group of organisms, we derive a Markov chain that simulates a Bayesian posterior distribution on the space of dendrograms. A transformation of the tree into a canonical cophenetic matrix form, with distinct entries along its superdiagonal, suggests a simple proposal distribution for selecting candidate trees “close” to the current tree in the chain. We apply the consequent Metropolis algorithm to published restriction site data on nine species of plants. The Markov chain mixes well from random starting trees, generating reproducible estimates and confidence sets for the path of evolution.
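The Metropolis step mentioned in this abstract can be illustrated with a generic sampler. The sketch below is not the paper's tree sampler: it swaps the space of dendrograms for a toy ring of five states, and the names `metropolis`, `ring_weight`, and `ring_propose` are hypothetical. It only shows the accept/reject rule for a symmetric proposal that picks a candidate "close" to the current state.

```python
import random
from typing import Callable, List


def metropolis(
    weight: Callable[[int], float],
    propose: Callable[[int, random.Random], int],
    start: int,
    steps: int,
    rng: random.Random,
) -> List[int]:
    """Run a Metropolis chain with a symmetric proposal.

    `weight` is an unnormalized target probability and must be positive at
    `start`. A candidate is accepted with probability
    min(1, weight(candidate) / weight(current)).
    """
    current = start
    samples = []
    for _ in range(steps):
        candidate = propose(current, rng)
        ratio = weight(candidate) / weight(current)
        if rng.random() < ratio:
            current = candidate  # accept the move
        samples.append(current)  # on rejection the current state is repeated
    return samples


# Toy target on states 0..4 (assumed for illustration only).
RING_WEIGHTS = [1.0, 2.0, 3.0, 2.0, 1.0]


def ring_weight(state: int) -> float:
    return RING_WEIGHTS[state]


def ring_propose(state: int, rng: random.Random) -> int:
    """Symmetric proposal: step to a neighbouring state on a ring of five."""
    return (state + rng.choice((-1, 1))) % len(RING_WEIGHTS)
```

Running `metropolis(ring_weight, ring_propose, 2, 20000, random.Random(0))` yields a chain whose visit frequencies approximate the normalized weights. In the tree setting, the proposal would instead perturb the current tree, and the weight would be the posterior density of a tree given the character data.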
Journal of Bacteriology | 2011
Jeremy D. Glasner; Ching Hong Yang; Sylvie Reverchon; Nicole Hugouvieux-Cotte-Pattat; Guy Condemine; Jean Pierre Bohin; Frédérique Van Gijsegem; Shihui Yang; Thierry Franza; Guy Plunkett; Michael San Francisco; Amy O. Charkowski; Béatrice Py; Kenneth Bell; Lise Rauscher; Pablo Rodríguez-Palenzuela; Ariane Toussaint; Maria C. Holeva; Sheng Yang He; Vanessa Douet; Martine Boccara; Carlos Blanco; Ian K. Toth; Bradley D. Anderson; Bryan S. Biehl; Bob Mau; Sarah M. Flynn; Frédéric Barras; Magdalen Lindeberg; Paul R. J. Birch
Dickeya dadantii is a plant-pathogenic enterobacterium responsible for the soft rot disease of many plants of economic importance. We present here the sequence of strain 3937, a strain widely used as a model system for research on the molecular biology and pathogenicity of this group of bacteria.
Genome Biology | 2006
Bob Mau; Jeremy D. Glasner; Aaron E. Darling; Nicole T. Perna
Background: Comparisons of complete bacterial genomes reveal evidence of lateral transfer of DNA across otherwise clonally diverging lineages. Some lateral transfer events result in acquisition of novel genomic segments and are easily detected through genome comparison. Other more subtle lateral transfers involve homologous recombination events that result in substitution of alleles within conserved genomic regions. This type of event is observed infrequently among distantly related organisms. It is reported to be more common within species, but the frequency has been difficult to quantify since the sequences under comparison tend to have relatively few polymorphic sites.
Results: Here we report a genome-wide assessment of homologous recombination among a collection of six complete Escherichia coli and Shigella flexneri genome sequences. We construct a whole-genome multiple alignment and identify clusters of polymorphic sites that exhibit atypical patterns of nucleotide substitution using a random walk-based method. The analysis reveals one large segment (approximately 100 kb) and 186 smaller clusters of single base pair differences that suggest lateral exchange between lineages. These clusters include portions of 10% of the 3,100 genes conserved in six genomes. Statistical analysis of the functional roles of these genes reveals that several classes of genes are over-represented, including those involved in recombination, transport and motility.
Conclusion: We demonstrate that intraspecific recombination in E. coli is much more common than previously appreciated and may show a bias for certain types of genes. The described method provides high-specificity, conservative inference of past recombination events.
Proceedings of the National Academy of Sciences of the United States of America | 2007
David J. Samuelson; Stephanie E. Hesselson; Beth A. Aperavich; Yunhong Zan; Jill D. Haag; Amy Trentham-Dietz; John M. Hampton; Bob Mau; Kai-Shun Chen; Caroline Baynes; Kay-Tee Khaw; Robert Luben; Barbara Perkins; Mitul Shah; Paul Pharoah; Alison M. Dunning; Doug Easton; Bruce A.J. Ponder; Michael N. Gould
Breast cancer risk is a polygenic trait. To identify breast cancer modifier alleles that have a high population frequency and low penetrance we used a comparative genomics approach. Quantitative trait loci (QTL) were initially identified by linkage analysis in a rat mammary carcinogenesis model followed by verification in congenic rats carrying the specific QTL allele under study. The Mcs5a locus was identified by fine-mapping Mcs5 in a congenic model. Here we characterize the Mcs5a locus, which when homozygous for the Wky allele, reduces mammary cancer risk by 50%. The Mcs5a locus is a compound QTL with at least two noncoding interacting elements: Mcs5a1 and Mcs5a2. The resistance phenotype is only observed in rats carrying at least one copy of the Wky allele of each element on the same chromosome. Mcs5a1 is located within the ubiquitin ligase Fbxo10, whereas Mcs5a2 includes the 5′ portion of Frmpd1. Resistant congenic rats show a down-regulation of Fbxo10 in the thymus and an up-regulation of Frmpd1 in the spleen. The association of the Mcs5a1 and Mcs5a2 human orthologs with breast cancer was tested in two population-based breast cancer case-control studies (≈12,000 women). The minor alleles of rs6476643 (MCS5A1) and rs2182317 (MCS5A2) were independently associated with breast cancer risk. The minor allele of rs6476643 increases risk, whereas the rs2182317 minor allele decreases risk. Both alleles have a high population frequency and a low penetrance toward breast cancer risk.
Bioinformatics | 2004
Aaron E. Darling; Bob Mau; Frederick R. Blattner; Nicole T. Perna
GRIL is a tool to automatically identify collinear regions in a set of bacterial-size genome sequences. GRIL uses three basic steps. First, regions of high sequence identity are located. Second, some of these regions are filtered based on user-specified criteria. Finally, the remaining regions of sequence identity are used to define significant collinear regions among the sequences. By locating collinear regions of sequence, GRIL provides a basis for multiple genome alignment using current alignment systems. GRIL also provides a basis for using current inversion distance tools to infer phylogeny. Availability: GRIL is implemented in C++ and runs on any x86-based Linux or Windows platform. It is available from http://asap.ahabs.wisc.edu/gril.
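The filter-then-group idea in the three steps above can be sketched for the simplest case of two genomes. This is a simplified illustration, not GRIL's implementation: it assumes the high-identity matches are already given as unique `(start_in_a, start_in_b, length)` tuples, uses a minimum length as the only filter criterion, and ignores strand orientation. The function name is hypothetical.

```python
from typing import List, Tuple

Match = Tuple[int, int, int]  # (start in genome A, start in genome B, length)


def collinear_blocks(matches: List[Match], min_length: int) -> List[List[Match]]:
    """Group matches into runs that appear in the same order in both genomes.

    Step 2 (filter): drop matches shorter than `min_length`.
    Step 3 (group): walk the kept matches in genome A order and extend the
    current block while each match directly follows the previous one in
    genome B order as well.
    """
    kept = [m for m in matches if m[2] >= min_length]
    by_a = sorted(kept, key=lambda m: m[0])
    # Rank of each match when ordered by its position in genome B.
    rank_b = {m: i for i, m in enumerate(sorted(kept, key=lambda m: m[1]))}

    blocks: List[List[Match]] = []
    for m in by_a:
        if blocks and rank_b[m] == rank_b[blocks[-1][-1]] + 1:
            blocks[-1].append(m)  # still collinear with the previous match
        else:
            blocks.append([m])    # order breaks here: start a new block
    return blocks


# Toy matches (assumed data): the second pair is rearranged relative to the first.
example_matches: List[Match] = [
    (0, 100, 50), (60, 160, 40), (120, 0, 30), (160, 40, 5), (170, 50, 45),
]
example_blocks = collinear_blocks(example_matches, min_length=20)
```

In the toy data, the short match `(160, 40, 5)` is filtered out and the remaining four matches fall into two blocks, reflecting a single rearrangement between the two genomes.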