Heewook Lee
Indiana University Bloomington
Publications
Featured research published by Heewook Lee.
Proceedings of the National Academy of Sciences of the United States of America | 2012
Heewook Lee; Ellen Popodi; Haixu Tang; Patricia L. Foster
Knowledge of the rate and nature of spontaneous mutation is fundamental to understanding evolutionary and molecular processes. In this report, we analyze spontaneous mutations accumulated over thousands of generations by wild-type Escherichia coli and a derivative defective in mismatch repair (MMR), the primary pathway for correcting replication errors. The major conclusions are (i) the mutation rate of a wild-type E. coli strain is ∼1 × 10⁻³ per genome per generation; (ii) mutations in the wild-type strain have the expected mutational bias for G:C > A:T mutations, but the bias changes to A:T > G:C mutations in the absence of MMR; (iii) during replication, A:T > G:C transitions preferentially occur with A templating the lagging strand and T templating the leading strand, whereas G:C > A:T transitions preferentially occur with C templating the lagging strand and G templating the leading strand; (iv) there is a strong bias for transition mutations to occur at 5′ApC3′/3′TpG5′ sites (where bases 5′A and 3′T are mutated) and, to a lesser extent, at 5′GpC3′/3′CpG5′ sites (where bases 5′G and 3′C are mutated); (v) although the rate of small (≤4 nt) insertions and deletions is high at repeat sequences, these events occur at only 1/10th the genomic rate of base-pair substitutions. MMR activity is genetically regulated, and bacteria isolated from nature often lack MMR capacity, suggesting that modulation of MMR can be adaptive. Thus, comparing results from the wild-type and MMR-defective strains may lead to a deeper understanding of factors that determine mutation rates and spectra, how these factors may differ among organisms, and how they may be shaped by environmental conditions.
Proceedings of the National Academy of Sciences of the United States of America | 2016
Hongan Long; Samuel F. Miller; Chloe Strauss; Chaoxian Zhao; Lei Cheng; Zhiqiang Ye; Katherine Griffin; Ronald Te; Heewook Lee; Chi-Chun Chen; Michael Lynch
Significance: The evolution of antibiotic resistance by pathogenic bacteria poses a major challenge for human health. Whereas it is clear that natural selection promotes resistance-conferring mutations, our understanding of the response of the mutation rate to antibiotics is limited. With hundreds of Escherichia coli cell lines evolving in a near-neutral scenario under exposure to the fluoroquinolone norfloxacin, this study reveals a significant linear relationship between the mutation rate and antibiotic concentration, while also demonstrating that antibiotic treatment compromises the efficiency of DNA oxidative-damage repair and postreplicative mismatch repair. Thus, antibiotics not only impose a selective challenge to target and off-target bacteria but also accelerate the rate of adaptation by magnifying the rate at which advantageous mutations arise.
Abstract: Although it is well known that microbial populations can respond adaptively to challenges from antibiotics, empirical difficulties in distinguishing the roles of de novo mutation and natural selection have left several issues unresolved. Here, we explore the mutational properties of Escherichia coli exposed to long-term sublethal levels of the antibiotic norfloxacin, using a mutation accumulation design combined with whole-genome sequencing of replicate lines. The genome-wide mutation rate significantly increases with norfloxacin concentration. This response is associated with enhanced expression of error-prone DNA polymerases and may also involve indirect effects of norfloxacin on DNA mismatch and oxidative-damage repair. Moreover, we find that acquisition of antibiotic resistance can be enhanced solely by accelerated mutagenesis, i.e., without direct involvement of selection. Our results suggest that antibiotics may generally enhance the mutation rates of target cells, thereby accelerating the rate of adaptation not only to the antibiotic itself but to additional challenges faced by invasive pathogens.
Proceedings of the National Academy of Sciences of the United States of America | 2015
Patricia L. Foster; Heewook Lee; Ellen Popodi; Jesse P. Townes; Haixu Tang
Significance: Because genetic variation underlies evolution, a complete understanding of evolutionary processes requires identifying and characterizing the forces determining the stability of the genome. Using mutation accumulation and whole-genome sequencing, we found that spontaneous mutation rates in three widely diverged Escherichia coli strains are nearly identical. To determine the importance of DNA damage in driving mutation rates, we investigated 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to repair or prevent oxidative DNA damage significantly impacted mutation rates and spectra. These results suggest that, with the exception of those that defend against oxidative damage, DNA repair pathways may exist primarily to defend against DNA damage induced by exogenous agents.
Abstract: A complete understanding of evolutionary processes requires that factors determining spontaneous mutation rates and spectra be identified and characterized. Using mutation accumulation followed by whole-genome sequencing, we found that the mutation rates of three widely diverged commensal Escherichia coli strains differ only by about 50%, suggesting that a rate of 1–2 × 10⁻³ mutations per generation per genome is common for this bacterium. Four major forces are postulated to contribute to spontaneous mutations: intrinsic DNA polymerase errors, endogenously induced DNA damage, DNA damage caused by exogenous agents, and the activities of error-prone polymerases. To determine the relative importance of these factors, we studied 11 strains, each defective for a major DNA repair pathway. The striking result was that only loss of the ability to prevent or repair oxidative DNA damage significantly impacted mutation rates or spectra. These results suggest that, with the exception of oxidative damage, endogenously induced DNA damage does not perturb the overall accuracy of DNA replication in normally growing cells and that repair pathways may exist primarily to defend against exogenously induced DNA damage. The thousands of mutations caused by oxidative damage recovered across the entire genome revealed strong local-sequence biases of these mutations. Specifically, we found that the identity of the 3′ base can affect the mutability of a purine by oxidative damage by as much as eightfold.
G3: Genes, Genomes, Genetics | 2013
Patricia L. Foster; Andrew J. Hanson; Heewook Lee; Ellen Popodi; Haixu Tang
By sequencing the genomes of 34 mutation accumulation lines of a mismatch-repair defective strain of Escherichia coli that had undergone a total of 12,750 generations, we identified 1625 spontaneous base-pair substitutions spread across the E. coli genome. These mutations are not distributed at random but, instead, fall into a wave-like spatial pattern that is repeated almost exactly in mirror image in the two separately replicated halves of the bacterial chromosome. The pattern is correlated to genomic features, with mutation densities greatest in regions predicted to have high superhelicity. Superimposed upon this pattern are regional hotspots, some of which are located where replication forks may collide or be blocked. These results suggest that, as they traverse the chromosome, the two replication forks encounter parallel structural features that change the fidelity of DNA replication.
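The counts in this abstract allow a rough rate estimate. The sketch below is only a back-of-the-envelope illustration: it reads "a total of 12,750 generations" as the sum over all 34 lines, and the genome size is an assumed round figure for E. coli K-12 that the abstract does not state. The function names are hypothetical.

```python
# Figures taken from the abstract above.
SUBSTITUTIONS = 1625        # base-pair substitutions identified
TOTAL_GENERATIONS = 12_750  # read here as summed over the 34 lines

# Assumption (not in the abstract): an E. coli K-12 genome of about 4.64 Mb.
ASSUMED_GENOME_SIZE_BP = 4_640_000


def rate_per_genome(mutations: int, generations: int) -> float:
    """Mutations per genome per generation."""
    return mutations / generations


def rate_per_site(mutations: int, generations: int, genome_size_bp: int) -> float:
    """Mutations per base pair per generation."""
    return mutations / (generations * genome_size_bp)


genome_rate = rate_per_genome(SUBSTITUTIONS, TOTAL_GENERATIONS)
site_rate = rate_per_site(SUBSTITUTIONS, TOTAL_GENERATIONS, ASSUMED_GENOME_SIZE_BP)

print(f"per genome per generation: {genome_rate:.3f}")
print(f"per site per generation:   {site_rate:.2e}")
```

Under these assumptions the per-genome figure comes out near 0.13, roughly two orders of magnitude above the wild-type rate of about 10⁻³ quoted elsewhere on this page.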
Proceedings of the National Academy of Sciences of the United States of America | 2016
Ashok S. Bhagwat; Weilong Hao; Jesse P. Townes; Heewook Lee; Haixu Tang; Patricia L. Foster
Significance: C:G to T:A mutations constitute the largest class of spontaneous base substitutions in all organisms. These mutations are thought to be a result of cytosine deaminations, but what promotes these deaminations is unclear. We confirm here the hypothesis that they occur predominantly in single-stranded DNA (ssDNA) and identify the ssDNA in the lagging strand template as the preferred site of C:G to T:A mutations. As a consequence, replication creates a strand bias in these mutations, and this overwhelms any strand bias resulting from transcription. These results explain a long-recognized bias in base composition of microbial genomes called GC skew and predict that C:G to T:A mutations created by the APOBEC3 family deaminases in cancer genomes should occur with the same strand bias.
Abstract: The rate of cytosine deamination is much higher in single-stranded DNA (ssDNA) than in double-stranded DNA, and copying the resulting uracils causes C to T mutations. To study this phenomenon, the catalytic domain of APOBEC3G (A3G-CTD), an ssDNA-specific cytosine deaminase, was expressed in an Escherichia coli strain defective in uracil repair (ung mutant), and the mutations that accumulated over thousands of generations were determined by whole-genome sequencing. C:G to T:A transitions dominated, with significantly more cytosines mutated to thymine in the lagging-strand template (LGST) than in the leading-strand template (LDST). This strand bias was present in both repair-defective and repair-proficient cells and was strongest and highly significant in cells expressing A3G-CTD. These results show that the LGST is accessible to cellular cytosine deaminating agents, explain the well-known GC skew in microbial genomes, and suggest that the APOBEC3 family of mutators may target the LGST in the human genome.
Molecular Biology and Evolution | 2015
Hongan Long; Sibel Kucukyildirim; Way Sung; Emily Williams; Heewook Lee; Matthew S. Ackerman; Thomas G. Doak; Haixu Tang; Michael Lynch
Deinococcus bacteria are extremely resistant to radiation, oxidation, and desiccation. Resilience to these factors has been suggested to be due to enhanced damage prevention and repair mechanisms, as well as highly efficient antioxidant protection systems. Here, using mutation-accumulation experiments, we find that the GC-rich Deinococcus radiodurans has an overall background genomic mutation rate similar to that of E. coli, but differs in mutation spectrum, with the A/T to G/C mutation rate (based on a total count of 88 A:T → G:C transitions and 82 A:T → C:G transversions) per site per generation higher than that in the other direction (based on a total count of 157 G:C → A:T transitions and 33 G:C → T:A transversions). We propose that this unique spectrum is shaped mainly by the abundant uracil DNA glycosylases reducing G:C → A:T transitions, adenine methylation elevating A:T → C:G transversions, and absence of cytosine methylation decreasing G:C → A:T transitions. As opposed to the greater than 100× elevation of the mutation rate in MMR⁻ (DNA mismatch repair deficient) strains of most other organisms, MMR⁻ D. radiodurans only exhibits a 4-fold elevation, raising the possibility that other DNA repair mechanisms compensate for a relatively low-efficiency DNA MMR pathway. As D. radiodurans has plentiful insertion sequence (IS) elements in the genome and the activities of IS elements are rarely directly explored, we also estimated the insertion (transposition) rate of the IS elements to be 2.50 × 10⁻³ per genome per generation in the wild-type strain; knocking out MMR did not elevate the IS element insertion rate in this organism.
Methods in Molecular Biology | 2012
Heewook Lee; Haixu Tang
As a classic topic in bioinformatics, the fragment assembly problem has been studied for over two decades. Fragment assembly algorithms take a set of DNA fragments as input, piece them together into a set of aligned overlapping fragments (i.e., contigs), and output a consensus sequence for each of the contigs. The rapid advance of massively parallel sequencing, often referred to as next-generation sequencing (NGS) technologies, has revolutionized DNA sequencing by reducing both its time and cost by several orders of magnitude in the past few years, but posed new challenges for fragment assembly. As a result, many new approaches have been developed to assemble NGS sequences, which are typically shorter with a higher error rate, but at a much higher throughput, than classic methods provided. In this chapter, we review both classic and new algorithms for fragment assembly, with a focus on NGS sequences. We also discuss a few new assembly problems emerging from the broader applications of NGS techniques, which are distinct from the classic fragment assembly problem.
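The input/output contract described above (fragments in, contigs out) can be illustrated with a toy greedy overlap-merge assembler. This is a minimal sketch of one classic strategy, not the chapter's own algorithm: it assumes error-free fragments from a single strand, so the merged string stands in for the consensus, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of contigs with the longest overlap."""
    # Drop exact duplicates (keeping order) and fragments contained in others.
    contigs = list(dict.fromkeys(fragments))
    contigs = [f for f in contigs if not any(f != g and f in g for g in contigs)]

    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Real NGS assemblers must also cope with sequencing errors, reverse complements, and repeats, which is why overlap-graph and de Bruijn-graph methods replace this quadratic pairwise search in practice.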
Nucleic Acids Research | 2016
Heewook Lee; Thomas G. Doak; Ellen Popodi; Patricia L. Foster; Haixu Tang
A majority of large-scale bacterial genome rearrangements involve mobile genetic elements such as insertion sequence (IS) elements. Here we report novel insertions and excisions of IS elements and recombination between homologous IS elements identified in a large collection of Escherichia coli mutation accumulation lines by analysis of whole genome shotgun sequencing data. Based on 857 identified events (758 IS insertions, 98 recombinations and 1 excision), we estimate that the rate of IS insertion is 3.5 × 10⁻⁴ insertions per genome per generation and the rate of IS homologous recombination is 4.5 × 10⁻⁵ recombinations per genome per generation. These events are mostly contributed by the IS elements IS1, IS2, IS5 and IS186. Spatial analysis of new insertions suggests that transposition is biased to proximal insertions, and the length spectrum of IS-caused deletions is largely explained by local hopping. For any of the ISs studied there is no region of the circular genome that is favored or disfavored for new insertions but there are notable hotspots for deletions. Some elements have preferences for non-coding sequence or for the beginning and end of coding regions, largely explained by target site motifs. Interestingly, transposition and deletion rates remain constant across the wild-type and 12 mutant E. coli lines, each deficient in a distinct DNA repair pathway. Finally, we characterized the target sites of four IS families, confirming previous results and characterizing a highly specific pattern at IS186 target-sites, 5′-GGGG(N6/N7)CCCC-3′. We also detected 48 long deletions not involving IS elements.
Journal of Computational Biology | 2014
Heewook Lee; Ellen Popodi; Patricia L. Foster; Haixu Tang
Next-generation sequencing techniques are now commonly used to characterize structural variations (SVs) in population genomics and elucidate their associations with phenotypes. Many of the computational tools developed for detecting structural variations work by mapping paired-end reads to a reference genome and identifying the discordant read-pairs whose mapped loci in the reference genome deviate from the expected insert size and orientation. However, repetitive regions in the reference genome represent a major challenge in SV detection, because the paired-end reads from these regions may be mapped to multiple loci in the reference genome, resulting in spuriously discordant read-pairs. To address this issue, we have developed an algorithmic approach for read mapping and SV detection based on the framework of A-Bruijn graphs. Instead of mapping reads to a linear sequence of the reference genome, we propose to map reads onto the A-Bruijn graph constructed from the reference genome in which all instances of the same repeat are collapsed into a single edge. As a result, any given read, either from repetitive regions or not, will be mapped to a unique location in the A-Bruijn graph, and each discordant read-pair in the A-Bruijn graph indicates a potentially true SV event. We also developed a simple clustering algorithm to derive valid clusters of these discordant read-pairs, each supporting a different SV event. Finally, we demonstrate the performance of this approach, compared to existing approaches, by identifying transposition events of insertion sequence (IS) elements, a class of simple mobile genetic elements (MGEs), in E. coli by using simulated and real paired-end sequence data acquired from E. coli mutation accumulation lines.
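The conventional linear-reference procedure that this abstract contrasts itself with (flag read-pairs whose insert size or orientation is unexpected, then group them into candidate events) can be sketched as follows. This is not the A-Bruijn graph method itself; the `ReadPair` record, the thresholds, and the single-pass clustering rule are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    left_pos: int                    # mapped start of the leftmost mate
    right_pos: int                   # mapped start of the rightmost mate
    proper_orientation: bool = True  # mates face each other as expected


def is_discordant(pair: ReadPair, expected_insert: int, tolerance: int) -> bool:
    """A pair is discordant if its span or orientation is unexpected."""
    span = pair.right_pos - pair.left_pos
    return (not pair.proper_orientation) or abs(span - expected_insert) > tolerance


def cluster_discordant(pairs, expected_insert, tolerance, max_gap):
    """Group discordant pairs whose left positions lie within `max_gap`."""
    discordant = sorted(
        (p for p in pairs if is_discordant(p, expected_insert, tolerance)),
        key=lambda p: p.left_pos,
    )
    clusters: list[list[ReadPair]] = []
    for p in discordant:
        if clusters and p.left_pos - clusters[-1][-1].left_pos <= max_gap:
            clusters[-1].append(p)  # extends the current candidate event
        else:
            clusters.append([p])    # starts a new candidate event
    return clusters


pairs = [
    ReadPair(100, 400),           # concordant
    ReadPair(1000, 2500),         # span too large
    ReadPair(1020, 2510),
    ReadPair(5000, 5300, False),  # wrong orientation
    ReadPair(1050, 2530),
]
clusters = cluster_discordant(pairs, expected_insert=300, tolerance=50, max_gap=100)
print([len(c) for c in clusters])
```

In this linear setting a read from a repeat may map to several loci and produce spurious discordant pairs; the abstract's point is that collapsing repeats into single edges of the A-Bruijn graph removes that ambiguity before the clustering step.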
mSystems | 2017
Karin E. Kram; Christopher Geiger; Wazim Mohammed Ismail; Heewook Lee; Haixu Tang; Patricia L. Foster; Steven E. Finkel
Importance: With a growing body of work directed toward understanding the mechanisms of evolution using experimental systems, it is crucial to decipher what effects the experimental setup has on the outcome. If the goal of experimental laboratory evolution is to elucidate underlying evolutionary mechanisms and trends, these must be demonstrated in a variety of systems and environments. Here, we perform experimental evolution in a complex medium allowing the cells to transition through all five phases of growth, including death phase and long-term stationary phase. We show that the swiftness of selection and the specific targets of adaptive evolution are different in this system compared to others. We also observe parallel evolution where different mutations in the same genes are under positive natural selection. Together, these data show that while some outcomes of microbial evolution experiments may be generalizable, many outcomes will be environment or system specific.
Abstract: Experimental evolution of bacterial populations in the laboratory has led to identification of several themes, including parallel evolution of populations adapting to carbon starvation, heat stress, and pH stress. However, most of these experiments study growth in defined and/or constant environments. We hypothesized that while there would likely continue to be parallelism in more complex and changing environments, there would also be more variation in what types of mutations would benefit the cells. In order to test our hypothesis, we serially passaged Escherichia coli in a complex medium (Luria-Bertani broth) throughout the five phases of bacterial growth. This passaging scheme allowed cells to experience a wide variety of stresses, including nutrient limitation, oxidative stress, and pH variation, and therefore allowed them to adapt to several conditions. After every ~30 generations of growth, for a total of ~300 generations, we compared both the growth phenotypes and genotypes of aged populations to the parent population. After as few as 30 generations, populations exhibit changes in growth phenotype and accumulate potentially adaptive mutations. There were many genes with mutant alleles in different populations, indicating potential parallel evolution. We examined 8 of these alleles by constructing the point mutations in the parental genetic background and competed those cells with the parent population; five of these alleles were found to be adaptive. The variety and swiftness of adaptive mutations arising in the populations indicate that the cells are adapting to a complex set of stresses, while the parallel nature of several of the mutations indicates that this behavior may be generalized to bacterial evolution.