Science | 2019

Passenger hotspot mutations in cancer driven by APOBEC3A and mesoscale genomic features


Abstract


APOBEC3A hairpin passenger hotspots

Genomic features are often examined at extremes to determine the impact of mutations. These genomic regions span from the trinucleotide context to megabases that underlie chromatin and chromosomal features. Examining mutational dynamics at the mesoscale, the intermediate span of the genome, Buisson et al. characterized the mutational dynamics of cancer (see the Perspective by Carter). They found that mutations caused by the APOBEC enzyme in DNA stem-loops, a mesoscale feature of the genome, could drive recurrent mutations. Many of these types of mutations have been identified as likely drivers of cancer. However, APOBEC-generated mutations outside of stem-loops were more likely to be cancer driver mutations, providing a genomic context for separating cancer driver from passenger mutations.

Science, this issue p. eaaw2872; see also p. 1228

An analysis of more than 2500 human tumors reveals that cancer driver and passenger mutations can be identified from mesoscale genomic context.

INTRODUCTION Extensive tumor sequencing efforts have transformed the way in which cancer driver genes are identified. Appropriate statistical modeling is crucial for distinguishing true drivers from passenger events that accumulate during tumorigenesis but provide no fitness advantage to cancer cells. A central assumption used in discovering driver genes and specific driver mutations is that exact positional recurrence is unlikely by chance: Seeing exactly the same DNA base pair mutated recurrently across patients is taken as proof that the mutation must be under functional selection for contributing to tumor fitness. The assumption is that mutational processes, being essentially random, are unlikely to hit the exact same base pair over and over again. However, although functional selection is clearly a key cause of recurrent mutations in cancers, whether it is the only prominent cause is not known.

RATIONALE To distinguish driver mutations from passengers, it is critical to understand the landscape of background mutations in cancer genomes. Recent pan-cancer mutation analyses have revealed rules of mutation distribution at the smallest (one to three base pairs) and largest (megabase) scales. At the small scale, mutational processes such as those attributable to sunlight, cigarette smoke, or random DNA copying errors generate patterns known as mutational signatures at the trinucleotide level. At the opposite extreme, the cell’s nucleus is organized into two large compartments known as A and B, each consisting of multi-megabase chromatin domains. Compartment A contains gene-rich, open, active, early-replicating euchromatin. Compartment B contains gene-poor, closed, inactive, and late-replicating heterochromatin. Mutation frequency is generally higher in compartment B. Cancer genomes have been studied in detail at these two opposite scales, but less attention has been paid so far to the intervening “mesoscale.”

RESULTS We investigated the influence of mesoscale genomic features on mutational recurrence. We found that mutagenesis by the cytidine deaminase APOBEC3A is uniquely sensitive to mesoscale features, specifically the ability of DNA to adopt particular “hairpin” (stem-loop) structures while transiently single-stranded. Identifying DNA loci that can form hairpins requires sequence analysis at the mesoscale (~30–base pair) level. Combining biochemistry and bioinformatics, we deduced the features of APOBEC3A’s optimal DNA substrates, revealing that cytosine bases presented in a short loop at the end of a strongly paired stem can be mutated up to 200 times as frequently as nonhairpin sites. Analyzing the most frequent APOBEC mutations in protein-coding regions of cancer genomes, we identified numerous recurrent mutations at optimal hairpins in genes unconnected to cancer. Conversely, we found that mutational hotspots at nonoptimal sites are enriched in known cancer driver genes.

CONCLUSION Our results indicate that there are multiple possible routes to mutational hotspots in cancer. Functional mutations in oncogenes or tumor suppressors can rise to prominence through positive selection. These driver hotspots are not restricted to the “favorite” sites of any particular mutagen. In contrast, DNA sites that happen to be perfect substrates for a mutagen can give rise to “passenger hotspot mutations” that owe their prevalence to substrate optimality, not to any effects on tumor fitness. In light of these findings, we recommend caution in interpreting the long lists of putative novel cancer driver hotspots being produced by high-throughput sequencing projects.

APOBEC3A has a taste for hairpins. The APOBEC cytidine deaminase enzymes are a prominent cause of mutations in cancer. Analysis of mutational patterns at the mesoscale (~30–base pair) level reveals that APOBEC3A strongly prefers “hairpin” substrates. These stem-loop DNA structures can form via intrastrand base pairing. Cytosine bases presented at the end of a stable hairpin are exceptionally vulnerable to attack by APOBEC3A, leading to recurrent mutations in the absence of any selective benefit (“passenger hotspots,” left). In contrast, APOBEC mutational hotspots in known cancer driver genes (“driver hotspots,” right) are not restricted to any particular kind of DNA structure.

Cancer drivers require statistical modeling to distinguish them from passenger events, which accumulate during tumorigenesis but provide no fitness advantage to cancer cells. The discovery of driver genes and mutations relies on the assumption that exact positional recurrence is unlikely by chance; thus, the precise sharing of mutations across patients identifies drivers. Examining the mutation landscape in cancer genomes, we found that many recurrent cancer mutations previously designated as drivers are likely passengers. Our integrated bioinformatic and biochemical analyses revealed that these passenger hotspot mutations arise from the preference of APOBEC3A, a cytidine deaminase, for DNA stem-loops. Conversely, recurrent APOBEC-signature mutations not in stem-loops are enriched in well-characterized driver genes and may predict new drivers. This demonstrates that mesoscale genomic features need to be integrated into computational models aimed at identifying mutations linked to diseases.
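
As a rough illustration of the mesoscale sequence analysis described in the results, the following Python sketch flags cytosines that sit in a short loop closed by a perfectly paired stem. It is not the authors' method: the function names, the loop sizes (3 to 5 bases), the minimum stem length (4 pairs), and the use of stem length as a stand-in for pairing strength are all assumptions made for this example.

```python
# Watson-Crick pairs. The two arms of a stem are antiparallel, so the base
# just left of the loop pairs with the base just right of it, and so on outward.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def stem_length(seq, left, right):
    """Count consecutive base pairs extending outward from (left, right)."""
    n = 0
    while left >= 0 and right < len(seq) and (seq[left], seq[right]) in PAIRS:
        n += 1
        left -= 1
        right += 1
    return n


def hairpin_cytosines(seq, loop_sizes=(3, 4, 5), min_stem=4):
    """Map each loop cytosine to (stem_len, loop_start, loop_len) of its best hairpin.

    loop_sizes and min_stem are illustrative assumptions, not published values.
    """
    seq = seq.upper()
    best = {}
    for loop_len in loop_sizes:
        # The loop needs at least one base on each side to close a stem.
        for loop_start in range(1, len(seq) - loop_len):
            stem = stem_length(seq, loop_start - 1, loop_start + loop_len)
            if stem < min_stem:
                continue
            for pos in range(loop_start, loop_start + loop_len):
                if seq[pos] != "C":
                    continue
                # Keep the longest stem; on ties the first hit found is kept.
                if pos not in best or stem > best[pos][0]:
                    best[pos] = (stem, loop_start, loop_len)
    return best


# Hypothetical test sequence: a GGCGC/GCGCC stem closing a TTC loop.
print(hairpin_cytosines("AAGGCGCTTCGCGCCAA"))
```

A realistic analysis would score stem stability thermodynamically and allow mismatches or bulges; counting perfect pairs is only the simplest proxy that fits in a short example.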

Volume 364
DOI 10.1126/science.aaw2872
Language English
Journal Science

Full Text