Premal Shah
University of Pennsylvania
Publication
Featured research published by Premal Shah.
Cell | 2013
Premal Shah; Yang Ding; Malwina Niemczyk; Grzegorz Kudla; Joshua B. Plotkin
Summary: Deep sequencing now provides detailed snapshots of ribosome occupancy on mRNAs. We leverage these data to parameterize a computational model of translation, keeping track of every ribosome, tRNA, and mRNA molecule in a yeast cell. We determine the parameter regimes in which fast initiation or high codon bias in a transgene increases protein yield and infer the initiation rates of endogenous Saccharomyces cerevisiae genes, which vary by several orders of magnitude and correlate with 5′ mRNA folding energies. Our model recapitulates the previously reported 5′-to-3′ ramp of decreasing ribosome densities, although our analysis shows that this ramp is caused by rapid initiation of short genes rather than slow codons at the start of transcripts. We conclude that protein production in healthy yeast cells is typically limited by the availability of free ribosomes, whereas protein production under periods of stress can sometimes be rescued by reducing initiation or elongation rates.
Nature | 2013
Yao Xu; Peijun Ma; Premal Shah; Antonis Rokas; Yi Liu; Carl Hirschie Johnson
Circadian rhythms are oscillations in biological processes that function as a key adaptation to the daily rhythms of most environments. In the model cyanobacterial circadian clock system, the core oscillator proteins are encoded by the gene cluster kaiABC. Genes with high expression and functional importance, such as the kai genes, are usually encoded by optimal codons, yet the codon-usage bias of the kaiBC genes is not optimized for translational efficiency. We discovered a relationship between codon usage and a general property of circadian rhythms called conditionality; namely, that endogenous rhythmicity is robustly expressed under some environmental conditions but not others. Despite the generality of circadian conditionality, however, its molecular basis is unknown for any system. Here we show that in the cyanobacterium Synechococcus elongatus, non-optimal codon usage was selected as a post-transcriptional mechanism to switch between circadian and non-circadian regulation of gene expression as an adaptive response to environmental conditions. When the kaiBC sequence was experimentally optimized to enhance expression of the KaiB and KaiC proteins, intrinsic rhythmicity was enhanced at cool temperatures that are experienced by this organism in its natural habitat. However, fitness at those temperatures was highest in cells in which the endogenous rhythms were suppressed at cool temperatures as compared with cells exhibiting high-amplitude rhythmicity. These results indicate natural selection against circadian systems in cyanobacteria that are intrinsically robust at cool temperatures. Modulation of circadian amplitude is therefore crucial to its adaptive significance. Moreover, these results show the direct effects of codon usage on a complex phenotype and organismal fitness.
Our work also challenges the long-standing view of directional selection towards optimal codons, and provides a key example of natural selection against optimal codons to achieve adaptive responses to environmental changes.
Proceedings of the National Academy of Sciences of the United States of America | 2011
Premal Shah; Michael A. Gilchrist
The genetic code is redundant with most amino acids using multiple codons. In many organisms, codon usage is biased toward particular codons. Understanding the adaptive and nonadaptive forces driving the evolution of codon usage bias (CUB) has been an area of intense focus and debate in the fields of molecular and evolutionary biology. However, their relative importance in shaping genomic patterns of CUB remains unsolved. Using a nested model of protein translation and population genetics, we show that observed gene level variation of CUB in Saccharomyces cerevisiae can be explained almost entirely by selection for efficient ribosomal usage, genetic drift, and biased mutation. The correlation between observed codon counts within individual genes and our model predictions is 0.96. Although a variety of factors shape patterns of CUB at the level of individual sites within genes, our results suggest that selection for efficient ribosome usage is a central force in shaping codon usage at the genomic scale. In addition, our model allows direct estimation of codon-specific mutation rates and elongation times and can be readily applied to any organism with high-throughput expression datasets. More generally, we have developed a natural framework for integrating models of molecular processes with population genetics models to quantitatively estimate parameters underlying fundamental biological processes, such as protein translation.
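As an illustrative aside, the redundancy and bias described in this abstract can be made concrete with a small sketch. This is not the nested translation and population genetics model from the paper; it only tabulates, for one coding sequence, how often each codon is used relative to its synonyms. The function name `codon_frequencies` and the toy sequence are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# '*' marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_frequencies(cds):
    """Return each observed codon's share among its synonymous codons.

    `cds` is an in-frame DNA coding sequence; trailing bases that do not
    fill a codon, and codons with non-ACGT characters, are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Total count per amino acid, used to normalise each codon's count.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n

    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


# Hypothetical toy sequence: Met, Ala, Ala, Ala, Lys, Lys, stop.
toy = "ATGGCTGCTGCCAAAAAGTAA"
freqs = codon_frequencies(toy)
for codon in sorted(freqs):
    print(codon, CODON_TABLE[codon], round(freqs[codon], 3))
```

In the toy sequence, alanine is encoded twice by GCT and once by GCC, so the two codons receive unequal shares; a genome-scale analysis would aggregate such counts across genes and relate them to expression, which this sketch does not attempt.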
Evolution | 2013
Matthew L. Niemiller; Benjamin M. Fitzpatrick; Premal Shah; Lars Schmitz; Thomas J. Near
The genetic mechanisms underlying regressive evolution—the degeneration or loss of a derived trait—are largely unknown, particularly for complex structures such as eyes in cave organisms. In several eyeless animals, the visual photoreceptor rhodopsin appears to have retained functional amino acid sequences. Hypotheses to explain apparent maintenance of function include weak selection for retention of light‐sensing abilities and its pleiotropic roles in circadian rhythms and thermotaxis. In contrast, we show that there has been repeated loss of functional constraint of rhodopsin in amblyopsid cavefishes, as at least three cave lineages have independently accumulated unique loss‐of‐function mutations over the last 10.3 Mya. Although several cave lineages still possess functional rhodopsin, they exhibit increased rates of nonsynonymous mutations that have greater effect on the structure and function of rhodopsin compared to those in surface lineages. These results indicate that functionality of rhodopsin has been repeatedly lost in amblyopsid cavefishes. The presence of a functional copy of rhodopsin in some cave lineages is likely explained by stochastic accumulation of mutations following recent subterranean colonization.
Proceedings of the National Academy of Sciences of the United States of America | 2015
Premal Shah; David M. McCandlish; Joshua B. Plotkin
Significance: How large a role does history play in evolution? Do later events depend critically on specific earlier events, or do all events occur more or less independently? If a change occurs early in evolution, does it become easier or harder to revert the change as time proceeds? Here, we explore these ideas in the context of protein evolution, by simulating sequence evolution under purifying selection and then systematically permuting the order of amino acid substitutions. Our results suggest that the amino acid substitutions that occur in evolution are typically contingent on the presence of prior substitutions, and that substitutions that occur early in evolution become entrenched and difficult to modify as subsequent substitutions accrue.
The phenotypic effect of an allele at one genetic site may depend on alleles at other sites, a phenomenon known as epistasis. Epistasis can profoundly influence the process of evolution in populations and shape the patterns of protein divergence across species. Whereas epistasis between adaptive substitutions has been studied extensively, relatively little is known about epistasis under purifying selection. Here we use computational models of thermodynamic stability in a ligand-binding protein to explore the structure of epistasis in simulations of protein sequence evolution. Even though the predicted effects on stability of random mutations are almost completely additive, the mutations that fix under purifying selection are enriched for epistasis. In particular, the mutations that fix are contingent on previous substitutions: Although nearly neutral at their time of fixation, these mutations would be deleterious in the absence of preceding substitutions. Conversely, substitutions under purifying selection are subsequently entrenched by epistasis with later substitutions: They become increasingly deleterious to revert over time.
Our results imply that, even under purifying selection, protein sequence evolution is often contingent on history and so it cannot be predicted by the phenotypic effects of mutations assayed in the ancestral background.
Nature | 2013
David M. McCandlish; Premal Shah; Yang Ding; Joshua B. Plotkin
Arising from M. S. Breen, C. Kemena, P. K. Vlasov, C. Notredame & F. A. Kondrashov, Nature 490, 535–538 (2012); doi:10.1038/nature11510.
An important question in molecular evolution is whether an amino acid that occurs at a given site makes an independent contribution to fitness, or whether its contribution depends on the state of other sites in the organism’s genome, a phenomenon known as epistasis. Breen and colleagues recently argued that epistasis must be “pervasive throughout protein evolution” because the observed ratio between the per-site rates of non-synonymous and synonymous substitutions (dN/dS) is much lower than would be expected in the absence of epistasis. However, when calculating the expected dN/dS ratio in the absence of epistasis, Breen et al. assumed that all amino acids observed at a given position in a protein alignment have equal fitness. Here, we relax this unrealistic assumption and show that any dN/dS value can in principle be achieved at a site, without epistasis; furthermore, for all nuclear and chloroplast genes in the Breen et al. data set, we show that the observed dN/dS values and the observed patterns of amino-acid diversity at each site are jointly consistent with a non-epistatic model of protein evolution.
Genetics | 2009
Michael A. Gilchrist; Premal Shah; Russell Zaretzki
Codon usage bias (CUB) has been documented across a wide range of taxa and is the subject of numerous studies. While most explanations of CUB invoke some type of natural selection, most measures of CUB adaptation are heuristically defined. In contrast, we present a novel and mechanistic method for defining and contextualizing CUB adaptation to reduce the cost of nonsense errors during protein translation. Using a model of protein translation, we develop a general approach for measuring the protein production cost in the face of nonsense errors of a given allele as well as the mean and variance of these costs across its coding synonyms. We then use these results to define the nonsense error adaptation index (NAI) of the allele or a contiguous subset thereof. Conceptually, the NAI value of an allele is a relative measure of its elevation on a specific and well-defined adaptive landscape. To illustrate its utility, we calculate NAI values for the entire coding sequence and across a set of nonoverlapping windows for each gene in the Saccharomyces cerevisiae S288c genome. Our results provide clear evidence of adaptation to reduce the cost of nonsense errors and increasing adaptation with codon position and expression. The magnitude and nature of this adaptation are also largely consistent with simulation results in which nonsense errors are the only selective force driving CUB evolution. Because NAI is derived from mechanistic models, it is both easier to interpret and more amenable to future refinement than other commonly used measures of codon bias. Further, our approach can also be used as a starting point for developing other mechanistically derived measures of adaptation such as for translational accuracy.
Genome Biology and Evolution | 2012
Yang Ding; Premal Shah; Joshua B. Plotkin
Experimental studies of translation have found that short genes tend to exhibit greater densities of ribosomes than long genes in eukaryotic species. It remains an open question whether the elevated ribosome density on short genes is due to faster initiation or slower elongation dynamics. Here, we address this question computationally using 5′-mRNA folding energy as a proxy for translation initiation rates and codon bias as a proxy for elongation rates. We report a significant trend toward reduced 5′-secondary structure in shorter coding sequences, suggesting that short genes initiate faster during translation. We also find a trend toward higher 5′-codon bias in short genes, suggesting that short genes elongate faster than long genes. Both of these trends hold across a diverse set of eukaryotic taxa. Thus, the elevated ribosome density on short eukaryotic genes is likely caused by differential rates of initiation, rather than differential rates of elongation.
Evolution | 2013
Premal Shah; Benjamin M. Fitzpatrick; James A. Fordyce
Phylogenetic hypotheses are frequently used to examine variation in rates of diversification across the history of a group. Patterns of diversification‐rate variation can be used to infer underlying ecological and evolutionary processes responsible for patterns of cladogenesis. Most existing methods examine rate variation through time. Methods for examining differences in diversification among groups are more limited. Here, we present a new method, parametric rate comparison (PRC), that explicitly compares diversification rates among lineages in a tree using a variety of standard statistical distributions. PRC can identify subclades of the tree where diversification rates are at variance with the remainder of the tree. A randomization test can be used to evaluate how often such variance would appear by chance alone. The method also allows for comparison of diversification rate among a priori defined groups. Further, the application of the PRC method is not restricted to monophyletic groups. We examined the performance of PRC using simulated data, which showed that PRC has acceptable false‐positive rates and statistical power to detect rate variation. We apply the PRC method to the well‐studied radiation of North American Plethodon salamanders, and support the inference that the large‐bodied Plethodon glutinosus clade has a higher historical rate of diversification compared to other Plethodon salamanders.
Genome Biology and Evolution | 2015
Michael A. Gilchrist; Wei-Chen Chen; Premal Shah; Cedric Landerer; Russell Zaretzki
Extracting biologically meaningful information from the continuing flood of genomic data is a major challenge in the life sciences. Codon usage bias (CUB) is a general feature of most genomes and is thought to reflect the effects of both natural selection for efficient translation and mutation bias. Here we present a mechanistically interpretable, Bayesian model (ribosome overhead costs Stochastic Evolutionary Model of Protein Production Rate [ROC SEMPPR]) to extract meaningful information from patterns of CUB within a genome. ROC SEMPPR is grounded in population genetics and allows us to separate the contributions of mutational biases and natural selection against translational inefficiency on a gene-by-gene and codon-by-codon basis. Until now, the primary disadvantage of similar approaches was the need for genome scale measurements of gene expression. Here, we demonstrate that it is possible to both extract accurate estimates of codon-specific mutation biases and translational efficiencies while simultaneously generating accurate estimates of gene expression, rather than requiring such information. We demonstrate the utility of ROC SEMPPR using the Saccharomyces cerevisiae S288c genome. When we compare our model fits with previous approaches we observe an exceptionally high agreement between estimates of both codon-specific parameters and gene expression levels (ρ>0.99 in all cases). We also observe strong agreement between our parameter estimates and those derived from alternative data sets. For example, our estimates of mutation bias and those from mutational accumulation experiments are highly correlated (ρ=0.95). 
Our estimates of codon-specific translational inefficiencies and tRNA copy number-based estimates of ribosome pausing time (ρ=0.64), and mRNA and ribosome profiling footprint-based estimates of gene expression (ρ=0.53−0.74) are also highly correlated, thus supporting the hypothesis that selection against translational inefficiency is an important force driving the evolution of CUB. Surprisingly, we find that for particular amino acids, codon usage in highly expressed genes can still be largely driven by mutation bias and that failing to take mutation bias into account can lead to the misidentification of an amino acid’s “optimal” codon. In conclusion, our method demonstrates that an enormous amount of biologically important information is encoded within genome scale patterns of codon usage; accessing this information requires not gene expression measurements but carefully formulated, biologically interpretable models.