Stephanie J. Spielman
University of Texas at Austin
Publication
Featured research published by Stephanie J. Spielman.
Nature Reviews Genetics | 2016
Julian Echave; Stephanie J. Spielman; Claus O. Wilke
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
PLOS ONE | 2013
Matthew Z. Tien; Austin G. Meyer; Dariya K. Sydykova; Stephanie J. Spielman; Claus O. Wilke
The relative solvent accessibility (RSA) of a residue in a protein measures the extent of burial or exposure of that residue in the 3D structure. RSA is frequently used to describe a protein's biophysical or evolutionary properties. To calculate RSA, a residue's solvent accessibility (ASA) needs to be normalized by a suitable reference value for the given amino acid; several normalization scales have previously been proposed. However, these scales do not provide tight upper bounds on ASA values frequently observed in empirical crystal structures. Instead, they underestimate the largest allowed ASA values, by up to 20%. As a result, many empirical crystal structures contain residues that seem to have RSA values in excess of one. Here, we derive a new normalization scale that does provide a tight upper bound on observed ASA values. We pursue two complementary strategies, one based on extensive analysis of empirical structures and one based on systematic enumeration of biophysically allowed tripeptides. Both approaches yield congruent results that consistently exceed published values. We conclude that previously published ASA normalization values were too small, primarily because the conformations that maximize ASA had not been correctly identified. As an application of our results, we show that empirically derived hydrophobicity scales are sensitive to accurate RSA calculation, and we derive new hydrophobicity scales that show increased correlation with experimentally measured scales.
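The normalization described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the reference values in EXAMPLE_MAX_ASA are hypothetical placeholders and are not the published normalization scale.

```python
def relative_solvent_accessibility(asa, residue, max_asa):
    """Return RSA = ASA / reference maximum ASA for the given residue type."""
    if residue not in max_asa:
        raise KeyError(f"no reference value for residue {residue!r}")
    if asa < 0:
        raise ValueError("ASA must be non-negative")
    return asa / max_asa[residue]


# Placeholder reference values in square angstroms.
# Hypothetical numbers for illustration only, NOT the published scale.
EXAMPLE_MAX_ASA = {"ALA": 120.0, "GLY": 100.0}

# A mostly buried residue gives an RSA close to zero.
rsa_buried = relative_solvent_accessibility(12.0, "ALA", EXAMPLE_MAX_ASA)

# If the reference value is too small, an exposed residue can exceed one,
# which is the symptom the abstract attributes to earlier scales.
rsa_over = relative_solvent_accessibility(110.0, "GLY", EXAMPLE_MAX_ASA)
```

Passing the scale in as a dictionary keeps the calculation independent of any particular set of reference values, so a different normalization scale can be swapped in without changing the function.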
Molecular Biology and Evolution | 2015
Stephanie J. Spielman; Claus O. Wilke
Numerous computational methods exist to assess the mode and strength of natural selection in protein-coding sequences, yet how distinct methods relate to one another remains largely unknown. Here, we elucidate the relationship between two widely used phylogenetic modeling frameworks: dN/dS models and mutation-selection (MutSel) models. We derive a mathematical relationship between dN/dS and scaled selection coefficients, the focal parameters of MutSel models, and use this relationship to gain deeper insight into the behaviors, limitations, and applicabilities of these two modeling frameworks. We prove that, if all synonymous changes are neutral, standard MutSel models correspond to dN/dS ≤ 1. However, if synonymous codons differ in fitness, dN/dS can take on arbitrarily high values even if all selection is purifying. Thus, the MutSel modeling framework cannot necessarily accommodate positive, diversifying selection, while dN/dS cannot distinguish between purifying selection on synonymous codons and positive selection on amino acids. We further propose a new benchmarking strategy of dN/dS inferences against MutSel simulations and demonstrate that the widely used Goldman-Yang-style dN/dS models yield substantially biased dN/dS estimates on realistic sequence data. In contrast, the less frequently used Muse-Gaut-style models display much less bias. Strikingly, the least-biased and most precise dN/dS estimates are never found in the models with the best fit to the data, measured through both AIC and BIC scores. Thus, selecting models based on goodness-of-fit criteria can yield poor parameter estimates if the models considered do not precisely correspond to the underlying mechanism that generated the data. In conclusion, establishing mathematical links among modeling frameworks represents a novel, powerful strategy to pinpoint previously unrecognized model limitations and strengths.
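The claim that purifying selection with neutral synonymous changes implies dN/dS ≤ 1 can be checked numerically on a toy model. The sketch below is an illustration under stated assumptions, not the derivation from the paper: it assumes the commonly used fixation factor S / (1 − exp(−S)) for a scaled selection coefficient S, symmetric mutation among a handful of abstract states (so equilibrium frequencies are proportional to exp(fitness)), and it treats every change between states as nonsynonymous. The function names are hypothetical.

```python
import math


def fixation_factor(s):
    """Relative fixation rate for scaled selection coefficient s (1 if neutral)."""
    if abs(s) < 1e-12:
        return 1.0
    return s / -math.expm1(-s)


def toy_dnds(fitness):
    """dN/dS-like ratio for states with the given scaled fitness values.

    Assumes symmetric mutation between all states and neutral synonymous
    changes, so the neutral expectation for each change is a factor of 1.
    """
    weights = [math.exp(f) for f in fitness]
    total = sum(weights)
    freqs = [w / total for w in weights]  # equilibrium state frequencies

    observed = 0.0  # frequency-weighted substitution rate under selection
    neutral = 0.0   # the same sum with every fixation factor set to 1
    for i, f_i in enumerate(fitness):
        for j, f_j in enumerate(fitness):
            if i == j:
                continue
            observed += freqs[i] * fixation_factor(f_j - f_i)
            neutral += freqs[i]
    return observed / neutral


ratio_neutral = toy_dnds([0.0, 0.0, 0.0])    # all states equally fit
ratio_selected = toy_dnds([0.0, -2.0, -5.0])  # one preferred state
```

In this toy setting the ratio equals one only when all states are equally fit and drops below one as fitness differences grow, which matches the bound stated in the abstract for the case of neutral synonymous changes.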
Journal of Molecular Evolution | 2014
Amir Shahmoradi; Dariya K. Sydykova; Stephanie J. Spielman; Eleisha L. Jackson; Eric T. Dawson; Austin G. Meyer; Claus O. Wilke
Several recent works have shown that protein structure can predict site-specific evolutionary sequence variation. In particular, sites that are buried and/or have many contacts with other sites in a structure have been shown to evolve more slowly, on average, than surface sites with few contacts. Here, we present a comprehensive study of the extent to which numerous structural properties can predict sequence variation. The quantities we considered include buriedness (as measured by relative solvent accessibility), packing density (as measured by contact number), structural flexibility (as measured by B factors, root-mean-square fluctuations, and variation in dihedral angles), and variability in designed structures. We obtained structural flexibility measures both from molecular dynamics simulations performed on nine non-homologous viral protein structures and from variation in homologous variants of those proteins, where they were available. We obtained measures of variability in designed structures from flexible-backbone design in the Rosetta software. We found that most of the structural properties correlate with site variation in the majority of structures, though the correlations are generally weak (correlation coefficients of 0.1–0.4). Moreover, we found that buriedness and packing density were better predictors of evolutionary variation than structural flexibility. Finally, variability in designed structures was a weaker predictor of evolutionary variability than buriedness or packing density, but it was comparable in its predictive power to the best structural flexibility measures. We conclude that simple measures of buriedness and packing density are better predictors of evolutionary variation than the more complicated predictors obtained from dynamic simulations, ensembles of homologous structures, or computational protein design.
Journal of Cell Biology | 2017
Zuzana Kadlecova; Stephanie J. Spielman; Dinah Loerke; Aparna Mohanakrishnan; Dana Kim Reed; Sandra L. Schmid
The critical initiation phase of clathrin-mediated endocytosis (CME) determines where and when endocytosis occurs. Heterotetrameric adaptor protein 2 (AP2) complexes, which initiate clathrin-coated pit (CCP) assembly, are activated by conformational changes in response to phosphatidylinositol-4,5-bisphosphate (PIP2) and cargo binding at multiple sites. However, the functional hierarchy of interactions and how these conformational changes relate to distinct steps in CCP formation in living cells remain unknown. We used quantitative live-cell analyses to measure discrete early stages of CME and show how sequential, allosterically regulated conformational changes activate AP2 to drive both nucleation and subsequent stabilization of nascent CCPs. Our data establish that cargoes containing Yxxφ motif, but not dileucine motif, play a critical role in the earliest stages of AP2 activation and CCP nucleation. Interestingly, these cargo and PIP2 interactions are not conserved in yeast. Thus, we speculate that AP2 has evolved as a key regulatory node to coordinate CCP formation and cargo sorting and ensure high spatial and temporal regulation of CME.
PLOS ONE | 2015
Stephanie J. Spielman; Claus O. Wilke
We introduce Pyvolve, a flexible Python module for simulating genetic data along a phylogeny using continuous-time Markov models of sequence evolution. Easily incorporated into Python bioinformatics pipelines, Pyvolve can simulate sequences according to most standard models of nucleotide, amino-acid, and codon sequence evolution. All model parameters are fully customizable. Users can additionally specify custom evolutionary models, with custom rate matrices and/or states to evolve. This flexibility makes Pyvolve a convenient framework not only for simulating sequences under a wide variety of conditions, but also for developing and testing new evolutionary models. Pyvolve is an open-source project under a FreeBSD license, and it is available for download, along with a detailed user-manual and example scripts, from http://github.com/sjspielman/pyvolve.
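To make the idea of simulating sequences along a phylogeny with a continuous-time Markov model concrete, here is a minimal standard-library sketch. It does not use or reproduce Pyvolve's API: the function names, the nested-list tree representation, and the choice of the Jukes-Cantor (JC69) nucleotide model are all assumptions made for illustration.

```python
import math
import random

NUCLEOTIDES = "ACGT"


def evolve_branch(sequence, branch_length, rng):
    """Evolve a sequence along one branch under the JC69 model.

    branch_length is measured in expected substitutions per site.
    """
    # JC69 probability that a site shows the same base at both ends of the branch.
    p_same = 0.25 + 0.75 * math.exp(-4.0 * branch_length / 3.0)
    evolved = []
    for base in sequence:
        if rng.random() < p_same:
            evolved.append(base)
        else:
            # The three other bases are equally likely under JC69.
            evolved.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
    return "".join(evolved)


def simulate(tree, sequence, rng, tips=None):
    """Simulate down a tree and return a dict of tip name -> sequence.

    A tree is either a tip name (str) or a list of
    (subtree, branch_length) pairs describing the children of a node.
    """
    if tips is None:
        tips = {}
    if isinstance(tree, str):
        tips[tree] = sequence
        return tips
    for subtree, branch_length in tree:
        child_sequence = evolve_branch(sequence, branch_length, rng)
        simulate(subtree, child_sequence, rng, tips)
    return tips


rng = random.Random(42)  # fixed seed so the run is repeatable
root = "".join(rng.choice(NUCLEOTIDES) for _ in range(30))
toy_tree = [("t1", 0.1), ([("t2", 0.05), ("t3", 0.05)], 0.2)]
alignment = simulate(toy_tree, root, rng)
```

Each site evolves independently here, and sites share a single rate. A general-purpose simulator such as the one described in the abstract would additionally support other state spaces, custom rate matrices, and rate heterogeneity.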
Journal of Molecular Evolution | 2013
Stephanie J. Spielman; Claus O. Wilke
We have investigated the influence of the plasma membrane environment on the molecular evolution of G protein-coupled receptors (GPCRs), the largest receptor family in Metazoa. In particular, we have analyzed the site-specific rate variation across the two primary structural partitions, transmembrane (TM) and extramembrane (EM), of these membrane proteins. We find that TM domains evolve more slowly than do EM domains, though TM domains display increased rate heterogeneity relative to their EM counterparts. Although the majority of residues across GPCRs experience strong to weak purifying selection, many GPCRs experience positive selection at both TM and EM residues, albeit with a slight bias towards the EM. Further, a subset of GPCRs, chemosensory receptors (including olfactory and taste receptors), exhibit increased rates of evolution relative to other GPCRs, an effect which is more pronounced in their TM spans. Although it has been previously suggested that the low evolutionary rate of TM domains is caused by their high percentage of buried residues, we show that their attenuated rate seems to stem from the strong biophysical constraints of the membrane itself, or from functional requirements. In spite of the strong evolutionary constraints acting on the TM spans of GPCRs, positive selection and high levels of evolutionary rate variability are common. Thus, biophysical constraints should not be presumed to preclude a protein’s ability to evolve.
PeerJ | 2015
Stephanie J. Spielman; Keerthana Kumar; Claus O. Wilke
Biogenic amine receptors play critical roles in regulating behavior and physiology in both vertebrates and invertebrates, particularly within the central nervous system. Members of the G protein-coupled receptor (GPCR) family, these receptors interact with endogenous bioamine ligands such as dopamine, serotonin, and epinephrine, and are targeted by a wide array of pharmaceuticals. Despite the clear clinical and biological importance of these receptors, their evolutionary history remains poorly characterized. In particular, the relationships among biogenic amine receptors and any specific evolutionary constraints acting within distinct receptor subtypes are largely unknown. To advance and facilitate studies in this receptor family, we have constructed a comprehensive, high-quality sequence alignment of vertebrate biogenic amine receptors. In particular, we have integrated a traditional multiple sequence approach with robust structural domain predictions to ensure that alignment columns accurately capture the highly conserved GPCR structural domains, and we demonstrate how ignoring structural information produces spurious inferences of homology. Using this alignment, we have constructed a structurally partitioned maximum-likelihood phylogeny from which we deduce novel biogenic amine receptor relationships and uncover previously unrecognized lineage-specific receptor clades. Moreover, we find that roughly 1% of the 3039 sequences in our final alignment are either misannotated or unclassified, and we propose updated classifications for these receptors. We release our comprehensive alignment and its corresponding phylogeny as a resource for future research into the evolution and diversification of biogenic amine receptors.
Genetics | 2016
Stephanie J. Spielman; Suyang Wan; Claus O. Wilke
Two broad paradigms exist for inferring dN/dS, the ratio of nonsynonymous to synonymous substitution rates, from coding sequences: (i) a one-rate approach, where dN/dS is represented with a single parameter, or (ii) a two-rate approach, where dN and dS are estimated separately. The performances of these two approaches have been well studied in the specific context of proper model specification, i.e., when the inference model matches the simulation model. By contrast, the relative performances of one-rate vs. two-rate parameterizations when applied to data generated according to a different mechanism remain unclear. Here, we compare the relative merits of one-rate and two-rate approaches in the specific context of model misspecification by simulating alignments with mutation–selection models rather than with dN/dS-based models. We find that one-rate frameworks generally infer more accurate dN/dS point estimates, even when dS varies among sites. In other words, modeling dS variation may substantially reduce accuracy of dN/dS point estimates. These results appear to depend on the selective constraint operating at a given site. For sites under strong purifying selection (dN/dS ≲ 0.3), one-rate and two-rate models show comparable performances. However, one-rate models significantly outperform two-rate models for sites under moderate-to-weak purifying selection. We attribute this distinction to the fact that, for these more quickly evolving sites, a given substitution is more likely to be nonsynonymous than synonymous. The data will therefore be relatively enriched for nonsynonymous changes, and modeling dS contributes excessive noise to dN/dS estimates. We additionally find that high levels of divergence among sequences, rather than the number of sequences in the alignment, are more critical for obtaining precise point estimates.
Protein Science | 2016
Eleisha L. Jackson; Amir Shahmoradi; Stephanie J. Spielman; Benjamin R. Jack; Claus O. Wilke
Structural properties such as solvent accessibility and contact number predict site‐specific sequence variability in many proteins. However, the strength and significance of these structure–sequence relationships vary widely among different proteins, with absolute correlation strengths ranging from 0 to 0.8. In particular, two recent works have made contradictory observations. Yeh et al. (Mol. Biol. Evol. 31:135–139, 2014) found that both relative solvent accessibility (RSA) and weighted contact number (WCN) are good predictors of sitewise evolutionary rate in enzymes, with WCN clearly out‐performing RSA. Shahmoradi et al. (J. Mol. Evol. 79:130–142, 2014) considered these same predictors (as well as others) in viral proteins and found much weaker correlations and no clear advantage of WCN over RSA. Because these two studies had substantial methodological differences, however, a direct comparison of their results is not possible. Here, we reanalyze the datasets of the two studies with one uniform analysis pipeline, and we find that many apparent discrepancies between the two analyses can be attributed to the extent of sequence divergence in individual alignments. Specifically, the alignments of the enzyme dataset are much more diverged than those of the virus dataset, and proteins with higher divergence exhibit, on average, stronger structure–sequence correlations. However, the highest structure–sequence correlations are observed at intermediate divergence levels, where both highly conserved and highly variable sites are present in the same alignment.