[PDF] Biophysical inference of epistasis and the effects of mutations on protein stability and function

Abstract

Understanding the relationship between protein sequence, function, and stability is a fundamental problem in biology. While high-throughput methods have produced large numbers of sequence-function pairs, functional assays do not distinguish whether mutations directly affect function or are destabilizing the protein. Here, we introduce a statistical method to infer the underlying biophysics from a high-throughput binding assay by combining information from many mutated variants. We fit a thermodynamic model describing the bound, unbound, and unfolded states to high quality data of protein G domain B1 binding to IgG-Fc. We infer an energy landscape with distinct folding and binding energies for each substitution providing a detailed view of how mutations affect binding and stability across the protein. We accurately infer folding energy of each variant in physical units, validated by independent data, whereas previous high-throughput methods could only measure indirect changes in stability. While we assume an additive sequence-energy relationship, the binding fraction is epistatic due its non-linear relation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasis in binding fraction, with the remaining epistasis identifying conformationally dynamic regions.

Full PDF

BBiophysical inference of epistasis and the eﬀects of mutations onprotein stability and function

Jakub Otwinowski

University of Pennsylvania, Biology Department, [email protected]

Abstract

Understanding the relationship between protein sequence, function, and stability is a fundamentalproblem in biology. While high-throughput methods have produced large numbers of sequence-functionpairs, functional assays do not distinguish whether mutations directly aﬀect function or are destabi-lizing the protein. Here, we introduce a statistical method to infer the underlying biophysics from ahigh-throughput binding assay by combining information from many mutated variants. We ﬁt a thermo-dynamic model describing the bound, unbound, and unfolded states to high quality data of protein Gdomain B1 binding to IgG-Fc. We infer an energy landscape with distinct folding and binding energiesfor each substitution providing a detailed view of how mutations aﬀect binding and stability across theprotein. We accurately infer folding energy of each variant in physical units, validated by independentdata, whereas previous high-throughput methods could only measure indirect changes in stability. Whilewe assume an additive sequence-energy relationship, the binding fraction is epistatic due its non-linear re-lation to energy. Despite having no epistasis in energy, our model explains much of the observed epistasisin binding fraction, with the remaining epistasis identifying conformationally dynamic regions.

Author Summary

Determining how mutations impact protein stability and function is instrumental in understanding howproteins carry out their biological tasks, how they evolve, and how to engineer novel proteins. Howevermeasuring diﬀerences in function between mutated variants does not distinguish whether mutations aredirectly aﬀecting function or are destabilizing the protein. Here, we ﬁt a thermodynamic model to datadescribing how thousands of variants of a protein bind to an antibody fragment. We accurately infer separatefolding and binding energies in physical units, providing a detailed energy landscape describing how mutationsaﬀect binding and stability across the protein, and our non-linear model reproduces many of the observedinteractions between sites.

Introduction

Deep mutational scanning (DMS) studies have produced detailed maps of how proteins and regulatory se-quences are related to function by assaying up to millions of mutated variants, and has had many applications,from identifying viral epitopes to protein engineering [1, 2]. While these studies aim to understand molecularfunction and evolution by collecting large numbers of sequence-function pairs, the full sequence-function map1 a r X i v : . [ q - b i o . B M ] M a r s very diﬃcult to determine due to the enormity of sequence space. Diﬀerent sites in a sequence may notcontribute to molecular function independently, and the eﬀect of a substitution at one site may depend onthe genetic background. This non-additivity, or epistasis, means that the entire space of possible sequencesmay have to be explored to understand molecular function.Given the limited data, mathematical modeling is necessary to make any progress. However, purelystatistical inferences are diﬃcult to interpret in terms of known biology, and can be too ﬂexible to makereliable predictions [3, 4]. On the other hand, biophysical systems are not arbitrarily complex as they followphysical laws and structural constraints. In other words, there is hope that biophysical knowledge can helpexplain sequence-function relationships. A powerful assumption is that in between sequence and function lierelevant intermediate phenotypes for which we can derive relatively simple relations to sequence and function[5]. The stability of a protein’s fold is a fundamental molecular phenotype under selection. Studying howmutations aﬀect stability is a fundamental challenge in protein science [6], and is the aim of some DMSstudies [7, 8, 9, 10]. However, assaying molecular function, such as binding to a ligand, is a necessary andinsuﬃcient measure of stability, in that most proteins must be folded to function, but do not necessarilyfunction if folded. In addition, high-throughput techniques, such as proteolysis assays [11], do not measurefree energy, but measure scores that are correlated with stability. Similarly, scores from high-throughputbinding assays do not measure binding energy in physical units, and do not distinguish whether changes seenin variants are due to changes in the overall fold stability or stability of the binding interface.While high-throughput assays often confound function and stability, these can be separated with a ther-modynamic approach. Thermodynamic approaches have been at the heart of biophysical models appliedto data to quantify the evolution of regulatory sequences [12, 13, 14, 15], and proteins [16, 17, 18]. In thecontext of proteins, there are typically a few relevant conformational states, and the kinetics are fast enoughto reach thermal equilibrium, with a free energy determining the probability of each state. At a minimum, aprotein has folded and unfolded states, and other states may be due to binding, mis-folding, or some otherconformational changes. Two-state models have been important in understanding observed patterns of sub-stitutions in protein evolution [19, 20, 21], and in general, the ensemble of protein conformations generatesepistasis that makes protein evolution diﬃcult to predict [22]. A powerful simpliﬁcation is to approximatethe total energy by a sum over site-speciﬁc energies (additivity), which has been observed in most changesto fold stability [23, 24]. However, even with additivity in energy, the probability of a protein being in aparticular state is non-linear with respect to energy and therefore epistatic with regard to sequence.In this work, we infer a thermodynamic model that separates folding and stability in a small bacterialprotein, protein G domain B1 (GB1), a model system of folding and stability, where a recent high-throughputassay of functional binding to an immunoglobulin fragment (IgG-Fc) described the epistasis between nearly allpairs of residues [10]. We infer a thermodynamic model with two states, bounded, and unbound, and anothermodel with three states: bound-folded, unbound-folded and unfolded. The approximation of additivity inenergy allows us to separate how mutations destabilize the binding interface and how they destabilize theoverall fold. We validate these approximations by predicting independently measured changes in fold stability.We describe the folding and stability landscape of the protein, identify which sites contribute most to binding,and explain much of the observed epistasis without assuming any energetic interactions.2 esults In vitro selection of protein variants

Olson et al. [10] mutagenized GB1 to create a library of protein variants which contained all single amino acidsubstitutions (1045 variants) and nearly all double substitutions (536k variants) of a reference or wild-typesequence. The library was sequenced before and after an in vitro selection assay, and the fraction of boundprotein to IgG-Fc for a variant with sequence σ is p (cid:48) ( σ ) = n ( σ ) n ( σ ) r , where n ( σ ) and n ( σ ) are the sequencecounts before and after selection, and r is a global factor that accounts for systematic diﬀerences betweeninitial and ﬁnal sequencing (see Methods for a maximum likelihood derivation of p (cid:48) ). For convenience, wedeﬁne ﬁtness as the logarithm of the binding fraction normalized by the wild-type σ W f ( σ ) = log (cid:18) n ( σ ) n ( σ ) n ( σ W ) n ( σ W ) (cid:19) , (1)as an analogy to the growth rate of an exponentially growing population, although we do not imply that thisis the (relative) growth rate of an organism with this variant.With nearly every possible double substitution it is possible to study interactions between sites, or epista-sis. We deﬁne pairwise epistasis in ﬁtness as the diﬀerence between the ﬁtness of the double mutants relativeto the wild-type and the expectation of additivity, i.e., the sum of the ﬁtness of two single mutants: J abij = f ( σ W/ ( i,a ) / ( j,b ) ) − f ( σ W/ ( i,a ) ) − f ( σ W/ ( j,b ) ) . (2)where / ( i, a ) and / ( j, b ) indicate substitutions at positions i, j with amino acids a, b .While changes in ﬁtness across all single and double mutants show where a protein is sensitive to binding,such changes are not informative of whether mutations are destabilizing the binding interface or the overallfold, as changes in either one inﬂuence the fraction of bound protein. A thermodynamic model is necessaryto separate these eﬀects. Thermodynamic models

Proteins fold into complicated structures and interact with other molecules depending on the free energyof their diﬀerent states or conformations. Under natural conditions, protein states reach thermodynamicequilibrium very quickly and the Boltzmann distribution relates the probability of state i to the free energy E i : p i ( σ ) = Z e − E i ( σ ) , where E i ( σ ) are in dimensionless units and Z is the normalization factor over states[25]. For a two state bound/unbound model, the fraction of bound protein is e E ( σ ) , with energy E ( σ ) . Inorder to separate binding and stability, we deﬁne three states: unfolded and unbound, folded and unbound,and folded and bound, and therefore the fraction of bound protein is p ( σ ) = e − E f ( σ ) − E b ( σ ) e − E f ( σ ) + e − E f ( σ ) − E b ( σ ) = 11 + e E b ( σ ) (1 + e E f ( σ ) ) (3)with folding energy E f ( σ ) and binding energy E b ( σ ) . The folding energy is relative to the unfolded state,whereas the binding energy is relative to the folded-unbound state, up to a constant that depends on theconcentration of ligand (the chemical potential). Importantly, this binding energy measures only the desta-bilization of the binding interface, and is distinct from dissociation constants that are related to our modelby K d ∝ e E b (1 + e E f ) (neglecting the chemical potential). Intuitively, low binding and folding energy leads3o large p , and the shaded areas in Fig. 1 show regions in energy space where each label indicates the mostlikely state.Given an experimentally measured p of a single variant, there are many values E b and E f that match p ,and it is not possible to identify these energies, as shown by the contour lines of equal p in Fig. 1. However,with the approximation of additivity in energy over sites, it is possible to combine information from manysequences to estimate energies. Additivity means that the energy is a sum over energies speciﬁc to eachsubstitution (cid:15) ( i, a ) : E f ( σ ) = (cid:15) Wf + (cid:88) i (cid:15) f ( i, σ i ) (4) E b ( σ ) = (cid:15) Wb + (cid:88) i (cid:15) b ( i, σ i ) (5)where subscripts f and b indicate folding and binding respectively, σ i is the amino acid at position i , and (cid:15) ( i, σ Wi ) = 0 so that the wild-type energy is (cid:15) W . While additivity has been observed in many experimentalmeasurements of changes in fold stability [23, 24], it is a local approximation that is likely to be violated forhighly mutated sequences.To illustrate how multiple data points constrain this non-linear model, consider a hypothetical quartetof sequences and their measured binding fraction p : the wild-type, two sequences with single substitutions,and a sequence with both of those substitutions. The lines in Fig. 1 are the energy coordinates consistentwith the given p , and the dashed lines are the additive energies that connect the wild-type (red line) to thesingle mutants (black lines), and to the double mutant (blue line). The parameters are not constrained givena wild-type p and single mutant p , as a rectangles can be placed anywhere between two lines as long as theopposing corners land on them. However, when considering all four sequences the largest rectangle musthave lengths that are the sum of the smaller rectangle lengths due to additivity, and the non-linearity of thecurves constrains the parameters (additive energies) that can ﬁt the data, with more data providing moreconstraints.We use all sequences and associated counts in a maximum a likelihood framework to infer all additive andwild-type folding and binding energies, converted to kcal/mol (see Methods). For comparison, we also inferenergies of the two state model. In Methods, we modify the likelihood to account for non-speciﬁc backgroundbinding, and describe a procedure to overcome local optima via bootstrapping. Inferred energy landscape

We compare the inferred additive folding energies to independent low-throughput measurements of 81 singlesubstitutions in Fig. 2A (collected from diﬀerent sources in [10]). The three state model (bound/unbound/unfolded)accurately predicts (cid:15) f in physical units with an root mean squared error of 0.39 kcal/mol and a correlation of ρ = 0 . , which is better than computational methods (~0.6 to ~0.7) and close to the amount of correlationbetween replicates of low-throughput methods (~0.86) [26]. Six variants with 2-6 mutations are also predicted(Fig. 2A red) with comparable accuracy. However, 2 highly stable variants (G β ˆ f ( σ ) = log( p ( σ ) /p ( σ W )) and measured ﬁtness of 96.4% and 97.1% fortwo and three state models respectively (see Fig. S1). However, the additive energies inferred by the two4 olded andboundfolded andunbound unfolded andunboundp=0.95p=0.21p=0.045p=0.0003 −10−50510 −10 −5 0 5 10 folding energy E f b i nd i ng ene r g y E b Figure 1: Thermodynamic model of three protein states: unfolded and unbound, folded and unbound, andfolded and bound, described by eq. 3. Shaded areas correspond to regions in energy space where the labeledstate is dominant. Given sequences and binding fractions p , the non-linear Boltzmann form (eq. 3) imposesconstraints on the possible parameters (additive energies). Solid lines are the energies compatible with p for four hypothetical sequences: wild-type (red), two single mutants (black) and a double mutant (blue).Dashed lines represent additive energies, and connect the wild-type to the single mutants (black) and thedouble mutant (blue), which has lengths equal to the sum of additive energies. −4−3−2−101−4 −3 −2 −1 0 1 predicted e f , kcal/mol m ea s u r ed e f , kc a l / m o l A residue depth r m s ene r g y , kc a l / m o l B Figure 2: A) Accurate prediction of changes in folding energy (cid:15) f (eq. 4, commonly referred to as ∆∆ G ) byﬁtting a three state thermodynamic model to deep mutational scanning data. Predicted energies have a rootmean square error of 0.39 kcal/mol and ρ = 0 . compared to independent measurements of (cid:15) f for 81 singlesubstitutions [10]. Six variants with 2-6 mutations are shown in red. The line has a slope of unity. B) Foldingenergies (teal) have a stronger relation to residue depth than binding energies (red). Root mean square energychanges at each position are shown, and a plus sign indicates sites at the protein-protein interface [27, 28]).5 i nd i ng ene r g y f o l d i ng ene r g y kcal/mol Figure 3: Inferred additive binding (cid:15) b and folding (cid:15) f energies show strikingly diﬀerent patterns. Three ofthe binding sites (27, 31, 43) have strong eﬀects on binding. Many substitutions at positions 23 and 41 arebeneﬁcial for folding and deleterious for binding, although overall substitutions are uncorrelated.state model have no relation to the independently measured (cid:15) f (Fig. S2). Clearly a three state model isnecessary to predict folding energy, and has the added beneﬁt of estimating binding energy.To assess how much sampling noise inﬂuences our results, we calculate 95% conﬁdence intervals of (cid:15) f and (cid:15) b from the bootstrapped estimates (Fig. S3), and ﬁnd that they are very narrow compared to the rangeof eﬀect sizes for most of the 2092 energy parameters. Examining (cid:15) f and (cid:15) b across sites and amino acids(Fig. 3) reveals a detailed picture of how folding and binding are sensitive to substitutions. The energies havestriking diﬀerences in their patterns, and (cid:15) f and (cid:15) b are uncorrelated (Fig. 4A, ρ = 0 . , p value = . ). Somesubstitutions have strong antagonistic eﬀects, such as at positions 23 and 41, neither of which are at thebinding interface. The substitutions 41L and 54G have particularly strong antagonism, although the doublemutant is known to be strongly epistatic, and there may be systematic errors in these parameters from eﬀectsnot captured by our model. With relatively few exceptions, amino acid substitutions in GB1 do not producetrade-oﬀs between binding and fold stability.The energy landscape of a protein is determined by its structure, so we expect that the inferred energiesare related to structural features. Both (cid:15) f and (cid:15) b correlate with residue depth (Fig. 2B), and (cid:15) b has a weakerrelation to depth, except for a few sensitive shallow residues. The three most sensitive sites to binding areat the interface of the two proteins [27, 28] (plus signs in Fig. 2B), but many other sites at the interface arenot sensitive.A top down view of folding vs. binding energy of single and double mutants depicts how the variants fallinto each of the three states. In this phenotypic space, the wild-type is better than most of the observedsequences, and in terms of binding fraction, it is in the 72nd and 85th percentile of the single and double6 olded andbound unfolded andunboundfolded andunbound folding energy E f , kcal/mol b i nd i ng ene r g y E b , kc a l / m o l p folded andbound unfolded andunboundfolded andunbound folding energy E f , kcal/mol b i nd i ng ene r g y E b , kc a l / m o l count Figure 4: Folding vs. binding energy for single (A) and double (B) mutants. Red dot is the wild-type energy,the red line is where binding fraction is the same as wild-type p = p W , and below the red line variants have p > p W . The three equilibrium states are labeled, with E f = 0 demarcating the folded and unfolded states.substitutions respectively. Most variants fall in the region of excess stability ( E f (cid:28) ), whereas bindingenergies are distributed around the wild-type, implying that binding fraction is more sensitive to changes inbinding energy than folding energy for the majority of variants. This is consistent with the lack of correlationbetween additive energies from the two state model and independently measured folding energies (Fig. S2),as well as the lack of correlation between changes in ﬁtness and folding energies [10]. Patterns of epistasis

While the energies in our models are non-epistatic, the binding fraction and ﬁtness are epistatic due to thenon-linearity between binding fraction and energy (eq. 3), and our inferred pairwise epistasis ˆ J abij , analogousto eq. 2, can be compared to the observed epistasis. We ﬁlter out non-biological epistasis due to experimentallimits on measured ﬁtness, i.e., non-speciﬁc background binding (see Methods), and average over amino acidsin each pair of sites. The predictions from the three-state model reproduces the biological epistasis betterthan the two-state model, which vastly underestimates the magnitude of epistasis across the protein (Fig. 5).Notably, the three state model predicts much of the negative epistasis, but misses clusters of positive epistasis.To quantify these deviations, the diﬀerence between the predicted and observed epistasis can be normalizedby the noise in the observed epistasis, z ij = ˆ J ij − J ij √ v Jij (see Methods). The clusters of positive epistasis are moreclearly visible after ﬁltering out all but the most underestimated epistasis (bottom 5% of z ij , Fig. S4). Asnoted in [10], the residues in these positions had correlated conformational dynamics in NMR and moleculardynamics studies (positions 7, 9, 11, 12, 14, 16, 33, 37, 38, 40, 54, 56, [29, 30, 31]). This suggests that thisunexplained epistasis is due to systematic errors not accounted for in the model, such as epistasis in energy orsome alternative conformations, and therefore large prediction errors can identify sites that should be studiedin more detail. 7 bserved three state two state10 20 30 40 50 10 20 30 40 50 10 20 30 40 501020304050 −2−1012 epistasis Figure 5: Patterns of pairwise epistasis observed in the data and predicted by the three state and two statethermodynamic models. Shown are observed pairwise ﬁtness epistasis (eq. 2), and inferred epistasis from thethree state model and the two state model. A network of residues that undergo correlated conformationaldynamics (positions 7, 9, 11, 12, 14, 16, 33, 37, 38, 40, 54, 56) have signiﬁcant observed positive epistasisthat the thermodynamic models fail to estimate. Epistasis is averaged for each pair of sites over the relevantamino acid substitutions. We ﬁlter out non-biological epistasis that is a consequence of the experimentallimits on measured ﬁtness due to non-speciﬁc background binding (see Methods).We also made predictions for a follow up study by Wu et al. [32], which targeted 4 highly epistatic sites inGB1, and assayed all combinations of mutations, i.e. variants. Most of the mutants with 3 or 4 mutationshave very weak binding, and the ﬁtness predictions from the model trained on the Olson et. al data canroughly predict their functionality (Fig. S5). Our model predicts functional quadruple mutants with a truepositive rate of 86% and a true negative rate of 95% (deﬁning functional as f > − . , Tab. S1). At the sametime, the ﬁtness is underestimated for many variants much more than expected from measurement variability(Fig. S6), suggesting that, in this more mutated data, some unaccounted for epistasis is restoring binding orstability for approximately 20% of variants. Discussion

We have shown how with a few biophysical assumptions, i.e.,. a small number of thermodynamic states andadditivity of energy, are suﬃcient to extract a detailed folding and binding energy landscape of a protein.Many DMS studies use in vitro and in vivo selection assays, and quantify the results with enrichmentratios, similar to ﬁtness in eq. 1. Most of these studies focus on single substitutions from a wild-type,however ﬁtness changes of single substitutions confound changes in stability and binding. We have shown howcombining information from multiple sequences with many mutations provide the constraints to separate thesephenotypes in a thermodynamic model, and therefore highly mutagenized sequence-function experiments canprovide a rich description of a protein’s energy landscape.The inferred energies from the three state model very accurately predict the independent low-throughputmeasurements, and show that it is feasible to infer folding and binding energy in physical units accurately fromsimple high-throughput functional assays. Several DMS studies have indirectly measured stability. Araya etal. [7] applied a metric, based on the rescue eﬀect of double mutations, to identify stabilizing mutations, andRocklin et al. [11] used proteolysis assays that correlate with stability. Olson et al. [10] extracted stabilitymeasures from this dataset by searching for single mutation ﬁtness changes in diﬀerent genetic backgrounds8hat correlated with the literature set of stability measurements. Further reﬁnements were developed withclustering methods that use structural information and physiochemical properties [33]. However, these ad-hocmethods can suﬀer from over-ﬁtting and require extraneous knowledge to work eﬀectively. In contrast, thethermodynamic model directly infers stability in physical units and does not require any benchmark stabilitymeasurements or structural information.Our method also infers the binding energies that are marginal to the folded–unbound state and measurethe destabilization of the binding interface. In constrast, dissociation constants, as measured by titrationcurves, do not account for diﬀerences in fold stability. The recently developed tite-seq method [34] infers asaturation constant to account for sequence dependent diﬀerences in stability and expression, but does notcompensate for stability eﬀects in the dissociation constant itself.We have shown that the three state model shows good agreement with patterns of epistasis, and deviationsfrom our model identify a network of residues that have correlated conformational dynamics. Since thedeviations and measurement noise itself can be rather large, sign epistasis, path accessibility, and othergeometric features of the inferred genotype-phenotype map are likely to be distorted. It is possible that morecomplex models, such energetic interaction terms or more conformational states, can describe the remainingepistasis in double mutants and in variants with more than two mutations. The three state model works wellfor the relatively small GB1, but larger proteins may need additional states, such as mis-folded conformations,to accurately model their properties.Inferred energy landscapes from DMS may also be useful in understanding protein structure. Inferredenergy parameters may be useful for calibrating potential functions used in structure prediction [35]. Reﬁnedthermodynamic models with pairwise epistatic energy may be able to infer protein contacts directly, similarto how multiple sequence alignments of homologous proteins can infer contacts [36], providing a way topredict structure from DMS studies. Thermodynamic models coupled with DMS also provide a way to studyintrinsically disordered proteins, which fold and bind simultaneously, but have no persistent structure whileunbound [37]. Since many conformations can correspond to these states, free energy diﬀerences are a naturalway to quantify the properties of disordered proteins.While we have inferred a detailed genotype-phenotype map of GB1, yet we do not know the consequencesfor evoltuion, which depend on how the binding fraction aﬀects the organismal ﬁtness. Manhart and Morozov[38] explored the evolutionary dynamics of a ﬁtness function that is a linear combination of the three proteinstates. This leads to an evolutionary coupling between binding and folding, where selection on foldingcan drive changes in binding, and vice versa. With appropriate data it may be possible to infer selectioncoeﬃcients associated with each state, as well as evolutionary trajectories in energy space, from multiplesequence alignments.

Methods

Poisson likelihood for an in vitro selection assay

In an in vitro selection assay with one round, the library of protein variants is sequenced before and afterbinding, and therefore the count or multiplicity of each sequence carries information on the binding. For eachvariant σ , with initial and ﬁnal counts n and n , we deﬁne a Poisson log-likelihood with intensity λ = N and λ = N pr , where p represents the fraction of bound protein and r is the systematic diﬀerence between9nitial and ﬁnal sequencing. We omit the dependence on σ for brevity (note that r does not depend on σ ).The per variant joint likelihood over the two time points is LL = − N (1 + pr ) + n log( N ) + n log( N pr ) , (6)omitting terms that don’t depend on parameters. This has a nuisance parameter N per sequence, which hasan ML estimate, given the other parameters N ∗ = n + n pr Plugging it into the likelihood and dropping terms which don’t depend on the parameters results in LL = − ( n + n ) log(1 + pr ) + n log( pr ) . (7)The maximum likelihood estimate of binding fraction is p (cid:48) = n n r . Since some of the counts can be very small,we add a pseudo-count of to n and n to slightly regularize the estimate. Thermodynamic model inference

We use the likelihood in eq. 7, and parameterize p as a thermodynamic model following the Boltzmanndistribution. We modify p to account for non-speciﬁc background binding p : p ( σ ) = 11 + e E b ( σ ) (1 + e E f ( σ ) ) (1 − p ) + p , (8)and the energies have an additive relation to sequence deﬁned by eqs. 45. The total likelihood is the sum ofthe per variant likelihoods LL = (cid:80) σ LL ( σ ) .This log-likelihood is non-convex, and is optimized using the NLopt library [39], which implements themethod of moving asymptotes algorithm [40], and uses the log-likelihood gradients with respect to r , p ,and the energies. The initial parameters were r = 1 , p = 0 . , and all energies set to zero. r and p werereparameterized as e r (cid:48) and e p (cid:48) inside the optimization function, so that the original parameters are non-negative. In the optimization algorithm, upper and lower bounds on ε are set to limit very small gradientswhich stop the optimization prematurely. The value of to ± was chosen by optimizing with diﬀerent bounds, ± , ± , ± , and ± , and choosing the result with the highest likelihood. Dimensionless energies areconverted to kcal/mol with T = 297 , and therefore the bounds are ± . kcal/mol. Bootstrap

Since the optimization algorithm can get stuck in local optima, we use a bootstrap restarting procedure toovercome local optima related to sampling noise [41], and to generate a bootstrap distribution of parameters toquantify their uncertainty due to sampling noise. The maximum likelihood parameters from the ﬁt describedabove, θ , are the initial parameters in an iterative procedure that alternates optimizing on the original andbootstrapped data.Each iteration consists of: 1) creation of bootstrapped data with counts drawn from a Poisson distributionwith means n and n . 2) Searching for the maximum likelihood parameters θ (cid:48) on the bootstrapped datawith initial parameters θ . 3) Searching for the maximum likelihood parameters θ (cid:48)(cid:48) on the original data with10nitial parameters θ (cid:48) . 4) If the likelihood is no better than the best optimization within a small threshold LL ( θ (cid:48)(cid:48) ) ≤ LL ( θ ) + η ( η = 0 . ), then add the bootstrapped parameters θ (cid:48) to a list. 5) If the likelihood isbetter than the previous best LL ( θ (cid:48)(cid:48) ) > LL ( θ ) + η , then update the best parameters, θ ← θ (cid:48)(cid:48) , and delete thelist of bootstrapped parameters. 6) terminate the procedure once 100 bootstraps have been accumulated inthe list. Pairwise epistasis

A minimal amount of non-speciﬁc background binding, p , imposes a lower bound on measured binding frac-tion in the experiment, estimated to be f = log( p /p ( σ W ) = − . by our three state model. This thresholdeﬀect produces large amounts of non-biological positive pairwise epistasis, e.g. when the double mutant hasthe same level of binding as one of the single mutants at the background level. Therefore, the data shown inFig. 5 excludes ˆ J abij with sequences near this threshold, i.e., min (cid:16) f ( σ W/ ( i,a ) / ( j,b ) ) , f ( σ W/ ( i,a ) ) , f ( σ W/ ( j,b ) ) (cid:17) < − . . Sample variance of ﬁtness and epistasis

Since estimated ﬁtness is asymptotically Gaussian, the sample variance of ﬁtness is equal to the curvature ofthe log-likelihood surface. Replacing p (cid:48) with e f (cid:48) in eq. 7, the asymptotic variance of f (cid:48) is − (cid:16) ∂ LL∂f (cid:48) (cid:17) − . Thevariance of the ﬁtness estimate, as deﬁned in eq. 1 is the sum of the focal and wild-type variances v f ( σ ) = n ( σ ) + n ( σ ) n ( σ ) n ( σ ) + n ( σ W ) + n ( σ W ) n ( σ W ) n ( σ W ) , (9)and the variance in epistasis is the sum of the single and double mutant variances v J abij = v f ( σ W/ ( i,a ) / ( j,b ) ) + v f ( σ W/ ( i,a ) ) + v f ( σ W/ ( j,b ) ) , The variance is then averaged over amino acids a, b for each position i, j . References [1] Douglas M Fowler and Stanley Fields. Deep mutational scanning: a new style of protein science.

NatureMethods , 11(8):801–807, 2014.[2] Emily E Wrenbeck, Matthew S Faber, and Timothy A Whitehead. Deep sequencing methods for proteinengineering and design.

Current Opinion in Structural Biology , 45:36–44, August 2017.[3] Jakub Otwinowski and Joshua B Plotkin. Inferring ﬁtness landscapes by regression produces biasedestimates of epistasis.

Proceedings of the National Academy of Sciences of the United States of America ,111:E2301–9, May 2014.[4] Louis du Plessis, Gabriel E. Leventhal, and Sebastian Bonhoeﬀer. How good are statistical models atapproximating complex ﬁtness landscapes.

Mol Biol Evol , page msw097, May 2016.[5] Shimon Bershtein, Adrian WR Serohijos, and Eugene I Shakhnovich. Bridging the physical scales inevolutionary biology: from protein sequence space to ﬁtness of organisms and populations.

CurrentOpinion in Structural Biology , 42:31–40, February 2017.116] Thomas J Magliery, Jason J Lavinder, and Brandon J Sullivan. Protein stability by number: high-throughput and statistical approaches to one of protein science’s most diﬃcult problems.

Current Opin-ion in Chemical Biology , 15(3):443–451, June 2011.[7] C. L. Araya, D. M. Fowler, W. Chen, I. Muniez, J. W. Kelly, and S. Fields. A fundamental proteinproperty, thermodynamic stability, revealed solely from large-scale measurements of protein function.

Proceedings of the National Academy of Sciences , 109(42), 2012.[8] Michael W. Traxlmayr, Christoph Hasenhindl, Matthias Hackl, Gerhard Stadlmayr, Jakub D. Rybka,Nicole Borth, Johannes Grillari, Florian Rüker, and Christian Obinger. Construction of a StabilityLandscape of the CH3 Domain of Human IgG1 by Combining Directed Evolution with High ThroughputSequencing.

Journal of Molecular Biology , 423(3):397–412, October 2012.[9] Ikjin Kim, Christina R. Miller, David L. Young, and Stanley Fields. High-throughput Analysis of invivo Protein Stability.

Molecular & Cellular Proteomics , page mcp.O113.031708, 2013.[10] C. Anders Olson, Nicholas C. Wu, and Ren Sun. A Comprehensive Biophysical Description of PairwiseEpistasis throughout an Entire Protein Domain.

Current Biology , 24(22):2643–2651, 2014.[11] Gabriel J. Rocklin, Tamuka M. Chidyausiku, Inna Goreshnik, Alex Ford, Scott Houliston, AlexanderLemak, Lauren Carter, Rashmi Ravichandran, Vikram K. Mulligan, Aaron Chevalier, Cheryl H. Arrow-smith, and David Baker. Global analysis of protein folding using massively parallel design, synthesis,and testing.

Science , 357(6347):168–175, July 2017.[12] Ville Mustonen and Michael Lässig. Evolutionary population genetics of promoters: Predicting bindingsites and functional phylogenies.

PNAS , 102(44):15936–15941, November 2005.[13] Ville Mustonen, Justin Kinney, Curtis G Callan, and Michael Lässig. Energy-dependent ﬁtness: aquantitative model for the evolution of yeast transcription factor binding sites.

Proceedings of theNational Academy of Sciences of the United States of America , 105(34):12376–81, August 2008.[14] Justin B Kinney, Anand Murugan, Curtis G Callan, and Edward C Cox. Using deep sequencing tocharacterize the biophysical mechanism of a transcriptional regulatory sequence.

Proceedings of theNational Academy of Sciences of the United States of America , 107(20):9158–63, May 2010.[15] Mato Lagator, Tiago Paixão, Nicholas H. Barton, Jonathan P. Bollback, and Călin C. Guet. On themechanistic nature of epistasis in a canonical cis-regulatory element. eLife Sciences , 6:e25192, May 2017.[16] Jesse D. Bloom, Jonathan J. Silberg, Claus O. Wilke, D. Allan Drummond, Christoph Adami, Frances H.Arnold, and Alan R. Fersht. Thermodynamic Prediction of Protein Neutrality.

Proceedings of theNational Academy of Sciences of the United States of America , 102(3):606–611, 2005.[17] C. Scott Wylie and Eugene I. Shakhnovich. A biophysical protein folding model accounts for mostmutational ﬁtness eﬀects in viruses.

PNAS , 108(24):9916–9921, June 2011.[18] Julian Echave and Claus O. Wilke. Biophysical Models of Protein Evolution: Understanding the Patternsof Evolutionary Sequence Divergence.

Annual Review of Biophysics , 46(1):85–103, 2017.[19] Tyler N. Starr and Joseph W. Thornton. Epistasis in protein evolution.

Protein Science , 25(7):1204–1218, July 2016. 1220] Ugo Bastolla, Yves Dehouck, and Julian Echave. What evolution tells us about protein physics, andprotein physics tells us about evolution.

Current Opinion in Structural Biology , 42:59–66, February2017.[21] David a Liberles, Sarah a Teichmann, Ivet Bahar, Ugo Bastolla, Jesse Bloom, Erich Bornberg-Bauer,Lucy J Colwell, a P Jason de Koning, Nikolay V Dokholyan, Julian Echave, Arne Elofsson, Dietlind LGerloﬀ, Richard a Goldstein, Johan a Grahnen, Mark T Holder, Clemens Lakner, Nicholas Lartillot,Simon C Lovell, Gavin Naylor, Tina Perica, David D Pollock, Tal Pupko, Lynne Regan, Andrew Roger,Nimrod Rubinstein, Eugene Shakhnovich, Kimmen SjÃ ¶ lander, Shamil Sunyaev, Ashley I Teufel, Jef-frey L Thorne, Joseph W Thornton, Daniel M Weinreich, and Simon Whelan. The interface of proteinstructure, protein biophysics, and molecular evolution. Protein science : a publication of the ProteinSociety , 21(6):769–85, June 2012.[22] Zachary R. Sailer and Michael J. Harms. Molecular ensembles make evolution unpredictable.

PNAS ,page 201711927, October 2017.[23] J. A. Wells. Additivity of mutational eﬀects in proteins.

Biochemistry , 29(37):8509–8517, September1990.[24] W. S. Sandberg and T. C. Terwilliger. Engineering multiple properties of a protein by combinatorialmutagenesis.

PNAS , 90(18):8367–8371, September 1993.[25] Rob Phillips, Jane Kondev, Julie Theriot, and Hernan Garcia.

Physical Biology of the Cell, SecondEdition . Garland Science, October 2012.[26] Vladimir Potapov, Mati Cohen, and Gideon Schreiber. Assessing computational methods for predictingprotein stability upon mutation: good on average but not in the details.

Protein Eng Des Sel , 22(9):553–560, September 2009.[27] A. Elisabeth Sauer-Eriksson, Gerard J Kleywegt, Mathias UhlÃ c (cid:13) n, and T. Alwyn Jones. Crystalstructure of the C2 fragment of streptococcal protein G in complex with the Fc domain of human IgG.

Structure , 3(3):265–278, March 1995.[28] David J. Sloan and Homme W. Hellinga. Dissection of the protein G B1 domain binding site for humanIgG Fc fragment.

Protein Science , 8(8):1643–1648, January 1999.[29] G. Marius Clore and Charles D. Schwieters. Amplitudes of protein backbone dynamics and correlatedmotions in a small alpha/beta protein: correspondence of dipolar coupling and heteronuclear relaxationmeasurements.

Biochemistry , 43(33):10678–10691, August 2004.[30] Oliver F. Lange, Helmut GrubmÃŒller, and Bert L. de Groot. Molecular dynamics simulations of proteinG challenge NMR-derived correlated backbone motions.

Angew. Chem. Int. Ed. Engl. , 44(22):3394–3399,May 2005.[31] Phineus R. L. Markwick, Guillaume Bouvignies, and Martin Blackledge. Exploring multiple timescalemotions in protein GB3 using accelerated molecular dynamics and NMR spectroscopy.

J. Am. Chem.Soc. , 129(15):4724–4730, April 2007.[32] Nicholas C. Wu, Lei Dai, C. Anders Olson, James O. Lloyd-Smith, and Ren Sun. Adaptation in proteinﬁtness landscapes is facilitated by indirect paths. eLife , 5:e16965, July 2016.1333] Nicholas C. Wu, C. Anders Olson, and Ren Sun. High-throughput identiﬁcation of protein mutantstability computed from a double mutant ﬁtness landscape.

Protein Science , pages n/a–n/a, December2015.[34] Rhys M. Adams, Thierry Mora, Aleksandra M. Walczak, and Justin B. Kinney. Measuring the sequence-aﬃnity landscape of antibodies with massively parallel titration curves. eLife Sciences , 5:e23156, De-cember 2016.[35] Jooyoung Lee, Peter L. Freddolino, and Yang Zhang. Ab Initio Protein Structure Prediction. In

FromProtein Structure to Function with Bioinformatics , pages 3–35. Springer, Dordrecht, 2017.[36] Faruck Morcos, Terence Hwa, JosÃ c (cid:13)

N Onuchic, and Martin Weigt. Direct Coupling Analysis forProtein Contact Prediction. In Daisuke Kihara, editor,

Protein Structure Prediction , volume 1137 of

Methods in Molecular Biology , pages 55–70. Springer New York, New York, NY, 2014.[37] Peter E Wright and H Jane Dyson. Linking folding and binding.

Current Opinion in Structural Biology ,19(1):31–38, February 2009.[38] Michael Manhart and Alexandre V. Morozov. Protein folding and binding can emerge as evolutionaryspandrels through structural coupling.

PNAS , 112(6):1797–1802, February 2015.[39] Steven G. Johnson. The NLopt nonlinear-optimization package.[40] Krister Svanberg. A class of globally convergent optimization methods based on conservative convexseparable approximations.

SIAM Journal on Optimization , pages 555–573.[41] Simon N. Wood. Minimizing Model Fitting Objectives That Contain Spurious Local Minima by Boot-strap Restarting.

Biometrics , 57(1):240–244, March 2001.14 upplemental Material hree state two state−6 −4 −2 0 −6 −4 −2 0−9−6−303 inferred fitness f i t ne ss count Figure S1: Inferred versus predicted ﬁtness for two state model and three state model. Correlations are96.4% and 97.1% respectively. d h n ρ TP FP TN FN1 76 0.95 39 1 29 72 2091 0.74 498 19 1342 2323 26019 0.42 1596 158 21355 29104 121174 0.39 982 162 113984 6046Table S1: Statistics for predictions on Wu et. al data. d h : hamming distance from wild-type. n : numberof variants. ρ : correlation coeﬃcient weighted by /v f . TP, FP, TN, FN: True/false positives/negatives forwhether the variant is functional ( f > − . ). −3−2−101 −1.0−0.5 0.0 0.5 1.0 predicted energy kcal/mol m ea s u r ed ene r g y kc a l / m o l Figure S2: Additive energies from two state model do not predict changes in fold stability. Measured (cid:15) sameas in ﬁg. 2B. If variants have excess stability, the measured binding fraction would be mostly sensitive tobinding, which would lead to energies unrelated to folding.16

505 −5 0 5 max likelihood energy kcal/mol boo t s t r ap , kc a l / m o l A

05 0 5 max likelihood energy kcal/mol boo t s t r ap kc a l / m o l B Figure S3: Bootstrapped parameters of the three state model, for the folding (A) and binding (B) energies.Red points denote the median value from the bootstrap, and the gray bars show the 95% conﬁdence interval. overestimated underestimated10 20 30 40 50 10 20 30 40 501020304050 −1012 epistasis

Figure S4: Observed epistasis J ij (averaged over amino acids) which our model overestimates ( ˆ J ij (cid:29) J ij )and underestimates ( ˆ J ij (cid:28) J ij ). Overestimated epistasis is the top 5% of z ij and underestimated epistasis isthe bottom 5%. Underestimated epistasis is largely positive and corresponds to the dynamically correlatedresidue network. 17 inferred fitness f i t ne ss log count Figure S5: Fitness is less predictable in a follow-up study that targeted all combinations at four sites [32].Panels show true and inferred ﬁtness for 1 to 4 substitutions from wild-type. A substantial fraction offunctional variants are underestimated, suggesting some unaccounted for epistasis in energy or conformationaldynamics. See also Fig. S6 and Tab. S1. cdf c oun t Figure S6: Around 20% of variants in Wu et al predictions have ﬁtnesses underestimated more than expectedfrom measurement variation. Shown is distribution of CDF (cid:18) ˆ f ( σ ) − f ( σ ) √ v f ( σ ) (cid:19)(cid:19)