Finding a unifying motif of intermolecular cooperativity in protein associations
Sebastián R. Accordino†, J. Ariel Rodriguez-Fris†, Gustavo A. Appignanesi†, and Ariel Fernández‡♣
† Sección Fisicoquímica, INQUISUR-UNS-CONICET-Departamento de Química, Universidad Nacional del Sur, Avda. Alem 153, 8000 Bahía Blanca, Argentina.
‡ Instituto Argentino de Matemática “Alberto P. Calderón”, CONICET, Buenos Aires 1083, Argentina.
♣ Department of Computer Science, The University of Chicago, Chicago, IL 60637
(Dated: November 2, 2018)

At the molecular level, most biological processes entail protein associations, which in turn rely on a small fraction of interfacial residues called hot spots. Here we show that hot spots share a unifying molecular attribute: they provide a third-body contribution to intermolecular cooperativity. This motif, based on the wrapping of interfacial electrostatic interactions, is essential to maintain the integrity of the interface and can be exploited in rational drug design, since such regions may serve as blueprints to engineer small molecules disruptive of protein-protein interfaces.
PACS numbers: 87.15.km, 87.15.kr, 87.15.K-
Protein associations are basic molecular processes in biology [1–13]. In spite of their importance, their biophysical underpinnings remain a subject of debate [1–13]. A challenging standing problem involves the characterization of hot spots [1–12]. These are few in number and provide the most significant contribution to the stability of the protein-protein interface. Knowledge-based and first-principle docking potentials have been relatively successful at predicting these singular sites [1–12], fitting the outcome of probes for experimental identification such as site-directed mutation or alanine scanning [1]. These techniques assess the impact on binding free energy of the truncation of an individual residue side chain at the β-carbon. Notwithstanding these predictive successes, the physical nature of hot spots remains elusive. Even the establishment of general rules for hot-spot characterization has proven unfeasible so far, as has been explicitly recognized [1, 4, 5]; establishing such rules constitutes the scope of this letter. Attempts at rationalizing the stability of protein-protein interfaces based on pairwise interactions between the two chains are inconclusive at best, as demonstrated in this letter. This leads us to focus our attention on higher-order energetic contributions as a theoretical framework to explain and predict binding hot spots.

Given the relative abundance of hydrophilic residues on the protein surface, protein associations are always confronted with the disruptive effect of polar hydration [14, 15]. Thus, the integrity of the protein-protein interface becomes extremely reliant on intermolecular cooperativity [14, 15]. We make this concept precise by invoking three-body correlations, whereupon a third nonpolar body protects an electrostatic interaction pairing the other two by contributing to the exclusion of surrounding water.
Since these three-body correlations must engage the two protein molecules, the correlations must be subject to an additional constraint: one body belongs to a protein chain and the other two to its binding partner. To complete this description it is necessary to classify pairwise electrostatic interactions in terms of an abundance distribution P(ρ), where ρ is the number of three-body correlations associated with an interaction. This distribution is defined by its mean value $\langle\rho\rangle = \sum_{\rho} \rho\,P(\rho)$ and dispersion $\sigma = \langle(\rho - \langle\rho\rangle)^{2}\rangle^{1/2}$, which leads us to single out an underprotected interaction (UPI) as one in the tail of the distribution, that is, with ρ ≤ ⟨ρ⟩ − σ. The UPIs are crucial in defining protein associations due to their sensitivity to critical changes in intermolecular cooperativity brought about by site-directed amino acid substitution. As demonstrated previously [14–17], UPIs are also adhesive, hence promoters of protein association, because their inherent stability increases upon approach of a third-body nonpolar group that enhances their dehydration and de-screens the partial charges.

This physical picture leads us to assert that intermolecular cooperativity will be most sensitive to site-directed mutation in two particular instances: a) when a site mutation changes the wrapping value ρ of an intermolecular or intramolecular interaction, decreasing it to a value below the mean ⟨ρ⟩; b) when in a free protein subunit the alanine substitution raises the wrapping of a UPI and, additionally, this interaction is intermolecularly wrapped in the complex. This analysis leads us to characterize hot spots as the residues whose alanine substitution most drastically affects intermolecular cooperativity.
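The definition of a UPI from the abundance distribution can be illustrated with a minimal sketch. It assumes that the wrapping values ρ of all interactions are already available as a list of integers; the function names and the toy data are hypothetical and are not taken from the original work.

```python
from collections import Counter
from math import sqrt


def wrapping_statistics(rho_values):
    """Return (mean, sigma) of the abundance distribution P(rho)."""
    total = len(rho_values)
    # P(rho): fraction of interactions with exactly rho three-body correlations
    p = {rho: n / total for rho, n in Counter(rho_values).items()}
    mean = sum(rho * prob for rho, prob in p.items())
    variance = sum((rho - mean) ** 2 * prob for rho, prob in p.items())
    return mean, sqrt(variance)


def underprotected(rho_values):
    """Indices of interactions in the tail: rho <= <rho> - sigma."""
    mean, sigma = wrapping_statistics(rho_values)
    threshold = mean - sigma
    return [i for i, rho in enumerate(rho_values) if rho <= threshold]


# Toy wrapping values (illustrative only, not data from the paper)
rho_values = [30, 28, 26, 24, 22, 34, 12, 32]
mean, sigma = wrapping_statistics(rho_values)
upi_indices = underprotected(rho_values)
print(mean, sigma, upi_indices)
```

In this toy set only the interaction with ρ = 12 falls in the tail; with real data the threshold depends on the statistics of the structures considered.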
This conjecture is validated in this work by combinatorially dissecting the protein-protein interfaces of structurally reported complexes that have been independently studied by alanine scanning through experimental means. The analysis boils down to a decomposition of the interface into a web of three-body cooperative interactions. Besides its scientific interest, the knowledge gained from our approach may significantly impact drug discovery endeavors [18], especially since hot spots are expected to constitute the blueprint for the design of small-molecule drugs disruptive of protein-protein associations.

UPIs that involve hydrogen bonds (HBs) are named dehydrons. This structural motif has been extensively discussed in the literature and identified in soluble proteins with PDB-reported structure [13–17]. The extent of hydrogen-bond protection can be determined directly from atomic coordinates. This parameter indicates the number of three-body correlations engaging the HB; it is also known as the wrapping of the bond and denoted ρ. It is given by the number of side-chain carbonaceous nonpolar groups (CHn, n = 0, 1, 2, 3, where the carbon atom of these groups is not bonded to an electrophilic atom or polarized group) contained within a desolvation domain around the HB. Each wrapping nonpolar group represents the third body within a three-body correlation involving the HB. This domain is typically defined as the union of two intersecting spheres of fixed radius (thickness of three water layers) centered at the α-carbons of the residues paired by the hydrogen bond. In structures of PDB-reported soluble proteins, backbone hydrogen bonds (BHBs) are protected on average by ρ = 26.6 ± 7.5 nonpolar groups. Dehydrons lie in the tail of the ρ-distribution, i.e. their microenvironment contains 19 or fewer nonpolar groups, so their ρ-value is below the mean (= 26.6) minus one standard deviation (= 7.5).
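The wrapping count described above can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes that the coordinates of the qualifying nonpolar groups have already been selected from the structure, and the radius used in the toy example (6.0 Å) is an illustrative placeholder rather than a value taken from the text. The names `wrapping` and `is_dehydron` are hypothetical.

```python
from math import dist

DEHYDRON_THRESHOLD = 19  # from the text: 19 or fewer nonpolar groups


def wrapping(ca_first, ca_second, nonpolar_groups, radius):
    """Count nonpolar groups inside the union of two spheres of the given
    radius centered at the alpha-carbons of the two hydrogen-bonded residues."""
    return sum(
        1
        for group in nonpolar_groups
        if dist(group, ca_first) <= radius or dist(group, ca_second) <= radius
    )


def is_dehydron(rho, threshold=DEHYDRON_THRESHOLD):
    """A backbone hydrogen bond is a dehydron when rho is at or below the threshold."""
    return rho <= threshold


# Toy coordinates in angstroms (illustrative only)
ca_first = (0.0, 0.0, 0.0)
ca_second = (5.0, 0.0, 0.0)
nonpolar_groups = [
    (1.0, 0.0, 0.0),   # inside the first sphere
    (7.0, 0.0, 0.0),   # inside the second sphere only
    (12.0, 0.0, 0.0),  # outside both spheres
    (2.5, 5.0, 0.0),   # inside both spheres
]
rho = wrapping(ca_first, ca_second, nonpolar_groups, radius=6.0)
print(rho, is_dehydron(rho))
```

A group lying in the overlap of the two spheres is counted once, which is what the union of the spheres requires.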
While the statistics on ρ-values for backbone hydrogen bonds vary with the radius, the tails of the distribution remain invariant, thus enabling a robust identification of structural deficiencies [14–17]. In the present work we are dealing with protein complexes and accordingly we compute the ρ-values [...] ∼arifer/courses/DehydronCalculator.exe

The wrapping concept may be spatially represented as shown in Fig. 1, where two different types of three-body correlations are illustrated. Figure 1a shows an instance of intermolecular wrapping of an intramolecular HB, while Fig. 1b shows the wrapping of an intermolecular HB.

FIG. 1. Illustration of intermolecular cooperativity represented by three-body correlations: a) Trp 169 (full atomic detail) of hGHbp (red chain) wrapping three intramolecular BHBs of the hGH chain (blue chain). The BHBs of the hGH chain are indicated by white sticks between the corresponding α-carbons; b) Similar to a) but for the complex between the HIV glycoprotein gp120 and the CD4 receptor. Here a Trp residue of the CD4 chain wraps an intermolecular BHB.

Our virtual alanine-scanning procedure is performed by computationally replacing each residue of a protein chain (one at a time) with alanine within the 3D structure of the complex and assessing the impact of the substitution on intermolecular cooperativity. For most residues (those with a side chain larger than that of alanine) this means truncating the residue side chain at the β-carbon so that the whole side chain is replaced by a methyl group, thus significantly reducing the extent of wrapping involving the residue. In the special case of glycine (which lacks a β-carbon) we include a methyl at the corresponding position, increasing the extent of wrapping enabled by the residue. The in silico scanning process entails computing the change in ρ-value generated by each Ala-substitution on each intra- and intermolecular BHB of the complex. In a first stage, we calculate the ρ value for all BHBs from the complex structure, producing a set of wild-type ρ-values. For each mutated residue we perform the corresponding Ala-substitution, leaving all other coordinates unchanged, and we recalculate the full set of ρ-values (mutated ρ-values). Then, in accord with our premise of intermolecular cooperativity, hot spots are predicted taking into account their role as intermolecular wrappers according to the following classes:

a) The Ala-substitution of a residue on one chain lowers the ρ-value of a BHB (an intramolecular BHB in the partner protein or an intermolecular BHB) and the mutated ρ-value of this BHB falls below ⟨ρ⟩. These predicted hot spots will be labeled class A hot spots. In the cases where the final ρ-value falls below the dehydron threshold, ρ = 19 (dehydron creation), these A-class hot spots will be labeled A*.

b) Alanine substitution increases the wrapping capability of a non-wrapper residue (glycine, serine, cysteine, aspartic acid or asparagine) located within the desolvation environment of a BHB of its own protein chain. In addition, the intramolecular wrapping value of the BHB is ρ ≤ 19 and this BHB is intermolecularly wrapped within the complex. These alanine substitutions raise the intramolecular ρ value by Δρ = +1. These alterations lower the need for intermolecular wrapping upon association and the resulting predictions will be labeled class B hot spots. In the cases when the intramolecular wrapping value of the BHB is exactly ρ = 19, we will denote these B*-class hot spots. This sub-class implies that the Ala-substitution is indicative of a net intramolecular removal of a dehydron.

We decided to leave aside side chain-side chain hydrogen bonds from the cooperativity analysis on the following grounds. The fluctuational nature of surface side chains imposes an entropic cost associated with HB formation, which makes the latter marginally stable at best [13]. Also, the wrapping statistics for side-chain HBs are essentially flat, with no clear distinction of the tails of the distribution, due to the conformational richness of the side chains. An a posteriori justification for the exclusion arises from the very artifactual nature of surface side-chain HBs. Particularly misleading are the large B-factors of solvent-exposed side chains and the large hydration demands of exposed polar groups, which hinder HB formation. These artifacts would yield an overwhelming number of false positives in the cooperativity analysis of the protein-protein interface (most interfacial residues would be hot spots).

In turn, we shall not take into account salt bridges in our analysis, since they are not expected to significantly stabilize protein structure. These bridges are destabilizing with respect to hydrophobic replacement of both charged partners, and charge burial has been shown to be usually destabilizing ([19] and references therein). However, it is also known that for a pair of complementary buried charges it is preferable for them to be paired by a salt bridge than to be buried isolated from each other [19].
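The hot-spot classes A, A*, B and B* defined above can be summarized as a small decision procedure. The sketch below is only one reading of those rules under stated assumptions: the mean ⟨ρ⟩ is taken as the soluble-protein value 26.6 quoted in the text (the value appropriate for complexes may differ), "falls below the dehydron threshold" is interpreted as a mutated ρ ≤ 19, and the function and parameter names are hypothetical.

```python
MEAN_RHO = 26.6          # assumption: mean quoted in the text for soluble proteins
DEHYDRON_THRESHOLD = 19  # dehydrons have rho <= 19
NON_WRAPPERS = {"GLY", "SER", "CYS", "ASP", "ASN"}  # non-wrapper residues listed in the text


def classify_class_a(rho_wt, rho_mut, mean_rho=MEAN_RHO):
    """Class A: the Ala-substitution lowers the rho of a partner-chain or
    intermolecular BHB and the mutated value falls below the mean.
    A* marks dehydron creation (assumed here to mean rho_mut <= 19)."""
    if rho_mut < rho_wt and rho_mut < mean_rho:
        return "A*" if rho_mut <= DEHYDRON_THRESHOLD else "A"
    return None


def classify_class_b(residue_name, rho_intra, intermolecularly_wrapped):
    """Class B: a non-wrapper residue sits in the desolvation domain of a BHB
    of its own chain, that BHB has intramolecular rho <= 19 and is wrapped
    intermolecularly in the complex. B* marks rho exactly equal to 19."""
    if (
        residue_name in NON_WRAPPERS
        and rho_intra <= DEHYDRON_THRESHOLD
        and intermolecularly_wrapped
    ):
        return "B*" if rho_intra == DEHYDRON_THRESHOLD else "B"
    return None


print(classify_class_a(28, 24))            # lowered and below the mean
print(classify_class_a(21, 18))            # lowered into dehydron range
print(classify_class_b("GLY", 19, True))   # intramolecular dehydron removed by +1
print(classify_class_b("TRP", 15, True))   # not a non-wrapper residue
```

A residue would be reported as a predicted hot spot when at least one BHB yields a non-empty label under either rule.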
Since paired buried charges are preferable to isolated ones, an Ala-mutation of a residue engaged in a salt bridge with its partner protein in the complex would be destabilizing. This trivial type of hot spot accounts for approximately 15 % of all the hot spots in the complexes considered and obviously lies outside the scope of our cooperativity-based analysis.

We performed a cooperativity-based alanine-scanning analysis on several protein-protein interfaces from complexes with PDB-reported structure for which experimental alanine-scanning results are available [2, 3]. In each case, the first protein of the complex indicated is the one mutated, and we provide the PDB entry of the complex and the reference of the experimental alanine-scanning results: Human growth hormone receptor/Human growth hormone [1] (3HHR), Trypsin inhibitor/Beta-Trypsin [20] (2PTC), P53/MDM2 [21] (1YCR), CD4/GP120 [22] (1GC1), Ribonuclease inhibitor/Ribonuclease A [23] (1DFJ), Colicin E9 immunity protein/Colicin E9 DNase domain [24] (1BXI), Barnase/Barstar [25] (1BRS), Barstar/Barnase [25] (1BRS), Ribonuclease inhibitor/Angiogenin [23] (1A4Y).

Figure 2 displays our predictions. The experimental alanine substitution of a native protein subunit yields a change in its binding free energy (ΔG), which is denoted by $\Delta\Delta G = \Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}}$ (mut = mutated, wt = wild type) and is indicated with a color scale. The cooperativity-based hot-spot predictions of our method are indicated with gray squares below the corresponding residues and are denoted by A, A*, B and B*. The letter “S” labels trivial salt-bridge hot spots, which are removed from the list of experimental hot spots used for the comparison with our computational method.

FIG. 2. Experimental alanine-scanning probes contrasted against cooperativity-based in silico scanning for the complexes indicated. For each case we display the portion of the protein chain or the set of residues with experimental data. The colors indicate the experimentally determined ΔΔG values for the corresponding hot spots, as shown in the scale at the right. The gray squares indicate our computational predictions, and the letter code is explained in the text.

To quantify the predictive ability of our method, in Table I we show our global predictions over the whole set of protein complexes studied.

TABLE I. Predictions obtained for the different protein complexes studied (see Table I). Columns: experimental hot spots (ΔΔG value); prediction success (percentages) for A+A*+B+B*, A+A*, and A*+B*.

This comparison between theory and experiment reveals that our computational procedure locates most of the experimental alanine-scanning hot spots, with optimal performance (89 % prediction success) for the most significant contributors determined experimentally (ΔΔG ≥ [...]). [...] Averaging the wild-type ρ-values over the residues wrapped in class A hot spots yields ρ = 20.3, a value higher than the dehydron threshold (ρ = 19). However, when we average the mutated ρ-values we get a final ρ = 18, that is, below the dehydron threshold. Thus, the dehydron threshold is in fact statistically framed by the averaged wild-type and mutated ρ-values for A-class hot spots, which reveals the relevance of the qualitative wrapping differences for protein affinity.

At this point it is worth recalling that our method disregards two-body terms unless they are engaged in a three-body correlation. This approach seems natural in view of the fact that no protein-protein interface has proven trivial at the conventional pairwise level of analysis [1–12] and given the absence of clear rules for hot-spot prediction [1–12]. This last point also makes it difficult to establish a control for our results, but we have nonetheless defined an elementary one based on polar and hydrophobic complementarities. To this end, we have simply characterized residues as hydrophobic (nonpolar aromatic or aliphatic side chains) or polar (polar or charged side chains) and built a contact matrix for the complex interface. For each residue we calculated the minimum distance between its α-carbon and the α-carbons of the residues of the partner protein and between the centroid of its side chain and those of the partner side chains. An intermolecular contact was considered to occur when this minimum distance was below 6 Å (the results are robust to moderate changes in the contact parameter and fit a criterion previously adopted for protein-protein interfaces [1]). We applied this analysis to the hGH/hGHbp complex interface, which yielded a significant level of mismatches (around 37 %), thus indicating that the protein association cannot be simply rationalized as a search for pairwise polar-polar and hydrophobic-hydrophobic complementarity. More interestingly, when we restrict the analysis to the experimentally determined hot spots, the percentage of mismatches is slightly higher (42 %).
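The pairwise control described above can be sketched as a contact matrix with a mismatch count. This is an illustrative reading, not the authors' code: the set of residues treated as hydrophobic is an assumption (the text only says nonpolar aromatic or aliphatic side chains), the `Residue` type and function names are hypothetical, and the coordinates are toy values.

```python
from math import dist
from typing import NamedTuple, Tuple

# Assumption: one possible assignment of hydrophobic residues; all others count as polar
HYDROPHOBIC = {"ALA", "VAL", "LEU", "ILE", "MET", "PHE", "TRP", "PRO"}


class Residue(NamedTuple):
    name: str
    ca: Tuple[float, float, float]        # alpha-carbon position
    centroid: Tuple[float, float, float]  # side-chain centroid position


def in_contact(a, b, cutoff=6.0):
    """Contact when the smaller of the CA-CA and centroid-centroid
    distances is below the cutoff (6 angstroms in the text)."""
    return min(dist(a.ca, b.ca), dist(a.centroid, b.centroid)) < cutoff


def mismatch_fraction(chain_a, chain_b, cutoff=6.0):
    """Fraction of intermolecular contacts pairing a hydrophobic residue
    with a polar one."""
    contacts = [(a, b) for a in chain_a for b in chain_b if in_contact(a, b, cutoff)]
    if not contacts:
        return 0.0
    mismatches = sum(
        (a.name in HYDROPHOBIC) != (b.name in HYDROPHOBIC) for a, b in contacts
    )
    return mismatches / len(contacts)


# Toy interface (illustrative coordinates in angstroms)
chain_a = [
    Residue("TRP", (0.0, 0.0, 0.0), (1.0, 1.0, 0.0)),
    Residue("LEU", (10.0, 0.0, 0.0), (10.0, 1.0, 0.0)),
]
chain_b = [
    Residue("ASP", (4.0, 0.0, 0.0), (3.0, 1.0, 0.0)),
    Residue("PHE", (12.0, 0.0, 0.0), (11.0, 2.0, 0.0)),
    Residue("LYS", (30.0, 0.0, 0.0), (31.0, 0.0, 0.0)),
]
fraction = mismatch_fraction(chain_a, chain_b)
print(fraction)
```

In the toy interface two contacts are found (Trp-Asp and Leu-Phe), one of which is a hydrophobic-polar mismatch. Restricting `chain_a` to the experimentally determined hot spots would give the corresponding hot-spot-only figure.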
For the two most important hot spots (Trp 104 and Trp 169), these residues are involved in 8 mismatches and only 1 hydrophobic-hydrophobic contact. This level of mismatching seems unavoidable given the high polar content at the protein surface which becomes buried upon creation of the complex. However, when we focus on three-body interactions, we discover that many hydrophobic residues at the complex interface approach polar residues in order to wrap BHBs in which the latter are involved.

To summarize, this letter has shown that protein-protein interfaces elude standard physico-chemical analysis. Their rationalization in terms of pairwise complementarity along the contact region is unsatisfactory, especially when it comes to understanding the role of hot spots as determinants of protein associations. Against this reality, this work unravels a seemingly overlooked simple molecular motif that proves to be ubiquitous in determining protein-protein associations. This motif is an indicator of three-body intermolecular cooperativity. In essence, such effects arise as a group in one protein chain stabilizes (wraps) a preformed hydrogen bond in the partner chain or an inter-chain hydrogen bond, so that three bodies intervene in the interaction and not all three belong to the same chain. We have shown that hot-spot predictions based solely on this molecular attribute, and defined by two purely combinatorial rules based on structural analysis of protein complexes, account for most (89 %) of the hot spots experimentally determined by alanine scanning in a set of protein complexes. Thus, the simplicity of our method contrasts with the complexity of approaches based on full-fledged potentials with explicit water (where many-body terms are subsumed in all-atom interactions). We do not deny the relevance of these predictive methods, but such avenues have not proven enlightening in terms of identifying clear molecular promoters of protein associations.
By contrast, the results presented in this work fulfill this imperative and might be instrumental in the design of small molecules aimed at disrupting protein-protein interfaces by fulfilling the wrapping capabilities of hot spots. We believe that our combinatorial identification of a molecular promoter of protein associations holds promise as a guide to rational drug design.

Financial support from ANPCyT (PME 2006-1581) and CONICET is gratefully acknowledged.

[1] Clackson, T.; Wells, J. A. Science, 383-386.
[2] Bogan, A. A.; Thorn, K. S. J Mol Biol, 1-9.
[3] Thorn, K. S.; Bogan, A. A. Bioinformatics, 284-285.
[4] Ofran, Y.; Rost, B. PLoS Comp Biol, 1169-1176.
[5] Kortemme, T.; Baker, D. Proc Natl Acad Sci U S A, 14116-14121.
[6] Chuang, G-Y. et al. Protein Science, 1662-1672.
[7] Chakrabarti, P.; Janin, J. Proteins, 334-343.
[8] Privalov, P. et al. J Mol Biol, 1-9.
[9] Li, J.; Liu, Q. Bioinformatics, 743-750.
[10] Keskin, O.; Ma, B.; Nussinov, R. J Mol Biol, 1281-1294.
[11] Li, Z.; Li, J. Bioinformatics, 3304-3316.
[12] Geppert, T.; Hoy, B.; Wessler, S.; Schneider, G. Chem Biol, 344-353.
[13] Fernández, A.; Lynch, M. Nature, 502-505.
[14] Fernández, A.; Scott, R. Phys Rev Lett, 018102-018105.
[15] Fernández, A.; Scott, R. Biophysical J, 1914-1928.
[16] Pietrosemoli, N.; Crespo, A.; Fernández, A. J Prot Res, 3519-3526.
[17] Schulz, E.; Frechero, M.; Appignanesi, G.; Fernández, A. PLoS ONE, 12844-12849.
[18] Wells, J. A.; McClendon, C. L. Nature, 1001-1009.
[19] Sindelar, C. V.; Hendsch, Z. S.; Tidor, B. Protein Science, 1898-1914.
[20] Castro, M. J. M.; Anderson, S. Biochemistry, 11435-11446.
[21] Böttger, A. et al. J Mol Biol, 744-756.
[22] Ashkenazi, A. et al. Proc Natl Acad Sci U S A, 7150-7154.
[23] Chen, C-Z.; Shapiro, R. Proc Natl Acad Sci U S A, 1761-1766.
[24] Kühlmann, U. C. et al. J Mol Biol, 1163-1178.
[25] Schreiber, G.; Fersht, A. R. J Mol Biol, 248