A unified analytical theory of heteropolymers for sequence-specific phase behaviors of polyelectrolytes and polyampholytes
Yi-Hsuan Lin (1, 2), Jacob P. Brady (3, 4, 1), Hue Sun Chan (a), and Kingshuk Ghosh (5, 6, b)

1) Department of Biochemistry, University of Toronto, Toronto, Ontario, Canada
2) Molecular Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
3) Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
4) Department of Chemistry, University of Toronto, Toronto, ON, Canada
5) Department of Physics and Astronomy, University of Denver, Colorado, CO, USA
6) Molecular and Cellular Biophysics, University of Denver, Colorado, CO, USA

(Dated: 10 January 2020)

a) Electronic mail: [email protected]
b) Electronic mail: [email protected]

I. INTRODUCTION

Mesoscopic compartmentalization undergirded by liquid-liquid phase separation (LLPS) of intrinsically disordered proteins or regions (IDPs or IDRs) and nucleic acids is now recognized as a versatile means for biomolecular organization and regulation. Some of these phase-separated droplet-like compartments are intracellular bodies, such as stress granules, P-granules and nucleoli, that may be characterized as “membraneless organelles”. Outside the cell, biomolecular LLPS can be biologically useful as well, as in the formation of certain extracellular materials. Collectively referred to as biomolecular condensates, these phase-separated bodies participate in many vital functions, as highlighted by their recently elucidated roles in endocytosis, silencing chromatin, transcription, and translation. The repertoire of relevant discoveries is rapidly expanding.
LLPS of globular proteins, for example in lens protein solutions, has also been observed and is of biological importance.

Recent bioinformatics analyses suggest that IDPs and IDRs comprise a significant fraction of the proteomes of higher organisms, and that functional LLPS is likely ubiquitous. The propensity for an IDP or IDR to phase separate is governed by its amino acid sequence and modulated by solution/environmental conditions (temperature, hydrostatic pressure, pH, ionic strength, etc.) as well as its interactions with other biopolymers such as RNA. Thus, any “big-picture” survey of the physical basis of biomolecular condensates requires consideration of not only many different sequences but also a large variety of environmental conditions. Adding to this combinatorial complexity, even for a given wildtype sequence, posttranslational modifications, mutations, and splicing can lead to diverse LLPS propensities. In this context, analytical theories are the most computationally efficient tool for large-scale exploration of sequence-dependent biomolecular LLPS. Although explicit-chain simulations provide more energetic and structural details and field-theory simulations afford more numerical accuracy, currently the number of sequences that can be simulated by these approaches is limited because of their high computational cost. Moreover, analytical theories are valuable for insights into physical principles that are less manifest in simulation studies. With this in mind, we build on recent success in using analytical theories to account for sequence-dependent biomolecular condensates under certain limited conditions so as to develop improved theories that are more generally applicable.

Building sequence-specific theories of LLPS will also have implications for the phase separation of block polyampholytes and its comparison with complex coacervation between oppositely charged homopolyelectrolytes, a topic of intense research in polymer physics.
Diblock polyampholytes with repeat units of a polycation segment followed by a polyanion segment can be envisioned to be equivalent to two oppositely charged homopolyelectrolytes. For this reason LLPS of block polyampholytes, a limiting case of our theory, is often termed self-coacervation and shares features with complex coacervation of a polycation and a polyanion. Experiments and simulations have also reported differences between the phase diagrams of block polyampholytes and homopolyelectrolyte coacervation. The observed differences can be explained by the presence of ‘charge pattern interfaces’ where two segments of oppositely charged blocks merge in polyampholytes. Homopolyelectrolytes, on the other hand, lack such connectivities, thus leading to different types of salt localization in comparison to block polyampholytes. Application of a general sequence-based analytical theory of polyampholyte LLPS will further advance these comparisons between complex coacervation and self-coacervation. Future effort in theory development is needed in this direction. Thus, our framework should be useful not only for high-throughput analyses of the LLPS propensities of naturally occurring biological sequences but also for the design of artificial biological and non-biological heteropolymers with desired LLPS properties.

Inasmuch as sequence-specific analytical theories for biomolecular condensates are concerned, a recent multiple-chain formulation based on the traditional random phase approximation (RPA) has been applied to study the dependence of LLPS of IDPs on the charge patterns along their chain sequences. This approach accounts for the experimental difference in LLPS propensity between the Ddx4 helicase IDR and its charge-scrambled mutant. It also provides insight into a possible anti-correlation between multiple-chain LLPS propensity and single-chain conformational dimensions as well as the degree of demixing of different charge sequences under LLPS conditions.
As an initial step, these advances are useful. As a heteropolymer theory, however, traditional RPA is known to have two main shortcomings. First, the density of monomers of the polymer chains in solution is assumed to be roughly homogeneous, as density fluctuations are neglected beyond second order in RPA. A rigorous treatment proposed by Edwards and Muthukumar has shown the importance of including density fluctuations to higher orders. Nonetheless, a recent comparison of field-theory simulation and RPA indicates that RPA is reasonably accurate for intermediate to high monomer densities for the cases considered, and that significant deviations between RPA and field-theory simulation occur only for volume fractions < 0.02 that of the highest condensed-phase simulated. Second, traditional RPA neglects the fact that monomer-monomer interactions can cause conformational variation of individual chains, because it computes the single-chain structure factor using a Gaussian chain with no intrachain interaction. This limitation, which applies to homopolymers as well as heteropolymers, is particularly acute for the latter. Indeed, experimental and computational studies have shown that single-chain conformational heterogeneities and dimensions are sensitive to sequence-specific interactions. Regarding this shortcoming, an improved analytical approach was recently developed at the single-chain level by replacing the Kuhn length $l$ (termed “bare” Kuhn length) of the Gaussian chain by a set of renormalized Kuhn lengths, $l_1$, that embodies the sequence-specific interactions approximately. Renormalized structure factors have also been exploited to improve homopolymer LLPS theories for polyelectrolytes.

Noting that the first shortcoming described above is likely limited only to regimes of extremely low polymer concentrations, here we first focus on rectifying the second shortcoming by combining the earlier, traditional sequence-dependent RPA theory with the sequence-dependent single-chain theory that utilizes a renormalized Gaussian (rG) chain formulation for a better account of conformational heterogeneity. We refer to this theory as rG-RPA. As a control, we also study a simpler theory, analogous to our earlier formulation, that invokes a Gaussian chain with fixed Kuhn length. Following Shen and Wang, we refer to this $l_1 = l$ theory as fG-RPA. Extensive comparisons of rG-RPA and fG-RPA predictions on various systems indicate that rG-RPA represents a significant improvement over fG-RPA. As will be detailed below, the superiority of rG-RPA is most notable in its ability to account for the LLPS of both polyampholytes and polyelectrolytes, whereas fG-RPA is inadequate for polyelectrolytic polymers.

II. THEORY
We consider an overall neutral solution of $n_p$ charged polymers, each consisting of $N$ monomers (residues), and small ions including $n_s$ salt ions and $n_c$ counterions with charge numbers $z_s$ and $z_c$ respectively. The charge pattern of a polymer is given by an $N$-dimensional vector $|\sigma\rangle = [\sigma_1, \sigma_2, \ldots, \sigma_N]^{\rm T}$, where $\sigma_\tau$ is the charge on the $\tau$th monomer, and $q_c \equiv (\sum_\tau \sigma_\tau)/N$ is the net charge per monomer. For simplicity, we consider the case with only one species of positive and one species of negative ions; their numbers are denoted as $n_+$ and $n_-$ respectively. Moreover, “salt” is identified as the small ions that carry charges of the same sign as the polymers, whereas “counterions” are the small ions carrying charges opposite to that of the polymers. Thus, $n_s = n_+$ if $q_c > 0$, $n_s = n_-$ if $q_c < 0$, and $|q_c| n_p N + z_s n_s = z_c n_c$ for solution neutrality. The densities ($\rho$) of monomers, salt ions, and counterions are, respectively, $\rho_m = n_p N/\Omega$, $\rho_s = n_s/\Omega$, and $\rho_c = n_c/\Omega$, where $\Omega$ is solution volume. Although only a simple system with at most two species of small ions is analyzed here for conceptual clarity, our theory can be readily expanded to account for multiple species of small ions.

Details of our formulation are given in the Appendix. Here we provide the key steps in the derivation. Let $F$ be the total free energy of the system. Then $f \equiv F l^3/(k_{\rm B} T \Omega)$ is the free energy in units of $k_{\rm B} T$ per volume $l^3$, where $l$ is the bare Kuhn length, $k_{\rm B}$ is the Boltzmann constant and $T$ is absolute temperature. In our theory,

$$
f = -s + f_{\rm ion} + f_p + f_0, \tag{1}
$$

where $s$ is the mixing entropy, $f_{\rm ion}$ and $f_p$ are the interaction free energies among the small ions and those involving the polymers, respectively, both arising from density fluctuations, and $f_0$ is the mean-field excluded-volume interaction, all expressed in the same units as $f$. The mixing entropy, which accounts for the configurational freedom of the solutes, takes the Flory-Huggins form, viz.,

$$
-s = \frac{\phi_m}{N}\ln\phi_m + \phi_s\ln\phi_s + \phi_c\ln\phi_c + \phi_w\ln\phi_w, \tag{2}
$$

where $\phi_m$, $\phi_s$, $\phi_c$, and $\phi_w = 1 - \phi_m - \phi_s - \phi_c$ are volume fractions ($\phi = \rho l^3$), respectively, of polymers, salt ions, counterions, and solvent (water for IDP systems). Following Muthukumar, the charge of each small ion is taken to be distributed over a finite volume comparable to that of a monomer. The corresponding interaction free energy among the small ions is

$$
f_{\rm ion} = -\frac{1}{4\pi}\left[\ln(1 + \kappa l) - \kappa l + \frac{1}{2}(\kappa l)^2\right], \tag{3}
$$

where $1/\kappa = 1/\sqrt{4\pi l_B (z_s^2\rho_s + z_c^2\rho_c)}$ is the Debye screening length, $l_B$ being the Bjerrum length.

Polymers interact via a $\kappa$-dependent screened Coulomb potential and a uniform excluded-volume repulsion with strength $v_2$. The origin of this repulsive term is to be understood as an effective interaction between polymer and solvent.
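As a minimal numerical sketch of the bookkeeping in Eqs. 2 and 3, the following evaluates the neutrality condition, the Flory-Huggins term $-s$, and $f_{\rm ion}$ in the reduced units of Eq. 1. The function names and the numerical inputs are hypothetical and chosen only for illustration. The sketch assumes the $-1/(4\pi)$ prefactor of Eq. 3 and the $z^2$-weighted ionic densities in $\kappa$, which together reproduce the Debye-Hückel limit $-(\kappa l)^3/(12\pi)$ at small $\kappa l$.

```python
import math


def counterion_density(rho_m, rho_s, q_c, z_s=1, z_c=1):
    """Counterion density from neutrality: |q_c| rho_m + z_s rho_s = z_c rho_c."""
    return (abs(q_c) * rho_m + z_s * rho_s) / z_c


def minus_mixing_entropy(phi_m, phi_s, phi_c, N):
    """-s of Eq. 2 (Flory-Huggins form); x ln x is taken as 0 at x = 0."""
    phi_w = 1.0 - phi_m - phi_s - phi_c
    if min(phi_m, phi_s, phi_c, phi_w) < 0.0:
        raise ValueError("volume fractions must be non-negative and sum to at most 1")

    def xlogx(x):
        return x * math.log(x) if x > 0.0 else 0.0

    return xlogx(phi_m) / N + xlogx(phi_s) + xlogx(phi_c) + xlogx(phi_w)


def kappa_l(lB_over_l, phi_s, phi_c, z_s=1, z_c=1):
    """Dimensionless inverse screening length kappa*l, using phi = rho * l**3."""
    return math.sqrt(4.0 * math.pi * lB_over_l * (z_s**2 * phi_s + z_c**2 * phi_c))


def f_ion(kl):
    """Small-ion fluctuation free energy of Eq. 3 as a function of kappa*l."""
    return -(math.log1p(kl) - kl + 0.5 * kl**2) / (4.0 * math.pi)


# Illustrative (hypothetical) state point: fully charged chains with added salt.
phi_m, phi_s, q_c = 0.05, 0.01, -1.0
phi_c = counterion_density(phi_m, phi_s, q_c)  # same relation holds for volume fractions
partial_f = minus_mixing_entropy(phi_m, phi_s, phi_c, N=50) + f_ion(kappa_l(1.5, phi_s, phi_c))
```

The quantity `partial_f` contains only the first two terms of Eq. 1; the polymer terms $f_p$ and $f_0$ require the field-theoretic treatment that follows.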
By setting $v_2$ repulsive we imply that the polymer is in a good solvent. These interactions are contained in the expression

$$
U_p[\mathbf{R}] = \frac{1}{2}\sum_{\alpha,\beta=1}^{n_p}\sum_{\tau,\mu=1}^{N}\left[\sigma_\tau\sigma_\mu\,\frac{l_B\, e^{-\kappa|\mathbf{R}_{\alpha,\tau}-\mathbf{R}_{\beta,\mu}|}}{|\mathbf{R}_{\alpha,\tau}-\mathbf{R}_{\beta,\mu}|} + v_2\,\delta(\mathbf{R}_{\alpha,\tau}-\mathbf{R}_{\beta,\mu})\right], \tag{4}
$$

where $\mathbf{R}_{\alpha,\tau}$ is the position of the $\tau$th monomer in the $\alpha$th polymer. The $U_p$ form facilitates the formulation in terms of density fields below. For this purpose, the divergent self-interaction terms in $U_p$ are either regularized subsequently or inconsequential because they do not contribute to phase-separation properties. Chain connectivity of the polymers is enforced by the potential

$$
T[\mathbf{R}] = \frac{3}{2l^2}\sum_{\alpha=1}^{n_p}\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\alpha,\tau+1}-\mathbf{R}_{\alpha,\tau}\right)^2. \tag{5}
$$

Thus, aside from a combinatorial factor that has already been included in Eq. 2, the partition function involving the polymers is given by

$$
Z_p = \int\prod_{\alpha=1}^{n_p}\prod_{\tau=1}^{N} d\mathbf{R}_{\alpha,\tau}\; e^{-T[\mathbf{R}]-U_p[\mathbf{R}]}. \tag{6}
$$

Now, by applying the Hubbard-Stratonovich transformation and converting real-space to $k$-space variables, we convert the coordinate-space partition function in Eq. 6 to a $k$-space partition function involving a charge-density field $\psi$ and a matter-density field $w$, viz.,

$$
Z_p = Z_0 Z'_p, \qquad Z'_p = \int\prod_{\mathbf{k}\neq\mathbf{0}}\sqrt{\frac{\nu_k}{v_2}}\,\frac{d\psi_{\mathbf{k}}\,dw_{\mathbf{k}}}{\pi\Omega}\; e^{-\mathcal{H}[\psi,w]}, \tag{7}
$$

where $Z_0 = \exp[-v_2 (N n_p)^2/(2\Omega)]$ is the $\mathbf{k}=\mathbf{0}$ contribution,

$$
\mathcal{H}[\psi,w] = \frac{1}{2\Omega}\sum_{\mathbf{k}\neq\mathbf{0}}\left[\nu_k\,\psi_{-\mathbf{k}}\psi_{\mathbf{k}} + \frac{w_{-\mathbf{k}}w_{\mathbf{k}}}{v_2}\right] - n_p\ln Q_p[\psi,w], \tag{8}
$$

$\nu_k \equiv k^2/(4\pi l_B) + (z_s^2\rho_s + z_c^2\rho_c)$, $k \equiv |\mathbf{k}|$, $Q_p[\psi,w] = \int\mathcal{D}[\mathbf{R}]\exp(-\mathcal{H}_p[\psi,w])$ is the single-polymer partition function with $\mathcal{D}[\mathbf{R}] \equiv \prod_{\tau=1}^{N} d\mathbf{R}_\tau$ (the chain label $\alpha$ in $\mathbf{R}$ is dropped since the integration here is only over one chain), and

$$
\mathcal{H}_p[\psi,w] = \frac{3}{2l^2}\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\tau+1}-\mathbf{R}_\tau\right)^2 + \frac{i}{\Omega}\sum_{\mathbf{k}\neq\mathbf{0}}\sum_{\tau=1}^{N}\left(\sigma_\tau\psi_{\mathbf{k}} + w_{\mathbf{k}}\right)e^{-i\mathbf{k}\cdot\mathbf{R}_\tau}. \tag{9}
$$

The total interaction free energy involving the polymers in the unit of Eq. 1 is $-(l^3/\Omega)\ln Z_p$, which we express as the sum of a density-fluctuation contribution $f_p = -(l^3/\Omega)\ln Z'_p$ and a mean-field contribution $f_0 = -(l^3/\Omega)\ln Z_0 = l^3 v_2\rho_m^2/2$. The $f_0$ term involves neither small ions nor electrostatic interactions because the excluded volumes of the small ions are not considered beyond the incompressibility condition in Eq. 2 and the solution system as a whole is neutral.

We evaluate $Z'_p$ in Eq. 7 perturbatively by expanding $\mathcal{H}[\psi,w]$ to second order in density:

$$
\mathcal{H}[\psi,w] \approx \frac{1}{2\Omega}\sum_{\mathbf{k}\neq\mathbf{0}}\left\langle\psi_{-\mathbf{k}}\;\; w_{-\mathbf{k}}\right|\begin{pmatrix}\nu_k + \rho_m\xi_k & \rho_m\zeta_k\\ \rho_m\zeta_k & v_2^{-1} + \rho_m g_k\end{pmatrix}\left|\psi_{\mathbf{k}}\;\; w_{\mathbf{k}}\right\rangle, \tag{10}
$$

where $g_k$, $\xi_k$, and $\zeta_k$ are the monomer density-monomer density, charge-charge, and monomer density-charge correlation functions in $k$-space, and $\langle\ldots|$ and $|\ldots\rangle$ are, respectively, row and column vectors. $Z'_p$ can then be calculated as a Gaussian integral to yield

$$
f_p = -\frac{l^3\ln Z'_p}{\Omega} = \frac{l^3}{2}\int\frac{d^3k}{(2\pi)^3}\ln\left[1 + \rho_m\left(\frac{\xi_k}{\nu_k} + v_2 g_k\right) + \frac{v_2}{\nu_k}\rho_m^2\left(\xi_k g_k - \zeta_k^2\right)\right]. \tag{11}
$$

Evaluation of $g_k$, $\xi_k$, and $\zeta_k$ requires knowledge of the single-polymer $Q_p$ (Eq. 8), which in general depends on the sequence charge pattern. fG-RPA makes the simplifying assumption that $Q_p$ is that of Gaussian chains with a fixed $l$, i.e., it assumes that the second term in Eq. 9 vanishes. As introduced above, here we use a renormalized Kuhn length $l_1 = xl$ to better account for the effects of interactions on $Q_p$ by making the improved approximation

$$
Q_p \approx \int\mathcal{D}[\mathbf{R}]\, e^{-\mathcal{H}_p^0}; \qquad \mathcal{H}_p^0 = \frac{3}{2l^2x}\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\tau+1}-\mathbf{R}_\tau\right)^2. \tag{12}
$$

Accordingly, the correlation functions in Eq. 11 are computed using $l_1$ instead of $l$:

$$
g_k \to g_k^x = \frac{1}{N}\langle 1|\hat{G}_k^x|1\rangle, \quad \xi_k \to \xi_k^x = \frac{1}{N}\langle\sigma|\hat{G}_k^x|\sigma\rangle, \quad \zeta_k \to \zeta_k^x = \frac{1}{N}\langle\sigma|\hat{G}_k^x|1\rangle, \tag{13}
$$

where $\hat{G}_k^x$ is the $N\times N$ correlation matrix of the renormalized Gaussian chain with $[\hat{G}_k^x]_{\tau\mu} = \exp[-(kl)^2 x|\tau-\mu|/6]$, and $\langle 1|$ and $|1\rangle$ are $N$-dimensional vectors with all elements equal to 1. As emphasized above, the single $x$ variable here for end-to-end distance serves to provide an approximate account of sequence-specific effects in single-chain conformations. A more accurate formalism that may be pursued in the future is to consider $x$ as a function of specific residue pairs, i.e., $x \to x(\tau,\mu)$, so as to provide a structure factor that applies to all length scales as in the approach of Shen and Wang.

A variational approach similar to that in Sawle and Ghosh is applied to obtain a sequence-specific $x$ by first expressing $\mathcal{H}_p$ in Eq. 9 as $\mathcal{H}_p = \mathcal{H}_p^0 + \mathcal{H}_p^1$, where $\mathcal{H}_p^0$ is given by Eq. 12 and $\mathcal{H}_p^1$ is the discrepancy in using the renormalized $\mathcal{H}_p^0$ to approximate $\mathcal{H}_p$. In general, a partially optimized solution for $x$ may be obtained by minimizing the differences in averaged physical quantities computed using $\mathcal{H}_p^0$ versus those computed using $\mathcal{H}_p$, i.e., minimizing contributions from $\mathcal{H}_p^1$. To simplify this calculation, we use, as in Ref. 68, the polymer squared end-to-end distance $|\mathbf{R}_N - \mathbf{R}_1|^2$ as the physical quantity for the partial optimization of $x$. The derivation proceeds largely as before, except that the monomer-monomer interaction potential in Ref. 68 is now replaced by the effective field-field correlation function

$$
U_{\rm eff}(\mathbf{k}) \equiv \sum_{\tau,\mu=1}^{N}\Big[\sigma_\tau\sigma_\mu\langle\psi_{-\mathbf{k}}\psi_{\mathbf{k}}\rangle + \langle w_{-\mathbf{k}}w_{\mathbf{k}}\rangle + (\sigma_\tau + \sigma_\mu)\langle\psi_{-\mathbf{k}}w_{\mathbf{k}}\rangle\Big], \tag{14}
$$

where $\langle\ldots\rangle$ represents averaging over field configurations.
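The correlation functions of Eq. 13 are simple double sums over monomer pairs and can be evaluated directly for any charge sequence. The sketch below is an illustration only: the function names are hypothetical, the $O(N^2)$ loop is written for clarity rather than speed, and the 4-block helper assumes the block order $+,-,+,-$. Setting `x=1.0` gives the fixed-Gaussian (fG) chain, and any other `x` gives the renormalized chain.

```python
import math


def four_block_sequence(N):
    """Overall neutral 4-block charge pattern; block order (+, -, +, -) is assumed."""
    if N % 4 != 0:
        raise ValueError("N must be divisible by 4")
    q = N // 4
    return [1.0] * q + [-1.0] * q + [1.0] * q + [-1.0] * q


def correlation_functions(sigma, kl, x=1.0):
    """Return (g, xi, zeta) of Eq. 13 at reduced wavenumber kl = k*l.

    sigma : list of monomer charges
    x     : Kuhn-length renormalization factor l_1 / l (x = 1 is the fG chain)
    """
    N = len(sigma)
    g = xi = zeta = 0.0
    for tau in range(N):
        for mu in range(N):
            # Element [G]_{tau,mu} of the Gaussian-chain correlation matrix
            G = math.exp(-(kl**2) * x * abs(tau - mu) / 6.0)
            g += G                             # <1|G|1>
            xi += sigma[tau] * sigma[mu] * G   # <sigma|G|sigma>
            zeta += sigma[tau] * G             # <sigma|G|1>
    return g / N, xi / N, zeta / N


# Example: a uniformly charged chain versus a neutral 4-block polyampholyte.
g_pe, xi_pe, zeta_pe = correlation_functions([-1.0] * 40, kl=0.5)
g_pa, xi_pa, zeta_pa = correlation_functions(four_block_sequence(40), kl=0.5)
```

For a uniformly charged chain, $\xi_k = g_k$ and $\zeta_k = -g_k$, so the cross term $\xi_k g_k - \zeta_k^2$ in Eq. 11 vanishes identically. For a neutral sequence, $\xi_k$ and $\zeta_k$ both vanish as $k \to 0$.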
This analysis, the details of which are given in the Appendix, leads to an equation that allows us to determine $x$:

$$
1 - \frac{1}{x} - \frac{N l^2}{18(N-1)}\int\frac{d^3k}{(2\pi)^3}\,\frac{k^2\,\Xi_k^x}{\det\Delta_k^x} = 0, \tag{15}
$$

where $\Delta_k^x$ is the $2\times 2$ matrix in Eq. 10 with $g_k$, $\xi_k$, and $\zeta_k$ replaced by their renormalized $g_k^x$, $\xi_k^x$, and $\zeta_k^x$ in Eq. 13. In the numerator of the integrand in Eq. 15,

$$
\Xi_k^x = \frac{\bar{\xi}_k^x}{v_2} + \nu_k\bar{g}_k^x + \rho_m\left(\bar{\xi}_k^x g_k^x + \xi_k^x\bar{g}_k^x - 2\zeta_k^x\bar{\zeta}_k^x\right), \tag{16}
$$

where

$$
\bar{\xi}_k^x = \frac{1}{N}\langle\sigma|\hat{L}\hat{G}_k^x|\sigma\rangle, \quad \bar{g}_k^x = \frac{1}{N}\langle 1|\hat{L}\hat{G}_k^x|1\rangle, \quad \bar{\zeta}_k^x = \frac{1}{N}\langle\sigma|\hat{L}\hat{G}_k^x|1\rangle, \tag{17}
$$

with $\hat{L}$ being an $N\times N$ matrix with $[\hat{L}]_{\tau\mu} = |\tau-\mu|^2$. Now, for any chosen excluded-volume parameter $v_2$, $x$ can be solved as the only unknown in Eq. 15. With $x$ determined, $f_p$ can be computed via Eq. 11 and combined with the above expressions for $s$, $f_{\rm ion}$ and $f_0$ to complete the free energy function in Eq. 1 for our rG-RPA theory. Here we use $v_2 = 4\pi l^3/3$, which is about the $\sim l^3$ size of a monomer, in the applications below.

We note that while $v_2 >$ … $v_2 = 4\pi l^3/$ …

III. RESULTS

A. Salt-free rG-RPA unifies established LLPS trends of both uniformly charged polyelectrolytes and neutral polyampholytes
We first illustrate the more general applicability of rG-RPA by comparing rG-RPA and fG-RPA predictions for salt-free solutions of uniformly charged polyelectrolytes (fully charged homopolymers) and 4-block overall neutral polyampholytes of several different chain lengths (Fig. 1). As stated above, fG-RPA corresponds to setting $x = l_1/l = 1$ and $v_2 = 0$ in rG-RPA. While fG-RPA is not identical to our earlier RPA, because fG-RPA subsumes the effects of small ions in a screening potential for the polymers whereas our earlier RPA theory treats the small ions and polymers on the same footing, both theories share the Gaussian-chain approximation and their predicted trends are very similar, as will be illustrated by examples below.

The rG-RPA-predicted critical point $((\phi_m)_{\rm cr}, 1/(l_B)_{\rm cr})$ in Fig. 1(a) for polyelectrolytes is insensitive to chain length ($(l_B)_{\rm cr}$ is the critical Bjerrum length; $1/(l_B)_{\rm cr}$ is proportional to the critical temperature $T_{\rm cr}$). As $N$ increases, $\lim_{N\to\infty} 1/(l_B)_{\rm cr} \approx$ … and $\lim_{N\to\infty}(\phi_m)_{\rm cr} \approx 0.05$. These predictions are consistent with lattice-chain simulations and other theories. The fG-RPA predictions are drastically different, viz., $\lim_{N\to\infty} 1/(l_B)_{\rm cr} \to \infty$ and $\lim_{N\to\infty}(\phi_m)_{\rm cr} \to 0$, and its predictions for polyelectrolytes are inconsistent with the aforementioned established results. This comparison between rG-RPA and fG-RPA underscores the importance of appropriately accounting for conformational heterogeneity in understanding polyelectrolyte LLPS and the effectiveness of using renormalized Kuhn lengths for the purpose.

Both rG-RPA and fG-RPA predict $1/(l_B)_{\rm cr} \to \infty$ and $(\phi_m)_{\rm cr} \to 0$ as $N \to \infty$ for the polyampholytes (Fig. 1(b and d)). These results are consistent with simple RPA theory, a charged hard-sphere chain model, and lattice-chain simulations. Not surprisingly, both rG-RPA and fG-RPA posit that the $T_{\rm cr}$’s of polyelectrolytes are much lower than those of neutral polyampholytes, because direct electrostatic attractions exist for polyampholytes but effective attractions among polyelectrolytes can only be mediated by counterions.

For the polyampholytes, rG-RPA (Fig. 1(b)) predicts lower $T_{\rm cr}$’s than fG-RPA (Fig. 1(d)). With a more accurate treatment of single-chain conformational dimensions, rG-RPA should entail more compact isolated single-chain conformations for block polyampholytes, resulting in less accessibility of the charges for interchain cohesive interactions and therefore a weaker, but physically more accurate, LLPS propensity.

FIG. 1. Salt-free LLPS of polyelectrolytes and polyampholytes. rG-RPA (a and b, top panels) and fG-RPA (c and d, bottom panels) phase diagrams for $N$ = 10, 25, 40, 80, 120, and 240 polyelectrolytes with charge sequences $\sigma_\tau = -1$ for $\tau = 1, 2, \ldots, N$ (a and c, left panels) and $N$ = 40, 80, 120, and 240 4-block polyampholytes with charge sequences $\sigma_\tau = +1$ for $\tau = 1, 2, \ldots, N/4$ and $\tau = N/2+1, N/2+2, \ldots, 3N/4$, and $\sigma_\tau = -1$ for $\tau = N/4+1, N/4+2, \ldots, N/2$ and $\tau = 3N/4+1, 3N/4+2, \ldots, N$ (b and d, right panels). Grey circles are critical points. For the coexistence curves in (a and c), $N$ decreases from top to bottom, with the $N$ = 80, 120, and 240 curves in (a) being nearly identical.

Notably, the fG-RPA-predicted phase boundaries of both polyelectrolytes and polyampholytes exhibit an inverse S-shape (the condensed-phase part of the coexistence curves is concave upward; Fig. 1(c and d)). In contrast, rG-RPA predicts that only polyampholytes have inverse S-shape phase boundaries (Fig. 1(b)), whereas polyelectrolyte phase boundaries are convex upward with a relatively flat $\phi_m$ dependence around the critical points (Fig. 1(a)). This conspicuous difference between the rG-RPA-predicted phase boundaries of polyampholytes and polyelectrolytes is consistent with explicit-chain simulations.

B. Salt-free rG-RPA account of pH-dependent LLPS
To address pH dependence under salt-free conditions, we apply rG-RPA to an example of a near-neutral polyampholyte under neutral pH, namely the N-terminal IDR of the DEAD-box helicase Ddx4 (IDR denoted as Ddx4$^{\rm N1}$) and its charge-scrambled variant Ddx4$^{\rm N1}$CS, which has the same amino acid composition as Ddx4$^{\rm N1}$ but a different sequence charge pattern. The sequences are studied at neutral and acidic pH. We refer to the resulting charge patterns as (in obvious notation) Ddx4$^{\rm N1}_{\rm pH7}$, Ddx4$^{\rm N1}$CS$_{\rm pH7}$, Ddx4$^{\rm N1}_{\rm pH1}$, and Ddx4$^{\rm N1}$CS$_{\rm pH1}$, where pH7 and pH1 are approximate pH values symbolizing neutral and acidic conditions. For the pH7 sequences, each of the 24 arginines (R) and 8 lysines (K) of Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1}$CS is assigned a $+1$ charge, each of the 18 aspartic acids (D) and 18 glutamic acids (E) is assigned a $-1$ charge … (p$K_{\rm H}$ = 6.04) carries a $+1$ charge (Fig. 2(a), K, R in blue; H in cyan). Thus, Ddx4$^{\rm N1}_{\rm pH7}$ and Ddx4$^{\rm N1}$CS$_{\rm pH7}$ are near-neutral polyampholytes whereas Ddx4$^{\rm N1}_{\rm pH1}$ and Ddx4$^{\rm N1}$CS$_{\rm pH1}$ are polyelectrolytes, although these four sequences, unlike those in Fig. 1, also contain many uncharged monomers.

Fig. 2(b) indicates that the rG-RPA-predicted $T_{\rm cr}$ is much lower under acidic than under neutral conditions, and that the $T_{\rm cr}$ of Ddx4$^{\rm N1}$ is always higher than that of Ddx4$^{\rm N1}$CS under both pH conditions, underscoring that sequence-specific effects influence the LLPS of not only neutral and nearly neutral polyampholytes but also polyelectrolytes.

Intriguingly, inverse S-shaped coexistence curves are seen in Fig. 2(b) not only for neutral pH (blue curves) but also for acidic pH (orange curves). This feature is characteristic of polyampholytes (Fig. 1(b)) but not uniformly charged polyelectrolytes (Fig. 1(a)). This result suggests that inverse S-shaped phase boundaries can arise in general from a heterogeneous sequence charge pattern because it leads to the simultaneous presence of both attractive and repulsive interchain interactions (which can be counterion-mediated in the case of polyelectrolytes) and therefore allows for condensed-phase configurations with lower densities.

FIG. 2. LLPS at neutral and acidic pH. (a) Charge sequences of Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1}$CS (blue/cyan: $+1$, red: $-1$, white: 0) and their (b) rG-RPA and (c) fG-RPA phase diagrams.

As a control, fG-RPA results are shown in Fig. 2(c). In contrast to rG-RPA, fG-RPA predicts that the $l/(l_B)_{\rm cr}$ value (proportional to $T_{\rm cr}$) of both Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1}$CS at low pH is higher than that of Ddx4$^{\rm N1}$CS at neutral pH, and that the critical volume fractions at low pH are significantly lower than those at neutral pH. Although these differences between fG-RPA and rG-RPA predictions for the Ddx4 IDR remain to be conclusively tested by experiment, the low-pH fG-RPA phase diagrams here (orange curves in Fig. 2(c)) share similar features with the fG-RPA phase diagrams for polyelectrolytes in Fig. 1(c) which, as discussed above, are at odds with trends observed in prior theories and experiments. The fG-RPA results and those obtained using our earlier, simple formulation of RPA are very similar (Fig. 3).

FIG. 3. Simple RPA salt-free phase diagrams for the four Ddx4 sequences in Fig. 2(a). (a) Phase diagrams computed using the Coulomb potential in Fourier space, $U_k = 4\pi l_B/k^2$, are very similar to the fG-RPA phase diagrams in Fig. 2(c). (b) Phase diagrams computed using a Coulomb potential with a short-range cutoff, $U_k = 4\pi l_B/[k^2(1 + (kl)^2)]$, the same potential used in our previous simple-RPA studies. This Coulomb potential with a short-range cutoff predicts that the two pH1 sequences have critical temperatures even higher than that of wildtype Ddx4 at pH7. This prediction, however, contradicts the physical intuition that polyelectrolytes should have lower phase separation propensities than neutral or near-neutral polyampholytes of the same chain length.

C. Salt-free rG-RPA rationalizes pH-dependent LLPS of IP5
We now utilize our theory to rationalize part of the experimental pH-dependent LLPS trend of the lyophilized 39-residue peptide IP5, the isoelectric point of which is pH = 4.4 (Fig. 4(a and b)). The pH-dependent charge $\sigma$ of a basic or acidic residue is computed here by

$$
\sigma = \pm\frac{10^{\pm({\rm p}K_a - {\rm pH})}}{1 + 10^{\pm({\rm p}K_a - {\rm pH})}}, \tag{18}
$$

where the $+$ and $-$ signs in the $\pm$ signs above apply to the basic (R, K, H) and acidic (D, E) residues, respectively. Standard p$K_a$ values, viz., R: 12.10, K: 10.67, H: 6.04, D: 3.71, and E: 4.15, are used in Eq. 18 to construct pH-dependent charge sequences of IP5 (Fig. 4(c)).

The rG-RPA- and fG-RPA-predicted IP5 phase boundaries for the experimentally studied pH values are shown in Fig. 4(d). Both theories predict a lower $l/(l_B)_{\rm cr}$ ($\approx$ …) than the experimental $l/(l_B)_{\rm cr}$ ($\approx$ ….5). Physically, this is not surprising, as has been addressed in previous RPA studies, because non-electrostatic cohesive interactions are neglected here. Nonetheless, consistent with experiment, both theories posit that LLPS propensity decreases with increasing pH. Moreover, the rG-RPA-predicted critical volume fraction $(\phi_m)_{\rm cr} \approx$ …–0.024 is reasonable in view of the experimental value of $\approx 0.036$ (Ref. 80), indicating once again that rG-RPA is superior to fG-RPA, as the latter predicts much higher $(\phi_m)_{\rm cr}$’s.

D. Salt-dependent rG-RPA for heteropolymeric charge sequences
In view of the superiority of rG-RPA over fG-RPA, only rG-RPA is used below. We consider the four charge sequences in Fig. 2(a) as examples and restrict attention to monovalent salt and counterions ($z_s = z_c = 1$). In experiments we conducted for this study using described methods, no Ddx4$^{\rm N1}$ LLPS was observed in salt-free solution at room temperature; yet Ddx4$^{\rm N1}$ at room temperature is known to phase separate with 100 mM NaCl, and its LLPS propensity decreases when [NaCl] is increased to 300 mM. These findings suggest that, similar to LLPS of uniformly charged polyelectrolytes, salt dependence of heteropolymer LLPS is non-monotonic at temperatures slightly higher than the salt-free $T_{\rm cr}$, and therefore such temperatures are of particular interest. For this reason, we apply rG-RPA to compute IDR-salt binary phase diagrams of Ddx4$^{\rm N1}_{\rm pH7}$, Ddx4$^{\rm N1}$CS$_{\rm pH7}$, Ddx4$^{\rm N1}_{\rm pH1}$, and Ddx4$^{\rm N1}$CS$_{\rm pH1}$ (Fig. 5), each at an $l/l_B$ value slightly higher than the sequence’s salt-free $l/(l_B)_{\rm cr}$ in Fig. 2(b).

As expected, all binary phase diagrams in Fig. 5 exhibit non-monotonic salt dependence. In general, at temperatures above the salt-free critical temperature, i.e., $l/l_B \gtrsim$ salt-free $l/(l_B)_{\rm cr}$, when sufficient salt is added to the salt-free homogeneous solution, LLPS is triggered at $\phi_s = (\phi_s)^{\rm L}_{\rm cr}$. Adding more salt beyond $(\phi_s)^{\rm L}_{\rm cr}$ enhances LLPS in that a wider range of overall $\phi_m$ falls within the LLPS regime, until a turning point $(\phi_s)^{\rm T}$ is reached. Beyond that, adding more salt (increasing $\phi_s$ above $(\phi_s)^{\rm T}$) reduces LLPS (the phase-separated range of $\phi_m$ narrows). LLPS is impossible for the given temperature when salt concentration is increased above an upper critical point $(\phi_s)^{\rm U}_{\rm cr}$.

[Figure 4 panels: (a) IP5 sequence H H (Aib)QGTFTSDKSKYLDERAAQDFVQWLLDGGPSSGAPPPS; (b) temperature axis $T$ (°C), legend pH 5.50, 5.70, 5.87, 6.00, 7.42; (d) rG-RPA and fG-RPA curves.]

FIG. 4. LLPS of IP5. (a) The IP5 sequence, where basic and acidic residues are in blue and red, respectively; (Aib) is the non-proteinogenic amino acid α-methylalanine. (b) Experimental pH-dependent phase diagrams of IP5 based on the data in Fig. 4 of Ref. 80; anti-freeze was used to obtain some of the low-$T$ results. (c) Net charge per residue, $q_c$, of IP5. (d) Phase diagrams predicted by rG-RPA (solid curves) and fG-RPA (dashed curves).

Despite these qualitative commonalities, there are significant sequence-dependent differences. Notably, at neutral pH, the range of salt concentrations that can induce LLPS is much narrower for Ddx4$^{\rm N1}_{\rm pH7}$ ($\phi_s \lesssim$ …) than for Ddx4$^{\rm N1}$CS$_{\rm pH7}$ ($\phi_s \lesssim$ …) … Ddx4$^{\rm N1}_{\rm pH1}$ and Ddx4$^{\rm N1}$CS$_{\rm pH1}$ are similar ($\phi_s \lesssim$ ….01, Fig. 5(c and d)), and their $(\phi_s)^{\rm L}_{\rm cr}$ and $(\phi_s)^{\rm U}_{\rm cr}$ are significantly larger than those at neutral pH.

Next we explore these trends at temperatures below the salt-free $T_{\rm cr}$. Figs. 6–9 present salt-polymer phase diagrams for four Ddx4 sequences (both wildtype and charge-scrambled sequences at neutral and acidic pH) at three different temperatures. Panels (a) and (b) in these figures show phase diagrams at temperatures below the respective salt-free $T_{\rm cr}$ for the given sequence, while panel (c) is at a temperature above the salt-free $T_{\rm cr}$. The three phase diagrams are compared in panel (d) for a given sequence. These figures reveal that trends for $l/l_B \gtrsim$ salt-free $l/(l_B)_{\rm cr}$ (above the salt-free critical temperature) are largely in line with behaviors at temperatures below the salt-free $T_{\rm cr}$. The only difference is that for $l/l_B <$ salt-free $l/(l_B)_{\rm cr}$, $(\phi_s)^{\rm L}_{\rm cr} = 0$. For $l/l_B <$ salt-free $l/(l_B)_{\rm cr}$, temperatures for different sequences were chosen such that the maximum $\phi_m$ ranges of LLPS are similar among the sequences (as in Fig. 5). With this choice of temperature constraint, when the IDR-salt phase diagrams for different sequences (Figs. 6–9) are compared, we note that $(\phi_s)^{\rm U}_{\rm cr}$ and $(\phi_s)^{\rm T}$ of Ddx4$^{\rm N1}_{\rm pH7}$ are much smaller than those of Ddx4$^{\rm N1}$CS$_{\rm pH7}$. Furthermore, $(\phi_s)^{\rm U}_{\rm cr}$ and $(\phi_s)^{\rm T}$ of these two pH7 sequences are much smaller than those of the two pH1 sequences. Thus, we conclude that Ddx4$^{\rm N1}_{\rm pH7}$ is more sensitive to salt than Ddx4$^{\rm N1}$CS$_{\rm pH7}$, and both are more salt-sensitive than Ddx4$^{\rm N1}_{\rm pH1}$ and Ddx4$^{\rm N1}$CS$_{\rm pH1}$. Metrics other than $(\phi_s)^{\rm T}$ can also be used to determine salt sensitivity. For example, the low-$\phi_m$ turning point (e.g., at $\phi_m \approx$ …, $\phi_s \approx$ ….16 in Fig. 5(a), unlabeled) with a $\phi_s$ value similar to that of $(\phi_s)^{\rm T}$ may be used to characterize salt sensitivity. The resulting trend is similar to the one gleaned from the turning point, $(\phi_s)^{\rm T}$.

The existence of a $(\phi_s)^{\rm L}_{\rm cr} > 0$ … Ddx4$^{\rm N1}$ does not phase separate with [NaCl] < … °C ($l/l_B$ = 0.…) … $\phi_s$ increases, indicating that salt ions and the heteropolymeric IDRs partially exclude each other in low-salt but partially coalesce in high-salt solutions at neutral pH. This intriguing feature was not encountered in solutions of either a single species of uniformly charged or two species of oppositely charged homopolymers. In contrast, the tie-line slopes in Fig. 5(c) and (d) are all positive, indicating that salt ions and the heteropolymeric IDRs always partially coalesce under acidic conditions.
FIG. 5. IDR-salt binary phase diagrams of two Ddx4 variants at low and high pH. Results are for $l/l_{\rm B} \gtrsim l/(l_{\rm B})_{\rm cr}$, where the salt-free $l/(l_{\rm B})_{\rm cr}$ equals 0.455 for Ddx4$^{\rm N1}_{\rm pH7}$ (a), 0.336 for Ddx4$^{\rm N1CS}_{\rm pH7}$ (b), 0.195 for Ddx4$^{\rm N1}_{\rm pH1}$ (c), and 0.188 for Ddx4$^{\rm N1CS}_{\rm pH1}$ (d). The $\phi_s$ values of the grey circles in (a)–(d) are $(\phi_s)^{\rm U}_{\rm cr}$, $(\phi_s)_{\rm T}$, or $(\phi_s)^{\rm L}_{\rm cr}$, as indicated by U, T, and L in (a).

FIG. 6. Polymer-salt coexistence phase diagrams of Ddx4$^{\rm N1}_{\rm pH7}$ at the $l/l_{\rm B}$ values indicated. The salt-free critical value of $l/l_{\rm B}$ is $l/(l_{\rm B})_{\rm cr} = 0.$[…] $(\phi_s)^{\rm U}_{\rm cr}$, whereas the bottom grey circle in (c) provides the lower critical concentration $(\phi_s)^{\rm L}_{\rm cr}$ (see discussion in main text). Each dashed line in (a)–(c) is a tie line connecting a pair of coexistent phases. The three phase boundaries in (a)–(c) are compared in (d).
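The sign convention for tie-line slopes used in this discussion can be made concrete with a short sketch. It is illustrative only: the function names and the sample coexistence points are hypothetical and are not taken from the computed phase diagrams. A negative slope means the condensed phase holds less salt than the dilute phase (partial exclusion), whereas a positive slope means it holds more (partial coalescence).

```python
from typing import Tuple

Phase = Tuple[float, float]  # (phi_m, phi_s): polymer and salt volume fractions


def tie_line_slope(dilute: Phase, condensed: Phase) -> float:
    """Slope d(phi_s)/d(phi_m) of the tie line joining two coexisting phases."""
    d_phi_m = condensed[0] - dilute[0]
    if d_phi_m == 0.0:
        raise ValueError("coexisting phases must differ in polymer volume fraction")
    return (condensed[1] - dilute[1]) / d_phi_m


def salt_partitioning(dilute: Phase, condensed: Phase) -> str:
    """Classify salt partitioning by the sign of the tie-line slope."""
    slope = tie_line_slope(dilute, condensed)
    if slope < 0.0:
        return "exclude"    # condensed phase is depleted in salt
    if slope > 0.0:
        return "coalesce"   # condensed phase is enriched in salt
    return "neutral"


# Hypothetical coexistence pairs, chosen only to exercise both signs.
low_salt_pair = ((0.001, 0.0020), (0.100, 0.0015))
high_salt_pair = ((0.005, 0.0100), (0.080, 0.0120))
print(salt_partitioning(*low_salt_pair))
print(salt_partitioning(*high_salt_pair))
```

Applied to each dashed tie line of a computed phase diagram, a classification of this kind would make a salt-dependent sign change of the slope straightforward to detect.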
E. Salt-dependent rG-RPA is consistent with established trends in LLPS of homopolymeric, uniformly charged polyelectrolytes
Our model predicts that salt and polymers coalesce for Ddx4$^{\rm N1}_{\rm pH1}$ and Ddx4$^{\rm N1CS}_{\rm pH1}$ (Fig. 5(c) and (d)). These sequences are examples of non-uniformly charged polyelectrolytes. However, these results are contrary to experiment and theory on uniformly charged polyelectrolytes that suggest salt ions and polymers tend to exclude each other, leading to tie lines with negative slopes in the polymer-salt phase diagrams. We test the ability of our model to reproduce this established trend by computing salt-polymer phase diagrams for uniformly charged polymers (Fig. 10(a)). The established feature is captured by our new theory, as the slopes of all tie lines are negative in Fig. 10(a). Furthermore, consistent with literature reports on uniformly charged homopolymers (homopolyelectrolytes), with addition of salt, rG-RPA predicts a one-to-two phase transition in the low salt regime as well as a two-to-one phase transition in the high salt regime. For comparison, Fig. 10(b) is the phase diagram of an overall neutral polyampholyte at a temperature substantially lower than the salt-free $T_{\rm cr}$, with all tie lines having positive slopes. A recent field theory simulation study of an overall neutral diblock polyampholyte also found tie lines with slightly positive slopes. Since tie lines with exclusively positive slopes are also seen for the overall negatively-charged low-pH Ddx4 IDRs above, the opposite-signed tie-line slopes in Fig. 10(a) for homopolymeric and those in Fig. 5(c) and (d) for heteropolymeric polyelectrolytes suggest a role of sequence heterogeneity in determining whether charged polymers tend to exclude or coalesce with salt ions. However, the precise origins of variation in tie-line slope remain to be ascertained.

FIG. 7. Polymer-salt coexistence phase diagrams of Ddx4$^{\rm N1CS}_{\rm pH7}$ at the $l/l_{\rm B}$ values indicated. The salt-free critical value of $l/l_{\rm B}$ is $l/(l_{\rm B})_{\rm cr} = 0.$[…] $(\phi_s)^{\rm U}_{\rm cr}$, whereas the bottom grey circle in (c) provides the lower critical concentration $(\phi_s)^{\rm L}_{\rm cr}$. Each dashed line in (a)–(c) is a tie line connecting a pair of coexistent phases. The three phase boundaries in (a)–(c) are compared in (d).

FIG. 8. Polymer-salt coexistence phase diagrams of Ddx4$^{\rm N1}_{\rm pH1}$ at the $l/l_{\rm B}$ values indicated. The salt-free critical value of $l/l_{\rm B}$ is $l/(l_{\rm B})_{\rm cr} = 0.$[…] $(\phi_s)^{\rm U}_{\rm cr}$, whereas the bottom grey circle in (c) provides the lower critical concentration $(\phi_s)^{\rm L}_{\rm cr}$. Each dashed line in (a)–(c) is a tie line connecting a pair of coexistent phases. The three phase boundaries in (a)–(c) are compared in (d).

One idea is that the non-zero tie-line slopes arise from chain connectivity of polymers. If the polymers were not connected and behaved like a collection of monomers, the salt concentrations in the dilute and condensed phases would simply follow that of the polymer, leading to a positive slope. However, chain connectivity can change the slope from positive to negative.

FIG. 9. Polymer-salt coexistence phase diagrams of Ddx4$^{\rm N1CS}_{\rm pH1}$ at the $l/l_{\rm B}$ values indicated. The salt-free critical value of $l/l_{\rm B}$ is $l/(l_{\rm B})_{\rm cr} = 0.$[…] $(\phi_s)^{\rm U}_{\rm cr}$, whereas the bottom grey circle in (c) provides the lower critical concentration $(\phi_s)^{\rm L}_{\rm cr}$. Each dashed line in (a)–(c) is a tie line connecting a pair of coexistent phases. The three phase boundaries in (a)–(c) are compared in (d).

The nature of tie-line slopes has also received considerable attention in the salt-polymer phase diagrams observed during complex coacervation of symmetric polyelectrolytes. Insights gleaned from these studies can yield clues to tie-line slope differences observed in our analysis.
A recent theory based on the concept of chain connectivity predicts a salt-concentration-dependent change of sign of tie-line slope, exhibiting a behavior similar to that in Fig. 5(a) and (b), although in this case of coacervation the slope changes from positive to negative with addition of salt, opposite to the case of heteropolymers described here. Another idea is that tie-line slope is determined by a competition between electrostatic interactions among polymers and configurational entropy of the salt ions, whereby the magnitude of electrostatic interactions in the condensed phase is enhanced by reduced salt because of less screening, but any difference in salt-ion concentration between the dilute and condensed phases is entropically unfavorable. It is intuitive that both of these proposed mechanisms, conjectured in modeling coacervation, would be affected by the charge pattern of the polymers, but the manner in which the proposed mechanisms are modulated by sequence heterogeneity remains to be investigated.

FIG. 10. Salt-dependent LLPS of polyelectrolytes and polyampholytes. rG-RPA phase diagrams for (a) an $N = 50$ homopolymer with monomer charge $= -1$, and (b) the $N = 40$ 4-block polyampholyte in Fig. 1. Note that salt-free $l/(l_{\rm B})_{\rm cr} = 0.$[…] .63 for (b). $(\phi_s)^{\rm U}_{\rm cr}$ is given by the grey circle. An unmarked $\phi_s = (\phi_s)^{\rm L}_{\rm cr} >$ […]

F. rG-RPA rationalizes sequence-dependent LLPS of Ddx4 IDRs
Simple RPA theory and an extended RPA+FH theory with an augmented Flory-Huggins (FH) mean-field account of non-electrostatic interactions were utilized to rationalize experimental data on sequence- and salt-dependent LLPS of Ddx4 IDRs. Because RPA accounts only for electrostatic interactions and a sequence-specific analytical treatment of other interactions is currently lacking, FH was used to provide an approximate account of non-electrostatic interactions. These interactions can include hydrophobicity, hydrogen bonding, and especially cation-$\pi$ and $\pi$-$\pi$ interactions because $\pi$-related interactions play prominent roles in LLPS of biomolecular condensates. To gain further insight into the semi-quantitative picture that emerged from these earlier studies and to assess the generality of our rG-RPA theory, here we apply an augmented rG-RPA to the LLPS of the same Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1CS}$ sequences by adding to the rG-RPA free energy in Eq. 1 an FH interaction term $-\chi\phi_m^2$, where $\chi = \Delta H(l_{\rm B}/l) - \Delta S$ contains both enthalpic and entropic components, and refer to the resulting formulation as rG-RPA+FH.

To compare with experimental data, we use this theory to compute the phase diagrams of Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1CS}$ at pH 6.5 with 100 and 300 mM NaCl, which correspond, respectively, to $\phi_s = 0.$[…] N1 and Ddx4$^{\rm N1CS}$ LLPS because no corresponding experimental data is currently available for comparison.

Our detailed rG-RPA study of salt-Ddx4$^{\rm N1}$ and salt-Ddx4$^{\rm N1CS}$ binary phase diagrams in Fig. 5 and Figs. 6–9 indicates that the difference between dilute- and condensed-phase salt concentrations is less than 15% for $\phi_s <$ .01. Assuming that this trend is not much affected by non-electrostatic interactions, here we make the simplifying assumption that salt concentration is constant when determining the rG-RPA+FH phase diagrams. Fig. 11(a) shows that the resulting rG-RPA+FH theory with $\chi = 0.$[…]$(l_{\rm B}/l)$ fits reasonably well with all four available experimental phase diagrams.

As a control, phase diagrams are also computed without the augmented FH term (i.e., $\chi = 0$). These phase diagrams are shown as dashed lines in Fig. 11(b). Without the $\chi$ term, the critical temperatures of Ddx4$^{\rm N1}$ and Ddx4$^{\rm N1CS}$ with [NaCl] = 100 mM are both predicted to be below 0 °C (Fig. 11(b)). This theoretical trend is consistent with the experimental observation that phenylalanine to alanine (F-to-A) and arginine to lysine (R-to-K) mutants of Ddx4$^{\rm N1}$ do not undergo LLPS at physiologically relevant temperatures. These mutations (F-to-A and R-to-K) are expected to significantly reduce $\pi$-related interactions and therefore correspond to having a weaker FH term (i.e., a smaller $\chi$).

One aforementioned experimentally observed feature that cannot be captured by the present rG-RPA+FH theory is that in the absence of salt, Ddx4$^{\rm N1}$ at pH 6.5 does not phase separate at room temperature, but rG-RPA+FH with $\chi = 0.$[…]$(l_{\rm B}/l)$ predicts phase separation under the same conditions. There can be multiple reasons for this mismatch between theory and experiment. A likely one is that the mean-field treatment of non-electrostatic interactions does not take into account possible coupling (cooperative effects) between sequence-specific electrostatic and non-electrostatic interactions, such as $\pi$-related interactions and hydrogen bonding that can be enhanced by proximate electrostatic attraction.

IV. CONCLUSIONS
In summary, we have developed a formalism for salt-, pH-, and sequence-dependent LLPS by combining RPA and Kuhn-length renormalization. The trends predicted by the resulting rG-RPA theory are consistent with established theoretical and experimental results. Importantly, unlike more limited previous analytical approaches, rG-RPA is generally applicable to both polyelectrolytes and neutral/near-neutral polyampholytes. In addition to providing physical rationalizations for experimental data on the pH-dependent LLPS of IP5 peptides and the sequence and salt dependence of LLPS of Ddx4 IDRs, our theory offers several intriguing predictions of electrostatics-driven LLPS properties that should inspire further theoretical studies and experimental evaluations. One such observation is that in a salt-heteropolymer system, it is possible for the slope of the tie lines to shift from negative to positive by increasing salt. Although tie lines with exclusively positive or exclusively negative slopes were predicted for uniformly charged polyelectrolytes and diblock polyampholytes, a salt-dependent change in the sign of tie-line slope for a single species of heteropolymer (specifically from negative to positive with increasing salt) is a notable prediction. In future studies, it would be interesting to explore how this property might have emerged from the intuitively higher degree of sequence heterogeneity of the Ddx4$^{\rm N1}$ IDR vis-à-vis that of simple diblock or few-block polyampholytes. In general, the interplay between sequence heterogeneity and a proposed chain connectivity effect as well as a proposed screening-configurational entropy competition effect on the salt partitioning slope between dilute and condensed phases remains to be elucidated. Another observation of our work is that inverse S-shape coexistence curves can arise from sequence heterogeneity not only for polyampholytes but also for polyelectrolytes. As emphasized recently, an inverse S-shape coexistence curve allows for a less concentrated condensed phase, which can be of biophysical relevance because it would enable a condensate with higher permeability. Because rG-RPA is an analytical theory, pertinent numerical computations are much more efficient than field-theory or explicit-chain simulations. Thus, in view of the above advances and despite its approximate nature, rG-RPA should be useful as a high-throughput tool for assessing sequence-dependent LLPS properties in developing basic biophysical understanding and in practical applications such as design of new heteropolymeric materials.

FIG. 11. Comparing rG-RPA+FH results with experimental data on Ddx4 IDRs. Axes: $T$ (°C) versus [Ddx4] (mg/ml). (a) Experimental data of Ddx4$^{\rm N1}_{\rm pH6.[\ldots]}$ (wt) and Ddx4$^{\rm N1CS}_{\rm pH6.[\ldots]}$ (cs) (chain length $N = 241$ for both sequences) in aqueous solutions with 100 and 300 mM NaCl (from Ref. 24; color symbols) are fitted, respectively, to rG-RPA+FH theory with $\phi_s = 0.$[…] N1 binary phase diagrams in Fig. 5 indicate that the difference in salt concentration between the two phases is less than 15% for $\phi_s <$ .01. The fits yield an FH interaction parameter $\chi = 0.$[…]$(l_{\rm B}/l)$ which is equivalent to an enthalpy $\Delta H = -$[…] °C and mg/ml by a procedure similar to that in Ref. 55 with an appropriately chosen model Kuhn length $l$ that is quite similar to (though not identical with) the C$_\alpha$–C$_\alpha$ virtual bond length of polypeptides. (b) Phase diagrams of the two sequences with and without the augmented FH interaction. Without the FH term (i.e., $\chi = 0$), the critical temperatures of both Ddx4$^{\rm N1}_{\rm pH6.[\ldots]}$ and Ddx4$^{\rm N1CS}_{\rm pH6.[\ldots]}$ at 100 mM NaCl are below 0 °C. The two $\chi = 0$ systems may be interpreted as corresponding to sequences with reduced favorable non-electrostatic interactions. See the main text for further discussion.

Future development of LLPS theory should address a number of physical properties not tackled by our current theories. These include, but are not necessarily limited to: (i) Sequence-dependent effects of non-electrostatic interactions, which are neglected in rG-RPA+FH. (ii) Counterion condensation. (iii) Dependence of relative permittivity (dielectric constant) on polymer density and salt. (iv) A more accurate treatment of conformational heterogeneity to compute the structure factor. The present approach accounts approximately for sequence-dependent end-to-end distance, but it fails to capture conformational heterogeneities at smaller length scales. A formalism for residue-pair-specific renormalized Kuhn length should afford improvement in this regard. (v) Higher-order density fluctuations beyond the quadratic fluctuations treated by rG-RPA. The rapidly expanding repertoire of experimental data on biomolecular condensates is providing impetus for theoretical efforts in all these directions.

V. ACKNOWLEDGEMENT
We thank Alaji Bah, Julie Forman-Kay, and Kevin Shen for helpful discussions. This work was supported by Canadian Institutes of Health Research grant PJT-155930 and Natural Sciences and Engineering Research Council of Canada grant RGPIN-2018-04351 to H.S.C., National Institutes of Health grant 1R15GM128162-01A1 to K.G., and computational resources provided by SciNet of Compute/Calcul Canada. H.S.C. and K.G. are members of the Protein Folding and Dynamics Research Coordination Network funded by National Science Foundation grant MCB 1516959.
Appendix A: Derivation of polymer solution free energy
As described in the main text, we consider a neutral solution of $n_p$ charged polymers of $N$ monomers (residues) with charge sequence $|\sigma\rangle = [\sigma_1, \sigma_2, \ldots, \sigma_N]^{\rm T}$. The averaged net charge per monomer is defined as $q_c = (\sum_\tau \sigma_\tau)/N$. In addition, there are $n_s$ salt ions (co-ions) carrying $z_s$ charges and $n_c$ counterions carrying $z_c$ charges. Charge neutrality, $|q_c| N n_p + z_s n_s = z_c n_c$, is always preserved. Monomer and ion densities are defined as $\rho_m = n_p N/\Omega$, $\rho_s = n_s/\Omega$, and $\rho_c = n_c/\Omega$, respectively, with $\Omega$ being the solution volume.

We label the polymers by $\alpha = 1, 2, \ldots, n_p$ and residues in a polymer by $\tau = 1, 2, \ldots, N$, and denote the spatial coordinate of the $\tau$th monomer in the $\alpha$th polymer by $\mathbf{R}_{\alpha,\tau}$. Similarly, the small ions are labeled by $a = 1, 2, \ldots, n_s + n_c$, in which $1 \le a \le n_s$ are for salt ions and $n_s + 1 \le a \le n_s + n_c$ are for counterions, with the coordinate of the $a$th small ion denoted by $\mathbf{r}_a$. The implicit-solvent partition function is then expressed as an integral over all solute coordinates divided by factorials that account for the indistinguishability of the molecules within each molecular species in the solution, viz.,

$$\mathcal{Z} = \frac{1}{n_p!\, n_c!\, n_s!\, n_w!} \int \prod_{\alpha=1}^{n_p} \prod_{\tau=1}^{N} d\mathbf{R}_{\alpha,\tau} \prod_{a=1}^{n_s+n_c} d\mathbf{r}_a \; e^{-\mathcal{T}[\mathbf{R}] - \mathcal{U}[\mathbf{R},\mathbf{r}]}, \tag{A1}$$

where $n_w$ denotes the number of water molecules, $\mathcal{T}$ accounts for chain connectivity of the polymers, and $\mathcal{U}$ accounts for interactions among all solute molecules; $[\mathbf{R}]$ is shorthand for $[\{\mathbf{R}_{\alpha,\tau}\}]$ and $[\mathbf{R},\mathbf{r}]$ is shorthand for $[\{\mathbf{R}_{\alpha,\tau}\},\{\mathbf{r}_a\}]$. Connectivity is enforced by a sum of Gaussian potentials sharing the same Kuhn length $l$, which is given by

$$\mathcal{T}[\mathbf{R}] = \frac{3}{2l^2} \sum_{\alpha=1}^{n_p} \sum_{\tau=1}^{N-1} \left(\mathbf{R}_{\alpha,\tau+1} - \mathbf{R}_{\alpha,\tau}\right)^2. \tag{A2}$$

For simplicity, we assume that interactions in $\mathcal{U}$ are all pairwise, in which case it takes the form

$$\mathcal{U}[\mathbf{R},\mathbf{r}] = \frac{1}{2} \sum_{\alpha,\beta=1}^{n_p} \sum_{\tau,\mu=1}^{N} U^{\tau\mu}_{pp}(\mathbf{R}_{\alpha,\tau} - \mathbf{R}_{\beta,\mu}) + \sum_{\alpha=1}^{n_p} \sum_{\tau=1}^{N} \sum_{a=1}^{n_s+n_c} U^{\tau a}_{ps}(\mathbf{R}_{\alpha,\tau} - \mathbf{r}_a) + \frac{1}{2} \sum_{a,b=1}^{n_s+n_c} U^{ab}_{ss}(\mathbf{r}_a - \mathbf{r}_b), \tag{A3}$$

where $U_{pp}$, $U_{ps}$, and $U_{ss}$ are, respectively, monomer-monomer, monomer-ion, and ion-ion interaction potentials. It should be noted that although self-interactions, that is, the $(\alpha,\tau) = (\beta,\mu)$ terms for monomers and the $a = b$ terms for small ions, are included in the above summation to facilitate subsequent formal development of a field-theory description, these divergent terms will be regularized in the final free energy expression and thus have no bearing on the outcome of our theory. By introducing

$$\rho^{\tau}_{\mathbf{k}} = \sum_{\alpha=1}^{n_p} e^{i\mathbf{k}\cdot\mathbf{R}_{\alpha,\tau}}, \tag{A4a}$$
$$c^{s}_{\mathbf{k}} = \sum_{a=1}^{n_s} e^{i\mathbf{k}\cdot\mathbf{r}_a}, \tag{A4b}$$
$$c^{c}_{\mathbf{k}} = \sum_{a=1}^{n_c} e^{i\mathbf{k}\cdot\mathbf{r}_{a+n_s}}, \tag{A4c}$$

as the $\mathbf{k}$-space density operators for the monomers and small ions, we rewrite Eq. A3 in $\mathbf{k}$-space as

$$\mathcal{U} = \frac{1}{2\Omega} \sum_{\mathbf{k}} \left[ \sum_{\tau,\mu=1}^{N} \rho^{\tau}_{\mathbf{k}}\, U^{\tau\mu}_{pp}(\mathbf{k})\, \rho^{\mu}_{-\mathbf{k}} + 2 \sum_{\tau=1}^{N} \sum_{\gamma=s,c} \rho^{\tau}_{\mathbf{k}}\, U^{\tau\gamma}_{ps}(\mathbf{k})\, c^{\gamma}_{-\mathbf{k}} + \sum_{\gamma,\gamma'=s,c} c^{\gamma}_{\mathbf{k}}\, U^{\gamma\gamma'}_{ss}(\mathbf{k})\, c^{\gamma'}_{-\mathbf{k}} \right], \tag{A5}$$

where $1/\Omega$ is the standard normalization factor for the Fourier transformation, and the general form $U(\mathbf{k}) = \int d\mathbf{r}\, U(\mathbf{r}) \exp(-i\mathbf{k}\cdot\mathbf{r})$ represents the interaction potentials in $\mathbf{k}$-space. As in Eq. A3 for $\mathcal{U}[\mathbf{R},\mathbf{r}]$, the superscripts of $U(\mathbf{k})$ are labels for monomers and ions, and the subscripts specify the interaction type. We further define interaction matrices $\hat{U}(\mathbf{k})$ by equating the matrix elements $[\hat{U}(\mathbf{k})]_{\tau\mu}$ with $U^{\tau\mu}(\mathbf{k})$ for $U_{pp}$, $U_{ps}$, and $U_{ss}$. We also define the density operator vectors $|\rho_{\mathbf{k}}\rangle$ and $|c_{\mathbf{k}}\rangle$ such that $(|\rho_{\mathbf{k}}\rangle)_\tau = \rho^{\tau}_{\mathbf{k}}$ and $|c_{\mathbf{k}}\rangle = [c^{s}_{\mathbf{k}}, c^{c}_{\mathbf{k}}]^{\rm T}$. $\mathcal{U}$ can then be expressed in matrix representation as

$$\mathcal{U} = \frac{1}{2\Omega} \sum_{\mathbf{k}} \left[ \langle \rho_{-\mathbf{k}} | \hat{U}_{pp}(\mathbf{k}) | \rho_{\mathbf{k}} \rangle + 2 \langle \rho_{-\mathbf{k}} | \hat{U}_{ps}(\mathbf{k}) | c_{\mathbf{k}} \rangle + \langle c_{-\mathbf{k}} | \hat{U}_{ss}(\mathbf{k}) | c_{\mathbf{k}} \rangle \right]. \tag{A6}$$

The present study focuses on solution systems in which $U_{ss}$ and $U_{ps}$ are purely Coulombic whereas $U_{pp}$ has both Coulombic and pairwise (two-body) excluded-volume repulsion components. Hence

$$\hat{U}_{ss}(\mathbf{k}) = \frac{4\pi l_{\rm B}}{k^2}\, |z\rangle\langle z|, \tag{A7a}$$
$$\hat{U}_{ps}(\mathbf{k}) = \frac{4\pi l_{\rm B}}{k^2}\, |\sigma\rangle\langle z|, \tag{A7b}$$
$$\hat{U}_{pp}(\mathbf{k}) = \frac{4\pi l_{\rm B}}{k^2}\, |\sigma\rangle\langle\sigma| + v\, |N\rangle\langle N|, \tag{A7c}$$

where $k \equiv |\mathbf{k}|$, and $l_{\rm B} \equiv e^2/(4\pi\epsilon k_{\rm B} T)$ is the Bjerrum length ($e$ is electronic charge, $\epsilon$ is permittivity, $k_{\rm B}$ is Boltzmann constant, $T$ is absolute temperature). $\langle z| = {\rm sign}(q_c)[z_s, -z_c]$ is the vector representing the charge valencies (number of electronic charges per ion) of salt ions and counterions, respectively, $v >$ […] $|N\rangle$ is an $N$-dimensional vector in which every component is 1. All elements in the excluded volume matrix $|N\rangle\langle N|$ take unity value because for simplicity all monomers are taken to be of equal size. Substituting the potentials given by Eq. A7 into the $\mathcal{U}$ function in Eq. A6 yields

$$\mathcal{U} = \frac{1}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \lambda_k \left| \langle\sigma|\rho_{\mathbf{k}}\rangle + \langle z|c_{\mathbf{k}}\rangle \right|^2 + \frac{1}{2\Omega} \sum_{\mathbf{k}} v \left| \langle N|\rho_{\mathbf{k}}\rangle \right|^2, \tag{A8}$$

where $\lambda_k = 4\pi l_{\rm B}/k^2$ and $|A_{\mathbf{k}}|^2 \equiv A_{-\mathbf{k}} A_{\mathbf{k}}$ for arbitrary $\mathbf{k}$-dependent $A_{\mathbf{k}}$. The first summation does not need to include $\mathbf{k} = \mathbf{0}$ because this term is proportional to the overall net charge of the solution and therefore must be zero because of overall electric neutrality of the solution.
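To give a sense of scale for the quantities entering Eqs. A7 and A8, the following sketch evaluates the Bjerrum length $l_{\rm B} = e^2/(4\pi\epsilon k_{\rm B}T)$ and the Coulomb kernel $\lambda_k = 4\pi l_{\rm B}/k^2$. The relative permittivity of 78.4 (roughly that of water near room temperature) and the temperature of 298.15 K are illustrative assumptions, not parameters taken from this work, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge (C)
EPS_0 = 8.8541878128e-12     # vacuum permittivity (F/m)
K_B = 1.380649e-23           # Boltzmann constant (J/K)


def bjerrum_length(temperature_k: float, eps_r: float) -> float:
    """l_B = e^2 / (4 pi eps k_B T) in metres, with eps = eps_r * eps_0."""
    return E_CHARGE**2 / (4.0 * math.pi * eps_r * EPS_0 * K_B * temperature_k)


def coulomb_kernel(k: float, l_b: float) -> float:
    """lambda_k = 4 pi l_B / k^2 for k > 0 (k and l_B in consistent units)."""
    if k <= 0.0:
        raise ValueError("k must be positive; the k = 0 term is excluded by neutrality")
    return 4.0 * math.pi * l_b / k**2


# Assumed aqueous conditions (illustrative values only).
l_b = bjerrum_length(temperature_k=298.15, eps_r=78.4)
print(f"l_B = {l_b * 1e9:.3f} nm")
```

Because $l_{\rm B} \propto 1/T$ at fixed permittivity, the ratio $l/l_{\rm B}$ used throughout the phase diagrams plays the role of a reduced temperature.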
1. Field theory for polymer solution
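The Hubbard-Stratonovich transformation is then applied to linearize the quadratic form of $\mathcal{U}$ in Eq. A8 by introducing conjugate fields $\psi_{\mathbf{k}}$ for charge density and $w_{\mathbf{k}}$ for mass density. The partition function $\mathcal{Z}$ in Eq. A1 can then be rewritten in terms of

$$\begin{aligned}
\mathcal{Z}' &= \int \prod_{\alpha=1}^{n_p} \prod_{\tau=1}^{N} d\mathbf{R}_{\alpha,\tau} \prod_{a=1}^{n_s+n_c} d\mathbf{r}_a \; e^{-\mathcal{T}[\mathbf{R}] - \mathcal{U}[\mathbf{R},\mathbf{r}]} \\
&= \exp\left\{ -\frac{v}{2\Omega} \left| \langle N|\rho_{\mathbf{k}=\mathbf{0}}\rangle \right|^2 \right\} \prod_{\mathbf{k}\neq\mathbf{0}} \int \frac{d\psi_{\mathbf{k}}\, dw_{\mathbf{k}}}{2\pi\Omega\sqrt{\lambda_k v}} \exp\left\{ -\frac{1}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \left[ \frac{|\psi_{\mathbf{k}}|^2}{\lambda_k} + \frac{|w_{\mathbf{k}}|^2}{v} \right] \right\} \\
&\quad \times \int \prod_{\alpha=1}^{n_p} \prod_{\tau=1}^{N} d\mathbf{R}_{\alpha,\tau} \prod_{a=1}^{n_s+n_c} d\mathbf{r}_a \exp\Bigg\{ -\frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \Big[ \big( \langle\sigma|\rho_{-\mathbf{k}}\rangle + \langle z|c_{-\mathbf{k}}\rangle \big) \psi_{\mathbf{k}} + \langle N|\rho_{-\mathbf{k}}\rangle w_{\mathbf{k}} \Big] \\
&\qquad\qquad - \mathcal{T}[\{\mathbf{R}_{\alpha,\tau}\}] \Bigg\},
\end{aligned} \tag{A9}$$

where $\mathcal{Z} = \mathcal{Z}'/(n_p!\, n_c!\, n_s!\, n_w!)$. The first term in $\mathcal{Z}'$ is merely the $\mathbf{k} = \mathbf{0}$ component of $\mathcal{U}$, which by the definition of $\rho^{\tau}_{\mathbf{k}}$ is equal to

$$\mathcal{Z}_0 \equiv \exp\left\{ -\frac{v}{2\Omega} \left| \langle N|\rho_{\mathbf{k}=\mathbf{0}}\rangle \right|^2 \right\} = \exp\left\{ -\frac{v}{2\Omega} (N n_p)^2 \right\}. \tag{A10}$$

The remaining part of $\mathcal{Z}'$ is a field integral over $\psi$ and $w$. The first component (the latter part of the second line in Eq. A9) is an exponential of the quadratic self-correlations, and the second term (the third and fourth lines in Eq. A9) is a partition function for the polymers and the small ions under the influence of $\psi$ and $w$, which we now symbolize as

$$\mathcal{Q}_{\rm sol}[\psi, w] \equiv \int \prod_{\alpha=1}^{n_p} \prod_{\tau=1}^{N} d\mathbf{R}_{\alpha,\tau} \prod_{a=1}^{n_s+n_c} d\mathbf{r}_a \exp\left\{ -\frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \Big[ \big( \langle\sigma|\rho_{-\mathbf{k}}\rangle + \langle z|c_{-\mathbf{k}}\rangle \big) \psi_{\mathbf{k}} + \langle N|\rho_{-\mathbf{k}}\rangle w_{\mathbf{k}} \Big] - \mathcal{T}[\{\mathbf{R}_{\alpha,\tau}\}] \right\}. \tag{A11}$$

By the definitions of $c_{\mathbf{k}}$ and $\rho^{\tau}_{\mathbf{k}}$ in Eq. A4, the exponent in the integrand of $\mathcal{Q}_{\rm sol}$ may be expressed as

$$\begin{aligned}
&-\frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \Big[ \big( \langle\sigma|\rho_{-\mathbf{k}}\rangle + \langle z|c_{-\mathbf{k}}\rangle \big) \psi_{\mathbf{k}} + \langle N|\rho_{-\mathbf{k}}\rangle w_{\mathbf{k}} \Big] - \mathcal{T}[\{\mathbf{R}_{\alpha,\tau}\}] \\
&= -\frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \psi_{\mathbf{k}} \left[ (|z\rangle)_s \sum_{a=1}^{n_s} e^{-i\mathbf{k}\cdot\mathbf{r}_a} + (|z\rangle)_c \sum_{a=n_s+1}^{n_s+n_c} e^{-i\mathbf{k}\cdot\mathbf{r}_a} \right] \\
&\quad - \sum_{\alpha=1}^{n_p} \left[ \frac{3}{2l^2} \sum_{\tau=1}^{N-1} \left( \mathbf{R}_{\alpha,\tau+1} - \mathbf{R}_{\alpha,\tau} \right)^2 + \frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \sum_{\tau=1}^{N} \left( \sigma_\tau \psi_{\mathbf{k}} + w_{\mathbf{k}} \right) e^{-i\mathbf{k}\cdot\mathbf{R}_{\alpha,\tau}} \right],
\end{aligned} \tag{A12}$$

where $(|z\rangle)_s = {\rm sign}(q_c) z_s$ for salt ions and $(|z\rangle)_c = -{\rm sign}(q_c) z_c$ for counterions as defined above. The coordinates of individual small ions and polymers are decoupled in this expression. Thus, the coordinate integrals in $\mathcal{Q}_{\rm sol}$ are also decoupled, allowing it to be written as

$$\mathcal{Q}_{\rm sol}[\psi, w] = \left( \mathcal{Q}_s[\psi] \right)^{n_s} \left( \mathcal{Q}_c[\psi] \right)^{n_c} \left( \mathcal{Q}_p[\psi, w] \right)^{n_p}, \tag{A13}$$

where the $n_s$, $n_c$, and $n_p$ superscripts are powers, with $\mathcal{Q}_s$ and $\mathcal{Q}_c$ being the single-molecule partition functions for salt ions and counterions, respectively; $[\psi]$ is shorthand for $[\{\psi_{\mathbf{k}}\}]$ and $[\psi, w]$ is shorthand for $[\{\psi_{\mathbf{k}}\}, \{w_{\mathbf{k}}\}]$. These single-molecule small-ion partition functions are given by

$$\mathcal{Q}_{s,c}[\psi] = \int d\mathbf{r}_{s,c} \exp\left\{ -\frac{i (|z\rangle)_{s,c}}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \psi_{\mathbf{k}}\, e^{-i\mathbf{k}\cdot\mathbf{r}_{s,c}} \right\}, \tag{A14}$$

where the expression for $\mathcal{Q}_s$ or $\mathcal{Q}_c$ corresponds, respectively, to choosing the subscript "$s$" or "$c$" for the "$s,c$" notation in Eq. A14. The single-polymer partition function $\mathcal{Q}_p$ in Eq. A13 equals

$$\mathcal{Q}_p[\psi, w] = \int \mathscr{D}[\mathbf{R}]\, e^{-\mathcal{H}_p[\mathbf{R};\psi,w]}, \tag{A15}$$

where $\mathscr{D}[\mathbf{R}] \equiv \prod_{\tau=1}^{N} d\mathbf{R}_\tau$, $[\mathbf{R};\psi,w]$ is shorthand for $[\{\mathbf{R}_\tau\},\{\psi_{\mathbf{k}}\},\{w_{\mathbf{k}}\}]$, and

$$\mathcal{H}_p[\mathbf{R};\psi,w] = \frac{3}{2l^2} \sum_{\tau=1}^{N-1} \left( \mathbf{R}_{\tau+1} - \mathbf{R}_\tau \right)^2 + \frac{i}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \sum_{\tau=1}^{N} \left( \sigma_\tau \psi_{\mathbf{k}} + w_{\mathbf{k}} \right) e^{-i\mathbf{k}\cdot\mathbf{R}_\tau}. \tag{A16}$$

It should be noted that the small-ion label $a$ and the polymer label $\alpha$ are not needed in the single-molecule partition functions in Eqs. A14 and A15. Collecting results from Eqs. A9, A10 and A13 yields the following formula for $\mathcal{Z}'$:

$$\mathcal{Z}' = \mathcal{Z}_0 \int \prod_{\mathbf{k}\neq\mathbf{0}} \frac{d\psi_{\mathbf{k}}\, dw_{\mathbf{k}}}{2\pi\Omega\sqrt{\lambda_k v}} \exp\left\{ -\frac{1}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \left[ \frac{|\psi_{\mathbf{k}}|^2}{\lambda_k} + \frac{|w_{\mathbf{k}}|^2}{v} \right] + n_s \ln \mathcal{Q}_s + n_c \ln \mathcal{Q}_c + n_p \ln \mathcal{Q}_p \right\}, \tag{A17}$$

where $\mathcal{Z}_0$ is provided by Eq. A10, and $\mathcal{Q}_s$, $\mathcal{Q}_c$, and $\mathcal{Q}_p$ are given by Eqs. A14–A16.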
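The Hubbard-Stratonovich step in Eq. A9 rests on a Gaussian identity. Its one-variable analogue, $e^{-\lambda a^2/2} = \int d\psi\, (2\pi\lambda)^{-1/2}\, e^{-\psi^2/(2\lambda) - i a \psi}$, can be checked numerically. The sketch below is a simplified scalar illustration, not the full field-theoretic transformation; the values of $\lambda$ and $a$, the integration window, and the function name are arbitrary choices made for this example.

```python
import cmath
import math


def hs_integral(lam: float, a: float, half_width: float = 12.0, n: int = 20001) -> complex:
    """Riemann-sum estimate of int dpsi exp(-psi^2/(2 lam) - i a psi) / sqrt(2 pi lam)."""
    sigma = math.sqrt(lam)                      # width of the Gaussian weight
    lo = -half_width * sigma                    # integrand is negligible beyond this window
    step = 2.0 * half_width * sigma / (n - 1)
    total = 0.0 + 0.0j
    for j in range(n):
        psi = lo + j * step
        total += cmath.exp(-psi * psi / (2.0 * lam) - 1j * a * psi)
    return total * step / math.sqrt(2.0 * math.pi * lam)


lam, a = 0.8, 1.3                               # arbitrary test values
lhs = math.exp(-0.5 * lam * a * a)              # quadratic form, as in exp(-U)
rhs = hs_integral(lam, a)                       # linear coupling to psi, integrated over psi
print(abs(rhs - lhs))
```

Here $a$ stands in for a density mode and $\lambda$ for the interaction kernel. The identity trades a term quadratic in $a$ for one linear in $a$, which is what allows the coordinate integrals to decouple in Eq. A13.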
2. Fluctuation expansion of partition function
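To evaluate Eq. A17 analytically, we first derive a mean-field solution at $(\psi, w) = (\bar\psi, \bar w)$ in which the mean conjugate fields $\bar\psi$ and $\bar w$ satisfy the extremum condition $(\delta\mathcal{Z}'/\delta\psi_{\mathbf{k}}) = (\delta\mathcal{Z}'/\delta w_{\mathbf{k}}) = 0$, which leads to the equalities

$$\frac{\bar\psi_{\mathbf{k}}}{\Omega\lambda_k} = \frac{n_s}{\mathcal{Q}_s} \left( \frac{\delta\mathcal{Q}_s}{\delta\psi_{\mathbf{k}}} \right)_{(\bar\psi,\bar w)} + \frac{n_c}{\mathcal{Q}_c} \left( \frac{\delta\mathcal{Q}_c}{\delta\psi_{\mathbf{k}}} \right)_{(\bar\psi,\bar w)} + \frac{n_p}{\mathcal{Q}_p} \left( \frac{\delta\mathcal{Q}_p}{\delta\psi_{\mathbf{k}}} \right)_{(\bar\psi,\bar w)}, \tag{A18a}$$
$$\frac{\bar w_{\mathbf{k}}}{\Omega v} = \frac{n_p}{\mathcal{Q}_p} \left( \frac{\delta\mathcal{Q}_p}{\delta w_{\mathbf{k}}} \right)_{(\bar\psi,\bar w)}, \tag{A18b}$$

where the subscript $(\bar\psi,\bar w)$ indicates that the functional (field) derivatives are evaluated at the to-be-solved mean conjugate fields. The $\psi$ and $w$ fields are conjugates, respectively, to charge density and mass density. By using Eqs. A14–A16 and the fact that the averages $\langle\cdots\rangle_x$ over the spatial coordinates of the given molecular species ($x = p$, $s$, or $c$) of the $\mathbf{k}$-space density operators in Eq. A4 are given by $\langle\rho^{\tau}_{\mathbf{k}}\rangle_p = n_p \langle e^{i\mathbf{k}\cdot\mathbf{R}_\tau}\rangle_p$, $\langle c^{s}_{\mathbf{k}}\rangle_s = n_s \langle e^{i\mathbf{k}\cdot\mathbf{r}_s}\rangle_s$, and $\langle c^{c}_{\mathbf{k}}\rangle_c = n_c \langle e^{i\mathbf{k}\cdot\mathbf{r}_c}\rangle_c$ because of the decoupling stated above by Eq. A13, the first-order derivatives in Eq. A18 are given by

$$\frac{n_{s,c}}{\mathcal{Q}_{s,c}} \frac{\delta\mathcal{Q}_{s,c}}{\delta\psi_{\mathbf{k}}} = -\frac{i (|z\rangle)_{s,c}\, n_{s,c}}{\Omega} \left( \left\langle e^{-i\mathbf{k}\cdot\mathbf{r}_{s,c}} \right\rangle_{s,c} \right)_{(\psi,w)} = -\frac{i (|z\rangle)_{s,c}}{\Omega} \left( \left\langle c^{s,c}_{-\mathbf{k}} \right\rangle_{s,c} \right)_{(\psi,w)}, \tag{A19a}$$
$$\frac{n_p}{\mathcal{Q}_p} \frac{\delta\mathcal{Q}_p}{\delta\psi_{\mathbf{k}}} = -\frac{i n_p}{\Omega} \left( \left\langle \sum_{\tau=1}^{N} \sigma_\tau e^{-i\mathbf{k}\cdot\mathbf{R}_\tau} \right\rangle_p \right)_{(\psi,w)} = -\frac{i}{\Omega} \sum_{\tau=1}^{N} \sigma_\tau \left( \left\langle \rho^{\tau}_{-\mathbf{k}} \right\rangle_p \right)_{(\psi,w)}, \tag{A19b}$$
$$\frac{n_p}{\mathcal{Q}_p} \frac{\delta\mathcal{Q}_p}{\delta w_{\mathbf{k}}} = -\frac{i n_p}{\Omega} \left( \left\langle \sum_{\tau=1}^{N} e^{-i\mathbf{k}\cdot\mathbf{R}_\tau} \right\rangle_p \right)_{(\psi,w)} = -\frac{i}{\Omega} \sum_{\tau=1}^{N} \left( \left\langle \rho^{\tau}_{-\mathbf{k}} \right\rangle_p \right)_{(\psi,w)}, \tag{A19c}$$

where $(\langle\cdots\rangle_x)_{(\psi,w)}$ denotes averaging over the spatial coordinates of the given molecular species evaluated for any given conjugate fields $\psi, w$. With Eq. A19, the relations in Eq. A18 for the mean conjugate fields become

$$\bar\psi_{\mathbf{k}} = -i\lambda_k \left( \left\langle \left[ \langle\sigma|\rho_{-\mathbf{k}}\rangle + \langle z|c_{-\mathbf{k}}\rangle \right] \right\rangle_{s,c,p} \right)_{(\bar\psi,\bar w)}, \qquad \bar w_{\mathbf{k}} = -i v \left( \left\langle \left[ \langle N|\rho_{-\mathbf{k}}\rangle \right] \right\rangle_p \right)_{(\bar\psi,\bar w)}, \tag{A20}$$

which can now be solved self-consistently to determine $\bar\psi_{\mathbf{k}}$ and $\bar w_{\mathbf{k}}$.

We proceed to obtain an approximate solution by assuming that within regions where the system exists as a single phase, the mass density is rather homogeneous. In that case, the $\mathbf{k} \neq \mathbf{0}$ components of the density operators $\rho^{\tau}_{\mathbf{k}}$, $c^{s}_{\mathbf{k}}$, and $c^{c}_{\mathbf{k}}$ in Eq. A4 are small (approximately zero). It then follows from Eq. A20 that

$$\bar\psi_{\mathbf{k}} \approx \bar w_{\mathbf{k}} \approx 0, \quad \forall\, \mathbf{k} \neq \mathbf{0}. \tag{A21}$$

These considerations imply that the following approximate relations hold for the averaged densities on the right-hand side of Eq. A19:

$$\left\langle c^{s,c}_{-\mathbf{k}} \right\rangle_{\approx 0} \approx n_{s,c}\, \delta_{\mathbf{k},\mathbf{0}}, \qquad \sum_{\tau=1}^{N} \sigma_\tau \left\langle \rho^{\tau}_{-\mathbf{k}} \right\rangle_{\approx 0} \approx q_c n_p N\, \delta_{\mathbf{k},\mathbf{0}}, \qquad \sum_{\tau=1}^{N} \left\langle \rho^{\tau}_{-\mathbf{k}} \right\rangle_{\approx 0} \approx n_p N\, \delta_{\mathbf{k},\mathbf{0}}, \tag{A22}$$

where the "$\approx 0$" subscript in $\langle\cdots\rangle_{\approx 0}$ signifies that the given average over the $s$, $c$, or $p$ spatial coordinates is evaluated at the conjugate fields in Eq. A21 for approximately homogeneous densities. Now, to arrive at a definite approximate description, we expand the logarithmic small-ion partition functions around $\psi_{\mathbf{k}\neq\mathbf{0}} = 0$ up to $O(\delta\psi^2)$. Utilizing the expressions for the averaged densities in Eq. A22 and replacing the conjugate field $\bar\psi_{\mathbf{k}\neq\mathbf{0}} \approx \psi_{\mathbf{k}\neq\mathbf{0}} = 0$, we obtain

$$\begin{aligned}
\ln \mathcal{Q}_{s,c}[\psi] &\approx \ln \mathcal{Q}_{s,c}[\psi_{\mathbf{k}\neq\mathbf{0}} = 0] + \sum_{\mathbf{k}\neq\mathbf{0}} \left( \frac{\delta \ln \mathcal{Q}_{s,c}}{\delta\psi_{\mathbf{k}}} \right)_0 \delta\psi_{\mathbf{k}} + \frac{1}{2} \sum_{\mathbf{k},\mathbf{k}'\neq\mathbf{0}} \left( \frac{\delta^2 \ln \mathcal{Q}_{s,c}}{\delta\psi_{\mathbf{k}}\, \delta\psi_{\mathbf{k}'}} \right)_0 \delta\psi_{\mathbf{k}}\, \delta\psi_{\mathbf{k}'} \\
&= \ln\Omega - \frac{i (|z\rangle)_{s,c}}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \left\langle e^{-i\mathbf{k}\cdot\mathbf{r}_{s,c}} \right\rangle_0 \delta\psi_{\mathbf{k}} - \frac{z_{s,c}^2}{2\Omega^2} \sum_{\mathbf{k},\mathbf{k}'\neq\mathbf{0}} \left[ \left\langle e^{-i(\mathbf{k}+\mathbf{k}')\cdot\mathbf{r}_{s,c}} \right\rangle_0 - \left\langle e^{-i\mathbf{k}\cdot\mathbf{r}_{s,c}} \right\rangle_0 \left\langle e^{-i\mathbf{k}'\cdot\mathbf{r}_{s,c}} \right\rangle_0 \right] \delta\psi_{\mathbf{k}}\, \delta\psi_{\mathbf{k}'} \\
&= \ln\Omega - \frac{z_{s,c}^2}{2\Omega^2} \sum_{\mathbf{k}\neq\mathbf{0}} |\psi_{\mathbf{k}}|^2,
\end{aligned} \tag{A23}$$

where the "0" subscript in $(\cdots)_0$ indicates that the derivatives are evaluated at $\psi_{\mathbf{k}\neq\mathbf{0}} = 0$. Similarly, replacing the "$\approx 0$" subscripts in Eq. A22, here the "0" subscript in $\langle\cdots\rangle_0$ indicates that the average is evaluated at $\psi_{\mathbf{k}\neq\mathbf{0}} = 0$. In the last line of Eq. A23, the expansion variable $\delta\psi_{\mathbf{k}}$ is written as $\psi_{\mathbf{k}}$ for every term in the $\sum_{\mathbf{k}\neq\mathbf{0}}$ summation because the expansion is around $\psi_{\mathbf{k}\neq\mathbf{0}} = 0$. Substituting Eq. A23 for $\ln\mathcal{Q}_s$ and $\ln\mathcal{Q}_c$ into Eq. A17 yields

$$\mathcal{Z}' \approx \mathcal{Z}_0 \int \prod_{\mathbf{k}\neq\mathbf{0}} \frac{d\psi_{\mathbf{k}}\, dw_{\mathbf{k}}}{2\pi\Omega\sqrt{\lambda_k v}} \exp\left\{ -\frac{1}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \left[ |\psi_{\mathbf{k}}|^2 \left( \frac{1}{\lambda_k} + z_s^2 \rho_s + z_c^2 \rho_c \right) + \frac{|w_{\mathbf{k}}|^2}{v} \right] + n_p \ln \mathcal{Q}_p + C \right\}, \tag{A24}$$

where $C = (n_s + n_c) \ln\Omega$ will be dropped in subsequent consideration because it has no effect on the relative free energies of different configurational states. Let the exponent in Eq. A24 without $C$ be denoted as $-H$; then $H$ may be seen as a Hamiltonian of a polymer system:

$$H[\psi, w] = \frac{1}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \left[ \nu_k \psi_{-\mathbf{k}} \psi_{\mathbf{k}} + \frac{w_{-\mathbf{k}} w_{\mathbf{k}}}{v} \right] - n_p \ln \mathcal{Q}_p[\psi, w], \tag{A25}$$

where

$$\frac{1}{\nu_k} = \frac{1}{1/\lambda_k + z_s^2 \rho_s + z_c^2 \rho_c} \equiv \frac{4\pi l_{\rm B}}{k^2 + \kappa^2} \tag{A26}$$

is merely a Fourier-transformed Coulomb potential with screening length $1/\kappa = [4\pi l_{\rm B} (z_s^2 \rho_s + z_c^2 \rho_c)]^{-1/2}$. We may now express $\mathcal{Z}'$ as a product of three components, viz.,

$$\mathcal{Z}' = \mathcal{Z}_0\, \mathcal{Z}_{\rm ion}\, \mathcal{Z}'_p, \tag{A27}$$

where $\mathcal{Z}_0$ is defined in Eq. A10,

$$\mathcal{Z}_{\rm ion} = \prod_{\mathbf{k}\neq\mathbf{0}} \frac{1}{\sqrt{\nu_k \lambda_k}} = \prod_{\mathbf{k}\neq\mathbf{0}} \left[ 1 + \frac{\kappa^2}{k^2} \right]^{-1/2}, \tag{A28}$$

and

$$\mathcal{Z}'_p = \prod_{\mathbf{k}\neq\mathbf{0}} \int \sqrt{\frac{\nu_k}{v}}\, \frac{d\psi_{\mathbf{k}}\, dw_{\mathbf{k}}}{2\pi\Omega}\, e^{-H[\psi, w]}. \tag{A29}$$

Accordingly, the complete partition function $\mathcal{Z} = \mathcal{Z}'/(n_s!\, n_c!\, n_p!\, n_w!)$ provides the free energy of the system in units of $k_{\rm B}T$ per volume $l^3$:

$$f = -\frac{l^3}{\Omega} \ln \mathcal{Z} = -s + f_{\rm ion} + f_p + f_0, \tag{A30}$$

where

$$-s = \frac{l^3}{\Omega} \ln(n_s!\, n_c!\, n_p!\, n_w!), \tag{A31}$$
$$f_0 = -\frac{l^3}{\Omega} \ln \mathcal{Z}_0 = \frac{v\, l^3}{2\Omega^2} (n_p N)^2 = \frac{l^3 v}{2} \rho_m^2, \tag{A32}$$
$$f_{\rm ion} = -\frac{l^3}{\Omega} \ln \mathcal{Z}_{\rm ion} = \frac{l^3}{2\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \ln\left[ 1 + \frac{\kappa^2}{k^2} \right] = -\frac{(\kappa l)^3}{12\pi} + I, \tag{A33}$$
$$f_p = -\frac{l^3}{\Omega} \ln \mathcal{Z}'_p. \tag{A34}$$

3. Small-ion free energy

The first term of $f_{\rm ion}$ in Eq. A33 is the standard Debye screening energy. The second term of $f_{\rm ion}$, $I \propto l^3 \kappa^2 k_{\max}$, is formally divergent ($k_{\max}$ is the maximum $k$ value of the system, corresponding to the smallest length scale in coordinate space; $I \to \infty$ as $k_{\max} \to \infty$) but since it is linearly proportional to $n_s$ and $n_c$ (through its dependence on $\kappa$, see above), this formally divergent term is irrelevant to the relative free energies of different configurational states of the system. As in most analyses, the $\mathbf{k}$-summation is performed here by replacing it with a continuous integral over $\mathbf{k}$-space:

$$\frac{1}{\Omega} \sum_{\mathbf{k}\neq\mathbf{0}} \to \int \frac{d^3k}{(2\pi)^3}. \tag{A35}$$

To make our model physically more realistic, however, we follow Muthukumar, who treated the charge of each small ion as distributed over a finite volume with a characteristic length scale comparable to the bare Kuhn length $l$ of the polymers. In this treatment, the point-charge expression for $f_{\rm ion}$ in Eq. A33 is replaced by

$$f_{\rm ion} = -\frac{1}{4\pi} \left[ \ln(1 + \kappa l) - \kappa l + \frac{1}{2} (\kappa l)^2 \right], \tag{A36}$$

which reduces to $-(\kappa l)^3/(12\pi)$ in Eq. A33, as it should, in the limit of $\kappa l \to 0$. In this regard, Eq. A36, which is used for all rG-RPA and fG-RPA applications in the present work, may be viewed as a regularized, more physical version of Eq. A33.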
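The limiting behavior stated for Eq. A36 can be checked with a short numerical sketch. It takes the prefactors to be $1/(4\pi)$ for the finite-size expression and $1/(12\pi)$ for the Debye term, which are the values consistent with the stated $\kappa l \to 0$ limit, since $\ln(1+x) - x + x^2/2 \approx x^3/3$ for small $x$. The function names are hypothetical, and all lengths are assumed to be expressed in the same units.

```python
import math


def screening_kappa(l_b: float, z_s: float, rho_s: float, z_c: float, rho_c: float) -> float:
    """Inverse screening length: kappa = sqrt(4 pi l_B (z_s^2 rho_s + z_c^2 rho_c))."""
    return math.sqrt(4.0 * math.pi * l_b * (z_s**2 * rho_s + z_c**2 * rho_c))


def f_ion_finite_size(kappa_l: float) -> float:
    """Finite-size small-ion free energy: -(1/(4 pi)) [ln(1 + x) - x + x^2/2], x = kappa*l."""
    x = kappa_l
    return -(math.log1p(x) - x + 0.5 * x * x) / (4.0 * math.pi)


def f_ion_point_charge(kappa_l: float) -> float:
    """Finite (Debye) part of the point-charge expression: -(kappa*l)^3 / (12 pi)."""
    return -(kappa_l**3) / (12.0 * math.pi)


# The ratio approaches 1 as kappa*l decreases, i.e. the two expressions agree at low salt.
for x in (1.0, 0.1, 0.01):
    ratio = f_ion_finite_size(x) / f_ion_point_charge(x)
    print(f"kappa*l = {x:5.2f}  ratio = {ratio:.4f}")
```

The two expressions differ appreciably once $\kappa l$ is of order one, which is the regime in which the finite ion size matters.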
4. Polymer free energy
We now proceed to derive an approximate, tractable analytical expression for $Z'_p$ in Eq. 7 in the main text and Eq. A29 by expanding $\ln Q_p$ (defined in Eqs. A15 and A16) around $\psi_{\mathbf{k}\neq 0} = w_{\mathbf{k}\neq 0} = 0$, viz.,

\[
\begin{aligned}
\ln Q_p[\psi,w] ={}& \ln Q_p[\psi_{\mathbf{k}\neq 0}=w_{\mathbf{k}\neq 0}=0] \\
&+ \sum_{\mathbf{k}\neq 0}\sum_{\tau=1}^{N}\left(\frac{\delta \ln Q_p}{\delta\varphi^{\tau}_{\mathbf{k}}}\right)_{0}\varphi^{\tau}_{\mathbf{k}}
+ \frac{1}{2}\sum_{\mathbf{k},\mathbf{k}'\neq 0}\sum_{\tau,\mu=1}^{N}\left(\frac{\delta^{2}\ln Q_p}{\delta\varphi^{\tau}_{\mathbf{k}}\,\delta\varphi^{\mu}_{\mathbf{k}'}}\right)_{0}\varphi^{\tau}_{\mathbf{k}}\varphi^{\mu}_{\mathbf{k}'} + O(\varphi^{3}) \\
={}& \ln\Omega + \frac{3(N-1)}{2}\ln\left(\frac{2\pi l^{2}}{3}\right)
- \frac{1}{2\Omega^{2}}\sum_{\mathbf{k}\neq 0}\sum_{\tau,\mu=1}^{N}\left\langle e^{-i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)}\right\rangle_{0}\varphi^{\tau}_{\mathbf{k}}\varphi^{\mu}_{-\mathbf{k}} + O(\varphi^{3}),
\end{aligned}
\tag{A37}
\]

where $\varphi^{\tau}_{\mathbf{k}} = \sigma_\tau\psi_{\mathbf{k}} + w_{\mathbf{k}}$ and the first term in the second line vanishes because of Eq. A22. As in Eq. A23, the first two constant terms in the last line of the above equation have no effect on the relative energies of different configurations of the system and therefore will be discarded for our present purpose. The third term in the last line of Eq. A37 is the intrachain monomer-monomer correlation function evaluated at $\psi_{\mathbf{k}\neq 0} = w_{\mathbf{k}\neq 0} = 0$. This correlation function is equal to that of a Gaussian chain. However, in the presence of intra- and interchain interactions, a Gaussian-chain description of the polymer chains in our system is unsatisfactory, as has been demonstrated by theoretical and experimental studies showing that polymers with different net charges and heteropolymers with different charge sequences (even when they have the same net charge) can have dramatically different conformational characteristics. Intuitively, this sequence-dependent conformational heterogeneity should apply not only to the case when a polymer chain is isolated but also to situations in which polymer chains are in semidilute solutions. To account for this fundamental property in the monomer-monomer correlation function, we need to include nonzero $\psi_{\mathbf{k}\neq 0}$ and $w_{\mathbf{k}\neq 0}$ fluctuations that arise from the higher-order terms in Eq. A37. Accordingly, based on a rationale similar to that advanced in Refs. 59, 71, 72, we replace the monomer-monomer correlation function in Eq. A37 by a correlation function involving arbitrary fields. This development leads to

\[
\ln Q_p[\psi,w] \simeq -\frac{N}{2\Omega^{2}}\sum_{\mathbf{k}\neq 0}\left[\xi_{\mathbf{k}}\psi_{\mathbf{k}}\psi_{-\mathbf{k}} + g_{\mathbf{k}}w_{\mathbf{k}}w_{-\mathbf{k}} + 2\zeta_{\mathbf{k}}w_{\mathbf{k}}\psi_{-\mathbf{k}}\right],
\tag{A38}
\]

where $\xi$, $g$, and $\zeta$ are structure factors of mass and charge densities,

\[
\xi_{\mathbf{k}} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau\sigma_\mu\left(\left\langle e^{i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)}\right\rangle_{p}\right)(\psi,w),
\tag{A39a}
\]
\[
g_{\mathbf{k}} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\left(\left\langle e^{i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)}\right\rangle_{p}\right)(\psi,w),
\tag{A39b}
\]
\[
\zeta_{\mathbf{k}} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau\left(\left\langle e^{i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)}\right\rangle_{p}\right)(\psi,w).
\tag{A39c}
\]

Substituting Eq. A38 for $\ln Q_p$ in Eq. A25, we obtain

\[
\mathcal{H}[\psi,w] = \frac{1}{2\Omega}\sum_{\mathbf{k}\neq 0}
\left\langle \psi_{-\mathbf{k}}\;\, w_{-\mathbf{k}} \right|
\begin{pmatrix}
\nu_{\mathbf{k}} + \rho_m\xi_{\mathbf{k}} & \rho_m\zeta_{\mathbf{k}} \\
\rho_m\zeta_{\mathbf{k}} & v_2^{-1} + \rho_m g_{\mathbf{k}}
\end{pmatrix}
\left| \begin{matrix}\psi_{\mathbf{k}} \\ w_{\mathbf{k}}\end{matrix} \right\rangle
= \frac{1}{2\Omega}\sum_{\mathbf{k}\neq 0}\left\langle\Psi_{-\mathbf{k}}\right|\hat{\Delta}_{\mathbf{k}}\left|\Psi_{\mathbf{k}}\right\rangle,
\tag{A40}
\]

where $\langle\Psi_{-\mathbf{k}}| \equiv \langle\psi_{-\mathbf{k}}\;\, w_{-\mathbf{k}}|$, $|\Psi_{\mathbf{k}}\rangle = (\langle\Psi_{-\mathbf{k}}|)^{*T}$, and $\hat{\Delta}_{\mathbf{k}}$ is the $2\times 2$ matrix in Eq. A40, which leads to

\[
Z'_p = \prod_{\mathbf{k}\neq 0}\sqrt{\frac{\nu_{\mathbf{k}}}{v_2\det\hat{\Delta}_{\mathbf{k}}}}.
\tag{A41}
\]

Therefore, by Eqs. A35 and A41, the unit free energy is now formally given by

\[
f_p = -\frac{l^{3}\ln Z'_p}{\Omega}
= \frac{l^{3}}{2}\int\frac{d^{3}k}{(2\pi)^{3}}\ln\left[1 + \rho_m\left(\frac{\xi_{\mathbf{k}}}{\nu_{\mathbf{k}}} + v_2 g_{\mathbf{k}}\right) + \frac{v_2}{\nu_{\mathbf{k}}}\rho_m^{2}\left(\xi_{\mathbf{k}}g_{\mathbf{k}} - \zeta_{\mathbf{k}}^{2}\right)\right].
\tag{A42}
\]

It should be noted, however, that the $k \equiv |\mathbf{k}| \to \infty$ behavior of the integrand in Eq. A42 needs to be regularized. For point particles, the $k\to\infty$ limit of the pairwise correlation function is a Kronecker-$\delta$:

\[
\lim_{k\to\infty}\left\langle e^{i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)}\right\rangle_{p} = \delta_{\tau\mu}.
\tag{A43}
\]

Thus, by Eq. A39,

\[
\lim_{k\to\infty}\xi_{\mathbf{k}} = \frac{1}{N}\sum_{\tau=1}^{N}\sigma_\tau^{2},
\tag{A44a}
\]
\[
\lim_{k\to\infty}g_{\mathbf{k}} = 1,
\tag{A44b}
\]
\[
\lim_{k\to\infty}\zeta_{\mathbf{k}} = q_c.
\tag{A44c}
\]

Because $\lim_{k\to\infty}(1/\nu_{\mathbf{k}}) = \lim_{k\to\infty}4\pi l_B/k^{2}$ and $v_2 > 0$, Eq. A44 indicates that the integral in Eq. A42 has an ultraviolet (large-$k$) divergence. This divergence is physically irrelevant, however, because the integral can be readily regularized by subtracting the unphysical Coulomb self-energy of the charged monomers

\[
f_{\rm self} = \frac{\rho_m l^{3}}{2N}\int\frac{d^{3}k}{(2\pi)^{3}}\,\frac{4\pi l_B}{k^{2}}\sum_{\tau=1}^{N}\sigma_\tau^{2}
\tag{A45}
\]

that was included merely for formulational convenience in the first place. In the same vein as the charge smearing for the small ions (Eq. A36), we also smear the $\delta$-function excluded volume repulsion by a Gaussian, viz.,

\[
v_2 \to v_2(k) = v_2\, e^{-(kl)^{2}/6},
\tag{A46}
\]

and use $v_2(k)$ in the integral of Eq. A42 for $f_p$ to give a $v_2$-regularized $f_p[v_2(k)]$. The regularized $f_p$ resulting from these two procedures is then given by

\[
f_p[v_2(k)] - f_{\rm self} \to f_p,
\tag{A47}
\]

where the last arrow signifies that this regularized version of $f_p$ is the one used for our subsequent theoretical development in the present work.

As discussed above, the present separate treatments for small ions (Eq. A36) and polymers (Eqs. A42 and A47) are needed in our formulation, which expresses the total partition function as a product consisting of separate factors for small ions and polymers (Eq. A28), such that the polymer part of the partition function can be used to derive an effective Kuhn length. Not surprisingly, in the event that the bare chain length $l$ is used instead of an effective Kuhn length and that the volume of small ions and the volume of the monomers of the polymers become negligible ($v_2 \to 0$), the present formulation reduces to our previous simple RPA theory, as can be readily seen in the following. First, when the size of the small ions is assumed to be negligible, their free energy is given by the simple Debye-Hückel expression in Eq. A33 instead of the finite-size expression in Eq. A36. Second, as $v_2 \to 0$, the terms containing $v_2$ in Eq. A42 vanish. Consequently, the resulting overall electrostatic free energy, denoted here as $f_{\rm el}$, is given by

\[
f_{\rm el} = f_{\rm ion}^{\rm (Eq.\,A33)} + f_{p}^{\rm (Eq.\,A42)}(v_2\to 0)
= \frac{l^{3}}{2}\int\frac{d^{3}k}{(2\pi)^{3}}\left\{\ln\left[1 + \frac{\kappa^{2}}{k^{2}}\right] + \ln\left[1 + \frac{\rho_m\xi_{\mathbf{k}}}{\nu_{\mathbf{k}}}\right]\right\}.
\tag{A48}
\]

Recalling that $\kappa^{2} = 4\pi l_B(z_s^{2}\rho_s + z_c^{2}\rho_c)$ and $1/\nu_{\mathbf{k}} = 4\pi l_B/(k^{2}+\kappa^{2})$ (Eq. A26), this quantity becomes

\[
\begin{aligned}
f_{\rm el} &= \frac{l^{3}}{2}\int\frac{d^{3}k}{(2\pi)^{3}}\ln\left[\frac{\kappa^{2}+k^{2}}{k^{2}}\times\frac{k^{2}+\kappa^{2}+4\pi l_B\rho_m\xi_{\mathbf{k}}}{k^{2}+\kappa^{2}}\right] \\
&= \frac{l^{3}}{2}\int\frac{d^{3}k}{(2\pi)^{3}}\ln\left[1 + \frac{\kappa^{2}+4\pi l_B\rho_m\xi_{\mathbf{k}}}{k^{2}}\right] \\
&= \frac{l^{3}}{2}\int\frac{d^{3}k}{(2\pi)^{3}}\ln\left[1 + \frac{4\pi l_B}{k^{2}}\left(z_s^{2}\rho_s + z_c^{2}\rho_c + \rho_m\xi_{\mathbf{k}}\right)\right],
\end{aligned}
\tag{A49}
\]

which is exactly the $f_{\rm el}$ expression of our previous simple RPA theory, a formulation that does not consider an explicit excluded-volume repulsion term and treats small ions and polymers on the same footing.
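The reduction from Eq. A48 to Eq. A49 is a pointwise identity between integrands, so it can be checked numerically at any wavenumber. The following is a minimal sketch of such a check, not part of the original derivation. It assumes the standard Debye form $\kappa^{2} = 4\pi l_B(z_s^{2}\rho_s + z_c^{2}\rho_c)$ and $1/\nu_k = 4\pi l_B/(k^{2}+\kappa^{2})$; the function names and the parameter values in the usage lines are hypothetical and chosen only for illustration.

```python
import math

def integrand_a48(k, l_B, z_s, rho_s, z_c, rho_c, rho_m, xi_k):
    """Integrand of Eq. A48: ln[1 + kappa^2/k^2] + ln[1 + rho_m*xi_k/nu_k]."""
    # Assumed Debye screening form (see lead-in).
    kappa2 = 4.0 * math.pi * l_B * (z_s ** 2 * rho_s + z_c ** 2 * rho_c)
    inv_nu = 4.0 * math.pi * l_B / (k * k + kappa2)
    return math.log(1.0 + kappa2 / (k * k)) + math.log(1.0 + rho_m * xi_k * inv_nu)

def integrand_a49(k, l_B, z_s, rho_s, z_c, rho_c, rho_m, xi_k):
    """Integrand of the last line of Eq. A49 (simple-RPA form)."""
    charge_density = z_s ** 2 * rho_s + z_c ** 2 * rho_c + rho_m * xi_k
    return math.log(1.0 + 4.0 * math.pi * l_B / (k * k) * charge_density)

if __name__ == "__main__":
    # Illustrative (hypothetical) parameters in units of l = 1.
    params = dict(l_B=0.7, z_s=1, rho_s=0.01, z_c=1, rho_c=0.02, rho_m=0.05, xi_k=0.8)
    for k in (0.1, 1.0, 10.0):
        print(k, integrand_a48(k, **params), integrand_a49(k, **params))
```

The two columns printed for each $k$ should agree to floating-point precision, which reflects the cancellation of the factor $k^{2}+\kappa^{2}$ in the second line of Eq. A49.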
5. Effective Gaussian-chain model for two-body correlation function
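The $(\psi,w)$-dependence of the structure factors $\xi$, $g$, and $\zeta$ in Eq. A42 for $f_p$ allows for an account of sequence-dependent conformational heterogeneity by using a Gaussian chain with a renormalized Kuhn length $l_1 = xl$ (instead of the "bare" Kuhn length $l$) to approximate the polymer partition function $Q_p$ in Eq. A15. Specifically, we make the approximation that

\[
Q_p \approx \int\mathcal{D}[\mathbf{R}]\,e^{-\mathcal{H}^{0}_p[\mathbf{R}]},
\quad\text{where}\quad
\mathcal{H}^{0}_p[\mathbf{R}] = \frac{3}{2l^{2}x}\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\tau+1}-\mathbf{R}_\tau\right)^{2}.
\tag{A50}
\]

The structure factors $\xi$, $g$, and $\zeta$ in Eq. A39 can then be readily expressed in terms of the yet-to-be-determined renormalization parameter $x$:

\[
\xi_{\mathbf{k}} \to \xi^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau\sigma_\mu\, e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|},
\tag{A51a}
\]
\[
g_{\mathbf{k}} \to g^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N} e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|},
\tag{A51b}
\]
\[
\zeta_{\mathbf{k}} \to \zeta^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau\, e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|}.
\tag{A51c}
\]

The renormalization parameter $x = l_1/l$ is determined using a sequence-specific variational approach introduced by Sawle and Ghosh, as follows. We first express the Hamiltonian $\mathcal{H}_p[\mathbf{R}]$ in Eq. A16 as $\mathcal{H}_p = \mathcal{H}^{0}_p + \mathcal{H}^{1}_p$, where $\mathcal{H}^{0}_p$ (given by Eq. A50) is the principal term and

\[
\mathcal{H}^{1}_p[\mathbf{R};\psi,w] = \frac{3}{2l^{2}}\left(1-\frac{1}{x}\right)\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\tau+1}-\mathbf{R}_\tau\right)^{2}
+ \frac{i}{\Omega}\sum_{\mathbf{k}\neq 0}\sum_{\tau=1}^{N}\left(\sigma_\tau\psi_{\mathbf{k}} + w_{\mathbf{k}}\right)e^{-i\mathbf{k}\cdot\mathbf{R}_\tau}
\tag{A52}
\]

is the perturbative term. Then, for any given physical quantity $A[\mathbf{R}]$, the perturbation expansion of its thermodynamic average over polymer configurations $\{\mathbf{R}_\tau\}$ and field fluctuations $\Psi = (\psi,w)$ is given by

\[
\begin{aligned}
\langle A[\mathbf{R}]\rangle ={}& \frac{\left\langle e^{-\mathcal{H}^{1}_p[\mathbf{R};\Psi]}A[\mathbf{R}]\right\rangle_{0,\Psi}}{\left\langle e^{-\mathcal{H}^{1}_p[\mathbf{R};\Psi]}\right\rangle_{0,\Psi}} \\
={}& \langle A[\mathbf{R}]\rangle_{0} + \left[\langle A[\mathbf{R}]\rangle_{0}\left\langle\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right\rangle_{0,\Psi} - \left\langle A[\mathbf{R}]\,\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right\rangle_{0,\Psi}\right] \\
&+ \frac{1}{2}\left[\left\langle A[\mathbf{R}]\left(\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right)^{2}\right\rangle_{0,\Psi} - \langle A[\mathbf{R}]\rangle_{0}\left\langle\left(\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right)^{2}\right\rangle_{0,\Psi}\right] \\
&+ \langle A[\mathbf{R}]\rangle_{0}\left\langle\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right\rangle^{2}_{0,\Psi} - \left\langle A[\mathbf{R}]\,\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right\rangle_{0,\Psi}\left\langle\mathcal{H}^{1}_p[\mathbf{R};\Psi]\right\rangle_{0,\Psi} + O\left((\mathcal{H}^{1}_p)^{3}\right),
\end{aligned}
\tag{A53}
\]

where the subscripts $0,\Psi$ in $\langle\cdots\rangle$ signify, respectively, that the average over $\{\mathbf{R}_\tau\}$'s is weighted by the Hamiltonian $\mathcal{H}^{0}_p[\mathbf{R}]$ in Eq. A50 and the average over field configurations is weighted by the Hamiltonian $\mathcal{H}[\psi,w]$ in Eq. A25. (Note that the meaning of the "0" subscript here is different from that for the averages evaluated at $\psi_{\mathbf{k}\neq 0} = w_{\mathbf{k}\neq 0} = 0$ in Eq. A23.) An $\mathcal{H}^{0}_p[\mathbf{R}]$ that provides a good description of the thermal properties of $A$ may then be obtained by minimizing $\langle A\rangle - \langle A\rangle_{0}$. This is accomplished by a partial optimization to seek a value of $x = l_1/l$ that would abolish the lowest-order nontrivial $\mathcal{H}^{1}_p$ contributions in Eq. A53.

To obtain a partially optimized $x = l_1/l$ that provides a good approximation for the monomer-monomer correlation function, $A$ is chosen to be the squared end-to-end distance of the polymer, i.e., $A = R_{ee}^{2} \equiv |\mathbf{R}_N - \mathbf{R}_1|^{2}$, because $R_{ee}^{2}$ is a simple yet effective measure of conformational dimensions of polymers. To facilitate this calculation, we express $\mathcal{H}^{1}_p$ in Eq. A52 as $\mathcal{H}^{1}_p = X_1 + X_2$, where

\[
X_1[\mathbf{R}] = \frac{3}{2l^{2}}\left(1-\frac{1}{x}\right)\sum_{\tau=1}^{N-1}\left(\mathbf{R}_{\tau+1}-\mathbf{R}_\tau\right)^{2},
\tag{A54a}
\]
\[
X_2[\mathbf{R};\Psi] = \frac{i}{\Omega}\sum_{\mathbf{k}\neq 0}\sum_{\tau=1}^{N}\left(\sigma_\tau\psi_{\mathbf{k}} + w_{\mathbf{k}}\right)e^{-i\mathbf{k}\cdot\mathbf{R}_\tau},
\tag{A54b}
\]

such that $X_1[\mathbf{R}]$ is independent of $\Psi$ and all of $\mathcal{H}^{1}_p$'s dependence on $\Psi$ is contained in $X_2[\mathbf{R};\Psi]$. It follows that the $\Psi$ average is trivial (i.e., it produces a multiplicative factor of unity and therefore can be omitted) for any function of $X_1[\mathbf{R}]$ only. In Eq. A53, the only contributions from terms linear in $X_1[\mathbf{R}]$ come from the first line on the right-hand side (after the second equality), which equal

\[
\left\langle R_{ee}^{2}\right\rangle_{0}\langle X_1\rangle_{0} - \left\langle R_{ee}^{2}X_1\right\rangle_{0} = -l^{2}(N-1)\,x(x-1).
\tag{A55}
\]

For the $X_2$-containing terms in Eq. A53, we first consider their $\Psi$-averages before applying the $\langle\cdots\rangle_{0}$ averaging. For terms linear in $X_2$, it is straightforward to see that

\[
\langle X_2\rangle_{\Psi} = \frac{i}{\Omega}\sum_{\mathbf{k}\neq 0}\sum_{\tau=1}^{N}\left[\sigma_\tau\langle\psi_{\mathbf{k}}\rangle_{\Psi} + \langle w_{\mathbf{k}}\rangle_{\Psi}\right]e^{-i\mathbf{k}\cdot\mathbf{R}_\tau} = 0
\tag{A56}
\]

because $\langle\psi_{\mathbf{k}}\rangle_{\Psi} = \langle w_{\mathbf{k}}\rangle_{\Psi} = 0$ according to the quadratic-field Hamiltonian $\mathcal{H}[\psi,w]$ in Eq. A40. Thus, $X_2$ has zero contribution in the first and third lines on the right-hand side of Eq. A53. In contrast, terms quadratic in $X_2[\mathbf{R}]$ are not identically zero, because

\[
\left\langle X_2^{2}\right\rangle_{\Psi} = -\frac{1}{\Omega^{2}}\sum_{\mathbf{k}\neq 0}\sum_{\tau,\mu=1}^{N}\left[\sigma_\tau\sigma_\mu\langle\psi_{-\mathbf{k}}\psi_{\mathbf{k}}\rangle_{\Psi} + \langle w_{-\mathbf{k}}w_{\mathbf{k}}\rangle_{\Psi} + (\sigma_\tau+\sigma_\mu)\langle\psi_{-\mathbf{k}}w_{\mathbf{k}}\rangle_{\Psi}\right]e^{-i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)},
\tag{A57}
\]

and here $\langle X_2^{2}\rangle_{\Psi}$ is seen as depending on field-field correlation functions $\langle\psi\psi\rangle$, $\langle ww\rangle$, and $\langle\psi w\rangle$ averaged over $\Psi$. Thus, the $X_2^{2}$ factors in the averages in the second line on the right-hand side of Eq. A53 provide the only nonzero contribution through second order in $\mathcal{H}^{1}_p$. Following Ref. 59, we only consider lowest-order nonzero contributions from $X_1$, and from $X_2$, separately, i.e., including only terms through $O(X_1)$ and $O(X_2^{2})$ as discussed above. This approach to the perturbative analysis of Eq. A53 may also be rationalized by an alternate analytical formulation put forth in Refs. 71, 72.

As shown in Eq. A40, the field configuration distribution may be approximated by a Gaussian distribution embodied by the quadratic Hamiltonian $\mathcal{H}[\psi,w]$. According to perturbation theory, the field-field correlation functions in Eq. A57 can now be obtained from the matrix $\hat{\Delta}_{\mathbf{k}}$ in Eq. A40 via the relationships

\[
\frac{\langle\psi_{-\mathbf{k}}\psi_{\mathbf{k}}\rangle}{\Omega} = \left(\hat{\Delta}^{-1}_{\mathbf{k}}\right)_{11} = \frac{v_2^{-1}+\rho_m g_{\mathbf{k}}}{\det\hat{\Delta}_{\mathbf{k}}},
\tag{A58a}
\]
\[
\frac{\langle w_{-\mathbf{k}}w_{\mathbf{k}}\rangle}{\Omega} = \left(\hat{\Delta}^{-1}_{\mathbf{k}}\right)_{22} = \frac{\nu_{\mathbf{k}}+\rho_m\xi_{\mathbf{k}}}{\det\hat{\Delta}_{\mathbf{k}}},
\tag{A58b}
\]
\[
\frac{\langle\psi_{-\mathbf{k}}w_{\mathbf{k}}\rangle}{\Omega} = \frac{\langle\psi_{\mathbf{k}}w_{-\mathbf{k}}\rangle}{\Omega} = \left(\hat{\Delta}^{-1}_{\mathbf{k}}\right)_{12} = \left(\hat{\Delta}^{-1}_{\mathbf{k}}\right)_{21} = \frac{-\rho_m\zeta_{\mathbf{k}}}{\det\hat{\Delta}_{\mathbf{k}}}.
\tag{A58c}
\]

Hence $\langle X_2^{2}\rangle_{\Psi}$ is expressed in terms of $\hat{\Delta}_{\mathbf{k}}$ as

\[
\left\langle X_2^{2}\right\rangle_{\Psi} = -\frac{1}{\Omega}\sum_{\mathbf{k}\neq 0}\sum_{\tau,\mu=1}^{N}
\frac{\left\langle\sigma_\tau\right|
\begin{pmatrix}
v_2^{-1}+\rho_m g_{\mathbf{k}} & -\rho_m\zeta_{\mathbf{k}} \\
-\rho_m\zeta_{\mathbf{k}} & \nu_{\mathbf{k}}+\rho_m\xi_{\mathbf{k}}
\end{pmatrix}
\left|\sigma_\mu\right\rangle}{\det\hat{\Delta}_{\mathbf{k}}}\,e^{-i\mathbf{k}\cdot(\mathbf{R}_\tau-\mathbf{R}_\mu)},
\tag{A59}
\]

where $\langle\sigma_\tau| \equiv (\sigma_\tau\;\,1)$. It should be noted that the excluded volume interaction $v_2$ is not regularized by Eq. A46 here because a $k$-independent $v_2$ is needed to guarantee a real solution for the renormalization parameter $x$ for arbitrary charge sequence $|\sigma\rangle$ (Refs. 68, 69). Thus, the regularized form of $v_2$ in Eq. A46 applies only to the explicit $v_2$ dependence of $f_p$ in Eq. A42 but not the implicit $v_2$ dependence of $x$ contained in the renormalized form of the structure factors $\xi$, $g$, and $\zeta$. Substituting the $x$-dependent correlation functions in Eq. A51 for the structure factors in Eq. A59, we obtain the nonzero contribution from $X_2$ in the second line of the right-hand side of Eq. A53 as

\[
\frac{1}{2}\left[\left\langle R_{ee}^{2}X_2^{2}\right\rangle_{0,\Psi} - \left\langle R_{ee}^{2}\right\rangle_{0}\left\langle X_2^{2}\right\rangle_{0,\Psi}\right]
= \frac{N l^{4}x^{2}}{18}\int\frac{d^{3}k}{(2\pi)^{3}}\,\frac{k^{2}\,\Xi^{x}_{k}}{\det\Delta^{x}_{k}},
\tag{A60}
\]

where

\[
\det\Delta^{x}_{k} = \frac{\nu_k}{v_2} + \rho_m\left(\frac{\xi^{x}_{k}}{v_2} + \nu_k g^{x}_{k}\right) + \rho_m^{2}\left[\xi^{x}_{k}g^{x}_{k} - \left(\zeta^{x}_{k}\right)^{2}\right],
\tag{A61}
\]

and

\[
\Xi^{x}_{k} \equiv \frac{\bar{\xi}^{x}_{k}}{v_2} + \nu_k\bar{g}^{x}_{k} + \rho_m\left(\bar{\xi}^{x}_{k}g^{x}_{k} + \xi^{x}_{k}\bar{g}^{x}_{k} - 2\zeta^{x}_{k}\bar{\zeta}^{x}_{k}\right).
\tag{A62}
\]

Here the renormalized $\bar{\xi}$, $\bar{g}$, and $\bar{\zeta}$ are given by

\[
\bar{\xi}^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau\sigma_\mu|\tau-\mu|^{2}\,e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|},
\tag{A63a}
\]
\[
\bar{g}^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}|\tau-\mu|^{2}\,e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|},
\tag{A63b}
\]
\[
\bar{\zeta}^{x}_{k} = \frac{1}{N}\sum_{\tau,\mu=1}^{N}\sigma_\tau|\tau-\mu|^{2}\,e^{-\frac{1}{6}(kl)^{2}x|\tau-\mu|}.
\tag{A63c}
\]

Finally, by combining Eqs. A55 and A60, we arrive at the variational equation

\[
1 - \frac{1}{x} - \frac{N l^{2}}{18(N-1)}\int\frac{d^{3}k}{(2\pi)^{3}}\,\frac{k^{2}\,\Xi^{x}_{k}}{\det\Delta^{x}_{k}} = 0
\tag{A64}
\]

for solving $x$. In our numerical calculations, we take $v_2 = 4\pi l^{3}/3$. Inserting the solution of $x$ into Eq. A51 provides an improved accounting of the conformational heterogeneity in the free energy; and this improvement is central to the present rG-RPA theory.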
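To make the variational step concrete, the following is a minimal numerical sketch of how Eq. A64 might be solved for $x$ given a charge sequence. It is an illustration under stated assumptions, not the authors' implementation. It takes the Gaussian-chain factor in Eqs. A51 and A63 to be $e^{-(kl)^{2}x|\tau-\mu|/6}$ with weights $|\tau-\mu|^{2}$ in the barred quantities, $1/\nu_k = 4\pi l_B/(k^{2}+\kappa^{2})$, and $v_2 = 4\pi l^{3}/3$. The function names, the Simpson quadrature, the wavenumber cutoff, and the bisection bracket are all hypothetical choices made here.

```python
import math

def pair_sums(sigma):
    """Sum sigma_t*sigma_m, 1, and sigma_t over ordered pairs with |t - m| = n."""
    N = len(sigma)
    c_xi, c_g, c_zeta = [0.0] * N, [0.0] * N, [0.0] * N
    for t in range(N):
        for m in range(N):
            n = abs(t - m)
            c_xi[n] += sigma[t] * sigma[m]
            c_g[n] += 1.0
            c_zeta[n] += sigma[t]
    return c_xi, c_g, c_zeta

def structure_factors(k, x, sums, l=1.0):
    """Return (xi, g, zeta, xi_bar, g_bar, zeta_bar) following Eqs. A51 and A63."""
    N = len(sums[0])
    out = [0.0] * 6
    for n in range(N):
        w = math.exp(-(k * l) ** 2 * x * n / 6.0)  # assumed Gaussian-chain factor
        for j in range(3):
            out[j] += sums[j][n] * w / N
            out[j + 3] += sums[j][n] * n * n * w / N
    return out

def radial_integrand(k, x, sums, rho_m, l_B, kappa, v2, l=1.0):
    """k^4 * Xi / det / (2 pi^2): angular-integrated integrand of Eq. A64."""
    if k == 0.0:
        return 0.0  # the k^4 factor removes the k = 0 contribution
    xi, g, zeta, xi_b, g_b, zeta_b = structure_factors(k, x, sums, l)
    nu = (k * k + kappa * kappa) / (4.0 * math.pi * l_B)
    det = nu / v2 + rho_m * (xi / v2 + nu * g) + rho_m ** 2 * (xi * g - zeta ** 2)
    Xi = xi_b / v2 + nu * g_b + rho_m * (xi_b * g + xi * g_b - 2.0 * zeta * zeta_b)
    return k ** 4 * Xi / det / (2.0 * math.pi ** 2)

def residual(x, sums, rho_m, l_B, kappa, v2, l=1.0, m=300):
    """Left-hand side of Eq. A64, using composite Simpson quadrature (m even)."""
    N = len(sums[0])
    k_max = math.sqrt(240.0 / x) / l  # exp(-(k l)^2 x / 6) <= e^-40 beyond this
    h = k_max / m
    args = (x, sums, rho_m, l_B, kappa, v2, l)
    total = radial_integrand(0.0, *args) + radial_integrand(k_max, *args)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * radial_integrand(i * h, *args)
    integral = total * h / 3.0
    return 1.0 - 1.0 / x - N * l * l / (18.0 * (N - 1)) * integral

def solve_x(sigma, rho_m, l_B, kappa, l=1.0, lo=1e-2, hi=1e2):
    """Bisection (on a log scale) for the root x of Eq. A64 within [lo, hi]."""
    sums = pair_sums(sigma)
    v2 = 4.0 * math.pi * l ** 3 / 3.0
    f = lambda x: residual(x, sums, rho_m, l_B, kappa, v2, l)
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("no sign change in bracket; widen [lo, hi]")
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    # Hypothetical 10-mer with alternating unit charges, in units of l = 1.
    sigma = [1.0, -1.0] * 5
    print(solve_x(sigma, rho_m=0.01, l_B=0.5, kappa=0.1))
```

For an uncharged chain in the dilute limit ($\sigma_\tau = 0$, $\rho_m \to 0$), the integral in this sketch can be done in closed form and the equation reduces to $x^{5/2} - x^{3/2} = C$ with a positive constant $C$, so the root satisfies $x > 1$ (chain swelling from excluded volume). This limit provides a convenient consistency check on the quadrature. No guard is included against $\det\Delta^{x}_{k}$ changing sign, which a production calculation would need to handle.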
6. Mixing entropy
The factorials in Eq. A1 arise from the indistinguishability of the molecules belonging to the same species. Taking the logarithm and using Stirling's approximation, one obtains

\[
-\frac{S}{k_B} = \ln\left(n_p!\,n_s!\,n_c!\,n_w!\right) \simeq n_p\ln n_p + n_s\ln n_s + n_c\ln n_c + n_w\ln n_w - n_p - n_c - n_s - n_w,
\tag{A65}
\]

where additive terms of the form $[\ln(2\pi n)]/2$ ($n = n_p$, $n_s$, $n_c$, or $n_w$) are omitted because for large $n$, their contributions are negligible in comparison to the terms included in Eq. A65. As in Ref. 56, here we assume for simplicity that the size of a monomer, a small ion, or a water molecule all equals $l^{3}$. Assuming further, for simplicity, that the system is incompressible, i.e., the system volume $\Omega$ is fully occupied by polymers, small ions, and water, then

\[
\frac{1}{\Omega}\left(N n_p + n_s + n_c + n_w\right) = \rho_m + \rho_s + \rho_c + \rho_w = \frac{1}{l^{3}}.
\tag{A66}
\]

Following Flory's notation, volume fractions of polymers and salt ions are defined, respectively, as

\[
\phi_m = \rho_m l^{3}, \quad \phi_s = \rho_s l^{3},
\tag{A67}
\]

and the volume fraction $\phi_c$ of counterions and volume fraction $\phi_w$ of water are given by

\[
z_c\phi_c = q_c\phi_m + z_s\phi_s, \quad \phi_w = 1 - \phi_m - \phi_s - \phi_c.
\tag{A68}
\]

Because the last four terms in Eq. A65 are linear in numbers of molecules, they are irrelevant to phase separation. Discarding these terms results in the mixing entropy

\[
-s \equiv -\frac{S l^{3}}{k_B\Omega} = \frac{\phi_m}{N}\ln\phi_m + \phi_s\ln\phi_s + \phi_c\ln\phi_c + \phi_w\ln\phi_w
\tag{A69}
\]

given in Eq. 2 of the main text.

Appendix B: Temperature selection for polymer-salt phase diagrams of Ddx4 variants
Three temperatures, two below and one slightly above the respective salt-free critical temperature $T_{\rm cr} \propto l/(l_B)_{\rm cr}$ of each of the Ddx4 variants Ddx4$^{\rm N1}_{\rm pH7}$, Ddx4$^{\rm N1CS}_{\rm pH7}$, Ddx4$^{\rm N1}_{\rm pH1}$, and Ddx4$^{\rm N1CS}_{\rm pH1}$, are selected for the phase diagrams in Figs. 6, 7, 8, and 9. The $l/l_B$ values are selected to compare salt dependence of the sequences under temperatures producing similar gaps between the dilute- and condensed-phase protein densities at or near $\phi_s = 0$ for the different sequences. Specifically, for the same part of the figures ((a), (b), and (c) separately), the $l/l_B$ values are such that dilute-condensed density gaps are similar across Figs. 6–9.

REFERENCES

1. C. P. Brangwynne, C. R. Eckmann, D. S. Courson, A. Rybarska, C. Hoege, J. Gharakhani, F. Jülicher, and A. A. Hyman, Science , 1729 (2009).
2. P. Li, S. Banjade, H. C. Cheng, S. Kim, B. Chen, L. Guo, M. Llaguno, J. V. Hollingsworth, D. S. King, S. F. Banani, P. S. Russo, Q. X. Jiang, B. T. Nixon, and M. K. Rosen, Nature , 2499 (2012).
3. M. Kato, T. W. Han, S. Xie, K. Shi, X. Du, L. C. Wu, H. Mirzaei, E. J. Goldsmith, J. Longgood, J. Pei, N. V. Grishin, D. E. Frantz, J. W. Schneider, S. Chen, L. Li, M. R. Sawaya, D. Eisenberg, R. Tycko, and S. L. McKnight, Cell , 753 (2012).
4. T. J. Nott, E. Petsalaki, P. Farber, D. Jervis, E. Fussner, A. Plochowietz, T. D. Craggs, D. P. Bazett-Jones, T. Pawson, J. D. Forman-Kay, and A. J. Baldwin, Mol. Cell , 936 (2015).
5. A. Molliex, J. Temirov, J. Lee, M. Coughlin, A. P. Kanagaraj, H. J. Kim, T. Mittag, and J. P. Taylor, Cell , 123 (2015).
6. C. W. Pak, M. Kosno, A. S. Holehouse, S. B. Padrick, A. Mittal, R. Ali, A. A. Yunus, D. R. Liu, R. V. Pappu, and M. K. Rosen, Molecular Cell , 72 (2016).
7. L. P. Bergeron-Sandoval, H. K. Heris, C. Chang, C. E. Cornell, S. L. Keller, A. G. Hendricks, A. J. Ehrlicher, P. Francois, R. V. Pappu, and S. W. Michnick, bioRxiv , 10.1101/145664 (2018).
8. A. G. Larson, D. Elnatam, M. M. Keenen, M. J. Trnka, J. B. Johnston, A. L. Burlingame, D. A. Agard, S. Redding, and G. J. Narlikar, Nature , 236 (2017).
9. A. J. Plys and R. E. Kingston, Science , 329 (2018).
10. W. K. Cho, J. H. Spille, M. Hecht, C. Lee, C. Li, V. Grube, and I. Cisse, Science , 412 (2018).
11. B. R. Sabari, A. Dall'Agnese, A. Boika, I. A. Klein, E. L. Coffey, K. Shrinivas, B. J. Abraham, N. M. Hannett, A. V. Zamudio, J. C. Manteiga, C. H. Li, Y. E. Guo, D. S. Day, J. Schuijers, E. Vasile, S. Malik, D. Hnisz, T. I. Lee, I. I. Cisse, R. G. Roeder, P. A. Sharp, A. K. Chakraborty, and R. A. Young, Science , eaar3958 (2018).
12. B. Tsang, J. Arsenault, R. M. Vernon, H. Lin, N. Sonenberg, L.-Y. Wang, A. Bah, and J. D. Forman-Kay, Proc. Natl. Acad. Sci. U. S. A. , 4218 (2019).
13. Y. Shin and C. P. Brangwynne, Science , eaaf4382 (2017).
14. S. F. Banani, H. O. Lee, A. A. Hyman, and M. K. Rosen, Nat. Rev. Mol. Cell Biol. , 285 (2017).
15. S. Boeynaems, S. Alberti, N. L. Fawzi, T. Mittag, M. Polymenidou, F. Rousseau, J. Schymkowitz, J. Shorter, B. Wolozin, L. Van Den Bosch, P. Tompa, and M. Fuxreiter, Trends Cell Biol. , 420 (2018).
16. M. L. Broide, C. R. Berland, J. Pande, O. O. Ogun, and G. B. Benedek, Proc. Natl. Acad. Sci , 5660 (1991).
17. N. Asherie, A. Lomakin, and G. B. Benedek, Phys. Rev. Lett. , 4832 (1996).
18. P. L. San Biagio, V. Martorana, A. Emanuele, S. M. Vaiana, M. Manno, D. Bulone, M. B. Palma-Vittorelli, and M. U. Palma, Proteins: Struct. Func. Bioinformatics , 116 (1999).
19. H. X. Zhou and X. Pang, Chem. Rev. , 1691 (2018).
20. S. Qin and X. Zhou, H, J. Phys. Chem. B. , 8164 (2016).
21. S. Cinar, H. Cinar, H. S. Chan, and R. Winter, J. Am. Chem. Soc. , 7347 (2019).
22. J. D. Forman-Kay, R. W. Kriwacki, and G. Seydoux, J. Mol. Biol. , 4603 (2018).
23. H. Cinar, Z. Fetahaj, S. Cinar, R. M. Vernon, H. S. Chan, and W. R, Chem. Eur. J. , 13049 (2019).
24. J. P. Brady, P. J. Farber, A. Sekhar, Y.-H. Lin, R. Huang, A. Bah, T. J. Nott, H. S. Chan, A. J. Baldwin, J. D. Forman-Kay, and L. E. Kay, Proc. Natl. Acad. Sci. U. S. A. , E8194 (2017).
25. S. Alberti, J. Cell Sci. , 2789 (2017).
26. Z. Monahan, V. H. Ryan, A. M. Janke, K. A. Burke, S. N. Rhoads, G. H. Zerye, R. O'Meally, G. L. Dignon, A. E. Conicella, W. Zheng, R. B. Best, R. N. Cole, J. Mittal, F. Shewmaker, and N. Fawzi, EMBO , e201696394 (2017).
27. G. Dignon, W. Zheng, Y. C. Kim, R. B. Best, and J. Mittal, Plos Comp Bio , e1005941 (2018).
28. S. Das, A. N. Amin, Y.-H. Lin, and H. S. Chan, Phys. Chem. Chem. Phys. , 28558 (2018).
29. G. L. Dignon, W. Zheng, R. B. Best, Y. C. Kim, and J. Mittal, Proc. Natl. Acad. Sci. , 9929 (2018).
30. J. McCarty, K. T. Delaney, S. P. O. Danielsen, G. H. Fredrickson, and J. E. Shea, J. Phys. Chem. Lett. , 1644 (2019).
31. S. P. O. Danielsen, J. McCarty, J. E. Shea, K. T. Delaney, and G. H. Fredrickson, Proc. Natl. Acad. Sci. , 8224 (2019).
32. S. P. O. Danielsen, J. McCarty, J.-E. Shea, K. T. Dalaney, and G. H. Fredrickson, J. Chem. Phys. , 034904 (2019).
33. C. P. Brangwynne, P. Tompa, and R. Pappu, Nature Physics , 899 (2015).
34. Y.-H. Lin, J. D. Forman-Kay, and H. S. Chan, Biochemistry , 2499 (2018).
35. H. G. B. deJong and H. R. Kruyt, Ned. Akad. Wet. , 849 (1929).
36. J. T. G. Overbeek and M. J. Voorn, J Cell Comp Physiol , 7 (1957).
37. E. Spruijt, A. H. Westphal, J. W. Borst, M. A. C. Stuart, and J. v. d. Gucht, Macromolecules , 6476 (2010).
38. R. Chollakup, W. Smitthipong, C. D. Eisenbach, and M. Tirrell, Macromolecules , 2518 (2010).
39. S. L. Perry, Y. Li, D. Priftis, L. Leon, and M. Tirrell, Polymers , 1756 (2014).
40. S. L. Perry and C. E. Sing, Macromolecules , 5040 (2015).
41. S. Srivastava and M. V. Tirrell, Advances in Chemical Physics , 499 (2016).
42. T. K. Lytle, M. Radhakrishna, and C. E. Sing, Macromolecules , 9693 (2016).
43. T. K. Lytle and C. E. Sing, Soft Matter , 7001 (2017).
44. M. Radhakrishna, K. Basu, Y. Liu, R. Shamsi, S. L. Perry, and C. E. Sing, Macromolecules , 3030 (2017).
45. P. Dubin and R. J. Stewart, Royal Soc. Chem. , 329 (2018).
46. P. Zhang, K. Shen, N. M. Alsaifi, and Z.-G. Wang, Macromolecules , 5586 (2018).
47. S. Adhikari, M. A. Leaf, and M. Muthukumar, J. Chem. Phys. , 163308 (2018).
48. L. Li, S. Srivastava, M. Andreev, A. B. Marciel, J. J. D. Pablo, and M. V. Tirrell, Macromolecules , 2988 (2018).
49. J. J. Madinya, L. W. Chang, S. L. Perry, and C. E. Sing, Molecular Systems Design and Engineering , 10.1039/C9ME00074G (2019).
50. L. W. Chang, T. K. Lytle, M. Radhakrishnan, J. J. Madinya, J. Velez, C. E. Sing, and S. L. Perry, Nature Communications , 1273 (2017).
51. M. Dzuricky, S. Roberts, and A. Chilkoti, Biochemistry , 2405 (2018).
52. T. K. Lytle, L. W. Chang, N. Markiewicz, S. L. Perry, and C. E. Sing, ACS Cent. Sci. , 709 (2019).
53. K. A. Mahdi and M. Olvera de la Cruz, Macromolecules , 7649 (2000).
54. A. V. Ermoshkin and M. Olvera de la Cruz, Macromolecules , 7824 (2003).
55. Y.-H. Lin, J. D. Forman-Kay, and H. S. Chan, Phys. Rev. Lett. , 178101 (2016).
56. Y.-H. Lin, J. Song, J. D. Forman-Kay, and H. S. Chan, J. Mol. Liq. , 176 (2017).
57. Y.-H. Lin and H. S. Chan, Biophys. J. , 2043 (2017).
58. Y.-H. Lin, J. P. Brady, J. D. Forman-Kay, and H. S. Chan, New J. Phys. , 115003 (2017).
59. M. Muthukumar, J. Chem. Phys. , 5183 (1996).
60. M. Muthukumar, Polym Sci Ser A Chem Phys , 852 (2018).
61. M. Muthukumar, Macromolecules , 9528 (2017).
62. H. Hofmann, A. Soranno, A. Borgia, K. Gast, D. Nettels, and B. Schuler, Proc. Natl. Acad. Sci. , 16155 (2012).
63. R. K. Das and R. V. Pappu, Proc. Natl. Acad. Sci. , 13392 (2013).
64. B. Schuler, A. Soranno, H. Hofmann, and D. Nettels, Annual Review of Biophysics , 207 (2016).
65. I. Konig, A. Zarrine-Afser, M. Aznauryan, A. Soranno, B. Wunderlich, F. Dingfelder, J. Stuber, A. Pluckthun, D. Nettles, and B. Schuler, Nat. Methods , 773 (2015).
66. A. Soranno, I. Koenig, M. Borgia, H. Hofmann, F. Zosel, D. Nettels, and B. Schuler, Proc. Natl. Acad. Sci. , 4874 (2014).
67. S. M. Sizemore, S. M. Cope, A. Roy, G. Ghirlanda, and S. M. Vaiana, Biophysical Journal , 1038 (2015).
68. L. Sawle and K. Ghosh, J. Chem. Phys. , 085101 (2015).
69. T. Firman and K. Ghosh, J. Chem. Phys. , 123305 (2018).
70. J. Huihui, T. Firman, and K. Ghosh, J. Chem. Phys. , 085101 (2018).
71. K. Shen and Z.-G. Wang, J. Chem. Phys. , 084901 (2017).
72. K. Shen and Z.-G. Wang, Macromolecules , 1706 (2018).
73. M. Muthukumar, Macromolecules , 9142 (2002).
74. G. Orkoulas, S. K. Kumar, and A. Z. Panagiotopoulos, Phys. Rev. Lett. , 048303 (2003).
75. J. W. Jiang, L. Blum, O. Bernard, and J. M. Prausnitz, Molecular Physics , 1121 (2001).
76. Y. A. Budkov, A. L. Kolesnikov, N. Georgi, E. A. Nogovitsyn, and M. G. Kiselev, J. Chem. Phys. , 174901 (2015).
77. J. Jiang, J. Feng, H. Liu, and Y. Hu, J. Chem. Phys. , 144908 (2006).
78. D. W. Cheong and A. Z. Panagiotopoulos, Mol. Phys. , 3031 (2005).
79. S. Das, A. Eisen, Y.-H. Lin, and H. S. Chan, J. Phys. Chem. B , 5418 (2018).
80. Y. Wang, A. Lomakin, S. Kanai, R. Alex, and G. B. Benedek, Langmuir , 7715 (2017).
81. W. M. Haynes, ed., CRC Handbook of Chemistry and Physics, 93rd ed. (CRC Press Inc., 2012).
82. K. Ghosh and K. A. Dill, Proc. Natl. Acad. Sci. U. S. A. , 10649 (2009).
83. H. Eisenberg and G. R. Mohan, J. Phys. Chem. , 671 (1959).
84. L. Sabbagh and M. Delsanti, Eur. Phys. J. E. Soft Matter Biol. Phys. , 75 (2000).
85. V. M. Prabhu, M. Muthukumar, G. D. Wignall, and Y. B. Melnichenko, Polymer , 8935 (2001).
86. A. Moreira and R. Netz, Eur. Phys. J. D , 61 (2001).
87. P. Zhang, N. M. Alsaifi, J. Wu, and Z.-G. Wang, Macromolecules , 9720 (2016).
88. R. M. Vernon, P. A. Chong, B. Tsang, T. H. Kim, A. Bah, P. Farber, H. Lin, and J. D. Forman-Kay, eLife , e31486 (2018).
89. M. T. Wei, S. Elbaum-Garfinkle, A. S. Holehouse, C. C. Chen, M. Feric, C. B. Arnold, R. D. Priestley, R. V. Pappu, and C. P. Brangwynne, Nat. Phys. , 1118 (2017).
90. G. S. Manning, Acc. Chem. Res. , 443 (1979).
91. M. Muthukumar, J. Chem. Phys. , 9343 (2004).
92. A. Levy, D. Andelman, and O. H, Phys. Rev. Lett. , 227801 (2012).
93. K. Ghosh and M. Muthukumar, J. Polym. Sci. B , 2644 (2001).
94. C.-L. Lee and M. Muthukumar, J. Chem. Phys. , 024904 (2009).
95. A. V. Dobrynin, R. H. Colby, and M. Rubinstein, J. Polym. Sci., Part B: Polym. Phys. , 3513 (2004).
96. A. V. Dobrynin and M. Rubinstein, Prog. Polym. Sci. , 1049 (2005).
97. G. L. Dignon, W. Zheng, Y. C. Kim, and J. Mittal, ACS Cent. Sci. , 821 (2019).
98. Z.-G. Wang, Phys. Rev. E , 021501 (2010).
99. M. C. Villet and G. H. Fredrickson, J. Chem. Phys. , 224115 (2014).
100. M. Muthukumar, J. Chem. Phys. , 7230 (1987).
101. M. Doi and S. F. Edwards, The Theory of Polymer Dynamics (Clarendon Press, Oxford, 1986).
102. J. Cardy,