Efficient methods for determining folding free energies in single-molecule pulling experiments

Abstract

The remarkable accuracy and versatility of single-molecule techniques make possible new measurements that are not feasible in bulk assays. Among these, the precise estimation of folding free energies using fluctuation theorems in nonequilibrium pulling experiments has become a benchmark in modern biophysics. In practice, the use of fluctuation relations to determine free energies requires a thorough evaluation of the usually large energetic contributions caused by the elastic deformation of the different elements of the experimental setup (such as the optical trap, the molecular linkers and the stretched-unfolded polymer). We review and describe how to optimally estimate such elastic energy contributions to extract folding free energies, using DNA and RNA hairpins as model systems pulled by laser optical tweezers. The methodology is generally applicable to other force-spectroscopy techniques and molecular systems.


A Severino, A M Monge, P Rissone and F Ritort

Dept. Física de la Matèria Condensada, Universitat de Barcelona, C/ Martí i Franquès 1, 08028 Barcelona, Spain

CIBER-BBN de Bioingeniería, Biomateriales y Nanomedicina, Instituto de Salud Carlos III, 28029 Madrid, Spain

E-mail: [email protected]


Keywords: stochastic thermodynamics, single molecule experiments, nucleic acids thermodynamics

arXiv preprint, q-bio.BM

1. Introduction

Predicting free-energy differences is a central problem in molecular biophysics. Protein folding [1], DNA hybridization [2], ligand binding, CRISPR–Cas9 RNA editing [3, 4], are molecular reactions whose fate is determined by the free-energy difference between reactants and products. Finding methods to extract free-energy, enthalpy and entropy differences is an essential task in biochemistry, where most of these quantities are measured by employing bulk techniques such as calorimetry, UV absorbance, fluorescence, surface plasmon resonance, among others [5]. Bulk methods yield results that are incoherent temporal averages over a large population of molecules that are in different states. The signal is masked by the dominant species and reactions, limiting the capability of detecting rare non-native states and reaction pathways. Moreover, bulk molecular transformations often exhibit strong hysteresis effects rendering equilibrium differences inaccessible.

By monitoring molecules one at a time, techniques such as single-molecule fluorescence [6], single-molecule translocation across nanopores [7] and single-molecule force spectroscopy [8] overcome the previous limitations and therefore have become key experimental tools in many laboratories worldwide. In particular, force-spectroscopy techniques using atomic-force microscopy (AFM), magnetic tweezers, acoustic-force spectroscopy and laser optical tweezers (LOT) have been extremely fruitful, revolutionizing biophysics over the last three decades ‡.

The main advantage of force-measuring techniques (as compared to fluorescence and other non-invasive optical technologies) lies in the possibility to measure simultaneously force and displacement, giving direct access to mechanical work measurements in single-molecule pulling experiments. Similarly to bulk assays, pulling experiments are often carried out under irreversible conditions, in principle providing bounds (rather than direct estimates) of equilibrium free-energy differences.
The development of the non-equilibrium thermodynamics of small systems (also known as stochastic thermodynamics) [10, 11, 12, 13] during the past three decades has provided the theoretical concepts and methods needed to extract free-energy differences from repeated irreversible work measurements. Exact results such as the Jarzynski equality [14] and the Crooks fluctuation theorem [15] are now commonly employed to extract free-energy differences from single-molecule pulling experiments [16, 17, 18, 19, 20, 21]. A particularly useful application is the measurement of the folding free energy of nucleic acids and proteins ($\Delta G_0$), which is equal to the free-energy difference between the folded structure and the unstructured random coil in the solvent. This quantity can be obtained from pulling experiments by measuring the free-energy difference ($\Delta G$) between the folded and unfolded-stretched states of the considered experimental system taken at two force values, and by deriving from it the value of $\Delta G_0$. However, a general problem in the manipulation of small systems using single-molecule techniques is that we cannot measure $\Delta G_0$ directly. What is measured instead is the free-energy difference $\Delta G$ between the folded and unfolded-stretched states of the entire experimental system taken at two force values. In order to retrieve the 'bare' molecular properties, such as the value of $\Delta G_0$ of a single molecule, we therefore need to subtract from $\Delta G$ the contributions stemming from the experimental setup (e.g. the optical trap in LOT or the cantilever in AFM, and the linkers used to manipulate the molecule under study).

‡ The invention of LOT proved to be a breakthrough in laser physics and was awarded the Nobel Prize in Physics in 2018 [9].
These so-called stretching corrections play a crucial role because their contribution to the total free-energy difference $\Delta G$ is much larger than the free energy one wants to extract, $\Delta G_0$, making the accurate estimation of the latter a difficult task. Although there are several studies on the influence of the instrumental artifacts on the folding kinetics in single-molecule experiments [22, 23, 24, 25, 26, 27, 28, 29], their influence regarding the determination of the folding free energies at zero force has, to the best of our knowledge, never been addressed in detail.

In this work we will rigorously examine these experimental contributions in LOT, showing how to efficiently and reliably estimate the free energies of formation of DNA and RNA hairpins in unzipping assays. The same methodology is applicable to proteins and ligand binding interactions using LOT or other force-measuring techniques as well (AFM, magnetic tweezers and so on). The development of novel and refined statistical analysis methods to extract differences in thermodynamic potentials (free energy, enthalpy, entropy, chemical potential, ...) will become crucial with the recent boost of high-throughput single-molecule techniques (magnetic tweezers, acoustic force spectroscopy) that will require fast and efficient algorithms.

The content of this paper is organized as follows. In sections 2 and 3 we describe the typical experimental setup of LOT and then define and discuss the different contributions to the total free energy. The two following sections (4 and 5) show how to estimate these contributions when analyzing DNA and RNA molecules. Section 4 first covers the situations in which it is possible to introduce the so-called effective stiffness approximation, which considerably simplifies the computation of the large stretching terms. When this approximation fails, a more elaborate approach requiring a careful evaluation of the elastic response of the linkers and of the force probe is needed, and this is the focus of section 5. Finally, in section 6 we present the conclusions.

2. Model of the experimental setup

We consider the case of a nucleic acid (DNA or RNA) hairpin pulled by LOT. In LOT, the total distance $\lambda$ between the tip of the micropipette and the center of the optical trap is the control parameter of the experiments. As shown in figure 1(a),(b), the distance $\lambda$ can be decomposed as:

$$\lambda(f) = \begin{cases} x_b(f) + x_h(f) + x_d(f) + \text{const} & \text{(folded state)}, \\ x_b(f) + x_h(f) + x_{ss}(f) + \text{const} & \text{(unfolded state)}, \end{cases} \tag{1}$$

depending on whether the molecule is folded or unfolded. Here $x_b(f)$ is the displacement of the bead from the center of the optical trap, $x_h(f) = x_{h_1}(f) + x_{h_2}(f)$ accounts for the sum of the elongations of the two double-stranded handles, $x_{ss}(f)$ is the end-to-end extension of the single-stranded unfolded molecule, and $x_d(f)$ is the average extension of the folded hairpin. This last term is defined as the distance between the attachment points of the handles to the 5' and 3' ends of the hairpin and is usually called 'hairpin diameter' (whence the index $d$). All these extensions are evaluated along the $x$ (pulling) axis and at a given force $f$. The 'const' stands for an arbitrary shift in the total distance $\lambda$ which does not affect the analysis.

In general, a small perturbation $\delta\lambda$ generates a small change in the applied force $\delta f$. The extent of this variation is the effective stiffness of the system, $k_{\rm eff} = \delta f/\delta\lambda$, and it equals the slope of the experimental force-distance curve (FDC). Therefore, according to the above definition and to the prescription given in (1), the inverse effective stiffnesses of the folded (F) and unfolded (U) branches are respectively given by:

$$\frac{1}{k^F_{\rm eff}(f)} = \frac{1}{k_b(f)} + \frac{1}{k_h(f)} + \frac{1}{k_d(f)}, \tag{2a}$$

$$\frac{1}{k^U_{\rm eff}(f)} = \frac{1}{k_b(f)} + \frac{1}{k_h(f)} + \frac{1}{k_{ss}(f)}, \tag{2b}$$

where $k_b(f)$ corresponds to the stiffness of the bead in the optical trap, $k_h(f)$ is the combined stiffness of the two handles, and $k_d(f)$, $k_{ss}(f)$ stand for the molecular stiffness of the folded and unfolded molecule, respectively.

In particular, $k_d(f)$ is modelled as the stiffness to orient a dipole of diameter $d$ (typically $d = 2$ nm for DNA and RNA hairpins [30]) along the force axis [31]. Recalling that in general $k^{-1} = \delta x/\delta f$, the dipole stiffness can be derived from the well-known relation between the average extension of a dipole (which is here equal to the average extension of the folded hairpin) and the force $f$ to which it is subjected:

$$x_d(f) = d\left[\coth\left(\frac{f d}{k_B T}\right) - \frac{k_B T}{f d}\right], \tag{3}$$

where $T$ is the temperature of the heat bath around the dipole and $k_B$ is the Boltzmann constant.

An analytic expression for $k_{ss}$ and $k_h$ can be obtained by describing the elastic response of nucleic acids in their single-stranded and double-stranded form with the Worm-Like Chain (WLC) polymer model and its interpolation formula [32],

$$f(x) = \frac{k_B T}{4P}\left[\left(1 - \frac{x}{L_c}\right)^{-2} - 1 + 4\frac{x}{L_c}\right], \tag{4}$$

where $x$ is the average extension of the molecule ($x = x_{ss}$ for the unfolded hairpin, $x = x_h$ for the double-stranded handles) and $P$ is the persistence length, i.e. the typical distance along the polymer backbone over which there is an appreciable bending due to thermal fluctuations. $L_c$ is the contour length, i.e. the end-to-end distance of the fully straightened polymer, which can also be written as $L_c = n d_b$, with $n$ being the total number of monomers in the polymer and $d_b$ the length per monomer. In general, inverting (4) to get $x(f)$ is not an easy task (the full computation is reported in Appendix A) and the solution depends on the system parameters.

Figure 1. (a,b) Laser optical tweezers (LOT) experimental setup. The molecule is tethered between two polystyrene beads using two dsDNA (or dsRNA or even dsDNA/DNA hybrids) handles. The arrow towards the center of the optical trap indicates the direction of the force. $\lambda = x_b + x_h + x_m$ (with $x_h = x_{h_1} + x_{h_2}$) is the relative distance between the center of the optical trap and the tip of the micropipette. $x_m$ equals $x_d$ when the molecule is folded (a) or $x_{ss}$ when the molecule is unfolded (b). (c) Sketch of the force versus relative extension (extension divided by contour length) for each elastic element showing their respective energy contributions (shaded areas). (d) Elastic energy contribution of each element vs force and comparison with the typical energy of formation (dashed line, $\Delta G_0$) for a 20 bp DNA or RNA hairpin.

Finally, the stiffness of the polymer can be obtained by differentiation of (4):

$$k(x) \equiv \frac{\partial f(x)}{\partial x} = \frac{k_B T}{2 L_c P}\left[\left(1 - \frac{x}{L_c}\right)^{-3} + 2\right]. \tag{5}$$

Given (4), it is also possible to further take into account the elastic deformation of the stretched polymer by performing the substitution $L_c \to L_c(1 + f/Y)$, with $Y$ the Young modulus of the stretchable polymer [33, 34], i.e. the resistance to deformation of the system to an applied uniaxial stress. In this case the contour length becomes force-dependent and the corresponding model is called the extensible WLC. By contrast, equation (4) where $L_c$ is constant is known as the inextensible WLC. The latter has […] $P$ is a measure of the mechanical stiffness of the polymer, being strongly sensitive to environmental conditions (e.g. ionic strength, temperature, solvation, etc.). Polymers with $P \gg L_c$ effectively behave as rigid rods, whereas if $P \le L_c$ polymers are bent at the scale of the contour length by thermal forces.
It is important to mention that $P$ does not only depend on the ionic concentration and temperature [36] (as predicted by polyelectrolyte theories) but also on experimental parameters such as contour length [35]. For example, at 1 M NaCl, recent single-molecule studies have shown that, for short (a few tens of bases) ssDNA molecules, $P = 1.35$ nm [37], whereas for long ssDNAs ($\sim 13$ kb) $P = 0.76$ nm [38]. On the other hand, for short ssRNA molecules $P = 0.75$ nm [39] and for long ones ($\sim$ […]) $P = 0.83$ nm [35]. These values are significantly lower than for double-stranded nucleic acids (dsDNA and dsRNA), where $P = 50$ nm for dsDNA [40] and $P \simeq 60$ nm for dsRNA molecules [41].
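As an illustration of the elastic models of this section, the following Python sketch evaluates the dipole extension (3), the WLC force (4) and stiffness (5), and combines element stiffnesses in series as in (2a) and (2b). It is only a sketch under stated assumptions: the function names and the thermal energy $k_BT = 4.11$ pN nm (room temperature) are illustrative choices, and the inversion of (4) is done here by simple bisection rather than with the analytical solution of Appendix A.

```python
import math

KBT = 4.11  # pN*nm; assumed thermal energy near room temperature


def dipole_extension(f, d=2.0):
    """Average extension (nm) of a dipole of length d (nm) at force f (pN), Eq. (3)."""
    a = f * d / KBT
    if a < 1e-4:
        return d * a / 3.0  # leading term of the small-force expansion
    return d * (1.0 / math.tanh(a) - 1.0 / a)


def wlc_force(x, P, Lc):
    """Inextensible WLC interpolation formula, Eq. (4); x, P, Lc in nm, force in pN."""
    z = x / Lc
    return KBT / (4.0 * P) * ((1.0 - z) ** -2 - 1.0 + 4.0 * z)


def wlc_stiffness(x, P, Lc):
    """Derivative of Eq. (4) with respect to x, Eq. (5); result in pN/nm."""
    z = x / Lc
    return KBT / (2.0 * P * Lc) * ((1.0 - z) ** -3 + 2.0)


def wlc_extension(f, P, Lc, iterations=200):
    """Invert Eq. (4) by bisection (the force is monotonic in x on [0, Lc))."""
    lo, hi = 0.0, Lc * (1.0 - 1e-9)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, P, Lc) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def series_stiffness(*stiffnesses):
    """Effective stiffness of elastic elements in series, as in Eqs. (2a)-(2b)."""
    return 1.0 / sum(1.0 / k for k in stiffnesses)
```

For a given force, `wlc_extension` returns the extension at which `wlc_stiffness` should be evaluated, so that a force-dependent stiffness such as $k_{ss}(f)$ or $k_h(f)$ can be fed into `series_stiffness` together with a trap stiffness.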

3. Stretching contributions and free-energy recovery

Let us suppose that initially, at $t = 0$, we have a molecule in thermal equilibrium in the folded (or native, $N$) state at a given value $\lambda_0$ of the control parameter. Then, we perturb the system by applying a predetermined time-dependent forward (F) protocol, $\lambda_F(t)$, that starting at $\lambda_0$ at $t = 0$ ends at an arbitrary $\lambda_1$ at a time $t_1$. The mechanical work $W$ done along this process equals:

$$W = \int_{\lambda_0}^{\lambda_1} f \, d\lambda. \tag{6}$$

The Crooks Fluctuation Theorem (CFT) [15] relates the mechanical work done on a system in a set of arbitrary irreversible measurements with the equilibrium free-energy difference of this system between $\lambda_0$ and $\lambda_1$, $\Delta G = G(\lambda_1) - G(\lambda_0)$. It reads:

$$\frac{P_F(W)}{P_R(-W)} = \exp\left(\frac{W - \Delta G}{k_B T}\right), \tag{7}$$

where $P_F(W)$ is the probability distribution of the work done in the F process and $P_R(-W)$ is the probability distribution of the work measured in the time-reversed (R) process (i.e. starting in thermal equilibrium at $\lambda_1$ and performing the time-mirrored protocol, so that $\lambda_R(t) = \lambda_F(t_1 - t)$). The derivation of the CFT has become a milestone for single-molecule experimentalists, allowing the measurement of free-energy differences in conditions where traditional bulk experiments are unfeasible. By pulling single molecules using LOT or magnetic tweezers it is possible to recover molecular free-energy differences from irreversible work measurements [17, 42]. The CFT (7) implies the well-known Jarzynski equality [14]:

$$\left\langle \exp\left(-\frac{W}{k_B T}\right) \right\rangle_F = \exp\left(-\frac{\Delta G}{k_B T}\right). \tag{8}$$

Note that the average $\langle \cdots \rangle_F$ is evaluated over $P_F(W)$ (an analogous equality holds for the reverse process). It is important to bear in mind that the free energy $\Delta G$ obtained using the CFT (7) (or the Jarzynski equality (8)) contains several contributions due to the stretching of the different parts forming the experimental setup.
These are the molecule under study, the molecular handles and the optically trapped bead (figure 1(a),(b)):

$$\Delta G = \Delta G_0 + \Delta W_m + \Delta W_b + \Delta W_h. \tag{9}$$

$\Delta G_0$ is the free energy of formation of the molecule at zero force, which is equal to the free-energy difference between the folded and unfolded hairpin conformations in solution (i.e. without optical trap and handles and without any applied force). The quantities $\Delta W_i$ ($i = m, b, h$) are the reversible work differences between the state of the $i$th setup element (molecule, optical trap or handles) at $\lambda_0$ (where the hairpin is folded and subjected to a minimum force $f_{\min}$) and at $\lambda_1$ (where the hairpin is unfolded and subjected to a maximum force $f_{\max}$). Mathematical definitions of these quantities for the LOT setup are given in the subsections below.

As depicted in figure 1(c,d), for typical unfolding forces in DNA and RNA hairpins (15-25 pN), (9) is dominated by the trap contribution, while the other terms have the same order of magnitude. Therefore, an accurate measurement of $\Delta G_0$ requires precise knowledge of all the different energetic contributions involved in the mechanical unfolding of the molecule.

The molecular contribution $\Delta W_m$ in (9) accounts for the reversible work needed to stretch the molecule under study and it can be written as:

$$\Delta W_m = \int_0^{x_{ss}(f_{\max})} f(x_{ss}) \, dx_{ss} - \int_0^{x_d(f_{\min})} f(x_d) \, dx_d, \tag{10}$$

where $f(x_{ss})$ and $f(x_d)$ are the equilibrium force-extension curves of the unfolded and folded molecule, respectively (although they are different mathematical functions, the same letter $f$ is used to lighten the notation). The first term on the right-hand side of (10) corresponds to the reversible work needed to stretch the unfolded molecule from its single-stranded random coil conformation at $f = 0$ up to $f_{\max}$, and it can be computed from the WLC model (4). The second term on the right-hand side of (10) is the reversible work required to orient the molecular diameter along the force axis. It can be written as:

$$\int_0^{x_d(f_{\min})} f(x_d) \, dx_d = f_{\min} \, x_d(f_{\min}) - \int_0^{f_{\min}} x_d(f) \, df, \tag{11}$$

where $x_d(f)$ is given by (3).

The term $\Delta W_b + \Delta W_h$, which corresponds to the sum of the reversible work required to displace the bead from the center of the optical trap ($\Delta W_b$) plus the reversible work needed to stretch the handles ($\Delta W_h$), can be generally written as:

$$\begin{aligned} \Delta W_b + \Delta W_h &= \int_{x_b(f_{\min})}^{x_b(f_{\max})} f(x_b) \, dx_b + \int_{x_h(f_{\min})}^{x_h(f_{\max})} f(x_h) \, dx_h \\ &= \int_{f_{\min}}^{f_{\max}} f \left(\frac{\partial f}{\partial x_b}\right)^{-1} df + \int_{f_{\min}}^{f_{\max}} f \left(\frac{\partial f}{\partial x_h}\right)^{-1} df \\ &= \int_{f_{\min}}^{f_{\max}} \frac{f}{k_b(f)} \, df + \int_{f_{\min}}^{f_{\max}} \frac{f}{k_h(f)} \, df. \end{aligned} \tag{12}$$

Note that each element in the setup is substantially different. In particular, the bead in the optical trap can be well approximated by a Hookean spring, whereas the elastic response of the handles and the single-stranded molecule (plus the diameter) is strongly nonlinear (see below). The contribution of these two terms in (9) is often large. In particular, the energy required to displace the bead from the center of the optical trap is considerably higher than the other terms. A schematic depiction of this fact can be seen in figure 1(c), where the shaded areas below the curves represent the work $W$ obtained according to (6) using realistic elastic parameters for DNA and RNA hairpins.

A further important simplification can be carried out when the FDC along the folded branch is approximately linear over the integration range of forces. Such a situation corresponds by definition to a scenario where the slope (or stiffness) is constant, i.e. $k^F_{\rm eff} \neq k^F_{\rm eff}(f)$. It allows one to readily perform the integration in (12), which is now reduced to the simple task of integrating a linear function:

$$\Delta W_b + \Delta W_h = \int_{f_{\min}}^{f_{\max}} f \left(\frac{1}{k_b} + \frac{1}{k_h}\right) df \cong \int_{f_{\min}}^{f_{\max}} \frac{f}{k^F_{\rm eff}} \, df = \frac{f_{\max}^2 - f_{\min}^2}{2 k^F_{\rm eff}}, \tag{13}$$

where we used the fact that the stiffness of the dipole modelling the folded hairpin is considerably larger than the other terms in (2a), so that $k^F_{\rm eff} = (k_b^{-1} + k_h^{-1} + k_d^{-1})^{-1} \cong (k_b^{-1} + k_h^{-1})^{-1}$, and where the constant stiffness assumption is used in the last equality on the right-hand side of (13). Linearity of the FDC is a good approximation if the integration range is not too large (for example, when $f_{\max} - f_{\min} \approx$ […] for the DNA and RNA hairpins considered in the next section). Above all, linearity of the FDC certainly requires a linear optical trap of constant stiffness [31]. We will refer to this approximation as the effective stiffness method.

Figure 2. Free-energy recovery of CD4 DNA with short handles. (a) Sequence of CD4 DNA (top panel). FDCs (force [pN] versus distance [nm]) and integration range for the work $W$, from $(\lambda_0, f_{\min})$ to $(\lambda_1, f_{\max})$ (bottom panel). Demonstration of the linearity of the FDCs in the integration range (inset) plus linear fits to the folded (solid line in the inset, $k^F_{\rm eff}$) and the unfolded branches (dashed line in the inset, $k^U_{\rm eff}$). (b) Forward (solid lines) and reverse (dashed lines) work distributions, $P_F(W)$ and $P_R(-W)$, for two different pulling speeds (6 and 16 pN/s) calculated in the integration range indicated in panel (a). Crossing points between work distributions ($W = \Delta G$) are tagged as solid points. The CFT verification, $\log [P_F(W)/P_R(-W)]$ versus $W$ [$k_BT$], is shown as inset (fitted slope 0.93(7)). Error bars have been obtained using the bootstrap method.
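The free-energy recovery described in this section can be sketched in a few lines of Python. The example below estimates $\Delta G$ from a set of forward work values with the Jarzynski equality (8), evaluates the trap-plus-handle term with the effective stiffness formula (13), and subtracts the stretching terms by rearranging (9). It is a minimal sketch under stated assumptions: the function names and the value $k_BT = 4.11$ pN nm are hypothetical choices, not part of the original analysis. Note also that the Jarzynski estimator is known to be biased when only a finite number of dissipative trajectories is available, so it is shown here for illustration rather than as a replacement for the bidirectional CFT (7).

```python
import math

KBT = 4.11  # pN*nm; assumed thermal energy near room temperature


def jarzynski_free_energy(works_kbt):
    """Estimate Delta G (in kBT) from forward work values (in kBT) via Eq. (8)."""
    n = len(works_kbt)
    w_min = min(works_kbt)  # shift by the minimum work for numerical stability
    acc = sum(math.exp(-(w - w_min)) for w in works_kbt)
    return w_min - math.log(acc / n)


def effective_stiffness_work(f_min, f_max, k_eff):
    """Trap + handle stretching work (in kBT) from Eq. (13).

    Forces in pN, k_eff in pN/nm; k_eff is assumed constant over the range.
    """
    return (f_max ** 2 - f_min ** 2) / (2.0 * k_eff) / KBT


def folding_free_energy(dg_total, dw_molecule, dw_trap_handles):
    """Rearrangement of Eq. (9): dG0 = dG - dW_m - (dW_b + dW_h), all in kBT."""
    return dg_total - dw_molecule - dw_trap_handles
```

By construction the Jarzynski estimate lies between the smallest measured work and the mean work, which gives a quick sanity check on any data set passed to `jarzynski_free_energy`.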

4. The effective stiffness method

The effective stiffness approximation discussed in section 3.3 provides an easy method to treat the elastic contributions of the experimental setup. Here we provide two typical scenarios where (13) provides a reliable estimation of the free energy of formation $\Delta G_0$ of DNA and RNA hairpins. In section 4.1 the case of the CD4 DNA hairpin with short handles is reported. Then in section 4.2 we discuss the case of the CD4 RNA hairpin with long handles.

The use of short dsDNA handles ($\sim 29$ bp each) in single-molecule experiments has been shown to increase the precision of kinetic measurements due to their enhanced signal-to-noise ratio as compared to long handles [31]. Short handles also make the evaluation of the stretching contributions easier. In fact, the large stiffness of short handles as compared to the trap stiffness, $k_h \gg k_b$, implies that $k_{\rm eff} \simeq k_b$ to first order. As the trap stiffness itself can be considered nearly force independent, $k^F_{\rm eff}$ is therefore constant along the folded branch, and the effective stiffness approximation (13) becomes applicable.

[…] $\lambda_0$ and $\lambda_1$. In the forward (reverse) process the system starts in thermal equilibrium at $\lambda_0$ ($\lambda_1$) and it is driven out of equilibrium following a predetermined protocol $\lambda_F(t)$ ($\lambda_R(t)$) until $\lambda_1$ ($\lambda_0$) is reached. For each experimental realization the work $W$ is calculated according to (6). Note that, in the force range at which the molecule typically unfolds and refolds (12-17 pN in figure 2(a)), the FDCs are linear (inset of figure 2(a)). Therefore, the conditions required to use the effective stiffness method (13) are fulfilled.

In figure 2(b) we show the F and R work distributions calculated for two pulling speeds (6 and 16 pN/s) in the same integration range. According to the CFT (7), the work value at which both distributions cross (black solid points) equals $\Delta G$. Note that, since the integration range is the same, $\Delta G$ does not change with pulling speed, as required for an equilibrium quantity. We emphasize the validity of the CFT by plotting the function $\log [P_F(W)/P_R(-W)]$ as a function of $W$ in $k_BT$ units. According to (7), this function is linear in $W$ with slope 1 and y-intercept equal to $-\Delta G$ (both in $k_BT$ units), so that it vanishes at $W = \Delta G$.
As expected, the experimental data (solid points) satisfy the previous relation (see inset of figure 2(b), where the solid line is a linear fit to the experimental data).

Once we have measured $\Delta G$ using the CFT, we subtract the stretching contributions to recover $\Delta G_0$. According to (9), we have:

$$\Delta G_0 = \Delta G - \Delta W_m - \Delta W_b - \Delta W_h. \tag{14}$$

The term $\Delta W_m$ is calculated using (10). In order to model the ssDNA elastic response (i.e. $f(x_{ss})$ in (10)), we use the WLC model (4) with a persistence length $P$ equal to 1.35 nm and an interphosphate distance $d_b$ equal to 0.59 nm/base [37], so that $L_c = (2 n_{\rm bases} + 4) \times 0.59$ nm/base $\approx 26$ nm. On the other hand, the term $\Delta W_h + \Delta W_b$ is calculated using the effective stiffness method (13) with $k^F_{\rm eff} =$ […] $\pm\ 0.002$ pN/nm (obtained by a linear fit of the FDCs, see inset in figure 2(a)).

In table 1 we report the values we obtained for $\Delta G$ and $\Delta G_0$, as well as the aforementioned stretching contributions. Results for $\Delta G_0$ are in very good agreement with the theoretical ones obtained using the nearest-neighbour model for DNA, either using Mfold parameters ($\Delta G_0 = 51\, k_BT$) [43] or the ones derived from unzipping experiments ($\Delta G_0 = 48\, k_BT$) [44].

In what follows, we first discuss the characteristics of long handles in subsection 4.2.1, explaining why sometimes the effective stiffness method can be applied, while other times it cannot. To illustrate the two distinct situations, we first present in section 4.2.2 a scenario based on the CD4 RNA hairpin, where the effective stiffness method is applicable with long handles, just as with short handles (section 4.1). Secondly, the development of a general approach for long handles, beyond the effective stiffness approximation, is covered in section 5 and exemplified with the CD4L12 RNA hairpin.

Table 1. Fluctuation theorem and stretching contributions for CD4 DNA hairpin (short handles) and CD4 RNA hairpin (long handles).

| | $\Delta G$ [$k_BT$] | $\Delta W_m$ [$k_BT$] | $\Delta W_h + \Delta W_b$ [$k_BT$] | $\Delta G_0$ [$k_BT$] |
| DNA short | 295 ± […] | […] ± […] | […] ± […] | […] ± […] |
| RNA long | […] ± […] | […] ± […] | […] ± […] | […] ± […] |

(DNA short, first row) Reported energies for the integration range $[\lambda_0, \lambda_1] = [20, 80]$ nm, corresponding to a force range $(f_{\min}, f_{\max}) = (13, 17)$ pN. (RNA long, second row) Reported energies for the integration range $[\lambda_0, \lambda_1] = [30, 85]$ nm, corresponding to a force range $(f_{\min}, f_{\max}) = (18, 22)$ pN. Error bars obtained after averaging the results over four (DNA short) and five (RNA long) molecules at two pulling speeds, respectively.
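To make the molecular term concrete, the following Python sketch evaluates $\Delta W_m$ from (10) and (11) for the CD4 DNA parameters quoted in section 4.1 ($P = 1.35$ nm, $L_c \approx 26$ nm, $d = 2$ nm, force range 13 to 17 pN). It is a sketch under stated assumptions and not the analysis code behind table 1: the function names are hypothetical, $k_BT = 4.11$ pN nm is an assumed room-temperature value, the inversion of (4) uses bisection instead of the analytical solution of Appendix A, and the two integrals are evaluated with the closed-form antiderivatives of (4) and (3).

```python
import math

KBT = 4.11  # pN*nm; assumed thermal energy near room temperature


def wlc_force(x, P, Lc):
    """Inextensible WLC interpolation formula, Eq. (4)."""
    z = x / Lc
    return KBT / (4.0 * P) * ((1.0 - z) ** -2 - 1.0 + 4.0 * z)


def wlc_extension(f, P, Lc, iterations=200):
    """Invert Eq. (4) by bisection on [0, Lc)."""
    lo, hi = 0.0, Lc * (1.0 - 1e-9)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if wlc_force(mid, P, Lc) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wlc_work(x, P, Lc):
    """Integral of Eq. (4) from 0 to x (pN*nm), from its closed-form antiderivative."""
    z = x / Lc
    return KBT * Lc / (4.0 * P) * (1.0 / (1.0 - z) - 1.0 - z + 2.0 * z * z)


def dipole_work(f, d):
    """Right-hand side of Eq. (11) (pN*nm) for the dipole model, Eq. (3); needs f > 0."""
    a = f * d / KBT
    x_d = d * (1.0 / math.tanh(a) - 1.0 / a)
    integral_x_df = KBT * math.log(math.sinh(a) / a)  # integral of x_d(f) from 0 to f
    return f * x_d - integral_x_df


def molecular_work(f_min, f_max, P, Lc, d):
    """Delta W_m of Eq. (10), returned in kBT units."""
    x_ss = wlc_extension(f_max, P, Lc)
    return (wlc_work(x_ss, P, Lc) - dipole_work(f_min, d)) / KBT


dW_m = molecular_work(f_min=13.0, f_max=17.0, P=1.35, Lc=26.0, d=2.0)
print(f"Delta W_m = {dW_m:.1f} kBT")
```

The same routine applies to the CD4 RNA hairpin by changing the elastic parameters and the force range to those quoted for the long-handle construct.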

Long handles, $\sim 500$ bp each, typically represent a bigger challenge than their short counterparts because they are significantly softer. This implies that the stiffness of long handles features a noticeable force dependence, $k_h = k_h(f)$, especially in the lower range of forces experimentally accessible with LOT. Moreover, the magnitude of $k_h$ is now lower and typically comparable to the trap stiffness, $k_h \sim k_b$. Thus, since $k^F_{\rm eff} \simeq (k_b^{-1} + k_h^{-1})^{-1}$, the term $k_h$ significantly contributes to $k^F_{\rm eff}$. This, together with the clear force dependence of $k_h$, implies in turn that the effective stiffness is not constant but depends on force: $k^F_{\rm eff} = k^F_{\rm eff}(f)$. Consequently, upon calculating stretching contributions, the terms $\Delta W_b$, $\Delta W_h$ need to be evaluated more carefully.

On closer inspection, however, the use of long handles does not invalidate per se the effective stiffness approximation (13). The validity of (13) relies on the assumption that $k^F_{\rm eff}$ is constant over the integration range $[\lambda_0, \lambda_1]$. Indeed, in many situations, such as with the CD4 RNA hairpin, the actual integration range occurs at forces high enough so that $k_h \gg k_b$ and $k^F_{\rm eff}$ can be taken as constant. Whenever this assumption does not hold another approach must be used. There are two typical scenarios. On the one hand, if the integration range is large (e.g. for molecules featuring a pronounced hysteresis), the force dependence of the stiffness $k^F_{\rm eff} = k^F_{\rm eff}(f)$ cannot be neglected (note that even if $k^F_{\rm eff}$ changes marginally from pN to pN, the overall change over the whole integration range can be significant). On the other hand, if we reach low enough forces (e.g. by using a molecule that refolds at very low forces), the effective stiffness also exhibits force dependence.
Indeed at low forces k h (cid:28) k b , hence k Feﬀ ∼ k h , and as k h = k h ( f ) is steepat low f , so will k Feﬀ be.Provided that the handle stiﬀness k h ( f ) and the force stiﬀness of the trap k b ( f )are known with a good precision, the integrals in (12) can in principle be carriedout easily, irrelevantly of k Feﬀ being non-constant. This corresponds however to anidealized scenario which rarely occurs in practice. To begin with, the elastic properties etermination of folding free energies in single-molecule experiments k b canhave a very signiﬁcant energetic impact. For instance, a modest deviation of 5% from k b = 0 .

08 pN/nm to k b = 0 .

075 pN/nm, results in a change of a dozen $k_BT$ in $\Delta W_b$ when integrated between 2 and 12 pN. Changes in the value of $k_b$, and even non-linear force corrections in $k_b(f)$, inevitably occur in LOT, not only on a day-to-day basis (depending on the laser focusing, alignment, power, intensity or temperature) but also within the same day on a molecule-to-molecule basis, since the beads used in the experiments vary slightly in size and the trap stiffness depends directly on bead size. A slight force dependence in $k_b(f)$ also occurs if the optical plane of the bead shifts with force. Hence, $k_h(f)$ and $k_b(f)$ are usually not characterized precisely enough for the integrals in (12) to be computed reliably.

To address these issues, we will introduce in section 5 a novel methodology to retrieve the optimal stiffness profiles $k_b(f)$ and $k_h(f)$ directly from the FDCs obtained in pulling experiments with LOT. Before doing so, let us show an example where long handles and the effective stiffness approximation go together: the CD4 RNA hairpin.

The effective stiffness method can be applied to the CD4 RNA hairpin, a molecule showing nearly reversible folding-unfolding kinetics at the accessible pulling speeds [17, 39]. The molecule has the same sequence as the CD4 DNA hairpin presented in section 4.1, with thymines replaced by uracils (top panel of figure 3(a)). In the present case, the RNA hairpin is inserted between two ∼500-base-long hybrid RNA/DNA handles [45], so the molecular construct is formed by the RNA hairpin plus the two long hybrid handles. Pulling experiments were performed as described in section 4.1.

Due to the narrowness of the region in the FDCs (figure 3(a), bottom panel) where folding-unfolding events of CD4 RNA take place, the effective stiffness $k^F_{\rm eff}$ remains fairly constant over the force range experimentally probed. This linearity of the FDCs is evidenced in the inset of figure 3(a) and justifies the use of the effective stiffness approximation. By fitting the slopes of the FDCs in the highlighted region, we obtain a value for $k^F_{\rm eff}$ equal to 0.[…] ± 0.[…]. The work is integrated over the distance interval [30, 85] nm, which corresponds to the force interval $(f_{\min}, f_{\max}) = (18, 22)$ pN. As in section 4.1, the F and R work distributions are calculated for two pulling speeds (2 and 20 pN/s) and are shown in figure 3(b). Note that the crossing point between both distributions corresponds to the work value equal to $\Delta G$. The CFT (7) is satisfied for CD4 with long handles, as can be seen in the inset of figure 3(b). We can thus subtract from the obtained $\Delta G$ the stretching contributions $\Delta W_h + \Delta W_b$ using the effective stiffness method, along the exact same lines as in section 4.1. As a last step, the term $\Delta W_m$ in (9) is calculated using the WLC model (4) with $P = 0.75$ nm and an interphosphate distance $d_b$ equal to 0.665 nm/base, so that $L_c \approx 29$ nm, higher than for the CD4 DNA molecule.

[Figure 3: (a) force [pN] versus distance [nm], with the slopes $k^F_{\rm eff}$ and $k^U_{\rm eff}$ and the integration limits at $f_{\min}$ and $f_{\max}$ marked; (b) $P_F(W)$, $P_R(-W)$ versus $W$ [$k_BT$] at 2 pN/s and 20 pN/s, crossing at $W = \Delta G$; inset: $\log[P_F(W)/P_R(-W)]$ versus $W$ [$k_BT$], slope = 0.91(6).]

Figure 3. Free-energy recovery for the CD4 RNA hairpin with long handles. (a) Sequence of CD4 RNA (top panel). FDCs and integration range for the work $W$ (bottom panel). Visual evidence of the linearity of the FDCs in the integration range (inset), plus linear fits to the folded (solid line in the inset) and the unfolded branches (dashed line in the inset). (b) Forward (solid lines) and reverse (dashed lines) work distributions for two different pulling speeds, calculated in the integration range indicated in panel (a). Crossing points between work distributions are tagged as solid points. The CFT verification is shown as inset. Error bars have been obtained using the bootstrap method.

We report in table 1 the values we obtained for $\Delta G$ and $\Delta G_0$, as well as for the stretching contributions. The measured value for $\Delta G_0$ (70 ± […] $k_BT$) is compatible with the previous single-molecule measurements obtained in LOT assays at 100 mM Tris HCl pH 8 and 1 M NaCl ($\Delta G_0 \approx$ […] $k_BT$) [39] and with the Mfold prediction ($\Delta G_0 = 68\,k_BT$) [43]. We conclude that the effective stiffness approximation is valid for determining folding free energies from irreversible work measurements if the integration range is narrow enough that the FDCs along the folded branch have constant slope in that range (i.e. the effective stiffness $k^F_{\rm eff}$ can be taken as constant).
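As a rough numerical illustration of why small changes in stiffness matter, the sketch below evaluates the stretching work of a linear elastic element, $\Delta W = \int_{f_{\min}}^{f_{\max}} (f/k)\,df = (f_{\max}^2 - f_{\min}^2)/(2k)$, which follows from $dx/df = 1/k$ at constant $k$. The two stiffness values (0.070 and 0.075 pN/nm), the temperature (298 K) and the function name are assumptions chosen for illustration; they are not taken from the experiments.

```python
# Minimal sketch: stretching work for a constant (effective) stiffness.
# Assumptions (not from the paper): T = 298 K and the stiffness values below.

KBT_PN_NM = 1.380649e-23 * 298.0 * 1e21  # k_B*T in pN*nm (1 J = 1e21 pN*nm)


def stretching_work_kbt(f_min, f_max, k):
    """Work (in k_B*T) to stretch a linear element of stiffness k [pN/nm]
    from force f_min to f_max [pN], i.e. the integral of f/k df."""
    work_pn_nm = (f_max ** 2 - f_min ** 2) / (2.0 * k)
    return work_pn_nm / KBT_PN_NM


if __name__ == "__main__":
    # Sensitivity of the work to a small change in stiffness, from 2 to 12 pN.
    w_soft = stretching_work_kbt(2.0, 12.0, 0.070)
    w_stiff = stretching_work_kbt(2.0, 12.0, 0.075)
    print(f"k = 0.070 pN/nm: {w_soft:.1f} kBT")
    print(f"k = 0.075 pN/nm: {w_stiff:.1f} kBT")
    print(f"difference:      {w_soft - w_stiff:.1f} kBT")
```

With these assumed values the two results differ by roughly 16 $k_BT$, the same order as the folding free energy accuracy one is aiming for, which is why the stiffness has to be known precisely.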

5. Beyond the effective stiffness method

In the previous sections we introduced the effective stiffness method and tested its reliability in the analysis of constructs with both short and long handles. We also gave evidence that its validity is limited to the case of a linear elastic response, and that a more general methodology becomes necessary when this condition is not fulfilled. This is the subject of section 5.1, where we present a novel technique going beyond the effective stiffness approximation. Then, in section 5.2, we present an application of this method.

As can be seen in (1), the force-extension profile $\lambda(f)$ depends on $x_b$ and $x_h$, and these are, by definition, related to the stiffness through:

$$x_i(f) = \int_0^f k_i^{-1}(f')\,df', \qquad \frac{dx_i}{df} = \frac{1}{k_i(f)} \qquad \text{for } i = b, h. \tag{15}$$

This hints at the fact that FDCs (i.e. the $\lambda(f)$ profile) might allow us to retrieve the stiffness profiles needed to estimate the elastic energy contributions from bead and handles in (12). To realize this in practice, we must assume that the elastic responses of the trap and of the handles can be parametrised by some reasonable physical model. Starting with the handles, we will assume that the extensible WLC model (ext-WLC) is a good description:

$$k_h(f) = k_h^{\text{ext-WLC}}(f; \{P, d_b, Y\}), \qquad x_h(f) = x_h^{\text{ext-WLC}}(f; \{P, d_b, Y\}), \tag{16}$$

where we introduced the usual WLC elastic parameters (i.e. persistence length $P$, Young modulus $Y$, and monomer length $d_b$). Then, we can either model the trap stiffness as constant, or as a linear function of force:

$$k_b(f) = k_{b,0} + \alpha f, \qquad x_b(f) = \frac{1}{\alpha}\log\left(1 + \frac{\alpha}{k_{b,0}}\,f\right), \tag{17}$$

where $\alpha$ quantifies the linear dependence and $k_{b,0}$ is the stiffness at zero force ($x_b(f)$ is obtained by integrating as in (15)).

Note that we can rewrite (1) as:

$$\lambda(f) = x_h(f) + x_b(f) + x_d(f)\,\delta_N + x_{ss}(f)\,\delta_U + \lambda_0, \tag{18}$$

where we used a Kronecker-delta-like notation ($\delta_{N(U)} = 1$ if the molecule is in the Native (Unfolded) state and zero otherwise) and explicitly introduced the offset $\lambda_0$, which accounts for the fact that the molecular extension is always measured with respect to the micropipette. If we now make explicit the dependence on our model parameters, (18) becomes:

$$\lambda(f) = x_h(f; \{P, d_b, Y\}) + x_b(f; \{\alpha, k_{b,0}\}) + x_d(f)\,\delta_N + x_{ss}(f)\,\delta_U + \lambda_0 \equiv M(f; \{P, d_b, Y, \alpha, k_{b,0}, \lambda_0\}), \tag{19}$$

where we have denoted by $M$ the overall model underpinning the $\lambda(f)$ response.
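The trap part of the model can be sketched in a few lines. The code below implements the linear stiffness of (17) together with its closed-form extension, and cross-checks the latter against a direct numerical integration of (15). The parameter values and the function names are hypothetical and serve only to illustrate the relation between $k_b(f)$ and $x_b(f)$.

```python
# Minimal sketch of the linear trap-stiffness model in (17).
# Assumption: the parameter values below are hypothetical, for illustration.
import math


def k_b(f, k_b0, alpha):
    """Trap stiffness [pN/nm] as a linear function of force f [pN]."""
    return k_b0 + alpha * f


def x_b(f, k_b0, alpha):
    """Bead displacement [nm]: integral of 1/k_b from 0 to f, as in (15)."""
    if alpha == 0.0:
        return f / k_b0  # constant-stiffness (Hookean) limit
    return math.log1p(alpha * f / k_b0) / alpha


def x_b_numeric(f, k_b0, alpha, steps=10000):
    """Midpoint-rule integration of 1/k_b, used only as a cross-check."""
    h = f / steps
    return sum(h / k_b((i + 0.5) * h, k_b0, alpha) for i in range(steps))


if __name__ == "__main__":
    k_b0, alpha = 0.07, 0.001  # hypothetical values: pN/nm and 1/nm
    for f in (2.0, 10.0, 20.0):
        print(f, x_b(f, k_b0, alpha), x_b_numeric(f, k_b0, alpha))
```

Using `math.log1p` keeps the closed form accurate when $\alpha f / k_{b,0}$ is small, which is the regime where the linear correction is only a mild perturbation of a constant stiffness.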
As (1) illustrates, the knowledge of a handful of physical parameters fully determines the FDC for the N and U branches. The key idea behind our methodology is that the inverse implication is also true: knowing $\lambda(f)$ and given $M$, we can extract $P, d_b, Y, \alpha, k_{b,0}$ and thus $k_h(f)$ and $k_b(f)$ for all $f$ using the models in (16) and (17), eventually obtaining the stretching contributions through numerical integration of (12). Crucially, this can be done without any a priori knowledge of the parameters of the experimental setup.

In practice, however, the fitting procedure requires an FDC featuring enough curvature to constrain the model and, even so, the number of parameters to fit is too large for a 2-dimensional curve, so some additional considerations must be taken into account. Firstly, reasonable bounds/priors on the allowed values of the parameters must be set. Secondly, it is convenient to assume that certain parameters play a minor role in the overall FDC shape (such as $Y$) or are characterized well enough (e.g. the monomer length for dsDNA) to be fixed at some nominal value and not fitted. Thirdly, computing the handle extension $x_h = x_h(f)$ using the extensible WLC can be slow and numerically inaccurate, as it normally requires a numerical inversion of $f = f(x_h)$. To address this, we introduce in Appendix A a formula to explicitly invert the WLC, which can then be used in (19). Fourthly, to get as many points as possible to constrain the fit, we have aligned all the FDCs at the starting point so that they share an identical offset $\lambda_0$ (i.e. 'const' in (1)(a,b)). After all these steps, fitting $\lambda(f) = M(f; \{P, d_b, Y, \alpha, k_{b,0}, \lambda_0\})$ is affordable. In the following section we will show a concrete example of the FDC fitting procedure and its application to extract the stretching contributions.

The effective stiffness method may work well when the range of force integration is not too large.
This condition is met in molecules exhibiting mild hysteresis. For molecules showing large hysteresis in pulling cycles, the limits of integration $f_{\min}$ and $f_{\max}$ are far apart and the effective stiffness $k^F_{\rm eff}$ cannot be considered constant anymore. Here we present results for an RNA molecule (CD4L12) falling in this category and present a general procedure to extract the free energy of formation. CD4L12 shares the same stem as the CD4 RNA discussed in section 4.2.2, but with the original tetraloop replaced by a dodecaloop (i.e. a loop of 12 bases), see the sequence in figure 4(a). A large loop yields a larger entropic barrier for refolding and a large hysteresis in the FDC. Pulling experiments were performed as described in section 4.1, with pulling speeds of 100 nm/s and 300 nm/s and a buffer containing 4 mM MgCl$_2$, 50 mM NaCl, and 10 mM Tris. The values $P = 0.75$ nm and $d_b = 0.665$ nm were used to describe the elastic properties of the ssRNA for this buffer [39].

As can be seen in figure 4(b), CD4L12 behaves as a two-state system, being either folded or unfolded along the FDCs. As expected, pulling cycles feature large hysteresis, with a maximal difference of nearly 20 pN between the lowest folding and the largest unfolding force rips. In order to compute the work needed for the CFT (7), we must integrate the area under the FDC within a large force range with a very low $f_{\min}$. It is clear that in this case the constant stiffness approximation described in section 4.2.1 does not apply, as shown in the inset of figure 4(b), where $k^F_{\rm eff}$ markedly changes with force. To estimate the stretching contributions we follow section 5.1 and (19) to obtain $\Delta W_b$, $\Delta W_h$ and, from (14), the value of $\Delta G_0$.

[Figure 4: (b) force [pN] versus distance $\lambda$ [nm]; inset: $k^F_{\rm eff}$ [pN/nm] versus force [pN]; (c) $k_h$ [pN/nm] versus force [pN].]

Figure 4. Free-energy recovery for the CD4L12 RNA hairpin. (a) Sequence and secondary structure of CD4L12 RNA. (b) Aligned folding (red) and unfolding (blue) FDCs for a given molecule pulled at 100 nm/s. Inset: effective stiffness profile measured along the folded branch. (c) Stiffness profile of the hybrid DNA-RNA handles which form the molecular construct used with CD4L12 and CD4 RNA [39]. Data points have been obtained using the high-frequency power-spectrum method described in [31]. The red line is a fit of the extensible WLC model, yielding $P = 20 \pm$ […] nm and $Y = 200 \pm 14$ pN ($d_b$ was not fitted but fixed to the interphosphate distance of A-form RNA, $d_b = 0.27$ nm [45]).

In order to carry out the fit prescribed by (19), we need to introduce some further assumptions to simplify the problem. Regarding the hybrid DNA-RNA handles, we use the interphosphate distance $d_b = 0.27$ nm of A-form RNA and the Young modulus $Y = 200$ pN obtained by fitting the stiffness profile of the handles (figure 4(c)). While changes in $d_b$ only moderately affect the overall curvature of $k_h$ (but they impact the overall contour length, an effect already captured by fitting $\lambda_0$), changes in $Y$ do not. Hence fixing these two values gives a better constrained model. For the persistence length of the handles $P$ it is convenient to fit the deviation $\Delta P$ (in %) with respect to a plausible expected nominal value $P_0$, i.e. $P_{\rm eff} = P_0(1 + \Delta P)$, which we take from the fit in figure 4(c) as $P_0 = 20$ nm. Lastly, we also include the number of nucleotides $n$ released in the transition between the folded and the unfolded branches as an extra free parameter of the model. We are thus eventually left with 5 free parameters, which we fit to (18) and (19) using a standard non-linear least-squares regression (Levenberg-Marquardt):

$$\lambda(f) = M(f) = M(f; \{k_{b,0}, \alpha, \Delta P, \lambda_0, n\}). \tag{20}$$

Figure 5. Fitting the folded and unfolded branches of the CD4L12 RNA hairpin. (a)

Solid blue line is an example of curve fitting based on (20). Data points used for the fit are the black diamonds. They are obtained by smoothing and filtering the gray dots, which are themselves obtained by aggregating the unfolding FDCs of different pulling cycles from figure 4(b). (b) Example of forward and reverse (solid and dashed lines) work distributions for the same molecule pulled at 100 nm/s. Due to the large hysteresis, work distributions do not overlap. Inset: illustration of the matching method to retrieve $\Delta G$ by imposing continuity between $P_F(W)$ (light green) and $P_R(-W)\,e^{(W - \Delta G)/k_BT}$ (dark green) in log-normal scale. Solid grey line is the fitted Gaussian, see [46] for details.

An example of such a fitting procedure is shown in figure 5(a). As can be seen, the agreement between the experimental points and the reconstructed curve is remarkable. Furthermore, all the values obtained from the fit agree with prior expectations. Firstly, the value of $n$ matches the expected number of released nucleotides (i.e. 52). Secondly, the zero-force trap stiffness $k_{b,0}$ falls in the expected range [31]. Thirdly, the force-dependence parameter $\alpha$ of the trap stiffness is of the same order of magnitude as values already reported in the literature for similar LOT settings [31]. Fourthly, $\Delta P$ is small, so $P$ is reasonably close to the assigned nominal value $P_0$. Another good generic indicator is the very low error on the fitted parameters, hinting at a well-constrained model; this is further confirmed by the observation that most off-diagonal entries of the correlation matrix of the fit are near zero (details not shown). We must finally stress that the choice of free parameters in (20) is convenient for the situation considered here but is by no means the only possible one. In a context where the trap is well characterized and the handles are not, we could for instance have fixed $k_b$ and fitted $d_b$. Equation (19) can be adapted at will, depending on the requirements.

With the fitted values of $\alpha$, $k_{b,0}$ and $\Delta P$ in hand and our assumptions for $Y$ and $d_b$ (legitimated retrospectively by the agreement of the fit in figure 5), we are now in a position to precisely establish the profiles of $k_h(f)$ and $k_b(f)$ through the use of equations (15), (16) and (17). We can now quantify the terms $\Delta W_b$ and $\Delta W_h$ using (12) and $\Delta G$ using the FT. These numbers together with (14) allow us to extract $\Delta G_0$.
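The last step, going from fitted parameters to stretching contributions, can be sketched as follows. The sketch assumes that the stretching work of each element has the form $\int f\,dx_i = \int f/k_i(f)\,df$, which follows from (15), and, as a simplification, treats the handles as an inextensible WLC inverted with the cubic formula of Appendix A. All numerical values, the temperature (298 K) and the function names are hypothetical; the contour length of 270 nm is only a rough guess for two handles of about 500 bases each at 0.27 nm per base.

```python
# Minimal sketch: stretching work of trap and handles from fitted parameters.
# Assumptions (hypothetical, for illustration only): inextensible WLC handles,
# T = 298 K, and every numerical parameter value used in the example below.
import math

KBT = 1.380649e-23 * 298.0 * 1e21  # k_B*T in pN*nm


def cbrt(x):
    """Real cube root, valid for negative arguments too."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def wlc_force(z, P):
    """Inextensible WLC force [pN] at relative extension z = x/L_c."""
    return (KBT / P) * (0.25 / (1.0 - z) ** 2 - 0.25 + z)


def wlc_extension(f, P):
    """Relative extension z(f) in [0, 1), from the cubic of Appendix A."""
    ft = 4.0 * P * f / KBT  # normalized force
    a2, a1, a0 = -(9.0 + ft) / 4.0, (3.0 + ft) / 2.0, -ft / 4.0
    Q = (3.0 * a1 - a2 ** 2) / 9.0
    R = (9.0 * a2 * a1 - 27.0 * a0 - 2.0 * a2 ** 3) / 54.0
    D = Q ** 3 + R ** 2
    if D > 0.0:  # one real root
        s = math.sqrt(D)
        return -a2 / 3.0 + cbrt(R + s) + cbrt(R - s)
    # three real roots: the physical one is the smallest
    theta = math.acos(max(-1.0, min(1.0, R / math.sqrt(-Q ** 3))))
    return 2.0 * math.sqrt(-Q) * math.cos((theta + 2.0 * math.pi) / 3.0) - a2 / 3.0


def work_handles(f_min, f_max, P, L_c):
    """Integral of f dx_h from f_min to f_max [pN*nm], in closed form."""
    def primitive(z):  # antiderivative of wlc_force with respect to z
        return (KBT / P) * (0.25 / (1.0 - z) - 0.25 * z + 0.5 * z ** 2)
    return L_c * (primitive(wlc_extension(f_max, P))
                  - primitive(wlc_extension(f_min, P)))


def work_trap(f_min, f_max, k_b0, alpha):
    """Integral of f / (k_b0 + alpha*f) df [pN*nm], in closed form."""
    if alpha == 0.0:
        return (f_max ** 2 - f_min ** 2) / (2.0 * k_b0)

    def primitive(f):
        return f / alpha - k_b0 / alpha ** 2 * math.log(k_b0 + alpha * f)
    return primitive(f_max) - primitive(f_min)


if __name__ == "__main__":
    f_min, f_max = 2.0, 22.0      # pN, hypothetical integration limits
    P, L_c = 20.0, 270.0          # nm, hypothetical handle parameters
    k_b0, alpha = 0.07, 0.001     # pN/nm and 1/nm, hypothetical
    dw_h = work_handles(f_min, f_max, P, L_c) / KBT
    dw_b = work_trap(f_min, f_max, k_b0, alpha) / KBT
    print(f"dW_h = {dw_h:.1f} kBT, dW_b = {dw_b:.1f} kBT")
```

Closed-form primitives are used here only because both toy models admit them; for a measured or more elaborate stiffness profile the same integrals would be evaluated numerically.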
Figure 5(b) shows the work distributions $P_F(W)$ and $P_R(-W)$ obtained from the FDCs (figure 4(b)). The very pronounced hysteresis and the large value of the average dissipated work in a pulling cycle (about 60 $k_BT$) are such that the F and R work distributions lie far apart without overlapping. Previous methods based on the overlap of F and R work distributions are not applicable and an alternative approach must be used, such as the Bennett acceptance ratio [47] or the "matching method". The latter consists in finding the optimal $\Delta G$ value so that $P_F(W)$ is the analytical continuation of $P_R(-W)\,e^{(W - \Delta G)/k_BT}$. This procedure is graphically illustrated in the inset of figure 5(b) and further explained in [46]. Results obtained for different molecules are shown in table 2. We note that the values of $\Delta G$ obtained with the two methods are compatible (matching being systematically 3-5 $k_BT$ lower than Bennett). Our estimated value $\Delta G_0 = 67 \pm$ […] $k_BT$ is not far from the Mfold prediction ($\Delta G_0 = 63\,k_BT$), showing the reliability of the approach.

$\Delta G_{\rm Bennett}$ [$k_BT$] | $\Delta G_{\rm Matching}$ [$k_BT$] | $\Delta W_m$ [$k_BT$] | $\Delta W_b + \Delta W_h$ [$k_BT$] | $\Delta G_0$ [$k_BT$]
1045 ± […] | […] | […] | […] | […]
Mean: 67 ± […] $k_BT$

Table 2. Fluctuation theorem and stretching contributions for the CD4L12 RNA hairpin with long handles. Overview of the values of $\Delta G$, the stretching corrections, and the final $\Delta G_0$ estimate for 6 different molecules. All values are given in $k_BT$. $\Delta G_{\rm Bennett}$ and $\Delta G_{\rm Matching}$ provide two ways to extract $\Delta G$ using the CFT. The value of $\Delta G_0$ is obtained through (9) using the value of the Bennett estimate. The last line corresponds to the only experimental setting in which the pulling speed is 300 nm/s; all the other results were obtained at 100 nm/s.

We want to stress the sensitivity of the value of $\Delta G_0$ to the accurate estimation of the stretching contributions which, being one order of magnitude larger, can lead to inconsistent results. Had we used a methodology assuming 'average' or 'standard' stretching contributions, we would have obtained erroneous numbers. Consider for instance subtracting the average value $\langle \Delta W_b + \Delta W_h + \Delta W_m \rangle = 898\,k_BT$ derived from table 2 from the highest and the lowest estimates of $\Delta G$ shown in the same table: this results in two widely off values, $\Delta G_0 = 1107 - 898 = 209\,k_BT$ and $\Delta G_0 = 863 - 898 = -35\,k_BT$. Therefore a tailored molecule-to-molecule estimation of the stretching contribution is absolutely essential for molecules like CD4L12, where the effective stiffness approximation cannot be used.
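For readers unfamiliar with the Bennett acceptance ratio mentioned above, the following minimal sketch solves the standard Bennett equation for $\Delta G$ by bisection. It is not the analysis code used for table 2: the work values are synthetic Gaussian quantiles constructed to satisfy the CFT exactly, with a dissipated work of only 2 $k_BT$ (much smaller than for CD4L12), and all names and numbers are assumptions made for illustration. Works are expressed in units of $k_BT$.

```python
# Minimal sketch of the Bennett acceptance ratio (BAR) estimate of Delta G.
# Assumptions: works are in units of k_B*T; the Gaussian data are synthetic.
import math
from statistics import NormalDist


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_delta_g(w_forward, w_reverse, tol=1e-10):
    """Solve the BAR equation for Delta G by bisection.

    w_forward: work done on the system along forward (unfolding) trajectories.
    w_reverse: work done on the system along reverse (refolding) trajectories,
               so that -w_reverse is comparable with w_forward.
    """
    n_f, n_r = len(w_forward), len(w_reverse)
    m = math.log(n_f / n_r)

    def imbalance(dg):  # monotonically increasing in dg
        lhs = sum(fermi(m + w - dg) for w in w_forward)
        rhs = sum(fermi(-m + w + dg) for w in w_reverse)
        return lhs - rhs

    # Bracket that is guaranteed to contain the root.
    lo = min(min(w_forward), -max(w_reverse)) - abs(m)
    hi = max(max(w_forward), -min(w_reverse)) + abs(m)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def gaussian_quantiles(mean, sigma, n):
    """Deterministic 'sample' made of n equally probable Gaussian quantiles."""
    dist = NormalDist(mean, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


if __name__ == "__main__":
    true_dg, sigma = 1000.0, 2.0  # synthetic values, in k_B*T
    # Gaussian work distributions obey the CFT when the mean dissipated
    # work equals sigma**2 / 2 in both directions.
    fwd = gaussian_quantiles(true_dg + sigma ** 2 / 2, sigma, 3000)
    rev = gaussian_quantiles(-true_dg + sigma ** 2 / 2, sigma, 2000)
    print(f"BAR estimate: {bar_delta_g(fwd, rev):.3f} kBT")
```

Bisection is chosen over a Newton-type solver because the imbalance function is monotonic in $\Delta G$, so convergence is guaranteed once the root is bracketed.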

6. Conclusions

We have presented a brief tutorial on the approaches commonly used to extract folding free energies of single molecules pulled with optical tweezers in unzipping assays. A recurrent issue in these calculations is the large magnitude of the stretching contributions to the full free-energy difference measured in a pulling experiment using the CFT. Such contributions arise from the experimental setup and include the optical trap, the elastic stretching of the handles used in the molecular construct and the extension release of the unfolded polymer. The analysis of these correction terms is greatly simplified when the effective stiffness of the experimental system can be approximated as constant, as we saw in section 4. In this so-called effective stiffness approximation a single parameter $k^F_{\rm eff}$ suffices to quantify the stretching contributions of handles and trap. We exemplified this case in the study of a DNA hairpin in section 4.1. For long handles the stiffness of the handles turns out to be comparable to that of the trap and a force-dependent $k^F_{\rm eff}$ is apparent. In this case, as we showed in section 4.2.2, one can still use the effective stiffness approximation if the range of integration to evaluate the work is narrow enough. This is possible if the pulling curves are not too irreversible and the forward and reverse work distributions overlap. In contrast, for strongly irreversible pulling experiments one needs to accurately characterize all elastic contributions from the experimental setup. Here we have introduced a novel method (section 5) based on least-squares fitting of the elastic response of the folded and unfolded branches. It relies on adapting the elastic parameters extracted from the literature (inter-monomer distance, persistence length, Young modulus) to the experimental data, as well as accurately retrieving the stiffness of the optical trap using the same data.

One problem that remains open is the magnitude of the statistical error committed in the estimation of $\Delta G_0$. In fact, $\Delta G_0$ is the difference of two large numbers ($\Delta G$ and the stretching contributions), each with a large error and extracted from the same experimental FDC data. How to combine the errors from these two large quantities remains largely unclear, as they are not really uncorrelated. A rule of thumb in single-molecule experiments is that the largest errors come from molecule-to-molecule experimental variability. It is then recommended to first extract $\Delta G_0$ values for different molecules by subtracting elastic contributions from $\Delta G$ on a single-molecule basis, and then derive the mean value of $\Delta G_0$ and the corresponding statistical error.

The large contribution of the stretching term (14) to the full free energy $\Delta G$ makes the prediction of the (comparably small) value of $\Delta G_0$ a difficult task. This situation is reminiscent of the enthalpy-entropy compensation problem in biochemistry [48, 49]. In this case free-energy differences of intra- and intermolecular weak interactions (e.g. folding, binding, allostery, enzymatic reactions and so on) are typically one order of magnitude smaller than entropies and enthalpies, i.e. $\Delta G = \Delta H - T\Delta S$ with $\Delta G \ll \Delta H, T\Delta S$. In this regard, enthalpy-entropy compensation in biochemistry appears to be similar to the $\Delta G_0$-stretching compensation in force spectroscopy. The analogy is not pure coincidence, as the stretching contributions are essentially also of entropic nature and much larger than $\Delta G_0$.

The methodology we have described should be generally useful and applicable to force spectroscopy studies of single-molecule constructs whenever elastic contributions are present. Applications go beyond the measurement of folding free energies and include extracting molecular free-energy landscapes [30], measuring ligand binding energies [50] and protein-protein and RNA-protein interactions, and characterizing heterogeneous molecular ensembles [51].

Acknowledgements

We acknowledge financial support from Grants Proseqo (FP7 EU program), FIS2016-80458-P (Spanish Research Council) and Icrea Academia prizes 2013 and 2018 (Catalan Government).

References

[1] Alberts B, Johnson A, Lewis J, Raff M, Roberts K and Walter P 2002 Molecular Biology of the Cell (New York: Garland Science)
[2] Felsenfeld G and Miles H T 1967

Annual Review of Biochemistry et al. Nature Reviews Microbiology Nature Biotechnology Biophysical chemistry: Part II: Techniques for the study ofbiological structure and function (Macmillan)[6] Shashkova S and Leake M C 2017

Bioscience Reports BSR20170031[7] Meller A, Nivon L and Branton D 2001

Physical Review Letters Nature Methods Current Science

Physics Today The European Physical Journal B Reports on Progress in Physics Physical Review X Physical Review Letters Physical Review E Science

Nature

Advances in Chemical Physics

Proceedings of the National Academy of Sciences

Nature Physics Nature Physics Proceedings of the National Academy of Sciences

Physical Review Letters

Physical Chemistry Chemical Physics Proceedings of the National Academy of Sciences

Physical Review E Proceedings of the NationalAcademy of Sciences

Proceedings of the National Academy of Sciences

Biophysical Journal [30] Woodside M T, Anthony P C, Behnke-Parks W M, Larizadeh K, Herschlag D and Block S M 2006 Science

Biophysical Journal

Macromolecules Biophysical Journal Rheologica Acta Annual Review of Biophysics BiophysicalJournal

Biopolymers

Nucleic Acids Research Nucleic Acids Research et al. Science

BiophysicalJournal Physical Review E Nucleic Acids Research Proceedings ofthe National Academy of Sciences

BiophysicalJournal Biophysical Journal

Journal of Computational Physics Journal of Biological Chemistry

Protein Science Science

Free energy and information-content measurements in thermodynamic andmolecular ensembles

Ph.D. thesis Universitat de Barcelona[52] Broekmans O D, King G A, Stephens G J and Wuite G J 2016

Physical Review Letters

Handbook of mathematical functions: with formulas, graphs,and mathematical tables vol 55 (Courier Corporation)[54] Borwein P and Erd´elyi T 1995

Polynomials and polynomial inequalities vol 161 (Springer Science & Business Media)

Appendix A. WLC explicit inversion

The inextensible WLC model described in (4) gives a very direct way to compute $f = f(x)$, but it is not straightforward to use it to retrieve $x = x(f)$. Although numerical inversion using Mathematica and other software is possible (e.g. as in [52]), it is useful to have explicit inversion formulae. Hence let us now quickly show that (4) can be easily inverted to express $z := x/L_c$ as a function of $f$. We first define the normalized quantity $\tilde f = (4P/k_BT)\,f$. We can then rewrite (4) as $\tilde f = (1 - z)^{-2} - 1 + 4z$. By multiplying both sides of the previous equation by $(1 - z)^2$ and moving all terms to the same side, we obtain:

$$0 = z^3 + a_2 z^2 + a_1 z + a_0 \quad \text{with} \quad a_2 = -\frac{9}{4} - \frac{\tilde f}{4}, \quad a_1 = \frac{3}{2} + \frac{\tilde f}{2}, \quad a_0 = -\frac{\tilde f}{4}. \tag{A.1}$$

Hence, expressing $z$ as a function of $f$ simply maps to finding the roots of a cubic polynomial, a problem solved since the 16th century. The approach taken here is the canonical one [53, 54]. We start by defining the following intermediate quantities:

$$R := \frac{9 a_2 a_1 - 27 a_0 - 2 a_2^3}{54}, \qquad Q := \frac{3 a_1 - a_2^2}{9}, \tag{A.2}$$

from which we compute the discriminant $D$ for cubic equations:

$$D := Q^3 + R^2. \tag{A.3}$$

If $D > 0$, there is only one real solution to (A.1), and we define the following intermediate quantities to express the answer:

$$T := \sqrt[3]{R + \sqrt{D}}, \qquad S := \sqrt[3]{R - \sqrt{D}} \tag{A.4}$$

(since $D > 0$, $\sqrt{D}$ is real, and thus there is indeed at least one real cubic root for $T$ and $S$). The desired inverse value $z^* = z(f)$ is then finally obtained as:

$$z^* = -\frac{a_2}{3} + S + T. \tag{A.5}$$

If $D < 0$, there are three real roots to the cubic equation. These roots can be obtained by re-using the quantities $S$ and $T$ defined above, but doing so requires complex number algebra, which may not be handy. Instead, we can define the following intermediate quantity:

$$\theta := \arccos\left(\frac{R}{\sqrt{-Q^3}}\right), \tag{A.6}$$

from which the three real roots $z_1, z_2, z_3$ can be obtained directly as:

$$z_i = 2\sqrt{-Q}\,\cos\left(\frac{\theta + \theta_i}{3}\right) - \frac{a_2}{3} \quad \text{with} \quad \theta_1 = 0,\ \theta_2 = 2\pi,\ \theta_3 = 4\pi. \tag{A.7}$$

To decide which of the three roots is the physical one, recall that $z = x/L_c$ and that a property of the inextensible WLC is that the extension $x$ is always smaller than the contour length $L_c$, so the root we seek lies in $[0, 1)$. Using standard trigonometric formulae and the fact that $2\sqrt{-Q} > 0$, one finds $z_1 - z_2 > 0$ and $z_3 - z_2 \geq 0$ for any $\theta$ (which must belong to $[0, \pi]$ by definition of the arccosine), which implies that $z_2$ is the smallest of all the roots. Moreover, we note that all the roots must be positive, since we see in (4) that $\forall z < 0$, $f(z) < 0$. Since (4) maps $[0, 1)$ onto all non-negative forces, a root in $[0, 1)$ always exists and, as $z_2$ is the smallest of the roots, it has to be the one we are looking for. Hence $z_2 = z^* = z(f)$ when $D < 0$. The previous result also covers the $D = 0$ situation: the argument of the arccosine in (A.6) is then $\pm 1$, two of the three roots coincide, and $z_2$ remains the smallest.

Let us finally note that in the case of the extensible WLC, the key difference with the inextensible case is the replacement $L_c \to L_c(1 + f/Y)$, with $Y$ the Young modulus, i.e. the contour length is now force dependent. It can be shown that this implies the following relationship between the two models: x extW LC ( f ) = x inextW LC ( f ) (1 + f /Yf /Y