Delineating elastic properties of kinesin linker and their sensitivity to point mutations

Abstract

We analyze free energy estimators from simulation trials mimicking single-molecule pulling experiments on a neck linker of a kinesin motor. For that purpose, we have performed a version of steered molecular dynamics (SMD) calculations. The sample trajectories have been analyzed to derive the distribution of work done on the system. In order to induce unfolding of the linker, we have stretched the molecule at a constant pulling force and allowed for a subsequent relaxation of its structure. The use of fluctuation relations (FR) relevant to non-equilibrium systems subject to thermal fluctuations allows us to assess the difference in free energy between stretched and relaxed conformations. To further understand the effects of potential mutations on the elastic properties of the linker, we have performed similar in silico studies on a structure formed of a polyalanine sequence (Ala-only) and on three other structures, created by substituting selected types of amino acid residues in the linker's sequence with alanine (Ala). The results of SMD simulations indicate a crucial role played by the Asparagine (Asn) and Lysine (Lys) residues in controlling stretching and relaxation properties of the linker domain of the motor.


Michał Świątek and Ewa Gudowska-Nowak

Jagiellonian University, Marian Smoluchowski Institute of Physics, ul. Prof. S. Łojasiewicza 11, Kraków, 30-348, Poland

Jagiellonian University, Marian Smoluchowski Institute of Physics and Mark Kac Center for Complex Systems Research, ul. Prof. S. Łojasiewicza 11, Kraków, 30-348, Poland

Introduction

The motor proteins' ability to generate movement creates a situation where it is possible to treat different parts of one molecule as roughly independent objects that can move at different times and speeds. That makes the field of motor proteins a desirable testing ground for the application of various theoretical models that aim to filter the inherent complexity of biological systems. Among the motors, members of the kinesin protein superfamily have been used consistently over the years in research focusing on the mechanical properties of molecular motors, both in vitro and in silico.

Strain through the neck linker ensures processive runs of the motor and can be estimated by analyzing elastic properties of border regions between heads of the kinesin molecule. Here, in a desire to simplify a polymer-like model of the linker, we have neglected long range interactions and considered solely the structure of the neck linker itself. The neck linker label refers to a concise (less than 20 amino acids in length) amino acid sequence in a single kinesin head that acts as a bridge between the α-6 helix in the coiled-coil dimerization domain and the α-7 helix in the core motor domain, respectively. Accumulating experimental evidence suggests that a transition of the neck linker from a disordered (random coil) state to an ordered (β-sheet) conformation is a key factor in determining a mechanism of force generation that is a crucial element of molecular motors' ability to move along microtubules.

Biopolymers forming chains are often interpreted in terms of numerous approximations, all having roots in theoretical assumptions that form a basis of the Freely Jointed Chain (FJC) model. Among these, the worm-like-chain (WLC) model seems to be the most relevant when it comes to describing a bending process of a semi-rigid biostructure, even though it has received some criticism.
Stretching of a peptide requires the application of a certain force, and the relation between that force $F$ and the stable extension $x$ of the chain can be formulated as

$$F = \frac{k_B T}{L}\left[\frac{1}{4}\left(1 - \frac{x}{L_c}\right)^{-2} + \frac{x}{L_c} - \frac{1}{4}\right] \tag{1}$$

where $k_B$ is the Boltzmann constant, $T$ stands for temperature, $L_c$ is the contour length of the polymer and $L$ its persistence length. In a previous work, we have already presented a preliminary venture into the matter at hand, showcasing a difference in the stretching process between a neck linker and an Ala-only polypeptide. Here, employing the methods of Molecular Dynamics and Normal Mode Analysis, we intend to deliver a more comprehensive description of a possible relation between the neck linker's amino acid sequence and the specificity of its function.

The paper is organised as follows: After a brief Introduction, the section Material and Methods presents basic methodology of the domain analysis and describes the setup of Molecular Dynamics (MD) simulations. A number of theoretical considerations are discussed, pertaining to the thermodynamic description of an amino acid chain, the ability to determine its elasticity via the force-extension relations and the significance of non-equilibrium dynamics as used in our simulations. All these are placed in subsections of their own. Next, in Results and Analysis, data collected in series of simulations are presented and examined. The last section, Conclusions, contains our closing remarks in which we summarise findings and highlight points of interest for future research in this field.
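As an illustration of the worm-like-chain relation of Eq. (1), the following minimal Python sketch evaluates the force for a given extension. The function name, the SI units and the numerical values in the usage note are assumptions made for this example only; they are not parameters taken from the simulations reported here.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant in J/K


def wlc_force(x, contour_length, persistence_length, temperature=300.0):
    """Worm-like-chain force of Eq. (1), returned in newtons.

    x, contour_length (L_c) and persistence_length (L) are in metres.
    The interpolation formula is only defined for 0 <= x < L_c.
    """
    rel = np.asarray(x, dtype=float) / contour_length
    if np.any(rel < 0.0) or np.any(rel >= 1.0):
        raise ValueError("extension must satisfy 0 <= x < L_c")
    prefactor = K_B * temperature / persistence_length
    # Bracket of Eq. (1): (1/4)(1 - x/L_c)^-2 + x/L_c - 1/4
    return prefactor * (0.25 / (1.0 - rel) ** 2 + rel - 0.25)
```

For example, `wlc_force(0.5e-8, 1.0e-8, 1.0e-9)` evaluates the bracket at half of an (arbitrarily chosen) contour length of 10 nm with a persistence length of 1 nm; the force vanishes at zero extension and diverges as the extension approaches the contour length.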

Methods

Domain analysis of kinesin heads

In the initial part of our studies we have identified dynamic domains in the structure of Kinesin Heavy Chain (taken from the PDB Databank, id: 3kin) and analyzed deformations (low-frequency domain motions) which have been obtained with a simplified mechanical model proposed by Hinsen. The method is essentially based on the observation that the low-frequency modes describing motion of domains in proteins are influenced by anharmonic effects and in realistic environments become strongly overdamped, thus independent of the applied force field details.

The structure has been analyzed with the DomainFinder application. Firstly, an approximate normal mode analysis has been performed. Then, the 16 modes of lowest frequency have been stored for further domain and deformation analysis. An energy threshold of 200 kJ/mol has been chosen to discriminate regions of sufficient rigidity to be candidates for domains. This choice has led to the images presented in FIG.1, in which domains are associated with internally stable regions of the protein and off-domain regions are relatively fluid.

Here, blue color represents regions of deformation energy well below threshold, while light blue and light red parts symbolize regions slightly below and slightly above the threshold, respectively. A crucial parameter differentiating between rigid regions with uniform motions and intermediate regions whose internal deformation yields systematic contributions to the overall motion between boundaries of the domain is the domain coarseness factor $c$. In brief, the coarseness parameter specifies how similar the rigid body motions of different residues should be to consider that these residues form a dynamical domain; thus the smaller the coarseness, the finer is the definition of the domains. Since the deformation energies of the chosen 16 modes have ranged up to 15-20, the deformation analysis based on these 16 modes has been performed with a deformation threshold equal to 15.
The dynamical domain decomposition has resulted in the images displayed in the lower panel of FIG.1.

The deformation energy definition used by DomainFinder relates to the interaction energy between two particles $i$ and $j$ of the elastic network and reads

$$E_i = \frac{1}{2}\sum_{j=1}^{N} k\left(R^{(0)}_{ij}\right) \frac{\left|\left(\mathbf{d}_i - \mathbf{d}_j\right)\cdot\mathbf{R}^{(0)}_{ij}\right|^{2}}{\left|\mathbf{R}^{(0)}_{ij}\right|^{2}} \tag{2}$$

where $i$, $j$ denote C$\alpha$ atoms, $\mathbf{d}_{i,j}$ are the corresponding infinitesimal displacements from original positions (displacement of the atom in the mode to be analyzed), $R_{ij}$ are distances between pairs of atoms $i$, $j$ in a submitted structure and $k(R_{ij})$ is an effective harmonic force constant that attenuates with spatial distance according to the relation

$$k\left(R^{(0)}_{ij}\right) = C \exp\left(-\frac{\left|R^{(0)}_{ij}\right|^{2}}{r_0^{2}}\right) \tag{3}$$

in order to maintain a force field of short range ($r_0$) that excludes interactions between potential domains. Parameter $C$ has been chosen arbitrarily, as 47,400 kJ mol$^{-1}$ nm$^{-2}$ at a temperature of 300 K, to ensure compatibility with the Amber 94 force field.

Amplitudes of $\mathbf{d}_{i,j}$ are defined with the equation

$$\sum_{i=1}^{N} \left|\mathbf{d}_i\right|^{2} = f N, \tag{4}$$

$f$ being a scaling factor of value 1 nm$^{2}$. Displacements $\mathbf{d}_i$ have been used to define rotation ($\boldsymbol{\Omega}$) and translation ($\mathbf{T}$) vectors:

$$\mathbf{d}_i = \mathbf{T} + \boldsymbol{\Omega} \times \mathbf{R}_i \tag{5}$$

where $\mathbf{R}_i$ stands for the position of atom $i$. When the displacement vectors do not describe pure rigid-body motion, a linear least-squares fit is used to determine the values of $\mathbf{T}$ and $\boldsymbol{\Omega}$. The structure is further divided into cubic compartments of side length 1.2 nm, all ignored unless containing at least 3 atoms and having an average deformation energy below the pre-defined threshold. Those cubes have their rotation and translation vectors calculated. The following definition of similarity is used to identify clusters of cubes having similar mobility:

$$S_{ij} = \frac{\left|\boldsymbol{\Omega}_i + \boldsymbol{\Omega}_j\right|}{\left|\boldsymbol{\Omega}_i - \boldsymbol{\Omega}_j\right|} + \frac{\left|\mathbf{T}_i + \mathbf{T}_j\right|}{\left|\mathbf{T}_i - \mathbf{T}_j\right|} \tag{6}$$

The rotation vectors are empirically more precise in sorting out domains, and thus are given greater weight.
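The per-atom deformation energy of Eqs. (2) and (3) can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions and not the DomainFinder implementation: the function name is hypothetical, the conventional prefactor 1/2 is assumed in Eq. (2), and the range parameter `r0 = 0.7` nm is an assumed value that is not quoted in the text.

```python
import numpy as np


def deformation_energy(positions, displacements, c=47400.0, r0=0.7):
    """Per-atom deformation energy E_i, Eq. (2), with k(R) from Eq. (3).

    positions:     (N, 3) C-alpha coordinates in nm
    displacements: (N, 3) mode displacements d_i in nm
    c:  force constant scale in kJ mol^-1 nm^-2 (value quoted in the text)
    r0: range parameter in nm (ASSUMED here, not given in the text)
    """
    pos = np.asarray(positions, dtype=float)
    disp = np.asarray(displacements, dtype=float)
    r_ij = pos[:, None, :] - pos[None, :, :]      # pair vectors R_ij^(0)
    d_ij = disp[:, None, :] - disp[None, :, :]    # d_i - d_j
    dist2 = np.sum(r_ij ** 2, axis=-1)            # |R_ij|^2
    k_ij = c * np.exp(-dist2 / r0 ** 2)           # Eq. (3)
    proj2 = np.sum(d_ij * r_ij, axis=-1) ** 2     # |(d_i - d_j) . R_ij|^2
    np.fill_diagonal(dist2, 1.0)                  # i == j terms: proj2 is 0, avoid 0/0
    return 0.5 * np.sum(k_ij * proj2 / dist2, axis=1)
```

A rigid translation or an infinitesimal rigid rotation of the whole structure, as in Eq. (5), gives zero deformation energy for every atom, which is the property the domain search relies on when it looks for regions of low deformation energy.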
A cluster is finally created by using the criterion

$$S_{ik} > \frac{S^{\max}_{ij}}{c} \tag{7}$$

where $c$ is the pre-selected domain coarseness parameter. All cubes satisfying this relation are then considered to compose one cluster.

Structure and relaxation of extended linker: mechanical and thermodynamic considerations

By definition, the partition function for simulations run at constant volume is given by

$$Z = N \int d\mathbf{r}\, e^{-\beta E(\mathbf{r})} \tag{8}$$

where $\beta = 1/(k_B T)$. Here $T$ is the absolute temperature and $E$ stands for the energy of a given configuration state, with the average energy of the system given by

$$\langle E \rangle = \int d\mathbf{r}\, E(\mathbf{r})\, \rho(\mathbf{r}), \tag{9}$$

in which $\rho(\mathbf{r})$ is the equilibrium probability density $\rho(\mathbf{r}) = e^{-\beta E(\mathbf{r})} \times \left[\int d\mathbf{r}\, e^{-\beta E(\mathbf{r})}\right]^{-1}$. Accordingly, the configurational entropy of the system is given by

$$S = \frac{\langle E \rangle - F}{T} = k_B \log N - k_B \int d\mathbf{r}\, \rho(\mathbf{r}) \log \rho(\mathbf{r}) \tag{10}$$

with $F$ being the Helmholtz free energy $F = -k_B T \ln Z$. If the system energy is partitioned over many local minima (energy wells), the configurational integral Eq. (8) can be represented in the form of a sum $Z = \sum_i Z_i$ with $Z_i = N \int_i d\mathbf{r}\, e^{-\beta E(\mathbf{r})}$ and the integral evaluated over the $i$-th energy well, in which the probability density $\rho_i(\mathbf{r})$ can be expressed as

$$\rho_i(\mathbf{r}) = \frac{\rho(\mathbf{r})}{p_i}, \qquad \mathbf{r} \in \Omega_i \tag{11}$$

with $p_i = Z_i / Z$. The average energy of the system can then be rephrased as $\langle E \rangle = \sum_i p_i \langle E \rangle_i$ with entropy

$$S = -k_B \sum_{i}^{N} p_i \ln p_i + \sum_i p_i S_i \tag{12}$$

given by the sum of the weighted average of individual entropies $S_i$ associated with different wells and the entropy of partitioning of the system among various wells. All in all, the weighted averages of Eqs. (10) and (12), or the differences $\Delta S$, $\Delta F$ pertinent to two (initial/final) states, can then be accessed in a straightforward way by a histogram method counting the number of times the molecule "visited" a given configurational state in the course of MD simulations.

MD simulations' setup

A structure of a motor domain, belonging to the kinesin-like protein KIF3B (a member of the Kinesin-2 family), has been obtained from the PDB Databank (id: 3b6u). A sequence of 19 amino acids, 17 of which are considered to be a functional part known as a neck linker, has then been extracted and optimized. In order to do so, the Steepest Descent and Conjugate Gradients algorithms have been employed. The neck linker chain was subsequently placed in a box of water molecules (a Simple Point Charge model), and the whole system was optimized again. Next, a simulation of Molecular Dynamics (MD) was scheduled, with the goal of achieving a state of at least near equilibrium, producing a 6 ns long trajectory (with a time step duration ∆t = × − ps). A distribution of end-to-end distances was then created and used to determine the mean end-to-end distance of the equilibrated linker chain. Finally, a structure has been selected with an end-to-end distance sufficiently close to the mean, while belonging to a time frame from near the end of the simulation.

That selected neck linker structure has been taken out of the box of water molecules and placed in the implicit solvent. A short simulation, with positions of the first and last Cα atoms fixed, served as a short equilibration routine. After that, the system was employed as a starting point of 10 simulation runs, where a constant force of 1300 kJ mol$^{-1}$ nm$^{-1}$ (approximately 2160 pN) was applied between the mass centre of the 1st residue and the mass centre of the 19th residue. We desired to test our structure against a strong external influence that causes a rapid response from our system and, with that in mind, the aforementioned force value has been chosen after some preliminary MD runs. Those simulations of non-equilibrium dynamics produced a set of 1 ps trajectories. The final states of these trajectories became starting points of another 10 simulations, each lasting 1 ps (∆t = × − ps), where the constant stretching force had been turned off, resulting in system relaxation. In all simulations, the Berendsen thermostat has been used to ensure stable temperature conditions (T = 300 K), while the pressure has been controlled with the Parrinello-Rahman barostat (p = 100 kPa).

Since our interest has been focused on investigating the linker's specific elasticity, analogous steps have been taken with regard to an 18 amino acid long Ala-only peptide, with the equilibration simulation being 3 ns long. The alanine residue is among the simplest possible, with a side chain reduced to a single methyl group and, because of that, a polyalanine sequence has been deemed the best model for observing a protein polymer behavior limited essentially to the protein backbone. Additionally, the above modelling steps have been repeated for three "intermediate" structures between the original neck linker and the Ala-only polymer. Namely, by selecting some of the amino acid residues (either Asparagine, or Proline, or Lysine) and substituting them with alanine residues, "mutant" versions of the neck linker have been created. The equilibrium simulations of those three altered linker structures lasted 5 ns, 6 ns and 3 ns for the no-Asparagine, no-Lysine and no-Proline chains, respectively. Finally, the whole process involving all 5 different sequences has been repeated for different stretching force values (130, 400, 700 and 1000 kJ mol$^{-1}$ nm$^{-1}$). Version 4.5.5 of the GROMACS package and the OPLS-aa force field have been employed to perform all MD simulations.
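The conversion between the force units used in the simulations (kJ mol$^{-1}$ nm$^{-1}$) and the piconewtons customary in single-molecule experiments can be checked with a short Python sketch. The helper names are hypothetical and introduced only for this example; the physical constants are the exact SI values.

```python
AVOGADRO = 6.02214076e23   # particles per mole
K_B = 1.380649e-23         # Boltzmann constant in J/K


def kj_per_mol_nm_to_pn(force):
    """Convert a force from kJ mol^-1 nm^-1 to piconewtons."""
    newtons = force * 1.0e3 / AVOGADRO / 1.0e-9   # J per molecule per metre
    return newtons * 1.0e12


def kbt_in_pn_nm(temperature=300.0):
    """Thermal energy k_B T in pN nm (1 J = 1e21 pN nm)."""
    return K_B * temperature * 1.0e21


# Stretching forces used in the simulations described above
for f in (130, 400, 700, 1000, 1300):
    print(f"{f:5d} kJ/mol/nm = {kj_per_mol_nm_to_pn(f):7.1f} pN")
```

With these constants, 1300 kJ mol$^{-1}$ nm$^{-1}$ corresponds to roughly 2159 pN, in agreement with the rounded value of 2160 pN quoted above, and $k_B T$ at 300 K is about 4.14 pN nm.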

Steered molecular dynamics of kinesin linker structure

Molecular interactions and mechanical properties of individual molecules can nowadays be probed by the use of combined techniques, like Atomic Force Microscopy (AFM) and optical tweezers. In such experiments single molecules are held and stretched, and from the measurements of a cantilever spring restoring force in the AFM instrument, information about elasticity (effective spring constants) and intensity of rupture forces can be derived. Analogous to these experimental setups, steered molecular dynamics (SMD) simulations permit similar investigations to be performed in silico. In brief, the procedure of SMD applies external steering forces in molecular dynamics simulations to investigate processes such as protein unfolding or binding/unbinding of substrates separated by some energy barriers. Practical designs of such simulations are based on relating the free energy difference between nonequilibrium steady states achieved in the course of manipulation to the work done through the process. At the beginning of the action, the thermostated system is held at equilibrium at a given temperature $T$. By changing an externally controlled parameter $\lambda$, the external work $W$ done on the system may be estimated. The process is repeated many times so that the statistics of work performed is collected, with the free energy difference between the steady states related by the Jarzynski equality

$$\left\langle e^{-\beta W} \right\rangle = \int dW\, p(W)\, e^{-\beta W} = e^{-\beta \Delta G}, \qquad \beta = (k_B T)^{-1} \tag{13}$$

with the average taken over repeated realizations of the process and $p(W)$ being the relevant probability density function (PDF) for the work distribution.
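The exponential average of Eq. (13) can be turned into a simple estimator of $\Delta G$ from a set of work values. The sketch below is a generic illustration, not the analysis script used for the results reported here; the function name is hypothetical and the work values are assumed to be supplied in the same energy units as `kbt`.

```python
import numpy as np


def jarzynski_free_energy(work, kbt=1.0):
    """Estimate Delta G from work samples via the Jarzynski equality, Eq. (13).

    work: sequence of work values, in the same energy units as kbt.
    A log-sum-exp shift keeps the exponentials finite when |W| >> kbt.
    """
    w = np.asarray(work, dtype=float) / kbt
    shift = np.min(w)
    # log <exp(-w)> = -shift + log mean(exp(-(w - shift)))
    log_mean = -shift + np.log(np.mean(np.exp(-(w - shift))))
    return -kbt * log_mean
```

By Jensen's inequality the estimate never exceeds the mean work, and for a small number of trajectories the exponential average is dominated by the lowest work values, so estimates based on only a handful of runs should be read as rough.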
The microscopic state of the system composed of $N$ particles in contact with the heat bath is specified by a $3N$-dimensional position vector $\mathbf{r}$ and a $3N$-dimensional momentum vector $\mathbf{p}$ evolving under the dynamics governed by the Hamiltonian $H(\mathbf{r}, \mathbf{p})$. Description of the system dynamics in terms of a collective variable $x$ (reaction coordinate), aimed to capture the essentials of the undergoing thermodynamic process, is possible by defining a potential of mean force $V(x)$ expressed as

$$e^{-\beta V(x)} = \int d\mathbf{r}\, d\mathbf{p}\; \delta\left(x - x'(\mathbf{r})\right) e^{-\beta H(\mathbf{r}, \mathbf{p})} \tag{14}$$

If the system is further perturbed by an external potential $V_\lambda$ with a control parameter $\lambda$, the total potential energy of the system becomes $U(x, \lambda) = V(x) + V_\lambda(x)$ and the total Hamiltonian changes to $\tilde{H}_\lambda(\mathbf{r}, \mathbf{p}) + H'_\lambda$. In the experiments with a pulling force, the center of mass of the pulled molecule is attached to a spring with an elasticity constant $k$, so that the pulling force is $F = k(\lambda(t) - x)$ with the control parameter $\lambda(t) = x_0 + vt$. The thermodynamic work definition for overdamped dynamics then becomes

$$W = \int dt\, \dot{\lambda}\, \frac{\partial U(x; \lambda)}{\partial \lambda} \equiv -k\gamma \int_0^{t} du\, (x_u - vu) \tag{15}$$

where $\gamma$ stands for the friction coefficient.

The implications of non-equilibrium dynamics described above have been taken into consideration as we set out to gauge the free energy difference between stretched and relaxed amino acid chains.

Results and Discussion

A clear decomposition of the kinesin structure into three domains comprising two heads can be seen in FIG.1, with the softest motions observed in the region of the neck linker. This domain, formed of 14-18 amino acids, is widely considered a key structure underlying kinesin's force-generating mechanism and has been examined in a series of experimental and theoretical studies. In particular, it has been proposed that a conformationally flexible unstructured state of the linker changes to a structured and docked one upon ATP binding, providing the essential conformational change in the motor, responsible for subsequent stepping. On the one hand, the linker spring has to be flexible enough to allow for a diffusive search by the motor head for the next binding site. On the other, when both heads of the motor are simultaneously bound to the microtubule track, the neck linker has to be sufficiently stiff to ensure that mechanical forces between both head domains enable mechanical coupling. Accordingly, mechanical models and molecular dynamics simulations of this peptide structure are important contributions to understanding its elastic properties and ability to control kinesin's motion.

In order to select a proper starting structure for stretching simulations, the construction of a well equilibrated initial ensemble is required. It is often assumed that, for a small system with properly functioning temperature and pressure coupling, a run that does not exceed 100 ps is enough to achieve a state of equilibrium. However, it has been shown that such an assumption is not necessarily true. Taking into consideration possible difficulties in achieving equilibrium, we have decided to measure fluctuations of the end-to-end distance, aside from a routine check of parameters (e.g. Root Mean Square Displacement (RMSD) or average potential energy) typically used as indicative measures for an equilibrated system.

Figure 2 displays end-to-end distance distributions in different time windows of the simulation with Gaussian curves fitted to the data. For a chain made up of orientationally uncorrelated (freely jointed) links with a length of each segment randomly distributed, the end-to-end stretch distance is expected to follow Gaussian statistics. Although we have observed that the length distributions of the neck linker as well as the Ala-only polypeptide do fit Gaussian curves in certain time regimes, the long-run equilibrium simulations of the segments clearly indicate deviations of the end-to-end distances from Gaussianity, see FIG.2. This observation stays in line with the assumed GROMACS force field which, apart from the harmonic approximations on bonds and angles, also contains long range interactions:

$$V = \sum_{\text{bonds}} \frac{k_i}{2}\left(x_i - x_{i,0}\right)^{2} + \sum_{\text{angles}} \frac{k_i}{2}\left(\theta_i - \theta_{i,0}\right)^{2} + \sum_{\text{torsions}} \frac{V_n}{2}\left(1 + \cos(n\omega - \gamma)\right) + \sum_{i}^{N} \sum_{j=i+1}^{N} \left( 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right] + \frac{q_i q_j}{4\pi\varepsilon_0 r_{ij}} \right). \tag{16}$$

Here $x_i$ is the $i$-th bond length and $\theta_i$ stands for the $i$-th angle, while $x_{i,0}$ and $\theta_{i,0}$ are their respective reference values. $V_n$ is a parameter that gives information about rotation barriers of a torsion angle $\omega$, while $k_i$ refers to the $i$-th force constant. $\varepsilon_{ij}$ is the minimal value of the Van der Waals potential between atoms with indices $i$ and $j$, $r_{ij}$ is the distance between these atoms, and $\sigma_{ij}$ is the distance between them at which the Van der Waals potential equals 0. The symbols $q_i$ and $q_j$ refer to the charges of the $i$-th and $j$-th atom respectively, and $\varepsilon_0$ stands for the dielectric constant.

Simulation results indicate that when the chain's structure attains a local minimum of the potential, the long range interactions do not play a significant role.
Effectively, their influence on variations of the potential energy wanes temporarily. As a result, the chain is able to explore a narrow conformational subspace, behaving similarly to the Gaussian chain, before being pulled out of the energy well by thermal fluctuations. If the chain never leaves the vicinity of that particular energy well, the overall distribution of end-to-end distances approaches a normal distribution for sufficiently long simulation runs.

The mean end-to-end distance of the Ala-only chain over the whole simulation equals 3.77 ± end-to-end distances of the neck linker structure is not as well fitted to a Gaussian curve. In order to make sure that a chosen structure is sufficiently close to the potential energy minimum, we have selected a model one with the end-to-end distance of 2 . nm, belonging to the class of conformations attained between 4 . . ns of simulation runs. Results of the pulling experiments performed on the chosen neck linker and Ala-only chains are displayed in FIG.3 and clearly indicate that both structures stretch at an (almost) constant rate for the majority of the process. At the same time, in accordance with findings reported in our prior studies, we observe that the Ala-only chain's linear response drops much sooner than that of the neck linker structure.

While the end-to-end distance is a useful parameter in AFM experiments and those mimicking them in silico, it does not give detailed information regarding the inner dynamics of the examined structure. In order to gather additional information that could hint at inner dynamic characteristics and elasticity of the analyzed biopolymer chains, we have measured pair distances between the 4th residue of the simulated chains and a selected residue of interest. The 4th residue of our neck linker sequence is an Isoleucine, which is one of the most prevalent amino acids at this position in neck linker sequences across numerous kinesin families.
The second selected residue in the pair has been chosen as either a neighboring Asparagine, Lysine or Proline. Asparagine and Lysine have been chosen for their significant propensity to be in contact with water (polar Asparagine and positively charged Lysine), while Proline has been chosen because of the presence of a pyrrolidine five-membered ring in its side chain, being the only steric group of that kind in the whole neck linker chain. Effects of the simulation runs are displayed in FIG.4 and document a considerable difference in response to mechanical perturbations between the neck linker and the Ala-only chain.

The extensions between the 4th residue and its proximal contacts (the 2nd, 3rd, 5th and 6th residues in the chain) remain relatively unchanged throughout the stretching time, regardless of the type of the representative polymer chain. Stronger variations are observed for more distant pairs: for the neck linker structure, significant changes in extension profiles between pairs of residues emerge in time windows longer than 0 . ps. At the same time the Ala-only sequence shows pronounced variability in conformations by comparison to the much more rigid structure of the neck linker.

Positions of the 8th and 14th residues in the neck linker are taken by Proline, which is known to reduce the flexibility of the chain in the kink (cis) conformation.
In fact, in former in silico studies examining mechanical properties of the neck linker domain from sequence analysis, it has been argued that the cis-trans isomerization of a conserved proline residue switching between straight and kink forms accounts for variations in the resulting force-extension profiles and supports experimental observations of the influence of proline isomerization on the duration and effectiveness of biological processes dependent on protein folding.

In the case of the Ala-only chain, displacement patterns between the residue at the 4th position and subsequent Alanine residues differ significantly from those observed for the neck linker chain stretched at constant pulling speed: in the course of the pulling experiment inner distances do not deviate much from their averages, whereas distances to external residues (at the 8th, 11th, 14th and 15th positions) show pronounced extensibility. Altogether, while the final end-to-end distance of the Ala-only chain has been on average smaller than that of the neck linker (see FIG.3), its inner extension distances reach greater lengths, to the point where the distance to the 15th residue of the polyalanine chain has almost twice the final value of the largest inner distances of the neck linker. This indicates that stretching of the Ala-only chain is far more complex, possibly with emergent inner dynamic domains facilitating extensions.

In order to further explore the specificity of the neck linker's sequence and to determine how the presence of particular amino acid types affects that specificity, we have prepared 3 modified neck linker chains, where all Asparagine residues, all Lysine residues and the two Proline residues have been substituted with Alanine, respectively. The simulation setup has been identical to the one employed in the case of the unchanged neck linker and the Ala-only chain. The no-Asparagine and the no-Proline peptides seem to have easily achieved a local minimum of potential energy. On the other hand, the distribution of the no-Lysine chain end-to-end distances from the full simulation does not fit the Gaussian curve at all (FIG.5). The possible reasons for such behavior have been discussed above. Accordingly, in order to meet the requirement of beginning simulation runs with a mechanically equilibrated structure, as a starting conformation of the no-Lysine chain we have selected a structure from a time period in which the no-Lysine end-to-end distances have been distributed normally (FIG.6).

It can be concluded from the Jarzynski equality that the crossing point between work distributions of forward and reverse processes is equivalent to the free energy difference ∆G between the resulting states; this property has been used in various studies of mechanical stability and folding/unfolding dynamics of biopolymers. The approach allows us to determine ∆G between stretched and relaxed structures of the neck linker and Ala-only chains. It is difficult to ascertain a priori the possible range of work values in such an experiment, since it depends on several factors, such as the experiment's duration, the stretching force's value, the way that force is applied, etc. Nevertheless, we can infer from examples of similar experiments, like the one performed on DNA hairpins, that one should expect work values of the order of hundreds of $k_B T$ (where $k_B T$ is approximately 4.14 pN nm). As we can see in FIG.7, work distributions derived from ensembles of stretched and relaxed neck linker structures are well separated and exhibit larger average values than those typical for the Ala-only chains. The free energy difference between two equilibrated states ∆G can be identified as ∆G = . $k_B T$ for the neck linker and ∆G = $k_B T$ for the Ala-only chain. These rough estimates of ∆G, used in conjunction with the Arrhenius definition of the rate constant, seem to imply that the neck linker not only stretches more effectively than the arbitrarily chosen polyalanine peptide but also returns faster to the relaxed conformation. This conclusion is in line with its documented greater elasticity.

Distributions displayed in FIG.8 suggest that the elasticity of all 3 modified neck linker sequences suffered, compared to the original one (see FIG.7). Just like in the case of the polyalanine peptide, the value of work performed on them hardly crosses the point of 1000 $k_B T$, while that of the neck linker easily passes the 1500 $k_B T$ mark. Additionally, the removal of Asparagine from the chain has influenced its ability to spontaneously retract most severely: both that ability and its stretching are less effective than those of the polyalanine peptide. The substitution of Proline seems to result in behavior similar to that of the Ala-only chain. The no-Lysine chain's ability to retract seems to even surpass that of the original chain.
Taken together, these work distributions hint at Asparagine being crucial for retraction during the relaxation process. The Asparagine side chain ends in a single amide group, which may have a stabilizing effect through its ability to partake in hydrogen bonding. The Lysine side chain also carries a hydrogen-bonding group, an amine; it is, however, preceded by a chain of 4 methylene groups, a fact that probably keeps the amine group away from the peptide's backbone. Proline may be adding to the stabilising effects, as well as providing an extra push to the stretching ability with its conformational changes.

Table 1. Distribution means and ΔG between stretched and relaxed chains of the original neck linker, the Ala-only peptide and 3 modified (mutant-like) structures, with the stretching force value preset to 1300 kJ·mol⁻¹·nm⁻¹. Columns: peptide type, ⟨W⟩ stretch [k_B T], ⟨W⟩ relax [k_B T], ΔG [k_B T]. Rows: neck linker, Ala-only, no-Asparagine, no-Lysine, no-Proline.

The data gathered in the remaining in silico experiments (stretching force values: 130, 400, 700 and 1000 kJ·mol⁻¹·nm⁻¹) have been used together with the data for the stretching force value of 1300 kJ·mol⁻¹·nm⁻¹ showcased in the previous figures. FIG. 9 shows the relation between the fractional extension and ΔG, with all chains and force values included. The quadratic curves fitted to the data points confirm the harmonic, spring-like behaviour of the simulated chains and stay in line with the analysis of single polymer dynamics presented elsewhere. The neck linker curve is characterised by a quadratic coefficient greater than that of all other curves, save one. In fact, the neck linker curve and the polyalanine (Ala-only peptide) curve seem to define a range of coefficient values that goes from the least unique chain (the Ala-only peptide) to the one present in properly functioning biostructures (the neck linker). Predictably, the least unique chain is also the least effective spring, while the other one is the most effective. The curves depicting the characteristics of the no-Asparagine and no-Proline chains fall in between these two extremes. This suggests that chains decrease in effectiveness towards the polyalanine with the removal of Proline and Asparagine residues. The lack of Asparagine in the sequence seems to impact the elasticity more profoundly, with the no-Asparagine curve running very close to the polyalanine curve. This result agrees with our previous conclusions. Intriguingly, the no-Lysine curve lies to the far left of the other curves, including the neck linker curve. Its quadratic coefficient reflects this difference: it is the largest of all the fitted values, exceeding that of the neck linker. Judging from this, we may argue that the substitution of Lysine residues actually improves the spring-like performance of the chain. One could conclude from this particular result that the substitution of Lysine residues may improve the neck linker's performance within the context of the kinesin motor's movement along microtubules. However, it is also possible that there is an optimal range within which biologically viable springs operate and that such a drastic surge of elasticity takes the no-Lysine chain outside of this range. If so, it is probable that this optimal biological range coincides with the range defined by the neck linker and polyalanine chains, as shown in FIG. 9. Further inquiries may prove enlightening as to whether the neck linker chain corresponds to the optimal structure in the context of, first, the kinesin protein, then the whole group of molecular motors and, finally, all protein native structures.
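The harmonic reading of FIG. 9 can be illustrated with a one-parameter least-squares fit. The sketch below assumes the simplest spring-like model, ΔG = a·x², with x the fractional extension and no linear or constant term; the fits in FIG. 9 may have used a more general parabola. The function name `fit_quadratic_coefficient` and the data points are hypothetical (synthetic values generated from a known coefficient), chosen only to show that the fit recovers it.

```python
def fit_quadratic_coefficient(extensions, delta_g):
    """Least-squares estimate of a in the model delta_g = a * x**2."""
    if len(extensions) != len(delta_g) or not extensions:
        raise ValueError("need two non-empty sequences of equal length")
    # Minimising sum((g - a*x**2)**2) over a gives a = sum(x^2 * g) / sum(x^4).
    numerator = sum(x**2 * g for x, g in zip(extensions, delta_g))
    denominator = sum(x**4 for x in extensions)
    if denominator == 0:
        raise ValueError("all extensions are zero; the coefficient is undetermined")
    return numerator / denominator


# Synthetic example: five points (one per stretching force), generated from a = 3000 k_B T.
true_a = 3000.0
x_values = [0.2, 0.4, 0.6, 0.8, 1.0]           # fractional extensions (hypothetical)
g_values = [true_a * x**2 for x in x_values]   # noise-free delta_g in k_B T

a_fit = fit_quadratic_coefficient(x_values, g_values)
spring_constant = 2.0 * a_fit  # if one writes delta_g = (1/2) * k * x**2
print(f"a = {a_fit:.1f} k_B T, k = {spring_constant:.1f} k_B T")
```

With real data, comparing the fitted coefficient across chains is what distinguishes a stiffer effective spring from a softer one; whether the coefficient itself or twice its value is quoted as the spring constant depends on the convention chosen for the harmonic energy.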

Conclusions

Pulling experiments on single molecules provide a quantitative characterization of the unfolding and relaxation mechanisms of biomolecules. Although many nanotechniques, such as atomic force microscopy, laser tweezers or fluorescence resonance energy transfer, are available today and used in combined protocols, they may not reveal the molecular mechanisms underlying the modulation of a protein's elasticity, especially given the cost of manipulating local mutations of the investigated molecules. To overcome these difficulties, mechanical models of molecular dynamics can be used to gain insight into the consequences of local modifications of protein structure for its elasticity and response to external mechanical stress. Computational all-atom MD or coarse-grained MD simulations of single-molecule pulling experiments are frequently a complementary tool in the analysis of the entropic elasticity of polymers or protein molecules, and facilitate the development and design of single-molecule force spectroscopy.

Notably, entropic forces have rarely been discussed in the context of nanomechanical devices, although entropy-functional units might be more easily controlled by external parameters (like temperature or external fields) and their motion induced without changing the chemical structure of the components. Therefore, the investigation and design of such entropy-driven systems seem to be of particular interest in the field of engineering artificial motors for nanoscale transport.

Here we have investigated the force response in the structure of kinesin, focusing on the elastic properties of a spring connecting two separate domains (heads) of the motor protein. Due to the small molecular size of the motor, in most biological applications viscous forces dominate over inertial effects and overdamped dynamics is a valid approximation. In contrast, our numerical simulations follow a procedure in which the full set of Newton's equations of motion is solved to propagate in time the coordinates of all atoms of the structure under consideration.

The presented study adds to the notion that the neck linker regions possess mechanical properties not found in an arbitrary amino acid chain. The neck linker models modified with point mutations clearly exhibit different responses to the stretching force in comparison to the intact, original structure. At this point, it remains unclear how much those differences depend on the amino acid type or on the number of residues being substituted, and whether the placement within the sequence factors in. It is also impossible to assert whether the impacts of the amino acid types on neck linker properties are merely additive, or whether certain combinations of residues produce more nuanced interactions. Future studies, both experimental and theoretical, should shed light on the effects of perturbing motor proteins by introduced point mutations and on the adaptation of the resulting stepping mechanism to those changes.

Data Availability

The file containing the initial atom coordinates of the neck linker region, used at the beginning of the simulation setup, is available at . The GROMACS package employed to execute the simulations can be obtained from . The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.

References

Yildiz, A., Tomishige, M., Gennerich, A. & Vale, R. D. Intramolecular strain coordinates kinesin stepping behavior along microtubules. Cell, 1030–1041 (2008).
Kolomeisky, A. B. Motor proteins and molecular motors: how to operate machines at the nanoscale. J. Physics: Condens. Matter, 463101 (2013).
Kolomeisky, A. B. & Phillips III, H. Dynamic properties of motor proteins with two subunits. J. Physics: Condens. Matter, S3887 (2005).
Teimouri, H., Kolomeisky, A. B. & Mehrabiani, K. Theoretical analysis of dynamic processes for interacting molecular motors. J. Phys. A: Math. Theor., 065001 (2015).
Hyeon, C. & Onuchic, J. N. A structural perspective on the dynamics of kinesin motors. Biophys. J., 2749–2759 (2011).
Zhang, Z. & Thirumalai, D. Dissecting the kinematics of the kinesin step. Structure, 628–640 (2012).
Howard, J. Mechanics of Motor Proteins and the Cytoskeleton (Sinauer Associates, Sunderland, MA, 2001).
Hariharan, V. & Hancock, W. O. Insights into the mechanical properties of the kinesin neck linker domain from sequence analysis and molecular dynamics simulations. Cell. Mol. Bioeng., 177–189 (2009).
Bouchiat, C. et al. Estimating the persistence length of a worm-like chain molecule from force-extension measurements. Biophys. J., 409–413 (1999).
Kutys, M., Fricks, J. & Hancock, W. O. Monte Carlo analysis of neck linker extension in kinesin molecular motors. PLoS, e1000980 (2010).
Padinhateeri, R. & Menon, G. I. Stretching and bending fluctuations of short DNA molecules. Biophys. J., 463–471 (2013).
Lisowski, B., Świątek, M., Żabicki, M. & Gudowska-Nowak, E. Understanding operating principles and processivity of molecular motors. Acta Phys. Pol. B, 1073 (2012).
Kozielski, F. et al. The crystal structure of dimeric kinesin and implications for microtubule-dependent motility. Cell, 985–994 (1997).
Hinsen, K. Analysis of domain motions by approximate normal mode calculations. Proteins, 417–429 (1998).
Hinsen, K. & Field, M. J. Analysis of domain motions in large proteins. Proteins, 369–382 (1999).
Meirovitch, H. Recent developments in methodologies for calculating the entropy and free energy of biological systems by computer simulations. Curr. Opin. Struct. Biol., 181–186 (2007).
Chong, S.-H. & Ham, S. Configurational entropy of protein. Chem. Phys. Lett., 225–229 (2011).
Hill, T. An Introduction to Statistical Thermodynamics (Dover Publications, New York, NY, 1986).
Lawrence, C. J. et al. A standardized kinesin nomenclature. J. Cell Biol., 19–22 (2004).
Berman, H. M. et al. The Protein Data Bank. Nucleic Acids Res., 235–242 (2000).
Pronk, S. et al. GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics, 845–854 (2013).
Berendsen, H. J., van der Spoel, D. & van Drunen, R. GROMACS: a message-passing parallel molecular dynamics implementation. Comput. Phys. Commun., 43–56 (1995).
Lindahl, E., Hess, B. & van der Spoel, D. GROMACS 3.0: a package for molecular simulation and trajectory analysis. J. Mol. Model. annual, 306–317 (2001).
Jorgensen, W. L. & Tirado-Rives, J. The OPLS [optimized potentials for liquid simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. J. Am. Chem. Soc., 1657–1666 (1988).
Jorgensen, W. L., Maxwell, D. S. & Tirado-Rives, J. Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids. J. Am. Chem. Soc., 11225–11236 (1996).
Alemany, A. & Ritort, F. Fluctuation theorems in small systems: extending thermodynamics to the nanoscale. Eur. News, 27–30 (2010).
Hummer, G. & Szabo, A. Kinetics from nonequilibrium single-molecule pulling experiments. Biophys. J., 5–15 (2003).
Park, S., Khalili-Araghi, F., Tajkhorshid, E. & Schulten, K. Free energy calculation from steered molecular dynamics using Jarzynski's equality. J. Chem. Phys., 3559 (2003).
Jarzynski, C. Equalities and inequalities: Irreversibility and the second law of thermodynamics at the nanoscale. Ann. Rev. Cond. Matt. Phys., 329–351 (2011).
Block, S. Kinesin motor mechanics; binding, stepping, tracking, gating and limping. Biophys. J., 2986–2995 (2007).
Zhang, Z., Shi, Y. & Liu, H. Molecular dynamics simulations of peptides and proteins with amplified collective motions. Biophys. J., 3583–3593 (2003).
Genheden, S. & Ryde, U. Will molecular dynamics simulations of proteins ever reach equilibrium? Phys. Chem. Chem. Phys., 8662–8677 (2012).
Smith, L. J., Daura, X. & van Gunsteren, W. F. Assessing equilibration and convergence in biomolecular simulations. Proteins: Struct. Funct. Bioinforma., 487–496 (2002).
Lu, K. P., Finn, G., Lee, T. H. & Nicholson, L. K. Prolyl cis-trans isomerization as a molecular timer. Nat. Chem. Biol., 619–629 (2007).
Alemany, A., Ribezzi-Crivellari, M. & Ritort, F. From free energy measurements to thermodynamic inference in nonequilibrium small systems. New J. Phys., 075009 (2015).
Fox, R. F. Using nonequilibrium measurements to determine macromolecule free-energy differences. Proc. Natl. Acad. Sci., 12537–12538 (2003).
Latinwo, F. & Schroeder, C. M. Determining elasticity from single polymer dynamics. Soft Matter, 7907–7913 (2011).
Harris, N. C., Song, Y. & Kiang, C.-H. Experimental free energy surface reconstruction from single-molecule force microscopy using Jarzynski's equality. Phys. Rev. Lett., 068101 (2007).
Linke, H., Downton, M. & Zuckermann, M. Performance characteristics of Brownian motors. Chaos, 026111 (2005).
Bier, M. Processive motor protein as an overdamped Brownian stepper. Phys. Rev. Lett., 148104 (2003).

Acknowledgements

This project has been supported in part by a grant from the National Science Center, 2014/13/B/ST2/02014.

Author contributions statement

M. Ś. and E. G-N. conceived and designed the research. SMD simulations were conducted by M. Ś., followed by statistical analysis and interpretation of the results performed together with E. G-N. Both authors (M. Ś. and E. G-N.) wrote the paper.

Competing interests

The authors declare no competing interests.

Corresponding author

Correspondence to Michał Świątek.

Figure 1. Distribution of deformation energies in the kinesin structure (upper panel) and derived domains (lower panel). Clusters of vibrational energies have been plotted in similar colors.

Figure 2. Left column (from top to bottom): the neck linker's end-to-end distance distributions for the whole equilibrium simulation, for the 10 recorded steps preceding the last 10 and, in the final row, for the last 10 recorded steps. The right column shows the Ala-only peptide's end-to-end distance distributions arranged in an analogous way. Gaussian curves have been fitted to guide the eye, based on the mean and σ of a given empirical distribution derived from the SMD simulations.

Figure 3. End-to-end distances as functions of time, averaged over 10 MD runs, each of duration t (in ps). The left plot depicts the change in the neck linker's end-to-end distance. The right plot displays analogous findings for the Ala-only chain.

Figure 4. Distances (extensions) between residues as functions of time, averaged over 10 MD runs, each of duration t (in ps). The left plot depicts different rates of distance changes in the neck linker's chain between the 4th residue and selected residues, listed in the inset. The right plot displays analogous findings for the Ala-only chain.

Figure 5. The end-to-end distance distributions of the no-Asparagine chain (top), the no-Lysine chain (middle) and the no-Proline chain (bottom), all pertaining to equilibrium simulations. Gaussian curves have been fitted to the data, based on the derived means and σ of the respective distributions.

Figure 6. The end-to-end distance distribution of the last 10000 recorded steps taken from the no-Lysine peptide equilibrium simulation data. A Gaussian curve has been fitted to the data, based on the mean and σ of the distribution.

Figure 7. Reconstruction of probability density functions (PDFs) of work done on the end-to-end distance in pulling in silico experiments on the modeled peptides (the stretching force value set at 1300 kJ·mol⁻¹·nm⁻¹). The Gaussian curves are fitted to the derived histograms of the neck linker (top) and the Ala-only peptide (bottom). The crossing points of the curves are at approximately 1462 k_B T and 882 k_B T, respectively. The ensembles of stretched structures are colored red, while the ensembles of relaxed structures are colored green.

Figure 8. Reconstruction of probability density functions (PDFs) of work done on the end-to-end distance in pulling in silico experiments on the modified linker chains (the stretching force value set at 1300 kJ·mol⁻¹·nm⁻¹). The Gaussian curves are fitted to the derived histograms of the no-Asparagine peptide (top), the no-Lysine peptide (middle) and the no-Proline peptide (bottom). The ensembles of stretched structures are colored red, while the ensembles of relaxed structures are colored green.

Figure 9. ΔG expressed as a function of the fractional extension for the neck linker, the Ala-only peptide, the no-Asparagine chain (Asn substituted), the no-Lysine chain (Lys substituted) and the no-Proline chain (Pro substituted). Several different ΔG values have been acquired by repeating the in silico experiments with different values of the constant stretching force: 130, 400, 700, 1000 and 1300 kJ·mol⁻¹·nm⁻¹. Parabolic curves have been fitted to the plots. The quadratic coefficients of the fitted curves, largest for the Lys-substituted chain and smallest for the Ala-only chain, represent the effective stiffness (effective spring constant) of the stretched structure.