[PDF] Orientation before destruction. A multiscale molecular dynamics study

Abstract

The emergence of ultra-fast X-ray free-electron lasers opens the possibility of imaging single molecules in the gas phase at atomic resolution. The main disadvantage of this imaging technique is the unknown orientation of the sample exposed to the X-ray beam, making the three dimensional reconstruction not trivial. Induced orientation of molecules prior to X-ray exposure can be highly beneficial, as it significantly reduces the number of collected diffraction patterns whilst improving the quality of the reconstructed structure. We present here the possibility of protein orientation using a time-dependent external electric field. We used ab initio simulations on Trp-cage protein to provide a qualitative estimation of the field strength required to break protein bonds, with 45 V/nm as a breaking point value. Furthermore, we simulated, in a classical molecular dynamics approach, the orientation of ubiquitin protein by exposing it to different time-dependent electric fields. The protein structure was preserved for all samples at the moment orientation was achieved, which we denote `orientation before destruction'. Moreover, we find that the minimal field strength required to induce orientation within ten ns of electric field exposure, was of the order of 0.5 V/nm. Our results help explain the process of field orientation of proteins and can support the design of instruments for protein orientation.

Full PDF

OOrientation before destruction. A multiscale moleculardynamics study

Anna Sinelnikova , Thomas Mandl , Harald Agélli , Oscar Grånäs , Erik G. Marklund ,Carl Caleman , and Emiliano De Santis *1,31 Department of Physics and Astronomy, Uppsala University, Box 516, SE-751 20Uppsala, Sweden University of Applied Sciences Technikum Wien, Höchstädtplatz 6, A-1200 Wien,Austria Department of Chemistry – BMC, Uppsala University, Box 576, SE-751 23 Uppsala,Sweden Center for Free-Electron Laser Science, DESY, Notkestrasse 85, DE-22607 Hamburg,Germany

Abstract

The emergence of ultra-fast X-ray free-electron lasers opens the possibility of imaging single moleculesin the gas phase at atomic resolution. The main disadvantage of this imaging technique is the unknownorientation of the sample exposed to the X-ray beam, making the three dimensional reconstruction not triv-ial. Induced orientation of molecules prior to X-ray exposure can be highly beneﬁcial, as it signiﬁcantlyreduces the number of collected diffraction patterns whilst improving the quality of the reconstructed struc-ture. We present here the possibility of protein orientation using a time-dependent external electric ﬁeld.We used ab initio simulations on Trp-cage protein to provide a qualitative estimation of the ﬁeld strengthrequired to break protein bonds, with 45 V/nm as a breaking point value. Furthermore, we simulated, ina classical molecular dynamics approach, the orientation of ubiquitin protein by exposing it to differenttime-dependent electric ﬁelds. The protein structure was preserved for all samples at the moment orienta-tion was achieved, which we denote ‘orientation before destruction’. Moreover, we ﬁnd that the minimalﬁeld strength required to induce orientation within ten ns of electric ﬁeld exposure, was of the order of 0.5V/nm. Our results help explain the process of ﬁeld orientation of proteins and can support the design ofinstruments for protein orientation.

Proteins are essential components of all organisms, involved in a huge number of functions in living cells.It is generally accepted that a protein’s function and three-dimensional (3D) structure are closely connected.Determining proteins tertiary structures is thus of crucial importance for basic science as well as for medicine.To date, one of the most important and widely used techniques for protein structure determination is X-raycrystallography. It relies on the coherent diffraction of X-rays from a crystallized sample of appropriate size * [email protected] a r X i v : . [ c ond - m a t . m t r l - s c i ] F e b nd quality [1], which can not always be produced. Compounding this, proteins that are inherently dynamicmight not crystallize in the relevant conformations. The new era of X-ray Free-electron Lasers (XFELs),like LCLS [2] or European XFEL [3] facilities, able to produce a series of short and highly intense X-raypulses, enabled a new technique for structure determination of isolated molecules in the gas phase – SingleParticle Imaging (SPI) [4, 5]. SPI overcomes the need of a crystalline sample, but the extreme intensity ofthe X-ray pulse causes severe radiation damage to the molecule, leading to coulombic explosion[6]. Diffrac-tion patterns are thus obtained from separate exposures of identical molecules, each in their own randomorientation. Hence, a single diffraction pattern reﬂects a single view of the particle. The unknown orienta-tion of the particle renders its 3D reconstruction problematic, complicated and expensive – in terms of largesample consumption and beam-time required. Huge numbers of diffraction patterns are needed in order forthe reconstruction of the 3D structure of the molecule, which is accomplished using speciﬁc algorithms thathave been developed to deal with this task [7, 8, 9, 10, 11]. Data scarcity, sample heterogeneity, detectormasking and background noise effects can often lead to artefacts on the reconstructed structures [12, 13, 14].We recently presented a potential solution in Marklund et al. study [15], where classical molecular dynamics(MD) simulations were successfully used to show the possibility of orienting proteins by aligning their in-trinsic dipole moments against an external, static, electric ﬁeld (EF). We were able to deﬁne a range of the EFstrengths which allow protein orientation without inducing signiﬁcant structural loss, and also discovered thatlonger exposure times shifted the range toward lower ﬁeld strengths. Here, we present a natural continuationto the Marklund et al. study [15], where we use MD to explore time-dependent EFs. We apply a multi-scaleapproach comprising both ab initio and classical MD simulations. Using time-dependent density functionaltheory (TD-DFT) in the presence of an EF we identiﬁed at which ﬁeld strengths covalent bonds remain intactin a protein, using the mini-protein Trp-cage as a model system. We moreover performed gas-phase classicalMD simulations on ubiquitin to study the effect of the time-dependent EF on its orientation and the conse-quent structural evolution. Our results are of key for understanding the process of dipole orientation inducedby an external, time-dependent EF. As such they can serve to guide in the design of an apparatus able toimplement this phenomenon to manipulate proteins in SPI and other applications. Ab initio

MD simulations

A molecule becomes polarized when subjected to external EFs as the charge in the molecule rearrange in orderto screen the EF. Consequently, this generally changes the molecular dipole. Moreover, the rearrangementsof the electronic structure result in residual forces on charged sites in the protein. We used the ab initio

MDsoftware package Siesta 4.1 [16] to estimate the forces resulting from the interaction with the EF. We followedthe same procedure as published in our earlier work [17].

Ab initio calculations were carried out on a smallprotein, Trp-cage (PDBid: 1L2Y) [18], with a total charge of +2 e , as expected in the vacuum state [19].We ﬁrst thermalized the system using Born-Oppenhiemer MD, employing the Nosè thermostat to conserve atemperature of 300 K. For this step of the simulation, we used double-Z basis set with one polarization orbitalper atom. The basis functions where generated with a shallow conﬁnement potential of 0.001 Ry to allow forsufﬁcient diffuse functions. The exchange-correlation integration grid was determined by a 200 Ry cut-off,and was treated according to the Van der Waals function described by Vydrov and van Voorhis [20]. Thethermalization simulation was two ps long, with a time step of 0.5 fs. Next, we oriented the protein so thatits dipole moment was aligned against the external EF. Without allowing for any nuclei dynamics we thenexposed the protein to EFs ranging from 0.5 to 50.0 V/nm. For these simulations, we extended the basis setto encompass the charges in the electron distribution with respect to the ground state and used a triple-Z basisset with double polarization orbitals. For accurate partial charges the integration mesh cut-off was increasedto 500 Ry. 2 .2 Classical MD simulations A set of gas-phase classical MD simulations were performed to study the orientation of ubiquitin exposedto a time-dependent EF. Gromacs 4.5.7 [21] simulation package was used together with the OPLS-AA forceﬁeld [22] in accordance with our previous studies of protein in gas phase [19, 23, 17, 24, 15]. In order tosample sufﬁcient statistics for our analysis and to better mimic the heterogeneity of gas-phase experimentalsamples, we performed independent sets of simulations starting from different protein structures. For thispurpose, starting from coordinates based on crystallographic data of ubiquitin [25] (PDBid: 1UBQ), we ran a10-ns pre-simulation in solution in the NPT ensemble with Berendsen [26] weak coupling. Temperature wasset to 300 K with 0.1 ps time coupling and pressure was set to 1 bar with a time coupling of 20 ps. TIP4P [27]water model was used. From the equilibrated portion of this bulk simulation (2.5 ns - 10 ns) we extractedstructures at randomly picked times.The strategy we used then, depicted in Figures 1 and S1 of Supporting Information section, is the fol-lowing. After removing the solvent, we assigned the protonation states of ubiquitin in vacuum according topublished data [28, 29, 19], resulting in total charge of +7 e . The systems were relaxed in vacuum, and thetemperature adjusted over 100 ps simulation to 300 K using the Berendsen thermostat [26]. We then ran a10 ns long simulations in which we allowed the structures to equilibrate without thermostat. At the end ofthese pre-runs, the temperature of all the replicas was spanning a range of 305 K ± E ( t ) = E exp − ( t − t ) σ · H ( t − t ) + E · H ( t − t ) (1)where H ( t ) is the Heaviside function, t ∈ [ , , , ] ns and E ∈ [ . , . , . , . , . , . , . , . ] V/nm.The direction of the EF was set to be parallel to the x-axis of the simulation box. Duration for simulations inwhich t ∈ [ , , ] ns was set to 10 ns and for t = t value, eight EF strengths). t0 = 9 nst0 = 2 nst0 = 5 nst0 = 0 ns• Bulk simulations. • Systems preparations • 4 electric field implementations. •Simulations in presence of electric field. Figure 1:

Classical simulations - The schematic representation of performed classical MD simulations. Formore detailed description see Figure S1 of Supporting Information section.Long-range electrostatic forces in vacuum were captured using no cut-offs for non-bonded interactions.The equations of motion were propagated using the Leap-Frog integration scheme [31] with a 0.5 fs timestep. To reproduce perfect vacuum, neither pressure coupling nor periodic boundary conditions were applied.3

Results

Ab initio

MD simulations: Validation of classical MD approach

Ab initio calculations were performed on the Trp-cage protein. Given the extreme computational effort neededfor quantum calculations, we limited the extend of those simulations to comprise only the electronic response,without nuclei dynamics. The main goal of these simulations is to have a quantitative estimation of the orderof magnitude of the EF strength needed to break interatomic bonds, as well as validating the use of ﬁxedcharges in the classical MD simulations.In Table 1, some of the most representative equilibrium bond forces within proteins are listed. These valueswere computed by dividing the tabulated bond energies by the tabulated equilibrium distances. We qualita-tively assume that a bond among two atoms is broken when their distance is increased of 25% with respectto their equilibrium distance. We thus deﬁne a Bond Dissociation Force (

BDF thld ) equal to 1 eV/ Å as areasonable lower estimation of the force sufﬁcient to break a covalent bond. This value corresponds to ≈

80% of the force acting on two sulfur atoms in a disulﬁde bond at their equilibrium distance (Table 1).We assume that the force acting on any atom i in our simulations can be expressed as a sum of three terms,given as F i TOT ( E ) = F ( d CM ) + F ( d i ) + F i field ( E ) . (2)Here, F ( d CM ) is the force depending on the motion of the center of mass of the protein, F ( d i ) is the forcedue to the atomic vibrational state and F field ( E ) is the contribution to the force given by the interaction withthe external EF. For each simulations at the different EF strength, we calculate the average relative force (cid:104)| F ( E j ) |(cid:105) as (cid:104)| F ( E j ) |(cid:105) = N N ∑ i | F i ( E j ) | − | F i ( E = ) | = N N ∑ i | F i field ( E ) | . (3)With this quantity we isolate the effects induced only by presence of the external EF from the total force ata given E j . The contributions due to the center of mass motion and the particular vibrational state in whichthe atoms are frozen in are automatically removed. By comparing (cid:104)| F ( E j ) |(cid:105) to BDF thld it is possible to inferan estimation of the ﬁeld strength necessary to break a covalent bond. In Figure 2 we display (cid:104)| F ( E j ) |(cid:105) asfunction of the simulated EF. The plot shows that the EF strength needed to break atomic bonds is of the orderof 45 V/nm. Moreover, one can notice that for all the ﬁeld strengths used in the classical MD (in the greeninset of the graph), the value of the average force (cid:104)| F ( E j ) |(cid:105) is one order of magnitude lower than BDF thld .This ﬁnding proves that the integrity of the protein topology is maintained in that ﬁeld range and the accuracyof classical MD is preserved.In Figure 3 the polarization induced by 3.0 V/nm ﬁeld on the electron density is shown. At this ﬁeld-strength,the majority of the polarization response is local to the atoms. Hence, not much charge is transferred acrossthe molecule and the maximum charge increase and decrease in relation to the ground-state of a speciﬁc atomis relatively small. This validates the approach used in our classical MD simulations, where the ionic chargeis ﬁxed throughout the simulation. In Table 2, we list the maximum increase and decrease in integrated chargeon the atoms where the difference is largest. The corresponding position of the respective atoms are shown inFigure 3. The fact that these atoms are not residing at the edge of the protein also indicates that importanceof the local polarization above de-localized charge transfer across the molecule, further corroborating the useof classical MD in this context. 4able 1: Covalent and hydrogen bond forces at the equilibrium of particular relevance in proteins. Force val-ues are computed by dividing the tabulated energies by the tabulated equilibrium bond distances. Hydrogenbonds are indicated by · · · . Covalent bonds [32]

Hydrogen bonds [33]Type Force [eV/Å] Type Force [eV/Å]C − N ≈ · · · O ≈ − C ≈ · · · N ≈ − S ≈ · · · O ≈ − O ≈ · · · O ≈ − S ≈ e . The integrated difference is deﬁned as ‘number of electrons on site X at ﬁeld =0’ -‘number of electrons on site X with ﬁeld=3 V/nm’. The corresponding atoms are highlighted in Figure 3 byenlarging their radii of a factor two, for the maximum decrease, and a factor three, for the maximum increase.Atom type Max charge increase Max charge decreaseHydrogen 0.016 -0.018Carbon 0.022 -0.013Nitrogen 0.012 -0.017Oxygen 0.010 -0.014 | F ( E ) | [ e V / Å ] BDF thld

Figure 2:

Ab initio simulations - Mean value of the force exerted on Tpr-cage atoms, deﬁned in equation 3, asa function of the electric ﬁeld strength. The region of interest for the classical MD simulations is presentedin the inset. 5igure 3:

Ab initio simulations - The effect of the electric ﬁeld on the electron distribution in the Trp-cageprotein. Displayed is the difference in electron density between a protein un exposed to the ﬁeld and a proteinexperiencing a 3 V/nm ﬁeld. Blue electron density in the ﬁgure symbolizes a loss of electron density, andred an increase in electron density. Hydrogen atoms are white, Oxygen red, Nitrogen blue and Carbon black.Atoms whose correspond to a maximum decrease of the integrate electron number difference are depictedwith atomic radii increased of a factor two; atoms whose correspond to a maximum increase of the integrateelectron number difference are depicted with atomic radii enlarged of a factor three. The isosurface level isset to 0.00226 electrons/Å . The black arrow denotes the direction of the external electric ﬁeld. The way the protein responds to the time-dependent EF has been assessed by studying three different observ-ables. First of all, to assess the extent of protein orientation we deﬁne the degree of orientation as Θ = − cos ( θ ) , (4)where θ is the angle between the EF and the total dipole moment of the molecule. Thus, a fully aligned proteinexpresses a value of Θ =

0, whereas Θ = Θ = ( θ ) becomes0. In Figure 4, we show how Θ is depending on EF strength E and the ramping time t . Values in Figure 4show the average degree of orientation over the last two nanoseconds of the ten independent simulations.As one could expect, the stronger the ﬁeld is, the more the molecule becomes orientated. In Figure S2 (inSupporting Information section) the projection of the dipole moment of the plane perpendicular to the EFvector is depicted. Here, E strengths equal to 0.1 V/nm, for all four ﬁeld implementations, are not strongenough to orient the protein in the simulation time we explored and the projection of the dipole moment isequally distributed in the plane. For E equal to 0.2 V/nm, although there is not perfect alignment of the6 .0 0.5 1.0 1.5 2.0 2.5 3.0 E (V/nm)0.000.050.100.150.200.250.300.35 D e g r ee o f o r i e n t a t i o n t = 0 ns t = 2 ns t = 5 ns t = 9 ns Figure 4:

Classical simulations - An averaged degree of orientation as a function of the maximum ﬁeldstrength E for different ramp-up times t . The different colors refer to the different ﬁeld implementations(black - t = 0 ns, cyan - t = 2 ns, red - t = 5 ns and orange - t = 9 ns ).EF and the dipole of the molecule, the dipole moment distribution in not completely random. In particular,quite interestingly, the EF implementation with a ramping up time equal to 2 ns results in a better orientationrespect to the other three EF implementations; although not very focused, one can notice an highly populatedregion corresponding to a spread of ±

15 degrees with respect to perfect alignment. For ﬁelds values equalto 0.5 V/nm, the projection of the dipole moment in plane perpendicular to the ﬁeld is focused in the ±

15 degrees region, expressing a good alignment between the ﬁeld vector and the ubiquitin dipole moment.For ﬁeld strengths greater than or equal to 0.5 V/nm, the differences in the degree of orientation among theramping times are not resolved within the errors (Figures 4, S3 and S4). At the end of the simulations, theprotein is orientated in a similar way, regardless of the ﬁeld implementation.The second observable we monitored in our simulations was the speed of orientation. Although the degreeof orientation oscillates signiﬁcantly, we can observe that there is a clear exponential decay. In Figure 5 anexample of this trend for E =0.5 V/nm, t =2 ns is presented. Here, the black line represents the evolutionof Θ over time ( Θ ( t ) : = (cid:104) Θ ( t ) (cid:105) replicas ). The orange line represents a ﬁt on Θ ( t ) , with f ( t ) = exp ( − kt ) . Wedeﬁne τ as the time required for the protein to lose 90% of its initial orientation and hence to arrange parallelto the EF vector: τ = ln ( ) k (5)In Figure 6 we display the correlation between the orientation time τ for all the simulated ﬁeld imple-mentations t and E . Evidently, a clear dependence between the rate of orientation and the ramping time canbe observed: the longer the ramping time is, the more time is needed to orient the structure. Here, τ spans arange of values from 7.6 ns, in the case of the EF equal to 0.2 V/nm and t equal to 9 ns, to 3 ps, for the EFequal to 3.0 V/nm and t equal to 0 ns.The question naturally arises: is there any particular ﬁeld strength able to orient the protein? In order toanswer this query we plot, in Figure 7, E ( τ )( E ; t ) , namely the value of the EF at the time the protein is7 D e g r ee o f o r i e n t a t i o n simulationsfit: f(t) = exp((-1.44 ± 0.01)t) Figure 5:

Classical simulations - The time evolution of degree of orientation averaged over ten independentruns for the parameters E = . t = f ( t ) = exp ( − kt ) . E (V/nm)02468 ( n s ) t =0 ns t =2 ns t =5 ns t =9 ns Figure 6:

Classical simulations - The dependence of time τ (when the protein lost 90% of initial orientation)on E for different ramping time t . The colors scheme is the same one described in Figure 4.oriented as a function of the ﬁeld implementations for the different simulations. Noteworthy, except for thetrivial case of the constant ﬁeld ( t =0 ns), the ﬁeld strength required to align the protein seems to be always8 .5 1.0 1.5 2.0 2.5 3.0 E (V/nm)0.00.51.01.52.02.53.0 E () ( V / n m ) t = 0 ns t = 2 ns t = 5 ns t = 9 ns Figure 7:

Classical simulations - EF strength at the time the protein is oriented as a function of the EFimplementation. The colors scheme is the same described in Figure 4. E (V/nm)0.100.150.200.250.30 R M S D ( n m ) t = 0 ns t = 2 ns t = 5 ns t = 9 ns Figure 8:

Classical simulations - The dependence of RMSD at time t = τ on E for different ramping time t . The colors scheme is the same one described in Figure 4.9f the order of 0.5 V/nm, independently of the ﬁnal ﬁeld strength of the simulation and the implementations.In other words, an EF strength of 0.5 V/nm is a necessary and sufﬁcient condition to have a good alignmentof ubiquitin.The last important question we assessed regards the structure stability. It is of crucial importance to beaware of possible structural changes induced by the presence of an external EF. Root Mean Square Deviation(RMSD) computed on C α atoms gives a measure of how much of the original structure is preserved. It isreasonable to deﬁne a structure to be preserved if the RMSD value is below 0.5 nm, while for RMSD valueshigher than this threshold, we can assume that the proteins initial structure is lost.In Figure 8, the results of the RMSD( τ ) calculations (namely the RMSD value at time t = τ ) as a function of t and E are displayed. We can observe that all ramping up times provide good conservation of the proteinstructures at the time it is orientated. This assumption is valid for all the values of the EF we tested. We cantherefore conclude that, in all cases we simulated, the orientation happens before the structure is damaged,and as such "orientation before destruction". This concept is visualized in Figure 9. 𝛕𝛍 𝛍 E 𝛍 t Figure 9:

Classical simulations - Time evolution of the degree of orientation, RMSD and electric ﬁeld func-tion for t =2 ns and E = 2.5 V/nm simulations. Only the ﬁrst 5 ns are shown. Each data point shows the resultof the averaged values of the 10 replicas. Error bars are not shown for simplicity. The green background rep-resents the part of the simulation in which the ubiquitin structure is preserved (RMSD ≤ . E represents the direction of the external EF, with µ arrows we represent the direction ofthe protein dipole. In the insets, cartoon representations of the protein structure at the corresponding time arepresented.Moreover, it is interesting to measure how the protein structure evolves once it is fully immersed inan hypothetical experimental setup exerting an EF. In order to study this aspect, we evaluated the RMSDvalues of ubiquitin after 5 ns when the EF reached its maximum value. Namely we considered 5 ns from the RMSD ( t − t ) = M ∑ i = Nm i | (cid:126) r i ( t ) − (cid:126) r i ( t ) | where M = ∑ Ni = m i , (cid:126) r i ( t ) is the position of atom i at time t and N is the total numberof atoms of the system. .0 0.5 1.0 1.5 2.0 2.5 3.0 E (V/nm)0.00.51.01.52.02.53.0 R M S D ( n m ) t = 0 ns t = 2 ns t = 5 ns t = 9 ns Figure 10:

Classical simulations - Dependence of RMSD after 5ns of EF reached its maximum value afunction of E and t . The colors scheme is the same one described in Figure 4.beginning of the simulations for the case of t =0 ns, 7 ns for t =2 ns, 10 ns for t =5 ns, and lastly 14 ns forthe case of t =9 ns. This describes accurately the possibility to acquire the image of the protein using theX-ray beam after the molecule travelled in the experimental device. The results of this analysis are presentedin Figure 10, where for each data point an average over the 10 replicas was computed. One can observe forEF lower than or equal to 1.5 V/nm, the protein structure is maintained for all the EF implementations. Inparticular for an EF lower than or equal to 0.8 V/nm, there are almost no differences among the four differentEF implementations. On the contrary, EF ﬁeld values greater than or equal to 2.5 V/nm express RMSD valuesabove the 0.5 nm threshold for all the ﬁeld implementations. Hence, it is possible to draw the conclusion thatfor these EF strengths the protein structure is lost. In the general framework of bio-imaging and in particular for the recent developed SPI technique, signiﬁcanteffort has been made to generate advanced algorithms aimed to solve the orientation problem for proteintertiary structure determination. All these approaches rely on the reconstruction of a 3D diffraction volumeby assembling huge sets of diffraction patterns obtained from noisy, randomly oriented copies of a particleinjected into an X-ray beam [34]. While such algorithms are very powerful, they cannot always converge tothe correct solution, due to uncertainties and ambiguities in the data. The possibility to reduce some of thedegrees of freedom of the problem by knowing the orientation during beam exposure is therefore of greatinterest.One possible approach was proposed by Östlin et al. [35], where it was theoretically proven the oppor-tunity of using an angular ion map to record information regarding the protein spatial orientation traceablefrom the sample Coulomb explosion. Here, instead of acquiring sparse orientation information from theexperiment, we develop the approach of imposing the orientation onto the sample molecules, exploring the11ossibility to use an external, time-dependent EF to a prori orient a molecule in the gas phase before itsexposure to a X-ray beam.First, we provided a qualitative indication on the EF strength required to alter considerably the chemicalstructures of the proteins. Given the impossibility of classical MD to study the electronic response to theexternal EF, we tackled this point by using DFT simulations to examine the Trp-cage mini-protein. Westudied the average atomic forces resulting from the interaction with the EF and we qualitatively estimatedthat the ﬁeld strength needed to break interatomic bonds is of the order of 45 V/nm. This limit is considerablyhigher than what has been used in this context before [15, 17]. Thus, we proved that the EF range used inthe next part of the present study represents a safe windows of values where the interatomic forces acting onthe protein atoms during EF exposure are lower than the force needed to break bonded interactions. In otherwords, classical MD approximation is still valid in our chosen EF range .We used classical MD simulations in vacuum to monitor the orientation of a protein as a response to atime-dependent EF. This study can be considered as a natural extension of our previous work [15]. In fact,the EF implemented here can be considered more realistic with respect to the EF studied before, as it mimicsthe behavior of a molecule injected in an apparatus exerting an EF in greater detail. The injected moleculeexperiences a gradually increasing ﬁeld strength, in part due to the experimental geometry, but also due tothe non-square waveform of an applied electric ﬁeld pulse. Ubiquitin was exposed to an external EF rangingfrom 0.1 V/nm to 3 V/nm. To reproduce different time-dependecies of the experienced EF, we tested threeramping up times ( t = 2 ns, t = 5 ns and t = 9 ns) of the EF, namely the time required for the ﬁeld to reachits maximum value. As a reference, we also studied the case of a static EF, referred to as t = 0 ns, thatreplicates the same EF implementation used in Marklund et al. [15]. In order to collect enough statisticsand to reproduce the intrinsic heterogeneity of SPI samples, we performed for each choice of t and E tenindependent simulations that differ slightly in the initial protein structures. We found that for EF valuesgreater and equal to 0.5 V/nm, regardless of the ﬁeld implementation, desirable orientation of the proteinwas achieved. An EF of a strength equal to 0.1 V/nm results to be too weak to induce any orientation ofubiquitin (in the time scale investigated herein). These ﬁndings are consistent with our previous work [15].Quite interestingly, the orientation is not signiﬁcantly dependent of the EF implementation. Given the factthat the initial structures of our replicas were oriented in the same way, we were in the position to study thevelocity of orientation as a function of the EF implementation and ﬁeld strength. The velocity of orientationwas computed by performing an exponential ﬁt on the degree of orientation versus time. We then deﬁned τ to be the time required for the molecule to lose 90% of its initial dipole orientation and to align to the EF.Studying τ as a function of { t , E } , we noticed, as expected, a linear dependence of the orientation time with t and E . The higher the ﬁeld, the lower the time required to align the protein to the external ﬁeld is. Thelower t , the higher τ . We found that for all time-dependent implementations of the EF we simulated, the EFstrength required to orient the protein is of the order of 0.5 V/nm.A careful monitoring of the protein structure progression was performed. We studied the evolution ofC- α RMSD once the protein was exposed to the EF. We demonstrated that for all four EF implementationswe simulated, the protein preserves its structure at the time it is oriented, τ . For values of the ﬁeld less then1.5 V/nm, there is not relevant unfolding of the protein even for longer time scales. The unfolding of theprotein happens after the orientation - ‘orientation before destruction’. Conﬂicts of Interest

The authors declare that they have no conﬂicts of interest with the contents of this article. It is noteworthy to recall that the possibility of breaking bonds is not modelled in classical MD in its standard parameterization,since the bonded interactions are given as input for all simulations. uthor Contributions C.C., E.G.M. and E.D.S. devised the project. C.C. with help from O.G. performed ab initio simulations.H.A., T.M. and A.S. carried out classical simulations. A.S. and E.D.S. analyzed the data. E.D.S. with thehelp from C.C., O.G., E.G.M. and A.S. wrote the article.

Acknowledgments

This work is a part of the MS SPIDOC project funded by the European Union’s Horizon 2020 FET-OPENresearch and innovation program (Grant agreement No. 801406) for support of ongoing photo activationIM-MS activities. A.S. acknowledges funding from the Knut and Alice Wallenberg foundation through theWallenberg Academy Fellow grant of J. Nilsson. C.C. acknowledges the Swedish Research Council (2018-00740) and the Helmholtz Association through the Center for Free-Electron Laser Science at DESY. E.D.S.gratefully acknowledges a postdoctoral fellowship from Carl Trygger foundation.We acknowledge the use of Uppsala Multidisciplinary Center for Advanced Computational Science (Uppmaxprojects number SNIC 2020/15-67, SNIC 2019/8-314 and SNIC 2019/30-47) provided by SNIC.We thank Maxim Brodmerkel for comments that greatly improved the manuscript.

Data availability

The data that support the ﬁndings of this study are available from the corresponding author upon reasonablerequest.

References [1] W. H. Bragg and W. L. Bragg, “The reﬂection of X-rays by crystals,”

Proceedings of the Royal Societyof London. Series A, Containing Papers of a Mathematical and Physical Character , vol. 88, no. 605,pp. 428–438, 1913.[2] P. Emma, R. Akre, J. Arthur, R. Bionta, C. Bostedt, J. Bozek, A. Brachmann, P. Bucksbaum, R. Coffee,F.-J. Decker, et al. , “First lasing and operation of an ångstrom-wavelength free-electron laser,” naturephotonics , vol. 4, no. 9, p. 641, 2010.[3] E. A. Schneidmiller and M. V. Yurkov, “Photon beam properties at the european xfel (december 2010revision),” tech. rep., Deutsches Elektronen-Synchrotron (DESY), 2011.[4] M. J. Bogan, W. H. Benner, S. Boutet, U. Rohner, M. Frank, A. Barty, M. M. Seibert, F. Maia, S. March-esini, S. Bajt, et al. , “Single particle X-ray diffractive imaging,”

Nano letters , vol. 8, no. 1, pp. 310–316,2008.[5] R. Neutze, R. Wouts, D. Van der Spoel, E. Weckert, and J. Hajdu, “Potential for biomolecular imagingwith femtosecond X-ray pulses,”

Nature , vol. 406, no. 6797, pp. 752–757, 2000.[6] H. N. Chapman, C. Caleman, and N. Timneanu, “Diffraction before destruction,”

Philosophical Trans-actions of the Royal Society B: Biological Sciences , vol. 369, no. 1647, p. 20130313, 2014.[7] J. Flamant, N. Le Bihan, A. V. Martin, and J. H. Manton, “Expansion-maximization-compression al-gorithm with spherical harmonics for single particle imaging with X-ray lasers,”

Physical Review E ,vol. 93, no. 5, p. 053302, 2016. 138] K. Ayyer, T.-Y. Lan, V. Elser, and N. D. Loh, “Dragonﬂy: an implementation of the expand–maximize–compress algorithm for single-particle imaging,”

Journal of applied crystallography , vol. 49, no. 4,pp. 1320–1335, 2016.[9] P. Schwander, D. Giannakis, C. H. Yoon, and A. Ourmazd, “The symmetries of image formation byscattering. ii. applications,”

Optics express , vol. 20, no. 12, pp. 12827–12849, 2012.[10] C. G. Mayne, J. Saam, K. Schulten, E. Tajkhorshid, and J. C. Gumbart, “Rapid parameterization ofsmall molecules using the force ﬁeld toolkit,”

Journal of computational chemistry , vol. 34, no. 32,pp. 2757–2770, 2013.[11] L. Zhou, T.-Y. Zhang, Z.-C. Liu, P. Liu, and Y.-H. Dong, “A multiple-common-lines method to de-termine the orientation of snapshot diffraction patterns from single particles,”

Acta CrystallographicaSection A: Foundations and Advances , vol. 70, no. 4, pp. 364–372, 2014.[12] I. V. Lundholm, J. A. Sellberg, T. Ekeberg, M. F. Hantke, K. Okamoto, G. van der Schot, J. Andreasson,A. Barty, J. Bielecki, P. Bruza, et al. , “Considerations for three-dimensional image reconstruction fromexperimental data in coherent diffractive imaging,”

IUCrJ , vol. 5, no. 5, pp. 531–541, 2018.[13] A. Hosseinizadeh, A. Dashti, P. Schwander, R. Fung, and A. Ourmazd, “Single-particle structure deter-mination by X-ray free-electron lasers: Possibilities and challenges,”

Structural dynamics , vol. 2, no. 4,p. 041601, 2015.[14] I. Poudyal, M. Schmidt, and P. Schwander, “Single-particle imaging by X-ray free-electron lasers—howmany snapshots are needed?,”

Structural Dynamics , vol. 7, no. 2, p. 024102, 2020.[15] E. G. Marklund, T. Ekeberg, M. Moog, J. L. P. Benesch, and C. Caleman, “Controlling protein orienta-tion in vacuum using electric ﬁelds,”

J. Phys. Chem. Lett. , vol. 8, pp. 4540–4544, 2017.[16] J. M. Soler, E. Artacho, J. D. Gale, A. García, J. Junquera, P. Ordejón, and D. Sánchez-Portal, “TheSIESTA method forab initioorder-nmaterials simulation,”

J. Phys. Cond. Mat. , vol. 14, pp. 2745–2779,mar 2002.[17] A. Sinelnikova, T. Mandl, C. Östlin, O. Granas, M. N. Brodmerkel, E. Marklund, and C. Caleman,“Reproducibility in the unfolding process of protein induced by an external electric ﬁeld,”

ChemicalScience , 2020.[18] J. W. Neidigh, R. M. Fesinmeyer, and N. H. Andersen, “Designing a 20-residue protein,”

Nature Struc-tural Biology , vol. 9, pp. 425–430, Jun 2002.[19] A. Patriksson, E. Marklund, and D. van der Spoel, “Protein structures under electrospray conditions,”

Biochemistry , vol. 46, no. 4, pp. 933–945, 2007.[20] O. A. Vydrov and T. Van Voorhis, “Nonlocal van der waals density functional: The simpler the better,”

The Journal of Chemical Physics , vol. 133, no. 24, p. 244103, 2010.[21] B. Hess, C. Kutzner, D. Van Der Spoel, and E. Lindahl, “Gromacs 4: algorithms for highly efﬁcient,load-balanced, and scalable molecular simulation,”

Journal of chemical theory and computation , vol. 4,no. 3, pp. 435–447, 2008.[22] G. A. Kaminski, R. A. Friesner, J. Tirado-Rives, and W. L. Jorgensen, “Evaluation and reparametrizationof the opls-aa force ﬁeld for proteins via comparison with accurate quantum chemical calculations onpeptides,”

The Journal of Physical Chemistry B , vol. 105, no. 28, pp. 6474–6487, 2001.1423] E. G. Marklund, D. S. Larsson, D. van der Spoel, A. Patriksson, and C. Caleman, “Structural stabilityof electrosprayed proteins: temperature and hydration effects,”

Physical Chemistry Chemical Physics ,vol. 11, no. 36, pp. 8069–8078, 2009.[24] T. Mandl, C. Östlin, I. E. Dawod, M. N. Brodmerkel, E. G. Marklund, A. V. Martin, N. Timneanu,and C. Caleman, “Structural heterogeneity in single particle imaging using x-ray lasers,”

The journal ofphysical chemistry letters , vol. 11, no. 15, pp. 6077–6083, 2020.[25] S. Vijay-Kumar, C. E. Bugg, and W. J. Cook, “Structure of ubiquitin reﬁned at 1.8 Å resolution,”

Journalof molecular biology , vol. 194, no. 3, pp. 531–544, 1987.[26] H. J. Berendsen, J. v. Postma, W. F. van Gunsteren, A. DiNola, and J. R. Haak, “Molecular dynamicswith coupling to an external bath,”

The Journal of chemical physics , vol. 81, no. 8, pp. 3684–3690,1984.[27] W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey, and M. L. Klein, “Comparison ofsimple potential functions for simulating liquid water,”

The Journal of chemical physics , vol. 79, no. 2,pp. 926–935, 1983.[28] K. Breuker, H. Oh, D. M. Horn, B. A. Cerda, and F. W. McLafferty, “Detailed unfolding and fold-ing of gaseous ubiquitin ions characterized by electron capture dissociation,”

Journal of the AmericanChemical Society , vol. 124, no. 22, pp. 6407–6420, 2002.[29] H. Oh, K. Breuker, S. K. Sze, Y. Ge, B. K. Carpenter, and F. W. McLafferty, “Secondary and tertiarystructures of gaseous protein ions characterized by electron capture dissociation mass spectrometryand photofragment spectroscopy,”

Proceedings of the National Academy of Sciences , vol. 99, no. 25,pp. 15863–15868, 2002.[30] C. Caleman and D. Van Der Spoel, “Picosecond melting of ice by an infrared laser pulse: A simulationstudy,”

Angewandte Chemie , vol. 120, no. 8, pp. 1439–1442, 2008.[31] W. F. Van Gunsteren and H. J. Berendsen, “A leap-frog algorithm for stochastic dynamics,”

MolecularSimulation , vol. 1, no. 3, pp. 173–185, 1988.[32] J. E. Huheey, E. A. Keiter, R. L. Keiter, and O. K. Medhi,

Inorganic chemistry: principles of structureand reactivity . Pearson Education India, 2006.[33] D. H. McDaniel, B. E. Douglas, and J. Alexander,

Concepts and Models of Inorganic Chemistry . JohnWiley, 1983.[34] J. Bielecki, F. R. Maia, and A. P. Mancuso, “Perspectives on single particle imaging with x rays at theadvent of high repetition rate X-ray free electron laser sources,”

Structural Dynamics , vol. 7, no. 4,p. 040901, 2020.[35] C. Östlin, N. Tîmneanu, H. O. Jönsson, T. Ekeberg, A. V. Martin, and C. Caleman, “Reproducibilityof single protein explosions induced by X-ray lasers,”

Physical Chemistry Chemical Physics , vol. 20,no. 18, pp. 12381–12389, 2018. 15 upporting information ab initio

MD simulations

Table S1: (cid:104)| F ( E ) |(cid:105) as a function of the electric ﬁeld strength. Values are given in eV/Å units. Electric ﬁeld (cid:104)| F ( E ) |(cid:105) Classical MD simulations

E0 = 3.0 V/nmE0 = 1.5 V/nmE0 = 1.0 V/nmE0 = 0.1 V/nmE0 = 0.2 V/nmE0 = 0.5 V/nmE0 = 0.8 V/nmE0 = 2.5 V/nmt0 = 9 nst0 = 2 nst0 = 5 nst0 = 0 ns• Bulk simulations. • 10 structures extraction; • Water removal; • Charge assignment. • Energy minimisation; • 10 ps at 300K; • 10 ns. • 10 ps at 300K; • Dipole alignment. • 4 electric field implementations. •Simulations in presence of electric field.

Figure S1: The schematic representation of performed classical MD simulations.16igure S2: Scatter plot of the projection of the dipole moment on the yz plane for the simulations with electricﬁeld strength of 0.1 V/nm (ﬁrst row), 0.2 V/nm (middle row) and 0.5 V/nm (lower row) . Each point of everypanel refers to a frame taken in the last 2 ns of simulations for all the ten replicas. Each panel contains8000 points (the frames are taken every 2.5 ps of simulations). The circle drawn in the center of the panelsrepresent an uncertainty of ±

15 degrees respect to a perfect alignment.17igure S3: Scatter plot of the projection of the dipole moment on the yz plane for the simulations with electricﬁeld strength of 0.8 V/nm (ﬁrst row), 1.0 V/nm (middle row) and 1.5 V/nm (lower row) . Each point of everypanel refers to a frame taken in the last 2 ns of simulations for all the ten replicas. Each panel contains8000 points (the frames are taken every 2.5 ps of simulations). The circle drawn in the center of the panelsrepresent an uncertainty of ±

15 degrees respect to a perfect alignment.18igure S4: Scatter plot of the projection of the dipole moment on the yz plane for the simulations with electricﬁeld strength of 2.5 V/nm (ﬁrst row) and 3.0 V/nm (lower row) . Each point of every panel refers to a frametaken in the last 2 ns of simulations for all the ten replicas. Each panel contains 8000 points (the frames aretaken every 2.5 ps of simulations). The circle drawn in the center of the panels represent an uncertainty of ±±