Global Small-Angle Scattering Data Analysis of Inverted Hexagonal Phases

Abstract

We have developed a global analysis model for randomly oriented, fully hydrated inverted hexagonal (H$_\mathrm{II}$) phases formed by many amphiphiles in aqueous solution, including membrane lipids. The model is based on a structure factor for hexagonally packed rods and a compositional model for the scattering length density (SLD), which also enables the analysis of positionally weakly correlated H$_\mathrm{II}$ phases. For optimization of the adjustable parameters we used Bayesian probability theory, which reveals parameter correlations in much more detail than standard analysis techniques and thereby enables a realistic error analysis. The model was applied to different phosphatidylethanolamines, including previously unreported H$_\mathrm{II}$ data for diC14:0 and diC16:1 phosphatidylethanolamine. The extracted structural features include intrinsic lipid curvature, hydrocarbon chain length and area per lipid at the position of the neutral plane.


Moritz P. K. Frewein (a, b), Michael Rumetshofer (c), Georg Pabst (a, b)

(a) University of Graz, Institute of Molecular Biosciences, Biophysics Division, NAWI Graz, 8010 Graz, Austria
(b) BioTechMed Graz, 8010 Graz, Austria
(c) Graz University of Technology, Institute of Theoretical Physics and Computational Physics, NAWI Graz, 8010 Graz, Austria

[email protected]

Elastic small-angle scattering (SAS) techniques are unrivaled for providing detailed structural insight into aggregates formed by amphiphiles in aqueous solutions [1]. In the field of membrane biophysics, significant efforts have been devoted to the development of SAS analysis methods for the biologically most relevant fluid lamellar phases, including domain-forming lipid mixtures and asymmetric lipid bilayers [2]. In contrast, non-lamellar phases such as the inverted hexagonal (H$_\mathrm{II}$) phase are less commonly found for membrane lipids under physiological conditions, but are of significant biotechnological interest, e.g. for gene transfection [3] or drug delivery systems [4]. H$_\mathrm{II}$ phases are also highly amenable systems for deriving intrinsic lipid curvatures by small-angle X-ray scattering (SAXS) [5, 6, 7, 8], which is the main focus of the present contribution. The intrinsic lipid curvature $C_0$ is given by the negative inverse of the curvature radius, $-1/R_0$, of an unstressed monolayer at the position of the neutral plane, which corresponds to the location where molecular bending and stretching modes are decoupled [9]. Major interest in obtaining reliable $C_0$ values originates from its contribution to the stored elastic energy strain in planar bilayers [10], transmembrane protein function [11, 12] and overall membrane shape [13].

Structural details of H$_\mathrm{II}$ phases have been successfully derived using electron density map reconstruction based on Bragg peak scattering only [14, 15, 16, 17]. However, for highly swollen H$_\mathrm{II}$ phases or at elevated temperatures the number of observed Bragg peaks may become insufficient for a reliable analysis. This may be particularly the case for mixtures of cone-shaped (H$_\mathrm{II}$ phase-forming) and cylindrically shaped (lamellar phase-forming) or inverted cone-shaped (spherical micelle-forming) lipids, which is the typical strategy for determining $C_0$ for non-H$_\mathrm{II}$ phase-forming lipids (see, e.g., [7]). In this case global analysis techniques, which take into account both Bragg peaks and diffuse scattering, become advantageous, as demonstrated previously also for lamellar phases [18].

Global analysis techniques have been reported previously for H$_\mathrm{I}$ phases, i.e., oil-in-water type hexagonal aggregates [19, 20]. The specific need for developing a dedicated model for H$_\mathrm{II}$ phases comes from the observation that unoriented H$_\mathrm{II}$ phases contain previously unreported additional diffuse scattering, originating most likely from packing defects. We have evaluated our global H$_\mathrm{II}$ model for phosphatidylethanolamines with differing hydrocarbon chain composition and as a function of temperature, using Bayesian probability theory to increase the robustness of the analysis. This method significantly increased the obtained information content compared to our previous analysis [7] and allowed us to derive details about the structure, e.g. the lipid headgroup area, hydrocarbon chain length or molecular shape, to name but a few.

Dioleoyl phosphatidylethanolamine (DOPE, diC18:1 PE), palmitoyl oleoyl phosphatidylethanolamine (POPE, C16:0-18:1 PE), dimyristoyl phosphatidylethanolamine (DMPE, diC14:0 PE) and dipalmitoleoyl phosphatidylethanolamine (diC16:1 PE) were purchased in the form of powder from Avanti Polar Lipids (Alabaster, AL).

cis-9-Tricosene was obtained from Sigma-Aldrich (Vienna, Austria). All lipids were used without any further purification. Note that dipalmitoleoyl phosphatidylethanolamine is deliberately abbreviated as diC16:1 PE in order not to be confused with dipalmitoyl phosphatidylethanolamine (diC16:0 PE).

Fully hydrated H$_\mathrm{II}$ phases were prepared using rapid solvent exchange (RSE) [21] as detailed previously [22]. In brief, stock solutions of lipids (10 mg/ml) and tricosene (5 mg/ml) were first prepared by dissolving both compounds in chloroform/methanol (9:1 vol/vol). Ultrapure water (18 MΩ cm) was filled into test tubes and equilibrated at 60-70 °C using an incubator. Lipid and tricosene stock solutions were added to the test tubes containing preheated water (organic solvent/water ratio = 2.55), which were then quickly mounted onto the RSE apparatus described in [23]. The organic solvent was quickly evaporated using the following settings: temperature, 65 °C; vortex speed, 600 rpm; argon flow, 60 ml/min; and a final vacuum pressure of 400-500 mbar. The full procedure was performed for 5 minutes, yielding a lipid pellet at the bottom of the test tube in excess water. All samples contained 12 wt.% tricosene. Tricosene inserts preferentially into the interstitial space between the rods in H$_\mathrm{II}$ phases, effectively reducing packing frustration, as verified previously [24, 7]. Unstressed H$_\mathrm{II}$ phases are required for $C_0$ determination [9].

Small-angle X-ray scattering (SAXS) experiments were performed on a SAXSpace compact camera (Anton Paar, Graz, Austria) equipped with an Eiger R 1M detector system (Dectris, Baden-Daettwil, Switzerland) and a 30 W Genix 3D microfocus X-ray generator (Xenocs, Sassenage, France), supplying Cu-K$\alpha$ ($\lambda$ = 1.54 Å) radiation with a circular spot size of the beam of ∼ µm on the detector. Samples were taken up in paste cells (Anton Paar) and equilibrated at each measured temperature for 10 minutes using a Peltier-controlled sample stage (TC 150, Anton Paar). The total exposure time was 32 minutes (4 frames of 8 min), with the sample-to-detector distance set to 308 mm. Data reduction, including sectorial data integration and corrections for sample transmission and background scattering, was performed using the program SAXSanalysis (Anton Paar).

We initially tested the applicability of a previously reported model-free approach [19]. However, although perfect fits to the experimental data were obtained, the corresponding pair distance distribution functions contained significantly negative values upon approaching the maximum particle size ($D_\mathrm{max}$), which is not physically relevant (O. Glatter, personal communication). This encouraged us to proceed with data modeling.

To do so, we considered a bundle of hexagonal prisms consisting of a water core coated by lipids (Fig. 4.1). Its scattering intensity is characterized by the form factor of a single prism $F(q)$ and the structure factor of the whole bundle $S(q)$, where $q$ is the scattering vector. Assuming that the prisms are long compared to their diameters allows us to decouple form and structure factor [19]:

\[ I(q) \propto |F(q)|^2\, S(q). \tag{1} \]

The structure factor of a two-dimensional lattice of infinitely long hexagonal prisms, averaged over all in-plane vectors, is given by [25, 26, 19]:

\[ S(q,\theta\,|\,n,\Delta) = 1 + \frac{1}{N_\mathrm{hex}(n)}\, e^{-q^2\Delta} \sum_{j\neq k}^{N_\mathrm{hex}(n)} J_0\!\left(q\,|R_j - R_k|\sin\theta\right), \tag{2} \]

where $\theta$ is the angle between the scattering vector and the axis ($z$) normal to the hexagons, $N_\mathrm{hex} = 1 + 3n(n+1)$ is the total number of unit cells for $n$ rings (Fig. 4.1), and $J_0$ is the zero-order Bessel function of the first kind. The $e^{-q^2\Delta}$ term is well known as the Debye-Waller factor, where $\Delta$ is the lateral mean square displacement of the rotation axes of the unit cells around their mean positions $R_j$. For the sake of readability we give the parameter dependencies after the vertical line in each equation, i.e. $n$ and $\Delta$ in Eq. (2).
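As an illustration of Eq. (2), the following minimal Python sketch builds the lattice points for $n$ rings and evaluates the structure factor by direct pair summation. It is not the authors' implementation: the function names, the numerical Bessel routine (Bessel's integral evaluated with a trapezoid rule) and the example values of $a$, $n$ and $\Delta$ are assumptions made here for illustration only.

```python
import math


def bessel_j0(x, nodes=256):
    """J0(x) from Bessel's integral (1/pi) * int_0^pi cos(x sin t) dt.

    The trapezoid rule is used; it is accurate as long as |x| stays well
    below 2 * nodes (assumption of this sketch).
    """
    h = math.pi / nodes
    total = 1.0  # both end points (t = 0 and t = pi) contribute cos(0) / 2
    for m in range(1, nodes):
        total += math.cos(x * math.sin(m * h))
    return total * h / math.pi


def hex_lattice(a, n):
    """Centres of the unit cells of a hexagonal lattice with n rings."""
    points = []
    for i in range(-n, n + 1):
        for j in range(-n, n + 1):
            if abs(i + j) <= n:  # keeps the hexagonal outline
                points.append((a * (i + 0.5 * j), a * math.sqrt(3.0) / 2.0 * j))
    return points


def structure_factor(q, sin_theta, a, n, delta):
    """Eq. (2): S(q, theta | n, Delta) by direct summation over all pairs."""
    pts = hex_lattice(a, n)
    n_hex = len(pts)
    pair_sum = 0.0
    for j in range(n_hex):
        for k in range(j + 1, n_hex):
            dist = math.hypot(pts[j][0] - pts[k][0], pts[j][1] - pts[k][1])
            # factor 2 accounts for the pairs (j, k) and (k, j)
            pair_sum += 2.0 * bessel_j0(q * dist * sin_theta)
    return 1.0 + math.exp(-q * q * delta) * pair_sum / n_hex


if __name__ == "__main__":
    a, n, delta = 76.9, 2, 4.0  # illustrative values (Angstrom, rings, Angstrom^2)
    print("N_hex =", len(hex_lattice(a, n)))
    q10 = 4.0 * math.pi / (math.sqrt(3.0) * a)  # position of the (1,0) reflection
    print("S at the (1,0) position:", structure_factor(q10, 1.0, a, n, delta))
```

The pair sum scales with $N_\mathrm{hex}^2$; for larger domains one would typically tabulate the distinct pair distances and their multiplicities once and reuse them for all $q$ and $\theta$.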
Analogously to [19] we also considered a polydispersity of $N_\mathrm{hex}$, which yields a smooth structure factor. However, since this affects only low $q$-values and does not significantly change the final quality of the result, it was omitted in order to reduce overall computation times. An alternative structure factor, based on the positioning of peaks with flexible shapes on hexagonal lattice points, has been reported previously [27, 20]. However, its application involves a significantly higher number of adjustable parameters, which led us to disregard this option.

The form factor for a hexagonal prism of length $L$ is given by [19]

\[ F(q,\theta\,|\,\boldsymbol{\rho}) = f(q,\theta)\int \rho(r,\varphi)\, J_0(qr\sin\theta)\, r\,dr\,d\varphi = f(q,\theta)\left[F_\mathrm{lipid}(q,\theta\,|\,\boldsymbol{\rho}) + F_\mathrm{inter}(q,\theta\,|\,\boldsymbol{\rho})\right] \tag{3} \]

with $f(q,\theta) = 4\pi \sin\!\left(\tfrac{L}{2}\, q\cos\theta\right)/(q\cos\theta)$, where $\rho(r,\varphi)$ is the in-plane SLD. Here, $\boldsymbol{\rho}$ denotes all parameters describing the SLD $\rho(r,\varphi)$. For evaluation, which due to symmetry is performed over 1/12 of the area of the hexagon, the form factor is split into a core-shell cylindrical part $F_\mathrm{lipid}$, which accounts for the phospholipid only, and $F_\mathrm{inter}$, which accounts for the interstitial space, often taken up by pure hydrophobic filler molecules (here: tricosene). We also found that the length of the cylinders does not affect $F$ significantly above a certain length and therefore fixed $L = 2500$ Å for all our further calculations.

Figure 1: Scheme of the H$_\mathrm{II}$ phase model. The hexagonal lattice (left side) is defined by its lattice parameter $a$ and the number of rings (lattice order) $n$. Its unit cell, shown in the center, is a regular hexagonal prism of length $L$ and consists of a cylindrical water core, surrounded by lipids with their heads pointing toward the central water channel, and a filler molecule occupying the interstices. We denote the axis of rotation by $z$. The unit cell is separated into areas of different SLD, which depend on the molecular composition (see also Fig. 4.2). $R_0$ denotes the position of the neutral plane at the center of the lipid backbone.

The core-shell cylindrical part can be evaluated analytically [28]:

\[ F_\mathrm{lipid}(q,\theta\,|\,\boldsymbol{\rho}) = \int_0^{a/2} r\,\Delta\rho(r)\, J_0(qr\sin\theta)\,dr = \frac{1}{q\sin\theta}\left[\Delta\rho_M\, r_M\, J_1(q r_M\sin\theta) + \sum_{k=1}^{M-1}\left(\Delta\rho_k - \Delta\rho_{k+1}\right) r_k\, J_1(q r_k\sin\theta)\right], \tag{4} \]

where $M$ is the total number of shells, $r_k$ are the shell radii, $\Delta\rho$ is the SLD relative to water ($\Delta\rho = \rho - \rho_W$; $\rho_W = 0.33$ Å$^{-3}$ in case of X-rays), and $J_1$ is the first-order Bessel function of the first kind.

Molecular fluctuations cause a smearing of the sharp boundaries between the individual slabs. Analogously to [29], these were taken into account by translating all shell boundaries $\{r_k\}$ by the distance $x$, whose value was assumed to be distributed by a Gaussian $\mathcal{N}(x\,|\,\mu,\sigma)$ of mean $\mu$ and variance $\sigma^2$:

\[ F_\mathrm{lipid,fluc}(q,\theta\,|\,\boldsymbol{\rho},\sigma_\mathrm{fluc}) = \int dx\; \mathcal{N}(x\,|\,\mu = 0,\sigma_\mathrm{fluc})\, F_\mathrm{lipid}(q,\theta\,|\,\boldsymbol{\rho}'(x)). \tag{5} \]

Here, $\mu = 0$ and $\boldsymbol{\rho}'(x)$ denotes the SLD including the radial shift $x$.

Figure 2: Overlay of the scattering pattern of DOPE (circles) and $|F_\mathrm{lipid}(q)|^2$ (solid line). The phase change between the (1,0) and (1,1) reflections for the H$_\mathrm{II}$ phase leads to a minimum in the absolute square of the form factor, which is absent in the experimental data.

The form factor of the interstices

\[ F_\mathrm{inter}(q,\theta\,|\,\boldsymbol{\rho}) = \frac{6}{\pi}\int_0^{\pi/6} d\varphi \int_{a/2}^{a/(2\cos\varphi)} r\,\Delta\rho_\mathrm{inter}\, J_0(qr\sin\theta)\,dr \tag{6} \]

needs to be evaluated numerically, but remains constant for a given lattice constant $a$ and SLD $\Delta\rho_\mathrm{inter}$. Since $a$ can be determined accurately from Bragg peak positions, $F_\mathrm{inter}$ needs to be calculated only once for each scattering pattern.

H$_\mathrm{II}$ phases are well known to change their phase from '+' to '−' between the (1,0) and (1,1) reflections [15], which brings about a minimum in $|F_\mathrm{lipid}|^2$ between the two peaks. All our present experimental data, as well as those previously reported [7, 22], exhibited significant diffuse scattering between these two peaks (Fig. 4.1). That is, experimental data from unoriented H$_\mathrm{II}$ phases show no form factor minimum in this $q$-range. The additional scattering may also explain the failure of the model-free analysis approach discussed above and possibly arises from packing defects between hexagonal bundles, e.g. at grain boundaries. However, surface-aligned H$_\mathrm{II}$ phases do not exhibit such scattering contributions [30], disfavoring such a scenario.
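Returning briefly to the analytic core-shell expression of Eq. (4): it can be checked against a brute-force radial integration, as in the following minimal Python sketch. The shell radii and SLD contrasts are illustrative placeholders rather than fitted values, and the Bessel functions are evaluated with a simple trapezoid rule on Bessel's integral, which is an assumption of this sketch and not part of the original analysis.

```python
import math


def bessel_jn(n, x, nodes=128):
    """Integer-order J_n(x) from (1/pi) * int_0^pi cos(n t - x sin t) dt."""
    h = math.pi / nodes
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # end points t = 0 and t = pi
    for m in range(1, nodes):
        t = m * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def f_lipid_analytic(q, sin_theta, radii, contrasts):
    """Eq. (4). radii: outer shell radii r_1 < ... < r_M; contrasts: SLDs relative to water."""
    k = q * sin_theta
    last = len(radii) - 1
    total = contrasts[last] * radii[last] * bessel_jn(1, k * radii[last])
    for i in range(last):
        total += (contrasts[i] - contrasts[i + 1]) * radii[i] * bessel_jn(1, k * radii[i])
    return total / k


def f_lipid_numeric(q, sin_theta, radii, contrasts, steps=400):
    """Midpoint-rule evaluation of int_0^{r_M} r * drho(r) * J0(q r sin(theta)) dr."""
    k = q * sin_theta
    total, r_in = 0.0, 0.0
    for r_out, drho in zip(radii, contrasts):
        h = (r_out - r_in) / steps
        for m in range(steps):
            r = r_in + (m + 0.5) * h
            total += drho * r * bessel_jn(0, k * r) * h
        r_in = r_out
    return total


if __name__ == "__main__":
    # water core, headgroup, backbone, hydrocarbon shell (placeholder numbers)
    radii = [22.25, 25.25, 29.25, 38.45]      # Angstrom
    contrasts = [0.0, 0.15, 0.10, -0.05]      # relative to water, arbitrary illustration
    print(f_lipid_analytic(0.1, 1.0, radii, contrasts))
    print(f_lipid_numeric(0.1, 1.0, radii, contrasts))
```

Both routes should agree to within the quadrature error, which makes the analytic form the natural choice inside an optimization or sampling loop.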
The additional diffuse scattering thus appears to be a property only of unoriented H$_\mathrm{II}$ phases that are fully immersed in aqueous solution. We speculate that the outermost boundary of H$_\mathrm{II}$ structures may try to avoid contact of the hydrocarbon with water by forming a lamellar layer, i.e., in some ways similar to hexosomes [4]. Indeed, we were able to account for the additional diffuse scattering by adding the form factor of a laterally uniform, infinitely extended, planar bilayer

\[ F_\mathrm{BL}(q\,|\,\boldsymbol{\rho}_\mathrm{lam}) = 4\pi \int \Delta\rho_\mathrm{lam}(z)\, e^{iqz}\, dz \tag{7} \]

to the total scattered intensity, where $z$ is the coordinate normal to the lamellar phase and $\boldsymbol{\rho}_\mathrm{lam}$ are the parameters describing the SLD of the lamellar phase. We cannot exclude that the additional diffuse scattering originates from unilamellar vesicles or other kinetically trapped aggregates formed during sample preparation.

Considering orientational averaging, we finally obtained for the total scattered intensity

\[ \begin{aligned} I_\mathrm{mod}(q\,|\,n,\Delta,\sigma_\mathrm{fluc},\boldsymbol{\rho},c_\mathrm{lam},\boldsymbol{\rho}_\mathrm{lam}) \propto{} & \int_0^{\pi} |F(q,\theta\,|\,\boldsymbol{\rho},\sigma_\mathrm{fluc})|^2\, S(q,\theta\,|\,n,\Delta)\,\sin\theta\, d\theta \\ & + 2\, c_\mathrm{lam}\, F_\mathrm{BL}(q\,|\,\boldsymbol{\rho}_\mathrm{lam}) \int_0^{\pi} F(q,\theta\,|\,\boldsymbol{\rho},\sigma_\mathrm{fluc})\, s(q,\theta\,|\,n,\Delta)\,\sin\theta\, d\theta \\ & + c_\mathrm{lam}^2\, |F_\mathrm{BL}(q\,|\,\boldsymbol{\rho}_\mathrm{lam})|^2, \end{aligned} \tag{8} \]

where $c_\mathrm{lam}$ denotes the fraction of the lamellar phase. The structure factor

\[ s(q,\theta\,|\,n,\Delta) = \frac{1}{\sqrt{N_\mathrm{hex}(n)}}\, e^{-q^2\Delta/2} \sum_{j}^{N_\mathrm{hex}(n)} J_0\!\left(q\,|R_j|\sin\theta\right) \tag{9} \]

was derived analogously to the H$_\mathrm{II}$ structure factor (Eq. 2), and the form factor is $F(q,\theta\,|\,\boldsymbol{\rho},\sigma_\mathrm{fluc}) = f(q,\theta)\left[F_\mathrm{lipid,fluc}(q,\theta\,|\,\boldsymbol{\rho},\sigma_\mathrm{fluc}) + F_\mathrm{inter}(q,\theta\,|\,\boldsymbol{\rho})\right]$, see Eq. 3.

In this section we develop a model for the SLDs described by the parameters $\boldsymbol{\rho}$ and $\boldsymbol{\rho}_\mathrm{lam}$. For increased structural fidelity we considered the minimum number of parameters. We also constrained the SLDs by the specific molecular composition. Assuming that tricosene partitions exclusively into the interstitial space, the PE structure was parsed into three cylindrical shells of a wedge-shaped lipid unit cell of opening angle $\alpha$ and height $h$ (Fig. 4.2): (i) the headgroup (H), consisting of phosphate and ethanolamine groups, (ii) the glycerol backbone (BB), given by the carbonyl and glycerol groups, and (iii) the tails (HC), consisting of all methyl, methine and methylene groups. The outer radius of the wedge, $a/2$, is fixed by the lattice parameter $a$, which follows from the Bragg peak positions

\[ q_{kl} = \frac{4\pi}{\sqrt{3}\,a}\sqrt{k^2 + kl + l^2}, \]

where $k$ and $l$ are the Miller indices. The position of the neutral plane $R_0$ was assumed to be in the center of the BB shell. This was motivated by bending/compression experiments, which obtained estimates for the location of the neutral plane within the lipid backbone regime [9, 5]. In our model, the entire PE structure is described by the intrinsic curvature $C_0 = -1/R_0$, the width of the headgroup $d_H$ and the width of the backbone $d_{BB}$. Further structural parameters of interest, such as the width of the hydrocarbon chain

\[ d_{HC} = a/2 - d_{BB}/2 - R_0, \tag{10} \]

the distance

\[ d_{HH} = 2\,(d_H + d_{BB} + d_{HC}) \tag{11} \]

and the radius of the water core

\[ R_W = (a - d_{HH})/2 = R_0 - d_{BB}/2 - d_H, \tag{12} \]

follow from these three parameters.

Figure 3: Composition-specific SLD modeling of phosphatidylethanolamines. a) The unit cell of a single lipid has the shape of a cylinder sector of radius $a/2$.

In the case of X-ray scattering the SLDs (electron densities) of each shell are given by $\rho_k = n^e_k / V_k$ with $k \in \{H, BB, HC\}$, where $n^e_k$ is the number of electrons of a given quasi-molecular lipid fragment, $V_H = 110$ Å$^3$, $V_{BB} = 135$ Å$^3$ [31], and $V_{HC} = V_\mathrm{lipid} - V_{BB} - V_H$. Further, we estimated $\Delta\rho_\mathrm{inter}$ (Eq. (6)) of tricosene by molecular averaging over the fractional volumes $v_{CH}$, $v_{CH_2}$ and $v_{CH_3}$ [31] (see supporting Tab. S1). In our model the electron density is sufficiently described by the parameters $C_0$, $d_H$, $d_{BB}$ and $V_\mathrm{lipid}$, hence $\boldsymbol{\rho} = \{C_0, d_H, d_{BB}, V_\mathrm{lipid}\}$. All other parameters can be deduced from these by using the lipid contribution to the volume of the $k$'th shell,

\[ V_k = \hat{A}\,\frac{r_{k+1}^2 - r_k^2}{2} - \tilde{n}^k_W V_W, \tag{13} \]

where $\tilde{n}^k_W$ is the number of water molecules within each shell and $V_W = 30$ Å$^3$ is the molecular volume of water.
$\hat{A} = \alpha h$ is the mantle area of a sector of unit radius and can be obtained using Eq. (13) with $k = HC$ and $\tilde{n}^{HC}_W = 0$,

\[ \hat{A} = \frac{2\,V_{HC}}{a^2/4 - (R_0 + d_{BB}/2)^2}. \tag{14} \]

Hence, Eq. (13) also defines $\tilde{n}^H_W$ and $\tilde{n}^{BB}_W$.

Using our parametrization it is straightforward to derive the area per lipid at any position within the molecule. For example, the area per lipid at the neutral plane is

\[ A_0 = \hat{A} R_0. \tag{15} \]

Further, following [32], the molecular shape parameter is given by

\[ \tilde{S} = \frac{V_{HC}}{\hat{A}\,(R_0 + d_{BB}/2)\, d_{HC}}, \tag{16} \]

where $\tilde{S} = 1$ represents cylindrical (lamellar phase-forming) molecules, while $\tilde{S} < 1$ and $\tilde{S} > 1$ represent non-cylindrical molecular shapes; $\tilde{S} > 1$ corresponds to the H$_\mathrm{II}$ structure.

The form factor of the additional lamellar phase was calculated by integrating Eq. (7), using a simple SLD model consisting of head and tail slabs [28]:

\[ \begin{aligned} F_\mathrm{BL}(q) = \frac{4\pi}{iq}\Big\{ & \Delta\rho_{H,\mathrm{lam}}\left[e^{iq d_{H,\mathrm{lam}}} - 1\right] \\ & + \Delta\rho_{H,\mathrm{lam}}\left[e^{i2q(d_{H,\mathrm{lam}} + d_{HC,\mathrm{lam}})} - e^{iq(d_{H,\mathrm{lam}} + 2 d_{HC,\mathrm{lam}})}\right] \\ & + \Delta\rho_{HC,\mathrm{lam}}\left[e^{iq(d_{H,\mathrm{lam}} + 2 d_{HC,\mathrm{lam}})} - e^{iq d_{H,\mathrm{lam}}}\right] \Big\}, \end{aligned} \tag{17} \]

where $\Delta\rho_{H,\mathrm{lam}}$ and $\Delta\rho_{HC,\mathrm{lam}}$ are the headgroup and hydrocarbon SLDs relative to water, respectively. These were derived as detailed above by counting the number of electrons in each slab and dividing by the corresponding volumes $V_{H,\mathrm{lam}}$ or $V_{HC,\mathrm{lam}}$. Assuming that $V_{HC,\mathrm{lam}} = V_{HC}$ and $V_{H,\mathrm{lam}} = V_H + V_{BB}$, the hydrocarbon slab thickness results from

\[ d_{HC,\mathrm{lam}} = \frac{V_{HC}}{A_L} \tag{18} \]

and the headgroup thickness from

\[ d_{H,\mathrm{lam}} = \frac{V_{H,\mathrm{lam}} + n_{W,\mathrm{lam}} V_W}{A_L}. \tag{19} \]

Hence, the area per lipid $A_L$ and the number of headgroup-associated water molecules $n_{W,\mathrm{lam}}$ are the only parameters for the lamellar phase, $\boldsymbol{\rho}_\mathrm{lam} = \{A_L, n_{W,\mathrm{lam}}\}$.

Table 1: Overview of the model parameters for fully hydrated unoriented H$_\mathrm{II}$ phases.

Occurrence              x                  Meaning
Structure factor        $\Delta$           mean square displacement of the lattice points
                        $n$                number of hexagonal shells (domain size)
H$_\mathrm{II}$ form factor   $\sigma_\mathrm{fluc}$   fluctuation constant of lipid unit cell
                        $C_0$              intrinsic curvature
                        $d_H$              width of the lipid headgroup
                        $d_{BB}$           width of the lipid backbone
                        $V_\mathrm{lipid}$ lipid volume
Lamellar form factor    $c_\mathrm{lam}$   lamellar form factor scaling constant
                        $A_L$              area per lipid of the lamellar phase
                        $n_{W,\mathrm{lam}}$   number of water molecules in the headgroup slab of the lamellar phase
Signal scaling          $\Gamma$           instrumental scaling constant
                        $I_\mathrm{inc}$   incoherent background

The final model for scattered intensities of unoriented fully hydrated H$_\mathrm{II}$ phases is given by

\[ I_\mathrm{sim}(q\,|\,\mathbf{x}) = \Gamma\, I_\mathrm{mod}(q\,|\,n,\Delta,\sigma_\mathrm{fluc},\boldsymbol{\rho},c_\mathrm{lam},\boldsymbol{\rho}_\mathrm{lam}) + I_\mathrm{inc}, \tag{20} \]

where $I_\mathrm{mod}$ is given by Eq. (8), $\Gamma$ is an instrumental scaling constant and $I_\mathrm{inc}$ accounts for incoherent scattering.
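The geometric relations of Eqs. (10)-(16), together with the Bragg peak positions, can be collected in a few lines of code. The following Python sketch is meant only as an illustration of how the derived quantities follow from $\{C_0, d_H, d_{BB}, V_\mathrm{lipid}\}$ and the lattice parameter $a$; the function names and the input values in the example are assumptions chosen for illustration and are not fit results.

```python
import math

V_H, V_BB, V_W = 110.0, 135.0, 30.0  # Angstrom^3, volumes quoted in the text


def bragg_q(a, k, l):
    """Bragg peak position q_kl of a two-dimensional hexagonal lattice."""
    return 4.0 * math.pi / (math.sqrt(3.0) * a) * math.sqrt(k * k + k * l + l * l)


def derived_parameters(a, c0, d_h, d_bb, v_lipid):
    """Structural quantities following from the SLD parametrization."""
    r0 = -1.0 / c0                                   # neutral plane, C0 = -1/R0
    d_hc = a / 2.0 - d_bb / 2.0 - r0                 # Eq. (10)
    d_hh = 2.0 * (d_h + d_bb + d_hc)                 # Eq. (11)
    r_w = r0 - d_bb / 2.0 - d_h                      # Eq. (12)
    v_hc = v_lipid - V_BB - V_H
    r_pa = r0 + d_bb / 2.0                           # polar/apolar interface
    a_hat = 2.0 * v_hc / ((a / 2.0) ** 2 - r_pa ** 2)  # Eq. (14)

    def waters(r_in, r_out, v_shell):
        """Eq. (13) solved for the number of water molecules in a shell."""
        return (a_hat * (r_out ** 2 - r_in ** 2) / 2.0 - v_shell) / V_W

    return {
        "R_0": r0,
        "d_HC": d_hc,
        "d_HH": d_hh,
        "R_W": r_w,
        "A_hat": a_hat,
        "A_0": a_hat * r0,                           # Eq. (15)
        "S_tilde": v_hc / (a_hat * r_pa * d_hc),     # Eq. (16)
        "n_W_H": waters(r_w, r_w + d_h, V_H),
        "n_W_BB": waters(r_w + d_h, r_pa, V_BB),
    }


if __name__ == "__main__":
    # placeholder inputs: a and d's in Angstrom, C0 in 1/Angstrom, V_lipid in Angstrom^3
    params = derived_parameters(a=76.9, c0=-0.0367, d_h=3.0, d_bb=4.0, v_lipid=1210.0)
    for name, value in params.items():
        print(f"{name:8s} {value:10.3f}")
    print("q_10 =", bragg_q(76.9, 1, 0))
```

Negative values of `n_W_H` or `n_W_BB` indicate that a shell is too small to accommodate its molecular group, so such parameter combinations would have to be rejected.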
In total, we have 12 model parameters, denoted by $\mathbf{x}$, which are listed in Tab. 5.

There are various ways of estimating the parameters $\mathbf{x}$. For a given data set $\mathbf{I}$ with standard deviations $\boldsymbol{\sigma}$, and in the presence of a well-defined global minimum, the method of least squares yields fitting parameters by minimizing a cost function $\chi^2(\mathbf{x}\,|\,\mathbf{I},\boldsymbol{\sigma})$. However, for our present data such an approach led to a significant variation of results between consecutive optimization runs, indicating a cost function landscape with a weakly defined global minimum. Thus, besides the $\mathbf{x}$ values being unreliable, error estimates and potential correlations between the parameters also remained undetermined.

To achieve higher confidence in our results we decided to use Bayesian probability theory; for a detailed introduction, see [33, 34, 35, 36]. In brief, we were interested in deriving the probability $p(\mathbf{x}\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})$, i.e. the probability of the parameters $\mathbf{x}$ given the set of experimental data $\mathbf{I}$ with standard deviations $\boldsymbol{\sigma}$ and additional information $\mathcal{I}$ that might be present, such as the finite width of a lipid molecule. In the framework of Bayesian probability theory, Bayes' theorem shows how to calculate this quantity, also called the posterior:

\[ \overbrace{p(\mathbf{x}\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})}^{\text{posterior}} \propto \overbrace{p(\mathbf{I}\,|\,\mathbf{x},\boldsymbol{\sigma},\mathcal{I})}^{\text{likelihood}}\; \overbrace{p(\mathbf{x}\,|\,\cancel{\boldsymbol{\sigma}},\mathcal{I})}^{\text{prior}}. \tag{21} \]

Bayes' theorem constitutes the rule for learning from experimental data. The prior probability $p(\mathbf{x}\,|\,\mathcal{I})$ represents the prior knowledge about the unknown quantities $\mathbf{x}$. We crossed out $\boldsymbol{\sigma}$ in the prior, since the prior does not depend on the standard deviations of the data. The likelihood $p(\mathbf{I}\,|\,\mathbf{x},\boldsymbol{\sigma},\mathcal{I})$, representing the probability for the data $\mathbf{I}$ given $\mathbf{x}$ and $\boldsymbol{\sigma}$, includes all information about the measurement itself.
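The prior probabilities $p(\mathbf{x}\,|\,\mathcal{I})$ were assumed to be uniformly distributed between lower ($x_\mathrm{min}$) and upper ($x_\mathrm{max}$) constraints for all parameters. Therefore,

\[ p(\mathbf{x}\,|\,\mathcal{I}) = \prod_i \frac{\Theta(x_i - x_{i,\mathrm{min}}) - \Theta(x_i - x_{i,\mathrm{max}})}{x_{i,\mathrm{max}} - x_{i,\mathrm{min}}}, \tag{22} \]

where $\Theta(x)$ is the Heaviside step function. For each parameter $x_i$, $x_{i,\mathrm{min}}$ and $x_{i,\mathrm{max}}$ denote physically meaningful boundaries. In particular, we constrained the parameters $d_H$ and $d_{BB}$ by the conditions

\[ \tilde{n}^{H}_W \geq 0, \qquad \tilde{n}^{BB}_W \geq 0. \tag{23} \]

This means that the volumes of the head and backbone shell (Eq. (13)) have to be large enough to accommodate the respective molecular group.

We now consider the likelihood $p(\mathbf{I}\,|\,\mathbf{x},\boldsymbol{\sigma},\mathcal{I})$. Since we did not trust per se the experimentally derived error estimates $\boldsymbol{\sigma}$ for the scattered intensities, we assumed that their real values $\tilde{\boldsymbol{\sigma}}$ are connected to $\boldsymbol{\sigma}$ by a scaling factor $\eta$. Using the marginalization rule of Bayesian probability theory we obtain

\[ \begin{aligned} p(\mathbf{I}\,|\,\mathbf{x},\boldsymbol{\sigma},\mathcal{I}) &= \int d\eta\, d\tilde{\boldsymbol{\sigma}}\; p(\mathbf{I}\,|\,\mathbf{x},\cancel{\boldsymbol{\sigma}},\tilde{\boldsymbol{\sigma}},\cancel{\eta},\mathcal{I})\; p(\tilde{\boldsymbol{\sigma}},\eta\,|\,\cancel{\mathbf{x}},\boldsymbol{\sigma},\mathcal{I}) \\ &= \int d\eta\, d\tilde{\boldsymbol{\sigma}}\; p(\mathbf{I}\,|\,\mathbf{x},\tilde{\boldsymbol{\sigma}},\mathcal{I})\; \underbrace{p(\tilde{\boldsymbol{\sigma}}\,|\,\eta,\boldsymbol{\sigma},\mathcal{I})}_{\prod_i \delta(\tilde{\sigma}_i - \eta\sigma_i)}\; \underbrace{p(\eta\,|\,\cancel{\boldsymbol{\sigma}},\mathcal{I})}_{\propto 1/\eta} \\ &\propto \int d\eta\; \underbrace{p(\mathbf{I}\,|\,\mathbf{x},\tilde{\boldsymbol{\sigma}} = \eta\boldsymbol{\sigma},\mathcal{I})}_{\mathcal{N}(\mathbf{I}\,|\,\mathbf{x},\eta\boldsymbol{\sigma})}\; \frac{1}{\eta}, \end{aligned} \tag{24} \]

where $d\tilde{\boldsymbol{\sigma}}$ is shorthand for $\prod_i d\tilde{\sigma}_i$. Here, we have specifically made use of the so-called Jeffreys prior $p(\eta) \propto 1/\eta$ [37] for the scale-invariant quantity $\eta$, meaning that we have a priori no idea about the order of magnitude of $\eta$. This scaling connects the likelihood (Eq. (24)) to the multivariate Gaussian

\[ \mathcal{N}(\mathbf{I}\,|\,\mathbf{x},\eta\boldsymbol{\sigma}) = \prod_{i=1}^{N_q} \frac{1}{\eta\sigma_i\sqrt{2\pi}} \exp\!\left[-\frac{1}{2\eta^2\sigma_i^2}\left(I_\mathrm{sim}(q_i\,|\,\mathbf{x}) - I^\mathrm{obs}_i\right)^2\right], \tag{25} \]

where $\eta$ has to be integrated out, respecting Jeffreys prior, and $I^\mathrm{obs}_i$ denotes the observed intensity at $q_i$.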
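To make the structure of the posterior concrete, the sketch below combines the uniform prior of Eq. (22), the Jeffreys prior on $\eta$ and the Gaussian likelihood of Eq. (25) into an unnormalized log-posterior, and samples it with a plain random-walk Metropolis scheme using the acceptance rule of Eq. (28). Everything specific in it is an assumption for illustration: `toy_model` is a hypothetical two-parameter stand-in for $I_\mathrm{sim}$, and the step sizes, bounds and chain length are arbitrary. It is not the sampler used for the results reported here.

```python
import math
import random


def toy_model(q, x):
    """Hypothetical stand-in for I_sim(q | x): scale * exp(-q) + constant background."""
    return x[0] * math.exp(-q) + x[1]


def log_posterior(x, eta, q, i_obs, sigma, model, bounds):
    """Unnormalized log of N(I | x, eta*sigma) * (1/eta) * p(x | I)."""
    if eta <= 0.0:
        return -math.inf
    for value, (lower, upper) in zip(x, bounds):
        if not lower <= value <= upper:
            return -math.inf                 # outside the uniform prior, Eq. (22)
    logp = -math.log(eta)                    # Jeffreys prior on eta
    for qi, ii, si in zip(q, i_obs, sigma):
        s = eta * si                         # rescaled error estimate
        logp -= math.log(s * math.sqrt(2.0 * math.pi))
        logp -= 0.5 * ((model(qi, x) - ii) / s) ** 2   # Gaussian term, Eq. (25)
    return logp


def metropolis(logp, x0, eta0, step, eta_step, n_steps, seed=0):
    """Random-walk Metropolis chain over (x, eta); x0 must lie inside the prior."""
    rng = random.Random(seed)
    x, eta = list(x0), eta0
    lp = logp(x, eta)
    chain = []
    for _ in range(n_steps):
        x_new = [xi + rng.uniform(-s, s) for xi, s in zip(x, step)]
        eta_new = eta + rng.uniform(-eta_step, eta_step)
        lp_new = logp(x_new, eta_new)
        # accept with probability min(1, p_new / p_old), evaluated in log space
        if math.log(1.0 - rng.random()) < lp_new - lp:
            x, eta, lp = x_new, eta_new, lp_new
        chain.append((list(x), eta))
    return chain


if __name__ == "__main__":
    rng = random.Random(1)
    q = [0.1 * i for i in range(1, 21)]
    sigma = [0.05] * len(q)
    i_obs = [toy_model(qi, (2.0, 0.5)) + rng.gauss(0.0, 0.05) for qi in q]
    bounds = [(0.0, 10.0), (0.0, 5.0)]
    logp = lambda x, eta: log_posterior(x, eta, q, i_obs, sigma, toy_model, bounds)
    chain = metropolis(logp, [1.0, 1.0], 1.0, [0.05, 0.05], 0.1, 5000)
    kept = chain[len(chain) // 5:]           # drop the initial part of the chain
    print("mean scale parameter:", sum(x[0] for x, _ in kept) / len(kept))
```

Working with the logarithm of the posterior avoids numerical underflow of the product in Eq. (25) when the number of data points is large.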
In Eq. (25), $N_q$ is the number of data points for a given scattering pattern.

For illustration, consider an arbitrary function $O(\mathbf{x})$ with the parameters $\mathbf{x}$. The expectation value of $O(\mathbf{x})$ is then calculated by evaluating the integral

\[ \langle O(\mathbf{x}) \rangle = \int d\mathbf{x}\, d\eta\; O(\mathbf{x})\, p(\mathbf{x},\eta\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I}), \tag{26} \]

where

\[ p(\mathbf{x},\eta\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I}) = \frac{1}{Z}\, \mathcal{N}(\mathbf{I}\,|\,\mathbf{x},\eta\boldsymbol{\sigma})\, \frac{1}{\eta}\, p(\mathbf{x}\,|\,\mathcal{I}) \tag{27} \]

with the normalization constant $Z$. For example, using $O(\mathbf{x}) = x_i$ produces the expectation value $\langle x_i \rangle$ for parameter $x_i$.

A suitable technique for performing these integrals and sampling from the probability distribution $p(\mathbf{x},\eta\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})$ is Markov chain Monte Carlo (MCMC), which is based on constructing a so-called Markov chain with the desired distribution of $\mathbf{x}$ in equilibrium. We used the Metropolis-Hastings algorithm for generating the Markov chain $\{\mathbf{x}_k, \eta_k\}$. Starting with a parameter set $\mathbf{x}_{k=1}$ and $\eta_{k=1}$, every new parameter set $k+1$ is proposed by varying parameters in the old parameter set $k$. The new parameter set $k+1$ is accepted with the probability

\[ P_\mathrm{acc} = \min\left\{1,\; \frac{p(\mathbf{x}_{k+1},\eta_{k+1}\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})}{p(\mathbf{x}_k,\eta_k\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})}\right\}. \tag{28} \]

We found that the first 10-20 % of a Markov chain have to be discarded to ensure that the rest of the Markov chain is independent of the initial state $\mathbf{x}_{k=1}$ and $\eta_{k=1}$, i.e. that the Markov chain is equilibrated to the desired distribution. In addition, the states in the Markov chain have to be uncorrelated, which can be ensured by taking only every $N_\mathrm{run}$-th state of the Markov chain. $N_\mathrm{run}$ can be controlled by evaluating the autocorrelation function or using techniques like binning and jackknife. Finally, the observable can be estimated by

\[ \bar{O} := \langle O(\mathbf{x}) \rangle \approx \frac{1}{N_\mathrm{Markov}} \sum_{k=1}^{N_\mathrm{Markov}} O(\mathbf{x}_k), \tag{29} \]

i.e. by taking the mean value of $N_\mathrm{Markov}$ uncorrelated Markov chain elements. The confidence intervals can be estimated from

\[ \Delta\bar{O} := \frac{\sigma_O}{\sqrt{N_\mathrm{Markov}}}. \tag{30} \]

The variance

\[ \sigma_O^2 = \langle O(\mathbf{x})^2 \rangle - \langle O(\mathbf{x}) \rangle^2 \tag{31} \]

can in turn be estimated from the Markov chain. Alternatively, the uncertainty can be determined from independent MCMC runs.

Since the Markov chain $\{\mathbf{x}_k, \eta_k\}$ is a representative sample drawn from $p(\mathbf{x},\eta\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})$, it can be used to plot the probability distribution, e.g. the marginal probability distribution $p(x_i, x_j\,|\,\mathbf{I},\boldsymbol{\sigma},\mathcal{I})$ for the parameters $i$ and $j$, by plotting the two-dimensional histogram of the samples $\{x_i^k\}$ and $\{x_j^k\}$. This reveals correlations between the parameters $i$ and $j$, i.e. mutual parameter dependencies that could lead to ambiguous results using the least squares method. Additionally, the cost function

\[ \chi^2 = \frac{1}{N_q} \sum_{i=1}^{N_q} \frac{\left[I_\mathrm{sim}(q_i\,|\,\mathbf{x}) - I^\mathrm{obs}_i\right]^2}{\tilde{\sigma}_i^2} \tag{32} \]

is saved for every run.

We first explored our model and the Bayesian analysis on the well-studied H$_\mathrm{II}$ structure of DOPE [15, 17, 14, 7]. We emphasize that the choice of our model restricts the algorithm to a certain functional space for describing the scattering pattern.
This constraint can lead to some systematic deviations from the experimental data, and Bayesian model comparison can be used for choosing the appropriate model.

Clearly, our model is able to account well for most features of the scattering pattern up to $q$ ∼ . − (Fig. 6.1). In particular, the bilayer form factor compensates well for the form factor minimum between the (1,0) and the (1,1) peak, but also adds some diffuse scattering at higher $q$-values. The small peak observed in the calculated intensity at very low $q$ is an artifact resulting from the structure factor. This could be removed by averaging over a distribution of domains [19], but it does not affect the overall structural results and the averaging has therefore been omitted to reduce computational cost (see also above). Further, the proximity of a form factor minimum to the (2,1) peak of the H$_\mathrm{II}$ phase nearly causes an extinction in the scattering data. Note that tricosene-free DOPE samples exhibit a clear (2,1) reflection (Fig. S4c). However, because of strain-induced distortions of the hexagonal prisms such samples cannot be analysed with the present model. The maximum a posteriori (MAP) solution still shows a slightly more pronounced (2,1) peak, since a perfect fit in this $q$-range would lead to significant deviations between model and experimental data close to the (2,0) peak, which due to its smaller errors contributes more strongly to the overall goodness of the MAP solution. Additionally, our MAP solution underestimates the contributions of the (2,2) and (3,1) peaks due to the proximity of the cylinder form factor to two minima. To account for this we tested more complex SLD models, by considering either a separate slab for the methyl terminus of the hydrocarbon chain, or a linear decrease of the electron density in the hydrocarbon regime. However, this did not lead to a significant improvement of the agreement between model and experimental data in this $q$-range.
In order to avoid overfitting we therefore remained with the SLD model as described in section 3. Table S2 lists the corresponding expectation values $\langle \mathbf{x} \rangle$ and variances $\sigma_x$. In order to check for reproducibility, we prepared a fresh DOPE sample. Results listed in Tab. S2 show that all structural lipid parameters are identical within experimental uncertainty (see also Fig. S4b). Differences in lattice parameters, such as $a$ and $\Delta$, relate to slight variations of tricosene content.

One of the benefits of the Bayesian analysis compared to the least squares method is the possibility to reveal correlations between adjustable parameters, just by looking at the 2D marginal probability density distributions, see e.g. Fig. 6.1a. Marginal distributions of all other parameters are shown in the supplementary Figs. S1-S3. Most parameter pairs show no correlations and exhibit probability distributions with Gaussian-like behavior, including $\sigma_\mathrm{fluc}$, $V_\mathrm{lipid}$, and $\Gamma$. Significant correlations can be seen for the parameters $d_H$ and $d_{BB}$ with $C_0$, as well as between $n_{W,\mathrm{lam}}$ and $A_L$. A strong correlation between two parameters suggests the possibility to simplify the model. However, this would be highly specific for a given amphiphile and was consequently not considered. The parameters $d_H$ and $d_{BB}$ exhibit broad probability distributions with no well-defined maximum. In turn, $C_0$ has a peaked probability distribution yielding a well-defined estimate and uncertainty.

Figure 4: Expectation value and error bands of the intensity of fully hydrated DOPE at 35 °C (a) including the involved structure (b) and form factors (c, blue: hexagonal FF, green: lamellar FF).

Here, we discuss for illustration purposes the correlation between $C_0$ and $d_H$ (Fig. 6.1a). Solutions along the diagonal line give similar scattering intensities, but lead to significantly different electron density profiles, see Fig. 6.1b,c. The correlation between $d_H$ and $C_0$ may appear counterintuitive.
From geometric and physical arguments it follows that small $d_H$ values represent a bending of the phosphate-ethanolamine director of the lipid headgroup toward the polar/apolar interface, which leads to a shift of $C_0$ toward more positive values. However, this would lead to significantly different scattered intensities and hence to non-optimal solutions. The algorithm therefore compensates for this by decreasing $C_0$ for small $d_H$. The abrupt drop of the headgroup thickness probability distribution at small $d_H$ is due to the termination criterion (Eq. (23)).

In the following, we discuss some expectation values $\langle x \rangle$ and the errors $\sigma_x$ obtained by applying Eqs. (29) and (31). Table 6.1 compares the obtained structural parameters of DOPE to existing literature values. Our results are in good agreement with previous reports, given the different additives (alkanes or alkenes; some studies did not use any filler molecule) and slight variations in temperature. Note that in some cases $A$ and $C_0$ have been reported for the pivotal plane. The pivotal plane marks the position within the lipid where the molecular area does not change upon deformation and is usually slightly closer to the hydrocarbon tails than the neutral plane [5, 7]. This leads to a slight shift of $C_0$ toward positive values.

The most direct comparison of $C_0$ can be made to our previous work [7], which was performed at the same temperature and tricosene content. Here, we find that the global model combined with Bayesian analysis yields an intrinsic curvature that agrees well, within experimental uncertainty, with our previous result.

Increasing temperature for DOPE should yield a decrease of the lipid chain length and a concomitant significant increase of the area per lipid at the methyl terminus, leading to more negative intrinsic curvatures as reported previously [15, 17, 14, 7]. Indeed, our analysis yielded a linear decrease of $C_0$ and $d_{HC}$ (Fig. 6.2). The slope $\Delta C_0/\Delta T = (-\ldots \pm \ldots) \times 10^{-\ldots}$ (Å K)$^{-1}$ is identical to our previously reported value [7]. The relative change of the chain length is in turn $\Delta d_{HC}/\Delta T = (-\ldots \pm \ldots)\,\ldots$ The shape parameter $\hat{S}$ shows only a modest increase of $\Delta \hat{S}/\Delta T = (1.\ldots \pm \ldots) \times 10^{-\ldots}$ K$^{-1}$, despite the more negative $C_0$ values at higher temperatures and despite the decrease of $d_{HC}$ and increase of $V_{HC}$ ($\Delta V_{HC}/\Delta T = (0.\ldots \pm \ldots)$ Å$^3$/K). This results from a concomitant increase of the headgroup area ($\Delta A/\Delta T = (0.\ldots \pm \ldots)$ Å$^2$/K) at the neutral plane (and analogously at the position of the polar/apolar interface, $R + d_{BB}$), which compensates for the changes of $d_{HC}$ and $V_{HC}$. The radius of the water core decreases with $\Delta R_W/\Delta T = (-\ldots \pm \ldots) \times 10^{-\ldots}$ Å/K.

Figure 5: Marginal probability distributions $p(C_0 \mid I, \sigma, I)$ and $p(d_H \mid I, \sigma, I)$ of intrinsic curvature $C_0$ and headgroup width $d_H$, and $p(C_0, d_H \mid I, \sigma, I)$ (panel a). The red cross and corresponding lines mark the sample with the lowest $\chi^2$ (MAP solution); the green circle shows the mean value of the distribution. Panels b) and c) show the corresponding fits and SLD profiles.

Table 2: Comparison of structural parameters of DOPE to literature values.
Columns: $a$ / Å; $V_L$ / Å$^3$; $d_{HH}$ / Å; $R_W$ / Å; $A$ / Å$^2$; $C_0$ / Å$^{-1}$; reference.
Surviving entries (row and column assignment not recoverable): 76.9 ± ± 10 32.4 ± ± ± ± a i [16] b c - - - - - -0.0367 ± d i - [17] e 76 - - - 51.5 i -0.031 i [6] f - - - - - -0.0399 ± g - - - - - -0.0365 ± h
Footnotes: a: T = 35 °C, dodecane; b: T = 22 °C, tetradecane; c: T = 30 °C, dodecane; d: T = 25 °C; e: T = 30 °C; f: T = 25 °C; g: T = 35 °C, tricosene; h: T = 30 °C, tetradecane; i: determined at the pivotal plane.

Finally, we tested the applicability of the analysis technique to PEs with differing hydrocarbon chain composition. In particular, we studied the $H_{II}$ phases of POPE, which has a palmitoyl and an oleoyl chain, DMPE, which has two myristoyl chains, and diC16:1PE, which has two palmitoleoyl chains. Note that pure POPE forms an $H_{II}$ phase only above 71 °C, while the lamellar to $H_{II}$ phase transition temperature $T_H$ for pure diC16:1PE was reported to be 43.… °C and $T_H > \ldots$ °C for DMPE [38]. The addition of alkanes or alkenes to inverted hexagonal phases is known to reduce stress resulting from the interstitial space between the individual rods [39, 40, 41]. We previously demonstrated that tricosene lowers $T_H$ for POPE sufficiently to perform an $H_{II}$ phase analysis at physiological temperature [7]. Similarly, diC16:1PE formed a neat $H_{II}$ phase at 35 °C upon adding 12 wt% tricosene (see below). In the case of DMPE we found a pure $H_{II}$ scattering pattern only for $T \geq \ldots$ °C, indicating a significantly less negative $C_0$. For this reason we performed the global analysis at 35 °C for POPE and diC16:1PE and at 80 °C for DMPE.

Figure 6: Structural parameters of DOPE $H_{II}$ as a function of temperature resulting from the Bayesian analysis. The panels show probability densities of a) the intrinsic curvature, b) the hydrocarbon chain length, c) the hydrocarbon chain volume, d) the area per lipid at the neutral plane, e) the shape parameter, and f) the radius of the water cylinder. Red lines indicate linear regressions of the probability density distributions.

Figure 7: Data points (including error bars) and expectation value of the intensity (with error bands) of SAXS patterns of POPE (35 °C), diC16:1PE (35 °C) and DMPE (80 °C).

Unlike for DOPE, the (2,1) peak was clearly present in the scattering data of all three lipids, a feature that helped to obtain a better agreement of the model with the experimental data (Fig. 6.3). The corresponding probability density distributions for $C_0$ clearly show that monounsaturated hydrocarbons induce a significantly more negative intrinsic curvature than saturated hydrocarbons, which is due to the kink induced by the cis double bond. The proximity of the values for POPE and DMPE is attributed to the temperature difference and the associated decrease of $C_0$ (Fig. 6.3). Assuming a temperature dependence similar to that observed for DOPE yields a $C_0$ close to zero for DMPE at 35 °C, which agrees with the well-established observation that DMPE prefers to form bilayers at ambient temperatures. Besides the difference between saturated and unsaturated hydrocarbons, our analysis also clearly shows that $C_0^{\mathrm{DOPE}} < C_0^{\mathrm{diC16:1PE}}$. That is, increasing the chain length of monounsaturated acyl chains also leads to a more negative $C_0$ value.
This signifies that the kink induced by the cis double bond leads to a progressive increase of hydrocarbon splay upon acyl chain extension. The mean values of the hydrocarbon chain length show a trend in the expected direction, i.e. they increase with the number of carbons per chain (Fig. 6.3), but all cases exhibit a broad distribution as a result of the not well-defined backbone width $d_{BB}$.

Figure 8: Intrinsic curvature (a) and hydrocarbon chain length (b) probability densities and mean values (red) for various lipids at 35 °C (except DMPE: 80 °C).

Table 3: Comparison of structural parameters of different phosphatidylethanolamines. Digits and lipid names lost in extraction are marked with "…".
Lipid | $C_0$ / Å$^{-1}$ | $d_{HC}$ / Å | $R_W$ / Å | $A$ / Å$^2$ | $V_{HC}$ / Å$^3$ | $\tilde{S}$
DMPE (a) | −… ± … | … ± … | … ± … | … ± … | … ± … | … ± …
… (b) | −… ± … | … ± … | … ± … | … ± … | … ± … | … ± …
… (b) | −… ± … | … ± … | … ± … | … ± … | … ± … | … ± …
… (b) | −… ± … | … ± … | … ± … | … ± … | … ± 10 | 1.… ± …
Footnotes: a: T = 80 °C; b: T = 35 °C.

The expectation values for $C_0$ and $d_{HC}$ for the different lipids are listed in Tab. 6.3, including the resulting structural parameters $R_W$, $A$, $V_{HC}$, and $\tilde{S}$. Previously, we reported $C_0 = -\ldots$ Å$^{-1}$ for POPE at 37 °C [7], which is in excellent agreement with our present value. Regarding the other structural parameters, we found in particular that $R_W$ and $A$ decrease as $C_0$ becomes more negative, which is mainly attributed to the geometry of the $H_{II}$ phase. The hydrocarbon chain volumes are in agreement with the chemical compositions. That is, DMPE with two C14:0 chains has the smallest and DOPE with two C18:1 chains has the largest $V_{HC}$ value, whereas the volumes of POPE and diC16:1PE take intermediate values. Our hydrocarbon volume of POPE is about 4 % lower than the value reported for POPE in the absence of tricosene at the same temperature, where it forms a fluid lamellar phase [31]. This indicates a slightly tighter hydrocarbon chain packing in fully relaxed monolayers. The shape parameter (DMPE ≃ POPE < diC16:1PE < DOPE) clearly shows that, of all lipids presently studied, DOPE has the highest propensity to form an $H_{II}$ phase, which is consistent with its low $T_H$ [38].

We have introduced a global scattering model for fully hydrated, unoriented $H_{II}$ phases. Compared to previous models for $H_I$ phases [19, 20], the $H_{II}$ phase analysis required adding diffuse scattering that does not originate from hexagonal structures. While the exact origin of this additional contribution remains unclear, we successfully modeled the measured SAXS patterns by including a lamellar form factor. The SLD of the lipid unit cell was constrained by compositional modeling using complementary information on lipid volume and structure.
This description is generic and covers the analysis of both SAXS and small-angle neutron scattering (SANS) data. In particular, a joint analysis of SAXS and differently contrasted SANS data (see, e.g., [42, 2]) might be beneficial for increased structural resolution regarding the lipid head and backbone groups.

Here, we analyzed SAXS data using Bayesian probability theory combined with MCMC simulations. This was specifically necessary due to the weakly defined global minimum of the optimization cost function. The full probabilistic approach provides the probability density distributions of the involved parameters, leading to reliable parameter estimates including errors.

The obtained estimates are in good agreement with previously reported structural data of DOPE and POPE. We further provided details of the lipid structures of DMPE and diC16:1PE in the $H_{II}$ phase, clearly demonstrating that, of all presently studied lipids, DMPE is least prone to form an $H_{II}$ phase. The developed technique can be readily transferred to other $H_{II}$ phase amphiphiles using appropriate compositional modeling. In particular, we envision a high potential for applications in drug-delivery formulations involving $H_{II}$ structures that exhibit only weak Bragg peaks but significant contributions from diffuse scattering, such as hexosomes (see, e.g., [4]). Another potential application is the determination of intrinsic lipid curvatures of lamellar-phase-forming lipids using mixtures with DOPE [7], which is particularly encouraged by the high robustness of the retrieved $C_0$ estimates. Such approaches are currently being explored in our laboratory.

We thank D. Kopp, M. Pachler and J. Kremser for technical assistance, and E. Semeraro for critical reading of the manuscript. We further thank O. Glatter for performing a trial analysis using his GIFT software package. This work was supported financially by the Austrian Science Funds FWF (grant no. P27083-B20 to G.P.).