Ubiquitous nucleosome unwrapping in the yeast genome
UUbiquitous nucleosome unwrapping in the yeastgenome
R˘azvan V. Chereji ∗ and Alexandre V. Morozov † Department of Physics and Astronomy, Rutgers University,Piscataway, New Jersey 08854-8019, USA BioMaPS Institute for Quantitative Biology, Rutgers University,Piscataway, New Jersey 08854-8019, USAMay 17, 2013
Abstract
Nucleosome core particle is a dynamic structure – DNA may tran-siently peel off the histone octamer surface due to thermal fluctuations orthe action of chromatin remodeling enzymes. Partial DNA unwrappingenables easier access of DNA-binding factors to their target sites and thusmay provide a dominant pathway for effecting rapid and robust accessto DNA packaged into chromatin. Indeed, a recent high-resolution mapof distances between neighboring nucleosome dyads in
S.cerevisiae showsthat at least 38.7% of all nucleosomes are partially unwrapped. The ex-tent of unwrapping follows a stereotypical pattern in the vicinity of genes,reminiscent of the canonical pattern of nucleosome occupancy in whichnucleosomes are depleted over promoters and well-positioned over codingregions. To explain these observations, we developed a biophysical modelwhich employs a 10-11 base pair (bp) periodic nucleosome energy profile.The profile, based on the pattern of histone-DNA contacts in nucleosomecrystal structures and the idea of linker length discretization, accountsfor both nucleosome unwrapping and higher-order chromatin structure.Our model reproduces the observed genome-wide distribution of inter-dyad distances, and accounts for patterns of nucleosome occupancy andunwrapping around coding regions. At the same time, our approach ex-plains in vitro measurements of accessibility of nucleosome-covered bind-ing sites, and of nucleosome-induced cooperativity between DNA-bindingfactors. We are able to rule out several alternative scenarios of nucleosomeunwrapping as inconsistent with the genomic data. ∗ [email protected] † [email protected] a r X i v : . [ q - b i o . GN ] M a y ukaryotic genomes are organized into arrays of nucleosomes [1]. Each nu-cleosome consists of a stretch of genomic DNA wrapped around a histone oc-tamer core [2]. The resulting complex of DNA with histones and other regula-tory and structural proteins is called chromatin [1, 3]. Arrays of nucleosomesform 10-nm fibers which resemble beads on a string and fold into higher-orderstructures [4, 5, 6]. Depending on the organism and cell type, 75-90% of ge-nomic DNA is packaged into nucleosomes [1]. Since nucleosomal DNA is tightlywrapped around the histone octamer, its accessibility to various DNA-bindingproteins such as repair enzymes, transcription factors (TFs), polymerases, andrecombinases, is suppressed. The question of how DNA-binding proteins gainaccess to their target sites in vivo efficiently and robustly is one of the outstand-ing puzzles in chromatin biology.Two different mechanisms for site exposure have been proposed: nucleosometranslocation by thermal fluctuations or by ATP-dependent chromatin remod-eling factors [7, 8, 9], and transient unwrapping of nucleosomal DNA off thehistone octamer surface [10, 11, 12, 13]. Proteins can utilize spontaneous DNAunwrapping to bind to their target sites, which would favor further destabiliza-tion of the histone-DNA complex and binding of additional proteins (Fig. 1A).Since partial unwrapping of nucleosomal DNA is energetically less costly thannucleosome translocation, it is likely to play a major role in numerous DNA-mediated processes. For example, nucleosome “breathing” governs transcriptiondynamics of RNA polymerase [14], and may provide an explanation for fast DNArepair by photolyases [15].Partial unwrapping of nucleosomal DNA and subsequent differential acces-sibility of nucleosome-covered protein-binding sites were observed in single nu-cleosomes [16, 10, 17, 11, 18, 19, 20, 12, 13], di-nucleosomes [21], and multi-nucleosome arrays [22, 23], and modeled computationally [24, 25, 26, 27, 28, 29].Recently, nucleosome dyad positions were mapped with single base-pair (bp)precision in S.cerevisiae , resulting in a genome-wide collection of distances be-tween neighboring nucleosomes [30]. Surprisingly, almost 40% of these inter-dyad distances are less than 147 bp, indicating that at least one nucleosome inthe pair is partially unwrapped. Furthermore, there are distinct 10-11 bp peri-odic oscillations in the histogram of inter-dyad distances, suggesting a stepwiseunwrapping mechanism consistent with the pattern of histone-DNA contacts inthe crystal structure [2, 31, 32].We present a rigorous statistical mechanics approach to nucleosome unwrap-ping which explains the observed genome-wide distribution of inter-dyad dis-tances [30] as well as earlier experiments which probed differential accessibilityof nucleosome-covered binding sites [10, 11] and studied nucleosome-inducedcooperativity between DNA-binding factors [16, 13]. Using this approach, wereproduce both nucleosome occupancy and inter-dyad distance profiles in thevicinity of transcription start sites (TSS). The model neglects sequence speci-ficity of histone-DNA interactions but requires potential barriers at gene promot-ers, which may be created by DNA-bound TFs and transcription pre-initiationcomplexes (PICs). Although sequence-specific nucleosome positioning is notcrucial for explaining many features of the distribution of inter-dyad distances,2ur framework allows us to predict both sequence-specific free energies of nucleo-some formation and the unwrapping potential from paired-end high-throughputsequencing datasets.
We use a high-resolution in vivo map of nucleosome dyad positions based onchemical modification of engineered histones and DNA backbone cleavage byhydroxyl radicals [30]. With cleaved DNA segments sorted by molecular masson the agarose gel and the ∼ −
200 bp fraction sequenced using paired-end reads, the map provides a direct measurement of both dyad positions anddistances between adjacent dyads. Although superior to methods based onmicrococcal nuclease (MNase) digestion whose accuracy is affected by MNasesequences preferences and its tendency to over- or under-digest DNA [33, 34, 35],the map is biased by unknown hydroxyl radical cutting preferences for twoalternate sites at each DNA strand [30].A genome-wide histogram of inter-dyad distances shows that at least 38.7% of all nucleosomes are partially unwrapped (blue line in Fig. 1C). In orderto study the energetics of unwrapping, we introduce a simple model based onthe 10-11 bp periodic pattern of histone contacts with the minor groove of thenucleosomal DNA [31, 32] (Fig. 1B). As DNA is peeled off each contact patch,its free energy increases because hydrogen bonds and favorable electrostaticcontacts between histone side chains and the DNA phosphate backbone arelost. However, once DNA breaks free from the contact patch, it may adoptmultiple conformations, which allows it to increase its entropy and thus lowerthe total free energy. The favorable entropic term grows with the extent ofunwrapping until the next contact patch is reached, completing one cycle inthe oscillatory energy profile. The oscillations are superimposed on a straightline whose slope reflects the average free energy cost per bp of histone-DNAcontact loss. Additional details of the potential construction can be found inSI Appendix, Model A. The histone-DNA potential constructed in this way hasno sequence specificity, in contrast to the free energy cost of bending DNA intoa nucleosomal superhelix [36]. To a good approximation, we expect sequenceeffects to average out in the genome-wide histogram of inter-dyad distances (thisassumption is tested later).Thus we aim to reproduce the observed distribution of inter-dyad distanceswith a model in which nucleosome energetics is sequence-independent but tran-sient unwrapping is allowed. We compute the conditional probability P ( c + d | c )of finding the nucleosome dyad at bp c + d , given that the previous dyad islocated at bp c (Materials and Methods). Since inter-dyad distances cannotbe used to distinguish between symmetric and asymmetric unwrapping, we as-sume the former for simplicity. The model is fit to the observed distributionof inter-dyad distances (SI Appendix). The free parameters of the model in-3lude the amplitude of the oscillations, the slope of the free energy profile and a min(max) , the minimum (maximum) effective length of the nucleosome particle(SI Appendix, Model A). The maximum extent of nucleosome unwrapping iscontrolled by a min , while a max is allowed to exceed 147 bp in order to accountfor the effects of higher-order chromatin structure and linker histone deposition.We also fit the relative frequency of hydroxyl radical DNA cleavage at the − The effective length of the particle found in the fit, a max = 163 bp, is greaterthan 147 bp, the length of the DNA in the nucleosome core [32]. Indeed, a max =147 bp is incompatible with the observed inter-dyad distribution (Fig. S3A,B).The model is less sensitive to the value of a min because extensively unwrappedparticles are energetically unfavorable and therefore are not frequently seen inthe data. The overall shape of the inter-dyad distribution is also sensitive to theslope of the energy profile in Fig. 1B, providing a robust estimate of the averagenucleosome unwrapping energy (Fig. S3C). The fitted value of the slope yields14 . B T for the average histone-DNA interaction energy in a fully wrappednucleosome.Thus the energy profile in Fig. 1B describes both DNA interactions withthe histone octamer core (up to 73 bp from the dyad) and the effects of higher-order chromatin structure, including, potentially, the attachment of Hho1p, theH1 linker histone of
S. cerevisiae , to the DNA immediately outside of the nu-cleosome core [37, 38, 39]. Although Hho1p is less abundant in yeast thanin higher eukaryotes, it is involved in higher-order chromatin organization, in-cluding chromatin compaction in stationary phase [40, 39]. Relatively little isknown about the molecular mechanism of H1 binding: there is no consensus yetwhether the binding is symmetric or asymmetric, or even what the extent of theH1 footprint is [37, 38]. H1 binding and other factors that mediate chromatinfolding into higher-order structures cause linker lengths to be discretized [1, 41].Linker length discretization can be described by a periodic decaying two-bodyeffective potential between neighboring nucleosomes, with the first minimumapproximately 5 bp away from the nucleosome edge [1, 42, 43].Based on these observations, we have constructed two models for the energyprofile outside of the nucleosome core region. One model is a polynomial fit thatextends the quasiperiodic profile of the unwrapping energy through anothercycle (Fig. 1B; SI Appendix, Model A). The depth and the position of thefirst minimum outside of the nucleosome core are additional free parameters.As can be seen in Fig. S4B, our fit robustly predicts the first minimum to bepositioned 5-6 bp outside of the nucleosome core, in agreement with previous4 .5 1.5 2.5 3.5 4.5 5.5 6.5
Dyad − edge distance (bp)
Dyad u ha l f ( k B T ) − − − − Inter−dyad distance (bp) P r obab ili t y
50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.764
DataModelDataModel (unwrapping)Model (no unwrapping)
RMS: 9.996 ×10 -4 × BACD
Figure 1:
Genome-wide distribution of inter-dyad distances. (A) A nu-cleosome is a dynamic structure. Transient nucleosome unwrapping followedby factor binding prevents subsequent rewrapping, mediating further bindingevents. (B) Nucleosome energy profile. The energy of a nucleosome that covers2 x + 1 bps is given by u SInuc = 2 u half ( x ) with symmetric unwrapping. The min-ima and maxima of the energy landscape are based on the crystal structure ofthe nucleosome core particle [31, 32]. Dark gray bars show where the histonebinding motifs interact with the DNA minor groove in the structure. Lightgray bars show where the DNA major groove faces the histones. The energyprofile was obtained by a polynomial fit as described in SI Appendix, ModelA. (C) The inter-dyad distance distribution from a high-resolution nucleosomemap [30] (blue), and from the model with (red) and without unwrapping (gray).In the model without unwrapping, a min = a max = 147 bp and the fitting param-eters are E b , µ and f (SI Appendix, Model A). RMS - total root-mean-squaredeviation between the model and the data. (D) Oscillations in the observed(blue) and predicted (red) inter-dyad distance distributions, obtained by sub-tracting a smooth background from the data and the model with unwrapping in(C). The smooth background was found by applying a Savitzky-Golay filter ofpolynomial order 3 with 31 bp length (using the sgolayfilt function from theSignal Processing Toolbox of MATLAB). Correlation refers to r osc , the linearcorrelation coefficient between measured and predicted oscillations.5tudies [1, 42, 43]. The depth of this minimum is comparable to the depth ofthe unwrapping minima (Fig. S4, SI Appendix, Model A).The other model represents the energy profile outside of the nucleosome coreby a linear function (Fig. S5A; SI Appendix, Model B). The two free parametersare the slope and the length of the line, which are related to the H1-DNAinteraction energy and the H1 footprint. This model is motivated by a picture ofthe H1 histone being gradually detached from its DNA site immediately outsideof the nucleosome core. This alternative scenario, although likely oversimplified,can be used to check the sensitivity of our results toward a particular energyprofile outside of the core region. We find that the linear profile fits the overallshape of the inter-dyad distribution somewhat less well than the oscillatory one(compare RMS values in Figs. 1B and S5B), although the 10-11 bp periodicfine structure is reproduced in both cases (Figs. 1C, S5C). The optimal linearprofile is 7 bp long, yielding a symmetric H1 footprint with two 7 bp half-sites(Fig. S5D) and the H1-DNA interaction energy of ≈ B T (Fig. S5E).
Next we have tested the sensitivity of our fits to the functional form of the un-wrapping free energy profile. Although our primary model follows nucleosomecrystal structures in creating a quasi-periodic energy profile with both 10 and11 bp modes, strictly periodic 10 or 11 bp sinusoidal profiles yield nearly thesame quality of fit (Fig. S6; SI Appendix, Models C,D). Since the initial phaseof the oscillations is not determined by the crystal structure anymore, it be-comes another fitting parameter. The fitted initial phases in the 10 and 11-bpmodels make the periodic curves match the crystal structure further away fromthe dyad, where most of the observed unwrapping takes place (Fig. S6A). Thephases diverge closer to the dyad, where they are not as strongly constrainedby the data. RMS deviation is less sensitive to the initial phase than r osc , thelinear correlation between predicted and observed oscillations in the inter-dyadhistograms (Fig. S7). The primary peak in the dependence of r osc on the initialphase matches the crystal structure. There is also a secondary peak correspond-ing to the 5 bp shift in the unwrapping energy profile, which in turn leads to the10 bp, in-phase shift in the distribution of inter-dyad oscillations (Fig. S7B).Since the inter-dyad distance distribution has a distinct oscillatory compo-nent, it is not surprising that a purely linear model of unwrapping energy doesnot fit the data as well, although it does match its overall shape (Fig. S8A;SI Appendix, Model E). Less trivially, it was suggested on the basis of single-nucleosome unzipping experiments that nucleosome unwrapping proceeds with5-bp periodicity because histones interact with each DNA strand separatelywhere the DNA minor groove faces the histone octamer surface, creating twodistinct contact “subpatches” [44]. This single-molecule data was fit to a modelwith a step-wise unwrapping free energy profile [45]. Each step in the profilecorresponds to breaking a point histone-DNA contact, and the steps occur ev-ery 5 .
25 bp on average. We do not find any evidence for 5 bp periodicity ofnucleosome unwrapping in the genomic data. Indeed, both 5 bp step-wise and6 bp periodic sinusoidal profiles fit the data poorly, about as well as the linearmodel (Fig. S8B,C). Even the 10-bp step-wise unwrapping profile, while clearlyhaving the right periodicity, does not fit the data as well as the structure-basedmodel (Fig. S8D). This observation suggests that the picture of gradual loss offavorable finite-range histone-DNA interactions followed by gain in DNA con-formational entropy is closer to reality than abrupt disruption of short-rangehistone-DNA contacts. A direct comparison of single-molecule and genome-wide energy profiles is unfortunately obscured by the fact that the reportedsingle-nucleosome unzipping experiments are specific to the 601 nucleosome-forming sequence [46], in contrast to our methodology which provides the aver-age, sequence-independent picture of unwrapping energetics.
Fig. 2A, in which genes are sorted by the promoter length and aligned by theTSS, shows a canonical picture of nucleosomes depleted in promoters and well-positioned over coding regions [50, 51]. Interestingly, promoter nucleosomeshave shorter inter-dyad distances and are therefore more unwrapped (Fig. 2B).When averaged over all genes, the number of dyads at a given bp and theaverage inter-dyad distance at that bp are strongly correlated (compare blueand red lines in Fig. 2C). The profile of average inter-dyad distances is alsocorrelated with the distribution of DNA fragment lengths in an MNase assaywhich mapped both nucleosomes and subnucleosome-size particles by paired-end sequencing (Fig. S9A, green line in Fig. 2C) [47]. The two profiles do notcoincide completely because inter-dyad distances also depend on the distributionof linker lengths. The observed behavior is opposite of the naive expectationthat unwrapping increases with occupancy due to nucleosome crowding, but canbe reproduced in a simple sequence-independent model in which nucleosomesphase off a potential barrier placed in the promoter region (Fig. S10) [52].Partially unwrapped nucleosomes tend to have elevated histone turnoverrates [49], both around TSS and genome-wide (Figs. 2C,S9B,S9D). We find thatnucleosomes at loci enriched in PICs [48] are also more unwrapped (Figs. 2C,S9C).Finally, inter-dyad distances tend to increase with the fraction of A/T nu-cleotides, indicating that nucleosomes occupying A/T-rich sequences have longerfootprints genome-wide (Fig. S9E). We note that it is misleading to equateinter-nucleosome distances with peak-to-peak distances in the average profileof nucleosome dyad counts (blue line in Fig. 2C). The peak-to-peak distancesare 164-165 bp, while the average inter-dyad distance for the nucleosomes inthe [TSS,TSS+1000] region is 149.6 bp. Thus nucleosome unwrapping is muchmore common than could be predicted by mapping single-nucleosome positionsalone. 7
165 bp 164 bp 164 bp 164 bp 165 bp A v e r age nu m be r o f d y ad s (cid:132)(cid:18)(cid:17)(cid:17)(cid:17) (cid:132)(cid:22)(cid:17)(cid:17) TSS +500 +1000145146147148149150151152153 A v e r age d i s t an c e t o t he ne x t d y ad ( bp ) Distance (bp) A v e r age f r ag m en t s i z e ( bp ) –2 –1 +1 +2 +3 +4 +5 +6 Position (bp)Nucleosome dyad counts (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)
TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:17)(cid:19) (cid:18)(cid:17)(cid:18)(cid:19)
Position (bp) (cid:34)(cid:87)(cid:70)(cid:83)(cid:66)(cid:72)(cid:70)(cid:1)(cid:74)(cid:79)(cid:85)(cid:70)(cid:83)(cid:132)(cid:69)(cid:90)(cid:66)(cid:69)(cid:1)(cid:69)(cid:74)(cid:84)(cid:85)(cid:66)(cid:79)(cid:68)(cid:70) (cid:132) (cid:132) (cid:132)
BAC
Figure 2:
Nucleosome unwrapping in the vicinity of transcription startsites. (A) Distribution of nucleosome dyad counts [30] near the TSS. 4763 ver-ified
S.cerevisiae open reading frames (ORFs) were aligned by their TSS andsorted by promoter lengths. Each horizontal line corresponds to one ORF. (B)Distribution of the average distance between neighboring dyads. For each bp,the distances between a dyad at that bp and all neighboring dyads were aver-aged. ORFs are sorted as in (A). In (A) and (B), values at bps without dyadswere obtained by interpolation, and heatmaps were smoothed using a 2D Gaus-sian kernel with σ = 3 pixels. (C) Data in (A), (B) and Fig. S9A-C averagedover all genes. Blue: nucleosome dyad counts, red: average distance betweenneighboring dyads, green: average length of DNA-bound particles mapped byMNase digestion [47] (see Fig. S9A for details). Curve with light gray back-ground: combined occupancy of 9 PICs (TBP, TFIIA, TFIIB, TFIID, TFIIE,TFIIF, TFIIH, TFIIK, PolII) [48], curve with light pink background: aver-age histone turnover rate [49]. The peaks in the dyad count profile (blue) aremarked with orange ovals representing nucleosomes, and peak-to-peak distancesare shown. 8
25 50 75 100 125 150 -6 -5 -4 -3 -2 -1 Probed Site B s t U I T aq α I B sr I B s t U I T aq α I B s a J I B s t N I T aq α I B s a H I UnwrappingNo unwrapping A p open p open P s t I H i nd III H a e III A v a II M s p I B a m H I H ha I M s e I S t y I B f a I P m l I -6 -5 -4 -3 UnwrappingNo unwrapping
Position (bp)B
Figure 3:
Probability of binding site exposure within a nucleosome.
The solid blue and dashed green lines represent model predictions with andwithout unwrapping, respectively. In the latter case, a min = a max = 147 bp andall other parameters are adopted from the model with unwrapping. The dyad isfixed at bp 74. (A) Restriction enzyme sites inserted into the 5S rRNA sequenceat locations indicated by the centers of vertical red bars [10]. (B) Restrictionenzyme sites inserted into the 601 sequence at locations indicated by the centersof the vertical red bars in the middle of each group [11]. Each group of threebars corresponds to independent measurements in which the 601 sequence wasflanked by different DNA sequences. In (A) and (B), the height of each bar isthe equilibrium constant for site exposure averaged over multiple experiments(error bars show standard deviation). Partial unwrapping of nucleosomal DNA results in differential accessibility offactor binding sites with respect to their position inside the nucleosome: sites onthe edges are more accessible than those closer to the dyad. In contrast, all-or-none nucleosome formation should not be sensitive to the binding site position– a nucleosome, once unfolded, liberates its entire site. Polach and Widom [10]studied differential accessibility of 6 restriction enzymes to their target sites.The sites were placed at various positions throughout the 5S rRNA nucleosomalsequence (Fig. 3A). A later study used the 601 sequence and an extended setof 11 restriction enzymes (Fig. 3B) [11]. These studies measured equilibriumconstants for site exposure K confeq , which are related to the probability for a siteto be accessible for binding: p open = K confeq / (1 + K confeq ) ≈ K confeq [29].We use our crystal structure-based unwrapping model (Fig. 1B; SI Appendix,Model A) to fit the data on site accessibility [10, 11]. Here the system consistsof a single nucleosome and asymmetric unwrapping is allowed. We assume thata site becomes accessible for the enzyme only after an additional number ofbps, d , have been unwrapped from the histone octamer surface [29]. We alsoassume that once the dyad is unwrapped from either end, the entire nucleosome9s unfolded. Then the probability for a binding site to be accessible is given by p open ( x ) = − Occ nuc ( x + d ) for x < x d − d − Occ nuc ( x d ) for x d − d ≤ x ≤ x d + d − Occ nuc ( x − d ) for x > x d + d, where x ∈ [1 , x d = 74 bp is the position of the dyad, and the nucleosomeoccupancy is given by Eq. (3).Besides d , the fitting parameters of the model are the overall slope of thebinding energy (cid:15) and the histone chemical potential µ (all other parametersare as in SI Appendix, Model A, with the exception of a min = 1 bp, a max =147 bp). For the 5S rRNA measurements [10], we obtain (cid:15) = − .
13 k B T/bp, µ = − . B T, and d = 23 bp. For the 601 measurements [11], (cid:15) = − .
16 k B T/bp, µ = − . B T, and d = 45 bp. As expected, the nucleo-some formation energy of the 601 sequence is 147 × ( (cid:15) − (cid:15) ) = 4 . B T morefavorable than that of the 5S sequence, in agreement with the experimentallymeasured difference of 4.9 k B T [53]. The nucleosome formation energy of the601 sequence is 24 . B T, close to the 23 . B T estimate made on the basis of601 unzipping experiments [45]. Interestingly, the 601 DNA has to unwrap moreextensively past the binding site to allow access to restriction enzymes.Overall, our model reproduces the observed differential accessibility of re-striction enzyme binding sites with respect to the nucleosome dyad (Fig. 3).The only outliers are
Sty
I and
Bfa
I binding sites in the 601 series which werenot used in the fit and which, if unwrapping proceeds from the ends, cannot bemore open than the
Pml
I site located further away from the dyad.
If multiple biding sites reside within a single nucleosome, binding of one factormakes the other sites more accessible, in a phenomenon known as nucleosome-induced cooperativity [16, 18, 25]. The cooperativity disappears in the absenceof nucleosomes and reduces in extent with the distance between consecutivesites [16]. Moreover, the cooperativity is not observed if the two sites are on theopposite sides of the nucleosome dyad [13].We can use our model of nucleosome unwrapping (SI Appendix, Model Awith a min = 1 bp, a max = 147 bp) to capture all these aspects of nucleosome-induced cooperativity (Fig. 4). Specifically, for sites located more than 40 bpaway from the dyad site accessibility is strongly enhanced if DNA unwrappingis allowed (Fig. 4A). Interestingly, cooperativity between two TFs bound on thesame side of the dyad is observed both with and without unwrapping (Fig. 4B).However, without unwrapping it is impossible to show that binding on theopposite sides of the dyad is not cooperative, as observed in experiments [13](Fig. 4C). Furthermore, the decrease of cooperativity with distance [16] cannotbe reproduced (Fig. 4D). Thus modeling transient nucleosome unwrapping isnecessary for understanding how TFs and other DNA-binding proteins gainaccess to their nucleosome-covered sites.10 .7 Sequence-dependent nucleosome positioning and un-wrapping Here we test the assumption that sequence specificity of nucleosome formationmay be neglected when inferring nucleosome unwrapping potential from genomicdata. In general, the total free energy u nuc ( k, l ) of a nucleosome occupying bps k, . . . , l is a sum of a sequence-independent term u SInuc ( k, l ) describing contactsbetween histone side chains and the DNA phosphate backbone (Fig. 1B), anda sequence-dependent term u SDnuc ( k, l ) describing DNA bending into the nucle-osomal superhelix [36]. Our approach (Eq. (4), SI Appendix) can be used toinfer both contributions from high-throughput maps of nucleosome positions.Since the number of different unwrapped species may be as high as severalthousand depending on the maximum extent of unwrapping and the assump-tion of unwrapping symmetry, we expect robust inference to require levels ofread coverage that are not currently available, especially in higher eukaryotes.Furthermore, resolution of MNase-based nucleosome maps is insufficient for cap-turing the 10-11 bp periodicity of the unwrapping potential. In the absence ofhigh-resolution, high-coverage experimental data, we have tested our ability topredict nucleosome unwrapping energetics using a realistic model system.Specifically, we assumed that u SDnuc depends only on the number of mono-and dinucleotides in the nucleosomal DNA [54] (SI Appendix). The length ofnucleosomal DNA ranges from a min = 74 bp to a max = 147 bp; all other param-eters of u SInuc are as in SI Appendix, Model A. Using Eq. (1) with µ = −
13 k B T,we computed the exact nucleosome distribution n nuc1 ( k, l ) for the S.cerevisiae chromosome I. We sampled paired-end nucleosomal reads [ k, l ] from n nuc1 ( k, l )until a desired level of read coverage (mean number of reads starting at a bp)was reached. From this finite sample, we constructed a histogram of nucleosomelengths P (∆) and used it to optimize the parameters of u SInuc (in each step ofthe fitting procedure, u SInuc alone was used to predict n nuc1 , which in turn gave P (∆)). Next, we used Eq. (4) to predict u nuc ( i, j ), subtracted u SInuc from it (as-suming that the dyad is at the mid-point of each particle), and fit u SDnuc to therest. Finally, u SInuc + u SDnuc were used to compute ˜ n nuc1 , which was then comparedwith the exact profile n nuc1 .We find that we are able to reproduce the unwrapping potential even if nu-cleosome formation is sequence-specific (Fig. S11A). However, the overall slopeof the potential is overpredicted because the histogram of particle lengths is af-fected by well-positioned nucleosomes which have negative sequence-dependentenergies. The average of these energies biases the slope. Nucleosome occupancypredicted using u SInuc + u SDnuc yields a reasonable correlation with the exact pro-file even at 1 read per bp, although at least 10 reads per bp are required toreproduce dyad positions (Fig. S11B).11
20 40 60 80 100 120 14000.20.40.60.81
Position (bp) O cc upan cy Nucleosome (unwrapping)TF (unwrapping)Nucleosome (no unwrapping)TF (no unwrapping)00.20.40.60.81
Distance from the dyad (bp) TF b i nd i ng p r obab ili t y UnwrappingNo unwrapping0 20 40 60 (cid:132)(cid:19)(cid:23) (cid:132)(cid:19)(cid:21) (cid:132)(cid:19)(cid:19) (cid:132)(cid:19)(cid:17) (cid:132)(cid:18)(cid:25) (cid:132)(cid:18)(cid:23) (cid:132)(cid:18)(cid:21)(cid:17)(cid:17)(cid:15)(cid:19)(cid:17)(cid:15)(cid:21)(cid:17)(cid:15)(cid:23)(cid:17)(cid:15)(cid:25)(cid:18)
TF chemical potential (k B T) TF b i nd i ng p r obab ili t y (cid:54)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:18)(cid:1)(cid:53)(cid:39)(cid:54)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:19)(cid:1)(cid:53)(cid:39)(cid:84)(cid:47)(cid:80)(cid:1)(cid:86)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:18)(cid:1)(cid:53)(cid:39)(cid:47)(cid:80)(cid:1)(cid:86)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:19)(cid:1)(cid:53)(cid:39)(cid:84) Unwrapping. 1 TFUnwrapping. 2 TFsNo unwrapping. 1 TFNo unwrapping. 2 TFs (cid:132)(cid:19)(cid:23) (cid:132)(cid:19)(cid:21) (cid:132)(cid:19)(cid:19) (cid:132)(cid:19)(cid:17) (cid:132)(cid:18)(cid:25) (cid:132)(cid:18)(cid:23) (cid:132)(cid:18)(cid:21)(cid:17)(cid:17)(cid:15)(cid:19)(cid:17)(cid:15)(cid:21)(cid:17)(cid:15)(cid:23)(cid:17)(cid:15)(cid:25) TF chemical potential (k B T) TF b i nd i ng p r obab ili t y Distance (bp) TF b i nd i ng p r obab ili t y Unwrapping. 2 TFsNo unwrapping. 2 TFs D A BC D
Figure 4:
Modes of nucleosome-induced cooperativity. (A) TF and nucle-osome occupancy with and without unwrapping. The TF binding site occupiesbps 11–20. Inset: TF binding probability as a function of the distance betweenthe nucleosome dyad and the proximal edge of the TF site, with and withoutunwrapping. (B) TF titration curves for one TF site vs. two TF sites locatedon the same side of the dyad. Site 1 occupies bps 11–20, site 2 occupies bps31–40. Inset: Binding site locations. (C) Same as (B), but with the two TFsites located on the opposite sides of the dyad. Site 1 occupies bps 11–20, site2 occupies bps 117–126. Inset: Binding site locations. (D) Nucleosome-inducedcooperativity as a function of the distance between two TF binding sites. Thebinding probability of the second TF is shown. Site 1 occupies bps 11-20, whilethe position of the second site is variable. Inset: Definition of the distancebetween the two binding sites. In all panels, free energy of a fully wrappednucleosome is − log(10 ) k B T; histone chemical potential is log(10 − ) k B T; TFbinding energy is − log(10 ) k B T to cognate sites, and − log(10 ) k B T to allother sites; TF chemical potential is log(10 − ) k B T unless varied. Asymmetricunwrapping is allowed; in the model without unwrapping, a min = a max = 147 bpand all other parameters are the same as in the model with unwrapping. Since DNA wrapped in nucleosomes is sterically occluded, nucleosome formationprevents access of DNA-binding factors such as TFs or polymerase subunits totheir cognate sites. Transient DNA unwrapping off the histone octamer surface,extensively studied in model systems [16, 10, 17, 11, 18, 19, 20, 12, 13, 21, 22, 23],can facilitate such binding events but its importance on the genomic scale hasbeen unclear. MNase-based maps of nucleosome positions employing paired-end sequencing have identified numerous subnucleosome-size particles [47, 55].12owever, it is impossible to deduce from these experiments which particlescorrespond to unwrapped nucleosomes. In addition, there is a concern thatDNA fragments may have been under- or overdigested by MNase.These challenges have been overcome in a recent experiment in which chemi-cally modified histones are used to map nucleosome dyads with single-bp resolu-tion [30]. Used in conjunction with paired-end sequencing, the experiment pro-vides information about both dyad positions and the distances between neigh-boring dyads. The latter is especially important since it probes the two-particledistribution in the same cell. The histogram of inter-dyad distances shows thatat least 38.7% of all
S.cerevisiae nucleosomes are partially unwrapped. Genepromoters are enriched in such nucleosomes, which also have higher histoneturnover rates. Over coding regions, the distribution of wrapped DNA lengthsoscillates in phase with the nucleosome occupancy profile, so that more stablenucleosomes are also more wrapped.We have developed a statistical mechanics description of nucleosome arrayswhich allows DNA unwrapping while rigorously treating steric exclusion andsequence specificity of nucleosome formation. We have shown that prominent10-11 bp periodicity in the distribution of inter-dyad distances [30] can be ex-plained using an unwrapping energy profile based on the pattern of histone-DNAcontacts in the nucleosome crystal structures. We were able to rule out alterna-tive scenarios of step-wise unwrapping and 5 bp periodicity of the unwrappingprocess. Furthermore, fitting the distribution of inter-dyad lengths required ac-counting for linker length discretization, commonly thought to be imposed bychromatin fiber formation [41].Our model yields estimates of nucleosome unwrapping energies consistentwith previous biophysical experiments, and accounts for single-nucleosome ob-servations which show that nucleosome-covered binding sites closer to the edgeof the nucleosome are more easily accessible to DNA-binding factors and thatbinding of the first factor enhances binding of subsequent factors on the sameside of the dyad. Finally, our approach reproduces the periodic distribution ofwrapped DNA lengths in the vicinity of TSS if potential barriers are placedin the promoters. The barriers may be created in vivo by PICs [48], otherDNA-bound factors, and chromatin remodelers. The extent of nucleosome un-wrapping in the yeast genome suggests that its treatment should be included inall future models of nucleosome positioning and chromatin energetics.
Here we outline the exact theory for T molecular species simultaneously inter-acting with one-dimensional DNA (details and extensions are provided in SIAppendix). The DNA-bound particles are subject to steric exclusion and mayalso be partially unwrapped. Let u s ( k, l ) ( s ∈ { , , . . . , T } ) denote the bindingenergy of a particle of type s attached to bps k, . . . , l . One can show that the13ne-body distribution of particles of type t is given by n t ( k, l ) = 1 Z Z − ( k ) (cid:104) t, k | z | t, l (cid:105) Z + ( l ) , (1)where Z is the grand canonical partition function, (cid:104) t, k | z | s, l (cid:105) = e β [ µ s − u s ( k,l )] δ t,s , β = 1 /k B T is the inverse temperature, µ s is the chemical potential of particlesof type s , and δ t,s is the Kronecker symbol. If the DNA length is L , vectors | t, k (cid:105) are T L -dimensional with a 1 at position ( t − L + k and 0 everywhereelse. Similarly, the nearest-neighbor two-body distribution function is given by n t,s ( i, j ; k, l ) = 1 Z Z − ( i ) (cid:104) t, i | z | t, j (cid:105) Θ( k − j ) (cid:104) s, k | z | s, l (cid:105) Z + ( l ) , (2)where Θ is the Heaviside step function, and Z − ( k ) and Z + ( k ) are partial par-tition functions for the DNA segments [1,k) and (k,L], respectively. Z − ( k ), Z + ( k ), and Z can be computed recursively, as shown in SI Appendix. UsingEq. (1), one can find particle occupancy:Occ t ( i ) = i (cid:88) k = i − a t max +1 k + a t max − (cid:88) l =max( i,k + a t min − n t ( k, l ) , (3)where a t min ( a t max ) is the minimum (maximum) length of a particle of type t .The inverse problem of predicting DNA binding energies from one-particledistributions can also be solved: β [ u t ( i, j ) − µ t ] = − ln (cid:20) n t ( i, j ) ZZ − ( i ) Z + ( j ) (cid:21) , (4)where Z, Z − , Z + are again found recursively using only one-particle distribu-tions n t ( i, j ) as input (SI Appendix). Eq. (4) enables us to disentangle intrinsicsequence preferences and unwrapping energetics from steric effects. To predict the distribution of inter-dyad lengths, we compute the conditionalprobability of having a nucleosome with the dyad at bp c + d , given that theadjacent upstream nucleosome has the dyad at bp c : P ( c + d | c ) = N ( c, c + d ) N ( c ) , (5)where the probability distributions of the nucleosome centers can be computedusing Eqs. (1) and (2) for a single particle type: N ( c ) = (cid:88) ∆ n nuc1 ( c − ∆ , c + ∆ ) ,N ( c, c + d ) = (cid:88) ∆ , ∆ n nuc , nuc2 ( c − ∆ , c + ∆ ; c + d − ∆ , c + d + ∆ ) . , +1 are lengths of the particles centered at bp c and c + d , respectively.To estimate P ( c + d | c ), we use c = 5 kbp and a box of length L = 10 kbp, sothat the boundaries of the box are far away. We ignore nucleosome sequencespecificity, convolve P ( c + d | c ) with a kernel to account for site-specific chemicalcleavage bias (SI Appendix), and fit model parameters to reproduce the observeddistribution of inter-dyad distances. Acknowledgments
We thank Leonid Mirny for insightful discussions. This research was supportedby National Institutes of Health (R01 HG004708 to A.V.M.). A.V.M. is anAlfred P. Sloan Research Fellow. 15 upplementary Information
We consider a problem of mutually interacting particles (one-dimensional rods)that can be reversibly adsorbed to a one-dimensional lattice of L sites [here,DNA base pairs (bps)]. In order to model partial unwrapping of nucleosomalDNA off the surface of histone octamers, we allow the particles to cover avariable number of bps between a min and a max . We assume that the particlescannot overlap while they are attached to the lattice. This is implementedusing hard-core interactions between adjacent particles. There are also hardwalls at the ends of the lattice so that particles are prevented from running offit. In addition, we allow generic two-body interactions between nearest-neighborparticles.The attachment of a particle to the DNA modifies the total energy of the sys-tem in a sequence-specific manner. Physically, the binding energy may have con-tributions from DNA bending, electrostatic interactions, hydrogen bond forma-tion, van der Waals contacts, etc. Thus a particle which covers bps k, k +1 , . . . , l has a total one-body binding energy u ( k, l ). Note that for pairs of coordinates( k, l ) such that l − k + 1 > a max or l − k + 1 < a min , u ( k, l ) = ∞ , because all par-ticles must have the length between a min and a max bps. The theory presentedbelow is valid for arbitrary binding energies u ( k, l ).Let Φ( j, k ) be the two-body interaction between a pair of nearest-neighborparticles which cover base pairs . . . , j − , j and k, k +1 , . . . ( k > j ) (Fig. S1). Inthe case of nucleosomes, such interactions may be used to account for the effectsof higher-order chromatin structure [52]. Although we do not focus on two-bodyinteractions in this work, they are included below for the sake of generality. Weimpose Φ( j, k ) = (cid:26) ∞ if k ≤ j,V ( k − j −
1) if k > j, where V ( d ) is an arbitrary interaction potential which depends only on thelinear distance d between two neighboring particles.For a fixed number of particles attached to the DNA, N , the canonicalpartition function is Q N = (cid:88) { i n =1 ...L } n ∈{ ,..., N } e − βu ( i ,i ) e − β Φ( i ,i ) e − βu ( i ,i ) . . . × e − βu ( i N − ,i N − ) e − β Φ( i N − ,i N − ) e − βu ( i N − ,i N ) , (S1)where β = 1 /k B T is the inverse temperature ( k B is the Boltzmann constant).Note that with our definitions of one-body energies, two-body interactions and16 (j, k)u(i, j) u(k, l)i j k l Figure S1: Schematic illustration of one-body and two-body potentials in amulti-nucleosome system. Nucleosomes may be partially unwrapped, resultingin variable DNA footprints.hard-wall boundary conditions, only legitimate configurations of non-overlappingparticles will contribute to Eq. (S1).In order to simplify the notation, we introduce two L × L matrices: (cid:104) k | e | l (cid:105) = e − βu ( k,l ) , (cid:104) k | w | l (cid:105) = e − β Φ( k,l ) . Here (cid:104) k | M | l (cid:105) represents the element of matrix M in row k and column l . | l (cid:105) is a column vector of dimension L with 1 at position l and 0 everywhere else,and (cid:104) k | is a row vector with 1 at position k and 0 otherwise.Let | J (cid:105) be a vector of dimension L with 1 at every position. Eq. (S1) gives Q N = (cid:40) (cid:104) J | ( ew ) N − e | J (cid:105) if N ≥ N = 0.If the particles are allowed to attach and detach from the lattice, the systemhas a variable number of particles, and the grand-canonical partition functionis Z = N max (cid:88) N =0 e βNµ Q N = 1 + N max (cid:88) N =1 (cid:104) J | ( zw ) N − z | J (cid:105) = 1 + ∞ (cid:88) M =0 (cid:104) J | ( zw ) M z | J (cid:105) = 1 + (cid:104) J | ( I − zw ) − z | J (cid:105) , (S2)where µ is the chemical potential, N max is the maximum number of particlesthat can fit on L bp, I is the identity matrix, and (cid:104) k | z | l (cid:105) = e β [ µ − u ( k,l )] ≡ ζ ( k, l ).17ote that all particle configurations with N > N max do not contribute to Z ,allowing us to extend the upper limit from N max to ∞ .From the partition function, we can compute s -particle distribution functions(see the chapter by Stell in [56]): n ( k, l ) = ζ ( k, l ) Z δZδζ ( k, l ) ,n ( i, j ; k, l ) = ζ ( i, j ) ζ ( k, l ) Z δ Zδζ ( i, j ) δζ ( k, l ) , and in general n s ( i L , i R ; . . . ; i sL , i sR ) = ζ ( i L , i R ) . . . ζ ( i sL , i sR ) Z δ s Zδζ ( i L , i R ) . . . δζ ( i sL , i sR ) . Using these relations, we obtain the one-particle distribution n ( k, l ) = 1 Z (cid:104) J | ( I − zw ) − | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) , (S3)and the two-particle distribution n ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | i (cid:105)(cid:104) i | z | j (cid:105)(cid:104) j | w ( I − zw ) − | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) . (S4)In particular, the nearest-neighbor two-particle distribution is given by n ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | i (cid:105)(cid:104) i | z | j (cid:105)(cid:104) j | w | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) . Eqs. (S3) and (S4) allow an obvious interpretation. To find the probabil-ity of starting a particle covering positions from k to l [Eq. (S3)], we have toadd statistical weights of all the configurations that contain a particle at thatposition, and divide the resulting sum by the partition function. Similarly, inorder to find the probability of a pair of particles, one covering positions i to j and the other covering positions k to l [Eq. (S4)], we need to sum statisticalweights of all the configurations containing that pair of particles, and divide bythe partition function.With the one-particle distribution, we can define the occupancy of a bp i asthe probability of finding that bp in contact with any particle. In other words,we need to sum the probabilities of all configurations in which particles coverbp i : Occ( i ) = i (cid:88) k = i − a max +1 k + a max − (cid:88) l =max( i,k + a min ) n ( k, l ) . Note that 1 − Occ( i ) is the probability that bp i is not covered by any particles.18 (t,j; s,k) u t (i, j) u s (k, l)i j k l Figure S2: Schematic illustration of one-body and two-body potentials in asystem with multiple particle types. The model allows all particles to be inmultiple stages of unwrapping. In practice, we allow nucleosomes to be par-tially unwrapped but the transcription factors (TFs) always have fixed DNAfootprints.
The above formalism can be easily extended to the case in which T types ofparticles can attach to the one-dimensional lattice. Let the binding energy of aparticle of type t ∈ { , . . . , T } that covers bps i to j on the lattice be u t ( i, j ).The interaction between a particle of type t ending at position k , and the nextparticle of type s starting at position l will be denoted by Φ( t, k ; s, l ) (Fig. S2).Each particle of type t , when attached to the DNA, is in contact with a numberof bps between a t min and a t max . Thus u t ( i, j ) = 0 if i and j do not satisfy theconstraints a t min ≤ j − i + 1 ≤ a t max . Also, Φ( t, i ; s, j ) = ∞ for j ≤ i sincethe particles cannot overlap. With this notation, the grand-canonical partitionfunction becomes Z = (cid:88) all states e − β [ u t ( i L ,i R ) − µ t ] e − β Φ( t ,i R ; t ,i L ) e − β [ u t ( i L ,i R ) − µ t ] . . . , where µ t is the chemical potential of the particles of type t . The sum is over allconfigurations, which can have variable numbers of particles of any type.Using the matrix notation, (cid:104) t, k | z | s, l (cid:105) = e − β [ u t ( k,l ) − µ t ] δ t,s ≡ ζ t ( k, l ) δ t,s , (cid:104) t, k | w | s, l (cid:105) = e − β Φ( t,k ; s,l ) , where δ t,s is the Kronecker delta symbol, the partition function can be writtenas Z = (cid:88) all states (cid:104) t , i L | z | t , i R (cid:105)(cid:104) t , i R | w | t , i L (cid:105)(cid:104) t , i L | z | t , i R (cid:105) . . . | t, i (cid:105) has dimension T L and only one non-zero element, set to 1for normalization. For example, | , i (cid:105) vectors have a 1 at position i , | , i (cid:105) vectorshave a 1 at position L + i , etc. As above, we denote by | J (cid:105) a vector in whichall T L elements are equal to 1, to obtain Z = 1 + (cid:104) J | ( I − zw ) − z | J (cid:105) , equivalent to Eq. (S2).Similarly to the previous case of a single particle type, we can compute theone-particle density n t ( k, l ) = ζ t ( k, l ) Z δZδζ t ( k, l )= 1 Z (cid:104) J | ( I − zw ) − | t, k (cid:105)(cid:104) t, k | z | t, l (cid:105)(cid:104) t, l | ( I − wz ) − | J (cid:105) . (S5)We can also obtain the two-particle density n t,s ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | t, i (cid:105)(cid:104) t, i | z | t, j (cid:105)× (cid:104) t, j | w ( I − zw ) − | s, k (cid:105)(cid:104) s, k | z | s, l (cid:105)(cid:104) s, l | ( I − wz ) − | J (cid:105) (S6)and the nearest-neighbor two-particle density n t,s ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | t, i (cid:105)(cid:104) t, i | z | t, j (cid:105)× (cid:104) t, j | w | s, k (cid:105)(cid:104) s, k | z | s, l (cid:105)(cid:104) s, l | ( I − wz ) − | J (cid:105) . (S7)Eqs. (S6) and (S7) give the joint probability that a particle of type t covers bps i to j , while a second particle of type s covers bps k to l .Using Eq. (S5) we can compute occupancy for each type of particles t andfor each bp i : Occ t ( i ) = i (cid:88) k = i − a t max +1 k + a t max − (cid:88) l =max( i,k + a t min ) n t ( k, l ) . In the following sections we will focus on the one-particle density function.
A straightforward application of Eqs. (S5) and (S6) entails computationallyintensive matrix manipulations. Fortunately, for particles that interact onlythrough hard-core repulsion rather than long-range two-body interactions, theone-particle distribution can be computed recursively and therefore much moreefficiently. 20 .2.1 General case
With multiple particle types, Eq. (S5) can be rewritten as n t ( i, j ) = 1 Z Z − ( i ) (cid:104) t, i | z | t, j (cid:105) Z + ( j ) , where Z − ( i ) and Z + ( j ) are the partition functions for the domains [1 , i ) and( j, L ], respectively. Note that in the case of hard-core interactions alone, Z − ( i )and Z + ( j ) do not depend on the type of the particle occupying positions i through j .In the case of steric exclusion alone, these partial partition functions satisfythe following recursion relations: Z − ( i ) = Z − ( i −
1) + (cid:88) s (cid:88) i − a s max ≤ j ≤ i − a s min Z − ( j ) (cid:104) s, j | z | s, i − (cid:105) , (S8)and Z + ( i ) = Z + ( i + 1) + (cid:88) s (cid:88) i + a s min ≤ j ≤ i + a s max (cid:104) s, i + 1 | z | s, j (cid:105) Z + ( j ) . (S9)Here each particle type s has two characteristic lengths, corresponding to itsminimum and maximum DNA footprints, respectively: a s min and a s max . Theboundary conditions are Z − (1) = 1 and Z + ( L ) = 1. The full partition functionis given by Z = Z − ( L + 1) = Z + (0). Note that all unphysical terms for whichbound particles run off the lattice automatically vanish from Eqs. (S8) and (S9).To avoid numeric instabilities, the recursion should be done in log space. Let F ( i ) = ln Z − ( i ) ,R ( i ) = ln Z + ( i ) . With this notation, Eqs. (S8) and (S9) become F ( i ) = F ( i −
1) + ln (cid:110) (cid:88) s (cid:88) i − a s max ≤ j ≤ i − a s min e F ( j ) − F ( i − β [ µ s − u s ( j,i − (cid:111) , (S10) R ( i ) = R ( i + 1) + ln (cid:110) (cid:88) s (cid:88) i + a s min ≤ j ≤ i + a s max e R ( j ) − R ( i +1)+ β [ µ s − u s ( i +1 ,j )] (cid:111) , with the boundary conditions F (1) = R ( L ) = 0.Then the one-particle distribution function is n t ( i, j ) = e F ( i )+ R ( j ) − ln Z + β [ µ t − u t ( i,j )] , where ln Z = F ( L + 1) = R (0).Although we do not show it here, the two-particle distribution can be com-puted in a very similar way. The only new ingredient in Eq. (S6) is the par-tition function for the box with walls at two arbitrary positions, Z ( t, j, s, k ) ≡(cid:104) t, j | w ( I − zw ) − | s, k (cid:105) . This partition function can be computed recursively,exactly as the partial partition functions Z ± discussed above.21 .2.2 Special case: No unwrapping The special case in which all particles are fully attached to their DNA sites (i.e.,there is no DNA unwrapping) can be easily obtained from our general formalism.Indeed, in this case we restrict a s min = a s max = a s in Eq. (S10), obtaining F ( i ) = F ( i −
1) + ln (cid:110) (cid:88) s e F ( i − a s ) − F ( i − β [ µ s − u s ( i − a s ,i − (cid:111) ,R ( i ) = R ( i + 1) + ln (cid:110) (cid:88) s e R ( i + a s ) − R ( i +1)+ β [ µ s − u s ( i +1 ,i + a s )] (cid:111) . As before, the boundary conditions are F (1) = R ( L ) = 0.The one-particle distribution is given by n t ( i, i + a t −
1) = e F ( i )+ R ( i + a t − − ln Z + β [ µ t − u t ( i,i + a t − ] . In the previous section, we have solved the direct problem: given the bindingenergies for all particle types, we compute s-particle distributions. However,typically it is particle distributions that are observed experimentally, and theenergetics of particle-DNA interactions need to be inferred. This inverse prob-lem can be solved recursively for the case of systems with multiple particle types,partial unwrapping (variable footprints), and steric exclusion. The recursive so-lution is efficient enough to be employed on the genome-wide scale. Here wefocus on one-particle distributions and one-body energies; the exact matrix for-mulation of the inverse problem for a single particle type with the two-particledistribution and the two-body potential is available in Ref. [52].
Using Eqs. (S5), (S8) and (S9), we obtain: Z − ( i ) = Z − ( i − (cid:88) t,i − a t max ≤ j ≤ i − a t min ZZ − ( i − Z + ( i − n t ( j, i − = Z − ( i − (cid:20) N R ( i − ξ ( i − (cid:21) , (S11)where N R ( i ) = (cid:80) t (cid:80) i − a t max +1 ≤ j ≤ i − a t min +1 n t ( j, i ) represents the probabilityof finding a particle of any type with the right edge at bp i , and ξ ( i ) = Z − ( i ) Z + ( i ) /Z . 22 + satisfies a similar recursive relation: Z + ( i ) = Z + ( i + 1) (cid:20) N L ( i + 1) ξ ( i + 1) (cid:21) , (S12)where N L ( i ) is the probability of finding a particle of any type with the leftedge at bp i : N L ( i ) = (cid:80) t (cid:80) i + a t min − ≤ j ≤ i + a t max − n t ( i, j ).The quantity ξ ( i ) satisfies ξ ( i + 1) − ξ ( i ) = 1 Z (cid:2) Z − ( i + 1) Z + ( i + 1) − Z − ( i ) Z + ( i ) (cid:3) = 1 Z (cid:110) Z − ( i + 1) (cid:2) Z + ( i + 1) − Z + ( i ) (cid:3) + Z + ( i ) (cid:2) Z − ( i + 1) − Z − ( i ) (cid:3) (cid:111) = N R ( i ) − N L ( i + 1) , so that ξ ( i ) = 1 + i − (cid:88) k =0 (cid:2) N R ( k ) − N L ( k + 1) (cid:3) , (S13)where the initial condition ξ (0) = 1 has been used.After we compute both Z − and Z + in this way, the total partition functionis given by Z = Z − ( L + 1) = Z + (0) as before, and the binding energy for anyparticle attached to the DNA is given by β [ u t ( i, j ) − µ t ] = − ln (cid:20) n t ( i, j ) ZZ − ( i ) Z + ( j ) (cid:21) . (S14) In the case of the all-or-none binding, all matrix elements (cid:104) i | n t | j (cid:105) vanish unless j = i + a t −
1, where a t is the length of the binding site for the particle of type t . Thus N L ( i ) = (cid:88) t n t ( i, i + a t − ,N R ( i ) = (cid:88) t n t ( i − a t + 1 , i ) . Using these expressions, we can employ Eqs. (S11), (S12) and (S13) to com-pute Z + and Z − in log space. Finally, Eq. (S14) can be used to compute thebinding energies.If all particles are of the same type, the quantity ξ can be simplified further: ξ ( i ) = 1 − i (cid:88) k = i − a +1 N L ( k ) = 1 − Occ( i ) , i ) is the probability that bp i is covered by a particle. Thus in thislimit ξ ( i ) is simply the probability that bp i is not bound by any particles.The recursion relations for Z − and Z + become Z − ( i + 1) = Z − ( i ) (cid:20) N L ( i − a + 1)1 − Occ( i ) (cid:21) ,Z + ( i ) = Z + ( i + 1) (cid:20) N L ( i + 1)1 − Occ( i + 1) (cid:21) . These expressions are equivalent to those previously obtained in [54].
The simplest sequence-dependent model of nucleosome formation assumes thatthe nucleosome energy depends only on the mono- and dinucleotide counts inthe nucleosomal site [54]. Thus for the sequence S S . . . S N , the nucleosomeformation energy is (cid:80) Ni =1 (cid:15) S i + (cid:80) N − i =1 (cid:15) S i S i +1 , where (cid:15) S i and (cid:15) S i S i +1 are themono- and dinucleotide contributions, respectively.Here we consider a symmetrized version of the model, with (cid:15) A = (cid:15) T , (cid:15) C = (cid:15) G and 10 unique dinucleotide energies: (cid:15) AA/T T , (cid:15) AC/GT , (cid:15) AG/CT , (cid:15) AT , (cid:15) CA/T G , (cid:15) CC/GG , (cid:15) CG , (cid:15) GA/T C , (cid:15) GC , (cid:15) T A .Using Eq. (S14), we can estimate all 12 (cid:15) ’s for a model in which nucleosomesare the only particle type and unwrapping is allowed. For each sequence startinga bp i and ending at bp j , we have u ( i, j ) − µ = j (cid:88) k = i (cid:15) S k + j − (cid:88) k = i (cid:15) S k S k +1 − µ, = (cid:0) m A/T m C/G m AA/T T · · · m T A − (cid:1) (cid:15) A/T ... (cid:15)
T A µ , (S15)where m X and m XY are the counts of mono- and dinucleotides X and XY inthe sequence, respectively.Using all possible combinations of pairs ( i, j ) where a nucleosome can form,we obtain a large number P of equations of this type: E − µ = M (cid:18) (cid:15)µ (cid:19) . Here, E − µ is a column vector of dimension P , where each row contains one u ( i, j ) − µ element from Eq. (S15). (cid:18) (cid:15)µ (cid:19) is the column vector from Eq. (S15),and M is a P ×
13 matrix with mono- and dinucleotide counts and -1’s in the24ast column. From this equation we can derive the energies (cid:15) and µ by a leastsquares fit.A problem that arises is that matrix M has a one-dimensional kernel, or nullspace [52]. Because in every DNA sequence the number of mononucleotides isequal to the length of the sequence, and the number of dinucleotides is equalto the length of the sequence minus 1, the columns of the matrix M are notlinearly independent. Indeed, the column vector | V (cid:105) = − − is the only linearly independent vector from the kernel of M : M | V (cid:105) = 0 (i.e.,the kernel of M is spanned by | V (cid:105) ). Thus we have only 12 linearly independentequations, and cannot obtain a unique solution for the 13 parameters (cid:15) and µ .We need to add one constraint, which we choose to be (cid:88) i =1 (cid:15) i = 0 , to make the solution unique. Hydroxyl radicals that cleave DNA near the nucleosome dyad have two preferredcutting sites, at positions -1 bp and +6 bp with respect to the dyad [30]. If DNAis cut at these positions with frequencies f and 1 − f , the distance between twoconsecutive cuts (one on the Watson and one on the Crick strand) is given by d cuts = d dyads + b, where d dyads is the distance between two neighboring dyads and the bias b is b = − −
52 with probability (1 − f ) f (1 − f ) f . The cleavage bias has to be taken into account by convolving the predictedinter-dyad distance probability P ( c + d | c ) with a kernel corresponding to thisbias: F ( x ) = (1 − f ) for x = − f (1 − f ) for x = − f for x = 20 otherwise . The convolution is then compared with the observed distribution of inter-dyad distances. 25 .6 Parameter optimization
Parameter fitting for all models was carried out in a two-stage procedure us-ing the genetic algorithm optimization function ga from the MATLAB Globaloptimization toolbox. First, the objective function to be minimized was setequal to the root-mean-square (RMS) deviation between predicted and observedinter-dyad distributions. Once RM S < − had been achieved, the objectivefunction was replaced by RMS − r osc (cid:39) − r osc , where r osc is the linear correlationbetween observed and predicted oscillations after the smooth background hasbeen subtracted from inter-dyad distributions, as in Fig. 1D. We have foundthat the two-stage setup allows us to effectively fit both the overall shape andthe fine oscillatory structure in the data. The best-fit parameters for all modelsare given in SI Results. 26 Supplementary Results
Model A: Crystal structure augmented with an additional well.
Thebinding energy of a particle of length a = 1 + x + x (1 bp for the dyad, and x and x for the extra number of bps in contact with the histone octamer oneach side of the dyad) is given by u = u half ( x ) + u half ( x ), with u half ( x ) = interp1 ( ... ) − E b x, where E b is the binding energy of a fully wrapped particle in the absence of10-11 bp oscillations. The oscillations are based on the the positions of thehistone-DNA contacts in the crystal structure [57]. The MATLAB function interp1 (...) was used to generate the oscillatory pattern by piecewise cubicHermite interpolation between the following data points: x (Position) f(x) (Energy) -1 -A3 A7 -A13 A17 -A24 A28 -A34 A38 -A44 A49 -A55 A59 -A65 A69 -A75 Ap -d85 AThe oscillatory pattern was superimposed onto a line with the slope of − E b / arameter Value a max
163 bp a min E b B T µ -14.51 k B T A B T f p
79 bp d B T − − − − − u ha l f ( k B T ) Dyad − edge distance (bp) Table S1: Fitted parameters for Model A. a max and a min are the maximum andminimum lengths of the nucleosome particle, µ is the histone octamer chemicalpotential, A is the amplitude of the oscillations, and f is the hydroxyl radicalcutting frequency. p and d are the position and the depth of the first minimumoutside of the nucleosome core particle, respectively. E b is the binding energyof a fully wrapped particle in the absence of 10-11 bp oscillations.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 1 . × − RM S is the root-mean-square error of the predicted inter-dyad distribution. r osc is the linear correlation between the oscillatory parts of the measured and pre-dicted inter-dyad distributions. The oscillatory part was obtained by subtract-ing the smooth background from the full inter-dyad distribution. Smoothingwas done by applying a Savitzky-Golay smoothing filter [also known as least-squares, or DISPO (Digital Smoothing Polynomial) filter] of polynomial order 3and length 31 bp. RM S osc is the root-mean-square error of the oscillatory partof the predicted inter-dyad distribution.28 odel B: Crystal structure augmented with a linear function.
Sameas Model A for x ∈ [1 , u half ( x ) = (cid:26) interp1 ( ... ) − E b x for x ∈ [1 , , interp1 ( ... ) − E b − ∆ E ∆ X ( x −
73) for x ∈ [74 ,
73 + ∆ X ] , where ∆ E/ ∆ X is the slope of the linear function (i.e., ∆ E is the energy dif-ference between the first and last points of the linear function and ∆ X is thecardinality of the range of the linear function). Parameter Value a min
27 bp E b B T µ -15.04 k B T A B T f E -2.47 k B T∆ X − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S2: Fitted parameters for model B. All parameters are as in Model A,except for ∆ E and ∆ X which are defined above.Fit residuals: RM S = 0 . RM S cannot decrease below 10 − for this model) r osc = 0 . RM S osc = 1 . × − All residuals are defined as in Model A.29 odel C: 10-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π
10 ( x − x ) (cid:19) − E b x, where A is the amplitude of the oscillations, x determines the phase of theoscillations, and E b is the binding energy of a fully wrapped particle in theabsence of the oscillations. Parameter Value a max
165 bp a min E b B T µ -13.99 k B T A B T x
79 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S3: Fitted parameters for Model C. All parameters are as in Model A,except for x which is defined above.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are defined as in Model A.30 odel D: 11-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π
11 ( x − x ) (cid:19) − E b x, where A , x and E b have the same meaning as in Model C. Parameter Value a max
161 bp a min
25 bp E b B T µ -14.30 k B T A B T x
80 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S4: Fitted parameters for Model D. All parameters are as in Model C.Fit residuals:
RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are defined as in Model A.31 odel E: Uniform unwrapping. u half ( x ) = − E b x, where E b is the binding energy of a fully wrapped nucleosome. Parameter Value a max
163 bp a min
35 bp E b B T µ -13.14 k B T f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S5: Fitted parameters for Model E. All parameters are as in Model A.Fit residuals:
RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are defined as in Model A.32 odel F: 5-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π x − x ) (cid:19) − E b x, where A , x and E b have the same meaning as in Model C. Parameter Value a max
163 bp a min
39 bp E b B T µ -16.13 k B T A B T x
74 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S6: Fitted parameters for Model F. All parameters are as in Model C.Fit residuals:
RM S = 9 . × − r osc = 0 . RM S osc = 3 . × − All residuals are defined as in Model A.33 odel G: 5-bp stepwise unwrapping. u half ( x ) = − E step ceil (cid:18) x − x (cid:19) , where E step is the amount of energy lost in each step, and x determines thephase of the stepwise profile. Parameter Value a max
163 bp a min
39 bp E step B T µ -12.83 k B T x f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S7: Fitted parameters for Model G. All parameters are as in Model A,except for E step and x defined above.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are defined as in Model A.34 odel H: 10-bp stepwise unwrapping. u half ( x ) = − E step ceil (cid:18) x − x (cid:19) , where E step is the amount of energy lost in each step, and x determines thephase of the stepwise profile. Parameter Value a max
169 bp a min E step B T µ -12.04 k B T x f − − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S8: Fitted parameters for Model E. All parameters are as in Model F.Fit residuals:
RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are defined as in Model A.35
Supplementary Figures (cid:132)(cid:20) a min (bp)a max (bp) R M S (cid:132)(cid:17)(cid:15)(cid:22) (cid:17)(cid:15)(cid:22) a min (bp)a max (bp) C o rr e l a t i on A B (cid:132)(cid:20)
Slope (k B T / bp) R M S C Figure S3:
Sensitivity of the predicted inter-dyad distribution to theparameters of the unwrapping potential based on nucleosome crystalstructures. (A) Root-mean-square error (RMS) of the inter-dyad distributionpredicted using the model in Fig. 1B, as a function of a min and a max . (B) Thelinear correlation coefficient between oscillations in the predicted and observedinter-dyad distributions ( r osc ), as a function of a min and a max . The oscillationswere obtained by subtracting the smooth background from inter-dyad distribu-tions, as described in the Fig. 1 caption. (C) Variation of the RMS with theslope of the unwrapping potential in Fig. 1B. In all panels, model parametersthat were not varied were kept fixed at their best-fit values (SI Results, ModelA). 36 inimum (bp)Correlation
76 77 78 79 80 81 821.51.71.92.12.32.52.72.93.13.33.5 (cid:132)(cid:17)(cid:15)(cid:18) D ep t h ( k B T ) Minimum (bp) D ep t h ( k B T ) RMS
76 77 78 79 80 81 821.51.71.92.12.32.52.72.93.13.33.5 11.522.533.544.5x 10 (cid:132)(cid:20) AB Figure S4:
Sensitivity of the predicted inter-dyad distribution to modelparameters describing higher-order chromatin structure.
The unwrap-ping potential is based on nucleosome crystal structures (SI Results, Model A).(A) Root-mean-square error (RMS) of the predicted inter-dyad distribution,as a function of the position and the depth of the first minimum outside ofthe nucleosome core (Fig. 1B). The depth of the first minimum is computedwith respect to u half ( x = 73 bp). (B) The linear correlation coefficient betweenoscillations in the predicted and observed inter-dyad distributions ( r osc ), as afunction of the position and the depth of the first minimum outside of the nucle-osome core (Fig. 1B). The oscillations were obtained by subtracting the smoothbackground from inter-dyad distributions, as described in the Fig. 1 caption.In both panels, model parameters that were not varied were kept fixed at theirbest-fit values (SI Results, Model A). 37 B Dyad − edge distance (bp) u ha l f ( k B T ) Inter−dyad distance (bp) P r obab ili t y DataModel50 100 150 200 250−2−1012 x 10 −3 Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.769
DataModel C (cid:132)(cid:18)(cid:19)(cid:15)(cid:22) (cid:132)(cid:18)(cid:17) (cid:132)(cid:24)(cid:15)(cid:22) (cid:132)(cid:22) (cid:132)(cid:19)(cid:15)(cid:22) (cid:17) (cid:19)(cid:15)(cid:22) (cid:22) (cid:24)(cid:15)(cid:22)(cid:17)(cid:17)(cid:15)(cid:17)(cid:17)(cid:19)(cid:17)(cid:15)(cid:17)(cid:17)(cid:21)(cid:17)(cid:15)(cid:17)(cid:17)(cid:23)(cid:17)(cid:15)(cid:17)(cid:17)(cid:25)(cid:17)(cid:15)(cid:17)(cid:18)(cid:17)(cid:17)(cid:15)(cid:17)(cid:18)(cid:19) (cid:39) E (k B T) R M S E FigS4_Xray_slope_outside.pdf
Extra number of base-pairs (cid:39) E ( k B T ) Correlation
DRMS: 11.964×10 -4 Figure S5:
Crystal structure-based model augmented by a linear po-tential outside of the nucleosome core. (A) The energy profile fitted toreproduce the inter-dyad distance distribution shown in (B). All fitting parame-ters are listed in SI Results, Model B. Under the symmetric unwrapping assump-tion, the energy of a nucleosome which covers 2 x + 1 bps is given by 2 u half ( x ).(B) The inter-dyad distance distribution observed in a high-resolution nucleo-some map [30] (blue line), and predicted using Model B in SI Results (red line).RMS - root-mean-square deviation between the model and the data. Note thatin this model RMS below 10 − could not be achieved, and thus optimizationwas switched to maximize the correlation coefficient r osc once RMS reached1 . × − (see SI Methods for details). (C) Oscillations in the observed (blueline) and predicted (red line) inter-dyad distributions. The oscillations were ob-tained by subtracting the smooth background from the data and the model in(B), as described in the Fig. 1 caption. Correlation refers to r osc , the linear cor-relation coefficient between measured and predicted oscillations. (D) Heatmapwith superimposed contour lines of the r osc dependence on the two parametersof the linear potential outside of the nucleosome core: ∆ x = x last −
73 bp and∆ E = u half ( x last ) − u half (73), where [1 , x last ] is the range of the energy profile(SI Results, Model B). Note that the best fit corresponds to ∆ x = 7 bp. (E)The dependence of the RMS on ∆ E for the best-fit value of ∆ x = 7 bp. Allparameters not explicitly varied in (D) and (E) were kept fixed at their best-fitvalues (SI Results, Model B). 38 Inter−dyad distance (bp) P r obab ili t y DataModel
50 100 150 200 250 − − − Inter −dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.709
DataModel
50 100 150 200 25000.0050.0100.0150.0200.025
Inter −dyad distance (bp) P r obab ili t y DataModel
50 100 150 200 250 − − − Inter −dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.689
DataModel (cid:37)(cid:90)(cid:66)(cid:69)(cid:1)(cid:132)(cid:1)(cid:70)(cid:69)(cid:72)(cid:70)(cid:1)(cid:69)(cid:74)(cid:84)(cid:85)(cid:66)(cid:79)(cid:68)(cid:70)(cid:1)(cid:9)(cid:67)(cid:81)(cid:10) (cid:132) (cid:132) (cid:132) (cid:132) (cid:68)(cid:83)(cid:90)(cid:84)(cid:85)(cid:66)(cid:77)(cid:1)(cid:84)(cid:85)(cid:83)(cid:86)(cid:68)(cid:85)(cid:86)(cid:83)(cid:70) (cid:132)(cid:67)(cid:81)(cid:1)(cid:81)(cid:70)(cid:83)(cid:74)(cid:80)(cid:69)(cid:74)(cid:68) (cid:132)(cid:67)(cid:81)(cid:1)(cid:81)(cid:70)(cid:83)(cid:74)(cid:80)(cid:69)(cid:74)(cid:68) (cid:86) (cid:73)(cid:66) (cid:77) (cid:71) (cid:9) (cid:76) B (cid:53) (cid:10) Dyad
ABC DE
RMS: 9.986 ×10 -4 RMS: 9.912 ×10 -4 Figure S6:
Strictly periodic models of nucleosome unwrapping. (A)Nucleosome unwrapping/higher-order structure potential energy profiles. Underthe symmetric unwrapping assumption, the energy of a nucleosome that covers2 x +1 bps is given by 2 u half ( x ). The minima and maxima of the energy landscapeare either based on the crystal structures of the nucleosome core particle as inFigure 1 (blue), or else are 10 (green) and 11 (red) bp-periodic oscillations withfitted initial phase (SI Results, Models C and D). Dark gray bars show wherethe histone binding motifs interact with the DNA minor groove. Light gray barsindicate where the DNA major groove faces the histones. (B) The inter-dyaddistance distribution from a high-resolution nucleosome map [30] (blue line),and from the 10 bp-periodic model (red line). All model parameters are listedin SI Results, Model C. RMS - root-mean-square deviation between the modeland the data. (C) Oscillations in the observed (blue line) and predicted (redline) inter-dyad distributions. The oscillations were obtained by subtractinga smooth background from the data and the model in (B), as described inthe Fig. 1 caption. Correlation refers to r osc , the linear correlation coefficientbetween measured and predicted oscillations. (D) Same as (B), for the 11 bp-periodic model. All model parameters are listed in SI Results, Model D. (E)Same as (C), for the 11 bp-periodic model.39 inimum (bp) S l ope ( k B T / bp ) RMS
64 65 66 67 68 69 70 71 72 730.050.060.070.080.090.100.110.120.130.140.15 11.522.533.54x 10 (cid:132)(cid:20)
64 66 68 70 720.050.10.15(cid:132)(cid:17)(cid:15)(cid:22)00.5
Minimum (bp)Slope (k B T / bp) C o rr e l a t i on AB Figure S7:
Sensitivity of the predicted inter-dyad distribution to pa-rameters of the 10 bp-periodic model. (A) Heatmap with superimposedcontour lines of the RMS dependence on the slope of the energy profile and theposition of the last minimum within the nucleosome core. RMS - root-mean-square deviation between the model and the data. (B) The linear correlationcoefficient r osc between oscillations in the predicted and observed inter-dyaddistributions, as a function of the overall slope of the energy profile and theposition of the last minimum within the nucleosome core. All parameters notexplicitly varied were kept fixed at their best-fit values (SI Results, Model C).40 inear model Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data
Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.206
Data
Model
Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data
Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.283
Data
Model
Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data
Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.545
Data
Model ×10 -4 RMS: 9.904 ×10 -4 RMS: 9.790 ×10 -4 Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data
Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.275
Data
Model
RMS: 9.972 ×10 -4 A B C D Figure S8:
Alternative models of nucleosome unwrapping. (A) Linearmodel (SI Results, Model E). (B) 5-bp periodic model (SI Results, Model F).(C) 5-bp step-wise model (SI Results, Model G). (D) 10-bp step-wise model(SI Results, Model H). In each column, the upper panel shows the nucleosomeunwrapping/higher-order structure potential energy profile (as in Fig. 1B), themiddle panel shows the comparison of experimental and predicted inter-dyaddistance distributions (as in Fig. 1C), and the lower panel shows observed andpredicted oscillations in the inter-dyad distance distributions (as in Fig. 1D).41 (cid:19) (cid:132)(cid:18) (cid:18) (cid:19) (cid:18)(cid:19)(cid:22)(cid:18)(cid:20)(cid:17)(cid:18)(cid:20)(cid:22)(cid:18)(cid:21)(cid:17)(cid:18)(cid:21)(cid:22)(cid:18)(cid:22)(cid:17)(cid:18)(cid:22)(cid:22)(cid:18)(cid:23)(cid:17) (cid:34)(cid:87)(cid:70)(cid:83)(cid:66)(cid:72)(cid:70)(cid:1)(cid:73)(cid:74)(cid:84)(cid:85)(cid:80)(cid:79)(cid:70)(cid:1)(cid:85)(cid:86)(cid:83)(cid:79)(cid:80)(cid:87)(cid:70)(cid:83)(cid:1)(cid:83)(cid:66)(cid:85)(cid:70)(cid:1)(cid:9)(cid:91)(cid:132)(cid:84)(cid:68)(cid:80)(cid:83)(cid:70)(cid:10) A v e r age i n t e r (cid:132) (cid:69) (cid:90) (cid:66)(cid:69) (cid:1) (cid:69) (cid:74) (cid:84) (cid:85) (cid:66)(cid:79) (cid:68) (cid:70) Correlation: (cid:132)(cid:17)(cid:15)(cid:20)(cid:26)(cid:19)
A/T ratio (cid:34) (cid:87) (cid:70) (cid:83) (cid:66)(cid:72)(cid:70) (cid:1) (cid:74) (cid:79) (cid:85) (cid:70) (cid:83) (cid:132) (cid:69) (cid:90) (cid:66)(cid:69) (cid:1) (cid:69) (cid:74) (cid:84) (cid:85) (cid:66)(cid:79) (cid:68) (cid:70)
Correlation: 0.501
Position (bp) (cid:41)(cid:74)(cid:84)(cid:85)(cid:80)(cid:79)(cid:70)(cid:1)(cid:85)(cid:86)(cid:83)(cid:79)(cid:80)(cid:87)(cid:70)(cid:83)(cid:1)(cid:83)(cid:66)(cid:85)(cid:70)(cid:1)(cid:9)(cid:91)(cid:132)(cid:84)(cid:68)(cid:80)(cid:83)(cid:70)(cid:10) (cid:132) (cid:132) (cid:132) (cid:132) (cid:132) Position (bp)PIC occupancy (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)
TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:17)(cid:22)(cid:17)(cid:18)(cid:17)(cid:17)(cid:18)(cid:22)(cid:17)(cid:19)(cid:17)(cid:17)(cid:19)(cid:22)(cid:17)(cid:20)(cid:17)(cid:17)
BD E CA Position (bp)Average fragment length (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)
TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:18)(cid:20)(cid:22)(cid:18)(cid:21)(cid:17)(cid:18)(cid:21)(cid:22)(cid:18)(cid:22)(cid:17)(cid:18)(cid:22)(cid:22)
Figure S9:
Genome-wide distribution of nucleosome lengths, histoneturnover rates, and transcription pre-initiation complexes. (A) Dis-tribution of average lengths of DNA-bound particles mapped by MNase diges-tion [47] in the vicinity of TSS. We considered particles with sizes between 80and 200 bp and assigned particle lengths to the mid-point of each particle. Val-ues for bps without dyads were obtained by interpolation. (B) Distribution ofhistone turnover rates [49] in the vicinity of TSS. (C) Distribution of the com-bined occupancy of 9 transcription pre-initiation complexes (PICs) [48] in thevicinity of TSS. PIC occupancies provided at 20 bp interval in Ref. [48] wereinterpolated. In panels (A)-(C), gene order is as in Figure 1B, and the heatmapswere smoothed using a 2D Gaussian kernel with σ = 3 pixels. (D) Correlationbetween inter-dyad distances and histone turnover rates averaged over 10 kbpwindows tiling the yeast genome. (E) Correlation between average inter-dyaddistances and the A/T ratio in 10 kbp windows tiling the yeast genome. A/Tratio is the fraction of A/T nucleotides in the window, divided by the genome-wide A/T fraction. Correlation in (D) and (E) refers to the linear correlationcoefficient. 42 P o t en t i a l ba rr i e r ( k B T / bp ) I n t e r gen i c r eg i on r a t i o ( % ) −1000 −500 TSS +500 +1000 N o r m a li z ed d y ad den s i t y Distance (bp) A v e r age f r ag m en t s i z e ( bp ) Normalized dyad density (Brogaard et al.)
Normalized dyad density (simulation)
Average nuc. size (Henikoff et al.)Average nuc. size (simulation)
Figure S10:
Modeling distributions of nucleosome lengths and dyadpositions in the vicinity of TSS.
We align all yeast genes by their TSS as inFig. 2 and for each bp compute the fraction of times that bp is found in an in-tergenic region, as opposed to the ORF of a neighboring gene (grey backgroundcurve). The intergenic fraction has an asymmetric shape with a maximum atabout 50 bp upstream of the TSS. We use this shape as a guide for construct-ing an energy barrier for in vivo histone deposition (pink background curve).The barrier is composed of three half-Gaussians: B ( x ) = H exp (cid:104) − ( x − c ) σ (cid:105) ( x ≤ c ) , ( H + D ) (cid:110) exp (cid:104) − ( x − c ) σ (cid:105) − (cid:111) − D exp (cid:104) − ( x − c ) σ (cid:105) ( x > c ). The free param-eters of the barrier are fit to maximize the sum of two correlations: betweenobserved and predicted normalized dyad counts [30] (solid and dashed bluelines, respectively), and between observed and predicted average nucleosomeDNA lengths [47] (solid and dashed green lines, respectively). Normalized dyadcounts are computed as the total number of dyads at a given bp for all genes,divided by the average of this quantity in a [-1000,1000] bp window around theTSS. Average DNA lengths are computed for all nucleosomes with a midpoint ata given bp, for all genes (if the midpoint falls in between two bps, the one on theleft is used). The fitted parameters are: H = 0 . B T , D = 0 . B T , c = x TSS −
32 bp , σ = 162 . , σ = 28 . , σ = 2090 . x TSS is theabsolute position of the TSS in the box, c is the center of the 3 Gaussians, H is the height of the first Gaussian, D is the depth of the third Gaussian, and σ , σ , σ are the standard deviations of the three Gaussian distributions. Thesimulations were done in a 15 kbp box with the barrier placed at its center toeliminate the boundary effects. Unwrapping was assumed to be symmetric andthe nucleosome structure-based unwrapping potential (SI Results, Model A) wasused. The total free energy u nuc ( k, l ) of a nucleosome occupying bps k, . . . , l isa sum of u SInuc and u barriernuc = (cid:80) lj = k (cid:15) j , where (cid:15) j is the value of the barrier at bp j . Note that the chemical mapping data underestimates the number of -1 and+1 nucleosomes due to gel selection bias [30].43
20 40 60 80 (cid:132)(cid:18)(cid:17)(cid:132)(cid:25)(cid:132)(cid:23)(cid:132)(cid:21)(cid:132)(cid:19) Dyad (cid:132) edge distance (bp) (cid:52) (cid:70)(cid:82)(cid:86)(cid:70)(cid:79) (cid:68) (cid:70) (cid:132) (cid:74) (cid:79)(cid:69)(cid:70)(cid:81)(cid:70)(cid:79)(cid:69)(cid:70)(cid:79) (cid:85) b i nd i ng ene r g y ( k B T ) Exact (cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81)(cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:17)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:84)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81)(cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:17)(cid:17)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:84)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81) a max a min E b µ A p d020406080100 Parameter | (cid:39) P | / | P | ( % ) AB C o rr e l a t i on D y ad s O cc . Figure S11:
Inference of the unwrapping potential in a model withsequence-specific nucleosome formation energies. (A) Exact unwrappingpotential vs. unwrapping potentials predicted at three levels of sequence cov-erage. All calculations were done on
S. cerevisiae chromosome I with L = 230kbp. M L reads were randomly sampled from the exact distribution n nuc1 ( i, j )(see Materials and Methods), where M ∈ { , , } is the desired level ofread coverage per bp. Sampled reads were used to compile a chromosome-widehistogram of nucleosome DNA lengths, to which the unwrapping potential inSI Results, Model A was fit using a genetic algorithm optimization function ga from MATLAB, which minimizes the root-mean-square error of the predicteddistribution of nucleosome lengths. (B) The relative errors between predictedand exact (SI Results, Model A) parameters of the unwrapping potential at threelevels of sequence coverage (light blue background). P denotes any parameteron the horizontal axis. Linear correlation between predicted and exact distri-butions of dyad positions and nucleosome occupancy (light pink background).The height of each bar represents the mean relative error for the correspondingparameter or the mean correlation coefficient, obtained by averaging the re-sults of 100 random sampling experiments. The uncertainty intervals representstandard deviations. 44 eferences [1] K. E. van Holde, Chromatin . New York: Springer, 1989.[2] K. Luger et al. , “Crystal structure of the nucleosome core particle at 2.8 ˚Aresolution,”
Nature , vol. 389, pp. 251–260, Sept. 1997.[3] G. Felsenfeld and M. Groudine, “Controlling the double helix,”
Nature ,vol. 421, pp. 448–453, 2003.[4] A. L. Olins and D. E. Olins, “Spheroid chromatin units (v bodies),”
Science ,vol. 183, pp. 330–332, Jan. 1974.[5] R. Kornberg, “Chromatin structure: a repeating unit of histones and dna.,”
Science (New York, NY) , vol. 184, no. 4139, pp. 868–871, 1974.[6] C. Woodcock, J. Safer, and J. Stanchfield, “Structural repeating units inchromatin. i. evidence for their general occurrence.,”
Experimental cell re-search , vol. 97, pp. 101–110, 1976.[7] H. Schiessel, J. Widom, R. Bruinsma, and W. Gelbart, “Polymer reptationand nucleosome repositioning,”
Phys. Rev. Lett. , vol. 86, pp. 4414–4417,2001.[8] I. Kulic and H. Schiessel, “Chromatin dynamics: Nucleosomes go mobilethrough twist defects,”
Phys. Rev. Lett. , vol. 91, p. 148103, 2003.[9] M. Vignali et al. , “ATP-Dependent chromatin-remodeling complexes,”
Mol. Cell. Biol. , vol. 20, pp. 1899–1910, Mar. 2000.[10] K. J. Polach and J. Widom, “Mechanism of protein access to specific DNAsequences in chromatin: A dynamic equilibrium model for gene regulation,”
J. Mol. Biol. , vol. 254, pp. 130–149, Nov. 1995.[11] J. D. Anderson, A. Th˚astr¨om, and J. Widom, “Spontaneous access of pro-teins to buried nucleosomal DNA target sites occurs via a mechanism that isdistinct from nucleosome translocation,”
Mol. Cell. Biol. , vol. 22, pp. 7147–7157, Oct. 2002.[12] H. S. Tims et al. , “Dynamics of nucleosome invasion by DNA binding pro-teins,”
J. Mol. Biol. , vol. 411, pp. 430–448, Aug. 2011.[13] G. Moyle-Heyrman, H. S. Tims, and J. Widom, “Structural constraints incollaborative competition of transcription factors against the nucleosome,”
J. Mol. Biol. , vol. 412, pp. 634–646, Sept. 2011.[14] C. Hodges et al. , “Nucleosomal fluctuations govern the transcription dy-namics of RNA polymerase II,”
Science , vol. 325, pp. 626–628, July 2009.[15] A. Bucceri, K. Kapitza, and F. Thoma, “Rapid accessibility of nucleosomalDNA in yeast on a second time scale,”
EMBO J , vol. 25, pp. 3123–3132,July 2006. 4516] C. C. Adams and J. L. Workman, “Binding of disparate transcriptionalactivators to nucleosomal DNA is inherently cooperative,”
Mol. Cell. Biol. ,vol. 15, p. 1405–1421, 1995.[17] J. D. Anderson and J. Widom, “Sequence and position-dependence of theequilibrium accessibility of nucleosomal DNA target sites,”
J. Mol. Biol. ,vol. 296, pp. 979–987, 2000.[18] J. A. Miller and J. Widom, “Collaborative competition mechanism for geneactivation in vivo,”
Mol. Cell. Biol. , vol. 23, pp. 1623–1632, 2003.[19] G. Li and J. Widom, “Nucleosomes facilitate their own invasion,”
Nat.Struct. Mol. Biol. , vol. 11, no. 1, pp. 763–769, 2004.[20] G. Li et al. , “Rapid spontaneous accessibility of nucleosomal DNA,”
Nat.Struct. Mol. Biol. , vol. 12, no. 1, pp. 46–53, 2004.[21] M. Engeholm et al. , “Nucleosomes can invade DNA territories occupied bytheir neighbors,”
Nat. Struct. Mol. Biol. , vol. 16, no. 2, pp. 151–158, 2009.[22] M. G. Poirier et al. , “Spontaneous access to DNA target sites in foldedchromatin fibers,”
J. Mol. Biol. , vol. 379, pp. 772–786, June 2008.[23] M. G. Poirier, E. Oh, H. S. Tims, and J. Widom, “Dynamics and function ofcompact nucleosome arrays,”
Nat. Struct. Mol. Biol. , vol. 16, pp. 938–945,2009.[24] T. Chou, “An exact theory of histone-DNA adsorption and wrapping,”
Europhys. Lett. , vol. 62, pp. 753–759, June 2003.[25] L. A. Mirny, “Nucleosome-mediated cooperativity between transcriptionfactors,”
Proc. Natl. Acad. Sci. U.S.A. , vol. 107, pp. 22534–22539, Dec.2010.[26] V. B. Teif and K. Rippe, “Statistical–mechanical lattice models for pro-tein–DNA binding in chromatin,”
J. Phys.: Condens. Matter , vol. 22,p. 414105, Oct. 2010.[27] V. B. Teif and K. Rippe, “Nucleosome mediated crosstalk between tran-scription factors at eukaryotic enhancers,”
Phys. Biol. , vol. 8, p. 044001,Aug. 2011.[28] V. B. Teif and K. Rippe, “Calculating transcription factor binding mapsfor chromatin,”
Brief. Bioinform. , vol. 13, pp. 187–201, Mar. 2012.[29] P. Prinsen and H. Schiessel, “Nucleosome stability and accessibility of itsDNA to proteins,”
Biochimie , vol. 92, pp. 1722–1728, Dec. 2010.[30] K. Brogaard et al. , “A map of nucleosome positions in yeast at base-pairresolution,”
Nature , vol. 486, pp. 496–501, 2012.4631] C. Davey, D. Sargent, K. Luger, A. Maeder, and T. Richmond, “Solventmediated interactions in the structure of the nucleosome core particle at1.9 ˚aresolution,”
J. Mol. Biol. , vol. 319, pp. 1097–1113, 2002.[32] T. J. Richmond and C. A. Davey, “The structure of DNA in the nucleosomecore,”
Nature , vol. 423, pp. 145–150, 2003.[33] C. Dingwall, G. P. Lomonosoff, and R. A. Laskey, “High sequence specificityof micrococcal nuclease,”
Nucl. Acids Res. , vol. 9, pp. 2659–2673, 1981.[34] J. D. McGhee and G. Felsenfeld, “Another potential artifact in the studyof nucleosome phasing by chromatin digestion with micrococcal nuclease,”
Cell , vol. 32, pp. 1205–1215, 1983.[35] H.-R. Chung, I. Dunkel, F. Heise, C. Linke, S. Krobitsch, A. E. Ehrenhofer-Murray, S. R. Sperling, and M. Vingron, “The effect of micrococcal nucle-ase digestion on nucleosome positioning data,”
PLoS ONE , vol. 5, no. 12,p. e15754, 2010.[36] A. V. Morozov, K. Fortney, D. A. Gaykalova, V. M. Studitsky, J. Widom,and E. D. Siggia, “Using DNA mechanics to predict in vitro nucleosomepositions and formation energies,”
Nucleic Acids Res. , vol. 37, pp. 4707–4722, 2009.[37] J. Zlatanova, C. Seebart, and M. Tomschik, “The linker-protein network:control of nucleosomal DNA accessibility,”
Trends Biochem. Sci. , vol. 33,pp. 247–253, 2008.[38] S. H. Syed et al. , “Single-base resolution mapping of h1–nucleosome in-teractions and 3D organization of the nucleosome,”
Proc. Natl. Acad. Sci.U.S.A. , vol. 107, pp. 9620–9625, May 2010.[39] M. Georgieva, A. Roguev, K. Balashev, J. Zlatanova, and G. Miloshev,“Hho1p, the linker histone of saccharomyces cerevisiae , is important forthe proper chromatin organization in vivo ,” Biochim. et Biophys. Acta ,vol. 1819, pp. 366–374, 2012.[40] G. Schafer, C. McEvoy, and H.-G. Patterton, “The saccharomyces cere-visiae linker histone Hho1p is essential for chromatin compaction in sta-tionary phase and is displaced by transcription,”
Proc. Natl. Acad. Sci.U.S.A. , vol. 105, pp. 14838–14843, 2008.[41] J. Widom, “A relationship between the helical twist of DNA and the or-dered positioning of nucleosomes in all eukaryotic cells,”
Proc. Natl. Acad.Sci. USA , vol. 89, pp. 1095–1099, 1992.[42] J. P. Wang, Y. Fondufe-Mittendorf, L. Xi, G. F. Tsai, E. Segal, andJ. Widom, “Preferentially quantized linker DNA lengths in Saccharomycescerevisiae,”
PLoS Comput. Biol. , vol. 4, p. e1000175, 2008.4743] R. V. Chereji et al. , “Statistical mechanics of nucleosome ordering bychromatin-structure-induced two-body interactions,”
Phys. Rev. E , vol. 83,p. 050903, May 2011.[44] M. Hall, A. Shundrovsky, L. Bai, R. Fulbright, J. Lis, and M. Wang, “High-resolution dynamic mapping of histone-DNA interactions in a nucleosome,”
Nat. Struct. Mol. Biol. , vol. 16, pp. 124–129, 2009.[45] R. Forties, J. North, S. Javaid, O. Tabbaa, R. Fishel, M. Poirier, andR. Bundschuh, “A quantitative model of nucleosome dynamics,”
NucleicAcids Res. , vol. 39, pp. 8306–8313, 2011.[46] A. Thastrom, P. T. Lowary, H. R. Widlund, H. Cao, M. Kubista, andJ. Widom, “Sequence motifs and free energies of selected natural and non-natural nucleosom e positioning DNA sequences,”
J. Mol. Biol. , vol. 288,pp. 213–229, 1999.[47] J. G. Henikoff, J. A. Belskyb, K. Krassovsky, D. MacAlpine, andS. Henikoff, “Epigenome characterization at single base-pair resolution,”
Proc. Natl. Acad. Sci. USA , vol. 108, pp. 18318–18323, 2011.[48] H. S. Rhee and B. F. Pugh, “Genome-wide structure and organization ofeukaryotic pre-initiation complexes,”
Nature , vol. 483, pp. 295–301, 2012.[49] M. F. Dion, T. Kaplan, M. Kim, S. Buratowski, N. Friedman, and O. J.Rando, “Dynamics of replication-independent histone turnover in buddingyeast,”
Science , vol. 315, pp. 1405–1408, 2007.[50] G. C. Yuan, Y. J. Liu, M. F. Dion, M. D. Slack, L. F. Wu, S. J. Altschuler,and O. J. Rando, “Genome-scale identification of nucleosome positions inS. cerevisiae,”
Science , vol. 309, pp. 626–630, 2005.[51] T. N. Mavrich, I. P. Ioshikhes, B. J. Venters, C. Jiang, L. P. Tomsho, J. Qi,S. C. Schuster, I. Albert, and B. F. Pugh, “A barrier nucleosome modelfor statistical positioning of nucleosomes throughout the yeast genome,”
Genome Res. , vol. 18, pp. 1073–1083, 2008.[52] R. V. Chereji and A. V. Morozov, “Statistical mechanics of nucleosomesconstrained by higher-order chromatin structure,”
J. Stat. Phys. , vol. 144,no. 2, pp. 379–404, 2011.[53] A. Th˚astr¨om, P. Lowary, and J. Widom, “Measurement of histone–DNAinteraction free energy in nucleosomes,”
Methods , vol. 33, pp. 33–44, May2004.[54] G. Locke, D. Tolkunov, Z. Moqtaderi, K. Struhl, and A. V. Morozov, “High-throughput sequencing reveals a simple model of nucleosome energetics,”
Proceedings of the National Academy of Sciences , vol. 107, pp. 20998–21003,Nov. 2010. 4855] H. A. Cole, B. H. Howard, and D. J. Clark, “Activation-induced disruptionof nucleosome position clusters on the coding regions of Gcn4-dependentgenes extends into neighbouring genes,”
Nucl. Acids Res. , vol. 39, pp. 9521–9535, 2011.[56] H. L. Frisch and J. L. Lebowitz,
The equilibrium theory of classical fluids:a lecture note and reprint volume . WA Benjamin, 1964.[57] T. J. Richmond and C. A. Davey, “The structure of DNA in the nucleosomecore,”