[PDF] Ubiquitous nucleosome unwrapping in the yeast genome

Abstract

Nucleosome core particle is a dynamic structure -- DNA may transiently peel off the histone octamer surface due to thermal fluctuations or the action of chromatin remodeling enzymes. Partial DNA unwrapping enables easier access of DNA-binding factors to their target sites and thus may provide a dominant pathway for effecting rapid and robust access to DNA packaged into chromatin. Indeed, a recent high-resolution map of distances between neighboring nucleosome dyads in \emph{S.cerevisiae} shows that at least 38.7\% of all nucleosomes are partially unwrapped. The extent of unwrapping follows a stereotypical pattern in the vicinity of genes, reminiscent of the canonical pattern of nucleosome occupancy in which nucleosomes are depleted over promoters and well-positioned over coding regions. To explain these observations, we developed a biophysical model which employs a 10-11 base pair periodic nucleosome energy profile. The profile, based on the pattern of histone-DNA contacts in nucleosome crystal structures and the idea of linker length discretization, accounts for both nucleosome unwrapping and higher-order chromatin structure. Our model reproduces the observed genome-wide distribution of inter-dyad distances, and accounts for patterns of nucleosome occupancy and unwrapping around coding regions. At the same time, our approach explains \emph{in vitro} measurements of accessibility of nucleosome-covered binding sites, and of nucleosome-induced cooperativity between DNA-binding factors. We are able to rule out several alternative scenarios of nucleosome unwrapping as inconsistent with the genomic data.

Full PDF

UUbiquitous nucleosome unwrapping in the yeastgenome

R˘azvan V. Chereji ∗ and Alexandre V. Morozov † Department of Physics and Astronomy, Rutgers University,Piscataway, New Jersey 08854-8019, USA BioMaPS Institute for Quantitative Biology, Rutgers University,Piscataway, New Jersey 08854-8019, USAMay 17, 2013

Abstract

Nucleosome core particle is a dynamic structure – DNA may tran-siently peel oﬀ the histone octamer surface due to thermal ﬂuctuations orthe action of chromatin remodeling enzymes. Partial DNA unwrappingenables easier access of DNA-binding factors to their target sites and thusmay provide a dominant pathway for eﬀecting rapid and robust accessto DNA packaged into chromatin. Indeed, a recent high-resolution mapof distances between neighboring nucleosome dyads in

S.cerevisiae showsthat at least 38.7% of all nucleosomes are partially unwrapped. The ex-tent of unwrapping follows a stereotypical pattern in the vicinity of genes,reminiscent of the canonical pattern of nucleosome occupancy in whichnucleosomes are depleted over promoters and well-positioned over codingregions. To explain these observations, we developed a biophysical modelwhich employs a 10-11 base pair (bp) periodic nucleosome energy proﬁle.The proﬁle, based on the pattern of histone-DNA contacts in nucleosomecrystal structures and the idea of linker length discretization, accountsfor both nucleosome unwrapping and higher-order chromatin structure.Our model reproduces the observed genome-wide distribution of inter-dyad distances, and accounts for patterns of nucleosome occupancy andunwrapping around coding regions. At the same time, our approach ex-plains in vitro measurements of accessibility of nucleosome-covered bind-ing sites, and of nucleosome-induced cooperativity between DNA-bindingfactors. We are able to rule out several alternative scenarios of nucleosomeunwrapping as inconsistent with the genomic data. ∗ [email protected] † [email protected] a r X i v : . [ q - b i o . GN ] M a y ukaryotic genomes are organized into arrays of nucleosomes [1]. Each nu-cleosome consists of a stretch of genomic DNA wrapped around a histone oc-tamer core [2]. The resulting complex of DNA with histones and other regula-tory and structural proteins is called chromatin [1, 3]. Arrays of nucleosomesform 10-nm ﬁbers which resemble beads on a string and fold into higher-orderstructures [4, 5, 6]. Depending on the organism and cell type, 75-90% of ge-nomic DNA is packaged into nucleosomes [1]. Since nucleosomal DNA is tightlywrapped around the histone octamer, its accessibility to various DNA-bindingproteins such as repair enzymes, transcription factors (TFs), polymerases, andrecombinases, is suppressed. The question of how DNA-binding proteins gainaccess to their target sites in vivo eﬃciently and robustly is one of the outstand-ing puzzles in chromatin biology.Two diﬀerent mechanisms for site exposure have been proposed: nucleosometranslocation by thermal ﬂuctuations or by ATP-dependent chromatin remod-eling factors [7, 8, 9], and transient unwrapping of nucleosomal DNA oﬀ thehistone octamer surface [10, 11, 12, 13]. Proteins can utilize spontaneous DNAunwrapping to bind to their target sites, which would favor further destabiliza-tion of the histone-DNA complex and binding of additional proteins (Fig. 1A).Since partial unwrapping of nucleosomal DNA is energetically less costly thannucleosome translocation, it is likely to play a major role in numerous DNA-mediated processes. For example, nucleosome “breathing” governs transcriptiondynamics of RNA polymerase [14], and may provide an explanation for fast DNArepair by photolyases [15].Partial unwrapping of nucleosomal DNA and subsequent diﬀerential acces-sibility of nucleosome-covered protein-binding sites were observed in single nu-cleosomes [16, 10, 17, 11, 18, 19, 20, 12, 13], di-nucleosomes [21], and multi-nucleosome arrays [22, 23], and modeled computationally [24, 25, 26, 27, 28, 29].Recently, nucleosome dyad positions were mapped with single base-pair (bp)precision in S.cerevisiae , resulting in a genome-wide collection of distances be-tween neighboring nucleosomes [30]. Surprisingly, almost 40% of these inter-dyad distances are less than 147 bp, indicating that at least one nucleosome inthe pair is partially unwrapped. Furthermore, there are distinct 10-11 bp peri-odic oscillations in the histogram of inter-dyad distances, suggesting a stepwiseunwrapping mechanism consistent with the pattern of histone-DNA contacts inthe crystal structure [2, 31, 32].We present a rigorous statistical mechanics approach to nucleosome unwrap-ping which explains the observed genome-wide distribution of inter-dyad dis-tances [30] as well as earlier experiments which probed diﬀerential accessibilityof nucleosome-covered binding sites [10, 11] and studied nucleosome-inducedcooperativity between DNA-binding factors [16, 13]. Using this approach, wereproduce both nucleosome occupancy and inter-dyad distance proﬁles in thevicinity of transcription start sites (TSS). The model neglects sequence speci-ﬁcity of histone-DNA interactions but requires potential barriers at gene promot-ers, which may be created by DNA-bound TFs and transcription pre-initiationcomplexes (PICs). Although sequence-speciﬁc nucleosome positioning is notcrucial for explaining many features of the distribution of inter-dyad distances,2ur framework allows us to predict both sequence-speciﬁc free energies of nucleo-some formation and the unwrapping potential from paired-end high-throughputsequencing datasets.

We use a high-resolution in vivo map of nucleosome dyad positions based onchemical modiﬁcation of engineered histones and DNA backbone cleavage byhydroxyl radicals [30]. With cleaved DNA segments sorted by molecular masson the agarose gel and the ∼ −

200 bp fraction sequenced using paired-end reads, the map provides a direct measurement of both dyad positions anddistances between adjacent dyads. Although superior to methods based onmicrococcal nuclease (MNase) digestion whose accuracy is aﬀected by MNasesequences preferences and its tendency to over- or under-digest DNA [33, 34, 35],the map is biased by unknown hydroxyl radical cutting preferences for twoalternate sites at each DNA strand [30].A genome-wide histogram of inter-dyad distances shows that at least 38.7% of all nucleosomes are partially unwrapped (blue line in Fig. 1C). In orderto study the energetics of unwrapping, we introduce a simple model based onthe 10-11 bp periodic pattern of histone contacts with the minor groove of thenucleosomal DNA [31, 32] (Fig. 1B). As DNA is peeled oﬀ each contact patch,its free energy increases because hydrogen bonds and favorable electrostaticcontacts between histone side chains and the DNA phosphate backbone arelost. However, once DNA breaks free from the contact patch, it may adoptmultiple conformations, which allows it to increase its entropy and thus lowerthe total free energy. The favorable entropic term grows with the extent ofunwrapping until the next contact patch is reached, completing one cycle inthe oscillatory energy proﬁle. The oscillations are superimposed on a straightline whose slope reﬂects the average free energy cost per bp of histone-DNAcontact loss. Additional details of the potential construction can be found inSI Appendix, Model A. The histone-DNA potential constructed in this way hasno sequence speciﬁcity, in contrast to the free energy cost of bending DNA intoa nucleosomal superhelix [36]. To a good approximation, we expect sequenceeﬀects to average out in the genome-wide histogram of inter-dyad distances (thisassumption is tested later).Thus we aim to reproduce the observed distribution of inter-dyad distanceswith a model in which nucleosome energetics is sequence-independent but tran-sient unwrapping is allowed. We compute the conditional probability P ( c + d | c )of ﬁnding the nucleosome dyad at bp c + d , given that the previous dyad islocated at bp c (Materials and Methods). Since inter-dyad distances cannotbe used to distinguish between symmetric and asymmetric unwrapping, we as-sume the former for simplicity. The model is ﬁt to the observed distributionof inter-dyad distances (SI Appendix). The free parameters of the model in-3lude the amplitude of the oscillations, the slope of the free energy proﬁle and a min(max) , the minimum (maximum) eﬀective length of the nucleosome particle(SI Appendix, Model A). The maximum extent of nucleosome unwrapping iscontrolled by a min , while a max is allowed to exceed 147 bp in order to accountfor the eﬀects of higher-order chromatin structure and linker histone deposition.We also ﬁt the relative frequency of hydroxyl radical DNA cleavage at the − The eﬀective length of the particle found in the ﬁt, a max = 163 bp, is greaterthan 147 bp, the length of the DNA in the nucleosome core [32]. Indeed, a max =147 bp is incompatible with the observed inter-dyad distribution (Fig. S3A,B).The model is less sensitive to the value of a min because extensively unwrappedparticles are energetically unfavorable and therefore are not frequently seen inthe data. The overall shape of the inter-dyad distribution is also sensitive to theslope of the energy proﬁle in Fig. 1B, providing a robust estimate of the averagenucleosome unwrapping energy (Fig. S3C). The ﬁtted value of the slope yields14 . B T for the average histone-DNA interaction energy in a fully wrappednucleosome.Thus the energy proﬁle in Fig. 1B describes both DNA interactions withthe histone octamer core (up to 73 bp from the dyad) and the eﬀects of higher-order chromatin structure, including, potentially, the attachment of Hho1p, theH1 linker histone of

S. cerevisiae , to the DNA immediately outside of the nu-cleosome core [37, 38, 39]. Although Hho1p is less abundant in yeast thanin higher eukaryotes, it is involved in higher-order chromatin organization, in-cluding chromatin compaction in stationary phase [40, 39]. Relatively little isknown about the molecular mechanism of H1 binding: there is no consensus yetwhether the binding is symmetric or asymmetric, or even what the extent of theH1 footprint is [37, 38]. H1 binding and other factors that mediate chromatinfolding into higher-order structures cause linker lengths to be discretized [1, 41].Linker length discretization can be described by a periodic decaying two-bodyeﬀective potential between neighboring nucleosomes, with the ﬁrst minimumapproximately 5 bp away from the nucleosome edge [1, 42, 43].Based on these observations, we have constructed two models for the energyproﬁle outside of the nucleosome core region. One model is a polynomial ﬁt thatextends the quasiperiodic proﬁle of the unwrapping energy through anothercycle (Fig. 1B; SI Appendix, Model A). The depth and the position of theﬁrst minimum outside of the nucleosome core are additional free parameters.As can be seen in Fig. S4B, our ﬁt robustly predicts the ﬁrst minimum to bepositioned 5-6 bp outside of the nucleosome core, in agreement with previous4 .5 1.5 2.5 3.5 4.5 5.5 6.5

Dyad − edge distance (bp)

Dyad u ha l f ( k B T ) − − − − Inter−dyad distance (bp) P r obab ili t y

50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.764

DataModelDataModel (unwrapping)Model (no unwrapping)

RMS: 9.996 ×10 -4 × BACD

Figure 1:

Genome-wide distribution of inter-dyad distances. (A) A nu-cleosome is a dynamic structure. Transient nucleosome unwrapping followedby factor binding prevents subsequent rewrapping, mediating further bindingevents. (B) Nucleosome energy proﬁle. The energy of a nucleosome that covers2 x + 1 bps is given by u SInuc = 2 u half ( x ) with symmetric unwrapping. The min-ima and maxima of the energy landscape are based on the crystal structure ofthe nucleosome core particle [31, 32]. Dark gray bars show where the histonebinding motifs interact with the DNA minor groove in the structure. Lightgray bars show where the DNA major groove faces the histones. The energyproﬁle was obtained by a polynomial ﬁt as described in SI Appendix, ModelA. (C) The inter-dyad distance distribution from a high-resolution nucleosomemap [30] (blue), and from the model with (red) and without unwrapping (gray).In the model without unwrapping, a min = a max = 147 bp and the ﬁtting param-eters are E b , µ and f (SI Appendix, Model A). RMS - total root-mean-squaredeviation between the model and the data. (D) Oscillations in the observed(blue) and predicted (red) inter-dyad distance distributions, obtained by sub-tracting a smooth background from the data and the model with unwrapping in(C). The smooth background was found by applying a Savitzky-Golay ﬁlter ofpolynomial order 3 with 31 bp length (using the sgolayfilt function from theSignal Processing Toolbox of MATLAB). Correlation refers to r osc , the linearcorrelation coeﬃcient between measured and predicted oscillations.5tudies [1, 42, 43]. The depth of this minimum is comparable to the depth ofthe unwrapping minima (Fig. S4, SI Appendix, Model A).The other model represents the energy proﬁle outside of the nucleosome coreby a linear function (Fig. S5A; SI Appendix, Model B). The two free parametersare the slope and the length of the line, which are related to the H1-DNAinteraction energy and the H1 footprint. This model is motivated by a picture ofthe H1 histone being gradually detached from its DNA site immediately outsideof the nucleosome core. This alternative scenario, although likely oversimpliﬁed,can be used to check the sensitivity of our results toward a particular energyproﬁle outside of the core region. We ﬁnd that the linear proﬁle ﬁts the overallshape of the inter-dyad distribution somewhat less well than the oscillatory one(compare RMS values in Figs. 1B and S5B), although the 10-11 bp periodicﬁne structure is reproduced in both cases (Figs. 1C, S5C). The optimal linearproﬁle is 7 bp long, yielding a symmetric H1 footprint with two 7 bp half-sites(Fig. S5D) and the H1-DNA interaction energy of ≈ B T (Fig. S5E).

Next we have tested the sensitivity of our ﬁts to the functional form of the un-wrapping free energy proﬁle. Although our primary model follows nucleosomecrystal structures in creating a quasi-periodic energy proﬁle with both 10 and11 bp modes, strictly periodic 10 or 11 bp sinusoidal proﬁles yield nearly thesame quality of ﬁt (Fig. S6; SI Appendix, Models C,D). Since the initial phaseof the oscillations is not determined by the crystal structure anymore, it be-comes another ﬁtting parameter. The ﬁtted initial phases in the 10 and 11-bpmodels make the periodic curves match the crystal structure further away fromthe dyad, where most of the observed unwrapping takes place (Fig. S6A). Thephases diverge closer to the dyad, where they are not as strongly constrainedby the data. RMS deviation is less sensitive to the initial phase than r osc , thelinear correlation between predicted and observed oscillations in the inter-dyadhistograms (Fig. S7). The primary peak in the dependence of r osc on the initialphase matches the crystal structure. There is also a secondary peak correspond-ing to the 5 bp shift in the unwrapping energy proﬁle, which in turn leads to the10 bp, in-phase shift in the distribution of inter-dyad oscillations (Fig. S7B).Since the inter-dyad distance distribution has a distinct oscillatory compo-nent, it is not surprising that a purely linear model of unwrapping energy doesnot ﬁt the data as well, although it does match its overall shape (Fig. S8A;SI Appendix, Model E). Less trivially, it was suggested on the basis of single-nucleosome unzipping experiments that nucleosome unwrapping proceeds with5-bp periodicity because histones interact with each DNA strand separatelywhere the DNA minor groove faces the histone octamer surface, creating twodistinct contact “subpatches” [44]. This single-molecule data was ﬁt to a modelwith a step-wise unwrapping free energy proﬁle [45]. Each step in the proﬁlecorresponds to breaking a point histone-DNA contact, and the steps occur ev-ery 5 .

25 bp on average. We do not ﬁnd any evidence for 5 bp periodicity ofnucleosome unwrapping in the genomic data. Indeed, both 5 bp step-wise and6 bp periodic sinusoidal proﬁles ﬁt the data poorly, about as well as the linearmodel (Fig. S8B,C). Even the 10-bp step-wise unwrapping proﬁle, while clearlyhaving the right periodicity, does not ﬁt the data as well as the structure-basedmodel (Fig. S8D). This observation suggests that the picture of gradual loss offavorable ﬁnite-range histone-DNA interactions followed by gain in DNA con-formational entropy is closer to reality than abrupt disruption of short-rangehistone-DNA contacts. A direct comparison of single-molecule and genome-wide energy proﬁles is unfortunately obscured by the fact that the reportedsingle-nucleosome unzipping experiments are speciﬁc to the 601 nucleosome-forming sequence [46], in contrast to our methodology which provides the aver-age, sequence-independent picture of unwrapping energetics.

Fig. 2A, in which genes are sorted by the promoter length and aligned by theTSS, shows a canonical picture of nucleosomes depleted in promoters and well-positioned over coding regions [50, 51]. Interestingly, promoter nucleosomeshave shorter inter-dyad distances and are therefore more unwrapped (Fig. 2B).When averaged over all genes, the number of dyads at a given bp and theaverage inter-dyad distance at that bp are strongly correlated (compare blueand red lines in Fig. 2C). The proﬁle of average inter-dyad distances is alsocorrelated with the distribution of DNA fragment lengths in an MNase assaywhich mapped both nucleosomes and subnucleosome-size particles by paired-end sequencing (Fig. S9A, green line in Fig. 2C) [47]. The two proﬁles do notcoincide completely because inter-dyad distances also depend on the distributionof linker lengths. The observed behavior is opposite of the naive expectationthat unwrapping increases with occupancy due to nucleosome crowding, but canbe reproduced in a simple sequence-independent model in which nucleosomesphase oﬀ a potential barrier placed in the promoter region (Fig. S10) [52].Partially unwrapped nucleosomes tend to have elevated histone turnoverrates [49], both around TSS and genome-wide (Figs. 2C,S9B,S9D). We ﬁnd thatnucleosomes at loci enriched in PICs [48] are also more unwrapped (Figs. 2C,S9C).Finally, inter-dyad distances tend to increase with the fraction of A/T nu-cleotides, indicating that nucleosomes occupying A/T-rich sequences have longerfootprints genome-wide (Fig. S9E). We note that it is misleading to equateinter-nucleosome distances with peak-to-peak distances in the average proﬁleof nucleosome dyad counts (blue line in Fig. 2C). The peak-to-peak distancesare 164-165 bp, while the average inter-dyad distance for the nucleosomes inthe [TSS,TSS+1000] region is 149.6 bp. Thus nucleosome unwrapping is muchmore common than could be predicted by mapping single-nucleosome positionsalone. 7

165 bp 164 bp 164 bp 164 bp 165 bp A v e r age nu m be r o f d y ad s (cid:132)(cid:18)(cid:17)(cid:17)(cid:17) (cid:132)(cid:22)(cid:17)(cid:17) TSS +500 +1000145146147148149150151152153 A v e r age d i s t an c e t o t he ne x t d y ad ( bp ) Distance (bp) A v e r age f r ag m en t s i z e ( bp ) –2 –1 +1 +2 +3 +4 +5 +6 Position (bp)Nucleosome dyad counts (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)

TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:17)(cid:19) (cid:18)(cid:17)(cid:18)(cid:19)

Position (bp) (cid:34)(cid:87)(cid:70)(cid:83)(cid:66)(cid:72)(cid:70)(cid:1)(cid:74)(cid:79)(cid:85)(cid:70)(cid:83)(cid:132)(cid:69)(cid:90)(cid:66)(cid:69)(cid:1)(cid:69)(cid:74)(cid:84)(cid:85)(cid:66)(cid:79)(cid:68)(cid:70) (cid:132) (cid:132) (cid:132)

BAC

Figure 2:

Nucleosome unwrapping in the vicinity of transcription startsites. (A) Distribution of nucleosome dyad counts [30] near the TSS. 4763 ver-iﬁed

S.cerevisiae open reading frames (ORFs) were aligned by their TSS andsorted by promoter lengths. Each horizontal line corresponds to one ORF. (B)Distribution of the average distance between neighboring dyads. For each bp,the distances between a dyad at that bp and all neighboring dyads were aver-aged. ORFs are sorted as in (A). In (A) and (B), values at bps without dyadswere obtained by interpolation, and heatmaps were smoothed using a 2D Gaus-sian kernel with σ = 3 pixels. (C) Data in (A), (B) and Fig. S9A-C averagedover all genes. Blue: nucleosome dyad counts, red: average distance betweenneighboring dyads, green: average length of DNA-bound particles mapped byMNase digestion [47] (see Fig. S9A for details). Curve with light gray back-ground: combined occupancy of 9 PICs (TBP, TFIIA, TFIIB, TFIID, TFIIE,TFIIF, TFIIH, TFIIK, PolII) [48], curve with light pink background: aver-age histone turnover rate [49]. The peaks in the dyad count proﬁle (blue) aremarked with orange ovals representing nucleosomes, and peak-to-peak distancesare shown. 8

25 50 75 100 125 150 -6 -5 -4 -3 -2 -1 Probed Site B s t U I T aq α I B sr I B s t U I T aq α I B s a J I B s t N I T aq α I B s a H I UnwrappingNo unwrapping A p open p open P s t I H i nd III H a e III A v a II M s p I B a m H I H ha I M s e I S t y I B f a I P m l I -6 -5 -4 -3 UnwrappingNo unwrapping

Position (bp)B

Figure 3:

Probability of binding site exposure within a nucleosome.

The solid blue and dashed green lines represent model predictions with andwithout unwrapping, respectively. In the latter case, a min = a max = 147 bp andall other parameters are adopted from the model with unwrapping. The dyad isﬁxed at bp 74. (A) Restriction enzyme sites inserted into the 5S rRNA sequenceat locations indicated by the centers of vertical red bars [10]. (B) Restrictionenzyme sites inserted into the 601 sequence at locations indicated by the centersof the vertical red bars in the middle of each group [11]. Each group of threebars corresponds to independent measurements in which the 601 sequence wasﬂanked by diﬀerent DNA sequences. In (A) and (B), the height of each bar isthe equilibrium constant for site exposure averaged over multiple experiments(error bars show standard deviation). Partial unwrapping of nucleosomal DNA results in diﬀerential accessibility offactor binding sites with respect to their position inside the nucleosome: sites onthe edges are more accessible than those closer to the dyad. In contrast, all-or-none nucleosome formation should not be sensitive to the binding site position– a nucleosome, once unfolded, liberates its entire site. Polach and Widom [10]studied diﬀerential accessibility of 6 restriction enzymes to their target sites.The sites were placed at various positions throughout the 5S rRNA nucleosomalsequence (Fig. 3A). A later study used the 601 sequence and an extended setof 11 restriction enzymes (Fig. 3B) [11]. These studies measured equilibriumconstants for site exposure K confeq , which are related to the probability for a siteto be accessible for binding: p open = K confeq / (1 + K confeq ) ≈ K confeq [29].We use our crystal structure-based unwrapping model (Fig. 1B; SI Appendix,Model A) to ﬁt the data on site accessibility [10, 11]. Here the system consistsof a single nucleosome and asymmetric unwrapping is allowed. We assume thata site becomes accessible for the enzyme only after an additional number ofbps, d , have been unwrapped from the histone octamer surface [29]. We alsoassume that once the dyad is unwrapped from either end, the entire nucleosome9s unfolded. Then the probability for a binding site to be accessible is given by p open ( x ) =  − Occ nuc ( x + d ) for x < x d − d − Occ nuc ( x d ) for x d − d ≤ x ≤ x d + d − Occ nuc ( x − d ) for x > x d + d, where x ∈ [1 , x d = 74 bp is the position of the dyad, and the nucleosomeoccupancy is given by Eq. (3).Besides d , the ﬁtting parameters of the model are the overall slope of thebinding energy (cid:15) and the histone chemical potential µ (all other parametersare as in SI Appendix, Model A, with the exception of a min = 1 bp, a max =147 bp). For the 5S rRNA measurements [10], we obtain (cid:15) = − .

13 k B T/bp, µ = − . B T, and d = 23 bp. For the 601 measurements [11], (cid:15) = − .

16 k B T/bp, µ = − . B T, and d = 45 bp. As expected, the nucleo-some formation energy of the 601 sequence is 147 × ( (cid:15) − (cid:15) ) = 4 . B T morefavorable than that of the 5S sequence, in agreement with the experimentallymeasured diﬀerence of 4.9 k B T [53]. The nucleosome formation energy of the601 sequence is 24 . B T, close to the 23 . B T estimate made on the basis of601 unzipping experiments [45]. Interestingly, the 601 DNA has to unwrap moreextensively past the binding site to allow access to restriction enzymes.Overall, our model reproduces the observed diﬀerential accessibility of re-striction enzyme binding sites with respect to the nucleosome dyad (Fig. 3).The only outliers are

Sty

I and

Bfa

I binding sites in the 601 series which werenot used in the ﬁt and which, if unwrapping proceeds from the ends, cannot bemore open than the

Pml

I site located further away from the dyad.

If multiple biding sites reside within a single nucleosome, binding of one factormakes the other sites more accessible, in a phenomenon known as nucleosome-induced cooperativity [16, 18, 25]. The cooperativity disappears in the absenceof nucleosomes and reduces in extent with the distance between consecutivesites [16]. Moreover, the cooperativity is not observed if the two sites are on theopposite sides of the nucleosome dyad [13].We can use our model of nucleosome unwrapping (SI Appendix, Model Awith a min = 1 bp, a max = 147 bp) to capture all these aspects of nucleosome-induced cooperativity (Fig. 4). Speciﬁcally, for sites located more than 40 bpaway from the dyad site accessibility is strongly enhanced if DNA unwrappingis allowed (Fig. 4A). Interestingly, cooperativity between two TFs bound on thesame side of the dyad is observed both with and without unwrapping (Fig. 4B).However, without unwrapping it is impossible to show that binding on theopposite sides of the dyad is not cooperative, as observed in experiments [13](Fig. 4C). Furthermore, the decrease of cooperativity with distance [16] cannotbe reproduced (Fig. 4D). Thus modeling transient nucleosome unwrapping isnecessary for understanding how TFs and other DNA-binding proteins gainaccess to their nucleosome-covered sites.10 .7 Sequence-dependent nucleosome positioning and un-wrapping Here we test the assumption that sequence speciﬁcity of nucleosome formationmay be neglected when inferring nucleosome unwrapping potential from genomicdata. In general, the total free energy u nuc ( k, l ) of a nucleosome occupying bps k, . . . , l is a sum of a sequence-independent term u SInuc ( k, l ) describing contactsbetween histone side chains and the DNA phosphate backbone (Fig. 1B), anda sequence-dependent term u SDnuc ( k, l ) describing DNA bending into the nucle-osomal superhelix [36]. Our approach (Eq. (4), SI Appendix) can be used toinfer both contributions from high-throughput maps of nucleosome positions.Since the number of diﬀerent unwrapped species may be as high as severalthousand depending on the maximum extent of unwrapping and the assump-tion of unwrapping symmetry, we expect robust inference to require levels ofread coverage that are not currently available, especially in higher eukaryotes.Furthermore, resolution of MNase-based nucleosome maps is insuﬃcient for cap-turing the 10-11 bp periodicity of the unwrapping potential. In the absence ofhigh-resolution, high-coverage experimental data, we have tested our ability topredict nucleosome unwrapping energetics using a realistic model system.Speciﬁcally, we assumed that u SDnuc depends only on the number of mono-and dinucleotides in the nucleosomal DNA [54] (SI Appendix). The length ofnucleosomal DNA ranges from a min = 74 bp to a max = 147 bp; all other param-eters of u SInuc are as in SI Appendix, Model A. Using Eq. (1) with µ = −

13 k B T,we computed the exact nucleosome distribution n nuc1 ( k, l ) for the S.cerevisiae chromosome I. We sampled paired-end nucleosomal reads [ k, l ] from n nuc1 ( k, l )until a desired level of read coverage (mean number of reads starting at a bp)was reached. From this ﬁnite sample, we constructed a histogram of nucleosomelengths P (∆) and used it to optimize the parameters of u SInuc (in each step ofthe ﬁtting procedure, u SInuc alone was used to predict n nuc1 , which in turn gave P (∆)). Next, we used Eq. (4) to predict u nuc ( i, j ), subtracted u SInuc from it (as-suming that the dyad is at the mid-point of each particle), and ﬁt u SDnuc to therest. Finally, u SInuc + u SDnuc were used to compute ˜ n nuc1 , which was then comparedwith the exact proﬁle n nuc1 .We ﬁnd that we are able to reproduce the unwrapping potential even if nu-cleosome formation is sequence-speciﬁc (Fig. S11A). However, the overall slopeof the potential is overpredicted because the histogram of particle lengths is af-fected by well-positioned nucleosomes which have negative sequence-dependentenergies. The average of these energies biases the slope. Nucleosome occupancypredicted using u SInuc + u SDnuc yields a reasonable correlation with the exact pro-ﬁle even at 1 read per bp, although at least 10 reads per bp are required toreproduce dyad positions (Fig. S11B).11

20 40 60 80 100 120 14000.20.40.60.81

Position (bp) O cc upan cy Nucleosome (unwrapping)TF (unwrapping)Nucleosome (no unwrapping)TF (no unwrapping)00.20.40.60.81

Distance from the dyad (bp) TF b i nd i ng p r obab ili t y UnwrappingNo unwrapping0 20 40 60 (cid:132)(cid:19)(cid:23) (cid:132)(cid:19)(cid:21) (cid:132)(cid:19)(cid:19) (cid:132)(cid:19)(cid:17) (cid:132)(cid:18)(cid:25) (cid:132)(cid:18)(cid:23) (cid:132)(cid:18)(cid:21)(cid:17)(cid:17)(cid:15)(cid:19)(cid:17)(cid:15)(cid:21)(cid:17)(cid:15)(cid:23)(cid:17)(cid:15)(cid:25)(cid:18)

TF chemical potential (k B T) TF b i nd i ng p r obab ili t y (cid:54)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:18)(cid:1)(cid:53)(cid:39)(cid:54)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:19)(cid:1)(cid:53)(cid:39)(cid:84)(cid:47)(cid:80)(cid:1)(cid:86)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:18)(cid:1)(cid:53)(cid:39)(cid:47)(cid:80)(cid:1)(cid:86)(cid:79)(cid:88)(cid:83)(cid:66)(cid:81)(cid:81)(cid:74)(cid:79)(cid:72)(cid:15)(cid:1)(cid:19)(cid:1)(cid:53)(cid:39)(cid:84) Unwrapping. 1 TFUnwrapping. 2 TFsNo unwrapping. 1 TFNo unwrapping. 2 TFs (cid:132)(cid:19)(cid:23) (cid:132)(cid:19)(cid:21) (cid:132)(cid:19)(cid:19) (cid:132)(cid:19)(cid:17) (cid:132)(cid:18)(cid:25) (cid:132)(cid:18)(cid:23) (cid:132)(cid:18)(cid:21)(cid:17)(cid:17)(cid:15)(cid:19)(cid:17)(cid:15)(cid:21)(cid:17)(cid:15)(cid:23)(cid:17)(cid:15)(cid:25) TF chemical potential (k B T) TF b i nd i ng p r obab ili t y Distance (bp) TF b i nd i ng p r obab ili t y Unwrapping. 2 TFsNo unwrapping. 2 TFs D A BC D

Figure 4:

Modes of nucleosome-induced cooperativity. (A) TF and nucle-osome occupancy with and without unwrapping. The TF binding site occupiesbps 11–20. Inset: TF binding probability as a function of the distance betweenthe nucleosome dyad and the proximal edge of the TF site, with and withoutunwrapping. (B) TF titration curves for one TF site vs. two TF sites locatedon the same side of the dyad. Site 1 occupies bps 11–20, site 2 occupies bps31–40. Inset: Binding site locations. (C) Same as (B), but with the two TFsites located on the opposite sides of the dyad. Site 1 occupies bps 11–20, site2 occupies bps 117–126. Inset: Binding site locations. (D) Nucleosome-inducedcooperativity as a function of the distance between two TF binding sites. Thebinding probability of the second TF is shown. Site 1 occupies bps 11-20, whilethe position of the second site is variable. Inset: Deﬁnition of the distancebetween the two binding sites. In all panels, free energy of a fully wrappednucleosome is − log(10 ) k B T; histone chemical potential is log(10 − ) k B T; TFbinding energy is − log(10 ) k B T to cognate sites, and − log(10 ) k B T to allother sites; TF chemical potential is log(10 − ) k B T unless varied. Asymmetricunwrapping is allowed; in the model without unwrapping, a min = a max = 147 bpand all other parameters are the same as in the model with unwrapping. Since DNA wrapped in nucleosomes is sterically occluded, nucleosome formationprevents access of DNA-binding factors such as TFs or polymerase subunits totheir cognate sites. Transient DNA unwrapping oﬀ the histone octamer surface,extensively studied in model systems [16, 10, 17, 11, 18, 19, 20, 12, 13, 21, 22, 23],can facilitate such binding events but its importance on the genomic scale hasbeen unclear. MNase-based maps of nucleosome positions employing paired-end sequencing have identiﬁed numerous subnucleosome-size particles [47, 55].12owever, it is impossible to deduce from these experiments which particlescorrespond to unwrapped nucleosomes. In addition, there is a concern thatDNA fragments may have been under- or overdigested by MNase.These challenges have been overcome in a recent experiment in which chemi-cally modiﬁed histones are used to map nucleosome dyads with single-bp resolu-tion [30]. Used in conjunction with paired-end sequencing, the experiment pro-vides information about both dyad positions and the distances between neigh-boring dyads. The latter is especially important since it probes the two-particledistribution in the same cell. The histogram of inter-dyad distances shows thatat least 38.7% of all

S.cerevisiae nucleosomes are partially unwrapped. Genepromoters are enriched in such nucleosomes, which also have higher histoneturnover rates. Over coding regions, the distribution of wrapped DNA lengthsoscillates in phase with the nucleosome occupancy proﬁle, so that more stablenucleosomes are also more wrapped.We have developed a statistical mechanics description of nucleosome arrayswhich allows DNA unwrapping while rigorously treating steric exclusion andsequence speciﬁcity of nucleosome formation. We have shown that prominent10-11 bp periodicity in the distribution of inter-dyad distances [30] can be ex-plained using an unwrapping energy proﬁle based on the pattern of histone-DNAcontacts in the nucleosome crystal structures. We were able to rule out alterna-tive scenarios of step-wise unwrapping and 5 bp periodicity of the unwrappingprocess. Furthermore, ﬁtting the distribution of inter-dyad lengths required ac-counting for linker length discretization, commonly thought to be imposed bychromatin ﬁber formation [41].Our model yields estimates of nucleosome unwrapping energies consistentwith previous biophysical experiments, and accounts for single-nucleosome ob-servations which show that nucleosome-covered binding sites closer to the edgeof the nucleosome are more easily accessible to DNA-binding factors and thatbinding of the ﬁrst factor enhances binding of subsequent factors on the sameside of the dyad. Finally, our approach reproduces the periodic distribution ofwrapped DNA lengths in the vicinity of TSS if potential barriers are placedin the promoters. The barriers may be created in vivo by PICs [48], otherDNA-bound factors, and chromatin remodelers. The extent of nucleosome un-wrapping in the yeast genome suggests that its treatment should be included inall future models of nucleosome positioning and chromatin energetics.

Here we outline the exact theory for T molecular species simultaneously inter-acting with one-dimensional DNA (details and extensions are provided in SIAppendix). The DNA-bound particles are subject to steric exclusion and mayalso be partially unwrapped. Let u s ( k, l ) ( s ∈ { , , . . . , T } ) denote the bindingenergy of a particle of type s attached to bps k, . . . , l . One can show that the13ne-body distribution of particles of type t is given by n t ( k, l ) = 1 Z Z − ( k ) (cid:104) t, k | z | t, l (cid:105) Z + ( l ) , (1)where Z is the grand canonical partition function, (cid:104) t, k | z | s, l (cid:105) = e β [ µ s − u s ( k,l )] δ t,s , β = 1 /k B T is the inverse temperature, µ s is the chemical potential of particlesof type s , and δ t,s is the Kronecker symbol. If the DNA length is L , vectors | t, k (cid:105) are T L -dimensional with a 1 at position ( t − L + k and 0 everywhereelse. Similarly, the nearest-neighbor two-body distribution function is given by n t,s ( i, j ; k, l ) = 1 Z Z − ( i ) (cid:104) t, i | z | t, j (cid:105) Θ( k − j ) (cid:104) s, k | z | s, l (cid:105) Z + ( l ) , (2)where Θ is the Heaviside step function, and Z − ( k ) and Z + ( k ) are partial par-tition functions for the DNA segments [1,k) and (k,L], respectively. Z − ( k ), Z + ( k ), and Z can be computed recursively, as shown in SI Appendix. UsingEq. (1), one can ﬁnd particle occupancy:Occ t ( i ) = i (cid:88) k = i − a t max +1 k + a t max − (cid:88) l =max( i,k + a t min − n t ( k, l ) , (3)where a t min ( a t max ) is the minimum (maximum) length of a particle of type t .The inverse problem of predicting DNA binding energies from one-particledistributions can also be solved: β [ u t ( i, j ) − µ t ] = − ln (cid:20) n t ( i, j ) ZZ − ( i ) Z + ( j ) (cid:21) , (4)where Z, Z − , Z + are again found recursively using only one-particle distribu-tions n t ( i, j ) as input (SI Appendix). Eq. (4) enables us to disentangle intrinsicsequence preferences and unwrapping energetics from steric eﬀects. To predict the distribution of inter-dyad lengths, we compute the conditionalprobability of having a nucleosome with the dyad at bp c + d , given that theadjacent upstream nucleosome has the dyad at bp c : P ( c + d | c ) = N ( c, c + d ) N ( c ) , (5)where the probability distributions of the nucleosome centers can be computedusing Eqs. (1) and (2) for a single particle type: N ( c ) = (cid:88) ∆ n nuc1 ( c − ∆ , c + ∆ ) ,N ( c, c + d ) = (cid:88) ∆ , ∆ n nuc , nuc2 ( c − ∆ , c + ∆ ; c + d − ∆ , c + d + ∆ ) . , +1 are lengths of the particles centered at bp c and c + d , respectively.To estimate P ( c + d | c ), we use c = 5 kbp and a box of length L = 10 kbp, sothat the boundaries of the box are far away. We ignore nucleosome sequencespeciﬁcity, convolve P ( c + d | c ) with a kernel to account for site-speciﬁc chemicalcleavage bias (SI Appendix), and ﬁt model parameters to reproduce the observeddistribution of inter-dyad distances. Acknowledgments

We thank Leonid Mirny for insightful discussions. This research was supportedby National Institutes of Health (R01 HG004708 to A.V.M.). A.V.M. is anAlfred P. Sloan Research Fellow. 15 upplementary Information

We consider a problem of mutually interacting particles (one-dimensional rods)that can be reversibly adsorbed to a one-dimensional lattice of L sites [here,DNA base pairs (bps)]. In order to model partial unwrapping of nucleosomalDNA oﬀ the surface of histone octamers, we allow the particles to cover avariable number of bps between a min and a max . We assume that the particlescannot overlap while they are attached to the lattice. This is implementedusing hard-core interactions between adjacent particles. There are also hardwalls at the ends of the lattice so that particles are prevented from running oﬀit. In addition, we allow generic two-body interactions between nearest-neighborparticles.The attachment of a particle to the DNA modiﬁes the total energy of the sys-tem in a sequence-speciﬁc manner. Physically, the binding energy may have con-tributions from DNA bending, electrostatic interactions, hydrogen bond forma-tion, van der Waals contacts, etc. Thus a particle which covers bps k, k +1 , . . . , l has a total one-body binding energy u ( k, l ). Note that for pairs of coordinates( k, l ) such that l − k + 1 > a max or l − k + 1 < a min , u ( k, l ) = ∞ , because all par-ticles must have the length between a min and a max bps. The theory presentedbelow is valid for arbitrary binding energies u ( k, l ).Let Φ( j, k ) be the two-body interaction between a pair of nearest-neighborparticles which cover base pairs . . . , j − , j and k, k +1 , . . . ( k > j ) (Fig. S1). Inthe case of nucleosomes, such interactions may be used to account for the eﬀectsof higher-order chromatin structure [52]. Although we do not focus on two-bodyinteractions in this work, they are included below for the sake of generality. Weimpose Φ( j, k ) = (cid:26) ∞ if k ≤ j,V ( k − j −

1) if k > j, where V ( d ) is an arbitrary interaction potential which depends only on thelinear distance d between two neighboring particles.For a ﬁxed number of particles attached to the DNA, N , the canonicalpartition function is Q N = (cid:88) { i n =1 ...L } n ∈{ ,..., N } e − βu ( i ,i ) e − β Φ( i ,i ) e − βu ( i ,i ) . . . × e − βu ( i N − ,i N − ) e − β Φ( i N − ,i N − ) e − βu ( i N − ,i N ) , (S1)where β = 1 /k B T is the inverse temperature ( k B is the Boltzmann constant).Note that with our deﬁnitions of one-body energies, two-body interactions and16 (j, k)u(i, j) u(k, l)i j k l Figure S1: Schematic illustration of one-body and two-body potentials in amulti-nucleosome system. Nucleosomes may be partially unwrapped, resultingin variable DNA footprints.hard-wall boundary conditions, only legitimate conﬁgurations of non-overlappingparticles will contribute to Eq. (S1).In order to simplify the notation, we introduce two L × L matrices: (cid:104) k | e | l (cid:105) = e − βu ( k,l ) , (cid:104) k | w | l (cid:105) = e − β Φ( k,l ) . Here (cid:104) k | M | l (cid:105) represents the element of matrix M in row k and column l . | l (cid:105) is a column vector of dimension L with 1 at position l and 0 everywhere else,and (cid:104) k | is a row vector with 1 at position k and 0 otherwise.Let | J (cid:105) be a vector of dimension L with 1 at every position. Eq. (S1) gives Q N = (cid:40) (cid:104) J | ( ew ) N − e | J (cid:105) if N ≥ N = 0.If the particles are allowed to attach and detach from the lattice, the systemhas a variable number of particles, and the grand-canonical partition functionis Z = N max (cid:88) N =0 e βNµ Q N = 1 + N max (cid:88) N =1 (cid:104) J | ( zw ) N − z | J (cid:105) = 1 + ∞ (cid:88) M =0 (cid:104) J | ( zw ) M z | J (cid:105) = 1 + (cid:104) J | ( I − zw ) − z | J (cid:105) , (S2)where µ is the chemical potential, N max is the maximum number of particlesthat can ﬁt on L bp, I is the identity matrix, and (cid:104) k | z | l (cid:105) = e β [ µ − u ( k,l )] ≡ ζ ( k, l ).17ote that all particle conﬁgurations with N > N max do not contribute to Z ,allowing us to extend the upper limit from N max to ∞ .From the partition function, we can compute s -particle distribution functions(see the chapter by Stell in [56]): n ( k, l ) = ζ ( k, l ) Z δZδζ ( k, l ) ,n ( i, j ; k, l ) = ζ ( i, j ) ζ ( k, l ) Z δ Zδζ ( i, j ) δζ ( k, l ) , and in general n s ( i L , i R ; . . . ; i sL , i sR ) = ζ ( i L , i R ) . . . ζ ( i sL , i sR ) Z δ s Zδζ ( i L , i R ) . . . δζ ( i sL , i sR ) . Using these relations, we obtain the one-particle distribution n ( k, l ) = 1 Z (cid:104) J | ( I − zw ) − | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) , (S3)and the two-particle distribution n ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | i (cid:105)(cid:104) i | z | j (cid:105)(cid:104) j | w ( I − zw ) − | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) . (S4)In particular, the nearest-neighbor two-particle distribution is given by n ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | i (cid:105)(cid:104) i | z | j (cid:105)(cid:104) j | w | k (cid:105)(cid:104) k | z | l (cid:105)(cid:104) l | ( I − wz ) − | J (cid:105) . Eqs. (S3) and (S4) allow an obvious interpretation. To ﬁnd the probabil-ity of starting a particle covering positions from k to l [Eq. (S3)], we have toadd statistical weights of all the conﬁgurations that contain a particle at thatposition, and divide the resulting sum by the partition function. Similarly, inorder to ﬁnd the probability of a pair of particles, one covering positions i to j and the other covering positions k to l [Eq. (S4)], we need to sum statisticalweights of all the conﬁgurations containing that pair of particles, and divide bythe partition function.With the one-particle distribution, we can deﬁne the occupancy of a bp i asthe probability of ﬁnding that bp in contact with any particle. In other words,we need to sum the probabilities of all conﬁgurations in which particles coverbp i : Occ( i ) = i (cid:88) k = i − a max +1 k + a max − (cid:88) l =max( i,k + a min ) n ( k, l ) . Note that 1 − Occ( i ) is the probability that bp i is not covered by any particles.18 (t,j; s,k) u t (i, j) u s (k, l)i j k l Figure S2: Schematic illustration of one-body and two-body potentials in asystem with multiple particle types. The model allows all particles to be inmultiple stages of unwrapping. In practice, we allow nucleosomes to be par-tially unwrapped but the transcription factors (TFs) always have ﬁxed DNAfootprints.

The above formalism can be easily extended to the case in which T types ofparticles can attach to the one-dimensional lattice. Let the binding energy of aparticle of type t ∈ { , . . . , T } that covers bps i to j on the lattice be u t ( i, j ).The interaction between a particle of type t ending at position k , and the nextparticle of type s starting at position l will be denoted by Φ( t, k ; s, l ) (Fig. S2).Each particle of type t , when attached to the DNA, is in contact with a numberof bps between a t min and a t max . Thus u t ( i, j ) = 0 if i and j do not satisfy theconstraints a t min ≤ j − i + 1 ≤ a t max . Also, Φ( t, i ; s, j ) = ∞ for j ≤ i sincethe particles cannot overlap. With this notation, the grand-canonical partitionfunction becomes Z = (cid:88) all states e − β [ u t ( i L ,i R ) − µ t ] e − β Φ( t ,i R ; t ,i L ) e − β [ u t ( i L ,i R ) − µ t ] . . . , where µ t is the chemical potential of the particles of type t . The sum is over allconﬁgurations, which can have variable numbers of particles of any type.Using the matrix notation, (cid:104) t, k | z | s, l (cid:105) = e − β [ u t ( k,l ) − µ t ] δ t,s ≡ ζ t ( k, l ) δ t,s , (cid:104) t, k | w | s, l (cid:105) = e − β Φ( t,k ; s,l ) , where δ t,s is the Kronecker delta symbol, the partition function can be writtenas Z = (cid:88) all states (cid:104) t , i L | z | t , i R (cid:105)(cid:104) t , i R | w | t , i L (cid:105)(cid:104) t , i L | z | t , i R (cid:105) . . . | t, i (cid:105) has dimension T L and only one non-zero element, set to 1for normalization. For example, | , i (cid:105) vectors have a 1 at position i , | , i (cid:105) vectorshave a 1 at position L + i , etc. As above, we denote by | J (cid:105) a vector in whichall T L elements are equal to 1, to obtain Z = 1 + (cid:104) J | ( I − zw ) − z | J (cid:105) , equivalent to Eq. (S2).Similarly to the previous case of a single particle type, we can compute theone-particle density n t ( k, l ) = ζ t ( k, l ) Z δZδζ t ( k, l )= 1 Z (cid:104) J | ( I − zw ) − | t, k (cid:105)(cid:104) t, k | z | t, l (cid:105)(cid:104) t, l | ( I − wz ) − | J (cid:105) . (S5)We can also obtain the two-particle density n t,s ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | t, i (cid:105)(cid:104) t, i | z | t, j (cid:105)× (cid:104) t, j | w ( I − zw ) − | s, k (cid:105)(cid:104) s, k | z | s, l (cid:105)(cid:104) s, l | ( I − wz ) − | J (cid:105) (S6)and the nearest-neighbor two-particle density n t,s ( i, j ; k, l ) = 1 Z (cid:104) J | ( I − zw ) − | t, i (cid:105)(cid:104) t, i | z | t, j (cid:105)× (cid:104) t, j | w | s, k (cid:105)(cid:104) s, k | z | s, l (cid:105)(cid:104) s, l | ( I − wz ) − | J (cid:105) . (S7)Eqs. (S6) and (S7) give the joint probability that a particle of type t covers bps i to j , while a second particle of type s covers bps k to l .Using Eq. (S5) we can compute occupancy for each type of particles t andfor each bp i : Occ t ( i ) = i (cid:88) k = i − a t max +1 k + a t max − (cid:88) l =max( i,k + a t min ) n t ( k, l ) . In the following sections we will focus on the one-particle density function.

A straightforward application of Eqs. (S5) and (S6) entails computationallyintensive matrix manipulations. Fortunately, for particles that interact onlythrough hard-core repulsion rather than long-range two-body interactions, theone-particle distribution can be computed recursively and therefore much moreeﬃciently. 20 .2.1 General case

With multiple particle types, Eq. (S5) can be rewritten as n t ( i, j ) = 1 Z Z − ( i ) (cid:104) t, i | z | t, j (cid:105) Z + ( j ) , where Z − ( i ) and Z + ( j ) are the partition functions for the domains [1 , i ) and( j, L ], respectively. Note that in the case of hard-core interactions alone, Z − ( i )and Z + ( j ) do not depend on the type of the particle occupying positions i through j .In the case of steric exclusion alone, these partial partition functions satisfythe following recursion relations: Z − ( i ) = Z − ( i −

1) + (cid:88) s (cid:88) i − a s max ≤ j ≤ i − a s min Z − ( j ) (cid:104) s, j | z | s, i − (cid:105) , (S8)and Z + ( i ) = Z + ( i + 1) + (cid:88) s (cid:88) i + a s min ≤ j ≤ i + a s max (cid:104) s, i + 1 | z | s, j (cid:105) Z + ( j ) . (S9)Here each particle type s has two characteristic lengths, corresponding to itsminimum and maximum DNA footprints, respectively: a s min and a s max . Theboundary conditions are Z − (1) = 1 and Z + ( L ) = 1. The full partition functionis given by Z = Z − ( L + 1) = Z + (0). Note that all unphysical terms for whichbound particles run oﬀ the lattice automatically vanish from Eqs. (S8) and (S9).To avoid numeric instabilities, the recursion should be done in log space. Let F ( i ) = ln Z − ( i ) ,R ( i ) = ln Z + ( i ) . With this notation, Eqs. (S8) and (S9) become F ( i ) = F ( i −

1) + ln (cid:110) (cid:88) s (cid:88) i − a s max ≤ j ≤ i − a s min e F ( j ) − F ( i − β [ µ s − u s ( j,i − (cid:111) , (S10) R ( i ) = R ( i + 1) + ln (cid:110) (cid:88) s (cid:88) i + a s min ≤ j ≤ i + a s max e R ( j ) − R ( i +1)+ β [ µ s − u s ( i +1 ,j )] (cid:111) , with the boundary conditions F (1) = R ( L ) = 0.Then the one-particle distribution function is n t ( i, j ) = e F ( i )+ R ( j ) − ln Z + β [ µ t − u t ( i,j )] , where ln Z = F ( L + 1) = R (0).Although we do not show it here, the two-particle distribution can be com-puted in a very similar way. The only new ingredient in Eq. (S6) is the par-tition function for the box with walls at two arbitrary positions, Z ( t, j, s, k ) ≡(cid:104) t, j | w ( I − zw ) − | s, k (cid:105) . This partition function can be computed recursively,exactly as the partial partition functions Z ± discussed above.21 .2.2 Special case: No unwrapping The special case in which all particles are fully attached to their DNA sites (i.e.,there is no DNA unwrapping) can be easily obtained from our general formalism.Indeed, in this case we restrict a s min = a s max = a s in Eq. (S10), obtaining F ( i ) = F ( i −

1) + ln (cid:110) (cid:88) s e F ( i − a s ) − F ( i − β [ µ s − u s ( i − a s ,i − (cid:111) ,R ( i ) = R ( i + 1) + ln (cid:110) (cid:88) s e R ( i + a s ) − R ( i +1)+ β [ µ s − u s ( i +1 ,i + a s )] (cid:111) . As before, the boundary conditions are F (1) = R ( L ) = 0.The one-particle distribution is given by n t ( i, i + a t −

1) = e F ( i )+ R ( i + a t − − ln Z + β [ µ t − u t ( i,i + a t − ] . In the previous section, we have solved the direct problem: given the bindingenergies for all particle types, we compute s-particle distributions. However,typically it is particle distributions that are observed experimentally, and theenergetics of particle-DNA interactions need to be inferred. This inverse prob-lem can be solved recursively for the case of systems with multiple particle types,partial unwrapping (variable footprints), and steric exclusion. The recursive so-lution is eﬃcient enough to be employed on the genome-wide scale. Here wefocus on one-particle distributions and one-body energies; the exact matrix for-mulation of the inverse problem for a single particle type with the two-particledistribution and the two-body potential is available in Ref. [52].

Using Eqs. (S5), (S8) and (S9), we obtain: Z − ( i ) = Z − ( i −  (cid:88) t,i − a t max ≤ j ≤ i − a t min ZZ − ( i − Z + ( i − n t ( j, i −  = Z − ( i − (cid:20) N R ( i − ξ ( i − (cid:21) , (S11)where N R ( i ) = (cid:80) t (cid:80) i − a t max +1 ≤ j ≤ i − a t min +1 n t ( j, i ) represents the probabilityof ﬁnding a particle of any type with the right edge at bp i , and ξ ( i ) = Z − ( i ) Z + ( i ) /Z . 22 + satisﬁes a similar recursive relation: Z + ( i ) = Z + ( i + 1) (cid:20) N L ( i + 1) ξ ( i + 1) (cid:21) , (S12)where N L ( i ) is the probability of ﬁnding a particle of any type with the leftedge at bp i : N L ( i ) = (cid:80) t (cid:80) i + a t min − ≤ j ≤ i + a t max − n t ( i, j ).The quantity ξ ( i ) satisﬁes ξ ( i + 1) − ξ ( i ) = 1 Z (cid:2) Z − ( i + 1) Z + ( i + 1) − Z − ( i ) Z + ( i ) (cid:3) = 1 Z (cid:110) Z − ( i + 1) (cid:2) Z + ( i + 1) − Z + ( i ) (cid:3) + Z + ( i ) (cid:2) Z − ( i + 1) − Z − ( i ) (cid:3) (cid:111) = N R ( i ) − N L ( i + 1) , so that ξ ( i ) = 1 + i − (cid:88) k =0 (cid:2) N R ( k ) − N L ( k + 1) (cid:3) , (S13)where the initial condition ξ (0) = 1 has been used.After we compute both Z − and Z + in this way, the total partition functionis given by Z = Z − ( L + 1) = Z + (0) as before, and the binding energy for anyparticle attached to the DNA is given by β [ u t ( i, j ) − µ t ] = − ln (cid:20) n t ( i, j ) ZZ − ( i ) Z + ( j ) (cid:21) . (S14) In the case of the all-or-none binding, all matrix elements (cid:104) i | n t | j (cid:105) vanish unless j = i + a t −

1, where a t is the length of the binding site for the particle of type t . Thus N L ( i ) = (cid:88) t n t ( i, i + a t − ,N R ( i ) = (cid:88) t n t ( i − a t + 1 , i ) . Using these expressions, we can employ Eqs. (S11), (S12) and (S13) to com-pute Z + and Z − in log space. Finally, Eq. (S14) can be used to compute thebinding energies.If all particles are of the same type, the quantity ξ can be simpliﬁed further: ξ ( i ) = 1 − i (cid:88) k = i − a +1 N L ( k ) = 1 − Occ( i ) , i ) is the probability that bp i is covered by a particle. Thus in thislimit ξ ( i ) is simply the probability that bp i is not bound by any particles.The recursion relations for Z − and Z + become Z − ( i + 1) = Z − ( i ) (cid:20) N L ( i − a + 1)1 − Occ( i ) (cid:21) ,Z + ( i ) = Z + ( i + 1) (cid:20) N L ( i + 1)1 − Occ( i + 1) (cid:21) . These expressions are equivalent to those previously obtained in [54].

The simplest sequence-dependent model of nucleosome formation assumes thatthe nucleosome energy depends only on the mono- and dinucleotide counts inthe nucleosomal site [54]. Thus for the sequence S S . . . S N , the nucleosomeformation energy is (cid:80) Ni =1 (cid:15) S i + (cid:80) N − i =1 (cid:15) S i S i +1 , where (cid:15) S i and (cid:15) S i S i +1 are themono- and dinucleotide contributions, respectively.Here we consider a symmetrized version of the model, with (cid:15) A = (cid:15) T , (cid:15) C = (cid:15) G and 10 unique dinucleotide energies: (cid:15) AA/T T , (cid:15) AC/GT , (cid:15) AG/CT , (cid:15) AT , (cid:15) CA/T G , (cid:15) CC/GG , (cid:15) CG , (cid:15) GA/T C , (cid:15) GC , (cid:15) T A .Using Eq. (S14), we can estimate all 12 (cid:15) ’s for a model in which nucleosomesare the only particle type and unwrapping is allowed. For each sequence startinga bp i and ending at bp j , we have u ( i, j ) − µ = j (cid:88) k = i (cid:15) S k + j − (cid:88) k = i (cid:15) S k S k +1 − µ, = (cid:0) m A/T m C/G m AA/T T · · · m T A − (cid:1)  (cid:15) A/T ... (cid:15)

T A µ  , (S15)where m X and m XY are the counts of mono- and dinucleotides X and XY inthe sequence, respectively.Using all possible combinations of pairs ( i, j ) where a nucleosome can form,we obtain a large number P of equations of this type: E − µ = M (cid:18) (cid:15)µ (cid:19) . Here, E − µ is a column vector of dimension P , where each row contains one u ( i, j ) − µ element from Eq. (S15). (cid:18) (cid:15)µ (cid:19) is the column vector from Eq. (S15),and M is a P ×

13 matrix with mono- and dinucleotide counts and -1’s in the24ast column. From this equation we can derive the energies (cid:15) and µ by a leastsquares ﬁt.A problem that arises is that matrix M has a one-dimensional kernel, or nullspace [52]. Because in every DNA sequence the number of mononucleotides isequal to the length of the sequence, and the number of dinucleotides is equalto the length of the sequence minus 1, the columns of the matrix M are notlinearly independent. Indeed, the column vector | V (cid:105) =  − −  is the only linearly independent vector from the kernel of M : M | V (cid:105) = 0 (i.e.,the kernel of M is spanned by | V (cid:105) ). Thus we have only 12 linearly independentequations, and cannot obtain a unique solution for the 13 parameters (cid:15) and µ .We need to add one constraint, which we choose to be (cid:88) i =1 (cid:15) i = 0 , to make the solution unique. Hydroxyl radicals that cleave DNA near the nucleosome dyad have two preferredcutting sites, at positions -1 bp and +6 bp with respect to the dyad [30]. If DNAis cut at these positions with frequencies f and 1 − f , the distance between twoconsecutive cuts (one on the Watson and one on the Crick strand) is given by d cuts = d dyads + b, where d dyads is the distance between two neighboring dyads and the bias b is b =  − −

52 with probability  (1 − f ) f (1 − f ) f . The cleavage bias has to be taken into account by convolving the predictedinter-dyad distance probability P ( c + d | c ) with a kernel corresponding to thisbias: F ( x ) =  (1 − f ) for x = − f (1 − f ) for x = − f for x = 20 otherwise . The convolution is then compared with the observed distribution of inter-dyad distances. 25 .6 Parameter optimization

Parameter ﬁtting for all models was carried out in a two-stage procedure us-ing the genetic algorithm optimization function ga from the MATLAB Globaloptimization toolbox. First, the objective function to be minimized was setequal to the root-mean-square (RMS) deviation between predicted and observedinter-dyad distributions. Once RM S < − had been achieved, the objectivefunction was replaced by RMS − r osc (cid:39) − r osc , where r osc is the linear correlationbetween observed and predicted oscillations after the smooth background hasbeen subtracted from inter-dyad distributions, as in Fig. 1D. We have foundthat the two-stage setup allows us to eﬀectively ﬁt both the overall shape andthe ﬁne oscillatory structure in the data. The best-ﬁt parameters for all modelsare given in SI Results. 26 Supplementary Results

Model A: Crystal structure augmented with an additional well.

Thebinding energy of a particle of length a = 1 + x + x (1 bp for the dyad, and x and x for the extra number of bps in contact with the histone octamer oneach side of the dyad) is given by u = u half ( x ) + u half ( x ), with u half ( x ) = interp1 ( ... ) − E b x, where E b is the binding energy of a fully wrapped particle in the absence of10-11 bp oscillations. The oscillations are based on the the positions of thehistone-DNA contacts in the crystal structure [57]. The MATLAB function interp1 (...) was used to generate the oscillatory pattern by piecewise cubicHermite interpolation between the following data points: x (Position) f(x) (Energy) -1 -A3 A7 -A13 A17 -A24 A28 -A34 A38 -A44 A49 -A55 A59 -A65 A69 -A75 Ap -d85 AThe oscillatory pattern was superimposed onto a line with the slope of − E b / arameter Value a max

163 bp a min E b B T µ -14.51 k B T A B T f p

79 bp d B T − − − − − u ha l f ( k B T ) Dyad − edge distance (bp) Table S1: Fitted parameters for Model A. a max and a min are the maximum andminimum lengths of the nucleosome particle, µ is the histone octamer chemicalpotential, A is the amplitude of the oscillations, and f is the hydroxyl radicalcutting frequency. p and d are the position and the depth of the ﬁrst minimumoutside of the nucleosome core particle, respectively. E b is the binding energyof a fully wrapped particle in the absence of 10-11 bp oscillations.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 1 . × − RM S is the root-mean-square error of the predicted inter-dyad distribution. r osc is the linear correlation between the oscillatory parts of the measured and pre-dicted inter-dyad distributions. The oscillatory part was obtained by subtract-ing the smooth background from the full inter-dyad distribution. Smoothingwas done by applying a Savitzky-Golay smoothing ﬁlter [also known as least-squares, or DISPO (Digital Smoothing Polynomial) ﬁlter] of polynomial order 3and length 31 bp. RM S osc is the root-mean-square error of the oscillatory partof the predicted inter-dyad distribution.28 odel B: Crystal structure augmented with a linear function.

Sameas Model A for x ∈ [1 , u half ( x ) = (cid:26) interp1 ( ... ) − E b x for x ∈ [1 , , interp1 ( ... ) − E b − ∆ E ∆ X ( x −

73) for x ∈ [74 ,

73 + ∆ X ] , where ∆ E/ ∆ X is the slope of the linear function (i.e., ∆ E is the energy dif-ference between the ﬁrst and last points of the linear function and ∆ X is thecardinality of the range of the linear function). Parameter Value a min

27 bp E b B T µ -15.04 k B T A B T f E -2.47 k B T∆ X − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S2: Fitted parameters for model B. All parameters are as in Model A,except for ∆ E and ∆ X which are deﬁned above.Fit residuals: RM S = 0 . RM S cannot decrease below 10 − for this model) r osc = 0 . RM S osc = 1 . × − All residuals are deﬁned as in Model A.29 odel C: 10-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π

10 ( x − x ) (cid:19) − E b x, where A is the amplitude of the oscillations, x determines the phase of theoscillations, and E b is the binding energy of a fully wrapped particle in theabsence of the oscillations. Parameter Value a max

165 bp a min E b B T µ -13.99 k B T A B T x

79 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S3: Fitted parameters for Model C. All parameters are as in Model A,except for x which is deﬁned above.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are deﬁned as in Model A.30 odel D: 11-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π

11 ( x − x ) (cid:19) − E b x, where A , x and E b have the same meaning as in Model C. Parameter Value a max

161 bp a min

25 bp E b B T µ -14.30 k B T A B T x

80 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S4: Fitted parameters for Model D. All parameters are as in Model C.Fit residuals:

RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are deﬁned as in Model A.31 odel E: Uniform unwrapping. u half ( x ) = − E b x, where E b is the binding energy of a fully wrapped nucleosome. Parameter Value a max

163 bp a min

35 bp E b B T µ -13.14 k B T f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S5: Fitted parameters for Model E. All parameters are as in Model A.Fit residuals:

RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are deﬁned as in Model A.32 odel F: 5-bp oscillations superimposed onto a linear function. u half ( x ) = − A cos (cid:18) π x − x ) (cid:19) − E b x, where A , x and E b have the same meaning as in Model C. Parameter Value a max

163 bp a min

39 bp E b B T µ -16.13 k B T A B T x

74 bp f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S6: Fitted parameters for Model F. All parameters are as in Model C.Fit residuals:

RM S = 9 . × − r osc = 0 . RM S osc = 3 . × − All residuals are deﬁned as in Model A.33 odel G: 5-bp stepwise unwrapping. u half ( x ) = − E step ceil (cid:18) x − x (cid:19) , where E step is the amount of energy lost in each step, and x determines thephase of the stepwise proﬁle. Parameter Value a max

163 bp a min

39 bp E step B T µ -12.83 k B T x f − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S7: Fitted parameters for Model G. All parameters are as in Model A,except for E step and x deﬁned above.Fit residuals: RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are deﬁned as in Model A.34 odel H: 10-bp stepwise unwrapping. u half ( x ) = − E step ceil (cid:18) x − x (cid:19) , where E step is the amount of energy lost in each step, and x determines thephase of the stepwise proﬁle. Parameter Value a max

169 bp a min E step B T µ -12.04 k B T x f − − − − − − Dyad − edge distance (bp) u ha l f ( k B T ) Table S8: Fitted parameters for Model E. All parameters are as in Model F.Fit residuals:

RM S = 9 . × − r osc = 0 . RM S osc = 2 . × − All residuals are deﬁned as in Model A.35

Supplementary Figures (cid:132)(cid:20) a min (bp)a max (bp) R M S (cid:132)(cid:17)(cid:15)(cid:22) (cid:17)(cid:15)(cid:22) a min (bp)a max (bp) C o rr e l a t i on A B (cid:132)(cid:20)

Slope (k B T / bp) R M S C Figure S3:

Sensitivity of the predicted inter-dyad distribution to theparameters of the unwrapping potential based on nucleosome crystalstructures. (A) Root-mean-square error (RMS) of the inter-dyad distributionpredicted using the model in Fig. 1B, as a function of a min and a max . (B) Thelinear correlation coeﬃcient between oscillations in the predicted and observedinter-dyad distributions ( r osc ), as a function of a min and a max . The oscillationswere obtained by subtracting the smooth background from inter-dyad distribu-tions, as described in the Fig. 1 caption. (C) Variation of the RMS with theslope of the unwrapping potential in Fig. 1B. In all panels, model parametersthat were not varied were kept ﬁxed at their best-ﬁt values (SI Results, ModelA). 36 inimum (bp)Correlation

76 77 78 79 80 81 821.51.71.92.12.32.52.72.93.13.33.5 (cid:132)(cid:17)(cid:15)(cid:18) D ep t h ( k B T ) Minimum (bp) D ep t h ( k B T ) RMS

76 77 78 79 80 81 821.51.71.92.12.32.52.72.93.13.33.5 11.522.533.544.5x 10 (cid:132)(cid:20) AB Figure S4:

Sensitivity of the predicted inter-dyad distribution to modelparameters describing higher-order chromatin structure.

The unwrap-ping potential is based on nucleosome crystal structures (SI Results, Model A).(A) Root-mean-square error (RMS) of the predicted inter-dyad distribution,as a function of the position and the depth of the ﬁrst minimum outside ofthe nucleosome core (Fig. 1B). The depth of the ﬁrst minimum is computedwith respect to u half ( x = 73 bp). (B) The linear correlation coeﬃcient betweenoscillations in the predicted and observed inter-dyad distributions ( r osc ), as afunction of the position and the depth of the ﬁrst minimum outside of the nucle-osome core (Fig. 1B). The oscillations were obtained by subtracting the smoothbackground from inter-dyad distributions, as described in the Fig. 1 caption.In both panels, model parameters that were not varied were kept ﬁxed at theirbest-ﬁt values (SI Results, Model A). 37 B Dyad − edge distance (bp) u ha l f ( k B T ) Inter−dyad distance (bp) P r obab ili t y DataModel50 100 150 200 250−2−1012 x 10 −3 Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.769

DataModel C (cid:132)(cid:18)(cid:19)(cid:15)(cid:22) (cid:132)(cid:18)(cid:17) (cid:132)(cid:24)(cid:15)(cid:22) (cid:132)(cid:22) (cid:132)(cid:19)(cid:15)(cid:22) (cid:17) (cid:19)(cid:15)(cid:22) (cid:22) (cid:24)(cid:15)(cid:22)(cid:17)(cid:17)(cid:15)(cid:17)(cid:17)(cid:19)(cid:17)(cid:15)(cid:17)(cid:17)(cid:21)(cid:17)(cid:15)(cid:17)(cid:17)(cid:23)(cid:17)(cid:15)(cid:17)(cid:17)(cid:25)(cid:17)(cid:15)(cid:17)(cid:18)(cid:17)(cid:17)(cid:15)(cid:17)(cid:18)(cid:19) (cid:39) E (k B T) R M S E FigS4_Xray_slope_outside.pdf

Extra number of base-pairs (cid:39) E ( k B T ) Correlation

DRMS: 11.964×10 -4 Figure S5:

Crystal structure-based model augmented by a linear po-tential outside of the nucleosome core. (A) The energy proﬁle ﬁtted toreproduce the inter-dyad distance distribution shown in (B). All ﬁtting parame-ters are listed in SI Results, Model B. Under the symmetric unwrapping assump-tion, the energy of a nucleosome which covers 2 x + 1 bps is given by 2 u half ( x ).(B) The inter-dyad distance distribution observed in a high-resolution nucleo-some map [30] (blue line), and predicted using Model B in SI Results (red line).RMS - root-mean-square deviation between the model and the data. Note thatin this model RMS below 10 − could not be achieved, and thus optimizationwas switched to maximize the correlation coeﬃcient r osc once RMS reached1 . × − (see SI Methods for details). (C) Oscillations in the observed (blueline) and predicted (red line) inter-dyad distributions. The oscillations were ob-tained by subtracting the smooth background from the data and the model in(B), as described in the Fig. 1 caption. Correlation refers to r osc , the linear cor-relation coeﬃcient between measured and predicted oscillations. (D) Heatmapwith superimposed contour lines of the r osc dependence on the two parametersof the linear potential outside of the nucleosome core: ∆ x = x last −

73 bp and∆ E = u half ( x last ) − u half (73), where [1 , x last ] is the range of the energy proﬁle(SI Results, Model B). Note that the best ﬁt corresponds to ∆ x = 7 bp. (E)The dependence of the RMS on ∆ E for the best-ﬁt value of ∆ x = 7 bp. Allparameters not explicitly varied in (D) and (E) were kept ﬁxed at their best-ﬁtvalues (SI Results, Model B). 38 Inter−dyad distance (bp) P r obab ili t y DataModel

50 100 150 200 250 − − − Inter −dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.709

DataModel

50 100 150 200 25000.0050.0100.0150.0200.025

Inter −dyad distance (bp) P r obab ili t y DataModel

50 100 150 200 250 − − − Inter −dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.689

DataModel (cid:37)(cid:90)(cid:66)(cid:69)(cid:1)(cid:132)(cid:1)(cid:70)(cid:69)(cid:72)(cid:70)(cid:1)(cid:69)(cid:74)(cid:84)(cid:85)(cid:66)(cid:79)(cid:68)(cid:70)(cid:1)(cid:9)(cid:67)(cid:81)(cid:10) (cid:132) (cid:132) (cid:132) (cid:132) (cid:68)(cid:83)(cid:90)(cid:84)(cid:85)(cid:66)(cid:77)(cid:1)(cid:84)(cid:85)(cid:83)(cid:86)(cid:68)(cid:85)(cid:86)(cid:83)(cid:70) (cid:132)(cid:67)(cid:81)(cid:1)(cid:81)(cid:70)(cid:83)(cid:74)(cid:80)(cid:69)(cid:74)(cid:68) (cid:132)(cid:67)(cid:81)(cid:1)(cid:81)(cid:70)(cid:83)(cid:74)(cid:80)(cid:69)(cid:74)(cid:68) (cid:86) (cid:73)(cid:66) (cid:77) (cid:71) (cid:9) (cid:76) B (cid:53) (cid:10) Dyad

ABC DE

RMS: 9.986 ×10 -4 RMS: 9.912 ×10 -4 Figure S6:

Strictly periodic models of nucleosome unwrapping. (A)Nucleosome unwrapping/higher-order structure potential energy proﬁles. Underthe symmetric unwrapping assumption, the energy of a nucleosome that covers2 x +1 bps is given by 2 u half ( x ). The minima and maxima of the energy landscapeare either based on the crystal structures of the nucleosome core particle as inFigure 1 (blue), or else are 10 (green) and 11 (red) bp-periodic oscillations withﬁtted initial phase (SI Results, Models C and D). Dark gray bars show wherethe histone binding motifs interact with the DNA minor groove. Light gray barsindicate where the DNA major groove faces the histones. (B) The inter-dyaddistance distribution from a high-resolution nucleosome map [30] (blue line),and from the 10 bp-periodic model (red line). All model parameters are listedin SI Results, Model C. RMS - root-mean-square deviation between the modeland the data. (C) Oscillations in the observed (blue line) and predicted (redline) inter-dyad distributions. The oscillations were obtained by subtractinga smooth background from the data and the model in (B), as described inthe Fig. 1 caption. Correlation refers to r osc , the linear correlation coeﬃcientbetween measured and predicted oscillations. (D) Same as (B), for the 11 bp-periodic model. All model parameters are listed in SI Results, Model D. (E)Same as (C), for the 11 bp-periodic model.39 inimum (bp) S l ope ( k B T / bp ) RMS

64 65 66 67 68 69 70 71 72 730.050.060.070.080.090.100.110.120.130.140.15 11.522.533.54x 10 (cid:132)(cid:20)

64 66 68 70 720.050.10.15(cid:132)(cid:17)(cid:15)(cid:22)00.5

Minimum (bp)Slope (k B T / bp) C o rr e l a t i on AB Figure S7:

Sensitivity of the predicted inter-dyad distribution to pa-rameters of the 10 bp-periodic model. (A) Heatmap with superimposedcontour lines of the RMS dependence on the slope of the energy proﬁle and theposition of the last minimum within the nucleosome core. RMS - root-mean-square deviation between the model and the data. (B) The linear correlationcoeﬃcient r osc between oscillations in the predicted and observed inter-dyaddistributions, as a function of the overall slope of the energy proﬁle and theposition of the last minimum within the nucleosome core. All parameters notexplicitly varied were kept ﬁxed at their best-ﬁt values (SI Results, Model C).40 inear model Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data

Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.206

Data

Model

Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data

Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.283

Data

Model

Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data

Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.545

Data

Model ×10 -4 RMS: 9.904 ×10 -4 RMS: 9.790 ×10 -4 Dyad − edge distance (bp) u ha l f ( k B T ) − − − − − Inter−dyad distance (bp) P r obab ili t y Data

Model50 100 150 200 250 − − − Inter−dyad distance (bp) O sc ill a t o r y pa r t Correlation: 0.275

Data

Model

RMS: 9.972 ×10 -4 A B C D Figure S8:

Alternative models of nucleosome unwrapping. (A) Linearmodel (SI Results, Model E). (B) 5-bp periodic model (SI Results, Model F).(C) 5-bp step-wise model (SI Results, Model G). (D) 10-bp step-wise model(SI Results, Model H). In each column, the upper panel shows the nucleosomeunwrapping/higher-order structure potential energy proﬁle (as in Fig. 1B), themiddle panel shows the comparison of experimental and predicted inter-dyaddistance distributions (as in Fig. 1C), and the lower panel shows observed andpredicted oscillations in the inter-dyad distance distributions (as in Fig. 1D).41 (cid:19) (cid:132)(cid:18) (cid:18) (cid:19) (cid:18)(cid:19)(cid:22)(cid:18)(cid:20)(cid:17)(cid:18)(cid:20)(cid:22)(cid:18)(cid:21)(cid:17)(cid:18)(cid:21)(cid:22)(cid:18)(cid:22)(cid:17)(cid:18)(cid:22)(cid:22)(cid:18)(cid:23)(cid:17) (cid:34)(cid:87)(cid:70)(cid:83)(cid:66)(cid:72)(cid:70)(cid:1)(cid:73)(cid:74)(cid:84)(cid:85)(cid:80)(cid:79)(cid:70)(cid:1)(cid:85)(cid:86)(cid:83)(cid:79)(cid:80)(cid:87)(cid:70)(cid:83)(cid:1)(cid:83)(cid:66)(cid:85)(cid:70)(cid:1)(cid:9)(cid:91)(cid:132)(cid:84)(cid:68)(cid:80)(cid:83)(cid:70)(cid:10) A v e r age i n t e r (cid:132) (cid:69) (cid:90) (cid:66)(cid:69) (cid:1) (cid:69) (cid:74) (cid:84) (cid:85) (cid:66)(cid:79) (cid:68) (cid:70) Correlation: (cid:132)(cid:17)(cid:15)(cid:20)(cid:26)(cid:19)

A/T ratio (cid:34) (cid:87) (cid:70) (cid:83) (cid:66)(cid:72)(cid:70) (cid:1) (cid:74) (cid:79) (cid:85) (cid:70) (cid:83) (cid:132) (cid:69) (cid:90) (cid:66)(cid:69) (cid:1) (cid:69) (cid:74) (cid:84) (cid:85) (cid:66)(cid:79) (cid:68) (cid:70)

Correlation: 0.501

Position (bp) (cid:41)(cid:74)(cid:84)(cid:85)(cid:80)(cid:79)(cid:70)(cid:1)(cid:85)(cid:86)(cid:83)(cid:79)(cid:80)(cid:87)(cid:70)(cid:83)(cid:1)(cid:83)(cid:66)(cid:85)(cid:70)(cid:1)(cid:9)(cid:91)(cid:132)(cid:84)(cid:68)(cid:80)(cid:83)(cid:70)(cid:10) (cid:132) (cid:132) (cid:132) (cid:132) (cid:132) Position (bp)PIC occupancy (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)

TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:17)(cid:22)(cid:17)(cid:18)(cid:17)(cid:17)(cid:18)(cid:22)(cid:17)(cid:19)(cid:17)(cid:17)(cid:19)(cid:22)(cid:17)(cid:20)(cid:17)(cid:17)

BD E CA Position (bp)Average fragment length (cid:132)(cid:20)(cid:17)(cid:17)(cid:17) (cid:132)(cid:19)(cid:17)(cid:17)(cid:17) (cid:132)(cid:18)(cid:17)(cid:17)(cid:17)

TSS (cid:18)(cid:17)(cid:17)(cid:17) (cid:19)(cid:17)(cid:17)(cid:17) (cid:18)(cid:20)(cid:22)(cid:18)(cid:21)(cid:17)(cid:18)(cid:21)(cid:22)(cid:18)(cid:22)(cid:17)(cid:18)(cid:22)(cid:22)

Figure S9:

Genome-wide distribution of nucleosome lengths, histoneturnover rates, and transcription pre-initiation complexes. (A) Dis-tribution of average lengths of DNA-bound particles mapped by MNase diges-tion [47] in the vicinity of TSS. We considered particles with sizes between 80and 200 bp and assigned particle lengths to the mid-point of each particle. Val-ues for bps without dyads were obtained by interpolation. (B) Distribution ofhistone turnover rates [49] in the vicinity of TSS. (C) Distribution of the com-bined occupancy of 9 transcription pre-initiation complexes (PICs) [48] in thevicinity of TSS. PIC occupancies provided at 20 bp interval in Ref. [48] wereinterpolated. In panels (A)-(C), gene order is as in Figure 1B, and the heatmapswere smoothed using a 2D Gaussian kernel with σ = 3 pixels. (D) Correlationbetween inter-dyad distances and histone turnover rates averaged over 10 kbpwindows tiling the yeast genome. (E) Correlation between average inter-dyaddistances and the A/T ratio in 10 kbp windows tiling the yeast genome. A/Tratio is the fraction of A/T nucleotides in the window, divided by the genome-wide A/T fraction. Correlation in (D) and (E) refers to the linear correlationcoeﬃcient. 42 P o t en t i a l ba rr i e r ( k B T / bp ) I n t e r gen i c r eg i on r a t i o ( % ) −1000 −500 TSS +500 +1000 N o r m a li z ed d y ad den s i t y Distance (bp) A v e r age f r ag m en t s i z e ( bp ) Normalized dyad density (Brogaard et al.)

Normalized dyad density (simulation)

Average nuc. size (Henikoff et al.)Average nuc. size (simulation)

Figure S10:

Modeling distributions of nucleosome lengths and dyadpositions in the vicinity of TSS.

We align all yeast genes by their TSS as inFig. 2 and for each bp compute the fraction of times that bp is found in an in-tergenic region, as opposed to the ORF of a neighboring gene (grey backgroundcurve). The intergenic fraction has an asymmetric shape with a maximum atabout 50 bp upstream of the TSS. We use this shape as a guide for construct-ing an energy barrier for in vivo histone deposition (pink background curve).The barrier is composed of three half-Gaussians: B ( x ) = H exp (cid:104) − ( x − c ) σ (cid:105) ( x ≤ c ) , ( H + D ) (cid:110) exp (cid:104) − ( x − c ) σ (cid:105) − (cid:111) − D exp (cid:104) − ( x − c ) σ (cid:105) ( x > c ). The free param-eters of the barrier are ﬁt to maximize the sum of two correlations: betweenobserved and predicted normalized dyad counts [30] (solid and dashed bluelines, respectively), and between observed and predicted average nucleosomeDNA lengths [47] (solid and dashed green lines, respectively). Normalized dyadcounts are computed as the total number of dyads at a given bp for all genes,divided by the average of this quantity in a [-1000,1000] bp window around theTSS. Average DNA lengths are computed for all nucleosomes with a midpoint ata given bp, for all genes (if the midpoint falls in between two bps, the one on theleft is used). The ﬁtted parameters are: H = 0 . B T , D = 0 . B T , c = x TSS −

32 bp , σ = 162 . , σ = 28 . , σ = 2090 . x TSS is theabsolute position of the TSS in the box, c is the center of the 3 Gaussians, H is the height of the ﬁrst Gaussian, D is the depth of the third Gaussian, and σ , σ , σ are the standard deviations of the three Gaussian distributions. Thesimulations were done in a 15 kbp box with the barrier placed at its center toeliminate the boundary eﬀects. Unwrapping was assumed to be symmetric andthe nucleosome structure-based unwrapping potential (SI Results, Model A) wasused. The total free energy u nuc ( k, l ) of a nucleosome occupying bps k, . . . , l isa sum of u SInuc and u barriernuc = (cid:80) lj = k (cid:15) j , where (cid:15) j is the value of the barrier at bp j . Note that the chemical mapping data underestimates the number of -1 and+1 nucleosomes due to gel selection bias [30].43

20 40 60 80 (cid:132)(cid:18)(cid:17)(cid:132)(cid:25)(cid:132)(cid:23)(cid:132)(cid:21)(cid:132)(cid:19) Dyad (cid:132) edge distance (bp) (cid:52) (cid:70)(cid:82)(cid:86)(cid:70)(cid:79) (cid:68) (cid:70) (cid:132) (cid:74) (cid:79)(cid:69)(cid:70)(cid:81)(cid:70)(cid:79)(cid:69)(cid:70)(cid:79) (cid:85) b i nd i ng ene r g y ( k B T ) Exact (cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81)(cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:17)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:84)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81)(cid:36)(cid:80)(cid:78)(cid:81)(cid:86)(cid:85)(cid:70)(cid:69)(cid:1)(cid:18)(cid:17)(cid:17)(cid:1)(cid:83)(cid:70)(cid:66)(cid:69)(cid:84)(cid:1)(cid:16)(cid:1)(cid:67)(cid:81) a max a min E b µ A p d020406080100 Parameter | (cid:39) P | / | P | ( % ) AB C o rr e l a t i on D y ad s O cc . Figure S11:

Inference of the unwrapping potential in a model withsequence-speciﬁc nucleosome formation energies. (A) Exact unwrappingpotential vs. unwrapping potentials predicted at three levels of sequence cov-erage. All calculations were done on

S. cerevisiae chromosome I with L = 230kbp. M L reads were randomly sampled from the exact distribution n nuc1 ( i, j )(see Materials and Methods), where M ∈ { , , } is the desired level ofread coverage per bp. Sampled reads were used to compile a chromosome-widehistogram of nucleosome DNA lengths, to which the unwrapping potential inSI Results, Model A was ﬁt using a genetic algorithm optimization function ga from MATLAB, which minimizes the root-mean-square error of the predicteddistribution of nucleosome lengths. (B) The relative errors between predictedand exact (SI Results, Model A) parameters of the unwrapping potential at threelevels of sequence coverage (light blue background). P denotes any parameteron the horizontal axis. Linear correlation between predicted and exact distri-butions of dyad positions and nucleosome occupancy (light pink background).The height of each bar represents the mean relative error for the correspondingparameter or the mean correlation coeﬃcient, obtained by averaging the re-sults of 100 random sampling experiments. The uncertainty intervals representstandard deviations. 44 eferences [1] K. E. van Holde, Chromatin . New York: Springer, 1989.[2] K. Luger et al. , “Crystal structure of the nucleosome core particle at 2.8 ˚Aresolution,”

Nature , vol. 389, pp. 251–260, Sept. 1997.[3] G. Felsenfeld and M. Groudine, “Controlling the double helix,”

Nature ,vol. 421, pp. 448–453, 2003.[4] A. L. Olins and D. E. Olins, “Spheroid chromatin units (v bodies),”

Science ,vol. 183, pp. 330–332, Jan. 1974.[5] R. Kornberg, “Chromatin structure: a repeating unit of histones and dna.,”

Science (New York, NY) , vol. 184, no. 4139, pp. 868–871, 1974.[6] C. Woodcock, J. Safer, and J. Stanchﬁeld, “Structural repeating units inchromatin. i. evidence for their general occurrence.,”

Experimental cell re-search , vol. 97, pp. 101–110, 1976.[7] H. Schiessel, J. Widom, R. Bruinsma, and W. Gelbart, “Polymer reptationand nucleosome repositioning,”

Phys. Rev. Lett. , vol. 86, pp. 4414–4417,2001.[8] I. Kulic and H. Schiessel, “Chromatin dynamics: Nucleosomes go mobilethrough twist defects,”

Phys. Rev. Lett. , vol. 91, p. 148103, 2003.[9] M. Vignali et al. , “ATP-Dependent chromatin-remodeling complexes,”

Mol. Cell. Biol. , vol. 20, pp. 1899–1910, Mar. 2000.[10] K. J. Polach and J. Widom, “Mechanism of protein access to speciﬁc DNAsequences in chromatin: A dynamic equilibrium model for gene regulation,”

J. Mol. Biol. , vol. 254, pp. 130–149, Nov. 1995.[11] J. D. Anderson, A. Th˚astr¨om, and J. Widom, “Spontaneous access of pro-teins to buried nucleosomal DNA target sites occurs via a mechanism that isdistinct from nucleosome translocation,”

Mol. Cell. Biol. , vol. 22, pp. 7147–7157, Oct. 2002.[12] H. S. Tims et al. , “Dynamics of nucleosome invasion by DNA binding pro-teins,”

J. Mol. Biol. , vol. 411, pp. 430–448, Aug. 2011.[13] G. Moyle-Heyrman, H. S. Tims, and J. Widom, “Structural constraints incollaborative competition of transcription factors against the nucleosome,”

J. Mol. Biol. , vol. 412, pp. 634–646, Sept. 2011.[14] C. Hodges et al. , “Nucleosomal ﬂuctuations govern the transcription dy-namics of RNA polymerase II,”

Science , vol. 325, pp. 626–628, July 2009.[15] A. Bucceri, K. Kapitza, and F. Thoma, “Rapid accessibility of nucleosomalDNA in yeast on a second time scale,”

EMBO J , vol. 25, pp. 3123–3132,July 2006. 4516] C. C. Adams and J. L. Workman, “Binding of disparate transcriptionalactivators to nucleosomal DNA is inherently cooperative,”

Mol. Cell. Biol. ,vol. 15, p. 1405–1421, 1995.[17] J. D. Anderson and J. Widom, “Sequence and position-dependence of theequilibrium accessibility of nucleosomal DNA target sites,”

J. Mol. Biol. ,vol. 296, pp. 979–987, 2000.[18] J. A. Miller and J. Widom, “Collaborative competition mechanism for geneactivation in vivo,”

Mol. Cell. Biol. , vol. 23, pp. 1623–1632, 2003.[19] G. Li and J. Widom, “Nucleosomes facilitate their own invasion,”

Nat.Struct. Mol. Biol. , vol. 11, no. 1, pp. 763–769, 2004.[20] G. Li et al. , “Rapid spontaneous accessibility of nucleosomal DNA,”

Nat.Struct. Mol. Biol. , vol. 12, no. 1, pp. 46–53, 2004.[21] M. Engeholm et al. , “Nucleosomes can invade DNA territories occupied bytheir neighbors,”

Nat. Struct. Mol. Biol. , vol. 16, no. 2, pp. 151–158, 2009.[22] M. G. Poirier et al. , “Spontaneous access to DNA target sites in foldedchromatin ﬁbers,”

J. Mol. Biol. , vol. 379, pp. 772–786, June 2008.[23] M. G. Poirier, E. Oh, H. S. Tims, and J. Widom, “Dynamics and function ofcompact nucleosome arrays,”

Nat. Struct. Mol. Biol. , vol. 16, pp. 938–945,2009.[24] T. Chou, “An exact theory of histone-DNA adsorption and wrapping,”

Europhys. Lett. , vol. 62, pp. 753–759, June 2003.[25] L. A. Mirny, “Nucleosome-mediated cooperativity between transcriptionfactors,”

Proc. Natl. Acad. Sci. U.S.A. , vol. 107, pp. 22534–22539, Dec.2010.[26] V. B. Teif and K. Rippe, “Statistical–mechanical lattice models for pro-tein–DNA binding in chromatin,”

J. Phys.: Condens. Matter , vol. 22,p. 414105, Oct. 2010.[27] V. B. Teif and K. Rippe, “Nucleosome mediated crosstalk between tran-scription factors at eukaryotic enhancers,”

Phys. Biol. , vol. 8, p. 044001,Aug. 2011.[28] V. B. Teif and K. Rippe, “Calculating transcription factor binding mapsfor chromatin,”

Brief. Bioinform. , vol. 13, pp. 187–201, Mar. 2012.[29] P. Prinsen and H. Schiessel, “Nucleosome stability and accessibility of itsDNA to proteins,”

Biochimie , vol. 92, pp. 1722–1728, Dec. 2010.[30] K. Brogaard et al. , “A map of nucleosome positions in yeast at base-pairresolution,”

Nature , vol. 486, pp. 496–501, 2012.4631] C. Davey, D. Sargent, K. Luger, A. Maeder, and T. Richmond, “Solventmediated interactions in the structure of the nucleosome core particle at1.9 ˚aresolution,”

J. Mol. Biol. , vol. 319, pp. 1097–1113, 2002.[32] T. J. Richmond and C. A. Davey, “The structure of DNA in the nucleosomecore,”

Nature , vol. 423, pp. 145–150, 2003.[33] C. Dingwall, G. P. Lomonosoﬀ, and R. A. Laskey, “High sequence speciﬁcityof micrococcal nuclease,”

Nucl. Acids Res. , vol. 9, pp. 2659–2673, 1981.[34] J. D. McGhee and G. Felsenfeld, “Another potential artifact in the studyof nucleosome phasing by chromatin digestion with micrococcal nuclease,”

Cell , vol. 32, pp. 1205–1215, 1983.[35] H.-R. Chung, I. Dunkel, F. Heise, C. Linke, S. Krobitsch, A. E. Ehrenhofer-Murray, S. R. Sperling, and M. Vingron, “The eﬀect of micrococcal nucle-ase digestion on nucleosome positioning data,”

PLoS ONE , vol. 5, no. 12,p. e15754, 2010.[36] A. V. Morozov, K. Fortney, D. A. Gaykalova, V. M. Studitsky, J. Widom,and E. D. Siggia, “Using DNA mechanics to predict in vitro nucleosomepositions and formation energies,”

Nucleic Acids Res. , vol. 37, pp. 4707–4722, 2009.[37] J. Zlatanova, C. Seebart, and M. Tomschik, “The linker-protein network:control of nucleosomal DNA accessibility,”

Trends Biochem. Sci. , vol. 33,pp. 247–253, 2008.[38] S. H. Syed et al. , “Single-base resolution mapping of h1–nucleosome in-teractions and 3D organization of the nucleosome,”

Proc. Natl. Acad. Sci.U.S.A. , vol. 107, pp. 9620–9625, May 2010.[39] M. Georgieva, A. Roguev, K. Balashev, J. Zlatanova, and G. Miloshev,“Hho1p, the linker histone of saccharomyces cerevisiae , is important forthe proper chromatin organization in vivo ,” Biochim. et Biophys. Acta ,vol. 1819, pp. 366–374, 2012.[40] G. Schafer, C. McEvoy, and H.-G. Patterton, “The saccharomyces cere-visiae linker histone Hho1p is essential for chromatin compaction in sta-tionary phase and is displaced by transcription,”

Proc. Natl. Acad. Sci.U.S.A. , vol. 105, pp. 14838–14843, 2008.[41] J. Widom, “A relationship between the helical twist of DNA and the or-dered positioning of nucleosomes in all eukaryotic cells,”

Proc. Natl. Acad.Sci. USA , vol. 89, pp. 1095–1099, 1992.[42] J. P. Wang, Y. Fondufe-Mittendorf, L. Xi, G. F. Tsai, E. Segal, andJ. Widom, “Preferentially quantized linker DNA lengths in Saccharomycescerevisiae,”

PLoS Comput. Biol. , vol. 4, p. e1000175, 2008.4743] R. V. Chereji et al. , “Statistical mechanics of nucleosome ordering bychromatin-structure-induced two-body interactions,”

Phys. Rev. E , vol. 83,p. 050903, May 2011.[44] M. Hall, A. Shundrovsky, L. Bai, R. Fulbright, J. Lis, and M. Wang, “High-resolution dynamic mapping of histone-DNA interactions in a nucleosome,”

Nat. Struct. Mol. Biol. , vol. 16, pp. 124–129, 2009.[45] R. Forties, J. North, S. Javaid, O. Tabbaa, R. Fishel, M. Poirier, andR. Bundschuh, “A quantitative model of nucleosome dynamics,”

NucleicAcids Res. , vol. 39, pp. 8306–8313, 2011.[46] A. Thastrom, P. T. Lowary, H. R. Widlund, H. Cao, M. Kubista, andJ. Widom, “Sequence motifs and free energies of selected natural and non-natural nucleosom e positioning DNA sequences,”

J. Mol. Biol. , vol. 288,pp. 213–229, 1999.[47] J. G. Henikoﬀ, J. A. Belskyb, K. Krassovsky, D. MacAlpine, andS. Henikoﬀ, “Epigenome characterization at single base-pair resolution,”

Proc. Natl. Acad. Sci. USA , vol. 108, pp. 18318–18323, 2011.[48] H. S. Rhee and B. F. Pugh, “Genome-wide structure and organization ofeukaryotic pre-initiation complexes,”

Nature , vol. 483, pp. 295–301, 2012.[49] M. F. Dion, T. Kaplan, M. Kim, S. Buratowski, N. Friedman, and O. J.Rando, “Dynamics of replication-independent histone turnover in buddingyeast,”

Science , vol. 315, pp. 1405–1408, 2007.[50] G. C. Yuan, Y. J. Liu, M. F. Dion, M. D. Slack, L. F. Wu, S. J. Altschuler,and O. J. Rando, “Genome-scale identiﬁcation of nucleosome positions inS. cerevisiae,”

Science , vol. 309, pp. 626–630, 2005.[51] T. N. Mavrich, I. P. Ioshikhes, B. J. Venters, C. Jiang, L. P. Tomsho, J. Qi,S. C. Schuster, I. Albert, and B. F. Pugh, “A barrier nucleosome modelfor statistical positioning of nucleosomes throughout the yeast genome,”

Genome Res. , vol. 18, pp. 1073–1083, 2008.[52] R. V. Chereji and A. V. Morozov, “Statistical mechanics of nucleosomesconstrained by higher-order chromatin structure,”

J. Stat. Phys. , vol. 144,no. 2, pp. 379–404, 2011.[53] A. Th˚astr¨om, P. Lowary, and J. Widom, “Measurement of histone–DNAinteraction free energy in nucleosomes,”

Methods , vol. 33, pp. 33–44, May2004.[54] G. Locke, D. Tolkunov, Z. Moqtaderi, K. Struhl, and A. V. Morozov, “High-throughput sequencing reveals a simple model of nucleosome energetics,”

Proceedings of the National Academy of Sciences , vol. 107, pp. 20998–21003,Nov. 2010. 4855] H. A. Cole, B. H. Howard, and D. J. Clark, “Activation-induced disruptionof nucleosome position clusters on the coding regions of Gcn4-dependentgenes extends into neighbouring genes,”

Nucl. Acids Res. , vol. 39, pp. 9521–9535, 2011.[56] H. L. Frisch and J. L. Lebowitz,

The equilibrium theory of classical ﬂuids:a lecture note and reprint volume . WA Benjamin, 1964.[57] T. J. Richmond and C. A. Davey, “The structure of DNA in the nucleosomecore,”