In vivo facilitated diffusion model

Abstract

Under dilute in vitro conditions transcription factors rapidly locate their target sequence on DNA by using the facilitated diffusion mechanism. However, whether this strategy of alternating between three-dimensional bulk diffusion and one-dimensional sliding along the DNA contour is still beneficial in the crowded interior of cells is highly disputed. Here we use a simple model for the bacterial genome inside the cell and present a semi-analytical model for the in vivo target search of transcription factors within the facilitated diffusion framework. Without having to resort to extensive simulations we determine the mean search time of a lac repressor in a living E. coli cell by including parameters deduced from experimental measurements. The results agree very well with experimental findings, and thus the facilitated diffusion picture emerges as a quantitative approach to gene regulation in living bacteria cells. Furthermore we see that the search time is not very sensitive to the parameters characterizing the DNA configuration and that the cell seems to operate very close to optimal conditions for target localization. Local searches as implied by the colocalization mechanism are only found to mildly accelerate the mean search time within our model.

arXiv [q-bio.SC]

Maximilian Bauer, Ralf Metzler*

* E-mail: [email protected]


Introduction

Transcription factors (TFs) are able to locate and bind their target sequence on DNA at surprisingly high rates. This became clear in 1970, when the lac repressor was measured in vitro to associate with the operator at a rate of $k_a = 7 \times 10^{9}\,\mathrm{M^{-1}\,s^{-1}}$ [1]. This is about two orders of magnitude faster than the rate calculated with the well-known Smoluchowski formula for three-dimensional diffusion control [2]. The results obtained in the in vitro experiments by Riggs et al. and by Winter et al. were successfully explained with the by now classical facilitated diffusion model, introduced by Berg, von Hippel and co-workers [3, 4]: the TF alternates between three-dimensional diffusion through the bulk solution and sliding along the DNA contour, which can be considered as one-dimensional diffusion. While a large majority of subsequent reformulations of this target search problem are based on this facilitated diffusion model [5–8], there are also critical reviews focusing on limitations of the traditional model [9, 10].

Even if most scientists accept that TFs perform facilitated diffusion in vitro to find their targets, there is a lively debate on whether this mechanism indeed plays a role in vivo. The interest in this long-standing topic was boosted by the development of new experimental techniques, namely single-molecule assays studying DNA-binding proteins or, more generally, the diffusion of proteins within cells [11–18]. After finding indirect evidence some years ago, Elf and coworkers recently demonstrated that the lac repressor does display facilitated diffusion in live Escherichia coli (E. coli) cells [19, 20].

Thus it is important to study how the present facilitated diffusion models need to be translated to the in vivo situation. In comparison to the dilute situation studied in vitro, the most important changes are the influence of the confinement to the cell body or the nucleoid, the compactified DNA conformation, and the impact of the presence of many large biomolecules. The latter, which is often referred to as macromolecular crowding, has two major effects: the equilibrium for DNA-binding proteins is shifted, favoring the associated state, and the diffusion in the cytoplasm is slowed down [21, 22]. There is an ongoing debate whether this reduced diffusion is still Brownian, following experimental evidence that for larger molecules such as mRNA [23, 24] or lipid granules [25] the motion follows the laws of anomalous diffusion [26, 27]. Indeed, there are indications that particles of the size of several tens of kilodaltons exhibit anomalous diffusion [28, 29]. In what follows we model TFs in the bulk by normal Brownian diffusion and point at potential implications of anomalous diffusion in the conclusions.

We note that theoretical work on facilitated diffusion in vivo has also been reported by Mirny and coworkers as well as by Koslover and coworkers [9, 30]. A different approach for the situation in living cells, based on a fractal organization of the chromatin in the nucleus, showed that facilitated diffusion can also be beneficial in eukaryotes [31].

With respect to the impact of the cell's finite size, Foffano et al. recently studied the influence of (an-)isotropic confinement on the facilitated diffusion process for rather short DNA chains [32]. To build a theoretical model for facilitated diffusion on the entire genome in living cells we briefly review what is known about the organization of the bacterial DNA [33]. The emerging general consensus points at a distinct separation of the genome into connected subunits that may be dynamic.
Using atomic force microscopy the size of structural units of the E. coli chromosome was studied, finding units of size 40 nm and 80 nm [34]. By means of two complementary approaches the average size of the structural domains was measured to be 10 kilobasepairs (kbp) [35]. Romantsov et al. studied the structure with fluorescence correlation spectroscopy, yielding units of size 50 kbp with a diameter of (70 ± 20) nm [36]. Chromosome conformation capture carbon copy (5C) was used to determine a three-dimensional model of the Caulobacter genome [37]. For the same bacterium Viollier et al. determined that the location of genes on the chromosome map correlates linearly with their position along the cell's long axis [38].

Based on these experimental observations several models for the DNA structure in bacterial cells have been proposed: entropy has been identified as the main driver of chromosome segregation, and ring polymers are used to model the bacterial chromosome [39, 40]. Buenemann and Lenz showed that a geometric model based on a self-avoiding random walk (SAW) is sufficient to explain the linear positioning of loci along the cell's longest axis [41]. Finally, the chromosomal structure and, in particular, the accurate positioning of loci was proposed as resulting from regulatory interactions [42, 43].

In this paper we examine whether it is possible to extend our previous generalized facilitated diffusion model [44] to the in vivo situation and compare the results with the ones obtained by Koslover et al. [30]. To this end, in the following section we detail how we obtain a coarse-grained model for the bacterial genome and state our semi-analytical model for the search process. Then the general theory is applied to the specific case of a lac repressor in an E. coli cell, and we favorably compare our results with related experimental measurements [19]. Finally, we summarize our findings and give an outlook on future research directions.

Theory

The quantity we investigate is the average time a TF needs to find a target sequence in a living bacterial cell after starting at a random position within the cell. In principle it is possible to apply our previous generalized facilitated diffusion model using rescaled rates, lengths and diffusion constants to account for the crowded in vivo environment [44]. However, for parameters typical for the interior of cells the effective contact radius between TF and DNA is larger than the average distance between neighboring DNA segments. Consequently a direct translation is not possible.

Moreover, as we will see below, already the simpler one-state model of facilitated diffusion is sufficient to obtain a fairly good estimate of the experimental results without any further free parameters. Thus we do not distinguish between search and recognition states of the TF-DNA complex [5]. Intersegmental jumps and/or transfers [6, 8, 16, 45] of TFs between DNA segments that are close-by in the embedding space but distant when measured in the chemical coordinate along the genome are to some extent indirectly included in terms of re-attachment to the DNA within one of the geometric subunits of the chromosome. In future studies these effects could be explicitly included to refine the model.

Our approach is based on the general picture of the facilitated diffusion mechanism: the TF diffuses three-dimensionally through the bulk solution until it encounters a stretch of DNA to which it can bind. Then a sliding motion along the DNA contour is possible, during which the TF probes for the target. If the target is not found, the TF will dissociate from the chain after a certain time span and resume its 3D-diffusion through the cell until the next binding event. This scheme continues until the target is found.

The major difference to the dilute in vitro situation lies in the DNA conformation, which is heavily influenced by the confinement to the cell volume or the nucleoid volume: as the contour length of the (typically circular) bacterial DNA is about three orders of magnitude larger than the longest cell axis in which it is placed, there is clearly a need to compact it. To proceed we present our model for the compacted genome.
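The need for compaction can be checked with a back-of-the-envelope calculation. The sketch below assumes a genome of about 4.6 × 10^6 bp and a longest cell axis of roughly 2.5 µm; both numbers are illustrative assumptions rather than parameters fixed at this point of the text, while 0.34 nm is the basepair size used in the microscopic model.

```python
# Rough estimate of how strongly the bacterial chromosome must be compacted.
# GENOME_BP and CELL_LENGTH_NM are assumed, illustrative values.
BP_RISE_NM = 0.34         # size of one basepair in nm
GENOME_BP = 4.6e6         # assumed E. coli genome size in bp
CELL_LENGTH_NM = 2500.0   # assumed longest cell axis in nm


def compaction_ratio(genome_bp, bp_rise_nm, cell_length_nm):
    """Return the contour length (nm) and its ratio to the cell length."""
    contour_nm = genome_bp * bp_rise_nm
    return contour_nm, contour_nm / cell_length_nm


contour_nm, ratio = compaction_ratio(GENOME_BP, BP_RISE_NM, CELL_LENGTH_NM)
print(f"contour length ~ {contour_nm / 1e6:.2f} mm, ratio ~ {ratio:.0f}")
```

With these assumed inputs the contour length is of the order of a millimeter, several hundred times the cell length, which is the mismatch the blob model has to absorb.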

Model for the compacted genome

Without dwelling on the extent to which nucleoid-structuring proteins and/or supercoiling are responsible for DNA compaction in bacterial cells, we adapt the model of Buenemann and Lenz and assume that the DNA is assembled structurally into spheres ('blobs') containing one loop each [41]. Thus, the whole genome is modeled as a closed SAW of these uniformly large blobs on a lattice representing the nucleoid volume (here we diverge from ref. [41], where the full cell volume was taken). To mimic the cylindrical shape of the nucleoid, one of the cuboid lattice's edges is taken to be longer than the other two, which are of equal length.

The key quantities are the blobs' radius of gyration $r_g$ and the number of basepairs within a blob, $N_b$. While the latter parameter determines how many blobs make up the DNA, since the number of bps on the DNA is a fixed parameter, the first one effectively determines the lattice size (see figure 1).

Figure 1. Two-dimensional schematic of the DNA conformation. The circles denote single DNA blobs. The lattice spacing is twice the blob radius: $d_g = 2 r_g$. A part of an exemplary search trajectory is depicted by the arrow.

To obtain individual DNA conformations we follow a routine similar to the one described in ref. [41]: as a starting point we use a closed loop of minimal extension which touches both end faces along the longest cell axis. Then the chain is elongated by inserting hooks at random positions until it reaches the desired length (due to the form of the algorithm only chains with an even number of blobs are considered). Only elongation steps which yield a conformation within the nucleoid volume are executed. Afterwards the genome is equilibrated in the following manner: we randomly choose one of the three transformation types of the MOS algorithm [46]. Then it is checked if the resulting conformation is still an SAW, otherwise the old conformation is kept. Finally, only attempts are counted in which the SAW is still confined to the nucleoid volume. This is repeated 100,000 times for each individual model genome.

Afterwards the resulting DNA conformations are centered on a larger lattice representing the full cell volume and remain unchanged during the subsequent simulation of the target search process. This approach is supported by recent results that DNA dynamics only have little effect on target search rates [30]. For the sake of simplicity we assign the target to be in a blob in the middle of the DNA.

Target search process

The TF is assumed to start its search at a random position in the cell volume and its motion is modeled as a random walk on the effective lattice (fig. 1), during which we keep track of how often sites containing a blob are passed. The search process is schematically depicted in fig. 2.
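The bookkeeping described above can be sketched as a lattice random walk that counts steps and blob encounters until the target blob is reached. This is a minimal illustration and not the simulation code behind the results: the function names, the toy lattice, the hairpin-shaped blob loop (which is not an equilibrated SAW) and the rule that the walker only chooses among neighboring sites inside the lattice are all assumptions.

```python
import random

MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def walk_to_target(shape, blobs, target, start, rng, max_steps=10**6):
    """Walk on a cuboid lattice until the target blob is reached.

    Returns (steps, encounters), where encounters counts the visits to blob
    sites other than the target. The walker only chooses among neighboring
    sites inside the lattice (an assumed boundary rule).
    """
    pos, steps, encounters = start, 0, 0
    while pos != target and steps < max_steps:
        neighbors = [
            (pos[0] + dx, pos[1] + dy, pos[2] + dz) for dx, dy, dz in MOVES
        ]
        inside = [p for p in neighbors
                  if all(0 <= c < n for c, n in zip(p, shape))]
        pos = rng.choice(inside)
        steps += 1
        if pos in blobs and pos != target:
            encounters += 1
    return steps, encounters


def mean_counts(shape, blobs, target, trials, seed=0):
    """Average steps and encounters over uniformly random starting sites."""
    rng = random.Random(seed)
    total_steps = total_enc = 0
    for _ in range(trials):
        start = tuple(rng.randrange(n) for n in shape)
        steps, enc = walk_to_target(shape, blobs, target, start, rng)
        total_steps += steps
        total_enc += enc
    return total_steps / trials, total_enc / trials


# Toy example: a hairpin-shaped closed loop of 16 blobs in a 6 x 6 x 12 box.
shape = (6, 6, 12)
blobs = {(x, 2, z) for x in (2, 3) for z in range(2, 10)}
target = (2, 2, 6)
n_f, n_f_enc = mean_counts(shape, blobs, target, trials=200)
print(n_f, n_f_enc, n_f_enc / n_f)
```

The two averages play the role of the mean step number and the mean number of blob encounters that enter the search-time formula; a walk started next to the target blob gives the corresponding return quantities.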

Figure 2. Schematic of the microscopic events within a blob (without target). B denotes a bound TF, and U an unbound TF within a blob. Finally, S represents a searching TF which is currently not in a blob.

The TF starts its search diffusing in 3D (S-state). With certainty (probability 1) after some time it will encounter a blob, which it enters in its unbound state (U). We first study the case where this blob does not contain the target DNA. Based on the microscopic model to be outlined below, we assign a probability $p_r$ that the TF will bind to the DNA within this blob. If so, it changes to the B-state. As there is no target to be found on the DNA, after some time the TF will dissociate and return to the unbound U-state. With probability $p_r$ it can bind again, or it may leave the blob (with probability $1 - p_r$) and start a new random walk on the lattice (S-state). The same procedure will take place when subsequent blobs are encountered.

A qualitatively new event occurs when the site containing the target DNA is encountered for the first time. Now the tendency to quit the corresponding blob competes with the probability to find the target. For this reason, in general several encounters with the target blob are necessary. The corresponding scheme is depicted in figure 3.

Figure 3. Schematic of the microscopic events within the target blob. Same notation as in the previous figure. Additionally, T denotes a TF which has found the target.

Once again, after entering the blob in the unbound U-state, with probability $1 - p_r$ not a single binding event takes place. However, if the TF binds to DNA (with probability $p_r$), subsequently with probability $p_t$ the target will be found (T-state) before dissociating. If the target is not found and the TF dissociates, the blob is left, again with probability $1 - p_r$. Otherwise (with probability $p_r$) a new chance to find the target while being bound is opened up. As in the simpler scheme without target, a new random walk (S-state) is started on a neighboring site if the blob is left. To proceed we relate the probabilities $p_r$ and $p_t$ to microscopic quantities and determine the time steps of the individual processes, before calculating the typical search time for the target.

Microscopic model

To determine $p_r$, that is, the probability to bind to DNA after entering a blob or after dissociation from the DNA within the blob, we employ the approximation that locally the DNA can be treated as a random coil [3, 44]. Thus we have to solve the diffusion equation for an initially homogeneous probability distribution within a sphere of radius $r_g$. Inside this sphere nonspecific association to a basepair on the DNA occurs with the constant, intrinsic rate $k_\mathrm{ass}$ (in units of $\mathrm{M^{-1}\,s^{-1}}$). We introduce a second concentric sphere of radius $\alpha r_g$ whose surface is absorbing, modeling the TFs leaving the domain of the blob. Thus, the dimensionless quantity $\alpha$ measures (in units of $r_g$) where the blob's area of influence ends, see below and Supporting Information (SI) S1. The corresponding problem is solved in the SI S1, yielding the binding probability

$$p_r = 1 - \frac{3\alpha\,\phi(\gamma)}{\alpha + (\alpha - 1)\,\gamma^2\phi(\gamma)}, \tag{1}$$

with the dimensionless quantity $\gamma = r_g\sqrt{\kappa/D_\mathrm{3D}}$. Here $D_\mathrm{3D}$ denotes the 3D-diffusion constant, and $\kappa = n k_\mathrm{ass} N_b$. Moreover, $n = 3/(4\pi r_g^3)$ represents the density of DNA within the coil. In Eq. 1 we also introduced the auxiliary function $\phi(\gamma) = (\gamma\coth(\gamma) - 1)/\gamma^2$ [47].

Note that $p_r$ is a monotonically increasing function of $\gamma$. Keeping the values of $\kappa$, $\alpha$ and $r_g$ fixed, for decreasing, yet finite values of $D_\mathrm{3D}$ the probability to escape the blob (which is given by $1 - p_r$) becomes smaller, as in this case the TF moves slower and spends more time within the blob, where it can be caught by a stretch of DNA. Exactly at $D_\mathrm{3D} = 0$ one obtains $p_r = 1$, an apparent paradox. However, while it is true that an immobile TF is unable to leave a blob, the converse argument that the TF will bind to DNA with certainty is not obvious, as binding requires the motion of a TF towards DNA within the blob. Because this complementarity is implicitly assumed in the present model, it only yields meaningful results for finite values of $\gamma$.
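The dependence of the binding probability on $\gamma$ and $\alpha$ can be explored numerically. The sketch below evaluates $\phi(\gamma)$ and Eq. (1) as written above; the helper `gamma_from_rates`, its unit conversion (nm, nm²/s and M⁻¹ s⁻¹) and all function names are assumptions made for illustration, and the helper takes $k_\mathrm{ass}$ at face value without any rescaling.

```python
import math

AVOGADRO = 6.02214076e23  # particles per mole


def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma*coth(gamma) - 1) / gamma**2."""
    if gamma < 1e-4:
        # Leading terms of the series avoid cancellation for small gamma.
        return 1.0 / 3.0 - gamma**2 / 45.0
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2


def binding_probability(gamma, alpha):
    """Eq. (1): probability to bind before reaching radius alpha * r_g."""
    f = phi(gamma)
    return 1.0 - 3.0 * alpha * f / (alpha + (alpha - 1.0) * gamma**2 * f)


def gamma_from_rates(r_g_nm, n_b, k_ass_per_molar_s, d3_nm2_per_s):
    """gamma = r_g * sqrt(kappa / D_3D) with kappa = n * k_ass * N_b.

    Hypothetical helper: the basepair concentration inside the blob is
    converted to mol/L so that kappa comes out in 1/s.
    """
    volume_liters = 4.0 / 3.0 * math.pi * r_g_nm**3 * 1e-24  # 1 nm^3 = 1e-24 L
    bp_molar = n_b / (AVOGADRO * volume_liters)
    kappa = k_ass_per_molar_s * bp_molar
    return r_g_nm * math.sqrt(kappa / d3_nm2_per_s)


for g in (0.1, 1.0, 2.0, 5.0, 20.0):
    print(g, binding_probability(g, alpha=2.0))
```

The loop illustrates the two limits discussed above: the binding probability vanishes for small $\gamma$ and approaches one for large $\gamma$.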
Only the case of finite $\gamma$ will be considered in the following.

If binding occurs, the average time this takes is given by a somewhat complicated formula for arbitrary values of $\alpha$ (see SI S1). Here we report the simpler result for the special case $\alpha = 2$. This case is of interest, as in the numerical evaluation we use the value $\alpha = \sqrt{[\ldots]/[\ldots]} \approx [\ldots].14$, see below:

$$\tau_b^{\alpha=2} = \frac{1}{2\kappa}\,\frac{20 + (8\gamma^2 - 60)\,\phi(\gamma) + (4\gamma^2 - 36)\,\gamma^2\phi^2(\gamma)}{\left(2 + \gamma^2\phi(\gamma)\right)\left(2 + (\gamma^2 - 6)\,\phi(\gamma)\right)}. \tag{2}$$

Conversely, the average time it takes the TF to leave the blob is

$$\tau_e^{\alpha=2} = \frac{1}{2\kappa}\,\frac{6 - 2\phi^{-1}(\gamma) + \gamma^2\left(4\phi(\gamma) + \tfrac{4}{3}\right) + \tfrac{1}{3}\gamma^4\phi(\gamma)}{2 + \gamma^2\phi(\gamma)}. \tag{3}$$

While diffusing in 3D, a single random walk step on average takes $\tau_\mathrm{3D} = d_g^2/(6 D_\mathrm{3D})$. Once the TF binds non-specifically to the blob containing the target, the probability to find the target before dissociating can be found by considering a one-dimensional diffusion problem. We assume that the target is located in the middle of the corresponding blob. Then we consider a DNA stretch of length $L = N_b b/2$, where $b$ denotes the size of a basepair, $b = 0.34$ nm.

Due to the DNA's coiled conformation within a blob, we use the standard assumption that the first binding event occurs at a random position on the DNA and that dissociation and reassociation positions are completely uncorrelated, see for example [48]. Formally this implies that the TF initially is uniformly distributed on the DNA, along which it diffuses with the diffusion constant $D_\mathrm{1D}$. The TF can leave the DNA with the dissociation rate $k_\mathrm{off}$. We furthermore assume that the other extremity of the DNA acts as a reflecting boundary [48], possibly due to compacting proteins that obstruct further 1D-diffusion at this position. The calculation detailed in the SI S1 yields:

$$p_t = \frac{\tanh(L/\ell)}{L/\ell}, \tag{4}$$

with $\ell = \sqrt{D_\mathrm{1D}/k_\mathrm{off}}$, which denotes a typical distance covered sliding on DNA before dissociating. If the target is found, the conditional time this successful event takes on average reads

$$\tau_t = \frac{1 - 1/\left(p_t\cosh^2(L/\ell)\right)}{2k_\mathrm{off}} = \frac{1 - (L/\ell)/\left(\sinh(L/\ell)\cosh(L/\ell)\right)}{2k_\mathrm{off}}. \tag{5}$$

However, an unsuccessful event implies that the DNA is (on average) left after the time span $\tau_d = 1/k_\mathrm{off}$. Inspection of Eq. (5) shows that in the limit $D_\mathrm{1D} \to 0$, i.e. when TFs are (nearly) incapable of sliding, $\tau_t$ approaches the finite value $1/(2k_\mathrm{off})$, which is at first sight a surprising result. However, in this limit the probability to reach the target as given by Eq. (4) approaches zero, ensuring that meaningful results are obtained. It should be stressed that our model only allows target detection via sliding, and not via direct detection solely through three-dimensional diffusion.

Mean search time

To determine the mean time it takes to find the target, we first specify how often the "loop" of binding and unbinding events (B and U in figures 2 and 3) is executed during an encounter with a blob. In all the blobs without the target this happens on average $p_r/(1 - p_r)$ times. As one loop lasts $\tau_c = \tau_b + \tau_d$, the average time that is spent within a blob is $\tau_\mathrm{blob} = \tau_e + \tau_c\, p_r/(1 - p_r)$.

In the blob containing the target, the average number of binding and unbinding loops is $g(p_r, p_t) = \chi/(1 - \chi)$, where $\chi = p_r(1 - p_t)$. Note that the number of executed loops in blobs without target is the special case $p_t = 0$ of the general case, $g(p_r, p_t = 0) = p_r/(1 - p_r)$. In the same sense figure 2 can be considered a special case of figure 3. The combined probability to find the target before leaving the blob reads $p_r p_t/(1 - p_r + p_r p_t)$; consequently the probability for a failed attempt is $p_\mathrm{uns} = (1 - p_r)/(1 - \chi)$. Thus, a successful event during which the target is found on average takes $\tau_\mathrm{suc} = \tau_b + \tau_t + g(p_r, p_t)\,\tau_c$, and an unsuccessful one $\tau_\mathrm{uns} = \tau_e + g(p_r, p_t)\,\tau_c$.

The mean total search time can be dissected into three contributions: first, the mean time the TF needs to arrive at the target blob for the first time. Then the mean time it takes to return to the target after an unsuccessful search event. The latter has to be multiplied with the average number of failed attempts. Finally, the average time it takes to successfully bind the target at the corresponding blob has to be added.

To quantify this model two parameter pairs from the random walk simulation are needed as inputs: the mean number of steps it takes to encounter the target blob for the first time after starting at a random position within the cell, $n_{f,\mathrm{3D}}$, and how many blobs without target are encountered during this time, $n_{f,\mathrm{enc}}$. Furthermore we determine the mean number of steps and blob encounters in a random walk starting on a site next to the target blob and ending in the target blob: $n_{r,\mathrm{3D}}$, $n_{r,\mathrm{enc}}$. Altogether the mean total search time reads:

$$\begin{aligned} \tau = {} & n_{f,\mathrm{3D}}\,\tau_\mathrm{3D} + n_{f,\mathrm{enc}}\,\tau_\mathrm{blob} \\ & + \frac{p_\mathrm{uns}}{1 - p_\mathrm{uns}}\left(\tau_\mathrm{uns} + n_{r,\mathrm{3D}}\,\tau_\mathrm{3D} + n_{r,\mathrm{enc}}\,\tau_\mathrm{blob}\right) + \tau_\mathrm{suc}. \end{aligned} \tag{6}$$

This formula is the main result of our study, which will be discussed quantitatively for the case of the lac repressor in an E. coli cell.
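The chain of definitions leading to Eq. (6) can be collected in a short numerical sketch. It assumes the special case $\alpha = 2$ for the blob quantities (Eqs. 1 to 3), which is not the value of $\alpha$ used in the numerical evaluation, and every number in the example call is an assumed placeholder rather than a simulation result; function and argument names are hypothetical.

```python
import math


def phi(gamma):
    """phi(gamma) = (gamma*coth(gamma) - 1) / gamma**2, for gamma not too small."""
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2


def blob_quantities(gamma, kappa):
    """p_r, tau_b and tau_e for the special case alpha = 2 (Eqs. 1-3)."""
    f = phi(gamma)
    g2 = gamma**2
    p_r = (2.0 + (g2 - 6.0) * f) / (2.0 + g2 * f)
    tau_b = (20.0 + (8.0 * g2 - 60.0) * f + (4.0 * g2 - 36.0) * g2 * f**2) / (
        2.0 * kappa * (2.0 + g2 * f) * (2.0 + (g2 - 6.0) * f))
    tau_e = (6.0 - 2.0 / f + g2 * (4.0 * f + 4.0 / 3.0) + g2**2 * f / 3.0) / (
        2.0 * kappa * (2.0 + g2 * f))
    return p_r, tau_b, tau_e


def sliding_quantities(L, ell, k_off):
    """p_t and tau_t from Eqs. (4) and (5); L and ell in the same length unit."""
    x = L / ell
    p_t = math.tanh(x) / x
    # x / (sinh(x) cosh(x)) = 2x / sinh(2x), written with decaying exponentials
    # so that large x does not overflow.
    ratio = 4.0 * x * math.exp(-2.0 * x) / (1.0 - math.exp(-4.0 * x))
    tau_t = (1.0 - ratio) / (2.0 * k_off)
    return p_t, tau_t


def mean_search_time(p_r, p_t, tau_b, tau_e, tau_t, tau_d, tau_3d,
                     n_f, n_f_enc, n_r, n_r_enc):
    """Eq. (6): first passage to the target blob, returns, final success."""
    tau_c = tau_b + tau_d
    tau_blob = tau_e + tau_c * p_r / (1.0 - p_r)
    chi = p_r * (1.0 - p_t)
    g = chi / (1.0 - chi)
    p_uns = (1.0 - p_r) / (1.0 - chi)
    tau_suc = tau_b + tau_t + g * tau_c
    tau_uns = tau_e + g * tau_c
    first = n_f * tau_3d + n_f_enc * tau_blob
    returns = p_uns / (1.0 - p_uns) * (tau_uns + n_r * tau_3d + n_r_enc * tau_blob)
    return first + returns + tau_suc


# Illustrative call; every number below is an assumed placeholder.
p_r, tau_b, tau_e = blob_quantities(gamma=3.0, kappa=1.0e5)
p_t, tau_t = sliding_quantities(L=1700.0, ell=15.0, k_off=200.0)
tau = mean_search_time(p_r, p_t, tau_b, tau_e, tau_t, tau_d=1.0 / 200.0,
                       tau_3d=5.0e-5, n_f=3.0e4, n_f_enc=750.0,
                       n_r=2.0e4, n_r_enc=500.0)
print(tau)
```

Keeping the three building blocks separate mirrors the structure of the model: the blob quantities depend only on $\gamma$ and $\kappa$, the sliding quantities only on $L/\ell$ and $k_\mathrm{off}$, and the lattice simulation enters solely through the four mean counts.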

Results

As input parameters for our TF search model in a living cell we use data deduced from experimental studies. For the DNA configuration we use two parameter sets for the blob size and the number $N_b$ of basepairs within a blob: (a) $r_g = 15$ nm and $N_b = 10^4$ [35, 41] and (b) $r_g = 35$ nm and $N_b = 5 \times 10^4$ [36]. The volume of the nucleoid can be approximated as a cylinder of diameter $d_\mathrm{nuc} = 0.[\ldots]\,\mu\mathrm{m}$ and length $l_\mathrm{nuc} = 1.[\ldots]\,\mu\mathrm{m}$ [39]. We use a cuboid with edge lengths $l_x = l_y = \sqrt{\pi}\,d_\mathrm{nuc}/2 \approx [\ldots]$ nm and $l_z = l_\mathrm{nuc}$. This corresponds to nucleoid lattices of size $7 \times 7 \times 46$ and $3 \times 3 \times 20$. As the E. coli genome consists of $\sim[\ldots]$ […] $d_\mathrm{cell} = 0.[\ldots]\,\mu\mathrm{m}$ and length $l_\mathrm{cell} = 2.[\ldots]\,\mu\mathrm{m}$ [39]. Accordingly, we use embracing lattices of size $15 \times 15 \times 83$ and $6 \times 6 \times 36$ to mimic the full cell volume. Besides, we employ $\alpha = \sqrt{[\ldots]/[\ldots]}$ […] $k_\mathrm{ass}$ as detailed in the SI S1 and we use $D_\mathrm{3D} = 3\,\mu\mathrm{m^2/s}$ and $D_\mathrm{1D} = 0.[\ldots]\,\mu\mathrm{m^2/s}$ [19]. The results of the random walk simulation are summarized in table 1.

Table 1. Simulation results

Set   n_{f,3D}   n_{f,enc}   q_f      n_{r,3D}   n_{r,enc}   q_r
a     31514      766.41      0.[…]    […]        […].48      0.[…]
b     […]        […].63      0.[…]    […]        […].848     0.[…]

A comparison of $n_{r/f,\mathrm{3D}}$ and $n_{r/f,\mathrm{enc}}$ shows that the values obtained with parameter set a are approximately one order of magnitude larger than the ones obtained with set b. This is clear, as set a corresponds to a finer model of the DNA, in which the respective value of $r_g$ is smaller. Next, we consider the ratios $q_f = n_{f,\mathrm{enc}}/n_{f,\mathrm{3D}}$ and $q_r = n_{r,\mathrm{enc}}/n_{r,\mathrm{3D}}$, that is, the fractions of sites containing a blob encountered during a trajectory. The results are very close to the total fraction of sites that are occupied by a blob: for parameter set a this is $464/(15 \times 15 \times 83) \approx 0.0248$, and for set b $[\ldots]/(6 \times 6 \times 36) \approx [\ldots]$.

Non-monotonic behavior

In figure 4 the mean search time averaged over the ensembles with parameter set a is shown as a function of the association rate $k_\mathrm{ass}$ and the dissociation rate $k_\mathrm{off}$.

We find a non-monotonic dependence both on the association and the dissociation rate, typical for facilitated diffusion models: for a fixed value of $k_\mathrm{ass}$ there exists a value of $k_\mathrm{off}$ that minimizes the search time. This minimal value decreases if both rates are increased while keeping them at a constant ratio.

In figure 5 the ratio of the search time obtained with parameter set b to the search time obtained with parameter set a is plotted for the same range as in figure 4.

Even though set b always yields slightly smaller search times, the results are very similar, especially in the range usually studied in experiments, as we will see below. Therefore in the following we solely consider results obtained with set a. In the SI S1 we moreover show that the approach to use an ensemble average to obtain the mean search time is justified, as the scatter between data obtained with individual conformations is negligible (see figure S1). Only at very low values of $k_\mathrm{off}$, when the TF spends considerable time in the non-specifically bound state, does the individual conformation play a role.

Figure 4. Mean search time. The mean search time is plotted as a function of the dissociation rate $k_\mathrm{off}$ and the association rate $k_\mathrm{ass}$ (using parameter set a). The blue bar and the blue dotted lines denote the range of $k_\mathrm{off}$ which is biologically relevant [19].

Figure 5. Difference between the two parameter sets. The plot shows the ratio of the mean search time obtained with parameter set b to the one obtained with set a.

We saw that for fixed values of $k_\mathrm{ass}$ there exists an optimal value of $k_\mathrm{off}$, for which the target localization occurs fastest. It is insightful to study whether a living E. coli cell operates close to this point.

Comparison to experimental results

We choose the rates according to the results of Xie and coworkers [19]: they measured that the lac repressor spends 87% of the total time non-specifically bound and determined the residence time on DNA $t_R$ to be in the range

$$0.[\ldots] < t_R = 1/k_\mathrm{off} < [\ldots]. \tag{7}$$

To incorporate these values, we calculate the fraction of time, $f_b$, that the TF spends non-specifically bound. This is obtained from Eq. 6 by only considering the terms involving $\tau_d$ and $\tau_t$. The result is plotted in figure 6, again as a function of dissociation and association rate.

Figure 6. Bound fraction of time. The fraction of time during which the TF is non-specifically bound is shown (using parameter set a).

We see that contour lines of a constant fraction appear as straight lines in this log-log plot. A numerical analysis yields that the condition $f_b = 0.87$ is fulfilled for

$$\log_{10}\left(k_\mathrm{ass}\,(\mathrm{M^{-1}\,s^{-1}})\right) = 1.04\,\log_{10}\left(k_\mathrm{off}\,(\mathrm{s^{-1}})\right) + 2.[\ldots]. \tag{8}$$

The observation that the slope of this curve is (nearly) unity reflects the fact that specifying the bound fraction of time is equivalent to specifying the equilibrium binding constant, which is simply given by the ratio of $k_\mathrm{ass}$ and $k_\mathrm{off}$. We plug Eq. 8 into our model and plot the resulting mean search time as a function of the single remaining parameter $k_\mathrm{off}$ in figure 7 in the range given by Eq. 7. Additionally, in figure 7 we plot the minimal search time in this regime, which is obtained by choosing the optimal value of $k_\mathrm{ass}$.

In both cases we obtain a monotonically decreasing function of $k_\mathrm{off}$. Most interestingly, the values obtained in this biologically relevant parameter regime are only marginally larger than the optimal ones. At $k_\mathrm{off} \gtrsim [\ldots]\,\mathrm{s^{-1}}$ the two data sets nearly lie on top of each other. This means that within our model an E. coli cell seems to operate quite close to conditions which are optimal for target localization. At $k_\mathrm{off} = 200\,\mathrm{s^{-1}}$, which was used in the discussion in ref. [30], we obtain $\tau \approx 311$ s. This is approximately 12% below the experimental result $6 \times 59\,\mathrm{s} = 354\,\mathrm{s}$ [19], implying a very favorable agreement.
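Two of the statements above can be checked with elementary arithmetic. The sketch below first computes the relative deviation between the model value and the experimental value quoted in the text. It then illustrates, in a simplified two-state picture that is an assumption of this example and not the calculation based on Eq. 6, why a fixed bound fraction ties $k_\mathrm{ass}$ to $k_\mathrm{off}$ with unit slope in a log-log plot; `c_eff` is a hypothetical effective concentration of binding sites.

```python
import math


def relative_deviation(model_s, experiment_s):
    """Fraction by which the model value lies below the experimental one."""
    return (experiment_s - model_s) / experiment_s


def k_ass_for_bound_fraction(f_b, k_off, c_eff):
    """Two-state estimate: f_b = K*c / (1 + K*c) with K = k_ass / k_off.

    c_eff (in M) is a hypothetical effective concentration of binding sites.
    """
    return f_b / (1.0 - f_b) * k_off / c_eff


tau_model = 311.0      # s, model result quoted in the text
tau_exp = 6 * 59.0     # s, experimental value quoted in the text
print(f"{100 * relative_deviation(tau_model, tau_exp):.1f} % below experiment")

# Slope of log10(k_ass) versus log10(k_off) at fixed bound fraction.
k1 = k_ass_for_bound_fraction(0.87, 100.0, c_eff=1.0)
k2 = k_ass_for_bound_fraction(0.87, 1000.0, c_eff=1.0)
print(math.log10(k2) - math.log10(k1))
```

In the two-state picture the slope is exactly one; the value 1.04 obtained from the full model deviates from this only slightly.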

Local searches

There is some evidence that many TFs are produced close to their target positions, a phenomenon called colocalization [14, 49]. These local searches would obviously be faster than a global search starting at a random position within the cell. To quantify this, in figure 8 we plot what percentage of the total search time is still needed to find the target if the TF starts its search in the target blob while all other parameters remain unchanged.

Figure 7. Mean search time and minimal search time. The mean search time and the minimal search time (with appropriately chosen $k_\mathrm{ass}$) are plotted as a function of the dissociation rate $k_\mathrm{off}$ (s$^{-1}$) at parameters relevant for the interior of living cells.

Figure 8. Acceleration due to local searches. The ratio of the time needed in a local search to the one in a global search (with parameter set a) is shown.

In mathematical terms this corresponds to omitting the terms in the first line of Eq. 6. We see that only for relatively large values of $k_\mathrm{ass}$ an appreciable acceleration is obtained for local searches. This is clear, as large values of the association rate imply that all the blobs encountered en route act as traps slowing down the transport. Interestingly, in the regime typical for the interior of cells the acceleration is small. This can also be interpreted in the more general context of "geometry-controlled kinetics", see the works of Bénichou and coworkers [50, 51]. These authors showed that for non-compact exploration of space, as is the case in the present model, the initial position of a searching particle has little influence.
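The ratio between local and global search times follows directly from splitting Eq. 6 into its first line and the remaining terms. The sketch below is a minimal illustration with hypothetical numbers; the function name and the two example inputs are assumptions.

```python
def local_to_global_ratio(first_line, remainder):
    """Ratio of local to global mean search time.

    first_line: time to reach the target blob for the first time, i.e. the
                terms in the first line of Eq. 6.
    remainder:  all other terms of Eq. 6 (returns and final success).
    """
    return remainder / (first_line + remainder)


# Hypothetical numbers in seconds, chosen only to illustrate two regimes.
print(local_to_global_ratio(first_line=30.0, remainder=270.0))   # small gain
print(local_to_global_ratio(first_line=200.0, remainder=50.0))   # large gain
```

A ratio close to one means that starting in the target blob saves little time, which is the situation found above for the parameter regime typical for the interior of cells.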

Discussion

We analyzed the facilitated diﬀusion mechanism in a living cell using a coarse-grained model of thebacterial genome. Just like in dilute in vitro systems there is a non-monotonic dependence both on thedissociation rate and the association rate of TFs from and to DNA. The respective optimal conditionsmark a trade-oﬀ between spending too much time on DNA where the motion is rather slow, but thetarget can be found, and spending too much time in the cytoplasm where the motion is faster, but theTF is insensitive to the target.1When calculating the mean search time as an input from our random walk simulation we solelyuse the mean number of steps taken and the number of blobs encountered during the trajectory. Thiscorresponds to treating the nucleoid body as an eﬀective medium through which the TF diﬀuses, whichagrees with the observations made by Koslover et al. that within a short time span the TF starts aneﬀective diﬀusive motion [30]. Accordingly, we see that the exact values of the parameters describing theDNA conformation have only little eﬀect on our results. Only the fact that there is an eﬀective mediumcharacterized by the DNA density matters.Calibrating our results with the experimental observation that the TF spends 87% of the time non-speciﬁcally bound [19], we obtain search times that only slightly underestimate the experimentally knownresults. In a previous study we showed that the introduction of a search and a recognition state in orderto resolve the speed-stability paradox slows down the search [44]. Thus, a reﬁned model taking this eﬀectinto account could yield a result even closer to the experimental one.Most importantly, within our model the results in the biologically relevant regime of dissociation ratesare quite close to the ones minimizing the search time, indicating that living

E. coli cells function nearconditions optimal for TF target location.Our results for the mean search times are similar to those obtained by Koslover et al. [30]. However,in their model for in vivo facilitated diﬀusion they distribute the DNA over the entire cell volume andassume a random coil conﬁguration. If one were conﬁning the DNA to the smaller nucleoid volume, theeﬀective DNA-TF contact radius in that model would then become smaller than the average distancebetween DNA segments. Besides, our model is less idealized. In that sense our current approach has theadvantage that the DNA is realistically conﬁned to the nucleoid volume, and based on input parametersdeduced from experimental studies we also obtain mean search times, that are very close to experimental in vivo values. Moreover, our model oﬀers the advantage that in future studies additional informationmay be deduced, for example, by studying the underlying probability densities of n r , , n r , enc , etc., inaddition to their mean values determined here. Colocalization eﬀects

Comparing the mean search times for TFs starting at a random position in the cell volume with those for TFs that already start close to the target, we only observe a minor acceleration. This is due to the fact that most of the search time is spent returning to the target blob after a failed attempt to find the target. For a wide range of parameters the first encounter with the target blob only represents a small fraction of the whole search time. Leaving the picture of mean values for the search time of an ensemble of TFs, on the level of single trajectories immediate returns to the target blob are indeed possible and thus may lead to search times much shorter than the average search time. Such scenarios may in fact be relevant for biological cells.

Should observations of anomalous diffusion for TFs in the cytoplasm of living cells be substantiated, the effect of colocalization should become significantly more pronounced, if the nature of the exploration of space is compact [50, 51]: subdiffusion implies an increased occupation probability near the initial position [23, 52, 53], and thus increases the likelihood for successful TF-DNA binding after repeated attempts. In that sense subdiffusion may even be beneficial for molecular processes in living cells, as argued recently [52, 54, 55].

We believe that this relatively simple model for facilitated diffusion in vivo will instigate new experiments and more detailed theories, to ultimately obtain a full understanding of bacterial gene regulation.

Acknowledgments

References

1. Riggs AD, Bourgeois S, Cohn M (1970) The lac repressor-operator interaction: III. Kinetic studies. J Mol Biol 53: 401-417.
2. von Smoluchowski M (1916) Three presentations on diffusion, molecular movement according to Brown and coagulation of colloid particles. Physikal Zeitschr 17: 557-571.
3. Berg OG, Winter RB, Von Hippel PH (1981) Diffusion-driven mechanisms of protein translocation on nucleic acids. 1. Models and theory. Biochemistry 20: 6929-6948.
4. Winter RB, Berg OG, Von Hippel PH (1981) Diffusion-driven mechanisms of protein translocation on nucleic acids. 3. The Escherichia coli lac repressor-operator interaction: kinetic measurements and conclusions. Biochemistry 20: 6961-6977.
5. Slutsky M, Mirny L (2004) Kinetics of protein-DNA interaction: Facilitated target location in sequence-dependent potential. Biophys J 87: 4021-4035.
6. Lomholt MA, van den Broek B, Kalisch SMJ, Wuite GJL, Metzler R (2009) Facilitated diffusion with DNA coiling. Proc Natl Acad Sci USA 106: 8204-8208.
7. Zhou HX (2011) Rapid search for specific sites on DNA through conformational switch of nonspecifically bound proteins. Proc Natl Acad Sci USA 108: 8651-8656.
8. Sheinman M, Bénichou O, Kafri Y, Voituriez R (2012) Classes of fast and specific search mechanisms for proteins on DNA. Rep Prog Phys 75: 026601.
9. Mirny L, Slutsky M, Wunderlich Z, Tafvizi A, Leith J, et al. (2009) How a protein searches for its site on DNA: the mechanism of facilitated diffusion. J Phys A Math Gen 42: 434013.
10. Kolomeisky AB (2011) Physics of protein-DNA interactions: mechanisms of facilitated target search. Phys Chem Chem Phys 13: 2088-2095.
11. Sokolov I, Metzler R, Pant K, Williams M (2005) Target search of N sliding proteins on a DNA. Biophys J 89: 895-902.
12. Gowers DM, Wilson GG, Halford SE (2005) Measurement of the contributions of 1D and 3D pathways to the translocation of a protein along DNA. Proc Natl Acad Sci USA 102: 15883-15888.
13. Wang YM, Austin RH, Cox EC (2006) Single molecule measurements of repressor protein 1D diffusion on DNA. Phys Rev Lett 97: 048302.
14. Kolesov G, Wunderlich Z, Laikova ON, Gelfand MS, Mirny LA (2007) How gene order is influenced by the biophysics of transcription regulation. Proc Natl Acad Sci USA 104: 13948-13953.
15. Bonnet I, Biebricher A, Porté PL, Loverdo C, Bénichou O, et al. (2008) Sliding and jumping of single EcoRV restriction enzymes on non-cognate DNA. Nucleic Acids Res 36: 4118-4127.
16. van den Broek B, Lomholt MA, Kalisch SMJ, Metzler R, Wuite GJL (2008) How DNA coiling enhances target localization by proteins. Proc Natl Acad Sci USA 105: 15738-15742.
17. Konopka MC, Shkel IA, Cayley S, Record MT, Weisshaar JC (2006) Crowding and confinement effects on protein diffusion in vivo. J Bacteriol 188: 6115-6123.
18. Kühn T, Ihalainen TO, Hyväluoma J, Dross N, Willman SF, et al. (2011) Protein diffusion in mammalian cell cytoplasm. PLoS One 6: e22962.
19. Elf J, Li GW, Xie XS (2007) Probing transcription factor dynamics at the single-molecule level in a living cell. Science 316: 1191-1194.
20. Hammar P, Leroy P, Mahmutovic A, Marklund EG, Berg OG, et al. (2012) The lac repressor displays facilitated diffusion in living cells. Science 336: 1595-1598.
21. Minton AP (2001) The influence of macromolecular crowding and macromolecular confinement on biochemical reactions in physiological media. J Biol Chem 276: 10577-10580.
22. Morelli MJ, Allen RJ, ten Wolde PR (2011) Effects of macromolecular crowding on genetic networks. Biophys J 101: 2882-2891.
23. Golding I, Cox EC (2006) Physical nature of bacterial cytoplasm. Phys Rev Lett 96: 098102.
24. Weber SC, Spakowitz AJ, Theriot JA (2010) Bacterial chromosomal loci move subdiffusively through a viscoelastic cytoplasm. Phys Rev Lett 104: 238102.
25. Jeon JH, Tejedor V, Burov S, Barkai E, Selhuber-Unkel C, et al. (2011) In vivo anomalous diffusion and weak ergodicity breaking of lipid granules. Phys Rev Lett 106: 048103.
26. Metzler R, Klafter J (2000) The random walk's guide to anomalous diffusion: a fractional dynamics approach. Phys Rep 339: 1-77.
27. Barkai E, Garini Y, Metzler R (2012) Strange kinetics of single molecules in living cells. Phys Today 65(8): 29-35.
28. Banks D, Fradin C (2005) Anomalous diffusion of proteins due to molecular crowding. Biophys J 89: 2960-2971.
29. Weiss M, Elsner M, Kartberg F, Nilsson T (2004) Anomalous subdiffusion is a measure for cytoplasmic crowding in living cells. Biophys J 87: 3518-3524.
30. Koslover EF, Diaz de la Rosa MA, Spakowitz AJ (2011) Theoretical and computational modeling of target-site search kinetics in vitro and in vivo. Biophys J 101: 856-865.
31. Bénichou O, Chevalier C, Meyer B, Voituriez R (2011) Facilitated diffusion of proteins on chromatin. Phys Rev Lett 106: 038102.
32. Foffano G, Marenduzzo D, Orlandini E (2012) Facilitated diffusion on confined DNA. Phys Rev E Stat Nonlin Soft Matter Phys 85: 021919.
33. Rocha EPC (2008) The organization of the bacterial genome. Annu Rev Genet 42: 211-233.
34. Kim J, Yoshimura SH, Hizume K, Ohniwa RL, Ishihama A, et al. (2004) Fundamental structural units of the Escherichia coli nucleoid revealed by atomic force microscopy. Nucleic Acids Res 32: 1982-1992.
35. Postow L, Hardy C, Arsuaga J, Cozzarelli N (2004) Topological domain structure of the Escherichia coli chromosome. Genes Dev 18: 1766-1779.
36. Romantsov T, Fishov I, Krichevsky O (2007) Internal structure and dynamics of isolated Escherichia coli nucleoids assessed by fluorescence correlation spectroscopy. Biophys J 92: 2875-2884.
37. Umbarger MA, Toro E, Wright MA, Porreca GJ, Bau D, et al. (2011) The three-dimensional architecture of a bacterial genome and its alteration by genetic perturbation. Mol Cell 44: 252-264.
38. Viollier PH, Thanbichler M, McGrath PT, West L, Meewan M, et al. (2004) Rapid and sequential movement of individual chromosomal loci to specific subcellular locations during bacterial DNA replication. Proc Natl Acad Sci USA 101: 9257-9262.
39. Jun S, Wright A (2010) Entropy as the driver of chromosome segregation. Nat Rev Microbiol 8: 600-607.
40. Jung Y, Jeon C, Kim J, Jeong H, Jun S, et al. (2012) Ring polymers as model bacterial chromosomes: confinement, chain topology, single chain statistics, and how they interact. Soft Matter 8: 2095-2102.
41. Buenemann M, Lenz P (2010) A geometrical model for DNA organization in bacteria. PLoS One 5: e13806.
42. Junier I, Martin O, Képès F (2010) Spatial and topological organization of DNA chains induced by gene co-localization. PLoS Comput Biol 6: e1000678.
43. Fritsche M, Li S, Heermann DW, Wiggins PA (2012) A model for Escherichia coli chromosome packaging supports transcription factor-induced DNA domain formation. Nucleic Acids Res 40: 972-980.
44. Bauer M, Metzler R (2012) Generalized facilitated diffusion model for DNA-binding proteins with search and recognition states. Biophys J 102: 2321-2330.
45. Sheinman M, Kafri Y (2009) The effects of intersegmental transfers on target location by proteins. Phys Biol 6: 016003.
46. Madras N, Orlitsky A, Shepp L (1990) Monte Carlo generation of self-avoiding walks with fixed endpoints and fixed length. J Stat Phys 58: 159-183.
47. Reingruber J, Holcman D (2010) Narrow escape for a stochastically gated Brownian ligand. J Phys Condens Matter 22: 065103.
48. Coppey M, Bénichou O, Voituriez R, Moreau M (2004) Kinetics of target site localization of a protein in DNA: a stochastic approach. Biophys J 87: 1640-1649.
49. Wunderlich Z, Mirny LA (2008) Spatial effects on the speed and reliability of protein-DNA search. Nucleic Acids Res 36: 3570-3578.
50. Bénichou O, Chevalier C, Klafter J, Meyer B, Voituriez R (2010) Geometry-controlled kinetics. Nat Chem 2: 472-477.
51. Meyer B, Chevalier C, Voituriez R, Bénichou O (2010) Universality classes of first-passage-time distribution in confined media. Phys Rev E Stat Nonlin Soft Matter Phys 83: 051116.
52. Guigas G, Weiss M (2008) Sampling the cell with anomalous diffusion - The discovery of slowness. Biophys J 94: 90-94.
53. Lomholt MA, Zaid IM, Metzler R (2007) Subdiffusion and weak ergodicity breaking in the presence of a reactive boundary. Phys Rev Lett 98: 200603.
54. Hellmann M, Heermann DW, Weiss M (2012) Enhancing phosphorylation cascades by anomalous diffusion. EPL 97: 58004.
55. Sereshki LE, Lomholt MA, Metzler R (2012) A solution to the subdiffusion-efficiency paradox: inactive states enhance reaction efficiency at subdiffusion conditions in living cells. EPL 97: 20008.

Supporting Information S1

In this supporting information we detail the explicit calculations which are beyond the scope of the main text.

To relate $p_r$ to the non-specific association rate $k_{\mathrm{ass}}$ per base pair (in units of $\mathrm{M^{-1}\,s^{-1}}$), we solve the following diffusion equation for the TF's probability $c(\mathbf{r},t)$ to be at position $\mathbf{r}$ at time $t$:

$$\frac{\partial c(\mathbf{r},t)}{\partial t} = \begin{cases} D_3\,\Delta c(\mathbf{r},t) - \kappa\,c(\mathbf{r},t), & \text{for } 0 < r < r_g \\ D_3\,\Delta c(\mathbf{r},t), & \text{for } r_g < r < r_1, \end{cases} \tag{S1}$$

with $\kappa = n k_{\mathrm{ass}} N_b$, where $n$ denotes the density of DNA and $N_b$ the number of base pairs within the blob. $D_3$ denotes the 3D diffusion constant and $r_g$ the blob's radius of gyration. The differential equation is subject to the initial condition

$$c(\mathbf{r},t=0) = \begin{cases} c_0 = 3/(4\pi r_g^3), & \text{for } 0 < r < r_g \\ 0, & \text{for } r_g < r < r_1, \end{cases} \tag{S2}$$

and the boundary condition $c(r = r_1, t) = 0$. Thus, $r_1$ represents a cutoff radius at which the TF is assumed to have definitely left the domain of the blob. We use $n = c_0$ as we study the situation where one TF is in the blob containing one DNA chain.

We define the Laplace transform $f(u)$ of a function $f(t)$ through:

$$f(u) = \int_0^\infty f(t)\,\exp(-ut)\,dt. \tag{S3}$$

In Laplace space the differential equation S1 reads:

$$u\,c(r,u) = \begin{cases} c_0 + D_3\,\Delta c(r,u) - \kappa\,c(r,u), & \text{for } 0 < r < r_g \\ D_3\,\Delta c(r,u), & \text{for } r_g < r < r_1. \end{cases} \tag{S4}$$

From its solution the flux out of the outer sphere $j_{\mathrm{out}}(u)$ and the binding flux $j_{\mathrm{bind}}(u)$ in the inner sphere can be obtained via:

$$j_{\mathrm{out}}(u) = -4\pi r_1^2 D_3 \left.\frac{\partial c(r,u)}{\partial r}\right|_{r=r_1}, \tag{S5}$$

and

$$j_{\mathrm{bind}}(u) = 4\pi\kappa \int_0^{r_g} dr\, r^2\, c(r,u). \tag{S6}$$

We obtain

$$j_{\mathrm{out}}(u) = \frac{3 r_1 q_2}{r_g^3 q_1^3 \sinh(q_2\,\delta r)}\,\frac{q_1 r_g \coth(q_1 r_g) - 1}{\coth(q_1 r_g) + \frac{q_2}{q_1}\coth(q_2\,\delta r)}, \tag{S7}$$

and furthermore

$$j_{\mathrm{bind}}(u) = \frac{3}{r_g^3 q_1^3}\,\frac{\kappa}{u+\kappa}\left[\frac{r_g^3 q_1^3}{3} - \frac{\bigl(q_1 r_g \coth(q_1 r_g) - 1\bigr)\bigl(1 + r_g q_2 \coth(q_2\,\delta r)\bigr)}{\coth(q_1 r_g) + \frac{q_2}{q_1}\coth(q_2\,\delta r)}\right], \tag{S8}$$

where $q_1 = \sqrt{(u+\kappa)/D_3}$, $q_2 = \sqrt{u/D_3}$ and $\delta r = r_1 - r_g$.

A Taylor series around $u = 0$ then yields

$$j_{\mathrm{bind}}(u) \simeq p_r\,(1 - \tau_b u), \tag{S9}$$

and

$$j_{\mathrm{out}}(u) \simeq (1 - p_r)(1 - \tau_e u). \tag{S10}$$

We obtain

$$p_r = 1 - \frac{3\alpha\,\varphi(\gamma)}{\alpha + (\alpha - 1)\gamma^2\varphi(\gamma)}, \tag{S11}$$

where we introduced $\alpha = r_1/r_g$, $\gamma = r_g\sqrt{\kappa/D_3}$ and the auxiliary function $\varphi(\gamma) = (\gamma\coth(\gamma) - 1)/\gamma^2$ [S1].

The average time it takes for binding reads

$$\tau_b = \frac{\alpha}{2\kappa}\Bigl\{5\alpha + \bigl(4\gamma^2(\alpha-1) - 15\alpha\bigr)\varphi(\gamma) + \bigl(12 - 15\alpha + 2\gamma^2(1-\alpha)^2\bigr)\gamma^2\varphi^2(\gamma)\Bigr\} \times \bigl(\alpha + (\alpha-1)\gamma^2\varphi(\gamma)\bigr)^{-1} \times \bigl(\alpha + (\gamma^2(\alpha-1) - 3\alpha)\varphi(\gamma)\bigr)^{-1}. \tag{S12}$$

This equation is true for arbitrary values of $\alpha$. In the main text we explicitly state the case $\alpha = 2$. However, in the results section we use $\alpha = \sqrt{23/5} \approx 2.14$, as described in the last section of this SI.

The average time the TF needs for leaving the blob is given by

$$\tau_e = \frac{1}{2\kappa}\biggl\{\alpha\bigl(3 - \varphi^{-1}(\gamma)\bigr) + \gamma^2\Bigl((3\alpha - 2)\varphi(\gamma) + \frac{2 + \alpha^3 - 3\alpha}{3}\Bigr) - \frac{\gamma^4}{3}(1-\alpha)^3\varphi(\gamma)\biggr\} \times \bigl(\alpha + (\alpha-1)\gamma^2\varphi(\gamma)\bigr)^{-1}. \tag{S13}$$

To calculate the probability to find the target before dissociating, we consider the one-dimensional diffusion problem

$$\frac{\partial c(z,t)}{\partial t} = D_1\frac{\partial^2 c(z,t)}{\partial z^2} - k_{\mathrm{off}}\,c(z,t), \tag{S14}$$

subject to the initial condition $c(z, t=0) = 1/L$ and the boundary conditions $c(z=0,t) = 0$ and $\left.\partial c(z,t)/\partial z\right|_{z=L} = 0$. In Laplace space with respect to time we obtain the following solution:

$$c(z,u) = \frac{1}{L(u + k_{\mathrm{off}})}\left[1 - \frac{\cosh\bigl((L-z)\sqrt{(u+k_{\mathrm{off}})/D_1}\bigr)}{\cosh\bigl(L\sqrt{(u+k_{\mathrm{off}})/D_1}\bigr)}\right]. \tag{S15}$$

A Taylor series of $j_{\mathrm{target}}(u) = D_1\left.\partial c(z,u)/\partial z\right|_{z=0}$ in $u$ yields:

$$j_{\mathrm{target}}(u) \simeq \frac{\tanh(L/\ell)}{L/\ell} + \frac{u}{2k_{\mathrm{off}}}\left(\frac{1}{\cosh^2(L/\ell)} - \frac{\tanh(L/\ell)}{L/\ell}\right), \tag{S16}$$

where $\ell = \sqrt{D_1/k_{\mathrm{off}}}$.
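As a rough numerical illustration, the closed-form blob quantities (Eqs. S11 to S13) and the sliding quantities that follow from the expansion S16 (Eqs. S17 and S18) can be evaluated directly. The following Python sketch is not part of the original analysis: the function names `phi`, `blob_quantities` and `sliding_quantities` are hypothetical, the formulas are assumed exactly as written in this SI, and the parameter values in the usage lines are arbitrary placeholders rather than values from the paper.

```python
import math


def phi(gamma):
    """Auxiliary function phi(gamma) = (gamma*coth(gamma) - 1) / gamma**2 (gamma > 0)."""
    return (gamma / math.tanh(gamma) - 1.0) / gamma**2


def blob_quantities(kappa, d3, r_g, alpha):
    """Return (p_r, tau_b, tau_e) following Eqs. S11-S13.

    kappa: effective binding rate inside the blob (must be > 0),
    d3: 3D diffusion constant, r_g: radius of gyration,
    alpha: ratio of the cutoff radius to r_g.
    """
    g2 = r_g**2 * kappa / d3                      # gamma squared
    f = phi(math.sqrt(g2))
    denom = alpha + (alpha - 1.0) * g2 * f        # denominator shared by S11-S13
    numer = alpha + (g2 * (alpha - 1.0) - 3.0 * alpha) * f
    p_r = numer / denom                           # same as 1 - 3*alpha*phi/denom
    braces_b = (5.0 * alpha
                + (4.0 * g2 * (alpha - 1.0) - 15.0 * alpha) * f
                + (12.0 - 15.0 * alpha + 2.0 * g2 * (1.0 - alpha)**2) * g2 * f**2)
    tau_b = alpha / (2.0 * kappa) * braces_b / (denom * numer)
    braces_e = (alpha * (3.0 - 1.0 / f)
                + g2 * ((3.0 * alpha - 2.0) * f + (2.0 + alpha**3 - 3.0 * alpha) / 3.0)
                - g2**2 * (1.0 - alpha)**3 * f / 3.0)
    tau_e = braces_e / (2.0 * kappa * denom)
    return p_r, tau_b, tau_e


def sliding_quantities(k_off, d1, length):
    """Return (p_t, tau_t) following Eqs. S17-S18 for a DNA stretch of given length."""
    x = length / math.sqrt(d1 / k_off)            # L / ell
    p_t = math.tanh(x) / x
    # x / (sinh(x) * cosh(x)) rewritten with exponentials to avoid overflow at large x
    ratio = 4.0 * x * math.exp(-2.0 * x) / -math.expm1(-4.0 * x)
    tau_t = (1.0 - ratio) / (2.0 * k_off)
    return p_t, tau_t


if __name__ == "__main__":
    # Hypothetical illustrative values, not taken from the paper.
    print(blob_quantities(kappa=50.0, d3=3.0, r_g=0.1, alpha=math.sqrt(23.0 / 5.0)))
    print(sliding_quantities(k_off=200.0, d1=0.05, length=0.5))
```

For small $\kappa$ the value of `tau_e` returned by this sketch should approach the pure-diffusion exit time $r_g^2(5\alpha^2-3)/(30 D_3)$, which provides a quick sanity check of the implementation.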

The zeroth-order term of Eq. S16 corresponds to a target finding probability of

$$p_t = \frac{\tanh(L/\ell)}{L/\ell}. \tag{S17}$$

The average time it takes to find the target reads

$$\tau_t = \frac{1}{2k_{\mathrm{off}}}\left(1 - \frac{L/\ell}{\sinh(L/\ell)\cosh(L/\ell)}\right). \tag{S18}$$

In Figure S1 we plot the ratio of the mean search time for each of the eight individual conformations to the mean search time of the corresponding ensemble average at $k_{\mathrm{ass}} = 10^{\ldots}\,\mathrm{M^{-1}\,s^{-1}}$.

Apparently all the individual curves only scatter about one percent around the value obtained with the ensemble average. Thus it appears appropriate always to use the latter in the main text.

Figure S1. Ratio of the mean search times obtained with individual conformations to the respective ensemble-averaged mean search time at $k_{\mathrm{ass}} = 10^{\ldots}\,\mathrm{M^{-1}\,s^{-1}}$. (Vertical axis: $\tau_{\mathrm{indiv}}/\tau_{\mathrm{ens}}$; horizontal axis: $k_{\mathrm{off}}$ in s$^{-1}$; curves a1 to a3 and b1 to b5.)

$\alpha = \sqrt{23/5}$

In principle the parameter $\alpha$, which represents the ratio of the cutoff radius $r_1$ and the blob's radius of gyration $r_g$, is a free parameter which can be used to refine the model. However, in the limit $\kappa \to 0$, that is when no binding to DNA occurs or when there is no DNA present, the escape time $\tau_e$ from a blob should coincide with the free diffusion time $\tau_D$. Now using Eq. S13,

$$\lim_{\kappa \to 0} \tau_e(\kappa) = \frac{r_g^2}{30 D_3}\,(5\alpha^2 - 3). \tag{S19}$$

Equalizing this with $\tau_D = \frac{2 r_g^2}{3 D_3}$ yields $\alpha = \sqrt{23/5}$. Consequently, this value was chosen in the main text.

References

S1. Reingruber J, Holcman D (2010) Narrow escape for a stochastically gated brownian ligand.