Kinetic discrimination of a polymerase in the presence of obstacles

Abstract

One of the causes of high fidelity of copying in biological systems is kinetic discrimination. In this mechanism larger dissipation and copying velocity result in improved copying accuracy. We consider a model of a polymerase which simultaneously copies a single stranded RNA and opens a single- to double-stranded junction serving as an obstacle. The presence of the obstacle slows down the motor, resulting in a change of its fidelity, which can be used to gain information about the motor and junction dynamics. We find that the motor's fidelity does not depend on details of the motor-junction interaction, such as whether the interaction is passive or active. Analysis of the copying fidelity can still be used as a tool for investigating the junction kinetics.



Ilana Bogod ∗ and Saar Rahav † Schulich Faculty of Chemistry, Technion-Israel Institute of Technology, Haifa 32000, Israel


I. INTRODUCTION

Being alive means being out of thermal equilibrium. Our cells grow and divide via a host of nonequilibrium processes in which complex molecules are synthesized, transported and degraded. Many of these processes are carried out by biomolecules which act like motors or machines, and are driven by chemical potential differences. One of the most delicate and demanding tasks handled by such biological motors is the replication and distribution of genetic information. Polymerase type enzymes, which generate copies of nucleic acid polymers, are a well-known class of information handling molecular machines. Maintaining high fidelity of copying by these enzymes is often crucial, as a large number of copying errors may result in malfunctions and death. Since information is physical, as beautifully stated by Landauer [1], the copying should be studied as a thermodynamic process.

Close to thermal equilibrium, the accuracy of the copying process is determined by the difference in binding free energy between correct and incorrect monomers. This mechanism of error reduction has been termed energetic discrimination. But binding free energies of various nucleic acids have a limited range, as they all need to stay bound, yet not be too difficult to remove. This means that energetic discrimination schemes often show only moderate levels of accuracy, much lower than what is typically observed in biological systems. The natural conclusion is that accurate copying of information requires out-of-equilibrium processes, as pointed out by Hopfield [2] and Ninio [3]. When a system is driven away from equilibrium, copying fidelity can be enhanced at the cost of additional dissipation. A theory of the fidelity of transcription will inevitably investigate such processes from a thermodynamical perspective.

Several recent papers were devoted to this question [4–9]. Sartori and Pigolotti [4] discussed the difference between the energetic discrimination scheme that was mentioned above and kinetic discrimination. The latter is controlled by the rates of incorporation of correct and incorrect monomers, or equivalently by activation energies, rather than by differences in binding free energies. They pointed out that typically one of these discrimination schemes dominates the other, so that the subdominant mechanism has little effect on the copying fidelity. Kinetic discrimination is typically dominant when the process is far from equilibrium.

The thermodynamics of polymerization processes were studied meticulously by Andrieux and Gaspard in a series of papers [5–9]. For copolymerization on a template with kinetic discrimination, their model showed a trade-off between accuracy and dissipation. Processes driven by larger thermodynamic affinities resulted in faster transcription rates, accompanied by lower error rates.

Such results suggest a connection between the velocity of the polymerizing agent and its accuracy. However, the models studied so far in the context of transcription or copolymerization were allowed to propagate freely on their template. In biological systems, the template (a single strand of DNA/RNA) may be blocked by a second strand or a hairpin, which must be removed before copying can proceed. Typically these obstacles are handled by helicases or other specialized molecular machines. Nevertheless, there are known examples of polymerases, such as the T7 RNA polymerase [10] and HIV-1 reverse transcriptase [11], that remove obstacles on their own, simultaneously transcribing and opening the double-stranded structure. When reaching an obstacle these polymerases must slow down. How does such an interaction with an obstacle affect the fidelity of transcription? Naively, one would expect that the presence of an obstacle would slow down such a motor and, accordingly, lead to more copying errors.

Betterton and Jülicher studied a simple model of a helicase encountering such an obstacle [12, 13]. They qualitatively characterized the interaction between motor and obstacle as being either active or passive. In the passive interaction, the motor must wait until the nearest bond in the double-stranded obstacle opens due to a thermal fluctuation, thereby allowing the motor to step forward and prevent the bond from closing. In the active interaction the motor partially enters the obstacle, and the elastic interaction between the motor and the single- to double-strand junction increases the likelihood of bond opening. Active interactions were found to result in higher rates of bond breaking and larger velocities.

In this paper we study how the fidelity of a polymerase is affected when it encounters an obstacle. A simple model which describes both polymerization on a template and interaction with a junction is developed. It combines elements from the models presented by Andrieux and Gaspard [5–7] and by Betterton and Jülicher [12, 13]. The model is studied numerically with the help of simulations, and analytically, using a steady-growth ansatz. In particular, differences between the copying fidelity of active and passive interactions are investigated.

The structure of the paper is as follows: in section II we present a simple model of a polymerase working against an obstacle. We discuss the possible processes and their transition rates. In section III we present the master equation that describes the dynamics of the model. We also present a steady-growth ansatz that allows one to obtain analytical expressions for observables such as the polymerase mean velocity and copying fidelity. In section IV we investigate the case of passive interaction between the polymerase and the obstacle, while section V is dedicated to systems with active interactions. In both sections the predictions of the steady-growth ansatz are compared to Monte Carlo simulations. We find that while the form of the elastic interaction affects the mean copying rate, it has no effect on the copying fidelity. We discuss the implications of this result in section VI.

FIG. 1: A simple model of a polymerase on a single-stranded DNA that serves as a template for copying. The motor is located on the border between a single- and double-stranded DNA. It can move forward while polymerizing an additional nucleotide onto the complementary strand, or it can move backward while removing a base from this strand. At the same time, the obstacle can move forward, or, if the motor is far enough, the obstacle can move back, as the double strand closes.

II. A MARKOVIAN MODEL OF A POLYMERASE PUSHING AGAINST AN OBSTACLE

In this section we present a simple Markovian model of a polymerase. The model is heuristically depicted in Fig. 1. The system is composed of a substrate nucleic acid polymer (DNA or RNA) with a known sequence of monomers. The active site of the polymerase is located at the $l$-th base of the substrate. The polymerase progresses along the chain while adding monomers to the complementary strand, leaving behind a double-stranded structure. The obstacle we study is a junction with another double-stranded structure. These are known to occur as part of a secondary structure of the substrate, such as a hairpin. In the model, the junction is located at site $(l + j)$.

The discreteness of base pairs naturally calls for a model with discrete steps, which are characterized by transition rates. We consider four different processes:

1. Forward motion of the polymerase from site $l$ to site $l+1$, accompanied by addition of a monomer $m_{l+1}$ to the complementary strand. The rate of this process is denoted by $R^+_{m_{l+1}}$.
2. Backward motion of the polymerase from site $l$ to site $l-1$, while removing monomer $m_l$ from the complementary strand. The rate is denoted by $R^-_{m_l}$.
3. Opening of a bond in the junction, with rate $Q^+$. In this process the junction moves forward along the template.
4. Closing of a bond in the junction, with rate $Q^-$, moving the junction backwards.

An important distinction between the addition and removal of monomers is that the polymerase can always add any of the monomers in the solution, but can remove only the specific monomer which is at the terminal position on the complementary strand. The kinetic equations that we will use in the next section take this into account.

The polymerase and junction interact with each other if they are close enough. Physically, this is an elastic interaction that is caused by deformation of the junction when the polymerase pushes into it. Such an interaction partially destabilizes the outer bond in the junction while pushing the polymerase backwards. We assume that the interaction changes the transition rates according to $R^+_m(j) = r^+_m \Theta^-(j)$, $R^-_m(j) = r^-_m \Theta^+(j)$, $Q^+(j) = q^+ \Theta^+(j)$ and $Q^-(j) = q^- \Theta^-(j)$. Here $r^\pm_m$ and $q^\pm$ are the rates for the non-interacting (or distant) polymerase and junction, and $\Theta^\pm(j)$ describe the polymerase-junction interaction. A more detailed description of the dependence of these rates on system properties is given in the rest of this section.

We wish to study the fidelity of copying and how it is affected by the interaction with the junction. Two qualitatively different mechanisms can be employed to improve fidelity of copying. Energetic discrimination favors the correct monomer based on lower free energy of binding. In contrast, kinetic discrimination originates from different kinetic rates for binding of monomers. As pointed out by Sartori and Pigolotti [4], the two mechanisms compete with each other, since energetic discrimination works near equilibrium, while its kinetic counterpart reaches best fidelity far from equilibrium.

Here we consider a model with pure kinetic discrimination, following a similar choice by Bennett [14] and by Andrieux and Gaspard [5]. We make several simplifying assumptions about the structure of the model. These assumptions allow us to reduce the bookkeeping involved in defining the system's state, while keeping the main qualitative features of the dynamics. Specifically, we assume that the substrate strand is built out of two types of monomers and the solution has the two complementary monomers in equal concentrations and in a spatially homogeneous mixture. With these assumptions, the transition rates only discriminate between addition of a correct or an incorrect monomer to the complementary strand, where the correctness is determined by comparing the monomer to its partner on the substrate. Crucially, there is no need to specify the composition of the substrate.

The absence of energetic discrimination means that

$$\frac{r^+_m}{r^-_m} = \frac{[m\mathrm{NTP}]\,\hat{r}^+_m}{[\mathrm{PP_i}]\,\hat{r}^-_m} = \frac{[m\mathrm{NTP}]}{[\mathrm{PP_i}]}\,\epsilon, \tag{1}$$

for any value of $m$, where $\epsilon = \exp[-\Delta G]$, $\Delta G$ is the standard free energy of the polymerization reaction in units of $k_B T$, $[m\mathrm{NTP}]$ is the concentration of nucleotide $m$ and $[\mathrm{PP_i}]$ is the concentration of pyrophosphate, a byproduct of the polymerization reaction. $\hat{r}^\pm_m$ denotes the transition rates for addition and removal of monomers at standard concentrations. We assume that concentrations do not vary in time. This is a good approximation for many in vitro experiments.

Kinetic discrimination can be expressed through higher rates of addition and removal of the correct monomer ($c$), compared to the wrong monomer ($w$), namely

$$\frac{\hat{r}^+_c}{\hat{r}^+_w} = \frac{\hat{r}^-_c}{\hat{r}^-_w} = d, \tag{2}$$

where $d$ parametrizes the preference for inserting correct monomers, and therefore the resulting fidelity of copying. In living cells $d$ is much larger than the values used here [15]; we consider smaller values to avoid problems associated with insufficient sampling of errors in our simulations. Kinetic discrimination reduces errors because when the polymer grows rapidly, a larger proportion of the incorporated monomers is of the correct type. These monomers are left behind in the copied strand when the polymerase propagates. They are bound from both sides, and are unlikely to detach unless the motor performs multiple backward steps to return to their location in the chain.

Based on these considerations, the transition rates for pure kinetic discrimination take the following form:

$$\begin{aligned}
r^-_w &= \hat{r}^-_w\,[\mathrm{PP_i}],\\
r^-_c &= \hat{r}^-_c\,[\mathrm{PP_i}] = \hat{r}^-_w\, d\,[\mathrm{PP_i}],\\
r^+_w &= \hat{r}^+_w\,[w\mathrm{NTP}] = \hat{r}^-_w\,\epsilon\,[w\mathrm{NTP}],\\
r^+_c &= \hat{r}^+_c\,[c\mathrm{NTP}] = \hat{r}^-_w\, d\,\epsilon\,[c\mathrm{NTP}].
\end{aligned} \tag{3}$$

The opening and closing of the bonds in the junction are ruled by thermal fluctuations. The ratio of rates has been determined empirically to be $q^-/q^+ \approx 7$ [16, 17]. Thermodynamic consistency mandates that

$$\frac{\Theta^+(j)}{\Theta^-(j+1)} = e^{[U(j) - U(j+1)]}, \tag{4}$$

where $U(j)$ is the potential of the interaction at distance $j$ in units of $k_B T$. We will assume that this elastic interaction affects both rates according to

$$\Theta^+(j) = e^{(g-1)[U(j+1) - U(j)]}, \qquad \Theta^-(j+1) = e^{g[U(j+1) - U(j)]}. \tag{5}$$

Here, $g$ is a parameter analogous to a load-distribution factor. Note that the interaction affects both processes that close the distance between the motor and the junction in the same way, and the same is true for the processes that open it. It does not matter if the process involves motion of the motor accompanied by a polymerization reaction, or motion of the junction due to the formation of a new inter-strand bond. The same form of interaction was also used by Betterton and Jülicher in their model of a helicase [12, 13].

III. THE MASTER EQUATION AND THE STEADY-GROWTH ANSATZ

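As a brief aside before the master equation, the rate definitions of Sec. II, Eqs. (3) and (5), can be restated as a short Python sketch. The function names, argument names and the idea of passing the potential $U$ as a callable are illustrative assumptions, not part of the original model description; any numerical values supplied by a caller are placeholders rather than parameters taken from the paper.

```python
import math


def bare_rates(r_hat_w_minus, d, eps, c_ntp, w_ntp, ppi):
    """Bare (non-interacting) polymerase rates of Eq. (3).

    r_hat_w_minus : removal rate of a wrong monomer at standard concentration
    d             : kinetic discrimination parameter, Eq. (2)
    eps           : exp(-Delta G), Eq. (1)
    c_ntp, w_ntp  : concentrations of correct and wrong nucleotides
    ppi           : pyrophosphate concentration
    """
    return {
        "r_w_minus": r_hat_w_minus * ppi,
        "r_c_minus": r_hat_w_minus * d * ppi,
        "r_w_plus": r_hat_w_minus * eps * w_ntp,
        "r_c_plus": r_hat_w_minus * d * eps * c_ntp,
    }


def theta_plus(U, j, g):
    """Eq. (5): factor for transitions that increase the distance, j -> j+1."""
    return math.exp((g - 1.0) * (U(j + 1) - U(j)))


def theta_minus(U, j, g):
    """Eq. (5) shifted by one site: factor for transitions j -> j-1."""
    return math.exp(g * (U(j) - U(j - 1)))
```

With these helpers the dressed rates follow as, for example, `R_plus_c = rates["r_c_plus"] * theta_minus(U, j, g)`. Hard walls are not covered by this sketch and would have to be imposed separately by setting the corresponding factor to zero.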
Consider a system in which the polymerase was placed on the substrate strand and then left to evolve under conditions of mean growth. The state of the system is characterized by: i) the position of the motor $l$ on the substrate, which is assumed to be initially at $l = 0$; $l$ is therefore also the length of the complementary strand being polymerized; ii) the composition of the complementary strand compared to the substrate strand, $m_1, m_2, \ldots, m_l$, where $m_k = c$ or $w$; and iii) the distance $j$ of the motor from the junction.

Given the processes and rates described in Sec. II, the probability distribution of the system evolves according to the master equation

$$\begin{aligned}
\frac{dP}{dt}(m_1 \ldots m_l, l, j, t) ={}& R^+_{m_l}(j+1)\, P(m_1 \ldots m_{l-1}, l-1, j+1, t) + \sum_{m_{l+1}} R^-_{m_{l+1}}(j-1)\, P(m_1 \ldots m_{l+1}, l+1, j-1, t)\\
&+ Q^+(j-1)\, P(m_1 \ldots m_l, l, j-1, t) + Q^-(j+1)\, P(m_1 \ldots m_l, l, j+1, t)\\
&- \Big[ Q^+(j) + Q^-(j) + R^-_{m_l}(j) + \sum_{m_{l+1}} R^+_{m_{l+1}}(j) \Big] P(m_1 \ldots m_l, l, j, t).
\end{aligned} \tag{6}$$

The system evolution can be studied in full detail by solving the master equation with an appropriately chosen initial condition, or alternatively by simulating the underlying jump process. However, both approaches are needlessly complicated if one is interested in simple quantities such as the mean error rate and velocity.

A simpler approach for the description of these observables, which may even allow for an analytical solution, was developed by Andrieux and Gaspard [5]. The approach is based on the assumption that after a transient, the system reaches a steady-growth regime in which correlations between the length of the chain $l$, the composition of the chain and the distance $j$ are lost. In this steady-growth regime the probability distribution can be approximated by

$$P(m_1 \ldots m_l, l, j, t) \simeq P_t(l)\, \mu(m_1 \ldots m_l)\, \Phi(j), \tag{7}$$

where $P_t(l)$ is the probability of length $l$, $\mu(m_1 \ldots m_l)$ is the probability of a given sequence when the length is set, and $\Phi(j)$ is the probability of distance $j$. Only the length distribution is explicitly time-dependent.

This ansatz cannot be exact, since the distribution $\mu(m_1 \ldots m_l)$ must include the memory of transient behavior in the distant past. Nevertheless, it is a useful approximation in the steady-growth regime, as long as one focuses on marginal distributions such as $\mu(m_l)$, $\mu(m_{l-1}, m_l)$ etc., describing the probability distribution of one or a few monomers near the tip of the growing strand. In the steady-growth regime one expects these distributions to be independent of time and of the precise value of $l$. Using similar physical intuition, one expects the probability $\Phi(j)$ to reach a time-independent steady state in the steady-growth regime. The equations describing these simpler marginal probabilities are derived by substituting the ansatz Eq. (7) into the master equation and summing over all unwanted variables.

Summation over the chain composition and over values of $j$ gives an equation for the length distribution

$$\frac{dP_t(l)}{dt} = \sum_{m_l} r^+_{m_l} \langle\Theta^-\rangle P_t(l-1) + \sum_{m_{l+1}} r^-_{m_{l+1}} \langle\Theta^+\rangle \mu(m_{l+1}) P_t(l+1) - \Big[ \sum_{m_{l+1}} r^+_{m_{l+1}} \langle\Theta^-\rangle + \sum_{m_l} r^-_{m_l} \langle\Theta^+\rangle \mu(m_l) \Big] P_t(l). \tag{8}$$

Here $\langle\Theta^\pm\rangle = \sum_j \Theta^\pm(j)\Phi(j)$, and $\mu(m_l)$ is the likelihood that the last monomer in the chain is $m_l$. We will see shortly that for our model there are no correlations in the composition of the chain and so $\mu(m_l)$ can have the two values $\mu(w)$ and $\mu(c) = 1 - \mu(w)$. $\mu(w)$ is also the probability of a copying error in the bulk of the copied strand.

Inspection of Eq. (8) reveals that it includes processes in which the chain grows and shrinks. The mean growth velocity is the difference between the mean rate of polymerization and depolymerization, namely

$$v = \sum_m r^+_m \langle\Theta^-\rangle - \sum_m r^-_m \mu(m) \langle\Theta^+\rangle. \tag{9}$$

Summation over the chain composition and the lengths leads to an equation for the distribution of distances between the motor and junction. A short calculation gives

$$\Big(\sum_m r^+_m + q^-\Big)\Theta^-(j+1)\Phi(j+1) + \big(\langle r^-\rangle + q^+\big)\Theta^+(j-1)\Phi(j-1) - \Big[\Big(\sum_m r^+_m + q^-\Big)\Theta^-(j) + \big(\langle r^-\rangle + q^+\big)\Theta^+(j)\Big]\Phi(j) = 0 \tag{10}$$

as the equation determining the distribution $\Phi(j)$. Here $\langle r^-\rangle = \sum_m r^-_m \mu(m)$ is the mean rate of removal of the last monomer.

To obtain equations for the composition of the monomers in the complementary strand, one sums over all values of $l$, $j$ and the possible compositions of the first $k$ monomers $m_1 \ldots m_k$. This results in a hierarchy of equations for the probability distribution of the last few monomers. The first two equations in this hierarchy are given by

$$r^+_{m_l}\langle\Theta^-\rangle + \sum_{m_{l+1}} r^-_{m_{l+1}}\langle\Theta^+\rangle \mu(m_l m_{l+1}) - r^-_{m_l}\langle\Theta^+\rangle \mu(m_l) - \sum_{m_{l+1}} r^+_{m_{l+1}}\langle\Theta^-\rangle \mu(m_l) = 0, \tag{11}$$

$$r^+_{m_l}\langle\Theta^-\rangle \mu(m_{l-1}) + \sum_{m_{l+1}} r^-_{m_{l+1}}\langle\Theta^+\rangle \mu(m_{l-1} m_l m_{l+1}) - r^-_{m_l}\langle\Theta^+\rangle \mu(m_{l-1} m_l) - \sum_{m_{l+1}} r^+_{m_{l+1}}\langle\Theta^-\rangle \mu(m_{l-1} m_l) = 0. \tag{12}$$

This hierarchy has a solution in which there are no correlations between consecutive monomers. One can show that under the assumption of correlations only between nearest neighbors, which allows closing the hierarchy using Eqs. (11) and (12), the conditional probability $\mu(m_{l-1} | m_l)$ tends to $\mu(m_l)$. More importantly, in Secs. IV and V we compare the results of the ansatz to simulations which do not assume lack of correlations. Excellent agreement is found. Both facts strongly suggest that the uncorrelated solution is stable to small perturbations.

Since there are no correlations in the steady-growth regime, $\mu(m_l)$ also describes the probability to find a monomer anywhere on the chain, and $\mu(m_1 m_2 \ldots m_l) = \mu(m_1)\mu(m_2)\cdots\mu(m_l)$. We note in passing that models with rates leading to nearest-neighbor correlations were studied by Andrieux and Gaspard [7–9]. In the absence of correlations one obtains the following set of equations for monomer probabilities:

$$r^+_{m_l}\langle\Theta^-\rangle + \sum_{m_{l+1}} r^-_{m_{l+1}}\langle\Theta^+\rangle \mu(m_l)\mu(m_{l+1}) - \Big[ r^-_{m_l}\langle\Theta^+\rangle + \sum_{m_{l+1}} r^+_{m_{l+1}}\langle\Theta^-\rangle \Big]\mu(m_l) = 0. \tag{13}$$

For the model studied here, the monomers in the complementary strand can either match the substrate, $m = c$, or be a copying error, $m = w$. By substituting $\mu(c) = 1 - \mu(w)$ one can reduce Eq. (13) to a single equation for the copying fidelity

$$\langle\Theta^+\rangle (r^-_w - r^-_c)\,\mu(w)^2 - \big[ (r^-_w - r^-_c)\langle\Theta^+\rangle + (r^+_w + r^+_c)\langle\Theta^-\rangle \big]\mu(w) + r^+_w \langle\Theta^-\rangle = 0. \tag{14}$$

Equations (9), (10) and (14) are a set of coupled equations that characterize the properties of the polymerase in the steady-growth regime. Their simple form allows one to calculate quantities such as the motor's mean velocity and its fidelity analytically. The simple form of these equations ultimately emerges from the simple dependence of the transition rates on the chain composition and the motor-junction distance. In the next two sections we will solve these equations explicitly for two cases. We will also compare their predictions to those of a stochastic simulation which does not assume a steady-growth regime.

IV. PASSIVE UNWINDING

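Before specializing to the passive case, it may help to see how the steady-growth relations of Sec. III can be evaluated numerically. The sketch below solves the quadratic Eq. (14) for $\mu(w)$ and evaluates the velocity of Eq. (9). Dividing Eq. (14) by $\langle\Theta^+\rangle$ shows that the error probability depends on the interaction only through the ratio $\langle\Theta^-\rangle/\langle\Theta^+\rangle$, which is why the solver takes that ratio as its single interaction argument. The function and argument names are illustrative assumptions, and the root selection assumes a positive ratio.

```python
import math


def error_probability(r_w_plus, r_c_plus, r_w_minus, r_c_minus, theta_ratio):
    """Root of Eq. (14) in [0, 1], after dividing the equation by <Theta^+>.

    theta_ratio is <Theta^-> / <Theta^+> and is assumed to be positive.
    """
    a = r_w_minus - r_c_minus
    b = -(a + (r_w_plus + r_c_plus) * theta_ratio)
    c = r_w_plus * theta_ratio
    if a == 0.0:  # equal removal rates: Eq. (14) becomes linear
        return -c / b
    disc = math.sqrt(b * b - 4.0 * a * c)
    # Numerically stable form of the quadratic formula (avoids cancellation).
    q = -0.5 * (b + math.copysign(disc, b))
    candidates = (q / a, c / q)
    return next(mu for mu in candidates if 0.0 <= mu <= 1.0)


def mean_velocity(r_w_plus, r_c_plus, r_w_minus, r_c_minus,
                  mu_w, theta_plus_avg, theta_minus_avg):
    """Mean growth velocity, Eq. (9), for a two-monomer alphabet."""
    mean_removal = r_w_minus * mu_w + r_c_minus * (1.0 - mu_w)
    return (r_w_plus + r_c_plus) * theta_minus_avg - mean_removal * theta_plus_avg
```

For a freely propagating motor one would call `error_probability(..., theta_ratio=1.0)`; the passive and active cases discussed next differ only in how the averages $\langle\Theta^\pm\rangle$ are obtained.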
Following Betterton and Jülicher [12, 13] we qualitatively characterize the interaction between the polymerase and the junction as either passive or active. In the passive case, the interaction between motor and junction is that of a hard wall. As a result, the motor does not enter the junction, and equivalently, a transition that closes the junction on the motor is impossible. Active unwinding, which will be studied in the next section, allows the motor to enter the junction. As will be seen later, this modifies the transition rates in a way which can result in faster unwinding.

The hard wall interaction means that $\Phi(j) = 0$ for $j \le 0$. In addition, the transition from $j = 1$ to $j = 0$ is forbidden, so $\Theta^-(1) = 0$. When the motor is away from the junction, its interaction with the wall can be neglected, and $\Theta^-(j) = 1$ for $j > 1$, while $\Theta^+(j) = 1$ for all values of $j$. This interaction is termed passive since unwinding happens when a bond in the junction opens due to a purely thermal fluctuation and the motor steps into the newly available space, thereby preventing the bond from closing. When this rectification process is more likely than its reversed process, the double-stranded DNA/RNA junction will be unwound on average. An inspection of Eq. (10) shows that $\Phi(j)$ must in fact satisfy a detailed balance condition

$$\big[ r^+_w + r^+_c + q^- \big]\Theta^-(j+1)\Phi(j+1) = \big[ q^+ + \langle r^-\rangle \big]\Theta^+(j)\Phi(j). \tag{15}$$

The underlying reason for the appearance of this detailed balance condition is the one-dimensional structure of the states and transition topology in the variable $j$, which precludes non-trivial closed cycles of transitions.

Substitution of $\Theta^\pm(j) = 1$ for $j > 1$, and of $\Theta^+(1) = 1$, $\Theta^-(1) = 0$, leads to

$$\Phi(j+1) = \rho\,\Phi(j), \tag{16}$$

with

$$\rho = \frac{\langle r^-\rangle + q^+}{r^+_w + r^+_c + q^-}.$$

We are interested in systems in which the driving force is sufficient for polymerization in the absence of a junction, while the junction tends to close in the absence of a polymerase. Under such conditions $\rho < 1$. This allows us to explicitly solve for $\Phi(j)$ as a function of the mean fidelity of copying. We find that $\Phi(j) = (1-\rho)\rho^{j-1}$ for $j \ge 1$ and $\Phi(j) = 0$ otherwise.

This local-equilibrium distribution $\Phi(j)$ allows us to calculate the averages

$$\langle\Theta^+\rangle = \sum_j \Theta^+(j)\Phi(j) = 1, \tag{17}$$

$$\langle\Theta^-\rangle = \sum_j \Theta^-(j)\Phi(j) = \sum_{j=2}^{\infty} \Phi(j) = 1 - \Phi(1) = \rho. \tag{18}$$

Substituting Eqs. (17) and (18) in Eq. (14) gives the following quadratic equation for the probability of making a copying error:

$$\mu(w)^2\, q^-(r^-_c - r^-_w) + \mu(w)\big[ q^+(r^+_w + r^+_c) - q^-(r^-_c - r^-_w) + r^-_w r^+_c + r^-_c r^+_w \big] - r^+_w (r^-_c + q^+) = 0. \tag{19}$$

Let us first examine the case of a fixed and immobile junction. In this case the polymerase will not be able to propagate at all, and kinetic discrimination is not possible. By substituting $q^\pm = 0$ into Eq. (19) we find that the copying fidelity is

$$\mu(w) = \frac{r^+_w r^-_c}{r^+_c r^-_w + r^-_c r^+_w} = \frac{1}{2}, \tag{20}$$

which is the equilibrium error rate for our model due to the equal binding free energies of $m = w, c$ (cf. Eq. (3)).

For the general case of $q^-, q^+ \neq 0$ we obtain

$$\mu(w) = \frac{-z + \sqrt{z^2 + 4 q^-(r^-_c - r^-_w)\, r^+_w (r^-_c + q^+)}}{2 q^-(r^-_c - r^-_w)}, \tag{21}$$

with $z = q^+(r^+_w + r^+_c) - q^-(r^-_c - r^-_w) + r^-_w r^+_c + r^-_c r^+_w$. The motor's mean velocity can be calculated by substituting the solutions for $\mu(m)$ and $\langle\Theta^\pm\rangle$ in Eq. (9). A short calculation gives

$$v = q^+ - q^- \rho = q^+ - q^- \frac{\langle r^-\rangle + q^+}{r^+_w + r^+_c + q^-}. \tag{22}$$

We see that the bond opening rate $q^+$ bounds the possible velocity of the polymerase. This bound is achieved in the limit of high concentrations, where $r^+_w, r^+_c \gg \langle r^-\rangle, q^\pm$. In this limit, the motor is very likely to reside near the junction and almost immediately step forward once a bond in the junction has opened, making the bond opening in the junction the rate-limiting process.

The analytical calculation leading to Eqs. (21) and (22) for the copying fidelity and velocity is based on the steady-growth assumption. To test the validity of this assumption, we compared the resulting prediction of the theory to a stochastic simulation of the system. The simulation employed the Gillespie algorithm to determine the next step [18]. The copying fidelity was calculated by counting the fraction of wrong monomers along the chain, while ignoring an initial transient of length 100 base pairs. To gather enough statistics we repeated each simulation 3000 times.

Fig. 2(a) depicts results for the mean velocity of polymerization as a function of monomer concentration. In all our simulations [cNTP] = [wNTP] = [C]. The concentration of PPi was taken to be 100 µM. The kinetic discrimination parameter was chosen to be $d = 70$. This value is much smaller than what one expects to find in biological systems. On the other hand, it results in enough copying errors to allow comparison of simulation and theory on a reasonable time scale. Lines correspond to the prediction of Eqs. (21) and (22), while symbols are the simulation results. It is clear that the agreement is excellent for all positive velocities. Different curves correspond to different values of the kinetic coefficient of the junction $q^+$, where we always keep $q^- = 7q^+$ [16, 17].

The results clearly show that the polymerase's mean velocity increases with monomer concentration. For a freely propagating motor, the velocity is asymptotically linear in the concentrations, due to the linear dependence of $r^+_{w,c}$ on them. This is no longer true when the junction is present, since the velocity approaches $q^+$ in this limit. For low concentrations of monomers, the mean velocity is negative and the complementary strand is degraded by the motor. This violates the assumptions made for the steady-growth regime, as the chain composition is then mostly determined by the process used to prepare it, rather than by the polymerase dynamics. Andrieux and Gaspard [7, 8] discussed the velocity in a depolymerization regime in detail, but such a discussion goes beyond the scope of the current work.

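The comparison between the steady-growth prediction and a stochastic simulation for the passive case can be sketched in a few lines of Python. The block below evaluates Eqs. (21) and (22) and runs a minimal Gillespie-type simulation of the hard-wall model. It is an illustration under stated assumptions, not the authors' simulation code: the function names, the single long trajectory (instead of many repeated runs), the stopping rule based on a target chain length, and any rate values passed in are choices made here for brevity.

```python
import math
import random


def passive_theory(r_w_plus, r_c_plus, r_w_minus, r_c_minus, q_plus, q_minus):
    """Error probability, Eq. (21), and mean velocity, Eq. (22).

    Assumes q_minus > 0 and r_c_minus != r_w_minus, so that Eq. (19) is quadratic.
    """
    s = r_w_plus + r_c_plus
    a = q_minus * (r_c_minus - r_w_minus)
    z = q_plus * s - a + r_w_minus * r_c_plus + r_c_minus * r_w_plus
    k = r_w_plus * (r_c_minus + q_plus)
    mu_w = (-z + math.sqrt(z * z + 4.0 * a * k)) / (2.0 * a)
    mean_removal = r_w_minus * mu_w + r_c_minus * (1.0 - mu_w)
    rho = (mean_removal + q_plus) / (s + q_minus)
    return mu_w, q_plus - q_minus * rho


def simulate_passive(r_w_plus, r_c_plus, r_w_minus, r_c_minus, q_plus, q_minus,
                     target_length=20000, transient=100, seed=0):
    """Gillespie simulation of the hard-wall model; returns (error fraction, velocity)."""
    rng = random.Random(seed)
    chain = []   # True marks a wrong monomer, False a correct one
    j = 1        # motor-junction distance; the hard wall keeps j >= 1
    t = 0.0
    while len(chain) < target_length:
        free = 1.0 if j > 1 else 0.0          # Theta^-(j) for the hard wall
        removal = 0.0
        if chain:
            removal = r_w_minus if chain[-1] else r_c_minus
        add_c = r_c_plus * free
        add_w = r_w_plus * free
        close = q_minus * free
        total = add_c + add_w + removal + close + q_plus
        t += rng.expovariate(total)           # exponential waiting time
        x = rng.random() * total              # pick the next event
        if x < add_c:
            chain.append(False)
            j -= 1
        elif x < add_c + add_w:
            chain.append(True)
            j -= 1
        elif x < add_c + add_w + removal:
            chain.pop()
            j += 1
        elif x < add_c + add_w + removal + close:
            j -= 1
        else:                                 # junction opens (always allowed)
            j += 1
    n_wrong = sum(chain[transient:])
    return n_wrong / (len(chain) - transient), len(chain) / t
```

A typical use is to call `passive_theory(...)` and `simulate_passive(...)` with the same six rates and compare the two returned pairs. The simulation only terminates when the mean velocity is positive, which matches the regime in which the steady-growth ansatz is expected to apply.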
[Figure 2. Panel (a): V (bp/s) versus [C] (µM); curves: free propagation, q⁺ = 1, q⁺ = 40, q⁺ = 100. Panel (b): µ(w) versus [C] (µM); curves: free propagation, q⁺ = 1, q⁺ = 5, q⁺ = 40.]

FIG. 2: (a) Mean polymerization velocity as a function of monomer concentration [C]. (b) The copying fidelity of the polymerase as a function of monomer concentration. In both panels, the lines represent the theoretical prediction, obtained from the steady-growth ansatz. Symbols represent the results from a Monte Carlo simulation of the model.

The copying fidelity is depicted in Fig. 2(b). It increases with the concentration, as expected in a kinetic discrimination mechanism. The presence of the obstacle reduces the copying fidelity. At slow velocities, the copying error probabilities approach $1/2$, as predicted. At large concentrations the error probability approaches the value

$$\mu(w) \simeq \frac{r^-_c + q^+}{q^+(1+d) + 2 r^-_c}. \tag{23}$$

It is easy to see that when $q^+ \gg r^-$, the obstacle opens fast enough to allow almost free propagation of the motor, and the error rate goes to $1/(1+d)$, which is the error rate of a far-from-equilibrium freely propagating polymerase [5]. When $q^+ \ll r^-$, the obstacle is essentially immobile, and the error rate approaches $1/2$, as expected.

V. ACTIVE UNWINDING

Active unwinding occurs when the polymerase can push into the junction and drive the two strands apart. The elastic interaction between motor and junction weakens the bond between the strands while also applying a force that pushes the polymerase away from the junction. The precise form of the interaction is not known. We will therefore choose an interaction which exhibits all the expected qualitative features but is easy to use in calculations.

Following Betterton and Jülicher [12, 13], we consider a step-like interaction potential $U(j)$. The potential is schematically depicted in Fig. 3. This potential vanishes when the polymerase is away from the junction ($j > 0$). When the polymerase reaches the junction ($j = 0$), the potential obtains the value $U_0 > 0$, expressing a repulsive interaction between motor and junction. The polymerase can push itself further into the junction, where in every step the potential increases by an additional $U_0$. We assume that the polymerase can at most penetrate a finite number of steps into the junction. This is expressed by placing a hard wall interaction at $j = -n$. This potential enters the kinetic equation through the factors $\Theta^\pm(j)$ given in Eq. (5). For the step potential, these factors obtain a simple form

$$\Theta^+(j) = \begin{cases} Y^{g}, & j < 1,\\ 1, & j \ge 1, \end{cases} \tag{24}$$

and

$$\Theta^-(j) = \begin{cases} 0, & j \le 1-n,\\ Y^{g-1}, & 1-n < j \le 1,\\ 1, & j > 1, \end{cases} \tag{25}$$

where $Y \equiv \exp(U_0)$.

FIG. 3: Schematic drawing of the potential of the elastic interaction between the polymerase and the junction as a function of the distance between them.

The calculation that allowed for an explicit solution of Eq. (10) for the passive case can also be applied for active interactions with this staircase potential. The solution of Eq. (10) follows from the detailed balance condition (15), but the different form of $\Theta^\pm$ results in a somewhat different recursion relation for the probability distribution $\Phi(j)$. For $j \ge 1$, one still has $\Phi(j+1) = \rho\,\Phi(j)$ as in Sec. IV, but now for $-n < j < 1$, $\Phi(j+1) = \Phi(j)\,\rho Y$, and $\Phi(j) = 0$ for $j \le -n$. A straightforward, but somewhat tedious, calculation gives

$$\Phi(j) = \begin{cases} \dfrac{(1-\rho)(1-Y\rho)}{(Y\rho)^{-n}(1-\rho) + \rho(1-Y)}, & j = 1,\\[2ex] \Phi(1)\,\rho^{\,j-1}, & j > 1,\\[1ex] \Phi(1)\,(\rho Y)^{\,j-1}, & -n < j < 1,\\[1ex] 0, & j \le -n. \end{cases} \tag{26}$$

This can be used to calculate

$$\langle\Theta^+\rangle = \sum_{j=1}^{\infty} \Phi(j) + \sum_{j=1-n}^{0} \Phi(j)\, Y^{g} = \frac{1 - Y\rho + (1-\rho)\big[(Y\rho)^{-n} - 1\big] Y^{g}}{\rho(1-Y) + (1-\rho)(Y\rho)^{-n}},$$

and

$$\langle\Theta^-\rangle = \sum_{j=2}^{\infty} \Phi(j) + \sum_{j=2-n}^{1} \Phi(j)\, Y^{g-1} = \frac{\rho(1 - Y\rho) + (1-\rho)\big[(Y\rho)^{-n+1} - Y\rho\big] Y^{g-1}}{\rho(1-Y) + (1-\rho)(Y\rho)^{-n}}.$$

The factors $\langle\Theta^+\rangle$ and $\langle\Theta^-\rangle$ are subsequently used in the calculation of the polymerase velocity and fidelity. Interestingly, we note that $\langle\Theta^-\rangle / \langle\Theta^+\rangle = \rho$, exactly as in the passive case. Since the mean error rate depends only on this ratio, we find that the motor's fidelity is independent of the elastic interaction and is given by Eq. (21). The motor's velocity does depend on the elastic interaction and, from Eq. (9), is given by

$$v = \frac{\big[ 1 - Y\rho + (1-\rho)\big( (Y\rho)^{-n} - 1 \big) Y^{g} \big]\, \big[ -\langle r^-\rangle + \rho\,(r^+_w + r^+_c) \big]}{\rho(1-Y) + (1-\rho)(Y\rho)^{-n}}. \tag{27}$$

U₀=1, n=3

U₀=4.5, n=3 (b)FIG. 4: (a) Mean polymerization velocity as a function of monomer concentration [ C ]. Diﬀerent curves represent diﬀerentmotor-junction interaction parameters, while kinetic parameters of the junction are kept constant ( q + = 1). (b) The copyingﬁdelity of the polymerase as a function of monomer concentration. Symbols correspond to results from stochastic simulations,whereas lines depict analytical results. The collapse of three lines onto a single curve shows that details of the interaction haveno eﬀect on the ﬁdelity, namely that the error rate for any value of U and n are the same as for the hard wall interaction.The solid line and the triangle symbols correspond to a polymerase without an obstacle. Such a freely propagating polymeraseexhibits considerably higher ﬁdelity. The analytical predictions of Eqs. (21) and (27) were compared to simulations in Fig. 4. Figure 4(a) showscomparison of the mean velocity of polymerization as a function of monomer concentration for various values of U .The U → ∞ results correspond to a passive motor with a hard wall interaction. The concentration of PPi andthe kinetic discrimination are as in the passive unwinding. Lines, again, correspond to the prediction of Eq. (27),while symbols are the simulation results. The kinetic coeﬃcients of the junction were set to q + = 1 , q − = 7 for all3the results depicted in the ﬁgure. Excellent agreement is found between Eq. (27) and the simulation results. Thepolymerization velocity clearly increases with increasing concentration, but its value depends on the interaction. Thetwo active motors depicted in Fig. 
4(a) are clearly faster than a passive motor with the same monomer concentration.A comparison of the two motors with n = 3 shows that the mean velocity does depend also on the step height U .The fact that v depends on U is precisely the eﬀect found for helicases by Betterton and J¨ulicher [12, 13].The dependence of the copying ﬁdelity on the monomer concentration is shown in Fig. 4(b). The steady-growthansatz predicts an interaction independent result, given by Eq. (21). We performed simulations for several modelswith diﬀerent interactions between the polymerase and the junction, including a passive motor and two active modelswith diﬀerent three-step potentials. All show the same ﬁdelity as a function of concentration. This ﬁdelity is stillaﬀected by the presence of the obstacle through the kinetic coeﬃcients q ± , and is therefore diﬀerent from that of afreely propagating polymerase, which is also depicted in the ﬁgure. VI. DISCUSSION
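The interaction independence derived in Sec. V can be checked numerically. The following Python sketch is not part of the original analysis. It builds $\Phi(j)$ from the detailed balance recursion for the staircase potential, sums $\langle\Theta^{+}\rangle$ and $\langle\Theta^{-}\rangle$ directly, and compares them with the closed-form expressions quoted above Eq. (27). The function names, the truncation `j_max`, and the parameter values ($\rho = 0.4$, $g = 0.3$) are illustrative assumptions. Following the lower limit $j = 2 - n$ of the sum in $\langle\Theta^{-}\rangle$, the sketch also assumes that no step into the hard wall is possible from $j = 1 - n$.

```python
import math


def staircase_averages(rho, U0, n, g, j_max=2000):
    """Directly sum <Theta+> and <Theta-> for the staircase potential.

    Assumes rho < 1 so that Phi(j) is normalizable. j_max truncates the
    geometric tail at large j (an illustrative choice, not from the paper).
    """
    Y = math.exp(U0)

    def theta_plus(j):
        return Y**g if j < 1 else 1.0

    def theta_minus(j):
        if j <= 1 - n:
            # Assumption: the state next to the hard wall cannot step into it.
            return 0.0
        return Y**(g - 1) if j <= 1 else 1.0

    # Unnormalized Phi(j), anchored at Phi(1) = 1.
    phi = {1: 1.0}
    for j in range(1, j_max):        # j >= 1: Phi(j+1) = rho * Phi(j)
        phi[j + 1] = rho * phi[j]
    for j in range(0, -n, -1):       # -n < j < 1: Phi(j+1) = rho * Y * Phi(j)
        phi[j] = phi[j + 1] / (rho * Y)

    norm = sum(phi.values())
    avg_plus = sum(theta_plus(j) * p for j, p in phi.items()) / norm
    avg_minus = sum(theta_minus(j) * p for j, p in phi.items()) / norm
    return avg_plus, avg_minus


def closed_forms(rho, U0, n, g):
    """Closed-form <Theta+> and <Theta-> as quoted in Sec. V (needs Y*rho != 1)."""
    Y = math.exp(U0)
    x = Y * rho
    denom = rho * (1 - Y) + (1 - rho) * x**(-n)
    avg_plus = (1 - x + (1 - rho) * (x**(-n) - 1) * Y**g) / denom
    avg_minus = (rho * (1 - x) + (1 - rho) * (x**(1 - n) - x) * Y**(g - 1)) / denom
    return avg_plus, avg_minus


if __name__ == "__main__":
    rho, g = 0.4, 0.3
    for U0, n in [(1.0, 3), (4.5, 3)]:
        tp, tm = staircase_averages(rho, U0, n, g)
        print(f"U0={U0}, n={n}: <Theta+>={tp:.6f}, <Theta->/<Theta+>={tm / tp:.12f}")
```

For each choice of $U_0$ and $n$ the printed ratio should agree with $\rho$ to numerical precision, while $\langle\Theta^{+}\rangle$ itself, and with it the velocity, changes with the interaction.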

Kinetic discrimination is one of the mechanisms employed by DNA polymerases and similar biological motors to increase the fidelity of copying of genetic information. This mechanism exhibits a trade-off between dissipation and fidelity. In the context of copolymerization this trade-off was studied, for instance, by Andrieux and Gaspard [7]. As the system is driven further away from equilibrium, its mean velocity increases, and it attains a lower rate of copying errors.

In helicases, the mean propagation velocity can be used to deduce thermodynamic features of the motor's interaction with an obstacle [12, 13]. Experimentally, it is easier to count the copying errors in the copied strand than to follow the rate of copying. The existence of polymerases which simultaneously copy and open single- to double-stranded junctions offers an interesting alternative. Is it possible to extract information about polymerase and junction dynamics from the enzyme's fidelity? More specifically, can one deduce whether the polymerase-junction interaction is active or passive?

The results presented in this paper help to clarify such questions. The simple model of a polymerase studied here slows down when it encounters a junction. This slowdown is indeed accompanied by an increased rate of copying errors, as can be seen from a comparison of the results for the freely propagating and non-freely propagating systems depicted in Figs. 2(b) and 4(b). This is exactly what one would anticipate in a model employing kinetic discrimination. However, our results show that the suggested correlation between the mean copying velocity and fidelity, where faster copying means better fidelity, is not always present. Upon encountering an obstacle, the model predicts an unexpected partial decoupling between mean velocity and error rate. The copying fidelity, as expressed by Eq. (21), is independent of details of the polymerase-junction interaction.
The presence of a junction still affects the probability of copying errors, but only through the kinetic part of the transition rates (given by $q_\pm$). In contrast, the mean velocity clearly depends on all model parameters, including the interaction. In fact, we find that active unwinding can result in a higher rate of copying than that of a passive polymerase, in agreement with the results of Betterton and Jülicher [12, 13]. The three non-freely propagating models in Fig. 4 have different velocities, due to the difference between active and passive interactions. At the same time they all exhibit the same fidelity. One still expects that when more parameters are varied, such as the monomer concentration or $q_\pm$, larger velocity will typically be accompanied by better fidelity, but it is important to point out that this is not always the case.

One may wonder whether the independence of the fidelity from the interaction is a particular property of the model studied here. Maybe the result will break down for an interaction that is not described by a staircase potential? As explained below, the interaction independence of the fidelity found for the model is a result of the topology of the internal state space of the system. Specifically, it emerges from the fact that this state space is one-dimensional, and therefore does not include non-trivial closed loops of transitions. Models with different interactions would exhibit the same fidelity as long as they have a one-dimensional internal state space.

In the steady-growth ansatz, the probability distribution of the polymerase over its internal states, $\Phi(j)$, becomes time independent and furthermore satisfies Eq. (10). This equation can be recast as

$$I_{j,j-1} - I_{j+1,j} = 0, \tag{28}$$

where

$$I_{j+1,j} \equiv \left(\langle r_-\rangle + q_+\right)\Theta^{+}(j)\,\Phi(j) - \left(\sum_m r_+^m + q_-\right)\Theta^{-}(j+1)\,\Phi(j+1) \tag{29}$$

is the net flux of transitions from $j$ to $j + 1$.
The steady state solution for $\Phi$ must therefore satisfy $I_{j+1,j} = C$, where $C$ is a $j$-independent constant. For any reasonable model of a polymerase that unwinds a junction one expects that $\Phi(j) \to 0$ as $j \to \infty$, expressing the fact that the polymerase and junction tend to move closer to each other. One also expects that $\Phi(j) \to 0$ as $j \to -\infty$, since otherwise the model would allow the polymerase to simply pass through the junction without unwinding it. These considerations mean that the constant $C$ must vanish, demonstrating that the detailed balance condition (15) holds for quite general $U(j)$. One should not take the appearance of the detailed balance condition as evidence that the system is in thermal equilibrium. The model exhibits steady growth with a nonvanishing rate of copying, and is therefore clearly out of equilibrium. Nevertheless, the internal state space does relax to some kind of local equilibrium.

The detailed balance condition, Eq. (15), is the underlying reason for the interaction independent fidelity found in Sec. V. Indeed, summation of Eq. (15) over $j$ results in

$$\frac{\langle\Theta^{-}\rangle}{\langle\Theta^{+}\rangle} = \rho,$$

for any $U(j)$ that is consistent with distributions $\Phi(j)$ that vanish at infinity. Examination of Eq. (14) shows that the mean error rate depends on $\langle\Theta^{-}\rangle$ and $\langle\Theta^{+}\rangle$ only through their ratio, $\rho$. The fact that this ratio is independent of the elastic interaction $U$ means that the mean error rate is also independent of the form of $U(j)$.

The generation of this internal detailed balance, and the resulting independence of fidelity from the elastic interaction, are an interesting manifestation of the dynamics of copying machines. But how relevant is this phenomenon for the few biological polymerases that remove obstacles on their own, such as reverse transcriptase? Can one deduce that their fidelity does not depend on the elastic interaction with a junction?
Such a conclusion would be too hasty. One should be aware that the model we constructed oversimplifies several important aspects of the dynamics. One drastic assumption we made was to view the polymerase as a point particle. The polymerases in our cells are proteins of a finite size. They can be squeezed by the application of an external force.

A flexible finite-sized polymerase can be modeled as an elastic spring. One can generalize the model studied here to include this aspect by considering a system with two internal degrees of freedom, namely the size of the polymerase and the distance from the edge of the polymerase to the junction. One should also include two types of elastic interactions, a quadratic potential for the size of the polymerase, and a more general interaction between the polymerase and junction. The crucial point is that this expanded model has an internal state space which is no longer one-dimensional. As a result the probability distribution in this space decays to a nonequilibrium steady state in the steady growth regime. This is expected to lead to some degree of dependence of the fidelity on the elastic potential $U$. Further research is required to find out whether this effect can be large.

Comparison between the model studied here and biological polymerases is further complicated by several additional factors. We assumed a model with purely kinetic discrimination, but for instance the fidelity of reverse transcriptase is a result of a mixture of kinetic and energetic discrimination. We have assumed equal, constant and uniform concentrations, a condition that is unlikely to hold in vivo. In addition, one expects the incorporation of monomers to depend on the identity of their neighbors, leading to correlations in the copied strand composition.
All these elements must be included in the model before any biologically relevant predictions can be made.

The approach taken here, namely studying a simplified version of the dynamics, should be viewed as a way of obtaining qualitative understanding of copying machines, focusing on the interplay between the velocity, fidelity, and interactions with an obstacle. It has the advantage of resulting in simple analytical expressions for observables, whose study can lead to qualitative insights. It is certainly worthwhile to include the additional aspects needed for a quantitative comparison with biological copying machines. But in our view there is much to gain by first studying models in which the roles of different mechanisms can be investigated separately.

Before concluding, we would like to point out an interesting qualitative property of the dynamics. Our results show that the mean velocity and the fidelity contain non-overlapping information regarding the polymerase-junction interaction and kinetics. This is clearly indicated by the results depicted in Fig. 4. One should therefore strive to obtain data on both observables, and not be satisfied with measurements of only one of them. We expect this conclusion to be rather robust, and therefore to hold also for biological polymerases.

Acknowledgements

We thank Omri Malik and Ariel Kaplan for illuminating discussions that have initiated our interest in this topic. This work was supported by the U.S.-Israel Binational Science Foundation (Grant No. 2014405), by the Israel Science Foundation (Grant No. 1526/15), and by the Henri Gutwirth Fund for the Promotion of Research at the Technion.

[1] R. Landauer, Physics letters A, 188 (1996).
[2] J. J. Hopfield, Proceedings of the National Academy of Sciences, 4135 (1974).
[3] J. Ninio, Biochimie, 587 (1975).
[4] P. Sartori and S. Pigolotti, Physical review letters, 188101 (2013).
[5] D. Andrieux and P. Gaspard, Proceedings of the National Academy of Sciences, 9516 (2008).
[6] D. Andrieux and P. Gaspard, The Journal of chemical physics, 014901 (2009).
[7] P. Gaspard and D. Andrieux, The Journal of chemical physics, 044908 (2014).
[8] P. Gaspard, Journal of Statistical Physics, 17 (2016).
[9] P. Gaspard, Physical Review E, 042419 (2016).
[10] C. C. Richardson, Cell, 315 (1983).
[11] M. Hottiger, V. N. Podust, R. L. Thimmig, C. McHenry, and U. Hübscher, Journal of Biological Chemistry, 986 (1994).
[12] M. Betterton and F. Jülicher, Physical review letters, 258103 (2003).
[13] M. Betterton and F. Jülicher, Physical review E, 011904 (2005).
[14] C. H. Bennett, International Journal of Theoretical Physics, 905 (1982).
[15] H. R. Lee and K. A. Johnson, Journal of Biological Chemistry, 36236 (2006).
[16] T. M. Lohman and K. P. Bjornson, Annual review of biochemistry, 169 (1996).
[17] F. Jülicher and R. Bruinsma, Biophysical journal, 1169 (1998).
[18] D. T. Gillespie, Journal of computational physics 22