Coarse graining of biochemical systems described by discrete stochastic dynamics
David Seiferth (1), Peter Sollich (2), and Stefan Klumpp (1)

(1) Institute for the Dynamics of Complex Systems, University of Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany
(2) Institute for Theoretical Physics, University of Göttingen, Friedrich-Hund-Platz 1, 37077 Göttingen, Germany

(Dated: March 1, 2021)

Abstract

Many biological systems can be described by finite Markov models. A general method for simplifying master equations is presented that is based on merging adjacent states. The approach preserves the steady-state probability distribution and all steady-state fluxes except the one between the merged states. Different levels of coarse graining of the underlying microscopic dynamics can be obtained by iteration, with the result being independent of the order in which states are merged. A criterion for the optimal level of coarse graining or resolution of the process is proposed, via a trade-off between the simplicity of the coarse-grained model and the information loss relative to the original model. As a case study, the method is applied to the cycle kinetics of the molecular motor kinesin.
I. INTRODUCTION
Many stochastic systems in physics, chemistry, biology as well as other areas can be described by discrete-state Markov models, such that their dynamics is given by a master equation [1, 2]. Examples include the dynamics of (bio-)molecules, where states describe discrete configurations or functional states; the dynamics of chemical reactions and of populations, where the states are given by the numbers of individuals of each species; discrete stepping of molecular machines; and many more [3, 4]. In all these cases the dynamics can be interpreted and visualized as a hopping process on a network where nodes represent the states of the system and links between them transitions from one state to another [1, 5].

Often, these discrete models can be understood as arising from more complex underlying dynamics by coarse graining. For example, in the case of macromolecules, the dynamics of the discrete configurations of the molecules represent, on the microscopic level, the motion of the atoms in that molecule [6]. Coarse graining, the operation that transforms the microscopic description into the simpler macroscopic one, thus involves the elimination of degrees of freedom and a reduction of the dimension of the state space. Coarse graining can make large systems computationally tractable, but can also have conceptual advantages over the more detailed models by focusing on the essential parts of the dynamics.

In general, coarse graining can be applied to systems with discrete or continuous degrees of freedom, and coarse-grained models may also be of either type. For example, in molecular dynamics, coarse-grained models of macromolecules are often obtained by a united-atom approach [7, 8], i.e. by merging groups of atoms into a single larger unit, which may be spherical [9] or anisotropic [10]. The larger unit has fewer (but still continuous) degrees of freedom than the original atoms.
As already mentioned, the case of coarse graining a continuous description into a simpler discrete one is also important for describing the molecular dynamics of macromolecules [6, 11, 12]. Another case of interest is the simplification of a given discrete-state system, as very often descriptions of the same system at different levels of detail are possible. An example is given by the chemomechanical dynamics of a molecular machine, which moves through a working cycle in multiple steps. Depending on what a theoretical description of that system is used for, fewer or more of these steps need to be explicitly included in the model [13]. Likewise, alternative cycles and events branching off from the main cycle may or may not be part of such a description. This can be done ad hoc, but to simplify descriptions in a controlled manner, systematic protocols for coarse graining are needed.

Several coarse-graining approaches for discrete stochastic systems have recently been proposed and studied [14-16]. They all have advantages and disadvantages, which we will discuss in detail below. In any case, coarse graining is accompanied by a loss of information due to the simplification of the original system. Therefore most coarse-graining approaches attempt to simplify a system while retaining as much as possible of its key characteristics. Important challenges in this process are to identify microstates that can be faithfully coarse grained and to determine which degree of coarse graining is optimal in some sense. Here, we present an iterative approach that in each step merges two adjacent states in a finite discrete-state Markov model and results in a hierarchy of models with different levels of coarse graining. Our approach does not impose any constraints on the underlying network topology and hence can be used to merge states in any system governed by a master equation. By merging states, the dimensionality of the state space decreases, the system becomes simpler and some information is lost.
In our approach specifically, the transition flux between the merged states vanishes and the dynamics of the system can be modified. However, transition rates and steady-state probabilities are only changed locally and transition fluxes are retained. The steady-state distribution and the associated probability fluxes were proposed to provide a complete and unique description for any steady state [17]. Therefore, we impose only local changes in these quantities to approximate the original model well. Since our coarse-graining procedure generates a hierarchy of models, we discuss a cost function that relates the information loss and the gain in simplicity. This cost function can be used to find an optimal balance between information loss and simplicity and, thus, an optimised degree of coarse graining.

The paper is organised as follows: After introducing the master equation and related quantities in section II, we present our coarse-graining procedure and show how transition rates in the coarse-grained system can be defined without changing the flux between any pair of states. An alternative derivation by minimizing the Kullback-Leibler divergence for trajectories provides further motivation for the choice of transition rates in the coarse-grained system. A detailed comparison of different coarse-graining approaches is also included in this section. As a case study of a prototypical system in a non-equilibrium steady state, we apply our approach to the molecular motor kinesin and its multi-cyclic kinetic diagram as proposed in ref. [18]. First, we study coarse graining of that kinetic diagram without changing its cycle topology in section III. Then, in section IV, we generalize the approach and also allow changes in the number of cycles. This allows us to derive uni-cyclic models, which have originally been used to describe the kinesin motor [19], from the multi-cyclic model for a systematic comparison.
Furthermore, we present a criterion that balances simplicity and information loss compared to the original model to find the optimal level of coarse graining from the hierarchy of models that are coarse-grained to different degrees.
II. COARSE-GRAINING FRAMEWORK

A. Master equation and cycle decomposition
We consider discrete stochastic systems described by a master equation. The system has N discrete states and transitions between states i and j occur with time-independent transition rates \alpha_{ij}. At a certain time t, a state i is occupied with probability p_i(t), which evolves according to the master equation

  \frac{dp_i}{dt} = \sum_{j \neq i} \left( \alpha_{ji} p_j - \alpha_{ij} p_i \right) = \sum_{j \neq i} J_{ji} .  (1)

The master equation can be interpreted as a balance of incoming (\alpha_{ji} p_j) and outgoing (\alpha_{ij} p_i) fluxes. In a steady state, the occupation probabilities are time-independent and hence the left-hand side of the equation is zero. In the following, we will focus on steady-state quantities unless stated otherwise. Steady-state probabilities can be calculated by matrix inversion of the master equation or in terms of a sum over spanning trees of a graph representing the states and transitions [5]. In equilibrium, there is no probability flux between any pair of states, a condition called detailed balance,

  \alpha_{ij} p_i = \alpha_{ji} p_j  (2)

for all pairs of states i and j. Here, we will consider systems in non-equilibrium steady states. Hence, while the occupation probabilities are time-independent, \partial_t p_i = 0, the steady-state flux between a pair of states is in general not zero. The net flux from state i to j measures the net number of transitions per time and is given by

  J_{ij} = \alpha_{ij} p_i - \alpha_{ji} p_j .  (3)

All nonzero fluxes are associated with the production of entropy, a feature that we will use below to quantify the information loss between a fine-grained and a coarse-grained description of the same system. The total entropy production P can be expressed in terms of the net transition fluxes J_{ij} and the corresponding transition affinities \Delta S_{ij} [1],

  P = \frac{1}{2} \sum_{i,j} J_{ij} \Delta S_{ij} .  (4)

The transition affinity

  \Delta S_{ij} = \ln \frac{\alpha_{ij} p_i}{\alpha_{ji} p_j}  (5)

quantifies the change of entropy associated with a transition.
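As a minimal numerical sketch of equations (1)-(5), the following Python/NumPy snippet computes the steady state, the net fluxes of equation (3), and the entropy production of equation (4) for a hypothetical 3-state cycle; the rate values are illustrative only and are not taken from the paper.

```python
import numpy as np

# Hypothetical 3-state cycle; alpha[i, j] is the rate for the transition i -> j.
alpha = np.array([[0.0, 2.0, 0.5],
                  [0.3, 0.0, 1.5],
                  [1.0, 0.4, 0.0]])

# Generator matrix with columns summing to zero: W[i, j] = alpha[j, i] off-diagonal,
# minus the exit rate on the diagonal.
W = alpha.T - np.diag(alpha.sum(axis=1))

# Steady state = null vector of W, normalized to a probability distribution.
eigvals, eigvecs = np.linalg.eig(W)
p = np.real(eigvecs[:, np.argmin(np.abs(eigvals))])
p /= p.sum()

# Net fluxes J_ij = alpha_ij p_i - alpha_ji p_j, eq. (3).
J = alpha * p[:, None] - alpha.T * p[None, :]

# Transition affinities, eq. (5), and entropy production P, eq. (4).
dS = np.log((alpha * p[:, None]) / (alpha.T * p[None, :]),
            out=np.zeros_like(alpha), where=(alpha * alpha.T) > 0)
P = 0.5 * np.sum(J * dS)
print(P > 0)  # the cycle is driven (Kolmogorov criterion violated), so P > 0
```

In the steady state of this single cycle, conservation of probability at each node forces the same net flux through every edge, which the computed matrix J reproduces.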
The entropy production can be used to quantify the deviation from detailed balance for non-equilibrium systems [20].

For many purposes, it is useful to represent states and transitions as the nodes and edges of a graph. Doing so provides a constructive method for obtaining the steady-state probabilities [1, 5]. Moreover, it allows us to decompose the dynamics on the network into a set of fundamental cycles and to express fluxes as well as the entropy production as sums over these fundamental cycles [1].

All nonzero fluxes in the steady state are associated with cycles, closed paths of nodes and edges in the graph. A large graph will typically contain multiple cycles, of which not all are independent of each other. In fact, all cycles can be constructed from a set of fundamental cycles, which in turn can be constructed from spanning trees.

[FIG. 1. A graph with one arbitrarily chosen spanning tree. The edges missing in the spanning tree are called chords and define the fundamental cycles.]

A spanning tree is a connected subgraph that contains all nodes, i.e. all states of the system, but no cycles. It is obtained from the full graph by removing L edges. Adding back one edge to such a tree yields a single cycle. The added edge is called chord l and defines a corresponding fundamental cycle C_l. Figure 1 shows a 6-state network with three cycles, of which two are fundamental. The choice of a spanning tree is arbitrary. Different spanning trees can result in different sets of fundamental cycles. The number of fundamental cycles is defined by the number L of chords of a network. In a network with E edges and N nodes, there are

  L = E - N + 1  (6)

chords [1]. Moreover, all transition fluxes can be expressed as a linear combination of chord fluxes [1]. Hence, a network with L chords has L independent transition fluxes and L fundamental cycles. There are two chords in figure 1 and hence two fundamental cycles.

The entropy production can then be expressed in terms of quantities related to fundamental cycles or to the corresponding chords l as

  P = \sum_l J_l \Delta S_{C_l} ,  (7)

where the sum runs over a set of fundamental cycles. In this expression, J_l is the chord flux, the net number of transitions between states connected by the chord l per unit time.

[FIG. 2. State m and state n will be merged such that the fluxes between the merged state and the rest of the system are retained, \tilde{J}_{m' \to i'} = J_{m \to i} + J_{n \to i}, and the probabilities add up, \tilde{p}_{m'} = p_m + p_n.]

\Delta S_{C_l}
is the affinity of the fundamental cycle C_l defined by chord l, that is, the entropy change after one completion of the cycle. The cycle affinity can be written as

  \Delta S_{C_l} = \sum_{(ij) \in C_l} \Delta S_{ij} = \ln \prod_{(ij) \in C_l} \frac{\alpha_{ij}}{\alpha_{ji}} ,  (8)

where both sum and product are evaluated over all edges (ij) within cycle C_l. We choose an arbitrary reference orientation for each fundamental cycle. In figure 1, the reference orientation is chosen to be counterclockwise.

B. Coarse-graining procedure
In this section, we present a procedure for coarse graining discrete stochastic systems. To simplify large networks, we merge two connected nodes m and n. Further reduction can be achieved by iterating this procedure, as we will discuss below. In general, the dynamics of the reduced network with two nodes merged is non-Markovian. However, we approximate the dynamics with a modified master equation. The error of this approximation will be smallest if there is a time scale separation (discussed in section II D). Thus, we require that (approximately) the coarse-grained system shall still obey a master equation

  \frac{d\tilde{p}_i}{dt} = \sum_{j \neq i} \left( \tilde{\alpha}_{ji} \tilde{p}_j - \tilde{\alpha}_{ij} \tilde{p}_i \right)  (9)

but with modified probabilities and transition rates. We adopt the following notation: As before, we denote the original probabilities and rates by p_i and \alpha_{ij}. The probabilities of states in the coarse-grained system are denoted by \tilde{p}_i and the transition rates by \tilde{\alpha}_{ij}. Note that in the following we derive constraints that refer only to steady-state probabilities. To conserve the probabilities, we request that the steady-state probabilities of the two merged states add up, while all others are unaffected,

  \tilde{p}_i = \begin{cases} p_i & \text{for } i \neq m, n \\ p_m + p_n & \text{for } i = m \end{cases}  (10)

To determine the rates of the coarse-grained system, we impose the following additional requirements.

1. All transition rates that do not point to or from one of the two merged nodes remain unchanged, \tilde{\alpha}_{ij} = \alpha_{ij} for i, j \neq m, n.

2. All one-way fluxes flowing from or to a merged node stay the same: outgoing (one-way) fluxes from the merged nodes n and m to the same node i are added, \tilde{J}_{m \to i} = J_{m \to i} + J_{n \to i}.
Likewise, incoming (one-way) fluxes to the merged nodes are added as well, \tilde{J}_{i \to m} = J_{i \to m} + J_{i \to n}.

Since the probabilities of nodes other than n and m stay the same, the latter condition directly implies that the incoming rate in the coarse-grained network is \tilde{\alpha}_{im} = \alpha_{im} + \alpha_{in}. The condition on outgoing fluxes determines the rates of transitions from the merged node to any other node. Thus, all requirements stated above are fulfilled if the transition rates \tilde{\alpha}_{ij} of the coarse-grained system are defined by

  \tilde{\alpha}_{ij} = \alpha_{ij} \quad \text{for } i, j \neq m, n ,
  \tilde{\alpha}_{mj} = \frac{\alpha_{mj} p_m + \alpha_{nj} p_n}{p_m + p_n} ,
  \tilde{\alpha}_{im} = \alpha_{im} + \alpha_{in} ,  (11)

where states m and n have been merged and states i \neq m, n and j \neq m, n are arbitrary states.

In this way, only transition rates pointing to or from the merged nodes are changed. The steady-state probabilities of the states that have not been merged stay the same and add up for the merged state. The transition rates going into a merged state are a sum; the transition rates coming out of a merged state are a weighted sum of the rates in the original system. All fluxes are retained except the one between the merged states. Other quantities may be different in the coarse-grained system. These include the net cycle fluxes, i.e. the number of cycle completions per time, the corresponding thermodynamic forces, and the energy dissipation.

The procedure of merging two adjacent states as described above can be iterated to further reduce the size of the system. Importantly, the result is independent of the order in which states are merged. We can differentiate two cases: On the one hand, we can merge first one pair of states and then another one that has no direct transition to the first pair. Our approach incorporates only local changes and hence both coarse-graining iterations are independent. On the other hand, we can merge first a pair of states and then another one that is directly linked to the first pair.
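The merging rule of equation (11) can be sketched numerically (Python/NumPy; the 4-state network and its random rates are hypothetical, chosen only for illustration). The sketch verifies that the prescribed probabilities of equation (10) are stationary under the coarse-grained rates and that the one-way fluxes to and from the merged state are preserved:

```python
import numpy as np

def steady_state(alpha):
    """Stationary distribution of the rate matrix alpha[i, j] = rate i -> j."""
    W = alpha.T - np.diag(alpha.sum(axis=1))
    eigvals, eigvecs = np.linalg.eig(W)
    p = np.real(eigvecs[:, np.argmin(np.abs(eigvals))])
    return p / p.sum()

def merge(alpha, p, m, n):
    """Merge states m and n following eq. (11); the merged state keeps index m."""
    keep = [k for k in range(len(p)) if k != n]
    a = alpha.copy()
    # Outgoing rates from the merged state: probability-weighted average.
    a[m, :] = (alpha[m, :] * p[m] + alpha[n, :] * p[n]) / (p[m] + p[n])
    # Incoming rates to the merged state: plain sum.
    a[:, m] = alpha[:, m] + alpha[:, n]
    a = a[np.ix_(keep, keep)]
    np.fill_diagonal(a, 0.0)
    # Merged probabilities, eq. (10).
    q = p[keep].copy()
    q[keep.index(m)] = p[m] + p[n]
    return a, q

# Hypothetical 4-state network with random rates (illustrative, not from the paper).
rng = np.random.default_rng(1)
alpha = rng.uniform(0.1, 2.0, size=(4, 4))
np.fill_diagonal(alpha, 0.0)

p = steady_state(alpha)
a_cg, p_cg = merge(alpha, p, m=1, n=2)

# The prescribed probabilities are the steady state of the coarse-grained dynamics ...
print(np.allclose(steady_state(a_cg), p_cg))
# ... and the one-way flux from the merged state into state 0 is preserved:
print(np.isclose(a_cg[1, 0] * p_cg[1], alpha[1, 0] * p[1] + alpha[2, 0] * p[2]))
```

The check succeeds because flux balance at each unmerged node is unchanged by construction, and the balance at the merged node is the sum of the balances at m and n, with the internal flux between them cancelling.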
In that case, we can check by iterating equation (11) that merging states m and n with a third state k is associative.

As the described approach can be applied iteratively and the order of the steps does not affect the results, we can use it to obtain a hierarchy of different coarse-grained models of the same underlying microscopic dynamics. In general, we can define a mapping such that several states i in the original system are lumped together in one state I. States in the original system are denoted by small letters and states in the coarse-grained system by capital letters. For a given mapping, the rates in the coarse-grained system are then given by

  \tilde{\alpha}_{IJ} = \frac{\sum_{i \in I, j \in J} \alpha_{ij} p_i}{\sum_{i \in I} p_i} .  (12)

Here we have derived the probabilities and rates of the coarse-grained system from a set of plausible requirements imposed to make the steady-state properties of the simplified system similar to the ones of the original system. In section II E, we will give an alternative derivation and motivate the choice of transition rates in the coarse-grained system by minimizing the Kullback-Leibler divergence for trajectories.

Our coarse-graining approach requires the calculation of the steady-state distribution of the original system, which can be computationally costly for large systems. This requirement obviously limits the usefulness of our method in practice, in particular when it is applied to large-scale systems. However, the approach allows us to study the dynamics of the system exclusively at the coarse-grained level (without need to solve the microscopic dynamics) in an approximate fashion. In addition, it provides a systematic way to compare different degrees of coarse graining and to find the optimal coarse-graining level, as we will discuss below.

C. Coarse-grained energy landscape in equilibrium systems
We briefly consider the special case of canonical equilibrium systems. The probability distribution for systems in equilibrium is given by the Boltzmann distribution

  p_i = \frac{1}{Z} e^{-\beta E_i} ,  (13)

where Z is the partition sum, \beta = 1/k_B T is the inverse of the product of Boltzmann's constant and the thermodynamic temperature, and E_i is the energy of state i. We compare the energy landscapes of the original and the coarse-grained system as a consistency check of our approach. If we merge state m and state n in an equilibrium system, the probabilities add up:

  \tilde{p}_{m'} = p_m + p_n = \frac{1}{Z} \exp\left( \ln\left( e^{-\beta E_m} + e^{-\beta E_n} \right) \right) = \frac{1}{Z} e^{-\beta F_{m'}} .  (14)

In the last expression, we have written the Boltzmann distribution of the coarse-grained state with a free energy instead of an energy. That this term indeed corresponds to a free energy F_{m'} = \langle E_{mn} \rangle - TS, given by the average energy \langle E_{mn} \rangle of the merged states and an internal entropy of the coarse-grained state, can be seen as follows: Pulling the energy E_m out of the logarithm, we rewrite the free energy of the coarse-grained state as

  F_{m'} = E_m + k_B T \ln \frac{e^{-\beta E_m}}{e^{-\beta E_m} + e^{-\beta E_n}} = E_m + k_B T \ln \frac{p_m}{p_m + p_n} .  (15)

Likewise, pulling out E_n, we get

  F_{m'} = E_n + k_B T \ln\left( \frac{p_n}{p_m + p_n} \right) .  (16)

We can define \bar{p}_m = p_m/(p_m + p_n) and \bar{p}_n = p_n/(p_m + p_n) as re-scaled probabilities of the microscopic states within the coarse-grained state. Adding both equations with weights \bar{p}_m and \bar{p}_n, respectively, yields

  F_{m'} = E_m \bar{p}_m + E_n \bar{p}_n + k_B T \left( \bar{p}_m \ln \bar{p}_m + \bar{p}_n \ln \bar{p}_n \right) = \langle E_{mn} \rangle - TS ,  (17)

where we expressed the free energy in terms of the average energy \langle E_{mn} \rangle = E_m \bar{p}_m + E_n \bar{p}_n of the coarse-grained states and the internal entropy due to lumping two states together, which is given by

  S = -k_B \left[ \bar{p}_m \ln \bar{p}_m + \bar{p}_n \ln \bar{p}_n \right] .  (18)

The latter arises due to the uncertainty as to which micro-state within the merged state is occupied, given that the system is in the coarse-grained state. The identification of the free energy of the merged states is consistent with the redefinition of the transition rates proposed in equation (11): the outgoing transition rates \tilde{\alpha}_{mj} of a merged state m are weighted with \bar{p}_m and \bar{p}_n.

D. Comparison with other coarse-graining approaches
The coarse-graining approach we describe here is closely related to the so-called local equilibrium approximation [14]. Specifically, the expressions we obtain for the rates in the coarse-grained system are equivalent to those obtained based on local equilibrium. However, there are differences in the justification and, as a consequence, in the applicability of the approach: The local equilibrium approximation makes use of time scale separation between degrees of freedom, see e.g. [14, 21]. A system in a non-equilibrium state can often be described as consisting of subsystems that are internally in equilibrium. The time evolution of all microstates within a macrostate is assumed to be rapid [14]. Correspondingly, the local equilibrium approximation requires a time scale separation between rapid transitions between states that will be lumped together and slower transitions to the rest of the system. Typically, the states that are merged will be characterized by approximately equal energy levels.

By contrast, our approach is not based on a time scale separation like the local equilibrium assumption. Instead, we imposed the requirements stated in equations (10) and (11) to preserve the net numbers of transitions between any pair of coarse-grained states. Our method thus retains the steady-state probability distribution during coarse graining. It does not refer to equilibration or to the time scales associated with the dynamics. It can be applied generally to merge arbitrary neighboring states (including those that violate the requirements of the local equilibrium approximation) and will lead to a coarse-grained model that preserves the steady-state distribution and fluxes. In principle, states with vastly different energy levels and states with a large flux between them can be merged as well.
Likewise, transitions between the merged states neither need to equilibrate rapidly to an equilibrium approximated by a Boltzmann distribution, nor do they need to relax rapidly into a non-equilibrium steady state. However, while our coarse-graining approach preserves the steady state of the system, it necessarily gives an approximation of the dynamics. The error of this approximation will be smallest if there is a time scale separation, i.e. if the requirements of the local equilibrium approximation are fulfilled. In that sense the requirements to preserve steady-state quantities are similar to a local equilibrium approximation. However, they still allow the application of the approach in more general cases.

Several other coarse-graining approaches have been proposed recently [14-16, 22] that differ in various aspects from each other and from the approach used here. In the following, we will briefly compare these approaches with our coarse-graining procedure. Table I summarizes this comparison.

In [15], Pigolotti and Vulpiani presented an adiabatic approximation which eliminates rapidly evolving states from the system. All states with a mean dwell time [the inverse of the exit rate as given by equation (21) below] below a certain threshold are removed and the rates of the remaining states are renormalized. The approximation is well suited for systems with a time scale separation. In contrast to our approach, the calculation of a steady-state distribution is not needed. Hence, this approach is suitable e.g. for chemical networks with infinite state space, where calculating the steady-state distribution is computationally challenging.
However, the steady-state distribution is also not preserved in the coarse graining, as the probabilities of the states eliminated by coarse graining are redistributed globally and not only to the neighbouring states.

The approach by Altaner and Vollmer [16] is based on eliminating 'bridge states', states that can be eliminated in such a way that the cycle topology and the cycle affinities are preserved. In the coarse-graining step, the steady-state probability of the bridge state is distributed (not necessarily equally) to its two neighbouring states, such that the probabilities of non-neighbour states are unaffected (locality) and the change in system entropy is preserved. This method is only applicable if the net flux along the bridge is nonzero. Hence, it is limited to non-equilibrium systems with nonzero currents.

In yet another approach [14], Hummer and Szabo proposed a coarse-graining method that leaves the occupancy-number correlation function C_{ij}(t) = \langle \theta_i(t) \theta_j(0) \rangle unchanged. Here \theta_i(t) is an indicator function that is one if state i is occupied at time t and zero otherwise. The calculation of the reduced matrix with the transition rates of the coarse-grained system requires matrix inversion and knowledge of the steady-state distribution, as in our approach. However, in general, all transition rates (not only the outgoing rates from the merged states) are changed. Hence, the method of Hummer and Szabo incorporates non-local changes of the transition matrix. The reduced matrix is computed by Hummer and Szabo in the Markovian limit (s -> 0 rather than s -> \infty) in Laplace space. If only two states are merged, and hence all other macrostates contain only one microstate, the local equilibrium approximation is equivalent to the definition of transition rates in equation (11).

All coarse-graining procedures discussed so far are based on merging states. A different approach was proposed by Altaner et al. in [22].
Starting from the fact that edge fluxes can be represented as superpositions of cycle fluxes (see equation (36) for an example), a mapping is defined that transforms the original graph into a new one. The states in the new graph represent cycles of the original one. The transitions between the states representing cycles are defined in such a way that the coarse-grained system fulfills detailed balance.

Our coarse-graining approach shares some features with these methods, but differs from them in others. For example, our approach requires the calculation of the steady-state distribution of the system, which can be computationally costly for large systems and imposes a limitation on the use of the method. The approach described by Pigolotti and Vulpiani [15] is the only one of the methods presented here that does not require the steady-state distribution. In our approach, the state space is reduced via the merging of two arbitrary adjacent nodes. Adjacency is the only prerequisite, whereas the method of Altaner and Vollmer [16] is only applicable to bridge states such that the cycle topology is preserved. The probability of the coarse-grained states is redistributed locally in our approach as well as in [16], whereas the methods of [14, 15] redistribute the probabilities non-locally. Likewise, the steady-state fluxes between states are not preserved in the adiabatic approximation with slow and fast states [15] but are retained in Altaner and Vollmer's [16] and in our approach. The cycle affinities (under the condition of preserved network topology) are preserved by the methods in [15, 16], but not by our approach. Due to the direct mapping in our coarse-graining procedure, there is a direct correspondence between the states of the coarse-grained model and sets of states of the original model.

TABLE I. Comparison of coarse-graining approaches.
Methods compared: (a) adiabatic approximation with fast and slow states [15]; (b) fluctuation-preserving coarse-graining method by Altaner and Vollmer [16]; (c) approach by Hummer and Szabo [14]; (d) our transition-flux-preserving coarse-graining approach.

Assumptions / required quantities: (a) also valid outside the steady-state regime; calculation of the steady state not needed. (b) Steady state; requires non-zero flux along the bridge that will be coarse-grained. (c) Steady state. (d) Steady state.

Reduction of state space via: (a) bridge. (b) Bridge. (c) Lumping together N states into M lumped states. (d) Merging two nodes.

Redefinition of rates: (a) non-local changes of transition rates; affected by the dwell time in the bridge state. (b) Proportions of one-way fluxes and of probabilities define the new rates from and to the nodes left and right of the bridge state. (c) Non-local changes of transition rates; matrix inversion of a matrix product such that the number correlation function is retained. (d) Local changes: outgoing flux from merged nodes preserved; all other rates retained.

Cycle topology: (a) coarse graining of fundamental cycles possible, no restrictions. (b) Preserved (requirement). (c) No restrictions. (d) Coarse graining of branches and fundamental cycles possible.

Local redistribution of probability: (a) non-local redistribution. (b) Only the steady-state probabilities of the neighbouring nodes changed. (c) Probabilities within lumped states add up. (d) Coarse-grained probability of the merged node is the sum; all other probabilities preserved.

Affinities: (a) preserved. (b) Preserved. (c) Not preserved even if topology unchanged. (d) Changed by the log of the proportion of the one-way fluxes between merged states, see eq. (39).

Fluxes: (a) not preserved. (b) Preserved. (c) Not preserved. (d) Preserved.

Iterative: (a) commutes. (b) Does not commute. (c) Lumping order not important. (d) Order of coarse-graining iterations not important (commutes).

Advantages: (a) well suited for infinitely large networks (chemical networks). (b) Affinities preserved; can only coarse grain bridges with a non-zero net flux. (c) Occupancy-number correlation function retained. (d) Coarse graining of branches; direct mapping between coarse-grained and original states; merging two nodes is iterable and the order is not important.

E. Minimizing the Kullback-Leibler divergence for trajectories
Another way to motivate the choice of transition rates in the coarse-grained system as given by equation (11) is by minimizing the Kullback-Leibler divergence KL(q || \tilde{q}) between the probabilities q and \tilde{q} of trajectories in the original system and in the coarse-grained system, respectively [25]. The Kullback-Leibler divergence [26] is defined as

  \mathrm{KL}(q || \tilde{q}) = \left\langle \ln\left( \frac{q}{\tilde{q}} \right) \right\rangle_q ,  (19)

and provides a non-symmetric measure of the similarity of two probability distributions. For this derivation, we write the master equation in matrix notation,

  \frac{d\mathbf{p}}{dt} = W \mathbf{p} ,  (20)

where the matrix elements of W are given by

  W_{ij} = \begin{cases} \alpha_{ji} & \text{for } i \neq j \\ -\sum_{k \neq i} \alpha_{ik} & \text{for } i = j \end{cases}  (21)

The diagonal elements are called exit rates and ensure conservation of probability. The exit rate of state i is the sum of all outgoing transition rates. The master equation is solved by

  \mathbf{p}(t) = e^{Wt} \mathbf{p}(0) ,  (22)

where G(t) = e^{Wt} denotes the propagator. To derive probabilities for trajectories, we discretize time and expand the propagator for small time steps \Delta t, G = 1 + \Delta t\, W, with

  G_{ij} = \Delta t \, W_{ij} \quad (i \neq j) , \qquad G_{ii} = 1 - \Delta t \sum_{j \neq i} W_{ji} .  (23)

The continuous-time dynamics described by the original master equation is recovered in the limit \Delta t \to 0. A trajectory \{i\} = (i_0, i_{\Delta t}, i_{2\Delta t}, \ldots, i_{t-\Delta t}) is a sequence of states at different time steps. Its probability can be expressed in terms of the propagator as

  q(\{i\}) = p_{i_0}(0) \prod_{\tau=0}^{t-\Delta t} G_{i_{\tau+\Delta t}, i_\tau} .  (24)

We now reduce the dimensionality of the state space by merging adjacent states.
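The discretized propagator of equations (23) and (24) can be sketched numerically (Python/NumPy; the 3-state rates are hypothetical, for illustration only). Columns of G = 1 + \Delta t\, W sum to one, so probability is conserved in each time step, and a trajectory probability is a product of propagator elements:

```python
import numpy as np

# Hypothetical 3-state rates alpha[i, j] for the transition i -> j (illustrative values).
alpha = np.array([[0.0, 2.0, 0.5],
                  [0.3, 0.0, 1.5],
                  [1.0, 0.4, 0.0]])

# Generator W with W[i, j] = alpha[j, i] off-diagonal and exit rates on the diagonal, eq. (21).
W = alpha.T - np.diag(alpha.sum(axis=1))

# Short-time propagator G = 1 + dt W, eq. (23); its columns sum to one,
# so probability is conserved in every time step.
dt = 1e-3
G = np.eye(3) + dt * W
print(np.allclose(G.sum(axis=0), 1.0))

# Probability of the trajectory 0 -> 1 -> 1 over two time steps, eq. (24),
# starting from a uniform initial distribution.
p0 = np.ones(3) / 3
traj = [0, 1, 1]
q = p0[traj[0]]
for a, b in zip(traj, traj[1:]):
    q *= G[b, a]   # G[b, a] is the probability of stepping from a to b
print(0 < q < 1)
```

Note the column convention: G[b, a] propagates from state a to state b, matching W_{ij} = \alpha_{ji} in equation (21).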
Instead of considering a single coarse-graining iteration as in section II B, we can examine a system after several coarse-graining iterations, as the order of those steps does not affect the result. After mapping states in the original system (denoted by small letters) to states in the coarse-grained system (denoted by capital letters), we have to define transition rates \tilde{W}_{IJ} for the coarse-grained model. We will show that the choice of transition rates in our coarse-graining approach minimizes the Kullback-Leibler divergence KL(q(\{i\}) || \tilde{q}(\{I\})), which compares the probabilities of trajectories in the original and the reduced state space. A trajectory \{I\} = (I_0, I_{\Delta t}, I_{2\Delta t}, \ldots, I_{t-\Delta t}) in the coarse-grained state space has probability

  \tilde{q}(\{I\}) = \tilde{p}_{I_0}(0) \prod_{\tau=0}^{t-\Delta t} \tilde{G}_{I_{\tau+\Delta t}, I_\tau} ,  (25)

where the short-time propagator for the reduced system has the matrix elements

  \tilde{G}_{IJ} = \Delta t \, \tilde{W}_{IJ} \quad (I \neq J) , \qquad \tilde{G}_{II} = 1 - \Delta t \sum_{J \neq I} \tilde{W}_{JI} ,  (26)

in analogy to equation (23).

In order to minimize the Kullback-Leibler divergence in equation (19), we have to consider the mean of the logarithm of the distribution \tilde{q} in the coarse-grained system with respect to the distribution q in the original system:

  \langle \ln \tilde{q} \rangle_q = \langle \ln \tilde{p}_{I_0}(0) \rangle_q + \sum_{\tau=0}^{t-\Delta t} \langle \ln \tilde{G}_{I_{\tau+\Delta t}, I_\tau} \rangle_q .  (27)

If we assume that the distribution p in the original system starts in its steady state, each time interval (\tau, \tau + \Delta t) has the same average, such that

  \langle \ln \tilde{q} \rangle_q = \sum_i p_i \ln \tilde{p}_{I}(0) + \frac{t}{\Delta t} \sum_{IJ} \sum_{i \in I, j \in J} G_{ij} p_j \ln \tilde{G}_{IJ} ;  (28)

here, the sum over the time steps in equation (27) could be simplified because \langle \ln \tilde{G}_{I_{\tau+\Delta t}, I_\tau} \rangle_q becomes time-independent in the steady state and hence each summand in the sum over time contributes equally.
The probabilities in a cluster I add up (p_I = Σ_{i∈I} p_i) and the cluster propagator can be defined as G_IJ = p_J^{−1} Σ_{i∈I, j∈J} G_ij p_j, such that equation (28) can be written as

⟨ln q̃⟩_q = Σ_I p̃_I ln p̃_I(0) + (t/Δt) Σ_{IJ} G_IJ p_J ln G̃_IJ . (29)

In the limit of long trajectories t → ∞, the first term can be discarded. In the second term, we separate diagonal (I = J) and off-diagonal (I ≠ J) terms:

(1/t) ⟨ln q̃⟩_q = Σ_I G_II p_I ln(G̃_II)/Δt + Σ_{I≠J} (G_IJ/Δt) p_J ( ln Δt + ln(G̃_IJ/Δt) ) . (30)

Going back to continuous time, i.e. for Δt →
0, we can rewrite the factors in equation (30) using

lim_{Δt→0} G_II = 1 ,   lim_{Δt→0} G_IJ/Δt = W_IJ , (31)

and

lim_{Δt→0} G̃_IJ/Δt = W̃_IJ ,   lim_{Δt→0} ln(G̃_II)/Δt = lim_{Δt→0} ln(1 − Δt Σ_{J≠I} W̃_JI)/Δt = −Σ_{J≠I} W̃_JI , (32)

where W_IJ denotes the clustered rates of the original model and W̃_IJ the approximating rates in the coarse-grained model. The term with ln(Δt) cancels with the same term in ⟨ln q⟩_q
in the Kullback-Leibler divergence KL(q||q̃) = ⟨ln q⟩_q − ⟨ln q̃⟩_q. After performing the limit, equation (30) can be written as

lim_{Δt→0} (1/t) ⟨ln q̃⟩_q = −Σ_I p_I Σ_{J≠I} W̃_JI + Σ_{I≠J} W_IJ p_J ln(W̃_IJ) + c = Σ_{I≠J} ( W_IJ p_J ln(W̃_IJ) − W̃_IJ p_J ) + c , (33)

with a constant c that is independent of W̃_IJ. The minimization of KL(q||q̃) is thus equivalent to maximizing its second term ⟨ln q̃⟩_q. The derivative of this term with respect to W̃_IJ is

d/dW̃_IJ (1/t) ⟨ln q̃⟩_q = p_J ( W_IJ/W̃_IJ − 1 ) = 0 . (34)

Hence, W̃_IJ = W_IJ minimizes the Kullback-Leibler divergence KL(q||q̃) and yields the same choice of transition rates for the coarse-grained system,

W̃_IJ = W_IJ = Σ_{i∈I, j∈J} W_ij p_j / Σ_{j∈J} p_j , (35)

as in equation (11). Summarizing, we have shown in this section that minimizing the Kullback-Leibler divergence for trajectories yields the same choice of transition rates as proposed in equation (11).

III. COARSE GRAINING WITH PRESERVED CYCLE TOPOLOGY

A. Cycle decomposition of the kinetic diagram for the kinesin motor
In this and the next section, we will discuss the coarse graining of the kinetic diagram of the molecular motor kinesin as an example for simplifying a system in a non-equilibrium steady state. We will first coarse grain the network without changing its cycle topology and postpone coarse-graining steps that remove cycles to the next section.

Kinesin is a motor protein with two heads; it can carry cargo, moves along microtubule filaments and is powered by the hydrolysis of adenosine triphosphate (ATP) [28, 29]. Kinesin can be described by the 6-state network [18] that we have already used as an example above, depicted in figure 3. The six states correspond to different chemical states of the two motor heads (ATP- or ADP-bound or free). Transition 2 → 5 describes the mechanical forward step of the motor and the backward transition 5 → 2 the corresponding backward step.

FIG. 3. Kinetic diagram for kinesin from [18]. The 6-state model will be simplified to a 4-state model without changing the number of cycles within the network. In the forward cycle, states 1 and 6 will be merged to state 1'', while in the backward cycle, states 3 and 4 are described by state 3'' in the coarse-grained network. Except for the fluxes between the merged states, all net transition fluxes are retained. We eliminated the transition with minimal contribution to entropy production in each fundamental cycle.

The blue coloured cycle C_F = {1, 2, 5, 6}, defined by the dotted chord in figure 3, is called the forward cycle C_F.
It corresponds to the normal working cycle of the kinesin motor under low loads. The red coloured cycle C_B = {2, 3, 4, 5} (defined by the dashed chord 2 →
3) incorporates a mechanical backward transition and an ATP hydrolysis transition and is called the backward cycle C_B. The third cycle, which is not fundamental in the decomposition of figure 3, contains no mechanical transition but two ATP hydrolysis transitions and hence is a dissipative or futile cycle C_D. The network has two independent transition fluxes. With the choice of the two chords in the cycle decomposition depicted in figure 3, all transition fluxes can be expressed in terms of the chord fluxes, which we denote J_1 and J_2:

J_56 = J_61 = J_12 = J_1 = J_F + J_D ,
J_23 = J_34 = J_45 = J_2 = J_B + J_D ,
J_25 = J_1 − J_2 = J_F − J_B . (36)

For the sake of completeness, the relations between transition and cycle fluxes are listed as well. The cycle fluxes J_C (with C = F, B, D) are defined according to Hill's convention [5] and can be interpreted as the number of completions of cycle C per unit time. Cycle fluxes will be needed later to calculate fluctuations of steady-state quantities. Equation (52) in the supplementary material shows how to calculate the cycle fluxes in the kinesin network. The entropy change after completion of a cycle C can be expressed in terms of the transition affinities within the cycle as defined in equation (8). With the choice of chords depicted in figure 3, the affinity of the futile cycle C_D can be written as a linear combination of those of the fundamental cycles,

ΔS_D = ΔS_F + ΔS_B . (37)

The entropy production of a system in a non-equilibrium steady state can be written in terms of the fundamental cycle affinities and the fluxes through the chords defining the fundamental cycles, as stated in equation (7). This yields

P = Σ_l J_l ΔS_{C_l} = J_1 ΔS_F + J_2 ΔS_B . (38)

B. Coarse graining of the kinetic diagram of kinesin without changing the network topology
To simplify the state space, we merge states 3 and 4 in the backward cycle, and subsequently states 1 and 6 in the forward cycle, as shown in figure 3. The network topology, i.e. the number of fundamental cycles, remains unchanged by these network simplifications. The coarse-grained network still contains three cycles, of which two are fundamental. By construction of our coarse-graining approach, all transition fluxes are conserved; in particular, the chord fluxes remain unchanged (J̃_1 = J_1 and J̃_2 = J_2). The flux between the merged nodes is lost, however, and hence the fundamental cycle affinities are reduced by the transition affinity of the eliminated transition between the merged nodes:

ΔS̃_F = ΔS_F − ln( α_61 p_6 / (α_16 p_1) ) ,
ΔS̃_B = ΔS_B − ln( α_34 p_3 / (α_43 p_4) ) . (39)

The cycle affinities are retained if the net flux between the merged states is zero. Hence, the coarse graining also reduces the entropy production,

ΔP = P − P̃ = J_1 ln( α_61 p_6 / (α_16 p_1) ) + J_2 ln( α_34 p_3 / (α_43 p_4) ) , (40)

unless detailed balance is fulfilled between the merged states.

C. Fluctuations in the kinesin network
Our coarse-graining approach is designed to preserve fluxes, which correspond to averages of observable properties of the kinesin motor. To demonstrate how our coarse-graining approach affects the distributions of quantities such as the entropy production and the velocity of kinesin, we simulated 10000 trajectories of finite duration (analytical results are presented in appendix 3). We compared the original 6-state model of Liepelt and Lipowsky [18] with two coarse-grained systems. In the 5-state system, states 3 and 4 of the original model are merged, such that the backward cycle loses one transition. In the 4-state model, states 1 and 6 are merged in addition, such that the forward cycle loses one transition as well.

In figure 4(a), we show the velocity distributions for the 6-, 5-, and 4-state models, obtained from 10000 trajectories with simulation time 1200 s. The velocity of the motor is proportional to the flux between states 2 and 5 and is given by

v = l J_25 , (41)

where l is the step size of the motor. The variance of the velocity decreases for longer trajectories. The velocity is proportional to the net flux of the mechanical transition (2 → 5 in the 6-state model, 2' → 5' in the 5-state model, 2'' → 5'' in the 4-state model). Leaving the network topology unchanged, our coarse-graining approach preserves the steady-state flux and hence

FIG. 4. Simulation results for the velocity (a) and the entropy production (b) of a kinesin motor. The rate constants for the 6-state model are from [18], for chemical concentrations [ATP] = [ADP] = [P] = 1 µM, stepping size l = 8 nm and an external load force F = 1 pN. 10000 trajectories have been sampled for each model with a simulation time τ = 1200 s. The black data (circles) correspond to the original 6-state model from [18], which is depicted in figure 3. The blue data (squares) represent a coarse-grained system with five states, where the original states 3 and 4 have been merged. The 4-state model is depicted in red and with triangles. The entropy production for a trajectory of finite length was calculated as in equation (42). For the entropy production, the 4-state model underestimates the mean by around 17%. The velocity v of a kinesin motor is proportional to the step size l = 8 nm and to the flux along the mechanical transition (J_25 in the 6-state, J_2'5' in the 5-state and J_2''5'' in the 4-state model). Our coarse-graining approach preserves the mean of the velocity, whereas it does not preserve variances of observables. However, the numerical differences for the variance of the velocity are very small.

the mean velocity (v = ṽ = 10.
689 nm/s) remains unchanged in the course of the coarse-graining procedure. Moreover, the full velocity distribution seems to be unaffected by the coarse graining. In fact, however, our coarse-graining approach does not preserve variances of observables. Calculating the variance analytically shows that the difference, and hence the error due to coarse graining, is very small in this case: the standard deviations σ_v and σ_ṽ of the velocity in the 6-state model and in the coarse-grained models differ only marginally. In general, only mean values are conserved by our reduction scheme. In this example, the relative differences are small because the network topology is preserved and the flux between the merged nodes is small compared to the flux along the mechanical transition between states 2 and 5.

The corresponding distributions of the entropy production for trajectories of length 1200 s are shown in figure 4(b). For a trajectory of duration τ with m transitions n_0 → n_1 → ... → n_{m−1} → n_m we consider the quantity

P_τ = (k_B/τ) ln[ α_{n_0 n_1} α_{n_1 n_2} ... α_{n_{m−1} n_m} / ( α_{n_1 n_0} α_{n_2 n_1} ... α_{n_m n_{m−1}} ) ] (42)

as the entropy production, as proposed by Lebowitz and Spohn in [31] and also used e.g. in [32]. From equation (40), we expect the mean of the entropy production for the coarse-grained systems (4-state model in red and 5-state model in blue) to be reduced compared to the 6-state model (black). The variance is not retained either in the coarse-graining step. However, the numerical differences between the 6-state and the 5-state model are very small, as the edge with the minimal contribution to entropy production in the backward cycle is removed and the backward cycle is dominated by the forward cycle for small loads. Coarse graining from six to five states leads to a relative decrease of the mean entropy production of about 0.02%. Hence the 6-state and the 5-state model show similar distributions. In general, however, mean and variance of the entropy production are not retained by our coarse-graining approach.
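Equation (42) can be evaluated directly for any recorded sequence of states. The following sketch is illustrative only (arbitrary rates, k_B = 1, and a hand-picked path rather than a simulated kinesin trajectory):

```python
# Sketch of eq. (42): the Lebowitz-Spohn entropy production of a finite
# trajectory, P_tau = (k_B/tau) * ln(prod alpha_forward / prod alpha_backward).
# Rates and path are illustrative; k_B is set to 1.
import math

alpha = {(0, 1): 3.0, (1, 0): 1.0, (1, 2): 2.0,
         (2, 1): 1.0, (2, 0): 4.0, (0, 2): 1.0}

def entropy_production(path, tau, alpha, kB=1.0):
    log_ratio = 0.0
    for n_prev, n_next in zip(path, path[1:]):
        # each jump contributes ln(alpha_forward / alpha_backward)
        log_ratio += math.log(alpha[(n_prev, n_next)] / alpha[(n_next, n_prev)])
    return kB / tau * log_ratio

# one full turn around the cycle 0 -> 1 -> 2 -> 0 in time tau = 2
P_tau = entropy_production([0, 1, 2, 0], tau=2.0, alpha=alpha)
print(P_tau)
```

For the single three-state cycle used here, one full turn contributes the cycle affinity ln 24 to the logarithm, so P_τ = ln(24)/τ.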
Coarse graining the edge with minimal contribution in the forward cycle reduces the mean entropy production by about 17% compared to the original model. That the latter coarse-graining step has a stronger impact is related to the fact that the forward cycle is dominant and yields the largest contribution to the entropy production, at least under the conditions of low external force used here. With increasing force, the backward cycle becomes more important. The force dependence will be discussed in more detail in the next section, where multi-cyclic and uni-cyclic kinesin models are compared.

To summarize, the 6-state model for kinesin has been simplified without changing its network topology. The steady-state velocity remains unchanged if the network topology is preserved, whereas the mean of the entropy production always decreases if a transition is eliminated that does not fulfil detailed balance. Variances (here of entropy production and velocity) are in general not preserved.

IV. COARSE GRAINING WITH CHANGED CYCLE TOPOLOGY: ITERATIVELY COARSE-GRAINED KINESIN

A. General approach
In the previous section, we coarse grained the kinetic diagram of kinesin without changing the cycle topology: both the 6-state model of Liepelt and Lipowsky [18] and the coarse-grained 4-state model contain three cycles. However, the kinesin literature, in particular earlier studies, also contains models with a single cycle. One prominent example is the uni-cyclic 4-state model by Fisher and Kolomeisky [19]. In this section, we will also allow for the removal of cycles in coarse graining, which allows us to obtain such a uni-cyclic model from the 6-state model with three cycles. We use a general approach, in which we iteratively eliminate states from the kinesin model to simplify the description of the molecular motor and to obtain a hierarchy of models with different levels of coarse graining. In each coarse-graining iteration, the two states that are merged are chosen such that the transition or the cycle with minimal entropy production is removed and thus the difference in entropy production between the models is minimal. In general, there will be coarse-graining steps that preserve the cycle topology and steps that change it. For the kinesin model, we can perform four iterations of the coarse-graining step. In each iteration, the model loses one state, such that we end up with a 2-state network after step 4. The general coarse-graining algorithm is as follows:

1. Find the (chemical) transition (ij) with minimal contribution to the entropy production:

min_(ij) J_ij ΔS_ij = min_(ij) (α_ij p_i − α_ji p_j) · ln( α_ij p_i / (α_ji p_j) ) . (43)

The restriction to chemical transitions is not generally necessary, but for the specific example of the kinesin motor, it guarantees that stepping remains part of all coarse-grained models.

2.
After finding the transition (ij) with minimal contribution to the entropy production, we have to check whether merging states i and j results in a change of cycle topology. If the system loses a cycle C by merging states i and j, one checks whether the contribution J_ij ΔS_C to the entropy production is smaller than the contribution J_mn ΔS_mn of any transition (mn) that is not part of cycle C, i.e. one checks whether

J_ij ΔS_C < min_{(mn)∉C} J_mn ΔS_mn . (44)

3. If equation (44) is fulfilled or merging states i and j does not change the cycle topology, merge the pair of states i and j. If equation (44) is not fulfilled and merging states m and n does not change the cycle topology, merge states m and n. Otherwise, eliminate the cycle with minimal contribution to the entropy production.

4. Repeat until one obtains an equilibrium two-state network.

We will demonstrate how this algorithm can be used to simplify the model of a kinesin motor under different external load forces. For a given value of the force, the coarse-grained models at different levels of the hierarchy will be compared with respect to their network topology and the distributions of their entropy production and velocity. Kinesin preferably moves in the forward direction when no load force is applied. Its velocity decreases under load [33, 34]; the force that results in zero mean velocity is called the stall force (approximately 7 pN [34]). We consider three cases: (a) no load (section IV B), (b) a load below the stall force (section IV C), and (c) a load above the stall force, for which the motor moves backwards (section IV D).

Since the coarse-graining procedure we use generates a hierarchy of models, one can ask which degree of coarse graining is optimal in some sense. It is desirable to remove as many states as possible, while retaining as much as possible the key characteristics of the model.
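Step 1 of the algorithm, equation (43), amounts to ranking the transitions by their entropy-production contribution J_ij ΔS_ij. A minimal sketch for an illustrative driven 3-state cycle (arbitrary rates; steady state from the standard spanning-tree formula, not the kinesin parameters) is:

```python
# Sketch of step 1 of the coarse-graining algorithm (eq. 43): pick the
# transition with the smallest contribution J_ij * dS_ij to the entropy
# production. Illustrative 3-state cycle with steady state via spanning trees.
import math

a = {(0, 1): 3.0, (1, 0): 1.0, (1, 2): 2.0,
     (2, 1): 1.0, (2, 0): 4.0, (0, 2): 1.0}

# steady-state weights from the King-Altman/Hill spanning-tree formula
w0 = a[(1, 0)]*a[(2, 0)] + a[(2, 1)]*a[(1, 0)] + a[(1, 2)]*a[(2, 0)]
w1 = a[(2, 1)]*a[(0, 1)] + a[(0, 2)]*a[(2, 1)] + a[(2, 0)]*a[(0, 1)]
w2 = a[(0, 2)]*a[(1, 2)] + a[(1, 0)]*a[(0, 2)] + a[(0, 1)]*a[(1, 2)]
Z = w0 + w1 + w2
p = [w0/Z, w1/Z, w2/Z]

def contribution(i, j):
    fwd, bwd = a[(i, j)]*p[i], a[(j, i)]*p[j]
    return (fwd - bwd) * math.log(fwd / bwd)   # J_ij * dS_ij >= 0

edges = [(0, 1), (1, 2), (2, 0)]
best = min(edges, key=lambda e: contribution(*e))
print(best, contribution(*best))
```

Each summand is non-negative, so the minimal one is the transition whose elimination costs the least entropy production.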
We propose to use the entropy production as the characteristic that should be retained as much as possible, as it quantifies the out-of-equilibrium nature of the system. Entropy production is reduced with every step of coarse graining. After four iterations, four states have been eliminated such that kinesin is described by a two-state model, which is in equilibrium by definition and has zero entropy production. To balance the reduction in the number of nodes of the network against the loss of entropy production, we introduce a cost function (a free-energy-like quantity)

F(ν) = ΔP(ν)/P_0 − T ΔN(ν)/N_0 . (45)

Here the change in the number of nodes ΔN(ν) = N_0 − N(ν) and the change of the entropy production ΔP(ν) = P_0 − P(ν) are both determined with respect to the original 6-state model
Figure 1: bla1 F r a c t i o n a l c h a n g e ΔPP ΔNN ΔPP - ΔNN Heuristic sweet spot ΔPP ΔNN ΔPP - ΔNN Δeuristic sweet spot −0.50.00.51.0 ΔPP ΔNN ΔPP Δ- ΔNN Heuristic sweet spot F r a c t i o n a l c h a n g e Entropy ΔPP Velocity Δvv Velocity Δvv Velocity Δvv FIG. 5. Different models for a kinesin motor. In (a), no external stall force is applied and theforward cycle is dominant. In (b), the kinesin motor is at an external load force of 6.5pN, whichis a substantial force, but smaller than the stall force. In (c), the applied load force is larger thanthe stall force and the backward cycle is dominant. and normalized to the respective value in the original model denoted with subscript 0.We have lost a fraction ∆ P ( ν ) /P of the entropy production and eliminated a fraction∆ N ( ν ) /N of the states after coarse-graining iteration ν . The weighting factor T controlsthe relative weight of the information loss (the entropy production) and the simplicity (thenumber of eliminated states) of our model. In the following, we chose T to be one and henceboth terms contribute equally. We can then find the optimal level of coarse-graining byminimizing F ( ν ). B. Different levels of coarse-grained models for zero external load force
For a free kinesin motor with zero external force, figure 5(a) shows the hierarchy of models at different levels of coarse graining. A uni-cyclic 3-state model (after three iterations) represents a sweet spot between information loss in terms of entropy production and simplicity of the model if both terms are weighted equally (T = 1). If no external force is applied, the forward cycle is dominant and yields the largest contribution to the entropy production P = J_1 ΔS_F + J_2 ΔS_B. Therefore, the first two coarse-graining iterations only affect the backward cycle. The first step preserves the network topology and the backward cycle loses the transition with the smallest contribution J_ij ΔS_ij to the entropy production. This does not affect the two independent transition fluxes J_1 and J_2 (chord fluxes). Hence, the mean velocity of the motor is not changed either. The second iteration changes the network topology by eliminating the backward cycle. Thus, the coarse-grained model after two steps is a uni-cyclic network. The flux in the uni-cyclic network equals the chord flux J_1 of the 6-state model. Losing one fundamental cycle and one independent transition flux changes the mean velocity of the motor. Iteration three eliminates a transition within the uni-cyclic 4-state network. The last coarse-graining iteration results in a two-state network, which obeys detailed balance by construction. Hence, both the entropy production and the mean velocity of the motor are zero.

In summary, the largest changes in entropy production and mean velocity result from the last coarse-graining step. For zero or small external load forces, the forward cycle is dominant. The backward cycle can be removed by coarse graining, because the resulting approximation error in the steady-state velocity is smaller than 1%. The uni-cyclic 3-state model thus represents an optimal trade-off between information loss and simplicity of the model if the motor works at zero external load force.
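The choice of the optimal iteration by minimizing F(ν) of equation (45) can be sketched as follows. The entropy-production values in the list are made-up placeholders (chosen such that, as in the zero-load case above, the minimum falls at iteration 3); they are not the kinesin values from the paper:

```python
# Sketch of the model-selection criterion of eq. (45). P and N are
# illustrative placeholder values for a 5-level hierarchy of models,
# not the numbers obtained for kinesin.
P = [100.0, 99.8, 95.0, 92.0, 0.0]   # entropy production P(nu), nu = 0..4
N = [6, 5, 4, 3, 2]                   # number of states after nu iterations

def cost(nu, T=1.0):
    dP = (P[0] - P[nu]) / P[0]        # fractional loss of entropy production
    dN = (N[0] - N[nu]) / N[0]        # fraction of eliminated states
    return dP - T * dN                # eq. (45) with weighting factor T

best_nu = min(range(len(P)), key=cost)
print(best_nu, cost(best_nu))
```

With T = 0 the minimum is always the original model (ν = 0), while for large T the fully merged 2-state model wins, reflecting the trade-off discussed in the text.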
To investigate whether the coarse-grained model approximates the original model well also in the presence of a load force, we plot the differences in the velocity and the entropy production between various coarse-grained models and the original model as functions of the force in figure 6 (the corresponding variances are shown in figure 8 in the appendix). Both figures indicate that uni-cyclic models approximate the original model well for zero or small external load forces. For increasing forces, the approximation of the velocity and the entropy production distributions becomes less accurate. The next subsection shows that changing the force parameter results in a different hierarchy of models at the different levels of coarse graining.

C. Different levels of coarse-grained models for an external load force smaller than the stall force

With an increasing external load force, the mean velocity of the molecular motor decreases and the forward cycle becomes less dominant in terms of entropy production. Figure 5(b) shows different models for a kinesin motor at an external load force of 6.5 pN, which is a substantial force, but smaller than the stall force.

D. Different levels of coarse-grained models for an external load force larger than the stall force
Finally, we consider a force above the stall force. In figure 5(c), the 6-state model for themotor is iteratively coarse grained for an external load force F = 9 pN. In this situation, themolecular motor moves backwards and the mean velocity is negative. The backward cycleis dominant. The transitions in the forward cycle contribute less to the entropy productionthan those in the backward cycle. Hence in the first coarse-graining step, a transition in theforward cycle is removed and in the second step, the forward cycle is eliminated entirely.The mean velocity is changed in those steps in which the network topology is modified. E. Summary
In the previous sections, the 6-state model for kinesin with two fundamental cycles has been iteratively coarse grained such that the difference in the steady-state entropy production is minimal. In addition, we determined the velocity as a function of force, the main quantity of interest from an experimental point of view. Plotted in figures 6(c) and (f), the relative differences between these quantities in the coarse-grained models and in the original 6-state model show a pronounced difference between models that preserve the network topology and models that do not. As long as the network topology is preserved, the velocity is unchanged by the coarse graining and the entropy production is only weakly affected. However, when the network topology is modified in a coarse-graining step, both quantities change. Similar observations pertain to the variances of these quantities, as shown in appendix 3. The difference between coarse graining with and without preserved network topology is particularly pronounced for forces around the stall force, where multiple cycles contribute to the dynamics. Uni-cyclic models are a good approximation for load forces that are considerably smaller (or larger) than the stall force. For small forces, the motor walks in the positive direction and has a dominant forward cycle, such that the backward cycle can be neglected. At and above the stall force, however, the backward cycle makes a strong contribution.
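The stationarity-preserving property of the merge rule, equation (11), that underlies these observations can be checked on a minimal example: merging two states with the probability-weighted rates leaves the lumped steady state stationary. The sketch below uses an illustrative driven 3-state cycle, not the kinesin rates:

```python
# Sketch: merging two adjacent states with the probability-weighted rates of
# eq. (11) keeps the lumped steady state stationary. Illustrative 3-state
# cycle (arbitrary rates); steady state via the spanning-tree formula.
a = {(0, 1): 3.0, (1, 0): 1.0, (1, 2): 2.0,
     (2, 1): 1.0, (2, 0): 4.0, (0, 2): 1.0}
w0 = a[(1, 0)]*a[(2, 0)] + a[(2, 1)]*a[(1, 0)] + a[(1, 2)]*a[(2, 0)]
w1 = a[(2, 1)]*a[(0, 1)] + a[(0, 2)]*a[(2, 1)] + a[(2, 0)]*a[(0, 1)]
w2 = a[(0, 2)]*a[(1, 2)] + a[(1, 0)]*a[(0, 2)] + a[(0, 1)]*a[(1, 2)]
Z = w0 + w1 + w2
p = [w0/Z, w1/Z, w2/Z]

# merge states 1 and 2 into cluster B; cluster A = {0}
pB = p[1] + p[2]
rate_AB = a[(0, 1)] + a[(0, 2)]                     # all channels from A into B
rate_BA = (a[(1, 0)]*p[1] + a[(2, 0)]*p[2]) / pB    # probability-weighted exits

# the lumped distribution (p[0], pB) is stationary for the 2-state model
assert abs(p[0]*rate_AB - pB*rate_BA) < 1e-12
print(p[0], pB, rate_AB, rate_BA)
```

Note that this particular merge removes the cycle, illustrating that stationarity of the lumped probabilities is preserved even when the topology changes, while cycle-related quantities (affinity, entropy production) are not.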
FIG. 6. In (a) and (d), the mean velocity and the entropy production are plotted against the external load force acting on the motor for the different kinesin models of figure 5 (tri-cyclic 6-state model, tri-cyclic 4-state model, uni-cyclic 4- and 3-state models resembling the forward cycle, and the uni-cyclic 4-state model resembling the backward cycle). The figures in the second column, (b) and (e), show an enlarged view of the differences between tri-cyclic and uni-cyclic systems for large external load forces. Furthermore, the relative differences between the mean velocity and the entropy production of the coarse-grained models and of the original 6-state model are depicted in the third column, (c) and (f). The tri-cyclic 4-state model is optimal in terms of preserving the velocity of the motor. Moreover, the load-force ranges indicated in (f) show which model is most suitable for approximating the entropy production within the given range. The motor has zero mean velocity for F = 7 pN. For no or small external load forces, uni-cyclic models resembling the forward cycle approximate the mean velocity and the entropy production well, with small deviations. For load forces around the stall force, on the other hand, a coarse-grained model is better provided it has the same network topology as the original model. In figure 8 in the supplementary material, the variances of velocity and entropy production are plotted against the external load force.

The uni-cyclic model that resembles the backward cycle approximates the velocity of the motor well if the load force is considerably larger than the stall force, such that the backward cycle is dominant and the forward cycle can be neglected. We also note that, since changes in network topology affect the velocity, they also shift the stall force itself; the stall force of the uni-cyclic model resembling the backward cycle is shifted to F̃_stall ≈ 13 pN. These observations can be understood by writing the entropy production as a sum over all transitions,

P = Σ_(ij) J_ij ΔS_ij . (46)

If the system contains N states and E transitions, the sum has E terms. Equivalently, the entropy production can also be expressed as a sum of fewer terms, namely L = E − N + 1 (equation 6), if we make use of the cycle decomposition. Reducing the number of fundamental cycles thus always reduces the number of terms L in the expansion of the entropy production in the basis of fundamental cycles. If, on the other hand, the network topology is preserved, the chord fluxes J_l are unchanged, whereas the cycle affinities ΔS_{C_l} of the fundamental cycles C_l are changed and the number of terms is preserved.

V. DISCUSSION AND CONCLUSION
We have introduced a coarse-graining procedure for systems governed by a master equation. The approach merges two states such that the probability of the coarse-grained state is the sum of the probabilities of the two fine-grained states, redefining the transition rates according to equation (11). All probabilities of states not affected by the merge are retained. The outgoing transition rates of the coarse-grained state are chosen such that the outgoing flux is conserved. This approach changes steady-state probabilities and transition rates only locally. The procedure can be applied iteratively and imposes no constraints on the network topology. This is in contrast to the coarse-graining approach presented by Altaner and Vollmer [16], which is restricted to merging adjacent bridge states; coarse graining of branches or of states with more complicated transitions to the rest of the network, such as the one depicted in figure 2, is not possible there. The only constraint of our approach is the adjacency of the states that are to be merged. As a consequence, the presented approach can (and typically will) reduce the number of fundamental cycles in the system, as shown in section IV.

Moreover, we propose that for an iterative coarse-graining procedure, balancing the (unwanted) loss of entropy production against the (wanted) reduction of the number of nodes of the network can yield a heuristic sweet spot, an optimally coarse-grained model. The cost function proposed in equation (45) quantifies this balance of information loss and simplicity of the model. The relative weight parameter T is so far chosen arbitrarily. With T = 1, the information loss (the entropy production) and the simplicity (the number of eliminated states) contribute equally. In the limit of T →
0, no coarse graining at all is favourable and the minimum of the cost function is always the original model, whereas in the limit of T → ∞, the minimum of the cost function corresponds to a 2-state equilibrium model in which the maximal number of states has been merged. While the iterative coarse-graining procedure is not restricted by the features of the network, the criterion for optimal coarse graining can only be applied to systems out of thermodynamic equilibrium, due to the use of the loss in entropy production as a measure of information. Systems in equilibrium have zero entropy production, and hence the criterion cannot be used to compare models of equilibrium systems that are coarse grained to different levels.

In section II E, we minimized the Kullback-Leibler divergence for trajectories to justify our choice of transition rates in the coarse-grained system. The Kullback-Leibler divergence is a special case of the more general α-divergence

D_α(q||q̃) = 1/(α(1−α)) ⟨ 1 − (q/q̃)^α ⟩_q̃ . (47)

The Kullback-Leibler divergence is not symmetric, KL(q||q̃) ≠ KL(q̃||q), whereas the α-divergence is symmetric for α = 1/
2. Minimizing the α -divergence for α = 1 / α could lead to interesting other definitions of transition rates in the coarse-grainedsystem. APPENDIX1. Parametrisation for the molecular motor kinesin
Liepelt and Lipowsky proposed a 6-state model for kinesin with transition rates

\alpha_{ij} = \alpha^0_{ij}\, I_{ij}([X])\, \Phi_{ij}(F), \qquad (48)

where the first factor \alpha^0_{ij} is a fitted rate constant that is independent of the concentrations of X ∈ {ATP, ADP, P} and of the external load force F [18] (see table II). The rate constants are based on experiments by Visscher et al. [33]. If a transition i → j does not involve binding of ATP, ADP or P, the second factor I_{ij}([X]) is equal to one; otherwise, it depends linearly on the concentration of the involved reactant. For instance, ATP is bound during the transition 1 → 2, so the transition rate \alpha_{12} depends linearly on the concentration of ATP: I_{12}([ATP]) = [ATP]. The force-dependence factor \Phi_{ij}(F) differs between the mechanical transitions (2 → 5 and 5 → 2) and all other, chemical, transitions. The factors for the mechanical transitions are

\Phi_{25}(F) = \exp\left( -\theta \frac{F l}{k_B T} \right), \qquad \Phi_{52}(F) = \exp\left( (1-\theta) \frac{F l}{k_B T} \right), \qquad (49)

where θ denotes the dimensionless load distribution factor, l the step size, k_B the Boltzmann constant and T the temperature (see table II). For all chemical transitions, the force-dependence factor

\Phi_{ij}(F) = \frac{2}{1 + e^{\chi_{ij} F l / k_B T}} \qquad (50)

is symmetric because the dimensionless force parameters satisfy \chi_{ij} = \chi_{ji} (see table II).
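As a concrete illustration, equations (48)–(50) can be assembled in a few lines of code. This is a sketch only: the function names are ours, and the numerical values of k_B T and the step size below are stand-ins, not the fitted constants of table II.

```python
import math

# Sketch of the kinesin rate parametrisation, equations (48)-(50).
# The numerical values are placeholders for illustration, NOT the
# fitted constants of table II.
KBT = 4.1     # thermal energy k_B*T in pN*nm (assumed room temperature)
L_STEP = 8.0  # kinesin step size l in nm (assumed)

def phi_mechanical(F, theta, forward=True):
    """Force factor for the mechanical transitions 2->5 and 5->2, eq. (49)."""
    x = F * L_STEP / KBT
    return math.exp(-theta * x) if forward else math.exp((1.0 - theta) * x)

def phi_chemical(F, chi):
    """Force factor for the chemical transitions, eq. (50); symmetric in (i, j)."""
    return 2.0 / (1.0 + math.exp(chi * F * L_STEP / KBT))

def rate(alpha0, F, conc=1.0, theta=None, chi=None, forward=True):
    """alpha_ij = alpha0_ij * I_ij([X]) * Phi_ij(F), eq. (48).
    `conc` is the reactant concentration for binding transitions (else 1)."""
    phi = phi_mechanical(F, theta, forward) if theta is not None else phi_chemical(F, chi)
    return alpha0 * conc * phi
```

At zero load all force factors reduce to one, so every rate collapses to \alpha^0_{ij}\, I_{ij}([X]), as the parametrisation requires.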
2. Cycle fluxes in the kinesin network
The cycle fluxes (net numbers of cycle completions per unit time) can be found from the extension of the diagram method described by Hill [5]: a flux diagram for a cycle is the cycle itself plus a set of arrows flowing into it, as depicted in figure 7. The algebraic value of a flux diagram for a cycle is the product of the rate constants associated with these arrows, multiplied by the contribution of the cycle (up to a normalisation constant). The cycle fluxes in the kinesin network can be calculated as

J_C = \frac{(\pi_C^+ - \pi_C^-)\, S_C}{S} = J_C^+ - J_C^- ,
J_F = \frac{(\alpha_{32}\alpha_{45} + \alpha_{32}\alpha_{43} + \alpha_{34}\alpha_{45})\,(\alpha_{12}\alpha_{25}\alpha_{56}\alpha_{61} - \alpha_{16}\alpha_{65}\alpha_{52}\alpha_{21})}{S} ,
J_B = \frac{(\alpha_{12}\alpha_{65} + \alpha_{12}\alpha_{61} + \alpha_{16}\alpha_{65})\,(\alpha_{23}\alpha_{34}\alpha_{45}\alpha_{52} - \alpha_{25}\alpha_{54}\alpha_{43}\alpha_{32})}{S} ,
J_D = \frac{\alpha_{12}\alpha_{23}\alpha_{34}\alpha_{45}\alpha_{56}\alpha_{61} - \alpha_{16}\alpha_{65}\alpha_{54}\alpha_{43}\alpha_{32}\alpha_{21}}{S} , \qquad (51)

where the factor S_C is the sum over all possible sets of remaining transitions that flow into the cycle but do not form a cycle themselves (dashed in figure 7), and the factor S is the sum of the weights of all spanning trees (directional diagrams).

TABLE II. Parameters of the 6-state kinesin model [18, 33]: the concentrations [P], [ADP] and [ATP]; the step size l and the thermal energy k_B T; the fitted rate constants \alpha^0_{ij}; the load distribution factor θ, which governs the mechanical transition rates as given by equation (49); and the dimensionless force parameters \chi_{ij} = \chi_{ji}, which govern the force dependence of the chemical transition rates, cf. equation (50).

The cycle fluxes in the coarse-grained network read

\tilde J_C = \frac{(\tilde\pi_C^+ - \tilde\pi_C^-)\, \tilde S_C}{\tilde S} ,
\tilde J_F = \frac{(\tilde\alpha_{3'2} + \tilde\alpha_{3'5})\,(\tilde\alpha_{1'2}\tilde\alpha_{25}\tilde\alpha_{51'} - \tilde\alpha_{1'5}\tilde\alpha_{52}\tilde\alpha_{21'})}{\tilde S} ,
\tilde J_B = \frac{(\tilde\alpha_{1'2} + \tilde\alpha_{1'5})\,(\tilde\alpha_{23'}\tilde\alpha_{3'5}\tilde\alpha_{52} - \tilde\alpha_{25}\tilde\alpha_{53'}\tilde\alpha_{3'2})}{\tilde S} ,
\tilde J_D = \frac{\tilde\alpha_{1'2}\tilde\alpha_{23'}\tilde\alpha_{3'5}\tilde\alpha_{51'} - \tilde\alpha_{1'5}\tilde\alpha_{53'}\tilde\alpha_{3'2}\tilde\alpha_{21'}}{\tilde S} . \qquad (52)

FIG. 7. In the first row, the blue arrows contribute to the forward cycle F in the positive (+) direction (counterclockwise); the algebraic value of the cycle is \pi_F^+ = \alpha_{12}\alpha_{25}\alpha_{56}\alpha_{61}. The dashed arrows flow into the forward cycle F; they correspond to the factor S_C in equation (51). In the second row, the flux diagrams for the forward cycle in the plus direction are depicted for the coarse-grained network shown in figure 3: states 1 and 6 of the original network are merged into state 1', and states 3 and 4 into state 3'.
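Hill's diagram method underlying equation (51) can be checked numerically by brute-force enumeration of spanning trees. The sketch below (a hypothetical helper, not code from the paper) computes the steady state as p_i = Σ_i / S, where Σ_i sums the directional diagrams rooted at state i; the normalisation S appearing in the cycle fluxes is the sum of all Σ_i. Enumeration is exponential in the network size, so this is only practical for small networks such as the 6-state kinesin model.

```python
import itertools

def steady_state_hill(n, rates):
    """Steady state of a master equation via Hill's diagram method:
    p_i = Sigma_i / S, where Sigma_i is the sum, over all spanning trees,
    of the product of rates along tree edges directed toward state i,
    and S = sum_i Sigma_i is the sum over all directional diagrams.

    `rates[(i, j)]` is the transition rate i -> j; both directions of
    every edge must be present."""
    edges = sorted({tuple(sorted(e)) for e in rates})
    sigma = [0.0] * n
    for subset in itertools.combinations(edges, n - 1):
        # keep only edge subsets that form a spanning tree (acyclic + spanning)
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        is_tree = True
        for a, b in subset:
            ra, rb = find(a), find(b)
            if ra == rb:
                is_tree = False
                break
            parent[ra] = rb
        if not is_tree:
            continue
        adj = {i: [] for i in range(n)}
        for a, b in subset:
            adj[a].append(b)
            adj[b].append(a)
        # orient all tree edges toward each possible root in turn
        for root in range(n):
            weight, stack, seen = 1.0, [root], {root}
            while stack:
                u = stack.pop()
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        weight *= rates[(v, u)]  # v lies farther from the root
                        stack.append(v)
            sigma[root] += weight
    total = sum(sigma)
    return [s / total for s in sigma]
```

For a two-state system with rates 2 and 1 this returns p = (1/3, 2/3), and for a symmetric three-state ring it returns the uniform distribution, as required.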
3. Fluctuations in the kinesin network
Variances of steady-state observables can be calculated explicitly in terms of so-called one-way cycle fluxes [35]. One-way cycle fluxes are not preserved by our coarse-graining method; hence, variances are not retained either.

The fluctuations of the entropy production and of the velocity can be expressed in terms of fluctuations in the numbers of completed cycles. All transition fluxes can be written as linear combinations of cycle fluxes, as shown in equation (36), and equation (52) demonstrates how to calculate the cycle fluxes in the kinesin network explicitly. The entropy production P and the velocity v depend on transition fluxes. In order to calculate the variances of P and v, we therefore need the variances of the transition fluxes, which are linear combinations of the cycle fluxes; the latter can be calculated with Hill's diagram method [5]. For a long time interval τ, the net number n_i of completed cycles of type i ∈ {F, B, D} can be treated as an independent Gaussian random variable with mean and variance

\bar n_i(\tau) = J_i^{\mathrm{net}} \tau , \qquad \sigma_i^2(\tau) = (J_i^+ + J_i^-)\, \tau , \qquad (53)

as shown in [36], where J_i^± denotes the cycle flux in the positive or negative direction, respectively. To avoid confusion, the net flux is denoted by J_i^{net}; if not stated otherwise, all fluxes are net fluxes. The net flux through edge (2,5) can be expressed in terms of cycle fluxes as proposed in equation (36), such that

J_{25}^{\mathrm{net}} = J_F^{\mathrm{net}} - J_B^{\mathrm{net}} . \qquad (54)

The net numbers of completed forward (F) and backward (B) cycles (depicted in figure 1) are independent random variables for long time intervals τ [36]. Thus the variances add, as the net number of transitions n_{25} is a sum of independent random variables with zero covariance:

n_{25} = n_F - n_B , \qquad \bar n_{25} = \bar n_F - \bar n_B , \qquad \sigma_{25}^2 = \sigma_F^2 + \sigma_B^2 . \qquad (55)

The velocity thus has mean and variance

\bar v / l = J_{25}^{\mathrm{net}} = \bar n_{25}/\tau , \qquad \sigma_{v/l}^2 = \sigma_{n_{25}}^2 / \tau^2 = \frac{J_F^+ + J_F^- + J_B^+ + J_B^-}{\tau} . \qquad (56)

Explicit expressions for the one-way cycle fluxes can be found in equation (52).
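As a minimal numerical companion to equations (54)–(56), the mean and variance of the velocity can be computed directly from the four one-way cycle fluxes; the function and argument names below are ours, not the paper's.

```python
def velocity_stats(J_F_plus, J_F_minus, J_B_plus, J_B_minus, l, tau):
    """Mean and variance of the motor velocity, equations (54)-(56):
    v/l is the net number of (2,5)-transitions per unit time, n = n_F - n_B."""
    mean_v = l * ((J_F_plus - J_F_minus) - (J_B_plus - J_B_minus))  # v = l * J_25^net
    var_v = l ** 2 * (J_F_plus + J_F_minus + J_B_plus + J_B_minus) / tau
    return mean_v, var_v
```

Note that the mean depends only on net cycle fluxes, which our coarse graining preserves, while the variance depends on the one-way fluxes, which it does not.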
If a coarse-graining iteration preserves the cycle topology (no cycle is lost due to a merged pair of states), the single-cycle fluxes are, in general, still not retained, but the sum of net cycle fluxes that corresponds to a transition flux remains unchanged. Thus the mean velocity is retained in all coarse-graining iterations. The variance is not preserved, because one-way cycle fluxes are not retained by our coarse-graining approach. By contrast, one-way transition fluxes are preserved, but they cannot be expressed in terms of one-way cycle fluxes. The entropy production

P = J_{56} \Delta S_F + J_{34} \Delta S_B = (J_F + J_D)\, \Delta S_F + (J_B + J_D)\, \Delta S_B = \frac{n_F + n_D}{\tau}\, \Delta S_F + \frac{n_B + n_D}{\tau}\, \Delta S_B \qquad (57)

has variance

\sigma_P^2 = (\sigma_{n_F}^2 + \sigma_{n_D}^2) \left( \frac{\Delta S_F}{\tau} \right)^2 + (\sigma_{n_B}^2 + \sigma_{n_D}^2) \left( \frac{\Delta S_B}{\tau} \right)^2 = \frac{J_F^+ + J_F^- + J_D^+ + J_D^-}{\tau} (\Delta S_F)^2 + \frac{J_B^+ + J_B^- + J_D^+ + J_D^-}{\tau} (\Delta S_B)^2 , \qquad (58)

where we expressed the chord fluxes in terms of cycle fluxes as given by equation (36) and used that, for long trajectories, the numbers of completed cycles are independent Gaussian-distributed random variables, equation (53) [35]. To summarize, the mean velocity of the kinesin motor is preserved by the coarse-graining procedure, whereas the mean entropy production and the variances of entropy production and velocity are not. Figure 8 shows the variance of the velocity and of the entropy production as functions of the external load force acting on the motor, for tri-cyclic and uni-cyclic kinesin models.
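The same pattern applies to equations (57) and (58). The sketch below (names ours) evaluates the mean and variance of the entropy production from the one-way cycle fluxes, mirroring equation (58) as written, i.e. treating n_F, n_B and n_D as independent.

```python
def entropy_production_stats(J_plus, J_minus, dS_F, dS_B, tau):
    """Mean (57) and variance (58) of the entropy production.
    J_plus / J_minus: one-way cycle fluxes, dicts with keys 'F', 'B', 'D'."""
    net = {c: J_plus[c] - J_minus[c] for c in ('F', 'B', 'D')}
    mean_P = (net['F'] + net['D']) * dS_F + (net['B'] + net['D']) * dS_B
    var_P = ((J_plus['F'] + J_minus['F'] + J_plus['D'] + J_minus['D']) / tau * dS_F ** 2
             + (J_plus['B'] + J_minus['B'] + J_plus['D'] + J_minus['D']) / tau * dS_B ** 2)
    return mean_P, var_P
```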
[Figure 8: panels (a) and (b) show the variance of the velocity, panels (c) and (d) the variance of the entropy production, for the tri-cyclic 6-state model, the tri-cyclic 4-state model, the uni-cyclic 4-state and 3-state models (forward cycle) and the uni-cyclic 4-state model (backward cycle); panels (b) and (d) show the relative difference between the original model and the coarse-grained models.]
FIG. 8. The variance of the velocity and of the entropy production plotted against the external load force acting on the motor, for different kinesin models. The relative differences between the variances for the coarse-grained models and for the original 6-state model are depicted in the second column; they reveal large approximation errors, at large external load forces, for uni-cyclic models that resemble the forward cycle. The motor has zero mean velocity at F = 7 pN. For no or small external load force, uni-cyclic models approximate the variances of the velocity and the entropy production well, with small deviations, while for larger load forces a coarse-grained model performs better if it has the same network topology as the original model. Both quantities are calculated analytically as described in appendix section 3. The variances were calculated for trajectories of length τ = 1 s.

[1] J.
Schnakenberg, Network theory of microscopic and macroscopic behavior of master equation systems, Rev. Mod. Phys., 571 (1976).
[2] N. van Kampen, Stochastic Processes in Physics and Chemistry, 3rd ed. (Elsevier, Amsterdam, 2007), pp. 96–133.
[3] D. A. McQuarrie, Stochastic approach to chemical kinetics, J. Appl. Probab., 413–478 (1967).
[4] D. T. Gillespie, Stochastic simulation of chemical kinetics, Annu. Rev. Phys. Chem., 35 (2007).
[5] T. Hill, Free Energy Transduction and Biochemical Cycle Kinetics (Springer, New York, 2012).
[6] F. Noé, C. Schütte, E. Vanden-Eijnden, L. Reich, and T. R. Weikl, Constructing the equilibrium ensemble of folding pathways from short off-equilibrium simulations, Proc. Natl. Acad. Sci. U.S.A., 19011 (2009).
[7] M. G. Saunders and G. A. Voth, Coarse-graining methods for computational biology, Annu. Rev. Biophys. (2013).
[8] M. Praprotnik, L. D. Site, and K. Kremer, Multiscale simulation of soft matter: From scale bridging to adaptive resolution, Annu. Rev. Phys. Chem. (2008).
[9] S. J. Weiner, P. A. Kollman, D. A. Case, U. C. Singh, C. Ghio, G. Alagona, S. Profeta, and P. Weiner, A new force field for molecular mechanical simulation of nucleic acids and proteins, J. Am. Chem. Soc., 765 (1984).
[10] M. P. Allen and G. Germano, Expressions for forces and torques in molecular simulations using rigid bodies, Mol. Phys., 3225 (2006).
[11] J.-H. Prinz, H. Wu, M. Sarich, B. Keller, M. Senne, M. Held, J. D. Chodera, C. Schütte, and F. Noé, Markov models of molecular kinetics: Generation and validation, J. Chem. Phys. (2011).
[12] C. Schütte, A. Fischer, W. Huisinga, and P. Deuflhard, A direct approach to conformational dynamics based on hybrid Monte Carlo, J. Comput. Phys., 146 (1999).
[13] S. Klumpp, C. Keller, F. Berger, and R. Lipowsky, Molecular motors: Cooperative phenomena of multiple molecular motors, in Multiscale Modeling in Biomechanics and Mechanobiology, edited by S. De, W. Hwang, and E. Kuhl (Springer, London, 2015), pp. 27–61.
[14] G. Hummer and A. Szabo, Optimal dimensionality reduction of multistate kinetic and Markov-state models, J. Phys. Chem. B (2014).
[15] S. Pigolotti and A. Vulpiani, Coarse graining of master equations with fast and slow states, J. Chem. Phys., 154114 (2008).
[16] B. Altaner and J. Vollmer, Fluctuation-preserving coarse graining for biochemical systems, Phys. Rev. Lett., 228101 (2012).
[17] R. K. P. Zia and B. Schmittmann, A possible classification of nonequilibrium steady states, J. Phys. A, L407 (2006).
[18] S. Liepelt and R. Lipowsky, Kinesin's network of chemomechanical motor cycles, Phys. Rev. Lett., 258102 (2007).
[19] M. E. Fisher and A. B. Kolomeisky, Simple mechanochemistry describes the dynamics of kinesin molecules, Proc. Natl. Acad. Sci. U.S.A., 7748 (2001).
[20] G. Szabo, T. Tomé, and I. Borsos, Probability currents and entropy production in nonequilibrium lattice systems, Phys. Rev. E, 011105 (2010).
[21] J. M. G. Vilar and J. M. Rubí, Thermodynamics "beyond" local equilibrium, Proc. Natl. Acad. Sci. U.S.A., 11081 (2001).
[22] B. Altaner, S. Grosskinsky, S. Herminghaus, L. Katthän, M. Timme, and J. Vollmer, Network representations of nonequilibrium steady states: Cycle decompositions, symmetries, and dominant paths, Phys. Rev. E, 041133 (2012).
[23] A. Gorban, I. Kevrekidis, C. Theodoropoulos, N. Kazantzis, and H. Öttinger, Model Reduction and Coarse-Graining Approaches for Multiscale Phenomena (Springer, Berlin, Heidelberg, 2006).
[24] O. Radulescu, A. N. Gorban, A. Zinovyev, and V. Noel, Reduction of dynamical biochemical reactions networks in computational biology, Front. Genet., 131 (2012).
[25] S. Pressé, K. Ghosh, J. Lee, and K. Dill, Principles of maximum entropy and maximum caliber in statistical physics, Rev. Mod. Phys. (2013).
[26] S. Kullback and R. A. Leibler, On information and sufficiency, Ann. Math. Stat., 79 (1951).
[27] A. Kells, V. Koskin, E. Rosta, and A. Annibale, Correlation functions, mean first passage times, and the Kemeny constant, J. Chem. Phys., 104108 (2020).
[28] G. Woehlke and M. Schliwa, Walking on two heads: the many talents of kinesin, Nat. Rev. Mol. Cell Biol., 50 (2000).
[29] W. Wang, L. Cao, C. Wang, B. Gigant, and M. Knossow, Kinesin, 30 years later: Recent insights from structural studies, Protein Sci., 1047 (2015).
[30] C. Hyeon, S. Klumpp, and J. N. Onuchic, Kinesin's backsteps under mechanical load, Phys. Chem. Chem. Phys., 4899 (2009).
[31] J. L. Lebowitz and H. Spohn, A Gallavotti–Cohen-type symmetry in the large deviation functional for stochastic dynamics, J. Stat. Phys. (1999).
[32] A. Puglisi, S. Pigolotti, L. Rondoni, and A. Vulpiani, Entropy production and coarse graining in Markov processes, J. Stat. Mech. Theor. Exp., P05015 (2010).
[33] K. Visscher, M. J. Schnitzer, and S. M. Block, Single kinesin molecules studied with a molecular force clamp, Nature, 184 (1999).
[34] N. J. Carter and R. A. Cross, Mechanics of the kinesin step, Nature, 308 (2005).
[35] T. L. Hill and Y. D. Chen, Stochastics of cycle completions (fluxes) in biochemical kinetic diagrams, Proc. Natl. Acad. Sci. U.S.A., 1291 (1975).
[36] T. Hill, Free Energy Transduction in Biology: The Steady-State Kinetic and Thermodynamic Formalism (Academic Press, New York, 1977).