[PDF] Long-term behaviours of Autocatalytic Sets

Abstract

Autocatalytic Sets are reaction networks theorised as networks at the basis of life. Their main feature is the ability of spontaneously emerging and self-reproducing. The Reflexively and Food-generated theory provides a formal definition of Autocatalytic Sets in terms of graphs with peculiar topological properties. This formalisation has been proved to be a powerful tool for the study of the chemical networks underlying life, and it was able to identify autocatalytic structures in real metabolic networks. However, the dynamical behaviour of such networks has not been yet complitely clarified. In this work, we present a first attempt to connect the topology of an Autocatalytic Set with its dynamics. For this purpose, we represent Autocatalytic Sets in terms of Chemical Reaction Networks, and we use the Chemical Reaction Network theory to detect motifs in the networks'structure, that allow us to determine the long-term behaviour of the system.

Full PDF

LLong-term behaviours of Autocatalytic sets

Alessandro Ravoni

Department of Mathematics and Physics,University of Roma Tre,Via della Vasca Navale 84, 00146 Rome,Italy

Autocatalytic Sets are reaction networks theorised as networks at the basis of life. Theirmain feature is the ability of spontaneously emerging and self-reproducing. The Reﬂex-ively and Food-generated theory provides a formal deﬁnition of Autocatalytic Sets interms of graphs with peculiar topological properties. This formalisation has been provedto be a powerful tool for the study of the chemical networks underlying life, and it wasable to identify autocatalytic structures in real metabolic networks. However, the dy-namical behaviour of such networks has not been yet complitely clariﬁed. In this work,we present a ﬁrst attempt to connect the topology of an Autocatalytic Set with its dy-namics. For this purpose, we represent Autocatalytic Sets in terms of Chemical ReactionNetworks, and we use the Chemical Reaction Network theory to detect motifs in thenetworks’structure, that allow us to determine the long-term behaviour of the system.

I. INTRODUCTION

Autocatalytic Sets (ASs) are sets of chemical speciesthat mutually catalyse each other’s production throughchemical reactions, starting from a basic food source.Introduced by Kauﬀman (Kauﬀman, 1971, 1986, 1993),the notion of AS captures the idea of a system capableof spontaneously emerging and self-reproducing, usingthe resources provided by the environment. These fea-tures are commonly accepted as basic qualities of earlylife forms (Higgs and Lehman, 2015; Kauﬀman, 1993;Luisi, 2016; Nghe et al., 2015; Rasmussen et al., 2016),and many authors have suggested that the appearanceof ASs can have been a fundamental step in the transi-tion from non-living to living matter, possibly precedingRNA-based systems (Dyson, 1985; Hordijk et al., 2010;Hordijk and Steel, 2018; Kauﬀman, 2011; Vasas et al.,2012; Xavier et al., 2020). However, some properties ofASs are still under debate, such as their eﬀective abilityto form spontaneously (Lifson, 1997; Szathm`ary, 2000)and undergo adaptive evolution (Vasas et al., 2010, 2012),a feature that is generally referred to as evolvability .A formal deﬁnition of ASs has been proposed in thecontext of graph theory (see Section II.B for details) bythe

Reﬂexively Autocatalytic and Food-generated (RAF)theory (Hordijk and Steel, 2004; Hordijk et al., 2011).This allowed to overcome some limits associated with ASsand to make evident their relevance in the context of theorigin of life. In particular, RAF theory has successfullyproved that ASs (or

RAF sets , using the nomenclatureof RAF theory) are highly likely to exist in a catalyticreaction system (Hordijk and Steel, 2004; Hordijk et al.,2011; Hordijk and Steel, 2012a; Mossel and Steel, 2005;Vasas et al., 2012), and RAF sets have been detectedin the metabolic networks of Escherichia Coli (Sousa etal., 2015) and ancient anaerobic autotrophs (Xavier etal., 2020). Moreover, RAF theory shows that RAF sets are often composed by RAF subsets, (Hordijk et al.,2012; Hordijk and Steel, 2014, 2018), and it has beenargued that (some of) these subsets could be the elemen-tary units on which natural selection may act (Hordijk etal., 2012; Hordijk and Steel, 2014; Hordijk et al., 2018b;Hordijk and Steel, 2018; Vasas et al., 2012). First nu-merical results support this scenario, at least for com-partmentalised RAF sets (Hordijk et al., 2018b; Ravoni,2020; Serra and Villani, 2019; Vasas et al., 2012). How-ever, in the same works it has been observed that the con-ditions a graph must satisfy to be a RAF set (see SectionII.B) are not suﬃcient to strictly constrain its dynamics,and the system may not exhibit the features necessaryto be evolvable. More generally, the connection betweenthe topology of a RAF set and its dynamics is not yetfully understood (Filisetti et al, 2011; Hordijk and Steel,2012b; Hordijk et al., 2018a; Serra and Villani, 2019),In this work we make a ﬁrst attempt to solve thisissue, by studying RAF sets through the formalism ofthe

Chemical Reaction Network (CRN) theory (Feinberg,2019). The CRN theory oﬀers powerful tools that allowto connect the topology of a network with its dynamicalbehaviour, for a large class of network dynamics (Ellison,1998; Feinberg, 1987, 1995, 2019; Ji et al., 2011). Notethat the CRN theory usually deals with the study of real-world chemical systems. Starting from the results of theCRN theory, we identify which conditions on the struc-ture of RAF sets are useful for predicting their long-termbehaviours. In particular, we investigate which condi-tions determine a dynamics with desired properties forsystems at the origin of life (e.g., self-maintenance, con-centrations growth, homeostasis), adding new elementsin the scenario previously proposed for the evolvabilityof RAF sets (Hordijk et al., 2018b; Vasas et al., 2012).The paper is organised as follows. In Section II werecall notions of the CRN theory useful for our study, andthe deﬁnition of RAF sets. In Section III we show our a r X i v : . [ q - b i o . M N ] O c t representation of RAF sets in terms of CNRs. In SectionIV we introduce motifs in the structure of RAF sets thatare connected with long-term behaviours of the networks,and we provide an illustrative example of evolvability ofRAF sets by numerically simulate the dynamics. Finally,in Section V we discuss the conclusions. II. BACKGROUNDA. Chemical Reaction Networks and the Deﬁciency ZeroTheorem

In this Section we recall some useful deﬁnitions and in-troduce the

Deﬁciency Zero Theorem (DZT) (Feinberg,1987, 1995). An interested reader can ﬁnd a detaileddiscussion of the CRN theory in Feinberg (2019). Let S denote the set of chemical species , N = { ν i , ν (cid:48) i : i =1 , · · · , r, ν j (cid:54) = ν (cid:48) i } the set of complexes , that is, the mem-bers of the vector space R S> generated by the speciesthat provide the inputs and the outputs of a reaction,and R = { ν i → ν (cid:48) i : i = 1 , · · · , r } the set of reactions ,that is, the set of relations among complexes. A CRNis a triple ( S , N , R ). Note that it is possible to consider open CRNs introducing pseudo reactions such as 0 → ν and ν →

0, where 0 is called the zero complex and it isthe zero vector of R S .We denote with G the directed graph with complexes asnodes and reactions ν i → ν (cid:48) i as directed edges. The con-nected components of G are called the linkage classes ofthe CRN. A strong-linkage class is a strongly connectedlinkage class. A terminal strong-linkage class is a strong-linkage class containing no complex that is a source foran edge pointing to a diﬀerent strong-linkage class. Notethat a complex not involved in any reaction is a termi-nal strong-linkage class. A CRN is said to be weaklyreversible if each linkage class in G is a strong-linkageclass, or, equivalently, if each linkage class is a terminalstrong-linkage class. The reaction vectors for a CRN arethe members of the set { ν (cid:48) i − ν i ∈ R S | ν i → ν (cid:48) i ∈ R} . Thespan generated by the reaction vectors is called the stoi-chiometric subspace of the network, and its dimension isdenote by d . The deﬁciency of a CRN is: δ = n − l − d, (1)where n and l are the number of complexes and linkageclasses, respectively. Note that the dimension d of thespan of two networks with the same complexes and thesame linkage classes is the same. This implies that thetwo networks have the same deﬁciency δ , and that itresults δ ≥ mass-action systems , that areCRNs taken together with an element κ ∈ R R > . Thenumber κ ν → ν (cid:48) is the rate constant for the reaction ν i → ν (cid:48) i ∈ R . Hereafter, we use G to indicate both a CRN, itsassociated directed graph and the resulting mass-action system. The species-formation rate function for a mass-action system is the function f ( · , κ ) : R S≥ → R S deﬁnedby: f ( c, κ ) = (cid:88) ν → ν (cid:48) κ ν → ν (cid:48) c ν ( ν (cid:48) − ν ) , (2)where c is the (time-dependent) vector of species concen-tration and c ν is deﬁned as: c ν := (cid:89) s ∈S c ( s ) ν ( s ) . (3)Here c ( s ) is the concentration of species s and ν ( s ) isthe s -th component of the vector ν , that is, the stoi-chiometric coeﬃcient of species s in the complex ν . Infact, for each reaction ν i → ν (cid:48) i ∈ R , the component ν i ( s )indicates the number of molecules of the s -th chemicalspecies consumed by the i -th reaction, while the compo-nent ν (cid:48) i ( s ) indicates the number of molecules of the s -thspecies produced by the i -th reaction. The dynamics of G is described by the equation: f ( c, κ ) = ˙ c, (4)where the point indicates the time derivative. Eq. 4 cor-respond to a set of ordinary diﬀerential equations (ODEs)that describe the time evolution of concentration of eachspecies in S . Indeed, it is natural to deﬁne a positiveequilibrium as an element c ∈ R S > such that f ( c, κ ) = 0.A positive equilibrium is then a steady state of the mass-action system characterized by a positive concentrationand a zero net production rate for every species of theCRN. Note that a necessary condition for the existence ofa positive equilibrium for a mass-action system is the ex-istence of a positive vector α ∈ R R > such that (Feinberg,1995): (cid:88) ν → ν (cid:48) α ν → ν (cid:48) ( ν (cid:48) − ν ) = 0 . (5)If exists α such that Eq. 5 is satisﬁed, we say that the net-work’s reaction vectors are positively dependent. Notethat weak reversibility is a suﬃcient (but not neces-sary) condition for the existence of a positive equilibrium(Boros, 2019; Feinberg, 1995).The DZT provided a connection between the topologyof a network, expressed in terms of its deﬁciency, and theexistence and the uniqueness of steady states for the as-sociated mass-action system (Feinberg, 1987, 1995). Infact, the DZT does much more, and its results can beapplied to a broad class of kinetics systems. However,in this work we limit our attention on a speciﬁc result(again, see Feinberg (2019) for a comprehensive illustra-tion of the theorem and its application). In particular,we will use the following statement: Theorem 1 (The deﬁciency zero theorem)

Let ( S , N , R ) be a reaction network of deﬁciency zero.

1. If the network is weakly reversible, the resultingmass-action system admits precisely one positiveequilibrium for each initial condition, no matterwhat (positive) rate constants are assigned to thereactions .2. If the network is not weakly reversible, there canbe no assignment of (positive) rate constants suchthat the resulting mass-action system admits the ex-istence of a positive equilibrium (the network’s re-action vectors are not positively dependent).

B. Reﬂexively Autocatalytic and Food-generated Sets

Within the framework of RAF theory (Hordijk andSteel, 2004; Hordijk et al., 2011), a network of interactingspecies is represented by a

Catalytic Reaction Systems (CRS) (Hordijk and Steel, 2004, 2017; Lohn et al., 1998),that is a tuple (

S, R, C, F ) such that:– S is the set of chemical species;– R is the set of reactions, ρ → π , where ρ, π ⊂ S arethe reactants and products of a reaction, respec-tively;– C is the catalysis set, that is, a set of pairs { ( s, r ) : s ∈ S, r ∈ R } indicating the species s as the cata-lyst of reaction r ;– F ⊂ S is the food set, that is a distinguished subsetof S such that each species s ∈ F is assumed to beavailable from the environment.Let R (cid:48) be a subset of R . The closure cl R (cid:48) ( F ) is de-ﬁned to be the (unique) minimal subset of S that con-tains F together with all species that can be producedfrom F by repeated applications of reactions in R (cid:48) . Notethat cl R (cid:48) ( F ) is well deﬁned and ﬁnite (Hordijk and Steel,2004). In RAF theory, the notion of ASs in representedby the RAF sets. Given a CRS ( S, R, C, F ), a RAF setis a set of reactions R (cid:48) ⊆ R (and associated chemicalspecies) that satisﬁes the following properties:– Reﬂexively autocatalytic: for each reaction r ∈ R (cid:48) there exists at least one species s ∈ cl R (cid:48) ( F ) suchthat ( s, r ) ∈ C ;– F -generated: for each reaction r : ρ → π, r ∈ R (cid:48) and for each species s ∈ S such that s ∈ ρ , it is s ∈ cl R (cid:48) ( F ).Thus, a RAF set is a set of reactions able to catalyticallyproduce all its reactants starting from a suitable food set.It is also possible to deﬁne the closure of a set of reactions,introducing the notion of closed RAF set (Smith et al.,2014). Given a CRS ( S, R, C, F ), a subset R (cid:48) of R is saidto be a closed RAF set if: f1+f2 (x2) x1(x4)x2+x3(x1)x4 (x3)(x4)xy ( y ) y1 ( y ) y2 (x1) y3 (y2)y1+y2 FIG. 1: RAF set example. Complexes are representedby black letters (the letter f indicates food species),reactions by arrows. Blue letters in brackets indicatecatalysts. The entire reaction network is a RAF set.Two closed RAF sets are present within the RAF set,namely: R (1) = { f1+f2 (x2) −−−→ x1 , x1 (x4) −−−→ x2+x3 , x2+x3 (x1) −−− (cid:42)(cid:41) −−− x4 , x4 (x3) −−−→ x1 , x4 (x4) −−− (cid:42)(cid:41) −−− xy } and R (2) = { f1+f2 (y3) −−−→ y1 , f1+f2 (y1) −−−→ y2 , y2+y3 (y2) −−− (cid:42)(cid:41) −−− y3 } .– R (cid:48) is a RAF set;– for each r such that all its reactants and at least onecatalyst are either part of the set F or are producedby a reaction from R (cid:48) , it is r ∈ R (cid:48) .Fig. 1 shows an example of a RAF set and its constituentclosed RAF sets.In Smith et al. (2014), authors claimed that “infor-mally, a closed RAF captures the idea that any reactionthat can occur, will occur”. It has been shown that closedRAF sets have interesting dynamical properties and theymay be the relevant units for the potential evolvabilityof ASs (Hordijk et al., 2018b). However, the structureof a closed RAF set seems unable to guarantee its dy-namical stability, and monotonic growth of all its associ-ated species (Dittrich and Di Fenizio, 2007; Ravoni, 2020;Serra and Villani, 2019). In Section IV we introduce fur-ther constraints on the topology of closed RAF sets thatallow us to predict their dynamics and to identify net-works with peculiar dynamical properties. III. REFLEXIVELY AUTOCATALYTIC ANDFOOD-GENERATED CHEMICAL REACTIONNETWORKS

In this Section we study RAF sets in the context ofCRN theory. Our ﬁrst step is to represent F-generatedsets in terms of CRNs (Section III.A). In particular, wefocus our attention on zero deﬁciency CRNs (this restric-tion is dropped when we introduce catalysis, see SectionIII.B). We open the system to the inﬂow of matter byintroducing pseudo reactions that produce species of thefood set. We consider both reversible and irreversiblereactions and allow only elementary reactions (i.e., reac-tions involving one or two chemical species as reactants)to occur .Finally, in Section III.B we introduce the notion ofcatalysis within the framework of the CRN theory. InRAF theory, a catalyst is a species able, with its pres-ence, to facilitate (i.e. accelerate) the proceeding of areaction (Hordijk and Steel, 2004). However, in real sys-tems, catalysed reactions usually involve one or more in-termediate complexes and can exhibit various dynami-cal behaviours (Michaelis and Menten, 1913; Ricard andCornish-Bowden, 1987).We deﬁne catalysis within CRNs by introducing an en-zymatic mechanism such that a catalysed reactions con-sumes a catalyst x and a substrate s to form an interme-diate complex e , which in turn releases product speciesand the original catalyst x : s + x κ −− (cid:42)(cid:41) −−− κ − e κ −− (cid:42)(cid:41) −−− κ − p + x (6)where κ − = 0 for irreversible reactions. Note thatthis modeling is the common Michaelis-Menten scheme (Michaelis and Menten, 1913), extensively used to de-scribe enzymatic processes, and that this modelisationhas already been used to simulate RAF sets dynam-ics (Serra and Villani, 2019). Furthermore, we assumethe validity of the total quasi steady-state approxima-tion (tQSSA) (Briggs and Haldane, 1925; Tzafriri, 2003;Tzafriri and Edelman, 2004), that is, the intermediatecomplex e disappears virtually as fast as it is formed(due to its highly reactivity). In particular, by imposing κ − + κ κ >> , (7) κ − + κ κ >> κ κ , (8) κ − κ << , (9)it results (Tzafriri, 2003; Tzafriri and Edelman, 2004):˙ c ( e ) ≈ . (10)This assumption allows us to provide a relation betweenthe topology of the CRN without catalysis and the cor-responding dynamics including enzymatic processes (Ap-pendix A). Reactions involving three or more molecules as reactants arerare; they are often introduced as an approximation of sequencesof elementary reactions (Gillespie, 2007). By excluding non-elementary reactions, the model we propose is suitable for beingstudied through stochastic simulations (Gillespie, 1976, 1977),without the need to introduce further assumptions.

A. Food-generated Chemical Reaction Networks

Let F be a set of chemostatted chemical species, that is,a set of species whose concentration is controlled by theenvironment. Given a CRN G = ( S , N , R ), we consider F ⊂ S to be the food set of G . As in RAF theory, we saythat G = ( S , N , R ) is a F-generated CRN if, for a certainset F ⊂ S and for each s ∈ S , there exists a sequence ofreactions ( ν (cid:48) − ν ) , ( ν (cid:48) − ν ) , . . . , ( ν (cid:48) i − ν i ) such that:– ν (cid:48) i ( s ) (cid:54) = 0;– ν ( s (cid:48) ) (cid:54) = 0 ⇔ s (cid:48) ∈ F ;– ∀ s (cid:48) ∈ S such that ν j ( s (cid:48) ) (cid:54) = 0 , j = 1 , . . . , i exists k < j such that ν (cid:48) k ( s (cid:48) ) (cid:54) = 0.Let G = ( S , N , R ) be a F-generated CRN. We say that G is minimal if the following condition holds: ∀ ( ν, µ ) ∈ N ∃ s ∈ S| µ (cid:54) = ν ⇒ ν ( s ) (cid:54) = 0 , µ ( s ) = 0 , (11)that is, a F-generated CRN G is minimal if all its com-plexes contain at least one species that does not appear inany other complex of the network. It can be shown that aminimal F-generated CRN is a zero deﬁciency network.In fact, let G (cid:48) be a minimal F-generated CRN havingjust one linkage class that connects all its n complexesthrough exactly n − ν (cid:48) − ν ) are linear independent.It follows that d G (cid:48) = n −

1, and it is δ G (cid:48) = 0. Let G be aCRN obtained from G (cid:48) by removing exactly m reactions.Note that G is again minimal. It results that l G = 1 + m , d G = n − − m , and δ G = 0. From the arbitrary choice of m and the equivalence of the deﬁciency for CRNs havingthe same complexes and the same linkage classes (Fein-berg, 1995), it follows that each CRN satisfying Eq. 11is a deﬁciency zero network.In order to model the chemostatting of species in F , weintroduce pseudo reactions for each species f ∈ F suchthat: ι f −− (cid:42)(cid:41) −− o f f (12)where ι f and o f are the constant rates for reactions0 → f and f →

0, respectively. We denote with G F =( S F , N F , R F ) the CRN such that S F = F , N F = F ∪ R F is the set ofthe pseudo reactions. Note that for each pair of reac-tion vectors ( ν (cid:48) i − , ( ν (cid:48) j − ∈ R F , for i (cid:54) = j , it is( ν (cid:48) i − · ( ν (cid:48) j −

0) = 0, where · is the scalar product. Thus,it results d G F = | F | and δ G F = 0, that is, G F is a (weakly)reversible zero deﬁciency CRN. The DZT states that G F admits exactly one steady state. In particular, for each f ∈ F , the positive equilibrium is given by c ( f ) = ι f /o f .Therefore, c ( f ) is the concentration that the species f would reach if only only pseudo reactions are included.Given a CRN G (cid:48) = ( S (cid:48) , N (cid:48) , R (cid:48) ) and a food set F ⊂ S (cid:48) ,the equivalent open CRN G of G (cid:48) is obtained from G (cid:48) byadding G F , such that it is G = ( S , N , R ), where S = S (cid:48) , N = N (cid:48) ∪ N F and R = R (cid:48) ∪ R F . Note that, in gen-eral, the addition of G F to G (cid:48) changes the topology of G (cid:48) ;for instance, G F can connect separated linkage classes of G (cid:48) . Moreover, the introduction of various chemostattedchemical species aﬀects the dynamical behaviour of thenetwork, both by breaking conservation laws and withthe emergence of irreversible cycles (Polettini and Espos-ito, 2014). In this work, we focus our attention on CRNshaving no complexes including both food and non-foodspecies, and we do not allow food species to be catalysts.Therefore, for each complexes ν ∈ N and species s ∈ S it is: ν ( f ) (cid:54) = 0 , ν ( s ) (cid:54) = 0 ⇒ s ∈ F (13)( s, ν → ν (cid:48) ) ∈ C ⇒ s / ∈ F, (14)where f denotes a food species. Authors are currentlyinvestigating the impact of chemostatting on network dy-namics if Eq. 13 and Eq. 14 are not satisﬁed. B. Addition of catalysis

In this Section we introduce a procedure to includecatalysis within a CRN. In RAF theory, a catalysis isa pair ( x, r ) indicating that species x catalyses reac-tion r . As previously indicated, in our model, we adda catalysis as a substrate-catalyst interaction that pro-duces a substrate-catalyst complex . In particular, for amonomolecular reaction s → p catalysed by a molecule x , we assume that the species s reacts with the cata-lyst x to form the substrate-catalyst complex e , whichcan perform the inverse reaction or release the product p and the catalyst x . Similarly, for a bimolecular reaction s + s → p catalysed by x , we assume that one reactant,says s , is the substrate that reacts with x to produce e ∗ ,while the other reactant, s , reacts with e ∗ producing e ,that eventually releases x and products p . Fig. 2 sketchedthe procedure.Note that the substrate-catalyst complexes e, e ∗ , e ∗∗ (Fig. 2) are considered as diﬀerent complexes, that is:( e i · e j ) , ( e ∗ i · e ∗ j ) , ( e ∗∗ i · e ∗∗ j ) (cid:54) = 0 ⇔ i = j (15) e i · e ∗ j = e i · e ∗∗ k = e ∗ j · e ∗∗ k = 0 ∀ i, j, k. (16)Moreover, it is important to underline that original spon-taneous reactions are still present in the network with ad-dition of catalysis. Given a CRN G and a set of catalysis C , we call G C the CRN obtained from G by adding reac-tions according to the scheme of Fig. 2 for each catalysisin C . Therefore, G C represents the “real” mass-actionsystem underlying a CRN to which a set of catalysis isassociated. As in RAF theory, given a F-generated CRN G =( S , N , R ) and a set of catalysis C = { ( s, ν → ν (cid:48) ) : s ∈S , ν → ν (cid:48) ∈ R} , we say that G is a RAF CRN if, for eachreaction ν → ν (cid:48) ∈ R , there exists at least one species x ∈ S such that ( x, ν → ν (cid:48) ) ∈ C ; we say that a RAFCRN is a closed RAF CRN if, for each reaction ν → ν (cid:48) such that all its reactants and at least one catalyst areeither part of the set F or are produced by a reactionfrom R , it ν → ν (cid:48) ∈ R ; ﬁnally, we say that a closedRAF CRN is a minimal closed RAF CRN if it does notcontain any other closed RAF CRNs. IV. LONG-TERM BEHAVIOUR

In this Section, we identify motifs in the top logy of aminimal closed RAF CRN G that allow to predict the dy-namics of its associated network G C (note that, althoughnot speciﬁed, we are considering open CRNs). In gen-eral, adding catalysis increases the deﬁciency of a CRN(Polettini et al., 2015), and higher deﬁciency theories arenecessary in order to study the steady states of the net-work (Ellison, 1998; Feinberg, 1995).However, the topology of the zero deﬁciency CRN G provides some information on the dynamics of G C , atleast if conditions (7), (8) and (9) hold (see Appendix Afor more details). In fact, the relation among com-plexes provided by G does not change in G C : the terms α ν → ν (cid:48) ( ν − ν (cid:48) ) in the sum of Eq. 4 for the two networks,diﬀer only in the values of α ν → ν (cid:48) , with α G C > α G (Ap-pendix A). This implies, for instance, that if G is notweakly reversible, thanks to the DZT we can say thatthe reaction vectors of G C are not positively dependent;on the other hand, if G is weakly reversible, G C admitsthe existence of a positive equilibrium. Nevertheless, it isworth to underline that G and G C are not equivalent. Forinstance, consider the CRN shown in Fig. 3a. The net-work without catalysis G is a weakly reversible zero deﬁ-ciency network, and it admits exactly one positive equi-librium for each choice of initial condition. Conversely, byanalysing the the associated CRN G C through the CRNToolbox (Ji et al., 2011), we ﬁnd that G C admits the ex-istence of multiple stationary solutions starting from thesame initial conditions, for an appropriate choice of theconstant rates κ .We use these results in order to predict the dynamics ofthe mass-action system G C associated with a RAF CRN G . In particular, let G be a minimal closed RAF CRN,and let G − be the CRN obtained from G by removingall the reactions ν → ν (cid:48) , where the complex ν contains(only) food species. We deﬁne:1. Fully connected (FC) motif: G is a FC motif if it isa weakly reversible CRN;2. Non connected (NC) motif: G is a NC motif if both G and G − are not weakly reversible CRNs; s1 κ κ − p s1+x κ κ − e κ κ − p+xs1+s2 κ κ − p s2+x κ κ − e*e*+s1 κ κ − e κ κ − p+xs1+s2 κ κ − p1+p2 s2+x κ κ − e*e*+s1 κ κ − e κ κ − e**+p1e** κ − κ p2+x FIG. 2: Catalysis scheme. The dashed gray arrows associate spontaneous reactions (terms on the left) with thecorresponding catalysed reactions (terms on the right). Catalysis associated with irreversible reactions can beobtained from the ﬁrst two catalysis of the scheme by setting κ − and κ − equal to zero. Note that for irreversiblereactions, p is in turn a complex, i.e., it can be constituted by one or more chemical species. The associated ODEsare shown in Appendix A.3. Core connected (CC) motif: G is a CC motif if itis not a weakly reversible CRN and G − is a weaklyreversible CRN.Fig. 3 shows examples of the three motifs. They ex-hibit dynamics with diﬀerent characteristics (Fig. 4a).In particular, we focus our attention on three dynamicalproperties of the concentrations of species that constitutea motif:1. homeostasis : the ratio among concentrations of allthe species is constant over time ;2. continuous growth : the concentrations of all thespecies increase in time;3. self-conservation : if the inﬂow of matter towardsthe CRN is interrupted, the concentrations of allthe species do not decrease in time .Note that these are desired features for a system at thebasis of the origin of life. For instance, homeostasis isconsidered to be a fundamental property connected withself-maintenance of an organism; a growth of species con-centration is necessary for self-reproduction, and self-conservation prevents the decay of the system in caseof insuﬃcient resources (Luisi, 2016; Varela, 2000).We argue that FC motif exhibits homeostasis, NC mo-tif exhibits none of the properties and CC motif exhibitscontinuous growth and self-conservation. It is importantto underline that we are referring to long-term dynamicsof the networks and not to their transient behaviour.In fact, FC motif admits the existence of a posi-tive equilibrium for the concentration of all its chemicalspecies, meaning that, for t → ∞ , the ratio c ( s ) /c ( s (cid:48) ) isconstant for each pair s, s (cid:48) ∈ S (Fig. 4a). However, ifthe inﬂow rate ι f is set equal to zero for each f ∈ F , theresulting network is no more weakly-reversible. In par-ticular, there is a terminal strong-linkage class (i.e., the zero complex) that is an absorbing linkage class , mean-ing that, once the ﬂow of matter falls in it, it can nolonger leave it. Therefore, the matter consumed by re-actions involving the species of the CRN is not replaced,the species concentrations decrease and the network isnot self-conservative (Fig. 4a).On the other hand, NC motif contains an absorbinglinkage class involving a strict subset S (cid:48) of S , both in caseof ι f > ι f = 0. As a result, species in S (cid:48) increasetheir concentrations at the expense of other species s / ∈S (cid:48) . Thus, even if a subset of S can experience continuousgrowth, the whole NC motif does not satisfy any of theproperties listed above (Fig. 4a).By deﬁnition, instead, all the non-food species of a CCmotif belong to a weakly reversible (sub)network that isabsorbing for the inﬂowing matter coming from the en-vironment. This implies that, if ι f >

0, each non-foodspecies is reached by a positive ﬂow of matter (contin-uous growth); on the other hand, if ι f = 0, the matteris redistributed among non-food species and does not es-cape from the CRN (self-conservation). Note that a CCmotif could exhibit homeostasis, depending on the ﬁnedetails of its topology. A. An illustrative example of evolvability of RAF sets

In this Section we present illustrative examples on howthe detected motifs (Section IV) impact the long-termdynamics of a (larger) RAF CRN. We start our studyby simulating the dynamics of a RAF CRN constitutedby three diﬀerent closed RAF CRNs that share the samefood set F . The closed RAF CRNs correspond to thediﬀerent motifs we have introduced in Fig. 3.Fig. 4b shows the dynamics of this particular net-work: as expected, the resulting dynamics is substan-tially equivalent to that shown by the non composed mo- f1 0 f2f1+f2 (x2) x1(x4)x2+x3(x1)x4(x3) (a) f1 0 f2f1+f2 (x2) x1(x4)x2+x3(x1)x4 (b) f1 0 f2f1+f2 (x2) x1(x4)x2+x3(x1)x4 (x3) (c) f1 0 f2y1 (y3) f1+f2 (y1) y2y1+y2 (y2) y3 (d) f1 0 f2y1 (y3) f1+f2 (y1) y2y1+y2 (y2) y3 (y3) y4 (e) (f) FIG. 3: Examples of motifs. Complexes are represented by black letters (the letter f indicates food species),reactions by arrows. Blue letters in brackets indicate catalysts. (a) FC motif: the CRN is weakly reversible. (b,e)NC motif: the CRN is not weakly reversible, and there is not a weakly reversible (sub)network that includes all thenon-food species. (c,d,f) CC motif: the CRN is not weakly reversible, and there is a weakly reversible (sub)networkthat includes all the non-food species. Note that all the motifs are closed RAF CRNs.tifs (Fig. 4a, ι f > x4 (x1) −−−→ x5 to theFC motif of Fig. 3a changes it into a NC motif. At thesame time, adding the reaction x4 (x4) −−−→ x1 to the NC mo-tif of Fig. 3b changes it into a CC motif. The dynamics ofthe overall network, in fact, is the result of the evolutionof a FC motif and two CC motifs (Fig. 4c). This is anexample of a general case: it is always possible to passfrom one motif to another by adding (or removing) oneor more reactions.This last point has important consequences for theevolvability of RAF sets. In fact, we can interpret thecharacteristic dynamical behaviour of a motif as its phe-notype, and the (random) addition of a new reaction asa mutation. In this framework, the examples of Fig. 4band Fig. 4c show that closed RAF CRNs can acquire(or lose) dynamical properties that can be, potentially,picked by natural selection. As further evidence to sup-port this observation, in Fig. 5 we show diﬀerent frames ofan hypothetical evolution of a RAF CRN in which muta-tions occur. In particular, the starting network is formedby two closed RAF CRNs that share the same food set F = { f , f } . The closed RAF CRNs are indicated with letters X (CC motif shown in Fig. 3c) and Y (CC motifshown in Fig. 3d), respectively. Initially, both the closedRAF CRNs show a continuous growth of species concen-trations, with similar growing rates (Fig. 5a). However,the addition of a new reaction in the network allow Y to drawn resources from a supplementary source com-pared to X , represented by the food elements f f Y , which is able to grow faster than Y (Fig. 5b). At the same time, as previously observed, anunfavorable mutation (Fig. 3e) can decrease the eﬃciencyof a network’s self-reproduction (Fig. 5c). Finally, notethat a mutation introducing a connection between X and Y can have major eﬀects on their dynamics. In partic-ular, consider the RAF CRN shown in Fig. 1: it can bebuilt starting from the networks of Fig. 3c and Fig. 3d,by adding the production of the species xy . This latterexpansion of the network can be interpreted, for instance,as the result of a mutation that allows species x xy in a reversible way, and species x xy start-ing from species y

3. Therefore, thanks to a favorablemutation, one motif can steal resources from the other.The dynamics of the resulting RAF CRN is shown inFig. 5d: due to the mutation, the closed RAF CRN X (cid:48) (the symbol X (cid:48) indicates the network X to which thereaction x4 (x4) −−−→ xy is added) is an absorbing linkageclass in which the ﬂow of matter of the net Y falls; con-sequently, network X still shows a continuous growth inspecies concentrations, while Y dynamically disappears.   { F CC ≠ F NC ≠ F FC } inflow → CCNCFC (a)   { F CC = F NC = F FC } CCNCFC (b)   { F CC = F NC = F FC } CCNCFC (c)

FIG. 4: Dynamics of motifs. On the x -axis: time steps t . On the y -axis: species concentration c . (a) Dynamics ofthe three isolated motifs introduced in Fig. 3a, Fig. 3b and Fig. 3c. For t > ι f is set equal to 0for each pseudo reaction ι f −−→ f . (b) Dynamics of the three motifs introduced in Fig. 3a, Fig. 3b and Fig. 3c, thatdraw from the same food set F = { f , f } . (c) Resulting dynamics after mutations that change the FC motif ofFig. 3a in a NC motif, and the NC motif of Fig. 3b in a CC motif. The plotted curves are obtained by numericallysolving the ODEs associated with the CRNs (Eq. 4), setting ι f = 10, o f = 10 − , κ = κ − = 10 − , κ = 10, κ − = 10 , κ = 10 , κ − = 10 − , and an initial concentration of c = 10 − for all the species.In this Section, we have shown how the presence of mo-tifs FC, NC and CC within the structure of a RAF CRNaﬀects the dynamics of the system. Moreover, the exam-ples we have introduced suggest that closed RAF CRNscould be the elementary units that experience adaptiveevolution. Regarding this issue, it is important to under-line two points. First, note that, in this work, we are deal-ing with a deterministic model. This approximation doesnot allow to directly simulate neither the spontaneousappearance of novelty, nor the probabilistic eﬀects, thatare characteristics of systems with few elements. How-ever, recent numerical simulations have shown how thesestochastic processes can determine both a dependence ondiﬀerent initial conditions (for instance, deriving from aninhomogeneous distribution of resources, or from randomevents, such as the division of a protocell), and an ac-tual competition among closed RAF CRNs (Hordijk etal., 2018b; Ravoni, 2020; Serra and Villani, 2019). There-fore, the addition of stochastic elements to the RAF CRNmodel introduced in this work should reproduce the sce-nario proposed in (Hordijk et al., 2018b; Ravoni, 2020;Vasas et al., 2012) for the evolvability of a population of compartmentalised RAF sets.Moreover, it is very important to note that our resultsshow that a RAF CRN does not always fully emerges inthe long-term dynamics, and that an actual competitioncan therefore occur also among subsets of the same RAFnetwork.We are currently investigating the impact of some ofthe assumptions made in this work, such as the tQSSA orthe limited role of the food set in the network reactions,and will discuss all these aspects in an upcoming work. V. CONCLUSIONS

In this work, we study long-term behaviours of RAFsets, particularly focusing on those systems that are con-sidered to be the building blocks from which life emerged.For this purpose, we represent RAF sets in terms ofCRNs, and we use CRN theory to predict the dynam-ics of the networks starting from their topology. Ourmain result is the identiﬁcation of additional topologicalconditions compared to the one introduced by the RAF  × × ×  { F X = F Y } XY (a)   { F X ⊂ F Y } XY (b)   { F X = F Y } XY (c)   { F X = F Y } XYxy (d)

FIG. 5: Evolution of a RAF CRN induced by mutations. On the x -axis: time steps t . On the y -axis: speciesconcentration c . (a) Dynamics of a RAF CRN constituted by the CC motif of Fig. 3c ( X ) and the CC motif ofFig. 3d ( Y ), that draw from the same food set F = { f , f } . (b) A favorable mutation allows Y to draws resourcesfrom the expanded set { f , f , f , f } (Fig. 3f), resulting in a higher self-production rate. (c) A unfavorablemutation changes Y in a NC motif (Fig. 3e), preventing from its continuous growth. (d) A mutation allows X tosteal matter from Y , causing the dynamical disappearance of Y (the CRN associated with this dynamics is shown inFig. 1). The plotted curves are obtained by numerically solving the ODEs associated with the CRNs (Eq. 4), setting ι f = 10, o f = 10 − , κ = κ − = 10 − , κ = 10, κ − = 10 , κ = 10 , κ − = 10 − , and an initial concentration of c = 10 − for all the species.theory, which allow us to predict the dynamics of the sys-tem and to identify networks with interesting dynamicalproperties. To our knowledge, this is the ﬁrst time thatthe actual dynamics of RAF sets have been successfullyconnected with their topology.Furthermore, starting from their structure, we displaythat not all the subsets of a RAF network eventually dy-namically emerge, contrary to what has been observedin previous works, where too simple CRNs were consid-ered (Hordijk et al., 2018b; Ravoni, 2020). These resultsshow that the increase of complexity of a RAF set sig-niﬁcantly aﬀects the emergent dynamics, supporting thepotential evolvability of RAF sets and their importancein the transition from inanimate to living matter.Thermodynamic aspects of these systems are going tobe investigated in a forthcoming paper, allowing us toformalise a consistent theory of Autocatalytic networks. Appendix A: Appendix

Let G = ( S , N , R ) be a CRN, R + and R − be the setsof reversible and irreversible reactions (such that R = R + ∪ R − ), and i, j denote reactions ν → ν (cid:48) ∈ R . Thefunction f ( c, κ ) for G can be written as: f G ( c, κ ) = (cid:88) i ∈R + κ i c ν i ( ν (cid:48) i − ν i )+ (cid:88) j ∈R − κ j c ν j ( ν (cid:48) j − ν j ) . (A1)Similarly, for G c it is: f G C ( c, κ ) = (cid:88) i ∈R + [ κ i c ν i ( ν (cid:48) i − ν i ) + P i ]+ (A2)+ (cid:88) j ∈R − [ κ j c ν j ( ν (cid:48) j − ν j ) + I j ] , where P i and I j denote the additional terms introducedby adding catalysis according to the scheme of Fig. 2for reversible and irreversible reactions, respectively. Foreach pair ( i, i (cid:48) ) of reversible reactions such that i : ν → ν (cid:48) i (cid:48) : ν (cid:48) → ν , the term P ( i,i (cid:48) ) can be written as: P ( i,i (cid:48) ) = | s | [ κ c ( s c ( x ) − κ − c ( e ∗ )] (A3) × ( e ∗ − s − x ) + { κ c ( ν − s [ | s | c ( e ∗ )+(1 − | s | ) c ( x )] − κ − c ( e ) } [ e − ν + s − | s | e ∗ − (1 − | s | ) x ]+ { κ c ( e ) − κ − c ( ν (cid:48) − p [ | p | c ( e ∗∗ )+(1 − | p | ) c ( x )] } [ ν (cid:48) − p | p | e ∗∗ +(1 − | p | ) x − e ] + | p | [ κ c ( p c ( x ) − κ − c ( e ∗∗ )]( e ∗∗ − p − x ) . Here, s p s p

2, respectively). Note that due to the above deﬁnition, P ( i,i (cid:48) ) describes the terms to be added for all the catalysisshown in Fig. 2, depending on whether the relations s p j : ν → ν (cid:48) theterm I j can be written as: I ( j ) = | s | [ κ c ( s c ( x ) − κ − c ( e ∗ )] (A4) × ( e ∗ − s − x ) + { κ c ( ν − s [ | s | c ( e ∗ )+(1 − | s | ) c ( x )] − κ − c ( e ) } [ e − ν + s − | s | e ∗ − (1 − | s | ) x ]+[ κ c ( e )]( ν (cid:48) + x − e ) . Note that I j can be obtained from P ( j,j (cid:48) ) assuming κ − =0 and p c ( e ) = { κ c ( ν − s [ | s | c ( e ∗ ) + (1 − | s | ) c ( x )] (A5) − κ − c ( e ) + κ c ( e ) − κ − c ( ν (cid:48) − p [ | p | c ( e ∗∗ )+(1 − | p | ) c ( x )] } , ˙ c ( e ∗ ) = | s | [ κ c ( s c ( x ) − κ − c ( e ∗ ) (A6) − κ c ( ν − s c ( e ∗ ) + κ − c ( e )] , ˙ c ( e ∗∗ ) = | p | [ κ c ( p c ( x ) − κ − c ( e ∗∗ ) (A7) − κ − c ( ν (cid:48) − p c ( e ∗∗ ) + κ c ( e )] . By imposing ˙ e = 0, ˙ e ∗ = 0 and ˙ e ∗∗ = 0 (tQSSA), itfollows: κ c ( e ) − κ − c ( ν (cid:48) − p c ( e ∗∗ ) = κ c ( ν − s c ( e ∗ ) (A8) − κ − c ( e ) . Again, equivalent expressions for irreversible reactionsare obtained assuming κ − = 0, p s f G it is: f G = (cid:88) ν → ν (cid:48) α ν → ν (cid:48) ( ν (cid:48) − ν ) , (A9)then f G C can be written as: f G C = (cid:88) ν → ν (cid:48) α (cid:48) ν → ν (cid:48) ( ν (cid:48) − ν ) , (A10) where: α (cid:48) ν → ν (cid:48) = α ν → ν (cid:48) + ∆ ν → ν (cid:48) , (A11)with ∆ ν → ν (cid:48) ≥

0. For instance, if κ − (cid:54) = 0, p (cid:54) = 0 and s (cid:54) = 0 it is: ∆ ν → ν (cid:48) = κ c ( e ) (A12)for a forward reaction (that is, a reaction with a “spon-taneous” rate constant κ ) and:∆ ν → ν (cid:48) = κ − c ( ν (cid:48) − p c ( e ∗∗ ) (A13)for a reverse reaction (that is, a reaction with a “spon-taneous” rate constant κ − ). This implies, in particular,that G C admits a positive equilibrium only if G is weaklyreversible. An equivalent result holds if κ − = 0, p s REFERENCES

Boros, B., 2019. Existence of positive steady states for weaklyreversible mass action sys-tems, SIAM J. Math. Anal., 51,435–449.Briggs, G.E., Haldane, J.B.S., 1925. A note on the kinetics ofenzyme action. Biochem. J. 19, 338–339.Dittrich, P., Di Fenizio, P., 2007. Chemical organisation the-ory. Bull. Math. Biol., 69(4):1199–1231.Dyson, F.J., 1985. Origins of Life, Cambridge: CambridgeUniversity PressEllison, P., 1998. The Advanced Deﬁciency Algorithm andits applications to mechanism discrimination, Ph.D. the-sis, Department of Chemical Engineering, University ofRochester.Feinberg, M., 1987. Chemical reaction network structureandthe stability of complex isothermal reactors i. the deﬁ-ciency zero and deﬁciency one theorems. Chem. Engr. Sci.,42(10):2229–2268Feinberg, M., 1995. The existence and uniqueness of steadystates for a class of chemical reaction networks. Archive forRational Mechanics and Analysis, 132(4), 311-370.Feinberg, M., 2019. Foundations of Chemical Reaction Net-work Theory, vol 202 (Springer).Filisetti, A., Graudenzi, A., Serra, R., Villani, M., De Lu-crezia, D., Fuchslin, R.M., Kauﬀman, S.A., Packard, N.,Poli, I., 2011. A stochastic model of the emergence of au-tocatalytic cycles. J Syst Chem 2, 2.Gillespie, D. T., 1976. A general method for numerically sim-ulating the stochastic time evolution of coupled chemicalreactions. Journal of computational physics, 22(4), 403-434.Gillespie, D. T., 1977. Exact stochastic simulation of cou-pled chemical reactions. The journal of physical chemistry,81(25), 2340-2361.Gillespie, D. T., 2007. Stochastic simulation of chemical ki-netics. Annu. Rev. Phys. Chem., 58, 35-55.Higgs, P. G., Lehman, N., 2015. The RNA World: molecularcooperation at the origins of life. Nature Reviews Genetics,16(1), 7-17.Hordijk, W., Steel, M., 2004. Detecting autocatalytic, self-sustaining sets in chemical reaction systems. Journal of the-oretical biology, 227(4), 451-461.1