Circuits with broken fibration symmetries perform core logic computations in biological networks
Ian Leifer, Flaviano Morone, Saulo D. S. Reis, Jose S. Andrade Jr., Mariano Sigman, Hernan A. Makse
CCircuits with broken fibration symmetries perform core logiccomputations in biological networks
Ian Leifer , Flaviano Morone , Saulo D. S. Reis , Jos´e S. Andrade, Jr. ,Mariano Sigman , Hern´an A. Makse Levich Institute and Physics Department, City College of New York, New York, NewYork, USA Departamento de F´ısica, Universidade Federal do Cear´a, Fortaleza, Cear´a, Brazil Laboratorio de Neurociencia, Universidad Torcuato Di Tella, Buenos Aires, Argentina* [email protected]
Abstract
We show that logic computational circuits in gene regulatory networks arise from afibration symmetry breaking in the network structure. From this idea we implement aconstructive procedure that reveals a hierarchy of genetic circuits, ubiquitous acrossspecies, that are surprising analogues to the emblematic circuits of solid-stateelectronics: starting from the transistor and progressing to ring oscillators,current-mirror circuits to toggle switches and flip-flops. These canonical variants servefundamental operations of synchronization and clocks (in their symmetric states) andmemory storage (in their broken symmetry states). These conclusions introduce atheoretically principled strategy to search for computational building blocks in biologicalnetworks, and present a systematic route to design synthetic biological circuits.
Author summary
We show that the core functional logic of genetic circuits arises from a fundamentalsymmetry breaking of the interactions of the biological network. The idea can be putinto a hierarchy of symmetric genetic circuits that reveals their logical functions. We doso through a constructive procedure that naturally reveals a series of building blocks,widely present across species. This hierarchy maps to a progression of fundamentalunits of electronics, starting with the transistor, progressing to ring oscillators andcurrent-mirror circuits and then to synchronized clocks, switches and finally to memorydevices such as latches and flip-flops.
Introduction In all biological networks [1] some simple ‘network motifs’ appear more often than they would by pure chance [2–4]. This regularity has been interpreted as evidence that these motifs are basic building blocks of biological machineries, a proposal that has had a major impact on systems biology [4, 5]. However, mere statistical abundance by itself does not imply that these circuits are core bricks of biological systems. In fact, whether these network motifs may have a functional role in biological computation remains controversial [6–9]. June 25, 2020 1/16 a r X i v : . [ q - b i o . GN ] J un unctional building blocks should offer computational repertoires drawing parallels between biological networks and electronic circuits [10]. Indeed, the idea of using electronic circuitry and devices to mimic aspects of gene regulatory networks has been in circulation almost since the inception of regulatory genetics itself [11]. This idea has been a driving force in synthetic biology [12]; with several demonstrations showing that engineered biological circuits can perform computations [13], such as toggle switches [14–17], logic [18], memory storage [15, 19, 20], pulse generators and oscillators [14, 21–23]. In a previous work [24], we have shown that the building blocks of gene regulatory networks can be identified by the fibration symmetries of these networks. In the present paper, we first demonstrate the functionality of these symmetric building blocks in terms of synchronized biological clocks. We then show that the breaking of the fibration symmetries of the network identifies additional building blocks with core logic computational functions. We do so through a constructive procedure based on symmetry breaking that naturally reveals a hierarchy of genetic building blocks widely present across biological networks and species. This hierarchy maps to a progression of fundamental units of electronics, starting with the transistor, and progressing to ring oscillators and current-mirror circuits and then to memory devices such as toggle switches and flip-flops. We show that, while symmetric circuits work as synchronized oscillators, the breaking of these symmetries plays a fundamental role by switching the functionality of circuits from synchronized clocks to memory units. Thus, our constructive theoretical framework identifies 1) building blocks of genetic networks in a unified way from fundamental symmetry and broken symmetry principles, which 2) assure that they perform core logic computations, 3) suggests a natural mapping onto the foundational circuits of solid state electronics [12, 13], and 4) endows the mathematical notion of symmetry fibration [24] with biological significance. Results Feed-forward loops do not synchronize We start our analysis by considering the dynamics of the most abundant network motif in transcriptional regulatory networks, the so-called feed-forward loop (FFL) introduced by Alon and coworkers [3, 4, 25]. A cFFL motif consists of three genes (X, Y and Z; c refers to coherent where all regulators are activators) where the transcription factor expressed by gene X positively regulates the transcription of Y and Z, and, in turn Y regulates Z (Fig. 1A). Fig. 1A shows an example of cFFL motif in E. coli with X= cpxR , Y= baeR , and Z= spy . Numerical and analytic solutions for the expression levels of the genes in the cFFL (Fig. 1B) demonstrate that the FFL does not reach synchronization (unless for very specific setting of parameters) nor oscillations in expression levels. This is consistent with previous research which has interpreted the functionality of the FFL as signal delay [3, 25, 26]. We illustrate this result by presenting an analytical solution of the FFL [2–4]. We use a discrete time, continuous state variable model with a logic Boolean interaction function in the spirit of the Glass and Kauffman model of biochemical networks [27]. The dynamics of the expression levels y t and z t of genes Y and Z, respectively, as a function of time t in the cFFL is given by the following difference equations [4]: y t +1 = (1 − α ) y t + γ x θ ( x t − k x ) ,z t +1 = (1 − α ) z t + γ x θ ( x t − k x ) × γ y θ ( y t − k y ) , (1)where x t is the expression level of gene X, α is the degradation rate of the gene, γ x and γ y are the strength of the interaction representing the maximum expression rate of June 25, 2020 2/16enes X and Y, respectively, and the thresholds k x and k y are the dissociation constant between the transcription factor and biding site. The expression level is measured in terms of abundance of gene product, e.g., mRNA concentration. The Heaviside step functions θ ( x t − k x ) and θ ( y t − k y ) represent the activator regulation from gene X and Y, respectively. They represent the Boolean logic approximation of Hill input functions in the limit of strong cooperativity [4, 27]. We consider an AND gate for the combined interaction of transcription factors of genes X and Y onto the binding sites of gene Z [4]. Analogous results shown in S1 File Section I can be obtained with an OR gate and with ODE continuum models. In S1 File Sec. I A, we show that the expression levels of the genes Y and Z do not synchronize, in other words, y t = z t , ∀ t . For example, Fig. 1B shows a particular set of parameters which results in a non-synchronized state. Such state is obtained under initial condition y > k y and α < γ x . Specifically, we use the parameters: α = 0 . γ x = 0 . γ y = 0 . k x = 0 . k y = 0 .
1. For this combination, y t and z t do not synchronize since y t saturates at y t → γ x /α = 0 . t → ∞ , and z t saturates at z t → γ x γ y /α = 0 .
42, for t → ∞ . In this figure, we set x t equal to a square wave and then monitor the expression levels of y t and x t . When x < k x , both y t and z t decay exponentially to zero. On the other hand, when x > k x , both variables evolve to saturate again at y t = γ x /α and z t = γ x γ y /α , in agreement with the analytical solution. Feed-forward fibers synchronize via a symmetry fibration In the FFL, gene Z receives input from X and Y, while gene Y, instead, only from X, and therefore the inputs are not symmetric (Fig. 1C). But as it turns out, a search of motifs in biological networks [24] shows that the FFL circuit in Fig. 1A regularly appears in conjunction with an autoregulation (AR) loop [28] at Y= baeR (Fig. 1E). This minimal inclusion in the FFL symmetrizes the circuit and we show, next, that this symmetry results in a circuit with a first and minimal form of function: synchronization in gene expression. This can be formalized by analyzing invariances in the input tree [24], which represents all the paths that converge to a given gene (Fig. 1C and G). The mathematical principle of symmetry fibration in dynamical systems introduced in [24, 29, 30] predicts this, since symmetries Y ↔ Z that leave invariant the tree of inputs (but not necessarily the outputs as imposed by automorphisms) are necessary and sufficient to achieve synchronization. When two genes have isomorphic input trees, their expression levels are synchronized, see [24] for details. For instance, by means of its autoregulation, baeR forms a symmetry fibration with gene spy , that can be formalized and characterized by an isomorphism between the input trees of these genes (Fig. 1G). When two input trees, like those of baeR and spy , are isomorphic, the expression levels of these two genes are synchronized. These genes are said to belong to the same ‘fiber’ [24, 30]. Genes in the same fiber are redundant, and can be collapsed into the ‘ base ’ (see Fig. 1H) by a symmetry fibration [24] (the FFL, instead, cannot be reduced since it has no symmetry, see Fig. 1D). Numerical simulations and analytical solutions (Fig. 1F, S1 File Section II A and Section II B) confirm that the addition of the AR loop to the FFL – leading to a circuit that we call the Feed-Forward Fiber (FFF) – changes its functionality qualitatively, leading to synchronization of genes Y and Z into coherent co-expression. This prediction is confirmed with experimental co-expression profiles in Ref. [24]. The FFF with activator regulations in Fig. 1E has a simple dynamics converging to a synchronized fixed point: all interactions are satisfied meaning that the Heaviside step functions evaluate to 1 and we call this circuit SAT-FFF. Instead, when the autoregulation is a repressor, the loop behaves as a logical NOT gate. When expression is high it inhibits itself shifting to low state. Instead, if it is low it promotes itself to shift to a high state. Hence, the activity of this gene oscillates indefinitely [14, 27]. This is
June 25, 2020 3/16he simplest expression of frustration [27, 31], a core concept in physics which refers to a system which is always in tension and thus never reaches a stable fixed configuration.
A biological transistor as a core building block
To understand the computation rationale of symmetric and frustrated circuits made of repressors, we map them to electronic analogues. We begin the analogy with the simplest circuit of a single gene with a feedback loop with repression (AR loop, Fig. 2D).
The dynamics for the expression level y t is described by the discrete time model with Boolean interaction [27]: ∆ y t = y t +1 − y t = − αy t + γ y θ ( k y − y t ) . (2)Here, α is the degradation rate of the Y gene product, γ y is the maximum expression rate of gene Y, and k y is the dissociation constant. The Heaviside step function θ ( k y − y t ) reflects the repressor autoregulation in the Boolean logic approximation. We will show that this genetic repressor interaction, shown as the stub in Fig. 2B, is the genetic analogue of a solid-state transistor shown in
Fig. 2A.
A transistor is typically made up of three semiconductors, a base sandwiched between an emitter and a collector (Fig. 2A). The current flows between the emitter and collector only if voltage applied to the base is lower than at the emitter ( V B < V E ) and thus the transistor acts as a switch and inverter. In the genetic circuit, the expression y t drives the rate of expression of gene Y, like the voltage drives current around an electric circuit. Simply comparing Eq. (2) to the pnp transistor in Fig. 2A leads to the analogy in which the expression y t is an analogue for the base potential V B of a transistor, k y an analogue for V E , γ y an analogue for the emitter current I E , αy t an analogue for I B , and ∆ y t an analogue for I C . Then, Eq. (2) provides the genetic equivalent of the equation for a transistor’s collector current I C = I E − I B (Fig. 2C, D). The repressor AR genetic circuit of Fig. 2D becomes a one-stage ring oscillator (Fig. 2C) where the collector of the transistor connects to the base forming the minimal signal feedback loop. As shown in Fig. 2D, an example of the AR is the gene Y = trpR from the
E. coli transcriptional network. The repressor AR loop can be extended to the
FFF by symmetrizing it, adding gene Z, such that it synchronizes with Y to express an enzyme that catalyzes a biochemical reaction (Fig. 2F). The circuit is completed with the external regulator X which keeps the symmetry between Y and Z. The resulting circuit is called UNSAT-FFF (since it is frustrated).
The UNSAT-FFF maps to the so-called Widlar current-mirror electronic circuit shown in Fig. 2E, a popular building block of integrated circuits used since the foundations of the semiconductor industry (1967 US patent [32, 33]). It serves two key functions as we show below: mirror synchronization of y t = z t and oscillatory activity (Fig. 2G and S1 File Section III). UNSAT-FFF solution oscillates and synchronize
Next, we show analytically and numerically that the UNSAT-FFF circuit has an oscillatory solution plus synchronization of genes Y and Z. S1 File Section III A shows that the prediction of oscillatory behavior can be obtained also from a model of gene expression in the continuum time-domain using a first-order ODE model with time delay due to the process of transcription and translation. Below, we focus on the discrete dynamics.
The UNSAT-FFF circuit is obtained by the addition of a repressor AR loop to the
FFL: in Fig. 2F, gene Y acts as a repressor regulator on the gene Z and on itself. There
June 25, 2020 4/16re many variants of this circuit depending on the combinations of activator and repressor regulators. Here, we use X gene as an activator and Y gene as a repressor.
Different variants yields analogous results and will be discussed elsewhere. The important ingredient is the existence of frustration in the interactions. For instance if gene X is high, then it will make genes Y and Z high too. However, this configuration does not satisfy the repressor autoregulation bond neither the repression from Y → Z. Thus, two bonds are unsatisfied and this circuit unsatisfied: UNSAT-FFF. The discrete-time dynamics of the expression levels of genes y t and z t are given by: y t +1 = (1 − α ) y t + γ x θ ( x t − k x ) × γ y θ ( k y − y t ) ,z t +1 = (1 − α ) z t + γ x θ ( x t − k x ) × γ y θ ( k y − y t ) , (3)where γ x and γ y are the strength of the interaction (maximum expression rate) of genes X and Y , respectively, and k x and k y are the dissociation constant, respectively. Similarly to the SAT-FFF case, synchronization between y and z occurs for the UNSAT-FFF (see S1 File Section II). However, the impact of the repressor feedback loop on the dynamical behavior of this circuit is more profound, since it leads to oscillations.
Thus, while both, SAT-FFF and UNSAT-FFF, lead to synchronization of Y and Z, the former synchronizes into a fixed point and the later into an oscillatory limit cycle.
We set λ = γ x γ y /k y α , and β = 1 − α , so that we rewrite Eq. (3) for the rescaled variables ψ t = y t /k y and ζ t = z t /k y as ψ t +1 = βψ t + αλθ ( x t − k x ) θ ( k y − ψ t ) ,ζ t +1 = βζ t + αλθ ( x t − k x ) θ ( k y − ψ t ) . (4)Now, we set x t = x constant in time for simplicity. For x < k x , the solutions exponentially decay as ψ t = ψ e − t/τ and ζ t = ζ e − t/τ , where ψ is the initial condition. For x > k x , Eq. (4) defines an iterative map which satisfies the following recursive equation: f t ( ψ ) = f t − ( βψ ) θ ( ψ −
1) + f t − ( βψ + αλ ) θ (1 − ψ ) . (5)This iterative map results in different solutions depending on the value of λ . We consider first the case where the initial condition is ψ >
1. Thus, the solution of
Eq. (4) is ψ t = ψ e − t/τ , where τ − = − log(1 − α ). This solution is correct as long as ψ t >
1, but ceases to be valid at a certain time t ∗ such that ψ t <
1, which is given by t ∗ = d τ log ψ e . Next, we consider the case ψ <
1. In this case the solution is given by ψ t = ψ e − t/τ + λ (1 − e − t/τ ), which is always valid for λ <
1. Thus, when λ < system does not oscillate but it converges monotonically to a fixed point ψ ∞ = λ . However when λ >
1, this solution ceases to be valid at the time t ∗ = d τ log λ − ψ λ − e such that ψ t >
1. Therefore, the solution ψ t oscillates in time for λ >
1. For the case of ψ >
1, the explicit solution is given by the general analytical expression which is plotted in Fig. 2G, right: ψ t = ψ e − t/τ for t ∈ n , , . . . , t = d τ log ψ e o ,ψ t = ψ e − ( t − t ) /τ + λ (cid:16) − e − ( t − t ) /τ (cid:17) for t ∈ (cid:26) t , . . . , t = t + (cid:24) τ log λ − ψ λ − (cid:25)(cid:27) ,ψ t = ψ e − ( t − t ) /τ for t ∈ n t , . . . , t = t + d τ log ψ e o . (6)The general solution with initial condition ψ ( t ) < Thus, the main condition for oscillations in the circuits is λ >
1, and if λ <
1, there is no oscillatory behavior, and the solution ψ t converges monotonically to λ . Therefore, June 25, 2020 5/16he oscillatory phase is separated from the non-oscillatory phase by the condition: γ y k y = (cid:16) γ x α (cid:17) − , (7)which is depicted in the phase diagram in Fig. 2G, left. Thus, the repressor autoregulation at gene Y converts the circuit into a synchronized clock, a primary building block in any logic computational device. S1 File Section V and S2 File show all FFFs found across gene regulatory networks of the studied species spanning
A. thaliana , M. tuberculosis , B. subtilis , E. coli , salmonella , yeast , mouse and humans. We also show in Table 1 the count of circuits across species and their associated Z-scores showing that these circuits are statistically significant. The algorithm to find these fibers in biological networks is explained in [24] and S1 File Section IV and it is available at https://github.com/makselab/FiberCodes . Table 1. Symmetric circuits ( fibers ) count [24].
Species Database Nodes Edges AR Fiber FFF Fibonacci Fiber n = 2 Fiber N real N rand ± SD Z-score N real N rand ± SD Z-score N real N rand ± SD Z-score N real N rand ± SD Z-scoreArabidopsis Thaliana ATRM 790 1431 2 0 . ± . ± . ± . . ± . . ± . . ± . . ± . . ± . . ± . ± . ± . . ± . . ± . ± > . ± . . ± . . ± . ± > . ± . . ± . . ± . . ± . . ± . . ± . . ± . . ± . . ± .
25 3.76 1 0 ± > . ± . ± > We report the Z-scores showing that all found fibers are statistically significant. We use a random null model with the samedegree sequence (and sign of interaction) as the original network to calculate the random count N rand and compare with the realcircuit count N real to get the Z-score. A broad class of logic and dynamic repertoires in the class of fibers
The procedure to build more complex fibers can be systematically extended through an algebra of circuits that adds external regulators and loops to grow the base of symmetric circuits (Fig. 3 and S1 File Section IV A). In this space, the AR is the core loop unit referred to as | n = 1 , ‘ = 0 i in the nomenclature of [24] since it has n = 1 loop and ‘ = 0 external regulator (Fig. 3B), and the FFF is | n = 1 , ‘ = 1 i (Fig. 3C). From this starting point, one can grow the number of regulator genes, | , ‘ i with ‘ >
1. This does not affect the complexity because all the relevant dynamics remain constrained to the sole loop in the FFF circuit. Likewise, there are a number of circuits that can correspond to | , i . However, only certain modifications conserve the topological class identified by | , i . For instance, changing the sign of the edges is allowed as long as the edges of each gene are the same, but removing the edge X → Z will break the symmetry of the fiber, so it is not allowed. Adding a second node downstream of Y will conserve the topological class but only if it interacts with X. This situation changes as soon as the fiber feeds information back to the external world. This is the case of the circuits in Fig. 3D, where the gene Y now regulates its own regulator gene X.
The inclusion of this second feedback loop results in the coexistence of two time-scales in the network. This, in turn, increases the diversity of trajectories and delays in the network; a dynamic complexity that is measured by the divergence of the input tree, which is captured by the sequence Q t representing the number of source genes with June 25, 2020 6/16aths of length t − is a rooted tree with a gene at the root ( Q = 1), and Q t represents the number of genes in the t − th layer of the input tree. Then, the divergence of the input tree is captured by its branching ratio measured by n = (cid:16) Q t +1 Q t (cid:17) t →∞ , see [24] for details. A quantitative analysis of this measure yields exactly the golden ratio (cid:16) Q t +1 Q t (cid:17) t →∞ = ϕ = (cid:0) √ (cid:1) / . ... for the circuits in Fig. 3D, revealing that the input tree is a Fibonacci sequence Q t = Q t − + Q t − updating the current state two steps backwards, see [24]. We have called this class of circuits, Fibonacci Fibers, in [24]. For example, the repressor interactions between genes X = uxuR , Y = exuR , and Z = lgoR from the E. coli network function exactly as a Fibonacci Fiber (Fig. 3D).
Other typical examples of Fibonacci Fibers in the transcriptional networks across species are also shown in Fig. 3D. S1 File Section V and S2 File show all found fibers , and Table 1 the counts for all fibers and their associated Z-scores showing that these circuits are statistically significant. The important component of these circuits is the delay in the feedback loop through the regulator from Y → X and back to Y captured by the Q t − term in the Fibonacci sequence. This circuit has been synthetically implemented by Stricker et al. [22] using a hybrid promoter that drives the transcription of genes araC and lacI forming a dual-feedback circuit. The functionality of this circuit has been demonstrated to be robust oscillations due to the negative feedback loop [22]. We will show next that a symmetry breaking in this Fibonacci circuit forms the base of the JK flip-flop, which is the universal storage device of computer memories [33]. The complexity of the Fibonacci Fiber with feedback to the regulator is 1 . ... , which is lower than the number of loops in the circuit (two). The intuition of what this reveals is that, in this circuit the regulator X is still not part of the base (see Fig. 3D), since it does not receive input from itself and hence it is not within the symmetry of the fiber . This, in turn, indicates naturally that the next element in the hierarchy of fibers results from the inclusion of an AR loop in X. This creates a fully symmetric circuit (Fig. 3E) with a core | , ‘ = 0 i that feeds the reporter/enzyme Z. In this case, the complexity of the circuit is exactly two (the number of autoregulators within the fiber ). Examples are shown in Fig. 3E and Supplementary Materials.
Broken symmetry circuits as memory storage devices
All symmetric circuits shown in Fig. 3 [24] work as clocks with varying levels of sophistication and robustness given by their time-scales or loops (see S1 File
Sections III C and IV A). Additionally, the more complex Fibonacci Fibers store memory dynamically by integrating sequences of its two immediate past states, according to the input tree, which computes temporal convolutions. However, this is only a short-term memory, stored dynamically and continually erased. This raises the necessity of understanding how these canonical biological circuits can perform controlled memory set and reset, a fundamental component of all computing devices [33].
We show next that static memory storage requires breaking the fibration symmetry of each circuit creating structures analogous to ‘flip-flops’ [33] in electronics that use a bistable toggle switch [14, 15] to store a bit of binary information into computer memory.
As we show below, in genetic networks, symmetry breaking endows the circuits with the ability to remember.
Next, we extend the constructive process described above to include symmetry breaking. We do so by mimicking an evolutionary process where circuits ‘grow’ by a ‘duplication’ event (analogous to gene duplication in evolution) that conserves the base of the original circuit (Fig. 4, first and second row). Then, breaking the symmetry creates a new functionality. We begin this with the simplest case of AR gene Y: | n = 1 , ‘ = 0 i . This gene is duplicated by ‘opening up’ the AR loop into two mutually June 25, 2020 7/16epressed replica genes, Y and Y’ (Fig. 4, second row). This creates a bistable toggle switch as studied in synthetic genetic circuits [14, 15] that is topologically equivalent to the core AR loop | , i since both have the same input tree and base (Fig. 4, first row). The symmetry of the replica circuit is then explicitly broken by including different inputs, S and R, to regulate genes Y and Y’, respectively (Fig. 4, third row). S and R represent either genes or effectors that break the symmetry of the circuit. Symmetry is now broken, and the genetic circuit becomes analogous to a Set-Reset (SR) flip-flop [33], the simplest canonical circuit to store a bit of binary information in electronic computer memory (see Fig. 4, fourth row).
The symmetry is explicitly broken by applying set-reset inputs S = R. Specifically, when S = 1 and R = 0, the circuit stores a bit of information. But here is the interesting fact: this state is kept in memory even when S = R = 1. That is, the circuit remains in the broken symmetry state even if the inputs are now equal and symmetric. In other words, the symmetry is now ‘spontaneously’ broken [34], since a specific state is selected even without external bias, thus allowing the circuit to remember its state.
Thus, this genetic network is a toggle switch as studied in synthetic genetic circuits [14, 15] analogous to the SR flip-flop logic circuit shown in Fig. 4, fourth row.
When the controlling inputs are S = 0 and R = 1, the SR flip-flop stores one bit of information resulting in Q = 1 and Q = 0 ( Q = not Q , and the output Q is the logic conversion of not Y ). The SR flip-flop retains this logical state even when the controlling inputs change. In other words, when S = 1 and R = 1, the feedback (the repressor interactions between Y and Y’) maintains the outputs Q and Q to its previous state. This state changes only when we reset the circuits with S = 1 and R = 0. In this last case, the SR flip-flop stores Q = 0 and Q = 1, which is also remembered when both inputs are high ( S = 1 and R = 1, S1 File Section IV B). This spontaneous symmetry breaking [34] has analogy with a ferromagnetic material.
When the temperature is high enough, the ferromagnet does not show any magnetic property. Moreover, even lowering the temperature, in absence of any polarizing field, the material does not magnetize. On the contrary, if an external magnetic field is applied to the ferromagnet, like a magnet put in contact to a needle for enough time, and then removed, then the needle becomes magnetic itself, in that it remembers the direction of the previously applied magnetic field, thus breaking, spontaneously, the rotational symmetry. Biological realizations of this replica symmetry breaking process are shown in Fig. 4, last row. Full list of symmetry broken circuits in gene regulatory networks across species appears in S1 File Section V and S2 File. Table 2 shows the
Z-scores of these circuits indicating their significance. The algorithm to identify these broken symmetry flip-flops in biological networks is developed in S1 File Section VII and is available at https://github.com/makselab/CircuitFinder . Table 2. Broken symmetry circuits count.
Species Database Nodes Edges SR flip-flop Clocked SR flip-flop JK flip-flop N real N rand ± SD Z-score N real N rand ± SD Z-score N real N rand ± SD Z-scoreArabidopsis Thaliana ATRM 790 1431 47 1 . ± . . ± . ± > . ± . . ± . . ± . . ± . . ± . ± . ± . . ± . . ± . . ± . ± . . ± . . ± .
06 417Human 192 566 90TRRUST 2718 8215 89 4 . ± . . ± . . ± . ± . . ± . . ± . We report the corresponding Z-score statistics as computed in Table 1.June 25, 2020 8/16xtending this duplication and symmetry breaking process to the FFF (Fig. 4, second column), one can replicate the X and Y genes from the FFF base to create a circuit isomorphic to the Clocked SR flip-flop, another computer memory building block [33] (see S1 File Section IV B). The symmetry is broken by regulators or inducers of genes X and X’ acting as S (set) and R (reset) of memory, and it is restored when
S = R, leaving the system ‘magnetized’. The hierarchy continues by replicating the
Fibonacci Fiber and consequent breaking of symmetry when the inputs J and K are different (Fig. 4 third column, see S1 File Section IV B). This structure is isomorphic (i.e., has the same base ) to the JK flip-flop in electronics, which is the most widely used of all flip-flop designs [33]. In its symmetric state, the JK flip-flop is isomorphic to the symmetric Fibonacci base . In the symmetry broken phase, it acts as a memory device which presents two possibilities. A ‘chiral’ symmetry (where Y feeds X and Y’ feeds X’) and a ‘parity’ symmetry (left-right reflection, where Y feeds X’ and Y’ feeds X). This last one is the one realized in biological circuits, see Fig. 4, third column. Examples of
JK flip-flops are abundant in gene regulatory networks in human and mouse. They are shown in Fig. 4, last row and full list in S1 File Section V, S2 File, and Z-scores in
Table 2.
Discussion
In summary, fibration symmetries and broken symmetries in gene regulatory networks reveal the functions of synchronization, clocks and memory through electronic analogues of transistors, ring oscillators, current-mirror circuits, and flip-flops. They result in a hierarchy of building blocks with progressively more complex dynamics obtained by iterating a procedure of replication and symmetry breaking. Beyond the circuits discussed here, the biological hierarchy can be extended to any number m of loops of length d and autoregulators in the fiber n , to form ever more sophisticated circuits whose complexity is expressed in generalized Fibonacci sequences Q t = nQ t − + mQ t − d . Gene regulatory structures are a mixture of combinational logic circuits, like FFF, and sequential logic circuits, like FF. They provide the network with a structure analogous to a programmable logic device or chip where the ’register’ is a set of flip-flop circuits tied up together acting as the memory clock of the genetic network that feeds the combinatorial logic circuits made of simpler feed-forward circuit of low symmetry.
Complex biological circuitry can then be seen as an emergent process guided by the laws of symmetry that determine biological functions analogous to electronic components. The discovery of these building blocks and building rules of logic computation will allow to: (1) systematically design synthetic genetic circuits following biological symmetry, and (2) systematically map the structure and function of all biological networks, from the symmetries of the connectome [35] to genetic [24], protein and metabolic networks, following a first principle theoretical approach.
Supporting information
S1 File. Supplementary Materials.
Detailed description of all analytical solutions mentioned in the main text, of the data acquisition and treatment, and detailed description of the proposed algorithm to find fibers.
S2 File. List of symmetry and broken symmetry circuits.
In S2 File we present a list of circuits found in different species.
June 25, 2020 9/16 cknowledgments
We are grateful to L. Parra and W. Liebermeister for discussions.
References
1. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cellbiology. Nature. 1999;402: C47-C52.2. Milo R, Shen-Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Networkmotifs: simple building blocks of complex networks. Science. 2002;298: 824-827.3. Shen-Orr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptionalregulation network of
Escherichia coli . Nature Genet. 2002;31: 64-68.4. Alon U. An Introduction to Systems Biology: Design Principles of BiologicalCircuits. Boca Raton: CRC Press; 2006.5. Klipp E, Liebermeister W, Wierling C, Kowald A, Herwig R. Systems Biology: atextbook. Weinheim: Wiley-VCH; 2016.6. Ingram PJ, Stumpf MP, Stark J. Network motifs: structure does not determinefunction. BMC Genomic. 2006;7: 108.7. Payne JL, Wagner A. Function does not follow form in gene regulatory circuits.Sci Rep. 2015;5: 13015.8. Mac´ıa J, Widder S, Sol´e R. Specialized or flexible feed-forward loop motifs: aquestion of topology. BMC Syst Biol. 2009;3: 84.9. Ahnert SE, Fink TMA. Form and function in gene regulatory networks: thestructure of network motifs determines fundamental properties of their dynamicalstate space. J Royal Soc Interface. 2016;13: 20160179.10. Tyson JJ, Chen KC, Novak B. Sniffers, buzzers, toggles and blinkers: dynamicsof regulatory and signaling pathways in the cell. Curr Opin Cell Biol. 2003;15(2):221-31.11. Monod J, Jacob F. General conclusions: teleonomic mechanisms in cellularmetabolism, growth and differentiation. Cold Spring Harb Symp Quant Biol.1961;26: 389-401.12. Teo JY, Woo SS, Sarpeshkar R. Synthetic Biology: A Unifying View and ReviewUsing Analog Circuits. IEEE Trans. on Biomed. Circuits and Syst. 2015;9:453-474.13. Dalchau N, Sz´e G, Hernansaiz-Ballesteros R, Barnes CP, Cardelli L, Phillips A,et al. Computing with biological switches and clocks. Natural Computing.2018;17: 761-779.14. Atkinson MR, Savageau MA, Myers JT, Ninfa AJ. Development of geneticcircuitry exhibiting toggle switch or oscillatory behavior in
Escherichia coli . Cell.2003;113: 597-607.15. Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in
Escherichia coli . Nature. 2000;403: 339-342.June 25, 2020 10/166. Kramer BP, Fussenegger M. Hysteresis in a synthetic mammalian gene network.Proc Natl Acad Sci USA. 2005;102: 9517-9522.17. Kramer BP, Viretta AU, Baba MD-E, Aube D, Weber W, Fussenegger M. Anengineered epigenetic transgene switch in mammalian cells. Nature Biotech.2004;22: 867-870.18. Guet CC, Elowitz MB, Hsing W, Leibler S. Combinatorial synthesis of geneticnetworks. Science. 2002;296: 1466-1470.19. Ajo-Franklin CM, Drubin DA, Eskin JA, Gee EP, Landgraf D, Phillips I, et al.Rational design of memory in eukaryotic cells. Genes Dev. 2007;21: 2271-2276.20. Ham TS, Lee SK, Keasling JD, Arkin AP. Design and construction of a doubleinversion recombination switch for heritable sequential genetic memory. PLoSOne. 2008;3: e2815.21. Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptionalregulators. Nature. 2000;403: 335-338.22. Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J. A fast,robust and tunable synthetic gene oscillator. Nature. 2008;456: 516-519.23. Tigges M, Marquez-Lago TT, Stelling J, Fussenegger M. A tunable syntheticmammalian oscillator. Nature. 2009;457: 309-312.24. Morone F, Leifer I, Makse HA. Fibration symmetries uncover the building blocksof biological networks. Proc Natl Acad Sci USA. 2020;117: 8306-8314.25. Mangan S, Alon U. Structure and function of the feed-forward loop networkmotif. Proc Natl Acad Sci USA. 2003;100: 11980-11985.26. Mangan S, Zaslaver A, Alon U. The coherent feedforward loop serves as asign-sensitive delay element in transcription networks. J Mol Biol. 2003;334:197-204.27. Glass L, Kauffman SA. The logical analysis of continuous, non-linear biochemicalcontrol networks. J Theor Biol. 1973;38: 103-129.28. Mangan S, Zaslaver A, Alon U. Negative autoregulation increases the inputdynamic-range of the arabinose system of
Escherichia coli . BMC Sys Biol. 2011;5:111.29. Golubitsky M, Stewart I. Nonlinear dynamics of networks: the groupoidformalism. Bull Am Math Soc. 2006;43: 305-364.30. Boldi P, Vigna S. Fibrations of graphs. Discrete Mathematics. 2001;243: 21-66.31. Anderson PW. The concept of frustration in spin glasses. J of the Less-CommonMetals. 1978;62: 291-294.32. Widlar RJ, Some circuit design techniques for linear integrated circuits. IEEETrans Circuit Theory. 1965;4: 586-590. See also Widlar RJ: US Patent Number3,320,439; Filed May 26, 1965; Granted May 16, 1967: Low-value current sourcefor integrated circuits and Widlar RJ. Design techniques for monolithicoperational amplifiers. IEEE Solid-State Circuits. 1969;4: 184-191.33. Horowitz P, Hill W. The Arts of Electronics. 3rd ed. New York:CambridgeUniversity Press; 2015.June 25, 2020 11/164. Weinberg S. The Quantum Theory of Fields. Cambridge: Cambridge UniversityPress; 2005.35. Morone F, Makse HA. Symmetry group factorization reveals thestructure-function relation in the
Caenorhabditis elegans connectome. NatCommun. 2019;10: 4961.June 25, 2020 12/16
YZ X YZ
Feed Forward Loop (
FFL ) SAT Feed-Forward Fiber (
SAT- FFF )Network representation
A EB F
Synchronization
NO YES
105 120 135 XY Z Z YX E. Coli
X = cpxR
Y = baeR
Z = spy geneActivator
Base representation
D H
X YZ X Y
Not isomorphic
ZYX X X Y
Not isomorphic
Input-tree representation
C G
ZYX YYY Y X
Isomorphic
XXX
Fig 1.
Feed-Forward Loop (FFL) and Feed-Forward Fiber (FFF).
A: FFL networkrepresentation. B: Numerical solution of FFL dynamics. The expression levels of genes Y andZ do not synchronize. The oscillation pattern presented is due to the square-wave behavior ofgene X expression levels. We use α = 2 . γ x = 0 . γ y = 0 . k x = 0 . k y = 0 . y = 0 . z = 0 .
0. C: Input tree representation of FFL. The input trees of genes X, Y, and Z are notisomorphic, as a consequence, their expression levels do not synchronize. D: Baserepresentation of FFL. The base is the same as the original circuits since there are nosymmetries. E: SAT-FFF network representation. As an example, we consider genes cpxR = X, baeR = Y, and spy = Z from the gene regulatory network of
E. coli . The addition of theautoregulation leads to a symmetry between the expression levels of genes Y and Z. F: Thenumerical solution of the SAT-FFF dynamics shows the synchronization of the expressionlevels of genes Y and Z. We use α = 0 . γ x = 0 . γ y = 0 . k x = 0 . k y = 0 . y = 0 . z = 0 .
0. Again, the oscillation is due to the wave-like pattern of X. G: As a result, genesY and Z have isomorphic input trees. However, the input tree of the external regulator cpxR isnot isomorphic, despite the fact that it directly regulates the fiber . H: Since Y and Zsynchronize, gene Z can be collapsed into Y, resulting in a simpler base representation.
June 25, 2020 13/16 iscrete Boolean Dynamics
XYZ C Electronic onestage ring oscillator
Load
Logic circuit Y Mapping D Genetic repressorautoregulator loop Y Genetic circuit
Mapping F Genetic UNSAT-FFFWidlar current-mirror circuit (US patent: 3320439) E Load
A B
Logic circuit Y Electronic Transistor
To conduct requires: Repressor link Y Genetic Transistor
Mapping
To conduct requires:
Mapping
ZxY outputLogic circuit
Base pnp
CollectorEmitter
Enzyme
Mapping G Synchronized oscillations in UNSAT-FFF
OscillationsNo oscillations
Phase diagram
Discrete Boolean Dynamics
Genetic circuit
X YZ
E. coli
X = crp
Y = lsrR
Z = tamE. coli
Y = trpR
Fig 2.
Mapping between electronic and biological transistor.
A: A pnp transistorallows current flow if the voltage applied to its base is lower than the voltage at its emitter( V B < V E ). Since it has a high (low) output for a lower (high) input, it is logically representedby a NOT gate. The yellow box shows the mapping between the pnp transistor and thebiological repressor. B: A repressor regulation link plays the role of the pnp transistor since therate of expression of a gene is repressed by gene Y if k y < y t . C: By connecting the base of thetransistor to its collector, one forms a one stage ring oscillator. D: This connection is translatedto the biological analogue as a repressor autoregulation at gene Y. In this way, the rateexpression of gene Y is able to oscillate, depending on the adjustment of parameters α , k y and γ y (see S1 File Section III). E: Widlar current-mirror circuit and F: its biological analogue(UNSAT-FFF). By mirroring the ring oscillator, the Widlar mirror circuit allowssynchronization and oscillations (see S1 File Section III). G: Phase diagram of oscillations ofthe UNSAT-FFF. An oscillatory phase is defined by the condition γ y /k y > ( γ x /α ) − (see S1File Section III). For instance, on the right side, we plot the solution of the discrete dynamicsfor a set of parameters satisfying such condition. Specifically, α = 0 . γ x = 0 . γ y = 0 . k x = 0 .
5, and k y = 1 . June 25, 2020 14/16
ZYY XXY X Y Repressor link Y Y ZxY
X YZ
ZY X X YY XYXY XY XY XY ZY X ZYY XXY X YY Y
X YZ ZY X Y
X Y
UNSAT Feed-Forward Fiber E E. coli
B. subtilis
Yeast
Human Y X M. tuberculosis
Salmonella
Tec1 Ste12
Mcm1 F a r P r m Y M R C Y B R W Y P
L 1 9 1 C Y O L 1 0 6 W S n l Y ap3 Z d s Fibonacci FiberRepressor Autoregulator Loop n = 2 Fiber
Transistor Repressor Link (Stub)
No input-tree No base
Logic circuitInput-treeGenetic network
Base
Logic circuitInput-treeGenetic network
Base
Logic circuitInput-treeGenetic network
Base
Logic circuit
DCBA
Input-treeGenetic network
Base
Logic circuit R v c Rv3286cRv2720Rv0117Rv2465cRv3413cRv2466cRv3222c Rv3414cRv3413c hprT tilSftsH Y X YZ glnRtnrA glnAalsT exuRlgoRuxuR marA robnfo soxS JUN EZRCREM SP1FOS
PAX5 T P UHRF1BP1NLRC4GMLAPCSSFN lexA rocR
Symmetric Circuits X Examples of n = 2 realizationsExamples of Fibonacci realizations
Fig 3.
Symmetric circuits [24] function as clocks.
The addition of autoregulation loopsand feedback loops results on a hierarchy of circuits of increasing complexity. For example,turning the A: repressor link into a B: repressor autoregulation results on an input tree thatfeeds its own expression levels with Q t = 1 and branching ratio n = 1 and it is equivalent to itsown base . C: The addition of an external regulator ( ‘ = 1) creates the UNSAT-FFF wheregenes Y and Z synchronize and oscillate. This can be verified in its corresponding input treeand logic circuit. D: The addition of a second feedback loop results in an input tree that followsthe Fibonacci sequence Q t = 1 , , , , , ... . Here, gene X is not part of the base . The branchingstructure of the input tree implies that the Fibonacci Fiber can oscillate and synchronize, butis unable to store static information. Examples are shown from all studied species. E: Thesecond autoregulation at X results in a symmetric input tree with Q t = 2 Q t − and branchingratio n = 2. This fiber collapses into a base with two autoregulators. Examples of n = 2 Fiberare two fibers from the regulatory networks of B. subtilis . In this figure, we present activatorlinks (black), repressor links (red), and interactions with unknown functionality (grey).
June 25, 2020 15/16 ymmetryclass
Base (clock)ReplicaSymmetryReplicaSymmetryBreaking(memory)
Y'YSRS ≠ R Electronic logicanalogous
X YYX
XX Y'YX'CLKXX Y'YX'CLKSR S ≠ R XX Y'YX'CLKJK J ≠ K Biologicalrealizations AR
XX Y'YX'CLK
FFF Fibonacci Y' Y SR SR fl ip- fl op Y' YX X' SRCLK
Clocked SR fl ip- fl op Y' YX X' KCLKJ JK fl ip- fl op RS NFKB1HOXA9
BST1HAX1 RS IRF4BCL6
XSR
NR3C1NR0B2CLOCK E2F1TP53
XJK
NKX3-1JUNPITX1 TP53ESR1
XJK
AREPS300FLI1 RELAHDAC1
Symmetry Breaking Circuits
Y'Y Y Replica
HSP90AB1MAP4PRC1REEP5
SLC6A6
SSTR2WWOXPNRC2FHIT
XSR
PRDM1DDIT3CEBPB CEBPAMYC
PRODHFTH1P18CD33HOMER3
HK3
DEFA3
DEFA1
HSP90AB1MAP4PRC1REEP5SLC6A6
SSTR2
CD24CYP2C19PTMAAKR1C3CLK3
NRASSEP_07
TSC1
TSC2
UCN
Fig 4.
Broken symmetry circuits function as memory. AR (first column):
Thereplica symmetry duplication of the AR symmetric circuit results in a network that isanalogous to the SR flip-flop circuit. The symmetry between Y and Y’ is broken by the inputsS and R, such that S = R, resulting into a pair logical outputs Q and Q = not ( Q ). FFF(second column):
Following the same strategy, we replicate the FFF through a replicasymmetry duplication. Note that this operation adds a second level of logic gates to the SRflip-flop. In order to have consistent logic operations, we add an input clock gene CLK inaddition to S and R. The resulting circuit is analogous to the Clocked SR flip-flop logic circuit.
Fibonacci (third column:
The replica symmetry duplication of the Fibonacci Fiber resultsin a logic circuit which is analogous to the JK flip-flop. For each symmetry breaking class, weshow two examples of circuits from the human regulatory network. The external regulatorgenes, S and R (J and K), provide inputs which are logically processed by the circuit,accordingly to the type of interaction links between the genes, activators (black arrows) orrepressors (red flat links). The outputs of the circuits (green genes) regulate the expressionlevels of other genes (in red) without affecting the circuit functionality. Here, grey arrowscorrespond to interactions with unknown functionality.
June 25, 2020 16/16 upplementary Materials:Circuits with broken fibration symmetries perform core logic computations in biological networks
Ian Leifer, Flaviano Morone, Saulo D. S. Reis, Jos´e S. Andrade Jr., Mariano Sigman, and Hern´an A. Makse
CONTENTS
I. Feed-forward loop: FFL 3A. FFL discrete time model 3B. FFL ODE 4C. FFL with OR gate 6II. Satisfied Feed-Forward Fiber: SAT-FFF 7A. SAT-FFF discrete time model 7B. SAT-FFF ODE 9III. Unsatisfied Feed-Forward Fiber: UNSAT-FFF 11A. UNSAT-FFF ODE 11B. Period-amplitude relationship 14C. UNSAT-FFF clock functionality 16IV. Examples of symmetry and broken symmetry circuits 17A. Symmetry circuits ( fibers ) from [24] 171. Repressor regulator link (stub) 172. Repressor AR loop: | , i | , i | . ..., i n = 2 Fiber: | , i a r X i v : . [ q - b i o . GN ] J un I. Algorithm to find fibers 21VII. Algorithm to find broken symmetry circuits 24References 272 . FEED-FORWARD LOOP: FFL
In what follows, we analyze in detail the FFL. First, we present the analytical solution forthe discrete time model of the FFL, where we show that the FFL does not synchronize noroscillates. After that, we reach the same conclusion by using a continuous variable approach.Finally, we show the discrete time solution with the OR gate. Solutions for the FFL havebeen considered in the literature. Here we adapt those results to the particular models usedin our studies to perform consistent comparisons with the solutions of the fiber dynamicsobtained through the paper.
A. FFL discrete time model
Starting from Eq. (1) in the main text, we define rescaled variables ψ t = y t /k y and ζ t = z t /k y , we rewrite Eq. (1) as ψ t +1 = βψ t + αηθ ( x t − k x ) ,ζ t +1 = βζ t + αλθ ( x t − k x ) θ ( ψ t − , (1)where we set β = (1 − α ), η = γ x /αk y , and λ = γ x γ y /αk y . Equation (1) defines an iterativemap ψ t +1 = f ( ψ t ) which provides a solution ψ t = f ( ψ t − ) = f ( ψ t − ) = . . . = f t ( ψ ) . (2)A closed form for ψ t depends on the value of x . For the sake of simplicity, consider x t = x constant in time. If x < k x , the solution is simple, and always decays as ψ t = ψ e − t/τ ,where we choose to write β t = e − t/τ such as τ − = − log(1 − α ). On the other hand, if x > k x , the iterative map is f ( ψ t ) = βψ t + η , leading to a solution that converges to η as ψ t = ψ e − t/τ + η (cid:0) − e − t/τ (cid:1) . Therefore, the solution to ψ t is given by: ψ t = ψ e − t/τ + η (1 − e − t/τ ) θ ( x t − k x ) . (3)Similarly, the solution for ζ t depends on x , but it also depends on ψ and η . For x < k x ,it always decays to zero as ζ t = ζ e − t/τ . When x > k x , the variable ζ t follows differentbehaviors. 3or ψ <
1, we also find a solution that decays as ζ t = ζ e − t/τ . However, if η > t ∗ such that ψ t ∗ >
1, which is given by t ∗ = d τ log(( η − ψ ) / ( η − e . Here, d x e denotes the smallest integer larger than x , e.g., d . e = 2. For t > t ∗ , as ψ t saturates to η , the rescaled variable ζ t converges to λ as ζ t = ζ e − ( t − t ∗ ) /τ + λ (cid:0) − e − ( t − t ∗ ) /τ (cid:1) .Next, we consider the case ψ >
1. In this case, the solution for ζ t is given by ζ t = ζ e − t/τ + λ (1 − e − t/τ ), for η >
1. In contrast, when η <
1, this solution is valid only up to t ∗ = d τ log(( ψ − η ) / (1 − η )) e , such that ψ t <
1. As ψ t saturates to η , the rescaled variable ζ t exponentially decays as ζ t = ζ e − ( t − t ∗ ) /τ .To summarize, the possible solutions for ζ t depending on ψ and η are: ψ > , η > → ζ t = ζ e − t/τ + λ (1 − e − t/τ ) ,ψ > , η < → ζ t = ζ e − t/τ + λ (1 − e − t/τ ) for t ∈ n , , . . . , t = l τ log ψ − η − η mo ,ζ t = ζ e − ( t − t ) /τ for t > t ,ψ < , η > → ζ t = ζ e − t/τ for t ∈ n , , . . . , t = l τ log η − ψ η − mo ,ζ t = ζ e − ( t − t ) /τ + λ (cid:0) − e − ( t − t ) /τ (cid:1) for t > t ,ψ < , η < → ζ t = ζ e − t/τ . (4)Here, ζ = ζ t = t .From this discussion, we find that the rescaled variables ψ t and ζ t do not synchronize.That is, they do not reach the same value at their fixed points: ψ t = ζ t when t → ∞ , unlesswe use a specific set of parameters. Moreover, ψ t and ζ t also do not reach oscillatory states.The same conclusion is extended to the original variables y t and z t . B. FFL ODE
In order to show that our results presented in the main text are consistent with a contin-uous variable approach, now we focus our attention on the modelling of the FFL [4, 20, 21]by using ordinary differential equations (ODE). First, we write the ODE governing the4ynamics of expression levels y ( t ) and z ( t ):˙ y ( t ) = − αy ( t ) + γ x θ ( x ( t ) − k x ) , ˙ z ( t ) = − αz ( t ) + γ x θ ( x ( t ) − k x ) × γ y θ ( y ( t ) − k y ) . (5)For the sake of simplicity, we consider x ( t ) = x constant in time.By using the rescaled functions ψ ( t ) = y ( t ) /k y and ζ ( t ) = k y , we rewrite Eq. (5) as thefollowing set of ODEs: ˙ ψ ( t ) + αψ ( t ) = αηθ ( x t − k x ) , ˙ ζ ( t ) + αζ ( t ) = αλθ ( x t − k x ) θ ( ψ t − , (6)where η = γ x /αk y and λ = γ x γ y /αk y .For the case of x < k x , Eqs. (6) become a set of homogeneous ODEs with solutions thatdecay exponentially: ψ ( t ) x
We present the solution for the FFL [4, 20, 21] with an OR gate. We consider the coherentversion cFFL where all regulations are activators. The discrete-time dynamics of expressionlevels y t and z t of genes Y and Z in the cFFL with a Boolean OR gate are given by thefollowing pair of difference equations: y t +1 = (1 − α ) y t + γ x θ ( x t − k x ) ,z t +1 = (1 − α ) z t + γ x θ ( x t − k x ) + γ y θ ( y t − k y ) , (10)where α , γ y , and γ z have the same definition as in the main text. By adopting the samerescaled variables, ψ t = y t /k y and ζ t = z t /k y , we rewrite Eq. (10) as: ψ t +1 = βψ t + αηθ ( x t − k x ) ,ζ t +1 = βζ t + αλ x θ ( x t − k x ) + αλ x θ ( ψ t − , (11)where we have β = (1 − α ), η = γ x /αk y , λ x = γ x /αk y , and λ y = γ y /αk y . Again, withoutlack of generality, we consider x t = x constant in time. Clearly, the solution for ψ t is notaffected by the OR gate, therefore, it depends only on the value of x and is given by: ψ t = ψ e − t/τ + η (1 − e − t/τ ) θ ( x t − k x ) . (12)Again, τ = − log(1 − α ). Thus, if x < k x , ψ t exponentially decays to zero. On the contrary,if x > k x , ψ t converges to η . The solution for ζ t displays different solutions, depending onthe combination of x and the initial value ψ .First, consider the case of x < k x . For ψ <
1, we find a solution that always decays as ζ t = ζ e − t/τ . When ψ >
1, the solution for ζ t is given by ζ t = ζ e − t/τ + λ y (cid:0) − e − t/τ (cid:1) . However,since x < k x and ψ t exponentially decays, this solution is valid only until t ∗ = d τ log ψ e ,the instant that ψ t ∗ <
1. As ψ t decays, ζ t also decays as ζ t = ζ t ∗ e − ( t − t ∗ ) /τ .We consider next the case of x > k x . In this case, ψ t always converges to η . Therefore,the solution for ζ t depends not only on ψ , but also on η . For ψ < η <
1, we find6hat ζ t converges to λ x as ζ t = ζ e − t/τ + λ x (cid:0) − e − t/τ (cid:1) . However, if η >
1, this solution isnot valid after t ∗ = d τ log ( ψ − η ) / (1 − η ) e , when ψ t >
1. In this case, as ψ t saturates to η , ζ t converges to λ x + λ y as ζ t = ζ t ∗ e − ( t − t ∗ ) /τ + ( λ x + λ y ) (cid:0) − e − ( t − t ∗ ) /τ (cid:1) .For the case of ψ > η >
1, the rescaled variable ζ t always converges to λ x + λ y as ζ t = ζ e − t/τ + ( λ x + λ y ) (cid:0) − e − t/τ (cid:1) . However, if η <
1, this solution ceases to bevalid at t ∗ = d τ log (( η − ψ ) / ( η − e . Therefore, for this case ζ t converges to λ x as ζ t = ζ t ∗ e − ( t − t ∗ ) /τ + λ x (cid:0) − e − ( t − t ∗ ) /τ (cid:1) .From the discussion above, it is clear that the rescaled variables ψ t and ζ t neither synchro-nize nor oscillate. Extending this results to the original variables y t and z t , we conclude thatthe expression levels of the genes Y and Z also do not synchronize: y t = z t when t → ∞ . II. SATISFIED FEED-FORWARD FIBER: SAT-FFF
Below, we describe the solutions of the SAT FFF circuit in the discrete time continuousvariable model and in the ODE model.
A. SAT-FFF discrete time model
The SAT-FFF is a Feed-Forward Fiber with activator autoregulation where all inter-actions are satisfied. That is, it does not present the phenomenon of frustration and thedynamics converges to a fixed point. This can be simply seen by considering gene X high,which makes also gene Y high and Z high too. Finally, the configuration satisfies the ARloop, so all bonds are satisfied. The SAT-FFF is constructed on top of the cFFL by theaddition of an autoregulator loop on the Y gene, as depicted in Fig. 1E (main text) and S1Fig. 1A. The discrete time dynamics of the SAT-FFF with a logic interaction term is givenby: y t +1 = (1 − α ) y t + γ x θ ( x t − k x ) × γ y θ ( y t − k y ) ,z t +1 = (1 − α ) z t + γ x θ ( x t − k x ) × γ y θ ( y t − k y ) . (13)7ote that the Heaviside function θ ( y t − k y ) represents the activator feedback on the autoreg-ulation of the Y gene. We consider an AND gate for the interactions [4]. Analogous resultscan be obtained for OR gates or with an first-order ODE model. Writing down the set ofequations for the rescaled variables ψ t = y t /k y and ζ t = z t /k y , we get: ψ t +1 = βψ t + αλθ ( x t − k x ) θ ( ψ t − ,ζ t +1 = βζ t + αλθ ( x t − k x ) θ ( ψ t − . (14)Here, we made use of λ = γ x γ y /αk y and β = (1 − α ). We note that, since the secondterm on the right-hand side of both equations are equal, the dynamical variables ψ t and ζ t must synchronize, as well as y t and z t . Again, considering x t = x constant, for x < k x , thesolutions for ψ t and ζ t are trivial: both variables decay exponentially as ψ t = ψ e − t/τ and ζ t = ζ e − t/τ , where τ − = − log(1 − α ). This behavior is shown by the red solid line in S1Fig. S1B with ψ = 0 . ψ t with x > k x is: ψ t +1 = βψ t + αλθ ( ψ t − ≡ f ( ψ t ) , (15)so we find f t ( ψ ) = f t − ( βψ ) θ (1 − ψ ) + f t − ( βψ + λ ) θ ( ψ − . (16)This iterative map ψ t = f ( ψ t ) provides different solutions depending on ψ . Similar to thecase of x < k x , if ψ <
1, the solution decays to zero as ψ t = ψ e − t/τ . However, if ψ > λ = γ x γ y /αk y .First, if λ >
1, the solution for both rescaled variables converges to λ as ψ t = ψ e − t/τ + λ (1 − e − t/τ ) and ζ t = ζ e − t/τ + λ (1 − e − t/τ ), such that ψ t →∞ → λ and ζ t →∞ → λ , as presentedby the blue dash-dotted line in S1 Fig. S1B. For this case, we use ψ = 1 . λ = 2.For λ < ψ t approaches 1 at a time t ∗ given by t ∗ = & − α ) log − λψ − λ !’ . (17)Then, for t > t ∗ , the solutions decay to zero as ψ t = e − ( t − t ∗ ) /τ and ζ t = ζ t ∗ e t − t ∗ ) /τ . Thisbehavior is presented on S1 Fig. S1B by the dashed green line, where we use ψ = 2 and λ = 0 .
9. The rescaled variables ψ t and ζ t always synchronize, so do y t and z t . This can beproved by finding the difference (cid:15) t = ψ t − ζ t . For all the cases discussed above, (cid:15) t decaysexponentially fast as (cid:15) t = ( ψ − ζ ) e − t/τ . 8 YZ A B
FIG. 1.
SAT-FFF. a,
Network representation of the SAT-FFF. All regulations are activators. b, Different behaviors for the analytical solutions of ψ t depending on ψ and λ , as discussed in thetext. Now, we can use the solution with x t constant to qualitatively understand the SAT-FFFin general. An example of the SAT-FFF with non-constant x t is depicted on Fig. 1F in themain text. As shown, variable y t and z t do synchronize but with no internal oscillations. Wefeed an external oscillatory pattern of x t as a square wave. For x t < k x , both y t an z t decayexponentially. When x t > k x , they tend to saturate at γ x γ y /α . The SAT-FFF synchronizesat a fixed point. B. SAT-FFF ODE
Here, we consider the ODE model of SAT-FFF to confirm results presented in SectionII A. The dynamics of gene X is driven by outside sources, so we only consider the dynamicsof genes Y and Z which are described by equations: ˙ y = − αy ( t ) + γ x θ ( x ( t ) − k x ) × γ y θ ( y ( t ) − k y ) , ˙ z = − αz ( t ) + γ x θ ( x ( t ) − k x ) × γ y θ ( y ( t ) − k y ) . (18)Taking ψ ( t ) = y ( t ) /k y , ζ ( t ) = z ( t ) /k y and δ = γ x γ y /k y we transform Eq. (18) to:9 ˙ ψ = − αψ ( t ) + δ θ ( x ( t ) − k x ) × θ ( ψ ( t ) − , ˙ ζ = − αζ ( t ) + δ θ ( x ( t ) − k x ) × θ ( ψ ( t ) − . (19)Without loss of generality, we consider the case of x ( t ) = x constant in time. If x < k x ,then the solution of Eq. (19) is given by: ψ ( t ) x
1) (21)Due to the existence of isomorphic input trees between both genes, ψ ( t ) and ζ ( t ) syn-chronize and therefore y ( t ) and z ( t ) synchronize also, so we only consider dynamics of thefirst equation: ˙ ψ = − αψ ( t ) + δ θ ( ψ ( t ) − . (22)It’s easy to see that for ψ <
1, Eq. (20) will be the solution of Eq. (22). When ψ > δ/α >
1, the solution is given by: ψ ( t ) = δ/α + ( ψ − δ/α ) e − αt . (23)In the case ψ > δ/α <
1, the dynamics of ψ is split into two parts. One partbefore ψ decays to 1 and the other one after ψ crossed 1. The time when ψ ( t ) crosses 1 isequal to: t c = 1 α ln ( ψ − δ/α − δ/α ) , (24)and the dynamics can be written as: t ∈ [0 , t c ] ψ ( t ) = δ/α + ( ψ − δ/α ) e − αt ,t ∈ [ t c , ∞ ] ψ ( t ) = ψ − δ/α − δ/α e − αt . (25)10o summarize, the solution is: ψ < ψ ( t ) = ψ e − αt ψ > , δα > ψ ( t ) = δ/α + ( ψ − δ/α ) e − αt ψ > , δα < t ∈ [0 , t c ] ψ ( t ) = δ/α + ( ψ − δ/α ) e − αt t ∈ [ t c , ∞ ] ψ ( t ) = ψ − δ/α − δ/α e − αt . (26)This solution is analogous to the one obtained for the discrete time model in Section II A. III. UNSATISFIED FEED-FORWARD FIBER: UNSAT-FFF
Now, we turn our attention to the UNSAT-FFF. The solution of this circuit is developedin the main text using a discrete-time difference equation with a logistic interaction. Belowwe elaborate on the solution of the ODE continuum model and on the conditions on theperiod-amplitude relation and the clock functionality.
A. UNSAT-FFF ODE
The UNSAT-FFF circuit can be reduced to study the base of the circuit since both genesY and Z synchronize their behaviour as shown in the main text. The base of this circuit isa negative autorregulation loop (Fig. 3) plus an external regulator given by X. This circuithas been synthetically implemented by Stricker et al. in Ref. [18] using a promoter thatdrives the expression in the absence of LacI (and acts as a negative feedback loop) or in thepresence of IPTG, which acts as an activator. It was shown experimentally that this circuitleads to oscillatory behaviour in the expression profiles. This result was corroborated witha dynamical ODE model in [18] which here we adapt to study the case of the UNSAT-FFFwith ODE. See also the review paper [53] for further reading.Following the same approach as above, we consider gene x ( t ) = x constant in time andlarger than x > k x , and rescale the expression of genes y ( t ) and z ( t ) as ψ ( t ) = y ( t ) /k y and11 ( t ) = z ( t ) /k z . Since genes y ( t ) and z ( t ) synchronize their activities, then only one equationneeds to be considered, ψ ( t ).The key to observe oscillations in a first-order ODE is to consider the delay in the signalpropagation in the circuit. Without delay the dynamics converge to a fixed point; no oscil-latory solution exist in a first-order ODE continuum-time model. The situation is differentin the discrete-time model considered in the text. In this case, a discrete time plus a logicapproximation lead to oscillations.Negative feedback loop circuits with delays have been widely investigated in the dynam-ical systems literature. Here, we adapt the negative feedback loop model with delay ofStricker et al. (see Eq. (6) in Supplementary Information in Ref. [18]). We consider delaysin the negative feedback loop which is the key feature to explain the experimentally observedrobust oscillation in this circuit [18].Delays in a biological circuit arise from the combined processes of intermediate stepslike transcription, translation, folding, multimerization and binding to DNA. This series ofbiological processes are lumped into a single arrow between two genes in the network rep-resentation of the circuit. In reality this arrow represents processes that should be modeledin detail. These biological processes can be approximately taken into account by a delayin the interaction term in the dynamical equations. The interaction term can be writtenas δ θ (1 − ψ ( t − τ )), where τ represents the delay caused by the fact that the process ofself-repression is not instant. Therefore, the dynamics of ψ ( t ) are described by a first-orderdelay-differential equation (DDE) [18] of the form:˙ ψ = − αψ ( t ) + δ θ (1 − ψ ( t − τ )) , (27)where τ represents the delay caused by expression process.We derive analytical solutions to this equation following a procedure outlined in [54](Chapter V). We start by noting that initial conditions used to solve a DDE are not givenby the value of the function at one point, but rather by a set of values of the function on aninterval of length τ . The solution of a DDE can’t be thought of as a sequence of values of ψ ( t ) as in an ODE, but rather as a set of functions { f ( t ) , f ( t ) , f ( t ) , . . . , } , defined over aset of contiguous time intervals { [ − τ, , [0 , τ ] , [ τ, τ ] , . . . , } .Let’s consider Eq. (27) with initial function f ( t ) for t ∈ [ − τ, t ∈ [0 , τ ] Eq.(27) looks like: 12 ψ = − αψ ( t ) + δ θ (1 − f ( t − τ )) . (28)Moving the degradation term to the left and multiplying by e αt we get:˙ ψe αt + αψ ( t ) e αt = δ e αt θ (1 − f ( t − τ )) . (29)Re-writing the left part, we obtain: d ( ψe αt ) dt = δ e αt θ (1 − f ( t − τ )) , (30)and integrating on the interval R t , we get: ψe αt − ψ (0) = δ Z t e αt θ (1 − f ( t − τ )) dt . (31)Considering that ψ is continuous at 0 ( ψ (0) = f (0)) and ψ ( t ) for t ∈ [0 , τ ] is given by f ( t ): f ( t ) = f (0) e − αt + δ Z t e α ( t − t ) θ (1 − f ( t − τ )) dt, (32)then, following the same procedure, we can derive the general formula for finding the solution ψ ( t ) on the interval [ kτ, ( k + 1) τ ], assuming that the solution on the previous interval [( k − τ, kτ ] is given by f k − ( t ). We then need to solve the following iterative equation:˙ ψ = − αψ ( t ) + δ θ (1 − f k − ( t − τ )) . (33)The solution of this equation can be found by applying the integrating factor methodintegrating on R tkτ . We obtain: ψ ( t ) = ψ ( kτ ) ∗ e α ( kτ − t ) + δ Z tkτ e α ( t − t ) θ (1 − f k − ( t − τ )) dt . (34)Using Eq. (34) we can recursively find functions { f ( t ) , f ( t ) , f ( t ) , . . . , } on the intervalof interest, which provide the solution to Eq. (27). Using Wolfram Mathematica we findfunctions on the interval t ∈ [ − τ, τ ] for f = 2, α = 0 . δ = 1 and τ = 1 and put themtogether to find the solution plotted in S1 Fig. S2A.We note from Eq. (34) that all functions f k are written as the sum of an exponentialfunction and a constant. By looking at Eq. (27), we see that when the Heaviside function13 B FIG. 2.
UNSAT-FFF delay ODE model. A,
Solution of Eq. (27) using recursive Eq. (34) on t ∈ [ − τ, τ ] for f = 2, α = 0 . δ = 1 and τ = 1. B, One period of the oscillation of solution in (A) consisting of two exponential pieces. is equal to zero, we get a solution that decays exponentially to zero. Likewise, when theHeaviside function is equal to 1, we get a solution that exponentially grows to δα . In otherwords, the solution will grow until θ (1 − ψ ( t − τ )) changes to zero (i.e., when ψ ( t − τ ) > θ (1 − ψ ( t − τ )) changes to one (i.e., when ψ ( t − τ ) will cross 1 again,but from different side). Therefore, we get oscillations consisting of two exponential pieces.One period of the oscillation is shown in S1 Fig. S2B. The solution on this interval is givenby: t ∈ [4 . , . ψ ( t ) = 5 − . ∗ e − . t t ∈ [5 . , . ψ ( t ) = 5 . ∗ e − . t , (35)which is the predicted behavior. Additionally, we note that this circuit functions as acapacitor charging and discharging in an RC circuit. B. Period-amplitude relationship
As shown in the main text, the solution of the discrete-time Boolean interaction modelfor λ > A ψ for the rescaled variable ψ t , we recall that the iterative map ψ = f ( ψ ) satisfies the recursive equation: f t ( ψ ) = f t − ( βψ ) θ ( ψ −
1) + f t − ( βψ + αλ ) θ (1 − ψ ) . (36)14 B C
Period-Amplitude relation
FIG. 3.
Period-amplitude relationship. a,
Solutions for α = 0 . λ = 1 .
01. The values for A ψ = 0 .
202 and T = 15 obtained with the use of Eq. (37) and Eq. (39) perfectly agree with theones found by numerical simulations. b, Period-amplitude relationship in terms of the original setof parameters α , k y , γ x , and γ y . c, Period of oscillations as a function of λ for different values of α . Thus, the amplitude of oscillations A ψ is given by A ψ = lim ψ → − f ( ψ ) − lim ψ → + f ( ψ ) = αλ, (37)which implies that A ψ = γ x γ y k y . (38)To find the period T of the oscillations, we recall from Eq. (6) in the main text that thesolution for the minimum value of ψ , ψ min <
1, evolves to its maximum value ψ max in T − ψ max = e − ( T − /τ ψ min + λ (cid:0) − e − ( T − /τ (cid:1) . Since ψ min = (1 − α ) ψ max , due to thefact that ψ max >
1, we find T = & τ log αλ − !’ , (39)where we used ψ max = 1. For example, using α = 0 . λ = 1 .
01, we find A ψ = 2 .
02 and T = 15, which agrees with the numerical simulation presented in S1 Fig. S3A.Equation (39) allows to define a rescaled amplitude A = ( λ − /α , and a reduced period T = ( T − /τ such that T = log (cid:18) A (cid:19) , (40)which corresponds to the period-amplitude relationship of the UNSAT-FFF. A plot ofthis relationship is shown in S1 Fig. S3B, where we plot ( T − /τ as a function of[( γ x γ y /αk y ) − /α . 15oming back to the original variable y t = k y ψ t , we have that the amplitude of oscillationsof y t , A = k y A ψ , is given by: A = γ x γ y , (41)and from Eq. (39), we can write the period of oscillations as a function of the original set ofparameters as T = 1 − − α ) log (cid:18) α ( γ x γ y /αk y ) − (cid:19) . (42)S1 Figure S3C shows the period of oscillations T as a function λ for α = 0 . α = 0 . α = 0 . C. UNSAT-FFF clock functionality
The clock functionality of the UNSAT-FFF can be understood by analyzing its responsefunction, i.e. the relation between oscillations at the input and at the output of the circuit.The amplitude A y , and period T of the oscillations are not independent like in the harmonicoscillator, but are related through a ‘period-amplitude’ relation expressed by Eq. (40) andS1 Fig. S3B. From Eq. (42), for α sufficiently small, T − ∼ k y γ x γ y = 1 A , (43)which constrains the ‘clock’ ( T ) of the circuit to the power ( A y ). As a consequence, A y and T cannot be controlled arbitrarily, and this (A-T) constraint helps to stabilize theUNSAT-FFF response against disturbance in the input X. For example, for a given availablepower supply, the system is constrained to dissipate this power, and when the UNSAT-FFFoscillates, it is automatically set to operate on an extended time window ( T large) at lowamplitude A when a small expression level is required ( A small) and vice-versa. Resultsfor the clock functionality of the Fibonacci Fiber and n = 2 Fiber can be carried out in asimilar manner. The idea is that Fibonacci fibers with longer and longer loops can carrymore robust oscillatory patterns than simple autoregulation negative feedback loops.16 V. EXAMPLES OF SYMMETRY AND BROKEN SYMMETRY CIRCUITSA. Symmetry circuits ( fibers ) from [24]
Our findings show that simple sub-graphs ubiquitous on gene regulatory networks areanalogous to symmetric electronic circuits which can work as clocks, revealing a hierarchy ofsymmetry circuits. Here, we describe these symmetric circuits in more detail following [24].Supplementary File 1 presents the full list of circuits found across the regulatory networksof
A. thaliana , M. tuberculosis , B. subtilis , E. coli , salmonella , yeast , mouse and humans.Below we enumerate the set of symmetric fibers found in Ref. [24] in the genetic networksof these species.
1. Repressor regulator link (stub)
We start by an isolate repressor link. As depicted in Fig. 3A, this repressor regulatoralone does not form an input tree, neither it forms a base . Since it works as a transistor, itslogic representation is a NOT gate.
2. Repressor AR loop: | , i In Fig. 3B, we show the repressor autoregulation loop. When a repressor link is foundas an AR loop, we have a genetic network with one single loop ( n = 1) and no externalregulator ( ‘ = 0), which we denote symbolically by | , i . Such simple network has aninput tree that feeds its own expression levels. Also, it is equivalent to its own base . Bythe analysis of the corresponding logical circuit, one can observe that it naturally oscillates,since it is a NOT gate that feeds itself.
3. UNSAT-FFF: | , i The repressor autoregulation loop introduces a symmetry between Y and Z that allows theexpression levels y t and z t to synchronize and oscillate. The increase of external regulatorsdoes not affect the complexity of UNSAT-FFF, since its dynamics remains restricted to thesole loop on the network. This can be verified in the corresponding input tree and in its17 ase . The collapse of Z into Y forms a base with n = 1 autoregulator and ‘ = 1 externalregulator. Because of the repressor feedback, the oscillation and synchronization of Y andZ are again evident in its logic circuit. In Fig. 3C, we use the NAND gate for representinggene Y and an AND gate for gene Z, but other gates (such as an OR logic gate) would resultin similar conclusions.
4. Fibonacci Fiber: | . ..., i The Fibonacci Fiber shown in Fig. 3D is characterized by the addition of a second feed-back loop. This regulatory network have a Fibonacci sequence Q t = Q t − + Q t − as aninput tree. The branching structure of the input tree implies that the Fibonacci Fiber canstore memory dynamically by the interaction with its past states, although it is continuallyerased in time. As shown in the logic circuit of Fibonacci Fiber, its structure can oscillateand synchronize, but is unable to store static information. This situation changes as soon aswe allow symmetries to break, which leads to a number of genetic circuits, as those depictedin Fig. 4. In Fig. 3D, we show examples of the Fibonacci Fiber on regulatory networks ofvarious species. Note the presence of its base in the network representation for B. subtilis (genes tnrA and glnR ), E. coli (genes uxuR and exuR ), M. tuberculosis (genes Rv0182c andRv3286c),
Salmonella (genes marA and soxS ), Yeast (genes Tec1 and Ste12), and in two net-works from human genetic network (the pair genes
PAX5 and
TP53 , for the first example,and gene
FOS and
CREM , for the second). n = 2 Fiber: | , i Starting from a Fibonacci Fiber, the addition of a second autoregulation on gene X resultsin a symmetric input tree. Besides, the genetic network of the n = 2 Fiber collapses intoa base with a single gene with two autoregulations (Fig. 3E). From the corresponding logiccircuit, one can conclude that it is possible to achieve synchronization between genes X, Yand Z. In Fig. 3E we show examples of the n = 2 Fiber from the regulatory networks of B.subtilis (the first with the pair of genes lexA and rocR , and the second with genes hprT , tilS and ftsH ). 18 . Symmetry breaking circuits Figure 4 shows the procedure to generate broken symmetry circuits. As we discuss inthe main text, this process starts by a replica symmetry operation where the base from agiven symmetry circuit is duplicated. The symmetry is broken by the addition of two inputgenes as regulators of the circuit. The resulting broken symmetry circuits are analogous toflip-flops circuits from digital electronics. Such circuits are able to store memory statically,playing a central role in the design of microprocessors. In what follows, we describe theAR, FFF, and Fibonacci broken symmetry circuits and their analogous electronic circuitsin detail. Supplementary File 1 presents the full list of circuits found across the regulatorynetworks of
A. thaliana , M. tuberculosis , B. subtilis , E. coli , salmonella , yeast , mouse andhumans.
1. AR symmetry breaking circuit: SR flip-flop
Through a replica symmetry duplication, gene Y ‘opens-up’ its AR loop into two mutuallyregulated genes Y and Y’. We break the symmetry relation between Y and Y’ by the inclusionof different inputs S and R as depicted in Fig. 4 (replica symmetry breaking), such thatS = R.The SR flip-flop does not provide synchronized outputs. After the input signals arrive atthe logic gates, each gate provides its output without waiting for the output of the other.This results in fast oscillations which, in the particular application to integrated circuits indigital electronics, are undesired. Then, in digital electronics, the input S = 0 and R = 0 issaid to be forbidden, since the NAND gates set both Q = 1 and Q = 1, which violates thelogical state Q = not Q . In Fig. 4, we use the NAND gates, but the use of a NOR gatesleads to similar conclusions.Two biological realizations of the AR symmetry breaking class are shown in Fig. 4,both from human regulatory networks with genes NFKB1 and
HOXA9 (upper), and theregulatory network of genes
IRF4 and
BCL6 (bottom). Gene
NFKB1 further regulates twogenes,
BST1 and
HAX1 as its outputs, but this regulation does not affect the functionalityof the flip-flop. 19 . FFF symmetry breaking circuit: Clocked SR flip-flop
Following the same strategy, we start with the base of the UNSAT-FFF and symmetrizeit through a replica symmetry duplication. Note that the replica symmetry of FFF adds asecond level of NAND logic gates to the SR flip-flop via gene X. In order to have consistentlogic operations, we add an input clock gene CLK which symbolizes the activation of geneX, since gene X needs to receive input for its activation. The resulting circuit is analogous tothe Clocked SR flip-flop after the addition of the input genes S and R. The second level flip-flop inverts the outputs of the previous SR flip-flop logic circuit, meaning that when S = 1and R = 0 ( S = 0 and R = 1), the circuit outputs Q = 1 and Q = 0 ( Q = 0 and Q = 1).The input S = 0 and R = 0 results in an unchanged state. When the input of gene CLKis low, CLK = 0, the output of the second level of both NAND gates outputs high signals,independently of the values of S and R, assuring that the outputs Q and Q remain unchanged.However, when the clock input is CLK = 1, it allows the first level of NAND gates to changethe outputs for S = R . The clocked SR flip-flop also has a forbidden state when S = 1and R = 1 in digital electronics. In Fig. 4, we show two biological realizations of theFFF broken symmetry breaking circuit, the set of genes { CLOCK, NR0B2, NR3C1, E2F1 and
TP53 } and the set { CEBPB, DDIT3, PRDM1, CEBPA and
MYC } , both examplesare from human regulatory genetic networks. The outputs of the flip-flops, like E2F1 and
TP53 , further regulate a set of genes each as indicated by the red genes in the figure. Theseregulatory interactions do not affect the functionality of the flip-flops since they are outgoinglinks.
3. Fibonacci symmetry breaking circuit: JK flip-flop
The replica symmetry duplication of the Fibonacci Fiber results in a logic circuit similarto the FFF broken symmetry breaking circuit. However, the regulation links Y → X and Y → X yields a different logic circuit, which is analogous to the Clocked JK flip-flop, wherethe input genes are now J and K. The additional links solve the unpredictable output forthe J = 1 and K = 1 case by commuting the values stored in Q and Q . Two examples ofthe Fibonacci broken symmetry circuits are shown in Fig. 4 for the sets of genes { PITX1,JUN, NKX3-1, TP53 and
ESR1 } and { FLI1, HDAC1, EPS300, AR and
RELA } , both from20uman genetic networks. We also show the set of genes regulated by these flip-flops. V. DESCRIPTION OF DATASETS
Datasets are described in S1 Table 1. We use a set of datasets of transcriptional regulatorynetworks found in the literature. All datasets are freely available from online sources. InSupplementary File 1 we present a plot of all found circuits. In the case of symmetric circuits,same colored nodes indicate the genes in the fiber . The external regulators are coloredwith different colors. We also present all the symmetry broken circuits across all species asindicated in the file. The statistics, count and Z-scores of the circuits are presented in Table 1and 2 in the main text. The file with all circuits can be found at https://bit.ly/2YM5x3H . VI. ALGORITHM TO FIND FIBERS
To obtain the set of nodes in the graph that belong to a fiber , we use the algorithmdescribed in detail by Kamei and Cock [41] and developed in Ref. [24] to obtain the ‘ minimalbalanced coloring ’ of the graph (referred as balanced coloring for simplicity), where we colorthe network by assigning a different color to each fiber .To understand what balanced coloring means in the context of graph theory, we need todefine the concept of input set (which is a part of the input tree). In a directed graph, theinput set of a given node is the set of nodes with edges pointing into that node. Thus, theinput set is the first layer of the input tree. Next, we define the Input Set Color Vector(ISCV) of a node as a vector of length equal to the number of colors in the graph, that isthe number of fibers . Each entry of the ISCV of a given node counts how many nodes ofeach color are in the input set of this node. The balanced coloring is achieved by iterationby increasing the number of colors (length of ISCV) until all nodes of the same color havethe same ISCVs. At this point, each color identifies each fiber in the graph.Finding graph balanced coloring is equivalent to finding node sets with isomorphic inputtrees. A brief explanation of that is the following. If two nodes have the same ISCVs,nodes of their input sets have same colors. Thus these nodes are said to belong to the sameequivalence class [41]. Inductively, the same can be said of the input set of the nodes in theinput set. This recurrent relation implies that two nodes that have the same input sets will21 pecies Database Additional informationArabidopsis Thaliana ATRM [32] We use high-confidence functionally confirmed transcriptional regulatoryinteractions from the ATRM database of the broadly used model plantArabidopsis. http://atrm.cbi.pku.edu.cn/
Micobacterium Tuberculosis Research article [33] Supplementary Information of Ref. [33]
Bacillus subtilis SubtiWiki [34] We download the database from SubtiWiki website and consider all re-pressor and activation links as “Repression” and “Activation”. Thisdatabase is considered the primary source of information for Bacillus. http://subtiwiki.uni-goettingen.de/
Escherichia coli RegulonDB [35] We use the TF - operon interaction network from [35]. RegulonDBcombines transcriptional regulator interactions obtained by curating lit-erature and using NLP high-quality data and partially confirmed ex-perimentally and computationally predicted data. http://regulondb.ccg.unam.mx/
Salmonella SL1344 SalmoNet [36] We use the regulatory layer of the strain Salmonella TyphimuriumSL1344. SalmoNet consists of manually curated low-throughput andhigh-throughput experiments and predictions based on experimentallyverified binding sites and TF-gene binding site data from RegulonDB. http://salmonet.org/
Yeast YTRP [38] We use the TF-gene regulatory and TF-gene binding networks. Resultsof the TFPEs (Transcription Factor Perturbation Experiments) identifythe regulatory targets of TFs. This is further refined by using literature-curated data. http://cosbi3.ee.ncku.edu.tw/YTRP/Home
Mouse TRRUST [39] Downloaded from TRRUST website. TRRUST is constructed usingsentence-based text mining of more than 20 million abstracts from re-search articles, refined by manual curation.
Human TRRUST [39] Downloaded from TRRUST website.TRRUST 2 [39] Downloaded from TRRUST website and curated.KEGG [40] We use KEGG API to download all pathways of Human gene regulatorynetwork. After that, all networks are put together and duplicates areremoved.
TABLE 1.
Description of dataset acquisition.
All data are gathered from publicly availablesources [32–40]. have isomorphic input trees and will belong to the same fiber (for rigorous proof see chapter4 in [42]). 22he algorithm to find balanced coloring was described in detail by Kamei and Cock [41].In their algorithm, all nodes of the graph have the same initial color and, through a series ofoperations, they are recolored until balanced coloring is reached. The detailed descriptionof the algorithm is as follows:1. Initially, all nodes are assigned the same color.2. Each node is assigned with the N-dimensional vector (ISCV), where N is currentnumber of colors at the current iteration of the algorithm. Each entry of this vector isthe number of nodes of certain color with ingoing link to the respective node. In thefirst iteration, N = 1 and each entry is the in-degree of the node.3. If vectors for all nodes of the same color are equal, the balanced coloring is achievedand the algorithm stops.4. Otherwise, if coloring is unbalanced, each unique vector is assigned a new color andthe graph is recolored accordingly.5. Steps 2-4 are repeated until condition in Step 3 is satisfied.For example consider the FFF graph in S1 Fig. S4A. Initially, we assign the same color,white, to all nodes. Then, we assign a 1-dimensional vector to each node which counts thein-degree of each node (S1 Fig. S4B). Since ISCVs of X and Y (which have the same color)are different, ISCV(X) = 0 and ISCV(Y) = 2, where the entry refers to the number of inputsof white color, then the condition in Step 3 is not satisfied. There are two unique ISCVs,thus only two new colors are necessary. Thus, a 2-dimensional vector is assigned to eachnode: ISCV(X) = (0 , , , https://github.com/makselab/fiberCodes . 23 YZX YZA BC D ISCV(X)0 1 10 11ISCV(Y) ISCV(Z)
Newcolor
FIG. 4. Illustration of the balanced coloring algorithm to find fibers . VII. ALGORITHM TO FIND BROKEN SYMMETRY CIRCUITS
To count the number of occurrences of broken symmetry circuits in a given graph we countthe number of appearances of induced subgraphs as defined in Refs. [43–45]. Subgraphsand induced subgraphs are graph-theoretical concepts introduced in the social and computersciences as applications to graph matching and pattern recognition [44]. We allow symmetrybreaking to come from any gene in the circuit. Let’s consider a graph G = { V, E } , where V isthe set of nodes and E is the set of links. For instance in S1 Fig. S5A V = { A, B, C, D, Y, Y } and E = { A → Y, Y → B, Y → Y , Y → Y, Y → C, D → Y } . A subgraph G = { V , E } ofa graph G = { V, E } is a graph such that V ⊂ V and E ⊂ E [43]. For example S1 Fig. S5Bis a subgraph of the graph in S1 Fig. S5A with V = { Y, Y } ⊂ V and E = { Y → Y } ⊂ E .An induced subgraph G = { V , E } of a graph G = { V, E } is a subgraph with a set of nodes V ⊂ V and all links E ⊂ E such that their heads and tails are in V . For example thesubgraph of S1 Fig. S5B is not induced graph of G since it is missing the link { Y → Y } .That is, S1 Fig. S5B is not an induced subgraph (but it is just a subgraph of G), because oneof the links { Y → Y } with endpoints in V is missing. However the graph of S1 Fig. S5Cis an induced subgraph of G since V = { Y, Y } and E = { Y → Y , Y → Y } are included,but links { A → Y, Y → B, Y → C, D → Y } don’t belong to G . The problem of findingbroken symmetry circuits consists on three steps: (1) identify a base and create a replicasymmetry circuit, (2) find subgraphs isomorphic to the replica symmetry circuit, and (3)remove subgraphs that are not induced. 24he first step is to find subgraphs isomorphic to the replica symmetry circuit. Let’sconsider an example. The matrix in S1 Fig. S5D represents the adjacency matrix A of thesymmetric part of the SR flip-flop (i.e., a toggle switch). That is, A is the replica symmetrypart (Fig. 4, third row) of the broken symmetry circuit. Similarly, S1 Fig. S5G,H showthe adjacency matrices of clocked SR flip-flop and JK flip-flop circuits. The general ideais to choose a subgraph and check if it is isomorphic to the circuit and continue doingthis for all possible subgraphs in the entire network. However, this task is computationallyexpensive. Even for circuits with 5 nodes, the computational time is N , where N is thetotal number of nodes in G , which means that for big enough graphs the algorithm cantake very long computational time. Different approaches to this problem have been widelystudied and are nicely reviewed in Ref. [46]. Time costs can be cut if unprofitable paths areidentified and skipped in the search space. One of the recent works in the field is the VF2algorithm developed by Cordella et al. [47]. It is designed to deal with large graphs and usesstate of the art techniques in order to reduce computational time. We use the algorithmimplemented in a popular R package igraph [48] as a function subgraph isomorphisms ( ... ).We provide the analysis and plotting scripts allowing to reproduce our results at https://github.com/makselab/CircuitFinder .The second step is to remove all the subgraphs that are not induced or, simply speaking,have extra links between the genes in the broken symmetry circuit. We follow this procedure:take a node set identified above, find the induced subgraph of the complete graph with thisnode set and compare the adjacency matrix of the induced subgraph with the adjacencymatrix of the circuit. If the matrices are different, then the circuit is removed. All remainingcircuits are the broken symmetry flip-flops that we are looking for. Multi-links and self-loopsare removed from the network prior to consideration.By applying the steps described above we get the full list of induced subgraphs that areisomorphic to the given circuit. 25 B CY'YGraph = G Subgraphof G Inducedsubgraph of GD SR flip-flopadjacency matrix
10 01
A =
G HClocked SR flip-flopadjacency matrix JK flip-flopadjacency matrixY'Y BAC D Y'Y54231 54231E FClocked SR flip-flop JK flip-flop
FIG. 5. Definition of the induced subgraph used to obtain the broken symmetry circuits. A, GraphG. B, Subgraph of G. C, Induced Subgraph of G. D, Adjacency matrix of SR flip-flop. E, ClockedSR flip-flop. F, JK flip-flop. G, Adjacency matrix of Clocked SR flip-flop. H, Adjacency matrixof JK flip-flop.
1] Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology.Nature. 1999;402: C47-C52.[2] Milo R, Shen-Orr SS, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs: simplebuilding blocks of complex networks. Science. 2002;298: 824-827.[3] Shen-Orr SS, Milo R, Mangan S, Alon U. Network motifs in the transcriptional regulationnetwork of
Escherichia coli . Nature Genet. 2002;31: 64-68.[4] Alon U. An Introduction to Systems Biology: Design Principles of Biological Circuits. BocaRaton: CRC Press; 2006.[5] Klipp E, Liebermeister W, Wierling C, Kowald A, Herwig R. Systems Biology: a textbook.Weinheim: Wiley-VCH; 2016.[6] Tyson JJ, Chen KC, Novak B. Sniffers, buzzers, toggles and blinkers: dynamics of regulatoryand signaling pathways in the cell. Curr Opin Cell Biol. 2003;15(2): 221-31.[7] Monod J, Jacob F. General conclusions: teleonomic mechanisms in cellular metabolism,growth and differentiation. Cold Spring Harb Symp Quant Biol. 1961;26: 389-401.[8] Teo JY, Woo SS, Sarpeshkar R. Synthetic Biology: A Unifying View and Review Using AnalogCircuits. IEEE Trans. on Biomed. Circuits and Syst. 2015;9: 453-474.[9] Dalchau N, Sz´e G, Hernansaiz-Ballesteros R, Barnes CP, Cardelli L, Phillips A, et al. Com-puting with biological switches and clocks. Natural Computing. 2018;17: 761-779.[10] Atkinson MR, Savageau MA, Myers JT, Ninfa AJ. Development of genetic circuitry exhibitingtoggle switch or oscillatory behavior in
Escherichia coli . Cell. 2003;113: 597-607.[11] Gardner TS, Cantor CR, Collins JJ. Construction of a genetic toggle switch in
Escherichiacoli . Nature. 2000;403: 339-342.[12] Kramer BP, Fussenegger M. Hysteresis in a synthetic mammalian gene network. Proc NatlAcad Sci USA. 2005;102: 9517-9522.[13] Kramer BP, Viretta AU, Baba MD-E, Aube D, Weber W, Fussenegger M. An engineeredepigenetic transgene switch in mammalian cells. Nature Biotech. 2004;22: 867-870.[14] Guet CC, Elowitz MB, Hsing W, Leibler S. Combinatorial synthesis of genetic networks.Science. 2002;296: 1466-1470.[15] Ajo-Franklin CM, Drubin DA, Eskin JA, Gee EP, Landgraf D, Phillips I, et al. Rational design f memory in eukaryotic cells. Genes Dev. 2007;21: 2271-2276.[16] Ham TS, Lee SK, Keasling JD, Arkin AP. Design and construction of a double inversionrecombination switch for heritable sequential genetic memory. PLoS One. 2008;3: e2815.[17] Elowitz MB, Leibler S. A synthetic oscillatory network of transcriptional regulators. Nature.2000;403: 335-338.[18] Stricker J, Cookson S, Bennett MR, Mather WH, Tsimring LS, Hasty J. A fast, robust andtunable synthetic gene oscillator. Nature. 2008;456: 516-519.[19] Tigges M, Marquez-Lago TT, Stelling J, Fussenegger M. A tunable synthetic mammalianoscillator. Nature. 2009;457: 309-312.[20] Mangan S, Alon U. Structure and function of the feed-forward loop network motif. Proc NatlAcad Sci USA. 2003;100: 11980-11985.[21] Mangan S, Zaslaver A, Alon U. The coherent feedforward loop serves as a sign-sensitive delayelement in transcription networks. J Mol Biol. 2003;334: 197-204.[22] Glass L, Kauffman SA. The logical analysis of continuous, non-linear biochemical controlnetworks. J Theor Biol. 1973;38: 103-129.[23] Mangan S, Zaslaver A, Alon U. Negative autoregulation increases the input dynamic-range ofthe arabinose system of Escherichia coli . BMC Sys Biol. 2011;5: 111.[24] Morone F, Leifer I, Makse HA. Fibration symmetries uncover the building blocks of biologicalnetworks. Preprint at https://bit.ly/2Z94B6o (2019).[25] Golubitsky M, Stewart I. Nonlinear dynamics of networks: the groupoid formalism. Bull AmMath Soc. 2006;43: 305-364.[26] Boldi P, Vigna S. Fibrations of graphs. Discrete Mathematics. 2001;243: 21-66.[27] Anderson PW. The concept of frustration in spin glasses. J of the Less-Common Metals.1978;62: 291-294.[28] Widlar RJ, Some circuit design techniques for linear integrated circuits. IEEE Trans CircuitTheory. 1965;4: 586-590. See also Widlar RJ: US Patent Number 3,320,439; Filed May 26,1965; Granted May 16, 1967: Low-value current source for integrated circuits and Widlar RJ.Design techniques for monolithic operational amplifiers. IEEE Solid-State Circuits. 1969;4:184-191.[29] Horowitz P, Hill W. The Arts of Electronics. 3rd ed. New York:Cambridge University Press;2015.
30] Weinberg S. The Quantum Theory of Fields. Cambridge: Cambridge University Press; 2005.[31] Morone F, Makse HA. Symmetry group factorization reveals the structure-function relationin the
Caenorhabditis elegans connectome. Nat Commun. 2019;10: 4961.[32] Jin J, He K, Tang X, Zhe L, Lv L, Zhao Y, et al. An Arabidopsis transcriptional regulatorymap reveals distinct functional and evolutionary features of novel transcription factors. MolBiol Evol. 2015;32: 1767-1773.[33] Bal´azsi G, Heath AP, Shi L, Gennaro ML. The temporal response of the Mycobacteriumtuberculosis gene regulatory network during growth arrest. Mol Syst Biol. 2008;4: 225.[34] Zhu, B. & St¨ulke J. SubtiWiki in 2018: From genes and proteins to functional network anno-tation of the model organism Bacillus subtilis. Nucleic Acids Res. 2017;46: D743-D748.[35] Gama-Castro S, Salgado H, Santos A, Ledezma-Tejeida D, Mu˜niz-Rascado L, Santiago G-S,et al. RegulonDB version 9.0: High-level integration of gene regulation, coexpression, motifclustering and beyond. Nucleic Acids Res. 2015;44: D133-D143.[36] M´etris A, Sudhakar P, Fazekas D, Demeter A, Ari E, Olbei M, et al. SalmoNet, an integratednetwork of ten Salmonella enterica strains reveals common and distinct pathways to hostadaptation. NPJ Systems Biology and Applications. 2017;3: 31.[37] Cherry JM, Hong EL, Amundsen C, Balakrishnan R, Binkley G, Chan E, et al. SaccharomycesGenome Database: The genomics resource of budding yeast. Nucleic Acids Res. 2011;40:D700-D705.[38] Yang T-H, Wang C-C, Wang Y-C, Wu W-S. YTRP: a repository for yeast transcriptionalregulatory pathways. Database 2014;2014: bau014.[39] Han H, Cho J-W, Lee S, Yun A, Kim H, Bae D, et al. TRRUST v2: An expanded referencedatabase of human and mouse transcriptional regulatory interactions. Nucleic Acids Res.2017;46: D380-D386.[40] Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resourcefor gene and protein annotation. Nucleic Acids Res. 2015;44: D457-D462.[41] Kamei H, Cock PJ. A. Computation of balanced equivalence relations and their lattice for acoupled cell network. SIAM J Appl Dyn Syst. 2013;12: 352-382.[42] Aldis JW. A polynomial time algorithm to determine maximal balanced equivalence relations.Int J Bifurc Chaos Appl Sci Eng. 2008;18: 407-427.[43] Weisstein EW, ”Subgraph”. From MathWorld–A Wolfram Web Resource. http:// athworld.wolfram.com/Subgraph.html [44] Harary F. Graph Theory. Reading: Addison-Wesley; 1994.[45] https://en.wikipedia.org/wiki/Induced_subgraph [46] Conte D, Foggia P, Sansone C, Vento M. Thirty years of graph matching in pattern recognition.Int J Pattern Recognit Artif Intell. 2004;18: 265-298.[47] Cordella LP, Foggia P, Sansone C, Vento M. A (sub)graph isomorphism algorithm for matchinglarge graphs. IEEE Trans Pattern Anal Mach Intell. 2004;26: 1367-1372.[48] Csardi G, Nepusz T. The Igraph software package for complex network research. Inter Journal,Complex Systems. 2006;1695: 1-9.[49] Ingram PJ, Stumpf MP, Stark J. Network motifs: structure does not determine function.BMC Genomic. 2006;7: 108.[50] Payne JL, Wagner A. Function does not follow form in gene regulatory circuits. Sci Rep.2015;5: 13015.[51] Mac´ıa J, Widder S, Sol´e R. Specialized or flexible feed-forward loop motifs: a question oftopology. BMC Syst Biol. 2009;3: 84.[52] Ahnert SE, Fink TMA. Form and function in gene regulatory networks: the structure ofnetwork motifs determines fundamental properties of their dynamical state space. J RoyalSoc Interface. 2016;13: 20160179.[53] Purcell O, Savery N, Grierson C, Di Bernardo M. A comparative analysis of synthetic geneticoscillators. J Royal Soc Interface. 2010;7: 1503-24.[54] Driver RD. Ordinary and delay differential equations. New York: Springer Verlag; 1977.[46] Conte D, Foggia P, Sansone C, Vento M. Thirty years of graph matching in pattern recognition.Int J Pattern Recognit Artif Intell. 2004;18: 265-298.[47] Cordella LP, Foggia P, Sansone C, Vento M. A (sub)graph isomorphism algorithm for matchinglarge graphs. IEEE Trans Pattern Anal Mach Intell. 2004;26: 1367-1372.[48] Csardi G, Nepusz T. The Igraph software package for complex network research. Inter Journal,Complex Systems. 2006;1695: 1-9.[49] Ingram PJ, Stumpf MP, Stark J. Network motifs: structure does not determine function.BMC Genomic. 2006;7: 108.[50] Payne JL, Wagner A. Function does not follow form in gene regulatory circuits. Sci Rep.2015;5: 13015.[51] Mac´ıa J, Widder S, Sol´e R. Specialized or flexible feed-forward loop motifs: a question oftopology. BMC Syst Biol. 2009;3: 84.[52] Ahnert SE, Fink TMA. Form and function in gene regulatory networks: the structure ofnetwork motifs determines fundamental properties of their dynamical state space. J RoyalSoc Interface. 2016;13: 20160179.[53] Purcell O, Savery N, Grierson C, Di Bernardo M. A comparative analysis of synthetic geneticoscillators. J Royal Soc Interface. 2010;7: 1503-24.[54] Driver RD. Ordinary and delay differential equations. New York: Springer Verlag; 1977.