Digital clocks: simple Boolean models can quantitatively describe circadian systems
Ozgur Akman, Steven Watterson, Andrew Parton, Nigel Binns, Andrew Millar, Peter Ghazal
aa r X i v : . [ q - b i o . M N ] O c t Digital clocks: simple Boolean models can quantitativelydescribe circadian systems
Ozgur E. Akman , , ˚ , Steven Watterson , , Andrew Parton , ,Nigel Binns , Andrew J. Millar , : and Peter Ghazal , , : Centre for Systems, Dynamics and Control, College of Engineering, Computing and Mathematics,University of Exeter, Harrison Building, North Park Road, Exeter EX4 4QF, United Kingdom. Division of Pathway Medicine, University of Edinburgh Medical, Chancellors Building, 49 LittleFrance Crescent, Edinburgh EH16 4SB, United Kingdom. SynthSys Edinburgh, University of Edinburgh, CH Waddington Building, Kings Buildings, May-field Road, Edinburgh EH9 3JD, United Kingdom. Department of Mathematics, University of Edinburgh, James Clerk Maxwell Building, KingsBuildings, Mayfield Road, Edinburgh EH9 3JZ, United Kingdom. ˚ Corresponding author. E-mail: [email protected]; Tel: +44 1392 724 060; Fax: +44 1392 217965. : These two authors are the joint senior authors of this paper.Keywords: systems biology; circadian gene networks; Boolean logic; photoperiodism;
Arabidopsisthaliana
Running title: Digital clocks.
Abstract
The gene networks that comprise the circadian clock modulate biological function across a rangeof scales, from gene expression to performance and adaptive behaviour. The clock functions bygenerating endogenous rhythms that can be entrained to the external 24-hour day/night cycle,enabling organisms to optimally time biochemical processes relative to dawn and dusk. In recentyears, computational models based on differential equations have become useful tools for dissectingand quantifying the complex regulatory relationships underlying the clock’s oscillatory dynamics.However, optimising the large parameter sets characteristic of these models places intense demandson both computational and experimental resources, limiting the scope of in silico studies.Here, we develop an approach based on Boolean logic that dramatically reduces the parametri-sation, making the state and parameter spaces finite and tractable. We introduce efficient methodsfor fitting Boolean models to molecular data, successfully demonstrating their application to syn-thetic time courses generated by a number of established clock models, as well as experimentalexpression levels measured using luciferase imaging. Our results indicate that despite their relative1implicity, logic models can: (i) simulate circadian oscillations with the correct, experimentally-observed phase relationships amongst genes; and (ii) flexibly entrain to light stimuli, reproducingthe complex responses to variations in daylength generated by more detailed differential equationformulations. Our work also demonstrates that logic models have sufficient predictive power toidentify optimal regulatory structures from experimental data.By presenting the first Boolean models of circadian circuits together with general techniquesfor their optimisation, we hope to establish a new framework for the systematic modelling of morecomplex clocks, as well as other circuits with different qualitative dynamics. In particular, weanticipate that the ability of logic models to provide a computationally efficient representation ofsystem behaviour could greatly facilitate the reverse-engineering of large-scale biochemical networks.
Circadian rhythms are the fundamental daily oscillations in metabolism, physiology and behaviourthat occur in almost all organisms, ranging from cyanobacteria to humans [1]. The gene regulatorynetworks (GRNs), or clocks , that generate these rhythms regulate the expression of associated genesin roughly 24-hour cycles. Circadian networks have been studied in a variety of experimentallytractable model systems, revealing that different organisms share structurally similar circuits basedaround interlocking sets of positive and negative gene-protein feedback loops [2–4]. In mammals,circadian rhythms are being increasingly recognised as important to healthy phenotypes, playing arole in ageing [5], cancer [6], vascular disease [7] and psychiatric disorders [8], as well as modulatinginnate immunity [9–11].Clocks synchronise to their environment by using light and temperature to regulate the levelsof one or more components of the feedback loops. This ensures that key biological processesare optimised relative to dawn and dusk, benefiting growth and survival [12, 13]. For the clock toprovide such an adaptive advantage, the phase must change appropriately when the clock is subjectto regular perturbations - particularly seasonal changes in daylength (the photoperiod ). However, aswell as exhibiting flexible responses to variations in the input light signal, the clock must also exhibitrobustness to irregular perturbations, such as genetic mutations and the intrinsically stochasticenvironment of the cell.Temperature also plays a critical role as an environmental time cue. Across different species, theclock is relatively insensitive to temperature in that the period of free-running oscillations typicallyhas a Q value close to 1 [14–16]. This latter phenomenon, known as temperature compensation,is generally considered to be one of the defining properties of the circadian clock and has beensuggested to be a key requirement for stability of the clock’s phase relationship under seasonaltemperature variations [17, 18].The ability of ordinary and delay differential equations ( DE s) to reproduce the underlyingcontinuous dynamics of biochemical networks, and to parametrise individual reactions, has led tothe construction of DE models in a number of circadian organisms. These include the fungus Neurospora crassa [19–25], the fly
Drosophila melanogaster [26–29], the mammal
Mus musculus [30–33] and the higher plant
Arabidopsis thaliana [34–38]. Such models have proven useful inuncovering the general design principles of circadian oscillators, as well as providing a quantitativeframework within which to interpret experimental results [4, 38]. In particular, novel insights havebeen gained into the mechanisms promoting robustness with respect to photoperiod changes [25],2emperature fluctuations [18, 23] and molecular noise [39–41]. The DE models have also yieldedexperimentally-testable predictions that have lead to the discovery of novel circadian regulators[36].However, a significant drawback of the DE approach is that the values of the kinetic parameterscontrolling each individual reaction have to be specified, and for clocks these are typically unknown.When constructing a DE model, it is therefore necessary to calculate the particular combination ofparameter values giving an optimal fit to experimental data [18, 25, 34–37]. For realistic systemsinvolving large numbers of reactions, this optimisation procedure is computationally very expensivemaking exhaustive parameter searches intractable. With increasing parameter numbers also comesa need for data with which to constrain the optimisation, placing a greater demand on experiment interms of finance, time and ethics. These concerns mean that there is a pressing need for modellingapproaches that minimise the number of parameters required, whilst adequately capturing theessential dynamical behaviour of the system of interest.Here we develop just such an approach, based on Boolean logic. In Boolean models, the activityof each gene is described with a two-state variable taking the value ON (1) or OFF (0), meaningthat its products are present or absent respectively. Biochemical interactions are represented bysimple, binary functions which calculate the state of a gene from the activation state of its upstreamcomponents [42–50]. This approximation dramatically reduces the state-space of the system, map-ping the infinite number of different continuous system states in a DE model to a finite number ofdiscrete states in the Boolean equivalent.An additional important advantage of using a logic approach is that the total number of pa-rameters is substantially reduced. For a given gene, the full set of reactions determining its statethrough a particular interaction is parametrised by a single signalling delay , representing the nettime taken for these reactions to cause a change in state. Fitting to experimental data introducesan associated discretisation threshold , with expression levels above the threshold taken to corre-spond to the ON state of the gene and levels below it to the OFF state [46–48, 50]. In fitting aparticular experimental data set, each delay becomes a multiple of the sampling interval, while onlya bounded subset of thresholds will yield distinct Boolean expression patterns. This means that thetotal number of parameter combinations is finite and can be enumerated. Thus, by building a logicversion of a DE model, an infinite model can be converted into a finite one with fewer parametersto be optimised. This extends the scale and complexity of GRNs that can be studied by Booleanmodels far beyond the practical scope of DEs.In this work, we introduce the first Boolean models of circadian networks. By constructinglogic analogs of a number of established DE clock models, we demonstrate that in each case theBoolean models are capable of accurately reproducing the higher-order properties - particularlyphotoperiod responses - of their DE counterparts. This suggests that the complex, biologicalsignal transduction simulated by the DE models can be captured in Boolean equivalents possessingsignificantly smaller parameter sets. We introduce a general method for optimising Boolean modelsthat avoids the qualitative and often subjective terms characteristic of the cost functions used tofit the parameters of large DE clock models. Furthermore, we show that our fitting algorithm iscapable of determining the optimal Boolean model configuration associated with a given circuittopology and experimental data set. In particular, our algorithm successfully predicts the recentlydiscovered repressive action of the circadian gene TOC1 on LHY in the central feedback loop ofthe
Arabidopsis clock [51]. 3aken together, our results show that Boolean models can quantitatively distinguish betweena range of putative regulatory structures on the basis of the system dynamics. This identifiesBoolean logic as a viable technique for reverse engineering circadian networks, complementingapproaches based on DEs. Moreover, our work also suggests novel hybrid modelling approachesbased on employing Boolean models as a first step towards the construction of more detailedDE formulations. More generally, we propose that our methodology provides an efficient wayof systematically modelling complex signalling pathways, including other oscillatory circuits andsystems characterised by steady-state dynamics.
We selected 4 recent circadian oscillator models of increasing complexity with which to assess thesuitability of a Boolean formulation. The simplest of these was a
Neurospora model based on asingle negative feedback loop with a single light input [19] (Figure 1A). This is represented by3 differential equations ( DE s) parametrised by 13 kinetic constants. The second model was amodified version of the 1-loop Neurospora circuit in which there are a pair of negative feedbackloops associated with different isoforms of the active protein [18] (Fig. 1B). The extra feedback loopresults in 5 DEs parametrised by 18 kinetic constants. The third model was an
Arabidopsis circuitbased on a pair of interlocking feedback loops with 3 light inputs [35] (Fig. 1C). It is described by13 DEs together with 64 kinetic constants. The final model considered was a 3-loop
Arabidopsis circuit obtained by adding an extra feedback loop and light input to the 2-loop system [36] (Fig.1D). This yields 16 DEs parametrised by 80 kinetic constants.In a Boolean model, an interaction j between two components X i and X k is quantified by thecorresponding signalling delay τ j . This is the time taken for the biochemical processes representedby j to convert a change in the state of X i into a change in the state of X k (see Fig. S1 of theElectronic Supplementary Material). The signalling delays are thus parameters that determinethe dynamics of the model, with different combinations of delays yielding different attractor states(e.g. steady-states or limit cycles) [43, 44, 52, 53]. In order to relate the discrete dynamics tothe continuous variations in expression level observed experimentally, it is necessary to introduce adiscretisation process. Supplementary Fig. S2A shows a hypothetical time course and Supp. Fig.S2B the result of applying a particular discretisation threshold. In the general case, the choice ofthreshold is dependent on the effective range of activity and expression that is critical for signalpropagation. As a result, each threshold also becomes a parameter of the logic model [46, 47, 50].For logic descriptions of the circadian networks shown in Fig. 2, each edge is parametrised bya signalling delay τ j and each vertex by a threshold T i . Thus, a network is characterised by thevectors τ “ p τ j q and T “ p T i q . In Fig. 3, we compare the total number of parameters in the logicand DE versions of each network (see also Supplementary Table S1). We can see that dramaticallyfewer parameters are required in a logic description compared to the corresponding DE formulation.Indeed for the largest network considered, 3-loop Arabidopsis , the Boolean description reduces thenumber of parameters by a factor of 4. 4 .2 Logic model configurations consistent with DE models are optimal
In identifying the logic models that best reproduce the corresponding DE dynamics, we consideredvariations to the structures of the networks shown in Fig. 1. Starting from the abstract topologiesof Fig. 2, each edge is activating or inhibiting, and where two or more edges lead into a vertex, thecorresponding inhibition/activation signals are combined to determine the state of the gene.In the Boolean formalism, the manner in which regulatory signals affect a gene’s expression levelis represented by the corresponding logic gate . This is a binary function that specifies the currentstate of the gene, ON (1) or OFF (0), for each possible combination of input states [46, 54]. Forgenes with a single input, there are two possible functions for which the output varies with theinput: the identity gate, in which the output follows the input (0 Ñ Ñ Ñ Ñ logic configuration ( LC ): this encodesthe regulatory structure of the network in a compact fashion.It follows that each abstract topology of Fig. 2 gives rise to 2 E ` M possible LCs, where E is thenumber of edges and M is the number of vertices with more than one input. The total number ofpossible LCs is 4 for 1-loop Neurospora , 32 for 2-loop
Neurospora , 256 for 2-loop
Arabidopsis and2048 for 3-loop
Arabidopsis . In each case, a subset of these LCs are consistent with the pattern ofactivation and inhibition in the corresponding DE model. That there can be more than one suchLC in each case is due to the choice of AND or OR where a vertex has multiple edges leading intoit. For
Neurospora , the 1-loop circuit has a unique LC consistent with the DE model while for the2-loop circuit there are two DE LCs. The 2- and 3-loop
Arabidopsis networks yield 4 and 8 DELCs respectively.For a given logic model, we would expect the DE LCs to most closely reproduce the DE dynam-ics. To test this hypothesis, we optimised LCs to synthetic experimental data obtained from theDE system, and then compared their predictive performance. For each LC, the match to the con-tinuous dynamics was quantified by finding the combinations of parameters (signalling delays anddiscretisation thresholds) that minimised a quantitative cost function. In order to be able to objec-tively compare the ability of the Boolean models to reproduce experimental time courses againstthat of their DE counterparts, we employed a cost function that closely mirrored those commonlyused to optimise continuous models [18, 25, 34–36, 38]. The cost function we used measured thegoodness-of-fit of each logic model to synthetic data generated in both 24h light-dark (LD) cyclesand the appropriate free-running light regime (continuous dark, DD, for
Neurospora ; continuouslight, LL, for
Arabidopsis ). At each vertex, the cost score was calculated as the correlation betweenthe discretised time series for the downstream species and the time-delayed predicted output cal-culated from the discretised data for the upstream species. Scores were summed across all verticesand light regimes to give the final cost value (see Methods for details).After ranking the optimised LCs by score, we then assessed whether the top ranking LCscomprised viable clock circuits by checking that they were capable of: (i) generating self-sustainedoscillations with a circadian period in constant conditions; and (ii) entraining to LD cycles over arealistic range of photoperiods [55].For the
Neurospora and 2-loop
Arabidopsis logic models, the full set of LCs were fitted to5ynthetic data. The best-performing LCs for these networks are shown in Figs. 4 and 5A. It can beseen that the optimisation method produces a clear separation of the LCs by score. Moreover, foreach network, one of the DE LCs is uniquely identified as the optimal circuit yielding a viable clock.In the case of 3-loop
Arabidopsis , we considered the subset of configurations obtained by settingthe gates common with the 2-loop
Arabidopsis circuit to their optimised values. This mirroredthe construction of the 3-loop DE model which was derived from the 2-loop system by adding anadditional feedback loop while fixing all other interactions, and then optimising the parametersof the new loop [36]. Constraining the structure of the circuit in this fashion yielded 8 possibleLCs (the 2 edges in the LHY-PRR loop described as activation or inhibition and the AND/ORinteraction at LHY; see Fig. 2D). Figure 5B shows that in this system, as for the other models, aDE LC emerges as the optimal circuit.
The time series generated by the optimal configurations in LD cycles are shown in Fig. 6. Thecorresponding DE simulations are also plotted for comparison.In each case it is clear that the Boolean models capture the same qualitative dynamics as theirDE counterparts. Different species are switched on and off relative to one another with phases thatmatch the patterns of rising and falling expression in the corresponding continuous time series.Moreover, the delays between the switching times are similar to the phase differences between thepeaks and troughs of the DE solutions.It should be noted that both Boolean
Arabidopsis models reproduce the acute light response inthe Y gene, as well as the circadian response in Y around dusk (cf. Figs. 6E-H). This demonstratesthe ability of the Boolean circuits to simulate biochemical processes that occur on different timescales within the same system.The optimal LCs give equally good matches to the DE dynamics in simulated free-runningconditions, as can be seen in Supp. Fig. S3.
In order to assess the extent to which the Boolean models reproduce the DE dynamics in a moreglobal and quantitative fashion we compared phase changes over a range of biologically realisticphotoperiods. In the Boolean framework, the times at which solutions switch between 0 and 1emerge as natural phase measures. For continuous time courses - such as those generated by DEmodels - the point in the circadian cycle at which the expression level of a molecular species decreasesbelow a set threshold can be employed as a phase marker [25, 56, 57]. This suggested using the timeat which each species decreases below its discretisation threshold as a phase measure for the DEsimulations, and the time of the 1 Ñ Neurospora models in Figs. 7A and B. For both networks, the photoperiodic behaviour of the Boolean andDE models is very close: indeed for the 1-loop network, they are exactly equivalent. Figs. 7Cand D plot the photoperiod simulations obtained with the
Arabidopsis circuits. Here too, thephase-photoperiod profiles are very similar, with the addition of the LHY-PRR loop to the 2-loopmodel causing a transition from a predominately dusk-locked system to a dawn-locked one [57]. Inparticular, the Boolean 2-loop
Arabidopsis circuit exactly reproduces the dual light response in the6 gene, in which the acute peak tracks dawn and the circadian peak tracks dusk. This suggeststhat the logic circuits possess sufficient dynamic flexibility to perform the complex integration ofenvironmental signals that is a central property of circadian systems.
The success of the logic models in recovering the correct DE configurations from synthetic datasuggested that for a fixed abstract topology, our optimisation procedure has the capacity to deter-mine the logic network most consistent with a given data set. We tested this finding further byoptimising the 3-loop
Arabidopsis logic circuit to highly-sampled experimental time series recordedusing luciferase (LUC) imaging in constant light from a wild-type strain [57]. All possible LCswere considered, corresponding to a network inference carried out assuming no prior biologicalknowledge. The cost function optimised was the same as that used for fitting to synthetic LL data.As previously, viable clock circuits were taken to be those yielding autonomous limit cycles withcircadian periods.The results of fitting to experimental data are presented in Fig. 8A. It can be seen that thesecond highest ranking LC giving a viable circuit is a DE configuration. This configuration, G DE ,is in fact the same as that previously determined to be optimal from the synthetic Arabidopsis data sets. Moreover, Fig. 8B shows that G DE emerges as the top ranking clock configuration ifthe regulatory structure is constrained to incorporate the central LHY-TOC1 negative feedbackloop of the corresponding continuous model. Interestingly, the optimal configuration, G OP T , differsfrom G DE in the sign of the TOC1-LHY interaction, with TOC1 repressing LHY in G OP T asopposed to activating it (compare Supp. Figs. S4A and B). This result is consistent with the recentexperimental characterisation of the LHY-TOC1 circuit as a double negative feedback loop, ratherthan the single negative loop assumed in the 2- and 3-loop DE models [51].Figures 9B and C show the simulations of the free-running clock obtained from the optimal andDE configurations respectively. For both LCs, the simulated oscillation period and timing of geneexpression is close to that of the natural system (cf. Fig. 9A). This can be seen clearly by comparingthe durations for which each gene is ON in the two Boolean models with the corresponding peaksin the continuous data.
Circadian clocks have become popular systems for studying the relationship between gene-proteindynamics and phenotype. The molecular machinery underlying these networks has been relativelywell characterised, and this has led to great interest in developing predictive computational modelsof the clock. Such models are usually formulated as sets of differential equations. The high levelof biochemical detail afforded by this approach has allowed DE models to successfully addressa range of issues regarding the functional relationship between the architecture of the clock andcircadian homeostasis. One homeostatic mechanism that has been comprehensively investigatedfrom a theoretical perspective is temperature compensation. By assuming that certain subsets of7 circuit’s kinetic parameters are temperature-dependent, temperature has been incorporated intoa number of circadian DE models, ranging from minimal circuits built on the Goodwin oscillator[15, 23, 58] to more detailed formulations that explicitly model the underlying biochemical reactions[18, 24]. These models have provided new insights into the possible mechanisms that lead tocircadian mutants with altered compensation properties, as well as suggesting generic motifs thatmay facilitate the tuning of the phase-temperature relationship.However, clock models based on DEs are characterised by large numbers of kinetic parameters,the values of which are typically unknown. Optimising these to experimental data is a majorcomputational bottleneck which imposes a hard bound on the maximum system size that can bestudied. In addition, as the fitting problem is typically highly underdetermined and involves noisy,undersampled experimental data, robust optimisation can require the construction of cost functionstargeting specific qualitative features of the data, introducing a degree of arbitrariness to the fittingprocedure [18, 25, 34–36].The need for minimal clock parametrisations to address these issues has been recognised else-where in the literature. For example, recent
Neurospora work has shown that it is possible to use asimple two-parameter function to represent all the intermediate processes between the expressionof a clock gene and its action on a downstream target, whilst still maintaining sufficient flexibilityto accurately simulate biological temperature and photoperiod responses [18, 25].An alternative technique for modelling gene regulatory networks is provided by Boolean logic.By assuming discrete expression levels, Boolean models provide an even greater reduction in com-plexity, albeit at the cost of reduced biochemical precision. Previous studies have exploited thisreduction to study GRN state-space structures [48, 59, 60], and to introduce probabilistic modelsthat parametrise statistical transitions between states [45, 61]. The different limit cycles enumerableby a single Boolean GRN have been interpreted in some models as examples of cell differentiation[62, 63], making limit cycle properties of special interest. Elsewhere, Boolean studies have focusedon the regulatory logic underlying transcription [64–66], the immune response [44, 53] and apoptosis[49].Here, we have presented the first models of biological clocks based on the Boolean formalism. Weconstructed logic versions of 4 DE models simulating circadian rhythms in the organisms
Neurosporacrassa and
Arabidopsis thaliana . To derive the best logic representations, both regulatory structure(logic configuration) and parameters (signalling delays and discretisation thresholds) were optimisedto synthetic experimental data generated from the DEs. In each case, we found that the logicconfiguration yielding the best fit to data was consistent with that of the corresponding DE system.This suggested that the Boolean models were capable of identifying the particular combinationof activating and inhibiting elements best able to account for a set of experimental expressionpatterns (Figs. 4 and 5). We confirmed this hypothesis by successfully applying our optimisationalgorithm to real
Arabidopsis
LUC data (Fig. 8). These results demonstrate that the logic modelspossess sufficient predictive power to perform biological network inference, despite their relativesimplicity. Moreover, the optimisation to LUC data highlights the importance of network structurein determining dynamic behaviour [67, 68]. This optimisation gave rise to a pair of putative 3-loopnetwork architectures with distinct parameter values that generated very similar expression timeseries (Supp. Fig. S4 and Fig. 9). One of these architectures matched the 3-loop
Arabidopsis
DEmodel. It therefore contains the DE model’s core LHY-TOC1 negative feedback loop, which wasbased on an experimental study that demonstrated the repression of TOC1 by LHY whilst also8nferring the activation of LHY by TOC1 [69]. Significantly, the other architecture - which gavethe best fit to data - agreed with more recent biochemical work showing that TOC1 is in fact arepressor of LHY [51]. This result is also consistent with a complementary computational studyshowing that an expanded version of the 3-loop DE model incorporating a negative TOC1-LHYinteraction yields more accurate simulations of TOC1 knockout and overexpression mutants [70].We also found that although the cost function used to fit synthetic data only assessed goodness-of-fit in simulated 12:12 LD cycles, the logic models gave very good fits to the DEs in long and shortdays also, closely reproducing the relationships between photoperiod and expression phase (Fig. 7).The coordination of biochemical activity with the timing of dawn and dusk is a key system-levelproperty that can be used to assess the relationship between the structure of a circadian networkand its evolutionary flexibility [57]. Our photoperiod simulations indicate, somewhat surprisingly,that this property can be accounted for by significantly reduced dynamic models possessing muchsmaller parameter sets. In particular, the combined dawn- and dusk-tracking observed in the 2-loop
Arabidopsis
DE model (Fig. 7C) is accurately replicated by its logic formulation which hasless than 1/4 the number of parameters. A similar reduction was observed for 3-loop
Arabidopsis ,with the 80 parameters describing the DE models decreasing to 20 in its Boolean equivalent (Supp.Table S1). The ability of simple, discrete models to simulate biologically realistic photoentraimentmay also have important implications for the nature of circadian signal processing, in additionto being interesting from a modelling perspective. Specifically, it is consistent with hypothesesbased on experimental data suggesting that the mechanism by which daylength and light intensityinformation is transmitted to output pathways may be partly digital in nature [71].It should be noted that despite the success of the logic models in reproducing light responses,the coarser representation of network dynamics they provide means that the reproduction of certaincircadian properties pose problems for the Boolean approach. A particular restriction of note relatesto the modelling of temperature compensation. Temperature can be readily introduced into anyDE model by following the methodology originally introduced by Ruoff [72, 73]. In this scheme,the temperature dependence of a kinetic parameter is assumed to be governed by the Arrheniusrelation. This leads to a simple balance equation for the corresponding activation energies andcontrol coefficients which, when satisfied, guarantees a near-zero period-temperature derivative atthe balance point. In a Boolean model, the parameters that determine the free-running period τ F R are the discrete delays τ j , and so simple balance equations can be derived from the form of thefunction determining the dependence of τ F R on the τ j s. For example, in the 2-loop Neurospora model, it can be shown that τ F R “ τ ` τ ` τ ` τ . Thus, compensation can be achieved simplyby choosing the temperature dependence of each delay τ j so that their sum is constant at eachpoint over the range of interest. However, as the τ j s are generic parameters which in each instancesummarise several biochemical processes, any balance equation derived in this manner will not ingeneral be grounded in physical chemistry, reducing its biological relevance. In addition to providing a more compact parametrisation, a significant strength of the logic mod-elling approach is that it greatly reduces the complexity of the optimisation procedure itself. Thisenables optimal configurations and delay-threshold combinations to be distinguished with a greaterobjectivity. The critical simplification afforded by the Boolean formulation is that the cost score iscomputed from bitstrings, which take the value 0 or 1, rather than data that varies over a much9roader range of values. This removes the need for ad hoc cost function terms for normalising thedata and robustly computing period and phase. Indeed, the cost function we employed in thiswork was extremely simple, computing the correlation between the predictions of the model andthe corresponding discretised experimental data at each vertex.A second important simplification relates to the structure of the parameter space. In DEmodels, each interaction delay is dependent on processes that occur over a range of time scales(e.g. transcription, translation, degradation etc). This means that as well as being uncountablyinfinite, the parameter space can cover several different orders of magnitude, making it difficultto establish a priori bounds. By contrast, each signalling delay in a logic model is constrained tobe a multiple of the data sampling interval t S , and is bounded above by the maximum possibletime t MAX over which the corresponding interaction can occur. t MAX is itself bounded by theduration of the experimental measurements and can be further restricted on the basis of biologicalknowledge (here, the free-running period was used to bound all delays - see the Methods sectionfor details). The set of possible signalling delays is thus finite. The set of possible discretisationthresholds is also finite as varying the threshold for a given species will generate a set number ofdifferent binary time series, meaning it is only necessary to consider thresholds for which distinctbitstrings are obtained.For logic models, the parameter space as a whole is therefore finite, and can be objectivelybounded. Furthermore, for a fixed abstract topology, the set of all possible logic configurationsis also finite; it is simply equal to the set of possible Boolean functions. This means that itis, in principle, possible to comprehensively search across all possible regulatory structures andparameter combinations to determine the best fit to data. Such a search is impossible with DEsystems. Furthermore, searching across different patterns of activation and inhibition for a DEmodel is often problematic as it requires some a priori assumptions to be made regarding theunderlying biochemical mechanisms; e.g. specification of which reactions may be cooperative andthe ranges of the corresponding Hill coefficients [18, 25, 34–36]. In practice, however, the numberof possible network and parameter combinations in a Boolean formulation can become too large fora complete search with the computational resources available, making it necessary to constrain theoptimisation.
Here, in order to ensure computational tractability, we subsampled the delay-threshold space, whilealso fixing the parameters controlling the impulse light inputs. In addition, we restricted ourselvesto a subset of logic circuits by assuming that: (i) multiple light inputs to a gene are combined withan OR gate; and (ii) the net light signal directly modulates the state of the gene through eitheran AND or an OR gate, depending on the free-running light regime (see Fig. 2). Of course notall possible circuits will be biologically reasonable ones (e.g. any circuit for which gates with lightinputs uniformly output 0 or 1 would be unviable). Nonetheless, it is reasonable to expect thatsome interactions may be more accurately modelled by gates that are not of the simple AND/ORtype [64], meaning that better performing logic configurations may have been overlooked.In view of these restrictions, the fact that our optimal Boolean circuits match the dynamics oftheir target data sets, both synthetic and experimental, is very encouraging. Indeed, we anticipatethat it should be possible to find circuit configurations and parameter sets giving good fits todata over a broader range of genetic and environmental perturbations. For example, our Boolean10-loop
Arabidopsis model shows consistency with much of the photoperiodic behaviour of its DEcounterpart (Fig. 7D). However, it does not reproduce the phase response observed in the DE modelas photoperiod is decreased, for which some components switch between dawn- and dusk-dominancein a complex manner [57].A likely contributing factor is that our current optimisation method involves computing thescore at every parameter combination over a fixed lattice. This can be inefficient, particularlywhere the stoichiometry of an interaction and/or its molecular dynamics require the density ofinteracting species to accrue beyond a certain value before the interaction is statistically likely tooccur. In such cases, the optimal threshold choice for the discretisation is likely be found withina narrow band of possibilities. Furthermore, high threshold resolutions can be required to resolvetopological degeneracies in the cost function which make optimal networks difficult to distinguish(see section S1.2 of the Electronic Supplementary Material). The accuracy of the optimisation couldthus be increased by employing logic variants of global search methods capable of providing a morecomputationally efficient exploration of the parameter space, such as evolutionary algorithms [74].This would increase the predictive power of the models, better positioning us to address featuressuch as the dawn and dusk switching in shorter photoperiods observed in the 3-loop model.
Finally, we note that there are many promising avenues for further developing the approachesintroduced in this study. From a theoretical perspective, there is scope for extending establishedmethods for analysing Boolean models. Of particular interest are techniques for determining thesimplest inequalities involving linear combinations of the time delays that are required to traversea given path through the state space [44, 52, 53]. In the context of circadian models, this wouldinvolve developing general methods for deriving linear inequalities that result in free-running cycleswith the target period. As the computational demands of optimisation grow exponentially with thenumber of parameters to be fitted, restricting the search to the corresponding solution set wouldbe expected to dramatically reduce the computational load.More generally, hybrid logic/DE algorithms would see Boolean methods used firstly to determinethe optimal model configuration from data, and then to identify the regions of parameter spacewithin which an equivalent DE model is likely to give a good fit. This would exploit the ability oflogic models, demonstrated in this work, to generate a coarse, but quantitative representation ofthe system dynamics in a systematic and efficient manner. We anticipate that regulatory networksof much greater complexity than those considered here could be quantitively modelled using suchan approach.
In order to incorporate some of the variability in expression levels characteristic of real experimentaldata, synthetic data was generated using the variant of Gillespie’s stochastic simulation algorithm(SSA) introduced by Gonze et al. [39]. Fits to deterministic data obtained from direct integrationof the DEs gave very similar results (data not shown). However, the stochastic data sets yielded aclearer separation between LCs, most probably due to the lifting of topological degeneracies. For all11ircuits, final analyses were therefore restricted to the results obtained with stochastic time courses.In generating the synthetic time courses, the scaling (or extensitivity) parameter Ω was setclose to the minimum value yielding self-sustained, unforced oscillations in each case. The finalvalues used for simulations were: 1-loop
Neurospora , Ω “
25; 2-loop
Neurospora , Ω “
50; 2- and3-loop
Arabidopsis , Ω “ Neurospora models; LL for
Arabidopsis ). This choicemirrors the combination of experimental data sets typically chosen for parameter optimisation incomputational circadian studies [18, 25, 34–36, 38]. Time series were then subsampled every 0.5hto give the data used for model-fitting. Plots of the synthetic LD data sets are shown in Supp. Fig.S5. Experimental gene expression levels were measured using luciferase (LUC) imaging, carried outas previously described [16]. Images were recorded using Hamamatsu C4742-98 digital camerasoperating at -75 o C under the control of Wasabi software (Hamamatsu Photonics, Hamamatsu City,Japan) with a sampling interval of 1.5h. Bioluminescence levels were quantified using Metamorphsoftware (MDS, Toronto, Canada). The resulting expression profiles were detrended for amplitudeand baseline damping using the mFourfit function of BRASSv3 [35]. The detrended time series canbe seen in Fig. 9A.
For n biochemical species, let X i p t q P t , u denote the activity of species i at time t , where t is amultiple, kt S , of the sampling interval t S . The update rule is then expressed as: X i p t q “ s i p X p t ´ τ i q , . . . , X n p t ´ τ i n q , L p t ´ τ N ` q , . . . , L m p t ´ τ N ` m qq . (1)Here the L k s represent external light inputs to the circuit and τ , . . . , τ N ` m are the signalling delays,with τ i j P t τ , . . . , τ N u for all i, j . Assuming a 24h day and writing t DAW N and t DUSK for thetimes of dawn and dusk respectively, the L k s are described by the following function: L k p t q “ t DAW N ď mod p t, q ď mod p t DAW N ` p k , q , p k “ p k “
24 corresponds to constantlight (LL); setting p k “ t DUSK ´ t DAW N yields a continuous light-dark (LD) cycle. Parameter setswith p k ă t DUSK ´ t DAW N yield LD cycles with a pulse of length p k at dawn.The functions s i are Boolean functions s i : t , u n ` m Ñ t , u representing the interactionsbetween genes and light inputs that determine the state of species X i . We introduce a logicaldependency function (or logic gate ) G p¨ , g q to describe these interactions. This function takes asingle parameter g P t , u , the value of which determines the type of reaction modelled. Formally,we consider two types of operator. The first operator acts on a single Boolean input Y P t , u ,implementing either the identity or NOT gate, modelling activation and repression respectively: G p Y, q “ YG p Y, q “ NOT Y. (3)The second type of operator acts on two Boolean inputs Y, Z
P t , u , implementing either the ANDor the (inclusive) OR dependency. If either species can fulfil the interaction, an OR dependency is12sed. If both species are required in the interaction, an AND dependency is used. Thus, for species Y P t , u and Z P t , u : G p Y, Z, q “ Y OR ZG p Y, Z, q “ Y AND Z. (4)The functions s i in (1) are formed as compositions of these dependencies. For example, theupdate rule for gene T OC
Arabidopsis model has the form X T OC p t q “ p NOT X LHY p t ´ τ qq AND X Y p t ´ τ q . (5)Using (3) and (4), this can be written as a composition of logic functions X T OC p t q “ G p G p X LHY p t ´ τ q , g q , G p X Y p t ´ τ q , g q , g q , (6)with g “ g “ g “
1. The resulting Boolean function models a reaction in which
LHY must be downregulated in order for Y to activate T OC s i has an associated adjacency matrix A “ p A ij q defined by A ij “ s i p X , . . . , X j ´ , , X j ` . . . , X n , L q ı s i p X , . . . , X j ´ , , X j ` . . . , X n , L q , L “ p L , . . . , L m q . Thus, A ij “ X j can change the state of X i , and so matrix A describes the abstract topology of a model (these topologies can be seen in Fig. 2). It follows that acircadian clock model is parametrised by its adjacency matrix A , a set of delays τ “ p τ , . . . , τ N ` m q and a set of gates G “ p g , . . . , g d q . Writing X “ p X , . . . , X n q , this yields the following compact,vectorised form of the update rule: X p t q “ S p X p t q , L p t q ; A , τ, G q . (8)We refer to the set of gates G as the model’s logic configuration (LC). As each configuration G “p g , . . . , g d q , is a bitstring, it can be represented uniquely by the corresponding decimal expansion D p G q defined below: D p g , . . . , g d q “ d ÿ l “ g l d ´ l . (9)For the models considered, the LCs are enumerated in terms of their decimal expansions for sim-plicity in Figs. 4, 5 and 8.Of the LCs consistent with A , a subset matches the pattern of activation and inhibition inthe corresponding DE model. There are referred to as the DE LCs . For example, for the 2-loop
Neurospora model, frq activates both isoforms of FRQ, giving g “ g “
0, and both isoformsrepress transcription, giving g “ g “ g “
0, corresponding to OR) or a single isoform is sufficient ( g “
1, corresponding to AND).These DE LCs are encoded by the integers 6 and 7 respectively.Finally, in order to construct the simplest logic models consistent with the general form ofcircadian DE systems, each Boolean function s i is assumed to have the form s i p X , L q “ H i ` s Fi p X q , s Li p L q ˘ , (10)such that: (i) H i : t , u Ñ t , u implements either the AND or OR gate; (ii) s Fi : t , u n Ñ t , u encodes the structure of the free-running system; and (iii) s Li : t , u m Ñ t , u determines how13ultiple light inputs are integrated, with s Li p , . . . , q “ s Li p , . . . , q “
1. For consis-tency, it is therefore necessary that setting each light input L k to the relevant constant value L recovers the free-running system ( L “ L “ H i ` s Fi p X q , L ˘ “ s Fi p X q . For the Neurospora models, where the free-running condition is DD, theconsistency condition is achieved using an OR gate since x OR “ x . In the case of the Arabidopsis circuits, where the free-running condition is LL, the AND gate is appropriate as x AND “ x .For both Arabidopsis models, the multiple light inputs to Y are combined using an OR gate: thisis because in the corresponding DE models, both inputs can independently upregulate transcription[35, 36]. In addition, the 3-loop model only incorporates the pulsed light input to the PRR gene sinceremoving the continuous light input from the DE system had a negligible effect on its photoperiodicbehaviour.Full details of the logic formulation of each clock network are given in section S2 of the ElectronicSupplementary Material.
In order to identify the optimal combination of signalling delays τ “ p τ j q , thresholds T “ p T i q and logic gates G “ p g l q for a given model and data set, we must introduce a cost function to beminimised. We used a simple function based on a correlation between the predicted time coursesgenerated by the model and the corresponding data. Further details can be found in section S1.1 ofthe Electronic Supplementary Material. Optimal LC-parameter combinations are given in Tables1 and 2.The most comprehensive strategy in seeking to minimise the cost function is to systematicallycalculate the cost for all possible configurations and parametrisations of the model; that is all delays0 ď τ j ď t MAX , where t MAX is the longest timescale considered in the system, and all thresholds0 ă T i ă
1. However, when the model becomes too complex to permit a global analysis, it becomesnecessary to introduce further constraints to restrict the parametrisation to be considered. Suchconstraints should be as objective as possible.One viable, objective constraint is to limit all the signalling delays around a closed loop so thatthey sum to no more than the period τ F R of free-running oscillations in the target data set.For 1-loop
Neurospora, this gives the single delay bound τ ` τ ď τ F R , (11)where τ F R “ Neurospora , the constraint results in 2 delay bounds τ ` τ ď τ F R ,τ ` τ ď τ F R , (12)where again τ F R “ Arabidopsis , there are 3 delay bounds: τ ` τ ` τ ď τ F R ,τ ` τ ď τ F R ,τ ` τ ` τ ` τ ď τ F R . (13)In this case τ F R “ Arabidopsis , there are 4 delay bounds: those for the 2-loop model above togetherwith an extra bound introduced by the addition of the LHY-PRR loop: τ ` τ ď τ F R . (14)For this circuit, τ F R “ τ j was restricted to integer multiples of a minimum delay resolution τ R , itself a multiple k τ t S of the data sampling interval t S . Secondly, each threshold T i was bounded within a subinterval r T MIN , T
MAX s of r , s , and also restricted to integer multiples of a minimum threshold resolution T R . For all models, T MIN and T MAX were set to the values 0 . . Neurospora models, k τ was set to 1 and T R to 0 . Arabidopsis models, k τ was initially set to 2and T R to 0 .
05. Scores were then recalculated with k τ “ T R “ . r τ i ´ τ R , τ i ` τ R s and r T j ´ T R , T j ` T R s centred around the parameter combinations givingthe best scores. For Arabidopsis optimisations, light parameters controlling the impulse inputs ( L and L in the 2-loop circuit; L , L and L in the 3-loop circuit) were fixed at the values shownin Table 1. These were determined from discrete approximations to the corresponding continuouscurves in the DE models.Each parameter set was then checked for a functioning circadian clock. Firstly, the solutionto the free-running model was generated under the appropriate continuous light conditions, usingthe discretised data as initial conditions. If a limit cycle was obtained with period within 20% ofthe DE model, this was fed into the model under simulated 12:12 LD cycles. Periodic, entrainedoscillations were taken to indicate a viable clock circuit.For the Neurospora and 2-loop
Arabidopsis models, optimisations included all possible LCs. The3-loop
Arabidopsis optimisations were restricted to the subset of LCs defined by G “ p xyz q , x, y, z P t , u ; these are the configurations that result from fixing gates g , ..., g to their optimisedvalues in the 2-loop circuit.The photoperiod simulations shown in Fig. 7 were obtained by locally reoptimising the logiccircuits yielding viable clocks to simulations of the DE models under 12:12 LD cycles, and thencalculating the maximum symmetric photoperiod interval p ´ P MAX , ` P MAX q over whichboth model formulations generated stable, entrained solutions. The cost was minimised with asimulated annealing algorithm [18, 34, 75], using a cost function for which the data and predictedtime courses were taken to be single cycles of the entrained solution in the DE and logic modelsrespectively. In fitting the 3-loop
Arabidopsis model to the LUC data shown in Fig. 9A, genes were matched tomodel variables in the following manner: (i) CCA1 was equated to LHY on the basis that the LHYvariable in the equivalent DE model groups together the effect of the two genes [36]; (ii) TOC1 wasmatched to TOC1; (iii) GI was matched to Y owing to experimental results showing that this genecan account for some of the action of Y in the DE model [36]; (iv) PRR9 was matched to PRR, asthis variable combines the PRR7 and 9 genes in the DE formulation [36].15ince the exact biological correlate of variable X is currently unknown, it was not used forcosting the fit. Consequently, g and τ were both fixed at 0 (note that fixing g reduces thenumber of possible LCs from 2048 to 1024). This preserves the dynamics of all components in themodel excluding X, together with the delay bounds used for constraining the parameter space. Italso means that TOC1 and the dummy variable X have identical time series. Thus, in practice, thediscretised TOC1 expression time series was used as a proxy for X data when calculating the costat the LHY vertex. Also, as the LUC data was measured in LL, the parameters p Ñ p controllingthe light inputs were all set to 24 (cf. eqn. (2)). In LL conditions, the absence of dawn and duskmeans that the choice of the delay parameters τ Ñ τ is arbitrary. We therefore set the valuesof these delays to 0. T MIN and T MAX were fixed at 0 . . k τ was initially setto 2 and T R to 0 .
2. Scores were then recalculated with k τ “ T R “ .
05, within intervals r τ i ´ τ R , τ i ` τ R s and r T j ´ T R , T j ` T R s centred around the best scoring parameter sets. Theoptimal parameter set for each LC was assessed to determine whether it yielded a viable clock byfirst generating the solution to the model using the discretised LUC time series as initial conditions,and then checking that this gave a limit cycle with free-running period within 20% of 24h. The numerical routines for parameter optimisation and model simulation were initially developedin MATLAB (Mathworks, Cambridge) and C. The scoring algorithms used for global parametersweeps were subsequently converted into Java and run on a task farm computer consisting of118 Intel Harpertown quad-core processors. All software used is available on request from thecorresponding author.
Acknowledgments eferences
1. Dunlap, J. C., Loros, J. J. & DeCoursey, P. J. 2003
Chronobiology: Biological Timekeeping .(Sinauer).2. Young, M. W. & Kay, S. A. 2001 Time zones: a comparative genetics of circadian clocks.
Nat.Rev. Genet. , 702–15.3. Bell-Pedersen, D., Cassone, V. M., Earnest, D. J., Golden, S. S., Hardin, P. E., Thomas, T. L.& Zoran, M. 2005 Circadian rhythms from multiple oscillators: lessons from diverse organisms. Nat. Rev. Genet. , 544–556.4. Zhang, E. E. & Kay, S. A. 2010 Clocks not winding down: unravelling circadian networks. Nat.Rev. Mol. Cell Biol. , 764–776.5. Khapre, R. V., Samsa, W. E. & Kondatov, R. V. 2010 Circadian regulation of cell cycle:Molecular connections between aging and the circadian clock. Ann. Med. , 1695–1700.6. Gery, S. & Koeffler, H. P. 2010 Circadian rhythms and cancer. Cell Cycle , 1097–1103.7. Takeda, N. & Maemura., K. 2010 Circadian clock and vascular disease. Hypertens. Res. ,645–651.8. Westrich, L. & Sprouse., J. 2010 Circadian rhythm dysregulation in bipolar disorder. Curr.Opin. Investig. Drugs , 779–787.9. Lange, T., Dimitrov, S., Fehm, H. L., Westermann, J. & Born, J. 2006 Shift of monocytefunction toward cellular immunity during sleep. Arch. Intern. Med. , 1695–1700.10. Liu, J., Malkani, G., Mankani, G., Shi, X., Meyer, M., Cunningham-Runddles, S. & Sun, Z. S.2006 The circadian clock Period 2 gene regulates gamma interferon production of NK cells inhost response to lipopolysaccharide-induced endotoxic shock.
Infect. Immun. , 4750–4756.11. Keller, M., Mazuch, J., Abraham, U., Eom, G. D., Herzog, E. D., Volk, H. D., Kramer, A.& Maier, B. 2009 A circadian clock in macrophages controls inflammatory immune responses. Proc. Natl. Acad. Sci. USA , 21407–12.12. Ouyang, Y., Andersson, C. R., Kondo, T., Golden, S. S. & Johnson, C. H. 1998 Resonatingcircadian clocks enhance fitness in cyanobacteria.
Proc. Natl. Acad. Sci. USA , 8660–4.13. Dodd, A. N., Salathia, N., Hall, A., K´evei, E., T´oth, R., Nagy, F., Hibberd, J. M., Millar,A. J. & Webb, A. A. 2005 Plant circadian clocks increase photosynthesis, growth, survival, andcompetitive advantage. Science , 630–3.14. Gardner, G. F. & Feldman, J. F. 1981 Temperature compensation of circadian period lengthin clock mutants of
Neurospora crassa . Plant Physiol. , 1244–1248.15. Ruoff, P., Rensing, L., Kommedal, R. & Mohsenzadeh, S. 1997 Modeling temperature compen-sation in chemical and biological oscillators. Chronobiol. Int. , 499–510.16. Gould, P. D., Locke, J. C., Larue, C., Southern, M. M., Davis, S. J., Hanano, S., Moyle, R.,Milich, R., Putterill, J., Millar, A. J. & Hall, A. 2006 The molecular basis of temperaturecompensation in the Arabidopsis circadian clock.
Plant Cell , 1177–1187.177. Brunner, M. & Diernfellner, A. 2006 How temperature affects the circadian clock of Neurosporacrassa . Chronobiol. Int. , 81–90.18. Akman, O. E., Locke, J. C. W., Tang, S., Carr´e, I., Millar, A. J. & Rand, D. A. 2008 Isoformswitching facilitates period control in the Neurospora crassa circadian clock.
Mol. Syst. Biol. , 64.19. Leloup, J.-C., Gonze, D. & Goldbeter, A. 1999 Limit cycle models for circadian rhythms basedon transcriptional regulation in Drosophila and
Neurospora . J. Biol. Rhythms , 433–448.20. Smolen, P., Baxter, D. A. & Byrne, J. H. 2001 Modeling circadian oscillations with interlockingpositive and negative feedback loops. J. Neurosci. , 6644–56.21. Sriram, K. & Gopinathan, M. S. 2004 A two variable delay model for the circadian rhythm of Neurospora crassa . J. Theor. Biol. , 23–38.22. Francois, P. 2005 A model for the
Neurospora circadian clock.
Biophys. J. , 2369–2383.23. Ruoff, P., Loros, J. J. & Dunlap, J. C. 2005 The relationship between FRQ-protein stabilityand temperature compensation in the Neurospora circadian clock.
Proc. Natl. Acad. Sci. USA , 17681–6.24. Hong, C. I., Jolma, I. W., Loros, J. J., Dunlap, J. C. & Ruoff, P. 2008 Simulating darkexpressions and interactions of frq and wc-1 in the
Neurospora circadian clock.
Biophys. J. ,1221–32.25. Akman, O. E., Rand, D. A., Brown, P. E. & Millar, A. J. 2010 Robustness from flexibility inthe fungal circadian clock. BMC Syst. Biol. , 88.26. Goldbeter, A. 1995 A model for circadian oscillations in the Drosophila period protein PER.
Proc. Biol. Sci. , 319–24.27. Leloup, J. C. & Goldbeter, A. 1998 A model for circadian rhythms in
Drosophila incorporatingthe formation of a complex between the PER and TIM proteins.
J. Biol. Rhythms , 70–87.28. Ueda, H. R., Hagiwara, M. & Kitano, H. 2001 Robust oscillations within the interlockedfeedback model of Drosophila circadian rhythm.
J. Theor. Biol. , 401–6.29. Smolen, P., Hardin, P. E., Lo, B. S., Baxter, D. A. & Byrne, J. H. 2004 Simulation of
Drosophila circadian oscillations, mutations, and light responses by a model with VRI, PDP-1, and CLK.
Biophys. J. , 2786–802.30. Forger, D. B. & Peskin, C. S. 2003 A detailed predictive model of the mammalian circadianclock. Proc. Natl. Acad. Sci. USA , 14806–11.31. Leloup, J. C. & Goldbeter, A. 2003 Toward a detailed computational model for the mammaliancircadian clock.
Proc. Natl. Acad. Sci. USA , 7051–6.32. Becker-Weimann, S., Wolf, J., Herzel, H. & Kramer, A. 2004 Modeling feedback loops of themammalian circadian oscillator.
Biophys. J. , 3023–3034.33. Mirsky, H. P., Liu, A. C., Welsh, D. K., Kay, S. A. & Doyle III, F. J. 2009 A model of thecell-autonomous mammalian circadian clock. Proc. Natl. Acad. Sci. USA , 11107–12.184. Locke, J. C. W., Millar, A. J. & Turner, M. S. 2005 Modelling genetic networks with noisyand varied experimental data: the circadian clock in
Arabidopsis thaliana . J. Theor. Biol. ,383–93.35. Locke, J. C. W., Southern, M. M., Kozma-Bognar, L., Hibberd, V., Brown, P. E., Turner, M. S.& Millar, A. J. 2005 Extension of a genetic network model by iterative experimentation andmathematical analysis.
Mol. Syst. Biol. , 2005.0013.36. Locke, J. C. W., Kozma-Bognar, L., Gould, P. D., Feh´er, B., Kevei, E., Nagy, F., Turner,M. S., Hall A. & Millar, A. J. 2006 Experimental validation of a predicted feedback loop in themulti-oscillator clock of Arabidopsis thaliana . Mol. Syst. Biol. , 59.37. Zeilinger, M. N., Farr´e, E. M., Taylor, S. R., Kay, S. A. & Doyle III, F. J. 2005 A novelcomputational model of the circadian clock in Arabidopsis that incorporates prr7 and prr9.
Mol. Syst. Biol. , 58.38. Pokhilko, A., Hodge, S. K., Stratford, K., Knox, K., Edwards, K., Thomson, A. W., Mizuno,T. & Millar, A. J. 2010 Data assimilation constrains new connections and components in acomplex, eukaryotic circadian clock model. Mol. Syst. Biol. , 416.39. Gonze, D., Halloy, J. & Goldbeter, A. 2002 Biochemical clocks and molecular noise: Theoreticalstudy of robustness factors. J. Chem. Phys. , 10997–11010.40. Gonze, D., Halloy, J. & Goldbeter, A. 2002 Robustness of circadian rhythms with respect tomolecular noise.
Proc. Natl. Acad. Sci. USA , 673–8.41. Vilar, J. M., Kueh, H. Y., Barkai, N. & Leibler, S. 2002 Mechanisms of noise-resistance ingenetic oscillators. Proc. Natl. Acad. Sci. USA , 5988–5992.42. Kauffman, S. A. 1969 Metabolic stability and epigenesis in randomly constructed genetic nets. J. Theor. Biol. , 437–467.43. Thomas, R. 1991 Regulatory networks seen as asynchronous automata: A logical description. J. Theor. Biol. , 1–23.44. Kaufman, M., Andris, F., Leo, O. 1999 A logical analysis of T cell activation and anergy.
Proc.Natl. Acad. Sci. USA , 3894–3899.45. Shmulevich, I., Dougherty, E. R., Kim, S. & Zhang, W. 2002 Probabilistic Boolean networks:a rule-based uncertainty model for gene regulatory networks. Bioinformatics , 261–274.46. Watterson, S., Marshall, S. & Ghazal, P. 2008 Logic models of pathway biology. Drug. Discov.Today , 447–456.47. Yu, L., Watterson, S., Marshall, S. & Ghazal, P. 2008 Inferring Boolean networks with pertur-bation from sparse gene expression data: a general model applied to the Interferon regulatorynetwork. Mol. Biosyst. , 1024–1030.48. Saez-Rodriguez, J., Alexopoulos, L. G., Epperlein, J., Samaga, R., Lauffenburger, D. A., Klamt,S. & Sorger, P. K. 2009 Discrete logic modelling as a means to link protein signalling networkswith functional analysis of mammalian signal transduction. Mol. Syst. Biol. , 331.199. Schlatter, R., Schmich, K., Avalos Vizcarra, I., Scheurich, P., Sauter, T., Borner, C., Ederer,M., Merfort, I. & Sawodny, O. 2009 ON/OFF and beyond - a Boolean model of apoptosis. PLoS Comput. Biol. , e1000595.50. Watterson, S. & Ghazal, P. 2010 Use of logic theory in understanding regulatory pathwaysignaling in response to infection. Future Microbiol. , 163–176.51. Gendron, J. M., Pruneda-Paz, J. L., Doherty, C. J., Gross, A. M., Kang, S. E. & Kay, S. A.2012 Arabidopsis circadian clock protein, TOC1, is a DNA-binding transcription factor.
Proc.Natl. Acad. Sci. USA , 3167–72.52. Thomas, R. 1983 Logical description, analysis and synthesis of biological and other networkscomprising feedback loops.
Adv. Chem. Phys. , 247–282.53. Kaufman, M., Urbain, J. & Thomas, R. 1985 Towards a logical analysis of the immune response. J. Theor. Biol. , 527–561.54. Shmulevich, I., Dougherty, E. R. & Zhang, W. 2002 From Boolean to probabilistic Booleannetworks as models of genetic regulatory networks.
Proc. IEEE , 1778–1792.55. Troein, C., Locke, J. C., Turner, M. S. & Millar, A. J. 2009 Weather and seasons togetherdemand complex biological clocks. Curr. Biol. , 1961–4.56. Tan, Y., Dragovic, Z., Roenneberg, T. & Merrow, M. 2004 Entrainment dissociates transcrip-tion and translation of a circadian clock gene in Neurospora . Curr. Biol. , 433–8.57. Edwards, K. D., Akman, O. E., Knox, K., Lumsden, P. J., Thomson, A. W., Brown, P. E.,Pokhilko, A., Kozma-Bognar, L., Nagy, F., Rand, D. A. & Millar, A. J. 2010 Quantitativeanalysis of regulatory flexibility under changing environmental conditions. Mol. Syst. Biol. ,424.58. Ruoff, P. & Rensing, L. 1996 The temperature-compensated Goodwin model simulates manycircadian clock properties. J. Theor. Biol. , 275–285.59. Ay, F., Xu, F. & Kahveci, T. 2009 Scalable steady state analysis of Boolean biological regulatorynetworks.
PLoS ONE , e7992.60. Hau, L. D. & Kwon, Y. K. 2011 The effects of feedback loops on disease comorbidity in humansignaling networks. Bioinformatics , 1113–20.61. Krawitz, P. & Shmulevich, I. 2007 Basin entropy in Boolean network ensembles. Phys. Rev.Lett. , 158701.62. Ribeiro, A. S. & Kauffman, S. A. 2007 Noisy attractors and ergodic sets in models of generegulatory networks. J. Theor. Biol. , 743–55.63. Garg, A., Cara, A. D., Xenarios, I., Mendoza, L. & Micheli, G. D. 2008 Synchronous versusasynchronous modeling of gene regulatory networks.
Bioinformatics , 1917–25.64. Buchler, N., Gerland, U. & Hwa, T. 2003 On schemes of combinatorial transcription logic. Proc. Natl. Acad. Sci. USA , 5136–5141.205. Gerstung, M., Timmer, J. & Fleck, C. 2009 Noisy signaling through promoter logic gates.
Phys.Rev. E , 011923.66. Hunziker, A., Tuboly, C., Horv´ath, P., Krishna, S. & Semsey, S. 2010 Genetic flexibility ofregulatory networks. Proc. Natl. Acad. Sci. USA , 12998–3003.67. Milo, R., Shen-Orr, S., Itzkovitz, S., Kashtan, N., Chklovskii, D. & Alon, U. 2002 Networkmotifs: Simple building blocks of complex networks.
Science , 824–827.68. Prill, R., Iglesias, P. & Levchenko, A. 2005 Dynamic properties of network motifs contributeto biological network organization.
PLoS Biol. , e343.69. Alabad´ı, D., Oyama, T., Yanovsky, M. J., Harmon, F. G., M´as, P. & Kay, S. A. 2001 Reciprocalregulation between TOC1 and LHY/CCA1 within the Arabidopsis circadian clock.
Science ,880–883.70. Pokhilko, A., Fernandez, A. P., Edwards, K. D., Southern, M. M., Halliday, K. J. & Millar,A. J. 2012 The clock gene circuit in
Arabidopsis includes a repressilator with additional feedbackloops.
Mol. Syst. Biol. , 574.71. Dodd, A. N., Love, J. & Webb, A. A. 2005 The plant clock shows its metal: circadian regulationof cytosolic free Ca ` . Trends Plant Sci. , 15–21.72. Ruoff, P. 1992 Introducing temperature-compensation in any reaction kinetic oscillator model. Interdiscipl. Cycle Res. , 92–99.73. Ruoff, P. 1994 General homeostasis in period and temperature-compensated chemical clockmutants by random selection conditions. Naturwissenschaften , 456—459.74. Mendes, P. & Kell, D. 1998 Non-linear optimization of biochemical pathways: applications tometabolic engineering and parameter estimation. Bioinformatics , 869–83.75. Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. 1983 Optimization by simulated annealing. Science , 671–680. 21 igure legends
Figure 1. Circuit diagrams for the clock models.
Genes are boxed and arrows denoteregulatory interactions. Diamonds represent light inputs. A . The single-loop Neurospora model[19]. FRQ protein represses production of frq transcript. Light acts on the network by upregulating frq transcription. B . The two-loop Neurospora model [18]. Two isoforms of FRQ are producedwhich both repress frq transcription. Light upregulates frq as in A. C . The two-loop Arabidopsis model [35]. TOC1 activates its repressor LHY (combining LHY and CCA1) indirectly througha hypothetical gene X, forming the central negative feedback loop of the circuit. LHY is directlyupregulated by light while light indirectly activates TOC1 via a second hypothetical gene Y, positedto have two distinct light inputs. Y activates TOC1 transcription and TOC1 represses Y, forminga second, interlocked feedback loop. D . The three-loop Arabidopsis model [36]. The additionalPRR gene (combining PRR7 and PRR9) is light-activated and represses LHY transcription. LHYupregulates PRR, creating a third feedback loop.
Figure 2. The abstract topologies for the logic representations of the clock circuitsshown in Fig. 1.
Genes are boxed and arrows denote regulatory interactions. A . The single-loop Neurospora model. B . The two-loop Neurospora model. C . The two-loop Arabidopsis model. D .The three-loop Arabidopsis model. Numerals l index logic gates g l P t , u that can be varied togenerate different regulatory structures. Numerals at the end of an arrow index the single-inputgate defining that interaction. Numerals within boxes index logic gates governing double-inputinteractions. Diamonds represent light inputs, with the corresponding fixed gates in ovals indicatinghow these affect the target species (e.g. in C, L and L are combined with an OR gate after whichthe resulting bitstring is combined with the output of Y through an AND gate - see the Methodssections for full details). τ j s represent the circuit delays and T i s the discretisation thresholds usedto fit continuous data. Figure 3. The number of parameters required for each clock configuration as DE andlogic models.Figure 4. The results of exploring the logic configurations belonging to the abstracttopologies of the 1-loop (A) and 2-loop (B)
Neurospora models.
Cost scores are shown forthe optimal fit of each LC to synthetic data. The LCs are indexed by their decimal representationsfor brevity (see Methods for details). Here, a score of 0 indicates the best fit and a score of 1 theworst fit. Triangles indicate LCs for which the Boolean model yields a viable clock. LCs mirroringthe activation and inhibition pattern of the corresponding DE models in Figs. 1A and B are plottedin red. In A, one such LC mirrors the corresponding DE model, G “ p q , and this emerges asthe optimal configuration yielding a viable clock. In B, only LCs yielding scores less than 0.75 areshown. There are two that mirror the equivalent DE model. One of these, G “ p q , is identifiedas the optimal configuration giving a viable clock (leftmost red triangle). In this LC, either of theFRQ isoforms can independently inhibit transcription (the corresponding two-input gate is of theAND type). Figure 5. The results of exploring the logic configurations belonging to the abstracttopologies of the 2-loop (A) and 3-loop (B)
Arabidopsis models.
Cost scores are shown22or the optimal fit of each LC to synthetic data. As in Fig. 4, each LC is indexed by its decimalexpansion. Scores of 0 and 1 indicate the best and worst fits respectively. Triangles denote LCs forwhich the Boolean model yields a viable clock. LCs mirroring the activation and inhibition patternof the corresponding DE models in Figs. 1C and D are plotted in red. In A, only LCs yielding scoresless than 0.75 are shown. Of these, there are 3 LCs consistent with the activation and inhibitionpattern of the equivalent DE model from a possible total of 4. From this subset, G “ p q emerges as the optimal configuration yielding a viable clock. For this circuit, both LHY and TOC1can independently inhibit Y while TOC1 is repressed unless LHY is repressed while Y is activated(the corresponding gates are both of the AND type). In B, only the 8 LCs obtained by varyingthe gates of the PRR-LHY loop were considered: all other gates were fixed to those of the optimal2-loop configuration. Of these 8 possible circuits, two LCs were consistent with the equivalent DEmodel, from which G “ p q is identified as the optimal configuration giving a viableclock. This LC corresponds to a clock network in which LHY is repressed unless PRR is inactiveand X is active (the corresponding gate is of the AND type). Figure 6. Time series for the differential equation and Boolean versions of the clockmodels in 12:12 LD cycles.
Two 24h cycles are plotted for each model. A , B : 1-loop Neurospora ; C , D : 2-loop Neurospora ; E , F : 2-loop Arabidopsis ; G , H : 3-loop Arabidopsis . Differential equationtime series (left panels) have been normalised to lie between 0 and 1 in order to facilitate comparisonwith the Boolean simulations (right panels). Different components within a model are slightly offsetfrom one another so they can be distinguished more easily. The time step used for solving theBoolean models was 0.5h, equal to the data sampling interval.
Figure 7. Comparing the photoperiodic behaviour of the Boolean and DE versions ofeach model.
For Boolean models, the phase of each species is taken as the time within the LDcycle of the ON to OFF transition (downward triangles). Analogously, the DE model phases aredefined as the times at which species decrease below the thresholds yielding the optimal fit of thecorresponding Boolean circuit to data (upward triangles). Shaded areas of plots, darkness; openareas, light. A . 1-loop Neurospora . The phase-photoperiod profiles are coincident, indicating thatthe Boolean model exactly reproduces the photoperiodic behaviour of its DE counterpart: FRQtranscript and protein are both locked to dusk across the photoperiod range. B . 2-loop Neurospora .The phase plots are almost exactly equal, except for shorter photoperiods where they differ by thedata sampling interval. As for A, all components are locked to dusk. C . 2-loop Arabidopsis . TheBoolean and DE models exhibit very similar patterns of dawn- and dusk-locking across genes. Thetwo Y phase-photoperiod profiles reflect the double peak observed in this component (Fig. 6E)which gives rise to a (dawn-locked) light-induced peak and a (dusk-locked) circadian peak [57]. D .3-loop Arabidopsis . The phase plots are similar, with all components predominately dawn-locked.As for 2-loop
Arabidopsis , the two Y profiles reflect the double peak observed for this gene (Fig.6G).
Figure 8. Identifying the logic configurations of the 3-loop
Arabidopsis model givingthe best fits to experimental data.
Plotting conventions are as described in Figs. 4 and 5. A shows the top ranking LCs . The black arrow indicates the optimal configuration yielding a viableclock, G OP T “ p q . For this circuit, LHY and TOC1 repress each other, in agreementwith recent biochemical evidence [51]. The red arrow denotes the second highest ranking viable23C, G DE “ p q . This LC matches the regulatory structure of the DE model, and waspreviously identified as the configuration yielding the best fit to synthetic data (see Fig. 5B). B plots the top ranking LCs obtained under the constraint that LHY represses TOC1 and TOC1activates LHY (i.e. g “ g “ g “
0; see Fig. 2D). Under this assumption, G DE emerges asthe optimal clock circuit. The circuit diagrams for G OP T and G DE can be seen in Supp. Fig. S4. Figure 9. Simulations generated by the optimal fits of the 3-loop
Arabidopsis modelto experimental data. A . Experimental expression profiles for the genes CCA1, TOC1, GI andPRR9 in free-running conditions (LL). Expression levels were determined using LUC reporter geneimaging constructs and have been normalised to lie between 0 and 1. B . The equivalent Booleantime series generated by the logic configuration G OP T yielding the best fit to data. C . Booleanexpression profiles for the highest ranked configuration G DE incorporating the central LHY-CCA1negative feedback loop of the DE model. In all plots, different components are slightly offset fromone another so they can be distinguished more easily. The time step used for solving the logicmodel was 1.5h, equal to the data sampling interval.24 ables Table 1:
Optimal parameter sets - synthetic data.
One-loop Two-loop Two-loop Three-loop
Neurospora Neurospora Arabidopsis Arabidopsis G
01 00111 10011011 10011011011 τ (h) 5 (5) 5 (5.5) 1.5 (3) 0 (3) τ (h) 6.5 (6.5) 1.5 (2) 5.5 (6) 5.5 (6.5) τ (h) 7.5 (9) 6 (6.5) 6.5 (7.5) 7 (8) τ (h) - 10 (9) 0 (0.5) 0 (1) τ (h) - 9 (9) 7.5 (6) 8 (5) τ (h) - - 4 (4) 5 (3.5) τ (h) - - 0 (1) 4.5 (6) τ (h) - - 2.5 (0.5) 6 (5) τ (h) - - 1 (0) 1 (0) τ (h) - - - 1 (3) τ (h) - - - 0.5 (0) τ (h) - - - 3 (4) T T T - 0.250 (0.350) 0.575 (0.750) 0.6250 (0.7325) T - - 0.150 (0.325) 0.2250 (0.1295) T - - - 0.8250 (0.9500) p (h) P P p (h) - P P Pp (h) - - 0.5 0.5 p (h) - - - 3The logic configurations, G , delays, τ j , and discretisation thresholds, T i (0 ă T i ă p k indicates the parameter used to simulate light input L k through equation (2). These were fixed atthe values shown, with P denoting the photoperiod t DUSK ´ t DAW N .25able 2: