Robustness and Games Against Nature in Molecular Programming
Jack H. Lutz, Neil Lutz, Robyn R. Lutz, and Matthew R. Riley
Department of Computer Science, Iowa State University
Department of Computer and Information Science, University of Pennsylvania
February 19, 2019
Abstract
Matter, especially DNA, is now programmed to carry out useful processes at the nanoscale. As these programs and processes become more complex and their envisioned safety-critical applications approach deployment, it is essential to develop methods for engineering trustworthiness into molecular programs. Some of this can be achieved by adapting existing software engineering methods, but molecular programming also presents new challenges that will require new methods. This paper presents a method for dealing with one such challenge, namely, the difficulty of ascertaining how robust a molecular program is to perturbations of the relative "clock speeds" of its various reactions. The method proposed here is game-theoretic. The robustness of a molecular program is quantified in terms of its ability to win (achieve its original objective) in games against other molecular programs that manipulate its relative clock speeds. This game-theoretic approach is general enough to quantify the security of a molecular program against malicious manipulations of its relative clock speeds. However, this preliminary report focuses on games against nature, games in which the molecular program's opponent perturbs clock speeds randomly (indifferently) according to the probabilities inherent in chemical kinetics.
Molecular programming is, at its simplest, computation with DNA. A programmed molecular system is a nanosystem that will execute the algorithmic behavior encoded into it. Examples of programmed DNA systems include neural net simulation, probabilistic switching circuits, nano-robotic walkers, and oscillators [7, 21, 24, 27]. That is, we are programming matter itself when we create a programmed molecular system.

Many of the intended uses of molecular programming are safety-critical, such as biosensors to detect pollutants, diagnostic devices to identify diseases, and nano-robotic walkers to perform targeted drug delivery [9]. Software engineering techniques including goal-oriented requirements modeling [26], risk analyses [14], and probabilistic model checking [15] have recently been extended to the emerging field of molecular programming, in order to aid the development of safe programmed molecular systems [10, 15, 16].

Assuring the robustness of a molecular program needs to occur before such a system is deployed. However, what robustness means for such a system is not well-defined. This, in turn, hinders efforts to determine how robust a particular system is.

The problem addressed by this paper is the difficulty of answering the question, "how robust is this molecular program?" The contribution of the paper is to propose a game-theoretic method by which we can quantitatively evaluate how robust a molecular program is to an opponent's perturbing the relative clock speeds of its constituent processes (reactions). This sort of robustness is especially important, because the "rate constants" that govern the rates of chemical reactions are notoriously approximate and non-constant in actual laboratory experiments. We formulate this as a game against nature [17, 19], in which nature manipulates clock speeds at random, disinterested in the outcome. Although this approach is general enough to evaluate robustness in the face of a game against a malicious opponent, we focus here on the random perturbations inflicted by an indifferent nature. We illustrate our method on an important consensus algorithm, approximate majority [5]. We thus develop a games-against-nature formalism of robustness and then evaluate it on a small molecular program that computes the approximate majority.
Molecular programs are typically specified as chemical reaction networks (CRNs) [1, 6], which are roughly equivalent to stochastic Petri nets [8]. These CRNs can then be automatically compiled into DNA strand displacement systems that can be implemented in laboratory experiments [2, 4, 23].
Syntax
We now review the definition of CRNs. We fix a countably infinite set $\mathbf{S}$ whose elements are called species. We informally regard each species as an abstract type of molecule.

A reaction over a finite set $S \subseteq \mathbf{S}$ is an ordered triple

$$\rho = (\mathbf{r}, \mathbf{p}, k) \in \mathbb{N}^S \times \mathbb{N}^S \times (0, \infty),$$

where $\mathbf{r} \neq \mathbf{p}$ and $\mathbb{N}^S$ is the set of functions from $S$ into $\mathbb{N}$. Given such a reaction $\rho$, we call $\mathbf{r}(\rho) = \mathbf{r}$ the reactant vector of $\rho$, $\mathbf{p}(\rho) = \mathbf{p}$ the product vector of $\rho$, and $k(\rho) = k$ the rate constant of $\rho$. (Since $S$ is finite, it is natural to regard elements of $\mathbb{N}^S$ as vectors.) The species in the support $\mathrm{supp}(\mathbf{r}) = \{X \in S \mid \mathbf{r}(X) > 0\}$ are the reactants of $\rho$, and the species in $\mathrm{supp}(\mathbf{p})$ are the products of $\rho$.

We usually write reactions in a more chemistry-like notation. For example, if $S = \{X, Y, C\}$, then we write

$$X + C \xrightarrow{k} 2Y + C$$

for the reaction $(\mathbf{r}, \mathbf{p}, k)$, where $\mathbf{r}(X) = 1$, $\mathbf{r}(Y) = 0$, $\mathbf{r}(C) = 1$, $\mathbf{p}(X) = 0$, $\mathbf{p}(Y) = 2$, and $\mathbf{p}(C) = 1$. A species $C$ satisfying $\mathbf{r}(\rho)(C) = \mathbf{p}(\rho)(C) > 0$, as in this example, is called a catalyst of the reaction $\rho$. The net effect of a reaction $\rho$ is the vector $\Delta\rho = \mathbf{p}(\rho) - \mathbf{r}(\rho) \in \mathbb{Z}^S$. The arity of a reaction $\rho$ is

$$\mathrm{arity}(\rho) = \sum_{Y \in S} \mathbf{r}(\rho)(Y),$$

i.e., its total number of reactants. A chemical reaction network (CRN) is an ordered pair $N = (S, R)$, where $S \subseteq \mathbf{S}$ is finite and $R$ is a finite set of reactions over $S$.

Semantics
In this paper we assign each CRN $N = (S, R)$ the operational meaning given by the stochastic mass action semantics (also called the stochastic mass action kinetics) introduced by Gillespie [13]. In this semantics a state of $N$ is a vector $\mathbf{x} \in \mathbb{N}^S$. For each $Y \in S$, the component $\mathbf{x}(Y)$ of $\mathbf{x}$ is the count of species $Y$ in the state $\mathbf{x}$. A reaction $\rho$ is applicable to state $\mathbf{x}$ if $\mathbf{r}(\rho) \leq \mathbf{x}$, i.e., all the reactants of $\rho$ are present in $\mathbf{x}$. If $\rho$ is applicable to $\mathbf{x}$, then the result of applying $\rho$ to $\mathbf{x}$ is the state $\rho(\mathbf{x}) = \mathbf{x} + \Delta\rho \in \mathbb{N}^S$.

The (stochastic mass action) rate of a reaction $\rho$ in a state $\mathbf{x} \in \mathbb{N}^S$ and volume $V > 0$, denoted $\mathrm{rate}_{\mathbf{x}}(\rho)$, was defined and justified by Gillespie [13]. Here we give a single example. Let $\rho$ be a reaction

$$3Y + Z \xrightarrow{k} \mathrm{RHS}.$$

(The right-hand side RHS does not affect the rate of a reaction.) For brevity, write $y = \mathbf{x}(Y)$ and $z = \mathbf{x}(Z)$. If $\rho$ is applicable to $\mathbf{x}$ (i.e., if $y \geq 3$ and $z \geq 1$), then

$$\mathrm{rate}_{\mathbf{x}}(\rho) = k \cdot V^{1 - \mathrm{arity}(\rho)} \cdot y(y-1)(y-2) \cdot z = \frac{k\, y(y-1)(y-2)\, z}{V^3}.$$

Under stochastic mass-action semantics, a CRN $N = (S, R)$ functions as a continuous-time Markov chain [22] with state space $\mathbb{N}^S$ and, for each $\mathbf{x}, \mathbf{y} \in \mathbb{N}^S$, transition rate

$$\mathrm{rate}(\mathbf{x}, \mathbf{y}) = \sum_{\rho(\mathbf{x}) = \mathbf{y}} \mathrm{rate}_{\mathbf{x}}(\rho).$$

The CRN $N$ is initialized to some state or distribution over states. When it enters a state $\mathbf{x}$, it stays there for a random, real-valued sojourn time $t \in (0, \infty]$ before instantaneously executing some reaction $\rho$ and jumping to the state $\rho(\mathbf{x})$. A trajectory of $N$ is thus a sequence $\tau = ((\mathbf{x}_0, t_0), (\mathbf{x}_1, t_1), \ldots)$ of ordered pairs $(\mathbf{x}_i, t_i)$, where $\mathbf{x}_i$ is a state of $N$ and $t_i$ is the associated sojourn time. The trajectory $\tau$ is finite if it reaches a state to which no reaction is applicable. Otherwise, the trajectory is infinite.

To quantify the robustness of a CRN, we consider how its performance might be affected by other CRNs that are present in the same solution. Clearly, this evaluation will depend both on how performance is defined and on what kinds of other CRNs are present.
We begin by describing a general game-theoretic framework that allows for any scalar quantification of performance (by defining appropriate utility functions) and arbitrary constraints placed on the other CRNs (by restricting the other players' strategy spaces). We then discuss the special case where interactions between CRNs are mediated only by catalysts.

An $n$-player CRN game with players $1, 2, \ldots, n$ is a pair $G = (\mathcal{N}, \mathbf{u})$ with the following components.

1. $\mathcal{N} = \mathcal{N}_1 \times \mathcal{N}_2 \times \cdots \times \mathcal{N}_n$ is the strategy profile space. To play the game, each player $i$ selects a strategy: a CRN $N_i = (S_i, R_i)$ from its strategy space $\mathcal{N}_i$, which is a set of CRNs. For convenience we require that $(\emptyset, \emptyset) \in \mathcal{N}_i$. The $n$ players' selected strategies collectively define a strategy profile $\sigma \in \mathcal{N}$ and a CRN $N_\sigma = (S_\sigma, R_\sigma)$, where $S_\sigma = \bigcup_i S_i$ and $R_\sigma = \bigcup_i R_i$. A state of the game comprises counts of all species in $S_\sigma$. As in a CRN, a trajectory in this game is a (finite or infinite) sequence of states paired with sojourn times. The space of all trajectories for the game $G$ is $\mathcal{T}_G$.

2. $\mathbf{u} = (u_1, u_2, \ldots, u_n)$ is a profile of utility functions $u_i : \mathcal{T}_G \to \mathbb{R}$, where $u_i(\tau)$ is the utility player $i$ gets from trajectory $\tau$. In our example below, this utility function is simply a binary indicator for "success," meaning that the player gets utility 1 from any trajectory that performs a given task correctly and gets utility 0 from all other trajectories. A player representing "nature" is totally indifferent to the game's outcome and hence gets utility 0 from all trajectories.

Let $\xi = (\xi_1, \xi_2, \ldots, \xi_n)$ where $\xi_i \in \mathbb{N}^{S_i}$ is the random vector for $N_i$'s initial state, and let $\hat{\xi}_i$ be the embedding of $\xi_i$ in $\mathbb{N}^{S_\sigma}$. Then the initial state of $N_\sigma$ is the random vector $\sum_i \hat{\xi}_i$.
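The construction of $N_\sigma$ and of its initial state can be illustrated with a short sketch. The representation below is an assumption made for illustration only: the names `Reaction`, `compose`, and `initial_state` are hypothetical, a strategy is a pair of frozensets, and each player's initial state is one sampled realization of $\xi_i$ given as a dictionary of counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """A reaction (r, p, k); vectors are stored as (species, count) pairs."""
    reactants: tuple
    products: tuple
    k: float


def compose(strategies):
    """Build N_sigma = (S_sigma, R_sigma) as the unions of the players'
    species sets and reaction sets."""
    species = frozenset().union(*(s for s, _ in strategies))
    reactions = frozenset().union(*(r for _, r in strategies))
    return species, reactions


def initial_state(species, player_states):
    """Embed each player's initial state in N^{S_sigma} and add them up."""
    total = {sp: 0 for sp in species}
    for state in player_states:
        for sp, count in state.items():
            total[sp] += count
    return total


# Hypothetical two-player example: a catalyzed CRN and a CRN acting on catalysts.
player1 = (
    frozenset({"X", "Y", "A", "B"}),
    frozenset({
        Reaction((("X", 2), ("Y", 1), ("A", 1)), (("X", 3), ("A", 1)), 1.0),
        Reaction((("X", 1), ("Y", 2), ("B", 1)), (("Y", 3), ("B", 1)), 1.0),
    }),
)
player2 = (
    frozenset({"A", "B"}),
    frozenset({
        Reaction((("A", 1),), (("B", 1),), 0.5),
        Reaction((("B", 1),), (("A", 1),), 0.5),
    }),
)
species, reactions = compose([player1, player2])
start = initial_state(species, [{"X": 6, "Y": 4}, {"A": 3, "B": 3}])
```

The trivial strategy $(\emptyset, \emptyset)$ corresponds to `(frozenset(), frozenset())` here and leaves the composed CRN unchanged.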
Given this initial distribution, the theory of continuous-time Markov chains specifies a probability measure on the set $\mathcal{T}_{N_\sigma}$ of all trajectories of the CRN $N_\sigma$ [22], which also immediately yields a probability measure $\mu_{\sigma,\xi}$ on $\mathcal{T}_G$. This allows us to define the function

$$U_i(\sigma, \xi) = \int_{\mathcal{T}_G} u_i(\tau)\, d\mu_{\sigma,\xi}(\tau),$$

which is the expected value of player $i$'s utility when the strategy profile $\sigma$ is played.

Robustness
We measure the robustness of a CRN $N_1$ to a profile $(N_2, N_3, \ldots, N_n)$ of other players' CRNs by comparing player 1's expected utility playing $N_1$ against those CRNs to its expected utility playing $N_1$ against trivial CRNs. Formally, for any $\alpha \in [0, 1]$, $N_1 \in \mathcal{N}_1$ is $\alpha$-robust to $(N_2, N_3, \ldots, N_n)$ in game $G$ under $\xi$ if

$$U_1((N_1, N_2, \ldots, N_n), \xi) \geq \alpha\, U_1((N_1, (\emptyset, \emptyset), \ldots, (\emptyset, \emptyset)), \xi'),$$

where $\xi' = (\xi_1, \epsilon, \epsilon, \ldots, \epsilon)$ and $\epsilon$ is the trivial distribution that assigns probability 1 to the empty vector. This means that the participation of other players using these strategies cannot push player 1's expected utility below a fraction $\alpha$ of what it achieves in isolation.

Catalytic Games
The very general CRN games that we have defined allow essentially unrestricted interactions among the players' CRNs. For many purposes, including those of this paper, it is more appropriate to restrict these interactions to those mediated by catalysts.

A catalytic game is one in which each player's set of species can be written as $S_i = A_i \cup C_i$ such that

1. $A_i$ and $A_j$ are disjoint for all $j \neq i$, and
2. for all $C \in C_i$ and $\rho \in R_i$, we have $\Delta\rho(C) = 0$.

In such a game, each player can affect other players only by altering the counts of their catalysts, and hence only by altering the rates of their reactions.

Approximate Majority
In this preliminary report, we use a game against nature to investigate the robustness of a simple chemical reaction network that computes approximate majority.

The task in approximate majority is to design a chemical reaction network $N$ with two designated species $X$ and $Y$ and the following objective. Let $x(t)$ and $y(t)$ be the counts of $X$ and $Y$, respectively, at time $t$. First, the total population $x(t) + y(t)$ should be constant as $t$ varies. Moreover, if $x(0)$ and $y(0)$ differ by a non-negligible amount, then whichever is larger should eventually "take over." That is, if $x(0) \gg y(0)$, then we should with high probability have $x(t) = x(0) + y(0)$ (and hence $y(t) = 0$) for all sufficiently large $t$. Similarly, if $y(0) \gg x(0)$, then we should with high probability have $y(t) = x(0) + y(0)$ for all sufficiently large $t$. If $x(0)$ is very close to $y(0)$, then we want one of these "takeovers" to occur with high probability, but it may be either species that takes over.

We investigate the robustness of the approximate majority CRN

$$R: \quad 2X + Y \to 3X, \qquad X + 2Y \to 3Y$$

of Condon, Hajiaghayi, Kirkpatrick and Manuch [5]. In order to do this in a catalytic game, we replace $R$ with the catalyzed CRN

$$R': \quad 2X + Y + A \to 3X + A, \qquad X + 2Y + B \to 3Y + B.$$

The crucial thing to note here is that, if $x$, $y$, $a$, and $b$ are the counts of $X$, $Y$, $A$, and $B$ at some time, then the rates ("clock speeds") of the reactions in $R$ at this time are $x(x-1)y$ and $xy(y-1)$, while the rates of the reactions in $R'$ are $ax(x-1)y$ and $bxy(y-1)$. If $a$ and $b$ are equal and constant, then $R'$ is merely a uniformly sped-up version of $R$. However, if $a$ and $b$ vary randomly, then the relative rates of the reactions in $R'$ also vary randomly (i.e., the ratio of these rates varies randomly).

In order to test the robustness of $R$ to random perturbations of the relative rates of its reactions, we thus play the CRN $R'$ against a random "nature" that varies the initially equal counts $a$ and $b$ randomly.
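The effect of unequal catalyst counts on the relative clock speeds can be made explicit by taking the ratio of the two rates of $R'$ stated above. The short derivation below is a sketch added for illustration. It assumes $x \geq 1$, $y \geq 2$, and $b \geq 1$ so that the denominator is nonzero, and the numerical values mentioned afterwards are hypothetical.

```latex
\[
\frac{\mathrm{rate}(2X + Y + A \to 3X + A)}{\mathrm{rate}(X + 2Y + B \to 3Y + B)}
  = \frac{a\,x(x-1)\,y}{b\,x\,y(y-1)}
  = \frac{a}{b}\cdot\frac{x-1}{y-1}.
\]
```

When $a = b$, the factor $a/b$ equals 1 and the ratio is $(x-1)/(y-1)$, exactly the ratio of the two rates in the uncatalyzed $R$. With hypothetical counts $a = 110$ and $b = 90$, the ratio is multiplied by $110/90 \approx 1.22$, so the first reaction is favored beyond what the counts of $X$ and $Y$ alone would give.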
We model this behavior by the simple CRN

$$N_k: \quad A \xrightarrow{k} B, \qquad B \xrightarrow{k} A,$$

which is "calibrated" by the rate constant $k \in (0, \infty)$.

Simulation
We assessed the robustness of the approximate majority CRN $R$ by comparing the performance of the catalyzed CRN $R'$ in isolation with its performance in the presence of the CRN $N_k$ representing nature randomly perturbing rate constants. We frame this as a game where the utility of the approximate majority player is given by its success frequency.

We first created models in MATLAB using SimBiology software tools for the catalyzed approximate majority CRN $R'$ and the nature CRN $N_k$ described above. With initial populations $a(0) = b(0) = 100$ and combined initial population $x(0) + y(0) = 10{,}000$, we varied $x(0) - y(0)$ from 0 to 1,000 by intervals of 10. With these initial conditions, we ran $R'$ in the presence of $N_k$ with $k = 10$. We ran 10,000 trials for each set of initial conditions. Each trial converged within $10^{-[\ldots]}$ time units, meaning that either $x(10^{-[\ldots]})$ or $y(10^{-[\ldots]})$ was 0. The design of the CRN $R'$ guarantees that, once one population has taken over, no further reactions can occur.

As Fig. 1 shows, the random perturbations of $a$ and $b$ did reduce the success probability of the approximate majority algorithm for some values of $x(0) - y(0)$. For example, when $x(0) = 5{,}120$ and $y(0) = 4{,}880$, $R'$ was 99% successful in a vacuum but only 76% successful in the presence of nature. However, even with the random perturbations, the success frequency in the presence of nature was always greater than 70% of the success frequency in the absence of nature. This suggests that the CRN $R'$ is at least 0.7-robust to $N_k$ in this game under arbitrary distributions of initial states with $a = b = 100$ and $x + y = 10{,}000$.
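An experiment of this kind can be sketched in a few lines of Python. The code below is not the MATLAB SimBiology model used for the results above; it is a minimal illustration that runs Gillespie's direct method on $R'$ together with $N_k$, estimates success frequencies, and forms their ratio. The function names, unit rate constants for $R'$, unit volume, and the small populations in the usage example are all assumptions chosen so that the sketch runs quickly.

```python
import random


def simulate(x, y, a, b, k_nature, rng, volume=1.0):
    """One trajectory of R' run together with N_k (Gillespie's direct method).

    Returns the final counts of X and Y and the elapsed time. Unit rate
    constants for the two reactions of R' are an assumption of this sketch.
    """
    t = 0.0
    while x > 0 and y > 0:
        if x == 1 and y == 1:
            break  # neither majority reaction can ever fire again
        rates = [
            a * x * (x - 1) * y / volume**3,  # 2X + Y + A -> 3X + A
            b * x * y * (y - 1) / volume**3,  # X + 2Y + B -> 3Y + B
            k_nature * a,                     # A -> B
            k_nature * b,                     # B -> A
        ]
        total = sum(rates)
        if total == 0.0:
            break  # no reaction is applicable
        t += rng.expovariate(total)  # exponentially distributed sojourn time
        choice = rng.choices(range(4), weights=rates)[0]
        if choice == 0:
            x, y = x + 1, y - 1
        elif choice == 1:
            x, y = x - 1, y + 1
        elif choice == 2:
            a, b = a - 1, b + 1
        else:
            a, b = a + 1, b - 1
    return x, y, t


def success(x0, y0, x, y):
    """True if the initial majority took over (either species if tied)."""
    if x0 > y0:
        return y == 0
    if y0 > x0:
        return x == 0
    return x == 0 or y == 0


def success_frequency(x0, y0, a0, b0, k_nature, trials, seed=0):
    """Fraction of simulated trials in which approximate majority succeeds."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        x, y, _t = simulate(x0, y0, a0, b0, k_nature, rng)
        wins += success(x0, y0, x, y)
    return wins / trials


def estimated_alpha(freq_with_nature, freq_without_nature):
    """Largest alpha in [0, 1] with freq_with >= alpha * freq_without."""
    if freq_without_nature == 0:
        return 1.0
    return min(1.0, freq_with_nature / freq_without_nature)


if __name__ == "__main__":
    # Small, hypothetical parameters; k_nature = 0 plays the role of "no nature".
    with_nature = success_frequency(34, 26, 5, 5, k_nature=200.0, trials=200)
    without = success_frequency(34, 26, 5, 5, k_nature=0.0, trials=200)
    print(with_nature, without, estimated_alpha(with_nature, without))
```

Applied to the frequencies reported above for $x(0) = 5{,}120$ (76% against nature, 99% without), the same ratio is $0.76 / 0.99 \approx 0.77$, consistent with the 0.7 bound.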
Figure 1: Success frequency (utility) of the approximate majority CRN $R'$ with combined initial populations $x(0) + y(0) = 10{,}000$ and initial catalyst populations $a(0) = b(0) = 100$, both in isolation ("Without nature") and in a game against the nature CRN $N_k$ ("Against nature"), for varying values of $x(0) - y(0)$.

Software engineering for molecular programming is a new research direction with open problems that can benefit from the attention of the software engineering research community. Many planned molecular systems will be deployed for use in vivo within a few years, and certification for safety-critical scenarios such as biosensors and drug delivery devices will require improved evidence of robustness. Molecular program developers similarly will be called upon to prevent system design vulnerabilities to malicious adversaries. Software engineering has an essential role to play in what scientists are already labeling as the century of life sciences [12].

The preliminary work described in this paper uses a game-theoretic approach to (1) formulate the robustness of a molecular program's CRN model in terms of a game against nature and (2) provide a method to quantitatively evaluate its robustness. The example we present concerns random perturbations of the program's clock speed by nature; however, the approach is general enough to also enable evaluation of security against an adversarial molecular program that maliciously perturbs the relative clock speeds. Future work on this will entail challenging issues involving strategic equilibria [18]. Our approach provides a foundation from which to pursue improved development and deployment of verifiably robust programmed molecular systems. More broadly, robustness in the presence of probabilistic behavior is also required for many non-molecular programmed systems [3, 11, 20, 25], and the advances described here may enhance our understanding of how to design in and verify robustness for other asynchronous systems operating in stochastic environments.
Acknowledgments
This research was supported in part by National Science Foundation grant 1545028. We thank Jim Lathrop for tool assistance.
References

[1] David F. Anderson and Thomas G. Kurtz, editors. Stochastic Analysis of Biochemical Systems, volume 1.2 of Stochastics in Biological Systems. Springer International Publishing, 2015.
[2] Stefan Badelt, Seung Woo Shin, Robert F. Johnson, Qing Dong, Chris Thachuk, and Erik Winfree. A general-purpose CRN-to-DSD compiler with formal verification, optimization, and simulation capabilities. In Proceedings of the 23rd International Conference on DNA Computing and Molecular Programming, Lecture Notes in Computer Science, pages 232–248, 2017.
[3] Radu Calinescu, Carlo Ghezzi, Marta Z. Kwiatkowska, and Raffaela Mirandola. Self-adaptive software needs quantitative verification at runtime. Communications of the ACM, 55(9):69–77, 2012.
[4] Yuan-Jyue Chen, Neil Dalchau, Niranjan Srinivas, Andrew Phillips, Luca Cardelli, David Soloveichik, and Georg Seelig. Programmable chemical controllers made from DNA. Nature Nanotechnology, 8(10):755–762, 2013.
[5] Anne Condon, Monir Hajiaghayi, David G. Kirkpatrick, and Ján Manuch. Simplifying analyses of chemical reaction networks for approximate majority. In DNA Computing and Molecular Programming - 23rd International Conference, DNA 23, Austin, TX, USA, September 24-28, 2017, Proceedings, pages 188–209, 2017.
[6] Matthew Cook, David Soloveichik, Erik Winfree, and Jehoshua Bruck. Programmability of chemical reaction networks. In Anne Condon, David Harel, Joost N. Kok, Arto Salomaa, and Erik Winfree, editors, Algorithmic Bioprocesses, Natural Computing Series, pages 543–584. Springer, 2009.
[7] Frits Dannenberg, Marta Kwiatkowska, Chris Thachuk, and Andrew J. Turberfield. DNA walker circuits: Computational potential, design, and verification. In Proceedings of the 19th International Conference on DNA Computing and Molecular Programming, volume 8141 of Lecture Notes in Computer Science, pages 31–45. Springer, 2013.
[8] René David and Hassane Alla. Discrete, Continuous, and Hybrid Petri Nets. Springer, 2010.
[9] Shawn M. Douglas, Ido Bachelet, and George M. Church. A logic-gated nanorobot for targeted transport of molecular payloads. Science, 335(6070):831–834, 2012.
[10] Samuel J. Ellis, Titus H. Klinge, James I. Lathrop, Jack H. Lutz, Robyn R. Lutz, Andrew S. Miner, and Hugh D. Potter. Runtime fault detection in programmed molecular systems. ACM Transactions on Software Engineering and Methodology, to appear.
[11] Antonio Filieri, Carlo Ghezzi, and Giordano Tamburrelli. Run-time efficient probabilistic model checking. In Proceedings of the 33rd International Conference on Software Engineering, pages 341–350. ACM, 2011.
[12] Jasmin Fisher, David Harel, and Thomas A. Henzinger. Biology as reactivity. Communications of the ACM, 54(10):72–82, 2011.
[13] Daniel T. Gillespie. Exact stochastic simulation of coupled chemical reactions. The Journal of Physical Chemistry, 81(25):2340–2361, 1977.
[14] John Knight. Fundamentals of Dependable Computing for Software Engineers. CRC Press, 2012.
[15] Marta Kwiatkowska and Chris Thachuk. Probabilistic model checking for biology. Software Systems Safety, 36:165–189, 2014.
[16] Robyn Lutz, Jack Lutz, James Lathrop, Titus Klinge, Eric Henderson, Divita Mathur, and Dalia Abo Sheasha. Engineering and verifying requirements for programmable self-assembling nanomachines. In Proceedings of the 34th International Conference on Software Engineering, pages 1361–1364. IEEE, 2012.
[17] John Milnor. Games against nature. Technical report, Rand Corporation, 1951.
[18] Abraham Neyman. Continuous-time stochastic games. Games and Economic Behavior, 104:92–130, 2017.
[19] Christos H. Papadimitriou. Games against nature. J. Comput. Syst. Sci., 31(2):288–301, 1985.
[20] Esteban Pavese, Víctor Braberman, and Sebastián Uchitel. Less is more: Estimating probabilistic rewards over partial system explorations. ACM Transactions on Software Engineering and Methodology, 25(2):16:1–16:47, 2016.
[21] Lulu Qian, Erik Winfree, and Jehoshua Bruck. Neural network computation with DNA strand displacement cascades. Nature, 475(7356):368–372, 2011.
[22] Y.A. Rozanov. Probability Theory: A Concise Course. Dover Publications, 1969.
[23] David Soloveichik, Georg Seelig, and Erik Winfree. DNA as a universal substrate for chemical kinetics. Proceedings of the National Academy of Sciences, 107(12):5393–5398, 2010.
[24] Niranjan Srinivas, James Parkin, Georg Seelig, Erik Winfree, and David Soloveichik. Enzyme-free nucleic acid dynamical systems. Science, 358(6369), 2017.
[25] Guoxin Su, Taolue Chen, Yuan Feng, and David S. Rosenblum. ProEva: runtime proactive performance evaluation based on continuous-time markov chains. In Proceedings of the 39th International Conference on Software Engineering, ICSE 2017, Buenos Aires, Argentina, May 20-28, 2017, pages 484–495, 2017.
[26] Axel van Lamsweerde. Requirements Engineering - From System Goals to UML Models to Software Specifications. Wiley, 2009.
[27] Daniel Wilhelm, Jehoshua Bruck, and Lulu Qian. Probabilistic switching circuits in DNA.