Poisson channel with binary Markov input and average sojourn time constraint

Abstract

A minimal model for gene expression, consisting of a switchable promoter together with the resulting messenger RNA, is equivalent to a Poisson channel with a binary Markovian input process. Determining its capacity is an optimization problem with respect to two parameters: the average sojourn times of the promoter's active (ON) and inactive (OFF) state. An expression for the mutual information is found by solving the associated filtering problem analytically on the level of distributions. For fixed peak power, three bandwidth-like constraints are imposed by lower-bounding (i) the average sojourn times (ii) the autocorrelation time and (iii) the average time until a transition. OFF-favoring optima are found for all three constraints, as commonly encountered for the Poisson channel. In addition, constraint (i) exhibits a region that favors the ON state, and (iii) shows ON-favoring local optima.



Mark Sinzger, Maximilian Gehri, Heinz Koeppl

Dept. of Electrical Engineering, Centre for Synthetic Biology, Technische Universität Darmstadt

Darmstadt, Germany
Email: {mark.sinzger, maximilian.gehri, heinz.koeppl}@bcs.tu-darmstadt.de

Index Terms: Poisson channel, gene expression, binary Markov, average sojourn time, filtering, bandwidth constraint

I. INTRODUCTION

There is mounting evidence that the information encoded in the temporal concentration profiles of biomolecules plays a key role in cellular sensing and decision making [1], [2] and helps to overcome biochemical noise [3]. The computation of mutual information (MI) between time-varying biomolecular signals is complex, and analytical solutions have so far relied on Gaussian [4] or steady-state approximations [5]. Other papers have focused on single time-point transmission, e.g., [6], [7]. Works that account for the discrete nature of chemical reactions commonly assume diffusion approximations as inputs [8], [9] or are based on stochastic simulation [?], [10], [11]. Restrictions to sub-classes of discrete-state input processes permit analytical bounds on the capacity [12], often challenging diffusion-based results [13].

This paper analyzes the minimal gene expression model with a two-state promoter (see e.g. [14] and Fig. 2a in [15]) as an analytically tractable example. Switching stochastically between activated state $x_2$ and inactivated state $x_1$, the promoter is modeled by a stationary random telegraph process $X(t)$, i.e., a binary, time-homogeneous, stationary Markov process (BMP). Its state linearly modulates the synthesis rate of messenger RNA molecules $Y(t)$. The decay of mRNA molecules can be ignored from an information theoretic point of view, because birth events are uniquely identified from birth-death trajectories [10]. The joint distribution of $(X, Y)$, which factorizes into the conditional $Y|X$ and the input path distribution $\mu_X$, is equivalent to the Poisson channel whose class of input processes $X(t)$ is restricted to BMPs. We distinguish between channels with leakage ($x_1 > 0$) and without leakage ($x_1 = 0$), see Fig. 1a.

The work was supported by the European Research Council (ERC) within the Consolidator Grant CONSYN (grant agreement no. 773196). This article was accepted for publication by IEEE, ISIT 2020. doi: 10.1109/ISIT44484.2020.9174360 ©2020 IEEE
By $X_{[0,T]}$ we denote the trajectory $(X(t))_{0 \le t \le T}$ of the time-varying input signal with transmission duration $T$. The sojourn times $\sigma_1$ in $x_1$ and $\sigma_2$ in $x_2$ are exponentially distributed with parameters $c_1$ and $c_2$, such that $E[\sigma_1] = c_1^{-1}$ and $E[\sigma_2] = c_2^{-1}$. The channel output $Y(t)$ fires at rate $c_3 X(t)$, where $c_3$ is the channel gain that dictates the time scale of $Y(t)$. We denote the jump times of $Y$ as $t_i$ with $0 < t_1 < \dots < T$.

The Poisson channel was introduced as a model for direct-detection optical communication systems [16]. Classically, peak and average power are constrained. Then, among general inputs, the class of BMPs achieves capacity, however at the cost of infinite switching rates $c_1, c_2$ [17], [18]. This physical implausibility motivates bandwidth-like constraints [19], [20]. By restricting general signals to binary inputs with lower-bounded sojourn times, [21] reported a transition from asymmetric to symmetric allocation for the capacity-achieving input as the lower bound increases.

A. Problem statement and outline

We consider the Poisson channel with BMPs as input class and investigate the following bandwidth-like constraints:
• (C1) $0 < c_1 \le r_1$, $0 < c_2 \le r_2$, i.e., lower-bounding the average sojourn times by $E[\sigma_1] \ge r_1^{-1}$, $E[\sigma_2] \ge r_2^{-1}$. A special case is the homogeneous constraint $\max\{c_1, c_2\} \le r$, in analogy to [21].
• (C2) $0 < c_1 + c_2 \le r$. Since $\mathrm{Cov}[X(t), X(t+s)] \propto \exp(-(c_1 + c_2)s)$, this bounds the autocorrelation time of $X(t)$ from below.
• (C3) $E[E[\sigma_{X(t)} | X(t)]] = c_1^{-1} + c_2^{-1} - 2/(c_1 + c_2) \ge r^{-1}$, which lower-bounds the average sojourn time, similarly to (C1), but regardless of the transition type.

As the biophysical interpretation and value of the rates $c_1, c_2$ strongly depend on the context of the promoter model [15], [22]–[24], we list some generic motivations. The timescales of the rates may be determined by binding affinity to the promoter, temperature, diffusion [25], [26], and availability of activating and deactivating constituents, such as transcription factors, polymerases, other enzymes, or ATP. Highly autocorrelated dynamics may be caused by upstream regulation or feedback of downstream elements in a reaction network [8], [27].

We consider the path-wise MI $I(X_{[0,T]}, Y_{[0,T]})$ [28] and the information rate, defined as
$$\bar I(X, Y) := \lim_{T \to \infty} \frac{1}{T} I(X_{[0,T]}, Y_{[0,T]}).$$
Optimizing (i) the MI and (ii) the information rate with respect to all admissible input path distributions $\mu_X$ yields (i) the capacity $C_T$ and (ii) the information rate capacity $C$. For fixed OFF and ON states $x_1, x_2$, the distribution $\mu_X$ is parametrized solely by the system parameters $c_1, c_2 \ge 0$. Fixing $c_3$, the respective capacity-achieving path distributions are characterized by the following constrained optimization problems
$$\text{(i)} \ \max_{c_1, c_2} \frac{1}{T} I(X_{[0,T]}, Y_{[0,T]}), \qquad \text{(ii)} \ \max_{c_1, c_2} \bar I(X, Y)$$
subject to one of the constraints (C1) to (C3).

The remainder of the work is organized as follows.
Exploiting the link between MI and filtering, Section II addresses the associated filtering problem [29] analytically on the level of distributions. We recognize the conditional mean as a piece-wise deterministic Markov process [30], and solve the partial differential equation (PDE) given by the hybrid generator [31]. In Section III we express the MI and information rate as nonelementary Riemann integrals related to hypergeometric series. This permits further analysis of the constraints (C1) to (C3). Capacity-achieving input has so far been reported as asymmetric, favoring the OFF state [17], [18]. In contrast, we find that constraint (C1) either enforces a symmetric allocation of ON and OFF state at optimality for certain $r_1, r_2$ or favors the ON state. ON-favoring local optima are implied by (C3).

B. Previous Work

The following expression due to [28] links the path-wise MI with the filtering problem of observing the input $X$ indirectly via the Poisson channel output $Y$. Accordingly,
$$I(X_{[0,T]}, Y_{[0,T]}) = \int_0^T \left( E[\phi(c_3 X(t))] - E[\phi(c_3 Z(t))] \right) dt, \qquad (1)$$
where $\phi(z) = z \ln(z)$ and the first conditional moment $Z(t) = E[X(t) | Y_{[0,t]}]$ is the optimal causal estimator of the input under a quadratic criterion.

Applying Jensen's inequality to the second integrand term of (1) and using the identity $\phi(cz) = c\phi(z) + z\phi(c)$ as well as the stationarity of BMPs, the MI is bounded by
$$\frac{I(X_{[0,T]}, Y_{[0,T]})}{c_3 T} \le \frac{c_2 \phi(x_1) + c_1 \phi(x_2)}{c_1 + c_2} - \phi\left( x_1 + \frac{c_1 \Delta x}{c_1 + c_2} \right) =: J(x_1, x_2, c_1/(c_1 + c_2)),$$
where $\Delta x := x_2 - x_1$ is the dynamic range. Optimizing over the average power constraint $0 < c_1/(c_1 + c_2) \le p$ implies
$$C \le c_3 J(x_1, x_2, \min\{p, \bar p\}), \qquad (2)$$
where $C$ can be replaced by $C_T$ and
$$\bar p = \left\{ \exp\left( \frac{\phi(x_2) - \phi(x_1)}{\Delta x} - 1 \right) - x_1 \right\} / \Delta x.$$

Fig. 1. Dynamics of the causal estimator $Z(t)$ for $c_1 = c_2 = 0.$[…], $c_3 = 1$, $x_1 = 0$, $x_2 = 1$. (a) Sample trajectory of the binary signal input $X(t)$. (b) The solid line is a sample trajectory of $Z(t)$. Crosses indicate the jump times $t_1, t_2, t_3$ of $Y$. The dotted line $f_m(t)$ is the common trajectory of $Z(t)$ prior to the first jump. The dashed line $f_1(t)$ separates the $(t, z)$-plane. Prior to the first jump the trajectories evolve below, afterwards they evolve above. (c) The probability distribution of the causal estimator at time $t = 2$ is composed of a Dirac measure with weight $\kappa(2)$ at $f_m(2)$ and a density supported on $(f_1(2), 1]$. (d) Probability density of the asymptotic causal estimator.
Kabanov and Davis showed that for $c_1, c_2 \to \infty$ the bound (2) is indeed achieved with the respective asymptotic ratio $c_1/(c_1 + c_2) = \min\{p, \bar p\}$ [17], [18]. The case $x_1 = 0$, $x_2 = 1$ (no leakage) with $p \ge 1/e$ reduces to $C = C_T = c_3/e$ with $c_1/(c_1 + c_2) = 1/e$.

We contribute by analyzing how the capacity-achieving BMP input behaves for finite $c_1, c_2$, reflected in the bandwidth-like constraints (C1) to (C3). For the process class of BMPs, $Z(t)$ in (1) evolves according to the ODE with stochastic jumps [29]
$$dZ(t) = \left[ c_1 \Delta x - (c_1 + c_2 + c_3 \Delta x)(Z(t) - x_1) + c_3 (Z(t) - x_1)^2 \right] dt + g(Z(t)) \, dY(t), \qquad (3)$$
where $dY(t) = 1$ for jump times $t = t_i$, vanishing otherwise; and
$$g(Z(t_i)) = \frac{(Z(t_i) - x_1)(x_2 - Z(t_i))}{Z(t_i)}$$
is the jump height $dZ(t_i) = Z(t_i+) - Z(t_i-)$ at jump times of $Y$. Fig. 1b shows a sample trajectory for $x_1 = 0$, $x_2 = 1$.

II. THE SOLUTION OF THE FILTERING PROBLEM

In the following, we solve for the distribution of $Z(t)$, provided that leakage is absent ($x_1 = 0$) and the dynamic range is normalized ($x_2 = 1$). This self-noise limited case allows for an analytic expression of the MI in III-A.

A. The probability evolution equation

In the absence of leakage the stochastic reset condition in (3) simplifies, because $Y(t)$ increases solely if $X(t)$ is in the ON state. Hence, $Z(t)$ is reset to $x_2 = 1$ upon jumps of $Y$. The joint process $\{Z(t), Y(t)\}_t$ is then a piece-wise deterministic Markov process [30] that jumps stochastically from state $\{Z(t_i-), Y(t_i-)\}$ to state $\{Z(t_i+), Y(t_i+)\} = (1, Y(t_i-) + 1)$. Hence, jump times $t_i$ of $Y$ and of $Z$ are identical, and jumps occur with propensity $c_3 Z(t)$ [29]. Since the propensity depends only on the first component, the projection onto $Z(t)$ is a piece-wise deterministic Markov process itself. Its probability evolution equation is given by a hybrid generator, composed of the drift (Liouville) and the jump (Poisson) part [31]:
$$\frac{\partial}{\partial t} p(t, z) = -\frac{\partial}{\partial z} \{ A(z) p(t, z) \} - c_3 z \, p(t, z), \qquad \omega < z < 1, \qquad (4)$$
where $A(z) := c_3 (c_1 \Delta x / c_3 - \gamma z + z^2)$ with $\gamma := (c_1 + c_2 + \Delta x \, c_3)/c_3$ is the drift dynamics, i.e., the ODE part of (3), with stable equilibrium $\omega := (\gamma - \rho)/2$, where $\rho := \sqrt{\gamma^2 - 4 c_1 \Delta x / c_3}$. With this convention the two roots of $A$ are $\omega$ and $\omega + \rho$. Equation (4) lacks an inflow of probability due to jumps because all inflow enters at $z = 1$. This is reflected in the proof of the subsequent Theorem II.1. Although expression (1) has been used and extended [32] in multiple ways to address the capacity problem, to the best of our knowledge, the method introduced here is new to the field.

B. Transient and asymptotic distribution

Aiming for the distribution of $Z(t)$, we envision the stochastic ensemble of trajectories. All trajectories are initiated in the stationary mean $m := c_1/(c_1 + c_2)$, because $Z(0) = E[X(0)]$. They slide down according to the function
$$f_\iota(t) = \omega + \frac{\rho (\iota - \omega)}{\iota - \omega + e^{\rho c_3 t} (\rho + \omega - \iota)}$$
with $\iota = m$ (see Fig. 1b), which solves the Riccati equation $\frac{d}{dt} Z(t) = A(Z(t))$. The solution curve $f_1(t)$ with initial value $z = 1$ at $t = 0$ separates the $(t, z)$-plane into two regions, summarized in the equivalence that involves the first jump time $t_1$ of $Z$:
$$t < t_1 \Leftrightarrow Z(t) \le f_1(t) \Leftrightarrow Z(t) = f_m(t). \qquad (5)$$
Fig. 1c visualizes the ensemble of trajectories stopped at a fixed $t$, while Fig. 1d visualizes the asymptotic probability distribution. The following theorem fully describes the distribution of $Z(t)$ at any time point $t$. Besides preparing the main result, Theorem III.1, it can be interesting in its own right in the related fields of filtering and control theory.

Theorem II.1.

The probability measure $\mu_t : \mathcal{B}((\omega, 1]) \to [0, 1]$, $\mu_t(B) = P[Z(t) \in B]$, defined for Borel sets $B \subseteq (\omega, 1]$, is a hybrid measure
$$\mu_t(B) = \kappa(t) \, \delta_{f_m(t)}(B) + \nu_t(B), \qquad (6)$$
composed of a Dirac measure $\delta_a(B) = 1_B(a)$, $a \in \mathbb{R}$, at $f_m(t)$ with weight
$$\kappa(t) = e^{-\omega c_3 t} \cdot \left( 1 - \rho^{-1} (m - \omega)(1 - e^{-\rho c_3 t}) \right)$$
and an absolutely continuous measure $\nu_t(dz) = \pi(z) \, dz$ supported on $(f_1(t), 1]$ with time-independent density
$$\pi(z) = \alpha (z - \omega)^{\beta - 3/2} (\omega + \rho - z)^{-(\beta + 3/2)}, \qquad (7)$$
where
$$\alpha := (1 - \omega)^{3/2 - \beta} (\omega + \rho - 1)^{\beta + 3/2} \cdot \frac{c_3 m}{c_2}, \qquad \beta := \frac{\gamma}{2\rho}.$$

Proof.

We use (5) and compute
$$P[Z(t) \in B] = P[Z(t) \in B, t_1 > t] + P[Z(t) \in B, t_1 \le t] = P[t_1 > t] \, \delta_{f_m(t)}(B) + P[Z(t) \in B \cap (f_1(t), 1]],$$
having the form (6). First, since jumps occur with propensity $c_3 Z(t)$ and $Z(t) = f_m(t)$ prior to the first jump,
$$P[t_1 > t] = \exp\left( -c_3 \int_0^t f_m(s) \, ds \right) = \kappa(t).$$
Second, by the equivalence
$$Z(t) \in B \cap (f_1(t), 1] \Leftrightarrow t - \sup\{ t_i \mid t_i \le t \} \in f_1^{-1}(B), \ t_1 \le t,$$
the absolute continuity of the $t_i$ implies that $\nu_t(B) = P[Z(t) \in B \cap (f_1(t), 1]]$ is an absolutely continuous measure, i.e., $\nu_t(dz) = p(t, z) \, dz$ with some density $p(t, z)$ supported on $(f_1(t), 1]$. Its solution is obtained by the method of characteristics [33], initiated at the boundary condition $p_0(t) := p(t, 1)$ and propagated through the rewritten linear PDE (4)
$$\frac{\partial}{\partial t} p(t, z) + A(z) \frac{\partial}{\partial z} p(t, z) = p(t, z) \left\{ -c_3 z - \frac{d}{dz} A(z) \right\}.$$
It remains to evaluate $p_0(t)$. We compute
$$E[c_3 X(t)] = E\left[ \lim_{h \to 0} \frac{P[Y(t+h) - Y(t) = 1 \mid X_{[0,t]}]}{h} \right] = \lim_{h \to 0} \frac{P[Y(t+h) - Y(t) = 1]}{h} = \lim_{h \to 0} \frac{P[Z(t+h) \in (f_1(h), 1]]}{h} = -f_1'(0) \, p(t, 1) = -A(1) \, p_0(t) = c_2 \, p_0(t).$$
The absolute continuity of $\nu_t$ was used in the fourth equality. Stationarity of $X$ implies that $p_0(t) \equiv c_3 m / c_2$, independent of $t$. Plugging this in, $p(t, z) = \pi(z) \, 1_{(f_1(t), 1]}(z)$ is obtained.

Theorem II.2.

The asymptotic distribution is absolutely continuous, supported on $(\omega, 1]$ with density $\pi(z)$ as in (7).

Proof. Since $Z(t)$ is an ergodic Markov process, the asymptotic distribution equals the stationary distribution. The density $\pi(z)$ satisfies the stationarity condition obtained from (4) by equating the right side to zero.

III. THE INFORMATION SURFACE

We consider the Poisson channel without leakage, unlessmentioned otherwise.
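Under this no-leakage assumption, the hybrid measure of Theorem II.1 can be sanity-checked numerically. The following is a minimal sketch, not the authors' code: the function names, the choice $c_1 = c_2 = c_3 = 1$, and the midpoint rule with a power substitution are all assumptions made for illustration. It checks that the Dirac weight $\kappa(t)$ plus the continuous mass on $(f_1(t), 1]$ equals one, and that $E[Z(t)]$ equals the stationary mean $m$.

```python
import math

def filter_constants(c1, c2, c3=1.0):
    """Constants of Section II for the no-leakage case x1 = 0, x2 = 1."""
    gamma = (c1 + c2 + c3) / c3
    rho = math.sqrt(gamma ** 2 - 4.0 * c1 / c3)
    omega = 0.5 * (gamma - rho)          # stable equilibrium of the drift A(z)
    beta = gamma / (2.0 * rho)
    m = c1 / (c1 + c2)                   # stationary mean of X(t)
    return rho, omega, beta, m

def f(iota, t, c1, c2, c3=1.0):
    """Riccati solution f_iota(t) started at z = iota."""
    rho, omega, _, _ = filter_constants(c1, c2, c3)
    growth = math.exp(rho * c3 * t)
    return omega + rho * (iota - omega) / (iota - omega + growth * (rho + omega - iota))

def kappa(t, c1, c2, c3=1.0):
    """Weight of the Dirac part: probability of no jump of Y up to time t."""
    rho, omega, _, m = filter_constants(c1, c2, c3)
    decay = 1.0 - math.exp(-rho * c3 * t)
    return math.exp(-omega * c3 * t) * (1.0 - (m - omega) / rho * decay)

def integrate_pi(h, z_lo, c1, c2, c3=1.0, n=20000):
    """Midpoint rule for the integral of h(z) * pi(z) over (z_lo, 1].

    The substitution v = ((z - omega) / (1 - omega)) ** (beta - 1/2) absorbs
    the power-law behaviour of pi(z) at the lower end point z = omega.
    """
    rho, omega, beta, m = filter_constants(c1, c2, c3)
    a = beta - 0.5
    v_lo = max((z_lo - omega) / (1.0 - omega), 0.0) ** a
    dv = (1.0 - v_lo) / n
    total = 0.0
    for k in range(n):
        v = v_lo + (k + 0.5) * dv
        z = omega + (1.0 - omega) * v ** (1.0 / a)
        ratio = (omega + rho - 1.0) / (omega + rho - z)
        total += h(z) * ratio ** (beta + 1.5)
    return (c3 * m / c2) * (1.0 - omega) / a * total * dv

def total_mass_and_mean(t, c1, c2, c3=1.0):
    """Total probability and mean of Z(t) under the hybrid measure (6)."""
    weight = kappa(t, c1, c2, c3)
    z_lo = f(1.0, t, c1, c2, c3)         # lower end of the continuous part
    m = c1 / (c1 + c2)
    mass = weight + integrate_pi(lambda z: 1.0, z_lo, c1, c2, c3)
    mean = weight * f(m, t, c1, c2, c3) + integrate_pi(lambda z: z, z_lo, c1, c2, c3)
    return mass, mean

for t in (0.5, 2.0, 5.0):
    mass, mean = total_mass_and_mean(t, 1.0, 1.0)
    print(f"t = {t}: total mass = {mass:.6f}, E[Z(t)] = {mean:.6f}")
```

The mean check relies on the tower property $E[Z(t)] = E[X(t)] = m$. For very large $t$ the exponential in `f` overflows, so the sketch is only meant for moderate times.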

A. Analytic expression

The results of the previous section allow us to state the main result of this paper.

Theorem III.1.

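Let $x_1 = 0$, $x_2 = 1$, then
$$\frac{1}{T} I(X_{[0,T]}, Y_{[0,T]}) = -\frac{c_3}{T} \int_0^T \phi(f_m(t)) \kappa(t) \, dt - c_3 \int_\omega^1 \phi(z) \pi(z) \left( 1 - f_1^{-1}(z)/T \right)^+ dz,$$
$$\bar I(X, Y) = -c_3 \int_\omega^1 \phi(z) \pi(z) \, dz,$$
where $(a)^+ := \max\{a, 0\}$, so that only $z > f_1(T)$ contributes to the second integral of the first expression.

Proof.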

Using Theorems II.1 and II.2, the identity $\phi(cz) = c\phi(z) + z\phi(c)$, and Fubini, evaluate (1).

We use a linear transformation of the integration variable $z \to (1 - z)(1 - \omega)^{-1}$ and the series expansion of the logarithm as well as of $z \mapsto (\rho + \omega - z)^{-\beta - 3/2}$. Then, by uniform convergence of the integrand, we may express
$$\bar I(X, Y) = \sum_{i,j=0}^{\infty} A_{ij} \left( \frac{1 - \omega}{\rho} \right)^j (1 - \omega)^i, \qquad (8)$$
as an absolutely convergent series with coefficients
$$A_{ij} = \frac{c_3 \alpha \rho^{-\beta - 3/2} (1 - \omega)^{\beta + 1/2}}{\Gamma(\beta + 3/2)} \times \frac{i! \, \Gamma(\beta + 3/2 + j) \, \Gamma(\beta - 1/2 + j) \, (\beta - 1/2 + j + \omega(i + 2))}{j! \, \Gamma(\beta + 5/2 + i + j)},$$
establishing a link with Appell's hypergeometric $F$ series [34]. Using (8), it can be verified that for $c_1, c_2 \to \infty$ with asymptotic ratio $c_1/(c_1 + c_2) \to p$, it holds that $\beta \to 1/2$, $\omega \to p$ and $(1 - \omega)/\rho \to 0$. Hence $\bar I(X, Y) \to c_3 J(0, 1, p)$, in accordance with paragraph I-B. Expanding $f_1^{-1}(z)$, the MI $I(X_{[0,T]}, Y_{[0,T]})$ is given in terms of Srivastava's triple hypergeometric series [35] by an analogous procedure, implying $\frac{1}{T} I(X_{[0,T]}, Y_{[0,T]}) \to c_3 J(0, 1, p)$ for $c_1, c_2 \to \infty$ with asymptotic ratio $c_1/(c_1 + c_2) \to p$.

The above linear transformation and subsequent differentiation under the integral sign or, alternatively, summand-wise differentiation of (8) make (higher-order) partial derivatives with respect to $c_1, c_2$ accessible for the phase plane analysis.

We introduce the relative rates $\tilde c_1 = c_1/c_3$ and $\tilde c_2 = c_2/c_3$. The parameters $\gamma, \omega, \rho$ and hence the stationary density $\pi(z)$ only depend on $\tilde c_1$ and $\tilde c_2$, and $c_3$ scales the time. The linear time scaling $\tilde t = c_3 t$, $\tilde T = c_3 T$ allows us to write the MI as
$$\frac{1}{T} I(X_{[0,T]}, Y_{[0,T]}) = c_3 \cdot \frac{1}{\tilde T} I(\tilde X_{[0,\tilde T]}, \tilde Y_{[0,\tilde T]}),$$
where $\tilde X(t)$ and $\tilde Y(t)$ are input and output of the corresponding channel with relative rates $\tilde c_1, \tilde c_2$ and normalized $\tilde c_3 = 1$. For convenience we drop the tilde in the following and assume a normalized time scale $c_3 = 1$ for the channel.
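The integral form of the information rate in Theorem III.1 lends itself to direct numerical evaluation. The sketch below is an illustration under stated assumptions rather than the method used for the figures, which rely on the series (8): the function names `info_rate` and `jensen_bound`, the midpoint rule, and the substitution $v = ((z - \omega)/(1 - \omega))^{\beta - 1/2}$ are choices made here. It evaluates $\bar I = -c_3 \int_\omega^1 \phi(z) \pi(z) \, dz$, compares it with the Jensen bound $c_3 J(0, 1, m) = -c_3 m \ln m$, and illustrates the time scaling by $c_3$.

```python
import math

def info_rate(c1, c2, c3=1.0, n=20000):
    """Midpoint-rule estimate of -c3 * int_omega^1 phi(z) pi(z) dz (x1 = 0, x2 = 1)."""
    gamma = (c1 + c2 + c3) / c3
    rho = math.sqrt(gamma ** 2 - 4.0 * c1 / c3)
    omega = 0.5 * (gamma - rho)
    beta = gamma / (2.0 * rho)
    m = c1 / (c1 + c2)
    a = beta - 0.5                        # exponent of the substitution v = s ** a
    dv = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * dv
        z = omega + (1.0 - omega) * v ** (1.0 / a)
        ratio = (omega + rho - 1.0) / (omega + rho - z)
        total += -z * math.log(z) * ratio ** (beta + 1.5)
    # pi(z) dz = (c3 m / c2) * ratio ** (beta + 3/2) * (1 - omega) / a * dv
    return c3 * (c3 * m / c2) * (1.0 - omega) / a * total * dv

def jensen_bound(c1, c2, c3=1.0):
    """Upper bound c3 * J(0, 1, m) = -c3 * m * ln(m) from Section I-B."""
    m = c1 / (c1 + c2)
    return -c3 * m * math.log(m)

# Rates with ratio c1 / (c1 + c2) = 1/e at increasing switching speed.
for speed in (1.0, 10.0, 100.0):
    c1 = speed / math.e
    c2 = speed - c1
    print(f"speed = {speed}: rate = {info_rate(c1, c2):.4f}, "
          f"bound = {jensen_bound(c1, c2):.4f}")

# Time scaling: the rate for (c1, c2, c3) is c3 times the rate for (c1/c3, c2/c3, 1).
print(info_rate(2.0, 4.0, c3=2.0), 2.0 * info_rate(1.0, 2.0, c3=1.0))
```

The plain midpoint rule loses accuracy when the rates are very small compared with $c_3$, because the integrand then concentrates near $z = 1$; an adaptive quadrature would be the more careful choice there.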
Fig. 2. Phase plane analysis of (C1) for $x_1 = 0$, $x_2 = 1$. Nullclines $\partial_1 \bar I = 0$ and $\partial_2 \bar I = 0$ were evaluated using the first […] × […] summands of the $F$ series in (8) and deriving summand-wise. Colored arrows indicate the gradient of the information rate, calculated alike. Optimization domains are rectangular. Depending on the location of the domain's upper right corner $(r_1, r_2)$, the optimum is assumed on the nullclines $\partial_1 \bar I = 0$ [$(r_1, r_2)$ in C] and $\partial_2 \bar I = 0$ [$(r_1, r_2)$ in A], respectively, or in the interior of region B [$(r_1, r_2)$ in B]. Regions I and II contain all $(r_1, r_2)$ whose optima favor the ON and OFF state, respectively.

B. Constraint (C1) permits ON-favoring input

The optimization of $\bar I(X, Y) = \bar I(c_1, c_2)$ on the rectangular domain $0 < c_1 \le r_1$, $0 < c_2 \le r_2$ in the $(c_1, c_2)$-plane varies qualitatively, depending on the location of the corner $(r_1, r_2)$. As depicted in Fig. 2, we distinguish three regions A, B and C, which are separated by the two nullclines $\partial_1 \bar I = 0$ and $\partial_2 \bar I = 0$. The following analysis relies on the conjecture that the two nullclines do not intersect except at the origin. Then the three regions are characterized by the signs of the partial derivatives $[\mathrm{sgn}(\partial_1 \bar I), \mathrm{sgn}(\partial_2 \bar I)]$, taking on the values $[1, -1]$, $[1, 1]$, $[-1, 1]$ on A, B, C, respectively. The maximum $(c_1^*, c_2^*)$ satisfies $c_1^* = r_1$, $c_2^* < r_2$ in region A, $c_1^* = r_1$, $c_2^* = r_2$ in region B and $c_1^* < r_1$, $c_2^* = r_2$ in region C. It is hence located in B or on its boundary in all cases. The ratio $\chi := c_1^*/c_2^*$ reflects whether the optimal input process $X(t)$ favors the ON state ($\chi > 1$) or the OFF state ($\chi < 1$). For large $r_1, r_2$, paragraph I-B implies that the ratio $\chi$ approaches $1/(e - 1)$. The asymmetry of region B with respect to reflection at the bisection line causes a smaller region I favoring the ON state and a larger region II favoring the OFF state. If we assume a homogeneous constraint (C1) $\max\{c_1, c_2\} \le r$, closely related to [21], the bisection line transits from region B to region C upon increasing $r$. This constraint thus implies a phase transition from symmetric ($\chi = 1$) to asymmetric ($\chi < 1$) allocation of the ON and the OFF state. An analogous transition for the minimal sojourn time constraint has been reported in [21]. Fig. 3 shows the phase plane associated with the MI, i.e., when $T$ is finite, for comparison with the results obtained for $\bar I(X, Y)$.
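The constrained optimization under (C1) can be explored with a brute-force grid search. This is a hedged sketch and not the nullcline analysis behind Fig. 2: the helper names `info_rate` and `best_rates_c1`, the grid resolution, and the midpoint quadrature are assumptions made here, and the grid limits the resolution of the reported optimum. It uses the normalized time scale $c_3 = 1$ and prints the ratio $\chi = c_1^*/c_2^*$ for a homogeneous constraint $\max\{c_1, c_2\} \le r$.

```python
import math

def info_rate(c1, c2, n=4000):
    """Midpoint-rule estimate of the information rate for x1 = 0, x2 = 1, c3 = 1."""
    gamma = c1 + c2 + 1.0
    rho = math.sqrt(gamma ** 2 - 4.0 * c1)
    omega = 0.5 * (gamma - rho)
    beta = gamma / (2.0 * rho)
    m = c1 / (c1 + c2)
    a = beta - 0.5
    dv = 1.0 / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * dv
        z = omega + (1.0 - omega) * v ** (1.0 / a)
        ratio = (omega + rho - 1.0) / (omega + rho - z)
        total += -z * math.log(z) * ratio ** (beta + 1.5)
    return (m / c2) * (1.0 - omega) / a * total * dv

def best_rates_c1(r1, r2, steps=10):
    """Grid search for the maximizer of the rate on 0 < c1 <= r1, 0 < c2 <= r2."""
    best_value, best_c1, best_c2 = -math.inf, None, None
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            c1, c2 = r1 * i / steps, r2 * j / steps
            value = info_rate(c1, c2)
            if value > best_value:
                best_value, best_c1, best_c2 = value, c1, c2
    return best_value, best_c1, best_c2

# Homogeneous constraint max{c1, c2} <= r for a few values of r.
for r in (0.5, 2.0, 8.0):
    value, c1_star, c2_star = best_rates_c1(r, r)
    print(f"r = {r}: c1* = {c1_star:.2f}, c2* = {c2_star:.2f}, "
          f"chi = {c1_star / c2_star:.2f}, rate = {value:.4f}")
```

A value $\chi = 1$ means the grid optimum sits at the corner $(r, r)$, while $\chi < 1$ indicates an OFF-favoring optimum on the upper edge of the domain.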
Fig. 3. Phase planes for finite $T$. (a) $T = 1$. (b) $T = 5$. Dotted and dashed lines indicate $\partial_1 I = 0$ and $\partial_2 I = 0$, respectively, and were obtained from numerically evaluating the integral in Theorem III.1 on a grid with mesh […]. The red shaded area indicates region B of $\bar I$, see Fig. 2, i.e., the case $T \to \infty$.

C. Leakage does not alter the qualitative behavior

For a channel with leakage ($x_1 > 0$) the analogous evolution equation (4) reads
$$\frac{\partial}{\partial t} p(t, z) = -\frac{\partial}{\partial z} \{ A(z - x_1) p(t, z) \} - c_3 z \, p(t, z) + c_3 f_-(z) \, p(t, f_-(z)) \, 1(f_-(z) > \tilde\omega), \qquad \tilde\omega < z < x_2,$$
where $\tilde\omega := x_1 + \omega$ and $f_-(z) = \frac{x_1 x_2}{x_1 + x_2 - z}$ is the value of $Z(t_i-)$ that jumps to $Z(t_i+) = z$. The delay term $p(t, f_-(z))$ in the $z$-component makes this equation difficult to solve compared to (4). It is as yet unclear whether the asymptotic density $\pi(z)$ can be solved for, not to mention the time-evolving probability distribution. The first choice of technique for $\pi(z)$, the method of steps [36], failed here, because the boundary value $\pi(f_-^{-1}(\tilde\omega))$ of the first interval $(\tilde\omega, f_-^{-1}(\tilde\omega)]$ is yet unknown. Furthermore, the successive intervals $(f_-^{-i}(\tilde\omega), f_-^{-(i+1)}(\tilde\omega)]$, $i \in \mathbb{N}$, where $f_-^{0}(\tilde\omega) := \tilde\omega$, $f_-^{-(i+1)} := f_-^{-1} \circ f_-^{-i}$, accumulate, because the sequence $(f_-^{-i}(\tilde\omega))_i$ monotonically approaches $x_2$.

Plots in Fig. 4 show results obtained from stochastic simulation of $Z(t)$ with sample size […] and Monte Carlo evaluation of the mean in (1). While the qualitative behavior is not altered, leakage augments the asymptotic ratio from $\chi = 1/(e - 1)$ (for $x_1 = 0$) to $\chi = 1$ (for $x_1 = 1$) if $x_2 = 1$ is fixed, according to paragraph I-B. Increasing leakage thus bends region B towards the bisection line and enlarges the ON-favoring region I.

D. Constraint (C2) enforces asymmetric allocation

Independent of the input states $x_1, x_2$, the autocorrelation time of $X$ is $(c_1 + c_2)^{-1}$. The optimum of $\bar I$ constrained to the triangular area (C2) $c_1 + c_2 \le r$ is achieved on the diagonal boundary. The condition $\partial_1 \bar I - \partial_2 \bar I = 0$ determines the parametric curve of optimal rates in the $c_1$-$c_2$-plane. Hence it must lie in the region enclosed by the nullclines $\partial_1 \bar I = 0$ and $\partial_2 \bar I = 0$, i.e., in region B of Fig. 2. We observe strictly OFF-favoring, asymmetric, optimal allocations. For $c_1, c_2 \to 0$, i.e., for large minimal autocorrelation times, symmetric allocation becomes asymptotically optimal.

Fig. 4. Phase planes with leakage. (a) $x_1 = 0.$[…], $x_2 = 1$. (b) $x_1 = 0.$[…], $x_2 = 1$. Black and gray marks indicate the nullclines $\partial_1 \bar I = 0$ and $\partial_2 \bar I = 0$, respectively, and were obtained from Monte Carlo simulations with sample size […]. The red shaded area indicates region B of the case $x_1 = 0$.

E. Constraint (C3) allows for local optima

The quantity $E[E[\sigma_{X(t)} | X(t)]] = c_1^{-1} + c_2^{-1} - 2/(c_1 + c_2)$ is the expectation of the average sojourn times of the states $x_1, x_2$, taking into account stationarity of $X(t)$. The constraint (C3) allows one rate to be infinite given the other is less than $r$. Thus its relevance seems questionable at first glance. Combination with other constraints that strictly force the rates to be finite, like (C1) and (C2), legitimates (C3). Nevertheless we examined (C3) alone. The constraint defines a family of level sets $E[E[\sigma_{X(t)} | X(t)]] = r^{-1}$ in the $(c_1, c_2)$-plane, parametrized by $r$. The level set for each $r$ is a smooth curve, symmetric with respect to the bisection line. Regardless of $r$, the maximum of $\bar I$ on each level set is assumed in the OFF-favoring region, i.e., above the bisection line. As $r$ decreases, at value $r \approx$ […], a local maximum of $\bar I$ appears in the ON-favoring region and persists for smaller $r$, resembling a saddle-node bifurcation. Consequently, for average intertransition times greater than approximately […]$/c_3$, high information throughput can be achieved by an ON-favoring rate pair. Both maxima approach the symmetric allocation as $r$ tends to zero.

IV. CONCLUSION

We considered the Poisson channel with BMPs as input class. Using the hybrid generator for the evolution of the causal estimator allowed us to express its distribution in closed form and to represent the MI and the information rate by Riemann integrals. The information surface was then analyzed for different constraints on sojourn times. The mathematical derivation relied heavily on the Markov assumption and on neglecting leakage. Extending the mathematical results to leaky, semi-Markov and multi-state inputs remains an open problem. Among general binary inputs, our result yields a lower bound on the capacity.

Optimizing the allocation of the ON and the OFF state under constraint (C1) can be explained intuitively as an interplay between different forces that maximize the efficiency and precision of signal transmission. On the one hand, a force that reduces the average sojourn times is predominant in region B of Fig. 2. This force aims at increasing the number of signals transmitted. On the other hand, forces that increase the sojourn time in the ON and OFF states are predominant in regions A and C, respectively. A larger sojourn time in the ON state increases the likelihood of observing the ON state at the channel output. A larger sojourn time in the OFF state decreases the likelihood of misinterpreting the period between consecutive channel output pulses as an input OFF phase. The phase diagram in Fig. 2 explicitly quantifies how the ensemble of forces is balanced. Constraint (C3) admits an analogous driving force towards its local maximum.

Given the rather minimal nature of the analyzed model, it might not account for biophysical reality. Yet, assuming that evolutionary strategies aim at achieving capacity [25], the model supplies the hypothesis that the system parameters $\tilde{c}_1$, $\tilde{c}_2$ are located in or near the optimal region B [37]. The prediction made by this interpretation is yet to be verified by experimentalists.

REFERENCES

[1] J. E. Purvis and G. Lahav, “Encoding and decoding cellular information through signaling dynamics,” Cell, vol. 152, no. 5, pp. 945–956, 2013.
[2] D. Friedrich, L. Friedel, A. Finzel, A. Herrmann, S. Preibisch, and A. Loewer, “Stochastic transcription in the p53-mediated response to DNA damage is modulated by burst frequency,” Molecular Systems Biology, vol. 15, no. 12, 2019.
[3] J. Selimkhanov, B. Taylor, J. Yao, A. Pilko, J. Albeck, A. Hoffmann, L. Tsimring, and R. Wollman, “Accurate information transmission through dynamic biochemical signaling networks,” Science, vol. 346, no. 6215, pp. 1370–1373, 2014.
[4] F. Tostevin and P. R. ten Wolde, “Mutual information between input and output trajectories of biochemical networks,” Phys. Rev. Lett., vol. 102, p. 218101, 2009.
[5] A. Mugler, A. M. Walczak, and C. H. Wiggins, “Spectral solutions to stochastic models of gene expression with bursts and regulation,” Phys. Rev. E, vol. 80, p. 041921, 2009.
[6] R. Suderman, J. A. Bachman, A. Smith, P. K. Sorger, and E. J. Deeds, “Fundamental trade-offs between information flow in single cells and cellular populations,” Proceedings of the National Academy of Sciences, vol. 114, no. 22, pp. 5755–5760, 2017.
[7] G. Tkačik, C. G. Callan, and W. Bialek, “Information flow and optimization in transcriptional regulation,” Proceedings of the National Academy of Sciences, vol. 105, no. 34, pp. 12 265–12 270, 2008.
[8] I. Lestas, G. Vinnicombe, and J. Paulsson, “Fundamental limits on the suppression of molecular fluctuations,” Nature, vol. 467, no. 7312, p. 174, 2010.
[9] Y. Nakahira, F. Xiao, V. Kostina, and J. C. Doyle, “Fundamental limits and achievable performance in biomolecular control,” in , 2018, pp. 2707–2714.
[10] L. Duso and C. Zechner, “Path mutual information for a class of biochemical reaction networks,” in , 2019, pp. 6610–6615.
[11] S. A. Pasha and V. Solo, “Computing the trajectory mutual information between a point process and an analog stochastic process,” in . IEEE, 2012, pp. 4603–4606.
[12] S. Shamai, “Capacity of a pulse amplitude modulated direct detection photon channel,” IEE Proceedings I - Communications, Speech and Vision, vol. 137, no. 6, pp. 424–430, 1990.
[13] K. V. Parag, “On signalling and estimation limits for molecular birth-processes,” Journal of Theoretical Biology, vol. 480, pp. 262–273, 2019.
[14] A. Raj, C. S. Peskin, D. Tranchina, D. Y. Vargas, and S. Tyagi, “Stochastic mRNA synthesis in mammalian cells,” PLoS Biology, vol. 4, no. 10, p. e309, 2006.
[15] C. Zechner, M. Unger, S. Pelet, M. Peter, and H. Koeppl, “Scalable inference of heterogeneous reaction kinetics from pooled single-cell recordings,” Nature Methods, vol. 11, no. 2, pp. 197–202, 2014.
[16] O. Macchi and B. Picinbono, “Estimation and detection of weak optical signals,” IEEE Transactions on Information Theory, vol. 18, no. 5, pp. 562–573, 1972.
[17] Y. M. Kabanov, “The capacity of a channel of the Poisson type,” Theory of Probability & Its Applications, vol. 23, no. 1, pp. 143–147, 1978.
[18] M. Davis, “Capacity and cutoff rate for Poisson-type channels,” IEEE Transactions on Information Theory, vol. 26, no. 6, pp. 710–715, 1980.
[19] D. Snyder and C. Georghiades, “Design of coding and modulation for power-efficient use of a band-limited optical channel,” IEEE Transactions on Communications, vol. 31, no. 4, pp. 560–565, 1983.
[20] S. Shamai and A. Lapidoth, “Bounds on the capacity of a spectrally constrained Poisson channel,” IEEE Transactions on Information Theory, vol. 39, no. 1, pp. 19–29, 1993.
[21] S. Shamai, “On the capacity of a direct-detection photon channel with intertransition-constrained binary input,” IEEE Transactions on Information Theory, vol. 37, no. 6, pp. 1540–1550, 1991.
[22] L. Bintu, N. E. Buchler, H. G. Garcia, U. Gerland, T. Hwa, J. Kondev, and R. Phillips, “Transcriptional regulation by the numbers: models,” Current Opinion in Genetics & Development, vol. 15, no. 2, pp. 116–124, 2005.
[23] D. M. Suter, N. Molina, D. Gatfield, K. Schneider, U. Schibler, and F. Naef, “Mammalian genes are transcribed with widely different bursting kinetics,” Science, vol. 332, no. 6028, pp. 472–474, 2011.
[24] L. A. Mirny, “Nucleosome-mediated cooperativity between transcription factors,” Proceedings of the National Academy of Sciences, vol. 107, no. 52, pp. 22 534–22 539, 2010.
[25] G. Tkačik and A. M. Walczak, “Information transmission in genetic regulatory networks: a review,” Journal of Physics: Condensed Matter, vol. 23, no. 15, p. 153102, 2011.
[26] W. Bialek and S. Setayeshgar, “Physical limits to biochemical signaling,” Proceedings of the National Academy of Sciences, vol. 102, no. 29, pp. 10 040–10 045, 2005.
[27] K. H. Kim and H. M. Sauro, “Measuring retroactivity from noise in gene regulatory networks,” Biophysical Journal, vol. 100, no. 5, pp. 1167–1177, 2011.
[28] R. S. Liptser and A. N. Shiryaev, “Statistics of random processes II: Applications, vol. 2,” Springer, vol. 737, p. 738, 2001.
[29] D. L. Snyder and M. I. Miller, Random point processes in time and space, ser. Springer texts in Electrical Engineering. Springer-Verlag, 1991.
[30] M. H. Davis, “Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 46, no. 3, pp. 353–376, 1984.
[31] C. Gardiner, Stochastic methods: a handbook for the natural and social sciences, 4th ed. Springer-Verlag, 2009.
[32] D. Guo, S. Shamai, and S. Verdú, “Mutual information and conditional mean estimation in Poisson channels,” IEEE Transactions on Information Theory, vol. 54, no. 5, pp. 1837–1849, 2008.
[33] L. C. Evans, Partial differential equations. American Mathematical Society, 1998.
[34] P. Appell, “Sur les séries hypergéométriques de deux variables et sur des équations différentielles linéaires aux dérivées partielles,” Comptes Rendus, vol. 90, pp. 296–299, 731–735, 1880.
[35] H. M. Srivastava and P. W. Karlsson, Multiple Gaussian hypergeometric series. Ellis Horwood, 1985.
[36] R. D. Driver, Ordinary and delay differential equations. Springer Science & Business Media, 2012, vol. 20.
[37] A. Pérez-Escudero, M. Rivera-Alba, and G. G. de Polavieja, “Structure of deviations from optimality in biological systems,”