Physical limit to concentration sensing in a changing environment
Thierry Mora∗ and Ilya Nemenman
Laboratoire de physique de l'École normale supérieure (PSL University), CNRS, Sorbonne University, Université de Paris, 24 rue Lhomond, 75005 Paris, France
Department of Physics, Department of Biology, and Initiative in Theory and Modeling of Living Systems, Emory University, Atlanta, GA 30322, USA
Cells adapt to changing environments by sensing ligand concentrations using specific receptors. The accuracy of sensing is ultimately limited by the finite number of ligand molecules bound by receptors. Previously derived physical limits to sensing accuracy have assumed that the concentration was constant and ignored its temporal fluctuations. We formulate the problem of concentration sensing in a strongly fluctuating environment as a non-linear field-theoretic problem, for which we find an excellent approximate Gaussian solution. We derive a new physical bound on the relative error in concentration c which scales as δc/c ∼ (Dacτ)^{−1/4}, with ligand diffusivity D, receptor cross-section a, and characteristic fluctuation time scale τ, in stark contrast with the usual Berg and Purcell bound δc/c ∼ (DacT)^{−1/2} for a perfect receptor sensing a constant concentration during time T. We show how the bound can be achieved by a simple biochemical network downstream of the receptor that adapts the kinetics of signaling as a function of the square root of the sensed concentration.

Cells must respond to extracellular signals to guide their actions in the world. The signals typically come in the form of changing concentrations of various molecular ligands, which are conveyed to the cell through ligand binding to cell surface receptors.
A lot of ink has been expended on deriving the fundamental limits to the precision with which a cell can measure concentrations from the activity of its receptors, constrained by the stochasticity of ligand binding and unbinding [1–4]. In particular, it has become clear that the temporal sequence of binding-unbinding events carries more information about the underlying ligand concentration than just the mean receptor occupancy, typically used in deterministic chemical kinetics models of this problem [5]. Such precise temporal information allows cells to estimate the concentration of a cognate ligand even in a sea of weak spurious ligands [6–8], as well as to estimate concentrations of multiple ligands from fewer receptor types [9, 10], and molecular network motifs able to perform such complex estimation exist in the real world, even potentially taking advantage of cross-talk between receptor-ligand pairs [11].

Importantly, concentrations of ligands are worth measuring only when they are a priori unknown; or, in other words, if they change with time, allowing for instance cells to adapt their behaviour accordingly and maximize their long-term growth [12]. However, all of the preceding analyses have focused on the regime with a clear time scale separation, where the concentration is constant or constantly changing [13] during the period over which it is estimated. In this article, we fill this gap by calculating the accuracy with which a temporally varying ligand concentration may be estimated from a sequence of binding and unbinding events. This requires making assumptions about the time scale over which significant changes of the concentration are possible. In our formulation, the optimal sensor performs a Bayesian computation, formalized mathematically as a stochastic field theory. Crucially, we show how simple biochemical circuits allow one to perform the relevant complex computations.

∗ Corresponding author: [email protected]
Field theory of concentration sensing.
We associate to the ligand concentration c(t) a field ϕ(t) through c(t) = c_0 e^{−ϕ(t)}, where c_0 is an irrelevant reference concentration. Ligand concentration controls the ligand-receptor binding rate r(t) = 4Dac(t) = 4Dac_0 e^{−ϕ(t)} ≡ r_0 e^{−ϕ(t)}, where 4Da is the diffusion-limited binding rate per molecule of the ligand to its target receptor, modeled as a circle of diameter a on the cell's surface, and D is the ligand diffusivity. This binding rate can be readily generalized to N receptors by using instead r(t) = 4NDac(t). All our results will then hold with this additional N factor. We assume that the concentration follows a geometric random walk, with characteristic time scale τ: dϕ = τ^{−1/2} dW, with W a Wiener process. This choice is justified by the fact that in many biological contexts, such as bacterial chemotaxis, concentrations may vary over many orders of magnitude.

The probability of the concentration temporal evolution over the time interval (0, T) is given by

P_prior({ϕ(t)}) = (1/Z_prior) exp[ −(τ/2) ∫_0^T dt (dϕ/dt)² ].   (1)

The receptor sees binding events at times t_1, t_2, ..., t_n, each occurring with rate 4Dac(t_i) = r_0 e^{−ϕ(t_i)}. To simplify, let us assume that unbinding is instantaneous (generalization to finite binding times is discussed later). The posterior distribution of the concentration profile then follows Bayes' rule:

P({ϕ(t)}) = P(t_1, ..., t_n | {ϕ(t)}) P_prior({ϕ(t)}) / P(t_1, ..., t_n)
          = (1/Z) exp{ −∫_0^T dt [ (τ/2)(dϕ/dt)² + r_0 e^{−ϕ(t)} ] − Σ_{i=1}^n ϕ(t_i) },   (2)

where Z is a normalization constant independent of ϕ. The term r_0 e^{−ϕ} dt in the integral corresponds to the probability of not binding a ligand between t and t + dt (except at times t_i).
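To make the posterior concrete, here is a minimal numerical sketch (ours, not from the paper) of the discretized log-posterior of Eq. (2) for a candidate path ϕ on a time grid of spacing dt; `event_indices` marks the grid points at which binding events occurred, and all parameter values are illustrative.

```python
import numpy as np

def log_posterior(phi, dt, r0, tau, event_indices):
    """Unnormalized log of Eq. (2) for a discretized path phi[0..N]."""
    # smoothness prior: -(tau/2) * integral of (dphi/dt)^2 dt
    kinetic = -(tau / 2) * np.sum(np.diff(phi) ** 2) / dt
    # survival term: -integral of r0*exp(-phi) dt (probability of no binding)
    survival = -r0 * np.sum(np.exp(-phi)) * dt
    # each binding event at t_i contributes -phi(t_i)
    events = -np.sum(phi[event_indices])
    return kinetic + survival + events

# a path scores higher when events fall where phi is low (concentration high)
phi = np.linspace(-1.0, 1.0, 101)
print(log_posterior(phi, 0.01, 1.0, 1.0, np.array([0])))   # event at low phi
print(log_posterior(phi, 0.01, 1.0, 1.0, np.array([100]))) # event at high phi
```

Comparing the two calls shows that, all else equal, the posterior favors paths whose concentration is high at the observed binding times.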
The binding events at t = t_i are generated by the true temporal trace of ligand concentration, c*(t) = c_0 e^{−ϕ*(t)}. In the following the true trace ϕ*(t) will be distinguished from the field ϕ, which refers to our observation-based belief.

The one-dimensional field-theoretic problem (2) is a particular case of Bayesian filtering [14]. When collecting information from binding events, cells do not have access to the future and cannot use the full span [0, T] of observations to infer the concentration at time t. Instead, they must infer it solely based on past observations in the interval [0, t], which distinguishes our problem from the mathematically similar inference of a continuous probability density [15–19]. This inference can be performed recursively by the rules of Bayesian sequential forecasting, similar to the transfer matrix technique, and also known as the forward algorithm [14]. To do this recursion, we first define:

Z(ϕ, t) = ∫ Dϕ(t) δ(ϕ(t) − ϕ) exp[ −(τ/2) ∫_0^t dt' (dϕ/dt')² − ∫_0^t dt' ( r_0 e^{−ϕ(t')} + ϕ(t') Σ_{i=1}^n δ(t' − t_i) ) ].   (3)

Considering past observations during the interval [0, t], the posterior distribution of ϕ at time t reads:

P(ϕ, t) = Z(ϕ, t)/Z(t),   with   Z(t) = ∫_{−∞}^{∞} dϕ' Z(ϕ', t).   (4)

When considering periods during which no binding event was observed, we can write a recursion for Z(ϕ, t) between t and t + dt. Taking the limit dt → 0, we obtain for t ≠ t_i (App. A [20]):

∂P(ϕ, t)/∂t = −r_0 ( e^{−ϕ} − ⟨e^{−ϕ}⟩ ) P(ϕ, t) + (1/2τ) ∂²P/∂ϕ²,   (5)

where ⟨·⟩ denotes an average over P(ϕ). When a binding event does occur at time t_i, the posterior distribution is updated using Bayes' rule:

P(ϕ, t_i⁺) = e^{−ϕ} P(ϕ, t_i⁻) / ⟨e^{−ϕ}⟩,   (6)

where t_i^± refer to the values right before and after the observation.
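A direct way to see Eqs. (5)-(6) in action is to discretize ϕ on a grid and alternate diffusion/reweighting steps with a multiplicative update at each binding event. The following sketch is our own illustration (grid, boundary handling, and parameter values are arbitrary choices), not the authors' code:

```python
import numpy as np

def filter_step(P, phi, dt, r0, tau):
    """One step of Eq. (5): no binding event in [t, t+dt]."""
    dphi = phi[1] - phi[0]
    # diffusion term (1/(2*tau)) * d^2 P / dphi^2, zero at the grid edges
    lap = np.zeros_like(P)
    lap[1:-1] = (P[2:] - 2 * P[1:-1] + P[:-2]) / dphi**2
    rate = r0 * np.exp(-phi)
    mean_rate = np.sum(rate * P) * dphi        # <r0 e^{-phi}>
    P = P + dt * (-(rate - mean_rate) * P + lap / (2 * tau))
    return np.clip(P, 0.0, None) / (np.sum(P) * dphi)

def binding_update(P, phi):
    """Bayes update of Eq. (6) at a binding event."""
    dphi = phi[1] - phi[0]
    P = P * np.exp(-phi)
    return P / (np.sum(P) * dphi)

phi = np.linspace(-3, 3, 301)
P = np.where(np.abs(phi) <= 1, 0.5, 0.0)       # uniform prior on [-1, 1]
P /= np.sum(P) * (phi[1] - phi[0])
P = binding_update(P, phi)
```

A binding event multiplies the posterior by e^{−ϕ}, shifting belief toward lower ϕ, i.e. higher concentrations.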
The partition function Z(t) can be similarly calculated (App. A [20]) and could in principle be used to infer the correct timescale τ by maximizing P(τ | {t_1, ..., t_n}) ∝ Z (App. C [20]).

FIG. 1: Numerical validations of analytical results. A. The Gaussian Ansatz (7)-(8) is validated by simulating the general equations for Bayesian filtering (5)-(6). The numerical solution approaches the Gaussian solution rapidly, as indicated by the decay of the Kullback-Leibler divergence D_KL(P(ϕ) ‖ P_Gaussian(ϕ)) = ∫ dϕ P(ϕ) ln(P(ϕ)/P_Gaussian(ϕ)). We used r_0τ = 4Dac_0τ = 50. B. Concentration sensing error as a function of concentration. The error estimated from simulations closely follows the prediction from (13), which is expected to be valid for 4Dacτ ≫ 1.

Gaussian solution.
Because of the P(ϕ) dependence in ⟨e^{−ϕ}⟩, the equations for the evolution of the posterior probability (5)-(6) are nonlinear. However, assuming a Gaussian Ansatz P(ϕ, t) = (2πσ²(t))^{−1/2} exp[−(ϕ − ϕ̂(t))²/2σ²(t)], which is accurate in the limit of long measurement times (see below), gives a closed-form solution (App. B [20]), with:

dϕ̂/dt = σ² [ r_0 e^{−ϕ̂+σ²/2} − Σ_{i=1}^n δ(t − t_i) ],   (7)

dσ²/dt = 1/τ − σ⁴ r_0 e^{−ϕ̂+σ²/2}.   (8)

The maximum a posteriori estimator for the concentration is then simply given by ĉ(t) = c_0 e^{−ϕ̂(t)}, while σ²(t) defines the Bayesian uncertainty on the estimator.

To check the validity of the Gaussian solution, we simulated (5)-(6) numerically, starting from a uniform distribution (P(ϕ, 0) = 1/2 for ϕ ∈ [−1, 1] and 0 otherwise), with r_0τ = 50 and a true ϕ*(t) starting at ϕ*(0) = 0. The numerical solution quickly approaches the Gaussian solution given by (7)-(8) starting with ϕ̂(0) = ⟨ϕ⟩_{t=0} and σ²(0) = Var(ϕ)_{t=0}. The Kullback-Leibler divergence between the numerical and analytical solutions falls rapidly (Fig. 1A) and the numerical solution approaches the predicted Gaussian very closely (Fig. 1A, inset). Thus, the Gaussian solution provides an excellent approximation.

Error estimate.
To study the typical behaviour of (7)-(8), we now assume that the rate of binding events is large compared to the rate of change of the concentration, 4Dacτ = rτ ≫ 1. This regime is the biologically relevant one: to sense concentration, cells need to record many binding events over the time scale on which the concentration fluctuates. In that limit the estimator ϕ̂ is close to the true value ϕ*, and the Bayesian uncertainty σ² is small, allowing for two simplifications. First, (8) relaxes over time scale r(t)^{−1} to a quasi-steady-state value σ² ≈ 1/√(r_0 e^{−ϕ̂} τ) ≪ 1. Second, we can make a small-noise approximation for binding events: over some time interval Δt, with r*(t)^{−1} ≪ Δt ≪ τ, the number of binding events has both mean and variance equal to r*(t)Δt, allowing us to replace discrete jumps in (7) by:

dt Σ_{i=1}^n δ(t − t_i) ≈ r_0 e^{−ϕ*} dt + (r_0 e^{−ϕ*})^{1/2} dW',   (9)

where W' is a Wiener process. As a result, the estimator ϕ̂ tracks the true value ϕ* according to:

dϕ̂ ≈ (r_0 e^{−ϕ̂}/τ)^{1/2} (ϕ* − ϕ̂) dt + τ^{−1/2} dW',   (10)

where we have expanded at first order in ϕ̂ − ϕ*. In the general case, the true field may evolve according to a different characteristic time scale, τ*, than the one assumed by the Bayesian filter, τ, so that dϕ* = (τ*)^{−1/2} dW. The estimation error ε = ϕ̂ − ϕ* then evolves according to:

dε = −(r/τ)^{1/2} ε dt + τ^{−1/2} dW' − (τ*)^{−1/2} dW.   (11)

Intriguingly, the noises dW' and dW have very different interpretations, one being due to the random arrival of binding events, and the other to the geometric diffusion of the concentration. Yet they come in the same form in this equation. Relying again on the assumption that rτ ≫ 1, we get an estimate of the error:

⟨ε²⟩ = (1/(2√r)) ( 1/√τ + √τ/τ* ),   (12)

which has a minimum as a function of τ, reached for the true value of the characteristic fluctuation time τ = τ*:

⟨(ĉ − c*)²⟩/c² ≈ ⟨ε²⟩ = 1/√(rτ) = 1/√(4Dacτ).   (13)

This error is equal to the Bayesian uncertainty σ² = 1/√(r_0 τ e^{−ϕ̂}) ≈ 1/√(4Dacτ) and is consistent with the error found using the saddle-point approximation in the related problem of probability density estimation [15].

We checked the validity of our small-noise approximation by comparing the prediction from (12) with the results of a numerical simulation of (7)-(8), in which we averaged the error ⟨(ĉ − c*)²⟩ as a function of c for many realizations of the process. The agreement is found to be excellent, and gets better as rτ = 4Dacτ becomes larger (Fig. 1B).

The error in (13) sets a fundamental physical limit on any concentration sensing device, biological or artificial, in a concentration profile that follows a geometric random walk. This bound is radically different from that
obtained by Berg and Purcell for the concentration sensing error of a single receptor integrating binding events over time T [1, 5]:

(δc/c)² = 1/(4DacT)   (14)

(in the limit where binding events are short so that the receptor is always free). The major difference is that Berg and Purcell, as well as most of the literature on concentration sensing, assume that the sensed concentration does not change with time. Our result can be reconciled with Berg and Purcell by defining an effective measurement time T ∼ √(τ/4Dac), the geometric mean between the mean time between binding events and the time scale of variation. This T realizes the optimal tradeoff between the requirement to integrate over many binding events, T ≫ 1/(4Dac), but over a relatively constant concentration, T ≪ τ [21].

FIG. 2: Performance of an adaptive biochemical network in a fluctuating ligand concentration. A. Schematic of the biochemical network implementing optimal Bayesian filtering. The receptor-induced activation of the readout molecule A*, as well as its deactivation, are regulated by a second molecule B*, which is made to scale like √A* using a mechanism of deactivation by dimerisation (shaded box). B. Simulation of the network readout ĉ_A(t) ∝ A*(t) in response to stochastic binding events in a fluctuating concentration field c*(t). The relative estimation error ⟨(ĉ_A − c*)²⟩/c² behaves according to the theoretical bound 1/√(4Dacτ) (inset).

Plausible biological implementation.
Can cells implement the optimal Bayesian filtering scheme and reach the bound set by (13)? To gain intuition, it is useful to rewrite (7)-(8) in terms of the concentration estimator ĉ, in the limit 4Dacτ ≫ 1 where σ² can be eliminated:

dĉ/dt = √(4Daĉ/τ) ( (1/4Da) Σ_{i=1}^n δ(t − t_i) − ĉ ).   (15)

Each binding event should lead to an increment of ĉ, followed by a continuous, exponential decay, with a rate given by T^{−1} = √(4Daĉ/τ).

This scheme can be implemented by a simple biochemical network schematized in Fig. 2A. The concentration readout ĉ_A may be represented by the "active" (for instance phosphorylated) form A* of a chemical species. Binding events cause the receptor to activate A into A*, which gets subsequently deactivated. Both the activation and deactivation of A are catalyzed by a second chemical species in its active form, B*. Thus, upon a binding event, the concentration of active A* is increased by:

Δ[A*] = k⁺_A [A][B*],   (16)

and it decays between binding events according to:

d[A*]/dt = −k⁻_A [B*][A*],   (17)

where k^±_A are biochemical parameters.

To implement (15), the concentration of B* must be controlled by the square root of A*. This dependence can be achieved by assuming that B is activated into B* through the catalytic activity of A*, and that B* gets deactivated cooperatively as a dimer:

d[B*]/dt = k⁺_B [B][A*] − k⁻_B [B*]²,   (18)

where k^±_B are biochemical reaction rates. Assuming that the kinetics of B are fast compared to A, we obtain [B*] = ([B]k⁺_B/k⁻_B)^{1/2} √[A*] and

d[A*]/dt = α √[A*] ( β Σ_{i=1}^n δ(t − t_i) − [A*] ),   (19)

with α = k⁻_A ([B]k⁺_B/k⁻_B)^{1/2} and β = k⁺_A[A]/k⁻_A.
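As a sanity check of (19), one can verify numerically that the readout settles around β times the binding rate, since the drift vanishes on average when [A*] = βr. This is our own sketch in rescaled units with α = β = 1 and a constant binding rate; all values are illustrative, not the paper's simulation parameters:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta = 1.0, 1.0        # rescaled units (our choice)
r_true = 50.0                 # constant binding rate for this test
dt, T = 1e-4, 20.0
A = 10.0                      # initial readout [A*]
samples = []
n_steps = int(T / dt)
for step in range(n_steps):
    n = rng.poisson(r_true * dt)                    # binding events in [t, t+dt]
    A += alpha * np.sqrt(A) * (beta * n - A * dt)   # Eq. (19)
    if step > n_steps // 2:                         # discard the transient
        samples.append(A)
avg = np.mean(samples)
print(avg)  # close to beta * r_true = 50
```

The fluctuations of A around its mean reflect the fundamental sensing noise discussed above, with relative magnitude shrinking as the binding rate grows.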
If A and B are in excess, and thus approximately constant, then this biochemical network exactly implements (15), with 4Da ĉ_A ≡ k⁻_A [A*]/(k⁺_A[A]), and τ = τ_net ≡ 1/(α²β) = k⁻_B/(k⁺_B k⁺_A k⁻_A [A][B]).

Interestingly, the amount of inactive (≈ total) B controls the time scale of concentration fluctuations, and could be tuned through gene regulation to adapt to different speeds of environmental fluctuations. A biochemical network might be able to find the optimal τ and then adjust [B] accordingly by empirically measuring the fold-change of r(t) (which can be done by biochemical networks, see e.g. [22]) but with a delay, ⟨r(t + Δt)/r(t)⟩ = e^{Δt/2τ}, and then inverting the relationship to extract τ.

We tested the performance of the biochemical network for sensing concentration by simulating (16)-(18) with a fluctuating ligand concentration c(t) with characteristic time scale τ*. For concreteness, we set c*(0) = 10 nM, τ* = 10 s, k⁺_A[A] = 0. , k⁻_A = k⁺_B = k⁻_B = 1 µM⁻¹ s⁻¹, and [B] = 10 µM, so that τ_net = τ*. Fig. 2B shows the network estimate ĉ_A(t) along with the true value c*(t). The empirical error ⟨(ĉ_A − c*)²⟩ as a function of c*, averaged over 10 s (Fig. 2B, inset), again shows an excellent agreement with the theoretical bound 1/√(4Dacτ).

Discussion.
For the sake of clarity our analysis made simplifying assumptions which can be easily relaxed. Our proposed biochemical implementation assumed a constant burst of activity following each binding event, consistent with the optimal estimation strategy. However, in real receptors, stochasticity in the bound time is known to double the variance in the estimate [5] (App. D [20]). Treating this effect simply adds a factor √2: ⟨δc²⟩/c² ≈ √2/√(4Dacτ).

We also ignored periods during which the receptor was bound. During that time the receptor is blind to the external world, and the posterior evolves according to the prior: ∂_t P = (1/2τ) ∂²_ϕ P, ∂_t ϕ̂ = 0 and ∂_t σ² = 1/τ. In our results, these "down times" renormalize the effective observation time by the fraction of time the receptor is free, p_free = (1 + 4Dacu)^{−1}, where u is the average bound time: ⟨δc²⟩/c² ≈ 1/√(4Dac p_free τ) (App. D [20]). Combining the two effects (stochasticity in bound time and receptor availability) would yield ⟨δc²⟩/c² ≈ √2/√(4Dac p_free τ).

The field theory of (2) is mathematically similar to the problem of estimating a density function from a small sample set with a smoothing prior [15–17, 19]. The main difference lies in the domain of observations. In density estimation the whole function {ϕ(t)}_{t∈[0,T]} is inferred together on the whole domain of t, while sensors can only learn from past observations, i.e. the t' < t half-plane. However, our solution can easily be generalized to deal with the entire time domain using the forward-backward algorithm (App. E [20]). Eqs. (5)-(6) and (7)-(8) can be solved both forward (from 0 to t) and backward (from T to t, with time reversal) in time, giving P_→(ϕ), ϕ̂_→, σ²_→ for the forward solution (the one treated in this article), and P_←(ϕ), ϕ̂_←, σ²_← for the backward solution.
The Bayesian posterior at any given time is then given by ∝ P_→(ϕ) P_←(ϕ), of mean (σ²_← ϕ̂_→ + σ²_→ ϕ̂_←)/(σ²_← + σ²_→) and variance σ²_← σ²_→/(σ²_← + σ²_→) in the Gaussian approximation. While this situation is not relevant for concentration sensing, our general solution should be applicable to problems of density estimation. The saddle-point approximation usually made in that context [15–17] is expected to work in the same limit as our Gaussian Ansatz; however, recent work has emphasized the importance of non-Gaussian fluctuations for small datasets [19].

The biological implementation we propose is speculative. An interesting direction would be to identify square-root or similar control of receptor signaling in real biological systems, and to interpret it in terms of optimal Bayesian filtering. Signaling pathways dealing with concentration changes over several orders of magnitude, such as bacterial chemotaxis, typically use adaptation mechanisms to increase the dynamic range of sensing [23], a feature that is absent from our approach as we neglect noise in the signaling output. Combining adaptation design with ideas from Bayesian estimation could help us gain insight into the fundamental bounds and resource allocation tradeoffs that limit biological information processing.

Acknowledgments.
The authors would like to thank the Casa Matemática Oaxaca of the Banff International Research Station, where this work was initiated. TM was partially supported by Agence Nationale pour la Recherche (ANR) grant No. ANR-17-ERC2-0025-01 "IRREVERSIBLE", and IN by NSF Grants No. PHY-1410978 and IOS-1822677.

[1] H. C. Berg and E. M. Purcell, Biophys. J., 193 (1977).
[2] W. Bialek and S. Setayeshgar, Proc. Natl. Acad. Sci. U. S. A., 10040 (2005).
[3] K. Kaizu, W. De Ronde, J. Paijmans, K. Takahashi, F. Tostevin, and P. R. ten Wolde, Biophys. J., 976 (2014).
[4] G. Aquino, N. S. Wingreen, and R. G. Endres, J. Stat. Phys., 1353 (2016).
[5] R. G. Endres and N. S. Wingreen, Phys. Rev. Lett., 158101 (2009).
[6] E. D. Siggia and M. Vergassola, Proc. Natl. Acad. Sci. U. S. A., E3704 (2013).
[7] J.-B. Lalanne and P. François, Proc. Natl. Acad. Sci., 1898 (2015).
[8] T. Mora, Phys. Rev. Lett., 038102 (2015).
[9] V. Singh and I. Nemenman, PLoS Comput. Biol., 1 (2017).
[10] V. Singh and I. Nemenman, arXiv:1906.08881 (2019).
[11] M. Carballo-Pacheco, J. Desponds, T. Gavrilchenko, A. Mayer, R. Prizak, G. Reddy, I. Nemenman, and T. Mora, Phys. Rev. E (2019).
[12] E. Kussell and S. Leibler, Science, 2075 (2005).
[13] T. Mora and N. S. Wingreen, Phys. Rev. Lett., 1 (2010).
[14] Z. Chen, Statistics, 1 (2003).
[15] W. Bialek, C. Callan, and S. Strong, Phys. Rev. Lett., 4693 (1996).
[16] I. Nemenman and W. Bialek, Phys. Rev. E, 2 (2002).
[17] J. B. Kinney, Phys. Rev. E (2014).
[18] J. B. Kinney, Phys. Rev. E (2015).
[19] W. C. Chen, A. Tareen, and J. B. Kinney, Phys. Rev. Lett., 160605 (2018).
[20] See Supplemental Material for detailed derivations.
[21] A. Kolmogoroff, C. R. Acad. Sci. Paris, 2043 (1939).
[22] L. Goentoro, O. Shoval, M. W. Kirschner, and U. Alon, Mol. Cell, 894 (2009).
[23] M. D. Lazova, T. Ahmed, D. Bellomo, R. Stocker, and T. S. Shimizu, Proc. Natl. Acad. Sci. U. S. A., 13870 (2011).
Appendix A: Field theory for concentration sensing
We first recall the problem outlined in the main text for self-consistency. Receptor binding happens with rate r(t) = 4Dac(t) = r_0 e^{−ϕ(t)}. The field ϕ(t) follows a random walk with characteristic time τ*:

dϕ(t) = τ^{−1/2} dW.   (A1)

In the following ϕ* will refer to the actual realization of the random walk, with characteristic time τ*, and ϕ will refer to the guess made based on the observation of binding events, while τ denotes the assumed characteristic time scale. The probability of the time trace of ϕ in the absence of any observation is given by

P_prior({ϕ(t)}) = (1/Z_prior) exp[ −(τ/2) ∫_0^T dt (dϕ/dt)² ],   (A2)

with Z_prior = (2πdt/τ)^{T/2dt}, where dt is an infinitesimal discretization scale. During each interval [t, t + dt] without a binding event, the likelihood reads e^{−dt r(t)} = e^{−dt r_0 e^{−ϕ(t)}}. A binding event in interval [t, t + dt] has likelihood dt r_0 e^{−ϕ(t)}. Thus the posterior probability reads:

P({ϕ(t)}) = P(t_1, ..., t_n | {ϕ(t)}) P_prior({ϕ(t)}) / P(t_1, ..., t_n) = (1/Z) exp{ −∫_0^T dt [ (τ/2)(dϕ/dt)² + r_0 e^{−ϕ(t)} ] − Σ_{i=1}^n ϕ(t_i) }.   (A3)

We define:

Z(ϕ, t) = ∫ Dϕ δ(ϕ(t) − ϕ) exp{ ∫_0^t dt' [ −(τ/2)(dϕ/dt')² − r_0 e^{−ϕ(t')} − ϕ(t') Σ_{i=1}^n δ(t' − t_i) ] }.   (A4)

The marginal of ϕ at time t, P(ϕ, t | {t_1, ..., t_{n'}}), where n' is the last binding event before t, is then:

P(ϕ, t) = Z(ϕ, t)/Z(t),   Z(t) = ∫ dϕ' Z(ϕ', t),   (A5)

and Z = Z(T).

The partial partition function Z(ϕ, t) can be computed recursively. Let us start with the case where no binding occurs between t and t + dt. Then:

Z(ϕ, t + dt) = ∫ dϕ' Z(ϕ', t) exp[ −(τ/2dt)(ϕ − ϕ')² − dt r_0 e^{−ϕ} ].   (A6)

Equivalently,

P(ϕ, t + dt) = (1/z(t)) ∫ dϕ' P(ϕ', t) exp[ −(τ/2dt)(ϕ − ϕ')² − dt r_0 e^{−ϕ} ],   (A7)

where

z(t) = ∫ dϕ dϕ' P(ϕ', t) exp[ −(τ/2dt)(ϕ − ϕ')² − dt r_0 e^{−ϕ} ]   (A8)

is a normalization constant, which can be used to calculate Z(t) recursively: Z(t + dt) = Z(t) z(t). Let us now take the limit dt → 0. The Gaussian integral becomes infinitely peaked. Expanding P(ϕ', t) = P(ϕ, t) + ∂P(ϕ, t)/∂ϕ (ϕ' − ϕ) + ∂²P(ϕ, t)/∂ϕ² (ϕ' − ϕ)²/2, we obtain:

∂P(ϕ, t)/∂t = −r_0 ( e^{−ϕ} − ⟨e^{−ϕ}⟩ ) P(ϕ, t) + (1/2τ) ∂²P/∂ϕ²,   (A9)

where ⟨f(ϕ)⟩ = ∫ dϕ P(ϕ, t) f(ϕ), and

z(t) = (1 − dt r_0 ⟨e^{−ϕ}⟩) √(2πdt/τ).   (A10)

Defining Z(t) = (2πdt/τ)^{t/2dt} Z̃(t), we get:

Z̃(t + dt)/Z̃(t) = 1 − dt r_0 ⟨e^{−ϕ}⟩,   or   d ln Z̃(t)/dt = −r_0 ⟨e^{−ϕ}⟩.   (A11)

Now assume that there is a binding event between t = t_i and t_i + dt. Then P(ϕ, t) is discontinuous at t_i and terms of order dt can be ignored:

P(ϕ, t_i + dt) = e^{−ϕ} P(ϕ, t_i) / ⟨e^{−ϕ}⟩,   (A12)

and Z̃(t_i + dt)/Z̃(t_i) = ⟨e^{−ϕ}⟩.

In summary, the evolution equation for P reads:

∂P(ϕ, t)/∂t = (1/2τ) ∂²P/∂ϕ² − r_0 ( e^{−ϕ} − ⟨e^{−ϕ}⟩ ) P(ϕ, t) + ( e^{−ϕ}/⟨e^{−ϕ}⟩ − 1 ) P(ϕ, t) Σ_{i=1}^n δ(t − t_i),   (A13)

and the partition function is given by

Z(t) = (2πdt/τ)^{t/2dt} Z̃(t),   with   ∂ ln Z̃/∂t = −r_0 ⟨e^{−ϕ}⟩ + ln⟨e^{−ϕ}⟩ Σ_{i=1}^n δ(t − t_i).   (A14)

Appendix B: Gaussian approximation
Because of the term ⟨e^{−ϕ}⟩, (A13) is non-linear and cannot be solved analytically. However, if we assume that P(ϕ, t) is Gaussian,

P(ϕ, t) = (1/√(2πσ²(t))) exp[ −(ϕ − ϕ̂(t))²/2σ²(t) ],   (B1)

then closed equations can be obtained for the mean ϕ̂(t) and variance σ²(t):

dϕ̂/dt = d⟨ϕ⟩/dt = (1/2τ) ∫ dϕ ϕ ∂²P/∂ϕ² − r_0 ( ⟨ϕ e^{−ϕ}⟩ − ⟨ϕ⟩⟨e^{−ϕ}⟩ ) + ( ⟨ϕ e^{−ϕ}⟩/⟨e^{−ϕ}⟩ − ⟨ϕ⟩ ) Σ_{i=1}^n δ(t − t_i) = σ² [ r_0 e^{−ϕ̂+σ²/2} − Σ_{i=1}^n δ(t − t_i) ],   (B2)

dσ²/dt = d(⟨ϕ²⟩ − ⟨ϕ⟩²)/dt = (1/2τ) ∫ dϕ ϕ² ∂²P/∂ϕ² − r_0 ( ⟨ϕ² e^{−ϕ}⟩ − ⟨ϕ²⟩⟨e^{−ϕ}⟩ ) + ( ⟨ϕ² e^{−ϕ}⟩/⟨e^{−ϕ}⟩ − ⟨ϕ²⟩ ) Σ_{i=1}^n δ(t − t_i) − d⟨ϕ⟩²/dt = 1/τ − σ⁴ r_0 e^{−ϕ̂+σ²/2},   (B3)

d ln Z̃/dt = −r_0 e^{−ϕ̂+σ²/2} − (ϕ̂ − σ²/2) Σ_{i=1}^n δ(t − t_i),   (B4)

where we have used the Gaussian integral rules ⟨e^{−ϕ}⟩ = e^{−ϕ̂+σ²/2}, ⟨ϕ e^{−ϕ}⟩ = (ϕ̂ − σ²) e^{−ϕ̂+σ²/2}, ⟨ϕ² e^{−ϕ}⟩ = (σ² + (ϕ̂ − σ²)²) e^{−ϕ̂+σ²/2}, and ⟨ϕ²⟩ = ϕ̂² + σ², and we have used integration by parts to calculate the integrals.

Appendix C: Partition function and time scale inference
The most likely timescale τ can be inferred from the observations as well, by using Bayes' rule again:

P(τ) ∝ ∫ Dϕ(t) P(t_1, ..., t_n | {ϕ(t)}) P_prior({ϕ(t)} | τ) P_prior(τ) = Z(τ) P_prior(τ)/Z_prior(τ) = Z̃(τ) P_prior(τ),   (C1)

where we have used Z_prior(τ) = (2πdt/τ)^{T/2dt}. We can calculate Z̃ from the Gaussian approximation (B4):

log Z̃ ≈ −∫_0^T dt r̂(t) e^{σ²(t)/2} + Σ_{i=1}^n ( ln r̂(t_i) + σ²(t_i)/2 ).   (C2)

This expression looks like the log-likelihood of a sequence of binding events, up to the σ² corrections. Bear in mind that r̂(t) = r_0 e^{−ϕ̂(t)} is the estimated rate, not the true one r*(t). We have r̂(t) = r*(t) e^{−ε(t)}, where ε = ϕ̂ − ϕ*. Expanding in ε and σ², we obtain:

log Z̃ = −∫_0^T dt r*(t) + Σ_{i=1}^n ln r*(t_i) + ∫_0^T dt [ r*(t) − Σ_{i=1}^n δ(t − t_i) ] ( ε(t) − σ²(t)/2 ) − ∫_0^T dt r*(t) ε²(t)/2.   (C3)

Both ε(t) and Σ_i δ(t − t_i) − r* are stochastic processes of mean 0. They are also uncorrelated with each other, so that the third term is sub-linear in T. The last term, which scales with T, thus dominates the τ-dependent part of the likelihood. It is maximized for minimum mean squared error, that is for τ = τ*, as shown in the main text. At large T, the Z̃(τ) term exponentially dominates the prior P_prior(τ), so that P(τ) is peaked around the maximum of Z̃(τ).

Eq. (C2) is an example of the usual bias-variance tradeoff (where "bias" refers to errors made from overfitting the data, and "variance" to errors due to limited data). At small τ, r̂ changes rapidly, jumping up when a new binding happens and then rapidly decreasing. Thus the term Σ ln r̂(t_i) increases, indicating an increase in the goodness of fit. At the same time σ²(t) increases, so that at small τ the integral term in (C2) becomes large and negative, exploding exponentially for τ → 0. In contrast, for τ → ∞, r̂(t_i) = n/T and σ² → 0, so that now the goodness of fit is small. Overall, there is an optimal τ that maximizes Z̃. The three terms in (C2) parallel the three terms (fluctuation determinant, goodness of fit, and the kinetic term) in the field-theoretic formulation of continuous probability density estimation from samples [15, 16], and thus we expect that maximizing Z̃ will result in an optimal τ not only when the true concentration undergoes a geometric random walk, but also when it undergoes various anomalous walks [16].

Appendix D: Bound time
We have so far neglected the time the receptor remains bound to the ligand. Let us denote by t_{i,off} the unbinding time following the binding at time t_i. When the receptor is bound, t_i < t ≤ t_{i,off}, no information can be obtained from the environment, and the evolution equation for the posterior simply follows the diffusion law:

∂P/∂t = (1/2τ) ∂²P/∂ϕ².   (D1)

The rest of the time, t_{i,off} < t ≤ t_{i+1}, (A13) holds. Similarly, in the Gaussian approximation, we get

dϕ̂/dt = 0,   dσ²/dt = 1/τ   (D2)

for bound receptors, t_i < t ≤ t_{i,off}, and (B2)-(B3) for unbound receptors. In the limit where binding and unbinding events are frequent compared to τ, rτ ≫ 1, we have:

dσ²/dt = 1/τ − p_free σ⁴ r_0 e^{−ϕ̂+σ²/2},   (D3)

where

p_free(t) = ⟨t_i − t_{i−1,off}⟩/⟨t_i − t_{i−1}⟩ = r(t)^{−1}/(r(t)^{−1} + u) = 1/(1 + r(t)u) = 1/(1 + 4Dac(t)u),   (D4)

where u = ⟨t_{i,off} − t_i⟩ is the average bound time, and r(t)^{−1} = ⟨t_i − t_{i−1,off}⟩ the average unbound time. The uncertainty then reads:

⟨δc²⟩/c² = 1/√(r p_free τ) = 1/√(4Da p_free c τ).   (D5)

The time the receptor remains bound has another impact on the ability to sense concentration. In the biochemical scheme proposed in the main text, each binding event causes a fixed burst of activity δ(t − t_i), regardless of the bound time. In the simplest receptors, however, signaling occurs during the time the receptor is bound, which is itself stochastic. We can model this by replacing the Dirac delta by a random burst of activity, b_i δ(t − t_i), with b_i proportional to the bound time, b_i = (t_{i,off} − t_i)/u, so that ⟨b_i⟩ = 1 and Var(b_i) = 1, since the bound time is distributed exponentially according to (1/u) e^{−(t_{i,off} − t_i)/u}. More generally we can consider ⟨b_i⟩ = 1 and Var(b_i) = CV², where 0 ≤ CV ≤ 1 denotes the coefficient of variation. The general case ⟨b_i⟩ ≠ 1 can be renormalized away into the biochemical parameters. The special case CV = 0 gives back the results of the main text. When CV > 0, the variance of ∫_t^{t+Δt} Σ_i b_i δ(t − t_i) over an interval of duration Δt instead reads:

Var[ ∫_t^{t+Δt} Σ_i b_i δ(t − t_i) ] = ⟨b_i⟩² Var(m) + Var(b_i) ⟨m⟩ = r*Δt (1 + CV²),   (D6)

where m is the number of binding events in the interval. As a result, the noise dW' in the main text gains a factor √(1 + CV²), and the error becomes:

⟨ε²⟩ = (1/(2√r)) ( (1 + CV²)/√τ + √τ/τ* ),   (D7)

with minimal error reached for τ = (1 + CV²)τ*:

⟨δc²⟩/c² = ⟨ε²⟩ = √(1 + CV²)/√(rτ) = √(1 + CV²)/√(4Dacτ).   (D8)

Taking into account both receptor occupancy and stochasticity in bound times finally yields:

⟨δc²⟩/c² = √(1 + CV²)/√(4Da p_free c τ),   (D9)

or, in the case of completely stochastic unbinding, CV = 1:

⟨δc²⟩/c² = 1/√(2Da p_free c τ).   (D10)
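The corrections above combine multiplicatively; a small helper (our sketch, with a hypothetical function name) makes the bookkeeping of Eq. (D9) explicit:

```python
import numpy as np

def relative_error_sq(D, a, c, tau, u=0.0, cv=0.0):
    """<dc^2>/c^2 from Eq. (D9): sqrt(1+CV^2)/sqrt(4 D a p_free c tau).

    u is the average bound time and cv the coefficient of variation of
    the activity bursts; u = cv = 0 recovers the bound of Eq. (13).
    """
    p_free = 1.0 / (1.0 + 4 * D * a * c * u)            # Eq. (D4)
    return np.sqrt(1.0 + cv**2) / np.sqrt(4 * D * a * p_free * c * tau)

print(relative_error_sq(D=1.0, a=1.0, c=10.0, tau=10.0))                   # 0.05
print(relative_error_sq(D=1.0, a=1.0, c=10.0, tau=10.0, u=0.025, cv=1.0))  # 0.1
```

With these (arbitrary) parameters the second call has p_free = 1/2 and CV = 1, doubling the squared error relative to the ideal receptor.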
Appendix E: Beyond concentration sensing – using the future
Information about future binding events can be exploited by using the backward equation for P_← = P(ϕ, t | {t_{n'+1}, ..., t_n}):

∂P_←(ϕ, t)/∂t = −(1/2τ) ∂²P_←/∂ϕ² + r_0 ( e^{−ϕ} − ⟨e^{−ϕ}⟩ ) P_←(ϕ, t) − ( e^{−ϕ}/⟨e^{−ϕ}⟩ − 1 ) P_←(ϕ, t) Σ_{i=n'+1}^n δ(t − t_i),   (E1)

where n' is the last binding event before t. We denote by P_→ = P(ϕ, t | {t_1, ..., t_{n'}}) the solution of the forward equation (A13) discussed before. The distribution of ϕ at time t is then given by:

P(ϕ, t | {t_1, ..., t_n}) ∝ P({t_1, ..., t_n} | ϕ, t) P_prior(ϕ, t) = P({t_1, ..., t_{n'}} | ϕ, t) P({t_{n'+1}, ..., t_n} | ϕ, t) P_prior(ϕ, t) ∝ P(ϕ, t | {t_1, ..., t_{n'}}) P(ϕ, t | {t_{n'+1}, ..., t_n}) / P_prior(ϕ, t) ∝ P_→(ϕ, t) P_←(ϕ, t),   (E2)

where we have used the fact that the past and future are conditionally independent given ϕ at time t, since the process is Markovian, and a uniform prior. We thus have:

P(ϕ, t) = P_→(ϕ, t) P_←(ϕ, t) / ∫ dϕ' P_→(ϕ', t) P_←(ϕ', t).   (E3)

Using the Gaussian solution, and denoting ϕ̂_→, σ²_→ the parameters of the forward solution, and ϕ̂_←, σ²_← those of the backward solution, we obtain:

P(ϕ, t) = (1/√(2πσ²)) exp[ −(ϕ − ϕ̂)²/2σ² ]   (E4)

with

ϕ̂ = (ϕ̂_→ σ²_← + ϕ̂_← σ²_→)/(σ²_← + σ²_→),   σ² = σ²_← σ²_→/(σ²_← + σ²_→).   (E5)

These formulas could be used in estimates of the density r(t) from sparse observations t_i, where r(t) is interpreted as a density of events.
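In the Gaussian approximation, the combination (E4)-(E5) is simply a precision-weighted fusion of the two Gaussian estimates, as when combining two independent Gaussian likelihoods. A minimal sketch (ours):

```python
def combine(phi_fwd, var_fwd, phi_bwd, var_bwd):
    """Fuse forward and backward Gaussian posteriors, Eq. (E5)."""
    var = var_fwd * var_bwd / (var_fwd + var_bwd)
    phi = (phi_fwd * var_bwd + phi_bwd * var_fwd) / (var_fwd + var_bwd)
    return phi, var

# equal uncertainties: the fused mean is the midpoint, the variance is halved
print(combine(0.0, 1.0, 2.0, 1.0))  # (1.0, 0.5)
```

Note that the fused variance is always smaller than either input variance: using the future strictly reduces the uncertainty relative to the forward (causal) solution alone.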