[PDF] Coupled differentiation and division of embryonic stem cells inferred from clonal snapshots

Abstract

The deluge of single-cell data obtained by sequencing, imaging and epigenetic markers has led to an increasingly detailed description of cell state. However, it remains challenging to identify how cells transition between different states, in part because data are typically limited to snapshots in time. A prerequisite for inferring cell state transitions from such snapshots is to distinguish whether transitions are coupled to cell divisions. To address this, we present two minimal branching process models of cell division and differentiation in a well-mixed population. These models describe dynamics where differentiation and division are coupled or uncoupled. For each model, we derive analytic expressions for each subpopulation's mean and variance and for the likelihood, allowing exact Bayesian parameter inference and model selection in the idealised case of fully observed trajectories of differentiation and division events. In the case of snapshots, we present a sample path algorithm and use this to predict optimal temporal spacing of measurements for experimental design. We then apply this methodology to an \textit{in vitro} dataset assaying the clonal growth of epiblast stem cells in culture conditions promoting self-renewal or differentiation. Here, the larger number of cell states necessitates approximate Bayesian computation. For both culture conditions, our inference supports the model where cell state transitions are coupled to division. For culture conditions promoting differentiation, our analysis indicates a possible shift in dynamics, with these processes becoming more coupled over time.

Full PDF

CCoupled diﬀerentiation and division of embryonicstem cells inferred from clonal snapshots

Liam J. Ruske

Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Parks Road,Oxford, OX1 3PU, UKE-mail:

[email protected]

Jochen Kursawe

School of Mathematics and Statistics, University of St Andrews, North Haugh, StAndrews, UKE-mail: [email protected]

Anestis Tsakiridis

Centre for Stem Cell Biology, Department of Biomedical Science, The University ofSheﬃeld, Sheﬃeld, United KingdomE-mail: [email protected]

Valerie Wilson

MRC Centre for Regenerative Medicine, The University of Edinburgh, EdinburghBioQuarter, 5 Little France Drive, Edinburgh, EH164UU, UKE-mail:

[email protected]

Alexander G. Fletcher

School of Mathematics and Statistics, University of Sheﬃeld, Hicks Building,Hounsﬁeld Road, Sheﬃeld, S10 1RB, UKBateson Centre, University of Sheﬃeld, Sheﬃeld S10 2TN, UKE-mail: [email protected]

Richard A. Blythe

SUPA, School of Physics and Astronomy, University of Edinburgh, James ClerkMaxwell Building, Peter Guthrie Tait Road, Edinburgh, EH9 3FD, UKE-mail:

[email protected]

Linus J. Schumacher

MRC Centre for Regenerative Medicine, The University of Edinburgh, EdinburghBioQuarter, 5 Little France Drive, Edinburgh, EH164UU, UKE-mail:

[email protected] a r X i v : . [ q - b i o . Q M ] J un oupled diﬀerentiation and division of stem cells April 2020

Abstract.

The deluge of single-cell data obtained by sequencing, imaging andepigenetic markers has led to an increasingly detailed description of cell state. However,it remains challenging to identify how cells transition between diﬀerent states, in partbecause data are typically limited to snapshots in time. A prerequisite for inferring cellstate transitions from such snapshots is to distinguish whether transitions are coupledto cell divisions. To address this, we present two minimal branching process modelsof cell division and diﬀerentiation in a well-mixed population. These models describedynamics where diﬀerentiation and division are coupled or uncoupled. For each model,we derive analytic expressions for each subpopulation’s mean and variance and for thelikelihood, allowing exact Bayesian parameter inference and model selection in theidealised case of fully observed trajectories of diﬀerentiation and division events. Inthe case of snapshots, we present a sample path algorithm and use this to predictoptimal temporal spacing of measurements for experimental design. We then applythis methodology to an in vitro dataset assaying the clonal growth of epiblast stemcells in culture conditions promoting self-renewal or diﬀerentiation. Here, the largernumber of cell states necessitates approximate Bayesian computation. For both cultureconditions, our inference supports the model where cell state transitions are coupledto division. For culture conditions promoting diﬀerentiation, our analysis indicates apossible shift in dynamics, with these processes becoming more coupled over time. oupled diﬀerentiation and division of stem cells

1. Introduction

Changes in gene expression underlie many aspects of cellular behaviour in tissuedevelopment, homeostasis, and regeneration. The concept of discrete cell states isintended to capture the distinct patterns of gene expression that are observed withintissues over time. The deluge of single cell data is leading to an increasingly detaileddescription of cell state [1, 2]. There are various ways to interrogate a cell’s state indiﬀerent contexts in vivo and in vitro. Modern technologies such as scRNAseq producevast amounts of data but are costly, laborious to analyse, and relatively noisy. Oldertechniques, such as immunoﬂuorescent stainings, where cell state can be deﬁned by theco-expression of a small number of genes/proteins, are cheaper and simpler and thusstill remain heavily used.While there are numerous techniques for classifying cell states based on single celldata, and ordering them along pseudotime trajectories [3, 4], such classiﬁcation onlyoﬀers very limited insight into the dynamics of cell state transitions [4, 5]. The abilityto quantify the dynamics of cell state transitions promises greater insights into cellheterogeneity and how diﬀerentiation varies with culture conditions, and could helpsteer targeted diﬀerentiation of stem cells in vitro. Currently, the data remain limitedto snapshots in most cases, with only one or few time-point measurements available, anda loss of cell identity across measurements due to their destructive nature. Fluorescentreporters of gene expression for live-imaging circumvent this, but their availability isrelatively limited and they allow observation of at most a few genes.Correct tissue development and regeneration requires balanced cell proliferationand diﬀerentiation at the tissue level to ensure the right cell type in the right placeat the right amount. However, which dynamics at the cellular level give rise to thispopulation behaviour is in many cases still unclear. In particular, deciding how orwhether cell state transitions are coupled to, or occur independently of, cell division isa pre-requisite for quantifying cell state transition rates. Determining evidence for suchcoupling from snapshot data is an incompletely solved problem [6].If we only have access to snapshots of clonal colony growth, how can one distinguishwhether cell division and cell state transitions are coupled? The answer to this questionwill aﬀect how well one can infer cell state transition rates, as assuming a “wrong”underlying model will impair the quality of the inference. Conversely, can we informexperimental design to maximise the information gained from experimental data? Toaddress this, we combine biophysical theory of stochastic population dynamics withBayesian parameter inference and model comparison. As an example we choose thewell-established system of epiblast stem cells (EpiSCs) [7]. EpiSCs are self-renewing,pluripotent cell lines that are an in vitro equivalent of the embryonic epiblast tissue. Invivo, these cells begin to diﬀerentiate (undergo lineage commitment) during gastrulation.We start with minimal branching process models of cell division and cell statetransition (e.g. reversible diﬀerentiation), and derive the population dynamics andexact likelihoods for inference based on complete trajectory data (observing every cell oupled diﬀerentiation and division of stem cells

2. Methods

We begin this section by introducing two continuous-time Markov processes as minimalstochastic models for stem cell dynamics. These simple models are then generalized todescribe the dynamics of epiblast stem cells (EpiSC) in more detail. We will close thissection with a brief summary of Bayesian parameter inference and model selection.

We begin by deﬁning minimal models of cell division and diﬀerentiation in a well-mixedpopulation. We assume that the population dynamics can be adequately described by acontinuous-time Markovian stochastic process where state transition rates depend onlyon the current state of the system. This in turn is speciﬁed by a set of populationnumbers { N } of diﬀerent cell states. In this section we will consider systems with twodiﬀerent cell states and assume that cells within the population divide and change stateindependently of one another.We consider two distinct models of population dynamics: one in which celldiﬀerentiation is either coupled to cell division (Model C) one in which processes areuncoupled (Model U). In both models there are two cell states, A and B. The essentialdiﬀerence is that in Model C, cells of a diﬀerent state can be produced only through celldivision, whereas in Model U cells can change state reversibly without dividing. Thedynamics in the state space spanned by N A ( t ) and N B ( t ) are illustrated in Figure 1.Initial populations N A (0) and N B (0) will be denoted as N A and N B , respectively. Model C is deﬁned by the following continuous-time transition rates:[ N A , N B ] →  [ N A + 1 , N B ] , with rate h C := N A λ AA ,[ N A , N B + 1] , with rate h C := N A λ AB , [ N A − , N B + 2] , with rate h C := N A λ BB , (1) oupled diﬀerentiation and division of stem cells h Ck . In each of these three reactions,an A-state cell divides, producing two A-state cells, an A- and a B-state cell, or twoB-state cells, at per-capita rates λ AA , λ AB , and λ BB respectively. In this model, B-statecells do not divide.Let p i,j ( t ) be the probability that our cell population consists of i A-state and j B-state cells at time t , then the master equation for Model C is given by dp i,j dt = ( i − λ AA p i − ,j + ( i + 1) λ BB p i +1 ,j − + iλ AB p i,j − − i ( λ AA + λ AB + λ BB ) p i,j . (2)This master equation is a diﬀerential equation deﬁning the time evolution of populationprobabilities p i,j ( t ) caused by stochastic transitions. In Model U, the state transition rates are:[ N A , N B ] →  [ N A + 1 , N B ] , with rate h U := N A λ A ,[ N A − , N B + 1] , with rate h U := N A k AB , [ N A , N B + 1] , with rate h U := N B λ B , [ N A + 1 , N B − , with rate h U := N B k BA , (3)with transition rates h Uk . These reactions correspond to division of A-state and B-statecells into two daughter cells of the same state at per-capita rates λ A and λ B , respectively,and to dynamic state changes of an A-state into a B-state at per-capita rate k AB , andof a B-state into an A-state at per-capita rate k BA .The master equation of Model U is given by dp i,j dt = ( i − λ A p i − ,j + ( j − λ B p i,j − + ( j + 1) k BA p i − ,j +1 + ( i + 1) k AB p i +1 ,j − − ( iλ A + jλ B + ik AB + jk BA ) p i,j . (4) While ModelsC and U are useful for illustrative purposes, we require more cell states to make useof experimental data on epiblast-derived stem cell (EpiSC) dynamics. Here, we deﬁneEpiSC states by the binary expression of the transcription-factor genes T(Bra), Sox2,and Foxa2, which is what the experimental dataset contains information about [7].There are thus 8 distinct cell states accessible for modelling, which we will denote by Φ i :=  [ T − , S − , F − ][ T − , S − , F + ][ T − , S + , F − ][ T − , S + , F + ][ T + , S − , F − ][ T + , S − , F + ][ T + , S + , F − ][ T + , S + , F + ]  i , (5) oupled diﬀerentiation and division of stem cells (a)(b) (c) Figure 1:

Cell transitions in Models C and U. (a) The state of the system is deﬁnedby the number of A-state cells ( i ) and B-state cells ( j ). Population dynamics canbe interpreted as a 2D random walk with position-dependent jump rates. (b) ModelC allows symmetric and asymmetric divisions. (c) Model U features only symmetricdivisions, and state changes are made through reversible transitions that do not changecell number.where the index T,S,F represents gene T(Bra), Sox2 and Foxa2, respectively. Forsimplicity, we start with the assumption that cell state transitions change only oneof the three genes T, F, S at a time. This leads to 12 allowed transitions between cellstates, which can be conveniently visualised as a cube where vertices represent cell statesand edges represent transition paths (see Figure 2a).The generalisation of Model C retains the deﬁning feature that new cell states canbe created only at division of the parent. Speciﬁcally, we allow symmetric division ofcells in each state, Φ i → i , at per-capita rate θ ii . Note that we can recover ModelC by setting all but one of these rates to zero. To keep the number of parameterssmall, we accommodate only the other symmetric cell division processes, Φ i → j for oupled diﬀerentiation and division of stem cells (a)(b) (c) Figure 2:

Dynamics in generalisations of Models C and U. (a) We characterise EpiSCsby the co-expression of three genes, T(Bra), Sox2 and Foxa2, corresponding to a space ofeight possible cell states (cube vertices), S = { ( T ± , S ± , F ± ) } . A combination of these 8states may be present in a heterogeneous EPiSC population. We assume that cell statetransitions change the expression of only one gene at a time. This results in 12 possibletransitions between cell states (cube edges). (b) In the generalised Model C, cells instate Φ i ∈ S divide symmetrically with rate θ ii into two daughter cells with state Φ i ,and symmetrically with rate θ ij into two daughter cells with a neighbouring state Φ j .(c) In the generalised Model U, cells in state Φ i ∈ S divide symmetrically with rate θ ii into two daughter cells with state Φ i , and transition with rate θ ij into a neighbouringstate Φ j . oupled diﬀerentiation and division of stem cells j sharing an edge with i , at per-capita rate θ ij . These dynamics are illustrated inFigure 2b. The fact that we chose to include only symmetric cell divisions should notchange the outcome of model comparison too much, considering the following argument:the asymmetric division process Φ i → Φ i + Φ j with i (cid:54) = j eﬀectively increases the Φ j population by one. The same outcome can also be achieved either by a symmetricdivision Φ j → j or by the sequence of two symmetric cell divisions Φ i → j , i (cid:54) = j and Φ j → k , j (cid:54) = k .Since cells in any state can divide, there are eight diﬀerent reactions of the formΦ i → i . State-changing divisions of the form Φ i → j can happen between allneighbouring cell states ( i, j ) in the network shown in Figure 2a; there are 3 × θ ii = θ for all i . We further assume that switching each gene M ∈ { T, S, F } on or oﬀ does not depend on the state of the other two genes which arenot changed. This assumption implies the transition rates of genes are independent ofeach other, which is a ﬁrst approximation only. We discuss relaxing this assumptionin Section 4.2. Each gene M thus contributes two independent rate constants θ M + , θ M − for switching on and oﬀ, respectively. The extended Model C therefore has 7 freeparameters: θ = ( θ , θ T + , θ T − , θ S + , θ S − , θ F + , θ F − ) . (6)We now turn to the generalisation of Model U, recalling that the original two-statemodel allowed symmetric division of both states, and dynamic diﬀerentiation in whichone state could turn into the other. This is easily extended to 8 cell states Φ i , whereineach cell state can perform symmetric cell division at rate θ ij , and dynamical statechanges between all pairs of neighbouring cell states can occur at rates θ ij and θ ji . SeeFigure 2c.As in the extended Model C, there are eight symmetric division rates, which weagain set to a universal value θ ii = θ . Similarly the rates of dynamic cell state changesare deﬁned by the gene that is switched on or oﬀ, which again leads to a total of8 + 24 = 32 reaction types and 7 free parameters: θ = ( θ , θ T + , θ T − , θ S + , θ S − , θ F + , θ F − ) . (7)An appealing feature of Models C and U from a classical model-selection perspectiveis that they have the same number of free parameters. A problem when comparingmodels with diﬀerent numbers of free parameters is that the models which oﬀer more freeparameters will usually show better agreement with the data. However, this is becauseit can be more accurately ﬁtted to the data, not because the underlying assumptionsare more accurate in describing nature. Bayesian model selection generally eliminatesthis problem since it embodies Occams razor: simpler models have higher posteriorprobabilities than more complex models if both ﬁt the data equally well [8]. oupled diﬀerentiation and division of stem cells We use Bayesian methods to perform model selection and parameter inference. Givensome experimental data ξ , we seek the probability distribution p ( θ | ξ ) over theparameters θ of either Model C or Model U. This is achieved by appealing to Bayes’theorem, p ( θ | ξ ) = p ( ξ | θ ) p ( θ ) p ( ξ ) . (8)Here p ( θ ) encapsulates our prior beliefs as to appropriate values for the parameters θ (i.e., in the absence of any data), and p ( ξ | θ ) is the likelihood of the data ξ given aparameter choice. In this context, the likelihood is given by solving the master equationfor the model under consideration. Lastly p ( ξ ) is the probability of observing the data ξ ,and is usually obtained indirectly by ensuring that the posterior distribution is correctlynormalised. It is worth mentioning that data ξ can take very diﬀerent forms, such ascontinuous trajectory data of cell population or population snapshots at a ﬁnite numberof times. The form of ξ does not aﬀect Bayes’ theorem, but enters the calculation viathe likelihood function p ( ξ | θ ).If not only the model parameters θ are unknown, but also the model itself, wecan use Bayesian methods for model comparison. Given data ξ and the two competingmodels M C and M U , we wish to calculate the probability of each model p ( M j | ξ ) giventhe data. Following Bayes’ theorem, the posterior odds of the models are given by p ( M C | ξ ) p ( M U | ξ ) = p ( ξ | M C ) q ( M C ) p ( ξ | M U ) q ( M U ) , (9)where q ( M i ) is the prior belief that model M i is the appropriate description of the data.The selection of the “best” model is usually not based on posterior odds, but the ratioof posterior to prior odds: that is, how much more the data point to one model over theother, given our prior beliefs: B CU := p ( M C | ξ ) /p ( M U | ξ ) q ( M C ) /q ( M U ) = p ( ξ | M C ) p ( ξ | M U ) . (10)This ratio is called the Bayes factor [9]. Following (10), the marginal likelihoods p ( ξ | M j ) are the quantities of key interest. They can be obtained by integrating over theparameter space θ j of the corresponding model: p ( ξ | M j ) = (cid:90) p ( ξ | θ j , M j ) p ( θ j | M j ) d θ j , (11)where p ( θ j | M j ) = q ( θ j ) is the parameter prior of model M j .All algorithms shown in this work were implemented using python3 with standardlibraries (numpy, pandas, scipy, pyabc). All code can be accessed via GitHub(https://github.com/real-save/Stem-cell-inference). oupled diﬀerentiation and division of stem cells

3. Results

Our ﬁrst set of results pertain to how populations evolve over time in Models C and Uin their original formulation with two cell states. In particular, we will see the utility of moment generating functions . In the following we will derive equations of the probabilitygenerating function and mean expressions for N A and N B for each of Models C and U.In the last section we will also derive approximate expressions for mean populations (cid:104) Φ (cid:105) in the extended models. Applying the procedure of probabilitygenerating functions (see Appendix B) to the master equation (2) for Model C yields ∂G∂t = (cid:2) λ AA ( z A − z A ) + λ BB ( z B − z A ) + λ AB ( z A z B − z A ) (cid:3) ∂G∂z A . (12)This is a linear, ﬁrst-order PDE for G ( z , t ) and can be solved by the method ofcharacteristics [10]. We assume the system initially consists of N A A-state cells and N B B-state cells, which translates to the initial condition G ( z A , z B , t = 0) = z N A A z N B B .Using this initial condition, we ﬁnd G ( z A , z B , t ) = z N B B (cid:20) ρλ AA tan (cid:18) tρ + arctan (cid:18) ρλ AA z A + ρξ (cid:19)(cid:19) − ξ λ AA (cid:21) N A , (13)where ξ ( z B ) = λ AB ( z B − − λ AA − λ BB ,ρ ( z B ) = 2 (cid:112) λ AA λ BB z B − ξ . (14)The full time-dependence of mean population sizes then follow from ( ?? ): (cid:104) N A ( t ) (cid:105) = N A e ∆ λ t , (15) (cid:104) N B ( t ) (cid:105) = N B + N A (cid:18) λ BB + λ AB ∆ λ (cid:19) (cid:0) e ∆ λ t − (cid:1) , (16)where ∆ λ := λ AA − λ BB is the diﬀerence of symmetric division rates. To understandthese results, we note that cells in state A divide at per-capita rate λ AA , and are lost atper-capita rate λ BB . These competing processes lead to either an exponential growth( λ AA > λ BB ) or exponential decay ( λ AA < λ BB ) in the mean number of cells in state A.Since there is no loss of cells from state B, the mean number of cells in state B eithergrows exponentially with time ( λ AA > λ BB ) or plateaus at a ﬁnite value ( λ AA < λ BB ).Mean population dynamics are therefore sensitive to changes in symmetric divisionrates since ∆ λ controls the exponential growth/decay of mean populations. On theother hand, the rate of asymmetric cell divisions, λ AB , weakly aﬀects mean dynamicsas it enters eqn (16) only as a constant prefactor. oupled diﬀerentiation and division of stem cells

11A particular feature of Model C is that there is an absorbing state. This arisesbecause A-state cells can be lost via the process A → B , and B-state cells do notdivide. Therefore, this a ﬁnite probability that the number of cells in state A willeventually decrease to zero and the system dynamics will come to a halt. We can obtain p ext ( t ), the extinction probability that this has occurred by time t , from the momentgenerating function as follows: p ext ( t ) = Pr [ N A ( t ) = 0] = ∞ (cid:88) j =0 p ,j ( t ) = G (0 , , t ) . (17)From (13) we ﬁnd p ext ( t ) = (cid:18) λ BB λ AA + λ BB + | ∆ λ | coth ( t | ∆ λ | / (cid:19) N A . (18)In the special case λ AA = λ BB = λ this simpliﬁes to p ext ( t ) = (cid:18) λt λt (cid:19) N A . (19)The extinction probability is monotonically increasing with time and converges to aﬁnite value lim t →∞ p ext ( t ) = p maxext . For λ BB ≥ λ AA , it follows p maxext = 1 and A-state cellswill always die out. For λ BB < λ AA however, p maxext = ( λ BB /λ AA ) N A < We can solve Model U by following the samesequence of steps. We ﬁrst arrive at the PDE ∂G∂t = (cid:2) λ A ( z A − z A ) + k AB ( z B − z A ) (cid:3) ∂G∂z A + (cid:2) λ B ( z B − z B ) + k BA ( z A − z B ) (cid:3) ∂G∂z B . (20)The solution is given by G ( z A , z B , t ) = (cid:2) e Γ t ( c ( t )( z A −

1) + c ( t )( z B − (cid:3) N A × (cid:2) e Γ t ( c ( t )( z A −

1) + c ( t )( z B − (cid:3) N B . (21)Here, c , c , c , c are functions of time t : c ( t ) = cosh (cid:16) ω t (cid:17) + θ sinh (cid:16) ω t (cid:17) ,c ( t ) = 2 k AB ω sinh (cid:16) ω t (cid:17) ,c ( t ) = 2 k BA ω sinh (cid:16) ω t (cid:17) ,c ( t ) = cosh (cid:16) ω t (cid:17) − θ sinh (cid:16) ω t (cid:17) , (22) oupled diﬀerentiation and division of stem cells ω , θ are constants depending on model parameters ( λ A , λ B , k AB , k BA ):Γ = 12 (( λ A + λ B ) − ( k AB + k BA )) ,ω = (cid:112) ( λ A − λ B ) + ( k AB + k BA ) + 2( λ A − λ B )( k BA − k AB ) ,θ = 1 ω (( λ A + k BA ) − ( λ B + k AB )) . (23)The mean numbers of cells in each state are given by (cid:104) N A ( t ) (cid:105) = e αt (cid:20) N A cosh( βt ) + (cid:18) k BA β N B + γ β N A (cid:19) sinh( βt ) (cid:21) , (24) (cid:104) N B ( t ) (cid:105) = e αt (cid:20) N B cosh( βt ) + (cid:18) k AB β N A − γ β N B (cid:19) sinh( βt ) (cid:21) , (25)where α , β , γ are constants depending on model parameters ( λ A , λ B , k AB , k BA ): α = 12 (( λ A + λ B ) − ( k AB + k BA )) , (26) β = 12 (cid:112) γ + 4 k AB k BA , (27) γ = ( λ A + k BA ) − ( λ B + k AB ) . (28)Over long times, the population means grow approximately exponentially, as in ModelC, but with a diﬀerent rate α + β . In contrast to Model C, however, the populationmeans never decay since α + β ≥ λ A , λ B , k AB , k BA ) in Model U, there exists a set of parameters ( λ AA , λ AB , λ BB ) inModel C for which the asymptotic behaviour ( t → ∞ ) of the means (cid:104) N A (cid:105) , (cid:104) N B (cid:105) isidentical in both models (see Appendix A). Thus, we cannot use information aboutmean populations alone to distinguish Model U from Model C. The converse is nottrue, since (cid:104) N A (cid:105) decays for some parameter regimes in Model C, which is impossible inModel U. This is part of the rationale for appealing to Bayesian methods to infer modelparameters and facilitate model comparison. As we showed in the lasttwo sections, obtaining an exact solution of the probability generating function isalready quite challenging for only two distinct cell states A and B. To obtain theprobability generating function for the extended models including eight distinct cellstates we would need to solve PDEs in 8 dimensions with 7 free parameters. Thereforewe restrict the analysis of population dynamics in the extended models on obtainingapproximate expressions of the mean populations (cid:104) Φ (cid:105) ( t ). Following the deﬁnition ofallowed transitions in the extended models, on can derive a system of ODEs for thetime evolution of moments (cid:104) Φ (cid:105) of the form [11]: d (cid:104) Φ (cid:105) dt = M · (cid:104) Φ (cid:105) , (29)where M is a 8 × oupled diﬀerentiation and division of stem cells (cid:104) Φ (cid:105) ( t ) = (cid:80) m c m v m e λ m t where λ m , v m are the m-th eigenvalue andeigenvector of matrix M , respectively, and c m are constants depending on initialconditions. For the extended models U and C, the matrix M is listed in theSupplementary Material, eqn ( ?? ) and ( ?? ), and only has real eigenvalues. At late times λ m t (cid:29)

1, the term with the largest eigenvalue max( { λ m } ) dominates the dynamics. Forextended model U, the late time evolution according to the largest eigenvalue is givenby (cid:104) Φ (cid:105) ( t ) ∝ v · e θ t , v =  θ T − θ S − θ F − θ T − θ S − θ F + θ T − θ S + θ F − θ T − θ S + θ F + θ T + θ S − θ F − θ T + θ S − θ F + θ T + θ S + θ F − θ T + θ S + θ F +  . (30)This result can be interpreted as follows: all cell states grow with the same rate θ andtherefore the total population size also grows with rate θ . The shape of v reﬂects thatafter suﬃciently many division events, pairs of cell states are distributed according totheir equilibrium distribution, (cid:104) Φ M + (cid:105) / (cid:104) Φ M − (cid:105) = θ M + /θ M − , where (cid:104) Φ M ± (cid:105) is the meanpopulation size of cells with gene M ∈ { T, S, F } switched on and oﬀ, respectively. Forextended Model C, we can ﬁnd two kinds of approximations for the late time evolutionof mean populations. If rates for switching genes on and oﬀ are very unbalanced, θ M + (cid:29) θ M − or θ M + (cid:28) θ M − , it follows (cid:104) Φ (cid:105) ( t ) ∝ e i · e θ t , (31)where e i equals one for those cell state favoured by the combination of imbalanced rates θ M + (cid:29) θ M − and zero for all other cell states. The eﬀective growth rate of the totalpopulation equals θ and is therefore the same as in Model U. On the other hand, ifrates for switching genes on and oﬀ are approximately equal, θ M + ≈ θ M − = θ M , latetime dynamics are given by (cid:104) Φ (cid:105) ( t ) ∝ · e ( θ + θ T + θ S + θ F ) t , (32)where is a 7 × θ + θ T + θ S + θ F ).Diﬀerentiation events can therefore trigger increased population growth in extendedmodel C. Populations grow with an increased rate ˜ θ , θ ≤ ˜ θ ≤ ( θ + θ T + θ F + θ S ),which exact value depends on the rate imbalance θ M + /θ M − of the marker switchingrates. In this section we derived exact solutions of the probability generatingfunction for both Model C and Model U. This allows an eﬃcient and systematic way ofcalculating moments of the population distribution. We explicitly derived expressions oupled diﬀerentiation and division of stem cells ?? ) and for proposing experimental measurement procedureswhich optimise parameter inference (see Section 3.2.3). We showed that the extendedstem cell models follow similar growth dynamics and presented approximate expressionsfor mean population at late times. In the previous section we analysed the time-evolution of the distribution over cell states N ( t ) as a function of model parameters θ . Now we will look at the inverse problem:obtaining the distribution over model parameters θ given population data N . In theﬁrst section we consider parameter inference given complete knowledge about populationsizes N ( t ) at all times. In the second section we generalise parameter inference to caseswhere we have incomplete data and population sizes N ( t i ) are only known at certaintime points { t i } . Weﬁrst suppose that we have complete knowledge of how the population sizes N ( t ) havechanged over time. In other words, each cell division and diﬀerentiation event isevident in the trajectory N ( t ). Such data could, for example, come from live-imagingexperiments in which each cell state can be identiﬁed and the full lineage be tracked.Since each reaction as deﬁned by Eq. (1), (3) leads to a unique change in the numberof cells in each state, we can infer the type of an occurred reaction from changes in N ( t ). We deﬁne τ i , ν i and h ν i ( τ i ) as the reaction times, the corresponding reactiontypes and the transition rates, respectively. We deﬁne the combined transition rate h ( t ) as the sum over transition rates for all possible reactions that can take place attime t (which will depend on the state of the system at that time). Since each reactionevent is modelled as a Poisson process with corresponding rate h ν i , the likelihood of asequence of n reactions is given by [12] L comp ( ξ | θ ) = (cid:34) n (cid:89) i =1 h ν i ( τ i − ) (cid:35) · e − (cid:80) ni =0 h ( τ i ) · ( τ i +1 − τ i ) , (33)where the subscript comp denotes complete data and τ = 0, τ n +1 = T . Now, in themodels we consider here, each transition rate h ν i is proportional to a per-capita rate θ ν i multiplied by the number of cells in state Φ ν , which is the cell state associated with areaction of type ν . If we deﬁne now N ν ( t ) as the total number of cells in state Φ ν attime t , we have that L comp ( ξ | θ ) ∝ (cid:89) ν (cid:20) θ r ν ν exp (cid:18) − θ ν (cid:90) T N ν ( t ) dt (cid:19)(cid:21) , (34) oupled diﬀerentiation and division of stem cells ν (i.e. division or diﬀerentiation of eachcell state) and r ν is the number of times that reaction ν occurs during the observationwindow 0 ≤ t ≤ T .Note that what matters for Bayesian inference is the dependence of the likelihoodon the parameters θ . Therefore factors that do not depend on these parameters (such asthe population sizes N ν that multiply each of the transition rates θ k ) can be subsumedinto the constant of proportionality in (34).A key feature of (34) is that it factorises into terms that each depend on a diﬀerentreaction rate θ ν . This means that each parameter value can be inferred independently,assuming that parameter priors are independent: p ( θ ) = (cid:81) ν p ( θ ν ). That is, the posteriordistribution of θ ν is p ( θ ν | ξ ) ∝ θ r ν ν exp (cid:18) − θ ν (cid:90) T N ν ( t ) dt (cid:19) p ( θ ν ) , (35)where p ( θ ν ) is the prior distribution over the reaction rate θ ν . As before, the constantof proportionality can be obtained by normalising the posterior distribution p ( θ ν | ξ ). Itis convenient for the prior distribution to be taken conjugate to the likelihood, whichmeans that the product of prior and likelihood is of the same functional form as theprior. Here, the likelihood is a gamma distribution, and therefore the conjugate prior isalso a gamma distribution [13]. If the prior is a gamma distribution with shape a ν andrate b ν , then the posterior is a gamma distribution with shape a (cid:48) ν = a ν + r ν and rate b (cid:48) ν = b ν + (cid:82) T N Φ ν ( t ) dt , with mean (cid:104) θ ν | ξ (cid:105) = a (cid:48) ν /b (cid:48) ν and variance Var( θ ν | ξ ) = a (cid:48) ν /b (cid:48) ν [14].The conﬁdence of parameter estimation can be quantiﬁed by the dispersion of theposterior distribution p ( θ ν | ξ ), measured by the coeﬃcient of variation c v (COV) c v ( θ ν | ξ ) = (cid:112) Var( θ ν | ξ ) (cid:104) θ ν | ξ (cid:105) = 1 √ a ν + r ν . (36)For experimental design it is essential to estimate how much data is needed to achieveparameter inference of a certain precision. More precisely, if we observe n identicalsystems over a time period T , we would like to know how the COV scales on averagewith sample size n and observation time T . As shown in Appendix C, the asymptoticscaling of the mean COV with observation time T in Model C is given by (cid:104) c v (cid:105) ( θ ν | ξ , . . . , ξ n ) ∝  exp ( − ∆ λ T ) √ N A n ∆ λ > , √ N A n ∆ λ < , (37)where ∆ λ = λ AA − λ BB . For Model U, we obtain (cid:104) c v (cid:105) ( θ ν | ξ , . . . , ξ n ) ∝ exp (cid:16) − ( α + β )2 T (cid:17) n , (38)where α and β are functions of rate constants as given by (28). Note that the mean COVof θ ν depends on the rate constant θ ν itself. This result can be useful for experimental oupled diﬀerentiation and division of stem cells θ ν is available. If not, one has to relyonly on the scaling with sample size n .To summarise, if we have a complete history of all the population sizes N ν ( t ) overtime, we can infer the reaction rate θ ν by counting how many times the reaction ν occurs,and by integrating the size of the population that drives each reaction, N ν ( t ), over time t . Scaling laws for the conﬁdence of parameter estimation can be obtained as a functionof sample size n and observation time T . The uncertainty of the parameter estimationdecreases exponentially with observation time, but only decays as a power law withsample size. Besides parameter inference, the aim of experiments is often to comparecompeting models based on data ξ . In the complete-data scenario, reaction types canbe directly inferred from changes in population. Hence, if reactions are observed whichare impossible in either Model C or Model U, we can rule out the respective model andmodel selection is trivial. This will be very diﬀerent in scenarios where we only haveaccess to snapshot data, as we now show. The results of the previous section depend on having complete knowledge of populationsizes over time. It is unlikely that this level of detail will be available in practice.More likely are a set of snapshots , that is, measurements of N at a ﬁnite set of times t = 0 , t , t , . . . , t n .In the previous section we obtained an expression for the posterior distribution p ( θ | ξ ) given a completely-speciﬁed trajectory ξ . To obtain the corresponding quantityfrom snapshot data, we integrate over all trajectories ξ that are consistent with theobservations { N ( t i ) } , weighted by the probability p ( ξ | N ( t i )) which is conditioned onpassing through the observation points. That is, p ( θ |{ N ( t i ) } ) = (cid:90) Dξ p ( θ | ξ ) p ( ξ |{ N ( t i ) } ) . (39)Since the underlying model is Markovian, we have that p ( ξ | N ( t i )) = n (cid:89) i =1 p ( ξ i | N ( t i − ) , N ( t i )) , (40)where ξ i is the portion of the trajectory corresponding to the time interval [ t i − , t i ].This observation implies that the posterior distribution can be sampled using thefollowing algorithm (see also Figure 3). oupled diﬀerentiation and division of stem cells Figure 3:

Sample path Markov chain Monte Carlo (MCMC) algorithm for sampling N MC reaction rate values θ from the posterior p ( θ |{ N ( t i ) } ), given snapshot-data { N ( t i ) } . Algorithm motivated by [12]. Algorithm 1

Sample path MCMC for parameter inference Initialise algorithm with valid trajectory segments { ξ i } , a parameter set θ and set i = 1. Propose trajectory segment ξ (cid:63)i between time points t i − and t i with ﬁxed endpoints N ( t i − ) and N ( t i ). Metropolis-Hastings step: replace original segment ξ i by ξ (cid:63)i with acceptanceprobability min(1 , A i ), with acceptance ratio A i from eqn ( ?? ). Otherwise keeporiginal segment ξ i . If i < n , set i = i + 1 and go to (2), otherwise go to (5) Form an updated complete trajectory ξ by joining the updated segments ξ i . Sample a new parameter set θ from the complete-data posterior p ( θ | ξ ). Output the current θ , set i = 1 and go back to (2).Previously we used the Gillespie algorithm to sample random trajectories with aﬁxed starting point. However, to propose new trajectory segments ξ (cid:63)i in step 2 of thealgorithm, we need to sample a random trajectory where both end points are ﬁxed.This is a much harder problem and we will use an approach based on the Metropolis-Hastings-Green algorithm [12, 15] to achieve this. The basic idea is to draw ξ (cid:63)i from a oupled diﬀerentiation and division of stem cells A accordingly.To construct the approximate distribution, we make use of the exact results forthe mean population sizes that were derived in Sections 3.1.1 and 3.1.2. We found thattypically these grow or decline exponentially. Therefore we consider a set of reactions ν whose transition rates are deterministic functions of time, h ∗ ν ( t ) = N ν (0) θ ν (cid:18) N ν ( T ) N ν (0) (cid:19) t/T , (41)where t is the time from the start of the trajectory segment, and T is the length of thetrajectory segment. The simpliﬁcation here is that the probability a reaction occurs attime 0 ≤ t ≤ T is not aﬀected by the sequence of reactions that has occurred up totime t , as would be the case for the true distribution. This amounts to sampling r ν reactions of type ν from an inhomogeneous Poisson process with the time-dependenttransition rates h ∗ ν . The procedure is to ﬁrst draw the numbers of reactions r ν , andthen sample the reaction times from an inhomogeneous Poisson process. To account forthe approximation in sampling the trajectory, the acceptance ratio in the Sample PathMCMC algorithm must be chosen accordingly, which is shown in detail in Appendix D. We now apply theSample Path MCMC algorithm to understand how well it performs given speciﬁcsnapshots. Recall that this algorithm generates a sequence of rate-constant vectors, θ ,which (when converged) are distributed according to the correct posterior distribution p ( θ |{ N ( t i ) } ).Our ﬁrst test is a single run of Model C, in which the initial condition is N (0) = (1 ,

1) and after time t = 1 . N (1 .

5) = (3 ,

7) with reactionrates θ = ( λ AA , λ AB , λ BB ) = (1 , , ?? , and shows that the parameter estimatesare reasonably well constrained despite the small amount of data (2 snapshots) usedfor the inference. Throughout, we use Scott’s Rule [16] for KDE bandwidth estimation.Unlike the complete-data scenario (Section 3.2.1), the likelihood (and so the posterior) does not factorise into independent component likelihoods (cf. Eq. (35)). This meansgiven snapshot-data, rate constants are correlated and rate inference cannot be carriedout separately. The alignment of the high-density regions in Figure 4 along the diagonalclearly indicate correlations between λ AA and λ BB .We now investigate how the parameter estimates are aﬀected by the length of time,∆ t , between the two snapshots. We anticipate that when ∆ t is small, only a smallnumber of reactions could reasonably have occurred, and so one would not expect tolose much from having only the trajectory endpoints (as opposed to the full trajectory). oupled diﬀerentiation and division of stem cells (a) λ AA - λ AB -space (b) λ AB - λ BB -space Figure 4:

Posterior distribution of Model C calculated based on ﬁve independentsnapshots { N ( t = 2 . } which were sampled from a system with parameters θ = (1 , , N ( t = 0) = (1 , θ ν iid ∼ U [0 , λ AB than to variations in λ AA and λ BB , we observe a larger posterior variance along λ AB .On the other hand, when ∆ t is large, we would expect estimates based on snapshots tobe less tightly constrained than those based on full trajectories.Our procedure is to generate a trajectory from Model C of length ∆ t , startingfrom N (0) = (1 ,

1) and with θ = ( λ AA , λ AB , λ BB ) = (1 , , λ AA for diﬀerent ∆ t are shown inFigure 5.The snapshot posteriors (blue, ﬁlled curves) are always less accurate with respectto the true parameters than their complete-data counterpart (red lines), as one wouldexpect. As more reaction events are contained in the full trajectory ξ for longer times,it provides more information about the system and the variance of the complete-dataposterior decreases. The marginal information stored in snapshot-data increases toowith reaction numbers, however at a much smaller rate than the full trajectory ξ (seeFigure 6).Suppose now we have the ability to take more than two snapshots, and have freedom oupled diﬀerentiation and division of stem cells (a) ∆ t = 0 . (b) ∆ t = 1 . Figure 5:

Complete-data posterior (red curve) given a trajectory sampled from ModelC with parameters θ = (1 , ,

1) (black, vertical lines), N = (1 ,

1) on time interval[0 , ∆ t ]. The snapshot posterior (blue, ﬁlled curve) is calculated based on populationsnapshots of the underlying trajectory at times t = 0 and t = ∆ t . Snapshot posteriorsare less accurate than their complete-data counterpart, especially for large snapshotseparation ∆ t . In the simulation we used uniform priors for rate constants θ k iid ∼ U [0 , Figure 6:

Variance of complete-data posterior (orange) and snapshot posterior (blue)as a function of snapshot separation ∆ t . Trajectories were sampled from Model C withparameters θ = (1 , , N = (1 ,

1) on time interval [0 , ∆ t ]. Error bars show thespread of the posterior variance calculated from 8 independently sampled trajectories,while ∆ t was kept constant. The variance of the snapshot posterior decays at a muchsmaller rate with observation time than the complete-data posterior (solid lines showexponential ﬁts). oupled diﬀerentiation and division of stem cells N A ( t )( λ AA + λ AB + λ BB )and from equation (16) that the mean number of A-state cells grows exponentially as N A e ∆ λt , where ∆ λ = λ AA − λ BB . We can therefore deduce that the mean number ofreactions between times t i − and t i is (cid:104) r (cid:105) i = N A λ AA + λ AB + λ BB ∆ λ (cid:0) e ∆ λt i − e ∆ λt i − (cid:1) . (42)Setting this number equal for all i , with the constraint that t = 0 and t n = T , we ﬁndthat the i th snapshot should be taken at time t i = 1∆ λ ln (cid:20) in (cid:0) e ∆ λT − (cid:1)(cid:21) , (43)where n is the total number of snapshots taken in the time interval. We can comparethe variance of posterior distributions obtained from diﬀerently spaced snapshots byreplacing ∆ λ with a variation parameter z in equation (43). For z < z = 0 represents uniformlyspaced snapshots and z > − ≤ z ≤

7. We see that parameter estimates are most precise if the average numberof reactions between snapshots are kept constant ( z = ∆ λ ), as proposed. This conﬁrmsour intuition that one is likely to be best served by sampling larger populations morefrequently. A similar analysis is possible for Model U, which yields the same optimalsnapshot times as in Eq. (43), but ∆ λ is replaced by α + β from Eq. (28). Having established that parameter inference and model selection is viable with relativelylimited amounts of data in the context of Models C and U with two cell states, weturn now to the more challenging case of epiblast stem cell data with eight states asdescribed in Section 2.1.3. This may elucidate whether cell division and state changesare independent or coupled in this particular biological system.

The structure of our dataset is largely inﬂuenced by measurementrestrictions of the experimental setup (Schumacher et al., in preparation). Theexperiment starts with single cells of only partially known state: prior to the experimentscells have been sorted for the expression of marker T ± . From previous experiments oupled diﬀerentiation and division of stem cells Figure 7:

Trace of covariance matrix of snapshot posterior as a function of snapshotspacing parameter z . Trajectories were sampled from Model C with parameters θ = (1 , , N = (1 ,

1) on time interval [0 ,

3] and n = 7 snapshots where takenaccording to equation (43) with spacing parameter z . Error bars show the spread ofthe posterior variance calculated from eight independently sampled trajectories. Aspredicted, the posterior variance is on average minimised for z ≈ ∆ λ = 1.measuring steady-state populations of T ± -sorted stem cells [7] we can calculate thedistribution of marker expressions for F, S as p ( F + | T + ) ≈ . p ( F + | T − ) ≈ . p ( S + | T + ) ≈ . p ( S + | T − ) ≈ .

46, with complementary probabilities p ( F − | T ) =1 − p ( F + | T ) and p ( S − | T ) = 1 − p ( S + | T ), and marker expressions are assumed to beindependent, so that p ( F, S | T ) = p ( F | T ) p ( S | T ).After time ∆ t , individuals cells will have grown into small colonies (clones). Theseare ﬁxed, the number of cells in each clone counted and the expression of either T, F or T, S are measured. We can therefore only obtain information about the sum ofunderlying populations, where the sum is taken over the unknown marker (either S or F ) in a given measurement. As an example: the number of cells possessing markerexpression [ T + , F + ] is given by the sum of cell populations with marker expression[ T + , F + , S − ] and [ T + , F + , S + ]: N [ T + ,F +] = N [ T + ,F + ,S − ] + N [ T + ,F + ,S +] . In general, N [ M ,M = (cid:88) M N [ M ,M ,M , (44)where M M M projected populations { N [ M ,M } := D sim before comparing it to measuredpopulations from experiments. Population snapshot data was experimentally recordedfrom cell colonies subject to two diﬀerent environmental conditions: CHIR ( D CHIR ),which refers to the presence of the diﬀerentiation factor Chrion, and EPISC ( D EPISC ),which refers to the absence of this factor [7]. The ﬁrst data set D CHIR consists of eightmeasurement series recorded at diﬀerent times and for marker expressions:(i) Initial marker T + , marker measured after ∆ t = 2 d and ∆ t = 3 d : T, F ; oupled diﬀerentiation and division of stem cells (a) Uniform snapshot spacing ( z = 0) (b) Optimal snapshot spacing ( z = ∆ λ = 1) Figure 8:

Left panels: sampled bridging trajectories between population snapshots(vertical, dashed line) obtained via the sample path algorithm (see Figure 3). Rightpanels: complete-data posterior (red curve) given a trajectory sampled from Model Cwith parameters θ = (2 , ,

1) (black, vertical line), N = (1 ,

1) on time interval [0 , n = 5 populationsnapshots which where taken according to equation (43) with spacing parameter z .(ii) Initial marker T + , marker measured after ∆ t = 2 d and ∆ t = 3 d : T, S ;(iii) Initial marker T − , marker measured after ∆ t = 2 d and ∆ t = 3 d : T, F ;(iv) Initial marker T − , marker measured after ∆ t = 2 d and ∆ t = 3 d : T, S .The second data set D EPISC consists of four measurement series recorded at diﬀerenttimes and for marker expressions:(i) Initial marker T + , marker measured after ∆ t = 3 d : T, F ;(ii) Initial marker T + , marker measured after ∆ t = 3 d : T, S ;(iii) Initial marker T − , marker measured after ∆ t = 3 d : T, F ;(iv) Initial marker T − , marker measured after ∆ t = 3 d : T, S .Each measurement series in both sets contains between 98 and 142 observations ofprojected population snapshots.

The main challenge in applying the MCMC algorithmdeveloped above to the EpiSC cell state network is the high dimensionality of the state oupled diﬀerentiation and division of stem cells { N Φ } of all 8 cellstates Φ i . We need to evaluate snapshot likelihoods of the form L SS ( { N Φ (∆ t ) }| θ ) byan MC estimate, so a suﬃciently large ensemble of trajectories is necessary to populatemost parts of state space { N Φ } . The experimental data D CHIR and D EPISC containup to 8 cells per cell state and therefore the MC estimate would need to sample upto 8 ≈ · possible system states. Unfortunately, sampling a suﬃciently largeensemble from ∼ system states turned out to be computationally not feasible. Wetherefore turn to Approximate Bayesian Computation (ABC) [17]. In our case, ABCspeeds up calculations by several order of magnitudes by comparing data based on lower-dimensional summary statistics only, at the expense of introducing an additional errorby approximating posterior distributions.We choose the following summary statistics: the mean (cid:104) N Φ (cid:105) , median med { N Φ } ,and standard deviation σ Φ . We use a Sequential Monte Carlo algorithm (SMC-ABC)for parameter inference and model selection [18, 19]. We verify our results also usinga Rejection-ABC algorithm (modiﬁed from [17]) for parameter inference and modelselection, which is presented in Appendix E.In ABC, a distance function D ( s ( D sim ) , s ( D )) classiﬁes how “close” the summarystatistic of observed and simulated data are. A tolerance (cid:15) ≥ (cid:15) > d : s i ( d ) = [ (cid:104) N i (cid:105) , Var( N i ) , Med( N i )] T , where i ∈ { , , , } is indexing theprojected populations (expression of either T and F or T and S, starting from eitherT+ or T- cells). This summary statistic captures the centre, spread, and skewness ofthe respective distribution p ( N i ). We use the sample median instead of the skewnessas it can be calculated more easily than the third central moments needed for skewnessestimation. We use the following norm D ( s ( D sim ) , s ( D )) to compare summary statisticsof experimental and simulated data: D ( s ( D sim ) , s ( D )) = (cid:88) k (cid:88) i =1 || s i ( d simk ) − s i ( d k ) || , (45)where d simk and d k is the summary statistic of the k -th measurement series of simulatedand experimental data, respectively, and || . || is the L -norm. The choice of norm doesnot signiﬁcantly aﬀect the shape of parameter posteriors if the acceptance threshold (cid:15) is suﬃciently small (see Figure ?? ). Note that k ∈ { , . . . , } for data set D CHIR and k ∈ { , . . . , } for data set D EPISC . The output of this ABC algorithm is a setof parameters { θ } distributed according to p ( θ | D ( s ( D sim ) , s ( D )) < (cid:15) ). For suﬃcientlysmall (cid:15) , this distribution should be a good approximation of the true posterior p ( θ | s ( D ))which we conﬁrmed by comparing the ABC posterior of cell division rate θ in model Uwith the analytical solution (see Figure ?? ).For parameter inference and model selection we choose identical log-uniform priorsfor all rate constants in both models log ( θ k ) ∼ U [ − , oupled diﬀerentiation and division of stem cells Figure 9:

Bayes factors B CU of datasets D CHIR and D EPISC over a range of tolerancevalues (cid:15) with identical log-uniform priors for all rate constants in both models log ( θ k ) ∼U [ − , N set in therespective dataset ( N set = 8 in D CHIR , N set = 4 in D EPISC ). Red dots show the minimalepsilon value to which the SMC-ABC algorithm converged after 20 iterations, resultingin B CU = 4 . ± . D CHIR and B CU = 20 ± D EPISC . Three independent SMC-ABC runs were performed (shown as multiple lines), and the error of B CU was estimatedby the spread between runs.to zero above this threshold, p ( θ k > ≈

0. Cell transitions of a certain type shouldoccur suﬃciently often over the observation period of the experiment to be consideredin our model. By constraining the prior distribution p ( θ k < .

01) = 0, we only discardvery rare cell transitions which would occur less than once every 100 days in a singlecell. Considering that total population sizes in the experiment are around 10 cells andthe observation period is three days, this lower cut-oﬀ seems reasonable.We run the SMC-ABC algorithm, which progressively lowers tolerance (cid:15) until theacceptance probability becomes too small and (cid:15) cannot be decreased much furtherwithout impractical computational costs. The Bayes factor B CU can be calculatedfor every SMC-iteration from the marginal posterior distributions of models M C and M U . This was done for both datasets D CHIR and D EPISC and the resulting Bayes factorsare shown in Figure 9 as a function of (cid:15)/N set . We normalised the tolerance value (cid:15) by the number of measurement series N set in a dataset since the our distance functionEq. (45) sums over the deviation of every measurement series 1 ≤ k ≤ N set . Forthe lowest achievable tolerance (see red dots in Figure 9) we obtain Bayes factors of B CU = 4 . ± . D CHIR and B CU = 20 ± D EPISC ). This constitutes moderateto strong evidence in favour of Model C dynamics, i.e. a coupling of cell division andstate transitions. oupled diﬀerentiation and division of stem cells ( θ ν ) are shown in Figure 10(a)for both data sets. While the cell division rate θ depends only slightly on theexperimental conditions (log ( θ ) ≈ − . θ M + /θ M − of marker switching rates is shown in Figure 10 b, which implies: • Cells in CHIR conditions are mostly in state [ T − , S − , F + ]. Although only a smallsub-population of cells show marker expression T + in CHIR conditions, there arestill more T + cells than under EPISC conditions which is consistent with the rawdata from [7]. The rate-imbalance of markers F and S are quite small, so in steady-state there may be signiﬁcant sub-populations with marker expression S + or F − . • Cells in EPISC conditions are biased towards the state [ T − , S + , F − ]. The rate-imbalance of marker F is quite small, so in steady-state there may be sub-populations with marker expression F + .It is worth mentioning that although Model U was shown to be less likely, theposterior distribution of Model U shows a very similar rate imbalance (see Figure ?? ).Furthermore, the inferred log-rates of model C allow us to obtain the completedistribution over cell states, which is not directly accessible by the experiments. Asshown in Figure 11, the cell state distribution depends on the initial cell conditions( T − -sorted/ T + -sorted stem cells) and the environmental conditions (CHIR/EPISC).The cell state distribution for initially unsorted stem cells (assuming every cell state isequally likely) is shown in Appendix F.So far we assumed cell transitions happen at a ﬁxed rate θ k which does not vary intime. We can test this assumption by using experimental data D CHIR which recordedpopulation snapshots at two diﬀerent times ∆ t = 2 d and ∆ t = 3 d . We used exclusivelyday 2 and day 3 data to obtain parameter posteriors separately for population dynamicsat early times 0 < t < d and late times 2 d < t < d , respectively. The inferred reactionrates are compared in Figure 12, which indicates: • The universal reproduction rate θ and marker rates θ S + , θ S − do not show asubstantial time dependence and stay constant over the course of three days. • The rate imbalance of marker T + , θ T + /θ T − , signiﬁcantly decreases at day 3compared to day 1 and day 2 (diﬀerence > σ ). The rate imbalance of marker F + , θ F + /θ F − , seems to slightly increase with time, although the diﬀerence betweenday 2 and day 3 is not very substantial (diﬀerence < σ ).Again, the posterior distribution of Model U shows a very similar time dependenceof reaction rates (see Figure ?? ).

4. Discussion

In this paper, we studied the population dynamics of two stochastic models of celldivision and diﬀerentiation, or more generally, cell state changes: one in which cell oupled diﬀerentiation and division of stem cells (a)(b) Figure 10:

Median (horizontal line), 0.25/0.75 quantiles (box) and 0.05/0.95 quantiles(vertical lines) of (a) posterior distributions and (b) rate imbalance of Model C posteriorrates log ( θ k + /θ k − ) given dataset D CHIR and D EPISC .state transitions are coupled to cell division, model C, deﬁned by Eq. (2), and one inwhich these two processes are independent, model U, deﬁned by Eq. (4). We solved thecorresponding master equations analytically by using probability generating functions,which allow a systematic calculation of moments of the population distribution. Theresulting solutions for both Models C and U allowed us to analyse the dynamics ofthe population mean, based on which we showed that one cannot always diﬀerentiatebetween the two models. In Model C, there is a ﬁnite probability that the proliferatingsub-population will eventually die out and the dynamics will come to a halt. Theprobability of extinction (18) can be used to approximately infer the underlying reactionrate of model C solely based on the fraction of systems in which extinction wasobserved. Contrary, in Model U both populations grow without bounds and extinctionis impossible for ﬁnite reaction constants. This characteristic feature of extinction inModel C might be useful to classify experimental observations or to infer cell divisionrates when there is negligible cell death.For Bayesian parameter inference we were able to obtain analytical expressions oupled diﬀerentiation and division of stem cells (a) T − -sorted, CHIR conditions (b) T + -sorted, CHIR conditions (c) T − -sorted, EPISC conditions (d) T + -sorted, EPISC conditions Figure 11:

Complete cell state distribution after t = 3 d for diﬀerent environmentalconditions (CHIR/EPISC) and initial cell conditions ( T − -sorted/ T + -sorted stem cellsat t = 0 d ). The initial cell state distribution of T ± -sorted stem cells was obtained fromsteady-state populations from previous experiments [7] and model C dynamics withinferred log-rates (see Figure 10) were used to obtain cell-state distribution at t = 3 d .The probability of the respective marker expression is indicated by the colour map.for the likelihood if continuous-time observations [0 , T ] of all populations are available,e.g. as would be obtained from live imaging. Using conjugate priors, we obtained theposterior distributions analytically. When confronted with incomplete data in form ofpopulation snapshots, Bayesian inference is more challenging, and we have to rely onnumerical results. As estimating the snapshot likelihood directly is computationallycostly, we used a more elaborate, but eﬃcient sample path method for obtainingposterior distributions of rate constants given snapshot-data. We further showed thatthe information content of snapshot-data depends on the times these snapshots weretaken, and proposed an optimal spacing strategy which maximises the conﬁdence ofparameter inference.For parameter and model inference based on experimental EpiSC data, we hadto extend Models C and U from two distinct cell populations to in total 8 cell states. oupled diﬀerentiation and division of stem cells (a) Model C posterior distribution of log-rates log ( θ k ). (b) Rate imbalance θ M + /θ M − of posterior distributions for ModelC. Figure 12:

Median (horizontal line), 0.25/0.75 quantiles (box) and 0.05/0.95 quantiles(vertical lines) of posterior distributions of Model C log-rates log ( θ k ) obtained from D CHIR . CHIR D2 and CHIR D3 shows inferred parameters using exclusively day 2 orday 3 data, respectively. CHIR D2+D3 was obtained by including all data points of D CHIR and coincides with the results shown in Figure 10.Since our sample path method turned out to be computationally not feasible, we usedApproximate Bayesian Computation (ABC) which speeds up calculation by severalorders of magnitude. We used ABC rejection sampling and ABC-SMC to obtainposterior parameter distributions and Bayes factors for selecting between the twocompeting Models C and U. The Bayes factors provide moderate to strong evidence infavour of model C in which cell division and diﬀerentiation are coupled (see Figure 9).The posterior distribution of rate constants θ quantify how cell fate changes dependingon the in vitro culture conditions (CHIR/EPISC, see Figure 10). For cells in CHIRcondition, we quantiﬁed how the rates of cell state transitions might change over time(see Figure 12). As we only have data from multiple time-points for the CHIR conditions,we cannot say whether this apparent time dependence is caused by a cell-intrinsic timingmechanism or whether the environment the cells experience changes over time because oupled diﬀerentiation and division of stem cells ?? ).To account more rigorously for this time-dependent dynamics in systems which are notwell-mixed or with strong cell-cell interactions, one would have to consider more complexmodels such as non-homogeneous Markov models or explicit representations of dynamicgene expression [20]. We began our investigation with minimalmodels of dynamics with two cell states. Although Models C and U were motivatedby biological arguments, the stochastic, two-dimensional jumping processes we deﬁnedcould be applied to a variety of systems. The assumption of Markovian dynamics isoften used due to a range of useful properties which makes the models easier to analyse.Often, especially in the context of cell dynamics, the Markov assumption is at best arough approximation [21, 22]. Waiting times τ between jumps are distributed accordingto an exponential distribution, which is biologically unrealistic: On a molecular level,many transport processes are required for a cell to replicate. Therefore, waiting timesbetween two division events cannot be arbitrarily small [22].Indeed, Recent work has suggested that experimentally observed cell cycle timedistributions diﬀer substantially from a markovian exponential distribution and arebetter described by the Erlang distribution [22]. How do population dynamics changefor a non-Markovian process? In this scenario cell populations cannot grow arbitrarilyfast, there is a time-dependent upper bound on population sizes N < N lim ( t ). Stateprobabilities p i,j ( t ) = p ( N A ( t ) = i, N B ( t ) = j ) are therefore zero for population numbers( i, j ) exceeding limit N lim ( t ). Cell population dynamics with Erlang distributed waitingtimes are non-Markovian stochastic processes with discrete states in continuous time.Non-Markovian processes like this can in general be analysed by converting them intoMarkov processes by inclusion of supplementary variables [23]. For Bayesian inference given snapshot data,our main focus was on building a Bayesian framework which can be applied toexperimental data. However, the algorithms for parameter inference and model selectionoriginally proposed for Model C and U were computationally too intensive for the morecomplicated models with larger cell state networks. Nonetheless, the presented optimalsnapshot spacing and scaling behaviour of Bayes factors with snapshot separation shouldgenerally be useful for experimental design. Measurement should be taken at times oupled diﬀerentiation and division of stem cells n snapshots are taken from the same system. If n snapshots were taken from independent systems, equation (43) cannot be applied andthe posterior variance scales with snapshot separation ∆ t as shown in Figure 6. Since theposterior of a set of independent trajectories is the product over posteriors obtained fromindividual trajectories, one should try to maximise ∆ t for every trajectory individually.Here we have presented models for growing in vitro colonies of stem cells, for whichneglecting cell death is a reasonable approximation. There are many biological systemsin which cell death may be relevant, e.g. under homeostatic self-renewal of tissues, inwhich total cell numbers have to kept constant, or even some applications in developmentwhere cell death is used for tissue size control [24, 25]. For these systems, the proposedmodels can be easily extended by including either a cell death process of the form A → ∅ .Similarly, constraints on the population size can be implemented by introducing a celldivision or death rate which depends on the total population size N , which acts as acrowding feedback mechanism [6]. This could represent contact inhibition or competitionfor niche access, for example. In spatial extensions of the models presented here, onecould consider locally coupling division to diﬀerentiation and death events, as is relevantfor example in epidermal homeostasis [26, 27]. Any such mechanisms can be easily addedto the models and the same methods for inference and model selection can be applied. Adisadvantage of these additional mechanisms is that the corresponding master equationswould likely be intractable and one would have to rely solely on numerical results. We carried out parameter inference and modelselection based on experimental EpiSC data using an ABC algorithm which works withsummary statistics of the data. We obtained Bayes factors of B CU ≈ . D CHIR and B CU ≈

20 for D EPISC (see Fig. 9), which indicates moderate to strong evidence in favourof Model C. This result, however, has to be interpreted with care:First, model selection based on Bayes factors depends on the prior distributionaccurately reﬂecting the state of knowledge about the underlying model parameters θ . Here we chose log-uniform priors θ k ∼ log ( U [ − , θ . The exact value of Bayes factors might therefore changeslightly depending on the choice of prior.Secondly, because we used summary statistics of the data instead of the full set ofobservations, we did not obtain the “true” Bayes factor based on data D , but one basedon the summary statistic s ( D ) of data. Considering the full data would mean to lookat each experimental replicate individually by calculating the parameter posterior givena single experimental realisation and iterate over all data points, using the posteriordistribution as the prior distribution for the next data point. The “true” Bayes factorand the one obtained using summary statistics only coincide if the summary statistic s ( D ) is suﬃcient for comparing Models C and U: p ( D| s ( D ) , M X ) = p ( D| s ( D ) , M Y ).There are only few cases in which summary statistics are known to be suﬃcient ,e.g. [28]. The more usual case is that we have to work with arbitrary summary statistics, oupled diﬀerentiation and division of stem cells (cid:15) >

0, and one dueto insuﬃcient summary statistics. Alternatives to model selection with insuﬃcientstatistics were explored by [30], which selected models via a machine learning approach(random forests). An exhaustive investigation how the choice of summary statisticsaﬀects the shape of the posterior distributions is beyond the scope of this present work,but an interesting prospect for follow-on work.Despite concerns about suﬃciency of our summary statistics, we veriﬁed that thesummary statistic used here is able to discriminate between extended Models C and U.Using Rejection-ABC, we obtained Bayes factors of B CU ≈ . D simCHIR ( (cid:15)/N set = 3 .

75) and (cid:104) B CU (cid:105) ≈ . D simEPISC ( (cid:15)/N set = 6,note that one would likely achieve higher Bayes Factors at lower (cid:15) with more eﬃcientsampling schemes). Synthetic data was generated from model C with reaction rates setto posterior means obtained from experimental data (see Figure 10). It is worth notingthat for simulated data, the acceptance probability for a given (cid:15) value is much higherthan for experimental data. This suggests that the Models C and U do not perfectlydescribe the experimental data. More realistic models for cell state changes are thereforediscussed in the following section. Several approaches that aim to go beyondpseudotime ordering and connect single cell gene expression data to dynamic modelshave recently emerged. Examples include ﬁtting drift-diﬀusion equations on a k -nearestneighbour graph [5] or along one-dimensional pseudotime trajectories [31], simulatingcell trajectories in dimensionally reduced gene expression space based on RNA velocityestimates [32], and continuous state Markov process from single-cell time series [33].Here, we have presented an alternative approach to these which allows for rigorousquantiﬁcation of uncertainty. Working in a Bayesian framework further allows us tointegrate multiple sources of data relatively easy, e.g. to combine bulk gene expressionand single-cell data, which we discuss further below. The models proposed in this work assumed that rates for switching genes on and oﬀ areindependent of each other, thereby neglecting regulatory interactions between genes. Anatural next step is to consider models with interdependencies between transcriptionfactors, i.e., where the rate of transitioning into the on/oﬀ state for one transcriptionfactor can depend on the expression state of another transcription factor. The numberof possible pairwise dependencies alone leads to a model comparison problem that iscomputationally too intensive typical ABC inference approaches. However, systematicBayesian model comparison can be performed on bulk population data, where likelihood oupled diﬀerentiation and division of stem cells

Acknowledgements

AGF was supported by a Vice-Chancellors Fellowship from the University of Sheﬃeld,LJS was supported by a Chancellor’s Fellowship from the University of Edinburgh. oupled diﬀerentiation and division of stem cells References [1] M.J. Casey, P.S. Stumpf, and B.D. MacArthur. Theory of cell fate.

Wiley Interdiscip. Rev. Syst.Biol. Med. , 12(2):e1471, 2019.[2] S.A. Morris. The evolving concept of cell identity in the single cell era.

Development , 146(12),2019.[3] W. Saelens, R. Cannoodt, H. Todorov, and Y. Saeys. A comparison of single-cell trajectoryinference methods.

Nat. Biotech. , 37(5):547–554, may 2019.[4] S. Tritschler, M. B¨uttner, D.S. Fischer, M. Lange, V. Bergen, H. Lickert, and F.J. Theis. Conceptsand limitations for learning developmental trajectories from single cell genomics.

Development ,146(12), 2019.[5] C. Weinreb, S. Wolock, B.K. Tusi, M. Socolovsky, and A.M. Klein. Fundamental limits on dynamicinference from single-cell snapshots.

Proc. Natl. Acad. Sci. U.S.A. , 115(10):E2467–E2476, 2018.[6] P. Greulich and B.D. Simons. Dynamic heterogeneity as a strategy of stem cell self-renewal.

Proc.Natl. Acad. Sci. U.S.A. , 113(27):7509–7514, 2016.[7] A. Tsakiridis, Y. Huang, G. Blin, S. Skylaki, F. Wymeersch, R. Osorno, C. Economou,E. Karagianni, S. Zhao, S. Lowell, and V. Wilson. Distinct Wnt-driven primitive streak-likepopulations reﬂect in vivo lineage precursors.

Development , 141(6):1209–1221, 2014.[8] W.H Jeﬀerys and J.O. Berger. Sharpening Ockhams razor on a Bayesian strop.

Technical Report ,1991.[9] R.E. Kass and A.E. Raftery. Bayes factors.

J. Am. Stat. Assoc. , 90(430):773–795, 1995.[10] D. Courant, R.and Hilbert.

Methoden der mathematischen Physik: Zweiter Band . Julius Springer,1937.[11] D. Schnoerr, G. Sanguinetti, and R. Grima. Approximation and inference methods for stochasticbiochemical kineticsa tutorial review.

J. Phys. A , 50(9):093001, 2017.[12] D.J. Wilkinson.

Stochastic modelling for systems biology . CRC press, 2011.[13] P.G. Moscuoroums. The distribution of the sum of independent gamma random variables.

Ann.Inst. Statist. Math , 37(Part A):541–544, 1985.[14] S. Kotz, N. Balakrishnan, and N.L. Johnson.

Continuous multivariate distributions, Volume 1:Models and applications , volume 1. John Wiley & Sons, 2004.[15] S. Brooks, A. Gelman, G. Jones, and X.-L. Meng.

Handbook of Markov Chain Monte Carlo . CRCpress, 2011.[16] D.W. Scott.

Multivariate density estimation: theory, practice, and visualization . John Wiley &Sons, 2015.[17] T. Toni, D. Welch, N. Strelkowa, A. Ipsen, and M.P.H. Stumpf. Approximate Bayesiancomputation scheme for parameter inference and model selection in dynamical systems.

J.R. Soc. Interface , 6(31):187–202, 2008.[18] S.A. Sisson, Y. Fan, and M.M. Tanaka. Sequential Monte Carlo without likelihoods.

Proc. Natl.Acad. Sci. U.S.A. , 104(6):1760–1765, 2007.[19] E. Klinger, D. Rickert, and J. Hasenauer. pyABC: distributed, likelihood-free inference.

Bioinformatics , 34(20):3591–3593, 2018.[20] C.S. Manning, V. Biga, J.S. Boyd, J. Kursawe, B. Ymisson, D.G. Spiller, C.M. Sanderson, T. Galla,M. Rattray, and N. Papalopulu. Quantitative single-cell live imaging links HES5 dynamics withcell-state and fate in murine neurogenesis.

Nat. Commun. , 10(1):2835, Jun 2019.[21] P.S. Stumpf, R.C.G. Smith, M. Lenz, A. Schuppert, F.-J. M¨uller, A.C. Babtie, T.E. Chan, M.P.H.Stumpf, Colin P. Please, S.D. Howison, F. Arai, and B.D. MacArthur. Stem Cell Diﬀerentiationas a Non-Markov Stochastic Process.

Cell Syst. , 5(3):268–282.e7, 2017.[22] E. Gavagnin, M.J. Ford, R.L. Mort, T. Rogers, and C.A. Yates. The invasion speed of cellmigration models with realistic cell cycle time distributions.

J. Theor. Biol. , 2018.[23] D.R. Cox. The analysis of non-Markovian stochastic processes by the inclusion of supplementaryvariables. In

Mathematical Proceedings of the Cambridge Philosophical Society , volume 51, pages oupled diﬀerentiation and division of stem cells PLoScomputational biology , 11(12), 2015.[25] Arnaud Ambrosini, M´elanie Gracia, Amsha Proag, M´egane Rayer, Bruno Monier, and MagaliSuzanne. Apoptotic forces in tissue morphogenesis.

Mechanisms of development , 144:33–42,2017.[26] Panteleimon Rompolas, Kailin R Mesa, Kyogo Kawaguchi, Sangbum Park, David Gonzalez,Samara Brown, Jonathan Boucher, Allon M Klein, and Valentina Greco. Spatiotemporalcoordination of stem cell commitment during epidermal homeostasis.

Science , 352(6292):1471–1474, 2016.[27] Kailin R Mesa, Kyogo Kawaguchi, Katie Cockburn, David Gonzalez, Jonathan Boucher, TianchiXin, Allon M Klein, and Valentina Greco. Homeostatic epidermal stem cell self-renewal is drivenby local diﬀerentiation.

Cell stem cell , 23(5):677–686, 2018.[28] A. Grelaud, C.P. Robert, and J.-M. Marin. ABC methods for model choice in Gibbs randomﬁelds.

C. R. Acad. Sci. , 347(3-4):205–210, 2009.[29] J.-M. Marin, P. Pudlo, A. Estoup, and C.P. Robert.

Likelihood-free model choice , pages 153–178.Handbook of Approximate Bayesian Computation. Chapman and Hall/CRC Press, 2015.[30] P. Pudlo, J.-M. Marin, A. Estoup, J.-M. Cornuet, M. Gautier, and C.P. Robert. Reliable ABCmodel choice via random forests.

Bioinformatics , 32(6):859–866, 2015.[31] D.S. Fischer, A.K. Fiedler, E.M. Kernfeld, R.M.J. Genga, A. Bastidas-Ponce, M. Bakhti,H. Lickert, J. Hasenauer, R. Maehr, and F.J. Theis. Inferring population dynamics from single-cell RNA-sequencing time series data.

Nat. Biotech. , 37(4):461–468, 2019.[32] J.C. Kimmel, A.B. Hwang, A. Scaramozza, W.F. Marshall, and A.S. Brack. Aging inducesaberrant state transition kinetics in murine muscle stem cells.

Development , page dev.183855,2020.[33] J.P. Taylor-King, A.N. Riseth, W. Macnair, and M. Claassen. Dynamic distribution decompositionfor single-cell snapshot time series identiﬁes subpopulations and trajectories during iPSCreprogramming.

PLoS Comput. Biol. , 16(1):e1007491, 2020.[34] N. Picco, S. Hippenmeyer, J. Rodarte, C. Streicher, Z. Moln´ar, P.K. Maini, and T.E. Woolley. Amathematical insight into cell labelling experiments for clonal analysis.

J. Anat. , pages 686–696,2019.[35] B. Lambert, A.L. Maclean, A.G. Fletcher, A.N. Coombes, M.H. Little, H.M. Byrne, A.N. Combes,M.H. Little, and H.M. Byrne. Bayesian inference of agent-based models: a tool for studyingkidney branching morphogenesis.

J. Math. Biol. , 76:1673–1697, 2018.[36] P. Guerrero, R. Perez-Carrasco, M. Zagorski, D. Page, A. Kicheva, J. Briscoe, and K.M. Page.Neuronal diﬀerentiation inﬂuences progenitor arrangement in the vertebrate neuroepithelium.

Development , 146(23), 2019.[37] M.J. Simpson, R.E. Baker, S.T. Vittadello, and O.J. Maclaren. Practical parameter identiﬁabilityfor spatio-temporal models of cell invasion.