Inferring differentiation order in adaptive immune responses from population level data
IInferring differentiation order in adaptive immune responsesfrom population level data
Alexander S. Miles ∗ , Philip D. Hodgkin † , and Ken R. Duffy ‡ June 4, 2020
Abstract
A hallmark of the adaptive immune response is the proliferation of pathogen-specific lymphocytes thatleave in their wake a long lived population of cells that provide lasting immunity. A subject of ongoinginvestigation is when during an adaptive immune response those memory cells are produced. In two ground-breaking studies, Buchholz et al. (Science, 2013) and Gerlach et al. (Science, 2013) employed experimentalmethods that allowed identification of offspring from individual lymphocytes in vivo , which we call clonaldata, at a single time point. Through the development, application and fitting of a mathematical model,Buchholz et al. (Science, 2013) concluded that, if memory is produced during the expansion phase, memorycell precursors are made before the effector cells that clear the original pathogen. We sought to determinethe general validity and power of the modeling approach introduced in Buchholz et al. (Science, 2013) forquickly evaluating differentiation networks by adapting it to make it suitable for drawing inferences frommore readily available non-clonal phenotypic proportion time-courses. We first established the method drewconsistent deductions when fit to the non-clonal data in Buchholz et al. (Science, 2013) itself. We fit a variantof the model to data reported in Badovinac et al. (Journal of Immunology, 2007), Schlub et al. (Immunologyand Cell Biology, 2010), and Kinjo et al. (Nature Communications, 2015) with necessary simplificationsto match different reported data in these papers. The deduction from the model was consistent with thatin Buchholz et al. (Science, 2013), albeit with questionable parameterizations. An alternative possibility,supported by the data in Kinjo et al. (Nature Communications, 2015), is that memory precursors are createdafter the expansion phase, which is a deduction not possible from the mathematical methods provided inBuchholz et al. (Science, 2013). This investigation further supports the value of the approach, though weindicate where published experimental findings run contrary to assumptions underlying the model, whichmay impact the strength of inferences, and what an alternate model more consistent with recent data mightbe.
During an adaptive immune response, a population of naive lymphocytes becomes heterogeneous, containinga population of effector cells that clear the pathogen and then die during the contraction phase, and a popu-lation of memory cells that are of a longer lineage. For CD8 + T cells, many theories have been proposed forthe dynamics of this differentiation. The three primarily championed ones are [1, 16, 4]: ∗ Hamilton Institute, Maynooth University, Ireland † Walter and Eliza Hall Institute, Royal Melbourne Hospital, Parkville, Australia and the Department of Medical Biology, The Universityof Melbourne, Parkville, Australia ‡ Hamilton Institute, Maynooth University, Ireland. E-mail: Corresponding [email protected] a r X i v : . [ q - b i o . CB ] J un Linear Effector First Model:
Naive cells differentiate into proliferating effector cells. Later in theimmune response, these cells differentiate into memory cells or die [19, 13, 23, 15]. • Linear Memory First Model:
Naive cells differentiate into memory precursor cells that then prolifer-ate, with some differentiating into effector cells. Some time after the expansion phase is completed theeffector cells die [3]. • Bifurcation Model:
Each naive cell has the potential to divide into either effector or memory precursorsubsets which then continue to proliferate inheriting their phenotype, and effector cells die after theexpansion phase while memory cells remain [5, 14].These three are non-exhaustive, high-level summaries of a more diverse and subtle set of theories. Provisionof a simple classification is further complicated by phenotypic characterization of T cells where there is anoverlap between memory and effector functionality and phenotypic markers, sometimes resulting in a sub-category that is referred to as effector memory cells (TEMp, the p standing for precursor) to distinguish themfrom effector (TEF) and central memory cells (TCMp).The Linear Effector First Model has a long history and could be considered the traditional view [19, 1].In 2007, the authors of [5] proposed asymmetric cell division to explain heterogeneity with CD8 + T cells,supporting the Bifurcation Model, which won’t be considered further here. In 2013, two landmark paperswere published [3, 8] that further addressed this question. Both used novel methodologies to determineclonal relationships of cells. In addition, due to the more quantitative nature of their assay, the authors of [3]developed a methodology to fit a memoryless multi-type Bellman-Harris process model [9] to clonal summarystatistics of CD8 + T cell phenotypes, finding that a quantitative realization of the Linear Memory First Modelwas the best explanation of their data.Let us introduce some notation in order to distinguish between the available data sources. For an exper-iment initiated at time t = n pathogen specific cells, let c fi ( t ) denote the number offspring that are ofphenotype f ∈ { ,..., F } from an initial transferred naive cell i at time t post infection. The method used in[3] was effectively unique in determining the number of cells of each phenotype that were descendent froman initially naive cell, c fi ( t ) , for each i and all f ∈ { TEF, TEMp, TCMp } at t = day eight post infection. Weshall refer to this data as being at a clonal level.A more common experimental set-up [3, 2, 21, 15] provides the fraction of all offspring that are of eachphenotype, f , at a given time t , ρ f ( t ) = ∑ ni = c fi ( t ) ∑ Fj = ∑ ni = c ji ( t ) , with clonal contributions being indistinguishable. Experiments recording this information are typically per-formed as a time-course, with cells harvested at distinct days post infection. We shall refer to it as cohort orpopulation level data.In the present paper, we describe deductions made by taking the modeling framework described in [3]and adapting it to make it suitable for fitting to cohort data. Our aim was to explore the general validity andpower of the method for quickly evaluating differentiation networks from simple cohort time series as, unlikeclonal data, those data are more readily available. We used the method to interrogate data reported by otherauthors [2, 21, 15], that employ similar experimental systems to that described in [3].A high-level summary of the content of this paper is:1. To adapt the method in [3] for fitting to non-clonal data, we introduce a distinct objective function.On comparing the best parameter fits to six representative models of the 304 considered in the originalpaper to the time-course cohort data reported in [3] itself, the finding was the same as for the clonaldata: that if memory is made during the expansion phase, then the Linear Memory First Model is thebest. Moreover, the best-fit parameterization from the cohort data was quantitatively similar to thatwhen fit to clonal data, suggesting consistency of the method.2. Due to data limitations where fewer phenotypes were identified, for the papers [2, 21, 15] we fit onlyto two representative differentiation structures: Linear Effector First and Linear Memory First. Naive pplication of the methodology to the entire time-course cohort data taken from [15] supported the Lin-ear Effector First Model over the Linear Memory First Model. A more sophisticated application wasperformed, noting that the number of antigen-specific T cells adoptively transferred in the experimentsreported in [15] was significantly higher than in [3], resulting in a curtailed expansion phase. Limit-ing model fitting to expansion phase only, data from blood and spleen in four papers, [2, 21, 3, 15]all support a Linear Memory First Model in favor of a Linear Effector First Model. Best fit modelparameterizations, however, suggest the model is struggling to explain this data.3. Additional data reported in [15] suggests that the a priori assumption underlying the mathematicalanalysis that memory is created during the expansion phase could be questionable.4. While the evidence presented here further supports the utility of the method introduced in [3], weindicate some of the discrepancies between its model assumptions and published data, suggesting analternate model structures that are more consistent with them, which might be of use in future. Clonal level experiments [3] : The authors performed innovative OT-1 CD8+ T cell adoptive transfer experi-ments where cell lineage could be determined. Donor mice were engineered to express both CD45 and CD90markers homozygously, heterozygously or not at all, allowing for nine combinations of markers in total, Fig.1A. C57BL/6 host mice were chosen from the double heterogeneous marker pool. Recipient mice received amix of naive OT-1 CD8 T cells that contained one cell from each of the seven uniquely congenically markeddonor mice together with 100 cells from one further uniquely marked mouse. As lineage was determined bycell surface markers, clonal cell numbers could be quantitatively evaluated by FACS. The cellular barcodingapproach used in the twinned paper [8] was based on DNA tags and their processing via PCR makes that dataless suitable for quantitative modeling analysis.In the experiments reported in [3], the host was infected with 5 × Listeria monocytogenes, a pathogenicbacteria, expressing OVA (LM-OVA). At day eight post infection, cells were taken from the spleen, lymphnodes and lungs. The cells were sorted into three phenotypes: • TCMp : CD62L + CD27 + phenotype. • TEMp : CD62L - CD27 + phenotype. • TEF : CD62L - CD27 - phenotype.Summary statistics from the data collected for these phenotypically defined populations were reported: theaverage cell count per family, coefficients of variation and correlation coefficients between families for thethree phenotypes, Fig. 2A. Cohort level experiments, [2, 21, 3, 15]
Each of these four papers also performed OT-1 CD8+ T celladoptive transfer experiments, Fig. 1B. In these, cells were not tracked at a clonal level, { c fi ( t ) } , but at acohort level { ρ f ( t ) } , via expression of either CD45 or CD90. The host mouse was infected with a pathogenexpressing OVA; PR8-OVA in [15] and LM-OVA in the other papers. The reported experiments have cellstaken from different organs, spleen, or mesenteric lymph nodes (MLN), at a range of times post infections. Inaddition, [2, 21] performed experiments with varying number of transferred OT-1 T cells. All of those papersreport the percentage of cells that were CD62L + at harvest times.For cohort spleen and blood data reported in all four papers, at early stages post infection all experimentsshow an initial large CD62L + population, Fig. 2B, which the authors of [3] associate this with an earlymemory precursor population while the authors of [15] associate it with cells still being naive. All papersshow a decrease in this initial CD62L + proportion, which both [15] and [3] explain is due to the effector poolincreasing in size. The paper [15] reports an increase in the percentage of CD62L + by day seven and explainsthis as cells differentiating into memory cells, whereas the data in [3] continues to show a decrease by dayseven.For experiments with a high number of adoptively transferred cells, [21] and [2] report an earlier expan-sion peak and a shorter expansion phase. Experiments transferring less than 500 cells had a peak at day seven, hile for experiments transferring 5 × cells the peak immune response occurred at day five. Curtailingthe experimental data with higher numbers of adoptively transferred cells to its expansion phase (a method forestimating this time we describe later in the paper) removes much of the later upturn in CD62L proportions,Fig. 2B. Having transferred 5 × cells, the paper [2] estimates the family size increase to be 40 – 400fold on average at the peak, compared to the estimated 400,000 fold increase on average for the experimenttransferring 50 cells. The authors of [3] introduced a collection of memoryless multi-type Bellman-Harris stochastic processes [9]to model the expansion phase of an adaptive immune response based on the following assumptions. • There are four types of cell: naive, TCMp, TEMp, and TEF. • There is no cell death during the expansion phase described by the model. • For each cell type, the lifetime of each cell is an independent and identically exponentially distributedrandom variable with a cell-type dependent mean. • At the end of its lifetime, a naive cell differentiates into one of the other cell types. • TCMp, TEMp, and TEF cells can divide or differentiate, with the each cell’s fate determined proba-bilistically and independently based on a cell-type dependent parameterization. • Possible differentiation between cell-types is known as a path. Different model types have a differentcombination of paths between cell types, which can be uni- or bi-directional. • There must be a direct or indirect path from the naive cell to each cell-type. • When the combinatorics are all done, the collection of possible paths describes 304 different modelswith between seven and 12 parameters.For a given differentiation network and parameterization, the authors of [3] employ a χ distance as a mea-sure of a quality of fit between expected summary statistics and observed summary statistics. Denote the nineobserved clonal summary statistics per-phenotype of mean family size, coefficient of variation and Pearsoncorrelation between the phenotypes as ˆ x = ( ˆ x ,..., ˆ x ) ∈ R , and their equivalent expected values for a givenmodel plus parameterization, θ , as (cid:126) x ( θ ) ∈ R . With experimental uncertainty of each observation, ˆ x i , esti-mated by bootstrap approximation as ˆ σ i , the χ distance between models and observations is computed tobe d ( ˆ x ,(cid:126) x ( θ )) = ∑ i = ( ˆ x i − x i ( θ )) ˆ σ i . For each of the 304 possible models, the best-fit parameterization, θ ∗ = arginf θ d ( ˆ x ,(cid:126) x ( θ )) , was numericallyidentified and more involved models were penalised via Akaike Information Criterion (AIC). One of the twobest fits over all 304 possible differentiation networks was a linear differentiation pathway with memory cellsappearing before effector cells (the second model was as the first, but with an additional 10% chance for naiveto skip the TCMp stage of the path).We wished to determine the method’s consistency when fitting the model class described in [3] to thetime-course cohort data { ρ f ( t ) : t = day 1 , , , , , } , described in that paper rather than to the clonal data { c fi ( t ) : t = day 8 } as the authors had originally done. To do so, we replaced the clonal summary statistics withˆ y = ( ˆ y ,..., ˆ y ) where ˆ y is the average family size at the peak and y ,..., y are the 18 measured proportionsof phenotypes observed at different times post infection. We denote ˆ σ i as the sample error of the statistic y i . In the case of the average family size, the SEM was approximated by summing together the SEMs of thephenotypic components. Thus, we use the formulae d ( ˆ y ,(cid:126) y ( θ )) = ∑ i = ( ˆ y i − y i ( θ )) ˆ σ i . to measure the distance between expected and observed cohort statistics.We numerically identified the best fit parameterization of the model for six representative differentiationpathways, Fig. 3A, of the 304 considered in [3]. These six were chosen since they include one of the two bestfit models and were sufficient to recreate the most significant results reported in [3]. We label the six modelsby: • Model 1 : Naive → CD62L + CD27 + → CD62L - CD27 + → CD62L - CD27 - . • Model 2 : Naive → CD62L + CD27 + → CD62L - CD27 - → CD62L - CD27 + . • Model 3 : Naive → CD62L + CD27 - → CD62L + CD27 + → CD62L - CD27 - . • Model 4 : Naive → CD62L + CD27 - → CD62L - CD27 - → CD62L + CD27 + . • Model 5 : Naive → CD62L - CD27 - → CD62L + CD27 + → CD62L + CD27 - . • Model 6 : Naive → CD62L - CD27 - → CD62L - CD27 + → CD62L + CD27 + .Model 1 is one of the two original best fitting models reported in [3]. As these models all have the samenumber of parameters, we do not need to consider penalization by AIC. Due to less extensive data in [2, 21,15], later we will reduce the set model further to consider just two competing models, Memory (CD62L + )First and Effector (CD62L - ) First, Fig. 3B.The model in [3] has four cell types, but only reports data for three phenotypes and no statistics for naivecells are presented. This could be interpreted two ways, either naive cells have been excluded from the data, orthey were included in the count of one of the other phenotypes. We note that [3] define naive cells as CD62L + CD27 + ([3] Supplementary Fig. 3), but with a different gating strategy than used for the classification of othercell types ([3] Supplementary Fig. 22). For the analysis here, we assume that naive cells are included in theTCMp count reported in [3].When applied to the time-course cohort data reported in [3], we found that Model 1, Memory First,to be the best fit, consistent with the original deduction from fitting to clonal data, Fig. 4A. Notably, theparameterization for the best fitting model was remarkably similar for five parameters out of six, as shown inFig. 4B. Fitting to either data-set is consistent with an increase in the division rate as cells differentiate frommemory to effector cells, as also reported in [3]. The difference in the parameter λ N , which parameterizesthe lifetime of naive cells, could be explained by the fact that the adjusted method compares the sum ofproportions of naive and TCMp cells to one statistic, the proportion of CD62L + CD27 + cells, and is not ableto discriminate between their separate contributions. These results suggest that, for this model, fitting totime-course cohort data with an average family size is a reasonable alternative to fitting to clonal data ofmean count of cells, variances and covariances. Adaptation.
As data from other papers we wish to analyze, [2, 21, 15], did not include all of the informationas [3] it was necessary to use a modified scheme. In some cases, average family size or SEM informationwas not given. All four papers provide the proportion of CD62L + cells, but some do not report CD27 data.Thus we define a new set of models that do not distinguish between CD27 - and CD27 + cells so that we canfit the observed statistics without this information. We change the assumption that there are four underlyingcell types: naive, TCMp, TEMp and TEF used in [3], to three cell types: naive, memory and effector cells.Thus the phenotypes are: • Naive and memory cells : CD62L + phenotype. Effector cells : CD62L - phenotype.We label the two versions of these models as: • Linear Memory First Model differentiation path: Naive → CD62L + → CD62L - . • Linear Effector First Model differentiation path: Naive → CD62L - → CD62L + .It is estimated that a mouse has 100–1000 cells capable of responding to any specific antigen, meaninglarge adoptive transfers such as 10 used in [15] dwarf the endogenous response [2]. The papers [2, 21] estab-lish that increased transfer size leads to an earlier peak response. We used log-linear fitting to approximate therelationship between the number of transferred cells and the day of peak response, Fig. 5A. For an experimenttransferring 107 cells, as described in [3], we estimate the peak to be at day between seven and eight. For anexperiment transferring 10 cells, as described in [15], we estimate the peak day to be between day four andfive. We proceed with the assumption in [3] that the expansion phase in that system continues until day eight.The χ objective function employed by [3] uses estimates of the sample variance, ˆ σ i , for each observedstatistic. There is not, however, sufficient data reported to make the bootstrapping calculation for the SEM of[15] data. As a result we needed to modify the weighting in the objective function so it was not dependenton this missing information. This weighting also has to manage the fact that an average family size is on adifferent scale than a proportional statistic. Let us suppose we have an observed average family size denoted asˆ y and k statistics observed of proportional phenotypes at different times denoted as ˆ y ,... ˆ y k + . The objectivefunction we applied was a weighted mean squared error defined by d ( ˆ y ,(cid:126) y ( θ )) = ( ˆ y − y ( θ )) ˆ y + k ∑ i = ( ˆ y i − y i ( θ )) . That is, we use another weighted mean squared error where we only divide one of the sum’s elements by therespective observed statistic squared, the observed populations at day eight post infection. This weighting isrequired to ensure that the population statistic does not disproportionately influence the objective function, asthe population scales exponentially over time while the proportional statistics do not.The methodology of fitting models to observed cohort data requires an estimate of the average family sizeat peak immune response in order for models to scale correctly, however this data is not reported in [15].The authors of [21] provide a methodology for estimating the difference in the average number of divisionstransferred OT-1 CD8+ T cells have made by the time of peak response between two experiments. We adoptand extend this method to estimate average family size at peak. We fit a log linear relationship to estimatethe percentage of OT-1 cell at peak of the immune response for different adoptive transfer amounts, Fig. 5B.We also make the assumption that the total number lymphocytes (OT-1 and endogenous) at the peak immuneresponse is the same across all experiments, an assumption supported by spleen data in [21]. Spleen countsreported in [2] suggest this may not always be the case, particularly for low transfer numbers. Finally weassume that the average family size reported in [3] is representative. Combing these assumptions allowedus to derive the average family size for a given adoptive transfer of OT-1 cells, Fig. 5C. The log-linearassumption is used for simplicity, but predicts negative growth for low numbers of OT-1 cells, and so canonly be useful above a threshold.This scheme indicates that with a transfer of 1 × cells, as in [3], each clone results in an average ofnine more divisions than the clones in a transfer of 1 × cells, as in [15]. This scales roughly with thatestimated in [21] that a transfer of 3 . × cells results in five more divisions than a 4 × transfer. Wecalculate that in the experiment reported in [15] we would expect 23 cells per family (four to five averagedivisions) at the peak of around day 4.5. This is reasonable when compared to data in [2], which reported thatthere is only a 13 fold increase in total OT-1 proportion at peak for a 10 fold increase in precursor transfer.While relative numbers are comparable, large discrepancies in reported data for absolute cell counts highlightcaveats with these estimates. On transferring 10 cells, [3] reports an average peak family size of 15k, while[2] gives a 400k estimate for 50 cells transferred. Our estimate for the total number of lymphocytes in amouse, calculated from the value reported in [3], is four fold lower than that [21] reported in the spleen.Because of the fragility of this calculation, we performed a sanity check which established there was nosignificant impact (data not shown). pplication. When fitting to all the data in [15] up to day eight post infection, contrary to the deductionin [3], the method indicated that the Linear Effector First Model provided the better fit, Fig. 6A. The LinearEffector First Model was also the best fit for the experiments in [21] with high adoptive transfer numbers. Forexperiments with lower numbers of adoptively transferred cells, the deduction flipped and the Linear Memoryfirst model was the best fit. Thus, when fitting without adjusting for the number of cell adoptively transferred,we found contradictory results.We restricted the data from these papers to the estimated expansion phase. Curtailed data with less thanfour data points was not considered. For all other data sets, the Linear Memory First Model provided the bestfit, Fig. 6B. On examination, however, all of best fits, had biologically implausible parameterizations, withnaive cell average lifetimes taking values far over 10 days, with the one exception of the Memory First modelfitting to [3] data. This is at odds with data showing that, even for high adoptive transfers, less than 0.1%of the OT-1 population comprised unrecruited cells [21], and data from [15] showing that while no cells haddivided on day two post infection the majority were preparing to divide. We therefore further adapted themethod, re-performing the analysis, but upper bounding parameters so that average cell lifetime is five daysor less. When fitting with this adjustment, the Linear Memory First Model still provided the best fit to allthe curtailed data sets, but the result was stronger, with the Linear Effector First Model failing to give a goodexplanation for the data in [3, 15], Fig. 6C. This result suggested that under the assumptions of the modelthere was no biologically plausible good fit for the Linear Effector First Model when fit to either data set. Weapplied the fitting method to blood data from [2] which was not included in the main analysis, and this alsoreturned a Memory First model as the best fit, Fig. 6D. We also applied the method to MLN data reported in[15], which, in contrast to the other data, returned the Linear Effector First Model as the best fit. The MLNdata, however, may not be appropriate for this analysis, due to the high levels of migration.The method chose boundary values for at least one parameter in all cases except when fitting the LinearMemory First Model to [3] data, Fig. 7. In addition, some models predicted little memory, primarily fittingthe CD62L + portion with naive and effector cells, as was the case with the Effector First Model fitted to[15]. This suggested that the model may not be sophisticated enough or did not have enough data to fit to.An alternative explanation could come from other [15] results, which showed a homogeneous populationeffectors on day four post infection make way for two homogeneous groups of effectors and memory cells onday seven. With the insight that day seven is after the expansion phase in the experimental setup describedin [15], this result suggests the possibility that memory may appear after the strictly exponential expansionphase considered by the model in [3] has come to an end. The evidence presented here suggests that the modeling methodology introduced in [3] to discriminate be-tween differentiation pathways, originally implemented to fit to clonal data, is robust in its deductions whenmodified to fit to more readily available time-course non-clonal data. Robustness is a key advantage of theapproach for quickly evaluating differentiation networks.We acknowledge that our application of the model to published data has several limitations beyond theoriginal: we considered a much more limited collection of linear-only paths; we had to collapse phenotypicdefinitions due to not having CD27 measurements; we extracted data from papers directly from graphs; weemployed an adapted method for estimating day of peak immune response, and average family size.Despite all of those considerations, the model provided a robust outcome when fit to the non-clonal cohorttime-course published in [3]. Once day of peak response was taken into account, it produced consistentdeductions, that within this modeling framework Memory First provides the best fit, across a range of datafrom other papers [2, 21, 15]. When inappropriately fit to time-courses that extend beyond the expansionphase, interestingly it supported the Effector First Model in some cases, which is more consistent with otherdata in [15].As with any modeling framework, there are, of course, caveats in the strength of the conclusions drawnfrom it. Presumably due to the inherently limited nature of the in vivo data being fit, several modelingassumptions were made in [3] that keep the computational model tractable and parameters identifiable, but ome of which are inconsistent with published experimental data. As more sophisticated tracking of cellfamilial fates and information on physiological constraints around lineage transitions develops, we anticipatethat ultimately other mathematical models will be employed that are built on that knowledge. For illustration,we attempt to highlight where those assumptions may influence the deductions of the present study.The model’s remit is exclusively an expansion phase with no death and strictly exponentially growingpopulations. As a result, the method cannot be reasonably expected to draw inferences on differentiation if itoccurs as motivation to divide is petering out.The fundamental unit of autonomous randomness in the model is the cell: each cell is born afresh, in-dependently selecting its lifetime and fate in a cell-type dependent fashion. The data in both [3, 8] illustratea strong familial influence, with some clones expanding dramatically more than others. Controlled in vitro experiments have established significant elements of clonal dependency [16], notably in burst size [17, 12],while in vivo data using the cell cycle reporter FUCCI [20] has established that during an adaptive immuneresponse cells drop out of cycle as early as day four and the number of non-dividing cells progressively in-creases with time [18]. Thus, to retain simplicity and identifiability, an alternative view, taking to accountthese experimental features would posit that the ultimate clone burst size is randomized between families,and that the slowing in proliferation rate noted by the memory first model, Fig. 4B and Fig. 7B, is created bynon-dividing memory cells diluting the average division time of the OT-1 populations being measured.The use of an exponential distribution for the lifetime distribution ensures that the multi-type Bellman-Harris process is Markovian, and facilitates the use of non-asymptotic values in model fitting. Due to theminimum time needed to synthesize DNA, there is always a lower-bound on division times and it has beenlong since known that for lymphocytes, at least in vitro , the exponential does not offer a good description ofthe distribution of division times and a right skewed distribution is more appropriate [22, 6, 11, 10, 7]. Thus,while cohort and clonal data yielded the same deductions in our analysis, only experiments looking deeperthan the population level can shed light on the validity of the modeling assumptions used for inference.Whether a model built taking these assumptions into account would lead to distinct deductions is unclear.What is clear is that, within the confines of the assumptions underlying it, the model introduced in [3] providesa method for quickly evaluating differentiation networks, even for cohort level data. Acknowledgement.
The work of A.M. and K.R.D. was supported by Science Foundation Ireland Grant12IP1263. This work was supported by the National Health and Medical Research Council of Australiavia Project Grant 1010654, Program Grant 1054925 and a fellowship to P.D.H. as well as an AustralianGovernment National Health and Medical Research Council Independent Research Institutes InfrastructureSupport Scheme Grant 361646.
References [1] Rafi Ahmed, Michael J. Bevan, Steven L. Reiner, and Douglas T. Fearon. The precursors of memory:models and controversies.
Nature Reviews Immunology , 9(9):662–668, 2009.[2] Vladimir P. Badovinac, Jodie S. Haring, and John T. Harty. Initial T cell receptor transgenic cell precur-sor frequency dictates critical aspects of the CD8+ T cell response to infection.
Immunity , 26(6):827–841, 2007.[3] Veit R. Buchholz, Michael Flossdorf, Inge Hensel, Lorenz Kretschmer, Bianca Weissbrich, PatriciaGr¨af, Admar Verschoor, Matthias Schiemann, Thomas H¨ofer, and Dirk H. Busch. Disparate individualfates compose robust CD8+ T cell immunity.
Science , 340(6132):630–635, 2013.[4] Veit R. Buchholz, Ton N. M. Schumacher, and Dirk H. Busch. T cell fate at the single-cell level.
Annualreview of immunology , 34:65–92, 2016.[5] John T. Chang, Vikram R. Palanivel, Ichiko Kinjyo, Felix Schambach, Andrew M. Intlekofer, ArnobBanerjee, Sarah A. Longworth, Kristine E. Vinup, Paul Mrass, Jane Oliaro, Nigel Killeen, Jordan S.Orange, Sarah M. Russell, Wolfgang Weninger, and Steven L. Reiner. Asymmetric T lymphocytedivision in the initiation of adaptive immune responses.
Science , 315(5819):1687–1691, 2007.
6] Mark R. Dowling, Andrey Kan, Susanne Heinzel, Jie H. S. Zhou, Julia M. Marchingo, Cameron J.Wellard, John F. Markham, and Philip D. Hodgkin. Stretched cell cycle model for proliferating lym-phocytes.
Proceedings of the National Academy of Sciences , 111(17):6377–6382, 2014.[7] Ken R. Duffy, Cameron J. Wellard, John F. Markham, Jie H. S. Zhou, Ross Holmberg, Edwin D.Hawkins, Jhagvaral Hasbold, Mark R. Dowling, and Philip D. Hodgkin. Activation-induced B cellfates are selected by intracellular stochastic competition.
Science , 335(6066):338–341, 2012.[8] Carmen Gerlach, Jan C. Rohr, Le¨ıla Peri´e, Nienke van Rooij, Jeroen W. J. van Heijst, Arno Velds, JosUrbanus, Shalin H. Naik, Heinz Jacobs, Joost B. Beltman, Rob J. de Boer, and Ton N. M. Schumacher.Heterogeneous differentiation patterns of individual CD8+ T cells.
Science , 340(6132):635–639, 2013.[9] Theodore E Harris.
The Theory of Branching Processes . The RAND Corporation, 1964.[10] E. D. Hawkins, J. F. Markham, L. P. McGuinness, and P. D. Hodgkin. A single-cell pedigree anal-ysis of alternative stochastic lymphocyte fates.
Proceedings of the National Academy of Sciences ,106(32):13457–13462, 2009.[11] E. D. Hawkins, M. L. Turner, M. R. Dowling, C. van Gend, and P. D. Hodgkin. A model of immuneregulation as a consequence of randomized lymphocyte division and death times.
Proceedings of theNational Academy of Sciences , 104(12):5032–5037, 2007.[12] Susanne Heinzel, Tran Binh Giang, Andrey Kan, Julia M Marchingo, Bryan K Lye, Lynn M Corcoran,and Philip D Hodgkin. A Myc-dependent division timer complements a cell-death timer to regulate Tcell and B cell responses.
Nature Immunology , 18:96–103, 2017.[13] Joshyi Jacob and David Baltimore. Modelling T-cell memory by genetic marking of memory T cells invivo.
Nature , 399:593–597, 1999.[14] Carolyn G. King, Sabrina Koehli, Barbara Hausmann, Mathias Schmaler, Dietmar Zehn, and Ed Palmer.T cell affinity regulates asymmetric division, effector cell differentiation, and tissue pathology.
Immu-nity , 37(4):709–720, 2012.[15] Ichiko Kinjyo, Jim Qin, Sioh-Yang Tan, Cameron J. Wellard, Paulus Mrass, William Ritchie, AtsushiDoi, Lois L. Cavanagh, Michio Tomura, Asako Sakaue-Sawano, Osami Kanagawa, Atsushi Miyawaki,Philip D. Hodgkin, and Wolfgang Weninger. Real-time tracking of cell cycle progression during CD8+effector and memory T-cell differentiation.
Nature Communications , 6:6301 EP –, 2015.[16] Fabrice Lemaˆıtre, H´el`ene D Moreau, Laura Vedele, and Philippe Bousso. Phenotypic CD8+ T celldiversification occurs before, during, and after the first T cell division.
The Journal of Immunology ,191(4):1578–1585, 2013.[17] J. M. Marchingo, G. Prevedello, A. Kan, S. Heinzel, P. D. Hodgkin, and K. R. Duffy. T-cell stimuliindependently sum to regulate an inherited clonal division fate.
Nature Communications , 7:13540, 2016.[18] Julia M. Marchingo, Andrey Kan, Robyn M. Sutherland, Ken R. Duffy, Cameron J. Wellard, Gabrielle T.Belz, Andrew M. Lew, Mark R. Dowling, Susanne Heinzel, and Philip D. Hodgkin. Antigen affinity,costimulation, and cytokine inputs sum linearly to amplify T cell expansion.
Science , 346(6213):1123–1127, 2014.[19] Joseph T. Opferman, Bertram T. Ober, and Philip G. Ashton-Rickardt. Linear differentiation of cytotoxiceffectors into memory T lymphocytes.
Science , 283(5408):1745–1748, 1999.[20] Asako Sakaue-Sawano, Hiroshi Kurokawa, Toshifumi Morimura, Aki Hanyu, Hiroshi Hama, HatsukiOsawa, Saori Kashiwagi, Kiyoko Fukami, Takaki Miyata, Hiroyuki Miyoshi, Takeshi Imamura, Masa-haru Ogawa, Hisao Masai, and Atsushi Miyawaki. Visualizing spatiotemporal dynamics of multicellularcell-cycle progression.
Cell , 132(3):487–498, 2008.[21] Timothy E. Schlub, Vladimir P. Badovinac, Jaime T. Sabea, John T. Harty, and Miles P. Davenport.Predicting CD62L expression during the CD8+ T cell response in vivo.
Immunology and Cell Biology ,88(2):157–164, 2010.
22] J. A. Smith and L. Martin. Do cells cycle?
Proceedings of the National Academy of Sciences ,70(4):1263–1267, 1973.[23] E. John Wherry, Volker Teichgr¨aber, Todd C. Becker, David Masopust, Susan M. Kaech, Rustom Antia,Ulrich H. von Andrian, and Rafi Ahmed. Lineage relationship and protective immunity of memory CD8T cell subsets.
Nature Immunology , 4:225–234, 2003. . Path ogen
Naive
NaiveNaive Flow
Cytometer ? ? ? ?Host
1. An array of eight OT-1 congenic C57BL/6 donor mice is crossbred so that each has
CD45/CD90 T cell markers that distinguish offspring. 2. Adoptive transfer of purified naïve cells from 8 donor mice to host mouse. 1 cell is transferred per donor for 7 mice, 100 for the 8 th donor mouse. cfu of LM-OVA, triggering the immune response. 4. At eight days after infection cells are harvested from the host mouse from the lungs, spleen and lymph nodes.5.FACs analysis. For observed T cells, the congenic markers indicate the family and CD62L and CD27 markers indicate phenotype. Thus summary statistics of phenotype population is gathered at a family level. Analysis
HostDonors B . Flow Cytometer ? ? ? ?
1. The donor mouse is crossbred so as to express CD90 or CD45. 2. Adoptive transfer of different numbers of purified naïve T cells from donor mice to host mouse: • - 5x [B.2007] • - 4x [S.2010] • - 1x [B.2013] • [K.2015] Various days post infection: • • • •
3. The host mouse (expressing different markers from the donor) is infected with OVA expressing pathogen: • LM-OVA (7x cfu) [B.2007] • LM-OVA (8x cfu) [S.2010] • LM-OVA (5x cfu) [B.2013] • PR8-OVA (1x pfu) [K.2015]
4. At a fixed time post infection cells are harvested from the host mouse from: • blood tail snips [B.2007][S.2010] • spleen only [B.2013] • or spleen and MLN [K. 2015] Analysis
HostHostDonor *1-4 days for 1x transfer 6,8 days for 1x transfer Path ogen
NaiveNaiveNaive
Figure 1. Summary adoptive transfer experiments described in [2, 21, 3, 15]. (A) Experiments describedin [3] determine the number of cells per OT-1 CD8+ T cell clone at day eight. (B) All four papers describesimilar OT-1 CD8+ T cell adoptive transfer experiments, where cohort data is reported for a time-course ofdays post infection. The number of cells adoptively transferred varies between papers.11 . a v e r a g e f a m il y s i z e CD62L-CD27-CD62L-CD27+CD62L+CD27+ c o e ff i c i e n t s o f v a r i a t i o n CD62L-CD27-CD62L-CD27+CD62L+CD27+ c o rr e l a t i o n c o e ff i c i e n t s CD62L-CD27-/CD62L+CD27+CD62L-CD27-/CD62L-CD27+CD62L+CD27+/CD62L-CD27+ p o p u l a t i o n p r o p o r t i o n s CD62L-CD27-CD62L-CD27+CD62L+CD27+ B . [S. 2010] - blood a ll d a t a u p t o d a y p . i . ( p r o p o r t i o n C D L + ) [B. 2007] - blood d a t a c u r t a il e d t o e x p a n s i o n p h a s e ( p r o p o r t i o n C D L + ) [K. 2015] - spleen [K. 2015] - MLN Min ( x ) Median ( . x ) Max ( x ) [B. 2013] - spleen Figure 2. Experimental data from [2, 21, 3, 15].
Data manually extracted from graphs in those papers. (A, leftthree graphs) Results from the clonal experiment described in [3], showing average family size, coefficients ofvariation and correlations for different phenotypes populations. (A, rightmost graph) Proportional phenotyperesults from the cohort experiment. (B) Comparing the CD62L + proportions from the four papers reportingcohort data. The top row shows all reported data up to day eight post infection. The bottom row shows the datawhen curtailed to the estimated expansion phase, which is shortened due to high adoptive transfer cell numbers.12 . TYPE 2Naive TYPE 2TYPE 2
TYPE 2
TYPE 3TYPE 3TYPE 3
TYPE 3 λ λ N λ p TYPE 4TYPE 4TYPE 4TYPE 4 λ p
1. Naïve cells live a exponentially distributed lifetime parameterized by λ N . Upon exit from this state, naïve cells become one cell of type two. 2. Type two cells live an exponentially distributed lifetime parameterized by λ . Upon exit from this state, type two cells differentiate into one cell of type three with probability p or divide into two type two cells with probability 1-p .
3. Type three cells live an exponentially distributed lifetime parameterized by λ . Upon exit from this state, type three cells differentiate into one cell of type four with probability p or divide into two type three cells with probability 1-p .4. Type four cells live an exponentially distributed lifetime parameterized by λ . Upon exit from this state, type four cells always divide into two type four cells. B . TYPE 2Naive TYPE 2TYPE 2TYPE 2 TYPE 3TYPE 3TYPE 3TYPE 3 λ λ N λ p
1. Naïve cells live a exponentially distributed lifetime parameterized by λ N . Upon exit from this state, naïve cells become one cell of type two.
2. Type two cells live an exponentially distributed lifetime parameterized by λ . Upon exit from this state, type two cells differentiate into one cell of type three with probability p or divide into two type three cells with probability 1- p .
3. Type three cells live an exponentially distributed lifetime parameterized by λ . Upon exit from this state type three cells always divide into two type three cells. Figure 3. Linear differentiation models considered here. (A) These linear models are a subset of the304 multi-type Bellman-Harris processes modeled by [3]. Each of the six models has six parameters, ( λ N , λ , λ , λ , p , p ) . (B) Simplified linear models, with the observed phenotypes reduced from three(CD62L + CD27 + , CD62L + CD27 - , CD62L - CD27 - ) to two (CD62L + , CD62L - ) and cell types reduced fromfour to three. The simplified model has four parameters, ( λ N , λ , λ , p ) .13 . Model 1 Model 2 Model 3 Model 4 Model 5 Model 6Model Index010203040506070 o b j e c t i v e f un c t i o n s c o r e ( c l o n a l ) o b j e c t i v e f un c t i o n s c o r e ( c o h o r t ) B . λ − N (days) λ − (days) λ − (days) λ − (days) p p parameter0.00.20.40.60.81.01.21.4 p a r a m e t e r v a l u e Model fitted to:
Clonal data (means, variances and covariances of phenotype family size).Population data (phenotypeproportions and averagefamily size).
Figure 4. Results from fitting the linear models to clonal data, as in the original method, or cohort datafrom [3]. (A) Objective function values for the best fitting parameterization of each model, where smaller val-ues indicate a better fit. (B) Best fitting parameterization for Model 1, where λ corresponds to CD62L + CD27 + (TCMp), λ to CD62L - CD27 + (TEMp), λ to CD62L - CD27 - (TEF), p to the probability that a TCMp cellbecomes a TEMp cell and p to the probability that a TEMp cell becomes a TEF cell.14 . transfered number of cells0123456789 d a y o f p e a k r e s p o n s e r =0.81 [Badovinac et al. 2007][Schlub et al. 2010]Estimate for [Buchholz et al. 2013]Estimate for [Kinjyo et al. 2015] B . transfered number of cells0.00.10.20.30.40.50.6 p r o p o r t i o n o f O T - c e ll s a t p e a k r e s p o n c e r =0.78 [Badovinac et al. 2007][Schlub et al. 2010]Estimate for [Buchholz et al. 2013]Estimate for [Kinjyo et al. 2015] C . number of cells adoptively transfered05000100001500020000 e s t i m a t e f o r a v e r a g e f a m il y s i z e Adoptive transfer: [Buchholz et al. 2013] 107[Schlub et al. 2010] . x [Schlub et al. 2010] . x [Kinjyo et al. 2015] x Figure 5. Estimates of the day of peak response and average family size as a function of the number ofadoptively transferred cells, using data from [2, 21]. (A) Log-linear regression was used to find an estimatefor the day of peak response (y co-ordinate) from the number of adoptively transferred cells (x co-ordinate)reported in [15] and [3]. (B) as in A but estimating proportion of OT-1 cells of lymphocytes at peak. (C)Estimate of average family size at peak for given number of adoptively transferred cells. For low numbers oftransferred cells ( < . x − spleen[B. 2013] . x blood[S. 2010] . x blood[S. 2010] x blood[S. 2010] x blood[S. 2010] x spleen[K. 2015] T cells tranfered / data source0.000.050.100.150.200.25 o b j e c t i v e f un c t i o n s c o r e B . x − spleen[B. 2013] . x blood[S. 2010] . x blood[S. 2010] x no data[S. 2010] x no data[S. 2010] x spleen[K. 2015] T cells tranfered / data source0.000.020.040.060.080.100.120.14 o b j e c t i v e f un c t i o n s c o r e C . x − spleen[B. 2013] . x blood[S. 2010] . x blood[S. 2010] x no data[S. 2010] x no data[S. 2010] x spleen[K. 2015] T cells tranfered / data source0.00.10.20.30.40.50.6 o b j e c t i v e f un c t i o n s c o r e D . x blood[B. 2007] x blood[B. 2007] x blood[B. 2007] x lymph[K. 2015]T cells tranfered / data source0.000.020.040.060.080.10 o b j e c t i v e f un c t i o n s c o r e s Linear Memory First ModelLinear Effector First Model
Figure 6. The objective function value for the two simplified linear models for the best parameterizationswhen fitting cohort statistics and average family size.
Smaller numbers indicate a better fit. Fits only donewhen data sets had four or more data points. (A) Fitted to all reported data up to day eight post infection. (B)As in A, but curtailing data to the estimated expansion phase. (C) As in B, but with the average cell lifetimelimited to being below five days. (D) as in C, but fit to additional data sources.16 . a v e r a g e f a m il y s i z e x − [B. 2013] . x [S. 2010] . x [S. 2010] All CD62L- CD62L+ x [K. 2015] Memory First Effector First p o p u l a t i o n p r o p o r t i o n B . λ − N (days) λ − (days) λ − (days) P parameter0.00.51.01.52.02.5 p a r a m e t e r v a l u e Linear Memory First Model
Data source: x − [B. 2013] . x [S. 2010] . x [S. 2010] x [K. 2015] C . λ − N (days) λ − (days) λ − (days) P parameter0.00.51.01.52.02.5 p a r a m e t e r v a l u e Linear Effector First Model