Bayesian information-theoretic calibration of patient-specific radiotherapy sensitivity parameters for informing effective scanning protocols in cancer
Heyrim Cho, Allison L. Lewis, Kathleen M. Storey
Abstract
With new advancements in technology, it is now possible to collect data for a variety of different metrics describing tumor growth, including tumor volume, composition, and vascularity, among others. For any proposed model of tumor growth and treatment, we observe large variability among individual patients' parameter values, particularly those relating to treatment response; thus, exploiting the use of these various metrics for model calibration can be helpful to infer such patient-specific parameters both accurately and early, so that treatment protocols can be adjusted mid-course for maximum efficacy. However, taking measurements can be costly and invasive, limiting clinicians to a sparse collection schedule. As such, the determination of optimal times and metrics for which to collect data in order to best inform proper treatment protocols could be of great assistance to clinicians. In this investigation, we employ a Bayesian information-theoretic calibration protocol for experimental design in order to identify the optimal times at which to collect data for informing treatment parameters. Within this procedure, data collection times are chosen sequentially to maximize the reduction in parameter uncertainty with each added measurement, ensuring that a budget of n high-fidelity experimental measurements results in maximum information gain about the low-fidelity model parameter values. In addition to investigating the optimal temporal pattern for data collection, we also develop a framework for deciding which metrics should be utilized at each data collection point. We illustrate this framework with a variety of toy examples, each utilizing a radiotherapy treatment regimen. For each scenario, we analyze the dependence of the predictive power of the low-fidelity model upon the measurement budget.

Introduction

Mathematical and computational approaches have long been developed for a better understanding of cancer [1, 2, 3, 4, 5].
Complex mechanisms that control carcinogenesis, tumor progression, and dynamics under treatments have been studied through various mathematical models [1, 4]. Such quantitative models are utilized to predict cancer dynamics and therapy response, improve early detection, and design therapeutic protocols [5]. Moreover, advances in technology have made it possible to collect considerable amounts of detailed data, including genetic, cellular level, and tissue-scale imaging, describing various aspects of cancer progression in patients [6, 7]. Incorporating the available data into an appropriate mathematical model can enhance the effectiveness of personalized treatments. However, practically speaking, an abundance of data may not be available in a clinical setting, particularly in the temporal domain. Specifically, data can often only be collected at sparse times, due to physical and financial burdens to patients. Therefore, with a limited number of total scans, it is important to decide when to collect the data to most accurately calibrate the model for maximum predictive power. Moreover, another important purpose of data collection is to aid in making an accurate prediction as early as possible during the treatment so that any necessary adjustments—for example, altering the dosage or discontinuing the treatment—can be made as soon as possible. However, this goal is also prohibited by the limited availability of patient data in the temporal domain.

Many different types of mathematical models have been developed to simulate tumor growth and response to treatment. These range from phenomenological models composed of ordinary differential equations (ODEs) [8, 9, 10] to complex multi-scale models that combine representations of the tumor microenvironment at the subcellular, cellular, and tissue scales [11, 12, 13, 14].
Typically the biological accuracy increases as the model complexity increases; however, achieving unique identifiability of model parameter values becomes more challenging. In [15], we present a framework for choosing an appropriate model and calibrating parameter values, given a set of data describing various tumor metrics over time. Here, we extend that work by developing a methodology to determine the optimal data collection times, in order to maximize the utility of the collected samples.

We focus on applying this methodology to models that simulate tumor response to radiotherapy. Radiotherapy is a common treatment modality applied to many cancer types, and there is a long tradition of mathematical modeling of radiotherapy response. Fractionated radiation dosing is typically modeled using the linear-quadratic (L-Q) model [16, 17]. Several recent studies have applied the L-Q model to patient-specific data, in an effort to evaluate and predict individual responses to radiotherapy [18, 19, 20, 15]. Our work extends this body of literature by suggesting optimal sampling times for tumors undergoing radiotherapy, so that the L-Q parameters can be efficiently calibrated and tuned for more accurate post-treatment predictions.

We utilize a Bayesian information-theoretic experimental design framework for choosing optimal design conditions at which to collect data for model calibration. This methodology is based on the concept of maximizing information gain—and thus reducing uncertainty—about low-fidelity model parameters, while minimizing the number of design conditions at which experimental data must be collected. Originally built around the concept of Shannon entropy [21], and utilizing the kth-Nearest-Neighbor (kNN) estimate of mutual information to make the procedure more computationally feasible [22], this framework was first introduced in [23, 24, 25].
The methodology was further extended in [26] to incorporate more robust Bayesian methods for supporting highly correlated and nonlinear parameter dependencies. Here, we further amend this framework to be suitable for a temporal data collection setting, emphasizing the minimization of uncertainty in our model parameters while also penalizing our algorithm for choosing points further out in time, which precludes the potential information gain that would result from data collection at intermediate time steps.

In Section 2, we introduce the mathematical models for tumor growth and radiotherapy that will be used to illustrate our methodology throughout this work. Section 3 outlines the procedure for using mutual information to optimally choose design conditions at which to evaluate tumor data in order to maximize information gain about model parameters for accurate and timely calibration. This section includes the novel introduction of a score function for temporal data collection, which penalizes the user for skipping too many potential data evaluation times in pursuit of the largest possible information gain. In Section 4, we illustrate our proposed framework for several examples that showcase a wide variety of scenarios with respect to model complexity and data collection protocols. We summarize our findings and discuss plans for future investigation in Sections 4.5 and 5.

Mathematical models used for testing
We begin by presenting two mathematical models that will be used throughout this work as low-fidelity models to be calibrated for clinical predictions, and another model that will be used to generate synthetic high-fidelity data. The first low-fidelity model is a one-compartment ODE model that tracks tumor volume over time—our simplest model for describing tumor growth. The second model is a two-compartment ODE model that incorporates a state variable for tracking the portion of tumor volume that is composed of necrotic tissue, thereby introducing the concept of tumor heterogeneity.

These two models will be calibrated using high-fidelity data generated by a more complex model, namely, a cellular automaton model. In addition to quantifying both the viable and necrotic cell populations, the cellular automaton model also tracks the cell division cycle, quiescent cells, and oxygen levels, allowing for more accurate reflection of the stochastic and heterogeneous nature of cancer growth in reality [14, 15]. In all three cases, we incorporate treatment via radiation using the linear-quadratic model for radiotherapy [27, 28], as outlined in Section 2.4.
The one-compartment model describes the time evolution of the total tumor volume, V(t), using a logistic growth model with growth rate λ and carrying capacity K, together with natural cell death at rate η:

dV/dt = λV(1 − V/K) − ηV,    (1)

where the first term represents logistic growth and the second term represents natural cell death. In what follows, it will be convenient to re-parameterize Equation (1) to obtain the simpler and parametrically-identifiable form

dV/dt = AV(1 − (B/A)V),    (2)

where A = λ − η and B = λ/K. From this point forward, any reference to the one-compartment model is referring to the re-parameterized form, Equation (2).

Biologically speaking, as a tumor grows, regions at a distance from oxygen and nutrient sources (e.g., blood vessels for tumors growing in vivo) may undergo necrosis in response to sustained oxygen and/or nutrient deprivation. In this simple one-compartment model, such dead or necrotic cells are assumed to be removed from the tumor instantaneously; that is, we view the tumor as a homogeneous mass of proliferating, viable cells.

In order to account for some aspects of tumor heterogeneity, we next study a two-compartment model that tracks the time evolution of the viable tumor cell volume, V(t), and the necrotic core volume, N(t), originally developed in [29] and further analyzed in [15]. We consider this model in an effort to better represent reality, as it has been shown that the proportion of necrotic material has a significant impact on one's ability to estimate a tumor's response to radiotherapy [15, 30, 31]. We still assume that the population of proliferating (i.e., viable) cells, V(t), grows logistically with growth rate λ and carrying capacity K, and that viable cells convert to necrotic cells at a constant rate η.
In this second model, we assume that the natural death of viable cells results in the build-up of a necrotic core, denoted by N(t), where the necrotic material then undergoes natural decay at a constant rate ζ. We note that as ζ → ∞, the two-compartment model converges to the one-compartment model behaviorally. Combining these processes, we arrive at the following ODE system for V(t) and N(t):

dV/dt = λV(1 − V/K) − ηV,    (3a)
dN/dt = ηV − ζN.    (3b)

Throughout the following investigation, we refer to this system (3) as the two-compartment model.

In the absence of experimental data, we generate synthetic total tumor and necrotic volume data using a spatially-explicit, hybrid cellular automaton (CA) model to allow for illustration of our methodology. Throughout this investigation, the synthetic data from the CA model serves as the high-fidelity data in the high-to-low fidelity model calibration framework referred to in [26]. Our cellular automaton model is adapted from that developed in [14] and later expanded in [15]. The model incorporates spatially heterogeneous oxygen levels, a stochastic cell life cycle, and a heterogeneous cell population, including proliferating, quiescent, and necrotic cells. The cells are arranged on a discrete lattice representing a two-dimensional square cross-section of size 0.36 × 0.36 through a three-dimensional cancer spheroid in vitro. We identify with each automaton x = (x, y) at time t a dynamical variable with a state and a neighborhood. Each automaton can be occupied by a tumor cell in one of three states—proliferating, P, quiescent, Q, or necrotic, N—or can be unoccupied and denoted as empty, E.
We note that all lattice cells have an associated oxygen level, regardless of whether they are currently occupied. This oxygen level, c, determines the state of the occupying cell using thresholds c_N and c_Q: if c > c_Q then the cells proliferate, if c_N < c < c_Q then the cells transition to quiescent cells with a halved oxygen consumption rate, and if c ≤ c_N then the cells are considered necrotic.

We model the single growth-rate-limiting nutrient, oxygen, explicitly via a reaction-diffusion equation. See [14, 15] for a detailed description of the oxygen model. If a cell becomes necrotic due to low oxygen concentration or irradiation, which will be discussed in Section 2.4, the necrotic cells are lysed at rate p_NR. Lysis involves removing the necrotic cell and then shifting inward a chain of cells starting from the boundary of the spheroid to fill in the removed cell's site. Additionally, the CA mimics several other mechanisms observed in tumor spheroids. We incorporate the regulatory process known as contact inhibition of proliferation by reducing the division of cells with a large number of neighbors. We simulate cell-cell adhesion by shifting chains of cells outward after cell division. The details of these model features are described in [14, 15].

We use the CA model to generate a series of synthetic spheroids that differ in their radiosensitivity. The parameter values that are used to generate data using the CA model are shown in Table 3 in Appendix A. These parameters are estimated using experimental data from the prostate cancer cell line, PC3, in [14].

Finally, we discuss the incorporation of a radiotherapy (RT) treatment protocol in all three models. We consider a typical tumor treatment regimen in which daily doses of 2 Gy are administered Monday through Friday for six consecutive weeks. We use the linear-quadratic model [27, 28] to account for the effects of RT.
This model assumes that the fraction of cells that survive exposure to a single administered dose d of RT is given by the survival fraction,

SF = e^(−αd − βd²),    (4)

where α and β represent tissue-specific radiosensitivity parameters that model single- and double-strand breaks of the DNA, respectively [32].

For implementation in the one-compartment model, we assume that the effect of RT is instantaneous. That is, the irradiated cell fraction is removed immediately from the tumor volume, akin to the treatment of natural cell death in our model. Under these assumptions, the one-compartment model with RT treatment can be re-formulated as

dV/dt = AV(1 − (B/A)V), for t_i⁺ < t < t_{i+1}⁻,
V(t_i⁺) = exp(−αd − βd²) V(t_i⁻),    (5)

where t_i (for i = 1, 2, ..., n_R) denote the times at which an RT dose is delivered, and V(t_i^±) denote the tumor volume just before and after radiotherapy is administered.

Because our two-compartment model allows for the existence of a necrotic core, our implementation of RT in the two-compartment model assumes that irradiated cells from the total tumor volume move directly into the necrotic compartment, and later decay naturally. This manifests visually as a delayed reaction to RT in the total tumor volume; it is the heterogeneous composition of the tumor that displays the immediate effects of RT.

We apply the linear-quadratic model to the cellular automaton model in a similar fashion. At each administration of RT, each living cell converts to a necrotic cell with probability 1 − e^(−αd − βd²), and then undergoes natural decay during a later iteration with probability p_NR, as detailed in Section 2.3.

To illustrate our framework for several scenarios with data displaying a variety of behavior with regard to treatment success, we produce three different sets of test data using the CA model. These three test scenarios are representative of a patient whose tumor is highly sensitive to radiotherapy (generated in the CA model using α = 0.14 and β = 0.14, for a radiosensitivity parameter ratio of α/β = 1), a patient whose tumor is somewhat sensitive to RT (using α = 0.14 and β chosen so that α/β = 3), and a patient whose tumor does not respond meaningfully to RT (using α = 0.14 and β chosen so that α/β = 9). Throughout the investigation, we will refer to these three test scenarios as "high," "medium," and "low" radiosensitivity, respectively. The three sets of CA synthetic data are shown in Figure 1. In all patients, we consider a typical treatment protocol in which daily doses of 2 Gy are administered Monday through Friday for 6 weeks, as discussed above.

Figure 1: High-fidelity CA data for each of the three radiosensitivity scenarios: high (α/β = 1, on left), medium (α/β = 3, in middle), and low (α/β = 9, on right).
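As a concrete illustration of Equations (2), (4), and (5), the sketch below advances the re-parameterized one-compartment model between doses (using the exact logistic solution) and applies the survival fraction instantaneously at each 2 Gy fraction. The growth parameters here are illustrative placeholders, not the calibrated values from Appendix A.

```python
import numpy as np

def survival_fraction(alpha, beta, d):
    """Linear-quadratic survival fraction, Eq. (4): SF = exp(-alpha*d - beta*d**2)."""
    return np.exp(-alpha * d - beta * d**2)

def logistic_step(V0, A, B, tau):
    """Exact solution of dV/dt = A*V*(1 - (B/A)*V), advanced by time tau."""
    return A / (B + (A / V0 - B) * np.exp(-A * tau))

def one_compartment_rt(A, B, V0, alpha, beta, dose=2.0, weeks=6, t_start=15):
    """Simulate Eq. (5): grow logistically between doses and apply the
    survival fraction at each Monday-Friday 2 Gy fraction.

    Returns the dose times and the post-dose tumor volumes (the initial
    condition at t = 0 is stored first)."""
    dose_days = [t_start + 7 * w + day for w in range(weeks) for day in range(5)]
    t, V = 0.0, V0
    times, volumes = [t], [V]
    for td in dose_days:
        V = logistic_step(V, A, B, td - t)         # growth up to the dose time
        V *= survival_fraction(alpha, beta, dose)  # instantaneous cell kill
        t = td
        times.append(t)
        volumes.append(V)
    return np.array(times), np.array(volumes)
```

With the same α = 0.14, the "high" radiosensitivity choice β = 0.14 (α/β = 1) drives the simulated volume down far faster than the "low" choice β = 0.14/9 (α/β = 9), mirroring the qualitative behavior of the three CA test scenarios.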
Bayesian Information-Theoretic Methodology
Our goal is to calibrate the parameters of a low-fidelity model (Equation (2) or (3)) using as few high-fidelity data collections (or in this case, as few evaluations of the high-fidelity CA model) as possible. In Section 3.1, we outline the mutual information methodology used to choose the most informative high-fidelity data points with respect to low-fidelity parameter uncertainty. Then, in Section 3.2, we present our adaptation of this method to allow for its use on time series data.
Mutual information for optimal experimental design

Given a set of initial experimental data (or high-fidelity code evaluations) and a low-fidelity model that we wish to calibrate, we seek to determine the optimal design conditions at which to collect additional data (or evaluate our high-fidelity model) in order to best inform the parameters of the low-fidelity model, θ ∈ R^p. We note that throughout this investigation, we will use a high-fidelity code to produce synthetic data to represent experimental data acquisition. We denote the existing high-fidelity data set by D_{n−1} = {d̃_1, d̃_2, ..., d̃_{n−1}}, and select the next design condition ξ_n from the set of all possible evaluation strategies or experimental designs, Ξ, in such a way as to maximize reduction in the uncertainty in θ when d̃_n = d_h(ξ_n)—the data point resulting from evaluation at design ξ_n—is appended to the existing data set.

Because we cannot determine the value of d̃_n without first evaluating the high-fidelity code, we must base our decision of ξ_n on predictions from the low-fidelity model. The low-fidelity model prediction for a specified condition is denoted by d_n = d_ℓ(θ, ξ_n), where d_ℓ(θ, ξ_n) represents the low-fidelity model evaluated at parameter set θ and design ξ_n. Predictions made by this model can be extended into a statistical modeling framework by incorporating a model discrepancy term, δ(ξ_n), and the possibility of measurement or discretization errors, ε_n(ξ_n). In this study, we assume no model discrepancy and zero measurement errors, and refer the interested reader to [26] for additional details on the corresponding statistical models for both the low-fidelity and high-fidelity frameworks.

Recall that our objective is to calibrate the low-fidelity model parameters using as little high-fidelity data as possible, so as to minimize unnecessary computational or experimental costs.
Thus, we want to choose design conditions for the high-fidelity code so as to contribute the maximum amount of information about our low-fidelity model parameters. We do so using a Bayesian framework. We begin with Bayes' Rule, which gives a method by which to update the knowledge about the parameters based on the addition of a new data point:

p(θ | D_n) = p(D_n | θ) p(θ) / p(D_n) = p(d̃_n, D_{n−1} | θ) p(θ) / p(d̃_n, D_{n−1}).

In essence, Bayes' Rule states that the posterior distribution of θ, given data D_n, depends both on the likelihood of obtaining data D_n given the parameter set θ, represented by p(D_n | θ), and the prior distribution of θ, p(θ), which incorporates all known information about θ.

As a means of quantifying the information gain obtained from including an additional data point, we utilize the mutual information between the low-fidelity model parameters and the high-fidelity data. We include here the basic layout of the mutual information derivation, and refer the interested reader to [23, 22, 26] for additional details.

The average level of uncertainty or information inherent in a random variable can be quantified by its Shannon entropy [21]. For a random variable Θ having a corresponding density p(θ) for θ ∈ Ω, the Shannon entropy is defined as

H(Θ) = − ∫_Ω p(θ) log(p(θ)) dθ

for the prior distribution and

H(Θ | D_{n−1}) = − ∫_Ω p(θ | D_{n−1}) log(p(θ | D_{n−1})) dθ

for the posterior distribution, given data D_{n−1}. Since we wish to determine the gain in information that results from incorporating an additional high-fidelity data point d̃_n, we define our utility function to be the following difference:

U(d_n, ξ_n) = ∫_Ω p(θ | d_n, D_{n−1}) log p(θ | d_n, D_{n−1}) dθ − ∫_Ω p(θ | D_{n−1}) log p(θ | D_{n−1}) dθ.    (6)

Note that in Equation (6), we use the low-fidelity model prediction d_n as an estimate in place of the high-fidelity output d̃_n, as we cannot know the value of this high-fidelity output without conducting a potentially expensive experiment.

To compute the average information gain resulting from the contribution of the experimental design ξ_n, we marginalize over the full set of all unknown future observations, D, to obtain

E_{d_n}[U(d_n, ξ_n)] = ∫_D U(d_n, ξ_n) p(d_n | D_{n−1}, ξ_n) dd_n.    (7)

By substituting Equation (6) into (7) and simplifying, we obtain

E_{d_n}[U(d_n, ξ_n)] = ∫_D ∫_Ω p(θ, d_n | D_{n−1}, ξ_n) log [ p(θ, d_n | D_{n−1}, ξ_n) / ( p(θ | D_{n−1}) p(d_n | D_{n−1}, ξ_n) ) ] dθ dd_n = I(θ; d_n | D_{n−1}, ξ_n),    (8)

which we define to be the mutual information between the low-fidelity model parameters, θ, and the high-fidelity data collected at design ξ_n. Essentially, this is a measure of parameter uncertainty reduction—the larger the mutual information, the more knowledge we expect to gain about the low-fidelity model parameters by collecting data at that experimental design. Thus, we optimize over the set of all available design conditions, choosing the one that maximizes this quantity:

ξ*_n = arg max_{ξ_n ∈ Ξ} I(θ; d_n | D_{n−1}, ξ_n).

Having chosen ξ*_n, we evaluate the high-fidelity model at this point, d̃_n = d_h(ξ*_n), append d̃_n to the data set D_{n−1}, and re-calibrate the low-fidelity model parameters. For this investigation, we use the Delayed Rejection Adaptive Metropolis (DRAM) algorithm for our model calibration step—see [33, 34, 26] for additional details.
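The select-evaluate-recalibrate cycle just described can be sketched as a greedy loop. This is a schematic under stated assumptions: `mutual_information`, `evaluate_high_fidelity`, and `recalibrate` are caller-supplied placeholders standing in for the kNN estimator of Equation (8), the high-fidelity CA model, and the DRAM sampler, none of which are implemented here.

```python
def sequential_design(designs, mutual_information, evaluate_high_fidelity,
                      recalibrate, data, budget):
    """Greedy Bayesian experimental design loop (schematic).

    At each step: score every remaining design by its estimated mutual
    information with the parameters, evaluate the high-fidelity source at
    the maximizer, append the new datum, and re-calibrate the low-fidelity
    model before the next selection."""
    chosen = []
    remaining = list(designs)
    for _ in range(budget):
        if not remaining:
            break
        scores = {xi: mutual_information(xi, data) for xi in remaining}
        xi_star = max(scores, key=scores.get)   # arg max of I(theta; d_n)
        data.append((xi_star, evaluate_high_fidelity(xi_star)))
        recalibrate(data)                       # e.g., a DRAM run in the paper
        remaining.remove(xi_star)
        chosen.append(xi_star)
    return chosen
```

For instance, with a toy information surrogate that simply prefers larger design values, `sequential_design([1, 2, 3, 4], lambda xi, data: xi, lambda xi: xi**2, lambda data: None, [], 2)` returns `[4, 3]`.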
Once the low-fidelity model parameters have been re-calibrated, this procedure is repeated in full until either (a) the budget of high-fidelity model evaluations has been exhausted or (b) the uncertainty in the low-fidelity model parameters has been reduced below a user-defined threshold.

We note that Equation (8) often cannot be evaluated directly and may be prohibitively expensive to compute via numerical integration. As such, we utilize the kth-Nearest-Neighbor (kNN) estimate of mutual information described fully in [22, 26]. We detail our kNN estimate procedure in Appendix B.

Modified mutual information for time series data

When collecting clinical data to determine a tumor's response to treatment, we consider the set of available design conditions to be a series of time steps at which data can be collected sequentially. We note that it is critical to collect measurements at multiple time points to achieve an accurate calibration of the model. Thus, we adapt the calibration framework described in the previous subsection to apply to time-series data. We present this adaptation using the one-compartment model as the low-fidelity model, and note that it can be extended to the two-compartment model by specifying the type of data to be collected (i.e., total tumor volume versus necrotic volume), in addition to the time that it is to be collected.

Let t_1, t_2, ..., t_{n_T} denote the set of n_T times at which high-fidelity data can be collected, with t_i < t_j for i < j. We use d̃(t_i) to denote the value of the high-fidelity data, e.g. the tumor volume as evaluated by the CA model, collected at time t_i; the low-fidelity model prediction at t_i is denoted by d(t_i). Suppose that in the previous step of the algorithm, the data point d̃(t_r) was chosen to append to the data set, which we denote by D_r.
When choosing the next data point to add to this data set, for all i > r, we let I(θ; d(t_i) | D_r) denote the mutual information between the low-fidelity model parameters, θ, and the high-fidelity data collected at time t_i, as defined in the general setting in Equation (8).

In some cases, data collected at later time points may provide more overall mutual information, but choosing such a data point then precludes the subsequent collection of data at any previous time. In order to account for this trade-off, we define a score function based on mutual information, which rewards the user for selecting a point with a large mutual information, while penalizing the user for losing the information that could have been collected from the skipped time points. Before presenting this score function, we first define the mutual information relative to the maximum mutual information at a given step in the algorithm. Let I*_r denote the maximum possible mutual information in the step after data point d̃(t_r) has been chosen, defined as follows:

I*_r = max_{i>r} I(θ; d(t_i) | D_r).

Next we define the corresponding relative mutual information provided by each data prediction d(t_i) by

R(i, r) = I(θ; d(t_i) | D_r) / I*_r.

Note that 0 ≤ R(i, r) ≤ 1 for all i > r. Using this notation, we are now ready to define our score function S_k(i, r), which combines the relative mutual information obtained from choosing data point d(t_i), with a penalty term that captures skipped information. We use the parameter k to vary the weight of the penalty term.
We define the score function as follows:

S_k(i, r) = R(i, r) − k · [ Σ_{j=r+1}^{i−1} R(j, r) ] / [ Σ_{l=r+1}^{n_T} R(l, r) ],    (9)

where the sum in the numerator denotes the total amount of relative mutual information that will be lost by choosing data point d̃(t_i), and the sum in the denominator denotes the total relative mutual information from all data points that may be collected after d̃(t_r).

In Sections 4.2–4.4, we use the score function S_k(i, r) to determine the optimal high-fidelity data points to collect for use in low-fidelity model calibration. At each step of the model calibration algorithm, we choose the data point that maximizes S_k(i, r), that is, d̃(t_{i*}) such that i* = arg max_{r < i ≤ n_T} S_k(i, r). We test this process multiple times, varying the value of the penalty weight parameter, k, between 0 and 1, in order to investigate the effect of k on the temporal data collection sequence. Note that k = 0 is equivalent to using solely the mutual information to determine the next design choice.

Results

In the following section, we outline the results of our algorithm applied to a variety of scenarios. First, we apply our algorithm in a simplified setting where we assume that the clinician has the ability to collect one scan per week, and look for a pattern in the days chosen for scanning. Then, we extend the use of the algorithm to scenarios in which n scans can be collected at any days during the treatment cycle, and investigate how the algorithm performs for both the one- and two-compartment models for three radiosensitivity levels: high, medium, and low. In all of the following scenarios, we fix the pre-treatment parameters (A and B in the one-compartment model, and λ, K, η, and ζ in the two-compartment model) at the values listed in Appendix A, assuming that these parameters have been identified prior to the start of the treatment regimen. Additionally, we fix α = 0.14 to avoid identifiability issues, as discussed in [15]. Thus, our focus in this investigation is in determining the value of radiosensitivity parameter β as quickly and accurately as possible, so as to increase the predictive power of our models and allow for the alteration or discontinuation of a treatment protocol that is predicted to be ineffective.

Budget of one scan per week

Frequently in the clinical setting, tumor data collection is constrained to a strict budget due to limited resources. As an example scenario, we consider the case in which one tumor volume scan can be taken per week during the weeks in which treatment is administered. In addition to assuming that data has been collected prior to the start of treatment at days 5 and 10, we automatically provide the scan for day 15 (day one of treatment week one) to obtain an initial fit for parameter β, and then enforce a budget of one scan per week in our mutual information framework with the one-compartment ODE as the low-fidelity model, to determine the optimal day to collect weekly data in weeks 2–6. We complete this procedure for three radiosensitivity levels: high, medium, and low. Table 1 shows the optimal days chosen for each radiosensitivity level, with the day number in the weekly treatment cycle indicated in parentheses (i.e. 1 corresponds to Monday, the first day of treatment each week, and 6 corresponds to Saturday, the day after five sequential days of treatment). Figure 2 provides a visualization of the results summarized in the table.

We observe that in both the high and medium radiosensitivity cases, the first day of each weekly treatment cycle consistently provides the highest level of information for parameter calibration. However, we note that in the high radiosensitivity case, the tumor volume is nearly zero after the first few treatment weeks, so during the last three cycles, the day 1 scan provides just slightly more mutual information than the other available scans. Figure 3
Figure 3displays the final model fits for each radiosensitivity case, calibrated using all eight selectedscans. In the low radiosensitivity case, the optimal choices are the first day of the treatmentcycle in week 2 and the second day of the cycle in week 3—we note that in week 3, days 1and 2 of the cycle provide nearly identical levels of mutual information. In the subsequentthree weeks, day 6 (the Saturday after the treatment cycle ends) provides the highest level ofmutual information. Our results suggest that in the second half of the treatment protocol forthis low radiosensitivity case, it becomes most important to assess the full extent of the tumorreduction from each week of radiation doses, encoded in the scan collected on day 6 of eachweek. f a clinician has the resources to collect tumor volume data exactly once per week duringtreatment, then our results suggest that they should begin by collecting this data on the firstday of each treatment cycle. If the tumor appears to respond well to the radiotherapy after thefirst few weeks, then they should continue to collect data on the first treatment day. However,if the tumor is responding slowly after three weeks—for instance, if nearly half of the pre-treatment tumor remains, as in our low radiosensitivity case—then the methodology suggestsswitching data collection to the end of the week. In particular, it would be optimal in such ascenario to conduct measurements on Saturday, the day following the five-day treatment cycle,for the final three weeks, to provide maximal information for model calibration. Initial Week 1 Week 2 Week 3 Week 4 Week 5 Week 6
High
Medium
4.2 Scenario 2: One-compartment model with a budget of n scans

In this section, we explore the possibility of designing a more effective scanning schedule to calibrate the one-compartment model, given by Equation (2). We use our proposed score function, (9) in Section 3.2, to determine the scanning schedule. In particular, we study how the suggested scanning schedule changes for different values of the score function parameter k in Equation (9). Moreover, we aim to find an optimal k for each radiosensitivity case with respect to the accuracy of the model calibration.

Figure 3: Budget of one scan per week. The fitted models are shown after calibrating with one data point per week, for all six treatment weeks. All three radiosensitivity cases are shown, with high radiosensitivity on the left, medium radiosensitivity in the middle, and low radiosensitivity on the right.

To quantify the accuracy of the calibration to data, we define

    error = ||d_h − d_l|| / ||d_h||,    (10)

where d_h = [d_h(t_1), ..., d_h(t_{n_T})] is the vector of high-fidelity data at all days during the treatment period, and d_l = [V(t_1), ..., V(t_{n_T})] is the low-fidelity model approximation evaluated at those same days. The error includes all the data points up to the final time t_{n_T}, so that it can be regarded as a measure of predictive power as well.

Figure 4 compares the choice of scan for different values of the score function parameter k. We observe that using larger values of k tends to result in the inclusion of more scans from earlier time points, while small values of k often skip those earlier points as a result of the smaller penalty for choosing later points.

Figure 4: One-compartment model. Choice of scan (×) for different values of score function parameter k = 0, 0.2, ..., 0.8, 1. We observe that using larger values of parameter k tends to favor the choice of earlier time points, since the choice of point is more heavily penalized for skipping over days. On the other hand, smaller values of k allow the algorithm to skip over earlier time points in favor of gathering points with larger mutual information towards the end of the treatment period.

A sampling of score function values for k = 0 and k = 0.5 is shown in Figure 5. For k = 0, which coincides with the standard mutual information of Equation (8), we observe that the score function values vary widely within a single week. In particular, either the first or the last day of the week has large score function values, and thus will be selected. Therefore, by using this score function, the intermediate time points of scans will be skipped. However, by using our score function with k = 0.5, more weight is given to earlier time points. This makes it possible to have more frequent earlier scan choices, as shown in Figure 4.

Figure 5: One-compartment model for high radiosensitivity case. Score function S_k(i, r) (9) with respect to i ∈ [r + 1, n_T] for different values of score function parameter k = 0 and k = 0.5, when choosing the third (left), seventh (middle), and tenth (right) scan. Larger values of k, for example k = 0.5, favor earlier time points, while for k = 0 the optimal scan choice will skip some of the earlier points.

In Figure 6, we compare the relative error of model fitness to the data, as defined in Equation (10), with respect to the number of scans for different values of the score function parameter k in three radiosensitivity levels: high, medium, and low. We observe that when using k = 0, the error is reduced most rapidly in the small scan number range. In the case of medium radiosensitivity, small k values perform well for limited budgets, with k = 0 giving the most accurate result; however, larger k values give more accurate results as the available scan number becomes larger. In the high radiosensitivity case, k close to 1 gives the most accurate result with nearly any scan budget. We conclude that using k = 0 or a relatively small value of k is preferable when the available scan number is small, especially when the tumor is not responsive to treatment, as these small k values allow one to skip forward and obtain at least one measurement towards the end of the treatment cycle so as to capture the overall trend in the data. However, when the available scan number increases, larger values of k become preferable.

Figure 6: One-compartment model. Relative error (10) with respect to the number of scans for different k values (top), and error with respect to k using a fixed number of 6, 12, ..., 30 scans (bottom). In general, when the scan number is limited to a small number (6 and 12 scans), k = 0 gives accurate results, particularly when the tumor is less responsive to radiotherapy. However, if a large number of scans is available, k > 0 is preferable.

              k = 0     k = 0.5   k = 0.8   k = 1
    High      0.0838    0.0945    0.1100    0.1194
    Medium    0.0526    0.0694    0.0740    0.0745
    Low       0.0225    0.0248    0.0239    0.0225

Table 2: One-compartment model. Final calibrated β values for the scenarios shown in Figure 8.
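The relative-error metric of Equation (10) is straightforward to compute; below is a minimal sketch in Python, where the norm is assumed to be Euclidean (the specific norm is not fixed by the notation above).

```python
import numpy as np

def relative_error(d_high, d_low):
    """Relative error of Equation (10): ||d_h - d_l|| / ||d_h||,
    evaluated over all treatment days (Euclidean norm assumed)."""
    d_high = np.asarray(d_high, dtype=float)
    d_low = np.asarray(d_low, dtype=float)
    return np.linalg.norm(d_high - d_low) / np.linalg.norm(d_high)
```

Because the error is computed over the full treatment period, including days beyond the last calibration point, it doubles as a measure of predictive power.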
We observe that while k = 0 uses the fewest number of scans, a fact that makes it the more appealing option in terms of expense, using more scans, as with the larger k values, results in a significantly different final value in the high and medium radiosensitivity cases, suggesting that more scans may be needed for β to converge to a best estimate.

Figure 7 shows the choice of high-fidelity data for k = 0 and k = 0.5, and the fitted model prediction using the one-compartment model (2). When the tumor is highly responsive to radiotherapy, we observe that larger values of k, for example k = 0.5, produce a more accurate fit; when the tumor is less responsive, any k, including k = 0, gives fairly accurate results. However, the accuracy can be improved with a larger number of scans, using larger k values. This is supported by Figure 8, where it can be seen that k = 0 results in the fewest total scans, but including more scans (as with the use of larger k values) may lead to a more refined final parameter estimate, particularly for the low and medium radiosensitivity cases; the final calibrated β parameter values are listed in Table 2 for each of the scenarios illustrated in Figure 8. We see this theme recurring throughout our investigation: small k values require fewer scans and are thus less expensive to implement, but in many scenarios, particularly those in which the overall gradient of the data is steeper, choosing a small k results in larger prediction errors.

Figure 7: One-compartment model. Choice of high-fidelity data (•) among the potential data (◦) using a 12-scan budget, and the fitted model prediction (–) of tumor volume using Equation (2) for parameter values k = 0 and 0.5. We observe a better fit using k = 0.5 in the case of high radiosensitivity, and using k = 0 in the case of low radiosensitivity.

Figure 8: One-compartment model. Estimates of β after the addition of each high-fidelity data point. As more information is gathered, we observe shrinkage in the IQR, demonstrating increased precision, and convergence to the final parameter estimate, demonstrating increased accuracy. The red dashed lines indicate the final value of the β estimate. For the high and medium radiosensitivity cases, it can be seen that using k = 0 does not allow the parameter estimate to fully converge to its actual value, leading to larger errors. In these cases, where the gradient of the data is rapidly changing, we are better off using more scans to achieve our desired accuracy in the final fit.

4.3 Scenario 3: Two-compartment model with a budget of n scans

In this section, we present results of the selected scanning schedule and the accuracy of the calibration in the two-compartment model (3), obtained by selecting either the tumor volume data or the necrotic volume data at each step in the algorithm. Though only one metric can be chosen at a time, we do allow for the next step in the algorithm to pick the second metric at the same time point; i.e., if the algorithm first chooses to measure the tumor volume at day 16, the next point chosen could be the necrotic volume at day 16. The results are similar to the one-compartment model in Section 4.2.

In Figure 9, we observe that the selected scanning schedule becomes more refined in the earlier times for larger values of the score function parameter k. We observe that our mutual information criterion often chooses the necrotic volume data over the tumor volume data. This is reasonable, considering that the necrotic volume is more informative for estimating the effect of radiotherapy parameters accurately, since the administration of RT has an immediate effect on the composition of the tumor through the conversion of viable cells to necrotic cells, but only a delayed effect on the total tumor volume.

We remark that choosing necrotic volume data may result in non-monotonic decay in error, due to the fact that providing only necrotic volume without tumor volume can deteriorate the accuracy in predicting total tumor volume, as shown in Figure 10. However, the optimal k value with respect to accuracy displays similar results to the one-compartment model. In the case of high radiosensitivity, larger k values, especially k = 1, provide the most accurate result. For the medium to low radiosensitivity cases, k = 0 is more accurate when the scan budget is small (for example, less than 25 scans), as it is able to capture some of the final points in this limited scan budget and is thus cognizant of the overall shape of the data across the full treatment period. However, larger values of k (i.e., k ≥ 0.8) provide a more accurate final fit when large scan budgets are allowed; the inclusion of additional scans in the beginning of the treatment regimen allows for a more refined fit.
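The sequential selection procedure described above can be sketched as a greedy loop. The exact form of S_k(i, r) is given in Equation (9) and is not reproduced here; purely for illustration, we assume a penalized score of the form S_k(i, r) = MI(i) − k(i − r − 1), i.e., the mutual information of day i minus a penalty of weight k per day skipped after the last selected day r. The MI values are a toy input array, not values from the paper.

```python
def select_scans(mi, budget, k):
    """Greedy scan selection sketch. Starting after the last chosen day r,
    pick the day i > r maximizing the assumed penalized score
        S_k(i, r) = mi[i] - k * (i - r - 1),
    repeating until the scan budget is exhausted or no days remain."""
    chosen, r = [], -1
    for _ in range(budget):
        candidates = range(r + 1, len(mi))
        if not candidates:  # no later days left to scan
            break
        scores = {i: mi[i] - k * (i - r - 1) for i in candidates}
        r = max(scores, key=scores.get)  # day with the largest penalized score
        chosen.append(r)
    return chosen
```

With a toy MI profile that peaks on the final day, `select_scans([0.1, 0.2, 0.3, 0.4, 1.0], budget=2, k=0.0)` jumps straight to the last day, while `k = 1.0` keeps the selections consecutive from the start, mirroring the qualitative behavior reported above.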
Figure 9: Two-compartment model with one type of point chosen at each step of the algorithm. Choice of scan of either tumor volume (×) or necrotic volume (◦) for different values of score function parameter k = 0, 0.2, ..., 1. Larger values of k result in choosing earlier time points in all three radiosensitivity levels.

Figure 10: Two-compartment model with one type of point chosen at each step of the algorithm. Relative error of model fitness to the data, as defined in Equation (10), with respect to the number of scans for each of three radiosensitivity levels. Using k = 1 gives the most accurate result in the high radiosensitivity level, while k = 0 is most beneficial for a low radiosensitivity response.

4.4 Scenario 4: Two-compartment model in practical setting

While the two-compartment model analysis described in Section 4.3 gives an interesting look at the performance of the algorithm when presented with two alternate metric choices at each time, from a practical standpoint, a clinician who chooses to measure a necrotic proportion at time t would essentially get a tumor volume estimate at time t for "free" from the imaging scan. Thus, in this section we repeat the analysis from Section 4.3, but this time incorporate a tumor volume measurement automatically whenever a necrotic proportion metric is chosen, to better represent the clinical setting.

In Figure 11, we display the design conditions chosen for each of six different k values. In the high radiosensitivity case, there are numerous scenarios in which the MI score function selects tumor volume as the most informative metric, particularly for large values of k, when there is a larger penalty for skipping days. In the medium and low radiosensitivity cases, however, the algorithm nearly always chooses necrotic proportion as the more informative metric. Generally, these results agree with those found when tumor volume was not automatically incorporated in Section 4.3.

As in the previous two-compartment model analysis, the high radiosensitivity simulation favors larger score function parameter k values in terms of prediction error, while the other two simulations favor smaller k values when the scan budget is limited; see Figure 12.
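The metric-selection rule of this practical setting can be sketched as follows. The per-condition mutual-information scores here are toy inputs (in the actual algorithm they are re-estimated after each calibration step), and the helper name `select_practical` is ours; the one rule taken from the text is that a necrotic-volume scan also yields the tumor volume at the same day for free.

```python
def select_practical(mi_vol, mi_nec, budget):
    """Sketch of metric selection in the practical setting: at each step pick
    the unobserved (day, metric) pair with the largest score; choosing the
    necrotic metric marks the same-day tumor volume as observed at no cost,
    since both come from one imaging scan."""
    observed, schedule = set(), []
    for _ in range(budget):
        best = None
        for day in range(len(mi_vol)):
            for metric, score in (("volume", mi_vol[day]), ("necrotic", mi_nec[day])):
                if (day, metric) not in observed and (best is None or score > best[0]):
                    best = (score, day, metric)
        if best is None:  # every condition already observed
            break
        _, day, metric = best
        observed.add((day, metric))
        schedule.append((day, metric))
        if metric == "necrotic":          # tumor volume comes free with the scan
            observed.add((day, "volume"))
    return schedule
```

For example, with a single day whose necrotic score exceeds its volume score, the necrotic scan is chosen and the free volume measurement exhausts the remaining conditions, so a budget of two yields only one selected scan.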
Moreover, we emphasize that the prediction error using the two-compartment model is smaller compared to using the one-compartment model when the same number of scans is available, although this is likely due to the additional information about the necrotic fraction. For example, we observe that the two-compartment model using as few as 6 scans in the high radiosensitivity case with a large k value, or in the medium radiosensitivity case with k = 0, shows far more accurate results compared to using the same number of scans with the one-compartment model; see Figure 13, where we compare the error with respect to the number of scans and time in days, in addition to the calibrated data fit in Figure 14.

Figure 11: Two-compartment model with tumor volume (×) automatically included when necrotic volume (◦) is chosen. Choice of scan for different values of score function parameter k = 0, 0.2, ..., 1. Larger values of k result in choosing earlier time points in all three radiosensitivity levels.

Figure 12: Two-compartment model testing with tumor volume automatically included when necrotic volume is chosen. Plots of error (10) vs. k. In the case of high radiosensitivity, using larger values of k is preferable; in the medium and low radiosensitivity cases, k = 0 gives the most accurate result when using 6 scans. However, larger values of k yield more accurate predictions when larger scan budgets are available.

Figure 13: Comparison of accuracy between the one-compartment model calibration (2) and two-compartment model calibration (3). The shown result compares the error (10) with respect to scan number (top) and time in days (bottom). Using the same number of scans, the two-compartment model shows more accurate results in high and medium radiosensitivity, especially using our recommended k value. In the case of low radiosensitivity, the one-compartment model with k = 0 shows better accuracy for scan budgets around 10 to 20.

Figure 14: Two-compartment model. The choice of tumor volume data (•) and necrotic volume data (•) among the potential data (◦, ◦) and the fitted model results are plotted. The results show how collecting both tumor volume and necrotic fraction and using the two-compartment model calibration can improve the model prediction significantly over the one-compartment model. In particular, a large k in the high radiosensitivity case and k = 0 in the medium radiosensitivity case show the best calibrated results.

In what follows, we summarize the major observations from our simulation results in a clinically relevant manner. However, we note that these observations are based on very specific scenarios; results may differ when this methodology is applied to different tumor growth models, treatment regimens, or data collection protocols.

In our first scenario, which investigated this framework under the assumption that a clinician could collect tumor volume data once per week, our results suggest that the scan should be taken on the first day of treatment for the first three weeks. At this point, if roughly half or more of the pre-treatment tumor volume still remains, it would be beneficial to switch to day 6 (Saturday) for weekly data collection, in order to maximize the mutual information provided by each scan.

However, if clinicians are not restricted to taking one scan per week, as in Scenarios 2-4, we use our score function, S_k(i, r), defined in Equation (9), to enforce a penalty for choosing later scans. Our results suggest that in the high radiosensitivity case, large values of the penalty parameter k give the optimal scan schedule, while in the low radiosensitivity case k = 0 is preferable, especially when the scan budget is small; the use of a small k allows the algorithm to skip ahead and obtain data toward the end of the treatment protocol so that an overall data trend can be captured. Such a score function recommends data collection largely on the first and last days of the week, skipping most intermediate time points. In the case of medium radiosensitivity, k = 0 is optimal when the scan budget is less than 15, with larger k values favored to improve accuracy as more scans are available.

In general, the optimal choice of k is highly dependent upon the shape of the data; data with a rapidly changing gradient (i.e., high radiosensitivity) will require different handling of the scanning schedule than data that is nearly constant, such as our low radiosensitivity scenario. This suggests that we might benefit from updating k as we learn about a patient's sensitivity to radiotherapy, thus gaining information about the trend of their data. Though this will require an abundance of future work to investigate in depth, we give the following examples of the type of suggestion that might be made to clinicians based on our preliminary results:

If only total tumor volume is measured, then for a small scan budget, our results recommend starting with a score function with small k. Then, if the patient is highly responsive to radiotherapy, increase k; if the patient is less responsive to radiotherapy, reduce to k = 0. When the budget of total scans is high (for example, more than 15), we suggest starting with a larger k, and then increasing k further if the patient is highly responsive to radiotherapy.

If both the tumor volume and necrotic volume can be measured, then for a small scan budget, our results suggest starting with a score function with a moderate value of k. Similarly to the one-compartment case, we might further recommend increasing k for highly responsive patients, or reducing k to k = 0 for less responsive patients. For a scan budget at or above 15 scans, we suggest using a non-zero k in all scenarios.

In summary, we have illustrated a Bayesian information-theoretic framework for determining the optimal sampling of time-series data in order to accurately calibrate low-fidelity models for use in clinical decision-making. We applied this methodology to one- and two-compartment ODE models of tumor response to radiotherapy, calibrated using synthetic data generated from a cellular automaton model, simulating tumors with varying radiosensitivity levels.

We first enforced a budget of exactly one tumor volume scan per week, and used the mutual information between the low-fidelity model parameters and high-fidelity data to determine the optimal day within each week for data collection. Our results suggested that clinicians with a weekly scan budget should start by measuring the tumor volume on the first day of the weekly treatment regime.
If the tumor does not respond well after three weeks of treatment, then scans should be collected on the sixth day of the treatment schedule for the final three weeks, to garner additional information about the magnitude of tumor reduction over the course of one dosage cycle; otherwise, scans should continue to be collected on day 1.

In order to relax the scan schedule restrictions to allow for n scans collected at any time throughout the course of treatment, we developed and applied an algorithm relying on a score function, which rewards the user for choosing a data point that yields a large mutual information between the low-fidelity model parameters and high-fidelity data, and enforces a penalty with weight k for skipping forward to later time points. We tested this algorithm on both the one- and two-compartment models; for the latter, we first incorporated a single metric at each step, and then repeated the analysis in a more practical setting by providing the total tumor volume whenever the necrotic volume metric was chosen. We assessed the predictive power of the model resulting from the calibration algorithm using the error between all high-fidelity data points and the low-fidelity model approximation.

In all cases, we observed the algorithm favoring a larger penalty parameter, k, for tumors with a high level of radiosensitivity and any scan budget size; for cases such as this, the accuracy of the model benefits from the inclusion of data at numerous early time points. For tumors with low and medium levels of radiosensitivity and a small scan budget, the calibration was most accurate when using small k values, allowing for samples interspersed throughout the full treatment schedule and observance of the overall data trend. However, in these lower radiosensitivity cases, larger k values became more favorable as the scan budget increased, allowing for the inclusion of more data from early in the treatment regimen.
When calibrating the two-compartment model, the algorithm largely preferred necrotic data to tumor volume, with some exceptions in the high radiosensitivity case. Overall, we observed a smaller prediction error when using the two-compartment model than when using the one-compartment model, due to the additional information provided by the necrotic fraction.

Although this work used only fixed values of the penalization parameter k, as future work we propose the development of an adaptive framework to alter k throughout the course of the treatment regimen as we gain information about a patient's sensitivity to treatment. Additionally, we plan to further investigate the robustness of the optimal measurement protocol across different potential patients, based not only on radiosensitivity, but also on other characteristics, including tumor aggressiveness and micro-environment, and to study the impact of measurement errors on the optimal measurement protocol. The results in this work are highly dependent upon the models, treatment regimen, and data collection protocols chosen. We plan to expand this investigation to demonstrate the effectiveness of this framework on models that can incorporate two less closely-related metrics, such as tumor volume and immune cell count, as well as models containing other treatment types, such as chemotherapy, immunotherapy, or some combination of these modalities with radiotherapy. We plan to use such extensions of our scanning framework to help inform decision-making about adjusting therapy types and schedules during tumor treatment at the clinical level.

We would like to thank Dr. Helen Byrne of the University of Oxford for providing helpful feedback.
The following abbreviations are used in this manuscript:

    ODE     Ordinary Differential Equation
    L-Q     Linear-Quadratic
    RT      Radiotherapy
    CA      Cellular Automaton
    kNN     kth-Nearest-Neighbor
    DRAM    Delayed Rejection Adaptive Metropolis

A Parameter Values
    Parameter   Description                                           Value          Units
    l           Cell size                                             0.0018         cm
    L           Domain length                                         0.36           cm
    τ̄_cycle     Mean (standard deviation) cell cycle time             18.3 (1.4)     h
    c_∞         Background O2 concentration                           2. × 10^−      mol cm^−3
    D           O2 diffusion constant                                 1. × 10^−      cm^2 s^−1
    c_Q         O2 concentration threshold for proliferating cells    1.82 × 10^−    mol cm^−3
    c_N         O2 concentration threshold for quiescent cells        1.68 × 10^−    mol cm^−3
    κ_P         O2 consumption rate of proliferating cells            1.0 × 10^−     mol cm^−3 s^−1
    κ_Q         O2 consumption rate of quiescent cells                5.0 × 10^−     mol cm^−3 s^−1
    p_NR        Rate of lysis of necrotic cells                       0.015          hr^−1

Table 3: A summary of the parameters used in the CA model and their default values. Parameter values are estimated using experimental data from the prostate cancer cell line, PC3, in [14].

[Table of calibrated parameter values A, B, λ, k, η, ζ for the one-compartment and two-compartment models in the high, medium, and low radiosensitivity cases.]

B kNN estimate of mutual information
Our definition of mutual information in Equation (8) requires an integral that often cannot be evaluated directly and may be prohibitively expensive, even using many numerical methods. As such, we utilize the kNN (kth-Nearest-Neighbor) estimate of mutual information to make our procedure computationally feasible. This estimate (and a proof of convergence) was fully developed in [22]; below, we summarize the major details.

To begin, at each calibration step, we estimate the posterior distribution p(θ | D_{n−1}) using the Delayed Rejection Adaptive Metropolis (DRAM) algorithm for model calibration [33]. This variation on a traditional Metropolis-Hastings algorithm includes an additional two steps: an adaptation step that allows for periodic updating of the covariance matrix as information is learned about parameter dependencies, and a delayed rejection step, whereby the algorithm avoids stagnating for long periods of time by replacing rejected parameter set candidates with an alternate candidate chosen from a narrowed proposal distribution. We refer the interested reader to [33, 34] for additional details about the DRAM algorithm, and to [26] for more information on how this calibration procedure fits into our mutual information framework.

To estimate the mutual information between the low-fidelity model parameters and a specified design condition ξ_n, we draw M samples from the prior distribution p(θ | D_{n−1}) computed via our Metropolis algorithm, and append low-fidelity model estimates to each parameter set drawn; this results in a chain X = (θ, d_n(ξ_n)) of size M × (p + 1).
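The kNN estimator of [22] applied to such a chain can be sketched compactly. The code below is an illustration only, assuming a scalar parameter θ and a scalar model output d_n (so the chain is M × 2); it uses pairwise distances rather than a tree structure, which is fine for the modest M used here.

```python
import numpy as np
from scipy.special import digamma

def knn_mutual_information(x, y, k=6):
    """kNN (Kraskov-Stogbauer-Grassberger) estimate of I(x; y) from [22].
    x, y: 1-D arrays of M approximately independent samples, e.g. thinned
    posterior draws of a parameter and the matching low-fidelity outputs."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    M = len(x)
    dx = np.abs(x[:, None] - x[None, :])   # pairwise distances in the theta marginal
    dy = np.abs(y[:, None] - y[None, :])   # pairwise distances in the d_n marginal
    dj = np.maximum(dx, dy)                # joint distances in the sup-norm
    # eps(i)/2 = sup-norm distance from X_i to its k-th nearest neighbor
    # (index k skips the zero self-distance in each sorted row)
    eps_half = np.sort(dj, axis=1)[:, k]
    # counts of points strictly within eps(i)/2 in each marginal subspace,
    # excluding the point itself
    n_x = np.sum(dx < eps_half[:, None], axis=1) - 1
    n_y = np.sum(dy < eps_half[:, None], axis=1) - 1
    return digamma(k) - np.mean(digamma(n_x + 1) + digamma(n_y + 1)) + digamma(M)
```

On independent samples the estimate fluctuates around zero, while strongly dependent samples yield a large positive value; the marginal counts n_x and n_y play the roles of n_θ(i) and n_d(i) in this appendix.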
We satisfy the kNN algorithm assumption of independent parameter samples by systematically thinning our dependent Metropolis chains, selecting every 10th sample in the posterior distribution to include in our set of M samples.

Following the procedure laid out in [22], for each chain element X_i, we compute the distance ε(i)/2 = ||X_i − X_k(i)||_∞, where X_k(i) represents the kth-nearest neighbor to X_i in the chain {X_i}_{i=1}^M (determined using the supremum norm). We determine the numbers of points n_θ(i) and n_d(i) in each marginal subspace that lie within ε(i)/2 of X_i, and estimate the mutual information as

    I(θ; d_n | D_{n−1}, ξ_n) ≈ ψ(k) − (1/M) [ Σ_{i=1}^M ψ(n_θ(i) + 1) + Σ_{i=1}^M ψ(n_d(i) + 1) ] + ψ(M),

where ψ(·) is the digamma function.

Figure 15: Calculation of ε(i), n_θ(i), and n_d(i) for the case k = 1, from [22]. Here we have n_θ(i) = 3 and n_d(i) = 4. Note that the kth-nearest neighbor is not included in the determination of n_θ and n_d.

For this study, we estimate mutual information using k = 6, as suggested in [23], though we note that one could calculate the mutual information for a vector of different k values and choose the value that yields the maximum mutual information.

References

[1] Philipp M. Altrock, Lin L. Liu, and Franziska Michor. The mathematics of cancer: Integrating quantitative models.
Nature Reviews Cancer, 15:730–745, 2015.

[2] Orit Lavi, Michael M. Gottesman, and Doron Levy. The dynamics of drug resistance: A mathematical perspective. Drug Resistance Updates, 15(1-2):90–97, 2012.

[3] Andrzej Swierniak, Marek Kimmel, and Jaroslaw Smieja. Mathematical modeling as a tool for planning anticancer therapy. European Journal of Pharmacology, 625:108–121, 2009.

[4] Helen M. Byrne. Dissecting cancer through mathematics: From the cell to the animal model. Nature Reviews Cancer, 10:221–230, 2010.

[5] Russell C. Rockne, Andrea Hawkins-Daarud, Kristin R. Swanson, James P. Sluka, James A. Glazier, Paul Macklin, David A. Hormuth, Angela M. Jarrett, Ernesto A.B.F. Lima, J. Tinsley Oden, George Biros, Thomas E. Yankeelov, Kit Curtius, Ibrahim Al Bakir, Dominik Wodarz, Natalia Komarova, Luis Aparicio, Mykola Bordyuh, Raul Rabadan, Stacey D. Finley, Heiko Enderling, Jimmy Caudell, Eduardo G. Moros, Alexander R.A. Anderson, Robert A. Gatenby, Artem Kaznatcheev, Peter Jeavons, Nikhil Krishnan, Julia Pelesko, Raoul R. Wadhwa, Nara Yoon, Daniel Nichol, Andriy Marusyk, Michael Hinczewski, and Jacob G. Scott. The 2019 mathematical oncology roadmap. Physical Biology, 16(041005):1–33, 2019.

[6] David A. Chambers, Eitan Amir, Ramy R. Saleh, Danielle Rodin, Nancy L. Keating, Travis J. Osterman, and James L. Chen. The Impact of Big Data Research on Practice, Policy, and Cancer Care. American Society of Clinical Oncology Educational Book, 39:e167–e175, 2019.

[7] Jean Emmanuel Bibault, Philippe Giraud, and Anita Burgun. Big Data and machine learning in radiation oncology: State of the art and future prospects. Cancer Letters, 382:110–117, 2016.

[8] Hope Murphy, Hana Jaafari, and Hana M. Dobrovolny. Differences in predictions of ODE models of tumor growth: A cautionary example. BMC Cancer, 16(163):1–10, 2016.

[9] Joe Collis, Anthony J. Connor, Marcin Paczkowski, Pavitra Kannan, Joe Pitt-Francis, Helen M. Byrne, and Matthew E. Hubbard. Bayesian Calibration, Validation and Uncertainty Quantification for Predictive Modelling of Tumour Growth: A Tutorial. Bulletin of Mathematical Biology, 79:939–974, 2017.

[10] James A. Koziol, Theresa J. Falls, and Jan E. Schnitzer. Different ODE models of tumor growth can deliver similar results. BMC Cancer, 20(226):1–10, 2020.

[11] Yingting Liu, Jeremy Purvis, Andrew Shih, Joshua Weinstein, Neeraj Agrawal, and Ravi Radhakrishnan. A multiscale computational approach to dissect early events in the erb family receptor mediated activation, differential signaling, and relevance to oncogenic transformations. Annals of Biomedical Engineering, 35:1012–1025, 2007.

[12] Ignacio Ramis-Conde, Mark A.J. Chaplain, Alexander R.A. Anderson, and Dirk Drasdo. Multi-scale modelling of cancer cell intravasation: The role of cadherins in metastasis. Physical Biology, 6(16008), 2009.

[13] Andrea Hawkins-Daarud, Serge Prudhomme, Kristoffer G. van der Zee, and J. Tinsley Oden. Bayesian calibration, validation, and uncertainty quantification of diffuse interface models of tumor growth. Journal of Mathematical Biology, 67:1457–1485, 2013.

[14] Pavitra Kannan, Marcin Paczkowski, Ana Miar, Joshua Owen, Warren Kretzschmar, Serena Lucotti, Jakob Kaeppler, Jianzhou Chen, Bostjan Markelc, Leoni Kunz-Schughart, Adrian Harris, Mike Partridge, and Helen Byrne. Radiation resistant cancer cells enhance the survival and resistance of sensitive cells in prostate spheroids. bioRxiv, 2019.

[15] H. Cho, A.L. Lewis, K.M. Storey, R. Jennings, B. Shtylla, A.M. Reynolds, and H.M. Byrne. A framework for performing data-driven modeling of tumor growth with radiotherapy treatment. Springer Special Issue: Using Mathematics to Understand Biological Complexity, Women in Mathematical Biology, Accepted, 2020.

[16] Howard D. Thames, H. Rodney Withers, Lester J. Peters, and Gilbert H. Fletcher. Changes in early and late radiation responses with altered dose fractionation: Implications for dose-survival relationships. International Journal of Radiation Oncology, Biology, Physics, 8:219–226, 1982.

[17] J. F. Fowler. The linear-quadratic formula and progress in fractionated radiotherapy. British Journal of Radiology, 62:679–694, 1989.

[18] R. Rockne, J. K. Rockhill, M. Mrugala, A. M. Spence, I. Kalet, K. Hendrickson, A. Lai, T. Cloughesy, E. C. Alvord, and K. R. Swanson. Predicting the efficacy of radiotherapy in individual glioblastoma patients in vivo: A mathematical modeling approach. Physics in Medicine and Biology, 55:3271–3285, 2010.

[19] David Corwin, Clay Holdsworth, Russell C. Rockne, Andrew D. Trister, Maciej M. Mrugala, Jason K. Rockhill, Robert D. Stewart, Mark Phillips, and Kristin R. Swanson. Toward patient-specific, biologically optimized radiation therapy plans for the treatment of glioblastoma. PLoS ONE, 8(e79115), 2013.

[20] Enakshi D. Sunassee, Dean Tan, Nathan Ji, Renee Brady, Eduardo G. Moros, Jimmy J. Caudell, Slav Yartsev, and Heiko Enderling. Proliferation saturation index in an adaptive Bayesian approach to predict patient-specific radiotherapy responses. International Journal of Radiation Biology, 95(10):1421–1426, 2019.

[21] C. E. Shannon. A Mathematical Theory of Communication. Bell System Technical Journal, 27:379–423, 1948.

[22] Alexander Kraskov, Harald Stögbauer, and Peter Grassberger. Estimating mutual information. Physical Review E, 69(6):066138, 2004.

[23] Gabriel Terejanu, Rochan R. Upadhyay, and Kenji Miki. Bayesian experimental design for the active nitridation of graphite by atomic nitrogen. Experimental Thermal and Fluid Science, 36:178–193, 2012.

[24] Corey Bryant and Gabriel Terejanu. An information-theoretic approach to optimally calibrate approximate models. In , 2012.

[25] Juliane Liepe, Sarah Filippi, Michał Komorowski, and Michael P.H. Stumpf. Maximizing the Information Content of Experiments in Systems Biology. PLoS Computational Biology, 9(1):1–13, 2013.

[26] Allison Lewis, Ralph Smith, Brian Williams, and Victor Figueroa. An information theoretic approach to use high-fidelity codes to calibrate low-fidelity codes. Journal of Computational Physics, 324:24–43, 2016.

[27] Eric J. Hall. Radiobiology for the radiologist. J.B. Lippincott, Philadelphia, 1994.

[28] Heiko Enderling, Mark A.J. Chaplain, and Philip Hahnfeldt. Quantitative Modeling of Tumor Dynamics and Radiotherapy. Acta Biotheoretica, 58(4):341–353, 2010.

[29] T. D. Lewin. Modelling the impact of heterogeneity in tumor composition on the response to fractionated radiotherapy. PhD thesis, University of Oxford, 2018.

[30] Thomas D. Lewin, Helen M. Byrne, Philip K. Maini, Jimmy J. Caudell, Eduardo G. Moros, and Heiko Enderling. The importance of dead material within a tumour on the dynamics in response to radiotherapy. Physics in Medicine and Biology, 65(1):015007, 2020.

[31] Thomas D. Lewin, Philip K. Maini, Eduardo G. Moros, Heiko Enderling, and Helen M. Byrne. A three phase model to investigate the effects of dead material on the growth of avascular tumours. Mathematical Modelling of Natural Phenomena, 15:1–29, 2020.

[32] D. E. Lea and D. G. Catcheside. The mechanism of the induction by radiation of chromosome aberrations in Tradescantia. Journal of Genetics, 44:216–245, 1942.

[33] Heikki Haario, Marko Laine, Antonietta Mira, and Eero Saksman. DRAM: Efficient adaptive MCMC. Statistics and Computing, 16(4):339–354, 2006.

[34] R.C. Smith. Uncertainty Quantification: Theory, Implementation, and Applications. SIAM, Philadelphia, PA, 2014.