Structural identifiability analysis of PDEs: A case study in continuous age-structured epidemic models
NNoname manuscript No. (will be inserted by the editor)
Structural identifiability analysis of PDEs:A case study in continuous age-structured epidemicmodels
Marissa Renardy · Denise Kirschner · Marisa Eisenberg
Received: date / Accepted: date
Abstract
Computational and mathematical models rely heavily on estimated pa-rameter values for model development. Identifiability analysis determines how wellthe parameters of a model can be estimated from experimental data. Identifiabilityanalysis is crucial for interpreting and determining confidence in model parametervalues and to provide biologically relevant predictions. Structural identifiabilityanalysis, in which one assumes data to be noiseless and arbitrarily fine-grained,has been extensively studied in the context of ordinary differential equation (ODE)models, but has not yet been widely explored for age-structured partial differen-tial equation (PDE) models. These models present additional difficulties due toincreased number of variables and partial derivatives as well as the presence ofboundary conditions. In this work, we establish a pipeline for structural identifi-ability analysis of age-structured PDE models using a differential algebra frame-work and derive identifiability results for specific age-structured models. We useepidemic models to demonstrate this framework because of their wide-spread usein many different diseases and for the corresponding parallel work previously donefor ODEs. In our application of the identifiability analysis pipeline, we focus ona Susceptible-Exposed-Infected model for which we compare identifiability resultsfor a PDE and corresponding ODE system and explore effects of age-dependentparameters on identifiability. We also show how practical identifiability analysiscan be applied in this example.
Keywords epidemiology · age structure · partial differential equations · tuberculosis · modeling M. RenardyUniversity of Michigan Medical School, Department of Microbiology and ImmunologyE-mail: [email protected]. KirschnerUniversity of Michigan Medical School, Department of Microbiology and ImmunologyM. EisenbergUniversity of Michigan, Department of EpidemiologyUniversity of Michigan, Department of MathematicsE-mail: [email protected] a r X i v : . [ q - b i o . Q M ] F e b Marissa Renardy et al.
Identifiability analysis addresses the question of whether the parameters of a math-ematical model can be uniquely identified, given observed data. There are two gen-eral types of identifiability analysis: structural (assuming data to be noiseless andarbitrarily fine-grained) and practical (using real data). These analyses are crucialto interpreting biologically relevant predictions from computational and mathe-matical models that rely on parameter values that are estimated from experimen-tal datasets. Previous studies have shown that unidentifiable models can lead tovastly different predictions for different parameter values that are able to producethe same observed data (Kao and Eisenberg, 2018). In the context of biologicalmodels, parameter identifiability gives increased confidence in model predictionsthat are based on estimated parameters. This is particularly relevant for predic-tions applied to topics such as diseases, treatment outcomes, vaccination efficacy,and future infection dynamics. Specifically, the use of epidemiological models hasbecome increasingly necessary and useful to predicting disease outcomes, as wehave seen in the case of COVID-19, and these efforts should be as accurate andconfident as possible to produce the most useful predictions. Thus, we focus ourwork in the context of epidemic models due to their use in many diseases andto compare with corresponding parallel work previously done for ODEs regardingstructural identifiability.Many theoretical results and algorithms for structural identifiability of linearand non-linear ordinary differential equations (ODEs) models have been derived(Miao et al., 2011; Audoly et al., 2001; Ljung and Glad, 1994; Pohjanpalo, 1978;Chappell et al., 1990; Cobelli and DiStefano, 1980; Chis et al., 2011) and havebeen applied to study an array of diseases. There are models for the spread ofcholera (Eisenberg et al., 2013), vector-borne diseases (Kao and Eisenberg, 2018;Zhu et al., 2018; Tuncer et al., 2016), and general infectious diseases (Tuncerand Le, 2018; Evans et al., 2005). Methods for structural identifiability are oftenanalytical in nature (e.g. the commonly used differential algebra approach (Audolyet al., 2001; Ljung and Glad, 1994)), but may also be numerical, or some blend ofthe two (Jacquez and Greif, 1985; Jacquez and Perry, 1990; Raue et al., 2009).ODE models are not always sufficient for representing disease dynamics when,for example, age is an important factor. In many diseases, different age groupsmay respond differently to diseases or disease interventions based on differences insusceptibility (Gardner, 1980), mortality (Castillo-Chavez et al., 1991), and con-tact rates (Mossong et al., 2008; Glasser et al., 2012). Age preferences have beenshown to have a significant effect on mixing patterns that drive disease spread(Valle et al., 2007). This type of information can be incorporated into a mathe-matical model for epidemics, for example, by explicitly representing age-structureddistributions of the susceptible and infected subpopulations. Age-structured mod-els have been used to evaluate and optimize age-targeted vaccination strategiesfor several diseases (Harris et al., 2019; Keeling and White, 2011; Hao et al., 2019;Castillo-Chavez and Feng, 1998; Shim et al., 2006), and it has been shown that population heterogeneity and non-random mixing can have a substantial effect onthe prediction of disease outbreaks and herd immunity (Glasser et al., 2016; Fenget al., 2015).In a continuous setting, age structure is introduced by creating state variablesthat are functions of both age and time, and also by introducing transport terms tructural identifiability analysis of PDEs 3 to represent constant aging of a population. These model formulations result inPDEs, rather than ODEs, making them more biologically accurate; however, suchmodels also typically more complex to analyze. There are other PDE formulationsof epidemic models such as age of infection models (Thieme and Castillo-Chavez,1993; Inaba and Sekine, 2004) or spatially explicit models (Bertuzzo et al., 2009;Wu, 2008). The focus of this paper will be specifically on age-structured models.While there are alternative approaches to incorporating age structure, such asto use compartmentalization to stratify populations (Chow et al., 2011; Fergusonet al., 1996) or to use a stochastic discrete representation such as an agent-basedmodel (Ajelli et al., 2010; Bian, 2004), we focus here on the PDE formulation.Importantly, the issues of both structural and practical parameter identifiabilityfor these types of models still needs to be elucidated.While structural identifiability analysis has been extensively applied in the con-text of ODE epidemic models, it has not been widely explored for age-structuredPDE models. PDE models present additional difficulties due to increased numberof variables and derivatives as well as the presence of boundary conditions. Iden-tifiability of age-structured models was first considered by Perasso et al. (2011)for a simple SI-type model and was again considered by Perasso and Razafison(2016) for a population dynamics model including only birth and death. Theseanalytical results have not yet been extended to more comprehensive epidemicmodels, though some identifiability results have been shown numerically (Tunceret al., 2016). For these types of problems, modelers have relied on practical iden-tifiability analyses which are dependent on either available data or data that hasbeen synthesized for the purpose of analysis.In this paper, we derive analytical identifiability results for more complex age-structured epidemic models using a differential algebra framework that is simplerto apply than the previous (aforementioned) previous approaches. We apply thismethodology to a compartmental epidemic Susceptible-Exposed-Infected (SEI)model both with and without age structure explicitly modeled, and we comparethe identifiability results under different conditions such as the presence of immi-gration in the system or for age-dependent model parameters. To corroborate ourstructural identifiability results, we also perform a practical identifiability anal-ysis for one version of the model. While we focus on age-structured models, themethods we present herein can be applied to other PDE formulations, with certainassumptions about smoothness and uniqueness of solutions. describe the underlying system of interest, and (2) a measurement model or set of output equations , which describe the observation or measurement of interest (i.e.,the quantities of interest for which one either has or plans to collect data). Inthis study, we assume the following general first-order partial differential equation(PDE) model structure, which is commonly used in age-structured PDE models
Marissa Renardy et al. in biology and health: ∂ x ∂t + ∂ x ∂a = f ( x , t, a, p , u ) y = g ( x , t, a, p ) (1)where x is a vector of state variables describing the system of interest, p is thevector of parameters, t and a are two independent variables. Here we considertime and age as the independent variables, although the approach given here willalso work for more general variables, as well as for additional derivatives of furtherindependent variables on the left hand side. Additionally, y represents the vectorof n y model outputs (measured variables) and u represents any known inputs(e.g. forcing functions) to the system, if they exist. Finally, f and g are rationalfunctions of the model variables and parameters such that the model can be writtenequivalently as differential polynomials by clearing denominators.In structural identifiability analysis, it is assumed that the inputs and outputsare known (observed) perfectly – in this case, we assume that u , and y are knownfor all times and all ages and that there is no noise in the data. Similarly, theindependent variables t and a are assumed to be known. Thus, the resulting iden-tifiability results can be interpreted as the upper limit of what can possibly beidentified with perfect data collection methods. However, structural identifiabilityis a necessary condition for identifiability in the realistic data case, making it animportant first step in the parameter estimation process.A model such as (1) can be thought of as a map from parameter space (e.g.for p ∈ R n p ) to output space (the space of trajectories for y ), often termed themodel map Φ : p (cid:55)→ y . With this framing, the structural identifiability questionbecomes that of whether or not the model map Φ is injective. In particular, wedefine structural identifiability as follows: Definition 1
For a model of the form (1), an individual parameter p in p is uniquely (or globally) structurally identifiable if for almost every value p ∗ andalmost all initial conditions, the equation y ( x , t, a, p ∗ ) = y ( x , t, a, p ) implies p = p ∗ . A parameter p is said to be non-uniquely (or locally) structurally identifiable if for almost any p ∗ and almost all initial conditions, the equation y ( x , t, a, p ∗ ) = y ( x , t, a, p ) implies that p has a finite number of solutions. Definition 2
Similarly, a model of the form (1) is said to be uniquely (respec-tively, non-uniquely ) structurally identifiable for a given choice of output y if ev-ery parameter is uniquely (respectively, non-uniquely) structurally identifiable,i.e. for almost every value p ∗ and almost all initial conditions, the equation y ( x , t, a, p ∗ ) = y ( x , t, a, p ) has only one solution, p = p ∗ (respectively, finitelymany solutions). Equivalently, a model is uniquely structurally identifiable fora given output if and only if the map Φ is injective almost everywhere, i.e. ifthere exists a unique set of parameter values p ∗ which yields a given trajectory y ( x , t, a, p ∗ ) almost everywhere. We note that because some specific parameter values or initial conditions mayrender an otherwise identifiable model unidentifiable (e.g. if all parameters or ini-tial conditions are zero), we define structural identifiability generically, i.e., foralmost all parameter values and initial conditions (Audoly et al., 2001; Pia Sacco-mani et al., 2003). tructural identifiability analysis of PDEs 5 n y monic polynomial equationsin terms of the known variables y , u , and their derivatives, with rational coefficientsin the parameters p . The input-output equations represent an implicit form of themodel map, and as a set of differential equations, they are input-output equivalentto the original system. This means that for the same inputs and parameter values,the equations generate the same output trajectories (Eisenberg, 2019). We thenconsider the map from the input parameter space to the coefficients of the resultingmonic polynomial.A key insight of the differential algebra approach in the rational ODE caseis that injectivity of the model map can be evaluated by considering the the co-efficient map, i.e. the map from parameters to coefficients of the input-outputequations (Audoly et al., 2001; Ljung and Glad, 1994). In the rational ODE case,the coefficient map typically provides complete identifiability information for themodel. If this map is injective, then the model is structurally identifiable. If themodel is unidentifiable, this map determines the identifiable combinations of pa-rameters. For example, while some model parameters may not be independentlyidentifiable, the product or sum of two or more model parameters may be identi-fiable. This may enable the model to be reparameterized in a way that allows themodel to become structurally identifiable (Chappell and Gunn, 1998; Cole et al.,2010). It should be noted that a set of identifiable combinations is not necessarilyunique – there may be many sets of parameter combinations that are equivalent.For example, the sets of identifiable combinations { ab, bc } and { ab, a/c } are equiv-alent since ( ab ) / ( bc ) = a/c .The differential algebra approach for ODE models typically uses characteristicsets to calculate the input-output equations (Audoly et al., 2001; Ljung and Glad,1994). Characteristic sets are a differential analog of the Gr¨obner basis (Ritt,1950) based on differential polynomial psuedodivision. For more details on thecharacteristic set approach, the reader is referred to (Audoly et al., 2001; Ljungand Glad, 1994; Pia Saccomani et al., 2003; Hong et al., 2020; Eisenberg, 2019).This methodology has previously been applied to ODE systems, where deriva-tives are taken with respect to only one variable. In this work, we generalize thismethodology to PDE systems of the form in (1); in particular, we consider age-structured systems where derivatives are taken with respect to both time andage. We show that the same framework can be applied to these systems, withthe additional consideration of boundary conditions. In age-structured systems, aboundary condition must be specified at a = 0, usually in the form of a Dirich- let boundary condition. Thus, in addition to performing identifiability analysis onthe model equations, one must also consider the equations evaluated at a = 0 toexplicitly include information at the boundary. This is analogous to methods fordetermining identifiability of initial conditions in ODE systems (Pia Saccomaniet al., 2003). Marissa Renardy et al. y , u and theirderivatives with respect to either independent variable and the parameters p ), weassume that our observed trajectory contains sufficiently many algebraically inde-pendent points that we can successfully identify the coefficients of the input-outputequations. In other words, from a given observed solution of y over age and time,we assume the observed solution contains an arbitrary number of distinct points atwhich the input-output equations can be evaluated to solve for the input-outputcoefficients (in particular, more distinct points than the number of coefficients). Inpractice most trajectories meet this criterion, although some care must be takento avoid systems or measure zero sets of initial conditions for which the solutionsare at steady state, or have key parameters equal to zero, etc. (Hong et al., 2020).With the coefficient-identifiability assumption (Eisenberg, 2019), the rest of thedifferential algebra approach follows as with the ODE case, and is described below.Procedurally, the process of performing identifiability analysis for PDEs is al-most identical to the ODE case, though substitutions may be made more compli-cated by the presence of an additional derivative. To obtain input-output equationsfrom model equations, a typical first step is to solve the output equations for statevariables and, using substitution, eliminate these state variables and their deriva-tives from the model equations. For simplicity, all nonzero terms in the modelequations are moved to the right-hand side. Remaining state variables can then beeliminated by solving one or more of the model equations. The particular substi-tutions required will of course vary depending on the model and output equations.The goal is to obtain a system of polynomial equations in terms of the outputsand their derivatives, with all state variables eliminated, whose coefficients are interms of the model parameters. In the case that square roots arise, for example,they must be eliminated by multiplying by the conjugate.Once input-output polynomial equations are obtained, we divide each equation by the coefficient of its leading term to obtain a monic polynomial (see below fordetails about the ranking of terms). From these monic polynomials, we obtain theset of polynomial coefficients that defines the coefficient map. This map, as in theODE case, can be used to determine complete identifiability information for themodel equations. tructural identifiability analysis of PDEs 7 Note that thus far, we have not considered any information about the boundaryconditions. Analysis of the boundary conditions is a key component of identifia-bility analysis for PDEs that does not occur in the ODE case. In the case ofage-structured systems, there is a single boundary condition at a = 0. Thus, wetreat the boundary condition as similar to an initial condition for ODEs (Pia Sac-comani et al., 2003). If information is known about the boundary conditions, suchas values or functional forms, this information can be substituted into the modelequations evaluated at a = 0 to obtain boundary condition equations . Since thevalues of model outputs are known at a = 0, the same identifiability analysis asabove can be applied to boundary condition equations. This may reveal additionalidentifiability information about the model or any parameters that are present inthe boundary conditions.We note that, as in the initial condition case, if identifiability is evaluatedwithout considering boundary conditions, this implicitly assumes that the bound-ary conditions are generic and unknown. However, there are some aspects of theboundary conditions which are more subtle than the ODE case of initial condi-tions, and may warrant further exploration, such as that boundary conditions stillmay vary over time. Exploring how different boundary condition types may affectidentifiability would be an important direction for future work.Our procedure for performing structural identifiability analysis, which can beapplied to both ODE and age-structured PDE systems, is outlined in Figure 1. Rewrite model equations as input-output equationsWrite input-output equations as monic polynomial equationsExtract polynomial coefficientsConsider the map from parameter space to the polynomial coefficients. Is it injective?All parameters are identifiable with generic boundary conditions Find “identifiable combinations” for which the map is one-to-one.
NoYes *for PDE case onlyIf applicable, evaluate identifiability of boundary condition equations
Fig. 1
Flowchart describing the methodology for identifiability analysis of ODE and age-structured PDE models based on using a differential algebra approach. Marissa Renardy et al. F = X T X , where X is the sensi-tivity matrix defined by X ( i,j ) = ∂y∂p i ( t j ) for parameters p , ..., p n and time points t , ..., t m . For an age structured system, we define F = X T X where the sensitivitymatrix is defined by X ( i,j ) = ∂y∂p i (( a, t ) j ) where ( a, t ) j for j = 1 , ..., m are pairsof age and time points that form a grid. Here, y denotes the observable modeloutput. For multiple model outputs (e.g., y and y ), the matrix X can be definedby concatenation (i.e., X = [ X ; X ]). The rank of F indicates the number ofidentifiable parameters and parameter combinations. If F is singular, the model isstructurally unidentifiable; if F is close to singular, the model may be practicallyunidentifiable. The FIM can also be used to determine which parameter subsetsare identifiable or unidentifiable (Eisenberg and Hayashi, 2014).We compute profile likelihoods by fixing a particular parameter p i and fittingall remaining parameters p j (cid:54) = i to maximize a likelihood function. We define thelikelihood by assuming constant, normally distributed measurement error withstandard deviation equal to 10% of the mean of the data, and mean equal to themodel trajectory. Maximizing the likelihood function is equivalent to minimizing acost function based on the negative log likelihood (in this case equivalent to leastsquares). This is performed at many values across the range of values for p i . Theprofile likelihood for p i is given by the maximum values of the likelihood functionacross the range of values for p i . If the profile likelihood is flat, p i is structurallyunidentifiable; if it is sufficiently shallow, p i is practically unidentifiable. For moredetails on profile likelihood approaches to identifiability, the reader is referred to(Raue et al., 2009; Venzon and Moolgavkar, 1988).In our analyses herein, we solved the age-structured PDEs numerically usingMethod of Lines, using a first order finite difference approximation of the spatialderivative, and using Matlab’s ODE solver ode45 . In several case studies below, we consider a simple mathematical model of theepidemiology of tuberculosis (TB). TB is transmitted via the aerosolized bac-terium
Mycobacterium tuberculosis from person to person via lung inhalation. TBis unique from many infectious diseases in that most people develop a latent in-fection, meaning that they present no symptoms and cannot transmit the disease,and the infection can remain latent for an extended period of time. Thus, ourmodel consists of three mutually exclusive compartments: Susceptible (S), Ex-posed/latent (E), and Infectious (I). Through contact with infectious individuals, susceptible individuals (called susceptibles) may progress to the exposed class (i.e.latent infection) or the infectious class (i.e. primary infection). This is slightly dif-ferent from the standard SEI model, in that some individuals bypass the exposedclass and immediately become infectious, but is similar to what has been used tomodel tuberculosis epidemics previously (Ozcaglar et al., 2012; Guzzetta et al., tructural identifiability analysis of PDEs 9
S E I ! " $ % ! & '( (1 − ,). Parameter Definition (.,/! " , ! $ , ! & birth ratetransmission rateprobability of primary infectionreactivation ratedeath rates Fig. 2
SEI diagram and parameter definitions for a mathematical model of TB epidemiology. S represents the susceptible population, E represents the exposed/latent population, and I represents the actively infectious population. Arrows describe flow between mutually exclusivecompartments. For simplicity, we first demonstrate the identifiability analysis framework on amodel with density-dependent transmission, i.e. transmission is proportional to SI rather than S IN . In this case, the per-individual rate of contact increases asthe population density increases. This is not necessarily a realistic model for TB,since TB is often spread only among close contacts, but simplifies the differentialalgebra for the purposes of demonstration.In order to perform parameter identifiability analysis, we must first define whatthe outputs (i.e., the quantities for which we could obtain reliable, biologically rel- evant time series data) of the specific model are. For all of the following examples,we will assume that we can observe a fraction of the exposed individuals, i.e. y E = k E E where k E is the reporting/detection rate. This type of data could beobtained by, for example, contact tracing to identify individuals who have beenexposed to TB disease, or testing a proportion of the population for exposure. This is a useful practice even in areas with a low TB burden such as the US, sincedetection and treatment of latent infection is crucial for TB elimination (LoBueand Mermin, 2017). Since latent disease is asymptomatic, it is highly unlikely thatall exposed/latent individuals will be identified, and thus we assume instead thatonly a fraction k E of the exposed individuals are identified.In addition, we may be able to observe all or some of either all active infections(prevalence) or new active infections (incidence). The relevant output functiondepends on the reporting and surveillance data for the region of interest. Forexample, if all infectious cases are monitored and reported then these data wouldrepresent total prevalence. If not all cases are detected due to misdiagnosis, forexample, then the data would represent only a fraction of the total prevalence.If new infectious cases are reported upon diagnosis but then lost to follow-up,then incidence is an observable output while prevalence is not. Again, since it ispossible that not all infected individuals will be diagnosed, we may only be ableto observe a fraction of incidence. We present here results for observing prevalenceover time, but the identifiability analysis of model parameters for incidence wouldfollow similarly.We assume that y I = k I I , i.e. that we are able to obtain data for a fraction ofthe total number of infections at any given time. We assume no other informationabout the system parameters or initial conditions.4.1 ODE SEI model with density-dependent transmissionThough identifiability analysis for ODE systems is well established, we include thetechnical details here for completeness and for comparison when we move to thePDE system. The model can be written as a system of ODEs: dSdt = c − βSI − µ S SdEdt = (1 − (cid:15) ) βSI − δE − µ E EdIdt = (cid:15)βSI + δE − µ I I (2)We begin by rewriting the system of equations (2) in terms of the outputs y E and y I . For ease of notation, we adopt the notation y (cid:48) = dydt . Using the substitutions E = y E k E and I = y I k I , we obtain the following equations: S (cid:48) = c − βS y I k I − µ S Sy (cid:48) E k E = (1 − (cid:15) ) βS y I k I − δ y E k E − µ E y E k E y (cid:48) I k I = (cid:15)βS y I k I + δ y E k E − µ I y I . It now remains to eliminate S from these equations so that we obtain equationsin terms of solely the model parameters and outputs. This can be done by solvingfor S in the equation for dy I dt : S = k E ( y (cid:48) I + µ I y I ) − δk I y E βk E (cid:15)y I . tructural identifiability analysis of PDEs 11 This expression can then be substituted into the remaining two equations toobtain two input-output equations that are rational equations in terms of the out-puts and their derivatives. By multiplying by a common denominator and movingterms to one side, these input-output equations can be rewritten as the followingpolynomial equations.0 = k I δµ S y E y I + ( ck E k I β(cid:15) − k E k I µ I µ S ) y I + k I βδy E y I − k E βµ I y I + k I δy I y (cid:48) E − k I δy E y (cid:48) I − k I k E µ S y I y (cid:48) I − k E βy I y (cid:48) I + k E k I y (cid:48) I − k E k I y I y (cid:48)(cid:48) I − k I δ − k I (cid:15)µ E ) y E + ( k E µ I − k E (cid:15)µ I ) y I − k I (cid:15)y (cid:48) E + ( k E − k E (cid:15) ) y (cid:48) I Dividing the first equation by k E k I and the second equation by k E − k E (cid:15) to formmonic polynomials, we get the following set of polynomial coefficients. (cid:26) − , , βk I , − δk I k E , δk I k E , − βδk E , k I (cid:15)k E ( (cid:15) − , k I ( δ + µ E (cid:15) ) k E ( (cid:15) − ,βµ I k I , µ I , µ S , − δk I µ S k E , µ I µ S − βc(cid:15) (cid:27) We consider this set of coefficients as a function of the model parameters, andwe seek to determine whether this map is injective. If it is not, then we considerwhether it is injective as a function of some parameter combinations. We find that µ S and µ I are structurally identifiable. Through simplification and substitution,we find that other identifiable parameter combinations are (cid:26) βk I , βc(cid:15), βδk E , δ ( − (cid:15) ) (cid:15) , µ E + δ (cid:27) . If we further assume that µ S = µ E , then δ and (cid:15) also become identifiable while β , c , k E , and k I remain unidentifiable. However, the following combinations areidentifiable: { βk E , βk I , βc } . Consequently, if any one of β , c , k E , or k I is known,then the rest become identifiable.4.2 Age-structured PDE model with density-dependent transmission andconstant parametersNow that identifiability results are established for the ODE representation of theSEI model 2, we extend these results to a more complex PDE model that capturesthe same epidemiological setting (SEI), but includes age structure. We incorporate age structure into the model by adding a transport term to capture populationaging. To simplify the model, we assume that transmission occurs between individ-uals of the same age, so that the transmission term is proportional to S ( t, a ) I ( t, a )for any given age a and time t . While this is a significant simplification, datahas shown that people tend to interact primarily with others in their age group (Mossong et al., 2008). The model thus becomes ∂S∂t + ∂S∂a = − βSI − µ S S∂E∂t + ∂E∂a = (1 − (cid:15) ) βSI − δE − µ E E∂I∂t + ∂I∂a = (cid:15)βSI + δE − µ I I (3)for t > a ∈ [0 , ∞ ). Note that all state variables are now functions of bothtime and age. The boundary conditions are S ( t,
0) = c , E ( t,
0) = I ( t,
0) = 0. Notethat the birth rate is now incorporated into the boundary condition rather thaninto the differential equation for S . We begin by assuming that model parametersare constants, independent of age. We first consider identifiability of the model parameters without explicitly ac-counting for the presence of boundary conditions, meaning that we assume thereis no a priori knowledge about the boundary conditions and that they cannot beobserved. We call this case generic boundary conditions. To perform the identifia-bility analysis, we utilize the same framework as we did for the above ODE systemanalysis. That is, we rewrite the model system as a set of polynomial equationsin terms of the outputs and their partial derivatives, and then consider the mapfrom parameter space to the polynomial coefficients.We adopt the notation y ( i,j ) = ∂ i + j y∂t i ∂a j . Using the substitutions E = y E k E and I = y I k I , as in the ODE case, we obtain the following equations: S (1 , + S (0 , = − βS y I k I − µ S Sy (0 , E k E + y (1 , E k E = (1 − (cid:15) ) βS y I k I − δ y E k E − µ E y E k E y (0 , I k I + y (1 , I k I = (cid:15)βS y I k I + δ y E k E − µ I y I . We now eliminate S to obtain equations in terms of only the model parametersand outputs. This can be done by solving for S in the third equation: S = k E (cid:16) y (0 , I + y (1 , I + µ I y I (cid:17) − δk I y E βk E (cid:15)y I . As before, this expression can then be substituted into the remaining two equationsto obtain two input-output equations that are rational equations of the outputs andtheir derivatives. By multiplying by a common denominator and moving terms toone side, these input-output equations can be rewritten as the following polynomial tructural identifiability analysis of PDEs 13 equations. 0 = − k E k I µ S y (0 , I y I − k E k I µ S y (1 , I y I − k E k I y (0 , I y I − k E k I y (1 , I y I − k E k I y (2 , I y I + k E k I ( y (0 , I ) + k E k I ( y (1 , I ) + 2 k E k I y (0 , I y (1 , I − k E k I µ I µ S y I − βk E y (0 , I y I − βk E y (1 , I y I − βk E µ I y I + δk I y (0 , E y I + δk I y (1 , E y I − δk I y E y (0 , I − δk I y E y (1 , I + δk I µ S y E y I + βδk I y E y I − k E (cid:15)y (0 , I − k E (cid:15)y (1 , I + k E y (0 , I + k E y (1 , I + k E µ I y I − k E µ I (cid:15)y I − k I (cid:15)y (0 , E − k I (cid:15)y (1 , E − δk I y E − k I µ E (cid:15)y E We choose derivatives with respect to time to be higher ranked than those withrespect to age for our ranking of terms within the polynomial, and let y I be higherranked than y E , yielding the ranking y E < y I < y (0 , E < y (0 , I < y (1 , E < y (1 , I 0) = c, E ( t, 0) = I ( t, 0) = 0, where c represents the birth rate. Todetermine the identifiability of the boundary condition parameter c , we performthe same analysis as above but with the model equations evaluated at a = 0(boundary condition equations). We let the boundary conditions for E and I begeneric to avoid reducing equations to zero. Then, noting that S ( t, 0) is constantand thus ∂S∂t ( t, 0) = 0, the relevant equations are ∂S∂a ( t, 0) = − βcI ( t, − µ S c∂E∂t ( t, 0) + ∂E∂a ( t, 0) = (1 − (cid:15) ) βcI ( t, − δE ( t, − µ E E ( t, ∂I∂t ( t, 0) + ∂I∂a ( t, 0) = (cid:15)βcI ( t, 0) + δE ( t, − µ I I ( t, 0) (4)4 Marissa Renardy et al. By transforming the system (4) to a set of polynomial equations we obtain thefollowing polynomial coefficients: (cid:26) , − k I δk E , − ck E β − ck E β(cid:15)k I , δ + µ E , µ I − cβ(cid:15) (cid:27) Since µ I is identifiable from the model equations, we thus find that βc(cid:15) is anidentifiable combination, which is consistent with the ODE model. Thus, the fullset of identifiable parameters and combinations for this model is (cid:26) µ S , µ I , βk I , βc(cid:15), δk I k E , δ ( (cid:15) − (cid:15) , µ E + δ (cid:27) which is equivalent to the set of identifiable combinations for the ODE model. With generic initial conditions, we do not gain any additional identifiability results.However, if we assume that certain information is known, such as the total popula-tion size at the initial time, then further model parameters can become identifiable.Here, we assume that the initial population size as well as its derivatives with re-spect to time and age are known for all ages. Under these conditions, we may solvefor the initial susceptible population S (0 , a ) as S (0 , a ) = N (0 , a ) − E (0 , a ) − I (0 , a ),where N represents the total population size. Similarly, since we assume the deriva-tives of N are known at t = 0, we may solve for the derivatives of S at t = 0 interms of the derivatives of E and I . It is thus straightforward to solve for S andits derivatives at t = 0 in terms of the outputs y E and y I .To determine parameter identifiability from the initial condition with this ad-ditional knowledge, we evaluate the model equations at t = 0 and perform identi-fiability analysis analogous to the previous analysis for the boundary conditions.We obtain the following set of polynomial coefficients: (cid:26) , k I k E , βk E , βk I , − δk I k E , − βk E ( (cid:15) − k I , βk E ( (cid:15) − k I , β(cid:15)k E ,β(cid:15)k I , β − β(cid:15)k I , δ + µ E , µ I − β(cid:15), k I µ S k E , µ S − β, − k I ( µ S + 2) (cid:27) From these coefficients, we find that β, δ, (cid:15), µ S , µ E , µ I , k E , and k I are all identi-fiable. Combining this with the previously established identifiability of cβ(cid:15) fromthe boundary equations, we find that all parameters are identifiable if the initialpopulation size and its derivatives are known. While density-dependent transmission is convenient for simplicity, it is not gen-erally a realistic model for TB. Rather, most models for TB epidemiology usefrequency-dependent transmission models wherein the transmission term is pro-portional to S IN . In this section we present identifiability results for the SEI modelwith frequency-dependent transmission, assuming that our output functions are tructural identifiability analysis of PDEs 15 y E = k E E and y I = k I I . In addition to the ODE and constant-parameter PDEmodels, we will also consider variations including immigration and age-dependentdeath rates. The full details of the analysis are cumbersome and thus are notprovided here, but can be found in the Supplementary Material.5.1 ODE SEI model with frequency-dependent transmissionFor completeness, we include here identifiability results for the ODE version of themodel. This model can be written as the following system of equations: dSdt = c − βS IN − µ S SdEdt = (1 − (cid:15) ) βS IN − δE − µ E EdIdt = (cid:15)βS IN + δE − µ I I (5)where N ( t ) = S ( t ) + E ( t ) + I ( t ). We find that the set of identifiable parametersand parameter combinations for this model is (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ck I , k E k I (cid:27) . ∂S∂t + ∂S∂a = − βS IN − µ S S∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − µ E E∂I∂t + ∂I∂a = (cid:15)βS IN + δE − µ I I (6)with boundary conditions S ( t, 0) = c , E ( t, 0) = I ( t, 0) = 0. From the model equa-tions, we find that the parameters and combinations { β, δ, (cid:15), µ S , µ E , µ I , k E k I } areidentifiable. We further find from the boundary equations that ck I is identifiable.Thus, the full set of identifiable parameters and combinations for this model is { β, δ, (cid:15), µ S , µ E , µ I , ck I , k E k I } which is consistent with the ODE model. If the initial age distribution of the totalpopulation and its derivatives were known (as in Section 4.2.3 for the previouscase study), then all parameters in the model would be identifiable. dSdt = m + c − βS IN − µ S SdEdt = (1 − (cid:15) ) βS IN − δE − µ E EdIdt = (cid:15)βS IN + δE − µ I I (7)where m is the immigration rate. If we assume that the age distribution θ ( a ) ofimmigrants is known, then the corresponding age-structured PDE system becomes ∂S∂t + ∂S∂a = mθ ( a ) − βS IN − µ S S∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − µ E E∂I∂t + ∂I∂a = (cid:15)βS IN + δE − µ I I (8)where θ ( a ) is some known function such that θ (0) = 0 and (cid:82) ∞ θ ( a ) da = 1, withboundary conditions S ( t, 0) = c and E ( t, 0) = I ( t, 0) = 0.The identifiability results for the ODE system (7) remain the same as beforebut with c replaced by c + m , i.e., the identifiable parameters and combinationsare (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ( c + m ) k I , k E k I (cid:27) . For the PDE system (8), we get the following identifiable parameters and combi-nations from the model and boundary equations: (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ck I , mk I , k E k I (cid:27) . Thus, in this case the identifiability results for the PDE model differ from thosefor the ODE model in that the combinations ck I and mk I are both identifiable,whereas in the ODE model only the sum ( c + m ) k I is identifiable. In other words,if all immigrants are susceptible then in an age-structured model it is possible todifferentiate between birth and immigration, whereas in an ODE model it is not. tructural identifiability analysis of PDEs 17 death rates. There are several ways to introduce age dependence to the deathrate: one could assume a specific functional form (e.g., piecewise-constant, linear,exponential), that the function belongs to a particular family (e.g., polynomi-als, splines), or that the function is completely arbitrary with some conditionson smoothness. In this section, we present identifiability results for four cases: (1)piecewise-constant death rate functions for all three compartments, (2) a single ex-ponential death rate function for all compartments, (3) a single polynomial deathrate function for all compartments, and (4) a single arbitrary function for all com-partments. For simplicity, we will consider the SEI model without immigration. We first consider the case that the death rates are piecewise-constant functionsof age with discontinuities at ages { a , a , a , ...a n = A } , i.e. µ j ( a ) = µ ij for a ∈ ( a i , a i +1 ) and j = S, E, I . This represents the case where death rates areknown independently for different age groups. Then the system (6) becomes asequence of PDE systems obeying the equations ∂S∂t + ∂S∂a = − βS IN − µ iS S∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − µ iE E∂I∂t + ∂I∂a = (cid:15)βS IN + δE − µ iI I for t > a ∈ ( a i , a i +1 ), 0 ≤ i < n , with boundary conditions S ( t, 0) = c , E ( t, 0) = I ( t, 0) = 0 and S ( t, a i ) = lim a → a − i S ( t, a ) E ( t, a i ) = lim a → a − i E ( t, a ) I ( t, a i ) = lim a → a − i I ( t, a )for 0 < i < n . We will assume that the locations of the discontinuities a , ..., a n are known, and that the sub-intervals are wide enough that the usual structuralidentifiability results hold.Since the model equations remain essentially unchanged, structural identifia-bility of the model equations has already been established in Section 5.2. Thus,from the model equations we obtain that { β, δ, (cid:15), µ iS , µ iE , µ iI , k E k I } are identifiable for all i . From the boundary equations at a = 0 we again getthat ck I is identifiable, as in the constant parameter case. We gain no additionalinformation about the parameter values from the boundary conditions at a = a i for i > One may also wish to use a continuous function to represent age-dependent deathrates. One possibility is to use an exponential function µ ( a ) = µ exp( κa ) so thatdeath rate increases exponentially with age. For simplicity, we assume that deathrates are the same for all three compartments. Then the model system can bewritten as ∂S∂t + ∂S∂a = − βS IN − µS∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − µE∂I∂t + ∂I∂a = (cid:15)βS IN + δE − µIdµda = κµ for t > a ∈ [0 , ∞ ), with boundary conditions S ( t, 0) = c , E ( t, 0) = I ( t, 0) =0, and µ (0) = µ . We write the system in this way rather than directly inserting µ ( a ) = µ exp( κa ) so that we maintain a differential polynomial form.To obtain an input-output equation from the model equations, we use the I equation to solve for S and the E equation to solve for µ ( a ) (note that these choicesare somewhat arbitrary). Then by noting that I = y I /k I and E = y E /k E andsubstituting into the S and µ equations, we obtain rational equations in terms ofthe outputs and their derivatives. Thus, by multiplying by a common denominator,we obtain differential polynomials. We find that (cid:110) β, δ, (cid:15), κ, k E k I (cid:111) are the identifiableparameters and parameter combinations.By performing similar analysis on the boundary equations, we find that µ and ck I are identifiable. Thus, the full set of identifiable parameters and combinationsfor this version of the model is (cid:26) β, δ, (cid:15), κ, µ , ck I , k E k I (cid:27) . This shows that both parameters from the exponential death rate are identifiable.The identifiability of the remaining parameters is consistent with the results forthe ODE model in Section 5.1. Another option for a continuous death rate is to use a polynomial, µ ( a ) = (cid:80) ni =0 κ i a i .For example, a polynomial may be fit to the available data for death rate by agegroup. Then the model equations become ∂S∂t + ∂S∂a = − βS IN − (cid:32) N (cid:88) i =0 κ i a i (cid:33) S ∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − (cid:32) N (cid:88) i =0 κ i a i (cid:33) E∂I∂t + ∂I∂a = (cid:15)βS IN + δE − (cid:32) N (cid:88) i =0 κ i a i (cid:33) I tructural identifiability analysis of PDEs 19 for t > a ∈ [0 , ∞ ), with boundary conditions S ( t, 0) = c and E ( t, 0) = I ( t, 0) = 0. The age a is considered to be observable. By using the I equation tosolve for S and substituting into the remaining equations, we obtain a set of input-output equations that are differential polynomials. We find that the correspondingmonic polynomial contains the monomial terms κ i a i y I for all i = 0 , ..., n . Thus,all coefficient parameters κ i in µ ( a ) are identifiable and µ ( a ) can be uniquelyrecovered. We also find that (cid:110) β, δ, (cid:15), k E k I (cid:111) are identifiable from the model equationsand ck I is identifiable from the boundary equations. The final and most general option we consider is to allow µ ( a ) to be an arbitraryfunction of a , i.e. the model equations become ∂S∂t + ∂S∂a = − βS IN − µ ( a ) S∂E∂t + ∂E∂a = (1 − (cid:15) ) βS IN − δE − µ ( a ) E∂I∂t + ∂I∂a = (cid:15)βS IN + δE − µ ( a ) I for t > a ∈ [0 , ∞ ). In this case, we use the I equation to solve for S and then use the E equation to solve for µ ( a ). We obtain the following expression for µ ( a ) in terms of the outputs and parameters: µ ( a ) = − k E ( (cid:15) − y I + k I (cid:15)y E (cid:16) δk I y E + k I (cid:15)y (1 , E + k I (cid:15)y (0 , E + k E ( (cid:15) − y (1 , I + k E ( (cid:15) − y (0 , I (cid:17) = − (cid:15) − y I + k I k E (cid:15)y E (cid:18) δ k I k E y E + k I k E (cid:15)y (1 , E + k I k E (cid:15)y (0 , E + ( (cid:15) − y (1 , I + ( (cid:15) − y (0 , I (cid:19) (9)By substituting (9) into the remaining equation for S and performing the iden-tifiability analysis, we again find that (cid:110) β, δ, (cid:15), k I k E (cid:111) are identifiable. We note thatthe second derivatives of y E and y I appear in the input-output equations, andhence µ ( a ) must be at least twice differentiable. We then find that the right handside of (9) consists entirely of observable quantities and identifiable parametercombinations; hence µ ( a ) is identifiable. Note that while we use the term “identifi-able” here, what we actually mean is that µ ( a ) can be determined from observablequantities and identifiable parameter combinations. Further, we obtain from theboundary equations that ck I is identifiable.Identifiability results for all cases of the ODE and age-structured PDE model systems are summarized in Table 1. We note that in the cases where the same deathrate is used for all compartments, the death rate function is always identifiable.In cases where the compartments have different death rates, the rates for thesusceptible and infected compartments are identifiable. These results hold underthe assumption that a fraction of the total prevalence and a fraction of all exposures can be observed at all times and ages. It has been shown previously that otheroutput functions may lead to non-identifiability of death rates even in simplepopulation models (Perasso and Razafison, 2016).5.5 Practical identifiabilityThe above structural analysis determines identifiable parameters and combinationsunder the assumption that data are noiseless and are available for arbitrarilymany time and age points. In reality, epidemiological data are often noisy andare collected only at discrete time points and for discrete age groups. Thus, itis also necessary to perform practical identifiability analysis to determine whichparameters and parameter combinations are still identifiable given more realisticdata. The practical identifiability of model parameters given realistic data maydiffer from their structural identifiability.As a proof of concept, we explore the practical identifiability of the constant-parameter model from Section 5.2 using profile likelihoods. We assume that dataare collected monthly for 5-year age groups from age 0 to 100 over a span of 20 Case Identifiable parameters and combinations Density-dependent ODE(Section 4.1) (cid:26) µ S , µ I , βk I , βc(cid:15), δk I k E , δ ( (cid:15) − (cid:15) , µ E + δ (cid:27) Density-dependent PDE with constant parameters(Section 4.2) (cid:26) µ S , µ I , βk I , βc(cid:15), δk I k E , δ ( (cid:15) − (cid:15) , µ E + δ (cid:27) Frequency-dependent ODE(Section 5.1) (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ck I , k E k I (cid:27) Frequency-dependent PDE with constant parameters(Section 5.2) (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ck I , k E k I (cid:27) Frequency-dependent ODE with immigration(Section 5.3) (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ( c + m ) k I , k E k I (cid:27) Frequency-dependent PDE with immigration(Section 5.3) (cid:26) β, δ, (cid:15), µ S , µ E , µ I , ck I , mk I , k E k I (cid:27) Frequency-dependent PDE with piecewise-constant death rates(Section 5.4.1) (cid:26) β, δ, (cid:15), ck I , k E k I (cid:27) ∪ (cid:32) n − (cid:91) i =0 { µ iS , µ iI , µ iE } (cid:33) Frequency-dependent PDE with exponential death rate(Section 5.4.2) (cid:26) β, δ, (cid:15), κ, µ , ck I , k E k I (cid:27) Frequency-dependent PDE with polynomial death rate(Section 5.4.3) (cid:26) β, δ, (cid:15), ck I , k E k I (cid:27) ∪ (cid:32) n (cid:91) i =0 { κ i } (cid:33) Frequency-dependent PDE with arbitrary death rate function(Section 5.4.4) (cid:26) β, δ, (cid:15), ck I , k E k I , µ ( a ) (cid:27) Table 1 Summary of identifiable parameters and combinations for the ODE and age-structured PDE systems where the output functions are y E = k E E and y I = k I I .tructural identifiability analysis of PDEs 21 years. We synthetically generate data by choosing a baseline parameter set, sim-ulating the model, and adding noise that is normally distributed with a standarddeviation of 5%. Using this synthetic data, we perform parameter estimation usingthe Matlab function fminsearch , and evaluate practical identifiability of the modelparameters as detailed in Section 2.4.We compute the Fisher Information Matrix (FIM) using the outputs y E and y I at the above-specified ages and time points. We find that for a wide variety ofparameter sets within reasonable ranges, the rank of the FIM is eight, implyingthat there should be eight identifiable parameters and parameter combinations.This is consistent with our structural identifiability analysis performed in Section5.2, which revealed a set of eight identifiable parameters and combinations. Fromthe structural analysis, we see that all parameters except c, k I , and k E are struc-turally identifiable. However, from profile likelihoods we observe that the practicalidentifiability of these parameters does vary across the parameter space. Here,we demonstrate two baseline parameter sets that lead to qualitatively differentidentifiability profiles (Figure 3). For both parameter sets, c, k E , and k I are notpractically identifiable.Using the parameter set P = { c = 100 , β = 1 , (cid:15) = 0 . , δ = 0 . , µ S =0 . , µ E = 0 . , µ I = 0 . , k E = 0 . , k I = 0 . } , we find that certain struc-turally identifiable parameters are only practically identifiable on one side, that is,we can establish a reasonable lower or upper bound but not both. While the profilelikelihoods do have a unique minimum, the profiles are flat and confidence intervalsfor these parameters are thus wide. In particular, the transmission rate β cannotbe reasonably bounded from above, and (cid:15), δ, µ S , and µ E cannot be distinguishedfrom zero.Using a slightly different parameter set P = { c = 100 , β = 1 , (cid:15) = 0 . , δ =0 . , µ S = 0 . , µ E = 0 . , µ I = 0 . , k E = 0 . , k I = 1 } , in which report-ing rates and death among the exposed population are increased, we find thatall structurally identifiable parameters are also practically identifiable. While thisparameter set is not necessarily biologically realistic (since µ E > µ I , for exam-ple), this demonstrates that there are regions in the parameter space where theseparameters can be well estimated. Creating mathematical and computational models can greatly impact the biologi-cal community through predictions that can be used and tested. Parameterizationof models is a key factor in connecting them to the biology, and thus identifyingparameters accurately is an important step of model development. Parameters aretypically estimated from data; however, data may be sparse and noisy, and evenperfect data does not guarantee unique estimability of parameters. Understandingif model parameters are identifiable lends credibility to the model system. If pa-rameters are not identifiable, then the applicability of those models becomes more theoretical. While this may add value to understanding, it limits the potential forpredictions to be translated in a meaningful way toward impacting the biologi-cal system. To this end, we focus on parameter identifiability in both ODE andPDE models. While there is a solid literature on structural identifiability analysisfor ODE models, there is not a unified framework for studying identifiability in AB Fig. 3 Practical identifiability of the constant-parameter model from Section 5.2. Costfunctions (black lines) based on likelihoods are shown for two sets of parameter values, P = { c = 100 , β = 1 , (cid:15) = 0 . , δ = 0 . , µ S = 0 . , µ E = 0 . , µ I = 0 . , k E = 0 . , k I = 0 . } (panel A) and P = { c = 100 , β = 1 , (cid:15) = 0 . , δ = 0 . , µ S = 0 . , µ E = 0 . , µ I =0 . , k E = 0 . , k I = 1 } (panel B). Red dashed lines denote 95% confidence thresholds, andred dots denote local minima (corresponding to maximum likelihoods). See Section 2.4 formore details of the analysis. PDEs. Here, we have presented a methodology for performing structural identi-fiability analysis on age-structured epidemic models using a differential algebraframework. We have applied this methodology to explore the structural identi-fiability of a simple age-structured Susceptible-Exposed-Infected (SEI) epidemicmodel under various assumptions.We found that when the age-structured model uses constant parameters anddoes not include immigration, then the identifiability results are identical to those for the corresponding ODE model. In the presence of immigration, however, dif-ferences in identifiability arise: the rates for birth and immigration appear only asa sum in the identifiable combinations of the ODE model, while they appear sep-arately in those for the age-structured PDE model formulation. Further, we foundthat age-dependent death parameters in the PDE model are able to be uniquely tructural identifiability analysis of PDEs 23 recovered in at least the following three cases: piecewise-constant death rates, asingle exponential death rate for all compartments, and a single polynomial deathrate for all compartments.These findings imply that identifiability results for corresponding ODE andPDE models are likely equivalent when parameters are constant, but that addi-tional information can be gained from PDE models when parameters vary withthe non-temporal variable(s). Rigorous derivation of necessary and sufficient con-ditions for different models to have identical identifiability results remains to bedone, however based on our results herein, we conjecture that identifiability forthe ODE and PDE forms of an age-structured model should be equivalent whenthere are not age-dependent parameters present in the model. Additionally, futurework to investigate how discretization of the PDE (e.g. converting age dynamicsinto linear ‘chains’ of compartments for age groups (Hurtado and Kirosingh, 2019;Hurtado and Richards, 2020)) impacts identifiability of the model would be a use-ful direction given the common use of compartmental representations of aging inbiological models.The framework presented here could be extended to other PDE models, suchas reaction-diffusion and advection-diffusion equations, in a variety of applicationareas. In particular, the assumption used here that age-interactions are only local(i.e. individuals interact with only people their own age) may be reasonable forreaction-diffusion models. However, since this methodology relies on the ability toexpress the input-output equations as differential polynomials, there is a limit tothe amount of complexity that it can handle. Thus, highly nonlinear models and/oroutput functions may require the use of other tools such as numerical methods.One such method is the FIM, which we have demonstrated in the context of anage-structured model with preferential mixing informed by real data.A significant limitation of structural identifiability analysis is that it assumesthe availability of perfect data. In practice, data may contain noise and bias, andmay be sampled only at discrete time points. Thus, practical identifiability anal-ysis must also be performed to gain a complete understanding of the parameteridentifiability of a model given the available data. As demonstrated in our exam-ples, practical identifiability of a model may vary not only with the quality of thedata but also with the corresponding values of model parameters. Practical iden-tifiability thus needs to be handled on a case-by-case basis for models and theircorresponding parameter sets. Taken together, practical and structural identifia-bility provide rigor to the modeling process, adding to the applicability of modelpredictions. Acknowledgements This research was supported by NIH grants R01AI123093 and U01HL131072awarded to DK and NSF DMS Grant 1853032 and NIH MIDAS Grant U01GM110712 awarded to ME. References Ajelli M, Gon ˜A § alves B, Balcan D, Colizza V, Hu H, Ramasco JJ, Merler S,Vespignani A (2010) Comparing large-scale computational approaches to epi-demic modeling: Agent-based versus structured metapopulation models. BMCInfectious Diseases 10(1):190, DOI 10.1186/1471-2334-10-190Audoly S, Bellu G, D’Angio L, Saccomani MP, Cobelli C (2001) Global iden-tifiability of nonlinear models of biological systems. IEEE Trans Biomed Eng48(1):55–65Bertuzzo E, Casagrandi R, Gatto M, Rodriguez-Iturbe I, Rinaldo A (2009) Onspatially explicit models of cholera epidemics. Journal of the Royal Society In-terface 7(43):321–333, DOI 10.1098/rsif.2009.0204Bian L (2004) A conceptual framework for an individual-based spatially explicitepidemiological model. Environment and Planning B: Planning and Design31(3):381–395, DOI 10.1068/b2833Castillo-Chavez C, Feng Z (1998) Global stability of an age-structure model for TBand its applications to optimal vaccination strategies. Mathematical Biosciences151(2):135–154, DOI 10.1016/S0025-5564(98)10016-0Castillo-Chavez C, Hethcote HW, Andreasen V, Levin SA, Liu WM (1991)Epidemiological models with age structure, proportionate mixing, and cross-immunity. Journal of Mathematical Biology 27(3):233–258, DOI 10.1007/BF00275810Chappell MJ, Gunn RN (1998) A procedure for generating locally identifi-able reparameterisations of unidentifiable non-linear systems by the similar-ity transformation approach. Mathematical Biosciences 148(1):21–41, DOIhttps://doi.org/10.1016/S0025-5564(97)10004-9Chappell MJ, Godfrey KR, Vajda S (1990) Global identifiability of the parametersof nonlinear systems with specified inputs: A comparison of methods. Mathe-matical Biosciences 102(1):41–73Chis OT, Banga JR, Balsa-Canto E (2011) Structural identifiability of systemsbiology models: a critical comparison of methods. PloS one 6(11):e27755Chow L, Fan M, Feng Z (2011) Dynamics of a multigroup epidemiologicalmodel with group-targeted vaccination strategies. Journal of Theoretical Bi-ology 291:56–64, DOI 10.1016/j.jtbi.2011.09.020Cobelli C, DiStefano JJ (1980) Parameter and structural identifiability conceptsand ambiguities: a critical review and analysis. American Journal of Physiology- Regulatory, Integrative and Comparative Physiology 239(1):R7–R24Cole D, Morgan B, Titterington D (2010) Determining the parametric structure ofmodels. Mathematical Biosciences 228(1):16–30, DOI https://doi.org/10.1016/j.mbs.2010.08.004Eisenberg M (2019) Input-output equivalence and identifiability: some simple gen-eralizations of the differential algebra approach. arXivEisenberg MC, Hayashi MA (2014) Determining identifiable parameter combina-tions using subset profiling. Mathematical biosciences 256:116–126 Eisenberg MC, Robertson SL, Tien JH (2013) Identifiability and estimation ofmultiple transmission pathways in cholera and waterborne disease. Journal ofTheoretical Biology 324:84–102, DOI 10.1016/j.jtbi.2012.12.021Evans ND, White LJ, Chapman MJ, Godfrey KR, Chappell MJ (2005) The struc-tural identifiability of the susceptible infected recovered model with seasonal tructural identifiability analysis of PDEs 25 forcing. Mathematical Biosciences 194:175–197, DOI 10.1016/j.mbs.2004.10.011Feng Z, Hill AN, Smith PJ, Glasser JW (2015) An elaboration of theory aboutpreventing outbreaks in homogeneous populations to include heterogeneity orpreferential mixing. Journal of Theoretical Biology 386:177–187, DOI 10.1016/j.jtbi.2015.09.006Ferguson NM, Nokes DJ, Anderson RM (1996) Dynamical complexity in age-structured models of the transmission of the measles virus: Epidemiological im-plications at high levels of vaccine uptake. Mathematical Biosciences 138(2):101–130, DOI 10.1016/S0025-5564(96)00127-7Gardner ID (1980) The effect of aging on susceptibility to infection. Reviews ofInfectious Diseases 2(5):801–810, DOI 10.1016/j.mbs.2011.10.001Glasser J, Feng Z, Moylan A, Valle SD, Castillo-Chavez C (2012) Mixing in age-structured population models of infectious diseases. Mathematical Biosciences235:1–7, DOI 10.1016/j.mbs.2011.10.001Glasser J, Feng Z, Omer SB, Smith PJ, Rodewald LE (2016) The effect of hetero-geneity in uptake of the measles, mumps, and rubella vaccine on the potential foroutbreaks of measles: a modelling study. Lancet Infectious Diseases 16(5):599–605, DOI 10.1016/S1473-3099(16)00004-9Guzzetta G, Ajelli M, Yang Z, Merler S, Furlanello C, Kirschner D (2011) Mod-eling socio-demography to capture tuberculosis transmission dynamics in a lowburden setting. Journal of Theoretical Biology 289(1):197–205Hao L, Glasser JW, Su Q, Ma C, Feng Z, Yin Z, Goodson JL, Wen N, Fan C,Yang H, Rodewald LE, Feng Z, Wang H (2019) Evaluating vaccination policiesto accelerate measles elimination in China: a meta-population modelling study.International Journal of Epidemiology dyz058, DOI 10.1093/ije/dyz058Harris RC, Sumner T, Knight GM, Evans T, Cardenas V, Chen C, White RG(2019) Age-targeted tuberculosis vaccination in China and implications for vac-cine development: a modelling study. Lancet Global Health 7:E209–E218, DOI10.1016/S2214-109X(18)30452-2Hong H, Ovchinnikov A, Pogudin G, Yap C (2020) Global identifiability of differ-ential models. Communications on Pure and Applied Mathematics 73(9):1831–1879Hurtado PJ, Kirosingh AS (2019) Generalizations of the ‘linear chain trick’: in-corporating more flexible dwell time distributions into mean field ode models.Journal of mathematical biology 79(5):1831–1883Hurtado PJ, Richards C (2020) Time is of the essence: Incorporating phase-type distributed delays and dwell times into ode models. arXiv preprintarXiv:200801318Inaba H, Sekine H (2004) A mathematical model for Chagas disease with infection-age-dependent infectivity. Mathematical Biosciences 1:39–69, DOI 10.1016/j.mbs.2004.02.004Jacquez JA, Greif P (1985) Numerical parameter identifiability and estimability:Integrating identifiability, estimability, and optimal sampling design. Mathemat-ical Biosciences 77(1-2):201–227 Jacquez JA, Perry T (1990) Parameter estimation: local identifiability of param-eters. Am J Physiol 258(4 Pt 1):E727–36Kao YH, Eisenberg MC (2018) Practical unidentifiability of a simple vector-bornedisease model: Implications for parameter estimation and intervention assess-ment. Epidemics 25:89–100, DOI 10.1016/j.epidem.2018.05.010 Keeling MJ, White PJ (2011) Targeting vaccination against novel infections: risk,age and spatial structure for pandemic influenza in Great Britain. Journal ofthe Royal Society Interface 8(58):661–670, DOI 10.1098/rsif.2010.0474Ljung L, Glad T (1994) On global identifiability for arbitrary model parameteri-zation. Automatica 30(2):265–276LoBue PA, Mermin JH (2017) Latent tuberculosis infection: the final frontier oftuberculosis elimination in the USA. The Lancet Infectious Diseases 17(10):e327– e333, DOI https://doi.org/10.1016/S1473-3099(17)30248-7Miao H, Xia X, Perelson A, Wu H (2011) On identifiability of nonlinear ode modelsand applications in viral dynamics. SIAM Review 53(1):3–39, DOI 10.1137/090757009, URL http://epubs.siam.org/doi/abs/10.1137/090757009 , http://epubs.siam.org/doi/pdf/10.1137/090757009 Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, Massari M,Salmaso S, Tomba GS, Wallinga J, Heijne J, Sadkowska-Todys M, Rosinska M,Edmunds WJ (2008) Social contacts and mixing patterns relevant to the spreadof infectious diseases. PLoS Medicine 5(3):e74Ozcaglar C, Shabbeer A, Vandenberg SL, Yener B, Bennett KP (2012) Epidemio-logical models of mycobacterium tuberculosis complex infections. MathematicalBiosciences 236(2):77–96, DOI 10.1016/j.mbs.2012.02.003Perasso A, Razafison U (2016) Identifiability problem for recovering the mortal-ity rate in an age-structured population dynamics model. Inverse Problems inScience and Engineering 24(4):711–728, DOI 10.1080/17415977.2015.1061522Perasso A, Laroche B, Chitour Y, Touzeau S (2011) Identifiability analysis ofan epidemiological model in a structured population. Journal of MathematicalAnalysis and Applications 374:154–165, DOI 10.1016/j.jmaa.2010.08.072Pia Saccomani M, Audoly S, D’Angio L (2003) Parameter identifiability of non-linear systems: the role of initial conditions. Automatica 39(4):619–632Pohjanpalo H (1978) System identifiability based on the power series expansion ofthe solution. Mathematical Biosciences 41(1-2):21–33Raue A, Kreutz C, Maiwald T, Bachmann J, Schilling M, Klingmuller U, TimmerJ (2009) Structural and practical identifiability analysis of partially observed dy-namical models by exploiting the profile likelihood. Bioinformatics 25(15):1923–9Renardy M, Kirschner D (2019) Evaluating vaccination strategies for tuberculosisin endemic and non-endemic settings. Journal of Theoretical Biology 469:1–11Renardy M, Kirschner D (2020) A framework for network-based epidemiologicalmodeling of tuberculosis dynamics using synthetic datasets. Bulletin of Mathe-matical Biology 82(6):78Ritt JF (1950) Differential algebra, vol 33. American Mathematical Soc.Shim E, Feng Z, Martcheva M, Castillo-Chavez C (2006) An age-structured epi-demic model of rotavirus with vaccination. Journal of Mathematical Biology53(4):719–746, DOI 10.1007/s00285-006-0023-0Thieme HR, Castillo-Chavez C (1993) How may infection-age-dependent infectiv-ity affect the dynamics of HIV/AIDS? SIAM Journal of Applied Mathematics tructural identifiability analysis of PDEs 27 application to Rift Valley fever. Bulletin of Mathematical Biology 78:1796–1827,DOI 10.1007/s11538-016-0200-2Valle SYD, Hyman JM, Hethcote HW, Eubank SG (2007) Mixing patterns betweenage groups in social networks. Social Networks 29(4):539–554Venzon DJ, Moolgavkar SH (1988) A method for computing profile-likelihood-based confidence intervals. Journal of the Royal Statistical Soci-ety: Series C (Applied Statistics) 37(1):87–94, DOI https://doi.org/10.2307/2347496, URL https://rss.onlinelibrary.wiley.com/doi/abs/10.2307/2347496 , https://rss.onlinelibrary.wiley.com/doi/pdf/10.2307/2347496https://rss.onlinelibrary.wiley.com/doi/pdf/10.2307/2347496