Model structures and structural identifiability: What? Why? How?
Jason M. Whyte
ACEMS, School of Mathematics and Statistics, and CEBRA, School of BioSciences, University of Melbourne, Parkville, Victoria, Australia, 3010
30th December 2019

This is a pre-publication version of a paper to be published in the 2019-20 MATRIX Annals, Springer.
Abstract
We may attempt to encapsulate what we know about a physical system by a model structure, S. This collection of related models is defined by parametric relationships between system features; say observables (outputs), unobservable variables (states), and applied inputs. Each parameter vector in some parameter space is associated with a completely specified model in S. Before choosing a model in S to predict system behaviour, we must estimate its parameters from system observations. Inconveniently, multiple models (associated with distinct parameter estimates) may approximate data equally well. Yet, if these equally valid alternatives produce dissimilar predictions of unobserved quantities, then we cannot confidently make predictions. Thus, our study may not yield any useful result. We may anticipate the non-uniqueness of parameter estimates ahead of data collection by testing S for structural global identifiability (SGI). Here we will provide an overview of the importance of SGI, some essential theory and distinctions, and demonstrate these in testing some examples.

A “model structure” (or simply “structure”) is essentially a collection of related models of some particular class (say the linear, first-order, homogeneous, constant-coefficient ODEs in n variables), as summarised by mathematical relationships between system variables that depend on parameters. For example, in a “controlled state-space structure” we may draw on our knowledge of the system to relate time-varying quantities such as “states” (x) that we may not be able to observe, and (typically known) controls or “inputs” (u) which act on some part of our system, to “outputs” (y) we can observe. A structure is a useful construct when seeking to model some physical system for which our knowledge is incomplete. We choose some suitable parameter space, and each parameter vector therein is associated with a model in our structure, where we use “model” to mean a completely specified set of mathematical relationships between system variables.

In order to illustrate the concept of a structure, we will consider S, a controlled state-space structure of “compartmental” models, meaning that these are subject to a “conservation of mass” condition—matter is neither created nor destroyed. When we are interested in a system evolving in continuous time, a structure will employ ordinary differential equations (ODEs) to describe the time course of the states. Compartmental structures are often appropriate for the modelling of biological systems. To illustrate this, let us consider a simple biochemical system, where we consider the interconversion and consumption of chemical species, as in a cellular process. Structure S has three states, x_1, x_2, and x_3, representing concentrations of three distinct chemical species, or “compartments”. Matter may be excreted from the system, delivered into the system, or converted between the forms. We assume that the system receives some infusion of x_3 via input u.

Using standard notation for compartmental systems, a real parameter k_{ij} (i, j = 1, 2, 3, i ≠ j) represents the rate constant for the conversion of x_j into x_i. A real parameter k_{0j} is the rate constant associated with the loss of material from x_j to the “environment” outside of the system.
If reactions are governed by “first-order mass-action kinetics”, the rate of conversion (or excretion) of some species at time t depends linearly on the amount of that species at time t.

Given our physical system and modelling paradigm (and understanding that an expression such as ẋ represents dx/dt), we may write the “representative model” of S as

    ẋ_1(t) = −(k_{01} + k_{21}) x_1(t) + k_{12} x_2(t),
    ẋ_2(t) = k_{21} x_1(t) − (k_{12} + k_{32}) x_2(t) + k_{23} x_3(t),
    ẋ_3(t) = k_{32} x_2(t) − k_{23} x_3(t) + u(t),        (1)

where we set initial conditions for our states (where ⊤ denotes transpose)

    (x_1(0)  x_2(0)  x_3(0))⊤ = (0  x_{02}  0)⊤.        (2)

Supposing that x_1 is the only state we can observe over time, our output is

    y(t) = x_1(t).        (3)

We may represent this “single-input single-output” (SISO) structure by a compartmental diagram, as in Figure 1. Squares represent distinct chemical species, thin arrows show the conversion of mass to other forms, or excretion from the system. The rates of conversion or excretion are determined by the product of the associated parameter and the state variable at the source of the arrow. The thick arrow shows an input, and the circle linked to x_1 indicates that this compartment is observed. More specifically, Figure 1, (1), and (3) illustrate the representative model of a controlled (due to the input u) compartmental (mass is conserved) linear (describing the manner in which the states and input appear) time-invariant (coefficients of the input and states are constants) state-space structure.

Figure 1: A compartmental diagram of the chemical system as modelled by the representative model shown in (1)–(3). Matter in the compartments representing the chemical species x_1, x_2, and x_3 is transferred between compartments. Matter is lost from the x_1 compartment to the environment and this compartment is observed. Input u delivers mass to the x_3 compartment.
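To make the distinction between a structure and a model concrete, the short sympy sketch below (an illustration added for this overview, not part of the original development; the numerical values are chosen arbitrarily) assembles the relationships (1) and (3) symbolically. Substituting a particular parameter vector then yields one completely specified model in S.

    import sympy as sp

    t = sp.symbols("t", nonnegative=True)
    k01, k12, k21, k23, k32 = sp.symbols("k01 k12 k21 k23 k32", positive=True)
    x1, x2, x3 = (sp.Function(name)(t) for name in ("x1", "x2", "x3"))
    u = sp.Function("u")(t)

    # Representative model of the structure S, as in (1) and (3):
    odes = [
        sp.Eq(x1.diff(t), -(k01 + k21) * x1 + k12 * x2),
        sp.Eq(x2.diff(t), k21 * x1 - (k12 + k32) * x2 + k23 * x3),
        sp.Eq(x3.diff(t), k32 * x2 - k23 * x3 + u),
    ]
    output = sp.Eq(sp.Function("y")(t), x1)

    # Substituting one parameter vector gives a single, completely specified
    # model drawn from the structure:
    theta = {k01: 0.5, k12: 1.0, k21: 0.7, k23: 0.3, k32: 0.4}
    model = [eq.subs(theta) for eq in odes]
    print(model)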
At this juncture, establishing some conventions will aid our further discussion of structures.

Convention 1.
When discussing features of a structure M, we represent its associated parameter space with Θ, which we may specify more particularly as necessary. Given arbitrary parameter vector θ ∈ Θ, we shall always use M(θ) to represent M's representative system. When considering some specific parameter vector, say α, we shall represent the associated model by M(α), which we will understand to be completely specified.

Convention 2.
When we apply some descriptors (e.g. controlled compartmental linear time-invariant state-space) to either a structure's representative system (as in the example above) or to a structure, these descriptors transfer to the other. The descriptors also apply to all systems in the structure, except for possibly degenerate systems associated with a subset of parameter space of measure zero.
Convention 2 foreshadows a case where some small number of models in a structure may have different properties to others. We may account for this complication in a manner that assists our intended analysis of structures.
Property 1.
Given structure M with parameter set Θ, a property of M is generic if it holds “almost everywhere” in Θ. That is, we allow that the property may not hold on some subset(s) of Θ of measure zero.

Having specified a structure for a physical system, we may expect it to contain some model which will encapsulate the system's features of interest, and provide insights into aspects of the system's behaviour. For example, we may hope to achieve objectives, such as to accurately:

O1 predict system outputs at unobserved times within the time range for which we have data,
O2 estimate the time course of states,
O3 anticipate system behaviour in situations for which we do not have data, such as under a proposed change in experimental or environmental conditions,
O4 compare the effects of a range of proposed actions on the system, allowing us to discern which actions have the potential to produce beneficial results.

(We will treat classes of structures more formally in Sect. 2.)

We can only hope to consistently gain such insights if our modelling effort provides reliable predictions. Yet, features of an assumed structure may make this challenging, or impossible. As such, we can benefit from interrogating structures in advance of their use to ascertain their suitability.

To explain further, we may expect to arrive at a particular model in M that we can use for prediction after using data to estimate our parameter vector in a process of “parameter identification” (PI). In essence, PI uses some objective function to quantify the goodness-of-fit of predictions made for some α ∈ Θ to data, and an algorithm that searches through Θ to improve upon this as much as possible. The goal is to determine those parameter vectors which optimise the objective function. Suppose that there is a “true” (unknown) parameter vector θ* ∈ Θ such that M(θ*) reproduces the actual dynamics of our physical system, including that relating to any unobservable states. As data is typically sparse and subject to noise, whilst we expect that we cannot exactly recover θ*, we intend that PI can obtain a good approximation to it.

This ambition is frustrated when the value of the objective function is virtually constant over some region of parameter space. Upon encountering such a region, a search algorithm is unable to find a search direction that will improve the objective function's value. This may lead to an unsatisfactory result. For example, the PI process may terminate without returning any parameter estimate.

Alternatively, PI's results may defy interpretation. Suppose PI returns multiple feasible, equally valid estimates of θ*. If we lack further constraints on the elements of θ* (e.g. relative sizes), we cannot discern which of the alternative estimates to use as our approximation.

This state of affairs may not matter if our only concern is O1, or we do not need to specifically know θ*. However, suppose that using M with alternative parameter estimates yields substantially different results for outcomes O2–O4. Then, we cannot confidently use M for prediction.

Cox and Huber [9] provided one example of such an unsatisfactory outcome. The authors showed that two parameter vectors returned by PI lead to equally good predictions of the observed time series of counts of malignant cancer cells in a patient, yet produce substantially different counts for the time after an “intervention”—a reduction in the carcinogenic components to which the patient is exposed.
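The following sketch (a hypothetical toy example constructed for this overview, not drawn from [9]) makes the flat-objective phenomenon concrete. In the model y(t) = a·b·e^{−ct}, only the product a·b influences the output, so the sum-of-squares objective cannot distinguish between parameter vectors sharing that product, regardless of how much data is collected.

    import numpy as np

    # Toy model: y(t) = a * b * exp(-c t). Only a*b and c affect the output.
    def output(theta, t):
        a, b, c = theta
        return a * b * np.exp(-c * t)

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 5.0, 50)
    true_theta = (2.0, 3.0, 0.7)
    data = output(true_theta, t) + 0.01 * rng.standard_normal(t.size)

    def sse(theta):
        residual = output(theta, t) - data
        return float(residual @ residual)

    # Two feasible "estimates" with the same product a*b: the objective
    # function takes exactly the same value at both.
    print(sse((2.0, 3.0, 0.7)))
    print(sse((1.0, 6.0, 0.7)))   # identical to the line above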
PI may fail to uniquely estimate a parameter vector due to an inherent property of M. As such, our non-uniqueness problem is independent of the amount and quality of data we have. That is, improvements in the volume of data or accuracy of its measurement cannot resolve the problem.

We expect to anticipate the non-uniqueness of parameter estimates when scrutiny of our structure shows that it is not structurally globally identifiable (SGI). The concept was first formalised for state-space structures in Bellman and Åström [4] with reference to compartmental structures similar to that shown in Figure 1.

One tests a structure to determine whether or not it is SGI in an idealised framework.
Convention 3.
The framework employed in testing a structure M for SGI is defined by assumptions including:
• the structure correctly represents our physical system,
• a record of error-free data that is infinite in extent is available,
• and others that may be particular to the assumed class of structure, or testing method.

Some methods, e.g. those employing similarity transforms [21] or Markov and initial parameters [14], are only applicable when M is “generically minimal”. That is, for almost all θ ∈ Θ we cannot reduce M(θ) to a system of fewer states that produces an identical output function.

The test aims to discern whether or not it is possible for PI applied to idealised data to only return the true vector θ*, for almost all θ* ∈ Θ. The test result is definitive in this case. Suppose that structure M is classified as SGI. Then, it may be possible for PI applied to actual (limited in extent, noisy) data to return a unique estimate for θ*, but this is not guaranteed. As such, we can only consider an SGI model as possibly useful for prediction. Still, the value of knowing that M is SGI is the assurance that we are not almost certain to fail in our objective before we commence our study. Alternatively, it is extremely unlikely that PI applied to a non-SGI model and actual data will return a unique estimate of θ*. In this case, we should not immediately proceed to make predictions following PI. Instead, we may seek to propagate parameter uncertainty through our structure so as to produce a range of predictions, allowing us to quantify prediction uncertainty. From this we may judge whether or not we can obtain sufficiently useful predictions for our purposes.

Aside from merely encouraging caution, the result of testing structure M for structural global identifiability can deliver useful insights. The test result may allow us to distinguish between individual parameters we may estimate uniquely, and those we cannot.

(The literature has various alternative terms for SGI, some of which may be equivalent only under particular conditions. For two examples, Audoly et al. [3] used “structurally a priori identifiable”, where “a priori” emphasises that one can test a structure in advance of data collection. Godfrey [12] favoured “deterministic identifiability” in discussing compartmental models, for reasons relating to the degree of a priori knowledge of a system and the dependence of the result of testing on the combination of inputs. We will consider this second matter in Section 4. In the interests of brevity, henceforth we use SGI as a shorthand for this noun, in addition to the adjective used earlier, expecting that the reader can infer the meaning from context.)

Further, awareness that a structure is not SGI can assist in correcting the problem. The test may allow us to recognise those parameter combinations which PI may return uniquely. This knowledge may guide reparameterisation of M(θ) so as to produce the representative system of a new structure that is SGI. Additionally, having learned that M is not SGI, one can examine whether it is possible that modifying M (e.g. holding some parameters constant), or the combination of M and planned data collection (e.g. supposing that an additional variable is measured, and rewriting M to include this as another output), will remedy this. Thus, we can treat the process of testing a structure for SGI as an iterative process.
We can detect a structure's undesirable features ahead of data collection, address them, test the revised structure, and continue this process until the structure is satisfactory.

Analytical inspection of (in particular, more complex) structures to anticipate the uniqueness or otherwise of parameter estimates is often not straightforward. The difficulties of testing a structure for SGI, as well as how the results of PI applied to real data can be worse than that predicted by theory, have encouraged numerical approaches to the task. (See [12, Chapter 8] for an introduction.) Broadly, approaches seeking to demonstrate “numerical” (or “practical”) identifiability are based on assuming some number of parameter vectors; using each of these with the structure to simulate data at a limited number of observation times, or under a limited number of conditions (e.g. applied inputs or values of experimental variables), or subject to noise, or some combination of these; conducting PI; and investigating the features of parameter estimates to determine if these adequately approximate assumed values.

Testing a structure for numerical identifiability may determine when PI is unlikely to yield accurate results. However, unlike analytical scrutiny, these investigations may not provide clear guidance on how to remedy the problem.

In this paper we will provide an introduction to the testing of (state-space) structures for SGI. There are a variety of testing methods available (see, for example, [11]) although many are not an ideal means of introducing the field of identifiability analysis. As such, we intend that our choices of testing method and examples will allow us to illustrate some important issues without having to encounter unnecessary algebraic and conceptual complexity.

In choosing example structures, we have limited ourselves to a class which are linear in the state variables, as demonstrated in the representative model given in (1)–(3). We further restrict these to compartmental structures. Given these choices, the “Transfer Function Approach” (TFA, see for example [8]), which makes use of features of the Laplace transform of a structure's output function, is appropriate for our purposes. (For this reason, the approach is also known as the “Laplace transform method”, as seen in [12, Chapter 6].) Although one of the older testing methods, it is still included in relatively recent texts presenting a range of methods (e.g. [11]), and:
1. is conceptually rather more straightforward than other methods,
2. has the unusual distinction of being applicable to a structure that is not generically minimal, and
3. is unambiguously appropriate for compartmental structures.

To explain the significance of Points 2 and 3, we note that a general linear state-space structure may be judged as generically minimal as a consequence of having the generic properties of controllability and observability. The conditions used in deciding this are appropriate for linear systems—these have a state space which is a vector space. However, the state space of a positive linear system is a polyhedral cone, and so it does not seem appropriate to treat these as we would a general linear system.

Certain authors have sought to highlight differences between features of linear and linear positive systems. In the context of discrete-time systems, Benvenuti and Farina sought to show

“. . . that the minimality problem for positive linear systems is inherently different from that of ordinary linear systems . . . ” ([5, Page 219]).

Whyte [26, Chapter 3, Section 5.2] considered some of the literature's perspectives on controllability of linear state-space systems.
Briefly, the origins of the area relate to linear “structured” systems (see Poljak [17]), which are generally distinct from linear compartmental systems (a type of “descriptor” system; see Yamada and Luenberger [27]). This led to suspicions that it may not always be appropriate to test a linear compartmental structure for generic minimality using the machinery designed for general linear structures. By choosing to use the TFA in analysing a structure, Point 2 allows us to avoid this potential issue.

Further, the TFA has shown promise in the analysis of structures of linear switching systems (LSSs) (Whyte [25, 26]). Structures of switching systems (especially those which evolve in continuous time) are largely neglected in the literature. Yet methods under development may assist in the scrutiny of structures used to model epidemics, such as where an intervention causes an abrupt change in some parameter values.

Discussions at a recent workshop “Identifiability problems in systems biology” held at the American Institute of Mathematics ([1]) highlighted a degree of inconsistency in certain key definitions used in the field of identifiability analysis. As such, here we will draw on efforts to propose transparent and coherent definitions in the analysis of uncontrolled structures (Whyte [25, 26]) in suggesting equivalent definitions for controlled structures.

The remainder of this paper is organised as follows. In Section 2 we present some preliminary material and introduce certain classes of structures that aid us in presenting the TFA. In Section 3 we outline the general theory of testing an uncontrolled structure for SGI, particularise this to uncontrolled linear time-invariant (LTI) state-space structures, and consider an example. Section 4 proceeds similarly for controlled LTI state-space structures, where we draw an important distinction between testing approaches based on how much information we are able to elicit from our structure. Finally, in Sect. 5 we summarise some concepts in the testing of structures and offer some concluding remarks.

We conclude this section by establishing notation. The field of real numbers is denoted by R. The subset of R containing only positive (non-negative) values is denoted by R_+ (R̄_+). The natural numbers {1, 2, 3, ...} are denoted by N, and we define N_0 ≜ N ∪ {0}. The field of complex numbers is denoted by C. The real part of z ∈ C is denoted by Re(z). Given some a ∈ R, a useful set for the following discussion is

    H_a ≜ {s ∈ C | Re(s) > a}.        (4)

We use a bold lower-case (upper-case) symbol such as a (A) to denote a vector (matrix), and a superscript ⊤ associated with any such object indicates its transpose. Given vector a, ȧ denotes its derivative with respect to time. To specify the (i, j)-th element of A we may use a_{i,j}, or prefer the simplicity of (A)_{i,j} when A is a product of terms. For n ∈ N, we use diag(a_1, a_2, ..., a_n) to denote the square diagonal matrix having a_1, ..., a_n on the main diagonal and zeros elsewhere. A special diagonal matrix is the (n × n) identity matrix I_n ∈ R^{n×n}, which has a main diagonal of n ones. Given a field F and some indeterminate w, F(w) denotes the field of rational functions in w over F. Given a, b ∈ N_0 and F, we use F^{a×b} to denote the set of matrices of a rows and b columns having elements in F.
When at least one of a or b is zero, it is convenient to have F^{a×b} represent a set of “empty matrices”, and we can disregard any matrix in this set as it arises.

In this section we will define certain classes of structures, and present an overview of some useful properties, in preparation for a discussion of how we may test these structures for SGI.

We will aim to illustrate the features of systems by introducing sufficient systems theory, beginning with some conventions. Suppose we have a set of input values U, a set of output values Y, and a time set T ⊆ R̄_+. Let 𝒰 denote a set of input functions such that for u ∈ 𝒰, u : T → U : t ↦ u(t) ∈ U. That is, 𝒰 is a set of input functions taking values in the set U. Similarly, let 𝒴 denote a set of functions such that for y ∈ 𝒴, y : T → Y : t ↦ y(t) ∈ Y. That is, 𝒴 is a set of output functions taking values in a set Y. Finally, let ζ denote an “input-output” map from 𝒰 to 𝒴. We use these definitions in presenting a general type of system in Definition 1. From this we may obtain other system types by imposing suitable conditions.

Definition 1. An input-output system on time set T is a triple (𝒰, 𝒴, ζ).

Contained within the input-output systems are the state-space systems, which are of particular interest to us here. To aid our discussion of these, given some time set T we define the set

    𝒯 ≜ {(t, t_0) : t ≥ t_0, t, t_0 ∈ T}.        (5)

In the following definitions and discussion we draw on Whyte [26, Section 3.4], which was informed by Caines [6, Appendix 2].
Definition 2 (Adapted from Whyte [26, Definition 3.8]). A state-space system Σ is a quintuple (𝒰, X, 𝒴, Φ, η) where

• 𝒰 is a set of input functions.
• X is a set, called the state-space of Σ, with elements called states.
• 𝒴 is a set of output functions.
• Φ(·, ·, ·, ·) is the state transition function, which maps 𝒯 × X × 𝒰 into X.
  To illustrate this, consider time interval T ⊆ R̄_+ with t_0 ≜ inf T. Suppose Σ is subject to input function u ∈ 𝒰. Further, suppose that at t = t_0 we have that x_0 ∈ X is the initial state of Σ. Then, for (t, t_0) ∈ 𝒯, Φ(t, t_0, x_0, u) determines the state of Σ as a consequence of time t, x_0, and u. Under these conditions, we may concisely refer to Φ(t, t_0, x_0, u) as the state of Σ at time t.
• η(·, ·, ·) is the output map, which maps T × X × U into Y.
  That is, at some time t ∈ T, η determines the output vector that results from three inputs: t, the state of Σ at that time, and the value of the input u.

Further, the following four properties hold:

SS1: The Identity Property of Φ

    Φ(t, t, x, u) = x,  for all t ∈ T, x ∈ X and u ∈ 𝒰.

That is, suppose the state of Σ at time t is x. Then, if no time has elapsed from t, Φ does not move the state away from x.

SS2: The Nonanticipative Property of Φ
Suppose we have any u_1, u_2 ∈ 𝒰 such that these functions are identical on time interval [t_0, t_1], where (t_1, t_0) ∈ 𝒯. Then, for all x ∈ X we have

    Φ(t_1, t_0, x, u_1) = Φ(t_1, t_0, x, u_2).

To explain this, suppose the state of Σ at time t_0 is some x ∈ X. The Nonanticipative Property of Φ means that Σ reaches the same state at time t_1 for Φ subject to either u_1 or u_2. Equivalently, differences between u_1 and u_2 for any time greater than t_1 do not influence the evolution of the state of Σ on [t_0, t_1] under Φ.

SS3: The Semigroup Property of Φ
For all (t_1, t_0), (t_2, t_1) ∈ 𝒯, x ∈ X, and u ∈ 𝒰,

    Φ(t_2, t_0, x, u) = Φ(t_2, t_1, Φ(t_1, t_0, x, u), u).

To explain, suppose we have system Σ with initial state x_0 at time t_0 and input u. Suppose Φ acts on time interval [t_0, t_1], resulting in some particular state (say x_1 ≜ Φ(t_1, t_0, x_0, u)) at t_1. Suppose then Φ uses x_1 as an initial state for evolving the state of Σ on [t_1, t_2], resulting in a particular state (say x_2 ≜ Φ(t_2, t_1, Φ(t_1, t_0, x_0, u), u)) at t_2. Due to the Semigroup Property of Φ, system Σ also reaches state x_2 at t_2 if Φ is used to evolve the state on [t_0, t_2].

SS4: The Instantaneous Output Map η
For all x_0 ∈ X, u ∈ 𝒰, (t, t_0) ∈ 𝒯, the function y : T → Y defined via

    y(t) = η(t, Φ(t, t_0, x_0, u), u(t))

is a segment of a function in 𝒴.
That is, we can use η to define the instantaneous output of Σ at current time t through t, the state of Σ at time t (Φ(t, t_0, x_0, u)) and the value of the input at time t (u(t)). This property is useful as y provides a simpler means of illustrating the output of Σ than does η when we wish to introduce particular system types.
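For the LTI systems considered later, the transition function under zero input is the matrix exponential, Φ(t, t_0, x_0, 0) = e^{A(t − t_0)} x_0, so property SS3 is easy to verify numerically. The sketch below (a small illustration with an arbitrary matrix, not part of the original text) does exactly this.

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 3))
    x0 = rng.standard_normal(3)
    t0, t1, t2 = 0.0, 0.4, 1.1

    # Evolve in one step over [t0, t2] ...
    direct = expm(A * (t2 - t0)) @ x0
    # ... and in two steps via the intermediate state at t1 (property SS3).
    via_t1 = expm(A * (t2 - t1)) @ (expm(A * (t1 - t0)) @ x0)

    print(np.allclose(direct, via_t1))   # True: the semigroup property holds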
We will now illustrate some useful classes of continuous-time state-space structures, beginning with a general type. Henceforth we consider spaces for states, inputs, and outputs of X ⊆ R^n, U ⊆ R^m, and Y ⊆ R^k, respectively, where accordingly indices n, m, k ∈ N determine the dimensions of our state, input, and output vectors.

For arbitrary parameter vector θ ∈ Θ, and input u ∈ 𝒰, at time t ∈ T a controlled state-space structure M has representative system M(θ) of the general form:

    ẋ(t; θ) = f(x, u, t; θ),  x(0; θ) = x_0(θ),
    y(t; θ) = g(x, u, t; θ),        (6)

where f and g satisfy the relevant properties SS1–SS4 of Definition 2.

A subtype of the controlled state-space structures is an uncontrolled class, lacking inputs. If an uncontrolled state-space structure has indices for the state and output spaces of n and k respectively, then a representative model is similar to (6):

    ẋ(t; θ) = f(x, t; θ),  x(0; θ) = x_0(θ),
    y(t; θ) = g(x, t; θ).        (7)

We will now introduce a particular class of the general state-space structures described above—that of linear time-invariant (LTI) structures. An LTI structure has a representative system that is a particular form of (6). We will use specific examples of LTI structures to illustrate the testing of a structure for SGI in Sections 3 and 4.

2.2 Continuous-time linear, time-invariant structures

The following definitions are adapted from Whyte [26, Definition 3.21], which drew on concepts from van den Hof [14].
Definition 3.
Given indices n, m, k ∈ N, a controlled continuous-time linear time-invariant state-space structure (or, more briefly, an LTI structure) M has state, input, and output spaces X = R^n, U = R^m, and Y = R^k, respectively. For parameter set Θ ⊆ R^p (p ∈ N), M has mappings

    A : Θ → R^{n×n},  B : Θ → R^{n×m},  C : Θ → R^{k×n},  x_0 : Θ → R^n,        (8)

where the particular pattern of non-zero elements in the “system matrices” shown in (8) defines M. More specifically, mappings in (8) dictate the relationships between state variables x, inputs u, and outputs y for all times t ∈ T ⊆ R̄_+. Thus, for arbitrary θ ∈ Θ, M's representative system M(θ) has the form

    ẋ(t, u; θ) = A(θ)·x(t, u; θ) + B(θ)·u(t),  x(0; θ) = x_0(θ),        (9)
    y(t; θ) = C(θ)·x(t; θ).        (10)

Defining

    LΣP(n, m, k) ≜ R^{n×n} × R^{n×m} × R^{k×n} × R^n,        (11)

then

    SLΣP(n, m, k) ≜ {(A(θ), B(θ), C(θ), x_0(θ)) ∈ LΣP(n, m, k) | θ ∈ Θ}        (12)

is the set of system matrices associated with systems in M. Thus, we may consider the matrices of a particular system in M as obtained by the parameterisation map f : Θ → SLΣP(n, m, k) such that

    f(θ) = (A(θ), B(θ), C(θ), x_0(θ)).

Together, the matrices and vector defined by (8) and the indices n, m, and k are the system parameters of M(θ).

We may consider an uncontrolled LTI structure having indices n, k ∈ N as a form of controlled LTI structure having n, k ∈ N and m ∈ N_0 by setting m = 0. As such, systems in the uncontrolled structure have X = R^n and Y = R^k. By omitting the empty matrix B from (9) we obtain the form of the uncontrolled structure's representative system:

    ẋ(t; θ) = A(θ)·x(t; θ),  x(0; θ) = x_0(θ),        (13)
    y(t; θ) = C(θ)·x(t; θ),        (14)

where the system matrices are A ∈ R^{n×n}, C ∈ R^{k×n}, and x_0 ∈ R^n.

As a notational convenience, we allow sets defined in (11) and (12) to apply to this context, where LΣP(n, 0, k) and SLΣP(n, 0, k) are understood as neglecting the irrelevant B.

In modelling biological systems, we may employ a subclass of the LTI state-space structures in which systems have states, inputs, and outputs subject to constraints informed by physical considerations. This, in turn, imposes conditions on the structure's system matrices. Our summary of the conditions in the following definition is informed by the treatment of compartmental LTI systems given in van den Hof [14].
Definition 4 (Classes of LTI state-space structures). A positive LTI state-space structure with indices n, m, k ∈ N is an LTI state-space structure after Definition 3, having representative system of the form given in (9) and (10), where states, outputs, and inputs are restricted to non-negative values. That is, the structure has X = R̄_+^n, U = R̄_+^m, and Y = R̄_+^k.

A compartmental LTI structure with indices n, m, k ∈ N is a positive LTI state-space structure for which systems in the structure have system matrices subject to “conservation of mass” conditions:

• all elements of B and C are non-negative, and
• for A = (a_{i,j})_{i,j = 1,...,n},

    a_{ij} ≥ 0,  i, j ∈ {1, ..., n}, i ≠ j,
    a_{ii} ≤ − Σ_{j=1, j≠i}^{n} a_{ji},  i ∈ {1, ..., n}.        (15)

An uncontrolled positive LTI structure or an uncontrolled compartmental LTI structure with indices n, k belongs to a subclass of the corresponding class of controlled LTI structures with indices n, m, k. The relationship between the controlled and uncontrolled forms is as for that between LTI structures and uncontrolled LTI structures presented in Definition 3. The representative system of any such uncontrolled structure has the form outlined in (13) and (14), subject to appropriate restrictions on state and output spaces X and Y.
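As a quick, informal check (added for this overview, using the A(θ) pattern of the example structure S from Section 1), the conditions (15) can be confirmed symbolically: the off-diagonal entries of A(θ) are non-negative, and each column sum is non-positive, so mass is not created within the system.

    import sympy as sp

    k01, k12, k21, k23, k32 = sp.symbols("k01 k12 k21 k23 k32", positive=True)
    A = sp.Matrix([[-k01 - k21, k12, 0],
                   [k21, -k12 - k32, k23],
                   [0, k32, -k23]])

    # Off-diagonal entries are non-negative for positive rate constants ...
    off_diag_ok = all(A[i, j].is_nonnegative
                      for i in range(3) for j in range(3) if i != j)
    # ... and each column sum is non-positive (only the observed compartment
    # x1 loses mass to the environment, at rate constant k01).
    col_sums = [sp.simplify(sum(A[i, j] for i in range(3))) for j in range(3)]

    print(off_diag_ok)   # True
    print(col_sums)      # [-k01, 0, 0]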
We shall now consider some properties of controlled LTI structures which will inform our testing of these structures for SGI.

2.3 Features of the states and outputs of a controlled LTI structure

A consideration of some features of the states and outputs of LTI structures here will allow us to appreciate the utility of the TFA in testing such a structure for SGI in Section 3.

In this discussion we adapt the treatment of uncontrolled LTI systems given in Whyte [26, Chapter 3] and combine this with insights from Seber and Wild [18, Chapter 8]. In this subsection, in the interests of brevity, we will neglect the dependence of systems on θ.

Let us consider a structure defined by system matrices in SLΣP(n, m, k) (recall (11)), where we assume the structure is defined on time set T = R̄_+. Recall that states evolve according to an ODE system as in (9). Given state space X = R^n, the solution for state vector x(t) depends on the matrix exponential e^{At} ∈ R^{n×n} through

    x(t) = e^{At} x_0 + ∫_0^t e^{A(t − t′)} B u(t′) dt′,        (16)

provided that the integral exists. Assuming this existence, we may use (14) and the convolution operator ∗ to express the response as

    y(t) = C e^{At} x_0 + C e^{At} B ∗ u(t).        (17)

Let us presume a situation typical in the modelling of physical systems—that the elements of A are finite. Let us suppose that the n (finite and not necessarily distinct) eigenvalues of A are ordered from largest to smallest and labelled as λ_i, i = 1, ..., n. In the interests of simplicity, we also assume that A has n linearly independent right eigenvectors s_i, i = 1, ..., n, where each is associated with the appropriate λ_i. We define S ∈ R^{n×n} as the matrix for which the i-th column is s_i. We may then employ a spectral decomposition A ≡ S Λ S^{−1}, where Λ = diag(λ_1, ..., λ_n). As a result, we may rewrite our matrix exponential:

    e^{At} ≡ S e^{Λt} S^{−1},        (18)

noting that each element is a sum of (up to n) exponentials, with exponents drawn from the λ_i t (i = 1, ..., n).

With this in mind, let us turn our attention towards the terms C e^{At} x_0 ∈ R^{k×1} and C e^{At} B ∈ R^{k×m} on the right-hand side of (17). As x_0 is a constant vector, and B and C are constant matrices, each element of C e^{At} x_0 and C e^{At} B is also a sum of exponentials in the λ_i t (i = 1, ..., n).

Suppose λ_1 has multiplicity μ_1 ≥ 1. Hence, the largest possible dominant term in any of our sums of exponentials involves t^{μ_1 − 1} e^{λ_1 t}. Hence, there exist real constants K > 0 and λ̄ > λ_1 such that for all t ∈ R̄_+ we have

    K e^{λ̄ t} ≥ |(C e^{At} x_0)_{i,1}|,  i = 1, ..., k,
    K e^{λ̄ t} ≥ |(C e^{At} B)_{i,j}|,  i = 1, ..., k, j = 1, ..., m.        (19)

The existence of these bounds will prove important when we consider the application of the TFA to an LTI structure.
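The decomposition (18) is straightforward to check numerically. The sketch below (an illustration under the stated diagonalisability assumption, not taken from the original text; the parameter values are arbitrary) compares S e^{Λt} S^{−1} with a direct computation of e^{At} for the A(θ) of the example structure S.

    import numpy as np
    from scipy.linalg import expm

    # A(θ) of structure S at an arbitrary positive parameter vector.
    k01, k12, k21, k23, k32 = 0.5, 1.0, 0.7, 0.3, 0.4
    A = np.array([[-k01 - k21, k12, 0.0],
                  [k21, -k12 - k32, k23],
                  [0.0, k32, -k23]])

    lam, S = np.linalg.eig(A)      # eigenvalues and right eigenvectors
    t = 2.5
    expAt_spectral = S @ np.diag(np.exp(lam * t)) @ np.linalg.inv(S)
    expAt_direct = expm(A * t)

    print(np.allclose(expAt_spectral, expAt_direct))   # True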
Towards this, we shall consider some features of the Laplace transform of the output of LTI structures. We recall the definition of the Laplace transform of a real-valued function.

Definition 5.
Suppose some real-valued function f is defined for all non-negative time. (That is, f : R̄_+ → R, t ↦ f(t).) We represent the (unilateral) Laplace transform of f with respect to the transform variable s ∈ C by

    L{f}(s) ≜ ∫_0^∞ f(t)·e^{−st} dt,

if this exists on some domain of convergence D ⊆ C.

Let us consider a controlled LTI structure S with parameter set Θ, with a representative system S(θ) having the form shown in (9) and (10). We assume system matrices belong to SLΣP(n, m, k) (recall (12)). Suppose that given input u, L{u}(s) exists. In this case the Laplace transform of output y given u is

    L{y(·, u; θ)}(s; θ) = V(s; θ) + W(s; θ) L{u}(s) ∈ R(s)^{k×1},        (20)

where

    V(s; θ) ≜ C(θ)(s I_n − A(θ))^{−1} x_0(θ) ∈ R(s)^{k×1},        (21)
    W(s; θ) ≜ C(θ)(s I_n − A(θ))^{−1} B(θ) ∈ R(s)^{k×m},        (22)

and, owing to (19), each element of V and W is defined for all s ∈ H_{λ̄}.

(We note that others, such as Walter and Pronzato [22, Chapter 2, Page 22], have considered such expressions. However, the notation employed may make the description of transfer functions in testing a structure for SGI unnecessarily complicated. As such, we employ a simpler notation here. We also include x_0 in V (unlike say in the equivalent matrix H in [22]), as otherwise the initial conditions do not feature in the test equations.)
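The quantities (21) and (22) can be produced directly with a computer algebra system. The following brief sympy sketch (an illustration added here, not part of the original text) evaluates V and W for the example structure S of Section 1, whose system matrices are written out in full in (43) below.

    import sympy as sp

    s = sp.symbols("s")
    k01, k12, k21, k23, k32, x02 = sp.symbols("k01 k12 k21 k23 k32 x02", positive=True)

    A = sp.Matrix([[-k01 - k21, k12, 0],
                   [k21, -k12 - k32, k23],
                   [0, k32, -k23]])
    B = sp.Matrix([0, 0, 1])
    C = sp.Matrix([[1, 0, 0]])
    x0 = sp.Matrix([0, x02, 0])

    resolvent = (s * sp.eye(3) - A).inv()
    V = sp.cancel((C * resolvent * x0)[0])   # response to the initial state, as in (21)
    W = sp.cancel((C * resolvent * B)[0])    # response to the input, as in (22)
    print(V)
    print(W)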
Definition 6. We refer to V and W as “transfer matrices”, and each element of these is a transfer function—specifically, a rational function in s. We term any such element an unprocessed transfer function.

Property 2.
The degree of the denominator of any unprocessed transfer function in V or W is at most n. Similarly, if S is a compartmental structure, the degree of the numerator of any transfer function is at most n − 1.

If we can cancel any factors in s between the numerator and denominator of the transfer function (pole-zero cancellation), then we will obtain a degree for each of the numerator and denominator which is lower than previously. Suppose that pole-zero cancellation occurs in each unprocessed transfer function in V and W. Then, S is not generically minimal (recall Convention 3).

When we have an uncontrolled LTI structure, (20) reduces to

    L{y(·; θ)}(s) = V(s; θ) ∈ R(s)^{k×1},        (23)

with V as in (21), and the discussion of matrix elements given above also applies.

We may now proceed to consider definitions and processes relating to structures and structural global identifiability, informed by Convention 3. By way of introduction, we begin with the rather more straightforward matter of the testing of uncontrolled structures.

We will consider the testing of an uncontrolled structure for SGI following what we may call the “classical” approach originally outlined by Bellman and Åström [4]. We follow the treatment of [26], which drew on aspects of Denis-Vidal and Joly-Blanchard [10]. In essence, we judge a structure as SGI (or otherwise) with reference to the solution set of test equations.
Definition 7.
Suppose we have a structure of uncontrolled state-space systems M, having parameter set Θ (an open subset of R^p, p ∈ N), and time set T ⊆ [0, ∞). For some unspecified θ ∈ Θ, M has representative model M(θ), which has state function x(·; θ) ∈ R^n and output y(·; θ) ∈ R^k (recall (7)). Suppose that systems in M satisfy conditions:

1. The functions f(x, ·; θ) and g(x, ·; θ) are real and analytic for every θ ∈ Θ on S (a connected open subset of R^n such that x(t; θ) ∈ S for every t ∈ [0, τ], τ > 0).
2. f(x_0(θ); θ) ≠ 0 for almost all θ ∈ Θ.

Then, for some finite time τ > 0, we consider the set

    I(M) ≜ {θ′ ∈ Θ : y(t; θ′) = y(t; θ) ∀ t ∈ [0, τ]}.        (24)

If, for almost all θ ∈ Θ:
I(M) = {θ}, M is structurally globally identifiable (SGI);
the elements of I(M) are denumerable, M is structurally locally identifiable (SLI);
the elements of I(M) are not denumerable, M is structurally unidentifiable (SU).

We note that some care is needed in the application of Definition 7, as it is not appropriate in all cases. Condition 1 ensures that the definition is not applicable to all classes of systems, including switching systems. Condition 2 indicates that the initial state cannot be an equilibrium point, as otherwise the response is constant for all time. Such a response cannot provide information on system dynamics. If the constant response is atypical, it does not provide an appropriate idealisation of real data. Thus, it is inappropriate to use a constant response in testing the structure for SGI.
Remark 1.
Instead of the test described above, one may test a structure for the property of structural local identifiability ([20]). This is able to judge a structure as either SLI, or SU. Discerning that a structure is SLI may be adequate in some circumstances, and the tests tend to be easier to apply than tests for SGI.

In general, the output of system M(θ) features “(structural) invariants” [19] (or “observational parameters” [15]) φ(θ) which define the time course of the output. We may use these to summarise the properties of the whole structure.

Thus, invariants allow us to test a structure for SGI using algebraic conditions that are addressed more easily than a functional relationship as in (24). Here we formalise this property by rewriting Definition 7 in terms of invariants. This leads to a test of a structure for SGI that is easier to apply than its predecessor.
Definition 8.
Suppose that structure M satisfies Conditions 1 and 2 of Definition 7. Then, for some arbitrary θ ∈ Θ, we define the set

    I(M, φ) ≜ {θ′ ∈ Θ : φ(θ′) = φ(θ)} ≡ I(M).        (25)

It follows that determination of I(M, φ) allows classification of M according to Definition 7.

Given Definition 8, we may propose a process for testing a structure for SGI.
Proposition 1.

Step 1
Obtain invariants φ(θ): there are various approaches, but some have requirements (e.g. that the structure is generically minimal) that may be difficult to check. Step 2
Form alternative invariants φ ( θ ′ ) by substituting θ ′ for θ in φ ( θ ) . Step 3
Form equations φ ( θ ′ ) = φ ( θ ) . Step 4
Solve equations.
Step 5
Scrutinise the solution set to make a judgement on M according to Definition 8.

Step 1 poses a key problem: how may we obtain some suitable φ? When considering an LTI structure, the TFA is appropriate. We will now introduce the approach, proceeding to illustrate its application to an uncontrolled LTI structure in Sect. 3.2.

Consider a compartmental LTI structure S with indices n, k ∈ N and m ∈ N_0, having system matrices belonging to SLΣP(n, m, k) (recalling that m = 0 indicates an uncontrolled structure). Recall the idealised framework employed in the testing of a structure for SGI shown in Convention 3. As such, we consider S defined for time set T = R̄_+. Recall (20), and the discussion of Sect. 2.3.1, which guarantees that there exists some λ̄ such that the Laplace transform of y has a domain of convergence. Then, given transfer matrices V and W (as appropriate), we may extract invariants for use in testing S for SGI. First, we must place the transfer functions into a specific form.

Definition 9 (Canonical form of a transfer function). Given compartmental LTI structure S of n ∈ N states, suppose that associated with S(θ) is a transfer matrix (as in (20)) Z, composed of unprocessed transfer functions. Given element z_{i,j}(s; θ) ∈ C(s), we obtain the associated transfer function in canonical form by cancelling any common factors between the numerator and denominator, and rewriting to ensure that the denominator polynomial is monic. The result is an expression of the form:

    z_{i,j}(s; θ) = (ω_{i,j,r+p}(θ) s^p + ··· + ω_{i,j,r}(θ)) / (s^r + ω_{i,j,r−1}(θ) s^{r−1} + ··· + ω_{i,j,0}(θ)),
    ∀ s ∈ 𝒞 ⊇ H_{λ̄},  r ∈ {1, ..., n},  p ∈ {0, ..., r − 1}.        (26)

The coefficients ω_{i,j,0}, ..., ω_{i,j,r+p} in (26) contribute invariants towards φ(θ).

Recalling the general form of systems in an uncontrolled compartmental LTI structure from (13) and (14), let us consider a particular example S_0, with representative system:

    ẋ(t; θ) = A(θ)·x(t; θ),  x(0; θ) = x_0(θ),        (27)
    y(t; θ) = C(θ)·x(t; θ),        (28)

(We can conceive of invariants most directly when a structure is defined by one set of mathematical relations for all time. Otherwise, say for structures of switching systems, we require a more flexible approach ([23, 24]). Such structures are beyond the introductory intentions of this chapter.)

where the state vector is x(t; θ) = [x_1  x_2  x_3]⊤, and the system matrices belong to SLΣP(3, 0, 1). These have the form:

    x_0(θ) = (0  x_{02}  0)⊤,
    A(θ) = [ −k_{01}−k_{21}    k_{12}            0
              k_{21}          −k_{12}−k_{32}     k_{23}
              0                k_{32}           −k_{23} ],
    C(θ) = [1  0  0],        (29)

and we have parameter vector

    θ = (k_{01}, k_{12}, k_{21}, k_{23}, k_{32}, x_{02})⊤ ∈ R_+^6.        (30)

Condition 1 of Definition 7 is satisfied for linear systems. To test whether S_0 satisfies Condition 2 of Definition 7, we note that

    ẋ(0; θ) = A(θ) x_0(θ) = (k_{12} x_{02},  −(k_{12} + k_{32}) x_{02},  k_{32} x_{02})⊤ ≠ 0  (as all parameters are strictly positive),        (31)

and thus the condition is satisfied for all θ ∈ Θ. As the conditions of Definition 7 are satisfied, we may proceed in testing S_0 for SGI following Proposition 1 and Definition 8.

Recall that in this uncontrolled case, the Laplace transform of the output function has the form of (23). Following the notation introduced earlier, we write the transform for y(·; θ) as V_{S_0}(s; θ), which is a scalar, and the only source of invariants for S_0.
Deriving the expression (and neglecting the matrix indices of Definition 9 for simplicity) yields

    V_{S_0}(s; θ) = (φ_5(θ) s + φ_4(θ)) / (s^3 + φ_3(θ) s^2 + φ_2(θ) s + φ_1(θ)),  ∀ s ∈ H_{λ̄},        (32)

where

    φ_1(θ) = k_{01} k_{12} k_{23},
    φ_2(θ) = k_{01} k_{12} + k_{01} k_{23} + k_{01} k_{32} + k_{12} k_{23} + k_{21} k_{23} + k_{21} k_{32},
    φ_3(θ) = k_{01} + k_{12} + k_{21} + k_{23} + k_{32},
    φ_4(θ) = k_{12} k_{23} x_{02},
    φ_5(θ) = k_{12} x_{02}.        (33)

We set

    φ(θ) ≜ (φ_1(θ), φ_2(θ), φ_3(θ), φ_4(θ), φ_5(θ))⊤,        (34)

and defining

    θ′ ≜ (k′_{01}, k′_{12}, k′_{21}, k′_{23}, k′_{32}, x′_{02})⊤ ∈ R_+^6        (35)

allows us to form the test equations

    φ(θ′) = φ(θ).        (36)

We have six parameters, and merely five conditions. As such, we expect that S_0 is not SGI. Solving System (36) for feasible θ′ yields the solution set:

    I(S_0, φ) = { θ′ ∈ R_+^6 | θ′ ∈ { ( k_{01} x′_{02}/x_{02},  k_{12} x_{02}/x′_{02},  (Ψ − √Π)/(2 x′_{02}),  k_{23},  (χ + √Π)/(2 x′_{02}),  x′_{02} ),
                                       ( k_{01} x′_{02}/x_{02},  k_{12} x_{02}/x′_{02},  (Ψ + √Π)/(2 x′_{02}),  k_{23},  (χ − √Π)/(2 x′_{02}),  x′_{02} ) } },        (37)

where we interpret x′_{02} as a free parameter,

    Ψ ≜ φ_3(θ) x′_{02} − 2 k_{01} (x′_{02})^2 / x_{02} − k_{12} x_{02},

and setting Ξ ≜ k_{01} + k_{21} − k_{23} allows us to write

    χ ≜ (Ξ + k_{12} + k_{32}) x′_{02} − k_{12} x_{02},
    Π ≜ (Ξ^2 + 2(k_{12} − k_{32}) Ξ + (k_{12} + k_{32})^2)(x′_{02})^2 − 2 k_{12} x_{02} (Ξ + k_{12} + k_{32}) x′_{02} + k_{12}^2 x_{02}^2.        (38)

By substituting x′_{02} = x_{02} into either of the solution families given in (37) we see that the trivial solution θ′ = θ is also valid, as we would expect. We note that the parameter k_{23} is SGI.

Even though structure S_0 contains relatively simple models, (37) with (38) show that the solutions for θ′ in terms of θ are somewhat complicated, and not particularly easy to categorise. However, we see in (37) that there are two distinct families of solutions. As x′_{02} is free in each, there are uncountably infinitely-many feasible vectors θ′ that reproduce the structure's output for a nominated θ. As such, we judge S_0 as SU.
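These calculations are mechanical, and symbolic algebra reproduces them readily. The sketch below (an illustrative reimplementation added for this overview, not code from the paper) computes V_{S_0}, reads off the invariants of (33), and asks sympy to solve the test equations (36); the solver is expected to return families of solutions in which x′_{02} (here y02) remains free, consistent with (37).

    import sympy as sp

    s = sp.symbols("s")
    k01, k12, k21, k23, k32, x02 = sp.symbols("k01 k12 k21 k23 k32 x02", positive=True)
    p01, p12, p21, p23, p32, y02 = sp.symbols("p01 p12 p21 p23 p32 y02", positive=True)  # θ′

    def invariants(k01, k12, k21, k23, k32, x02):
        """Canonical-form coefficients of V for S0: (φ3, φ2, φ1, φ5, φ4)."""
        A = sp.Matrix([[-k01 - k21, k12, 0],
                       [k21, -k12 - k32, k23],
                       [0, k32, -k23]])
        C = sp.Matrix([[1, 0, 0]])
        x0 = sp.Matrix([0, x02, 0])
        V = sp.cancel((C * (s * sp.eye(3) - A).inv() * x0)[0])
        num, den = (sp.Poly(e, s) for e in sp.fraction(V))
        lead = den.LC()                      # normalise: monic denominator
        coeffs = den.all_coeffs()[1:] + num.all_coeffs()
        return [sp.simplify(c / lead) for c in coeffs]

    phi = invariants(k01, k12, k21, k23, k32, x02)
    phi_alt = invariants(p01, p12, p21, p23, p32, y02)
    solutions = sp.solve([sp.Eq(a, b) for a, b in zip(phi_alt, phi)],
                         [p01, p12, p21, p23, p32], dict=True)
    print(len(solutions))   # expected: two families, with y02 left free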
4 Testing controlled structures for structural global identifiability

In considering the properties of a controlled state-space structure, we must account for the effects of inputs. Returning to the testing overview outlined in Proposition 1, it is appropriate to precede Step 1 with a new step:
Step 0
Specify the set of inputs which may be applied to the structure.

It is also appropriate for us to adapt the definitions that suit uncontrolled structures for this setting.
Definition 10.
Suppose we have controlled state-space model structure M having parameter set Θ and set of input functions 𝒰, and time set T ⊆ [0, ∞). For some unspecified parameter vector and input, θ ∈ Θ and u ∈ 𝒰 respectively, we illustrate M with representative model M(θ) (say, as in (6)), having state function x(·, u; θ) ∈ R^n and output function y(·, u; θ) ∈ R^k.

Suppose that for each u ∈ 𝒰 systems in M satisfy conditions:

1. Functions f(x, u, ·; θ) and g(x, u, ·; θ) are real and analytic for every θ ∈ Θ on S (a connected open subset of R^n such that x(t, u; θ) ∈ S for every t ∈ [0, τ], τ > 0).
2. For t belonging to (at least) some subinterval of [0, τ], f(x, u, t; θ) ≠ 0 for almost all θ ∈ Θ.

Given finite time τ > 0, we define

    I(M, 𝒰) ≜ {θ′ ∈ Θ : y(t, u; θ′) = y(t, u; θ) ∀ t ∈ [0, τ], ∀ u ∈ 𝒰}.        (39)

If, for almost all θ ∈ Θ:
I(M, 𝒰) = {θ}: M is structurally globally identifiable for input set 𝒰 (𝒰-SGI);
the elements of I(M, 𝒰) are denumerable: M is structurally locally identifiable for input set 𝒰 (𝒰-SLI);
the elements of I(M, 𝒰) are not denumerable: M is structurally unidentifiable for input set 𝒰 (𝒰-SU).

Remark 2.
Conditions 1 and 2 of Definition 10 play similar roles to the corresponding conditions of Definition 7. Condition 1 excludes from consideration structures subject to discontinuities in the state or output functions, for which we cannot readily define invariants. Condition 2 relates to conditions which allow us to elicit informative output from a system in M. This loosens the condition of the uncontrolled case, where a system at equilibrium at t = 0 remains there. The controlled case is different; a system at an equilibrium state may be displaced by the action of an input. However, this alone does not guarantee that the output of a controlled system is informative for any input in 𝒰. As such, Condition 2 seeks to preclude the case where the system's state is largely constant, possibly changing only at isolated points on [0, τ]. By doing so, we expect to obtain useful (non-degenerate) output, and possibly, invariants subsequently, depending on the nature of 𝒰. Should Conditions 1 and 2 not hold for any u ∈ 𝒰, it is appropriate to remove these from the input set.

Suppose M satisfies Conditions 1 and 2 of Definition 10, and we may observe M's outputs for 𝒰 containing a sufficiently broad range of inputs (e.g. the set of piecewise continuous functions defined on T, [19]). Then, within our idealised testing framework (Convention 3) we can access the structure's invariants, say φ. In such a case, rather than making a judgement on M using Definition 10, we may use φ with the more convenient Definition 8.

Let us turn our attention to the application of Definition 10 when M is a controlled compartmental LTI structure. By physical reasoning (x is real and does not exhibit jumps, and these properties are transferred to y) we expect that Condition 1 is satisfied. Checking Condition 2 may not be trivial in general, and so it may be easier to verify an alternative condition, even if this is stricter than necessary. For example, if we were to show that ẋ(t; θ) ≠ 0 for almost all θ ∈ Θ and any t ∈ [0, τ] for finite τ, then Condition 2 is satisfied.

In practice, conditions such as those of Definition 10 do not typically feature in discussions of the testing of controlled LTI structures for SGI. This is likely due to the expectation that one can access a structure's invariants if the input set meets only modest requirements: that 𝒰 is sufficiently diverse, and that the Laplace transform of any input in 𝒰 exists. Satisfying these conditions allows us to derive transfer matrices W and V as in (20), place the transfer functions contained therein in canonical form (recall Definition 9), and obtain φ from their coefficients.

In various situations, for practical or ethical reasons, one is limited in the nature and number of inputs that one can apply to some physical system. In such a case, it is not appropriate to assume that we may access φ from M. As such, the testing framework seen in Definition 8 is an inappropriate idealisation. However, we may consider the result of such a test as a “best case scenario”—we would not expect to obtain a more favourable result from a limited set of inputs. As such, if a test using φ shows that M is SU, we can be almost certain that PI applied to the output from our physical system resulting from a limited set of inputs will not obtain unique parameter estimates. Inconveniently, when the test classifies M as SGI or SLI, we cannot necessarily ascertain whether this judgement will also apply when we know that limited inputs are available.
As such, it is ppropriate to return to Definition 10 and consider a test for generic uniqueness of parameter vectors that takesinto account the set of available inputs, and which does not require invariants.Some authors have noted situations where—unlike in the testing of a structure for SGI based on invariants—we may not consider inputs as being applied sequentially to yield separate output time courses. For example, inconsidering LTI compartmental structures, Godfrey [12, Page 95] cautioned:However, when more than one input is applied simultaneously, identifiability may depend on theshape of the two inputs, and it is then essential to examine the form of the observations Y ( s ) [theLaplace transform of y ] rather than individual transfer functions.In noting the importance of the available set of inputs, Jacquez and Grief [15, Page 201] sought to distinguish“system identifiability” (which we understand as SGI) from “model identifiability” which depends on some par-ticular inputs (as we have allowed for in Definition 10). The authors noted the confusion caused by failing todistinguish between these different properties. To the best of our knowledge, the literature does not have consis-tent terminology to distinguish these concepts, which may be a consequence of how infrequently it is explicitlyconsidered.We will seek to reuse the TFA machinery in considering what parameter information we may glean from theidealised output of a compartmental LTI structure subject to a single input. Let us consider such a structure S having system matrices in SL Σ P ( n, m, k ) . Suppose that we can observe idealised output for a single input u ,that is U = { u } , and that L{ u } ( s ) exists. Then, we may obtain parameter information for testing S for SGIgiven u from L{ y } ( s ; θ ) = C ( θ ) (cid:0) s I − A ( θ ) (cid:1) − (cid:0) x ( θ ) + B ( θ ) L{ u } ( s ) (cid:1) . (40)In order to demonstrate the difference between the testing of a controlled structure when invariants are andare not obtainable, we shall consider an example structure for which different input sets are available. Recallthe SISO structure S from Sect. 1. Following definitions from Sect. 2, we rewrite the representative system instate-space form as ˙ x ( t ; θ ) = A ( θ ) · x ( t ; θ ) + B ( θ ) u ( t ) , x (0; θ ) = x ( θ ) , (41) y ( t ; θ ) = C ( θ ) · x ( t ; θ ) , (42)where the state vector is x ( t ; θ ) = (cid:2) x x x (cid:3) ⊤ , and system matrices belong to SL Σ P (3 , , . Specificallywe have x (0; θ ) = x , A ( θ ) = − k − k k k − k − k k k − k , B ( θ ) = , C ( θ ) = (cid:2) (cid:3) . (43)Recalling (21) and (22), the transfer matrices here are scalars, which henceforth we denote by S W and S V .We note that by neglecting B we obtain the uncontrolled LTI structure S (recall (27) and (28)). Structure S has the same parameter vector as S , shown in (30).Below we proceed to test S for SGI under the assumption that we can obtain its invariants. Let us assume that we have the idealised outputs of S for a sufficiently large input set U such that we canobtain S W and S V . By converting each of these rational functions into the canonical form, we may obtain eachcoefficient of s . The collection of these specifies a vector of invariants. 
We shall recall the steps of Proposition 1 in testing S for SGI.

Towards Step 1, those invariants relating to the response due to the initial conditions reside in V_S ≡ V_{S_0}. We collected these invariants in (34).

The behaviour of S differs from that of S_0 due to the invariants relating to inputs, held in W_S. Following (22), we see that W_S ≜ C(sI − A)^{−1}B, from which we obtain the transfer function in canonical form:

    W_S(s; θ) = ω_1(θ) / (s^3 + φ_3(θ) s^2 + φ_2(θ) s + φ_1(θ)),        (44)

where the denominator invariants repeat the corresponding coefficients in (32) (recall (33)), and

    ω_1(θ) = k_{12} k_{23}.        (45)

Thus, only ω_1(θ) provides an invariant that is novel compared to those from V_S(θ).

Drawing on (34) and (45), we complete Step 1 by forming the vector of distinct invariants associated with S:

    φ_S(θ) ≜ (φ_1(θ), φ_2(θ), φ_3(θ), φ_4(θ), φ_5(θ), ω_1(θ))⊤,        (46)

where φ_1, φ_2, and φ_3 are common to the denominators of V_S(θ) and W_S(θ), φ_4 and φ_5 come from the numerator of V_S(θ), and ω_1 comes from the numerator of W_S(θ).

Following Step 2 we use φ_S(θ) from (46) to form the invariants dependent on our alternative parameter vector θ′ (as in (35)), φ_S(θ′). Step 3 directs us to form the test equations φ_S(θ′) = φ_S(θ). Upon solving for feasible θ′ we obtain

    I(S, φ_S) = { θ′ ∈ R_+^6 | θ′ ∈ { (k_{01}, k_{12}, k_{21}, k_{23}, k_{32}, x_{02}),
                                       (k_{01}, k_{12}, −k_{01} + k_{23} + k_{32}, k_{23}, k_{01} + k_{21} − k_{23}, x_{02}) } }.        (47)

Equation (47) shows that we can obtain unique estimates for k′_{01}, k′_{12}, k′_{23}, and x′_{02} (i.e. the corresponding true values in θ) for any θ ∈ R_+^6. However, for each of k′_{21} and k′_{32} we see there are two distinct solutions whenever −k_{01} + k_{23} + k_{32} > 0 and k_{01} + k_{21} − k_{23} > 0. That is, the structure is SLI.

Inspection of the second solution family in (47) reveals k′_{21} + k′_{32} = k_{21} + k_{32}. This may hint that a reparameterisation of S so as to replace occurrences of k_{21} + k_{32} (which may occur in combination with other parameters) with appropriate new parameters would produce a new structure which is SGI. Whilst there are techniques for generating alternative structures that produce the same output (e.g. [21]), in general, finding a suitable reparameterisation amongst these is not a trivial undertaking. Given this, we may have to find some means of managing an SU structure. For example, we may determine bounds on the values of parameters by testing for “interval identifiability”. If the bounds are sufficiently narrow, we may tolerate an SU structure (see [13] for examples).
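The two members of (47) are easy to check by simulation. In the sketch below (an illustration with arbitrarily chosen positive parameter values and an arbitrary input, not taken from the paper), both parameter vectors produce outputs that agree to solver accuracy, as the theory predicts.

    import numpy as np
    from scipy.integrate import solve_ivp

    def simulate(theta, u, t_eval):
        k01, k12, k21, k23, k32, x02 = theta
        A = np.array([[-k01 - k21, k12, 0.0],
                      [k21, -k12 - k32, k23],
                      [0.0, k32, -k23]])
        B = np.array([0.0, 0.0, 1.0])
        def rhs(t, x):
            return A @ x + B * u(t)
        sol = solve_ivp(rhs, (t_eval[0], t_eval[-1]), [0.0, x02, 0.0],
                        t_eval=t_eval, rtol=1e-10, atol=1e-12)
        return sol.y[0]                      # y(t) = x1(t)

    theta = (0.5, 1.0, 0.7, 0.3, 0.4, 2.0)   # arbitrary positive values
    k01, k12, k21, k23, k32, x02 = theta
    # Second family of (47): k21 and k32 are exchanged for the alternatives.
    theta_alt = (k01, k12, -k01 + k23 + k32, k23, k01 + k21 - k23, x02)

    t = np.linspace(0.0, 10.0, 200)
    u = lambda t: 1.0 + np.sin(t)            # an arbitrary admissible input
    print(np.max(np.abs(simulate(theta, u, t) - simulate(theta_alt, u, t))))  # ~1e-9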
Thus, ensuring that the right-hand side of (48) is in the canonical form, and simplifying, yields the expression (49), which holds for all s ∈ ℂ and is similar to the canonical form of L{y(·, θ)}(s) (recall (32)): it has the same monic cubic denominator, and a numerator that is linear in s, with the constant term replaced by β(θ), defined as the sum of the constant numerator term of (32) and ω(θ). Recalling (33) and (45), β(θ) is a product of two rate constants multiplied by (x_0 + 1).

Remark 3.
Given the input u = δ(t − 0), and that S is an open system, mass present in the system due to the input and initial conditions is lost to the environment over time. As t → ∞, the system approaches its steady state x* = 0. We note that (49) is the Laplace transform of an output function that is a sum of exponentials in t (recall Sect. 2.3.1), as a result of being a linear combination of the individual state variables. As all components of θ are positive, all invariants in (49) are also positive. As such, we see that y is not constant. We infer that the state function x is time-varying, and that it leads to an informative output function. Thus, S for this u satisfies Condition 2 of Definition 10.

We note that (49) and (32) differ only in the constant terms of their numerators. The coefficients in (49) play a similar role to invariants, as they determine the output. As a further conceptual and notational convenience, we collect the five coefficients appearing in (49) (the three denominator coefficients, the coefficient of s in the numerator, and β(θ)) into a vector, which we denote here by φ̄(θ). Following Steps 2 and 3 of Proposition 1 leads to a system of test equations φ̄(θ′) = φ̄(θ), containing four of the five equations used in testing the uncontrolled structure for SGI.

Let us consider the difference between the systems of equations which follow from φ(θ) and φ̄(θ). The analysis of S with a sufficiently rich input set produces separate equations from the constant numerator terms of S_V and S_W, whereas in analysing the output of S due to the single input here, the novel equation is due to β(θ), which combines them. This allows two of the rate constants and x_0 more freedom than that permitted by the corresponding equations from φ(θ). Thus, solving φ̄(θ′) = φ̄(θ) yields an even more complicated solution set than that seen in (37) and (38). As a kindness to the reader, we shall not present the solution sets here. However, classification of the structure is straightforward: φ̄(θ) provides five equations, yet we have six parameters. Thus, when the input set is U = {δ(t − 0)}, we classify S as U-SU.

This is a less-favourable result than the classification of S as SLI (recall the assumption that outputs are available for a sufficiently broad range of inputs), as demonstrated in (47). This result reinforces the claim that, when intending to test a structure for SGI, it is appropriate to specify the inputs which will be applied to the physical system. Thence, we may judge whether or not the associated idealised output allows determination of invariants, and use this knowledge in choosing an appropriate testing method.

This overview has aimed to highlight the benefits of testing model structures for the property of structural global identifiability (SGI). Moreover, by assembling crucial definitions, drawing important distinctions, and providing test examples, we have sought to illuminate some important concepts in the field of identifiability analysis. We hope that this will encourage and assist interrogation of proposed structures so as to recognise those that are not SGI. This will allow researchers to anticipate the frustrations almost certain to accompany the use of a non-SGI structure (especially, an unidentifiable one) in modelling and parameter estimation.

Progress in the field of identifiability analysis is ongoing through the development of new methods of testing structures for SGI or SLI, and refinements to their implementation. However, certain practical matters are yet to receive widespread consideration. We conclude with brief comments on a selection of these.
Competition—or collaboration—between testing methods?
Over a period of time, the literature has reported that one cannot generally anticipate which method will be easiest to apply to a given case (e.g. [12, Page 96]), or that testing methods may suit some problems more than others (e.g. [7]). Consequently, when considering software implementations of testing methods, we may not be able to anticipate which method will produce a result in the shortest time, or at all. This uncertainty has prompted various comparisons aimed at evaluating the utility of alternative methods for testing structures for SGI.

One may wonder if a competitive treatment of methods is a limiting one. That is, might there be benefits in combining methods so as to draw upon their strengths? For example, in considering controlled compartmental LTI structures, the TFA provides a means of ascertaining whether or not a structure is generically minimal. If the conclusion is positive, we may then change our approach and apply a suitable testing method that uses a type of invariant expected to be simpler than those used in the TFA. For example, we may choose Markov and initial parameters as invariants, expecting these polynomials in the parameters to have a lower degree than those seen in transfer function coefficients. Given such simpler invariants, the resulting test equations will have a reduced algebraic complexity. We could reasonably expect to solve these more quickly than equations obtained from the TFA.
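As a rough sketch of this idea, reusing A, B, C and x0 from the hypothetical SymPy example given earlier (so again under illustrative assumptions), the first few Markov parameters C A^j B and initial parameters C A^j x0 are immediate to compute and are polynomial in the parameters:

# Markov parameters C*A^j*B and "initial" parameters C*A^j*x0 for the
# hypothetical system; these are candidate invariants of lower degree than
# the transfer-function coefficients computed earlier.
markov = [sp.expand((C * A**j * B)[0]) for j in range(4)]
initial = [sp.expand((C * A**j * x0)[0]) for j in range(4)]
print(markov)
print(initial)

# Test equations would then equate each such invariant evaluated at an
# alternative parameter vector theta' with its value at theta, and could be
# passed to sp.solve; this is left as an outline only.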
Reproducibility of analysis
There is growing concern over the reproducibility of studies in computational biology ([16]). We expect a greater awareness of identifiability analysis to encourage the asking of questions that will contribute to a rigorous and defensible modelling practice. Beyond this, we may also ponder how to promote reproducibility through the processes by which identifiability analysis is undertaken.

For all but the simplest cases, testing a structure for SGI requires the use of a computer algebra system (CAS). Often this is a commercial product, such as Maple™, Mathematica, or MATLAB. However, as for all complex computer code, one cannot necessarily guarantee that results produced by a CAS will be correct in all situations (see, for example, [2], noting a limitation of certain versions of Maple™). As such, it is good practice for us to check that results obtained from one CAS agree with those from another.

Performing such a comparison might not be straightforward. Recall that the classical approach to testing a structure for SGI requires the solution of a system of algebraic equations. If two CASs employ differing methods in solving a given system, the solution sets may appear quite dissimilar, even if they are, in fact, the same. This complicates the task of determining whether or not the solution sets are equivalent.

We may be able to make choices that reduce the complexity of the comparison problem. One approach is to seek to direct the output of CASs by specifying similar options in their commands where this is possible. For example, the "solve" command in Maple™ allows the user to specify various options, including some relating to how any solutions are displayed. Another Maple™ option allows some variables to be specified as "functions of a free variable". We may be encouraged to use this given the form of solutions obtained from another CAS which we would like to emulate.

The seeming dissimilarity of solutions may be due to features of CAS solution algorithms that we cannot directly control. As such, we may seek to manage these by further scrutinising our equations (or, more fundamentally, our invariants φ(θ)) before we attempt to solve them.

Suppose that each (multivariate polynomial) element of φ(θ) is some combination of simpler polynomials in the parameters θ. We may determine these new polynomials by calculating a Gröbner basis for φ(θ). (We may consider a Gröbner basis for a list of polynomials as analogous to the reduced row-echelon form of a system of linear equations.) This requires an "ordering" of parameters, which determines how terms are arranged within a polynomial, and how monomials are arranged within terms. For example, under pure lexicographical ordering with x > y, the polynomial x^2 y + 2xy − x + y has its terms arranged by decreasing degree of monomials in x, and within each term any monomial in x appears before one in y; changing the ordering to y > x yields the alternative form y x^2 + 2yx + y − x. We may obtain differing bases depending on the chosen ordering.

In certain solution methods (such as "nonlinsolve" in version 1.4 of the Python package SymPy) a CAS may (effectively) calculate a Gröbner basis for the invariants, choosing an ordering without user input. In such cases, should different CASs employ differing orderings, the solutions of test equations may appear quite different. As such, it may be useful for the user to obtain a Gröbner basis for a specified ordering, and use this in formulating test equations for each CAS.

We shall illustrate the importance of the choice of ordering by returning to our example structure S. In Maple 2019 (version 1) we used the "Basis" command (from the "Groebner" package) to compute Gröbner bases for φ(θ) under different orderings. We varied the ordering of parameters, as specified by the "plex()" option (pure lexicographical ordering).
The ordering indicates a decreasing preference for eliminating parameters from our input polynomials (here, our invariants) as we proceed from the start of the list, with the aim of forming a "triangular" system in θ. As such, those parameters occurring earlier in the list are more likely to be eliminated than those occurring later.

Using an ordering in which the five rate constants precede x_0 yields the Gröbner basis b_1(θ) shown in (50). It comprises four polynomials: a product of one rate constant with x_0; a product of two rate constants; a longer polynomial combining several of the rate constants; and the sum of the five rate constants. Alternatively, using an ordering in which x_0 appears third in the list, Maple™ produces the Gröbner basis b_2(θ) shown in (51), which also comprises four polynomials. The bases b_1(θ) and b_2(θ) are not identical, having only two elements (the first and fourth elements of b_1(θ)) in common. (We also note that although (46) shows φ(θ) as comprising six invariants, (50) (or (51)) shows that in the testing of S for SGI, θ is subject to only four independent conditions.)

Suppose now that, in a similar manner as we did for φ(θ), we use b_1(θ) and b_2(θ) in turn to define two distinct systems of four SGI test equations. The associated solution sets for θ′, I(S, b_1) and I(S, b_2) respectively, as determined by Maple™, appear to be quite different. For example, I(S, b_1) shows two of the rate constants as free parameters, whereas I(S, b_2) has a different pair free. This result suggests that using a Gröbner basis of our invariants, computed under a specified ordering, to define SGI test conditions may remove one cause of unwanted variation between results obtained by different CASs.

When faced with (potential or actual) disparities between CAS results, access to the source code may illuminate the cause of the divergence, and contribute to its resolution. However, certain CASs do not permit such access to the source. In light of this, we are currently developing open-source code using the programming language Python, making particular use of the SymPy (symbolic algebra) package. By implementing this in the Jupyter notebook environment, we intend to develop implementations of testing algorithms (as we have for the TFA approach) that are readily accessible to the scientific community, and permit user customisation.
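A sketch of the analogous experiment in SymPy appears below. The short polynomial system is a hypothetical stand-in for the invariants φ(θ), which are not reproduced here; it serves only to show that the reported basis can depend on the chosen lexicographic ordering of the symbols, which is the effect discussed above.

import sympy as sp

k1, k2, k3, x_init = sp.symbols('k1 k2 k3 x_init', positive=True)

# Hypothetical stand-in for a vector of invariant-matching conditions.
conditions = [k1 * x_init - 2, k1 + k2 + k3 - 5, k2 * k3 - 4]

# Pure lexicographic ordering: symbols listed earlier are eliminated preferentially.
g1 = sp.groebner(conditions, k1, k2, k3, x_init, order='lex')
g2 = sp.groebner(conditions, k3, k2, x_init, k1, order='lex')

print(g1.exprs)  # basis under the first ordering
print(g2.exprs)  # generally a different-looking basis under the second ordering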
Acknowledgements

The author is grateful to the organisers of the programme "Influencing public health policy with data-informed mathematical models of infectious diseases" at MATRIX (Creswick, Victoria, July 1-12 2019) for the invitation to present, and exposure to aspects of infectious disease modelling. Appreciation goes also to the organisers of the programme "Identifiability problems in systems biology" held at the American Institute of Mathematics, San Jose, California (August 19-23 2019), and its participants, for useful discussions on contemporary problems.

References

[1] American Institute of Mathematics: Identifiability problems in systems biology (2019). URL https://aimath.org/workshops/upcoming/identbio/
[2] Armando, A., Ballarin, C.: A reconstruction and extension of Maple's assume facility via constraint contextual rewriting. Journal of Symbolic Computation, 503–521 (2005)
[3] Audoly, S., Bellu, G., D'Angiò, L., Saccomani, M.P., Cobelli, C.: Global identifiability of nonlinear models of biological systems. IEEE Transactions on Biomedical Engineering (1), 55–65 (2001)
[4] Bellman, R., Åström, K.J.: On structural identifiability. Mathematical Biosciences, 329–339 (1970). DOI 10.1016/0025-5564(70)90132-X
[5] Benvenuti, L., Farina, L.: Minimal positive realizations: a survey of recent results and open problems. Kybernetika (2), 217–228 (2003)
[6] Caines, P.E.: Linear Stochastic Systems. John Wiley & Sons, Inc. (1988)
[7] Chis, O.T., Banga, J.R., Balsa-Canto, E.: Structural identifiability of systems biology models: a critical comparison of methods. PLoS ONE (11), e27755 (2011)
[8] Cobelli, C., DiStefano III, J.J.: Parameter and structural identifiability concepts and ambiguities: a critical review and analysis. American Journal of Physiology-Regulatory, Integrative and Comparative Physiology (1), R7–R24 (1980). DOI 10.1152/ajpregu.1980.239.1.R7. URL https://doi.org/10.1152/ajpregu.1980.239.1.R7. PMID: 7396041
[9] Cox Jr., L.A., Huber, W.A.: Symmetry, Identifiability, and Prediction Uncertainties in Multistage Clonal Expansion (MSCE) Models of Carcinogenesis. Risk Analysis: An International Journal (6), 1441–1453 (2007)
[10] Denis-Vidal, L., Joly-Blanchard, G.: Equivalence and identifiability analysis of uncontrolled nonlinear dynamical systems. Automatica (2), 287–292 (2004)
[11] DiStefano III, J.: Dynamic Systems Biology Modeling and Simulation. Academic Press (2015)
[12] Godfrey, K.: Compartmental Models and Their Application. Academic Press Inc. (1983)
[13] Godfrey, K., DiStefano III, J.: Identifiability of model parameters. IFAC Proceedings Volumes (5), 89–114 (1985)
[14] van den Hof, J.M.: Structural identifiability from input-output observations. Tech. Rep. BS-9514, Centrum voor Wiskunde en Informatica, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands (1995)
[15] Jacquez, J.A., Greif, P.: Numerical parameter identifiability and estimability: Integrating identifiability, estimability and optimal sampling design. Mathematical Biosciences (1-2), 201–227 (1985)
[16] Laubenbacher, R., Hastings, A.: Editorial. Bulletin of Mathematical Biology (12), 3069–3070 (2018). DOI 10.1007/s11538-018-0501-8. URL https://doi.org/10.1007/s11538-018-0501-8
[17] Poljak, S.: On the gap between the structural controllability of time-varying and time-invariant systems. IEEE Transactions on Automatic Control (12), 1961–1965 (1992)
[18] Seber, G.A.F., Wild, C.J.: Nonlinear Regression. Wiley Series in Probability and Statistics. Wiley (2003)
[19] Vajda, S.: Structural equivalence of linear systems and compartmental models. Mathematical Biosciences (1-2), 39–64 (1981)
[20] Villaverde, A.F., Barreiro, A., Papachristodoulou, A.: Structural Identifiability of Dynamic Systems Biology Models. PLoS Computational Biology (10), e1005153 (2016)
[21] Walter, E., Lecourtier, Y.: Unidentifiable compartmental models: what to do? Mathematical Biosciences (1-2), 1–25 (1981)
[22] Walter, É., Pronzato, L.: Identification of Parametric Models from Experimental Data. Communication and Control Engineering. Springer (1997)
[23] Whyte, J.M.: On Deterministic Identifiability of Uncontrolled Linear Switching Systems. WSEAS Transactions on Systems (5), 1028–1036 (2007)
[24] Whyte, J.M.: A preliminary approach to deterministic identifiability of uncontrolled linear switching systems. In: 3rd WSEAS International Conference on Mathematical Biology and Ecology (MABE'07), Proceedings of the WSEAS International Conferences. Gold Coast, Queensland, Australia (2007)
[25] Whyte, J.M.: Inferring global a priori identifiability of optical biosensor experiment models. In: G.Z. Li, X. Hu, S. Kim, H. Ressom, M. Hughes, B. Liu, G. McLachlan, M. Liebman, H. Sun (eds.) IEEE International Conference on Bioinformatics and Biomedicine (IEEE BIBM 2013), pp. 17–22. Shanghai, China (2013). DOI 10.1109/BIBM.2013.6732453
[26] Whyte, J.M.: Global a priori identifiability of models of flow-cell optical biosensor experiments. Ph.D. thesis, School of Mathematics and Statistics, University of Melbourne, Victoria, Australia (2016)
[27] Yamada, T., Luenberger, D.G.: Generic Controllability Theorems for Descriptor Systems. IEEE Transactions on Automatic Control AC-30 (2), 144–152 (1985)