Forward sensitivity analysis of the FitzHugh-Nagumo system: Parameter estimation
A Preprint
Shady E. Ahmed
School of Mechanical & Aerospace Engineering, Oklahoma State University, Stillwater, OK 74078, USA. [email protected]
Omer San
School of Mechanical & Aerospace Engineering, Oklahoma State University, Stillwater, OK 74078, USA. [email protected]
Sivaramakrishnan Lakshmivarahan
School of Computer Science, University of Oklahoma, Norman, OK 73019, USA. [email protected]

Abstract
The FitzHugh-Nagumo (FHN) model, from computational neuroscience, has attracted attention in nonlinear dynamics studies as it describes the behavior of excitable systems and exhibits interesting bifurcation properties. The accurate estimation of the model parameters is vital to understand how the solution trajectory evolves in time. To this end, we provide a forward sensitivity method (FSM) approach to quantify the main model parameters using sparse measurement data. FSM constitutes a variational data assimilation technique which integrates model sensitivities into the process of fitting the model to the observations. We analyze the applicability of FSM to update the FHN model parameters and predict its dynamical characteristics. Furthermore, we highlight a few guidelines for observation placement to control the shape of the cost functional and improve the parameter inference iterations.

Keywords: Forward sensitivity, parameter estimation, FitzHugh-Nagumo model, data assimilation.
Dynamical systems are ubiquitous around us and in every scientific discipline. Examples from the physical sciences include atmospheric and oceanic flows, heat and mass transfer, the behavior of moving objects (e.g., cars, ships, airplanes, rockets, and pendulums), chemical reactions, and signal transmission. In the social sciences, population growth and distribution, human interactions, and cultural developments over centuries have been following interesting dynamical patterns. Researchers and practitioners in the life sciences have also found that applying dynamical systems theory to bodies, organs, and cells yields significant advances in our understanding and treatment of the body. In neuroscience, understanding brain performance and the response to external stimuli has been critical for epilepsy prevention and treatment. Several dynamical models have been historically proposed and investigated to study and analyze neuronal activity (e.g., see [1]). The FitzHugh-Nagumo (FHN) equations [2–4] represent one of the most popular and simple models in the study of neurophysiology. In addition to its utility for modeling biological behavior, the FHN system is considered a prototypical model in the study of nonlinear dynamics due to its interesting characteristics, such as its bifurcation properties [5].
The two-equation FHN model, describing neuronal spike discharges, can be defined as

dv/dt = v − v^3/3 − w + I,  (1)
τ dw/dt = v + a − bw,  (2)

where v defines the membrane potential, w stands for a recovery variable, and τ is the time scale. I represents the external input current, while a and b are controlling parameters. The FHN model might appear in various forms, which can be related to Eqs. 1–2 by changes of variables and coordinate transformations. It describes the dynamics of excitable systems, which can be observed in various natural settings such as neuronal dynamics, electrocardiology, chemical reactions, and climate dynamics. However, the parameters of the FHN model are difficult to measure directly in real-world experiments, and their estimation has attracted considerable interest in the physiological sciences. We shall see in the following discussions that the specification of the model parameters is crucial for predicting the system's behavior. For instance, the system can either converge to a stable fixed point or exhibit a limit cycle. Thus, knowledge of these parameters can be very useful for diagnostic as well as prediction purposes, and the objective of the current study is to estimate the model's parameters from a few (possibly noisy) measurements of the system's state.

The parameter estimation problem for the FHN model can generally be addressed via standard techniques such as simulated annealing, genetic algorithms, differential evolution, and Kalman filtering extensions. Besides, the known model structure and characteristics can be utilized to customize an estimation algorithm for the respective model. For example, the time-scale separation in the FHN model has been exploited to infer the model's parameters [6]. Che et al. [7] solved the parameter estimation problem by deriving a second-order differential equation for the membrane potential, being the observed quantity.
A least-squares regression was then applied, equipped with a wavelet denoising technique to reduce the effect of noise contamination. Geng et al. [8] applied an expectation-maximization based algorithm to identify generic FHN model parameters and estimate the variance of the interfering Gaussian noise. Jensen et al. [9] applied a Markov chain Monte Carlo method to infer the parameters in a stochastic FHN model, constructed by adding a noise term governed by a Brownian motion. Melnykova [10] proposed a contrast estimator technique to infer the model's parameters in the asymptotic setting.

In the present study, we utilize a variational data assimilation technique, namely the forward sensitivity method (FSM) [11, 12], to identify the correct parameter values. The inherent sensitivity analysis reveals the relative dependence of the cost functional, defined by the discrepancy between the identified model's predictions and the actual observations, on the respective parameters. We also investigate the effect of observation placement instants on the shape of the cost functional and the corresponding sensitivities. We finally highlight measurement collection guidelines that potentially improve the parameter inference iterations.

The FHN model can be described as

ẋ = f(x, α),  (3)

where x = [v(t), w(t)]^T denotes the system's state, α = [a, b]^T is the vector of model parameters, and f represents the continuous-time dynamics of the FHN model (i.e., f(x; α) = [v − v^3/3 − w + I, (v + a − bw)/τ]^T). Assuming the model f is continuously differentiable in its arguments (i.e., x and α), its Jacobians with respect to the state x and the parameters α can be defined as

Df_x = [[1 − v^2, −1], [1/τ, −b/τ]],  Df_α = [[0, 0], [1/τ, −w/τ]],  (4)

where Df_x and Df_α define the model's sensitivity with respect to the state x and the parameters α, respectively.
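For concreteness, Eqs. 3–4 can be sketched in a few lines of Python (a minimal illustration, not the authors' code; the parameter and time-scale values below are placeholders):

```python
import numpy as np

# FitzHugh-Nagumo right-hand side, Eq. 3:
# dv/dt = v - v^3/3 - w + I,  tau * dw/dt = v + a - b*w
def f(x, alpha, I=0.0, tau=10.0):
    v, w = x
    a, b = alpha
    return np.array([v - v**3 / 3.0 - w + I,
                     (v + a - b * w) / tau])

# Jacobian with respect to the state x = [v, w], Eq. 4
def Df_x(x, alpha, tau=10.0):
    v, w = x
    a, b = alpha
    return np.array([[1.0 - v**2, -1.0],
                     [1.0 / tau, -b / tau]])

# Jacobian with respect to the parameters alpha = [a, b], Eq. 4
def Df_alpha(x, alpha, tau=10.0):
    v, w = x
    return np.array([[0.0, 0.0],
                     [1.0 / tau, -w / tau]])
```

Evaluating these at the origin recovers the entries of Eq. 4 directly, which is a convenient sanity check before building the discrete-time machinery.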
Using a suitable temporal integration scheme, the FHN model can be rewritten in a discrete-time form as follows,

x(k+1) = M(x(k), α),  (5)

where x(k) = [v(t_k), w(t_k)]^T ∈ R^2 defines the system's state at time t_k, and M : R^2 × R^2 → R^2 represents the one-step state transition map. Thus, the following discrete-time Jacobians can be computed,

DM_x(k) = [∂M_i(x, α)/∂x_j]|_{x = x(k)},  DM_α(k) = [∂M_i(x, α)/∂α_j]|_{x = x(k)}.  (6)
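A minimal sketch of the one-step map M in Eq. 5 and the discrete Jacobians in Eq. 6 might look as follows. The fourth-order Runge-Kutta step mirrors the scheme used later in the paper, while the step size, the parameter values, and the central finite-difference approximation of the Jacobians are illustrative choices:

```python
import numpy as np

def f(x, alpha, I=0.0, tau=10.0):
    v, w = x
    a, b = alpha
    return np.array([v - v**3 / 3.0 - w + I, (v + a - b * w) / tau])

# One-step transition map M of Eq. 5, here a classical RK4 step
def M(x, alpha, dt=0.1, I=0.0, tau=10.0):
    k1 = f(x, alpha, I, tau)
    k2 = f(x + 0.5 * dt * k1, alpha, I, tau)
    k3 = f(x + 0.5 * dt * k2, alpha, I, tau)
    k4 = f(x + dt * k3, alpha, I, tau)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

# Discrete Jacobians of Eq. 6, approximated by central finite differences
def DM_x(x, alpha, eps=1e-6, **kw):
    J = np.zeros((2, 2))
    for j in range(2):
        d = np.zeros(2); d[j] = eps
        J[:, j] = (M(x + d, alpha, **kw) - M(x - d, alpha, **kw)) / (2 * eps)
    return J

def DM_alpha(x, alpha, eps=1e-6, **kw):
    J = np.zeros((2, 2))
    for j in range(2):
        d = np.zeros(2); d[j] = eps
        J[:, j] = (M(x, alpha + d, **kw) - M(x, alpha - d, **kw)) / (2 * eps)
    return J
```

For a small step size, DM_x ≈ I + Δt Df_x, which ties the discrete Jacobians back to the continuous ones in Eq. 4; in a production code one would typically differentiate the RK4 step analytically or with automatic differentiation instead.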
Furthermore, we define the sensitivity of the model forecast at any time t_k with respect to the model's parameters as follows,

V(k) = [∂x_i(k)/∂α_j] ∈ R^{2×2}.  (7)

Equation 5 can be used to evaluate the forward sensitivity matrices at different times in a recursive way as

V(k+1) = DM_x(k) V(k) + DM_α(k),  (8)

with V(0) = 0, since the initial condition x(0) is independent of the model's parameters α. Let z(k) ∈ R^m be the vector of measurements at time t_k, given by

z(k) = h(x̄(k)) + ξ(k),  (9)

where h : R^2 → R^m is the observational operator that relates the model space to the observation space, x̄ defines the true system's state, and ξ denotes the measurement noise. For simplicity, we suppose that we directly measure the system's state (i.e., h(x(k)) = x(k)). We also assume that ξ is white Gaussian noise with zero mean and covariance matrix R (i.e., ξ(k) ∼ N(0, R(k))).

We define the difference between the model forecast and measurements as e(k) = z(k) − h(x(k)), which is called the innovation or forecast error (computed in the observation space). With the assumption that the dynamical model is perfect (i.e., it correctly encapsulates all the relevant processes) and the initial condition x(0) is known, the deterministic part of the forecast error can be attributed to the inaccuracy of the model's parameter values, defined as δα = ᾱ − α, where ᾱ denotes the true values of the parameters. Thus, we can define a cost functional J : R^2 → R as

J(α) = Σ_{k=1}^{N} ||e(k)||^2_{R^{-1}(k)} = Σ_{k=1}^{N} e(k)^T R^{-1}(k) e(k),  (10)

where N is the number of measurement instants. The minimization of the cost functional J can be posed as a strongly constrained problem with the standard Lagrange multiplier method, resulting in the adjoint framework.
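The recursion in Eq. 8 and the cost functional in Eq. 10 can be sketched as follows (again an illustrative implementation, assuming h(x) = x, R(k) = σ²I, and placeholder parameter values; the discrete Jacobians are approximated by finite differences):

```python
import numpy as np

def f(x, al, I=0.0, tau=10.0):
    v, w = x
    a, b = al
    return np.array([v - v**3 / 3.0 - w + I, (v + a - b * w) / tau])

def M(x, al, dt=0.1):
    # one RK4 step of the FHN model, Eq. 5
    k1 = f(x, al); k2 = f(x + 0.5 * dt * k1, al)
    k3 = f(x + 0.5 * dt * k2, al); k4 = f(x + dt * k3, al)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def jac_fd(g, u, eps=1e-6):
    # central-difference Jacobian of g at u
    m = len(g(u)); J = np.zeros((m, len(u)))
    for j in range(len(u)):
        du = np.zeros(len(u)); du[j] = eps
        J[:, j] = (g(u + du) - g(u - du)) / (2.0 * eps)
    return J

def propagate_sensitivities(alpha, x0, n_steps, dt=0.1):
    """Recursion of Eq. 8: V(k+1) = DM_x(k) V(k) + DM_alpha(k), V(0) = 0."""
    x = np.asarray(x0, float).copy()
    V = np.zeros((2, 2))              # x(0) does not depend on alpha
    xs, Vs = [x.copy()], [V.copy()]
    for _ in range(n_steps):
        DMx = jac_fd(lambda u: M(u, alpha, dt), x)
        DMa = jac_fd(lambda p: M(x, p, dt), alpha)
        V = DMx @ V + DMa
        x = M(x, alpha, dt)
        xs.append(x.copy()); Vs.append(V.copy())
    return xs, Vs

def cost(alpha, x0, z_obs, dt=0.1, sigma=0.1):
    """Cost functional of Eq. 10 with h(x) = x and R(k) = sigma^2 I;
    z_obs maps a time index k to the observation z(k)."""
    xs, _ = propagate_sensitivities(alpha, x0, max(z_obs), dt)
    return sum(np.sum((z - xs[k])**2) / sigma**2 for k, z in z_obs.items())
```

The matrices V(k) returned here are exactly what the least-squares parameter update described next is assembled from, so one pass through the model yields both the forecast and its sensitivities.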
Alternatively, we utilize the forward sensitivity matrices to evaluate an optimal estimate of the parameters α. Let δx(k) = x̄(k) − x(k) be the difference between the true state and the model's forecast, with δx(0) = 0 since the initial conditions are perfectly known. With first-order Taylor expansions of e(k) and δx(k), the following expressions can be defined,

e(k) = Dh(k) δx(k),  δx(k) = V(k) δα,  (11)

where Dh is the Jacobian of the observational operator h. Therefore, the forecast error can be related to the correction to the model's parameters as e(k) = Dh(k) V(k) δα. Since we assume that h(x(k)) = x(k), Dh reduces to the identity matrix. The previous forecast error formulation can be written for all N time instants at which observations become available, and the following linear system is obtained,

H δα = e_F,  (12)

where the matrix H ∈ R^{Nm×2} and the vector e_F ∈ R^{Nm} are defined by stacking the per-instant blocks as

H = [Dh(1)V(1); Dh(2)V(2); … ; Dh(N)V(N)],  e_F = [e(1); e(2); … ; e(N)].  (13)

The inverse problem can be solved in a weighted least-squares sense to find an optimal correction vector δα, with R^{-1} as a weighting matrix, where R is an Nm × Nm block-diagonal matrix with R(k) as its k-th diagonal block. We assume that R is a diagonal matrix defined as R = σ² I_{Nm}, where I_{Nm} is the Nm × Nm identity matrix. Then, the solution to Eq. 12 can be written as

δα = (H^T R^{-1} H)^{-1} H^T R^{-1} e_F.  (14)

In order to select the time instants at which measurement data are collected, we relate the cost functional given in Eq. 10 to the forward sensitivity matrix V(k). This is based on the method proposed by Lakshmivarahan et al. [13] to control the shape of the cost functional and keep its gradient away from zero to accelerate the convergence. By substituting e(k) = Dh(k) V(k) δα into Eq.
10, we get the following,

J(α) = Σ_{k=1}^{N} δα^T (E(k)^T R^{-1}(k) E(k)) δα = Σ_{k=1}^{N} δα^T G(k) δα,  (15)

where E(k) = Dh(k) V(k) and G(k) = E(k)^T R^{-1}(k) E(k). We note that G(k) is called the observability Gramian. The gradient of the cost functional with respect to the parameter vector α can be written as

∇_α J(α) = −2 Σ_{k=1}^{N} G(k) δα,  (16)

which relates the gradient of the cost functional to the parameterization error/correction. From Eq. 16, a necessary condition for the minimization of the cost functional is that G(k) is positive definite. For the case considered here, Dh(k) = I and R^{-1}(k) = (1/σ²) I. Thus, G(k) = (1/σ²) V(k)^T V(k), where V(k) = [[V_11, V_12], [V_21, V_22]]. Therefore, one way to guarantee that the gradient of the cost functional does not hit zero, and thereby improve the convergence, is to select the measurement instants such that the diagonal entries of V(k)^T V(k) (i.e., V_11² + V_21² and V_12² + V_22²) are as large as possible.

We analyze the capability of the forward sensitivity approach to identify the FHN model's parameters. In particular, we study an arbitrary case where the true parameter values are a = 0. and b = 0. Initial conditions of (v(0), w(0)) = (0., 0.) are considered, and the fourth-order Runge-Kutta scheme is applied for time integration with a time step of Δt = 0., time scale τ = 10, and a maximum time of t_m = 100. We assume that the measurements are collected every 200 time steps, corrupted by additive Gaussian noise with zero mean and a standard deviation of σ = 0.

As a first investigation, we study the case with zero input (i.e., I = 0). This corresponds to a fixed point of (v*, w*) = (−0., −0.). The eigenvalues of the model Jacobian at this point are λ_1 = 0., λ_2 = 0., both positive, implying an unstable equilibrium point.
However, a Lyapunov function analysis reveals that the solution of this system is bounded and exhibits an attractive limit cycle [14–18]. In Figure 1, we plot the time evolution of the membrane potential, v, and the recovery variable, w, for the true system compared to the case with the inferred parameter values. Starting from a prior guess of a = 0. and b = 0. to initiate the FSM iterations, a parameterization of a = 0. and b = 0. is identified, very close to the true values. Thus, we can see that the adopted FSM approach is adequately capable of assimilating these noisy data to estimate the model's parameters for this case.

Figure 1: Results for a = 0., b = 0., and I = 0, with measurements every 200 time steps and σ = 0. Estimated parameters are a = 0. and b = 0.

A second test is to apply a constant input of I = 5, with the same parameter values as before. We find that this case corresponds to a stable fixed point. In other words, the solution trajectory converges to the equilibrium point (v*, w*) = (1., 0.) and resides there. We apply the same procedure to estimate the model's parameters starting with an initial guess of a = 0. and b = 0. The plots in Figure 2 show that the iterative algorithm fails to correctly approximate the parameter values and produces a periodic solution instead. To understand this, we compute the fixed point and the eigenvalues of the corresponding model Jacobian. We find that with I = 0 (the previous case), both the true values (a, b) = (0., 0.) and the initial guess (a, b) = (0., 0.) induce a periodic limit cycle. On the other hand, for I = 5, the true parameter values correspond to a stable fixed point, while the initial guess still yields a cyclic behavior. Therefore, the estimation process has to cross bifurcation points in order to reach the correct parameterization, which is a common difficulty in parameter estimation frameworks.

Figure 2: Results for a = 0., b = 0.
, and I = 5, with measurements every 200 time steps and σ = 0. Estimated parameters are a = −0. and b = −0., starting from an initial guess of (0., 0.).

In order to mitigate this issue, prior information about the regime of the solution trajectory can be utilized to make an intelligent guess. For instance, an initial guess of (a, b) = (0., 0.) with I = 5 yields a stable fixed point, and hence can be chosen as an alternative starting point. Results are presented in Figure 3, where we can see that both the true and predicted trajectories converge to the equilibrium state. However, the estimated parameter values (a = 0. and b = 0.) are slightly far from the true ones.

Figure 3: Results for a = 0., b = 0., and I = 5, with measurements every 200 time steps and σ = 0. Estimated parameters are a = 0. and b = 0., starting from an initial guess of (0., 0.).

In order to explore the effect of the measurements on the forward sensitivities, we plot the variation of V_ij for i, j ∈ {1, 2} with time in Figure 4. We observe that the initial period has the least influence on the forward sensitivities, while the measurements around and after t = 50 have the largest effects. Therefore, we redistribute our measurement instants based on the approach described in Section 2.3. In particular, we collect data at t ∈ { , , , , } and apply the FSM framework to estimate the model's parameters. Starting from an initial guess of (a, b) = (0., 0.), a parameterization of (a, b) = (0., 0.) is estimated, showing significant improvement with respect to the case with equispaced measurement signals. Results are shown in Figure 5 for the true and predicted trajectories. We can also notice that the optimized measurements are concentrated towards the equilibrium state.
Figure 4: Forward sensitivities for a = 0., b = 0., and I = 5. Selected observation instants are denoted with orange circles.

Figure 5: Results for a = 0., b = 0., and I = 5, with measurement instants selected based on the forward sensitivity criteria. Estimated parameters are a = 0. and b = 0., starting from an initial guess of (0., 0.).

Finally, we vary the input excitation as I = 5t/t_m (i.e., linearly increasing from 0 to 5). This corresponds to a moving fixed point, beginning with a cyclic trajectory followed by convergence to the stable equilibrium. Parameter estimation results for equidistant measurement instants are depicted in Figure 6, beginning from an initial guess of (a, b) = (0., 0.). We find that the predicted trajectory sufficiently matches the true one, but the estimated parameters are not very accurate.

Figure 6: Results for a = 0., b = 0., and varying I, with measurements every 200 time steps and σ = 0. Estimated parameters are a = 0. and b = 0., starting from an initial guess of (0., 0.).

We then investigate the effects of observation times on the forward sensitivities of the model predictions. We find a spike in the sensitivity of the v predictions with respect to the v measurements around t = 12. We also see a relatively large dependence on the w measurements around t = 37. On the other hand, the w predictions show an increasing sensitivity to either v or w measurements at final times. Therefore, we reallocate our observation times to capture these trends, as demonstrated in Figure 7. Results based on this enhanced parameter estimation methodology are described in Figure 8, where the approximated parameter values (a = 0. and b = 0.) are closer to the true values.

Figure 7: Sensitivities for a = 0., b = 0., and varying I. Selected observation instants are denoted with orange circles.

Figure 8: Results for a = 0., b = 0.
, and varying I, with measurement instants selected based on the forward sensitivity criteria. Estimated parameters are a = 0. and b = 0., starting from an initial guess of (0., 0.).

We put forth a forward sensitivity analysis for the FitzHugh-Nagumo (FHN) system to infer the model's parameterization from sparse observations. The approach relies on the investigation of the forward sensitivity matrices that encapsulate the temporal dependence of the model's predictions on its parameters. The presented methodology shows substantial success in assimilating noisy observational data to identify the unknown parameters. We find that the convergence of the predicted parameters to the true values depends on the first guess used to initialize the algorithm. In particular, the initial guess has to yield equilibrium points with stability characteristics similar to the true ones. We study three test cases, including zero input, constant nonzero current, and time-dependent excitation. We also formulate measurement collection guidelines based on the relation between the cost functional and the forward sensitivity components. We demonstrate that this approach provides more accurate parameter estimates than those resulting from arbitrary measurement placement.
Acknowledgments
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under Award Number DE-SC0019290. O.S. gratefully acknowledges the U.S. DOE Early Career Research Program support.

Disclaimer: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any
information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
References

[1] Eugene M. Izhikevich. Dynamical Systems in Neuroscience: The Geometry of Excitability and Bursting. MIT Press, 2007.
[2] Richard FitzHugh. Mathematical models of threshold phenomena in the nerve membrane. The Bulletin of Mathematical Biophysics, 17(4):257–278, 1955.
[3] Richard FitzHugh. Impulses and physiological states in theoretical models of nerve membrane. Biophysical Journal, 1(6):445, 1961.
[4] Jinichi Nagumo, Suguru Arimoto, and Shuji Yoshizawa. An active pulse transmission line simulating nerve axon. Proceedings of the IRE, 50(10):2061–2070, 1962.
[5] S. Sehgal and A. J. Foulkes. Numerical analysis of subcritical Hopf bifurcations in the two-dimensional FitzHugh-Nagumo model. Physical Review E, 102(1):012212, 2020.
[6] Rose T. Faghih, Ketan Savla, Munther A. Dahleh, and Emery N. Brown. The FitzHugh-Nagumo model: Firing modes with time-varying parameters & parameter estimation. In , pages 4116–4119. IEEE, 2010.
[7] Yanqiu Che, Li-Hui Geng, Chunxiao Han, Shigang Cui, and Jiang Wang. Parameter estimation of the FitzHugh-Nagumo model using noisy measurements for membrane potential. Chaos: An Interdisciplinary Journal of Nonlinear Science, 22(2):023139, 2012.
[8] Li-Hui Geng, Terefe Bayisa Ayele, Jin-Cang Liu, and Brett Ninness. Expectation maximization based FitzHugh-Nagumo model identification under unknown Gaussian measurement noise. In , pages 2167–2172. IEEE, 2020.
[9] Anders Chr. Jensen, Susanne Ditlevsen, Mathieu Kessler, and Omiros Papaspiliopoulos. Markov chain Monte Carlo approach to parameter estimation in the FitzHugh-Nagumo model. Physical Review E, 86(4):041114, 2012.
[10] Anna Melnykova. Parametric inference for hypoelliptic ergodic diffusions with full observations. Statistical Inference for Stochastic Processes, 23(3):595–635, 2020.
[11] S. Lakshmivarahan and John M. Lewis. Forward sensitivity approach to dynamic data assimilation. Advances in Meteorology, 2010:1–12, 2010.
[12] Sivaramakrishnan Lakshmivarahan, John M. Lewis, and Rafal Jabrzemski. Forecast Error Correction Using Dynamic Data Assimilation. Springer, Switzerland, 2017.
[13] S. Lakshmivarahan, John M. Lewis, and Junjun Hu. On controlling the shape of the cost functional in dynamic data assimilation: Guidelines for placement of observations and application to Saltzman's model of convection. Journal of the Atmospheric Sciences, 77(8):2969–2989, 2020.
[14] Matthias Ringkvist. On Dynamical Behaviour of FitzHugh-Nagumo Systems. PhD thesis, Department of Mathematics, Stockholm University, 2006.
[15] E. Kaumann and U. Staude. Uniqueness and nonexistence of limit cycles for the FitzHugh equation. In Equadiff 82, pages 313–321. Springer, 1983.
[16] K. P. Hadeler, U. an der Heiden, and K. Schumacher. Generation of the nervous impulse and periodic oscillations. Biological Cybernetics, 23(4):211–218, 1976.
[17] S. A. Treskov and E. P. Volokitin. On existence of periodic orbits for the FitzHugh nerve system. Quarterly of Applied Mathematics, 54(4):601–607, 1996.
[18] Mattias Ringkvist and Yishao Zhou. On the dynamical behaviour of FitzHugh–Nagumo systems: revisited.