Fractional-order susceptible-infected model: definition and applications to the study of COVID-19 main protease
aa r X i v : . [ q - b i o . M N ] M a y FRACTIONAL-ORDER SUSCEPTIBLE-INFECTEDMODEL: DEFINITION AND APPLICATIONS TO THESTUDY OF COVID-19 MAIN PROTEASELuciano Abadias , , Gissell Estrada-Rodriguez , ErnestoEstrada , Abstract
We propose a model for the transmission of perturbations across theamino acids of a protein represented as an interaction network. The dy-namics consists of a Susceptible-Infected (SI) model based on the Caputofractional-order derivative. We find an upper bound to the analytical solu-tion of this model which represents the worse-case scenario on the propaga-tion of perturbations across a protein residue network. This upper bound isexpressed in terms of Mittag-Leffler functions of the adjacency matrix of thenetwork of inter-amino acids interactions. We then apply this model to theanalysis of the propagation of perturbations produced by inhibitors of themain protease of SARS CoV-2. We find that the perturbations producedby strong inhibitors of the protease are propagated far away from the bind-ing site, confirming the long-range nature of intra-protein communication.On the contrary, the weakest inhibitors only transmit their perturbationsacross a close environment around the binding site. These findings mayhelp to the design of drug candidates against this new coronavirus.
MSC 2010 : Primary 26A33; Secondary 33E12, 92C40, 05C82
Key Words and Phrases : Caputo derivative; Mittag-Leffler matrix func-tions; Susceptible-Infected model; COVID-19, SARS CoV-2 protease
1. Introduction
The presence of a networked structure is one of the fundamental char-acteristics of complex systems in general [11, 21]. It could be argued thatthe main function of such networks is that of allowing the communicationc (cid:13)
Year Diogenes Co., Sofiapp. xxx–xxx Abadias, Estrada-Rodriguez, Estradabetween the entities that form its structure. In the case of proteins, thenon-covalent interactions between residues in their three-dimensional struc-tures form inter-residue networks [11, 20]. These networks facilitate thatinformation about one site is transmitted to and influences the behavior ofanother. This phenomenon–the transmission of any perturbation in proteinstructure and function from one site to another–is known as allostery, whichrepresents an essential feature of protein regulation and function [10, 22].Allostery permits that two residues geometrically distant can interact witheach other. As observed experimentally by Ottemann et al. [32] a confor-mational change of 1˚A in a residue can be transmitted to another 100˚Aapart. As stated long time ago, such allosteric effects can occur even whenthe average protein structure remains unaltered [7]. An important kindof allosteric effect is the transmission of the changes produced by a lig-and interacting with a protein. Such transmission occurs from the residuesproximal to the binding site to other residues distant from it. Such kindof allosteric interaction is very important for understanding the effects ofdrugs on their receptors, which directly impacts the drug design process[24].It has been stressed by Berry [6] that there are striking similarities be-tween organization schemes at different observation scales in complex sys-tems, such as allosteric-enzyme networks, cell population and virus spread-ing. Recently, Miotto et al. [29] exploited these similarities between epi-demic spreading and a diffusive process on a protein residue network toprove the capability of propagating information in complex 3D proteinstructures. Their analogy proved useful in estimating important proteinproperties ranging from thermal stability to the identification of functionalsites [29]. In the current work, we go a step further in the exploitation of theanalogy between epidemiological models and communication processes inproteins by considering the inclusion of long-range transmission effects. Forthis purpose, we develop here a new fractional-order Susceptible-Infected(SI) model for the transmission of perturbations through the amino acids ofa protein residue network. Such perturbations are produced, for instance,by the interactions of the given protein with inhibitors, such as drugs ordrug candidates. We obtain an upper bound to the exact soluction of thisfractional-order SI model which is expressed in terms of the Mittag-Lefflermatrix functions, and which generalizes the upper bound found by Lee etal. [23] to the non-fractional (classical) SI model.Due to its current relevance, we apply the present approach to thestudy of the long-range inter-residue communication in the main proteaseof the new coronavirus named SARS-CoV-2 [37, 36]. This new coronavirushas produced an outbreak of pulmonary disease expanding from the cityRACTIONAL SI MODEL 3of Wuhan, Hubei province of China to the rest of the World in about3 months [40]. One of the most important targets for the development ofdrugs against SARS-CoV-2 is its main protease, M pro , whose 3-dimensionalstructure has been recently resolved and deposited [39] in the Protein DataBank (PDB) [1]. It is a key enzyme for proteolytic processing of polypro-teins in the virus and some chemicals have been found to bind this protein,representing potential specific drug canditades against CoV-2 [39]. Here wefind that important communication between amino acids in CoV-2 M pro oc-curs from the proximities of the binding site to very distant amino acidsin other domains of the protein. These effects produced by the interactionwith inhibitors are transmistted up to 50˚A away from the binding site, con-firming the long-range nature of intra-protein communication. Accordingto our results, it seems that stronger inhibitors transmit such perturba-tions to longer inter-residue distances. Therefore, the current findings areimportant for the understanding of the mechanisms of drug action on CoV-2 M pro , which may help to the design of drug candidates against this newcoronavirus.
2. Antecedents and Motivations2.1. Protein residue networks
The protein residue networks (PRN) (see ref. [11] Chapter 14 for de-tails) are simple, undirected and connected graphs G “ p V, E q , thereforetheir adjacency matrices are symmetric matrices of order n ˆ n and haveeigenvalues λ ą λ ě ¨ ¨ ¨ ě λ n . As the matrices are traceless, the spectralradius λ ą
0. Here v i P V, i “ , . . . , n are the nodes corresponding tothe amino acids of a protein and two nodes v i and v j are connected byan edge t v i , v j u P E if the corresponding residues (amino acids) interactsphysically in the protein. They are built here by using the informationreported on the Protein Data Bank [1] for the protease of CoV-2 as well asits complexes with three inhibitors (see further). The nodes of the networkrepresent the α -carbon of the amino acids. Then, we consider cutoff radius r C , which represents an upper limit for the separation between two residuesin contact. The distance r ij between two residues i and j is measured bytaking the distance between C α atoms of both residues. Then, when theinter-residue distance is equal or less than r C both residues are consideredto be interacting and they are connected in the PRN. The adjacency matrix A of the PRN is then built with elements defined by A ij “ " H p r C ´ r ij q i ‰ j, i “ j, (2.1) Abadias, Estrada-Rodriguez, Estradawhere H p x q is the Heaviside function which takes the value of one if x ą Here we state the main motivation of using a Susceptible-Infected (SI)model for studying the effects of inhibitor binding to a protein residuenetwork in a similar way as an SIS has been used by Miotto et al. [29].The selection of an SI model can be understood by the fact that we areinterested in the early times of the dynamics. At this stage, it has beenshown [23] that the SI model is most suitable than any other model. Tomotivate the SI model in the PRN context let us consider that an amino acidis in the binding site of a protein. Then, this amino acid is susceptible to beperturbed by the interaction with this inhibitor. Consequently, this residuecan be in one of two states, either waiting to be perturbed (susceptible) orbeing perturbed by the interaction. Of course, this amino acid can transmitthis perturbation to any other amino acid in the protein to which it interactswith. Then, if β is the rate at which such perturbation is transmittedbetween amino acids, and if s i p t q and x i p t q are the probabilities that theresidue i is susceptible or get perturbed at time t , respectively, we can writethe dynamics ds i p t q dt “ ´ βs i p t q x i p t q , (2.2) dx i p t q dt “ βs i p t q x i p t q . (2.3)Because the amino acids can only be in the states “susceptible” or“perturbed” we have that s i p t q ` x i p t q “
1, such that we can write dx i p t q dt “ β p ´ x i p t qq x i p t q . (2.4)When we consider all the interactions between pairs of residues in thePRN we should transform the previous equation into a system of equationsof the following form [28]: dx i p t q dt “ β p ´ x i p t qq ÿ j P N A ij x j p t q , t ě t , (2.5)RACTIONAL SI MODEL 5where A ij are the entries of the adjacency matrix of the PRN for the pair ofamino acids i and j , and N “ t , . . . , n u . In matrix-vector form becomes: dx p t q dt “ β r I n ´ diag p x p t qqs Ax p t q , (2.6)with initial condition x p q “ x . The evolution of dynamical systems basedon the adjacency matrix of a network have been analyzed by Mugnolo [31].It is well-known that [23]:(1) if x P r , s n then x p t q P r , s n for all t ą x p t q is monotonically non-decreasing in t ;(3) there are two equilibrium points: x ‹ “
0, i.e. no epidemic, and x ‹ “
1, i.e. full contagion;(4) the linearization of the model around the point 0 is given by dx p t q dt “ βA x p t q , (2.7)and the solution diverges when t Ñ 8 , due to the fact that thespectral radius of A is positive;(5) each trajectory with x ‰ x ‹ “
1, i.e.the epidemic spreads monotonically to the entire network.The SI model can be rewritten as11 ´ x i p t q dx i p t q dt “ β ÿ j P N A ij ´ ´ e ´p´ log p ´ x j p t qqq ¯ , (2.8)which is equivalent to dy i p t q dt “ β ÿ j P N A ij f p y j p t qq , (2.9)where y i p t q : “ g p x i p t qq “ ´ log p ´ x i p t qq P r , , (2.10)and f p y q : “ ´ e ´ y “ g ´ p y q .Lee et al. [23] have considered the following linearized version of theprevious nonlinear equation d ˆ y p t q dt “ βA diag p ´ x p t qq ˆ y p t q ` βb p x p t qq , (2.11)where ˆ x p t q “ f p ˆ y p t qq in which ˆ x p t q is the approximate solution to the SImodel, ˆ y p t q “ g p x p t qq and b p x q : “ x ` p ´ x q log p ´ x q . (2.12)They have found that the solution to this linearized model is [23]: Abadias, Estrada-Rodriguez, Estradaˆ y p t q “ e β p t ´ t q A diag p ´ x p t qq g p x p t qq` ÿ k “ p β p t ´ t qq k ` p k ` q ! r A diag p ´ x p t qqs k Ab p x p t qq . (2.13)When t “ x i p q “ c { n , i “ , , . . . , n for some positive c , the previousequation is transformed toˆ y p t q “ p { γ ´ q e γβtA ~ ´ p { γ ´ ` log p γ qq ~ , (2.14)where γ “ ´ c { n and ~ x i p q “ c { n indicates that at initial time every amino acid has the sameprobability of being perturbed by the inhibitor. Lee et al. [23] have provedthat this solution is an upper bound to the exact solution of the SI model.This result indicates that the upper bound to the solution of the SI modelis proportional to the exponential of the adjacency matrix of the network,which is the source of the subgraph centrality [14] and of the communica-bility function [13] between pairs of nodes in it. In the next section of thiswork we obtain a generalization of this upper bound based on a fractional-order SI model, which will also be formulated there.
3. Mathematical Results3.1. Definition of the fractional-order SI model
In the following we will consider a fractional SI model based on theCaputo fractional derivative of the logarithmic function of 1 ´ x i . Here, x i also denotes the probability that the residue i get perturbed at time t. First of all, we recall the definition of Caputo fractional derivative.Given 0 ă α ă u : r ,
8q Ñ R , we denote by D αt u theCaputo fractional derivative of u of order α, which is given by [25] D αt u p t q “ ż t h ´ α p t ´ τ q u p τ q dτ : “ ` h ´ α ˚ u ˘ p t q , t ą , where ˚ denotes the classical convolution product on p , and h γ p t q : “ t γ ´ Γ p γ q , for γ ą . Observe that the previous fractional derivative has sensewhenever the function is derivable and the convolution is defined (for ex-ample if u is locally integrable). The notation h γ is very useful in thefractional calculus theory, mainly by the property h γ ˚ h δ “ h γ ` δ for all γ, δ ą . Before presenting our model, we state a technical lemma which plays akey role in the main result of this section.RACTIONAL SI MODEL 7
Lemma . Let u : r ,
8q Ñ R be a derivable function with u p q “ , and ă α ă . If D αt u p t q ě for all t ą , then u p t q ě . P r o o f. Observe that by hypothesis p h ´ α ˚ u q p t q ě , therefore u p t q “ ż t u p τ q dτ “ ` h ˚ u ˘ p t q “ ` h α ˚ h ´ α ˚ u ˘ p t q ě . l Now, we recall that β will denote the perturbation rate and let s i p t q and x i p t q be the probabilities that residue i is susceptible or get perturbedat time t , respectively. Let 0 ă α ă
1, we consider the following fractionalmodel inspired by (2.2) and (2.3): $’’’’&’’’’% ż t h ´ α p t ´ τ q s i p τ q x i p τ q dτ “ ´ β α s i p t q , ż t h ´ α p t ´ τ q x i p τ q s i p τ q dτ “ β α x i p t q . Since s i p t q ` x i p t q “
1, we have ż t h ´ α p t ´ τ q x i p τ q ´ x i p τ q dτ “ β α x i p t q . (3.15)Observe that the left-hand side of the above system is the Caputo fractionalderivative of the minus logarithmic function of 1 ´ x i (see for instance [30]),that is, D αt p´ log p ´ x i qqp t q . As in the classical SI model happens, this equation is transformed into asystem of equations when we consider the interactions between the differentresidues in the protein according to the PRN. So, the fractional SI modelwhich we will study is given by ż t h ´ α p t ´ τ q x i p τ q ´ x i p τ q dτ “ β α ÿ j P N A ij x j , i P N , t ą , x i p q P r , s . (3.16)We can rewrite (3.16) in a matrix-vector form: D αt p´ log p ´ x qqp t q “ β α Ax p t q , (3.17)with initial condition x p q “ x , where A is the adjacency matrix of thePRN. This fractional SI model, based on the fractional-order derivative, hasnot been considered in the literature under our knowledge. Other fractionalcompartmental models have been previously discussed in the literature (seefor instance [4] and references therein). Abadias, Estrada-Rodriguez, EstradaNote that if x i p q “ , then by (3.16) we have x i p τ q ´ x i p τ q ě s closeto 0, and that case is not possible. So, we will consider that x ‹ i “ x ‹ i “ s i . Furthermore, if x i p q P p , q , by Lemma we have ´ log p ´ x i p t qq ě´ log p ´ x i p qq ą , then x i p t q P p , q , and therefore x i is non-decreasing.We deduce that if x p q P r , s n then x p t q P r , s n for all t ą , and thereare two equilibrium points: x ‹ “
0, i.e. no epidemic, and x ‹ “
1, i.e., fullcontagion. Also, each trajectory with x ‰ x ‹ “
1, i.e. the epidemic spreads monotonically to the entire network.One of the objects of greatest importance in the fractional calculustheory are the Mittag-Leffler functions. Let α, ν ą , they are defined by E α,ν p z q “ ÿ k “ z k Γ p αk ` ν q , z P C . (3.18)For more details on fractional calculus and Mittag-Leffler functions see theseminal works [9, 19, 26, 33, 18]. Let us note that when α “ e z . As the exponential function, the Mittag-Leffler functions canbe considered in a matrix framework. We refer the reader to Section for more details on the Mittag-Leffler matrix functions.Now we consider the linearization of (3.17) D αt ˜ x p t q “ β α A ˜ x p t q . (3.19)It is known that the solution of (3.19) is given by˜ x p t q “ E α, pp βt q α A q x : “ ÿ k “ p βt q αk A k x Γ p αk ` q , (3.20)where x is the same initial condition that in the non-linearized problem.In fact the solution diverges as t goes to infinity, that is,lim t Ñ8 ˜ x i p t q “ lim t Ñ8 E α, pp βt q α λ q ψ i n ÿ j “ ψ j x j “ lim t Ñ8 8 ÿ k “ pp βt q α λ q k Γ p αk ` q ψ i n ÿ j “ ψ j x j “ 8 , (3.21)for all v i P V in G “ p V, E q , and where ψ j is the j th entry of the eigen-vector associated to the spectral radius λ .Observe that the fractional SI model (3.16) can be rewritten as D αt y i p t q “ β α ÿ j P N A ij f p y j p t qq , RACTIONAL SI MODEL 9where y i p t q is defined as in (2.10).Now we consider the Lee-Tenneti-Eun (LTE) type transformation [23],which is also given in (2.11), which produces the following linearized equa-tion D αt ˆ y p t q “ β α ˆ A ˆ y p t q ` β α Ab p x q , (3.22)where ˆ A “ A Ω and Ω : “ diag p ´ x q . Analogous to the notation usedin (2.11), ˆ x p t q “ f p ˆ y p t qq in which ˆ x p t q is an approximate solution tothe fractional SI model, ˆ y is the solution of (3.22) with initial conditionˆ y p q “ g p x p qq and b p x q is given in (2.12). Theorem . For any t ě , we have x p t q ĺ ˆ x p t q “ f p ˆ y p t qq ĺ ˜ x p t q , under the same initial conditions x : “ x p q “ ˆ x p q “ ˜ x p q , where x p t q ĺ ˆ x p t q if x i ď ˆ x i for all i “ , , . . . , n . The solution ˆ y of (3.22) is given by ˆ y p t q “ E α, ´ p βt q α ˆ A ¯ g p x q ` ÿ k “ p βt q α p k ` q ˆ A k Ab p x q Γ p α p k ` q ` q , (3.23) and ˜ x is given by (3.20) . Furthermore, } ˆ x p t q´ x p t q} Ñ and } ˜ x p t q´ x p t q} Ñ8 as t goes to infinity. P r o o f. First of all, by the theory of fractional calculus, it is well-known that the solution of the linearized problem (3.22) is given byˆ y p t q “ E α, ´ p βt q α ˆ A ¯ g p x q` ż t τ α ´ E α,α ´ p βτ q α ˆ A ¯ β α Ab p x q dτ, (3.24)where the functions E α, p¨q and E α,α p¨q are defined as in (3.18). Therefore,since ż t τ αk ` α ´ dτ “ t αk ` α αk ` α , from (3.24) we get (3.23). For more details about linear fractional modelssee [2, 3, 5], and references therein. Notice that Eq. (3.23) is the generalizedfractional version of the one obtained by LTE by means of their Theorem . Their specific solution is recovered when α “ E , ´ βt ˆ A ¯ “ exp ´ βt ˆ A ¯ and Γ p n ` q “ p n ` q !.We have assumed that x “ x p q “ ˆ x p q “ ˜ x p q , with y p t q “ g p x p t qq and ˆ y p t q “ g p ˆ x p t qq . Since y, ˆ y are non-decreasing functions of x, ˆ x, it isenough to prove that y p t q ĺ ˆ y p t q to get x p t q ĺ ˆ x p t q . Following the paper0 Abadias, Estrada-Rodriguez, Estradaof Lee et all, since f is a concave function with f p y q “ e ´ y , we have D αt y i p t q ď β α ÿ j P N A ij p ´ x j p qq y j p t q ` β α ÿ j P N A ij b p x j p qq . Then, since y p q “ ˆ y p q , D αt y p t q ĺ D αt ˆ y p t q , so Lemma implies x p t q ĺ ˆ x p t q . Now, note that D αt ˆ x i p t q “ D αt f p ˆ y i p t qq “ ż t h ´ α p t ´ s q e ´ ˆ y i p s q ˆ y i p s q ds. Furthermore (3.23) shows that y i p s q ě s ą , then0 ď D αt ˆ x i p t q ď ż t h ´ α p t ´ s q ˆ y i p s q ds “ D αt ˆ y i p t q . Also, it is well-known [9, 18, 34] (or more recently [2, 3, 5]) that theprevious Mittag-Leffler matrix functions satisfy E α, ´ p βt q α ˆ A ¯ “ ´ h ´ α ˚ s α ´ E α,α ´ p βs q α ˆ A ¯¯ p t q“ ż t h ´ α p t ´ s q s α ´ E α,α ´ p βs q α ˆ A ¯ ds (3.25)and E α, ´ p βt q α ˆ A ¯ I “ I ` β α ˆ A ´ h α ˚ E α, ´ p βs q α ˆ A ¯¯ p t q“ I ` β α ˆ A ż t h α p t ´ s q E α, ´ p βs q α ˆ A ¯ ds. (3.26)Then, by (3.22), (3.24), (3.25) and (3.26) one gets D αt ˆ y p t q “ β α ˆ AE α, ´ p βt q α ˆ A ¯ g p x q ` β α ˆ A ´ h ˚ s α ´ E α,α ´ p βs q α ˆ A ¯¯ p t qˆ β α Ab p x q ` β α Ab p x q“ β α ˆ AE α, ´ p βt q α ˆ A ¯ g p x q ` β α ˆ A ´ h α ˚ h ´ α ˚ s α ´ E α,α ´ p βs q α ˆ A ¯¯ p t qˆ β α Ab p x q ` β α Ab p x q“ β α ˆ AE α, ´ p βt q α ˆ A ¯ g p x q ` β α ˆ A ´ h α ˚ E α, ´ p βs q α ˆ A ¯¯ ˆ p t q β α Ab p x q ` β α Ab p x q“ β α ˆ AE α, ´ p βt q α ˆ A ¯ g p x q ` E α, ´ p βt q α ˆ A ¯ β α Ab p x q“ β α AE α, ´ p βt q α ˆ A ¯ x , RACTIONAL SI MODEL 11where in the last equality we have used that Ω g p x q ` b p x q “ x . Bydefinition of Mittag-Leffler matrix function, it is easy to see that E α, ´ p βt q α ˆ A ¯ x ĺ E α, pp βt q α A q x , since ˆ A “ A Ω with Ω “ diag p ´ x q . Therefore D αt ˆ x p t q ĺ D αt ˆ y p t q ĺ β α AE α, pp βt q α A q x “ D αt ˜ x p t q , and Lemma implies ˆ x p t q ĺ ˜ x p t q . Finally, it is known that lim t Ñ8 ˜ x i p t q “ 8 and lim t Ñ8 ˆ y i p t q “ 8 . Since f is continuous, lim t Ñ8 ˆ y i p t q “ 8 , then lim t Ñ8 ˆ x i p t q “ lim t Ñ8 f p ˆ y i q p t q “ . Therefore, since lim t Ñ8 x i p t q “ } ˆ x p t q ´ x p t q } Ñ } ˜ x p t q ´ x p t q } Ñ t Ñ 8 . l Corollary . Let x ĺ , then the solution of (3.22) can bewritten as ˆ y p t q “ g p x q ` ” E α, ´ p βt q α ˆ A ¯ ´ I ı Ω ´ x . (3.27)P r o o f. Let us write Eq. (3.24) in the following wayˆ y p t q “ E α, ´ p βt q α ˆ A ¯ g p x q ` ż t s α ´ E α,α ´ p βs q α ˆ A ¯ β α A ΩΩ ´ b p x q ds, (3.28)which can be reordered asˆ y p t q “ E α, ´ p βt q α ˆ A ¯ g p x q ` „ β α ˆ A ż t s α ´ E α,α ´ p βs q α ˆ A ¯ ds Ω ´ b p x q . (3.29)So, by (3.26) we haveˆ y p t q “ E α, ´ p βt q α ˆ A ¯ g p x q ` ” E α, pp βt q α ˆ A q ´ I ı Ω ´ b p x q . (3.30)Now, it is easy to check that Ω ´ b p x q “ Ω ´ x ´ g p x q . Therefore,ˆ y p t q “ E α, ´ p βt q α ˆ A ¯ Ω ´ x ´ Ω ´ x ` g p x q , (3.31)which by reordering gives the final solution. l Let us now consider x i p q “ c { n , i “ , , . . . , n, where c is a positivereal, and let γ “ ´ c { n. Noting that diag p ´ x q “ γI , then2 Abadias, Estrada-Rodriguez, Estradaˆ y p t q “ ˆ ´ γγ ˙ E α, ´ t α β α ˆ A ¯ ~ ´ ˆ ´ γγ ` log γ ˙ ~ “ ˆ ´ γγ ˙ E α, ´ t α β α A diag p ´ x q ¯ ~ ´ ˆ ´ γγ ` log γ ˙ ~ “ ˆ ´ γγ ˙ E α, ´ t α β α γA ¯ ~ ´ ˆ ´ γγ ` log γ ˙ ~ . (3.32)We should remark that according to the result in Theorem thesolution to the fractional-order SI model obtained here represents an upperbound to the exact solution. Therefore, we will use it here as the worse-case scenario for the analysis of perturbations in real-world PRNs. Thismeans that our results should be interpreted here not as an approximationto the solution but as the most extreme situation that can happen in thepropagation of a perturbation through a protein. As we have seen in Section the upper bound of the SI model islinearly proportional to e αβtA ~
1, where A is the adjacency matrix of thegraph. That is, the only structural information about the graph whichappears in the solution of the SI model is contained in e ζA where ζ is aparameter. Here we first explain how is this information encoded in thematrix exponential. Let us start by writing e ζA “ ÿ k “ p ζA q k k ! . (3.33)We recall that a walk of length k in G is a set of nodes i , i , . . . , i k , i k ` such that for all 1 ď l ď k , p i l , i l ` q P E . A closed walk is a walk for which i “ i k ` [11]. Then, we state the following well-known result (see [11] andreferences therein). Theorem . The number of walks of length k between the nodes u and v of the graph G is given by ` A k ˘ uv . This means that ` e ζA ˘ uv counts the number of walks of any lengthbetween the nodes u and v of G penalizing them by p k ! q ´ , where k isthe length of the walk. Obviously, ` e ζA ˘ uv penalizes too heavily relativelylong walks. For instance, while a walk of length two contributes 0 . ζ to ` e ζA ˘ uv , a walk of length 6 contributes 0 . ζ . Then, if ζ ă ´ secondsfor conformational transitions to 10 ´ seconds for hydrogen bondbreaking, rotational relaxation and translational diffusion [38];(2) the existence of long-range transmission of effects, which has beenobserved to take place even between amino acids separated 100apart [32]. Notice that in terms of the PRN this represents a trans-mission between two nodes separated by 14 edges in the network.The function e ζA along cannot account for the previously mentioned im-portant characteristics of protein perturbations. Once we consider a givennetwork and a fixed value of ζ, the function e ζA can describe only oneprocess in the wide time-window previously described. For instance, sup-pose that such process is one occurring at the 10 ´ seconds scale. For thesame network and conditions we cannot model another process occurringat the 10 ´ seconds scale with the same mathematical model. At thesame time this function penalizes very heavily the long-range transmissionof perturbation effects, also avoiding a complete characterization of thephysico-chemical process.In contrast, the Mittag-Leffler matrix functions, which appear in thesolution of the fractional-order SI model, are expressed in the following way[27, 17, 35, 15] E α,ν ´ ζA ¯ “ ÿ k “ p ζA q k Γ p αk ` ν q , α, ν ą , . (3.34)Then, for a fixed network topology and fixed external conditions ζ we canstill model several processes at different time-windows by changing theMittag-Leffler parameter α . For instance, we can consider a process oc-curring at the micro-second scale modeled by using α “ .
0, while anotherprocess occurring in the same network at the pico-second scale by using α “ .
25. This is illustrated in Fig. (a) were we plot the time evolutionof the propagation of perturbations on a cycle of 15 nodes for ζ “
1. Ascan be seen the time at which 50% of the nodes are perturbed changesfrom 242 with α “ α “ .
25. This simple graph, a cycle, is agood example of some structures appearing in PRNs, named the chordlesscycles or holes. A chordless cycle, also known as induced cycle, is a cyclewhich contains no edge which does not itself belongs to the cycle. Holesare ubiquitous in proteins [20] and they may represent important bindingsites in them.The Mittag-Leffler matrix functions also allow to describe the secondcharacteristic of the propagation of perturbations through proteins, i.e., the4 Abadias, Estrada-Rodriguez, Estrada
Time o f pe r t u r bed node s = 1 = 0.75 = 0.5 = 0.25 (a) Distance from origin R e l a t i v e pe r t u r ba t i on ( % ) = 1.0 = 0.75 = 0.5 = 0.25 (b) Figure
Illustration of the effects of changing the pa-rameter α in the Mittag-Leffler function for the transmissionof perturbations in a cycle of 15 nodes.existence of long-range interactions. While ` e ζA ˘ uv penalizes very heavilylong-range perturbations, E α, ´ ζA ¯ allows us to modulate such effects bychanging the parameter α . For instance, let us consider a perturbationat a given node of the cycle of 15 nodes previously considered here. Thisperturbation can be transmitted across the cycle in no more than 7 steps,i.e., the diameter of the graph. When α “
1, i.e., E , ´ ζA ¯ “ exp ´ ζA ¯ ,the transmission of this perturbation to nodes at more than 5 steps fromthe origin is practically null. As can be seen in Fig. (b) this situationchanges when we drop the value of α . When α “ . α “ .
25 the transmission to farthest neighbors is almostunchanged in relation to that of the transmission to nearest neighbors,which may seem exaggerated in physical conditions of proteins.In closing, the Mittag-Leffler matrix functions, and consequently the useof a fractional-order SI model, are important for modeling the transmissionof perturbations across PRNs because they allow to capture importantspatial and temporal characteristics of protein perturbations, which arelimited with the use of the classical SI model.RACTIONAL SI MODEL 15
Time o f pe r t u r bed node s (a) Time o f pe r t u r bed node s (b) Figure
Time evolution of the upper bounds of thenormal (a) and fractional (b) SI model for the main proteaseof CoV-2 bounded to three inhibitors as well as free with β “ . γ “ ´ cn , c “ . .
4. Computational Results
Here we apply our model to the study of the M pro of SARS CoV-2complexed with three inhibitors: PDB codes 6M0K and 6LZE from [8] and6Y2G from [39]. We compare the results obtained with the free proteasestructure: PDB 6Y2E. All calculations are carried out on Matlab. For theMittag-Leffler matrix functions we use the Matlab function “ml matrix.m”provided by Garrappa and Popolizio [17, 16]. The three inhibitors selectedfor our study have been reported to display potent inhibitory capacityagainst SARS CoV-2. This potency is measured through their inhibitoryconcentration IC , which is the concentration of the inhibitor needed invitro to inhibit the virus by 50%. For the simulations we use here β “ . c “ . , γ “ ´ cn , and compare the results for α “ and when α “
1. InFig. we illustrate the time evolution of the number of perturbed aminoacids in the complexes studied as well as in the free protease (the last curveis overlapped by that of complex with 6LZE). There are two interestingobservations from these plots. First, the use of α “ { ă ă α, which is exactlythe order of potency of the inhibitors towards SARS CoV-2.6 Abadias, Estrada-Rodriguez, Estrada α “ . α “ . IC p µM q Inhibitor ∆G ij (%) ¯ L N BS ∆G ij (%) ¯ L N BS . ˘ . . ˘ . . ˘ . Table 1.
Average change individual transmissibility of per-turbations between amino acids in CoV-2 M pro bounded toinhibitors relative to the free protease. The average pathlength ¯ L for paths between the top ten pairs of amino acidsaccording to ∆G ij and the number of times a residues in oneof these paths is located in the binding site of the protease, N BS .In order to gain more insights about the influence of the two differentdynamics on the propagation of a perturbation across the CoV-2 M pro when bounded with inhibitors we study the structural contributions fromeach of the structures to the SI dynamics. In doing so we calculate therelative differences in the individual components of the transmissibility ofthis perturbation from one residue to another, G αij , ∆G αij “ n p n ´ q ÿ i ‰ j G αij p bounded q ´ G αij p free q G αij p free q , (4.35)where G αij “ r E α, ˆ p βt q α γA ˙ s ij . We have selected the time at which 50% of the amino acids in theprotease are perturbed, which occurs at t “ α “ {
2) and t “
50 ( α “ β “ . γ “ ´ cn , c “ . . We also selectedthe top ten pairs of amino acids according to their values of ∆G ij . Forthese pairs we have calculated the average length ¯ L of the shortest pathsconnecting the pair of residues. For instance, in 6M0K for α “ . ∆G ij is for the pair L167-K269 for which the shortest pathhas length 8. For the same complex but using α “ . ∆G ij is for the pair L167-M276 for which the shortest path has length 10.In addition, we determine which N BS of these pairs of residues in the topten ranking according to ∆G ij is involved directly in the binding site ofthe protease or it is bounded to one of them. For instance, for the case ofthe pair before mentioned for 6M0K ( α “ .
0) the residue L167 is directlybounded to two amino acids in the binding site, namely E166 and P168.RACTIONAL SI MODEL 17According to the results given in Table 1 we can extract the followingconclusions. For α “ .
0, the values of ∆G ij indicate that the three in-hibitors increase the transmissibility of perturbations across the protein inrelation to the free protease. The trend in these percentages of change isparallel to that of the inhibitory power of the inhibitors. That is, the mostpotent inhibitor increases more the transmissibility of effects across theprotease than the second most powerful one, and the least powerful is theone with the poorer increase in ∆G ij . However, neither ¯ L nor N BS displaya consistent pattern of change in relation to the values of IC p µM q . Incontrast, when α “ . ∆G ij at thebinding site, while the weakest initiates only 30% of these perturbations atthe binding site.In terms of the geometric distance between the residues in the perturbedprotease we also observe similar characteristics as for the case of the lengthof the shortest path. For instance, for α “ α “ .
5. Conclusions
There are two main conclusions in the current work. The first is thatwe have proposed a generalized fractional-order SI model which includesthe classical SI model as a particular case. We have found an upper boundto the exact solution of this model, which under given initial conditions de-pends only on the Mittag-Leffler matrix function of the adjacency matrixof the graph. The most important characteristic of this fractional-order SImodel is that it allows to account for long-range interactions between thenodes of a network as well as for different time-windows on the transmis-sion of perturbations on networks by tuning the fractional parameter α ofthe model. Both characteristics are of great relevance in many differentapplications of complex systems ranging from biological to social systems,and in particular for the study of protein residue networks.The second main conclusion of this work is that the fractional-orderSI model allowed us to extract very important information about the in-teraction of inhibitors with the main protease of the SARS CoV-2. Thisstructural information consists in the transmission of perturbations pro-duced by the inhibitors at the binding site of the protease to very distantamino acids in other domains of the protein. More importantly, our findingssuggest that the length of this transmission seems to reflect the potency ofthe inhibitor. That is, the more powerful inhibitors transmit perturbationsto longer distances through the protein. On the contrary, weaker inhibitorsdo not propagate such effect beyond 6 edges from the binding site as av-erage. Consequently, these findings are important for understanding themechanisms of actions of such inhibitors on SARS CoV-2 M pro and helpingin the design of more potent drug candidates against this new coronavirus.Of course, the current approach can be extended and used for the analysisof other inhibitors in other proteins not only using experimental data likein here but using computational analysis of such interactions. Acknowledgements
We thank the Editor and the three anonymous referees for useful sug-gestions that improve the presentation of this work.The first author has been partly supported by Project MTM2016-77710-P, DGI-FEDER, of the MCYTS, Project E26-17R, D.G. Arag´on,and Project for Young Researchers, Fundaci´on Ibercaja and Universidadde Zaragoza, Spain.
References [1] Protein data bank: the single global archive for 3D macromolecular structure data.
Nucleic acids research , 47(D1):D520–D528, 2019.
RACTIONAL SI MODEL 19 [2] L. Abadias, E. Alvarez, et al. Uniform stability for fractional Cauchy problems andapplications.
Topological Methods in Nonlinear Analysis , 52(2):707–728, 2018.[3] L. Abadias, C. Lizama, P. J. Miana, et al. Sharp extensions and algebraic proper-ties for solution families of vector-valued differential equations.
Banach Journal ofMathematical Analysis , 10(1):169–208, 2016.[4] C. N. Angstmann, A. M. Erickson, B. I. Henry, A. V. McGann, J. M. Murray,and J. A. Nichols. Fractional order compartment models.
SIAM Journal on AppliedMathematics , 77(2):430–446, 2017.[5] E. Bazhlekova. The abstract Cauchy problem for the fractional evolution equation.
Fract. Calc. Appl. Anal , 1(3):255–270, 1998.[6] H. Berry. Nonequilibrium phase transition in a self-activated biological network.
Physical review E , 67(3):031907, 2003.[7] A. Cooper and D. Dryden. Allostery without conformational change.
European Bio-physics Journal , 11(2):103–109, 1984.[8] W. Dai, B. Zhang, H. Su, J. Li, Y. Zhao, X. Xie, Z. Jin, F. Liu, C. Li, Y. Li, et al.Structure-based design of antiviral drug candidates targeting the sars-cov-2 mainprotease.
Science , 2020.[9] K. Diethelm.
The analysis of fractional differential equations: An application-oriented exposition using differential operators of Caputo type . Springer Science &Business Media, 2010.[10] K. H. DuBay, J. P. Bothma, and P. L. Geissler. Long-range intra-protein com-munication can be transmitted by correlated side-chain fluctuations alone.
PLoScomputational biology , 7(9), 2011.[11] E. Estrada.
The structure of complex networks: theory and applications . OxfordUniversity Press, 2012.[12] E. Estrada. Topological analysis of sars cov-2 main protease.
Chaos, in press , 2020.[13] E. Estrada and N. Hatano. Communicability in complex networks.
Physical ReviewE , 77(3):036111, 2008.[14] E. Estrada and J. A. Rodr´ıguez-Vel´azquez. Subgraph centrality in complex networks.
Phys. Rev. E , 71:056103, May 2005.[15] D. Fulger, E. Scalas, and G. Germano. Monte Carlo simulation of uncoupledcontinuous-time random walks yielding a stochastic solution of the space-time frac-tional diffusion equation.
Physical Review E , 77(2):021122, 2008.[16] R. Garrappa. Numerical evaluation of two and three parameter mittag-leffler func-tions.
SIAM Journal on Numerical Analysis , 53(3):1350–1369, 2015.[17] R. Garrappa and M. Popolizio. Computing the matrix mittag-leffler function withapplications to fractional calculus.
Journal of Scientific Computing , 77(1):129–153,2018.[18] R. Gorenflo, A. A. Kilbas, F. Mainardi, S. V. Rogosin, et al.
Mittag-Leffler functions,related topics and applications , volume 2. Springer, 2014.[19] R. Gorenflo, F. Mainardi, E. Scalas, and M. Raberto. Fractional calculus andcontinuous-time finance iii: the diffusion limit. In
Mathematical finance , pages 171–180. Springer, 2001.[20] G. Hu, J. Zhou, W. Yan, J. Chen, and B. Shen. The topology and dynamics ofprotein complexes: insights from intra–molecular network theory.
Current Proteinand Peptide Science , 14(2):121–132, 2013.[21] V. Latora, V. Nicosia, and G. Russo.
Complex networks: principles, methods andapplications . Cambridge University Press, 2017. [22] A. L. Lee et al. Frameworks for understanding long-range intra-protein communica-tion.
Current Protein and Peptide Science , 10(2):116–127, 2009.[23] C.-H. Lee, S. Tenneti, and D. Y. Eun. Transient dynamics of epidemic spreading andits mitigation on large networks. In
Proceedings of the Twentieth ACM InternationalSymposium on Mobile Ad Hoc Networking and Computing , pages 191–200, 2019.[24] S. Lu, M. Ji, D. Ni, and J. Zhang. Discovery of hidden allosteric sites as novel targetsfor allosteric drug design.
Drug discovery today , 23(2):359–365, 2018.[25] F. Mainardi.
Fractional calculus and waves in linear viscoelasticity: an introductionto mathematical models.
World Scientific, 2010.[26] F. Mainardi and R. Gorenflo. On mittag-leffler-type functions in fractional evolutionprocesses.
Journal of Computational and Applied Mathematics , 118(1-2):283–299,2000.[27] I. Matychyn. On computation of matrix Mittag-Leffler function. arXiv preprintarXiv:1706.01538 , 2017.[28] W. Mei, S. Mohagheghi, S. Zampieri, and F. Bullo. On the dynamics of deterministicepidemic propagation over networks.
Annual Reviews in Control , 44:116–128, 2017.[29] M. Miotto, L. Di Rienzo, P. Corsi, D. Raimondo, and E. Milanetti. Simulatedepidemics in 3d protein structures to detect functional properties. arXiv preprintarXiv:1906.05390 , 2019.[30] S. K. Mishra, M. Gupta, and D. K. Upadhyay. Fractional derivative of logarithmicfunction and its applications as multipurpose asp circuit.
Analog Integrated Circuitsand Signal Processing , 100(2):377–387, 2019.[31] D. Mugnolo. Dynamical systems associated with adjacency matrices. arXiv preprintarXiv:1702.05253 , 2017.[32] K. M. Ottemann, W. Xiao, Y.-K. Shin, and D. E. Koshland. A piston model fortransmembrane signaling of the aspartate receptor.
Science , 285(5434):1751–1754,1999.[33] R. Paris. Exponential asymptotics of the mittag–leffler function.
Proceedings of theRoyal Society of London. Series A: Mathematical, Physical and Engineering Sci-ences , 458(2028):3041–3052, 2002.[34] I. Podlubny.
Fractional differential equations: an introduction to fractional deriva-tives, fractional differential equations, to methods of their solution and some of theirapplications . Elsevier, 1998.[35] A. Sadeghi and J. R. Cardoso. Some notes on properties of the matrix Mittag-Lefflerfunction.
Applied Mathematics and Computation , 338:733–738, 2018.[36] C. S. G. The International et al. The species severe acute respiratory syndrome-related coronavirus: classifying 2019-ncov and naming it SARS-CoV-2.
Nature Mi-crobiology , page 1, 2020.[37] F. Wu, S. Zhao, B. Yu, Y.-M. Chen, et al. A new coronavirus associated with humanrespiratory disease in China.
Nature , 579(7798):265–269, 2020.[38] Y. Xu and M. Havenith. Perspective: Watching low-frequency vibrations of waterin biomolecular recognition by thz spectroscopy.
The Journal of chemical physics ,143(17):170901, 2015.[39] L. Zhang, D. Lin, X. Sun, U. Curth, C. Drosten, L. Sauerhering, S. Becker, K. Rox,and R. Hilgenfeld. Crystal structure of SARS-CoV-2 main protease provides a basisfor design of improved α -ketoamide inhibitors. Science , 2020.[40] P. Zhou, X.-L. Yang, X.-G. Wang, B. Hu, et al. A pneumonia outbreak associatedwith a new coronavirus of probable bat origin.
Nature , 579(7798):270–273, 2020.
RACTIONAL SI MODEL 21 Departamento de Matem´aticas, Facultad de CienciasUniversidad de Zaragoza, 50009 Zaragoza, Spain.e-mail: [email protected] Instituto Universitario de Matem´aticas y AplicacionesUniversidad de Zaragoza, 50009 Zaragoza, Spain.email: [email protected] Laboratoire Jacques-Louis Lions, Universit´e Pierre-et-Marie-Curie (UPMC)4 place Jussieu, 75005, Paris, France.email: [email protected]