Skepsis on the scenario of Biological Evolution provided by stochastic models

Abstract

Stochastic models, based on random processes, may lead to power law distributions, which exhibit long range correlations. Power law behavior and the presence of long range correlations in biological systems have been demonstrated in various studies. The combination of these two results, theoretical and experimental, strongly supports the scenario of biological evolution across different organisms. In the current Letter we explore in a general way, using the algebra of Nonextensive Statistics introduced by Tsallis and coworkers, whether the processes described by a class of stochastic models are really random, and we discuss the results with regard to a possible biological evolution.

arXiv [cond-mat.stat-mech], Mar

Thomas Oikonomou∗
Institute of Physical Chemistry, National Center for Scientific Research “Demokritos”, 15310 Athens, Greece
School of Medicine, Department of Biological Chemistry, University of Athens, Goudi, 11527 Athens, Greece
(Dated: November 4, 2018)

PACS numbers: 02.50.-r; 87.23.Kg; 87.10.-e
Keywords: stochastic models; biological evolution; q-Algebra; long range correlations

∗ Electronic address: [email protected]

In recent years, a variety of systems exhibiting power law distributions has been explored in diverse scientific fields, such as biophysics [1], neurophysics [2], economics [3], turbulence [4, 5], urban agglomerations [6] and many others. Because of the ubiquity of power laws, many stochastic models have been studied [7, 9] which are based on random processes and lead to the desired power law distributions. These results have a great impact on our view of the rules that govern nature. In biology in particular, the success of deriving power laws from random processes strongly supports the scenario of biological evolution across different species. These models do not give an explicit description of the transition between different organisms during evolution, but, by taking into account biological processes such as mutation, duplication, etc., they can successfully form symbolic sequences which correspond to polynucleotide DNA chains. For example, in Ref. [7] Provata used a random aggregation model to describe a possible formation of such a polynucleotide DNA chain. The model is based on the probability formula

$$ P(s, t+\Delta t) = \sum_{n=1}^{N} P(I) \binom{N}{n} \left(\frac{1}{N}\right)^{n} \left(1-\frac{1}{N}\right)^{N-n} \prod_{j=1}^{n} P(s_j, t)\, \Bigg|_{\sum_{j=1}^{n} s_j + I = s} \tag{1} $$

introduced by Takayasu and collaborators [10, 11], which describes the probability of finding a macromolecule of size $s$ at the time $t+\Delta t$. The analytical expression of Eq. (1) is computed in the transformed Fourier space. A different stochastic approach is considered in Ref. [9] by Messer, Arndt and Lässig, who constructed a master equation for the joint probabilities $P_{\mathrm{eq}}(r,t)$ and $P_{\mathrm{op}}(r,t)$ of finding a GC pair and an AT pair, respectively, at a distance $r$:

$$ \frac{d}{dt} P_{\mathrm{eq}}(r,t) = 2\mu_{\mathrm{eff}} \left[ P_{\mathrm{op}}(r,t) - P_{\mathrm{eq}}(r,t) \right] + \left[ r\delta + (r-1)\gamma^{+} \right] \left[ P_{\mathrm{op}}(r-1,t) - P_{\mathrm{eq}}(r,t) \right] + r\gamma^{-} \left[ P_{\mathrm{eq}}(r+1,t) - P_{\mathrm{eq}}(r,t) \right]. \tag{2} $$

The exchange $P_{\mathrm{eq}}(r,t) \leftrightarrow P_{\mathrm{op}}(r,t)$ in Eq. (2) gives the analogous equation for the probability $P_{\mathrm{op}}(r,t)$. Then, the correlation function $C(r,t) = P_{\mathrm{eq}}(r,t) - P_{\mathrm{op}}(r,t)$ can be computed. The consideration of both Eqs. (1) and (2) leads, under certain assumptions, to power law distributions.

On the other hand, it is well known that the existence of power law distributions indicates the presence of long (or short) range correlations. Yet, this is in contradiction with the statistical assumptions of the stochastic models about random processes. It would mean that uncorrelated processes can create, in an unexplained way, correlations between them. Even if we assume that in nature such an event (creation of correlations through random processes) is possible, can we assume the same in mathematics? A stochastic model is a mathematical construction with concrete rules, and it cannot lead to the existence of correlations if we have not first introduced them in the model. Starting from this point of view, we shall explore in the next paragraphs whether a class of stochastic models in which sums appear, as in Eqs. (1) and (2), really describes random processes. To this end we shall use the q-Algebra [12] of Nonextensive Statistics [13]. The results will be discussed in the frame of biology.

Nonextensive Statistics is a possible generalization of the ordinary random statistics, depending on a parameter $q \in [0, 1]$ (the $q > 1$ branch corresponds to a different q-statistics [14]). Its structure is based on the following deformed logarithm

$$ \ln_q(x) := \frac{x^{1-q} - 1}{1-q}, \qquad (x > 0), \tag{3} $$

with its inverse, the deformed exponential function

$$ e_q(x) := \left[ 1 + (1-q)x \right]^{\frac{1}{1-q}}, \qquad (x > 0). \tag{4} $$

When $q = 1$ the entire generalized statistical structure recovers the ordinary one. One of the main points of Nonextensive Statistics is its successful treatment of correlated probabilities (or processes in general), which are projected on the values of the parameter $q$.

First, we introduce some elementary notions of Probability Theory. A probability space, which represents our uncertainty regarding an experiment, consists of two parts: i) a set of events/outcomes, which is denoted as the sample space $\Omega$, and ii) a real function of the subsets of $\Omega$, which is denoted as the probability measure $P$. Now, let us follow the next thought. We consider two coins I and II. The sides of each coin carry different symbols, so that the sample spaces $\Omega_{I}$ and $\Omega_{II}$ consist of two events each, $\Omega_{I} = \{A, B\}$ and $\Omega_{II} = \{\tilde{A}, \tilde{B}\}$. Then, the intersection probability (of both occurring) in each sample space separately is of course equal to zero,

$$ P(A \cap B) = 0 = P(\tilde{A} \cap \tilde{B}). \tag{5} $$

The union probability (of either occurring) in each space separately is equal to unity,

$$ \left.\begin{aligned} P(A \cup B) &= P(A) + P(B) \\ P(\tilde{A} \cup \tilde{B}) &= P(\tilde{A}) + P(\tilde{B}) \end{aligned}\right\} = 1. \tag{6} $$

Eq. (6) corresponds in physics to the normalization constraint. In this probability measure the event probabilities in each sample space are strongly correlated, in such a way that their sum is always equal to unity. We denote these correlations as correlations of Type I. On the other hand, with respect to the values of the probabilities, there are infinitely many ways to fulfil Eq. (6). In order to capture all these possibilities, or better, to attribute them to a cause, we introduce yet another type of correlations, which we call correlations of Type II. These correlations are individual for each event.
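As a brief numerical aside on the deformed functions of Eqs. (3) and (4), the following minimal sketch checks that they are mutually inverse and that they reduce to the ordinary logarithm and exponential as q approaches 1. The function names `ln_q` and `exp_q` are illustrative choices, not notation from the Letter, and the positivity guard on the bracket of Eq. (4) is an assumption added for numerical safety.

```python
import math


def ln_q(x: float, q: float) -> float:
    """Deformed logarithm of Eq. (3); falls back to math.log when q == 1."""
    if x <= 0:
        raise ValueError("ln_q is defined for x > 0")
    if math.isclose(q, 1.0):
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def exp_q(x: float, q: float) -> float:
    """Deformed exponential of Eq. (4), the inverse of ln_q."""
    if math.isclose(q, 1.0):
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    # Assumption: we only evaluate where the bracket of Eq. (4) is positive.
    if base <= 0:
        raise ValueError("1 + (1 - q) * x must be positive")
    return base ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    x = 2.5
    for q in (0.0, 0.5, 0.999999, 1.0):
        # Print q, ln_q(x) and the round trip exp_q(ln_q(x)), which should return x.
        print(q, ln_q(x, q), exp_q(ln_q(x, q), q))
```

At q = 0 the deformed logarithm is simply x - 1, which is why sums of probabilities appear in that limit.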
In our example, the correlations of Type II could represent the different weights of each side of the coin. When the correlations of Type II vanish, we have equal event probabilities.

Next, we join the spaces $\Omega_{I}$ and $\Omega_{II}$. Now we have one more type of correlations to consider, which we denote as correlations of Type III. These are correlations between the several combinations of the events. Following these steps, the consideration of more complicated processes or compositions would lead to an increasing number of correlation types (higher order correlations, Type > III). For the sake of simplicity, we assume that there are no correlations of Types II and III in the new sample space $\Omega_{III}$, and because of this $\Omega_{III}$ is a tensor product of both subspaces, $\Omega_{III} = \Omega_{I} \times \Omega_{II} = \{A\tilde{A}, A\tilde{B}, B\tilde{A}, B\tilde{B}\}$. In this case the event probabilities $P(A)$, $P(B)$, $P(\tilde{A})$ and $P(\tilde{B})$ are equal and independent. Then, the intersection probabilities are given by

$$ P(A \cap \tilde{A}) = P(A)\,P(\tilde{A}), \tag{7} $$
$$ P(A \cap \tilde{B}) = P(A)\,P(\tilde{B}), \tag{8} $$
$$ P(B \cap \tilde{A}) = P(B)\,P(\tilde{A}), \tag{9} $$
$$ P(B \cap \tilde{B}) = P(B)\,P(\tilde{B}), \tag{10} $$

and because of the union probability

$$ P(A\tilde{A} \cup A\tilde{B} \cup B\tilde{A} \cup B\tilde{B}) = P(A)P(\tilde{A}) + P(A)P(\tilde{B}) + P(B)P(\tilde{A}) + P(B)P(\tilde{B}) = 1, \tag{11} $$

their values are equal to 1/4. Here, there is a very important point to stress: the feature of probability (in)dependence is meaningful only in the frame of an appropriate probability measure. As we can see in Eqs. (7)-(10), the intersection probabilities of $\Omega_{III}$ are computed based on the independence of the event probabilities between the two subspaces $\Omega_{I}$ and $\Omega_{II}$. Yet, the union probability of $\Omega_{III}$ strongly correlates the intersection probabilities. Furthermore, it becomes evident that when we refer to correlations we have to be very concrete about their Type. To give a more general character to the intersection probabilities, for the further purposes of this Letter we shall call them
Blocks of probabilities.

In Ref. [14] the present author demonstrated the importance of the distinction between the diverse types of correlations. Two different types of correlations within Nonextensive Statistics were introduced there, the inner and the outer, described by the sets of parameters $\mathcal{R} = \{R_i\}_{i=1,\dots,n}$ and $\mathcal{Q} = \{Q_i\}_{i=1,\dots,n'}$, respectively, in order to distinguish between the three generalized entropic structures that can be created from any generalized logarithmic function. The inner and outer correlations in this case need not be identified with the ones in our example above. Their notion must be adjusted each time to the features of the problem under consideration. For the one-parameter ($\mathcal{Q} = \{Q\} = q$) generalized logarithm (3) with its inverse function (4), these three structures correspond to the Tsallis, Rényi and Nonextensive Gaussian entropies. We shall keep the notation of the correlation sets $\mathcal{R}$ and $\mathcal{Q}$ throughout the paper.

For an arbitrary deformed logarithmic and exponential function we define the generalized product between two elements $x, y \in \mathbb{R}^{+}$ as [14]

$$ x \otimes_{\mathcal{Q}} y := e_{\mathcal{Q}}\left( \ln_{\mathcal{Q}}(x) + \ln_{\mathcal{Q}}(y) \right). \tag{12} $$

For $\mathcal{Q} \to \mathcal{Q}_0$ we obtain the ordinary multiplication, since $e_{\mathcal{Q}_0}\left( \ln_{\mathcal{Q}_0}(x) + \ln_{\mathcal{Q}_0}(y) \right) = e^{\ln(x) + \ln(y)} = x \times y$. Then, for the $\mathcal{Q}$-multiplication of $m$ elements we have

$$ \prod_{k=1}^{m}{}^{\otimes_{\mathcal{Q}}}\, x_k = e_{\mathcal{Q}}\left( \sum_{k=1}^{m} \ln_{\mathcal{Q}}(x_k) \right), \qquad (x_k > 0). \tag{13} $$

Using the deformed functions in Eqs. (3) and (4), Eq. (13) takes the explicit form

$$ \prod_{k=1}^{m}{}^{\otimes_{q}}\, x_k = \left[ \sum_{k=1}^{m} x_k^{1-q} - (m-1) \right]^{\frac{1}{1-q}}. \tag{14} $$

If we express the normalization constraint of $W$ intersection probabilities $P_j$ through the q-algebra, we can easily verify the relation

$$ \sum_{j=1}^{W} P_j = W \otimes_q \prod_{j=1}^{W}{}^{\otimes_{q}}\, P_j \,\Bigg|_{q=0} = 1. \tag{15} $$

The value $q = 0$ in q-Statistics corresponds to long range correlations. This result correctly reproduces the correlations of Type I behind the normalization constraint, which relate all probabilities in a very specific way.
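The relation in Eq. (15) can be checked numerically with a short sketch based on the closed form of Eq. (14). The names `q_product` and `normalization_rhs` are hypothetical helpers, not part of the Letter. One assumption is made explicit in the code: the right-hand side of Eq. (15) is evaluated as a single (W+1)-fold q-product of the factors W, P_1, ..., P_W, which relies on the formal associativity of the q-product. Evaluating the inner W-fold product on its own at q = 0 would give the non-positive value 2 - W for normalized probabilities with W >= 2, so the nested form is avoided here.

```python
import math
from typing import Sequence


def q_product(xs: Sequence[float], q: float) -> float:
    """q-product of positive numbers using the closed form of Eq. (14)."""
    if any(x <= 0 for x in xs):
        raise ValueError("all factors must be positive")
    if math.isclose(q, 1.0):
        # Ordinary multiplication is recovered in the limit q -> 1.
        return math.prod(xs)
    m = len(xs)
    bracket = sum(x ** (1.0 - q) for x in xs) - (m - 1)
    if bracket <= 0:
        raise ValueError("bracket of Eq. (14) is not positive for these inputs")
    return bracket ** (1.0 / (1.0 - q))


def normalization_rhs(probs: Sequence[float], q: float = 0.0) -> float:
    """Right-hand side of Eq. (15), taken as one (W+1)-fold q-product.

    Assumption: W (x)_q [P_1 (x)_q ... (x)_q P_W] is flattened into a single
    q-product over (W, P_1, ..., P_W) by formal associativity.
    """
    W = len(probs)
    return q_product([float(W), *probs], q)


if __name__ == "__main__":
    probs = [0.1, 0.2, 0.3, 0.4]
    # At q = 0 the flattened q-product collapses to the plain sum of the P_j.
    print(sum(probs), normalization_rhs(probs, q=0.0))
```

For two factors the q = 0 product reads x + y - 1, so the multiplication by the constant W exactly cancels the -(W - 1) offset and leaves the plain sum.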
Since generalized statistics gives us proper results and does not lead to inconsistencies, we shall use it in the next paragraph in order to complete the aim of this Letter.

We consider the statistical processes of a system and define the probabilities under which these processes take place and the way they interact (correlated or uncorrelated). An example of such a system could be a DNA chain, and a possible statistical process could be the mutation of a nucleotide. In connection with our example of the coins, all these interacting probabilities define the respective Blocks of probabilities for the current approach and are denoted as $\{A_k^{(\mathcal{R})}(P_j)\}_{k=1,\dots,m}$. The internal $\mathcal{R}$-correlations characterize the different types of correlations within a Block. These correlations are canceled for certain values $\mathcal{R} \to \mathcal{R}_0$. Now, we want to see how the several $k$-Blocks interact with each other. These interactions are described by the external correlations $\mathcal{Q}$. Then, we introduce the probability functional $B_{\{\mathcal{Q},\mathcal{R}\}}(P_j)$, which is equal to the entire internal and external interactions. It could be a derivative of the probabilities $P_j$ in $A_k^{(\mathcal{R})}(P_j)$, as in Eq. (2), or simply a new probability, as in Eq. (1), etc. The elimination of the external correlations is again defined in a certain limit $\mathcal{Q} \to \mathcal{Q}_0$. Then, the entire process is given by an expression of the form

$$ B_{\{\mathcal{Q},\mathcal{R}\}}(P_j) = m \otimes_{\mathcal{Q}} \prod_{k=1}^{m}{}^{\otimes_{\mathcal{Q}}}\, A_k^{(\mathcal{R})}(P_j), \tag{16} $$

with $B_{\{\mathcal{Q},\mathcal{R}\}}(P_j) > 0$. We stress that Eq. (16) does not describe exact processes. It is a general expression of a set of probability interactions. The $\mathcal{Q}$-multiplication with the constant $m$ does not have a specific meaning, but it is useful for our purpose. We demand that the stochastic processes inside each Block are independent ($\mathcal{R} \to \mathcal{R}_0$), so that they are connected through the ordinary multiplication. Using Eq. (14), which for two factors gives $m \otimes_q x = \left[ m^{1-q} + x^{1-q} - 1 \right]^{\frac{1}{1-q}}$, the relation in Eq. (16) can be written as

$$ B_{\{q,\mathcal{R}_0\}}(P_j) = \left[ m(q) + \sum_{k=1}^{m} \left[ A_k^{(\mathcal{R}_0)}(P_j) \right]^{1-q} \right]^{\frac{1}{1-q}}, \tag{17} $$

with $m(q) := m^{1-q}\left( 1 - m^{q} \right)$. Next, we set $q = 0$, as we have done in Eq. (15), so that $m(0) = 0$. Then, from Eq. (17) we obtain

$$ B_{\{q=0,\mathcal{R}_0\}}(P_j) = \sum_{k=1}^{m} \left[ A_k^{(\mathcal{R}_0)}(P_j) \right]. \tag{18} $$

Eq. (18) is our main result. Namely, when the probability measure $B_{\{q,\mathcal{R}_0\}}(P_j)$ is equal to the sum ($q = 0$) of the probability Blocks, then this measure describes long range correlations, even if the internal processes inside a Block are independent. Accordingly, every stochastic model which is based on such a sum describes a possible creation of sequences through a very concrete dynamic, in which the processes of the several probability Blocks are long range correlated (synchronized). Comparing Eqs. (1) and (2) with Eq. (18), we see that the two earlier equations are specific expressions of the latter one.

If we accept that biological evolution, as described by Eq. (18), has taken place, then the transition between different species during evolution is attributed to the combination of two different kinds of modifications of the $\mathcal{R}$-correlations in each Block. The first kind of modification does not affect the structure of the given dynamic. This case would correspond to the creation of different organisms of the same class. The second kind of modification does affect the structure of the given dynamic. This case would correspond to the creation of different classes of organisms.
However, the existence of long range correlations has the effect that a modification of a single probability simultaneously causes changes in all involved terms. As a consequence, a newly created organism does not have the time to adapt to its environment so that it is able to survive. On the other hand, adaptation to the environment is the basis of natural selection and of biological evolution. As we can see, the introduction of long range correlations poses a problem for the scenario of biological evolution as it has been assumed up to now.

Summarizing, we have shown that the dependence or independence between probabilities is associated with an appropriate probability measure. The sum of probabilities, or of Blocks of probabilities, corresponds to long range interactions between the participating terms. Accordingly, any stochastic model in which such a sum appears correlates the respective processes in a very specific way, with long range features. The result is a non-exponential probability distribution function. Besides the clarification of the appropriate probability measure, we need to clarify the Type (or Types) of the correlations as well. There is a variety of different types of correlations in a process, and confusing them may lead to incorrect results. The more complicated a process is, the more different types of correlations are involved in it. The presence of correlations creates difficulties for the cornerstone of natural selection in biology, which is the adjustment of an organism to its environment.

[1] Th. Oikonomou, A. Provata, U. Tirnakli, Physica A, 2653 (2008).
[2] E. Novikov, A. Novikov, D. Shannahoff-Khalsa, B. Schwartz, J. Wright, Phys. Rev. E, R2387 (1997).
[3] C. Tsallis, C. Anteneodo, L. Borland, R. Osorio, Physica A, 89 (2003).
[4] C. Beck, G.S. Lewis, H.L. Swinney, Phys. Rev. E, 035303 (2001).
[5] N. Arimitsu, T. Arimitsu, Europhys. Lett., 60 (2002).
[6] L.C. Malacarne, R.S. Mendes, E.K. Lenzi, Phys. Rev. E, 017106 (2002).
[7] A. Provata, Physica A, 570 (1999).
[8] A.M.C. de Souza, C. Anteneodo, Biophysical Journal, 1708 (1995).
[9] P.W. Messer, P.F. Arndt, M. Lässig, Phys. Rev. Lett., 138103 (2005).
[10] H. Takayasu, M. Takayasu, A. Provata, G. Huber, J. Stat. Phys., 725 (1991).
[11] H. Takayasu, Phys. Rev. Lett., 2563 (1989).
[12] E.P. Borges, Physica A, 95 (2004).
[13] J.P. Boon, C. Tsallis, Europhysics News, 6 (2005).
[14] Th. Oikonomou, Physica A, 119 (2007).