Equilibrium in Wright-Fisher models of population genetics

Abstract

For multivariant Wright-Fisher models in population genetics, we introduce equilibrium states, expressed by fluctuations of probability ratio, in contrast to the traditionally used fluctuations, expressed by the difference between the current value of the random process and its equilibrium value. Then the drift component of the dynamic process of gene frequencies, primarily expressed as a ratio of two quadratic forms, is transformed into a cubic parabola with a certain normalization factor.


arXiv [math.ST], Aug; UDC 519.24

D. Koroliouk†, V.S. Koroliuk∗
† Institute of Telecommunications and Global Information Space
∗ Institute of Mathematics
†∗ Ukrainian Academy of Sciences, Kiev, Ukraine


Keywords: Wright-Fisher model, population genetics, evolutionary process, equilibrium state, fluctuations of probability relations.

The Wright-Fisher models of population genetics are defined by regression functions given by a ratio of two quadratic forms [1, Ch.10]. However, an equilibrium state defined as an equilibrium point of the regression function requires additional analysis (see, e.g., [2, 3]). At the same time, the equilibrium is easily determined for the incremental regression function at each stage [4, 5, 6].

In the present work, the population genetics models of genotype interaction are determined by difference evolution equations with regression functions of increments for the frequency probabilities of genotypes. In this case, the equilibrium state of the frequency probabilities is given by the equilibrium of the regression function of increments, which is postulated by the form of such a function.
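As an illustration of a regression function given by a ratio of two quadratic forms, and of its incremental version, the following minimal Python sketch may be helpful. It assumes a survival matrix `W` whose numerical values are purely hypothetical; the function names `wf_step`, `wf_increment` and `iterate` are illustrative and do not come from the cited works. The sketch iterates the difference equation and checks that the increment vanishes at the limiting point.

```python
from typing import List, Sequence


def wf_step(p: Sequence[float], W: Sequence[Sequence[float]]) -> List[float]:
    """One stage: p_m * sum_n W[m][n] * p_n, divided by the sum over m."""
    size = len(p)
    numer = [p[m] * sum(W[m][n] * p[n] for n in range(size)) for m in range(size)]
    total = sum(numer)  # normalizing quadratic form
    return [w / total for w in numer]


def wf_increment(p: Sequence[float], W: Sequence[Sequence[float]]) -> List[float]:
    """Incremental regression function: next-stage frequencies minus current ones."""
    return [q - x for q, x in zip(wf_step(p, W), p)]


def iterate(p0: Sequence[float], W: Sequence[Sequence[float]], stages: int) -> List[float]:
    """Apply the regression function `stages` times, starting from p0."""
    p = list(p0)
    for _ in range(stages):
        p = wf_step(p, W)
    return p


# Hypothetical two-genotype survival matrix (illustrative values only).
W_EXAMPLE = [[0.7, 1.0],
             [1.0, 0.3]]

p_limit = iterate([0.1, 0.9], W_EXAMPLE, stages=200)
# Largest absolute increment at the limiting point; it should be close to zero.
residual = max(abs(d) for d in wf_increment(p_limit, W_EXAMPLE))
print(p_limit, residual)
```

For this particular matrix the iteration settles near the frequencies (0.7, 0.3), where the increment is numerically zero; other survival matrices need not have an interior equilibrium.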

The probabilities of genotype frequencies at each stage $k \ge 0$ form the vector $P(k) = (P_m(k),\ 0 \le m \le M)$ on the finite state set $E = \{e_0, e_1, \dots, e_M\}$ of $M+1$ states ($M \ge 1$).

The dynamics of the frequency probabilities at the next, $(k+1)$-th, stage ($k \ge 0$) is given by the regression function [1, Ch.10]

$$P_m(k+1) := W_m(p)/W(p), \quad 0 \le m \le M,\ k \ge 0, \tag{1}$$

$$W_m(p) := p_m \sum_{n=0}^{M} W_{mn}\, p_n, \quad 0 \le m \le M, \tag{2}$$

$$W(p) := \sum_{m=0}^{M} W_m(p), \tag{3}$$

where $p = P(k)$ is the vector of current frequency probabilities.

The frequency probabilities obey the usual restrictions $0 \le p_m \le 1$, $\sum_{n=0}^{M} p_n = 1$. The respective restrictions for the survival parameters are $0 \le W_{mn} \le 1$, $0 \le m, n \le M$.

The increment of probability at each stage

$$\Delta P_m(k+1) := P_m(k+1) - P_m(k), \quad 0 \le m \le M,\ k \ge 0, \tag{4}$$

is given by the incremental regression function

$$\Delta P_m(k+1) = W_0^{(m)}(p), \quad 0 \le m \le M, \tag{5}$$

$$W_0^{(m)}(p) = V_0^{(m)}(p)/W(p), \qquad V_0^{(m)}(p) := W_m(p) - p_m W(p), \quad 0 \le m \le M. \tag{6}$$

Let us introduce new parameters of survival:

$$V_{mn} := 1 - W_{mn}, \quad 0 \le m, n \le M. \tag{7}$$

Then the numerator of the incremental regression function (6) is transformed to the form

$$V_0^{(m)}(p) = p_m \left[ \sum_{n=0}^{M} p_n (V_n, p) - (V_m, p) \right], \quad 0 \le m \le M, \tag{8}$$

and the normalizing denominator (3) has the form

$$W(p) = 1 - \sum_{n=0}^{M} p_n (V_n, p), \tag{9}$$

where the scalar product is

$$(V_m, p) := \sum_{n=0}^{M} V_{mn}\, p_n, \quad 0 \le m \le M. \tag{10}$$

Introduce the equilibria of the incremental regression functions (8) by the relations

$$(V_m, \rho) = \pi, \quad 0 \le m \le M, \qquad \pi := \prod_{n=0}^{M} \rho_n. \tag{11}$$

The normalizing constant $\pi$ is thus also generated by the equilibria $\rho = (\rho_m,\ 0 \le m \le M)$.

Lemma 1. The equilibria of the incremental regression functions (6) - (10) are given by the relation

$$\rho_m = \pi V^m, \quad 0 \le m \le M, \qquad \pi := \prod_{n=0}^{M} \rho_n, \tag{12}$$

where $V^m := \sum_{n=0}^{M} V^{mn}$, $0 \le m \le M$, and the summands $V^{mn}$ are the elements of the inverse matrix $V^{-1} := [V^{mn};\ 0 \le m, n \le M]$ of the directing parameters matrix $V = [V_{mn};\ 0 \le m, n \le M]$, under the additional normalization condition $\sum_{m=0}^{M} V^m = \sum_{m,n=0}^{M} V^{mn} = \pi^{-1}$.

Proof. The relation (11) means that $V\rho = \boldsymbol{\pi}$, where $\boldsymbol{\pi} := (\pi,\ 0 \le n \le M)$. Hence the vector of equilibria has the following representation:

$$\rho = V^{-1}\boldsymbol{\pi}, \qquad \rho_m = V^m \pi, \quad 0 \le m \le M, \tag{13}$$

that is, assertion (12) of Lemma 1.

Corollary 1.

The equilibria (12) provide the equilibrium state of the frequency probabilities (1):

$$V_0^{(m)}(\rho) \equiv 0, \quad 0 \le m \le M. \tag{14}$$

Corollary 2.

The equilibria (12) generate a representation of the scalar products (10) by fluctuations of the probability relations:

$$(V_m, p) = \pi\, p_m/\rho_m, \quad 0 \le m \le M. \tag{15}$$

The normalizing constant $\pi$ is defined in (11).

First of all, note that relation (12) coincides with the definition of the equilibrium (11) under the additional assumption that the directing parameters matrix $V = [V_m \delta_{mn};\ 0 \le m, n \le M]$ is diagonal. Hence we have the following

Lemma 2.

The following relation holds:

$$\pi \rho_m^{-1} = V_m, \quad 0 \le m \le M, \tag{16}$$

which coincides with formula (15).

Now the incremental regression functions (6) - (9), together with the relations (15), generate the following

Proposition 1.

The incremental regression function with Wright-Fisher normalization is given by the relations

$$W_0^{(m)}(p) = V_0^{(m)}(p)/W(p), \tag{17}$$

$$V_0^{(m)}(p) = \pi\, p_m \left[ \sum_{n=0}^{M} p_n^2/\rho_n - p_m/\rho_m \right], \tag{18}$$

$$W(p) = 1 - \pi \sum_{n=0}^{M} p_n^2/\rho_n. \tag{19}$$

The balance condition is obvious:

$$\sum_{m=0}^{M} V_0^{(m)}(p) = 0, \tag{20}$$

which in scalar form reads

$$\sum_{m=0}^{M} p_m \sum_{n=0}^{M} p_n^2/\rho_n - \sum_{m=0}^{M} p_m^2/\rho_m \equiv 0. \tag{21}$$

The presence of an equilibrium state is provided by the equilibrium point of the incremental regression function:

$$V_0^{(m)}(\rho) = 0, \quad 0 \le m \le M, \qquad \rho = (\rho_m,\ 0 \le m \le M). \tag{22}$$

At the equilibrium, the normalizing Wright-Fisher factor has the form

$$W(\rho) = 1 - \pi, \qquad \pi = \prod_{m=0}^{M} \rho_m. \tag{23}$$

The equilibrium generated by the state $\rho = (\rho_m,\ 0 \le m \le M)$ is interpreted through the convergence of the evolutionary processes (1).

Theorem 1.

For any initial data $0 < P_m(0) < 1$, $0 \le m \le M$, the evolutionary processes $P_m(k)$, $0 \le m \le M$, $k \ge 0$, determined by the solutions of the difference evolution equation (5) with the incremental regression function (17) - (19), converge as $k \to \infty$ to the equilibrium:

$$\lim_{k \to \infty} P_m(k) = \rho_m, \quad 0 \le m \le M. \tag{24}$$

Proof.

The property of the main component is used, which is specified by the sum

$$\sum_{n=0}^{M} p_n^2/\rho_n = \sum_{n=0}^{M} p_n\, (p_n/\rho_n). \tag{25}$$

This means averaging the fluctuations of the probability relations $p_n/\rho_n$, $0 \le n \le M$, over the distribution of frequencies at the current stage. The fluctuations of the ratios are equal to one for $p_n = \rho_n$, $0 \le n \le M$, and in this case the main component of the incremental regression function is also equal to one.

Consequently, the possible values of the frequency probabilities can be split into three zones:

$$(+)\ p_n < \rho_n; \qquad (-)\ p_n > \rho_n; \qquad (0)\ p_n = \rho_n.$$

The sign of the incremental regression functions (17) - (19) is constant within each such zone: in zone (+) the probabilities increase, in zone (−) they decrease.

Therefore, there exists a limit (24), whose value is ensured by the necessary condition for the existence of a limit:

$$\lim_{k \to \infty} \Delta P_m(k+1) = 0. \tag{26}$$

The binary evolutionary processes (EP) $P_\pm(k)$, $k \ge 0$, are determined by the following regression functions [1]:

$$P_\pm(k+1) = W_\pm(p_\pm)/W(p), \quad k \ge 0, \tag{27}$$

$$W_\pm(p_\pm) = p_\pm (W_\pm p_\pm + p_\mp), \tag{28}$$

$$W(p) = W_+(p_+) + W_-(p_-) = W_+ p_+^2 + 2 p_+ p_- + W_- p_-^2. \tag{29}$$

The frequency probabilities at the $k$-th stage satisfy the usual conditions $0 \le p_\pm \le 1$, $p_+ + p_- = 1$. The survival parameters are also limited by the relation $0 < W_\pm < 1$.

For the increments

$$\Delta P_\pm(k+1) := P_\pm(k+1) - P_\pm(k), \quad k \ge 0, \tag{30}$$

the corresponding regression functions of increments can be represented as follows:

$$W_0^{\pm}(p_\pm) = W_\pm(p_\pm)/W(p) - p_\pm, \tag{31}$$

or equivalently

$$W_0^{\pm}(p_\pm) = V_0^{\pm}(p_\pm)/W(p), \tag{32}$$

$$V_0^{\pm}(p_\pm) = W_\pm(p_\pm) - p_\pm W(p). \tag{33}$$

Now introduce the directing parameters, based on the survival ones: $V_\pm := 1 - W_\pm$. The relative equilibria are $\rho_\pm = V_\mp$ (equivalently, $\rho_\pm = \pi V_\pm^{-1}$ with $\pi := \rho_+ \rho_-$), with the normalization condition $V_+ + V_- = 1$. Then the numerators in (27) take the following form:

$$W_\pm(p_\pm) = p_\pm (1 - \pi p_\pm/\rho_\pm), \qquad \pi := \rho_+ \rho_-. \tag{34}$$

Therefore the numerator in (32) transforms into the following:

$$V_0^{\pm}(p_\pm) = p_\mp W_\pm(p_\pm) - p_\pm W_\mp(p_\mp) = p_+ p_- (\rho_\pm p_\mp - \rho_\mp p_\pm). \tag{35}$$

The linear component has the following representation:

$$\rho_+ p_- - \rho_- p_+ = -(p_+ - \rho_+) = p_- - \rho_-. \tag{36}$$

So the regression functions of the increments of the binary evolutionary processes are represented through the fluctuations of the probabilities:

$$V_0^{\pm}(p_\pm) = -p_+ p_- (p_\pm - \rho_\pm). \tag{37}$$

The Wright-Fisher normalizing factor has the following representation:

$$W(p) = 1 - \pi \left[ p_+^2/\rho_+ + p_-^2/\rho_- \right] = 1 - \left[ \rho_- p_+^2 + \rho_+ p_-^2 \right]. \tag{38}$$

The balance condition is also evident:

$$V_0^{+}(p_+) + V_0^{-}(p_-) \equiv 0. \tag{39}$$

The evolutionary processes considered in the present work serve as predictable components of stochastic models in population genetics and are represented by conditional mathematical expectations:

$$P_m(k+1) := E\left[ S_N^{(m)}(k+1) \mid S_N(k) = P(k) \right], \quad 0 \le m \le M,\ k \ge 0. \tag{40}$$

The stochastic models in population genetics are determined by the averaged sums

$$S_N(k) := \frac{1}{N} \sum_{n=1}^{N} \delta_n(k), \quad k \ge 0, \tag{41}$$

of random sample variables $\delta_n(k)$, $1 \le n \le N$, which take values in a finite set of $M+1$ ($M \ge 1$) states $E = \{e_0, e_1, \dots, e_M\}$ (see [7]). So the stochastic models (41) are defined by the sum of two components:

$$S_N(k+1) = V(S_N(k)) + \Delta\mu_N(k+1), \quad k \ge 0. \tag{42}$$

The first, predictable component is generated by the conditional mathematical expectations:

$$V_m(P(k)) = P_m(k) + V_0^{(m)}(P(k))/W(P(k)), \quad 0 \le m \le M,\ k \ge 0. \tag{43}$$

The second component forms the martingale differences

$$\Delta\mu_N(k+1) = S_N(k+1) - V(S_N(k)), \quad k \ge 0, \tag{44}$$

characterized by the first two moments:

$$E\, \Delta\mu_N^{(m)}(k+1) = 0, \qquad E\left[ \left(\Delta\mu_N^{(m)}(k+1)\right)^2 \mid S_N(k) \right] = \sigma_m^2(S_N(k)), \quad 0 \le m \le M,\ k \ge 0. \tag{45}$$

The conditional dispersion is determined by the regression functions:

$$\sigma_m^2(p) = V_m(p)\,[1 - V_m(p)], \quad 0 \le m \le M. \tag{46}$$

The asymptotic properties of the stochastic models (42) - (45) as $N \to \infty$, as well as $k \to \infty$, will be investigated in our next paper. The algorithms of phase merging [8] and statistical estimation of the drift parameter [9], [10] can be directly applied.
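The decomposition (42) into a predictable component and a martingale difference can be illustrated numerically. The following Python sketch is only one possible reading: it assumes that, given the current averaged sum, the sample variables of the next stage are drawn independently from the distribution given by the predictable component, an assumption not stated in the text. The survival matrix, the sample size `N`, the seed and the function names are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def predictable_component(s: Sequence[float], W: Sequence[Sequence[float]]) -> List[float]:
    """Conditional expectation of the next-stage frequencies given current frequencies s."""
    size = len(s)
    numer = [s[m] * sum(W[m][n] * s[n] for n in range(size)) for m in range(size)]
    total = sum(numer)
    return [w / total for w in numer]


def stochastic_stage(
    s: Sequence[float], W: Sequence[Sequence[float]], N: int, rng: random.Random
) -> Tuple[List[float], List[float]]:
    """Draw N i.i.d. states from the predictable component and average them.

    Returns the new averaged sum and the martingale difference
    (new averaged sum minus predictable component).
    """
    v = predictable_component(s, W)
    draws = rng.choices(range(len(v)), weights=v, k=N)  # assumed sampling scheme
    s_next = [draws.count(m) / N for m in range(len(v))]
    delta_mu = [a - b for a, b in zip(s_next, v)]
    return s_next, delta_mu


# Hypothetical two-state survival matrix and sample size (illustrative only).
W_EXAMPLE = [[0.7, 1.0],
             [1.0, 0.3]]
rng = random.Random(2024)
s = [0.1, 0.9]
max_balance_error = 0.0
for _ in range(100):
    s, delta_mu = stochastic_stage(s, W_EXAMPLE, N=5000, rng=rng)
    # The components of the martingale difference sum to zero at every stage.
    max_balance_error = max(max_balance_error, abs(sum(delta_mu)))
print(s, max_balance_error)
```

With a large sample size the simulated frequencies fluctuate around the equilibrium of the predictable component, which for this matrix lies near (0.7, 0.3).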