Equilibrium in Wright-Fisher models of population genetics
arXiv [math.ST]
UDC 519.24

D. Koroliouk †, V. S. Koroliuk ∗
† Institute of Telecommunications and Global Information Space
∗ Institute of Mathematics
†∗ Ukrainian Academy of Sciences, Kiev, Ukraine
Abstract.
For multivariant Wright-Fisher models in population genetics we introduce equilibrium states expressed by fluctuations of probability relations, in contrast to the traditionally used fluctuations, which are expressed by the difference between the current value of a random process and its equilibrium value.
Then the drift component of the gene frequency dynamics, originally expressed as a ratio of two quadratic forms, is transformed into a cubic parabola with a certain normalization factor.
Keywords:
Wright-Fisher model, population genetics, evolutionary process, equilibrium state, fluctuations of probability relations.
The Wright-Fisher models of population genetics are defined by regression functions given by a ratio of two quadratic forms [1, Ch.10].
However, the equilibrium state, defined as the equilibrium point of the regression function, requires additional analysis (see, e.g., [2, 3]).
At the same time, the equilibrium is easily determined for the incremental regression function at each stage [4, 5, 6].
In the present work, the population genetics models of genotype interaction are determined by difference evolution equations with regression functions of increments for the frequency probabilities of genotypes.
In this case, the equilibrium state of the frequency probabilities is given by the equilibrium of the regression function of increments, which is postulated by the form of such a function.
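As a hedged illustration of the "ratio of two quadratic forms" and of the incremental regression function mentioned above, the following minimal Python sketch assumes the standard update $p_m \mapsto p_m \sum_n W_{mn} p_n \big/ \sum_{m,n} p_m W_{mn} p_n$. The function names `wf_step` and `increment` and the numerical values of the survival matrix are hypothetical and are not taken from the paper.

```python
def wf_step(p, W):
    """One regression step: p_m -> p_m * (W p)_m / sum_m p_m * (W p)_m."""
    size = len(p)
    Wp = [sum(W[m][n] * p[n] for n in range(size)) for m in range(size)]
    numer = [p[m] * Wp[m] for m in range(size)]  # quadratic form for genotype m
    denom = sum(numer)                           # normalizing quadratic form
    return [x / denom for x in numer]


def increment(p, W):
    """Incremental regression function: change of each frequency in one step."""
    return [q - x for q, x in zip(wf_step(p, W), p)]


# Hypothetical survival matrix for two genotypes (illustrative values only).
W = [[0.6, 1.0],
     [1.0, 0.4]]

p = [0.2, 0.8]  # initial frequencies, summing to one
for _ in range(200):
    p = wf_step(p, W)

print(p)                # frequencies after 200 stages
print(increment(p, W))  # increments vanish at an equilibrium point
```

For these illustrative values the iteration settles near the frequencies (0.6, 0.4), where both increments are numerically zero. This is the sense in which the equilibrium is read off from the incremental regression function rather than from the regression function itself.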
The probabilities of genotype frequencies at each stage $k \ge 0$ are given by the vector $P(k) = (P_m(k),\ 0 \le m \le M)$ on a finite state set $E = \{e_0, e_1, \ldots, e_M\}$ with $M + 1$ ($M \ge 1$) states.

The dynamics of the frequency probabilities at the next, $(k+1)$-th, stage ($k \ge 0$) is given by the regression function [1, Ch.10], with $p = P(k)$:
$$P_m(k+1) := W_m(p)/W(p), \quad 0 \le m \le M, \quad k \ge 0, \tag{1}$$
$$W_m(p) := p_m \sum_{n=0}^{M} W_{mn} p_n, \quad 0 \le m \le M, \tag{2}$$
$$W(p) := \sum_{m=0}^{M} W_m(p). \tag{3}$$
The frequency probabilities obey the usual restrictions $0 \le p_m \le 1$, $\sum_{n=0}^{M} p_n = 1$. The respective restrictions for the survival parameters are $0 \le W_{mn} \le 1$, $0 \le m, n \le M$.

The increment of probability at each stage,
$$\Delta P_m(k+1) := P_m(k+1) - P_m(k), \quad 0 \le m \le M, \quad k \ge 0, \tag{4}$$
is given by the incremental regression function
$$\Delta P_m(k+1) = W_0^{(m)}(p), \quad 0 \le m \le M, \tag{5}$$
$$W_0^{(m)}(p) = V_0^{(m)}(p)/W(p), \quad V_0^{(m)}(p) := W_m(p) - p_m W(p), \quad 0 \le m \le M. \tag{6}$$
Let us introduce new survival parameters:
$$V_{mn} := 1 - W_{mn}, \quad 0 \le m, n \le M. \tag{7}$$
Then the numerator of the incremental regression function (6) takes the form
$$V_0^{(m)}(p) = p_m \Big[ \sum_{n=0}^{M} p_n (V_n, p) - (V_m, p) \Big], \quad 0 \le m \le M, \tag{8}$$
and the normalizing denominator (3) has the form
$$W(p) = 1 - \sum_{n=0}^{M} p_n (V_n, p), \tag{9}$$
where the scalar product is
$$(V_m, p) := \sum_{n=0}^{M} V_{mn} p_n, \quad 0 \le m \le M. \tag{10}$$
Introduce the equilibria of the incremental regression functions (8) by the relations
$$(V_m, \rho) = \pi, \quad 0 \le m \le M, \quad \pi := \prod_{n=0}^{M} \rho_n. \tag{11}$$
The normalizing constant $\pi$ is also generated by the equilibria $\rho = (\rho_m,\ 0 \le m \le M)$.

Lemma 1.
The equilibria of the incremental regression functions (6)-(10) are given by the relation
$$\rho_m = \pi V^{-1}_m, \quad 0 \le m \le M, \quad \pi := \prod_{n=0}^{M} \rho_n, \tag{12}$$
where $V^{-1}_m := \sum_{n=0}^{M} V^{-1}_{mn}$, $0 \le m \le M$, and the summands $V^{-1}_{mn}$ are the elements of the matrix $V^{-1} := [V^{-1}_{mn};\ 0 \le m, n \le M]$ inverse to the directing parameters matrix $V = [V_{mn};\ 0 \le m, n \le M]$, under the additional normalization condition $\sum_{m=0}^{M} V^{-1}_m = \sum_{m,n=0}^{M} V^{-1}_{mn} = \pi^{-1}$.

Proof.
The relation (11) means that $V\rho = \boldsymbol{\pi}$, where $\boldsymbol{\pi} := (\pi,\ 0 \le n \le M)$ is the vector whose components all equal $\pi$. Hence the vector of equilibria has the representation
$$\rho = V^{-1}\boldsymbol{\pi}, \quad \rho_m = \pi V^{-1}_m, \quad 0 \le m \le M, \tag{13}$$
that is, the assertion (12) of Lemma 1.

Corollary 1.
The equilibria (12) provide the equilibrium state of the frequency probabilities (1):
$$V_0^{(m)}(\rho) \equiv 0, \quad 0 \le m \le M. \tag{14}$$

Corollary 2.
The equilibria (12) generate a representation of the scalar products (10) by fluctuations of the probability relations:
$$(V_m, p) = \pi p_m/\rho_m, \quad 0 \le m \le M. \tag{15}$$
The normalizing constant $\pi$ is defined in (11).

First of all, note that relation (12) coincides with the definition of the equilibrium (11) under the additional assumption that the directing parameters matrix $V = [V_m \delta_{mn};\ 0 \le m, n \le M]$ is diagonal. Hence we have the following

Lemma 2.
The following relation holds:
$$\pi \rho_m^{-1} = V_m, \quad 0 \le m \le M, \tag{16}$$
which coincides with formula (15).

Now the incremental regression functions (6)-(9), together with the relations (15), generate the following

Proposition 1.
The incremental regression functions with the Wright-Fisher normalization are given by the relations
$$W_0^{(m)}(p) = V_0^{(m)}(p)/W(p), \tag{17}$$
$$V_0^{(m)}(p) = \pi p_m \Big[ \sum_{n=0}^{M} p_n^2/\rho_n - p_m/\rho_m \Big], \tag{18}$$
$$W(p) = 1 - \pi \sum_{n=0}^{M} p_n^2/\rho_n. \tag{19}$$
The balance condition is obvious:
$$\sum_{m=0}^{M} V_0^{(m)}(p) = 0, \tag{20}$$
which in scalar form reads
$$\sum_{m=0}^{M} p_m \sum_{n=0}^{M} p_n^2/\rho_n - \sum_{m=0}^{M} p_m^2/\rho_m \equiv 0. \tag{21}$$
The presence of an equilibrium state is provided by the equilibrium point of the incremental regression function:
$$V_0^{(m)}(\rho) = 0, \quad 0 \le m \le M, \quad \rho = (\rho_m,\ 0 \le m \le M). \tag{22}$$
The normalizing Wright-Fisher factor has the form
$$W(\rho) = 1 - \pi, \quad \pi = \prod_{m=0}^{M} \rho_m. \tag{23}$$
The equilibrium generated by the state $\rho = (\rho_m,\ 0 \le m \le M)$ is interpreted by the convergence of the evolutionary processes (1).

Theorem 1.
For any initial data $0 < P_m(0) < 1$, $0 \le m \le M$, the evolutionary processes $P_m(k)$, $0 \le m \le M$, $k \ge 0$, determined by the solutions of the difference evolution equation (5) with the incremental regression functions (17)-(19), converge, as $k \to \infty$, to the equilibrium:
$$\lim_{k \to \infty} P_m(k) = \rho_m, \quad 0 \le m \le M. \tag{24}$$

Proof.
The property of the main component is used, which is specified by the sum
$$\sum_{n=0}^{M} p_n^2/\rho_n = \sum_{n=0}^{M} p_n (p_n/\rho_n). \tag{25}$$
This means averaging the fluctuations of the probability relations $p_n/\rho_n$, $0 \le n \le M$, over the distribution of frequencies at the current stage. In this case, the fluctuations of the ratios are equal to one for $p_n = \rho_n$, $0 \le n \le M$, and at the same time the main component of the incremental regression function is also equal to one.

Consequently, the possible values of the frequency probabilities can be split into three zones:
$$(+)\ p_n < \rho_n; \quad (-)\ p_n > \rho_n; \quad (0)\ p_n = \rho_n.$$
The sign of the incremental regression functions (17)-(19) is constant within each zone: in zone (+) the probabilities increase, in zone (−) they decrease.

Therefore, there exists a limit (24), whose value is ensured by the necessary condition for the existence of a limit:
$$\lim_{k \to \infty} \Delta P_m(k+1) = 0. \tag{26}$$

The binary evolutionary processes (EP) $P_\pm(k)$, $k \ge 0$, are determined by the following regression functions [1]:
$$P_\pm(k+1) = W_\pm(p)/W(p), \quad k \ge 0, \tag{27}$$
$$W_\pm(p) = p_\pm (W_\pm p_\pm + p_\mp), \tag{28}$$
$$W(p) = W_+(p_+) + W_-(p_-) = W_+ p_+^2 + 2 p_+ p_- + W_- p_-^2. \tag{29}$$
The frequency probabilities at the $k$-th stage satisfy the usual conditions $0 \le p_\pm \le 1$, $p_+ + p_- = 1$. The survival parameters are also limited by the relation $0 < W_\pm < 1$.

For the increments
$$\Delta P_\pm(k+1) := P_\pm(k+1) - P_\pm(k), \quad k \ge 0, \tag{30}$$
the corresponding regression functions of increments can be represented as follows:
$$W_0^{\pm}(p_\pm) = W_\pm(p_\pm)/W(p) - p_\pm, \tag{31}$$
or equivalently
$$W_0^{\pm}(p_\pm) = V_0^{\pm}(p_\pm)/W(p), \tag{32}$$
$$V_0^{\pm}(p_\pm) = W_\pm(p_\pm) - p_\pm W(p). \tag{33}$$
Now introduce the direction parameters, based on the survival ones: $V_\pm := 1 - W_\pm$. The relative equilibria are $\rho_\pm = \pi V_\pm^{-1}$, $\pi := \rho_+ \rho_-$ (cf. (16)), with the normalization condition $V_+ + V_- = 1$.

Then the numerators in (27) of the regression function take the form
$$W_\pm(p_\pm) = p_\pm (1 - \pi p_\pm/\rho_\pm), \quad \pi := \rho_+ \rho_-. \tag{34}$$
Therefore the numerator in (32) transforms into
$$V_0^{\pm}(p_\pm) = p_\mp W_\pm(p_\pm) - p_\pm W_\mp(p_\mp) = p_+ p_- (\rho_\pm p_\mp - \rho_\mp p_\pm). \tag{35}$$
The linear component has the representation
$$\rho_+ p_- - \rho_- p_+ = -(p_+ - \rho_+) = p_- - \rho_-. \tag{36}$$
So the regression functions of the increments of the binary evolutionary processes are represented by the fluctuations of probabilities:
$$V_0^{\pm}(p_\pm) = -p_+ p_- (p_\pm - \rho_\pm). \tag{37}$$
The Wright-Fisher normalizing factor has the representation
$$W(p) = 1 - \pi [p_+^2/\rho_+ + p_-^2/\rho_-] = 1 - [\rho_- p_+^2 + \rho_+ p_-^2]. \tag{38}$$
The balance condition is also evident:
$$V_0^{+}(p_+) + V_0^{-}(p_-) \equiv 0. \tag{39}$$

The evolutionary processes considered in the present work serve as predictable components of stochastic models in population genetics and are represented by conditional mathematical expectations:
$$P_m(k+1) := E[S_N^{(m)}(k+1) \mid S_N(k) = P(k)], \quad 0 \le m \le M, \quad k \ge 0. \tag{40}$$
The stochastic models in population genetics are determined by the averaged sums
$$S_N(k) := \frac{1}{N} \sum_{n=1}^{N} \delta_n(k), \quad k \ge 0, \tag{41}$$
of random sample variables $\delta_n(k)$, $1 \le n \le N$, which take values in a finite set $E = \{e_0, e_1, \ldots, e_M\}$ with $M + 1$ ($M \ge 1$) states (see [7]). So the stochastic models (41) are defined by the sum of two components:
$$S_N(k+1) = V(S_N(k)) + \Delta\mu_N(k+1), \quad k \ge 0. \tag{42}$$
The first, predictable, component is generated by conditional mathematical expectations:
$$V_m(P(k)) = P_m(k) + V_0^{(m)}(P(k))/W(P(k)), \quad 0 \le m \le M, \quad k \ge 0. \tag{43}$$
The second component forms the martingale differences
$$\Delta\mu_N(k+1) = S_N(k+1) - V(S_N(k)), \quad k \ge 0, \tag{44}$$
characterized by the first two moments:
$$E\,\Delta\mu_N^{(m)}(k+1) = 0, \quad 0 \le m \le M,$$
$$E[(\Delta\mu_N^{(m)}(k+1))^2 \mid S_N(k)] = \sigma_m^2(S_N(k)), \quad 0 \le m \le M, \quad k \ge 0. \tag{45}$$
The conditional dispersion is determined by the regression functions:
$$\sigma_m^2(p) = V_m(p)[1 - V_m(p)], \quad 0 \le m \le M. \tag{46}$$
The asymptotic properties of the stochastic models (42)-(45) as $N \to \infty$, as well as $k \to \infty$, will be investigated in our next paper. The algorithms of phase merging [8] and of statistical estimation of the drift parameter [9], [10] can be directly applied.

References