On the asymptotic approximation to the probability distribution of extremal precipitation
V. Yu. Korolev, A. K. Gorshenin

Abstract. Based on the negative binomial model for the duration of wet periods measured in days [2], an asymptotic approximation is proposed for the distribution of the maximum daily precipitation volume within a wet period. This approximation has the form of a scale mixture of the Fréchet distribution with the gamma mixing distribution and coincides with the distribution of a positive power of a random variable having the Snedecor–Fisher distribution. The proof of this result is based on the representation of the negative binomial distribution as a mixed geometric (and hence, mixed Poisson) distribution [7] and limit theorems for extreme order statistics in samples with random sizes having mixed Poisson distributions [22]. Some analytic properties of the obtained limit distribution are described. In particular, it is demonstrated that under certain conditions the limit distribution is mixed exponential and hence, is infinitely divisible. It is shown that under the same conditions the limit distribution can be represented as a scale mixture of stable or Weibull or Pareto or folded normal laws. The corresponding product representations for the limit random variable can be used for its computer simulation. Several methods are proposed for the estimation of the parameters of the distribution of the maximum daily precipitation volume. The results of fitting this distribution to real data are presented, illustrating high adequacy of the proposed model. The obtained mixture representations for the limit laws and the corresponding asymptotic approximations provide better insight into the nature of mixed probability ("Bayesian") models.

Key words: wet period, maximum daily precipitation, negative binomial distribution, asymptotic approximation, extreme order statistic, random sample size, gamma distribution, Fréchet distribution, Snedecor–Fisher distribution, parameter estimation.
Author affiliations, in author order. V. Yu. Korolev: Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Russia; Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Russia; Hangzhou Dianzi University, China. A. K. Gorshenin: Institute of Informatics Problems, Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, Russia.

Introduction

In most papers dealing with the statistical analysis of meteorological data available to the authors, the suggested analytical models for the observed statistical regularities in precipitation are rather idealized and inadequate. For example, it is traditionally assumed that the duration of a wet period (the number of consecutive wet days) follows the geometric distribution (for example, see [1]), although the goodness-of-fit of this model is far from acceptable. Perhaps this prejudice is based on the conventional interpretation of the geometric distribution in terms of Bernoulli trials as the distribution of the number of consecutive wet days ("successes") before the first dry day ("failure"). But the framework of Bernoulli trials assumes that the trials are independent, whereas a thorough statistical analysis of precipitation data registered at different points demonstrates that the sequence of dry and wet days is not only dependent, but is also devoid of the Markov property, so that the framework of Bernoulli trials is inadequate for analyzing meteorological data.

It turned out that the statistical regularities of the number of consecutive wet days can be very reliably modeled by the negative binomial distribution with the shape parameter less than one.
For example, in [2] we analyzed meteorological data registered at two geographic points with very different climates: Potsdam (Brandenburg, Germany), with a mild climate influenced by the proximity of the ocean and the warm Gulf Stream, and Elista (Kalmykia, Russia), with a sharply continental climate. The initial data of daily precipitation in Elista and Potsdam are presented in Figures 1a and 1b, respectively. In these figures the horizontal axis is discrete time measured in days. The vertical axis is the daily precipitation volume measured in centimeters. In other words, the height of each "pin" in these figures is the precipitation volume registered on the corresponding day (at the corresponding point on the horizontal axis).

Fig. 1: The initial data of daily precipitation in Elista (a) and Potsdam (b).

In order to analyze the statistical regularities of the duration of wet periods, these data were rearranged as shown in Figures 2a and 2b.

Fig. 2: The durations of wet periods in Elista (a) and Potsdam (b).

In these figures the horizontal axis is the ordinal number of the wet period. It should be mentioned that directly before and after each wet period there is at least one dry day, that is, successive wet periods are separated by dry periods. The vertical axis shows the durations of wet periods. In other words, the height of each "pin" in these figures is the length of the corresponding wet period measured in days, and the corresponding point on the horizontal axis is the number of the wet period.

The samples of durations in both Elista and Potsdam were assumed homogeneous and independent. It was demonstrated that the fluctuations of the numbers of successive wet days fit the negative binomial distribution with shape parameter less than one with very high confidence (also see [3]). Figures 3a and 3b show the histograms constructed from the corresponding samples of durations of wet periods and the fitted negative binomial distribution.
In both cases the shape parameter r turned out to be less than one. For Elista r = 0.876, p = 0.…; for Potsdam r = 0.847, p = 0.…

[…] 1] is determined, which is then used as the probability of success in the sequence of Bernoulli trials in which the original "unconditional" r.v. with the negative binomial distribution is nothing else than the "conditionally" geometrically distributed r.v. having the sense of the number of trials up to the first failure. This makes it possible to assume that the sequence of wet/dry days is not independent, but is conditionally independent, and the random probability of success is determined by some outer stochastic factors. Such factors may include seasonality or the type of cause of a rainy period.

The negative binomial model for the distribution of the duration of wet periods makes it possible to obtain asymptotic approximations for important characteristics of precipitation such as the distribution of the total precipitation volume per wet period and the distribution of the maximum daily precipitation volume within a wet period. The first of these approximations was proposed in [7], where an analog of the law of large numbers for negative binomial random sums was presented stating that the limit distribution for these sums is the gamma distribution.

The construction of the second approximation is the aim of the present paper.

The paper is organized as follows. Definitions and notation are introduced in Section 2, which also contains some preliminary results providing theoretical grounds for the negative binomial model of the probability distribution of the duration of wet periods. Main results are presented and proved in Section 3, where the asymptotic approximation is proposed for the distribution of the maximum daily precipitation volume within a wet period. Some analytic properties of the obtained limit distribution are described. In particular, it is demonstrated that under certain conditions the limit distribution is mixed exponential and hence, is infinitely divisible.
It is shown that under the same conditions the limit distribution can be represented as a scale mixture of stable or Weibull or Pareto or folded normal laws. The corresponding product representations for the limit random variable can be used for its computer simulation. Several methods for the statistical estimation of the parameters of this distribution are proposed in Section 4. Section 5 contains the results of fitting the distribution proposed in Section 3 to real data by the methods described in Section 4.
Preliminaries
Although the main objects of our interest are probability distributions, for convenience and brevity in what follows we will expound our results in terms of r.v:s with the corresponding distributions, assuming that all the r.v:s under consideration are defined on one and the same probability space $(\Omega, \mathfrak{F}, \mathsf{P})$.

In the paper, conventional notation is used. The symbols $\stackrel{d}{=}$ and $\Longrightarrow$ denote the coincidence of distributions and convergence in distribution, respectively. The integer and fractional parts of a number $z$ will be respectively denoted $[z]$ and $\{z\}$.

A r.v. having the gamma distribution with shape parameter $r>0$ and scale parameter $\lambda>0$ will be denoted $G_{r,\lambda}$,
$$\mathsf{P}(G_{r,\lambda}<x)=\int_0^x g(z;r,\lambda)\,dz,\quad\text{with}\quad g(x;r,\lambda)=\frac{\lambda^r}{\Gamma(r)}\,x^{r-1}e^{-\lambda x},\quad x\ge 0,$$
where $\Gamma(r)$ is Euler's gamma-function, $\Gamma(r)=\int_0^{\infty}x^{r-1}e^{-x}\,dx$, $r>0$.

In this notation, $G_{1,1}$ is a r.v. with the standard exponential distribution: $\mathsf{P}(G_{1,1}<x)=\big[1-e^{-x}\big]\mathbf{1}(x\ge 0)$ (here and in what follows $\mathbf{1}(A)$ is the indicator function of a set $A$).

The gamma distribution is a particular representative of the class of generalized gamma distributions (GG-distributions), which were first described in [12] as a special family of lifetime distributions containing both gamma distributions and Weibull distributions. A GG-distribution is the absolutely continuous distribution defined by the density
$$g^*(x;r,\gamma,\lambda)=\frac{|\gamma|\lambda^r}{\Gamma(r)}\,x^{\gamma r-1}e^{-\lambda x^{\gamma}},\quad x\ge 0,$$
with $\gamma\in\mathbb{R}$, $\lambda>0$, $r>0$. A r.v. with the density $g^*(x;r,\gamma,\lambda)$ will be denoted $G^*_{r,\gamma,\lambda}$. It can be easily made sure that
$$G^*_{r,\gamma,\lambda}\stackrel{d}{=}G_{r,\lambda}^{1/\gamma}.\tag{1}$$
For a r.v. with the Weibull distribution, a particular case of GG-distributions corresponding to the density $g^*(x;1,\gamma,1)$ and the distribution function (d.f.) $\big[1-e^{-x^{\gamma}}\big]\mathbf{1}(x\ge 0)$, we will use the special notation $W_{\gamma}$. Thus, $G_{1,1}\stackrel{d}{=}W_1$. It is easy to see that
$$W_1^{1/\gamma}\stackrel{d}{=}W_{\gamma}.\tag{2}$$

A r.v. with the standard normal d.f. $\Phi(x)$ will be denoted $X$,
$$\mathsf{P}(X<x)=\Phi(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-z^2/2}\,dz,\quad x\in\mathbb{R}.$$
The distribution of the r.v. $|X|$, i.e. $\mathsf{P}(|X|<x)=2\Phi(x)-1$, $x\ge 0$, is called half-normal or folded normal.

The d.f. and the density of a strictly stable distribution with the characteristic exponent $\alpha$ and shape parameter $\theta$ defined by the characteristic function (ch.f.)
$$\mathfrak{f}_{\alpha,\theta}(t)=\exp\big\{-|t|^{\alpha}\exp\{-\tfrac12 i\pi\theta\alpha\,\mathrm{sign}\,t\}\big\},\quad t\in\mathbb{R},$$
with $0<\alpha\le 2$, $|\theta|\le\min\{1,\tfrac{2}{\alpha}-1\}$, will be respectively denoted $F_{\alpha,\theta}(x)$ and $f_{\alpha,\theta}(x)$ (see, e.g., [14]). A r.v. with the d.f. $F_{\alpha,\theta}(x)$ will be denoted $S_{\alpha,\theta}$.

To symmetric strictly stable distributions there correspond the value $\theta=0$ and the ch.f. $\mathfrak{f}_{\alpha,0}(t)=e^{-|t|^{\alpha}}$, $t\in\mathbb{R}$. To one-sided strictly stable distributions concentrated on the nonnegative half-line there correspond the values $\theta=1$ and $0<\alpha\le 1$. The pairs $\alpha=1$, $\theta=\pm 1$ correspond to the distributions degenerate in $\pm 1$, respectively. All the remaining strictly stable distributions are absolutely continuous. Stable densities cannot be explicitly represented via elementary functions with four exceptions: the normal distribution ($\alpha=2$, $\theta=0$), the Cauchy distribution ($\alpha=1$, $\theta=0$), the Lévy distribution ($\alpha=\tfrac12$, $\theta=1$) and the distribution symmetric to the Lévy law ($\alpha=\tfrac12$, $\theta=-1$).

It is well known that if $0<\alpha<2$, then $\mathsf{E}|S_{\alpha,\theta}|^{\beta}<\infty$ for any $\beta\in(0,\alpha)$, but the moments of the r.v. $S_{\alpha,\theta}$ of orders $\beta\ge\alpha$ do not exist (see, e.g., [14]). Despite the absence of explicit expressions for the densities of stable distributions in terms of elementary functions, it can be shown [16] that
$$\mathsf{E}S_{\alpha,1}^{\beta}=\frac{\Gamma\big(1-\frac{\beta}{\alpha}\big)}{\Gamma(1-\beta)}\tag{3}$$
for $0<\beta<\alpha\le 1$.

If $\alpha\in(0,1)$ and the i.i.d. r.v:s $S_{\alpha,1}$ and $S'_{\alpha,1}$ have the same strictly stable distribution, then the density $v_{\alpha}(x)$ of the r.v. $R_{\alpha}=S_{\alpha,1}/S'_{\alpha,1}$ has the form
$$v_{\alpha}(x)=\frac{\sin(\pi\alpha)\,x^{\alpha-1}}{\pi\big[1+x^{2\alpha}+2x^{\alpha}\cos(\pi\alpha)\big]},\quad x>0.\tag{4}$$

A r.v. $N_{r,p}$ is said to have the negative binomial distribution with parameters $r>0$ ("shape") and $p\in(0,1)$ ("success probability"), if
$$\mathsf{P}(N_{r,p}=k)=\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\cdot p^r(1-p)^k,\quad k=0,1,2,\ldots$$
A particular case of the negative binomial distribution corresponding to the value $r=1$ is the geometric distribution. Let $p\in(0,1)$ and let $N_{1,p}$ be the r.v. having the geometric distribution with parameter $p$:
$$\mathsf{P}(N_{1,p}=k)=p(1-p)^k,\quad k=0,1,2,\ldots$$
This means that for any $m\in\mathbb{N}$
$$\mathsf{P}(N_{1,p}\ge m)=\sum_{k=m}^{\infty}p(1-p)^k=(1-p)^m.$$

Let $Y$ be a r.v. taking values in the interval $(0,1)$. Moreover, let for all $p\in(0,1)$ the r.v. $Y$ and the geometrically distributed r.v. $N_{1,p}$ be independent. Let $V=N_{1,Y}$, that is, $V(\omega)=N_{1,Y(\omega)}(\omega)$ for any $\omega\in\Omega$. The distribution
$$\mathsf{P}(V\ge m)=\int_0^1(1-y)^m\,d\mathsf{P}(Y<y),\quad m\in\mathbb{N},$$
of the r.v. $V$ will be called $Y$-mixed geometric [8].

It is well known that the negative binomial distribution is a mixed Poisson distribution with the gamma mixing distribution [20] (also see [6]): for any $r>0$, $p\in(0,1)$ and $k\in\{0\}\cup\mathbb{N}$ we have
$$\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\cdot p^r(1-p)^k=\frac{1}{k!}\int_0^{\infty}e^{-z}z^k g(z;r,\mu)\,dz,\tag{5}$$
where $\mu=p/(1-p)$.

Based on representation (5), in [7] it was proved that any negative binomial distribution with the shape parameter no greater than one is a mixed geometric distribution. Namely, the following statement was proved, which gives an analytic explanation of the validity of the negative binomial model for the duration of wet periods measured in days (see the Introduction).

Theorem 1 [7].
The negative binomial distribution with parameters $r\in(0,1)$ and $p\in(0,1)$ is a mixed geometric distribution: for any $k\in\{0\}\cup\mathbb{N}$
$$\frac{\Gamma(r+k)}{k!\,\Gamma(r)}\cdot p^r(1-p)^k=\int_{\mu}^{\infty}\Big(\frac{z}{z+1}\Big)\Big(1-\frac{z}{z+1}\Big)^k p(z;r,\mu)\,dz=\int_p^1 y(1-y)^k h(y;r,p)\,dy,$$
where $\mu=p/(1-p)$ and the probability densities $p(z;r,\mu)$ and $h(y;r,p)$ have the forms
$$p(z;r,\mu)=\frac{\mu^r}{\Gamma(1-r)\Gamma(r)}\cdot\frac{\mathbf{1}(z\ge\mu)}{(z-\mu)^r z},$$
$$h(y;r,p)=\frac{p^r}{\Gamma(1-r)\Gamma(r)}\cdot\frac{(1-y)^{r-1}\mathbf{1}(p<y<1)}{y(y-p)^r}.$$
Furthermore, if $G_{r,1}$ and $G_{1-r,1}$ are independent gamma-distributed r.v:s, $\mu>0$, $p\in(0,1)$, then the density $p(z;r,\mu)$ corresponds to the r.v.
$$Z_{r,\mu}=\frac{\mu(G_{r,1}+G_{1-r,1})}{G_{r,1}}\tag{6}$$
and the density $h(y;r,p)$ corresponds to the r.v.
$$Y_{r,p}=\frac{p(G_{r,1}+G_{1-r,1})}{G_{r,1}+pG_{1-r,1}}.$$

Let $P(t)$, $t\ge 0$, be the standard Poisson process (homogeneous Poisson process with unit intensity). Then distribution (5) corresponds to the r.v. $N_{r,p}=P(G_{r,p/(1-p)})$, where the r.v. $G_{r,p/(1-p)}$ is independent of the process $P(t)$.

In this section we will deduce the probability distribution of extremal daily precipitation within a wet period.

Let $r>0$, $\lambda>0$, $q\in(0,1)$, $n\in\mathbb{N}$, $p_n=\min\{q,\lambda/n\}$. It is easy to make sure that
$$n^{-1}G_{r,p_n/(1-p_n)}\Longrightarrow G_{r,\lambda}\tag{7}$$
as $n\to\infty$.

Lemma 1.
Let $\Lambda_1,\Lambda_2,\ldots$ be a sequence of positive r.v:s such that for any $n\in\mathbb{N}$ the r.v. $\Lambda_n$ is independent of the Poisson process $P(t)$, $t\ge 0$. The convergence
$$n^{-1}P(\Lambda_n)\Longrightarrow\Lambda$$
as $n\to\infty$ to some nonnegative r.v. $\Lambda$ takes place if and only if
$$n^{-1}\Lambda_n\Longrightarrow\Lambda\tag{8}$$
as $n\to\infty$.

Proof. This statement is a particular case of Lemma 2 in [21] (also see Theorem 7.9.1 in [6]).

Consider a sequence of independent identically distributed (i.i.d.) r.v:s $X_1,X_2,\ldots$. Let $N_1,N_2,\ldots$ be a sequence of natural-valued r.v:s such that for each $n\in\mathbb{N}$ the r.v. $N_n$ is independent of the sequence $X_1,X_2,\ldots$. Denote $M_n=\max\{X_1,\ldots,X_{N_n}\}$.

Let $F(x)$ be a d.f., $a\in\mathbb{R}$. Denote $\mathrm{rext}(F)=\sup\{x:\,F(x)<1\}$, $F^{-1}(a)=\inf\{x:\,F(x)\ge a\}$.

Lemma 2.
Let $\Lambda_1,\Lambda_2,\ldots$ be a sequence of positive r.v:s such that for each $n\in\mathbb{N}$ the r.v. $\Lambda_n$ is independent of the Poisson process $P(t)$, $t\ge 0$. Let $N_n=P(\Lambda_n)$. Assume that there exists a nonnegative r.v. $\Lambda$ such that convergence (8) takes place. Let $X_1,X_2,\ldots$ be i.i.d. r.v:s with a common d.f. $F(x)$. Assume also that $\mathrm{rext}(F)=\infty$ and there exists a number $\gamma>0$ such that for each $x>0$
$$\lim_{y\to\infty}\frac{1-F(xy)}{1-F(y)}=x^{-\gamma}.\tag{9}$$
Then
$$\lim_{n\to\infty}\sup_{x\ge 0}\bigg|\mathsf{P}\bigg(\frac{M_n}{F^{-1}(1-\frac{1}{n})}<x\bigg)-\int_0^{\infty}e^{-zx^{-\gamma}}\,d\mathsf{P}(\Lambda<z)\bigg|=0.$$

Proof.
This statement is a particular case of Theorem 3.1 in [22].
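The mixed Poisson representation $N_{r,p}=P(G_{r,p/(1-p)})$, on which Lemmas 1 and 2 are brought to bear, can be checked by simulation. The following is a minimal sketch, not part of the original derivation: the function names, the sample size, the seed and the value $p=0.5$ are illustrative assumptions, while $r=0.847$ is the Potsdam value quoted in the paper. The Poisson process is simulated by counting unit-rate exponential inter-arrival times that fall into $[0,\Lambda]$.

```python
import math
import random


def nb_pmf(k, r, p):
    """Negative binomial probability P(N_{r,p} = k), computed via log-gamma."""
    log_pmf = (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def poisson_process_at(t, rng):
    """Value at time t of a unit-rate Poisson process: the number of
    exponential inter-arrival times that fit into [0, t]."""
    count, clock = 0, rng.expovariate(1.0)
    while clock <= t:
        count += 1
        clock += rng.expovariate(1.0)
    return count


def sample_mixed_poisson(r, p, rng):
    """One draw of P(G_{r,mu}) with mu = p / (1 - p)."""
    mu = p / (1.0 - p)
    # random.gammavariate takes (shape, scale); the parameter mu of the
    # density g(z; r, mu) is a rate, so the scale is 1 / mu.
    random_time = rng.gammavariate(r, 1.0 / mu)
    return poisson_process_at(random_time, rng)


def max_pmf_gap(r, p, n_samples=100_000, k_max=5, seed=1):
    """Largest absolute gap between the simulated frequencies of
    k = 0..k_max and the negative binomial probabilities."""
    rng = random.Random(seed)
    counts = [0] * (k_max + 1)
    for _ in range(n_samples):
        k = sample_mixed_poisson(r, p, rng)
        if k <= k_max:
            counts[k] += 1
    return max(abs(counts[k] / n_samples - nb_pmf(k, r, p))
               for k in range(k_max + 1))


if __name__ == "__main__":
    # r is the Potsdam estimate from the text; p = 0.5 is hypothetical.
    print(max_pmf_gap(0.847, 0.5))
```

With a sample of this size the gap is expected to be of the order of the Monte Carlo error, which is consistent with representation (5).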
Theorem 2.
Let $n\in\mathbb{N}$, $\lambda>0$, $q\in(0,1)$ and let $N_{r,p_n}$ be a r.v. with the negative binomial distribution with parameters $r>0$ and $p_n=\min\{q,\lambda/n\}$. Let $X_1,X_2,\ldots$ be i.i.d. r.v:s with a common d.f. $F(x)$. Assume that $\mathrm{rext}(F)=\infty$ and there exists a number $\gamma>0$ such that relation (9) holds for any $x>0$. Then
$$\lim_{n\to\infty}\sup_{x\ge 0}\bigg|\mathsf{P}\bigg(\frac{\max\{X_1,\ldots,X_{N_{r,p_n}}\}}{F^{-1}(1-\frac{1}{n})}<x\bigg)-F(x;r,\lambda,\gamma)\bigg|=0,$$
where
$$F(x;r,\lambda,\gamma)=\Big(\frac{\lambda x^{\gamma}}{1+\lambda x^{\gamma}}\Big)^r,\quad x\ge 0.$$

Proof. From (5) it follows that $N_{r,p_n}\stackrel{d}{=}P(G_{r,p_n/(1-p_n)})$. Therefore, from (7), Lemma 1 with $\Lambda_n=G_{r,p_n/(1-p_n)}$ and Lemma 2, taking into account the absolute continuity of the limit distribution, it immediately follows that
$$\lim_{n\to\infty}\sup_{x\ge 0}\bigg|\mathsf{P}\bigg(\frac{\max\{X_1,\ldots,X_{N_{r,p_n}}\}}{F^{-1}(1-\frac{1}{n})}<x\bigg)-\frac{\lambda^r}{\Gamma(r)}\int_0^{\infty}e^{-z(\lambda+x^{-\gamma})}z^{r-1}\,dz\bigg|=0.$$
By elementary calculation it can be made sure that
$$\frac{\lambda^r}{\Gamma(r)}\int_0^{\infty}e^{-z(\lambda+x^{-\gamma})}z^{r-1}\,dz=\frac{\lambda^r}{\Gamma(r)(\lambda+x^{-\gamma})^r}\int_0^{\infty}e^{-z}z^{r-1}\,dz=\Big(\frac{\lambda x^{\gamma}}{1+\lambda x^{\gamma}}\Big)^r.$$
The theorem is proved.

It is worth noting that the limit distribution with the power-type decrease of the tail was obtained in Theorem 2 as a scale mixture of the Fréchet distribution (the type II extreme value distribution) in which the mixing law is the gamma distribution.

Since the Fréchet d.f. $e^{-x^{-\gamma}}$ with $\gamma>0$ corresponds to the r.v. $W_{\gamma}^{-1}$, it is easy to make sure that the d.f. $F(x;r,\lambda,\gamma)$ corresponds to the r.v. $M_{r,\gamma,\lambda}\equiv G_{r,\lambda}^{1/\gamma}W_{\gamma}^{-1}$, where the multipliers on the right-hand side are independent. From (1) and (2) it follows that
$$M_{r,\gamma,\lambda}\stackrel{d}{=}\Big(\frac{G_{r,\lambda}}{W_1}\Big)^{1/\gamma}\stackrel{d}{=}\frac{G^*_{r,\gamma,\lambda}}{W_{\gamma}},\tag{10}$$
where in each term the multipliers are independent. Consider the r.v. $G_{r,\lambda}/W_1$ in (10) in more detail. We have
$$\frac{G_{r,\lambda}}{W_1}\stackrel{d}{=}\frac{G_{r,\lambda}}{G_{1,1}}\stackrel{d}{=}\frac{G_{r,1}}{\lambda G_{1,1}}\stackrel{d}{=}\frac{r\,Q_{r,1}}{\lambda},$$
where $Q_{r,1}$ is the r.v. having the Snedecor–Fisher distribution with parameters $r$, $1$ defined by the Lebesgue density
$$f_{r,1}(x)=\frac{r^{r+1}x^{r-1}}{(1+rx)^{r+1}},\quad x\ge 0$$
(see, e.g., [15], Section 27); indeed, the ratio $G_{r,1}/G_{1,1}$ has the density $r\,x^{r-1}/(1+x)^{r+1}$, $x\ge 0$, which is the density of $r\,Q_{r,1}$.

So,
$$M_{r,\gamma,\lambda}\stackrel{d}{=}\Big(\frac{r\,Q_{r,1}}{\lambda}\Big)^{1/\gamma},\tag{11}$$
and the statement of Theorem 2 can be re-formulated as
$$\frac{\max\{X_1,\ldots,X_{N_{r,p_n}}\}}{F^{-1}(1-\frac{1}{n})}\Longrightarrow M_{r,\gamma,\lambda}\equiv\frac{G_{r,\lambda}^{1/\gamma}}{W_{\gamma}}\stackrel{d}{=}\Big(\frac{r\,Q_{r,1}}{\lambda}\Big)^{1/\gamma}\quad(n\to\infty).\tag{12}$$

The density of the limit distribution $F(x;r,\gamma,\lambda)$ of the extreme daily precipitation within a wet period has the form
$$p(x;r,\gamma,\lambda)=\frac{r\gamma\lambda^r x^{\gamma r-1}}{(1+\lambda x^{\gamma})^{r+1}}=\frac{\gamma r\lambda^r}{x^{1+\gamma}(\lambda+x^{-\gamma})^{r+1}},\quad x>0.\tag{13}$$
It is easy to see that $p(x;r,\gamma,\lambda)=O(x^{-1-\gamma})$ as $x\to\infty$. Therefore $\mathsf{E}M_{r,\gamma,\lambda}^{\delta}<\infty$ only if $\delta<\gamma$. Moreover, from (12) it is possible to deduce explicit expressions for the moments of the r.v. $M_{r,\gamma,\lambda}$.

Theorem 3. Let $0<\delta<\gamma<\infty$. Then
$$\mathsf{E}M_{r,\gamma,\lambda}^{\delta}=\frac{\Gamma\big(r+\frac{\delta}{\gamma}\big)\Gamma\big(1-\frac{\delta}{\gamma}\big)}{\lambda^{\delta/\gamma}\Gamma(r)}.$$

Proof. From (12) it follows that
$$\mathsf{E}M_{r,\gamma,\lambda}^{\delta}=\mathsf{E}G_{r,\lambda}^{\delta/\gamma}\cdot\mathsf{E}W_1^{-\delta/\gamma}.\tag{14}$$
It is easy to verify that
$$\mathsf{E}G_{r,\lambda}^{\delta/\gamma}=\frac{\Gamma\big(r+\frac{\delta}{\gamma}\big)}{\lambda^{\delta/\gamma}\Gamma(r)},\qquad\mathsf{E}W_1^{-\delta/\gamma}=\Gamma\big(1-\tfrac{\delta}{\gamma}\big).\tag{15}$$
Hence follows the desired result.

To analyze the properties of the limit distribution in Theorem 2 more thoroughly we will require some additional auxiliary results.

Lemma 3 [16].
Let $\gamma\in(0,1)$. Then
$$W_{\gamma}\stackrel{d}{=}\frac{W_1}{S_{\gamma,1}}$$
with the r.v:s on the right-hand side being independent.

Lemma 4 [7].
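Let $r\in(0,1)$, $\gamma\in(0,1)$, $\lambda>0$. Then
$$G_{r,\lambda}^{1/\gamma}\stackrel{d}{=}G^*_{r,\gamma,\lambda}\stackrel{d}{=}\frac{W_{\gamma}}{Z_{r,\lambda}^{1/\gamma}}\stackrel{d}{=}\frac{W_1}{S_{\gamma,1}Z_{r,\lambda}^{1/\gamma}},$$
where the r.v. $Z_{r,\lambda}$ was defined in (6) and all the involved r.v:s are independent.

Theorem 4. Let $r\in(0,1)$, $\gamma\in(0,1)$, $\lambda>0$. Then the following product representations are valid:
$$M_{r,\gamma,\lambda}\stackrel{d}{=}\frac{G_{r,\lambda}^{1/\gamma}S_{\gamma,1}}{W_1},\tag{16}$$
$$M_{r,\gamma,\lambda}\stackrel{d}{=}\frac{W_{\gamma}}{W'_{\gamma}\cdot Z_{r,\lambda}^{1/\gamma}}\stackrel{d}{=}\frac{W_1\cdot R_{\gamma}}{W'_1 Z_{r,\lambda}^{1/\gamma}}\stackrel{d}{=}\frac{\Pi R_{\gamma}}{Z_{r,\lambda}^{1/\gamma}}\stackrel{d}{=}\frac{|X|\sqrt{2W_1}\,R_{\gamma}}{W'_1 Z_{r,\lambda}^{1/\gamma}},\tag{17}$$
where $W_{\gamma}\stackrel{d}{=}W'_{\gamma}$, $W_1\stackrel{d}{=}W'_1$, the r.v. $R_{\gamma}$ has the density (4), the r.v. $\Pi$ has the Pareto distribution: $\mathsf{P}(\Pi>x)=(x+1)^{-1}$, $x\ge 0$, and in each term the involved r.v:s are independent.

Proof. Relation (16) follows from (12) and Lemma 3; relation (17) follows from (12) and Lemma 4, taking into account the representation $W_1\stackrel{d}{=}|X|\sqrt{2W_1}$ (with independent factors on the right-hand side), the proof of which can be found in, say, [16].

Taking into account the relation $R_{\gamma}\stackrel{d}{=}R_{\gamma}^{-1}$, from (17) we obtain the following statement.

Corollary 1.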
Let $r\in(0,1)$, $\gamma\in(0,1)$, $\lambda>0$. Then the d.f. $F(x;r,\gamma,\lambda)$ is mixed exponential:
$$1-F(x;r,\gamma,\lambda)=\int_0^{\infty}e^{-ux}\,dA(u),\quad x\ge 0,$$
where
$$A(u)=\mathsf{P}\big(W_1R_{\gamma}Z_{r,\lambda}^{1/\gamma}<u\big),\quad u\ge 0,$$
and all the involved r.v:s are independent.

Remark 1.
It is worth noting that the mixing distribution in Corollary 1 is mixedexponential itself.
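The remark that product representations can serve for computer simulation can be illustrated with the simplest of them, $M_{r,\gamma,\lambda}\stackrel{d}{=}(G_{r,\lambda}/W_1)^{1/\gamma}$ from (10). The following is a minimal sketch under stated assumptions: the function names, the sample size, the seed and the values $\lambda=1$, $\gamma=2$ are hypothetical and chosen only for illustration. It draws the limit r.v. and compares the empirical d.f. with $F(x;r,\lambda,\gamma)$ at a few points.

```python
import random


def limit_cdf(x, r, lam, gamma):
    """F(x; r, lambda, gamma) = (lam x^gamma / (1 + lam x^gamma))^r, x > 0."""
    if x <= 0.0:
        return 0.0
    t = lam * x ** gamma
    return (t / (1.0 + t)) ** r


def sample_limit_rv(r, lam, gamma, rng):
    """One draw of M = (G_{r,lam} / W_1)^(1/gamma) with independent G, W_1."""
    g = rng.gammavariate(r, 1.0 / lam)  # shape r, rate lam (scale 1/lam)
    w = rng.expovariate(1.0)            # standard exponential W_1
    while w == 0.0:                     # guard against a zero denominator
        w = rng.expovariate(1.0)
    return (g / w) ** (1.0 / gamma)


def empirical_gap(r, lam, gamma, points, n_samples=100_000, seed=7):
    """Largest gap between the empirical d.f. of simulated values and the
    model d.f. over the given evaluation points."""
    rng = random.Random(seed)
    draws = [sample_limit_rv(r, lam, gamma, rng) for _ in range(n_samples)]
    return max(
        abs(sum(d < x for d in draws) / n_samples - limit_cdf(x, r, lam, gamma))
        for x in points
    )


if __name__ == "__main__":
    # r = 0.847 is the Potsdam value from the text; lam and gamma are made up.
    print(empirical_gap(0.847, 1.0, 2.0, [0.5, 1.0, 2.0, 5.0]))
```

Representation (10) is used here because it needs only gamma and exponential generators; the representations of Theorem 4 would additionally require a generator of one-sided strictly stable r.v:s.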
Corollary 2.
Let $r\in(0,1)$, $\gamma\in(0,1)$, $\lambda>0$. Then the d.f. $F(x;r,\gamma,\lambda)$ is infinitely divisible.

Proof.
This statement immediately follows from Corollary 1 and the result of Goldie [23] stating that the product of two independent non-negative random variables is infinitely divisible if one of the two is exponentially distributed.

Theorem 4 states that the limit distribution in Theorem 2 can be represented as a scale mixture of exponential or stable or Weibull or Pareto or folded normal laws. The corresponding product representations for the r.v. $M_{r,\gamma,\lambda}$ can be used for its computer simulation.

In practice, the asymptotic approximation $F(x;r,\lambda,\gamma)$ for the distribution of the extreme daily precipitation within a wet period proposed by Theorem 2 is adequate if the "success probability" is small enough, that is, if on the average the wet periods are long enough.

Estimation of the parameters $r$, $\lambda$ and $\gamma$

From (13) it can be seen that the maximum likelihood estimation of the parameters $r$, $\lambda$ and $\gamma$ requires the numerical solution of a system of transcendental equations by iterative procedures, without any guarantee that the resulting maximum is global. Only if the initial approximation is close to the true maximum likelihood point in the three-dimensional parameter set can one hope that the extreme point found by the numerical algorithm is global.

For rough estimation of the parameters, the following considerably simpler method can be used. The resulting rough estimates can be used as a starting point for the 'full' maximum likelihood algorithm mentioned above in order to ensure the closeness of the initial approximation to the true solution. The rough method is based on the fact that the quantiles of the d.f. $F(x;r,\lambda,\gamma)$ can be written out explicitly. Namely, the quantile $x(\epsilon;r,\lambda,\gamma)$ of the d.f. $F(x;r,\lambda,\gamma)$ of order $\epsilon\in(0,1)$, that is, the solution of the equation $F(x;r,\lambda,\gamma)=\epsilon$ with respect to $x$, obviously has the form
$$x(\epsilon;r,\lambda,\gamma)=\Big(\frac{\epsilon^{1/r}}{\lambda-\lambda\epsilon^{1/r}}\Big)^{1/\gamma}.$$
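Since the quantile is explicit, it can also be coded directly and reused for inverse-transform sampling. The following is a minimal sketch; the function names and the seed are illustrative assumptions rather than part of the paper.

```python
import random


def limit_quantile(eps, r, lam, gamma):
    """Quantile of order eps of F(x; r, lam, gamma):
    (eps^(1/r) / (lam - lam * eps^(1/r)))^(1/gamma)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie in (0, 1)")
    u = eps ** (1.0 / r)
    return (u / (lam * (1.0 - u))) ** (1.0 / gamma)


def sample_by_inversion(n, r, lam, gamma, seed=0):
    """Draw n values by plugging uniform numbers into the quantile function."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        eps = rng.random()          # uniform on [0, 1)
        if eps > 0.0:               # the quantile is defined on (0, 1) only
            sample.append(limit_quantile(eps, r, lam, gamma))
    return sample
```

For example, `limit_quantile(0.5, r, lam, gamma)` gives the model median, which can be compared with the sample median of the observed maxima as a quick sanity check.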
Let there be observations $\{X_{i,j}\}$, $i=1,\ldots,m$, $j=1,\ldots,m_i$, at our disposal, where $i$ is the number of a wet period (the number of a sequence of rainy days), $j$ is the number of a day in the wet sequence, $m_i$ is the length of the $i$th wet sequence (the number of rainy days in the $i$th wet period), $m$ is the total number of wet sequences, and $X_{i,j}$ is the precipitation volume on the $j$th day of the $i$th wet sequence. Construct the sample $X^*_1,\ldots,X^*_m$ as
$$X^*_k=\max\{X_{k,1},\ldots,X_{k,m_k}\},\quad k=1,\ldots,m.\tag{18}$$
Let $X^*_{(1)},\ldots,X^*_{(m)}$ be the order statistics constructed from the sample $X^*_1,\ldots,X^*_m$. Since we have three unknown parameters $r$, $\lambda$ and $\gamma$, fix three numbers $0<p_1<p_2<p_3<1$ and consider the system of equations
$$X^*_{([mp_k])}=\Big(\frac{p_k^{1/r}}{\lambda-\lambda p_k^{1/r}}\Big)^{1/\gamma},\quad k=1,2,3$$
(here the symbol $[a]$ denotes the integer part of a number $a$).

This system can be solved by standard techniques. For example, first, the number $s\equiv 1/r$ can be found numerically as the solution of the equation
$$Cs=\log\frac{1-p_1^s}{1-p_2^s}\,\log\frac{X^*_{([mp_2])}}{X^*_{([mp_3])}}-\log\frac{1-p_2^s}{1-p_3^s}\,\log\frac{X^*_{([mp_1])}}{X^*_{([mp_2])}},$$
where
$$C=\log\frac{X^*_{([mp_2])}}{X^*_{([mp_3])}}\,\log\frac{p_1}{p_2}-\log\frac{X^*_{([mp_1])}}{X^*_{([mp_2])}}\,\log\frac{p_2}{p_3}.$$
Having obtained the value of $s$, one can then find the values of $\gamma$ and $\lambda$ explicitly:
$$\gamma=\frac{s(\log p_1-\log p_2)+\log(1-p_2^s)-\log(1-p_1^s)}{\log X^*_{([mp_1])}-\log X^*_{([mp_2])}},\tag{19}$$
$$\lambda=\frac{p_1^s}{(1-p_1^s)\big(X^*_{([mp_1])}\big)^{\gamma}}.\tag{20}$$
As $p_1$, $p_2$ and $p_3$ one may take, say, $p_k=\frac{k}{4}$, $k=1,2,3$. Or it is possible to fix a $\tau\in(0,\frac12)$, set $p_1=\tau$, $p_2=\frac12$, $p_3=1-\tau$ and choose a $\tau$ that provides the best fit of the estimated model d.f. $F(x;r,\lambda,\gamma)$ to the empirical d.f.

If the parameter $r$ is known (for example, it is estimated beforehand while solving the problem of fitting the negative binomial distribution to the empirical data on the lengths of wet periods), then the parameters $\lambda$ and $\gamma$ can be estimated directly by formulas (19) and (20).

With known $r$, more accurate estimates of the parameters $\lambda$ and $\gamma$ can also be found by minimizing the discrepancy between the empirical and model d.f:s by the least squares technique. Namely, from the Glivenko theorem it follows that
$$\Big(\frac{\lambda(X^*_{(i)})^{\gamma}}{1+\lambda(X^*_{(i)})^{\gamma}}\Big)^r\approx\frac{i}{m}.\tag{21}$$
As this is so, for every $i=1,\ldots,m-1$ we have
$$\Big\{\frac{\lambda(X^*_{(i)})^{\gamma}}{1+\lambda(X^*_{(i)})^{\gamma}}\approx\Big(\frac{i}{m}\Big)^{1/r}\Big\}\iff\Big\{\frac{1}{1+\lambda(X^*_{(i)})^{\gamma}}\approx 1-\Big(\frac{i}{m}\Big)^{1/r}\Big\}\iff$$
$$\iff\Big\{\lambda(X^*_{(i)})^{\gamma}\approx\frac{i^{1/r}}{m^{1/r}-i^{1/r}}\Big\}\iff\Big\{\log\lambda+\gamma\log X^*_{(i)}\approx\log\frac{i^{1/r}}{m^{1/r}-i^{1/r}}\Big\}.$$
Therefore the estimates $\widehat{\lambda}$ and $\widehat{\gamma}$ of the parameters $\lambda$ and $\gamma$ can be found as the solution of the least squares problem
$$(\log\widehat{\lambda},\widehat{\gamma})=\arg\min_{\log\lambda,\,\gamma}\sum_{i=1}^{m-1}\big(\log\lambda+\gamma\log X^*_{(i)}-c_i\big)^2,\qquad c_i=\log\frac{i^{1/r}}{m^{1/r}-i^{1/r}}.$$
This solution can be written out explicitly so that finally we have
$$\widehat{\gamma}=\frac{(m-1)\sum_{i=1}^{m-1}c_i\log X^*_{(i)}-\sum_{i=1}^{m-1}\log X^*_{(i)}\sum_{i=1}^{m-1}c_i}{(m-1)\sum_{i=1}^{m-1}\big(\log X^*_{(i)}\big)^2-\big(\sum_{i=1}^{m-1}\log X^*_{(i)}\big)^2},\tag{22}$$
$$\widehat{\lambda}=\exp\Big\{\frac{1}{m-1}\Big(\sum_{i=1}^{m-1}c_i-\widehat{\gamma}\sum_{i=1}^{m-1}\log X^*_{(i)}\Big)\Big\}.\tag{23}$$

In this section we present the results of statistical estimation of the distribution of extremal daily precipitation within a wet period by the methods described in the preceding section. As the data, we use the observations of daily precipitation in Potsdam and Elista from 1950 to 2009.

First of all, notice that the Pareto distribution of daily precipitation volumes (see Figure 4) satisfies condition (9). Therefore, the asymptotic approximation provided by Theorem 2 can be applied to the statistical analysis of the real data.

The parameter $r$ is assumed known and its value coincides with that of the negative binomial distribution (see the Introduction): $r=0.847$ for Potsdam and $r=0.876$ for Elista. Figures 5 and 6 illustrate the approximation of the empirical d.f. by the model d.f. $F(x;r,\gamma,\lambda)$ with $\gamma$ and $\lambda$ estimated by the 'rough' formulas (19) and (20) as well as by the least squares formulas (22) and (23). To illustrate the asymptotic character of the approximation $F(x;r,\gamma,\lambda)$ we consider a sort of censoring in which the censoring threshold is the minimum length of the wet periods, which varies from 1 day (full sample) to 15 days. For each censoring threshold $h=\min_i m_i$ the sample is formed according to the rule (18). For each value of the threshold, the upper graph shows
• the empirical d.f. (dot-dash line);
• the d.f. $F(x;r,\gamma,\lambda)$ with $\gamma$ and $\lambda$ estimated by the 'rough' formulas (19) and (20) (dash line);
• the d.f. $F(x;r,\gamma,\lambda)$ with $\gamma$ and $\lambda$ estimated by the least squares formulas (22) and (23) (continuous line).
The lower graph shows the discrepancy (the uniform distance) between the empirical d.f. and the fitted model d.f. The types of lines correspond to those on the upper graph.

First of all, from Figures 5 and 6 it is seen that the asymptotic model $F(x;r,\gamma,\lambda)$ provides a very good approximation to the real statistical regularities in the behavior of extremal daily precipitation within a wet period. Moreover, the least squares formulas (22) and (23) yield more accurate estimates for the parameters of the model d.f.

It should also be noted that these figures illustrate the dependence of the accuracy of the approximation on the censoring threshold and the censored sample size. The approximation is satisfactory even if the censoring threshold $h$ is greater than or equal to three and the censored sample size is greater than 150. Moreover, the influence of the threshold $h$ on the accuracy is more noticeable than that of the sample size.

Fig. 5: Empirical and estimated model extreme precipitation distributions (Potsdam). Duration of wet period is no less than: a) 1; b) 2; c) 3; d) 4; e) 6; f) 8; g) 10; h) 15 days.

Fig. 6: Empirical and estimated model extreme precipitation distributions (Elista). Duration of wet period is no less than: a) 1; b) 2; c) 3; d) 4; e) 6; f) 7; g) 8; h) 10 days.

Acknowledgements

The research was partly supported by the RAS Presidium Program No. I.33P (project 0063-2016-0015) and by the Russian Foundation for Basic Research (projects 15-07-04040 and 17-07-00851).
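As a supplementary illustration of the least squares formulas (22) and (23) from the estimation section, the following is a minimal Python sketch for the case of known $r$. It is not the authors' implementation: the function name is hypothetical, the input is assumed to be the sample of strictly positive per-period maxima built by rule (18), and $c_i$ is computed as $\log\big(u_i/(1-u_i)\big)$ with $u_i=(i/m)^{1/r}$, which is algebraically the same as the expression in the text.

```python
import math


def least_squares_fit(maxima, r):
    """Estimate (gamma, lambda) by formulas (22) and (23) for known r.

    maxima: per-wet-period maximum daily precipitation volumes, all > 0.
    Returns the pair (gamma_hat, lambda_hat).
    """
    xs = sorted(maxima)
    m = len(xs)
    if m < 3 or xs[0] <= 0.0:
        raise ValueError("need at least three strictly positive maxima")

    # Only the order statistics with i = 1, ..., m - 1 are used:
    # for i = m the target c_i would be infinite.
    n = m - 1
    logs = [math.log(x) for x in xs[:n]]
    cs = []
    for i in range(1, m):
        u = (i / m) ** (1.0 / r)
        cs.append(math.log(u / (1.0 - u)))  # equals log(i^(1/r)/(m^(1/r)-i^(1/r)))

    sum_l = sum(logs)
    sum_c = sum(cs)
    sum_cl = sum(c * l for c, l in zip(cs, logs))
    sum_ll = sum(l * l for l in logs)

    gamma_hat = (n * sum_cl - sum_l * sum_c) / (n * sum_ll - sum_l ** 2)  # (22)
    lambda_hat = math.exp((sum_c - gamma_hat * sum_l) / n)                # (23)
    return gamma_hat, lambda_hat
```

This is ordinary linear regression of $c_i$ on $\log X^*_{(i)}$ with slope $\gamma$ and intercept $\log\lambda$, so on data lying exactly on the model quantiles it returns the parameters that generated them.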
References

[1] Zolina O., Simmer C., Belyaev K., Gulev S., Koltermann P. Changes in the duration of European wet and dry spells during the last 60 years // Journal of Climate, 2013. Vol. 26. P. 2022–2047.
[2] Korolev V. Yu., Gorshenin A. K., Gulev S. K., Belyaev K. P., Grusho A. A. Statistical Analysis of Precipitation Events // AIP Conference Proceedings, 2017. Also available on arXiv:1705.11055 [math.PR].
[3] Gorshenin A. K. On some mathematical and program methods for the construction of structure models of information flows // Informatics and Its Applications, 2017. Vol. 11. No. 1. P. 58–68.
[4] Kingman J. F. C. Poisson processes. – Oxford: Clarendon Press, 1993.
[5] Gnedenko B. V., Korolev V. Yu. Random Summation: Limit Theorems and Applications. – Boca Raton: CRC Press, 1996.
[6] Korolev V. Yu., Bening V. E., Shorgin S. Ya. Mathematical Foundations of Risk Theory. 2nd Edition. – Moscow: FIZMATLIT, 2011.
[7] Korolev V. Yu. Analogs of the Gleser theorem for negative binomial and generalized gamma distributions and some their applications // Informatics and Its Applications, 2017. Vol. 11. No. 3.
[8] Korolev V. Yu. Limit distributions for doubly stochastically rarefied renewal processes and their properties // Theory Probab. Appl., 2016. Vol. 61. No. 4. P. 753–773.
[9] Korolev V. Yu., Korchagin A. Yu., Zeifman A. I. Poisson theorem for the scheme of Bernoulli trials with random probability of success and a discrete analog of the Weibull distribution // Informatics and Its Applications, 2016. Vol. 10. No. 4. P. 11–20.
[10] Korolev V. Yu., Korchagin A. Yu., Zeifman A. I. On doubly stochastic rarefaction of renewal processes // AIP Conference Proceedings, 2017.
[11] Gleser L. J. The gamma distribution as a mixture of exponential distributions // American Statistician, 1989. Vol. 43. P. 115–117.
[12] Stacy E. W. A generalization of the gamma distribution // Annals of Mathematical Statistics, 1962. Vol. 33. P. 1187–1192.
[13] Zaks L. M., Korolev V. Yu. Generalized variance gamma distributions as limit laws for random sums // Informatics and Its Applications, 2013. Vol. 7. No. 1. P. 105–115.
[14] Zolotarev V. M. One-Dimensional Stable Distributions. – Providence, R.I.: American Mathematical Society, 1986.
[15] Johnson N. L., Kotz S., Balakrishnan N. Continuous Univariate Distributions, Vol. 2 (2nd Edition). – New York: Wiley, 1995.
[16] Korolev V. Yu. Product representations for random variables with the Weibull distributions and their applications // Journal of Mathematical Sciences, 2016. Vol. 218. No. 3. P. 298–313.
[17] Kotz S., Ostrovskii I. V. A mixture representation of the Linnik distribution // Statistics and Probability Letters, 1996. Vol. 26. P. 61–64.
[18] Korolev V. Yu., Zeifman A. I. A note on mixture representations for the Linnik and Mittag-Leffler distributions and their applications // Journal of Mathematical Sciences, 2017. Vol. 218. No. 3. P. 314–327.
[19] Korolev V. Yu., Zeifman A. I. Convergence of statistics constructed from samples with random sizes to the Linnik and Mittag-Leffler distributions and their generalizations // Journal of Korean Statistical Society. Available online 25 July 2016. Also available on arXiv:1602.02480v1 [math.PR].
[20] Greenwood M., Yule G. U. An inquiry into the nature of frequency-distributions of multiple happenings, etc. // J. Roy. Statist. Soc., 1920. Vol. 83. P. 255–279.
[21] Korolev V. Yu. On convergence of distributions of compound Cox processes to stable laws // Theory of Probability and its Applications, 1999. Vol. 43. No. 4. P. 644–650.
[22] Korolev V. Yu., Sokolov I. A. Mathematical Models of Inhomogeneous Flows of Extremal Events. – Moscow: Torus-Press, 2008.
[23]