J-measure of uncertainty (JMU) for specified probability distribution
Y. Schreiber*, A. Chudnovsky** * Kurt-Schumacher-Str. 11, 86165 Augsburg, Germany ** The University of Illinois at Chicago, Chicago IL, 60607, USA
Abstract.
In this paper it is shown that, for a given probability distribution or histogram, the Jaynes maximum entropy principle can be used to construct a J-measure of uncertainty (JMU) that attains its maximum on that distribution. Explicit formulas for the introduced JMU are obtained, and calculations of this new measure are shown for a number of distributions as examples. Using a two-dimensional random variable as an example, the application of the proposed method to JMU estimation in the multidimensional case is demonstrated. The information contained in the histogram of a random variable is compared with the information in the probability distribution obtained by fitting this histogram. In addition, the influence of an extra measurement of a physical quantity on the amount of information is studied.
1. Introduction.
This paper originated at the intersection of two ideas: generalized entropies and the Jaynes maximum entropy principle. However, the J-measure of uncertainty proposed in this paper and named after Jaynes is not a generalization of the classical Boltzmann-Gibbs (BG) entropy. In essence, the only starting point for the proposed method is the maximum entropy principle. A. Ya. Khinchin formulated four axioms that are necessary and sufficient conditions for the entropy to have the form of the Gibbs entropy [1]. It is shown in [2-4] that if one of these axioms fails, namely the axiom of additivity, a two-parameter family of entropies of a certain structure appears. In this and all other generalizations of BG entropy, the latter is the simplest particular case. It should be noted, however, that Khinchin formulated his axioms for discrete random variables. For continuous random variables the BG entropy is used rather by analogy; some problems then arise, so that the Kullback-Leibler divergence, also called relative entropy, is used more often. For this reason in particular, it is desirable to propose a measure of uncertainty that works equally well for discrete and continuous random variables, and for the one-dimensional as well as the multidimensional case.
*e-mail address: [email protected] **e-mail address: [email protected]
In the paper of C. Tsallis [5] the generalized entropy (Tsallis entropy) was introduced, which in recent years has become very popular. It was not the first paper in which another entropy was proposed [6]; the Renyi entropy, for example, appeared much earlier [7]. However, it was the Tsallis entropy that found especially many applications, in physics and beyond [8]. Non-extensive statistical mechanics appeared [9], which Tsallis himself considered more accurate to call non-additive statistical mechanics. The works on the generalization of BG entropy devoted to superstatistics should also be noted, in which the parameter corresponding to the temperature is itself considered random [10-13]. This generalization of classical statistical mechanics has also found many applications [14,15]. In the maximum entropy principle of Jaynes [16,17] it is proposed to construct an unknown probability distribution of a random variable from the condition of maximum BG entropy under a few natural constraints; the choice of constraints has been studied in various papers. Papers [18,19] contain tables showing the constraints required to obtain one or another commonly used distribution from the maximum entropy principle. For example, if we impose only the mean as a constraint, we get an exponential distribution; if we also specify the variance, we get a normal distribution. Some of the constraints required to obtain typical distributions are, from our point of view, awkward. Most importantly, it is not clear why one should use these various constraints when looking for different probability distributions. We can try to do the opposite: namely, use only the mean as a constraint, but at the same time give up the BG entropy as universal.
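The statement that the mean constraint alone singles out the exponential distribution under BG entropy can be illustrated numerically. The sketch below (ours, not from the paper) compares the differential BG entropy of three densities on [0, ∞) with the same mean m = 1; the exponential has the largest value, as Jaynes' principle predicts.

```python
# Illustration (ours): among densities with support [0, inf) and fixed mean
# m = 1, the exponential maximizes the BG (differential) entropy.
from scipy import stats

m = 1.0
h_exp = stats.expon(scale=m).entropy()                  # exponential, mean m
h_gamma = stats.gamma(a=2.0, scale=m / 2.0).entropy()   # gamma with the same mean
h_unif = stats.uniform(loc=0.0, scale=2.0 * m).entropy()  # uniform on [0, 2m], mean m

print(h_exp, h_gamma, h_unif)  # the exponential entropy is the largest
```

For the exponential with mean m the differential entropy is 1 + ln m, here 1.0; the competitors with the same mean stay strictly below it.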
As for the exponential distribution, the only one that can be obtained in this way by maximizing the BG entropy with the mean as the sole constraint, it seems that the BG entropy is universal for ideal gases simply because the exponential distribution of energies is universal for ideal gases. In a number of other areas, in physics and beyond, different variants of generalized entropy have been used successfully. The structure of the paper is as follows. In Sec. 2 we present the basic ideas of the method for determining the J-measure of uncertainty when the distribution of the random variable is known. In Sec. 3 the application of the proposed method to two-dimensional (bivariate) random variables is discussed; in Sec. 4 the application of the method to histograms is studied. Sec. 5 contains the conclusion; the Appendix shows the details of the calculations for the gamma distribution.
2. J-measure of uncertainty for different probability distributions.
If the probability distribution density is given, and only the new measure of uncertainty associated with this density is required, we can, in the spirit of paper [20], use a simple and direct way, namely apply the maximum entropy principle in the direction inverse to the standard one. Another possibility to generate a generalized entropy form is proposed in papers [21-23], where a connection is shown between generalized entropies and the steady-state solutions of nonlinear Fokker-Planck equations.

We look for the density g(p(x), x) of the J-measure of uncertainty (JMU) from the condition of the maximum of the functional h(p), taking into account the natural constraints, namely the normalization and the given mean value m:

h(p) = ∫_a^∞ g(p(x), x) dx − λ₀(∫_a^∞ p(t) dt − 1) − λ₁(∫_a^∞ t p(t) dt − m),   (1)

where m is the mean of the probability distribution and λ₀, λ₁ are Lagrange multipliers. Since the constraints are fulfilled practically automatically for the given distribution, we obtain

∂g/∂p = λ₀ + λ₁ x.   (2)

After integration over p, using ∫ t dp(t) = x p(x) − F(x), we obtain

g(p(x), x) = λ₀ p(x) + λ₁[x p(x) − F(x)] + C,   (3)

where F(x) is the cumulative distribution function. We note that the new JMU density g(p, x) is a functional of the probability density p(x) and of the cumulative distribution F(x), and may also depend on x directly. Integrating (3) and using ∫_a^x F(t) dt = x F(x) − ∫_a^x t p(t) dt, we obtain the JMU of the system S(x) as a function of the random variable x, and the full measure of uncertainty of the system S:

S(x) = ∫_a^x g(p(t), t) dt = λ₀ F(x) + λ₁[2∫_a^x t p(t) dt − x F(x)] + C(x − a) + C₃;  S = S(∞).   (4)

If the inverse function x(p) can be found in explicit form, the desired JMU can be obtained as a functional of p(x), i.e. in the standard form; for a multi-valued inverse function this must be done for each branch. Before we consider the application of the proposed approach to various probability distributions and formulate the conditions for determining the arbitrary constants, we make a remark about the concavity of the JMU. It is clear that the BG entropy density is concave for any probability distribution because

g = −p ln p;  g'(p) = −(ln p + 1);  g''(p) = −1/p < 0.   (5)

In the general case

g'(p) = λ₀ + λ₁ x;  g''(p) = λ₁ x'(p) = λ₁/p'(x),   (6)

i.e. for monotonically decreasing densities, such as the exponential (in this case the mode is x̃ = a), concavity takes place for arbitrary positive λ₁, because p'(x) < 0. Expressing the constants λ₀ and C through λ₁ from the conditions g(a) = g(∞) = S(a) = 0, and substituting them into (3)-(4), we obtain (for a = 0)

g(x) = λ₁{[x − 1/p(0)] p(x) + 1 − F(x)};
S(x) = λ₁{x[1 − F(x)] − F(x)/p(0) + 2∫₀^x t p(t) dt};   (7)
S = λ₁(2m − 1/p(0)).

For unimodal distributions, such as the normal or gamma, where the probability density has a maximum, i.e. a mode, and the derivative p'(x) changes its sign, it is impossible, choosing the same constant λ₁, to satisfy the concavity condition g'' ≤ 0 for all x, i.e. for all p. But this can be achieved by choosing λ₁ = λ sign(x − x̃), λ > 0, where x̃ is the mode of the distribution. Then we obtain

g''(p) = λ sign(x − x̃) x'(p) = λ sign(x − x̃)/p'(x) = −λ/|p'(x)| < 0.   (8)
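The concavity statement of Eq. (5), g''(p) = −1/p < 0 for the BG density, is easy to confirm numerically. The following finite-difference check is ours, not part of the paper's derivation:

```python
import numpy as np

# Finite-difference check (illustration only) that the BG entropy density
# g(p) = -p ln p is concave in p, with g''(p) = -1/p as in Eq. (5).
p = np.linspace(0.05, 1.0, 200)
g = -p * np.log(p)
dp = p[1] - p[0]
g2 = np.diff(g, 2) / dp**2      # second-derivative estimate on interior points

print(g2.max())                 # all estimates are negative (concavity)
```

The estimates also agree with the analytic value −1/p on the interior grid points.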
Now we obtain the general formulas for the JMU for unimodal probability densities. We use formulas (3)-(4) and (8), considering separately the cases x < x̃ (Case 1) and x > x̃ (Case 2), where x̃ is the mode of the distribution. Then we obtain

Case 1: x < x̃, λ₁ = −λ:
g(x) = λ₀⁻ p(x) − λ[x p(x) − F(x)] + C⁻,   (9)
S(x) = ∫_a^x g(t) dt = λ₀⁻ F(x) − λ[2∫_a^x t p(t) dt − x F(x)] + C⁻(x − a) + C₃⁻.

Case 2: x > x̃, λ₁ = λ:
g(x) = λ₀⁺ p(x) + λ[x p(x) − F(x)] + C⁺,   (10)
S(x) = λ₀⁺ F(x) + λ[2∫_a^x t p(t) dt − x F(x)] + C⁺(x − a) + C₃⁺,

where a is the lower limit of integration, for example a = −∞ for the normal distribution and a = 0 for the gamma distribution. Leaving the constant λ arbitrary, we formulate six conditions to express the constants λ₀⁻, λ₀⁺, C⁻, C⁺, C₃⁻, C₃⁺ through λ:

g(a) = 0;  g(x̃ − 0) = 0;  g(x̃ + 0) = 0;  g(∞) = 0;  S(a) = 0;  S(x̃ − 0) = S(x̃ + 0).   (11)

These requirements seem natural and are an extension of the analogous conditions for the simple classical case. The selection of the constants plays a decisive role and should be a subject of discussion. We assume further that p(a) = 0; the general case is treated similarly. Then one obtains

λ₀⁻ = λ[x̃ p(x̃) − F(x̃)]/p(x̃);  λ₀⁺ = −λ[x̃ p(x̃) − F(x̃) + 1]/p(x̃);  C⁻ = C₃⁻ = 0;  C⁺ = λ;
C₃⁺ = λ{4x̃ F(x̃) + F(x̃)[1 − 2F(x̃)]/p(x̃) − 4∫_a^{x̃} t p(t) dt − x̃ + a}.   (12)

After substitution of the constants we obtain

Case 1:
g(x) = λ{[x̃ − F(x̃)/p(x̃) − x] p(x) + F(x)},
S(x) = λ{[x̃ − F(x̃)/p(x̃) + x] F(x) − 2∫_a^x t p(t) dt};
Case 2:   (13)
g(x) = λ{[x − x̃ − (1 − F(x̃))/p(x̃)] p(x) + 1 − F(x)},
S(x) = λ{x − a − [x + x̃ + (1 − F(x̃))/p(x̃)] F(x) + 2∫_a^x t p(t) dt} + C₃⁺,

where C₃⁺ can be found from formula (12). The full JMU is then

S = S(∞) = λ{2x̃[2F(x̃) − 1] − [(2F(x̃) − 1)² + 1]/(2p(x̃)) + 2m − 4∫_a^{x̃} t p(t) dt}.   (14)

2.1 Exponential distribution. For the exponential distribution the proposed method takes the following form. The probability density function is p(x) = (1/μ)e^(−x/μ), the cumulative distribution function is F(x) = 1 − e^(−x/μ), x ≥ 0, with mean m = μ and variance μ². After substituting p(x) into (3), we obtain

g(x) = λ₀ e^(−x/μ)/μ + λ₁[(x/μ) e^(−x/μ) + e^(−x/μ) − 1] + C.   (15)

From the formula for p(x), x = −μ ln(μp), and then

g(p) = λ₀ p − λ₁[μ p ln(μp) − μ p + 1] + C.   (16)

If we take C = λ₁ = 1/μ and λ₀ = ln μ − 1, we obtain the density of the BG entropy, g(p) = −p ln p. But we aim to choose the constants in a way that is the same, or similar, for any given distribution law. Using (7) we obtain

g(x) = (λ₁/μ) x e^(−x/μ),   (17)
S(x) = λ₁ μ[1 − e^(−x/μ) − (x/μ) e^(−x/μ)],  S = S(∞) = λ₁ μ.   (18)

If we substitute e^(−x/μ) = μp, x = −μ ln(μp), we obtain

g(p) = −λ₁ μ p ln(μp).   (19)

We notice, first, that m = μ for the exponential distribution; second, that for all the arbitrariness of λ₁, one should not make the classical choice of constants used above to reproduce the BG density.
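The classical baseline referred to in Eq. (16) is the BG entropy of the exponential density, which equals 1 + ln μ. A quick quadrature check of this standard fact (ours, not from the paper):

```python
import numpy as np

# Sanity check (ours): the BG differential entropy of the exponential density
# p(x) = (1/mu) exp(-x/mu) equals 1 + ln(mu), here computed by quadrature.
mu = 2.0
x = np.linspace(1e-9, 60.0 * mu, 400000)    # the tail beyond 60*mu is negligible
p = np.exp(-x / mu) / mu
h = np.trapz(-p * np.log(p), x)

print(h, 1.0 + np.log(mu))
```

The quadrature value agrees with the closed form to better than 1e-4.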
The density of the BG entropy originates from the discrete case, where 0 ≤ p ≤ 1, and there it is logical to require g(0) = g(1) = 0; in the continuous case this is questionable, because here p is a normalized probability density, not a probability, and is not bounded by 1.

2.2 Normal distribution
Now we consider the normal distribution with probability density p(x) = (1/(σ√(2π))) exp(−(x − m)²/(2σ²)) and cumulative distribution function F(x) = (1/2)[1 + erf((x − m)/(σ√2))], where −∞ < x < ∞, m and σ² are the mean and the variance, and erf(x) = (2/√π)∫₀^x e^(−t²) dt is the error function. For the normal distribution the mode x̃ and the mean m coincide. To obtain the JMU density g(p) as a functional of the distribution density p(x), we substitute in (13) the inverse function

x(p) = m ∓ σ √(−2 ln(σ√(2π) p)),   (20)

where the upper sign corresponds to the case x < m and the lower sign to x > m; the sign of λ₁ follows (8), i.e. λ₁ = λ sign(x − m), λ > 0. To find the JMU it is more natural to use formulas (13)-(14) directly; substituting a = −∞, x̃ = m, F(x̃) = 1/2, p(x̃) = 1/(σ√(2π)), we obtain for x < m

g(x) = λ{(m − σ√(π/2) − x) p(x) + F(x)},   (21)
S(x) = λ{(x − m − σ√(π/2)) F(x) + 2σ² p(x)},

and for x > m

g(x) = λ{(x − m − σ√(π/2)) p(x) + 1 − F(x)},   (21a)
S(x) = λ{(x − m)[1 − F(x)] − σ√(π/2) F(x) − 2σ² p(x) + 2σ√(2/π)}.

The result is

S = S(∞) = λσ(4 − π)/√(2π) ≈ 0.342 λσ.   (22)

So for the normal distribution the proposed JMU is proportional to the standard deviation of the random variable, which is, we must admit, quite unexpected. It is well known that the classical BG entropy for a normal distribution is S_BG = (1/2) ln(2πeσ²), where σ² is the variance. The JMU (22) is shown in FIG. 1; for comparison the BG entropy is shown too. As expected, both grow with the standard deviation; however, in a neighborhood of zero they behave in opposite ways: S_BG → −∞, while the JMU → 0, which seems more natural, because as σ → 0 the normal density p(x) → δ(x − m), where δ(x) is the Dirac delta function, i.e. the random variable becomes deterministic.

FIG. 1. J-measure of uncertainty for the normal distribution (BG entropy shown for comparison).

2.3 Gamma distribution
The gamma distribution density and cumulative distribution function are

p(x) = b x^(α−1) e^(−βx),  F(x) = γ(α, βx)/Γ(α),   (23)

where α and β are the shape and rate parameters; the mean is m = α/β, the variance is α/β², and the mode is x̃ = (α − 1)/β; b = β^α/Γ(α) is the normalization constant, Γ(α) = ∫₀^∞ t^(α−1) e^(−t) dt is the gamma function, and γ(s, x) = ∫₀^x t^(s−1) e^(−t) dt is the lower incomplete gamma function. Detailed calculations are given in the Appendix; there, as an example, all calculations are made directly for the gamma distribution, without using the general formulas (13)-(14). Of course, the results below can also be obtained by substituting the gamma distribution into the general formulas.
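The standard gamma-distribution facts quoted here (mean α/β, mode (α−1)/β, BG entropy S_BG = α − ln β + ln Γ(α) + (1 − α)ψ(α)), as well as the Lambert W identity x = W(x)e^(W(x)) used below, can be cross-checked numerically; the sketch is ours, using scipy:

```python
import numpy as np
from scipy import special, stats

# Cross-check (illustrative) of standard gamma-distribution facts:
# mean alpha/beta, mode (alpha-1)/beta, and the BG entropy formula
# S_BG = alpha - ln(beta) + ln Gamma(alpha) + (1 - alpha) psi(alpha).
alpha, beta = 3.0, 2.0                      # shape and rate
dist = stats.gamma(a=alpha, scale=1.0 / beta)
s_formula = (alpha - np.log(beta) + special.gammaln(alpha)
             + (1.0 - alpha) * special.digamma(alpha))
print(dist.mean(), (alpha - 1.0) / beta, dist.entropy(), s_formula)

# Lambert W: by definition W(x) solves x = W(x) * exp(W(x)).
w = special.lambertw(2.0).real
print(w * np.exp(w))                        # recovers the argument 2.0
```

The density indeed peaks at (α − 1)/β, and scipy's entropy agrees with the digamma formula.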
The JMU densities and the JMU as functions of the random variable x for the two cases x < x̃ and x > x̃ follow from (13)-(14) with the substitutions a = 0, x̃ = (α − 1)/β and

F(x̃) = γ(α, α − 1)/Γ(α),  p(x̃) = β(α − 1)^(α−1) e^(1−α)/Γ(α),
∫₀^{x̃} t p(t) dt = [α γ(α, α − 1) − (α − 1)^α e^(1−α)]/(β Γ(α)).

Case 1, x < x̃:

g(x) = λ{[x̃ − F(x̃)/p(x̃) − x] p(x) + F(x)},   (25)
S(x) = λ{[x̃ − F(x̃)/p(x̃) + x] F(x) − 2∫₀^x t p(t) dt}.   (26)

Case 2, x > x̃: g(x) and S(x) are given by the second pair of formulas in (13). To obtain the JMU density as a functional of p(x) alone (formulas (24) and (27) for the two cases), the inverse function x(p) is required; from p = b x^(α−1) e^(−βx) it is expressed through the Lambert W function,

x(p) = −((α − 1)/β) W(−(β/(α − 1)) (p/b)^(1/(α−1))),

where W is the solution of the equation x = W(x) e^(W(x)). The JMU for the gamma distribution then follows from (14):

S = λ{2x̃[2F(x̃) − 1] − [(2F(x̃) − 1)² + 1]/(2p(x̃)) + 2α/β − 4∫₀^{x̃} t p(t) dt}.   (30)

We note that for α > 1 the gamma distribution has a maximum; for α < 1 it is a monotonically decreasing function, and for α = 1 it turns into the exponential distribution. FIG. 2 shows the JMU (30) for the gamma distribution at fixed mean m = α/β; the BG entropy is shown for comparison. For the gamma distribution the BG entropy is

S_BG = α − ln β + ln Γ(α) + (1 − α) ψ(α),

where ψ(x) = Γ'(x)/Γ(x) is the digamma function. As an asymptotic estimate shows, S_BG → −∞ for the gamma distribution when the standard deviation tends to zero, similarly to the normal case; at the same time the proposed Jaynes measure of uncertainty (30) behaves plausibly, namely S → 0.

Now consider the case α < 1. In this case p'(x) < 0 for all x, so with λ₁ = λ > 0 we have g'' = λ/p'(x) < 0, i.e. the concavity condition is satisfied (see formula (A7) in the Appendix). We substitute the density of the gamma distribution into (3)-(4) and obtain

g(x) = λ₀ p(x) + λ[x p(x) − F(x)] + C,   (31)
S(x) = λ₀ F(x) + λ[2∫₀^x t p(t) dt − x F(x)] + Cx + C₃.   (32)

Unfortunately, in this case it is impossible to express the constants λ₀, C, C₃ through λ uniquely, as before, from the conditions g(0) = g(∞) = S(0) = 0. The reason is that p(x) → ∞ as x → 0, so for the finiteness of the entropy density one must put λ₀ = 0. From the condition g(∞) = 0 follows C = λ, and from S(0) = 0 follows C₃ = 0. Then g(0) = C = λ, and after substitution of the constants we obtain

g(x) = λ{x p(x) + 1 − F(x)} = λ{(β^α/Γ(α)) x^α e^(−βx) + Γ(α, βx)/Γ(α)},   (33)
S(x) = λ{x[1 − F(x)] + 2∫₀^x t p(t) dt},   (34)

where Γ(α, βx) = Γ(α) − γ(α, βx) is the upper incomplete gamma function.

FIG. 2. J-measure of uncertainty for the gamma distribution.

The JMU for this case is

S = S(∞) = 2λα/β.   (35)

It is of interest to compare the limiting behavior of the JMU for α → 1 from both sides with that of the exponential distribution. For α → 1⁻, (35) gives S → 2λ/β; for α → 1⁺, (30) gives S → λ/β, where it is necessary to use the limits γ(α, α − 1) → 0, (α − 1)^(α−1) → 1 and (α − 1)^α e^(1−α) → 0. For the exponential distribution, (18) gives S = λ/β too, if we take equal means, i.e. μ = 1/β for α = 1. So for α → 1⁻, with the conditions chosen above for determining the arbitrary constants, the JMU has a gap. However, if we keep the requirements C = λ and C₃ = 0 but waive λ₀ = 0, we obtain from (32) for x → ∞ the value S = λ₀ + 2λα/β; choosing λ₀ = −λα/β, we obtain S → λ/β for α → 1⁻, i.e. the continuity of the entropy is provided on both sides; however, g(0), just like the distribution density p(0) for α < 1, is then infinite.

2.4 Tsallis distribution
As an example of the application of the proposed method in the discrete case, we consider the Tsallis distribution [26]

p_i = Z_q⁻¹ [1 − (1 − q) x_i]^(1/(1−q)),  Z_q = Σ_{i=1}^n [1 − (1 − q) x_i]^(1/(1−q)),   (36)

where x_i is a discrete random variable, i = 1, 2, …, n. To obtain the corresponding set of entropy values g_i we look for the maximum of the function

h(p_1, …, p_n) = Σ_i g_i(p_i) − λ₀(Σ_i p_i − 1) − λ₁(Σ_i p_i x_i − m),   (37)

where m is the mean of the random variable. From the maximum condition, g_i'(p_i) = λ₀ + λ₁ x_i, and after substituting x_i = [1 − (Z_q p_i)^(1−q)]/(1 − q) from (36), we obtain

g_i'(p_i) = λ₀ + λ₁ [1 − Z_q^(1−q) p_i^(1−q)]/(1 − q).   (38)

When choosing λ₁ > 0, g_i is concave, since g_i'' = −λ₁ Z_q^(1−q) p_i^(−q). After integration of (38) one obtains

g_i = (λ₀ + λ₁/(1 − q)) p_i − λ₁ Z_q^(1−q) p_i^(2−q)/((1 − q)(2 − q)) + C.
From the conditions g_i(0) = g_i(1) = 0, standard for the discrete case, it follows that C = 0 and λ₀ + λ₁/(1 − q) = λ₁ Z_q^(1−q)/((1 − q)(2 − q)); after substitution of λ₀ into g_i we obtain

g_i = [λ₁ Z_q^(1−q)/((1 − q)(2 − q))] (p_i − p_i^(2−q)).   (39)

Then the JMU is the sum

S = Σ_{i=1}^n g_i = k_q (1 − Σ_{i=1}^n p_i^(2−q))/(1 − q),  k_q = λ₁ Z_q^(1−q)/(2 − q),   (40)

which is exactly the Tsallis entropy (with entropic index 2 − q).
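For reference, the standard discrete Tsallis entropy S_q = k(1 − Σ p_i^q)/(q − 1) reduces to the Shannon/BG entropy −Σ p_i ln p_i as q → 1. A quick numerical check of this limit (ours, with k = 1):

```python
import numpy as np

# Discrete Tsallis entropy S_q = (1 - sum p_i^q) / (q - 1), with k = 1;
# for q -> 1 it reduces to the Shannon/BG entropy -sum p_i ln p_i.
def tsallis(p, q):
    p = np.asarray(p, dtype=float)
    if abs(q - 1.0) < 1e-12:                       # q = 1: Shannon limit
        return float(-(p * np.log(p)).sum())
    return float((1.0 - (p ** q).sum()) / (q - 1.0))

p = np.array([0.5, 0.3, 0.2])
shannon = float(-(p * np.log(p)).sum())
print(tsallis(p, 2.0), tsallis(p, 1.0 + 1e-8), shannon)
```

For q = 2 the value is simply 1 − Σ p_i² = 0.62 here, and near q = 1 the formula approaches the Shannon value.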
3. J-measure of uncertainty for two-dimensional random variables
Here we show the generalization of the proposed method to the multidimensional case. It seems sufficient to consider a two-dimensional random variable; even in this simplest, but not at all simple, case a number of computational difficulties arise. After obtaining the general formulas we consider bivariate exponential and bivariate normal distributions as examples. For a given bivariate distribution density of the random variables, we look for the JMU as the integral of its density over both variables:

S = ∫_X ∫_Y g(p(x, y), x, y) dx dy.   (41)

Usually X = (a_x, ∞), Y = (a_y, ∞), where a = 0 or a = −∞; besides, three constraints are given: the normalization and the known means of the random variables x, y:

∫_X ∫_Y p(x, y) dx dy = 1;  ∫_X ∫_Y x p(x, y) dx dy = m_x;  ∫_X ∫_Y y p(x, y) dx dy = m_y.   (42)

Thus, by the method of Lagrange multipliers, we look for the maximum of the functional

h(p) = ∫_X ∫_Y g(p(x, y), x, y) dx dy − λ₀(∫_X ∫_Y p dx dy − 1) − λ₁(∫_X ∫_Y x p dx dy − m_x) − λ₂(∫_X ∫_Y y p dx dy − m_y).   (43)

Since the density distribution is given, we look for g(p(x, y), x, y) from the condition

∂h/∂p = ∂g/∂p − λ₀ − λ₁ x − λ₂ y = 0.   (44)

Differentiation with respect to the Lagrange multipliers leads to equations that are satisfied automatically for the given probability density p(x, y):

∂h/∂λ₀ = ∫_X ∫_Y p dx dy − 1 = 0;  ∂h/∂λ₁ = ∫_X ∫_Y x p dx dy − m_x = 0;  ∂h/∂λ₂ = ∫_X ∫_Y y p dx dy − m_y = 0.   (45)

We formulate additional conditions for the constants appearing in (44) and later; the requirement of concavity ∂²g/∂p² ≤ 0 must be added. As λ₁ and λ₂ are arbitrary, it is possible to take λ₁ = λ₂ and introduce a new variable z = x + y, so the problem reduces to the case of one variable discussed above. The generalization to the case of an arbitrary number of random variables is obvious. From equation (44) we then obtain

∂g/∂p = λ₀ + λ₁ z.   (46)

Thus, in principle, the problem is solved.
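The reduction to z = x + y means that only the distribution of the sum is needed. As a numerical illustration (ours), the density of the sum of two independent exponentials, the example treated next, can be checked by direct convolution of the marginal densities:

```python
import numpy as np

# Check (ours) that the density of z = x + y for independent exponentials with
# rates l1 != l2 is l1*l2/(l2 - l1) * (exp(-l1 z) - exp(-l2 z)), by comparing
# against a rectangle-rule numerical convolution (accurate to O(dz) at the edge).
l1, l2 = 1.0, 2.0
z = np.linspace(0.0, 30.0, 6001)
dz = z[1] - z[0]
px = l1 * np.exp(-l1 * z)
py = l2 * np.exp(-l2 * z)
pz_conv = np.convolve(px, py)[: z.size] * dz
pz_formula = l1 * l2 / (l2 - l1) * (np.exp(-l1 * z) - np.exp(-l2 * z))
err = np.abs(pz_conv - pz_formula).max()
print(err)
```

The closed form also integrates to 1 on this grid, confirming it is a proper density.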
Consider, as the simplest example, a two-dimensional exponential distribution of independent random variables, whose density is in this case the product of the one-dimensional densities; the density and cumulative distribution are

p(x, y) = β₁β₂ e^(−β₁x − β₂y),  F(x, y) = (1 − e^(−β₁x))(1 − e^(−β₂y)).   (47)

It is easy to obtain the cumulative distribution of the sum of the random variables, z = x + y:

F(z) = ∫₀^z dx ∫₀^{z−x} p(x, y) dy = 1 + [β₁ e^(−β₂z) − β₂ e^(−β₁z)]/(β₂ − β₁),   (48)

and the density is

p(z) = F'(z) = β₁β₂ (e^(−β₁z) − e^(−β₂z))/(β₂ − β₁).   (49)

The case β₁ = β₂ = β is interesting; going to the limit β₂ → β₁ = β, one obtains

p(z) = β² z e^(−βz),  F(z) = 1 − (1 + βz) e^(−βz).   (50)

A simple analysis shows that the density (49) has a maximum at z̃ = ln(β₂/β₁)/(β₂ − β₁); this maximum is equal to
p(z̃) = β₁ (β₁/β₂)^(β₁/(β₂−β₁)). The cumulative distribution at this point is

F(z̃) = 1 − (1 + β₁/β₂)(β₁/β₂)^(β₁/(β₂−β₁)),

and the mean is m_z = 1/β₁ + 1/β₂. Besides these functions, to find the JMU by formula (14) we need the integral

∫₀^z t p(t) dt = [β₁β₂/(β₂ − β₁)] {[1 − (1 + β₁z) e^(−β₁z)]/β₁² − [1 − (1 + β₂z) e^(−β₂z)]/β₂²}.

Further, the entropy is calculated by formula (14); we give here, as a less bulky example, the result for the particular case β₁ = β₂ = β. In this case z̃ = 1/β, p(z̃) = β e⁻¹, F(z̃) = 1 − 2e⁻¹, m_z = 2/β, ∫₀^{z̃} t p(t) dt = (2 − 5e⁻¹)/β, and after substitution of these values in (14) we obtain

S = (λ/β)(2 + 4e⁻¹ − e).   (51)

To analyze the effect of correlation on the entropy for a bivariate exponential distribution, we choose one of the distributions proposed by Gumbel [24]; among the many bivariate exponential distributions [25] we choose it only for its simplicity. This density (with unit rates) is

p(x, y) = e^(−x−y) {1 + α[2e^(−x) − 1][2e^(−y) − 1]},   (52)

and the cumulative distribution function is

F(x, y) = (1 − e^(−x))(1 − e^(−y))(1 + α e^(−x−y)).   (53)

The means, variances and correlation coefficient are, respectively,

m_x = m_y = 1;  σ_x = σ_y = 1;  ρ = α/4.   (54)

As before, we find F(z) and p(z), where z = x + y, and obtain

F(z) = 1 − (1 + z) e^(−z) + α[(3 − z) e^(−z) − (3 + 2z) e^(−2z)],   (55)
p(z) = z e^(−z) + α[(z − 4) e^(−z) + 4(1 + z) e^(−2z)],   (56)
m_z = m_x + m_y = 2.   (57)

From the condition p'(z) = 0 follows the equation

(1 − z) e^(−z) + α[(5 − z) e^(−z) − 4(1 + 2z) e^(−2z)] = 0

for determining z̃ = argmax p(z). The solution of this equation is well approximated by a fourth-degree polynomial in α. Additionally we need the integral
∫₀^{z̃} t p(t) dt = 2 − (z̃² + 2z̃ + 2) e^(−z̃) + α[(2 + 2z̃ − z̃²) e^(−z̃) − 2(1 + z̃)² e^(−2z̃)].   (58)

After the substitution of this integral, as well as z̃, m_z, F(z̃) and p(z̃), in (14), the dependence of S on the correlation coefficient can be obtained; it is shown in FIG. 3. Note that the JMU in the absence of correlation, i.e. S(0), coincides, as expected, with the result (51) at β₁ = β₂ = 1. We make here a short remark about the bivariate normal distribution with density

p(x, y) = (1/(2πσ_xσ_y√(1 − ρ²))) exp{−[(x − m_x)²/σ_x² − 2ρ(x − m_x)(y − m_y)/(σ_xσ_y) + (y − m_y)²/σ_y²]/(2(1 − ρ²))}.   (59)

As is well known, the sum of two normally distributed random variables is a normally distributed random variable, with mean equal to the sum of the means and standard deviation

σ_{x+y} = √(σ_x² + σ_y² + 2ρσ_xσ_y),

where ρ is the correlation coefficient. So we can directly use formula (22), substituting into it the appropriate value of the standard deviation.
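The variance formula for the sum of correlated normal variables used above is a standard fact; a Monte-Carlo cross-check (ours):

```python
import numpy as np

# Monte-Carlo check (ours): for a bivariate normal, x + y is normal with
# variance sx^2 + sy^2 + 2*r*sx*sy, where r is the correlation coefficient.
rng = np.random.default_rng(0)
sx, sy, r = 1.0, 2.0, 0.5
cov = [[sx**2, r * sx * sy], [r * sx * sy, sy**2]]
xy = rng.multivariate_normal([0.0, 0.0], cov, size=400000)
var_sum = xy.sum(axis=1).var()

print(var_sum, sx**2 + sy**2 + 2 * r * sx * sy)   # both close to 7.0
```

With these parameters the predicted variance is 1 + 4 + 2 = 7, and the sample variance agrees to Monte-Carlo accuracy.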
4. J-measure of uncertainty for a histogram.
FIG. 3. J-measure of uncertainty for the bivariate exponential distribution.

It was shown above that, for a given probability distribution and constraints, the Jaynes maximum entropy principle allows one to obtain a family of measures of uncertainty; in order to choose one specific measure of this family, additional conditions must be formulated, for example the proposed ones. Naturally, the proposed method for determining the J-measure of uncertainty (JMU) can be applied not only to a probability distribution, which is usually obtained as a result of fitting a histogram, but to the histogram directly. The algorithm depends on whether the histogram decreases monotonically with increasing values of the random variable or is unimodal. Let a sample of values of a random variable, obtained as a result of measurements, be the basis of the constructed histogram. For simplicity, consider a random variable distributed in the interval [0, 1]; the transition to an interval [a, b] is obvious. The interval is divided into n equal bins of length Δ = 1/n, so that x_i = iΔ, i = 1, …, n, and P_1, P_2, …, P_n are the frequencies/probabilities, P_i = P(x_{i−1} ≤ x < x_i). This histogram describes the case P_1 ≥ P_2 ≥ … ≥ P_n; often a histogram of this type can be approximated by an exponential distribution. The other case will be studied below. We transform the expression (3) for the proposed JMU density g(x) to a form suitable for use in the case of a known histogram:

g(x) = λ₀ p(x) + λ₁[x p(x) − F(x)] + C.   (60)

The density p(x) and cumulative distribution F(x) for a histogram of this type, when x_{i−1} ≤ x < x_i, are

p(x) = P_i/Δ;  F(x) = Σ_{k=1}^{i−1} P_k + P_i (x − x_{i−1})/Δ.   (61)

After substitution of (61) into (60) and transformations we obtain

g_i = λ₀ P_i − λ₁(G_i − iΔ P_i) + C,   (62)

where G_i = Σ_{k=1}^{i} P_k is the cumulative histogram of the histogram P_i. Formula (62) is the analogue of formula (60) for the histogram. As before, it defines a family of JMU densities; from the concavity requirement for g_i, the condition λ₁ = λ > 0 must be satisfied.
Up to the factor λ, the constants λ₀ and C are determined from the conditions g_1 = g_n = 0, which give

λ₀ = λ[(G_n − nΔ P_n) − (G_1 − Δ P_1)]/(P_1 − P_n),  C = −λ₀ P_1 + λ(G_1 − Δ P_1).   (63)

We substitute (63) into (62) and obtain the explicit bin values g_i (64); the full JMU of the histogram is then

S_n = Σ_{i=1}^n g_i.   (65)

Consider an example. Let a histogram with n = 10 bins be given, with the frequencies/probabilities of Table 1.

Table 1. Histogram and cumulative histogram (frequencies P_i and cumulative sums G_i for i = 1, …, 10).

Substituting the values of n, λ and P_i in (65), we obtain the value S_n. Now we consider how the approximation of the given histogram by an exponential distribution on [0, 1] affects the JMU. For the truncated exponential distribution on [0, 1], the density and cumulative distribution are

p(x) = β e^(−βx)/(1 − e^(−β));  F(x) = (1 − e^(−βx))/(1 − e^(−β)).   (66)

If we use the constants λ₀ and C found for the histogram, the JMU for this distribution can be obtained by integrating (60):

S = ∫₀^1 g(x) dx.   (67)

It can be shown that a truncated exponential distribution with a suitable β approximates the histogram of Table 1 with sufficient accuracy. After substitution of β, λ, P_1 and P_n in (66)-(67) we obtain the value S. From the comparison of S and S_n it is clear that, despite the very accurate approximation, the JMU has decreased as a result of the approximation, i.e. we have added information from outside. It is, of course, possible to choose β from the condition S = S_n, but the accuracy of the approximation is then lost; it is impossible to satisfy both requirements simultaneously. It is of interest to present, for comparison, the results of calculating the BG entropy for the histogram of Table 1 and for the truncated exponential distribution (66):

S_BG^n = −Σ_{i=1}^n P_i ln P_i;  S_BG = −∫₀^1 p(x) ln p(x) dx.   (68)

It is easy to see that, while the proposed J-measures of uncertainty for the histogram and for the approximating probability distribution are almost equal, the classical BG entropies for them differ greatly, including in sign.

Now we consider the following problem. Let Table 1 present the results of measuring some physical quantity on the interval [0, 1]; let the total number of measurements be m = 100, and denote by m_i the number of results falling in bin i. Since the m_i must be integers, Table 1 is slightly modified (Table 2). Substituting the values of n, λ and P_i in formula (65), the JMU S can be obtained.

Table 2. Modified histogram and cumulative histogram.
i:    1     2     3     4     5     6     7     8     9     10
m_i:  15    14    12    11    10    9     8     8     7     6
P_i:  0.15  0.14  0.12  0.11  0.10  0.09  0.08  0.08  0.07  0.06
G_i:  0.15  0.29  0.41  0.52  0.62  0.71  0.79  0.87  0.94  1.00

To estimate the effect of one additional measurement, we replace successively m_k → m_k' = m_k + 1, k = 1, …, n, leaving the values in all other bins unchanged, find the corresponding S'_{n,m,k}, and finally average over k: S'_{n,m} = (1/n) Σ_{k=1}^n S'_{n,m,k}. After these transformations we obtain the averaged value S'. Thus, for the data of this example, an additional measurement reduces the JMU, and accordingly increases the information, by about 1.6%. In each individual case, before averaging over the bins, the JMU, and respectively the information, can either increase or decrease. It is natural to assume that the larger the initial number of measurements m, the smaller the impact of an additional measurement on the entropy, and therefore on the information; at the same time, when the distribution of frequencies over the bins remains in the same proportion, the initial JMU value does not depend on m. For a larger m, for example, the corresponding change is approximately 0.48%.

Similarly to the previous case, consider a unimodal histogram, as for the normal or gamma distribution; Table 3 shows the corresponding frequencies and cumulative frequencies. The formulas for i ≤ k and i > k (x_k being the mode bin), similar to (9)-(10), are

g_i = λ₀⁻ P_i − λ(iΔ P_i − G_i) + C⁻,  i = 1, 2, …, k;   (69)
g_i = λ₀⁺ P_i + λ(iΔ P_i − G_i) + C⁺,  i = k + 1, …, n.

Table 3. Unimodal histogram and cumulative histogram (frequencies P_i and cumulative sums G_i for i = 1, …, 10).

From the conditions g_1 = g_k(−0) = g_k(+0) = g_n = 0 the constants λ₀⁻, λ₀⁺, C⁻, C⁺ can be found through λ, analogously to (63) (70). After summing (69) over i we obtain

S_n = Σ_{i=1}^k g_i⁻ + Σ_{i=k+1}^n g_i⁺.   (71)

After substitution of all the necessary values in (71), the JMU S_n can be obtained. Now we compare, as before, the obtained JMU S_n with the JMU found after approximation of the histogram, this time by a truncated normal distribution on [0, 1]:

p(x) = A exp(−(x − m)²/(2σ²));  F(x) = A σ √(π/2) [erf((x − m)/(σ√2)) + erf(m/(σ√2))],   (72)

where A = {σ √(π/2) [erf((1 − m)/(σ√2)) + erf(m/(σ√2))]}⁻¹. The fitting determines m, σ and A. Calculating the JMU for the truncated normal distribution, using the constants λ₀⁻, λ₀⁺, C⁻, C⁺ obtained for the histogram and the formulas (9)-(10), leads to the result S_norm = 0.018. Thus, from the comparison it is clear that, as in the case discussed above, the approximation uses additional information that is not contained in the original histogram. Now, as was done for the monotonically decreasing histogram, let us estimate how the JMU, i.e. the amount of information, changes with an additional measurement. The original number of measurements is again m = 100; Table 4 shows the values of m_i, P_i and G_i. To estimate the JMU from the data of this table, we define, as above, the constants λ₀⁻, λ₀⁺, C⁻, C⁺ and then use formula (71), obtaining the value S. To evaluate the impact of an additional measurement, we repeat the procedure described above, i.e. for m = 100 we add one additional result to each bin in turn, calculate the JMU in each case, and then find the average. Exactly as before, in each individual case, before averaging over the bins, the JMU, and respectively the information, can either increase or decrease.
As a result of this procedure we obtain the averaged value S'. The JMU decreased, i.e. the information increased, by about 1.96%. So on average an additional measurement leads to a small increase of information.

Table 4. Modified unimodal histogram and cumulative histogram.
i:    1     2     3     4     5     6     7     8     9     10
m_i:  3     5     8     12    17    22    15    9     6     3
P_i:  0.03  0.05  0.08  0.12  0.17  0.22  0.15  0.09  0.06  0.03
G_i:  0.03  0.08  0.16  0.28  0.45  0.67  0.82  0.91  0.97  1.00

These estimates are illustrative and do not claim to be quantitatively accurate; they are intended to show that the proposed measure of uncertainty leads to plausible results.
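The sign contrast in the BG comparison of Eq. (68) — a positive discrete entropy for the histogram against a negative differential entropy for the fitted density — can be reproduced directly. The sketch below (ours) uses the frequencies of Table 2 and an assumed fitted rate β = 1 for the truncated exponential on [0, 1]:

```python
import numpy as np

# Illustration (ours) of the comparison in Eq. (68): the discrete BG entropy of
# a 10-bin histogram (Table 2 frequencies) versus the differential BG entropy
# of a truncated exponential on [0, 1] with an assumed fitted rate beta = 1.
P = np.array([0.15, 0.14, 0.12, 0.11, 0.10, 0.09, 0.08, 0.08, 0.07, 0.06])
s_hist = -(P * np.log(P)).sum()                  # discrete entropy, positive

beta = 1.0                                       # hypothetical fitted rate
x = np.linspace(1e-6, 1.0, 200001)
p = beta * np.exp(-beta * x) / (1.0 - np.exp(-beta))
s_cont = np.trapz(-p * np.log(p), x)             # differential entropy, negative

print(s_hist, s_cont)
```

The discrete and differential BG entropies are not directly comparable quantities, which is exactly the point the text makes about the sign difference.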
5. Conclusion
For the case when the distribution of a random variable is known from experiments or simulations, this paper considers, instead of the classical BG entropy, an alternative J-measure of uncertainty (JMU). It is obtained explicitly with the help of the Jaynes maximum entropy principle, used in the direction opposite to the generally accepted one. In all cases, i.e. for any probability distribution, only the normalization and the given mathematical expectation (mean) are used as constraints. In order to use formulas (7) or (13)-(14), which make it possible to find the density of the proposed JMU, it is sufficient to know the distribution of the random variable. An important advantage of using this measure of uncertainty, in comparison with the distribution law itself, is its direct information content, which is ensured by the maximum entropy principle. On the other hand, unlike the classical Boltzmann-Gibbs entropy, which is based on certain assumptions about the properties of the random variable, explicitly formulated or accepted by default, the JMU is based only on the considerations that underlie the maximum entropy principle and on natural additional conditions for defining the arbitrary constants. The application of the proposed method is shown for the exponential, normal and gamma distributions, and for a bivariate random variable as an example of the multidimensional case. The information contained in the histogram of a random variable is compared with the information in the probability distribution obtained by fitting this histogram; moreover, the influence of an additional measurement of a physical quantity on the amount of information is studied.
References
1. Khinchin A. Ya. Mathematical Foundations of Information Theory. Dover, NY, 1957.
2. Thurner S., Hanel R. What do generalized entropies look like? An axiomatic approach for complex, non-ergodic systems. In: Concepts and Recent Advances in Generalized Information Measures and Statistics, A.M. Kowalski, R.D. Rossignoli, E.M.F. Curado (eds.), 81-99, Bentham Science Publ., 2013.
3. Hanel R., Thurner S. When do generalized entropies apply? How phase space volume determines entropy. Europhys. Lett. (EPL), 96, 50003, 2011.
4. Thurner S., Hanel R. The entropy of non-ergodic complex systems – a derivation from first principles. Int. J. Modern Phys.: Conf. Series, 16, 105-115, 2012.
5. Tsallis C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys., 52, 479-487, 1988.
6. Beck C. Generalised information and entropy measures in physics. Contemp. Phys., 50, 495-510, 2009.
7. Renyi A. On measures of information and entropy. Proc. Fourth Berkeley Symp. Math. Stat. Prob., 547-561, 1960.
8. Tsallis C. Thermodynamics and statistical mechanics for complex systems – foundations and applications. Acta Phys. Pol., B46, 1089, 2015.
9. Tsallis C. Introduction to Nonextensive Statistical Mechanics – Approaching a Complex World. Springer, NY, 2009.
10. Beck C., Cohen E.G.D. Superstatistics. Physica A, 322, 267-275, 2003.
11. Beck C. Generalized statistical mechanics for superstatistical systems. Phil. Trans. R. Soc. A, 369, 453-465, 2011.
12. Van der Straeten E., Beck C. Superstatistical distributions from a maximum entropy principle. Phys. Rev. E, 78, 051101, 2008.
13. Beck C. Superstatistics: theoretical concepts and physical applications. arXiv:0705.3832, 2007.
14. Chavanis P.-H. Coarse-grained distributions and superstatistics. Physica A, 359, 177-212, 2006.
15. Beck C. Recent developments in superstatistics. Braz. J. Phys., 39, 357-363, 2009.
16. Jaynes E.T. Information theory and statistical mechanics. Phys. Rev., 106, 620-630, 1957.
17. Jaynes E.T. Information theory and statistical mechanics. II. Phys. Rev., 108, 171-190, 1957.
18. Singh V.P., Rajagopal A.K., Singh K. Derivation of some frequency distributions using the principle of maximum entropy. Adv. Water Res., 9, 91-106, 1986.
19. Park S.Y., Bera A.K. Maximum entropy autoregressive conditional heteroskedasticity model. J. Econometrics, 150, 219-230, 2009.
20. Abe S. Generalized entropy optimized by an arbitrary distribution. J. Phys. A, 36(33), 8733, 2003.
21. Schwämmle V., Curado E.M.F., Nobre F.D. A general nonlinear Fokker-Planck equation and its associated entropy. Europ. Phys. J. B, 58, 159-165, 2007.
22. Schwämmle V., Nobre F.D., Curado E.M.F. Consequences of the H-theorem from nonlinear Fokker-Planck equations. Phys. Rev. E, 76, 041123, 2007.
23. Asgarani S. Families of Fokker-Planck equations and the associated entropic forms for a distinct steady-state probability distribution with a known external force field. Phys. Rev. E, 91, 022104, 2015.
24. Gumbel E.J. Bivariate exponential distributions. J. Americ. Stat. Assoc., 55, 698-707, 1960.
25. Balakrishnan N., Lai Chin-Diew. Continuous Bivariate Distributions. Springer, Dordrecht, 2009.
Appendix. J-measure of uncertainty for gamma distribution.
The gamma distribution density is

  p(x) = b x^{\alpha-1} e^{-\beta x},  (A1)

where \alpha and \beta are the shape and rate parameters. In this appendix we consider the case \alpha > 1. The mean is m = \alpha/\beta, the variance is \sigma^2 = \alpha/\beta^2, the mode is \tilde{x} = (\alpha-1)/\beta, b = \beta^{\alpha}/\Gamma(\alpha) is the normalization constant, and \Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\,dt is the gamma function. In accordance with the maximum entropy principle, the maximum of the following functional must be found:

  h(p) = \int_0^{\infty} g(p(x))\,dx - \lambda_0 \Big( \int_0^{\infty} p\,dx - 1 \Big) - \lambda_1 \Big( \int_0^{\infty} x\,p\,dx - m \Big),  (A2)

where g(p) is the JMU density and \lambda_0, \lambda_1 = \lambda\,\mathrm{sign}(x - \tilde{x}) are the Lagrange multipliers, \lambda > 0. The condition of the maximum of h(p) is

  g'(p) = \lambda_0 + \lambda_1 x.  (A3)

To determine g(p) it is necessary to find the inverse function x(p); we consider the case \alpha > 1, since only in this case the distribution is unimodal. We introduce the new variables

  z = -\frac{\beta}{\alpha-1} \Big( \frac{p}{b} \Big)^{1/(\alpha-1)}, \qquad W = -\frac{\beta x}{\alpha-1}.  (A4)

Then for the determination of x(p) from (A1) we obtain the equation

  z = W e^{W}, \qquad W = W(z(p)).  (A5)

The solution of (A5) is the Lambert W function. Note that the Lambert W function is two-valued and has two branches, W_0 and W_{-1}: for W \ge -1, W = W_0(z); for W \le -1, W = W_{-1}(z). This should be expected, since each value p(x) occurs for two values of x, i.e. the inverse function is two-valued. It is easy to show that the maximum of the function p(x), i.e. the mode of the distribution, is at the point x = \tilde{x} = (\alpha-1)/\beta and equals p_{\max} = b\,\tilde{x}^{\alpha-1} e^{-(\alpha-1)}; then z(p_{\max}) = -e^{-1}. After substituting x = -\frac{\alpha-1}{\beta} W(z(p)) into (A3) we obtain

  g'(p) = \lambda_0 - \lambda_1 \frac{\alpha-1}{\beta}\, W(z(p)).  (A6)

Now we consider the question of concavity and then continue to look for g(p). Differentiating (A6) gives g''(p) = -\lambda_1 \frac{\alpha-1}{\beta}\, W'_p. From (A4) we find z'(p) = \frac{z(p)}{(\alpha-1)\,p}, and differentiating the equation z = W e^{W} in (A5) gives dz = e^{W}(1+W)\,dW with e^{W} = z/W, so that

  W'_z = \frac{W}{z\,(1+W)}, \qquad W'_p = W'_z\, z'(p) = \frac{W}{(\alpha-1)\,p\,(1+W)}.

Then for g''(p) we obtain

  g''(p) = -\frac{\lambda_1 W}{\beta\, p\,(1+W)}.  (A7)

In order to have g'' \le 0, one must select \lambda_1 = -\lambda\,\mathrm{sign}(1+W) = \lambda\,\mathrm{sign}(x - \tilde{x}), \lambda > 0. Now, to find g(p), we integrate (A6) over p and obtain

  g(p) = \lambda_0 p + \lambda\, \frac{\alpha-1}{\beta}\, \mathrm{sign}(1+W) \int_0^{p} W(z(p'))\,dp' + C.  (A8)

With the substitution t = -(\alpha-1)\,W(z(p')) = \beta x, so that p' = \frac{\beta}{\Gamma(\alpha)}\, t^{\alpha-1} e^{-t}, the integral in (A8) evaluates to

  \int W\,dp' = -\frac{\beta}{(\alpha-1)\,\Gamma(\alpha)} \big[ T^{\alpha} e^{-T} - \gamma(\alpha, T) \big], \qquad T = -(\alpha-1)\,W(z(p)),

where \gamma(\alpha, x) = \int_0^{x} t^{\alpha-1} e^{-t}\,dt is the lower incomplete gamma function. Then the required functional takes the form

  g(p) = \lambda_0 p + \frac{\lambda}{\Gamma(\alpha)}\, \mathrm{sign}(x - \tilde{x}) \big[ T^{\alpha} e^{-T} - \gamma(\alpha, T) \big] + C.  (A9)

This functional (A9) is the JMU density appropriate to the gamma distribution. Substituting p(x) from (A1) and using T = \beta x, so that T^{\alpha} e^{-T} = \Gamma(\alpha)\, x\, p(x), we obtain the density of the proposed J-measure as a function of the random value x:

  g(x) = b x^{\alpha-1} e^{-\beta x} \big[ \lambda_0 + \lambda x\,\mathrm{sign}(x - \tilde{x}) \big] - \frac{\lambda}{\Gamma(\alpha)}\, \mathrm{sign}(x - \tilde{x})\, \gamma(\alpha, \beta x) + C.  (A10)

The JMU S(x) can be obtained by integrating the JMU density g(x), and the total J-measure of uncertainty is the limit S = S(x \to \infty). However, before determining S(x) and S, it is necessary to determine the constants, which are still arbitrary, separately for the two cases: x \le \tilde{x}, i.e. W \ge -1, and x \ge \tilde{x}, i.e. W \le -1.

Case 1: x \le \tilde{x}, W \ge -1. From the conditions g = 0 at x = 0 (or, what is the same, at p = 0) and at x = \tilde{x} (or, what is the same, at p = p_{\max} = b\,\tilde{x}^{\alpha-1} e^{-(\alpha-1)}), we obtain C = C_1 = 0 and

  \lambda_0 = \lambda\,\tilde{x} \big[ 1 - \gamma(\alpha, \alpha-1)\,(\alpha-1)^{-\alpha} e^{\alpha-1} \big].

Substituting these constants in (A9), we obtain

  g(p) = \lambda_0 p - \frac{\lambda}{\Gamma(\alpha)} \big[ T^{\alpha} e^{-T} - \gamma(\alpha, T) \big], \qquad T = -(\alpha-1)\,W_0(z(p)),  (A11)

and after the transformations,

  g(x) = \lambda\, b x^{\alpha-1} e^{-\beta x} \big\{ \tilde{x} \big[ 1 - \gamma(\alpha, \alpha-1)\,(\alpha-1)^{-\alpha} e^{\alpha-1} \big] - x \big\} + \frac{\lambda}{\Gamma(\alpha)}\, \gamma(\alpha, \beta x).  (A12)

After integration we obtain

  S(x) = \frac{\lambda}{\beta\,\Gamma(\alpha)} \big\{ \big[ (\alpha-1)\big( 1 - \gamma(\alpha, \alpha-1)\,(\alpha-1)^{-\alpha} e^{\alpha-1} \big) + \beta x \big] \gamma(\alpha, \beta x) - 2\,\gamma(\alpha+1, \beta x) \big\}.  (A13)

Case 2: x \ge \tilde{x}, W \le -1. From the conditions g = 0 at x \to \infty (or, what is the same, at p = 0) and at x = \tilde{x} (at p = p_{\max}), we obtain C = C_2 = \lambda and

  \lambda_0 = -\lambda\,\tilde{x} \big\{ 1 + \big[ \Gamma(\alpha) - \gamma(\alpha, \alpha-1) \big] (\alpha-1)^{-\alpha} e^{\alpha-1} \big\}.

Substituting these constants in (A9), we obtain

  g(p) = \lambda_0 p + \frac{\lambda}{\Gamma(\alpha)} \big[ T^{\alpha} e^{-T} - \gamma(\alpha, T) \big] + \lambda, \qquad T = -(\alpha-1)\,W_{-1}(z(p)),  (A14)

and after the transformations,

  g(x) = \lambda\, b x^{\alpha-1} e^{-\beta x} \big\{ x - \tilde{x} \big[ 1 + \big( \Gamma(\alpha) - \gamma(\alpha, \alpha-1) \big) (\alpha-1)^{-\alpha} e^{\alpha-1} \big] \big\} - \frac{\lambda}{\Gamma(\alpha)}\, \gamma(\alpha, \beta x) + \lambda.  (A15)

After integration, with S(\tilde{x}) given by (A13) at x = \tilde{x}, we obtain for x \ge \tilde{x}

  S(x) = S(\tilde{x}) + \frac{\lambda}{\beta\,\Gamma(\alpha)} \big\{ \tfrac{\beta \lambda_0}{\lambda} \big[ \gamma(\alpha, \beta x) - \gamma(\alpha, \alpha-1) \big] + 2 \big[ \gamma(\alpha+1, \beta x) - \gamma(\alpha+1, \alpha-1) \big] - \beta x\, \gamma(\alpha, \beta x) + (\alpha-1)\, \gamma(\alpha, \alpha-1) + \beta\,\Gamma(\alpha)\,(x - \tilde{x}) \big\},  (A16)

and the J-measure of the system is

  S = S(x \to \infty) = \frac{\lambda}{\beta\,\Gamma(\alpha)} \big\{ 2\,\Gamma(\alpha) - 4\,\gamma(\alpha, \alpha-1) + 4\,(\alpha-1)^{\alpha} e^{-(\alpha-1)} - (\alpha-1)^{1-\alpha} e^{\alpha-1} \big[ \gamma(\alpha, \alpha-1)^2 + \big( \Gamma(\alpha) - \gamma(\alpha, \alpha-1) \big)^2 \big] \big\}.
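The inversion behind (A4)-(A5) can be checked numerically. The following sketch (assuming the illustrative values \alpha = 3, \beta = 2, and using a hand-rolled Newton iteration in place of a library Lambert-W routine) recovers x from p(x) on the branch below the mode:

```python
import math

def gamma_pdf(x, alpha, beta):
    # Gamma density p(x) = b * x^(alpha-1) * exp(-beta*x), b = beta^alpha / Gamma(alpha), eq. (A1)
    b = beta**alpha / math.gamma(alpha)
    return b * x**(alpha - 1) * math.exp(-beta * x)

def lambert_w0(z, tol=1e-14):
    # Principal branch W0 of the Lambert W function via Newton iteration on W*exp(W) = z;
    # adequate here since z lies in (-1/e, 0) for x below the mode.
    w = 0.0
    for _ in range(100):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (1.0 + w))
        w -= step
        if abs(step) < tol:
            break
    return w

def invert_pdf_left(p, alpha, beta):
    # Recover x from p on the left branch (0 <= x <= mode), eqs. (A4)-(A5):
    # z = -(beta/(alpha-1)) * (p/b)^(1/(alpha-1)),  x = -((alpha-1)/beta) * W0(z)
    b = beta**alpha / math.gamma(alpha)
    z = -(beta / (alpha - 1)) * (p / b)**(1.0 / (alpha - 1))
    return -((alpha - 1) / beta) * lambert_w0(z)

alpha, beta = 3.0, 2.0
x = 0.4                          # below the mode (alpha-1)/beta = 1.0
p = gamma_pdf(x, alpha, beta)
x_rec = invert_pdf_left(p, alpha, beta)
print(x_rec)                     # ~0.4
```

The branch W_{-1} (for x above the mode) could be treated the same way with a starting point w < -1 for the Newton iteration.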