A multinomial Asymptotic Representation of Zenga's Discrete Index, its Influence Function and Data-driven Applications
Tchilabalo Abozou Kpanzou, Diam Ba, Cherif Moctar Mamadou Traoré, Gane Samb Lo
Tchilabalo Abozou Kpanzou (†), Diam Ba (††), Pape Djiby Mergane (†††) and Gane Samb Lo (††††)

Abstract. In this paper, we consider the Zenga index, one of the most recent inequality indices. We keep the original finite-valued form and address its asymptotic theory. The asymptotic normality is established through a multinomial representation. The influence function is also given. The results are simulated and applied to Senegalese data.

† Tchilabalo Abozou Kpanzou (corresponding author). Kara University, Kara, Togo. Email: [email protected]
†† Diam Ba. LERSTAD, Gaston Berger University, Saint-Louis, Sénégal. Email: [email protected].
††† Pape Djiby Mergane. LERSTAD, Gaston Berger University, Saint-Louis, Sénégal. Email: [email protected].
†††† Gane Samb Lo. LERSTAD, Gaston Berger University, Saint-Louis, Sénégal (main affiliation). LSTA, Pierre and Marie Curie University, Paris VI, France. AUST - African University of Sciences and Technology, Abuja. Emails: [email protected], [email protected], [email protected]. Address: 1178 Evanston Dr NW, T3P 0J9, Calgary, Alberta, Canada.

Keywords and phrases. Inequality measures; Asymptotic behaviour; Asymptotic representations; functional empirical process.
AMS 2010 Mathematics Subject Classification :
arXiv preprint [stat.ME], Mar.

1. Introduction

Over the years, a number of measures of inequality have been developed. Examples include the generalized entropy, the Atkinson, the Gini, the quintile share ratio and the Zenga measures (see e.g. Zenga (1984), Zenga (1990), Cowell and Flachaire (2007), Cowell et al. (2009), Hulliger and Schoch (2009)). Recently, Mergane and Lo (2013) gathered a significant number of inequality measures under the name of Theil-like family.
Such inequality measures are very important in capturing inequality in income distributions. They also have applications in many other branches of science, e.g. in ecology (see e.g. Magurran (1991)), sociology (see e.g. Allison (1978)), demography (see e.g. White (1986)) and information science (see e.g. Rousseau (1993)).

The inequality measure of Zenga (2006) is one of the most recent ones. It is receiving considerable attention from researchers, not only for its novelty but also for its interesting properties. Papers dealing with that measure cover theoretical aspects, including asymptotic theory and statistical inference (Greselin et al. (2010b), Eldin and Marilou (1999)), and applied works on income data (Greselin et al. (2010a)), etc.

In this paper, we focus on the discrete form as introduced by Zenga (2006). We justify the asymptotic study of the discrete and finite form by a number of reasons. In some situations, only aggregated data exist. Although this is hardly conceivable today, it is still possible, and even highly probable, that the researcher does not have access to the original data and has in hand only data in the form of frequency tables. At other times, frequency tables may be available while the full data have been destroyed or lost. Right now, in Gambia, health data collected from the health centers are stored in daily books; the national health directorate extracts frequency tables from those books, and this type of data is the only one available in their computerized system. So one of the main reasons to work on finite discrete data is the lack of access to the full data, for one reason or another. The second main reason is that an asymptotic theory for this kind of data gives the structure of the limit results without severe conditions. By replacing the discrete finite probability law of the aggregated data by a general probability law, we get the precise general asymptotic case. From that simplified study, we see what might be expected in the general theory before we proceed to it.

Here, we suppose that the full data has been summarized into a frequency table of the form given in Table 1.

classes $(c_{i-1}, c_i)$     representative $x^*_i$     frequencies $n_i$
$(c_0, c_1)$                 $x^*_1$                    $n_1$
$(c_1, c_2)$                 $x^*_2$                    $n_2$
...                          ...                        ...
$(c_{m-1}, c_m)$             $x^*_m$                    $n_m$
Total                                                   $n$

Table 1. Frequency table

Each class $(c_{i-1}, c_i)$ in Table 1 is represented by a single point $x^*_i$, usually taken as the middle of the class, $x^*_i = (c_{i-1} + c_i)/2$ (other possible choices are the mean or the median of the observations falling in the class). So we may approximately reconstitute the $n \geq 1$ data as follows:

\[ \underbrace{x^*_1, \dots, x^*_1}_{n_1 \text{ times}}, \ \cdots, \ \underbrace{x^*_j, \dots, x^*_j}_{n_j \text{ times}}, \ \cdots, \ \underbrace{x^*_m, \dots, x^*_m}_{n_m \text{ times}}. \]

In the sequel, we suppose that the data itself is discrete and takes a predetermined number $m$ of values. First, we will give an asymptotic theory in the form of a representation in multinomial laws, as opposed to a representation in Brownian bridges in the general case. Next, the influence function will be derived by direct computations; this usually allows one to recover the asymptotic variance and sometimes, as in our case, to find a different but equivalent expression of that variance.

The results presented here will be applied to incomes available in an aggregated form. At the same time, they pave the way to a more general approach.

Let us suppose that the income variable $X$ is discrete and takes the $m$ ($m > 1$) ordered values $-\infty < x_1 < \dots < x_m < x_{m+1} = +\infty$ with the probabilities $p_j > 0$, $j \in \{1, \dots, m\}$, with $p_1 + p_2 + \dots + p_m = 1$. If the income is continuously observed, we have a sequence of random replications $X_1, X_2, \dots$ defined on the same probability space $(\Omega, \mathcal{A}, \mathbb{P})$.
For each $n \geq 1$, the empirical distribution of $X$ on the sample is characterized by the empirical frequencies

\[ n_0 = 0, \quad n_j = \#\{h \in \{1, \dots, n\},\ X_h = x_j\}, \quad j \in \{1, \dots, m\}, \]

and their normalized and cumulative forms, respectively

\[ f_0 = 0, \quad f_j = \frac{n_j}{n}, \quad j \in \{1, \dots, m\}, \]

and

\[ n^*_0 = f^*_0 = 0, \quad n^*_j = \sum_{h=1}^{j} n_h, \quad f^*_j = \sum_{h=1}^{j} f_h, \quad j \in \{1, \dots, m\}, \]

with

\[ \sum_{j=1}^{m} n_j = n, \quad \sum_{j=1}^{m} f_j = 1, \quad n^*_m = n, \quad f^*_m = 1. \]

We also define

\[ p^*_0 = 0, \quad p^*_j = \sum_{h=1}^{j} p_h, \quad p^*_m = 1. \]

The empirical discrete index of Zenga (2006) is given by

\[ Z_{d,n} = 1 - \sum_{j=1}^{m-1} f_j \, \frac{(n^*_j/n)^{-1} \sum_{1 \leq h \leq j} n_h x_h}{(1 - (n^*_j/n))^{-1} \sum_{j+1 \leq h \leq m} n_h x_h}, \]

which is obtained by summing Formula (3.1) in Zenga (2006) over $j \in \{1, \dots, m\}$ and is presented as a synthetic measure of inequality. The empirical cumulative distribution function (cdf) based on the sample of size $n \geq 1$ is

\[ F_n(x) = \frac{1}{n} \sum_{h=1}^{m} n^*_h \, \mathbf{1}_{[x_h, x_{h+1}[}(x), \quad x \in \mathbb{R}, \]

and is the non-parametric estimator of the true cdf

\[ F(x) = \sum_{h=1}^{m} p^*_h \, \mathbf{1}_{[x_h, x_{h+1}[}(x), \quad x \in \mathbb{R}. \]

The empirical probability measure generated by the sample is given by

\[ \mathbb{P}_{X,n}(A) = \frac{1}{n} \sum_{j=1}^{m} n_j \, \mathbf{1}_A(x_j). \]

We may express $Z_{d,n}$ in terms of the empirical probability measure by

\[ Z_{d,n} = 1 - \sum_{j=1}^{m-1} \mathbb{P}_{X,n}(x_j) \, \frac{\left(\int \mathbf{1}_{]0,x_j]}(t) \, d\mathbb{P}_{X,n}(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]0,x_j]}(t) \, d\mathbb{P}_{X,n}(t)\right)}{\left(\int \mathbf{1}_{]x_j,+\infty[}(t) \, d\mathbb{P}_{X,n}(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]x_j,+\infty[}(t) \, d\mathbb{P}_{X,n}(t)\right)}. \]

Finally, by considering the discrete measure $\nu = \sum_{1 \leq j \leq m} \delta_{x_j}$, where $\delta_{x_j}$ is the Dirac measure concentrated at $x_j$ with mass one, we may also write

\[ Z_{d,n} = 1 - \int \frac{\left(\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_{X,n}(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_{X,n}(t)\right)}{\left(\int \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_{X,n}(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_{X,n}(t)\right)} \, \mathbb{P}_{X,n}(s) \, d\nu(s). \]
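The definition of $Z_{d,n}$ above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `class_midpoints` and `zenga_discrete` are hypothetical (they do not come from the paper), and the class midpoint is just one of the possible representatives mentioned earlier.

```python
from itertools import accumulate


def class_midpoints(bounds):
    """Midpoints x*_i = (c_{i-1} + c_i) / 2 of consecutive class bounds."""
    return [(a + b) / 2.0 for a, b in zip(bounds[:-1], bounds[1:])]


def zenga_discrete(values, counts):
    """Empirical discrete Zenga index Z_{d,n}.

    values: ordered support points x_1 < ... < x_m
    counts: frequencies n_1, ..., n_m (same length as values)
    """
    if len(values) != len(counts) or len(values) < 2:
        raise ValueError("need at least two classes and matching lengths")
    n = sum(counts)
    cum_counts = list(accumulate(counts))                                # n*_j
    cum_totals = list(accumulate(c * x for c, x in zip(counts, values)))  # sum_{h<=j} n_h x_h
    grand_total = cum_totals[-1]

    z_star = 0.0
    for j in range(len(values) - 1):        # j = 1, ..., m-1 in the text
        if counts[j] == 0:
            continue                        # f_j = 0, so the term vanishes
        upper_count = n - cum_counts[j]
        upper_total = grand_total - cum_totals[j]
        if upper_count == 0 or upper_total == 0:
            raise ValueError("empty upper group: the index is undefined")
        lower_mean = cum_totals[j] / cum_counts[j]   # mean of the lower group
        upper_mean = upper_total / upper_count       # mean of the upper group
        z_star += (counts[j] / n) * lower_mean / upper_mean
    return 1.0 - z_star
```

For instance, `zenga_discrete([1, 2, 3], [1, 1, 1])` gives 0.7 up to floating-point rounding: the two terms of the sum are (1/3)(1/2.5) and (1/3)(1.5/3).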
By the convergence in law of the sequence of probability measures $\mathbb{P}_{X,n}$ to $\mathbb{P}_X = \mathbb{P} X^{-1}$ (the probability law of $X$), we see that $Z_{d,n}$ converges to

\[ Z_d = 1 - \int \frac{\left(\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)\right)}{\left(\int \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_X(t)\right)^{-1} \left(\int t \, \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_X(t)\right)} \, \mathbb{P}_X(s) \, d\nu(s). \]

In this simple setting, the convergences are easily justified because of the finiteness of the summations and of the functions. In terms of the cdf and of mathematical expectations, we have

\[ Z_d = 1 - \int_{x_1}^{x_m} \frac{(1 - F(s)) \int_0^s t \, d\mathbb{P}_X(t)}{F(s) \int_s^{\infty} t \, d\mathbb{P}_X(t)} \, \mathbb{P}_X(s) \, d\nu(s). \qquad (X) \]

The integral in the last expression should be read as

\[ \int_{x_1}^{x_m-} \frac{(1 - F(s)) \int_0^s t \, d\mathbb{P}_X(t)}{F(s) \int_s^{\infty} t \, d\mathbb{P}_X(t)} \, \mathbb{P}_X(s) \, d\nu(s) = \int \mathbf{1}_{[x_1, x_m[}(s) \, \frac{(1 - F(s)) \int_0^s t \, d\mathbb{P}_X(t)}{F(s) \int_s^{\infty} t \, d\mathbb{P}_X(t)} \, \mathbb{P}_X(s) \, d\nu(s), \]

so that neither $1 - F(s)$ nor $F(s)$ vanishes on the integration domain.

On the one hand, we are going to draw an asymptotic normality theory of $Z_{d,n}$ using the $m$-dimensional multinomial laws. On the other hand, the sensitivity of a statistic $T(F)$ and the impact of extreme observations on it are also two recurrent questions in research in the field (see Cowell and Flachaire (2007)).
In that context, the asymptotic variance of the plug-in estimator $T(F_n)$ of the statistic $T(F)$ is of the form $\sigma^2 = \int L^2(x, T(F)) \, dF(x)$, where $L(\cdot, T(F))$ denotes the influence function (IF). From this, we may say that the influence function behaves in nonparametric estimation as the score function does in the parametric setting (see Wasserman (2006), page 19). To define the notion of IF, let us consider the contaminated probability law $\mathbb{P}^{(\varepsilon)}_X$ of $\mathbb{P}_X$ at $x$ with mass $\varepsilon > 0$, defined by

\[ \mathbb{P}^{(\varepsilon)}_X = (1 - \varepsilon) \mathbb{P}_X + \varepsilon \delta_x, \qquad (1.1) \]

and a functional of $\mathbb{P}_X$, namely $T(\mathbb{P}_X)$. The influence function of the functional $T$ at $x$, if it exists, is given by

\[ IF(T, x) = \lim_{\varepsilon \to 0} \frac{T(\mathbb{P}^{(\varepsilon)}_X) - T(\mathbb{P}_X)}{\varepsilon}. \qquad (1.2) \]

The previous remarks motivate us to derive the influence function of $Z_d(\mathbb{P}_X)$ and to compare it with the asymptotic variance of the plug-in Zenga estimator.

Before we proceed, we point out that asymptotic normality results for Zenga's index are available in the literature, among them those of Greselin et al. (2010b) and Eldin and Marilou (1999). We will come back to these results in a forthcoming paper where we deal with other asymptotic versions in the general case.

The paper is organized as follows. We give our asymptotic results as described above in Section 2, in Theorems 1 and 2. Section 3 is devoted to simulation studies and to a data-driven application to Senegalese data. A conclusion and perspectives section ends the paper.

2. Asymptotic Theory for the discrete Zenga measure

(A) - Asymptotic normality.

Let us begin with the following reminder. For each $n \geq 1$, the random vector $(n_1, \dots, n_m)$ follows an $m$-dimensional multinomial law of parameters $n \geq 1$ and $p = (p_1, \dots, p_m)^t$. In such a case, a classical result of weak convergence (see Lo et al. (2016), for example), as $n \to +\infty$, is the following:

\[ \left( \frac{n_1 - n p_1}{\sqrt{n p_1}}, \cdots, \frac{n_m - n p_m}{\sqrt{n p_m}} \right)^t \equiv (N_{1,n}, \cdots, N_{m,n})^t \rightsquigarrow Z = (Z_1, \cdots, Z_m)^t \sim \mathcal{N}_m(0, \Sigma), \]

where the variance-covariance matrix $\Sigma = (\sigma_{h,k})_{1 \leq h,k \leq m}$ of $Z$ is defined, for $(h, k) \in \{1, \dots, m\}^2$, $h \neq k$, by

\[ \sigma_{hh} = \mathbb{E}(Z_h^2) = 1 - p_h \quad \text{and} \quad \sigma_{hk} = \mathbb{E}(Z_h Z_k) = -\sqrt{p_h p_k}. \]

We invoke the Skorohod-Wichura Theorem (see Wichura (1970)) to suppose that $Z$ is defined on the same probability space and that

\[ (N_{1,n}, \cdots, N_{m,n})^t \to_{\mathbb{P}} Z, \quad \text{as } n \to +\infty. \]

Let us give some notation. For $1 \leq j \leq m-1$, put $\mu_1^{(j)} = \sum_{h=1}^{j} p_h x_h$ and $\mu_2^{(j)} = \sum_{h=j+1}^{m} p_h x_h$. Define the vector $C = (c_1, \dots, c_m)^t$ such that

\[ c_j = \sqrt{p_j} \, \frac{(1/p^*_j) \, \mu_1^{(j)}}{(1/(1 - p^*_j)) \, \mu_2^{(j)}} \, \mathbf{1}_{(j \neq m)}, \quad j \in \{1, \dots, m\}, \]

and, for $j \in \{1, \dots, m-1\}$, $i \in \{1, 2\}$, the vectors $D_{j,i} = (d_{j,i,1}, \dots, d_{j,i,m})^t$ such that

\[ d_{j,1,h} = (x_h \sqrt{p_h}) \, \mathbf{1}_{(h \leq j)}, \quad d_{j,2,h} = -(x_h \sqrt{p_h}) \, \mathbf{1}_{(h \geq j+1)}, \]

\[ \gamma_{j,1} = \frac{p_j (1/p^*_j)}{(1/(1 - p^*_j)) \, \mu_2^{(j)}}, \quad \gamma_{j,2} = \frac{p_j (1/p^*_j) \, \mu_1^{(j)}}{(1/(1 - p^*_j)) \, (\mu_2^{(j)})^2}, \]

and let $E_j = (e_{j,1}, \dots, e_{j,m})^t$ be the vector defined by its components as follows:

\[ e_{j,h} = -(\sqrt{p_h}) \, \mathbf{1}_{(h \leq j)}. \]

Finally, let us define

\[ H = C + \sum_{j=1}^{m-1} \left( \gamma_{j,1} D_{j,1} + \gamma_{j,2} D_{j,2} + (p^*_j)^{-2} E_j \right). \]

Theorem 1. Under the notation given above, we have, as $n \to +\infty$,

\[ \sqrt{n} (Z_{d,n} - Z_d) \rightsquigarrow \mathcal{N}(0, H^t \Sigma H). \ \diamondsuit \]
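To get a feeling for the limiting variance in Theorem 1, here is a minimal numerical sketch in Python. It is not the closed-form vector $H$ of the text: it is a generic delta-method computation, under the assumption that one may expand $Z_d$ to first order around $p$ using $n_h/n = p_h + \sqrt{p_h} N_{h,n}/\sqrt{n}$, so that the coefficient of $N_{h,n}$ is $\sqrt{p_h} \, \partial Z_d / \partial p_h$, approximated here by central differences. All function names are hypothetical.

```python
import math


def zenga_population(values, probs):
    """Z_d = 1 - sum_{j<m} p_j (1/p*_j - 1) mu1_j / mu2_j (form used in the proof)."""
    total = sum(p * x for p, x in zip(probs, values))
    z_star, cum_p, cum_px = 0.0, 0.0, 0.0
    for j in range(len(values) - 1):
        cum_p += probs[j]                 # p*_j
        cum_px += probs[j] * values[j]    # mu_1^{(j)}
        z_star += probs[j] * (1.0 / cum_p - 1.0) * cum_px / (total - cum_px)
    return 1.0 - z_star


def sigma_matrix(probs):
    """Limit covariance: sigma_hh = 1 - p_h and sigma_hk = -sqrt(p_h p_k)."""
    m = len(probs)
    return [[(1.0 - probs[h]) if h == k else -math.sqrt(probs[h] * probs[k])
             for k in range(m)] for h in range(m)]


def numerical_h(values, probs, step=1e-6):
    """Coefficients sqrt(p_h) * dZ_d/dp_h, by central finite differences."""
    coeffs = []
    for h in range(len(probs)):
        up = list(probs)
        up[h] += step
        down = list(probs)
        down[h] -= step
        slope = (zenga_population(values, up) - zenga_population(values, down)) / (2 * step)
        coeffs.append(math.sqrt(probs[h]) * slope)
    return coeffs


def quadratic_form(vec, mat):
    """vec^t mat vec for a list vector and a list-of-lists matrix."""
    size = len(vec)
    return sum(vec[h] * mat[h][k] * vec[k] for h in range(size) for k in range(size))
```

As a sanity check, for $m = 2$ the index reduces to $Z_d = 1 - p_1 x_1 / x_2$, whose plug-in estimator has asymptotic variance $(x_1/x_2)^2 p_1 p_2$; with `values = [1.0, 2.0]` and `probs = [0.3, 0.7]`, `quadratic_form(numerical_h(values, probs), sigma_matrix(probs))` agrees with 0.0525 up to the finite-difference error.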
Proof of Theorem 1. Let us fix $n \geq 1$. We have

\[ Z_{d,n} = 1 - \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{n}{n^*_j} - 1 \right) \frac{\sum_{1 \leq h \leq j} n_h x_h}{\sum_{j+1 \leq h \leq m} n_h x_h}. \]

We define

\[ Z^*_{d,n} = \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{n}{n^*_j} - 1 \right) \frac{\sum_{1 \leq h \leq j} n_h x_h}{\sum_{j+1 \leq h \leq m} n_h x_h}, \]

and, for $1 \leq j \leq m-1$,

\[ \mu_1^{(j)} = \sum_{h=1}^{j} p_h x_h \quad \text{and} \quad \mu_2^{(j)} = \sum_{h=j+1}^{m} p_h x_h. \]

We have

\[ \frac{\sum_{1 \leq h \leq j} n_h x_h}{\sum_{j+1 \leq h \leq m} n_h x_h} - \frac{\mu_1^{(j)}}{\mu_2^{(j)}} = \frac{\sum_{1 \leq h \leq j} n_h x_h}{\sum_{j+1 \leq h \leq m} n_h x_h} - \frac{n \mu_1^{(j)}}{\sum_{j+1 \leq h \leq m} n_h x_h} + \frac{n \mu_1^{(j)}}{\sum_{j+1 \leq h \leq m} n_h x_h} - \frac{\mu_1^{(j)}}{\mu_2^{(j)}} \]

\[ = \frac{\sum_{h=1}^{j} x_h \sqrt{p_h} N_{h,n}}{\sqrt{n} \sum_{j+1 \leq h \leq m} n_h x_h / n} - \frac{\mu_1^{(j)} \sum_{h=j+1}^{m} x_h \sqrt{p_h} N_{h,n}}{\sqrt{n} \, \mu_2^{(j)} \left( \sum_{j+1 \leq h \leq m} n_h x_h / n \right)}. \]

Then

\[ Z^*_{d,n} = \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{n}{n^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} + \frac{1}{\sqrt{n}} \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{n}{n^*_j} - 1 \right) \left( \frac{\sum_{h=1}^{j} x_h \sqrt{p_h} N_{h,n}}{\sum_{j+1 \leq h \leq m} n_h x_h / n} - \frac{\mu_1^{(j)} \sum_{h=j+1}^{m} x_h \sqrt{p_h} N_{h,n}}{\mu_2^{(j)} \left( \sum_{j+1 \leq h \leq m} n_h x_h / n \right)} \right) =: Z^*_{d,n}(1) + R_n(1,1). \]

We also have

\[ \left( \frac{n}{n^*_j} - 1 \right) - \left( \frac{1}{p^*_j} - 1 \right) = \left( \frac{n}{n^*_j} - 1 \right) - \left( \frac{n}{\sum_{h=1}^{j} n p_h} - 1 \right) = - \frac{\sum_{h=1}^{j} n_h - \sum_{h=1}^{j} n p_h}{\left( \sum_{h=1}^{j} p_h \right) \left( \sum_{h=1}^{j} n_h \right)} = - \frac{1}{\sqrt{n}} \, \frac{\sum_{h=1}^{j} \sqrt{p_h} N_{h,n}}{\left( \sum_{h=1}^{j} p_h \right) \left( \sum_{h=1}^{j} n_h / n \right)}. \]

This leads to

\[ Z^*_{d,n}(1) = \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} - \sum_{j=1}^{m-1} \frac{n_j}{n} \, \frac{n \sqrt{n} \sum_{h=1}^{j} \sqrt{p_h} N_{h,n}}{\left( \sum_{h=1}^{j} n_h \right) \left( \sum_{h=1}^{j} n p_h \right)} \, \frac{\mu_1^{(j)}}{\mu_2^{(j)}} =: Z^*_{d,n}(2) + R_n(1,2). \]

Finally, we have

\[ Z^*_{d,n}(2) = \sum_{j=1}^{m-1} p_j \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} + \frac{1}{n} \sum_{j=1}^{m-1} \sqrt{n p_j} \, N_{j,n} \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} \]

\[ = \sum_{j=1}^{m-1} p_j \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} + \frac{1}{\sqrt{n}} \sum_{j=1}^{m-1} \sqrt{p_j} \, N_{j,n} \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} \qquad (L2) \]

\[ = \sum_{j=1}^{m-1} p_j \, \frac{(1/p^*_j) \, \mu_1^{(j)}}{(1/(1 - p^*_j)) \, \mu_2^{(j)}} + \frac{1}{\sqrt{n}} \sum_{j=1}^{m-1} \sqrt{p_j} \, N_{j,n} \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} =: Z^*_d + R_n(3). \]

It is clear that $Z_d = 1 - Z^*_d$. We finally get

\[ \sqrt{n} (Z^*_{d,n} - Z^*_d) = \sqrt{n} R_n(1,1) + \sqrt{n} R_n(1,2) + \sqrt{n} R_n(3). \]
By using the convergence (strong and weak) of the empirical frequencies $n_h/n$, we get

\[ \sqrt{n} R_n(1,1) = \sum_{j=1}^{m-1} \frac{n_j}{n} \left( \frac{n}{n^*_j} - 1 \right) \left( \frac{\sum_{h=1}^{j} (x_h \sqrt{p_h}) N_{h,n}}{\sum_{j+1 \leq h \leq m} n_h x_h / n} - \frac{\mu_1^{(j)} \sum_{h=j+1}^{m} (x_h \sqrt{p_h}) N_{h,n}}{\mu_2^{(j)} \left( \sum_{j+1 \leq h \leq m} n_h x_h / n \right)} \right) \]

\[ \to_{\mathbb{P}} \sum_{j=1}^{m-1} p_j \, \frac{(1/p^*_j)}{(1/(1 - p^*_j))} \left( \frac{\sum_{h=1}^{j} (x_h \sqrt{p_h}) Z_h}{\mu_2^{(j)}} - \frac{\mu_1^{(j)} \sum_{h=j+1}^{m} (x_h \sqrt{p_h}) Z_h}{(\mu_2^{(j)})^2} \right). \qquad (A1) \]

Next,

\[ \sqrt{n} R_n(1,2) = - \frac{\sum_{h=1}^{j} \sqrt{p_h} N_{h,n}}{\left( \sum_{h=1}^{j} p_h \right) \left( \sum_{h=1}^{j} n_h / n \right)} \to_{\mathbb{P}} - \frac{\sum_{h=1}^{j} \sqrt{p_h} Z_h}{(p^*_j)^2}, \qquad (A2) \]

and finally

\[ \sqrt{n} R_n(3) = \sum_{j=1}^{m-1} \sqrt{p_j} \left( \frac{1}{p^*_j} - 1 \right) \frac{\mu_1^{(j)}}{\mu_2^{(j)}} N_{j,n} \to_{\mathbb{P}} \sum_{j=1}^{m-1} \sqrt{p_j} \, \frac{(1/p^*_j) \, \mu_1^{(j)}}{(1/(1 - p^*_j)) \, \mu_2^{(j)}} \, Z_j. \qquad (A3) \]

By combining Developments (A1), (A2) and (A3), we get

\[ \sqrt{n} (Z^*_{d,n} - Z^*_d) \to \sum_{j=1}^{m-1} p_j \, \frac{(1/p^*_j)}{(1/(1 - p^*_j))} \left( \frac{\sum_{h=1}^{j} (x_h \sqrt{p_h}) Z_h}{\mu_2^{(j)}} - \frac{\mu_1^{(j)} \sum_{h=j+1}^{m} (x_h \sqrt{p_h}) Z_h}{(\mu_2^{(j)})^2} \right) - \frac{\sum_{h=1}^{j} \sqrt{p_h} Z_h}{(p^*_j)^2} + \sum_{j=1}^{m-1} \sqrt{p_j} \, \frac{(1/p^*_j) \, \mu_1^{(j)}}{(1/(1 - p^*_j)) \, \mu_2^{(j)}} \, Z_j \]

\[ = \left( \sum_{j=1}^{m-1} \langle \gamma_{j,1} D_{j,1}, Z \rangle + \langle \gamma_{j,2} D_{j,2}, Z \rangle + \langle (p^*_j)^{-2} E_j, Z \rangle \right) + \langle C, Z \rangle. \]

We conclude that

\[ \sqrt{n} (Z^*_{d,n} - Z^*_d) \to_{\mathbb{P}} H^t Z. \ \square \]
(B) - Influence Function of $Z_d$.

Theorem 2. Under the notation given below, the influence function of $Z_d$ is given, for $x_1 \leq x \leq x_m$, by

\[ IF(Z_d, x) = \int \mathbb{P}_X(s) \left( \frac{R_1(s)}{R_2(s)^2 (1 - F(s))} \mathbf{1}_{]s,+\infty]}(x) - \frac{1}{R_2(s) F(s)} \mathbf{1}_{]0,s]}(x) \right) x \, d\nu \]

\[ + \int \mathbb{P}_X(s) \left( \frac{R_1(s)}{R_2(s) F(s)} \mathbf{1}_{]0,s]}(x) - \frac{R_1(s)}{R_2(s) (1 - F(s))} \mathbf{1}_{]s,+\infty]}(x) \right) d\nu - \int \delta_x(s) \frac{R_1(s)}{R_2(s)} \, d\nu + \int \mathbb{P}_X(s) \frac{R_1(s)}{R_2(s)} \, d\nu. \]

Proof of Theorem 2. Let us write, for $s \in \mathbb{R}$,

\[ R_1(s) = R_1(s, \mathbb{P}_X) = \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)}, \]

and

\[ R_2(s) = R_2(s, \mathbb{P}_X) = \frac{\int t \, \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_X(t)}{\int \mathbf{1}_{]s,+\infty[}(t) \, d\mathbb{P}_X(t)}. \]

We have

\[ Z_d(\mathbb{P}_X) = Z_d = 1 - \int \frac{R_1(s)}{R_2(s)} \, \mathbb{P}_X(s) \, d\nu(s). \]

By using Formula (1.1), we have

\[ \frac{d(\mathbb{P}^{(\varepsilon)}_X - \mathbb{P}_X)}{\varepsilon} = -d\mathbb{P}_X + d\delta_x. \]

For short, we write $R_i(s, \mathbb{P}_X) = R_i(s)$ and $R_i(s, \mathbb{P}^{(\varepsilon)}_X) = R_i(s, \varepsilon)$, $i \in \{1, 2\}$. We have
\[ Z_d(\mathbb{P}^{(\varepsilon)}_X) - Z_d(\mathbb{P}_X) = -(1 - \varepsilon) \int \mathbb{P}_X(s) \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} \, d\nu - \varepsilon \int \delta_x(s) \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} \, d\nu + \int \mathbb{P}_X(s) \frac{R_1(s)}{R_2(s)} \, d\nu \]

\[ = - \int \mathbb{P}_X(s) \left( \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} - \frac{R_1(s)}{R_2(s)} \right) d\nu + \varepsilon \int \mathbb{P}_X(s) \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} \, d\nu - \varepsilon \int \delta_x(s) \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} \, d\nu. \]

Let us apply the definition of the IF as in Formula (1.2). Since $\mathbb{P}^{(\varepsilon)}_X \to \mathbb{P}_X$ as $\varepsilon \to 0$ (the convergence being meant as a convergence in law), we readily see that

\[ \lim_{\varepsilon \to 0} \frac{Z_d(\mathbb{P}^{(\varepsilon)}_X) - Z_d(\mathbb{P}_X)}{\varepsilon} = \int \mathbb{P}_X(s) \frac{R_1(s)}{R_2(s)} \, d\nu - \int \delta_x(s) \frac{R_1(s)}{R_2(s)} \, d\nu - \int \mathbb{P}_X(s) \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} - \frac{R_1(s)}{R_2(s)} \right) d\nu. \qquad (2.1) \]

So we have to find the influence function of $R_1(s)/R_2(s)$. By formally representing the differentiation of a functional $T(\mathbb{P}_X)$ by $\partial T(\mathbb{P}_X) / \partial \lambda$, we have that the influence function of $R_1(s)/R_2(s)$ is given by

\[ IF(R_1(s)/R_2(s), x) = \frac{R_2(s) \frac{\partial R_1(s)}{\partial \lambda} - R_1(s) \frac{\partial R_2(s)}{\partial \lambda}}{R_2(s)^2}. \]

But

\[ R_1(s, \varepsilon) - R_1(s) = \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}^{(\varepsilon)}_X(t)} - \varepsilon \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}^{(\varepsilon)}_X(t)} + \varepsilon \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d\delta_x(t)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}^{(\varepsilon)}_X(t)} - \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)} \]

\[ = \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d(\mathbb{P}^{(\varepsilon)}_X(t) - \mathbb{P}_X(t))}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}^{(\varepsilon)}_X(t)} - \frac{\int \mathbf{1}_{]0,s]}(t) \, d(\mathbb{P}^{(\varepsilon)}_X(t) - \mathbb{P}_X(t))}{\left( \int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}^{(\varepsilon)}_X(t) \right) \left( \int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \right)} \int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t). \]

We get

\[ \lim_{\varepsilon \to 0} \frac{R_1(s, \varepsilon) - R_1(s)}{\varepsilon} = \frac{\int t \, \mathbf{1}_{]0,s]}(t) \, d(-\mathbb{P}_X(t) + \delta_x)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)} - \frac{\int \mathbf{1}_{]0,s]}(t) \, d(-\mathbb{P}_X(t) + \delta_x)}{\left( \int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \right)^2} \int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \]

\[ = \frac{- \left( \int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \right) + x \mathbf{1}_{]0,s]}(x)}{\int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t)} - \frac{- \left( \int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \right) + \mathbf{1}_{]0,s]}(x)}{\left( \int \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t) \right)^2} \int t \, \mathbf{1}_{]0,s]}(t) \, d\mathbb{P}_X(t). \]

We get

\[ \frac{\partial R_1(s)}{\partial \lambda} = -R_1(s) + \frac{x \mathbf{1}_{]0,s]}(x)}{F(s)} + R_1(s) - \frac{R_1(s)}{F(s)} \mathbf{1}_{]0,s]}(x). \]

By treating $R_2(s)$ in the same manner (we should not forget that we differentiate with respect to the probability law), we have

\[ \frac{\partial R_1(s)}{\partial \lambda} = \frac{x \mathbf{1}_{]0,s]}(x)}{F(s)} - \frac{R_1(s)}{F(s)} \mathbf{1}_{]0,s]}(x), \qquad \frac{\partial R_2(s)}{\partial \lambda} = \frac{x \mathbf{1}_{]s,+\infty]}(x)}{1 - F(s)} - \frac{R_2(s)}{1 - F(s)} \mathbf{1}_{]s,+\infty]}(x). \]

Thus

\[ \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \left( \frac{R_1(s, \varepsilon)}{R_2(s, \varepsilon)} - \frac{R_1(s)}{R_2(s)} \right) = \left( \frac{\mathbf{1}_{]0,s]}(x)}{R_2(s) F(s)} - \frac{R_1(s) \mathbf{1}_{]s,+\infty]}(x)}{R_2(s)^2 (1 - F(s))} \right) x + \left( \frac{R_1(s)}{R_2(s) (1 - F(s))} \mathbf{1}_{]s,+\infty]}(x) - \frac{R_1(s)}{R_2(s) F(s)} \mathbf{1}_{]0,s]}(x) \right). \]

By replacing this limit with its expression in Equation (2.1), we get

\[ \lim_{\varepsilon \to 0} \frac{Z_d(\mathbb{P}^{(\varepsilon)}_X) - Z_d(\mathbb{P}_X)}{\varepsilon} = \int \mathbb{P}_X(s) \frac{R_1(s)}{R_2(s)} \, d\nu - \int \delta_x(s) \frac{R_1(s)}{R_2(s)} \, d\nu \]

\[ + \int \mathbb{P}_X(s) \left( \frac{R_1(s)}{R_2(s)^2 (1 - F(s))} \mathbf{1}_{]s,+\infty]}(x) - \frac{1}{R_2(s) F(s)} \mathbf{1}_{]0,s]}(x) \right) x \, d\nu + \int \mathbb{P}_X(s) \left( \frac{R_1(s)}{R_2(s) F(s)} \mathbf{1}_{]0,s]}(x) - \frac{R_1(s)}{R_2(s) (1 - F(s))} \mathbf{1}_{]s,+\infty]}(x) \right) d\nu. \]

From this, the proof is concluded. $\blacksquare$
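The formula of Theorem 2 can be checked numerically against the definition (1.2). The Python sketch below is only an illustration under stated assumptions: the contamination point $x$ is restricted to a support point $x_k$ (so that the contaminated law keeps the same support), the support is assumed positive, the integrals against $\nu$ are read on $[x_1, x_m[$ as in the Introduction, and all function names are hypothetical.

```python
def group_means(values, probs, j):
    """Return F(x_j), R_1(x_j), R_2(x_j) for a 0-based index j < m - 1."""
    low_p = sum(probs[: j + 1])
    low_px = sum(p * x for p, x in zip(probs[: j + 1], values[: j + 1]))
    up_p = sum(probs[j + 1:])
    up_px = sum(p * x for p, x in zip(probs[j + 1:], values[j + 1:]))
    return low_p, low_px / low_p, up_px / up_p


def zenga_population(values, probs):
    """Z_d = 1 - sum over j < m of P_X(x_j) R_1(x_j) / R_2(x_j)."""
    z_star = 0.0
    for j in range(len(values) - 1):
        _, r1, r2 = group_means(values, probs, j)
        z_star += probs[j] * r1 / r2
    return 1.0 - z_star


def influence_closed_form(values, probs, k):
    """IF(Z_d, x_k): the four integrals of Theorem 2 with x = x_k."""
    x = values[k]
    total = 0.0
    for j in range(len(values) - 1):
        f, r1, r2 = group_means(values, probs, j)
        below = 1.0 if k <= j else 0.0   # indicator that x lies in ]0, x_j]
        above = 1.0 - below              # indicator that x lies above x_j
        total += probs[j] * (r1 / (r2 ** 2 * (1.0 - f)) * above - below / (r2 * f)) * x
        total += probs[j] * (r1 / (r2 * f) * below - r1 / (r2 * (1.0 - f)) * above)
        total -= (r1 / r2) if j == k else 0.0   # integral against delta_x
        total += probs[j] * r1 / r2
    return total


def influence_numerical(values, probs, k, eps=1e-6):
    """Difference quotient of (1.2) for the contaminated law (1.1) at x_k."""
    contaminated = [(1.0 - eps) * p + (eps if h == k else 0.0)
                    for h, p in enumerate(probs)]
    return (zenga_population(values, contaminated) - zenga_population(values, probs)) / eps
```

With `values = [1.0, 2.0]` and `probs = [0.3, 0.7]`, the closed form gives -0.35 at $x_1$ and 0.15 at $x_2$ (up to rounding); these two values average to zero under $\mathbb{P}_X$, and their mean square, 0.0525, is the value of $\int L^2 \, dF$ that the Introduction relates to the asymptotic variance of the plug-in estimator.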
3. Data-driven Applications
(A) Simulation Study.

Quality of the convergence. We choose a probability distribution of yearly income supported by m = 10 points, with lower endpoint $x_1$ = 4.515.000 XOF (9.030 nearly) and upper endpoint $x_m$ = 85.245.000 XOF (170.490 nearly), characterized as in Table 2 (continued in Table 3).

values          $x_1$        $x_2$        $x_3$        $x_4$        $x_5$        ...
                4.515.000    13.485.000   22.455.000   31.425.000   40.395.000   ...
$P(X = x_i)$

Table 2. Underlying Probability Law (to be continued)

values          ...   $x_6$        $x_7$        $x_8$        $x_9$        $x_{10}$
                ...   49.365.000   58.335.000   67.305.000   76.275.000   85.245.000
$P(X = x_i)$    ...   0.1          0.2          0.2          0.1          0.1

Table 3. Continuation of Table 2
Table 4 shows the good performance of the estimation of Zenga's discrete index for sample sizes from n = 100 to n = 1500. Such sizes are comparable with those of sample surveys from populations counted in tens of millions.

Table 4. Mean errors (ERM), Mean Square Errors (MSE), for sample sizes 100, 200, 500, 750, 1000 and 1500
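A simulation of this kind can be sketched in Python as follows. This is not the authors' code: the three-point law, the number of replications and the seed are hypothetical choices made for illustration, and the function names are assumptions. The mean error and the mean square error are the averages of $Z_{d,n} - Z_d$ and of its square over the replications.

```python
import random


def zenga_from_weights(values, weights):
    """Discrete Zenga index for class weights (counts or probabilities)."""
    total_w = sum(weights)
    total_wx = sum(w * x for w, x in zip(weights, values))
    z_star, cum_w, cum_wx = 0.0, 0.0, 0.0
    for j in range(len(values) - 1):
        cum_w += weights[j]
        cum_wx += weights[j] * values[j]
        if weights[j] == 0:
            continue                      # empty class: the term vanishes
        if total_w == cum_w or total_wx == cum_wx:
            raise ValueError("empty upper group: the index is undefined")
        lower_mean = cum_wx / cum_w
        upper_mean = (total_wx - cum_wx) / (total_w - cum_w)
        z_star += (weights[j] / total_w) * lower_mean / upper_mean
    return 1.0 - z_star


def simulate_errors(values, probs, n, replications, seed=0):
    """Monte Carlo mean error and mean square error of Z_{d,n} around Z_d."""
    rng = random.Random(seed)
    target = zenga_from_weights(values, probs)   # population value Z_d
    errors = []
    for _ in range(replications):
        sample = rng.choices(range(len(values)), weights=probs, k=n)
        counts = [0] * len(values)
        for idx in sample:
            counts[idx] += 1
        errors.append(zenga_from_weights(values, counts) - target)
    mean_error = sum(errors) / replications
    mean_square_error = sum(e * e for e in errors) / replications
    return mean_error, mean_square_error


# Hypothetical three-point law, used only to exercise the functions.
erm, mse = simulate_errors([1.0, 2.0, 4.0], [0.2, 0.5, 0.3], n=500, replications=200, seed=1)
```

Repeating the call for increasing values of `n` gives one column of errors per sample size, which is the layout of Table 4.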
Figure 1 shows the fairly good normal approximation of the centered and normalized empirical Zenga estimator.

(B) Data-driven Applications.

We use the income data of Senegal (2001-2002) from the database of the ANSD: Senegalese Household Survey (2001-2002). The incomes are given by household. An adult-equivalence scale should be used in order to compare households. The notion of adult-equivalence has already been described in Lo (2016) and implemented on different data sets, among them the data just described above. The data are available for the whole country (Senegal) and for the 10 areas given in the following order:

(OA): Dakar, Diourbel, Fatick, Kaolack, Louga, Saint-Louis, Tamba, Thies, Ziguinchor, Kolda.

Dakar is the most urbanized area of Senegal and includes the capital of the country, also named Dakar. It concentrates almost 23.1% of the population.

The Zenga and Gini indices have been computed for the 11 areas from the aggregated data, and are displayed in Table 5 (continued in Table 6).
Figure 1. Histograms, Parzen estimators and QQ-plots for three sample sizes, from left to right

Index    Senegal   Dakar   Diourbel   Fatick   Kaolack   Louga   ...
Zenga    80.65     93.33   81.34      92.54    81.11     84.00   ...
Gini     75.00     80.90   75.26      80.39    75.16     16.25   ...

Table 5. Zenga and Gini index measures for Senegal's administrative areas (2000), to be continued

Index    ...   Saint-Louis   Tamba   Thies   Ziguinchor   Kolda
Zenga    ...   87.69         86.64   82.61   82.11        80.24
Gini     ...   78.83         77.26   75.72   75.52        47.86

Table 6. Continuation of Table 5

Figure 2. The areas are given on the horizontal axis and are ordered according to the ranking (OA) above. Blue: Zenga's index. Red: Gini's index
From the values in these tables, the 11 areas are ordered from the lowest inequality index to the greatest as follows:

Ordering by Zenga's index: Kolda (1), Senegal (2), Kaolack (3), Diourbel (4), Ziguinchor (5), Thies (6), Louga (7), Tamba (8), Saint-Louis (9), Fatick (10), Dakar (11).

Ordering by Gini's index: Louga (1), Kolda (2), Senegal (3), Kaolack (4), Diourbel (5), Ziguinchor (6), Thies (7), Tamba (8), Saint-Louis (9), Fatick (10), Dakar (11).

These orderings are illustrated in Figure 2.

The most striking fact is that the two indices do not order the areas in exactly the same way. The most unequal areas (with the greatest values of the inequality index) are the same with the same ordering, from ranks 8 to 11.
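The two orderings above can be reproduced from the values of Tables 5 and 6 with a short Python sketch. The variable and function names are illustrative only; the numbers are those of the tables.

```python
# Index values (in percent) copied from Tables 5 and 6.
zenga = {
    "Senegal": 80.65, "Dakar": 93.33, "Diourbel": 81.34, "Fatick": 92.54,
    "Kaolack": 81.11, "Louga": 84.00, "Saint-Louis": 87.69, "Tamba": 86.64,
    "Thies": 82.61, "Ziguinchor": 82.11, "Kolda": 80.24,
}
gini = {
    "Senegal": 75.00, "Dakar": 80.90, "Diourbel": 75.26, "Fatick": 80.39,
    "Kaolack": 75.16, "Louga": 16.25, "Saint-Louis": 78.83, "Tamba": 77.26,
    "Thies": 75.72, "Ziguinchor": 75.52, "Kolda": 47.86,
}


def ranks(index_by_area):
    """Rank areas from the lowest index value (rank 1) to the greatest."""
    ordered = sorted(index_by_area, key=index_by_area.get)
    return {area: position for position, area in enumerate(ordered, start=1)}


zenga_rank = ranks(zenga)
gini_rank = ranks(gini)

# Areas whose rank differs between the two indices: area -> (Zenga rank, Gini rank).
moved = {area: (zenga_rank[area], gini_rank[area])
         for area in zenga if zenga_rank[area] != gini_rank[area]}
```

Here `moved` contains the seven areas ranked 1 to 7, while Tamba, Saint-Louis, Fatick and Dakar keep ranks 8 to 11 under both indices.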
From ranks 1 to 7, the ordering is slightly changed, but the case of Louga is remarkable: it is ranked first by Gini and seventh by Zenga.

One may think that inequality should be greater in urban areas than in rural zones. Indeed, we see that with the areas of Thies, Saint-Louis and Dakar. But Fatick and Tamba are not so urbanized areas. Why the inequality indices (both Zenga and Gini) are high there should be investigated in accordance with local realities.

In this simple study, we are not concerned with large-scale comparison studies between Zenga's and Gini's indices, either by simulation studies or by theoretical investigations. This will certainly be done in forthcoming papers.

4. Conclusion and perspectives

References

Allison, P. D. (1978). Measures of inequality. American Sociological Review, vol. 43, pp. 478-484.

Agence Nationale de la Statistique et de la Démographie (ANSD) (2001-2002). Enquête Sénégalaise Auprès des Ménages 2 (ESAM II). Website: http://anads.ansd.sn/index.php/catalog/47/study-description.

etodi Quantitativi per le Scienze Economiche ed Aziendali - Università degli Studi di Milano-Bicocca.

Cowell, F. A. and Flachaire, E. (2007). Income Distribution and Inequality Measurement: The Problem of Extreme Values. Journal of Econometrics, vol. 141, pp. 1044-1072.

Cowell, F. A., Flachaire, E. and Bandyopadhyay, S. (2009). Goodness-of-Fit: An Economic Approach. Distributional Analysis Research Programme (DARP 101). Discussion Paper. Department of Economics (University of Oxford).

Emad-Eldin, A.A.A. and Marilou, O.H. (1999). Nonparametric inference for Zenga's measure of income inequality. Metron - International Journal of Statistics, vol. LVII, issue 1-2, pp. 69-84.

Greselin, F., Pasquazzi, L. and Zitikis, R. (2010a). Zenga's new index of economic inequality, its estimation, and analysis of incomes in Italy. Journal of Probability and Statistics, vol. 2010, pp. 1-26.

Greselin, F., Pasquazzi, L. and Zitikis, R. (20XX). Asymptotic Theory for Zenga's New Index of Economic Inequality. Proceedings of the 45th Scientific Meeting of the Italian Statistical Society.

Hulliger, B. and Schoch, T. (2009). Robust estimation of the quintile share ratio with bias reduction. Presented at the Swiss Statistics Meeting, October 30, 2009, Geneva.

Lo, G.S. (2016). VB and R codes using Households databases available in the NSI's: A prelude to statistical applied studies. African Journal of Applied Statistics, Vol 3, pp. 121-156. DOI: http://dx.doi.org/10.16929/ajas/206

Lo, G.S. (2016). Weak Convergence (IA). Sequences of random vectors. SPAS Books Series. Saint-Louis, Senegal - Calgary, Canada. Doi: 10.16929/sbs/2016.0001. Arxiv: 1610.05415. ISBN: 978-2-9559183-1-9

Magurran, A. E. (1991). Ecological diversity and its measurement, Chapman and Hall.

Mergane, P.D. and Lo, G.S. (2013). On the Functional Empirical Process and Its Application to the Mutual Influence of the Theil-Like Inequality Measure and the Growth. Applied Mathematics

Scientometrics, vol. 28, pp. 3-14.

Wasserman, L. (2006). All of Nonparametric Statistics. Springer.

Wichura, M.J. (1970). On the Construction of Almost Sure Uniformly Convergent Random Variables with Given Weakly Convergent Image Laws. Ann. Math. Statist., 41 (1), pp. 284-291.

White, M. J. (1986). Segregation and diversity measures in population distribution. Population Index, vol. 52, pp. 198-221.

Zenga, M. (1984). Proposta per un Indice di Concentrazione Basato sui Rapporti tra Quantili di Popolazione e Quantili di Reddito.