Ratio Estimators in Simple Random Sampling when Study Variable is an Attribute
Rajesh Singh and Mukesh Kumar
Department of Statistics, Banaras Hindu University, Varanasi-221005, INDIA
Florentin Smarandache
Department of Mathematics, University of New Mexico, Gallup, USA
Abstract
In this paper we suggest a family of estimators for the population mean when the study variable itself is qualitative in nature. Expressions for the bias and mean square error (MSE) of the suggested family are obtained. An empirical study is carried out to show the superiority of the constructed estimator over others.
Key words: Attribute, point bi-serial, mean square error, simple random sampling.
1. Introduction
The use of auxiliary information can increase the precision of an estimator when the study variable y is highly correlated with the auxiliary variable x. In many situations the study variable is influenced not only by ratio-scale variables but also by variables that are essentially qualitative, or nominal scale, in nature, such as sex, race, colour, religion, nationality, geographical region and political upheavals (see [1]). Taking into consideration the point biserial correlation coefficient between an auxiliary attribute and the study variable, several authors, including [2], [3], [4], [5] and [6], defined ratio estimators of the population mean when prior information on the population proportion of units possessing some attribute is available. All of these authors implicitly assumed that the study variable Y is quantitative whereas the auxiliary variable is qualitative.

In this paper we consider some estimators in which the study variable itself is qualitative in nature. For example, suppose we want to study the labour force participation (LFP) decision of adult males. Since an adult is either in the labour force or not, LFP is a yes or no decision. Hence, the study variable can take two values, say 1 if the person is in the labour force and 0 if he is not. Labour economics research suggests that the LFP decision is a function of the unemployment rate, average wage rate, education, family income, etc. (see [1]).

Consider a sample of size n drawn by simple random sampling without replacement (SRSWOR) from a population of size N. Let $\phi_i$ and $x_i$ denote the observations on the variables $\phi$ and $x$, respectively, for the $i$-th unit ($i = 1, 2, \ldots, N$), where $\phi_i = 1$ if the $i$-th unit of the population possesses attribute $\phi$ and $\phi_i = 0$ otherwise. Let $A = \sum_{i=1}^{N} \phi_i$ and $a = \sum_{i=1}^{n} \phi_i$ denote the total number of units possessing attribute $\phi$ in the population and in the sample, respectively, and let $P = A/N$ and $p = a/n$ denote the corresponding proportions. Let $\bar{X}$ and $\bar{x}$ denote the population and sample means of $x$.

Define

$$e_\phi = \frac{p - P}{P}, \qquad e_x = \frac{\bar{x} - \bar{X}}{\bar{X}},$$

such that $E(e_i) = 0$ for $i = \phi, x$, and

$$E(e_\phi^2) = f C_p^2, \qquad E(e_x^2) = f C_x^2, \qquad E(e_\phi e_x) = f \rho_{pb} C_p C_x,$$

where

$$f = \left(\frac{1}{n} - \frac{1}{N}\right), \qquad C_p = \frac{S_\phi}{P}, \qquad C_x = \frac{S_x}{\bar{X}},$$

and $\rho_{pb} = \dfrac{S_{\phi x}}{S_\phi S_x}$ is the point biserial correlation coefficient. Here,

$$S_\phi^2 = \frac{1}{N-1} \sum_{i=1}^{N} (\phi_i - P)^2, \qquad S_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{X})^2, \qquad S_{\phi x} = \frac{1}{N-1} \left( \sum_{i=1}^{N} \phi_i x_i - N P \bar{X} \right).$$
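The population quantities defined above can be computed directly from unit-level data. The following is a minimal sketch, assuming the attribute is stored as a list of 0/1 values and the auxiliary variable as a list of numbers; the function name `population_summary` and the toy population are hypothetical and are not taken from the paper.

```python
from math import sqrt

def population_summary(phi, x, n):
    """Compute f, P, X_bar, C_p, C_x and rho_pb for a 0/1 attribute phi,
    an auxiliary variable x (both of length N) and a sample size n."""
    N = len(phi)
    P = sum(phi) / N                      # population proportion A / N
    X_bar = sum(x) / N                    # population mean of x
    S_phi2 = sum((v - P) ** 2 for v in phi) / (N - 1)
    S_x2 = sum((v - X_bar) ** 2 for v in x) / (N - 1)
    S_phix = (sum(p_i * x_i for p_i, x_i in zip(phi, x)) - N * P * X_bar) / (N - 1)
    return {
        "f": 1 / n - 1 / N,
        "P": P,
        "X_bar": X_bar,
        "C_p": sqrt(S_phi2) / P,
        "C_x": sqrt(S_x2) / X_bar,
        "rho_pb": S_phix / sqrt(S_phi2 * S_x2),   # point biserial correlation
    }

# Hypothetical toy population of N = 8 units (not the data used in the paper).
phi = [0, 0, 1, 0, 1, 1, 0, 1]
x = [8.0, 10.0, 14.0, 9.0, 16.0, 18.0, 11.0, 15.0]
summary = population_summary(phi, x, n=4)
```

In this toy population the units possessing the attribute have larger values of x, so `summary["rho_pb"]` comes out positive, which is the situation in which a ratio-type estimator is expected to help.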
2. The proposed estimator
We first propose the following ratio-type estimator
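$$t_1 = \left(\frac{p}{\bar{x}}\right) \bar{X} \qquad (2.1)$$

The bias and MSE of the estimator $t_1$, to the first order of approximation, are respectively given by

$$B(t_1) = f P \left( C_x^2 - \rho_{pb} C_p C_x \right) \qquad (2.2)$$

$$MSE(t_1) = f P^2 \left( C_p^2 + C_x^2 - 2 \rho_{pb} C_p C_x \right) \qquad (2.3)$$

Following [7], we propose a general family of estimators for P as

$$t_2 = H(p, u) \qquad (2.4)$$

where $u = \bar{x}/\bar{X}$ and $H(p, u)$ is a parametric function of p and u such that

$$H(P, 1) = P, \quad \forall P \qquad (2.5)$$

and satisfying the following regularity conditions:
(i) Whatever the sample chosen, the point (p, u) assumes values in a bounded closed convex subset R of the two-dimensional real space containing the point (P, 1).
(ii) The function H(p, u) is continuous and bounded in R.
(iii) The first and second order partial derivatives of H(p, u) exist and are continuous as well as bounded in R.

Expanding H(p, u) about the point (P, 1) in a second order Taylor series, we have

$$t_2 = H(p, u) = p + (u - 1) H_1 + (u - 1)^2 H_2 + (p - P)(u - 1) H_3 + (p - P)^2 H_4 + \ldots \qquad (2.6)$$

where, with all derivatives evaluated at $p = P$, $u = 1$,

$$H_1 = \frac{\partial H}{\partial u}, \qquad H_2 = \frac{1}{2} \frac{\partial^2 H}{\partial u^2}, \qquad H_3 = \frac{\partial^2 H}{\partial p \, \partial u}, \qquad H_4 = \frac{1}{2} \frac{\partial^2 H}{\partial p^2}.$$

The bias and MSE of the estimator $t_2$, to the first order of approximation, are respectively given by

$$B(t_2) = f \left( P \rho_{pb} C_p C_x H_3 + C_x^2 H_2 + P^2 C_p^2 H_4 \right) \qquad (2.7)$$

$$MSE(t_2) = f \left( P^2 C_p^2 + H_1^2 C_x^2 + 2 P H_1 \rho_{pb} C_p C_x \right) \qquad (2.8)$$

On differentiating (2.8) with respect to $H_1$ and equating to zero, we obtain

$$H_1 = -\rho_{pb} P \frac{C_p}{C_x} \qquad (2.9)$$

On substituting (2.9) in (2.8), we obtain the minimum MSE of the estimator $t_2$ as

$$MSE(t_2)_{\min} = f P^2 C_p^2 \left( 1 - \rho_{pb}^2 \right) \qquad (2.10)$$

We suggest another family of estimators for estimating P as

$$t_3 = \left[ q_1 p + q_2 (\bar{X} - \bar{x}) \right] \left[ \frac{a\bar{X} + b}{a\bar{x} + b} \right]^{\alpha} \left\{ \exp\left[ \frac{(a\bar{X} + b) - (a\bar{x} + b)}{(a\bar{X} + b) + (a\bar{x} + b)} \right] \right\}^{\beta} \qquad (2.11)$$

where $q_1$, $q_2$, $\alpha$ and $\beta$ are real constants and a and b are known characterising positive scalars. Many ratio-product estimators can be generated from $t_3$ by putting suitable values of $q_1$, $q_2$, $\alpha$, $\beta$, a and b (for the choice of the parameters refer to [8] and [5]).

Expressing (2.11) in terms of $e_\phi$ and $e_x$, we have

$$t_3 = \left[ q_1 P (1 + e_\phi) - q_2 \bar{X} e_x \right] \left[ 1 - \alpha\theta e_x + \frac{\alpha(\alpha + 1)}{2} \theta^2 e_x^2 - \ldots \right] \left[ 1 - \frac{\beta\theta}{2} e_x + \frac{\beta(\beta + 2)}{8} \theta^2 e_x^2 - \ldots \right]$$

$$\phantom{t_3} = q_1 P \left\{ 1 + e_\phi - B e_x + A e_x^2 - B e_\phi e_x \right\} - q_2 \bar{X} \left\{ e_x - B e_x^2 \right\} + \ldots \qquad (2.12)$$

where

$$\theta = \frac{a\bar{X}}{a\bar{X} + b}, \qquad B = \left( \alpha + \frac{\beta}{2} \right) \theta, \qquad A = \frac{\theta^2}{8} \left[ 4\alpha(\alpha + 1) + \beta(\beta + 2) + 4\alpha\beta \right].$$

The bias and MSE of the estimator $t_3$, to the first order of approximation, are given as

$$Bias(t_3) = P(q_1 - 1) + f \left[ q_2 B \bar{X} C_x^2 + q_1 A P C_x^2 - q_1 B P \rho_{pb} C_p C_x \right] \qquad (2.13)$$

$$MSE(t_3) = E(t_3 - P)^2 = (1 - q_1)^2 P^2 + q_1^2 (M_1 + 2M_3) + q_2^2 M_2 + 2 q_1 q_2 (-M_4 - M_5) - 2 q_1 M_3 + 2 q_2 M_5 \qquad (2.14)$$

where

$$M_1 = P^2 f \left( C_p^2 + B^2 C_x^2 - 2B \rho_{pb} C_p C_x \right), \qquad M_2 = \bar{X}^2 f C_x^2, \qquad M_3 = P^2 f \left( A C_x^2 - B \rho_{pb} C_p C_x \right),$$

$$M_4 = P \bar{X} f \left( -B C_x^2 + \rho_{pb} C_p C_x \right), \qquad M_5 = -P \bar{X} f B C_x^2.$$

On minimising the MSE of $t_3$ with respect to $q_1$ and $q_2$, respectively, we get

$$q_1 = \frac{\Delta_3 \Delta_4 - \Delta_2 \Delta_5}{\Delta_1 \Delta_3 - \Delta_2^2} \quad \text{and} \quad q_2 = \frac{\Delta_1 \Delta_5 - \Delta_2 \Delta_4}{\Delta_1 \Delta_3 - \Delta_2^2} \qquad (2.15)$$

where

$$\Delta_1 = P^2 + M_1 + 2M_3, \qquad \Delta_2 = -M_4 - M_5, \qquad \Delta_3 = M_2, \qquad \Delta_4 = P^2 + M_3, \qquad \Delta_5 = -M_5.$$

On putting these values of $q_1$ and $q_2$ in equation (2.14), we obtain the minimum MSE of $t_3$ as

$$MSE(t_3)_{\min} = P^2 - \frac{\Delta_3 \Delta_4^2 + \Delta_1 \Delta_5^2 - 2 \Delta_2 \Delta_4 \Delta_5}{\Delta_1 \Delta_3 - \Delta_2^2} \qquad (2.16)$$

3. Efficiency Comparisons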
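The comparisons in this section rest on the minimum MSE of the family $t_3$. The sketch below is one possible way to evaluate the optimum constants and the minimum MSE numerically from summary statistics. Several things in it are assumptions rather than statements from the paper: the function name `min_mse_t3` is hypothetical, the five Δ terms are indexed 1 to 5 in the order in which the text defines them, $M_3$ is taken as $P^2 f (A C_x^2 - B \rho_{pb} C_p C_x)$ and $M_5$ as $-P \bar{X} f B C_x^2$, and the illustrative call uses a = 1, b = 0 (so θ = 1) with α = 1, β = 0. The numerical inputs are the summary statistics quoted in the empirical study.

```python
def min_mse_t3(P, X_bar, C_p, C_x, rho_pb, f, alpha, beta, a, b):
    """Return (q1, q2, minimum MSE) for the t3 family to first order."""
    theta = a * X_bar / (a * X_bar + b)
    B = (alpha + beta / 2) * theta
    A = theta ** 2 / 8 * (4 * alpha * (alpha + 1) + beta * (beta + 2) + 4 * alpha * beta)
    cov = rho_pb * C_p * C_x

    # M-terms of the MSE expression (M3 and M5 as assumed in the lead-in).
    M1 = P ** 2 * f * (C_p ** 2 + B ** 2 * C_x ** 2 - 2 * B * cov)
    M2 = X_bar ** 2 * f * C_x ** 2
    M3 = P ** 2 * f * (A * C_x ** 2 - B * cov)
    M4 = P * X_bar * f * (cov - B * C_x ** 2)
    M5 = -P * X_bar * f * B * C_x ** 2

    # Delta-terms, numbered in order of appearance (an assumption).
    d1 = P ** 2 + M1 + 2 * M3
    d2 = -M4 - M5
    d3 = M2
    d4 = P ** 2 + M3
    d5 = -M5

    det = d1 * d3 - d2 ** 2
    q1 = (d3 * d4 - d2 * d5) / det
    q2 = (d1 * d5 - d2 * d4) / det
    mse = P ** 2 - (d3 * d4 ** 2 + d1 * d5 ** 2 - 2 * d2 * d4 * d5) / det
    return q1, q2, mse

# Summary statistics quoted in the empirical study of the paper.
n, N, P, X_bar = 11, 40, 0.525, 14.4
rho_pb, C_p, C_x = 0.897, 0.963, 0.3085
f = 1 / n - 1 / N

q1, q2, mse_t3 = min_mse_t3(P, X_bar, C_p, C_x, rho_pb, f,
                            alpha=1, beta=0, a=1.0, b=0.0)
var_usual = f * P ** 2 * C_p ** 2                 # variance of the usual estimator
mse_reg = var_usual * (1 - rho_pb ** 2)           # regression-type minimum MSE
print(mse_t3 <= var_usual, mse_t3 <= mse_reg)
```

Under these assumptions both comparisons print `True` for the quoted statistics. Other choices of α, β, a and b can be explored by changing the arguments of the call.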
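First, we compare the efficiency of the proposed estimator $t_3$ with the usual estimator $\bar{y}$ (here the sample proportion $p$). We have

$$MSE(t_3)_{\min} \le V(\bar{y})$$

if

$$P^2 - \frac{\Delta_3 \Delta_4^2 + \Delta_1 \Delta_5^2 - 2 \Delta_2 \Delta_4 \Delta_5}{\Delta_1 \Delta_3 - \Delta_2^2} \le f P^2 C_p^2 \qquad (3.1)$$

On solving, we observed that the above condition always holds.

Next, we compare the efficiency of the proposed estimator $t_3$ with the regression estimator. We have

$$MSE(t_3)_{\min} \le MSE(\text{reg})$$

if

$$P^2 - \frac{\Delta_3 \Delta_4^2 + \Delta_1 \Delta_5^2 - 2 \Delta_2 \Delta_4 \Delta_5}{\Delta_1 \Delta_3 - \Delta_2^2} \le f P^2 C_p^2 \left( 1 - \rho_{pb}^2 \right) \qquad (3.2)$$

4. Empirical study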
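As a rough cross-check of the percent relative efficiencies discussed in this section, the sketch below evaluates the PRE of the ratio estimator $t_1$ and of the optimum member of the family $t_2$ from the summary statistics reported for the data (n = 11, N = 40, $\rho_{pb}$ = 0.897, $C_p$ = 0.963, $C_x$ = 0.3085). The function name `pre_from_summary` is hypothetical. Because $P^2$ cancels in every ratio, the MSE expressions are written relative to $P^2$.

```python
def pre_from_summary(n, N, C_p, C_x, rho_pb):
    """Percent relative efficiency of t1 and optimum t2 versus the usual estimator."""
    f = 1 / n - 1 / N
    var_usual = f * C_p ** 2                                     # V(p) / P^2
    mse_t1 = f * (C_p ** 2 + C_x ** 2 - 2 * rho_pb * C_p * C_x)  # MSE(t1) / P^2
    mse_t2 = f * C_p ** 2 * (1 - rho_pb ** 2)                    # min MSE(t2) / P^2
    return {
        "usual": 100.0,
        "t1": 100 * var_usual / mse_t1,
        "t2": 100 * var_usual / mse_t2,
    }

pre = pre_from_summary(n=11, N=40, C_p=0.963, C_x=0.3085, rho_pb=0.897)
for name, value in pre.items():
    print(f"{name}: {value:.3f}")
```

The printed values should land close to the PRE figures reported for $t_1$ and $t_2$; small differences can arise because the summary statistics are themselves rounded.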
We have taken the data from [1], where Y is home ownership and X is income (thousands of dollars).

Data statistics:

n     N     P       $\bar{X}$   $\rho_{pb}$   $C_p$    $C_x$
11    40    0.525   14.4        0.897         0.963    0.3085

The following table shows the PRE of the different estimators with respect to the usual estimator.

Table 1: Percent relative efficiency (PRE) of estimators with respect to usual estimator

Estimator   $\bar{y}$   $t_1$     $t_2$     $t_3$     $t_3$
PRE         100         189.384   511.794   515.798   517.950

(The two $t_3$ columns correspond to different choices of $\alpha$ and $\beta$.)

When we examine Table 1, we observe that the proposed estimators $t_1$, $t_2$ and $t_3$ all perform better than the usual estimator $\bar{y}$. Also, the proposed estimator $t_3$ is the best among the estimators considered in the paper, with the highest PRE attained when one of $\alpha$ and $\beta$ is set to 1 and the other to 0.

References
1. Gujarati, D. N. and Sangeetha, 2007. Basic Econometrics. Tata McGraw-Hill.
2. Naik, V. D. and Gupta, P. C., 1996. A note on estimation of mean with known population proportion of an auxiliary character. Jour. Ind. Soc. Agr. Stat., 48(2), 151-158.
3. Jhajj, H. S., Sharma, M. K. and Grover, L. K., 2006. A family of estimators of population mean using information on auxiliary attribute. Pak. J. Statist., 22(1), 43-50.
4. Shabbir, J. and Gupta, S., 2007. On estimating the finite population mean with known population proportion of an auxiliary variable. Pak. J. Statist., 23(1), 1-9.
5. Singh, R., Chauhan, P., Sawan, N. and Smarandache, F., 2008. Ratio estimators in simple random sampling using information on auxiliary attribute. Pak. J. Stat. Oper. Res., 4(1), 47-53.
6. Abd-Elfattah, A. M., El-Sherpieny, E. A., Mohamed, S. M. and Abdou, O. F., 2010. Improvement in estimating the population mean in simple random sampling using information on auxiliary attribute. Appl. Math. Comput., doi:10.1016/j.amc.2009.12.041.