aa r X i v : . [ s t a t . A P ] J a n A note on the g and h control charts Chanseok Park
Applied Statistics LaboratoryDepartment of Industrial EngineeringPusan National UniversityBusan 46241, Korea
Min Wang
Department of Management Science and StatisticsUniversity of Texas at San AntonioSan Antonio, TX 78249, USA
Abstract
In this note, we revisit the g and h control charts that are commonly used formonitoring the number of conforming cases between the two consecutive appearancesof nonconformities. It is known that the process parameter of these charts is usuallyunknown and estimated by using the maximum likelihood estimator and the minimumvariance unbiased estimator. However, the minimum variance unbiased estimator in thecontrol charts has been inappropriately used in the quality engineering literature. Thisobservation motivates us to provide the correct minimum variance unbiased estimatorand investigate theoretical and empirical biases of these estimators under considera-tion. Given that these charts are developed based on the underlying assumption thatsamples from the process should be balanced, which is often not satisfied in many prac-tical applications, we propose a method for constructing these charts with unbalancedsamples. Keywords: control charts, geometric distribution, maximum likelihood estimator, min-imum variance unbiased estimator, g and h charts. Introduction
In an introductory statistics course, the geometric distribution is defined as a probabilitydistribution that represents the number of failures (or normal cases) before observing thefirst success (or adverse case) in a series of Bernoulli trials. Based on this distribution,Kaminsky et al. (1992) proposed Shewhart-type statistical control charts, the so-called g and h charts, for monitoring the number of conforming cases between the two consecutiveappearances of nonconformities. Since then they have been widely used for monitoring thecontrol process especially in the healthcare department; see, for example, Benneyan (1999),Benneyan (2000), to name just a few.The process parameter in the g and h charts is usually unknown and needs to be es-timated in the control chart procedures. One can employ the maximum likelihood (ML)estimator and the minimum variance unbiased (MVU) estimator for the process parameter.Of particular note is that the MVU estimator in the control charts has been inappropri-ately used in the quality engineering literature. This motivates us to obtain the correctMVU estimator and investigate theoretical biases of the estimators considered in this note.Furthermore, Monte Carlo simulations are conducted to investigate the empirical biasesof these estimators. Numerical results show that the theoretical and empirical biases ofthe existing estimators are severe when the sample size is small and the value of processparameter is large and that those of the proposed MVU estimator are always very close tozero for all the simulated scenarios.It deserves mentioning that these conventional g and h charts are developed based onthe underlying assumption that samples from the process should be balanced so that thesamples have the same size, whereas such an assumption can be restrictive and may not besatisfied in many practical applications. To overcome this issue, we propose the method ofhow to construct the g and h charts with unbalanced samples.The remainder of this note is organized as follows. In Section 2, we briefly review thegeometric and negative binomial distributions and then provide the correct MVU estimatorfor the process parameter. In Section 3, we obtain the parameter estimation with unequalsample sizes and investigate theoretical and empirical properties of these estimators consid-ered in this note. In Section 4, based on the proposed estimator, we provide a method of2onstructing the g and h charts with unbalanced samples. Concluding remarks are providedin Section 5. Let Y i be independent and identically distributed (iid) according to the shifted geometricdistribution with location shift a and Bernoulli probability p for i = 1 , , . . . . Then itsprobability mass function (pmf) is given by f ( y ) = P ( Y i = y ) = p (1 − p ) y − a , (1)where y = a, a + 1 , . . . and a is the known minimum possible number of events (usually a = 0 , Y i are respectively given by E ( Y i ) = 1 − pp + a and Var( Y i ) = 1 − pp . For notational convenience, we let T n = P ni =1 Y i . Then T n has the (shifted) negativebinomial with predefined location shift na and Bernoulli probability p and its pmf is givenby g n ( t ) = P ( T n = t ) = (cid:18) t − na + n − n − (cid:19) p n (1 − p ) t − na , (2)where t = na, na + 1 , . . . . The mean and variance of T n are respectively given by E ( T n ) = n (1 − p ) p + na and Var( Y i ) = n (1 − p ) p . It is well known that the method of moments and the method of ML yield the sameestimator of p , which is given by ˆ p ml = 1¯ Y − a + 1 , (3)where ¯ Y = P ni =1 Y i /n . It is worth noting that this estimator is not unbiased and that weare able to identify the best unbiased estimator of p summarized in the following theorem. Theorem 1.
The MVU estimator for the parameter of the geometric distribution in (1) isgiven by ˆ p mvu = n − P ni =1 Y i − na + n − n − /n ¯ Y − a + 1 − /n . roof. It is immediate from Lehmann and Casella (1998) that T n = P ni =1 Y i is a completesufficient statistic since the joint mass functions of iid geometric distributions form anexponential family. Thus, we can employ the Rao-Blackwell theorem (Rao, 1945; Blackwell,1947) to obtain the MVU estimator of p as follows.Let δ = I ( Y n = a ) where I ( · ) is the indicator function. Then δ is an unbiased estimatorof p since E ( δ ) = P ( Y n = a ) = p. Conditioning the unbiased estimator δ on the complete sufficient statistic T n = t and takingthe expectation, we can obtain the MVU estimate, denoted by η ( t ), due to the Rao-Blackwelltheorem η ( t ) = E (cid:2) I ( Y n = a ) | T n = t (cid:3) = P ( Y n = a, T n = t ) P ( T n = t ) . (4)Since Y n = a and T n = Y + Y + · · · + Y n = t , we have T n − = Y + Y + · · · + Y n − = t − a and T n − is independent of Y n . Thus, we have η ( t ) = P ( Y n = a, T n − = t − a ) P ( T n = t ) = P ( Y n = a ) · P ( T n − = t − a ) P ( T n = t ) . Note that the pmfs of Y n = a and T n − = t − a are given by f ( a ) in (1) and g n − ( t − a )from (2), respectively. Thus, we have η ( t ) = f ( a ) · g n − ( t − a ) g n ( t ) = p · (cid:0) t − a − ( n − a + n − n − (cid:1) p n − (1 − p ) t − a − ( n − a (cid:0) t − na + n − n − (cid:1) p n (1 − p ) t − na , which can be simplified as η ( t ) = n − t − na + n − . (5)Using (5), we obtain the MVU estimator of p which is given byˆ p mvu = ( n − /n ¯ Y − a + 1 − /n . This completes the proof.
We assume that there are m samples and each sample has different sample sizes. We denotethe size of the i th sample by n i for i = 1 , . . . , m . Let X ij be the number of independent4ernoulli trials (cases) until the first nonconforming case in the i th sample for i = 1 , . . . , m and j = 1 , . . . , n i . We assume that X ij ’s are iid geometric random variables with locationshift a and Bernoulli probability p . Let T N = P mi =1 P n i j =1 X ij and N = P mi =1 n i . Then itis easily seen from (2) that T N has the negative binomial with predefined location shift N a and Bernoulli probability p and its pmf is given by g N ( t ) = P ( T N = t ) = (cid:18) t − N a + N − N − (cid:19) p N (1 − p ) t − Na , (6)where t = N a, N a + 1 , . . . . It is immediate from (3) that the ML estimator with all thesamples is given by ˆ p ml = 1¯¯ X − a + 1 , where ¯¯ X = P mi =1 P n i j =1 X ij /N . By following Theorem 1, we obtain the MVU estimator of p which is given by ˆ p mvu = ( N − /N ¯¯ X − a + 1 − /N . To the best of our knowledge, the MVU estimator ˆ p mvu above has not yet been used in thequality engineering literature. For example, Benneyan (2001) and Minitab (2020) use thefollowing estimator ˆ p b as the MVU estimator of p ˆ p b = ( N − /N ¯¯ X − a + 1 , (7)which is however not unbiased. As an illustration, consider the case of the degeneratinggeometric distribution with p = 1. Then we have P ( X ij = a ) = 1, so that ¯¯ X = a . Thus,we have ˆ p ml = 1 and ˆ p mvu = 1, whereas ˆ p b = 1 − /N , indicating that ˆ p b is not unbiased.It is worth noting that the estimator ˆ p b in (7) was obtained by simply multiplying theML estimator with the factor ( N − /N , that is, ˆ p b = ˆ p ml · ( N − /N . For the case ofthe exponential distribution with the density f ( y ) = λe − λy , which can be regarded as acontinuous version of the geometric distribution, the MVU estimator of λ can be obtainedby simply multiplying the unbiasing factor ( N − /N with the ML estimator; see, forexample, Miyakawa (1984) and Park (2010). However, this technique fails to the case ofthe geometric distribution. Also, it is of interest to provide the inequality relation of thethree estimators considered above in the following theorem. Theorem 2.
For < p < , we have ˆ p b < ˆ p mvu < ˆ p ml . roof. First, we show that ˆ p b < ˆ p mvu . Since the denominator of ˆ p b is always larger thanthat of ˆ p mvu , we have ˆ p b < ˆ p mvu .Next, we show that ˆ p mvu < ˆ p ml . To prove this, we use the fact that the mediant of thetwo fractions is positioned between them, that is, ac < a + bc + d < bd , where a/c < b/d and a, b, c, d >
0. The estimator ˆ p ml is the mediant of ˆ p mvu and (1 /N ) / (1 /N ),that is, 1 − /N ¯¯ X − a + 1 − /N < X − a + 1 < /N /N , which completes the proof.We observe from Theorem 2 that ˆ p b tends to underestimate the true value p and thatˆ p ml tends to overshoot the true value. Since ˆ p b and ˆ p ml are biased, a natural question arises:what are the theoretical biases of these estimators? In what follows, we provide the firstmoments of these estimators so that the biases of the estimators are easily obtained bysubtracting the true value of p from their first moments. Theorem 3.
For < p < , we have E (ˆ p ml ) = p N · F ( N, N ; N + 1; 1 − p ) and E (ˆ p b ) = (cid:18) N − N (cid:19) p N · F ( N, N ; N + 1; 1 − p ) , where F ( · ) is the Gaussian hypergeometric function.Proof. If X i has the geometric distribution with location shift a and p , then X i − a alsofollows the geometric distribution with zero shift. Without loss of generality, we may thusassume that a = 0. Since ˆ p ml = 1 / ( ¯¯ X + 1) = N/ ( T N + N ), it is immediate upon using (6)that we have E (ˆ p ml ) = ∞ X t =0 Nt + N · g N ( t ) = ∞ X t =0 Nt + N · (cid:18) t + N − N − (cid:19) p N (1 − p ) t , that is, E (ˆ p ml ) = N p N (1 − p ) N ∞ X t =0 (cid:18) t + N − N − (cid:19) (1 − p ) t + N t + N . − p ) t + N / ( t + N ) = R p (1 − y ) t + N − dy , we have E (ˆ p ml ) = N p N (1 − p ) N ∞ X t =0 (cid:18) t + N − N − (cid:19) Z p (1 − y ) t + N − dy = N p N (1 − p ) N Z p (1 − y ) N − y N " ∞ X t =0 (cid:18) t + N − N − (cid:19) y N (1 − y ) t dy. (8)Since (cid:0) t + N − N − (cid:1) y N (1 − y ) t is the pmf of the negative binomial distribution, we have ∞ X t =0 (cid:18) t + N − N − (cid:19) y N (1 − y ) t = 1 . Thus, Equation (8) can be further simplified as E (ˆ p ml ) = N p N (1 − p ) N Z p (1 − y ) N − y − N dy. Using the integration by substitution with x = 1 − y , the above is written as E (ˆ p ml ) = N p N (1 − p ) N Z − p x N − (1 − x ) − N dx = N p N (1 − p ) N · B − p ( N, − N ) , where B x ( a, b ) is the incomplete beta function defined as B x ( a, b ) = Z x y a − (1 − y ) b − dy. It deserves mentioning that the calculation of B − p ( N, − N ) can be complex becausefew software packages provide its calculation with negative argument. To deal with thisdifficulty, one can use the hypergeometric representation of the incomplete beta function(Dutka, 1981; ¨Ozarslan and Ustao˘glu, 2019) which is given by B x ( a, b ) = x a a · F ( a, − b ; a + 1; x ) . (9)Here p F q ( · ) is the hypergeometric function (Abramowitz and Stegun, 1964; Seaborn, 1991)and it is defined as p F q ( a , . . . , a p ; b , . . . , b q ; z ) = ∞ X n =0 ( a ) n · · · ( a p ) n ( b ) n · · · ( b q ) n z n n ! , (10)where ( a ) n is the Pochhammer symbol for the rising factorial defined as ( a ) = 1 and( a ) n = a ( a + 1) · · · ( a + n −
1) for n = 1 , , . . . . Thus, by using (9), we have E (ˆ p ml ) = p N · F ( N, N ; N + 1; 1 − p ) . (11)7ote that we can easily obtain E (ˆ p b ) since ˆ p b = ˆ p ml · ( N − /N . This completes theproof.By using the well-known Euler transformation formula for the hypergeometric function(Miller and Paris, 2011) which is given by F ( a, b ; c ; z ) = (1 − z ) c − a − b F ( c − a, c − b ; c ; z ) , we obtain E (ˆ p ml ) = p · F (1 , N + 1; 1 − p ). Then according the definition of the hyper-geometric function in (10), we have E (ˆ p ml ) = p ∞ X n =0 (1) n (1) n ( N + 1) k (1 − p ) n n != p ∞ X n =0 N ! n !( N + n )! (1 − p ) n = p + ∞ X n =1 p (1 − p ) n (cid:0) N + nn (cid:1) since (1) n = n ! and ( N + 1) n = ( N + n )! /N !. Then the biases of the estimators ˆ p ml and ˆ p b are obtained as Bias(ˆ p ml ) = ∞ X n =1 p (1 − p ) n (cid:0) N + nn (cid:1) and Bias(ˆ p b ) = − pN + N − N ∞ X n =1 p (1 − p ) n (cid:0) N + nn (cid:1) , respectively. It should be noted that the R language provides the hypergeo package tocalculate the hypergeometric function; see Hankin (2016). We can calculate the theoreticalvalues of the biases and provide these values in Figure 1 along with the empirical values. Itdeserves mentioning that the theoretical bias of ˆ p mvu is trivially zero.In what follows, we provide the second moments of the estimators so that their variancescan be easily obtained using them. Theorem 4.
For < p < , we have E (ˆ p ) = p N · F ( N, N, N ; N + 1 , N + 1; 1 − p ) ,E (ˆ p ) = ( N − p N N · F ( N, N, N ; N + 1 , N + 1; 1 − p ) , nd E (ˆ p ) = p N · F ( N − , N − N ; 1 − p ) . Proof.
We first note that E (ˆ p ) = ∞ X t =0 (cid:18) Nt + N (cid:19) · g N ( t )= ∞ X t =0 (cid:18) Nt + N (cid:19) · (cid:18) t + N − N − (cid:19) p N (1 − p ) t = N p N (1 − p ) N ∞ X t =0 Nt + N · (cid:18) t + N − N − (cid:19) (1 − p ) t + N t + N .
By using the identity (1 − p ) t + N / ( t + N ) = R p (1 − y ) t + N − dy , we have E (ˆ p ) = N p N (1 − p ) N ∞ X t =0 Nt + N · (cid:18) t + N − N − (cid:19) Z p (1 − y ) t + N − dy = N p N (1 − p ) N Z p (1 − y ) N − y N " ∞ X t =0 Nt + N (cid:18) t + N − N − (cid:19) y N (1 − y ) t dy. The term in the integrand, P ∞ t =0 Nt + N (cid:0) t + N − N − (cid:1) y N (1 − y ) t , is essentially the same as the firstmoment of ˆ p ml with probability y . Thus, it follows from (11) that ∞ X t =0 Nt + N (cid:18) t + N − N − (cid:19) y N (1 − y ) t = y N · F ( N, N ; N + 1; 1 − y ) , which results in E (ˆ p ) = N p N (1 − p ) N Z p (1 − y ) N − · F ( N, N ; N + 1; 1 − y ) dy = N p N (1 − p ) N Z − p x N − · F ( N, N ; N + 1; x ) dx. (12)Using the general integral representation for p + k F q + k in Theorem 38 of Rainville (1960) andSection 2 of Driver and Johnston (2006), we have Z − p x N − · F ( N, N ; N + 1; x ) dx = (1 − p ) N N · F ( N, N, N ; N + 1 , N + 1; 1 − p ) . (13)Substituting (13) into (12), we obtain the first result. The second result is easily obtainedfrom ˆ p b = ˆ p ml · ( N − /N . 9ext, we have E (ˆ p ) = ∞ X t =0 (cid:18) N − t + N − (cid:19) · g N ( t )= ∞ X t =0 (cid:18) N − t + N − (cid:19) · (cid:18) t + N − N − (cid:19) p N (1 − p ) t = ∞ X t =0 (cid:18) N − t + N − (cid:19) · (cid:18) t + N − N − (cid:19) p N (1 − p ) t = ( N − p N (1 − p ) N − ∞ X t =0 · (cid:18) t + N − N − (cid:19) (1 − p ) t + N − t + N − . Using the identity (1 − p ) t + N − / ( t + N −
1) = R p (1 − y ) t + N − dy , we have E (ˆ p ) = ( N − p N (1 − p ) N − Z p (1 − y ) N − y N − " ∞ X t =0 (cid:18) t + N − N − (cid:19) (1 − y ) t y N − dy = ( N − p N (1 − p ) N − Z p (1 − y ) N − y N − dy = ( N − p N (1 − p ) N − Z − p x N − (1 − x ) − ( N − dx = ( N − p N (1 − p ) N − · B − p ( N − , − N + 2) . Then it is immediate upon using the hypergeometric representation of the incomplete betafunction in (9) that we have the result, which completes the proof.It should be noted that based on the Euler transformation formula for the hypergeo-metric function, we can rewrite E (ˆ p ) = p · F (1 , N ; 1 − p )= p + P Nn =1 p (1 − p ) n (cid:0) N − nn (cid:1) , which results in Var(ˆ p mvu ) = P Nn =1 p (1 − p ) n (cid:0) N − nn (cid:1) . In addition, we also conduct Monte Carlo simulations to study empirical biases of theseestimators under consideration. For each simulation, we generate ( n , n ) = (1 , , , ,
10) samples from the geometric distribution with Bernoulli probability p = 0 . .
3, 0 .
5, 0 .
7, 0 . a being always zero. To obtain empirical biases10 (cid:0)(cid:1) (cid:2)(cid:3)(cid:4) (cid:5)(cid:6)(cid:7) − . − . − . − . . . p B i a s p ^ b ( n = n = p ^ b ( n = n = p ^ b ( n = n = p ^ ml ( n = n = p ^ ml ( n = n = p ^ ml ( n = n =
5) 0.0 0.2 0.4 0.6 0.8 1.0 . . . . . p M SE p ^ b ( n = n = p ^ b ( n = n = p ^ b ( n = n = p ^ ml ( n = n = p ^ ml ( n = n = p ^ ml ( n = n = p ^ mvu ( n = n = p ^ mvu ( n = n = p ^ mvu ( n = n = Figure 1: Theoretical values of the biases and MSEs of the estimators under consideration.The empirical values are denoted by the legends ◦ , × , and • .and empirical mean square errors (MSEs), we iterate this experiment I = 10 ,
000 times. Itshould be noted that the existing methods are all biased so that it is more appropriate tocompare their empirical MSEs instead of the empirical variances. The empirical biases andMSEs are provided in Tables 1 and 2. The values of the theoretical MSEs are easily obtainedusing Theorems 3 and 4 and we also plot the these values along with the biases in Figure 1.In the figure, to compare the empirical and theoretical values, we also superimposed theempirical values with the legends ◦ ( n = 1, n = 1), × ( n = 2, n = 3), and • ( n = 5, n = 5).The values of the empirical biases of ˆ p b are always negative and those of ˆ p ml are alwayspositive, which is expected from Theorem 2, and both biases tend to decrease as the samplesizes increase. It is worth noting that the bias of ˆ p b is really serious, especially when thesample size is small and the probability p is large. However, the empirical biases of ˆ p mvu arevery close to zero for all the cases as expected from the fact that its theoretical bias is zero.Numerical results clearly show that the proposed estimator ˆ p mvu outperforms the existingestimators. On the other hand, the bias of ˆ p ml is larger when p is around 0.5. With n = 111able 1: Empirical biases of ˆ p b , ˆ p mvu and ˆ p ml . ( n , n ) (1 ,
1) (2 ,
3) (5 ,
5) (10 , p = 0 . p b − . − . − . − . p mvu − . . . − . p ml . . . . p = 0 . p b − . − . − . − . p mvu . − . . − . p ml . . . . p = 0 . p b − . − . − . − . p mvu . . . . p ml . . . . p = 0 . p b − . − . − . − . p mvu . − . . − . p ml . . . . p = 0 . p b − . − . − . − . p mvu − . − . . − . p ml . . . . Table 2: Empirical MSEs of ˆ p b , ˆ p mvu and ˆ p ml . ( n , n ) (1 ,
1) (2 ,
3) (5 ,
5) (10 , p = 0 . p b . . . . p mvu . . . . p ml . . . . p = 0 . p b . . . . p mvu . . . . p ml . . . . p = 0 . p b . . . . p mvu . . . . p ml . . . . p = 0 . p b . . . . p mvu . . . . p ml . . . . p = 0 . p b . . . . p mvu . . . . p ml . . . . and n = 1, the bias of ˆ p b can reach around 0.5 with p close to 1 and that of ˆ p ml can reacharound 0.1 with p around 0.5. Considering that the value of p is always in (0 , p b and ˆ p ml are really serious. As N gets larger, the bias gets smaller, whereas the biasof ˆ p b is still severe with a large value of p . 12 Construction of the g and h control charts As we did earlier, we let X ij be the number of independent Bernoulli trials (cases) until thefirst nonconforming case in the i th sample for i = 1 , , . . . , m and j = 1 , , . . . , n i . Then X ij ’s are iid geometric random variables with location shift a and p . Let ¯ X k be the meanof the k th sample with sample size n k .Based on the asymptotic theory, we have¯ X k − µ p σ /n k • ∼ N (0 , , where µ = E ( X kj ) = (1 − p ) /p + a and σ = Var( X kj ) = (1 − p ) /p . We can constructthe control chart for average number of events per subgroup (the h chart) with CL ± g · SEcontrol limits ¯ X k − µ k p σ /n k = ± g, which results in the upper control limit (UCL), lower control limit (LCL) and center line(CL) as follows UCL = µ + g s σ n k = 1 − pp + a + g r − pn k p , CL = µ = 1 − pp + a, (14)LCL = µ − g s σ n k = 1 − pp + a − g r − pn k p . It deserves mentioning that the American Standard uses g = 3 with an ideal false alarmrate 0.27% and British Standard uses g = 3 .
09 with 0.20%.By setting up ( n k ¯ X k − n k µ k ) / p n k σ = ± g , we can also construct the control chart forthe total number of events per subgroup (the g chart) and its control limits are given byUCL = n k µ + g p n k σ = n k (cid:18) − pp + a (cid:19) + g s n k (1 − p ) p , CL = n k µ = n k (cid:18) − pp + a (cid:19) , (15)LCL = n k µ − g p n k σ = n k (cid:18) − pp + a (cid:19) − g s n k (1 − p ) p . In practice, the parameters µ and σ are unknown and can be estimated by substitutingan estimator of p through the relationship µ = 1 /p − a and σ = (1 − p ) /p . However,13 care should be taken in this case. For example, ˆ p mvu is unbiased for p , but 1 / ˆ p mvu is notunbiased for 1 /p . We have shown that ˆ p ml is not unbiased for p , whereas 1 / ˆ p ml is actuallyunbiased for 1 /p . Thus, we estimate µ = 1 /p − a using ˆ µ = 1 / ˆ p ml − a , which resultsin ˆ µ = ¯¯ X . Since ¯¯ X = T N /N = P mi =1 P n i j =1 X ij /N is a complete sufficient statistic, ˆ µ = ¯¯ X is the MVU estimator of µ due to the Lehmann-Scheff´e theorem. For more details on thistheorem, see Theorem 7.4.1 of Hogg et al. (2013). It should be noted that ˆ µ = ¯¯ X is alsothe ML estimator because of the invariance property of the ML estimator (for example, seeTheorem 7.2.10 of Casella and Berger, 2002). Thus, it is clear that one should use ˆ µ = ¯¯ X to estimate the CL, which results in CL = ¯¯ X ( h chart) and CL = n k ¯¯ X ( g chart).To estimate σ , we consider the ML estimator of σ by plugging ˆ p ml into σ = (1 − p ) /p ,which results in ˆ σ = ( ¯¯ X − a )( ¯¯ X − a + 1) . (16)The MVU estimator of σ is also easily obtained using the Lehmann-Scheff´e theorem with E (cid:2) ( T N /N ) · ( T N + N ) / ( N + 1) (cid:3) = (1 − p ) /p . Then we haveˆ σ = NN + 1 ( ¯¯ X − a )( ¯¯ X − a + 1) . (17)Using ˆ µ = ¯¯ X and ˆ σ in (16) along with (14) and (15), we can construct the ML-based h and g charts as follows. • h chart: UCL = ¯¯ X + g s ( ¯¯ X − a )( ¯¯ X − a + 1) n k , CL = ¯¯ X, LCL = ¯¯ X − g s ( ¯¯ X − a )( ¯¯ X − a + 1) n k . • g chart: UCL = n k ¯¯ X + g q n k ( ¯¯ X − a )( ¯¯ X − a + 1) , CL = n k ¯¯ X, LCL = n k ¯¯ X − g q n k ( ¯¯ X − a )( ¯¯ X − a + 1) . Also, using ˆ µ = ¯¯ X and ˆ σ in (17) along with (14) and (15), we can construct the MVU-based h and g charts as follows. 14 h chart: UCL = ¯¯ X + g s NN + 1 ( ¯¯ X − a )( ¯¯ X − a + 1) n k , CL = ¯¯ X, LCL = ¯¯ X − g s NN + 1 ( ¯¯ X − a )( ¯¯ X − a + 1) n k . • g chart: UCL = n k ¯¯ X + g r n k NN + 1 ( ¯¯ X − a )( ¯¯ X − a + 1) , CL = n k ¯¯ X, LCL = n k ¯¯ X − g r n k NN + 1 ( ¯¯ X − a )( ¯¯ X − a + 1) . It should be noted that Kaminsky et al. (1992) provide the control limits for the MVU-based h and g charts in their Table 1, but these limits are based on ˆ p b which is not theMVU. Also, one can also construct the control limits by plugging the MVU estimator ˆ p mvu into (14) and (15). However, like the ML estimator, the MVU estimator has no invarianceproperty. Thus, in this case, the resulting limits can not be regarded as the MVU-basedlimits. We have revisited the g and h control charts with proper ML and MVU estimators. We haveshown that the MVU estimator has been inappropriately used in the quality engineeringliterature and thus provided the correct MVU estimator along with various statistical prop-erties such as their theoretical first and second moments which are explicitly expressed asthe Gauss hypergeometric function. Furthermore, based on the new estimators developedin this note, we provided how to construct the ML-based and MVU-based h and g controlcharts with unbalanced samples.Finally, it is worth noting that we have developed the rQCC R package (Park and Wang,2020) to construct various control charts. In ongoing work, we plan to add these controlcharts in the next update so that practitioners can use our results more easily.15 cknowledgment
This research was supported by the National Research Foundation of Korea (NRF) grant(NRF-2017R1A2B4004169) and the BK21-Plus Program (Major in Industrial Data Scienceand Engineering) funded by the Korea government.
References
Abramowitz, M. and I. A. Stegun (1964).
Handbook of Mathematical Functions: withFormulas, Graphs, and Mathematical Tables , Volume 55 of
National Bureau of StandardsApplied Mathematics Series . U.S. Government Printing Office, Washington, D.C.Benneyan, J. C. (1999). Geometric-based g -type statistical control charts for infrequentadverse events. In Institute of Industrial Engineers Society for Health Systems Conf.Proc. , pp. 175–185.Benneyan, J. C. (2000). Number-between g -type statistical quality control charts for mon-itoring adverse events. Health Care Management Science 4 , 305–318.Benneyan, J. C. (2001). Performance of number-between g -type statistical control chartsfor monitoring adverse events. Health Care Management Science 4 , 319–336.Blackwell, D. (1947). Conditional expectation and unbiased sequential estimation.
Annalsof Mathematical Statistics 18 , 105–110.Casella, G. and R. L. Berger (2002).
Statistical Inference (Second ed.). Pacific Grove, CA:Duxbury.Driver, K. A. and S. J. Johnston (2006). An integral representation of some hypergeometricfunctions.
Electron. Trans. Numer. Anal. 25 , 115–120.Dutka, J. (1981). The incomplete beta function – a historical profile.
Archive for Historyof Exact Sciences 24 (1), 11–29.Hankin, R. K. S. (2016). hypergeo : The Gauss hypergeometric function. https://CRAN.R-project.org/package=hypergeo . R package version 1.2.13 (publishedon April 7, 2016). 16ogg, R. V., J. W. McKean, and A. T. Craig (2013).
Introduction to Mathematical Statistics (7 ed.). Boston, MA: Pearson.Kaminsky, F. C., J. C. Benneyan, and R. D. Davis (1992). Statistical control charts basedon a geometric distribution.
Journal of Quality Technology 24 , 63–69.Lehmann, E. L. and G. Casella (1998).
Theory of Point Estimation (second ed.). NewYork: Springer-Verlag.Miller, A. R. and R. B. Paris (2011). Euler-type transformations for the generalized hyper-geometric function r +2 f r +1 ( x ). Zeitschrift f¨ur angewandte Mathematik und Physik 62 ,31–45.Minitab (2020). Methods and formulas for g chart. Minitab 20 Support. https://support.minitab.com/en-us/minitab/20/ (accessed on December 24, 2020).Miyakawa, M. (1984). Analysis of incomplete data in competing risks model. IEEE Trans-actions on Reliability 33 , 293–296.¨Ozarslan, M. and C. Ustao˘glu (2019). Some incomplete hypergeometric functions andincomplete riemann-liouville fractional integral operators.
Mathematics 7 (5), 483.Park, C. (2010). Parameter estimation for reliability of load sharing systems.
IIE Transac-tions 42 , 753–765.Park, C. and M. Wang (2020). rQCC : Robust quality control chart. https://CRAN.R-project.org/package=rQCC . R package version 1.20.7 (published onJuly 5, 2020).Rainville, E. D. (1960).
Special Functions . New York: Macmillan.Rao, C. R. (1945). Information and accuracy attainable in the estimation of statisticalparameters.
Bulletin of the Calcutta Mathematical Society 37 , 81–91.Seaborn, J. B. (1991).