Robust Bias Estimation for Kaplan-Meier Survival Estimator with Jackknifing
arXiv [stat.ME]
Md Hasinur Rahaman Khan (a), J. Ewart H. Shaw (b)
(a) Institute of Statistical Research and Training, University of Dhaka, Bangladesh
(b) Department of Statistics, University of Warwick, UK
Abstract
For studying or reducing the bias of functionals of the Kaplan–Meier survival estimator, the jackknifing approach of Stute and Wang (1994) is natural. We have studied the behavior of the jackknife estimate of bias under different configurations of the censoring level, sample size, and the censoring and survival time distributions. The empirical research reveals some new findings about robust calculation of the bias, particularly for higher censoring levels. We have extended their jackknifing approach to cover the case where the largest observation is censored, using the imputation methods for the largest observations proposed in Khan and Shaw (2013b). This modification to the existing formula reduces the number of conditions for creating jackknife bias estimates to one from the original two, and also avoids the problem that the Kaplan–Meier estimator can be badly underestimated by the existing jackknife formula.
Keywords:
Bias, Censoring, Jackknifing, Kaplan–Meier Estimator
Preprint submitted to Elsevier October 1, 2018

1. Introduction
Suppose that there is a random sample of $n$ individuals. Let $T_i$ and $C_i$ be the random variables that represent the lifetime and censoring time for the $i$th individual. We also assume $T_i$ has unknown distribution function $F$. The Kaplan–Meier (K–M) estimator $\hat F^{KM}$ (Kaplan and Meier, 1958) is then defined by
\[
1 - \hat F^{KM}(t) = \prod_{Y_{(i)} \le t} \left( \frac{n-i}{n-i+1} \right)^{\delta_{(i)}}, \tag{1}
\]
where $Y_{(1)} \le \cdots \le Y_{(n)}$ are the ordered observations (censored and uncensored lifetimes), $\delta_{(i)} = 1$ if $Y_{(i)}$ is observed and $\delta_{(i)} = 0$ if $Y_{(i)}$ is censored, ties between lifetimes and censoring times are treated as if the former precede the latter, and other ties are ordered arbitrarily. Suppose that $S$ is a given statistical function so that $S(F)$ is the parameter of interest. It follows from Stute (1994) that if $S$ is nonlinear then the K–M based estimator, $S(\hat F^{KM})$, is biased. Stute (1994) also discussed the situation where the bias arises even for linear $S$ when the data of interest are only partially observable. Now for any $F$-integrable function $\varphi$, the corresponding estimator of the parameter of interest, $S(\hat F^{KM})$, is defined by the K–M integral $\int \varphi(Y_{(i)})\, d\hat F^{KM}$.

The K–M estimator is well known to be unbiased if there is no random censorship, but it becomes biased under censorship. Gill (1980) was the first to bound the bias of $\hat F^{KM}$: $-F H^{n} \le E(\hat F^{KM}) - F \le 0$, where $H$ is the distribution function of $Y$. Mauro (1985) extended this result to arbitrary K–M integrals with non-negative integrands. Zhou (1988) proved that the bias of the K–M estimator functional decreases at an exponential rate, and that the functional always underestimates the true value. He established the lower bound $-\int \varphi H^{n} F(dt) \le \mathrm{bias}\left(\int \varphi\, d\hat F^{KM}\right) \le 0$. Stute (1994) derived the exact formula for the bias of $\int \varphi\, d\hat F^{KM}$ for a general Borel-measurable function $\varphi$. He also discussed the effect of light, medium or heavy censoring on the bias of $\int \varphi\, d\hat F^{KM}$. Stute and Wang (1994) derived an explicit formula for the jackknife estimate of the bias of $\int \varphi(Y_{(i)})\, d\hat F^{KM}$. They also showed that jackknifing can lead to a considerable reduction of the bias. Four years later, Shen (1998) proposed another explicit formula for the jackknife estimate of the bias of $\int \varphi(T^{*}_{(i)})\, d\hat F^{KM}$. He used delete-2 jackknifing, where two observations are deleted. It follows from Shen (1998) that the delete-2 formula shows no further improvement on the delete-1 formula. Stute (1996) also proposed a jackknife estimate of the variance of $\int \varphi(Y_{(i)})\, d\hat F^{KM}$.

As mentioned in Stute and Wang (1994), under random censorship the estimator $S(\hat F^{KM})$ becomes the K–M integral
\[
S(\hat F^{KM}) = \sum_{i=1}^{n} w_i\, \varphi(Y_{(i)}) \equiv \hat S^{KM}_{\varphi}, \tag{2}
\]
where the K–M weights $w_i$ are the sizes of the jumps by which the K–M estimator of $F$ changes at the uncensored points $Y_{(i)}$, given by
\[
w_1 = \frac{\delta_{(1)}}{n}, \qquad w_i = \frac{\delta_{(i)}}{n-i+1} \prod_{j=1}^{i-1} \left( \frac{n-j}{n-j+1} \right)^{\delta_{(j)}}, \quad i = 2, \cdots, n. \tag{3}
\]
A detailed study of the $w_i$'s in connection with the strong law of large numbers under censoring has been carried out in Stute and Wang (1993).

The jackknife estimate of bias for the K–M integral (Eq. 2) is given by
\[
\mathrm{Bias}(\hat S^{KM}_{\varphi}) = -\frac{n-1}{n}\, \varphi(Y_{(n)})\, \delta_{(n)} \left(1 - \delta_{(n-1)}\right) \prod_{j=1}^{n-2} \left( \frac{n-1-j}{n-j} \right)^{\delta_{(j)}}. \tag{4}
\]
The associated bias corrected jackknife estimator is therefore given by
\[
\tilde S^{KM}_{\varphi} = \hat S^{KM}_{\varphi} - \mathrm{Bias}(\hat S^{KM}_{\varphi}). \tag{5}
\]
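The weights in Eq. (3) and the closed-form bias in Eq. (4) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the data are assumed to be already sorted by $Y$ with no ties, and a brute-force delete-one jackknife is included only as a cross-check of the closed form.

```python
def km_weights(delta):
    """K-M weights w_i of Eq. (3). `delta` holds the censoring indicators
    (1 = observed, 0 = censored) of the observations sorted by increasing Y."""
    n = len(delta)
    weights = []
    prod = 1.0  # running product over j < i of ((n-j)/(n-j+1))^delta_(j)
    for i in range(1, n + 1):
        weights.append(prod * delta[i - 1] / (n - i + 1))
        if delta[i - 1] == 1:
            prod *= (n - i) / (n - i + 1)
    return weights


def km_integral(y, delta, phi=lambda v: v):
    """K-M integral of Eq. (2); phi(y) = y gives the K-M mean lifetime."""
    return sum(w * phi(v) for w, v in zip(km_weights(delta), y))


def jackknife_bias(y, delta, phi=lambda v: v):
    """Closed-form jackknife bias estimate of Eq. (4)."""
    n = len(y)
    if n < 2:
        return 0.0
    prod = 1.0
    for j in range(1, n - 1):  # j = 1, ..., n-2
        if delta[j - 1] == 1:
            prod *= (n - 1 - j) / (n - j)
    return -(n - 1) / n * phi(y[-1]) * delta[-1] * (1 - delta[-2]) * prod


def brute_force_jackknife_bias(y, delta, phi=lambda v: v):
    """Delete-one jackknife: (n-1) * (mean leave-one-out estimate - full estimate)."""
    n = len(y)
    full = km_integral(y, delta, phi)
    loo = [km_integral(y[:k] + y[k + 1:], delta[:k] + delta[k + 1:], phi)
           for k in range(n)]
    return (n - 1) * (sum(loo) / n - full)


# Toy data (made up for illustration): largest datum observed, second largest censored.
y_example = [1.0, 2.0, 3.0]
delta_example = [1, 0, 1]
bias_example = jackknife_bias(y_example, delta_example)
corrected_example = km_integral(y_example, delta_example) - bias_example  # Eq. (5)
```

For the toy data the closed form and the brute-force jackknife agree (both are approximately $-1$), and the bias is zero whenever $\delta_{(n)} = 0$ or $\delta_{(n-1)} = 1$.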
2. Modified Jackknife Bias for K–M Lifetime Estimator
When no censoring is present, $\hat F^{KM}$ reduces to the usual sample distribution estimator $\hat F$ that assigns weight $1/n$ to each observation. With censoring, the weighting method (3) gives zero weight to the censored observations $Y^{+}_{(\cdot)}$, causing particular problems if the largest datum is censored (i.e. $\delta_{(n)} = 0$). As a first step one may apply Efron's (1967) tail correction approach: reclassify $\delta_{(n)} = 0$ as $\delta_{(n)} = 1$. In order to reduce estimation bias and inefficiency, Khan and Shaw (2013b) proposed five alternatives to Efron's approach that can lead to more efficient and less biased estimates. The approaches are summarised in Table 1.

Table 1: The imputation approaches from Khan and Shaw (2013b).
$W_{\tau_m}$: Adding the Conditional Mean
$W_{\tau_{md}}$: Adding the Conditional Median
$W_{\tau^{*}_{m}}$: Adding the Resampling-based Conditional Mean
$W_{\tau^{*}_{md}}$: Adding the Resampling-based Conditional Median
$W_{\nu}$: Adding the Predicted Difference Quantity

The first four approaches are based on the underlying regression assumption relating lifetimes and covariates (e.g., the AFT model), and the fifth approach, $W_{\nu}$, is based only on the random censorship assumption.

The jackknife bias in Eq. (4) is non-zero if and only if the largest datum is uncensored, $\delta_{(n)} = 1$, and the second largest datum is censored, $\delta_{(n-1)} = 0$. Stute and Wang (1994) state that if $\delta_{(n)} = 0$, then the corresponding observation doesn't contain enough information about $F$ to make a change of $\hat S^{KM}_{\varphi}$ desirable. This inability to estimate bias if $\delta_{(n)} = 0$ is a major limitation of the jackknife bias formula.

If $(\delta_{(n-1)} = 0, \delta_{(n)} = 0)$, then we can obtain a modified jackknife estimate of bias by imputing the largest datum, for example using any of the approaches given in Table 1. From Eq. (2) this gives the modified estimator
\[
\hat S^{*KM}_{\varphi} \equiv \sum_{i=1}^{n-1} w_i\, \varphi(Y_{(i)}) + \acute{w}_n\, \varphi(\tilde Y_{(n)}), \tag{6}
\]
where $\tilde Y_{(n)}$ is the imputed largest observation, and $\acute{w}_n$ is the corresponding adjusted K–M weight
\[
\acute{w}_n = w_n + \frac{n-1}{n} \prod_{j=1}^{n-2} \left( \frac{n-1-j}{n-j} \right)^{\delta_{(j)}},
\]
as suggested in Stute and Wang (1994) for the pair $(\delta_{(n-1)} = 0, \delta_{(n)} = 1)$. The modified estimator (6) is also obtained when imputing in the situation $(\delta_{(n-1)} = 1, \delta_{(n)} = 0)$. In this case the K–M weight given to $\tilde Y_{(n)}$ is not adjusted and we arrive at the estimator
\[
\hat S^{*KM}_{\varphi} \equiv \sum_{i=1}^{n-1} w_i\, \varphi(Y_{(i)}) + w_n\, \varphi(\tilde Y_{(n)}).
\]
So, unlike the actual jackknife formula, the modified approach doesn't impose any condition on the censoring status of $Y_{(n)}$. The modified estimate of bias is given by
\[
\mathrm{Bias}(\hat S^{*KM}_{\varphi}) = -\frac{n-1}{n}\, \varphi(\tilde Y_{(n)})\, \delta^{*}_{(n)} \left(1 - \delta_{(n-1)}\right) \prod_{j=1}^{n-2} \left( \frac{n-1-j}{n-j} \right)^{\delta_{(j)}}, \tag{7}
\]
where $\delta^{*}_{(n)}$ is the modified censoring indicator for $\tilde Y_{(n)}$. With the above approach, $\delta^{*}_{(n)}$ is always 1. Because $\tilde Y_{(n)} > Y_{(n)}$, Eq. (7) yields a larger bias quantity than Eq. (4). The modified bias corrected jackknife estimator is then defined by
\[
\tilde S^{*KM}_{\varphi} = \hat S^{*KM}_{\varphi} - \mathrm{Bias}(\hat S^{*KM}_{\varphi}). \tag{8}
\]
The K–M estimates under both approaches for the four pairs are summarized in Table 2.

Table 2: K–M lifetime estimates by censoring indicators for the last two observations.
$\delta_{(n-1)}$   $\delta_{(n)}$   K–M estimate
0   0   $\hat S^{*KM}_{\varphi} + \frac{n-1}{n}\, \varphi(\tilde Y_{(n)})\, \delta^{*}_{(n)} (1 - \delta_{(n-1)}) \prod_{j=1}^{n-2} \left( \frac{n-1-j}{n-j} \right)^{\delta_{(j)}}$
1   0   $\hat S^{*KM}_{\varphi}$
1   1   $\hat S^{KM}_{\varphi}$
0   1   $\hat S^{KM}_{\varphi} + \frac{n-1}{n}\, \varphi(Y_{(n)})\, \delta_{(n)} (1 - \delta_{(n-1)}) \prod_{j=1}^{n-2} \left( \frac{n-1-j}{n-j} \right)^{\delta_{(j)}}$

… $S(\hat F^{KM})$ based on both the actual and the modified jackknife bias formula. For computational simplicity we look only at the K–M mean lifetime estimator, obtained by replacing $\varphi(y)$ by $y$ in Eq. (2). Note that researchers in reliability are very often interested in estimating the mean lifetime of a component, and that the K–M mean lifetime estimate also has an important role in Health Economics, for example in a "QTWIST" analysis (Glasziou et al., 1990). Obviously the behaviour of the K–M mean lifetime estimator depends on the nature of the distribution being estimated and the degree of censoring, although the true distribution of censored data is generally unknown. We therefore conducted simulation studies to demonstrate the behavior of the K–M mean lifetime estimator in the presence of right censoring. We assume that the lifetimes and censoring times have independent distributions.

Note that the mean survival time can be defined as the area under the survival curve, $S(t)$ (Kaplan and Meier, 1958). A nonparametric estimate of the mean survival time can also be obtained by substituting the K–M estimator for the unknown survival function: $\hat\mu = \int_0^{\infty} \hat S(t)\, dt$. Stute (1994) proposed a bias corrected jackknife estimator for the K–M mean lifetime. When the observations are subject to right censoring, the usual estimator of the mean lifetime is not appropriate (Datta, 2005). The reason is that the censoring leads to an inconsistent estimator that underestimates the true mean, and the bias worsens as the censoring increases.
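To make the difference between Eq. (4) and Eq. (7) concrete, the following sketch transcribes both bias formulas for the mean lifetime ($\varphi(y) = y$). The function names are hypothetical, the data are assumed sorted by $Y$, and the imputed value $\tilde Y_{(n)} = Y_{(n)} + \nu$ uses an arbitrary illustrative $\nu$; the actual imputation methods of Khan and Shaw (2013b) are not reproduced here.

```python
def tail_product(delta):
    """prod_{j=1}^{n-2} ((n-1-j)/(n-j))^delta_(j), shared by Eqs. (4) and (7)."""
    n = len(delta)
    prod = 1.0
    for j in range(1, n - 1):
        if delta[j - 1] == 1:
            prod *= (n - 1 - j) / (n - j)
    return prod


def original_bias(y, delta, phi=lambda v: v):
    """Eq. (4): non-zero only if delta_(n) = 1 and delta_(n-1) = 0."""
    n = len(y)
    return (-(n - 1) / n * phi(y[-1]) * delta[-1] * (1 - delta[-2])
            * tail_product(delta))


def modified_bias(y, delta, y_imputed, phi=lambda v: v):
    """Eq. (7): the largest datum is replaced by an imputed value and its
    indicator delta*_(n) is set to 1, so only delta_(n-1) = 0 is required."""
    n = len(y)
    delta_star = 1
    return (-(n - 1) / n * phi(y_imputed) * delta_star * (1 - delta[-2])
            * tail_product(delta))


# Toy data (made up): both of the two largest observations are censored.
y_toy = [1.0, 2.0, 3.0]
delta_toy = [1, 0, 0]
nu = 1.0  # hypothetical predicted difference, for illustration only
bias_eq4 = original_bias(y_toy, delta_toy)                  # zero: delta_(n) = 0
bias_eq7 = modified_bias(y_toy, delta_toy, y_toy[-1] + nu)  # -(2/3) * 4 * (1/2)
```

With these toy values Eq. (4) returns zero while Eq. (7) returns $-4/3$, which illustrates how the modification lets the pair $(\delta_{(n-1)} = 0, \delta_{(n)} = 0)$ contribute to the bias estimate.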
3. Simulation Study
This section reports on three simulation based examples. The first exam-ple extends the Koziol-Green model simulations of Stute and Wang (1994).The second example considers various skewed distributions for survival timesand corresponding distributions for the associated censored times. The thirdexample uses a log-normal AFT model where the event times are assumedto be associated with several covariates.
The first example extends the simulations of the Koziol–Green proportional hazards model from Stute and Wang (1994). Under this model both $T$ and $C$ are exponentially distributed: $T \sim Exp(1)$ and $C \sim Exp(\lambda)$, with varying $\lambda$'s. Four different sample sizes, $n = 30, 50, 100, 150$, are used. For each sample size, 100,000 simulation runs are drawn and the bias and variance of both the mean lifetime estimators $\hat S^{KM}_{mean}$ and $\tilde S^{KM}_{mean}$ are computed. The bias and its variance are shown in Tables 3 and 4 respectively (the first sub-table of each).

Table 3: Simulation results based on the Koziol–Green model for the bias of the four K–M mean lifetime estimators $\hat S^{KM}_{mean}$, $\tilde S^{KM}_{mean}$, $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$.

        Bias of $\hat S^{KM}_{mean}$              Bias of $\tilde S^{KM}_{mean}$
P%      n=30    n=50    n=100   n=150     n=30    n=50    n=100   n=150
10     -0.155  -0.114  -0.073  -0.055    -0.154  -0.114  -0.073  -0.056
20     -0.197  -0.157  -0.107  -0.085    -0.191  -0.155  -0.107  -0.086
30     -0.250  -0.205  -0.151  -0.126    -0.233  -0.195  -0.146  -0.123
40     -0.304  -0.265  -0.209  -0.178    -0.267  -0.239  -0.193  -0.164
50     -0.364  -0.327  -0.278  -0.248    -0.295  -0.268  -0.237  -0.215
60     -0.409  -0.389  -0.349  -0.328    -0.287  -0.281  -0.263  -0.255
70     -0.430  -0.426  -0.413  -0.396    -0.224  -0.234  -0.246  -0.245
80     -0.402  -0.417  -0.428  -0.428    -0.082  -0.097  -0.127  -0.141
90     -0.280  -0.304  -0.335  -0.346     0.161   0.178   0.171   0.164

        Bias of $\hat S^{*KM}_{mean}$             Bias of $\tilde S^{*KM}_{mean}$
P%      n=30    n=50    n=100   n=150     n=30    n=50    n=100   n=150
10     -0.208  -0.147  -0.090  -0.067    -0.207  -0.147  -0.090  -0.068
20     -0.259  -0.202  -0.132  -0.104    -0.252  -0.200  -0.132  -0.104
30     -0.326  -0.261  -0.186  -0.155    -0.309  -0.251  -0.181  -0.152
40     -0.391  -0.335  -0.260  -0.218    -0.354  -0.310  -0.243  -0.205
50     -0.465  -0.407  -0.343  -0.304    -0.396  -0.349  -0.303  -0.271
60     -0.511  -0.481  -0.426  -0.400    -0.389  -0.372  -0.341  -0.327
70     -0.518  -0.512  -0.495  -0.475    -0.312  -0.320  -0.328  -0.325
80     -0.463  -0.481  -0.496  -0.498    -0.162  -0.162  -0.195  -0.210
90     -0.304  -0.331  -0.367  -0.380     0.151   0.151   0.139   0.129

The results show that, for both estimators, the magnitude of the bias increases as censoring increases until a particular censoring level, then declines. That particular censoring level falls in the range 60% to 80%. Above that censoring level the bias decreases as censoring increases, and decreases much more rapidly for the corrected estimator than for the K–M estimator.
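A small Monte Carlo sketch of this Koziol–Green setting is given below. It is an illustration under stated assumptions, not the code used for Table 3: the function names are hypothetical, far fewer runs are used than in the paper, and $\lambda$ is set to $P/(1-P)$ so that $\mathrm{pr}(C < T) = \lambda/(1+\lambda)$ equals the target censoring proportion $P$ when $T \sim Exp(1)$.

```python
import random


def km_mean(times, deltas):
    """K-M mean lifetime: sum of w_i * Y_(i) with the weights of Eq. (3)."""
    # Sort by time; at ties, uncensored observations come before censored ones.
    pairs = sorted(zip(times, deltas), key=lambda p: (p[0], -p[1]))
    n = len(pairs)
    prod, total = 1.0, 0.0
    for i, (y, d) in enumerate(pairs, start=1):
        total += prod * d / (n - i + 1) * y
        if d == 1:
            prod *= (n - i) / (n - i + 1)
    return total


def simulate_km_mean_bias(n, p_cens, reps=2000, seed=1):
    """Monte Carlo bias of the K-M mean under T ~ Exp(1), C ~ Exp(lambda)."""
    rng = random.Random(seed)
    lam = p_cens / (1.0 - p_cens)  # gives pr(C < T) = p_cens
    total = 0.0
    for _ in range(reps):
        t = [rng.expovariate(1.0) for _ in range(n)]
        c = [rng.expovariate(lam) for _ in range(n)]
        y = [min(a, b) for a, b in zip(t, c)]
        d = [1 if a <= b else 0 for a, b in zip(t, c)]
        total += km_mean(y, d)
    return total / reps - 1.0  # the true mean of Exp(1) is 1
```

Calling, for example, `simulate_km_mean_bias(30, 0.5)` gives a negative bias whose size can be compared with the corresponding uncorrected entry of Table 3; exact agreement is not expected with so few runs.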
In addition, the bias for the corrected estimator at $P_\% = 90$ censoring is positive for all sample sizes. This behaviour at high censoring levels does not appear in Stute and Wang (1994), who investigated the bias only up to $P_\% = 66.7$, but it is easily seen from Table 2 that if censoring is 100%, then $\delta_{(n)} = 0$, so the bias is 0. A similar trend is observed for the variance of the bias of the two estimators.

Table 4: Simulation results based on the Koziol–Green model for the variance of the bias of the four K–M mean lifetime estimators $\hat S^{KM}_{mean}$, $\tilde S^{KM}_{mean}$, $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$.

        Variance of bias of $\hat S^{KM}_{mean}$   Variance of bias of $\tilde S^{KM}_{mean}$
P%      n=30    n=50    n=100   n=150     n=30    n=50    n=100   n=150
10      0.004   0.002   0.001   0.000     0.010   0.005   0.002   0.001
20      0.008   0.006   0.003   0.002     0.019   0.013   0.006   0.004
30      0.016   0.012   0.006   0.004     0.037   0.027   0.014   0.010
40      0.024   0.019   0.012   0.009     0.056   0.045   0.028   0.021
50      0.034   0.028   0.021   0.016     0.082   0.064   0.049   0.037
60      0.041   0.037   0.029   0.025     0.096   0.088   0.067   0.058
70      0.040   0.038   0.034   0.032     0.092   0.090   0.081   0.074
80      0.030   0.030   0.029   0.029     0.071   0.074   0.069   0.071
90      0.011   0.011   0.013   0.013     0.034   0.032   0.034   0.035

        Variance of bias of $\hat S^{*KM}_{mean}$  Variance of bias of $\tilde S^{*KM}_{mean}$
P%      n=30    n=50    n=100   n=150     n=30    n=50    n=100   n=150
10      0.021   0.008   0.003   0.001     0.034   0.014   0.004   0.002
20      0.031   0.019   0.007   0.004     0.053   0.032   0.012   0.008
30      0.056   0.035   0.015   0.011     0.095   0.061   0.027   0.020
40      0.078   0.056   0.031   0.022     0.135   0.099   0.057   0.039
50      0.116   0.077   0.054   0.039     0.201   0.136   0.098   0.070
60      0.117   0.100   0.073   0.059     0.209   0.181   0.132   0.108
70      0.101   0.092   0.082   0.073     0.183   0.171   0.151   0.135
80      0.063   0.064   0.061   0.063     0.121   0.128   0.118   0.123
90      0.018   0.019   0.022   0.023     0.045   0.045   0.049   0.051

We have also computed the bias of the jackknife estimate and its variance based on both the modified estimators $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$. The modification is based on the predicted difference quantity approach, where $\tilde Y_{(n)}$ is replaced by $Y_{(n)} + \nu$ ($W_{\nu}$ in Table 1), as discussed in Khan and Shaw (2013b). The bias and its variance are shown in Tables 3 and 4 respectively (the second sub-table of each). The results demonstrate that under the modified approach, slightly larger bias and variance estimates are obtained. Their overall trends are similar to those of the original estimators.

In the second simulation, survival times are generated from four skewed distributions, and censoring times independently from other specified distributions, as listed in Table 5. Datasets are generated randomly subject to the restriction $\delta_{(n-1)} = 0$, and, for the original jackknife formula, with the additional restriction $\delta_{(n)} = 1$.

Table 5: The failure time distributions with their corresponding censoring distributions.
Failure time distributions                                                          Censoring distributions
Log-normal (1.1, 1): $\frac{1}{t\sqrt{2\pi}} \exp\{-(\log t - 1.1)^2/2\}$           Uniform: $U(a, 2a)$
Exponential (0.2): $0.2 \exp(-0.2t)$                                                Exponential: $Exp(\lambda)$
Gamma (4, 1): $\frac{1}{6} t^{3} \exp(-t)$                                          Uniform: $U(a, 2a)$
Weibull (3.39, 3): …                                                                Uniform: $U(a, 2a)$

In the case when $T \sim Exp(0.2)$ and $C \sim Exp(\lambda)$, for a chosen level of censoring percentage $P_\%$ it follows that $Y$ and $\delta$ are independent with $P_\%/100 = \mathrm{pr}(\delta = 0) = \lambda/(0.2 + \lambda)$. For the other lifetime distributions, the censoring time is drawn from the Uniform distribution over the range $[a, 2a]$.

We use four sample sizes, $n = 30, 50, 100, 150$. The bias and the variance of the bias from 10,000 simulated datasets are shown in Fig. 1 and 2 (both shown in the supplementary document) respectively.
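The censoring parameters for a target censoring proportion can be obtained as sketched below. The exponential case inverts $P = \lambda/(0.2 + \lambda)$ directly. For uniform censoring the sketch assumes the range $[a, 2a]$ and a Gamma(4, 1) lifetime, and solves $\mathrm{pr}(C < T) = \frac{1}{a}\int_a^{2a} S_T(c)\, dc = P$ for $a$ by numerical integration and bisection. The function names and the numerical scheme are assumptions for illustration; the paper states only that $a$ is chosen analytically.

```python
import math


def exp_censoring_rate(p, lifetime_rate=0.2):
    """Solve p = lambda / (lifetime_rate + lambda) for lambda."""
    return lifetime_rate * p / (1.0 - p)


def gamma41_survival(t):
    """Survival function of Gamma(shape 4, scale 1): exp(-t) * sum_{k<4} t^k / k!."""
    return math.exp(-t) * (1.0 + t + t * t / 2.0 + t ** 3 / 6.0)


def uniform_censoring_prob(a, survival, steps=2000):
    """pr(C < T) for C ~ U(a, 2a): average of S_T over [a, 2a] (midpoint rule)."""
    h = a / steps
    return sum(survival(a + (k + 0.5) * h) for k in range(steps)) * h / a


def solve_uniform_a(p, survival, lo=1e-6, hi=100.0, tol=1e-10):
    """Bisection for a; the censoring probability decreases as a increases."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if uniform_censoring_prob(mid, survival) > p:
            lo = mid  # too much censoring: push the censoring window to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


lam_50 = exp_censoring_rate(0.5)                 # rate giving 50% censoring for Exp(0.2)
a_30 = solve_uniform_a(0.3, gamma41_survival)    # a giving 30% censoring for Gamma(4, 1)
```

Bisection is used here because the censoring probability is monotone in $a$, so no derivative or starting value is needed; any other root finder would serve equally well.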
[Figure 1: The bias of the K–M mean lifetime estimators $\hat S^{KM}_{mean}$, $\tilde S^{KM}_{mean}$, $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$ in 10000 simulation runs. Panels: (a) $T \sim LN(1.1, 1)$ & $C \sim U(a, 2a)$; (b) $T \sim EX(0.2)$ & $C \sim EX(\lambda)$; (c) $T \sim G(4, 1)$ & $C \sim U(a, 2a)$; (d) $T \sim WB(3.39, 3)$ & $C \sim U(a, 2a)$. Each panel plots bias against censoring percentage for the estimator, corrected estimator, modified estimator and modified corrected estimator, with curves for n = 30, 50, 100, 150.]
[Figure 2: The variance of the bias of the K–M mean lifetime estimators $\hat S^{KM}_{mean}$, $\tilde S^{KM}_{mean}$, $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$ in 10000 simulation runs. Panels: (a) $T \sim LN(1.1, 1)$ & $C \sim U(a, 2a)$; (b) $T \sim EX(0.2)$ & $C \sim EX(\lambda)$; (c) $T \sim G(4, 1)$ & $C \sim U(a, 2a)$; (d) $T \sim WB(3.39, 3)$ & $C \sim U(a, 2a)$. Each panel plots variance of bias against censoring percentage for the estimator, corrected estimator, modified estimator and modified corrected estimator, with curves for n = 30, 50, 100, 150.]

… $W_{\nu}$ of Table 1, described fully in Khan and Shaw (2013b).

Fig. 1(a), 1(d) and 2(a), 2(d) reveal similar results to our large simulation based on the Koziol–Green model. For example, given the modification, the bias estimate is bound to be higher. This seems to be true also for the variance estimate. In addition, we find that for both actual and modified estimators the trend in bias differs for different censoring levels, but they behave similarly under different lifetime distributions (see Fig. 1). The relationship between bias and censoring level varies substantially between the distributions and the sample sizes. For a log-normal distribution, the bias for the estimators other than the corrected estimators tends to increase as $P_\%$ increases until 50%. The maximum bias for the other distributions investigated occurs between 60% and 80% censoring. Under the Exponential lifetime distribution the bias behaves very similarly to that of the Koziol–Green proportional hazards model. Whether original or modified, the corrected estimators tend to overestimate at the higher censoring levels (i.e., the bias becomes positive under heavy censoring).

The variance of the bias (Fig. 2) also differs according to sample size and censoring level. The variance generally reaches a maximum at some censoring level between 50% and 70%, then declines. However, for the corrected estimators under a log-normal distribution the variance decreases consistently as censoring increases (see Fig. 2(a)).
The third simulation study is conducted to investigate how the modified estimators behave relative to the original estimators when lifetimes are modeled as an AFT model that has the form
\[
Z_i = \alpha + X_i^{T} \beta + \sigma \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1), \quad i = 1, \cdots, n, \tag{9}
\]
where $Z_i = \log(T_i)$, $X$ is the covariate vector, $\alpha$ is the intercept term, $\beta$ is the unknown $p \times 1$ vector … $U(a, 2a)$, where $a$ is chosen analytically in the same way as in the previous example. We consider five covariates $X = (X_1, X_2, X_3, X_4, X_5)$, each of which is generated using $U(0, \ldots)$ … $P_\%$ points, and three sample sizes $n = 30$, 50 and 100. The coefficients of the covariates are chosen as $\beta_j = j + 1$ where $j = 1, \cdots, 5$, and $\sigma = 1$. Of the five proposed imputation approaches of Table 1 and Khan and Shaw (2013b), the resampling-based conditional mean approach ($W_{\tau^{*}_{m}}$) is found to have the least bias, and the results for $W_{\tau^{*}_{m}}$ from 10,000 simulation runs are shown in Fig. 3.
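Data from the log-normal AFT model in Eq. (9) could be generated along the following lines. Several inputs are assumptions made only so the sketch runs: the intercept `alpha`, the upper limit `x_upper` of the uniform covariate distribution, and the censoring parameter `a` are hypothetical placeholders (the corresponding values are not recoverable from the text above), while $\beta_j = j + 1$ and $\sigma = 1$ follow the description.

```python
import math
import random


def simulate_aft(n, a, beta=(2.0, 3.0, 4.0, 5.0, 6.0), alpha=0.0, sigma=1.0,
                 x_upper=1.0, seed=None):
    """Generate (covariates, observed time, indicator) triples from Eq. (9).

    alpha, x_upper and a are illustrative placeholders, not values from the paper.
    Censoring times are drawn from U(a, 2a), independently of the lifetimes.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [rng.uniform(0.0, x_upper) for _ in beta]
        z = alpha + sum(b * v for b, v in zip(beta, x)) + sigma * rng.gauss(0.0, 1.0)
        t = math.exp(z)          # lifetime T_i, since Z_i = log(T_i)
        c = rng.uniform(a, 2.0 * a)
        data.append((x, min(t, c), 1 if t <= c else 0))
    return data


def censoring_proportion(data):
    """Observed fraction of censored lifetimes in a simulated dataset."""
    return sum(1 - d for _, _, d in data) / len(data)
```

In practice `a` would be tuned (for instance by bisection on `censoring_proportion` over a large simulated sample) until the target $P_\%$ is reached; that tuning step is left out here.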
4. Discussion
[Figure 3: Simulation results for the third simulated example for all four K–M mean lifetime estimators $\hat S^{KM}_{mean}$, $\tilde S^{KM}_{mean}$, $\hat S^{*KM}_{mean}$ and $\tilde S^{*KM}_{mean}$ under the log-normal AFT model at different censoring points. Lowess smooths are superimposed. Panels: (a) Bias; (b) Variance of bias. Each panel plots the quantity against censoring percentage for the estimator, corrected estimator, modified estimator and modified corrected estimator, with curves for n = 30, 50, 100.]

The behavior of bias for the K–M lifetime estimators is influenced by many factors in practice, for example the nature of the distributions to be used for lifetimes, the censoring rate, the sample size, whether the lifetimes are modeled with covariates, and so on. To explore the behaviour of the jackknife bias for K–M estimators under various conditions (in particular, censoring levels) a large simulation is required. Our simulation studies go beyond the small simulation study in Stute and Wang (1994) and show clear differences from many of their results. In particular, the bias (Eq. (4) and (7)) will be 0 at 0% censoring and increases as the censoring level increases. However, the bias will also tend to 0 as the censoring level tends to 100% (because the bias estimate is 0 when $\delta_{(n)} = 0$). Therefore, as shown in the figures, the bias increases up to a particular censoring level (typically 50%–…) … ($\delta_{(n)} = 0$, $\delta_{(n-1)} = 0$) to contribute to the bias calculation. So our modifications reduce the original conditions needed for jackknife estimation of bias ($\delta_{(n-1)} = 0$, $\delta_{(n)} = 1$) to the single condition $\delta_{(n-1)} = 0$. The modified jackknife estimate also prevents the K–M estimator from being badly underestimated by the jackknife estimate when the largest observation is censored. For calculating the bias and its variance with the proposed and existing jackknifing procedures we have provided a publicly available package, jackknifeKME (Khan and Shaw, 2013a), implemented in the R programming system.
5. Acknowledgements
The first author is grateful to the Centre for Research in StatisticalMethodology (CRiSM), Department of Statistics, University of Warwick, UKfor offering research funding for his PhD study.