Bias corrected estimators for proportion of true null hypotheses under exponential model: Application of adaptive FDR-controlling in segmented failure data
Aniket Biswas, Department of Statistics, Dibrugarh University, Dibrugarh, Assam, India-786004. Email: [email protected]
Gaurangadeb Chattopadhyay, Department of Statistics, University of Calcutta, Kolkata, West Bengal, India-700019. Email: [email protected]
Aditya Chatterjee, Department of Statistics, University of Calcutta, Kolkata, West Bengal, India-700019. Email: [email protected]
Abstract
Two recently introduced model based bias corrected estimators for the proportion of true null hypotheses (π₀) in a multiple hypotheses testing scenario are restructured for exponentially distributed observations available for each of the hypotheses. Based on stochastic ordering, a new motivation behind the formulation of some related estimators of π₀ is given. The reduction of bias for the model based estimators is theoretically justified, and algorithms for computing the estimators are presented. The estimators are also used to formulate a popular adaptive multiple testing procedure. An extensive numerical study supports the superiority of the bias corrected estimators. We also point out the adverse effect of using the model based bias correction method without proper assessment of the underlying distribution. A case study with a synthetic dataset arising in reliability and warranty studies demonstrates the applicability of the procedure under a non-Gaussian set-up. The results obtained are in line with the intuition and experience of the subject expert. The concluding discussion also indicates the future scope of the study.

Keywords: Multiple hypotheses testing, Adaptive Benjamini-Hochberg algorithm, Mean mileage to failure, p-value.

MS 2010 classification: 62F99, 62P30, 62N99.
1 Introduction

Let us consider the empirical Bayesian set-up given in Storey (2002), where m similar but independent hypotheses are to be tested, viz. H₁, H₂, ..., H_m. For Hᵢ = 1 the i-th null hypothesis is true, and for Hᵢ = 0 it is false, for any i ∈ {1, 2, ..., m}. Thus, the Hᵢ's are Bernoulli random variables with success probability π₀ ∈ (0, 1). Let m₀ be the number of true null hypotheses. Thus, m₀ = Σ_{i=1}^m Hᵢ is a binomial random variable with index m and parameter π₀. Clearly, the Hᵢ's, and hence m₀, remain latent and can never be realized in a given multiple testing scenario. As in the case of a single hypothesis testing problem, the test statistics T₁, T₂, ..., T_m respectively for H₁, H₂, ..., H_m may be observed. For F₀ being the common distribution of Tᵢ | Hᵢ = 1 and F₁ being the same for Tᵢ | Hᵢ = 0, a two-component mixture model for Tᵢ is

Tᵢ ∼ π₀F₀ + (1 − π₀)F₁ for all i = 1, 2, ..., m. (1.1)

Thus, π₀ may be thought of as the mixing proportion of the null test statistics with the non-null test statistics when multiple tests are performed. In the existing literature, p-values are considered as test statistics, since their use ensures a similar nature of the critical region irrespective of the nature of the hypotheses framed. Usually, a little abuse of notation is made in denoting a p-value by p, irrespective of whether it is a random variable or a realization of it. The distinction of usage ought to be understood as the situation demands. The marginal density function of the p-value (Langaas et al. 2005) is

f(p) = π₀f₀(p) + (1 − π₀)f₁(p) for 0 < p < 1, (1.2)

where f₀ and f₁ are the two p-value densities under the null and alternative hypotheses, respectively. When the tested null is simple and the corresponding test statistic is absolutely continuous, f₀(p) is simply 1, the density function of a uniform random variable over (0, 1), and the p-value under the alternative hypothesis is stochastically smaller than the uniform variate. In addition, the density estimation based approaches for estimating π₀ impose certain restrictions on f₁ (Langaas et al. 2005; Guan et al. 2008; Ostrovnaya and Nicolae 2012). Storey's estimator (Storey 2002) is constructed on the basis of a tuning parameter λ ∈ (0, 1) such that f₁(p) = 0 for p > λ. This assumption introduces a conservative bias in the estimator that can be corrected, or in practice reduced, as discussed in Cheng et al. (2015). The set-up given therein for the applicability of the Gaussian model based bias correction is discussed in section 2. Biswas (2019) has recently proposed an alternative model based bias corrected estimator for π₀ under the same set-up, together with a comparative performance study of both estimators on simulated microarray datasets. Microarray datasets attract the limelight in demonstrating applications of multiple testing procedures owing to their obvious high-dimensional nature and the associated decision-making questions.

The current work deals with a segmented failure dataset, where the failure time, or some similar entity, of a particular component is available for a number of units, but the units are operated or tested under different conditions that may vary over space and time. Thus, the dataset is divided into several segments and observations are available for each segment. The number of observations per segment (in the order of tens or hundreds) might be much less than the number of segments (in the order of hundreds or thousands), as the segmentation is done on the basis of time and space, among other things. Thus the situation is quite similar to that of microarray datasets, where thousands of genes are tested to identify differentially expressed genes based on the gene expression levels of two small groups of subjects, viz. a treatment group and a control group. For a segmented failure dataset, a similar kind of question may be raised regarding the identification of segment(s) for which the failure pattern of the particular component is strikingly different (much worse or better) from the average. To answer this question, appropriate hypotheses for each segment are framed and tested simultaneously.
While testing a large number of hypotheses, control over the False Discovery Rate (FDR) is desirable, and the classical Benjamini-Hochberg algorithm has the ability to do so (Benjamini and Hochberg 1995). A reliable estimate of π₀ may be used for eliminating the conservativeness present in the Benjamini-Hochberg approach (Benjamini and Yekutieli 2000). The empirical Bayesian interpretation of FDR, and controlling it by estimating it for a fixed rejection region, also requires an estimate of π₀ (Storey 2002). Craiu and Sun (2008) justify the equivalence of Storey's q-value approach with the adaptive Benjamini-Hochberg algorithm. The authors emphasized that both the adaptive procedure and Storey's approach require a good approximation of π₀.

A Gaussian model assumption for failure data is inappropriate, especially when the sample size corresponding to each segment, or equivalently each test, is small, and the exponential distribution is a reasonable primary model choice. Under this set-up, we modify both the estimators proposed in Cheng et al. (2015) and Biswas (2019) and find that these model based estimators are more efficient than the existing π₀-estimators in practice. Application of the adaptive Benjamini-Hochberg procedure has the ability to list the segments that differ significantly with respect to such time-to-event or equivalent entities of a certain component in our case study.

The remaining part of the article is structured as follows. In section 2 we reproduce Storey's estimator and the recently introduced bias corrected estimators from a stochastic ordering approach, which ties them in a yarn and may inspire further work in a similar line. The next section is devoted to different testing scenarios and useful properties of the respective non-null p-values. In section 4 we briefly revisit the estimation algorithms and discuss the adaptation of the π₀ estimates to the Benjamini-Hochberg algorithm. Section 5 deals with the performance comparison of the new estimators with existing ones through an extensive simulation experiment. In section 6, a real-life synthetic segmented failure dataset is presented, validated for the applicability of the proposed methods, and analysed to demonstrate the superior performance of the adaptive algorithm with the new estimators, along with proper justification of the findings. We conclude the article with a mention of a few limitations of the present work and a glimpse of the future direction of the study.

2 The estimators of π₀

Let p denote a p-value corresponding to a simple null hypothesis testing problem with a continuous test statistic. Thus, p has the support (0, 1). Consider a random variable V on the same support (0, 1) with distribution function G. Then,

P(p ≥ V) = ∫₀¹ f(p)G(p) dp. (2.3)

In the following subsections we take different choices for G and motivate the different estimators for π₀ mentioned in section 1.

2.1 Storey's estimator

Consider V to be degenerate at some λ ∈ (0, 1), i.e.

G(v) = 1 if v ≥ λ, and 0 if v < λ. (2.4)

Putting (1.2) and (2.4) in (2.3), we obtain

F̄(λ) = π₀(1 − λ) + (1 − π₀)Q(λ), (2.5)

where F is the distribution function of p, F̄ = 1 − F, and Q is the survival function of the non-null p-value. Assume:

A1: For an appropriate choice of λ, Q(λ) = P(p > λ | H = 0) = 0, i.e. the probability of a non-null p-value being greater than λ equals zero (Storey 2002).

When the parameter of interest under the alternative hypothesis is substantially far from the value specified under the null hypothesis, or the sample size is moderate to large, the p-value tends to be smaller for consistent tests. Hence, even for a moderate choice of λ, the probability of a p-value under a false null exceeding λ vanishes. This is a reasonable but crucial assumption, in the sense that violation of the assumptions regarding the true value of the parameter of interest and the sample size may not result in Q(λ) = 0. Thus, applying A1 in (2.5) we get

π₀ = F̄(λ)/(1 − λ). (2.6)

Let p₁, p₂, ..., p_m be the p-values corresponding to the m hypotheses tested, or equivalently m realizations of p. Denote by W(λ) = Σ_{i=1}^m I(pᵢ > λ) (I denoting the indicator function) the number of p-values greater than λ. Putting the plug-in estimator of F̄(λ), i.e. W(λ)/m, in (2.6), an estimator for π₀ depending upon the choice of λ may be suggested as

π̂(λ) = W(λ)/(m(1 − λ)). (2.7)

For a given dataset, two different choices of λ would yield two different estimates, and thus an optimum choice of λ for a given dataset is necessary. For a subjectively chosen set of possible values λ ∈ Λ, where Λ = {0, 0.05, 0.10, ..., 0.95}, a bootstrap routine is given in Storey (2002) and Storey et al. (2004) to approximate the best λ. Thus, Storey's bootstrap estimator is π̂_B = π̂(λ_best). In Storey and Tibshirani (2003), a natural cubic spline is fitted to the (λ, π̂(λ)) curve for smoothing, and the evaluated value of the fit at λ = 1 (as motivated in Corollary 1 of Storey (2002)) is taken as the final estimate, which we denote by π̂_P.

For a small choice of λ in π̂(λ), the bias of the estimator is large while the variance is small. The situation is exactly opposite for large λ. This was first noted by Jiang and Doerge (2008), who suggested the use of multiple λ's instead of a single best choice, in some sense. For the time being, assume a fixed set S_λ = {(λ₁, λ₂, ..., λ_k) : 0 < λ₁ < λ₂ < ... < λ_k < 1} for a fixed k and equal widths (λ_{i+1} − λᵢ) for i = 1, 2, ..., k − 1. Jiang and Doerge (2008) suggest π̂_A = (1/k) Σ_{i=1}^k π̂(λᵢ) as an appropriate estimator for π₀. The authors have also suggested a change-point based algorithm to select S_λ.

2.2 Bias corrected estimator based on the tail proportion of p-values

Without the assumption A1, from (2.5) we get

π₀ = (F̄(λ) − Q(λ)) / ((1 − λ) − Q(λ)) (2.8)

for fixed λ. Cheng et al. (2015) obtained (2.8) from a somewhat different motivation. Substituting the plug-in estimator of F̄(λ) has already been discussed in subsection 2.1. For estimating Q(λ), the following assumptions are necessary.

• A2: The availability of a common test for all the m hypotheses.
• A3: The data-arrays used for each test are generated from a known parametric family.
• A4: The closed form distributions of the test-statistics under the null are of a known family, enabling the calculation of the exact p-values.
• A5: The distributions of the non-null test-statistics, and hence the non-null p-values, are labelled by unknown effect sizes, which are different for each test.

A2 is generally true for microarray experiments and is also appropriate for the present set-up. Cheng et al. (2015) assumed normality for each expression level, such that A3 is valid.
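Before specializing to the exponential model, the threshold-based estimators of subsection 2.1 can be sketched in a few lines of Python (a minimal illustration; the function names and the toy p-values are ours, not from the paper):

```python
def storey_pi0(pvals, lam):
    # Storey's plug-in estimator (2.7): W(lambda) / (m (1 - lambda)), clipped to [0, 1]
    w = sum(1 for p in pvals if p > lam)  # W(lambda)
    return min(1.0, w / (len(pvals) * (1.0 - lam)))

def average_pi0(pvals, lambdas=tuple(j / 20 for j in range(1, 20))):
    # Jiang-Doerge style average of the plug-in estimator over a grid of thresholds
    return sum(storey_pi0(pvals, lam) for lam in lambdas) / len(lambdas)

# toy example: six small (alternative-looking) and two large p-values
pvals = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.6, 0.9]
est = storey_pi0(pvals, 0.5)  # W(0.5) = 2, so 2 / (8 * 0.5) = 0.5
```

The clipping to [0, 1] mirrors the boundary adjustment used later for the bias corrected estimators.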
Times to events, or their equivalent entities, for each segment are assumed to be exponentially distributed, thus satisfying A3. Under normality, the test-statistics for the usual single-sample or two-sample tests for the mean are normal under the null. In this work, the test-statistic for the single-sample test related to the exponential rate parameter is a χ² variate under the null, and for the two-sample problem the test-statistic is distributed as an F variate. Thus, A4 also holds good. As mentioned earlier, the test for Hᵢ is performed by Tᵢ, and we introduce the notation δᵢ to denote the effect size of the corresponding test. The non-null distribution of Tᵢ, and hence the non-null distribution of pᵢ, is labelled by δᵢ, i = 1, 2, ..., m. Hung et al. (1997) have discussed properties of non-null p-values, where the non-null distribution of the p-value for the Z-test has been explored. For the single-sample and two-sample t-tests, a similar discussion is available in section 3 of Cheng et al. (2015). We discuss such properties of the non-null p-value for single and two sample problems under the exponential set-up in section 3.

Let I = {1, 2, ..., m}. Also let T denote the set of indices corresponding to the originally true null hypotheses, i.e. T = {i ∈ I : Hᵢ = 1}. Thus the cardinality of T is m₀. Similarly, denote the set of originally false null hypotheses by F. Clearly, F = I − T, with cardinality m₁. Each null p has the same distribution, uniform over (0, 1), whereas the non-null p-values are different but belong to the same family. Let f_δ(p) denote the density of p with effect size δ. Then for all i ∈ F, Q_{δᵢ}(λ) = ∫_λ¹ f_{δᵢ}(p) dp, the probability of the i-th non-null p-value being greater than λ. Define Q*(λ) = (1/m₁) Σ_{i∈F} Q_{δᵢ}(λ), the average probability of the non-null p-values exceeding λ. By the strong law of large numbers, Q*(λ) → Q(λ) almost surely as m₁ → ∞. To estimate Q(λ), the individual δᵢ's are estimated by δ̂ᵢ, i ∈ F. In fact, the δ̂ᵢ's are strongly consistent for δᵢ for each i ∈ F. The estimation of δ under the different testing problems is discussed in section 3. Each Q_{δᵢ}(λ) is continuous in δᵢ, and thus Q_{δ̂ᵢ}(λ) is strongly consistent for Q_{δᵢ}(λ). Thus, a strongly consistent estimator for Q(λ) is Q̃(λ) = (1/m₁) Σ_{i∈F} Q_{δ̂ᵢ}(λ). In practice, F is unknown and hence Q̃(λ) is unavailable. Assume Q̂(λ) to be a surrogate for Q̃(λ) such that Q̃(λ) ≥ Q̂(λ) with probability 1. The computation of Q̂(λ) is discussed in detail in section 4. Substituting the plug-in estimators for F̄(λ) and Q(λ) in (2.8), we get π̃_U(λ) (or π̂_U(λ)), the bias corrected estimator for π₀ with a fixed choice of λ. We now address the issues related to the reduction in bias and over-correction in the following result.

Result 1: With the set-up and notations introduced in subsection 2.2, for all λ ∈ (0, 1),
(a) For W(λ)/m ≤ (1 − λ), π̃_U(λ) ≤ π̂_U(λ) ≤ π̂(λ).
(b) π̃_U(λ) → π₀, almost surely.

Proof: Consider g(x) = (a − x)/(b − x). Note that g is non-increasing in x for a ≤ b. Let a = W(λ)/m and b = (1 − λ). Since 0 ≤ Q̂(λ) ≤ Q̃(λ), we have g(0) ≥ g(Q̂(λ)) ≥ g(Q̃(λ)), which proves (a). Now,

π̃_U(λ) = (W(λ)/m − Q̃(λ)) / ((1 − λ) − Q̃(λ)).

As δ̂ᵢ → δᵢ almost surely for all i = 1, 2, ..., m, Q̃(λ) → Q*(λ) almost surely. Thus,

π̃_U(λ) → (W(λ)/m − Q*(λ)) / ((1 − λ) − Q*(λ)) = π̃, say, almost surely.

W(λ)/m → F̄(λ) almost surely as m → ∞. As m → ∞, m₁ → ∞ for π₀ ∈ (0, 1), and thus Q*(λ) → Q(λ) almost surely as m → ∞. Hence π̃ → (F̄(λ) − Q(λ))/((1 − λ) − Q(λ)) = π₀ almost surely as m → ∞. □

The situations W(λ)/m ≤ (1 − λ) and Q̃(λ) ≤ (1 − λ) are quite usual in a multiple testing set-up, as the ratio W(λ)/(m(1 − λ)) is a reasonable estimate of π₀, and Q̃(λ) is a consistent estimate of Q(λ), which is obviously less than (1 − λ). If these do not hold good, π̂_U(λ) lies outside the parameter space, and then we take the estimate to be the nearest boundary point.

Result 1 combines claims written in section 2 and subsection 4.2 of Cheng et al. (2015). We have been able to prove Result 1 in a more direct way. Thus, the approach reduces the conservative bias of Storey's primary estimator while refraining from over-correction.

Λ = {0.05, 0.10, ..., 0.95} is taken as in Jiang and Doerge (2008) for a similar purpose (see subsection 2.2 in Cheng et al. (2015)), and we identify the following estimator as the bias and variance reduced estimator for π₀:

π̂_U = (1/|Λ|) Σ_{λⱼ∈Λ} min{1, max{0, π̂_U(λⱼ)}}.

2.3 Bias corrected estimator based on the expected p-value

Instead of taking V degenerate at some fixed λ, assume V ∼ Uniform(0, 1). Putting G(v) = v for v ∈ (0, 1) in (2.3), we get

P(p ≥ V) = ∫₀¹ p f(p) dp = E(p) = π₀/2 + (1 − π₀)e, (2.9)

since p | H = 1 ∼ Uniform(0, 1). Here we use e to denote the expectation of a non-null p-value: e = E(p | H = 0). From (2.9) we get

π₀ = (E(p) − e) / (0.5 − e). (2.10)

To estimate π₀, both E(p) and e are to be estimated. E(p) can be estimated by the mean of the observed p-values: p̄ = (1/m) Σ_{i=1}^m pᵢ. Define e* = (1/m₁) Σ_{i∈F} e_{δᵢ}, which converges almost surely to e as m₁ → ∞. e can be estimated by imitating the approach for estimating Q(λ) under assumptions A2-A5. The corresponding estimator for π₀ has recently been introduced in Biswas (2019), where the computation of e_{δᵢ} = E(pᵢ | i ∈ F) has been demonstrated for the single and two sample t-tests. Since each e_{δᵢ} is bounded and continuous in δᵢ, following the discussion in subsection 2.2, a strongly consistent estimator for e is ẽ = (1/m₁) Σ_{i∈F} e_{δ̂ᵢ}, which cannot be realized in practice for the obvious reason mentioned earlier, and hence π̃_E = (p̄ − ẽ)/(0.5 − ẽ) cannot be implemented. For ê being a surrogate of ẽ with ê ≤ ẽ almost surely, a workable estimator for π₀ is π̂_E = (p̄ − ê)/(0.5 − ê).

Result 2: With the set-up and notations introduced in subsection 2.3,
(a) For p̄ ≤ 0.5, π̃_E ≤ π̂_E.
(b) π̃_E → π₀, almost surely.

Proof: We consider g as in Result 1 and take a = p̄, b = 0.5. Since ê ≤ ẽ, g(ê) ≥ g(ẽ), which proves (a). Here,

π̃_E = (p̄ − ẽ)/(0.5 − ẽ).
As δ̂ᵢ → δᵢ almost surely for all i = 1, 2, ..., m, ẽ → e* almost surely. Thus,

π̃_E → (p̄ − e*)/(0.5 − e*) = π̃, say, almost surely.

Note that p̄ → E(p) almost surely as m → ∞. As m → ∞, m₁ → ∞, and thus e* → e almost surely as m → ∞. Hence, π̃ → (E(p) − e)/(0.5 − e) = π₀ almost surely as m → ∞. □

The situations p̄ ≤ 0.5 and ẽ ≤ 0.5 are quite usual in a multiple testing set-up, as p̄ is consistent for E(p), which is less than 0.5, and ẽ is consistent for e, which is also less than 0.5. If these do not hold good, π̂_E lies outside the parameter space, and then we take the estimator for π₀ as

π̂_E = min{1, max{0, (p̄ − ê)/(0.5 − ê)}}.

Both the model based bias corrected estimators are shown to have conservative bias for estimating π₀. In Cheng et al. (2015), π̂_U has been shown to outperform the robust estimators under reasonable model assumptions, whereas under similar situations π̂_E outperforms it in terms of mean square error, as studied empirically in Biswas (2019) through an extensive simulation study. Note that both estimators use an initial estimator for π₀, but the computation of π̂_E does not require flexible threshold tuning parameters, owing to the fact that it uses all the p-values. To rule out the possibility of the estimate taking a value outside the parameter space under very unusual situations, π̂_E is taken to be equal to the nearest boundary point when it lies outside the parameter space.

3 Properties of non-null p-values

To implement the bias corrected estimators π̂_U and π̂_E, appropriate estimates of the unknown quantities Q(λ) and e are needed. To get explicit expressions for these quantities, we need the probability density functions f_{δᵢ}(p) (for notational convenience we write this as f_δ(p) henceforth) for each non-null p-value with effect size δᵢ, i ∈ F. The subscript i in the effect sizes is not specified in this section for ease of notation. Thus, for the different testing scenarios we determine the probability density function f_δ(p), then Q_δ(λ) by integrating f_δ(p) from λ to 1, and finally obtain e_δ through the following results. As discussed in subsection 2.2, Q_δ(λ) for fixed λ and e_δ are continuous in δ under each of the testing problems considered here.

Result 3: Let X₁, X₂, ..., X_n be a random sample of size n from an exponential distribution with mean θ. Consider the following testing problem:

H₀: θ = 1 versus H₁: θ > 1. (3.11)

For the corresponding likelihood ratio test,
(a) δ = θ, and thus δ̂ = max{1, X̄};
(b) f_δ(p) = (1/δ) f_{χ²_{2n}}(χ²_{p,2n}/δ) / f_{χ²_{2n}}(χ²_{p,2n}) for 0 < p < 1;
(c) Q_δ(λ) = F_{χ²_{2n}}(χ²_{λ,2n}/δ) for 0 < λ < 1;
(d) e_δ = E_{X∼χ²_{2n}}[F_{χ²_{2n}}(X/δ)].

Here f_{χ²_ν}, F_{χ²_ν} and χ²_{p,ν} denote the probability density function, the distribution function and the upper-p point of the chi-square distribution with ν degrees of freedom, respectively.

Proof: The likelihood ratio test corresponding to the hypothesis in (3.11) uses the test-statistic T = 2Σ_{i=1}^n Xᵢ ∼ θχ²_{2n}. The effect size of the test is δ = θ. As E(T) = 2nδ, an unbiased estimator of δ is T/(2n) = X̄, the sample mean; restricting it to the alternative parameter space gives δ̂ = max{1, X̄}.

As we reject H₀ for larger observed values of T, the corresponding p-value is defined as p = P_{H₀}(χ²_{2n} > T) = 1 − F_{χ²_{2n}}(T), since under H₀, T ∼ χ²_{2n}. Therefore, p ∼ Uniform(0, 1) under H₀. Under H₁, T ∼ δχ²_{2n}, and therefore the density function of T, labelled by δ, is

f_δ(t) = (1/δ) f_{χ²_{2n}}(t/δ) for t > 0. (3.12)

From the relation between T and p, t = F⁻¹_{χ²_{2n}}(1 − p) = χ²_{p,2n}. The corresponding absolute Jacobian of the transformation is 1/f_{χ²_{2n}}(χ²_{p,2n}). Thus from (3.12), the density function of p labelled by δ is

f_δ(p) = (1/δ) f_{χ²_{2n}}(χ²_{p,2n}/δ) / f_{χ²_{2n}}(χ²_{p,2n}) for 0 < p < 1. (3.13)

For λ ∈ (0, 1), using (3.13) in the expression for the upper tail probability labelled by δ, we get

Q_δ(λ) = ∫_λ¹ (1/δ) f_{χ²_{2n}}(χ²_{p,2n}/δ) / f_{χ²_{2n}}(χ²_{p,2n}) dp = I, say.

By a change of variable from p to v such that v = (1/δ)χ²_{p,2n}, we get I = F_{χ²_{2n}}(χ²_{λ,2n}/δ), which proves the result in (c). For an explicit expression for the expected p-value under the false null, we apply

e_δ = ∫₀¹ Q_δ(λ) dλ = ∫₀¹ F_{χ²_{2n}}(χ²_{λ,2n}/δ) dλ.

By a change of variable from λ to v such that v = χ²_{λ,2n}, we get

e_δ = ∫₀^∞ F_{χ²_{2n}}(v/δ) f_{χ²_{2n}}(v) dv = E_{X∼χ²_{2n}}[F_{χ²_{2n}}(X/δ)]. □

Result 4: For X₁, X₂, ..., X_n a random sample from an exponential distribution with mean θ, consider the testing problem:

H₀: θ = 1 versus H₁: θ ≠ 1. (3.14)

For the corresponding likelihood ratio test,
(a) δ = θ, and thus δ̂ = X̄;
(b) Q_δ(λ) = F_{χ²_{2n}}(χ²_{λ/2,2n}/δ) − F_{χ²_{2n}}(χ²_{1−λ/2,2n}/δ) for 0 < λ < 1;
(c) e_δ = E_{X∼χ²_{2n}(µ,∞)}[F_{χ²_{2n}}(X/δ)] − E_{X∼χ²_{2n}(0,µ)}[F_{χ²_{2n}}(X/δ)].

Here χ²_ν(a, b) denotes the truncated chi-square distribution with ν degrees of freedom and region of truncation (a, b), and µ denotes the median of the χ²_{2n} distribution.

Proof: The corresponding likelihood ratio test uses the same test-statistic as in Result 3, and thus part (a) of Result 4 follows directly from part (a) of Result 3 (no restriction is needed under the two-sided alternative). For the next part, note that due to the two-sided alternative hypothesis, the corresponding p-value is defined through T as

p = 2 min{P(χ²_{2n} > T), P(χ²_{2n} < T)} = 2 min(p*, 1 − p*), say,

where p* = P(χ²_{2n} > T) is the p-value defined for the testing problem in (3.11). Thus from part (b) of Result 3 we have

f_δ(p*) = (1/δ) f_{χ²_{2n}}(χ²_{p*,2n}/δ) / f_{χ²_{2n}}(χ²_{p*,2n}). (3.15)

Now for any λ ∈ (0, 1),

Q_δ(λ) = P(p > λ) = P(λ/2 < p* < 1 − λ/2)
= ∫_{λ/2}^{1−λ/2} (1/δ) f_{χ²_{2n}}(χ²_{p*,2n}/δ) / f_{χ²_{2n}}(χ²_{p*,2n}) dp* [using (3.15)]
= ∫_{χ²_{1−λ/2,2n}/δ}^{χ²_{λ/2,2n}/δ} f_{χ²_{2n}}(v) dv, by taking v = (1/δ)χ²_{p*,2n},

which proves the result in (b). Further,

e_δ = ∫₀¹ Q_δ(λ) dλ = ∫₀¹ F_{χ²_{2n}}(χ²_{λ/2,2n}/δ) dλ − ∫₀¹ F_{χ²_{2n}}(χ²_{1−λ/2,2n}/δ) dλ = I₁ − I₂, say.

Consider the problem of evaluating the integral I₁. By a change of variable from λ to v such that v = χ²_{λ/2,2n}, we get

I₁ = 2 ∫_µ^∞ F_{χ²_{2n}}(v/δ) f_{χ²_{2n}}(v) dv = E_{X∼χ²_{2n}(µ,∞)}[F_{χ²_{2n}}(X/δ)].

Following the same steps, I₂ can also be evaluated, and thus the result in (c). □

Result 5: Let X₁, ..., X_{n₁} and Y₁, ..., Y_{n₂} be two independent random samples of sizes n₁ and n₂, respectively, from exponential distributions with means θ₁ and θ₂. Consider the testing problem

H₀: θ₂ = θ₁ versus H₁: θ₂ > θ₁. (3.16)

For the corresponding likelihood ratio test, we have the following.
(a) δ = θ₂/θ₁, and δ̂ = max{1, ((n₁ − 1)/n₂) Σ_{i=1}^{n₂} Yᵢ / Σ_{i=1}^{n₁} Xᵢ};
(b) f_δ(p) = (1/δ) f_{F_{2n₂,2n₁}}(F_{p,2n₂,2n₁}/δ) / f_{F_{2n₂,2n₁}}(F_{p,2n₂,2n₁}) for 0 < p < 1;
(c) Q_δ(λ) = F_{F_{2n₂,2n₁}}(F_{λ,2n₂,2n₁}/δ) for 0 < λ < 1;
(d) e_δ = E_{X∼F_{2n₂,2n₁}}[F_{F_{2n₂,2n₁}}(X/δ)].

Here f_{F_{ν₁,ν₂}}, F_{F_{ν₁,ν₂}} and F_{p,ν₁,ν₂} respectively denote the probability density function, the distribution function and the upper-p point of the F distribution with ν₁ and ν₂ degrees of freedom.

Proof: The likelihood ratio test corresponding to the hypothesis in (3.16) uses the test-statistic T = (n₁ Σ_{i=1}^{n₂} Yᵢ)/(n₂ Σ_{i=1}^{n₁} Xᵢ) ∼ (θ₂/θ₁) F_{2n₂,2n₁}. The effect size of the test is δ = θ₂/θ₁. As E(T) = δ[n₁/(n₁ − 1)], an unbiased estimator of δ is [(n₁ − 1)/n₁]T, and restricting it to the alternative space gives the estimator in (a). The rest of the proof follows from the proof of Result 3 with obvious changes. □
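For the one-sided problem of Result 3, Q_δ(λ) and e_δ can be evaluated numerically without special libraries, because the χ²_{2n} distribution function has a closed form for even degrees of freedom. The sketch below is our own illustration (not the paper's code): it inverts the distribution function by bisection and integrates e_δ = ∫₀¹ Q_δ(λ) dλ by a midpoint rule.

```python
import math

def chi2_cdf_even(x, df):
    # closed-form chi-square CDF for even df = 2n (an Erlang distribution)
    n = df // 2
    s = sum((x / 2.0) ** k / math.factorial(k) for k in range(n))
    return 1.0 - math.exp(-x / 2.0) * s

def chi2_upper_point(p, df, hi=1e4):
    # upper-p point: the x with P(X > x) = p, located by bisection
    lo = 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf_even(mid, df) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def Q_delta(lam, delta, n):
    # Result 3(c): P(non-null p-value > lambda) with effect size delta = theta
    return chi2_cdf_even(chi2_upper_point(lam, 2 * n) / delta, 2 * n)

def e_delta(delta, n, grid=400):
    # mean non-null p-value, e_delta = integral of Q_delta over (0, 1)
    return sum(Q_delta((j + 0.5) / grid, delta, n) for j in range(grid)) / grid
```

With δ = 1 (no effect), Q_δ(λ) reduces to 1 − λ and e_δ to 0.5, which is a convenient sanity check; larger effect sizes push both quantities down.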
Result 6: Consider X₁, ..., X_{n₁} and Y₁, ..., Y_{n₂} to be independent random samples of sizes n₁ and n₂ from exponential distributions with means θ₁ and θ₂, respectively. Consider the testing problem

H₀: θ₂ = θ₁ versus H₁: θ₂ ≠ θ₁. (3.17)

For the corresponding likelihood ratio test, we have
(a) δ = θ₂/θ₁, and δ̂ = ((n₁ − 1)/n₂) Σ_{i=1}^{n₂} Yᵢ / Σ_{i=1}^{n₁} Xᵢ;
(b) Q_δ(λ) = F_{F_{2n₂,2n₁}}(F_{λ/2,2n₂,2n₁}/δ) − F_{F_{2n₂,2n₁}}(F_{1−λ/2,2n₂,2n₁}/δ);
(c) e_δ = E_{X∼F_{2n₂,2n₁}(µ,∞)}[F_{F_{2n₂,2n₁}}(X/δ)] − E_{X∼F_{2n₂,2n₁}(0,µ)}[F_{F_{2n₂,2n₁}}(X/δ)].

The notation used for Result 5 remains relevant here. In addition, F_{ν₁,ν₂}(a, b) denotes the truncated F-distribution with degrees of freedom ν₁, ν₂ and region of truncation (a, b), and µ denotes the median of the F_{2n₂,2n₁} distribution.

Proof: For the testing problem in (3.17), the likelihood ratio test uses the same test-statistic as in Result 5. Since the critical region is two-sided, the corresponding p-value is defined as in Result 4. One can follow the steps elaborated in the proof of Result 4 and use Result 5 to prove Result 6. □

4 Estimation algorithms and the adaptive procedure

An algorithm for computing π̂_U under the normal model assumption is given in Cheng et al. (2015), and for π̂_E under the same set-up, see Biswas (2019). First, we reframe the algorithms under the current set-up to maintain readability and to make the proposed estimation methods readily available to practitioners. For the sake of brevity, we only consider the testing problem in Result 4 and use the corresponding non-null p-value properties here. For all four situations discussed here, the following algorithms can be implemented with obvious modifications.

Algorithm 1 (For computing π̂_U)

• For all i = 1, 2, ..., m, estimate δᵢ by δ̂ᵢ = X̄ᵢ.
• For all i = 1, 2, ..., m and for each λⱼ ∈ Λ, estimate the upper tail probability Q_{δᵢ}(λⱼ) by Q_{δ̂ᵢ}(λⱼ), given by

Q_{δ̂ᵢ}(λⱼ) = F_{χ²_{2nᵢ}}(χ²_{λⱼ/2,2nᵢ}/δ̂ᵢ) − F_{χ²_{2nᵢ}}(χ²_{1−λⱼ/2,2nᵢ}/δ̂ᵢ),

where nᵢ denotes the available sample size for testing the i-th hypothesis.
• Using an available estimator of π₀ as the initial estimator π̂_I, calculate d = [m × (1 − π̂_I)], where [·] denotes the usual box function. Arrange the Q_{δ̂ᵢ}(λⱼ)'s in increasing order and denote the i-th quantity in the list by Q̂_(i)(λⱼ). Thus a conservative estimator for Q(λⱼ) is

Q̂(λⱼ) = (1/d) Σ_{i=1}^d Q̂_(i)(λⱼ).

• Given Q̂(λⱼ) for all λⱼ ∈ Λ, calculate

π̂_U = (1/|Λ|) Σ_{λⱼ∈Λ} min{1, max{0, (W(λⱼ)/m − Q̂(λⱼ)) / ((1 − λⱼ) − Q̂(λⱼ))}}.

Algorithm 2 (For computing π̂_E)

• For all i = 1, 2, ..., m, estimate δᵢ by δ̂ᵢ = X̄ᵢ.
• For all i = 1, 2, ..., m, estimate the mean of the non-null p-value e_{δᵢ} by

ê_{δᵢ} = E_{X∼χ²_{2nᵢ}(µᵢ,∞)}[F_{χ²_{2nᵢ}}(X/δ̂ᵢ)] − E_{X∼χ²_{2nᵢ}(0,µᵢ)}[F_{χ²_{2nᵢ}}(X/δ̂ᵢ)],

where nᵢ denotes the available sample size for testing the i-th hypothesis and µᵢ denotes the median of the χ²_{2nᵢ} distribution.
• Using an available estimator of π₀ as the initial estimator π̂_I, calculate d = [m × (1 − π̂_I)], as before. Arrange the ê_{δᵢ}'s in increasing order and denote the i-th quantity in the list by ê_(i). Thus a conservative estimator for e is

ê = (1/d) Σ_{i=1}^d ê_(i).

• Given ê, calculate π̂_E = min{1, max{0, (p̄ − ê)/(0.5 − ê)}}.

Note 1: The role of π̂_I is important in obtaining Q̂(λ) and ê. For π̂_I ≥ π₀, observe that m₁ ≥ d. Clearly, Q̂(λ) ≤ Q̃(λ) and ê ≤ ẽ. Thus, π̂_U ≥ π̃_U and π̂_E ≥ π̃_E.

Note 2: For the implementation of both algorithms, we choose Storey's bootstrap estimator, given by π̂_B, as the initial estimator.
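The conservative ordering-and-averaging step shared by Algorithms 1 and 2, together with the final assembly of π̂_E, can be sketched as follows (a minimal illustration with our own naming; the per-test values ê_{δᵢ} are assumed to be available, e.g. from the expressions in Result 4):

```python
import math

def conservative_average(per_test_values, pi0_init):
    # d = [m (1 - pi0_init)]: the number of tests implied to be non-null by
    # the initial estimate; averaging the d smallest values keeps the
    # estimate conservative (cf. Note 1)
    m = len(per_test_values)
    d = math.floor(m * (1.0 - pi0_init))
    if d == 0:
        return 0.0  # no test deemed non-null
    return sum(sorted(per_test_values)[:d]) / d

def pi0_E(pvals, e_hats, pi0_init):
    # final step of Algorithm 2; assumes the averaged e_hat stays below 0.5
    e_hat = conservative_average(e_hats, pi0_init)
    p_bar = sum(pvals) / len(pvals)
    return min(1.0, max(0.0, (p_bar - e_hat) / (0.5 - e_hat)))
```

The same `conservative_average` applies unchanged to the Q_{δ̂ᵢ}(λⱼ) values in Algorithm 1, one call per λⱼ.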
This choice seems reasonable, albeit non-universal, and further research on this is warranted. The algorithms could also be implemented with other choices of π̂_I. The performance analysis of the bias corrected estimators under the current set-up would then require an extensive simulation study, starting with different choices of the initial estimator. In fact, the algorithms could in principle be iterated several times, each time with the estimate of π₀ from the previous iteration. Obviously, this technique becomes computation-intensive for all practical purposes. We refrain from addressing these issues, as they are beyond the scope of the current study.

It has already been mentioned in section 1 that the Benjamini-Hochberg procedure for controlling the false discovery rate is conservative. To understand this, we briefly discuss FDR and the algorithm for controlling it at a prefixed level q ∈ (0, 1). While testing m hypotheses simultaneously, let R be the total number of hypotheses rejected by the application of a certain multiple testing algorithm. From the entire set of rejected hypotheses, some hypotheses may be originally true. These are categorized as false discoveries, and let V denote the total number of such false discoveries. Then the false discovery proportion (FDP) is defined as

FDP = V/R if R > 0, and 0 if R = 0.

Note that, prior to the application of any algorithm, both V and R are random variables, and the expected value of the FDP is termed the False Discovery Rate (FDR). Let p_(1) ≤ p_(2) ≤ ... ≤ p_(m) be the ordered sequence of the available p-values. The Benjamini-Hochberg procedure identifies the largest k such that p_(k) ≤ (k/m)q and rejects all hypotheses with corresponding p-values less than p_(k), along with the hypothesis with p-value p_(k). This procedure is conservative, as its implementation ensures FDR = π₀q, where π₀ = m₀/m. To overcome this shortcoming, Craiu and Sun (2008) worked with the following adaptive Benjamini-Hochberg procedure, which uses an approximation of π₀.

Algorithm 3 (Implementing the adaptive BH procedure to control FDR at level q)

• Let the p-value corresponding to the problem of testing Hᵢ be pᵢ for i = 1, 2, ..., m. Arrange the available p-values in increasing order: p_(1), p_(2), ..., p_(m). Denote the corresponding hypotheses by H_(i), i = 1, 2, ..., m.
• Given the dataset, estimate π₀. Let the estimate be π̂.
• Compute the adjusted p-value corresponding to p_(i):

adj.p_(i) = min{ π̂ (m/j) p_(j) : j ≥ i } for all i = 1, 2, ..., m.

• For all i = 1, 2, ..., m, reject H_(i) if adj.p_(i) ≤ q.

Both the adaptive BH procedure and Storey's q-value approach are justified to be equivalent in Craiu and Sun (2008), where it is also emphasized that both approaches require a good approximation of π₀. Less conservative estimators for π₀ are in demand, since a closer approximation of π₀ brings superiority to the adaptive procedure by increasing the number of rejections while controlling FDR at level q, as evident from Algorithm 3. In the numerical study, we use the adjust.p( ) function (available in the cp4p library from Bioconductor) by Gianetto et al. (2019) for obtaining the adjusted p-values.

5 Simulation study
We have conducted an extensive simulation study to investigate the performance of the bias corrected estimators under different settings. The well-known and established estimators, apart from the proposed π̂_U and π̂_E, considered for performance comparison are listed below.

π̂_B : Storey's bootstrap estimator (discussed in subsection 2.1)
π̂_L : Convest estimator (Langaas et al. 2005)
π̂_A : Jiang and Doerge's average estimator (discussed in subsection 2.1)
π̂_P : Natural cubic spline smoothing based estimator (discussed in subsection 2.1)
π̂_H : Histogram based estimator (Nettleton et al. 2006)
π̂_D : A robust estimator of π (Pounds and Cheng 2006)
π̂_S : Sliding linear model based estimator (Wang et al. 2011)

We imitate a segmented time to event dataset to generate artificial datasets. For this purpose we choose m = 100, 500, 1000 and n = 30, 50 available observations. For fixed π = 0.1, 0.2, ..., 0.9, we calculate m_0 = [mπ] and take m_1 = m − m_0. We set the mean failure time under the null, θ_0, to unity. For m_0 randomly chosen indices from I = {1, 2, ..., m} we fix θ_i = θ_0 = 1, and for the remaining cells in the array θ we generate values through some stochastic mechanism ensuring that they are not equal to θ_0. We discuss this through the following two settings, although any other convenient settings might be used for the purpose.

• Uniform setting: For segments with better average lifetime, θ_i ~ Uniform(1, 1.5), and for segments with poor average lifetime, θ_i ~ Uniform(0.5, 1).
• Exponential setting: For segments with better average lifetime, θ_i is generated from a truncated exponential distribution (mean = 1) with support (1, 1.5), and for segments with poor average lifetime, the same is done with support (0.5, 1).

The performance of the estimators of π may vary with the proportion of better (or poor) non-null mean lifetimes. Thus we also replicate the experiment in each setting with different proportions of left and right non-null θ-values, given respectively by the allocation proportions 75% & 25%, 50% & 50%, and 25% & 75%.

After generating the array of parameters θ, we generate a sample of size n from the exponential distribution with mean θ_i for all i = 1, 2, ..., m. Thus a data matrix of order m × n is generated, where each row corresponds to observations from a particular segment, and out of them m_0 (fixed by the choice of π) segments originally have mean lifetime equal to θ_0 = 1. From each row of the data matrix we obtain a p-value by applying the appropriate test, and we construct a p-value array of length m to compute the bias corrected estimators from Algorithm 1 and Algorithm 2. The other estimators are computed using the estim.pi0 R-function (available in the cp4p library). Algorithm 3 also uses this array of p-values and an estimate of π to identify the significantly different segments with control over FDR.

5.2 Simulation results

Under the different set-ups mentioned in subsection 5.1, each experiment is repeated N = 1000 times and the estimators are compared through

MSE(π̂) = (1/N) Σ_{i=1}^{N} (π̂_i − π)²   and   Bias(π̂) = (1/N) Σ_{i=1}^{N} (π̂_i − π).

MSE and bias of each estimator under the uniform setting with 50% & 50% allocation are reported in Table 1 and Table 2, respectively. From Table 1, it can be pointed out that the bias corrected estimators beat the other estimators over a significant region of the parameter space (for small to moderate π), where π̂_U performs slightly better than π̂_E; overall, their performance may be considered approximately equivalent. Thus using the bias corrected estimators for small to moderate values of π brings significant improvement, while for larger values of π they remain a viable alternative. From Table 2, similar comments may be made. Additionally, we point out that π̂_U really reduces the bias of π̂_B over a significant portion of the parameter space. As expected, the mean squared error of the different estimators increases with increasing m/n ratio, while the relative performance of the proposed bias corrected estimators gets better as the same ratio increases. Observations under the other allocation proportions of non-null θ-values are similar, and we refrain from reporting those numerical results to keep the presentation concise. However, the gain from improved estimation of π needs to be elaborated. A precise estimate of π is used in the adaptive algorithm for identifying significant segments, as mentioned in section 4. We construct adaptive algorithms through Algorithm 3 with different estimators for π and compute the proportion of rejections for each, which may be considered as the power of a particular algorithm in the multiple testing set-up. In Figure 1 we compare the power of the non-adaptive Benjamini-Hochberg algorithm with its adaptive version using π̂_U as a function of π, and observe that for lower to moderate values of π the adaptive version results in a substantial gain in power, while only a marginal gain is observed for larger values of π. A similar comparison for the adaptive algorithm with different estimates of π is reported in Table 3.
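For concreteness, one replicate of the data-generating scheme of subsection 5.1 can be sketched as follows (an illustrative Python version of the uniform setting; the paper's study was run in R, and the function name, seed and exact uniform bounds here are assumptions):

```python
import numpy as np
from scipy import stats

def simulate_pvalues(m=100, n=50, pi=0.5, theta0=1.0, rng=None):
    """One replicate: m segments with n exponential lifetimes each.
    m0 = [m*pi] null segments have mean theta0; the remaining m1 segments are
    split between 'better' and 'poorer' mean lifetimes (uniform setting)."""
    rng = rng or np.random.default_rng(1)
    m0 = int(m * pi)
    m1 = m - m0
    theta = np.full(m, theta0)
    k = m1 // 2
    theta[m0:m0 + k] = rng.uniform(1.0, 1.5, k)            # better segments (assumed bounds)
    theta[m0 + k:] = rng.uniform(0.5, 1.0, m1 - k)         # poorer segments (assumed bounds)
    data = rng.exponential(theta[:, None], size=(m, n))    # m x n data matrix
    # exact test of an exponential mean: 2*sum(x)/theta0 ~ chi2(2n) under H0
    t = 2.0 * data.sum(axis=1) / theta0
    cdf = stats.chi2.cdf(t, 2 * n)
    return np.minimum(2.0 * np.minimum(cdf, 1.0 - cdf), 1.0)
```

Feeding the resulting p-value array to any of the π-estimators, and repeating the experiment N times, yields the MSE, bias and power comparisons reported above.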
From Table 3, it is evident that the bias corrected estimators outperform the others for lower to moderate values of π, where it really matters, as pointed out from Figure 1.

Table 1: MSE of the estimators (π̂_B, π̂_L, π̂_A, π̂_P, π̂_H, π̂_D, π̂_S, π̂_U, π̂_E) under the different simulation settings (m = 100, 500, 1000; n = 30, 50; π = 0.1, ..., 0.9).

Figure 1: Plot showing power of the non-adaptive BH algorithm (dashed line) and the adaptive BH algorithm with π̂_U (solid line) as functions of π under the different simulation settings (panels: m = 100, 500, 1000 crossed with n = 30, 50; axes: proportion of true nulls vs. power).

6 Case-study

For the case-study we have considered the real life synthetic data set used by Gupta et al.
(2014, 2017) in connection with reliability and warranty studies. The detailed description of the data is available there, and we report only the part relevant to the present study. The date of failure of a particular component of automobiles, along with the mileage at failure as reflected through the odometer readings, is available. Although the entire data set covers two disjoint geographical regions, as reported in Gupta et al. (2017), the failures may be further subdivided into seven sub-regions, termed zones. Owing to data confidentiality issues, let us number them from 1 to 7. We have considered failure data corresponding to a particular year, with the mileage figures of the failures in successive calendar months across the zones as the response variable. The twelve calendar months are recorded as JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV and DEC. Thus the entire dataset is inherently segmented by 7 different zones and 12 different months in a year. In other words, our synthetic dataset contains mileage at failure for 84 different month × zone segments. In total, a little fewer than 3000 component failures have been reported in the year considered, with varying numbers of warranty claims over the month × zone segments. Thus, on an average, around 35 failures are reported in each segment. In line with the discussion in section 1, here we are primarily interested in identifying the segments which have significantly better or poorer performance in terms of mileage at failure in comparison to a benchmark. Thus, appropriate hypotheses need to be framed and tested separately for each of the segments, making way for the application of adaptive FDR-controlling algorithms.

To validate our model assumption we perform the Kolmogorov-Smirnov test with empirical p-value for the exponential distribution (using the R-function ks.exp.test available in the exptest package) and find that, at level 0.05, exponentiality fails for only 18 out of 84 segments, whereas at level 0.01 there are only 7 rejections. For a visual display of fit, we present QQ-plots of some randomly chosen segments in Figure 2. As the sample sizes for most of the segments are moderate, we also check normality by applying the Shapiro-Wilk test (using the default R-function shapiro.test). At level 0.05, 59 out of 84 hypotheses get rejected, and at level 0.01 the number is 42. The first line of information justifies the applicability of the model based estimators for π discussed in this article, whereas the results from the normality test demonstrate the necessity of the modifications achieved through this work over the existing related works, usually done under the normal model.

Now we consider framing the appropriate hypotheses. We assume the mileages at failure for the i-th segment to be exponentially distributed with mean θ_i miles. Thus θ_i is the mean mileage to failure (MMTF) for the i-th segment, a quantity similar to mean time to failure (MTTF) in terms of the response variable 'mileage', for i = 1, 2, ..., 84. We consider, as an indicator of the benchmark, the MMTF of the entire dataset as our null hypothesis point, approximately given by θ_0 = 10973 miles. This value as a benchmark seems justified, as the warranty mileage limit for the database is 36000 miles and it is well known that such failure data are usually positively skewed. According to the research question, we then simultaneously test the following hypotheses:

H_0i : θ_i = θ_0 versus H_1i : θ_i ≠ θ_0 for i = 1, 2, ..., 84.   (6.18)

The two-sided choice of the alternative hypotheses at all the segments needs clarification. In the absence of any prior knowledge about the functioning of the component, it is not possible to mark any segment as better/worse than the overall benchmark in terms of MMTF. As a result, to be on the safe side, we have suggested the alternative hypotheses at all segments to be two-sided. This is very common in multiple testing situations. As an example, in microarray data analysis, the samples used as a reference are called control samples. The other samples, exhibiting different phenotypic status, are called treated samples. The gene expression levels among these groups may be different. To identify whether a gene is differentially expressed or not, we fix a two-sided alternative (Chen et al. 2007).

Likelihood ratio tests are performed for each of the hypotheses after scaling the original observations by θ_0, maintaining equivalence of the tests, and the corresponding p-values along with effect sizes for each test are stored for further use. A file auto details.txt, available as supplementary material, provides details under the following heads:

• segment : This column provides the serial number of the segment, 1 to 84, such that segment i is for the i-th month of zone 1 for i = 1(1)12, segment 12 + i is for the i-th month of zone 2 for i = 1(1)12, and so on for the 7 zones in the order mentioned in the first paragraph of this section.
[Figure 2 panels: Zone 1/May, Zone 2/July, Zone 3/April, Zone 4/September, Zone 5/June, Zone 6/June; axes: theoretical quantile vs. sample quantile]
Figure 2: QQ-plots for some randomly chosen segments from the synthetic data, showcasing the exponential fit.

• n : Provides the available sample size for each segment.
• pval : Provides the p-value obtained from the common likelihood ratio test performed for each segment.
• del : Provides the maximum likelihood estimates of the effect sizes corresponding to each test.

These arrays of values can be readily fed into Algorithms 1, 2 and 3 to get π̂_U, π̂_E and the list of rejected hypotheses when the adaptive FDR-controlling algorithm is applied with different π-estimates. The estimates of π along with the corresponding lists of rejected hypotheses using the other estimators already mentioned in this article are also reported. The estimated π-values using the different estimators are reported in Table 4.

Table 4: Estimates of π using the different estimators (π̂_B, π̂_L, π̂_A, π̂_P, π̂_H, π̂_D, π̂_S, π̂_U, π̂_E) under the exponential model assumption for the synthetic data.

In Table 5 we indicate the segments that are found to be significantly different from the average in terms of the mean mileage at failure of the designated component, when the adaptive BH algorithm with the different π-estimators and the non-adaptive (N/A) BH algorithm are applied to control FDR at levels q = 0.05 and 0.1. For visual display, we plot in Figure 3 the adjusted p-values for the non-adaptive Benjamini-Hochberg algorithm and its adaptive version using π̂_U, with cut-off q = 0.05.

Table 5: Significantly different segments identified by the adaptive BH algorithm with different estimators for π for the synthetic data.

Segment  Zone  Month  mean: X̄  (CI)           π̂_B  π̂_L  π̂_A  π̂_P  π̂_H  π̂_D  π̂_S  π̂_U  π̂_E  N/A
16       2     APR    1.44  (1.12, 1.93)      2    2    2    2    2    2    2    2    2    2
18       2     JUN    1.29  (1.01, 1.70)      0    0    0    0    0    1    0    1    0    0
20       2     AUG    1.47  (1.11, 2.04)      2    2    2    2    2    2    1    2    2    0
21       2     SEP    1.33  (1.01, 1.82)      0    0    0    0    0    0    0    1    0    0
23       2     NOV    1.69  (1.30, 2.29)      2    2    2    2    2    2    2    2    2    2
38       4     MAR    0.55  (0.37, 0.92)      1    1    1    1    1    0    0    1    1    0
39       4     APR    0.50  (0.34, 0.79)      2    2    2    2    2    2    2    2    2    2
40       4     MAY    0.58  (0.40, 0.92)      1    1    1    1    1    0    0    1    1    0
41       4     JUN    0.63  (0.44, 0.95)      0    0    0    0    0    0    0    1    1    0
46       4     OCT    0.64  (0.45, 0.98)      0    0    0    0    0    0    0    1    0    0
52       5     APR    1.77  (1.10, 3.33)      1    1    1    1    1    1    0    1    1    0
56       5     AUG    1.82  (1.14, 3.32)      1    1    1    1    1    1    1    2    2    1
62       6     FEB    0.66  (0.53, 0.84)      2    2    2    2    2    2    2    2    2    2
63       6     MAR    0.60  (0.48, 0.78)      2    2    2    2    2    2    2    2    2    2
64       6     APR    1.27  (1.03, 1.61)      1    1    1    1    1    0    0    1    1    0
69       6     SEP    0.67  (0.54, 0.85)      2    2    2    2    2    2    2    2    2    2
74       7     FEB    0.20  (0.11, 0.51)      2    2    2    2    2    2    2    2    2    2
76       7     APR    0.48  (0.20, 0.48)      0    0    0    0    0    0    0    1    0    0

Values in the fourth and fifth columns are reported after scaling the original variable by θ_0 = 10973.30. For columns 6 to 15, 2 indicates that the segment is found significant for q = 0.05 (and trivially for q = 0.1), 1 indicates significance only for q = 0.1, and 0 indicates non-significance at either level.

Figure 3: Comparative plot of adjusted p-values for the synthetic data with q = 0.05 (panels: non-adaptive BH algorithm; adaptive BH algorithm using π̂_U; axes: segments vs. adjusted p-values).

From the domain knowledge (not to be mentioned explicitly, owing to confidentiality issues), it is known that the functioning of the automobile component under consideration is likely to be influenced by the climate condition, reflected through the effect of the month, as well as by the effect of the zone of usual operation. The effect of climate on the functioning of automobiles is well known and has also been reported in Lawless (1998). For simplicity and demonstration purposes, we assume that each automobile is used only in the designated zone where the failure is reported. Although we have used the two-sided alternative, as is done in any multiple testing problem, the corresponding confidence interval falling entirely below or above the scaled null hypothesis point of 'unity' indicates the actual one-sided alternative for which the respective significance appears. Thus the MMTFs of zone 4 are consistently and significantly smaller than the benchmark value (the null hypothesis point), indicating a usage related adverse problem of the automobiles, and this problem is persistent in the first or second quarter of the year, indicating a transition from colder to warmer climate, or in the fourth quarter of the year, indicating the transition from warmer to colder climate. Interestingly, for zone 2 exactly the converse situation prevails, and this seemingly high MMTF might not be due to the climate condition but may, on the contrary, be attributed to a better usage scenario. For zone 5, a better usage scenario is evident in at least two months, although weather related issues might not be associated with such improvement. The findings for zone 6 are heavily dependent on the climate condition, especially during the advent of spring, where a significant decrease in MMTF is identified, followed by a significant increase in MMTF just after.
Again, during the fall, a significant decrease in MMTF is found, establishing the climate dependence of the failure data. For zone 7, climate plays an adverse role during the end of the winter and the start of the summer. The data corresponding to the remaining two zones do not reveal any deviation from the usual usage pattern and/or are not affected by extreme climate conditions. It is to be noted that for almost all the zones, the month of April becomes significant, concerning either betterment or worsening of the scenario in comparison to the benchmark. On the other hand, the two months December and January never become markedly different from the benchmark at any of the locations. This might be attributed to the fact that in the winter the relatively colder temperature does not affect all the zones, while a transition in temperature, as observed in April, may play a decisive role in the operating conditions in almost all the zones. Zones 1 and 3 never figure in the list, and no marked deviation from the benchmark under any climate condition (non-rejection of the null hypothesis in all seasons) is observed. This homogeneity might be attributed to the fact that these two zones correspond to a relatively warmer climate, and hence climate dependence of the operating conditions is not present there. Although we have to suppress the zone identities for confidentiality, the findings are corroborated by the domain knowledge experts.

To conclude this section, we emphasize the appropriateness of the model-based bias correction approach. We explore a 'what-if' type scenario and try to assess the validity of the findings if we assume the mileages at failure in each segment to be normally distributed, instead of the exponential assumption. Our main objective remains the same, i.e., to identify the significantly different segments with respect to the MMTF values.
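As a reference point for this comparison, the exponential-model per-segment test admits a simple closed form: under H_0, 2 Σ x_i / θ_0 follows a chi-square distribution with 2n degrees of freedom. A sketch of the two-sided p-value computation (illustrative Python; the paper's analysis used likelihood ratio tests in R, to which this standard exact test for an exponential mean is closely related, and the function name is hypothetical):

```python
from scipy import stats

def exp_mean_pvalue(x, theta0):
    """Two-sided p-value for H0: E[X] = theta0 with X exponentially distributed.
    Under H0, 2*sum(x)/theta0 ~ chi-square with 2n degrees of freedom (exact)."""
    n = len(x)
    t = 2.0 * sum(x) / theta0
    cdf = stats.chi2.cdf(t, 2 * n)
    return 2.0 * min(cdf, 1.0 - cdf)
```

After scaling each segment's mileages by θ_0 = 10973.30, theta0 becomes unity, and a segment whose scaled sample mean stays near 1 returns a large p-value, while segments far from the benchmark return small ones.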
If we assume the mileages at failure for the i-th segment to be normally distributed with mean θ_i and variance σ_i², the testing problem is still the same as in (6.18). We perform a single-sample two-sided t-test for each of the segments and obtain the array of p-values over all the segments. Computation of the robust estimates may be done as before, but for the bias corrected estimators we follow the algorithms given in Cheng et al. (2015) (for π̂_U) and Biswas (2019) (for π̂_E) instead of Algorithms 1 and 2, for obvious reasons. The estimates of π under the normality assumption are reported in Table 6.

Table 6: Estimates of π using the different estimators (π̂_B, π̂_L, π̂_A, π̂_P, π̂_H, π̂_D, π̂_S, π̂_U, π̂_E) under the normal model assumption for the synthetic data.

The robust estimators remain comparatively stable for π, while the bias corrected estimators get disrupted owing to the inappropriate model assumption and hence the misleading effect sizes of the tests, upper tail probabilities and expectations of the non-null p-values. Thus, model-based bias correction is efficient, bringing out more power in the adaptive algorithms when the model is appropriate, while the findings may be misleading when it is applied without adequate confidence in the underlying model assumption. As a result, the necessary modification of the bias correction technique under the exponential model seems to be the only way out, particularly while dealing with multiple testing problems arising from segmented failure data, as usually encountered in survival and reliability studies.

7 Discussion
We have approached the problem of estimating π, and thus the construction of an adaptive FDR-controlling procedure, from suitable model assumptions and a common test for all the hypotheses to be tested. Within the framework suggested in Cheng et al. (2015) and Biswas (2019), we have tried to develop methods for the estimation of π under the exponential model and have presented a simple adaptive Benjamini-Hochberg algorithm in a spirit similar to Craiu and Sun (2008), which is shown to be more efficient than its counterparts for simulated as well as real life synthetic data. The current work also motivates Storey's bootstrap estimator for π and the π-estimator based on the sum of all p-values through P(p ≥ V). The cases of V being degenerate at some λ and of V being uniformly distributed over (0, 1) have also been discussed. This may motivate other choices of V for further study of model based π-estimators. The study of V having a negatively skewed density function over (0, 1), which tries to give more importance to the p-values corresponding to true null hypotheses, together with the construction of the resulting new estimators, is presently under consideration. Though the results presented in the current work strengthen the foundations of bias-corrected estimation of π in general, the distinguishing feature of this work lies in the innovative application of a multiple testing procedure to segmented failure data. To the best of our knowledge, such procedures have never been applied to answer the kind of research questions framed in section 6 in relation to large scale industrial data. In this work, however, we have focused on presenting and motivating a simple yet powerful technique for identifying significantly different segments in terms of the performance of automobiles and on exploring the effect of the zone of operation coupled with climate, that too under the exponential model assumption. The synthetic data explored in this work pose several other issues that may be solved by the application of modified methods, to be formulated in future.

This analysis of the real life synthetic data is based on one year of data and may also be carried out on the basis of monthly or even weekly data associated with the component failures. Owing to the limited number of such failure data in each segment, one has to use standard failure models like the exponential or the Weibull. Instead, if one uses the usual Gaussian model to describe the failure pattern, then one is expected to commit a gross mistake and, consequently, a false perception of the MMTF may be reached. This issue has been addressed with the same failure data. Instead of the exponential model, the normal probability model has been used, and the test for equality of the respective means in all 84 segments with the same null hypothesis point representing the benchmark, as done in the usual multiple testing procedure, has been attempted.
Interestingly, the test for normality fails miserably in the majority of the 84 segments, and hence a conclusion based on the test for MMTF with reference to the benchmark under normality would give a wrong signal about the true status of MMTF in those segments.

Usually, in all studies related to lifetime data analysis, the time to event (failure) is taken as the response variable. Although various time to event distributions in the multivariate set-up are available in the literature (Marshall and Olkin 2007; Crowder 2012), all the variables are measured on the same scale. However, in a two-dimensional warranty database the two variables are measured on different scales, and hence special treatments to analyze such data are called for. Moreover, even if the data on the two variables are available, standard warranty analysis fails to exploit both of them in a systematic manner, by neglecting the usage data. Wu and Meeker (2002) have considered the number of failures, progressively counted over time, among the units manufactured at immediately preceding time points, and adopted a group sequential procedure (Jennison and Turnbull 1999) to address the emerging issue identification problem. They have only recorded the 'month in service' (MIS) of the vehicles and have not used the individual mileage data. Instead, one may use the technique suggested here for progressively collected weekly or monthly data for all the components at all locations, available through the warranty database. The method suggested here is expected to perform equivalently, if not better, as the mileage at failure data is used to assess the performance of several components progressively, over time, at various locations.

As the proposed method makes an assumption regarding the distribution of the mileage to failure data, we should accept the fact that the proposed estimator is not universally suitable in all situations.
At the same time, the multiple testing problem in a non-Gaussian framework seems to be novel and may cover all parametric models for scenarios where non-negative valued random variables are appropriate. In such a framework, we have introduced two simple estimators for π which simultaneously reduce the bias and variance of the existing estimator over a relatively important part of the parameter space. The behaviour of these estimators has been studied through extensive simulation studies, and the new estimators are shown to be more precise, under some practical assumptions, than those available in the existing literature. The involvement of numerical or Monte-Carlo integration for each segment makes the proposed method rather computation intensive. This extra labour is expected to be compensated by the gain in precision of the analysis, thus meaningfully addressing the multiple testing problem in a non-Gaussian set-up.

Acknowledgement
The third author (AC) acknowledges the contribution of Mr. Soumen De, formerly of General Motors Tech Center India (GMTCI), Bengaluru, India, for his association with the previous works based on the synthetic data used here.
References
Benjamini, Y., and Hochberg, Y. (1995), "Controlling the false discovery rate: a practical and powerful approach to multiple testing," Journal of the Royal Statistical Society, Series B (Methodological), 57(1), 289-300.

Benjamini, Y., and Hochberg, Y. (2000), "On the adaptive control of the false discovery rate in multiple testing with independent statistics," Journal of Educational and Behavioral Statistics, 25(1), 60-83.

Biswas, A. (2019), "Estimating proportion of true null hypotheses based on sum of p-values and application to microarrays," arXiv preprint arXiv:1904.13282.

Chen, J. J., Wang, S. J., Tsai, C. A., and Lin, C. J. (2007), "Selection of differentially expressed genes in microarray data analysis," The Pharmacogenomics Journal, 7(3), 212-220.

Cheng, Y., Gao, D., and Tong, T. (2015), "Bias and variance reduction in estimating the proportion of true-null hypotheses," Biostatistics, 16(1), 189-204.

Craiu, R. V., and Sun, L. (2008), "Choosing the lesser evil: trade-off between false discovery rate and non-discovery rate," Statistica Sinica, 18, 861-879.

Crowder, M. J. (2012), Multivariate Survival Analysis and Competing Risks, Chapman and Hall/CRC.

Gianetto, Q. G., Combes, F., Ramus, C., and Gianetto, M. Q. G. (2019), Package 'cp4p'.

Guan, Z., Wu, B., and Zhao, H. (2008), "Nonparametric estimator of false discovery rate based on Bernštein polynomials," Statistica Sinica, 18, 905-923.

Gupta, S. K., De, S., and Chatterjee, A. (2014), "Warranty forecasting from incomplete two-dimensional warranty data," Reliability Engineering & System Safety.

Gupta, S. K., De, S., and Chatterjee, A. (2017), "Some reliability issues for incomplete two-dimensional warranty claims data," Reliability Engineering & System Safety.

Biometrics, 11-22.

Jennison, C., and Turnbull, B. W. (1999), Group Sequential Methods with Applications to Clinical Trials, Boca Raton, FL: Chapman & Hall/CRC.

Jiang, H., and Doerge, R. W. (2008), "Estimating the proportion of true null hypotheses for multiple comparisons," Cancer Informatics, 6, 117693510800600001.

Langaas, M., Lindqvist, B. H., and Ferkingstad, E. (2005), "Estimating the proportion of true null hypotheses, with application to DNA microarray data," Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(4), 555-572.

Lawless, J. F. (1998), "Statistical analysis of product warranty data," International Statistical Review, 66(1), 41-56.

Marshall, A. W., and Olkin, I. (2007), Life Distributions (Vol. 13), New York: Springer.

Nettleton, D., Hwang, J. G., Caldo, R. A., and Wise, R. P. (2006), "Estimating the number of true null hypotheses from a histogram of p values," Journal of Agricultural, Biological, and Environmental Statistics, 11(3), 337-356.

Statistica Sinica, 1689-1716.

Pounds, S., and Cheng, C. (2006), "Robust estimation of the false discovery rate," Bioinformatics, 22(16), 1979-1987.

Journal of the Royal Statistical Society: Series B (Statistical Methodology).

The Analysis of Gene Expression Data (pp. 272-290), New York, NY: Springer.

Wang, H. Q., Tuominen, L. K., and Tsai, C. J. (2011), "SLIM: a sliding linear model for estimating the proportion of true null hypotheses in datasets with dependence structures," Bioinformatics, 27(2), 225-231.

Wu, H., and Meeker, W. Q. (2002), "Early detection of reliability problems using information from warranty databases," Technometrics, 44(2), 120-133.