Step-up simultaneous tests for identifying active effects in orthogonal saturated designs
The Annals of Statistics
© Institute of Mathematical Statistics, 2007
STEP-UP SIMULTANEOUS TESTS FOR IDENTIFYING ACTIVE EFFECTS IN ORTHOGONAL SATURATED DESIGNS
By Samuel S. Wu and Weizhen Wang University of Florida and Wright State University
A sequence of null hypotheses regarding the number of negligible effects (zero effects) in orthogonal saturated designs is formulated. Two step-up simultaneous testing procedures are proposed to identify active effects (nonzero effects) under the commonly used assumption of effect sparsity. It is shown that each procedure controls the experimentwise error rate at a given $\alpha$ level in the strong sense.
1. Introduction.
Assume a linear model
\[ Y_i = \mu + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i, \quad \text{for } i = 1, \ldots, M, \tag{1} \]
where $\varepsilon_i \sim$ i.i.d. $N(0, \sigma^2)$. The unknown parameters $\beta_i$ are of interest and $\mu$ and $\sigma^2$ are two unknown nuisance parameters. The design is called orthogonal if the least squares estimators $\hat\beta_i$ ($1 \le i \le k$) are uncorrelated (equivalent to independent), which occurs, for example, in two-level fractional factorial designs. The design is said to be saturated if there are just enough observations to estimate the model parameters $\beta_i$ and $\mu$ (i.e., $M = k + 1$), leaving no degrees of freedom to estimate the error variance $\sigma^2$. In order to make inferences on $\beta_i$, one must typically use the assumption of effect sparsity, that is, that most of the $\beta_i$'s are equal to zero. Then we can use the corresponding $\hat\beta_i$'s to estimate $\sigma^2$. However, we do not know how many and which of the $\beta_i$'s are zero. An initial guess would be that at least $\nu$ of the $\beta_i$'s equal zero, say one-half of the effects. Therefore, the smallest $\nu$ of the $\hat\beta_i^2$'s should be used to estimate $\sigma^2$. Any other $\hat\beta_j$ whose square is substantially larger is likely to have a nonzero mean and corresponds to an active effect. For a fixed sequence of $\beta = (\beta_1, \ldots, \beta_k)$, let
\[ N = \text{the number of } \beta_i\text{'s which equal zero}. \tag{2} \]

Received May 2004; revised October 2005. Supported in part by NSF Grant DMS-03-08861.
AMS 2000 subject classifications. Primary 62F35, 62F25, 62K15; secondary 62J15.
Key words and phrases. Closed test method, effect sparsity, experimentwise error rate, non-central chi-squared distribution.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2007, Vol. 35, No. 1, 449–463. This reprint differs from the original in pagination and typographic detail.
Thus, the number of nonzero $\beta_i$'s is equal to $k - N$ and the entire parameter space without nuisance parameters is $H_0 = \{\beta = (\beta_1, \ldots, \beta_k) : N \ge \nu\}$. For each integer $m \in [\nu + 1, k]$, consider the testing problem
\[ H_{0,m}: N \ge m \quad \text{vs.} \quad H_{A,m}: N \le m - 1 \ (\text{i.e., } k - N \ge k - m + 1) \tag{3} \]
and define a parameter configuration in each $H_{0,m}$,
\[ \boldsymbol{\beta}_m =: (0, \ldots, 0, +\infty, \ldots, +\infty), \tag{4} \]
where the first $m$ components are zero. Let
\[ B = \{H_{0,m} : \nu + 1 \le m \le k\}, \tag{5} \]
which contains all null hypotheses of interest in this paper. Because $H_{0,i}$ is a subset of $H_{0,j}$ for any $i > j$, if $H_{0,j}$ is incorrect, then so is $H_{0,i}$. This implies that a testing process should be terminated as soon as a rejection occurs for some null hypothesis. Starting from $m = \nu + 1$, we test these hypotheses one at a time as $m$ goes up to $k$. If $H_{0,\nu+1}$ is rejected, we then conclude that there are $k - \nu$ active effects (i.e., $H_0 \cap H_{A,\nu+1}$) and no longer test any other hypotheses; otherwise, we test the next hypothesis $H_{0,\nu+2}$. In general, if $H_{0,m}$ is the first hypothesis being rejected for some $m \le k$, we stop and conclude that there are $k - m + 1$ nonzero effects (i.e., $H_{0,m-1} \cap H_{A,m}$); otherwise, all hypotheses in $B$ are accepted and we conclude that there is no active effect. Clearly, this is a step-up testing procedure.

Many inference procedures have been proposed to identify active effects. The data analysis of orthogonal saturated designs was initially considered by Birnbaum [1] and Daniel [2]. The half-normal plot introduced by Daniel [2] is still being used in the preliminary analysis. Lenth [8] proposed the first adaptive method to let the data determine which and how many of the $\hat\beta_i$'s should be used to estimate $\sigma$. Whether Lenth's interval is of level $1 - \alpha$ still remains a question. Besides using the data adaptively, another fundamentally desirable property is the ability to control the error rate in the strong sense (i.e., under all parameter configurations), which is espoused by Hochberg and Tamhane [5].
For orthogonal saturated designs, the first adaptive confidence interval known to provide strong control of error rates and more general results can be found in [15] and [16], respectively. Hamada and Balakrishnan [4] provided a thorough review of the analysis methods available for saturated designs.

Since we do not know which and how many effects are active, it is reasonable to search for active effects using simultaneous tests. Here are two possibilities:

(a) Starting from the largest $\hat\beta_i^2$, test whether the corresponding effect is active. If it is not active, then conclude no active effect and stop; otherwise, test for the second largest $\hat\beta_i^2$ (go down), and so on, until a zero effect is found (the step-down tests).

(b) Starting from the $(\nu + 1)$th smallest $\hat\beta_i^2$, test whether the corresponding effect is active. If it is active, then conclude $k - \nu$ active effects and stop; otherwise, test for the $(\nu + 2)$th smallest $\hat\beta_i^2$ (go up), and so on, until an active effect is found (the step-up tests).

Voss [12] proposed nonadaptive step-down tests for a set of hypotheses different from $B$ and controlled the experimentwise error rates at the given level $\alpha$ in the strong sense. Recently, Voss and Wang [14] derived adaptive step-down tests with the experimentwise error rates controlled in the same setting as Voss [12]. As pointed out by several researchers [3, 7, 11], the step-up tests typically have a greater power to detect active effects than the step-down tests. For example, Venter and Steel [11] proposed one step-up and one step-down procedure for the orthogonal saturated designs with cutoff points determined at $\boldsymbol{\beta}_m$. However, they were unable to prove that the experimentwise error rates of their procedures are controlled at a given level $\alpha$ in the strong sense.

The rest of this article is organized as follows.
In Section 2, we provide the motivation for two desirable tests for $H_{0,m}$ and establish a probability inequality for noncentral $\chi^2$ distributions which asserts that the maximal type I error of any test in a general class is always achieved at $\boldsymbol{\beta}_m$. Based on this inequality, two level-$\alpha$ tests are proposed in Section 3 for testing each single hypothesis $H_{0,m}$. Two sequential step-up procedures for testing all hypotheses in $B$, which control the experimentwise error rates at $\alpha$ in the strong sense, are derived in Section 4. Section 5 presents a simulation study and Section 6 concludes with some discussion.
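The step-up scan over $B$ described in this section can be sketched in a few lines. This is only an illustration of the control flow: the function name `step_up` and the callable `reject` are hypothetical, and how each single hypothesis $H_{0,m}$ is actually tested is left open here (it is the subject of Sections 3 and 4).

```python
from typing import Callable


def step_up(reject: Callable[[int], bool], nu: int, k: int) -> int:
    """Return the number of effects declared active by a step-up scan.

    reject(m) is assumed to return True when H_{0,m}: N >= m is rejected.
    Hypotheses are visited in the order m = nu + 1, ..., k.
    """
    for m in range(nu + 1, k + 1):
        if reject(m):
            # First rejection at m: conclude k - m + 1 nonzero effects, stop.
            return k - m + 1
    # No hypothesis in B was rejected: conclude there is no active effect.
    return 0


if __name__ == "__main__":
    # Hypothetical rejection pattern for k = 15, nu = 7:
    # the first rejection happens at m = 12.
    print(step_up(lambda m: m >= 12, nu=7, k=15))  # 4
```

The scan stops at the first rejection because the hypotheses are nested: once $H_{0,m}$ is rejected, every $H_{0,i}$ with $i > m$ is a subset of it and need not be tested.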
2. Motivation and a general probability inequality.
In this section, we provide motivation for the new procedures testing $H_{0,m}$ and a general class [see $A_m$ in (11)] of the rejection regions of level-$\alpha$. The maximal type I error of each rejection region in this class is always achieved at $\boldsymbol{\beta}_m$, as stated in Theorem 1.

Assume the factorial effect estimators $\hat\beta_i$ are independently distributed as $N(\beta_i, a_i \sigma^2)$ for known constants $a_i$. We may assume that each $a_i = 1$ without loss of generality. Let $X_1, \ldots, X_k$ be the order statistics of the $\hat\beta_i^2$ for $i \le k$. Intuitively, it is more likely that the small order statistics $X_i$ will correspond to estimators $\hat\beta_j$ with $\beta_j = 0$. If we believe a priori that at least $\nu$ of the $\beta_i$'s are zero, then those $\beta_i$'s corresponding to $X_1, \ldots, X_\nu$ are likely to be the negligible ones. To test $H_{0,m}$, one needs to compare $X_m$ with $X_1, \ldots, X_\nu$. For any integer $n \in [\nu, m]$, let $S_n = \sum_{i=1}^{n} X_i$ and $\bar X_n = S_n / n$ and define a test statistic for $H_{0,m}$ as follows:
\[ W_{n,m} = \frac{n X_m}{\sum_{i=1}^{n} X_i} = \frac{n X_m}{S_n} = \frac{X_m}{\bar X_n}. \tag{6} \]
Intuitively, a large value of $W_{n,m}$ is in favor of $H_{A,m}$. Therefore, the rejection region should be $W_{n,m} > d_{n,m}$ for a constant $d_{n,m}$ which satisfies
\[ \sup_{\beta \in H_{0,m}} P_\beta(W_{n,m} > d_{n,m}) = \alpha. \tag{7} \]
In this section, we will show that $W_{n,m}$ is stochastically largest at $\boldsymbol{\beta}_m$ for $\beta \in H_{0,m}$.

Definition 1.
A random variable $X$ is said to be stochastically smaller than $Y$, denoted by $X \prec Y$, if $P(X \le d) \ge P(Y \le d)$ for all $d$.

Let $Y_{1,m}, \ldots, Y_{m,m}$ be the order statistics of the $\hat\beta_i^2$'s for $i \le m$ when $\beta_1 = \cdots = \beta_m = 0$ and let
\[ Z_{n,m} = \frac{n Y_{m,m}}{\sum_{i=1}^{n} Y_{i,m}}. \tag{8} \]
Langsrud and Næs [7] also studied $Z_{n,m}$, called Ψ m, ,n,m -distribution in their notation. Clearly, the distribution of $Z_{n,m}$ does not depend on any parameter and can be sampled based on the order statistics of $m$ independent $\chi^2_1$ random variables. It is easy to see that $W_{n,m} = Z_{n,m}$ at $\boldsymbol{\beta}_m$ and we want to show that $W_{n,m} \prec Z_{n,m}$ on $H_{0,m}$. If this is true, then (7) reduces to
\[ P(Z_{n,m} > d_{n,m}) = \alpha \tag{9} \]
and $d_{n,m}$ is the $100(1 - \alpha)$ percentile of $Z_{n,m}$.

Definition 2.
A function $h : R^d \to R$ will be called nondecreasing (with respect to the coordinatewise ordering) if $x_i \le y_i$, $i = 1, \ldots, d$, implies that $h(x) \le h(y)$.

Now we prove a general theorem that includes (9) as a special case.

Theorem 1.
Suppose that $X_1, \ldots, X_k$ are the order statistics of independent random variables with noncentral chi-squared distributions $\chi^2_1(\beta_i^2)$, $1 \le i \le k$. Then
\[ \sup_{\beta \in H_{0,m}} P_\beta(R) = P_{\boldsymbol{\beta}_m}(R) \tag{10} \]
for any rejection region $R$ of $H_{0,m}$ that belongs to the class
\[ A_m = \{ T_L(X_1, \ldots, X_{s-1}) < T_R(X_s), \text{ for some integer } 1 < s \le m \}, \tag{11} \]
where $T_L$ and $T_R$ satisfy the following two properties:

(i) (monotone) $T_L(x_1, \ldots, x_{s-1})$ and $T_R(x_s)$ are nondecreasing functions;

(ii) (invariant) $T_L / T_R$ is invariant to scale transformation, that is, for any $a > 0$,
\[ \frac{T_L(a x_1, \ldots, a x_{s-1})}{T_R(a x_s)} = \frac{T_L(x_1, \ldots, x_{s-1})}{T_R(x_s)}. \tag{12} \]

Corollary 1.
For any $0 < \alpha < 1$ and $d_{n,m}$ given in (9), the rejection region
\[ R_{n,m} = \{ W_{n,m} > d_{n,m} \} \tag{13} \]
defines a level-$\alpha$ test for (3).

Proof.
Let $s = m$, $T_L(x_1, \ldots, x_{m-1}) = \sum_{i=1}^{n} x_i / n$ and $T_R(x_m) = x_m / d_{n,m}$. The claim follows from Theorem 1. □

To prove Theorem 1, we need the facts below (Corollary 2 and Corollary 3).
Lemma 1.
For a nonnegative random variable $X$ and a positive number $y$, let $X_y = X / y$, given that $X \le y$. Let $U \sim \chi^2_1$, a chi-squared distribution with one degree of freedom, and $V \sim \chi^2_1(\theta)$, a noncentral chi-squared distribution with one degree of freedom and noncentrality parameter $\theta$. Then we have (i) $U_y \prec V_y$ and (ii) $U_{y_1} \prec U_{y_2}$ for any $y_1 > y_2 > 0$.

Part (ii) of Lemma 1 is identical to Lemma 2 of [7]. The proofs for both stochastic orderings follow from the monotone likelihood ratio function.
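Both orderings in Lemma 1 can be checked numerically on a grid, since the distribution function of $X_y$ at $t \in [0, 1]$ is $P(X \le ty) / P(X \le y)$. The sketch below is not part of the original argument; it assumes the convention that $\chi^2_1(\theta)$ is the law of $(Z + \sqrt{\theta})^2$ with $Z \sim N(0,1)$, and the helper names `phi`, `cdf_ncx2_1` and `cdf_scaled` are made up for this illustration.

```python
import math


def phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def cdf_ncx2_1(x: float, theta: float = 0.0) -> float:
    """P((Z + sqrt(theta))^2 <= x) for Z ~ N(0, 1); theta = 0 gives chi^2_1."""
    if x <= 0.0:
        return 0.0
    r, d = math.sqrt(x), math.sqrt(theta)
    return phi(r - d) - phi(-r - d)


def cdf_scaled(t: float, y: float, theta: float = 0.0) -> float:
    """P(X / y <= t | X <= y) for X ~ chi^2_1(theta) and 0 <= t <= 1."""
    return cdf_ncx2_1(t * y, theta) / cdf_ncx2_1(y, theta)


def lemma1_holds_on_grid(tol: float = 1e-9) -> bool:
    """Check parts (i) and (ii) of Lemma 1 on a small grid of t, y, theta."""
    ts = [i / 20.0 for i in range(1, 20)]
    ys = [0.5, 1.0, 2.0, 5.0, 10.0]
    thetas = [0.25, 1.0, 4.0, 9.0]
    for t in ts:
        for y in ys:
            # (i) U_y is stochastically smaller than V_y.
            for theta in thetas:
                if cdf_scaled(t, y) < cdf_scaled(t, y, theta) - tol:
                    return False
        # (ii) U_{y1} is stochastically smaller than U_{y2} when y1 > y2.
        for y1 in ys:
            for y2 in ys:
                if y1 > y2 and cdf_scaled(t, y1) < cdf_scaled(t, y2) - tol:
                    return False
    return True


if __name__ == "__main__":
    print(lemma1_holds_on_grid())
```

A grid check is of course no substitute for the monotone likelihood ratio argument; it only guards against misreading the direction of the two orderings.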
Lemma 2.
Let $U_1, \ldots, U_{s-1}$ be any independent random variables. Let the same be true for $V_1, \ldots, V_{s-1}$. If $T(x_1, \ldots, x_{s-1})$ is a nondecreasing function and $U_i \prec V_i$ for $i \le s - 1$, then $T(U_1, \ldots, U_{s-1}) \prec T(V_1, \ldots, V_{s-1})$.

This is called by some researchers (see, e.g., [13]) a "stochastic ordering lemma."
Corollary 2.
Suppose that $U_i \sim \chi^2_1$, $1 \le i \le s - 1$, and $V_j \sim \chi^2_1(\theta_j)$, $1 \le j \le s - 1$, are independent random variables. For any $y > 0$, let $U_{(i),y}$ be the order statistics of $U_{i,y}$ ($= U_i / y$, given that $U_i \le y$, as defined in Lemma 1) and let $V_{(j),y}$ be the order statistics of $V_{j,y}$. Then for any nondecreasing function $T_L$,
\[ T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \prec T_L(V_{(1),y}, \ldots, V_{(s-1),y}). \tag{14} \]
Therefore, for any $t > 0$,
\[ P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \le t) \ge P(T_L(V_{(1),y}, \ldots, V_{(s-1),y}) \le t). \]
Proof.
Let $T^*_L(U_{1,y}, \ldots, U_{s-1,y}) = T_L(U_{(1),y}, \ldots, U_{(s-1),y})$. Since each $U_{(i),y}$ is a nondecreasing function in each $U_{j,y}$ and $T_L$ is nondecreasing in each of its arguments, $T^*_L$ is nondecreasing in each of its arguments. Therefore, if one combines part (i) of Lemma 1 with Lemma 2, it can be concluded that
\[ T_L(U_{(1),y}, \ldots, U_{(s-1),y}) = T^*_L(U_{1,y}, \ldots, U_{s-1,y}) \prec T^*_L(V_{1,y}, \ldots, V_{s-1,y}) = T_L(V_{(1),y}, \ldots, V_{(s-1),y}). \]
□

Corollary 3.
Suppose that $T_L$ and $T_R$ satisfy the monotone and invariant conditions specified in the definition of $A_m$ in (11) and one defines
\[ G^*(y) \equiv P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \le T_R(1)) = \int \cdots \int \ldots \]
Then, for any $y_1 > y_2 > 0$, $T_L(U_{(1),y_1}, \ldots, U_{(s-1),y_1}) \prec T_L(U_{(1),y_2}, \ldots, U_{(s-1),y_2})$. Therefore, $G^*(y)$ is a nondecreasing function.

Proof.
Combine part (ii) of Lemma 1 with Lemma 2 and define $T^*_L$ as in Corollary 2. □

Proof of Theorem 1.
Consider two samples,
\[ \{ \hat\beta_1^2, \ldots, \hat\beta_m^2, +\infty, \ldots, +\infty \ (k - m \text{ of them}) \} \quad \text{and} \quad \{ \hat\beta_1^2, \ldots, \hat\beta_k^2 \}. \]
Clearly, the $s$th order statistic of the first sample is stochastically larger than that of the second sample, that is, $X_s \prec Y_{s,m}$ for any given integer $s$ satisfying $1 < s \le m$.

Second, we have
\begin{align*}
P(T_L(Y_{1,m}, \ldots, Y_{s-1,m}) < T_R(Y_{s,m}))
&= E[P(T_L(Y_{1,m}, \ldots, Y_{s-1,m}) < T_R(y) \mid Y_{s,m} = y)] \\
&= E[P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) < T_R(1) \mid Y_{s,m} = y)] \\
&= E[G^*(Y_{s,m})].
\end{align*}

Next, for any partition $\omega = (j_1, j_2, \ldots, j_{s-1})(j_s)(j_{s+1}, \ldots, j_k)$ of the integers 1 to $k$, we denote by $E_\omega$ the event $\{ \hat\beta_{j_i}^2 < \hat\beta_{j_s}^2, \forall i < s;\ \hat\beta_{j_s}^2 < \hat\beta_{j_l}^2, \forall l > s \}$. We note that for any $\beta \in H_{0,m}$, writing $X_{i,y} = X_i / y$,
\begin{align*}
P_\beta(R)
&= E_\beta[P_\beta(T_L(X_1, \ldots, X_{s-1}) < T_R(y) \mid X_s = y)] \\
&= E_\beta[P_\beta(T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta\Big[\sum_\omega P_\beta(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \cap E_\omega \mid X_s = y)\Big] \\
&= E_\beta\Big[\sum_\omega P_\beta(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \mid E_\omega, X_s = y)\, P_\beta(E_\omega \mid X_s = y)\Big] \\
&\le E_\beta\Big[\sum_\omega P_{\boldsymbol{\beta}_m}(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \mid X_s = y)\, P_\beta(E_\omega \mid X_s = y)\Big] \\
&= E_\beta[P_{\boldsymbol{\beta}_m}(T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta[P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta[G^*(X_s)],
\end{align*}
where the above inequality is due to Corollary 2.

Finally, since $G^*(y)$ is a nondecreasing function due to Corollary 3, the inequality $X_s \prec Y_{s,m}$ implies that $G^*(X_s) \prec G^*(Y_{s,m})$. Therefore, it can be concluded that $E_\beta[G^*(X_s)] \le E[G^*(Y_{s,m})]$. □
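Because the law of $Z_{n,m}$ in (8) is parameter-free and, as noted above, can be sampled from the order statistics of $m$ independent $\chi^2_1$ variables, the cutoff $d_{n,m}$ in (9) can be approximated by simulation. The following is a minimal Monte Carlo sketch, not the authors' MATLAB program; the function names, the default number of replications and the seed are arbitrary choices made for this illustration.

```python
import math
import random


def sample_z(n: int, m: int, rng: random.Random) -> float:
    """Draw Z_{n,m} = n * Y_{m,m} / (Y_{1,m} + ... + Y_{n,m}) once."""
    # Squares of m independent N(0, 1) draws are chi-squared with 1 df.
    y = sorted(rng.gauss(0.0, 1.0) ** 2 for _ in range(m))
    return n * y[m - 1] / sum(y[:n])


def critical_value(n: int, m: int, alpha: float = 0.05,
                   reps: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the 100(1 - alpha) percentile of Z_{n,m}."""
    rng = random.Random(seed)
    draws = sorted(sample_z(n, m, rng) for _ in range(reps))
    # Index of the empirical (1 - alpha) quantile, clamped to a valid range.
    idx = min(reps - 1, max(0, math.ceil((1.0 - alpha) * reps) - 1))
    return draws[idx]


if __name__ == "__main__":
    # Approximate d_{7,8} at alpha = 0.05; the value varies with reps and seed.
    print(critical_value(n=7, m=8))
```

The estimate carries simulation error, so in practice one would increase `reps` until the cutoff is stable to the desired number of digits.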
3. Two tests for $H_{0,m}$.
In this section, we present two tests for $H_{0,m}$. While both are of level $\alpha$, they are to be used under different circumstances. Let $d_{\nu,m}$ and $d_{m-1,m}$ denote the numbers determined by (9) when $n = \nu$ and $n = m - 1$, respectively.

Theorem 2.
For any $0 < \alpha < 1$, the rejection regions
\[ R_{\nu,m} = \{ W_{\nu,m} > d_{\nu,m} \} \quad \text{and} \quad R_{m-1,m} = \{ W_{m-1,m} > d_{m-1,m} \} \tag{15} \]
both define level-$\alpha$ tests for (3).

This theorem is a special case of Corollary 1 when $n = \nu$ and $n = m - 1$. The test based on $R_{\nu,m}$ estimates the error variance by $\bar X_\nu = \sum_{i=1}^{\nu} X_i / \nu$, irrespective of the value of $m$, while the test based on $R_{m-1,m}$ uses the adaptive estimator $\bar X_{m-1} = \sum_{i=1}^{m-1} X_i / (m - 1)$. We refer to these as fixed and sequential scaling, respectively. Region $R_{\nu,m}$ should be applied only when $H_{0,m}$ is of interest and no information is available on whether $H_{0,n}$ is true for any $n < m$. In such a case, we are certain only that at least $\nu$ of the $\hat\beta_i$'s have a zero mean. It is then reasonable to compare $X_m$ with the average of $X_1, \ldots, X_\nu$, the smallest $\nu$ of the $\hat\beta_i^2$'s. If $W_{\nu,m}$ is large, then one concludes that all $\hat\beta_i$'s corresponding to $X_m, \ldots, X_k$ are from populations with nonzero means. On the other hand, if one tests $H_{0,n}$ for $n < m$ sequentially up to $H_{0,m}$ and $H_{0,m-1}$ is accepted, then at least $m - 1$ of the $\beta_i$'s are zero. In this case, one should compare $X_m$ with $X_1, \ldots, X_{m-1}$ and a large value of $W_{m-1,m}$ would lead to a rejection of $H_{0,m}$.
4. Two step-up testing procedures.
When effect sparsity is assumed, we do not know which and how many of the $\beta_i$'s are zero. It is of more interest to conduct tests simultaneously to identify the nonzero effects. The tests developed in the previous section, in fact, can detect whether there is a jump at $X_m$ among $X_1, \ldots, X_k$. However, these tests cannot tell whether the jump, if it exists, is the first one, which is what interests us. Therefore, as mentioned earlier, since $H_{0,m}$ decreases as $m$ increases, one needs to conduct tests sequentially. Like all testing problems, there are two major concerns: to control the experimentwise error rate at a given level $\alpha$, that is,
\[ \sup_{\beta \in \bigcup_{m=\nu+1}^{k} H_{0,m}} P_\beta(\text{assert not } H_{0,n}, \text{ which contains } \beta, \text{ for some } n \in [\nu + 1, k]) \le \alpha, \tag{16} \]
and to obtain more powerful tests, which means larger rejection regions.

The first requirement (16) can be ensured by using the closed test procedure proposed by Marcus, Peritz and Gabriel [9]. For details, see [6], page 137. A naïve solution is to assert not $H_{0,m}$ (i.e., to assert $H_{A,m}$) if one rejects $H_{0,i}$ at level $\alpha$ for all $i \ge m$. For example, suppose $R_{\nu,i}$ is used to test for each $H_{0,i}$. Then assert not $H_{0,m}$ iff $R_m = \bigcap_{i=m}^{k} R_{\nu,i}$ is true. This, by the closed test procedure, controls the experimentwise error rate at $\alpha$. Note that $R_m$ decreases as $m$ increases, which contradicts the fact that $H_{0,m}$ is decreasing (we need $R_m$ to increase). Therefore, simply applying the closed test procedure on the tests derived in the previous section only results in less powerful tests for the simultaneous hypotheses. We require that the rejection region for $H_{0,m}$ (a) increases as $m$ gets larger and (b) is of level-$\alpha$. In this section, two testing procedures are discussed with their rejection regions denoted by $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$ and $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ corresponding to $R_{\nu,m}$ and $R_{m-1,m}$, respectively.

4.1. The construction of $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$: step-up tests with fixed scaling (SUF).
The general form of $R^*_{\nu,m}$, for $\nu + 1 \le m \le k$, is
\[ R^*_{\nu,m} = \bigcup_{i=\nu+1}^{m} \{ W_{\nu,i} > d^*_{\nu,i} \} = \Big\{ S_\nu < \max\{ \nu X_i / d^*_{\nu,i} \}_{i=\nu+1}^{m} \Big\}, \tag{17} \]
where the sequence of constants $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ is determined iteratively below. More precisely, $d^*_{\nu,m}$ depends on $d^*_{\nu,i}$ for $i < m$ and causes $R^*_{\nu,m}$ to have level-$\alpha$. It is clear that $R^*_{\nu,m}$ is nondecreasing when $m$ gets larger and is strictly increasing if all $d^*_{\nu,m}$ are finite. We start with the following lemma.

Lemma 3.
For a sequence of random variables $\{\Delta_i\}_{i=0}^{s}$, where $s$ is a given positive integer,
\[ P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{s}) \le \sum_{i=1}^{s} P(\max\{\Delta_j\}_{j=0}^{i-1} < \Delta_i). \tag{18} \]

Proof.
We prove (18) by induction. When $s = 1$, (18) is true. Suppose (18) is true for any $s = n$. Then for $s = n + 1$,
\begin{align*}
P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{n+1})
&\le P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{n}) + P(\max\{\Delta_j\}_{j=0}^{n} \le \Delta_0 < \Delta_{n+1}) \\
&\le \sum_{i=1}^{n+1} P(\max\{\Delta_j\}_{j=0}^{i-1} < \Delta_i).
\end{align*}
□

We now determine the sequence of constants $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ starting from $m = \nu + 1$. For testing $H_{0,\nu+1}$, let $R^*_{\nu,\nu+1} = R_{\nu,\nu+1}$. It is a level-$\alpha$ test by Theorem 2.

For any $\nu + 2 \le m < k$, let $d^*_{\nu,\nu} = \infty$ and suppose that $\{d^*_{\nu,i}\}_{i=\nu+1}^{m-1}$ are available. Then $d^*_{\nu,m}$ is determined by solving
\[ \sum_{i=\nu+1}^{m} P_{\boldsymbol{\beta}_m}(A_i) = \alpha, \tag{19} \]
where $A_i = \big\{ \max\{ S_\nu, \{\nu X_j / d^*_{\nu,j}\}_{j=\nu}^{i-1} \} < \nu X_i / d^*_{\nu,i} \big\}$. Note that $R^*_{\nu,m}$ is of level-$\alpha$ because for any $\beta \in H_{0,m}$,
\[ P_\beta(R^*_{\nu,m}) \le \sum_{i=\nu+1}^{m} P_\beta(A_i) \le \sum_{i=\nu+1}^{m} P_{\boldsymbol{\beta}_m}(A_i) = \alpha, \tag{20} \]
where the first inequality follows from Lemma 3 (with $\Delta_0 = S_\nu$ and $\Delta_{i-\nu} = \nu X_i / d^*_{\nu,i}$ for $i = \nu + 1, \ldots, m$) and the second inequality holds since each term in the summation achieves its maximum at $\boldsymbol{\beta}_m$ by Theorem 1. On the other hand, since the thresholds $\{d^*_{\nu,i}\}_{i=\nu+1}^{m-1}$ satisfy $\sum_{i=\nu+1}^{m-1} P_{\boldsymbol{\beta}_{m-1}}(A_i) = \alpha$ and each term in this summation satisfies $P_{\boldsymbol{\beta}_{m-1}}(A_i) > P_{\boldsymbol{\beta}_m}(A_i)$ by Theorem 1, the last term in the summation of (19) is greater than zero, that is, $P_{\boldsymbol{\beta}_m}(A_m) > 0$. This guarantees $d^*_{\nu,m}$ to be finite, which implies that rejection region $R^*_{\nu,m}$ is larger than $R^*_{\nu,m-1}$.

Finally, for $m = k$, since the null hypothesis $H_{0,k}$ now contains only one parameter configuration $\boldsymbol{\beta}_k$ and $d^*_{\nu,i}$ is available up to $i = k - 1$, one determines $d^*_{\nu,k}$ by solving
\[ P_{\boldsymbol{\beta}_k}(R^*_{\nu,k}) = \alpha, \tag{21} \]
which implies that $R^*_{\nu,k}$ is level-$\alpha$. Similarly, one can show that $d^*_{\nu,k}$ is finite. Thus, $R^*_{\nu,k}$ is larger than $R^*_{\nu,k-1}$. The determination of $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ is completed.

To conduct the simultaneous tests for $B$,
\[ \text{assert not } H_{0,m} \text{ (i.e., assert } H_{A,m}) \text{ if } R^*_{\nu,m} \text{ is true.} \tag{22} \]
Notice two facts: (1) $B$ is closed under the operation of intersection and (2) for each $\nu + 1 \le m \le k$, $R^*_{\nu,m} = \bigcap_{i=m}^{k} R^*_{\nu,i}$ is level-$\alpha$. Therefore, the experimentwise error rate is no greater than $\alpha$ by the closed test procedure. The discussion above is now summarized as the following theorem.

Theorem 3.
The rejection regions $R^*_{\nu,m}$ given in (17) increase when $m$ gets larger and each defines a level-$\alpha$ test for $H_{0,m}$. If one conducts simultaneous tests for $B$ using (22), then the experimentwise error rate is controlled at $\alpha$ in the strong sense.

Let $[1], \ldots, [k]$ be random indices such that $\hat\beta_{[1]}^2 < \cdots < \hat\beta_{[k]}^2$. We now describe the step-up testing procedure based on $R^*_{\nu,m}$ as follows:

Step 1: If $R^*_{\nu,\nu+1}$ is true, then conclude that $\beta_{[\nu+1]}, \ldots, \beta_{[k]}$ are the $k - \nu$ active effects ($= H_0 \cap H_{A,\nu+1}$) and stop; otherwise, go to step 2.

Step 2: If $R^*_{\nu,\nu+2}$ is true, then conclude that $\beta_{[\nu+2]}, \ldots, \beta_{[k]}$ are the $k - \nu - 1$ active effects ($= H_{0,\nu+1} \cap H_{A,\nu+2}$) and stop; otherwise, go to step 3.

...

Step $k - \nu$: If $R^*_{\nu,k}$ is true, then conclude that $\beta_{[k]}$ is the only active effect ($= H_{0,k-1} \cap H_{A,k}$) and stop; otherwise, conclude no active effect and stop.

4.2. The construction of $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$: step-up tests with sequential scaling (SUS).
There is another way to conduct the simultaneous tests for $B$. For each integer $m \in [\nu + 1, k]$, we construct a level-$\alpha$ region for $H_{0,m}$, denoted by $R^*_{m-1,m}$ (corresponding to $R_{m-1,m}$ in Section 3), of the form
\[ R^*_{m-1,m} = \bigcup_{i=\nu+1}^{m} \{ W_{i-1,i} > d^*_{i-1,i} \} = \bigcup_{i=\nu+1}^{m} \{ S_\nu < Q_i \} = \Big\{ S_\nu < \max\{Q_i\}_{i=\nu+1}^{m} \Big\}, \tag{23} \]
where $Q_i = (i - 1) X_i / d^*_{i-1,i} - S_{i-1} + S_\nu$ and $S_{i-1} = \sum_{j=1}^{i-1} X_j$.

To determine constants $\{d^*_{m-1,m}\}_{m=\nu+1}^{k}$, we first let the constant $d^*_{\nu,\nu+1}$ equal $d_{\nu,\nu+1}$. Suppose that $d^*_{i-1,i}$ is available up to $i = m - 1$ for some $m < k$. We then determine $d^*_{m-1,m}$. Comparing (23) with (17), $R^*_{m-1,m}$ and $R^*_{\nu,m}$ have similar forms. Therefore, similarly to (19), we obtain $d^*_{m-1,m}$ by solving
\[ \sum_{i=\nu+1}^{m} P_{\boldsymbol{\beta}_m}\big( \max\{ S_\nu, \{Q_j\}_{j=\nu}^{i-1} \} < Q_i \big) = \alpha \tag{24} \]
(with $Q_\nu = 0$). Finally, for $m = k$, since $d^*_{m-1,m}$ is available up to $m = k - 1$, $d^*_{k-1,k}$ is solved by $P_{\boldsymbol{\beta}_k}(R^*_{k-1,k}) = \alpha$. The determination of $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ is thus complete.

Using a discussion similar to that used for $R^*_{\nu,m}$, one can show that $R^*_{m-1,m}$ is a level-$\alpha$ test for $H_{0,m}$ and is increasing in $m$. More specifically, Lemma 3 implies that
\begin{align*}
P_\beta(R^*_{m-1,m})
&\le \sum_{i=\nu+1}^{m} P_\beta\big( \max\{ S_\nu, \{Q_j\}_{j=\nu}^{i-1} \} < Q_i \big) \\
&= \sum_{i=\nu+1}^{m} P_\beta\Big( \max\Big\{ S_{i-1}, \Big\{ \frac{(j-1) X_j}{d^*_{j-1,j}} + S_{i-1} - S_{j-1} \Big\}_{j=\nu}^{i-1} \Big\} < \frac{(i-1) X_i}{d^*_{i-1,i}} \Big).
\end{align*}
The last step rewrites each set and makes it clear that, by Theorem 1, each probability above on the right-hand side achieves its maximum at $\boldsymbol{\beta}_m$ among $\beta \in H_{0,m}$. Therefore, the type I error of $R^*_{m-1,m}$ is bounded by $\alpha$ due to (24). Again, by Theorem 1, each term corresponding to $i < m$ evaluated at $\boldsymbol{\beta}_m$ is smaller than that at $\boldsymbol{\beta}_{m-1}$, which ensures the existence of a finite solution for $d^*_{m-1,m}$.

To conduct the simultaneous tests for the null hypotheses in $B$,
\[ \text{assert not } H_{0,m} \text{ (i.e., assert } H_{A,m}) \text{ if } R^*_{m-1,m} \text{ is true.} \tag{25} \]
Therefore, we have a theorem similar to Theorem 3.

Theorem 4.
The rejection regions $R^*_{m-1,m}$ given in (23) are increasing when $m$ increases and each defines a level-$\alpha$ test for $H_{0,m}$. If one conducts the simultaneous tests for $B$ using (25), then the experimentwise error rate is strongly controlled at $\alpha$.
We omit the description of the step-up testing procedure based on $R^*_{m-1,m}$.

Remark 1.
Langsrud and Næs [7] and Venter and Steel [11] also considered these two step-up procedures. For the same test statistics $W_{\nu,m}$ and $W_{m-1,m}$, they proposed to determine critical values $d^\dagger_{\nu,m}$ and $d^\dagger_{m-1,m}$ iteratively by
\[ P_{\boldsymbol{\beta}_m}\Big( \bigcup_{i=\nu+1}^{m} \{ W_{\nu,i} > d^\dagger_{\nu,i} \} \Big) = \alpha \quad \text{and} \quad P_{\boldsymbol{\beta}_m}\Big( \bigcup_{i=\nu+1}^{m} \{ W_{i-1,i} > d^\dagger_{i-1,i} \} \Big) = \alpha. \tag{26} \]
Intuitively, the solutions $d^\dagger_{\nu,m}$ and $d^\dagger_{m-1,m}$ to the above equations would be smaller than their corresponding cutoff points $d^*_{\nu,m}$ and $d^*_{m-1,m}$ determined by (19) and (24) and would hence result in larger rejection regions. However, it is still not clear that the error rates of their procedures are controlled at $\alpha$ in the strong sense because it is very difficult to establish that for all $\beta \in H_{0,m}$,
\[ P_\beta(R^*_{\nu,m}) \le P_{\boldsymbol{\beta}_m}(R^*_{\nu,m}) \quad \text{or} \quad P_\beta(R^*_{m-1,m}) \le P_{\boldsymbol{\beta}_m}(R^*_{m-1,m}). \tag{27} \]
If, for example, we write $R^*_{\nu,m}$ in the form
\[ T_L(X_1, \ldots, X_\nu) \ (=: S_\nu) < T_R(X_{\nu+1}, \ldots, X_m) \ (=: \max\{ \nu X_i / d^*_{\nu,i} \}_{i=\nu+1}^{m}), \]
then $T_R$ involves more than one argument and Theorem 1 cannot be applied. However, our numerical studies show no evidence against (27).

4.3. An example.
We illustrate the proposed methods using a $2^4$ factorial experiment from [10], pages 246–254, which investigates how temperature, pressure, concentration of formaldehyde and stirring rate influence the filtration rate of a chemical product. The results are presented in Table 1. Column 2 of Table 1 gives the eight effect estimates with largest absolute values and Column 3 the corresponding squared statistics, while $S_7 = \sum_{i=1}^{7} X_i$ equals 15.11 for the seven effect estimates with smallest absolute values. Test statistics $W_{\nu,m}$ and $W_{m-1,m}$ are presented in the next two columns for $\nu = 7$. The SUF procedure identifies the four largest effects as active, irrespective of the two ways of choosing thresholds ($d^*_{\nu,m}$ or $d^\dagger_{\nu,m}$), while the SUS procedure identifies the five largest effects as active, also irrespective of the two threshold selections. For the sake of comparison, a step-down procedure from Voss and Wang [14], which uses test statistics $T_{\mathrm{SD},m} = X_m / \min\{\ldots S_{\ldots}, \ldots S_{\ldots}\}$, identifies the three largest effects as active. Finally, a MATLAB program for the evaluation of cutoff points is available from the authors.

Table 1
The Montgomery [10] data, the step-up tests with fixed scaling (SUF) and the step-up tests with sequential scaling (SUS), and the related cutoff points at level $\alpha$

 m   Effect estimate   X_m      W_{ν,m}   W_{m−1,m}   d†_{ν,m}   d*_{ν,m}   d†_{m−1,m}   d*_{m−1,m}
 8       −2.625          6.89     3.…       …           …          …          …            …
 9        3.125          9.77     4.…       …           …          …          …            …
10        4.125         17.02     7.…       …           …          …          …            …
11        9.875         97.52    45.…       …           …          …          …            …
12       14.625        213.89    99.1      16.…         …          …          …            …
13       16.625        276.39   128.…       …           …          …          …            …
14      −18.125        328.52   152.…       …           …          …          …            …
15       21.625        467.64   216.…       …           …          …          …            …
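The two statistic columns of Table 1 follow directly from (6) with $n = \nu$ and $n = m - 1$. As a small check, the sketch below recomputes them from the squared statistics $X_8, \ldots, X_{15}$ and $S_7 = 15.11$ quoted in the example; the function name `table_statistics` is a label chosen for this illustration, and because the inputs are rounded to two decimals the results may differ from the published table in the last digit.

```python
def table_statistics(s_nu: float, x: dict, nu: int) -> dict:
    """Return {m: (W_{nu,m}, W_{m-1,m})} from S_nu and the X_m for m > nu.

    x maps m to X_m for consecutive m = nu + 1, ..., k.
    """
    rows = {}
    s_prev = s_nu  # running value of S_{m-1}; equals S_nu when m = nu + 1
    for m in sorted(x):
        w_fixed = nu * x[m] / s_nu          # fixed scaling, n = nu
        w_seq = (m - 1) * x[m] / s_prev     # sequential scaling, n = m - 1
        rows[m] = (w_fixed, w_seq)
        s_prev += x[m]                      # update to S_m for the next row
    return rows


if __name__ == "__main__":
    # Squared effect estimates X_8, ..., X_15 and S_7 as given in the example.
    X = {8: 6.89, 9: 9.77, 10: 17.02, 11: 97.52,
         12: 213.89, 13: 276.39, 14: 328.52, 15: 467.64}
    for m, (w_fixed, w_seq) in table_statistics(15.11, X, nu=7).items():
        print(m, round(w_fixed, 2), round(w_seq, 2))
```

For $m = 12$ the fixed-scaling value rounds to 99.1, in agreement with the corresponding entry of Table 1. The two statistics coincide at $m = \nu + 1 = 8$ because both scale by $S_7$ there.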
5. Simulation study.
A limited simulation study was conducted to compare five testing procedures: step-up tests with sequential scaling (SUS and SUSI using cutoff points determined by (24) and (26), respectively), step-up tests with fixed scaling (SUF and SUFI using cutoff points determined by (19) and (26), respectively) and the Voss and Wang [14] step-down procedure (SD). The testing procedures were evaluated in terms of four measures: (1) the experimentwise error rate (EER), (2) the probability of correctly selecting the number of inactive effects (PCSN), (3) the probability of complete correct selection (PCCS) and (4) the expected fraction of active effects that are declared active (Power). The simulation was carried out for several choices of $k$. Since the results are similar, we only present the choice $k = 15$ on six cases below, following Venter and Steel [11]:

C1: $\beta \in H_{0,\ldots}$, $\beta_{\ldots} = s$;
C2: $\beta \in H_{0,\ldots}$, $\beta_{\ldots} = \beta_{\ldots} = \beta_{\ldots} = s$;
C3: $\beta \in H_{0,\ldots}$, $\beta_{\ldots} = \cdots = \beta_{\ldots} = s$;
C4: $\beta \in H_{0,\ldots}$, $\beta_{\ldots} = \cdots = \beta_{\ldots} = s$;
C5: $\beta \in H_{0,\ldots}$, $\beta_i = is$, $\ldots \le i \le \ldots$;
C6: $\beta \in H_{0,\ldots}$, $\beta_i = is$, $\ldots \le i \le \ldots$,

where $s$ takes values from 0 to 8 with a step of 0.02. Each independent sample consists of 15 observations, each from $N(\beta_i, 1)$ for $1 \le i \le 15$. The results reported use $\alpha = 0.05$ and $\nu = 7$, although findings are similar for other choices of $\alpha$ and $\nu$. Each point was determined based on 100,000 simulations. In summary, all procedures control the EER. Second, there is a very small difference between the two ways of choosing cutoff points, especially between SUS and SUSI. Third, in C1, C2 and C5, the SUS is clearly the best. In C4 and C6, there is a small difference between SUF and SUS. In C3, the SUF performs better at small $s$, but the SUS is better at large $s$. The SD seems to be the worst in most selected cases.

Fig. 1. Selected simulation results for five test procedures (SUSI, SUS, SUFI, SUF and SD) using $\alpha = 0.05$, $\nu = 7$, three evaluation measures (EER, PCSN, Power) and six cases of parameter configuration (C1–C6) given in the text.
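The evaluation measures above can be estimated with a short simulation loop. The sketch below is not the authors' simulation code: it uses a fixed-scaling step-up scan of the form (17) with user-supplied cutoffs, and the cutoffs passed in the usage example are placeholders rather than solutions of (19) or (26). An experimentwise error is counted when a true null in $B$ is rejected, that is, when more effects are declared active than are truly nonzero. The names `suf_declared` and `simulate` are hypothetical.

```python
import random


def suf_declared(x_sorted, nu, cutoffs):
    """Number of effects declared active by a fixed-scaling step-up scan.

    x_sorted holds the squared estimates in ascending order; cutoffs[m] is
    the threshold applied to W_{nu,m} = nu * X_m / S_nu for m = nu+1..k.
    """
    k = len(x_sorted)
    s_nu = sum(x_sorted[:nu])
    for m in range(nu + 1, k + 1):
        if nu * x_sorted[m - 1] / s_nu > cutoffs[m]:
            return k - m + 1
    return 0


def simulate(beta, nu, cutoffs, reps=2000, seed=0):
    """Estimate EER, PCSN and Power for one configuration beta (sigma = 1)."""
    rng = random.Random(seed)
    k = len(beta)
    n_active = sum(1 for b in beta if b != 0)
    errors = correct_number = 0
    power_sum = 0.0
    for _ in range(reps):
        # Pairs (squared estimate, truly active?), sorted by squared estimate.
        sample = sorted((rng.gauss(b, 1.0) ** 2, b != 0) for b in beta)
        declared = suf_declared([x for x, _ in sample], nu, cutoffs)
        top = sample[k - declared:]  # the effects declared active
        errors += declared > n_active
        correct_number += declared == n_active
        if n_active:
            power_sum += sum(1 for _, act in top if act) / n_active
    return {"EER": errors / reps,
            "PCSN": correct_number / reps,
            "Power": power_sum / reps if n_active else float("nan")}


if __name__ == "__main__":
    beta = [0.0] * 14 + [4.0]               # one active effect, k = 15
    cutoffs = {m: 50.0 for m in range(8, 16)}  # placeholder thresholds only
    print(simulate(beta, nu=7, cutoffs=cutoffs))
```

Replacing the placeholder thresholds with cutoffs obtained from (19) or (26) would give the SUF and SUFI variants, and swapping the statistic for $W_{m-1,m}$ would give the sequential-scaling versions.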
6. Discussion.
We search for active effects in orthogonal saturated designs by conducting simultaneous tests on a sequence of decreasing null hypotheses. A general class of level-$\alpha$ tests is provided for testing at least a certain number of active effects and the least favorable distribution is identified to be the one at $\boldsymbol{\beta}_m$. Two sets of simultaneous tests are derived with increasing rejection regions and their experimentwise error rates are controlled at $\alpha$ in the strong sense. Between these two sets of tests, the step-up tests with sequential scaling are recommended because our simulation study indicates that $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ has greater power in most cases. Since the maximal type I errors for $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$ and $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ at $m = \nu + 1$ and $k$ are equal to $\alpha$, simply enlarging the rejection regions cannot yield valid level-$\alpha$ tests.

We can show that Lemma 1 is also true if $U \sim F_{1,n}$, an $F$-distribution with 1 and $n$ degrees of freedom, and $V \sim F_{1,n}(\lambda)$, a noncentral $F$-distribution with 1 and $n$ degrees of freedom and noncentrality parameter $\lambda$. Consequently, Theorem 1 remains true if we let $X_1, \ldots, X_k$ be the order statistics of independent random variables with noncentral $F$ distributions $F_{1,n}(\lambda_i)$, $1 \le i \le k$. This implies that our step-up simultaneous tests also work for squares of independent $t$-statistics.

Acknowledgments.
The authors are grateful for the efforts and constructive suggestions of two referees and an Associate Editor.

REFERENCES

[1] Birnbaum, A. (1959). On the analysis of factorial experiments without replication. Technometrics.
[2] Daniel, C. (1959). Use of half-normal plots in interpreting factorial two-level experiments. Technometrics.
[3] Dudoit, S., Shaffer, J. P. and Boldrick, J. (2003). Multiple hypothesis testing in microarray experiments. Statist. Sci.
[4] Hamada, M. and Balakrishnan, N. (1998). Analyzing unreplicated factorial experiments: A review with some new proposals (with discussion). Statist. Sinica.
[5] Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures. Wiley, New York. MR0914493
[6] Hsu, J. C. (1996). Multiple Comparisons. Theory and Methods. Chapman and Hall, London. MR1629127
[7] Langsrud, Ø. and Næs, T. (1998). A unified framework for significance testing in fractional factorials. Comput. Statist. Data Anal.
[8] Lenth, R. V. (1989). Quick and easy analysis of unreplicated factorials. Technometrics.
[9] Marcus, R., Peritz, E. and Gabriel, K. R. (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika.
[10] Montgomery, D. C. (2001). Design and Analysis of Experiments, 5th ed. Wiley, New York.
[11] Venter, J. H. and Steel, S. J. (1998). Identifying active contrasts by stepwise testing. Technometrics.
[12] Voss, D. T. (1988). Generalized modulus-ratio tests for analysis of factorial designs with zero degrees of freedom for error. Comm. Statist. Theory Methods.
[13] Voss, D. T. (1999). Analysis of orthogonal saturated designs. J. Statist. Plann. Inference.
[14] Voss, D. T. and Wang, W. (2006). On adaptive testing in orthogonal saturated designs. Statist. Sinica.
[15] Wang, W. and Voss, D. T. (2001). Control of error rates in adaptive analysis of orthogonal saturated designs. Ann. Statist.
[16] Wang, W. and Voss, D. T. (2003). On adaptive estimation in orthogonal saturated designs. Statist. Sinica.

Division of Biostatistics
University of Florida
Gainesville, Florida 32610
USA