Step-up simultaneous tests for identifying active effects in orthogonal saturated designs

Abstract

A sequence of null hypotheses regarding the number of negligible effects (zero effects) in orthogonal saturated designs is formulated. Two step-up simultaneous testing procedures are proposed to identify active effects (nonzero effects) under the commonly used assumption of effect sparsity. It is shown that each procedure controls the experimentwise error rate at a given α level in the strong sense.


The Annals of Statistics

Institute of Mathematical Statistics, 2007

STEP-UP SIMULTANEOUS TESTS FOR IDENTIFYING ACTIVE EFFECTS IN ORTHOGONAL SATURATED DESIGNS

By Samuel S. Wu and Weizhen Wang

University of Florida and Wright State University


1. Introduction.

Assume a linear model

$$Y_i = \mu + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + \varepsilon_i, \qquad i = 1, \ldots, M, \tag{1}$$

where $\varepsilon_i \sim$ i.i.d. $N(0, \sigma^2)$. The unknown parameters $\beta_i$ are of interest and $\mu$ and $\sigma^2$ are two unknown nuisance parameters. The design is called orthogonal if the least squares estimators $\hat\beta_i$ ($1 \le i \le k$) are uncorrelated (equivalent to independent), which occurs, for example, in two-level fractional factorial designs. The design is said to be saturated if there are just enough observations to estimate the model parameters $\beta_i$ and $\mu$ (i.e., $M = k + 1$), leaving no degrees of freedom to estimate the error variance $\sigma^2$. In order to make inferences on $\beta_i$, one must typically use the assumption of effect sparsity, that is, that most of the $\beta_i$'s are equal to zero. Then we can use the corresponding $\hat\beta_i$'s to estimate $\sigma^2$. However, we do not know how many and which of the $\beta_i$'s are zero. An initial guess would be that at least $\nu$ of the $\beta_i$'s equal zero, say one-half of the effects. Therefore, the smallest $\nu$ of the $\hat\beta_i^2$'s should be used to estimate $\sigma^2$. Any other $\hat\beta_j$ whose square is substantially larger is likely to have a nonzero mean and corresponds to an active effect.

For a fixed sequence $\beta = (\beta_1, \ldots, \beta_k)$, let

$$N = \text{the number of } \beta_i\text{'s which equal zero}. \tag{2}$$

Received May 2004; revised October 2005. Supported in part by NSF Grant DMS-03-08861.

AMS 2000 subject classifications. Primary 62F35, 62F25, 62K15; secondary 62J15.

Key words and phrases. Closed test method, effect sparsity, experimentwise error rate, non-central chi-squared distribution.

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Statistics, 2007, Vol. 35, No. 1, 449–463. This reprint differs from the original in pagination and typographic detail.


Thus, the number of nonzero $\beta_i$'s is equal to $k - N$ and the entire parameter space without nuisance parameters is $H = \{\beta = (\beta_1, \ldots, \beta_k) : N \ge \nu\}$. For each integer $m \in [\nu + 1, k]$, consider the testing problem

$$H_{0,m} : N \ge m \quad \text{vs.} \quad H_{A,m} : N \le m - 1 \ \ (\text{i.e., } k - N \ge k - m + 1) \tag{3}$$

and define a parameter configuration in each $H_{0,m}$,

$$\beta^m =: (0, \ldots, 0, +\infty, \ldots, +\infty), \tag{4}$$

where the first $m$ components are zero. Let

$$B = \{H_{0,m} : \nu + 1 \le m \le k\}, \tag{5}$$

which contains all null hypotheses of interest in this paper. Because $H_{0,i}$ is a subset of $H_{0,j}$ for any $i > j$, if $H_{0,j}$ is incorrect, then so is $H_{0,i}$. This implies that a testing process should be terminated as soon as a rejection occurs for some null hypothesis. Starting from $m = \nu + 1$, we test these hypotheses one at a time as $m$ goes up to $k$. If $H_{0,\nu+1}$ is rejected, we then conclude that there are $k - \nu$ active effects (i.e., $H \cap H_{A,\nu+1}$) and no longer test any other hypotheses; otherwise, we test the next hypothesis $H_{0,\nu+2}$. In general, if $H_{0,m}$ is the first hypothesis being rejected for some $m \le k$, we stop and conclude that there are $k - m + 1$ nonzero effects (i.e., $H_{0,m-1} \cap H_{A,m}$); otherwise, all hypotheses in $B$ are accepted and we conclude that there is no active effect. Clearly, this is a step-up testing procedure.

Many inference procedures have been proposed to identify active effects. The data analysis of orthogonal saturated designs was initially considered by Birnbaum [1] and Daniel [2]. The half-normal plot introduced by Daniel [2] is still being used in the preliminary analysis. Lenth [8] proposed the first adaptive method to let the data determine which and how many of the $\hat\beta_i$'s should be used to estimate $\sigma^2$. Whether Lenth's interval is of level $1 - \alpha$ still remains a question. Besides using the data adaptively, another fundamentally desirable property is the ability to control the error rate in the strong sense (i.e., under all parameter configurations), which is espoused by Hochberg and Tamhane [5]. For orthogonal saturated designs, the first adaptive confidence interval known to provide strong control of error rates and more general results can be found in [15] and [16], respectively. Hamada and Balakrishnan [4] provided a thorough review of the analysis methods available for saturated designs.

Since we do not know which and how many effects are active, it is reasonable to search for active effects using simultaneous tests. Here are two possibilities:

(a) Starting from the largest $\hat\beta_i^2$, test whether the corresponding effect is active. If it is not active, then conclude no active effect and stop; otherwise, test for the second largest $\hat\beta_i^2$ (go down), and so on, until a zero effect is found. These are the step-down tests.

(b) Starting from the $(\nu + 1)$th smallest $\hat\beta_i^2$, test whether the corresponding effect is active. If it is active, then conclude $k - \nu$ active effects and stop; otherwise, test for the $(\nu + 2)$th smallest $\hat\beta_i^2$ (go up), and so on, until an active effect is found. These are the step-up tests.

Voss [12] proposed nonadaptive step-down tests for a set of hypotheses different from $B$ and controlled the experimentwise error rates at the given level $\alpha$ in the strong sense. Recently, Voss and Wang [14] derived adaptive step-down tests with the experimentwise error rates controlled in the same setting as Voss [12]. As pointed out by several researchers [3, 7, 11], the step-up tests typically have a greater power to detect active effects than the step-down tests. For example, Venter and Steel [11] proposed one step-up and one step-down procedure for the orthogonal saturated designs with cutoff points determined at $\beta^m$. However, they were unable to prove that the experimentwise error rates of their procedures are controlled at a given level $\alpha$ in the strong sense.

The rest of this article is organized as follows. In Section 2, we provide the motivation for two desirable tests for $H_{0,m}$ and establish a probability inequality for noncentral $\chi^2$ distributions which asserts that the maximal type I error of any test in a general class is always achieved at $\beta^m$. Based on this inequality, two level-$\alpha$ tests are proposed in Section 3 for testing each single hypothesis $H_{0,m}$. Two sequential step-up procedures for testing all hypotheses in $B$, which control the experimentwise error rates at $\alpha$ in the strong sense, are derived in Section 4. Section 5 presents a simulation study and Section 6 concludes with some discussion.
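The stopping rule described in this section (test $H_{0,\nu+1}, H_{0,\nu+2}, \ldots$ in turn and stop at the first rejection) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping: the function name `step_up_active_count` and the list-of-booleans input are assumptions made here, and how each individual rejection decision is reached is left entirely to the tests developed later in the paper.

```python
def step_up_active_count(rejections, nu):
    """Return the number of effects declared active by a step-up pass.

    rejections[j] is True when H_{0,nu+1+j} is rejected, j = 0, ..., k-nu-1
    (hypothetical input format, chosen for this sketch).
    """
    k = nu + len(rejections)
    for j, rejected in enumerate(rejections):
        m = nu + 1 + j
        if rejected:
            # First rejection: stop and conclude k - m + 1 active effects.
            return k - m + 1
    # Every hypothesis accepted: conclude no active effect.
    return 0


# Hypothetical outcome with k = 7 and nu = 3:
# H_{0,4} and H_{0,5} accepted, H_{0,6} rejected, so 7 - 6 + 1 = 2 active effects.
print(step_up_active_count([False, False, True, True], nu=3))  # prints 2
```

Decisions after the first rejection are never consulted, which mirrors the remark that testing stops as soon as one null hypothesis is rejected.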

2. Motivation and a general probability inequality.

In this section, we provide motivation for the new procedures testing $H_{0,m}$ and a general class [see $A_m$ in (11)] of the rejection regions of level $\alpha$. The maximal type I error of each rejection region in this class is always achieved at $\beta^m$, as stated in Theorem 1.

Assume the factorial effect estimators $\hat\beta_i$ are independently distributed as $N(\beta_i, a_i \sigma^2)$ for known constants $a_i$. We may assume that each $a_i = 1$ without loss of generality. Let $X_1, \ldots, X_k$ be the order statistics of the $\hat\beta_i^2$ for $i \le k$. Intuitively, it is more likely that the small order statistics $X_i$ will correspond to estimators $\hat\beta_j$ with $\beta_j = 0$. If we believe a priori that at least $\nu$ of the $\beta_i$'s are zero, then those $\beta_i$'s corresponding to $X_1, \ldots, X_\nu$ are likely to be the negligible ones. To test $H_{0,m}$, one needs to compare $X_m$ with $X_1, \ldots, X_\nu$. For any integer $n \in [\nu, m]$, let $S_n = \sum_{i=1}^{n} X_i$ and $\bar X_n = S_n / n$ and define a test statistic for $H_{0,m}$ as follows:

$$W_{n,m} = \frac{n X_m}{\sum_{i=1}^{n} X_i} = \frac{n X_m}{S_n} = \frac{X_m}{\bar X_n}. \tag{6}$$
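As a concrete reading of (6), the following minimal sketch computes $W_{n,m}$ from a vector of effect estimates by squaring, sorting and taking the ratio. The function name `w_statistic` and the example numbers are made up for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def w_statistic(beta_hat, n, m):
    """W_{n,m} = n * X_m / S_n, where X_1 <= ... <= X_k are the sorted squared estimates."""
    x = np.sort(np.asarray(beta_hat, dtype=float) ** 2)
    if not 1 <= n <= m <= x.size:
        raise ValueError("need 1 <= n <= m <= k")
    s_n = x[:n].sum()              # S_n: sum of the n smallest squared estimates
    return n * x[m - 1] / s_n      # m is 1-based, as in the text


# Made-up estimates: squared and sorted they are 0.25, 1, 4, 9.
estimates = [1.0, -2.0, 3.0, 0.5]
print(w_statistic(estimates, n=2, m=3))   # 2 * 4 / 1.25
```

Because numerator and denominator scale together, multiplying every estimate by the same constant leaves $W_{n,m}$ unchanged, which is why the unknown $\sigma$ does not need to be supplied.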

Intuitively, a large value of $W_{n,m}$ is in favor of $H_{A,m}$. Therefore, the rejection region should be $W_{n,m} > d_{n,m}$ for a constant $d_{n,m}$ which satisfies

$$\sup_{\beta \in H_{0,m}} P_\beta(W_{n,m} > d_{n,m}) = \alpha. \tag{7}$$

In this section, we will show that $W_{n,m}$ is stochastically largest at $\beta^m$ for $\beta \in H_{0,m}$.

Definition 1. A random variable $X$ is said to be stochastically smaller than $Y$, denoted by $X \prec Y$, if $P(X \le d) \ge P(Y \le d)$ for all $d$.

Let $Y_{1,m}, \ldots, Y_{m,m}$ be the order statistics of the $\hat\beta_i^2$'s for $i \le m$ when $\beta_1 = \cdots = \beta_m = 0$ and let

$$Z_{n,m} = \frac{n Y_{m,m}}{\sum_{i=1}^{n} Y_{i,m}}. \tag{8}$$

Langsrud and Næs [7] also studied $Z_{n,m}$, called the $\Psi_{m,[\ldots],n,m}$-distribution in their notation. Clearly, the distribution of $Z_{n,m}$ does not depend on any parameter and can be sampled based on the order statistics of $m$ independent $\chi^2_1$ random variables. It is easy to see that $W_{n,m} = Z_{n,m}$ at $\beta^m$ and we want to show that $W_{n,m} \prec Z_{n,m}$ on $H_{0,m}$. If this is true, then (7) reduces to

$$P(Z_{n,m} > d_{n,m}) = \alpha \tag{9}$$

and $d_{n,m}$ is the $100(1 - \alpha)$ percentile of $Z_{n,m}$.

Definition 2. A function $h : R^d \to R$ will be called nondecreasing (with respect to the coordinatewise ordering) if $x_i \le y_i$, $i = 1, \ldots, d$, implies that $h(x) \le h(y)$.

Now we prove a general theorem that includes (9) as a special case.

Theorem 1. Suppose that $X_1, \ldots, X_k$ are the order statistics of independent random variables with noncentral chi-squared distributions $\chi^2_1(\beta_i)$, $1 \le i \le k$. Then

$$\sup_{\beta \in H_{0,m}} P_\beta(R) = P_{\beta^m}(R) \tag{10}$$

for any rejection region $R$ of $H_{0,m}$ that belongs to the class

$$A_m = \{T_L(X_1, \ldots, X_{s-1}) < T_R(X_s), \text{ for some integer } 1 < s \le m\}, \tag{11}$$

where $T_L$ and $T_R$ satisfy the following two properties:

(i) (monotone) $T_L(x_1, \ldots, x_{s-1})$ and $T_R(x_s)$ are nondecreasing functions;

(ii) (invariant) $T_L / T_R$ is invariant to scale transformation, that is, for any $a > 0$,

$$\frac{T_L(a x_1, \ldots, a x_{s-1})}{T_R(a x_s)} = \frac{T_L(x_1, \ldots, x_{s-1})}{T_R(x_s)}. \tag{12}$$

Corollary 1. For any $0 < \alpha < 1$ and $d_{n,m}$ given in (9), the rejection region

$$R_{n,m} = \{W_{n,m} > d_{n,m}\} \tag{13}$$

defines a level-$\alpha$ test for (3).

Proof. Let $s = m$, $T_L(x_1, \ldots, x_{m-1}) = \sum_{i=1}^{n} x_i / n$ and $T_R(x_m) = x_m / d_{n,m}$. The claim follows from Theorem 1. □

To prove Theorem 1, we need the facts below (Corollary 2 and Corollary 3).
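Since $Z_{n,m}$ in (8) is parameter-free, the cutoff $d_{n,m}$ in (9) can be approximated by simulation. The sketch below is one way to do this with plain Monte Carlo; it is not the authors' program, and the function names, the default number of replications and the seeds are assumptions made for the illustration.

```python
import numpy as np


def simulate_z(n, m, n_sim=100_000, seed=0):
    """Draw n_sim copies of Z_{n,m} = n * Y_{m,m} / sum_{i<=n} Y_{i,m}."""
    rng = np.random.default_rng(seed)
    # Order statistics of m independent chi-squared(1) variables, one row per replicate.
    y = np.sort(rng.chisquare(df=1, size=(n_sim, m)), axis=1)
    return n * y[:, m - 1] / y[:, :n].sum(axis=1)


def critical_value(n, m, alpha=0.05, n_sim=100_000, seed=0):
    """Monte Carlo approximation of d_{n,m}: the 100(1 - alpha) percentile of Z_{n,m}."""
    return float(np.quantile(simulate_z(n, m, n_sim=n_sim, seed=seed), 1 - alpha))


# Example call for nu = 7 and m = 8; the value depends on the seed and on n_sim.
d_7_8 = critical_value(n=7, m=8, alpha=0.05)
```

The approximation error shrinks as `n_sim` grows; an exact or numerically integrated percentile could replace the empirical quantile without changing the rest of the procedure.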

Lemma 1. For a nonnegative random variable $X$ and a positive number $y$, let $X_y = X / y$, given that $X \le y$. Let $U \sim \chi^2_1$, a chi-squared distribution with one degree of freedom, and $V \sim \chi^2_1(\theta)$, a noncentral chi-squared distribution with one degree of freedom and noncentrality parameter $\theta$. Then we have (i) $U_y \prec V_y$ and (ii) $U_{y_1} \prec U_{y_2}$ for any $y_1 > y_2 > 0$.

Part (ii) of Lemma 1 is identical to Lemma 2 of [7]. The proofs for both stochastic orderings follow from the monotone likelihood ratio function.

Lemma 2. Let $U_1, \ldots, U_{s-1}$ be any independent random variables. Let the same be true for $V_1, \ldots, V_{s-1}$. If $T(x_1, \ldots, x_{s-1})$ is a nondecreasing function and $U_i \prec V_i$ for $i \le s - 1$, then $T(U_1, \ldots, U_{s-1}) \prec T(V_1, \ldots, V_{s-1})$.

This is called by some researchers (see, e.g., [13]) a "stochastic ordering lemma."

Corollary 2. Suppose that $U_i \sim \chi^2_1$, $1 \le i \le s - 1$, and $V_j \sim \chi^2_1(\theta_j)$, $1 \le j \le s - 1$, are independent random variables. For any $y > 0$, let $U_{(i),y}$ be the order statistics of $U_{i,y}$ ($= U_i / y$, given that $U_i \le y$, as defined in Lemma 1) and let $V_{(j),y}$ be the order statistics of $V_{j,y}$. Then for any nondecreasing function $T_L$,

$$T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \prec T_L(V_{(1),y}, \ldots, V_{(s-1),y}). \tag{14}$$

Therefore, for any $t > 0$,

$$P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \le t) \ge P(T_L(V_{(1),y}, \ldots, V_{(s-1),y}) \le t).$$

Proof. Let

$$T_L^*(U_{1,y}, \ldots, U_{s-1,y}) = T_L(U_{(1),y}, \ldots, U_{(s-1),y}).$$

Since each $U_{(i),y}$ is a nondecreasing function in each $U_{j,y}$ and $T_L$ is nondecreasing in each of its arguments, $T_L^*$ is nondecreasing in each of its arguments. Therefore, if one combines part (i) of Lemma 1 with Lemma 2, it can be concluded that

$$T_L(U_{(1),y}, \ldots, U_{(s-1),y}) = T_L^*(U_{1,y}, \ldots, U_{s-1,y}) \prec T_L^*(V_{1,y}, \ldots, V_{s-1,y}) = T_L(V_{(1),y}, \ldots, V_{(s-1),y}).$$ □

Corollary 3. Suppose that $T_L$ and $T_R$ satisfy the monotone and invariant conditions specified in the definition of $A_m$ in (11) and one defines

$$G^*(y) \equiv P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) \le T_R(1)).$$

Then, for any $y_1 > y_2 > 0$,

$$T_L(U_{(1),y_1}, \ldots, U_{(s-1),y_1}) \prec T_L(U_{(1),y_2}, \ldots, U_{(s-1),y_2}).$$

Therefore, $G^*(y)$ is a nondecreasing function.

Proof. Combine part (ii) of Lemma 1 with Lemma 2 and define $T_L^*$ as in Corollary 2. □

Proof of Theorem 1. Consider two samples,

$$\{\hat\beta_1^2, \ldots, \hat\beta_m^2, +\infty, \ldots, +\infty \ (k - m \text{ of them})\} \quad \text{and} \quad \{\hat\beta_1^2, \ldots, \hat\beta_k^2\}.$$

Clearly, the $s$th order statistic of the first sample is stochastically larger than that of the second sample, that is, $X_s \prec Y_{s,m}$ for any given integer $s$ satisfying $1 < s \le m$.

Second, we have

$$\begin{aligned}
P(T_L(Y_{1,m}, \ldots, Y_{s-1,m}) < T_R(Y_{s,m}))
&= E[P(T_L(Y_{1,m}, \ldots, Y_{s-1,m}) < T_R(y) \mid Y_{s,m} = y)] \\
&= E[P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) < T_R(1) \mid Y_{s,m} = y)] \\
&= E[G^*(Y_{s,m})].
\end{aligned}$$

Next, for any partition $\omega = (j_1, j_2, \ldots, j_{s-1})(j_s)(j_{s+1}, \ldots, j_k)$ of the integers 1 to $k$, we denote by $E_\omega$ the event $\{\hat\beta_{j_i}^2 < \hat\beta_{j_s}^2, \forall i < s;\ \hat\beta_{j_s}^2 < \hat\beta_{j_l}^2, \forall l > s\}$. We note that for any $\beta \in H_{0,m}$,

$$\begin{aligned}
P_\beta(R)
&= E_\beta[P_\beta(T_L(X_1, \ldots, X_{s-1}) < T_R(y) \mid X_s = y)] \\
&= E_\beta[P_\beta(T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta\Big[\sum_\omega P_\beta(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \cap E_\omega \mid X_s = y)\Big] \\
&= E_\beta\Big[\sum_\omega P_\beta(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \mid E_\omega, X_s = y)\, P_\beta(E_\omega \mid X_s = y)\Big] \\
&\le E_\beta\Big[\sum_\omega P_{\beta^m}(\{T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1)\} \mid X_s = y)\, P_\beta(E_\omega \mid X_s = y)\Big] \\
&= E_\beta[P_{\beta^m}(T_L(X_{1,y}, \ldots, X_{s-1,y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta[P(T_L(U_{(1),y}, \ldots, U_{(s-1),y}) < T_R(1) \mid X_s = y)] \\
&= E_\beta[G^*(X_s)],
\end{aligned}$$

where the above inequality is due to Corollary 2.

Finally, since $G^*(y)$ is a nondecreasing function due to Corollary 3, the inequality $X_s \prec Y_{s,m}$ implies that $G^*(X_s) \prec G^*(Y_{s,m})$. Therefore, it can be concluded that

$$E_\beta[G^*(X_s)] \le E[G^*(Y_{s,m})].$$ □
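Theorem 1 says that the type I error of a test in the class $A_m$ is largest at $\beta^m$. A small simulation can make this tangible for the statistic $W_{n,m}$: the sketch below estimates the rejection rate at two configurations in $H_{0,m}$, one with all effects zero and one with the active effects very far from zero (a finite stand-in for $\beta^m$). The design sizes ($k = 6$, $m = 4$, $n = 2$), the value 50 used for "far from zero", the seeds and the function names are all choices made for this illustration and do not come from the paper.

```python
import numpy as np


def null_cutoff(n, m, alpha, n_sim=40_000, seed=0):
    """Empirical 100(1 - alpha) percentile of Z_{n,m}, from m chi-squared(1) draws."""
    rng = np.random.default_rng(seed)
    y = np.sort(rng.chisquare(1, size=(n_sim, m)), axis=1)
    return float(np.quantile(n * y[:, m - 1] / y[:, :n].sum(axis=1), 1 - alpha))


def rejection_rate(beta, n, m, d, n_sim=40_000, seed=1):
    """Estimate P_beta(W_{n,m} > d) with unit-variance normal effect estimates."""
    rng = np.random.default_rng(seed)
    est = rng.normal(loc=beta, scale=1.0, size=(n_sim, len(beta)))
    x = np.sort(est ** 2, axis=1)                 # X_1 <= ... <= X_k per replicate
    w = n * x[:, m - 1] / x[:, :n].sum(axis=1)    # W_{n,m}
    return float(np.mean(w > d))


alpha, n, m = 0.05, 2, 4
d = null_cutoff(n, m, alpha)
rate_all_zero = rejection_rate([0, 0, 0, 0, 0, 0], n, m, d)    # N = 6 >= m
rate_far = rejection_rate([0, 0, 0, 0, 50, 50], n, m, d)       # approximates beta^m
print(rate_all_zero, rate_far)
```

Under the theorem one would expect the first rate to stay at or below $\alpha$ and the second to be close to $\alpha$, up to Monte Carlo error.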

3. Two tests for $H_{0,m}$.

In this section, we present two tests for $H_{0,m}$. While both are of level $\alpha$, they are to be used under different circumstances. Let $d_{\nu,m}$ and $d_{m-1,m}$ denote the numbers determined by (9) when $n = \nu$ and $n = m - 1$, respectively.

Theorem 2. For any $0 < \alpha < 1$, the rejection regions

$$R_{\nu,m} = \{W_{\nu,m} > d_{\nu,m}\} \quad \text{and} \quad R_{m-1,m} = \{W_{m-1,m} > d_{m-1,m}\} \tag{15}$$

both define level-$\alpha$ tests for (3).

This theorem is a special case of Corollary 1 when $n = \nu$ and $n = m - 1$. The test based on $R_{\nu,m}$ estimates the error variance by $\bar X_\nu = \sum_{i=1}^{\nu} X_i / \nu$, irrespective of the value of $m$, while the test based on $R_{m-1,m}$ uses the adaptive estimator $\bar X_{m-1} = \sum_{i=1}^{m-1} X_i / (m - 1)$. We refer to these two choices as fixed and sequential scaling, respectively. Region $R_{\nu,m}$ should be applied only when $H_{0,m}$ is of interest and no information is available on whether $H_{0,n}$ is true for any $n < m$. In such a case, we are certain only that at least $\nu$ of the $\hat\beta_i$'s have a zero mean. It is reasonable to compare $X_m$ with $X_1, \ldots, X_\nu$, the smallest $\nu$ of the $\hat\beta_i^2$'s, through their average. If $W_{\nu,m}$ is large, then one concludes that all $\hat\beta_i$'s corresponding to $X_m, \ldots, X_k$ are from populations with nonzero means. On the other hand, if one tests $H_{0,n}$ for $n < m$ sequentially up to $H_{0,m}$ and $H_{0,m-1}$ is accepted, then at least $m - 1$ of the $\beta_i$'s are zero. In this case, one should compare $X_m$ with $X_1, \ldots, X_{m-1}$ and a large value of $W_{m-1,m}$ would lead to a rejection of $H_{0,m}$.

4. Two step-up testing procedures.

When effect sparsity is assumed, we do not know which and how many of the $\beta_i$'s are zero. It is of more interest to conduct tests simultaneously to identify the nonzero effects. The tests developed in the previous section, in fact, can detect whether there is a jump at $X_m$ among $X_1, \ldots, X_k$. However, these tests cannot tell whether the jump, if it exists, is the first one, which is what interests us. Therefore, as mentioned earlier, since $H_{0,m}$ decreases as $m$ increases, one needs to conduct tests sequentially. Like all testing problems, there are two major concerns: to control the experimentwise error rate at a given level $\alpha$, that is,

$$\sup_{\beta \in \bigcup_{m=\nu+1}^{k} H_{0,m}} P_\beta(\text{assert not } H_{0,n}, \text{ which contains } \beta, \text{ for some } n \in [\nu + 1, k]) \le \alpha, \tag{16}$$

and to obtain more powerful tests, which means larger rejection regions.

The first requirement (16) can be ensured by using the closed test procedure proposed by Marcus, Peritz and Gabriel [9]. For details, see [6], page 137. A naïve solution is to assert not $H_{0,m}$ (i.e., to assert $H_{A,m}$) if one rejects $H_{0,i}$ at level $\alpha$ for all $i \ge m$. For example, suppose $R_{\nu,i}$ is used to test for each $H_{0,i}$. Then assert not $H_{0,m}$ iff $R_m = \bigcap_{i=m}^{k} R_{\nu,i}$ is true. This, by the closed test procedure, controls the experimentwise error rate at $\alpha$. Note that $R_m$ decreases as $m$ increases, which contradicts the fact that $H_{0,m}$ is decreasing (we need $R_m$ to increase). Therefore, simply applying the closed test procedure on the tests derived in the previous section only results in less powerful tests for the simultaneous hypotheses. We require that the rejection region for $H_{0,m}$ (a) increases as $m$ gets larger and (b) is of level $\alpha$. In this section, two testing procedures are discussed with their rejection regions denoted by $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$ and $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ corresponding to $R_{\nu,m}$ and $R_{m-1,m}$, respectively.

4.1. The construction of $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$: step-up tests with fixed scaling (SUF).

The general form of $R^*_{\nu,m}$, for $\nu + 1 \le m \le k$, is

$$R^*_{\nu,m} = \bigcup_{i=\nu+1}^{m} \{W_{\nu,i} > d^*_{\nu,i}\} = \Big\{S_\nu < \max\{\nu X_i / d^*_{\nu,i}\}_{i=\nu+1}^{m}\Big\}, \tag{17}$$

where the sequence of constants $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ is determined iteratively below. More precisely, $d^*_{\nu,m}$ depends on $d^*_{\nu,i}$ for $i < m$ and causes $R^*_{\nu,m}$ to have level $\alpha$. It is clear that $R^*_{\nu,m}$ is nondecreasing when $m$ gets larger and is strictly increasing if all $d^*_{\nu,m}$ are finite. We start with the following lemma.

Lemma 3. For a sequence of random variables $\{\Delta_i\}_{i=0}^{s}$, where $s$ is a given positive integer,

$$P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{s}) \le \sum_{i=1}^{s} P(\max\{\Delta_j\}_{j=0}^{i-1} < \Delta_i). \tag{18}$$

Proof. We prove (18) by induction. When $s = 1$, (18) is true. Suppose (18) is true for any $s = n$. Then for $s = n + 1$,

$$\begin{aligned}
P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{n+1})
&\le P(\Delta_0 < \max\{\Delta_i\}_{i=1}^{n}) + P(\max\{\Delta_j\}_{j=0}^{n} \le \Delta_0 < \Delta_{n+1}) \\
&\le \sum_{i=1}^{n+1} P(\max\{\Delta_j\}_{j=0}^{i-1} < \Delta_i).
\end{aligned}$$ □

We now determine the sequence of constants $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ starting from $m = \nu + 1$. For testing $H_{0,\nu+1}$, let $R^*_{\nu,\nu+1} = R_{\nu,\nu+1}$. It is a level-$\alpha$ test by Theorem 2.

For any $\nu + 2 \le m < k$, let $d^*_{\nu,\nu} = \infty$ and suppose that $\{d^*_{\nu,i}\}_{i=\nu+1}^{m-1}$ are available. Then $d^*_{\nu,m}$ is determined by solving

$$\sum_{i=\nu+1}^{m} P_{\beta^m}(A_i) = \alpha, \tag{19}$$

where $A_i = \{\max\{S_\nu, \{\nu X_j / d^*_{\nu,j}\}_{j=\nu}^{i-1}\} < \nu X_i / d^*_{\nu,i}\}$. Note that $R^*_{\nu,m}$ is of level $\alpha$ because for any $\beta \in H_{0,m}$,

$$P_\beta(R^*_{\nu,m}) \le \sum_{i=\nu+1}^{m} P_\beta(A_i) \le \sum_{i=\nu+1}^{m} P_{\beta^m}(A_i) = \alpha, \tag{20}$$

where the first inequality follows from Lemma 3 (with $\Delta_0 = S_\nu$ and $\Delta_{i-\nu} = \nu X_i / d^*_{\nu,i}$ for $i = \nu + 1, \ldots, m$) and the second inequality holds since each term in the summation achieves its maximum at $\beta^m$ by Theorem 1. On the other hand, since the thresholds $\{d^*_{\nu,i}\}_{i=\nu+1}^{m-1}$ satisfy $\sum_{i=\nu+1}^{m-1} P_{\beta^{m-1}}(A_i) = \alpha$ and each term in this summation satisfies $P_{\beta^{m-1}}(A_i) > P_{\beta^m}(A_i)$ by Theorem 1, the last term in the summation of (19) is greater than zero, that is, $P_{\beta^m}(A_m) > 0$. This guarantees $d^*_{\nu,m}$ to be finite, which implies that rejection region $R^*_{\nu,m}$ is larger than $R^*_{\nu,m-1}$.

Finally, for $m = k$, since the null hypothesis $H_{0,k}$ now contains only one parameter configuration $\beta^k$ and $d^*_{\nu,i}$ is available up to $i = k - 1$, one determines $d^*_{\nu,k}$ by solving

$$P_{\beta^k}(R^*_{\nu,k}) = \alpha, \tag{21}$$

which implies that $R^*_{\nu,k}$ is level $\alpha$. Similarly, one can show that $d^*_{\nu,k}$ is finite. Thus, $R^*_{\nu,k}$ is larger than $R^*_{\nu,k-1}$. The determination of $\{d^*_{\nu,m}\}_{m=\nu+1}^{k}$ is completed.

To conduct the simultaneous tests for $B$,

$$\text{assert not } H_{0,m} \text{ (i.e., assert } H_{A,m}\text{) if } R^*_{\nu,m} \text{ is true.} \tag{22}$$

Notice two facts: (1) $B$ is closed under the operation of intersection and (2) for each $\nu + 1 \le m \le k$, $R^*_{\nu,m} = \bigcap_{i=m}^{k} R^*_{\nu,i}$ is level $\alpha$. Therefore, the experimentwise error rate is no greater than $\alpha$ by the closed test procedure. The discussion above is now summarized as the following theorem.

Theorem 3. The rejection regions $R^*_{\nu,m}$ given in (17) increase when $m$ gets larger and each defines a level-$\alpha$ test for $H_{0,m}$. If one conducts simultaneous tests for $B$ using (22), then the experimentwise error rate is controlled at $\alpha$ in the strong sense.

Let $[1], \ldots, [k]$ be random indices such that $\hat\beta_{[1]}^2 < \cdots < \hat\beta_{[k]}^2$. We now describe the step-up testing procedure based on $R^*_{\nu,m}$ as follows:

Step 1: If $R^*_{\nu,\nu+1}$ is true, then conclude that $\beta_{[\nu+1]}, \ldots, \beta_{[k]}$ are the $k - \nu$ active effects ($= H \cap H_{A,\nu+1}$) and stop; otherwise, go to step 2.

Step 2: If $R^*_{\nu,\nu+2}$ is true, then conclude that $\beta_{[\nu+2]}, \ldots, \beta_{[k]}$ are the $k - \nu - 1$ active effects ($= H_{0,\nu+1} \cap H_{A,\nu+2}$) and stop; otherwise, go to step 3.

...

Step $k - \nu$: If $R^*_{\nu,k}$ is true, then conclude that $\beta_{[k]}$ is the only active effect ($= H_{0,k-1} \cap H_{A,k}$) and stop; otherwise, conclude no active effect and stop.

4.2. The construction of $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$: step-up tests with sequential scaling (SUS).

There is another way to conduct the simultaneous tests for $B$. For each integer $m \in [\nu + 1, k]$, we construct a level-$\alpha$ region for $H_{0,m}$, denoted by $R^*_{m-1,m}$ (corresponding to $R_{m-1,m}$ in Section 3), of the form

$$R^*_{m-1,m} = \bigcup_{i=\nu+1}^{m} \{W_{i-1,i} > d^*_{i-1,i}\} = \bigcup_{i=\nu+1}^{m} \{S_\nu < Q_i\} = \Big\{S_\nu < \max\{Q_i\}_{i=\nu+1}^{m}\Big\}, \tag{23}$$

where $Q_i = (i - 1) X_i / d^*_{i-1,i} - S_{i-1} + S_\nu$ and $S_{i-1} = \sum_{j=1}^{i-1} X_j$.

To determine constants $\{d^*_{m-1,m}\}_{m=\nu+1}^{k}$, we first let the constant $d^*_{\nu,\nu+1}$ equal $d_{\nu,\nu+1}$. Suppose that $d^*_{i-1,i}$ is available up to $i = m - 1$ for some $m < k$. We then determine $d^*_{m-1,m}$. Comparing (23) with (17), $R^*_{m-1,m}$ and $R^*_{\nu,m}$ have similar forms. Therefore, similarly to (19), we obtain $d^*_{m-1,m}$ by solving

$$\sum_{i=\nu+1}^{m} P_{\beta^m}\big(\max\{S_\nu, \{Q_j\}_{j=\nu}^{i-1}\} < Q_i\big) = \alpha \tag{24}$$

(with $Q_\nu = 0$). Finally, for $m = k$, since $d^*_{m-1,m}$ is available up to $m = k - 1$, $d^*_{k-1,k}$ is solved by $P_{\beta^k}(R^*_{k-1,k}) = \alpha$. The determination of $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ is thus complete.

Using a discussion similar to that used for $R^*_{\nu,m}$, one can show that $R^*_{m-1,m}$ is a level-$\alpha$ test for $H_{0,m}$ and is increasing in $m$. More specifically, Lemma 3 implies that

$$\begin{aligned}
P_\beta(R^*_{m-1,m})
&\le \sum_{i=\nu+1}^{m} P_\beta\big(\max\{S_\nu, \{Q_j\}_{j=\nu}^{i-1}\} < Q_i\big) \\
&= \sum_{i=\nu+1}^{m} P_\beta\bigg(\max\bigg\{S_{i-1}, \bigg\{\frac{(j-1) X_j}{d^*_{j-1,j}} + S_{i-1} - S_{j-1}\bigg\}_{j=\nu}^{i-1}\bigg\} < \frac{(i-1) X_i}{d^*_{i-1,i}}\bigg).
\end{aligned}$$

The last step rewrites each set and makes it clear that, by Theorem 1, each probability above on the right-hand side achieves its maximum at $\beta^m$ among $\beta \in H_{0,m}$. Therefore, the type I error of $R^*_{m-1,m}$ is bounded by $\alpha$ due to (24). Again, by Theorem 1, each term corresponding to $i < m$ evaluated at $\beta^m$ is smaller than that at $\beta^{m-1}$, which ensures the existence of a finite solution for $d^*_{m-1,m}$.

To conduct the simultaneous tests for the null hypotheses in $B$,

$$\text{assert not } H_{0,m} \text{ (i.e., assert } H_{A,m}\text{) if } R^*_{m-1,m} \text{ is true.} \tag{25}$$

Therefore, we have a theorem similar to Theorem 3.

Theorem 4. The rejection regions $R^*_{m-1,m}$ given in (23) are increasing when $m$ increases and each defines a level-$\alpha$ test for $H_{0,m}$. If one conducts the simultaneous tests for $B$ using (25), then the experimentwise error rate is strongly controlled at $\alpha$.
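The iterative determination of the fixed-scaling cutoffs through (19) lends itself to a Monte Carlo sketch. At $\beta^m$ only $X_1, \ldots, X_m$ are finite and they behave like the order statistics of $m$ independent $\chi^2_1$ variables, so each probability in (19) can be estimated from simulated samples, and the remaining probability mass $\alpha - \sum_{i<m} P_{\beta^m}(A_i)$ fixes $d^*_{\nu,m}$ as an empirical quantile. The code below is an illustration under these assumptions, not the authors' MATLAB program; the function name `suf_cutoffs`, the replication count and the seed are invented here, and the last cutoff $d^*_{\nu,k}$, which is defined by (21) rather than (19), is deliberately left out.

```python
import numpy as np


def suf_cutoffs(nu, k, alpha=0.05, n_sim=50_000, seed=0):
    """Monte Carlo sketch of d*_{nu,m} for m = nu+1, ..., k-1, following (19)."""
    rng = np.random.default_rng(seed)
    d = {}
    for m in range(nu + 1, k):
        # Under beta^m: X_1 <= ... <= X_m are order statistics of m chi-squared(1) draws.
        x = np.sort(rng.chisquare(1, size=(n_sim, m)), axis=1)
        s_nu = x[:, :nu].sum(axis=1)
        # running_max tracks max{S_nu, nu*X_j/d*_j : j < i}; the j = nu term is 0
        # because d*_{nu,nu} is infinite.
        running_max = s_nu.copy()
        used = 0.0                                    # sum over i < m of P(A_i)
        for i in range(nu + 1, m):
            scaled = nu * x[:, i - 1] / d[i]
            used += float(np.mean(running_max < scaled))
            running_max = np.maximum(running_max, scaled)
        remaining = alpha - used                      # target value of P(A_m)
        if remaining <= 0:
            raise RuntimeError("no probability left for A_m; increase n_sim")
        # A_m holds exactly when nu*X_m / running_max exceeds d*_m.
        ratio = nu * x[:, m - 1] / running_max
        d[m] = float(np.quantile(ratio, 1 - remaining))
    return d


cutoffs = suf_cutoffs(nu=3, k=6)   # small hypothetical design: keys 4 and 5
```

For $m = \nu + 1$ the inner loop is empty and the cutoff reduces to the $100(1-\alpha)$ percentile of $Z_{\nu,\nu+1}$, in agreement with $R^*_{\nu,\nu+1} = R_{\nu,\nu+1}$. The sequential-scaling cutoffs in (24) could be handled the same way with $Q_i$ in place of $\nu X_i / d^*_{\nu,i}$.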

We omit the description of the step-up testing procedure based on $R^*_{m-1,m}$.

Remark 1. Langsrud and Næs [7] and Venter and Steel [11] also considered these two step-up procedures. For the same test statistics $W_{\nu,m}$ and $W_{m-1,m}$, they proposed to determine critical values $d^\dagger_{\nu,m}$ and $d^\dagger_{m-1,m}$ iteratively by

$$P_{\beta^m}\Big(\bigcup_{i=\nu+1}^{m} \{W_{\nu,i} > d^\dagger_{\nu,i}\}\Big) = \alpha \quad \text{and} \quad P_{\beta^m}\Big(\bigcup_{i=\nu+1}^{m} \{W_{i-1,i} > d^\dagger_{i-1,i}\}\Big) = \alpha. \tag{26}$$

Intuitively, the solutions $d^\dagger_{\nu,m}$ and $d^\dagger_{m-1,m}$ to the above equations would be smaller than their corresponding cutoff points $d^*_{\nu,m}$ and $d^*_{m-1,m}$ determined by (19) and (24) and would hence result in larger rejection regions. However, it is still not clear that the error rates of their procedures are controlled at $\alpha$ in the strong sense because it is very difficult to establish that for all $\beta \in H_{0,m}$,

$$P_\beta(R^*_{\nu,m}) \le P_{\beta^m}(R^*_{\nu,m}) \quad \text{or} \quad P_\beta(R^*_{m-1,m}) \le P_{\beta^m}(R^*_{m-1,m}). \tag{27}$$

If, for example, we write $R^*_{\nu,m}$ in the form

$$T_L(X_1, \ldots, X_\nu)\ (=: S_\nu) < T_R(X_{\nu+1}, \ldots, X_m)\ (=: \max\{\nu X_i / d^*_{\nu,i}\}_{i=\nu+1}^{m}),$$

then $T_R$ involves more than one argument and Theorem 1 cannot be applied. However, our numerical studies show no evidence against (27).

4.3. An example.

We illustrate the proposed methods using a $2^4$ factorial experiment from [10], pages 246–254, which investigates how temperature, pressure, concentration of formaldehyde and stirring rate influence the filtration rate of a chemical product. The results are presented in Table 1. Column 2 of Table 1 gives the eight effect estimates with largest absolute values and Column 3 the corresponding squared statistics, while $S_7 = \sum_{i=1}^{7} X_i$ equals 15.11 for the seven effect estimates with smallest absolute values. Test statistics $W_{\nu,m}$ and $W_{m-1,m}$ are presented in the next two columns for $\nu = 7$. The SUF procedure identifies four largest active effects, irrespective of the two ways of choosing thresholds ($d^*_{\nu,m}$ or $d^\dagger_{\nu,m}$), while the SUS procedure identifies five largest active effects, also irrespective of the two threshold selections. For the sake of comparison, a step-down procedure from Voss and Wang [14], which uses test statistics $T_{SD,m} = X_m / \min\{[\ldots]\, S_{[\ldots]},\ [\ldots]\, S_{[\ldots]}\}$, identifies three largest active effects. Finally, a MATLAB program for the evaluation of cutoff points is available from the authors.

Table 1. The Montgomery [10] data, the step-up tests with fixed scaling (SUF) and the step-up tests with sequential scaling (SUS), and the related cutoff points

                                 Test statistics             The cutoff points at level α = 0.[…]
 m    Effect estimate   X_m      W_{ν,m}    W_{m−1,m}    d†_{ν,m}   d*_{ν,m}   d†_{m−1,m}   d*_{m−1,m}
 8       −2.625          6.89    3.[…]      […]          […]        […]        […]          […]
 9        3.125          9.77    4.[…]      […]          […]        […]        […]          […]
10        4.125         17.02    7.[…]      […]          […]        […]        […]          […]
11        9.875         97.52    45.[…]     […]          […]        […]        […]          […]
12       14.625        213.89    99.1       16.[…]       […]        […]        […]          […]
13       16.625        276.39    128.[…]    […]          […]        […]        […]          […]
14      −18.125        328.52    152.[…]    […]          […]        […]        […]          […]
15       21.625        467.64    216.[…]    […]          […]        […]        […]          […]
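The two test-statistic columns of Table 1 follow directly from definition (6), the eight listed effect estimates and the reported value $S_7 = 15.11$. The short sketch below recomputes them; the variable names are chosen here for illustration, and the seven smallest estimates enter only through the reported sum, since they are not listed in the table.

```python
import numpy as np

nu = 7
s_nu = 15.11   # S_7 as reported in the text
# The eight effect estimates with largest absolute values (rows m = 8, ..., 15).
estimates = np.array([-2.625, 3.125, 4.125, 9.875, 14.625, 16.625, -18.125, 21.625])

x = estimates ** 2                                  # X_8, ..., X_15
m = np.arange(nu + 1, nu + 1 + x.size)              # 8, ..., 15

# Fixed scaling: W_{nu,m} = nu * X_m / S_nu.
w_fixed = nu * x / s_nu

# Sequential scaling: W_{m-1,m} = (m - 1) * X_m / S_{m-1}, with S_{m-1} built up from S_7.
s_prev = s_nu + np.concatenate(([0.0], np.cumsum(x)[:-1]))
w_seq = (m - 1) * x / s_prev

for row in zip(m, estimates, x, w_fixed, w_seq):
    print("%2d  %8.3f  %7.2f  %7.2f  %7.2f" % row)
```

For $m = 12$ this gives $W_{\nu,m} \approx 99.1$ and $W_{m-1,m} \approx 16.1$, and for $m = 8$ the two statistics coincide because $m - 1 = \nu$. Turning these statistics into decisions still requires the cutoff points, which this sketch does not attempt to reproduce.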

5. Simulation study.

A limited simulation study was conducted to compare five testing procedures: step-up tests with sequential scaling (SUS and SUSI, using cutoff points determined by (24) and (26), respectively), step-up tests with fixed scaling (SUF and SUFI, using cutoff points determined by (19) and (26), respectively) and the Voss and Wang [14] step-down procedure (SD). The testing procedures were evaluated in terms of four measures: (1) the experimentwise error rate (EER), (2) the probability of correctly selecting the number of inactive effects (PCSN), (3) the probability of complete correct selection (PCCS) and (4) the expected fraction of active effects that are declared active (Power). The simulation was carried out for several choices of $k$. Since the results are similar, we only present the choice $k = 15$ on six cases below, following Venter and Steel [11]:

C1: $\beta \in H_{0,[\ldots]}$, $\beta_{[\ldots]} = s$;
C2: $\beta \in H_{0,[\ldots]}$, $\beta_{[\ldots]} = \beta_{[\ldots]} = \beta_{[\ldots]} = s$;
C3: $\beta \in H_{0,[\ldots]}$, $\beta_{[\ldots]} = \cdots = \beta_{[\ldots]} = s$;
C4: $\beta \in H_{0,[\ldots]}$, $\beta_{[\ldots]} = \cdots = \beta_{[\ldots]} = s$;
C5: $\beta \in H_{0,[\ldots]}$, $\beta_i = is$, $[\ldots] \le i \le [\ldots]$;
C6: $\beta \in H_{0,[\ldots]}$, $\beta_i = is$, $[\ldots] \le i \le [\ldots]$;

where $s$ takes values from 0 to 8 with a step of 0.02. Each independent sample consists of 15 observations, each from $N(\beta_i, 1)$ for $1 \le i \le 15$. The results reported here use $\alpha = 0.05$ and $\nu = 7$, although findings are similar for other choices of $\alpha$ and $\nu$. Each point was determined based on 100,000 simulations. In summary, first, all procedures control the EER. Second, there is a very small difference between the two ways of choosing cutoff points, especially between SUS and SUSI. Third, in C1, C2 and C5, the SUS is clearly the best. In C4 and C6, there is a small difference between SUF and SUS. In C3, the SUF performs better at small $s$, but the SUS is better at large $s$. The SD seems to be the worst in most selected cases.

Fig. 1. Selected simulation results for five test procedures (SUSI, SUS, SUFI, SUF and SD) using $\alpha = 0.05$, $\nu = 7$, three evaluation measures (EER, PCSN, Power) and six cases of parameter configuration (C1–C6) given in the text. Panels: EER, PCSN and Power for C1, C3 and C5, and Power for C2, C4 and C6, each plotted against $s$ from 0 to 8.
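To make the four evaluation measures concrete, the sketch below scores a single simulated replicate, given the true coefficient vector, the effect estimates and the number of effects a step-up procedure declared active. It reflects one reading of the definitions above: an experimentwise error is counted when more effects are declared active than are truly nonzero, which is what (16) amounts to for hypotheses about the number of zero effects. The function name, the dictionary keys and the example numbers are assumptions made for this illustration; averaging the entries over many replicates would give Monte Carlo estimates of EER, PCSN, PCCS and Power.

```python
import numpy as np


def evaluate_selection(beta, beta_hat, n_declared):
    """Score one replicate: the n_declared largest squared estimates are declared active."""
    beta = np.asarray(beta, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    k = beta.size
    order = np.argsort(beta_hat ** 2)                      # ascending squared estimates
    declared = set(order[k - n_declared:].tolist())        # indices declared active
    active = set(np.flatnonzero(beta != 0).tolist())       # truly nonzero effects
    return {
        # Error in the sense of (16): a true null hypothesis H_{0,m} was rejected.
        "error": n_declared > len(active),
        # Correct number of inactive (equivalently, active) effects selected.
        "correct_number": n_declared == len(active),
        # Complete correct selection: exactly the active effects were declared.
        "complete_correct": declared == active,
        # Fraction of active effects declared active (undefined with no active effects).
        "power_fraction": len(declared & active) / len(active) if active else float("nan"),
    }


# Made-up replicate with k = 5 and two active effects.
result = evaluate_selection(beta=[0, 0, 0, 2, 3],
                            beta_hat=[0.1, -0.5, 0.2, 1.9, -3.2],
                            n_declared=2)
```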

6. Discussion.

We search for active effects in orthogonal saturated designs by conducting simultaneous tests on a sequence of decreasing null hypotheses. A general class of level-$\alpha$ tests is provided for testing at least a certain number of active effects and the least favorable distribution is identified to be the one at $\beta^m$. Two sets of simultaneous tests are derived with increasing rejection regions and their experimentwise error rates are controlled at $\alpha$ in the strong sense. Between these two sets of tests, the step-up tests with sequential scaling are recommended because our simulation study indicates that $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ has greater power in most cases. Since the maximal type I errors for $\{R^*_{\nu,m}\}_{m=\nu+1}^{k}$ and $\{R^*_{m-1,m}\}_{m=\nu+1}^{k}$ at $m = \nu + 1$ and $k$ are equal to $\alpha$, simply enlarging the rejection regions cannot yield valid level-$\alpha$ tests.

We can show that Lemma 1 is also true if $U \sim F_{1,n}$, an $F$-distribution with 1 and $n$ degrees of freedom, and $V \sim F_{1,n}(\lambda)$, a noncentral $F$-distribution with 1 and $n$ degrees of freedom and noncentrality parameter $\lambda$. Consequently, Theorem 1 remains true if we let $X_1, \ldots, X_k$ be the order statistics of independent random variables with noncentral $F$ distributions $F_{1,n}(\lambda_i)$, $1 \le i \le k$. This implies that our step-up simultaneous tests also work for squares of independent $t$-statistics.

Acknowledgments. The authors are grateful for the efforts and constructive suggestions of two referees and an Associate Editor.

REFERENCES

[1] Birnbaum, A. (1959). On the analysis of factorial experiments without replication. Technometrics.
[2] Daniel, C. (1959). Use of half-normal plots in interpreting factorial two-level experiments. Technometrics.
[3] Dudoit, S., Shaffer, J. P. and Boldrick, J. (2003). Multiple hypothesis testing in microarray experiments. Statist. Sci.
[4] Hamada, M. and Balakrishnan, N. (1998). Analyzing unreplicated factorial experiments: A review with some new proposals (with discussion). Statist. Sinica.
[5] Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures. Wiley, New York. MR0914493
[6] Hsu, J. C. (1996). Multiple Comparisons. Theory and Methods. Chapman and Hall, London. MR1629127
[7] Langsrud, Ø. and Næs, T. (1998). A unified framework for significance testing in fractional factorials. Comput. Statist. Data Anal.
[8] Lenth, R. V. (1989). Quick and easy analysis of unreplicated factorials. Technometrics.
[9] Marcus, R., Peritz, E. and Gabriel, K. R. (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika.
[10] Montgomery, D. C. (2001). Design and Analysis of Experiments, 5th ed. Wiley, New York.
[11] Venter, J. H. and Steel, S. J. (1998). Identifying active contrasts by stepwise testing. Technometrics.
[12] Voss, D. T. (1988). Generalized modulus-ratio tests for analysis of factorial designs with zero degrees of freedom for error. Comm. Statist. Theory Methods.
[13] Voss, D. T. (1999). Analysis of orthogonal saturated designs. J. Statist. Plann. Inference.
[14] Voss, D. T. and Wang, W. (2006). On adaptive testing in orthogonal saturated designs. Statist. Sinica.
[15] Wang, W. and Voss, D. T. (2001). Control of error rates in adaptive analysis of orthogonal saturated designs. Ann. Statist.
[16] Wang, W. and Voss, D. T. (2003). On adaptive estimation in orthogonal saturated designs. Statist. Sinica.

Division of Biostatistics
University of Florida
Gainesville, Florida 32610
USA