Marginal modeling of cluster-period means and intraclass correlations in stepped wedge designs with binary outcomes

Abstract

Stepped wedge cluster randomized trials (SW-CRTs) with binary outcomes are increasingly used in prevention and implementation studies. Marginal models represent a flexible tool for analyzing SW-CRTs with population-averaged interpretations, but the joint estimation of the mean and intraclass correlation coefficients (ICCs) can be computationally intensive due to large cluster-period sizes. Motivated by the need for marginal inference in SW-CRTs, we propose a simple and efficient estimating equations approach to analyze cluster-period means. We show that the quasi-score for the marginal mean defined from individual-level observations can be reformulated as the quasi-score for the same marginal mean defined from the cluster-period means. An additional mapping of the individual-level ICCs into correlations for the cluster-period means further provides a rigorous justification for the cluster-period approach. The proposed approach addresses a long-recognized computational burden associated with estimating equations defined based on individual-level observations, and enables fast point and interval estimation of the intervention effect and correlations. We further propose matrix-adjusted estimating equations to improve the finite-sample inference for ICCs. By providing a valid approach to estimate ICCs within the class of generalized linear models for correlated binary outcomes, this article operationalizes key recommendations from the CONSORT extension to SW-CRTs, including the reporting of ICCs.


Biostatistics (2021). Marginal Modeling of Cluster-Period Means and Intraclass Correlations in Stepped Wedge Designs with Binary Outcomes

FAN LI∗, HENGSHI YU, PAUL J. RATHOUZ, ELIZABETH L. TURNER, JOHN S. PREISSER

Department of Biostatistics, Yale School of Public Health, New Haven, CT, U.S.A.
Center for Methods in Implementation and Prevention Science, Yale School of Public Health, New Haven, CT, U.S.A.
Department of Biostatistics, University of Michigan, Ann Arbor, MI, U.S.A.
Department of Population Health, The University of Texas at Austin, Austin, TX, U.S.A.
Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, U.S.A.
Department of Biostatistics, University of North Carolina, Chapel Hill, NC, U.S.A.

∗[email protected]

∗ To whom correspondence should be addressed.

© The Author 2021. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected]

Key words: Cluster randomized trials; Generalized estimating equations; Matrix-adjusted estimating equations (MAEE); Intraclass correlation coefficient; Statistical efficiency; Finite-sample correction.

1. Introduction

1.1 Overview and Objectives

Cluster randomized trials (CRTs) are pragmatic clinical trials that test interventions applied to groups or clusters (Hayes and Moulton, 2009). Methodology for designing, conducting and analyzing CRTs has been rigorously developed over decades (Turner and others, 2017a,b). A principal, but not the sole, reason why CRTs are considered is that the intervention has one or more components defined at the cluster level. Increasingly, CRTs employ stepped wedge (SW) designs, which are one-way crossover designs where all clusters start out in the control condition and switch to the intervention at randomly assigned time points (Hussey and Hughes, 2007). Logistical considerations such as the need to deliver the intervention in stages and the desire to eventually implement the intervention in all clusters are key factors involved in the decision to adopt a SW-CRT. Given the increasing popularity of these designs, the development of statistical methods and computational tools for valid analysis is critically important.

For the past decade, the design and analysis of SW-CRTs have mostly been based on linear mixed models ([...] and others, 2020). Particularly, a major direction of research has been to study these methods under different random effects structures whose choice induces a marginal covariance structure (Hooper and others, 2016; Kasza and others, 2019). While not as frequently studied in the SW-CRT literature, generalized linear mixed models (GLMMs) are a broad class of cluster-specific models to analyze clustered binary outcomes. However, their application carries a couple of caveats. First, with few exceptions (e.g., identity link), the interpretation of the intervention effect changes according to different specifications of the latent random-effects structure. Second, while GLMMs are flexible insofar as accounting for the dependence of observations within clusters via random effects, they may not adequately describe the pattern and magnitude of intraclass correlation structures on the natural measurement scale of the outcomes. This is because exact expressions for the marginal mean and correlation are generally lacking for GLMMs with a non-identity link function (Zeger and others, 1988). Perhaps for this reason, while GLMMs are used in the analysis of SW-CRTs with binary outcomes, they are seldom used as the basis for planning SW-CRTs, with an exception in Zhou and others (2020), who developed a numerical approach for power calculation using the random-intercept linear probability model.

Motivated by the Washington State Expedited Partner Therapy (EPT) trial (Golden and others, 2015), we consider marginal model based analyses of SW-CRTs with binary outcomes and flexible choice of link functions. Marginal models separately specify the mean and the intraclass correlation structures, with the interpretation of the marginal mean parameters remaining the same regardless of correlation specification (Liang and Zeger, 1986; Zeger and others, 1988). Further, the marginal modeling approach may be more robust because misspecification of the correlation structure does not affect the consistency of the regression parameter estimator in the marginal mean model. Finally, the marginal modeling framework permits direct estimation of intraclass correlation coefficients (ICCs) and assessment of their uncertainty on the natural measurement scale of the outcomes. Such information is particularly useful as input parameters for sample size determination in cluster trials, generally (Preisser and others, 2003), and in SW-CRTs, specifically (Li and others, 2018). Accurate reporting of the intraclass correlation structures has long been advocated in parallel CRTs (Preisser and others, 2007) and aligns with item 17a of the recent CONSORT extension to SW-CRTs, which recommends reporting of various intraclass correlation estimates to facilitate the planning of future trials (Hemming and others, 2018).

1.2 Motivating Study: The Washington State EPT Trial

The Washington State EPT trial is a SW-CRT that evaluates the population effect of an expedited patient-delivered partner notification strategy versus the standard partner notification for the treatment of Chlamydia and Gonorrhea infection (Golden and others, 2015). The intervention includes the promotion of patient-delivered partner therapy through commercial pharmacies and targeted provision of public health partner services, and was designed to increase treatment adoption for sex partners of individual heterosexual patients. The randomization is carried out at the level of local health jurisdiction (LHJ), namely the administrative unit corresponding to a single county. Each LHJ is a cluster, and a total of 23 LHJs were randomized from 2007 to 2010 over four waves until the intervention had been disseminated in all LHJs. Cross-sectional surveys were conducted based on sentinel women aged 14 to 25 years in each LHJ at baseline and in between waves to measure the prevalence of Chlamydia and Gonorrhea. Due to the cross-sectional design, different women are included in different periods.

Following Golden and others (2015), we restrict the analysis to the 22 LHJs that provide individual-level data on the Chlamydia outcome. Define $Y_{ijk}$ as the binary Chlamydia infection status for sentinel woman $k = 1, \ldots, n_{ij}$ surveyed during period $j = 1, \ldots, J$ in LHJ $i = 1, \ldots, I$; the value of $Y_{ijk}$ equals 1 if the sentinel woman reports Chlamydia and 0 otherwise. Li and others (2018) and Li (2020) specified the following marginal mean model for SW-CRTs:

$$g(\mu_{ijk}) = \beta_j + X_{ij}\delta, \tag{1.1}$$

where $\mu_{ijk}$ is the mean of $Y_{ijk}$, $g$ is the link function, $\beta_j$ is the $j$th period effect, $X_{ij}$ is the intervention indicator, and $\delta$ is the time-adjusted average intervention effect on the link function scale. Because it is also of interest to report the within-period and between-period correlations, Li and others (2018) proposed a paired estimating equations approach to simultaneously estimate the intervention effect and correlation parameters. However, in many cross-sectional SW-CRTs, the cluster-period sizes $n_{ij}$ are large and highly variable. In the EPT trial design, for example, the cluster-by-period diagram in Figure 1 shows that the cluster size $n_{i+} = \sum_{j=1}^{J} n_{ij}$ ranges from 277 to 5393. As the correlation estimating equations can involve as many as [...]

2. Modeling Cluster-Period Means in Cross-Sectional Designs

Because marginal mean model (1.1) is a function of both period and intervention, we consider collapsing the individual-level outcomes to cluster-period means $\bar{Y}_i = (\bar{Y}_{i1}, \ldots, \bar{Y}_{iJ})^T = (Y_{i1+}/n_{i1}, \ldots, Y_{iJ+}/n_{iJ})^T$, where $Y_{ij+} = \sum_{k=1}^{n_{ij}} Y_{ijk}$ is the cluster-period total, and $n_{ij}$ the cluster-period size. We assume that the cluster-period sizes are variable, which is almost always the case with cross-sectional SW-CRTs. Let the mean of $\bar{Y}_i$ be $\mu_i = (\mu_{i1}, \ldots, \mu_{iJ})^T$, where for a binary outcome $\mu_{ij} = E(Y_{ij+})/n_{ij}$ is the prevalence in the $(i,j)$th cluster-period. Because the right hand side of marginal model (1.1) depends only on cluster and period, aggregating over cluster-periods implies the same marginal model, $g(\mu_{ij}) = \beta_j + X_{ij}\delta$. Writing $\theta = (\beta_1, \ldots, \beta_J, \delta)^T$, the generalized estimating equations (GEE; Liang and Zeger, 1986) for $\theta$ are

$$\sum_{i=1}^{I} D_{1i}^T V_{1i}^{-1} (\bar{Y}_i - \mu_i) = 0, \tag{2.2}$$

where $D_{1i} = \partial\mu_i/\partial\theta^T$ and $V_{1i} = \mathrm{cov}(\bar{Y}_i)$ is the working covariance matrix parameterized by the individual-level variances and pairwise correlations. This is the usual GEE applied to correlated binomial data, and our novel contribution is to enable the estimation of individual-level ICCs that parameterize the cluster-period mean covariance $V_{1i}$.

Several prior efforts collapsed individual-level observations for analyzing SW-CRTs. Hussey and Hughes (2007) suggested a linear mixed model based on cluster-period means with a random intercept. For binary outcomes, their approach only estimates the treatment effect on the risk difference scale and does not estimate valid ICCs defined from the individual-level model under variable cluster sizes (see SM Appendix A for details). Thompson and others (2018) proposed a permutation test based on cluster-period means. However, their approach assumed working independence and ignored the estimation of correlation structures. Our approach differs from these two earlier efforts by allowing arbitrary link functions in the marginal mean model for binary outcomes as well as by enabling valid estimation and inference for ICC structures.

In cross-sectional designs, distinct sets of participants are included in each period, which requires modeling both the within-period and between-period correlations for each pair of individual-level outcomes. We consider two multilevel correlation structures: the nested exchangeable and the exponential decay structures. The nested exchangeable correlation structure (Li and others, 2018) differentiates between the within-period and between-period ICCs. Specifically, this structure assumes a constant correlation $\alpha_0$ between two individual outcomes from the same cluster within the same period, and a constant correlation $\alpha_1$ between two individual outcomes from the same cluster across two periods. Equating $\alpha_0$ with $\alpha_1$ leads to the simple exchangeable structure as in standard GEE analyses (Hussey and Hughes, 2007). The exponential decay correlation structure was recently introduced in the context of linear mixed models (Kasza and others, 2019; Li and others, 2020). While this structure assumes a constant correlation $\alpha_0$ between two individual outcomes from the same cluster within the same period, it allows the between-period correlation to decay at an exponential rate. Mathematically, the correlation between two outcomes measured in the $j$th and $l$th periods ($1 \le j, l \le J$) is $\alpha_0 \rho^{|j-l|}$ ($0 \le \rho \le 1$). When $\rho = 1$, the exponential decay structure also reduces to the simple exchangeable structure. Example matrix forms of these correlation structures are provided in STable 1 in the SM. In the following two sections, we develop estimation and inference strategies under each of these correlation structures.

2.1 Nested Exchangeable Correlation Structure

The individual-level correlation structure informs the specification of covariances for cluster-period means. Under the nested exchangeable correlation structure, the diagonal element of $V_{1i}$ is

$$\sigma_{ijj} = \mathrm{var}(\bar{Y}_{ij}) = \frac{\nu_{ij}}{n_{ij}}\{1 + (n_{ij} - 1)\alpha_0\}, \tag{2.3}$$

where $\nu_{ij} = \mu_{ij}(1 - \mu_{ij})$ is the binomial variance. The design effect, $1 + (n_{ij} - 1)\alpha_0$, is the classic variance inflation factor for over-dispersed binomial outcomes. The off-diagonal element of $V_{1i}$ is

$$\sigma_{ijl} = \mathrm{cov}(\bar{Y}_{ij}, \bar{Y}_{il}) = \sqrt{\nu_{ij}\nu_{il}}\,\alpha_1. \tag{2.4}$$

When all $n_{ij} \to \infty$, $\mathrm{var}(\bar{Y}_{ij}) \to \nu_{ij}\alpha_0$ and the pairwise cluster-period mean correlation $\mathrm{corr}(\bar{Y}_{ij}, \bar{Y}_{il}) \to \alpha_1/\alpha_0$, which is identical to the cluster autocorrelation defined in Hooper and others (2016) and Li and others (2020) based on linear mixed models. This also suggests that the cluster-period means are approximately exchangeable when all $n_{ij}$'s are large, but such an approximation may be crude in cases such as the motivating study where the $n_{ij}$'s vary from 19 to 1553.

Defining $\alpha = (\alpha_0, \alpha_1)^T$, we specify the covariance estimating equations (Zhao and Prentice, 1990) to estimate $\alpha$:

$$\sum_{i=1}^{I} D_{2i}^T V_{2i}^{-1} (S_i - \eta_i) = 0, \tag{2.5}$$

where $\eta_i = (\sigma_{i11}, \sigma_{i12}, \ldots, \sigma_{i22}, \ldots)^T$, $S_i = (s_{i11}, s_{i12}, \ldots, s_{i22}, \ldots)^T$, $s_{ijl} = (\bar{Y}_{ij} - \hat\mu_{ij})(\bar{Y}_{il} - \hat\mu_{il})$ is the residual cross-product, $D_{2i} = \partial\eta_i/\partial\alpha^T$ and $V_{2i}$ is the working variance for $S_i$. Parametric specification of working covariances $V_{2i}$ requires the joint distributions of within-cluster triplets and quartets, which are not provided from the specification of marginal mean and covariances. Hence, a practical strategy is to set $V_{2i}$ as the identity matrix (Sharples and Breslow, 1992). In this case, the following closed-form updates are implied from (2.5):

$$\hat\alpha_0 = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(\frac{n_{ij}-1}{n_{ij}}\right)\hat\nu_{ij}\left(s_{ijj} - \frac{\hat\nu_{ij}}{n_{ij}}\right)}{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(\frac{n_{ij}-1}{n_{ij}}\right)^2\hat\nu_{ij}^2}, \qquad \hat\alpha_1 = \frac{\sum_{i=1}^{I}\sum_{j \neq l} s_{ijl}\sqrt{\hat\nu_{ij}\hat\nu_{il}}}{\sum_{i=1}^{I}\sum_{j \neq l}\hat\nu_{ij}\hat\nu_{il}}. \tag{2.6}$$

Noticeably, even though the cluster-period sizes $n_{ij}$ could be large and pose a computational challenge for individual-level paired estimating equations, cluster-period aggregation reduces the effective cluster sizes to $J$, the number of periods, which rarely exceeds 10 (Grayling and others, 2017). On the other hand, because SW-CRTs often involve a limited number of clusters (fewer than 30), the residual vector $\bar{Y}_i - \hat\mu_i$ could be biased towards zero due to overfitting, leading to finite-sample bias in the estimation of correlation parameters. Here we extend the multiplicative adjustment of Preisser and others (2008) to the covariance estimating equations (2.5) by the following argument. Because $E[(\bar{Y}_i - \hat\mu_i)(\bar{Y}_i - \hat\mu_i)^T] \approx (\mathbf{I} - H_{1i})\,\mathrm{cov}(\bar{Y}_i)$, where $H_{1i} = D_{1i}\left(\sum_{i=1}^{I} D_{1i}^T V_{1i}^{-1} D_{1i}\right)^{-1} D_{1i}^T V_{1i}^{-1}$ is the cluster leverage, a bias-adjusted, and hence more accurate, estimator for the covariance of $\bar{Y}_i$ is obtained as

$$\widetilde{\mathrm{cov}}(\bar{Y}_i) = (\mathbf{I} - H_{1i})^{-1}(\bar{Y}_i - \hat\mu_i)(\bar{Y}_i - \hat\mu_i)^T, \tag{2.7}$$

where $H_{1i}$ is evaluated at $\hat\theta$. Improved estimation of correlation parameters may then be achieved by replacing $S_i$ in (2.5) with $\tilde{S}_i = (\tilde{s}_{i11}, \tilde{s}_{i12}, \ldots, \tilde{s}_{i22}, \ldots)^T$, where $\tilde{s}_{ijl}$ is the $(j,l)$th element of the bias-adjusted covariance $\widetilde{\mathrm{cov}}(\bar{Y}_i)$.
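To make the closed-form updates in (2.6) concrete, the following Python sketch builds the nested exchangeable covariance of the cluster-period means from (2.3) and (2.4) and then applies the updates with identity working variance. The function names `nested_exchangeable_cov` and `update_alpha` and the toy inputs are illustrative assumptions, not the authors' software; passing bias-adjusted cross-products in place of `S_list` would correspond to the matrix-adjusted version.

```python
from math import sqrt

def nested_exchangeable_cov(mu, n, alpha0, alpha1):
    """Covariance of one cluster's period means under (2.3) and (2.4)."""
    J = len(mu)
    nu = [m * (1.0 - m) for m in mu]  # binomial variances nu_ij
    V = [[sqrt(nu[j] * nu[l]) * alpha1 for l in range(J)] for j in range(J)]
    for j in range(J):
        # diagonal: variance of a binomial mean times the design effect
        V[j][j] = nu[j] / n[j] * (1.0 + (n[j] - 1.0) * alpha0)
    return V

def update_alpha(S_list, mu_list, n_list):
    """Closed-form updates (2.6), taking the working variance as identity.

    S_list holds one J x J matrix of residual cross-products per cluster.
    """
    num0 = den0 = num1 = den1 = 0.0
    for S, mu, n in zip(S_list, mu_list, n_list):
        J = len(mu)
        nu = [m * (1.0 - m) for m in mu]
        for j in range(J):
            w = (n[j] - 1.0) / n[j] * nu[j]  # derivative of sigma_ijj in alpha0
            num0 += w * (S[j][j] - nu[j] / n[j])
            den0 += w * w
            for l in range(J):
                if l != j:
                    num1 += S[j][l] * sqrt(nu[j] * nu[l])
                    den1 += nu[j] * nu[l]
    return num0 / den0, num1 / den1

# Hypothetical cluster with three periods and very unequal sizes.
mu = [0.25, 0.22, 0.20]
n = [19, 300, 1553]
V = nested_exchangeable_cov(mu, n, alpha0=0.03, alpha1=0.015)
alpha0_hat, alpha1_hat = update_alpha([V], [mu], [n])
print(alpha0_hat, alpha1_hat)
```

Feeding the model-implied covariance back in as the cross-product matrix should recover the generating values of the two ICCs up to floating-point error, which is a quick consistency check on the formulas.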
We will similarly define the cluster leverage for the covariance estimating equations as $H_{2i} = D_{2i}\left(\sum_{i=1}^{I} D_{2i}^T V_{2i}^{-1} D_{2i}\right)^{-1} D_{2i}^T V_{2i}^{-1}$, which is evaluated at $\hat\theta$ and $\hat\alpha$. When the number of clusters $I$ is large, the joint distribution of $I^{1/2}(\hat\theta - \theta)$, $I^{1/2}(\hat\alpha - \alpha)$ is Gaussian with mean zero and covariances estimated by

$$I \times \begin{pmatrix} \Omega & 0 \\ Q & P \end{pmatrix} \begin{pmatrix} \Lambda_{11} & \Lambda_{12} \\ \Lambda_{12}^T & \Lambda_{22} \end{pmatrix} \begin{pmatrix} \Omega & Q^T \\ 0 & P \end{pmatrix}, \tag{2.8}$$

where $\Omega = \left\{\sum_{i=1}^{I} D_{1i}^T V_{1i}^{-1} D_{1i}\right\}^{-1}$, $P = \left\{\sum_{i=1}^{I} D_{2i}^T V_{2i}^{-1} D_{2i}\right\}^{-1}$, $Q = P\left\{\sum_{i=1}^{I} D_{2i}^T V_{2i}^{-1} \frac{\partial S_i}{\partial\theta^T}\right\}\Omega$,

$$\begin{aligned}
\Lambda_{11} &= \sum_{i=1}^{I} C_{1i} D_{1i}^T V_{1i}^{-1} B_{1i} (\bar{Y}_i - \hat\mu_i)(\bar{Y}_i - \hat\mu_i)^T B_{1i}^T V_{1i}^{-1} D_{1i} C_{1i}^T, \\
\Lambda_{12} &= \sum_{i=1}^{I} C_{1i} D_{1i}^T V_{1i}^{-1} B_{1i} (\bar{Y}_i - \hat\mu_i)(\tilde{S}_i - \hat\eta_i)^T B_{2i}^T V_{2i}^{-1} D_{2i} C_{2i}^T, \\
\Lambda_{22} &= \sum_{i=1}^{I} C_{2i} D_{2i}^T V_{2i}^{-1} B_{2i} (\tilde{S}_i - \hat\eta_i)(\tilde{S}_i - \hat\eta_i)^T B_{2i}^T V_{2i}^{-1} D_{2i} C_{2i}^T,
\end{aligned}$$

and the choice of $\{C_{1i}, C_{2i}\}$ and $\{B_{1i}, B_{2i}\}$ is discussed in the following. If we set $C_{1i} = \mathbf{I}_{\dim(\theta)}$, $C_{2i} = \mathbf{I}_{\dim(\alpha)}$, $B_{1i} = \mathbf{I}_{\dim(\bar{Y}_i)}$, $B_{2i} = \mathbf{I}_{\dim(\tilde{S}_i)}$, equation (2.8) becomes the robust sandwich variance in the spirit of Zhao and Prentice (1990), or BC0. Because the number of clusters included in SW-CRTs is frequently less than 30, the following finite-sample bias corrections could provide improved inference for $\theta$ and $\alpha$. Specifically, setting $C_{1i}$, $C_{2i}$ as identity but $B_{1i} = (\mathbf{I}_{\dim(\bar{Y}_i)} - H_{1i})^{-1/2}$, $B_{2i} = (\mathbf{I}_{\dim(\tilde{S}_i)} - H_{2i})^{-1/2}$ results in the bias-corrected covariance that extends Kauermann and Carroll (2001), or BC1. Setting $C_{1i}$, $C_{2i}$ as identity but $B_{1i} = (\mathbf{I}_{\dim(\bar{Y}_i)} - H_{1i})^{-1}$, $B_{2i} = (\mathbf{I}_{\dim(\tilde{S}_i)} - H_{2i})^{-1}$ results in the bias-corrected covariance that extends Mancl and DeRouen (2001), or BC2. Finally, setting $B_{1i}$, $B_{2i}$ as identity but $C_{1i} = \mathrm{diag}\{(1 - \min\{\zeta_1, [D_{1i}^T V_{1i}^{-1} D_{1i}\Omega]_{jj}\})^{-1/2}\}$, $C_{2i} = \mathrm{diag}\{(1 - \min\{\zeta_2, [D_{2i}^T V_{2i}^{-1} D_{2i} P]_{jj}\})^{-1/2}\}$ extends Fay and Graubard (2001), or BC3. Usually we set $\zeta_1 = \zeta_2 = 0.75$ to ensure that the multiplicative bias correction is no larger than 2-fold. When $I$ is smaller than 30, each of these bias corrections could inflate the variance relative to BC0 and potentially improve the finite-sample behaviour of the sandwich variance.

2.2 Exponential Decay Correlation Structure

Under the exponential decay correlation structure, the covariances for cluster-period means $V_{1i}$ include the diagonal element $\sigma_{ijj}$ defined in equation (2.3), and the off-diagonal element becomes $\sigma_{ijl} = \mathrm{cov}(\bar{Y}_{ij}, \bar{Y}_{il}) = \sqrt{\nu_{ij}\nu_{il}}\,\alpha_0\rho^{|j-l|}$. When all $n_{ij} \to \infty$, the pairwise cluster-period mean correlation $\mathrm{corr}(\bar{Y}_{ij}, \bar{Y}_{il}) \to \rho^{|j-l|}$, which corresponds to a first-order auto-regressive structure. Again, such an approximation may not be accurate in the Washington State EPT trial because the cluster-period sizes could occasionally be small and quite variable.

Unlike the expression obtained under the nested exchangeable correlation structure, $\sigma_{ijl}$ obtained under the exponential decay structure is nonlinear in the decay parameter $\rho$. Based on estimating equations (2.5) and the bias-adjusted covariance $\tilde{S}_i$, we can show that each update of $(\alpha_0, \rho)$ jointly solves the following system of equations:

$$\hat\alpha_0 = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(\frac{n_{ij}-1}{n_{ij}}\right)\left(\tilde{s}_{ijj}\hat\nu_{ij} - \frac{\hat\nu_{ij}^2}{n_{ij}}\right) + \sum_{i=1}^{I}\sum_{j \neq l}\tilde{s}_{ijl}\sqrt{\hat\nu_{ij}\hat\nu_{il}}\,\hat\rho^{|j-l|}}{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(\frac{n_{ij}-1}{n_{ij}}\right)^2\hat\nu_{ij}^2 + \sum_{i=1}^{I}\sum_{j \neq l}\hat\nu_{ij}\hat\nu_{il}\,\hat\rho^{2|j-l|}}, \tag{2.9}$$

$$\sum_{i=1}^{I}\sum_{j \neq l}|j-l|\,\tilde{s}_{ijl}\sqrt{\hat\nu_{ij}\hat\nu_{il}}\,\hat\rho^{|j-l|-1} - \sum_{i=1}^{I}\sum_{j \neq l}|j-l|\,\hat\nu_{ij}\hat\nu_{il}\,\hat\alpha_0\,\hat\rho^{2|j-l|-1} = 0. \tag{2.10}$$

In particular, we observe that the second equation is a polynomial function of $\rho$ to the order of $2|J-1|-1$, and so one can use root-finding algorithms to search for the zero-value within the unit interval. Given each update of the marginal mean parameters, an update of the exponential decay correlation structure can be obtained by iterating between (2.9) and (2.10). The variance estimators for both $\theta$ and $\alpha$ with finite-sample corrections can be obtained by following the approach in Section 2.1. Extensions of the proposed cluster-period marginal modeling approach for continuous and count outcomes are presented in the SM Appendix B.
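The alternation between (2.9) and (2.10) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the helper names (`exp_decay_cov`, `alpha0_update`, `rho_equation`, `bisect_root`, `fit_exp_decay`) are hypothetical, bisection stands in for whichever root-finding algorithm is actually used, and the fallback when no sign change is found on the search interval is an arbitrary choice made here.

```python
from math import sqrt

def exp_decay_cov(nu, n, alpha0, rho):
    """Cluster-period mean covariance: (2.3) on the diagonal, decay off it."""
    J = len(nu)
    V = [[sqrt(nu[j] * nu[l]) * alpha0 * rho ** abs(j - l) for l in range(J)]
         for j in range(J)]
    for j in range(J):
        V[j][j] = nu[j] / n[j] * (1.0 + (n[j] - 1.0) * alpha0)
    return V

def alpha0_update(rho, S_list, nu_list, n_list):
    """Update (2.9) for alpha0 at the current value of rho."""
    num = den = 0.0
    for S, nu, n in zip(S_list, nu_list, n_list):
        J = len(nu)
        for j in range(J):
            w = (n[j] - 1.0) / n[j] * nu[j]
            num += w * (S[j][j] - nu[j] / n[j])
            den += w * w
            for l in range(J):
                if l != j:
                    d = abs(j - l)
                    num += S[j][l] * sqrt(nu[j] * nu[l]) * rho ** d
                    den += nu[j] * nu[l] * rho ** (2 * d)
    return num / den

def rho_equation(rho, alpha0, S_list, nu_list):
    """Left-hand side of (2.10), a polynomial in rho."""
    total = 0.0
    for S, nu in zip(S_list, nu_list):
        J = len(nu)
        for j in range(J):
            for l in range(J):
                if l != j:
                    d = abs(j - l)
                    total += d * S[j][l] * sqrt(nu[j] * nu[l]) * rho ** (d - 1)
                    total -= d * nu[j] * nu[l] * alpha0 * rho ** (2 * d - 1)
    return total

def bisect_root(f, lo, hi, tol=1e-12):
    """Bisection on [lo, hi]; without a sign change, return the better endpoint."""
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        return lo if abs(f_lo) < abs(f_hi) else hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def fit_exp_decay(S_list, nu_list, n_list, rho=0.5, tol=1e-10, max_iter=500):
    """Iterate between (2.9) and (2.10) until both parameters stabilize."""
    alpha0 = alpha0_update(rho, S_list, nu_list, n_list)
    for _ in range(max_iter):
        new_rho = bisect_root(
            lambda r: rho_equation(r, alpha0, S_list, nu_list), 1e-8, 1.0)
        new_alpha0 = alpha0_update(new_rho, S_list, nu_list, n_list)
        done = abs(new_rho - rho) + abs(new_alpha0 - alpha0) < tol
        rho, alpha0 = new_rho, new_alpha0
        if done:
            break
    return alpha0, rho

# Hypothetical two-cluster, three-period example with known parameters.
mu_list = [[0.35, 0.33, 0.32], [0.30, 0.28, 0.27]]
nu_list = [[m * (1.0 - m) for m in mu] for mu in mu_list]
n_list = [[40, 80, 120], [60, 50, 100]]
S_list = [exp_decay_cov(nu, n, 0.1, 0.8) for nu, n in zip(nu_list, n_list)]
alpha0_hat, rho_hat = fit_exp_decay(S_list, nu_list, n_list)
print(alpha0_hat, rho_hat)
```

In this toy setup the cross-products equal the model-implied covariances exactly, so the iteration should return to the generating values; with real residuals the off-diagonal cross-products can be negative, in which case the endpoint fallback in `bisect_root` is triggered.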

3. Considerations on Asymptotic Efficiency

We assess the asymptotic efficiency in estimating the intervention effect $\delta$ based on estimating equations defined for cluster-period means. In the same context, Li and others (2018) provided paired estimating equations for individual-level outcomes, which in principle serve as the efficiency gold-standard. However, the computational burdens of that approach in analyzing the motivating trial are twofold: those associated with repeatedly inverting a large correlation matrix for marginal mean estimation and those associated with enumerating all pairwise residual cross-products for correlation estimation. These computational disadvantages prohibit the application of individual-level GEE to analyze SW-CRTs with large cluster-period sizes, especially when the correlation model includes more than one parameter. In contrast, the cluster-period GEE converges in seconds because the induced correlation matrix is of dimension $J \times J$ and only $\binom{J}{2}$ pairwise residual products need to be enumerated in each cluster. It is then of interest to study whether the cluster-period GEE compromises efficiency in estimating the intervention effect $\delta$.

To proceed, we observe that both the nested exchangeable and exponential decay correlation structures are special cases of the block Toeplitz structure defined in SM Appendix C. In SM Appendix D, we show that, as long as the working correlation model for individual-level data is block Toeplitz, the marginal mean estimating equations for cluster-period means are equivalent to those for individual-level outcomes. Specifically, we define the individual-level estimating equations for $\theta$ as $\sum_{i=1}^{I} E_i^T M_i^{-1}(Y_i - \vartheta_i) = 0$, where $Y_i = (Y_{i11}, Y_{i12}, \ldots, Y_{i21}, \ldots)^T$, $\vartheta_i = (\mu_{i1}\mathbf{1}_{n_{i1}}^T, \ldots, \mu_{iJ}\mathbf{1}_{n_{iJ}}^T)^T$, $M_i = \mathrm{cov}(Y_i)$, and $E_i = \partial\vartheta_i/\partial\theta^T$, and the following result holds.

Theorem 3.1. $D_{1i}^T V_{1i}^{-1}(\bar{Y}_i - \mu_i) = E_i^T M_i^{-1}(Y_i - \vartheta_i)$ for each cluster $i$. Similarly, $D_{1i}^T V_{1i}^{-1} D_{1i} = E_i^T M_i^{-1} E_i$.

To establish the above general result under variable cluster-period sizes, a mathematical induction argument (SM Appendix D) is necessary because an analytical inverse cannot be easily obtained for the block Toeplitz matrix. Theorem 3.1 indicates that there is no loss of asymptotic efficiency for estimating the intervention effect that results from cluster-period aggregation, as long as the induced cluster-period mean correlation matrix is properly specified. Particularly, assuming equal cluster-period sizes and a linear mixed model with Gaussian outcomes, Grantham and others (2019) suggested that the linear mixed model based on the cluster-period summary results in no loss of information for estimating the treatment effect in SW-CRTs as long as the within-period observations are exchangeable. Theorem 3.1 generalizes their finding to GEE with arbitrary specification of link and variance functions and further relaxes their equal cluster-period size assumption.

Theorem 3.1 also provides a convenient device to numerically evaluate the asymptotic relative efficiency (ARE) between accurately modeling the cluster-period mean correlations versus using a working independence structure. To further support the application of the proposed approach versus using working independence in analyzing the motivating study, we calculate the ARE in estimating $\delta$ between these two approaches. To do so, we assume a SW-CRT with 22 clusters and 5 periods, where the randomization follows the cluster-by-period diagram in Figure 1. We set the true marginal mean model as equation (1.1) with a logit link. The period effects $\beta_j$ are specified so that the outcome prevalence decreases from 25% to 20% in the absence of intervention, and the intervention effect corresponds to an odds ratio of $e^{\delta} = 0.75$. To account for variable cluster-period sizes, we resample the cluster-period sizes from the motivating study and obtain 1000 bootstrap replicates. For the $k$th bootstrap replicate, we obtain

$$\tau_k = \frac{\left[\left(\sum_{i=1}^{I} D_{1i}^T \Psi_i^{-1} D_{1i}\right)^{-1}\left(\sum_{i=1}^{I} D_{1i}^T \Psi_i^{-1} V_{1i} \Psi_i^{-1} D_{1i}\right)\left(\sum_{i=1}^{I} D_{1i}^T \Psi_i^{-1} D_{1i}\right)^{-1}\right]_{(J+1,J+1)}}{\left[\left(\sum_{i=1}^{I} D_{1i}^T V_{1i}^{-1} D_{1i}\right)^{-1}\right]_{(J+1,J+1)}},$$

where $\Psi_i$ is a $J \times J$ diagonal matrix with the $j$th element as $\nu_{ij}/n_{ij}$ (i.e. working independence covariance model), and all parameters are evaluated at the truth. The ARE is then estimated as $\sum_{k=1}^{1000}\tau_k/1000$. [...] $\approx 44$ when $\alpha_0 = \alpha_1 = 0.1$) when the within-period ICC and between-period ICC are identical, assuming the latter does not exceed the former. The ARE decreases when the within-period ICC decreases, and also when the between-period ICC deviates from the within-period ICC. In the scenario where $\alpha_0 = 0.02$ and $\alpha_1 = 0.001$, modeling the correlations is still 60% more efficient than ignoring the correlation structure in estimating $\delta$. These observations highlight the importance of correlation specification in SW-CRTs when the cluster-period sizes are highly variable.
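The ratio $\tau_k$ for a single set of cluster-period sizes can be sketched with NumPy as below. This is only an illustration under assumptions that are not from the paper: a small hypothetical design with 4 clusters rather than the 22-cluster diagram in Figure 1, sizes drawn uniformly between 19 and 1553 instead of resampled from the trial, a nested exchangeable true covariance, and a function name `are_tau` chosen here.

```python
import numpy as np

def are_tau(X, n, beta, delta, alpha0, alpha1):
    """Variance of the delta estimator under working independence divided by
    its variance under the true nested exchangeable covariance (logit link)."""
    I, J = X.shape
    A = np.zeros((J + 1, J + 1))  # sum of D' Psi^{-1} D
    B = np.zeros((J + 1, J + 1))  # sum of D' Psi^{-1} V Psi^{-1} D
    M = np.zeros((J + 1, J + 1))  # sum of D' V^{-1} D
    for i in range(I):
        mu = 1.0 / (1.0 + np.exp(-(beta + delta * X[i])))
        nu = mu * (1.0 - mu)
        # logit link: d mu_ij / d theta = nu_ij * (period indicators, X_ij)
        D = np.hstack([np.diag(nu), (nu * X[i])[:, None]])
        sd = np.sqrt(nu)
        V = alpha1 * np.outer(sd, sd)  # off-diagonal elements, as in (2.4)
        V[np.diag_indices(J)] = nu / n[i] * (1.0 + (n[i] - 1.0) * alpha0)  # (2.3)
        Psi_inv = np.diag(n[i] / nu)   # inverse working independence covariance
        A += D.T @ Psi_inv @ D
        B += D.T @ Psi_inv @ V @ Psi_inv @ D
        M += D.T @ np.linalg.solve(V, D)
    A_inv = np.linalg.inv(A)
    var_independence = (A_inv @ B @ A_inv)[J, J]  # sandwich variance of delta
    var_true = np.linalg.inv(M)[J, J]             # variance with correct V
    return var_independence / var_true

# Hypothetical design: 4 clusters, 5 periods, one cluster crossing over per period.
X = np.triu(np.ones((4, 5)), k=1)
rng = np.random.default_rng(2021)
n = rng.integers(19, 1554, size=(4, 5))  # sizes within the observed range
p0 = np.linspace(0.25, 0.20, 5)          # prevalence without intervention
beta = np.log(p0 / (1.0 - p0))
tau = are_tau(X, n, beta, np.log(0.75), alpha0=0.1, alpha1=0.1)
print(tau)
```

Averaging `are_tau` over resampled size matrices would mimic the bootstrap average described above. Setting both ICCs to zero makes the true covariance coincide with the working independence covariance, so the ratio collapses to one.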

4. Simulation Studies

We conduct two sets of simulation experiments to assess the finite-sample operating characteristics of the cluster-period GEE for analyzing correlated binary outcomes in cross-sectional SW-CRTs. In the first simulation experiment, we focus on a limited number of clusters $I \in \{12, 24, 36\}$ with treatment sequences randomized across $J = 5$ periods. We assume all clusters receive the control condition during the first period $j = 1$, and an equal number of clusters cross over to intervention at each wave. We use the Qaqish (2003) method to generate correlated individual-level binary outcomes. The true marginal mean model is given by (1.1), where $g$ is a logit link and the effect size $\delta = \log(0.5)$. We assume the baseline prevalence of the outcome is 35% and a gently decreasing time trend with $\beta_j - \beta_{j+1} = 0.1 \times (0.5)^{j}$ for $j \ge 1$. Both the nested exchangeable and the exponential decay correlation structures are considered in simulating the data. When the true correlation is nested exchangeable, we consider $(\alpha_0, \alpha_1) \in \{(0.03, 0.015), (0.1, 0.05)\}$, representing small to moderate within-period and between-period correlations previously reported (Martin and others, 2016). When the true correlation is exponential decay, we consider $(\alpha_0, \rho) \in \{(0.03, [\ldots]), (0.1, [\ldots])\}$ in accordance with values assumed in previous simulations for cohort stepped wedge designs (Li, 2020). In this first simulation experiment, we consider relatively large but more variable cluster-period sizes, randomly drawn from DiscreteUniform(50, [...] and others (2018) becomes computationally burdensome to fit due to (1) the enumeration of a maximum of $\binom{750}{2} = 280{,}875$ pairwise residual cross-products in each cluster and (2) numerical inversion of a large correlation matrix in each modified Fisher-scoring update. We therefore only consider analyzing the simulated data via the cluster-period GEE with the correct specification of the marginal mean and induced correlation structure. We simulate 3000 data replicates, and study the percent relative bias and coverage probability in estimating the marginal intervention effect and correlation parameters. The comparisons are made between the uncorrected estimating equations (UEE), namely equation (2.5), and the matrix-adjusted estimating equations (MAEE), namely equation (2.5) but now with the bias-adjusted cross-products $\tilde{S}_i$.

Table 2 summarizes the percent relative bias results. Overall, the bias of the intervention effect remains insensitive to bias corrections of the correlation estimating equations, corroborating the findings of Lu and others (2007) for individual-level GEE. However, when the true correlation structure is nested exchangeable, MAEE substantially reduces the negative bias of UEE in estimating $\alpha_0$ and $\alpha_1$. When the true correlation structure is exponential decay, MAEE similarly reduces the negative bias of UEE in estimating the within-period correlation $\alpha_0$, but comes at a cost of slightly inflating the negative bias in estimating the decay parameter $\rho$, especially when $I = 12$. This is because $\alpha_0$ and $\rho$ enter the polynomial estimating equation (2.10) in a multiplicative fashion, while the updates for $\alpha_0$ and $\alpha_1$ under the nested exchangeable structure are nearly orthogonal. As the number of clusters increases to $I = 24$ or 36, both MAEE and UEE have negligible bias in estimating $\rho$, but MAEE still has notably smaller bias in estimating $\alpha_0$. STable 2 in the SM summarizes the coverage probability for $\delta$.
The conﬁdence intervals (CIs) arginal Modeling of SW Trials with Binary Outcomes t I − quantiles as this approach has been shown to provide robustsmall-sample behaviour in previous simulations with individual-level GEE (Li, 2020; Ford andWestgate, 2020). In addition to the model-based variance and the usual sandwich variance, weexamine three bias-corrected variances introduced in Section 2.1. Based on a binomial modelwith 3000 replicates, we consider the empirical coverage between 94.2% and 95.8% as close tonominal. STable 2 indicates that the CIs for δ constructed with the model-based variance or anyof the bias-corrected variances generally provide close to nominal coverage, while those based onBC0 frequently lead to under coverage. STable 3 and 4 summarize the coverage probability ofthe correlation parameters (interval constructed based on the same t I − distribution). Given thelimited number of clusters and variable cluster sizes, the coverage of correlation parameters isfrequently below nominal. However, MAEE can substantially improve the coverage of the corre-lations parameters. Throughout, the CIs constructed based on BC2 provide the best coverage forcorrelations, a ﬁnding that echoes Perin and Preisser (2017) with alternating logistic regressions.To further investigate the coverage probability of the correlation parameters with a largernumber of clusters, we consider a second set of experiments, with the same simulation designexcept for smaller cluster-period sizes. Speciﬁcally, the cluster-period sizes are randomly drawnfrom DiscreteUniform(25 , I is varied from 12 to 120. The coverageresults for the intervention eﬀect parameter δ are largely consistent with those from the ﬁrstsimulation experiment, and are presented in SFigure 1 and 2 in the SM. However, the resultsfurther indicate that the model-based variance and BC2 may lead to over coverage with I = 12and I = 24, and that BC1 seems to have the most robust performance. 
Next, Figure 2 presents the coverage for the nested exchangeable correlation structure when α_0 = 0.03 and α_1 = 0.015. Similar results for α_0 = 0.1 and α_1 = 0.05 are in SFigure 3. These figures indicate that MAEE coupled with BC2 leads to higher and closer to nominal coverage for α_0 and α_1 compared to UEE. Likewise, MAEE also improves the empirical coverage for α_0 and ρ under the exponential decay correlation structure, and the results are presented in SFigures 4 and 5.
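
The performance measures used in this section can be sketched in a few lines of Python. The snippet below is only an illustration, not the simulation code behind the reported results: the function names and the toy inputs are hypothetical, and the band uses a normal approximation to the binomial with z = 1.96, which is our assumption about how the 94.2% to 95.8% range arises from 3000 replicates.

```python
import math


def monte_carlo_band(nominal=0.95, n_rep=3000, z=1.96):
    """Normal-approximation band for an empirical coverage proportion.

    Assumes each replicate covers the truth independently with
    probability `nominal` (a binomial model over `n_rep` replicates).
    """
    se = math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - z * se, nominal + z * se


def percent_relative_bias(estimates, truth):
    """100 * (mean of the estimates - truth) / truth."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - truth) / truth


def empirical_coverage(intervals, truth):
    """Share of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= truth <= upper)
    return hits / len(intervals)


lower, upper = monte_carlo_band()
print(f"close-to-nominal band: {100 * lower:.1f}% to {100 * upper:.1f}%")

# Hypothetical toy estimates of a correlation whose true value is 0.03.
print(percent_relative_bias([0.028, 0.031, 0.027], truth=0.03))
```

With 3000 replicates the band has a half-width of about 0.8 percentage points, which matches the range quoted above.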

5. Analysis of the Washington State EPT Trial

We apply the cluster-period GEE to analyze cluster-period proportions in the Washington State EPT trial. The focus of this analysis is on estimating the intervention effect and the intraclass correlation structure with respect to the Chlamydia outcomes. We consider the marginal model for cluster-period means logit(μ_ij) = β_j + δ X_ij, where β_j (j = 1, ..., 5) is the period effect and exp(δ) is the population-averaged odds ratio. To model the within-cluster correlations, we consider the simple exchangeable structure, as well as the nested exchangeable and exponential decay structures. Of note, the simple exchangeable structure is obtained when we enforce α_0 = α_1 in the nested exchangeable structure or ρ = 1 in the exponential decay structure. We do not consider the working independence assumption, because reporting ICCs is considered good practice per the CONSORT extension and useful for planning future SW-CRTs (Hemming and others, 2018).

Table 3 summarizes the point estimates and bias-corrected standard errors of the marginal mean and correlation parameters from the analysis of all sentinel women. Informed by the simulation study in Section 4, we report the BC1 standard error estimates for all marginal mean parameters and the BC2 standard error estimates for all correlation parameters. The estimated odds ratios due to the EPT intervention are 0.868, 0.867 and 0.883 under the simple exchangeable, nested exchangeable and exponential decay correlation models, respectively. All 95% confidence intervals include one. Because the prevalence of Chlamydia is around 6% and is considered rare, the odds ratio approximates the relative risk. Therefore, interpreting the odds ratio as a relative risk, we conclude that the EPT intervention results in an approximately 12% reduction in Chlamydial infection among women aged between 14 and 25. This finding is consistent with that in Golden and others (2015) based on generalized linear mixed models.
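
As a quick arithmetic check, the odds ratios can be recovered by exponentiating the δ estimates listed in Table 3. The Python sketch below does this and adds rough Wald-type 95% intervals from the BC1 standard errors. It uses the normal quantile 1.96, which is our simplification; the article bases its intervals on t quantiles, which are wider, so these intervals are only indicative. The function and variable names are illustrative.

```python
import math

# (delta estimate, BC1 standard error) as listed in Table 3.
table3_delta = {
    "simple exchangeable": (-0.141, 0.092),
    "nested exchangeable": (-0.142, 0.090),
    "exponential decay": (-0.124, 0.087),
}


def odds_ratio_with_ci(delta, se, crit=1.96):
    """Exponentiate a log odds ratio and its Wald-type interval.

    `crit` is a normal quantile here (an assumption); a t quantile
    would give a somewhat wider interval.
    """
    return (
        math.exp(delta),
        math.exp(delta - crit * se),
        math.exp(delta + crit * se),
    )


for name, (delta, se) in table3_delta.items():
    or_hat, lower, upper = odds_ratio_with_ci(delta, se)
    print(f"{name}: OR = {or_hat:.3f}, approx. 95% CI = ({lower:.3f}, {upper:.3f})")
```

Because the tabulated δ values are rounded to three decimals, the exponentiated values can differ from the reported odds ratios in the third decimal place. Even the narrower normal-based intervals contain one, in line with the statement that all 95% confidence intervals include one.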
We additionally report the intraclass correlation estimates and their estimated precisions for the Chlamydia outcome on the natural […] α_0 ≈ …, α_1 ≈ …, ρ ≈ … […]

$$\mathrm{CIC}_{cp} = \mathrm{trace}\left[\left(\sum_{i=1}^{I} D_i^{T}\Psi_i^{-1}D_i\right)\Omega\Lambda\Omega\,\Big|_{\theta=\hat{\theta}(R_i),\,\alpha=\hat{\alpha}(R_i)}\right], \qquad (5.11)$$

where Ψ_i is a J × J working independence covariance, ΩΛΩ is the bias-corrected sandwich variance of the marginal mean (BC1), and all parameters are evaluated at the estimates under the assumed correlation structure. According to Theorem 3.1, as I → ∞, the limit of (5.11) is identical to the limit of the usual CIC defined for individual-level GEE in Hin and Wang (2009) under the marginal mean model (1.1), providing some justification for using this metric. From Table 3, while the smallest CIC_cp corresponds to and favors the exponential decay correlation, the CIC_cp of the simple exchangeable correlation structure is only larger by a small amount. Future simulation studies are needed to better assess the operating characteristics of CIC_cp for selecting the optimal correlation structures based on cluster-period GEE analysis of SW-CRTs.

To explore the treatment effect among subgroups, we perform the cluster-period GEE analyses for adolescent girls (aged between 14 and 19) and adult women (aged greater than 19), and present the results in STables 5 and 6. From STable 5, the intervention leads to a more pronounced reduction of Chlamydia infection among adolescent girls compared to the overall analysis. Under the nested exchangeable correlation model, the intervention effect in odds ratio is estimated as 0.780, and its 95% CI (0.625, …) […] in 22% reduction in Chlamydia infection among adolescents. Given that adolescents are at high risk for acquiring sexually transmitted diseases and that research on the effectiveness of EPT among this population is limited (Gannon-Loew and others, 2017), our subgroup analysis may provide new evidence. For brevity, the comparison between correlation structures among the adolescent population and the analysis of the adult women subgroup are presented in the SM Appendix E.
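
To make the nesting among the three correlation structures in this section concrete, the following Python sketch encodes the correlation between two different individuals in the same cluster, observed in periods j and l. It assumes the usual definitions: under the nested exchangeable structure the correlation is α_0 within a period and α_1 between periods, and under the exponential decay structure it is α_0 ρ^|j − l|. These are individual-level correlations, not the induced cluster-period correlations of Section 2, and the function names are illustrative.

```python
def nested_exchangeable(j, l, alpha0, alpha1):
    """alpha0 for the same period, alpha1 for different periods."""
    return alpha0 if j == l else alpha1


def exponential_decay(j, l, alpha0, rho):
    """alpha0 within a period, decaying by a factor rho per period apart."""
    return alpha0 * rho ** abs(j - l)


def period_matrix(corr, n_periods, *params):
    """J x J table of correlations indexed by the two periods."""
    return [
        [corr(j, l, *params) for l in range(n_periods)]
        for j in range(n_periods)
    ]


J = 5  # number of periods in the Washington State EPT trial

# Point estimates from Table 3, used purely for illustration.
ne = period_matrix(nested_exchangeable, J, 0.0072, 0.0038)
ed = period_matrix(exponential_decay, J, 0.0070, 0.7157)

# Setting alpha0 = alpha1, or rho = 1, collapses both structures
# to the simple exchangeable structure with a single correlation.
se_from_ne = period_matrix(nested_exchangeable, J, 0.0051, 0.0051)
se_from_ed = period_matrix(exponential_decay, J, 0.0051, 1.0)
print(se_from_ne == se_from_ed)  # prints True
```

The last comparison illustrates why the simple exchangeable model is a special case of both alternatives, which is what makes a criterion such as CIC_cp a natural way to compare them.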

6. Discussion

In the analysis of SW-CRTs with binary outcomes, statistical methods are seldom used to simultaneously obtain point and interval estimates for the intervention effect and the ICCs. An exception is the paired estimating equations approach studied in Li and others (2018). However, that approach is computationally infeasible with large cluster sizes, which are typically seen in cross-sectional SW-CRTs. To address this limitation, we propose a simple and efficient estimating equations approach based on cluster-period means, which resolves the computational burden of the approach based on individual-level observations. In practice, one could first attempt an individual-level GEE analysis with an appropriate correlation structure. However, if that procedure becomes computationally intensive, the proposed cluster-period GEE provides a valid workaround. Because standard software can only provide valid intervention effect estimates with cluster-period means in SW-CRTs, we have developed an R package geeCRT to implement both the individual-level GEE studied in Li and others (2018) and the cluster-period GEE proposed in this article.

Although individual-level analysis has usually been considered more efficient than cluster-level analysis in CRTs, we have shown in Theorem 3.1 that the cluster-period GEE and individual-level GEE are asymptotically equally efficient in estimating the treatment effect parameter in cross-sectional SW-CRTs. The full efficiency of the cluster-period analysis depends on the induced correlation structure, defined in Sections 2.1 and 2.2. On the other hand, cluster-period analysis […] and others, 2018). The numerical study in Section 3 emphasizes the necessity of carefully characterizing the induced cluster-period correlation when performing a cluster-period analysis.
Finally, the proposed approach enables fast estimation and inference for the correlation parameters, which aligns with the current recommendation in the CONSORT extension to SW-CRTs (Hemming and others, 2018). The estimating equations method also produces standard errors of the estimated correlations, which can be used to construct interval estimates to further improve planning of future trials.

Our simulations indicate that the cluster-period GEE can estimate the intervention effect with negligible bias, regardless of bias-corrections to the correlation estimating equations via MAEE. However, MAEE substantially reduces the bias of the ICC estimates. On the other hand, while the bias-corrected sandwich variances can provide nominal coverage for δ even when I = 12, inference for ICC parameters appears more challenging (see SM Appendix F for a concise summary of findings). We suggest that 30 to 40 clusters may be sufficient for the cluster-period MAEE to provide nominal coverage for α_0, which generally agrees with Preisser and others (2008) using individual-level MAEE. In SW-CRTs, a larger number of clusters may be needed to achieve nominal coverage for the between-period correlation (α_1 or ρ), which differs from findings in Preisser and others (2008) for parallel CRTs. This difference highlights that the requirement for accurate ICC inference can depend on the randomization design (parallel versus stepped wedge). A further reason underlying such a difference is that we have simulated unequal cluster-period sizes, under which the sandwich variance becomes more variable (Kauermann and Carroll, 2001). Fortunately, compared to UEE, the use of cluster-period MAEE can substantially mitigate, if not eliminate, the under-coverage of ICC parameters in small samples.

A reviewer has raised the issue of performing cluster-period analysis using GLMMs in cross-sectional SW-CRTs. As explained in SM Appendix G, with binary outcomes, a rigorous cluster-period analysis using GLMMs may proceed with the cluster-period totals, $\sum_{k=1}^{n_{ij}} Y_{ijk}$, which follow a binomial distribution. The likelihood principle then directly suggests that such cluster-period aggregation leads to the same inference for GLMM parameters. However, the interpretation of the treatment effect parameter in GLMMs is conditional on the latent random effects, and therefore applies only to each cluster, or, strictly speaking, to the population with the same value of the unobserved random effects. In contrast, the treatment effect δ in the marginal model is averaged over all clusters, and has been argued to bear a more straightforward population-averaged interpretation (Preisser and others, 2003; Li and others, 2018).

Although the proposed approach is motivated by cross-sectional SW-CRTs, it is equally applicable to parallel cross-sectional longitudinal cluster randomized trials (L-CRTs). In parallel L-CRTs, the intervention effect is parameterized either by the time-adjusted main effect or the treatment-by-time interaction. For both estimands, because cluster-period aggregation implies the same marginal mean model, the proposed GEE approach is valid and can be useful. Another direction for future research is to extend the proposed approach to analyze closed-cohort SW-CRTs (Copas and others, 2015; Li and others, 2018; Li, 2020) and SW-CRTs with continuous recruitment (Grantham and others, 2019; Hooper and Copas, 2019). These more recent variants of SW-CRTs have more complex intraclass correlation structures and therefore require additional considerations in cluster-period analysis.

One potential limitation of the current study is that we have only considered a marginal mean model without individual-level covariates. Such an unadjusted mean model originates from Hussey and Hughes (2007) and has been widely applied for planning and analyzing SW-CRTs; see, for example, the recent review in Li and others (2020).
More often than not, the intraclass correlation structures are also defined with respect to the unadjusted mean models in SW-CRTs (Kasza and others, 2019; Li and others, 2018; Li, 2020; Li and others, 2020). However, covariate adjustment may potentially improve the efficiency in estimating the treatment effect. We plan […] and others (2004).
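
The aggregation argument for GLMMs discussed above can be illustrated numerically. The Python sketch below assumes that, conditional on the random effects and without individual-level covariates, the n_ij outcomes in a cluster-period are independent Bernoulli draws with a common success probability p. Under that assumption the individual-level log-likelihood and the binomial log-likelihood of the cluster-period total differ only by a constant that does not involve p, so both lead to the same inference about p. The data and function names are hypothetical.

```python
import math


def bernoulli_loglik(outcomes, p):
    """Log-likelihood of independent 0/1 outcomes with common probability p."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in outcomes)


def binomial_loglik(total, n, p):
    """Log-likelihood of the cluster-period total under Binomial(n, p)."""
    return (
        math.log(math.comb(n, total))
        + total * math.log(p)
        + (n - total) * math.log(1.0 - p)
    )


# Hypothetical outcomes for one cluster-period with n_ij = 10.
y = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
total, n = sum(y), len(y)

# The difference between the two log-likelihoods is the same for every p:
# it equals log C(n, total), the number of orderings of the outcomes.
diffs = [
    binomial_loglik(total, n, p) - bernoulli_loglik(y, p)
    for p in (0.1, 0.3, 0.6)
]
print(diffs)
print(math.log(math.comb(n, total)))
```

This only demonstrates the conditional (GLMM) argument; the marginal cluster-period GEE proposed in the article rests on the quasi-score equivalence and the induced correlation structure rather than on a likelihood.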

7. Software

An R package for our method, geeCRT, is available online at CRAN. Sample R code, together with a simulated data example, is also available from the corresponding author's GitHub page at https://github.com/lifanfrank/clusterperiod_GEE .

8. Supplementary Material

Supplementary material is available online at http://biostatistics.oxfordjournals.org .

Acknowledgments

Research in this article was partially funded through a Patient-Centered Outcomes Research Institute® (PCORI® Award ME-2019C1-16196). The statements presented in this article are solely the responsibility of the authors and do not necessarily represent the views of PCORI®, its Board of Governors or Methodology Committee. Dr. Preisser has received a stipend for service as a merit reviewer from PCORI®. Dr. Preisser did not serve on the Merit Review panel that reviewed his project. The authors thank Dr. James P. Hughes for sharing the Washington State EPT study data, and Xueqi Wang for discussions and computational assistance. The authors are also grateful to the Associate Editor and two anonymous reviewers for constructive comments.

References

Copas, A. J., Lewis, J. J., Thompson, J. A., Davey, C., Baio, G. and Hargreaves, J. R. (2015). Designing a stepped wedge trial: three main designs, carry-over effects and randomisation approaches. Trials, 352.

Fay, M. P. and Graubard, B. I. (2001). Small-sample adjustments for Wald-type tests using sandwich estimators. Biometrics, 1198–1206.

Ford, W. P. and Westgate, P. M. (2020). Maintaining the validity of inference in small-sample stepped wedge cluster randomized trials with binary outcomes when using generalized estimating equations. Statistics in Medicine, 2779–2792.

Gannon-Loew, K. E., Holland-Hall, C. and Bonny, A. E. (2017). A review of expedited partner therapy for the management of sexually transmitted infections in adolescents. Journal of Pediatric and Adolescent Gynecology, 341–348.

Golden, M. R., Kerani, R. P., Stenger, M., Hughes, J. P., Aubin, M., Malinski, C. and Holmes, K. K. (2015). Uptake and population-level impact of expedited partner therapy (EPT) on Chlamydia trachomatis and Neisseria gonorrhoeae: the Washington State community-level randomized trial of EPT. PLoS Medicine, 1–22.

Grantham, K. L., Kasza, J., Heritier, S., Hemming, K. and Forbes, A. B. (2019). Accounting for a decaying correlation structure in cluster randomized trials with continuous recruitment. Statistics in Medicine, 1918–1934.

Grayling, M. J., Wason, J. M. S. and Mander, A. P. (2017). Stepped wedge cluster randomized controlled trial designs: a review of reporting quality and design features. Trials, 1–13.

Hayes, R. J. and Moulton, L. H. (2009). Cluster Randomised Trials. Boca Raton, FL: Taylor & Francis Group, LLC.

Hemming, K., Taljaard, M., McKenzie, J. E., Hooper, R., Copas, A., Thompson, J. A. et al. (2018). Reporting of stepped wedge cluster randomised trials: Extension of the CONSORT 2010 statement with explanation and elaboration. BMJ, 1–26.

Hin, L.-Y. and Wang, Y.-G. (2009). Working-correlation-structure identification in generalized estimating equations. Statistics in Medicine, 642–658.

Hooper, R. and Copas, A. (2019). Stepped wedge trials with continuous recruitment require new ways of thinking. Journal of Clinical Epidemiology, 161–166.

Hooper, R., Teerenstra, S., de Hoop, E. and Eldridge, S. (2016). Sample size calculation for stepped wedge and other longitudinal cluster randomised trials. Statistics in Medicine, 4718–4728.

Hussey, M. A. and Hughes, J. P. (2007). Design and analysis of stepped wedge cluster randomized trials. Contemporary Clinical Trials, 182–191.

Kasza, J., Hemming, K., Hooper, R., Matthews, J. N. S. and Forbes, A. B. (2019). Impact of non-uniform correlation structure on sample size and power in multiple-period cluster randomised trials. Statistical Methods in Medical Research, 703–716.

Kauermann, G. and Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 1387–1396.

Li, F. (2020). Design and analysis considerations for cohort stepped wedge cluster randomized trials with a decay correlation structure. Statistics in Medicine, 438–495.

Li, F., Hughes, J. P., Hemming, K., Taljaard, M., Melnick, E. R. and Heagerty, P. J. (2020). Mixed-effects models for the design and analysis of stepped wedge cluster randomized trials: An overview. Statistical Methods in Medical Research, 10.1177/0962280220932962.

Li, F., Turner, E. L. and Preisser, J. S. (2018). Sample size determination for GEE analyses of stepped wedge cluster randomized trials. Biometrics, 1450–1458.

Liang, K.-Y. and Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 13–22.

Lu, B., Preisser, J. S., Qaqish, B. F., Suchindran, C., Bangdiwala, S. I. and Wolfson, M. (2007). A comparison of two bias-corrected covariance estimators for generalized estimating equations. Biometrics, 935–941.

Mancl, L. A. and DeRouen, T. A. (2001). A covariance estimator for GEE with improved small-sample properties. Biometrics, 126–134.

Martin, J., Girling, A., Nirantharakumar, K., Ryan, R., Marshall, T. and Hemming, K. (2016). Intra-cluster and inter-period correlation coefficients for cross-sectional cluster randomised controlled trials for type-2 diabetes in UK primary care. Trials, 402–413.

Perin, J. and Preisser, J. S. (2017). Alternating logistic regressions with improved finite sample properties. Biometrics, 696–705.

Preisser, J. S., Lu, B. and Qaqish, B. F. (2008). Finite sample adjustments in estimating equations and covariance estimators for intracluster correlations. Statistics in Medicine, 5764–5785.

Preisser, J. S., Reboussin, B. A., Song, E. Y. and Wolfson, M. (2007). The importance and role of intracluster correlations in planning cluster trials. Epidemiology, 552–560.

Preisser, J. S., Young, M. L., Zaccaro, D. J. and Wolfson, M. (2003). An integrated population-averaged approach to the design, analysis and sample size determination of cluster-unit trials. Statistics in Medicine, 1235–1254.

Qaqish, B. F. (2003). A family of multivariate binary distributions for simulating correlated binary variables. Biometrika, 455–463.

Sharples, K. and Breslow, N. (1992). Regression analysis of correlated binary data: some small sample results for the estimating equation approach. Journal of Statistical Computation and Simulation, 1–20.

Thompson, J. A., Davey, C., Fielding, K., Hargreaves, J. R. and Hayes, R. J. (2018). Robust analysis of stepped wedge trials using cluster-level summaries within periods. Statistics in Medicine, 2487–2500.

Turner, E. L., Li, F., Gallis, J. A., Prague, M. and Murray, D. M. (2017a). Review of recent methodological developments in group-randomized trials: part 1–design. American Journal of Public Health, 907–915.

Turner, E. L., Prague, M., Gallis, J. A., Li, F. and Murray, D. M. (2017b). Review of recent methodological developments in group-randomized trials: part 2–analysis. American Journal of Public Health, 1078–1086.

Yasui, Y., Feng, Z., Diehr, P., McLerran, D., Beresford, S. A. A. and McCulloch, C. E. (2004). Evaluation of community-intervention trials via generalized linear mixed models. Biometrics, 1043–1052.

Zeger, S. L., Liang, K.-Y. and Albert, P. S. (1988). Models for longitudinal data: A generalized estimating equation approach. Biometrics, 1049–1060.

Zhao, L. P. and Prentice, R. L. (1990). Correlated binary regression using a quadratic exponential model. Biometrika, 642–648.

Zhou, X., Liao, X., Kunz, L. M., Normand, S.-L. T., Wang, M. and Spiegelman, D. (2020). A maximum likelihood approach to power calculations for stepped wedge designs of binary outcomes. Biostatistics, 102–121.

Figure 1. Cluster-by-period diagram of the Washington State EPT Trial. Each white cell represents a control cluster-period and each gray cell indicates an intervention cluster-period. The corresponding cluster-period sizes are indicated in each cell.

Table 1. Mean relative efficiency in estimating the marginal intervention effect δ when the correlation structure is properly modeled versus when working independence is used in the GEE analyses of cross-sectional SW-CRTs with variable cluster-period sizes bootstrapped from the Washington EPT study. Results are averaged over 1000 bootstrap replicates.

True correlation: nested exchangeable
  α_0 = 0.1          α_0 = 0.05         α_0 = 0.02
  α_1     ARE        α_1     ARE        α_1     ARE
  0.1     44.9       0.05    22.5       0.02    9.4
  0.08    6.5        0.04    5.4        0.01    2.3
  0.06    3.6        0.03    3.3        0.005   1.8
  0.04    2.5        0.02    2.4        0.002   1.6
  0.02    2.0        0.01    1.9        0.001   1.6

True correlation: exponential decay
  α_0 = 0.1          α_0 = 0.05         α_0 = 0.02
  ρ       ARE        ρ       ARE        ρ       ARE
  1       44.9       1       22.5       1       9.4
  0.8     5.7        0.8     4.7        0.8     3.4
  0.5     2.5        0.5     2.3        0.5     2.0
  0.2     1.9        0.2     1.8        0.2     1.6
  0.05    1.9        0.05    1.8        0.05    1.6

[Figure 2 plot not reproduced. Panels: (UEE) and (MAEE) under CORR=NE with α_0 = 0.03 and α_1 = 0.015, one pair of panels per correlation parameter; x-axis: number of clusters I (20 to 120); y-axis: coverage (0.80 to 0.95); legend: BC0 (ZP), BC1 (KC), BC2 (MD), BC3 (FG).]

Figure 2. Coverage of 95% confidence intervals for correlation parameters based on the t_{I − …} quantiles as a function of number of clusters I under the nested exchangeable (NE) correlation structure when α_0 = 0.03 and α_1 = 0.015. The cluster-period sizes are randomly drawn from DiscreteUniform(25, …).

Table 2. Percent relative bias of model parameters as a function of number of clusters I under two different correlation structures: nested exchangeable (NE) and exponential decay (ED). The cluster-period sizes are randomly drawn from DiscreteUniform(50, …).

[Table 2 body not recoverable from the extraction. Columns: Bias of δ (UEE, MAEE), Bias of α_0 (UEE, MAEE), Bias of α_1 or ρ (UEE, MAEE). Rows: correlation structures NE(α_0, α_1) and ED(α_0, ρ) with true values (0.03, …), …, for I = 12, I = 24 and I = 36.]

Table 3. Parameter estimates of marginal mean and correlation parameters from the overall analysis of Washington State EPT Trial using MAEE. Standard errors of the marginal mean parameters are based on BC1 and standard errors of the intraclass correlation parameters are based on BC2. All standard error estimates are reported in parentheses.

                         Simple exchangeable   Nested exchangeable   Exponential decay
Marginal mean
  β_1 (period 1)         -2.443 (0.091)        -2.446 (0.095)        -2.437 (0.095)
  β_2 (period 2)         -2.454 (0.091)        -2.439 (0.083)        -2.444 (0.089)
  β_3 (period 3)         -2.535 (0.094)        -2.495 (0.100)        -2.508 (0.100)
  β_4 (period 4)         -2.609 (0.106)        -2.606 (0.115)        -2.613 (0.115)
  β_5 (period 5)         -2.537 (0.145)        -2.535 (0.128)        -2.552 (0.131)
  δ (treatment)          -0.141 (0.092)        -0.142 (0.090)        -0.124 (0.087)
Intraclass correlation
  α_0                    .0051 (.0016)         .0072 (.0039)         .0070 (.0039)
  α_1                    –                     .0038 (.0015)         –
  ρ                      –                     –                     .7157 (.2962)
Correlation selection criteria
  CIC_cp                 …                     …                     …