A pattern mixture model for a paired 2×2 crossover design
IMS Collections
Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen
Vol. 1 (2008) 257–271
© Institute of Mathematical Statistics, 2008
Laura J. Simon and Vernon M. Chinchilli
The Pennsylvania State University
Abstract:
When conducting a paired 2 × × ×
1. Introduction
Two important design principles are occasionally used in clinical trials: 1) A subject is “matched” or “paired” with another subject with similar characteristics to reduce the chance that other variables obscure the primary comparison of interest. 2) A subject serves as his or her own control by “crossing over” from one treatment to another during the course of an experiment. Experiments employing the first design principle are called matched pair designs (Cochran [1]), while those employing the second are called crossover designs (Jones and Kenward [2], Ratkowsky et al. [3] and Senn [4]).

There are situations in which it may be beneficial to use the two design principles simultaneously. That is, it may be advantageous to conduct a “paired crossover design.” For such a design, each subject and his/her paired counterpart are randomized to the same treatment sequence. That is, they receive one experimental treatment, and then cross over and receive the other experimental treatment(s) at the same time. A paired crossover design was recently used in three clinical trials

∗ Supported in part by Grants U10 HL51845 and U10 HL074231 from the National Heart, Lung, and Blood Institute.
Department of Statistics, The Pennsylvania State University, 328 Thomas Building, University Park, PA 16802, USA.
Department of Public Health Sciences, The Pennsylvania State University, A210, 600 Centerview Drive, Suite 2200, Hershey, PA 17033, USA.
AMS 2000 subject classifications:
Primary 62F03, 62F10; secondary 62P10.
Keywords and phrases: clinical trials, matching, missing data.
[...] th position of the beta-agonist receptor gene. It was hypothesized that the scheduled daily use of inhaled albuterol, the most common treatment for patients with mild to moderate asthma, actually had a detrimental effect on the lung function of patients with the Arg/Arg genotype (R) but not patients with the Gly/Gly genotype (G). The primary research question therefore concerned whether or not the treatment effects differed for the two genotypes. That is, the primary hypothesis concerned inference about whether the interaction parameter:

$$\gamma = (\mu_{RA} - \mu_{RP}) - (\mu_{GA} - \mu_{GP})$$

is 0, where $\mu_{kl}$ is the population mean lung function of genotype $k$ patients on treatment $l$.

To achieve an efficient comparison of the treatments within each genotype, a 2 × [...] $k$. Because a subject’s genotype is a pre-determined characteristic, subjects could not be randomly allocated to genotypic group. Therefore, to minimize confounding caused by potential differences in the baseline lung function of the two genotypic groups, each subject of genotype R was matched to a subject of genotype G with similar baseline lung function. The matched subjects were randomly assigned to the same sequence of the crossover design, and an eight-week washout period was placed between the two treatment periods (Table 1).

At the conclusion of the BARGE Study, each pair of subjects $j$ of sequence $s$ ideally yielded the quadrivariate response $Y_{sj} = (Y_{sjRA}, Y_{sjRP}, Y_{sjGA}, Y_{sjGP})$, a vector containing the subjects’ changes in lung function. Unfortunately, as should be expected when conducting any clinical trial, measurements in the BARGE Study were not always collected as planned. That is, some subjects missed one or more planned visits or completely dropped out of the study. In this paper, we present a method based on a pattern-mixture model for analyzing the data arising from a paired 2 × [...] $y_i = (y_{i1}, \ldots, y_{ip})'$ denotes the $p \times$ [...] $i$ and $m_i = (m_{i1}, \ldots, m_{ip})'$ represents a $p \times$ [...] $y_{ij}$ is observed, then pattern-mixture models factor the joint distribution of $y_i$ and $m_i$ as:

$$\pi[y_i, m_i \mid X_i] = \pi[y_i \mid X_i, m_i] \times \pi[m_i \mid X_i],$$

where $X_i$ denotes fixed covariates or design matrices. That is, in short, the data are stratified by their patterns of missingness, and then a separate model is specified
for each missing data pattern. The distribution $\pi[y_i \mid X_i, m_i]$ models the within-subject regressions for each missing data pattern, and $\pi[m_i \mid X_i]$ models the marginal proportions of each missing data pattern as functions of between-subject covariates.

It should be noted that there does exist a more general random-coefficient pattern-mixture model, e.g., see Little [7]. We do not pursue the more general model here, however, because our crossover model for the BARGE Study assumes that only one measurement is made in each period.

The type of “missingness” that exists really should dictate our final analysis. Rubin [8] and Little and Rubin [9] describe three types of missingness for which we need not apply a pattern-mixture model: i) if the data are missing completely at random (the probability of response is independent of both the observed and unobserved data); ii) if covariate-dependent dropout exists (the missingness depends on fixed covariates in the model); and iii) if the data are missing at random (the probability of response depends on the observed data but not the missing data). In any of these three cases, we simply can apply a general linear model with correlated errors to the data arising from a paired crossover design, as was performed by Simon and Chinchilli [10]. The resulting likelihood-based estimation and inference procedures are asymptotically unbiased.

If the missing values satisfy none of the three types mentioned above, then Little and Rubin [9] label the situation “non-ignorable” (the probability of response depends on the unobserved data), and the analysis of the available data requires special methods. In this situation, when analyzing the data arising from a paired crossover design, we propose applying the pattern-mixture model methods that we now describe.

Table 1
Paired crossover design for the BARGE Study. Pairs of subjects were randomized to either the AP sequence (first row) or the PA sequence (second row). All subjects had a washout period between the two treatment periods.

              Subject 1 (R)    Washout    Subject 2 (G)
Period          1      2         —          1      2
Sequence 1      RA     RP        —          GA     GP
Sequence 2      RP     RA        —          GP     GA
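The stratification step behind the pattern-mixture factorization can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of `None` for a missing value, the coding of the indicator as 1 for "observed", and the toy data are all assumptions made here, not part of the original analysis.

```python
from collections import defaultdict

def missingness_pattern(y):
    """Indicator vector m: 1 where the response is observed, 0 where it is
    missing (None). The 1 = observed coding is an assumption of this sketch."""
    return tuple(0 if value is None else 1 for value in y)

def stratify_by_pattern(responses):
    """Group response vectors by their missingness pattern m."""
    strata = defaultdict(list)
    for y in responses:
        strata[missingness_pattern(y)].append(y)
    return dict(strata)

def pattern_proportions(strata):
    """Empirical analogue of pi[m | X]: the share of units in each pattern."""
    total = sum(len(units) for units in strata.values())
    return {pattern: len(units) / total for pattern, units in strata.items()}

# Hypothetical quadrivariate responses; None marks a missing measurement.
responses = [
    (12.0, 8.5, 10.1, 9.7),
    (11.2, None, 9.9, None),
    (13.4, 7.9, 10.8, 9.1),
    (10.5, None, 11.0, None),
    (9.8, 8.8, 10.2, None),
]

strata = stratify_by_pattern(responses)
proportions = pattern_proportions(strata)
for pattern, share in sorted(proportions.items(), reverse=True):
    print(pattern, share)
```

A separate model for the observed components would then be fitted within each stratum, which corresponds to the $\pi[y_i \mid X_i, m_i]$ factor, while the proportions correspond to $\pi[m_i \mid X_i]$ in the absence of covariates.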
2. Methods
In defining a pattern-mixture model for a paired 2 × [...] “FEV”) in an asthma trial. Alternatively, and perhaps more commonly, the response could be a summary of multiple, repeated outcome measures, such as the change in response from pre- to post-treatment or the area under a dose-response curve (“AUC”) in a bioequivalence trial. We also assume that the subjects do not contribute any additional covariates. Thus, in general, if we refer to the two types of subjects who are matched as 1 and 2 and we label the treatments as A and B, then we expect each pair of subjects $j$ of sequence $s$ to yield only the quadrivariate response $Y_{sj} = (Y_{sj1A}, Y_{sj1B}, Y_{sj2A}, Y_{sj2B})$. To define a pattern-mixture model for a paired 2 × [...]

When modeling missing data in a clinical trial, it is common to assume that the missingness happens monotonically, i.e., after the first missing data point for a subject, all of the subsequent anticipated data points for that subject are also missing. For example, the data vector for a subject $i$ might look something like $y_i = (y_{i1}, y_{i2}, y_{i3}, \cdot, \cdot, \cdot)'$, assuming the subject misses all measurement occasions after the third occasion. The monotonic missingness assumption greatly simplifies the model development and subsequent analysis.
Unfortunately, for a paired crossover design, the monotonicity assumption is not realistic. The data vector $y_{sj}$ for the paired 2 × [...] $y_{sj} = (y_{sj1A}, \cdot, y_{sj2A}, \cdot)'$ are quite possible in a paired crossover design. One could still hope to simplify the problem by assuming that at least the missingness with respect to each subject is monotonic. However, the BARGE Study data set, to which we later apply the methods described herein, contains non-monotonic missingness within subjects. Therefore, we do not make any simplifying assumptions, but instead address the problem in its full generality.

Another complication of handling missing data in a crossover design (paired or otherwise) is that, even when the same pattern of missingness occurs, different information is gleaned about the treatments from the subjects in different sequences. For example, for our paired 2 × [...] $y_{1j}$, which are missing the last data point:

$$y_{1j} = (y_{1j1A}, y_{1j1B}, y_{1j2A}, \cdot)'$$

provide no information about the type 2-treatment B combination, while data vectors for the second sequence, $y_{2j}$, which are missing the last data point:

$$y_{2j} = (y_{2j1B}, y_{2j1A}, y_{2j2B}, \cdot)'$$

provide no information about the type 2-treatment A combination. Therefore, pattern-mixture models for any crossover design must accommodate patterns of missingness for each of the sequences.

Let $P_{ps}$ denote the pattern of missingness for pattern $p = 0, 1, \ldots, 14$ for sequence $s = 1, 2$. Table 2 summarizes, for the paired 2 × [...]

Table 2
Eight patterns of missingness in paired 2-by-2 crossover design that are monotonic within subject. X denotes observed and ? denotes missing.

Pattern   Sequence   Subject 1 (R)      Subject 2 (G)
                     Per 1   Per 2      Per 1   Per 2

Table 3
Seven patterns of missingness in paired 2-by-2 crossover design that are non-monotonic within subject. X denotes observed and ? denotes missing.

Pattern   Sequence   Subject 1 (R)      Subject 2 (G)
                     Per 1   Per 2      Per 1   Per 2

Complicating issues arise when modeling the data from a 2 × [...]

[...] crossover model for complete data

In defining a statistical model for the paired 2 × [...] $\lambda_A$) is the component of the response in the second period due to the lasting effect of treatment A from the first period (Jones and Kenward [2], Ratkowsky et al. [3] and Senn [4]). Likewise, in a BA sequence, the “B carryover” ($\lambda_B$) is the component of the response in the second period due to the lasting effect of treatment B from the first period. In a 2 × [...] $\lambda_A = \lambda_B$), it is not possible to estimate the primary quantity of interest, the true treatment effect $\mu_A - \mu_B$.

There are two situations in which it is possible to estimate the true treatment effect: 1) if the carryover effects are the same for each treatment ($\lambda_A = \lambda_B$); or 2) if the time periods between the treatment periods, i.e., the washout periods, are designed to be lengthy enough to render both carryover effects negligible ($\lambda_A = \lambda_B = 0$). Most 2 × [...]

$$Y_{sj} = X_{sj}\beta + \varepsilon_{sj},$$

where:
• $Y_{sj} = (Y_{sj1A}, Y_{sj1B}, Y_{sj2A}, Y_{sj2B})'$ is the quadrivariate response for pair $j$ $(1, 2, \ldots, n_s)$ of sequence $s$ (1 = AB, 2 = BA).
• $X_{sj}$ is a 4 × 8 design matrix [...] ($X_{1j}$ and $X_{2j}$ do not depend on $j$, i.e., $X_{sj} = X_s$ for all $s = 1, 2$ and $j = 1, 2, \ldots, n_s$).
• $\beta = (\mu_{1A}, \mu_{1B}, \mu_{2A}, \mu_{2B}, \rho_1, \rho_2, \nu_1, \nu_2)'$ is an 8 × 1 vector of parameters.
• $\varepsilon_{sj} = (\varepsilon_{sj1A}, \varepsilon_{sj1B}, \varepsilon_{sj2A}, \varepsilon_{sj2B})'$ is a random error term.

The mean of the responses derives directly from the $X_{sj}\beta$ portion of the model, while the variance of the responses derives directly from the $\varepsilon_{sj}$ portion of the model.

Our proposed model for the mean of the responses arising from a paired 2 × [...] ($\mu_{kl}$), as well as the period ($\rho_k$) and sequence ($\nu_l$) in which the treatments are taken.
More specifically, we propose parameterizing the mean of the responses as:

$$\begin{array}{cccc}
\text{Type} & \text{Sequence} & \text{Period 1} & \text{Period 2} \\
1 & 1 & \mu_{1A} + \rho_1 + \nu_1 & \mu_{1B} - \rho_1 + \nu_1 \\
1 & 2 & \mu_{1B} + \rho_1 - \nu_1 & \mu_{1A} - \rho_1 - \nu_1 \\
2 & 1 & \mu_{2A} + \rho_2 + \nu_2 & \mu_{2B} - \rho_2 + \nu_2 \\
2 & 2 & \mu_{2B} + \rho_2 - \nu_2 & \mu_{2A} - \rho_2 - \nu_2
\end{array}$$

Traditionally, the variance matrix for a crossover design is assumed to be compound symmetric. That is, the variance of the responses arising from one treatment is assumed to equal the variances of the responses arising from other treatments. And, the covariances between the responses of the subjects when on different treatments are also assumed to be equal. Like the approach of others (Ekbohm and Melander [12], Sheiner [13], Chinchilli and Esinhart [14] and Putt and Chinchilli [15]), we instead model the maximum number of variance components permitted for the paired 2 × [...] $\Sigma$, of the response $Y_{sj}$, $s = 1, 2$:

$$\Sigma = \begin{pmatrix}
\sigma_{1A,1A} & \sigma_{1A,1B} & \sigma_{1A,2A} & \sigma_{1A,2B} \\
\sigma_{1A,1B} & \sigma_{1B,1B} & \sigma_{1B,2A} & \sigma_{1B,2B} \\
\sigma_{1A,2A} & \sigma_{1B,2A} & \sigma_{2A,2A} & \sigma_{2A,2B} \\
\sigma_{1A,2B} & \sigma_{1B,2B} & \sigma_{2A,2B} & \sigma_{2B,2B}
\end{pmatrix}$$

is much more flexible than the compound symmetric structure traditionally assumed.

As the paired 2 × [...]

Attempting to define a model for each of the fifteen missing data patterns yields identifiability problems. Some of the patterns will naturally be more populated, while others will be sparsely populated at best. For the sake of illustration, Table 4 summarizes the number of subject pairs in the BARGE Study falling into each of the fifteen missing data patterns. Table 4 indicates that there were no non-monotonic patterns observed in the BARGE Study, which is not surprising because of the longitudinal nature of the trial design.

In general, because of the inherent identifiability problems, it is necessary to collapse the fifteen patterns into coarser groupings, so that information about the effects parameters can be “borrowed” across the patterns. When considering potential groupings, one should take into account the area of scientific research, as well as the various characteristics of the subjects and/or pairs that might yield group differences.
In creating groupings for the BARGE Study data, we propose the formation of the following three groups:
1. A group containing the completers who have a pair match (patterns 0, 10, 11, and 12). This group is denoted group “C” for “completers.”
2. A group in which all patterns involve missing a second period, regardless of whether a pair match exists (patterns 1, 2, 3, 6, 7, 13 and 14). This group is denoted group “D” for missingness due to “dropout.”
3. A group of the patterns in which a subject is a completer but does not have a pair match (patterns 4, 5, 8 and 9). This group is denoted group “P” for missingness due to a missing “pair.”

Collapsing the counts in Table 4 for the BARGE Study according to these criteria yields $n_C = 29$, $n_D = 6$, and $n_P = 5$.

It should be emphasized that other groupings are possible besides the one that we propose. Some of the other groupings, however, require one to assume that no period and no sequence effects exist. That is an assumption that we were unwilling to make for the BARGE Study. It should be noted, though, that if there were neither period nor sequence effects to worry about, the methods described by Little [7] could be applied instead.

Table 4
Frequencies of patterns observed in the BARGE Study.

Pattern   Count   Pattern   Count   Pattern   Count

Now that our groupings are defined, let:
• $Y_{sj}$ denote the 4 × 1 complete response vector for pair $j$ in sequence $s$.
• $Y_{psj}$ denote the $r$ × 1 observed response vector for pair $j$ in pattern $p$ and sequence $s$.
• $\beta^{(g)} = (\mu^{(g)}_{1A}, \mu^{(g)}_{1B}, \mu^{(g)}_{2A}, \mu^{(g)}_{2B}, \rho^{(g)}_1, \rho^{(g)}_2, \nu^{(g)}_1, \nu^{(g)}_2)'$ denote the 8 × 1 parameter vector for group $g$ = C, D, and P.
• $X_{sj}$ denote the 4 × 8 design matrix for pair $j$ in sequence $s = 1, 2$, linking $Y_{sj}$ to $\beta^{(C)}$.
• $\Sigma$ denote the 4 × 4 variance-covariance matrix of $Y_{sj}$ (notice that we do not assume that the variance-covariance parameters differ across the patterns).
• $E_{ps}$ be an $r$ × 4 matrix, derived from the 4 × 4 identity matrix for pattern $p$ and sequence $s$, in which rows of the identity matrix are removed according to the missing values in $Y_{sj}$. The rows of the identity matrix are always considered in 1A, 1B, 2A, 2B order, and $r$ equals the number of non-missing observations.

Then, we define our pattern-mixture model as:

$$Y_{psj} = E_{ps} Y_{sj} \sim N_r\left(E_{ps} X_{sj} \beta^{(g)},\; E_{ps} \Sigma E_{ps}'\right),$$

where $g$ = C for patterns $p$ = 0, 10, 11, and 12; $g$ = D for patterns $p$ = 1, 2, 3, 6, 7, 13, and 14; and $g$ = P for patterns $p$ = 4, 5, 8, and 9.

For example, pattern $p = 2$ and sequence $s = 1$ is missing the 1B measurement. Therefore, $r = 3$ and:

$$E_{21} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

The mean of the response $Y_{21j} = (Y_{1j1A}, Y_{1j2A}, Y_{1j2B})'$ is:

$$E_{21} X_{1j} \beta^{(D)} = \begin{pmatrix} \mu^{(D)}_{1A} + \rho^{(D)}_1 + \nu^{(D)}_1 \\ \mu^{(D)}_{2A} + \rho^{(D)}_2 + \nu^{(D)}_2 \\ \mu^{(D)}_{2B} - \rho^{(D)}_2 + \nu^{(D)}_2 \end{pmatrix}$$

and the 3 × 3 variance-covariance matrix is:

$$E_{21} \Sigma E_{21}' = \begin{pmatrix} \sigma_{1A,1A} & \sigma_{1A,2A} & \sigma_{1A,2B} \\ \sigma_{1A,2A} & \sigma_{2A,2A} & \sigma_{2A,2B} \\ \sigma_{1A,2B} & \sigma_{2A,2B} & \sigma_{2B,2B} \end{pmatrix}.$$

The proposed means and variances of the remaining patterns $p$ and sequences $s$ can be obtained similarly.

We can readily obtain maximum likelihood (ML) estimates of the parameters of our pattern-mixture model using available software such as PROC MIXED in SAS 9.1. The following is sample code from SAS PROC MIXED that can be used to find the ML estimates of the covariance parameters and the pattern-specific location parameters:
PROC MIXED DATA = barge METHOD = ML;
  CLASS pairid position grp;
  MODEL response = mu1A(grp) mu1B(grp) mu2A(grp) mu2B(grp)
                   rho1(grp) rho2(grp) nu1(grp) nu2(grp) / NOINT S;
  REPEATED position / SUBJECT = pairid TYPE = UN;
RUN;
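As a numerical cross-check on the model that this code is asked to fit, the selection-matrix example for pattern p = 2 and sequence s = 1 can be reproduced in plain Python. This is a sketch under stated assumptions: the helper names are invented here, the design matrix `X1` is written out from the mean parameterization given earlier, and the values in `beta_D` and `sigma` are hypothetical numbers chosen only for illustration.

```python
def selection_matrix(observed):
    """Rows of the identity matrix kept for observed positions (1A, 1B, 2A, 2B order)."""
    size = len(observed)
    return [[1 if col == row else 0 for col in range(size)]
            for row in range(size) if observed[row]]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Design matrix for sequence 1 (AB); columns are
# mu1A, mu1B, mu2A, mu2B, rho1, rho2, nu1, nu2.
X1 = [
    [1, 0, 0, 0,  1,  0, 1, 0],   # 1A: mu1A + rho1 + nu1
    [0, 1, 0, 0, -1,  0, 1, 0],   # 1B: mu1B - rho1 + nu1
    [0, 0, 1, 0,  0,  1, 0, 1],   # 2A: mu2A + rho2 + nu2
    [0, 0, 0, 1,  0, -1, 0, 1],   # 2B: mu2B - rho2 + nu2
]

# Hypothetical parameter values for group D and a hypothetical covariance matrix.
beta_D = [10.0, 8.0, 12.0, 9.0, 1.0, 2.0, 0.5, -0.5]
sigma = [
    [4.0, 1.0, 2.0, 1.0],
    [1.0, 5.0, 1.0, 2.0],
    [2.0, 1.0, 6.0, 1.0],
    [1.0, 2.0, 1.0, 7.0],
]

# Pattern 2, sequence 1: the 1B measurement is missing.
E = selection_matrix([True, False, True, True])
mean = [row[0] for row in matmul(matmul(E, X1), [[b] for b in beta_D])]
cov = matmul(matmul(E, sigma), transpose(E))

print(E)     # the 3 x 4 selection matrix
print(mean)  # E X beta for the observed positions 1A, 2A, 2B
print(cov)   # E Sigma E' for the observed positions
```

The covariance result is simply the submatrix of `sigma` with the 1B row and column deleted, which is what makes a single unstructured covariance matrix shareable across missingness patterns.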
In the PROC MIXED code above, the variable pairid represents the identification number of the pairs. The variable position represents the order of the data in the pair’s quadrivariate response vector: 1A is position 1, 1B is position 2, 2A is position 3 and 2B is position 4. The variable grp represents the three groups of patterns: for our example, groups C, D, and P.

As an alternative to using SAS PROC MIXED, we can use the log-likelihood function and a matrix programming language, such as S-Plus or SAS/IML, to write our own Fisher scoring or Newton-Raphson algorithm to find the ML estimates. Let:
• $n_{ps}$ denote the number of pairs falling in pattern $p$ of sequence $s$, and $n_s$ denote the number of pairs falling in sequence $s$.
• $\pi_{ps}$ denote the true proportion of pairs falling in pattern $p$ of sequence $s$.
• $\phi_{id}$ denote the set of 62 identifiable parameters: 24 effects ($\mu^{(C)}_{1A}, \ldots, \nu^{(P)}_2$), 28 proportions $\pi_{ps}$, and 10 covariance parameters ($\sigma_{1A,1A}, \ldots, \sigma_{2B,2B}$).

Then, letting $\Sigma^*_{ps} = E_{ps} \Sigma E_{ps}'$ and using Lagrange multipliers $\lambda_1$ and $\lambda_2$ with constraints $\pi_{0,1} + \cdots + \pi_{14,1} = 1$ and $\pi_{0,2} + \cdots + \pi_{14,2} = 1$, respectively, the log-likelihood function is:

$$\begin{aligned}
\log L_{Y,n}(\phi_{id}) \approx{} & \left[\sum_{p=0}^{14} \sum_{s=1}^{2} n_{ps} \log \pi_{ps}\right] - \lambda_1 (1 - \pi_{0,1} - \cdots - \pi_{14,1}) - \lambda_2 (1 - \pi_{0,2} - \cdots - \pi_{14,2}) \\
& - \frac{1}{2} \sum_{p=0}^{14} \sum_{s=1}^{2} \sum_{j=1}^{n_{ps}} \log \left|\Sigma^*_{ps}\right| \\
& - \frac{1}{2} \sum_{p=0}^{14} \sum_{s=1}^{2} \sum_{j=1}^{n_{ps}} \left(E_{ps} y_{sj} - E_{ps} X_{sj} \beta^{(g)}\right)' \Sigma^{*-1}_{ps} \left(E_{ps} y_{sj} - E_{ps} X_{sj} \beta^{(g)}\right)
\end{aligned}$$

where the value of $g$ depends on pattern $p$ ($g$ = C for $p$ = 0, 10, 11, 12; $g$ = D for $p$ = 1, 2, 3, 6, 7, 13, 14; and $g$ = P for $p$ = 4, 5, 8, 9).

Just as is the case for ML estimation, it is possible to use SAS PROC MIXED to find restricted maximum likelihood (REML) estimates of the parameters of the pattern-mixture model. We only need to make one minor modification to the SAS PROC MIXED code previously used to find the ML estimates (change the “METHOD = ML” option to “METHOD = REML”).

Again, alternatively we can use the restricted likelihood function and a matrix programming language, such as S-Plus or SAS/IML, to write our own Fisher scoring or Newton-Raphson algorithm to find the REML estimates. Let:
• $n = (n_{0,1}, \ldots, n_{14,2})'$ be the vector of sample sizes $n_{ps}$ for pattern $p$ and sequence $s$ (with $N = \sum_{p=0}^{14} \sum_{s=1}^{2} n_{ps}$).
• $\pi = (\pi_{0,1}, \ldots, \pi_{14,2})'$ be the vector of true population proportions for pattern $p$ and sequence $s$.
• $Y = (y_{0,1,1}, \ldots, y_{0,1,n_{0,1}}, \ldots, y_{14,2,1}, \ldots, y_{14,2,n_{14,2}})'$ be the $N$ × 1 stacked response vector.
• $\beta = (\beta^{(C)\prime}, \beta^{(D)\prime}, \beta^{(P)\prime})'$ be the 24 × 1 vector of effects.
• $X$ be the $N$ × 24 full design matrix linking $Y$ to $\beta$.
• $\Omega$ be the $N \times N$ block-diagonal variance-covariance matrix of $Y$.
• $\theta$ be a vector containing the 10 identifiable variance components ($\sigma_{1A,1A}, \ldots, \sigma_{2B,2B}$).

Then the joint likelihood of $Y$ and $n$ can be written as the factorization:

$$L_{Y,n}(\phi_{id}) = L_{Y|n}(\beta, \theta) \times L_n(\pi),$$

where $Y \mid n$ is a multivariate $N$-normal with mean $X\beta$ and variance $\Omega$, and $n$ is multinomial with parameters $N$ and $\pi$. Then, the general procedure behind REML estimation of transforming the data vector $Y \mid n$ such that the likelihood $L_{Y|n}(\beta, \theta)$ factors into two components can be applied, yielding:

$$\log L_{Y,n}(\phi_{id}) = \log L'(\theta) + \log L''(\beta, \theta) + \log L_n(\pi).$$

Letting $P = \Omega^{-1} - \Omega^{-1} X (X' \Omega^{-1} X)^{-1} X' \Omega^{-1}$, the log REML likelihood function is:

$$\log L'(\theta) \approx -\frac{1}{2} \log |\Omega| - \frac{1}{2} \log \left|X' \Omega^{-1} X\right| - \frac{1}{2} Y' P Y$$

and:

$$\log L''(\beta, \theta) = -\frac{1}{2} \log \left|\left(X' \Omega^{-1} X\right)^{-1}\right| - \frac{1}{2} \left(\hat{\beta} - \beta\right)' \left(X' \Omega^{-1} X\right) \left(\hat{\beta} - \beta\right),$$

where $\hat{\beta}$ has the form of the generalized least squares (GLS) estimator $\hat{\beta} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$. Using Lagrange multipliers, $\log L_n(\pi)$ is the log-likelihood of the two independent multinomial samples:

$$\log L_n(\pi) = \left[\sum_{p=0}^{14} \sum_{s=1}^{2} n_{ps} \log \pi_{ps}\right] - \lambda_1 (1 - \pi_{0,1} - \cdots - \pi_{14,1}) - \lambda_2 (1 - \pi_{0,2} - \cdots - \pi_{14,2}).$$

[...] $\gamma$

Thus far, we only have addressed estimation of the pattern-specific parameters, such as $\mu^{(C)}_{1A}$, $\rho^{(D)}_1$, and $\nu^{(P)}_1$. Interest, however, typically lies in inference about the overall population parameters, such as $\mu_{1A}$, $\rho_1$, and $\nu_1$, not the pattern-specific ones. Therefore, here we extend our work of the previous sections by addressing estimation of the overall population parameters. Furthermore, since use of the paired 2 × [...]

$$\gamma = (\mu_{1A} - \mu_{1B}) - (\mu_{2A} - \mu_{2B}),$$

we also derive the asymptotic distribution of $\hat{\gamma}$ here. Since the expression for the point estimator $\hat{\gamma}$ and the form of the asymptotic distribution of $\hat{\gamma}$ are the same regardless of whether REML or ML estimation is used in obtaining $\hat{\gamma}$, for the sake of simplicity, we proceed assuming that ML estimation is used.

In order to define point estimators of the overall population parameters, let $\hat{\pi}_g$ denote the proportion of pairs in group $g$ for $g$ = C, D, and P. Then, for $k = 1, 2$ and $l = A, B$, the weighted average:

$$\begin{aligned}
\hat{\mu}_{kl} &= \hat{\pi}_C \hat{\mu}^{(C)}_{kl} + \hat{\pi}_D \hat{\mu}^{(D)}_{kl} + \hat{\pi}_P \hat{\mu}^{(P)}_{kl} \qquad (2.1) \\
&= \hat{\pi}_C \hat{\mu}^{(C)}_{kl} + \hat{\pi}_D \hat{\mu}^{(D)}_{kl} + (1 - \hat{\pi}_C - \hat{\pi}_D) \hat{\mu}^{(P)}_{kl}
\end{aligned}$$

is the ML estimator of the overall population treatment-by-genotype mean. And, therefore $\hat{\gamma} = (\hat{\mu}_{1A} - \hat{\mu}_{1B}) - (\hat{\mu}_{2A} - \hat{\mu}_{2B})$ is the ML estimator of $\gamma$.

The estimator $\hat{\gamma}$ is a function of $\hat{\pi}_g$ and $\hat{\mu}^{(g)}_{kl}$, the estimated proportions and pattern-specific means.
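The weighted average (2.1) and the resulting interaction estimate can be sketched as follows. The function names are invented for this sketch, and the group proportions and pattern-specific means are hypothetical round numbers rather than BARGE Study estimates.

```python
CELLS = ("1A", "1B", "2A", "2B")

def overall_means(pi_hat, mu_hat):
    """Equation (2.1): average the pattern-specific means, weighted by group proportions."""
    return {cell: sum(pi_hat[g] * mu_hat[g][cell] for g in pi_hat) for cell in CELLS}

def interaction(mu):
    """gamma = (mu_1A - mu_1B) - (mu_2A - mu_2B)."""
    return (mu["1A"] - mu["1B"]) - (mu["2A"] - mu["2B"])

# Hypothetical group proportions (they must sum to 1) and pattern-specific means.
pi_hat = {"C": 0.5, "D": 0.25, "P": 0.25}
mu_hat = {
    "C": {"1A": 10.0, "1B": 20.0, "2A": 24.0, "2B": 12.0},
    "D": {"1A": 4.0, "1B": 8.0, "2A": 0.0, "2B": 4.0},
    "P": {"1A": 2.0, "1B": 0.0, "2A": 8.0, "2B": 4.0},
}

mu_bar = overall_means(pi_hat, mu_hat)
gamma_hat = interaction(mu_bar)
print(mu_bar)
print(gamma_hat)
```

Because the estimate is a smooth function of the proportions and the pattern-specific means, its sampling variability comes from both sources, which is why the variance needs the delta method rather than a simple weighted sum of variances.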
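Therefore, we derive the asymptotic distribution of $\hat{\gamma}$ by first using the asymptotic normality of ML estimators and then by an application of the delta method. Let:
• $\hat{\pi} = (\hat{\pi}_C, \hat{\pi}_D)'$ be the 2 × 1 vector of estimated proportions.
• $\pi = (\pi_C, \pi_D)'$ be the 2 × 1 vector of true proportions.
• $V(\hat{\pi}) = \mathrm{Diag}(\pi) - \pi\pi'$ denote the 2 × 2 variance matrix of $\hat{\pi}$.
• $\hat{\mu} = (\hat{\mu}^{(C)}_{1A}, \hat{\mu}^{(D)}_{1A}, \hat{\mu}^{(P)}_{1A}, \ldots, \hat{\mu}^{(C)}_{2B}, \hat{\mu}^{(D)}_{2B}, \hat{\mu}^{(P)}_{2B})$ be the 12 × 1 vector of estimated pattern-specific means.
• $\mu = (\mu^{(C)}_{1A}, \mu^{(D)}_{1A}, \mu^{(P)}_{1A}, \ldots, \mu^{(C)}_{2B}, \mu^{(D)}_{2B}, \mu^{(P)}_{2B})$ be the 12 × 1 vector of true pattern-specific means.
• $V(\hat{\mu})$ denote the 12 × 12 variance matrix of $\hat{\mu}$.

Then, by the asymptotic normality of maximum likelihood estimators and by the independence of $\hat{\pi}$ and $\hat{\mu}$:

$$\sqrt{N} \left( \begin{pmatrix} \hat{\pi}_N \\ \hat{\mu}_N \end{pmatrix} - \begin{pmatrix} \pi \\ \mu \end{pmatrix} \right)$$

is asymptotically multivariate normal with mean $0_{14 \times 1}$ and 14 × 14 variance-covariance matrix:

$$V = \begin{pmatrix} V(\hat{\pi}) & 0_{2 \times 12} \\ 0_{12 \times 2} & V(\hat{\mu}) \end{pmatrix}.$$

Now, defining the 4 × 1 vector-valued function $g((\hat{\pi}', \hat{\mu}')') = (\hat{\mu}_{1A}, \hat{\mu}_{1B}, \hat{\mu}_{2A}, \hat{\mu}_{2B})'$, where the functions $\hat{\mu}_{1A}$, $\hat{\mu}_{1B}$, $\hat{\mu}_{2A}$ and $\hat{\mu}_{2B}$ are as defined by (2.1), the 4 × 14 Jacobian matrix equals $J = [J_1 \mid J_2]$, where $J_1$ is the 4 × 2 matrix:

$$J_1 = \begin{pmatrix}
\hat{\mu}^{(C)}_{1A} - \hat{\mu}^{(P)}_{1A} & \hat{\mu}^{(D)}_{1A} - \hat{\mu}^{(P)}_{1A} \\
\hat{\mu}^{(C)}_{1B} - \hat{\mu}^{(P)}_{1B} & \hat{\mu}^{(D)}_{1B} - \hat{\mu}^{(P)}_{1B} \\
\hat{\mu}^{(C)}_{2A} - \hat{\mu}^{(P)}_{2A} & \hat{\mu}^{(D)}_{2A} - \hat{\mu}^{(P)}_{2A} \\
\hat{\mu}^{(C)}_{2B} - \hat{\mu}^{(P)}_{2B} & \hat{\mu}^{(D)}_{2B} - \hat{\mu}^{(P)}_{2B}
\end{pmatrix}$$

and $J_2$ is the 4 × 12 matrix:

$$J_2 = \begin{pmatrix}
\hat{\pi}_C & \hat{\pi}_D & \hat{\pi}_P & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \hat{\pi}_C & \hat{\pi}_D & \hat{\pi}_P & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \hat{\pi}_C & \hat{\pi}_D & \hat{\pi}_P & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & \hat{\pi}_C & \hat{\pi}_D & \hat{\pi}_P
\end{pmatrix}.$$

Then, an application of the delta method yields that:

$$\sqrt{N} \left( \begin{pmatrix} \hat{\mu}_{1A} \\ \hat{\mu}_{1B} \\ \hat{\mu}_{2A} \\ \hat{\mu}_{2B} \end{pmatrix} - \begin{pmatrix} \mu_{1A} \\ \mu_{1B} \\ \mu_{2A} \\ \mu_{2B} \end{pmatrix} \right)$$

is asymptotically multivariate normal with mean $0_{4 \times 1}$ and 4 × 4 variance-covariance matrix $JVJ'$.

Now, we can construct any linear combination of the population mean parameters as $\gamma = c'(\mu_{1A}, \mu_{1B}, \mu_{2A}, \mu_{2B})'$, but the contrast vector of particular interest in our application is:

$$c = \begin{pmatrix} 1 & -1 & -1 & 1 \end{pmatrix}'.$$

Then $\hat{\gamma} = g\left((\hat{\mu}_{1A}, \hat{\mu}_{1B}, \hat{\mu}_{2A}, \hat{\mu}_{2B})'\right) = c'(\hat{\mu}_{1A}, \hat{\mu}_{1B}, \hat{\mu}_{2A}, \hat{\mu}_{2B})'$ and

$$\frac{\partial \hat{\gamma}}{\partial (\hat{\mu}_{1A}, \hat{\mu}_{1B}, \hat{\mu}_{2A}, \hat{\mu}_{2B})} = c'.$$

Therefore, one final application of the delta method yields that $\sqrt{N}(\hat{\gamma} - \gamma)$ is asymptotically normal with mean 0 and variance $V(\hat{\gamma}) = c'JVJ'c$. The resulting asymptotic variance $V(\hat{\gamma})$ depends on $\hat{\pi}_g$, $\hat{\mu}^{(g)}_{kl}$, $V(\hat{\pi})$ and $V(\hat{\mu})$, and therefore must be estimated.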
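The delta-method variance $c'JVJ'c$ can be assembled numerically as in the sketch below. All helper names are invented here, the inputs are hypothetical, and $V(\hat{\mu})$ is taken to be an identity matrix purely to keep the arithmetic easy to follow; in practice it would come from the fitted mixed model.

```python
import math

CELLS = ("1A", "1B", "2A", "2B")
GROUPS = ("C", "D", "P")
CONTRAST = (1.0, -1.0, -1.0, 1.0)  # c = (1, -1, -1, 1)'

def jacobian(pi_hat, mu_hat):
    """4 x 14 matrix J = [J1 | J2] of the map (pi_C, pi_D, mu) -> overall cell means."""
    pis = {"C": pi_hat["C"], "D": pi_hat["D"], "P": 1.0 - pi_hat["C"] - pi_hat["D"]}
    rows = []
    for i, cell in enumerate(CELLS):
        j1 = [mu_hat["C"][cell] - mu_hat["P"][cell],
              mu_hat["D"][cell] - mu_hat["P"][cell]]
        j2 = [pis[g] if j == i else 0.0 for j in range(len(CELLS)) for g in GROUPS]
        rows.append(j1 + j2)
    return rows

def pi_covariance(pi_hat):
    """V(pi_hat) = Diag(pi) - pi pi' for the free proportions (pi_C, pi_D)."""
    p = [pi_hat["C"], pi_hat["D"]]
    return [[(p[i] if i == j else 0.0) - p[i] * p[j] for j in range(2)] for i in range(2)]

def block_diagonal(a, b):
    """Stack two square matrices into one block-diagonal matrix."""
    na, nb = len(a), len(b)
    return [row + [0.0] * nb for row in a] + [[0.0] * na + row for row in b]

def gamma_variance(pi_hat, mu_hat, v_mu):
    """Asymptotic variance c' J V J' c of sqrt(N) * (gamma_hat - gamma)."""
    J = jacobian(pi_hat, mu_hat)
    V = block_diagonal(pi_covariance(pi_hat), v_mu)
    grad = [sum(CONTRAST[i] * J[i][k] for i in range(4)) for k in range(len(V))]  # c'J
    return sum(grad[a] * V[a][b] * grad[b] for a in range(len(V)) for b in range(len(V)))

# Hypothetical inputs.
pi_hat = {"C": 0.5, "D": 0.25}
mu_hat = {
    "C": {"1A": 10.0, "1B": 20.0, "2A": 24.0, "2B": 12.0},
    "D": {"1A": 4.0, "1B": 8.0, "2A": 0.0, "2B": 4.0},
    "P": {"1A": 2.0, "1B": 0.0, "2A": 8.0, "2B": 4.0},
}
v_mu = [[1.0 if i == j else 0.0 for j in range(12)] for i in range(12)]

variance = gamma_variance(pi_hat, mu_hat, v_mu)
n_pairs = 40  # hypothetical total number of pairs N
standard_error = math.sqrt(variance / n_pairs)
print(variance, standard_error)
```

Dividing by N before taking the square root converts the variance of $\sqrt{N}(\hat{\gamma} - \gamma)$ into an approximate standard error for $\hat{\gamma}$ itself.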
3. Results
As described in the Introduction, the BARGE Study was a randomized paired 2 × [...] $n_C = 29$, $n_D = 6$, and $n_P = 5$. Because of the small sample sizes for groups D and P, which yielded non-estimable standard errors for some of their estimated parameters, these two groups were pooled. Thus, the final pattern-mixture model for the analysis of the BARGE Study data consisted of only two groups (C and D+P).

A general linear model with correlated errors that included effects for period, sequence, and treatment, as described in Sections 2.4 and 2.5, was applied to group C and group D+P separately. Because group D+P had a small sample size, a common 4 × [...] $\gamma = (\mu_{RA} - \mu_{RP}) - (\mu_{GA} - \mu_{GP})$, the genotype × treatment interaction term, is −11.7 liters per minute with standard error 22.3 ($p$ = 0.[...] $\gamma$ is −[...] $p$ = 0.[...]

[...] yielded a more powerful and sensitive analysis than the analysis described here because many of the patients in pattern-mixture group D had partial data during the 16-week treatment periods which contributed to the overall analysis. Indeed, the authors of the BARGE manuscript estimated the genotype × treatment interaction term as 24.0 liters per minute with standard error 6.3 ($p$ < 0.[...]

Table 5
The REML estimates of the treatment effects from the mixed-effects model analysis of groups C and D+P separately. Arg/Arg = R, Gly/Gly = G, Albuterol = A, Placebo = P.

Group   Genotype   Treatment   Change in AM PEFR: Mean (Std Error)

Table 6
The REML estimates of the weighted averages of the treatment effects from the mixed-effects model analysis. Arg/Arg = R, Gly/Gly = G, Albuterol = A, Placebo = P.

Genotype   Treatment   Change in AM PEFR: Mean (Std Error)
4. Discussion
We have demonstrated the development of a pattern-mixture version of a general linear model with correlated errors for the paired 2 × [...] $m$ times, estimating the treatment effects within each data set, and averaging the effects across the $m$ data sets. Multiple data imputation affords two advantages: (1) it provides complete data sets that are more easily analyzed via standard statistical methods, and (2) it appropriately accounts for the variability in estimating the missing values. The disadvantage of multiple data imputation is that the probability model for generating imputed values could be misspecified. If so, then the combined results could be biased. For example, the probability model may be based on the estimated mean and variance structure of those experimental units with complete observations. If the data from dropouts have an inherently different mean and variance structure, then the imputed values could be misrepresentative.

Acknowledgments. The authors wish to thank an anonymous referee, whose valuable comments improved the presentation of this material.
References

[1] Cochran, W. G. (1983). Planning and Analysis of Observational Studies. Wiley, New York. MR0720048
[2] Jones, B. and Kenward, M. G. (1989). Design and Analysis of Cross-over Trials. Chapman and Hall/CRC Press, Boca Raton, FL. MR1014893
[3] Ratkowsky, D. A., Evans, M. A. and Alldredge, J. R. (1993). Cross-Over Experiments. Dekker, New York.
[4] Senn, S. (2002). Cross-over Trials in Clinical Research, 2nd ed. Wiley, New York.
[5] Kephart, D. K., Chinchilli, V. M., Hurd, S. and Cherniack, R. for the Asthma Clinical Research Network. (2001). The organization of the Asthma Clinical Research Network: A multicenter, multiprotocol clinical trials team. Control. Clin. Trials (Suppl) 119S–125S.
[6] Israel, I., Chinchilli, V. M., Ford, J. G., et al. for the National Heart, Lung, and Blood Institute’s Asthma Clinical Research Network. (2004). Use of regularly scheduled albuterol treatment in asthma: Genotype-stratified, randomized, placebo-controlled cross-over trial. Lancet.
[7] Little, R. J. A. (1995). Modeling the drop-out mechanism in repeated-measures studies. J. Amer. Statist. Assoc.
[8] Rubin, D. B. (1976). Inference and missing data. Biometrika.
[9] Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data. Wiley, New York. MR1925014
[10] Simon, L. J. and Chinchilli, V. M. (2007). A matched crossover design for clinical trials. Contemp. Clin. Trials.
[11] Laird, N. M. and Ware, J. H. (1982). Random effects models for longitudinal data: an overview of recent results. Biometrics.
[12] Ekbohm, G. and Melander, H. (1989). The subject-by-formulation interaction as a criterion for interchangeability of drugs. Biometrics.
[13] Sheiner, L. B. (1992). Bioequivalence revisited. Statistics in Medicine.
[14] Chinchilli, V. M. and Esinhart, J. D. (1996). Design and analysis of intra-subject variability in cross-over experiments. Statistics in Medicine.
[15] Putt, M. and Chinchilli, V. M. (1999). A mixed-effects model for the analysis of repeated measures cross-over studies. Statistics in Medicine.
[16] Schafer, J. L. (1997).