[PDF] Prediction-based classification for longitudinal biomarkers

Abstract

Assessment of circulating CD4 count change over time in HIV-infected subjects on antiretroviral therapy (ART) is a central component of disease monitoring. The increasing number of HIV-infected subjects starting therapy and the limited capacity to support CD4 count testing within resource-limited settings have fueled interest in identifying correlates of CD4 count change such as total lymphocyte count, among others. The application of modeling techniques will be essential to this endeavor due to the typically nonlinear CD4 trajectory over time and the multiple input variables necessary for capturing CD4 variability. We propose a prediction-based classification approach that involves first stage modeling and subsequent classification based on clinically meaningful thresholds. This approach draws on existing analytical methods described in the receiver operating characteristic curve literature while presenting an extension for handling a continuous outcome. Application of this method to an independent test sample results in greater than 98% positive predictive value for CD4 count change. The prediction algorithm is derived based on a cohort of n=270 HIV-1 infected individuals from the Royal Free Hospital, London who were followed for up to three years from initiation of ART. A test sample comprised of n=72 individuals from Philadelphia and followed for a similar length of time is used for validation. Results suggest that this approach may be a useful tool for prioritizing limited laboratory resources for CD4 testing after subjects start antiretroviral therapy.

Full PDF

aa r X i v : . [ s t a t . A P ] N ov The Annals of Applied Statistics (cid:13)

Institute of Mathematical Statistics, 2010

PREDICTION-BASED CLASSIFICATION FORLONGITUDINAL BIOMARKERS By Andrea S. Foulkes, Livio Azzoni, Xiaohong Li,Margaret A. Johnson, Colette Smith, Karam Mounzerand Luis J. Montaner

University of Massachusetts, The Wistar Institute, University ofMassachusetts, Royal Free Hampstead NHS Trust, Royal Free andUniversity College Medical School, Philadelphia Field Initiating Group forHIV Trials (FIGHT)and The Wistar Institute

Assessment of circulating CD4 count change over time in HIV-infected subjects on antiretroviral therapy (ART) is a central compo-nent of disease monitoring. The increasing number of HIV-infectedsubjects starting therapy and the limited capacity to support CD4count testing within resource-limited settings have fueled interest inidentifying correlates of CD4 count change such as total lympho-cyte count, among others. The application of modeling techniqueswill be essential to this endeavor due to the typically nonlinear CD4trajectory over time and the multiple input variables necessary forcapturing CD4 variability. We propose a prediction-based classiﬁca-tion approach that involves ﬁrst stage modeling and subsequent clas-siﬁcation based on clinically meaningful thresholds. This approachdraws on existing analytical methods described in the receiver op-erating characteristic curve literature while presenting an extensionfor handling a continuous outcome. Application of this method to anindependent test sample results in greater than 98% positive predic-tive value for CD4 count change. The prediction algorithm is derivedbased on a cohort of n = 270 HIV-1 infected individuals from theRoyal Free Hospital, London who were followed for up to three yearsfrom initiation of ART. A test sample comprised of n = 72 individ-uals from Philadelphia and followed for a similar length of time isReceived February 2009; revised September 2009. Supported by the National Institutes of Health (R01-AI056983 to A. S. F., R01-AI51225 to L. J. M. and U01-AI051986 to L. J. M.), the Philadelphia Foundation and theFund from the Commonwealth Universal Research Enhancement Program, PennsylvaniaDepartment of Health.

Key words and phrases.

Prediction, classiﬁcation, receiver operator characteristic(ROC) curve, generalized linear mixed eﬀects modeling (GLMM), CD4, HIV/AIDS.

This is an electronic reprint of the original article published by theInstitute of Mathematical Statistics in

The Annals of Applied Statistics ,2010, Vol. 4, No. 3, 1476–1497. This reprint diﬀers from the original in paginationand typographic detail. 1

A. S. FOULKES ET AL.used for validation. Results suggest that this approach may be a use-ful tool for prioritizing limited laboratory resources for CD4 testingafter subjects start antiretroviral therapy.

1. Introduction.

Chronic HIV infection results in the progressive deple-tion of CD4+ T lymphocytes from both lymphoid tissues and peripheralblood. Thus, the monitoring of peripheral blood CD4 count is the stan-dard used in decision-making concerning initiation of antiretroviral therapy(ART), as well as monitoring response to ART over time. In 2002 and againin 2006, the World Health Organization (WHO) proposed guidelines for ad-ministration of ARTs in an eﬀort to provide a clear public health approach toutilization of these limited, yet very powerful drugs [WHO-Report (2006)].This series of recommendations includes routine collection and monitoringof CD4 counts to inform decisions regarding both initiation and switchingof drug regimens. However, this report also acknowledges that collection ofrepeated CD4 counts may not be feasible in resource-limited settings due tothe high costs associated with such monitoring. In these instances, cliniciansare advised to initiate therapy in patients with asymptomatic HIV diseaseif total lymphocyte count (TLC) falls below 1200 cells / mm .In this manuscript we consider modeling strategies for using alternativesurrogate markers within an acute window (3 years) post-initiation of ther-apy. Since publication of the WHO guidelines, several reports have beenpublished on the clinical utility of alternative surrogate markers for moni-toring post-therapy response and speciﬁcally the correlation between thesemarkers and CD4 count [Bagchi et al. (2007); Bisson et al. (2006); Fer-ris et al. (2004); Mahajan et al. (2004); Kamya et al. (2004); Badri andWood (2003); Bedell et al. (2003); Spacek et al. (2003); Kumarasamy et al.(2002)]. These investigations involve both cross-sectional and longitudinaldata and implement a variety of straightforward analytical methods. Typ-ically, cross-sectional comparisons between CD4 count and TLC as well aslongitudinal comparisons between the change in each of these variables overa speciﬁed time period are performed using correlation analysis [Badri andWood (2003); Kamya et al. (2004); Kumarasamy et al. (2002); Spacek et al.(2003)]. A summary of analytic strategies described for these settings, andtheir potential limitations, is given in the discussion; notably, the scientiﬁcﬁndings of these reports are variable.In this manuscript we describe a prediction-based classiﬁcation (PBC)framework for predicting biomarker trajectories based on a binary decisionrule. PBC was originally described in the setting of classifying HIV ge-netic variants that capture variability in a cross-sectional response to ART[Foulkes and DeGruttola (2002, 2003)]. Within this framework, we presenttwo estimation procedures that both involve ﬁrst stage modeling using a gen-eralized linear mixed eﬀect model (GLMM). In the ﬁrst case, we dichotomize BC FOR LONGITUDINAL BIOMARKERS the biomarker a priori and use a logit link function. In this case, our ap-proach reduces simply to ﬁtting a logistic model coupled with a receiveroperator characteristic (ROC) curve analysis, which is commonly applied inpractice though it has not been described for this setting. The second esti-mation approach we present is based on ﬁtting a linear mixed eﬀects modelto the observed CD4 count, as measured on a continuous scale. This laterapproach may oﬀer improved predictive performance since it incorporatesthe full range of the continuous scale data. We describe both approachesfurther in Section 2. Section 3 then illustrates the method through appli-cation to two cohorts of HIV-1 infected individuals followed for three yearsafter initiation of ART. Some simple extensions are described in Section 4and ﬁnally we oﬀer a discussion of how the approaches complement existingmethods in Section 5.

2. Methods.

Monitoring patient level CD4 counts over time may involveconsideration of the observed counts at a given time point, the percentchange in counts across a given period of time or some other function of pa-tient level data. In general, interest lies in determining whether this functionof the data is above or below a threshold value. For example, in monitor-ing absolute CD4 counts, thresholds of 200 and 350 are considered withinwell-established treatment administration guidelines. A threshold of 20%,on the other hand, is common for monitoring the percent change in CD4between visits over time. We begin in this section by describing a generalmodeling framework. We then present an approach for predicting whetherabsolute CD4 is above a clinically meaningful threshold, at each of multiplediscrete time points. In Section 4 we consider extensions of this frameworkthat allow us to consider functions of the biomarker under study, such aspercentage change over a given time period.2.1.

Generalized linear mixed eﬀects model.

Consider the generalized lin-ear mixed eﬀects model (GLMM) given by g ( E [ Y i ]) = X i β + Z i b i , (2.1)where Y i = ( Y i , . . . , Y in i ) T is a vector of the n i responses for individual i , g ( · ) is a link function, X i is the n i × M corresponding design matrix across M covariates, β is the ﬁxed-eﬀects parameter vector and b i i . i . d . ∼ MVN(0 , D ).Here Z i is the design matrix for the random eﬀects and will typically includeboth an intercept and time component. One choice of X i and Z i is oﬀered inthe example of Section 3 and includes time varying values of white blood cellcount and lymphocyte percentage. This model is a natural choice for thissetting since repeated measures are taken over time on the same individualand the time points are unevenly spaced across individuals [Fitzmaurice,Laird and Ware (2004)]. A. S. FOULKES ET AL.

In this manuscript we consider two approaches to ﬁtting the model ofequation (2.1). Since ultimately we are interested in predicting whetherCD4 count is above (or below) a given threshold, we begin by modelinga dichotomized version of the observed CD4 data. We use the notation Y + ij to indicate this binary representation of the observed data. That is, we de-ﬁne the dependent variable Y + ij = I (CD4 ij > K ), where CD4 ij is the CD4count at the j th time point for individual i and K is set equal to a clinicallymeaningful threshold. In this case, the canonical logit link is used to modelthe resulting binary outcome. Formally, if we let θ ij = E [ Y + ij ] = Pr( Y + ij = 1),then equation (2.1) reduces in this setting to θ ij = exp[ x ij β + z ij b i ]1 + exp[ x ij β + z ij b i ] , (2.2)where x ij and z ij are the rows of X i and Z i respectively, corresponding tothe j th measurement for individual i .Second, we explore the utility of using the full range of the CD4 countdata by modeling CD4 as a continuous variable. That is, we let Y ij = CD4 ij and g ( · ) be the identity function, so that the model of equation (2.1) reducesto the linear mixed eﬀects model (LMM), given by Y ij = x ij β + z ij b i + ε ij , (2.3)where ε ij ∼ N (0 , σ ) and b i ⊥ ε ij . Since we ultimately aim to predict whetherCD4 is above a given threshold, we then derive a prediction rule based onthe estimated mean and variance components from this model.2.2. Prediction-based classiﬁcation.

In ﬁtting the mixed eﬀects modelof equation (2.1), we use the complete vector of observed data, given by y i = ( y i , . . . , y in i ), for all individuals in our learning sample. In general, wewant to make predictions for new individuals under the assumption that only baseline values of y i , given by y i , are observed. In the usual model ﬁttingcontext, the predicted y is generated using the empirical Bayes estimatesof b i , given by b b i = E [ b | y i ]. Notably, this conditions on this complete datavector and thus is not applicable to our setting, in which only the y i areavailable. Thus, we need to arrive at an alternative estimate of the randomeﬀects that conditions only on the observed data for new individuals. Weconsider two approaches in the context of the linear mixed model. In theﬁrst case, we replace y i with X i b β in the formula for b b i . This is our primaryapproach, described in Section 2.2.2 and applied in the example of Section 3.The second alternative we consider is to replace y i with the baseline measure y i , which is presented as an extension in Section 4. BC FOR LONGITUDINAL BIOMARKERS Binary outcome.

After ﬁtting the model of equation (2.1), meanand variance parameter estimates can be used to arrive at a predicted meanresponse for individual i at the j th time point. Consider ﬁrst the case inwhich we dichotomize CD4 count and ﬁt the GLMM with a logit link, asdescribed by equation (2.2). In this case, we have the predicted probability ofCD4 count being above the threshold K at the j th time point for individual i given by b θ ij = exp[ x ij b β + z ij b b i ]1 + exp[ x ij b β + z ij b b i ] , (2.4)where b β is a maximum likelihood estimate of β and b b i = E [ b i | y + i ] is theconditional mean of the random eﬀects for individual i , given the observeddata y + i . Numerical integration techniques, such as Gaussian quadrature,are required for model ﬁtting in this setting since no simple, closed-formsolutions to maximum likelihood estimation are available.A simple approach to prediction in this case is to let the predicted out-come, given by b y ij , equal 1 if b θ ij ≥ .

50 and 0 otherwise, where b θ ij is deﬁnedby equation (2.4). Alternatively, we may want to choose a prediction rulethat controls a clinically meaningful attribute. For example, in the CD4 pre-diction setting, we may want to control the false positive rate, deﬁned asthe proportion of individuals predicted to be above a safety threshold, whenin fact their CD4 counts are below this safe limit. In this case, we deﬁnemultiple rules, termed α -prediction rules, that are given by b y + ij,α = (cid:26) , if θ ij ≥ − α ,0 , otherwise,(2.5)where the unobserved θ ij is replaced with the estimate b θ ij . Notably, in mak-ing predictions for new individuals, the complete vector y + is not availableand, thus, b b i = E [ b i | y + i ] in equation (2.4) cannot be calculated. In the ex-ample provided below, we let b b i = E [ b i ] = 0 for all i in our test sample. Analternative approach for the linear model setting is described in Section 4. Table 1

Contingency table notation for a given α -prediction rule y + ij Total b y + ij,α n n n · n n n · Total n · n · n ·· A. S. FOULKES ET AL.

Based on a given α -prediction rule, we can generate the contingency ta-ble given in Table 1. Here the n kl ’s are the corresponding cell counts for k, l = 1 ,

2. For example, n is the number of observations that are ob-served to be above the threshold ( y + ij = 1) and predicted to be above thethreshold ( b y + ij,α = 1). The sensitivity of this rule is deﬁned as the proba-bility of correctly predicting an observation as being above the thresholdamong those responses that are in fact above the threshold and is givenalgebraically as Pr( b y + ij,α = 1 | y + ij = 1) = n /n · . The corresponding speci-ﬁcity is given by Pr( b y + ij,α = 0 | y + ij = 0) = n /n · and the false positive rateis FP α = 1 − speciﬁcity = n /n · . Positive predictive value (PPV) and neg-ative predictive value (NPV) are given by ( n /n · ) and ( n /n · ), respec-tively. By varying the value of α in equation (2.5), we generate multipleprediction rules and can construct a corresponding receiver operator char-acteristic (ROC) curve, which oﬀers a visual representation of the trade-oﬀbetween sensitivity and speciﬁcity. Speciﬁcally, an ROC curve is deﬁned as aplot of the false positive rate ( x -axis) and corresponding sensitivity ( y -axis)for each of multiple classiﬁers, in our case prediction rules. In our setting,each α -rule contributes one point to the ROC curve. We deﬁne the opti-mal rule as the one that controls the FP rate at a speciﬁed level, thoughalternative criterion are equally applicable.Since the prediction rule given by equation (2.5) depends on an esti-mate of θ ij that is derived based on the data, a cross-validation approach isnecessary to obtain accurate estimates of predictive performance, includingsensitivity and false positive rate. The motivation for this stems from theneed to characterize the ability to make predictions on observations that didnot contribute to the model ﬁtting procedure. In this manuscript, we usean independent test sample to evaluate model performance. The approachproceeds as follows: First, model parameters are estimated using data aris-ing from what we refer to as the learning sample. Second, the best α -ruleis identiﬁed based on the trade-oﬀ between sensitivity and speciﬁcity, againusing the learning sample data. The estimates of predictive performance(e.g., false positive rate) based on the learning sample are referred to asresubstitution estimates as the data used for estimating error rates are thesame as those used for deriving the prediction rule. Finally, measures of pre-dictive performance for the chosen α -rule are reported based on applyingthe rule to an independent data set, which we refer to as the test sampledata. These test sample estimates are considered unbiased reﬂections of pre-dictive performance, as independent data sets are used to generate the ruleand describe its performance.2.2.2. Continuous outcome.

The prediction approach just described fora binary outcome involves simply ﬁtting a logistic regression model and then

BC FOR LONGITUDINAL BIOMARKERS generating an ROC curve based on several probability cutoﬀs. While, to ourknowledge, this has not been applied to the setting of modeling biomarkertrajectories over time and speciﬁcally to CD4 monitoring, similar approachesare used in practice in other settings [Tosteson et al. (1994); Tosteson andBegg (1988)]. One reason that this approach may not be optimal for thepresent setting is that CD4 count is measured on a continuous scale. We thusconsider a simple extension of this approach that takes into considerationthe full range of the observed CD4 count data. We begin by modeling y ij =CD4 ij as a quantitative biomarker, using the linear mixed eﬀects model ofequation (2.3), and then derive a prediction approach similar to the onedescribed by equation (2.5).The model derived predicted value of y ij is given by b y ∗ ij = x ij b β + z ij b b i .Here x ij and z ij are again respectively the rows of X i and Z i correspondingto the j th measurement for individual i , b β = P Ni =1 ( X Ti b Σ − i X i ) − X Ti b Σ i y i isthe least squares estimate of β , b b i = E ( b i | y i ) = b DZ Ti b Σ − i ( y i − X i b β ) is thebest linear unbiased predictor (BLUP) of the random eﬀects for individual i , b Σ i = d Var( y i ) = Z i b DZ Ti + b σ I , and b D and b σ are the restricted maxi-mum likelihood estimates of D and σ , respectively. Rather than estimate θ ij = Pr(CD4 ij > K ) of equation (2.5), we describe a one-sided predictioninterval approach to identify a rule that is similar to the one described bythis equation.First note that the lower bound of the one-sided (1 − α ) prediction intervalfor y ij is given by l ij,α = b y ij − z α q Var( b y ij − y ij ) , (2.6)where z α is the quantile of a standard normal corresponding to a 1 − α probability and Var( b y ij − y ij ) is referred to as the prediction variance. In thismanuscript, we treat this interval as an approximate credible interval, so thatwe are (1 − α )% certain that the random variable Y ij will be greater than thisrealization of the lower bound. In other words, Pr( Y ij > l ij,α ) = (1 − α )%.Thus, if l ij,α > K , we are at least (1 − α )% certain that Y ij > K . In otherwords, l ij,α > K is equivalent to θ ≥ (1 − α ). As a result, the rule given by b y + ij,α = (cid:26) , if l ij,α > K ,0 , otherwise,(2.7)is equivalent to the one given by equation (2.5). As described in McClean,Sanders and Stroup (1991) and McCulloch and Searle (2001), the predic-tion variance is given by Var( b y ij − x ij β − z ij b i ) = x ij Var( b β ) x Tij + z ij Var( b b i − b i ) z Tij + x ij Cov( b β, b b i − b i ) z Tij where Var( b β ) = P Ni =1 ( X i Σ − i X Ti ) − , Var( b b i − b i ) = ( σ Z Ti Z i + D − ) − − Cov( b β, b b i − b i ) X Ti Σ − i Z i D and Cov( b β, b b i − b i ) = A. S. FOULKES ET AL. − DZ Ti Σ − i X i Var( b β ). In our setting, we are interested in the prediction vari-ance for a new observed value and thus have an additional σ term. Thatis, Var( b y ij − y ij ) of equation (2.6) is equal to Var( b y ij − x ij β − z ij b i ) + σ .The appropriateness of treating the above prediction interval as a credibleinterval depends on prior assumptions about the parameters of our model.Since we are using this as a means of generating a prediction rule, and not asa tool for inference, this approximation seems reasonable. It also performswell in the example provided in Section 3. A study of the relative advan-tages of applying a fully Bayesian approach to approximating the posteriorpredictive distribution for this data setting is ongoing research.Again a test sample is used to characterize model performance. In thelinear mixed modeling setting, we note that Var( b β ), b D and b σ are estimatedbased on the model ﬁtting procedure that uses the learning sample data.The remaining variance terms, Var( b b i − b i ) and Cov( b β, b b i − b i ) as well as thedesign elements x ij and z ij used in the calculation of l ij,α of equation (2.6)are based on the test sample data. Notably, in both modeling frameworks,the BLUPs of the random eﬀects can not be calculated for a new individualfor whom the response y i is not observed. One approach to handling thisunobserved data is to replace y i in the formula for b b i with X i b β so that b b i = b DZ Ti b Σ − i ( X i b β − X i b β ) = 0. This results in reducing b y ij to b y ij = x ij b β and is consistent with assigning each individual the estimated populationaverage. In the example below, we use the prediction variance from theusual regression setting of Var( b y ij − y ij ) = x ij Var( b β ) x Tij + σ . This predictionvariance is less than the one described above; however, as we are varying z α of equation (2.6) to generate a series of classiﬁcation rules, the magnitude ofthe interval is less relevant. An alternative approach for handling the randomeﬀects in the linear mixed modeling framework is described in Section 4.

3. Example.

The approach described in Section 2 is applied to a cohortof N = 270 individuals from the Royal Free Hospital, London who were fol-lowed for up to three years after initiation of ART. Detailed information onthe patient population and laboratory methods can be found in Smith et al.(2003, 2004). The aim of our analysis is to determine the utility of baselineCD4 count and repeated measures on WBC and lymphocyte percentage forpredicting CD4 counts over time. Our approach uses the complete CD4 countdata (across all time points) from a learning sample to generate a model;predictions based on this model are then made, for the resubstituted dataas well as for an independent test sample, assuming that we only observethe baseline values of CD4. Consideration is given to two clinically mean-ingful CD4 count thresholds: K = 200 and K = 350 cells / mm . All analysesare performed using R version 2.7.1. The median length of follow-up is 25months and the interquartile range (IQR) for length of follow-up is (14 , BC FOR LONGITUDINAL BIOMARKERS months. The median number of follow-up time points is 9 with a full rangeof 2 to 24. In total, there are 2635 records including baseline measurements.The median baseline CD4 count for this cohort is 219 . , lme() and lmer() functions of the nlme and lme4 packages, respectively.We assume a piecewise linear mixed eﬀects model for modeling CD4 countafter initiation of ART [Fitzmaurice, Laird and Ware (2004)]. This modelis appropriate since CD4 count tends to rise rapidly for approximately onemonth and then proceeds to increase more gradually. Fixed eﬀects for base-line CD4 count (on a log base 10 scale), baseline and time varying values ofWBC and lymphocyte percentage and time before and after one month offollow-up are included in the model as predictors. In addition, interactionsbetween each time component and baseline values of WBC and lymphocytepercent are included.The design matrix X i for the ﬁxed eﬀects of equation (2.1) is thus givenby X i = [ 1 N X i X i ] , (3.1) X i =  y i w i l i t i ( t i − + w i l i y i w i l i t i ( t i − + w i l i ... y i w i l i t in i ( t in i − + w in i l in i  , X i =  t i ∗ w i t i ∗ l i ( t i − + ∗ w i ( t i − + ∗ l i t i ∗ w i t i ∗ l i ( t i − + ∗ w i ( t i − + ∗ l i ... t in i ∗ w i t in i ∗ l i ( t in i − + ∗ w i ( t in i − + ∗ l i  , where w i and l i are respectively baseline WBC and baseline lymphocytepercent, t ij is time in months since initiation of ART, ( t ij − + is follow-uptime after the ﬁrst 1 month on ART for t ij > w ij and l ij are respectively WBC and lymphocyte percent at time t ij . We deﬁne y i in X i as log(CD4) for both the linear and generalized linear model althoughthe response variable, given by Y i = ( Y i , . . . , Y in i ), is dichotomized for thegeneralized linear model setting. Notably, this model allows for two lineartime trends, before and after 1 month of follow-up on ART. Random personspeciﬁc intercepts and slopes before the knot are also assumed so that thedesign matrix Z i for the random eﬀects of equation (2.1) is given by Z i =  t i t i ...1 t in i  . (3.2) A. S. FOULKES ET AL.

The random eﬀects vector in equation (2.1) is given by b Ti = [ b i b i ] rep-resenting the intercept and slope before the change point for individual i .We begin by ﬁtting the generalized linear model, as described in equa-tion (2.2). In this case, post-baseline CD4 counts are dichotomized and usedas the outcome in the model ﬁtting procedure. Predicted probabilities ofbeing above the CD4 threshold are estimated for each post-baseline timepoint for each individual. The results of applying a probability cutoﬀ of 0 . .

98 and 0 .

90) are high for boththresholds, the corresponding false positive rates are also high (0 .

54 and0 . α cutoﬀs are considered to generate multiple predictionrules and an ROC curve is generated, as illustrated in Figure 1(a). This isagain based on the GLMM approach to model ﬁtting. Data corresponding torules with resubstitution FP rates of approximately (but not greater than)5% and 10% and CD4 threshold cutoﬀs of K = 200 and 350 are providedin Tables 2(b) and (c). Resubstitution-based summary measures are givenin Table 3(a). Based on a CD4 threshold of K = 200, a FP rate of 0 . .

61, a positive predictive value of 0 .

97 and anegative predictive value of 0 .

32. For the same CD4 threshold, a FP of 0 . .

42, a positive predictive value of 0 .

98 anda negative predictive value of 0 . Table 2

Observed and predicted counts (based on learning sample data) (a) GLMM approach with a “naive” 0.50 probability cutoﬀ

Observed > < Total

Predicted >

200 1932 215 2147 <

200 34 184 218Total 1966 399 2365

Observed > < Total

Predicted >

350 1194 289 1483 <

350 137 745 882Total 1331 1034 2365BC FOR LONGITUDINAL BIOMARKERS Table 2

Continued (b) GLMM approach

Observed Observed > < Total > < Total

Predicted ∗ >

200 826 18 844 1206 37 1243 <

200 1140 381 1521 760 362 1122Total 1966 399 2365 1966 399 2365

Observed Observed > < Total > < Total

Predicted ∗ >

350 669 50 719 880 103 983 <

350 662 984 1646 451 931 1382Total 1331 1034 2365 1331 1034 2365(c) LMM approach

Observed Observed > < Total > < Total

Predicted ∗ >

200 1291 19 1319 1558 38 1596 <

200 675 380 1055 408 361 769Total 1966 399 2365 1966 399 2365

Observed Observed > < Total > < Total

Predicted ∗ >

350 760 51 811 940 103 1043 <

350 571 983 1554 391 931 1322Total 1331 1034 2365 1331 1034 2365 ∗ Predicted counts are based on rules with resubstitution FP rate estimates of approxi-mately (but not greater than) 5% (left panels) and 10% (right panels).

Next we ﬁtted the linear mixed eﬀects model, as described by equa-tion (2.3), to the observed CD4 count data. The resulting ROC curve illus-trating the sensitivity and corresponding false positive rates in this cohort(resubstitution estimates) is given in Figure 1(b). Count data correspondingto rules for which thresholds are K = 200 and 350 and the resubstitutionFP rates are approximately (but not greater than) 5% and 10% are given inTable 2(b). Corresponding summaries, as well as 95% bootstrap conﬁdenceintervals (CIs), are reported in Table 3(b). To arrive at CIs, we repeatedlysample individuals with replacement and in each case, ﬁt a linear mixed ef- A. S. FOULKES ET AL. (a)(b)

Fig. 1.

ROC curves based on resubstitution estimates. (a)

GLMM, (b)

LMM. fects model. The prediction rule corresponding to FP rates of approximately(but not greater than) 5% and 10% are selected and corresponding resubsti-tution estimates of sensitivity, PPV and NPV are recorded. A total of 100bootstraps are performed for each threshold and the ﬁfth and ninety-ﬁfthpercentiles reported.Based on a CD4 cutoﬀ of 200, a FP rate of 0 .

10 corresponds to a sensitivityof 0 .

79 [95% CI (0.74, 0.83)]. In this case, the PPV is 0 .

98 (0.97, 0.98) and

BC FOR LONGITUDINAL BIOMARKERS Table 3

Estimates of predictive performance (a) GLMM approach

GLMM (LS) GLMM (TS)Sens Spec PPV NPV Sens Spec PPV NPV K = 200:LS FP < .

05 0.42 0.95 0.98 0.25 0.66 0.96 0.99 0.31LS FP < .

10 0.61 0.91 0.97 0.32 0.77 0.96 0.99 0.39 K = 350:LS FP < .

05 0.50 0.95 0.93 0.60 0.61 0.95 0.95 0.59LS FP < .

10 0.66 0.90 0.90 0.67 0.79 0.90 0.93 0.71(b) LMM approach

LMM (LS) LMM (TS)Sens Spec PPV NPV Sens Spec PPV NPV K = 200:LS FP < .

05 0.66 0.95 0.99 0.36 0.77 0.96 0.99 0.39(0.60, 0.75) (0.98, 0.99) (0.31, 0.46)LS FP < .

10 0.79 0.90 0.98 0.47 0.84 0.96 0.99 0.49(0.74, 0.83) (0.97, 0.98) (0.37, 0.54) K = 350:LS FP < .

05 0.57 0.95 0.94 0.63 0.73 0.93 0.95 0.67(0.44, 0.67) (0.92, 0.95) (0.56, 0.70)LS FP < .

10 0.71 0.90 0.90 0.70 0.84 0.90 0.94 0.77(0.65, 0.79) (0.88, 0.92) (0.66, 0.78) the NPV is 0 .

47 (0.37, 0.54). This corresponds to the rule in which α =0 . − .

035 =96 . .

05 corresponds to asensitivity of 0 .

66 (0.60, 0.75), PPV of 0 .

99 (0.98, 0.99) and NPV of 0 . n = 72 individuals from an independent co-hort in Philadelphia. We use only baseline CD4 counts to make predictions,assuming that this is all that is available. The median baseline CD4 in thiscohort is 260 . / mm and the IQR is (159 . , . n = 327 since there are 399 −

72 = 327 post-baseline measure-ments for this cohort. In this case, n = 240 measurements are predicted to A. S. FOULKES ET AL.

Table 4

Observed and predicted counts (based on test sampledata)

Observed > < Total

Predicted >

200 238 2 240 <

200 44 43 87Total 282 45 327 be above the threshold, while 87 are predicted below. Since this is intendedas a prioritization tool, this rule would suggest performing a true CD4 teston the 87 observations that are predicted below the threshold to conﬁrmthe true value. A “savings” associated with this rule is 240 /

327 = 73% sincea CD4 test would not be required for this percentage of the observations.The “cost” is the associated false positive rate of 2 /

45 = 4 .

4. Extensions.

In this section we brieﬂy describe two extensions of themethod outlined in Section 2 to illustrate its ﬂexibility and directions forfurther development. First, we consider one approach to incorporating in-formation about the individual level random eﬀects into our prediction al-gorithm for the linear mixed eﬀects setting. This approach is relevant asit provides a potential framework for incorporating observed, post-baselineCD4 counts into the model. Additionally, it illustrates the trade-oﬀ betweenusing baseline data within the ﬁxed eﬀects design matrix, and using thesedata to inform prediction of the random eﬀects. Second, we detail how thismethod can be applied to making predictions about changes in CD4 countover time. Extensions for modeling alternative outcomes are relevant, as clin-ical decision making generally takes into account both absolute and relativeCD4 count changes.4.1.

Using observed response data to inform BLUPs of random eﬀects.

While leading to a prediction rule with good predictive performance, theapproach described in Section 2 does not take into account the latent eﬀectsthat result in some individuals having higher or lower responses, information

BC FOR LONGITUDINAL BIOMARKERS that is typically captured in random eﬀects. Several alternatives exist. Forexample, the prediction variance used in the example above is based on theusual regression setting, Var( b y ij − y ij ) = Var( x ij b β − x ij β − ε ). Alternatively,we could use Var( b y ij − y ij ) = Var( x ij b β − x ij β − z ij b i − ε ) = x ij Var( b β ) x Tij + z ij b Dz Tij + σ . That is, while we let b b i = 0, we still include the true b i in theprediction variance formula. Based on the London data, this results in slight,yet unremarkable improvements in sensitivity (results not shown).We can also estimate the random eﬀects for new individuals based onbaseline data. In the example provided, we assume only baseline CD4 countsare available, and these are used in the ﬁxed eﬀects design matrix ratherthan informing the random eﬀects. To begin, we propose ﬁtting the modelof equation (2.3) with the slight modiﬁcation that the observed baselineCD4 count, given by y i , is now included in the response vector Y i andremoved from the design matrix X i . In order to estimate the random eﬀectsfor a new individual (whose complete response vector y i is unobserved),we calculate the conditional expectation of the random eﬀects, given thebaseline (observed) response y i . That is, we replace b b i = E ( b i | y i ) with ˜ b i = E ( b i | y i ) = ( y i − x i b β ) / ( b D , + b σ ε ) b D , · , where b D , is the (1 ,

1) element of b D corresponding to the estimated variance of the intercept random eﬀect, b D , · is the column vector corresponding to the ﬁrst column of b D , b β is theﬁrst element of b β corresponding to the intercept ﬁxed eﬀect and x i is theﬁrst row of X i . This equation is derived simply by replacing the matrix Z i with its ﬁrst row and replacing the vectors y i and X i b β with their ﬁrstelements in the formula b b i = E ( b i | y i ) = b DZ Ti b Σ − i ( y i − X i b β ).Notably, this is not the same prediction of b i that would have been arrivedat if the complete data vector y i were observed and so the alternative no-tation ˜ b i is used. Through use of the ﬁrst column of the b D matrix, we drawon the estimated covariance between the random eﬀects to ﬁll in values forboth the intercept and slope random eﬀects for each individual, while onlyrelying on baseline values of the response. Finally, we additionally replaceVar( b b i − b i ) and Cov( b β, b b i − b i ) with Var(˜ b i − b i ) and Cov( b β, ˜ b i − b i ), respec-tively in the formula for Var( b y ij − y ij ). Applications of this approach to theLondon data (results not shown) are similar to those reported, suggestingthat, in this data example, using the modiﬁed BLUPs in place of treatingbaseline as CD4 as a predictor variable does not improve our prediction al-gorithm. Observed post-baseline measures of CD4 that occur prior to thetime of prediction could be incorporated similarly into the predicted randomeﬀects.4.2. Making predictions about the percentage change in CD4 count overtime.

In Sections 2 and 3 we focus on the setting in which interests lie in A. S. FOULKES ET AL. predicting the response at a single time point. More generally, we may wantto make a prediction about a function of the CD4 counts for individual i across a combination of time points j . For example, we may be interestedin the percentage change in CD4 count over a speciﬁed period of time,given by the function f t ( Y ij ) = ( Y ij − Y ij ′ ) /Y ij where ( j − j ′ ) = t . We canagain begin by ﬁtting the linear model of equation (2.3) to the repeatedCD4 count measures and arriving at predictions for new observations basedon this model. The predicted percentage change for a single individual i isthen given by b f t ( y ij ) = ( b y ij − b y ij ′ ) / b y ij . In order to determine the predictionvariance of b f t ( y ij ), we use the multivariate delta method. Based on a ﬁrst-order Taylor series expansion, we have Var[ b f t ( y ij )] = Var[( b y ij − b y ij ′ ) / b y ij ] =Var[ b y ij ′ / b y ij ] ≈ U T V U , where U T = ( 1 / b y ij − b y ij ′ / b y ij ) is the score vectorand V is the variance–covariance matrix of ( b y ij b y ij ′ ) T . The matrix V iscalculated using the same formula as for Var( b y ij ) above, where the vectors x ij and z ij are replaced by matrices with rows corresponding to the timepoints j and j ′ . Further exploration of the utility of ﬁtting a LMM andidentifying an associated prediction rule for the percentage change in CD4count, or a rule that evaluates simultaneously the absolute level and thepercentage change within the PBC framework, is ongoing research.

5. Discussion.

This manuscript presents an analytic approach, which weterm PBC, for predicting a quantitative biomarker trajectory over time thatcombines the generalized linear mixed eﬀects model with an ROC curvetype approach. Two approaches to approximating the prediction rule ofequation (2.5) are considered. In the ﬁrst case, we dichotomize the data apriori and model the resulting binary outcomes over time; a generalized lin-ear mixed eﬀects modeling approach is applied for direct estimation of θ ij .Since we ultimately aim to arrive at a binary prediction rule, this approachis intuitively appealing and consistent with applications of the logistic modelfor prediction. In the second case, we model the data using a linear mixedeﬀects model, a standard approach to the analysis of unevenly spaced, re-peated measures data with a continuous response and multiple predictorvariables. The results of this model ﬁtting procedure are used in turn toinform predictions, in this case using a rule that involves the lower boundof the corresponding prediction interval. This second approach also oﬀersintuitive appeal since it allows for use of all of the observed data to informthe model ﬁt. A similar approach as the one described herein can be appliedfor modeling pathogenesis, though the additional population level variabil-ity in CD4 counts in the absence of therapy may lead to lower predictiveperformance.PBC diﬀers in two regards from methods currently employed in this set-ting. First, we apply ﬁrst-stage modeling that can incorporate the full range BC FOR LONGITUDINAL BIOMARKERS of multiple continuous and categorical predictors, as well as quantitativedata on our outcome (CD4 count) to inform our analysis. Estimated meanand variance components from this model ﬁtting procedure are subsequentlyused to deﬁne a rule for predicting whether a function of the observed CD4count (within and across time points) is above or below a clinically mean-ingful threshold. Multiple patient level characteristics can be incorporated,including observed baseline CD4 count and time-varying values of the poten-tially predictive markers as described in Section 3. The proposed approachis diﬀerent from previously described approaches for this setting since mod-eling is performed using all of the available data and a prediction rule isassociated with the resulting model. One potential advantage is that we areable to draw on the full range of both the predictor and outcome data to in-form our investigation while still providing a binary decision rule for clinicaldecision making based on resulting probability estimates.A second diﬀerence is that PBC provides a framework for modeling CD4count trajectories over time that is not limited to characterizing changes be-tween two time points. Speciﬁcally, we consider models with a single knot atone month after initiation of ART to account for the rapid increase in CD4count that is typically observed and the subsequently slower rise over time[Laird and Ware (1982); Fitzmaurice, Laird and Ware (2004)]. The GLMMis applied with individual level random intercept and slope terms in orderto account for the within person correlation inherent in repeated measuresdata. The use of a mixed eﬀects model for longitudinal CD4 data has beendescribed for monitoring response to therapy [Mahajan et al. (2004)]; how-ever, the aim of that investigation diﬀered in that the investigators appliedthe mixed model to uncover the within and between person variability inTLC for ﬁxed changes in CD4 count. In our setting, the mixed model is usedas a tool within a predictive algorithm that allows for prediction across atemporal trajectory.Several manuscripts also report receiver operating characteristic (ROC)curve analyses using information on TLC as well as other markers, suchas hemoglobin to predict CD4 count. To our knowledge, all such investiga-tions involve a ﬁrst-stage dichotomization of the proposed markers as wellas the outcome CD4 count. For example, Spacek et al. describe an approachinvolving cutoﬀ points for TLC ( < / mm and > / mm )and/or hemoglobin ( >

12 g / dl) [Spacek et al. (2003)], while others proposedichotomizing TLC based on whether the change over a speciﬁed time pe-riod is greater than 0 [Badri and Wood (2003); Mahajan et al. (2004)]. CD4count is also dichotomized ( <

200 cells / mm ) for each observation basedon the absolute value at a given time point or the change over a speciﬁedperiod. These investigations generally include reporting of sensitivity, speci-ﬁcity, positive predictive value (PPV) and negative predictive value (NPV), A. S. FOULKES ET AL. where sensitivity and speciﬁcity are deﬁned in the usual manner as the pro-portions respectively of those predicted positive among those truly positiveand those predicted negative among those truly negative. Through consid-eration of multiple cutoﬀ points for both predictor and outcomes, ROCcurves are generated that illustrate the trade-oﬀ between sensitivity andspeciﬁcity.Logistic regression models have also been described as a useful tool inthis setting [Bagchi et al. (2007); Spacek et al. (2003)]. These methods drawstrength on the continuous nature of the potentially predictive markers, suchas TLC, while using a dichotomized version of CD4 count. Logistic modelshave the advantage of oﬀering a framework for incorporating multiple con-tinuous or categorical predictor variables and accounting for the confound-ing and/or eﬀect modifying role of patient speciﬁc demographic and clinicalfactors. Adjusted odds ratios are reported from these model ﬁts. While thisapproach uses more information on the available data, it involves ﬁrst di-chotomizing CD4 counts and does not include reporting of sensitivity andspeciﬁcity, two clinically appealing and relevant concepts.An extensive literature also exists on methodologies for ROC curves assummarized in Zhou, Obuchowski and McClish (2002) and Pepe (2000b).Within this body of research, methods for incorporating ordinal and contin-uous predictors have been described [Pepe (1998, 2005); Tosteson and Begg(1988)] as well as approaches to handling repeated marker data [Emir et al.(1998)]. To our knowledge, however, these methods are developed primar-ily for a dichotomous outcome such as “diseased” or “not diseased.” In oursetting, both the predictor variables and outcome of interest are continu-ous biomarkers, which serve as a primary motivation for the linear mixedeﬀects modeling approach we describe. Speciﬁcally, we aim to incorporateand draw strength from the complete observed response data (rather thana dichotomized version) to arrive at a prediction rule.Similar to our approach, methods for time-dependent ROC curves, asdescribed in Heagerty, Lumley and Pepe (2000), aim to characterize a time-varying clinical measure of disease progression within a prediction frame-work. Heagerty, Lumley and Pepe (2000) provide an eloquent approach forthe setting of a survival outcome, in which the binary indicator for diseasestatus is potentially censored and can vary over time, and which involvesdirect modeling of the sensitivity and speciﬁcity. In our setting, the outcomeof interest is a continuous biomarker and, thus, direct modeling of the sen-sitivity and speciﬁcity in this fashion is not tenable. Instead, we considertwo approaches, one that involves direct modeling of the probability thatthe outcome is above a threshold and the second that approximates theprediction rule through use of a corresponding prediction interval. Furtherextensions involving modeling of time to CD4 count below a meaningfulthreshold would be interesting.

BC FOR LONGITUDINAL BIOMARKERS Methods involving generalized linear models and mixed eﬀects modelshave been described for estimating ROC curves [Albert (2007); Gatsonis(1995); Pepe (2000a)]. As noted by Dodd and Pepe (2003), PBC in its orig-inal formulation is an approach to estimation of the area under the ROCcurve given by the probability that the response in the group is greater thanthe response is another group. The setting described herein diﬀers, however,since here estimation is described for the probability that an observation isgreater than a given threshold and not for the comparison of two groups. AROC curve is then generated based on a prediction rule that incorporatesthis estimated probability. Finally, we note that our algorithm involves gen-erating a single ROC curve based on a set of predictors determined in amodel ﬁtting framework. This distinguishes our strategy from approachesthat aim to identify the most predictive set of markers by evaluating theareas under the curve across several sets of predictors, such as Bisson et al.(2008).PBC may be a clinically useful tool for predicting whether an individual’sCD4 count will be greater than a given threshold based on less-expensivelaboratory measures, including WBC and lymphocyte percent. For the dataexample presented, using the continuous range of the CD4 data and appli-cation of the linear mixed eﬀects model appears to oﬀer better predictiveperformance than a ﬁrst stage dichotomization and application of the gener-alized linear mixed model. This is evidenced in both the resubstitution andtest sample estimates of predictive performance. For example, for a CD4threshold of 350 and a test sample FP rate of 4%, the GLMM approachresults in test sample Sensitivity = 0 .

50, PPV = 0 .

78 and NPV = 0 .

88. TheLMM approach, on the other hand, yields test sample Sensitivity = 0 . .

82 and NPV = 0 .

91 for the same cutoﬀ and test sample FP rate.While we have not demonstrated a statistically signiﬁcant diﬀerence betweenthe two approaches, a clear trend is observed across all rules for both thetest and learning sample data.The primary advantages of this strategy over the tools described in Sec-tion 1 for this data setting are as follows: (1) it allows us to draw strengthfrom the full range of continuous outcome data (through linear modeling)while providing us with clinically relevant measures, such as positive pre-dictive value (through subsequent classiﬁcation based on probability thresh-olds) and (2) it allows for simultaneous consideration of unevenly spacedbiomarker measurements over time. In the example described for predict-ing absolute CD4 count based on a 200-level threshold, a positive predic-tive value of 0 .

98 is observed with a false positive rate of 0 .

05, suggestingthis approach may be useful in developing alternative clinical managementstrategies. The relatively low NPV of 0 .

36 suggests that the approach de-scribed herein may serve best as a prioritization tool that allows for the re-duction in higher-end capacity testing, while not replacing the use of thesetests. A. S. FOULKES ET AL.

The clinical utility of this tool, however, will require further considerationof additional clinical and environmental factors as well as an in-depth anal-ysis of a diverse array of cohorts. For example, the application presentedin Section 3 is based on data from the London cohort in which a medianbaseline CD4 count of 219 . α in equation (2.5) corresponding to the best predic-tion rule will likely change. In the extreme case that the prediction variancetends to 0, we have that l ij,α of equation (2.7) approaches b y ij regardless of α . In this case, since the observed and predicted values would be very close,all prediction rules would perform equally well with sensitivity and speci-ﬁcity close to unity. In addition, alternative more sophisticated models may BC FOR LONGITUDINAL BIOMARKERS oﬀer improved accuracy. For example, Chu et al. (2005) describe a Bayesianrandom change point model for predicting CD4 trajectories that includesboth population and individual level change points. Incorporating this mod-eling approach into the PBC framework introduces the additional analyticchallenge of predicting individual-level change points for new patients andis a direction of potential future development.In summary, through combining modeling and an ROC curve approach,PBC provides a ﬂexible statistical framework for appropriately modelingcontinuous biomarker data using all available data on the biomarker as wellas additional, potentially relevant continuous or categorical predictors. Atthe same time, it oﬀers interpretable measures of diagnostic accuracy basedon clinically determined thresholds. Notably, improved prediction of CD4count based on less-expensive and more widely available laboratory mea-sures, such as lymphocyte percentage and white blood cell count, may havebroad public health implications. A sound diagnostic tool could provide formore targeted CD4 testing strategies, oﬀering a much needed instrument ina resource limited setting where HIV/AIDS presents the greatest burden.REFERENCES Albert, P. (2007). Random eﬀects modeling approaches to estimating ROC curves fromrepeated ordinal tests without a gold standard.

Biometrics Badri, M. and

Wood, R. (2003). Usefulness of total lymphocyte count in monitoringhighly active antiretroviral therapy in resource-limited settings.

AIDS Bagchi, S. , Kempf, M. , Westfall, A. , Maherya, A. , Willig, J. and

Saag, M. (2007).Can routine clinical markers be used longitudinally to monitor antiretroviral therapysuccess in resource-limited settings?

CID Bedell, R. , Keath, K. , Hogg, R. , Wood, E. , Press, N. , Yip, B. , O’Shaughnessy, M. and

Montaner, J. (2003). Total lymphocyte count ass a possible surrogate of CD4 cellcount to prioritize eligibility for antiretroviral therapy among HIV-infected individualsin resource-limited settings.

Antivir. Ther. Bisson, G. , Gross, R. , Strom, J. , Rollins, C. , Bellamy, S. , Weinstein, R. , Fried-man, H. , Dickinson, D. , Frank, I. , Strom, B. , Gaolathe, T. and

Ndwapi, N. (2006). Diagnostic accuracy of CD4 cell count increase for virologic response after ini-tiating highly active antiretroviral therapy.

AIDS Bisson, G. , Gross, R. , Bellamy, S. , Chittams, J. , Hislop, M. , Regensberg, L. , Frank, I. , Maartens, G. and

Nachega, J. (2008). Pharmacy reﬁll adherence com-pared with CD4 count changes for monitoring HIV-infected adults on antiretroviraltherapy.

PLoS Medicine e109. Chu, H. , Gange, S. , Yamashita, T. , Hoover, D. , Chmiel, J. , Margolick, J. and

Jacobson, L. (2005). Individual variation in CD4 cell count trajectory among humanimmunodeciency virus-infected men and women on long-term highly active antiretro-viral therapy: An application using a Bayesian random change-point model.

AmericanJournal of Epidemiology

Dodd, L. and

Pepe, M. (2003). Semiparametric regression for the area under the receiveroperating characteristic curve.

J. Amer. Statist. Assoc. Emir, B. , Wieand, S. , Su, J. and

Cha, S. (1998). Analysis of repeated markers used topredict progression of cancer.

Stat. Med. A. S. FOULKES ET AL.

Ferris, D. , Dawood, H. , Magula, N. and

Lalloo, U. (2004). Application of an algo-rithm to predict CD4 lymphocyte count below 200 cells / mm in HIV-infected patientsin South Africa. AIDS Fitzmaurice, G. , Laird, N. and

Ware, J. (2004).

Applied Longitudinal Analysis . Wiley,Hoboken, NJ. MR2063401

Foulkes, A. and

DeGruttola, V. (2002). Characterizing the relationship between HIV-1 genotype and phenotype: Prediction based classiﬁcation.

Biometrics Foulkes, A. and

DeGruttola, V. (2003). Characterizing classes of antiretroviral drugsby genotype.

Stat. Med. Gatsonis, C. (1995). Random eﬀects models for diagnostic accuracy.

Academic Radiology Heagerty, P. J. , Lumley, T. and

Pepe, M. S. (2000). Time-dependent ROC curves forcensored survival data and a diagnostic marker.

Biometrics Kamya, M. , Semitala, F. , Quinn, T. , Ronald, A. , Njama-Meta, D. , Mayania-Kizza,H. , Katabira, E. and

Spacek, L. (2004). Total lymphocyte count of 1200 is nota sensitive predictor of CD4 lymphocyte count among patients with HIV disease inKampala, Uganda.

Afr. Health Sci. Kumarasamy, N. , Mahajan, A. P. , Flanigan, T. , Hemalatha, R. , Mayer, K. , Carpenter, C. , Thyagarajan, S. and

Solomon, S. (2002). Total lymphocyte count(TLC) is a useful tool for the timing of opportunistic infection prophylaxis in India andother resource-constrained countries.

JAIDS Laird, N. M. and

Ware, J. (1982). Random-eﬀects models for longitudinal data.

Bio-metrics Mahajan, A. , Hogan, J. , Snyder, B. , Kumarasamy, N. , Mehta, K. , Solomon, S. , Carpenter, C. , Mayer, K. and

Flanigan, T. (2004). Changes in total lymphocytecount as a surrogate for changes in CD4 count following initiation of HAART: Impli-cations for monitoring in resource-limited settings.

Clinical Science McClean, R. , Sanders, W. and

Stroup, W. (1991). A uniﬁed approach to mixed linearmodels.

Amer. Statist. McCulloch, C. E. and

Searle, S. R. (2001).

Generalized, Linear, and Mixed Models .Wiley, New York. MR1884506

Pepe, M. (1998). Three approaches to regression analysis of receiver operating character-istic curves for continuous test results.

Biometrics Pepe, M. (2000a). An interpretation for the ROC curve and inference using GLM proce-dures.

Biometrics Pepe, M. (2000b). Receiver operating characteristic methodology.

J. Amer. Statist. Assoc. Pepe, M. (2005). Evaluating technologies for classiﬁcation and prediction in medicine.

Stat. Med. Smith, C. , Sabin, C. , Lampe, F. , Kinloch-de Loes, S. , Gumley, H. , Carroll, A. , Prinz, B. , Youle, M. , Johnson, M. and

Phillips, A. (2003). The potential forCD4 cell increases in HIV-positive individuals who control viraemia with highly activeantiretroviral therapy.

AIDS Smith, C. , Sabin, C. , Youle, M. , Kinloch-de Loes, S. , Lampe, F. , Madge, S. , Crop-ley, I. , Johnson, M. and

Phillips, A. (2004). Factors inﬂuencing increases in CD4 cellcounts of HIV-positive persons receiving long-term highly active antiretroviral therapy.

J. Infect. Dis. Spacek, L. , Griswold, M. , Quinn, T. and

Moore, R. (2003). Total lymphocyte countand hemoglobin combined in an algorithm to initiate the use of highly active antiretro-viral therapy in resource-limited settings.

AIDS Tosteson, A. and

Begg, C. (1988). A general regression methodology for ROC curveestimation.

Medical Decision Making Tosteson, A. , Weinstein, M. , Wittenberg, J. and

Begg, C. (1994). A general re-gression methodology for ROC curve estimation.

Environmental Health Perspectives

WHO-Report (2006). Antiretroviral therapy for HIV infection in adults andadolescents in resource-limited settings: Toward universal access. Available at . Zhou, X. , Obuchowski, N. and

McClish, D. (2002).

Statistical Methods in DiagnosticMedicine . Wiley, New York. MR1915698

A. S. FoulkesDivision of BiostatisticsUniversity of Massachusetts404 Arnold House715 N. Pleasant StreetAmherst, Massachusetts 01003USAE-mail: [email protected]@schoolph.umass.edu