[PDF] merlin: An R package for Mixed Effects Regression for Linear, Nonlinear and User-defined models

Abstract

The R package merlin performs flexible joint modelling of hierarchical multi-outcome data. Increasingly, multiple longitudinal biomarker measurements, possibly censored time-to-event outcomes and baseline characteristics are available. However, there is limited software that allows all of this information to be incorporated into one model. In this paper, we present merlin which allows for the estimation of models with unlimited numbers of continuous, binary, count and time-to-event outcomes, with unlimited levels of nested random effects. A wide variety of link functions, including the expected value, the gradient and shared random effects, are available in order to link the different outcomes in a biologically plausible way. The accompanying predict.merlin function allows for individual and population level predictions to be made from even the most complex models. There is the option to specify user-defined families, making merlin ideal for methodological research. The flexibility of merlin is illustrated using an example in patients followed up after heart valve replacement, beginning with a linear model, and finishing with a joint multiple longitudinal and competing risks survival model.

Full PDF

JJSS

Journal of Statistical Software

MMMMMM YYYY, Volume VV, Issue II. doi: 10.18637/jss.v000.i00 merlin : An R package for Mixed Eﬀects Regressionfor Linear, Nonlinear and User-deﬁned models

Emma C. Martin

University of Leicester

Alessandro Gasparini

Karolinska Institutet

Michael J. Crowther

University of Leicester &Karolinska Institutet

Abstract

The R package merlin performs ﬂexible joint modelling of hierarchical multi-outcomedata. Increasingly, multiple longitudinal biomarker measurements, possibly censoredtime-to-event outcomes and baseline characteristics are available. However, there is lim-ited software that allows all of this information to be incorporated into one model. Inthis paper, we present merlin which allows for the estimation of models with unlimitednumbers of continuous, binary, count and time-to-event outcomes, with unlimited lev-els of nested random eﬀects. A wide variety of link functions, including the expectedvalue, the gradient and shared random eﬀects, are available in order to link the diﬀerentoutcomes in a biologically plausible way. The accompanying predict.merlin functionallows for individual and population level predictions to be made from even the mostcomplex models. There is the option to specify user-deﬁned families, making merlin idealfor methodological research. The ﬂexibility of merlin is illustrated using an example inpatients followed up after heart valve replacement, beginning with a linear model, andﬁnishing with a joint multiple longitudinal and competing risks survival model. Keywords : joint modelling, multi-outcome, mixed eﬀects, survival, R , merlin .

1. Introduction

Software packages to ﬁt joint and multi-state models are continuously being developed andupdated to increase ﬂexibility. However this ﬂexibility is often limited in terms of outcometypes, levels of nested random-eﬀects, or the forms of linking functions between outcomes.We have developed merlin in order to address this lack of ﬂexibility, allowing for a wide rangeof models to be estimated. With merlin it is possible to include any number of outcomesfrom a wide range of families, including Gaussian, Bernoulli, Poisson, a number of survivalmodels including ﬂexible parametric models, amongst others. It is also possible to custom a r X i v : . [ s t a t . C O ] J u l merlin: Mixed Eﬀects Regression supply user deﬁned families to allow for even greater ﬂexibility and method development. Thisallows merlin to ﬁt everything from a simple Weibull model to a multivariate joint model.Joint models can be deﬁned using commonly chosen association structures (Gould, Boye,Crowther, Ibrahim, Quartey, Micallef, and Bois 2015), for example, shared random eﬀects,the current value, gradient or area under the curve, and to provide even more customisation- user deﬁned link functions. This R package is based on the recently released merlin packagein Stata (Crowther 2017, Crowther (2018)).Previous software released in R has some of the individual capabilities of merlin . Package JM (Rizopoulos 2010) ﬁts a single normal longitudinal response jointly with a single sur-vival outcome or competing risk outcomes, assuming a current value or current gradient link.There is also an extension JMbayes (Rizopoulos 2016) which ﬁts similar models in a Bayesianframework. joineR (Philipson, Sousa, Diggle, Williamson, Kolamunnage-Dona, Henderson,and Hickey 2018) allows for the joint modelling of a single longitudinal response and a sin-gle time-to-event outcome or competing risk outcome. The extension joineRML (Hickey,Philipson, Jorgensen, and Kolamunnage-Dona 2018) additionally allows for multivariate lon-gitudinal data. The frailtypack (Rondeau, Mazroui, and Gonzalez 2012) package ﬁts shared,joint and nested frailty models, with one longitudinal response and multiple recurrent andterminal events.New package merlin oﬀers additional ﬂexibility in how the joint model is speciﬁed. Multiplelongitudinal responses can be speciﬁed and there is a wider range of models available to betterdescribe the data, including splines and fractional polynomials. There is also a wider varietyof survival models available compared to joineR which only allows Cox models, and JM and frailtypack which allow Cox, Weibull and limited spline based survival models. In additionto these models, merlin allows for exponential survival models, and a wider range of ﬂexiblespline based models such as Royston-Parmar models (Royston 2001). While it is possible toﬁt models with multiple shared random-eﬀects, there are additional link functions availableto describe the relationship between the longitudinal and time-to-event outcomes, includingcurrent expected value, or other functions of the longitudinal response, including derivativesand integrals. A number of packages exist which allow for multiple hierarchical levels ofrandom-eﬀects for either longitudinal responses ( lme4 (Bates, Mächler, Bolker, and Walker2015) or nlme (Pinheiro, Bates, DebRoy, Sarkar, and R Core Team 2019)) or time-to-eventoutcomes ( coxme (Therneau 2019)). Each of the joint modelling packages described aboveonly allow for one level of clustering, with the exception of frailtypack which allows for two,whereas merlin can incorporate any number of nested levels, which is particularly useful forbig data such as electronic health records, which is often hierarchical.Further ﬂexibility is provided in merlin with the option of user-deﬁned functions. This allowsusers to deﬁne their own likelihood functions, merlin is then used as a wrapper function tocarry out the optimisation, similar to BAMLSS (Umlauf, Klein, and Zeileis 2017) which usesa modular “Lego brick” approach in a Bayesian framework. Allowing users to extend merlin via user-deﬁned functions makes it a useful tool for the development of new methods.In this paper we introduce the modular syntax employed by merlin which enables its ﬂex-ibility. In order to illustrate this ﬂexibility we will develop an example model using datafrom an observational study of patients following aortic valve replacement surgery (Lim, Ali,Theodorou, Sousa, Ashraﬁan, Chamageorgakis, Duncan, Henein, Diggle, and Pepper 2008).In Section 2 we explain the syntax to specify the model structure and use the predict func-tion. In Section 3 we work through an illustrative example in patients following heart valve ournal of Statistical Software merlin and plans forfuture extensions.

2. Specifying model structure

The syntax for merlin is modular in nature. The family is speciﬁed for each outcome, thelinear predictor for each outcome can then be built from components such as an intercept,covariates and random eﬀects. merlin(model = list(model1, model2, ...),family = c("family1", "family2", ...),levels = "level1",data = data))

Where the syntax for each model is model1 <- depvar ~ component1 + component2 + ..., model_options

Each component can be made up of a number of elements such as covariates, random eﬀects,functions of time and expected values of other outcomes. Interactions between elements canbe speciﬁed using : between diﬀerent elements. By default a coeﬃcient will be estimated foreach component, the coeﬃcient can be constrained to 1 using *1 . component1 <- element1 [:element2] [:element3] [...] [*1] A number of model families are currently available, including• gaussian - Gaussian distribution• bernoulli - Bernoulli distribution• poisson - Poisson distribution• beta - beta distribution• negbinomial - Negative binomial distributionAs well as a number of survival models• exponential - exponential survival distribution• weibull - Weibull distribution• gompertz - Gompertz distribution• rp - Royston-Parmar survival model, (complex predictor on the log cumulative hazardscale)• loghazard - general log hazard model (complex predictor on the log hazard scale)With two user-deﬁned options• user - which ﬁts a user-deﬁned model which can be written using merlin ’s utility func-tions. The name of the user-deﬁned function needs to be passed through using userf option merlin: Mixed Eﬀects Regression • null - which is a convenience tool for deﬁning additional complex predictors, that donot contribute to the log likelihood Each element can take a number of diﬀerent forms• varname - the simplest form is a varname, which refers to a variable in the data setprovided.• rcs - a restricted cubic spline function, – knots() - allows the user to specify the location of the knots in the form of avector. – df() - alternatively the number of degrees of freedom can be speciﬁed, in whichcase the boundary knots are assumed to be at the minimum and maximum of varname with the internal knots placed at evenly spaced centiles. – orthog - this option uses Gram-Schmidt orthogonalisation of the splines, specifyingthis can improve model convergence.• time functions - such as powers of time and log time. In order to use time functions timevar must be speciﬁed as extra numerical integration may be required.• M - a random-eﬀect at the cluster level, all random-eﬀects must benamed M followed by a number to enable the sharing of random eﬀects between models• fp() - speciﬁes a fractional polynomial function, with order 1 or 2. – powers() - the powers of the the fractional polynomial function must be speciﬁed(up to second degree).• bhazard(varname) - invokes a relative survival (excess hazard) model. varname speciﬁesthe expected hazard rate at the event time.• exposure(varname) - include log( varname ) in the linear predictor, with a coeﬃcient of1. For use with family = "poisson" .Functions of longitudinal submodels can be included as covariates in other submodels usingthe following options• EV[depvar] - the expected value of the response of a submodel• dEV[depvar] - the ﬁrst derivative with respect to time of the expected value of theresponse of a submodel• d2EV[depvar] - the second derivative with respect to time of the expected value of theresponse of a submodel• iEV[depvar] - the integral with respect to time of the expected value of the responseof a submodel•

XB[depvar] - the expected value of the complex predictor of a submodel• dXB[depvar] - the ﬁrst derivative with respect to time of the expected value of thecomplex predictor of a submodel• d2XB[depvar] - the second derivative with respect to time of the expected value of thecomplex predictor of a submodel ournal of Statistical Software iXB[depvar] - the integral with respect to time of the expected value of the complexpredictor of a submodel There are a number of methods available for numerically integrating out the random-eﬀectsin order to calculate the likelihood for a mixed-eﬀects model. The options for intmethod are:• ghermite - for non-adaptive Gauss-Hermite quadrature;• halton - for Monte Carlo integration using Halton sequences;• sobol - for Monte Carlo integration using Sobol sequences;• mc - for standard Monte Carlo integration using normal draws.The default is ghermite . Level-speciﬁc integration techniques can be speciﬁed. Gauss-Hermite quadrature is widely considered the optimal numerical integration technique, howeverit doesn’t scale well for large numbers of random-eﬀects. Therefore in a three level modelexample, we may use Gauss-Hermite quadrature at the highest level and the more eﬃcientMonte-Carlo integration with Halton sequences at level 2, using intmethod = c("ghermite","halton") . A range of post estimation tools are available with merlin using the prediction function usingthe following syntax. predict(modelname, statistic, type, options)

The currently available statistics options are• eta - the expected value of the complex predictor• mu - the expected value of the response variable• hazard - the hazard function• chazard - the cumulative hazard function• logchazard - the log cumulative hazard function• survival - the survival function• cif - the cumulative incidence function• rmst - calculates the restricted mean survival time, which is the integral of the survivalfunction within the interval (0,t], where t is the time at which predictions are made. Ifmultiple survival models have been speciﬁed in your merlin model, then it will assume allof them are cause-speciﬁc competing risks models, and include them in the calculation.If this is not the case, you can override which models are included by using the causes option. rmst = t - totaltimelost .• cifdifference calculates the diﬀerence in cif predictions between values of a covariatespeciﬁed using the contrast option.• hdifference calculates the diﬀerence in hazard predictions between values of a covari-ate speciﬁed using the contrast option.• rmstdifference calculates the diﬀerence in rmst predictions between values of a co-variate speciﬁed using the contrast option. merlin: Mixed Eﬀects Regression • mudifference calculates the diﬀerence in mu predictions between values of a covariatespeciﬁed using the contrast option.• etadifference calculates the diﬀerence in eta predictions between values of a covariatespeciﬁed using the contrast option.• timelost - calculates the time lost due to a particular event occurring, within theinterval (0,t]. In a single event survival model, this is the integral of the cif between(0,t]. If multiple survival models are speciﬁed in the merlin model then by default allare assumed to be cause-speciﬁc event time models contributing to the calculation. Thiscan be overridden using the causes option.• totaltimelost - total time lost due to all competing events, within (0,t]. If multiplesurvival models are speciﬁed in the merlin model then by default all are assumed to because-speciﬁc event time models contributing to the calculation. This can be overriddenusing the causes option. totaltimelost is the sum of the timelost due to all causes.Prediction options include• type - speciﬁes whether the predictions include ﬁxed-eﬀects only ( fixedonly ), or themarginal prediction is calculated marginally with respect to the latent variables. The stat is calculated by integrating the prediction function with respect to all the latentvariables over their entire support.• predmodel - speciﬁes which model to predict from, default predmodel=1 .• causes - for use when calculating predictions from a competing risks model. By default, cif , rmst , timelost and totaltimelost assume that all survival models included inthe merlin model are cause-speciﬁc hazard models contributing to the calculation. Ifthis is not the case, then you can specify which models (indexed using the order theyappear in your merlin model by using the causes option, e.g. causes=c(1, 2) ).• at - speciﬁes covariate values for prediction. Fixed values of covariates should be spec-iﬁed in a list e.g. at = c("trt" = 1, "age" = 50) .• contrast - speciﬁes the values of a covariate to be used when comparing statistics, suchas when using the cifdifference option to compare cumulative incidence functions,e.g. contrast = c("trt" = 0, "trt" = 1) .

3. Examples

A consequence of the ﬂexibility of merlin is the syntax is arguably complex to allow for thegeneralisation. In order to illustrate the potential uses of merlin we ﬁt a number of increasinglyadvanced models to data from an observational study which investigated the eﬀects of aorticvalve replacement with a stentless or a homograft valve (Lim et al. R package joineRML (Hickey et al. R> data(heart.valve, package = "joineRML") ournal of Statistical Software

R> heart.valve <- heart.valve[!is.na(heart.valve$grad), ]

The data should be set out in wide format for submodel, with each outcome speciﬁed in aseparate column. For survival data event time and status will appear in diﬀerent columns,and should only be speciﬁed once per individual. However within a submodel long format isused, with each repeated measurement of a biomarker on a new row with a separate columnspecifying the timing of that observation. The current set up of the heart.valve data (shownbelow) needs some editing to allow modelling ﬁtting with merlin , however once the data hasbeen put into the correct format, all models can be ﬁtted without any further editing beingrequired.

R> print(R+ heart.valve[heart.valve$num %in% c(1:2, 13),R+ c(1:3, 25, 5:6, 4, 8, 10)],R+ row.names = FALSER+ ) num sex age hs fuyrs status time log.grad log.lvmi1 0 75.06027 Stentless valve 4.956164 0 0.0109589 2.302585 4.7789551 0 75.06027 Stentless valve 4.956164 0 3.6794520 2.302585 4.7789551 0 75.06027 Stentless valve 4.956164 0 4.6958900 2.302585 4.9245692 0 45.79452 Homograft 9.663014 0 6.3643840 2.639057 4.7443232 0 45.79452 Homograft 9.663014 0 7.3041100 2.197225 4.6986612 0 45.79452 Homograft 9.663014 0 8.3013700 2.484907 5.05879013 1 69.94247 Homograft 5.186301 1 0.1369863 2.708050 5.30554113 1 69.94247 Homograft 5.186301 1 1.0575340 2.833213 5.28335613 1 69.94247 Homograft 5.186301 1 2.0547950 2.995732 4.79455013 1 69.94247 Homograft 5.186301 1 3.9726030 3.401197 4.993421

The event time ( fuyrs ) for each individual should only appear once in the data set, unlessthere are multiple events per individual, these should appear on separate lines. The status is the event indicator variable, coded 0 for censored (lost to follow-up) and 1 for died.

R> heart.valve$id <- heart.valve$numR> heart.valve$stime <- heart.valve$fuyrsR> heart.valve$stime[duplicated(heart.valve$id)] <- NAR> heart.valve$died <- heart.valve$statusR> heart.valve$died[duplicated(heart.valve$id)] <- NA

Binary variables, such as type of heart valve used, need to be converted to be numeric.

R> heart.valve$type <- as.numeric(heart.valve$hs) - 1 merlin: Mixed Eﬀects Regression

A section of the correctly formatted data is shown below, Individual 1 has three longitu-dinal measurements for log valve gradient ( log.grad ) and log left ventricular mass index( log.lvmi ), the timings of these measurements are given in the time column. In this casethe diﬀerent biomarkers were measured at the same time points, but this is not necessary,missing biomarker measurements should be recorded as NA . The survival information hasbeen recorded on the ﬁrst line for each individual. Baseline covariates such as sex should bespeciﬁed on every line for that individual. All the models below are ﬁtted to this data set. R> print(R+ heart.valve[heart.valve$id %in% c(1:2, 13),R+ c(26, 2:3, 29, 27:28, 8, 10, 4)],R+ row.names = FALSER+ ) id sex age type stime died log.grad log.lvmi time1 0 75.06027 1 4.956164 0 2.302585 4.778955 0.01095891 0 75.06027 1 NA NA 2.302585 4.778955 3.67945201 0 75.06027 1 NA NA 2.302585 4.924569 4.69589002 0 45.79452 0 9.663014 0 2.639057 4.744323 6.36438402 0 45.79452 0 NA NA 2.197225 4.698661 7.30411002 0 45.79452 0 NA NA 2.484907 5.058790 8.301370013 1 69.94247 0 5.186301 1 2.708050 5.305541 0.136986313 1 69.94247 0 NA NA 2.833213 5.283356 1.057534013 1 69.94247 0 NA NA 2.995732 4.794550 2.054795013 1 69.94247 0 NA NA 3.401197 4.993421 3.9726030

To begin with we will ﬁt a simple linear regression of log of the valve gradient ( log.grad )against time , with age and sex as covariates.

R> library(merlin)R> m1 <- merlin(R+ model = log.grad ~ sex + age + time,R+ family = "gaussian",R+ data = heart.valveR+ )R> summary(m1)

Mixed effects regression modelLog likelihood = -651.4753Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]sex 0.140489 0.059778 2.350 0.0188 0.023327 0.257651age -0.002212 0.002297 -0.963 0.3356 -0.006714 0.002290time -0.013541 0.011958 -1.132 0.2574 -0.036978 0.009895 ournal of Statistical Software −3−2−1012 0.0 2.5 5.0 7.5 10.0 Time (years) R e s i dua l Figure 1: Absolute residuals plot against time for linear model m1 _cons 2.771597 0.160581 17.260 0.0000 2.456863 3.086330log_sd(resid.) -0.383205 0.028194 -13.592 0.0000 -0.438465 -0.327946

The constant term for log valve gradient is estimated to be 2.772 (95% CI 2.457, 3.086)and this is estimated to change by -0.014 (95% CI -0.037, 0.010) for every year after valvereplacement. The residual error is reported in the results table as the log of the standarddeviation, meaning the residual standard error in this model is 0.682. To assess model ﬁt wecan calculate the residuals using the predict function to get the ﬁtted values.

R> heart.valve$m1res <- heart.valve$log.grad - predict(m1, stat = "mu")R>R> library(ggplot2)R> ggplot(heart.valve, aes(x = time, y = m1res)) +R+ geom_point() +R+ geom_hline(yintercept = 0, color = "blue") +R+ xlab("Time (years)") +R+ ylab("Residual") +R+ theme_classic()

This shows that there seems to be some model misspeciﬁcation, as the values at the beginningand end are generally under predicted, while values between 1 and 4 years are over predicted.To address this we can add further ﬂexibility to the shape of the log valve gradient overtime, using restricted cubic splines, with number degrees of freedom speciﬁed as below. Theboundary knots will be assumed to be at the minimum and maximum of log.grad with the0 merlin: Mixed Eﬀects Regression internal knots at equally spaced centiles. Alternatively the locations of the knots can bespeciﬁed by using the knots() option. The spline terms have been orthogonalised, whichwill impact on the interpretation of the intercept term. While the spline terms themselveshave little meaningful interpretation they are reported to allow the model to be used to makeexternal predictions.

R> m2 <- merlin(R+ model = log.grad ~ sex + age + rcs(time, df = 3, orthog = TRUE),R+ timevar = "time",R+ family = "gaussian",R+ data = heart.valveR+ )R> summary(m2)

Mixed effects regression modelLog likelihood = -678.6148Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]sex -0.090970 0.057917 -1.571 0.11625 -0.204486 0.022545age 0.016740 0.002454 6.822 0.00000 0.011931 0.021550rcs():1 -0.067805 0.026277 -2.580 0.00987 -0.119306 -0.016304rcs():2 -0.215318 0.025901 -8.313 0.00000 -0.266084 -0.164553rcs():3 0.097600 0.025675 3.801 0.00014 0.047278 0.147922_cons 1.532048 0.159002 9.635 0.00000 1.220411 1.843685log_sd(resid.) -0.448594 0.030335 -14.788 0.00000 -0.508050 -0.389139

When we plot the residuals for model m2 we can see there is less of a pattern over time,suggesting this model is a better ﬁt to the data. R> heart.valve$m2res <- heart.valve$log.grad - predict(m2, stat = "mu")R> ggplot(heart.valve, aes(x = time, y = m2res)) +R+ geom_point() +R+ geom_hline(yintercept = 0, color = "blue") +R+ xlab("Time (years)") +R+ ylab("Residual") +R+ theme_classic()

We can further improve this model by accounting for the clustered nature of the log.grad measurements within patients ( id ). We can add a normally-distributed random interceptat the patient id level using the M syntax below. Each random eﬀect is given a name ofthis form to enable the sharing of random eﬀects between models, which will be illustratedlater. In the model below M1 speciﬁes a random intercept and M2 speciﬁes a random linearslope. By default the random-eﬀects at each level are not assumed to be correlated (option covariance(identity) ), however this can be relaxed and the correlation estimated by in-stead specifying covariance(unstructured) . For mixed-eﬀects models the levels must bespeciﬁed using the level option. There is no limit to the number of levels which can be ournal of Statistical Software −2−1012 0.0 2.5 5.0 7.5 10.0 Time (years) R e s i dua l Figure 2: Absolute residuals plot against time for restricted cubic spline model m2ﬁtted, but the levels must be be speciﬁed from highest to lowest, e.g. county > practice >patient. By default all components in the model will have an estimated coeﬃcient, howeverthe coeﬃcient can be constrained to 1 using *1 notation, which would normally be the casefor random eﬀects not shared between models. By default estimation of the likelihood isdone using Gauss-Hermite quadrature with 7 nodes, increasing this number using the ip op-tion will improve estimation of the likelihood, although this will increase computation timeconsiderably. R> m3 <- merlin(R+ model = log.grad ~ sex + age + rcs(time, df = 3, orthog = TRUE) +R+ M1[id] * 1 + time:M2[id] * 1,R+ timevar = "time",R+ level = "id",R+ covariance = "unstructured",R+ family = "gaussian",R+ data = heart.valveR+ )R> summary(m3)

Mixed effects regression modelLog likelihood = -612.5806Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]sex 0.1559667 0.0741357 2.104 0.0354 0.0106635 0.3012699 merlin: Mixed Eﬀects Regression age 0.0057777 0.0030691 1.883 0.0598 -0.0002377 0.0117931rcs():1 -0.0331820 0.0332446 -0.998 0.3182 -0.0983402 0.0319762rcs():2 -0.1828734 0.0262125 -6.977 0.0000 -0.2342490 -0.1314978rcs():3 0.1294719 0.0233294 5.550 0.0000 0.0837471 0.1751967_cons 2.2338384 0.2030426 11.002 0.0000 1.8358823 2.6317945log_sd(resid.) -0.7233108 0.0342331 -21.129 0.0000 -0.7904063 -0.6562152log_sd(M1) -1.0672644 0.1262396 -8.454 0.0000 -1.3146895 -0.8198394log_sd(M2) -0.7900452 0.1528337 -5.169 0.0000 -1.0895939 -0.4904966atanh_corr(M2,M1) -2.1512169 0.1571694 -13.687 0.0000 -2.4592632 -1.8431706Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 7 Adding the random-eﬀects terms at the id level greatly reduces the log-likelihood. Thestandard deviation for the random intercept is 0.344, and for the random slope is 0.454.The correlation between these random-eﬀects is reported as the inverse hyperbolic tangent,to get the estimate of the correlation the tanh() function can be used, tanh(-2.151) =-0.973 showing that these random-eﬀects are highly correlated. Random-eﬀects at multiplehierarchical levels can be included by changing the level variable in square brackets, and byspecifying the levels from highest to lowest in the level option.We can make predictions of the expected value of the response from mixed-eﬀects model m3 using the predict function with the mu option. Predictions will only be made for non-missing values of the response. These predictions are marginal, calculated by integrating outthe random-eﬀects, giving population averaged predictions. R> ldata <- heart.valve[!is.na(heart.valve$log.grad), ]R> ldata$pred1 <- predict(m3, stat = "mu", predmodel = 1, type = "marginal")R> print(R+ ldata[ldata$id %in% c(1:2, 13), c("id", "time", "log.grad", "pred1")],R+ row.names = FALSER+ ) id time log.grad pred11 0.0109589 2.302585 3.1223911 3.6794520 2.302585 2.5232781 4.6958900 2.302585 2.6458542 6.3643840 2.639057 2.6034682 7.3041100 2.197225 2.6427332 8.3013700 2.484907 2.66592113 0.1369863 2.708050 3.17750813 1.0575340 2.833213 2.72003713 2.0547950 2.995732 2.51496613 3.9726030 3.401197 2.689035 ournal of Statistical Software merlin also allows users the ﬂexibility to specifytheir own likelihood functions using the user family.To help users to deﬁne their own likelihood there are a number of inbuilt utility functions.• merlin_util_depvar(M) - returns the dependent variable for the current model. Fortime-to-event outcomes this will be a matrix with two columns, for event time andevent-indicator.• merlin_util_xzb(M, t) - returns the complex predictor for the current model, option-ally evaluated at time t .• merlin_util_xzb_deriv(M, t) - returns the derivative with respect to time of thecomplex linear predictor for the current model, optionally evaluated at time t .• merlin_util_xzb_deriv2(M, t) - returns the second derivative with respect to timeof the complex linear predictor for the current model, optionally evaluated at time t .• merlin_util_xzb_integ(M, t) - returns the integral with respect to time of the com-plex linear predictor for model M , optionally evaluated at time t .• merlin_util_expval(M, t) - returns the expected value of the response for the currentmodel, optionally evaluated at time t .• merlin_util_expval_deriv(M, t) - returns the derivative with respect to time of theexpected value of the response for the current model, optionally evaluated at time t .• merlin_util_expval_deriv2(M, t) - returns the second derivative with respect totime of the expected value of the response for the current model, optionally evaluatedat time t .• merlin_util_expval_integ(M, t) - returns the integral with respect to time of theexpected value of the response for the current model„ optionally evaluated at time t .• merlin_util_ap(M,i) - returns the i th ancillary parameter of the current model.• merlin_util_timevar(M) - returns the time variable for he current model, speciﬁed bythe timevar option.These utility functions take a list as input, which has been referred to as gml below. Thiscontains a merlin object, which should not then be edited by the user. The xzb or expval functions have a corresponding *_mod() function, which allows users to specify an additionalargument for which model to call, e.g. merlin_util_xzb_mod(M,2) will return the complexpredictor for the second model in the merlin statement, allowing submodels to be linked.The log-likelihood is speciﬁed as a function, giving the observation level log-likelihood contri-bution. As an example a simple linear model can be ﬁtted using the function below. R> logl_gaussian <- function(gml) {R+ y <- merlin_util_depvar(gml)R+ xzb <- merlin_util_xzb(gml)R+ se <- exp(merlin_util_ap(gml, 1))R+R+ mu <- (sweep(xzb, 1, y, "-"))^2R+ logl <- ((-0.5 * log(2 * pi) - log(se)) - (mu / (2 * se^2)))R+ return(logl)R+ } merlin: Mixed Eﬀects Regression To specify a user deﬁned function, the family is given as user , the userf option must thenbe given the function above.

R> m4 <- merlin(log.grad ~ sex + age + time + ap(1),R+ family = "user",R+ userf = "logl_gaussian",R+ data = heart.valveR+ )R> summary(m4)

The parameter estimates from this model are the same as model M1 above, where _ap1 isthe ancillary residual error parameter. These user deﬁned functions allows users to extend merlin , which is particularly useful for those doing methodological research. A number of standard time-to-event models are available in merlin such as Weibull, expo-nential and Gompertz models. Additionally a range of more ﬂexible models are also availableincluding Royston-Parmar models, and a model on the log hazard scale, for both a number offorms can be used for the baseline including restricted cubic splines, or fractional polynomials.

Weibull proportional hazards model

We will start by ﬁtting a simple Weibull proportional hazard model for time-to-death, ad-justing for age and type of aortic valve replacement. In order to ﬁt a survival model a

Surv object must be supplied with the time and event indicator variables.

R> m5 <- merlin(R+ model = Surv(stime, died) ~ age + type,R+ family = "weibull",R+ data = heart.valveR+ )R> summary(m5)

Mixed effects regression modelLog likelihood = -173.3746 ournal of Statistical Software Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]age 0.09731 0.01903 5.115 0.0000 0.06002 0.13460type 0.03834 0.34428 0.111 0.9113 -0.63644 0.71312_cons -11.68669 1.49113 -7.837 0.0000 -14.60925 -8.76413log(gamma) 0.64107 0.12323 5.202 0.0000 0.39954 0.88261

The results table gives the coeﬃcient for the factors in the model, which is the log of thehazard ratio. Therefore the hazard ratio for type of aortic valve replacement is exp(0.038)= predict function, with the survival option.Predictions will only be made for non-missing values of the response.

R> sdata <- heart.valve[!is.na(heart.valve$stime), ]R> sdata$pred2 <- predict(m5, stat = "survival", predmodel = 1, type = "fixedonly")R> print(R+ sdata[R+ sdata$id %in% c(1:2, 13),R+ c("id", "stime", "died", "age", "type", "pred2")R+ ],R+ row.names = FALSER+ ) id stime died age type pred21 4.956164 0 75.06027 1 0.76250972 9.663014 0 45.79452 0 0.947690313 5.186301 1 69.94247 0 0.8412621

The predictions give the survival probability for each individual at their event time, dependingon their age and type of valve replacement. We can use the at option to compare the survivalfunctions for diﬀerent levels of a covariate, while holding other covariates constant. As typeof valve replacement has a smaller eﬀect, we will instead look at the diﬀerences in survivalprediction by age, assuming a stentless value replacement. R> p_50 <- predict(m5, stat = "survival", type = "fixedonly",R+ at = c(age = 50, type = 1))R> p_60 <- predict(m5, stat = "survival", type = "fixedonly",R+ at = c(age = 60, type = 1))R> p_70 <- predict(m5, stat = "survival", type = "fixedonly",R+ at = c(age = 70, type = 1))R> surv_pred <- data.frame(p_50, p_60, p_70,R+ stime = heart.valve[!is.na(heart.valve$stime), "stime"])R> surv_pred <- surv_pred[!duplicated(surv_pred$stime),]R> surv_pred <- melt(surv_pred, id.var = "stime")R> merlin: Mixed Eﬀects Regression Time (years) S u r v i v a l p r obab ili t y Age

50 years60 years70 years

Figure 3: Survival functions for patients following stentless value replacement by age (frommodel m5)

R> ggplot(surv_pred, aes(x = stime, y = value, linetype = variable)) +R+ geom_line(size = 0.6) +R+ coord_cartesian(ylim = c(0, 1)) +R+ xlab("Time (years)") +R+ ylab("Survival probability") +R+ theme_classic() +R+ theme(legend.position = c(0.2, 0.2)) +R+ scale_linetype_discrete(R+ name = "Age",R+ labels = c("50 years", "60 years", "70 years"))

This gives a visual representation of how the survival probability changes depending on ageat entry, for a particular treatment option.

Spline-based survival model

Further ﬂexibility can be included in survival models by modelling the hazard function usingsplines. Royston-Parmar models use restricted cubic splines to model the baseline log cumu-lative hazard function. They allow ﬂexibility in the shape of the baseline hazard and allow fortime-dependent eﬀects. The form of the baseline hazard is speciﬁed by adding the function tothe linear predictor of the survival model. Here we ﬁt a model using restricted cubic splineswith 3 degrees of freedom in the baseline hazard. Using the event = TRUE option means theknots locations for the the splines are based on the event times only, ignoring censored time ournal of Statistical Software

R> m6 <- merlin(R+ model = Surv(stime, died) ~ age + type +R+ rcs(stime, df = 3, log = TRUE, event = TRUE),R+ timevar = "stime",R+ family = "rp",R+ data = heart.valveR+ )R> summary(m6)

Mixed effects regression modelLog likelihood = -170.653Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]age 0.101765 0.019528 5.211 0.0000 0.063491 0.140040type 0.025291 0.343717 0.074 0.9413 -0.648381 0.698964rcs():1 1.107442 0.143163 7.736 0.0000 0.826849 1.388036rcs():2 -0.274877 0.089043 -3.087 0.0020 -0.449398 -0.100357rcs():3 -0.002397 0.076386 -0.031 0.9750 -0.152112 0.147317_cons -9.057037 1.401644 -6.462 0.0000 -11.804210 -6.309865

The estimated age and treatment eﬀects are very similar to the Weibull survival model m5 above. We can compare the shape in baseline hazards between the Weibul and Royston-Parmar using the hazard option in the predict function. R> base_m5 <- predict(m5, stat = "hazard", type ="fixedonly",R+ at = c(age = 0, type = 0))R> base_m6 <- predict(m6, stat = "hazard", type ="fixedonly",R+ at = c(age = 0, type = 0))R>R> base_pred <- data.frame(stime = heart.valve[!is.na(heart.valve$stime), "stime"],R+ base_m5,R+ base_m6)R> base_pred <- base_pred[!duplicated(base_pred$stime),]R> base_pred <- melt(base_pred, id.var = "stime")R> ggplot(base_pred, aes(x = stime, y = value, linetype = variable)) +R+ geom_line(size = 0.6) +R+ xlab("Time (years)") + ylab("Baseline hazard") +R+ theme_classic() + theme(legend.position = c(0.2, 0.8)) +R+ scale_linetype_discrete(name = "Model", labels = c("Weibull", "RP - 3df"))

Assess proportional-hazards

The survival models above assume proportional hazards. We can test this assumption in theeﬀect of type aortic valve replacement by including the interaction between type and log time.8 merlin: Mixed Eﬀects Regression

Time (years) B a s e li ne ha z a r d Model

WeibullRP − 3df

Figure 4: Baseline hazard functions with Weibull model (m5) and Royston-Parmar modelwith 3 degrees of freedom (m6)It is important to use the timevar option for this model, as time-dependent eﬀects need tobe diﬀerentiated with respect to time to calculate the hazard function.

R> m7 <- merlin(R+ model = Surv(stime, died) ~ age + type +R+ type:fp(stime, powers = c(0)) +R+ rcs(stime, df = 3, log = TRUE, event = TRUE),R+ timevar = "stime",R+ family = "rp",R+ data = heart.valveR+ )R> summary(m7)

Mixed effects regression modelLog likelihood = -170.3792Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]age 0.10223 0.01956 5.226 0.0000 0.06389 0.14057type -0.62548 0.93454 -0.669 0.5033 -2.45714 1.20618type:fp() 0.34704 0.46673 0.744 0.4571 -0.56773 1.26180rcs():1 0.96557 0.22712 4.251 0.0000 0.52044 1.41071rcs():2 -0.24625 0.09397 -2.620 0.0088 -0.43044 -0.06207rcs():3 -0.01036 0.07610 -0.136 0.8917 -0.15952 0.13880 ournal of Statistical Software _cons -9.01176 1.40479 -6.415 0.0000 -11.76510 -6.25841 The interaction term ( type:fp() ) is not signiﬁcant, therefore accounting for time-dependenteﬀects on the type of aortic valve is not necessary.

Non-linear eﬀects

In the above models the eﬀect of age was assumed to be linear. We can investigate the non-linear eﬀect of age using fractional polynomials. Including fp(age, powers = c(1, 1)) speciﬁes a second-order fractional polynomial. The ﬁrst speciﬁed term is stime to the ﬁrstpower, which in this case is a linear term. The second speciﬁed term is stime to the secondpower multiplied by the natural log of stime . R> m8 <- merlin(R+ model = Surv(stime, died) ~ type +R+ fp(age, powers = c(1, 1)) +R+ rcs(stime, df = 3, log = TRUE, event = TRUE),R+ timevar = "stime",R+ family = "rp",R+ data = heart.valveR+ )R> summary(m8)

Mixed effects regression modelLog likelihood = -170.7799Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 0.038219 0.350542 0.109 0.9132 -0.648830 0.725268fp():1 -0.596800 0.392650 -1.520 0.1285 -1.366380 0.172780fp():2 0.133596 0.072833 1.834 0.0666 -0.009154 0.276345rcs():1 1.114711 0.143890 7.747 0.0000 0.832692 1.396731rcs():2 -0.277742 0.089596 -3.100 0.0019 -0.453347 -0.102137rcs():3 -0.003194 0.076810 -0.042 0.9668 -0.153740 0.147351_cons 0.007325 5.537987 0.001 0.9989 -10.846930 10.861580

The hazard ratios for linear age (0.551) and linear age multiplied by log age (1.143) are bothsigniﬁcant, suggesting the eﬀect of age is non-linear. We could go on to investigate whetherthe proportional hazard assumption is valid for this non-linear function of age by ﬁtting aninteraction between the age function and log time.

The syntax for relatively simple one outcome models such as those above can be simpliﬁedusing available wrapper functions, which use the powerful merlin function underneath. Forexample the mlsurv wrapper ﬁts parametric survival models, with exponential , weibull , gompertz , rp , logchazard , and loghazard model options. To illustrate, the Weibull modelin m5 above can ﬁtted using the wrapper mlsurv , with simpliﬁed syntax.0 merlin: Mixed Eﬀects Regression R> mlsurv(R+ formula = Surv(stime, died) ~ age + type,R+ distribution = "weibull",R+ data = heart.valveR+ )

Proportional hazards regression modelWeibull baseline hazardData: dataCoefficients:age type _cons log(gamma)0.09719 0.03778 -11.67541 0.63998

Competing risk analysis can be framed as a multiple outcome survival model by specifyingcause-speciﬁc hazard models. As this data set only contains all cause survival information,to illustrate ﬁtting a competing risks model we randomly assign the deaths to either cardio-vascular disease ( cardio ) or other causes ( other ). R> set.seed(6342)R> heart.valve$cardio[!is.na(heart.valve$died)] <-R+ rbinom(length(heart.valve$died[!is.na(heart.valve$died)]), 1, 0.6)R> heart.valve$other <- 1 - heart.valve$cardioR> heart.valve$cardio[heart.valve$died == 0 & !is.na(heart.valve$died)] <- 0R> heart.valve$other[heart.valve$died == 0 & !is.na(heart.valve$died)] <- 0

We can then ﬁt a model with cardio as one outcome and other as a second outcome.Both event types have been ﬁtted using a Royston-Parmar model with 3 degrees of freedom,however it is possible to use diﬀerent survival model types for each event. As there are twooutcomes the model now needs to be speciﬁed as a list.

R> m9 <- merlin(R+ model = list(R+ Surv(stime, cardio) ~ type +R+ rcs(stime, df = 3, log = TRUE, event = TRUE),R+ Surv(stime, other) ~ type +R+ rcs(stime, df = 3, log = TRUE, event = TRUE)R+ ),R+ timevar = c("stime", "stime"),R+ family = c("rp", "rp"),R+ data = heart.valveR+ )R> summary(m9) ournal of Statistical Software Mixed effects regression modelLog likelihood = -232.9797Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type -0.004186 0.412139 -0.010 0.9919 -0.811963 0.803591rcs():1 1.098643 0.186195 5.901 0.0000 0.733708 1.463578rcs():2 -0.415694 0.101863 -4.081 0.0000 -0.615343 -0.216045rcs():3 -0.149017 0.100844 -1.478 0.1395 -0.346668 0.048634_cons -2.836043 0.351496 -8.068 0.0000 -3.524962 -2.147124type -0.408162 0.364551 -1.120 0.2629 -1.122670 0.306346rcs():1 2.480418 0.792595 3.129 0.0018 0.926960 4.033876rcs():2 1.362926 0.631493 2.158 0.0309 0.125223 2.600629rcs():3 -0.178580 0.124901 -1.430 0.1528 -0.423381 0.066222_cons -2.405594 0.323577 -7.434 0.0000 -3.039795 -1.771394

The hazard ratio for type of graft is 0.996 for death from cardiovascular disease, suggestingin this simulated example having a stentless valve replacement increases the risk of death dueto cardiovascular disease compared to having a homograft valve replacement. The eﬀect is inthe same direction for death from other causes, although the hazard ratio of 0.665 suggeststhe eﬀect is much smaller.Hazard ratios describe the relative diﬀerences in hazard between groups. In the case of com-peting risks the cause-speciﬁc cumulative incidence function, which is the probability of failurefrom the event of interest in the presence of other competing events, may be more useful. Wecan calculate the cause-speciﬁc cumulative incidence function for each of the causes in themodel using the predict function, by specifying which submodel to predict from using the predmodel option. Predictions are for each valve type at a given age, speciﬁed using the at option. Here the new cardio_homograft variable will give the cause-speciﬁc cumulative inci-dence function for time-to-death from cardiovascular disease (which is submodel 1) assuminga patient aged 50 has had a homograft valve replacement. R> card_homo <- predict(m9, stat = "cif", type = "fixedonly",R+ predmodel = 1, at = c(age = 50, type = 0))

To create stacked cumulative incidence plots for both types of value replacements we thencalculate further predictions as below for stentless grafts, and for death from other causes.

R> card_stent <- predict(m9, stat = "cif", type = "fixedonly",R+ predmodel = 1, at = c(age = 50, type = 1))R> other_homo <- predict(m9, stat = "cif", type = "fixedonly",R+ predmodel = 2, at = c(age = 50, type = 0))R> other_stent <- predict(m9, stat = "cif", type = "fixedonly",R+ predmodel = 2, at = c(age = 50, type = 1))

We can then plot the stacked cumulative incidence plots, allowing us to compare between thetwo types of valve replacement. They show that for both types the cumulative incidence forcardiovascular disease starts to ﬂatten over time, whereas it continues to increase for deathfrom other causes.2 merlin: Mixed Eﬀects Regression

Homograft valve Stentless valve0 3 6 9 0 3 6 90.00.20.40.6

Time (years) C u m u l a t i v e I n c i den c e Cause of death

Cardiovascular diseaseOther

Figure 5: Stacked cumulative incidence functions for death from cardiovascular disease anddeath from other causes, by type of valve replacement (model m9)

R> stime = rep(heart.valve$stime[!is.na(heart.valve$stime)], 4)R> pred = c(card_homo, card_stent, other_homo, other_stent)R> valve = rep(c(rep("Homograft valve", length(card_homo)),R+ rep("Stentless valve", length(card_stent))), 2)R> cod = c(rep("cardio", length(card_homo) * 2),R+ rep("other", length(card_stent) * 2))R> pred_comp <- data.frame(stime, pred, valve, cod)R> pred_comp <- pred_comp[!duplicated(pred_comp[, c(1, 3, 4)]),]R> ggplot(pred_comp, aes(x = stime, y = pred, fill = cod)) +R+ geom_area() +R+ xlab("Time (years)") +R+ ylab("Cumulative Incidence") +R+ theme_classic(base_size = 10) +R+ theme(legend.position = c(0.19, 0.83)) +R+ scale_fill_discrete(name = "Cause of death",R+ labels = c("Other", "Cardiovascular disease"),R+ guide = guide_legend(reverse = TRUE)) +R+ facet_grid(. ~ valve) ournal of Statistical Software merlin to ﬁt joint longitudinal survival models. This can ﬁrstly be donethrough shared random-eﬀects. In merlin shared random-eﬀects are speciﬁed by using thesame name for the random-eﬀects that are to be shared. When using shared-random eﬀectsit is possible to allow for an association factor, which explains the relationship between theindividual-speciﬁc random intercept of the biomarker and their survival. To estimate theassociation parameter the *1 is dropped from one of the random-eﬀects terms, removing theconstraint and allowing a coeﬃcient to be estimated. R> m10 <- merlin(R+ model = list(R+ Surv(stime, died) ~ type + M1[id],R+ log.grad ~ sex + age + time + M1[id] * 1R+ ),R+ timevar = c("stime", "time"),R+ levels = c("id"),R+ family = c("weibull", "gaussian"),R+ data = heart.valveR+ )R> summary(m10)

Mixed effects regression modelLog likelihood = -825.2692Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 0.6198651 0.3254017 1.905 0.0568 -0.0179105 1.2576408M1 1.8384968 0.8677693 2.119 0.0341 0.1377002 3.5392934_cons -5.0586043 0.5317924 -9.512 0.0000 -6.1008982 -4.0163104log(gamma) 0.5320193 0.1265303 4.205 0.0000 0.2840245 0.7800141sex 0.1466814 0.0745633 1.967 0.0492 0.0005401 0.2928228age -0.0051930 0.0030689 -1.692 0.0906 -0.0112078 0.0008219time -0.0017929 0.0124788 -0.144 0.8858 -0.0262510 0.0226652_cons 2.9499413 0.2081879 14.170 0.0000 2.5419006 3.3579820log_sd(resid.) -0.5013162 0.0340641 -14.717 0.0000 -0.5680806 -0.4345517log_sd(M1) -1.0821441 0.1140678 -9.487 0.0000 -1.3057129 -0.8585752Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 7

The parameter estimates from the survival submodel are reported ﬁrst, followed by the esti-mates from the longitudinal model, then the random-eﬀect estimates. Here the random eﬀect M1 is shared between the longitudinal and survival submodels. In the longitudinal model M1 is a random intercept, therefore this model is relating an individuals baseline valve gradientto their survival. The standard deviation for the random eﬀect is 0.339. The association be-tween the random-eﬀect on the intercept and survival is given in row M1 in the results table,a value of 1.838 suggests that higher baseline valve gradient leads to worse survival.Alternatively it is possible to to use current value parameterisation, linking the time-dependentexpected value of the biomarker to survival using the EV[] option. In order to do this the4 merlin: Mixed Eﬀects Regression timevar() must be supplied, as integration over time is required for the survival modellikelihood contribution.

R> m11 <- merlin(R+ model = list(R+ Surv(stime, died) ~ type + EV[log.grad],R+ log.grad ~ sex + age + time + M1[id] * 1R+ ),R+ timevar = c("stime", "time"),R+ levels = c("id"),R+ family = c("weibull", "gaussian"),R+ data = heart.valveR+ )R> summary(m11)

Mixed effects regression modelLog likelihood = -826.1898Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 0.814597 0.322574 2.525 0.0116 0.182363 1.446831EV[] 0.785010 0.810309 0.969 0.3327 -0.803168 2.373187_cons -6.747273 2.315188 -2.914 0.0036 -11.284959 -2.209587log(gamma) 0.403767 0.131396 3.073 0.0021 0.146236 0.661298sex 0.174081 0.074095 2.349 0.0188 0.028858 0.319305age -0.003442 0.003240 -1.062 0.2882 -0.009793 0.002909time -0.011135 0.013790 -0.807 0.4194 -0.038163 0.015893_cons 2.856742 0.226865 12.592 0.0000 2.412094 3.301390log_sd(resid.) -0.512929 0.033856 -15.150 0.0000 -0.579286 -0.446572log_sd(M1) -1.106644 0.114632 -9.654 0.0000 -1.331319 -0.881969Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 7

Now the association term

EV[] describes the eﬀect of the current expected value of thebiomarker log.grad on survival. The hazard ratio is 2.192, meaning for every unit increasein log valve gradient the hazard is 2.192 times higher.The form of the association between the the longitudinal biomarker model and survival isﬂexible. For example, if there was instead a random slope term we could link the trend in therepeatedly measured biomarker to survival by using the gradient of the longitudinal model dEV[] . R> m12 <- merlin(R+ model = list(R+ Surv(stime, died) ~ type + dEV[log.grad],R+ log.grad ~ sex + age + time + time:M1[id] * 1R+ ), ournal of Statistical Software R+ timevar = c("stime", "time"),R+ levels = c("id"),R+ family = c("weibull", "gaussian"),R+ data = heart.valveR+ )R> summary(m12)

Mixed effects regression modelLog likelihood = -840.1187Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 0.834296 0.331631 2.516 0.0119 0.184312 1.484281dEV[] 1.835397 39.467308 0.047 0.9629 -75.519105 79.189898_cons -5.018370 0.719250 -6.977 0.0000 -6.428074 -3.608665log(gamma) 0.517752 0.124721 4.151 0.0000 0.273304 0.762200sex 0.141272 0.061993 2.279 0.0227 0.019767 0.262777age -0.001734 0.003074 -0.564 0.5727 -0.007758 0.004290time -0.017077 0.021456 -0.796 0.4261 -0.059131 0.024977_cons 2.748958 0.198092 13.877 0.0000 2.360704 3.137211log_sd(resid.) -0.401866 0.034755 -11.563 0.0000 -0.469985 -0.333748log_sd(M1) -3.337529 0.660043 -5.057 0.0000 -4.631190 -2.043867Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 7

The hazard ratio for the gradient of the longitudinal model is 6.268 ( exp(dEV[] ), this largehazard ratio suggests an increase in log of the valve gradient leads to greater hazard. Howeverthe conﬁdence intervals are wide, which may be due to the small standard deviation of 0.036in the linear slope, leading to issues in the estimation of its eﬀect.Other available links include the second derivative d2EV[] and the cumulative exposure iEV[] .We can extend the previous model to investigate possible non-linearities in the associationsand time dependent eﬀects by including the interaction between the expected value of thebiomarker and log time.

R> m13 <- merlin(R+ model = list(R+ Surv(stime, died) ~ type +R+ EV[log.grad] +R+ EV[log.grad]:fp(stime, powers = c(0)),R+ log.grad ~ time + M1[id] * 1R+ ),R+ timevar = c("stime", "time"),R+ levels = c("id"),R+ family = c("weibull", "gaussian"),R+ data = heart.valveR+ )R> summary(m13) merlin: Mixed Eﬀects Regression Mixed effects regression modelLog likelihood = -827.7168Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 0.67433 0.31513 2.140 0.0324 0.05669 1.29197EV[] 0.24692 1.03307 0.239 0.8111 -1.77786 2.27170EV[]:fp() -0.17456 0.38326 -0.455 0.6488 -0.92574 0.57661_cons -5.85705 3.03192 -1.932 0.0534 -11.79950 0.08539log(gamma) 0.77284 0.46006 1.680 0.0930 -0.12885 1.67453time -0.01275 0.01273 -1.002 0.3163 -0.03770 0.01219_cons 2.66026 0.04662 57.058 0.0000 2.56888 2.75164log_sd(resid.) -0.50952 0.03478 -14.652 0.0000 -0.57768 -0.44136log_sd(M1) -1.13669 0.12899 -8.812 0.0000 -1.38949 -0.88388Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 7

The parameter estimate for the interaction (

EV[]:fp() in the results table) suggests thatover time a unit increase in the current value of the log valve gradient has a reduced eﬀecton hazard, however this is not signiﬁcant.

For illustrative purposes we will now show how ﬂexible merlin is by bringing together theprevious examples, with some new model options, in one ﬁnal model. Binary outcomes canbe included using the bernoulli family. To illustrate this we will create a new binary variable catef from the ejection fraction variable ef . R> heart.valve$catef <- 0R> heart.valve$catef[heart.valve$ef > 70] <- 1

Model m14 includes the two survival models for the competing risks of death (cardiovasculardisease and other causes) from model m9 . Continuous log valve gradient over time is describedusing restricted cubic splines as in model m6 with random intercept term ( M1 ). Binary catef over time is also modelled with a random intercept term ( M2 ). Both survival outcomes aredescribed using Weibull models, the eﬀect of the type of valve replacement on both causes ofdeath is estimated. The eﬀect of type of valve replacement is assumed to be time dependentin the time to death from other causes model. The random intercept for valve gradient M1 is shared with the time-to-death from other causes model, while the random intercept forcategorical ejection fraction M2 is shared with the time-to-death from cardiovascular diseasemodel. The expected value of valve gradient is included in the time-to-death from cardiovas-cular disease model. R> m14 <- merlin(R+ model = list(R+ Surv(stime, cardio) ~ type + EV[log.grad] + M2[id],R+ Surv(stime, other) ~ type + type:fp(stime, powers = c(0)) + M1[id], ournal of Statistical Software R+ log.grad ~ age + type + rcs(time, df = 3, orthog = TRUE) + M1[id] * 1,R+ catef ~ fp(time, powers = c(1)) + M2[id] * 1R+ ),R+ timevar = c("stime", "stime", "time", "time"),R+ levels = c("id"),R+ covariance = "unstructured",R+ family = c("weibull", "weibull", "gaussian", "bernoulli"),R+ data = heart.valve,R+ control = list(ip = 9)R+ )R> summary(m14)

Mixed effects regression modelLog likelihood = -1409.455Estimate Std. Error z Pr(>|z|) [95% Conf. Interval]type 1.713733 0.741166 2.312 0.0208 0.261074 3.166392EV[] -1.206885 2.351814 -0.513 0.6078 -5.816356 3.402585M2 -0.069736 0.119881 -0.582 0.5608 -0.304699 0.165228_cons -8.339090 5.046409 -1.652 0.0984 -18.229870 1.551689log(gamma) 1.429550 0.153848 9.292 0.0000 1.128013 1.731087type 11.132882 1.326384 8.393 0.0000 8.533217 13.732548type:fp() -4.097624 0.684881 -5.983 0.0000 -5.439965 -2.755282M1 2.813494 0.158675 17.731 0.0000 2.502497 3.124491_cons -13.450441 1.494459 -9.000 0.0000 -16.379527 -10.521355log(gamma) 1.590105 0.138583 11.474 0.0000 1.318488 1.861722age -0.013361 0.003780 -3.535 0.0004 -0.020769 -0.005953type 0.911158 0.071653 12.716 0.0000 0.770720 1.051596rcs():1 -0.007336 0.028614 -0.256 0.7977 -0.063419 0.048747rcs():2 -0.159499 0.032838 -4.857 0.0000 -0.223860 -0.095138rcs():3 0.126317 0.026191 4.823 0.0000 0.074983 0.177651_cons 3.283403 0.223338 14.701 0.0000 2.845668 3.721138log_sd(resid.) -0.578881 0.036576 -15.827 0.0000 -0.650569 -0.507193fp() -0.117359 0.072014 -1.630 0.1032 -0.258503 0.023785_cons 1.377721 0.550711 2.502 0.0124 0.298348 2.457094log_sd(M2) 1.125207 0.182159 6.177 0.0000 0.768182 1.482232log_sd(M1) 0.297373 0.240480 1.237 0.2162 -0.173959 0.768704atanh_corr(M1,M2) -1.524655 0.257312 -5.925 0.0000 -2.028978 -1.020332Integration method: Non-adaptive Gauss-Hermite quadratureIntegration points: 9

The parameters are reported in the results table in the order the submodels were speciﬁed.This model is complex, and it would not be possible to ﬁt it in the other joint modellingsoftware discussed. This complexity of the model means it is computationally intensive to8 merlin: Mixed Eﬀects Regression estimate, taking approximately 29 minutes on a 2-core laptop with 8 Gb of RAM.

4. Discussion

The example above illustrates the ﬂexibility of merlin and the wide rage of models it is ableto ﬁt, the cost of this ﬂexibility is computational time, more complex models with multiplerandom eﬀects can be slow. For particular models it may be possible to calculate the likelihoodmore eﬃciently, which cannot be done here due to the generalised way merlin has been built.By default the likelihood is estimated using Gauss-Hermite quadrature to integrate out therandom-eﬀects. For every random eﬀect the likelihood has to be estimated at each of thequadrature points, meaning for r random eﬀects, and n quadrature points the likelihood hasto be estimated n r times. Alternatively the option to use Monte-Carlo integration is available,which is more eﬃcient, especially for large numbers of random eﬀects, improving computationtime.A further cost of the ﬂexibility of merlin is the relatively complex syntax which is necessary toenable the ﬁtting of all possible models. Whilst the syntax allows for a wide range of modelsto be ﬁtted, it could lead to confusion and mistakes in model speciﬁcation being made. Toaddress this we have written inbuilt wrapper functions, such as mlsurv shown in Section 3.4,and mlrcs , to ﬁt speciﬁc types of models, which will use the underlying merlin package, butwill have simpliﬁed syntax to make them more user-friendly.The merlin package is constantly evolving, with many further updates to merlin plannedin line with the Stata package. Planned updates include implementation of fully adaptiveGauss-Hermite quadrature, allowing for left truncated survival data, and empirical Bayespredictions of the random eﬀects.Overall the ﬂexibility of merlin will allow it to be used to ﬁt a wide range of new models whichcannot be implemented in other software packages. Its modular nature will allow for the easyaddition of further components and families of outcomes, and the ability to incorporate user-deﬁned functions means that there are many directions merlin can be taken in which havenot yet been considered.

5. Acknowledgements

This work was supported by the Medical Research Council (MR/P015433/1).

References

Bates D, Mächler M, Bolker B, Walker S (2015). “Fitting Linear Mixed-Eﬀects Models Usinglme4.”

Journal of Statistical Software , (1), 1–48. doi:10.18637/jss.v067.i01 .Crowther MJ (2017). “Extended multivariate generalised linear and non-linear mixed eﬀectsmodels.” arXiv e-prints , arXiv:1710.02223. .Crowther MJ (2018). “merlin - a uniﬁed modelling framework for data analysis and methodsdevelopment in Stata.” arXiv e-prints , arXiv:1806.01615. . ournal of Statistical Software Statistics in medicine , (14),2181–2195.Hickey GL, Philipson P, Jorgensen A, Kolamunnage-Dona R (2018). “joineRML: a jointmodel and software package for time-to-event and multivariate longitudinal outcomes.” BMC Med Res Methodol , (1), 50. doi:10.1186/s12874-018-0502-1 . URL .Lim E, Ali A, Theodorou P, Sousa I, Ashraﬁan H, Chamageorgakis T, Duncan A, HeneinM, Diggle P, Pepper J (2008). “Longitudinal study of the proﬁle and predictors of leftventricular mass regression after stentless aortic valve replacement.” Ann Thorac Surg , (6), 2026–9. doi:10.1016/j.athoracsur.2008.02.023 . URL .Philipson P, Sousa I, Diggle PJ, Williamson P, Kolamunnage-Dona R, Henderson R, HickeyGL (2018). joineR: Joint Modelling of Repeated Measurements and Time-to-Event Data .R package version 1.2.4, URL https://github.com/graemeleehickey/joineR/ .Pinheiro J, Bates D, DebRoy S, Sarkar D, R Core Team (2019). nlme: Linear and NonlinearMixed Eﬀects Models . R package version 3.1-141, URL https://CRAN.R-project.org/package=nlme .Rizopoulos D (2010). “JM: An R Package for the Joint Modelling of Longitudinal and Time-to-Event Data.” Journal of Statistical Software , (9), 1–33.Rizopoulos D (2016). “The R Package JMbayes for Fitting Joint Models for Longitudinaland Time-to-Event Data Using MCMC.” Journal of Statistical Software , (7), 1–46.ISSN 1548-7660. doi:10.18637/jss.v072.i07 . Ed7td Times Cited:12 Cited ReferencesCount:52, URL ://WOS:000389072600001 .Rondeau V, Mazroui Y, Gonzalez JR (2012). “frailtypack: An R Package for the Anal-ysis of Correlated Survival Data with Frailty Models Using Penalized Likelihood Esti-mation or Parametrical Estimation.” Journal of Statistical Software , (4), 1–28. ISSN1548-7660. 939iw Times Cited:56 Cited References Count:25, URL ://WOS:000303804000001 .Royston P (2001). “Flexible Parametric Alternatives to the Cox Model, and more.” TheStata Journal , (1), 1–28. doi:10.1177/1536867X0100100101 . https://doi.org/10.1177/1536867X0100100101 , URL https://doi.org/10.1177/1536867X0100100101 .Therneau TM (2019). coxme: Mixed Eﬀects Cox Models . R package version 2.2-14, URL https://CRAN.R-project.org/package=coxme .Umlauf N, Klein N, Zeileis A (2017). “BAMLSS: Bayesian Additive Models for Location,Scale and Shape (and Beyond).” Journal of Computational and Graphical Statistics .0 merlin: Mixed Eﬀects Regression Aﬃliation:

Emma C. MartinUniversity of LeicesterDepartment of Health Sciences, University Road, Leicester, LE1 7RH, UKE-mail: [email protected]

Alessandro GaspariniKarolinska InstitutetDepartment of Medical Epidemiology and Biostatistics, Stockholm, SwedenE-mail: [email protected]

Michael J. CrowtherUniversity of Leicester &Karolinska InstitutetDepartment of Health Sciences, University Road, Leicester, LE1 7RH, UK & Department ofMedical Epidemiology and Biostatistics, Stockholm, SwedenE-mail: [email protected]

Journal of Statistical Software published by the Foundation for Open Access Statistics

MMMMMM YYYY, Volume VV, Issue II

Submitted: yyyy-mm-dd doi:10.18637/jss.v000.i00