A Bayesian Time-Varying Effect Model for Behavioral mHealth Data
Matthew D. Koslovsky, Emily T. Hebert, Michael S. Businelle, Marina Vannucci
Submitted to the Annals of Applied Statistics
By Matthew D. Koslovsky∗, Emily T. Hébert†, Michael S. Businelle†, and Marina Vannucci‡
Colorado State University∗, Oklahoma Tobacco Research Center† and Rice University‡

The integration of mobile health (mHealth) devices into behavioral health research has fundamentally changed the way researchers and interventionalists are able to collect data as well as deploy and evaluate intervention strategies. In these studies, researchers often collect intensive longitudinal data (ILD) using ecological momentary assessment methods, which aim to capture psychological, emotional, and environmental factors that may relate to a behavioral outcome in near real-time. In order to investigate ILD collected in a novel, smartphone-based smoking cessation study, we propose a Bayesian variable selection approach for time-varying effect models, designed to identify dynamic relations between potential risk factors and smoking behaviors in the critical moments around a quit attempt. We use parameter-expansion and data-augmentation techniques to efficiently explore how the underlying structure of these relations varies over time and across subjects. We achieve deeper insights into these relations by introducing nonparametric priors for regression coefficients that cluster similar effects for risk factors while simultaneously determining their inclusion. Results indicate that our approach is well-positioned to help researchers effectively evaluate, design, and deliver tailored intervention strategies in the critical moments surrounding a quit attempt.
1. Introduction.
1.1. Scientific Background.
The integration of mobile health (mHealth) devices into behavioral health research has fundamentally changed the way researchers and interventionalists are able to collect data as well as deploy and evaluate intervention strategies. Leveraging mobile and sensing technologies, just-in-time adaptive interventions (JITAI) or ecological momentary interventions are designed to provide tailored support to participants based on their mood, affect, and socio-environmental context (Heron and Smyth, 2010; Nahum-Shani et al., 2017). In order to deliver theory-based interventions at critical moments, researchers collect intensive longitudinal data using ecological momentary assessment (EMA) methods, which aim to capture psychological, emotional, and environmental factors that may relate to a behavioral outcome in near real-time. In practice, JITAIs' effectiveness depends on accurately identifying high-risk situations by the user or by pre-determined decision rules to initiate the delivery of intervention components. Decision rules for efficacious interventions rely on a thorough understanding of the factors that characterize a subject's risk for a behavioral outcome, the dynamics of these risk factors' relation with the outcome over time, and the knowledge of possible strategies to target a risk factor (Nahum-Shani et al., 2017).

Keywords and phrases: ecological momentary assessment, mHealth, Pólya-Gamma augmentation, time-varying effect model, variable selection

In the analysis of this paper, we investigate a behavioral health intervention study that targets smoking cessation. Historically, smoking cessation studies have used health behavior theory (Shiffman et al., 2002; Timms et al., 2013) or group-level trends of smoking antecedents (Piasecki et al., 2013) to determine when a JITAI should be triggered. However, this approach is limited since current health behavior models are inadequate for guiding the dynamic and granular nature of JITAIs (Riley et al., 2011; Klasnja et al., 2015). Additionally, the design of efficacious smoking cessation interventions is challenged by the complexity of smoking behaviors around a quit attempt and misunderstandings of the addiction process (Piasecki et al., 2002).
More recently, smoking behavior researchers have capitalized on the ability of mHealth techniques to collect rich streams of data capturing subjects' experiences close to their occurrence at a high temporal resolution. The structure, as well as the complexity, of these data provides unique opportunities for the development and implementation of more advanced analytical methods compared to traditional longitudinal data analysis methods used in behavioral research (e.g., mixed models, growth curve models) (Trail et al., 2014). For example, researchers have applied reinforcement learning (Luckett et al., 2019) and dynamic systems approaches (Trail et al., 2014; Rivera, Pew and Collins, 2007; Timms et al., 2013) to design and assess optimal treatment strategies using mHealth data. Additionally, Koslovsky et al. (2018), de Haan-Rietdijk et al. (2017) and Berardi et al. (2018) have applied hidden and observed Markov models to study transitions between discrete behavioral states, Shiyko et al. (2012) and Dziak et al. (2015) have used mixture models to identify latent structures, and Kürüm et al. (2016) have employed joint modeling techniques to study the complexity of smoking behaviors.
Greater insights into the dynamic relation between risk factors and smoking behaviors have been generated by the application of functional data techniques (Trail et al., 2014; Vasilenko et al., 2014; Koslovsky et al., 2017; Tan et al., 2012). These methods are well-suited for high-dimensional data with unbalanced and unequally-spaced observation times, matching the format of data collected with EMAs. They also require few assumptions on the structure of the relations between risk factors and behavioral outcomes. One popular approach uses varying-coefficient models, which belong to the class of generalized additive (mixed) models. These semiparametric regression models allow a covariate's corresponding coefficient to vary as a smooth function of other covariates (Hastie and Tibshirani, 1993). For example, Selya et al. (2015) examined how the relation between the number of cigarettes smoked during a smoking event and smoking-related mood changes varies as a function of nicotine dependence. More frequently, penalized splines have been employed in varying-coefficient models to investigate how the effect of a covariate varies as a function of time, leading to time-varying effect models (TVEM) (Tan et al., 2012; Lanza et al., 2013; Koslovsky et al., 2017; Shiyko et al., 2012; Mason et al., 2015; Vasilenko et al., 2014). These approaches allow researchers to identify the critical moments at which a particular risk factor is strongly associated with smoking behaviors, information that can be used to design tailored intervention strategies based on a subject's current risk profile.

1.2. Model Overview.

While there are various inferential challenges that functional data analysis models can address, in the application of this paper we focus on incorporating three recurring themes in behavioral research to explore the relations between risk factors and smoking behaviors:

1. Model Assumptions - Numerous smoking behavior research studies have relied on semiparametric, spline-based methods to learn the relational structure between risk factors and outcomes (Tan et al., 2012; Vasilenko et al., 2014).
2. Variable Selection - One of the main objectives of intensive longitudinal data analysis is to identify or re-affirm complex relations between risk factors and behavioral outcomes over time (Walls and Schafer, 2005).
3. Latency - A common aim in smoking behavior research studies is to identify latent structure in the data, such as groups or clusters of subjects with similar smoking behaviors over time (McCarthy et al., 2016; Cursio, Mermelstein and Hedeker, 2019; Geiser et al., 2013; Dziak et al., 2015; Brook et al., 2008).
To incorporate and expand upon these features in our analysis, we develop a flexible Bayesian varying-coefficient regression modeling framework for longitudinal binary responses that uses variable selection priors to provide insights into the dynamic relations between risk factors and outcomes. We embed spike-and-slab variable selection priors as mixtures of a point mass at zero (spike) and a diffuse distribution (slab) (George and McCulloch, 1993; Brown, Vannucci and Fearn, 1998) and adopt the formulation of Scheipl, Fahrmeir and Kneib (2012) to deconstruct the varying-coefficient terms, in our case time-varying effects, into a main effect, linear interaction term, and non-linear interaction term. Unlike previous approaches in behavioral health research that use time-varying effect models, our formulation allows us to gain inference on whether a given risk factor is related to the smoking behavior while also learning the type of relation. Additionally, by performing selection on fixed as well as random effects, our method is equipped to identify relations that vary over time and across subjects. For this, we exploit a Pólya-Gamma augmentation scheme that enables efficient sampling without sacrificing interpretability of the regression coefficients as log odds ratios (Polson, Scott and Windle, 2013). Furthermore, we adopt a Bayesian semiparametric approach to model fixed and random effects by replacing the traditional spike-and-slab prior with a nonparametric construction to cluster risk factors that have similar strengths of association.

1.3. Just-in-Time Adaptive Interventions for Smoking Abstinence.

Although multiple studies have examined momentary predictors of smoking lapse (Shiffman et al., 2000; Piasecki et al., 2003; Businelle et al., 2014), JITAIs for smoking cessation are still nascent. Thus far, studies have used participant-labeled GPS coordinates to trigger supportive messages to prevent smoking (Naughton et al., 2016), or have tailored messages to the duration and intensity of participants' self-reported side effects while taking varenicline (McClure et al., 2016). Using our proposed approach, we analyze ILD collected in a study investigating the utility of a novel, smartphone-based smoking cessation JITAI (SmartT). The SmartT intervention (Businelle et al., 2016) uses a lapse risk estimator to identify moments of heightened risk for lapse, and tailors treatment messages in real-time based upon the level of imminent smoking lapse risk and currently present lapse triggers. To our knowledge, no other studies have used EMA data to estimate risk for imminent smoking lapse and deliver situation-specific, individually-tailored treatment content prior to lapse.

In this study, adult smokers (N=81) recruited from a smoking cessation research clinic were randomized to the SmartT intervention, the National Cancer Institute's QuitGuide (NCI QuitGuide), or weekly counseling sessions (usual care), and followed over a five-week period spanning one week prior to a scheduled quit attempt to four weeks after. At the beginning of the assessment period, baseline measures were collected, and subjects were shown how to complete EMAs on a study-provided smartphone. Throughout the assessment period, subjects completed daily diaries and received four random EMAs from the smartphone to complete each day. For each EMA, subjects were prompted on their recent smoking behaviors, alcohol consumption, as well as various questions regarding their current psychological, social, and environmental factors that may contribute to an increased risk of smoking behaviors.

Findings indicate that our approach is well-positioned to help researchers evaluate, design, and deliver tailored intervention strategies in the critical moments surrounding a quit attempt. In particular, results confirm previously identified temporal relations between smoking behaviors around a quit attempt and risk factors. They also indicate that subjects differ in how they respond to different risk factors over time. Furthermore, we identify clusters of active risk factors that can help researchers prioritize intervention strategies based on their relative strength of association at a given moment. Importantly, our approach generates these insights with minimal assumptions regarding which risk factors were related to smoking in the presence of others, the structural form of the relation for active terms, or the parametric form of regression coefficients.

The rest of the paper is organized as follows. In section 2, we present our modeling approach and describe prior constructions. In section 3, we investigate the relations between risk factors and smoking behaviors in the critical moments surrounding a scheduled quit attempt using mHealth data. In section 4, we conduct a simulation study investigating the variable selection and clustering performance of our proposed method on simulated data. In section 5, we evaluate prior sensitivity of our model. In section 6, we provide concluding remarks.
2. Methods.
The objective of our analysis is to identify relations between a set of risk factors (i.e., baseline and EMA items) and a binary outcome (i.e., momentary smoking) repeatedly collected over time. For this, we employ a Bayesian variable selection framework that allows a flexible structure for the unknown relations. We achieve this by performing selection not only on main effects, but additionally on linear and non-linear interaction terms as well as random effects. In this work, we refer to fixed and random effects in the context of hierarchical or multilevel models, where fixed effects are constant across subjects and random effects differ at the subject-level. We chose this terminology based on its familiarity within both frequentist and Bayesian paradigms, but point out that the fixed or population-level effects are treated as random variables in our model, and thus follow a probability distribution.

2.1. A Varying-Coefficient Model for Intensive Longitudinal Data Collected with EMAs.
Let $y_{ij} \in \{0, 1\}$ represent momentary smoking for subject $i = 1, \dots, N$, and $x_{ij}$ and $z_{ij}$ represent $P$- and $D$-dimensional vectors of risk factors collected on each subject at time $j = 1, \dots, n_i$, respectively. To maintain temporality in our particular application (see section 3 for more details), we model the relation between momentary smoking by the next assessment and current, potential risk factors as a varying-coefficient model of the type

$$\text{logit}\left(P(y_{i,j+1} = 1 \mid x_{ij}, z_{ij}, u_{ij})\right) = \sum_{p=1}^{P} f_p(u_{ij})\, x_{ijp} + \alpha_i' z_{ij}, \tag{2.1}$$

where $f_p(u)$ are smooth functions of a scalar covariate $u$, and $\alpha_i$ represents subject-specific random effects. Similar temporal assumptions have been made previously in smoking behavior research studies (Bolman et al., 2018; Minami et al., 2014; Shiffman et al., 1996; Shiffman, 2013; Shiyko et al., 2014). Note that in general, researchers may use the framework of Eq. (2.1) to model the relation between a binary outcome and potential risk factors collected concurrently, in addition to lagged trends, as is typical in longitudinal studies (Fitzmaurice, Laird and Ware, 2012). With this formulation, we include varying-coefficient terms for each of the $P$ risk factors based on $u$. However, in general, we can specify varying-coefficient terms that depend on $u' \neq u$, and thus the number of varying-coefficient terms in the full model is not strictly $P$. If $u$ is chosen to represent time, then this model is commonly referred to as a time-varying effect model in smoking behavior research (Tan et al., 2012; Vasilenko et al., 2014; Dziak et al., 2015; Koslovsky et al., 2017). Note that $z_{ij}$ is typically a subset of $x_{ij}$ (Kinney and Dunson, 2007; Cheng et al., 2010; Hui, Müller and Welsh, 2017) and that incorporating a 1 in $x_{ij}$ and $z_{ij}$ allows for an intercept term that varies as a function of $u$ and a random intercept term, respectively.
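As an illustration of Eq. (2.1), the following Python sketch evaluates the probability of smoking by the next assessment for one subject at one assessment. The coefficient functions `f_intercept` and `f_urge`, the covariate values, and the random-effect value are hypothetical placeholders chosen only for illustration; they are not estimates from the study.

```python
import numpy as np

def smoking_probability(u, x, z, coef_funcs, alpha):
    """P(y_{i,j+1} = 1 | x_ij, z_ij, u_ij) under the varying-coefficient model."""
    # Fixed part: sum over p of f_p(u) * x_p
    fixed = sum(f(u) * x_p for f, x_p in zip(coef_funcs, x))
    # Random part: alpha_i' z_ij
    random_part = float(np.dot(alpha, z))
    eta = fixed + random_part          # linear predictor on the logit scale
    return 1.0 / (1.0 + np.exp(-eta))  # inverse logit

# Hypothetical smooth coefficient functions of u (assumptions, not fitted values)
f_intercept = lambda u: -1.0 + 0.1 * u
f_urge = lambda u: 0.8 * np.exp(-0.5 * u ** 2)

x_ij = np.array([1.0, 1.5])   # leading 1 yields an intercept that varies with u
z_ij = np.array([1.0])        # a 1 in z yields a random intercept
alpha_i = np.array([0.3])     # hypothetical subject-specific random effect

prob = smoking_probability(0.0, x_ij, z_ij, [f_intercept, f_urge], alpha_i)
```

At $u = 0$ the linear predictor in this toy setting is $-1.0 + 0.8 \times 1.5 + 0.3 = 0.5$, so `prob` is the inverse logit of 0.5.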
Additionally, this formulation can handle time-invariant risk factors, such as baseline items, by fixing $x_{ijp}$ ($z_{ijd}$) to $x_{ip}$ ($z_{id}$) for all observations $j$.

We approximate the smooth functions with spline basis functions. Specifically,

$$f_p(u_{ij}) = \mathcal{U}_{ij}' \phi_p, \tag{2.2}$$

where $\mathcal{U}_{ij}$ is a spline basis function for $u_{ij}$, and $\phi_p$ is an $r_p$-dimensional vector of corresponding spline coefficients. For simplicity, the splines are constructed with an equal number of equally spaced knots that depend on the minimum and maximum values of $u$.

2.2. Penalized Priors for the Spline Coefficients.
Using a combination of variable selection and shrinkage priors, our approach generates insights on the underlying structure of the smooth functions by reconstructing them as the summation of main effect, linear interaction, and non-linear interaction components. Formally, we rewrite Equation (Eq.) (2.2) as

$$f_p(u_{ij}) = \beta^*_p\, U^{*\prime}_{ij} \xi_p + \beta^\circ_p u_{ij} + \beta^0_p, \tag{2.3}$$

where the constant term $\beta^0_p$ captures the main effect of $x_p$, $\beta^\circ_p$ represents the effect of the linear interaction between $u$ and $x_p$, and $\beta^*_p \xi_p$ is a parameter-expanded vector of coefficients corresponding to the non-linear interaction term.

To derive the non-linear component in Eq. (2.3), we start by penalizing the spline functions in Eq. (2.2) with a second-order Gaussian random walk prior following

$$\mathcal{U} \phi_p \mid s^2 \sim N(0,\, s^2\, \mathcal{U} P^{-} \mathcal{U}'), \tag{2.4}$$

where $\mathcal{U}$ is a $\sum_{i=1}^{N}(n_i - 1) \times r_p$-dimensional matrix with each row corresponding to $\mathcal{U}_{ij}'$ for the $i$th subject at the $j$th assessment, $s^2$ controls the amount of smoothness, and $P$ is the appropriate penalty matrix (Lang and Brezger, 2004). Next, we take the spectral decomposition

$$\mathcal{U} P^{-} \mathcal{U}' = \begin{bmatrix} U_+ & U_\circ \end{bmatrix} \begin{bmatrix} V_+ & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_+' \\ U_\circ' \end{bmatrix},$$

where $U_+$ is a matrix of eigenvectors with corresponding positive eigenvalues along the diagonal of matrix $V_+$, and $U_\circ$ are the eigenvectors associated with the zero eigenvalues. Now, we can re-define the smooth functions in Eq. (2.2) as the sum of non-linear (penalized) interaction, linear (non-penalized) interaction, and main effect terms as presented in Eq. (2.3), where the penalized term is written as $U^* \varphi^*_p$ with $U^* = U_+ V_+^{1/2}$. By assuming independent normal priors for $\varphi^*_p$, a proper prior for the penalized terms that is proportional to Eq. (2.4) can be obtained.

We take two additional measures to enhance the computational efficiency of the resulting MCMC algorithm. First, only eigenvalues/vectors that explain a majority of the variability in Eq. (2.4) are used to construct $U^*$.
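The construction of $U^*$ can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a linear B-spline (hat-function) basis on equally spaced knots, takes the penalty to be $D_2' D_2$ with $D_2$ the second-difference matrix, interprets the inverse of the rank-deficient penalty as a Moore-Penrose pseudoinverse, and retains the eigenvectors that explain 99.9% of the variability. The function names and the simulated values of $u$ are hypothetical.

```python
import numpy as np

def hat_basis(u, n_basis):
    """Linear B-spline (hat function) basis on equally spaced knots over [min(u), max(u)]."""
    knots = np.linspace(u.min(), u.max(), n_basis)
    identity = np.eye(n_basis)
    # Column r is the piecewise-linear interpolant of the r-th unit vector.
    return np.column_stack([np.interp(u, knots, identity[r]) for r in range(n_basis)])

def penalized_design(basis, var_explained=0.999):
    """Return (U*, C) with U* = U_+ V_+^{1/2} from the eigendecomposition of C = B P^- B'."""
    n_basis = basis.shape[1]
    d2 = np.diff(np.eye(n_basis), n=2, axis=0)       # second-difference matrix D_2
    penalty = d2.T @ d2                               # assumed RW2 penalty, rank n_basis - 2
    cov = basis @ np.linalg.pinv(penalty) @ basis.T   # assumed form of the prior covariance
    eigval, eigvec = np.linalg.eigh(cov)              # eigenvalues in ascending order
    eigval, eigvec = eigval[::-1], eigvec[:, ::-1]    # reorder to descending
    positive = eigval > 1e-8 * eigval[0]              # drop numerically zero eigenvalues
    eigval, eigvec = eigval[positive], eigvec[:, positive]
    share = np.cumsum(eigval) / eigval.sum()
    keep = min(int(np.searchsorted(share, var_explained)) + 1, eigval.size)
    return eigvec[:, :keep] * np.sqrt(eigval[:keep]), cov

rng = np.random.default_rng(0)
u = np.sort(rng.uniform(-7.0, 28.0, size=200))  # hypothetical assessment times (days)
basis = hat_basis(u, n_basis=20)
u_star, cov = penalized_design(basis)
```

In this sketch `u_star @ u_star.T` approximates `cov` up to the discarded eigenvalues, and `u_star` has far fewer columns than the original basis, which is the source of the computational savings.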
Additionally, we apply a parameter-expansion technique for the penalized terms in $f_p(\cdot)$, setting $\varphi^*_p = \beta^*_p \xi_p$, where $\beta^*_p$ is a scalar and $\xi_p$ is a vector with the same dimension as $\varphi^*_p$. This technique enables us to perform selection on the penalized terms as a group rather than determining their inclusion separately. By rescaling $\beta^*_p$ and $\xi_p$ at each MCMC iteration, such that $|\xi_p|$ has mean equal to one, $\xi_p$ maintains the shape of the smooth function and $\beta^*_p$ represents the term's strength of association, while preserving identifiability, similar to Scheipl, Fahrmeir and Kneib (2012).

For variable selection, we impose spike-and-slab prior distributions on the $3 \times P = T$-dimensional vector $\beta = (\beta^*_1, \beta^\circ_1, \beta^0_1, \dots, \beta^*_P, \beta^\circ_P, \beta^0_P)'$. In general, the spike-and-slab prior distribution is composed of a mixture of a Dirac delta function at zero, $\delta_0(\cdot)$, and a known distribution, $S(\cdot)$, such as a normal with mean zero and diffuse variance (George and McCulloch, 1993; Brown, Vannucci and Fearn, 1998). A latent indicator variable, $\nu_t$, representing a risk factor's inclusion or exclusion in the model determines whether the risk factor's regression coefficient is set to zero (spike) or free to be estimated in the model (slab). Specifically, for a given coefficient $\beta_t$, we assume

$$\beta_t \mid \nu_t \sim \nu_t \cdot S(\beta_t) + (1 - \nu_t)\, \delta_0(\beta_t). \tag{2.5}$$

To complete the prior specification for this portion of the model, we assume that the slab component, $S(\beta_t)$, follows a $N(0, \tau^2)$ with variance $\tau^2$, and that the inclusion indicators are distributed as $\nu_t \mid \theta_t \sim \text{Bernoulli}(\theta_t)$, with prior probability of inclusion $\theta_t \sim \text{Beta}(a_{\nu_t}, b_{\nu_t})$. Integrating out $\theta_t$ we obtain $\nu_t \sim \text{Beta-Binomial}(a_{\nu_t}, b_{\nu_t})$, where hyperparameters $a_{\nu_t}$ and $b_{\nu_t}$ are set to control the sparsity in the model. Lastly, each element of $\xi_p$, $\xi_{pr}$, is assumed to follow a $N(\mu_{pr}, 1)$ with $\mu_{pr} = \pm 1$, which places $\xi_{pr}$ around $\pm 1$ [...] $\varphi^*_p$, as described above.

2.3. Prior Specification for the Random Effects.
We perform selection on the random effects, $\alpha_i$, using the modified Cholesky decomposition approach of Chen and Dunson (2003). Specifically, we reparameterize the random effects

$$\alpha_i = K \Gamma \zeta_i, \tag{2.6}$$

where $K$ is a positive diagonal matrix with elements $\kappa = (\kappa_1, \dots, \kappa_D)'$, and $\Gamma$ is a lower triangular matrix with diagonal elements set to one and free elements otherwise. To perform variable selection, we set the prior for $\kappa$ to follow a similar spike-and-slab prior distribution as in section 2.2, where the slab distribution $S(\kappa_d) = FN(m, v)$. Here, $FN$ represents a folded normal distribution defined as

$$FN(m, v) = (2\pi v)^{-1/2} \exp\left(-(\kappa_d - m)^2/(2v)\right) + (2\pi v)^{-1/2} \exp\left(-(\kappa_d + m)^2/(2v)\right),$$

where $m \in \mathbb{R}$ and $v > 0$. [...] $\kappa$ and ultimately their cluster assignments. Similar to section 2.2, we let the corresponding inclusion indicators $\lambda_d$ follow a Beta-Binomial($a_{\lambda_d}$, $b_{\lambda_d}$) to induce sparsity on the random effect terms. Next, we assume the $D(D-1)/2$ free elements of $\Gamma$ follow $N(\gamma_0, V_\gamma) \cdot I(\gamma \in Z)$, where $I$ represents an indicator function, and $Z$ represents the parameters with corresponding random effects included in the model. For example, if the $d$th random effect is included (i.e., $\lambda_d = 1$), then $\gamma_{d,1}, \dots, \gamma_{d,d-1}$ and $\gamma_{d+1,d}, \dots, \gamma_{D,d} \in Z$. Lastly, we assume $\zeta_i \sim N(0, I)$.

2.4. Spiked Nonparametric Priors.
To complete our approach, we investigate nonparametric prior constructions for the spike-and-slab components of the reparameterized fixed and random effects by assuming that the slab component follows a Dirichlet process (DP). These priors are commonly referred to as spiked DP (SDP) priors (Canale et al., 2017; Kim, Dahl and Vannucci, 2009; Savitsky and Vannucci, 2010; Dunson, Herring and Engel, 2008). In the context of our model, SDP priors allow us to simultaneously select influential risk factors while clustering effects with similar relations to the smoking outcome. The formulation we use here is sometimes referred to as an "outer" SDP prior, since the point mass at zero is outside of the base distribution of the DP. Alternatively, the "inner" construction places the spike-and-slab prior inside the DP, serving as the base distribution. The inner formulation provides the opportunity for coefficients to cluster at zero, but does not force a point mass at zero explicitly. As such, the likelihood that a coefficient is assigned to the trivial cluster grows with the number of coefficients excluded from the model. Alternatively, the outer formulation is a more informative prior, since it explicitly assigns a point mass at zero, and, in addition, carries fewer computational demands since it does not require auxiliary variables for MCMC sampling (Neal, 2000; Savitsky and Vannucci, 2010). We refer readers to Canale et al. (2017) for a detailed explanation of the structural differences between the two prior formulations.

First, we assume the regression coefficients associated with the main effects and linear interaction terms follow an SDP to provide insights on risk factors that share underlying linear trends with momentary smoking by the next assessment over the course of the study. Specifically, we assume the slab component in Eq. (2.5) is a Dirichlet process prior $H \sim DP(\vartheta, H_0)$, with base distribution $H_0 = N(0, \tau^2)$ and concentration parameter $\vartheta$. Furthermore, we assume a hyperprior $\vartheta \sim G(a_\vartheta, b_\vartheta)$, with $a_\vartheta, b_\vartheta > 0$. For the nonlinear interaction terms, we avoid the SDP since it would produce uninterpretable cluster assignments due to the parameter-expansion approach taken to improve selection performance. For example, similar values for $\beta^*_t$ and $\beta^*_{t'}$ may correspond to vastly different $\varphi^*_t$ and $\varphi^*_{t'}$, depending on their respective $\xi$ and spline basis functions. Similarly, placing a DP prior on the individual components in $\xi$, or even $\varphi$, would not provide interpretable results on the overall nonlinear effect.

We take a similar approach for the random effects. Here, we assume the slab components for the diagonal elements of $K$, $S(\kappa_d) = W$, $W \sim DP(A, W_0)$, where $W_0 = FN(m, v)$, and $A$ is the concentration parameter of the DP. To complete the prior assumptions for the random effects portion of the model, let $A \sim G(a_A, b_A)$, where $a_A, b_A > 0$. [...] $K$, while letting $\zeta_i$ follow a normal distribution centered at zero. As such, our approach avoids any identifiability issues with the fixed effects while still relaxing the parametric assumption on the reparameterized random effects, $K \Gamma \zeta_i$. It is important to note that by doing this we are adopting a Bayesian semiparametric modeling structure, since the random effects are linear combinations of spiked Dirichlet process and normal random variables (Müller, Quintana and Rosner, 2007).

2.5. Posterior Inference.
For posterior inference, we implement a Metropolis-Hastings within Gibbs algorithm. The full joint model is defined as

$$f(y \mid \varrho, \omega, x, u, z)\, p(\omega)\, p(\beta \mid \nu)\, p(\nu)\, p(\vartheta)\, p(K \mid \lambda)\, p(\lambda)\, p(A)\, p(\xi \mid \mu)\, p(\mu)\, p(\zeta)\, p(\Gamma),$$

where $\varrho = \{\beta, \xi, K, \Gamma, \zeta\}$. We use the Pólya-Gamma augmentation of Polson, Scott and Windle (2013) to efficiently sample the posterior distribution for the logistic regression model. Following Polson, Scott and Windle (2013), we express the likelihood contribution of $y_{i,j+1}$ as

$$f(y_{i,j+1} \mid \cdot) = \frac{(e^{\psi_{ij}})^{y_{i,j+1}}}{1 + e^{\psi_{ij}}} \propto \exp(k_{i,j+1} \psi_{ij}) \int_0^\infty \exp(-\omega_{i,j+1} \psi_{ij}^2 / 2)\, p(\omega_{i,j+1} \mid n_{i,j+1}, 0)\, d\omega_{i,j+1},$$

where $k_{i,j+1} = y_{i,j+1} - n_{i,j+1}/2$, $p(\omega_{i,j+1} \mid n_{i,j+1}, 0) \sim PG(n_{i,j+1}, 0)$, and $PG$ is the Pólya-Gamma distribution (here $n_{i,j+1} = 1$, since each outcome is binary). Using the notation presented in the previous sections, we set

$$\psi_{ij} = \sum_{p=1}^{P} \left(\beta^*_p\, U^{*\prime}_{ij} \xi_p + \beta^\circ_p u_{ij} + \beta^0_p\right) x_{ijp} + z_{ij}' K \Gamma \zeta_i.$$

The MCMC sampler used to implement our model is outlined below in Algorithm 1. A more detailed description of the MCMC steps as well as a graphical representation of the model are provided in the Supplementary Material. After burn-in and thinning, the remaining samples obtained from running Algorithm 1 for $\tilde{T}$ iterations are used for inference. To determine a risk factor's inclusion in the model, its marginal posterior probability of inclusion (MPPI) is empirically estimated by calculating the average of its respective inclusion indicator's MCMC samples (George and McCulloch, 1997). Note that inclusion for both fixed and random effects is determined marginally for $\beta_t$ and $\lambda_d$, respectively. Commonly, covariates are included in the model if their MPPI exceeds 0.50 (Barbieri et al., 2004) or a Bayesian false discovery rate threshold, which controls for multiplicity (Newton et al., 2004).
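The MPPI-based selection rule can be sketched as below. The array of inclusion-indicator draws is synthetic, the helper name `mppi` is hypothetical, and the burn-in and thinning values are illustrative arguments rather than settings prescribed by the model.

```python
import numpy as np

def mppi(indicator_draws, burn_in=0, thin=1):
    """Marginal posterior probability of inclusion: mean of each indicator's retained draws."""
    kept = np.asarray(indicator_draws)[burn_in::thin]  # discard burn-in, then thin
    return kept.mean(axis=0)

# Synthetic 0/1 draws: rows are MCMC iterations, columns are model terms.
rng = np.random.default_rng(1)
true_rates = np.array([0.95, 0.10, 0.60])              # made-up inclusion frequencies
draws = (rng.random((10_000, 3)) < true_rates).astype(int)

probs = mppi(draws, burn_in=5_000, thin=10)
selected = probs > 0.5   # include a term when its MPPI exceeds 0.50
```

A Bayesian false discovery rate threshold would replace the fixed cutoff of 0.5 with a data-dependent one; that variant is not shown here.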
3. Case Study.
In this section, we study the smoking behaviors in a group of adult smokers recruited from a smoking cessation research clinic. The overall research goal of this study was to identify and investigate the structural form of the relations between a set of risk factors and smoking over a five-week period surrounding a scheduled quit attempt, using intensive longitudinal data collected with EMAs.

3.1. Data Analysis.

In the study design, momentary smoking, our outcome of interest, was defined as whether or not a subject reported smoking in the 4 hours prior to the current EMA. However, at each EMA, a subject was prompted on their current psychological, social, environmental, and behavioral status. Thus, to maintain temporality in this study, we assessed the relations between momentary smoking and measurements collected in the previous EMA. As such, regression coefficients are interpreted as the log odds of momentary smoking by the next assessment for a particular risk factor.
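The lagged pairing of each EMA's risk factors with the smoking outcome reported at the following EMA can be illustrated as follows. The subject identifiers, covariate values, and outcomes are made up for illustration, and the helper `lag_pairs` is hypothetical; rows are assumed to be sorted by subject and then by assessment time.

```python
import numpy as np

def lag_pairs(subject, x, y):
    """Pair x_ij with y_{i,j+1} within each subject, dropping each subject's last EMA."""
    subject, x, y = np.asarray(subject), np.asarray(x), np.asarray(y)
    same_subject = subject[:-1] == subject[1:]  # consecutive rows from the same subject
    return subject[:-1][same_subject], x[:-1][same_subject], y[1:][same_subject]

subject = np.array([1, 1, 1, 2, 2])
urge = np.array([0.2, 1.1, -0.4, 0.9, 0.0])  # risk factor measured at each EMA
smoked = np.array([0, 1, 0, 1, 1])           # momentary smoking reported at each EMA

ids, x_lag, y_next = lag_pairs(subject, urge, smoked)
```

Each subject contributes $n_i - 1$ pairs under this scheme, so the toy data above yield three rows from five EMAs.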
Algorithm 1 MCMC Sampler

Input data $y$, $x$, $u$, $z$
Initialize parameters: $\varrho$, $\omega$, $\nu$, $\lambda$, $\vartheta$, $A$, $\mu$
Set $DP_{\bar{\beta}}$ and $DP_K$ to True or False to indicate DP for slab on fixed or random effects, respectively.
for iteration $\tilde{t} = 1, \dots, \tilde{T}$ do
    for $i = 1, \dots, N$ do
        for $j = 1, \dots, n_i - 1$ do
            Update $\omega_{i,j+1} \sim PG(1, \psi_{ij})$
        end for
    end for
    if $DP_{\bar{\beta}}$ then
        Update cluster assignment of $\bar{\beta}$ following Neal (2000) algorithm 2.
    end if
    Jointly update $\beta$ and $\nu$ with Between and Within Step following Savitsky, Vannucci and Sha (2011).
    Update $\xi$ from its full conditional distribution (FCD) $N(\mu_\xi, V_\xi)$.
    for $p = 1, \dots, P$ do
        Rescale $\xi_p$ and $\beta^*_p$ so $\varphi^*_p$ remains unchanged.
    end for
    for $p = 1, \dots, P$ do
        for $r = 1, \dots, r_p$ do
            Set $\mu_{pr} = 1$ with probability $1/(1 + \exp(-\xi_{pr}))$.
        end for
    end for
    Update $\vartheta$ by the two-step Gibbs update of Escobar and West (1995).
    if $DP_K$ then
        Update cluster assignment of $K$ following Neal (2000) algorithm 2.
    end if
    Jointly update $K$ and $\lambda$ with Between and Within Step following Savitsky, Vannucci and Sha (2011).
    Update $A$ following two-step Gibbs update of Escobar and West (1995).
    Update $\Gamma$ from FCD $N(\hat{\gamma}, \hat{V}_\gamma) \cdot I(\gamma \in Z)$.
    for $i = 1, \dots, N$ do
        Update $\zeta_i$ from FCD $N(\hat{\zeta}_i, \hat{V}_{\zeta_i})$.
    end for
end for

In this study, we investigated psychological and affective factors including urge to smoke, feelings of restlessness, negative affect (i.e., irritability, frustration/anger, sadness, worry, misery), positive affect (i.e., happiness and calmness), being bored, anxiousness, and motivation to quit smoking. Additionally, we investigated numerous social and environmental factors such as whether or not the subject was interacting with a smoker, if cigarettes were easily available (cigarette availability), and whether or not the subject was drinking alcohol (alcohol consumption). Also, we included a set of baseline, time-invariant measures (i.e., heaviness of smoking index (HSI), age (years), being female, and treatment assignment) into the model. For each of these risk factors, we included a fixed main effect, linear interaction, and non-linear interaction term as well as a random main effect and linear interaction term. All interactions investigated in this analysis were between risk factors and assessment time (i.e., $u_{ij} = t_{ij}$), and $t_{ij}$ were centered so that $t = 0$ represents the beginning of the scheduled quit attempt.

Only complete EMAs with corresponding timestamps were included in this analysis, resulting in 9,634 total observations with a median of 151 assessments per individual (IQR 101.5-162). All continuous covariates were standardized to mean zero and variance one before analysis to help reduce multicollinearity and place covariates on the same scale for interpretation. The spline functions were initially generated with 20 basis functions, but only the eigenvalues/eigenvectors that captured 99.9% of the variability were included in the model to reduce the parameter space and computation time, similar to Scheipl, Fahrmeir and Kneib (2012). This reduced the column space of the penalized covariates $U^*$ to 8 in our application.
Weapplied our model with the traditional spike-and-slab prior, as well as thespiked DP. When fitting each model, we chose a non-informative prior forthe fixed and random effects’ inclusion indicators, a ν t = b ν t = a λ d = b λ d = 1.This assumption reflects the exploratory nature of our study aimed at learn-ing potential relations between risk factors and smoking behaviors with littleor no information regarding their occurrence in the presence of other riskfactors. We assumed a mildly informative prior on the fixed regression co-efficients by setting τ = 2. This places a 95% prior probability of includedregression coefficients between an odds ratio of 0.06 and 16. Additionally, weset v = v ∗ = 10, m = m ∗ = 0, and Γ ∼ N ( γ = , V γ = I ). Lastly, whenusing the SDP prior, the hyperparameters for the concentration parameters ϑ and A were set to a ϑ = b ϑ = a A = b A = 1. For posterior inference, weran our MCMC algorithm with and without SDP priors for both fixed andrandom effects for 10,000 iterations, treating the first 5,000 as burn-in andthinning to every 10 th iteration. Trace plots of the parameters’ posterior M. KOSLOVSKY ET AL. samples indicated good convergence and mixing. Additionally, we observeda relatively high correlation ( ∼ R , for each of the selected β and K be-low 1.1 (Gelman and Rubin, 1992), further demonstrating that the MCMCprocedure was working properly and the chains converged. To assess modelfit, a residual plot and a series of posterior predictive checks were performedin which we compared replicated data sets from the posterior predictive dis-tribution of the model to the observed data (Gelman et al., 2000). Overall,we found strong evidence of good model fit. See the Supplementary Mate-rials for details. Inclusion in the model was determined using the medianmodel approach (Barbieri et al., 2004) (i.e., marginal posterior probabilityof inclusion (MPPI) ≥ . 
loo (Vehtari, Gelman and Gabry, 2016), which requires the pointwise log-likelihood for each subject $i = 1, \dots, N$ at each observation $j = 1, \dots, n_i$ calculated at each MCMC iteration $s = 1, \dots, S$, and produces an estimated $\widehat{\text{elpd}}$ value, with larger values implying a superior model.

3.2. Results.
Overall, we found better predictive performance for the model with SDP priors versus the traditional spike-and-slab priors, $\widehat{\text{elpd}}_{\text{SDP}}$ = − . $\widehat{\text{elpd}}_{\text{SS}}$ = − . 7, respectively. Plots of the marginal posterior probabilities of inclusion for the fixed and random effects selected using our proposed approach with SDP priors are found in Figure 1. Figure 2 presents the time-varying effects selected using the same model. Compared to usual care, we found a higher odds of momentary smoking by the next assessment for those assigned to the NCI QuitGuide group prior to the quit attempt. However, immediately after the quit attempt, we observed a lower odds of momentary smoking by the next assessment for those assigned to the NCI QuitGuide group, which gradually increased to the initial level over the remainder of the study (top left panel). Similarly, we observed a positive
relation between having the urge to smoke and momentary smoking by the next assessment prior to the quit attempt that diminished during the three weeks following the quit attempt, before sharply increasing during the fourth week post-quit (top right panel). Throughout the assessment period, we observed a positive relation between negative affect and momentary smoking by the next assessment that increased during the first week post-quit, leveling off at an odds ratio of 1.75 until the third week after the quit attempt. We additionally found a positive relation between cigarette availability and the odds of momentary smoking by the next assessment that strengthened over the assessment window. For a 1 SD increase in cigarette availability, the odds of momentary smoking by the next assessment increased by 300% for the typical subject one week after the quit attempt, holding all else constant. In the two lower panels of Figure 2 we observe a relatively weak, oscillating effect of being bored and interacting with a smoker on momentary smoking by the next assessment, respectively. In addition to these effects, the model identified a constant effect for alcohol consumption in the last hour and motivation to quit smoking over the assessment period. A similar set of fixed effect relations was identified by our model without the SDP prior, with the exception of not selecting being bored.

Compared to standard TVEMs, our approach deconstructs the structure of the relations between risk factors and smoking behaviors over time, aiding the interpretation of the underlying trends. This information may help the development and evaluation of tailored intervention strategies targeting smoking cessation using mHealth data. For example, negative affect has an obvious positive association with momentary smoking by the next assessment that wavers around an odds ratio of 1.2 to 1.5 for a majority of the study.
However, based on Figure 2, it is unclear whether or not the effect linearly diminishes over time. By performing selection on the main effect, linear interaction, and non-linear interaction terms separately, we are able to obtain an actual point estimate for the constant effect of negative affect (OR 1.40) as opposed to subjectively assuming a range of values from the plot. Additionally, since the linear interaction term was not selected, we can claim that the effect was not linearly decreasing over time and that it was simply wavering around the constant effect throughout the study.

Tables 1 and 2 present the estimated variances and corresponding 95% credible intervals (CI) for the random effects selected using SDP priors and traditional spike-and-slab priors, respectively. Using SDP priors, our method identified a random main effect for urge to smoke, cigarette availability, being bored, and motivation to quit smoking as well as a random linear interaction between being assigned to the SmartT treatment group, interacting with
Fig 1. Smoking Cessation Study: Marginal posterior probabilities of inclusion (MPPI) for fixed (top) and random (bottom) effects. Selected fixed effects in ascending order: NCI (NL-INTX), urge to quit (NL-INTX), cigarette availability (all), interacting with a smoker (NL-INTX), negative affect (NL-INTX, main), being bored (NL-INTX), alcohol consumption (main), motivation to quit (main), HSI (NL-INTX). Selected random effects in ascending order: urge (main), cigarette availability (main), being bored (main), motivation to quit (main), SmartT (L-INTX), interacting with a smoker (L-INTX), being bored (L-INTX). Dotted lines represent the inclusion threshold of 0.50. NL-INTX: non-linear interaction, L-INTX: linear interaction.
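As a rough illustration of how the MPPIs in Figure 1 relate to the MCMC output, the sketch below averages sampled 0/1 inclusion indicators after burn-in and thinning, then applies the 0.50 threshold of the median model. The function names, the list-of-lists input format, and the exact way burn-in and thinning are sliced are assumptions made for this example; they are not taken from the PGBVS package.

```python
def mppi(indicator_draws, burn_in=5000, thin=10):
    """Marginal posterior probability of inclusion for each term.

    indicator_draws: one list of 0/1 inclusion indicators per MCMC iteration
    (hypothetical input format chosen for this sketch).
    """
    kept = indicator_draws[burn_in::thin]  # drop burn-in, keep every `thin`-th draw
    if not kept:
        raise ValueError("no draws left after burn-in and thinning")
    n_terms = len(kept[0])
    return [sum(draw[k] for draw in kept) / len(kept) for k in range(n_terms)]


def median_model(mppi_values, threshold=0.5):
    """Indices of the terms whose MPPI meets the inclusion threshold."""
    return [k for k, p in enumerate(mppi_values) if p >= threshold]


# Toy chain: term 0 is always included, term 1 only in every 40th iteration.
draws = [[1, 1 if it % 40 == 0 else 0] for it in range(10000)]
probs = mppi(draws)
print(probs, median_model(probs))
```

With 10,000 iterations, a burn-in of 5,000 and thinning to every 10th draw, 500 draws are retained, which matches the settings described for the case study.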
Fig 2. Smoking Cessation Study: Time-varying effects on momentary smoking by the next assessment of those covariates selected by our model with SDP priors. Shaded regions represent pointwise 95% CI. Dashed lines indicate an odds ratio of one.
Random Effect                        $\hat{\sigma}$    95% CI
Intercept                            0.923    (0.539, 1.528)
Urge                                 0.152    (0.031, 0.278)
Cigarette Availability               0.865    (0.394, 1.467)
Bored                                0.183    (0.076, 0.398)
Motivation to Quit Smoking           0.156    (0.045, 0.311)
SmartT × Time                        0.077    (0.010, 0.210)
Interacting with a Smoker × Time     0.016    (0.002, 0.050)
Bored × Time                         0.002    (0.000, 0.005)

Table 1
Smoking Cessation Study: Estimated variances with corresponding 95% credible intervals (CI) for selected random effects with SDP priors based on MPPI ≥ 0.50.

smokers, and being bored with time. Thus, even though we did not discover an overall difference in the odds of momentary smoking by the next assessment for those assigned to the SmartT treatment versus usual care, we observed evidence that the subjects responded differently to the
SmartT treatment across the assessment window. With the traditional spike-and-slab priors, we found similar results overall. However, the model only selected a random main effect for interacting with smokers and additionally suggested a random effect for anxiousness.

By using SDP priors, our approach is capable of clustering covariates that share similar linear trends with momentary smoking by the next assessment over time. In practice, this information can be used to help construct decision rules when designing future intervention strategies. In our analysis, only five main effect and linear interaction terms were selected, and each of them was allocated to its own cluster. With this knowledge, researchers can prioritize targeting risk factors based on their relative strength of association at a given moment. Had some of these risk factors' effects been clustered together, researchers might rely more heavily on other pieces of information, such as the cost or success rates for a particular intervention strategy, when assessing which risk factors to target during a high-risk moment.

Similar to previous studies investigating the temporal relation between risk factors and smoking behaviors around a quit attempt, our results show a convex relation between urge to smoke and momentary smoking after the quit attempt, a positive association with cigarette availability throughout the quit attempt, and a positive, increasing relation between negative affect and momentary smoking during the first week after the quit attempt (Koslovsky et al., 2017; Vasilenko et al., 2014). Existing TVEM approaches, however, typically model the repeated measures structure of the data by simply including a random intercept term in the model, neglecting to investigate random main effects or interaction terms. They also do not incorporate variable selection. Our approach, on the other hand, delivers insights on how relations vary over time as well as how they vary across individuals.

Random Effect                  $\hat{\sigma}$    95% CI
Intercept                      1.317    (0.676, 2.487)
Urge                           0.099    (0.011, 0.248)
Cigarette Availability         0.905    (0.503, 1.607)
Interacting with a Smoker      0.848    (0.286, 1.924)
Bored                          0.244    (0.065, 0.517)
Anxiousness                    0.140    (0.001, 0.361)
Motivation to Quit Smoking     0.212    (0.076, 0.448)
SmartT × Time                  0.062    (0.016, 0.155)
Bored × Time                   0.002    (0.000, 0.004)

Table 2
Smoking Cessation Study: Estimated variances with corresponding 95% credible intervals (CI) for selected random effects with traditional spike-and-slab priors based on MPPI ≥ 0.50.

3.3. Sensitivity Analysis.
To investigate our model's sensitivity to prior specification, we set each of the hyperparameters to default values and then evaluated the effect of manipulating each term on the results obtained in section 3. For the default parameterization, we set the hyperparameters for the prior inclusion indicators $\nu$ and $\lambda$ to $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$. For interpretation, $a_{\nu_t} = b_{\nu_t} = 1$ implies that the prior probability of inclusion for a fixed effect is $a_{\nu_t} / (a_{\nu_t} + b_{\nu_t}) = 0.50$. The default values for the variance of the normal distribution for the slab of $\beta$ and $\beta^\circ$ as well as the base distribution for $\beta^*$ were each fixed at 5. Additionally, the mean and variance for the random effect terms' proposal and prior distributions were set to 0 and 5, respectively. The hyperparameters for the concentration parameters $\vartheta$ and $A$ were set to $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Lastly, we assumed Γ ∼ N(γ = , V γ = I). We ran our MCMC algorithm for 10,000 iterations, treating the first 5,000 iterations as burn-in and thinning to every 10th iteration for the SDP model, similar to our case study. For each of the fixed and random effects, inclusion in the model was determined using the median model approach (Barbieri et al., 2004).

Since the true model is never known in practice, we evaluated each model parameterization in terms of sparsity levels and overlap with the results reported in the case study section. Specifically, we present the total number of terms selected for both fixed and random effects ( random effects (r-IN and r-EX) as well as overall (IN and EX). Results of the sensitivity analysis are reported in Table 3. Compared to the results presented in the case study, we found relatively consistent overlap in the risk factors included and excluded by each model overall. We observed moderate sensitivity to hyperparameter values in terms of percent overlap for fixed and random effects of risk factors included in the model, an artifact of the relatively weak associations identified for some of the risk factors. Notably, risk factors showing stronger associations with momentary smoking at the next assessment (e.g., negative affect, cigarette availability, and motivation to quit smoking) were selected by the model regardless of prior specification. Likewise, weaker relations between momentary smoking at the next assessment and risk factors, such as being bored and interacting with a smoker, were more sensitive to hyperparameters.
We also observed that the number of selected fixed and random effects increased (decreased) as the prior probability of inclusion increased (decreased), as expected. In practice, there are a variety of factors researchers should consider when setting the prior probability of inclusion, including the aim of the research study, the desired sparsity of the model, prior knowledge of covariates' inclusion, and results from simulation and sensitivity analyses, to name a few. From a clinical perspective, $\tau = 10$ reflects a relatively diffuse prior for a given risk factor (i.e., odds ratio between 0.002 and roughly 500). To further investigate the model's sensitivity to regression coefficients' variances, we set $\tau = v = 1000$, and found somewhat similar results to the model with $\tau = v = 10$ overall (i.e., IN = 0.8, EX = 0.8). Here, we unexpectedly found non-monotonic behavior in the proportion of included and excluded terms as a function of the coefficients' variance, which might also reflect our model's sensitivity to relatively weak associations as previously noted. In theory, the selection of random effects may be sensitive to the order of the columns of $Z$, since the Cholesky decomposition is itself order dependent (Müller et al., 2013). In our case study, we did not observe any differences regarding which random effects were selected with a random permutation of the $Z$ columns. In section 5, we further demonstrate our model's robustness to the ordering of $Z$ on simulated data.
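The odds-ratio ranges quoted for the slab variances ($\tau = 2$ in the case study and $\tau = 10$ here) can be checked with a short calculation. The sketch below assumes that $\tau$ is the variance of a zero-mean normal slab on the log-odds scale, as the text describes; the function name is hypothetical.

```python
import math


def prior_odds_ratio_interval(tau, z=1.959964):
    """Central 95% prior interval, on the odds-ratio scale, for a
    coefficient with a N(0, tau) prior on the log-odds scale.

    tau is treated as a variance (an assumption consistent with the text);
    z is the 97.5% standard normal quantile.
    """
    half_width = z * math.sqrt(tau)
    return math.exp(-half_width), math.exp(half_width)


for tau in (2, 10):
    lower, upper = prior_odds_ratio_interval(tau)
    print(f"tau = {tau}: odds ratio between {lower:.3f} and {upper:.0f}")
```

For $\tau = 2$ this gives roughly 0.06 to 16, and for $\tau = 10$ roughly 0.002 to just under 500, in line with the ranges stated in the text.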
Prior specifications:
$a_{\nu_t} = a_{\lambda_d} = 1$, $b_{\nu_t} = b_{\lambda_d} = 9$    $\tau = v = 2$    $a_\vartheta = b_\vartheta = a_A = b_A$ = 0 .
$a_{\nu_t} = a_{\lambda_d} = 9$, $b_{\nu_t} = b_{\lambda_d} = 1$    $\tau = v = 10$    $a_\vartheta = b_\vartheta = a_A = b_A = 10$

Table 3
Case Study Data: Sensitivity results for the proposed model with SDP across various prior specifications. Total number of terms selected for both fixed and random effects are indicated as

4. Simulation Study.
In this section, we evaluate our model in terms of variable selection and clustering performance on simulated data similar in structure to our case study data. We compared our method, with and without SDP priors on varying-coefficient and random effects, to two other Bayesian methods designed to handle this class of models. The first is the method of Scheipl, Fahrmeir and Kneib (2012), which has previously shown promising results performing function selection in structural additive regression models using continuous spike-and-slab priors. Their approach differs from ours in that they assume parameter-expanded normal-mixture-of-inverse-gamma (peNMIG) distribution priors for selection, inspired by Ishwaran et al. (2005), and design a Metropolis-Hastings with penalized iteratively weighted least-squares algorithm for updating regression coefficients within the logistic framework. A popular alternative to spike-and-slab priors to induce sparsity in high-dimensional regression settings is to assume global-local shrinkage priors on the regression coefficients (see Van Erp, Oberski and Mulder (2019); Bhadra et al. (2019) for detailed reviews). At the request of a reviewer, we additionally compared our proposed model to a reparameterized version with shrinkage priors (Carvalho, Polson and Scott, 2009). To achieve this, we replaced the spike-and-slab priors on $\beta$ with horseshoe priors, which belong to the class of global-local scale mixtures of normal priors (Polson and Scott, 2010). For random effects, $K$, we assumed a similar global-local structure for the scale parameters of the folded-normal distribution, $v$. To our knowledge, the theoretical properties and selection performance of global-local scale mixtures of non-normal priors have yet to be explored. However, we conjectured that the global-local framework should effectively shrink inactive random effects towards zero and allow active terms to be freely estimated.
Details of the resulting model and accompanying MCMC algorithm are found in the Supplementary Material.

We simulated N = 100 subjects with 20-40 observations randomly spaced across an assessment window with t_ij ∈ [0 , x_i , comprised of an intercept term and 14 continuous covariates simulated from a N( , Σ), where $\Sigma_{st} = w^{|s-t|}$ and $w = 0.3$. To simulate time-varying covariate trajectories, we randomly jittered half of the elements within x_i by N(0 , z_ij = x_ij. Thus, each full model contained 15 main effects, linear interactions, non-linear interactions, and random main effects, corresponding to 60 potential terms (or groups of terms for the non-linear interaction components) to select. The first 5 functional terms in the true model were defined as
• f(t_ij) = π sin(3π t_ij) + 1 . t_ij − .
• f(t_ij) = π cos(2π t_ij) + 1 .
• f(t_ij) = −π t sin(5π t_ij) + 1 . t_ij − .
• f(t_ij) = − . t_ij + 1 .
• f(t_ij) = − . ,
and the random effects a_i ∼ N( , Σ_α) with σ_kk = 0.75 and σ_jk = 0 . j, k = 1, ..., 5. Thus in the true model, $\psi_{ij} = \sum_{p=1} f_p(t_{ij}) x_{ijp} + z'_{ij} a_i$. Note that to impose an inherent clustering for the main effects and linear interaction terms, their values were specified to center around ± . th iteration for each model. The spline functions were generated similar to our application. We set the hyperparameters for the inclusion indicators, $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$, imposing a non-informative prior for selection of fixed and random effect terms. Additionally, we fixed the regression coefficient hyperparameters to $\tau = 2$ and $m = 0$ with $v = 10$. For the concentration parameters $\vartheta$ and $A$, we assumed $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Before analysis, the covariates were standardized to mean 0 and variance 1.

For each of the models with spike-and-slab priors, inclusion in the model for both fixed and random effects was determined using the median model approach (Barbieri et al., 2004). For the horseshoe model, fixed effects were considered active if their corresponding 95% credible interval did not contain zero, similar to Bhadra et al. (2019). The 95% credible interval for random effects will almost surely not contain zero. As a naive alternative, we assumed a random effect was active in the model if its posterior mean exceeded a given threshold. For the sake of demonstration, we evaluated the performance of the model over a grid of potential threshold values, and presented the results for the best performing model overall. Notably, this solution is only feasible when the true answer is known, which is never the case in practice. Variable selection performance was evaluated via sensitivity (SENS), specificity (SPEC), and Matthews correlation coefficient (MCC) for fixed and random effects separately. These metrics are defined as

$$\text{SENS} = \frac{TP}{FN + TP}, \qquad \text{SPEC} = \frac{TN}{FP + TN}, \qquad \text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$

where $TN$, $TP$, $FN$, and $FP$ represent the true negatives, true positives, false negatives, and false positives, respectively. For the SDP models, clusters of regression coefficients were determined using sequentially-allocated latent structure optimization to minimize the lower bound of the variation of information loss (Wade et al., 2018; Dahl and Muller, 2017). Once clusters were determined, clustering performance was evaluated using the variation of information, a measure of distance between two clusterings ranging from 0 to $\log R$, where $R$ is the number of items to cluster and lower values imply better clustering (Meilă, 2003).

Figure 3 presents the estimated smooth functions obtained using our proposed method with SDP priors on a randomly selected replicated data set from the simulation study. Here, f(t_ij) represents the global intercept comprised of a main effect, linear interaction, and non-linear interaction term that were forced into the model. Of interest is the ability of the model to properly select the influential components in f(t_ij) and f(t_ij) and additionally capture their structure. Using the method proposed in Dahl and Muller (2017) to identify latent clusters of fixed main effect and linear interaction terms, our method successfully clustered the linear interaction in f(t_ij) and the main effects in f(t_ij) and f(t_ij), while incorrectly assigning the linear interaction term in f(t_ij) to its own cluster. Additionally, the main effects in f(t_ij), f(t_ij), and f(t_ij) were appropriately clustered together, while the linear interaction term in f(t_ij) was incorrectly assigned to its own cluster. The remaining, uninfluential terms were all allocated to the trivial group. Despite f(t_ij) and f(t_ij) having similar main effect and linear interaction terms, they are dramatically different in terms of their non-linear interaction terms.
However, by clustering their underlying linear trajectories, our model with SDP priors was able to uncover similarities in their relations with the outcome over time that traditional approaches would fail to discover.

Table 4 reports results for our proposed method with SDP priors (PGBVSDP), our proposed method without SDP priors (PGBVS), peNMIG, and our model with horseshoe priors (PGHS) in terms of average sensitivity, specificity, and MCC for fixed (fSENS, fSPEC, fMCC) and random (rSENS, rSPEC, rMCC) effects across the replicate data sets with standard errors in parentheses. Additionally, for the PGBVSDP model, we provide clustering performance results for fixed (fCLUST) and random effects (rCLUST). Since each of the random effects was simulated similarly, clusterings were compared to a single cluster for the non-zero terms. Overall, the methods had relatively similar results for fixed effects, with PGBVS and PGHS performing the best in terms of sensitivity (1.00 and 1.00) and MCC (0.96 and 0.99), respectively. Our method with SDP priors, PGBVSDP, obtained the highest specificity for fixed effects overall. Given that the maximum possible values fCLUST and rCLUST could take on were 3.4 and 2.7, respectively, we found fairly strong clustering performance for both fixed (0.39) and random (0.92) effects with PGBVSDP. We observed more variability in the selection of random effects across models. Random effect selection sensitivity was significantly lower compared to the fixed effects for all of the models.

Fig 3. Simulated Data: Estimated smooth function f(t_ij), f(t_ij), f(t_ij) for a randomly selected replicate data set generated in the simulation study. The estimated smooth function is represented by a solid black line with pointwise 95% credible regions in grey. Dashed lines represent the true log odds ratios as a function of time.

            PGBVSDP        PGBVS          peNMIG          PGHS
fSENS       0.96 (0.09)    1.00 (0.02)    0.93 (0.11)     1.00 (0.00)
fSPEC       0.99 (0.02)    0.98 (0.02)    0.94 (0.04)     0.96 (0.01)
fMCC        0.94 (0.08)    0.96 (0.05)    0.83 (0.10)     0.99 (0.02)
fCLUST      0.39 (0.21)    -              -               -
rSENS       0.76 (0.21)    0.62 (0.25)    0.46 (0.23)     0.86 (0.24)
rSPEC       0.88 (0.10)    0.96 (0.05)    0.64 (0.16)     0.90 (0.11)
rMCC        0.63 (0.26)    0.64 (0.23)    0.11 (0.33)     0.76 (0.21)
rCLUST      0.92 (0.50)    -              -               -
Time (s)    4658 (271)     2235 (46)      10720 (1116)    3076 (74)

Table 4
Simulated Data: Results for the proposed model with and without the SDP on regression coefficients compared to peNMIG (Scheipl, Fahrmeir and Kneib, 2012) and our model with horseshoe priors (Carvalho, Polson and Scott, 2009). Results are averaged over 50 replicate data sets with standard deviations in parentheses.
In terms of specificity (1 − false positive rate) for random effects, our methods, regardless of prior formulation, dramatically outperformed peNMIG, with PGBVS obtaining the highest specificity overall (0.96). However, PGBVSDP and PGBVS had lower sensitivity with respect to random effects compared to PGHS. While PGHS performed well separating active from inactive random effects, recall that the truth was used to select the optimal selection threshold. The improved performance of PGBVS, PGBVSDP, and PGHS in terms of variable selection was achieved in considerably less computation time compared to peNMIG. Our core method was able to run 7,500 iterations in a fifth of the time compared to peNMIG, accessed via Scheipl (2011). Using the SDP priors, which require additional updates for clustering the regression coefficients, we observed a two-fold increase in computation time for PGBVSDP compared to PGBVS. However, on average, the PGBVSDP approach still achieved about a 50% reduction in computation time compared to peNMIG. It is important to note that, for comparison, all algorithms were run in series, even though the R package spikeSlabGAM (Scheipl, 2011) provides functionality to run multiple chains in parallel.
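For readers who want to reproduce the selection metrics summarized in Table 4 on their own output, the sketch below computes SENS, SPEC, and MCC from a vector of true activity indicators and a vector of selected terms. The function name and the toy vectors are illustrative assumptions, and returning 0 for the MCC when its denominator is zero is a common convention rather than something specified in the paper.

```python
import math


def selection_metrics(truth, selected):
    """Sensitivity, specificity and Matthews correlation coefficient
    for binary (0/1) vectors of true and selected terms."""
    pairs = list(zip(truth, selected))
    tp = sum(1 for t, s in pairs if t and s)
    tn = sum(1 for t, s in pairs if not t and not s)
    fp = sum(1 for t, s in pairs if not t and s)
    fn = sum(1 for t, s in pairs if t and not s)
    sens = tp / (fn + tp) if (fn + tp) else float("nan")
    spec = tn / (fp + tn) if (fp + tn) else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sens, spec, mcc


# Toy example: 5 active terms out of 15; one is missed and one inactive term is selected.
truth = [1] * 5 + [0] * 10
selected = [1, 1, 1, 1, 0] + [1] + [0] * 9
print(selection_metrics(truth, selected))
```

In the toy example there are 4 true positives, 1 false negative, 1 false positive, and 9 true negatives, so the three metrics are 0.8, 0.9, and 0.7.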
5. Sensitivity Analysis.
To assess the model's sensitivity to hyperparameter settings, we set each of the hyperparameters to default values and then evaluated the effect of manipulating each term on selection and clustering performance. For the default parameterization, we set the hyperparameters for the prior inclusion indicators $\nu$ and $\lambda$ to $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$. The default values for the variance of the normal distribution for the slab of $\beta$ and $\beta^\circ$ as well as the base distribution for $\beta^*$ were each fixed at 5. Additionally, the mean and variance for the random effect terms' proposal and prior distributions were set to 0 and 5, respectively. The hyperparameters for the concentration parameters $\vartheta$ and $A$ were set to $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Lastly, we assumed Γ ∼ N(γ = , V γ = I). We ran our MCMC algorithm on the 50 replicated data sets generated in the simulation study, using 7,500 iterations, treating the first 3,750 iterations as burn-in and thinning to every 10th iteration for the SDP model.

Results of the sensitivity analysis are reported in Table 5. As expected, we found that the sensitivity (specificity) increased (decreased) as the prior probability of inclusion for the fixed and random effects increased. The model did not seem sensitive to the variance assumed for the normal and folded normal priors assigned to the fixed and random effect slab distributions, respectively. Similarly, we found comparable results in terms of sensitivity and specificity for different values of the concentration parameters' hyperparameters. In terms of clustering, we saw marginally better variation of information measures with larger concentration parameter hyperparameters. However, across simulation runs, we observed relatively high standard errors in terms of the variation of information measures. To assess potential sensitivity to the order of random effects in our simulations, we re-ran the simulation study with a random permutation of the columns of $Z$. Similar to the case study, we found no evidence of sensitivity to random effect ordering with our model, as the results were almost identical to those presented in Table 4 with PGBVSDP (rSENS = 0.76 (0.20), rSPEC = 0.87 (0.09), rMCC = 0.62 (0.20), rCLUST = 0.94 (0.42)).
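The fCLUST and rCLUST values are variation of information distances, which may be less familiar than sensitivity or specificity. The sketch below computes the distance between two clusterings given as label lists, using natural logarithms; the function name and the toy labels are assumptions for illustration and are not taken from the PGBVS package.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Variation of information (Meila, 2003) between two clusterings
    of the same items, computed with natural logarithms."""
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must label the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # VI = H(A|B) + H(B|A), accumulated over the non-empty joint cells.
        vi -= p_ab * (math.log(p_ab / (count_a[a] / n)) + math.log(p_ab / (count_b[b] / n)))
    return vi


# The upper bound log R is reached when all singletons are compared with one big cluster.
print(variation_of_information(list(range(30)), [0] * 30), math.log(30))
print(variation_of_information(list(range(15)), [0] * 15), math.log(15))
```

The two printed bounds, log 30 ≈ 3.4 and log 15 ≈ 2.7, agree with the maximum fCLUST and rCLUST values quoted in the simulation study, which is consistent with 30 fixed main effect and linear interaction terms and 15 random effects being clustered.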
            $a_{\nu_t} = a_{\lambda_d} = 1$, $b_{\nu_t} = b_{\lambda_d} = 9$    $\tau = v = 2$    $a_\vartheta = b_\vartheta = a_A = b_A$ = 0 .
            $a_{\nu_t} = a_{\lambda_d} = 9$, $b_{\nu_t} = b_{\lambda_d} = 1$    $\tau = v = 10$    $a_\vartheta = b_\vartheta = a_A = b_A = 10$
fSENS       0.99 (0.03)    0.96 (0.07)    0.94 (0.11)
fSPEC       0.96 (0.03)    0.99 (0.02)    0.99 (0.02)
fMCC        0.91 (0.06)    0.95 (0.07)    0.93 (0.11)
fCLUST      0.40 (0.20)    0.39 (0.23)    0.41 (0.24)
rSENS       0.85 (0.20)    0.78 (0.20)    0.74 (0.23)
rSPEC       0.84 (0.10)    0.89 (0.10)    0.86 (0.10)
rMCC        0.66 (0.19)    0.67 (0.25)    0.60 (0.27)
rCLUST      0.84 (0.49)    0.89 (0.47)    0.97 (0.49)

Table 5
Simulated Data: Sensitivity results for the proposed model with SDP on regression coefficients. Results are averaged over 50 replicated data sets with standard errors in parentheses.

6. Conclusions.
In this paper, we have investigated intensive longitudinal data, collected in a novel, smartphone-based smoking cessation study to better understand the relation between potential risk factors and smoking behaviors in the critical moments surrounding a quit attempt, using a semiparametric Bayesian time-varying effect modeling framework. Unlike standard TVEMs, our approach deconstructs the structure of the relations between risk factors and smoking behaviors over time, which aids in formulating hypotheses regarding dynamic relations between risk factors and smoking in the critical moments around a quit attempt. By performing variable selection on random effects, the approach delivers additional insights on how relations vary over time as well as how they vary across individuals. Furthermore, the use of non- and semiparametric prior constructions allows simultaneous variable selection for fixed and random effects while learning latent clusters of regression coefficients. As such, our model is designed to discover various forms of latent structures within the data without requiring strict model assumptions or burdensome tuning procedures. Results from our analysis have confirmed previously identified temporal relations between smoking behaviors and urge to smoke, cigarette availability, and negative affect. They have also identified subject-specific heterogeneity in the effects of urge to smoke, cigarette availability, and motivation to quit. Additionally, we have found that subjects differed in how they responded to the SmartT treatment (compared to usual care), interacting with a smoker, and being bored over time. This has practical relevance as researchers can use this information to design adaptive interventions that prioritize targeting risk factors based on their relative strength of association at a given moment. They also reinforce the importance of designing dynamic intervention strategies that are adaptive to subjects' current risk profiles.

Throughout this work, we have demonstrated how our method is well-suited to aid the development and evaluation of future JITAI strategies targeting smoking cessation using mHealth data. The existing
SmartT algorithm delivers treatment based on the presence of six lapse triggers, which are weighted based on their relative importance in predicting risk of lapse (Businelle et al., 2016). The results of this study allow for a more dynamic algorithm that takes into account not only the time-varying relationships between psychosocial and environmental variables and smoking lapse, but also the different ways in which individuals experience a quit attempt. For example, the results suggest that providing momentary support to cope with urge to smoke and negative affect may be more useful if delivered in the early stages of a quit attempt, but becomes less important by week 4 post-quit. However, messages that address cigarette availability, alcohol consumption, and motivation to quit smoking may be a more important focus for the entire quit attempt. Although the findings for this small sample may not be generalizable to larger, more diverse populations, these methods are the next step in developing a personalized smoking risk algorithm that can inform highly specific, individualized treatment for each smoker.

It is important to note that selection of a risk factor by our proposed method (or any variable selection technique) does not imply clinical significance. Notably, the pointwise credible intervals often contained odds ratios of one and most risk factors were only influential for brief moments throughout the study period. While these results highlight the importance of understanding risk factors' dynamic relations with smoking to design tailored intervention strategies, we recommend using our method for hypothesis generation in practice and conducting confirmatory studies before generalizing results.

Compliance rates for EMA studies typically range between 70% and 90%, with a recommended threshold of 80% (Jones, Xu and Grunwald, 2006). In our case study, the compliance rate was 84%. Additionally, 97.3% of all assessments were completed once initiated, and subjects were unable to skip questions within an assessment. Since subjects were assessed multiple times per day, nonresponse was attributed more to situational context (e.g., driving) than smoking status. Thus, for this study, we found the missing completely at random assumption for missing observations justified.
However, future studies may consider the development of advanced analytical methods for EMA data sets that can handle different types of missingness assumptions and other potential biases, such as social desirability bias.

In this analysis, we focus on time-varying effects due to their recent popularity in smoking behavior research (Tan et al., 2012; Shiyko et al., 2012; Vasilenko et al., 2014; Koslovsky et al., 2017; Lanza et al., 2013; Shiyko et al., 2014). A promising alternative for investigating the complexity of smoking behaviors around a quit attempt is the varying index coefficient model, which allows a covariate's effect to vary as a function of multiple other variables (Ma and Song, 2015). By incorporating variable selection priors, researchers could identify which variables are responsible for modifying a covariate's effect. Oftentimes behavioral researchers are interested in exploring other forms of latent structure, such as clusters of individuals who respond similarly to treatments or have similar risk profiles over time. Taking advantage of the flexibility and efficiency of our approach, future work could extend our core model to address these research questions by recasting it into a mixture modeling framework. In addition, while we have developed our method for binary outcomes due to their prevalence in smoking behavior research studies, our approach is easily adaptable to other data structures found within and outside of smoking behavior research, such as time to event data (Sha, Tadesse and Vannucci, 2006) and continuous outcomes. While our method borrows information across regression coefficients, we avoided imposing structure among covariates via heredity constraints, which restrict the model space for higher order terms depending on the inclusion status of the lower order terms that comprise them.
Researchers interested in extending our approach to accommodate these, and other forms of, hierarchical constraints may adjust the prior probabilities of inclusion (Chipman, 1996). Lastly, while we were hesitant to present variable selection results for PGHS, due to the limited understanding of global-local priors for non-Gaussian distributions, this approach showed good results in simulations. Furthermore, when applied to the case study data, we obtained promising predictive performance (i.e., $\widehat{\mathrm{elpd}}_{\mathrm{HS}} = -\ldots.5$) that warrants future investigation of its theoretical properties.
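To make the heredity constraints discussed above concrete, the following minimal Python sketch shows one way a prior probability of inclusion for an interaction term can depend on the inclusion status of its parent main effects, in the spirit of Chipman (1996). It is an illustration only and is not part of the PGBVS package; the function names, the two-covariate setup, and the numerical probabilities are all hypothetical choices made for this example.

```python
from itertools import product


def interaction_inclusion_prior(gamma_a, gamma_b, p11=0.5, p10=0.0, p01=0.0, p00=0.0):
    """Prior probability that the A:B interaction is included,
    given the inclusion indicators (0 or 1) of main effects A and B.
    The p-values are hypothetical tuning parameters."""
    table = {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}
    return table[(gamma_a, gamma_b)]


def model_prior(gamma_a, gamma_b, gamma_ab, p_main=0.5, **heredity):
    """Joint prior probability of one model (gamma_a, gamma_b, gamma_ab):
    independent Bernoulli(p_main) priors on the main effects times a
    parent-dependent Bernoulli prior on the interaction."""
    p_int = interaction_inclusion_prior(gamma_a, gamma_b, **heredity)
    prob = 1.0
    for g in (gamma_a, gamma_b):
        prob *= p_main if g else 1.0 - p_main
    prob *= p_int if gamma_ab else 1.0 - p_int
    return prob


# Strong heredity: the interaction is allowed only if both parents are included.
STRONG = {"p11": 0.5, "p10": 0.0, "p01": 0.0, "p00": 0.0}
# Weak heredity: the interaction is allowed if at least one parent is included.
WEAK = {"p11": 0.5, "p10": 0.25, "p01": 0.25, "p00": 0.0}

if __name__ == "__main__":
    for name, setting in (("strong", STRONG), ("weak", WEAK)):
        # The prior over all eight candidate models sums to one.
        total = sum(model_prior(a, b, ab, **setting)
                    for a, b, ab in product((0, 1), repeat=3))
        # Prior mass of the model with A and A:B but without B.
        orphan = model_prior(1, 0, 1, **setting)
        print(name, "total prior mass:", total, "P(A, A:B without B):", orphan)
```

Under the strong heredity setting, any model that contains the interaction without both main effects receives zero prior mass, whereas the weak heredity setting only requires at least one parent to be included.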
Acknowledgements.
Matthew Koslovsky is supported by NSF via the Research Training Group award DMS-1547433.
Supplementary Material.
R-package for PGBVS: R-package PGBVS contains code to perform the methods described in the article. The package also contains functionality for reproducing the data used in the sensitivity and simulation studies and for posterior inference. The R package is located at https://github.com/mkoslovsky/PGBVS.
Supplementary Information: This file contains a description of the full joint distribution of our model with a graphical representation, a detailed description of our proposed MCMC algorithm with and without SDP priors, and derivations for the prior marginal likelihood used to sample latent cluster assignments. Additionally, we include details of the goodness-of-fit analysis for the case study.
References.
Barbieri, M. M., Berger, J. O. et al. (2004). Optimal predictive model selection. The Annals of Statistics.
Berardi, V., Carretero-González, R., Bellettiere, J., Adams, M. A., Hughes, S. and Hovell, M. (2018). A Markov approach for increasing precision in the assessment of data-intensive behavioral interventions. Journal of Biomedical Informatics.
Bhadra, A., Datta, J., Polson, N. G., Willard, B. et al. (2019). Lasso meets horseshoe: A survey. Statistical Science.
Bolman, C., Verboon, P., Thewissen, V., Boonen, V., Soons, K. and Jacobs, N. (2018). Predicting smoking lapses in the first week of quitting: an ecological momentary assessment study. Journal of Addiction Medicine.
Brook, D. W., Brook, J. S., Zhang, C., Whiteman, M., Cohen, P. and Finch, S. J. (2008). Developmental trajectories of cigarette smoking from adolescence to the early thirties: personality and behavioral risk factors. Nicotine & Tobacco Research.
Brown, P. J., Vannucci, M. and Fearn, T. (1998). Multivariate Bayesian variable selection and prediction. Journal of the Royal Statistical Society: Series B (Statistical Methodology).
Businelle, M. S., Ma, P., Kendzor, D. E., Reitzel, L. R., Chen, M., Lam, C. Y., Bernstein, I. and Wetter, D. W. (2014). Predicting quit attempts among homeless smokers seeking cessation treatment: an ecological momentary assessment study. Nicotine & Tobacco Research.
Businelle, M. S., Ma, P., Kendzor, D. E., Frank, S. G., Vidrine, D. J. and Wetter, D. W. (2016). An ecological momentary intervention for smoking cessation: evaluation of feasibility and effectiveness. Journal of Medical Internet Research e321.
Cai, B. and Bandyopadhyay, D. (2017). Bayesian semiparametric variable selection with applications to periodontal data. Statistics in Medicine.
Canale, A., Lijoi, A., Nipoti, B. and Prünster, I. (2017). On the Pitman–Yor process with spike and slab base measure. Biometrika.
Carvalho, C. M., Polson, N. G. and Scott, J. G. (2009). Handling sparsity via the horseshoe. In Artificial Intelligence and Statistics.
Chen, Z. and Dunson, D. B. (2003). Random effects selection in linear mixed models. Biometrics.
Cheng, J., Edwards, L. J., Maldonado-Molina, M. M., Komro, K. A. and Muller, K. E. (2010). Real longitudinal data analysis for real people: building a good enough mixed model. Statistics in Medicine.
Chipman, H. (1996). Bayesian variable selection with related predictors. The Canadian Journal of Statistics.
Cursio, J. F., Mermelstein, R. J. and Hedeker, D. (2019). Latent trait shared-parameter mixed models for missing ecological momentary assessment data. Statistics in Medicine.
Dahl, D. B. and Muller, P. (2017). sdols: Summarizing Distributions of Latent Structures. R package version 1.4.
de Haan-Rietdijk, S., Kuppens, P., Bergeman, C. S., Sheeber, L., Allen, N. and Hamaker, E. (2017). On the use of mixed Markov models for intensive longitudinal data. Multivariate Behavioral Research.
Dunson, D. B., Herring, A. H. and Engel, S. M. (2008). Bayesian selection and clustering of polymorphisms in functionally related genes. Journal of the American Statistical Association.
Dziak, J. J., Li, R., Tan, X., Shiffman, S. and Shiyko, M. P. (2015). Modeling intensive longitudinal data with mixtures of nonparametric trajectories and time-varying effects. Psychological Methods.
Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association.
Fitzmaurice, G. M., Laird, N. M. and Ware, J. H. (2012). Applied Longitudinal Analysis. John Wiley & Sons.
Geiser, C., Bishop, J., Lockhart, G., Shiffman, S. and Grenard, J. L. (2013). Analyzing latent state-trait and multiple-indicator latent growth curve models as multilevel structural equation models. Frontiers in Psychology.
Gelfand, A. E. (1996). Model determination using sampling-based methods. Markov chain Monte Carlo in practice.
Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science.
Gelman, A., Goegebeur, Y., Tuerlinckx, F. and Van Mechelen, I. (2000). Diagnostic checks for discrete data regression models using posterior predictive simulations. Journal of the Royal Statistical Society: Series C (Applied Statistics).
George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association.
George, E. I. and McCulloch, R. E. (1997). Approaches for Bayesian variable selection. Statistica Sinica.
Hastie, T. and Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal Statistical Society: Series B (Methodological).
Heron, K. E. and Smyth, J. M. (2010). Ecological momentary interventions: Incorporating mobile technology into psychosocial and health behaviour treatments. British Journal of Health Psychology.
Hui, F. K., Müller, S. and Welsh, A. (2017). Hierarchical selection of fixed and random effects in generalized linear mixed models. Statistica Sinica.
Ishwaran, H., Rao, J. S. et al. (2005). Spike and slab variable selection: frequentist and Bayesian strategies. The Annals of Statistics.
Jones, R. H., Xu, S. and Grunwald, G. K. (2006). Continuous time Markov models for binary longitudinal data. Biometrical Journal.
Kim, S., Dahl, D. B. and Vannucci, M. (2009). Spiked Dirichlet process prior for Bayesian multiple hypothesis testing in random effects models. Bayesian Analysis (Online).
Kinney, S. K. and Dunson, D. B. (2007). Fixed and random effects selection in linear and logistic models. Biometrics.
Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A. and Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology.
Koslovsky, M. D., Hébert, E. T., Swartz, M. D., Chan, W., Leon-Novelo, L., Wilkinson, A. V., Kendzor, D. E. and Businelle, M. S. (2017). The time-varying relations between risk factors and smoking before and after a quit attempt. Nicotine & Tobacco Research.
Koslovsky, M. D., Swartz, M. D., Chan, W., Leon-Novelo, L., Wilkinson, A. V., Kendzor, D. E. and Businelle, M. S. (2018). Bayesian variable selection for multi-state Markov models with interval-censored data in an ecological momentary assessment study of smoking cessation. Biometrics.
Kürüm, E., Li, R., Shiffman, S. and Yao, W. (2016). Time-varying coefficient models for joint modeling binary and continuous outcomes in longitudinal data. Statistica Sinica.
Lang, S. and Brezger, A. (2004). Bayesian P-splines. Journal of Computational and Graphical Statistics.
Lanza, S. T., Vasilenko, S. A., Liu, X., Li, R. and Piper, M. E. (2013). Advancing the understanding of craving during smoking cessation attempts: A demonstration of the time-varying effect model. Nicotine & Tobacco Research.
Li, Y., Müller, P. and Lin, X. (2011). Center-adjusted inference for a nonparametric Bayesian random effect distribution. Statistica Sinica.
Luckett, D. J., Laber, E. B., Kahkoska, A. R., Maahs, D. M., Mayer-Davis, E. and Kosorok, M. R. (2019). Estimating dynamic treatment regimes in mobile health using V-learning. Journal of the American Statistical Association.
Ma, S. and Song, P. X.-K. (2015). Varying index coefficient models. Journal of the American Statistical Association.
Mason, M., Mennis, J., Way, T., Lanza, S., Russell, M. and Zaharakis, N. (2015). Time-varying effects of a text-based smoking cessation intervention for urban adolescents. Drug and Alcohol Dependence.
McCarthy, D. E., Ebssa, L., Witkiewitz, K. and Shiffman, S. (2016). Repeated measures latent class analysis of daily smoking in three smoking cessation studies. Drug and Alcohol Dependence.
McClure, J. B., Anderson, M. L., Bradley, K., An, L. C. and Catz, S. L. (2016). Evaluating an adaptive and interactive mHealth smoking cessation and medication adherence program: a randomized pilot feasibility study. JMIR mHealth and uHealth e94.
Meilă, M. (2003). Comparing clusterings by the variation of information. In Learning Theory and Kernel Machines.
Minami, H., Yeh, V. M., Bold, K. W., Chapman, G. B. and McCarthy, D. E. (2014). Relations among affect, abstinence motivation and confidence, and daily smoking lapse risk. Psychology of Addictive Behaviors.
Müller, P., Quintana, F. A. and Rosner, G. L. (2007). Semiparametric Bayesian inference for multilevel repeated measurement data. Biometrics.
Müller, S., Scealy, J. L., Welsh, A. H. et al. (2013). Model selection in linear mixed models. Statistical Science.
Nahum-Shani, I., Smith, S. N., Spring, B. J., Collins, L. M., Witkiewitz, K., Tewari, A. and Murphy, S. A. (2017). Just-in-time adaptive interventions (JITAIs) in mobile health: Key components and design principles for ongoing health behavior support. Annals of Behavioral Medicine.
Naughton, F., Hopewell, S., Lathia, N., Schalbroeck, R., Brown, C., Mascolo, C., McEwen, A. and Sutton, S. (2016). A context-sensing mobile phone app (Q sense) for smoking cessation: a mixed-methods study. JMIR mHealth and uHealth e106.
Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics.
Newton, M. A., Noueiry, A., Sarkar, D. and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics.
Piasecki, T. M., Fiore, M. C., McCarthy, D. E. and Baker, T. B. (2002). Have we lost our way? The need for dynamic formulations of smoking relapse proneness. Addiction.
Piasecki, T. M., Jorenby, D. E., Smith, S. S., Fiore, M. C. and Baker, T. B. (2003). Smoking withdrawal dynamics: II. Improved tests of withdrawal-relapse relations. Journal of Abnormal Psychology.
Piasecki, T. M., Trela, C. J., Hedeker, D. and Mermelstein, R. J. (2013). Smoking antecedents: Separating between- and within-person effects of tobacco dependence in a multiwave ecological momentary assessment investigation of adolescent smoking. Nicotine & Tobacco Research S119–S126.
Polson, N. G. and Scott, J. G. (2010). Shrink globally, act locally: Sparse Bayesian regularization and prediction. Bayesian Statistics.
Polson, N. G., Scott, J. G. and Windle, J. (2013). Bayesian inference for logistic models using Pólya–Gamma latent variables. Journal of the American Statistical Association.
Riley, W. T., Rivera, D. E., Atienza, A. A., Nilsen, W., Allison, S. M. and Mermelstein, R. (2011). Health behavior models in the age of mobile interventions: Are our theories up to the task? Translational Behavioral Medicine.
Rivera, D. E., Pew, M. D. and Collins, L. M. (2007). Using engineering control principles to inform the design of adaptive interventions: A conceptual introduction. Drug and Alcohol Dependence S31–S40.
Savitsky, T. and Vannucci, M. (2010). Spiked Dirichlet process priors for Gaussian process models. Journal of Probability and Statistics.
Savitsky, T., Vannucci, M. and Sha, N. (2011). Variable selection for nonparametric Gaussian process priors: Models and computational strategies. Statistical Science: A Review Journal of the Institute of Mathematical Statistics.
Scheipl, F. (2011). spikeSlabGAM: Bayesian variable selection, model choice and regularization for generalized additive mixed models in R. arXiv preprint arXiv:1105.5253.
Scheipl, F., Fahrmeir, L. and Kneib, T. (2012). Spike-and-slab priors for function selection in structured additive regression models. Journal of the American Statistical Association.
Selya, A. S., Updegrove, N., Rose, J. S., Dierker, L., Tan, X., Hedeker, D., Li, R. and Mermelstein, R. J. (2015). Nicotine-dependence-varying effects of smoking events on momentary mood changes among adolescents. Addictive Behaviors.
Sha, N., Tadesse, M. G. and Vannucci, M. (2006). Bayesian variable selection for the analysis of microarray data with censored outcomes. Bioinformatics.
Shiffman, S. (2013). Conceptualizing analyses of ecological momentary assessment data. Nicotine & Tobacco Research S76–S87.
Shiffman, S., Paty, J. A., Gnys, M., Kassel, J. A. and Hickcox, M. (1996). First lapses to smoking: within-subjects analysis of real-time reports. Journal of Consulting and Clinical Psychology.
Shiffman, S., Balabanis, M. H., Paty, J. A., Engberg, J., Gwaltney, C. J., Liu, K. S., Gnys, M., Hickcox, M. and Paton, S. M. (2000). Dynamic effects of self-efficacy on smoking lapse and relapse. Health Psychology.
Shiffman, S., Gwaltney, C. J., Balabanis, M. H., Liu, K. S., Paty, J. A., Kassel, J. D., Hickcox, M. and Gnys, M. (2002). Immediate antecedents of cigarette smoking: an analysis from ecological momentary assessment. Journal of Abnormal Psychology.
Shiyko, M. P., Lanza, S. T., Tan, X., Li, R. and Shiffman, S. (2012). Using the time-varying effect model (TVEM) to examine dynamic associations between negative affect and self confidence on smoking urges: Differences between successful quitters and relapsers. Prevention Science.
Shiyko, M., Naab, P., Shiffman, S. and Li, R. (2014). Modeling complexity of EMA data: time-varying lagged effects of negative affect on smoking urges for subgroups of nicotine addiction. Nicotine & Tobacco Research S144–S150.
Tan, X., Shiyko, M. P., Li, R., Li, Y. and Dierker, L. (2012). A time-varying effect model for intensive longitudinal data. Psychological Methods.
Timms, K. P., Rivera, D. E., Collins, L. M. and Piper, M. E. (2013). A dynamical systems approach to understanding self-regulation in smoking cessation behavior change. Nicotine & Tobacco Research S159–S168.
Trail, J. B., Collins, L. M., Rivera, D. E., Li, R., Piper, M. E. and Baker, T. B. (2014). Functional data analysis for dynamical system identification of behavioral processes. Psychological Methods.
Van Erp, S., Oberski, D. L. and Mulder, J. (2019). Shrinkage priors for Bayesian penalized regression. Journal of Mathematical Psychology.
Vasilenko, S. A., Piper, M. E., Lanza, S. T., Liu, X., Yang, J. and Li, R. (2014). Time-varying processes involved in smoking lapse in a randomized trial of smoking cessation therapies. Nicotine & Tobacco Research S135–S143.
Vehtari, A., Gelman, A. and Gabry, J. (2016). loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models. R package version 0.1.
Vehtari, A., Gelman, A. and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing.
Wade, S., Ghahramani, Z. et al. (2018). Bayesian cluster analysis: Point estimation and credible balls (with discussion). Bayesian Analysis.
Walls, T. A. and Schafer, J. L. (2005). Models for intensive longitudinal data. Oxford University Press.
Yang, M. (2012). Bayesian variable selection for logistic mixed model with nonparametric random effects. Computational Statistics & Data Analysis.

Department of Statistics
Colorado State University
Fort Collins, CO, USA
E-mail: [email protected]

Oklahoma Tobacco Research Center
The University of Oklahoma Health Sciences Center
655 Research Parkway, Suite 400
Oklahoma City, OK 73104