A Bayesian Time-Varying Effect Model for Behavioral mHealth Data

Abstract

The integration of mobile health (mHealth) devices into behavioral health research has fundamentally changed the way researchers and interventionalists are able to collect data as well as deploy and evaluate intervention strategies. In these studies, researchers often collect intensive longitudinal data (ILD) using ecological momentary assessment methods, which aim to capture psychological, emotional, and environmental factors that may relate to a behavioral outcome in near real-time. In order to investigate ILD collected in a novel, smartphone-based smoking cessation study, we propose a Bayesian variable selection approach for time-varying effect models, designed to identify dynamic relations between potential risk factors and smoking behaviors in the critical moments around a quit attempt. We use parameter-expansion and data-augmentation techniques to efficiently explore how the underlying structure of these relations varies over time and across subjects. We achieve deeper insights into these relations by introducing nonparametric priors for regression coefficients that cluster similar effects for risk factors while simultaneously determining their inclusion. Results indicate that our approach is well-positioned to help researchers effectively evaluate, design, and deliver tailored intervention strategies in the critical moments surrounding a quit attempt.


Submitted to the Annals of Applied Statistics arXiv: arXiv:0000.0000

A BAYESIAN TIME-VARYING EFFECT MODEL FOR BEHAVIORAL MHEALTH DATA

By Matthew D. Koslovsky∗, Emily T. Hébert†, Michael S. Businelle†, and Marina Vannucci‡

Colorado State University∗, Oklahoma Tobacco Research Center† and Rice University‡

1. Introduction.

1.1. Scientific Background.

The integration of mobile health (mHealth) devices into behavioral health research has fundamentally changed the way researchers and interventionalists are able to collect data as well as deploy and evaluate intervention strategies. Leveraging mobile and sensing technologies, just-in-time adaptive interventions (JITAI) or ecological momentary interventions are designed to provide tailored support to participants based on their mood, affect, and socio-environmental context (Heron and Smyth, 2010; Nahum-Shani et al., 2017). In order to deliver theory-based interventions at critical moments, researchers collect intensive longitudinal data using ecological momentary assessment (EMA) methods, which aim to capture psychological, emotional, and environmental factors that may relate to a behavioral outcome in near real-time. In practice, JITAIs' effectiveness depends on accurately identifying high-risk situations by the user or by pre-determined decision rules to initiate the delivery of intervention components. Decision rules for efficacious interventions rely on a thorough understanding of the factors that characterize a subject's risk for a behavioral outcome, the dynamics of these risk factors' relation with the outcome over time, and the knowledge of possible strategies to target a risk factor (Nahum-Shani et al., 2017).

In the analysis of this paper, we investigate a behavioral health intervention study that targets smoking cessation. Historically, smoking cessation studies have used health behavior theory (Shiffman et al., 2002; Timms et al., 2013) or group-level trends of smoking antecedents (Piasecki et al., 2013) to determine when a JITAI should be triggered. However, this approach is limited since current health behavior models are inadequate for guiding the dynamic and granular nature of JITAIs (Riley et al., 2011; Klasnja et al., 2015). Additionally, the design of efficacious smoking cessation interventions is challenged by the complexity of smoking behaviors around a quit attempt and misunderstandings of the addiction process (Piasecki et al., 2002).

Keywords and phrases: ecological momentary assessment, mHealth, Pólya-Gamma augmentation, time-varying effect model, variable selection
More recently, smoking behavior researchers have capitalized on the ability of mHealth techniques to collect rich streams of data capturing subjects' experiences close to their occurrence at a high temporal resolution. The structure, as well as the complexity, of these data provide unique opportunities for the development and implementation of more advanced analytical methods compared to traditional longitudinal data analysis methods used in behavioral research (e.g., mixed models, growth curve models) (Trail et al., 2014). For example, researchers have applied reinforcement learning (Luckett et al., 2019) and dynamic systems approaches (Trail et al., 2014; Rivera, Pew and Collins, 2007; Timms et al., 2013) to design and assess optimal treatment strategies using mHealth data. Additionally, Koslovsky et al. (2018), de Haan-Rietdijk et al. (2017) and Berardi et al. (2018) have applied hidden and observed Markov models to study transitions between discrete behavioral states, Shiyko et al. (2012) and Dziak et al. (2015) have used mixture models to identify latent structures, and Kürüm et al. (2016) have employed joint modeling techniques to study the complexity of smoking behaviors.

Greater insights into the dynamic relation between risk factors and smoking behaviors have been generated by the application of functional data techniques (Trail et al., 2014; Vasilenko et al., 2014; Koslovsky et al., 2017; Tan et al., 2012). These methods are well-suited for high-dimensional data with unbalanced and unequally-spaced observation times, matching the format of data collected with EMAs. They also require few assumptions on the structure of the relations between risk factors and behavioral outcomes. One popular approach uses varying-coefficient models, which belong to the class of generalized additive (mixed) models. These semiparametric regression models allow a covariate's corresponding coefficient to vary as a smooth function of other covariates (Hastie and Tibshirani, 1993). For example, Selya et al. (2015) examined how the relation between the number of cigarettes smoked during a smoking event and smoking-related mood changes varies as a function of nicotine dependence. More frequently, penalized splines have been employed in varying-coefficient models to investigate how the effect of a covariate varies as a function of time, leading to time-varying effect models (TVEM) (Tan et al., 2012; Lanza et al., 2013; Koslovsky et al., 2017; Shiyko et al., 2012; Mason et al., 2015; Vasilenko et al., 2014). These approaches allow researchers to identify the critical moments at which a particular risk factor is strongly associated with smoking behaviors, information that can be used to design tailored intervention strategies based on a subject's current risk profile.

1.2. Model Overview.

While there are various inferential challenges that functional data analysis models can address, in the application of this paper we focus on incorporating three recurring themes in behavioral research to explore the relations between risk factors and smoking behaviors:

1. Model Assumptions - Numerous smoking behavior research studies have relied on semiparametric, spline-based methods to learn the relational structure between risk factors and outcomes (Tan et al., 2012; Vasilenko et al., 2014).

2. Variable Selection - One of the main objectives of intensive longitudinal data analysis is to identify or re-affirm complex relations between risk factors and behavioral outcomes over time (Walls and Schafer, 2005).

3. Latency - A common aim in smoking behavior research studies is to identify latent structure in the data, such as groups or clusters of subjects with similar smoking behaviors over time (McCarthy et al., 2016; Cursio, Mermelstein and Hedeker, 2019; Geiser et al., 2013; Dziak et al., 2015; Brook et al., 2008).


To incorporate and expand upon these features in our analysis, we develop a flexible Bayesian varying-coefficient regression modeling framework for longitudinal binary responses that uses variable selection priors to provide insights into the dynamic relations between risk factors and outcomes. We embed spike-and-slab variable selection priors as mixtures of a point mass at zero (spike) and a diffuse distribution (slab) (George and McCulloch, 1993; Brown, Vannucci and Fearn, 1998) and adopt the formulation of Scheipl, Fahrmeir and Kneib (2012) to deconstruct the varying-coefficient terms, in our case time-varying effects, into a main effect, linear interaction term, and non-linear interaction term. Unlike previous approaches in behavioral health research that use time-varying effect models, our formulation allows us to gain inference on whether a given risk factor is related to the smoking behavior while also learning the type of relation. Additionally, by performing selection on fixed as well as random effects, our method is equipped to identify relations that vary over time and across subjects. For this, we exploit a Pólya-Gamma augmentation scheme that enables efficient sampling without sacrificing interpretability of the regression coefficients as log odds ratios (Polson, Scott and Windle, 2013). Furthermore, we adopt a Bayesian semiparametric approach to model fixed and random effects by replacing the traditional spike-and-slab prior with a nonparametric construction to cluster risk factors that have similar strengths of association.

1.3. Just-in-Time Adaptive Interventions for Smoking Abstinence.

Although multiple studies have examined momentary predictors of smoking lapse (Shiffman et al., 2000; Piasecki et al., 2003; Businelle et al., 2014), JITAIs for smoking cessation are still nascent. Thus far, studies have used participant-labeled GPS coordinates to trigger supportive messages to prevent smoking (Naughton et al., 2016), or have tailored messages to the duration and intensity of participants' self-reported side effects while taking varenicline (McClure et al., 2016). Using our proposed approach, we analyze ILD collected in a study investigating the utility of a novel, smartphone-based smoking cessation JITAI (SmartT). The SmartT intervention (Businelle et al., 2016) uses a lapse risk estimator to identify moments of heightened risk for lapse, and tailors treatment messages in real-time based upon the level of imminent smoking lapse risk and currently present lapse triggers. To our knowledge, no other studies have used EMA data to estimate risk for imminent smoking lapse and deliver situation-specific, individually-tailored treatment content prior to lapse.

In this study, adult smokers (N=81) recruited from a smoking cessation research clinic were randomized to the SmartT intervention, the National Cancer Institute's QuitGuide (NCI QuitGuide), or weekly counseling sessions (usual care), and followed over a five-week period spanning one week prior to a scheduled quit attempt to four weeks after. At the beginning of the assessment period, baseline measures were collected, and subjects were shown how to complete EMAs on a study-provided smartphone. Throughout the assessment period, subjects completed daily diaries and received four random EMAs from the smartphone to complete each day. For each EMA, subjects were prompted on their recent smoking behaviors, alcohol consumption, as well as various questions regarding their current psychological, social, and environmental factors that may contribute to an increased risk of smoking behaviors.

Findings indicate that our approach is well-positioned to help researchers evaluate, design, and deliver tailored intervention strategies in the critical moments surrounding a quit attempt. In particular, results confirm previously identified temporal relations between smoking behaviors around a quit attempt and risk factors. They also indicate that subjects differ in how they respond to different risk factors over time. Furthermore, we identify clusters of active risk factors that can help researchers prioritize intervention strategies based on their relative strength of association at a given moment. Importantly, our approach generates these insights with minimal assumptions regarding which risk factors were related to smoking in the presence of others, the structural form of the relation for active terms, or the parametric form of regression coefficients.

The rest of the paper is organized as follows. In section 2, we present our modeling approach and describe prior constructions. In section 3, we investigate the relations between risk factors and smoking behaviors in the critical moments surrounding a scheduled quit attempt using mHealth data. In section 4, we conduct a simulation study investigating the variable selection and clustering performance of our proposed method on simulated data. In section 5, we evaluate prior sensitivity of our model. In section 6, we provide concluding remarks.

2. Methods.

The objective of our analysis is to identify relations between a set of risk factors (i.e., baseline and EMA items) and a binary outcome (i.e., momentary smoking) repeatedly collected over time. For this, we employ a Bayesian variable selection framework that allows a flexible structure for the unknown relations. We achieve this by performing selection not only on main effects, but additionally on linear and non-linear interaction terms as well as random effects. In this work, we refer to fixed and random effects in the context of hierarchical or multilevel models, where fixed effects are constant across subjects and random effects differ at the subject-level. We chose this terminology based on its familiarity within both frequentist and Bayesian paradigms, but point out that the fixed or population-level effects are treated as random variables in our model, and thus follow a probability distribution.

2.1. A Varying-Coefficient Model for Intensive Longitudinal Data Collected with EMAs.

Let $y_{ij} \in \{0, 1\}$ represent momentary smoking for subject $i = 1, \dots, N$, and $x_{ij}$ and $z_{ij}$ represent $P$- and $D$-dimensional vectors of risk factors collected on each subject at time $j = 1, \dots, n_i$, respectively. To maintain temporality in our particular application (see section 3 for more details), we model the relation between momentary smoking by the next assessment and current, potential risk factors as a varying-coefficient model of the type

$$\text{logit}(P(y_{i,j+1} = 1 \mid x_{ij}, z_{ij}, u_{ij})) = \sum_{p=1}^{P} f_p(u_{ij}) x_{ijp} + \alpha_i' z_{ij}, \tag{2.1}$$

where $f_p(u)$ are smooth functions of a scalar covariate $u$, and $\alpha_i$ represents subject-specific random effects. Similar temporal assumptions have been made previously in smoking behavior research studies (Bolman et al., 2018; Minami et al., 2014; Shiffman et al., 1996; Shiffman, 2013; Shiyko et al., 2014). Note that in general, researchers may use the framework of Eq. (2.1) to model the relation between a binary outcome and potential risk factors collected concurrently, in addition to lagged trends, as is typical in longitudinal studies (Fitzmaurice, Laird and Ware, 2012). With this formulation, we include varying-coefficient terms for each of the $P$ risk factors based on $u$. However, in general, we can specify varying-coefficient terms that depend on $u' \neq u$, and thus the number of varying-coefficient terms in the full model is not strictly $P$. If $u$ is chosen to represent time, then this model is commonly referred to as a time-varying effect model in smoking behavior research (Tan et al., 2012; Vasilenko et al., 2014; Dziak et al., 2015; Koslovsky et al., 2017). Note that $z_{ij}$ is typically a subset of $x_{ij}$ (Kinney and Dunson, 2007; Cheng et al., 2010; Hui, Müller and Welsh, 2017) and that incorporating a 1 in $x_{ij}$ and $z_{ij}$ allows for an intercept term that varies as a function of $u$ and a random intercept term, respectively. Additionally, this formulation can handle time-invariant risk factors, such as baseline items, by fixing $x_{ijp}$ ($z_{ijd}$) to $x_{ip}$ ($z_{id}$) for all observations $j$.

We approximate the smooth functions with spline basis functions. Specifically,

$$f_p(u_{ij}) = U_{ij}' \phi_p, \tag{2.2}$$

where $U_{ij}$ is a spline basis function for $u_{ij}$, and $\phi_p$ is an $r_p$-dimensional vector of corresponding spline coefficients. For simplicity, the splines are constructed with an equal number of equally spaced knots that depend on the minimum and maximum values of $u$.

2.2. Penalized Priors for the Spline Coefficients.

Using a combination of variable selection and shrinkage priors, our approach generates insights on the underlying structure of the smooth functions by reconstructing them as the summation of main effect, linear interaction, and non-linear interaction components. Formally, we rewrite Equation (Eq.) (2.2) as

$$f_p(u_{ij}) = \beta_p^* U_{ij}^{*\prime} \xi_p + \beta_p^{\circ} u_{ij} + \beta_p, \tag{2.3}$$

where the constant term $\beta_p$ captures the main effect of $x_p$, $\beta_p^{\circ}$ represents the effect of the linear interaction between $u$ and $x_p$, and $\beta_p^* \xi_p$ is a parameter-expanded vector of coefficients corresponding to the non-linear interaction term.

To derive the non-linear component in Eq. (2.3), we start by penalizing the spline functions in Eq. (2.2) with a second-order Gaussian random walk prior following

$$U \phi_p \mid s^2 \sim N(0, s^2 U P^{-} U'), \tag{2.4}$$

where $U$ is a $\sum_{i=1}^{N} (n_i - 1) \times r_p$-dimensional matrix with each row corresponding to $U_{ij}'$ for the $i$th subject at the $j$th assessment, $s^2$ controls the amount of smoothness, and $P$ is the appropriate penalty matrix (Lang and Brezger, 2004). Next, we take the spectral decomposition of

$$U P^{-} U' = \begin{bmatrix} U_+ & U_\circ \end{bmatrix} \begin{bmatrix} V_+ & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} U_+' \\ U_\circ' \end{bmatrix},$$

where $U_+$ is a matrix of eigenvectors with corresponding positive eigenvalues along the diagonal of matrix $V_+$, and $U_\circ$ are the eigenvectors associated with the zero eigenvalues. Now, we can re-define the smooth functions in Eq. (2.2) as the sum of non-linear (penalized) interaction, linear (non-penalized) interaction, and main effect terms as presented in Eq. (2.3), where the penalized term is written as $U^* \varphi_p^*$ with $U^* = U_+ V_+^{1/2}$. By assuming independent normal priors for $\varphi_p^*$, a proper prior for the penalized terms that is proportional to Eq. (2.4) can be obtained.

We take two additional measures to enhance the computational efficiency of the resulting MCMC algorithm. First, only eigenvalues/vectors that explain a majority of the variability in Eq. (2.4) are used to construct $U^*$.
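The construction of the penalized design matrix can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the authors' implementation: the function name `penalized_design`, the use of a second-difference matrix to build the penalty $P$, the pseudo-inverse standing in for $P^{-}$, and the default variance threshold are all assumptions made for the example.

```python
import numpy as np

def penalized_design(U, var_explained=0.999):
    """Return U* = U_+ V_+^{1/2} from the spectral decomposition of U P^- U'.

    U is any (n x r) spline design matrix. The penalty P is assumed here to be
    the second-order random-walk penalty D2' D2 (an assumption for this sketch).
    """
    r = U.shape[1]
    D2 = np.diff(np.eye(r), n=2, axis=0)          # (r-2) x r second-difference operator
    P = D2.T @ D2                                 # penalty matrix of rank r - 2
    C = U @ np.linalg.pinv(P, rcond=1e-10) @ U.T  # U P^- U' (covariance up to s^2)

    evals, evecs = np.linalg.eigh(C)              # eigh returns ascending eigenvalues
    order = np.argsort(evals)[::-1]               # reorder to descending
    evals, evecs = evals[order], evecs[:, order]

    positive = evals > 1e-10 * evals[0]           # drop numerically zero eigenvalues
    evals, evecs = evals[positive], evecs[:, positive]

    # Keep the leading eigenpairs that reach the requested share of variability.
    share = np.cumsum(evals) / evals.sum()
    k = min(int(np.searchsorted(share, var_explained)) + 1, evals.size)
    return evecs[:, :k] * np.sqrt(evals[:k])
```

With `var_explained=1.0` every positive eigenvalue is kept, so `U_star @ U_star.T` reproduces $U P^{-} U'$; smaller thresholds trade fidelity for fewer columns.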

Additionally, we apply a parameter-expansion technique for the penalized terms in $f_p(\cdot)$, setting $\varphi_p^* = \beta_p^* \xi_p$, where $\beta_p^*$ is a scalar and $\xi_p$ is a vector with the same dimension as $\varphi_p^*$. This technique enables us to perform selection on the penalized terms as a group rather than determining their inclusion separately. By rescaling $\beta_p^*$ and $\xi_p$ at each MCMC iteration, such that $|\xi_p|$ has mean equal to one, $\xi_p$ maintains the shape of the smooth function and $\beta_p^*$ represents the term's strength of association, while preserving identifiability, similar to Scheipl, Fahrmeir and Kneib (2012).

For variable selection, we impose spike-and-slab prior distributions on the $3P = T$-dimensional vector $\beta = (\beta_1^*, \beta_1^{\circ}, \beta_1, \dots, \beta_P^*, \beta_P^{\circ}, \beta_P)'$. In general, the spike-and-slab prior distribution is composed of a mixture of a Dirac delta function at zero, $\delta_0(\cdot)$, and a known distribution, $S(\cdot)$, such as a normal with mean zero and diffuse variance (George and McCulloch, 1993; Brown, Vannucci and Fearn, 1998). A latent indicator variable, $\nu_t$, representing a risk factor's inclusion or exclusion in the model determines whether the risk factor's regression coefficient is set to zero (spike) or free to be estimated in the model (slab). Specifically, for a given coefficient $\beta_t$, we assume

$$\beta_t \mid \nu_t \sim \nu_t \cdot S(\beta_t) + (1 - \nu_t) \delta_0(\beta_t). \tag{2.5}$$

To complete the prior specification for this portion of the model, we assume that the slab component, $S(\beta_t)$, follows a $N(0, \tau^2)$ with variance $\tau^2$, and that the inclusion indicators are distributed as $\nu_t \mid \theta_t \sim \text{Bernoulli}(\theta_t)$, with prior probability of inclusion $\theta_t \sim \text{Beta}(a_{\nu_t}, b_{\nu_t})$. Integrating out $\theta_t$, we obtain $\nu_t \sim \text{Beta-Binomial}(a_{\nu_t}, b_{\nu_t})$, where hyperparameters $a_{\nu_t}$ and $b_{\nu_t}$ are set to control the sparsity in the model. Lastly, each element of $\xi_p$, $\xi_{pr}$, is assumed to follow a $N(\mu_{pr}, 1)$ with $\mu_{pr} = \pm 1$, which places the mass of $\xi_{pr}$ around $\pm 1$ [...] $\varphi_p^*$, as described above.

2.3. Prior Specification for the Random Effects.

We perform selection on the random effects, $\alpha_i$, using the modified Cholesky decomposition approach of Chen and Dunson (2003). Specifically, we reparameterize the random effects

$$\alpha_i = K \Gamma \zeta_i, \tag{2.6}$$

where $K$ is a positive diagonal matrix with elements $\kappa = (\kappa_1, \dots, \kappa_D)'$, and $\Gamma$ is a lower triangular matrix with diagonal elements set to one and free elements otherwise. To perform variable selection, we set the prior for $\kappa$ to follow a similar spike-and-slab prior distribution as in section 2.2, where the slab distribution $S(\kappa_d) = FN(m_0, v_0)$. Here, $FN$ represents a folded normal distribution defined as

$$FN(m_0, v_0) = (2\pi v_0)^{-1/2} \exp\left(-(\kappa_d - m_0)^2 / (2 v_0)\right) + (2\pi v_0)^{-1/2} \exp\left(-(\kappa_d + m_0)^2 / (2 v_0)\right),$$

where $m_0 \in \mathbb{R}$ and $v_0 > 0$ [...] $\kappa$ and ultimately their cluster assignments. Similar to section 2.2, we let the corresponding inclusion indicators $\lambda_d$ follow a Beta-Binomial($a_{\lambda_d}, b_{\lambda_d}$) to induce sparsity on the random effect terms. We assume the $D(D-1)/2$ free elements of $\Gamma$ follow $N(\gamma_0, V_\gamma) \cdot I(\gamma \in Z)$, where $I$ represents an indicator function, and $Z$ represents the parameters with corresponding random effects included in the model. For example, if the $d$th random effect is included (i.e., $\lambda_d = 1$), then $\gamma_{d,1}, \dots, \gamma_{d,d-1}$ and $\gamma_{d+1,d}, \dots, \gamma_{D,d} \in Z$. Lastly, we assume $\zeta_i \sim N(0, I)$.

2.4. Spiked Nonparametric Priors.

To complete our approach, we investigate nonparametric prior constructions for the spike-and-slab components of the reparameterized fixed and random effects by assuming that the slab component follows a Dirichlet process (DP). These priors are commonly referred to as spiked DP (SDP) priors (Canale et al., 2017; Kim, Dahl and Vannucci, 2009; Savitsky and Vannucci, 2010; Dunson, Herring and Engel, 2008). In the context of our model, SDP priors allow us to simultaneously select influential risk factors while clustering effects with similar relations to the smoking outcome. The formulation we use here is sometimes referred to as an "outer" SDP prior, since the point mass at zero is outside of the base distribution of the DP. Alternatively, the "inner" construction places the spike-and-slab prior inside the DP, serving as the base distribution. The inner formulation provides the opportunity for coefficients to cluster at zero, but does not force a point mass at zero explicitly. As such, the likelihood that a coefficient is assigned to the trivial cluster grows with the number of coefficients excluded from the model. Alternatively, the outer formulation is a more informative prior, since it explicitly assigns a point mass at zero, and, in addition, carries less computational demands since it does not require auxiliary variables for MCMC sampling (Neal, 2000; Savitsky and Vannucci, 2010). We refer readers to Canale et al. (2017) for a detailed explanation of the structural differences between the two prior formulations.

First, we assume the regression coefficients associated with the main effects and linear interaction terms follow an SDP to provide insights on risk factors that share underlying linear trends with momentary smoking by the next assessment over the course of the study. Specifically, we assume the slab component in Eq. (2.5) is a Dirichlet process prior $H \sim DP(\vartheta, H_0)$, with base distribution $H_0 = N(0, \tau^2)$ and concentration parameter $\vartheta$.
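To make the outer construction concrete, the sketch below draws coefficients from a simplified version of the prior in Python. It is an illustration under stated assumptions rather than the sampler used in the paper: the function name `draw_outer_sdp` is hypothetical, the inclusion probability is held fixed instead of receiving a Beta prior, the concentration parameter is fixed instead of receiving a hyperprior, and the DP is represented through its Chinese restaurant process.

```python
import numpy as np

def draw_outer_sdp(T, theta=0.5, conc=1.0, tau2=2.0, seed=None):
    """Draw T coefficients from a simplified outer spiked-DP prior.

    theta : fixed inclusion probability (assumption; a Beta prior is not modeled here)
    conc  : DP concentration parameter, held fixed for the sketch
    tau2  : variance of the normal base distribution N(0, tau2)
    """
    rng = np.random.default_rng(seed)
    include = rng.random(T) < theta        # inclusion indicators, Bernoulli(theta)
    beta = np.zeros(T)                     # excluded terms stay exactly at the spike
    atoms, counts = [], []                 # cluster values and sizes among included terms

    for t in np.flatnonzero(include):
        # Chinese restaurant process: join cluster k with weight n_k,
        # or open a new cluster with weight conc.
        weights = np.array(counts + [conc], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(atoms):
            atoms.append(rng.normal(0.0, np.sqrt(tau2)))  # new atom from the base
            counts.append(1)
        else:
            counts[k] += 1
        beta[t] = atoms[k]
    return beta, include
```

Because the point mass sits outside the DP, excluded coefficients are exactly zero, while included coefficients share a small number of distinct values, which is the clustering behavior described above.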
Furthermore, we assume a hyperprior $\vartheta \sim G(a_\vartheta, b_\vartheta)$, with $a_\vartheta, b_\vartheta > 0$. For the nonlinear interaction terms, we avoid the SDP since it would produce uninterpretable cluster assignments due to the parameter-expansion approach taken to improve selection performance. For example, similar values for $\beta_t^*$ and $\beta_{t'}^*$ may correspond to vastly different $\varphi_t^*$ and $\varphi_{t'}^*$, depending on their respective $\xi$ and spline basis functions. Similarly, placing a DP prior on the individual components in $\xi$, or even $\varphi$, would not provide interpretable results on the overall nonlinear effect.

We take a similar approach for the random effects. Here, we assume the slab components for the diagonal elements of $K$, $S(\kappa_d) = W$, $W \sim DP(A, W_0)$, where $W_0 = FN(m_0^*, v_0^*)$, and $A$ is the concentration parameter of the DP. To complete the prior assumptions for the random effects portion of the model, let $A \sim G(a_A, b_A)$, where $a_A, b_A > 0$. We place the DP prior on the diagonal elements of $K$, while letting $\zeta_i$ follow a normal distribution centered at zero. As such, our approach avoids any identifiability issues with the fixed effects while still relaxing the parametric assumption on the reparameterized random effects, $K \Gamma \zeta_i$. It is important to note that by doing this we are adopting a Bayesian semiparametric modeling structure, since the random effects are linear combinations of spiked Dirichlet process and normal random variables (Müller, Quintana and Rosner, 2007).

2.5. Posterior Inference.

For posterior inference, we implement a Metropolis-Hastings within Gibbs algorithm. The full joint model is defined as

$$f(y \mid \varrho, \omega, x, u, z)\, p(\omega)\, p(\beta \mid \nu)\, p(\nu)\, p(\vartheta)\, p(K \mid \lambda)\, p(\lambda)\, p(A)\, p(\xi \mid \mu)\, p(\mu)\, p(\zeta)\, p(\Gamma),$$

where $\varrho = \{\beta, \xi, K, \Gamma, \zeta\}$. We use the Pólya-Gamma augmentation of Polson, Scott and Windle (2013) to efficiently sample the posterior distribution for the logistic regression model. Following Polson, Scott and Windle (2013), we express the likelihood contribution of $y_{i,j+1}$ as

$$f(y_{i,j+1} \mid \cdot) = \frac{(e^{\psi_{ij}})^{y_{i,j+1}}}{1 + e^{\psi_{ij}}} \propto \exp(k_{i,j+1} \psi_{ij}) \int_0^\infty \exp(-\omega_{i,j+1} \psi_{ij}^2 / 2)\, p(\omega_{i,j+1} \mid n_{i,j+1}, 0)\, \partial\omega,$$

where $k_{i,j+1} = y_{i,j+1} - n_{i,j+1}/2$, $p(\omega_{i,j+1} \mid n_{i,j+1}, 0) \sim PG(n_{i,j+1}, 0)$, and $PG$ is the Pólya-Gamma distribution. Using the notation presented in the previous sections, we set

$$\psi_{ij} = \sum_{p=1}^{P} (\beta_p^* U_{ij}^{*\prime} \xi_p + \beta_p^{\circ} u_{ij} + \beta_p)\, x_{ijp} + z_{ij}' K \Gamma \zeta_i.$$

The MCMC sampler used to implement our model is outlined below in Algorithm 1. A more detailed description of the MCMC steps as well as a graphical representation of the model are provided in the Supplementary Material. After burn-in and thinning, the remaining samples obtained from running Algorithm 1 for $\tilde{T}$ iterations are used for inference. To determine a risk factor's inclusion in the model, its marginal posterior probability of inclusion (MPPI) is empirically estimated by calculating the average of its respective inclusion indicator's MCMC samples (George and McCulloch, 1997). Note that inclusion for both fixed and random effects is determined marginally for $\beta_t$ and $\lambda_d$, respectively. Commonly, covariates are included in the model if their MPPI exceeds 0.50 (Barbieri et al., 2004) or a Bayesian false discovery rate threshold, which controls for multiplicity (Newton et al., 2004).
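The MPPI estimate is simply a column mean of the retained indicator draws. A minimal Python sketch follows, assuming the sampler stores the inclusion indicators as an iterations-by-terms array; the function names `mppi` and `median_model` and the array layout are assumptions for illustration.

```python
import numpy as np

def mppi(indicator_draws, burn_in=0, thin=1):
    """Estimate marginal posterior probabilities of inclusion.

    indicator_draws: (iterations x terms) array of 0/1 inclusion indicators.
    Rows before `burn_in` are discarded and every `thin`-th row is kept.
    """
    kept = np.asarray(indicator_draws, dtype=float)[burn_in::thin]
    return kept.mean(axis=0)

def median_model(indicator_draws, burn_in=0, thin=1, threshold=0.5):
    """Flag terms whose MPPI reaches the threshold (0.5 gives the median model)."""
    return mppi(indicator_draws, burn_in, thin) >= threshold
```

A Bayesian false discovery rate rule would replace the fixed `threshold` with a data-dependent cutoff; that variant is not sketched here.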

3. Case Study.

In this section, we study the smoking behaviors in a group of adult smokers recruited from a smoking cessation research clinic. The overall research goal of this study was to identify and investigate the structural form of the relations between a set of risk factors and smoking over a five-week period surrounding a scheduled quit attempt, using intensive longitudinal data collected with EMAs.

3.1. Data Analysis.

In the study design, momentary smoking, our outcome of interest, was defined as whether or not a subject reported smoking in the 4 hours prior to the current EMA. However, at each EMA, a subject was prompted on their current psychological, social, environmental, and behavioral status. Thus, to maintain temporality in this study, we assessed the relations between momentary smoking and measurements collected in the previous EMA. As such, regression coefficients are interpreted as the log odds of momentary smoking by the next assessment for a particular risk factor.
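The lagging step that preserves temporality can be illustrated with a short Python sketch. It pairs the risk factors recorded at one EMA with the smoking report from the same subject's next EMA, so a subject with $n_i$ assessments contributes $n_i - 1$ rows. The record layout and the function name `lag_pairs` are hypothetical and chosen only for the example.

```python
def lag_pairs(records):
    """Pair each EMA's risk factors with smoking reported at the next EMA.

    records: iterable of (subject_id, time, risk_factors, smoked) tuples,
             a hypothetical layout used only for this sketch.
    Returns a list of (subject_id, time, risk_factors, smoked_at_next_ema).
    """
    by_subject = {}
    for subject, time, x, smoked in records:
        by_subject.setdefault(subject, []).append((time, x, smoked))

    pairs = []
    for subject, rows in by_subject.items():
        rows.sort(key=lambda row: row[0])          # order assessments by time
        # Consecutive assessments: covariates from j, outcome from j + 1.
        for (t, x, _), (_, _, y_next) in zip(rows, rows[1:]):
            pairs.append((subject, t, x, y_next))
    return pairs
```

Subjects with a single assessment contribute no rows, since they have no subsequent outcome to pair with.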

Algorithm 1 MCMC Sampler

 1: Input data $y$, $x$, $u$, $z$
 2: Initialize parameters: $\varrho$, $\omega$, $\nu$, $\lambda$, $\vartheta$, $A$, $\mu$
 3: Set $DP_{\bar{\beta}}$ and $DP_K$ to True or False to indicate DP for slab on fixed or random effects, respectively.
 4: for iteration $\tilde{t} = 1, \dots, \tilde{T}$ do
 5:     for $i = 1, \dots, N$ do
 6:         for $j = 1, \dots, n_i - 1$ do
 7:             Update $\omega_{i,j+1} \sim PG(1, \psi_{ij})$
 8:         end for
 9:     end for
10:     if $DP_{\bar{\beta}}$ then
11:         Update cluster assignment of $\bar{\beta}$ following Neal (2000) algorithm 2.
12:     end if
13:     Jointly update $\beta$ and $\nu$ with Between and Within Step following Savitsky, Vannucci and Sha (2011).
14:     Update $\xi$ from FCD $N(\mu_\xi, V_\xi)$.
15:     for $p = 1, \dots, P$ do
16:         Rescale $\xi_p^*$ and $\beta_p^*$ so $\varphi_p^*$ remains unchanged.
17:     end for
18:     for $p = 1, \dots, P$ do
19:         for $r = 1, \dots, r_p$ do
20:             Set $\mu_{pr} = 1$ with probability $1 / (1 + \exp(-\xi_{pr}))$.
21:         end for
22:     end for
23:     Update $\vartheta$ by the two-step Gibbs update of Escobar and West (1995).
24:     if $DP_K$ then
25:         Update cluster assignment of $DP_K$ following Neal (2000) algorithm 2.
26:     end if
27:     Jointly update $K$ and $\lambda$ with Between and Within Step following Savitsky, Vannucci and Sha (2011).
28:     Update $A$ following two-step Gibbs update of Escobar and West (1995).
29:     Update $\Gamma$ from FCD $N(\hat{\gamma}, \hat{V}_\gamma) \cdot I(\gamma \in Z)$.
30:     for $i = 1, \dots, N$ do
31:         Update $\zeta_i$ from FCD $N(\hat{\zeta}_i, \hat{V}_{\zeta_i})$.
32:     end for
33: end for

In this study, we investigated psychological and affective factors including urge to smoke, feelings of restlessness, negative affect (i.e., irritability, frustration/anger, sadness, worry, misery), positive affect (i.e., happiness and calmness), being bored, anxiousness, and motivation to quit smoking. Additionally, we investigated numerous social and environmental factors such as whether or not the subject was interacting with a smoker, if cigarettes were easily available (cigarette availability), and whether or not the subject was drinking alcohol (alcohol consumption). Also, we included a set of baseline, time-invariant measures (i.e., heaviness of smoking index (HSI), age (years), being female, and treatment assignment) into the model. For each of these risk factors, we included a fixed main effect, linear interaction, and non-linear interaction term as well as a random main effect and linear interaction term. All interactions investigated in this analysis were between risk factors and assessment time (i.e., $u_{ij} = t_{ij}$), and $t_{ij}$ were centered so that $t = 0$ represents the beginning of the scheduled quit attempt.

Only complete EMAs with corresponding timestamps were included in this analysis, resulting in 9,634 total observations with a median of 151 assessments per individual (IQR 101.5-162). All continuous covariates were standardized to mean zero and variance one before analysis to help reduce multicollinearity and place covariates on the same scale for interpretation. The spline functions were initially generated with 20 basis functions, but only the eigenvalues/eigenvectors that captured 99.9% of the variability were included in the model to reduce the parameter space and computation time, similar to Scheipl, Fahrmeir and Kneib (2012). This reduced the column space of the penalized covariates $U^*$ to 8 in our application.
We applied our model with the traditional spike-and-slab prior, as well as the spiked DP. When fitting each model, we chose a non-informative prior for the fixed and random effects' inclusion indicators, $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$. This assumption reflects the exploratory nature of our study aimed at learning potential relations between risk factors and smoking behaviors with little or no information regarding their occurrence in the presence of other risk factors. We assumed a mildly informative prior on the fixed regression coefficients by setting $\tau = 2$. This places 95% prior probability on odds ratios between 0.06 and 16 for included regression coefficients. Additionally, we set $v = v^* = 10$, $m = m^* = 0$, and $\Gamma \sim N(\gamma_0 = \mathbf{0}, V_\gamma = I)$. Lastly, when using the SDP prior, the hyperparameters for the concentration parameters $\vartheta$ and $A$ were set to $a_\vartheta = b_\vartheta = a_A = b_A = 1$. For posterior inference, we ran our MCMC algorithm with and without SDP priors for both fixed and random effects for 10,000 iterations, treating the first 5,000 as burn-in and thinning to every 10th iteration. Trace plots of the parameters' posterior samples indicated good convergence and mixing. Additionally, we observed a relatively high correlation (∼ […] $\hat{R}$, for each of the selected $\beta$ and $K$ below 1.1 (Gelman and Rubin, 1992), further demonstrating that the MCMC procedure was working properly and the chains converged. To assess model fit, a residual plot and a series of posterior predictive checks were performed in which we compared replicated data sets from the posterior predictive distribution of the model to the observed data (Gelman et al., 2000). Overall, we found strong evidence of good model fit. See the Supplementary Materials for details. Inclusion in the model was determined using the median model approach (Barbieri et al., 2004) (i.e., marginal posterior probability of inclusion (MPPI) ≥ 0.50). […] loo (Vehtari, Gelman and Gabry, 2016), which requires the pointwise log-likelihood for each subject $i = 1, \dots, N$ at each observation $j = 1, \dots, n_i$ calculated at each MCMC iteration $s = 1, \dots, S$, and produces an estimated $\widehat{\text{epld}}$ value, with larger values implying a superior model.

3.2. Results.

Overall, we found better predictive performance for the model with SDP priors versus the traditional spike-and-slab priors, $\widehat{\text{epld}}_{SDP} = -$[…] and $\widehat{\text{epld}}_{SS} = -$[…].7, respectively. Plots of the marginal posterior probabilities of inclusion for the fixed and random effects selected using our proposed approach with SDP priors are found in Figure 1. Figure 2 presents the time-varying effects selected using the same model. Compared to usual care, we found a higher odds of momentary smoking by the next assessment for those assigned to the NCI QuitGuide group prior to the quit attempt. However, immediately after the quit attempt, we observed a lower odds of momentary smoking by the next assessment for those assigned to the NCI QuitGuide group, which gradually increased to the initial level over the remainder of the study (top left panel). Similarly, we observed a positive relation between having the urge to smoke and momentary smoking by the next assessment prior to the quit attempt that diminished during the three weeks following the quit attempt, before sharply increasing during the fourth week post-quit (top right panel). Throughout the assessment period, we observed a positive relation between negative affect and momentary smoking by the next assessment that increased during the first week post-quit, leveling off at an odds ratio of 1.75 until the third week after the quit attempt. We additionally found a positive relation between cigarette availability and the odds of momentary smoking by the next assessment that strengthened over the assessment window. For a 1 SD increase in cigarette availability, the odds of momentary smoking by the next assessment increased by 300% for the typical subject one week after the quit attempt, holding all else constant. In the two lower panels of Figure 2, we observe relatively weak, oscillating effects of being bored and interacting with a smoker on momentary smoking by the next assessment, respectively. In addition to these effects, the model identified a constant effect for alcohol consumption in the last hour and motivation to quit smoking over the assessment period. A similar set of fixed effect relations was identified by our model without the SDP prior, with the exception of not selecting being bored.

Compared to standard TVEMs, our approach deconstructs the structure of the relations between risk factors and smoking behaviors over time, aiding the interpretation of the underlying trends. This information may help the development and evaluation of tailored intervention strategies targeting smoking cessation using mHealth data. For example, negative affect has an obvious positive association with momentary smoking by the next assessment that wavers around an odds ratio of 1.2 to 1.5 for a majority of the study.
However, based on Figure 2, it is unclear whether or not the effect linearly diminishes over time. By performing selection on the main effect, linear interaction, and non-linear interaction terms separately, we are able to obtain an actual point estimate for the constant effect of negative affect (OR 1.40) as opposed to subjectively assuming a range of values from the plot. Additionally, since the linear interaction term was not selected, we can claim that the effect was not linearly decreasing over time and that it was simply wavering around the constant effect throughout the study.

Tables 1 and 2 present the estimated variances and corresponding 95% credible intervals (CI) for the random effects selected using SDP priors and traditional spike-and-slab priors, respectively. Using SDP priors, our method identified a random main effect for urge to smoke, cigarette availability, being bored, and motivation to quit smoking as well as a random linear interaction between being assigned to the SmartT treatment group, interacting with

Fig 1. Smoking Cessation Study: Marginal posterior probabilities of inclusion (MPPI) for fixed (top) and random (bottom) effects. Selected fixed effects in ascending order: NCI (NL-INTX), urge to smoke (NL-INTX), cigarette availability (all), interacting with a smoker (NL-INTX), negative affect (NL-INTX, main), being bored (NL-INTX), alcohol consumption (main), motivation to quit (main), HSI (NL-INTX). Selected random effects in ascending order: urge (main), cigarette availability (main), being bored (main), motivation to quit (main), SmartT (L-INTX), interacting with a smoker (L-INTX), being bored (L-INTX). Dotted lines represent the inclusion threshold of 0.50. NL-INTX: non-linear interaction, L-INTX: linear interaction.
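The median model rule behind Figure 1 can be sketched in a few lines of Python: the MPPI of a term is the fraction of retained MCMC draws in which its inclusion indicator equals one, and a term is selected when that fraction reaches 0.50. The function name `median_model` and the toy indicator draws below are hypothetical and are not output from the study.

```python
import numpy as np

def median_model(indicator_draws, threshold=0.5):
    """indicator_draws: (S, P) array of 0/1 inclusion indicators, one row per
    retained MCMC draw. Returns the MPPI per term and the selected term indices."""
    draws = np.asarray(indicator_draws, dtype=float)
    mppi = draws.mean(axis=0)                      # marginal posterior inclusion probability
    selected = np.flatnonzero(mppi >= threshold)   # median model: MPPI >= 0.5
    return mppi, selected

# Toy draws for three terms (hypothetical values)
draws = np.array([[1, 0, 1],
                  [1, 0, 0],
                  [1, 1, 0],
                  [1, 0, 1]])
mppi, selected = median_model(draws)
```

In this toy example the first and third terms are selected, the third sitting exactly on the threshold.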

Fig 2. Smoking Cessation Study: Time-varying effects on momentary smoking by the next assessment of those covariates selected by our model with SDP priors. Shaded regions represent pointwise 95% CI. Dashed lines indicate an odds ratio of one.

Random Effect                       $\hat{\sigma}$    95% CI
Intercept                           0.923    (0.539, 1.528)
Urge                                0.152    (0.031, 0.278)
Cigarette Availability              0.865    (0.394, 1.467)
Bored                               0.183    (0.076, 0.398)
Motivation to Quit Smoking          0.156    (0.045, 0.311)
SmartT × Time                       0.077    (0.010, 0.210)
Interacting with a Smoker × Time    0.016    (0.002, 0.050)
Bored × Time                        0.002    (0.000, 0.005)

Table 1

Smoking Cessation Study: Estimated variances with corresponding 95% credible intervals (CI) for selected random effects with SDP priors based on MPPI ≥ 0.50.

smokers, and being bored with time. Thus, even though we did not discover an overall difference in the odds of momentary smoking by the next assessment for those assigned to the SmartT treatment versus usual care, we observed evidence that the subjects responded differently to the SmartT treatment across the assessment window. With the traditional spike-and-slab priors, we found similar results overall. However, the model only selected a random main effect for interacting with smokers and additionally suggested a random effect for anxiousness.

By using SDP priors, our approach is capable of clustering covariates that share similar linear trends with momentary smoking by the next assessment over time. In practice, this information can be used to help construct decision rules when designing future intervention strategies. In our analysis, only five main effect and linear interaction terms were selected, and each of them was allocated to its own cluster. With this knowledge, researchers can prioritize targeting risk factors based on their relative strength of association at a given moment. Had some of these risk factors' effects been clustered together, researchers might rely more heavily on other pieces of information, such as the cost or success rates for a particular intervention strategy, when assessing which risk factors to target during a high-risk moment.

Similar to previous studies investigating the temporal relation between risk factors and smoking behaviors around a quit attempt, our results show a convex relation between urge to smoke and momentary smoking after the quit attempt, a positive association with cigarette availability throughout the quit attempt, and a positive, increasing relation between negative affect and momentary smoking during the first week after the quit attempt (Koslovsky et al., 2017; Vasilenko et al., 2014). Existing TVEM approaches, however, typically model the repeated measures structure of the data by simply including a random intercept term in the model, neglecting to investigate random main effects or interaction terms. They also do not incorporate variable selection. Our approach, on the other hand, delivers insights on how relations vary over time as well as how they vary across individuals.

Random Effect                       $\hat{\sigma}$    95% CI
Intercept                           1.317    (0.676, 2.487)
Urge                                0.099    (0.011, 0.248)
Cigarette Availability              0.905    (0.503, 1.607)
Interacting with a Smoker           0.848    (0.286, 1.924)
Bored                               0.244    (0.065, 0.517)
Anxiousness                         0.140    (0.001, 0.361)
Motivation to Quit Smoking          0.212    (0.076, 0.448)
SmartT × Time                       0.062    (0.016, 0.155)
Bored × Time                        0.002    (0.000, 0.004)

Table 2

Smoking Cessation Study: Estimated variances with corresponding 95% credible intervals (CI) for selected random effects with traditional spike-and-slab priors based on MPPI ≥ 0.50.

3.3. Sensitivity Analysis.

To investigate our model's sensitivity to prior specification, we set each of the hyperparameters to default values and then evaluated the effect of manipulating each term on the results obtained in section 3. For the default parameterization, we set the hyperparameters for the prior inclusion indicators $\nu$ and $\lambda$ to $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$. For interpretation, $a_{\nu_t} = b_{\nu_t} = 1$ implies that the prior probability of inclusion for a fixed effect is $a_{\nu_t} / (a_{\nu_t} + b_{\nu_t}) = 0.50$. The default values for the variance of the normal distribution for the slab of $\beta$ and $\beta^{\circ}$ as well as the base distribution for $\beta^*$ were each fixed at 5. Additionally, the mean and variance for the random effect terms' proposal and prior distributions were set to 0 and 5, respectively. The hyperparameters for the concentration parameters $\vartheta$ and $A$ were set to $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Lastly, we assumed $\Gamma \sim N(\gamma_0 = \mathbf{0}, V_\gamma = I)$. We ran our MCMC algorithm for 10,000 iterations, treating the first 5,000 iterations as burn-in and thinning to every 10th iteration for the SDP model, similar to our case study. For each of the fixed and random effects, inclusion in the model was determined using the median model approach (Barbieri et al., 2004).

Since the true model is never known in practice, we evaluated each model parameterization in terms of sparsity levels and overlap with the results reported in the case study section. Specifically, we present the total number of terms selected for both fixed and random effects ( […] random effects (r-IN and r-EX) as well as overall (IN and EX). Results of the sensitivity analysis are reported in Table 3. Compared to the results presented in the case study, we found relatively consistent overlap in the risk factors included and excluded by each model overall. We observed moderate sensitivity to hyperparameter values in terms of percent overlap for fixed and random effects of risk factors included in the model, an artifact of the relatively weak associations identified for some of the risk factors. Notably, risk factors showing stronger associations with momentary smoking at the next assessment (e.g., negative affect, cigarette availability, and motivation to quit smoking) were selected by the model regardless of prior specification. Likewise, weaker relations between momentary smoking at the next assessment and risk factors, such as being bored and interacting with a smoker, were more sensitive to hyperparameters. We also observed that the number of selected fixed and random effects increased (decreased) as the prior probability of inclusion increased (decreased), as expected. In practice, there are a variety of factors researchers should consider when setting the prior probability of inclusion, including the aim of the research study, the desired sparsity of the model, prior knowledge of covariates' inclusion, and results from simulation and sensitivity analyses, to name a few.

From a clinical perspective, $\tau = 10$ reflects a relatively diffuse prior for a given risk factor (i.e., odds ratio between 0.002 and roughly 500). To further investigate the model's sensitivity to regression coefficients' variances, we set $\tau = v = 1000$, and found somewhat similar results to the model with $\tau = v = 10$ overall (i.e., IN = 0.8, EX = 0.8). Here, we unexpectedly found non-monotonic behavior in the proportion of included and excluded terms as a function of the coefficients' variance, which might also reflect our model's sensitivity to relatively weak associations as previously noted. In theory, the selection of random effects may be sensitive to the order of the columns of $Z$, since the Cholesky decomposition is itself order dependent (Müller et al., 2013). In our case study, we did not observe any differences regarding which random effects were selected with a random permutation of the $Z$ columns. In section 5, we further demonstrate our model's robustness to the ordering of $Z$ on simulated data.
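The odds-ratio ranges quoted for $\tau = 2$ and $\tau = 10$ can be checked with a short calculation. The sketch below assumes that an included coefficient has a zero-mean normal prior and that $\tau$ is its variance, a reading that reproduces the intervals stated in the text; the function name `prior_or_interval` is hypothetical.

```python
import math

def prior_or_interval(tau, z=1.96):
    """Central 95% prior interval for the odds ratio exp(beta) when
    beta ~ N(0, tau), with tau read as a variance (an assumption)."""
    half_width = z * math.sqrt(tau)
    return math.exp(-half_width), math.exp(half_width)

lo2, hi2 = prior_or_interval(2)      # case-study setting
lo10, hi10 = prior_or_interval(10)   # diffuse setting from the sensitivity analysis
print(f"tau = 2:  OR in ({lo2:.3f}, {hi2:.1f})")
print(f"tau = 10: OR in ({lo10:.4f}, {hi10:.0f})")
```

Under this reading, $\tau = 2$ gives an interval of roughly 0.06 to 16, and $\tau = 10$ gives roughly 0.002 to a little under 500.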

4. Simulation Study.

Table 3

Case Study Data: Sensitivity results for the proposed model with SDP across various prior specifications. The total number of terms selected for both fixed and random effects is indicated as […]

Prior specifications compared: $a_{\nu_t} = a_{\lambda_d} = 1$, $b_{\nu_t} = b_{\lambda_d} = 9$; $\tau = v = 2$; $a_\vartheta = b_\vartheta = a_A = b_A = 0.$[…]; $a_{\nu_t} = a_{\lambda_d} = 9$, $b_{\nu_t} = b_{\lambda_d} = 1$; $\tau = v = 10$; $a_\vartheta = b_\vartheta = a_A = b_A = 10$.

In this section, we evaluate our model in terms of variable selection and clustering performance on simulated data similar in structure to our case study data. We compared our method with and without SDP priors on varying-coefficient and random effects to two other Bayesian methods which are designed to handle this class of models. The first is the method of Scheipl, Fahrmeir and Kneib (2012), which has previously shown promising results performing function selection in structural additive regression models using continuous spike-and-slab priors. Their approach differs from ours in that they assume parameter-expanded normal-mixture-of-inverse-gamma (peNMIG) distribution priors for selection, inspired by Ishwaran et al. (2005), and design a Metropolis-Hastings with penalized iteratively weighted least-squares algorithm for updating regression coefficients within the logistic framework. A popular alternative to spike-and-slab priors to induce sparsity in high-dimensional regression settings is to assume global-local shrinkage priors on the regression coefficients (see Van Erp, Oberski and Mulder (2019); Bhadra et al. (2019) for detailed reviews). At the request of a reviewer, we additionally compared our proposed model to a reparameterized version with shrinkage priors (Carvalho, Polson and Scott, 2009). To achieve this, we replaced the spike-and-slab priors on $\beta$ with horseshoe priors, which belong to the class of global-local scale mixtures of normal priors (Polson and Scott, 2010). For random effects, $K$, we assumed a similar global-local structure for the scale parameters of the folded-normal distribution, $v$. To our knowledge, the theoretical properties and selection performance of global-local scale mixtures of non-normal priors have yet to be explored. However, we conjectured that the global-local framework should effectively shrink inactive random effects towards zero and allow active terms to be freely estimated.
Details of the resulting model and accompanying MCMC algorithm are found in the Supplementary Material.

We simulated $N = 100$ subjects with 20-40 observations randomly spaced across an assessment window with $t_{ij} \in [0,$ […] $x_i$, comprised of an intercept term and 14 continuous covariates simulated from a $N(\mathbf{0}, \Sigma)$, where $\Sigma_{st} = w^{|s-t|}$ and $w = 0.3$. To simulate time-varying covariate trajectories, we randomly jittered half of the elements within $x_i$ by $N(0,$ […] $z_{ij} = x_{ij}$. Thus, each full model contained 15 main effects, linear interactions, non-linear interactions, and random main effects, corresponding to 60 potential terms (or groups of terms for the non-linear interaction components) to select. The first 5 functional terms in the true model were defined as

• $f(t_{ij}) = \pi \sin(3\pi t_{ij}) + 1.$[…]$\, t_{ij} - $[…]
• $f(t_{ij}) = \pi \cos(2\pi t_{ij}) + 1.$[…]
• $f(t_{ij}) = -\pi t \sin(5\pi t_{ij}) + 1.$[…]$\, t_{ij} - $[…]
• $f(t_{ij}) = -$[…]$\, t_{ij} + 1.$[…]
• $f(t_{ij}) = -$[…],

and the random effects $a_i \sim N(\mathbf{0}, \Sigma_\alpha)$ with $\sigma_{kk} = 0.75$ and $\sigma_{jk} = 0.$[…] $j, k = 1, \dots, 5$. Thus, in the true model, $\psi_{ij} = \sum_{p=1}^{[\ldots]} f_p(t_{ij}) x_{ijp} + z_{ij}' a_i$. Note that to impose an inherent clustering for the main effects and linear interaction terms, their values were specified to center around $\pm$[…] […]th iteration for each model. The spline functions were generated as in our application. We set the hyperparameters for the inclusion indicators, $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$, imposing a non-informative prior for selection of fixed and random effect terms. Additionally, we fixed the regression coefficient hyperparameters to $\tau = 2$ and $m = 0$ with $v = 10$. For the concentration parameters $\vartheta$ and $A$, we assumed $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Before analysis, the covariates were standardized to mean 0 and variance 1.

For each of the models with spike-and-slab priors, inclusion in the model for both fixed and random effects was determined using the median model approach (Barbieri et al., 2004). For the horseshoe model, fixed effects were considered active if their corresponding 95% credible interval did not contain zero, similar to Bhadra et al. (2019). The 95% credible interval for random effects will almost surely not contain zero. As a naive alternative, we assumed a random effect was active in the model if its posterior mean exceeded a given threshold. For the sake of demonstration, we evaluated the performance of the model over a grid of potential threshold values, and presented the results for the best performing model overall. Notably, this solution is only feasible when the true answer is known, which is never the case in practice. Variable selection performance was evaluated via sensitivity (SENS), specificity (SPEC), and Matthews correlation coefficient (MCC) for fixed and random effects separately. These metrics are defined as

$$SENS = \frac{TP}{FN + TP}, \qquad SPEC = \frac{TN}{FP + TN},$$

$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}},$$

where $TN$, $TP$, $FN$, and $FP$ represent the true negatives, true positives, false negatives, and false positives, respectively. For the SDP models, clusters of regression coefficients were determined using sequentially-allocated latent structure optimization to minimize the lower bound of the variation of information loss (Wade et al., 2018; Dahl and Muller, 2017). Once clusters were determined, clustering performance was evaluated using the variation of information, a measure of distance between two clusterings ranging from 0 to $\log R$, where $R$ is the number of items to cluster and lower values imply better clustering (Meilă, 2003).

Figure 3 presents the estimated smooth functions obtained using our proposed method with SDP priors on a randomly selected replicated data set from the simulation study. Here, $f(t_{ij})$ represents the global intercept comprised of a main effect, linear interaction, and non-linear interaction term that were forced into the model. Of interest is the ability of the model to properly select the influential components in $f(t_{ij})$ and $f(t_{ij})$ and additionally capture their structure. Using the method proposed in Dahl and Muller (2017) to identify latent clusters of fixed main effect and linear interaction terms, our method successfully clustered the linear interaction in $f(t_{ij})$ and the main effects in $f(t_{ij})$ and $f(t_{ij})$, while incorrectly assigning the linear interaction term in $f(t_{ij})$ to its own cluster. Additionally, the main effects in $f(t_{ij})$, $f(t_{ij})$, and $f(t_{ij})$ were appropriately clustered together, while the linear interaction term in $f(t_{ij})$ was incorrectly assigned to its own cluster. The remaining, uninfluential terms were all allocated to the trivial group. Despite $f(t_{ij})$ and $f(t_{ij})$ having similar main effect and linear interaction terms, they are dramatically different in terms of their non-linear interaction terms.
However, by clustering their underlying linear trajectories, our model with SDP priors was able to uncover similarities in their relations with the outcome over time that traditional approaches would fail to discover.

Table 4 reports results for our proposed method with SDP priors (PGBVSDP), our proposed method without SDP priors (PGBVS), peNMIG, and our model with horseshoe priors (PGHS) in terms of average sensitivity, specificity, and MCC for fixed (fSENS, fSPEC, fMCC) and random (rSENS, rSPEC, rMCC) effects across the replicate data sets with standard errors in parentheses. Additionally, for the PGBVSDP model, we provide clustering performance results for fixed (fCLUST) and random effects (rCLUST). Since each of the random effects was simulated similarly, clusterings were compared to a single cluster for the non-zero terms. Overall, the methods had relatively similar results for fixed effects, with PGBVS and PGHS performing the best in terms of sensitivity (1.00 and 1.00) and MCC (0.96 and 0.99), respectively. Our method with SDP priors, PGBVSDP, obtained the highest specificity for fixed effects overall. Given that the maximum possible values fCLUST and rCLUST could take on were 3.4 and 2.7, respectively, we found fairly strong clustering performance for both fixed (0.39) and random (0.92) effects with PGBVSDP. We observed more variability in the selection of random effects across models. Random effect selection sensitivity was significantly lower compared to the fixed effects for all of the models.

Fig 3. Simulated Data: Estimated smooth functions $f(t_{ij})$, $f(t_{ij})$, $f(t_{ij})$ for a randomly selected replicate data set generated in the simulation study. The estimated smooth function is represented by a solid black line with pointwise 95% credible regions in grey. Dashed lines represent the true log odds ratios as a function of time.
            PGBVSDP        PGBVS          peNMIG          PGHS
fSENS       0.96 (0.09)    1.00 (0.02)    0.93 (0.11)     1.00 (0.00)
fSPEC       0.99 (0.02)    0.98 (0.02)    0.94 (0.04)     0.96 (0.01)
fMCC        0.94 (0.08)    0.96 (0.05)    0.83 (0.10)     0.99 (0.02)
fCLUST      0.39 (0.21)    -              -               -
rSENS       0.76 (0.21)    0.62 (0.25)    0.46 (0.23)     0.86 (0.24)
rSPEC       0.88 (0.10)    0.96 (0.05)    0.64 (0.16)     0.90 (0.11)
rMCC        0.63 (0.26)    0.64 (0.23)    0.11 (0.33)     0.76 (0.21)
rCLUST      0.92 (0.50)    -              -               -
Time (s)    4658 (271)     2235 (46)      10720 (1116)    3076 (74)

Table 4

Simulated Data: Results for the proposed model with and without the SDP on regression coefficients compared to peNMIG (Scheipl, Fahrmeir and Kneib, 2012) and our model with horseshoe priors (Carvalho, Polson and Scott, 2009). Results are averaged over 50 replicate data sets with standard deviations in parentheses.
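The selection metrics reported in Table 4 follow directly from the confusion counts of selected versus truly active terms. A minimal Python sketch, using the definitions of SENS, SPEC, and MCC given in this section, is shown below; the function name `selection_metrics` and the toy truth/selection vectors are hypothetical and are not simulation output.

```python
import math

def selection_metrics(truth, selected):
    """truth, selected: equal-length sequences of booleans marking which
    terms are truly active and which were selected by the model."""
    pairs = list(zip(truth, selected))
    tp = sum(t and s for t, s in pairs)
    tn = sum((not t) and (not s) for t, s in pairs)
    fp = sum((not t) and s for t, s in pairs)
    fn = sum(t and (not s) for t, s in pairs)
    sens = tp / (fn + tp)            # assumes at least one truly active term
    spec = tn / (fp + tn)            # assumes at least one truly inactive term
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else float("nan")
    return sens, spec, mcc

# Toy example: 3 active and 5 inactive terms (hypothetical)
truth    = [True, True, True, False, False, False, False, False]
selected = [True, True, False, True, False, False, False, False]
sens, spec, mcc = selection_metrics(truth, selected)
```

Here one active term is missed and one inactive term is selected, giving SENS = 2/3, SPEC = 4/5, and MCC = 7/15.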

In terms of specificity (1 - false positive rate) for random effects, our methods, regardless of prior formulation, dramatically outperformed peNMIG, with PGBVS obtaining the highest specificity overall (0.96). However, PGBVSDP and PGBVS had lower sensitivity with respect to random effects compared to PGHS. While PGHS performed well separating active from inactive random effects, recall that the truth was used to select the optimal selection threshold. The improved performance of PGBVS, PGBVSDP, and PGHS in terms of variable selection was achieved in considerably less computation time compared to peNMIG. Our core method was able to run 7,500 iterations in a fifth of the time compared to peNMIG, accessed via Scheipl (2011). Using the SDP priors, which require additional updates for clustering the regression coefficients, we observed a two-fold increase in computation time for PGBVSDP compared to PGBVS. However, on average, the PGBVSDP approach still achieved about a 50% reduction in computation time compared to peNMIG. It is important to note that for comparison, all algorithms were run in series, even though the R package spikeSlabGAM (Scheipl, 2011) provides functionality to run multiple chains in parallel.
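For readers unfamiliar with the clustering metric used for fCLUST and rCLUST in this section, the variation of information between two clusterings can be computed from label entropies as sketched below (natural logarithm). The function name and toy labels are hypothetical. The values `upper_fixed` and `upper_random` assume $R = 30$ fixed main effect and linear interaction terms and $R = 15$ random effects, an inference from the simulation design rather than a statement in the text; under that assumption $\log R$ matches the quoted maxima of 3.4 and 2.7.

```python
import math
from collections import Counter

def variation_of_information(a, b):
    """VI between two clusterings given as equal-length label sequences.
    VI = H(A) + H(B) - 2 I(A; B) = 2 H(A, B) - H(A) - H(B)."""
    n = len(a)

    def entropy(counts):
        return -sum(c / n * math.log(c / n) for c in counts.values())

    h_a = entropy(Counter(a))
    h_b = entropy(Counter(b))
    h_ab = entropy(Counter(zip(a, b)))   # joint entropy of the two labelings
    return 2 * h_ab - h_a - h_b

same = variation_of_information([0, 0, 1, 1], [1, 1, 0, 0])   # same partition, relabeled
diff = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])   # crossed partitions
upper_fixed, upper_random = math.log(30), math.log(15)        # assumed R = 30 and R = 15
```

Identical partitions give a distance of zero regardless of label names, while the crossed example reaches the upper bound $\log 4$ for four items.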

5. Sensitivity Analysis.

To assess the model's sensitivity to hyperparameter settings, we set each of the hyperparameters to default values and then evaluated the effect of manipulating each term on selection and clustering performance. For the default parameterization, we set the hyperparameters for the prior inclusion indicators $\nu$ and $\lambda$ to $a_{\nu_t} = b_{\nu_t} = a_{\lambda_d} = b_{\lambda_d} = 1$. The default values for the variance of the normal distribution for the slab of $\beta$ and $\beta^{\circ}$ as well as the base distribution for $\beta^*$ were each fixed at 5. Additionally, the mean and variance for the random effect terms' proposal and prior distributions were set to 0 and 5, respectively. The hyperparameters for the concentration parameters $\vartheta$ and $A$ were set to $a_\vartheta = b_\vartheta = a_A = b_A = 1$. Lastly, we assumed $\Gamma \sim N(\gamma_0 = \mathbf{0}, V_\gamma = I)$. We ran our MCMC algorithm on the 50 replicated data sets generated in the simulation study, using 7,500 iterations, treating the first 3,750 iterations as burn-in and thinning to every 10th iteration for the SDP model.

Results of the sensitivity analysis are reported in Table 5. As expected, we found that the sensitivity (specificity) increased (decreased) as the prior probability of inclusion for the fixed and random effects increased. The model did not seem sensitive to the variance assumed for the normal and folded normal priors assigned to the fixed and random effect slab distributions, respectively. Similarly, we found comparable results in terms of sensitivity and specificity for different values of the concentration parameters' hyperparameters. In terms of clustering, we saw marginally better variation of information measures with larger concentration parameter hyperparameters. However, across simulation runs, we observed relatively high standard errors in terms of the variation of information measures. To assess potential sensitivity to the order of random effects in our simulations, we re-ran the simulation study with a random permutation of the columns of $Z$. Similar to the case study, we found no evidence of sensitivity to random effect ordering with our model, as the results were almost identical to those presented in Table 4 with PGBVSDP (rSENS = 0.76 (0.20), rSPEC = 0.87 (0.09), rMCC = 0.62 (0.20), rCLUST = 0.94 (0.42)).
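As a worked illustration of how the inclusion hyperparameters varied in this sensitivity analysis translate into prior inclusion probabilities, the expression $a_{\nu_t} / (a_{\nu_t} + b_{\nu_t})$ given in section 3.3 can be evaluated at the default setting and at the two alternative settings listed in Table 5. Applying the same expression to the random effect indicators (with $a_{\lambda_d}$, $b_{\lambda_d}$) is an assumption here.

```latex
\[
P(\nu_t = 1) = \frac{a_{\nu_t}}{a_{\nu_t} + b_{\nu_t}}, \qquad
\frac{1}{1 + 1} = 0.50, \quad
\frac{1}{1 + 9} = 0.10, \quad
\frac{9}{9 + 1} = 0.90 .
\]
```

The settings therefore range from a sparse prior (0.10) to a dense prior (0.90), which is consistent with the reported rise in sensitivity and fall in specificity as the prior probability of inclusion increases.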

6. Conclusions.

In this paper, we have investigated intensive longitudinal data, collected in a novel, smartphone-based smoking cessation study, to better understand the relation between potential risk factors and smoking behaviors in the critical moments surrounding a quit attempt, using a semiparametric Bayesian time-varying effect modeling framework. Unlike standard TVEMs, our approach deconstructs the structure of the relations between risk factors and smoking behaviors over time, which aids in formulating hypotheses regarding dynamic relations between risk factors and smoking in the critical moments around a quit attempt. By performing variable selection on random effects, the approach delivers additional insights on how relations vary over time as well as how they vary across individuals. Furthermore, the use of non- and semiparametric prior constructions allows simultaneous variable selection for fixed and random effects while learning latent clusters of regression coefficients. As such, our model is designed to discover various forms of latent structures within the data without requiring strict model assumptions or burdensome tuning procedures. Results from our analysis have confirmed previously identified temporal relations between smoking behaviors and urge to smoke, cigarette availability, and negative affect. They have also identified subject-specific heterogeneity in the effects of urge to smoke, cigarette availability, and motivation to quit. Additionally, we have found that subjects differed in how they responded to the SmartT treatment (compared to usual care), interacting with a smoker, and being bored over time. This has practical relevance as researchers can use this information to design adaptive interventions that prioritize targeting risk factors based on their relative strength of association at a given moment. These findings also reinforce the importance of designing dynamic intervention strategies that are adaptive to subjects' current risk profiles.

Prior specifications: $a_{\nu_t} = a_{\lambda_d} = 1$, $b_{\nu_t} = b_{\lambda_d} = 9$; $\tau = v = 2$; $a_\vartheta = b_\vartheta = a_A = b_A = 0.$[…]; $a_{\nu_t} = a_{\lambda_d} = 9$, $b_{\nu_t} = b_{\lambda_d} = 1$; $\tau = v = 10$; $a_\vartheta = b_\vartheta = a_A = b_A = 10$

fSENS     0.99 (0.03)    0.96 (0.07)    0.94 (0.11)
fSPEC     0.96 (0.03)    0.99 (0.02)    0.99 (0.02)
fMCC      0.91 (0.06)    0.95 (0.07)    0.93 (0.11)
fCLUST    0.40 (0.20)    0.39 (0.23)    0.41 (0.24)
rSENS     0.85 (0.20)    0.78 (0.20)    0.74 (0.23)
rSPEC     0.84 (0.10)    0.89 (0.10)    0.86 (0.10)
rMCC      0.66 (0.19)    0.67 (0.25)    0.60 (0.27)
rCLUST    0.84 (0.49)    0.89 (0.47)    0.97 (0.49)

Table 5

Simulated Data: Sensitivity results for the proposed model with SDP on regression coefficients. Results are averaged over 50 replicated data sets with standard errors in parentheses.

Throughout this work, we have demonstrated how our method is well-suited to aid the development and evaluation of future JITAI strategies targeting smoking cessation using mHealth data. The existing SmartT algorithm delivers treatment based on the presence of six lapse triggers, which are weighted based on their relative importance in predicting risk of lapse (Businelle et al., 2016). The results of this study allow for a more dynamic algorithm that takes into account not only the time-varying relationships between psychosocial and environmental variables and smoking lapse, but also the different ways in which individuals experience a quit attempt. For example, the results suggest that providing momentary support to cope with urge to smoke and negative affect may be more useful if delivered in the early stages of a quit attempt, but becomes less important by week 4 post-quit. However, messages that address cigarette availability, alcohol consumption, and motivation to quit smoking may be a more important focus for the entire quit attempt. Although the findings for this small sample may not be generalizable to larger, more diverse populations, these methods are the next step in developing a personalized smoking risk algorithm that can inform highly specific, individualized treatment for each smoker.

It is important to note that selection of a risk factor by our proposed method (or any variable selection technique) does not imply clinical significance. Notably, the point-wise credible intervals often contained odds ratios of one and most risk factors were only influential for brief moments throughout the study period. While these results highlight the importance of understanding risk factors' dynamic relations with smoking to design tailored intervention strategies, we recommend using our method for hypothesis generation in practice and conducting confirmatory studies before generalizing results.

Compliance rates for EMA studies typically range between 70% and 90%, with a recommended threshold of 80% (Jones, Xu and Grunwald, 2006). In our case study, the compliance rate was 84%. Additionally, 97.3% of all assessments were completed once initiated, and subjects were unable to skip questions within an assessment. Since subjects were assessed multiple times per day, nonresponse was attributed more to situational context (e.g., driving) than smoking status. Thus, for this study, we found the missing completely at random assumption for missing observations justified.
However, future studies may consider the development of advanced analytical methods for EMA data sets that can handle different types of missingness assumptions and other potential biases, such as social desirability bias.

In this analysis, we focus on time-varying effects due to their recent popularity in smoking behavior research (Tan et al., 2012; Shiyko et al., 2012; Vasilenko et al., 2014; Koslovsky et al., 2017; Lanza et al., 2013; Shiyko et al., 2014). A promising alternative for investigating the complexity of smoking behaviors around a quit attempt is the varying index coefficient model, which allows a covariate's effect to vary as a function of multiple other variables (Ma and Song, 2015). By incorporating variable selection priors, researchers could identify which variables are responsible for modifying a covariate's effect. Oftentimes behavioral researchers are interested in exploring other forms of latent structure, such as clusters of individuals who respond similarly to treatments or have similar risk profiles over time. Taking advantage of the flexibility and efficiency of our approach, future work could extend our core model to address these research questions by recasting it into a mixture modeling framework. In addition, while we have developed our method for binary outcomes due to their prevalence in smoking behavior research studies, our approach is easily adaptable to other data structures found within and outside of smoking behavior research, such as time to event data (Sha, Tadesse and Vannucci, 2006) and continuous outcomes. While our method borrows information across regression coefficients, we avoided imposing structure among covariates via heredity constraints, which restrict the model space for higher order terms depending on the inclusion status of the lower order terms that comprise them.
Researchers interested in extending our approach to accommodate these, and other forms of, hierarchical constraints may adjust the prior probabilities of inclusion (Chipman, 1996). Lastly, while we were hesitant to present variable selection results for PGHS, due to the limited understanding of global-local priors for non-Gaussian distributions, the method showed good results in simulations. Furthermore, when applied to the case study data, we obtained promising predictive performance (i.e., $\widehat{\mathrm{elpd}}_{\mathrm{HS}} = -\ldots.5$) that warrants future investigation of its theoretical properties.
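To make the remark on heredity constraints concrete, the following is a minimal sketch of how prior inclusion probabilities can encode them for a single two-way interaction, in the spirit of Chipman (1996). It is not taken from the PGBVS package. The function names, the argument names (`p11`, `p10`, `p01`, `p00`, `p_main`), and the numeric values are illustrative assumptions only. The interaction indicator is given a prior inclusion probability that depends on whether its two parent main effects are included: strong heredity sets the probability to zero unless both parents are in the model, and weak heredity requires at least one parent.

```python
from itertools import product


def interaction_inclusion_prob(gamma_a, gamma_b, p11=0.5, p10=0.0, p01=0.0, p00=0.0):
    """Prior P(gamma_AB = 1 | gamma_A, gamma_B) for one interaction term.

    The defaults (hypothetical values) encode strong heredity: the interaction
    can only enter when both parent main effects are included.
    """
    table = {(1, 1): p11, (1, 0): p10, (0, 1): p01, (0, 0): p00}
    return table[(gamma_a, gamma_b)]


def joint_prior(p_main=0.5, **heredity):
    """Joint prior over (gamma_A, gamma_B, gamma_AB).

    Main effects get independent Bernoulli(p_main) priors; the interaction
    prior is conditional on the parents' inclusion status.
    """
    prior = {}
    for ga, gb, gab in product((0, 1), repeat=3):
        p_parents = (p_main if ga else 1 - p_main) * (p_main if gb else 1 - p_main)
        p_int = interaction_inclusion_prob(ga, gb, **heredity)
        prior[(ga, gb, gab)] = p_parents * (p_int if gab else 1 - p_int)
    return prior


# Strong heredity: interaction allowed only with both parents included.
strong = joint_prior(p_main=0.5, p11=0.5)
# Weak heredity: interaction allowed with at least one parent included.
weak = joint_prior(p_main=0.5, p11=0.5, p10=0.25, p01=0.25)
```

Under the strong setting, every model containing the interaction without both main effects receives zero prior mass, so a sampler using this prior never visits those models. The weak setting leaves them reachable with reduced probability.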

Acknowledgements.

Matthew Koslovsky is supported by NSF via the Research Training Group award DMS-1547433.

Supplementary Material.

R-package for PGBVS:

R-package PGBVS contains code to perform the methods described in the article. The package also contains functionality for reproducing the data used in the sensitivity and simulation studies and for posterior inference. The R package is located at https://github.com/mkoslovsky/PGBVS.

Supplementary Information:

This file contains a description of the full joint distribution of our model with a graphical representation, a detailed description of our proposed MCMC algorithm with and without SDP priors, and derivations for the prior marginal likelihood used to sample latent cluster assignments. Additionally, we include details of the goodness-of-fit analysis for the case study.

References.

Barbieri, M. M., Berger, J. O. et al. (2004). Optimal predictive model selection. The Annals of Statistics.

Berardi, V., Carretero-González, R., Bellettiere, J., Adams, M. A., Hughes, S. and Hovell, M. (2018). A Markov approach for increasing precision in the assessment of data-intensive behavioral interventions. Journal of Biomedical Informatics.

Bhadra, A., Datta, J., Polson, N. G., Willard, B. et al. (2019). Lasso meets horseshoe: A survey. Statistical Science.

Bolman, C., Verboon, P., Thewissen, V., Boonen, V., Soons, K. and Jacobs, N. (2018). Predicting smoking lapses in the first week of quitting: an ecological momentary assessment study. Journal of Addiction Medicine.

Brook, D. W., Brook, J. S., Zhang, C., Whiteman, M., Cohen, P. and Finch, S. J. (2008). Developmental trajectories of cigarette smoking from adolescence to the early thirties: personality and behavioral risk factors. Nicotine & Tobacco Research.

Brown, P. J., Vannucci, M. and Fearn, T. (1998). Multivariate Bayesian variable selection and prediction. Journal of the Royal Statistical Society: Series B (Statistical Methodology).

Businelle, M. S., Ma, P., Kendzor, D. E., Reitzel, L. R., Chen, M., Lam, C. Y., Bernstein, I. and Wetter, D. W. (2014). Predicting quit attempts among homeless smokers seeking cessation treatment: an ecological momentary assessment study. Nicotine & Tobacco Research.

Businelle, M. S., Ma, P., Kendzor, D. E., Frank, S. G., Vidrine, D. J. and Wetter, D. W. (2016). An ecological momentary intervention for smoking cessation: evaluation of feasibility and effectiveness. Journal of Medical Internet Research e321.

Cai, B. and Bandyopadhyay, D. (2017). Bayesian semiparametric variable selection with applications to periodontal data. Statistics in Medicine.

Canale, A., Lijoi, A., Nipoti, B. and Prünster, I. (2017). On the Pitman–Yor process with spike and slab base measure. Biometrika.

Carvalho, C. M., Polson, N. G. and Scott, J. G. (2009). Handling sparsity via the horseshoe. In Artificial Intelligence and Statistics.

Chen, Z. and Dunson, D. B. (2003). Random effects selection in linear mixed models. Biometrics.

Cheng, J., Edwards, L. J., Maldonado-Molina, M. M., Komro, K. A. and Muller, K. E. (2010). Real longitudinal data analysis for real people: building a good enough mixed model. Statistics in Medicine.

Chipman, H. (1996). Bayesian variable selection with related predictors. The Canadian Journal of Statistics.

Cursio, J. F., Mermelstein, R. J. and Hedeker, D. (2019). Latent trait shared-parameter mixed models for missing ecological momentary assessment data. Statistics in Medicine.

Dahl, D. B. and Muller, P. (2017). sdols: Summarizing Distributions of Latent Structures. R package version 1.4.

de Haan-Rietdijk, S., Kuppens, P., Bergeman, C. S., Sheeber, L., Allen, N. and Hamaker, E. (2017). On the use of mixed Markov models for intensive longitudinal data. Multivariate Behavioral Research.

Dunson, D. B., Herring, A. H. and Engel, S. M. (2008). Bayesian selection and clustering of polymorphisms in functionally related genes. Journal of the American Statistical Association.

Dziak, J. J., Li, R., Tan, X., Shiffman, S. and Shiyko, M. P. (2015). Modeling intensive longitudinal data with mixtures of nonparametric trajectories and time-varying effects. Psychological Methods.

Escobar, M. D. and West, M. (1995). Bayesian density estimation and inference using mixtures. Journal of the American Statistical Association.

Fitzmaurice, G. M., Laird, N. M. and Ware, J. H. (2012). Applied Longitudinal Analysis. John Wiley & Sons.

Geiser, C., Bishop, J., Lockhart, G., Shiffman, S. and Grenard, J. L. (2013). Analyzing latent state-trait and multiple-indicator latent growth curve models as multilevel structural equation models. Frontiers in Psychology.

Gelfand, A. E. (1996). Model determination using sampling-based methods. Markov Chain Monte Carlo in Practice.

Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science.

Gelman, A., Goegebeur, Y., Tuerlinckx, F. and Van Mechelen, I. (2000). Diagnostic checks for discrete data regression models using posterior predictive simulations. Journal of the Royal Statistical Society: Series C (Applied Statistics).

George, E. I. and McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association.

George, E. I. and McCulloch, R. E. (1997). Approaches for Bayesian variable selection. Statistica Sinica.

Hastie, T. and Tibshirani, R. (1993). Varying-coefficient models. Journal of the Royal Statistical Society: Series B (Methodological).

Heron, K. E. and Smyth, J. M. (2010). Ecological momentary interventions: Incorporating mobile technology into psychosocial and health behaviour treatments. British Journal of Health Psychology.

Hui, F. K., Müller, S. and Welsh, A. (2017). Hierarchical selection of fixed and random effects in generalized linear mixed models. Statistica Sinica.

Ishwaran, H., Rao, J. S. et al. (2005). Spike and slab variable selection: frequentist and Bayesian strategies. The Annals of Statistics.

Jones, R. H., Xu, S. and Grunwald, G. K. (2006). Continuous time Markov models for binary longitudinal data. Biometrical Journal.

Kim, S., Dahl, D. B. and Vannucci, M. (2009). Spiked Dirichlet process prior for Bayesian multiple hypothesis testing in random effects models. Bayesian Analysis (Online).

Kinney, S. K. and Dunson, D. B. (2007). Fixed and random effects selection in linear and logistic models. Biometrics.

Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A. and Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology.

Koslovsky, M. D., Hébert, E. T., Swartz, M. D., Chan, W., Leon-Novelo, L., Wilkinson, A. V., Kendzor, D. E. and Businelle, M. S. (2017). The time-varying relations between risk factors and smoking before and after a quit attempt. Nicotine & Tobacco Research.

Koslovsky, M. D., Swartz, M. D., Chan, W., Leon-Novelo, L., Wilkinson, A. V., Kendzor, D. E. and Businelle, M. S. (2018). Bayesian variable selection for multi-state Markov models with interval-censored data in an ecological momentary assessment study of smoking cessation. Biometrics.

Kürüm, E., Li, R., Shiffman, S. and Yao, W. (2016). Time-varying coefficient models for joint modeling binary and continuous outcomes in longitudinal data. Statistica Sinica.

Lang, S. and Brezger, A. (2004). Bayesian P-splines. Journal of Computational and Graphical Statistics.

Lanza, S. T., Vasilenko, S. A., Liu, X., Li, R. and Piper, M. E. (2013). Advancing the understanding of craving during smoking cessation attempts: A demonstration of the time-varying effect model. Nicotine & Tobacco Research.

Li, Y., Müller, P. and Lin, X. (2011). Center-adjusted inference for a nonparametric Bayesian random effect distribution. Statistica Sinica.

Luckett, D. J., Laber, E. B., Kahkoska, A. R., Maahs, D. M., Mayer-Davis, E. and Kosorok, M. R. (2019). Estimating dynamic treatment regimes in mobile health using V-learning. Journal of the American Statistical Association.

Ma, S. and Song, P. X.-K. (2015). Varying index coefficient models. Journal of the American Statistical Association.

Mason, M., Mennis, J., Way, T., Lanza, S., Russell, M. and Zaharakis, N. (2015). Time-varying effects of a text-based smoking cessation intervention for urban adolescents. Drug and Alcohol Dependence.

McCarthy, D. E., Ebssa, L., Witkiewitz, K. and Shiffman, S. (2016). Repeated measures latent class analysis of daily smoking in three smoking cessation studies. Drug and Alcohol Dependence.

McClure, J. B., Anderson, M. L., Bradley, K., An, L. C. and Catz, S. L. (2016). Evaluating an adaptive and interactive mHealth smoking cessation and medication adherence program: a randomized pilot feasibility study. JMIR mHealth and uHealth e94.

Meilă, M. (2003). Comparing clusterings by the variation of information. In Learning Theory and Kernel Machines.

Minami, H., Yeh, V. M., Bold, K. W., Chapman, G. B. and McCarthy, D. E. (2014). Relations among affect, abstinence motivation and confidence, and daily smoking lapse risk. Psychology of Addictive Behaviors.

Müller, P., Quintana, F. A. and Rosner, G. L. (2007). Semiparametric Bayesian inference for multilevel repeated measurement data. Biometrics.

Müller, S., Scealy, J. L., Welsh, A. H. et al. (2013). Model selection in linear mixed models. Statistical Science.

Nahum-Shani, I., Smith, S. N., Spring, B. J., Collins, L. M., Witkiewitz, K., Tewari, A. and Murphy, S. A. (2017). Just-in-time adaptive interventions (JITAIs) in mobile health: Key components and design principles for ongoing health behavior support. Annals of Behavioral Medicine.

Naughton, F., Hopewell, S., Lathia, N., Schalbroeck, R., Brown, C., Mascolo, C., McEwen, A. and Sutton, S. (2016). A context-sensing mobile phone app (Q sense) for smoking cessation: a mixed-methods study. JMIR mHealth and uHealth e106.

Neal, R. M. (2000). Markov chain sampling methods for Dirichlet process mixture models. Journal of Computational and Graphical Statistics.

Newton, M. A., Noueiry, A., Sarkar, D. and Ahlquist, P. (2004). Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics.

Piasecki, T. M., Fiore, M. C., McCarthy, D. E. and Baker, T. B. (2002). Have we lost our way? The need for dynamic formulations of smoking relapse proneness. Addiction.

Piasecki, T. M., Jorenby, D. E., Smith, S. S., Fiore, M. C. and Baker, T. B. (2003). Smoking withdrawal dynamics: II. Improved tests of withdrawal-relapse relations. Journal of Abnormal Psychology.

Piasecki, T. M., Trela, C. J., Hedeker, D. and Mermelstein, R. J. (2013). Smoking antecedents: Separating between- and within-person effects of tobacco dependence in a multiwave ecological momentary assessment investigation of adolescent smoking. Nicotine & Tobacco Research S119–S126.

Polson, N. G. and Scott, J. G. (2010). Shrink globally, act locally: Sparse Bayesian regularization and prediction. Bayesian Statistics.

Polson, N. G., Scott, J. G. and Windle, J. (2013). Bayesian inference for logistic models using Pólya–Gamma latent variables. Journal of the American Statistical Association.

Riley, W. T., Rivera, D. E., Atienza, A. A., Nilsen, W., Allison, S. M. and Mermelstein, R. (2011). Health behavior models in the age of mobile interventions: Are our theories up to the task? Translational Behavioral Medicine.

Rivera, D. E., Pew, M. D. and Collins, L. M. (2007). Using engineering control principles to inform the design of adaptive interventions: A conceptual introduction. Drug and Alcohol Dependence S31–S40.

Savitsky, T. and Vannucci, M. (2010). Spiked Dirichlet process priors for Gaussian process models. Journal of Probability and Statistics.

Savitsky, T., Vannucci, M. and Sha, N. (2011). Variable selection for nonparametric Gaussian process priors: Models and computational strategies. Statistical Science: A Review Journal of the Institute of Mathematical Statistics.

Scheipl, F. (2011). spikeSlabGAM: Bayesian variable selection, model choice and regularization for generalized additive mixed models in R. arXiv preprint arXiv:1105.5253.

Scheipl, F., Fahrmeir, L. and Kneib, T. (2012). Spike-and-slab priors for function selection in structured additive regression models. Journal of the American Statistical Association.

Selya, A. S., Updegrove, N., Rose, J. S., Dierker, L., Tan, X., Hedeker, D., Li, R. and Mermelstein, R. J. (2015). Nicotine-dependence-varying effects of smoking events on momentary mood changes among adolescents. Addictive Behaviors.

Sha, N., Tadesse, M. G. and Vannucci, M. (2006). Bayesian variable selection for the analysis of microarray data with censored outcomes. Bioinformatics.

Shiffman, S. (2013). Conceptualizing analyses of ecological momentary assessment data. Nicotine & Tobacco Research S76–S87.

Shiffman, S., Paty, J. A., Gnys, M., Kassel, J. A. and Hickcox, M. (1996). First lapses to smoking: within-subjects analysis of real-time reports. Journal of Consulting and Clinical Psychology.

Shiffman, S., Balabanis, M. H., Paty, J. A., Engberg, J., Gwaltney, C. J., Liu, K. S., Gnys, M., Hickcox, M. and Paton, S. M. (2000). Dynamic effects of self-efficacy on smoking lapse and relapse. Health Psychology.

Shiffman, S., Gwaltney, C. J., Balabanis, M. H., Liu, K. S., Paty, J. A., Kassel, J. D., Hickcox, M. and Gnys, M. (2002). Immediate antecedents of cigarette smoking: an analysis from ecological momentary assessment. Journal of Abnormal Psychology.

Shiyko, M. P., Lanza, S. T., Tan, X., Li, R. and Shiffman, S. (2012). Using the time-varying effect model (TVEM) to examine dynamic associations between negative affect and self confidence on smoking urges: Differences between successful quitters and relapsers. Prevention Science.

Shiyko, M., Naab, P., Shiffman, S. and Li, R. (2014). Modeling complexity of EMA data: time-varying lagged effects of negative affect on smoking urges for subgroups of nicotine addiction. Nicotine & Tobacco Research S144–S150.

Tan, X., Shiyko, M. P., Li, R., Li, Y. and Dierker, L. (2012). A time-varying effect model for intensive longitudinal data. Psychological Methods.

Timms, K. P., Rivera, D. E., Collins, L. M. and Piper, M. E. (2013). A dynamical systems approach to understanding self-regulation in smoking cessation behavior change. Nicotine & Tobacco Research S159–S168.

Trail, J. B., Collins, L. M., Rivera, D. E., Li, R., Piper, M. E. and Baker, T. B. (2014). Functional data analysis for dynamical system identification of behavioral processes. Psychological Methods.

Van Erp, S., Oberski, D. L. and Mulder, J. (2019). Shrinkage priors for Bayesian penalized regression. Journal of Mathematical Psychology.

Vasilenko, S. A., Piper, M. E., Lanza, S. T., Liu, X., Yang, J. and Li, R. (2014). Time-varying processes involved in smoking lapse in a randomized trial of smoking cessation therapies. Nicotine & Tobacco Research S135–S143.

Vehtari, A., Gelman, A. and Gabry, J. (2016). loo: Efficient leave-one-out cross-validation and WAIC for Bayesian models. R package version 0.1.

Vehtari, A., Gelman, A. and Gabry, J. (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing.

Wade, S., Ghahramani, Z. et al. (2018). Bayesian cluster analysis: Point estimation and credible balls (with discussion). Bayesian Analysis.

Walls, T. A. and Schafer, J. L. (2005). Models for Intensive Longitudinal Data. Oxford University Press.

Yang, M. (2012). Bayesian variable selection for logistic mixed model with nonparametric random effects. Computational Statistics & Data Analysis.

Department of Statistics
Colorado State University
Fort Collins, CO, USA
E-mail: [email protected]

Oklahoma Tobacco Research Center
The University of Oklahoma Health Sciences Center
655 Research Parkway, Suite 400
Oklahoma City, OK 73104
E-mail: [email protected]
E-mail: [email protected]