Interactive graphics for functional data analyses

Abstract

Although there are established graphics that accompany the most common functional data analyses, generating these graphics for each dataset and analysis can be cumbersome and time consuming. Often, the barriers to visualization inhibit useful exploratory data analyses and prevent the development of intuition for a method and its application to a particular dataset. The refund.shiny package was developed to address these issues for several of the most common functional data analyses. After conducting an analysis, the plot_shiny() function is used to generate an interactive visualization environment that contains several distinct graphics, many of which are updated in response to user input. These visualizations reduce the burden of exploratory analyses and can serve as a useful tool for the communication of results to non-statisticians.



Julia Wrobel, So Young Park, Ana Maria Staicu, and Jeff Goldsmith
Department of Biostatistics, Mailman School of Public Health, Columbia University
Department of Statistics, North Carolina State University
February 15, 2016


Key Words: Functional principal component analysis, multilevel functional data, longitudinal functional data, function-on-scalar regression.

Functional data analysis (FDA) has become a popular and useful framework for applications in which the unit of measurement is a function, curve or image. Conceptually, FDA leverages the underlying data structure, often temporal or spatial, to improve understanding of patterns and variation. A wide array of tools has been developed for the functional data setting, for example, functional principal component analysis (FPCA) and regression models using functional responses (Ramsay and Silverman, 2005; Morris, 2015; Sørensen et al., 2013). The basic unit of observation is the curve $Y_i(t)$ for subjects $i \in 1, \ldots, I$ in the cross-sectional setting and $Y_{ij}(t)$ for subject $i$ at visit $j \in 1, \ldots, J_i$ for the multilevel or longitudinal structure. Methods for functional data are typically presented in terms of continuous functions, but in practice data are observed on a discrete grid that may be sparse or dense at the subject level and that may be the same across subjects or irregular.

Many methods for FDA have standard visualization approaches that clarify the results of analyses; examples include scree plots for FPCA and coefficient function plots for function-on-scalar regression. Clear visualizations aid in exploratory analysis and help to communicate results to non-statistical collaborators. However, creating useful plots is often time consuming and must be repeated each time a model is changed, and no software currently exists to facilitate this process.

The refund.shiny package (Goldsmith and Wrobel, 2015) creates interactive visualizations for functional data analyses, allowing researchers to create common graphics for standard analyses with just a few lines of code.
Currently, refund.shiny builds plots for functional principal component analysis (FPCA), multilevel FPCA (MFPCA), time-varying FPCA (TV-FPCA), and function-on-scalar regression (FoSR). The workflow separates analysis and visualization steps: analyses are performed by functions in the refund package (Crainiceanu et al., 2015) and interactive visualizations are generated by the plot_shiny() function in the refund.shiny package. Changes to the analysis, such as increasing the number of retained principal components or augmenting a regression model with new predictors, are easily incorporated into the graphical interface. User interaction with the displayed graphics facilitates comparisons and streamlines navigation between visualizations.

We illustrate the tools in refund.shiny using a single dataset, which we describe briefly here. The diffusion tensor imaging (DTI) dataset available in the refund package includes cerebral white matter tracts for multiple sclerosis patients and healthy controls. White matter tracts are collections of axons, projections of neurons that transmit electrical signals and are coated by a fatty substance called myelin (Greven et al., 2010; Goldsmith et al., 2011; Staicu et al., 2012). DTI is a magnetic resonance imaging modality that measures diffusion of water in the brain; because water movement is restricted in white matter fibers, DTI allows the quantification of white matter tract integrity. The DTI dataset contains tract profiles (continuous summaries of tract properties along their major axis) for 142 subjects across multiple visits, with a median of 4 scans per subject. The dataset includes tract profiles for several tracts, the PASAT score (a continuous variable that indicates brain reactivity and attention span), subject sex, subject ID, visit number, and time of visit (Strauss et al., 2006). Because we observe tract profiles for each subject over time, the DTI dataset is a functional dataset with longitudinal structure; in order to use the same dataset across examples we sometimes neglect this structure or subset the data. The following code can be used to install refund and refund.shiny and load the DTI data:

> install.packages("refund.shiny")
> library(refund.shiny)
> library(refund)
> data(DTI)
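For readers without the packages at hand, the layout of a longitudinal functional dataset can be sketched with simulated values. The column names below imitate those described for the DTI data, but the sizes, the values, and the presence of missing entries are assumptions made purely for illustration.

```r
# Toy layout of a longitudinal functional dataset (base R only).
# Column names imitate the DTI description; all values are simulated and
# all sizes are arbitrary assumptions.
set.seed(5)
n_subj <- 6
visits <- c(1, 3, 4, 4, 5, 2)                # number of visits per subject
D <- 20                                      # grid points per profile

toy <- data.frame(
  ID = rep(seq_len(n_subj), times = visits),
  visit = sequence(visits)
)
toy$cca <- matrix(rnorm(nrow(toy) * D), nrow(toy), D)  # one row per curve
toy$cca[2, 5:7] <- NA                        # assumed: profiles may have gaps

dim(toy$cca)                                 # curves by grid points
visits_per_subject <- as.vector(table(toy$ID))
median(visits_per_subject)                   # median number of visits
sum(!complete.cases(toy$cca))                # curves with any missing value
```

Storing the curves as a matrix column keeps one row per curve next to its scalar covariates, which is the shape that a formula such as `cca ~ pasat + sex` relies on.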

Sections 2, 3, 4, and 5 each provide a brief methodological overview of an analysis technique for FDA and describe the corresponding interactive visualization tools in the refund.shiny package. Section 6 details the structure of the refund.shiny package. We close in Section 7 with a discussion.

We start with FPCA, one of the most common exploratory tools for functional datasets.

FPCA characterizes modes of variability by decomposing functional observations into population-level basis functions and subject-specific scores (Ramsay and Silverman, 2005). The basis functions have a clear interpretation, analogous to that of PCA: the first basis function explains the largest direction of variation, and each subsequent basis function describes less. The FPCA model is typically written

$$Y_i(t) = \mu(t) + \sum_{k=1}^{K} c_{ik}\,\psi_k(t) + \epsilon_i(t) \tag{1}$$

where $\mu(t)$ is the population mean, $\psi_k(t)$ are a set of orthonormal population-level basis functions, $c_{ik}$ are subject-specific scores with mean zero and variance $\lambda_k$, and $\epsilon_i(t)$ are residual curves. Estimated basis functions $\hat\psi_1(t), \hat\psi_2(t), \ldots, \hat\psi_K(t)$ and corresponding variances $\hat\lambda_1 \geq \hat\lambda_2 \geq \ldots \geq \hat\lambda_K$ are obtained from a truncated Karhunen-Loève decomposition of the sample covariance $\hat\Sigma(s, t) = \widehat{\mathrm{Cov}}(Y_i(s), Y_i(t))$. In practice, the covariance $\hat\Sigma(s, t)$ is often smoothed using a bivariate smoother that omits entries on the main diagonal to avoid a “nugget effect” attributable to measurement error, and scores are estimated in a mixed model framework (Yao et al., 2005; Goldsmith et al., 2013). The truncation lag $K$ is often chosen so that the resulting approximation accounts for at least 95% of observed variance.

Our example uses the fpca.sc() function from the refund package. Several other implementations of FPCA are available in refund, including fpca.face(), fpca.ssvd(), and fpca2s(), all of which are compatible with refund.shiny. The number of functional principal components (FPCs) is chosen by percent variance explained, with the default set to 0.99. See ?plot_shiny for examples.

Graphics for FPCA are implemented by the code below:

> fit.fpca = fpca.sc(Y = DTI$cca)
> plot_shiny(obj = fit.fpca)

Executing this code produces a user interface with five tabs.
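Before turning to the individual tabs, the decomposition in model (1) can be illustrated with a minimal base R sketch on simulated, densely observed curves. This is not the fpca.sc() implementation: it skips covariance smoothing and mixed-model score estimation, and every object name and simulation setting is an assumption chosen for illustration.

```r
# Minimal FPCA sketch on simulated, densely observed curves (base R only).
# All object names and simulation settings are illustrative assumptions.
set.seed(1)
n <- 200                                   # number of subjects
D <- 50                                    # grid points
tt <- seq(0, 1, length.out = D)            # common observation grid

# True mean and two (approximately orthonormal) basis functions
mu   <- sin(2 * pi * tt)
psi1 <- rep(1, D)                          # vertical shift
psi2 <- sqrt(2) * cos(2 * pi * tt)
scores <- cbind(rnorm(n, sd = 2), rnorm(n, sd = 1))   # lambda = (4, 1)

Y <- matrix(mu, n, D, byrow = TRUE) +
  scores %*% rbind(psi1, psi2) +
  matrix(rnorm(n * D, sd = 0.1), n, D)     # residual curves

# Step 1: estimate the mean and center the curves
mu_hat <- colMeans(Y)
Yc <- sweep(Y, 2, mu_hat)

# Step 2: eigendecompose the sample covariance (no smoothing in this sketch)
Sigma_hat <- crossprod(Yc) / (n - 1)
eig <- eigen(Sigma_hat, symmetric = TRUE)
w <- 1 / D                                 # approximate quadrature weight
lambda_hat <- pmax(eig$values, 0) * w      # eigenvalues on the function scale
psi_hat <- eig$vectors / sqrt(w)           # rescaled so sum(psi^2) * w = 1

# Step 3: keep the smallest K explaining at least 95% of variance
pve <- cumsum(lambda_hat) / sum(lambda_hat)
K <- which(pve >= 0.95)[1]

# Step 4: scores by numerical integration, then fitted curves
c_hat <- Yc %*% psi_hat[, 1:K, drop = FALSE] * w
Y_hat <- matrix(mu_hat, n, D, byrow = TRUE) +
  c_hat %*% t(psi_hat[, 1:K, drop = FALSE])
```

Plotting `mu_hat + 2 * sqrt(lambda_hat[1]) * psi_hat[, 1]` against `tt` gives a rough analogue of the mean-plus-or-minus-component display, under the same assumptions.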
The first tab shows $\hat\mu(t) \pm \sqrt{\hat\lambda_k}\,\hat\psi_k(t)$, and includes a drop-down menu through which the user can select $k$ (an example for a similar tab, based on multilevel data, is shown in Section 3). The second tab presents static scree plots of the eigenvalues $\hat\lambda_k$ and the percent variance explained by each eigenvalue. The third tab shows $\hat\mu(t) + \sum_{k=1}^{K} c_k \hat\psi_k(t)$, and includes slider bars through which the values of $c_k$ can be set; adjusting the sliders allows the user to see a fitted curve for a hypothetical subject with the selected combination of scores. The fourth tab allows users to assess quality-of-fit by plotting fitted and observed values for any subject in the dataset.

The fifth tab for the interactive graphic produced by the code above is shown as a static plot in Figure 1. A scatterplot of estimated FPC loadings $\hat c_{ik}$ against $\hat c_{ik'}$ is shown in the upper plot, and $k$ and $k'$ are selected using drop-down menus at the left. The lower plot shows fitted curves for all subjects. In the scatterplot, a subset of FPC loadings can be selected by clicking-and-dragging to create a blue box; blue curves in the plot of fitted values correspond to selected subjects in the upper plot. In Figure 1 the first and second FPCs are selected for the x and y axes of the score plot, respectively, and several subjects that have negative values for FPC 1 are highlighted. Fitted values for these subjects are clustered at the top of the y-axis, indicating that the first FPC largely represents a vertical shift from the mean. A working example of refund.shiny for FPCA on a different dataset is available at https://jeff-goldsmith.shinyapps.io/FPCA.

Figure 1: Screenshot showing tab 5 of the interactive graphics for FPCA. A scatterplot of FPC loadings $\hat c_{ik}$ against $\hat c_{ik'}$ is shown in the upper plot, and $k$ and $k'$ are selected using drop-down menus at the left. The lower plot shows fitted curves for all subjects. In the scatterplot, a subset of estimated loadings can be selected by clicking-and-dragging to create a blue box; blue curves in the plot of fitted values correspond to selected points in the upper plot.

Multilevel functional principal component analysis (MFPCA) extends the ideas of FPCA to functional data with a multilevel structure.

Multilevel functional data are increasingly common in practice; in the case of our DTI example, this structure arises from multiple clinical visits made by each subject. MFPCA models the within-subject correlation induced by repeated measures as well as the between-subject correlation modeled by classic FPCA. This leads to a two-level FPC decomposition, where level 1 concerns subject-specific effects and level 2 concerns visit-specific effects. Population-level basis functions and subject-specific scores are calculated for both levels (Di et al., 2009, 2014). The MFPCA model is:

$$Y_{ij}(t) = \mu(t) + \eta_j(t) + \sum_{k=1}^{K_1} c^{(1)}_{ik}\,\psi^{(1)}_k(t) + \sum_{k=1}^{K_2} c^{(2)}_{ijk}\,\psi^{(2)}_k(t) + \epsilon_{ij}(t) \tag{2}$$

where $\mu(t)$ is the population mean, $\eta_j(t)$ is the visit-specific shift from the overall mean, $\psi^{(1)}_k(t)$ and $\psi^{(2)}_k(t)$ are the eigenfunctions for levels 1 and 2, respectively, and $c^{(1)}_{ik}$ and $c^{(2)}_{ijk}$ are the subject-specific and subject-visit-specific scores. Often, visit-specific means $\eta_j(t)$ are not of interest and can be omitted from the model. Estimation for MFPCA extends the approach for FPCA: estimated between- and within-covariances $\hat\Sigma^{(1)}(s, t) = \widehat{\mathrm{Cov}}(Y_{ij}(s), Y_{ij'}(t))$ for $j \neq j'$ and $\hat\Sigma^{(2)}(s, t) = \widehat{\mathrm{Cov}}(Y_{ij}(s), Y_{ij}(t))$ are derived from the observed data, smoothed, and decomposed to obtain eigenfunctions and eigenvalues. Given these objects, scores are estimated in a mixed-model framework.

MFPCA is implemented in the mfpca.sc() function from the refund package. By default, mfpca.sc() does not calculate visit-means, but they can be calculated by specifying the mfpca.sc() argument twoway = TRUE.

Graphics for MFPCA are implemented by the code below:

> Y = DTI$cca
> id = DTI$ID
> fit.mfpca = mfpca.sc(Y = Y, id = id, twoway = FALSE)
> plot_shiny(fit.mfpca)

This code produces an interface with five tabs, which is similar to the interface for FPCA but includes features unique to multilevel analyses. Tabs 1, 2, 3, and 5 for MFPCA are $\hat\mu(t) \pm \sqrt{\hat\lambda^{(L)}_{k_L}}\,\hat\psi^{(L)}_{k_L}(t)$, static scree plots of the estimated eigenvalues $\hat\lambda^{(L)}_{k_L}$, $\hat\mu(t) + \sum_{k_L=1}^{K_L} c^{(L)}_{k_L} \hat\psi^{(L)}_{k_L}(t)$, and scatterplots of FPC scores (similar to Figure 1), respectively. These mirror the tabs for FPCA and include inset sub-tabs that toggle the level $L$, displaying results for level 1 or level 2. The fourth tab plots fitted and observed values for any user-selected subject in the dataset; the user can display all visits for the selected subject or choose a subset of visits. The first tab for the interactive visualization produced by the code above is displayed in Figure 2, and shows $\hat\mu(t) \pm \sqrt{\hat\lambda^{(1)}_2}\,\hat\psi^{(1)}_2(t)$.

Figure 2: Screenshot showing tab 1 of the interactive graphic for MFPCA. The plot at right shows $\hat\mu(t) \pm \sqrt{\hat\lambda^{(L)}_{k_L}}\,\hat\psi^{(L)}_{k_L}(t)$; $k_L$ is chosen by the drop-down menu at left, and the user can switch between levels $L$ by clicking the Level 1 or Level 2 inset tabs at the top left.
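The moment-based covariance estimates behind model (2) can be sketched in base R on a balanced toy design. This is not the mfpca.sc() implementation: no smoothing is applied, the level 2 covariance is taken as the total covariance minus the between covariance (the construction used by Di et al., 2009), and all names and simulation settings are assumptions.

```r
# Moment-based sketch of the two covariances used by MFPCA (base R only).
# Balanced toy design; all names and settings are illustrative assumptions.
set.seed(2)
I <- 100; J <- 4; D <- 40
tt <- seq(0, 1, length.out = D)
id <- rep(seq_len(I), each = J)

psi_b <- rep(1, D)                          # level 1 (subject) direction
psi_w <- sqrt(2) * sin(2 * pi * tt)         # level 2 (visit) direction
c1 <- rnorm(I, sd = 2)                      # subject-specific scores
c2 <- rnorm(I * J, sd = 1)                  # subject-visit-specific scores

Y <- outer(c1[id], psi_b) + outer(c2, psi_w) +
  matrix(rnorm(I * J * D, sd = 0.1), I * J, D)
Yc <- sweep(Y, 2, colMeans(Y))              # remove the overall mean

# Total covariance: Cov(Y_ij(s), Y_ij(t))
Sigma_total <- crossprod(Yc) / nrow(Yc)

# Between covariance: Cov(Y_ij(s), Y_ij'(t)) for j != j'.
# Within a subject, the sum over ordered pairs (j, j') with j != j' equals
# (sum_j Y_ij)(sum_j Y_ij)^T - sum_j Y_ij Y_ij^T.
S <- rowsum(Yc, id)                         # I x D matrix of per-subject sums
Sigma_between <- (crossprod(S) - crossprod(Yc)) / (I * J * (J - 1))

# Level 2 covariance as total minus between
Sigma_within <- Sigma_total - Sigma_between

w <- 1 / D                                  # approximate quadrature weight
eig1 <- eigen(Sigma_between, symmetric = TRUE)
eig2 <- eigen(Sigma_within, symmetric = TRUE)
lambda1_hat <- eig1$values[1] * w           # population value is 4 here
lambda2_hat <- eig2$values[1] * w           # population value is about 1 here
```

The leading eigenvector of `Sigma_between` should be close to a constant function (the subject-level shift), while that of `Sigma_within` should resemble the sine-shaped visit-level direction used in the simulation.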

Time-varying functional principal component analysis (TV-FPCA) extends the ideas of FPCA to model functional data that are observed repeatedly in a longitudinal framework. In contrast to MFPCA, TV-FPCA accounts for the actual time of visit $T_{ij}$ at which the functional object $Y_{ij}(\cdot)$ is recorded; this allows us to study the time-varying behavior of the underlying true process and make predictions of the full trajectory at an unobserved visit time (Park and Staicu, 2015). Other modeling methods for longitudinal functional data that incorporate the actual visit times $T_{ij}$ include Greven et al. (2010) and Chen and Müller (2012).

The TV-FPCA model (Park and Staicu, 2015) for $Y_{ij}(t) = Y_i(t, T_{ij})$ is given as follows:

$$Y_{ij}(t) = \mu(t, T_{ij}) + \sum_{k=1}^{K} c_{ik}(T_{ij})\,\psi_k(t) + \epsilon_{ij}(t), \tag{3}$$

where $\mu(t, T_{ij})$ is the population mean that is assumed to vary smoothly over $t$ and visit time $T_{ij}$, $\psi_k(t)$ are orthogonal basis functions, $c_{ik}(T_{ij})$ are corresponding loadings that vary over $T_{ij}$ with mean zero and variance $\lambda_k$, and $\epsilon_{ij}(t)$ are residual curves. The time-varying scores $c_{ik}(T_{ij})$ are uncorrelated over $i$, but correlated over $j$. Estimation of the TV-FPCA model components entails: 1) estimation of the population mean by bivariate smoothing; 2) estimation of the marginal covariance $\Sigma(s, t) = \int \mathrm{Cov}\{Y_i(s, T), Y_i(t, T)\}\, g(T)\, dT$, where $g(T)$ is the density of the $T_{ij}$'s, using the observed data, then smoothing and decomposing it to get the eigenfunctions and eigenvalues $\hat\psi_k(t)$ and $\hat\lambda_k$; 3) estimation of the $k$th component covariance $\hat G_k(T, T') = \mathrm{Cov}\{c_{ik}(T), c_{ik}(T')\}$. The last step is carried out using either linear random effects, implying $c_{ik}(T) = b^{(k)}_{0i} + b^{(k)}_{1i} T$, or FPCA, implying $c_{ik}(T) = b_{ik1}\phi_{k1}(T) + \ldots + b_{ikL_k}\phi_{kL_k}(T)$.
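Under the linear option in step 3, predicting a curve at an unobserved visit time reduces to a plug-in calculation. The base R sketch below illustrates that idea for a single simulated subject. It is not the fpca.lfda() implementation: the mean surface and basis functions are treated as known, and a per-subject least-squares fit stands in for the linear random-effects model. All names and values are assumptions.

```r
# Plug-in prediction of a full trajectory at a new visit time (base R only).
# Toy assumptions throughout: known mean surface and basis functions, and a
# per-subject least-squares line standing in for the random-effects model.
set.seed(3)
D <- 30
s <- (seq_len(D) - 0.5) / D                 # midpoint grid on [0, 1]
w <- 1 / D                                  # quadrature weight
mu_fun <- function(s, T_visit) sin(2 * pi * s) + 0.5 * T_visit
psi <- cbind(rep(1, D), sqrt(2) * cos(2 * pi * s))   # orthonormal on this grid

# One subject observed at four visit times, with scores linear in visit time
T_obs <- c(0, 0.2, 0.5, 0.9)
b0 <- c(1.0, -0.5); b1 <- c(2.0, 1.0)       # intercepts and slopes, one per k
scores_true <- sapply(1:2, function(k) b0[k] + b1[k] * T_obs)   # 4 x 2
Y_obs <- t(sapply(seq_along(T_obs), function(j) {
  mu_fun(s, T_obs[j]) + drop(psi %*% scores_true[j, ]) + rnorm(D, sd = 0.05)
}))                                         # 4 x D matrix of observed curves

# Scores at the observed visits by numerical integration against each psi_k
mu_obs <- t(sapply(T_obs, function(tv) mu_fun(s, tv)))
scores_hat <- (Y_obs - mu_obs) %*% psi * w  # 4 x 2

# Linear model for each score as a function of visit time
fits <- lapply(1:2, function(k) lm(scores_hat[, k] ~ T_obs))

# Predict the scores, then the full curve, at an unobserved visit time
T_new <- 0.7
c_new <- sapply(fits, function(f) sum(coef(f) * c(1, T_new)))
Y_new <- mu_fun(s, T_new) + drop(psi %*% c_new)
```

A mixed-model fit would additionally shrink each subject's intercept and slope toward the population values, which matters most when a subject has few visits.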
By modeling these longitudinal dynamics, the time-varying coefficient function $c_{ik}(\cdot)$ can be used to predict scores at any longitudinal time $T$ and, as a result, to predict the full response trajectory $Y_i(\cdot, T)$.

TV-FPCA is implemented in the fpca.lfda() function in the refund package. In Section 4.1, we have used $t$ to denote the functional argument for consistency with the rest of the paper; however, to maintain consistency with the notation used in Park and Staicu (2015), the plot_shiny() function for TV-FPCA uses $s$ to denote the functional argument and $T$ to denote the longitudinal time.

Graphics for TV-FPCA are implemented by the code below:

> MS <- subset(DTI, case == 1)
> # fill missing entries of the profiles with FPCA fitted values
> index.na <- which(is.na(MS$cca))
> Y <- MS$cca
> Y[index.na] <- fpca.sc(Y)$Yhat[index.na]
> id <- MS$ID
> visit.index <- MS$visit
> visit.time <- MS$visit.time/max(MS$visit.time)
> fit.tfpca <- fpca.lfda(Y = Y, subject.index = id,
+     visit.index = visit.index, obsT = visit.time,
+     LongiModel.method = 'lme')
> plot_shiny(fit.tfpca)

The code produces an interface with two tabs. Tab 1 shows exploratory plots and includes three inset sub-tabs. The first sub-tab, shown in Figure 3, plots the observed curves for any user-selected subject, and includes options to display the observed curves of all subjects in the background and to display the estimated pointwise mean curve, denoted by $m(t)$. The second sub-tab allows the user to see the longitudinal changes of the observed curves for a user-selected subject $i$; a slider bar animates the subject's visit times and highlights the corresponding observed curve in the plot. The last sub-tab shows two plots of the actual visit times $T_{ij}$: the bottom plot presents a static histogram of the visit times of all subjects, while the top plot presents all observed visit times on a horizontal line to help visualize the sparsity of the longitudinal sampling.

Tab 2 shows estimated model components and predictions, and includes 8 inset sub-tabs. Sub-tabs 1 and 2 present static images of the estimated mean surface $\hat\mu(t, T)$ and estimated marginal covariance $\hat\Sigma(s, t)$. Sub-tabs 3, 4, and 5 illustrate the first step of estimation, and plot estimates of eigenfunctions $\hat\psi_k(t)$, $m(t) \pm \sqrt{\hat\lambda_k}\,\hat\psi_k(t)$, and static scree plots of the estimated eigenvalues $\hat\lambda_k$, respectively. Sub-tab 6 shows the estimated covariance of the time-varying loadings $c_{ik}(\cdot)$ for user-specified $k$. Sub-tab 7 shows the prediction of $c_{ik}(T)$ for any user-selected subject $i$ and component $k$; it also has an option of displaying predicted values of $c_{ik}(T)$ for all subjects in the background. Lastly, sub-tab 8 shows the prediction of a full response trajectory $Y_i(\cdot, T)$ for a user-selected subject $i$, animated across a grid of 21 equally spaced values of $T$ in the range of observed visit times of all subjects.

Figure 3: Screenshot showing Tab 1 of the interactive graphic for TV-FPCA. The plot shows observed data of the selected subject.

In many cases, a length-$p$ vector of scalar covariates $x_i = [x_{i1}, \ldots, x_{ip}]$ is observed in addition to the function $Y_i(t)$. In these situations, it is often of interest to model the conditional expectation of the functional response as it depends on the scalar predictors; indeed, this problem has been the focus of a large literature (Brumback and Rice, 1998; Guo, 2002; Morris et al., 2003; Morris and Carroll, 2006; Reiss et al., 2010; Scheipl et al., 2015; Goldsmith and Kitago, 2015; Goldsmith et al., 2015).

5.1 FoSR Model

The most common function-on-scalar regression model is

$$Y_i(t) = \beta_0(t) + \sum_{k=1}^{p} x_{ik}\,\beta_k(t) + \epsilon_i(t) \tag{4}$$

where the $\beta_k(t)$ are fixed effects associated with scalar covariates and the $\epsilon_i(t)$ are residual curves.
The coefficients $\beta_k(t)$ are interpreted analogously to coefficients in a (non-functional) multiple linear regression, as the expected change in response for each one-unit change in the predictor, with the exception that they, like the outcome, are defined over $t$. Many estimation and inferential strategies are available for model (4); a popular approach is to expand coefficients $\beta_k(t)$ using a spline basis, which allows one to recast (4) as a traditional linear regression model and focus estimation on a vector of unknown spline coefficients.

Our example uses the bayes_fosr() function in the refund package, which uses a rich cubic B-spline basis and estimates spline coefficients in a Bayesian framework with priors specified to enforce smoothness in the resulting coefficient functions. Both a Gibbs sampler and a computationally efficient variational approximation are available in refund.

Graphics for FoSR are implemented by the code below:

> DTI = DTI[complete.cases(DTI),]
> fit.fosr = bayes_fosr(cca ~ pasat + sex, data = DTI)
> plot_shiny(fit.fosr)
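The basis-expansion idea that recasts model (4) as an ordinary linear regression can be sketched on simulated data. The example below is an unpenalized least-squares stand-in, not the Bayesian estimator in bayes_fosr(); the covariates, the coefficient functions, the basis dimension, and all object names are assumptions made for illustration.

```r
# Function-on-scalar regression by basis expansion and least squares.
# Unpenalized, non-Bayesian stand-in for illustration only; all names and
# simulation settings are assumptions.
library(splines)                             # distributed with R

set.seed(4)
n <- 200; D <- 50
tt <- seq(0, 1, length.out = D)
x1 <- rnorm(n)                               # continuous covariate
x2 <- rbinom(n, 1, 0.5)                      # binary covariate
X <- cbind(1, x1, x2)                        # n x 3 design matrix

beta_true <- rbind(sin(2 * pi * tt),         # beta_0(t)
                   tt^2,                     # beta_1(t)
                   cos(2 * pi * tt))         # beta_2(t)
Y <- X %*% beta_true + matrix(rnorm(n * D, sd = 0.1), n, D)

# Cubic B-spline basis on the grid: beta_k(t) = Theta(t) %*% b_k
Theta <- bs(tt, df = 10, intercept = TRUE)   # D x 10

# Y = X B t(Theta) + E, so the least-squares solution is
# B_hat = (X'X)^{-1} X' Y Theta (Theta'Theta)^{-1}
B_hat <- solve(crossprod(X), crossprod(X, Y)) %*% Theta %*%
  solve(crossprod(Theta))
beta_hat <- B_hat %*% t(Theta)               # 3 x D estimated coefficient functions

# Estimated conditional mean curve for a chosen covariate vector
x_new <- c(1, 0.5, 1)                        # intercept, x1 = 0.5, x2 = 1
mean_curve <- drop(x_new %*% beta_hat)
```

Changing `x_new` and re-plotting `mean_curve` against `tt` mimics, in a static way, what slider bars and drop-down menus do for an estimated conditional expectation.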

The plot_shiny(fit.fosr) call produces an interface with four tabs, each showing plots associated with model (4). The first tab is a plot of the observed data with the option to color curves by a user-selected covariate; this builds intuition analogously to scatterplots for non-functional regression. The second tab shows $\hat\beta_0(t) + \sum_{k=1}^{p} x_k \hat\beta_k(t)$, where values of $x_k$ can be set by slider bars for continuous covariates or drop-down menus for categorical covariates; adjusting the sliders or drop-down menus shows the estimated conditional expectation for a specified predictor vector. The third tab, illustrated in Figure 4, shows estimated coefficient functions $\hat\beta_k(t)$ with pointwise confidence intervals for the covariate $x_k$ selected in a drop-down menu. The fourth tab is a plot of the residual curves $\hat\epsilon_i(t)$ and allows for identification of median and outlying curves by band depth (Lopez-Pintado and Romo, 2009; Sun and Genton, 2011; Sun et al., 2012); the user can also choose to 'rainbowize by depth', which colors the curves from the median outward based on depth.

Figure 4: Screenshot showing tab 3 of the interactive graphic for FoSR. The plot shows the estimated coefficient function $\hat\beta_k(t)$ for the selected covariate $x_k$ with pointwise confidence intervals.

refund.shiny Package

We now briefly describe the code infrastructure used to create the refund.shiny package.

As indicated in the introduction, the workflow separates visualization from analysis in the following way. First, one analyzes a dataset using a function in the refund package. The functions in refund take discretely observed functional data as input, perform an analysis, and return an object whose class corresponds to the method used. For example, the fpca.sc() function returns an object of class fpca and the bayes_fosr() function returns an object of class fosr. The primary function in refund.shiny, plot_shiny(), is a generic function whose behavior depends on the class of the object passed as an argument. Because of this structure, the user experience is uniform across a variety of analyses; this also suggests a development strategy for the addition of interactive graphics as new analysis techniques become available. Lastly, by separating the analysis and visualization steps, it is possible for analysis functions developed outside of the refund package to return objects of a defined class and thereby take advantage of the plotting capabilities we describe.

The interactive graphics in refund.shiny are built on RStudio's R package shiny (RStudio Inc., 2015), which significantly reduces the barriers to producing webpage-style representations of analysis results in R. Other examples of interactive graphics that utilize the shiny framework are shinyMethyl (Fortin et al., 2014) for visualization of high-dimensional genomic data and shinystan (Stan Development Team, 2015) for exploring Bayesian models fit using Markov Chain Monte Carlo. In refund.shiny the plots within tabs are produced using ggplot2 (Wickham and Chang, 2015); it is possible to export each plot as a PDF or to save the corresponding ggplot object to the user's R workspace for further manipulation.

Visualization has long been acknowledged as a central tool in data analysis. For functional datasets, the need for useful graphics is compounded: data are inherently complex, high-dimensional and structured. Although a robust literature for functional data exists and many methods have standard graphical representations, the creation of these graphics is often time consuming. The refund.shiny package was developed to ease this process by producing a visualization framework for several common functional data analyses. By leveraging new tools for interactivity, refund.shiny responds to user input and actions and, in so doing, can build intuition for analyses in both statisticians and practitioners. The interfaces produced by refund.shiny using the shiny framework are web applications, rendered locally by a web browser. These applications can be hosted publicly and may, in the spirit of “visuanimations” (Genton et al., 2015), be included as important parts of scientific papers and reports.

We use an analytic workflow that separates modeling from visualization. Doing so allows several methods and implementations to take advantage of the same visualization software; as an example, fpca.sc(), fpca.face(), fpca.ssvd(), and fpca2s() implement different methods for FPCA but are all compatible with plot_shiny(). This produces an intuitive user experience and leaves open the possibility for future approaches to FPCA or FoSR to use the refund.shiny package for visualization with minimal effort. Similarly, this workflow is amenable to the development of interactive visualizations for additional functional data analyses in future iterations of the package.

Acknowledgments

The third author's research was supported partially by National Science Foundation DMS 0454942 and National Institutes of Health grants R01 NS085211 and R01 MH086633. The last author's research was supported in part by Award R01HL123407 from the National Heart, Lung, and Blood Institute and by Award R21EB018917 from the National Institute of Biomedical Imaging and Bioengineering. The MRI/DTI data were collected at Johns Hopkins University and the Kennedy-Krieger Institute.

References

Brumback, B. and Rice, J. “Smoothing spline models for the analysis of nested and crossed samples of curves.” Journal of the American Statistical Association, 93:961–976 (1998).

Chen, K. and Müller, H.-G. “Modeling Repeated Functional Observations.” Journal of the American Statistical Association, 107:1599–1609 (2012).

Crainiceanu, C., Reiss, P., Goldsmith, J., Gellar, J., J, H., McLean, M. W., Swihart, B., Xiao, L., Chen, Y., Greven, S., Kundu, M. G., Wrobel, J., Huang, L., Huo, L., and Scheipl, F. refund: Regression with Functional Data (2015). R package version 0.1-13. URL http://CRAN.R-project.org/package=refund

Di, C.-Z., Crainiceanu, C. M., Caffo, B. S., and Punjabi, N. M. “Multilevel Functional Principal Component Analysis.” Annals of Applied Statistics, 4:458–488 (2009).

Di, C.-Z., Crainiceanu, C. M., and Jank, S. J. “Multilevel Sparse Functional Principal Component Analysis.” Stat, 3:126–143 (2014).

Fortin, J. P., Fertig, E., and Hansen, K. “shinyMethyl: interactive quality control of Illumina 450k DNA methylation arrays in R [version 1; referees: 2 approved].” F1000Research, 3:175 (2014).

Genton, M. G., Castruccio, S., Crippa, P., Dutta, S., Huser, R., Sun, Y., and Vettori, S. “Visuanimation in statistics.” Stat, 4:81–96 (2015).

Goldsmith, J., Bobb, J., Crainiceanu, C. M., Caffo, B., and Reich, D. “Penalized Functional Regression.” Journal of Computational and Graphical Statistics, 20:830–851 (2011).

Goldsmith, J., Greven, S., and Crainiceanu, C. M. “Corrected Confidence Bands for Functional Data using Principal Components.” Biometrics, 69:41–51 (2013).

Goldsmith, J. and Kitago, T. “Assessing Systematic Effects of Stroke on Motor Control using Hierarchical Function-on-Scalar Regression.” Journal of the Royal Statistical Society: Series C, To Appear (2015).

Goldsmith, J. and Wrobel, J. refund.shiny: Interactive plotting for functional data analyses (2015). R package version 0.1.

Goldsmith, J., Zipunnikov, V., and Schrack, J. “Generalized multilevel function-on-scalar regression and principal component analysis.” Biometrics, 71:344–353 (2015).

Greven, S., Crainiceanu, C. M., Caffo, B., and Reich, D. “Longitudinal Functional Principal Component Analysis.” Electronic Journal of Statistics, 4:1022–1054 (2010).

Guo, W. “Functional mixed effects models.” Biometrics, 58:121–128 (2002).

Lopez-Pintado, S. and Romo, J. “On The Concept of Depth for Functional Data.” Journal of the American Statistical Association, 104:486–503 (2009).

Morris, J. S. “Functional Regression Analysis.” Annual Review of Statistics and Its Application, 2(1) (2015).

Morris, J. S. and Carroll, R. J. “Wavelet-based functional mixed models.” Journal of the Royal Statistical Society: Series B, 68:179–199 (2006).

Morris, J. S., Vannucci, M., Brown, P. J., and Carroll, R. J. “Wavelet-Based Nonparametric Modeling of Hierarchical Functions in Colon Carcinogenesis.” Journal of the American Statistical Association, 98:573–583 (2003).

Park, S. and Staicu, A.-M. “Longitudinal functional data analysis.” Stat, 4:212–226 (2015).

Ramsay, J. O. and Silverman, B. W. Functional Data Analysis. New York: Springer (2005).

Reiss, P. T., Huang, L., and Mennes, M. “Fast Function-on-Scalar Regression with Penalized Basis Expansions.” International Journal of Biostatistics, 6:Article 28 (2010).

RStudio Inc. shiny: Web Application Framework for R (2015). R package version 0.12.2. URL http://CRAN.R-project.org/package=shiny

Scheipl, F., Staicu, A.-M., and Greven, S. “Functional additive mixed models.” Journal of Computational and Graphical Statistics, To Appear (2015).

Sørensen, H., Goldsmith, J., and Sangalli, L. “An Introduction with Medical Applications to Functional Data Analysis.” Statistics in Medicine, 32:5222–5240 (2013).

Staicu, A.-M., Crainiceanu, C. M., Reich, D. S., and Ruppert, D. “Modeling functional data with spatially heterogeneous shape characteristics.” Biometrics, 68(2):331–343 (2012).

Stan Development Team. shinystan: Interactive Visual and Numerical Diagnostics and Posterior Analysis for Bayesian Models (2015). R package version 2.0.1. URL http://CRAN.R-project.org/package=shinystan

Strauss, E., Sherman, E., and Spreen, O. Compendium of neuropsychological tests: Administration, norms, and commentary. New York: Oxford University Press (2006).

Sun, Y. and Genton, M. G. “Functional boxplots.” Journal of Computational and Graphical Statistics, 20(2) (2011).

Sun, Y., Genton, M. G., and Nychka, D. W. “Exact fast computation of band depth for large functional datasets: How quickly can one million curves be ranked?” Stat, 1(1):68–74 (2012).

Wickham, H. and Chang, W. ggplot2: An Implementation of the Grammar of Graphics (2015). R package version 1.0.1. URL http://CRAN.R-project.org/package=ggplot2