Movement Prediction Using Accelerometers in a Human Population
Luo Xiao, Bing He, Annemarie Koster, Paolo Caserotti, Brittney Lange-Maia, Nancy W. Glynn, Tamara Harris, Ciprian M. Crainiceanu
The Johns Hopkins University, Baltimore, MD, U.S.A.; University of Maastricht, Maastricht, The Netherlands; University of Southern Denmark, Odense, Denmark; University of Pittsburgh, Pittsburgh, PA, U.S.A.; National Institute on Aging, Bethesda, MD, U.S.A.
September 10, 2018
Abstract
We introduce statistical methods for predicting the types of human activity at sub-second resolution using triaxial accelerometry data. The major innovation is that we use labeled activity data from some subjects to predict the activity labels of other subjects. To achieve this, we normalize the data across subjects by matching the standing up and lying down portions of triaxial accelerometry data. This is necessary to account for between-subject variability in the position of the device relative to gravity, which is induced by body shape and size as well as by the ambiguous definition of device placement. We also normalize the data at the device level to ensure that the magnitude of the signal at rest is similar across devices. After normalization we use overlapping movelets (segments of triaxial accelerometry time series) extracted from some of the subjects to predict the movement type of the other subjects. The problem was motivated by and is applied to a laboratory study of 20 older participants who performed different activities while wearing accelerometers at the hip. Prediction results based on other people's labeled dictionaries of activity performed almost as well as those obtained using their own labeled dictionaries. These findings indicate that prediction of activity types for data collected during natural activities of daily living may actually be possible.

Keywords: Accelerometer; Activity type; Movelets; Prediction.
1 Introduction

Body-worn accelerometers provide objective and detailed measurements of physical activity and have been widely used in observational studies and clinical trials (Atienza and King 2005; Boyle et al. 2006; Bussmann et al. 2001; Choi et al. 2011; Grant et al. 2008; Kozey-Keadle et al. 2011; Schrack et al. 2014a; Sirard et al. 2005; Troiano et al. 2008). However, it is challenging to transform the accelerometry data into quantifiable and interpretable information such as activity intensity or energy expenditure (Bai et al. 2014; Schrack et al. 2014b; Staudenmayer et al. 2012; Troiano et al. 2008; Trost et al. 2005; Welk et al. 2000). An important goal of these studies is to transform an observed accelerometry dataset into a time-stamped series of activity types. In this paper we are concerned with predicting activity types at sub-second resolution using detailed triaxial accelerometry information. Sub-second labels seem to be the highest resolution that matters in terms of human activity recognition. Indeed, most human movements occur between 0.3 and 3.4 Hz (Sun and Hill 1993). Moreover, the resolution is necessary as we are interested in capturing short movements such as walking 2 or 3 steps, which is a highly prevalent type of activity in real life and likely to become a bigger component of activity as people age. Such labeled time series data could then be used for health association studies, where decreases in activity diversity or changes in the circadian rhythm of activities may represent early strong indicators of biological processes or diseases. These expectations have strong face validity, as, for example, 1) an early indication of health recovery after surgery is the will and ability of a patient to use the bathroom; 2) disease may be associated with early reduction or abandonment of non-essential activities; and 3) death is associated with exactly zero activities.

1.1 Accelerometry Data
An accelerometer is a device that measures acceleration. When attached to the body of a human subject, if the subject is at rest, the accelerometer measures the subject's orientation relative to the gravitational vector; if the subject is moving, the accelerometer measures a combination of the subject's orientation and acceleration. Recent technological advances have produced small and light accelerometers that can collect data at high sampling rates. For example, the Actigraph GT3X+ device is of size 4.6 × 3.3 × 1.5 cm, weighs only 19 grams, and can sample data at 100 Hz; see Figure 1 for pictures of this device. Thus, accelerometers can be easily attached to the human body and used for objectively recording detailed accelerations due to human physical activities.

Figure 1: An Actigraph GT3X+ accelerometer and its standard orientation. The top left, right and bottom graphs show the acceleration due to Earth's gravity when the corresponding axis aligns with the opposite direction of Earth's gravity. When the up-down axis is the x-axis, the coordinate system is right-hand oriented.

The output from triaxial accelerometers, such as the Actigraph GT3X+ device, is a triaxial time series of accelerations along three mutually orthogonal axes, expressed in units of Earth gravity, i.e., g = 9.81 m/s². The three axes are labeled as “up-down”, “forward-backward” and “left-right” according to a device-specific reference system; see Section 2.1 for more details. While the axes have these labels, they have meaning only in the reference system of the device, as the device will move with the part of the body it is attached to. This means that an axis that is up-down relative to the device can easily be forward-backward, left-right, down-up or anything in between in the body-reference system. Figure 2 shows five segments of data for two subjects wearing the devices at the hip.
From top to bottom, subjects perform 5 replicates of standing up from a chair and sitting back (chairStand), walk 20 meters at normal speed (normalWalk), mimic vacuuming, stand still, and lie down on the back. From the data we observe that: 1) variability for active periods (normalWalk, chairStand and vacuum) is higher than for inactive periods (standing and lying); 2) within each subject, the ordering and relative position of the three axes are different for standing and lying, as the orientation of the accelerometer with respect to gravity differs for the two postures. These observations indicate that accelerometers are capable of detecting and differentiating various human activities. Moreover, we see that for chairStand and normalWalk, data for both subjects exhibit rhythmic patterns, suggesting that within a subject, movements for the same activity appear similar in the accelerometry data.

The motivating data were collected from 20 older adults who were originally enrolled in the Study of Energy and Aging (SEA) pilot study. These participants were invited for an ancillary study for validating hip and wrist accelerometry and were instructed in a research clinic to perform 15 different types of activities according to a protocol. Table 1 provides the labels, detailed descriptions and durations for the 15 activities. The selection and design of these activities are intended to simulate a free-living context. The activity types are referred to by their labels in the paper. Throughout the study, each participant wore three Actigraph GT3X+ devices simultaneously, at the right hip, right wrist and left wrist, respectively. The data were collected at a sampling frequency of 80 Hz. Based on the protocol and the start/end times for each activity, a time series of labels of activity types is constructed to annotate the accelerometry data.
In this paper we will focus on the data collected from accelerometers located at the hip and study how well a given program of activities can be distinguished by the accelerometry data at the population level.

We revisit Figure 2, which displays the raw accelerometry data obtained from the hip accelerometers. We focus on the data for chairStand and normalWalk, which exhibit rhythmic patterns. An important observation is that these repetitive movements look very similar within the same person, though not across persons; this is a crucial observation as most prediction techniques are fundamentally based on similarities between signals. For example, for chairStand, sudden large changes in acceleration magnitudes can be observed in the left-right axis for subject 13 but not for subject 4. Another example is that for normalWalk, accelerations along the up-down and forward-backward axes align very well for subject 4 but are far apart for subject 13. These dissimilarities seem to suggest that the accelerometry data are not comparable across subjects. Simply throwing prediction techniques at such a problem, irrespective of how sophisticated or cleverly designed they are, would achieve little in terms of understanding the data structure or solving the original problem. However, we will show that a substantial amount of these observed dissimilarities is due to the orientation inconsistency of the reference systems across subjects and can be significantly reduced by using the same orientation, i.e., a common reference system. If a common reference system were used, then the three axes for standing and lying in Figure 2 would be very similar for the two subjects. This is clearly not the case for the raw accelerometry data, as the left-right and the up-down axes for lying overlap for subject 13 but are quite different for subject 4.

Table 1: 15 activity types: labels, detailed description and durations
Group | Label | Description | Duration
Resting | lying | lie still face-up on a flat surface with arms at sides and legs extended | 10 mins
Resting | standing | stand still with arms hanging at sides | 3 mins
Upper body (while standing) | washDish | fetch wet plates from a drying rack, dry them using a drying towel, and stack adjacent to the drying rack one-by-one | 3 mins
Upper body (while standing) | knead | knead a ball of playdough as if for cooking/baking | 3 mins
Upper body (while standing) | dressing | unfold lab jacket, put jacket on (no buttoning), then remove, place the jacket on a hanger, and put the hanger on a nearby hook | 3 mins
Upper body (while standing) | foldTowel | fold towels and stack them nearby | 3 mins
Upper body (while standing) | vacuum | vacuum a specified area of the carpet | 3 mins
Upper body (while standing) | shop | walk along a long shelf, remove labeled items from the upper shelf about chest height, and place them on the lower shelf about waist height | 3 mins
Upper body (while sitting) | write | write a specified sentence on one page of the notebook, then turn to the next page and repeat | 3 mins
Upper body (while sitting) | dealCards | hold a full deck, and deal cards one-by-one to six positions around a table | 3 mins
Lower body | chairStand | starting in a sitting position, rise to a normal standing position, then sit back down | 5 cycles
Lower body | normalWalk | starting from standing still, walk 20 meters at a comfortable pace | 20 meters
Lower body | normalWalkNoSwing | starting from standing still, walk 20 meters at a comfortable pace with arms folded in front of chest | 20 meters
Lower body | fastWalk | starting from standing still, walk 20 meters at the fastest pace | 20 meters
Lower body | fastWalkNoSwing | starting from standing still, walk 20 meters at the fastest pace with arms folded in front of chest | 20 meters

Figure 2: Raw data of chairStand, normalWalk, vacuum, standing and lying from hip-worn accelerometers for two subjects (subject 4 and subject 13; one panel per subject and activity). The x-axis denotes recording time in seconds; the y-axis denotes the signal expressed in g units. The legend in the bottom right plot (up-down, forward-backward, left-right) applies to all plots. The shaded areas contain two 1 s movelets. Note that orientation and placement of the device may change in the reference system designed around Earth's gravity.

1.3 Proposed Methods

In this paper, we first address the problem that the raw accelerometry data collected from different subjects may not be directly comparable. We show that the raw data are measured with respect to different reference systems and thus have different meanings across subjects. We will provide a data transformation approach for normalizing the data, which is designed to mitigate these inherent problems in data collection.

Once data are normalized we proceed to predict activities using some of the subjects for training and the remaining subjects for testing the prediction algorithm. In particular, we will use movelets, a dictionary learning based approach that extends the methodology in Bai et al. (2012) designed for same-subject prediction. Here we describe briefly what movelets are and provide the intuition behind the approach. A movelet is the collection of the three acceleration time series in a window of given size (e.g., 1 s). For example, the time series in the two shaded areas in Figure 2 represent two 1 s movelets. The sets of overlapping movelets constructed from the accelerometry data with annotated labels are organized by activity types, which play the role of accelerometry “dictionaries” for different activity types.
As was illustrated in Section 1.1, the intuition is that movements, and the associated movelets, are similar for the same activity and different for different activities. Based on this intuition, predictions of activity type based on accelerometry data without annotated labels can be obtained by identifying the annotated movelet that is most similar to the data; the similarity is quantified by the $L_2$ distance. An important problem with the movelets approach is choosing the window size for the movelets. We will introduce a criterion based on prediction performance, evaluate the criterion in our data, and provide specific recommendations for the optimal size of the movelets. This gives us a rigorous and data-based approach that provides the necessary context for the currently used window size.

1.4 Existing Literature

A number of methods have been used for recognition of activity types, including linear/quadratic discriminant analysis (Pober et al. 2006), hidden Markov models (Krause et al. 2003), artificial neural networks (see, e.g., Staudenmayer et al. 2009; Trost et al. 2012), support vector machines (see, e.g., Mannini et al. 2013) and combined methods (see, e.g., Zhang et al. 2012). Bao and Intille (2004) and Preece et al. (2009) reviewed and evaluated methods used in classifying normal movements. The major limitations of these methods include: 1) they usually require at least a 1-minute window to conduct feature extraction; 2) they do not capture finer movements that last less than 1 minute, such as falling or standing up from a chair; and 3) the prediction process is usually hard to understand and interpret. In contrast, our proposed method is fully transparent, easy to understand, requires minimal training data, and is designed both for periodic and non-periodic movements.

The rest of the paper is organized as follows. In Section 2 we consider two main factors that make the raw accelerometry data not comparable and propose a transformation method for normalizing the data. In Section 3, we describe in detail the movelet-based prediction methods. In Section 4 we apply our prediction method to some real data. We conclude the paper with a discussion about the feasibility of movelet-based movement prediction and its potential relevance to public health applications.
A fundamental problem with the raw triaxial accelerometry data is that they may not be directly comparable across subjects; Figure 2 provides the intuition about this problem and indicates that orientation inconsistency is an important factor. Another factor is device-specific systematic bias, which we will explain in detail later.

More specifically, triaxial accelerometers measure accelerations along the three axes in the reference system of the device. Indeed, Figure 1 indicates exactly the device-specific reference system. In its standard orientation and in the absence of movement, the up-down axis of the device is aligned with Earth's gravity and will register −1 g (acceleration towards the center of the earth) and 0 g along the other two axes (orthogonal to Earth's gravity). If the device is rotated clockwise by 90 degrees towards the forward-backward (left-right) axis then in the absence of movement the forward-backward acceleration will continuously record − […] s for each segment) for standing and lying for each of the 20 subjects in our study and computed the means of the three axes for each segment; see the left panel of Figure 3. If the orientation of the accelerometers respected our intuition of directions then the means should be $(-1, 0, 0)$ for standing and $(0, -1, 0)$ for lying. Here we use a length-3 vector to denote the acceleration in the order of up-down, forward-backward, and left-right axes. The left panel of Figure 3 indicates that the orientations of the accelerometers are rarely what they are expected to be and vary considerably across subjects. It is interesting to see that almost all accelerometers were flipped in the up-down direction because for those accelerometers the means of data for standing are $(1, 0, 0)$ […] $\sqrt{x_1^2 + x_2^2 + x_3^2}$ if the mean acceleration vector is denoted by $(x_1, x_2, x_3)^T$. Note that the magnitude of the mean acceleration vector is invariant to rotation. Figure 4 plots the magnitude of the mean acceleration vector for standing, which should be 1 g. An inspection of the graph indicates that the magnitude is rarely equal to 1 and it differs from 1 by as much as 6% for two subjects. We have found that these differences can have serious consequences for activity type prediction, because a change of this magnitude can fundamentally affect the geometry of the activity space. Using a simple model where we assume that subjects make random movements while standing still (see Appendix A), we could show that the gravity inflation (note that magnitudes are typically larger than 1) is most likely not due to random movements for most devices. Thus, for the purpose of our research we will treat these differences as systematic biases that are associated with the devices. A more in-depth and principled analysis of this assumption is beyond the scope of the current paper.

Figure 3: Means of standing and lying using the raw data (left panel) and the normalized data (right panel). Filled circles are for standing while filled triangles are for lying. The x-axis denotes subject ID; the y-axis is in g units. The top panel is for the up-down axis, the middle one is for the forward-backward axis, and the bottom one is for the left-right axis. This figure appears in color in the electronic version of this article.

Figure 4: Magnitude of the mean acceleration vector during standing (expressed in g units) using the raw and normalized data. The x-axis denotes subject ID. Blue triangles are for raw data while red circles are for normalized data.

The purpose of normalization is to make same-activity data more comparable across subjects so that we can use the dictionary of movements of one or several subjects (movelets) to predict the activity of others. To achieve this, a desired normalization procedure should be able to correct the orientation inconsistency and reduce the bias in the raw data.

The proposed procedure is to apply a particular linear transformation to the raw data. This consists of two steps: rotation and translation. We first rotate the triaxial data so that all the data are in the standard orientation (reference system) and then translate the rotated data to reduce systematic biases. To be precise, assume $u = (u_1, u_2, u_3)^T$ is a single data point in the raw data space, $R$ is a rotation matrix, and $b = (b_1, b_2, b_3)^T$ is the vector of systematic bias. Then we have

$$x = Ru - b. \tag{1}$$

In practice, accelerometers might move from time to time relative to the human body, which could make their orientation time-dependent. We assume that the accelerometers do not move with respect to the human body and apply the same transformation in (1) to all raw data within the same subject. This simple approach has worked well in practice.

We need to determine $R$ and $b$ based on the subject-specific raw data; both $R$ and $b$ depend on the subject, but the subject index was dropped from the notation for simplicity of presentation. We extracted two small segments from the raw accelerometry data, one segment for standing and another for lying; the segments can be very short, say 2-3 seconds. The approach then proceeds by calculating the means of the three axes for both activities, which results in two three-dimensional vectors, $a_1$ for standing and $a_2$ for lying.
If the up-down axis for the orientation of the device aligns well with the negative direction of Earth's gravity and the data have no systematic bias, then $a_1$ should be close to $-e_1 = (-1, 0, 0)^T$ and $a_2$ should be close to $-e_2 = (0, -1, 0)^T$. Hence, we select $R$ from the class of rotation matrices that minimizes $\|Ra_1 + e_1\|^2 + \|Ra_2 + e_2\|^2$ and satisfies $e_3^T R (a_1 \times a_2) > 0$, where $e_3 = (0, 0, 1)^T$ and $a_1 \times a_2$ is the cross product of $a_1$ and $a_2$. The latter condition ensures that we have a right-hand coordinate system for the rotated data. It can be shown that $R$ is uniquely determined and can be computed, as shown in Web Appendix C. Then we let $b = Ra_1 + e_1$, which by (1) implies that the mean of standing accelerometry is exactly $-e_1$.

We note that estimation of the rotation matrix $R$ might be affected by the existence of systematic bias. For our accelerometry data, the systematic bias is small (see Figure 4) and seems to have a negligible effect on rotation. However, if one is concerned about this issue, then we suggest a visual examination of the raw and normalized data, which might reveal whether normalization provides reasonable results.

2.3 Normalized Data

We applied the normalization method to the accelerometry data and re-calculated the means of standing and lying for the 20 subjects. As expected, the right panel of Figure 3 shows that the means of standing (lying) are close to $-e_1$ ($-e_2$) for all subjects. Comparing the two panels in Figure 3 we see strong indications that the normalized data are more comparable across subjects. The fact that means are closer to what they are designed to be after normalization is satisfying, though not surprising. To show the dramatic effects of normalization we investigate the changes from the raw to normalized data in activities that were not used for normalization. The top panels of Figure 5 display the raw data for normal walking for 2 subjects. A close inspection of the two graphs indicates periodic movements, though the patterns and the size of the means of movement make it very hard for any method developed for one of the subjects to recognize the patterns of the other subject. After normalization the patterns are much more comparable (see bottom panels in Figure 5), which will allow powerful techniques, such as movelets, to be generalized across individuals. It is interesting that while the signal is visually similar, the amplitude of the up-down axis (the black solid line in the bottom panels in Figure 5) differs quite substantially between the two subjects. This is likely due to the stronger up-down acceleration of subject 4 compared to subject 13.

Figure 5: Raw and normalized data from two subjects for normalWalk. Top panels: raw data for subject 4 and subject 13; bottom panels: normalized data for the same subjects. Each panel shows the up-down, forward-backward and left-right axes.

In this section, we first describe the subject-level movelets prediction method developed by Bai et al. (2012) and introduce some notation. Then we propose a population-level movelets prediction method that can predict a subject's activity labels with no knowledge about how this subject's accelerometry data look when doing various activities. Finally, we provide a simple automatic approach for tuning the window size/length of movelets, an important component in the movelets prediction procedure.
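Because all predictions in this section operate on normalized data, it may help to make the rotation-plus-translation of equation (1) concrete first. The sketch below is only an illustration under stated assumptions: it obtains $R$ with the standard SVD (Kabsch) solution to the least-squares rotation problem, which is a swapped-in technique and not necessarily the derivation given in Web Appendix C, and the function names `fit_normalization` and `normalize` are hypothetical.

```python
import numpy as np


def fit_normalization(a1, a2):
    """Estimate (R, b) from the mean raw accelerations for standing (a1) and lying (a2).

    Assumption: R is the proper rotation minimizing
    ||R a1 + e1||^2 + ||R a2 + e2||^2, computed with the SVD (Kabsch) solution.
    """
    e1, e2, _ = np.eye(3)
    sources = np.stack([a1, a2])       # rows are the observed mean vectors
    targets = np.stack([-e1, -e2])     # rows are where they should land
    H = targets.T @ sources            # 3 x 3 matrix sum_k t_k a_k^T
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(U @ Vt)) # force det(R) = +1 (rotation, not reflection)
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    b = R @ a1 + e1                    # translation so that standing maps to -e1
    return R, b


def normalize(raw, R, b):
    """Apply x = R u - b to every row u of an (n, 3) array of raw data."""
    return raw @ R.T - b
```

By construction, the normalized standing mean `normalize(a1[None, :], R, b)` equals $-e_1$, matching the statement following the definition of $b$; the lying mean is only approximately $-e_2$ when the standing magnitude differs from 1 g.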
We denote the normalized triaxial accelerometry data by $\mathcal{X}_i(t) = \{X_{i1}(t), X_{i2}(t), X_{i3}(t)\}$, where $t \in T_i$ and $i \in I$. Here $T_i$ is a time domain on the scale of seconds and $I$ denotes the collection of subjects. Associated with the data is a label function $L_i(t)$ which takes values in $\mathcal{A} = \{\mathrm{Act}_1, \ldots, \mathrm{Act}_A\}$, a collection of labels each denoting a distinct activity. If subject $i$ is standing at time $t$, $L_i(t)$ will be the label in $\mathcal{A}$ that represents standing. Let $U_i$ and $W_i$ be disjoint sets whose union is $T_i$. Then the subject-level prediction is to predict the labels $\{L_i(t) : t \in W_i\}$ for the data $\{\mathcal{X}_i(t) : t \in W_i\}$, assuming $\{\mathcal{X}_i(t) : t \in U_i\}$ is the training data with $\{L_i(t) : t \in U_i\}$ being known. For the subject-level prediction, data and label information across subjects are not used.

The idea of movelets prediction (Bai et al. 2012; He et al. 2014) is to first decompose the accelerometry data into a collection of overlapping movelets. A movelet of length $h$ at time $t$ is defined as $M_i(t, h) = \{\mathcal{X}_i(s) : s \in [t, t + h)\}$ and it captures the acceleration patterns in the time interval $[t, t + h)$. For simplicity we drop the parameter $h$ from $M_i(t, h)$ hereafter. The accelerometry data can be decomposed into a continuous sequence of overlapping movelets with $\mathcal{X}_i(t)$ being contained in all the movelets starting at $s \in (t - h, t]$. The approach of overlapping movelets, as a type of sliding window technique, has advantages over other windowing techniques such as event-defined windows or activity-defined windows (Preece et al. 2009). The latter windowing techniques require either locating specific events or determining the times at which the activity changes; the use of overlapping movelets requires no such “preprocessing” and is better suited for real-life applications.
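The decomposition into overlapping movelets can be sketched in a few lines. This is a minimal illustration, assuming the data for one subject are stored as an array with one row per sample and that a movelet of $h$ seconds is discretized to `round(fs * h)` samples; the function name `movelets` and its arguments are hypothetical.

```python
import numpy as np


def movelets(x, fs, h):
    """Return every overlapping movelet of length h seconds.

    x  : (n, 3) array, one row per sample (up-down, forward-backward, left-right)
    fs : sampling frequency in Hz (80 in the study described here)
    h  : movelet length in seconds
    Entry k of the result is the movelet starting at sample k, with shape (w, 3).
    """
    w = int(round(fs * h))  # number of samples per movelet
    # sliding_window_view appends the window axis last: shape (n - w + 1, 3, w)
    win = np.lib.stride_tricks.sliding_window_view(x, w, axis=0)
    return np.moveaxis(win, -1, 1)  # reorder to (n - w + 1, w, 3)
```

For a toy series of 10 samples with `fs=4` and `h=1.0`, this yields 7 movelets of 4 samples each, and sample `k` appears in every movelet that starts at one of the (at most 4) positions ending at `k`, the discrete analogue of $s \in (t - h, t]$.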
Another advantage of overlapping movelets is that they are computationally easy and simple to construct, which makes them well suited for analyzing ultra-dense acceleration data.

Using movelets, the training data can be represented as $\{M_i(t) : t \in \bar{U}_i\}$ and similarly for the unlabeled data, $\{M_i(t) : t \in \bar{W}_i\}$. Here $\bar{U}_i = \{t \in U_i : [t, t + h) \subset U_i\}$ and $\bar{W}_i = \{t \in W_i : [t, t + h) \subset W_i\}$. Note that in the new data representation with movelets, a data point will be lost if it is not contained in any time interval of length at least $h$ in $U_i$ or $W_i$. To avoid ambiguity for prediction, the movelets in the training data that do not belong exclusively to a single type of activity are deleted. For the movelets in the training data, we define the label function $\bar{L}_i\{M_i(t)\} := L_i(t)$.

Given an unlabeled movelet $M_i(t)$, $t \in \bar{W}_i$, the method is to find the closest match in the training data, i.e., to search for $t^* = t^*(t) \in \bar{U}_i$ such that

$$M_i(t^*) := \arg\min_{s \in \bar{U}_i} f\{M_i(t), M_i(s)\}, \tag{2}$$

where $f(\cdot, \cdot)$ is some function that measures distance between movelets; this type of 1-nearest neighbor works exceptionally well in conjunction with the sliding window movelets. The idea is that movelets with similar pattern or shape should belong to the same activity. A simple $f$ that will be adopted in the paper is

$$f\{M_i(t), M_i(s)\} := h^{-1} \int_0^h \|\mathcal{X}_i(t + h') - \mathcal{X}_i(s + h')\|^2 \, dh'. \tag{3}$$

The above $L_2$ distance measures explicitly the amplitude difference between two movelets; in addition, because movelets are always of the same length, the distance between movelets from activities of different frequencies can also be large.

Since $M_i(t^*)$ is the best match for $M_i(t)$, the label of the movelet at time $t$ can be predicted by $P\bar{L}_i(M_i(t)) := \bar{L}_i(M_i(t^*)) = L_i(t^*)$. We may use the predicted label for $M_i(t)$ as the prediction for $\mathcal{X}_i(t)$.
However, this may not always be an accurate prediction because we use data as far as $h$ seconds away in making the prediction, with the underlying assumption that the activity type remains the same in the time interval $[t, t + h)$. So if the movelet actually contains some transitional period between different activity types, this prediction can be wrong. Using the facts that $\mathcal{X}_i(t)$ contributes to the prediction of all the movelets that start at $s \in (t - h, t]$ and that human physical movements are in general continuous, the predicted label can be the most frequently predicted label for all the movelets that contain $\mathcal{X}_i(t)$. Formally we let

$$PL_i(t) := \arg\max_{\mathrm{Act}_j \in \mathcal{A}} h^{-1} \int_{t-h}^{t} 1\{P\bar{L}_i(M_i(s)) = \mathrm{Act}_j\} \, ds = \arg\max_{\mathrm{Act}_j \in \mathcal{A}} h^{-1} \int_{t-h}^{t} 1\{L_i(t^*(s)) = \mathrm{Act}_j\} \, ds,$$

where $1\{\cdot\}$ is 1 if the statement inside the bracket is true and 0 otherwise.

The movelets method described in the previous section considers only prediction at the subject level, as the training data and the data to be predicted are from the same subject. We now extend the movelets prediction method to the population level. We assume $I$, the collection of subjects, is divided into two disjoint sub-collections, $I_1$ and $I_2$. For subjects in $I_1$ the activity labels are known while for those in $I_2$ the labels are unknown. The problem is to predict the activity labels for subjects in $I_2$ using the data from $I_1$.

Given an unlabeled movelet $M_i(t)$, $i \in I_2$, the proposed method is to search for $\{i^*, t^*(t)\}$ such that

$$M_{i^*}(t^*) := \arg\min_{\{M_j(s) : j \in I_1, s \in \bar{T}_j\}} f\{M_i(t), M_j(s)\}, \tag{4}$$

where $f$ is defined in (3) and $\bar{T}_j = \{t \in T_j : [t, t + h) \subset T_j\}$. With small modifications, equation (4) reduces to (2) when we consider a single subject. The idea of the proposed method is that we will be able to label the movelet accurately as long as we can match it to a same-activity movelet of at least one subject in $I_1$.
This is the key for a successful population-level prediction, as movements from different subjects usually exhibit different patterns and hence movelets from different subjects have large within-activity variation. However, having a sizable group of subjects in the training set will cover multiple patterns in the normalized data, which will lead to improved prediction. In some sense, ours is the implementation of the intuition that “many people move differently, but some people move like you”. For example, for fast walking with arm swinging in the accelerometry data, the time for completing two steps ranges from about 0.[…] s to 0.[…] s. Hence, rather than requiring a subject's movement to be similar to the collection of movements of the same activity from all subjects in $I_1$, our method only requires this subject's movement to be similar to at least one subject's movement of the same type. As long as there are people in the training dataset whose times for completing two steps are similar to those of the subject we try to predict, prediction should work reasonably well.

Finally, the predicted label for $M_i(t)$ is $P\bar{L}\{M_i(t)\} := \bar{L}_{i^*}\{M_{i^*}(t^*)\} = L_{i^*}(t^*)$ and we let

$$PL(i, t) := \arg\max_{\mathrm{Act}_j \in \mathcal{A}} h^{-1} \int_{t-h}^{t} 1\{P\bar{L}(M_i(s)) = \mathrm{Act}_j\} \, ds.$$

We evaluate the performance of the proposed prediction method by defining the following two quantities: for activity type $\mathrm{Act}_j \in \mathcal{A}$ and subject $i \in I_2$, the subject-specific true prediction rate is defined as

$$r_{ij} := \frac{\sum_{t \in \bar{T}_i,\, L_i(t) = \mathrm{Act}_j} 1\{PL(i, t) = \mathrm{Act}_j\}}{\sum_{t \in \bar{T}_i,\, L_i(t) = \mathrm{Act}_j} 1}, \tag{5}$$

and the corresponding false prediction rate is

$$w_{ij} := \frac{\sum_{t \in \bar{T}_i,\, PL(i, t) = \mathrm{Act}_j} 1\{L_i(t) \neq \mathrm{Act}_j\}}{\sum_{t \in \bar{T}_i,\, PL(i, t) = \mathrm{Act}_j} 1}.$$

Note that $r_{ij}$ is the proportion of subject $i$'s data points in activity type $\mathrm{Act}_j$ that are correctly identified as belonging to $\mathrm{Act}_j$, while $w_{ij}$ is the proportion of subject $i$'s data points that are identified as being in activity type $\mathrm{Act}_j$ but do not belong to $\mathrm{Act}_j$. A successful prediction method should have high $r_{ij}$ and low $w_{ij}$.

3.2.2 Selection of Movelets Length

An important problem in movelets prediction is the selection of $h$, the length of movelets. Bai et al. (2012) and He et al. (2014) noted that $h$ should be selected such that movelets contain sufficient information to identify a movement while avoiding redundant information; based on this guideline both papers used $h = 1$ s. This choice is based on the reasonable observation that human movement happens on the 0. […] $h$ using only the training data. Our approach is based on leave-one-subject-out cross validation applied to the training data.

Consider subject $i \in I_1$. For activity type $\mathrm{Act}_j \in \mathcal{A}$ of subject $i$, we calculate its true prediction rate defined in (5) with subjects in $I_1 - \{i\}$ as training data; denote the resulting prediction rate by $r^*_{ij}$. Then the average prediction accuracy over all subjects and all activity types is

$$\bar{r}^* = (|I_1| A)^{-1} \sum_{i \in I_1,\, \mathrm{Act}_j \in \mathcal{A}} r^*_{ij},$$

where $|I_1|$ is the number of subjects in $I_1$. As $\bar{r}^*$ depends on $h$, we propose to select

$$h^* := \arg\max_h \bar{r}^*. \tag{6}$$

We now apply the movelets prediction method to the 20 participants of the accelerometry validation study. As described in the Introduction, data from hip, left and right wrist-worn accelerometers were collected.
The participants were instructed to perform 15 activities with resting breaks (3 minutes each) between activities. Activities were chosen specifically because they were representative of activities that older adults may commonly do in their daily lives; please see Table 1 for specific details. The resting breaks were removed from the data, and the transitional periods between consecutive activities were also removed. We focus on applying our method to data from hip-worn accelerometers.

The raw data were first normalized using the transformation proposed in Section 2. The 20 participants were then split into two groups: 10 for training and 10 for testing. Because the accelerometry data for each subject contain dense triaxial time series, prediction becomes computationally challenging if all of the training data are used. For activities with an explicit start and end, such as chairStand, two consecutive replicates from each subject were used as training data. For other activities, a segment of 5 seconds for each subject was used as training data. The shortness of the training periods is a hallmark of the movelets prediction approach. Indeed, movelets only need a quick look at some quality data to recognize the data in a complex system.
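The true and false prediction rates in (5) and the selection rule in (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `prediction_rates` and `select_movelet_length`, the toy label sequences, the made-up cross-validated rates, and the choice to return NaN when a denominator is empty are all hypothetical.

```python
import math


def prediction_rates(true_labels, pred_labels, activity):
    """Return (r, w) for one subject and one activity type.

    r: share of time points truly in `activity` that are predicted as it.
    w: share of time points predicted as `activity` that belong elsewhere.
    A rate is NaN when its denominator is zero (an assumption of this sketch).
    """
    n_true = n_pred = n_hit = 0
    for truth, pred in zip(true_labels, pred_labels):
        n_true += int(truth == activity)
        n_pred += int(pred == activity)
        n_hit += int(truth == activity and pred == activity)
    r = n_hit / n_true if n_true else math.nan
    w = (n_pred - n_hit) / n_pred if n_pred else math.nan
    return r, w


def select_movelet_length(cv_rates):
    """Pick the h with the largest mean cross-validated true prediction rate.

    `cv_rates` maps each candidate h to a flat list of r*_ij values obtained
    by leave-one-subject-out cross validation on the training subjects.
    """
    def mean_rate(h):
        rates = cv_rates[h]
        return sum(rates) / len(rates)

    return max(cv_rates, key=mean_rate)


# Toy labels for one subject (illustrative only, not study data).
truth = ["walking", "walking", "walking", "lying", "lying", "standing"]
pred = ["walking", "walking", "lying", "lying", "lying", "walking"]
r, w = prediction_rates(truth, pred, "walking")

# Made-up cross-validated rates for three candidate movelet lengths (seconds).
best_h = select_movelet_length(
    {0.5: [0.80, 0.90], 0.75: [0.90, 0.95], 1.0: [0.85, 0.90]}
)
```

In the toy example two of the three true walking points are recovered and one of the three points predicted as walking is wrong, so `r` is 2/3 and `w` is 1/3. In practice the lists passed to `select_movelet_length` would come from rerunning the movelet prediction once per held-out training subject for each candidate $h$.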
According to Table 1, there are 8 types of upper body activities in the data. For prediction, we group these activities into a new subgroup, “upper body activities”. There are two reasons for doing this. First, these activities mainly involve movements of the upper body, such as the arms and hands, and are not well detected by accelerometers worn on the hip (He et al. 2014). Second, it was shown in He et al. (2014) that upper body activities require a series of distinct sub-movements that can be similar across activities and difficult to differentiate even within the same subject. Differentiating these upper body activities across subjects becomes even more challenging because of the increased heterogeneity of activities across subjects.

For the lower body activities, there are four types of walking activities: normal walking without arm swinging, normal walking with arm swinging, fast walking without arm swinging, and fast walking with arm swinging. Arm swinging belongs to upper body activities and is not well detected by accelerometers worn on the hip (He et al. 2014). Although distinguishing normal and fast walking can be done relatively easily when training data are available for the same subject (He et al. 2014), it is difficult to do so across subjects. The reason is that the subjects’ walking speeds are highly heterogeneous, and one subject’s fast walking speed may well be close to another subject’s normal walking speed. Indeed, for the 10 subjects used as training data, the time for two normal-walking steps ranges from 0 . s to 1 . s while it is between 0 . s and 0 . s for two fast-walking steps. Thus, a new subgroup of activities, “walking”, is created to include all four walking types.

We now have 5 new activity types: “standing”, “lying”, “chairStand”, “walking”, and “upper body activities”. We will use the new activity labels for evaluating our population-level movelets prediction method. We selected the movelet length h = 0 . s by the criterion in (6); see Figure ?? for the cross-validated mean prediction accuracy for different choices of h. The high cross-validated mean prediction accuracy (near 90%) indicates that the population-level movelet method is capable of identifying activity types across subjects.

We used the true prediction rate and false prediction rate described in Section 3.2.1 to evaluate the results. To illustrate the importance of the preprocessing step, we also applied the proposed method to the raw data and to the rotation-only data. As a comparison, the results of the within-subject movelets method applied to the 10 testing subjects are also shown. Note that for the within-subject movelets method, 2-3 s of labeled accelerometry data for each activity are used for predicting the activity type of the rest of the accelerometry data. The top panel of Figure 6 shows box plots of the true prediction rates for the 5 activity types and the bottom panel shows box plots of the false prediction rates. The two figures indicate that the population-level method for the normalized data performs similarly to the within-subject method, i.e., having high mean true prediction rates ( > … < …

We have proposed a movelet-based method that can predict activity types across subjects. Compared to feature extraction-based methods, a significant advantage of the movelet-based methods is that they can achieve high prediction accuracy at the sub-second level. We have found in free-living data that walking 2 or 3 steps is common in older individuals.
To accurately quantify how much time an older person spends walking, an important biomarker of health in older adults, sub-second labels will be required to capture very short walking periods.

[Figure 6. Top panel: true prediction rates; bottom panel: false prediction rates. Vertical axis: proportion. Activities: lying, standing, chairStand, walking, upper-body activities. Methods: raw, rotated, normalized, within subject.]

Figure 6: True prediction and false prediction rates for various cases. The term “raw” means the population-level method is applied to the raw data; “rotated” means the population-level method is applied to the rotated data without further translation; “normalized” means the population-level method is applied to the rotated and translated data; “within subject” means the within-subject movelet method is applied to each subject.

[Figure 7. Four panels, each showing annotated labels and predicted labels. Legend: fastWalk, fastWalk_noSwing, normalWalk, normalWalk_noSwing.]
Figure 7: Prediction results of one subject’s four types of walking. The top panels display data for normalWalk (left column) and normalWalk_noSwing (right column); the bottom panels display data for fastWalk (left column) and fastWalk_noSwing (right column). The activity types can also be distinguished by the annotated labels in each plot.

The population-level movement prediction is a non-trivial step forward compared to the subject-by-subject methods in Bai et al. (2012) and He et al. (2014). Indeed, the methods in Bai et al. (2012) and He et al. (2014) require training data for all subjects, which are likely unavailable in large epidemiological studies. Moreover, these methods do not consider normalization, which is a crucial problem when devices are worn for many days, are taken off and put back on, and may be subject to unknown movements relative to the body they are attached to. Our proposed methods require training data for only a subset of the subjects.

The data analyzed here were collected in a research lab and provide only a partial view of the heterogeneous activities individuals perform in free-living environments. It remains unclear how in-lab data prediction methods perform in real-life environments. Nonetheless, with the approaches introduced here, we are mildly optimistic about resolving this much harder problem.

The methods that we proposed here can be developed further. For example, using the data from all three accelerometers, instead of just the hip, may provide better recognition of upper body activities. Smoothing the movelets may further reduce the noise in the distance metric. Finally, many movements may actually be quite ambiguously defined. For example, a reaching arm movement could equally well correspond to “dealing cards”, “placing a plate in drawer”, “eating” or other qualitatively defined movements. Thus, for quantitative research we may need to move to more accurate definitions of movement. Those definitions are likely to be closer in nature to “movelets” than to current non-quantifiable definitions. This is counter-intuitive and contrary to the way data are currently collected and labeled. However, learning the language of movement will most likely require a careful analysis of the observed data and decomposition into its building blocks. Pairing accelerometry with video cameras, smart phone apps, and other health monitors has the potential to fundamentally change the way we measure health.

Acknowledgements
This research was funded in part by the Intramural Research Program of the National Institute on Aging. Xiao, He, and Crainiceanu were supported by Grant Number R01NS060910 from the National Institute of Neurological Disorders and Stroke. This work was also supported by the National Institutes of Health through the funded Study of Energy and Aging Pilot (RC2AG036594), Pittsburgh Claude D. Pepper Older Americans Independence Center Research Registry (NIH P30 AG024826), a National Institute on Aging Professional Services Contract (HHSN271201100605P), a University of Pittsburgh Department of Epidemiology Small Grant, and the National Institute on Aging Training Grant (T32AG000181). This work represents the opinions of the researchers and not necessarily those of the granting organizations.
Appendix A: A Test for Systematic Bias
Let $X \in \mathbb{R}^3$ be the acceleration vector at an observation point when the subject is standing still. Suppose that $X$ follows a multivariate normal distribution with mean $\mu$ and covariance $\Sigma$. Let $X_1, \ldots, X_n$ be i.i.d. copies of $X$. Then $\bar X = n^{-1} \sum_{i=1}^{n} X_i$ is normal with mean $\mu$ and covariance $\Sigma/n$. Testing whether there is systematic bias in the observations amounts to testing $\|\mu\| = 1$, where $\|\cdot\|$ denotes the Euclidean norm. We consider the test statistic $\|\bar X\|^2$, which has mean $\|\mu\|^2 + \mathrm{tr}(\Sigma)/n$ and variance $\mathrm{var}(\|\bar X\|^2)$. Here $\mathrm{tr}(\cdot)$ denotes the trace of a square matrix, i.e., the sum of the diagonal entries. The derivation of the variance term is more involved. We let $O D O^T$ be the eigendecomposition of $\Sigma$, where $O$ is an orthogonal matrix with $O^T O = O O^T = I$ and $D$ is a diagonal matrix with the diagonal entries $d_1$, $d_2$ and $d_3$. Now let $Y = (Y_1, Y_2, Y_3)^T = O^T \bar X$; then $Y$ is normal with mean $\mu_y = (\mu_{y1}, \mu_{y2}, \mu_{y3})^T = O^T \mu$ and covariance matrix $D/n$. It is easy to show that
$$\mathrm{var}(\|\bar X\|^2) = \mathrm{var}(\|Y\|^2) = \sum_{k=1}^{3} \mathrm{var}(Y_k^2) = \sum_{k=1}^{3} \left( 6 \mu_{yk}^2 d_k / n + 3 d_k^2 / n^2 \right).$$
Hence
$$\begin{aligned}
\mathrm{var}(\|\bar X\|^2) &= \sum_{k=1}^{3} \left( 6 \mu_{yk}^2 d_k / n + 3 d_k^2 / n^2 \right) \\
&= \frac{6}{n} \mu_y^T D \mu_y + \frac{3}{n^2} \mathrm{tr}(\Sigma^2) \\
&= \frac{6}{n} \mu^T O D O^T \mu + \frac{3}{n^2} \mathrm{tr}(\Sigma^2) \\
&= \frac{6}{n} \mu^T \Sigma \mu + \frac{3}{n^2} \mathrm{tr}(\Sigma^2).
\end{aligned}$$
By the central limit theorem,
$$\frac{\|\bar X\|^2 - \|\mu\|^2 - \mathrm{tr}(\Sigma)/n}{\sqrt{\mathrm{var}(\|\bar X\|^2)}}$$
is approximately normal. Then an $\alpha$-level rejection region for testing $\|\mu\| = 1$ is given by
$$\left| \|\bar X\|^2 - 1 \right| > z_{\alpha/2} \sqrt{\frac{6}{n} \mu^T \Sigma \mu}.$$
Note that we dropped the term $\mathrm{tr}(\Sigma)/n$ in the numerator and the term $3\,\mathrm{tr}(\Sigma^2)/n^2$ in the denominator as they are of smaller order than $\|\mu\|^2$ and $6 \mu^T \Sigma \mu / n$, respectively. The term $\mu^T \Sigma \mu$ is unknown and needs to be estimated. Since under the null hypothesis that $\|\mu\| = 1$ we can derive $\mu^T \Sigma \mu \le \|\Sigma\|_{op}$, where $\|\cdot\|_{op}$ is the operator norm of a matrix, we use instead a conservative rejection region
$$\left| \|\bar X\|^2 - 1 \right| > z_{\alpha/2} \sqrt{\frac{6}{n} \|\hat\Sigma\|_{op}},$$
where $\hat\Sigma$ is the sample covariance matrix from the sample $\{X_1, \ldots, X_n\}$. We use $\alpha = 0.05$ so that $z_{\alpha/2} = 1.96$. We display the value of the term
$$T = \frac{\left| \|\bar X\|^2 - 1 \right|}{\sqrt{\frac{6}{n} \|\hat\Sigma\|_{op}}}$$
for all subjects in Table 2. The results show that, except for subjects 12 and 13, the null hypothesis of $\|\mu\| = 1$ is always rejected.

Table 2: $T$ for the 20 subjects (columns: Subject, $T$).

Appendix B: Derivation of Rotation Matrices

Lemma .1.
Let $a_1$ and $a_2$ be two vectors in $\mathbb{R}^3$ and $a_1 \times a_2 \neq 0$. Let
$$b_1 = \frac{a_1}{\|a_1\|}, \qquad b_2 = \frac{a_2 - (a_2^T b_1) b_1}{\|a_2 - (a_2^T b_1) b_1\|}.$$
Then $b_1$ and $b_2$ are two unit vectors and are orthogonal to each other. We write $a_1$ and $a_2$ as
$$a_1 = c_1 b_1, \qquad a_2 = c_2 b_1 + c_3 b_2.$$
Let
$$R^* = \arg\min_{R \in \mathbb{R}^{3 \times 3}:\; R^T = R^{-1} \text{ and } e_3^T R (a_1 \times a_2) > 0} \left( \|R a_1 + e_1\|^2 + \|R a_2 + e_2\|^2 \right).$$
Then $R^*$ is unique with the expression
$$\begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}^T [b_1, b_2, b_1 \times b_2]^T,$$
where
$$\cos(\theta) = -\frac{c_1 + c_3}{\sqrt{(c_1 + c_3)^2 + c_2^2}}, \qquad \sin(\theta) = \frac{c_2}{\sqrt{(c_1 + c_3)^2 + c_2^2}}.$$

Proof. For an arbitrary rotation matrix $R$, $R^T = R^{-1}$ is also a rotation matrix. Hence $R^T e_1$ and $R^T e_2$ remain orthogonal unit vectors. There exists a $\theta \in [0, 2\pi)$ such that
$$R^T e_1 = \cos(\theta) b_1 + \sin(\theta) b_2, \qquad R^T e_2 = -\sin(\theta) b_1 + \cos(\theta) b_2. \tag{7}$$
It follows that
$$\begin{aligned}
\|R a_1 + e_1\|^2 + \|R a_2 + e_2\|^2 &= \|a_1 + R^T e_1\|^2 + \|a_2 + R^T e_2\|^2 \\
&= \|(c_1 + \cos(\theta)) b_1 + \sin(\theta) b_2\|^2 + \|(c_2 - \sin(\theta)) b_1 + (c_3 + \cos(\theta)) b_2\|^2 \\
&= (c_1 + \cos(\theta))^2 + \sin^2(\theta) + (c_2 - \sin(\theta))^2 + (c_3 + \cos(\theta))^2 \\
&= 2 + c_1^2 + c_2^2 + c_3^2 + 2 (c_1 + c_3) \cos(\theta) - 2 c_2 \sin(\theta).
\end{aligned}$$
Therefore, $\|R a_1 + e_1\|^2 + \|R a_2 + e_2\|^2$ is minimized if $\cos(\theta) = -(c_1 + c_3)/\sqrt{(c_1 + c_3)^2 + c_2^2}$ and $\sin(\theta) = c_2/\sqrt{(c_1 + c_3)^2 + c_2^2}$. By (7),
$$R^T e_3 = R^T (e_1 \times e_2) = (R^T e_1) \times (R^T e_2) = b_1 \times b_2.$$
Then
$$R^T [e_1, e_2, e_3] = [b_1, b_2, b_1 \times b_2] \begin{pmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
It is easy to verify that $R^{-1} = R^T$, and the proof is complete.

References
Atienza, A. and King, A. (2005). Comparing self-reported versus objectively measured physical activity behavior: a preliminary investigation of older Filipino American women. Res. Q. Exerc. Sport, 76:358–362.
Bai, J., Goldsmith, J., Caffo, B., Glass, T., and Crainiceanu, C. (2012). Movelets: A dictionary of movement. Electron. J. Statist., 6:559–578.
Bai, J., He, B., Shou, H., Zipunnikov, V., Glass, T., and Crainiceanu, C. (2014). Normalization and extraction of interpretable metrics from raw accelerometry data. Biostatistics, 15:102–116.
Bao, L. and Intille, S. (2004). Activity recognition from user-annotated acceleration data. In Proceedings of the 2nd International Conference on Pervasive Computing, pages 1–17. Springer.
Boyle, J., Karunanithi, M., Wark, T., Chan, W., and Colavitti, C. (2006). Quantifying functional mobility progress for chronic disease management. In Engineering in Medicine and Biology Society, 2006. EMBS ’06. 28th Annual International Conference of the IEEE, pages 5916–5919.
Bussmann, J., Martens, W., Tulen, J., Schasfoort, F., van den Berg-Emons, H., and Stam, H. (2001). Measuring daily behavior using ambulatory accelerometry: the activity monitor. Behav. Res. Meth. Ins. C., 33:349–356.
Choi, L., Liu, Z., Matthews, C., and Buchowski, M. (2011). Validation of accelerometer wear and nonwear time classification algorithm. Med. Sci. Sports Exerc., 43:357–364.
Grant, P., Dall, P., Mitchell, S., and Granat, M. (2008). Activity-monitor accuracy in measuring step number and cadence in community-dwelling older adults. J. Aging Phys. Activ., 16:204–214.
He, B., Bai, J., Zipunnikov, V., Koster, A., Caserotti, P., Lange-Maia, B., Glynn, N., Harris, T., and Crainiceanu, C. (2014). Predicting human movement with multiple accelerometers using movelets. To appear in Med. Sci. Sports Exerc.
Kozey-Keadle, S., Libertine, A., Lyden, K., Staudenmayer, J., and Freedson, P. (2011). Validation of wearable monitors for assessing sedentary behavior. Med. Sci. Sports Exerc., 43:1561.
Krause, A., Siewiorek, D., Smailagic, A., and Farringdon, J. (2003). Unsupervised, dynamic identification of physiological and activity context in wearable computing. In Proceedings of the 7th International Symposium on Wearable Computers (White Plains, NY), pages 88–97. IEEE Computer Society.
Mannini, A., Intille, S., Rosenberger, M., Sabatini, A., and Haskell, W. (2013). Activity recognition using a single accelerometer placed at the wrist or ankle. Med. Sci. Sports Exerc., 45:2193–203.
Pober, D., Staudenmayer, J., Raphael, C., and Freedson, P. (2006). Development of novel techniques to classify physical activity mode using accelerometers. Med. Sci. Sports Exerc., 38:1626–1634.
Preece, S., Goulermas, J., Kenney, L., Howard, D., Meijer, K., and Crompton, R. (2009). Activity identification using body-mounted sensors: a review of classification techniques. Physiol. Meas., 30:R1–33.
Schrack, J., Zipunnikov, V., Goldsmith, J., Bai, J., Simonsick, E., Crainiceanu, C., and Ferrucci, L. (2014a). Assessing the “physical cliff”: detailed quantification of aging and patterns of physical activity. J. Gerontol. Med. Sci., 69:973–979.
Schrack, J., Zipunnikov, V., Goldsmith, J., Bandeen-Roche, K., Crainiceanu, C., and Ferrucci, L. (2014b). Estimating energy expenditure from heart rate in older adults: a case for calibration. PLoS ONE, 9:e93520.
Sirard, J., Trost, S., Pfeiffer, K., Dowda, M., and Pate, R. R. (2005). Calibration and evaluation of an objective measure of physical activity in pre-school children. J. Phys. Act. Health, 23:345–347.
Staudenmayer, J., Pober, D., Crouter, S., Bassett, D., and Freedson, P. (2009). An artificial neural network to estimate physical activity energy expenditure and identify physical activity type from an accelerometer. J. Appl. Physiol., 107:1300–1307.
Staudenmayer, J., Zhu, W., and Catellier, D. (2012). Statistical considerations in the analysis of accelerometry-based activity monitor data. Med. Sci. Sports Exerc., 44:S61–7.
Sun, M. and Hill, J. (1993). A method for measuring mechanical work and work efficiency during human activities. J. Biomech., 26:229–241.
Troiano, R., Berrigan, D., Dodd, K., Masse, L., Tilert, T., and McDowell, M. (2008). Physical activity in the United States measured by accelerometer. Med. Sci. Sports Exerc., 40:181–188.
Trost, S., McIver, K., and Pate, R. (2005). Conducting accelerometer-based activity assessments in field-based research. Med. Sci. Sports Exerc., 37:S531.
Trost, S., Wong, W., Pfeiffer, K., and Zheng, Y. (2012). Artificial neural networks to predict activity type and energy expenditure in youth. Med. Sci. Sports Exerc., 44:1801–1809.
Welk, G., Blair, S., Jones, S., and Thompson, R. (2000). A comparative evaluation of three accelerometry-based physical activity monitors. Med. Sci. Sports Exerc., 32:489–497.
Zhang, S., Rowlands, A., Murray, P., and Hurst, T. (2012). Physical activity classification using the GENEA wrist-worn accelerometer. Med. Sci. Sports Exerc., 44:742–748.