A statistical framework for measuring the temporal stability of human mobility patterns
ZHIHANG DONG *, †, YEN-CHI CHEN * AND ADRIAN DOBRA *
Abstract. Despite the growing popularity of human mobility studies that collect GPS location data, the problem of determining the minimum required length of GPS monitoring has not been addressed in the current statistical literature. In this paper we tackle this problem by laying out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. We define several measures of the temporal dynamics of human spatiotemporal trajectories based on the average velocity process, and on activity distributions in a spatial observation window. We demonstrate the use of our methods with data that comprise the GPS locations of 185 individuals over the course of 18 months. Our empirical results suggest that GPS monitoring should be performed over periods of time that are significantly longer than what has been previously suggested. Furthermore, we argue that GPS study designs should take into account demographic groups.
KEYWORDS: Density estimation; global positioning systems (GPS); human mobility; spatiotemporal trajectories; temporal dynamics
Contents
1. Introduction
2. Methods
2.1. Measuring the temporal stability of human mobility patterns
2.2. The activity distribution of human mobility patterns
2.3. Measuring the temporal stability of human activity distributions
3. Application
4. Discussion
Funding
Acknowledgment
Appendix A. Proofs of theoretical results
A.1. Proof of Theorem 2.1
A.2. Proof of Theorem 2.2
References
1. Introduction
Recent developments in global positioning systems (GPS) for wearable technology such as smartphones have drawn a great amount of interest from scientists studying the effects of environmental influences on different population groups [34, 26, 33, 21, 3, 44, 22, 41, 14]. A recent article [27] documents more than 100 studies from 20 disciplines that collect and analyze human time-stamped GPS location data. This type of data is key for learning about the places where people routinely spend their time during activities of daily living in order to establish their relationship with socio-economic outcomes, crime victimization, and physical and mental well-being. There have been extensive studies on the social stratification of mobility, such as health disparities of different neighborhoods, mental health, and substance abuse intervention [13, 38, 41], on the assessment of human spatial behavior and spatiotemporal contextual exposures [26, 33, 21], on the characterization of the relationship between geographic and contextual attributes of the environment (e.g., the built environment) and human energy balance (e.g., diet, weight, physical activity) [3, 44], on the study of segregation, environmental exposure, and accessibility in social science research [22], and on the understanding of the relationship between health-risk behavior in adolescents (e.g., substance abuse) and community disorder [41, 1, 42].
CONTACT A. Dobra. Email: [email protected].
Notwithstanding a general consensus across disciplines about the tremendous potential of GPS location data for studying human mobility, very little is currently known about how long a GPS study should last. There is an inherent trade-off between collecting location data from people for longer vs. shorter periods of time. Recording more GPS locations yields more information about the locations where an individual spends their time, as well as about the frequency, duration and timing of their visits to these places.
However, an individual's participation in a GPS study comes with burdens that often become significant if accumulated over longer periods of time: the individual needs to carry the device recording the data (a GPS tracker) everywhere they go, and needs to make sure the device is properly charged at all times and functions properly. Until recently, most GPS study designs stipulated mandatory regular visits to project coordination sites to download data from the location trackers, to replace batteries, and to replace the GPS tracking devices that were lost or were malfunctioning. While some of these issues have been addressed by using specialized apps on smartphones to collect GPS data and wirelessly transmit them into secure cloud databases, the costs of distributing smartphones to study participants, data plans, software development, and cloud computing are quite significant. In addition, there are important privacy considerations related to recording locations that might be sensitive for study participants for long periods of time. For these reasons, it is desirable to design GPS studies that are as short as possible to reduce the costs of the projects and the burden on study participants, while at the same time still providing guarantees that sufficient location data have been collected to properly address the research aims.
Despite the constant growth in the number of human mobility studies that collect GPS location data in the last 20 years, the question of how long GPS monitoring should last was not asked until recently [43]. In that paper, the authors argue that an effective GPS study should last until a minimum of 14 to 15 days of valid GPS data have been collected.
While this finding is relevant for numerous research groups that, in the past, have designed GPS studies with a duration of 7 days (see [43] and the references therein), two weeks is considerably shorter than the duration of other, more recent GPS studies. For example, [6] and [29] represent studies that tracked adolescents in the San Francisco Bay area for one month. Another study [11] employs a more complex three-site design that comprises five assessments that take place every six months over two years of follow-up for participants enrolled in Chicago, and three assessments that take place every six months over one year of follow-up for participants enrolled in Jackson and New Orleans. During each assessment, participants wear a GPS tracker for two weeks. Thus this study [11] records GPS locations for a total of 10 weeks and 6 weeks, respectively, but splits the period of observation into several contiguous two-week periods of GPS monitoring. These longer periods of observation time were suggested by [25], who found 17 weeks to be an adequate period of time to monitor human mobility based on geotagged social media data.
In this paper we lay out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. Such a framework is missing from the current statistical literature. Previous work [43, 25] on the assessment of the duration of GPS observation periods is based on empirical findings, and lacks any theoretical underpinnings. We address this gap by introducing several measures of the temporal dynamics of spatiotemporal trajectories of individuals. We illustrate the use of these measures with publicly available data from a study that recorded GPS locations of 185 individuals that live in a city in Switzerland over the course of 18 months.
2. Methods
The spatiotemporal trajectory of an individual in a reference time frame $[t_{\min}, t_{\max}]$ and spatial observation window $W \subset \mathbb{R}^2$ is a curve
$$X^{[t_{\min}, t_{\max}]} = \{ X(t) = (x_1(t), x_2(t)) : t \in [t_{\min}, t_{\max}] \} \subseteq W, \tag{1}$$
where $x_1(\cdot)$ and $x_2(\cdot)$ represent the longitude and latitude coordinates, respectively, and $X(t)$ is the location visited by this individual at time $t$. We assume that this curve is smooth: $x_1(\cdot)$ and $x_2(\cdot)$ have continuous derivatives. The length of the curve in Eq. (1) is defined as [9]:
$$L(X^{[t_{\min}, t_{\max}]}) = \int_{t_{\min}}^{t_{\max}} \sqrt{\left( \frac{d x_1(t)}{dt} \right)^2 + \left( \frac{d x_2(t)}{dt} \right)^2} \, dt. \tag{2}$$
The complete trajectory $X^{[t_{\min}, t_{\max}]}$ is never observed in the real world. Instead, $n$ observation times $t_1, \ldots, t_n$ are sampled from a distribution on $[t_{\min}, t_{\max}]$ with density $\rho(\cdot)$, and the corresponding locations $X(t_1), \ldots, X(t_n)$ on the curve $X^{[t_{\min}, t_{\max}]}$ are recorded. These locations are realizations of a random variable $X(T)$ where $T \sim \rho(\cdot)$. Ideally we would like $T$ to follow a uniform distribution to have the same chance of recording a visited location anywhere in the reference time frame $[t_{\min}, t_{\max}]$. Due to technological limitations (e.g., GPS devices running out of power), heterogeneous built environments that prevent GPS devices from obtaining a location (e.g., skyscrapers in downtown areas or buildings without windows and Wi-Fi coverage), or human behavioral factors (e.g., individuals turning off their GPS devices around certain locations sensitive to them), the distribution of $T$ can be far from the uniform distribution.
We assume that GPS positional data from $K$ study participants were recorded. We denote by $X_k^{[t_{\min}, t_{\max}]} = \{ X_k(t) : t \in [t_{\min}, t_{\max}] \}$ the unobserved spatiotemporal trajectory of the $k$-th study participant.
The observation times in the reference time frame $[t_{\min}, t_{\max}]$ can vary between study participants. The GPS data for the $k$-th study participant are the time-stamped longitude and latitude locations:
$$\{ X_{k,i} = X_k(t_{k,i}) : i = 1, \ldots, n_k \}, \tag{3}$$
where $n_k \ge 1$, the time $t_{k,i}$ was sampled from a distribution with density $\rho_k(\cdot)$ independently of the rest of the observation times, and $t_{\min} \le t_{k,1} \le \ldots \le t_{k,n_k} \le t_{\max}$. Here $t_{k,i}$ represents the time when the $i$-th location of study participant $k$ was recorded. Our framework allows for the possibility of having different reference time frames for various groups of study participants.
2.1. Measuring the temporal stability of human mobility patterns.
One possible measure of the dynamics of the spatiotemporal trajectory $X^{[t_{\min}, t_{\max}]}$ is the average velocity $V(\tau)$ at time $\tau$, which is a function of the length of the subcurve $X^{[t_{\min}, t_{\min} + \tau]}$ of $X^{[t_{\min}, t_{\max}]}$ from Eq. (1):
$$V(\tau) = \frac{1}{\tau} L(X^{[t_{\min}, t_{\min} + \tau]}), \tag{4}$$
for $\tau \in (0, t_{\max} - t_{\min}]$ and $V(0) = 0$. A sample estimator of the average velocity for the $k$-th study participant is
$$\widehat{V}_k(\tau) = \frac{1}{\tau} \sum_{\{ i : t_{k,i+1} \le \tau \}} \| X_{k,i+1} - X_{k,i} \|, \tag{5}$$
where $\| X_{k,i+1} - X_{k,i} \|$ represents an estimate of the distance traveled between times $t_{k,i}$ and $t_{k,i+1}$. In what follows we will assume that study participants traveled in a straight line or "as the crow flies" between two consecutive observed GPS locations. This is the simplest assumption one can make, and it leads to an easy way of calculating Great Circle (WGS84 ellipsoid) distances between two spatial locations [4]. However, this assumption underestimates actual distances traveled, and consequently underestimates the average velocity. More accurate approximations of distances traveled can be defined based on the shortest distances between two locations on a road network that spans the spatial observation window $W$. Calculating distances based on a road network is more complex than calculating straight line distances, and involves significant GIS work since the maximum speed of travel on different segments of road needs to be taken into account [10]. Nevertheless, as the span of time between two consecutive observed locations becomes shorter, the difference between the road network and straight line distances decreases.
More generally, consider a stochastic process $Z = \{ Z(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$, where $Z(\tau)$ is a mapping $f(\cdot)$ of the subcurve $X^{[t_{\min}, t_{\min} + \tau]}$ into $\mathbb{R}_+$. The mapping $f(\cdot)$ is chosen such that $\lim_{\tau \to (t_{\max} - t_{\min})} Z(\tau) = Z(t_{\max} - t_{\min})$.
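As a concrete instance of such a process, the sample average velocity in Eq. (5) can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `haversine_m` and `average_velocity` are hypothetical, distances are computed with a spherical haversine formula rather than on the WGS84 ellipsoid, and `tau` is interpreted as time elapsed since `t_min`.

```python
import math

# Assumption: a spherical Earth with the mean radius below (metres),
# used here instead of the WGS84 ellipsoid mentioned in the text.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def average_velocity(times, lons, lats, tau, t_min=0.0):
    """Sketch of Eq. (5): straight-line distance covered up to elapsed time tau, divided by tau.

    times must be sorted; only segments whose end point was recorded
    no later than t_min + tau contribute to the sum.
    """
    if tau <= 0:
        return 0.0  # convention V(0) = 0 from Eq. (4)
    total = 0.0
    for i in range(len(times) - 1):
        if times[i + 1] - t_min <= tau:
            total += haversine_m(lons[i], lats[i], lons[i + 1], lats[i + 1])
    return total / tau


# Hypothetical toy trajectory: three fixes along the equator, 10 time units apart.
times = [0.0, 10.0, 20.0]
lons = [0.0, 1.0, 2.0]
lats = [0.0, 0.0, 0.0]
print(average_velocity(times, lons, lats, tau=20.0))  # metres per time unit
```

Swapping `haversine_m` for a road-network distance would give the more accurate approximation discussed above, at the cost of the additional GIS work.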
We define the absolute percentage error (APE, henceforth) $\phi(Z; \tau)$, which measures the error made when approximating $Z(t_{\max} - t_{\min})$ with $Z(\tau)$ for $\tau \in [0, t_{\max} - t_{\min}]$:
$$\phi(Z; \tau) = \frac{| Z(\tau) - Z(t_{\max} - t_{\min}) |}{Z(t_{\max} - t_{\min})}.$$
We quantify the temporal stability of the process $Z$ by introducing a related process called the last crossing time process $\mathrm{LCT}_Z = \{ \mathrm{LCT}_Z(\gamma) : \gamma \ge 0 \}$, where
$$\mathrm{LCT}_Z(\gamma) = \max \{ \tau \in [0, t_{\max} - t_{\min}] : \phi(Z; \tau) > \gamma \}. \tag{6}$$
In Eq. (6), $\mathrm{LCT}_Z(\gamma)$ is the last time at which the APE made when $Z(t_{\max} - t_{\min})$ is approximated with $Z(\tau)$ is above a threshold $\gamma$. The last crossing time is well defined since $\lim_{\tau \to (t_{\max} - t_{\min})} \phi(Z; \tau) = 0$.
Consider the process $Z_k = \{ Z_k(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$ associated with the $k$-th study participant, $Z_k(\tau) = f\left( X_k^{[t_{\min}, t_{\min} + \tau]} \right)$, and let $\widehat{Z}_k$ be its sample estimator based on the positional data in Eq. (3). The average velocity in Eq. (4) and its sample estimator in Eq. (5) are examples of processes $Z_k$ and $\widehat{Z}_k$. A sample estimator of the last crossing time $\mathrm{LCT}_{Z_k}(\gamma)$ is
$$\widehat{\mathrm{LCT}}_{Z_k}(\gamma) = \max_{i = 1, \ldots, n_k} \left\{ t_{k,i} - t_{\min} : \phi(\widehat{Z}_k; t_{k,i} - t_{\min}) > \gamma \right\}. \tag{7}$$
We note that $\widehat{Z}_k(\tau)$ in the APE $\phi(\widehat{Z}_k; \tau)$ is determined based on the locations recorded for the $k$-th study participant before time $\tau$: $\{ X_{k,i} : t_{\min} \le t_{k,i} \le \tau \}$. As an illustration, Figure 1 shows estimates of the average velocity of an individual in the MDC data, together with the last crossing time estimate at $\gamma = 0.1$.
The mean absolute percentage error (MAPE) across the $K$ study participants at time $\tau$ is
$$\phi_K(\tau) = \frac{1}{K} \sum_{k=1}^{K} \phi(\widehat{Z}_k; \tau). \tag{8}$$
We define two measures of the overall temporal stability of the spatiotemporal trajectories of multiple study participants.
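Before turning to these group-level measures, the single-participant APE and the last crossing time estimator in Eq. (7) can be illustrated with a short Python sketch. The function names `ape` and `last_crossing_time` are hypothetical, and returning 0.0 when the APE never exceeds the threshold is a convention assumed here; the text does not specify this case.

```python
def ape(z_tau, z_final):
    """Absolute percentage error of Z(tau) relative to the final value Z(t_max - t_min)."""
    return abs(z_tau - z_final) / z_final


def last_crossing_time(elapsed, z_values, gamma):
    """Sketch of Eq. (7): largest elapsed time at which the APE is still above gamma.

    elapsed[i] is t_i - t_min and z_values[i] is the estimate of Z at that time;
    the last entry of z_values plays the role of Z(t_max - t_min).
    Assumed convention: return 0.0 if the APE never exceeds gamma.
    """
    z_final = z_values[-1]
    crossing = [t for t, z in zip(elapsed, z_values) if ape(z, z_final) > gamma]
    return max(crossing) if crossing else 0.0


# Hypothetical estimates of an average velocity that settles near 1.0.
elapsed = [1, 2, 3, 4, 5]
z_hat = [2.0, 1.3, 1.05, 0.95, 1.0]
print(last_crossing_time(elapsed, z_hat, gamma=0.1))  # APE is 1.0, 0.3, 0.05, 0.05, 0 -> prints 2
```

A smaller threshold `gamma` can only move the last crossing time later, which is why the measure is reported as a process in the threshold.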
The first overall measure is the last crossing time process $\mathrm{LCT}_{\phi_K} = \{ \mathrm{LCT}_{\phi_K}(\gamma) : \gamma \ge 0 \}$ of the MAPE process $\phi_K = \{ \phi_K(\tau) : \tau \in [0, t_{\max} - t_{\min}] \}$. We refer to this measure as LCT-MAPE($Z$). The second overall measure is defined as the average of the last crossing times of the APE of $\widehat{Z}_k$ for $k = 1, \ldots, K$, $\mathrm{LCT}_K = \{ \mathrm{LCT}_K(\gamma) : \gamma \ge 0 \}$, where
$$\mathrm{LCT}_K(\gamma) = \frac{1}{K} \sum_{k=1}^{K} \mathrm{LCT}_{Z_k}(\gamma).$$
We denote this second measure by LCT-APE($Z$). These two measures are the same only if they are calculated for a single study participant ($K = 1$). They are useful for comparing the temporal regularity of mobility patterns of groups of study participants (e.g., younger vs. older individuals, men vs. women, high SES vs. low SES).
Figure 1. Estimate of the average velocity (gray curve) of an individual in the MDC data over $t_{\max} = 21$ weeks. The dashed line indicates the value of $\widehat{V}(t_{\max})$, and the two dotted lines represent the lower bound $(1 - \gamma) \widehat{V}(t_{\max})$ and the upper bound $(1 + \gamma) \widehat{V}(t_{\max})$ for $\gamma = 0.1$. These bounds correspond with times $\tau$ for which the APE $\phi(V; \tau) \le \gamma$. The crosses denote the times $\tau$ for which $\phi(V; \tau) = \gamma$. The last crossing time for $\gamma = 0.1$ [...]
2.2. The activity distribution of human mobility patterns.
The average velocity associated with the spatiotemporal trajectory of an individual does not provide any information about the spatial configuration of locations visited. Consider two example individuals that drive without stopping at the same speed for a long period of time. The first example individual drives back and forth between two places $A_1$ and $A_2$. The second example individual drives in a cycle from a place $A_1$ to another place $A_2$, then to places $A_3$ and $A_4$, then back to place $A_1$. Since the spatiotemporal trajectory of the second individual involves two additional places, more sample locations will be needed to understand the mobility pattern of the second individual compared to the mobility pattern of the first individual. However, the mobility patterns of these two example individuals will be indistinguishable based on the last crossing time process associated with their average velocity processes. We address this issue by introducing a distribution of the locations visited by an individual.
We assume that the observation window $W$ is partitioned into a set of grid cells $\mathcal{G} = \{ G_1, \ldots, G_N \}$. Each location $X(t)$ on the curve $X^{[t_{\min}, t_{\max}]}$ representing the spatiotemporal trajectory of an individual is mapped into a grid cell $G(t) \in \mathcal{G}$. The observed locations for this individual mapped into $\mathcal{G}$ are the sequence of grid cells $g_1 = G(t_1), \ldots, g_n = G(t_n)$ that are realizations of a random variable $G(T)$, where $T$ is a random variable on $[t_{\min}, t_{\max}]$ with a distribution with density $\rho(\cdot)$.
We define the activity distribution $\pi = (\pi_1, \ldots, \pi_N)$ over the grid cells $\mathcal{G}$. Here $\pi_j$ represents the proportion of time in $[t_{\min}, t_{\max}]$ spent by an individual in cell $G_j \in \mathcal{G}$. We assume that $T$ follows a uniform distribution on $[t_{\min}, t_{\max}]$, and define:
$$\pi_j = P(G(T) = G_j), \quad \text{for } j = 1, \ldots, N. \tag{9}$$
The activity distributions associated with the two example individuals we introduced earlier can differentiate between their mobility patterns if the grid cells containing $A_3$ and $A_4$ do not coincide with the grid cells of $A_1$ and $A_2$, and will show that the first example individual did not spend any time in the grid cells associated with $A_3$ and $A_4$. To employ activity distributions we need to have a method for recovering them from the available data.
The simplest estimator $\widehat{\pi} = (\widehat{\pi}_1, \ldots, \widehat{\pi}_N)$ of the activity distribution $\pi$ is based on the relative frequency of visitation of the grid cells $\mathcal{G}$:
$$\widehat{\pi}_j = \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}(g_i = G_j), \quad \text{for } j = 1, \ldots, N,$$
where $\mathbb{1}(\cdot)$ denotes the indicator function. However, this estimator of $\pi$ is reasonable only if $T$ follows a uniform distribution as in Eq. (9). When $T$ follows an arbitrary distribution with density $\rho(\cdot)$, a better approach is to use a weighted average estimator $\widetilde{\pi} = (\widetilde{\pi}_1, \ldots, \widetilde{\pi}_N)$ where:
$$\widetilde{\pi}_j = \frac{\sum_{i=1}^{n} \rho^{-1}(t_i) \, \mathbb{1}(g_i = G_j)}{\sum_{\ell=1}^{n} \rho^{-1}(t_\ell)}, \quad \text{for } j = 1, \ldots, N. \tag{10}$$
Although this estimator can be shown to be statistically consistent, it requires knowledge of the density $\rho(\cdot)$. There are many methods for estimating $\rho(\cdot)$ from the data such as histograms or kernel density estimators [40]. We suggest using an estimation method that assumes that the distribution of $T$ is approximated by a piecewise uniform distribution. We take $t_0 = t_{\min}$ and $t_{n+1} = t_{\max}$. If $T$ is approximately uniform in $[t_{i-1}, t_{i+1}]$ for $i = 1, \ldots, n$, then $\rho^{-1}(t_i) \approx t_{i+1} - t_{i-1}$. This is a reasonable assumption if the times when locations are collected are roughly equally spaced in time (e.g., a location is collected every 10 minutes) since the mean of $t_i$ is then the midpoint $(t_{i+1} + t_{i-1}) / 2$. Thus an estimator of $\rho(\cdot)$ is
$$\widehat{\rho}(t_i) = \frac{\omega(t_i)}{\sum_{\ell=1}^{n} \omega(t_\ell)}, \quad \omega(t_i) = \frac{1}{t_{i+1} - t_{i-1}}, \quad \text{for } i = 1, \ldots, n.$$
The weighted average estimator from Eq. (10) becomes
$$\widehat{\pi}_{o,j} = \frac{\sum_{i=1}^{n} \omega^{-1}(t_i) \, \mathbb{1}(g_i = G_j)}{\sum_{\ell=1}^{n} \omega^{-1}(t_\ell)} = \frac{\sum_{i=1}^{n} (t_{i+1} - t_{i-1}) \, \mathbb{1}(g_i = G_j)}{t_{\max} - t_{\min} + t_n - t_1}, \quad \text{for } j = 1, \ldots, N. \tag{11}$$
We call $\widehat{\pi}_o = (\widehat{\pi}_{o,1}, \ldots, \widehat{\pi}_{o,N})$ the ordinary proportional time estimator of the activity distribution $\pi$. This estimator relies on the assumption that the length of the time intervals in which an individual transitions between two grid cells is added to the time spent in both the grid cell they leave from, and the grid cell they arrive in. More specifically, assume that the consecutive observation times $t_i$ and $t_{i+1}$ are such that $g_i \ne g_{i+1}$. Then $\widehat{\pi}_o$ allocates $(t_{i+1} - t_i)$ to the total time spent in both $g_i$ and $g_{i+1}$.
We introduce a second estimator $\widehat{\pi}_c = (\widehat{\pi}_{c,1}, \ldots, \widehat{\pi}_{c,N})$ of the activity distribution $\pi$:
$$\widehat{\pi}_{c,j} = \frac{\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbb{1}(g_i = g_{i-1} = G_j)}{\sum_{i=2}^{n} (t_i - t_{i-1}) \, \mathbb{1}(g_i = g_{i-1})}, \quad \text{for } j = 1, \ldots, N. \tag{12}$$
We call $\widehat{\pi}_c$ the conservative proportional time estimator. This estimator is more conservative than the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) in the sense that any time interval defined by consecutive observation times $t_i$ and $t_{i+1}$ such that $g_i \ne g_{i+1}$ is ignored. That is, the time spent in a grid cell is calculated only based on time intervals in which an individual is known to have remained in that cell.
We show two important properties of the ordinary and the conservative proportional time estimators.
First, we prove that both estimators are asymptotically equivalent. Second, we prove that both estimators are statistically consistent, that is, they will eventually recover the true activity distribution $\pi$ if sufficient location data are available. These properties rely on the assumptions (S1), (S2) and (S3) below:
(S1) The maximum length of the time intervals between consecutive observation times vanishes: $\max_{i = 1, \ldots, n-1} | t_{i+1} - t_i | \to 0$ when $n \to \infty$.
(S2) The sampling period is such that $t_1 \to t_{\min}$ and $t_n \to t_{\max}$ when $n \to \infty$.
(S3) The number of transitions between grid cells is finite, i.e., there exists $M < \infty$ such that $\sum_{t \in [t_{\min}, t_{\max}]} \mathbb{1}(G(t^+) \ne G(t^-)) \le M$, where $G(t^-)$ and $G(t^+)$ are the left and right limits of $G(\cdot)$ at $t$.
Assumptions (S1) and (S2) describe the meaning of asymptotics in our context. They imply that the observation times $t_1, \ldots, t_n$ will eventually be dense in the reference time frame, i.e., there will not exist a fixed region of $[t_{\min}, t_{\max}]$ without any observation times when $n \to \infty$. Assumption (S3) requires that the spatiotemporal trajectory $X^{[t_{\min}, t_{\max}]}$ is sufficiently smooth such that it will not jump between grid cells infinitely often.
Theorem 2.1 (Asymptotic Equivalence Rule with Large Sampling Rate). Under assumptions (S1), (S2) and (S3), the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) and the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12) are asymptotically the same.
The proof of this result is given in Appendix A.1. We can also show that the same assumptions imply that the two estimators are statistically consistent.
Theorem 2.2 (Convergence Rule with Large Sampling Rate). Under assumptions (S1), (S2) and (S3), the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) and the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12) converge to the true activity distribution $\pi$ from Eq. (9).
The proof of this result is given in Appendix A.2.
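The two estimators in Eq. (11) and Eq. (12) can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the function names, the 0-based integer encoding of grid cells, and the toy trajectory (cell 0 during the first 60% of the time frame, cell 1 afterwards) are all assumptions made for the example.

```python
def ordinary_estimator(times, cells, n_cells, t_min, t_max):
    """Sketch of Eq. (11): observation i receives weight t_{i+1} - t_{i-1},
    with the conventions t_0 = t_min and t_{n+1} = t_max."""
    padded = [t_min] + list(times) + [t_max]
    weights = [0.0] * n_cells
    for i, g in enumerate(cells, start=1):
        weights[g] += padded[i + 1] - padded[i - 1]
    total = sum(weights)  # equals t_max - t_min + t_n - t_1
    return [w / total for w in weights]


def conservative_estimator(times, cells, n_cells):
    """Sketch of Eq. (12): only intervals whose two end points fall in the same cell count."""
    weights = [0.0] * n_cells
    for i in range(1, len(times)):
        if cells[i] == cells[i - 1]:
            weights[cells[i]] += times[i] - times[i - 1]
    total = sum(weights)
    if total == 0:
        raise ValueError("no two consecutive observations fall in the same grid cell")
    return [w / total for w in weights]


def sample_toy_trajectory(n):
    """Hypothetical trajectory on [0, 10]: cell 0 before time 6, cell 1 from time 6 on."""
    times = [10.0 * k / n for k in range(n + 1)]
    cells = [0 if t < 6.0 else 1 for t in times]
    return times, cells


# With denser sampling both estimates approach the true distribution (0.6, 0.4).
for n in (10, 100, 1000):
    times, cells = sample_toy_trajectory(n)
    print(n, ordinary_estimator(times, cells, 2, 0.0, 10.0), conservative_estimator(times, cells, 2))
```

On this toy trajectory the gap between the two estimates shrinks as the sampling interval shrinks, which is the behaviour Theorems 2.1 and 2.2 describe under assumptions (S1) to (S3).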
2.3. Measuring the temporal stability of human activity distributions.
We are interested in determining the temporal stability of the activity distribution of an individual. We assume that the reference time frame $[t_{\min}, t_{\max}]$ is divided into $D_{\max}$ time periods of equal lengths (e.g., days or weeks). We denote by $\pi^{(D)}$ the activity distribution from Eq. (9) associated with time period $D$, $D = 1, \ldots, D_{\max}$. Then $\pi^{(D)}$ can be viewed as an $N$-dimensional random vector whose distribution reflects the variability from time period to time period of the individual's mobility patterns. With this understanding, we are interested in determining the expectation $\bar{\pi} = E(\pi^{(D)})$. We call $\bar{\pi}$ the time period activity distribution (e.g., daily or weekly activity distribution). The $j$-th component of $\bar{\pi}$ is interpreted as the average proportion of time spent by the individual in grid cell $G_j$ in a given time period (a day or a week).
A simple estimator of $\bar{\pi}$ is
$$\widehat{\bar{\pi}}(D) = \frac{1}{D} \sum_{d=1}^{D} \widehat{\pi}^{(d)}, \quad \text{for } D = 1, \ldots, D_{\max}, \tag{13}$$
where $\widehat{\pi}^{(d)}$ is the ordinary proportional time estimator $\widehat{\pi}_o$ from Eq. (11) or the conservative proportional time estimator $\widehat{\pi}_c$ from Eq. (12), computed from the data in time period $d$.
Because $\widehat{\bar{\pi}}(D)$ is a consistent estimator of $\bar{\pi}$, the error we make when approximating $\bar{\pi}$ with $\widehat{\bar{\pi}}(D)$ decreases as we observe the spatiotemporal trajectory of the individual for a larger number of time periods. We define the last crossing time of the sequence of estimators $\{ \widehat{\bar{\pi}}(D) : D = 1, \ldots, D_{\max} \}$ as follows:
$$\widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma) = \max_{D = 1, \ldots, D_{\max}} \left\{ D : \| \widehat{\bar{\pi}}(D) - \widehat{\bar{\pi}}(D_{\max}) \|_1 > \gamma \right\}, \tag{14}$$
where $\| v \|_1$ is the usual $L_1$ norm for a vector $v$, i.e., $\| v \|_1 = \sum_i | v_i |$. Note that in Eq. (14) we used the fact that $\| \widehat{\bar{\pi}}(D) \|_1 = 1$ for any $D$, so the relative error reduces to the $L_1$ distance itself.
The last crossing time in Eq. (14) is a measure of the temporal stability of the entire time period activity distribution $\bar{\pi}$. Individuals that spend approximately the same amount of time in the same places in every time period need to be observed for a smaller number of time periods to calculate the estimator $\widehat{\bar{\pi}}(D)$ with the same APE compared to individuals with heterogeneous mobility patterns that spend different amounts of time at locations that change substantially across time periods. Therefore $\widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma)$ will be smaller for individuals whose time period to time period mobility changes less, and larger for individuals with irregular mobility patterns.
The disadvantage of using the last crossing time in Eq. (14) as a measure of temporal stability comes from the fact that it gives the same weight to the error made when estimating the proportion of time spent in grid cells in which an individual spends a lot of their time, and in grid cells that the individual rarely visits. The number of grid cells with a large proportion of time spent in them is likely significantly smaller than the total number of grid cells $N$ because most people tend to spend time at their residence, at their workplace and perhaps in a few other select locations. For this reason, the error made when estimating the proportion of time spent in grid cells with sparse presence could dominate the overall APE of $\widehat{\bar{\pi}}(D)$, and lead to larger values of $\widehat{\mathrm{LCT}}_{\mathrm{dist}}(\gamma)$. To remedy this issue, we define a new measure of temporal stability that focuses on the grid cells in which an individual spends larger proportions of time.
We define the ranking time period activity distribution $\bar{r} = (\bar{r}_1, \ldots, \bar{r}_N)$ associated with $\bar{\pi}$ by replacing each component of $\bar{\pi}$ with the sum of those components of $\bar{\pi}$ that are no larger than that component, as follows [7]:
$$\bar{r}_j = \sum_{l=1}^{N} \bar{\pi}_l \, \mathbb{1}(\bar{\pi}_l \le \bar{\pi}_j), \quad \text{for } j = 1, \ldots, N. \tag{15}$$
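The running average in Eq. (13), the last crossing time in Eq. (14) and the ranking distribution in Eq. (15) can be sketched together in Python. The function names and the toy weekly distributions over two grid cells are hypothetical, and returning 0 when the threshold is never exceeded is a convention assumed for this sketch.

```python
def running_means(period_dists):
    """Sketch of Eq. (13): average of the first D period-level distributions, D = 1..D_max."""
    sums = [0.0] * len(period_dists[0])
    means = []
    for d, dist in enumerate(period_dists, start=1):
        sums = [s + p for s, p in zip(sums, dist)]
        means.append([s / d for s in sums])
    return means


def lct_dist(period_dists, gamma):
    """Sketch of Eq. (14): last period D whose running mean is more than gamma away,
    in L1 norm, from the running mean over all D_max periods.
    Assumed convention: return 0 if the threshold is never exceeded."""
    means = running_means(period_dists)
    final = means[-1]
    crossing = [
        d for d, m in enumerate(means, start=1)
        if sum(abs(a - b) for a, b in zip(m, final)) > gamma
    ]
    return max(crossing) if crossing else 0


def ranking_distribution(pi_bar):
    """Sketch of Eq. (15): each component becomes the sum of all components no larger than it."""
    return [sum(q for q in pi_bar if q <= p) for p in pi_bar]


# Hypothetical weekly activity distributions over N = 2 grid cells.
weekly = [[1, 0], [1, 0], [0, 1], [0.5, 0.5], [0.5, 0.5], [0.6, 0.4]]
print(lct_dist(weekly, gamma=0.1))
print(ranking_distribution([0.5, 0.3, 0.2]))
```

In the toy ranking example the most visited cell receives rank 1 because every component is no larger than its own, while the least visited cell keeps only its own mass.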
The $\alpha$-level set ($\alpha \in [0, 1]$) of $\bar{r}$ is defined to consist of all the grid cells whose corresponding components in $\bar{r}$ are at least $\alpha$:
$$L_\alpha = \{ G_j : \bar{r}_j \ge \alpha \}. \tag{16}$$
It turns out that the $\alpha$-level set covers grid cells whose total sum of components of $\bar{\pi}$ is at least $1 - \alpha$:
$$\sum_{G_j \in L_\alpha} \bar{\pi}_j \ge 1 - \alpha.$$
Level sets have an easy to understand interpretation: for a given level $\alpha$, say $\alpha = 0.7$, the level set $L_\alpha$ covers grid cells in which the individual spends at least $(1 - 0.7) \cdot 100 = 30\%$ of the time in the time period. Values of $\alpha$ closer to 1 lead to level sets $L_\alpha$ with a smaller coverage that comprise only the grid cells in which the individual spends the largest amounts of time. Values of $\alpha$ close to 0 lead to level sets $L_\alpha$ with a larger coverage that comprise the majority of grid cells the individual spent time in.
Let $\widehat{\bar{r}}(D)$ be the ranking distribution of the estimator $\widehat{\bar{\pi}}(D)$ of $\bar{\pi}$ in Eq. (13), and $L_\alpha(D)$ be the $\alpha$-level set associated with $\widehat{\bar{r}}(D)$ as in Eq. (16). Given a level $\alpha \in [0, 1]$ and a threshold $\gamma > 0$, we define the last crossing time of the sequence of level sets $\{ L_\alpha(D) : D = 1, \ldots, D_{\max} \}$ as follows:
$$\widehat{\mathrm{LCT}}_{\mathrm{level}, \alpha}(\gamma) = \max_{D = 1, \ldots, D_{\max}} \left\{ D : \frac{\| L_\alpha(D) \, \triangle \, L_\alpha(D_{\max}) \|}{\| L_\alpha(D_{\max}) \|} > \gamma \right\}, \tag{17}$$
where $\triangle$ denotes the symmetric difference of two sets, and $\| \cdot \|$ denotes the number of elements in a set.
The LCT of the level sets from Eq. (17) is a measure of temporal stability of the time period activity distribution $\bar{\pi}$ that takes into account only the error made when estimating the time spent in the grid cells in which an individual spent most of their time. For the same value of $\gamma$, $\widehat{\mathrm{LCT}}_{\mathrm{level}, \alpha}(\gamma)$ is decreasing as the level $\alpha$ is increasing.
3. Application
The data we analyze come from Nokia's Mobile Data Challenge (MDC) [18, 23, 24]. This was a mobile computing research initiative focusing on generating a deeper scientific understanding of social and behavioral patterns related to mobile technologies. The study took place in Switzerland, and collected various types of longitudinal information including time-stamped GPS data from the cell phones of 185 study participants over the course of 18 months. Demographic data such as age and sex are also available. There are approximately 57.5 million GPS location records. The average length of observation for study participants was about 55 weeks. These data are publicly available upon request from the Idiap Research Institute.
Table 1. Means, medians and sample standard deviations of three measures of temporal stability of mobility patterns. The unit of time is weeks.
Mobility Measure             Mean    Median   St. Dev.
LCT-velocity                 30.04   26       17.29
LCT-distribution             37.18   37       16.06
LCT-level set (α = 0.2)      17.69   17       9.50
Most activities of daily living of the study participants took place in a rectangular area that we partitioned into 4000 square grid cells with sides of length 28 meters. The locations that do not belong to this spatial observation window were dropped. These locations typically correspond with longer trips taken by study participants away from their places of residence. Figure 2 displays summaries of the GPS locations that fall in our chosen spatial observation window.
Figure 2.
Summary information of the GPS location data. Left panel:histogram of the total length of observation for each study participantexpressed in weeks. Right panel: histrogram of the average number ofGPS locations per week for each study participant.For each study participant, we calculated three measures of temporal stability oftheir mobility patterns: the last crossing time of the average velocity (LCT-velocity)as defined in Eq. (5) and Eq. (7), the last crossing time of the activity distribution(LCT-distribution) as defined in Eq. (14), and the last crossing time of the level setsof the weekly activity distribution as defined in Eq. (17). In the calculation of LCT-distribution and LCT-level set, we use the ordinary proportional time estimator definedin Eq. (11). We used α = 0 . γ = 0 . *, † , YEN-CHI CHEN * AND ADRIAN DOBRA * weekly activity distribution stabilizes. The increased length of the period of observationfor this measure is not surprising since it is based on an estimated of the full weeklyactivity distribution in N = 4000 grid cells. About half of this observation time (18weeks) is needed to obtain estimates of the 0 . α -level set L α from Eq. (16) and its corresponding LCT-levelset (cid:100) LCT level ,α (0 .
2) from Eq. (17) change for different values of α ∈ [0 , G grid whose vertices are the N = 4000 grid cells in thespatial observation window. Two grid cells are connected by an edge in G grid if theyshare an edge or a corner in their arrangement in the spatial observation window [39, 4].We denote by G grid ( L α ) the subgraph of G grid defined by the grid cells in L α . We chose astudy participant, and determined the level set L α , the last crossing time (cid:100) LCT level ,α (0 . G grid ( L α ) for α ∈ { . , . , . . . , } – seeFigure 3. For smaller values of α , L α contains grid cells in which the study participantspend the largest proportion of time. When α ∈ { . , . , . , . } , G grid ( L α ) has oneconnected component which implies that the grid cells that belong to L α are spatiallyadjacent, and define a single area in which the study participant spends larger amounts oftime. The corresponding values of (cid:100) LCT level ,α ( γ ) are less than 20 weeks which representsthe length of observation time needed for reliably detecting this spatial area. For α ∈{ . , . } , G grid ( L α ) has two connected components, and for α ∈ { . , . } , G grid ( L α ), G grid ( L α ) has three connected components. Thus this study participant spends theirtime in grid cells that define two or three spatially contiguous areas. Since these areasinclude grid cells in which the study participant spends smaller proportions of theirweekly time, the length of the observation time needed to identify these areas doubles toabout 40 weeks. For α = 1, G grid ( L α ) has 72 connected components because L α includesgrid cells in which the study participant spends very little time. Figure 3 shows thatapproximately 70 weeks of observation time are needed to detect these grid cells. Thesame type of plots constructed for other study participants show similar relationshipsbetween α , L α , and (cid:100) LCT level ,α (0 . ≥
≥ 55 years old). For each of these five demographic groups, we calculated the average of the last crossing times of the activity distribution $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}$(0 . $\alpha \in$ { . , . , . . . , }. The resulting curves are presented in Figure 4. The last crossing times at all levels are similar for men and women (see the top left panel). As such, there do not seem to be any sex-based differences in the temporal stability of the mobility patterns of men and women who live in Switzerland. However, since Switzerland is known to be a country with very high equality between the two sexes, this finding might not extend to other countries with profound sex inequality.

In the top right and bottom panels of Figure 4, we find evidence that the average last crossing times decrease with age, especially for levels below 0.5. This means that mobility patterns are more regular, and consequently more temporally stable, for older study participants than for younger study participants. The average last crossing times are larger and become very similar across demographic groups for levels above 0.5. Thus study participants that belong to any of the five demographic groups tend to also visit locations they do not typically visit. Longer observation periods are needed to successfully determine these locations. Nevertheless, in order to identify the areas in which study participants spend most of their time, Figure 4 suggests that 10 weeks of observation of GPS locations should suffice for individuals older than 55. Middle-aged individuals require about 15 weeks of observation time, while young individuals require about 20 weeks.

Figure 3. Values of the LCT-level sets $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ for $\alpha \in$ { . , . , . . . , } for an MDC study participant. The unit of time is weeks. The number of connected components of $G_{\mathrm{grid}}(L_\alpha)$ defined by the $\alpha$-level sets $L_\alpha$ is shown above the curve.

4. Discussion
The contribution of this paper is twofold. On the theoretical side, we proposed the use of last crossing time processes associated with spatiotemporal trajectories of individuals to assess the temporal stability of their mobility patterns. We defined several measures of the temporal dynamics of spatiotemporal trajectories based on the average velocity process, and on human activity distributions in a spatial observation window. We defined the ordinary and the conservative proportional time estimators of human activity distributions, and proved that they are consistent and asymptotically equivalent. We introduced the time period and the ranking time period activity distributions that capture the change in human activity distributions across time periods. We presented related estimators based on GPS location data.

On the empirical side, we analyzed GPS location data collected over a period of 18 months. The previous empirical study [43] that focused on assessing the duration of GPS studies is based on data collected over 30 days. By using our new statistical methods and GPS data collected over a much longer period of time, we determined that GPS monitoring needs to be done for at least 15 weeks, which represents a minimum study duration about 7 times longer than the 14 days minimum duration recommended in [43]. We also put forward the idea that the duration of GPS studies should be assessed by demographic groups. We determined that younger population groups should be monitored for longer periods of time compared to middle-aged population groups because of their more irregular patterns of mobility. On the other hand, shorter monitoring periods might be needed for older population groups that exhibit mobility patterns that are temporally more stable. We also suggest using our methods to assess the need for different time spans of GPS monitoring for men and women in countries with a known history of inequality between the two sexes. To the best of our knowledge, differential periods of GPS data collection based on demographic groups have not been discussed before. Our work suggests that GPS study designs should take demographic groups into account.

Figure 4. Mean values and 90% confidence intervals of the LCT-level sets $\widehat{\mathrm{LCT}}_{\mathrm{level},\alpha}(0.2)$ for $\alpha \in$ { . , . , . . . , } calculated for five demographic groups: sex (male, female), and age (young, middle, old).

Funding
The work of Z.D. and A.D. was partially supported by the National Science Foundation Grant DMS/MPS-1737746 to the University of Washington. Y.C. received partial support from the National Science Foundation Grant DMS-1810960 and National Institutes of Health Grant U01-AG016976. The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Acknowledgment
Portions of the research in this paper used the MDC Database made available by Idiap Research Institute, Switzerland and owned by Nokia.
Appendix A. Proofs of theoretical results
A.1. Proof of Theorem 2.1.

Proof. We note that the ordinary proportional time estimator in Eq. (11) can be written as

$$\widehat{\pi}_{o,j} = \frac{\frac{1}{2}\sum_{i=2}^{n-1} (t_{i+1} - t_{i-1})\,\mathbb{1}(g_i = G_j)}{\frac{1}{2}(T + t_n - t_1)}, \tag{18}$$

where $T = t_{\max} - t_{\min}$. We will first show that the denominators of $\widehat{\pi}_{o,j}$ and $\widehat{\pi}_{c,j}$ are asymptotically the same. Assumption (S2) implies that $\frac{1}{2}(T + t_n - t_1) \to T$, which shows the asymptotic behavior of the denominator of $\widehat{\pi}_{o,j}$. For $\widehat{\pi}_{c,j}$, we have

$$\begin{aligned}
\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1}) &= \sum_{i=2}^{n} (t_i - t_{i-1}) - \sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i \neq g_{i-1}) \\
&= T - \sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i \neq g_{i-1}) \\
&\geq T - M \max_i |t_{i+1} - t_i| \\
&\to T,
\end{aligned}$$

where $M$ is the constant from assumption (S3). The limit in the above display is due to assumption (S1). Thus, the denominators of $\widehat{\pi}_{o,j}$ and $\widehat{\pi}_{c,j}$ are asymptotically the same. Next we focus on the numerators of the two estimators.

The numerator of $\widehat{\pi}_{c,j}$ can be written as

$$\sum_{i=2}^{n} (t_{i+1} - t_i)\,\mathbb{1}(g_{i+1} = g_i = G_j) = \sum_{i=2}^{n} A_i,$$

where $A_i = (t_{i+1} - t_i)\,\mathbb{1}(g_{i+1} = g_i = G_j)$. Let $B_i = \frac{t_{i+1} - t_{i-1}}{2}\,\mathbb{1}(g_i = G_j)$. Using Eq. (18), the numerator of $\widehat{\pi}_{o,j}$ can be written as

$$\frac{1}{2}\sum_{i=2}^{n-1} (t_{i+1} - t_{i-1})\,\mathbb{1}(g_i = G_j) = \sum_{i=2}^{n-1} B_i.$$

When $g_{i-1} = g_i = g_{i+1} = G_j$, we have $2B_i = A_i + A_{i-1}$. By assumption (S3), there are at most $2M$ time points $t_i$ such that the equality $g_{i-1} = g_i = g_{i+1} = G_j$ does not hold. Thus

$$\sum_{i=2}^{n-1} B_i\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j) \geq \sum_{i=2}^{n-1} B_i - M \cdot \max_i |t_{i+1} - t_i|,$$

which implies that

$$\begin{aligned}
\widehat{\pi}_{o,j} &\to \frac{1}{T}\sum_{i=2}^{n-1} B_i\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j) \\
&= \frac{1}{T}\sum_{i=2}^{n-1} \frac{A_i + A_{i-1}}{2}\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j).
\end{aligned} \tag{19}$$

Again, using the fact that there are at most $2M$ time points $t_i$ such that the equality $g_{i-1} = g_i = g_{i+1} = G_j$ does not hold, we obtain

$$\begin{aligned}
\sum_{i=2}^{n-1} A_i\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j) &\geq \sum_{i=2}^{n} A_i - (2M + 1) \cdot \max_i |t_{i+1} - t_i|, \\
\sum_{i=2}^{n-1} A_{i-1}\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j) &\geq \sum_{i=2}^{n} A_i - (2M + 1) \cdot \max_i |t_{i+1} - t_i|.
\end{aligned}$$

It follows that

$$\begin{aligned}
\widehat{\pi}_{c,j} = \frac{\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1} = G_j)}{\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1})} &\to \frac{1}{T}\sum_{i=2}^{n} A_i \\
&\to \frac{1}{T}\sum_{i=2}^{n-1} \frac{A_i + A_{i-1}}{2}\,\mathbb{1}(g_{i-1} = g_i = g_{i+1} = G_j),
\end{aligned}$$

which is the same limit in Eq. (19) we obtained for $\widehat{\pi}_{o,j}$. Therefore the numerators of $\widehat{\pi}_{o,j}$ and $\widehat{\pi}_{c,j}$ are asymptotically the same, which proves that $\widehat{\pi}_{o,j}$ and $\widehat{\pi}_{c,j}$ are asymptotically equal. □
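The asymptotic equivalence of the two estimators can also be checked numerically. The Python sketch below is illustrative only and makes several assumptions that are not part of the paper: the trajectory is a made-up path on the reference time frame $[0, 1]$ that spends the intervals $[0.2, 0.5)$ and $[0.7, 0.9)$ in the cell of interest, observation times are drawn uniformly at random, the ordinary estimator is implemented in the form of Eq. (18) as read from the proof, and the conservative estimator is implemented as the ratio that appears in the proof. All function names are hypothetical.

```python
import numpy as np

TRUE_PI = 0.5  # (0.5 - 0.2) + (0.9 - 0.7), with t_max - t_min = 1


def in_cell(t):
    """Made-up trajectory: label 1 while inside the cell of interest, else 0."""
    t = np.asarray(t, dtype=float)
    inside = ((t >= 0.2) & (t < 0.5)) | ((t >= 0.7) & (t < 0.9))
    return inside.astype(int)


def ordinary_estimator(t, g, cell, t_min, t_max):
    """Eq. (18) form: interior observation i carries weight t_{i+1} - t_{i-1}."""
    T = t_max - t_min
    weights = t[2:] - t[:-2]
    hits = g[1:-1] == cell
    return float(np.sum(weights * hits) / (T + t[-1] - t[0]))


def conservative_estimator(t, g, cell):
    """Counts a time gap only when both of its endpoints lie in the same cell."""
    dt = np.diff(t)
    same = g[1:] == g[:-1]
    hits = same & (g[1:] == cell)
    return float(np.sum(dt * hits) / np.sum(dt * same))


def compare(n, seed=0):
    """Return (ordinary, conservative) estimates from n random observation times."""
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 1.0, size=n))
    g = in_cell(t)
    return ordinary_estimator(t, g, 1, 0.0, 1.0), conservative_estimator(t, g, 1)


for n in (50, 500, 5000, 50000):
    o, c = compare(n)
    print(n, round(o, 4), round(c, 4), round(abs(o - c), 4))
```

As the number of observations grows, the largest gap between consecutive observation times shrinks, and the printed difference between the two estimates should shrink with it, with both values approaching `TRUE_PI`. This is only a sanity check on one simulated path under the assumptions above, not a substitute for the proof.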
A.2. Proof of Theorem 2.2.

Proof. Theorem 2.1 proves that the two estimators are asymptotically equivalent. Thus, we only need to derive the convergence of one of the two estimators to the true activity distribution $\pi = (\pi_1, \ldots, \pi_N)$ from Eq. (9). In what follows we focus on the conservative proportional time estimator.

Without loss of generality, we assume that there exist $K \geq$ $G_j$, i.e., there are $[a_1, b_1], \cdots, [a_K, b_K]$ such that $a_i < b_i < a_{i+1}$ for $i = 1, \ldots, K - 1$, $t_{\min} \leq a_1$, $b_K \leq t_{\max}$ and

$$\{t : G(t) \in G_j\} = [a_1, b_1] \cup \cdots \cup [a_K, b_K].$$

Since, in the definition of the true activity distribution $\pi$, $T$ follows a uniform distribution on the reference time frame $[t_{\min}, t_{\max}]$, we can express $\pi_j$ as

$$\pi_j = P(G(T) \in G_j) = \sum_{k=1}^{K} P(T \in [a_k, b_k]) = \frac{1}{T}\sum_{k=1}^{K} (b_k - a_k).$$

As before, $T = t_{\max} - t_{\min}$.

For the interval $[a_k, b_k]$, we let $t_{i^*}$ be the first observation time after $a_k$, and $t_{i^{**}}$ be the last observation time before $b_k$:

$$t_{i^*} \geq a_k, \quad t_{i^*-1} < a_k, \quad t_{i^{**}+1} > b_k, \quad t_{i^{**}} \leq b_k.$$

Because $G(t) \in G_j$ for all $t \in [a_k, b_k]$, we have $g_i \in G_j$ for all $i \in \{i^*, i^* + 1, \ldots, i^{**}\}$. The conservative proportional time estimator estimates the length of the interval $[a_k, b_k]$ based on the length of the interval $[t_{i^*}, t_{i^{**}}]$. The corresponding error is

$$\begin{aligned}
|(b_k - a_k) - (t_{i^{**}} - t_{i^*})| &\leq t_{i^*} - a_k + b_k - t_{i^{**}} \\
&\leq (t_{i^*} - t_{i^*-1}) + (t_{i^{**}+1} - t_{i^{**}}) \\
&\leq 2 \max_{i=1,\ldots,n-1} |t_{i+1} - t_i| \to 0,
\end{aligned}$$

due to assumption (S1).

By applying the above argument to each interval $[a_k, b_k]$, $k = 1, \ldots, K$, we conclude that

$$\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = G_j) \to \sum_{k=1}^{K} (b_k - a_k).$$

Because

$$\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = G_j) \geq \sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1} = G_j) - M \cdot \max_{i=1,\ldots,n-1} |t_{i+1} - t_i|,$$

we further conclude that

$$\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1} = G_j) \to \sum_{k=1}^{K} (b_k - a_k).$$

This proves the convergence of the conservative proportional time estimator to the true activity distribution:

$$\begin{aligned}
\widehat{\pi}_{c,j} &\to \frac{\sum_{i=2}^{n} (t_i - t_{i-1})\,\mathbb{1}(g_i = g_{i-1} = G_j)}{T} \\
&\to \frac{\sum_{k=1}^{K} (b_k - a_k)}{T} \\
&= \pi_j.
\end{aligned}$$

□

References
[1] L.A. Basta, T.S. Richmond, and D.J. Wiebe,
Neighborhoods, daily activities, and measuring health risks experienced in urban environments, Social Science & Medicine 71 (2010), pp. 1943–1950.
[2] J. Beekhuizen, H. Kromhout, A. Huss, and R. Vermeulen, Performance of GPS-devices for environmental exposure assessment, Journal of Exposure Science and Environmental Epidemiology 23 (2013), pp. 498–505.
[3] D. Berrigan, J.A. Hipp, P.M. Hurvitz, P. James, M.M. Jankowska, J. Kerr, F. Laden, T. Leonard, R.A. McKinnon, T.M. Powell-Wiley, E. Tarlov, S.N. Zenk, and The TREC Spatial and Contextual, Measures and Modeling Work Group, Geospatial and contextual approaches to energy balance and health, Annals of GIS 21 (2015), pp. 157–168.
[4] R.S. Bivand, E. Pebesma, and V. Gómez-Rubio, Applied Spatial Data Analysis with R, Springer, New York, 2013.
[5] C.R. Browning, C.A. Calder, B. Soller, A.L. Jackson, and J. Dirlam, Ecological networks and neighborhood social organization, American Journal of Sociology 122 (2017), pp. 1939–1988.
[6] H. Byrnes, B.A. Miller, C.N. Morrison, D.J. Wiebe, M. Woychik, and S.E. Wiehe, Association of environmental indicators with teen alcohol use and problem behavior: Teens’ observations vs. objectively-measured indicators, Health & Place 43 (2017), pp. 151–157.
[7] Y.C. Chen, Generalized cluster trees and singular measures, Annals of Statistics 47 (2019), pp. 2174–2203.
[8] W.J. Christian, Using geospatial technologies to explore activity-based retail food environments, Spatial and Spatio-temporal Epidemiology 3 (2012), pp. 287–295.
[9] R. Courant and F. John, Introduction to Calculus and Analysis, Vol. I, Springer, New York, 1991.
[10] A. Dobra and N.E. Williams, Spatiotemporal detection of unusual human population behavior using mobile phone data, PLoS ONE 10 (2015), p. e0120449.
[11] D.T. Duncan, D.A. Hickson, W.C. Goedel, D. Callander, B. Brooks, Y.T. Chen, H. Hanson, R. Eavou, A.S. Khanna, B. Chaix, S. Regan, D.P. Wheeler, K.H. Mayer, S.A. Safren, M.S. Carr, C. Draper, V. Magee-Jackson, R. Brewer, and J.A. Schneider, International Journal of Environmental Research and Public Health 16 (2019), p. 1922.
[12] D.T. Duncan, F. Kapadia, S.D. Regan, W.C. Goedel, M.D. Levy, S.C. Barton, S.R. Friedman, and P.N. Halkitis, Feasibility and acceptability of Global Positioning System (GPS) methods to study the spatial contexts of substance use and sexual risk behaviors among young men who have sex with men in New York City: A P18 cohort sub-study, PLoS ONE 11 (2016), p. e0147520.
[13] K. Elgethun, M.G. Yost, C.T. Fitzpatrick, T.L. Nyerges, and R.A. Fenske, Comparison of Global Positioning System (GPS) tracking and parent-report diaries to characterize children’s time-location patterns, Journal of Exposure Science and Environmental Epidemiology 17 (2007), pp. 196–206.
[14] B. Entwisle, Putting people into place, Demography 44 (2007), pp. 687–703.
[15] C. Graif, A.S. Gladfelter, and S.A. Matthews, Urban poverty and neighborhood effects on crime: Incorporating spatial and network perspectives, Sociology Compass 8 (2014), pp. 1140–1155.
[16] C. Harding, Z. Patterson, L. Miranda-Moreno, and S. Zahabi, Modeling the effect of land use on activity spaces, Transportation Research Record: Journal of the Transportation Research Board 2323 (2012), pp. 67–74.
[17] Y. Kestens, A. Lebel, M. Daniel, M. Thériault, and R. Pampalon, Using experienced activity spaces to measure foodscape exposure, Health & Place 16 (2010), pp. 1094–1103.
[18] N. Kiukkonen, J. Blom, O. Dousse, D. Gatica-Perez, and J. Laurila, Towards Rich Mobile Phone Datasets: Lausanne Data Collection Campaign, in Proc. ACM Int. Conf. on Pervasive Services (ICPS), Berlin, July 2010.
[19] J.A. Kopec, Concepts of disability: The activity space model, Social Science & Medicine 40 (1996), pp. 649–656.
[20] L.J. Krivo, H.M. Washington, R.D. Peterson, C.R. Browning, C.A. Calder, and M.P. Kwan, Social isolation of disadvantage and advantage: The reproduction of inequality in urban space, Social Forces 92 (2013), pp. 141–164.
[21] M.P. Kwan, The uncertain geographic context problem, Annals of the Association of American Geographers 102 (2012), pp. 958–968.
[22] M.P. Kwan, Beyond space (as we knew it): Toward temporally integrated geographies of segregation, health, and accessibility, Annals of the Association of American Geographers 103 (2013), pp. 1078–1086.
[23] J.K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T. Do, O. Dousse, J. Eberle, and M. Miettinen, The Mobile Data Challenge: Big Data for Mobile Computing Research, in Proc. Mobile Data Challenge Workshop (MDC) in conjunction with Int. Conf. on Pervasive Computing, Newcastle, June 2012.
[24] J.K. Laurila, D. Gatica-Perez, I. Aad, J. Blom, O. Bornet, T.M.T. Do, O. Dousse, J. Eberle, and M. Miettinen, From big smartphone data to worldwide research: The Mobile Data Challenge, Pervasive and Mobile Computing 9 (2013), pp. 752–771.
[25] J.H. Lee, A.W. Davis, S.Y. Yoon, and K.G. Goulias, Activity space estimation with longitudinal observations of social media data, Transportation 43 (2016), pp. 955–977.
[26] S.A. Matthews and T.C. Yang, Spatial polygamy and contextual exposures (SPACEs): Promoting activity space approaches in research on place and health, The American Behavioral Scientist 57 (2013), pp. 1057–1081.
[27] J.D. Mazimpaka and S. Timpf, Trajectory data mining: A review of methods and applications, Journal of Spatial Information Science 13 (2016), pp. 61–99.
[28] H. Miller, Place-based versus people-based Geographic Information Science, Geography Compass 1 (2007), pp. 503–535.
[29] C.N. Morrison, H.F. Byrnes, B.A. Miller, E. Kaner, S.E. Wiehe, W.R. Ponicki, and D. Wiebe, Assessing individuals’ exposure to environmental conditions using residence-based measures, activity location-based measures, and activity path-based measures, Epidemiology 30 (2019), pp. 166–176.
[30] T.H. Newsome, W.A. Walcott, and P.D. Smith, Urban activity spaces: Illustrations and application of a conceptual model for integrating the time and space dimensions, Transportation 25 (1998), pp. 357–377.
[31] A.J. Noah, Putting families into place: Using neighborhood-effects research and activity spaces to understand families, Journal of Family Theory & Review 7 (2015), pp. 452–467.
[32] B.K. Paul, Female activity space in rural Bangladesh, Geographical Review 82 (1992), pp. 1–12.
[33] C. Perchoux, B. Chaix, S. Cummins, and Y. Kestens, Conceptualization and measurement of environmental exposure in epidemiology: Accounting for activity space related to daily mobility, Health & Place 21 (2013), pp. 86–93.
[34] D.B. Richardson, N.D. Volkow, M.P. Kwan, R.M. Kaplan, M.F. Goodchild, and R.T. Croyle, Spatial turn in health research, Science 339 (2013), pp. 1390–1392.
[35] S. Schonfelder and K.W. Axhausen, Activity spaces: Measures of social exclusion?, Transport Policy 10 (2003), pp. 273–286.
[36] J.E. Sherman, J. Spencer, J.S. Preisser, W.M. Gesler, and T.A. Arcury, A suite of methods for representing activity space in a healthcare accessibility study, International Journal of Health Geographics 4 (2015), p. 24.
[37] L.K. VanWey, R.R. Rindfuss, M.P. Gutmann, B. Entwisle, and D.L. Balk, Confidentiality and spatially explicit data: Concerns and challenges, Proceedings of the National Academy of Sciences 102 (2005), pp. 15337–15342.
[38] G.M. Vazquez-Prokopec, D. Bisanzio, S.T. Stoddard, V. Paz-Soldan, A.C. Morrison, J.P. Elder, J. Ramirez-Paredes, E.S. Halsey, T.J. Kochel, and T.W. Scott, Using GPS technology to quantify human mobility, dynamic contacts and infectious disease dynamics in a resource-poor urban environment, PLoS ONE 8 (2013), p. e58802.
[39] L.A. Waller and C.A. Gotway, Applied Spatial Statistics for Public Health Data, John Wiley & Sons, Hoboken, NJ, 2004.
[40] L. Wasserman, All of Nonparametric Statistics, Springer Texts in Statistics, Springer, New York, 2007.
[41] S.E. Wiehe, M.P. Kwan, J. Wilson, and J.D. Fortenberry, Adolescent health-risk behavior and community disorder, PLoS ONE 8 (2013), p. e77667.
[42] S.E. Wiehe, A.E. Carroll, G.C. Liu, K.L. Haberkorn, S.C. Hoch, J.S. Wilson, and J.D. Fortenberry, Using GPS-enabled cell phones to track the travel patterns of adolescents, International Journal of Health Geographics 7 (2008), p. 22.
[43] S.N. Zenk, S.A. Matthews, A.N. Kraft, and K.K. Jones, How many days of Global Positioning System (GPS) monitoring do you need to measure activity space environments in health research?, Health & Place 51 (2018), pp. 52–60.
[44] S.N. Zenk, A.J. Schulz, S.A. Matthews, A. Odoms-Young, J. Wilbur, L. Wegrzyn, K. Gibbs, C. Braunschweig, and C. Stokes, Activity space environment and dietary and physical activity behaviors: a pilot study, Health & Place 17 (2011), pp. 1150–1161.

* Department of Statistics, University of Washington, Seattle, WA, USA; ††