A continuous-time state-space model for rapid quality-control of Argos locations from animal-borne tags
Ian D. Jonsen, Toby A. Patterson, Daniel P. Costa, Philip D. Doherty, Brendan J. Godley, W. James Grecian, Christophe Guinet, Xavier Hoenner, Sarah S. Kienle, Patrick W. Robinson, Stephen C. Votier, Matthew J. Witt, Mark A. Hindell, Robert G. Harcourt, Clive R. McMahon
Jonsen et al.
METHODOLOGY
Ian D. Jonsen, Toby A. Patterson, Daniel P. Costa, Philip D. Doherty, Brendan J. Godley, W. James Grecian, Christophe Guinet, Xavier Hoenner, Sarah S. Kienle, Patrick W. Robinson, Stephen C. Votier, Scott Whiting, Matthew J. Witt, Mark A. Hindell, Robert G. Harcourt and Clive R. McMahon
* Correspondence: [email protected]
Dept of Biological Sciences, Macquarie University, Sydney, Australia
Full list of author information is available at the end of the article
Abstract
Background:
State-space models are important tools for quality control and analysis of error-prone animal movement data. The near real-time (within 24 h) capability of the Argos satellite system can aid dynamic ocean management of human activities by informing when animals enter wind farms, shipping lanes, and other intensive use zones. This capability also facilitates the use of ocean observations from animal-borne sensors in operational ocean forecasting models. Such near real-time data provision requires rapid, reliable quality control to deal with error-prone Argos locations.
Methods:
We formulate a continuous-time state-space model to filter the three types of Argos location data (Least-Squares, Kalman filter, and Kalman smoother), accounting for irregular timing of observations. Our model is deliberately simple to ensure speed and reliability for automated, near real-time quality control of Argos location data. We validate the model by fitting to Argos locations collected from 61 individuals across 7 marine vertebrates and compare model-estimated locations to contemporaneous GPS locations. We then test assumptions that Argos Kalman filter/smoother error ellipses are unbiased, and that Argos Kalman smoother location accuracy cannot be improved by subsequent state-space modelling.
Results:
Estimation accuracy varied among species with median Root Mean Squared Errors usually < [...]
Conclusions:
Our model provides quality controlled locations from Argos Least-Squares or Kalman filter data with accuracy similar to or marginally better than Argos Kalman smoother data that are only available via fee-based reprocessing. Simplicity and ease of use make the model suitable both for automated quality control of near real-time Argos data and for manual use by researchers working with historical Argos data.
Keywords: animal-borne sensors; bio-telemetry; foieGras R package; Global Positioning System; seabird; pinniped; sea turtle; Template Model Builder
arXiv [q-bio.QM]
Background
State-space models have emerged as important tools both for quality control and ecological analysis of error-prone animal movement data [1–5]. Analysis of these data with discrete-time models is simple in principle, breaking down animal movement into a series of discrete steps that occur on some fixed time interval (e.g., [1, 6]). Yet animal movement is a process that unfolds continuously through time, usually without clear breaks that could delineate discrete steps. We merely measure the movements from locations obtained over discrete, often irregular intervals in time. In this sense, a continuous-time model can more naturally handle temporally irregular observations while mimicking the true underlying continuous movement process(es) [2, 7].
In the marine realm, air-breathing animal locations are typically measured by satellite-linked electronic tags at irregular time intervals dictated by a combination of satellite availability and an animal’s surface behaviour. The Argos satellite telemetry system is one of the most common platforms used to track animals at sea, with over 40,000 individuals tracked since 2007 (S. Baudel, pers. comm.). In this system, transmissions from electronic tags are received by one of several polar-orbiting satellites as they pass overhead, and the Doppler shift in transmission frequency along with other information is used to geolocate the tags [8]. The polar orbits of Argos satellites result in denser coverage and potentially higher temporal resolution data closer to the poles than at the equator. From inception in 1978 to 2011, CLS (Collecte Localisation Satellites) used a Least-Squares algorithm to geolocate the tag transmissions.
This approach does not quantify location uncertainty but rather provides location quality classes based on information including the number of transmissions received [8].
State-space models developed for Argos Least-Squares locations have relied on independent, ground-truth data (e.g., [9]) to quantify location uncertainty for each of the location quality classes [1, 2]. However, independently quantified uncertainties, based on a single or small number of data sets, are unlikely to be appropriate for all species in all locations. For example, Lowther et al. [10] found that modifications to assumed Least-Squares error variances can influence the accuracy of locations predicted by different state-space models.
In 2011, CLS replaced their Least-Squares algorithm with a state-space model, based on a multiple model Kalman filter algorithm, to estimate locations and their uncertainty [11]. This approach provides more location estimates, each with a corresponding estimated error ellipse, and with greater accuracy compared to the original Least-Squares method. These locations are provided in near real-time, here defined as within 24 h of occurrence. However, CLS also provides an extra service that uses a fixed-interval Kalman smoother to further improve location accuracy from the original Kalman filter-based location estimates [12]. Whereas the Kalman filter employs a one-step recursion to estimate locations based only on the current and previous observations, the Kalman smoother uses a two-pass approach, first employing the Kalman filter and then employing a backwards smooth of the data [13]. In this sense, the Kalman smoother uses information from the entire animal track to estimate locations and their uncertainty. This results in more accurate location estimates than the Kalman filter alone [12]. Such smoother-based location estimates are theoretically optimal given the available data, and it should not be possible to improve on them if uncertainty is characterised and propagated accurately (e.g., [14]). Currently, CLS does not provide Kalman smoother-based locations in near real-time; they can only be obtained with reprocessing, for an additional fee, after a tag deployment ends.
Traditional use of animal tracking data has required neither near real-time data provision nor rapid modelling tools for quality control or ecological analysis. However, real-time management of at-risk species’ mortality from interactions with human activities such as offshore wind farms, fisheries and shipping increasingly relies on animal telemetry data [15–17]. Dynamic ocean management applied at high spatial and temporal resolutions can increase the efficiency and efficacy of measures to reduce mortality [18], placing an onus on rapidly available, high-resolution data. Similarly, the utility of animal-borne sensors for ocean observing [19, 20] as part of the Global Ocean Observing System has spurred coordinated animal telemetry programs, such as the Australian Integrated Marine Observing System’s Animal Tracking Facility (IMOS ATF [1]) and the U.S. Integrated Ocean Observing System’s Animal Telemetry Network (IOOS ATN [2]). These programs aim to provide near real-time ocean measurements via the World Meteorological Organization’s Global Telecommunication System for assimilation in operational ocean and atmospheric forecast models. In all these cases, near real-time telemetry data provision requires rapid and therefore automated, reliable quality control processes, including for the error-prone Argos location data that are essential for understanding animal movements and distribution, and for providing geospatial context to ocean measurements.
Here we present a continuous-time state-space model for rapid filtering of any Argos location data.
This model is now used as part of the IMOS ATF’s quality control/quality assurance process for animal-borne ocean observations. To facilitate fast automation, we trade off realism - the ability to explain complex movement processes - for reliability by using a simple continuous-time random walk on velocity with a single variance parameter. We evaluate the model by: 1) comparing fits to all three Argos location types from the same individuals; 2) assessing accuracy of model-estimated locations against contemporaneous GPS locations; 3) assessing how a model assumption about Argos error ellipses influences estimation accuracy; and 4) comparing the accuracy of modelled and un-modelled Kalman smoother locations.
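As a rough illustration of the movement process named above (a random walk on velocity, observed at irregular times and governed by a single variance parameter), the following Python sketch simulates a track. It is not the foieGras implementation: the function name `simulate_track`, the zero starting velocity, the seed and the example values are assumptions made only for illustration, and the symbols D and Δ follow the Methods.

```python
import math
import random


def simulate_track(times, D, x0=(0.0, 0.0), seed=1):
    """Simulate locations from a random walk on velocity at irregular times.

    times : increasing observation times (e.g. hours)
    D     : diffusion coefficient controlling variability in velocity
    x0    : starting location (assumed planar coordinates, e.g. km)
    """
    rng = random.Random(seed)
    x, y = x0
    vx = vy = 0.0  # assumed starting velocity (illustrative choice)
    track = [(times[0], x, y)]
    for t_prev, t in zip(times, times[1:]):
        dt = t - t_prev
        # Velocity increment: zero-mean Gaussian with variance 2 * D * dt,
        # assumed equal and independent on the two axes.
        sd = math.sqrt(2.0 * D * dt)
        vx += rng.gauss(0.0, sd)
        vy += rng.gauss(0.0, sd)
        # Location update: previous location plus velocity times elapsed time.
        x += vx * dt
        y += vy * dt
        track.append((t, x, y))
    return track


# Irregular observation times, as is typical of Argos data.
track = simulate_track([0.0, 1.0, 1.5, 4.0, 4.2, 9.0], D=0.5)
```

Because each location is the running sum of correlated velocities, successive displacements in the simulated track tend to point in similar directions, which is the source of the movement correlation in this kind of model.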
Methods
A continuous-time state-space model for animal telemetry data
We model animal movement as a continuous-time random walk on velocity $v_t$ in two coordinate axes:
$$ v_t = v_{t-\Delta} + \Sigma_\Delta \qquad (1) $$
where $\Delta$ is the time increment and $\Sigma_\Delta$ is a zero-mean, bivariate Gaussian random variable with variance $2D\Delta$. The parameter $D$ is a 1-d diffusion coefficient accounting for variability in velocity, which increases with the time interval $\Delta$. Noting that locations $x$ are the summed velocities, given a starting location, the following equation describes a simple process model subject to variable time increments:
$$ x_i = x_{i-1} + v_i \Delta_i \qquad (2) $$
where the subscript $i$ indexes time $t_i$, $x_i$ is the true location of the animal at time $t_i$ and $v_i \Delta_i$ is the displacement (velocity × elapsed time) between $x_{i-1}$ and $x_i$. To simplify the model, we assume that the velocity random walk variances $2D\Delta_i$ are equal on the two axes but they could also be assumed to vary independently [2]. Correlation in movements arises from allowing the locations to be the sum of the velocities.
We couple this process model to a generally applicable measurement model that describes how the error-prone and possibly irregularly-timed observed locations $y_i$ map onto the corresponding true location states $x_i$:
$$ y_i = x_i + \epsilon_i; \quad \epsilon_i \sim \mathrm{N}(0, \Omega_i) \qquad (3) $$
where $y_i$ is the location observed at time $t_i$ corresponding to $x_i$, and $\Omega_i$ is the measurement error variance-covariance matrix that can be structured to suit different types of location data. Below, we focus on modifications to accommodate different Argos location types, but other location data (e.g., processed light-level geolocations) could also be considered in this framework.
Footnotes: [1] http://imos.org.au/facilities/animaltracking [2] https://ioos.noaa.gov/project/atn
Argos Least-Squares data
Locations measured using CLS’ older Least-Squares (LS) approach [8] are associated with location quality class designations: 3, 2, 1, 0, A, B, and Z. These classes are the only contemporaneous information about location quality and provide only a relative index of measurement uncertainty [1]. We use the class information, along with independent estimates of their associated standard errors from Argos transmitters deployed on seals held captive at a known location [9], to construct the following variance-covariance matrix:
$$ \Omega_i = \begin{bmatrix} \tau_x^2 K_{x,i}^2 & \rho \tau_x K_{x,i} \tau_y K_{y,i} \\ \rho \tau_x K_{x,i} \tau_y K_{y,i} & \tau_y^2 K_{y,i}^2 \end{bmatrix} \qquad (4) $$
where $\tau_x^2$ and $\tau_y^2$ are the overall measurement error variances on the two coordinate axes, and $K_{x,i}$ and $K_{y,i}$ are error weighting factors that scale the $\tau$’s appropriately for the Argos location quality class associated with the $i$-th observation. The $\tau$’s are estimated during model fitting and the error weighting factors are the standard error ratios between the best quality class, 3, and each other class (2, 1, 0, A, B, Z).
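The construction of the Least-Squares error matrix can be sketched in a few lines of Python. This is a minimal illustration, not the foieGras code: the weighting factors in `K_X` and `K_Y` are hypothetical placeholders rather than the estimates derived from [9], the function name `ls_covariance` is invented here, and the sketch assumes that τ multiplied by K acts as a standard deviation on each axis, with ρ as the between-axis correlation.

```python
# Hypothetical weighting factors: ratio of each class's standard error to that
# of class 3. These numbers are placeholders for illustration only.
K_X = {"3": 1.0, "2": 1.5, "1": 4.0, "0": 14.0, "A": 24.0, "B": 44.0, "Z": 44.0}
K_Y = {"3": 1.0, "2": 1.3, "1": 2.5, "0": 15.0, "A": 22.0, "B": 33.0, "Z": 33.0}


def ls_covariance(quality_class, tau_x, tau_y, rho):
    """Return the 2 x 2 error covariance matrix for one Least-Squares location.

    quality_class : Argos location class ("3", "2", "1", "0", "A", "B" or "Z")
    tau_x, tau_y  : overall measurement error scale on each axis
    rho           : correlation between errors on the two axes (assumed scalar)
    """
    sd_x = tau_x * K_X[quality_class]  # class-specific standard deviation, x
    sd_y = tau_y * K_Y[quality_class]  # class-specific standard deviation, y
    cov_xy = rho * sd_x * sd_y
    return [[sd_x ** 2, cov_xy],
            [cov_xy, sd_y ** 2]]


# A poor-quality class B fix receives a much larger covariance than a class 3 fix.
omega_best = ls_covariance("3", tau_x=1.0, tau_y=2.0, rho=0.5)
omega_poor = ls_covariance("B", tau_x=1.0, tau_y=2.0, rho=0.5)
```

Keeping the class-specific factors fixed and estimating only the overall scales is what lets a single pair of τ parameters cover all seven quality classes.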
Argos Kalman filter and Kalman smoother data
Locations measured using CLS’ Kalman filter (KF) or Kalman smoother (KS) algorithms have their estimated uncertainties provided to users as error ellipses [11]. Ellipses are defined by three variables: semi-major axis, semi-minor axis and semi-major axis orientation from north. Building on McClintock et al. [14], the error variance-covariance matrix is:
$$ \Omega_i = \begin{bmatrix} \tau^2_{x,i} & \tau_{xy,i} \\ \tau_{xy,i} & \tau^2_{y,i} \end{bmatrix} \qquad (5) $$
with the elements being derived from the Argos error ellipse components:
$$ \tau^2_{x,i} = \left(\frac{M_i}{\sqrt{2}}\right)^2 \sin^2 c_i + \left(\frac{m_i \psi}{\sqrt{2}}\right)^2 \cos^2 c_i \qquad (6) $$
$$ \tau^2_{y,i} = \left(\frac{M_i}{\sqrt{2}}\right)^2 \cos^2 c_i + \left(\frac{m_i \psi}{\sqrt{2}}\right)^2 \sin^2 c_i \qquad (7) $$
and
$$ \tau_{xy,i} = \left(\frac{M_i^2 - (m_i \psi)^2}{2}\right) \cos c_i \sin c_i \qquad (8) $$
where $M_i$ is the ellipse semi-major axis length of the $i$-th observation, $m_i$ is the semi-minor axis length and $c_i$ is the semi-major axis orientation [11, 14].
McClintock et al. [14] used a bivariate $t$-distribution, with variance-covariance defined by the Argos error ellipses, in their measurement model to account for occasional outlier observations (i.e., where error ellipses underestimate the true measurement uncertainty). Here we chose to identify and remove outlier locations using a travel rate filter [21] prior to fitting the state-space model, as per [2, 22]. Additionally, we included the parameter $\psi$ to account for possible consistent underestimation of the Kalman filter (& smoother)-derived location uncertainty (Figure 1). $\psi$ re-scales all ellipse semi-minor axes $m_i$, where estimated values > 1 inflate the semi-minor axes. We transform the $y_i$’s from geographic coordinates (lon, lat) onto a Cartesian plane prior to modelling, using the WGS84 World Mercator projection (EPSG 3395). To facilitate optimization, all planar coordinates and their uncertainty estimates, where available, are converted from m to km.
Estimation
We used the R package TMB (Template Model Builder, [23]) to fit the state-space model, using maximum likelihood to estimate model parameters and the Laplace approximation to rapidly estimate the random effects - the unobserved location and velocity states, x and v [5, 24]. Using this estimation approach, uncertainty in the x and v estimates is obtained using a generalised delta method (see [23] for details). All model and associated general data preparation code are available in the foieGras R package [25]. The latest version can be downloaded from the lead author’s GitHub site (https://github.com/ianjonsen/foieGras).
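The measurement covariance that enters the fitted model for Kalman filter and Kalman smoother data (Eqns. 5 - 8) can be written out directly. The Python sketch below is an illustration under stated assumptions, not the package's code: the function name `ellipse_to_covariance` is invented here, the orientation is assumed to be supplied in degrees clockwise from north, and axis lengths are assumed to be already in km.

```python
import math


def ellipse_to_covariance(semi_major, semi_minor, orientation_deg, psi=1.0):
    """Convert an Argos error ellipse into a 2 x 2 covariance matrix.

    semi_major, semi_minor : ellipse axis lengths (assumed km)
    orientation_deg        : semi-major axis orientation from north (assumed degrees)
    psi                    : re-scaling factor applied to the semi-minor axis;
                             psi = 1 leaves the ellipse unchanged
    """
    c = math.radians(orientation_deg)
    major_var = (semi_major / math.sqrt(2.0)) ** 2
    minor_var = (semi_minor * psi / math.sqrt(2.0)) ** 2
    var_x = major_var * math.sin(c) ** 2 + minor_var * math.cos(c) ** 2
    var_y = major_var * math.cos(c) ** 2 + minor_var * math.sin(c) ** 2
    # Equivalent to ((M^2 - (m * psi)^2) / 2) * cos(c) * sin(c)
    cov_xy = (major_var - minor_var) * math.cos(c) * math.sin(c)
    return [[var_x, cov_xy],
            [cov_xy, var_y]]


# A "squashed" ellipse whose long axis points east (90 degrees from north):
omega = ellipse_to_covariance(semi_major=5.0, semi_minor=0.5, orientation_deg=90.0)
# The same ellipse with the semi-minor axis inflated by an illustrative psi of 3:
omega_inflated = ellipse_to_covariance(5.0, 0.5, 90.0, psi=3.0)
```

With the long axis pointing east, nearly all of the variance falls on the x axis, and increasing psi raises only the variance along the short (here north - south) axis.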
Data and pre-processing
We model all three types of Argos satellite location data: LS, KF, and KS. The data comprise four pinniped, one seabird and two sea turtle species (Table 1), with deployment locations ranging between polar, temperate, and tropical marine regions (Figure S1). The number of individual data sets by species and data type ranges from 6 to 13, with all having locations measured by GPS and at least one Argos type (Table 1). All data collected after 2008 were reprocessed by CLS to obtain the three Argos data types (4 species; Table 1).
We used an automated pre-filtering step to identify outlier observations to be ignored by the state-space model. This pre-filtering used the argosfilter R package [21] to identify locations implying travel rates > [...] m s⁻¹ for all pinnipeds and sea turtles and travel rates > 17 m s⁻¹ for northern gannets. These speed thresholds represent conservative upper limits of travel for these species and are intended to identify only the extreme outlier observations. This resulted in < 30% of Least-Squares, < 15% of Kalman filter, and < 10% of Kalman smoother data being removed. The proportion of data removed by pre-filtering is considerably smaller than that associated with optimal speed thresholds for other species (e.g., [22]).
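The idea behind a travel-rate pre-filter can be sketched as follows. This is a deliberately simplified stand-in and not the algorithm implemented in the argosfilter package: it flags any fix that would require a speed above a threshold relative to the last retained fix. The function names, the spherical-Earth distance and the example coordinates are assumptions for illustration; the 17 m s⁻¹ threshold is the value quoted above for northern gannets.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_filter(fixes, vmax):
    """Flag fixes implying travel faster than vmax (m/s) from the last kept fix.

    fixes : list of (time_s, lon, lat) tuples sorted by time
    Returns a list of booleans, True where the fix is retained.
    """
    keep = []
    last = None
    for t, lon, lat in fixes:
        if last is None:
            keep.append(True)  # the first fix is assumed to be reliable
            last = (t, lon, lat)
            continue
        dt = t - last[0]
        dist = haversine_m(last[1], last[2], lon, lat)
        ok = dt > 0 and dist / dt <= vmax
        keep.append(ok)
        if ok:
            last = (t, lon, lat)
    return keep


fixes = [(0, 0.00, 0.0), (3600, 0.01, 0.0), (7200, 5.00, 0.0), (10800, 0.03, 0.0)]
flags = speed_filter(fixes, vmax=17.0)  # the third fix implies an implausible speed
```

A one-pass filter of this kind is sensitive to an erroneous first fix, which is one reason dedicated packages use more elaborate, iterative rules.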
Empirical validation
We examined the accuracy of model-predicted locations, assuming GPS data represent truth. Although GPS data have higher spatial accuracy and precision, and typically have higher sampling rates than Argos data, they are nonetheless discrete measurements of a continuous-time process. As a consequence, they are also likely to misrepresent animals’ true movement paths but to a far smaller extent (10s of m; [26]) than Argos data.
For all validations presented, we compared GPS locations to model-fitted locations (hereafter model-estimated locations), which are location states estimated at the times of the Argos-measured locations. By focusing on model-estimated locations and not predicted locations that occur at regular time intervals, we reduce the degree to which model accuracy is confounded with data sampling rates that are known to vary across species and Argos data types (see Discussion).
We compared model-estimated locations from fits to all three Argos data types, where available, with GPS data. In all cases, the times of GPS observations do not match the times of Argos observations or the corresponding model-estimated locations. To account for this mismatch, we initially considered three approaches for comparing between GPS and modelled locations. The first uses a linear interpolation of GPS locations to model-estimated location times [27]. The second uses the temporally closest GPS observation if any occurred within ± 10 min. The third uses the model to predict locations at the GPS observation times. In several cases, it was not feasible to predict model locations for each GPS observation time as the typically higher frequency of GPS observations resulted either in implausible artefacts in the model fits to the Argos data or in convergence failures of the optimiser used to fit the model. For these reasons, we chose not to consider this third approach further.
Fitting the state-space model with a fixed 2-h prediction interval resulted in optimiser convergence for all individual tracks. For each individual track, we summarized the deviations between model-estimated locations and either the linearly interpolated GPS locations or the temporally matched GPS locations by taking the root mean of the squared distances (RMSD in km) between all pairs of locations and comparing distributions of individual RMSD values among species. We report results of comparisons with the linearly interpolated GPS locations here and comparisons with the temporally matched GPS locations in Supplementary Information. We discuss implications of using each of these approaches.
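The first matching approach and the RMSD summary can be sketched as below. This is an illustrative Python outline rather than the analysis code used for the paper: the function names are invented, coordinates are assumed to be planar (km), and estimates falling outside the GPS time range are simply skipped, which is an assumption of this sketch.

```python
import math
from bisect import bisect_left


def interpolate_gps(gps, t):
    """Linearly interpolate a GPS track to time t.

    gps : list of (time, x, y) tuples sorted by time, planar coordinates in km
    Returns (x, y), or None when t lies outside the GPS record.
    """
    times = [g[0] for g in gps]
    if t < times[0] or t > times[-1]:
        return None
    j = bisect_left(times, t)
    if times[j] == t:
        return gps[j][1], gps[j][2]
    t0, x0, y0 = gps[j - 1]
    t1, x1, y1 = gps[j]
    w = (t - t0) / (t1 - t0)
    return x0 + w * (x1 - x0), y0 + w * (y1 - y0)


def rmsd(estimates, gps):
    """Root mean squared distance (km) between estimates and interpolated GPS."""
    squared = []
    for t, x, y in estimates:
        p = interpolate_gps(gps, t)
        if p is None:
            continue  # no GPS coverage at this time
        squared.append((x - p[0]) ** 2 + (y - p[1]) ** 2)
    if not squared:
        return float("nan")
    return math.sqrt(sum(squared) / len(squared))


gps_track = [(0.0, 0.0, 0.0), (2.0, 2.0, 0.0), (4.0, 2.0, 2.0)]
model_estimates = [(1.0, 1.0, 3.0), (3.0, 2.0, 1.0)]
track_rmsd = rmsd(model_estimates, gps_track)
```

The temporally matched alternative would replace `interpolate_gps` with a lookup of the nearest GPS fix within a fixed time window, at the cost of discarding estimates that have no GPS fix close enough in time.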
Potential under-representation of Argos KF/KS location uncertainty
Our default model accounts for a perceived under-estimation of the size of CLS’ Kalman filter and Kalman smoother error ellipses (Figs. 1 and S2 - S4) by including the parameter ψ (Eqns. 6 - 8). Although uncertainty is expected to be lower in the general North - South plane due to the polar orbits of the Argos satellites [11], the frequent compression of error ellipses in this plane (semi-minor axis; e.g., Fig. 1b) seems extreme. Values of ψ > 1 inflate the semi-minor axis. We assessed the effect of the ψ parameter on the accuracy of model-estimated locations by comparing RMSD values from models with and without the ψ parameter. To simplify the results, we pooled RMSD values across species and assessed the log_e difference in RMSD (denoted as log ∆RMSD), which approximates % difference on the linear scale [28].
Argos KS location accuracy
The CLS Kalman smoother locations have greater spatial accuracy and precision than Least-Squares or Kalman filter data [12]. In principle, it should not be possible to improve the accuracy of KS-based locations with subsequent modelling because they are theoretically optimal estimates, using all available data. It does seem reasonable, however, to question whether this is actually the case. We evaluated this by comparing log ∆RMSD derived from GPS and KS locations to those derived from GPS and estimates from the state-space model fit to the KS locations. In both cases, we apply the same pre-filtering to identify and remove outlier locations, though these outliers should not be present in KS-based locations.
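The log-difference comparison used in the two preceding subsections is simple to compute. In the Python sketch below, the function names and the sign convention (first argument divided by second, so negative values mean the first RMSD is smaller) are assumptions made for illustration.

```python
import math


def log_delta_rmsd(rmsd_a, rmsd_b):
    """Natural-log difference between two RMSD values: log(a) - log(b)."""
    return math.log(rmsd_a) - math.log(rmsd_b)


def exact_percent_difference(log_diff):
    """Exact percentage difference implied by a log difference."""
    return math.expm1(log_diff) * 100.0


# Hypothetical RMSD values (km) for one individual, with and without a model term.
ld = log_delta_rmsd(3.18, 3.00)
approx_pct = ld * 100.0  # the log difference read directly as a % difference
exact_pct = exact_percent_difference(ld)
```

For differences of a few percent the two readings nearly coincide, which is why a log difference can be reported as an approximate percentage; the approximation degrades as the ratio moves further from 1.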
Results
State-space model fits to the 3 Argos data types
We fit the state-space model to the four species with all three Argos data types (Table 1), and present fits with a 2-h prediction time interval. Model fits to hawksbill turtle and southern elephant seal data show a consistent increase in spatial resolution and decrease in estimation uncertainty of the predicted tracks across the three Argos data types (top to bottom; Fig. 2 a,e,i and b,f,j, respectively). This effect is due to an increase in the number of observations from Least-Squares to Kalman filter data, and to a shrinking of the error ellipses (measurement uncertainty), by nearly half, from Kalman filter to Kalman smoother data (Table 2). Model fits to leopard seal and northern gannet data do not show any clear differences in resolution or estimation uncertainty across the Argos data types (Fig. 2 c,g,k and d,h,l, respectively). This appears due to smaller differences in the number of observations for Least-Squares versus Kalman filter data, arising from lower proportions of class A and B locations, relative to hawksbill turtles and southern elephant seals (Table 2). The lower proportions of class A and B locations for leopard seals and northern gannets are likely due to the large amount of time they spend at or above the ocean surface. Additionally, northern gannets had, on average, far larger error ellipses than the other species (Table 2). The uncertainty of their state-space model-predicted locations was consequently larger, regardless of Argos data type (light blue 95% confidence ellipses in Fig. 2 d,h,l).
Validation with GPS data
Total sequential processing time for all 129 Argos data sets (Table 1) was 13.43 min, an average of 6.25 s per data set. This included both the pre-filter algorithm and state-space model estimation, running on a 2018 MacBook Pro 15” laptop with 2.9 GHz i9 processor, 32 GB RAM, with R 3.6.2.
Median distances between state-space model-estimated and interpolated GPS locations were within 8 km for all species and data types, with most species and data types having 95% of estimated locations within 12 km of GPS locations (Table 3). Northern gannets were an exception, with 95th percentiles extending > [...] (± se) improvement of state-space model-estimated location accuracy relative to un-modelled KS location accuracy was: LS = 0.21 ± [...] ± 10 min (Fig. S5).
Effect of ψ parameter
Inclusion of the ψ parameter resulted in lower RMSD values, on average, implying that Argos error ellipses under-represent the true location uncertainty in the general north - south direction (Fig. 4). This result was less pronounced with fits to Argos Kalman smoother locations, with 81% of individuals having a log ∆RMSD < [...] likely to benefit from re-scaled error ellipses, with most individuals having log ∆RMSD values close to or > [...] ψ re-scaling effect would be less pronounced; or, 3) a combination of the two.
Argos KS accuracy
Argos Kalman smoother locations were less accurate by an average of 0.34 km without subsequent state-space model filtering (Table 3; compare KS and pf KS values), although comparisons of log ∆RMSD were variable both within and among species (Fig. 5). The mean log ∆RMSD across species implied an average 6% increase in accuracy with subsequent state-space model filtering of Argos KS locations. However, results were equivocal for southern elephant seals, and hawksbill turtle tracks were typically more accurate without any subsequent state-space filtering (Fig. 5).
Discussion
We presented a continuous-time model for animal movement, fit in a state-space framework that allows flexible handling of Argos satellite telemetry data. The model was initially intended for automated quality control of large Argos animal tracking data sets, but is broadly applicable for any Argos location data. Using Argos-GPS double-tagged animals, we assessed the accuracy of model-estimated locations, comparing across three types of Argos data where possible. Median accuracy was within 4 km for most species and data types, with state-space model-estimated locations being slightly more accurate (by 0.1 - 0.3 km on average) than the best quality CLS Kalman smoother locations. Median root mean squared deviations were typically at or under 5 km for 6 of the 7 species studied. In most cases, RMSD values were lowest when fitting to Argos Kalman smoother data and highest when fitting to Argos Least-Squares or Kalman filter data, although the within-species differences in RMSD between data types were typically small. Although the model was evaluated over a limited number of individuals and species, it is apparent that the accuracy and spatio-temporal resolution of inferred locations is situational.
Highlighting this situational aspect are the northern gannet results (Table 3; Figs. 2 & 3), which are clearly distinct from the other species. Accuracy of model-estimated locations was approximately 4-5 times worse than for other species, although absolute magnitude is subject to the approach used for matching model-estimated and GPS locations (compare Figs. 3 & S5). Unlike other species where median distances between model-estimated and GPS locations either declined consistently or were similar when comparing LS to KF and KF to KS data types, gannets had the lowest median distances for fits to LS data and had far broader distributions of distance across the 3 data types.
We suspect this pattern may arise from the considerably faster mean travel rates of northern gannets (12 km h⁻¹, with cruising speeds up to 45 km h⁻¹) compared to the other species (approximately 0.7 - 3 km h⁻¹). Similarly, Lopez et al. [12] reported lower overall coverage probabilities of error ellipses estimated by their Kalman filter and Kalman smoother algorithms for two avian species analyzed in comparison to other platforms (terrestrial and marine mammals, sea turtles, ships and drifters). Combined, this implies that Argos error ellipses may be more strongly underestimated for species/platforms that travel faster and/or at higher altitude.
McClintock et al. [14] used a bivariate t-distribution, parameterised by the Argos error ellipse information, to model location measurement error. Their estimates of the t degrees of freedom parameter implied that the Argos error ellipses do not fully explain location measurement error. To avoid computational challenges associated with t-distribution parameter estimation, we used a two-step approach for dealing with location measurement error in Argos Kalman filter and Kalman smoother data. First, we identified and removed potentially large outliers using a travel-rate filter [21] prior to fitting the state-space model, as per [2, 22]. Although underestimation of location error was acknowledged by Lopez et al. [11, 12] and has been reported by others [14, 29], it is unclear why occasional, apparently hugely underestimated error ellipses are present in the Kalman filter and Kalman smoother data. Second, we accounted for potential Argos error ellipse underestimation by including the ψ parameter to inflate the semi-minor axis. We adopted this approach given the observation that Argos error ellipses often have semi-minor axes vastly smaller than corresponding semi-major axes, resulting in “squashed” error ellipses (Figs. S2 - S4).
We found that in most cases the ψ parameter contributed to more accurate location estimates, implying that the error ellipses commonly underestimate the true uncertainty in Argos-measured locations. This result is evident but less pronounced when fitting to Kalman smoother versus Kalman filter data. Location estimates were more accurate for at least some individuals of all species; however, hawksbill turtles and northern gannets appeared least likely to benefit from the ψ re-scaling effect (see Fig. 4). Both of these species had somewhat more circular error ellipses, in comparison to the leopard and southern elephant seals, and thus any possible contribution of ψ would be reduced. Ultimately, we are unsure why Argos error ellipses appear to be so commonly biased low in the semi-minor axis direction (generally north - south).
Where possible, both Kalman filter and Kalman smoother data types were included in this study. We found, in most cases, that the model-estimated locations were most accurate when using the Kalman smoother data, but on average by less than 200 m compared with fits to Kalman filter data. Although the Kalman smoother data should represent optimal estimates of location because information along the entire movement track is used to update and smooth each location estimate, we show that fitting the state-space model to these estimates can further improve location accuracy in some cases (by an average reduction in error of approximately 6%). The Kalman smoother data are not provided in the default, near real-time service from CLS; rather, they are only available with post-processing by CLS at an additional cost. There are two points to be made about this. First, the smoothing algorithm is a standard approach that can be implemented rapidly, with computing requirements no greater than the Kalman filter. It could be applied in near real-time.
Second, a near real-time Kalman smoother would result in the best available location estimates changing as new data became available. This incremental improvement, due to information gain propagating backwards in time, would reduce as locations become less recent. This should be of little consequence to most wildlife users who typically do not use their data in near real-time, and users who do require near real-time data may see greater benefit in more accurate locations even if they are subject to change in retrospect.
Our state-space model produced location estimates with a median accuracy comparable to or greater than CLS’ Kalman smoother locations, regardless of input Argos data type. This implies that users can obtain similar or better accuracy than CLS’ Kalman smoother locations by applying the state-space model to their Least-Squares or Kalman filter data. Therefore, the method we describe is a viable alternative to CLS’ fee-based reprocessing service. The Laplace approximation approach employed in Template Model Builder models states (velocity and location) as unknown random effects, providing a most likely estimate of the current state from the posterior of its location given all available data, both forward and backward in time. This is precisely what a Kalman smoother does. That our model can improve on the CLS Kalman smoother’s location estimates may imply that uncertainty is somehow not well-propagated from the raw Doppler shift data available to CLS through to the location estimates available to users. If this is indeed the case, it is unclear why this is so. The issue may be due to necessary trade-offs between accuracy and precision versus providing a near real-time location service for a multitude of moving platforms, of which wildlife are a small component.
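The distinction between filtering (using current and past data) and smoothing (using data forward and backward in time) can be illustrated with a textbook one-dimensional example. The Python sketch below is a generic local-level Kalman filter with a Rauch-Tung-Striebel backward pass; it is not CLS’ multiple-model algorithm, nor the Laplace-approximation fit used in our model, and the function names, noise variances and data are assumptions for illustration only.

```python
def kalman_filter(ys, q, r, m0=0.0, p0=1e6):
    """Forward pass for x_t = x_{t-1} + w_t, y_t = x_t + e_t.

    q, r : process and measurement noise variances
    Returns filtered means/variances and one-step predicted means/variances.
    """
    means, variances, pred_means, pred_vars = [], [], [], []
    m, p = m0, p0
    for y in ys:
        m_pred, p_pred = m, p + q      # predict from the previous state
        gain = p_pred / (p_pred + r)   # Kalman gain
        m = m_pred + gain * (y - m_pred)
        p = (1.0 - gain) * p_pred
        pred_means.append(m_pred)
        pred_vars.append(p_pred)
        means.append(m)
        variances.append(p)
    return means, variances, pred_means, pred_vars


def rts_smoother(means, variances, pred_means, pred_vars):
    """Backward pass: revise each estimate using all later observations."""
    sm, sv = list(means), list(variances)
    for t in range(len(means) - 2, -1, -1):
        g = variances[t] / pred_vars[t + 1]
        sm[t] = means[t] + g * (sm[t + 1] - pred_means[t + 1])
        sv[t] = variances[t] + g * g * (sv[t + 1] - pred_vars[t + 1])
    return sm, sv


observations = [0.0, 1.2, 0.7, 2.1, 1.6, 2.4]
f_mean, f_var, p_mean, p_var = kalman_filter(observations, q=0.5, r=1.0)
s_mean, s_var = rts_smoother(f_mean, f_var, p_mean, p_var)
```

In this setting the smoothed variance at each time is never larger than the filtered variance, and the two coincide at the final time step, which mirrors the point that the benefit of smoothing shrinks towards the most recent locations. The backward pass is a single additional loop over the track, consistent with its modest computing requirements.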
Spatio-temporal resolution and spatial accuracy
It is important to note that when comparing GPS locations with those from models fitted to Argos-measured locations, accuracy is interlinked with the temporal resolution (sampling rate) of Argos relative to GPS locations. As GPS resolution is typically greater than Argos, comparisons to determine spatial accuracy of estimated locations are confounded by this difference. No model fit to Argos-measured locations alone can resolve all the nuances of a movement path that are present in higher resolution GPS data. This discrepancy will be reflected in measures of spatial accuracy, unless GPS data are suitably sub-sampled or interpolated.
We interpolated GPS locations to the times of the Argos-measured locations to which the state-space model was fitted. Our reasoning was that interpolation of the generally higher resolution GPS data should be less corrupted by spatial error than a similar interpolation of the lower resolution and irregularly occurring model-estimated locations. Sub-sampling GPS locations by matching them with the temporally closest model-estimated location, commonly used elsewhere [12, 30, 31], resulted in lower RMSD or greater (apparent) accuracy than comparison with the linearly interpolated GPS locations. These lower RMSD values, however, were based on fewer (n < 10) temporally matched pairs of model-estimated and GPS locations for some species/individuals when using a 20-min window (Fig. S5). Although sample sizes could be increased by choosing a wider time window, the potential for biased comparisons would increase differently across species due to their different spatio-temporal scales of movement.
Fits to the three Argos location types from the same individuals showed that movement pathways can be predicted with increasing spatial resolution (i.e., resolving greater spatial detail despite the same prediction time interval of 2 h) and precision as the number of Argos-measured locations increased (transition from Least-Squares to Kalman filter data) and as their uncertainty decreased (transition from Kalman filter to Kalman smoother data). One of the main advantages of Argos’ Kalman filter over the older Least-Squares method is a gain in the number of location estimates, mostly by resolving locations from the single transmissions between tag and satellite that Least-Squares cannot [11]. This increase in resolution and precision is case-dependent, however, as species with lower overall proportions of class A and B locations do not gain as many new locations when transitioning from Least-Squares to Kalman filter data. This case-dependency is likely tied to typical surface time intervals of diving species and, for those species spending the majority of time in air, to the magnitude of their travel rates.
On a different issue of scale, many ecological analyses of animal tracking data consider remotely sensed or other environmental data at spatial resolutions (2–10 km; e.g., [32]) approaching the state-space model accuracy limits found here. This highlights the need for researchers to consider the appropriate resolution of their environmental data given their specific questions and the limitations of their location estimates. Fitting a state-space model to Argos tracking data is not a panacea. Researchers should consider carrying location uncertainty estimates provided by state-space models through to subsequent ecological analyses, for example by repeatedly sampling from the location uncertainty, conducting the analysis, and pooling results (sensu [33]). This can be done either completely through the whole analysis or partially via subsequent sensitivity analysis.
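The sample, analyse, and pool procedure suggested above can be sketched as follows. This is a minimal illustration under strong assumptions, not a prescribed workflow: location errors are treated as independent Gaussian with known standard errors in projected x and y (km), whereas state-space model estimates are autocorrelated in practice; the "analysis" is simply total path length; and pooling is a plain mean and percentile interval across draws. All function names and numbers are hypothetical.

```python
import numpy as np

def path_length(xy):
    """Stand-in ecological analysis: total along-track distance (km)."""
    return float(np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1)))

def propagate_uncertainty(xy_hat, xy_se, analysis, n_draws=500, seed=1):
    """Repeat an analysis over draws from the location uncertainty.

    Assumes independent Gaussian errors per coordinate, which is a
    simplification made only for this sketch.
    """
    rng = np.random.default_rng(seed)
    stats = np.empty(n_draws)
    for i in range(n_draws):
        draw = rng.normal(loc=xy_hat, scale=xy_se)  # one plausible track
        stats[i] = analysis(draw)                   # conduct the analysis
    # Pool: summarise the statistic across all draws.
    return stats.mean(), np.percentile(stats, [2.5, 97.5])

# Hypothetical estimated locations and standard errors (km).
xy_hat = np.array([[0.0, 0.0], [5.0, 1.0], [10.0, 0.0], [15.0, 2.0]])
xy_se = np.full_like(xy_hat, 2.0)

pooled_mean, pooled_interval = propagate_uncertainty(xy_hat, xy_se, path_length)
plug_in = path_length(xy_hat)  # same analysis ignoring uncertainty
```

In this toy case the pooled path length exceeds the plug-in value because location error inflates apparent movement. This is one reason a result that ignores location uncertainty may differ from one that carries it through.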
Conclusions
The state-space model developed and validated here can be used to obtain quality-controlled animal locations from Argos Least-Squares or Kalman filter data in near real-time, with median accuracy comparable to or marginally better than CLS’ reprocessed Kalman smoother data. Our model also accounts for apparent north-south bias in Kalman filter- and Kalman smoother-derived error ellipses.
The model’s near real-time capability provides the best estimates of location, given the available data, that can be continually updated as new data arrive via the Argos system. This rapid, continual quality control of animal tracking data is necessary as near real-time monitoring and forecasting of ocean states increasingly incorporates oceanographic data from animal-borne sensors, and as the need for dynamic ocean management grows in our increasingly exploited and rapidly changing oceans.
Although the model was developed for fast, automated quality control processes, its simplicity and ease of use also make it suitable for manual use by researchers wishing to conduct quality control of historical or otherwise less immediate Argos data.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
Conceived and designed the study: IDJ, CRM, TAP. Developed methodology: IDJ, TAP. Performed the analyses: IDJ. Contributed data: DPC, PDD, WJG, BJG, CG, XH, SK, PWR, SCV, SW, MJW, MAH, RGH, CRM. Wrote the paper: IDJ. Edited the paper: All.
Acknowledgements
We thank M Weise and B Woodward for motivating the validation study, H Lourie for assistance with CLS reprocessing, and M Holland and K Wilson for facilitating data access. IDJ was supported by Macquarie University’s co-Funded Fellowship Program and by external partners: Office of Naval Research grant N00014-18-1-2405; the Integrated Marine Observing System - Animal Tracking Facility; the Ocean Tracking Network; Taronga Conservation Society; Birds Canada; and Innovasea/Vemco. TAP was supported by the CSIRO Oceans & Atmosphere internal research funding scheme. CG thanks the Institut Polaire Français Paul Emile Victor (IPEV programs 109, H. Weimerskirch and 1201, C. Gilbert) and Terres Australes et Antarctiques Françaises (TAAF) for logistical and field support. WJG and SCV thank Greg & Lisa Morgan for field assistance and were funded by NERC New Investigators Grant (NE/G001014/1), the Peninsula Research Institute for Marine Renewable Energy and EU INTERREG Project CHARM III. DPC and PWR thank the National Oceanographic Partnership Program, the Office of Naval Research, the Moore, Packard, and Sloan Foundations, and the California Sea Grant Program. SSK was supported by a National Science Foundation Office of Polar Projects research grant. XH and SW were supported by the Australian Government under the Caring for Country Initiative, the Anindilyakwa Land Council, the Northern Territory Government, Charles Darwin University, and the ANZ Trustees Foundation – Holsworth Wildlife Research Endowment.
Author details
Dept of Biological Sciences, Macquarie University, Sydney, Australia. CSIRO Oceans and Atmosphere, Hobart, Australia. Department of Ecology and Evolutionary Biology, University of California Santa Cruz, Santa Cruz, USA. Environment and Sustainability Institute, University of Exeter, Penryn, UK. Sea Mammal Research Unit, Scottish Oceans Institute, University of St Andrews, St Andrews, UK. CNRS, Chize, France. Department of Biodiversity, Conservation and Attractions, Government of Western Australia, Kensington, Australia. Institute for Marine and Antarctic Studies, University of Tasmania, Hobart, Australia. Sydney Institute of Marine Science, Mosman, Australia.
References
1. Jonsen I, Flemming J, Myers R. Robust state–space modeling of animal movement data. Ecology. 2005;86(11):2874–2880.
2. Johnson DS, London JM, Lea M, Durban JW. Continuous-time correlated random walk model for animal telemetry data. Ecology. 2008;89(5):1208–1215.
3. Patterson TA, Thomas L, Wilcox C, Ovaskainen O, Matthiopoulos J. State-space models of individual animal movement. Trends in Ecology and Evolution. 2008;23:87–94.
4. Albertsen CM, Whoriskey K, Yurkowski D, Nielsen A, Mills Flemming J. Fast fitting of non-Gaussian state-space models to animal movement data via Template Model Builder. Ecology. 2015;96:2598–2604.
5. Auger-Méthé M, Albertsen CM, Jonsen ID, Derocher AE, Lidgard DC, Studholme KR, et al. Spatiotemporal modelling of marine movement data using Template Model Builder (TMB). Marine Ecology Progress Series. 2017;565:237–249.
6. McClintock BT, King R, Thomas L, Matthiopoulos J, McConnell BJ, Morales JM. A general discrete-time modeling framework for animal movement using multistate random walks. Ecological Monographs. 2012;82:335–349.
7. McClintock BT, Johnson DS, Hooten MB, Ver Hoef JM, Morales JM. When to be discrete: the importance of time formulation in understanding animal movement. Movement Ecology. 2014;2:21.
8. Service Argos. Argos User’s Manual. CLS; 2016. Available from: .
9. Vincent C, McConnell BJ, Fedak MA, Ridoux V. Assessment of ARGOS location accuracy from satellite tags deployed on captive grey seals. Marine Mammal Science. 2002;18:301–322.
10. Lowther AD, Lydersen C, Fedak MA, Lovell P, Kovacs KM. The Argos-CLS Kalman filter: error structures and state-space modelling relative to Fastloc GPS data. PLOS One. 2015;10:4.
11. Lopez R, Malardé J, Royer F, Gaspar P. Improving Argos Doppler location using multiple-model Kalman filtering. IEEE Transactions on Geoscience and Remote Sensing. 2014;52:4744–4755.
12. Lopez R, Malardé J, Danès P, Gaspar P. Improving Argos Doppler location using multiple-model smoothing. Animal Biotelemetry. 2015;3:32.
13. Rauch HE, Tung F, Striebel CT. Maximum likelihood estimates of linear dynamic systems. AIAA Journal. 1965;3:1445–1450.
14. McClintock BT, London JM, Cameron MF, Boveng PL. Modelling animal movement using the Argos satellite telemetry location error ellipse. Methods in Ecology and Evolution. 2015;6:266–277.
15. Maxwell SM, Hazen EL, Lewison RL, Dunn DC, Bailey H, Bograd SJ, et al. Dynamic ocean management: Defining and conceptualizing real-time management of the ocean. Marine Policy. 2015;58:42–50.
16. Hazen EL, Scales KL, Maxwell SM, Briscoe DK, Welch H, Bograd SJ, et al. A dynamic ocean management tool to reduce bycatch and support sustainable fisheries. Science Advances. 2018;4:eaar3001.
17. Pirotta V, Grech A, Jonsen ID, Laurance WF, Harcourt RG. Consequences of global shipping traffic for marine giants. Frontiers in Ecology and the Environment. 2019;17:39–47.
18. Dunn DC, Maxwell SM, Boustany AM, Halpin PN. Dynamic ocean management increases the efficiency and efficacy of fisheries management. Proceedings of the National Academy of Sciences. 2016;113:668–673.
19. Treasure AM, Roquet F, Ansorge IJ, Bester MN, Boehme L, Bornemann H, et al. Marine mammals exploring the oceans pole to pole: A review of the MEOP consortium. Oceanography. 2017;30(2):132–138.
20. Harcourt R, Sequeira AMM, Zhang X, Roquet F, Komatsu K, Heupel M, et al. Animal-Borne Telemetry: An Integral Component of the Ocean Observing Toolkit. Frontiers in Marine Science. 2019;6:326. Available from: .
21. Freitas C, Lydersen C, Fedak MA, Kovacs KM. A simple new algorithm to filter marine mammal Argos location. Marine Mammal Science. 2008;24:315–325.
22. Patterson TA, McConnell BJ, Fedak MA, Bravington MV, Hindell MA. Using GPS data to evaluate the accuracy of state-space methods for correction of Argos satellite telemetry error. Ecology. 2010;91:273.
23. Kristensen K, Nielsen A, Berg CW, Skaug H, Bell BM. TMB: Automatic Differentiation and Laplace Approximation. Journal of Statistical Software. 2016;70:1–21.
24. Jonsen I, McMahon CR, Patterson TA, Auger-Méthé M, Harcourt R, Hindell MA, et al. Movement responses to environment: fast inference of variation among southern elephant seals with a mixed effects model. Ecology. 2019;100:e02566.
25. Jonsen I, Patterson TA. foieGras: Fit continuous-time state-space and latent variable models for filtering Argos satellite (and other) telemetry data and estimating movement behaviour; 2019. R package version 0.4.0. Available from:
Figure 1 Unfiltered Argos Kalman filter (KF) locations (gold points) and error ellipses (pale blue with black borders) for (a) hawksbill turtle and (b) southern elephant seal.
Locations are connected by dashed blue lines. Scale bars at lower right provide an indication of the magnitude of errors. The extreme outlier location at top right (a) is approximately 650 km away from the preceding and following locations, and has a vastly underestimated error ellipse.
Figure 2 State-space model fits to the three Argos satellite data types obtained from a hawksbill turtle (a,e,i), a leopard seal (b,f,j), a southern elephant seal (c,g,k), and a northern gannet (d,h,l).
State-space model-predicted locations (dark blue circles) smooth through the Argos data types (gold circles) - Least-Squares (a,b,c,d), Kalman filter (e,f,g,h) and Kalman smoother (i,j,k,l) - at regularly specified 2-h intervals. The % confidence intervals on the predicted locations (light blue) are also displayed. Spatial scale in km is indicated at lower left or right of each panel.
Figure 3 Root mean squared distance (RMSD in km) between state-space model-estimated locations and corresponding interpolated GPS locations by Argos data type and species.
Three species - California sea lion (CASL), Cape fur seal (CPFS), and leatherback turtle (LBTU) - only had Argos Least-Squares (LS) data available; the remainder - hawksbill turtle (HBTU), leopard seal (LESE), southern elephant seal (SESE), and northern gannet (NOGA) - had data that were reprocessed by CLS Argos to yield all 3 Argos data types: LS, Kalman filter (KF) and Kalman smoother (KS). Individual RMSD values (filled circles), interquartile range (beige box) and medians (black bar) are displayed.
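The RMSD comparison summarised in Figure 3 can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: it assumes projected coordinates in km, times in hours, and linear interpolation of GPS positions to the times of the model estimates. The nearest-time matching within a window, used in the alternative validation, is included for contrast. Function names and data are hypothetical.

```python
import numpy as np

def interpolate_gps(t_est, t_gps, xy_gps):
    """Linearly interpolate GPS x and y to the model-estimate times."""
    x = np.interp(t_est, t_gps, xy_gps[:, 0])
    y = np.interp(t_est, t_gps, xy_gps[:, 1])
    return np.column_stack([x, y])

def rmsd(xy_est, xy_ref):
    """Root mean squared distance between paired locations (km)."""
    d = np.linalg.norm(xy_est - xy_ref, axis=1)
    return float(np.sqrt(np.mean(d**2)))

def nearest_within(t_est, t_gps, window):
    """Index of the temporally closest GPS fix, and whether it lies
    within +/- window of each estimate time."""
    idx = np.abs(t_gps[None, :] - t_est[:, None]).argmin(axis=1)
    keep = np.abs(t_gps[idx] - t_est) <= window
    return idx, keep

# Hypothetical hourly GPS track moving east at 2 km/h.
t_gps = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
xy_gps = np.column_stack([2.0 * t_gps, np.zeros(5)])

# Hypothetical model-estimated locations at irregular times.
t_est = np.array([0.5, 1.5, 3.25])
xy_est = np.array([[1.0, 3.0], [3.0, -4.0], [6.5, 0.0]])

rmsd_interp = rmsd(xy_est, interpolate_gps(t_est, t_gps, xy_gps))

# Nearest-time matching with a 20-min window keeps fewer pairs.
idx, keep = nearest_within(t_est, t_gps, window=20.0 / 60.0)
n_matched = int(keep.sum())
rmsd_matched = rmsd(xy_est[keep], xy_gps[idx[keep]])
```

Here every estimate contributes to the interpolated comparison, whereas only one of the three estimates has a GPS fix within the window. The matched RMSD is therefore computed from a much smaller sample, as happened for some individuals.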
Figure 4 Accuracy of estimated locations from models fit with and without a ψ parameter to re-scale the Argos error ellipse semi-minor axis length.
The log difference in root mean squared distance (log ΔRMSD) between state-space model-estimated locations and interpolated GPS locations was calculated for the two models fit to each individual animal. Negative values indicate the model with a ψ parameter is more accurate, whereas positive values indicate the model without a ψ parameter is more accurate. Results are pooled across species with individual log differences in RMSD displayed as species-specific shapes (gold, blue). The interquartile range (beige box) and medians (black bar) are displayed.
Figure 5 Accuracy of Argos Kalman smoother locations with and without subsequent state-space model filtering.
The log difference in root mean squared distance (log ΔRMSD) between locations and interpolated GPS locations was calculated for each individual animal. In both cases, the Kalman smoother locations were subjected to the same travel-rate filtering to remove highly implausible observations. Negative values indicate the state-space model-estimated locations are more accurate. The interquartile range (beige box) and medians (black bar) are displayed.
Table 1 Number of individual datasets by species and data type. Argos data types are: Least-Squares (LS); Kalman filter (KF); Kalman smoother (KS). Mean track durations were calculated from the data after removing periods of prolonged data gaps. Tag programming details were not available for all deployments, so Argos and GPS sampling rates were calculated from the unfiltered data.
Columns: Species; Common name; Code; Deployment year(s); Mean track duration (d); Data type (LS, KF, KS, GPS); GPS sample rate (min); Fastloc GPS
Species                  Common name             Code  Fastloc GPS
Zalophus californianus   California sea lion     CASL  Y
Arctocephalus pusillus   Cape fur seal           CPFS  Y
Dermochelys coriacea     leatherback turtle      LBTU  Y
Eretmochelys imbricata   hawksbill turtle        HBTU  Y
Hydrurga leptonyx        leopard seal            LESE  Y
Mirounga leonina         southern elephant seal  SESE  Y
Morus bassanus           northern gannet         NOGA  N
Table 2 Argos track summary statistics by species and data type. prop’n A,B is the proportion of all locations that are in quality class A and B. Error ellipse shape is the ratio of semi-minor to semi-major axis length, with values closer to 1 indicating a more circular shape. Shape and area statistics were calculated from values pooled among individuals within species and Argos data type.
Species  Argos  total n    prop’n  Error ellipse  Error ellipse area (km²)
code     type   locations  A,B     shape          median    95th %-ile
HBTU     LS     603        0.71    .              .         .
HBTU     KF     1 038      0.84    0.24           17.31     120.29
HBTU     KS     1 038      0.84    0.27           8.44      60.31
LESE     LS     32 780     0.47    .              .         .
LESE     KF     41 367     0.57    0.13           1.73      69.23
LESE     KS     41 367     0.57    0.14           1.03      31.50
SESE     LS     3 152      0.82    .              .         .
SESE     KF     5 016      0.91    0.13           34.40     780.11
SESE     KS     5 016      0.91    0.14           16.26     338.18
NOGA     LS     1 568      0.36    .              .         .
NOGA     KF     2 066      0.52    0.18           25.43     5 573.62
NOGA     KS     2 066      0.52    0.19           16.73     2 104.87
Table 3 Accuracy in km of state-space model-estimated locations and pre-filtered KS locations (pf KS), by species and Argos data type. The pf KS locations had the pre-filter algorithm applied but not the state-space model. Median and 95th percentile statistics were calculated from distances to GPS locations, pooled among individuals within species and Argos data type.
         LS                  KF                  KS                  pf KS
Species  median  95th %-ile  median  95th %-ile  median  95th %-ile  median  95th %-ile
CASL     1.70    10.15       .       .           .       .           .       .
CPFS     1.87    8.03        .       .           .       .           .       .
LBTU     2.77    10.06       .       .           .       .           .       .
HBTU     1.60    8.11        1.69    7.08        1.47    6.05        1.62    5.91
LESE     1.88    11.11       2.05    11.61       1.89    10.90       2.25    12.62
SESE     4.40    15.03       3.24    12.06       2.97    9.50        3.29    11.02
NOGA     6.04    43.95       7.50    52.62       7.37    46.47       7.85    53.67
Additional Files
Additional file 1 — Global map of Argos least-squares tracking data by species.
Figure S1 Argos tracking data by species.
Locations displayed are unfiltered Argos Least-Squares data.
Additional file 2 — Argos Kalman filter and Kalman smoother error ellipses along a hawksbill turtle track.
Figure S2 Argos error ellipses for a representative hawksbill turtle track.
The maps display (a) Argos Kalman filter- and (b) Argos Kalman smoother-measured locations (beige) and error ellipses (light blue with black borders). Locations are connected by dashed blue lines.
The majority of visible ellipses are highly compressed on their semi-minor axes for both Argos data types. The dashed blue track line helps highlight the outlier location (upper right) that clearly has a vastly underestimated error ellipse. Application of the Kalman smoother (b) did not reduce this outlier or improve its uncertainty estimate.
Additional file 3 — Argos Kalman filter and Kalman smoother error ellipses along a leopard seal track.
Figure S3 Argos error ellipses for a representative leopard seal track.
The maps display (a) Argos Kalman filter- and (b) Argos Kalman smoother-measured locations (beige) and error ellipses (light blue with black borders). Locations are connected by dashed blue lines.
The majority of ellipses are highly compressed on their semi-minor axes for both Argos data types. The dashed blue track line helps highlight the outlier location (lower right) that has an error ellipse with an underestimated semi-minor axis. Application of the Kalman smoother (b) did not reduce this outlier or improve its uncertainty estimate.
Additional file 4 — Argos Kalman filter and Kalman smoother error ellipses along a northern gannet track.
Figure S4 Argos error ellipses for a representative northern gannet track.
The maps display (a) Argos Kalman filter- and (b) Argos Kalman smoother-measured locations (beige) and error ellipses (light blue with black borders). Locations are connected by dashed blue lines.
Additional file 5 — Alternate validation using temporally closest GPS location with ± min of state-space model-estimated locations.
Figure S5 Root mean squared distance (RMSD in km) between state-space model-estimated locations and their corresponding temporally closest GPS locations within ± min, by Argos data type and species. Size of RMSD points (coloured) is proportional to √n distances between temporally matched pairs of model-estimated and GPS locations. Black points are nn