Abstract

Estimating the altitude meters a cyclist has climbed from noisy GPS data is a challenging problem. This article proposes a method that assumes that a person locally takes the shortest path. This results in an algorithm that does not need smoothing parameters. Moreover, this assumption reveals a similarity between entropy and likelihood, which leads to the introduction of an entropic force.


GPS Fit Method for Paths of Non Drunken Sailors and its Connection to Entropy

Fetze Pijlman [email protected] Research, High Tech Campus 7, Eindhoven, The Netherlands

August 20, 2019


arXiv: [physics.data-an]

Using GPS data points for fitting paths of cyclists remains at present a challenging problem. Apps such as Strava [1] use GPS coordinates to estimate how many meters a cyclist has climbed, but it is not uncommon for this estimate to be off by more than 50% (see also Ref. [2]). The main reason for this inaccuracy is the uncertainty in the coordinate measurement (where the uncertainty along the different axes x, y, z may be different). Although there are several fit methods available, such as spline fitting, it is the aim of this article to present a new method that has an interesting relation with entropy.

Figure 1: Illustration of different path fits. In the left illustration the paths run directly through the measured points, while in the right illustration the paths run at some distance from the points.

We can visualize the problem as presented in Fig. 1. Somehow one needs to find the "optimal" path from a set of measurement points. As each point has an uncertainty, it is too conservative to draw the path directly through the points. A more balanced path can be achieved by taking this uncertainty into account, but the question is how to find this balanced path and how to justify it. The assumption used here is that a person locally takes the shortest path, hence the "non drunken sailor" in the title of this paper: "drunken sailor" describes a person performing a random walk (see also Ref. [3]), whereas here the person locally takes the shortest path. Please note that this assumption is sometimes not met, e.g., when strolling around with a family through a shopping center.
So the applicability of this paper is limited to the scope of this assumption. The other assumptions, which normally do apply, are that the uncertainty of the measurement points can be described through some probability function and that the number of points is large.

It turns out that the assumptions mentioned above can be translated into a thermodynamic model in which the fitting process can be described by a statistical force that has its origin in entropy. Entropic forces are of interest in other fields, see Ref. [4] and Ref. [5]. An explicit derivation of how entropic forces can arise in fitting methods is insightful when studying other fields as well.

We will assume that the uncertainty between the measured position $c_i$ and the true position $r_i$ can be described by some probability function $P(r_i)$. The expected likelihood can be computed using the definition of expectation,

$$\langle P \rangle = \int P(r_i)\, P(r_i)\, \mathrm{d}r_i, \tag{1}$$

where $r_i$ is the coordinate, which may have several dimensions. In the following we will assume that the expectation value of the likelihood $\langle P \rangle$ is the same for all data points (although one can actually allow for point-dependent expectation values). The likelihood of the data can be computed by multiplying the likelihoods of all data points. In the limit $n \to \infty$ this likelihood equals the expected likelihood,

$$\lim_{n \to \infty} \prod_{i=1}^{n} P(r_i) = \langle P \rangle^{n}. \tag{2}$$

The equation above will serve as a constraint in the fitting process.

Fit method

The minimization of the path length can be achieved by imagining a string that runs through some points that we will call control points, where the control points are connected to the measurement points via springs. When the string is completely loose the springs are completely contracted, so that the string runs through the measurement points. However, when one starts to pull the string (effectively making it shorter), the springs start to exert a force that becomes larger the further the control point is from the measurement point. The optimal path is achieved at the moment when the constraint in Eq. 2 is met.

We observe that the optimal path has the highest likelihood for that particular string length, otherwise an even shorter path would have been feasible. This justifies the postulation of an entropic force that drives the path to the highest likelihood at fixed path length. This postulate has similarities with entropy, which for an isolated system can only stay equal or increase.

Let's study what happens when we pull the string away from a certain point $j$. From now on we will assume that one has carried out a linear coordinate transformation $r_j \to r_j + c_j$ for each point such that all $c_j$'s are dropped. When moving the string over an infinitesimal distance $\Delta r_j$ one changes the length of the overall string by

$$\Delta L = \frac{\mathrm{d}L}{\mathrm{d}r_j}\, \Delta r_j. \tag{3}$$

Throughout this article the coordinate systems are chosen such that for a positive $\Delta r_j$ the length $L$ is shortened, so that $\Delta L < 0$. By moving the string over an infinitesimal distance one also changes the likelihood, which is most conveniently computed by considering the log of the likelihood (without loss of generality), giving

$$\Delta \log\left( \prod_{i=1}^{n} P(r_i) \right) = \frac{\mathrm{d}\log P(r_j)}{\mathrm{d}r_j}\, \Delta r_j. \tag{4}$$

Now that we know how the overall log likelihood changes when we pull the string from one point, we can use the entropic force postulate to compute the new situation. Please note that the postulate of highest likelihood also means highest log likelihood. At fixed length $L$ one can compensate the change $\Delta L$ from one point by a change $-\Delta L$ from another point. By the postulate, such changes will occur if they increase the overall likelihood. In effect, all such changes will occur until the maximum likelihood is reached, which is the case when the change in log likelihood per $\Delta L$ is the same for all points. In other words, in the optimum state the forces are in balance:

$$\forall j: \quad \frac{\Delta \log\left( \prod_{i=1}^{n} P(r_i) \right)}{\Delta L} = F_S(L) = \frac{\mathrm{d}\log P(r_j) / \mathrm{d}r_j}{\mathrm{d}L / \mathrm{d}r_j}, \tag{5}$$

where $F_S(L)$ is some constant. When one rewrites the expression above one obtains

$$F_S(L)\, \frac{\mathrm{d}L}{\mathrm{d}r_j} = \frac{\mathrm{d}\log P(r_j)}{\mathrm{d}r_j}. \tag{6}$$

When we look at the equation above we recognize the left-hand side as the resulting force from the string tension $F_S(L)$, which is balanced by the entropic force, which is apparently the derivative of the log probability distribution. The algorithm for finding the optimal path now becomes rather simple: 1) start with a loose string, 2) create a small pull in the string by pulling it away from the first point, 3) let the system readjust itself via the entropic forces, 4) check whether the constraint has been met; if yes, the optimal path has been found, if not, continue with step 2.

The probability distribution of the measurement error may take several different forms. In this section we will consider the Lorentz (or Cauchy) distribution.
The distribution reads

$$P_i(\vec{x}_i) = \frac{1}{\pi\gamma} \left( \frac{\gamma^2}{|\vec{x}_i - \vec{c}_i|^2 + \gamma^2} \right). \tag{7}$$

For the expected likelihood (see Eq. 1) one finds

$$\langle P \rangle = \frac{1}{\pi}, \tag{8}$$

which for $n$ points becomes

$$\left\langle \prod_i \frac{1}{\pi\gamma} \left( \frac{\gamma^2}{|\vec{x}_i - \vec{c}_i|^2 + \gamma^2} \right) \right\rangle = \left( \frac{1}{\pi} \right)^{n}. \tag{9}$$

For $n$ going to infinity one simply gets

$$\lim_{n \to \infty} \prod_i \frac{1}{\pi\gamma} \left( \frac{\gamma^2}{|\vec{x}_i - \vec{c}_i|^2 + \gamma^2} \right) = \lim_{n \to \infty} \left( \frac{1}{\pi} \right)^{n}. \tag{10}$$

Please note that the relation above is expected to be met for large $n$ if the assumption of having a Lorentz distribution applies.

Let's study what happens when we pull the string away from a certain point such that the overall length is shortened. In Fig. 2 one can read off the length reduction, giving

$$\Delta L = \frac{\mathrm{d}L}{\mathrm{d}r}\, \Delta r = -2\, \Delta r \cos\alpha. \tag{11}$$

Figure 2: Shortening the path by pulling path away from point 4.

Figure 3: Shortening the path by pulling path away from point 4.

A reduction of the overall length is accompanied by a reduction in the log likelihood, which is

$$\Delta \log\left( \prod_{i=1}^{n} P(\vec{x}_i) \right) = \frac{\mathrm{d}\log P(\vec{x}_j)}{\mathrm{d}x_j}\, \Delta x_j = -\frac{2r}{\gamma^2 + r^2}\, \Delta r. \tag{12}$$

As discussed in the previous section leading to Eq. 6, at a fixed overall length the maximum likelihood path is found where all the forces are in balance. This means that for all points the ratio of the change in log likelihood to the change in $L$ is constant,

$$\forall j: \quad 2 \cos\alpha_j\, F_S(L) = \frac{2 r_j}{\gamma^2 + r_j^2}, \tag{13}$$

where $F_S(L)$ is some constant and $r_j$ is the distance of control point $j$ from its measurement point.

When comparing Eq. 13 with Fig. 3 the analogy with mechanical forces becomes quite strong. On the left-hand side of Eq. 13 we see the net force $F_R$ as a result of the string tension, which is balanced by the postulated entropic force, which for a Lorentz distribution takes the form of the right-hand side of Eq. 13. The concept of string tension originates from mechanics, see e.g. Ref. [6], and appears here in a different context.
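As a minimal illustration of the force balance for the Lorentz distribution, the Python sketch below solves $2\cos\alpha\, F_S = 2r/(\gamma^2 + r^2)$ for the distance $r$ between a control point and its measurement point, given a string tension and an angle. The function names and the choice of the smaller root (the solution closest to the measurement point) are assumptions made for this example and are not part of the original method. Under these assumptions the entropic force $2r/(\gamma^2 + r^2)$ has a maximum of $1/\gamma$ at $r = \gamma$, so no balance exists for a stronger pull.

```python
import math


def lorentz_entropic_force(r, gamma):
    """Magnitude of d log P / dr for the Lorentz distribution: 2 r / (gamma^2 + r^2)."""
    return 2.0 * r / (gamma**2 + r**2)


def equilibrium_distance(tension, alpha, gamma):
    """Solve 2 cos(alpha) * tension = 2 r / (gamma^2 + r^2) for the smaller root r.

    Returns None when the net string force exceeds the maximum entropic
    force 1/gamma, in which case no balance exists.
    """
    k = tension * math.cos(alpha)  # half of the net string force
    if k <= 0.0:
        # Assumption of this sketch: no net pull leaves the point in place.
        return 0.0
    # The balance is the quadratic k r^2 - r + k gamma^2 = 0.
    discriminant = 1.0 - 4.0 * k**2 * gamma**2
    if discriminant < 0.0:
        return None
    return (1.0 - math.sqrt(discriminant)) / (2.0 * k)
```

For example, with $\gamma = 1$, $F_S = 0.4$ and $\alpha = 0$ the balance is reached at $r = 0.5$, where the entropic force equals the net string force of 0.8.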
In this section we study the problem that was discussed in the introduction: how to compute the number of altitude meters when assuming that a person chose the path with the least amount of height differences. As an example we will consider height measurements $c_i$ for which we will assume that the error is normally distributed. The probability distribution in this case reads

$$P_i(z_i) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(z_i - c_i)^2}{2\sigma^2}}, \tag{14}$$

where $\sigma$ is the standard deviation. The expected likelihood can be computed using Eq. 1, giving

$$\langle P \rangle = \frac{1}{2\sigma\sqrt{\pi}}. \tag{15}$$

So for $n$ points Eq. 2 now becomes

$$\lim_{n \to \infty} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(z_i - c_i)^2}{2\sigma^2}} = \lim_{n \to \infty} \left( \frac{1}{2\sigma\sqrt{\pi}} \right)^{n}. \tag{16}$$

Similar to the previous section we study what happens when one pulls a control point away from a measurement point (please see the illustration in Fig. 4), but in this case there turns out to be a small complication. Any given path runs through control points that are close to the measurement points. It turns out that for a given path one can divide the control points into the following two classes. For the first class, which we will call class A, a small change in the position of the control point does not lead to a change in the overall vertical length. For the second class, which we will call class B, a small change in the control point does lead to a change in the overall vertical length.

Figure 4: Two examples of pulling the path away from point 4. In the left figure point 4 has its direct neighbours on both sides, in which case a pull on point 4 does not lead to a shortening of the overall vertical path (class A). In the right figure, point 4 has its direct neighbours (point 3 and point 5) on the same side. Consequently, a pull on point 4 leads to a shortening of the overall vertical path (class B).

Considering now only the points of class B one finds that Eq. 3 becomes ($\Delta L$ is the change in altitude meters, which is the sum of changes up and down)

$$\Delta L = \frac{\mathrm{d}L}{\mathrm{d}r_j}\, \Delta r_j = -2\, \Delta r_j. \tag{17}$$

The change in the log likelihood (see also Eq. 4) becomes for class B simply

$$\Delta \log\left( \prod_{i=1}^{n} P(r_i) \right) = \frac{\mathrm{d}\log P(r_j)}{\mathrm{d}r_j}\, \Delta r_j = -\frac{(z_j - c_j)}{\sigma^2}\, \Delta r_j, \tag{18}$$

after which the force equation (Eq. 6) becomes

$$\forall i: \quad 2 F_S(L) = \frac{(z_i - c_i)}{\sigma^2}. \tag{19}$$

The resulting force from the maximum likelihood postulate is similar to a linear spring. Points that are further away from the measurement point experience a larger force than points that are closer to their measurement point. In the equilibrium state, all these points should experience the same force.

The algorithm for computing the minimal vertical path can now be summarized as follows:

1. initiate the path by putting all control points at the measurement points
2. shorten the overall path length by pulling the control points of class B by a small amount away from the measurement points
3. compute for each control point of class B the force (which is linear in the distance between the path and the measurement point)
4. if all forces of the previous step are equal, then continue with step 6
5. move the control point that experiences the highest force towards the measurement point by a fixed amount, move the control point that experiences the lowest force away from the measurement point by the same amount, and continue with step 3
6. compute the overall likelihood of the n points. If it is equal to or lower than the expected likelihood then one has obtained the optimal path, else go to step 2

Please note that the number n of control points of class B will fluctuate during the execution of the algorithm.

There are various methods available for estimating, on the basis of noisy GPS data, how many meters a cyclist has climbed during a ride, but apparently it is not an easy task to carry out.
In this article a new method is proposed for persons that locally take the shortest path. This assumption allowed us to construct a theoretical entropic force which enables us to find the actual/expected path a person has taken.

For the two-dimensional example the shape of the path is defined by the control points, which can be interpreted as being kept close to the measurement points by springs. By increasing the string tension (shortening of the path), the springs are elongated until one meets the expectation value. The obtained path meets the expectation value, which should be close to reality as the number of measurement points for GPS tracks is often high (GPS points are often measured every second). The fact that the obtained path has strong bends (which seem unnatural) is a shortcoming of the model, as the model does not include an upper limit on acceleration. This aspect can be included in future studies.

The one-dimensional example, which is relevant to the motivation of this article, turned out to be slightly more complicated as the points need to be separated into classes. However, the same methodology could also be used here, leading to a nice short algorithm that can hopefully contribute to a better estimation of the number of altitude meters climbed.

Besides the concrete problem of finding a path given GPS data points, the introduction of an entropic force that finds its origin in statistics is of interest. Just as entropy can only increase for an isolated system, one can postulate that at fixed path length the likelihood will only increase (or stay equal).
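The building blocks of the one-dimensional algorithm can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a full implementation: all function names are hypothetical, class B is taken to be the interior points that are strict local extrema, the likelihood is evaluated over all points, and the force-balancing loop (steps 3 to 5 of the algorithm) is left out.

```python
import math


def vertical_length(z):
    """Sum of all height changes up and down along the path (the length L)."""
    return sum(abs(b - a) for a, b in zip(z, z[1:]))


def total_ascent(z):
    """Sum of the upward height changes only, i.e. the altitude meters climbed."""
    return sum(max(b - a, 0.0) for a, b in zip(z, z[1:]))


def class_b_indices(z):
    """Interior points with both neighbours on the same side (assumed class B)."""
    return [i for i in range(1, len(z) - 1)
            if (z[i] - z[i - 1]) * (z[i] - z[i + 1]) > 0]


def pull_class_b(z, delta):
    """Move every class B point by delta towards its neighbours."""
    pulled = list(z)
    for i in class_b_indices(z):
        # A local maximum moves down, a local minimum moves up.
        direction = -1.0 if z[i] > z[i - 1] else 1.0
        pulled[i] += direction * delta
    return pulled


def log_likelihood(z, c, sigma):
    """Log of the product of Gaussian likelihoods of heights z given measurements c."""
    norm = -math.log(math.sqrt(2.0 * math.pi) * sigma)
    return sum(norm - (zi - ci) ** 2 / (2.0 * sigma**2) for zi, ci in zip(z, c))


def expected_log_likelihood(n, sigma):
    """Log of the expected likelihood (1 / (2 sigma sqrt(pi)))^n for n points."""
    return -n * math.log(2.0 * sigma * math.sqrt(math.pi))


def constraint_met(z, c, sigma):
    """True once the likelihood has dropped to the expected likelihood or below."""
    return log_likelihood(z, c, sigma) <= expected_log_likelihood(len(z), sigma)
```

For the measured heights `[0, 2, 1, 3]` both interior points are class B, and pulling each of them by 0.1 shortens the vertical length from 5 to 4.6, that is, by twice the pull per class B point. With the path still at the measurement points the constraint is not yet met, because the likelihood at zero displacement exceeds the expected likelihood.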

References

[1]

[2]

[3] W. Feller, An Introduction to Probability Theory and its Applications, in 2 vols. Wiley, 1966

[4] 't Hooft, Introduction to the theory of black holes, Lectures presented at Utrecht University, 2009.

[5] Erik P. Verlinde, On the Origin of Gravity and the Laws of Newton, arXiv:1001.0785 [hep-th]

[6] Fowler and Cassiday,