When Lipschitz Walks Your Dog: Algorithm Engineering of the Discrete Fréchet Distance under Translation
Karl Bringmann
Saarland University and Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, [email protected]
Marvin Künnemann
Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, [email protected]
André Nusser
Saarbrücken Graduate School of Computer Science and Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, [email protected]
Abstract
Consider the natural question of how to measure the similarity of curves in the plane by a quantity that is invariant under translations of the curves. Such a measure is justified whenever we aim to quantify the similarity of the curves' shapes rather than their positioning in the plane, e.g., to compare the similarity of handwritten characters. Perhaps the most natural such notion is the (discrete) Fréchet distance under translation. Unfortunately, the algorithmic literature on this problem yields a very pessimistic view: On polygonal curves with $n$ vertices, the fastest algorithm runs in time $O(n^{4.667})$ and cannot be improved below $n^{4-o(1)}$ unless the Strong Exponential Time Hypothesis fails. Can we still obtain an implementation that is efficient on realistic datasets? Spurred by the surprising performance of recent implementations for the Fréchet distance, we perform algorithm engineering for the Fréchet distance under translation. Our solution combines fast, but inexact tools from continuous optimization (specifically, branch-and-bound algorithms for global Lipschitz optimization) with exact, but expensive algorithms from computational geometry (specifically, problem-specific algorithms based on an arrangement construction). We combine these two ingredients to obtain an exact decision algorithm for the Fréchet distance under translation. For the related task of computing the distance value up to a desired precision, we engineer and compare different methods. On a benchmark set involving handwritten characters and route trajectories, our implementation answers a typical query for either task in the range of a few milliseconds up to a second on standard desktop hardware.

We believe that our implementation will enable, for the first time, the use of the Fréchet distance under translation in applications, whereas previous algorithmic approaches would have been computationally infeasible. Furthermore, we hope that our combination of continuous optimization and computational geometry will inspire similar approaches for further algorithmic questions.

Theory of computation → Computational geometry
Keywords and phrases
Fréchet Distance, Computational Geometry, Continuous Optimization, Algorithm Engineering
Supplementary Material https://gitlab.com/anusser/frechet_distance_under_translation
Funding
Karl Bringmann: This work is part of the project TIPEA that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 850979).
Acknowledgements
We thank Andreas Karrenbauer for helpful discussions.
Consider the following natural task: Given two handwritings of (the same or different) characters, represented as polygonal curves $\pi, \sigma$ in the plane, determine how similar they are. To measure the similarity of two such curves, several distance notions could be used, where the most popular measure in computational geometry is given by the Fréchet distance $d_F(\pi, \sigma)$: Intuitively, we imagine a dog walking on $\pi$ and its owner walking on $\sigma$, and define $d_F(\pi, \sigma)$ as the shortest leash length required to connect the dog to its owner while both walk along their curves (only forward, but at arbitrarily and independently variable speeds). In this paper, we focus on the discrete version, in which dog and owner do not continuously walk along the curves, but jump from vertex to vertex. As a fundamental curve similarity notion that takes into account the sequence of the points of the curves (rather than simply the set of points, as in the simpler notion of the Hausdorff distance), the discrete Fréchet distance and variants have received considerable attention from the computational geometry community, see, e.g., [4, 19, 12, 17, 3, 8, 11, 15]. While the fastest known algorithms take time $n^{2 \pm o(1)}$ on polygonal curves with at most $n$ vertices [4, 19, 3, 11], which is best possible under the Strong Exponential Time Hypothesis [8], a recent line of research [6, 13, 18, 10] gives fast implementations for practical input curves.

In the setting of handwritten characters, one would expect our notion of similarity to be invariant under translations of the curves; after all, translating one character in the plane while fixing the position of the other should not affect their similarity. In this sense, the original Fréchet distance seems inadequate, as it does not satisfy translation invariance. However, we may canonically define a translation-invariant adaptation as the minimum Fréchet distance between $\pi$ and any translation of $\sigma$, yielding the Fréchet distance under translation (we give a precise definition in Section 2). Note that beyond computing the similarity of handwritten characters, this measure is generally applicable whenever our intuitive notion of similarity is not affected by translations, such as recognition of movement patterns. (One may argue that the similarity of movement patterns also depends on the speed/velocity of the motion. In principle, we can also incorporate such information into any Fréchet-distance-based measure by introducing an additional dimension.) In some settings, we would expect our notion to additionally be scaling- or rotation-invariant; however, this is beyond the scope of this paper, as already the Fréchet distance under translation presents previously unresolved challenges.

Can we compute the Fréchet distance under translation quickly? The existing theoretical work yields a rather pessimistic outlook: For the discrete Fréchet distance under translation in the plane, the currently fastest algorithm runs in time $O(n^{4.667})$, and any algorithm requires time $n^{4-o(1)}$ under the Strong Exponential Time Hypothesis [9]. These high polynomial bounds appear prohibitive in practice, and have likely impeded algorithmic uses of this similarity measure. (For the continuous analogue, the situation appears even worse, as the fastest algorithm has a significantly higher worst-case bound of $O(n^8 \log n)$; we thus solely consider the discrete version in this work.) Given the surprising performance of recent Fréchet distance implementations on realistic curves [35, 10], can we still hope for faster algorithms on realistic inputs also for its translation-invariant version?

Our problem.
Towards making the Fréchet distance under translation usable in practical applications, we engineer a fast implementation and analyze it empirically on realistic input sets. Perhaps surprisingly, our fastest solution for the problem combines inexact continuous optimization techniques with an exact, but expensive problem-specific approach from computational geometry to obtain an exact decision algorithm. We discuss our approach in Section 3 and present the details of our decision algorithm in Section 4. We develop our approach also for the related, but different task of computing the distance value up to a given precision in Section 5, and evaluate our solutions for both settings in comparison to baseline approaches in Section 6.
Further related work.
Variations of the distance measure studied in this paper arise by choosing (1) the discrete or continuous Fréchet distance, (2) the dimension $d$ of the ambient Euclidean space, and (3) a class of transformations, e.g., translations, rotations, scaling, or arbitrary linear transformations. A detailed treatment of algorithms for this class of distance measures can be found in [34]. The earliest example of a problem in this class is the continuous Fréchet distance under translations in dimension $d = 2$, which was introduced by Alt et al. [5] together with an $O(n^8 \log n)$-time algorithm.

In this paper we focus on the discrete Fréchet distance under translation in the plane. This problem was first studied by Mosig and Clausen [31], who gave an $O(n^4)$ algorithm for approximating the discrete Fréchet distance under rigid motions. Subsequently, Jiang et al. [29] presented an $O(n^6 \log n)$-time algorithm for the exact Fréchet distance under translation. Their running time was improved by Ben Avraham et al. to $O(n^5 \log n)$ [7], and then by Bringmann et al. to $O(n^{4.667})$ [9]. A conditional lower bound of $n^{4-o(1)}$ can be found in [9].

Algorithm engineering efforts for the Fréchet distance were initiated by the SIGSPATIAL GIS Cup 2017 [35], where the task was to implement a nearest neighbor data structure for curves under the Fréchet distance; see [6, 13, 18] for the top three submissions. The currently fastest implementation of the Fréchet distance is due to Bringmann et al. [10]. Further recent directions of Fréchet-related algorithm engineering include k-means clustering of trajectories [14] and locality sensitive hashing of trajectories [16].

Throughout the paper, we consider the Euclidean plane and denote the Euclidean norm by $\|\cdot\|$. A polygonal curve $\pi$ is a sequence $\pi = (\pi_1, \dots, \pi_n)$ of vertices $\pi_i \in \mathbb{R}^2$. For any $\tau \in \mathbb{R}^2$, we write $\pi + \tau$ for the translated curve $(\pi_1 + \tau, \dots, \pi_n + \tau)$.

For any curves $\pi = (\pi_1, \dots, \pi_n)$, $\sigma = (\sigma_1, \dots, \sigma_m)$, we define their discrete Fréchet distance as follows. A traversal is a sequence $T = ((p_1, s_1), \dots, (p_t, s_t))$ of pairs $(p_i, s_i) \in [n] \times [m]$ such that $(p_1, s_1) = (1, 1)$, $(p_t, s_t) = (n, m)$ and $(p_{i+1}, s_{i+1}) \in \{(p_i + 1, s_i), (p_i, s_i + 1), (p_i + 1, s_i + 1)\}$ for all $1 \le i < t$. The width of a traversal is $\max_{i = 1, \dots, |T|} \|\pi_{p_i} - \sigma_{s_i}\|$. The discrete Fréchet distance is then defined as the smallest width over all traversals, i.e.,

$d_F(\pi, \sigma) := \min_{\text{traversal } T} \max_{i = 1, \dots, |T|} \|\pi_{p_i} - \sigma_{s_i}\|$.

As we only consider the discrete Fréchet distance in this paper, we drop "discrete" in the remainder. To avoid confusion, we also refer to it as the fixed-translation Fréchet distance.

As the canonically translation-invariant variant of the discrete Fréchet distance, we define the discrete Fréchet distance under translation as $d_{\text{trans-}F}(\pi, \sigma) := \min_{\tau \in \mathbb{R}^2} d_F(\pi, \sigma + \tau)$. We typically view the problem as a two-dimensional optimization problem with objective function $f(\tau) := d_F(\pi, \sigma + \tau)$. Specifically, we consider the task to decide "$\min_{\tau \in \mathbb{R}^2} f(\tau) \le \delta$?" (exact decider) or to return a value in the range $[(1 - \epsilon) \min_{\tau \in \mathbb{R}^2} f(\tau), (1 + \epsilon) \min_{\tau \in \mathbb{R}^2} f(\tau)]$ (approximate value computation, multiplicative version). In fact, for implementation reasons (see Section 5 for the details), our implementation returns a value in $[\min_{\tau \in \mathbb{R}^2} f(\tau) - \epsilon, \min_{\tau \in \mathbb{R}^2} f(\tau) + \epsilon]$ (approximate value computation, additive version) using a straightforward adaptation of our approach.

Apart from a black-box Fréchet oracle answering decision queries "$d_F(\pi, \sigma + \tau) \le \delta$?", our algorithms only exploit the following simple properties:

Observation 1 (Lipschitz property). The objective function $f$ is 1-Lipschitz, i.e., $|f(\tau) - f(\tau + \tau')| \le \|\tau'\|$ for all $\tau, \tau' \in \mathbb{R}^2$.

Proof. Note that for any $\pi_i, \sigma_j, \tau, \tau' \in \mathbb{R}^2$, we have $\big| \|\pi_i - (\sigma_j + \tau + \tau')\| - \|\pi_i - (\sigma_j + \tau)\| \big| \le \|\tau'\|$ by the triangle inequality. Thus, the widths of any traversal $T$ for $\pi, \sigma + \tau$ and for $\pi, \sigma + \tau + \tau'$ differ by at most $\|\tau'\|$, which immediately yields the observation. $\square$

We obtain a simple 2-approximation of the Fréchet distance under translation as follows.
Observation 2. Let $\tau_{\text{start}} := \pi_1 - \sigma_1$ be the translation of $\sigma$ that aligns the first points of $\pi$ and $\sigma$. Then $d_F(\pi, \sigma + \tau_{\text{start}}) \le 2 \cdot d_{\text{trans-}F}(\pi, \sigma)$. Analogously, for $\tau_{\text{end}} := \pi_n - \sigma_m$, we have $d_F(\pi, \sigma + \tau_{\text{end}}) \le 2 \cdot d_{\text{trans-}F}(\pi, \sigma)$.

Proof. Let $\delta^* := d_{\text{trans-}F}(\pi, \sigma)$ and let $\tau^*$ be such that $d_F(\pi, \sigma + \tau^*) = \delta^*$, which implies in particular that $\|\pi_1 - (\sigma_1 + \tau^*)\| \le \delta^*$. Thus, $\|\tau_{\text{start}} - \tau^*\| = \|\pi_1 - (\sigma_1 + \tau^*)\| \le \delta^*$. By Observation 1, we obtain $d_F(\pi, \sigma + \tau_{\text{start}}) \le d_F(\pi, \sigma + \tau^*) + \delta^* = 2\delta^*$. The argument for $\tau_{\text{end}}$ is symmetric. $\square$

Note that the above observation gives a formal guarantee for a simple heuristic: translate the curves such that the start points match, and compute the corresponding fixed-translation Fréchet distance. Unfortunately, this worst-case guarantee is tight, and a correspondingly large discrepancy is also observed on our data sets. (To see tightness, take any segment in the plane and let $\pi$ traverse it in one direction, and $\sigma$ in the other. Then the heuristic would return as estimate two times the segment length (the distance of the translated end points), while the optimal translation aligns the segments and achieves the segment length as Fréchet distance.)

To obtain a fast exact decider, we approach the problem from two different angles: First, we review previous problem-specific approaches to the Fréchet distance under translation, all relying on the construction of an arrangement of circles as an essential tool from computational geometry. Second, we cast the problem into the framework of global Lipschitz optimization with its rich literature on fast, numerical solutions. In isolation, both approaches are inadequate to obtain a fast, exact decider (as the arrangement can be prohibitively large even for realistic data sets, and black-box Lipschitz optimization methods cannot return an exact optimum). We then describe how to combine both approaches to obtain a fast implementation of an exact decider for the discrete Fréchet distance under translation in the plane. We evaluate our approach, including comparisons to (typically computationally infeasible) baseline approaches, on a data set that we craft from sets of handwritten characters and (synthetic) GPS trajectories used in the ACM SIGSPATIAL GIS Cup 2017 [2, 1]. We believe that our approach will inspire similar combinations of fast, inexact methods from continuous optimization with expensive, but exact approaches from computational geometry also in other contexts.

Figure 1: Example curves $\pi, \sigma$ (left) together with their arrangement $\mathcal{A}_\delta$ (right), $\delta = d_{\text{trans-}F}(\pi, \sigma)$.

Previous algorithms for the Fréchet distance under translation in the plane work as follows. Given two polygonal curves $\pi, \sigma$ and a decision distance $\delta$, consider the set of circles

$\mathcal{C} := \{ C_\delta(\pi_i - \sigma_j) \mid \pi_i \in \pi, \sigma_j \in \sigma \}$,

where $C_r(p)$ denotes the circle of radius $r \in \mathbb{R}$ around $p \in \mathbb{R}^2$. Define the arrangement $\mathcal{A}_\delta$ as the partition of $\mathbb{R}^2$ induced by $\mathcal{C}$. The decision of $d_F(\pi, \sigma + \tau) \le \delta$ is then uniform among all $\tau \in \mathbb{R}^2$ in the same face of $\mathcal{A}_\delta$ (for a detailed explanation, we refer to [7, Section 3] or [9]). Thus, it suffices to check, for each face $f$ of $\mathcal{A}_\delta$, an arbitrarily chosen translation $\tau_f \in f$. Specifically, the Fréchet distance under translation is bounded by $\delta$ if and only if there is some face $f$ of $\mathcal{A}_\delta$ such that $d_F(\pi, \sigma + \tau_f) \le \delta$. Since the arrangement $\mathcal{A}_\delta$ has size $O(n^4)$ and can be constructed in time $O(n^4)$ [29], using the standard $O(n^2)$-time algorithm for the fixed-translation Fréchet distance [19, 4] to decide $d_F(\pi, \sigma + \tau_f) \le \delta$ for each face $f$, we immediately arrive at an $O(n^6)$-time algorithm.

Subsequent improvements [7, 9] speed up the decision of $d_F(\pi, \sigma + \tau_f) \le \delta$ for all faces $f$ by choosing an appropriate ordering of the translations $\tau_f$ and designing data structures that avoid recomputing some information for "similar" translations, leading to an $O(n^{4.667})$-time algorithm. Still, these works rely on computing the arrangement $\mathcal{A}_\delta$ of worst-case size $\Theta(n^4)$, and a conditional lower bound indeed rules out $O(n^{4-\epsilon})$-time algorithms [9].

Drawback: The arrangement size bottleneck.
Despite the worst-case arrangement size of $\Theta(n^4)$ and the conditional lower bound in [9], which indeed constructs such large arrangements, one might hope that realistic instances often have much smaller arrangements. If so, a combination with a practical implementation of the fixed-translation Fréchet distance could already give an algorithm with reasonable running time. Unfortunately, this is not the case: our experiments in this paper exhibit typical arrangement sizes between 10 to 10 for curves of length n ≈

Figure 2: Example curves $\pi, \sigma$ (left) together with a plot of the resulting non-convex objective function $f(\tau) = d_F(\pi, \sigma + \tau)$. For a closer look at the area close to the optimal translation (and highly non-convex small-scale artefacts), we refer to Figure 3.

A second view on the Fréchet distance under translation results from a simple observation: For any polygonal curves $\pi, \sigma$ and any translation $\tau \in \mathbb{R}^2$, we have $|d_F(\pi, \sigma + \tau) - d_F(\pi, \sigma)| \le \|\tau\|$, see Section 2. As a consequence, the Fréchet distance under translation is the minimum of a function $f(\tau) := d_F(\pi, \sigma + \tau)$ that is 1-Lipschitz (i.e., $|f(x) - f(x + y)| \le \|y\|$ for all $x, y$). This suggests studying the problem also from the viewpoint of the generic algorithms developed for optimizing Lipschitz functions by the continuous optimization community.

Following the terminology of [25], in an unconstrained bivariate global Lipschitz optimization problem, we are given an objective function $f : \mathbb{R}^2 \to \mathbb{R}$ that is 1-Lipschitz, and the aim is to minimize $f(x)$ over $x \in B := [a_1, b_1] \times [a_2, b_2]$; we can access $f$ only by evaluating it on (as few as possible) points $x \in B$. Note that in this abstract setting, we cannot optimize $f$ exactly, so we are additionally given an error parameter $\epsilon > 0$ and seek a point $x \in B$ such that $f(x) \le \min_{z \in B} f(z) + \epsilon$.

Global Lipschitz optimization techniques have been studied from an algorithmic perspective for at least half a century [32]. This suggests exploring the use of the fast algorithms developed in this context to obtain at least an approximate decider for the discrete Fréchet distance under translation. Indeed, our problem fits into the above framework, if we take the following considerations into account:

(1) Finite Box Domain: While we seek to minimize $f(\tau) = d_F(\pi, \sigma + \tau)$ over $\tau \in \mathbb{R}^2$, the above formulation assumes a finite box domain $B$. To reconcile this difference, observe that any translation $\tau$ achieving a Fréchet distance of at most $\delta$ must translate the first (last) point of $\sigma$ to within distance at most $\delta$ of the first (last) point of $\pi$. Thus, any feasible translation $\tau$ must be contained in the intersection of the two corresponding disks, and we can use any bounding box of this intersection as our box domain $B$.
(2) (Approximate) Decision Problem: While we seek to decide "$\min_\tau f(\tau) \le \delta$", the above formulation solves the corresponding minimization problem. Note that approximate minimization can be used to approximately solve the decision problem, but exactly solving the decision problem is impossible in the above framework.
(3) Oracle Access to $f(\tau)$: Evaluation of $f(\tau)$ corresponds to computing the Fréchet distance of $\pi$ and $\sigma + \tau$, for which we can use previous fast implementations [6, 13, 18, 10]. (Actually, these algorithms were designed to answer decision queries of the form "$f(\tau) \le \delta$?"; we discuss this aspect at the end of this section.)

Figure 3: Highly non-convex artefacts of the objective function at a local scale, resulting particularly from the notion of traversals in the discrete Fréchet distance.

In Figure 2, we illustrate our view of the Fréchet distance under translation as a Lipschitz optimization problem. As the figure suggests, on many realistic instances, the problem appears well-behaved (almost convex) at a global scale; using the Lipschitz property, one should be able to quickly narrow down the search space to small regions. (For an illustration that highly non-convex behavior may still occur at a local level, we refer to Figure 3.) Particularly for this task, it is very natural to consider branch-and-bound approaches, as pioneered by Galperin [20, 21, 22, 23] and formalized by Horst and Tuy [26, 27, 28], since these have been applied very successfully for low-dimensional global Lipschitz optimization (and non-convex optimization in general).

On a high level, in this approach we maintain a global upper bound $\tilde{\delta}$ and a list of search boxes $B_1, \dots, B_b$ with lower bounds $\ell_1, \dots, \ell_b$ (i.e., $\min_{\tau \in B_i} f(\tau) \ge \ell_i$) obtained via the Lipschitz condition. We iteratively pick some search box $B_i$ and first try to improve the global upper bound $\tilde{\delta}$ or the local lower bound $\ell_i$ using a small number of queries $f(\tau)$ with $\tau \in B_i$ (and exploiting the Lipschitz property). If the local lower bound exceeds the global upper bound, i.e., $\ell_i > \tilde{\delta}$, we drop the search box $B_i$; otherwise, we split $B_i$ into smaller search boxes. The procedure stops as soon as $\tilde{\delta} \le (1 + \epsilon) \min_i \ell_i$, which proves that $\tilde{\delta}$ gives a $(1 + \epsilon)$-approximation to the global minimum.

Specifically, we arrive at the following branch-and-bound strategy proposed by Gourdin, Hansen and Jaumard [24]. We specify it by giving the rules with which it (i) attempts to update the global upper bound, (ii) selects the next search box from the set of current search boxes, (iii) splits a search box if it remains active after bounding, and (iv) determines the local lower bounds. (See [25] for a precise formalization of the generic branch-and-bound algorithm that leaves open the instantiation of these rules. In any case, we give a self-contained description of our algorithms in Sections 4 and 5.)

(i) Upper Bounding Rule: We evaluate $f$ at the center $\tau_i$ of the current search box $B_i$.
(ii) Selection Rule: We pick the search box with the smallest lower bound (ties are broken arbitrarily).
(iii) Branching Rule: We split the current search box along its longest edge into 2 equal-sized subproblems.
(iv) Lower Bounding Rule: We obtain the local lower bound $\ell_i$ as $f(\tau_i) - d$, where $d$ is the half-diameter of the current box. (Since $f$ is 1-Lipschitz, we indeed have $\min_{\tau \in B_i} f(\tau) \ge \ell_i$.)

One may observe that the chosen selection rule (also known as Best-Node First) is a no-regret strategy in the sense that no other selection rule, even with prior knowledge of the global optimum, considers fewer search boxes (see, e.g., [36, Section 7.4]).

Drawback: Inexactness.
Unfortunately, the above branch-and-bound approach for Lipschitz optimization fundamentally cannot return an exact global optimum, and thus yields only an approximate decider.

In a somewhat similar vein, in the above framework we assume that we can evaluate $f(\tau)$ quickly. Previous implementations for the fixed-translation Fréchet distance focus on the decision problem "$f(\tau) \le \delta$?", not on determining the value $f(\tau)$. Both precise computations (via parametric search) and approximate computations (using a binary search up to a desired precision) are significantly more costly, raising the question of how to make optimal use of the cheaper decision queries.

Our first main contribution is engineering an exact decider for the discrete Fréchet distance under translation by combining the two approaches. On a high level, we globally perform the branch-and-bound strategy described in the Lipschitz optimization view in Section 3.2, but use as a base case a local version of the arrangement-based algorithms of Section 3.1 once the arrangement size in a search box is sufficiently small. As each search box is thus resolved exactly, this yields an exact decider. More precisely, our final algorithm is a result of the following steps and adaptations:

(1) Fréchet Decision Oracle. We adapt the currently fastest implementation of a decider for the continuous fixed-translation Fréchet distance [10] to the discrete fixed-translation Fréchet distance. Furthermore, to handle many queries for the same curve pair under different translations quickly, we incorporate an implicit translation so that curves do not need to be explicitly translated for each query translation $\tau$.
(2) Objective Function Evaluation. For our exact decider, the branch-and-bound strategy in Section 3.2 simplifies significantly: We do not maintain a global upper bound and local lower bounds $\ell_i$, but for each box only test whether $f(\tau_i) \le \delta$ (if so, we return YES) or whether $f(\tau_i) > \delta + d$ (this corresponds to updating the local lower bound beyond $\delta$, i.e., we may drop the box completely). Therefore, we may use an arbitrary selection rule. Note that we only require decision queries to the fixed-translation Fréchet algorithm.
(3) Base Case. We implement a local arrangement-based algorithm: For a given search box $B_i$, we (essentially) construct the arrangement $\mathcal{A}_\delta \cap B_i$ using CGAL [33], and test, for each face $f$ of $\mathcal{A}_\delta \cap B_i$, some translation $\tau \in f$ for $f(\tau) \le \delta$. This yields the algorithm that we may use as a base case.
(4) Base Case Criterion. For each search box, we compute an estimate of its arrangement complexity. If this estimate is smaller than a (tunable) parameter $\gamma_{\text{size}}$, or the depth of the branch-and-bound recursion for the current search box exceeds a parameter $\gamma_{\text{depth}}$, then we use the localized arrangement-based algorithm.
(5) Benchmark and Choice of Parameters. We choose the size and depth parameters $\gamma_{\text{size}}, \gamma_{\text{depth}}$ guided by a benchmark set that we create from a set of handwritten characters and synthetic GPS trajectories.

Algorithm 1: Algorithm for deciding the Fréchet distance under translation. We use $\tau_B$ to denote the center of the box $B$ and $d_B$ to denote the length of the diagonal of $B$.

procedure Decider($\pi, \sigma, \delta$)
    decide trivial NO instances with empty initial search box quickly
    $Q \leftarrow$ Fifo(initial search box)
    while $Q \ne \emptyset$ do
        $B \leftarrow$ extract front of search box queue $Q$
        if FréchetDistance($\pi, \sigma + \tau_B$) $> \delta + d_B / 2$ then    ▷ Lower Bounding
            skip $B$
        if FréchetDistance($\pi, \sigma + \tau_B$) $\le \delta$ then    ▷ Upper Bounding
            return YES
        $u \leftarrow$ upper bound on arrangement size inside $B$
        if $u = 0$ then    ▷ Arrangement-based Base Case
            skip $B$
        else if $u \le \gamma_{\text{size}}$ or layer of $B$ is $\gamma_{\text{depth}}$ then
            if local arrangement-based algorithm on $\pi, \sigma, \delta, B$ returns YES then
                return YES
            else
                skip $B$
        halve $B$ along longest edge and push resulting child boxes to $Q$    ▷ Branching
    return NO

The pseudocode of the resulting algorithm is shown in Algorithm 1. In the remainder of this section, we describe the details of our Fréchet-under-translation decider. We first describe the details of the local arrangement-based algorithm which serves as the base case for our decider.
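To make the control flow of Algorithm 1 concrete, the following is a minimal Python sketch of its branch-and-bound part only, under several assumptions that are not taken from the authors' implementation. The names `frechet_at_most`, `decide_sketch` and `max_depth` are hypothetical. The fixed-translation oracle is the textbook quadratic dynamic program rather than the tuned decider of [10]. Most importantly, the arrangement-based base case is omitted: a box that reaches `max_depth` without being resolved is reported as undecided (`None`), so unlike the algorithm in the paper this sketch is not an exact decider, and it uses plain floating-point arithmetic.

```python
import math
from collections import deque


def frechet_at_most(pi, sigma, tau, delta):
    """Decide d_F(pi, sigma + tau) <= delta via the textbook O(nm) dynamic program."""
    n, m = len(pi), len(sigma)
    tx, ty = tau

    def close(i, j):
        # The translation is applied implicitly; sigma is never copied.
        return math.hypot(pi[i][0] - sigma[j][0] - tx,
                          pi[i][1] - sigma[j][1] - ty) <= delta

    # reach[i][j]: some traversal prefix of width <= delta ends in the pair (i, j).
    reach = [[False] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if not close(i, j):
                continue
            reach[i][j] = (
                (i == 0 and j == 0)
                or (i > 0 and reach[i - 1][j])
                or (j > 0 and reach[i][j - 1])
                or (i > 0 and j > 0 and reach[i - 1][j - 1])
            )
    return reach[n - 1][m - 1]


def decide_sketch(pi, sigma, delta, max_depth=20):
    """Return True (YES), False (NO), or None (undecided at max_depth)."""
    # Preprocessing: a feasible tau lies in both disks of radius delta around
    # s = pi_1 - sigma_1 and e = pi_n - sigma_m.
    s = (pi[0][0] - sigma[0][0], pi[0][1] - sigma[0][1])
    e = (pi[-1][0] - sigma[-1][0], pi[-1][1] - sigma[-1][1])
    if math.hypot(s[0] - e[0], s[1] - e[1]) > 2 * delta:
        return False  # empty intersection: trivial NO
    # Intersecting the disks' bounding squares gives a valid (not tight) box.
    box = (max(s[0], e[0]) - delta, max(s[1], e[1]) - delta,
           min(s[0], e[0]) + delta, min(s[1], e[1]) + delta)
    queue = deque([(box, 0)])
    undecided = False
    while queue:
        (x0, y0, x1, y1), depth = queue.popleft()
        center = ((x0 + x1) / 2, (y0 + y1) / 2)
        half_diag = math.hypot(x1 - x0, y1 - y0) / 2
        # Lower bounding: f(center) > delta + d_B/2 excludes the whole box.
        if not frechet_at_most(pi, sigma, center, delta + half_diag):
            continue
        # Upper bounding: the center is a witness translation.
        if frechet_at_most(pi, sigma, center, delta):
            return True
        if depth >= max_depth:
            undecided = True  # the paper resolves such boxes exactly instead
            continue
        # Branching: halve the box along its longest edge.
        if x1 - x0 >= y1 - y0:
            xm = (x0 + x1) / 2
            children = [(x0, y0, xm, y1), (xm, y0, x1, y1)]
        else:
            ym = (y0 + y1) / 2
            children = [(x0, y0, x1, ym), (x0, ym, x1, y1)]
        queue.extend((child, depth + 1) for child in children)
    return None if undecided else False
```

Both bounding tests are phrased as decision queries with thresholds $\delta + d_B/2$ and $\delta$, mirroring the remark above that only decision queries to the fixed-translation oracle are needed. Instances whose optimum is very close to $\delta$ are exactly those on which this sketch may return `None`; this is the gap that the arrangement-based base case closes.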
Recall that given two polygonal curves $\pi, \sigma$ and a decision distance $\delta$, the set of circles of the arrangement $\mathcal{A}_\delta$ is

$\mathcal{C} := \{ C_\delta(\pi_i - \sigma_j) \mid \pi_i \in \pi, \sigma_j \in \sigma \}$,

where $C_r(p)$ denotes the circle of radius $r \in \mathbb{R}$ around $p \in \mathbb{R}^2$. The arrangement is then defined as the partition of $\mathbb{R}^2$ induced by $\mathcal{C}$. In particular, the decision of $d_F(\pi, \sigma + \tau) \le \delta$ is uniform for each $\tau \in \mathbb{R}^2$ in the same face of $\mathcal{A}_\delta$ (for a detailed explanation, we refer to [7, Section 3] or [9]). Thus, as already described in Section 3.1, it suffices to evaluate a representative translation from each face of the arrangement by running a fixed-translation Fréchet decider query on it in order to answer a Fréchet-under-translation decision query.

For integration into our branch-and-bound approach, where each node in the branch-and-bound tree corresponds to a search box $B$, the base case task is to decide whether there is some $\tau \in B$ with $d_F(\pi, \sigma + \tau) \le \delta$. For this task, we consider local arrangements, i.e., arrangements restricted to $B$. A circle $C \in \mathcal{C}$ contributes to the local arrangement of $B$ if the boundary of $C$ intersects the box. In other words, $C$ is relevant for the arrangement of $B$ if $C$ either is completely contained in $B$ or $C$ intersects the boundary of $B$. In particular, $C$ does not contribute to the local arrangement if it contains $B$ completely.

Estimation of local arrangement sizes.
Given a search box $B$, a simple way to estimate the size of the local arrangement for $B$, i.e., the arrangement restricted to $B$, is to consider the number of circles in $\mathcal{C}$ that contribute to it. We can obtain this number naively, by iterating over all $|\mathcal{C}| \le nm$ circles of the global arrangement and checking whether they contribute to the local arrangement (by checking for intersection and containment as described above). Let this number be denoted by $c$. The maximal number of nodes in the arrangement is then bounded by $u := 2(c^2 + c)$, as this is the maximal number of intersections between two circles and between a circle and the box. In particular, if $u = 0$, then the box lies within a single face of the arrangement and all translations in $B$ are equivalent for our decision question.

As a simple optimization, we may stop counting contributing circles once our estimate exceeds the threshold $\gamma_{\text{size}}$. A more sophisticated optimization builds a geometric data structure (specifically a kd-tree) to quickly retrieve all contributing circles without checking all circles in $\mathcal{C}$ naively. We discuss this approach in Section 5, as the expensive preprocessing for constructing this data structure only amortizes in the value computation setting.

Construction of local arrangement.
For a search box $B$ with an estimate smaller than $\gamma_{\text{size}}$, we construct an arrangement $\mathcal{A}_B$. To this end, we adapt our arrangement size estimation to also return the set $\mathcal{C}_B$ of circles intersecting $B$ or being contained in $B$. Note that computing topologically correct geometric arrangements on such a circle set is a challenging task, as it requires the usage of arbitrary precision numbers to reliably test for intersections and orderings of those intersections. Thus, we use the state-of-the-art computational geometry library CGAL [33] to build our circle arrangements. (Specifically, we use the exact predicates and exact computation kernels, as this is necessary for CGAL arrangements. The significantly faster kernel for inexact computation is not suitable for the CGAL arrangement package, although, surprisingly, for most instances it actually worked. Being able to use a faster kernel for arrangements should significantly improve our implementation's performance.) Unfortunately, CGAL only provides methods for building a global arrangement and not an arrangement restricted to a bounding box, thus we always build the whole arrangement of the circles in $\mathcal{C}_B$ instead of just the arrangement restricted to the box $B$. Alternatively, we could indeed compute circular arcs restricted to the bounding box and then build the arrangement of those arcs. However, due to the rather expensive construction of these arcs, this seems wasteful compared to a direct computation. Thus, a practical performance improvement of our approach could be achieved by directly computing an arrangement with a box restriction. Furthermore, we use the standard bulk-insertion interface for building the arrangement.

Resulting local arrangement-based algorithm.
Finally, given the arrangement $\mathcal{A}_B$ of the circles $\mathcal{C}_B$, we may simply test a translation $\tau$ for each face $f$ of $\mathcal{A}_B$ that intersects $B$. In fact, for efficiency, we do this by testing each vertex $\tau$ of $\mathcal{A}_B$ (even for vertices outside of $B$, as due to the expensive construction of $\mathcal{A}_B$, it pays off to make the rather cheap tests for positive witnesses also outside of $B$); observe that this ensures that each face $f$ is indeed tested. We return YES if and only if some vertex $\tau$ of $\mathcal{A}_B$ achieves $d_F(\pi, \sigma + \tau) \le \delta$.

Now, we describe our decider (whose pseudocode is given in Algorithm 1) in more detail. Recall that an exact decider, given curves $\pi = (\pi_1, \dots, \pi_n)$, $\sigma = (\sigma_1, \dots, \sigma_m)$ and a distance $\delta$, decides whether the Fréchet distance under translation of $\pi$ and $\sigma$ is at most $\delta$, i.e., whether $d_{\text{trans-}F}(\pi, \sigma) \le \delta$.
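Since the decider relies on the naive arrangement size estimate from the description of the local arrangement-based algorithm, here is a small Python sketch of that estimate under stated assumptions. The function names `contributes` and `arrangement_size_bound`, the box representation `(x0, y0, x1, y1)` and the optional early-exit argument are hypothetical choices for illustration; the authors' implementation and its kd-tree variant are not reproduced here. The sketch uses the elementary fact that a circle of radius $r$ meets a closed axis-aligned box exactly when $r$ lies between the smallest and largest distance from its center to the box.

```python
import math


def contributes(center, r, box):
    """True iff the circle of radius r around center meets the closed box.

    This covers both cases from the text: the circle lies inside the box, or
    it crosses the box boundary. It is False if the disk contains the box
    completely or if the disk is disjoint from the box.
    """
    x0, y0, x1, y1 = box
    cx, cy = center
    # Distance from the center to the nearest point of the box (0 if inside).
    dx = max(x0 - cx, 0.0, cx - x1)
    dy = max(y0 - cy, 0.0, cy - y1)
    d_min = math.hypot(dx, dy)
    # Distance from the center to the farthest corner of the box.
    fx = max(abs(cx - x0), abs(cx - x1))
    fy = max(abs(cy - y0), abs(cy - y1))
    d_max = math.hypot(fx, fy)
    return d_min <= r <= d_max


def arrangement_size_bound(pi, sigma, delta, box, gamma_size=None):
    """Naive estimate u = 2(c^2 + c), where c counts contributing circles."""
    c = 0
    for p in pi:
        for s in sigma:
            # Circle C_delta(pi_i - sigma_j) in translation space.
            if contributes((p[0] - s[0], p[1] - s[1]), delta, box):
                c += 1
                # Optional early exit once the estimate exceeds the threshold.
                if gamma_size is not None and 2 * (c * c + c) > gamma_size:
                    return 2 * (c * c + c)
    return 2 * (c * c + c)
```

With the early exit enabled, the returned value is only guaranteed to exceed `gamma_size`, which suffices for the base case criterion; a return value of 0 indicates that no circle meets the box, so all translations in the box are equivalent for the decision question.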
Preprocessing.
As a first step, we aim to determine an initial search box. Since any $\tau \in \mathbb{R}^2$ with $d_F(\pi, \sigma + \tau) \le \delta$ implies that $\|\pi_1 - (\sigma_1 + \tau)\|, \|\pi_n - (\sigma_m + \tau)\| \le \delta$, we must have that $\tau$ is in the intersection $I := D_\delta(\pi_1 - \sigma_1) \cap D_\delta(\pi_n - \sigma_m)$, where $D_r(p)$ denotes the disk of radius $r$ around $p$. If this intersection is empty, i.e., $\pi_1 - \sigma_1$ and $\pi_n - \sigma_m$ have a distance of more than $2\delta$, we return NO immediately. Otherwise, we take a bounding box of the intersection.

Branch-and-Bound.
We implement the recursive branch-and-bound strategy using a FIFO queue $Q$ of search boxes (corresponding to a breadth-first search) that is initialized with the initial search box. As long as there are undecided boxes in the queue, we take the first such box $B$ and try to resolve it using the upper bounding rule (point (i) in View II) and the lower bounding rule (point (iv) in View II), which are both derived by queries to the fixed-translation Fréchet distance decider using the center point $\tau_B$ of the box as translation. Specifically, if $d_F(\pi, \sigma + \tau_B) \le \delta$ (line 6 in Algorithm 1), we have found a witness translation and can return YES. The lower bounding rule (line 8) tests whether $d_F(\pi, \sigma + \tau_B) > \delta + d_B/2$, i.e., whether the distance at the center exceeds $\delta$ plus the maximal distance of any point in the box to the center $\tau_B$, which is the half-diagonal length $d_B/2$. If so, by the Lipschitz property, we know that any translation in $B$ yields a Fréchet distance larger than $\delta$ and thus we can drop $B$.

If neither rule applies, we check our termination criterion of the branch-and-bound strategy. To this end, in line 11, we calculate a good upper bound $u$ on the size of the local arrangement for $B$ as described in Section 4.1. If $u = 0$, the arrangement for $B$ consists of a single face, i.e., each translation $\tau \in B$ is equivalent for our decision problem, and we can skip the box since we have already tested the translation $\tau_B \in B$. Otherwise, i.e., if $u \neq 0$, in line 14 we check whether $u$ is bounded by a size parameter $\gamma_{\mathrm{size}}$ or the depth of the current search box (in the implicit recursion tree) has reached a depth parameter $\gamma_{\mathrm{depth}}$. If so, we run the local arrangement-based algorithm to decide $B$.

If none of the above rules decide the search box $B$, we split it along its longer side into two equal-sized child boxes and push them to the queue. If all boxes have been dropped without finding a witness translation, we have verified that every translation $\tau$ in the initial search box yields $d_F(\pi, \sigma + \tau) > \delta$ and may safely return NO.

Low-level optimizations.
For further practical speed-ups, we employ several low-level optimizations, which we briefly mention here (for further details, we refer to the source code of our implementation).

For each box in the branch-and-bound tree we need a differently translated curve. However, often we barely access the nodes of the translated curve. For example, if already the start nodes of the curves are too far, we do not need to consider the remainder. Thus, it seems wasteful to translate each point of the curves before calling the fixed-translation Fréchet decider. To avoid this overhead, we lazily translate the necessary parts of a curve on access. In fact, while the currently fastest implementation of the fixed-translation Fréchet distance decider [10] uses a preprocessing of the curves that computes all prefix lengths and extrema of the curves, we only need to perform this preprocessing once, as all computed information is either invariant under translations (for the prefix lengths) or can just be shifted by the translation (for the extrema).

Footnote (on the initial search box): In fact, we use a slightly more refined search box by incorporating additionally the extreme points of both curves.
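The branch-and-bound part of the decider described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `frechet_leq` and `decide` and the parameter `max_depth` are hypothetical, the fixed-translation decider is a plain quadratic dynamic program rather than the tuned decider of [10], the initial box is the (conservative) intersection of the two disks' bounding squares, and the arrangement-based base case is replaced by a depth cutoff that returns `None` (undecided) instead of an exact answer.

```python
from collections import deque
from math import hypot


def frechet_leq(pi, sigma, tau, delta):
    """Decide whether the discrete Fréchet distance of pi and sigma + tau is <= delta."""
    d2 = delta * delta
    prev = None
    for i, (px, py) in enumerate(pi):
        cur = [False] * len(sigma)
        for j, (sx, sy) in enumerate(sigma):
            dx, dy = px - (sx + tau[0]), py - (sy + tau[1])
            if dx * dx + dy * dy > d2:
                continue  # the pair (i, j) is too far apart
            if i == 0 and j == 0:
                cur[j] = True
            else:
                cur[j] = ((i > 0 and prev[j]) or (j > 0 and cur[j - 1])
                          or (i > 0 and j > 0 and prev[j - 1]))
        prev = cur
    return prev[-1]


def decide(pi, sigma, delta, max_depth=40):
    """Return True (YES), False (NO), or None if the depth cutoff left a box undecided."""
    c1 = (pi[0][0] - sigma[0][0], pi[0][1] - sigma[0][1])      # aligns the start points
    c2 = (pi[-1][0] - sigma[-1][0], pi[-1][1] - sigma[-1][1])  # aligns the end points
    if hypot(c1[0] - c2[0], c1[1] - c2[1]) > 2 * delta:
        return False  # the two disks of radius delta are disjoint
    # Conservative initial box: intersection of the bounding squares of both disks.
    box = (max(c1[0], c2[0]) - delta, max(c1[1], c2[1]) - delta,
           min(c1[0], c2[0]) + delta, min(c1[1], c2[1]) + delta)
    queue, undecided = deque([(box, 0)]), False
    while queue:
        (x0, y0, x1, y1), depth = queue.popleft()
        center = ((x0 + x1) / 2, (y0 + y1) / 2)
        half_diag = hypot(x1 - x0, y1 - y0) / 2
        if frechet_leq(pi, sigma, center, delta):
            return True  # upper bounding rule: witness translation found
        if not frechet_leq(pi, sigma, center, delta + half_diag):
            continue  # lower bounding rule (Lipschitz): drop the box
        if depth >= max_depth:
            undecided = True  # the paper runs the arrangement-based algorithm here
            continue
        if x1 - x0 >= y1 - y0:  # split along the longer side
            xm = (x0 + x1) / 2
            children = [(x0, y0, xm, y1), (xm, y0, x1, y1)]
        else:
            ym = (y0 + y1) / 2
            children = [(x0, y0, x1, ym), (x0, ym, x1, y1)]
        queue.extend((child, depth + 1) for child in children)
    return None if undecided else False
```

For the toy curves `[(0, 0), (1, 1), (2, 0)]` and `[(0, 0), (1, -1), (2, 0)]`, whose distance under translation is $\sqrt{1/2}$, the sketch answers YES for $\delta = 0.75$ and NO for $\delta = 0.65$. Without the exact base case it can only answer `None` when the test distance is extremely close to the true value.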
Algorithm 2: Our Lipschitz-Meets-Fréchet (LMF) algorithm for approximate value computation. We use $\tau_B$ to denote the center of the box $B$ and $d_B$ to denote the length of its diagonal.

procedure LMF($\pi, \sigma$)
    Preprocessing: build data structures for fast arrangement estimation and construction
    compute initial distance interval $[\delta_{LB}, \delta_{UB}]$ containing $d_{\text{trans-}F}(\pi, \sigma)$
    initialize global upper bound $\tilde{\delta} \gets \delta_{UB}$
    $Q \gets$ PriorityQueue(initial search box $B$ with local lower bound $\ell_B \gets \delta_{LB}$)
    while $Q \neq \emptyset$ do
        $B \gets$ box with smallest local lower bound $\ell_B$ in $Q$
        if $\tilde{\delta} \le \ell_B (1 + \epsilon)$ then
            skip $B$
        if FréchetDistance($\pi, \sigma + \tau_B$) $\le \tilde{\delta}$ then    ▷ Upper/Lower Bounding
            compute value $d_F(\pi, \sigma + \tau_B)$ with high precision and update $\tilde{\delta}$ and $\ell_B$
        else if FréchetDistance($\pi, \sigma + \tau_B$) $> \tilde{\delta} + d_B/2$ then
            skip $B$
        compute value $d_F(\pi, \sigma + \tau_B)$ with coarse precision and update $\ell_B$
        if $\tilde{\delta} \le \ell_B (1 + \epsilon)$ then
            skip $B$
        $u \gets$ upper bound on arrangement size inside $B$ for $\delta \in [\ell_B, \tilde{\delta}]$
        if $u = 0$ then    ▷ Arrangement-based Base Case
            skip $B$
        else if $u \le \gamma_{\mathrm{size}}$ or layer of $B$ is $\gamma_{\mathrm{depth}}$ then
            update $\tilde{\delta}$ via binary search over arrangement algorithm on $B$ and $\delta \in [\ell_B, \tilde{\delta}]$
            skip $B$
        push child boxes of $B$ to $Q$ with local lower bounds set to $\ell_B$    ▷ Branching
    return $\tilde{\delta}$

Furthermore, while the initial bounding box is derived from the discs around the translation between the start nodes and the translation between the end nodes, later child boxes in the branch-and-bound tree might violate this condition. We therefore re-check this condition on creating child boxes. Additionally, in line 14 of Algorithm 1 we check if the depth parameter $\gamma_{\mathrm{depth}}$ is reached. This can actually already be done before line 8, which we also do in the implementation, but for the sake of brevity, we present it differently in the pseudocode.

In this section we present our second main contribution: an algorithm for computing the value of the Fréchet distance under translation.
Thus, we now focus on the functional task of computing the value $d_{\text{trans-}F}(\pi, \sigma) = \min_{\tau \in \mathbb{R}^2} d_F(\pi, \sigma + \tau)$, in contrast to the previously discussed decision problem “$d_{\text{trans-}F}(\pi, \sigma) \le \delta$?”. In theory, one could use the paradigm of parametric search [30], see [7, 9] for details for the discrete case. However, it is rarely used in practice as it is non-trivial to code, and computationally costly. Instead, as in most conceivable settings an estimate with small multiplicative error $(1 \pm \epsilon)$ with, e.g., $\epsilon = 10^{-}$, suffices, we consider the problem of computing an estimate in $(1 \pm \epsilon)\, d_{\text{trans-}F}(\pi, \sigma)$.

There are several possible approaches to obtain an approximation with multiplicative error $(1 \pm \epsilon)$ for arbitrarily small $\epsilon > 0$:

$\epsilon$-approximate Set: A natural approach underlying previous approximation algorithms [5] is to generate a set of $f(1/\epsilon)$ candidate translations $T$ such that the best translation $\tau$ among this set gives a $(1 + \epsilon)$-approximation for the Fréchet distance under translation. Specifically, it is simple to obtain a bounding box $B$ of side length $O(\delta)$ for the optimal translation $\tau^*$ (see the 2-approximation in Section 2 together with the preprocessing described in Section 4). We impose a grid of side length at most $(\epsilon/\sqrt{2})\,\delta$ so that each point in $B$ is within distance $\epsilon\delta$ of some grid point. Since the Fréchet distance is Lipschitz, this yields a $(1 + \epsilon)$-approximate set. Unfortunately, this set is of size $\Theta(1/\epsilon^2)$, which is prohibitively large for approximation guarantees such as $\epsilon = 10^{-}$.

Remark:
In the context of global Lipschitz optimization, this approach is known as the passive algorithm whose performance generally is dominated by (the adaptive) branch-and-bound methods. Binary Search via Decision Problem:
A further canonical approach is to reduce the $(1 + \epsilon)$-approximate computation task to the decision problem using a binary search. Formally, let $\delta^*$ denote the Fréchet distance under translation. Starting from a simple 2-approximation $\delta_{UB}$ (see Section 2, or, more precisely, the initial estimates discussed later in this section), we use a binary search in the interval $[0.5 \cdot \delta_{UB}, \delta_{UB}]$, terminating as soon as we arrive at an interval $[a, b]$ with $b \le (1 + \epsilon) a$. As this takes only $O(\log(1/\epsilon))$ iterations to obtain a $(1 + \epsilon)$-approximation, this approach is much more suitable to obtain a desired guarantee of $\epsilon = 10^{-}$.

Lipschitz-only Optimization:
The main drawback of the generic Lipschitz optimization algorithms discussed in Section 3.2 was that they cannot be used to derive an exact answer. This drawback no longer applies for approximate value computation. We can thus use a pure branch-and-bound algorithm for global Lipschitz optimization. In particular, we will use the same strategy as our fastest solution, however, we never use the arrangement-based algorithm, but only terminate at a search box once the local lower bound and global upper bound provide a $(1 + \epsilon)$-approximation.

Our solution, Lipschitz-meets-Fréchet:
We follow our approach of combining Lipschitz optimization with arrangement-based algorithms (described in Section 3) to compute a $(1 + \epsilon)$-approximation of the distance value. As opposed to the decision algorithm, we indeed maintain a global upper bound $\tilde{\delta}$ and local lower bounds $\ell_i$ for each search box $B_i$. To update these bounds, we approximately evaluate the objective function $f(\tau)$ using a tuned binary search over the fixed-translation Fréchet decider algorithm. We stop branching in a search box $B_i$ if either the global upper bound $\tilde{\delta}$ is at most $\ell_i (1 + \epsilon)$, or a base case criterion similar to the decision setting applies. As selection strategy, we employ the no-regret strategy of choosing the box with the smallest lower bound first. The base case performs a binary search using the local arrangement-based decision algorithm; thus, our upper bound on the arrangement size must hold for all $\delta$ in the search interval. The pseudocode of our solution is shown in Algorithm 2.

We present the details of our approach in the remainder of this section. As our experiments reveal, our solution generally outperforms the above described alternatives, see Section 6.

Footnote (on the tuned binary search): We tune the binary search by distinguishing the precision with which we want to evaluate $f(\tau)$; intuitively, it pays off to evaluate $f(\tau)$ with high precision if this evaluation yields a better global upper bound, while for improvements of a local lower bound, a cheaper evaluation with coarser precision suffices.

Remark:
To enable a fair comparison of the Lipschitz-meets-Fréchet (LMF) approach to the alternative approaches of Binary Search and Lipschitz-only optimization, we take care that the low-level optimizations for LMF described in the remainder of this section are also applied to these approaches, as far as applicable. In particular, we use the same method to obtain initial estimates for the desired value for LMF, Binary Search and Lipschitz-only optimization, and adapt the kd-tree-based data structure used to speed up estimation and construction of arrangements for LMF also for Binary Search (note that these tasks do not apply to Lipschitz-only optimization).

We now present details of our solution for the (approximate) value computation setting, the LMF algorithm. We first consider the base case (which differs from the base case of the decider, given in Section 4.1), before we discuss further details.
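Of the approaches listed above, the Lipschitz-only variant is the easiest to illustrate in isolation. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names `frechet_dist` and `lipschitz_only` are hypothetical, the objective is evaluated exactly by the standard min-max dynamic program instead of a binary search over a decider, and the termination test uses the multiplicative criterion $\tilde{\delta} \le \ell_B (1 + \epsilon)$ from the text.

```python
import heapq
from math import hypot


def frechet_dist(pi, sigma, tau):
    """Discrete Fréchet distance of pi and sigma + tau (standard min-max DP)."""
    prev = None
    for i, (px, py) in enumerate(pi):
        cur = []
        for j, (sx, sy) in enumerate(sigma):
            d = hypot(px - sx - tau[0], py - sy - tau[1])
            if i == 0 and j == 0:
                best = 0.0
            else:
                cands = []
                if i > 0:
                    cands.append(prev[j])
                if j > 0:
                    cands.append(cur[j - 1])
                if i > 0 and j > 0:
                    cands.append(prev[j - 1])
                best = min(cands)
            cur.append(max(d, best))
        prev = cur
    return prev[-1]


def lipschitz_only(pi, sigma, eps=1e-3):
    """Best-first branch-and-bound; returns a value in [opt, (1 + eps) * opt]."""
    c1 = (pi[0][0] - sigma[0][0], pi[0][1] - sigma[0][1])      # aligns start points
    c2 = (pi[-1][0] - sigma[-1][0], pi[-1][1] - sigma[-1][1])  # aligns end points
    d_start, d_end = frechet_dist(pi, sigma, c1), frechet_dist(pi, sigma, c2)
    ub = min(d_start, d_end)      # global upper bound
    lb = max(d_start, d_end) / 2  # global lower bound from the 2-approximation
    # The optimal translation lies within distance ub of both c1 and c2.
    box = (max(c1[0], c2[0]) - ub, max(c1[1], c2[1]) - ub,
           min(c1[0], c2[0]) + ub, min(c1[1], c2[1]) + ub)
    heap, tick = [(lb, 0, box)], 1  # tick breaks ties between equal lower bounds
    while heap:
        low, _, (x0, y0, x1, y1) = heapq.heappop(heap)
        if ub <= low * (1 + eps):
            break  # the smallest lower bound is good enough, so are all others
        center = ((x0 + x1) / 2, (y0 + y1) / 2)
        half_diag = hypot(x1 - x0, y1 - y0) / 2
        value = frechet_dist(pi, sigma, center)
        ub = min(ub, value)                # upper bounding
        low = max(low, value - half_diag)  # lower bounding via the Lipschitz property
        if ub <= low * (1 + eps):
            continue
        if x1 - x0 >= y1 - y0:  # split along the longer side
            xm = (x0 + x1) / 2
            children = [(x0, y0, xm, y1), (xm, y0, x1, y1)]
        else:
            ym = (y0 + y1) / 2
            children = [(x0, y0, x1, ym), (x0, ym, x1, y1)]
        for child in children:
            heapq.heappush(heap, (low, tick, child))
            tick += 1
    return ub
```

Using a heap keyed by the local lower bound realizes the smallest-lower-bound-first selection, so the loop can stop as soon as the popped box already satisfies the approximation criterion.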
Our base case problem is the following: Given curves $\pi, \sigma$, a test distance interval $I = [\delta_{LB}, \delta_{UB}]$ and a search box $B$, we let $\delta^* := \min_{\tau \in B} d_F(\pi, \sigma + \tau)$ and ask to determine whether $\delta^* \in I$, and if so, an estimate $\delta$ with $|\delta - \delta^*| \le \epsilon$.

The central idea is to solve this task via a binary search for $\delta^* \in I$ using our local arrangement-based algorithm of Section 4.1 to decide queries of the form “$\delta^* \le \delta$?” for any given $\delta$. For this algorithm to run quickly, we need that for any queried distance $\delta$, the corresponding local arrangement for the test distance $\delta$ is small. To this end, we seek to obtain a strong upper bound for the local arrangement size over worst-case $\delta \in I$.

Estimation of local arrangement sizes.
Given an interval $I = [\delta_{LB}, \delta_{UB}]$ of test distances, instead of the circles defined in Section 4.1, we consider the set of annuli

$\mathcal{D} := \{ D_{\delta_{UB}}(\pi_i - \sigma_j) \setminus D_{\delta_{LB}}(\pi_i - \sigma_j) \mid \pi_i \in \pi, \sigma_j \in \sigma \}$,

where $D_r(p)$ denotes the disk of radius $r \in \mathbb{R}$ around $p \in \mathbb{R}^2$. Clearly, if a circle $C_\delta(\pi_i - \sigma_j)$ contributes to the local arrangement of $B$ for some test distance $\delta \in [\delta_{LB}, \delta_{UB}]$, then the corresponding annulus $D_{\delta_{UB}}(\pi_i - \sigma_j) \setminus D_{\delta_{LB}}(\pi_i - \sigma_j)$ intersects $B$ or is contained in $B$. Thus by determining the number $d$ of annuli $a \in \mathcal{D}$ that intersect $B$ or are contained in $B$, we may bound the local arrangement size for $B$ for any $\delta \in [\delta_{LB}, \delta_{UB}]$ by u := 2( d + d ) (analogously to Section 4.1).

To obtain the above upper bound efficiently, we implement a geometric search data structure based on the kd-tree. Specifically, we build a kd-tree on the set of center points of all annuli in $\mathcal{D}$. Given a search box $B$, we seek to determine all centers of annuli $a \in \mathcal{D}$ that intersect $B$ or contain $B$. While this condition can be described using a constant (but large) set of simple primitives, evaluating this test frequently for many kd-tree nodes is costly. Thus, to determine whether a node in the kd-tree needs to be explored, we use a more permissive, but cheaper test which essentially approximates the search box $B$ by its center point: we search for all candidate points that are contained in an annulus of width roughly $|I|$ plus half the diameter of $B$, centered at the center of $B$, and test for each such point whether the corresponding annulus in $\mathcal{D}$ indeed intersects $B$.

Again, we implement this search for contributing annuli such that we return the centers of all found annuli. This can subsequently be used by the local arrangement-based algorithm to quickly construct the arrangement for each query. Furthermore, we again stop the search as soon as the number of such annuli exceeds $\gamma_{\mathrm{size}}$.

Binary Search via local arrangement-based algorithm.
To obtain the desired estimate for δ ∗ in the case that our size estimate is bounded by γ size , we use a binary search via our localarrangement-based algorithm. As a low-level optimization to speed-up the construction ofthe local arrangement for a query distance δ , we pass the centers of contributing annuli tothe local arrangement-based algorithm. Furthermore, as described in Section 4.1, we letthe arrangement-based decision algorithm test all vertices in the arrangement of all circles C B contributing to the search box B , not only vertices in B . As this can only decrease thereturned estimate (by finding a corresponding witness), this does not affect correctness ofthe algorithm. The pseudocode of the LMF algorithm is shown in Algorithm 2. When referring to lines inthe remainder of this section, we refer to lines in this algorithm. Before we address someaspects and optimizations in detail, we give a short overview over the algorithm. First, notethat as our selection strategy is different from the decider setting, we now use a priorityqueue for the boxes, see line 5. In lines 8 to 17 the bounding happens and in lines 18 to 23we check if the base case criterion applies, and if it does, determine the value for this boxusing the arrangement-based approach. Finally, in line 25 we branch if we did not alreadyskip the box.
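The counting of contributing annuli that underlies the size estimate can be illustrated with a brute-force sketch. It is an assumption-laden simplification rather than the authors' data structure: the names `dist_range_to_box` and `count_contributing_annuli` and the parameter `cap` are hypothetical, all $nm$ centers are scanned instead of querying a kd-tree, and the annulus is treated as the closed set of points at distance between $\delta_{LB}$ and $\delta_{UB}$ from its center. The test relies on the fact that the distance to the center takes every value between its minimum and maximum on a (connected) box.

```python
from math import hypot


def dist_range_to_box(c, box):
    """Minimum and maximum Euclidean distance from point c to the box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    dx = max(x0 - c[0], 0.0, c[0] - x1)  # horizontal gap to the box, 0 if inside
    dy = max(y0 - c[1], 0.0, c[1] - y1)  # vertical gap to the box, 0 if inside
    fx = max(abs(c[0] - x0), abs(c[0] - x1))  # farthest corner, per coordinate
    fy = max(abs(c[1] - y0), abs(c[1] - y1))
    return hypot(dx, dy), hypot(fx, fy)


def count_contributing_annuli(pi, sigma, box, d_lb, d_ub, cap=None):
    """Count centers pi_i - sigma_j whose annulus [d_lb, d_ub] meets the box.

    If cap is given, stop as soon as the count exceeds it (the estimate is
    then already too large for the base case).
    """
    count = 0
    for px, py in pi:
        for sx, sy in sigma:
            dmin, dmax = dist_range_to_box((px - sx, py - sy), box)
            # The box meets the annulus iff [dmin, dmax] overlaps [d_lb, d_ub].
            if dmin <= d_ub and dmax >= d_lb:
                count += 1
                if cap is not None and count > cap:
                    return count
    return count
```

A kd-tree over the centers, as described in the text, would replace the double loop by a range query. The early exit via `cap` mirrors stopping the search once more than $\gamma_{\mathrm{size}}$ annuli have been found.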
Initial estimates.
In line 3 we calculate initial estimates for the upper and lower bound. To this end, we consider the translation $\tau_{\mathrm{start}}$ (resp. $\tau_{\mathrm{end}}$) that aligns the first (resp. last) points of $\pi, \sigma$ as it yields a 2-approximation $\delta_{\mathrm{start}} := d_F(\pi, \sigma + \tau_{\mathrm{start}})$ (resp. $\delta_{\mathrm{end}} := d_F(\pi, \sigma + \tau_{\mathrm{end}})$). Using the best of both approximations, our initial estimation interval for $d_{\text{trans-}F}(\pi, \sigma)$ is $[\delta_{LB}, \delta_{UB}] := [\max\{\delta_{\mathrm{start}}, \delta_{\mathrm{end}}\}/2, \min\{\delta_{\mathrm{start}}, \delta_{\mathrm{end}}\}]$, see Section 2.

Priority queue.
To implement our smallest-lower-bound-first selection rule, we use a priorityqueue to organize the search boxes, using the local lower bounds as keys. Recall that thisyields a no-regret selection strategy for our branch-and-bound framework.
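The 2-approximation behind the initial estimates follows from the Lipschitz property in two steps. The derivation below is a short reconstruction of the standard argument, writing $\tau^*$ for an optimal translation, $\delta^* = d_F(\pi, \sigma + \tau^*)$, and $\tau_{\mathrm{start}} = \pi_1 - \sigma_1$; the paper itself refers to Section 2 for this fact.

```latex
\begin{align*}
\|\tau_{\mathrm{start}} - \tau^*\|
  &= \|\pi_1 - (\sigma_1 + \tau^*)\|
   \le d_F(\pi, \sigma + \tau^*) = \delta^*, \\
\delta_{\mathrm{start}} = d_F(\pi, \sigma + \tau_{\mathrm{start}})
  &\le d_F(\pi, \sigma + \tau^*) + \|\tau_{\mathrm{start}} - \tau^*\|
   \le 2\delta^*, \\
\delta_{\mathrm{start}} / 2
  &\le \delta^* \le \delta_{\mathrm{start}}.
\end{align*}
```

The first line uses that any coupling matches the first points of both curves, the second line is the Lipschitz property. The same argument applies to $\tau_{\mathrm{end}}$ with the last points, and intersecting both intervals gives the initial interval stated above.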
Objective function evaluation: Computing Fréchet distance via Fréchet decider.
To update our global upper bound and local lower bounds, we need to determine Fréchet distance values rather than decisions (which were sufficient for our decider), see lines 11 and 15. However, we do not always need a very precise calculation. While the upper bound is global and thus an improvement might lead to significant progress by dropping a number of search boxes, the lower bound only has an effect on the box itself and on its children. Thus, we use a coarse distance computation (i.e., an approximation up to a larger additive constant) for the lower bound in line 15, but a more precise calculation for the upper bound in line 11.

In two cases (lines 10 and 22) we are only interested in the exact Fréchet distance value if it is smaller than the current global upper bound. Thus, as is hidden in the pseudocode, we first check if there is an improvement at all, and only if this is the case, we compute the actual value using a binary search.
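Evaluating the objective via a decider, with a precision that depends on the purpose, can be sketched as follows. This is an illustration only: the names `frechet_leq`, `refine` and `try_improve_upper_bound` and the default precision are hypothetical, and the decider is a plain quadratic dynamic program standing in for the tuned fixed-translation decider of [10].

```python
def frechet_leq(pi, sigma, tau, delta):
    """Decide whether the discrete Fréchet distance of pi and sigma + tau is <= delta."""
    d2 = delta * delta
    prev = None
    for i, (px, py) in enumerate(pi):
        cur = [False] * len(sigma)
        for j, (sx, sy) in enumerate(sigma):
            dx, dy = px - (sx + tau[0]), py - (sy + tau[1])
            if dx * dx + dy * dy > d2:
                continue  # the pair (i, j) is too far apart
            if i == 0 and j == 0:
                cur[j] = True
            else:
                cur[j] = ((i > 0 and prev[j]) or (j > 0 and cur[j - 1])
                          or (i > 0 and j > 0 and prev[j - 1]))
        prev = cur
    return prev[-1]


def refine(pi, sigma, tau, lo, hi, precision):
    """Shrink [lo, hi] around d_F(pi, sigma + tau), assuming lo <= d_F <= hi."""
    while hi - lo > precision:
        mid = (lo + hi) / 2
        if frechet_leq(pi, sigma, tau, mid):
            hi = mid  # the distance is at most mid
        else:
            lo = mid  # the distance is larger than mid
    return lo, hi


def try_improve_upper_bound(pi, sigma, tau, ub, precision=1e-6):
    """Return an improved upper bound if tau beats ub, otherwise ub unchanged."""
    if not frechet_leq(pi, sigma, tau, ub):
        return ub  # no improvement at all: skip the binary search
    return refine(pi, sigma, tau, 0.0, ub, precision)[1]
```

Calling `refine` with a large `precision` corresponds to the coarse evaluation used for local lower bounds (its first return value is a valid lower bound), while `try_improve_upper_bound` performs the single cheap decision first and only then pays for the precise search.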
Additive vs. multiplicative approximation.
Due to rounding issues that occur at decisions depending on extremely small value differences when using fixed precision arithmetic, we use an additive approximation of $\epsilon = 10^{-}$ instead of a multiplicative approximation to ensure that these issues do not arise on usage of our implementation with arbitrary data sets. Note that all computed distances in our benchmarks have a value larger than 1, and thus also in terms of multiplicative approximation $(1 + \epsilon)$, we have $\epsilon \le 10^{-}$.

To engineer and evaluate our approach, we provide a benchmark on the basis of the curve datasets that were used to evaluate the currently fastest fixed-translation Fréchet decider implementation in [10]. In particular, this curve set involves a set of handwritten characters (Characters, [2]) and the data set of the GIS Cup 2017 (Sigspatial, [1]). Table 1 gives statistics of these datasets.
Table 1: Information about the data sets used in the benchmarks.

Data set          Type                         Number of curves              Mean number of vertices
Sigspatial [1]    synthetic GPS-like           20199                         247.8
Characters [2]    20 handwritten characters    2858 (142.9 per character)    120.9
The aim of our evaluations is to investigate the following main questions:
- Is our solution able to decide queries on realistic curve sets in an amount of time that is practically feasible, even when the size of the arrangement suggests infeasibility?
- Is our combination of Lipschitz optimization and arrangement-based algorithms for value computation superior to the alternative approaches described in Section 5?
Furthermore, we aim to provide an understanding of the performance of our novel algorithms.
Decider experiments.
For decision queries of the form “ d trans- F ( π, σ ) ≤ δ ?”, we generatea benchmark query set that distinguishes between how close the test distance is to theactual distance of the curves: Given a set of curves C , we sample 1000 curve pairs π, σ ∈ C uniformly at random. Using our implementation, we determine an interval [ δ LB , δ UB ] suchthat δ UB − δ LB ≤ · − and d trans- F ( π, σ ) ∈ [ δ LB , δ UB ]. For each ‘ ∈ {− , . . . , − } , weadd “ d trans- F ( π, σ ) ≤ (1 − ‘ ) δ LB ?” to the query set C NO ‘ , which contains only NO instances.Similarly, for each ‘ ∈ {− , . . . , } we add “ d trans- F ( π, σ ) ≤ (1 + 4 ‘ ) δ UB ?” to the queryset C YES ‘ , which contains only YES instances. We evaluate our decider on this benchmarkcreated for the Characters and
Sigspatial data sets. Furthermore, we give results fora further benchmark set generated from the
Characters curve set by sampling, for each of the 20 characters $c$ included in Characters, 50 curve pairs $\pi, \sigma$ representing the same character $c$. This yields a benchmark that has the same size of 1000 query curve pairs, but compares only same-character curves. We show the mean running times on these three benchmark sets in Figure 4. As before, we also depict the number of black-box calls of our decider and, as a baseline, an estimate of the arrangement size (and thus the number of black-box calls of a naive approach) in Figure 5. Note that for small ranges of the test distance $\delta$, it may happen that we decide a NO instance without a single black-box call by determining that the distance between $\pi_1 - \sigma_1$ and $\pi_n - \sigma_m$ is larger than $2\delta$; corresponding values below 1 call are not depicted in Figure 5.

[Figure 4: three panels (Same-characters, All-characters, Sigspatial); x-axis: distance factor; y-axis: decision time (ms).]
Figure 4: Running time for our decider. We plot the mean running times over 1000 NO (or YES) queries with a test distance of approximately $(1 - 4^{-\ell})$ (or $(1 + 4^{-\ell})$) times the true Fréchet distance under translation, as well as the interval between the lower and upper quartile over the queries.

To give an insight for the speed-up achieved over the baseline arrangement-based algorithm that makes a black-box call to the fixed-translation Fréchet decider for each face of the arrangement $A_\delta$, in Figure 5 we depict both the number of black-box calls to the fixed-translation Fréchet decider made by our implementation, as well as an estimate for the arrangement size, and thus the number of black-box calls of the baseline approach.

We observe that on the above sets, the average decision time ranges from below 1 ms to 142 ms, deciding our Characters benchmark (involving 23,000 queries) in 628 seconds. Our estimation suggests that a naive implementation of the baseline arrangement-based algorithm would have been worse by more than three orders of magnitude, as for each set, the average number of black-box calls to the fixed-translation Fréchet decider is smaller by a factor of at least 1000 than our estimation of the arrangement size. See Table 2 for the detailed timing results of our decider on the benchmarks described above.
Approximate value computation experiments.
We evaluate our implementation of the algorithm presented in Section 5 by computing an estimate $\tilde{\delta}$ such that $|\tilde{\delta} - d_{\text{trans-}F}(\pi, \sigma)| \le \epsilon$ with a choice of $\epsilon = 10^{-}$. In particular, we compare the performances of the different approaches discussed in Section 5:
Binary Search:
Binary search using our Fréchet-under-translation decider of Section 4.
Lipschitz-only:
Algorithm 2 without the arrangement, i.e., without lines 18 to 23.
Lipschitz-meets-Fréchet (LMF):
Our implementation as detailed in Section 5.

Since simple estimates show that the $\epsilon$-approximate sets are clearly too costly for $\epsilon = 10^{-}$, we drop this approach from all further consideration. We took care to implement all approaches with a similar effort of low-level optimizations.

For our evaluation, we focus on the Characters data set, which allows us to distinguish the rough shape of the curves: We subdivide the curve set into the subsets $C_\alpha$ for $\alpha \in \Sigma$ (where $\Sigma$ is the set of 20 characters occurring in Characters). In particular, for each character pair $\alpha, \beta \in \Sigma$, we create a sample of $N_{\mathrm{samples}}$ curve pairs $(\pi, \sigma)$ chosen uniformly at random from $C_\alpha \times C_\beta$. For $N_{\mathrm{samples}} = 5$, computing the value (up to $\epsilon = 10^{-}$) for all $N_{\mathrm{samples}} \cdot \left(\binom{|\Sigma|}{2} + |\Sigma|\right) = 1050$ sampled curve pairs gives the statistics shown in Table 3.

Since already for this example the Lipschitz-only approach is dominated by almost a factor of 30 by LMF (and by a factor of almost 8 by binary search), we perform more detailed analyses with $N_{\mathrm{samples}} = 100$ only for LMF and binary search. The overall performance is given in Table 4. Also here LMF is more than 3 times faster than binary search. To give more insights into the relationship of their running times, we give a scatter plot of the running times of LMF and binary search on the same instances over the complete benchmark in Figure 6, showing that binary search generally outperforms LMF only on instances which are comparably easy for LMF as well. The advantage of LMF becomes particularly clear on hard instances.

Apart from these general statistics for our value computation benchmarks, we depict individual mean computation times and mean number of black-box calls (over all $N_{\mathrm{samples}}$ samples) for each character pair $\alpha, \beta \in \Sigma$ in Figures 7 and 8.

Finally, we give the average distance values on our benchmark set both under a fixed translation (specifically, with start points of $\pi$ and $\sigma$ normalized to the origin) and under translation in Figure 9. Note that using naive approaches, computing these tables would have been computationally extremely costly.

Footnote (on the arrangement size estimate): We only give an estimate for the arrangement size, since the size of the arrangement is too large to be evaluated exactly for all our benchmark queries within a day. Specifically, we estimate the number of vertices of the arrangement, which closely corresponds to the number of faces by Euler’s formula. We give the following estimate: We first determine a search box $B$ for the given decision instance $\pi = (\pi_1, \dots, \pi_n)$, $\sigma = (\sigma_1, \dots, \sigma_m)$, $\delta$ as described for our algorithm. We then sample $S = 100000$ tuples $i_1, i_2 \in \{1, \dots, n\}$, $j_1, j_2 \in \{1, \dots, m\}$ and count the number $I$ of intersections of the circles of radius $\delta$ around $\pi_{i_1} - \sigma_{j_1}$ and $\pi_{i_2} - \sigma_{j_2}$ inside $B$. The number ( I/S ) · ( nm ) is the estimated number of circle-circle intersections in $B$. Adding the number of circle-box intersections, which we can compute exactly, yields our estimate.

Footnote (on the additive approximation): Here we use additive rather than multiplicative approximation for technical reasons. Since all computed distances in our benchmarks are larger than 1, this also yields a multiplicative $(1 + \epsilon)$-approximation.

[Figure 5: three panels (Same-characters, All-characters, Sigspatial); x-axis: distance factor; y-axis: black-box calls.]
Figure 5: Number of black-box calls to the fixed-translation Fréchet decider made by our decider (below, in green), as well as an estimate of the arrangement complexity, i.e., number of calls of a naive algorithm (above, in black). We plot the mean number of calls and arrangement complexity over 1000 NO (or YES) queries with a test distance of approximately $(1 - 4^{-\ell})$ (or $(1 + 4^{-\ell})$) times the true Fréchet distance under translation, as well as the interval between the lower and upper quartile over the queries.

Table 2: Time measurements for the components of the decider over the complete decider benchmark sets. In parentheses, we give average values over the total of 23,000 decision instances. (Columns: Time, Black-Box Calls; one sub-table each for same-characters, all-characters, and sigspatial.)

Table 3: Statistics for approximate value computation for $N_{\mathrm{samples}} = 5$. In parentheses we show the mean values averaged over a total of 1050 instances.

Approach          Time                                        Black-Box Calls
LMF               148,032 ms (141.0 ms per instance)          13,323,232 (12,688.8 per instance)
Binary Search     536,853 ms (511.3 ms per instance)          45,909,628 (43,723.5 per instance)
Lipschitz-only    4,204,521 ms (4,004.3 ms per instance)      820,468,224 (781,398.3 per instance)

[Figure 6: scatter plot; x-axis: Binary Search (ms); y-axis: LMF (ms).]
Figure 6: Running times of LMF and binary search on set of randomly sampled Characters curves.

We engineer the first practical implementation for the discrete Fréchet distance under translation in the plane. While previous algorithmic solutions for the problem solve it via expensive discrete methods, we introduce a new method from continuous optimization to achieve significant speed-ups on realistic inputs. This is analogous to the success of integer programming solvers which, while optimizing a discrete problem, choose to work over the reals to gain access to linear programming relaxations, cutting plane methods, and more. A novelty here is that we successfully apply such methods to obtain drastic speed-ups for a polynomial-time problem.

We leave as open problems to determine whether there are reasonable analogues of further ideas from integer programming, such as cutting plane methods or preconditioning, that could help to get further improved algorithms for our problem. More generally, we believe that this gives an exciting direction for algorithm engineering in general that should find wider applications. A particular direction in this vein is the use of our methods to compute rotation- or scaling-invariant versions of the Fréchet distance. Intuitively, by introducing additional dimensions in our search space, our methods can in principle also be used to optimize over such additional degrees of freedom. However, the Lipschitz condition changes significantly, and we leave it to future work to determine the applicability in these settings.
[Figure 7: two heat maps over all pairs of the characters a, b, c, d, e, g, h, l, m, n, o, p, q, r, s, u, v, w, y, z, each with a color key and histogram.]
Figure 7: Log of mean value computation time in ms for LMF (left) and Binary Search (right).

[Figure 8: two heat maps over all pairs of the same characters, each with a color key and histogram.]
Figure 8: Log of mean number of black-box calls for LMF (left) and Binary Search (right).

[Figure 9: three heat maps over all pairs of the same characters, each with a color key and histogram.]
Figure 9: Average Fréchet distance value (top left) and average Fréchet distance under translation value (top right), as well as the quotient of these values (bottom left).

Table 4: Statistics for approximate value computation for $N_{\mathrm{samples}} = 100$. In parentheses, we give average values over the total of 21,000 curve pairs.

Algorithm                          Time                                       Black-Box Calls
LMF                                2,938,512 ms (140.0 ms per instance)       260,128,449 (12,387.1 per instance)
  - Preprocessing                  71,728 ms
  - Black-box calls (Lipschitz)    400,189 ms
  - Arrangement estimation         166,479 ms
  - Arrangement algorithm          2,250,493 ms
      * Construction               1,537,500 ms
      * Black-box calls            545,442 ms
Binary Search                      10,555,630 ms (502.7 ms per instance)      875,424,988 (41,686.9 per instance)
References

[1] ACM SIGSPATIAL GIS Cup 2017 Data Set. Accessed: 2018-12-03.
[2] Character Trajectories Data Set. https://archive.ics.uci.edu/ml/datasets/Character+Trajectories. Accessed: 2018-12-03.
[3] Pankaj K. Agarwal, Rinat Ben Avraham, Haim Kaplan, and Micha Sharir. Computing the discrete Fréchet distance in subquadratic time. SIAM J. Comput., 43(2):429–449, 2014. doi:10.1137/130920526.
[4] Helmut Alt and Michael Godau. Computing the Fréchet distance between two polygonal curves. Internat. J. Comput. Geom. Appl., 5(1–2):78–99, 1995.
[5] Helmut Alt, Christian Knauer, and Carola Wenk. Matching polygonal curves with respect to the Fréchet distance. In Proc. 18th Annual Symposium on Theoretical Aspects of Computer Science (STACS’01), pages 63–74, 2001.
[6] Julian Baldus and Karl Bringmann. A fast implementation of near neighbors queries for Fréchet distance (GIS Cup). In Proc. of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2017), pages 99:1–99:4. ACM, 2017. URL: http://doi.acm.org/10.1145/3139958.3140062, doi:10.1145/3139958.3140062.
[7] Rinat Ben Avraham, Haim Kaplan, and Micha Sharir. A faster algorithm for the discrete Fréchet distance under translation. ArXiv preprint http://arxiv.org/abs/1501.03724, 2015.
[8] Karl Bringmann. Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails. In Proc. 55th Ann. IEEE Symposium on Foundations of Computer Science (FOCS’14), pages 661–670, 2014.
[9] Karl Bringmann, Marvin Künnemann, and André Nusser. Fréchet distance under translation: Conditional hardness and an algorithm via offline dynamic grid reachability. In Proc. 30th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2019), pages 2902–2921, 2019. doi:10.1137/1.9781611975482.180.
[10] Karl Bringmann, Marvin Künnemann, and André Nusser. Walking the dog fast in practice: Algorithm engineering of the Fréchet distance. In Proc. 35th International Symposium on Computational Geometry (SoCG 2019), pages 17:1–17:21, 2019. doi:10.4230/LIPIcs.SoCG.2019.17.
[11] Kevin Buchin, Maike Buchin, Wouter Meulemans, and Wolfgang Mulzer. Four Soviets walk the dog: Improved bounds for computing the Fréchet distance. Discrete & Computational Geometry, 58(1):180–216, 2017. doi:10.1007/s00454-017-9878-7.
[12] Kevin Buchin, Maike Buchin, and Yusu Wang. Exact algorithms for partial curve matching via the Fréchet distance. In Proc. 20th Annu. ACM-SIAM Symp. Discrete Algorithms (SODA’09), pages 645–654, 2009.
[13] Kevin Buchin, Yago Diez, Tom van Diggelen, and Wouter Meulemans. Efficient trajectory queries under the Fréchet distance (GIS Cup). In Proc. of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2017), pages 101:1–101:4. ACM, 2017. URL: http://doi.acm.org/10.1145/3139958.3140064, doi:10.1145/3139958.3140064.
[14] Kevin Buchin, Anne Driemel, Natasja van de L’Isle, and André Nusser. klcluster: Center-based clustering of trajectories. In Proc. of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL 2019), pages 496–499. ACM, 2019. doi:10.1145/3347146.3359111.
[15] Kevin Buchin, Tim Ophelders, and Bettina Speckmann. SETH says: Weak Fréchet distance is faster, but only if it is continuous and in one dimension. In Timothy M. Chan, editor, Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, pages 2887–2901. SIAM, 2019. doi:10.1137/1.9781611975482.179.
[16] Matteo Ceccarello, Anne Driemel, and Francesco Silvestri. FRESH: Fréchet similarity with hashing. In
Proc. of the 16th International Symposium on Algorithms and DataStructures (WADS 2019) , volume 11646 of
LNCS , pages 254–268. Springer, 2019. doi:10.1007/978-3-030-24766-9\_19 . Anne Driemel, Sariel Har-Peled, and Carola Wenk. Approximating the Fréchet distance forrealistic curves in near linear time.
Discrete & Computational Geometry , 48(1):94–127, Jul2012. doi:10.1007/s00454-012-9402-z . Fabian Dütsch and Jan Vahrenhold. A filter-and-refinement-algorithm for range queries basedon the Fréchet distance (GIS Cup). In
Proc. of the 25th ACM SIGSPATIAL InternationalConference on Advances in Geographic Information Systems (SIGSPATIAL 2017) , pages100:1–100:4. ACM, 2017. URL: http://doi.acm.org/10.1145/3139958.3140063 , doi:10.1145/3139958.3140063 . Thomas Eiter and Heikki Mannila. Computing discrete Fréchet distance. Technical ReportCD-TR 94/64, Christian Doppler Laboratory for Expert Systems, TU Vienna, Austria, 1994. Efim A. Galperin. The cubic algorithm.
Journal of Mathematical Analysis and Applica-tions , 112(2):635 – 640, 1985. URL: , doi:https://doi.org/10.1016/0022-247X(85)90268-9 . Efim A. Galperin. Two alternatives for the cubic algorithm.
Journal of Mathematical Analysisand Applications , 126(1):229 – 237, 1987. URL: , doi:https://doi.org/10.1016/0022-247X(87)90088-6 . Efim A. Galperin. Precision, complexity, and computational schemes of the cubic algorithm.
Journal of Optimization Theory and Applications , 57(2):223 – 238, 1988. doi:https://doi.org/10.1007/BF00938537 . Efim A. Galperin. The fast cubic algorithm.
Computers & Mathematics with Applica-tions , 25(10):147 – 160, 1993. URL: , doi:https://doi.org/10.1016/0898-1221(93)90289-8 . E. Gourdin, P. Hansen, and B. Jaumard. Global optimization of multivariate lipschitz functions:Survey and computational comparison, 1994. Pierre Hansen and Brigitte Jaumard. Lipschitz optimization. In Reiner Horst and Panos M.Pardalos, editors,
Handbook of Global Optimization , pages 407–493. Springer US, Boston, MA,1995. doi:10.1007/978-1-4615-2025-2_9 . Reiner Horst. A general class of branch-and-bound methods in global optimization with somenew approaches for concave minimization.
Journal of Optimization Theory and Applications ,51:271 – 291, 1986. doi:https://doi.org/10.1007/BF00939825 . Reiner Horst and Hoang Tuy. On the convergence of global methods in multiextremaloptimization.
Journal of Optimization Theory and Applications , 54:253 – 271, 1987. doi:https://doi.org/10.1007/BF00939434 . Reiner Horst and Hoang Tuy.
Global Optimization – Deterministic Approaches . SpringerBerlin Heidelberg, 3rd edition, 1996. Minghui Jiang, Ying Xu, and Binhai Zhu. Protein structure–structure alignment with discreteFréchet distance.
J. Bioinformatics and Computational Biology , 6(01):51–64, 2008. Nimrod Megiddo. Applying parallel computation algorithms in the design of serial algorithms.
Journal of the ACM , 30(4):852–865, 1983. Axel Mosig and Michael Clausen. Approximately matching polygonal curves with respect tothe fréchet distance.
Computational Geometry , 30(2):113 – 127, 2005. Special Issue on the19th European Workshop on Computational Geometry. URL: , doi:https://doi.org/10.1016/j.comgeo.2004.05.004 . S.A. Piyavskii. An algorithm for finding the absolute extremum of a function.
USSRComputational Mathematics and Mathematical Physics , 12(4):57 – 67, 1972. URL: , doi:https://doi.org/10.1016/0041-5553(72)90115-2 . Ron Wein, Eric Berberich, Efi Fogel, Dan Halperin, Michael Hemmer, Oren Salzman, andBaruch Zukerman. 2D arrangements. In
CGAL User and Reference Manual . CGAL EditorialBoard, 5.0.2 edition, 2020. URL: https://doc.cgal.org/5.0.2/Manual/packages.html . Carola Wenk.
Shape matching in higher dimensions . PhD thesis, Freie Universität Berlin,2002. PhD Thesis. Martin Werner and Dev Oliver. ACM SIGSPATIAL GIS cup 2017: Range queries underFréchet distance.
SIGSPATIAL Special , 10(1):24–27, 2018. L.A. Wolsey.