Finding the Homology of Manifolds using Ellipsoids

Sara Kališnik and Davorin Lešnik

Abstract

A standard problem in applied topology is how to discover topological invariants of data from a noisy point cloud that approximates it. We consider the case where a sample is drawn from a properly embedded $C^1$-submanifold without boundary in a Euclidean space. We show that we can deformation retract the union of ellipsoids, centered at sample points and stretching in the tangent directions, to the manifold. Hence the homotopy type, and therefore also the homology type, of the manifold is the same as that of the nerve complex of the cover by ellipsoids. By thickening sample points to ellipsoids rather than balls, our results require a smaller sample density than comparable results in the literature. They also advocate using elongated shapes in the construction of barcodes in persistent homology.

Data is often unstructured and comes in the form of a non-empty finite metric space, called a point cloud. It is often very high dimensional even though data points are actually samples from a low-dimensional object (such as a manifold) that is embedded in a high-dimensional space. One reason may be that many features are all measurements of the same underlying cause and therefore closely related to each other. For example, if you take photos of a single object from multiple angles simultaneously, there is a lot of overlap in the information captured by all those cameras. One of the main tasks of ‘manifold learning’ is to design algorithms to estimate geometric and topological properties of the manifold from the sample points lying on this unknown manifold.

One successful framework for dealing with the problem of reconstructing shapes from point clouds is based on the notion of $\epsilon$-sample introduced by Amenta et al. [1]. A sampling of a shape $\mathcal{M}$ is an $\epsilon$-sampling if every point $P$ in $\mathcal{M}$ has a sample point at distance at most $\epsilon \cdot \mathrm{lfs}_{\mathcal{M}}(P)$, where $\mathrm{lfs}_{\mathcal{M}}(P)$ is the local feature size of $P$, i.e. the distance from $P$ to the medial axis of $\mathcal{M}$.
Surfaces smoothly embedded in $\mathbb{R}^3$ can be reconstructed homeomorphically from any 0.

One simple method for shape reconstruction is to output an offset of the sampling for a suitable value $\alpha$ of the offset parameter. Topologically, this is equivalent to taking the Čech complex or the $\alpha$-complex [18]. This leads to the problem of finding theoretical guarantees as to when an offset of a sampling has the same homotopy type as the underlying set. In other words, we need to find conditions on a point cloud $\mathcal{S}$ of a shape $\mathcal{M}$ so that the thickening of $\mathcal{S}$ is homotopy equivalent to $\mathcal{M}$. This only works if the point cloud is sufficiently close to $\mathcal{M}$, i.e. when there is a bound on the Hausdorff distance between $\mathcal{S}$ and $\mathcal{M}$.

Niyogi, Smale and Weinberger [22] proved that this method indeed provides reconstructions having the correct homotopy type for densely enough sampled smooth submanifolds of $\mathbb{R}^n$. More precisely, one can capture the homotopy type of a Riemannian submanifold $\mathcal{M}$ without boundary of reach $\tau$ in a Euclidean space from a finite $\epsilon$-dense sample $\mathcal{S} \subseteq \mathcal{M}$ whenever $\epsilon < \sqrt{3/5}\,\tau$ by showing that the union of $\epsilon$-balls with centers in sample points deformation retracts to $\mathcal{M}$.

Let us denote the Hausdorff distance between $\mathcal{S}$ and $\mathcal{M}$ by $\kappa$; that is, every point in $\mathcal{M}$ has an at most $\kappa$-distant point in $\mathcal{S}$. We can rephrase the above result as follows: whenever $2\kappa < \sqrt{3/5}\,\tau$, the homotopy type of $\mathcal{M}$ is captured by a union of $\epsilon$-balls with centers in $\mathcal{S}$ for every $\epsilon \in \mathbb{R}_{(2\kappa,\, \sqrt{3/5}\,\tau)}$. Thus the bound on the ratio is $\frac{\kappa}{\tau} < \frac{1}{2}\sqrt{3/5} \approx 0.387$.

Other authors gave variants of Niyogi, Smale and Weinberger’s result. In [5, Theorem 2.8], the authors relax the conditions on the set we wish to approximate (it need not be a manifold, just any non-empty compact subset of a Euclidean space) and the sample (it need not be finite, just non-empty compact), but the price they pay for this is a much lower upper bound on $\frac{\kappa}{\tau}$, which in their case is ≈ . µ-reach [8], [10], [12], [23].

However, in practice producing a sufficiently dense sample can be difficult or can require a long time [16], so relaxing the upper bound on $\frac{\kappa}{\tau}$ is desirable. The purpose of this paper is to prove that we can indeed relax this bound when sampling manifolds (though we allow a more general class than [22]) if we thicken sample points to ellipsoids rather than balls. The idea is that since a differentiable manifold is locally well approximated by its tangent space, an ellipsoid with its major semi-axes in the tangent directions approximates the manifold well. This idea first appeared in [4], where the authors construct a filtration of “ellipsoid-driven complexes”, where the user can choose the ratio between the major (tangent) and the minor (normal) semi-axes. Their experiments showed that computing barcodes from ellipsoid-driven complexes strengthened the topological signal, in the sense that the bars corresponding to features of the data were longer. In our paper we make the ratio dependent on the persistence parameter and give a proof that the union of ellipsoids around sample points (under suitable assumptions) deformation retracts onto the manifold. Hence our paper gives theoretical guarantees that the union of ellipsoids captures the manifold’s homotopy type, and thus further justifies the use of ellipsoid-inspired shapes to construct barcodes.

The central theorem of this paper (Theorem 6.1) is the following:

Theorem.

Let $n \in \mathbb{N}$ and let $\mathcal{M}$ be a non-empty properly embedded $C^1$-submanifold of $\mathbb{R}^n$ without boundary. Let $\mathcal{S} \subseteq \mathcal{M}$ be a subset of $\mathcal{M}$, locally finite in $\mathbb{R}^n$ (the sample from the manifold $\mathcal{M}$). Let $\tau$ be the reach of $\mathcal{M}$ in $\mathbb{R}^n$ and $\kappa$ the Hausdorff distance between $\mathcal{S}$ and $\mathcal{M}$. Then for all p ∈ R [0 . τ, . τ ] which satisfy κ < (cid:113) p (cid:0)(cid:112) τ ( p + 2 τ ) − τ (cid:1) − . τ there exists a strong deformation retraction from $E_p$ (the union of open ellipsoids around sample points with normal semi-axes of length $p$ and tangent semi-axes of length $\sqrt{\tau p + p^2}$) to $\mathcal{M}$. In particular, $E_p$ and $\mathcal{M}$ are homotopy equivalent, and so have the same homology.

By replacing the balls with ellipsoids, we manage to push the upper bound on $\frac{\kappa}{\tau}$ to approximately 0 . . 36 compared to [22]. In other words, our method allows samples with less than half the density.

The paper is organized as follows. Section 2 lays the groundwork for the paper, providing requisite definitions and deriving some results for general differentiable submanifolds of Euclidean spaces. In Section 3 we calculate theoretical bounds on the persistence parameter $p$: the lower bound ensures that the union of ellipsoids covers the manifold and the upper bound ensures that the union does not intersect the medial axis. Part of our proof relies on the normal deformation retraction working on intersections of ellipsoids, which appears too difficult to prove theoretically by hand, so we resort to a computer program, explained in Section 4. In Section 5 we construct a deformation retraction from the union of ellipsoids to the manifold. The section is divided into several subsections for easier reading. Section 6 collects the results from the paper to prove the main theorem. In Section 7 we discuss our results and future work.

Notation

Natural numbers $\mathbb{N} = \{0, 1, 2, \ldots\}$ include zero. Unbounded real intervals are denoted by $\mathbb{R}_{>a}$, $\mathbb{R}_{\leq a}$ etc. Bounded real intervals are denoted by $\mathbb{R}_{(a,b)}$ (open), $\mathbb{R}_{[a,b]}$ (closed) etc.

Glossary:

$d$ : Euclidean distance in $\mathbb{R}^n$
$\mathcal{N}$ : a submanifold of $\mathbb{R}^n$
$\mathcal{M}$ : $m$-dimensional $C^1$-submanifold of $\mathbb{R}^n$, embedded as a closed subset
$\mathcal{M}_r$ : open $r$-offset of $\mathcal{M}$, i.e. $\{X \in \mathbb{R}^n \mid d(\mathcal{M}, X) < r\}$
$\overline{\mathcal{M}}_r$ : closed $r$-offset of $\mathcal{M}$, i.e. $\{X \in \mathbb{R}^n \mid d(\mathcal{M}, X) \leq r\}$
$T_X\mathcal{M}$ : tangent space on $\mathcal{M}$ at $X$
$N_X\mathcal{M}$ : normal space on $\mathcal{M}$ at $X$
$\mathcal{S}$ : manifold sample (a subset of $\mathcal{M}$), non-empty and locally finite
$\kappa$ : the Hausdorff distance between $\mathcal{M}$ and $\mathcal{S}$
$\mathcal{A}$ : the medial axis of $\mathcal{M}$
$\tau$ : the reach of $\mathcal{M}$
$p$ : persistence parameter
$E_p(S)$ : open ellipsoid with the center in a sample point $S \in \mathcal{S}$ with the major semi-axes tangent to $\mathcal{M}$
$\overline{E}_p(S)$ : closed ellipsoid with the center in a sample point $S \in \mathcal{S}$ with the major semi-axes tangent to $\mathcal{M}$
$\partial E_p(S)$ : the boundary of $E_p(S)$, i.e. $\overline{E}_p(S) \setminus E_p(S)$
$E_p$ : the union of open ellipsoids over the sample, $\bigcup_{S \in \mathcal{S}} E_p(S)$
$\overline{E}_p$ : the union of closed ellipsoids over the sample, $\bigcup_{S \in \mathcal{S}} \overline{E}_p(S)$
$\mathrm{pr}$ : the map $\mathbb{R}^n \setminus \mathcal{A} \to \mathcal{M}$ taking a point to the unique closest point on $\mathcal{M}$
$\mathrm{prv}$ : the map taking a point $X$ to the vector $\mathrm{pr}(X) - X$
$\widetilde{V}_S$ : auxiliary vector field, defined on $E_p(S)$
$\widetilde{V}$ : auxiliary vector field, defined on $E_p$
$V$ : the vector field of directions for the deformation retraction
$\Phi$ : the flow of the vector field $V$
$R$ : a deformation retraction from $E_p$ to a tubular neighbourhood of $\mathcal{M}$

All constructions in this paper are done in an ambient Euclidean space $\mathbb{R}^n$, $n \in \mathbb{N}$, equipped with the usual Euclidean metric $d$. We will use the symbol $\mathcal{N}$ for a general submanifold of $\mathbb{R}^n$.

By definition each point $X$ of a manifold $\mathcal{N}$ has a neighbourhood, homeomorphic to a Euclidean space or a closed Euclidean half-space. The dimension of this (half-)space is the dimension of $\mathcal{N}$ at $X$. Different points of a manifold can have different dimensions, though the dimension is constant on each connected component. (A simple example is the submanifold of the plane, given by the equation $(x^2 + y^2)^2 = x^2 + y^2$, which is a union of a point and a circle.) In this paper, when we say that $\mathcal{N}$ is an $m$-dimensional manifold, we mean that it has dimension $m$ at its every point.

We quickly recall from general topology that it is equivalent for a subset of a Euclidean space to be closed and to be properly embedded.

Proposition 2.1.

Let $(\mathcal{X}, d)$ be a metric space in which every closed ball is compact (every Euclidean space $\mathbb{R}^n$ satisfies this property). The following statements are equivalent for any subset $\mathcal{S} \subseteq \mathcal{X}$.

1. $\mathcal{S}$ is a closed subset of $\mathcal{X}$.
2. $\mathcal{S}$ is properly embedded into $\mathcal{X}$, i.e. the inclusion $\mathcal{S} \hookrightarrow \mathcal{X}$ is a proper map. (Recall that a map is proper when the preimage of every compact subspace of the codomain is compact.)
3. $\mathcal{S}$ is empty or distances from points in the ambient space to $\mathcal{S}$ are attained. That is, for every $X \in \mathcal{X}$ there exists $Y \in \mathcal{S}$ such that $d(X, \mathcal{S}) = d(X, Y)$.

Proof.

• (1 ⇒ 2) If $\mathcal{S}$ is closed in $\mathcal{X}$, then its intersection with a compact subset of $\mathcal{X}$ is compact, so $\mathcal{S}$ is properly embedded into $\mathcal{X}$.

• (2 ⇒ 3) If $\mathcal{S}$ is non-empty, pick $S \in \mathcal{S}$. For any $X \in \mathcal{X}$ we have $d(X, \mathcal{S}) \leq d(X, S)$, so $d(X, \mathcal{S}) = d\big(X, \mathcal{S} \cap \overline{B}_{d(X,S)}(X)\big)$. Since $\mathcal{S}$ is properly embedded in $\mathcal{X}$, its intersection with the compact closed ball $\overline{B}_{d(X,S)}(X)$ is compact also. A continuous map from a non-empty compact space into reals attains its minimum, so there exists $Y \in \mathcal{S}$ such that $d(X, Y) = d\big(X, \mathcal{S} \cap \overline{B}_{d(X,S)}(X)\big) = d(X, \mathcal{S})$.

• (3 ⇒ 1) The empty set is closed, so assume $\mathcal{S}$ is non-empty. Then for every point in the closure $X \in \overline{\mathcal{S}}$ we have $d(X, \mathcal{S}) = 0$. By assumption this distance is attained, i.e. we have $Y \in \mathcal{S}$ such that $d(X, Y) = 0$, so $X = Y \in \mathcal{S}$. Thus $\overline{\mathcal{S}} \subseteq \mathcal{S}$, so $\mathcal{S}$ is closed.

In this paper we consider exclusively submanifolds of a Euclidean space which are properly embedded, so closed subsets. We mostly use the term ‘properly embedded’ instead of ‘closed’ to avoid confusion: the term ‘closed manifold’ is usually used in the sense ‘compact manifold with no boundary’, which is a stronger condition (a properly embedded submanifold need not be compact or without boundary, though every compact submanifold is properly embedded).

k ∈ N ≥ ∪ {∞} ; in that case it is called a $C^k$-manifold. A $C^k$-submanifold of $\mathbb{R}^n$ is a $C^k$-manifold which is a subset of $\mathbb{R}^n$ and the inclusion map is $C^k$.

If $\mathcal{N}$ is at least a $C^1$-manifold, one may abstractly define the tangent space $T_X\mathcal{N}$ and the normal space $N_X\mathcal{N}$ at any point $X \in \mathcal{N}$ ($X$ is allowed to be a boundary point). As we restrict ourselves to submanifolds of $\mathbb{R}^n$, we also treat the tangent and the normal space as affine subspaces of $\mathbb{R}^n$, with the origins of $T_X\mathcal{N}$ and $N_X\mathcal{N}$ placed at $X$. The dimension of $T_X\mathcal{N}$ (resp. $N_X\mathcal{N}$) is the same as the dimension (resp. codimension) of $\mathcal{N}$ at $X$. Because of this and because $T_X\mathcal{N}$ and $N_X\mathcal{N}$ are orthogonal, they together generate $\mathbb{R}^n$.

Definition 2.2.

Let $\mathcal{N}$ be a $C^1$-submanifold of $\mathbb{R}^n$, $X \in \mathcal{N}$ and $m$ the dimension of $\mathcal{N}$ at $X$.

• A tangent-normal coordinate system at $X \in \mathcal{N}$ is an $n$-dimensional orthonormal coordinate system with the origin in $X$, the first $m$ coordinate axes tangent to $\mathcal{N}$ at $X$ and the last $n - m$ axes normal to $\mathcal{N}$ at $X$.

• A planar tangent-normal coordinate system at $X \in \mathcal{N}$ is a two-dimensional plane in $\mathbb{R}^n$ containing $X$, together with the choice of an orthonormal coordinate system lying on it, with the origin in $X$, the first axis (the abscissa) tangent to $\mathcal{N}$ at $X$ and the second axis (the ordinate) normal to $\mathcal{N}$ at $X$.

Recall from Proposition 2.1 that distances from points to a non-empty properly embedded submanifold are attained. However, these distances need not be attained in just one point. As usual, we define the medial axis $\mathcal{A}_{\mathcal{N}}$ of a submanifold $\mathcal{N} \subseteq \mathbb{R}^n$ as the set of all points in the ambient space for which the distance to $\mathcal{N}$ is attained in at least two points:

$$\mathcal{A}_{\mathcal{N}} := \big\{ X \in \mathbb{R}^n \;\big|\; \exists\, Y', Y'' \in \mathcal{N} \,.\; Y' \neq Y'' \wedge d(X, Y') = d(X, Y'') = d(X, \mathcal{N}) \big\}.$$

If $\mathcal{N}$ is empty, so is $\mathcal{A}_{\mathcal{N}}$, though the medial axis can be empty even for non-empty manifolds (consider for example a line or a line segment in a plane). The manifold and its medial axis are always disjoint.

The reach of $\mathcal{N}$ is the distance between the manifold $\mathcal{N}$ and its medial axis $\mathcal{A}_{\mathcal{N}}$ (if $\mathcal{A}_{\mathcal{N}}$ is empty, the reach is defined to be $\infty$).

Definition 2.3.

Let $\mathcal{N}$ be a $C^1$-submanifold of $\mathbb{R}^n$, $X \in \mathcal{N}$ and $\vec{N}$ a non-zero normal vector to $\mathcal{N}$ at $X$. The $\tau_{\mathcal{N}}$-ball, associated to $X$ and $\vec{N}$, is the closed ball (in $\mathbb{R}^n$, so $n$-dimensional) with radius $\tau_{\mathcal{N}}$ and center in $X + \tau_{\mathcal{N}} \frac{\vec{N}}{\|\vec{N}\|}$, which therefore touches $\mathcal{N}$ at $X$. A $\tau_{\mathcal{N}}$-ball, associated to $X$, is the $\tau_{\mathcal{N}}$-ball, associated to $X$ and some non-zero normal vector to $\mathcal{N}$ at $X$. (If $\tau_{\mathcal{N}} = \infty$, this “ball” is the whole closed half-space which contains $\vec{N}$ and the boundary of which is the hyperplane, orthogonal to $\vec{N}$, which contains $X$.)

$\tau_{\mathcal{N}}$-balls is that they provide restrictions to where a manifold is situated. Specifically, a manifold is disjoint with the interior of its every associated $\tau_{\mathcal{N}}$-ball.

We will approximate manifolds with a union of ellipsoids (similarly to how one uses a union of balls to approximate a subspace in the case of a Čech complex). The idea is to use ellipsoids which are elongated in directions tangent to the manifold, so that they “extend longer in the direction the manifold does” and we therefore require a sample with lower density. Let us define the kind of ellipsoids we use in this paper.

Definition 2.4.

Let $\mathcal{N}$ be a $C^1$-submanifold of $\mathbb{R}^n$ and $p \in \mathbb{R}_{>0}$. The tangent-normal open (resp. closed) $p$-ellipsoid at $X \in \mathcal{N}$ is the open (resp. closed) ellipsoid in $\mathbb{R}^n$ with the center in $X$, the tangent semi-axes of length $\sqrt{\tau_{\mathcal{N}} p + p^2}$ and the normal semi-axes of length $p$. Explicitly, in a tangent-normal coordinate system at $X$ the tangent-normal open and closed $p$-ellipsoids are given by

$$E_p(X) := \Big\{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;\Big|\; \frac{x_1^2 + \ldots + x_m^2}{\tau_{\mathcal{N}} p + p^2} + \frac{x_{m+1}^2 + \ldots + x_n^2}{p^2} < 1 \Big\},$$

$$\overline{E}_p(X) := \Big\{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;\Big|\; \frac{x_1^2 + \ldots + x_m^2}{\tau_{\mathcal{N}} p + p^2} + \frac{x_{m+1}^2 + \ldots + x_n^2}{p^2} \leq 1 \Big\},$$

where $m$ denotes the dimension of $\mathcal{N}$ at $X$. If $\tau_{\mathcal{N}} = \infty$, then these “ellipsoids” are simply thickenings of $T_X\mathcal{N}$:

$$E_p(X) := \Big\{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;\Big|\; \sqrt{x_{m+1}^2 + \ldots + x_n^2} < p \Big\},$$

$$\overline{E}_p(X) := \Big\{ (x_1, \ldots, x_n) \in \mathbb{R}^n \;\Big|\; \sqrt{x_{m+1}^2 + \ldots + x_n^2} \leq p \Big\}.$$

Observe that the definitions of ellipsoids are independent of the choice of the tangent-normal coordinate system; they depend only on the submanifold itself.

The value $p$ in the definition of ellipsoids serves as a “persistence parameter” [20], [6], [7], [17], [4]. We purposefully do not take ellipsoids which are similar at all $p$ (which would mean that the ratio between the tangent and the normal semi-axes was constant). Rather, we want ellipsoids which are more elongated (have higher eccentricity) for smaller $p$. This is because on a smaller scale a smooth manifold more closely aligns with its tangent space, and then so should the ellipsoids. We want the length of the major semi-axes to be a function of $p$ with the following properties: for each $p$ its value is larger than $p$, and when $p$ goes to $0$, the function value also goes to $0$, but the eccentricity goes to $1$. In addition, the function should allow the following argument. If we change the unit length of the coordinate system, but otherwise leave the manifold “the same”, we want the ellipsoids to remain “the same” as well, but the reach of the manifold changes by the same factor as the unit length, which the function should take into account. The simplest function satisfying all these properties is arguably $\sqrt{\tau_{\mathcal{N}} p + p^2}$, which turns out to work for the results we want.

Figure 1: Tangent-normal coordinate system.

Figure 1 shows an example of how a manifold, associated balls and a tangent-normal ellipsoid look in a tangent-normal coordinate system at some point on the manifold.

We now prove a few results that will be useful later.

Lemma 2.5.

Let $\mathcal{N}$ be a properly embedded $C^1$-submanifold of $\mathbb{R}^n$. Let $X \in \mathcal{N}$ and let $m$ be the dimension of $\mathcal{N}$ at $X$. Assume $0 < m < n$.

1. For every $Y \in \mathbb{R}^n$ a planar tangent-normal coordinate system at $X \in \mathcal{N}$ exists which contains $Y$. Without loss of generality we may require that the coordinates of $Y$ in this coordinate system are non-negative ($Y$ lies in the closed first quadrant).

2. If $p \in \mathbb{R}_{>0}$, $Y \in \partial E_p(X)$ and $\vec{N}$ is a vector, normal to $\partial E_p(X)$ at $Y$, then we may additionally assume that the planar tangent-normal coordinate system from the previous item contains $\vec{N}$.

3. Let $O$ be a closed $(n - m + 1)$-dimensional ball, $C^1$-embedded in $\mathbb{R}^n$ (in particular $\partial O$ is a $C^1$-submanifold of $\mathbb{R}^n$, diffeomorphic to an $(n - m)$-dimensional sphere). Assume that $O \cap \partial\mathcal{N} = \emptyset$ and that $\mathcal{N}$ and $\partial O$ intersect transversely in $X$. Then $X$ is not the only intersection point, i.e. there exists $Y \in \mathcal{N} \cap \partial O \setminus \{X\}$.

4. Assume $\tau_{\mathcal{N}} < \infty$. Let $Y \in \mathbb{R}^n$ and let $(y_T, y_N)$ be the (non-negative) coordinates of $Y$ in the planar tangent-normal coordinate system from the first item. Let $D$ be the set of centers of all $\tau_{\mathcal{N}}$-balls, associated to $X$ (i.e. the $(n - m - 1)$-dimensional sphere within $N_X\mathcal{N}$ with the center in $X$ and the radius $\tau_{\mathcal{N}}$). Let $C$ be the cone which is the convex hull of $D \cup \{Y\}$, and assume that $C \cap \partial\mathcal{N} = \emptyset$. Then

$$d(\mathcal{N}, Y) \leq \sqrt{y_T^2 + (y_N + \tau_{\mathcal{N}})^2} - \tau_{\mathcal{N}}.$$

Proof.

1. Fix an $n$-dimensional tangent-normal coordinate system at $X \in \mathcal{N}$, and let $(y_1, \ldots, y_n)$ be the coordinates of $Y$. Let $\vec{a} = (y_1, \ldots, y_m, 0, \ldots, 0)$ and $\vec{b} = (0, \ldots, 0, y_{m+1}, \ldots, y_n)$. If both $\vec{a}$ and $\vec{b}$ are non-zero, they define a (unique) planar tangent-normal coordinate system at $X$ which contains $Y$. If $\vec{a}$ is zero (resp. $\vec{b}$ is zero), choose an arbitrary tangent (resp. normal) direction (we may do this since $0 < m < n$).

2. Assume that $Y \in \partial E_p(X)$ and $\vec{N}$ is a direction, normal to $\partial E_p(X)$. In the $n$-dimensional tangent-normal coordinate system from the previous item, the boundary $\partial E_p(X)$ is given by the equation

$$\sum_{i=1}^{m} \frac{x_i^2}{\tau_{\mathcal{N}} p + p^2} + \sum_{j=m+1}^{n} \frac{x_j^2}{p^2} = 1.$$

The gradient of the left-hand side at $Y$, up to a scalar factor, is

$$\Big( \frac{y_1}{\tau_{\mathcal{N}} p + p^2}, \ldots, \frac{y_m}{\tau_{\mathcal{N}} p + p^2}, \frac{y_{m+1}}{p^2}, \ldots, \frac{y_n}{p^2} \Big) = \frac{1}{\tau_{\mathcal{N}} p + p^2}\, \vec{a} + \frac{1}{p^2}\, \vec{b}.$$

The vector $\vec{N}$ has to be parallel to it since $\partial E_p(X)$ has codimension 1, i.e. a non-zero $\lambda \in \mathbb{R}$ exists such that $\vec{N} = \frac{\lambda}{\tau_{\mathcal{N}} p + p^2}\, \vec{a} + \frac{\lambda}{p^2}\, \vec{b}$. Hence $\vec{N}$ also lies in the plane, determined by $\vec{a}$ and $\vec{b}$. This proof works for $\tau_{\mathcal{N}} < \infty$, but the required modification for $\tau_{\mathcal{N}} = \infty$ is trivial.

3. Since $O$ is a compact $(n - m + 1)$-dimensional disk and $\partial\mathcal{N}$ is closed, some thickening of $O$ exists (denote it by $T$) which is diffeomorphic to an $n$-dimensional ball and is still disjoint with $\partial\mathcal{N}$. With a small perturbation of $\mathcal{N}$ around $\mathcal{N} \cap \partial T$ (but away from the intersection $\mathcal{N} \cap O$ which must remain unchanged) we can achieve that $\mathcal{N}$ and $\partial T$ only have transversal intersections [21].

Imagine $\mathbb{R}^n$ embedded into its one-point compactification $S^n$ (denote the added point by $\infty$) in such a way that $T$ is a hemisphere. Replace the part of $\mathcal{N}$ outside of $T$ with a copy of $\mathcal{N} \cap T$, reflected over $\partial T$, and denote the obtained space by $\mathcal{N}'$. This is an embedding of the so-called double of the manifold $\mathcal{N} \cap T$. Then $\mathcal{N}'$ is a manifold without boundary, closed in the sphere, and therefore compact. If necessary, perturb it slightly around the point $\infty$, so that $\infty \notin \mathcal{N}'$. Hence $\mathcal{N}'$ is a compact submanifold in $\mathbb{R}^n$ without boundary and $C^1$-smooth everywhere except possibly on $\mathcal{N}' \cap \partial T$. The double of a $C^1$-manifold can be equipped with a $C^1$-structure. Therefore we can use Whitney’s approximation theorem [21] to adjust the embedding of $\mathcal{N}'$ on a neighbourhood of $\partial T$ away from $O$, so that it is $C^1$-smooth everywhere. The result is a compact manifold $\mathcal{N}'$ without boundary satisfying all the properties we required of $\mathcal{N}$, and we have $\mathcal{N}' \cap O = \mathcal{N} \cap O$. This shows that we may without loss of generality assume that $\mathcal{N}$ is compact without boundary.

Any compact $k$-dimensional submanifold of $S^n$ without boundary represents an element in the cohomology $H^k(S^n; \mathbb{Z}_2)$ (we take the $\mathbb{Z}_2$-coefficients, so that we do not have to worry about orientation). For elements $[\mathcal{N}] \in H^m(S^n; \mathbb{Z}_2)$ and $[\partial O] \in H^{n-m}(S^n; \mathbb{Z}_2)$ we know [3] that their cup-product $[\mathcal{N}] \smile [\partial O] \in H^n(S^n; \mathbb{Z}_2)$ is the intersection number of $\mathcal{N}$ and $\partial O$ (times the generator). Since the cohomology of $S^n$ is trivial except in dimensions $0$ and $n$, we have $[\mathcal{N}] = [\partial O] = 0$, and hence $[\mathcal{N}] \smile [\partial O] = 0$. But the local intersection number in the transversal intersection $X$ is $1$, and the intersection number is the sum of local ones, so $X$ cannot be the only point in $\mathcal{N} \cap \partial O$.

4. First consider the case when $Y \in N_X\mathcal{N}$, i.e. $y_T = 0$. Then

$$d(\mathcal{N}, Y) \leq d(X, Y) = y_N = \sqrt{y_T^2 + (y_N + \tau_{\mathcal{N}})^2} - \tau_{\mathcal{N}}.$$

Now suppose $Y \notin N_X\mathcal{N}$. Then the cone $C$ is homeomorphic to an $(n - m + 1)$-dimensional closed ball. This $C$ and its boundary are smooth everywhere except in $Y$ and on $D$. Let $E$ be the $(n - m + 1)$-dimensional affine subspace which contains $Y$ and $N_X\mathcal{N}$ (thus the whole $C$). We can smooth $\partial C$ around the centers of the associated balls within $E$ without affecting the intersection with $\mathcal{N}$ since $\mathcal{N}$ is disjoint with the interiors of the associated $\tau_{\mathcal{N}}$-balls. If $Y \in \mathcal{N}$, then $d(\mathcal{N}, Y) = 0 \leq \sqrt{y_T^2 + (y_N + \tau_{\mathcal{N}})^2} - \tau_{\mathcal{N}}$, and we are done. If $Y \notin \mathcal{N}$, then $d(\mathcal{N}, Y) > 0$ since $\mathcal{N}$ is a closed subset. Then we can also smooth $\partial C$ around $Y$ within $E$ without affecting the intersection with $\mathcal{N}$. The boundary smoothed in this way is diffeomorphic to an $(n - m)$-dimensional sphere, and so by the generalized Schoenflies theorem splits $E$ into the inner part, diffeomorphic to an $(n - m + 1)$-dimensional ball, and the outer unbounded part. Since $\mathcal{N}$ intersects $\partial C$, and therefore also its smoothed version, orthogonally in $X$, this intersection is transversal. By the previous item another intersection point $X' \in \mathcal{N} \cap \partial C \setminus \{X\}$ exists. It cannot lie in $N_X\mathcal{N}$ since we would then have a manifold point in the interior of some associated ball, so $X'$ must lie on the lateral surface of the cone. That is, $X'$ lies on the line segment between $Y$ and some associated ball center, but it cannot lie in the interior of the associated ball, so $d(X', Y)$ is bounded by the distance between $Y$ and the furthest associated ball center, decreased by $\tau_{\mathcal{N}}$. The furthest center is the one within the starting planar tangent-normal coordinate system that has coordinates $(0, -\tau_{\mathcal{N}})$. Thus

$$d(\mathcal{N}, Y) \leq d(X', Y) \leq \sqrt{y_T^2 + (y_N + \tau_{\mathcal{N}})^2} - \tau_{\mathcal{N}}.$$

Lemma 2.6. Let

$A, B \in \mathbb{R}_{\geq 0}$ which are not both $0$ and let $\tau \in \mathbb{R}_{>0}$. Then a unique $q \in \mathbb{R}_{>0}$ exists which solves the equation

$$\frac{A}{\tau q + q^2} + \frac{B}{q^2} = 1.$$

Moreover, this $q$ depends continuously on $A$ and $B$, and if $(A, B) \to (0, 0)$ (with $\tau$ fixed), then $q \to 0$.

Proof. If $A = 0$, then clearly $q = \sqrt{B} > 0$. If $B = 0$, then the unique positive solution to the quadratic equation $q^2 + \tau q - A = 0$ is $q = \frac{\sqrt{\tau^2 + 4A} - \tau}{2}$.

Assume that $A, B > 0$. Multiply the equation from the lemma by $q^2(\tau + q)$ and take all terms to one side of the equation to get

$$q^3 + \tau q^2 - (A + B)\, q - \tau B = 0.$$

Define the function $f\colon \mathbb{R} \to \mathbb{R}$ by $f(x) := x^3 + \tau x^2 - (A + B)\, x - \tau B$. The zeros of its derivative $f'(x) = 3x^2 + 2\tau x - (A + B)$ are

$$\frac{-\tau \pm \sqrt{\tau^2 + 3(A + B)}}{3};$$

since $A + B > 0$, both zeros are real and one is negative, the other positive. Let $z$ denote the positive zero. We have $f(0) = -\tau B < 0$ and $f'$ is $\leq 0$ on $\mathbb{R}_{[0,z]}$, so $f$ cannot have a zero here, and $f(z) < 0$. Since $f$ is strictly increasing on $\mathbb{R}_{>z}$ and $\lim_{x \to \infty} f(x) = \infty$, we conclude that $f$ has a unique zero on $\mathbb{R}_{>z}$ and therefore also on $\mathbb{R}_{>0}$.

Since $q$ is the root of the polynomial $q^3 + \tau q^2 - (A + B)\, q - \tau B$ and polynomial roots depend continuously on the coefficients, $q$ depends continuously on $A$ and $B$ as well. In particular, if $A$ and $B$ tend to $0$, then $q$ tends to one of the roots of $q^3 + \tau q^2$. It cannot tend to $-\tau$ since it is positive, so it tends to $0$.

Given a properly embedded $C^1$-submanifold $\mathcal{N} \subseteq \mathbb{R}^n$ without boundary and a point $Y \in \mathcal{N}$, the dimension of $\mathcal{N}$ at which we denote by $m$, let us define the continuous function $q_Y\colon \mathbb{R}^n \to \mathbb{R}_{\geq 0}$ in the following way.

Definition 2.7. If $\tau_{\mathcal{N}} = \infty$, then $q_Y(X) := d(X, \mathcal{N})$ (this also covers the case $m = n$ since then necessarily $\mathcal{N} = \mathbb{R}^n$). Otherwise, if $\mathcal{N}$ has dimension $0$, then $q_Y(X) := d(X, Y)$. If both the dimension and codimension of $\mathcal{N}$ are positive and $\tau_{\mathcal{N}} < \infty$, we split the definition of $q_Y$ into two cases. Let $q_Y(Y) := 0$. For $X \in \mathbb{R}^n \setminus \{Y\}$ introduce a tangent-normal coordinate system with the origin in $Y$ (it exists by Lemma 2.5(1)). Let $X = (x_1, \ldots, x_n)$ be the coordinates of $X$ in this coordinate system. Define $q_Y(X)$ to be the unique element in $\mathbb{R}_{>0}$ which satisfies the equation

$$\frac{x_1^2 + \ldots + x_m^2}{\tau_{\mathcal{N}}\, q_Y(X) + q_Y(X)^2} + \frac{x_{m+1}^2 + \ldots + x_n^2}{q_Y(X)^2} = 1.$$

X and Y . Lemma 2.6 guarantees existence, uniqueness and continuity of $q_Y(X)$.

The point of this definition is that (except in the case $m = n$, when all ellipsoids are the whole $\mathbb{R}^n$) the unique ellipsoid of the form $E_r(Y)$ which has $X$ in its boundary has $r = q_Y(X)$, i.e. $X \in \partial E_{q_Y(X)}(Y)$.

Lemma 2.8.

Let $\mathcal{N}$ be a properly embedded $C^1$-submanifold of $\mathbb{R}^n$. Let $X \in \mathbb{R}^n$ and $Y \in \mathcal{N}$. Then $d(\mathcal{N}, X) \leq q_Y(X)$.

Proof. If $\tau_{\mathcal{N}} = \infty$, the statement is clear, so assume $\tau_{\mathcal{N}} < \infty$.

Let $m$ be the dimension of $\mathcal{N}$ at $Y$. If $m = 0$, then $d(\mathcal{N}, X) \leq d(Y, X) = q_Y(X)$.

For $0 < m < n$ we rely on Lemma 2.5. There is a planar tangent-normal coordinate system which has the origin in $Y$ and contains $X$. We can additionally assume that the axes are oriented so that $X$ is in the closed first quadrant. Since $X \in \partial E_{q_Y(X)}(Y)$, there exists $\varphi \in \mathbb{R}_{[0, \frac{\pi}{2}]}$ such that the coordinates of $X$ in this coordinate system are

$$\big( \sqrt{\tau_{\mathcal{N}} q + q^2}\, \cos(\varphi),\; q \sin(\varphi) \big),$$

where we have shortened $q := q_Y(X)$. Hence, by Lemma 2.5(4),

$$d(\mathcal{N}, X) \leq \sqrt{(\tau_{\mathcal{N}} q + q^2) \cos^2(\varphi) + \big(q \sin(\varphi) + \tau_{\mathcal{N}}\big)^2} - \tau_{\mathcal{N}}$$
$$= \sqrt{\tau_{\mathcal{N}}^2 + q^2 + \tau_{\mathcal{N}} q \big(\cos^2(\varphi) + 2 \sin(\varphi)\big)} - \tau_{\mathcal{N}}$$
$$= \sqrt{\tau_{\mathcal{N}}^2 + q^2 + \tau_{\mathcal{N}} q \big(1 - \sin^2(\varphi) + 2 \sin(\varphi)\big)} - \tau_{\mathcal{N}}$$
$$= \sqrt{\tau_{\mathcal{N}}^2 + q^2 + \tau_{\mathcal{N}} q \big(2 - (1 - \sin(\varphi))^2\big)} - \tau_{\mathcal{N}}.$$

Clearly, the last expression is the largest where the function $\mathbb{R}_{[0, \frac{\pi}{2}]} \to \mathbb{R}$, $\varphi \mapsto 2 - (1 - \sin(\varphi))^2$ attains a maximum, which is at $\varphi = \frac{\pi}{2}$. Thus the bound on the distance $d(\mathcal{N}, X)$ is the largest in the normal space at $Y$, where we get

$$d(\mathcal{N}, X) \leq \sqrt{\tau_{\mathcal{N}}^2 + q^2 + \tau_{\mathcal{N}} q \big(2 - (1 - 1)^2\big)} - \tau_{\mathcal{N}} = \sqrt{\tau_{\mathcal{N}}^2 + q^2 + 2 \tau_{\mathcal{N}} q} - \tau_{\mathcal{N}} = q.$$

Let us also recall some facts about Lipschitz maps that we will need later. A map $f$ between subsets of Euclidean spaces is Lipschitz when it has a Lipschitz coefficient $C \in \mathbb{R}_{\geq 0}$, so that for all $X$, $Y$ in the domain of $f$ we have $\|f(X) - f(Y)\| \leq C \cdot \|X - Y\|$. A function is locally Lipschitz when every point of its domain has a neighbourhood such that the restriction of the function to this neighbourhood is Lipschitz.

Let $f$ and $g$ be maps with Lipschitz coefficients $C$ and $D$, respectively. Then clearly $C + D$ is a Lipschitz coefficient for the functions $f + g$ and $f - g$, and $C \cdot D$ is a Lipschitz coefficient for $g \circ f$ (whenever these functions exist).

For bounded functions the Lipschitz property is preserved under further operations. A function being bounded is meant in the usual way, i.e. being bounded in norm.

Lemma 2.9.

Let f and g be maps between subsets of Euclidean spaces with the same domain.Assume that f and g are bounded and Lipschitz.1. If b is bilinear with the property (cid:13)(cid:13) b ( X , Y ) (cid:13)(cid:13) ≤ (cid:107) X (cid:107) (cid:107) Y (cid:107) , then the map X (cid:55)→ b (cid:0) f ( X ) , g ( X ) (cid:1) is bounded Lipschitz.

2. Assume g takes values in R and has a positive lower bound m ∈ R > . Then the map x (cid:55)→ f ( X ) g ( X ) is bounded Lipschitz.Proof. Let M be an upper bound for the norms of f and g and let C be a Lipschitz coeﬃcientfor f and g . Let X , X (cid:48) and X (cid:48)(cid:48) be elements of the domain of f and g .1. Boundedness: (cid:13)(cid:13) b (cid:0) f ( X ) , g ( X ) (cid:1)(cid:13)(cid:13) ≤ (cid:13)(cid:13) f ( X ) (cid:13)(cid:13) (cid:13)(cid:13) g ( X ) (cid:13)(cid:13) ≤ M .Lipschitz property: (cid:13)(cid:13) b (cid:0) f ( X (cid:48) ) , g ( X (cid:48) ) (cid:1) − b (cid:0) f ( X (cid:48)(cid:48) ) , g ( X (cid:48)(cid:48) ) (cid:1)(cid:13)(cid:13) = (cid:13)(cid:13) b (cid:0) f ( X (cid:48) ) , g ( X (cid:48) ) (cid:1) − b (cid:0) f ( X (cid:48)(cid:48) ) , g ( X (cid:48) ) (cid:1) + b (cid:0) f ( X (cid:48)(cid:48) ) , g ( X (cid:48) ) (cid:1) − b (cid:0) f ( X (cid:48)(cid:48) ) , g ( X (cid:48)(cid:48) ) (cid:1)(cid:13)(cid:13) ≤ (cid:13)(cid:13) f ( X (cid:48) ) − f ( X (cid:48)(cid:48) ) (cid:13)(cid:13) (cid:13)(cid:13) g ( X (cid:48) ) (cid:13)(cid:13) + (cid:13)(cid:13) f ( X (cid:48)(cid:48) ) (cid:13)(cid:13) (cid:13)(cid:13) g ( X (cid:48) ) − g ( X (cid:48)(cid:48) ) (cid:13)(cid:13) ≤ CM (cid:13)(cid:13) X (cid:48) − X (cid:48)(cid:48) (cid:13)(cid:13) .

2. Boundedness: (cid:13)(cid:13)(cid:13) f ( X ) g ( X ) (cid:13)(cid:13)(cid:13) = (cid:107) f ( X ) (cid:107)| g ( X ) | ≤ Mm .Lipschitz property: (cid:13)(cid:13)(cid:13) f ( X (cid:48) ) g ( X (cid:48) ) − f ( X (cid:48)(cid:48) ) g ( X (cid:48)(cid:48) ) (cid:13)(cid:13)(cid:13) = (cid:13)(cid:13)(cid:13) f ( X (cid:48) ) g ( X (cid:48)(cid:48) ) − f ( X (cid:48)(cid:48) ) g ( X (cid:48) ) g ( X (cid:48) ) g ( X (cid:48)(cid:48) ) (cid:13)(cid:13)(cid:13) = (cid:107) f ( X (cid:48) ) g ( X (cid:48)(cid:48) ) − f ( X (cid:48)(cid:48) ) g ( X (cid:48)(cid:48) ) + f ( X (cid:48)(cid:48) ) g ( X (cid:48)(cid:48) ) − f ( X (cid:48)(cid:48) ) g ( X (cid:48) ) (cid:107)| g ( X (cid:48) ) g ( X (cid:48)(cid:48) ) |≤ (cid:107) f ( X (cid:48) ) − f ( X (cid:48)(cid:48) ) (cid:107) (cid:107) g ( X (cid:48)(cid:48) ) (cid:107) + (cid:107) f ( X (cid:48)(cid:48) ) (cid:107) (cid:107) g ( X (cid:48)(cid:48) ) − g ( X (cid:48) ) (cid:107)| g ( X (cid:48) ) | | g ( X (cid:48)(cid:48) ) |≤ CMm (cid:13)(cid:13) X (cid:48) − X (cid:48)(cid:48) (cid:13)(cid:13) . In practice, b is the product of numbers or scalar product of vectors. orollary 2.10. Let ( U i ) i ∈ I be a locally ﬁnite open cover of a subset U of a Euclidean space, ( f i ) i ∈ I a subordinate smooth partition of unity and ( g i : U i → R n ) i ∈ I a family of maps. Let g : U → R n be the map, obtained by gluing maps g i with the partition of unity f i , i.e. g ( x ) := (cid:88) i ∈ I f i ( x ) g i ( x ) . Then if all g i are locally Lipschitz, so is g .Proof. Every continuous map is locally bounded, including the derivative of a smooth map,the bound on which is then a local Lipschitz coeﬃcient for the map. We can apply thisfor f i .Given x ∈ U , pick an open set V ⊆ U , for which the following holds: x ∈ V , there is a ﬁniteset of indices F ⊆ I such that V intersects only U i with i ∈ F and V ⊆ (cid:84) i ∈ F U i , and themaps f i and g i are bounded and Lipschitz on V for every i ∈ F . 
Then $g|_V = \sum_{i \in F} f_i|_V \, g_i|_V$, which is Lipschitz on $V$ by Lemma 2.9.

Having derived some results for more general manifolds, we now specify the manifolds for which our main theorem holds. We reserve the symbol $\mathcal{M}$ for such a manifold.

Let $\mathcal{M}$ be a non-empty $m$-dimensional properly embedded C -submanifold of $\mathbb{R}^n$ without boundary, and let $\mathcal{A}$ be its medial axis. Let $\tau$ denote the reach of $\mathcal{M}$. In this section we assume $\tau < \infty$ and in Sections 4 and 5 we assume $\tau = 1$. We will drop these assumptions on $\tau$ for the main theorem in Section 6.

By Proposition 2.1 and the definition of a medial axis, the map $\mathrm{pr} \colon \mathbb{R}^n \setminus \mathcal{A} \to \mathcal{M}$, which takes a point to its closest point on the manifold $\mathcal{M}$, is well defined. We also define $\mathrm{prv} \colon \mathbb{R}^n \setminus \mathcal{A} \to \mathbb{R}^n$, $\mathrm{prv}(X) := \mathrm{pr}(X) - X$. We view $\mathrm{prv}(X)$ as the vector starting at $X$ and ending in $\mathrm{pr}(X)$. This vector is necessarily normal to the manifold, i.e. it lies in $N_{\mathrm{pr}(X)}\mathcal{M}$. By the definition of the reach, the maps $\mathrm{pr}$ and $\mathrm{prv}$ are defined on $\mathcal{M}_\tau$.

Lemma 3.1.

For every $r \in \mathbb{R}_{(0,\tau)}$ the maps $\mathrm{pr}$ and $\mathrm{prv}$ are Lipschitz when restricted to $\mathcal{M}_r$, with Lipschitz coefficients $\frac{\tau}{\tau - r}$ and $\frac{\tau}{\tau - r} + 1$, respectively. Hence these two maps are continuous on $\mathcal{M}_\tau$.

Proof. The map $\mathrm{pr}$ is Lipschitz on $\mathcal{M}_r$ by [9, Proposition 2] with a Lipschitz coefficient $\frac{\tau}{\tau - r}$ [19, Theorem 4.8(8)]. As a difference of two Lipschitz maps, the map $\mathrm{prv}$ is Lipschitz as well, with a Lipschitz coefficient $\frac{\tau}{\tau - r} + 1$. The maps $\mathrm{pr}$ and $\mathrm{prv}$ are therefore continuous on $\mathcal{M}_r$ for all $r \in \mathbb{R}_{(0,\tau)}$, and hence also on the union $\mathcal{M}_\tau = \bigcup_{r \in \mathbb{R}_{(0,\tau)}} \mathcal{M}_r$.

We want to approximate the manifold $\mathcal{M}$ with a sample. We assume that the sample set $\mathcal{S}$ is a non-empty discrete subset of $\mathcal{M}$, locally finite in $\mathbb{R}^n$ (meaning that every point in $\mathbb{R}^n$ has a neighbourhood which intersects only finitely many points of $\mathcal{S}$). It follows that $\mathcal{S}$ is a closed subset of $\mathbb{R}^n$.

Let $\kappa$ denote the Hausdorff distance between $\mathcal{M}$ and $\mathcal{S}$. We assume that $\kappa$ is finite. This value represents the density of our sample: every point on the manifold $\mathcal{M}$ has a point in the sample $\mathcal{S}$ which is at most $\kappa$ away.

Since $\mathcal{M}$ is properly embedded in $\mathbb{R}^n$ and $\kappa < \infty$, the sample $\mathcal{S}$ is finite if and only if $\mathcal{M}$ is compact. A properly embedded non-compact submanifold without boundary needs to extend to infinity and so cannot be sampled with finitely many points (think for example of the hyperbola in the plane, $x^2 - y^2 = 1$). As it turns out, we do not need finiteness, only local finiteness, to prove our results.

If the sample is dense enough in the manifold, it should be a good approximation to it. Specifically, we want to recover at least the homotopy type of $\mathcal{M}$ from the information gathered from $\mathcal{S}$. A common way to do this is to enlarge the sample points to balls, the union of which deformation retracts to the manifold and so has the same homotopy type (in other words, we consider a Čech complex of the sample).

In this paper we use ellipsoids instead of balls.
The idea is that a tangent space at some point is a good approximation for the manifold at that point, so an ellipsoid with the major semi-axes in the tangent directions should approximate the manifold better than a ball. Consequently we should require a less dense sample for the approximation. This idea indeed pans out (as demonstrated by Theorem 6), though it turns out that the standard methods, used to construct the deformation retraction from the union of balls to the manifold, do not work for the ellipsoids.

Given a persistence parameter $p \in \mathbb{R}_{>0}$, let us denote the unions of open and closed tangent-normal $p$-ellipsoids around sample points by
$$\mathcal{E}_p := \bigcup_{S \in \mathcal{S}} E_p(S), \qquad \overline{\mathcal{E}}_p := \bigcup_{S \in \mathcal{S}} \overline{E}_p(S).$$
As a union of open sets, $\mathcal{E}_p$ is open in $\mathbb{R}^n$. As a locally finite union of closed sets, $\overline{\mathcal{E}}_p$ is closed in $\mathbb{R}^n$.

We want a deformation retraction from $\mathcal{E}_p$ to $\mathcal{M}$. Clearly this will not work for all $p \in \mathbb{R}_{>0}$. If $p$ is too small, $\mathcal{E}_p$ covers only some blobs around sample points, not the whole $\mathcal{M}$. If $p$ is too large, $\mathcal{E}_p$ reaches over the medial axis $\mathcal{A}$ and therefore creates connections which do not exist in the manifold, so it differs from the manifold in homotopy type. This suggests that the lower bound on $p$ will be expressed in terms of $\kappa$ (the denser the sample, the smaller the required $p$ for $\mathcal{E}_p$ to cover $\mathcal{M}$), and the upper bound on $p$ will be expressed in terms of $\tau$ (the further away the medial axis, the larger we can make the ellipsoids so that they still do not intersect the medial axis).

Lemma 3.2.

1. Assume $p \in \mathbb{R}_{>0}$ satisfies $\kappa < \sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)}$. Then $\mathcal{M} \subseteq \mathcal{E}_p$, i.e. $\left(E_p(S)\right)_{S \in \mathcal{S}}$ is an open cover of $\mathcal{M}$.

2. The map $\mathbb{R}_{>0} \to \mathbb{R}_{>0}$, $p \mapsto \sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)}$, is strictly increasing. Thus there exists a unique $\lambda \in \mathbb{R}_{>0}$ such that
$$\kappa < \sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)} \iff \lambda < p.$$

Proof.

1. Take any $X \in \mathcal{M}$. By assumption there exists $S \in \mathcal{S}$ such that $d(X, S) \le \kappa$. We claim that $X \in E_p(S)$.

If $m = n$, then $E_p(S) = B_{\sqrt{\tau p + p^2}}(S)$, and a quick calculation shows that
$$\sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)} \le \sqrt{\tau p + p^2},$$
so $X \in E_p(S)$.

If $m = 0$, then $E_p(S) = B_p(S)$ and the reach $\tau$ is half of the distance between the two closest distinct points in $\mathcal{M}$ (since we are assuming $\tau < \infty$ and therefore $\mathcal{A} \ne \emptyset$, the manifold $\mathcal{M}$ must have at least two points). If $p \le 2\tau$, then
$$\sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)} \le 2\tau,$$
so necessarily $X = S \in E_p(S)$. If $p > 2\tau$, then
$$\sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)} < p,$$
so $X \in B_p(S) = E_p(S)$.

Assume hereafter that $0 < m < n$. Choose a planar tangent-normal coordinate system with the origin in $S$ which contains $X$ (use Lemma 2.5(1)). In this coordinate system the boundary of $E_p(S)$ is given by the equation $\frac{x^2}{\tau p + p^2} + \frac{y^2}{p^2} = 1$. A routine calculation shows that it intersects the boundaries of the $\tau$-balls associated to $S$ (with centers in $C' = (0, \tau)$ and $C'' = (0, -\tau)$), given by the equations $x^2 + (y \pm \tau)^2 = \tau^2$, in the points
$$\left(\pm\sqrt{\frac{p(p + \tau)\left(2\sqrt{\tau(p + 2\tau)} - p - 2\tau\right)}{\tau}}, \ \pm\frac{p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)}{\tau}\right),$$
which lie at distance $r := \sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)}$ from $S$, and $r > \kappa \ge d(X, S)$. It follows that within the given two-dimensional coordinate system
$$X \in B_r(S) \subseteq E_p(S) \cup B_\tau(C') \cup B_\tau(C''),$$
see Figure 2.

Figure 2: The point X within the ellipsoid.

Since $S \in \mathcal{M}$ and the reach of $\mathcal{M}$ is $\tau$, the manifold $\mathcal{M}$ does not intersect the open $\tau$-balls associated to $S$, so $X \in E_p(S)$.

2. The derivative of the given function is
$$\frac{\tau\left(3p + 4\tau - 2\sqrt{\tau(p + 2\tau)}\right)}{2\sqrt{2p\tau(p + 2\tau)\left(\sqrt{\tau(p + 2\tau)} - \tau\right)}},$$
which is positive for $p, \tau > 0$. Hence the function is strictly increasing, which gives the unique $\lambda$. Calculated with Mathematica, the actual value is
$$\lambda = \frac{2\tau(3\kappa^2 + \tau^2)}{3\sqrt[3]{W}} + \frac{\sqrt[3]{W}}{6\tau} - \frac{\tau}{3}, \qquad W := 27\kappa^4\tau^2 - 36\kappa^2\tau^4 + 3\sqrt{81\kappa^8\tau^4 - 408\kappa^6\tau^6 - 96\kappa^4\tau^8} - 8\tau^6.$$

We can strengthen this result to thickenings of $\mathcal{M}$. Given $r \in \mathbb{R}_{>0}$, we denote the open and closed $r$-thickening of $\mathcal{M}$ by
$$\mathcal{M}_r := \{X \in \mathbb{R}^n \mid d(\mathcal{M}, X) < r\}, \qquad \overline{\mathcal{M}}_r := \{X \in \mathbb{R}^n \mid d(\mathcal{M}, X) \le r\}.$$

Corollary 3.3.

For every $r \in \mathbb{R}_{\ge 0}$ and every $p \in \mathbb{R}_{>\lambda + r}$ we have $\mathcal{M}_r \subseteq \mathcal{E}_p$.

Proof. Lemma 3.2 implies that $\mathcal{M} \subseteq \mathcal{E}_{p - r}$. Hence $\mathcal{M}_r$ is contained in the union of $r$-thickenings of open ellipsoids $E_{p - r}(S)$, and an $r$-thickening of $E_{p - r}(S)$ is contained in $E_p(S)$.

Let us now also get an upper bound on $p$.

Lemma 3.4.

Assume $p \in \mathbb{R}_{(0,\tau)}$. Then $\overline{\mathcal{E}}_p \subseteq \mathcal{M}_\tau$; in particular $\mathcal{E}_p$ and $\overline{\mathcal{E}}_p$ do not intersect the medial axis of $\mathcal{M}$.

Proof. Take any $S \in \mathcal{S}$ and $X \in \overline{E}_p(S)$. By Lemma 2.8 we have $d(\mathcal{M}, X) \le q_S(X) \le p < \tau$.

The results in this section give the theoretical bounds on the persistence parameter $p$ within which we look for a deformation retraction from $\mathcal{E}_p$ to $\mathcal{M}$. We summarize them in the following corollary.

Corollary 3.5. If $p \in \mathbb{R}_{(\lambda,\tau)}$, then $\mathcal{M} \subseteq \mathcal{E}_p \subseteq \mathcal{A}^\complement$.

In this section (as well as the next one) we assume that $\tau = 1$ and $0 < m < n$.

Our goal is to prove that if we restrict the persistence parameter $p$ to a suitable interval, the union of ellipsoids $\mathcal{E}_p$ deformation retracts to $\mathcal{M}$. Recall that the normal deformation retraction is the map retracting a point to its closest point on the manifold, i.e. the convex combination of a point and its projection: $(X, t) \mapsto (1 - t)X + t\,\mathrm{pr}(X) = X + t\,\mathrm{prv}(X)$. For example, in [22] this is how the union of balls around sample points is deformation retracted to the manifold.

Figure 3: Normal deformation retraction does not always work.

The same idea does not in general work for the union of ellipsoids, or any other sufficiently elongated figures. Figure 3 shows what can go wrong.

However, it turns out that the only places where the normal deformation retraction does not work are the neighbourhoods of tips of some ellipsoids which avoid all other ellipsoids. This section is dedicated to proving the following form of this claim: for all points in at least two ellipsoids the normal deformation retraction works. This means that the line segment between a point $X$ and $\mathrm{pr}(X)$ is contained in the union of ellipsoids, but actually more holds: the line segment is contained already in one of the ellipsoids.

To prove this, we would in principle need to examine all possible configurations of ellipsoids and a point.
However, we can restrict ourselves to a set of cases which includes the "worst case scenarios".

Let $S', S'' \in \mathcal{S}$ be two different sample points, and let $X \in \overline{E}_p(S') \cap \overline{E}_p(S'')$ (we purposefully take closed ellipsoids here). Denote $Y := \mathrm{pr}(X)$. We claim that there is $S \in \mathcal{S}$ (not necessarily distinct from $S'$ and $S''$) such that $X \in \overline{E}_p(S)$ and $Y \in E_p(S)$. Due to convexity of ellipsoids, the line segment $XY$ is in $\overline{E}_p(S)$; with the possible exception of the point $X$, this line segment is in $E_p(S)$.

Assuming $p \in \mathbb{R}_{(\lambda,1)}$, the point $Y$ is covered by at least one open ellipsoid. Suppose that none of the closed ellipsoids containing $Y$ in their interior contains $X$. Let us try to construct a situation where this is most likely to be the case. We will derive a contradiction by showing that even in these "worst case scenarios" we fail to satisfy this assumption.

There are two things which reduce the chance that an ellipsoid contains $X$: the further it is from $X$, and the more one of its minor semi-axes points in the direction towards $X$. We will produce a set of configurations where these two criteria are maximized.

Consider a planar tangent-normal coordinate system with the origin in $S'$ which contains $X$ in the fourth quadrant (nonnegative tangent coordinate, nonpositive normal coordinate). A piece of the manifold goes rightward from $S'$. The fastest that this piece can turn away from $X$ is in this plane along the boundary of the upper $\tau$-ball associated to $S'$. Suppose the manifold continues along this path until some point $X'$, and consider a plane containing the points $X$, $X'$ and $S$, where the distance between $S \in \mathcal{S}$ and $Y$ is bounded by $\kappa$, so $Y \in E_p(S)$. Going from $X'$ to $S$, the quickest way to turn the normal direction towards $X$ is within this plane, and along a $\tau$-arc.
While this second plane need not be the same as the first one, they intersect along the line containing $X$ and $X'$. We can turn the half-plane containing $S'$ and the half-plane containing $S$ along the line so that they form one plane, and that will be the configuration where it is equally (un)likely for $E_p(S)$ to contain $X$, but where $S'$, $X$, $X'$, $Y$ and $S$ all lie in the same plane.

We can make the same argument starting from $S''$ instead of $S'$, so we conclude the following: if our claim fails for some configuration of $X$, $Y$, $S'$, $S''$, $S$, then it fails in a planar case where the part of the manifold connecting points $S'$ and $S''$ consists of (at most) three $\tau$-arcs, as in Figure 4.

Figure 4: Point in two ellipsoids, whose projection is in another ellipsoid.

Imagine distinct points $A$ and $B$ in some higher-dimensional Euclidean space and a non-zero vector $\vec{a}$, starting in $A$ and having a nonnegative scalar product with $\vec{AB}$. Consider paths starting in $A$ and going in the direction of $\vec{a}$, the curvature of which is bounded by some number. Then the fastest that we can get away from $B$ along such a path is within the plane determined by $A$, $B$ and $\vec{a}$; specifically, along the arc with the maximum curvature.

We started with the assumption $X \in \overline{E}_p(S') \cap \overline{E}_p(S'')$, but we may without loss of generality additionally assume $X \in \partial E_p(S')$. If we had a counterexample $X$ to our claim in the interior of all ellipsoids containing $X$, we could project it in the opposite direction of $\mathrm{pr}(X)$ to the first ellipsoid boundary we hit, and declare the center of that ellipsoid to be $S'$.

Although the reduction of cases we have made is already a vast simplification of the necessary calculations, we find that it is still not enough to make a theoretical derivation of the desired result feasible. Instead, we produce a proof with a computer.

We can reduce the possible configurations to four parameters (see Figure 5):
• $\alpha$ denotes the angle measuring the length of the first $\tau$-arc,
• $\sigma$ denotes the angle for the second $\tau$-arc until $S$,
• $p$ is, as usual, the persistence parameter,
• $\chi$ determines the position of $X$ in the boundary $\partial E_p(S')$.

Figure 5: Notation of parameters in the program.

Notice that Figure 5 does not include both ellipsoids containing $X$ but not $Y$, like Figure 4 does. It turns out that as soon as $Y$ is not in the first ellipsoid, both $X$ and $Y$ will be in an ellipsoid, the center of which is within $\kappa$ distance from $Y$. This allows us to restrict ourselves to just the four aforementioned variables, which makes the program run in a reasonable time.

The space of the configurations we restricted ourselves to, which we denote by $\mathcal{C}$, is compact (we give its precise definition below). We want to verify for each configuration in $\mathcal{C}$ that $X$ is in some ellipsoid with the center within $\kappa$ distance from $Y$ (it follows automatically that $Y$ is in this ellipsoid). The boundary of the ellipsoid is a level set of a smooth function which we can adjust so that $X$ is in the open ellipsoid if and only if the value of the function is positive. Let us denote this function by $v \colon \mathcal{C} \to \mathbb{R}$; we have our claim if we show that $v$ is positive for all configurations in $\mathcal{C}$.

Of course, the program cannot calculate the function values for all infinitely many configurations in $\mathcal{C}$. We note that the (continuous) partial derivatives of $v$ are bounded on the compact $\mathcal{C}$, hence the function is Lipschitz. If we change the parameters by at most $\delta$, the function value changes by at most $C \cdot \delta$, where $C$ is the Lipschitz coefficient. The program calculates the function values in a finite lattice of points, so that each point in $\mathcal{C}$ is at most a suitable $\delta$ away from the lattice, and verifies that all these values are larger than $C \cdot \delta$. This shows that $v$ is positive on the whole $\mathcal{C}$.

Let us now define $\mathcal{C}$ precisely and then calculate the Lipschitz coefficient of $v$. We may orient the coordinate system so that the point $X$ is in the closed fourth quadrant.
Hence we have $X = \left(\sqrt{p + p^2}\cos(\chi), -p\sin(\chi)\right)$, where $\chi$ ranges over the interval $\mathbb{R}_{[0, \frac{\pi}{2}]}$.

Unfortunately, due to our method we cannot allow $p$ to range over the whole interval $\mathbb{R}_{(\lambda, 1)}$; if we did, the values of $v$ would come arbitrarily close to zero, in particular below $C \cdot \delta$, so the program would not prove anything. Let us set $p \in \mathbb{R}_{[m_p, M_p]}$, where we have chosen in our program $m_p := 0.5$ and $M_p := 0.96$. The closer $M_p$ is to 1, the smaller the density we prove is required. However, larger $M_p$ necessitates smaller $\delta$, which increases the computation time. Through experimentation, we have chosen bounds so that the program ran for a few days. Ultimately, with better computers (and more patience) one can improve our result. We note that experimentally we never came across any counterexample to our claims even outside of $\mathcal{C}$ (so long as the configuration satisfied the theoretical assumptions from Corollary 3.5). We discuss this further in Section 7.

We can now calculate the upper bound on $\alpha$ (the lower bound is just 0). For fixed $p$ and $\chi$ we claim that the case $\alpha \ge \arctan\left(\frac{\sqrt{p + p^2}\cos(\chi)}{1 + p\sin(\chi)}\right)$ is impossible. In this case the point $(0, 1) + \frac{X - (0, 1)}{\|X - (0, 1)\|}$ lies on the manifold, and is the closest to $X$ among points on $\mathcal{M}$. This is because its distance to $X$ is bounded by $p$ (by Lemma 2.8), which is smaller than $\tau = 1$, so its associated $\tau$-ball includes all points closer to $X$, and $\mathcal{M}$ cannot intersect an open associated $\tau$-ball (see Figure 6). We claim that the point $\mathrm{pr}(X) = (0, 1) + \frac{X - (0, 1)}{\|X - (0, 1)\|}$ lies in $E_p(S')$. This is a contradiction since then $Y = \mathrm{pr}(X) \in E_p(S')$.

Figure 6: Too large $\alpha$.

Clearly, it suffices to verify $\mathrm{pr}(X) \in E_p(S')$ for $\chi = 0$ (for larger $\chi$ the point $\mathrm{pr}(X)$ lies on the $\tau$-arc further towards the ellipsoid center $S'$). If we put the coordinates of $\mathrm{pr}(X)$ for $\chi = 0$ into the equation for the ellipsoid, we see that we need
$$\frac{2p^2 + p + 2 - 2\sqrt{p^2 + p + 1}}{p^4 + p^3 + p^2} < 1.$$
This is equivalent to
$$-p^8 - 2p^7 + p^6 + 4p^5 + 5p^4 + 2p^3 - p^2 > 0,$$
which is equivalent to $-p^2(p + 1)\left(p^5 + p^4 - 2p^3 - 2p^2 - 3p + 1\right) > 0$, i.e. to
$$p^5 + p^4 - 2p^3 - 2p^2 - 3p + 1 < 0.$$
The derivative of the polynomial on the left is
$$5p^4 + 4p^3 - 6p^2 - 4p - 3 \le -(5p^2 + 4p)(1 - p^2) - 3 < 0,$$
so $p^5 + p^4 - 2p^3 - 2p^2 - 3p + 1$ is decreasing on $\mathbb{R}_{[m_p, M_p]} \subseteq \mathbb{R}_{(0,1)}$. The value of this polynomial at $m_p = 0.5$ is $-1.15625 < 0$, so the polynomial is negative on $\mathbb{R}_{[m_p, M_p]}$, as required.

With this we have confirmed that it suffices to restrict ourselves to $\alpha \le \arctan\left(\frac{\sqrt{p + p^2}\cos(\chi)}{1 + p\sin(\chi)}\right)$. As mentioned, this bound will be the largest at $\chi = 0$, so we will cover the relevant configurations for $\alpha \le \arctan\left(\sqrt{p + p^2}\right)$, or equivalently (for $\alpha \in \mathbb{R}_{[0, \frac{\pi}{2})}$ and $p \in \mathbb{R}_{(0,1)}$) $\tan^2(\alpha) \le p + p^2$, in particular $\tan^2(\alpha) \le M_p + M_p^2$. We could reduce $m_p$ to around 0. […] $f$ and shortening the run-time of the program.

[…] $\sigma \in \mathbb{R}_{[0,\pi]}$. If the manifold were to trace a $\tau$-circle within a plane for longer than $\pi$, it would necessarily be that $\tau$-circle. If it were to veer away from the circle, the medial axis would continue from the center of the circle to the area between the two parts of the manifold (see Figure 7), which would contradict that the manifold's reach is $\tau$.

Figure 7: Medial axis of a manifold tracing an arc for longer than $\pi$.

However, if the manifold was indeed just a circle in a plane, then $Y$ would be inside of $E_p(S')$ by the same argument we used when calculating the bound on $\alpha$. Hence we may postulate $\sigma \in \mathbb{R}_{[0,\pi]}$.

Having calculated the bounds on the variables, we may now define
$$\mathcal{C} := \left\{(\alpha, \sigma, p, \chi) \in \mathbb{R}_{[0, \arctan(\sqrt{M_p + M_p^2})]} \times \mathbb{R}_{[0,\pi]} \times \mathbb{R}_{[m_p, M_p]} \times \mathbb{R}_{[0, \frac{\pi}{2}]} \ \middle|\ \tan^2(\alpha) \le p + p^2\right\}.$$
For the sake of a later calculation we also define a slightly bigger area,
$$\widetilde{\mathcal{C}} := \left\{(\alpha, \sigma, p, \chi) \in \mathbb{R}_{[0, \arctan(\sqrt{M_p + M_p^2})]} \times \mathbb{R}_{[0,\pi]} \times \mathbb{R}_{[m_p, M_p]} \times \mathbb{R}_{[0, \frac{\pi}{2}]} \ \middle|\ \tan^2(\alpha) \le 2p + p^2\right\}.$$
Both $\mathcal{C}$ and $\widetilde{\mathcal{C}}$ are 4-dimensional rectangular cuboids with a small piece removed; in the $\alpha$-$p$-plane they look as shown in Figure 8.

Given $(\alpha, \sigma, p, \chi) \in \mathcal{C}$, we have $X = (X_T, X_N) = \left(\sqrt{p + p^2}\cos(\chi), -p\sin(\chi)\right)$. Let us denote the center of the $\tau$-ball, along the boundary of which lies the arc containing $S$, by $C$.
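The bounds derived above can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: it takes $m_p = 0.5$ and $M_p = 0.96$, reads the defining constraint of $\mathcal{C}$ as $\tan^2(\alpha) \le p + p^2$ and that of $\widetilde{\mathcal{C}}$ as $\tan^2(\alpha) \le 2p + p^2$, and uses hypothetical function names that are not taken from the authors' program.

```python
import math

# Assumed parameter range [m_p, M_p]; all names here are illustrative only.
P_MIN, P_MAX = 0.5, 0.96

def quintic(p):
    """Polynomial whose negativity puts pr(X) inside E_p(S') when chi = 0."""
    return p**5 + p**4 - 2*p**3 - 2*p**2 - 3*p + 1

def in_region(alpha, sigma, p, chi, widened=False):
    """Membership in C (widened=False) or in the slightly bigger region."""
    alpha_cap = math.atan(math.sqrt(P_MAX + P_MAX**2))
    in_box = (0.0 <= alpha <= alpha_cap and 0.0 <= sigma <= math.pi
              and P_MIN <= p <= P_MAX and 0.0 <= chi <= math.pi / 2)
    bound = 2*p + p*p if widened else p + p*p
    return in_box and math.tan(alpha)**2 <= bound

# Sample the quintic on a fine grid of [m_p, M_p]; a grid check is only a
# sanity test, the monotonicity argument in the text is the actual proof.
grid = [P_MIN + k * (P_MAX - P_MIN) / 1000 for k in range(1001)]
quintic_negative = all(quintic(p) < 0 for p in grid)
```

A configuration with $\alpha = 0.8$ and $p = 0.6$, for instance, violates $\tan^2(\alpha) \le p + p^2$ but satisfies the relaxed constraint, so it lies in the bigger region only.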
Figure 8: Regions $\mathcal{C}$ and $\widetilde{\mathcal{C}}$ (horizontal axis $\alpha$, vertical axis $p$).

Observe from Figure 9 that $C = (0, 1) + 2\left(\sin(\alpha), -\cos(\alpha)\right)$ and
$$S = (S_T, S_N) = C + \left(-\sin(\alpha - \sigma), \cos(\alpha - \sigma)\right) = \left(2\sin(\alpha) - \sin(\alpha - \sigma), \ 1 - 2\cos(\alpha) + \cos(\alpha - \sigma)\right)$$
(this works also if $\alpha - \sigma$ is negative).

Figure 9: Position of $C$ and $S$.
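The coordinates of $X$, $C$ and $S$ can be sanity-checked with a few lines of Python. This is a minimal sketch, assuming $\tau = 1$, the formula $X = (\sqrt{p + p^2}\cos\chi, -p\sin\chi)$ from the text, and that the first arc runs along the unit circle centered at $(0, 1)$; the function names are hypothetical.

```python
import math

def configuration(alpha, sigma, p, chi):
    """Return the points X, C, S of a planar configuration (tau = 1)."""
    X = (math.sqrt(p + p*p) * math.cos(chi), -p * math.sin(chi))
    C = (2 * math.sin(alpha), 1 - 2 * math.cos(alpha))
    S = (C[0] - math.sin(alpha - sigma), C[1] + math.cos(alpha - sigma))
    return X, C, S

def dist(P, Q):
    """Euclidean distance between two planar points."""
    return math.hypot(P[0] - Q[0], P[1] - Q[1])

# Example values chosen arbitrarily for illustration.
X, C, S = configuration(alpha=0.6, sigma=0.9, p=0.7, chi=0.4)
X0, C0, S0 = configuration(alpha=0.6, sigma=0.0, p=0.7, chi=0.4)
```

Here `S` lies at distance 1 from `C`, the center `C` lies at distance 2 from $(0, 1)$ (the two unit circles touch), and for $\sigma = 0$ the point `S0` is the end $(\sin\alpha, 1 - \cos\alpha)$ of the first arc.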

It will be convenient to define $v$ on the larger area $\widetilde{\mathcal{C}}$ (although we are still only interested in positivity of $v$ on $\mathcal{C}$). Recall that we want $v$ to be a function whose 0-level set is the boundary of $E_p(S)$ and which is positive on $E_p(S)$ itself. Let $x, y$ be coordinates in our current coordinate system, $x', y'$ the coordinates in the coordinate system translated by $S$, and $x'', y''$ the coordinates if we rotate the translated coordinate system by $\alpha - \sigma$ in the positive direction. Hence
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} - S, \qquad \begin{bmatrix} x'' \\ y'' \end{bmatrix} = \begin{bmatrix} \cos(\alpha - \sigma) & \sin(\alpha - \sigma) \\ -\sin(\alpha - \sigma) & \cos(\alpha - \sigma) \end{bmatrix} \cdot \begin{bmatrix} x' \\ y' \end{bmatrix}.$$
In the rotated translated coordinate system, the equation for the boundary of the ellipse is $\frac{x''^2}{p + p^2} + \frac{y''^2}{p^2} = 1$, or equivalently $p^2(p + 1) - \left(x''^2 p + y''^2 (p + 1)\right) = 0$. We therefore define $v \colon \widetilde{\mathcal{C}} \to \mathbb{R}$ by
$$v(\alpha, \sigma, p, \chi) := p^2(p + 1) - \Big(\big(\cos(\alpha - \sigma)(X_T - S_T) + \sin(\alpha - \sigma)(X_N - S_N)\big)^2 p + \big(-\sin(\alpha - \sigma)(X_T - S_T) + \cos(\alpha - \sigma)(X_N - S_N)\big)^2 (p + 1)\Big).$$

Recall that it follows from the multivariate Lagrange mean value theorem that for any $a, b \in \widetilde{\mathcal{C}}$
$$|v(a) - v(b)| \le \max \|\nabla v\| \cdot \|a - b\|,$$
where the maximum of the norm of the gradient is taken over the line segment connecting the points $a$ and $b$. In particular, the maximum over the entire $\widetilde{\mathcal{C}}$ is a Lipschitz coefficient for $v$.

This theorem holds for any pair of conjugate norms. We take the $\infty$-norm on $\widetilde{\mathcal{C}}$, and the 1-norm for the gradient. The reason is that we cover the region $\mathcal{C}$ by cuboids which are almost cubes (in the centers of which we calculate the function values).
The smaller the distance between the center of a cube and any of its points, the better the estimate we obtain. Hence
$$|v(\alpha + \Delta\alpha, \sigma + \Delta\sigma, p + \Delta p, \chi + \Delta\chi) - v(\alpha, \sigma, p, \chi)| \le \max_{\widetilde{\mathcal{C}}} \left(\left|\frac{\partial v}{\partial \alpha}\right| + \left|\frac{\partial v}{\partial \sigma}\right| + \left|\frac{\partial v}{\partial p}\right| + \left|\frac{\partial v}{\partial \chi}\right|\right) \cdot \max\{|\Delta\alpha|, |\Delta\sigma|, |\Delta p|, |\Delta\chi|\}.$$

Before we estimate the absolute values of partial derivatives, let us make several preliminary calculations.

First we put the function into a more convenient form. Writing $A := \cos(\alpha - \sigma)(X_T - S_T) + \sin(\alpha - \sigma)(X_N - S_N)$ and $B := -\sin(\alpha - \sigma)(X_T - S_T) + \cos(\alpha - \sigma)(X_N - S_N)$, we have
$$v(\alpha, \sigma, p, \chi) = p^2(p + 1) - \left(A^2 p + B^2 (p + 1)\right) = p^2(p + 1) - \left(A^2 + B^2\right) p - B^2$$
$$= p^2(p + 1) - \|X - S\|^2 p - B^2 = p^2(p + 1) - \langle X - S, X - S \rangle \, p - \big\langle (-\sin(\alpha - \sigma), \cos(\alpha - \sigma)), X - S \big\rangle^2.$$

Now we calculate the bound on $X - S$ and its partial derivatives.
$$X - S = \left(\sqrt{p + p^2}\cos(\chi) - 2\sin(\alpha) + \sin(\alpha - \sigma), \ -p\sin(\chi) - 1 + 2\cos(\alpha) - \cos(\alpha - \sigma)\right)$$
$$\|X - S\| \le \|X - C\| + \|C - S\| = \left\|\left(\sqrt{p + p^2}\cos(\chi) - 2\sin(\alpha), \ -p\sin(\chi) - 1 + 2\cos(\alpha)\right)\right\| + 1$$
The norm will be the largest when either the components are largest ($\chi = 0$, $\alpha = 0$) or smallest ($\chi = \frac{\pi}{2}$, $\alpha = \alpha_{\max} := \arctan(\sqrt{2p + p^2})$).
In the first case we get $\|X - C\|^2 \le p^2 + p + 1$ and in the second (taking into account $\cos(\alpha_{\max}) = \frac{1}{\sqrt{1 + \tan^2(\alpha_{\max})}} = \frac{1}{\sqrt{1 + 2p + p^2}} = \frac{1}{1 + p}$)
$$\|X - C\|^2 \le 4\sin^2(\alpha_{\max}) + \left(-p - 1 + 2\cos(\alpha_{\max})\right)^2 = 5 + p^2 + 2p - 4(1 + p)\cos(\alpha_{\max}) = 1 + p^2 + 2p = (1 + p)^2,$$
so either way $\|X - S\| \le 2 + p \le 2 + M_p$.
$$\left\|\frac{\partial (X - S)}{\partial \alpha}\right\| = \left\|\left(-2\cos(\alpha) + \cos(\alpha - \sigma), \ -2\sin(\alpha) + \sin(\alpha - \sigma)\right)\right\| \le \left\|\left(-2\cos(\alpha), -2\sin(\alpha)\right)\right\| + \left\|\left(\cos(\alpha - \sigma), \sin(\alpha - \sigma)\right)\right\| = 3$$
$$\left\|\frac{\partial (X - S)}{\partial \sigma}\right\| = \left\|\left(-\cos(\alpha - \sigma), -\sin(\alpha - \sigma)\right)\right\| = 1$$
$$\left\|\frac{\partial (X - S)}{\partial p}\right\| = \left\|\left(\frac{1 + 2p}{2\sqrt{p + p^2}}\cos(\chi), \ -\sin(\chi)\right)\right\| = \left\|\left(\Big(\frac{1 + 2p}{2\sqrt{p + p^2}} - 1\Big)\cos(\chi), \ 0\right) + \left(\cos(\chi), -\sin(\chi)\right)\right\|$$
$$\le \left|\frac{1 + 2p}{2\sqrt{p + p^2}} - 1\right| + 1 = \frac{1 + 2p}{2\sqrt{p + p^2}} \le \frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}}$$
Here the last equality holds because $(1 + 2p)^2 = 1 + 4p + 4p^2 \ge 4p + 4p^2 = \left(2\sqrt{p + p^2}\right)^2$, and the last inequality holds because $\frac{1 + 2p}{2\sqrt{p + p^2}}$ is a decreasing function: its derivative is $-\frac{1}{4(p + p^2)^{3/2}}$.
$$\left\|\frac{\partial (X - S)}{\partial \chi}\right\| = \left\|\left(-\sqrt{p + p^2}\sin(\chi), \ -p\cos(\chi)\right)\right\| = \sqrt{p\sin^2(\chi) + p^2} \le \sqrt{M_p + M_p^2}$$

Next we calculate a bound on the term $\big\langle (-\sin(\alpha - \sigma), \cos(\alpha - \sigma)), X - S \big\rangle$ and its derivatives.
Write $N := (-\sin(\alpha - \sigma), \cos(\alpha - \sigma))$ for the unit vector appearing in this term. Then
$$\left|\langle N, X - S \rangle\right| \le \|X - S\| \le 2 + M_p,$$
$$\left|\frac{\partial}{\partial \alpha}\langle N, X - S \rangle\right| = \left|\big\langle (-\cos(\alpha - \sigma), -\sin(\alpha - \sigma)), X - S \big\rangle + \Big\langle N, \frac{\partial (X - S)}{\partial \alpha} \Big\rangle\right| \le \|X - S\| + \left\|\frac{\partial (X - S)}{\partial \alpha}\right\| \le (2 + M_p) + 3 = 5 + M_p,$$
$$\left|\frac{\partial}{\partial \sigma}\langle N, X - S \rangle\right| = \left|\big\langle (\cos(\alpha - \sigma), \sin(\alpha - \sigma)), X - S \big\rangle + \Big\langle N, \frac{\partial (X - S)}{\partial \sigma} \Big\rangle\right| \le \|X - S\| + \left\|\frac{\partial (X - S)}{\partial \sigma}\right\| \le (2 + M_p) + 1 = 3 + M_p,$$
$$\left|\frac{\partial}{\partial p}\langle N, X - S \rangle\right| = \left|\Big\langle N, \frac{\partial (X - S)}{\partial p} \Big\rangle\right| \le \left\|\frac{\partial (X - S)}{\partial p}\right\| \le \frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}},$$
$$\left|\frac{\partial}{\partial \chi}\langle N, X - S \rangle\right| = \left|\Big\langle N, \frac{\partial (X - S)}{\partial \chi} \Big\rangle\right| \le \left\|\frac{\partial (X - S)}{\partial \chi}\right\| \le \sqrt{M_p + M_p^2}.$$

We can now estimate the partial derivatives of $v$.
$$\left|\frac{\partial v}{\partial \alpha}\right| \le 6M_p(2 + M_p) + 2(2 + M_p)(5 + M_p) = 20 + 26M_p + 8M_p^2$$
$$\left|\frac{\partial v}{\partial \sigma}\right| \le 2(2 + M_p)M_p + 2(2 + M_p)(3 + M_p) = 12 + 14M_p + 4M_p^2$$
$$\left|\frac{\partial v}{\partial p}\right| \le 3M_p^2 + 2M_p + 2(2 + M_p)\frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}}M_p + (2 + M_p)^2 + 2(2 + M_p)\frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}} = 4 + 6M_p + 4M_p^2 + 2(2 + M_p)(1 + M_p)\frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}}$$
$$\left|\frac{\partial v}{\partial \chi}\right| \le 2(2 + M_p)\sqrt{M_p + M_p^2}\,M_p + 2(2 + M_p)\sqrt{M_p + M_p^2} = 2(2 + M_p)(1 + M_p)\sqrt{M_p + M_p^2}$$
Hence a Lipschitz coefficient for $v$ on $\widetilde{\mathcal{C}}$ is
$$L := 20 + 26M_p + 8M_p^2 + 12 + 14M_p + 4M_p^2 + 4 + 6M_p + 4M_p^2 + 2(2 + M_p)(1 + M_p)\frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}} + 2(2 + M_p)(1 + M_p)\sqrt{M_p + M_p^2}$$
$$= 36 + 46M_p + 16M_p^2 + 2(2 + M_p)(1 + M_p)\left(\frac{1 + 2m_p}{2\sqrt{m_p + m_p^2}} + \sqrt{M_p + M_p^2}\right),$$
which is a little less than 125.

The idea behind the program is that it accepts a value $\delta \in \mathbb{R}_{>0}$, sets each of the variables $\alpha$, $\sigma$, $p$, $\chi$ at $\delta$ away from the edge of $\widetilde{\mathcal{C}}$ and calculates the values of $v$ in a lattice of points, of which any two consecutive ones differ in the values of the variables by $2\delta$. The idea is that $\infty$-balls (cubes) with the centers in the lattice points and radius $\delta$ cover $\mathcal{C}$, so if the values of $v$ in these points are $> L\delta$, then $v$ is positive. This requires two remarks, however.

First, it could happen that even a simpler shape of a cuboid would not be covered by such cubes, if dividing an edge length by $2\delta$ yields a remainder greater than $\delta$. For this reason, in the program we decrease the step of each variable slightly so that the now slightly distorted cubes would exactly cover the cuboid enclosing $\mathcal{C}$ and $\widetilde{\mathcal{C}}$ if we took the lattice points from across the cuboid.
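The Lipschitz coefficient and the step adjustment described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' C++ program: it takes $m_p = 0.5$ and $M_p = 0.96$, evaluates $v$ in the form $p^2(p+1) - \|X - S\|^2 p - \langle(-\sin(\alpha-\sigma), \cos(\alpha-\sigma)), X - S\rangle^2$, and uses hypothetical function names.

```python
import math

# Assumed parameter range [m_p, M_p]; all names here are illustrative only.
P_MIN, P_MAX = 0.5, 0.96

def lipschitz_bound(m=P_MIN, M=P_MAX):
    """Evaluate the closed-form bound L on the 1-norm of the gradient of v."""
    h = (1 + 2*m) / (2 * math.sqrt(m + m*m))
    g = math.sqrt(M + M*M)
    return 36 + 46*M + 16*M*M + 2*(2 + M)*(1 + M)*(h + g)

def lattice(lo, hi, delta):
    """Centers of n equal cells of half-width <= delta exactly covering [lo, hi]."""
    n = max(1, math.ceil((hi - lo) / (2 * delta)))
    half = (hi - lo) / (2 * n)          # slightly decreased step, as in the text
    return [lo + (2*k + 1) * half for k in range(n)], half

def v(alpha, sigma, p, chi):
    """Positive exactly when X lies in the open ellipse centered at S (tau = 1)."""
    xt = math.sqrt(p + p*p)*math.cos(chi) - 2*math.sin(alpha) + math.sin(alpha - sigma)
    xn = -p*math.sin(chi) - 1 + 2*math.cos(alpha) - math.cos(alpha - sigma)
    normal = -math.sin(alpha - sigma)*xt + math.cos(alpha - sigma)*xn
    return p*p*(p + 1) - (xt*xt + xn*xn)*p - normal*normal

L = lipschitz_bound()
centers, half = lattice(0.0, 1.0, 0.3)
```

With these assumed values `L` evaluates to a little over 124. For $\alpha = \sigma = 0$ the point $S$ coincides with $S'$, so $X \in \partial E_p(S')$ gives $v = 0$ up to rounding, which is a convenient consistency check on the formula.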
The second problem is that $\mathcal{C}$ is not actually a cuboid and might not get covered by the almost-cubes if we only took those with the centers in $\mathcal{C}$. However, we claim that the almost-cubes cover $\mathcal{C}$ if we take the centers from $\widetilde{\mathcal{C}}$, as long as $\delta$ is small enough.

Recall Figure 8; since the dependence of the lower bound for $p$ on $\alpha$ is increasing for both $\mathcal{C}$ and $\widetilde{\mathcal{C}}$, it suffices to check that if $(\alpha, \sigma, p, \chi) \in \mathcal{C}$, then $\left(\min\left\{\alpha + \delta, \arctan\left(\sqrt{M_p + M_p^2}\right)\right\}, \sigma, \max\{p - \delta, m_p\}, \chi\right) \in \widetilde{\mathcal{C}}$.

We have $\arctan\left(\sqrt{M_p + M_p^2}\right) \le \arctan(\sqrt{2})$,
$$\left|(\tan^2(\alpha))'\right| = \left|\frac{2\sin(\alpha)}{\cos^3(\alpha)}\right| = \left|2\tan(\alpha)\left(1 + \tan^2(\alpha)\right)\right| \le 6\sqrt{2} \le 9.$$
Hence $\tan^2(\alpha)$ has a Lipschitz coefficient of 9 on the relevant region. Similarly, $p^2$ has a Lipschitz coefficient of 2.

Then
$$\tan^2(\alpha + \delta) \le \tan^2(\alpha) + 9\delta \le p + p^2 + 9\delta \le 2(p - \delta) + (p - \delta)^2 - p + 2\delta + 2\delta + 9\delta = 2(p - \delta) + (p - \delta)^2 - p + 13\delta \le 2(p - \delta) + (p - \delta)^2$$
for $\delta \in \mathbb{R}_{(0, \frac{m_p}{13}]}$. In particular, $\delta \in \mathbb{R}_{(0, 0.}$ […] suffices, also in the cases where we hit the edges at $\arctan\left(\sqrt{M_p + M_p^2}\right)$ and/or $m_p$ (we did not have to be too picky about these particular estimates; the actual value of $\delta$ we run the program with is far smaller, at 0. […]

[…] $\kappa < \sqrt{2p\left(\sqrt{\tau(p + 2\tau)} - \tau\right)} = \sqrt{2p\left(\sqrt{p + 2} - 1\right)}$. The program verifies that if $Y \notin E_p(S')$, then $X \in E_p(S)$, where $S$ is chosen within $\kappa$-distance from $Y$, hence the entire line segment from $X$ to $Y$ is in $E_p(S)$. However, if we allow $\kappa$ to get arbitrarily close to $\sqrt{2p\left(\sqrt{p + 2} - 1\right)}$, then the value of $v$ gets arbitrarily close to zero, and we cannot use our method to prove that $v$ is positive. To avoid this, we decrease the upper bound on the distance between $S$ and $Y$ to $\sqrt{2p\left(\sqrt{p + 2} - 1\right)} - \kappa_{\mathrm{off}}$ for $\kappa_{\mathrm{off}} = 0.55$ (we chose this value experimentally, so that the result of the program was sufficiently good).

After some experimentation, we ran the program with $\delta = 0.$ […] $v$ that the program returned, was 0. […] $v$ has a Lipschitz coefficient of 125. Since any possible configuration is at most $\delta = 0.$ […] $v$, the values that $v$ can take are at most $125 \cdot$ […] 05 smaller than the values calculated by the program. In particular, $v$ is necessarily positive.

The source code of our C++ program is available at http://sarakalisnik.wescreates.wesleyan.edu/ellipsoids.cpp .

The price of this method is that we had to decrease the size of the theoretical interval for the persistence parameter $p \in \mathbb{R}_{(\lambda,\tau)}$, which in particular requires a greater sample density for the proof than is strictly necessary. We discuss this in Section 7.

Let us summarize the results we have obtained in this section. We have seen that if a point $X \in \overline{\mathcal{E}}_p$ is in at least two of the closed ellipsoids, then there exists $S \in \mathcal{S}$ such that $X \in \overline{E}_p(S)$ and $\mathrm{pr}(X) \in E_p(S)$. This happened in one of two ways. The first closed ellipsoid we took $X$ from could already satisfy this property, or we could find an ellipsoid with the center close to $\mathrm{pr}(X)$ which contained both $X$ and $\mathrm{pr}(X)$ in its interior. If we start with $X \in \mathcal{E}_p$ though, we can pick as the first ellipsoid one that has $X$ in its interior, which means that we can always conclude the following.

Corollary 4.1.

For every $X \in \mathcal{E}_p$, if there are $S', S'' \in \mathcal{S}$, $S' \neq S''$, such that $X \in \mathcal{E}_p(S') \cap \mathcal{E}_p(S'')$, then there exists $S \in \mathcal{S}$ such that $X, \mathrm{pr}(X) \in \mathcal{E}_p(S)$. By convexity the entire line segment between $X$ and $\mathrm{pr}(X)$ is therefore in $\mathcal{E}_p(S)$.

In this section we show that under the same assumptions on $\tau$ and $p$ as in the previous section, the union of the open ellipsoids around sample points deformation retracts onto the manifold $\mathcal{M}$.

Informally, the idea of the deformation retraction is as follows. For a point $X$ in an open ellipsoid $\mathcal{E}_p(S)$, consider the closed ellipsoid $\overline{\mathcal{E}}_q(S)$ where $q := q_S(X)$ (Definition 2.7), the boundary of which contains $X$. If the vector $\mathrm{prv}(X)$ points into the interior of this closed ellipsoid, we move in the direction of $\mathrm{prv}(X)$, i.e. we use the normal deformation retraction. Otherwise, we move in the direction of the projection of the vector $\mathrm{prv}(X)$ onto the tangent space $T_X \partial\mathcal{E}_q(S)$. This causes us to slide along the boundary $\partial\mathcal{E}_q(S)$. Either way, we remain within $\overline{\mathcal{E}}_q(S)$ (and therefore within $\mathcal{E}_p$) and eventually reach the manifold $\mathcal{M}$. This procedure is problematic for points which are in more than one ellipsoid, but we can glue together the directions of the deformation retraction with a suitable partition of unity.

To make this work, we will need precise control over the partition of unity, which is the topic of Subsection 5.1. Then in Subsection 5.2 we define the vector field which gives the directions in which we deformation retract. Subsection 5.3 proves that the flow of this vector field has the desired properties. We then use this flow to explicitly give the definition of the requisite deformation retraction in Subsection 5.4.

5.1 The Partition of Unity

For each $S \in \mathcal{S}$ define
$$A_S := \mathcal{E}_p \setminus \bigcup_{S' \in \mathcal{S} \setminus \{S\}} \mathcal{E}_p(S'), \qquad B_S := \mathcal{E}_p \setminus \mathcal{E}_p(S).$$
The sets A S and B S are closed in E p because they are complements within E p of open sets.Note that A S ⊆ E p ( S ) = E p \ B S . In particular, A S and B S are disjoint. Proposition 5.1. If S (cid:48) , S (cid:48)(cid:48) ∈ S and S (cid:48) (cid:54) = S (cid:48)(cid:48) , then A S (cid:48) ⊆ B S (cid:48)(cid:48) and A S (cid:48) ∩ A S (cid:48)(cid:48) = ∅ .Proof. For any X ∈ E p , if X ∈ A S (cid:48) , then X / ∈ E p ( S (cid:48)(cid:48) ), so X ∈ B S (cid:48)(cid:48) . Consequently A S (cid:48) ∩ A S (cid:48)(cid:48) ⊆B S (cid:48)(cid:48) ∩ A S (cid:48)(cid:48) = ∅ .The only way B S could be empty is if the sample S is a singleton which can only happenwhen M is a singleton, but this possibility is excluded by the assumption that the dimensionof M is positive. The distance to any non-empty set is a well-deﬁned real-valued function,the zeroes of which form the closure of the set.Deﬁne (cid:98) A S := (cid:40)(cid:8) X ∈ E p (cid:12)(cid:12) d ( A S , X ) ≤ d ( B S , X ) (cid:9) if A S (cid:54) = ∅ , ∅ if A S = ∅ , (cid:98) B S := (cid:40)(cid:8) X ∈ E p (cid:12)(cid:12) d ( B S , X ) ≤ d ( A S , X ) (cid:9) if A S (cid:54) = ∅ , E p if A S = ∅ .The sets (cid:98) A S and (cid:98) B S are disjoint. If we had X ∈ (cid:98) A S ∩ (cid:98) B S , then d ( A S , X ) ≤ d ( B S , X ) ≤ d ( A S , X ),so d ( A S , X ) = d ( B S , X ) = 0, meaning X ∈ A S ∩ B S , a contradiction. Note also that B S ⊆ (cid:98) B S and A S ⊆ (cid:98) A S ⊆ E p \ (cid:98) B S ⊆ E p \ B S = E p ( S ) . The sets (cid:98) A S and (cid:98) B S are closed in E p because they are (empty or) preimages of R ≥ undercontinuous maps X (cid:55)→ d ( B S , X ) − d ( A S , X ) and X (cid:55)→ d ( A S , X ) − d ( B S , X ). 
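To make the sets $A_S$ and $B_S$ concrete, the following Python sketch checks the inclusions of Proposition 5.1 on a one-dimensional toy cover. It is only an illustration under assumptions that are not part of the paper: the "ellipsoids" are open intervals of a hypothetical radius `r` around three hypothetical sample points, and the check runs over a finite grid of rational points rather than over the whole union.

```python
from fractions import Fraction

# Toy 1-D stand-in (an assumption for illustration only): each "ellipsoid"
# E_p(S) is the open interval (S - r, S + r) around a sample point S.
def in_ellipsoid(x, s, r):
    return abs(x - s) < r

def in_union(x, samples, r):
    return any(in_ellipsoid(x, s, r) for s in samples)

def in_A(x, s, samples, r):
    """x in A_S: x lies in the union but in no ellipsoid other than E_p(S)."""
    return in_union(x, samples, r) and not any(
        in_ellipsoid(x, t, r) for t in samples if t != s)

def in_B(x, s, samples, r):
    """x in B_S: x lies in the union but outside E_p(S)."""
    return in_union(x, samples, r) and not in_ellipsoid(x, s, r)

# Hypothetical sample points and radius; exact rationals avoid rounding issues.
samples = [Fraction(0), Fraction(1), Fraction(2)]
r = Fraction(3, 4)
grid = [Fraction(k, 100) for k in range(-100, 301)]

def check_proposition_5_1():
    """Check A_S within E_p(S), A_S disjoint from B_S, A_S' within B_S''."""
    for x in grid:
        for s1 in samples:
            if not in_A(x, s1, samples, r):
                continue
            if not in_ellipsoid(x, s1, r) or in_B(x, s1, samples, r):
                return False
            for s2 in samples:
                if s2 == s1:
                    continue
                if not in_B(x, s2, samples, r) or in_A(x, s2, samples, r):
                    return False
    return True
```

Calling `check_proposition_5_1()` scans the grid and reports whether any grid point violates one of the inclusions; a finite grid can of course only support the statement, not prove it.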
Using the smooth version of Urysohn's lemma [21], choose a smooth function $f_S\colon \mathcal{E}_p \to \mathbb{R}_{[0,1]}$ such that $f_S$ is constantly 1 on $\widehat{A}_S$ and constantly 0 on $\widehat{B}_S$.

Recall that the support $\mathrm{supp}(f)$ of a continuous real-valued function $f$ is defined as the closure of the complement of the zero set, where both the complementation and the closure are calculated in the domain of $f$.

Proposition 5.2.

For every S ∈ S and X ∈ E p , if X ∈ supp( f S ) , then E p \ (cid:98) B S , A S (cid:54) = ∅ , d ( (cid:98) B S , X ) ≥ d ( A S , X ) and X ∈ E p ( S ) . roof. Since X ∈ supp( f S ), the support of f S is non-empty, so (cid:98) B S (cid:54) = E p , therefore A S (cid:54) = ∅ .The set (cid:8) Y ∈ E p (cid:12)(cid:12) d ( (cid:98) B S , X ) ≥ d ( A S , X ) (cid:9) is closed in E p and contains f S − ( R (0 , ), so itcontains supp( f S ).If X ∈ A S , then X ∈ E p ( S ). If X / ∈ A S , then d ( B S , X ) ≥ d ( (cid:98) B S , X ) ≥ d ( A S , X ) >

0, so again X ∈ E p ( S ). Proposition 5.3.

The supports of functions f S are pairwise disjoint. Hence every pointin E p has a neighbourhood which intersects the support of at most one f S .Proof. Take S (cid:48) , S (cid:48)(cid:48) ∈ S , S (cid:48) (cid:54) = S (cid:48)(cid:48) and let X ∈ f S (cid:48) ∩ f S (cid:48)(cid:48) . Then d ( A S (cid:48) , X ) ≤ d ( B S (cid:48) , X ) ≤ d ( A S (cid:48)(cid:48) , X )and likewise d ( A S (cid:48)(cid:48) , X ) ≤ d ( A S (cid:48) , X ), implying d ( A S (cid:48) , X ) = d ( A S (cid:48)(cid:48) , X ) = 0, so X ∈ A S (cid:48) ∩ A S (cid:48)(cid:48) , acontradiction.Since supp( f S ) ⊆ E p ( S ) for all S ∈ S , the family of supports of f S is also locally ﬁnite. Thusany X ∈ E p has a neighbourhood U ⊆ E p which intersects only ﬁnitely many supports, atmost one of which contains X . The intersection of the complements of the rest with the set U is a neighbourhood of X which intersects at most one support.From these results we can conclude that X (cid:55)→ (cid:80) S ∈S f S ( X ) gives a well-deﬁned smooth map E p → R [0 , . We may therefore deﬁne a smooth map f P : E p ( S ) → R [0 , , f P ( X ) := 1 − (cid:88) S ∈S f S ( X ) . Thus the family of maps f S , S ∈ S , together with f P , forms a smooth partition of unityon E p .We will need two more subsets of E p : V := (cid:8) X ∈ E p (cid:12)(cid:12) ∃ S ∈ S . A S (cid:54) = ∅ ∧ d ( A S , X ) < d ( B S , X ) (cid:9) , W := (cid:8) X ∈ E p (cid:12)(cid:12) ∃ S ∈ S . X ∈ E p ( S ) ∧ pr ( X ) ∈ E p ( S ) (cid:9) . Lemma 5.4.

The sets $V$ and $W$ are open in $\mathcal{E}_p$ and in $\mathbb{R}^n$, and $V \cup W = \mathcal{E}_p$.

Proof. The given sets are open in $\mathcal{E}_p$ since
$$V = \bigcup_{S \in \mathcal{S},\ A_S \neq \emptyset} \big(X \mapsto d(B_S, X) - d(A_S, X)\big)^{-1}(\mathbb{R}_{>0}) \qquad\text{and}\qquad W = \bigcup_{S \in \mathcal{S}} \mathcal{E}_p(S) \cap \mathrm{pr}^{-1}\big(\mathcal{E}_p(S)\big).$$
As $\mathcal{E}_p$ is open in $\mathbb{R}^n$, they are also open in $\mathbb{R}^n$.

Assume that $X \in \mathcal{E}_p \setminus V$. If $X$ was in any $A_S$, we would have $0 = d(A_S, X) \geq d(B_S, X)$, so $X \in A_S \cap B_S$, a contradiction. Since $X$ is in no $A_S$, it must be in at least two $\mathcal{E}_p(S)$, so $X \in W$ by Corollary 4.1.

(Footnote: Recall that a function $f_S$ is defined on $\mathcal{E}_p$, so when calculating its support, we take the closure within $\mathcal{E}_p$. This support is in general not closed in $\mathbb{R}^n$.)

5.2 The Velocity Vector Field

Let us define for each $S \in \mathcal{S}$ the vector field $\widetilde{V}_S\colon \mathcal{E}_p(S) \to \mathbb{R}^n$ as follows. Given $X \in \mathcal{E}_p(S)$, let $H^S_X$ denote the $n$-dimensional closed half-space which is bounded by the hyperplane $T_X \partial\mathcal{E}_{q_S(X)}(S)$ and which contains $\mathcal{E}_{q_S(X)}(S)$. Define $\widetilde{V}_S(X)$ to be the projection of the vector $\mathrm{prv}(X)$ to the closest point in $H^S_X$. Explicitly, if we introduce any orthonormal coordinate system with the origin in $X$ such that the last coordinate axis points orthogonally to $\partial\mathcal{E}_{q_S(X)}(S)$ into the interior of $\mathcal{E}_{q_S(X)}(S)$, then the projection in these coordinates is given by
$$(x_1, \ldots, x_{n-1}, x_n) \mapsto (x_1, \ldots, x_{n-1}, \max\{x_n, 0\}).$$

Proposition 5.5.

The vector ﬁeld (cid:101) V S : E p ( S ) → R n is Lipschitz with a bound on a Lipschitzcoeﬃcient independent from S .Proof. The projection onto a half-space is 1-Lipschitz. By setting τ = 1 and r = p inLemma 3.1, we see that the map prv is ( − p + 1)-Lipschitz on E p ( S ) ⊆ M − p . As thecomposition of these two maps, the vector ﬁeld (cid:101) V S is Lipschitz with the product Lipschitzcoeﬃcient, i.e. also − p + 1.For any S ∈ S and X ∈ E p ( S ) \ M let α S X denote the angle between the vectors prv ( X )and (cid:101) V S ( X ), and let hl S X denote the closed half-line which starts at X , is orthogonal to ∂ E q S ( X ) ( S )and points into the exterior of E q S ( X ) ( S ). Lemma 5.6.

Let S ∈ S and X ∈ E p ( S ) \ M . Then pr ( X ) / ∈ hl S X ; in fact, the angle between prv ( X ) and hl S X is bounded from below by arccot( √ ) . Hence ≤ α S X ≤ arccos (cid:0)(cid:113) (cid:1) ; inparticular cos( α S X ) ≥ (cid:113) .Proof. Let q := q S ( X ). We have q ≤ p < X / ∈ M , in particular X (cid:54) = S , we have q >

0. Let (cid:126)n be the unit vector, orthogonal to the boundary of E q ( S ) and pointing intothe exterior of E q ( S ), so that hl S X = { X + t(cid:126)n | t ∈ R ≥ } . Let (cid:96) := (cid:107) prv ( X ) (cid:107) ; by assumption X / ∈ M , so (cid:96) >

0, and we may deﬁne (cid:126)m := prv ( X ) (cid:96) . Also, since E p ⊆ M τ = M , we have (cid:96) < S which contains X as well as (cid:126)n , hence the whole hl S X . Without loss of generality assume that X lies in the closed ﬁrst quadrant, so that we have χ ∈ R [0 , π ] with X = (cid:0)(cid:112) q + q sin( χ ) , q cos( χ ) (cid:1) (the angle is measured from N S M ).Let us ﬁrst prove that pr ( X ) / ∈ hl S X . Assume to the contrary that this were the case, sothat (cid:126)m = (cid:126)n . We will derive the contradiction by showing that the open τ -ball with thecenter in X − (1 − (cid:96) ) (cid:126)m , associated to M at pr ( X ), intersects all open τ -balls, associated to M at S . Two of those have their centers in the tangent-normal plane we are considering, and34ecessarily one of those is the τ -ball at S which is the furthest away from the τ -ball with thecenter in X − (1 − (cid:96) ) (cid:126)m . It thus suﬃces to check that the latter intersects the former two.First we explicitly calculate (cid:126)m . 
(cid:126)m = (cid:126)n = (cid:0) q sin( χ ) , (cid:112) q + q cos( χ ) (cid:1)(cid:13)(cid:13)(cid:0) q sin( χ ) , (cid:112) q + q cos( χ ) (cid:1)(cid:13)(cid:13) = (cid:0) q sin( χ ) , (cid:112) q + q cos( χ ) (cid:1)(cid:112) q cos ( χ ) + q We derive the contradiction by showing that d (cid:0) X − (1 − (cid:96) ) (cid:126)m, (0 , ± (cid:1) < d (cid:0) X − (1 − (cid:96) ) (cid:126)m, (0 , ± (cid:1) = (cid:13)(cid:13) X − (1 − (cid:96) ) (cid:126)m − (0 , ± (cid:13)(cid:13) = (cid:107) X (cid:107) + (1 − (cid:96) ) + 1 − − (cid:96) ) (cid:104) X , (cid:126)m (cid:105) − (cid:104) X − (1 − (cid:96) ) (cid:126)m, (0 , ± (cid:105) = q sin ( χ ) + q + (1 − (cid:96) ) + 1 − − (cid:96) ) q √ q + q √ q cos ( χ )+ q ∓ (cid:16) q cos( χ ) − (1 − (cid:96) ) √ q + q cos( χ ) √ q cos ( χ )+ q (cid:17) = q (cid:0) − cos ( χ ) (cid:1) + q + (1 − (cid:96) ) + 1 − − (cid:96) ) q √ q √ cos ( χ )+ q ∓ χ ) (cid:16) q − (1 − (cid:96) ) √ q √ cos ( χ )+ q (cid:17) = q (cid:0) − (cid:0) ± cos( χ ) (cid:1) (cid:1) + q + (1 − (cid:96) ) + 1 − − (cid:96) ) (cid:113) q cos ( χ )+ q (cid:0) q ∓ cos( χ ) (cid:1) = (1 + q ) − q (cid:0) ± cos( χ ) (cid:1) + (1 − (cid:96) ) − − (cid:96) ) (cid:113) q cos ( χ )+ q (cid:0) q ∓ cos( χ ) (cid:1) ≤ (1 + q ) − q (cid:0) ± cos( χ ) (cid:1) + 1 − − (cid:96) ) (cid:113) q cos ( χ )+ q (cid:0) q ∓ cos( χ ) (cid:1) We verify that this expression is < q, (cid:96) ∈ R (0 , and χ ∈ R [0 , π ] with the help from Mathematica , see ﬁle

ProjectionNotOnHalfline.nb , available at http://sarakalisnik.wescreates.wesleyan.edu/ProjectionNotOnHalfline.nb .This has shown that (cid:126)m cannot be equal to (cid:126)n because in that case the open τ -ball with thecenter in X − (1 − (cid:96) ) (cid:126)m would intersect all open τ -balls, associated to S . A lower bound on theangle between (cid:126)m and (cid:126)n is therefore the minimal angle, by which we must deviate from (cid:126)n , sothat we no longer have an intersection of the aforementioned balls.Observe that if two balls intersect, the closer their centers are, the greater the angle we mustturn one of them by around a point on its boundary, so that they stop intersecting. Hence,if we try to turn the ball with the center in X − (1 − (cid:96) ) (cid:126)n around the point pr ( X ) so thatit no longer intersects all balls, associated to S , we can get a lower bound on the angle byturning it by a minimal angle so that it no longer intersects the ball, associated to S , whichis furthest away. The center of this furthest ball lies in our planar tangent-normal coordinatesystem in which it has coordinates (0 , − (cid:96) is, the further35 − (1 − (cid:96) ) (cid:126)n is away from (0 , − (cid:96) and can be continuously extended to (cid:96) = 0 (the casewe excluded by the assumption X / ∈ M ). Once we set (cid:96) = 0, this minimal angle is still afunction of q and χ , and its minimum is a lower bound for the angle for any (cid:96) .Calculating this minimumis very complicated however, so we again resort to a computer proof with Mathematica ,see ﬁle

AngleBetweenProjectionAndHalfline.nb at http://sarakalisnik.wescreates.wesleyan.edu/AngleBetweenProjectionAndHalfline.nb .

The desired deformation retraction should flow in the direction of $\widetilde{V}_S$. However, the field $\widetilde{V}_S$ is defined only on a single ellipsoid $\mathcal{E}_p(S)$. Two such vector fields generally do not coincide on the intersection of two (or more) ellipsoids, so we use the partition of unity, constructed in Subsection 5.1, to merge the vector fields $\widetilde{V}_S$ into one.

Define the vector field $\widetilde{V}\colon \mathcal{E}_p \to \mathbb{R}^n$ as
$$\widetilde{V}(X) := f_P(X)\,\mathrm{prv}(X) + \sum_{S \in \mathcal{S}} f_S(X)\,\widetilde{V}_S(X).$$
We understand this definition in the usual sense: this sum has only finitely many non-zero terms at each $X$ (in fact at most two by Proposition 5.3), and outside of the ellipsoid $\mathcal{E}_p(S)$, we take the value of $\widetilde{V}_S$ to be 0.

Corollary 5.7.

1. If S ∈ S and X ∈ E p ( S ) , then (cid:104) (cid:101) V S ( X ) , prv ( X ) (cid:105) = (cid:16)(cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) · cos( α S X ) (cid:17) ≥ (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) .

2. If X ∈ E p , then (cid:104) (cid:101) V ( X ) , prv ( X ) (cid:105) ≥ (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) .In particular, these two scalar products are non-zero outside M . Hence the ﬁelds (cid:101) V S and (cid:101) V have no zeros outside M .Proof.

1. Assume ﬁrst that prv ( X ) points into the half-space bounded by T X ∂ E q S ( X ) ( S ) whichcontains E q S ( X ) ( S ). Then (cid:101) V S ( X ) = prv ( X ) and α S X = 0, so the statement is clear.Otherwise, (cid:101) V S ( X ) is the orthogonal projection of prv ( X ) onto T X ∂ E q ( S ), so (cid:13)(cid:13) (cid:101) V S ( X ) (cid:13)(cid:13) = (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) · cos( α S X ) and (cid:104) (cid:101) V S ( X ) , prv ( X ) (cid:105) = (cid:13)(cid:13) (cid:101) V S ( X ) (cid:13)(cid:13) · (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) · cos( α S X ) = (cid:16)(cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) · cos( α S X ) (cid:17) . ( α S X ) ≥ .2. We have (cid:104) (cid:101) V ( X ) , prv ( X ) (cid:105) = (cid:104) f P ( X ) prv ( X ) + (cid:88) S ∈S f S ( X ) (cid:101) V S ( X ) , prv ( X ) (cid:105) = f P ( X ) (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) + (cid:88) S ∈S f S ( X ) (cid:104) (cid:101) V S ( X ) , prv ( X ) (cid:105)≥ f P ( X ) (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) + (cid:88) S ∈S f S ( X ) (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) ≥ (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) . There is one more problem with taking (cid:101) V as the direction vector ﬁeld of the deformationretraction. The closer X is to the manifold, the shorter the vector prv ( X ), and thus (cid:101) V ( X ), is.If we used (cid:101) V as the velocity vector ﬁeld for the ﬂow, we would need inﬁnite time to reachthe manifold M . If we scale the vector ﬁeld in the way that the distance to the manifolddecreases with speed 1, we are sure to reach the manifold within time 1 which is how oneusually gives a deformation retraction (or more generally any homotopy).Since d ( X , M ) = d ( X , pr ( X )), we need to divide (cid:101) V ( X ) with the length of its projection ontothe vector prv ( X ). 
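The effect of this rescaling can be seen numerically on a toy case. The sketch below assumes that the manifold is the unit circle in the plane and that the direction field is simply prv (no ellipsoids are involved), so it illustrates only the time rescaling and not the construction of the blended field. The function names and the explicit Euler integrator are choices made for this example.

```python
import math

def pr(x):
    """Closest point on the unit circle (the assumed toy manifold); needs x != 0."""
    n = math.hypot(x[0], x[1])
    return (x[0] / n, x[1] / n)

def prv(x):
    """Vector from x to its closest point on the circle."""
    p = pr(x)
    return (p[0] - x[0], p[1] - x[1])

def dist_to_manifold(x):
    v = prv(x)
    return math.hypot(v[0], v[1])

def euler(x, field, total_time, steps):
    """Explicit Euler integration of x' = field(x) over [0, total_time]."""
    h = total_time / steps
    for _ in range(steps):
        v = field(x)
        x = (x[0] + h * v[0], x[1] + h * v[1])
    return x

def unscaled(x):
    # Direction field used as is: the speed shrinks with the distance.
    return prv(x)

def unit_speed(x):
    # Divide by the length of the projection onto prv; here the field is prv
    # itself, so the factor is ||prv|| / <prv, prv>.
    v = prv(x)
    length = math.hypot(v[0], v[1])
    factor = length / (v[0] * v[0] + v[1] * v[1])
    return (factor * v[0], factor * v[1])

start = (1.5, 0.0)   # at distance 0.5 from the circle
slow = dist_to_manifold(euler(start, unscaled, 1.0, 1000))
fast = dist_to_manifold(euler(start, unit_speed, 0.3, 300))
```

In this toy case the unscaled field makes the distance decay roughly like $e^{-t}$, so it never reaches zero in finite time, whereas the rescaled field decreases the distance linearly in $t$.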
Hence the following definition of the vector field $V\colon \mathcal{E}_p \setminus \mathcal{M} \to \mathbb{R}^n$:
$$V(X) := \frac{\|\mathrm{prv}(X)\|}{\langle \widetilde{V}(X), \mathrm{prv}(X) \rangle}\,\widetilde{V}(X).$$
Corollary 5.7 ensures that the vector field $V$ is well defined and that it has the same direction as $\widetilde{V}$.

Proposition 5.8.

For every S ∈ S the ﬁeld (cid:101) V S : E p ( S ) → R n is bounded Lipschitz. The ﬁelds (cid:101) V : E p → R n and V : E p \ M → R n are bounded and locally Lipschitz.Proof. The projection onto a half-space is 1-Lipschitz; since the map prv is bounded in norm(by Lemma 2.8 we have (cid:107) prv ( X ) (cid:107) = d ( X , pr ( X )) = d ( X , M ) ≤ q S ( X ) < p ), the ﬁeld (cid:101) V S is alsobounded. Lemma 3.1 tells us that the map prv is ( − p + 1)-Lipschitz on E p ( S ) ⊆ M − p .As the composition of two Lipschitz maps, the vector ﬁeld (cid:101) V S is Lipschitz with the productLipschitz coeﬃcient, i.e. also − p + 1.Since the norm of the map prv , as well as all (cid:101) V S , has the same bound p , this is also a boundon the norm of (cid:101) V : (cid:13)(cid:13) (cid:101) V ( X ) (cid:13)(cid:13) = (cid:13)(cid:13) f P ( X ) prv ( X ) + (cid:88) S ∈S f S ( X ) (cid:101) V S ( X ) (cid:13)(cid:13) ≤ f P ( X ) (cid:13)(cid:13) prv ( X ) (cid:13)(cid:13) + (cid:88) S ∈S f S ( X ) (cid:13)(cid:13) (cid:101) V S ( X ) (cid:13)(cid:13) ≤ (cid:16) f P ( X ) + (cid:88) S ∈S f S ( X ) (cid:17) p = p. The ﬁeld (cid:101) V is locally Lipschitz by Corollary 2.10.Assume now that X ∈ E p \ M . Recall from Lemma 5.6 that cos( α S X ) ≥ (cid:113) . 
Thus (cid:107) prv ( X ) (cid:107)(cid:104) (cid:101) V ( X ) , prv ( X ) (cid:105) = 1 (cid:104) f P ( X ) prv ( X ) + (cid:80) S ∈S f S ( X ) (cid:101) V S ( X ) , prv ( X ) (cid:107) prv ( X ) (cid:107) (cid:105) = 1 f P ( X ) (cid:107) prv ( X ) (cid:107) + (cid:80) S ∈S f S ( X ) (cid:104) (cid:101) V S ( X ) , prv ( X ) (cid:107) prv ( X ) (cid:107) (cid:105) = 1 f P ( X ) (cid:107) prv ( X ) (cid:107) + (cid:80) S ∈S f S ( X ) (cid:107) (cid:101) V S ( X ) (cid:107) cos( α S X ) ≤ f P ( X ) (cid:107) prv ( X ) (cid:107) + (cid:113) (cid:80) S ∈S f S ( X ) (cid:107) (cid:101) V S ( X ) (cid:107)≤ (cid:113) (cid:13)(cid:13) f P ( X ) prv ( X ) + (cid:80) S ∈S f S ( X ) (cid:101) V S ( X ) (cid:13)(cid:13) = 1 (cid:113) (cid:107) (cid:101) V ( X ) (cid:107) . It follows that the norm of V is bounded by (cid:113) .Let U be a neighbourhood of X , where (cid:101) V is Lipschitz. Let r ∈ R > be such that B r ( X ) ⊆ U and r < d ( M , X ) = (cid:107) prv ( X ) (cid:107) and r < − d ( M , X ). We claim that V is Lipschitz on B r ( X )and therefore locally Lipschitz.By Lemma 3.1 the map prv is Lipschitz on B r ( X ) ⊆ M − d ( M , X ) . The map (cid:107) prv (—) (cid:107) is acomposition of Lipschitz maps and therefore Lipschitz on B r ( X ). Clearly, it is also bounded.Since (cid:101) V is also bounded and Lipschitz on B r ( X ) ⊆ U , so is the scalar product Y (cid:55)→ (cid:104) (cid:101) V ( Y ) , prv ( Y ) (cid:105) by Lemma 2.9. Recall from Corollary 5.7 that (cid:104) (cid:101) V ( Y ) , prv ( Y ) (cid:105) ≥ (cid:13)(cid:13) prv ( Y ) (cid:13)(cid:13) > (cid:0) d ( M , X ) − r (cid:1) > . Hence Lemma 2.9 also tells us that the map Y (cid:55)→ (cid:107) prv ( Y ) (cid:107)(cid:104) (cid:101) V ( Y ) ,prv ( Y ) (cid:105) is bounded Lipschitz on B r ( X ),and then so is its product with (cid:101) V , i.e. the ﬁeld V .The reason we consider the local Lipschitz property is that it allow us to deﬁne the ﬂow ofthe ﬁeld V . 
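The two ingredients of this subsection, the projection onto a half-space and the rescaling that defines the field V, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' program: the helper names are hypothetical, the inward normal of the half-space is passed in by hand instead of being computed from an ellipsoid, and a single projected field stands in for the blended field.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def project_to_half_space(v, inward_normal):
    """Closest point to v in the closed half-space {w : <w, n> >= 0}.

    inward_normal plays the role of the normal to the ellipsoid boundary
    at X that points into the ellipsoid; X is the origin of the half-space.
    """
    length = norm(inward_normal)
    n = [c / length for c in inward_normal]
    s = dot(v, n)
    if s >= 0:
        return list(v)                              # v already points inward
    return [vi - s * ni for vi, ni in zip(v, n)]    # drop the outward part

def rescale(v_tilde, prv):
    """V = ||prv|| / <V~, prv> * V~, defined only when <V~, prv> > 0."""
    inner = dot(v_tilde, prv)
    if inner <= 0:
        raise ValueError("rescaling needs a positive scalar product")
    factor = norm(prv) / inner
    return [factor * c for c in v_tilde]

def speed_towards_manifold(v, prv):
    """Scalar projection of v onto the unit vector prv / ||prv||."""
    return dot(v, prv) / norm(prv)

# Hypothetical data: prv points partly out of the half-space x_3 >= 0.
prv = [1.0, 2.0, -3.0]
inward = [0.0, 0.0, 1.0]
v_tilde = project_to_half_space(prv, inward)
v = rescale(v_tilde, prv)
```

With the last coordinate axis as the inward normal, `project_to_half_space` reduces to replacing the last coordinate by its maximum with 0, as in the coordinate formula above, and `speed_towards_manifold(v, prv)` evaluates the scalar projection that the rescaling is designed to normalize to 1.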
5.3 The Flow of the Vector Field

We will use the flow of the vector field $V$ as part of the definition of the desired deformation retraction. Generally the flow of a vector field need not exist globally, and in our case the whole point is that the flow takes us to the manifold where the vector field is not defined. However, before we can establish what the exact domain of definition for the flow is, we will already need to refer to the flow to prove some of its properties. As such, it will be convenient to treat the flow as a partial function. Also, it is convenient to use Kleene equality $\simeq$ in the context of partial functions: $a \simeq b$ means that $a$ is defined if and only if $b$ is, and if they are defined, they are equal.

The flow of the vector field $V\colon \mathcal{E}_p \setminus \mathcal{M} \to \mathbb{R}^n$ can thus be given as a partial map
$$\Phi\colon (\mathcal{E}_p \setminus \mathcal{M}) \times \mathbb{R}_{\geq 0} \rightharpoonup \mathcal{E}_p \setminus \mathcal{M}$$
which satisfies the following for all $X \in \mathcal{E}_p \setminus \mathcal{M}$ and $t, u \in \mathbb{R}_{\geq 0}$:

1. the domain of definition of $\Phi$ is an open subset of $(\mathcal{E}_p \setminus \mathcal{M}) \times \mathbb{R}_{\geq 0}$,
2. the flow $\Phi$ is continuous everywhere on its domain of definition,
3. if $\Phi(X, u)$ is defined and $t \leq u$, then $\Phi(X, t)$ is defined,
4. $\Phi(X, 0) \simeq X$,
5. $\Phi\big(\Phi(X, t), u\big) \simeq \Phi(X, t + u)$,
6. if $\Phi(X, u)$ is defined, the derivative of the function $\Phi(X, {-})$ exists at $u$, and is equal to $V\big(\Phi(X, u)\big)$.

A standard result [14] tells us that if a vector field is locally Lipschitz, it has a local vector flow. That is, for every $X \in \mathcal{E}_p \setminus \mathcal{M}$ there exists $\epsilon \in \mathbb{R}_{>0}$ such that $\Phi(X, t)$ is defined for all $t \in \mathbb{R}_{[0,\epsilon)}$.

We claim that if we move with the flow $\Phi$ of the vector field $V$, we approach the manifold $\mathcal{M}$ with constant speed.

Lemma 5.9. If $(X, u) \in (\mathcal{E}_p \setminus \mathcal{M}) \times \mathbb{R}_{\geq 0}$ is in the domain of definition of $\Phi$, then
$$d\big(\mathcal{M}, \Phi(X, u)\big) = d(\mathcal{M}, X) - u.$$

Proof.

Consider the functions $\mathbb{R}_{[0,u]} \to \mathbb{R}$, given by $t \mapsto d\big(\mathcal{M}, \Phi(X, t)\big)$ and $t \mapsto d(\mathcal{M}, X) - t$. To show that these two functions are the same (and thus in particular coincide for $t = u$), it suffices to show that they match in one point and have the same derivative.

For $t = 0$, we have $d\big(\mathcal{M}, \Phi(X, 0)\big) = d(\mathcal{M}, X)$. The derivative of the second function is constantly −

1. We calculate the derivative of the ﬁrst function via the chain rule. Take t ∈ R [0 ,u ] and introduce an orthonormal n -dimensional coordinate system with the origin39n Y := Φ( X , t ), such that the ﬁrst coordinate axis points in the direction of prv ( Y ). In thiscoordinate system, the Jacobian matrix of the map d ( M , —) at Y is a matrix row with theﬁrst entry − (cid:104) Φ (cid:48) ( X , t ) , prv ( Y ) (cid:107) prv ( Y ) (cid:107) (cid:105) , i.e. the scalar projection onto the direction prv ( Y ) (cid:107) prv ( Y ) (cid:107) of thederivative of Φ( X , —) at Y .By the chain rule, the derivative of the function t (cid:55)→ d (cid:0) M , Φ( X , t ) (cid:1) is therefore( − · (cid:104) Φ (cid:48) ( X , t ) , prv ( Y ) (cid:107) prv ( Y ) (cid:107) (cid:105) = −(cid:104) V ( Y ) , prv ( Y ) (cid:107) prv ( Y ) (cid:107) (cid:105) = −(cid:104) (cid:107) prv ( Y ) (cid:107)(cid:104) (cid:101) V ( Y ) ,prv ( Y ) (cid:105) (cid:101) V ( Y ) , prv ( Y ) (cid:107) prv ( Y ) (cid:107) (cid:105) = − , as required.The next lemma is a tool which serves as a form of induction for real intervals. Lemma 5.10.

Let a ∈ R ≥ and let I be either the interval R [0 ,a ) or the interval R [0 ,a ] . Let L ⊆ I have the following properties: • L is a lower subset of I (i.e. ∀ t, u ∈ I . u ∈ L ∧ t ≤ u ⇒ t ∈ L ), • ∈ L , • for every t ∈ L t such that u ∈ L , • for every t ∈ I , if R [0 ,t ) ⊆ L , then t ∈ L .Then L = I .Proof. To prove L = I , it suﬃces to show that L is non-empty, open and closed in I since I is connected.Because L contains 0, it is non-empty. Since L is a lower subset of I , the third assumptionon L is equivalent to openness of L , and the fourth assumption is equivalent to closednessof L . Lemma 5.11. An (cid:15) ∈ R (0 ,p ) exists so that for every X ∈ W \ E p − (cid:15) and every S ∈ S , suchthat X , pr ( X ) ∈ E p ( S ) , the vector V ( X ) has the same direction as prv ( X ) .Proof. First we will require that (cid:15) < p − λ . In that case the inequality d ( X , pr ( X )) < p − λ leadsto contradiction X ∈ M p − λ ⊆ E p + λ ⊆ E p − (cid:15) . Thus pr ( X ) / ∈ B p − λ ( X ).Consider the intersection of E p ( S ) with the closed half-space, bounded by the hyperplane,tangent to ∂ E q S ( X ) ( S ), on the side not containing E q S ( X ) ( S ). This intersection contains X . If wetake X arbitrarily close to ∂ E p ( S ) (i.e. we consider q S ( X ) tending towards p ), the intersection40s contained in arbitrarily small balls around X . More explicitly, calculation shows that theintersection is contained in B (cid:114) (1+ p )( p − ( q S ( X ))2) p ( X ).If we choose (cid:15) so that this intersection is contained in B p − λ ( X ), then this intersection cannotcontain pr ( X ), whence (cid:101) V S ( X ) = prv ( X ). From 0 < λ < p < p (3 p − λ + 2 p (2 + λ ))1 + p > p − (cid:115) p (3 p − λ + 2 p (2 + λ ))1 + p > . If we pick any (cid:15) ∈ R (0 , p − λ ) satisfying (cid:15) < p − (cid:113) p (3 p − λ +2 p (2+ λ ))1+ p , then the aforementionedintersection is indeed contained in B p − λ ( X ).By assumption X ∈ W , i.e. 
X and pr ( X ) are in the same open ellipsoid. Recall from Propo-sition 5.3 that, aside from f P , at most one f S is non-zero. Thus the vector (cid:101) V ( X ) is a convexcombination of (cid:101) V S ( X ) and prv ( X ), so it is equal to prv ( X ). Hence V ( X ) has the same directionas prv ( X ) for our choice of (cid:15) . Lemma 5.12. An (cid:15) ∈ R (0 ,p ) exists so that for every X ∈ E p − (cid:15) and every u ∈ R ≥ , for which Φ( X , u ) is deﬁned, we have Φ( X , u ) ∈ E p − (cid:15) .Proof. Take any positive (cid:15) , smaller than the one in Lemma 5.11. Let X ∈ E p − (cid:15) and u ∈ R ≥ ,so that Φ( X , u ) is deﬁned. Let L := (cid:8) t ∈ R [0 ,u ] (cid:12)(cid:12) Φ( X , t ) ∈ E p − (cid:15) (cid:9) . We use Lemma 5.10 to show L = R [0 ,u ] ; this ﬁnishes the proof.Clearly 0 ∈ L and L is a lower set. It is the preimage of E p − (cid:15) under the mapΦ( X , —) : R [0 ,u ] → R n , so it is closed. We only still need to see that for every t ∈ L

The flow $\Phi$ is defined on $D$.

Proof. The flow is defined as long as it remains within the domain of $V$, i.e. $\mathcal{E}_p \setminus \mathcal{M}$. Take any $X \in \mathcal{E}_p \setminus \mathcal{M}$ and define
$$L := \big\{ t \in \mathbb{R}_{[0, d(\mathcal{M}, X))} \;\big|\; \Phi \text{ is defined at } (X, t) \big\}.$$
We verify the properties for $L$ from Lemma 5.10 to get $L = \mathbb{R}_{[0, d(\mathcal{M}, X))}$. The basic properties of the flow tell us that $0 \in L$ and that $L$ is an open lower subset of $\mathbb{R}_{[0, d(\mathcal{M}, X))}$. Take $t \in \mathbb{R}_{[0, d(\mathcal{M}, X))}$ such that $\mathbb{R}_{[0,t)} \subseteq L$. Because the vector field $V$ is bounded (Proposition 5.8), the map $\Phi(X, {-})\colon \mathbb{R}_{[0,t)} \to \mathcal{E}_p \setminus \mathcal{M}$ (of which the field is the derivative) is Lipschitz, in particular uniformly continuous. Hence it has a (uniformly) continuous extension $\mathbb{R}_{[0,t]} \to \overline{\mathcal{E}}_p$ (since $\overline{\mathcal{E}}_p$, as a closed subspace of $\mathbb{R}^n$, is complete). Thus the limit $Y := \lim_{t' \nearrow t} \Phi(X, t')$ exists and is in $\overline{\mathcal{E}}_p$.

We need to show that $Y \in \mathcal{E}_p \setminus \mathcal{M}$. Using Lemma 5.9, we get
$$d(\mathcal{M}, Y) = d\big(\mathcal{M}, \lim_{t' \nearrow t} \Phi(X, t')\big) = \lim_{t' \nearrow t} d\big(\mathcal{M}, \Phi(X, t')\big) = \lim_{t' \nearrow t} \big(d(\mathcal{M}, X) - t'\big) = d(\mathcal{M}, X) - t > 0.$$
Hence $Y \notin \mathcal{M}$.

We also have $Y \in \mathcal{E}_p$. Before the flow could leave $\mathcal{E}_p$, it would have to get arbitrarily close to $\partial\mathcal{E}_p$, which would contradict Lemma 5.12.

We can now define a deformation retraction from $\mathcal{E}_p$ to $\mathcal{M}$. The flow $\Phi$ takes us arbitrarily close to the manifold without actually reaching it, so we will define the deformation retraction in two parts: first from $\mathcal{E}_p$ to a small neighbourhood of $\mathcal{M}$, and then from this neighbourhood to $\mathcal{M}$ itself.

Recall that by assumption $p > \lambda$, and we have
$$\mathcal{M} \subseteq \bigcup_{S \in \mathcal{S}} \mathcal{E}_\lambda(S) \subseteq \bigcup_{S \in \mathcal{S}} \mathcal{E}_p(S).$$
The distance of a point in $\mathcal{E}_\lambda(S)$ to the complement of $\mathcal{E}_p(S)$ is the smallest in a co-vertex of $\mathcal{E}_\lambda(S)$, where it is equal to $p - \lambda$.
Hence M p − λ ⊆ M p − λ ⊆ E p .Denote w := p − λ and deﬁne the map R : E p × R [0 , → E p by R ( X , t ) := (cid:40) Φ (cid:0) X , min { d ( M , X ) − w, t } (cid:1) if X ∈ E p \ M w , X if X ∈ M w .This map is well deﬁned: if d ( M , X ) = w , the two function rules match. Each of them iscontinuous and deﬁned on a domain, closed in E p × R [0 , , so R is continuous on E p × R [0 , .Clearly R is a strong deformation retraction from E p to M w : for X ∈ E p \ M w we have R ( X ,

1) = Φ (cid:0) X , min { d ( M , X ) − w, } (cid:1) = Φ (cid:0) X , d ( M , X ) − w (cid:1) , d (cid:0) M , R ( X , (cid:1) = d ( M , X ) − (cid:0) d ( M , X ) − w (cid:1) = w by Lemma 5.9. Proposition 5.14.

There exists a strong deformation retraction of $\mathcal{E}_p$ to $\mathcal{M}$.

Proof. First use $R$ to strongly deformation retract $\mathcal{E}_p$ to $\mathcal{M}_w$. From here, the usual normal deformation retraction works. Specifically, since $w$ is less than the reach of $\mathcal{M}$, the map $\mathrm{pr}$ is defined on $\mathcal{M}_w$. Hence the map $\mathcal{M}_w \times \mathbb{R}_{[0,1]} \to \mathcal{M}_w$, given by $(X, t) \mapsto (1 - t) \cdot X + t \cdot \mathrm{pr}(X)$, is well defined and a strong deformation retraction from $\mathcal{M}_w$ to $\mathcal{M}$.

Theorem 6.1.

Let n ∈ N and let M be a non-empty properly embedded C -submanifold of R n without boundary. Let M have the same dimension m around every point. Let S ⊆ M bea subset of M , locally ﬁnite in R n (the sample from the manifold M ). Let τ be the reachof M in R n and κ the Hausdorﬀ distance between S and M . Then for all p ∈ R [ m p τ,M p τ ] which satisfy κ < (cid:113) p (cid:0)(cid:112) τ ( p + 2 τ ) − τ (cid:1) − κ oﬀ τ there exists a strong deformation retraction from E p (the union of open ellipsoids aroundsample points) to M . In particular, E p and M are homotopy equivalent, and so have thesame homology.Proof. First consider the case τ = ∞ . Then M is an m -dimensional aﬃne subspace of R n .In that case E p is just M p (the tubular neighbourhood around M of radius p ) which clearlystrongly deformation retracts to M via the normal deformation retraction.A particular case of this is when m = n or when M is a single point. If m = 0, i.e. M is a non-empty locally ﬁnite discrete set of points, and if M has at least two points, thenthe reach is half the distance between two closest points. In this case we necessarily have S = M and the set E p is a union of p -balls around points in M which clearly deformationretracts to M .We now consider the case τ < ∞ and 0 < m < n .All our conditions and results are homogeneous in the sense that they are preserved underuniform scaling. In particular, we may rescale the whole space R n by the factor τ and maythus without loss of generality assume τ = 1. The result now follows from Proposition 5.14. Corollary 6.2.

Let n ∈ N and let M be a non-empty properly embedded C -submanifoldof R n without boundary. Let M have the same dimension m around every point. Let S ⊆ M e a subset of M , locally ﬁnite in R n (the sample from the manifold M ). Let τ be the reachof M in R n and κ the Hausdorﬀ distance between S and M . Then whenever κ τ < (cid:113) M p ( (cid:112) M p − − κ oﬀ ≈ . , there exists p ∈ R > such that M is homotopy equivalent to E p .Proof. The expression (cid:113) p (cid:0)(cid:112) τ ( p + 2 τ ) − τ (cid:1) − κ oﬀ τ is increasing in p . Hence we get therequired result from Theorem 6.1 by taking p = M p τ . As already mentioned in the introduction, the ratio κ τ is a measure of the density of thesample. We want the required sample density to be small, i.e. the ratio κ τ should be as largeas possible. Corollary 6.2 gave us the upper bound κ τ < . κ τ < (cid:113) ≈ . . κ τ < (cid:113) √ − ≈ . , which would be a further improvement of the above result κ τ < .

913 by around a third(more than three times an improvement over the Niyogi, Smale and Weinberger’s result).The only reason we had to settle for the worse result was because to prove Corollary 4.1in Section 4, we used a computer program. We can get closer to the theoretical bound byincreasing M p and decreasing m p and κ oﬀ , in which case the program yields a smaller lowerbound on the values of the calculated function. Hence we would need to run the programwith smaller δ , but since the loops in the program, the number of steps in which is inverselyproportional to δ , are nested four levels deep, dividing δ by some t makes the program runapproximately t -times longer. The parameters, given in Section 4, are what we settled forin this paper in order for the program to complete the calculation in a reasonable amountof time — the program which computes in parallel ran for around 2 . µ -reach and related concepts [1], [11], [13], [8], [23], [15]. A natural44uestion arises whether we can apply these concepts to improve the bounds on a (local)density of a sample when using ellipsoids. In particular, it would be interesting to seewhether we can improve our result by allowing diﬀerently sized ellipsoids around diﬀerentsample points, with the upper bound on the size given in terms of the local feature size (localdistance to the medial axis) or the distance to critical points. Acknowledgements.

We thank Jure Kališnik and Peter Hintz for helpful discussions.

References

[1] Nina Amenta and Marshall Bern. Surface reconstruction by Voronoi filtering. Discrete & Computational Geometry, 22(4):481–504, 1999.

[2] Nina Amenta, Sunghee Choi, Tamal K. Dey, and Naveen Leekha. A simple algorithm for homeomorphic surface reconstruction. International Journal of Computational Geometry & Applications, 12(01n02):125–141, 2002.

[3] Glen E Bredon. Topology and geometry, volume 139. Springer Science & Business Media, 2013.

[4] Paul Breiding, Sara Kalisnik, Bernd Sturmfels, and Madeleine Weinstein. Learning algebraic varieties from samples. Revista Matematica Complutense, 31(3):545–593, September 2018.

[5] Peter Bürgisser, Felipe Cucker, and Pierre Lairez. Computing the homology of basic semialgebraic sets in weak exponential time. J. ACM, 66(1), December 2018.

[6] G. Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46:255–308, 2009.

[7] G. Carlsson and A. J. Zomorodian. Computing persistent homology. Discrete and Computational Geometry, 33:249–274, 2005.

[8] Frédéric Chazal, David Cohen-Steiner, and André Lieutier. A sampling theory for compact sets in euclidean space. Discrete & Computational Geometry, 41(3):461–479, Apr 2009.

[9] Frédéric Chazal, David Cohen-Steiner, André Lieutier, Quentin Mérigot, and Boris Thibert. Inference of Curvature Using Tubular Neighborhoods, pages 133–158. Springer International Publishing, Cham, 2017.

[10] Frédéric Chazal and André Lieutier. The λ-medial axis. Graphical Models, 67(4):304–331, 2005.

[11] Frédéric Chazal and André Lieutier. Weak feature size and persistent homology: computing homology of solids in $\mathbb{R}^n$ from noisy data samples. In Proceedings of the twenty-first annual symposium on Computational geometry, pages 255–262, 2005.

[12] Frédéric Chazal and André Lieutier. Stability and computation of topological invariants of solids in $\mathbb{R}^n$. Discrete & Computational Geometry, 37(4):601–617, 2007.

[13] Frédéric Chazal and André Lieutier. Smooth manifold reconstruction from noisy and non-uniform approximation with guarantees. Computational Geometry, 40(2):156–170, 2008.

[14] Rodney Coleman. Calculus on normed vector spaces. Springer Science & Business Media, 2012.

[15] Tamal K Dey, Zhe Dong, and Yusu Wang. Parameter-free topology inference and sparsification for data on manifolds. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 2733–2747. SIAM, 2017.

[16] Emilie Dufresne, Parker Edwards, Heather Harrington, and Jonathan Hauenstein. Sampling real algebraic varieties for topological data analysis. In , pages 1531–1536. IEEE, 2019.

[17] H. Edelsbrunner, D. Letscher, and A. J. Zomorodian. Topological persistence and simplification. Discrete and Computational Geometry, 28:511–533, 2002.

[18] Herbert Edelsbrunner and Ernst P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graph., 13(1):43–72, January 1994.

[19] Herbert Federer. Curvature measures. Transactions of the American Mathematical Society, 93(3):418–491, 1959.

[20] R. Ghrist. Barcodes: The persistent topology of data. Bulletin of the American Mathematical Society, 45:61–75, 2008.

[21] John M Lee. Smooth manifolds. In Introduction to Smooth Manifolds, pages 1–31. Springer, 2013.

[22] Partha Niyogi, Stephen Smale, and Shmuel Weinberger. Finding the homology of submanifolds with high confidence from random samples. Discrete Comput. Geom., 39(1-3):419–441, 2008.

[23] Katharine Turner. Cone fields and topological sampling in manifolds with bounded curvature. Found. Comput. Math., 13(6):913–933, December 2013.

Authors' addresses:

Sara Kališnik, Wesleyan University [email protected]

Davorin Lešnik, University of Ljubljana [email protected]@fmf.uni-lj.si