From Geometry to Topology: Inverse Theorems for Distributed Persistence
Elchanan Solomon, Department of Mathematics, Duke University, Durham
Alexander Wagner, Department of Mathematics, Duke University, Durham
Paul Bendich, Department of Mathematics, Duke University, and Geometric Data Analytics, Durham
Abstract: What is the “right” topological invariant of a large point cloud $X$? Prior research has focused on estimating the full persistence diagram of $X$, a quantity that is very expensive to compute, unstable to outliers, and far from a sufficient statistic. We therefore propose that the correct invariant is not the persistence diagram of $X$, but rather the collection of persistence diagrams of many small subsets. This invariant, which we call “distributed persistence,” is trivially parallelizable, more stable to outliers, and has a rich inverse theory. The map from the space of point clouds (with the quasi-isometry metric) to the space of distributed persistence invariants (with the Hausdorff-Bottleneck distance) is a global quasi-isometry. This is a much stronger property than simply being injective, as it implies that the inverse of a small neighborhood is a small neighborhood, and is to our knowledge the only result of its kind in the TDA literature. Moreover, the quasi-isometry bounds depend on the size of the subsets taken, so that as the size of these subsets goes from small to large, the invariant interpolates between a purely geometric one and a topological one. Lastly, we note that our inverse results do not actually require considering all subsets of a fixed size (an enormous collection), but a relatively small collection satisfying certain covering properties that arise with high probability when randomly sampling subsets. These theoretical results are complemented by two synthetic experiments demonstrating the use of distributed persistence in practice.
I. INTRODUCTION
Morphometric techniques in data analysis can be loosely divided into the geometric and the topological. Geometric techniques, like landmarks, the Procrustes distance, the Gromov-Hausdorff metric, optimal transport methods, PCA, MDS [Kru64], LLE [RS00], and Isomap [TSL00], are designed to capture some combination of global and local metric structure. Many geometric methods can be solved exactly or approximately via spectral methods, and hence are fast to implement using iterative and sketching algorithms. In contrast, topological techniques, like t-SNE [vdMH08], UMAP [MHM18], Mapper [SMC07], and persistent homology, aim to capture large-scale connectivity structure in data. The growing popularity of t-SNE and UMAP as dimensionality reduction methods suggests that many data sets are topologically, but not metrically, low-dimensional.

The goal of this paper is to introduce a new technique into topological data analysis (TDA) that:
1) Provably interpolates between topological and geometric structure (Theorem V.15).
2) Is trivially parallelizable.
3) Is exactly computable via deterministic and stochastic methods (Porisms V.17 and V.18 and Propositions V.23 and V.25).
4) Is provably stable to perturbation of the data (Proposition V.2).
5) Is provably invertible, with globally stable inverse (Theorems V.9, V.15, V.21, and Porism V.19).
6) Suggests new methods for a host of morphometric challenges, ranging from dimensionality reduction to feature extraction (Section VI).

The theoretical guarantees provided here are, to our knowledge, unmatched by any other method in topological data analysis. They are also unmatched by many spectral methods, which are famously unstable in the presence of a small spectral gap. In addition to these theoretical contributions, we demonstrate our theoretical results empirically on synthetic data sets.

Acknowledgments: The first and third authors were partially supported by the Air Force Office of Scientific Research under the grant “Geometry and Topology for Data Analysis and Fusion”, AFOSR FA9550-18-1-0266. The second author was partially supported by the National Science Foundation under the grant “HDR TRIPODS: Innovations in Data Science: Integrating Stochastic Modeling, Data Representations, and Algorithms”, NSF CCF-1934964.

II. THE DISTRIBUTED TOPOLOGY PROBLEM
Let $\lambda$ be a statistic of finite point clouds in $\mathbb{R}^d$. Let $X$ be an abstract indexing set with an embedding $\psi : X \to \mathbb{R}^d$. For $k \in \mathbb{Z}$, we can define a distributed statistic $\lambda_k$ that maps the labeled point cloud $(X, \psi)$ to the set $\{(S, \lambda(\psi(S))) \mid S \subset X, |S| = k\}$ if $k > 0$ and to $\emptyset$ otherwise. Put another way, $\lambda_k(X, \psi)$ records the values of $\lambda$ on subsets of $\psi(X)$ of a fixed size, together with abstract labels identifying which invariant corresponds to which subset. (It is also possible to do away with these labels, and we will consider this possibility later on in the paper.) For the remainder of this paper, we will omit mentioning the embedding $\psi$, and will refer to $X$ as a point cloud, unless it becomes important to disambiguate between $X$ as an abstract set and $X$ as a set with a fixed embedding.

When the computational complexity of $\lambda$ scales poorly in the size of $X$, the statistic $\lambda_k$ can be easier to compute. Moreover, $\lambda_k$ may contain information not accessible via $\lambda$ itself. We will say that $\lambda$ is $k$-distributed if $\lambda_k(X)$ determines $\lambda(X)$ for any subset $X \subset \mathbb{R}^d$ with $|X| \ge k$. Many common geometric invariants are $k$-distributed:
• Let $\lambda$ send a finite set $X$ to its Euclidean distance matrix. This invariant is $k$-distributed for all $k \ge 2$.
• Let $\lambda$ send a finite set $X$ to its diameter. This invariant is $k$-distributed for all $k \ge 2$.
• Let $\lambda$ send a finite set $X$ to its mean. This invariant is $k$-distributed for all $k \ge 1$.

The primary theoretical goal of this paper is to address the following three questions:

Problem II.1. Which invariants in applied algebraic topology are $k$-distributed for various $k$?

Problem II.2. If $\lambda$ is $k$-distributed, how much additional topological or geometric information does $\lambda_k$ contain, as compared to $\lambda$, and how does this depend on $k$?

Problem II.3. Can $\lambda_k$ be well-approximated, with high probability, using only a small fraction of the total number of subsets of size $k$?

A. Case Study: The Noisy Circle

To illustrate the advantage of working with distributed invariants, we compare three data sets of points. The first is spaced regularly around a circle, the second sampled uniformly from the unit disc, and the third contains points on the circle and points sampled from the disc (we call this the noisy circle), see Figure II.1. For each of these point clouds, we compute their full 1-dimensional persistence diagrams, see Figure II.2. In addition, for each point cloud, we sample subsets of a fixed size, compute the resulting 1-dimensional persistence diagrams, vectorize them as persistence images, and average the results, see Figure II.3.

The persistence diagram of the noisy circle is most similar to that of the disc (in Bottleneck distance), demonstrating that ordinary persistence does not see the circle around which most of the data points are clustered. The distributed persistence, however, tells a different story. The distribution for the noisy circle interpolates between the distributions of the other two spaces, but is substantially closer to that of the circle than the disc.

III. PRIOR WORK ON DISTRIBUTED TOPOLOGY
In [CFL+…], the following procedure is considered: given a metric measure space $(\mathcal{X}, \rho, \mu)$, sample $m$ points and compute the persistence landscape of the associated Vietoris-Rips filtration. This procedure produces a random persistence landscape, $\lambda$, whose distribution is denoted $\Psi^m_\mu$. Repeating this procedure $n$ times and averaging produces the empirical average landscape, an unbiased estimator of the average landscape $\mathbb{E}_{\Psi^m_\mu}[\lambda]$. This approach is similar to the distributed topological statistics considered in this paper, except we consider a collection of topological statistics as a labeled set rather than taking their sum. Though Bubenik [Bub20] gives conditions under which a collection of persistence diagrams may be reconstructed from the average of their corresponding persistence landscapes, such an inverse exists only generically, and is highly unstable.

(Note on persistence images: this is a technique for turning a persistence diagram into a function by placing a Gaussian kernel at each dot in the persistence diagram, with mean and variance varying by location, cf. [AEK+…].)

Fig. II.1: Three point clouds: the circle, the noisy circle, and the disc.
Fig. II.2: The persistence diagrams of our three point clouds, plotted in birth-persistence coordinates.
Fig. II.3: Averaged distributed persistence images of our three spaces. The dominant orange/yellow region is the overlay of the circle (red) distribution and the noisy circle (green) distribution.

The main theorem of [CFL+…] states that if $\mu$ and $\nu$ are two probability measures on the same metric space $(\mathcal{X}, \rho)$, then the sup norm between induced average landscapes is bounded by $m^{1/p} W_{\rho,p}(\mu, \nu)$ for any $p \ge 1$. Similar results were obtained in [BGMP14] for distributions of persistence diagrams of subsamples. In particular, Blumberg et al. showed that the distribution of barcodes with the Prohorov metric is stable with respect to the associated compact metric measure space with the Gromov-Prohorov metric. Both these results are analogous to the stability of the distributed topological statistics given in Proposition V.2.
However, working with labeled collections of distributed topological statistics, we are also able to provide inverse stability results, such as our main Theorem V.15, which states that changes in the metric structure are bounded with respect to changes in the distributed topological statistics.

In [BHPW20], Bubenik et al. consider unit disks, denoted $D_K$, of surfaces of constant curvature $K$ with $K \in [-2, 2]$. Since these spaces are all contractible, their reduced singular homology is trivial and global homology cannot distinguish them. However, the authors prove that the maximum Čech persistence for three points sampled from $D_K$ determines $K$. The authors also successfully apply the same empirical framework of average persistence landscapes from [CFL+…] to $D_K$ for various $K$. The authors in [DGP+16] used average persistence landscapes to provide experimental verification of a known phase transition. Finally, the authors in [MGSH+20] use average persistence landscapes to achieve improved results, compared to standard machine learning algorithms, in disease phenotype prediction based on subject gene expressions.

IV. BACKGROUND
The content of this paper assumes familiarity with the concepts and tools of persistent homology. Interested readers can consult the articles of Carlsson [Car09] and Ghrist [Ghr08] and the textbooks of Edelsbrunner and Harer [EH10] and Oudot [Oud15]. We include the following primer for readers interested in a high-level, non-technical summary.

Persistent homology records the way topology evolves in a parametrized sequence of spaces. To apply persistent homology to a point cloud, a pre-processing step is needed that converts the point cloud into such a sequence. The two classical ways of doing this are called the Rips and Čech filtrations, respectively; the former is much easier to compute than the latter, at the expense of some geometric fidelity. Both consist of inserting simplices into the point cloud at a parameter value equal to the proximity of the associated vertex points. As the sequence of spaces evolves, the addition of certain edges or higher-dimensional simplices changes the homological type of the space; these simplices are called critical. Persistent homology records the parameter values at which critical simplices appear, notes the dimension in which the homology changes, and pairs critical values by matching the critical value at which a new homological feature appears to the critical value at which it disappears. This information is organized into a data structure called a persistence diagram, and there are a number of metrics with which persistence diagrams can be compared.

If one forgets about the pairing and retains only the dimension information of the critical values, the resulting invariant is called a Betti curve. Betti curves are simpler to compute and work with than persistence diagrams, but are less informative and harder to compare. Finally, if one also drops the dimension information by taking the alternating sum of the Betti curves, one gets an Euler curve.
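As an illustration of the Euler curve, the following is a minimal Python sketch that evaluates the Euler characteristic of a Rips complex at a few scales. It relies on the standard fact that the alternating sum of Betti numbers equals the alternating count of simplices, so no homology computation is needed. The function names `rips_euler_characteristic` and `rips_euler_curve` are illustrative and not taken from the paper, and the sketch assumes the convention that an edge enters the filtration at a scale equal to its length. It enumerates all subsets, so it is only meant for very small point clouds.

```python
from itertools import combinations
from math import dist


def rips_euler_characteristic(points, r):
    """Euler characteristic of the Rips complex on `points` at scale r.

    Assumed convention: a simplex is present when all of its pairwise
    distances are at most r.
    """
    n = len(points)
    chi = 0
    for size in range(1, n + 1):
        for simplex in combinations(range(n), size):
            if all(dist(points[i], points[j]) <= r
                   for i, j in combinations(simplex, 2)):
                # A simplex on `size` vertices has dimension size - 1.
                chi += (-1) ** (size - 1)
    return chi


def rips_euler_curve(points, scales):
    """Sample the Euler curve at the given scales."""
    return [rips_euler_characteristic(points, r) for r in scales]


# Vertices of a unit square: four components, then a cycle, then a filled simplex.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
print(rips_euler_curve(square, [0.5, 1.0, 1.5]))  # [4, 0, 1]
```

At scale 1.0 the four sides are present but neither diagonal is, so the complex is a cycle with Betti numbers (1, 1) and Euler characteristic 0.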
Euler curves are even less discriminative than Betti curves, but enjoy the special symmetry properties of the Euler characteristic. These symmetries will be put to good use in this paper.

Persistence theory guarantees that a small modification to the parametrization of a sequence of spaces implies only small changes in its persistence diagram. To be precise, if the appearance time of any given simplex is not delayed or advanced by more than $\epsilon$, the persistence diagram as a whole is not distorted by more than $\epsilon$ in the appropriate metric (called the Bottleneck distance). Throughout this paper we will use the trick of modifying filtrations by rounding their critical values to a fixed, discrete set.

As a rule, the map sending a point cloud to its persistence diagram is not injective, as many different point clouds share the same persistence diagram. Moreover, the set of point clouds sharing a common persistence diagram need not be bounded, so that arbitrarily distinct point clouds might have the same persistence. There are a number of constructions in the TDA literature that attempt to correct this lack of injectivity by constructing more sophisticated invariants; these are often called topological transforms. Examples include the Persistent Homology Transform [TMB14] and Intrinsic Persistent Homology Transform [OS17]; consult [OS20] for a survey of inverse results in persistence. These methods are largely infeasible to compute exactly, unstable, and provide no global Lipschitz bounds on their inverse, so two wildly different spaces may produce arbitrarily similar (though not exactly identical) transforms. The distributed topology invariant studied in this paper is injective, practically computable, stable, and has a Lipschitz inverse.

V. THEORETICAL RESULTS
In what follows, we let $\lambda$ be any of the following four topological invariants:
• Rips Persistence (RP).
• Rips Euler Curve (RE).
• Čech Persistence (CP).
• Čech Euler Curve (CE).
To be precise, RP and CP consist of persistence diagrams for every homological degree. When working with either of these invariants, the Bottleneck or Wasserstein distance is the maximum of the Bottleneck or Wasserstein distances over all degrees.

A. Stability

A result of the following form is standard in the TDA literature, and demonstrates the ease of producing stable invariants using persistent homology.
Definition V.1. Let $(X, d_X)$ and $(Y, d_Y)$ be metric spaces. A map $\phi : (X, d_X) \to (Y, d_Y)$ is an $\epsilon$-quasi-isometry if $|d_X(x_1, x_2) - d_Y(\phi(x_1), \phi(x_2))| \le \epsilon$ for all $x_1, x_2 \in X$.

Proposition V.2. Let $\phi : (X, d_X) \to (Y, d_Y)$ be an $\epsilon$-quasi-isometry of metric spaces. Then for all subsets $S \subseteq X$, and $\lambda$ either RP or CP, $d_B(\lambda(S), \lambda(\phi(S))) \le \epsilon$, where $d_B$ is the Bottleneck distance on persistence diagrams.

Proof. This follows immediately from the Gromov-Hausdorff stability theorem for persistence diagrams of point clouds [CDSGO16, CSEH07].

B. k-Distributivity

In this section, we show how many distributed invariants suffice to determine the isometry type of a point cloud. This provides an answer to Problem II.1. To help motivate this result, we consider the simple cases of $k = 2$ and $k = 3$.

Lemma V.3.
All of our $\lambda$ are 2-distributed. Moreover, the knowledge of $\lambda_2$ determines the isometry type of $X$.

Proof. Regardless of the invariant used, it is possible to read off the distances between any pair of points in $X$. This determines the embedding of $X$ up to rigid isometry (see [Sin08]), and hence the Rips and Čech filtrations.

Setting $k = 3$ is sufficient to break the implication of an isometry.

Lemma V.4. $\lambda_3$ does not determine the isometry type of $X$.

Proof. A simple counterexample suffices. Let $X$ consist of the vertices of an obtuse triangle with angle $\theta > \pi/2$. Varying the angle $\theta$ in $(\pi/2, \pi)$, while holding the two adjacent side lengths fixed, alters the isometry type of $X$, but leaves its topology unchanged.

To obtain stronger results, we introduce the following two generalizations, one to the notion of distributivity, and the other to the invariants $\lambda$.

Definition V.5. We say that $\lambda$ is $(k_1, k_2, \dots, k_r)$-distributed if $\lambda_{k_1}$ through $\lambda_{k_r}$, taken together, determine $\lambda$.

Definition V.6. For any of our four invariants $\lambda$, let $\lambda^m$ be the modified invariant restricted to the $m$-skeleton of the Rips or Čech complex. In other words, they are persistence invariants of filtrations whose top simplices have dimension $m$.

Setting $m = 0$ provides information only on the cardinality of $X$. The 1-skeleton contains both geometric and topological information, and its persistence is fast to compute. As $m$ increases, computational complexity goes up, and the resulting invariants record higher-dimensional topological information. The following lemma demonstrates how knowing sufficiently many Euler characteristic invariants allows one to determine new ones.

Lemma V.7.
Let $\lambda$ be RE or CE. For any point cloud $X$ and $k \ge m + 2$, $\{\lambda^m_k, \lambda^m_{k-1}, \dots, \lambda^m_{k-m-1}\}$ determine $\lambda^m_{k-m-2}$.

Proof. Let $Y \subset X$ be a subset of size $(k - m - 2)$. Let $\{x_1, \dots, x_{m+2}\}$ be points in $X \setminus Y$, set $W = Y \cup \{x_1, \dots, x_{m+2}\}$ and $Y_i = W \setminus \{x_i\}$. Then $|W| = k$ and $|Y_i| = (k - 1)$ for all $i$. Note that every subset of size $(m + 1)$ in $W$ is contained in some $Y_i$. Thus if we write $K_m(W)$ to denote the $m$-skeleton of the full simplex on $W$, we have $K_m(W) = \bigcup_i K_m(Y_i)$, and the same equality holds true when the full simplex is replaced with the Rips or Čech complex at a fixed scale $r$. Note that in general, $K_m(S) \cap K_m(T) = K_m(S \cap T)$ for any subsets $S, T \subset X$, but the same equality does not hold with intersections replaced with unions, as there may be simplices in $K_m(S \cup T)$ whose set of vertices are not contained in either $S$ or $T$. This explains why we take all $Y_1, \dots, Y_{m+2}$ to cover $W$.

Let us now apply the inclusion-exclusion property of the Euler characteristic to compare the Euler characteristic of $W$ (at a given scale $r$) with those of the $Y_i$.

$$\chi(W^r) = \chi\Big(\bigcup_i Y_i^r\Big) = \sum_i \chi(Y_i^r) - \sum_{i<j} \chi(Y_i^r \cap Y_j^r) + \cdots + (-1)^{m+1} \chi\Big(\bigcap_i Y_i^r\Big).$$

The intersection of any $s$ of the $Y_i$ is a subset of size $k - s$, and the full intersection is $Y$. Every term other than $\chi(Y^r)$ is therefore the Euler characteristic of a subset of size between $k - m - 1$ and $k$, so $\chi(Y^r)$ is determined.

Corollary V.8. Let $\lambda$ be RE or CE. For any point cloud $X$ and $k \ge m + 2$, $\{\lambda^m_k, \lambda^m_{k-1}, \dots, \lambda^m_{k-m-1}\}$ determine $\lambda^m_2$.

Proof. Lemma V.7 shows that $\{\lambda^m_k, \lambda^m_{k-1}, \dots, \lambda^m_{k-m-1}\}$ determines $\lambda^m_{k-m-2}$. By the same logic, $\{\lambda^m_{k-1}, \lambda^m_{k-2}, \dots, \lambda^m_{k-m-2}\}$ determines $\lambda^m_{k-m-3}$. Repeating this argument, we can deduce $\lambda^m_2$.

Leveraging Lemma V.7, we prove that all of our persistence invariants are appropriately distributed.

Theorem V.9. For any of the four invariants $\lambda$, the $m$-skeleton invariant $\lambda^m$ is $(k, k-1, \dots, k-m-1)$-distributed for all $k \ge m + 1 \ge 2$. Moreover, $\{\lambda^m_k, \lambda^m_{k-1}, \dots, \lambda^m_{k-m-1}\}$ determine the isometry type of $X$.

Proof. When $m \ge 1$, the $m$-skeleton contains all edges in $X$, so Lemma V.3 still applies. If the set $\{k, k-1, k-2, \dots, k-m-1\}$ contains 2, this follows from Lemma V.3. Otherwise, let us assume $\lambda$ is either RE or CE, as RP or CP contain strictly more information than their Euler characteristic counterparts. By Corollary V.8, we can determine $\lambda^m_2$ and then apply Lemma V.3.

[Figure V.1: a diagram of $W$ ($k = 5$), its subsets of sizes $k = 4$ and $k = 3$, and $Y$ ($k = 2$), each annotated with its Euler characteristic, illustrating $-1 = (0 + 1 + 0) - (1 + 2 + 0) + 1$.]
Fig. V.1: Our goal is to deduce the Euler Characteristic (at a fixed scale $r$) of $Y$, a 1-simplex consisting of $k = 2$ points. This can be derived from the Euler Characteristics of the other subcomplexes in the diagram above.

Remark V.10. Note that $m = 1$ is sufficient to apply the prior theorem. As $m$ gets larger, more topological information is needed to determine the isometry type of the underlying space.

C. Approximate Distributivity

We now consider what happens if two point clouds have distributed invariants which are similar but not identical. We show that this implies a quasi-isometry between $X$ and $Y$, with constant depending quadratically on the subset size parameter $k$. This provides a precise answer to Problem II.2 on how the distributed statistic interpolates between geometry and topology.

The key insight in the proof of this result is that there is always a way to modify the Rips or Čech filtrations on $X$ and $Y$ to force their distributed invariants to coincide exactly. Taken together with the telescoping trick of Corollary V.8, this modified invariant must agree for all subsets of size two. Persistence stability allows us to assert that the modified invariant and the original persistence invariant are a bounded distance apart, so equality of the modified invariant gives near-equality of the Rips or Čech persistences on subsets of size two, which is nothing more than pairwise distance data.

The proposed modification to our filtration consists of rounding it to a discrete set of values.
The following technical lemma shows how to pick a rounding set $R$ that aligns two sets of points without moving any point more than a bounded amount.

Lemma V.11 (Rounding Lemma). Let $P = \{p_1 \le p_2 \le \cdots \le p_N\}$ and $Q = \{q_1, q_2, \dots, q_N\}$ be two sets of real numbers. Define $d_i = |p_i - q_i|$, let $\epsilon = \max_i d_i$ and $\delta = \sum_{i=1}^{N} d_i$. Then there exists a subset $R \subset \mathbb{R}$ and a map $\pi : P \cup Q \to R$ sending a point $x$ to the unique closest element in $R$ (rounding up at midpoints), with:
(1) $\pi(p_i) = \pi(q_i)$ for all $i$.
(2) $|\pi(x) - x| \le 3\epsilon + 4\delta$.
In particular, since $\epsilon \le \delta$, we can replace (2) with
(2*) $|\pi(x) - x| \le 7\delta$.

Proof. The proof is a recursive construction. The first step is to add $p_1$ to $R$. We then repeat the following argument, iterating through $P$. Consider $p_n$, and let $r^*$ be the largest element of $R$ so far. If $p_n < r^* + 2\epsilon + 4\delta$, skip $p_n$. Otherwise, initialize $r_n = p_n$, and iterate over all $i < n$ and check that $p_i > (r_n + r^*)/2$ iff $q_i > (r_n + r^*)/2$. Every time an index $i$ is found for which this condition is violated, increment $r_n \leftarrow r_n + 2d_i$. The effect of this incrementation is to force both $q_i$ and $p_i$ to be strictly closer to $r^*$ than they are to $r_n$. This condition can be violated at most once for each $p_i$, hence the total sum of the incrementation is at most $2\delta$, at the end of which $r_n$ is added to $R$.

Let us see why the resulting set $R$ satisfies (1) and (2). If $r_n$ was added to $R$, then it is at most $2\delta$ from $p_n$ and $2\delta + \epsilon$ from $q_n$, whereas $|r^* - p_n| \ge 2\epsilon + 4\delta$ and $|r^* - q_n| \ge \epsilon + 4\delta$ by the triangle inequality. Thus $\pi(q_n) = \pi(p_n) = r_n$. For $i < n$, the recursive incrementation ensures $\pi(p_i) = r_n$ if and only if $\pi(q_i) = r_n$, and otherwise the value of $\pi$ on $(p_i, q_i)$ is unchanged. Thus (1) is preserved. To check (2), note that if $\pi(p_i) = \pi(q_i) = r_n$ for $i < n$, then $p_i$ and $q_i$ are closer to $r_n$ than any other element in $R$. By recursive hypothesis, this distance is at most $3\epsilon + 4\delta$, so $|p_i - r_n|$ and $|q_i - r_n| \le 3\epsilon + 4\delta$.

If, on the other hand, no point was added to $R$, then $p_n < r^* + 2\epsilon + 4\delta$. Let $p^* \in P$ be the point corresponding to $r^*$. Since $r^* + 2\epsilon + 4\delta > p_n \ge p^* \ge r^* - 2\delta$, we know $|p_n - r^*| \le 2\epsilon + 4\delta$ and $|q_n - r^*| \le |q_n - p_n| + |p_n - r^*| \le 3\epsilon + 4\delta$. If we can show that $\pi(p_n) = r^*$ and $\pi(q_n) = r^*$, the proof will be complete. If $p_n \ge r^*$ then it is clear that $\pi(p_n) = r^*$, and similarly, if $q_n \ge r^*$, we have $\pi(q_n) = r^*$. Thus we need to consider what happens if $p_n$ or $q_n$ are strictly less than $r^*$.

Let $r^{**} < r^*$ be the penultimate point in $R$. Our goal is to show that $p_n$ or $q_n$ are closer to $r^*$ than they are to $r^{**}$. Recall the point $p^* \in P$ corresponding to $r^*$. Since $p^* \le p_n$ and $|r^* - p^*| \le 2\delta$, we know that $p_n \ge r^* - 2\delta$ and $q_n \ge r^* - 2\delta - \epsilon$. Thus if $p_n$ or $q_n$ are strictly less than $r^*$, they are no further than $2\delta$ and $2\delta + \epsilon$ away, respectively. However, since $|r^* - r^{**}| \ge 2\epsilon + 4\delta$, the triangle inequality implies that $|p_n - r^{**}| \ge 2\epsilon + 2\delta$ and $|q_n - r^{**}| \ge \epsilon + 2\delta$. Thus, if $p_n$ or $q_n$ are smaller than $r^*$, they must still round up to $r^*$, and not to $r^{**}$ or any other element of $R$.

Corollary V.12. We can extend the set $R$ in the Rounding Lemma to a $7\delta$-dense subset $R' \subset \mathbb{R}$, without changing $\pi$ on $P \cup Q$. All that is necessary is to enrich $R$ by adding points in $\big(\bigcup_{r \in R} N(r, 7\delta)\big)^C$.

With our rounding trick in hand, we can now prove the central result of this section, Theorem V.15. The following pieces of notation clarify the statement and proof of the theorem:

Definition V.13. Let $m < k$ be natural numbers. We define the following partial sum of binomial coefficients:

$$S(k, m) = \binom{k}{2} + \binom{k}{3} + \cdots + \binom{k}{m+1}$$

Definition V.14.
Let $(K, f)$ be a filtered simplicial complex, i.e. a simplicial complex $K$ with a real-valued function $f : K \to \mathbb{R}$ encoding the appearance times of simplices. Given a subset $R \subset \mathbb{R}$, rounding this filtration to $R$ consists of post-composing $f$ with the map sending every element of $\mathbb{R}$ to its nearest element in $R$ (rounding up at midpoints). Thus, the simplices in the rounded filtration appear only at values contained in $R$. The effect of rounding on the resulting persistence diagrams is to round the birth and death times of its constituent dots; no new points are introduced.

Theorem V.15. Let $\lambda$ be either RP or CP, and take $k > m > 0$. Let $\phi : X \to Y$ be a bijection such that for all $S \subseteq X$ with $|S| \in \{k, k-1, \dots, k-m-1\}$, $d_B(\lambda^m(S), \lambda^m(\phi(S))) \le \epsilon$. If $\lambda$ is RP, $\phi$ is a $112\,k^2\epsilon$ quasi-isometry, and if $\lambda$ is CP, $\phi$ is a $112\,S(k, m)\,k^{m+1}\epsilon$ quasi-isometry.

Proof. Let $(x_1, x_2)$ be an edge in $X$, and let $(y_1, y_2)$ be the corresponding edge in $Y$. Let $S \subseteq X$ be a subset of size $k$ containing $(x_1, x_2)$. Let $A(S)$ be the set of appearance times of simplices in the $m$-skeleton of $S$, and define $A(\phi(S))$ similarly. Apply the Rounding Lemma to the following set of pairs:

$$\{(l, l + 2\epsilon), (l, l - 2\epsilon) \mid l \in A(S) \cup A(\phi(S))\}$$

In the language of the hypotheses of the Rounding Lemma, we have $\delta = \sum d_i = 4\epsilon|A(S)| + 4\epsilon|A(\phi(S))|$. Let $R$ be the subset given by the Rounding Lemma and its corollary, and let $\lambda_R$ denote the invariant $\lambda^m$ with filtration rounded to $R$. Note that if $S' \subset S$ has the property that $d_B(\lambda^m(S'), \lambda^m(\phi(S'))) \le \epsilon$, then $\lambda_R(S') = \lambda_R(\phi(S'))$. To see why this is the case, let $p = (a, b) \in \lambda^m(S') \cup \Delta$ and $p' = (a', b') \in \lambda^m(\phi(S')) \cup \Delta$ be dots paired in an optimal Bottleneck matching, where $\Delta$ is the diagonal.

Let us first assume that $p$ is on the diagonal, so that $|b' - a'| \le 2\epsilon$. If $p'$ is also on the diagonal, then both $p$ and $p'$ remain on the diagonal after rounding to $R$ (or, indeed, rounding to any set of values). If $p'$ is not on the diagonal, $a', b' \in A(\phi(S))$; since $|b' - a'| \le 2\epsilon$, $a'$ and $b'$ are rounded to the same point in $R$, and hence the point $(a', b')$ is rounded to the diagonal. If $p$ is not on the diagonal, then $a, b \in A(S)$, and since $a' \in [a - \epsilon, a + \epsilon]$ and $b' \in [b - \epsilon, b + \epsilon]$, we can conclude that $a$ and $a'$ round to the same point in $R$, and the same is true for $b$ and $b'$. In any case, the points $p$ and $p'$ become identical after rounding to $R$. Thus, using $\lambda_R$, $\phi$ preserves persistence diagrams of all subsets of $S$ of size $k$ through $k - m - 1$, and hence, by Corollary V.8, all subsets of size two, in particular $(x_1, x_2)$. Thus, $\lambda_R((x_1, x_2)) = \lambda_R((y_1, y_2))$. As $R$ is $(4 \times 7\epsilon|A(S)|) + (4 \times 7\epsilon|A(\phi(S))|)$-dense in $\mathbb{R}$, persistence stability implies that $\lambda^m$ and $\lambda_R$ are within $28\epsilon(|A(S)| + |A(\phi(S))|)$ of each other in Bottleneck distance. The triangle inequality then tells us that $d_B(\lambda^m((x_1, x_2)), \lambda^m((y_1, y_2))) \le 56\epsilon(|A(S)| + |A(\phi(S))|)$, which is equivalent to $\big|\|x_1 - x_2\| - \|y_1 - y_2\|\big| \le 56\epsilon(|A(S)| + |A(\phi(S))|)$. To conclude the proof, note that for the Rips complex, $|A(S)|, |A(\phi(S))| \le \binom{k}{2} = \frac{k^2 - k}{2} \le k^2$, as all appearance times of simplices are just pairwise distances between points. For the Čech complex, there may be a total of $S(k, m)$ distinct appearance times in $A(S)$ or $A(\phi(S))$, one for each simplex of dimension between 1 and $m$, that need to be rounded correctly (all dimension zero simplices necessarily appear at height zero).

Remark V.16. Theorem V.15 answers Problem II.2 by showing that smaller values of $k$ give more control of quasi-isometry type than larger values. This justifies our claim that distributed topology interpolates between local geometry and global topology.

Moving on to Problem II.3, the following two porisms, resulting from the proof of Theorem V.15, show that our inverse results do not require checking all subsets with cardinality $k$ through $k - m - 1$, but a much smaller collection that covers the space $X$ in the right way. Subsection V-E bounds the number of randomly selected subsets needed to produce such a covering with high probability.

Porism V.17. The results of Theorem V.15 do not require $\phi$ to preserve the topology for all subsets $S$ with $|S| \in \{k, k-1, \dots, k-m-1\}$. Rather, it suffices to consider a collection $C$ of subsets of $X$ with the following properties:
• (Covering property) For every subset $\sigma$ of $X$ with $|\sigma| \le 2$, there is a subset $S \in C$ containing $\sigma$ with $|S| = k$.
• (Closure property) If $S \in C$ has $|S| = k$, and $S' \subset S$ has $|S'| \ge k - m - 1$, then $S' \in C$.
This requires checking many fewer subsets of $X$ than the full count $\binom{|X|}{k} + \binom{|X|}{k-1} + \cdots + \binom{|X|}{k-m-1}$. One can often check even fewer subsets by replacing the covering property with a $\delta$-dense version:
• ($\delta$-dense covering property) There exists a subset $X' \subseteq X$ with $|X'| \ge k$, such that $X'$ is $\delta$-dense in $X$ and $\phi(X')$ is $\delta$-dense in $Y$, and such that for every subset $\sigma$ of $X'$ with $|\sigma| = 2$, there is a subset $S \in C$ containing $\sigma$ with $|S| = k$.
(Note, regarding the use of Corollary V.8 in the proof of Theorem V.15: for subsets of size two, Euler curves and persistence diagrams contain identical information.)

The resulting bound is not in the quasi-isometry distance but in the Gromov-Hausdorff distance.

Porism V.18. Let $\lambda$ be either RP or CP, and take $k > m > 0$. Let $\phi : X \to Y$ be a bijection between metric spaces, and let $C$ be a collection of subsets of cardinality between $k$ and $k - m - 1$ that satisfies both the $\delta$-dense covering property and the closure property. Suppose that $d_B(\lambda^m(S), \lambda^m(\phi(S))) \le \epsilon$ for all $S \in C$. If $\lambda$ is RP, then $d_{GH}(X, Y) \le 112\,k^2\epsilon + 2\delta$, and if $\lambda$ is CP, then $d_{GH}(X, Y) \le 112\,S(k, m)\,k^{m+1}\epsilon + 2\delta$.

Proof. The proof of Theorem V.15 implies that $\phi$ is a quasi-isometry from $X'$ to $\phi(X')$. We can extend this to a Gromov-Hausdorff matching between $X$ and $Y$, and two applications of the triangle inequality increase the bound by $2\delta$.

Porism V.19. If $X \subset \mathbb{R}^d$ and $Y \subset \mathbb{R}^d$, then the quasi-isometry bound for Čech persistence in the prior theorem can be replaced with:

$$112\,k^2\left(\epsilon + \sqrt{\frac{2d}{d+1}} + \sqrt{\frac{2d}{d+1}}\right)$$

Note that the added terms sum at most to $2\sqrt{2}$, so that this bound is better than the bound given in Porism V.18 for non-infinitesimal $\epsilon$, but does fail to go to $0$ as $\epsilon \to 0$.

Proof. The Rips and Čech persistence of point clouds in $\mathbb{R}^d$ are always within $\sqrt{\frac{2d}{d+1}}$ of one another in the bottleneck distance, cf. Theorem 2.5 in [DSG07]. The result then follows by replacing Čech persistence with Rips persistence and using the triangle inequality.

D. Topology + Sparse Geometry

Our goal now is to improve the results of the prior section by giving quasi-isometry bounds that scale linearly in $k$, rather than quadratically. This can be accomplished by using an inclusion-exclusion argument on the 1-skeleton persistence of $X$ that uses only subsets of size $k$ and $(k - 1)$.
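As a numerical sanity check of this inclusion-exclusion idea, the sketch below works with the 1-skeleton only, where the Euler characteristic at scale $r$ is the number of vertices minus the number of edges of length at most $r$. It takes a set `Y` of size $k - 2$ and two extra points `a` and `b`, forms `Y1`, `Y2` of size $k - 1$ and `W` of size $k$, and recovers the Euler characteristic of `Y` once the length of the single edge joining `a` and `b` is known. The point coordinates and the names `graph_euler_characteristic` and `recovered_chi_Y` are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations
from math import dist


def graph_euler_characteristic(points, r):
    """Euler characteristic of the 1-skeleton at scale r: vertices minus edges."""
    edges = sum(1 for p, q in combinations(points, 2) if dist(p, q) <= r)
    return len(points) - edges


Y = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # |Y| = k - 2 with k = 5
a, b = (2.0, 0.0), (0.0, 2.0)             # the two points of W \ Y
Y1, Y2, W = Y + [a], Y + [b], Y + [a, b]  # Y = Y1 ∩ Y2 and W = Y1 ∪ Y2


def recovered_chi_Y(r):
    """Deduce chi(Y) from chi(Y1), chi(Y2), chi(W) and the length of edge (a, b)."""
    # The edge (a, b) lies in W but in neither Y1 nor Y2; its contribution is
    # a step function supported on [|a - b|, infinity).
    extra_edge = 1 if dist(a, b) <= r else 0
    chi = graph_euler_characteristic
    return chi(Y1, r) + chi(Y2, r) - chi(W, r) - extra_edge


for r in [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]:
    assert recovered_chi_Y(r) == graph_euler_characteristic(Y, r)
```

Without the `extra_edge` correction the identity fails at every scale at or above the distance between `a` and `b`, which is the deficit described in the text.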
Namely, given a subset $Y_0 \subset X$ with $|Y_0| = (k - 2)$, we take $Y_0 = Y_1 \cap Y_2$ for $|Y_1| = |Y_2| = (k - 1)$ and $W = (Y_1 \cup Y_2)$ with $|W| = k$, as shown in Figure V.2, and attempt to deduce the Euler characteristic of $Y_0$ from those of $Y_1$, $Y_2$, and $W$. However, the union of the 1-skeleton complexes on $Y_1$ and $Y_2$ is not the 1-skeleton complex on $W$, owing to the fact that $W$ contains an extra edge connecting the pair of vertices in $W \setminus Y_0$. The effect of this extra edge on persistence is quite subtle, but its effect on the Euler curve is trivial, as it amounts to subtracting a step function supported on $[r, \infty)$, where $r$ is the appearance time of the extra edge in the complex. If we knew $r$, we could correct the deficit in our inclusion-exclusion argument. Note that we have the freedom to choose $Y_1$ and $Y_2$ as we like, so to make this argument work we need only know the length of a single edge in $X$ that does not intersect $Y_0$. A very small collection of edge lengths suffices to patch up the inclusion-exclusion argument for all subsets of $X$ of size at most $k$. Before proving our quasi-isometry bound, we need the following corollary of the Rounding Lemma.

Fig. V.2: Our goal is to deduce the Euler characteristic (at a fixed scale $r$) of $Y_0$, a subcomplex of size $k = 3$, using subcomplexes of size $k = 4$ and $k = 5$. However, the inclusion-exclusion argument fails because the union of the complexes of $Y_1$ and $Y_2$ is not the complex on $W = Y_1 \cup Y_2$, and the missing edge is shown in red.

Lemma V.20. Given $A_1, \cdots, A_n$ and $B_1, \cdots, B_n$ persistence diagrams, with $W(A_i, B_i) \le \delta$, there exists a $n\delta$-dense subset $R \subset \mathbb{R}$ such that rounding all the persistence diagrams to the grid $R \times R$ forces $\pi(A_i) = \pi(B_i)$ for all $i$.
Proof. This is a straightforward application of the Rounding Lemma. We take the set $P$ to consist of all the birth and death times of all the dots in the $A_i$, and construct $Q$ from the $B_i$ similarly.
As each $(A_i, B_i)$ pair contributes two sets of points, births and deaths, the total $\ell^1$ norm of pairing $P$ with $Q$ is $2 \times n\delta = 2n\delta$. By Corollary V.12, one can find a subset $R$ of density $n\delta$ which ensures $\pi(p_i) = \pi(q_i)$ for all matched pairs $p_i \in P$, $q_i \in Q$, and hence $\pi(A_i) = \pi(B_i)$ for all $i$.

Theorem V.21. Let $\lambda$ be either RP or CP, and take $k > m = 1$. Let $\phi : X \to Y$ be a bijection such that for all $S \subseteq X$ with $|S| \in \{k, k - 1\}$, $W(\lambda(S), \lambda(\phi(S))) \le \epsilon_1$. Suppose further that there is a subset $X' \subset X$ of size (k − ) with
\[ \sum_{(x_i, x_j) \in X' \times X'} \bigl|\, \|x_i - x_j\| - \|\phi(x_i) - \phi(x_j)\| \,\bigr| \le \epsilon_2. \]
Then $\phi$ is a k + 1)$\epsilon_1$ + 28$\epsilon_2$ quasi-isometry.
Proof. Let $x_1, x_2$ be a pair of points in $X$. Without loss of generality, we can assume that at least one of these points is not in $X'$, as the proof is otherwise trivial. Thus, we can extend $x_1, x_2$ to a subset $S$ of size $k$ by adding points in $X'$. $S$ has $k$ subsets of size $(k - 1)$. The prior lemma tells us that we can find a k + 1)$\epsilon_1$-dense subset $R \subset \mathbb{R}$ such that $\lambda_R(S) = \lambda_R(\phi(S))$, and $\lambda_R(S') = \lambda_R(\phi(S'))$ for any subset $S' \subset S$ with $|S'| = (k - 1)$. We can further demand from the Rounding Lemma that the appearance time of every edge in $X'$ and every edge in $\phi(X')$ be exactly the same, where $R$ will now be k + 1)$\epsilon_1$ + 14$\epsilon_2$ dense in $\mathbb{R}$.

Now, for any subset $S' \subset S$ containing $(x_1, x_2)$ with size $|S'| = k - 2$, the set $S \setminus S'$ consists of a pair of points $(p_1, p_2) \in X'$. We then know that $\lambda_R(S') = \lambda_R(\phi(S'))$ by using an inclusion-exclusion calculation with $S' \cup p_1$, $S' \cup p_2$, and $S' \cup p_1 \cup p_2$, since the missing term in the inclusion-exclusion formula is exactly the same for both $X$ and $Y$, after rounding to $R$.
This argument can be iterated on the entire sublattice of $S$ consisting of those subsets $S' \subset S$ with $|S'| \le k - 2$ and which contain $(x_1, x_2)$. The proof concludes by an identical stability analysis to that of Theorem V.15.

Remark V.22. The above proof does not require all pairwise distances in $X'$, as the inclusion-exclusion trick can be carried out with O ( k ) intersections, rather than the full sublattice of O ( k ) intersections. We have omitted this analysis as it obfuscates the statement of the theorem and does not significantly improve it.

E. Probabilistic Results

Porisms V.17 and V.18 tell us that we do not need to sample all $\binom{|X|}{k} + \binom{|X|}{k-1} + \cdots + \binom{|X|}{k-m-1}$ subsets $S \subseteq X$ of size $|S| \in \{k, \cdots, k - m - 1\}$, so long as the collection $C$ of subsets considered satisfies appropriate cover and closure properties. The goal of this section is to give bounds on the probability that a randomly chosen collection of subsets of size $k$ has the covering property. The closure property can then be ensured by adding subsets of the appropriate cardinalities.

Proposition V.23. Let $X$ be a set of size $n$, and choose $M$ subsets $\{S_1, \cdots, S_M\}$ of size $k$ by uniform sampling without replacement. Let $p \le k$ and $A$ be the outcome that every set of $p$ points $(x_1, \cdots, x_p)$ is contained in at least one $S_i$. Then
\[ P(A) \ge 1 - \binom{n}{p} \left( 1 - \left( \frac{k - p + 1}{n - p + 1} \right)^p \right)^M. \]
Proof.
\[ P(A) = 1 - P(\exists\, (x_1, \cdots, x_p) \text{ not in any } S_i) \tag{1} \]
\[ \ge 1 - \sum_{(x_1, \cdots, x_p)} P((x_1, \cdots, x_p) \text{ not in any } S_i) \tag{2} \]
\[ = 1 - \binom{n}{p} P((x_1, \cdots, x_p) \text{ not in any } S_i) \tag{3} \]
\[ = 1 - \binom{n}{p} \prod_{i=1}^{M} P((x_1, \cdots, x_p) \text{ not in } S_i) \tag{4} \]
\[ = 1 - \binom{n}{p} \prod_{i=1}^{M} \bigl(1 - P((x_1, \cdots, x_p) \subseteq S_i)\bigr) \tag{5} \]
An elementary counting argument provides:
\[ P((x_1, \cdots, x_p) \subseteq S_i) = \frac{\binom{n-p}{k-p}}{\binom{n}{k}} \]
Note further that:
\[ \frac{\binom{n-p}{k-p}}{\binom{n}{k}} = \frac{k(k-1)(k-2) \cdots (k-p+1)}{n(n-1)(n-2) \cdots (n-p+1)} \ge \left( \frac{k-p+1}{n-p+1} \right)^p \]
Finally, observe that the effect of replacing $P((x_1, \cdots, x_p) \subseteq S_i)$ with $\left( \frac{k-p+1}{n-p+1} \right)^p$ is to decrease the value of (5), and so the result is proved.

Proposition V.24. Let $A$ be as in the prior proposition. For any $\epsilon \in (0, 1)$, if
\[ M \ge \left( p \log\left( \frac{ne}{p} \right) - \log(1 - \epsilon) \right) \left( \frac{n-p+1}{k-p+1} \right)^p \]
then $P(A) \ge \epsilon$.
Proof.
Our goal is to have:
\[ \epsilon \le 1 - \binom{n}{p} \left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right)^M \]
which is equivalent to
\[ \binom{n}{p} \left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right)^M \le 1 - \epsilon \]
Taking the log of both sides gives
\[ \log \binom{n}{p} + M \log\left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right) \le \log(1 - \epsilon) \]
Solving for $M$, and noting that dividing by the negative quantity $\log\left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right)$ reverses the inequality, gives:
\[ M \ge \frac{\log(1 - \epsilon) - \log \binom{n}{p}}{\log\left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right)} \tag{6} \]
The denominator on the right-hand side of (6) is negative, so using the bound $\binom{n}{p} < \left( \frac{ne}{p} \right)^p$, we can replace (6) with the strictly stronger inequality:
\[ M \ge \frac{\log(1 - \epsilon) - p \log \frac{ne}{p}}{\log\left( 1 - \left( \frac{k-p+1}{n-p+1} \right)^p \right)} \tag{7} \]
We can then apply the inequality $0 \ge -x \ge \log(1 - x)$ for $x \in (0, 1)$, and so replace (7) with the stronger inequality,
\[ M \ge \frac{\log(1 - \epsilon) - p \log \frac{ne}{p}}{-\left( \frac{k-p+1}{n-p+1} \right)^p} \tag{8} \]
The result then follows via simple algebra.

The following proposition can be used to bound the probability that a collection $C$ is a $\delta$-dense covering.

Proposition V.25. Suppose that the set $X$ has a probability measure $\mu$ and can be covered by $s$ subsets $\{X_1, \cdots, X_s\}$ with measure $\mu(X_i) \ge 1/s$. Choose $\{S_1, \cdots, S_M\}$ subsets of size $k$ according to $\mu$. Let $A$ be the outcome that for every collection of $p$ subsets $\{X_{i_1}, \cdots, X_{i_p}\}$, there exists some $S_i$ such that $S_i \cap X_{i_j} \neq \emptyset$ for all $j$. Then
\[ P(A) \ge 1 - \binom{s}{p} \left( 1 - \left( \frac{k-p+1}{s-p+1} \right)^p \right)^M \]
Proof. Construct the set $\tilde{X}$ whose points are the sets $\{[X_1], \cdots, [X_s]\}$. A subset $S \subseteq X$ maps to a subset $\tilde{S} \subseteq \tilde{X}$ in the following way: $\tilde{S}$ contains $[X_i]$ if $S \cap X_i \neq \emptyset$. It is evident that the outcome $A$ is equivalent to the condition that any $\{[X_{i_1}], \cdots, [X_{i_p}]\}$ is contained in some $\tilde{S}_i$.
Let $B$ be the same outcome, with a different sampling procedure: instead of randomly picking subsets $S \subset X$ and constructing $\tilde{S}$, pick subsets $\tilde{S}$ uniformly in $\tilde{X}$ directly. It is clear that $P(A) \ge P(B)$, because $\mu(X_i) \ge 1/s$ means that the likelihood of $\tilde{S}$ containing $[X_i]$ is higher for the first sampling procedure than the second. But Proposition V.23 implies that
\[ P(B) \ge 1 - \binom{s}{p} \left( 1 - \left( \frac{k-p+1}{s-p+1} \right)^p \right)^M \]

Let us explain how to produce such a measure $\mu$. Given $\phi : X \to Y$, we define $d_\phi(x_1, x_2) = \max\{\|x_1 - x_2\|, \|\phi(x_1) - \phi(x_2)\|\}$. Using furthest point sampling, we can produce a subset $\{x_1, \cdots, x_s\}$ of $X$ that is $\delta$-dense in $d_\phi$ for some $\delta$, and let $X_i = N(x_i, \delta)$. We define $\mu$ on $X$ via the following mixed sampling procedure: we randomly pick a subset $X_i$ and then uniformly sample its elements. The resulting measure $\mu$ satisfies the hypotheses of the prior proposition, and a $\delta$-dense covering $C$ can be obtained with high probability by sampling i.i.d. from $\mu$.

VI. APPLICATIONS

Let us return to viewing $X$ as an abstract set, and $\psi : X \to \mathbb{R}^d$ an embedding that turns $X$ into a point cloud. The distributed topology $\lambda_k$ of $X$, as we defined it, is $\{(S, \lambda(\psi(S))) \mid S \subset X, |S| = k\}$. It is often also necessary to consider the un-labeled invariant $\{\lambda(\psi(S)) \mid S \subset X, |S| = k\}$, particularly in situations when distributed persistence is a feature extraction method. As we list some applications of distributed persistence below, we will take care to identify if the invariant needed is labeled or unlabeled.
• (Dimensionality Reduction) When the target dimension of $\psi : X \to \mathbb{R}^d$ is too high, we may wish to learn a lower-dimensional embedding $\pi : X \to \mathbb{R}^{d'}$.
We can force $\pi$ to preserve the topological structure of $\psi$ by minimizing the following sum over $\{S \subset X \mid |S| = k\}$:
\[ \sum_S d_B(\lambda(\psi(S)), \lambda(\pi(S))) \]
This application uses labeled distributed topology.
• (Shape Registration) Given two embedded point clouds $X$ and $Y$ modeling the same shape, it can be of interest to learn a map $f : X \to Y$ aligning corresponding points. This can be accomplished by having $f$ minimize the following sum over $\{S \subset X \mid |S| = k\}$:
\[ \sum_S d_B(\lambda(S), \lambda(f(S))) \]
This application uses labeled distributed topology.
• (Feature Extraction) Given an embedded point cloud $X$, we can consider the unlabeled set $\{\lambda(\psi(S)) \mid S \subset X, |S| = k\}$ as a bag-of-features invariant. These features can be vectorized, averaged, transformed into a measure, and in any other way summarized, before being fed into a standard supervised or unsupervised machine learning pipeline.

VII. EXPERIMENTS

Suppose $X$ and $Y$ are finite subsets of Euclidean spaces and $\phi : X \to Y$ is a bijection between them. Theorem V.15 shows that we may test if $\phi$ is a quasi-isometry by evaluating $d_B(\lambda^m(S), \lambda^m(\phi(S)))$ for a certain collection of subsets $S \subseteq X$. If $X$ is fixed and $Y$ is variable, we can minimize $d_B(\lambda^m(S), \lambda^m(\phi(S)))$ thanks to the differentiability of persistence computations; this has the effect of bringing $Y$ closer in alignment with $X$. Moreover, Porisms V.17 and V.18 and the probabilistic results in Section V-E show that correcting a relatively small number of subsets $S \subseteq X$ is likely to force a quasi-isometry.

In the following two synthetic experiments, we follow the methodology described above for X as (1) points evenly distributed on a circle in R and (2) points evenly distributed on a torus in R . The codomain $Y$ is initialized to be $X$ with independent Gaussian noise added coordinate-wise.
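The initialization just described can be sketched as follows for the circle case. This is only an illustration under stated assumptions: the number of points, the radius, the noise scale, and the choice of a planar circle are placeholder values, not the settings used in the experiments, and `max_distortion` is a hypothetical helper that measures the largest discrepancy between pairwise distances in $X$ and in $Y$ under the identity labeling.

```python
import math
import random
from itertools import combinations


def circle_points(n, radius=1.0):
    """Return n points evenly spaced on a planar circle (assumed setup)."""
    return [
        (radius * math.cos(2 * math.pi * i / n),
         radius * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]


def add_gaussian_noise(points, sigma, rng):
    """Add independent Gaussian noise to every coordinate of every point."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]


def max_distortion(X, Y):
    """Largest | ||x_i - x_j|| - ||y_i - y_j|| | over all index pairs."""
    return max(
        abs(math.dist(X[i], X[j]) - math.dist(Y[i], Y[j]))
        for i, j in combinations(range(len(X)), 2)
    )


rng = random.Random(0)           # fixed seed for repeatability
X = circle_points(64)            # placeholder point count
Y = add_gaussian_noise(X, 0.5, rng)  # placeholder noise scale
initial_distortion = max_distortion(X, Y)
```

Tracking a quantity like `initial_distortion` over the course of the optimization is one simple way to see whether the labeled pairwise distances of $Y$ are approaching those of $X$.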
Our aim is to see whether minimizing a distributed topological functional via gradient descent succeeds in correcting for the large geometric distortion of adding Gaussian noise. In both cases, every iteration step consists of uniformly sampling $k = 25$ points, denoted $S$, from $X$ and taking a step (i.e. perturbing $Y$) to minimize the loss $W(D_0(S), D_0(\phi(S))) + W(D_1(S), D_1(\phi(S)))$, where $D_i$ is the degree $i$ persistence diagram of the Rips filtration. Because we are updating $Y$ based on only a single sample $S$, we use the Adam optimizer [KB14] to benefit from momentum. The first (resp. second) row in Figure VII.1 shows the initial state of $Y$, $Y$ after e (resp. e ) iterations, and $Y$ after e (resp. e ) iterations. For both experiments, we observe the codomain space $Y$ re-organizing itself to closely resemble $X$. The coloring of the points in Figure VII.1 denotes their labeling in $X$, so that nearby points have similar colors. The fact that the color gradients in the final positions of $Y$ are largely continuous affirms that our optimization fixes not only the global geometry of $Y$, but also the labeled pairwise distances, and hence gives a space quasi-isometric to $X$.

VIII. CONCLUSION

It has long been understood that computational complexity and sensitivity to outliers are major challenges in the application of persistent homology in data analysis. Moreover, the lack of a stable inverse makes it very hard to say which geometric information is retained in a persistence diagram, and which is forgotten. Multiple lines of research have sought to address these problems by constructing more sophisticated topological invariants and tools, such as the persistent homology transform, multiparameter persistence, distributed persistence calculations [ZXG+19] and discrete Morse theory. However, any gains in invertibility are compromised by sizeable increases in computational complexity.

Fig. VII.1: Synthetic optimization experiments.
Columns correspond to initial, intermediate, and final positions of $Y$. Color denotes labelling.

The focus of this paper was the simplest scheme for speeding up persistence calculations: subsampling. Subsampling and bootstrapping are ubiquitous in machine learning and are already being applied in topological data analysis. What we have shown is that this simple approach also enjoys uniquely strong theoretical guarantees. In particular, the manner in which distributed persistence interpolates between geometry and topology is explicitly given by quadratic bounds. Moreover, these theoretical guarantees are complemented by the success that subsampling has seen in the TDA literature, and the robust synthetic experiments shown above.

There remain a number of outstanding problems, both theoretical and computational, that would complement the results of this paper and facilitate its practical application.
• Distributed persistence, as we have defined it, depends on an alignment of two data sets. In practice, we use it as an unlabeled bag of features. What injectivity results can be obtained in this unstructured setting?
• Individual persistence diagrams can be challenging to work with, due to the fact that the space of diagrams admits no Hilbert space structure [CB19, BW20, Wag19], though there are a number of effective vectorizations in the literature. How can these be extended or adapted to provide vectorizations of sets of persistence diagrams coming from subsamples of a fixed point cloud? This is a more structured problem than working with arbitrary collections of persistence diagrams.
• If we are interested in recovering the global topology of $X$ rather than its quasi-isometry or Gromov-Hausdorff type, it suffices to estimate pairwise distances between points in adjacent Voronoi cells, at least when working with the full Rips or Čech complex and not a skeleton. A careful analysis of this setting could dramatically decrease the Lipschitz constants appearing in Theorem V.15.
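As a rough numerical companion to the subsampling guarantees of Section V-E, the bounds of Propositions V.23 and V.24 can be evaluated directly. The sketch below assumes the forms $P(A) \ge 1 - \binom{n}{p}\bigl(1 - (\tfrac{k-p+1}{n-p+1})^p\bigr)^M$ and $M \ge \bigl(p \log(ne/p) - \log(1-\epsilon)\bigr)\bigl(\tfrac{n-p+1}{k-p+1}\bigr)^p$; the function names and the example values of $n$, $k$, $p$, and $\epsilon$ are illustrative choices, not values from the experiments.

```python
import math


def coverage_lower_bound(n, k, p, M):
    """Lower bound on the probability that M random size-k subsets of an
    n-point set jointly contain every set of p points (Prop. V.23 form)."""
    q = ((k - p + 1) / (n - p + 1)) ** p
    return 1 - math.comb(n, p) * (1 - q) ** M


def sufficient_samples(n, k, p, eps):
    """Number of subsets that suffices for coverage probability at least
    eps (Prop. V.24 form), rounded up to an integer."""
    factor = ((n - p + 1) / (k - p + 1)) ** p
    return math.ceil((p * math.log(n * math.e / p) - math.log(1 - eps)) * factor)


# Hypothetical example: pairs (p = 2) in a 100-point cloud, subsets of size 25.
n, k, p, eps = 100, 25, 2, 0.95
M = sufficient_samples(n, k, p, eps)
bound = coverage_lower_bound(n, k, p, M)
```

Because the sufficient sample count is derived by weakening the exact bound twice, the value of `bound` at the returned `M` will generally exceed `eps` by a comfortable margin, so fewer subsets may suffice in practice.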
REFERENCES

[AEK+17] Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier, Persistence images: A stable vector representation of persistent homology, The Journal of Machine Learning Research (2017), no. 1, 218–252.
Robust statistics, hypothesis testing, and confidence intervals for persistent homology on metric measure spaces, Foundations of Computational Mathematics (2014), no. 4, 745–789.
Persistent homology detects curvature, Inverse Problems (2020jan), no. 2, 025008.
The persistence landscape and some of its properties, Topological data analysis, 2020, pp. 97–117.
[BW20] Embeddings of persistence diagrams into Hilbert spaces, Journal of Applied and Computational Topology (2020), no. 3, 339–351.
Topology and data, Bulletin of the American Mathematical Society (2009), no. 2, 255–308.
[CB19] On the metric distortion of embedding persistence diagrams into separable Hilbert spaces, 35th International Symposium on Computational Geometry, 2019, pp. Art. No. 21, 15. MR3968607
The structure and stability of persistence modules, Springer, 2016.
Subsampling methods for persistent homology, Proceedings of the 32nd international conference on machine learning, 201507, pp. 2143–2151.
[CSEH07] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer, Stability of persistence diagrams, Discrete & computational geometry (2007), no. 1, 103–120.
+ 16] Irene Donato, Matteo Gori, Marco Pettini, Giovanni Petri, Sarah De Nigris, Roberto Franzosi, and Francesco Vaccarino, Persistent homology analysis of phase transitions, Phys. Rev. E (2016May), 052138.
[DSG07] Coverage in sensor networks via persistent homology, Algebraic & Geometric Topology (2007), no. 1, 339–358.
Computational topology: an introduction (2010).
Barcodes: the persistent topology of data, Bulletin of the American Mathematical Society (2008), no. 1, 61–75.
[KB14] Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
[Kru64] Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis, Psychometrika (1964), no. 1, 1–27.
+ 20] Sayan Mandal, Aldo Guzmán-Sáenz, Niina Haiminen, Saugata Basu, and Laxmi Parida, A topological data analysis approach on predicting phenotypes from gene expression data, Algorithms for computational biology, 2020, pp. 178–187.
[MHM18] Umap: Uniform manifold approximation and projection for dimension reduction, arXiv preprint arXiv:1802.03426 (2018).
Barcode embeddings for metric graphs, arXiv preprint arXiv:1712.03630 (2017).
Inverse problems in topological persistence, Topological data analysis, 2020, pp. 405–433.
Persistence theory: from quiver representations to data analysis, Vol. 209, American Mathematical Society Providence, 2015.
[RS00] Nonlinear dimensionality reduction by locally linear embedding, Science (2000), no. 5500, 2323–2326, available at https://science.sciencemag.org/content/290/5500/2323.full.pdf.
A remark on global positioning from local distances, Proceedings of the National Academy of Sciences (2008), no. 28, 9507–9511.
[SMC07] Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, Eurographics symposium on point-based graphics, 2007.
Persistent homology transform for modeling shapes and surfaces, Information and Inference: A Journal of the IMA (2014), no. 4, 310–344.
[TSL00] A global geometric framework for nonlinear dimensionality reduction, Science (2000), no. 5500, 2319–2323, available at https://science.sciencemag.org/content/290/5500/2319.full.pdf.
[vdMH08] Visualizing data using t-sne, Journal of Machine Learning Research (2008), no. 86, 2579–2605.
[Wag19] Nonembeddability of Persistence Diagrams with p > Wasserstein Metric, arXiv e-prints (October 2019), arXiv:1910.13935, available at 1910.13935.
[ZXG+19] Simon Zhang, Mengbai Xiao, Chengxin Guo, Liang Geng, Hao Wang, and Xiaodong Zhang, Hypha: A framework based on separation of parallelisms to accelerate persistent homology matrix reduction, Proceedings of the acm international conference on supercomputing, 2019, pp. 69–81.