Death and extended persistence in computational algebraic topology

Abstract

The main aim of this paper is to explore the ideas of persistent homology and extended persistent homology, and their stability theorems, using ideas from [Bubenik and Scott, 2014; Cohen-Steiner, Edelsbrunner, and Harer, 2007; and Cohen-Steiner, Edelsbrunner, and Harer, 2009], as well as other sources. The secondary aim is to explore the homology (and cohomology) of non-orientable surfaces, using the Klein bottle as an example. We also use the Klein bottle as an example for the computation of (extended) persistent homology, referring to it throughout the paper.



Timothy Hosgood
September 6, 2016

Contents

Conclusions
References
A Cellular homology and Poincaré duality
A.1 CW complexes and cellular homology
A.2 Orientability and Poincaré duality

“Our birth is nothing but our death begun.” – Edward Young, Night Thoughts

The main aim of this paper is to explore the ideas of persistent homology and extended persistent homology, and their stability theorems, using ideas from [3, 2, 1], as well as other sources. The secondary aim is to explore the homology (and cohomology) of non-orientable surfaces, using the Klein bottle as an example. We also use the Klein bottle as an example for the computation of (extended) persistent homology, referring to it throughout the paper. There are numerous diagrams and sketches, as well as small computational examples, in the hope that the topological nature of this subject doesn’t get lost amidst the algebra.

A lot of consideration has been given to ensuring that this paper is as self-contained as possible (without being overly long) whilst still mentioning other (recent) papers, since many of the ideas found in this paper are relatively modern. In particular, and as has always been the case with algebraic topology, the subject is leaning more and more towards category-theoretic language: many ideas that haven’t been around for very long are already being rephrased in new ways. We try to place equal emphasis on both approaches, drawing inspiration from [3] for the topological view, and [1] for the category-theoretic view. In a sense, this paper aims to be an addendum to [4], which is a brilliant survey of persistent homology, in light of some of the results from [1].

Unless otherwise stated, we adopt the following conventions and notation:

• all (co)homology groups have coefficients in $\mathbb{Z}/2\mathbb{Z}$ (we write ‘(co)homology’ to mean ‘homology and cohomology’);
• for a group $G$ we write $G^n$ to mean $\bigoplus_{i=1}^{n} G$;
• $D^n = \{x \in \mathbb{R}^n : \|x\| \leqslant 1\}$ is the closed $n$-ball or $n$-disc;
• $(D^n)^\circ = \{x \in \mathbb{R}^n : \|x\| < 1\}$ is the open $n$-ball or $n$-disc;
• $S^{n-1} = \{x \in \mathbb{R}^n : \|x\| = 1\} = \partial(D^n)$ is the $(n-1)$-sphere;
• we (sometimes) write $\mathbb{Z}_n$ to mean $\mathbb{Z}/n\mathbb{Z}$;
• we write $G\langle x_1, x_2, \ldots, x_n \rangle$ to mean the group $G^n$ with basis $x_1, x_2, \ldots, x_n$;
• for a category $\mathcal{C}$ we write $x \in \mathcal{C}$ to mean that $x \in \mathrm{ob}(\mathcal{C})$ is an object of $\mathcal{C}$;
• when we say ‘an interval $I \subseteq \mathbb{R}$’ we mean any interval, i.e. open, closed, half-open half-closed, or even infinite;
• if $\eta : F \Rightarrow G$ is a natural transformation between functors then we write $\eta(x)$ to mean the constituent morphism $\eta(x) : F(x) \to G(x)$;
• if we write $a \geqslant 0$ then we mean, in particular, that $a \in \mathbb{R}$;
• we write $\{*\}$ to mean a singleton (a set with one element);
• $0 \in \mathbb{N}$.

We assume that the reader has a knowledge of some of the fundamental notions in algebraic topology, namely: simplicial- and $\Delta$-complexes, and simplicial and singular homology (both relative and absolute). All of these topics are covered in [6, §2.1]. In particular, we use the following theorem.

Theorem 1.2.1

Let $X$ be a $\Delta$-complex. Then the $k$-th simplicial homology group is isomorphic to the $k$-th singular homology group, i.e. $H_k(X) \cong H^\Delta_k(X)$.

Proof. This is a specific case (where $A = \varnothing$) of [6, Theorem 2.27, §2.1].

Because of this theorem, for any $\Delta$-complex $X$ we can write $H_k(X)$ to mean the $k$-th homology group of $X$ without specifying whether it is calculated using simplicial or singular homology: up to isomorphism, the two are the same.

(This is not a strict convention, but we often use this shorthand to save space.)

2 The (co)homology of non-orientable surfaces

“Which way is up, what’s goin’ down? I just don’t know, no” – Barry White, Which Way Is Up

Our first aim is to explore the (co)homology of non-orientable surfaces, using the Klein bottle as an explicit example. Two of the main tools that we use are cellular homology and Poincaré duality; we summarise most of the necessary definitions and results, as well as defining some non-standard notation, in Appendix A.

Definition 2.1.1 [Non-orientable surface of genus $g$]
For $g > 0$ let $N_g$ be the surface obtained from a regular $2g$-gon by identifying its edges according to a cyclic labelling of its edges $a_1 a_1 a_2 a_2 \ldots a_g a_g$.

Figure 1: A sketch of a $\Delta$-complex structure on $N_g$, where the arrows indicate the ordering of the simplices. If we apply barycentric subdivision twice to this structure (or indeed, to any $\Delta$-complex) then we obtain a simplicial-complex structure [6, Exercise 23, §2.1].

We know that we can also consider $N_g$ as a CW complex by using its construction as a polygon with pairwise side identification (see Section A.1). Explicitly, the CW-complex structure on $N_g$ has the following properties:

(i) $\dim_{\mathrm{CW}} N_g = 2$;
(ii) $\sigma(N_g) = \{e^0\} \cup \{e^1_0, e^1_1, \ldots, e^1_{g-1}\} \cup \{e^2\}$;
(iii) $\varphi_\alpha : S^0 = \{-1, 1\} \to N_g^0 = \{e^0\}$ is the constant map;
(iv) $\varphi : S^1 \to N_g^1 = \bigvee_{i=1}^{g} S^1$ is the map $a_1 a_1 a_2 a_2 \ldots a_g a_g$.

We can use this to calculate the (co)homology of $N_g$ (recalling Theorem A.1.6). By property (ii), the associated cellular chain complex of $N_g$ is

$$\mathbb{Z}/2\mathbb{Z} \xrightarrow{\;d_2\;} (\mathbb{Z}/2\mathbb{Z})^g \xrightarrow{\;d_1\;} \mathbb{Z}/2\mathbb{Z}. \tag{2.1.2}$$

We need to calculate the maps $d_1$ and $d_2$ to compute the homology groups of this chain complex. Since $N_g^0$ is a singleton set and $N_g$ is connected, we know that $d^\Delta_1 = 0$, and so Theorem A.1.8 tells us that $d_1 = 0$. To calculate $d_2$ we just need to know $\deg(\chi_\beta)$ for all $\beta \in \{0, 1, \ldots, g-1\}$, also by Theorem A.1.8. Here it is easiest to calculate the degree using the local degree. For any point $p \in S^1$ we have two points in the preimage under $\chi_\beta$. Since the attaching map just ‘wraps around’ circles, we see that the attach-and-collapse map is of local degree 1. Putting these two facts together we see that $\deg(\chi_\beta) = 2 = 0$ in $\mathbb{Z}_2$, and so $d_2 = 0$.

Because both $d_1$ and $d_2$ are zero, we can read the homology groups straight off from Equation 2.1.2; using Theorem A.2.1 we can calculate the cohomology groups from the homology groups. This gives the following.
$$H_k(N_g) = H^k(N_g) = \begin{cases} \mathbb{Z}/2\mathbb{Z} & k = 0, 2 \\ (\mathbb{Z}/2\mathbb{Z})^g & k = 1 \\ 0 & k \geqslant 3. \end{cases} \tag{2.1.3}$$

We claim that $N_2$ is homeomorphic to $K$, where $K$ is the Klein bottle: the non-orientable surface constructed from a square with side identifications $abab^{-1}$ (see Fig. 2).

Using Equation 2.1.3 with $g = 2$, we see that the Klein bottle $N_2$ has (co)homology

$$H_k(N_2) = H^k(N_2) = \begin{cases} \mathbb{Z}_2 & k = 0, 2 \\ \mathbb{Z}_2 \oplus \mathbb{Z}_2 & k = 1 \\ 0 & k \geqslant 3. \end{cases} \tag{2.2.1}$$

However, if we wish to find explicit cycle representatives for the generators of (the non-trivial) $H_k(N_2)$ it is easier to use the $\Delta$-complex structure of $N_2$ (see Fig. 3) and simplicial homology.

Recalling [6, Lemma 2.34, §2.2]: $C^{\mathrm{CW}}_n(X)$ is free abelian with basis in bijective correspondence to $n$-cells of $X$.
Because otherwise $H_0(X)$ would not be $\mathbb{Z}_2$.
See [6, Proposition 2.30, §2.2] and the preceding paragraphs.
More generally, $N_g$ is homeomorphic to the connected sum of $g$ copies of $\mathbb{RP}^2$. This can be seen from the fact that $\mathbb{RP}^2$ can be described as the 2-gon with boundary word $aa$, and that the connected sum of two polygons with boundary words $x_1 \ldots x_m$ and $y_1 \ldots y_n$ (respectively) is $x_1 \ldots x_m y_1 \ldots y_n$. See [7, §1.5].

Figure 2: Showing that $N_2$ is homeomorphic to the Klein bottle $K$ by ‘cutting and gluing’. Here the arrows represent side identification.

Figure 3: The Klein bottle $N_2$ with a $\Delta$-complex structure [6, p. 102, §2.1].

We have the simplicial chain complex

$$0 \to \Delta_2(N_2) \xrightarrow{\;\partial_2\;} \Delta_1(N_2) \xrightarrow{\;\partial_1\;} \Delta_0(N_2) \to 0$$

which (using the labelling from Fig. 3) takes the explicit form

$$0 \to \mathbb{Z}_2\langle U, L \rangle \xrightarrow{\;\partial_2\;} \mathbb{Z}_2\langle a, b, c \rangle \xrightarrow{\;\partial_1\;} \mathbb{Z}_2\langle v \rangle \to 0,$$
$$\partial_2 : U \mapsto a + b + c, \quad L \mapsto a + b + c; \qquad \partial_1 : a \mapsto 2v = 0, \quad b \mapsto 0, \quad c \mapsto 0. \tag{2.2.2}$$

Using Equation 2.2.2 we can read off explicit generators for the homology groups:

$$H_0(N_2) = \mathbb{Z}_2\langle v \rangle; \tag{2.2.3}$$
$$H_1(N_2) = \mathbb{Z}_2\langle a, b \rangle; \tag{2.2.4}$$
$$H_2(N_2) = \mathbb{Z}_2\langle U + L \rangle, \tag{2.2.5}$$

where Equation 2.2.4 comes from the fact that

$$H_1(N_2) = \frac{\ker \partial_1}{\operatorname{im} \partial_2} = \frac{\mathbb{Z}_2\langle a, b, c \rangle}{\mathbb{Z}_2\langle a + b + c \rangle} \cong \frac{\mathbb{Z}_2\langle a, b, a + b + c \rangle}{\mathbb{Z}_2\langle a + b + c \rangle} \cong \mathbb{Z}_2\langle a, b \rangle,$$

since $(a + b + c) + a + b = c$.

“The only thing wrong with immortality is that it tends to go on forever.” – Herb Caen, Herb Caen’s San Francisco

Many of the definitions in this section come from [1].
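Returning briefly to the simplicial chain complex of Equation 2.2.2, the dimensions of the homology groups in Equations 2.2.3 to 2.2.5 can be checked mechanically. The following is a minimal sketch, assuming we encode $\partial_1$ and $\partial_2$ as 0/1 matrices over $\mathbb{Z}/2\mathbb{Z}$ (columns indexed by the source basis, rows by the target basis); the helper names `rank_gf2` and `betti_numbers` are our own and not part of any library.

```python
def rank_gf2(rows):
    """Rank over Z/2Z of a matrix given as a list of 0/1 rows (Gaussian elimination)."""
    rows = [list(r) for r in rows]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Find a row at or below the current rank with a 1 in this column.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (addition mod 2).
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [(x + y) % 2 for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def betti_numbers(d1, d2, n0, n1, n2):
    """Dimensions of H_0, H_1, H_2 for a chain complex C_2 -> C_1 -> C_0 over Z/2Z."""
    r1, r2 = rank_gf2(d1), rank_gf2(d2)
    return (n0 - r1, (n1 - r1) - r2, n2 - r2)


# d2 : <U, L> -> <a, b, c>, with U and L both mapping to a + b + c.
d2 = [[1, 1],
      [1, 1],
      [1, 1]]
# d1 : <a, b, c> -> <v>, with every edge mapping to 2v = 0.
d1 = [[0, 0, 0]]

print(betti_numbers(d1, d2, n0=1, n1=3, n2=2))  # (1, 2, 1)
```

The output agrees with Equation 2.2.1: one class in degree 0, two in degree 1, and one in degree 2.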

We now consider the persistent homology of topological spaces under certain (reasonably weak) hypotheses. We assume that the reader is familiar with some basic concepts of category theory, such as functors and natural transformations.

Definition 3.0.6 [Functor categories and diagrams]
Let $\mathcal{C}, \mathcal{D}$ be categories where $\mathcal{C}$ is small (i.e. the objects of $\mathcal{C}$ form a set, not a proper class). Define $[\mathcal{C}, \mathcal{D}]$ to be the functor category: its objects are functors $F : \mathcal{C} \to \mathcal{D}$, called $\mathcal{C}$-indexed diagrams in $\mathcal{D}$, or $[\mathcal{C}, \mathcal{D}]$ diagrams; its morphisms are natural transformations between such functors.

Note 3.0.7
Many sources (including [1]) use the notation $\mathcal{D}^{\mathcal{C}}$ instead of $[\mathcal{C}, \mathcal{D}]$. We use the latter simply as a matter of upbringing.

Definition 3.0.8 [Poset categories]
Let $(P, \leqslant)$ be a poset (i.e. a set $P$ equipped with a partial order $\leqslant$). Define the poset category $P_\leqslant$ as follows: its objects are elements of $P$; there is a single morphism $p \to q$ if and only if $p \leqslant q$, otherwise there is no morphism. That is,

$$\mathrm{Hom}(p, q) = \begin{cases} \{p \leqslant q\} & \text{if } p \leqslant q; \\ \varnothing & \text{otherwise.} \end{cases}$$

By definition all poset categories are small. One particular example that we often use is $\mathbb{R}_\leqslant$, where $\leqslant$ is the usual partial ordering on $\mathbb{R}$.

Name                      Objects                                                Morphisms
$\mathbf{Vec}$            finite-dimensional vector spaces over $\mathbb{Z}_2$   linear maps
$\mathbf{Vec}_\infty$     all vector spaces over $\mathbb{Z}_2$                  linear maps
$\mathbf{Top}$            topological spaces                                     continuous maps

Table 1: Definitions of some relevant categories.

Our main interest is in $[\mathbb{R}_\leqslant, \mathbf{Vec}]$ diagrams, of which there are two relatively well-behaved classes: tame and finite type (though it later turns out that these are actually equivalent; see Lemma 3.0.14). (Many of the diagrams that we come across can be factored through $\mathbf{Top}$, i.e. can be written as a composition of a $[\mathbb{R}_\leqslant, \mathbf{Top}]$ diagram and a $[\mathbf{Top}, \mathbf{Vec}]$ diagram.)

Definition 3.0.9 [Characteristic diagram]
Let $I \subseteq \mathbb{R}$ be an interval. Define the characteristic diagram $\chi_I \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$ by

$$\chi_I(a) = \begin{cases} \mathbb{Z}/2\mathbb{Z} & \text{if } a \in I; \\ 0 & \text{otherwise,} \end{cases} \qquad \chi_I(a \leqslant b) = \begin{cases} \mathrm{id}_{\mathbb{Z}/2\mathbb{Z}} & \text{if } a, b \in I; \\ 0 & \text{otherwise.} \end{cases}$$

These characteristic diagrams behave nicely with finite intervals, and we can simplify things quite easily. Two useful examples are as follows.

$$\chi_{[a,b)} \oplus \chi_{[b,c)} = \chi_{[a,c)} \quad \text{for } a < b < c \in \mathbb{R}; \tag{3.0.10}$$
$$\chi_I \oplus \chi_J = \chi_{I \cup J} \oplus \chi_{I \cap J} \quad \text{for } I \cap J \neq \varnothing. \tag{3.0.11}$$

Definition 3.0.12 [Critical values]
Let $F \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$, and $I \subseteq \mathbb{R}$ be an interval. We say that $F$ is constant on $I$ if $F(a \leqslant b)$ is an isomorphism for all $a \leqslant b \in I$. We say that $a \in \mathbb{R}$ is a regular value of $F$ if there exists some open interval $J \subseteq \mathbb{R}$ with $a \in J$ such that $F$ is constant on $J$. If $a \in \mathbb{R}$ is not a regular value then we say that it is a critical value.

Definition 3.0.13 [Finite type and tameness]
Let $F \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$. We say that $F$ is of finite type if there exist finitely many intervals $I_1, \ldots, I_N \subseteq \mathbb{R}$ such that $F = \bigoplus_{i=1}^{N} \chi_{I_i}$, and that $F$ is tame if it has finitely many critical values.

Lemma 3.0.14
Let $F \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$. Then $F$ is tame if and only if $F$ is of finite type.

Proof. [1, Theorem 4.6]

It turns out that we can define a notion of distance (though not quite a metric) between $\mathbb{R}_\leqslant$-indexed diagrams, which proves useful when we start looking at applications of $[\mathbb{R}_\leqslant, \mathbf{Vec}]$ to algebraic topology. First, though, we need some more technical machinery.

Definition 3.0.15 [Translation functors and translation natural transformations]
Let $b \geqslant 0$. Define the $b$-translation functor $T_b$ by

$$T_b : \mathbb{R}_\leqslant \to \mathbb{R}_\leqslant, \qquad a \mapsto a + b,$$

and define the $b$-translation natural transformation $\eta_b : \mathrm{id}_{\mathbb{R}_\leqslant} \Rightarrow T_b$ by $\eta_b(a) = (a \leqslant a + b)$.

It follows straight from the definitions that $T_b T_c = T_{b+c}$ and $\eta_b \eta_c = \eta_{b+c}$.

Definition 3.0.16 [Interleaving of diagrams]
Let $F, G \in [\mathbb{R}_\leqslant, \mathcal{D}]$ for some arbitrary category $\mathcal{D}$, and let $\varepsilon \geqslant 0$. Define an $\varepsilon$-interleaving of $F$ and $G$ as a quadruple $(F, G, \varphi, \psi)$, where $\varphi : F \Rightarrow G T_\varepsilon$ and $\psi : G \Rightarrow F T_\varepsilon$ are natural transformations such that

$$(\psi T_\varepsilon)\varphi = F\eta_{2\varepsilon}, \qquad (\varphi T_\varepsilon)\psi = G\eta_{2\varepsilon}.$$

That is, we want the following triangles to commute for all $a \in \mathbb{R}$:

$$F(a) \xrightarrow{\;\varphi(a)\;} G(a + \varepsilon) \xrightarrow{\;\psi(a + \varepsilon)\;} F(a + 2\varepsilon) \quad \text{equals} \quad F\eta_{2\varepsilon}(a) : F(a) \to F(a + 2\varepsilon);$$
$$G(a) \xrightarrow{\;\psi(a)\;} F(a + \varepsilon) \xrightarrow{\;\varphi(a + \varepsilon)\;} G(a + 2\varepsilon) \quad \text{equals} \quad G\eta_{2\varepsilon}(a) : G(a) \to G(a + 2\varepsilon).$$

We say that $F$ and $G$ are $\varepsilon$-interleaved if there exists some $\varepsilon$-interleaving $(F, G, \varphi, \psi)$.

Definition 3.0.17 [Interleaving extended pseudometric]
Let $\mathcal{D}$ be some arbitrary category. Define the extended pseudometric $d$ on any subset of the class of $[\mathbb{R}_\leqslant, \mathcal{D}]$ diagrams by

$$d(F, G) = \inf \{\varepsilon \geqslant 0 \mid F \text{ and } G \text{ are } \varepsilon\text{-interleaved}\}.$$

We now define one of the fundamental concepts in computational algebraic topology: persistent homology. A good introduction to how this seemingly abstract definition arises in a reasonably natural way can be found in [5, §§ 5.13 & 7.2], and our definition is from [1, §2.2.4].

Definition 3.0.18 [Persistent homology]
Let $F \in [\mathbb{R}_\leqslant, \mathbf{Top}]$. Define the $p$-persistent $k$-th homology group $P_pH_kF(a)$ of $F$ at $a$ to be the image of the homomorphism $H_kF(a \leqslant a + p)$.

This definition of persistent homology is better explained after some unpacking. Let $a, p \in \mathbb{R}$, $k \in \mathbb{N}$, and $F \in [\mathbb{R}_\leqslant, \mathbf{Top}]$. Write $X_b$ to mean $F(b)$. Then

• $F(a \leqslant a + p) : X_a \hookrightarrow X_{a+p}$ is an inclusion map of topological spaces, since $F$ is a functor $\mathbb{R}_\leqslant \to \mathbf{Top}$;
• $H_kF(a \leqslant a + p) : H_kX_a \to H_kX_{a+p}$ is the induced homomorphism of homology groups (which are $\mathbb{Z}_2$-vector spaces);
• $P_pH_kX_a = \operatorname{im} H_kF(a \leqslant a + p) \leqslant H_kX_{a+p}$ is a subgroup (subspace) of the $k$-th homology group of $X_{a+p}$.

Putting this all together we see that $P_pH_k \in [\mathbf{Top}, \mathbf{Vec}_\infty]$, where the functoriality follows by definition. Some more intuition behind the idea of persistent homology is given in Section 4, where we explain the following statement:

The $p$-persistent $k$-th homology group of $F$ at $a$ consists of homology classes that were born no later than $a$ and that are still alive at $a + p$.

See [1, Theorem 3.3]; we quote: “[i]t fails to be a metric because it can take the value $\infty$ and $d(F, G) = 0$ does not imply that $F \cong G$”.
And using the fact that any topological space $X$ can be written in the form $F(a)$ for some $F \in [\mathbb{R}_\leqslant, \mathbf{Top}]$ and $a \in \mathbb{R}$ by taking the trivial diagram $F(a) = X$ for all $a \in \mathbb{R}$.

3.1 Stability for persistent homology

With these definitions and lemmas in hand, let us consider the following scenario: take some topological space $X$ and some (not necessarily continuous) function $f : X \to (-M, M) \subset \mathbb{R}$. We can define a height filtration $F \in [\mathbb{R}_\leqslant, \mathbf{Top}]$ of $X$ by $F(a) = f^{-1}\big((-\infty, a]\big)$, where $F(a \leqslant b)$ is the inclusion $F(a) \hookrightarrow F(b)$. Then we can define the $[\mathbb{R}_\leqslant, \mathbf{Vec}_\infty]$ diagram $H_kF$ given by taking the $k$-th homology group of $F(a)$.
For simplicity (see Note 3.1.4) we assume that $H_kF$ is tame for all $k \in \mathbb{N}$. In particular then, by Lemma 3.0.14,

$$H_kF = \bigoplus_{i=1}^{N} \chi_{I_i}, \tag{3.1.1}$$

and so, for all $a \in \mathbb{R}$,

$$H_kF(a) = \bigoplus_{i=1}^{N_a} (\mathbb{Z}/2\mathbb{Z}) \tag{3.1.2}$$

for some $N_a \leqslant N$. That is, $H_kF$ being tame implies that all the homology groups $H_kF(a)$ are finitely generated, i.e. $H_kF \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$.

Definition 3.1.3 [$M$-bounded tame functions]
We call any such $f$ (i.e. $f : X \to (-M, M)$ with $F(a) = f^{-1}\big((-\infty, a]\big)$ being such that $H_kF$ is tame; on the continuity of $f$, see Note 3.1.7) an $M$-bounded tame function on $X$, and call $F$ the associated filtration. (Again, this is not standard terminology.)

Note 3.1.4 [The tameness assumption]
If we didn’t assume that $H_kF$ is necessarily tame, but instead that $F(a)$ is a compact manifold for all $a \in \mathbb{R}$, then all the (singular) homology groups $H_kF(a)$ are still finitely generated. (As a statement in full generality, this is reasonably non-trivial: see [9, Proposition III.1, p. 130].) So assuming that $H_kF$ is tame is at least no stronger than assuming that all of our topological spaces are compact manifolds; this gives us a vague lower bound for the level of generality at which we are working.

Now, as in Definition 3.0.18, we can look at $p$-persistent homology. Since each $H_kF(a + p)$ is finitely generated, and we are working with $\mathbb{Z}_2$ coefficients, $P_pH_kF(a)$ is also finitely generated. (Here the fact that homology groups are actually vector spaces is vital, since it is a simple fact that the subspace of a finite-dimensional vector space is itself finite dimensional. If, however, we were working with coefficients in a general group then we would have to appeal to something like Schreier’s lemma, which tells us that any finite index subgroup of a finitely-generated group is itself finitely generated, or maybe even to some similar property of modules over a PID.) That is, $P_pH_k \in [\mathbf{Top}, \mathbf{Vec}]$ and so $P_pH_kF \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$.
This means that we can use the interleaving extended pseudometric from Definition 3.0.17 on $p$-persistent homology groups of $F(a)$.

One of the main examples of an $M$-bounded tame function $f$ on a topological space $X$ is a height function: we immerse $X$ into $\mathbb{R}^n$ for some $n$ and ‘measure’ $X$ along some axis. In this case, we can obtain different height functions, and thus different associated $[\mathbb{R}_\leqslant, \mathbf{Top}]$ diagrams, simply by perturbing the axis along which we measure by some small amount. For persistent homology to have much practical use we would strongly desire that small perturbations of the height functions result in small changes to the persistent homology groups. Explicitly, we would hope to be able to bound the distance between the persistent homology groups of $F$ and $G$ by the distance between $f$ and $g$. It turns out that this is, in fact, possible.

Theorem 3.1.5 [Stability theorem for persistent homology]

Let $f, g$ be $M$-bounded tame functions on some topological space $X$, with associated filtrations $F, G$ (respectively). Then

$$d(P_pH_kF, P_pH_kG) \leqslant \|f - g\|_\infty.$$

Proof. By our previous comments, namely that $P_pH_k \in [\mathbf{Top}, \mathbf{Vec}]$, this is a specific case of [1, Theorem 5.1]. As such, a full proof can be found there; we give here a short sketch of the proof. Let $\varepsilon = \|f - g\|_\infty$. Then

$$F(a) = f^{-1}\big((-\infty, a]\big) \subseteq g^{-1}\big((-\infty, a + \varepsilon]\big) = G(a + \varepsilon),$$

and similarly $G(a) \subseteq F(a + \varepsilon)$. Combining these gives us inclusions

$$F(a) \hookrightarrow G(a + \varepsilon) \hookrightarrow F(a + 2\varepsilon)$$

which is, by definition, the same as the inclusion $F(a) \hookrightarrow F(a + 2\varepsilon)$. Similarly we have $G(a) \hookrightarrow F(a + \varepsilon) \hookrightarrow G(a + 2\varepsilon)$. Thus $F$ and $G$ are $\varepsilon$-interleaved. But then the functoriality of $P_pH_k$ ensures that $P_pH_kF$ and $P_pH_kG$ are $\varepsilon$-interleaved (see [1, Proposition 3.6]) which gives the required result.

This is exactly the sort of example that we look at in Section 3.2; we explain how height functions relate to the assumption that $f$ is not necessarily continuous in Note 3.1.7.
Measured by the interleaving extended pseudometric $d$.
The most natural choice of metric for data sampling being the sup metric $\|f - g\|_\infty$. Recall that $\|f - g\|_\infty = \sup_{x \in X} |f(x) - g(x)|$.

Theorem 3.1.5 tells us that small perturbations to our ‘measuring’ function result in small perturbations to the resulting $p$-persistent homology groups. However, if we are given some topological space $X$, or construct one from a data point […]
not ensure that picking any function $f : X \to \mathbb{R}$ will result in persistent homology necessarily telling us anything useful about the space.

Many different applications of the stability theorem can be found in [4, §6]. Two clearly important examples (explained in full detail in [2, §4]) that stand out, however, are those of homology inference: computing the homology of a space bounded by a smooth surface by computing the homology arising from a finite sample of points from the space; and shape comparison: using persistent homology to measure how similar two topological spaces embedded in $\mathbb{R}^n$ are.

The key point behind both of these examples is that, although Theorem 3.1.5 is phrased in terms of two functions on the same topological space, we can actually use it for analysing the persistent homology of two different spaces: given some $X$ embedded in $\mathbb{R}^n$ we can define $d_X : \mathbb{R}^n \to \mathbb{R}$ by $d_X(z) = \inf_{x \in X} \|z - x\|_{\mathbb{R}^n}$. If we have another space $Y$ with $d_Y$ defined similarly then we can apply the stability theorem to $d_X$ and $d_Y$ to bound the ‘homological differences’ between $X$ and $Y$ by the ‘Euclidean-distance differences’ between $X$ and $Y$.

Note 3.1.6

Our summaries of homology inference and shape comparison are very brief, and thus skip over some of the finer, but very important, details. The subtleties are explained fully in [2, §4], but the main problem is that similar barcodes don’t necessarily imply similar spaces, and vice versa. To quote,

Perhaps unexpectedly, the homology groups of $X + \delta$ can be different from those of $X$, even when $X$ has positive homological feature size and $\delta$ is arbitrarily small. . . . In particular, two shapes whose persistence diagrams are close are not necessarily approximately congruent.

(It does turn out, however, that the ‘pathological behaviour’ behind the first part of the quote actually almost never occurs in practice, and using functions that aren’t simply distance functions can solve the problem in the second part of the quote.)
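The comparison of two spaces via their distance functions $d_X$ and $d_Y$ can be illustrated numerically. The following is a small sketch, assuming $X$ and $Y$ are finite point sets in $\mathbb{R}^2$ and that we only sample $z$ on a finite grid; the point sets, the grid, and the helper names are all made up for illustration. It checks the standard inequality $|d_X(z) - d_Y(z)| \leqslant d_H(X, Y)$, where $d_H$ is the Hausdorff distance.

```python
import math


def dist_to_set(z, points):
    """d_X(z): Euclidean distance from z to the nearest point of a finite set."""
    return min(math.dist(z, x) for x in points)


def hausdorff(X, Y):
    """Hausdorff distance between two finite point sets."""
    return max(max(dist_to_set(x, Y) for x in X),
               max(dist_to_set(y, X) for y in Y))


def sup_difference(X, Y, grid):
    """Approximation of ||d_X - d_Y||_inf, taken over a finite grid only."""
    return max(abs(dist_to_set(z, X) - dist_to_set(z, Y)) for z in grid)


# Hypothetical sample data: Y is a small perturbation of X.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
Y = [(0.1, 0.0), (1.0, 0.2), (0.0, 0.9)]
grid = [(i / 10, j / 10) for i in range(-10, 21) for j in range(-10, 21)]

print(sup_difference(X, Y, grid) <= hausdorff(X, Y) + 1e-12)  # True
```

Since the grid is finite, `sup_difference` only bounds the true sup norm from below; the inequality itself holds at every point $z$.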

Note 3.1.7 [The continuity non-assumption]
The fact that we don’t require $f$ to be continuous corresponds to the idea that we might be using some sort of discrete height map, maybe because we are working […]

After restricting to some compact subset of $\mathbb{R}^n$ containing both $X$ and $Y$, say.
It turns out that we can actually then bound the ‘Euclidean-distance difference’ $\|d_X - d_Y\|_\infty$ by the Hausdorff distance between $X$ and $Y$. See [4, §6].

Figure 4: Discrete height functions on the torus using 13 and 18 partitions.
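To make the idea of a discrete height map concrete, here is a rough sketch, assuming that $(-M, M)$ is cut into $n$ equal partitions and that each point is assigned the height of the upper boundary of its partition (a stand-in for ‘the highest value of $f$ taken on that section’). The function names and sample heights are hypothetical. Under this assumption the discrete map $g$ satisfies $\|f - g\|_\infty \leqslant 2M/n$, which is the quantity that Theorem 3.1.5 uses as a bound.

```python
import math


def discretise(value, M, n):
    """Round a height in (-M, M) up to the top of its partition (n equal partitions)."""
    width = 2 * M / n
    index = max(math.ceil((value + M) / width), 1)  # partition number, from 1 to n
    return -M + index * width


def sup_error(heights, M, n):
    """max |f - g| over the sampled heights, where g is the discretised height."""
    return max(abs(discretise(h, M, n) - h) for h in heights)


# Hypothetical sample of heights f(x), with M = 1.
heights = [-0.95, -0.3, 0.0, 0.42, 0.99]
for n in (13, 18):  # the partition counts used in Fig. 4
    print(n, sup_error(heights, M=1.0, n=n) <= 2 * 1.0 / n + 1e-12)
```

Increasing the number of partitions shrinks the bound $2M/n$, and so shrinks the possible change in the persistent homology groups.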

If we have some $[\mathbb{R}_\leqslant, \mathbf{Vec}]$ diagram of finite type then we can represent it graphically using barcodes. Generally, barcodes give a very useful way of interpreting Theorem 3.1.5, especially in light of [1, Propositions 4.12, 4.13], which relate the interleaving distance of characteristic diagrams to the distances between the endpoints of their associated intervals. This is something that will prove useful when we examine the Klein bottle in Section 3.2.

Definition 3.1.8 [Barcodes]
Let $D \in [\mathbb{R}_\leqslant, \mathbf{Vec}]$ be of finite type, so that $D = \bigoplus_{i=1}^{N} \chi_{I_i}$ for some intervals $I_i \subseteq \mathbb{R}$. Define the barcode $B_D$ of $D$ to be the multiset (i.e. a set where each element occurs with a multiplicity: $\{x, x, y\}$ and $\{x, y\}$ define the same set, but different multisets) $B_D = \{I_1, I_2, \ldots, I_N\}$.

A different visual representation of persistent homology is a persistence diagram. These are introduced and discussed extensively in [2], as are the implications of the stability theorem. We choose, however, to use barcodes, and refer the interested reader to other sources for persistence diagrams.

3.2 Height of the Klein bottle (persistent homology)

We now study an explicit example: the Klein bottle, immersed in $\mathbb{R}^3$. Here we scale the Klein bottle so that it has height $M$ for some $M \in \mathbb{R}$ and take $f$ to be the corresponding height function. Picking out certain values of $a \in \mathbb{R}$ we can see how the ‘height slices’ $F(a)$ change. We note that there are really three interesting points, namely $-M$, $A$, and $M$ (as labelled in Fig. 5) where the homotopy type changes, and so we can refine our picture (see Fig. 6). Looking at the homology groups we can read the critical values of each $H_k$ straight off (see Table 2).

Putting all of the above together, we see that

$$H_0F = \chi_{[-M, \infty)}, \qquad H_1F = \chi_{[A, \infty)} \oplus \chi_{[M, \infty)}, \qquad H_2F = \chi_{[M, \infty)}. \tag{3.2.1}$$

That is, each $H_kF$ is of finite type (and thus tame).
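As a small illustration of Definition 3.1.8 and Equation 3.2.1, a barcode can be stored as a multiset of intervals. The sketch below assumes half-open intervals $[\text{birth}, \text{death})$ encoded as pairs, uses Python’s `collections.Counter` as the multiset, and picks hypothetical numerical values for $M$ and $A$ (only $-M < A < M$ matters); none of these choices come from the paper.

```python
from collections import Counter
import math

INF = math.inf


def barcode(*intervals):
    """A barcode as a multiset of (birth, death) pairs, each meaning [birth, death)."""
    return Counter(intervals)


def dimension_at(bars, a):
    """dim D(a): the number of bars, counted with multiplicity, that contain a."""
    return sum(mult for (birth, death), mult in bars.items() if birth <= a < death)


# Hypothetical values for the labels in Fig. 5.
M, A = 1.0, 0.25

H0 = barcode((-M, INF))
H1 = barcode((A, INF), (M, INF))
H2 = barcode((M, INF))

# Dimensions of H_0, H_1, H_2 of the height slice F(a) at a few heights a.
for a in (-2.0, 0.0, 0.5, 1.0):
    print(a, [dimension_at(H, a) for H in (H0, H1, H2)])

# Multisets remember multiplicity, unlike sets.
print(barcode((0, 1), (0, 1)) != barcode((0, 1)))  # True
```

At heights $a \geqslant M$ the dimensions are 1, 2, 1, matching the homology of the whole Klein bottle in Equation 2.2.1.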
We might have guessed this,since Note 3.1.4 told us that all of our homology groups would be ﬁnitely gen-erated, but it is much simpler in this case to calculate ﬁnite-type decompositionsfor the H k F directly. We can summarise Equation 3.2.1 by using a barcode (seeFig. 7). Looking at the barcode we see one interesting feature: there are no deaths .That is, as we increase a , at no point does the dimension of H k F ( a ) decrease. Al-ternatively, we can say that all of the intervals in our ﬁnite-type decomposition areupper-half inﬁnite: if a ∈ I then a + p ∈ I for all p (cid:62) . So H k F ( a (cid:54) a + p ) ∼ = id H k F ( a ) by deﬁnition. That is, for all p (cid:62) and k ∈ N , P p H k F ∼ = H k F. (3.2.2)Here then, Theorem 3.1.5, along with [1, Propositions 4.12, 4.13], tells us that if g is some other height function, close (in the (cid:107) · (cid:107) ∞ sense) to f , then the resultingbarcodes will be close (in the | · | sense). As a trivial example, we see that if wedeﬁne g as a shift of f by ε then the resulting barcodes will be exactly distance ε apart (see Fig. 8). As a slightly-less-contrived example, if we were working com-putationally and needed to use discrete data, then we could use a discrete heightfunction (as in Fig. 4) and bound the errors on the resulting barcode (as comparedto using a continuous height function) by how close the boundaries of our par-titions are to critical points (see Fig. 9). Using the ideas of shape comparison,mentioned in Theorem 3.1.5, we also know that deforming our immersion of theKlein bottle would result in bounded changes in the barcode. We appeal to the fact that homotopy equivalent spaces have isomorphic (singular) homologygroups (see [6, Corollary 2.11, §2.1]). Where we calculate the homology groups using [6, Corollary 2.25 & Proposition A.5]. Recall Deﬁnition 3.1.8. We explain the idea of birth and death in more detail in Section 4. To be picky, we really mean the graphical representation of the barcodes, i.e. 
the associatedintervals drawn inside R . igure 5: The Klein bottle K immersed in R with associated height function f → ( − M, M ) ⊂ R .We shouldn’t really be surprised that F ( − A ) , F (0) , and F ( B ) are all homotopy equivalent, sincethe self-intersection at height B isn’t really an intersection at all: it is simply an artefact comingfrom this immersion in R .Figure 6: The homology groups of F ( a ) between critical values. critical values of H k F − M, M − A, M M Table 2: Critical values of H k F .Figure 7: The barcode associated to the homology of F ( a ) . See Equation 3.2.1. igure 8: Here g is simply f shifted down by ε . This means that (cid:107) f − g (cid:107) ∞ = ε , and it turns out that d( H k F, H k G ) = ε as well (we can precisely relate the interleaving distance to the bottleneck distance between the barcodes, and there are some useful facts simplifying the calculation of interleavingdistances between characteristic diagrams – see [1, Propositions 4.12, 4.13]). So the bound in Theo-rem 3.1.5 is attained (since here the persistent homology groups are exactly the homology groups– we have no deaths).Figure 9: Using a discrete height function g (as in Fig. 4). We see that, in this scenario, where theheight of a section is determined by the highest value of f taken on that section, the interleavingdistance δ is given by the maximum distance between any partition boundary and the ﬁrst criticalpoint of K occurring before it. This is bounded by (cid:107) f − g (cid:107) ∞ . Extended persistent homology “Death is the surest calculation that can be made.” – Ludwig Büchner,

Force and MatterIn this section we keep the same assumptions as in Section 3, namely that f is some M -bounded tame function on a topological space X , and F ( a ) = f − (cid:0) ( −∞ , a ] (cid:1) . Before moving on to discuss extended persistent homology, we ﬁrst look aboutanother way of talking at persistent homology. One very useful way of talkingabout persistent homology is using the idea of birth and death . This is really just away of formalising some of the things that we noticed in Section 3.2.For all a (cid:54) b we have the inclusion ι : F ( a ) (cid:44) → F ( b ) , which induces the ho-momorphism of homology groups ι ∗ : H k F ( a ) → H k F ( b ) . Using this, given somehomology class [ β ] ∈ H k F ( b ) we can ask for which a (cid:54) b there exists some homol-ogy class [ α ] ∈ H k F ( a ) with ι ∗ ([ α ]) = [ β ] . Clearly, if we ﬁnd two such values of a ,say a and a , then it is the smaller one which is of the most interest: say a (cid:54) a ,then, using the general fact about induced homomorphisms that ( ικ ) ∗ = ι ∗ κ ∗ , wecan factor H k F ( a ) → H k F ( b ) through H k F ( a ) . Thinking of F ( r ) as evolving overtime as r ∈ R gets larger, we call the smallest such value of a the time of birth of [ β ] . 
Similarly, we can look at the time of death of $[\beta]$ by considering the image of $H_kF(a') \to H_kF(b')$ for $b' > b$ and all $a' < a$, where $a$ is the time of birth: it is the smallest $b'$ such that $[\beta]$ is not in the image of $H_kF(a') \to H_kF(b')$.

Using this language we can formulate the following motto of persistent homology:

The $p$-persistent $k$-th homology group of $F$ at $a$ consists of homology classes that were born no later than $a$ and that are still alive at $a + p$.

This makes it clear that, if we have no deaths, then persistent homology is simply homology: every class that is born no later than $a$ will always be alive at $a + p$.

If $X$ is a surface then Morse theory tells us that the births and deaths of homology classes will be at critical points of the surface: if $[\beta]$ has birth time $a$ and death time $b$ then there will be critical points of $X$ at $F(a)$ and $F(b)$, call them $p_a$ and $p_b$, respectively. This gives us a pairing of critical points of $X$: we pair $p_a$ with $p_b$ and say that they have persistence $|b - a|$ (or sometimes $|f(p_b) - f(p_a)|$). See [3, §2] for more details and motivation.

The issue that remains (and that [3] aims to resolve) is that there are scenarios where homology classes don’t die (i.e. have death time $\infty$), since this leaves some critical points of $X$ unpaired. Extended persistent homology solves this problem by ensuring that every homology class eventually dies within finite time.

Which might well be $-\infty$.
If any such $b'$ exists.
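The motto above can be turned into a short computation for diagrams of finite type. The sketch below assumes a barcode is given as a list of half-open intervals $[\text{birth}, \text{death})$, and counts the bars containing both $a$ and $a + p$; for a direct sum of characteristic diagrams this count is the dimension of the $p$-persistent group at $a$. The function name and the example bars are hypothetical.

```python
import math


def persistent_rank(bars, a, p):
    """Number of bars [birth, death) born no later than a and still alive at a + p."""
    return sum(1 for birth, death in bars if birth <= a and a + p < death)


# Hypothetical barcode: one class that never dies and two that do.
bars = [(0.0, math.inf), (1.0, 3.0), (2.0, 2.5)]

print(persistent_rank(bars, a=2.0, p=0.0))   # 3
print(persistent_rank(bars, a=2.0, p=0.75))  # 2
print(persistent_rank(bars, a=2.0, p=1.5))   # 1

# With no deaths, the persistent rank does not depend on p.
no_deaths = [(-1.0, math.inf), (0.25, math.inf), (1.0, math.inf)]
print(persistent_rank(no_deaths, 1.0, 0.0) == persistent_rank(no_deaths, 1.0, 100.0))  # True
```

The last line mirrors the remark that persistent homology is simply homology when there are no deaths.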

Definition 4.0.3 [Bifiltrations of an $M$-bounded tame function]
For an $M$-bounded tame function $f$ on a topological space $X$ define the associated $(2M + 1)$-bifiltration $\hat{F}$ by
\[
F^{\uparrow}(a) = f^{-1}\big((-\infty, a]\big), \qquad
F^{\downarrow}_{s}(a) = f^{-1}\big([2M + 1 - a, \infty)\big), \qquad
\hat{F}(a) = \big(F^{\uparrow}(a), F^{\downarrow}_{s}(a)\big).
\]
So $F^{\uparrow} = F$ in our previous notation, and the $s$ in $F^{\downarrow}_{s}$ is to remind us that there is some shift, i.e. that $F^{\downarrow}_{s}(a)$ is not simply $f^{-1}\big([a, \infty)\big)$. (The choice of the $+1$ in $F^{\downarrow}_{s}$ is arbitrary: we could use any 'spacing' constant $\lambda > 0$. Since we only work with $(2M + 1)$-bifiltrations, we often refer to them just as bifiltrations. They are in fact diagrams, though this does require some reasoning, which we give later.)

The reason for these definitions is made slightly clearer when we look at how these functions change as $a$ increases:
\[
\hat{F}(a) =
\begin{cases}
(\emptyset, \emptyset) & \text{for } a \in (-\infty, -M) \\
(X_a, \emptyset) & \text{for } a \in [-M, M) \\
(X, \emptyset) & \text{for } a \in [M, M + 1) \\
(X, X \setminus X_{2M + 1 - a}) & \text{for } a \in [M + 1, 3M + 1) \\
(X, X) & \text{for } a \in [3M + 1, \infty)
\end{cases}
\tag{4.0.4}
\]
where $X_a = F(a) \subseteq X$ is a subspace of $X$. So we see that if we take the relative homology of this pair $\hat{F}(a)$ then we recover $H_kF(a)$ for $a \in (-\infty, M + 1)$, since $H_k(Y, \emptyset) \cong H_k(Y)$ for all $Y$. But for $a \geq M + 1$ the homology then 'dies down', ending with all relative homology groups being $0$ for $a \geq 3M + 1$, since $H_k(Y, Y) = 0$ for all $Y$. See Section 4.2 for an example with the Klein bottle.

The motivation for this construction of $F^{\uparrow}$ and $F^{\downarrow}_{s}$ comes from [1, §6], which is in turn motivated by the abstraction of the situation in [3, §4] where Poincaré and Lefschetz duality are used. We refer the reader to these two papers for further information; we carry on developing as much machinery as we can with the tools that we have.

We claim that $F^{\uparrow}$ and $F^{\downarrow}_{s}$ are $[\mathbb{R}_{\leq}, \mathbf{Vec}]$ diagrams. That is, they are functorial: they preserve composition of morphisms and map identity morphisms to identity morphisms. This follows from Equation 4.0.4, since $\hat{F}(a) \subseteq \hat{F}(b)$ for all $a \leq b$, and an inclusion of pairs induces a homomorphism $H_k\hat{F}(a) \to H_k\hat{F}(b)$ on the relative homology groups.

Definition 4.0.5 [Extended persistent homology]
Let $F \in [\mathbb{R}_{\leq}, \mathbf{Top}]$. Define the extended $p$-persistent $k$-th homology group $E_pH_k\hat{F}(a)$ of $F$ at $a$ to be the image of the homomorphism $H_k\hat{F}(a \leq a + p)$.

As with Definition 3.0.18, this definition is better understood after some unpacking. Let $a, p \in \mathbb{R}$, $k \in \mathbb{N}$, and $F \in [\mathbb{R}_{\leq}, \mathbf{Top}]$. Write $\hat{X}_b$ to mean $\hat{F}(b)$.
Then
• $\hat{F}(a \leq a + p) : \hat{X}_a \hookrightarrow \hat{X}_{a+p}$ is an inclusion map of topological spaces;
• $H_k\hat{F}(a \leq a + p) : H_k\hat{X}_a \to H_k\hat{X}_{a+p}$ is the induced homomorphism of relative homology groups (which are $\mathbb{Z}/2\mathbb{Z}$-vector spaces);
• $E_pH_k\hat{X}_a = \operatorname{im} H_k\hat{F}(a \leq a + p) \leq H_k\hat{X}_{a+p}$ is a subgroup (subspace) of the $k$-th homology group of $\hat{X}_{a+p}$.

Again, as with $P_pH_k$, we see that $E_pH_k \in [\mathbf{Top}, \mathbf{Vec}^{\infty}]$.

We will see that, using Equation 4.0.4, we can sometimes think of extended persistent homology as follows. First we compute persistent homology 'from bottom to top', then we compute persistent homology again, but from top to bottom and whilst squeezing our space to a point along the way.

Theorem 4.1.1 [Stability theorem for extended persistent homology]

Let $f, g$ be $M$-bounded tame functions on some topological space $X$, with associated $(2M + 1)$-bifiltrations $\hat{F}, \hat{G}$, respectively. Then
\[
d(E_pH_kF, E_pH_kG) \leq \|f - g\|_{\infty}.
\]

Proof. Let $\varepsilon = \|f - g\|_{\infty}$. All we need to show is that $\hat{F}$ and $\hat{G}$ are $\varepsilon$-interleaved, since then we can use [1, Proposition 3.6]. As in the proof for Theorem 3.1.5, this would follow from showing that $\hat{F}(a) \subseteq \hat{G}(a + \varepsilon)$ and $\hat{G}(a) \subseteq \hat{F}(a + \varepsilon)$ for all $a \in \mathbb{R}$.

If $a + 2\varepsilon \in (-\infty, M + 1)$ then this follows as in the proof of Theorem 3.1.5, since $F^{\downarrow}_{s}, G^{\downarrow}_{s}$ are both $\emptyset$ for $a$, $a + \varepsilon$, and $a + 2\varepsilon$. Similarly, if $a \geq M + 1$ then $F^{\uparrow}(a) \subseteq G^{\uparrow}(a + \varepsilon) \subseteq F^{\uparrow}(a + 2\varepsilon)$, so all that remains to show is that $F^{\downarrow}_{s}(a) \subseteq G^{\downarrow}_{s}(a + \varepsilon) \subseteq F^{\downarrow}_{s}(a + 2\varepsilon)$. But this follows from the observation that $F^{\downarrow}_{s}(a) = X \setminus F^{\uparrow}(2M + 1 - a)$ (see Equation 4.0.4). For all other cases (say, $a + \varepsilon \in (-\infty, M + 1)$ but $a + 2\varepsilon \geq M + 1$) we can combine the above two arguments to show the required inclusions.

(Footnote on induced maps of relative homology groups: this works for relative homology almost exactly as it does for absolute homology, but there are some helpful comments just after Example 2.18 in [6, 118].)

(Footnote to Definition 4.0.5: although we have been working with $M$-bounded tame functions $f$ on a topological space $X$, we can still define $\hat{F} = (F^{\uparrow}, F^{\downarrow}_{s})$ for any $F \in [\mathbb{R}_{\leq}, \mathbf{Top}]$ exactly as in Definition 4.0.3.)

We now return to the example of the Klein bottle immersed in $\mathbb{R}^3$ of height $2M$ with height function $f$, as in Section 3.2. Previously, the fact that there were no deaths (i.e. that every homology class had infinite persistence) meant that persistent homology looked exactly like homology. But here, for the same example, extended persistent homology guarantees death in finite time. This means that we will expect to see different results when we compute $E_pH_kF(a)$ as compared to $H_kF(a)$. In Fig. 10 we sketch $\hat{F}(a)$, recalling Equation 4.0.4, and calculate homotopy-equivalent spaces and their homology groups for various $\hat{F}(a)$.
To calculate the homology groups, we use two facts: $H_k(X, \emptyset) \cong H_k(X)$; and [6, Proposition 2.22 & Proposition A.5], which says that if $(X, A)$ is a good pair then $H_k(X, A) \cong \tilde{H}_k(X/A)$. From this, we can draw the barcode and read off a finite-type decomposition of the $H_k\hat{F}(a)$ (see Fig. 11).

(Footnote: as in absolute homology, homotopy equivalent topological pairs have the same homology. In a sense, this is trivial if you adopt the Eilenberg-Steenrod axiomatic point of view, since this is one of the axioms. On the other hand, we do have to be slightly careful: if we take a space $X$ and two homeomorphic (and thus homotopy equivalent) subspaces $A, B \subset X$ then it is not necessarily true that $H_k(X, A) \cong H_k(X, B)$. For this to hold we also need that the inclusions $A \hookrightarrow X$ and $A \to B \hookrightarrow X$ are homotopic, where $A \to B$ is a homotopy equivalence.)

If we now turn to the extended persistent homology groups we get some non-trivial results. Each group is parametrised by two variables: $a \in \mathbb{R}$, which we think of as time, and $p \geq 0$, which we think of as lifespan. For example, if $E_pH_k\hat{F}(a)$ is non-zero then it means that there is some $k$-homology class alive at time $a$ that persists until at least time $a + p$. The extended $p$-persistent $k$-th homology groups of $\hat{F}$ at $a$ are as follows.
\[
E_pH_0\hat{F}(a) =
\begin{cases}
0 & \text{if } a \notin [-M, M + 1); \\
0 & \text{if } p \geq (2M + 1); \\
0 & \text{if } a \in [-M, M + 1) \text{ but } p \geq \big(2M + 1 - |a + M|\big); \\
\mathbb{Z}/2\mathbb{Z} & \text{otherwise.}
\end{cases}
\tag{4.2.1}
\]
\[
E_pH_1\hat{F}(a) =
\begin{cases}
0 & \text{if } a \notin [-A, 2M + 1 + A); \\
0 & \text{if } p \geq \big(2(M + A) + 1\big); \\
0 & \text{if } a \in [-A, 2M + 1 + A) \text{ but } p \geq \big(2(M + A) + 1 - |a + A|\big); \\
\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z} & \text{if } a \in [M, M + 1) \text{ and } p \leq (M + 1 - a); \\
\mathbb{Z}/2\mathbb{Z} & \text{otherwise.}
\end{cases}
\tag{4.2.2}
\]
\[
E_pH_2\hat{F}(a) =
\begin{cases}
0 & \text{if } a \notin [M, 3M + 1); \\
0 & \text{if } p \geq (2M + 1); \\
0 & \text{if } a \in [M, 3M + 1) \text{ but } p \geq (3M + 1 - a); \\
\mathbb{Z}/2\mathbb{Z} & \text{otherwise.}
\end{cases}
\tag{4.2.3}
\]

Although these formulas might look slightly daunting at first, this is largely due to a lack of concise notation: comparing them to Figs. 11 and 12 we see that they are (secretly) reasonably simple. Looking at the more general formula given in Section 4.3 will hopefully also help clarify what is actually being said in Equations 4.2.1 to 4.2.3.

As in Section 3.2, we can use the stability theorem in many ways. One interesting point is that, since all of our intervals in the barcode are now finite, the stability theorem could now be interpreted as being slightly weaker since extended persistent homology is naturally slightly stronger: if we have two half-infinite intervals then their interleaving distance is exactly the distance between their two (finite) endpoints; if we have two finite intervals then their interleaving distance is the minimum of the (maximum) distance between their endpoints and their (maximum) length. That is, if we take a space, compute its persistent homology and extended persistent homology, both using two different functions $f$ and $g$, then the stability theorem tells us that the difference between the persistent homology barcodes will be no more than $\|f - g\|_{\infty}$. It tells us the same thing for the extended persistent homology barcodes, but we know that there is a distinct possibility that the difference between the barcodes will actually be smaller still.

Of course, since this 'double bound' consists of two 'less-than-or-equal' inequalities, it could very well be the case that there is no discernible improvement in this specific manner. However, it seems that it might be worth thinking about whether we can construct extended persistent homology in such a way that the barcode distance is strictly less than that for persistent homology.

Figure 10: The homology groups and homotopy-equivalent spaces associated to the $(2M + 1)$-bifiltration $\hat{F}(a)$ of $K$.
Note the reversed symmetry in the table of relative homology groups.

Figure 11: The barcode associated to the $(2M + 1)$-bifiltration $\hat{F}(a)$ of $K$. The symmetry in the table in Fig. 10 is even more apparent here: $H_1$ is symmetric around the midpoint of $[M, M + 1]$, and $H_0$, $H_2$ seem to mirror each other. Note that all intervals are left-closed and right-open (i.e. of the form $[x, y)$).

Figure 12: Extended persistent homology ensures death in finite time, and so each region in the $(a, p)$ plane of $E_pH_k\hat{F}(a)$ is compact, as opposed to being half infinite.

For the sake of completeness we now generalise Equations 4.2.1 to 4.2.3 to a more general setting, though still assuming that our diagrams are tame. We start by looking at the simple case where $H_k\hat{F} = \chi_I$, and then extend this to the full case where $H_k\hat{F} = \bigoplus_{i=1}^{k} \chi_{I_i}$ for some $k \in \mathbb{N}$.

Clearly, if $H_k\hat{F} = \chi_I$ for some interval $I \subset \mathbb{R}$, then $E_pH_k\hat{F}(a) = 0$ whenever $a \notin I$, since $H_k\hat{F}(a) = 0$. That is, any class that is not even alive has zero persistence. Similarly, if $p \geq |I|$ then $E_pH_k\hat{F}(a) = 0$ for any choice of $a$. That is, each homology group has some element that lives the longest, and so no class will persist longer than this maximum lifespan.

In this simple case then, where $H_k\hat{F} = \chi_I$, we can summarise the extended persistent homology groups quite neatly. Write $I = [s, t)$, where $t$ is necessarily finite. Then
\[
E_pH_k\hat{F}(a) =
\begin{cases}
0 & \text{if } a \notin [s, t); \\
0 & \text{if } p \geq |s - t|; \\
0 & \text{if } a \in [s, t) \text{ but } p \geq |s - t| - |a - s|; \\
\mathbb{Z}/2\mathbb{Z} & \text{otherwise.}
\end{cases}
\tag{4.3.1}
\]
This is not necessarily the simplest way of expressing $E_pH_k\hat{F}(a)$ in terms of conditions on $a$ and $p$, but it is meant to give some intuition: the first condition corresponds to the fact that things that aren't born have zero persistence; the second to the fact that nothing can live longer than the maximal lifespan; and the third to the fact that, if something has already been alive for time $y$ then it must die after $T - y$ more time has passed, where $T$ is its total lifespan.

In the more general (but still tame) case where $H_k\hat{F} = \bigoplus_{i=1}^{k} \chi_{I_i}$ for intervals $I_i = [s_i, t_i) \subset \mathbb{R}$, we can still obtain some general formula (though, in practice, it is much easier to read this information straight off from the barcode). For $m \in \mathbb{N}$, write $\mathbb{N}_m = \{n \in \mathbb{N} \mid n \leq m\} = \{0, 1, \ldots, m\}$. Then
\[
E_pH_k\hat{F}(a) =
\begin{cases}
0 & \text{if } a \notin \bigcup_{i \in \mathbb{N}_k} [s_i, t_i); \\
0 & \text{if } p \geq \max_{i \in \mathbb{N}_k} |t_i - s_i|; \\
0 & \text{if } a \in I_j \text{ for some } j \in \mathbb{N}_k \text{ but } p \geq |t_j - s_j| - |a - s_j|; \\
(\mathbb{Z}/2\mathbb{Z})^d & \text{otherwise,}
\end{cases}
\qquad \text{where } d = \min_{t \in (a, a + p)} \Big\{ |\sigma| : \sigma \subseteq \mathbb{N}_k,\ t \in \bigcap_{i \in \sigma} I_i \Big\}.
\tag{4.3.2}
\]

A point to note when calculating the above is that, even if at times $a$ and $a + p$ the homology group $H_k$ is non-zero, if it is zero at some time $t \in (a, a + p)$ then $H_k\hat{F}(a \leq a + p)$ factors through zero, and so $E_pH_k\hat{F}(a)$ is also zero. More generally, if $H_k$ is of dimension $d$ at times $a$ and $a + p$, and it drops dimension at some time $t \in (a, a + p)$, then $E_pH_k\hat{F}(a)$ will be of dimension $d' = \min_{t \in (a, a + p)} \dim H_k\hat{F}(t)$. However, in practice it is still much easier to simply draw the barcode and read the data straight off from there.
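The single-bar case can be checked numerically. The sketch below is only an illustration under stated assumptions: `e_single_bar` transcribes the case list of Equation 4.3.1 literally, while `contains_both` uses the standard fact about interval modules that the structure map of $\chi_{[s,t)}$ from $a$ to $a + p$ is non-zero exactly when both $a$ and $a + p$ lie in $[s, t)$. Both function names and the value $M = 2$ are hypothetical; the bar $[-M, M + 1)$ is the degree-0 bar of Equation 4.2.1.

```python
def e_single_bar(s, t, a, p):
    """Dimension of E_p H_k at a for a single bar [s, t), following Equation 4.3.1 case by case."""
    if not (s <= a < t):
        return 0  # a class that is not alive has zero persistence
    if p >= abs(s - t):
        return 0  # nothing outlives the maximal lifespan
    if p >= abs(s - t) - abs(a - s):
        return 0  # the remaining lifetime at time a is too short
    return 1

def contains_both(s, t, a, p):
    """1 if the bar [s, t) contains both a and a + p, else 0."""
    return 1 if (s <= a < t and s <= a + p < t) else 0

M = 2               # hypothetical bound on the height function
s, t = -M, M + 1    # degree-0 bar of Equation 4.2.1
times = [x / 2 for x in range(-10, 15)]     # a from -5.0 to 7.0 in steps of 0.5
lifespans = [x / 2 for x in range(0, 15)]   # p from 0.0 to 7.0 in steps of 0.5

agree = all(
    e_single_bar(s, t, a, p) == contains_both(s, t, a, p)
    for a in times
    for p in lifespans
)
print(agree)  # True
```

On this grid the two descriptions agree, which matches the intuition that a class with total lifespan $T$ that has already lived for time $y$ persists for any $p < T - y$.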
(Footnote on the last condition of the general formula above: the last condition in Equation 4.3.1 could be phrased in a different way. We say that the group is $\mathbb{Z}/2\mathbb{Z}$, unless $a \in I_{j_1}, I_{j_2}$ for distinct $j_1, j_2$ and $p$ is small enough that $a + p \in I_{j_1}, I_{j_2}$, in which case the group is $\mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/2\mathbb{Z}$, unless $a \in I_{j_1}, I_{j_2}, I_{j_3}$ for distinct $j_1, j_2, j_3$ and $p$ is small enough that . . . . Obviously though, there are many different ways of phrasing a reasonably complicated set of if-then phrases; the phrasing in Equation 4.3.1 was simply the first one that occurred to the author. In particular, it seems that a simpler form could be obtained by using Eqs. (3.0.10) and (3.0.11).)

"All the really good ideas I ever had came to me while I was milking a cow."
Grant Wood

We mentioned the idea of homology inference from [2, §4] previously in passing. Here we spend a small amount of time looking at the practicality of this method in terms of computation.

The stability theorem tells us that we can estimate the persistent homology of, for example, a smooth manifold, by looking at a finite discrete subset of points. We use the finite set of points to construct a Vietoris-Rips complex: in essence, we consider a ball of radius $r$ around each point and introduce a simplex between vertices whose balls intersect. Letting $r$ vary we can obtain a (discrete) filtration, and so apply the techniques of persistent homology.

There is now quite a wide choice of software and libraries that can be used to compute persistent homology; [10] provides a thorough survey of the options. Here we use Mathematica [11] and Perseus [8]. We pick a reasonably complex 3D model from the ExampleData["Geometry3D"] library provided by Mathematica (see Fig. 13). We generate a discrete version of this model and use the DirichletDistribution function (the uniform distribution on any simplex) to randomly sample points from the discrete model, using a method given on http://mathematica.stackexchange.com/questions/57938/ by user ybeltukov. We can then use the brips option of Perseus to calculate the persistent homology of the point cloud by using the Vietoris-Rips complex. There are two (relevant) options that we can alter: the amount $s$ by which the radius increases at each step, and the number $N$ of total steps to calculate. Using the Betti numbers (i.e. the dimensions of the homology groups, noting that Perseus also works with coefficients in $\mathbb{Z}/2\mathbb{Z}$) at each stage, we can import this data back into Mathematica and use MatrixPlot to generate a sort of barcode: instead of having multiple lines in each degree of homology, we have one line and represent the dimension by brightness, so that the darker the segment the higher the dimension, with the key below showing explicit values. These barcodes are shown in Table 3.

The choice of taking an 800- and a 1000-point sample was arbitrary (it just so happens that 800 points was enough for the author to easily recognise the model, and 1000 was roughly the highest number of points for which Perseus could run in a few seconds with the given $(N, s)$), but we can still read some interesting data from the barcodes. For example, we see that, for $N = 40$, some of the homology dies off quicker in the 1000-point sample, and some is born again at a later time (and so the persistent homology more closely resembles the homology of the original model).

Although these barcodes aren't vastly different, that is to be expected, exactly because of the stability theorem: comparing the point samples in Fig. 13 we see that the maximum distance between any two points in the 800 sample is not much more than the maximum distance between any two points in the 1000 sample, and so the difference in the barcodes will be small as well. In a sense, the stability theorem tells us that we can do homology inference with a finite set of points, but also that, if we choose our points in a uniform way so that they are roughly evenly distributed, then any two random samples will give similar barcodes. An interesting consequence of this is that, if we know our point samples are roughly uniformly distributed, it might usually be enough to compute the persistent homology only once: running the computation again with different (uniformly distributed) data won't result in many new results. This is a point that we mention again, as well as the questions that it raises, in Section 6.

Figure 13: From top-left, clockwise: A 3D model of a cow; a discrete version of the model; a 1000-point random sample from the discrete model; an 800-point random sample from the discrete model.
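The Vietoris-Rips filtration described in this section can be sketched in a few lines of Python. This is not the Perseus implementation: it is a minimal illustration that only builds the edges of the complex and counts connected components (the degree-0 Betti number) with a union-find structure. The names `rips_edges` and `betti_0`, the four sample points, and the values of the step size `s` and step count `N` are all hypothetical.

```python
import itertools
import math

def rips_edges(points, r):
    """Edges of the Vietoris-Rips complex: two balls of radius r meet iff the centres are within 2r."""
    return [
        (i, j)
        for (i, p), (j, q) in itertools.combinations(enumerate(points), 2)
        if math.dist(p, q) <= 2 * r
    ]

def betti_0(n, edges):
    """Number of connected components of a graph on n vertices, via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in edges:
        parent[find(i)] = find(j)
    return len({find(x) for x in range(n)})

# Hypothetical point cloud: three nearby points and one outlier.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
s, N = 0.25, 16  # radius increment per step, and number of steps

betti_curve = [betti_0(len(points), rips_edges(points, k * s)) for k in range(N + 1)]
print(betti_curve)
```

Reading `betti_curve` from left to right gives the degree-0 row of a brightness-style barcode like those in Table 3: the count drops each time components merge as the radius grows.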

Table 3: Barcodes for the 800-point sample and the 1000-point sample, for various values of $(N, s')$. Here $N$ is the number of steps and $s'$ is the (scaled) step size. The horizontal bars in each barcode correspond to different degrees of homology.

Conclusions

"Life and death are one thread, the same line viewed from different sides."
Lao Tzu, Tao Te Ching

In this paper we have developed the tools of (extended) persistent homology using the category-theoretic language of [1], formulated the relevant stability theorems, and presented various worked examples (focusing largely on the Klein bottle immersed in $\mathbb{R}^3$) and commentary. There are some potentially interesting questions to consider with regard to the material that we have covered, though the relevance, the importance, and even the validity of them is, to the author, uncertain.

1. Distribution of point sampling: We mentioned in Section 5 that, if we were handed a uniformly sampled point cloud from a space, then we probably wouldn't gain much from asking for another set of uniformly sampled data. It seems like there should be some way of analysing how the distribution of the sampled data affects the resulting barcode: if we know that our points are distributed unevenly, say polynomially or exponentially more points are found around certain areas, can we predict how this will change the barcode? Obviously, for persistent homology, if we are dealing with some smooth manifold then it is the critical points that are of interest. So if we are told that our points were sampled from near these critical points, then how can we use this fact to improve our method of computing persistent homology? Depending on the specific embedding or immersion of the manifold, a naive calculation of the Vietoris-Rips complex might provide woefully inaccurate estimations as to the global homology.

2. Symmetry of extended persistent homology: The barcodes in Fig. 11 have an interesting symmetry to them. Part of this is easily understandable: with this 'convex' immersion of the Klein bottle and any height function, the 0th homology will always be born at $-M$ and die at $M + 1$, and the 2nd homology will always be born at $M$ and die at $3M + 1$. But how can we formalise the symmetry in the 1st homology? If we compute the extended persistent homology of the torus embedded 'vertically' in $\mathbb{R}^3$ (i.e. with the hole perpendicular to the height function) then the intervals of the two 1st homology classes have a rotational symmetry around $M + \tfrac{1}{2}$, which is different to the mirror symmetry found in Fig. 11. Is this possibly to do with the orientability of the surface? (If we had had more time, it would have been an interesting project to study other immersions of the Klein bottle in $\mathbb{R}^3$, such as the figure-8 immersion, for example.)

3. Software for extended persistent homology: As listed in [10], there are many libraries for computing persistent homology, all with their various strengths and weaknesses. None of them, however, seem to be able to compute extended persistent homology. Would it be feasible to extend them to be able to do so, or would it be easier to write new dedicated code to do this task? It also seems possible that, with the powerful tools provided by Mathematica, we could compute the (extended) persistent homology of 3D models: we can define a height function $f$; calculate the 'height slices' $F(a)$; discretise (triangulate) the resulting space; convert into simplicial data; and then calculate the homology computationally.

4. Probabilistic persistent homology:

Extended persistent homology ensures that all classes die in finite time. This means that, at any given time, the persistence of a class is simply 'how long it has left to live'. If we are handed some $k$-th homology class at time $a$ then we can calculate a probability that it is alive at time $a + p$ by simply looking at the proportion of $k$-th homology classes that persist until $a + p$. This naive idea becomes more interesting when we turn it around: given a slice of the barcode, say between times $t$ and $t'$, can we use the probabilities of a class persisting for time $p$ calculated from that slice to predict what might happen elsewhere in the barcode? That is, if we see many $k$-th homology classes being born and dying then it seems possible that this is a trait of the filtration (it is '$k$-th homology noisy') and so we can take into account the fact that these classes have a high probability of dying in a short time when trying to reconstruct the rest of the barcode from finite data. Alternatively, is there anything to be gained from adopting an entirely stochastic viewpoint of persistent homology, drawing influence from the theory of statistical lifetime models? We don't allow ourselves access to all of the information about $X$, or its barcode. Instead, we can randomly sample various homology classes and look at how long they persist. In doing so, we can build up a statistical idea of the persistent homology, and try to fit it to some probabilistic model.

References

[1] Peter Bubenik and Jonathan A Scott. "Categorification of persistent homology". In: arXiv.org (Jan. 2014).
[2] David Cohen-Steiner, Herbert Edelsbrunner, and John Harer. "Stability of Persistence Diagrams". In: Discrete & Computational Geometry.
[3] In: Foundations of Computational Mathematics.
[4] In: Surveys on Discrete and Computational Geometry. Ed. by János Pach, Jacob E Goodman, and Richard Pollack. American Mathematical Society, 2008, pp. 257-282. ISBN: 978-0-8218-4239-3.
[5] Robert Ghrist. Elementary Applied Topology. Createspace, 2014. ISBN: 978-1-5028-8085-7.
[6] Allen Hatcher. Algebraic Topology. Cambridge University Press, 2002. ISBN: 978-0-521-79540-1.
[7] William S Massey. Algebraic topology: an introduction. Harcourt, Brace & World, 1967.
[8] Vidit Nanda. Perseus, the Persistent Homology Software.
[9] Karl-Hermann Neeb. "Current groups for non-compact manifolds and their central extensions". In: Infinite Dimensional Groups and Manifolds. Ed. by Tilman Wurzbacher. 2004, pp. 109-184. ISBN: 978-3-11-018186-9.
[10] Nina Otter, Mason A Porter, Ulrike Tillmann, Peter Grindrod, and Heather A Harrington. "A roadmap for the computation of persistent homology". In: arXiv.org (June 2015).
[11] Wolfram Research, Inc. Mathematica. Version 10.4.

A Cellular homology and Poincaré duality

A.1 CW complexes and cellular homology

Here we briefly summarise the ideas behind CW complexes and cellular homology, and also agree on various notational quirks. Most of our definitions in this section come from [6]. The reader with a working knowledge of these topics can skip this section, referring back to it only when confused by potentially unfamiliar notation.

Definition A.1.1
For $m \in \mathbb{N}$ let $\mathbb{N}_m = \{n \in \mathbb{N} \mid n \leq m\} = \{0, 1, \ldots, m\}$ and define $\mathbb{N}_{\infty} = \mathbb{N}$. Call a set of the form $\mathbb{N}_k$ (for $k \in \mathbb{N} \cup \{\infty\}$) a natural interval. (This is not standard terminology, but we introduce it here to simplify certain statements throughout this paper.)

Definition A.1.2 [CW complex]
Let $\mathbb{N}_k$ be a natural interval, and let $\{e^n_{\alpha}\}_{\alpha \in \alpha_n}$ be a non-empty set of $n$-cells (copies of open $n$-discs $D^n$) for each $n \in \mathbb{N}_k$. We build a topological space $X$, called a cell (or CW) complex (C stands for 'closure-finite' and W stands for 'weak topology'), by the following inductive procedure:
(i) define $X^0 = \{e^0_{\alpha}\}$;
(ii) define the $n$-skeleton $X^n$ by attaching each $e^n_{\alpha}$ to $X^{n-1}$ via a map $\varphi^n_{\alpha} : S^{n-1} \to X^{n-1}$, i.e. take the quotient by an equivalence relation: $X^n = \big(X^{n-1} \sqcup_{\alpha} D^n_{\alpha}\big) / \{x \sim \varphi^n_{\alpha}(x)\}$; the attaching map tells us how the boundary of the closed $n$-disc gets mapped into $X^{n-1}$;
(iii) if $k \in \mathbb{N}$ then we set $X = X^k$; if $k = \infty$ then we set $X = \bigcup_n X^n$ and endow $X$ with the weak topology: a set $U \subset X$ is open if and only if $U \cap X^n$ is open for all $n \in \mathbb{N}$.
Define the CW-dimension of $X$ as $\dim_{CW}(X) = k$. We write $\sigma_n(X)$ to mean the underlying set structure of the $n$-skeleton:
\[
\sigma_n(X) =
\begin{cases}
\bigcup_{m=0}^{n} \{e^m_{\alpha}\}_{\alpha \in \alpha_m} & n \leq \dim_{CW}(X); \\
\sigma_{\dim_{CW}(X)}(X) & n > \dim_{CW}(X),
\end{cases}
\]
and define $\sigma(X) = \sigma_{\infty}(X)$.

Figure 14: The torus as a polygon with side identifications [6, p. 5, §0].

Given a polygon with side identifications we can realise it as a CW complex as follows. (This construction can be done in a more general case, see [6, Cell Complexes, Chapter 0, p. 5], but all the examples that we encounter will be of this form.) Say the polygon has $2g$ edges, with sides identified in pairs, i.e. the boundary word is some permutation of $a_1^{\pm 1} a_1^{\pm 1} \cdots a_g^{\pm 1} a_g^{\pm 1}$ (e.g. if $2g = 6$ then permissible boundary words include $aabbcc$ and $aba^{-1}cbc$). Constructing the resulting surface is equivalent to attaching $g$ 1-cells to a 0-cell with the constant attaching map, which gives a wedge sum of $g$ copies of $S^1$. We label the $i$th copy of $S^1$ with $a_i$. Then we attach a 2-cell to the wedge sum $\vee_{i=1}^{g} S^1$ along the boundary word. (Split $S^1$ into $2g$ regions: $R_j = \big\{(\cos\theta, \sin\theta) \mid \theta \in \big[\tfrac{j\pi}{g}, \tfrac{(j+1)\pi}{g}\big)\big\}$ for $j = 0, \ldots, 2g - 1$. Say that the first letter of the boundary word is $a_i^{\varepsilon}$, where $\varepsilon = \pm 1$, and pick some orientation for the copies of $S^1$ in the wedge sum $X^1 = \vee_{m=1}^{g} S^1$. Then we define $\varphi^2 : S^1 \to X^1$ by mapping, from endpoint to endpoint, $R_0$ onto the copy of $S^1$ labelled with $a_i$, reversing the orientation if $\varepsilon = -1$.) This procedure is hopefully made clear in the following example.

Example A.1.3 [CW-complex structure of the torus]
The torus $T$ can be defined as the surface resulting from the square with boundary word $aba^{-1}b^{-1}$ after side identification (see Fig. 14). This induces a CW-complex structure on $T$, where
• $\dim_{CW}(T) = 2$;
• $\sigma(T) = \{e^0\} \cup \{e^1_1, e^1_2\} \cup \{e^2\}$;
• $\varphi^1_{\alpha} : S^0 = \{-1, 1\} \to T^0 = \{e^0\}$ is the constant map;
• $\varphi^2 : S^1 \to T^1 = \vee_{i=1}^{2} S^1$ is the map $aba^{-1}b^{-1}$.
The map $\varphi^2$ is realised as follows (see Fig. 15). We label one of the circles in the wedge sum $a$ and the other one $b$, label the point where they are identified $v$, and choose some orientation: say, clockwise. Next we cut $S^1$ at a point, resulting in a line, and identify ('glue') one end to $v$. Then we glue the first quarter of this line to the circle $a$ in a clockwise manner. We glue the next quarter to the circle $b$, also clockwise. The third quarter gets glued to the circle $a$ again, but anticlockwise, and the last quarter to the circle $b$, but again anticlockwise.

Figure 15: Building the torus as a CW complex, $n$-cell by $n$-cell.

To define cellular homology of a CW complex, we first need to define the associated cellular chain complex.

Definition A.1.4 [Cellular chain complex]
Let $X$ be a CW complex. Define the cellular chain complex $C^{CW}_{\bullet}(X)$ of $X$ as
\[
C^{CW}_{\bullet}(X) = \ldots \xrightarrow{d_{n+1}} C^{CW}_{n}(X) \xrightarrow{d_n} C^{CW}_{n-1}(X) \xrightarrow{d_{n-1}} \ldots
\]
where $C^{CW}_{n}(X) = H_n(X^n, X^{n-1})$ and the $d_n$ are compositions coming from the long exact sequences for the pairs $(X^k, X^{k-1})$ and using [6, Lemma 2.34, §2.2]: see [6, §2.2, p. 139].

By [6, Lemma 2.34, §2.2] we know that $C^{CW}_{n}(X)$ is free abelian and can be thought of as being generated by the $n$-cells of $X$.

Definition A.1.5 [Cellular homology]
Let $X$ be a CW complex. Define the $n$-th cellular homology group $H^{CW}_{n}(X)$ as the $n$-th homology group of the associated cellular chain complex $C^{CW}_{\bullet}(X)$, i.e.
\[
H^{CW}_{n}(X) = \frac{\ker d_n}{\operatorname{im} d_{n+1}}.
\]

Theorem A.1.6
Let $X$ be a CW complex. Then the $n$-th cellular homology group is isomorphic to the $n$-th homology group (recall Theorem 1.2.1: we can use either simplicial or singular homology), i.e. $H^{CW}_{n}(X) \cong H_n(X)$.

The above theorem means that, given some CW complex $X$, we can write $H_n(X)$ to mean the $n$-th homology group of $X$ without specifying whether we mean singular, simplicial, or cellular: they are all the same, up to isomorphism (though sometimes, to clarify what we mean, we do use the notation $H^{CW}_{n}$). Similarly, we can use any of these three types of homology to perform explicit calculations.

Definition A.1.7 [Attach-and-collapse map]
(Again, this is not standard terminology.) Let $X$ be a CW complex, $e^n_{\alpha}$ an $n$-cell, and $e^{n-1}_{\beta}$ an $(n-1)$-cell. Define the attach-and-collapse map $\chi^n_{\alpha\beta} : S^{n-1}_{\alpha} \to S^{n-1}_{\beta}$ as the composition $\chi^n_{\alpha\beta} = q^{n-1}_{\beta}\, q^{n-1} \varphi^n_{\alpha}$, where
• $\varphi^n_{\alpha}$ is the attaching map;
• $q^{n-1} : X^{n-1} \to X^{n-1}/X^{n-2}$ is the quotient map;
• $q^{n-1}_{\beta} : X^{n-1}/X^{n-2} \to S^{n-1}_{\beta}$ is the map that collapses the complement of $e^{n-1}_{\beta}$ to a point (for precise details, see [6, 141]).

Theorem A.1.8 [Cellular boundary formula]
Let $X$ be a CW complex and
\[
C^{CW}_{\bullet}(X) = \ldots \xrightarrow{d_{n+1}} C^{CW}_{n}(X) \xrightarrow{d_n} C^{CW}_{n-1}(X) \xrightarrow{d_{n-1}} \ldots
\]
its associated cellular chain complex. Then, for $n > 1$, the boundary maps are given by
\[
d_n : e^n_{\alpha} \mapsto \sum_{\beta \in \alpha_{n-1}} \deg\big(\chi^n_{\alpha\beta}\big)\, e^{n-1}_{\beta}
\]
and $d_1$ is the same as the simplicial boundary map $d^{\Delta}_{1} : \Delta_1(X) \to \Delta_0(X)$.

Proof. See [6, pp. 140, 141, §2.2] for the case where the homology coefficients are in $\mathbb{Z}$, and see [6, Lemma 2.49, §2.2] for how to apply this with general group coefficients.

A.2 Orientability and Poincaré duality

Given some $n$-manifold we can generalise the idea of orientability to $R$-orientability, where $R$ is some commutative ring with identity, and in such a way that $\mathbb{Z}$-orientability recovers the original notion exactly. In particular, it can be shown that every manifold is $\mathbb{Z}/2\mathbb{Z}$-orientable. See [6, p. 235, §3.3] for explicit details.

Theorem A.2.1 [Poincaré Duality]
Let $M$ be an $R$-orientable closed $n$-manifold. Then $H^k(M; R) \cong H_{n-k}(M; R)$.

Proof.