A directed persistent homology theory for dissimilarity functions

Abstract

We develop a theory of persistent homology for directed simplicial complexes which detects persistent directed cycles in odd dimensions. In order to do so, we introduce a homology theory with coefficients in a semiring: by explicitly removing additive inverses, we are able to detect directed cycles algebraically. We relate directed persistent homology to classical persistent homology, prove some stability results, and discuss the computational challenges of our approach.


arXiv [math.AT], Aug

A DIRECTED PERSISTENT HOMOLOGY THEORY FOR DISSIMILARITY FUNCTIONS

DAVID MÉNDEZ AND RUBÉN J. SÁNCHEZ-GARCÍA

Abstract.

We develop a theory of persistent homology for directed simplicial complexes which detects persistent directed cycles in odd dimensions. In order to do so, we introduce a homology theory with coefficients in a semiring: by explicitly removing additive inverses, we are able to detect directed cycles algebraically. We relate directed persistent homology to classical persistent homology, prove some stability results, and discuss the computational aspects of our approach.

1. Introduction

Persistent homology is one of the most successful tools in Topological Data Analysis [2], with recent applications in numerous scientific domains such as biology, medicine, neuroscience, robotics, and many others [18]. In its most common implementation, persistent homology is used to infer topological properties of the metric space underlying a finite point cloud using two steps [10]:

(1) Build a filtration of simplicial complexes from distances, or similarities, between data points.
(2) Compute the singular homology of each of the simplicial complexes in the filtration, along with the linear maps induced in homology by their inclusions. The resulting persistence module can normally be represented using a persistence diagram or a persistence barcode.

A fundamental limitation of persistent homology, and in fact of homology, is its inability to incorporate directionality, which can be important in some real-world applications (see examples below). For instance, homology cannot in principle distinguish between directed and undirected cycles (Figure 1). Although the definition of homology, namely the differential or boundary operator, requires a choice of orientation for the simplices, the resulting homology is independent of this choice. Previous attempts have had some partial success at this issue (see 'Related work' below); however, as far as we know, there is no homology theory able to exactly detect directed cycles up to boundary equivalence.

The main difficulty occurs at the algebraic level: opposite orientations on a simplex correspond to additive inverse elements in the coefficient ring or field. Our key insight is to explicitly prevent additive inverses at the algebraic level, by extending homology to coefficients in appropriate semirings without additive inverses, in particular cancellative zerosumfree semirings such as $\mathbb{N}$ or $\mathbb{R}^+$. Despite the considerable weakening of the coefficient algebraic structure, we are able to retain most of the necessary homological algebra, and define a homology theory that detects homology classes of directed 1-cycles (Fig. 1). Furthermore, our homology theory generalises simplicial homology (which we recover when the coefficient semiring is a ring), and admits persistent versions (undirected and directed), for which we are able to prove natural stability results. Our directed persistent homology is closely related to the standard one and has a natural interpretation (see Fig. 5). For example, directed barcodes are simply a subset of undirected barcodes with possibly later birth (Figs. 2, 5, 6).

Although we set up to define 1-homology only, our homology theory can be defined in all dimensions, even if it is less clear what a directed $n$-cycle for $n > 1$ [...] $V$ of data points, and an arbitrary function $d_V \colon V \times V \to \mathbb{R}$, which we call dissimilarity function (Section 2.2) and which, crucially, may not be symmetric. Our main example of directed simplicial complex is therefore the directed Rips complex (Definition 5.13) of such a pair $(V, d_V)$, which becomes the input of our directed persistent homology pipeline.

Figure 1. Two examples of directed simplicial complexes $X$ (left) and $Y$ (right). Our homology theory over a zerosumfree semiring $\Lambda$ detects directed 1-cycles: $H_1(X, \Lambda) = 0$ while $H_1(Y, \Lambda) = \Lambda$, generated by the 1-cycle $[w_1, w_2] + [w_2, w_3] + [w_3, w_1]$.

Key words and phrases. Persistent homology; topological data analysis; dissimilarity functions; directed cycles.

Related work.

A first approach to incorporating directions would be to modify step (1) above so that we encode the asymmetry of a data set in the simplicial complex built from it. In [23], the author uses ordered tuple complexes or OT-complexes, which are generalisations of simplicial complexes where simplices are ordered tuples of vertices. Similar ideas have been successfully used in the field of neuroscience to show the importance of directed cliques of neurons [21], showing that directionality in neuron connectivity plays a crucial role in the structure and function of the brain.

In [7], the authors use so-called Dowker filtrations to develop persistent homology for asymmetric networks, and note [7, Remark 35] that a non-trivial 1-dimensional persistence diagram associated to Dowker filtrations suggests the presence of directed cycles. However, they also note that such a persistence diagram may be non-trivial even if directed cycles are not present.

Further progress can be achieved by modifying both steps (1) and (2). Namely, the ideas above can be used in combination with homology theories that are themselves sensitive to asymmetry. In [8], the authors develop persistent homology for directed networks using the theory of path homology of graphs [13]. This persistent homology theory shows stability with respect to the distance between asymmetric networks in [3], and is indeed able to tell apart digraphs with isomorphic underlying graphs but different orientations on the edges. Nonetheless, cycles in path homology do not correspond to directed cycles, see [8, Example 10].

Another significant contribution can be found in [23], where four new approaches to persistent homology are developed for asymmetric data, all of which are shown to be stable. Each approach is sensitive to asymmetry in a different way. For example, one of them uses a generalisation of poset homology to preorders to detect strongly connected components in digraphs, a feature our implementation does not have (see Proposition 4.19).

More interestingly for our purposes, in [23, Section 5] the author introduces a persistent homology approach that builds a directed Rips filtration of ordered tuple complexes associated to dissimilarity functions $d_V \colon V \times V \to \mathbb{R}$. This modification of step (1) is indeed sensitive to asymmetry and yields a persistent homology pipeline that is stable with respect to the correspondence distortion distance, a generalisation of the Gromov-Hausdorff distance to dissimilarity functions (see [23, Section 2] and Section 2.2). However, the information we want to capture is lost in this approach as homology over rings is used for step (2).

Overview of results.

Our starting point is the theory of chain complexes of semimodules and their corresponding homology developed by Patchkoria [19, 20]. We introduce homology over semimodules for directed simplicial complexes (Definition 4.6) and show that it satisfies some desirable properties. Furthermore, we prove that for certain coefficient semirings, closed paths will only have non-trivial homology in dimension 1 when their edges form a directed cycle, as desired (Proposition 4.26). It is worth mentioning, however, that our homology theory does not seem suited to detect directed information in even dimensions, as it is zero in even dimensions (Proposition 4.27) except dimension 0, where it is only able to detect weakly connected components.

We then extend our homology theory to the persistence setting. We cannot use semimodules, as their lax algebraic structure means that persistence semimodules may not have persistence diagrams or barcodes. Instead, we use the module completion of the considered semimodules. This allows us to introduce, for each dimension, two persistence modules associated to the same filtration: an undirected persistence module (Definition 5.2), which is analogous to the one introduced in [23, Section 5], and the submodule generated by directed classes, which we call the directed persistence module (Definition 5.5). The directed persistence module gives rise to barcodes associated to homology classes that can be represented by directed cycles, as illustrated in Fig. 2 and Examples 5.9 to 5.12.

Figure 2. A filtration of a directed simplicial complex and the corresponding undirected (bottom, left) and directed (bottom, right) persistence barcodes.

We also establish a relation between the undirected and directed persistence barcodes of the same filtration. Indeed, in Proposition 5.8 we show that every bar in the directed barcode corresponds to a unique bar in the undirected barcode that dies at the same time. The directed bar may, however, be born later, and some undirected bars may be left unmatched (see Figs. 2, 5 and 6).

Having all the necessary ingredients, we can provide a complete pipeline for directed persistent homology: given a dissimilarity function $d_V$ on a finite set $V$, we construct its directed Rips filtration (Definition 5.13), and take the corresponding undirected and directed $n$-dimensional persistence modules, denoted $H_n(V, d_V)$ and $H^{\mathrm{Dir}}_n(V, d_V)$, respectively (Definition 5.14). Both of these persistence modules have persistence diagrams and barcodes (Definition 5.16) and the associated persistence modules are stable with respect to the correspondence distortion distance (Section 2.2 and Theorem 5.20).

We finish this article by briefly looking at possible algorithmic implementations for directed persistent homology. We show that the Standard Algorithm can be adapted to this context, allowing for the efficient computation of the barcodes associated to the undirected persistence modules (see Algorithm 1). However, we also give some reasons why the computation of the 1-dimensional directed persistence modules seems more challenging.

Outline of the paper.

In Section 2, we introduce the necessary background in persistence modules (2.1) and dissimilarity functions (2.2). Section 3 is devoted to the necessary algebraic background on semirings and semimodules (3.1) and chain complexes of semimodules and their homologies (3.2). In Section 4, we introduce the homology over semirings of directed simplicial complexes, show its functoriality, and study some of its properties. We then use this homology theory to introduce directed persistent homology in Section 5, where we also show our stability results. Finally, in Section 6, we make some comments regarding the algorithmic implementation of directed persistent homology and lay out some future research directions.

2. Persistence modules and dissimilarity functions

Our goal is to extend persistent homology to directed simplicial complexes such as those constructed from a dissimilarity function. In this section, we introduce the necessary background regarding persistent homology (Section 2.1) and dissimilarity functions (Section 2.2).

2.1. Persistence modules, diagrams and barcodes.

In this section we introduce persistence modules and their associated persistence diagrams and barcodes. We follow the exposition in [8], as it is simple and focused on persistence modules arising in the context we are interested in: that of persistent homology of finite simplicial complexes. Many of the results below hold with much more generality [5] and could be used to extend the results in this paper to infinite settings, although we chose to keep the exposition simple.

Let $R$ be a ring with unity and let $T \subseteq \mathbb{R}$ be a subset.

Definition 2.1. A persistence $R$-module over $T$, written $\mathcal{V} = \big(\{V^\delta\}, \{\nu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in T}$, is a family of $R$-modules $\{V^\delta\}_{\delta \in T}$ and homomorphisms $\nu_\delta^{\delta'} \colon V^\delta \to V^{\delta'}$, whenever $\delta \le \delta' \in T$, such that
(1) for every $\delta \in T$, $\nu_\delta^\delta$ is the identity map, and
(2) for every $\delta \le \delta' \le \delta'' \in T$, $\nu_{\delta'}^{\delta''} \circ \nu_\delta^{\delta'} = \nu_\delta^{\delta''}$.

Let $\mathcal{V} = \big(\{V^\delta\}, \{\nu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in T}$ and $\mathcal{W} = \big(\{W^\delta\}, \{\mu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in T}$ be two persistence $R$-modules over $T$. A morphism of persistence $R$-modules $f \colon \mathcal{V} \to \mathcal{W}$ is a family of morphisms of $R$-modules $\{f_\delta \colon V^\delta \to W^\delta\}_{\delta \in T}$ such that for every $\delta \le \delta' \in T$ the square formed by $\nu_\delta^{\delta'}$, $\mu_\delta^{\delta'}$, $f_\delta$ and $f_{\delta'}$ commutes, that is,
$$f_{\delta'} \circ \nu_\delta^{\delta'} = \mu_\delta^{\delta'} \circ f_\delta.$$

Let us assume that $R$ is a field, thus $\mathcal{V} = \big(\{V^\delta\}, \{\nu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in T}$ is a persistence vector space, and that $V^\delta$ is finite-dimensional for every $\delta \in T$. Furthermore, we suppose that there exists a finite subset $\{\delta_0, \delta_1, \dots, \delta_n\} \subseteq T$ for which
(1) if $\delta \in T$, $\delta \le \delta_0$, then $V^\delta = 0$,
(2) if $\delta \in T \cap [\delta_{i-1}, \delta_i)$ for some $1 \le i \le n$, the map $\nu_{\delta_{i-1}}^{\delta}$ is an isomorphism, while $\nu_\delta^{\delta_i}$ is not, and
(3) if $\delta, \delta' \in T$, $\delta_n \le \delta < \delta'$, then $\nu_\delta^{\delta'} \colon V^\delta \to V^{\delta'}$ is an isomorphism.

Remark 2.2.

When $R$ is a field, either of these restrictions (namely, each $V^\delta$ being finite-dimensional or the existence of a finite subset $\{\delta_0, \delta_1, \dots, \delta_n\}$ as above) is enough to associate a persistence barcode and a persistence diagram to $\mathcal{V}$ [5, Theorem 2.8]. We nevertheless assume both restrictions since they hold for persistence diagrams associated to finite simplicial complexes, and make the exposition simpler.

Let us now introduce persistence barcodes and diagrams. First, for simplicity, we consider $\mathcal{V}$ indexed by the naturals:
$$\mathcal{V} = \big(\{V^{\delta_i}\}, \{\nu_{\delta_i}^{\delta_{i+1}}\}\big)_{i \in \mathbb{N}},$$
where $V^{\delta_k} = V^{\delta_n}$ for all $k \ge n$ and $\nu_{\delta_k}^{\delta_l}$ is the identity whenever $k, l \ge n$. (This clearly contains all of the information in $\mathcal{V}$.)

By [11, Basis Lemma], we can find bases $B_i$ of the vector spaces $V^{\delta_i}$, $i \in \mathbb{N}$, such that
(1) $\nu_{\delta_i}^{\delta_{i+1}}(B_i) \subseteq B_{i+1} \cup \{0\}$,
(2) $\operatorname{rank}(\nu_{\delta_i}^{\delta_{i+1}}) = |\nu_{\delta_i}^{\delta_{i+1}}(B_i) \cap B_{i+1}|$, and
(3) each $w \in \operatorname{Im}\big(\nu_{\delta_i}^{\delta_{i+1}}\big) \cap B_{i+1}$ is the image of exactly one element $v \in B_i$.

Such bases are called compatible bases. Elements in $B_i$ that are mapped to an element in $B_{i+1}$ correspond to linearly independent elements of $V^{\delta_i}$ that 'survive' until the next step in the persistence vector space. Similarly, elements in a basis $B_i$ which are not in the image of $B_{i-1}$ are considered to be 'born' at index $i$. Formally, we define
$$L := \{(b, i) \mid b \in B_i,\ b \notin \operatorname{Im}(\nu_{\delta_{i-1}}^{\delta_i}),\ i > 0\} \cup \{(b, 0) \mid b \in B_0\}.$$
Given $(b, i) \in L$, we call $i$ the birth index of the basis element $b$. This element 'survives' in subsequent bases until it is eventually mapped to zero. The last index at which it survives is the death index of $b$. Formally, the death index of $(b, i)$ is
$$\ell(b, i) := \max\{j \in \mathbb{N} \mid (\nu_{\delta_{j-1}}^{\delta_j} \circ \cdots \circ \nu_{\delta_{i+1}}^{\delta_{i+2}} \circ \nu_{\delta_i}^{\delta_{i+1}})(b) \in B_j\}.$$
We allow $\ell(b, i) = \infty$ if $b$ is in every basis $B_j$ for all $j \ge i$, so that the death index takes values in $\overline{\mathbb{N}} = \mathbb{N} \cup \{+\infty\}$.

Using this information, we can introduce the persistence barcode of $\mathcal{V}$. We call a pair $(X, m)$ a multiset if $X$ is a set and $m \colon X \to \overline{\mathbb{N}} = \mathbb{N} \cup \{+\infty\}$ a function. We call $m(x)$ the multiplicity of $x \in X$.

Definition 2.3.

Consider a persistence vector space $\mathcal{V} = \big(\{V^{\delta_i}\}, \{\nu_{\delta_i}^{\delta_{i+1}}\}\big)_{i \in \mathbb{N}}$ as above. The persistence barcode of $\mathcal{V}$ is then defined as the multiset of intervals
$$\operatorname{Pers}(\mathcal{V}) := \big\{[\delta_i, \delta_{j+1}) \mid \exists (b, i) \in L \text{ s.t. } \ell(b, i) = j\big\} \cup \big\{[\delta_i, +\infty) \mid \exists (b, i) \in L \text{ s.t. } \ell(b, i) = +\infty\big\},$$
the multiplicity of $[\delta_i, \delta_{j+1})$ (respectively $[\delta_i, +\infty)$) being the number of elements $(b, i) \in L$ such that $\ell(b, i) = j$ (respectively $\ell(b, i) = +\infty$).

Thus, persistence barcodes encode the birth and death of elements in a family of compatible bases. Crucially, and even though compatible bases are not unique, the number of elements born and dying at each step is determined by the rank of the linear maps $\nu_{\delta_i}^{\delta_{i+1}}$ and their compositions, and it is thus independent of the choice of compatible bases. Furthermore, the persistence barcode of a persistence vector space completely determines both the dimension of the vector spaces $V^{\delta_i}$ and the ranks of the linear maps between them, hence it determines the persistence vector space up to isomorphism.

Each interval in $\operatorname{Pers}(\mathcal{V})$ is called a persistence interval. Persistence barcodes can be represented by stacking horizontal lines, each of which represents a persistence interval. The endpoints of the line in the horizontal axis correspond to the endpoints of the interval it represents, whereas the vertical axis has no significance other than being able to represent every persistence interval at once. Persistence bars can be stacked in any order, although they are usually ordered by their birth. This representation is what gives persistence barcodes their name.

An alternative characterisation of a persistence vector space is its persistence diagram. Let us write $\overline{\mathbb{R}} = \mathbb{R} \cup \{-\infty, +\infty\}$ for the extended real line.

Definition 2.4.

The persistence diagram of the persistence vector space $\mathcal{V}$ is the multiset
$$\operatorname{Dgm}(\mathcal{V}) := \big\{(\delta_i, \delta_{j+1}) \in \overline{\mathbb{R}}^2 \mid [\delta_i, \delta_{j+1}) \in \operatorname{Pers}(\mathcal{V})\big\} \cup \big\{(\delta_i, +\infty) \in \overline{\mathbb{R}}^2 \mid [\delta_i, +\infty) \in \operatorname{Pers}(\mathcal{V})\big\}.$$
The multiplicity of a point in $\operatorname{Dgm}(\mathcal{V})$ is the multiplicity of the corresponding interval in $\operatorname{Pers}(\mathcal{V})$.

A crucial property for applications of persistent homology to real data is that small perturbations of the input (a data set, encoded as a filtration of simplicial complexes) result in a small perturbation of the output. Algebraically, this amounts to proving that a small perturbation of the persistence module results in a small perturbation of the associated persistence diagram. In order to state this stability result, we therefore need to introduce distances between persistence diagrams, and persistence modules, respectively.

In order to measure how far apart two persistence diagrams are, we can use the bottleneck distance between multisets of $\overline{\mathbb{R}}^2$. Let $\Delta^\infty$ denote the multiset of $\overline{\mathbb{R}}^2$ consisting of every point in the diagonal counted with infinite multiplicity. A bijection of multisets $\varphi \colon (X, m_X) \to (Y, m_Y)$ is a bijection of sets
$$\varphi \colon \bigcup_{x \in X} \bigsqcup_{i=1}^{m_X(x)} x \to \bigcup_{y \in Y} \bigsqcup_{i=1}^{m_Y(y)} y,$$
that is, a bijection between the sets obtained when counting each element in both $X$ and $Y$ with its multiplicity.

Definition 2.5.

Let $A$ and $B$ be two multisets in $\overline{\mathbb{R}}^2$. The bottleneck distance between $A$ and $B$ is defined as
$$d_B(A, B) := \inf_\varphi \Big\{ \sup_{a \in A} \| a - \varphi(a) \|_\infty \Big\},$$
where the infimum is taken over all bijections of multisets $\varphi \colon A \cup \Delta^\infty \to B \cup \Delta^\infty$.

We also need a distance between persistence vector spaces. To that purpose, we use the interleaving distance, introduced in [4].

Definition 2.6.

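Let $\mathcal{V} = \big(\{V^\delta\}, \{\nu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in \mathbb{R}}$ and $\mathcal{W} = \big(\{W^\delta\}, \{\mu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in \mathbb{R}}$ be two persistence vector spaces, and $\varepsilon \ge 0$. We say that $\mathcal{V}$ and $\mathcal{W}$ are $\varepsilon$-interleaved if there exist two families of linear maps
$$\{\varphi_\delta \colon V^\delta \to W^{\delta+\varepsilon}\}_{\delta \in \mathbb{R}}, \qquad \{\psi_\delta \colon W^\delta \to V^{\delta+\varepsilon}\}_{\delta \in \mathbb{R}}$$
such that the four interleaving diagrams commute for all $\delta' \ge \delta \in \mathbb{R}$, that is,
$$\varphi_{\delta'} \circ \nu_\delta^{\delta'} = \mu_{\delta+\varepsilon}^{\delta'+\varepsilon} \circ \varphi_\delta, \qquad \psi_{\delta'} \circ \mu_\delta^{\delta'} = \nu_{\delta+\varepsilon}^{\delta'+\varepsilon} \circ \psi_\delta,$$
$$\psi_{\delta+\varepsilon} \circ \varphi_\delta = \nu_\delta^{\delta+2\varepsilon}, \qquad \varphi_{\delta+\varepsilon} \circ \psi_\delta = \mu_\delta^{\delta+2\varepsilon}.$$
The interleaving distance between $\mathcal{V}$ and $\mathcal{W}$ is then defined as
$$d_I(\mathcal{V}, \mathcal{W}) = \inf\{\varepsilon \ge 0 \mid \mathcal{V} \text{ and } \mathcal{W} \text{ are } \varepsilon\text{-interleaved}\}.$$
The authors in [4] show that the interleaving distance is a pseudometric (a zero distance between distinct points may occur) in the class of persistence vector spaces. Moreover, they show the following Algebraic Stability Theorem.

Theorem 2.7 ([4]). Let $\mathcal{V} = \big(\{V^\delta\}, \{\nu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in \mathbb{R}}$ and $\mathcal{W} = \big(\{W^\delta\}, \{\mu_\delta^{\delta'}\}\big)_{\delta \le \delta' \in \mathbb{R}}$ be two persistence vector spaces. Then,
$$d_B\big(\operatorname{Dgm}(\mathcal{V}), \operatorname{Dgm}(\mathcal{W})\big) \le d_I(\mathcal{V}, \mathcal{W}).$$

2.2. Dissimilarity functions and the correspondence distortion distance.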

Our objective is to define a theory of persistent homology able to detect directed cycles modulo boundaries. Therefore, instead of building filtrations (of simplicial complexes) from finite metric spaces, we are interested in filtrations (of directed simplicial complexes) built from arbitrary dissimilarity functions. We will follow [23], where they receive the name of set-function pairs, and [3, 7, 8], where they are referred to as dissimilarity networks or asymmetric networks.

Namely, in this section, we introduce the necessary background on dissimilarity functions, and the correspondence distortion distance between them, which is a generalisation of the Gromov-Hausdorff distance to the asymmetric setting. We finish with a reformulation of this distance, established in [3], which we will need to prove our persistent homology stability result.

Definition 2.8.

Let $V$ be a finite set. A dissimilarity function $(V, d_V)$ on $V$ is a function $d_V \colon V \times V \to \mathbb{R}$. The value of $d_V$ on a pair $(v_1, v_2)$ may be interpreted as the distance or dissimilarity from $v_1$ to $v_2$. Note that no restrictions are imposed on $d_V$, thus it may not be symmetric, the triangle inequality may not hold, and the distance from a point to itself may not be zero.

These functions are also referred to as asymmetric networks [3, 7, 8] since they may be represented as a network with vertex set $V$ and an edge from $v_1$ to $v_2$ with weight $d_V(v_1, v_2)$ for every $(v_1, v_2) \in V \times V$.

We would like to build (directed) simplicial complexes and, ultimately, persistence diagrams from dissimilarity functions. In order to check the stability of our constructions, we need a way to measure how close two such objects are. When comparing networks with the same vertex sets, a natural choice is the $\ell^\infty$ norm. However, we are interested in comparing dissimilarity functions on different vertex sets. To that end, we consider the $\ell^\infty$ norm over all possible pairings (quantified by a binary relation) between vertex sets, following similar ideas behind the Gromov-Hausdorff distance definition.

Definition 2.9.

Let $(V, d_V)$ and $(W, d_W)$ be two dissimilarity functions and let $R$ be a non-empty binary relation between $V$ and $W$, that is, an arbitrary subset $R \subseteq V \times W$. The distortion of the relation $R$ is defined as
$$\operatorname{dis}(R) := \max_{(v_1, w_1), (v_2, w_2) \in R} |d_V(v_1, v_2) - d_W(w_1, w_2)|.$$
A correspondence between $V$ and $W$ is a relation $R$ between these sets such that $\pi_V(R) = V$ and $\pi_W(R) = W$, where $\pi_V \colon V \times W \to V$ is the projection onto $V$, and similarly for $\pi_W$. That is, $R$ is a correspondence if every element of $V$ is related to at least one element of $W$, and vice versa. The set of all correspondences between $V$ and $W$ is denoted $\mathcal{R}(V, W)$.

The correspondence distortion distance [23] between $(V, d_V)$ and $(W, d_W)$ is defined as
$$d_{CD}\big((V, d_V), (W, d_W)\big) = \frac{1}{2} \min_{R \in \mathcal{R}(V, W)} \operatorname{dis}(R).$$
We will use a reformulation of this distance that can be found in [7]. In order to introduce it, we need to define the distortion and co-distortion of maps between sets endowed with dissimilarity functions.
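As an illustration of Definition 2.9, the following is a minimal brute-force sketch of the correspondence distortion distance for very small vertex sets. It is not taken from the paper: the function names (`distortion`, `is_correspondence`, `correspondence_distortion_distance`) and the encoding of a dissimilarity function as a nested list indexed by `0, ..., n-1` are assumptions made for this example, and enumerating every relation is exponential in $|V| \cdot |W|$, so it is only meant to make the definition concrete.

```python
from itertools import product


def distortion(relation, d_v, d_w):
    """dis(R): largest |d_V(v1, v2) - d_W(w1, w2)| over (v1, w1), (v2, w2) in R."""
    return max(
        abs(d_v[v1][v2] - d_w[w1][w2])
        for (v1, w1), (v2, w2) in product(relation, repeat=2)
    )


def is_correspondence(relation, n_v, n_w):
    """A relation is a correspondence if both projections are surjective."""
    covers_v = {v for v, _ in relation} == set(range(n_v))
    covers_w = {w for _, w in relation} == set(range(n_w))
    return covers_v and covers_w


def correspondence_distortion_distance(d_v, d_w):
    """Half the minimum distortion over all correspondences (brute force)."""
    n_v, n_w = len(d_v), len(d_w)
    pairs = list(product(range(n_v), range(n_w)))
    best = None
    # Enumerate every non-empty subset of V x W via a bitmask.
    for mask in range(1, 2 ** len(pairs)):
        relation = [p for k, p in enumerate(pairs) if (mask >> k) & 1]
        if is_correspondence(relation, n_v, n_w):
            dis = distortion(relation, d_v, d_w)
            best = dis if best is None else min(best, dis)
    return best / 2


# An asymmetric dissimilarity function on two points, and its transpose.
D_V = [[0, 1], [3, 0]]
D_V_TRANSPOSED = [[0, 3], [1, 0]]
# The one-point dissimilarity function.
D_POINT = [[0]]
```

With these assumptions, comparing `D_V` with `D_POINT` leaves only one correspondence (both points of $V$ related to the single point of $W$), whose distortion is the largest entry of `D_V`. Comparing `D_V` with its transpose gives distance zero, because swapping the two vertices is a relabelling that matches the two functions exactly.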

Definition 2.10.

Let $(V, d_V)$ and $(W, d_W)$ be any two dissimilarity functions and let $\varphi \colon V \to W$ and $\psi \colon W \to V$ be maps of sets. The distortion of $\varphi$ (with respect to $d_V$ and $d_W$) is defined as
$$\operatorname{dis}(\varphi) := \max_{v_1, v_2 \in V} \big| d_V(v_1, v_2) - d_W\big(\varphi(v_1), \varphi(v_2)\big) \big|.$$
The co-distortion of $\varphi$ and $\psi$ (with respect to $d_V$ and $d_W$) is defined as
$$\operatorname{codis}(\varphi, \psi) := \max_{(v, w) \in V \times W} \big| d_V\big(v, \psi(w)\big) - d_W\big(\varphi(v), w\big) \big|.$$
Note that co-distortion is not necessarily symmetric, namely, $\operatorname{codis}(\varphi, \psi)$ and $\operatorname{codis}(\psi, \varphi)$ may be different if either of the dissimilarity functions is asymmetric.

Finally, we have the following reformulation of the correspondence distortion distance.

Proposition 2.11 ([7, Proposition 9]). Let $(V, d_V)$ and $(W, d_W)$ be any two dissimilarity functions. Then,
$$d_{CD}\big((V, d_V), (W, d_W)\big) = \frac{1}{2} \min_{\varphi \colon V \to W,\ \psi \colon W \to V} \big\{ \max\{\operatorname{dis}(\varphi), \operatorname{dis}(\psi), \operatorname{codis}(\varphi, \psi), \operatorname{codis}(\psi, \varphi)\} \big\}.$$

3. Chain complexes of semimodules

In this section we introduce the necessary algebraic background to define directed homology. Namely, in Section 3.1, we introduce semigroups, semirings and semimodules, along with some of their basic properties. Then, in Section 3.2, we present the theory of chain complexes of semimodules and their associated homologies due to Patchkoria [19, 20].

3.1. Semirings, semimodules and their completions.

Let us begin by introducing the algebraic structures that we need, for which our main reference is [12].

Definition 3.1. A semiring $\Lambda = (\Lambda, +, \cdot)$ is a set $\Lambda$ together with two operations such that
• $(\Lambda, +)$ is an abelian monoid whose identity element we denote $0_\Lambda$,
• $(\Lambda, \cdot)$ is a monoid whose identity element we denote $1_\Lambda$,
• $\cdot$ is distributive with respect to $+$ from either side,
• $0_\Lambda \cdot \lambda = \lambda \cdot 0_\Lambda = 0_\Lambda$, for all $\lambda \in \Lambda$.

A semiring $\Lambda$ is commutative if $(\Lambda, \cdot)$ is a commutative monoid, and cancellative if $(\Lambda, +)$ is a cancellative monoid, that is, $\lambda + \lambda' = \lambda + \lambda''$ implies $\lambda' = \lambda''$ for all $\lambda, \lambda', \lambda'' \in \Lambda$. A semiring $\Lambda$ is a semifield if every $0_\Lambda \ne \lambda \in \Lambda$ has a multiplicative inverse. A semiring is zerosumfree if no element other than $0_\Lambda$ has an additive inverse.

Example 3.2.

Every ring is a semiring. The non-negative integers, rationals and reals with their usual operations, respectively denoted $\mathbb{N}$, $\mathbb{Q}^+$ and $\mathbb{R}^+$, are cancellative zerosumfree commutative semirings which are not rings.

Definition 3.3.

Let $\Lambda$ be a semiring. A (left) $\Lambda$-semimodule is an abelian monoid $(A, +)$ with identity element $0_A$ together with a map $\Lambda \times A \to A$, which we denote $(\lambda, a) \mapsto \lambda a$, such that for all $\lambda, \lambda' \in \Lambda$ and $a, a' \in A$,
• $(\lambda \lambda') a = \lambda (\lambda' a)$,
• $\lambda (a + a') = \lambda a + \lambda a'$,
• $(\lambda + \lambda') a = \lambda a + \lambda' a$,
• $1_\Lambda a = a$,
• $\lambda 0_A = 0_A = 0_\Lambda a$.

A non-empty subset $B$ of a left $\Lambda$-semimodule $A$ is a subsemimodule of $A$ if $B$ is closed under addition and scalar multiplication, which implies that $B$ is a left $\Lambda$-semimodule with identity element $0_A \in B$. If $A$ and $B$ are $\Lambda$-semimodules, a $\Lambda$-homomorphism is a map $f \colon A \to B$ such that for all $a, a' \in A$ and for all $\lambda \in \Lambda$,
• $f(a + a') = f(a) + f(a')$,
• $f(\lambda a) = \lambda f(a)$.

Clearly, $f(0_A) = 0_B$.

A $\Lambda$-semimodule $A$ is cancellative if $a + a' = a + a''$ implies $a' = a''$, and zerosumfree if $a + a' = 0$ implies $a = a' = 0$, for all $a, a', a'' \in A$.

The direct product of $\Lambda$-semimodules is a $\Lambda$-semimodule. The direct sum of $\Lambda$-semimodules can also be defined analogously to the direct sum of modules, and a finite direct sum is isomorphic to the corresponding direct product. Quotient $\Lambda$-semimodules can be defined using congruence relations.

Definition 3.4.

Let $A$ be a left $\Lambda$-semimodule. An equivalence relation $\rho$ on $A$ is a $\Lambda$-congruence if, for all $a, a', b, b' \in A$ and all $\lambda \in \Lambda$,
• if $a \sim_\rho a'$ and $b \sim_\rho b'$, then $(a + b) \sim_\rho (a' + b')$, and
• if $a \sim_\rho a'$, then $\lambda a \sim_\rho \lambda a'$.

If $\rho$ is a $\Lambda$-congruence relation on $A$ and we write $a/\rho$ for the class of an element $a \in A$, then $A/\rho = \{a/\rho \mid a \in A\}$ inherits a $\Lambda$-semimodule structure by setting $(a/\rho) + (a'/\rho) = (a + a')/\rho$ and $\lambda (a/\rho) = (\lambda a)/\rho$, for all $a, a' \in A$ and $\lambda \in \Lambda$. The left $\Lambda$-semimodule $A/\rho$ is called the factor semimodule of $A$ by $\rho$. Note that the quotient map $A \to A/\rho$ is a surjective $\Lambda$-homomorphism.

If $B$ is a subsemimodule of a $\Lambda$-semimodule $A$, then it determines a $\Lambda$-congruence $\sim_B$ by setting $a \sim_B a'$ if there exist $b, b' \in B$ such that $a + b = a' + b'$. Classes in this quotient are denoted $a/B$, and the factor semimodule is denoted $A/B$.

Clearly, if $\Lambda$ is a semiring then it is a $\Lambda$-semimodule over itself. The idea of free $\Lambda$-semimodules comes about naturally just as in the case of rings, and we will make extensive use of these objects.

Definition 3.5.

Let $\Lambda$ be a semiring, $A$ a left $\Lambda$-semimodule and $V = \{v_1, v_2, \dots, v_n\}$ a finite subset of $A$. The set $V$ is a generating set for $A$ if every element in $A$ is a linear combination of elements of $V$. The rank of a $\Lambda$-semimodule $A$, denoted $\operatorname{rank}(A)$, is the least $n$ for which there is a set of generators of $A$ with cardinality $n$, or infinity if no such $n$ exists.

The set $V$ is linearly independent if, for any $\lambda_1, \lambda_2, \dots, \lambda_n \in \Lambda$ and $\mu_1, \mu_2, \dots, \mu_n \in \Lambda$ such that $\sum_{i=1}^n \lambda_i v_i = \sum_{i=1}^n \mu_i v_i$, we have $\lambda_i = \mu_i$ for $i = 1, 2, \dots, n$, and it is called linearly dependent otherwise. We call $V$ a basis of $A$ if it is a linearly independent generating set of $A$.

The $\Lambda$-semimodule $A$ is a free $\Lambda$-semimodule if it admits a basis $V$. We denote a free $\Lambda$-semimodule with basis $\{v_1, v_2, \dots, v_n\}$ by $\Lambda(v_1, v_2, \dots, v_n)$.

Remark 3.6.

It is easy to check that free $\Lambda$-semimodules over a cancellative semiring $\Lambda$ are themselves cancellative, as are subsemimodules of cancellative $\Lambda$-semimodules. The quotient of a cancellative $\Lambda$-semimodule over any of its subsemimodules is also cancellative, see [12, Proposition 15.24].

Note that if $\Lambda$ is a ring, free $\Lambda$-semimodules are just free $\Lambda$-modules. Thus, as not every module over a ring is free, clearly not every $\Lambda$-semimodule is free. On the other hand, $\Lambda^n$ (the direct sum, or product, of $n$ copies of $\Lambda$) is clearly a free $\Lambda$-semimodule, for every $n \ge 1$. Another key property is that if $A$ is a free $\Lambda$-semimodule with basis $V$ and $B$ is another $\Lambda$-semimodule, each map $V \to B$ uniquely extends to a $\Lambda$-homomorphism $A \to B$, see [12, Proposition 17.12].

Remark 3.7.

If $\Lambda$ is a commutative semiring and $A$ is a $\Lambda$-semimodule admitting a finite basis, every basis has the same cardinality, which coincides with the rank of $A$ [22, Theorem 3.4]. In fact, for every integer $n > 0$, every finitely generated subsemimodule of $\mathbb{N}^n$ has a unique basis, whereas bases for subsemimodules of $(\mathbb{R}^+)^n$ are unique up to non-zero multiples, see e.g. [15, Theorem 2.1].

We finish this section by extending the Grothendieck construction from abelian monoids to semirings and semimodules.

Definition 3.8.

Let $M$ be an abelian monoid. Consider the equivalence relation $\sim$ in $M \times M$ defined as
$$(u, v) \sim (x, y) \iff \text{there exists } z \in M \text{ such that } u + y + z = v + x + z.$$
Let $[u, v]$ denote the equivalence class of $(u, v)$. Then $M \times M / \sim$ becomes a group with the componentwise addition. This group, denoted $K(M)$, is called the Grothendieck group or group completion of $M$. There is a canonical homomorphism of monoids $k_M \colon M \to K(M)$ defined as $k_M(x) = [x, 0]$. If $M$ is a cancellative monoid, then $k_M$ is injective.

Given an element $[u, v] \in K(M)$ we can interpret $u$ as its positive part and $v$ as its negative part, and the relation $\sim$ becomes the obvious one. The identity element is then $0 = [x, x]$ for any element $x \in M$, and the inverse of $[x, y]$ is $[y, x]$ for any $x, y \in M$.

If $f \colon M \to N$ is a morphism of monoids, there is a morphism of groups $K(f) \colon K(M) \to K(N)$ that takes $[u, v]$ to $[f(u), f(v)]$. In fact, $K$ is a functor from the category of abelian monoids to the category of abelian groups. Furthermore, $K(f) \circ k_M = k_N \circ f$.

It is easy to check that $K(\mathbb{N}) \cong \mathbb{Z}$ and that $K(\mathbb{R}^+) \cong \mathbb{R}$. Crucially for us, this construction can be extended to semirings and semimodules.

Definition 3.9.

Let $\Lambda$ be a semiring. The group completion $K(\Lambda)$ becomes a ring with the operation $[x_1, x_2] \cdot [y_1, y_2] = [x_1 y_1 + x_2 y_2, x_1 y_2 + x_2 y_1]$, and $k_\Lambda \colon \Lambda \to K(\Lambda)$ is in fact a morphism of semirings, which is injective if $\Lambda$ is cancellative. We call $K(\Lambda)$ the ring completion of $\Lambda$. $K(f)$ is also a morphism of rings, and $K$ is a functor from the category of semirings to the category of rings.

If $A$ is a $\Lambda$-semimodule, then the abelian group $K(A)$ together with the operation $K(\Lambda) \times K(A) \to K(A)$ given by $[\lambda_1, \lambda_2][a_1, a_2] = [\lambda_1 a_1 + \lambda_2 a_2, \lambda_1 a_2 + \lambda_2 a_1]$ is a $K(\Lambda)$-module, the $K(\Lambda)$-module completion of $A$. Again, if $f \colon A \to B$ is a $\Lambda$-morphism, $K(f) \colon K(A) \to K(B)$ becomes a morphism of $K(\Lambda)$-modules. Furthermore, if $A$ is a free $\Lambda$-semimodule with basis $\{v_i \mid i \in I\}$, it becomes immediate that $K(A)$ is a free $K(\Lambda)$-module with basis $\{[v_i, 0] \mid i \in I\}$.

3.2. Chain complexes of semimodules.

Now that we have introduced the necessary algebraic structures, we move on to chain complexes of semimodules over semirings, following [20]. This theory is a natural generalisation of the classical theory of chain complexes of modules and, in fact, they give rise to the same cycles, boundaries and homologies when the semimodules are modules over a ring.

In order to introduce chain complexes in the context of semimodules an immediate problem arises: alternating sums cannot be defined, as elements in a semimodule may not have additive inverses. The solution is to use two maps, a positive and a negative part, for the differentials.

Deﬁnition 3.10.

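Let $\Lambda$ be a semiring and consider a sequence of $\Lambda$-semimodules and homomorphisms indexed by $n \in \mathbb{Z}$,
\[ X : \cdots \rightrightarrows X_{n+1} \overset{\partial^+_{n+1}}{\underset{\partial^-_{n+1}}{\rightrightarrows}} X_n \overset{\partial^+_n}{\underset{\partial^-_n}{\rightrightarrows}} X_{n-1} \rightrightarrows \cdots . \]
We say that $X = \{X_n, \partial^+_n, \partial^-_n\}$ is a chain complex of $\Lambda$-semimodules if
\[ \partial^+_n \partial^+_{n+1} + \partial^-_n \partial^-_{n+1} = \partial^+_n \partial^-_{n+1} + \partial^-_n \partial^+_{n+1}. \]
As in the classical case, chain complexes of $\Lambda$-semimodules give rise to a $\Lambda$-semimodule of homology.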

Deﬁnition 3.11.

Let $X = \{X_n, \partial^+_n, \partial^-_n\}$ be a chain complex of $\Lambda$-semimodules. The $\Lambda$-semimodule of cycles of $X$ is
\[ Z_n(X, \Lambda) = \{ x \in X_n \mid \partial^+_n(x) = \partial^-_n(x) \}. \]
The $n$th homology of $X$ is then the quotient $\Lambda$-semimodule
\[ H_n(X, \Lambda) = Z_n(X, \Lambda) / \rho_n(X, \Lambda), \]
where $\rho_n(X, \Lambda)$ is the following $\Lambda$-congruence relation on $Z_n(X, \Lambda)$:
\[ x \sim_{\rho_n(X, \Lambda)} y \iff \exists\, u, v \in X_{n+1} \text{ s.t. } x + \partial^+_{n+1}(u) + \partial^-_{n+1}(v) = y + \partial^+_{n+1}(v) + \partial^-_{n+1}(u). \]
We will omit from now on the coefficient semiring $\Lambda$ from the notation when it is clear from the context.

Remark 3.12.

The definition of cycle is a direct generalisation of the classical definition. For the boundary relation, note that we may need two different chains $u$ and $v$ in order to establish two cycles as homologous. Intuitively, these two chains are the 'positive' and 'negative' part of $w$, where $x = y + \partial(w)$ in the classical setting.

Remark 3.13.

If $X = \{X_n, \partial^+_n, \partial^-_n\}$ is a chain complex of $\Lambda$-semimodules, then
\[ \cdots \longrightarrow K(X_{n+1}) \xrightarrow{\;K(\partial^+_{n+1}) - K(\partial^-_{n+1})\;} K(X_n) \xrightarrow{\;K(\partial^+_n) - K(\partial^-_n)\;} K(X_{n-1}) \longrightarrow \cdots \]
is a chain complex of $K(\Lambda)$-modules. If furthermore the $\Lambda$-semimodules $X_n$ are cancellative for all $n$, the converse is also true.

If $\Lambda$ is a ring, then $K(\Lambda) = \Lambda$ and the functor $K$ acts as the identity on $\Lambda$-modules. Therefore, $X = \{X_n, \partial^+_n, \partial^-_n\}$ is a chain complex of $\Lambda$-semimodules if and only if
\[ \cdots \longrightarrow X_{n+1} \xrightarrow{\;\partial^+_{n+1} - \partial^-_{n+1}\;} X_n \xrightarrow{\;\partial^+_n - \partial^-_n\;} X_{n-1} \longrightarrow \cdots \]
is a chain complex of $\Lambda$-modules. In this case, it is clear that the homology semimodules introduced in Definition 3.11 are precisely the usual homology modules of $X$.

We now turn our attention to maps between complexes.

Definition 3.14.

Let $X = \{X_n, \partial^+_n, \partial^-_n\}$ and $X' = \{X'_n, \partial^+_n, \partial^-_n\}$ be two chain complexes of $\Lambda$-semimodules. A sequence $f = \{f_n\}$ of $\Lambda$-homomorphisms $f_n : X_n \to X'_n$ is said to be a morphism from $X$ to $X'$ if
\[ \partial^+_n f_n + f_{n-1} \partial^-_n = \partial^-_n f_n + f_{n-1} \partial^+_n. \]
It is clear that for such a map $f_n\big(Z_n(X)\big)$ is a $\Lambda$-subsemimodule of $Z_n(X')$. Furthermore, if $X_n$ and $X'_n$ are cancellative $\Lambda$-semimodules, $f$ is also compatible with the congruence relations $\rho_n(X)$ and $\rho_n(X')$, so it induces a map
\[ H_n(f) : H_n(X) \to H_n(X'). \]
It is then easy to check that homology $H_*$ is a functor from the category of chain complexes of cancellative $\Lambda$-semimodules and morphisms to the category of graded $\Lambda$-semimodules.

Remark 3.15.

Let $\Lambda$ be a semiring and let $\{X_n, \partial^+_n, \partial^-_n\}$ be a chain complex of $\Lambda$-semimodules. The family of canonical maps $k_{X_n} : X_n \to K(X_n)$ gives rise to a morphism from $\{X_n, \partial^+_n, \partial^-_n\}$ to $\{K(X_n), K(\partial^+_n), K(\partial^-_n)\}$ and, therefore, to a morphism of $\Lambda$-semimodules
\[ H_n(k_X) : H_n(X) \to H_n\big(K(X)\big), \]
which takes the class of $x$ to the class of $[x, 0]$. If the semimodules $X_n$ are cancellative, $H_n(k_X)$ is injective, which in particular implies that $H_n(X)$ is a cancellative $\Lambda$-semimodule.

Also note that, by Remark 3.13, the homology of $\{K(X_n), K(\partial^+_n), K(\partial^-_n)\}$ and the usual homology of $\{K(X_n), K(\partial^+_n) - K(\partial^-_n)\}$ are isomorphic as $K(\Lambda)$-modules.

Finally, we discuss chain homotopies.

Definition 3.16.

Let $f = \{f_n\}$ and $g = \{g_n\}$ be morphisms from $X = \{X_n, \partial^+_n, \partial^-_n\}$ to $X' = \{X'_n, \partial^+_n, \partial^-_n\}$. We say that $f$ is homotopic to $g$ if there exist $\Lambda$-homomorphisms $s^+_n, s^-_n : X_n \to X'_{n+1}$ such that
\[ \partial^+_{n+1} s^-_n + \partial^-_{n+1} s^+_n + s^-_{n-1} \partial^+_n + s^+_{n-1} \partial^-_n + g_n = \partial^+_{n+1} s^+_n + \partial^-_{n+1} s^-_n + s^+_{n-1} \partial^+_n + s^-_{n-1} \partial^-_n + f_n \]
for all $n$. The family $\{s^+_n, s^-_n\}$ is called a chain homotopy from $f$ to $g$, and we write $(s^+, s^-) : f \simeq g$.

We then have the following result.

Proposition 3.17. [20, Proposition 3.3]

Let $f, g : X \to X'$ be morphisms between chain complexes of cancellative $\Lambda$-semimodules. If $f$ is homotopic to $g$, then $H_n(f) = H_n(g)$.

Homotopy equivalences are defined in the usual way and they induce isomorphisms on homology. We finish with a remark that the homotopy of maps behaves well with respect to semimodule completion.


Remark 3.18.

If a morphism $f : X \to X'$ is homotopic to $g : X \to X'$, then $K(f) : K(X) \to K(X')$ is homotopic to $K(g) : K(X) \to K(X')$. Furthermore, if both $X_n$ and $X'_n$ are cancellative $\Lambda$-semimodules for all $n$, and $X_n$ is a free $\Lambda$-semimodule for all $n$, the converse is also true.

4. Directed homology of directed simplicial complexes

In this section, we introduce a theory of homology over semirings which, by using semirings which are not rings, is able to detect directed cycles (Figure 1). We cannot do so by using (undirected) simplicial complexes, as they cannot encode directionality information of the simplices. One approach is to use the so-called ordered-set complexes, where simplices are sets with a total order on the vertices. They generalise simplicial complexes, which can be encoded as fully symmetric ordered-set complexes, that is, ordered-set complexes where if a set of vertices forms a simplex, it must do so with every possible order. However, persistent homology of ordered-set complexes is not stable (Remark 5.21).

To achieve stability, we use one further generalisation, called ordered tuple complexes or OT-complexes in [23], and directed simplicial complexes in this article (Definition 4.1). The only difference is that arbitrary repetitions of vertices are allowed in the ordered tuples representing simplices. Clearly, any ordered-set complex is a directed simplicial complex. Furthermore, an (undirected) simplicial complex can be encoded as a fully symmetric (as above) directed simplicial complex where every possible repetition of vertices is also included. When doing so, the (undirected) simplicial complex and its associated directed simplicial complex have isomorphic homologies over rings (Remark 4.7), and morphisms of simplicial complexes can be lifted to morphisms between their associated directed simplicial complexes (Remark 4.13). Finally, and crucially for us, using homology over semimodules for directed simplicial complexes, the corresponding directed persistent homology for dissimilarity functions is stable (Section 5). Note that, as ordered-set complexes are directed simplicial complexes, the results stated here apply to homology computations on ordered-set complexes as well.

Throughout this section, we assume that $\Lambda$ is a cancellative semiring. This allows us to simplify many of the proofs by making use of Grothendieck's completion of semirings and semimodules. Nonetheless, many of the results in this section hold for arbitrary semirings.

4.1. Chain complexes of semimodules of directed simplicial complexes.

In this section we introduce directed simplicial complexes and their associated chain complexes of semimodules and homology semimodules.

Definition 4.1. A directed simplicial complex or ordered tuple complex is a pair $(V, X)$ where $V$ is a finite set of vertices, and $X$ is a family of tuples $(x_0, x_1, \dots, x_n)$ of elements of $V$ such that if $(x_0, x_1, \dots, x_n) \in X$, then $(x_0, x_1, \dots, \widehat{x_i}, \dots, x_n) \in X$ for every $i = 0, 1, \dots, n$. Note that arbitrary repetitions of vertices in a tuple are allowed.

We will denote the directed simplicial complex $(V, X)$ just by $X$, and assume that every vertex belongs to at least one directed simplex (so that $V$ is uniquely determined from $X$). Elements of $X$ of length $n + 1$ are called $n$-simplices, and the subset of the $n$-simplices of $X$ is denoted by $X_n$.

We make use of the following terminology when dealing with directed simplicial complexes. An $n$-simplex obtained by removing vertices from an $m$-simplex, $n \le m$, is said to be a face of the $m$-simplex. Such a face is called proper if $n < m$. A directed simplicial complex is said to be $n$-dimensional, denoted $\dim(X) = n$, if $X_{n+1}$ is trivial (empty) but $X_n$ is not. A collection $Y$ of simplices of $X$ that is itself a directed simplicial complex is said to be a directed simplicial subcomplex (or, simply, subcomplex) of $X$, denoted $Y \subseteq X$. Note that the vertex set of $Y$ may be strictly smaller than that of $X$.

We now introduce chain complexes of semimodules associated to a directed simplicial complex.

Definition 4.2.

The $n$-dimensional chains of $X$ are defined as the elements of the free $\Lambda$-semimodule generated by (i.e. with basis) the $n$-simplices,
\[ C_n(X, \Lambda) = \Lambda\big( \{ [x_0, x_1, \dots, x_n] \mid (x_0, x_1, \dots, x_n) \in X \} \big). \]
We call the elements $[x_0, x_1, \dots, x_n] \in C_n(X, \Lambda)$, $(x_0, x_1, \dots, x_n) \in X$, elementary $n$-chains.

Remark 4.3.

Note that if $\Lambda$ is a cancellative semiring, the $\Lambda$-semimodule $C_n(X, \Lambda)$ is cancellative for every $n$, as it is a free semimodule over a cancellative semiring.

We now define the positive and negative differentials on $C_n(X, \Lambda)$.

Deﬁnition 4.4.

Let $X$ be a directed simplicial complex. For each $n > 0$, we define morphisms of $\Lambda$-semimodules $\partial^+_n, \partial^-_n : C_n(X, \Lambda) \to C_{n-1}(X, \Lambda)$ by
\[ \partial^+_n([x_0, x_1, \dots, x_n]) = \sum_{i=0}^{\lfloor n/2 \rfloor} [x_0, x_1, \dots, \widehat{x_{2i}}, \dots, x_n] \]
and
\[ \partial^-_n([x_0, x_1, \dots, x_n]) = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} [x_0, x_1, \dots, \widehat{x_{2i+1}}, \dots, x_n]. \]
For $n = 0$, let $\partial^+_0, \partial^-_0 : C_0(X, \Lambda) \to \{0\}$ be the trivial maps, by definition.

Proposition 4.5.

Let $X$ be a directed simplicial complex and $\Lambda$ be a cancellative semiring. Then $\{C_n(X, \Lambda), \partial^+_n, \partial^-_n\}$ is a chain complex of $\Lambda$-semimodules.

Proof. By Remark 4.3, the $\Lambda$-semimodule $C_n(X, \Lambda)$ is cancellative for every $n$. Thus, by Remark 3.13, it is enough to prove that $\{K(C_n(X, \Lambda)), K(\partial^+_n) - K(\partial^-_n)\}$ is a chain complex of $K(\Lambda)$-modules. Note that $K(C_n(X, \Lambda))$ is a free $K(\Lambda)$-module whose basis is given by the elements $\big[[x_0, x_1, \dots, x_n], 0\big]$ such that $[x_0, x_1, \dots, x_n]$ is an elementary $n$-chain. Thus, it suffices to show that the composition
\[ \big(K(\partial^+_{n-1}) - K(\partial^-_{n-1})\big) \circ \big(K(\partial^+_n) - K(\partial^-_n)\big) \]
is trivial (zero) on these elements. Since
\[ \big(K(\partial^+_n) - K(\partial^-_n)\big)\big[[x_0, x_1, \dots, x_n], 0\big] = \sum_{i=0}^{n} (-1)^i \big[[x_0, x_1, \dots, \widehat{x_i}, \dots, x_n], 0\big], \]
the proof is now a straightforward computation analogous to the standard one for chain complexes in simplicial homology. $\square$

Proposition 4.5 allows us to define the homology of a directed simplicial complex with coefficients in a semiring.
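The differentials of Definition 4.4 are easy to experiment with. The following Python sketch is an illustration only and is not part of the construction above: it assumes $\Lambda = \mathbb{N}$, models chains as `Counter` objects keyed by vertex tuples, and uses the hypothetical helper names `d_plus`, `d_minus`, `extend` and `satisfies_chain_condition`. It checks the identity of Definition 3.10 on one elementary chain at a time.

```python
from collections import Counter


def d_plus(simplex):
    """Positive differential of an elementary chain: drop x_0, x_2, x_4, ..."""
    if len(simplex) == 1:  # the differentials are trivial in dimension 0
        return Counter()
    return Counter(simplex[:i] + simplex[i + 1:] for i in range(0, len(simplex), 2))


def d_minus(simplex):
    """Negative differential of an elementary chain: drop x_1, x_3, x_5, ..."""
    if len(simplex) == 1:
        return Counter()
    return Counter(simplex[:i] + simplex[i + 1:] for i in range(1, len(simplex), 2))


def extend(op, chain):
    """Extend a map defined on simplices linearly to chains with coefficients in N."""
    out = Counter()
    for simplex, coeff in chain.items():
        for face, mult in op(simplex).items():
            out[face] += coeff * mult
    return out


def satisfies_chain_condition(simplex):
    """Check d+d+ + d-d- == d+d- + d-d+ on the elementary chain [simplex]."""
    pp = extend(d_plus, d_plus(simplex))
    mm = extend(d_minus, d_minus(simplex))
    pm = extend(d_plus, d_minus(simplex))
    mp = extend(d_minus, d_plus(simplex))
    return pp + mm == pm + mp
```

Tuples with repeated vertices, such as `("a", "b", "a")`, are handled in the same way, since a `Counter` records how many times each face occurs.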

Deﬁnition 4.6.

Let $X$ be a directed simplicial complex and $n \ge 0$. The $n$-dimensional homology of $X$, written $H_n(X, \Lambda)$, is the $n$th homology $\Lambda$-semimodule of the chain complex $\{C_n(X, \Lambda), \partial^+_n, \partial^-_n\}$.

Remark 4.7.

If $\Lambda$ is a ring, by defining $\partial = \partial^+ - \partial^-$, $\{C_n(X, \Lambda), \partial\}$ is a chain complex in the usual sense. Furthermore, if $X$ is an (undirected) simplicial complex, we can define a directed simplicial complex $X^{OT}$ where $(x_0, x_1, \dots, x_n) \in X^{OT}$ whenever $\{x_0, x_1, \dots, x_n\}$, after removing any repetition of vertices, is a simplex in $X$. The chain complex $C_*(X^{OT}, \Lambda)$ receives the name of ordered chain complex of $X$ in [17], and its homology is isomorphic to the singular homology of $X$ over $\Lambda$.

The following result will become useful later on, so we record it here.

Proposition 4.8.

Let $\Lambda$ be a cancellative semiring and $X$ a directed simplicial complex. The chain complexes of $K(\Lambda)$-semimodules $\{K(C_n(X, \Lambda)), K(\partial^+_n), K(\partial^-_n)\}$ and $\{C_n(X, K(\Lambda)), \partial^+_n, \partial^-_n\}$ are isomorphic.

Proof. Take $n \ge 0$. Then $C_n(X, K(\Lambda))$ is the free $K(\Lambda)$-module over the elementary $n$-chains $[x_0, x_1, \dots, x_n]$. Similarly, $K(C_n(X, \Lambda))$ is a free $K(\Lambda)$-module with a basis given by the elements $\big[[x_0, x_1, \dots, x_n], 0\big]$ where $[x_0, x_1, \dots, x_n]$ is an elementary $n$-chain. Consider the $K(\Lambda)$-morphisms
\[ \alpha_n : C_n(X, K(\Lambda)) \to K(C_n(X, \Lambda)), \qquad [x_0, x_1, \dots, x_n] \mapsto \big[[x_0, x_1, \dots, x_n], 0\big], \]
\[ \beta_n : K(C_n(X, \Lambda)) \to C_n(X, K(\Lambda)), \qquad \big[[x_0, x_1, \dots, x_n], 0\big] \mapsto [x_0, x_1, \dots, x_n]. \]
Immediate computations show that the families of $K(\Lambda)$-morphisms $\{\alpha_n\}$ and $\{\beta_n\}$ are morphisms of chain complexes of $K(\Lambda)$-semimodules which are inverses of each other, and the claim follows. $\square$

4.2. Functoriality of directed homology.

In this section, we prove that homology is a functor from the category of directed simplicial complexes to the category of graded $\Lambda$-semimodules. We also show that two morphisms allowing for the construction of the prism operator must induce the same map on homology, a result we will need to prove our persistent homology stability result. We begin by introducing morphisms of directed simplicial complexes.

Definition 4.9. A morphism of directed simplicial complexes, written $f : (V, X) \to (W, Y)$ or just $f : X \to Y$, is a map of sets $f : V \to W$ such that if $(x_0, x_1, \dots, x_n) \in X$ is a simplex in $X$, then $\big(f(x_0), f(x_1), \dots, f(x_n)\big) \in Y$.

Remark 4.10.

This definition is stricter than the classical notion of morphism of simplicial complexes, where morphisms are allowed to take simplices to simplices of lower dimension, as directed simplicial complexes can intrinsically account for vertex repetitions. However, note that if $X$ and $Y$ are (undirected) simplicial complexes, a map $f : X \to Y$ is a morphism of simplicial complexes if and only if $f : X^{OT} \to Y^{OT}$ (see Remark 4.7) is a morphism of directed simplicial complexes.

Definition 4.11.

Let $f : X \to Y$ be a morphism of directed simplicial complexes. Then $f$ induces morphisms of $\Lambda$-semimodules $C(f) = \{C_n(f)\}$,
\[ C_n(f) : C_n(X, \Lambda) \to C_n(Y, \Lambda), \qquad [x_0, x_1, \dots, x_n] \mapsto [f(x_0), f(x_1), \dots, f(x_n)]. \]
We will often abbreviate $C_n(f) = f_n$.

Proposition 4.12. If $f : X \to Y$ is a morphism of directed simplicial complexes, the family of maps $\{C_n(f)\}$ is a morphism of chain complexes of $\Lambda$-semimodules. Therefore, it induces a map $H_n(f) : H_n(X, \Lambda) \to H_n(Y, \Lambda)$.

Proof. Let $[x_0, x_1, \dots, x_n] \in C_n(X, \Lambda)$ be an elementary $n$-chain. We need to prove that $f_{n-1} \partial^+_n = \partial^+_n f_n$ and that $f_{n-1} \partial^-_n = \partial^-_n f_n$. We have
\[ f_{n-1}\big(\partial^+_n([x_0, x_1, \dots, x_n])\big) = f_{n-1}\Big( \sum_{i=0}^{\lfloor n/2 \rfloor} [x_0, x_1, \dots, \widehat{x_{2i}}, \dots, x_n] \Big) = \sum_{i=0}^{\lfloor n/2 \rfloor} [f(x_0), f(x_1), \dots, \widehat{f(x_{2i})}, \dots, f(x_n)] \]
and
\[ \partial^+_n\big(f_n([x_0, x_1, \dots, x_n])\big) = \partial^+_n([f(x_0), f(x_1), \dots, f(x_n)]) = \sum_{i=0}^{\lfloor n/2 \rfloor} [f(x_0), f(x_1), \dots, \widehat{f(x_{2i})}, \dots, f(x_n)]. \]
This shows that $f_{n-1} \partial^+_n = \partial^+_n f_n$, and the proof that $f_{n-1} \partial^-_n = \partial^-_n f_n$ is analogous. $\square$

Remark 4.13.

Using Proposition 4.8, it is clear that the map $C_n(f) : C_n(X, K(\Lambda)) \to C_n(Y, K(\Lambda))$ is precisely $K(C_n(f)) : K(C_n(X, \Lambda)) \to K(C_n(Y, \Lambda))$. Also, note that if $X$ and $Y$ are (undirected) simplicial complexes and $\Lambda$ is a ring, the map induced on homology by a morphism $f : X \to Y$ is the same as the map induced on homology by the morphism $f : X^{OT} \to Y^{OT}$ (see Remark 4.7).

Corollary 4.14.

Homology is a functor from the category of directed simplicial complexes to the category of graded $\Lambda$-semimodules. In particular, isomorphic directed simplicial complexes have isomorphic homologies.

We finish this section by showing a sufficient condition for two morphisms to induce the same map on homology. We will need this result to prove that our definition of persistent homology is stable.

Lemma 4.15.

Let $\Lambda$ be a cancellative semiring. Let $X$ and $Y$ be two directed simplicial complexes and let $f, g : X \to Y$ be morphisms of directed simplicial complexes such that if $(x_0, x_1, \dots, x_n) \in X$, then $\big(f(x_0), f(x_1), \dots, f(x_i), g(x_i), \dots, g(x_n)\big) \in Y$ for every $i = 0, 1, \dots, n$. Then $H_n(f) = H_n(g)$ for every $n \ge 0$.

Proof. For $x = [x_0, x_1, \dots, x_n] \in C_n(X, \Lambda)$ an elementary $n$-chain, define
\[ s^+_n[x_0, x_1, \dots, x_n] = \sum_{i=0}^{\lfloor n/2 \rfloor} [f(x_0), \dots, f(x_{2i}), g(x_{2i}), \dots, g(x_n)] \]
and
\[ s^-_n[x_0, x_1, \dots, x_n] = \sum_{i=0}^{\lfloor (n-1)/2 \rfloor} [f(x_0), \dots, f(x_{2i+1}), g(x_{2i+1}), \dots, g(x_n)]. \]
Now recall that since $\Lambda$ is cancellative, the canonical map $k_n : C_n(X, \Lambda) \to K(C_n(X, \Lambda))$ is injective for all $n$. We will show that $(s^+, s^-) : C(f) \simeq C(g)$ by proving that both sides of the equality in Definition 3.16 have the same images through $k_n$. We will also make use of the isomorphism $K(C_n(X, \Lambda)) \cong C_n(X, K(\Lambda))$ established in Proposition 4.8, and of the fact that for any $\Lambda$-morphism $h : C_n(X, \Lambda) \to C_n(X, \Lambda)$, $k_n \circ h = K(h) \circ k_n$.

By abuse of notation, we write $k_n(x) = x = [x_0, x_1, \dots, x_n] \in C_n(X, K(\Lambda))$. Denote $\partial_n = K(\partial^+_n) - K(\partial^-_n)$ and $s_n = K(s^+_n) - K(s^-_n)$. Then
\[ \partial_n(x) = \sum_{i=0}^{n} (-1)^i [x_0, x_1, \dots, \widehat{x_i}, \dots, x_n], \qquad s_n(x) = \sum_{j=0}^{n} (-1)^j \big[f(x_0), f(x_1), \dots, f(x_j), g(x_j), \dots, g(x_n)\big]. \]
Note that, by appropriately grouping the terms in the equality in Definition 3.16, it suffices to show that $K(g_n) - K(f_n)$ and $\partial_{n+1} s_n + s_{n-1} \partial_n$ have the same image on $x$. On the one hand,
\[ s_{n-1} \partial_n(x) = \sum_{i=0}^{n} (-1)^i \Big[ \sum_{j=0}^{i-1} (-1)^j \big[f(x_0), \dots, f(x_j), g(x_j), \dots, \widehat{g(x_i)}, \dots, g(x_n)\big] + \sum_{j=i+1}^{n} (-1)^{j+1} \big[f(x_0), \dots, \widehat{f(x_i)}, \dots, f(x_j), g(x_j), \dots, g(x_n)\big] \Big]. \tag{4.1} \]
On the other hand,
\[ \partial_{n+1} s_n(x) = \sum_{j=0}^{n} (-1)^j \Big[ \sum_{i=0}^{j} (-1)^i \big[f(x_0), \dots, \widehat{f(x_i)}, \dots, f(x_j), g(x_j), \dots, g(x_n)\big] + \sum_{i=j}^{n} (-1)^{i+1} \big[f(x_0), \dots, f(x_j), g(x_j), \dots, \widehat{g(x_i)}, \dots, g(x_n)\big] \Big]. \]
By exchanging the roles of the indices in this last equation, we obtain that
\[ \partial_{n+1} s_n(x) = \sum_{i=0}^{n} \Big[ \sum_{j=0}^{i} (-1)^{i+1} (-1)^j \big[f(x_0), \dots, f(x_j), g(x_j), \dots, \widehat{g(x_i)}, \dots, g(x_n)\big] + \sum_{j=i}^{n} (-1)^i (-1)^j \big[f(x_0), \dots, \widehat{f(x_i)}, \dots, f(x_j), g(x_j), \dots, g(x_n)\big] \Big]. \tag{4.2} \]
Now, adding Equations (4.1) and (4.2), all terms with $j \ne i$ cancel and
\[ \partial_{n+1} s_n(x) + s_{n-1} \partial_n(x) = \sum_{i=0}^{n} \big[f(x_0), \dots, f(x_{i-1}), g(x_i), \dots, g(x_n)\big] - \sum_{i=0}^{n} \big[f(x_0), \dots, f(x_i), g(x_{i+1}), \dots, g(x_n)\big]. \]
It is now straightforward to check that this sum telescopes to
\[ [g(x_0), g(x_1), \dots, g(x_n)] - [f(x_0), f(x_1), \dots, f(x_n)] = K(g_n)(x) - K(f_n)(x), \]
and we are done. $\square$
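The chain homotopy built in the proof of Lemma 4.15 can be checked mechanically on small examples. The sketch below is an illustration under stated assumptions, not the authors' code: it takes $\Lambda = \mathbb{N}$, represents chains as `Counter` objects keyed by tuples, and uses the hypothetical helper names `faces`, `prism`, `extend` and `homotopy_identity_holds`. It verifies the identity of Definition 3.16 formally on one elementary chain, without checking that the resulting tuples are simplices of a given complex $Y$.

```python
from collections import Counter


def faces(simplex, start):
    """Sum of the faces obtained by dropping positions start, start + 2, ..."""
    if len(simplex) == 1:  # differentials are trivial in dimension 0
        return Counter()
    return Counter(simplex[:i] + simplex[i + 1:] for i in range(start, len(simplex), 2))


def prism(simplex, f, g, start):
    """Sum of [f(x_0),...,f(x_j), g(x_j),...,g(x_n)] for j = start, start + 2, ..."""
    return Counter(
        tuple(map(f, simplex[:j + 1])) + tuple(map(g, simplex[j:]))
        for j in range(start, len(simplex), 2)
    )


def extend(op, chain):
    """Extend a map defined on simplices linearly to chains over N."""
    out = Counter()
    for simplex, coeff in chain.items():
        for term, mult in op(simplex).items():
            out[term] += coeff * mult
    return out


def homotopy_identity_holds(simplex, f, g):
    """Check both sides of the homotopy identity on the elementary chain [simplex]."""
    def d_plus(s):
        return faces(s, 0)

    def d_minus(s):
        return faces(s, 1)

    def s_plus(s):
        return prism(s, f, g, 0)

    def s_minus(s):
        return prism(s, f, g, 1)

    f_x = Counter([tuple(map(f, simplex))])
    g_x = Counter([tuple(map(g, simplex))])
    lhs = (extend(d_plus, s_minus(simplex)) + extend(d_minus, s_plus(simplex))
           + extend(s_minus, d_plus(simplex)) + extend(s_plus, d_minus(simplex)) + g_x)
    rhs = (extend(d_plus, s_plus(simplex)) + extend(d_minus, s_minus(simplex))
           + extend(s_plus, d_plus(simplex)) + extend(s_minus, d_minus(simplex)) + f_x)
    return lhs == rhs
```

Working over $\mathbb{N}$ with multisets mirrors the statement of the lemma: no signs appear, and the two sides are compared term by term.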

4.3. Basic computations and examples.

In this section we explore some basic properties of this homology theory. We begin by studying the relation between homology and connectivity.

Deﬁnition 4.16.

Let $(V, X)$ be a directed simplicial complex and $v, w \in V$ be two vertices. A path from $v$ to $w$ in $X$ is a sequence of vertices $v = x_1, x_2, \dots, x_n = w$ such that either $(x_i, x_{i+1})$ or $(x_{i+1}, x_i)$ is a simplex, for all $1 \le i \le n - 1$. The (weakly) connected components of $X$ are the equivalence classes of the equivalence relation $v \sim w$ if there is a path in $X$ from $v$ to $w$. We call $X$ (weakly) connected if it has only one connected component, that is, every pair of vertices can be connected by a path.

Note that this notion of connectedness ignores the direction of the edges (1-simplices). The next result shows that we can compute the homology of each connected component independently. The proof follows by observing that the chain complex $C_n(X, \Lambda)$ is clearly the direct sum of the chain complexes associated to each of the (weakly) connected components of $X$.

Proposition 4.17.

The homology $\Lambda$-semimodules of a directed simplicial complex $X$ are isomorphic to the direct sum of the homology $\Lambda$-semimodules of its (weakly) connected components.

We next show that the 0th homology counts the number of (weakly) connected components of a directed simplicial complex $X$. We start with a lemma.

Lemma 4.18.

Let $\Lambda$ be a cancellative semiring and $X$ be a directed simplicial complex. For any vertex $v$ and any $\lambda, \mu \in \Lambda$, $\lambda[v]$ and $\mu[v]$ are 0-cycles which are homologous if and only if $\lambda = \mu$.

Proof. It is clear that both are cycles, as both their differentials $\partial^+_0$ and $\partial^-_0$ are trivial. Assume that $X$ has $n + 1$ vertices, namely $V = \{x_0 = v, x_1, \dots, x_n\}$. Then if $\lambda[x_0]$ and $\mu[x_0]$ are homologous, there exist 1-chains
\[ x = \sum_{i,j=0}^{n} \lambda_{ij} [x_i, x_j] \quad \text{and} \quad y = \sum_{i,j=0}^{n} \mu_{ij} [x_i, x_j] \]
such that $\lambda[x_0] + \partial^+(x) + \partial^-(y) = \mu[x_0] + \partial^-(x) + \partial^+(y)$, where we are assuming that $\lambda_{ij} = \mu_{ij} = 0$ whenever $[x_i, x_j]$ is not a 1-chain. By computing the differentials,
\[ \lambda[x_0] + \sum_{i,j=0}^{n} \lambda_{ij} [x_j] + \sum_{i,j=0}^{n} \mu_{ij} [x_i] = \mu[x_0] + \sum_{i,j=0}^{n} \lambda_{ij} [x_i] + \sum_{i,j=0}^{n} \mu_{ij} [x_j]. \]
Now, since $\{[x_0], [x_1], \dots, [x_n]\}$ is a basis of $C_0(X, \Lambda)$, the coefficients of each element in the basis must be equal on both sides of the equality. Namely, for each $j \ne 0$, $\sum_{i=0}^{n} (\lambda_{ij} + \mu_{ji}) = \sum_{i=0}^{n} (\lambda_{ji} + \mu_{ij})$, and $\lambda + \sum_{i=0}^{n} (\lambda_{i0} + \mu_{0i}) = \mu + \sum_{i=0}^{n} (\lambda_{0i} + \mu_{i0})$. Adding these equations, we obtain
\[ \lambda + \sum_{i,j=0}^{n} \lambda_{ij} + \sum_{i,j=0}^{n} \mu_{ij} = \mu + \sum_{i,j=0}^{n} \lambda_{ij} + \sum_{i,j=0}^{n} \mu_{ij}. \]
Since $\Lambda$ is cancellative, this implies that $\lambda = \mu$, and the result follows. $\square$

Proposition 4.19.

Let $X$ be a directed simplicial complex and let $\Lambda$ be a cancellative semiring. Then $H_0(X, \Lambda) = \Lambda^k$, where $k$ is the number of (weakly) connected components of $X$.

Proof. By Proposition 4.17 we only need to show that if $X$ is connected, then $H_0(X, \Lambda) \cong \Lambda$. Let $v$, $w$ be any two vertices and let us show that $[v]$ and $[w]$ are homologous 0-cycles. Indeed, since $X$ is connected, we can find vertices $v = v_0, v_1, \dots, v_m = w$ such that either $[v_i, v_{i+1}]$ or $[v_{i+1}, v_i]$ is a 1-chain, for all $i = 0, 1, \dots, m - 1$. If $[v_i, v_{i+1}]$ is a chain, then $[v_i] + \partial^+([v_i, v_{i+1}]) = [v_{i+1}] + \partial^-([v_i, v_{i+1}])$, so $[v_i]$ is homologous to $[v_{i+1}]$. On the other hand, if $[v_{i+1}, v_i]$ is a chain, then $[v_i] + \partial^-([v_{i+1}, v_i]) = [v_{i+1}] + \partial^+([v_{i+1}, v_i])$, so again $[v_i]$ is homologous to $[v_{i+1}]$. As this holds for every $i = 0, 1, \dots, m - 1$, $[v]$ is homologous to $[w]$.

Now, if $\sum_{i=1}^{n} \lambda_i [x_i]$ is any 0-cycle, and given that being homologous is a $\Lambda$-congruence relation, we deduce that $\sum_{i=1}^{n} \lambda_i [x_i]$ is homologous to $\sum_{i=1}^{n} \lambda_i [v]$. Finally, by Lemma 4.18, $\lambda[v]$ is not homologous to $\mu[v]$ whenever $\lambda \ne \mu$, thus $H_0(X, \Lambda) = \{\lambda[v] \mid \lambda \in \Lambda\} \cong \Lambda$. $\square$

Remark 4.20.

For some semirings, $\Lambda^2 \cong \Lambda$ (thus $\Lambda^n \cong \Lambda$ for any $n \ge 1$), in which case $H_0(X, \Lambda)$ does not determine the number of (weakly) connected components of $X$. If $\Lambda$ is commutative, $\Lambda^n \cong \Lambda^m$ if and only if $n = m$ (a consequence of Remark 3.7), so for such semirings the 0th homology counts the number of (weakly) connected components of $X$.

Remark 4.21.

The fact that 0th homology ignores edge directions seems counter-intuitive, as we set up this framework to detect directed features, namely directed cycles. Note that this is a consequence of the $\Lambda$-congruence relation defining boundaries (Definition 3.11) being an equivalence relation, hence symmetric. Namely, vertices $u$ and $v$ are necessarily homologous if either $[u, v]$ or $[v, u]$ (or both) are 1-chains, since
\[ u + \partial^+([u, v]) = v + \partial^-([u, v]) \quad \text{or} \quad u + \partial^-([v, u]) = v + \partial^+([v, u]). \]
However, this only occurs in dimension 0, as a consequence of 0-chains always being cycles. Thus, when working with 1-simplices, the symmetry in the homology relation does not affect the detection of asymmetry in the data, as such asymmetry is encoded in the cycles themselves. It is also worth mentioning that a persistent homology able to detect strongly connected components in directed graphs has been introduced in [23].

We now show that the homology of a point is trivial for positive indices.

Example 4.22.

Let $\Lambda$ be a non-trivial cancellative semiring and let $X$ be the directed simplicial complex with vertex set $V = \{x\}$ and simplices $X = \{(x)\}$. Then
\[ H_n(X, \Lambda) = \begin{cases} \Lambda, & \text{if } n = 0, \\ 0, & \text{if } n > 0. \end{cases} \]
Indeed, $X$ is connected, thus $H_0(X, \Lambda) = \Lambda$ by Proposition 4.19. For $n \ge 1$, there are no $n$-chains, hence $H_n(X, \Lambda) = 0$.

Directed simplicial complexes whose homology is isomorphic to that of the point are called acyclic.

Definition 4.23.

A directed simplicial complex $X$ is acyclic if
\[ H_n(X, \Lambda) = \begin{cases} \Lambda, & \text{if } n = 0, \\ 0, & \text{if } n > 0. \end{cases} \]
We now show that an $m$-dimensional simplex along with all of its faces (note that this is an ordered-set complex) is acyclic.

Proposition 4.24.

Let $X$ be the directed simplicial complex consisting of the simplex $(x_0, x_1, \dots, x_m)$ and all of its faces. Then $X$ is acyclic.

Proof. Since $X$ is connected, $H_0(X, \Lambda) = \Lambda$ by Proposition 4.19, so let $n \ge 1$. As a consequence of how $X$ is defined, if $x \in C_n(X, \Lambda)$ is a chain so that $x_0$ does not participate in any of its elementary $n$-chains (as there are no repetitions in the simplices of $X$), then $x_0$ can be added as the first element of every elementary $n$-chain in $x$, giving us an $(n + 1)$-chain which we denote $x_0 x \in C_{n+1}(X, \Lambda)$. Simple computations show that
\[ \partial^+(x_0 x) = x + x_0 \partial^-(x) \quad \text{and} \quad \partial^-(x_0 x) = x_0 \partial^+(x). \]
Now take $x \in Z_n(X, \Lambda)$ any cycle and decompose it as $x = x_0 y + z$, where $x_0$ does not participate in any of the chains in either $y$ or $z$. Using the formula above,
\[ \partial^+(x) = \partial^+(x_0 y + z) = y + x_0 \partial^-(y) + \partial^+(z), \qquad \partial^-(x) = \partial^-(x_0 y + z) = x_0 \partial^+(y) + \partial^-(z). \]
Since $x$ is a cycle, $\partial^+(x) = \partial^-(x)$. In particular, the chains in which $x_0$ does not participate must be equal, namely $y + \partial^+(z) = \partial^-(z)$.

Now consider the chain $x_0 z$. We have that
\[ \partial^+(x_0 z) = z + x_0 \partial^-(z) = x_0 y + z + x_0 \partial^+(z), \qquad \partial^-(x_0 z) = x_0 \partial^+(z). \]
As a consequence (recall that $x = x_0 y + z$), we deduce that $x + \partial^-(x_0 z) = \partial^+(x_0 z)$, thus $x$ is homologous to the trivial cycle. That is, any cycle $x \in Z_n(X, \Lambda)$ is homologous to zero, and the result follows. $\square$

Figure 3. Directed homology over a zerosumfree semiring $\Lambda$ detects directed 1-cycles modulo boundaries. The figure shows three directed simplicial complexes: $X$ and $Y$ on three vertices each, and $Z$ on four vertices. In these examples, $H_1(X, \Lambda) = 0$ while $H_1(Y, \Lambda) = H_1(Z, \Lambda) = \Lambda$, generated by the sum of the three 1-simplices of the directed triangle in $Y$ and, respectively, by the sum of the 1-simplices of either the three-edge or the four-edge directed cycle in $Z$. To be a 1-cycle over a zerosumfree semiring, the directions of the involved 1-simplices must form a circuit. This does not hold for a ring (note that a zerosumfree semiring is not a ring).

We now illustrate the ability of this homology theory to detect directed cycles. We do so throughsome simple examples.

Example 4.25.

Let $\Lambda \in \{\mathbb{N}, \mathbb{Q}_+, \mathbb{R}_+\}$, so that $\Lambda$ is a cancellative, commutative, zerosumfree semiring. Let $X$, $Y$ and $Z$ be the directed simplicial complexes represented in Figure 3. These three complexes are connected, so their 0th homology is $\Lambda$. Furthermore, neither $X$ nor $Y$ has $k$-simplices for $k \ge 2$, so $H_k(X, \Lambda)$ and $H_k(Y, \Lambda)$ are trivial (zero) for every $k \ge 2$. Although $Z$ has one 2-simplex, it is not a 2-cycle, thus $H_k(Z, \Lambda)$ is also trivial for every $k \ge 2$. To compute the first homology, we need to consider the 1-simplices and their differentials. We list them below.

X ∂ + ∂ − Y ∂ + ∂ − Z ∂ + ∂ − [ v , v ] v v [ w , w ] w w [ u , u ] u u [ v , v ] v v [ w , w ] w w [ u , u ] u u [ v , v ] v v [ w , w ] w w [ u , u ] u u [ u , u ] u u [ u , u ] u u Note that λ [ v , v ] + λ [ v , v ] + λ [ v , v ], λ , λ , λ ∈ Λ, is a cycle for X if and only if λ + λ = λ + λ = 0. Since Λ is cancellative and zerosumfree, this is equivalent to λ = λ = λ and, hence, H ( X, Λ) = { } .For Y , note that µ [ w , w ] + µ [ w , w ] + µ [ w , w ], µ , µ , µ ∈ Λ, is a cycle if and only if µ = µ = µ . Since there are no 2-simplices in Y , diﬀerent 1-simplices cannot be homologous. We deducethat H ( Y, Λ) = Λ. Note that this distinction is a consequence of Λ being a zerosumfree semiring: overa ring, X and Y have isomorphic homology.Finally, for Z , similar computations show that Z ( Z, Λ) is the free Λ-semimodule generated by x =[ u , u ] + [ u , u ] + [ u , u ] + [ u , u ] and x = [ u , u ] + [ u , u ] + [ u , u ]. However, in this case, we havea 2-simplex, y = [ u , u , u ], for which ∂ + ( y ) = [ u , u ] + [ u , u ] and ∂ − ( y ) = [ u , u ]. Consequently, x + ∂ − ( y ) = x + ∂ + ( y ), which implies that x and x are homologous. Therefore, H ( Z, Λ) = Λ,generated by either x or x . In particular, the homology of Y and Z are isomorphic and we can seehow 2-simplices can make directed cycles equivalent, as expected.More generally, if Λ is a cancellative zerosumfree semiring, a polygon will only give raise to a non-trivial homology class in dimension 1 if, and only if, the cycle can be traversed following the directionof the edges. Namely, only directed cycles are detected in homology. Proposition 4.26.

Let $\Lambda$ be a cancellative, zerosumfree semiring. Let $X$ be a 1-dimensional directed simplicial complex with vertex set $\{v_0, v_1, \dots, v_n\}$, $n \ge 1$, and with 1-simplices $e_0, e_1, \dots, e_n$ where either $e_i = (v_i, v_{i+1})$ or $e_i = (v_{i+1}, v_i)$, $i = 0, 1, \dots, n$, and where $v_{n+1} = v_0$. Then $H_1(X, \Lambda) = \Lambda$ if either $e_i = (v_i, v_{i+1})$ for all $i$ or $e_i = (v_{i+1}, v_i)$ for all $i$, and $H_1(X, \Lambda) = \{0\}$ in any other case.

Figure 4. Directed simplicial complexes with two vertices can have non-trivial 1-cycles.

Proof.

Let $[e_i]$ denote the elementary 1-chain associated to $e_i$. Let $x = \sum_{i=0}^{n} \lambda_i [e_i]$ be any non-trivial 1-cycle. We can assume without loss of generality that $\lambda_0 \ne 0$. If $e_0 = (v_0, v_1)$, then $\lambda_0 [v_1]$ is a non-trivial summand in $\partial^+(x)$. Since $\Lambda$ is zerosumfree, such a summand does not have an inverse, thus $\lambda_0 [v_1]$ must be a summand in $\partial^-(x)$. Now note that $e_1$ is the only other simplex involving the vertex $v_1$. Furthermore, $[v_1]$ appears in $\partial^-[e_1]$ if and only if $e_1 = (v_1, v_2)$, in which case the corresponding summand of $\partial^-(x)$ is $\lambda_1 [v_1]$. We further deduce that $\lambda_0 = \lambda_1$.

By proceeding iteratively, we deduce that if $x$ is non-trivial, necessarily $e_i = (v_i, v_{i+1})$ for all $i$ and $\lambda_0 = \lambda_1 = \cdots = \lambda_n = \lambda$, in which case $x = \sum_{i=0}^{n} \lambda [v_i, v_{i+1}]$. Simple computations show that such $x$ is indeed a cycle, and since there are no 2-simplices, it cannot be homologous to any other cycle. Thus $H_1(X, \Lambda) = \Lambda$.

Symmetrically, if we assume that $e_0 = (v_1, v_0)$, we would deduce that $x$ can only be non-trivial if $e_i = (v_{i+1}, v_i)$ for all $i$ and $x = \sum_{i=0}^{n} \lambda [v_{i+1}, v_i]$, which simple computations exhibit as a cycle. Thus, in this case we also have that $H_1(X, \Lambda) = \Lambda$. The result follows. $\square$
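Proposition 4.26 can be illustrated by brute force on small polygons. The following Python sketch is only an illustration under stated assumptions: it takes $\Lambda = \mathbb{N}$, restricts coefficients to a finite range given by the hypothetical parameter `bound`, and uses the hypothetical helper names `boundary_parts` and `nonzero_cycles`. For a 1-chain, $\partial^+([a, b]) = [b]$ and $\partial^-([a, b]) = [a]$, so a chain is a cycle exactly when every vertex receives the same total coefficient as a head and as a tail.

```python
from collections import Counter
from itertools import product


def boundary_parts(chain):
    """Return (d_plus, d_minus) of a 1-chain given as {(tail, head): coefficient}."""
    plus, minus = Counter(), Counter()
    for (tail, head), coeff in chain.items():
        if coeff:  # skip zero coefficients so the Counters hold no zero entries
            plus[head] += coeff   # d_plus([a, b]) = [b]
            minus[tail] += coeff  # d_minus([a, b]) = [a]
    return plus, minus


def nonzero_cycles(edges, bound=3):
    """List coefficient vectors in {0..bound}^len(edges) giving non-trivial 1-cycles.

    The edges are assumed to be pairwise distinct tuples.
    """
    cycles = []
    for coeffs in product(range(bound + 1), repeat=len(edges)):
        if not any(coeffs):
            continue
        plus, minus = boundary_parts(dict(zip(edges, coeffs)))
        if plus == minus:
            cycles.append(coeffs)
    return cycles
```

For instance, `nonzero_cycles([(0, 1), (1, 2), (2, 0)])` only finds multiples of the sum of the three edges, whereas reversing a single edge, as in `[(0, 1), (1, 2), (0, 2)]`, leaves no non-trivial cycle. Since the complexes in the proposition have no 2-simplices, the cycles found this way are exactly the homology classes within the searched range.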

Note that Proposition 4.26 applies to polygons with only two vertices $v_0$ and $v_1$ (see Figure 4), which are allowed in a directed simplicial complex. Indeed, an immediate computation shows that $[v_0, v_1] + [v_1, v_0]$ is a $1$-cycle.

We end this section by remarking that, although this homology theory successfully detects directed cycles when computed over cancellative, zerosumfree semirings, such semirings do not seem appropriate for detecting directed structures in dimension two or, more generally, in even dimensions other than 0. Indeed, we prove the following result.

Proposition 4.27.

Let $X$ be a directed simplicial complex. If $\Lambda$ is a cancellative, zerosumfree semiring, then $H_{2n}(X, \Lambda) = \{0\}$ for every $n \ge 1$.

Proof. We will show that no non-trivial cycles may exist. In order to do so, consider, for each $m \ge 0$, the morphism of $\Lambda$-semimodules
$$\varphi_m : C_m(X, \Lambda) \longrightarrow \Lambda, \qquad [x_0, x_1, \dots, x_m] \longmapsto 1.$$
Thus, if $x = \sum_{i=0}^{k} \lambda_i x_i$ where each $x_i$ is an elementary $m$-chain, then $\varphi_m(x) = \sum_{i=0}^{k} \lambda_i$.

Now assume that $x_i$ is an elementary $2n$-chain, for some $n \ge 1$. In this case, $\partial_{2n}^+(\lambda_i x_i) = \lambda_i \partial_{2n}^+(x_i)$, where $\partial_{2n}^+(x_i)$ is the sum of the $n+1$ elementary $(2n-1)$-chains obtained by deleting the vertices in the even positions $0, 2, \dots, 2n$. Thus, $\varphi_{2n-1}\big(\partial_{2n}^+(\lambda_i x_i)\big) = (n+1)\lambda_i$. On the other hand, $\partial_{2n}^-(\lambda_i x_i) = \lambda_i \partial_{2n}^-(x_i)$, where $\partial_{2n}^-(x_i)$ is the sum of the $n$ elementary $(2n-1)$-chains obtained by deleting the vertices in the odd positions $1, 3, \dots, 2n-1$. Therefore, $\varphi_{2n-1}\big(\partial_{2n}^-(\lambda_i x_i)\big) = n\lambda_i$.

We now use $\varphi_{2n-1}$ to show that $Z_{2n}(X, \Lambda)$ must be trivial. Let $x = \sum_{i=0}^{k} \lambda_i x_i \in Z_{2n}(X, \Lambda)$ be a cycle. Then $\partial_{2n}^+(x) = \partial_{2n}^-(x)$, thus $\varphi_{2n-1}\big(\partial_{2n}^+(x)\big) = \varphi_{2n-1}\big(\partial_{2n}^-(x)\big)$. However,
$$\varphi_{2n-1}\big(\partial_{2n}^+(x)\big) = \sum_{i=0}^{k} \varphi_{2n-1}\big(\partial_{2n}^+(\lambda_i x_i)\big) = \sum_{i=0}^{k} (n+1)\lambda_i,$$
$$\varphi_{2n-1}\big(\partial_{2n}^-(x)\big) = \sum_{i=0}^{k} \varphi_{2n-1}\big(\partial_{2n}^-(\lambda_i x_i)\big) = \sum_{i=0}^{k} n\lambda_i.$$
Now, since $\Lambda$ is cancellative, $\sum_{i=0}^{k} (n+1)\lambda_i = \sum_{i=0}^{k} n\lambda_i$ implies that $\sum_{i=0}^{k} \lambda_i = 0$, and given that $\Lambda$ is zerosumfree, this implies that $\lambda_0 = \lambda_1 = \dots = \lambda_k = 0$. Consequently, any $2n$-cycle is necessarily trivial, thus $H_{2n}(X, \Lambda) = \{0\}$. □
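The counting argument in the proof can be illustrated numerically. The sketch below assumes the usual alternating-sign convention, in which the positive boundary collects the faces obtained by deleting a vertex in an even position and the negative boundary those obtained by deleting a vertex in an odd position; the helper name `signed_faces` is hypothetical.

```python
def signed_faces(simplex):
    """Split the faces of an ordered simplex by the parity of the deleted position.

    `simplex` is a tuple of vertices (repetitions allowed). Returns the list of
    faces entering the positive boundary and the list entering the negative one.
    """
    positive = [simplex[:i] + simplex[i + 1:] for i in range(0, len(simplex), 2)]
    negative = [simplex[:i] + simplex[i + 1:] for i in range(1, len(simplex), 2)]
    return positive, negative


# In even dimension the two boundaries have different numbers of terms.
pos, neg = signed_faces((0, 1, 2))          # a 2-simplex
print(len(pos), len(neg))

# In odd dimension a repeated vertex yields equal boundaries, hence a cycle.
pos, neg = signed_faces(("v", "v", "v", "v"))  # a 3-simplex on a single vertex
print(sorted(pos) == sorted(neg))
```

Under this convention the mismatch in the number of terms occurs exactly in even dimensions, which is the only property of the boundaries that the argument relies on.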

Remark 4.28.

The proof is based on the fact that the positive boundary of an elementary $n$-chain has a different number of elementary $(n-1)$-chains than its negative boundary when $n$ is even. This may also be related to the fact that there is no obvious way to define directed $n$-cycles for $n > 1$, and this may not be possible in even dimensions. For the purposes of this article, it suffices to consider $1$-cycles and $H_1(X, \Lambda)$. However, note that non-trivial (homological) cycles do exist in all odd dimensions (for example, the elementary $(2n-1)$-chain obtained by repeating a vertex $2n$ times is always a cycle).

5. Persistent directed homology

In this section, we introduce a theory of persistent homology for directed simplicial complexes which comes in two flavours: one takes into account the directionality of the complex, whereas the other is analogous to persistent homology in the classical setting. We thus have, associated to the same filtration of directed simplicial complexes, two persistence modules which produce two different barcodes. Both persistent homology theories are stable (see Theorem 5.20) and, furthermore, they are closely related. Indeed, directed cycles are undirected cycles as well, thus every bar in a directed persistence barcode can be uniquely matched with a bar in the corresponding undirected one, although the undirected bar may be born sooner; see Proposition 5.8.

5.1. Persistence modules associated to a directed simplicial complex.

Let us begin by introducing filtrations of directed simplicial complexes.

Definition 5.1.

Let $X$ be a directed simplicial complex. A filtration of $X$ is a family of subcomplexes $(X_\delta)_{\delta \in T}$, $T \subseteq \mathbb{R}$, such that if $\delta \le \delta' \in T$, then $X_\delta$ is a subcomplex of $X_{\delta'}$, and such that $X = \bigcup_{\delta \in T} X_\delta$.

Note that for $\delta \le \delta'$, the inclusion $i_\delta^{\delta'} : X_\delta \to X_{\delta'}$ is a morphism of directed simplicial complexes.

In order to introduce the persistence modules associated to such a filtration we need to steer away from semimodules. Indeed, although homology with coefficients in semimodules proves useful for detecting features that can only be traversed in an appropriate direction, the lack of algebraic structure makes the definition of barcodes cumbersome. Namely, the structure theorem for finitely generated modules is required in order to define persistence barcodes, but this result does not have an analogue in the framework of semimodules.

This can be overcome by using the semimodule completion of the homology semimodules of directed simplicial complexes. Indeed, by using the semimodule completion we are able to detect the submodule of the classical homology over rings corresponding to directed classes. Furthermore, we can exploit the properties of the canonical homomorphisms associated to the semimodule completions to keep track of how these semimodules of interest behave through maps induced by morphisms of directed simplicial complexes.

We begin by introducing undirected persistence modules.

Definition 5.2.

Let $(X_\delta)_{\delta \in T}$ be a filtration of a directed simplicial complex $X$ and let $\Lambda$ be a cancellative semiring. The $n$-dimensional undirected persistence $K(\Lambda)$-module of $X$ is the persistence $K(\Lambda)$-module
$$\big( \{ H_n(X_\delta, K(\Lambda)) \}, \{ H_n(i_\delta^{\delta'}) \} \big)_{\delta \le \delta' \in T}.$$
The functoriality of $H_n$ makes this a persistence module.

Now, in order to retain the information on directionality, we take the submodule of directed classes. Recall from Remark 3.15 that the family of canonical maps $k_{X_n} : X_n \to K(X_n)$ gives rise to a morphism of $\Lambda$-semimodules $H_n(k_X) : H_n(X) \to H_n\big(K(X)\big)$, which takes the class of $x$ to the class of $[x, 0]$. Moreover, the chain complexes $\big\{ C_n\big(X, K(\Lambda)\big), \partial_n^+, \partial_n^- \big\}$ and $\big\{ K\big(C_n(X, \Lambda)\big), K(\partial_n^+), K(\partial_n^-) \big\}$ are isomorphic. In particular, we may regard $H_n\big(X, K(\Lambda)\big)$ as the homology of the chain complex of $K(\Lambda)$-semimodules $\big\{ K\big(C_n(X, \Lambda)\big), K(\partial_n^+), K(\partial_n^-) \big\}$.

Definition 5.3.

Let $X$ be a directed simplicial complex and $\Lambda$ be a cancellative semiring. The $n$-dimensional directed homology of $X$ over $K(\Lambda)$ is the $K(\Lambda)$-submodule of $H_n\big(X, K(\Lambda)\big)$ generated by $\operatorname{Im} H_n(k_X)$. We denote it by $H_n^{\mathrm{Dir}}\big(X, K(\Lambda)\big)$.

Equivalently, $H_n^{\mathrm{Dir}}\big(X, K(\Lambda)\big)$ is the submodule of $H_n\big(X, K(\Lambda)\big)$ generated by the classes of elements $[x, 0]$ where $x \in Z_n(X, \Lambda)$. We then have the following.

Proposition 5.4.

Let $f : X \to Y$ be a morphism of directed simplicial complexes and $\Lambda$ be a cancellative semiring. For every $n \ge 0$, the morphism of $K(\Lambda)$-modules $H_n(f) : H_n\big(X, K(\Lambda)\big) \to H_n\big(Y, K(\Lambda)\big)$ restricts to a morphism $H_n^{\mathrm{Dir}}(f) : H_n^{\mathrm{Dir}}\big(X, K(\Lambda)\big) \to H_n^{\mathrm{Dir}}\big(Y, K(\Lambda)\big)$.

Proof.

Take a representative of a class $\sum_{i=1}^{s} [\lambda_i, \mu_i] \cdot [x_i, 0] \in H_n^{\mathrm{Dir}}\big(X, K(\Lambda)\big)$, where $x_i \in Z_n(X, \Lambda)$ and $\lambda_i, \mu_i \in \Lambda$ for $i = 1, 2, \dots, s$. By Remark 4.13, and since $K\big(C_n(f)\big)$ is a morphism of $K(\Lambda)$-modules, the image of this class through $H_n(f)$ is the homology class of
$$\sum_{i=1}^{s} [\lambda_i, \mu_i] \cdot K\big(C_n(f)\big)[x_i, 0] = \sum_{i=1}^{s} [\lambda_i, \mu_i] \cdot [C_n(f)(x_i), 0].$$
Finally, given that $x_i \in Z_n(X, \Lambda)$ and $\{C_n(f)\}$ is a morphism of chain complexes of $\Lambda$-semimodules, $C_n(f)(x_i) \in Z_n(Y, \Lambda)$ and the result follows. □

Consequently, we can define the following.

Definition 5.5.

Let $(X_\delta)_{\delta \in T}$ be a filtration of a directed simplicial complex $X$ and let $\Lambda$ be a cancellative semiring. The $n$-dimensional directed persistence $K(\Lambda)$-module of $X$ is the persistence $K(\Lambda)$-module
$$\big( \{ H_n^{\mathrm{Dir}}(X_\delta, K(\Lambda)) \}, \{ H_n^{\mathrm{Dir}}(i_\delta^{\delta'}) \} \big)_{\delta \le \delta' \in T}.$$

We can now introduce persistence diagrams and barcodes associated to filtrations of directed simplicial complexes. As these were only introduced for fields in Section 2.1, from this point on we assume that $\Lambda$ is a semiring such that $K(\Lambda)$ is a field.

Definition 5.6.

Let $(X_\delta)_{\delta \in T}$ be a filtration of a directed simplicial complex $X$ where $T \subseteq \mathbb{R}$ is finite. Let $\Lambda$ be a cancellative semiring such that $K(\Lambda)$ is a field. The persistence diagrams associated to the $n$-dimensional undirected and directed persistence $K(\Lambda)$-modules of $X$ are respectively denoted $\mathrm{Dgm}_n(X, \Lambda)$ and $\mathrm{Dgm}_n^{\mathrm{Dir}}(X, \Lambda)$. Similarly, the respective barcodes are denoted $\mathrm{Pers}_n(X, \Lambda)$ and $\mathrm{Pers}_n^{\mathrm{Dir}}(X, \Lambda)$.

Remark 5.7.

Note that $H_0^{\mathrm{Dir}}\big(X_\delta, K(\Lambda)\big) \cong H_0\big(X_\delta, K(\Lambda)\big)$ for every $\delta \in T$. Indeed, every $0$-simplex is in $Z_0(X_\delta, \Lambda)$, thus its class is in $H_0^{\mathrm{Dir}}\big(X_\delta, K(\Lambda)\big)$. As a consequence, the $0$-dimensional directed and undirected persistence $K(\Lambda)$-modules of a directed simplicial complex $X$ are isomorphic. In particular, they have the same persistence barcodes and diagrams, and they measure the connectivity of the simplicial complex at each stage of the filtration.

The next result establishes the relation between the undirected and directed persistence barcodes and diagrams of a filtration, and it is thus key to understanding the directed persistence barcodes.

Proposition 5.8.

Let $(X_\delta)_{\delta \in T}$ be a filtration of directed simplicial complexes. Then, there is a one-to-one matching of the bars in $\mathrm{Pers}_n^{\mathrm{Dir}}(X, \Lambda)$ with bars in $\mathrm{Pers}_n(X, \Lambda)$. Furthermore, matching bars must die at the same time.

Proof. Clearly, the injections
$$\big\{ H_n^{\mathrm{Dir}}(X_\delta, K(\Lambda)) \hookrightarrow H_n(X_\delta, K(\Lambda)) \big\}_{\delta \in T}$$
give rise to a monomorphism of persistence modules. The result is then an immediate consequence of [1, Proposition 6.1]. □

Note that a bar in the directed persistence barcode of a filtration may be born after the one it is matched with in the undirected one, and some bars in the undirected barcode may remain unmatched. Equivalently, $\mathrm{Pers}_n^{\mathrm{Dir}}(X, \Lambda)$ can be obtained from $\mathrm{Pers}_n(X, \Lambda)$ by selecting the appropriate bars and possibly 'delaying' their births (see the examples below and Figures 5 and 6).

We now introduce some examples to illustrate Proposition 5.8 and the general behaviour of the directed persistence barcodes.

Example 5.9.

Let us illustrate how undirected and directed persistence modules and their barcodes can be different. First, consider the directed simplicial complexes $X$ and $Y$ in Figure 3 (see Example 4.25). No matter the filtration chosen for $X$, the lack of directed cycles means that the $1$-dimensional directed persistence module is trivial. However, at the end of the filtration, there is a cycle in homology, thus there is a bar in the undirected persistence barcode. In the case of $Y$, the only $1$-cycle is directed, so the undirected and directed persistence modules associated to any filtration of $Y$ are isomorphic, and different from that of $X$.

Figure 5. A filtration of the directed simplicial complex $Z$ in Example 4.25 (Figure 3) and its associated undirected (bottom, left) and directed (bottom, right) $1$-dimensional persistence barcodes. The shorter undirected bar is also directed, while the longer undirected bar becomes directed at $\delta = 2$.

Example 5.10.

Consider now the directed simplicial complex Z in Figure 3 (see Example 4.25). Let T = { , , , . . . } and consider the ﬁltration of Z given by ( Z δ ) δ ∈ T , as illustrated in Figure 5, where Z δ = Z for every δ ≥

3. The undirected and directed 1-dimensional persistence modules of this ﬁltrationare not isomorphic. Indeed, in Z there is clearly an undirected cycle, whereas Z ( Z , Λ) is trivial,thus H Dir1 (cid:0) Z , K (Λ) (cid:1) = { } . However, H (cid:0) Z , K (Λ) (cid:1) ∼ = H Dir1 (cid:0) Z , K (Λ) (cid:1) , as both vector spaces aregenerated by the classes of [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] and [ v , v ] + [ v , v ] + [ v , v ]. These twoclasses become equivalent in Z . The undirected and directed 1-dimensional persistence barcodes of thisﬁltration, shown in Figure 5, illustrate how undirected classes may become directed.The last example also shows an important diﬀerence between classical and directed persistence modulesand barcodes. Namely, in the undirected setting, the addition of one simplex to the ﬁltration can eithercause the birth of a class in the dimension of the added simplex, or can kill a class in the precedingdimension. (This simple idea is in fact the basis of the Standard Algorithm for computing persistenthomology, see Section 6). However, the addition of only one simplex to a directed simplicial complexcan cause the birth of several classes in directed homology, as shown in the previous example at δ = 2,and also in the following example. Example 5.11.

Let T = { , , , . . . } and consider the ﬁltration of simplicial complexes ( X δ ) δ ∈ T illus-trated in Figure 2 in the Introduction, where X δ = X for every δ ≥ Z ( X j , Λ) is trivial for j = 0 , ,

2, whereas undirected cycles appear as early as X .However, by adding the edge from v to v , several directed cycles are born at once. Namely, Z ( X , Λ)is generated by the cycles [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] , [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] , [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] , [ v , v ] + [ v , v ] + [ v , v ] + [ v , v ] , [ v , v ] + [ v , v ] + [ v , v ] . Straightforward computations show that H (cid:0) X , K (Λ) (cid:1) = K (Λ) , the direct sum of four copies of K (Λ).And, when taking the semimodule completion of Z ( X , Λ), we observe that four of those ﬁve cyclesare linearly independent. Therefore, H Dir1 (cid:0) X , K (Λ) (cid:1) = K (Λ) , thus at δ = 3 every class (including thebirthing one) becomes directed.Our last example shows an undirected homology class that never becomes directed. Example 5.12.

Let T = { , , , . . . } and consider the ﬁltration of simplicial complexes ( X δ ) δ ∈ T illus-trated in Figure 6, where X δ = X for δ ≥

2. Clearly, Z ( X j , Λ) = 0 for j = 0 , , whereas Z ( X , Λ)is the free Λ-semimodule generated by the cycle [ v , v ] + [ v , v ] + [ v , v ]. The undirected class repre-sented by this element is thus directed, but there is a linearly independent class in undirected homology,[ v , v ] + [ v , v ] − [ v , v ], which never becomes directed. Its bar in the barcode is thus unmatched. v v v X v v v v X v v v v X Figure 6.

A ﬁltration of directed simplicial complexes and its associated undirected(bottom, left) and directed (bottom, right) 1-dimensional persistence barcodes. Theshorter undirected barcode is also directed, while the longer undirected barcode neverbecomes directed.5.2.

Directed persistent homology of dissimilarity functions.

In this section, we introduce the persistence diagrams and barcodes associated to dissimilarity functions. In order to define them using the standard persistence setting (Section 2.1), we assume that $\Lambda$ is a cancellative semiring for which $K(\Lambda)$ is a field. Let us begin by introducing the directed Rips filtration of directed simplicial complexes associated to a dissimilarity function [23, Definition 16].

Definition 5.13.

Let $(V, d_V)$ be a dissimilarity function. The directed Rips filtration of $(V, d_V)$ is the filtration of directed simplicial complexes $\big( \mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta \big)_{\delta \in \mathbb{R}}$, where $(v_0, v_1, \dots, v_n) \in \mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta$ if and only if $d_V(v_i, v_j) \le \delta$ for all $0 \le i \le j \le n$. It is clearly a filtration with the inclusion maps $i_\delta^{\delta'} : \mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta \to \mathcal{R}^{\mathrm{Dir}}(V, d_V)_{\delta'}$ for all $\delta \le \delta'$.

Let us now introduce the persistent homology modules associated to such a filtration.

Definition 5.14.

Let $(V, d_V)$ be a dissimilarity function and consider its associated directed Rips filtration $\big( \mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta \big)_{\delta \in \mathbb{R}}$. For each $n \ge 0$, the $n$-dimensional undirected persistence $K(\Lambda)$-module of $(V, d_V)$ is
$$\mathcal{H}_n(V, d_V) := \big( \{ H_n(\mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta, K(\Lambda)) \}, \{ H_n(i_\delta^{\delta'}) \} \big)_{\delta \le \delta' \in \mathbb{R}}.$$
Similarly, the $n$-dimensional directed persistence $K(\Lambda)$-module of $(V, d_V)$ is defined as
$$\mathcal{H}_n^{\mathrm{Dir}}(V, d_V) := \big( \{ H_n^{\mathrm{Dir}}(\mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta, K(\Lambda)) \}, \{ H_n^{\mathrm{Dir}}(i_\delta^{\delta'}) \} \big)_{\delta \le \delta' \in \mathbb{R}}.$$

Remark 5.15.

The persistence module $\mathcal{H}_n(V, d_V)$ associated to the directed Rips filtration of $(V, d_V)$ is precisely the persistence module studied in [23, Section 5] for the field $K(\Lambda)$, hence the remarks made there hold for the undirected persistence module. In particular, if $(V, d_V)$ is a (finite) metric space, $\mathcal{R}^{\mathrm{Dir}}(V, d_V)$ is the (classical) Vietoris-Rips filtration of $(V, d_V)$. Furthermore, in this case, it can easily be seen that $\mathcal{H}_n(V, d_V) = \mathcal{H}_n^{\mathrm{Dir}}(V, d_V)$. Thus, these persistence modules generalise the persistence modules associated to the Vietoris-Rips filtration of a metric space.

As $K(\Lambda)$ is a field and since $V$ is finite, both of these persistence modules fulfil the assumptions in Section 2.1. Namely, their indexing sets can be chosen to be finite, corresponding to the threshold values where new simplices are added to the simplicial complex. Furthermore, no simplex is added to the filtration until the threshold value reaches the minimum value taken by the dissimilarity function. Finally, even though the directed simplicial complex $\mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta$ may have infinitely many simplices, because arbitrary repetitions of vertices are allowed, it always has a finite number of simplices in a given dimension $n$, thus its $n$-dimensional homology is always finite-dimensional. As a consequence, we can introduce the following.

Definition 5.16.

Let $(V, d_V)$ be a dissimilarity function. For each $n \ge 0$, the $n$-dimensional persistence diagrams associated to the persistence $K(\Lambda)$-modules $\mathcal{H}_n(V, d_V)$ and $\mathcal{H}_n^{\mathrm{Dir}}(V, d_V)$ are respectively denoted by $\mathrm{Dgm}_n(V, d_V)$ and $\mathrm{Dgm}_n^{\mathrm{Dir}}(V, d_V)$. Similarly, their associated persistence barcodes are denoted by $\mathrm{Pers}_n(V, d_V)$ and $\mathrm{Pers}_n^{\mathrm{Dir}}(V, d_V)$.
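To make the directed Rips filtration underlying these diagrams concrete, the following minimal sketch computes the threshold at which an ordered tuple enters the filtration of Definition 5.13 and lists the simplices present at a given stage. The representation of the dissimilarity function as a dictionary keyed by ordered pairs, and the function names `entry_value` and `directed_rips`, are assumptions made only for this illustration.

```python
from itertools import product


def entry_value(simplex, d):
    """Smallest delta at which the ordered tuple `simplex` enters the filtration.

    `d` maps ordered pairs (u, v) to the dissimilarity d_V(u, v); only pairs
    in positions i <= j are inspected, as in Definition 5.13.
    """
    size = len(simplex)
    return max(d[simplex[i], simplex[j]]
               for i in range(size) for j in range(i, size))


def directed_rips(vertices, d, delta, max_dim):
    """All ordered tuples of dimension <= max_dim present at threshold delta.

    Repeated vertices are allowed, so tuples are drawn from the full product.
    """
    simplices = []
    for dim in range(max_dim + 1):
        for simplex in product(vertices, repeat=dim + 1):
            if entry_value(simplex, d) <= delta:
                simplices.append(simplex)
    return simplices


# An asymmetric example: going from a to b is cheap, going back is expensive.
d = {("a", "a"): 0, ("b", "b"): 0, ("a", "b"): 1, ("b", "a"): 3}
print(directed_rips(["a", "b"], d, delta=1, max_dim=1))
```

In this toy example the edge `("a", "b")` is present at threshold 1 while `("b", "a")` only appears at threshold 3, which is the kind of asymmetry the directed filtration is designed to record.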

Of course, Proposition 5.8 holds for these barcodes; namely, every bar in $\mathrm{Pers}_n^{\mathrm{Dir}}(V, d_V)$ can be uniquely matched with one in $\mathrm{Pers}_n(V, d_V)$ which dies at the same time, although the directed bar may be born later.

We now use results from Section 2.2 to show that both of these persistent homology constructions are stable. The proof is split into several lemmas. Let $(V, d_V)$ and $(W, d_W)$ be two dissimilarity functions on respective sets $V$ and $W$ and define $\eta = 2 d_{CD}\big((V, d_V), (W, d_W)\big)$. By Proposition 2.11, we can find maps $\varphi : V \to W$ and $\psi : W \to V$ such that $\operatorname{dis}(\varphi), \operatorname{dis}(\psi), \operatorname{codis}(\varphi, \psi), \operatorname{codis}(\psi, \varphi) \le \eta$. To simplify notation in the proofs below, denote $\mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta = X_V^\delta$ and $\mathcal{R}^{\mathrm{Dir}}(W, d_W)_\delta = X_W^\delta$, for all $\delta \in \mathbb{R}$.

Lemma 5.17.

For each $\delta \in \mathbb{R}$, the maps $\varphi$ and $\psi$ induce morphisms of directed simplicial complexes
$$\varphi_\delta : X_V^\delta \longrightarrow X_W^{\delta+\eta},\ x \longmapsto \varphi(x), \qquad \psi_\delta : X_W^\delta \longrightarrow X_V^{\delta+\eta},\ x \longmapsto \psi(x).$$

Proof.

Let us prove the statement for $\varphi_\delta$ (the proof is analogous for $\psi_\delta$). Let $(x_0, x_1, \dots, x_n)$ be an $n$-simplex in $X_V^\delta$. Then $d_V(x_i, x_j) \le \delta$ for all $0 \le i \le j \le n$. Since $\operatorname{dis}(\varphi) \le \eta$, we have that, for all $v_1, v_2 \in V$,
$$\big| d_V(v_1, v_2) - d_W\big(\varphi(v_1), \varphi(v_2)\big) \big| \le \eta.$$
Choosing $v_1 = x_i$ and $v_2 = x_j$, we have
$$d_W\big(\varphi(x_i), \varphi(x_j)\big) \le \eta + d_V(x_i, x_j) \le \delta + \eta \quad \text{for all } 0 \le i \le j \le n.$$
Consequently, $\big(\varphi(x_0), \varphi(x_1), \dots, \varphi(x_n)\big) \in X_W^{\delta+\eta}$ and the result follows. □

Lemma 5.18.

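For $\delta \le \delta' \in \mathbb{R}$ consider the inclusion maps $i_\delta^{\delta'} : X_V^\delta \hookrightarrow X_V^{\delta'}$ and $j_\delta^{\delta'} : X_W^\delta \hookrightarrow X_W^{\delta'}$. The following are commutative diagrams of morphisms of directed simplicial complexes.
$$
\begin{array}{ccc}
X_V^{\delta} & \xrightarrow{\; i_\delta^{\delta'} \;} & X_V^{\delta'} \\
\downarrow {\scriptstyle \varphi_\delta} & & \downarrow {\scriptstyle \varphi_{\delta'}} \\
X_W^{\delta+\eta} & \xrightarrow{\; j_{\delta+\eta}^{\delta'+\eta} \;} & X_W^{\delta'+\eta}
\end{array}
\qquad
\begin{array}{ccc}
X_W^{\delta} & \xrightarrow{\; j_\delta^{\delta'} \;} & X_W^{\delta'} \\
\downarrow {\scriptstyle \psi_\delta} & & \downarrow {\scriptstyle \psi_{\delta'}} \\
X_V^{\delta+\eta} & \xrightarrow{\; i_{\delta+\eta}^{\delta'+\eta} \;} & X_V^{\delta'+\eta}
\end{array}
$$

Proof.

We prove that the first diagram is commutative (the proof for the second diagram is analogous). Let $x \in V$. Since $i_\delta^{\delta'}$ is an inclusion, $(\varphi_{\delta'} \circ i_\delta^{\delta'})(x) = \varphi_{\delta'}(x) = \varphi(x)$. Similarly, since $j_{\delta+\eta}^{\delta'+\eta}$ is an inclusion, $(j_{\delta+\eta}^{\delta'+\eta} \circ \varphi_\delta)(x) = j_{\delta+\eta}^{\delta'+\eta}\big(\varphi(x)\big) = \varphi(x)$. □

Lemma 5.19.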

With the same notation as in Lemmas 5.17 and 5.18, for every $\delta \in \mathbb{R}$, the following diagrams of morphisms of directed simplicial complexes induce commutative diagrams on homology.
$$
\begin{array}{ccc}
X_V^{\delta} & \xrightarrow{\; i_\delta^{\delta+2\eta} \;} & X_V^{\delta+2\eta} \\
& {\scriptstyle \varphi_\delta} \searrow \quad \nearrow {\scriptstyle \psi_{\delta+\eta}} & \\
& X_W^{\delta+\eta} &
\end{array}
\qquad
\begin{array}{ccc}
X_W^{\delta} & \xrightarrow{\; j_\delta^{\delta+2\eta} \;} & X_W^{\delta+2\eta} \\
& {\scriptstyle \psi_\delta} \searrow \quad \nearrow {\scriptstyle \varphi_{\delta+\eta}} & \\
& X_V^{\delta+\eta} &
\end{array}
$$

Proof.

Again, we only prove the result for the first diagram, as the proof for the second diagram is analogous. We show that it is commutative up to homotopy by showing that the maps $i_\delta^{\delta+2\eta}$ and $\psi_{\delta+\eta} \circ \varphi_\delta$ satisfy the hypothesis of Lemma 4.15.

Take a simplex $\sigma = (x_0, x_1, \dots, x_n) \in X_V^\delta$, thus $d_V(x_i, x_j) \le \delta$ for all $0 \le i \le j \le n$. On the one hand, $i_\delta^{\delta+2\eta}$ is an inclusion, so $i_\delta^{\delta+2\eta}(\sigma) = \sigma$. On the other hand, since $\psi_{\delta+\eta} \circ \varphi_\delta$ is a morphism of directed simplicial complexes (Definition 4.9), $\big(\psi(\varphi(x_0)), \psi(\varphi(x_1)), \dots, \psi(\varphi(x_n))\big) \in X_V^{\delta+2\eta}$. This implies that $d_V\big(\psi(\varphi(x_i)), \psi(\varphi(x_j))\big) \le \delta + 2\eta$ for all $0 \le i \le j \le n$. Now recall that $\operatorname{codis}(\varphi, \psi) \le \eta$, thus for all $v \in V$ and $w \in W$,
$$\big| d_V\big(v, \psi(w)\big) - d_W\big(\varphi(v), w\big) \big| \le \eta.$$
Then, for $0 \le i \le j \le n$, by taking $v = x_i$ and $w = \varphi(x_j)$,
$$d_V\big(x_i, \psi(\varphi(x_j))\big) \le \eta + d_W\big(\varphi(x_i), \varphi(x_j)\big) \le \delta + 2\eta.$$
As a consequence of the inequalities above, we have shown that for every $0 \le i \le n$,
$$\big(x_0, x_1, \dots, x_i, \psi(\varphi(x_i)), \psi(\varphi(x_{i+1})), \dots, \psi(\varphi(x_n))\big) \in X_V^{\delta+2\eta}.$$
Therefore, the maps $i_\delta^{\delta+2\eta}$ and $\psi_{\delta+\eta} \circ \varphi_\delta$ satisfy the hypothesis of Lemma 4.15, thus they induce the same map on homology. The result follows. □

We now have everything we need to prove the stability results.

Theorem 5.20.

Let $\Lambda$ be a cancellative semiring such that $K(\Lambda)$ is a field. Let $(V, d_V)$ and $(W, d_W)$ be two dissimilarity functions on finite sets $V$ and $W$. Then, for all $n \ge 0$,
$$d_B\big(\mathrm{Dgm}_n(V, d_V), \mathrm{Dgm}_n(W, d_W)\big) \le 2 d_{CD}\big((V, d_V), (W, d_W)\big)$$
and
$$d_B\big(\mathrm{Dgm}_n^{\mathrm{Dir}}(V, d_V), \mathrm{Dgm}_n^{\mathrm{Dir}}(W, d_W)\big) \le 2 d_{CD}\big((V, d_V), (W, d_W)\big).$$

Proof.

Define $\eta = 2 d_{CD}\big((V, d_V), (W, d_W)\big)$. By Theorem 2.7, it suffices to show that the persistence modules $\mathcal{H}_n(V, d_V)$ and $\mathcal{H}_n(W, d_W)$ (respectively $\mathcal{H}_n^{\mathrm{Dir}}(V, d_V)$ and $\mathcal{H}_n^{\mathrm{Dir}}(W, d_W)$) are $\eta$-interleaved. Comparing Definition 2.5 (for $\varepsilon = \eta$) with Lemmas 5.17, 5.18 and 5.19, the result follows from the functoriality of homology and, in the directed case, Proposition 5.4. □

Remark 5.21.

Recall from Remark 5.15 that the persistence module $\mathcal{H}_n(V, d_V)$ associated to the directed Rips filtration of $(V, d_V)$ is the persistence module studied in [23, Section 5] for the field $K(\Lambda)$. Thus, the remarks made in [23, Section 5.2] hold for these persistence modules, meaning that a result analogous to Theorem 5.20 would not hold if we were using ordered-set complexes instead of directed simplicial complexes, as mentioned at the beginning of Section 4. This justifies our definition of directed simplicial complex (Definition 4.1).

6. Algorithmic implementation of directed persistent homology

Let $(V, d_V)$ be a dissimilarity function on a set $V$. In this section we show that the Standard Algorithm for computing persistent homology is applicable to $\mathcal{H}_n(V, d_V)$. We also discuss the computational challenges of calculating $\mathcal{H}_n^{\mathrm{Dir}}(V, d_V)$.

6.1. The Standard Algorithm for (undirected) persistence.

The Standard Algorithm for the computation of persistent homology was first introduced in [9] for the field $\mathbb{F}_2$ and later generalised to arbitrary fields in [24]. It was the first algorithm suited to the computation of persistent homology, and although newer and faster (sometimes only heuristically) algorithms have been introduced over the years, they generally require additional results not yet established in the framework of semimodules and directed simplicial complexes, such as the use of cohomology or discrete Morse theory (see [18] for a review and comparison of different algorithms for the computation of persistent homology).

In this section, as a first approach to an algorithmic implementation that could later be extended to the directed case, we show that the Standard Algorithm can be adapted to the computation of the persistence diagrams and barcodes of the undirected persistent homology introduced in Section 5. We use [10, Chapter VII] as our main reference, including for the associated terminology.

Let $k$ be a field and $(V, d_V)$ a dissimilarity function. Write $\mathcal{R}^{\mathrm{Dir}}(V, d_V)_\delta = X_V^\delta$, $\delta \in \mathbb{R}$, for the directed Rips filtration (Definition 5.13). As $V$ is finite, there exists $\lambda \in \mathbb{R}$ such that $i_\delta^{\delta'} : X_V^\delta \to X_V^{\delta'}$ is an isomorphism for every $\delta' \ge \delta \ge \lambda$. That is, $X_V^\lambda$ is the final stage of the filtration, and no new simplices are added when increasing the filtration parameter.

Assume that we want to compute the persistence barcodes of $\mathcal{H}_k(V, d_V)$ up to a certain dimension $n \ge 0$. Then, we only need to consider simplices up to dimension $n + 1$. As $V$ is finite, the number of simplices in $X_V^\lambda$ up to dimension $n + 1$ is finite, say $N$. We can list them $\{\sigma_1, \sigma_2, \dots, \sigma_N\}$ in such a way that $i < j$ if $\sigma_i$ is a (proper) face of $\sigma_j$, or if $\sigma_j$ appears 'later' in the filtration. Formally, we call the sequence $\sigma_1, \sigma_2, \dots, \sigma_N$ a compatible ordering of the simplices if the following two conditions are satisfied:
(1) if $\sigma_i$ is a proper face of $\sigma_j$, then $i < j$,
(2) if $\sigma_i \in X_V^\delta$, $\sigma_j \in X_V^{\delta'} \setminus X_V^\delta$, and $\delta < \delta'$, then $i < j$.

With a compatible ordering, the set $\{\sigma_1, \sigma_2, \dots, \sigma_m\}$ is always a subcomplex of $X_V^\lambda$, for every $m \le N$. Also, if we represent the differential using a sparse $N \times N$ matrix $M$ over $k$, where the $(i, j)$-entry $M_{ij}$ is the coefficient of $\sigma_i$ in the differential of $\sigma_j$, $M$ becomes an upper-triangular matrix: the simplices in the differential of $\sigma_j$ are faces of $\sigma_j$, thus they are represented in rows that come 'earlier' than $j$.

Finally, for each $i = 1, 2, \dots, N$, we define $a_i = \min\{\delta \in \mathbb{R} \mid \sigma_i \in X_V^\delta\}$, that is, the index at which the simplex $\sigma_i$ enters the filtration. Equivalently, this is the maximum of the pairwise dissimilarities in the simplex, $a_i = \max\{ d_V(x_j, x_k) \mid x_j, x_k \in \sigma_i,\ j \le k \}$. Note that there may be indices $i < j$ such that $a_i = a_j$.

Now recall that, since we are working with coefficients in the field $k$, $H_n(X_V^\lambda, k) = \ker \partial_n / \operatorname{Im} \partial_{n+1}$. The key observation used in the Standard Algorithm is that upon the addition of the simplex $\sigma_j$ of dimension $k$ only one of two things may happen:
• The $j$th column of $M$ is linearly independent of the preceding columns. As this column corresponds to $\partial_k(\sigma_j)$, written with respect to the preceding simplices, $\partial_k(\sigma_j)$ is linearly independent of the remaining terms in $\operatorname{Im} \partial_k$; namely, the addition of $\sigma_j$ increases the dimension of $\operatorname{Im} \partial_k$ by one. As the dimension of $\ker \partial_{k-1}$ remains unchanged, the dimension of the $(k-1)$-dimensional homology decreases by one: the class of $\partial_k(\sigma_j)$ is made trivial, and no further changes are made to the homology. We say that $\sigma_j$ is a negative simplex.
• The $j$th column of $M$ is linearly dependent on the preceding columns. In this case, the dimension of $\ker \partial_k$ increases by one. Indeed, there is a linear combination of columns of $M$ which includes column $j$ and gives rise to a zero column. But since the $i$th column of $M$ encodes the differential of $\sigma_i$, this means that there is a chain containing the simplex $\sigma_j$ with a trivial differential, that is, there is a cycle containing the elementary chain associated to $\sigma_j$. Equivalently, the entrance of $\sigma_j$ causes the birth of a homology class in dimension $k$. We say that $\sigma_j$ is a positive simplex.

The Standard Algorithm (Algorithm 1) computes the persistent homology at once by reducing the matrix $M$ using column operations. Let $M_j$ denote the $j$th column of the matrix $M$, and write $M_j = (m_{1j}, m_{2j}, \dots, m_{Nj}) \in k^N$. Define $\operatorname{low}(M_j) = \max\{ i = 1, 2, \dots, N \mid m_{ij} \neq 0 \}$, that is, the index of the lowest (in terms of the matrix representation) non-trivial element in $M_j$. The reduction is performed by so-called reducing column operations. Namely, a column operation $M_i \leftarrow M_i + \lambda M_j$, $\lambda \in k$, is called reducing if $j < i$ and $\operatorname{low}(M_i) = \operatorname{low}(M_j)$, and a matrix is reduced if no reducing operations are possible. The reduction algorithm goes as follows, where $R_j$ denotes the $j$th column of $R$, $R_j = (r_{1j}, r_{2j}, \dots, r_{Nj})$.

Algorithm 1:

Standard persistent homology algorithm

    Procedure StandardPersistentHomology(M)
        R ← M
        L ← [0, 0, …, 0]    // N entries, indexed by row
        for j = 1, 2, …, N do
            while R_j ≠ 0 and L[low(R_j)] ≠ 0 do
                R_j ← R_j − ( r_{low(R_j), j} / r_{low(R_j), L[low(R_j)]} ) R_{L[low(R_j)]}
            if R_j ≠ 0 then
                L[low(R_j)] ← j
        return R

Of course, Algorithm 1 is only one of many possible ways of obtaining a reduced matrix from $M$, and different algorithms may give rise to different reduced matrices. Nonetheless, any reduced matrix obtained from $M$ provides the necessary information to compute the persistence diagram of $\mathcal{H}_k(V, d_V)$, for every $k \le n$. Indeed, the discussion in [10, VII.1] applies here, showing that $\operatorname{low}(R_j)$ is independent of the reduced matrix $R$ obtained from $M$. Furthermore, we have the following:
• If $R_j$ is zero, $\sigma_j$ is a positive simplex.
• If $R_j$ is non-zero, the entrance of $\sigma_j$ in the complex causes the death of the class created upon the entrance of $\sigma_i$, where $i = \operatorname{low}(R_j)$.

Thus, $\mathrm{Dgm}_k(V, d_V)$ is obtained from $R$ simply as follows.
• If $\sigma_i$ is a $k$-dimensional simplex and $R_i$ is a zero column such that there exists $j$ for which $i = \operatorname{low}(R_j)$, then $(a_i, a_j) \in \mathrm{Dgm}_k(V, d_V)$.
• If $\sigma_i$ is a $k$-dimensional simplex and $R_i$ is a zero column but there is no $j$ for which $i = \operatorname{low}(R_j)$, then $(a_i, +\infty) \in \mathrm{Dgm}_k(V, d_V)$.

Remark 6.1.

The time complexity of the Standard Algorithm is cubic in the number of simplices of the complex. Note, however, that the number of simplices of a given dimension in a directed simplicial complex over a fixed vertex set can be significantly higher than is possible in an (undirected) simplicial complex over the same vertex set. It is also worth mentioning that many of the modifications making this algorithm more efficient, such as those in [10, VII.2] and [6], can be used in this framework.

6.2.

On the computation of the directed persistence diagrams.

The key idea behind the Standard Algorithm unfortunately cannot be applied in the directed case. Indeed, as we have seen in Example 5.11, the addition of a single simplex to the filtration may cause the birth of several directed cycles at once, giving rise to different classes in homology. This seems to indicate that, in order to effectively compute the directed persistence diagram associated to a filtration, it may be necessary to compute the directed cycles that appear whenever a positive simplex is added to the filtration.

In the 1-dimensional case, computing directed cycles reduces to computing linear combinations of the elementary circuits in the simplicial complex regarded as a directed graph. The most efficient algorithms for this purpose are variations of one due to Johnson [14]. Their time complexity is $O\big((n+e)(c+1)\big)$, where $n$ is the number of vertices, $e$ the number of edges and $c$ the number of elementary circuits.

Preprocessing algorithms for the reduction of the size of the simplicial complexes, such as that in [16], are also not applicable verbatim in the context of semirings. This poses significant challenges for the computation of the persistence diagrams associated to directed persistence modules, which are beyond the scope of this paper and motivate future research directions such as the ones sketched below.

• Find ways to effectively compute a basis of the vector space of directed cycles that do not require the computation of the entire set of elementary circuits.
• Develop preprocessing algorithms to reduce the size of the directed simplicial complexes in the filtration without changing their homology. Such a task may require the generalisation of toolsets such as discrete Morse theory or Hodge theory to semirings, if possible.
• Use these tools to provide a complete, efficient implementation of a directed persistent homology pipeline, and use it to provide new insights in the study of asymmetric data.

We are hopeful that, having provided the necessary groundwork for the study of persistent directed homology, we are opening up an exciting new area for future research into the topological properties of asymmetric data sets.
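To make the 1-dimensional case discussed in this section more concrete, the following Python sketch enumerates the elementary circuits of a small directed graph. It is only an illustration under stated assumptions: it uses a plain backtracking search rather than Johnson's algorithm [14], so it does not attain the complexity bound quoted above, and the names `elementary_circuits` and `circuit_edges`, as well as the adjacency-dictionary representation with integer vertices, are hypothetical choices that do not come from the paper.

```python
from typing import Dict, List, Tuple


def elementary_circuits(graph: Dict[int, List[int]]) -> List[List[int]]:
    """Enumerate the elementary circuits of a directed graph.

    Simple backtracking search (NOT Johnson's algorithm): each circuit is
    reported exactly once, rooted at its smallest vertex.
    `graph` maps each vertex to the list of its out-neighbours.
    """
    # Collect every vertex, including those that only appear as targets.
    vertices = sorted(set(graph) | {w for ws in graph.values() for w in ws})
    circuits: List[List[int]] = []

    def extend(start: int, path: List[int], on_path: set) -> None:
        for w in graph.get(path[-1], ()):
            if w == start:
                # The path closes up into an elementary circuit.
                circuits.append(list(path))
            elif w not in on_path and w > start:
                # Only visit vertices larger than the root, so that every
                # circuit is found once, from its smallest vertex.
                on_path.add(w)
                path.append(w)
                extend(start, path, on_path)
                path.pop()
                on_path.remove(w)

    for s in vertices:
        extend(s, [s], {s})
    return circuits


def circuit_edges(circuit: List[int]) -> List[Tuple[int, int]]:
    """Return the directed edges (1-simplices) traversed by a circuit."""
    k = len(circuit)
    return [(circuit[i], circuit[(i + 1) % k]) for i in range(k)]


# Example: a directed triangle 0 -> 1 -> 2 -> 0 plus the reverse edge 1 -> 0.
example = {0: [1], 1: [2, 0], 2: [0]}
for circuit in elementary_circuits(example):
    print(circuit, circuit_edges(circuit))
```

In this example the search finds the circuits `[0, 1, 2]` and `[0, 1]`. Each circuit corresponds to the formal sum of its edges, in which every coefficient is a non-negative integer. The number of elementary circuits can grow exponentially with the size of the graph, which is one reason why avoiding a full enumeration is listed among the open directions above.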

Acknowledgements.

This work was supported by The Alan Turing Institute under the EPSRC grant EP/N510129/1. The first author was partially supported by Ministerio de Economía y Competitividad (Spain) Grants Nos. MTM2016-79661-P and MTM2016-78647-P and by Ministerio de Ciencia, Innovación y Universidades (Spain) Grant No. PGC2018-095448-B-I00.

References

[1] U. Bauer and M. Lesnick, Induced matchings of barcodes and the algebraic stability of persistence, Proceedings of the Thirtieth Annual Symposium on Computational Geometry (New York, NY, USA), SOCG '14, Association for Computing Machinery, 2014, pp. 355–364.
[2] G. Carlsson, Topology and data, Bull. Amer. Math. Soc. (N.S.) (2009), no. 2, 255–308.
[3] G. Carlsson, F. Mémoli, A. Ribeiro, and S. Segarra, Hierarchical quasi-clustering methods for asymmetric networks, Proceedings of the 31st International Conference on Machine Learning - Volume 32, ICML '14, JMLR.org, 2014, pp. II-352–II-360.
[4] F. Chazal, D. Cohen-Steiner, M. Glisse, L. J. Guibas, and S. Oudot, Proximity of persistence modules and their diagrams, Proceedings of the Twenty-Fifth Annual Symposium on Computational Geometry (New York, NY, USA), SCG '09, Association for Computing Machinery, 2009, pp. 237–246.
[5] F. Chazal, V. de Silva, M. Glisse, and S. Oudot, The structure and stability of persistence modules, SpringerBriefs in Mathematics, Springer, [Cham], 2016. MR 3524869
[6] C. Chen and M. Kerber, Persistent homology computation with a twist, Proceedings 27th European Workshop on Computational Geometry, 2011.
[7] S. Chowdhury and F. Mémoli, A functorial Dowker theorem and persistent homology of asymmetric networks, J. Appl. Comput. Topol. (2018), no. 1-2, 115–175. MR 3873182
[8] S. Chowdhury and F. Mémoli, Persistent path homology of directed networks, Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (USA), SODA '18, Society for Industrial and Applied Mathematics, 2018, pp. 1152–1169.
[9] H. Edelsbrunner, D. Letscher, and A. Zomorodian, Topological persistence and simplification, vol. 28, 2002, Discrete and computational geometry and graph drawing (Columbia, SC, 2001), pp. 511–533.
[10] H. Edelsbrunner and J. L. Harer, Computational topology: An introduction, American Mathematical Society, Providence, RI, 2010.
[11] H. Edelsbrunner, G. Jabłoński, and M. Mrozek, The persistent homology of a self-map, Found. Comput. Math. (2015), no. 5, 1213–1244.
[12] J. S. Golan, Semirings and their applications, Kluwer Academic Publishers, Dordrecht, 1999.
[13] A. Grigor’yan, Y. Lin, Y. Muranov, and S.-T. Yau, Homologies of path complexes and digraphs, arXiv preprint, 2012, arXiv:1207.2834 [math.CO].
[14] D. B. Johnson, Finding all the elementary circuits of a directed graph, SIAM Journal on Computing (1975), no. 1, 77–84.
[15] K. H. Kim and F. W. Roush, Generalized fuzzy matrices, Fuzzy Sets and Systems (1980), no. 3, 293–315.
[16] K. Mischaikow and V. Nanda, Morse theory for filtrations and efficient computation of persistent homology, Discrete & Computational Geometry (2013), no. 2, 330–353.
[17] J. R. Munkres, Elements of algebraic topology, Addison-Wesley Publishing Company, Menlo Park, CA, 1984.
[18] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod, and H. A. Harrington, A roadmap for the computation of persistent homology, EPJ Data Science (2017), no. 1, 17.
[19] A. Patchkoria, Cohomology of monoids with coefficients in semimodules, Bull. Georgian Acad. Sci. (1977), no. 3, 545–548.
[20] A. Patchkoria, Cohomology monoids of monoids with coefficients in semimodules I, J. Homotopy Relat. Struct. (2014), no. 1, 239–255.
[21] M. W. Reimann, M. Nolte, M. Scolamiero, K. Turner, R. Perin, G. Chindemi, P. Dłotko, R. Levi, K. Hess, and H. Markram, Cliques of neurons bound into cavities provide a missing link between structure and function, Frontiers in Computational Neuroscience (2017), 48.
[22] Y.-J. Tan, Bases in semimodules over commutative semirings, Linear Algebra Appl. (2014), 139–152.
[23] K. Turner, Rips filtrations for quasimetric spaces and asymmetric functions with stability results, Algebr. Geom. Topol. (2019), no. 3, 1135–1170.
[24] A. Zomorodian and G. Carlsson, Computing persistent homology, Discrete Comput. Geom. (2005), no. 2, 249–274.

(D. Méndez) School of Mathematical Sciences, University of Southampton, SO17 1BJ, United Kingdom

E-mail address: [email protected]

(R. Sánchez-García) School of Mathematical Sciences, University of Southampton, SO17 1BJ, United Kingdom