HOMOLOGICAL COORDINATIZATION
ANDREW TAUSZ AND GUNNAR CARLSSON
Abstract.
In this paper, we review a method for computing and parameterizing the set of homotopy classes of chain maps between two chain complexes. This is then applied to finding topologically meaningful maps between simplicial complexes, which, in the context of topological data analysis, can be viewed as an extension of conventional unsupervised learning methods to simplicial complexes.

1. Introduction
One goal of topological data analysis is to adapt algebraic topological methods to the context of point cloud data (i.e. finite metric spaces). The generalization of homology to this setting is called persistent homology [ZC05], [Car09]. Persistent homology has been successful at providing insight into nonlinear datasets that would not be accessible with more classical methods. Although maps between spaces play a fundamental role in algebraic topology, most of the developments in topological data analysis have focused on the spaces and datasets themselves. In this paper, we propose a method for studying these maps.

Suppose that we compute the homology of a space X. By comparing the homology of X with the known homology of other spaces, it can suggest that X is homeomorphic to a previously understood space, or that there should be maps exhibiting prescribed homological behavior to model spaces with known homology. One thinks of this process as a kind of non-linear coordinatization. Standard methods for introducing useful coordinates on a point cloud include Principal Component Analysis and Multidimensional Scaling. Both of these methods often work well when the structure of the space is essentially Euclidean. However, when the space carries noncontractible topology, as in the case of a circle, these methods do not provide a method for mapping the data set to a nonlinear model. We describe some examples where the ability to construct maps to nonlinear targets would be useful.

Circular coordinatization:
In situations where one finds that the persistent homology of the data set is that of a circle, it is natural to attempt to find a map to the circle. In this special case, there is a natural methodology using the persistent cohomology generator in dimension 1 which allows one to construct the map. The procedure is described in detail in [dSVJ09].
Natural image example:
In [CISZ06], homological calculations were carried out to confirm the topological structure of a space of frequently occurring motifs within 3 × 3 patches of natural images.

Gene expression data:
Gene expression microarrays provide a powerful tool for obtaining information about many biological phenomena, including cancer. They produce high dimensional data, where the coordinates consist of probes representing particular genes. It is very well known that the results of such studies are highly dependent on platforms, procedures within the laboratories performing the studies, as well as many other factors. All this can distort the geometry of the data set, but one can hope that certain topological features would still be preserved, which might permit one to map one data set to another in a nonlinear way, preserving the relevant features. Often [NLC10], the geometry of these data sets is represented by shapes like the one pictured below.
Date: October 25, 2018. The first author was supported by AFOSR grant FA9550-09-0-1-0531. The second author was supported by AFOSR grant FA9550-09-0-1-0531, ONR grant N00014-08-1-0931 and NSF grant DMS 0905823.
In this case, since the tree is contractible, the direct use of homology will not be useful, since homology vanishes on contractible spaces. However, relative homology of the pair (X, ∂X), where ∂X denotes a suitably defined boundary of the space, does capture the existence of branches or flares in the geometric object. There are reasonable ways of defining the boundary of the metric space, and the points in the boundary have significance in that they tend to consist of the most representative phenomena of a particular subclass of samples. For example, type I and type II diabetes can be distinguished in this way, as can various molecular subtypes of cancer. In this case, one could then fix the map on the boundary, and study the relative mapping problem in which one enforces the constraint that the boundary is carried into a small neighborhood of the boundary. This kind of mapping is of a great deal of importance, since the problem of reconciling different data sets constructed on the same kinds of tumors or other disorders is a major problem in this area.

From the perspective of topological data analysis, we desire a map between two simplicial complexes X and Y to satisfy the following:
• Such a map must be functorial: it must induce homomorphisms on the homology of X and Y.
• Ideally, the map would be simplicial. This means that the image of a simplex should be a chain in Y containing either 0 or 1 simplices. Simplicial maps are particularly well behaved: they are determined by their values on vertices and can be fully extended by linear interpolation. Additionally, they have a nice geometric interpretation. In general, we obtain maps between chains in X and Y. Thus, the typical situation is that we have $f(\sigma) = \sum_{\tau_j \in Y,\, |\tau_j| = |\sigma|} a_j \tau_j$.
• Even though a map might be a chain map (hence inducing homomorphisms, or even isomorphisms in homology), such a map might be unsatisfactory in the eyes of a practitioner. We want these mappings to reveal some sort of information about the structure of one complex in terms of the other. A common situation might be that X is created from a large and high-dimensional dataset, and Y is a simple model with the same homology. In this case, a map X → Y can be thought of as performing a kind of topological dimensionality reduction. This leads to the process of geometric regularization through optimization.

Unfortunately, simplicial maps do not always exist; an example is the case of mapping from a triangle to a square. For this reason, we wish to find maps that are as close to simplicial as possible. Our investigation of this problem proceeds as follows:
• The first step is to compute a parameterization of the homotopy classes of chain maps between two finite simplicial complexes. This is accomplished quite easily using a simple trick from homological algebra. (See sections 2 and 3.)
• We wish to develop optimization problems that yield maps that are as close to simplicial as possible. Additionally, we would like these problems to satisfy other properties. For example, when they exist, simplicial maps should be contained in the set of optima, and to preserve tractability the problem should be convex. We discuss these criteria in section 4 and formulate two different optimization problems over the parameterization. The first one is convex and can be formulated as a linear program, while the second is non-convex but has the property that its minima are precisely the set of simplicial maps when they exist.
• In section 5, we discuss three applications to various scenarios in topological data analysis: manifold-valued coordinates, density maximization, and mappings of contractible spaces. In the third example,
we show how to compute mappings between trees by considering their relative homology with respect to a boundary defined by a filter function.

The two main existing investigations into the area of computing maps are [dSVJ09] and [Din09]. In the paper [dSVJ09], de Silva and Vejdemo-Johansson present a method based on persistent cohomology for computing circular-valued coordinates on statistical data sets. The fundamental idea behind their work is to use Brown representability for cohomology. Since we know that $S^1$ is the Eilenberg-MacLane space $K(\mathbb{Z}, 1)$, we have that
$$H^1(X; \mathbb{Z}) \cong [X, K(\mathbb{Z}, 1)] = [X, S^1]$$
We can compute a map from a space X to the circle $S^1$ by choosing a representative from a cohomology class in $H^1(X; \mathbb{Z})$. In practice, they also perform an optimization step where they select the smoothest cocycle. Although this works very well for the case of a circle, it is not practically generalizable to spaces other than $S^1$. For example, $K(\mathbb{Z}, 2)$ is the infinite-dimensional complex projective space $\mathbb{CP}^\infty$, and $K(\mathbb{Z}/2\mathbb{Z}, 1)$ is $\mathbb{RP}^\infty$.

In the PhD thesis of Yi Ding, [Din09], the mapping problem is investigated from a combinatorial perspective. The hom-complex is used, as it is here, to obtain a parameterization of the space of chain maps, but combinatorial optimization techniques are used to select maps that satisfy certain criteria. Our work can be seen as an extension of Ding's work to the continuous case.

We close the introduction with a remark on how we can think about the simplicial mapping problem as a version of higher order clustering. Clustering can be thought of as computing a map from a dataset X to the discrete space Y = {1, ..., k}. In the figure below, a cluster assignment is a mapping from each point in the set on the left to the set {1, 2, 3}.

Suppose that we construct a filtered simplicial complex from X with some maximum filtration parameter $r_{max}$. Examples of such constructions include the Vietoris-Rips complex, the Čech complex, and others. In this case, a clustering assignment is a mapping from the filtered complex to {1, ..., k} that is constant on the path components of the complex. It is easy to verify that this is equivalent to the chain map property in dimension 0: if e is an edge between u and v in the same path component, then $f(u) - f(v) = f(\partial e) = \partial f(e) = 0$. Thus, a higher dimensional analogue of clustering is the computation of some homotopy representative of a class of chain maps for n-simplices, with n > 0.

2. Definitions and Basics
In this section we review some basic definitions for completeness. This material can be found in a standardtext on algebraic topology such as [Hat02] or [Mun93]. The material on the hom-complexes can be found in[ML75].
Definition 1.
A chain map between two chain complexes $(A_n, d_n)$ and $(B_n, d'_n)$ is a family of homomorphisms $f_n : A_n \to B_n$ such that for each n we have that $f_{n-1} d_n = d'_n f_n$.

Definition 2.
Given two chain maps f and g between the chain complexes $(A_n, d_n)$ and $(B_n, d'_n)$, a chain homotopy is a family of homomorphisms $s_n : A_n \to B_{n+1}$ such that for each n we have that

(1) $d'_{n+1} s_n + s_{n-1} d_n = f_n - g_n$

The key fact is that if the chain maps f and g are chain-homotopic, then they induce the same homomorphism on homology.
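To see Definitions 1 and 2 in action, the following sketch (our own illustration in Python, not part of the original development) verifies both conditions on the chain complex of a hollow triangle: a chain map f, a perturbation g of f by a chain homotopy s, and the homotopy identity relating them.

```python
# Chain complex of a hollow triangle: C_1 (edges [0,1], [0,2], [1,2])
# and C_0 (vertices [0], [1], [2]); d1 sends [a,b] to [b] - [a].
d1 = [[-1, -1, 0],
      [1, 0, -1],
      [0, 1, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matsub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# f: the chain map collapsing every vertex to [0] and every edge to 0.
f0 = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]
f1 = [[0] * 3 for _ in range(3)]

# Chain map condition (Definition 1): f_0 d_1 = d_1 f_1.
assert matmul(f0, d1) == matmul(d1, f1)

# A chain homotopy component s_0 : C_0 -> C_1 sending [2] to [0,1].
s0 = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

# Perturb f by the homotopy: g_0 = f_0 - d_1 s_0, g_1 = f_1 - s_0 d_1.
g0 = matsub(f0, matmul(d1, s0))
g1 = matsub(f1, matmul(s0, d1))

# g is again a chain map ...
assert matmul(g0, d1) == matmul(d1, g1)
# ... and s witnesses identity (1): in degree 0, d_1 s_0 + 0 = f_0 - g_0;
# in degree 1, 0 + s_0 d_1 = f_1 - g_1 (here s_1 = 0 and d_2 = 0).
assert matmul(d1, s0) == matsub(f0, g0)
assert matmul(s0, d1) == matsub(f1, g1)
print("chain map and homotopy conditions verified")
```

Since f and g differ by the boundary of s, they induce the same map on homology, as the key fact above asserts.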
Definition 3.
Given two chain complexes $(A_*, d_*)$ and $(B_*, d'_*)$, we define a new complex as follows. Let

(2) $\mathrm{Hom}_n(A_*, B_*) = \bigoplus_{p=-\infty}^{\infty} \mathrm{Hom}(A_p, B_{p+n})$

An element f of $\mathrm{Hom}_n(A_*, B_*)$ is a family of homomorphisms $f_p : A_p \to B_{p+n}$ for $p \in \mathbb{Z}$. Note that even in the case where the chain complexes $(A_*, d_*)$ and $(B_*, d'_*)$ have non-negative support, the chain complex $\mathrm{Hom}_n(A_*, B_*)$ will have non-trivial negative terms in general.

Now that we have defined the terms in the complex, we need to define connecting homomorphisms $d^H_n : \mathrm{Hom}_n(A_*, B_*) \to \mathrm{Hom}_{n-1}(A_*, B_*)$. If $f \in \mathrm{Hom}_n(A_*, B_*)$, then $d^H_n(f)$ will itself be a family of homomorphisms. Let us denote by $f_p$ the component mappings on the individual modules $A_p$. To define $d^H_n(f)$, it is sufficient to specify its action on elements of $A_p$ for all p. Let $a \in A_p$ be an arbitrary element. We define
$$d^H_n(f)(a) = d'_{p+n}(f_p(a)) + (-1)^{n+1} f_{p-1}(d_p(a))$$
It is easy to see that $d^H$ satisfies $d^H_n d^H_{n+1} = 0$. We summarize the properties of the hom-complex in the proposition below. Each claim can be proved by a simple computation.

Proposition 1.
The hom-complex satisfies the following homological properties:
(1) $f \in \mathrm{Hom}_0(A_*, B_*)$ is a cycle ($d^H(f) = 0$) if and only if f is a chain map.
(2) $f \in \mathrm{Hom}_0(A_*, B_*)$ is the boundary of some element $s \in \mathrm{Hom}_1(A_*, B_*)$ if and only if f is chain homotopic to the zero map via s.
(3) The zeroth homology group $H_0(\mathrm{Hom}_*(A_*, B_*))$ consists of homotopy classes of chain maps.

Thus we have a nice characterization of the homotopy classes of chain maps: all we need to do is compute the homology of the hom-complex. Note that in this paper, if we use the term homotopy we actually mean "chain homotopy". In other words, we only deal with the algebraic notion and not the topological one. Furthermore, by a map between chain complexes, we really mean a chain map.
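Proposition 1 can be checked by brute force on a small example. The following sketch (ours, not from the paper; working over GF(2) so that signs disappear) counts the cycles in $\mathrm{Hom}_0$ (chain maps) and the boundaries coming from $\mathrm{Hom}_1$ for self-maps of a hollow triangle; the ratio of the two counts is the number of homotopy classes.

```python
from itertools import product
from collections import Counter

# Hollow triangle over GF(2): d sends each edge [a,b] to [a] + [b].
# Columns: edges [0,1], [0,2], [1,2]; rows: vertices [0], [1], [2].
d = ((1, 1, 0), (1, 0, 1), (0, 1, 1))

def mul(A, B):  # 3x3 matrix product mod 2
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) % 2
                       for j in range(3)) for i in range(3))

# All 512 possible 3x3 matrices over GF(2).
mats = [tuple(tuple(bits[3 * i:3 * i + 3]) for i in range(3))
        for bits in product((0, 1), repeat=9)]

# Cycles in Hom_0: pairs (f0, f1) with f0 d = d f1 (the chain map
# condition), counted by grouping both sides of the equation.
by_f0 = Counter(mul(f0, d) for f0 in mats)
by_f1 = Counter(mul(d, f1) for f1 in mats)
n_chain_maps = sum(by_f0[v] * by_f1[v] for v in by_f0 if v in by_f1)

# Boundaries: d^H(s) = (d s, s d) for s : C_0 -> C_1 (signs vanish mod 2).
n_boundaries = len({(mul(d, s), mul(s, d)) for s in mats})

# Homotopy classes = cycles / boundaries (both are GF(2)-vector spaces).
print(n_chain_maps // n_boundaries)
```

The count comes out to 4, consistent with the circle having one homology generator in each of dimensions 0 and 1, so that each contributes a factor of GF(2) to the classification.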
Remark 1.
In our investigation, we will restrict ourselves to field coefficients for two computational reasons. Firstly, the homology computations will be performed in a persistent setting, for which one must work over a field (see [ZC05]). Secondly, working over a field of characteristic 0 greatly enhances the efficiency of the later optimization steps. As a bonus for working with field coefficients, we get a more manageable representation of the individual Hom terms, since for vector spaces V and W we have that $\mathrm{Hom}(V, W) \cong V^\wedge \otimes W$. We denote the vector space dual of V by $V^\wedge$. In this case, the n-th term in the hom-complex is

(3) $\mathrm{Hom}_n(A_*, B_*) = \bigoplus_{p=-\infty}^{\infty} (A_p)^\wedge \otimes B_{p+n}$

This is very similar to another standard construction in homology theory: the tensor product of two chain complexes. Thus an element $f \in \mathrm{Hom}_n(A_*, B_*)$ may be written as $f = \sum_{ij} c_{ij} a_i^* \otimes b_j$. From the computational point of view, this representation is particularly useful since most of the coefficients $c_{ij}$ will be zero and do not need to be stored.

We now close this section with a theorem which gives a useful characterization of the homology of the hom-complex in special cases. A proof may be found in chapter III of [ML75].
Proposition 2.
Let $(A_n, d_n)$ and $(B_n, d'_n)$ be chain complexes of R-modules. Then there exists the following exact sequence for each n:

(4) $0 \to \bigoplus_{k=-\infty}^{\infty} \mathrm{Ext}_R(H_k(A), H_{k+n+1}(B)) \to H_n(\mathrm{Hom}_*(A, B)) \to \bigoplus_{k=-\infty}^{\infty} \mathrm{Hom}(H_k(A), H_{k+n}(B)) \to 0$

In the special case when we are dealing with vector spaces over a field F, the Ext term vanishes and we have an explicit expression for the Hom term. So we get that

(5) $H_n(\mathrm{Hom}_*(A_*, B_*)) \cong \bigoplus_{k=-\infty}^{\infty} \mathrm{Hom}(H_k(A), H_{k+n}(B)) \cong \bigoplus_{k=-\infty}^{\infty} (H_k(A))^\wedge \otimes H_{k+n}(B)$

3. Finding Mappings between Simplicial Complexes
In this section we apply the algebraic techniques of the previous section to computing a parameterization of the homotopy classes of maps between two simplicial complexes.

Suppose we have simplicial complexes X and Y with X = {σ_i}, Y = {τ_j}. So {σ_i} and {τ_j} are bases for the vector spaces (over a field, F) $C_*(X)$ and $C_*(Y)$. We also denote the dual basis of $C_*(X)$ by $\{\sigma_i^*\}$. Then, the vector space $\mathrm{Hom}(C_*(X), C_*(Y))$ has basis $\{\sigma_i^* \otimes \tau_j\}$.

Based on our previous discussion about hom-complexes, we know that the set of homotopy classes of maps between X and Y is given by the 0-th homology $H_0(\mathrm{Hom}_*(C_*(X), C_*(Y)))$ with

(6) $\mathrm{Hom}_n(C_*(X), C_*(Y)) = \bigoplus_{p=-\infty}^{\infty} (C_p(X))^\wedge \otimes C_{p+n}(Y)$

In practice, we are not interested in the rank or Betti numbers of the homology group; rather, we wish to find representatives for the homotopy classes. As described in [Mun93], homology can be computed via the Smith normal form in the case of coefficients in a PID, or with Gaussian elimination in the case of coefficients in a field.

The result of the homology computation is a set of representative cycles (equivalently, chain maps), which we denote $\{f^m\}$. We can also easily obtain the set of chain homotopies to zero by computing the columns (the image) of the matrix representation of $d^H$. Let us denote these columns by $\{h^n\}$. The general parameterization of the affine space of homotopy classes is
$$[X, Y] = \left\{ \sum_m b_m f^m + \sum_n c_n h^n \;\middle|\; b_m \in F,\ c_n \in F \right\}$$
Note that the above expression uses the notation [X, Y] for the chain-homotopy classes of chain maps between X and Y, and not the homotopy classes of all continuous maps. Let us denote the m-th chain map by $f^m = \sum_{ij} f^m_{ij} \sigma_i^* \otimes \tau_j$, and the n-th chain homotopy by $h^n = \sum_{ij} h^n_{ij} \sigma_i^* \otimes \tau_j$.

3.1. Example.
Let X be a triangle with vertices [0], [1], [2], and edges [0,1], [0,2], [1,2]. Let Y = X. Note that by the homotopy classification theorem of the previous section (Proposition 2), we have that
$$H_0(\mathrm{Hom}_*(C_*(X), C_*(Y))) \cong \bigoplus_{k=-\infty}^{\infty} (H_k(X))^\wedge \otimes H_k(Y) \cong F \oplus F$$
Thus we have two generating cycles. The two computed representatives of $H_0(\mathrm{Hom}_*(C_*(X), C_*(Y)))$ are:
$$f^1 = [0] \to [0] + [1] \to [0] + [2] \to [0]$$
$$f^2 = -[1,2] \to [0,2] + [1,2] \to [0,1] + [1,2] \to [1,2]$$
The chain homotopies are obtained from the columns of the matrix representation of $d^H$. The first few (out of a total of 9) are:
$$h^1 = [1,2] \to [0,1] + [0,2] \to [0,1] + [2] \to [1] - [2] \to [0]$$
$$h^2 = [1,2] \to [0,2] + [0,2] \to [0,2] + [2] \to [2] - [2] \to [0]$$
$$h^3 = [2] \to [2] - [2] \to [1] + [1,2] \to [1,2] + [0,2] \to [1,2]$$
$$\ldots$$
We can see that the first generating cycle induces an isomorphism $H_0(X) \to H_0(Y)$, and induces the zero map on dimension-1 homology. Similarly, the second generating cycle induces an isomorphism $H_1(X) \to H_1(Y)$ and induces the zero map on dimension-0 homology. A generator for $H_0(X)$ (equivalently $H_0(Y)$) is [0], and a generator for $H_1(X)$ (equivalently $H_1(Y)$) is $[0,1] - [0,2] + [1,2]$.

The next step is to select the coefficients $\{b_m\}$ for the chain maps. From the previous paragraphs, we have chain maps (which are homology cycles): $f^1$, which induces an isomorphism on dimension 0 homology, and $f^2$, which induces an isomorphism on dimension 1 homology. If we take their sum, $f = f^1 + f^2$ (by setting $b_1 = b_2 = 1$), then it is easy to see that f induces an isomorphism on both homology groups. In practice this map is the one to start with, since it preserves the homological structure across all dimensions. Thus in this case we would use the parameterization
$$\left\{ f + \sum_n c_n h^n \;\middle|\; c_n \in F \right\}$$

4. Searching the Parameterization
Given two simplicial complexes X and Y, we now know how to compute a parameterization for the homotopy classes of chain maps between them. However, in a statistical application setting we are interested in selecting only one geometrically meaningful map from this set. Some reasonable criteria for such a map are:
• Image cardinality (simplicialness): In general, the image of a simplex σ under a map g will be some linear combination $g(\sigma) = \sum_j d_j \tau_j$. Ideally, we would want the number of non-zero coefficients $d_j$ to be as few as possible.
• Preimage cardinality: Likewise, we would prefer that the number of non-zero coefficients in the expression $g^*(\tau) = \sum_i c_i \sigma_i$ be as few as possible.
• Locality: We would like the image of a simplex to be localized in the codomain. This means that the non-zero terms in the expression $g(\sigma) = \sum_j d_j \tau_j$ should not be spread apart.
• Unbiasedness: There should be no a-priori reason to prefer one map over another if they achieve the same optima. For example, in the case of a triangle mapping to a triangle, there is no geometric way of distinguishing the identity map from one of the two rotations of the vertices (both are simplicial and optimally localized).
• Convexity: The optimization problem should be convex.

Unfortunately, the above criteria are mutually incompatible. To see this, it suffices to consider the case where X and Y are both triangles. The optimally sparse and localized chain maps include the three rotations of the vertices. However, the unbiasedness property says that each of these three maps should be elements of the set of optima of the optimization problem. If we require the problem to be convex, then it turns out that the set of optima must also be convex and in particular connected. Thus if one takes a non-trivial convex combination (say with coefficients (1/3, 1/3, 1/3)), that will also be an optimum, but it will violate the condition of sparsity.
One remedy for this is to require that the set of optima is the convex hull of the unbiased set of points. Alternatively, one can discard unbiasedness and require that only one sparse point be returned.

4.1. The Combinatorial Approach is Hard: Theory.
The purpose of this section is to provide some orientation regarding the complexity of finding nice maps. This assumes that we are taking a combinatorial approach as discussed in [Din09].

Our ultimate goal would be to construct a map that is simplicial in both directions. This means that the image and preimage of each simplex contains zero or one simplices. Let us call such a map bisimplicial. In other words, such a map would minimize the quantity

(7) $\max_{\sigma \in X} \|f(\sigma)\|_0 + \max_{\tau \in Y} \|f^*(\tau)\|_0$

When a simplicial map exists, the above function has a minimum value of 2. However, it turns out that the above optimization problem is hard in a precise sense:
Proposition 3.
Finding whether or not a bisimplicial map exists is at least as difficult as solving the graph isomorphism problem.

Proof.
We use the standard technique of performing a polynomial-time reduction of an instance of the graph isomorphism problem to the bisimpliciality problem. In other words, suppose that we have an oracle that can tell us whether a bisimplicial map between two complexes exists in polynomial time. Then we must show that we can also determine whether there exists a graph isomorphism between two graphs G and H in polynomial time. Let us construct a machine that solves this problem in polynomial time.

Suppose that we are given two graphs G and H. Note that we can take care of any polynomial time graph invariants beforehand. For example, this means that we may answer "No" if G and H have different numbers of vertices or edges. Similarly, since homology over a field may be computed in polynomial time (see [ZC05]), we may also answer "No" if $H_*(G) \neq H_*(H)$. Thus suppose that using our oracle we have constructed a bisimplicial map f that induces an isomorphism on homology between $C_*(G)$ and $C_*(H)$.

We claim that f is a graph isomorphism. Let us denote the vertices of G and H by V(G) and V(H), and the edges by E(G) and E(H). By bisimpliciality, f must be an isomorphism between V(G) and V(H). Suppose that u ∼ v in G (i.e. the edge [u, v] exists in G). Then by the fact that f is a chain map, $\partial f([u, v]) = f(\partial [u, v]) = f(v - u) = f(v) - f(u)$. Thus f([u, v]) is an edge between f(u) and f(v) in H. So f(u) ∼ f(v). Conversely, suppose that there are vertices x, y in H such that x ∼ y and f(u) = x and f(v) = y, but with u ≁ v. However, we must have an edge [s, t] ∈ E(G) such that f([s, t]) = [x, y] (if not, f would not be bisimplicial or would not be a homology isomorphism). But then at least one of the following holds: s ≠ u or t ≠ v. Without loss of generality, s ≠ u.
Then we have that f(s) = f(u) = x, contradicting the bisimpliciality of f. Thus we must have that u ∼ v, and hence f is a graph isomorphism. Thus, answering "Yes" when a bisimplicial map exists, and "No" otherwise, solves the graph isomorphism decision problem. □

Unfortunately, it is not known whether the graph isomorphism problem is NP-complete or is in P [GJ79]. Thus all of the best-known algorithms are super-polynomial. Complexity theorists have created a class called GI which consists of those problems which are polynomial-time reducible to the graph isomorphism problem. In this terminology, we have shown that computing bisimplicial maps is GI-hard in general.

4.2. The Combinatorial Approach is Hard: Empirical Findings.
Although the bisimpliciality problem is provably hard as previously shown, one may wonder whether it is possible to use heuristic combinatorial optimization techniques (over $\mathbb{Z}/2\mathbb{Z}$). Examples include random walks, simulated annealing, or greedy search with randomized restarts. Here, we give some empirical evidence suggesting that these approaches are unlikely to work for finding bisimplicial maps. Although it is possible that many simplicial maps exist for certain artificially constructed cases, the examples below suggest that such maps are "rare".

Consider one of the simplest conceivable nontrivial cases: where both X and Y are squares containing the simplices {[0], [1], [2], [3], [0,1], [1,2], [2,3], [0,3]}. Suppose that we are looking for a map $f : C_*(X) \to C_*(Y)$ such that f is forward and backward simplicial. As stated before, we wish to find a minimizer for equation (7).

The homotopies in this case consist of the images of all simple tensors of the form $[a]^* \otimes [b, c]$ under the map $d^H$. Thus there are a total of 16 homotopies. Since we are dealing with $\mathbb{Z}/2\mathbb{Z}$ coefficients, there are $2^{16} = 65{,}536$ possible choices for the homotopy coefficients. Although it is not practical in general, we may simply enumerate all possible sets of coefficients. If we do so, we find that out of the 65,536 possibilities, only 16 choices of coefficients yield bisimplicial maps. Figure 1 shows the set of all coefficients enumerated on the horizontal axis (by using binary representation) with the simplicial objective function value on the vertical axis.

Similarly, for finding mappings between a triangle and a square, out of the 4096 possibilities only 7 of them yield minima. In the case where X and Y are circles with 8 vertices (octagons), simulated annealing was performed. After 25,300 iterations, a minimum value of 11 was reported for the simplicial objective function. Since the identity map is a minimizer, the actual minimum should be 2. This was also the case with the other heuristics (greedy search and random walking): we have yet to find a nontrivial pair in which one of these techniques is able to find a map that comes close to minimizing (7). While these examples do not constitute proof, they suggest that bisimplicial maps are extremely rare among chain maps.

4.3. Matrix Representation.
For convenience, we fix some notation that will be used for the rest of this document. Let the capital letters $F^m$ denote the matrix representations of the chain maps $f^m$, and $H^m$ denote the matrix representations of the homotopies $h^m$. We let $F = \sum_m F^m$ be the sum of the chain maps. We denote an arbitrary member of [X, Y] by g, and its matrix representation by G. The basis elements $\{\sigma_i\}$ of X correspond to the unit vectors $\{e_i\}$, and the basis elements $\{\tau_j\}$ of Y correspond to the unit vectors $\{e_j\}$.

4.4. Minimizing Image and Preimage Size.
Let us explore the first two criteria stated at the beginning of this section. We said that ideally we would like to minimize the number of non-zero terms in the image and adjoint-image of each simplex. This can be formulated as a special case of the following optimization problem:
Figure 1.
Enumeration of all of the homotopy representatives of the chain map inducing an isomorphism between two squares. The horizontal axis indicates the binary representation of the set of 16 coefficients for the homotopies, and the vertical axis shows the bisimpliciality penalty.

(8) minimize over $\{c_n\}$: $\max_{\sigma \in X} \|g(\sigma)\|_p + \max_{\tau \in Y} \|g^*(\tau)\|_p$, subject to $g = \sum_m b_m f^m + \sum_n c_n h^n$

Note that in the above expression, the coefficients $\{b_m\}$ are fixed beforehand. This means that we make an initial selection of which homological features to preserve. If they were not selected and were optimization variables, the problem above would be minimized by the zero map.

It is also possible to replace the max terms by sums over the domain and codomain; however, such a replacement yields less meaningful maps (consider the case of the chain map which maps each vertex in the domain to a single vertex in the codomain).

To minimize the maximum cardinality or the sum of the cardinalities, one can use p = 0 (although this is not actually a norm). However, we have seen that such a combinatorial approach is difficult, and since the problem dimension we are dealing with is multiplicative in the sizes of the simplicial complexes X and Y, this is out of the question. As practiced in many optimization settings, one may relax the cardinality minimization problem to a 1-norm minimization problem. The intuition behind this is that the unit ball in the 1-norm is the convex hull of the points $\{\pm e_i\}$ and that constrained optima tend to lie on corner points, which are sparse.

In the case of the 1-norm, we can rewrite this as:

(9) minimize over $\{c_n\}$: $\|G\|_1 + \|G^T\|_1$, subject to $G = \sum_m b_m F^m + \sum_n c_n H^n$

The 1-norm of a matrix is defined to be the maximum absolute column sum of its entries, and is equal to the operator norm induced by the vector 1-norm.
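Evaluating the relaxed objective of (9) is straightforward; the following sketch (ours, not the authors' code) also shows why minimizing it cannot by itself enforce sparsity:

```python
from fractions import Fraction

# Matrix 1-norm: maximum absolute column sum, i.e. the operator norm
# induced by the vector 1-norm.
def one_norm(M):
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))

# Relaxed objective from (9) for a candidate map with matrix G.
def relaxed_penalty(G):
    Gt = [list(col) for col in zip(*G)]
    return one_norm(G) + one_norm(Gt)

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # a simplicial map
t = Fraction(1, 3)
average = [[t, t, t], [t, t, t], [t, t, t]]    # smears every vertex over all three

print(relaxed_penalty(identity), relaxed_penalty(average))  # 2 2
```

Both matrices attain the same penalty of 2, so the convex relaxation admits non-sparse optima alongside the simplicial ones.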
It turns out that this problem is convex and in fact can be reformulated as a linear program. However, although the optimization problem (8) with p ≥ 1 is convex, its optima need not be sparse. This means, for example, that in the case of the two triangles, the map that sends each vertex to (1/3)([0] + [1] + [2]) is an optimum, but is not what we are looking for. Thus, our viewpoint changes from (8) being the answer to our search for a suitable optimization problem, to instead being a definition of admissibility of a map:

Definition 4.
A map g is said to be admissible if it is a member of the set of optima of (8) with p = 1, or equivalently (9). Denote the admissible set by C.

The next step is to somehow identify the points within C that satisfy some sort of sparsity requirement. One possibility is to require the coefficients $\{c_n\}$ to be integral. This brings us into the realm of integer linear programming, which turns out to be NP-hard (in the absence of the property of integral vertices, which is not satisfied in this situation). In fact, the integer feasibility problem (finding a point with integer coordinates in a polytope defined by $\{x \mid Ax \leq b\}$) is also NP-hard. Computationally, this can be applied to situations where both X and Y are small complexes, but it does not scale well.

The second possibility is to optimize some measure of sparsity or peakiness over the points in C. One strategy is to maximize the ratio of the 2-norm to the 1-norm. (To see that this is reasonable, one can consider the vectors $(1, 0, \ldots, 0)$ and $(1/n, \ldots, 1/n)$.) Another possibility is to minimize the function which measures the distance of a point to its nearest integral point (let the objective be $\|(x_1 - [x_1], \ldots, x_n - [x_n])\|$). Although both these objectives have the property that their global minima over C are the simplicial maps, both of them suffer from being non-convex. In fact, since the vertices of C are local minima of these functions, global minimization would involve searching all of the corner points. Once again, the task of enumerating the vertices of a convex polytope turns out to be NP-hard [KBB+]. Instead, one may choose a random vector v and solve the problem

(10) minimize $v^T c$ subject to $c \in \arg\min \left( \max_{\sigma \in X} \|g(\sigma)\|_p + \max_{\tau \in Y} \|g^*(\tau)\|_p \right)$, $g \in [X, Y]$

This will result in the selection of a random corner point of C. This can be repeated until a sufficiently sparse (close to simplicial) map is found.

4.5. The Alexander-Whitney Map.
It is easy to see that there is no convex objective function that willselect all of the simplicial maps, since a convex function cannot have a disconnected set of minima. However,if we discard the requirement of unbiasedness (not favoring one simplicial map over another) or convexitywe can devise other methods for selecting favorable maps. The following method dispenses with convexity.Define the Alexander-Whitney map ∆ : C ∗ ( X ) → C ∗ ( X ) ⊗ C ∗ ( X ) by∆( σ ) = deg σ (cid:88) i =0 σ | ...i ⊗ σ | i... deg σ Where σ | ...i is [ v , ...v i ] if σ = [ v , ..., v n ], and σ | i... deg σ is [ v i , ..., v n ]. It is a routine calculation to show thatthe following diagram commutes if and only if f is a simplicial chain map:(11) C ∗ ( X ) f −−−−→ C ∗ ( Y ) (cid:121) ∆ (cid:121) ∆ C ∗ ( X ) ⊗ C ∗ ( X ) f ⊗ f −−−−→ C ∗ ( Y ) ⊗ C ∗ ( Y )Thus we can measure the deviation of a chain map from simpliciality by measuring the norm of thedifference g ⊗ g (∆( σ )) − ∆( g ( σ )). Suppose we are given a loss function L on R J ⊗ R J , we can define(12) L AW ( g ) = (cid:88) σ ∈ X L ( g ⊗ g (∆( σ )) − ∆( g ( σ )))For example, a convenient choice would be the quadratic loss(13) || g || ,AW = (cid:88) σ ∈ X || g ⊗ g (∆( σ )) − ∆( g ( σ )) ||
Thus, given a selection of a loss function, we can solve

(14) minimize L_AW(g) + L_AW(g*)
     subject to g ∈ [X, Y]

Note that even if the original loss function L is convex, L_AW will not be convex on the affine space of chain maps. However, it is clear that the minima of the above optimization problem will be precisely the bisimplicial maps, in the case where one exists.

Remark 2.
A naive implementation of the above would be very expensive to compute. Suppose that |X| = I and |Y| = J, so that the matrix representation of a map f is a J × I matrix F. At some point, we need to compute a product of the form (F ⊗ F)v. The simple way would be to form the matrix F ⊗ F (of size J² × I²) and then perform the matrix-vector multiplication. The cost of this operation is O(I²J²).

However, it is possible to rearrange this product. Suppose that v is a vector of length I². We reshape it into a matrix V of size I × I by dividing v into blocks of length I and laying them side by side. It turns out that computing the product (F ⊗ F)v is equivalent to computing F V F^T and then reshaping the result into vector form (see [HS80]). The complexity of this operation is O(I²J + IJ²).

5. Extensions and Applications
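As a numerical check of the reshaping identity from Remark 2 above (numpy; F and v are random stand-ins, and the row-major reshape matches numpy's default vec convention):

```python
import numpy as np

rng = np.random.default_rng(1)
I, J = 4, 3
F = rng.standard_normal((J, I))   # matrix of a map f, of size J x I
v = rng.standard_normal(I * I)    # vector of length I^2

# naive route: form the J^2 x I^2 Kronecker product explicitly
naive = np.kron(F, F) @ v         # O(I^2 J^2) time and memory

# rearranged route: reshape v into an I x I matrix and compute F V F^T
V = v.reshape(I, I)
fast = (F @ V @ F.T).ravel()      # O(I^2 J + I J^2)

assert np.allclose(naive, fast)
```

The two routes agree entry-by-entry, while the rearranged one never materializes the Kronecker product.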
5.1. Coordinatization on a Manifold.
In this section we investigate how the method previously described can be used to find manifold-valued coordinates for a given simplicial complex. Suppose that we have the following data:

• X: The domain simplicial complex. This is the "original" data set under investigation.
• M: A manifold on which we wish to compute coordinates.
• Y: A triangulation of the manifold M.
• ϕ: A homeomorphism Y → M corresponding to the triangulation of M.
• ψ: A "localization" function C_0(Y) → M, which maps zero-dimensional chains on Y to points on the manifold.

We enforce a compatibility constraint on the functions ϕ and ψ: if α = Σ_i c_i σ_i, where c_i = δ_ij, then ψ(α) = ϕ(σ_j). Our objective is to find a coordinate mapping ρ : X → M.

The first approach to computing coordinates on M is to compute the parameterization of [X, Y] and then perform one of the optimization routines of the previous section to obtain a map f : C_*(X) → C_*(Y). Then, given a vertex σ ∈ X, we define its coordinate on M to be ρ(σ) = ψ(f(σ)).

A second approach relies on the geometry of the manifold. Suppose that M is equipped with a Riemannian metric g. We wish to find a mapping X → M which minimizes the total distortion across the 1-skeleton of X. In other words, vertices connected by an edge in X should be mapped to nearby points of M. This can be formulated precisely as the following optimization problem:

(15) minimize Σ_{[σ_i, σ_j] ∈ X} d(ψ(f(σ_i)), ψ(f(σ_j)))
     subject to f ∈ [X, Y]

The metric d on M has the standard definition

d(x, y) = inf { L(γ) | γ : I → M, γ ∈ C¹, γ(0) = x, γ(1) = y },

where L(γ) is the length of the curve γ defined by the Riemannian metric g.

5.2. Example: Mapping to a circle.
Let M = R/2πZ be the circle, and suppose that Y is an n-sided polygon homeomorphic to M. We define the coordinate of the k-th vertex to be 2πk/n, where k = 0, ..., n − 1. We also define ψ to map a chain in C_0(Y) to the weighted sum of its basis elements, so that

ψ(Σ_i c_i σ_i) = Σ_i c_i ϕ(σ_i)
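A minimal sketch of ϕ and ψ for this example (Python; chains are represented as dictionaries from vertex index to coefficient, a hypothetical stand-in for the chain data structure):

```python
import math

def phi(k, n):
    """Angular coordinate of vertex k of the n-gon."""
    return 2 * math.pi * k / n

def psi(chain, n):
    """Localization of a 0-chain: the weighted sum of its vertices'
    angular coordinates, psi(sum c_i sigma_i) = sum c_i phi(sigma_i)."""
    return sum(c * phi(k, n) for k, c in chain.items())

# A chain concentrated on vertex 1 of a square (n = 4) localizes to pi/2:
angle = psi({1: 1.0}, 4)
```

Note the compatibility constraint holds by construction: a chain with a single unit coefficient on σ_j localizes exactly to ϕ(σ_j).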
Our distance function on M becomes d(x, y) = (y − x) mod 2π. Given this setup, we can find M-valued coordinates for a data set of interest, X, by solving the optimization problem (15). An example is shown in Figure 5.

5.3. Density Maximization.
Suppose that we have a set of points Y ⊂ R^n in Euclidean space and a simplicial model X. We wish to find a mapping from the vertices of X to Euclidean space such that the images of these vertices land in regions of high density. The interpretation is that this process produces a clustering of the data that is aware of the topological structure of the data set Y.

To do this, we begin with two pieces of data:

• A filtered simplicial complex, Y, constructed from the vertices Y ⊂ R^n. For example, one may use the Vietoris-Rips or witness constructions discussed in [Car09].
• An empirical density estimator on R^n, which we call f̂(y), created from the data points Y. A common choice would be a kernel density estimate defined by

f̂(y) = (1/(nh)) Σ_{i=1}^n K((y − y_i)/h)

A reasonable choice for the kernel function K is the standard Gaussian density function.

Given a chain map g : C_*(X) → C_*(Y), if σ ∈ C_0(X), then we interpret the image g(σ) ∈ C_0(Y) as the weighted average of the points in the chain. In other words, define ψ(Σ c_j τ_j) = Σ c_j ϕ(τ_j), where ϕ : Y → R^n takes a vertex of Y to its Euclidean coordinates. Since we wish to move points in the image of the chain map to regions of high density, we form the following optimization problem:

(16) maximize Σ_{σ ∈ X} f̂(ψ(g(σ)))
     subject to g ∈ [X, Y]
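The density estimator used in (16) can be sketched in one dimension as follows (Python; the bandwidth h and the sample points are hypothetical):

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian density, the kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def kde(y, samples, h):
    """Kernel density estimate f_hat(y) = (1/(nh)) sum_i K((y - y_i)/h)."""
    n = len(samples)
    return sum(gaussian_kernel((y - yi) / h) for yi in samples) / (n * h)

# Most of the sample sits near 0, with one outlier at 2.5, so the
# estimated density is higher near 0 than near the outlier.
samples = [0.0, 0.1, -0.1, 0.05, 2.5]
```

In the optimization, f̂ is evaluated at the localized image ψ(g(σ)) of each vertex of X.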
5.4. Mapper and Contractible Data Sets.
Another extension of the idea of hom-based mappings is to shape matching or data fusion; the discussion of gene expression data in the introduction provides the motivation. Suppose that we have two data sets X and Y, which for now are just sets of points. If these data sets arise from sampling a null-homotopic space, then Proposition 2 tells us that the homological mapping technique described in the previous section will not be very helpful. However, there is a way to remedy this.

In the paper [SMC], the authors describe a multiscale decomposition method called mapper. The idea is that one has a filter function f : X → R, and one clusters the set X according to preimages of overlapping intervals. Although we do not fully describe the method here, mapper allows a statistical practitioner to obtain a multiscale representation of the data at different resolutions. The reader is advised to consult [SMC] for a detailed discussion.

Given outputs of the mapper algorithm, we are interested in constructing maps between different representations of the same dataset, arising either from different filter functions or from different scale parameters. This problem is relevant in biological settings in which the obtained datasets are highly dependent on the measuring procedures used. We combine our homological mapping procedure with mapper into a structure mapping method as follows:

• Given two data sets X and Y, and two filter functions f_X : X → R and f_Y : Y → R, we run the mapper algorithm to obtain reduced simplicial models T_X and T_Y. For the 1-dimensional version of mapper on a contractible data set, T_X and T_Y will be trees.
• Since the filter functions f_X and f_Y are also defined on T_X and T_Y, we take the quotient of each tree by the set of vertices which are local maxima of the filter functions. This yields two graphs G_X and G_Y. In general, these graphs may have cycles.
• We run the homological mapping algorithm to obtain an optimal chain map g between C_*(G_X) and C_*(G_Y).

6. Implementation and Results
6.1. Software.
The above ideas were implemented as described below in a new version of the JavaPlex software package [TVJA11]. Further optimization and scripting were performed using Matlab. The computation of the homology of the hom-complex was performed over the field Q in exact arithmetic, and the optimization was performed in floating point.

We also note that this entire mapping procedure can be performed in a persistent setting, where in addition to the natural grading of the chain complex by dimension, we have a grading by filtration. For a good discussion of persistent homology, the reader is invited to look at [Car09] and [ZC05]. The one modification we make is that we only select representatives for [X, Y] which correspond to nontrivial persistence intervals. In other words, in the expression [X, Y] = { Σ_m b_m f_m + Σ_n c_n h_n }, we set the coefficients b_m equal to 1 for significant intervals and 0 for nonsignificant intervals.

6.2. Visualization.
The visualizations in this section show various examples of mappings between simplicial complexes. The domain complexes are on the left, and the codomains are on the right. Colors of the complexes are computed as follows:

• The color of the domain complex is fixed. We start with a map µ : X → [0, 1]³ mapping the vertices of X to their RGB values. The color of an n-simplex for n > 0 is the average of its vertex colors, µ([v_0, ..., v_n]) = (n + 1)^{-1} Σ_i µ([v_i]).
• To compute the color of a simplex τ ∈ Y under the map f : C_*(X) → C_*(Y), we define µ* : Y → [0, 1]³ by µ*(τ) = µ(f*(τ)), where we extend µ linearly over chains in X, and where f* is the adjoint of f. This is analogous to the definition of the pushforward of a measure.

6.3. Examples.
In Figure 2, we show an example of a homotopy representative between a circle with 8 vertices and a circle with 4 vertices. In order to compute the map, a random corner point of the admissible set C was selected. In other words, the map was a random extremal point of the polytope of maps minimizing the maximum row and column sums. The computed map is given in (17), where the block in the upper left corner is the map on the 0-skeleton and the block in the lower right corner is the map on the 1-skeleton:

(17) [block matrix; the numerical entries are not legible in this copy]

In Figure 3 we show an example of a map computed by minimizing the Alexander-Whitney function with quadratic loss. In Figure 4, the figure on the left is a simplicial complex created using the lazy-witness construction from a sample of 500 points on a trefoil knot [dSC04]. A map to a circle with 4 vertices is shown. In Figures 5-7, we explore the applications described in section 5: computing a map to a manifold, density maximization, and contractible datasets. We refer the reader to the captions for more information regarding these.

7. Concluding Remarks
In this paper we have discussed a method for computing maps between two simplicial complexes that respect their homological structure. The computation is done in a two-stage process: first a parameterization is obtained for the homotopy classes of chain maps, and then an optimization procedure is run to select one of the maps from the affine parameterization. We have also demonstrated the method on various examples. Some key distinguishing features in comparison with traditional statistical dimensionality reduction and mapping techniques include:

• The domain and codomain data sets are not required to be Euclidean spaces, or even metric spaces.
• Conventional linear and nonlinear dimensionality reduction methods rely on the fact that the data can somehow be unfolded into a convex subset of Euclidean space. The homological method presented here is designed to preserve nontrivial topological structure.
• Unlike the method of circular coordinates, or various other surface mapping algorithms, the method presented in this paper is in principle not restricted by the dimension or structure of either the domain or codomain spaces.

Nevertheless, the mapping technique in its current state suffers from a few shortcomings:

• There is no universal optimization problem which produces geometrically satisfying maps in all cases. Depending on the application or situation, a practitioner might want to use different objective functions or constraints.
• Since the computation relies on the construction of the hom-complex, the fundamental problem size given simplicial complexes of sizes I and J is the product IJ. This leads to somewhat poor algorithmic complexity in comparison with first-order methods and has limited the sizes of the examples presented.

A key step to improving the applicability of hom-complex based mappings would be to alleviate the problems with its algorithmic efficiency. It would be interesting to investigate what optimizations would enable this method to scale to datasets of more reasonable size. Despite these shortcomings, the examples in this paper are designed to be a proof of concept for hom-complex based mappings.

References

[Car09] Gunnar Carlsson, Topology and data, Bulletin of the American Mathematical Society (2009), no. 2, 255–308.
[CISZ06] Gunnar Carlsson, Tigran Ishkhanov, Vin de Silva, and Afra Zomorodian, On the local behavior of spaces of natural images, Internat. J. Comput. Vision (2006).
[Din09] Yi Ding, Topological algorithms for mapping point cloud data, Ph.D. thesis, Stanford University, September 2009.
[dSC04] Vin de Silva and Gunnar Carlsson, Topological estimation using witness complexes, Eurographics Symposium on Point-Based Graphics (M. Alexa and S. Rusinkiewicz, eds.), The Eurographics Association, 2004.
[dSVJ09] Vin de Silva and Mikael Vejdemo-Johansson, Persistent cohomology and circular coordinates, SCG '09: Proceedings of the 25th Annual Symposium on Computational Geometry (New York, NY, USA), ACM, 2009, pp. 227–236.
[GJ79] M. R. Garey and D. S. Johnson, Computers and intractability: A guide to the theory of NP-completeness, first ed., W. H. Freeman, January 1979.
[Hat02] Allen Hatcher, Algebraic topology, Cambridge University Press, Cambridge, 2002. MR MR1867354
[HS80] H. V. Henderson and S. R. Searle, The vec-permutation matrix, the vec operator and Kronecker products: a review, Linear and Multilinear Algebra (1980), 271–288.
[KBB+08] Leonid Khachiyan, Endre Boros, Konrad Borys, Khaled Elbassioni, and Vladimir Gurvich, Generating all vertices of a polyhedron is hard, Discrete & Computational Geometry (2008), no. 1-3, 174–190.
[ML75] Saunders Mac Lane, Homology, 3rd ed., Springer, October 1975.
[Mun93] James R. Munkres, Elements of algebraic topology, Westview Press, December 1993.
[NLC10] Monica Nicolau, Arnold Levine, and Gunnar Carlsson, Progression analysis of disease, 2010.
[SMC] Gurjeet Singh, Facundo Memoli, and Gunnar Carlsson, Topological Methods for the Analysis of High Dimensional Data Sets and 3D Object Recognition, pp. 91–100.
[TVJA11] Andrew Tausz, Mikael Vejdemo-Johansson, and Henry Adams, JavaPlex: A software package for computing persistent topological invariants, Software, 2011.
[ZC05] Afra Zomorodian and Gunnar Carlsson, Computing persistent homology, Discrete Comput. Geom. (2005), 249–274.

Stanford University, Stanford, CA, 94305
E-mail address: [email protected]

Stanford University, Stanford, CA, 94305

E-mail address: [email protected]

Figure 2.
Simple example of a visualization of a chain map. The map was computed by selecting a random extremal point of the polytope C. An artifact of the visualization method described in section 6.2 is that the colors in the figure on the right are more intense than those on the left. This is due to the fact that the rows of the matrix in equation (17) sum to greater than 1.
Figure 3.
This example shows an icosahedron being mapped to an octahedron. This map was constructed by performing the Alexander-Whitney optimization with quadratic loss. Note that the map was rescaled to prevent the colors in the codomain from being washed out as in Figure 2.
Figure 4.
The shape on the left was created by first randomly sampling 500 points on a trefoil knot. From this, the lazy-witness construction was used to build a filtered simplicial complex as described in [dSC04] on a landmark set of 40 points chosen by sequential max-min selection. The mapping was obtained by randomly selecting 100 vertices of the polytope C and choosing the one with the greatest ratio of 2-norm to 1-norm.

Figure 5.
Manifold map example. The domain, X, consists of a simplicial circle with 60 vertices, and the codomain, Y, consists of a circle embedded in the plane. Given a map f : X → Y, we may compute the embedded coordinates for the points in X as described in section 5.2. The objective here is to find a map from X to Y that minimizes the total distortion across the 1-skeleton of the domain. On the left, the images of the domain points are shown as crosses, whereas the codomain points are shown as circles. On the right, we show the relationship between the original angular coordinates of points in the domain (horizontal axis) and the computed angular coordinates (vertical axis).

Figure 6.
Density maximization example. In the above figure, the codomain points consistof a random sample from the unit circle, and the domain complex is an idealized circle with10 vertices. The locations of the domain points are computed by selecting the homotopyrepresentative that maximizes the density of the image points as described in section 5.3.