A Differential Topological Model for Olfactory Learning and Representation
It’s Amazing it Works at All
Jack Alexander Cook

arXiv preprint [q-bio.NC]

To Janet and Ralph

Acknowledgements

I would like to thank my advisor Thomas Cleland for his patience and guidance throughout my undergraduate career. Without him, this thesis would not have materialized. And to my parents, who supported me always: thank you.

Preface
This thesis is designed to be a self-contained exposition of the neurobiological and mathematical aspects of sensory perception, memory, and learning, with a bias towards olfaction. The final chapters introduce a new approach to modeling, focusing on the geometry of the system as opposed to element-wise dynamics. Additionally, we construct an organism-independent model for olfactory processing: something which is currently missing from the literature.

Chapter 1 serves as an introduction to the basic biology, structure, and functions of the olfactory system and the related regions of the brain. Starting with the nasal cavity, odors excite receptors which in turn relay information to the olfactory bulb (we will often refer to this simply as the bulb). From the bulb, information is sent to piriform cortex, which projects onto a myriad of structures, among them the hippocampus, anterior olfactory nucleus, and amygdala. We discuss neuromodulation and some conjectures about higher-order processing (post bulb).

In Chapter 2, we take a brief aside to discuss some basic algebra, which makes up the first half of the mathematical material needed to understand the later chapters. We begin the tour with set theory, where we lay down the preliminaries on functions, set-theoretic notation, and various definitions which will appear consistently throughout this text. The next stop is group theory, where we study the symmetries of objects and build the notion of an action on a set. We then pass to ring theory, where we discuss ideals, morphisms, and hidden group structures. Rings show up naturally in Chapter 3 and play an important role in the theory of sections. We end the tour of basic structures with a discussion of fields and polynomial rings with coefficients in a field. This leads to a natural discussion of higher-order structures such as vector spaces and modules, the latter being an integral component of the model.

Chapter 3 forms the second half of the mathematical underpinnings for Chapters 4 and 5.
Here we discuss geometry and topology, and give a brief introduction to the theory of categories, sheaves, and differentiable stacks. Topology studies the intrinsic properties of a space endowed with a topology; it concerns itself with ideas such as connectedness, compactness, and continuity. Geometry studies calculus on these spaces and very quickly leads to the ideas of flows, geodesics, and Lie groups. The terminal topics are abstractions of the notions of set and function. These provide a convenient language in which to discuss some of the algebraic invariants attached to a topological space.

Chapter 4 makes up the entirety of the original research of this thesis. We first explore the topological and geometric properties of the physical and perceptual spaces involved in the olfactory system, and discuss how the use of vector bundles and non-canonical maps from a bundle to its base space provides insight into the geometry of the system as a whole. We conclude with future directions of research and unanswered questions, along with some conjectures about the model.

Chapter 5 will focus on potential new areas of investigation. The majority of this chapter covers representation theory and culminates with the Borel-Weil theorem, which realizes representations of certain groups as sections of line bundles. This geometric view of the situation makes it natural to consider sheaves. As the ultimate theorem will tell us, there is some interesting information contained in sheaf cohomology that cannot be accessed through other means.

Contents (excerpt)

R-Mod
3.2 Topology
3.2.1 Topological Spaces and Continuous Maps
3.2.2 Basic Algebraic Topology
3.3 Differentiable Manifolds and Vector Bundles
3.3.1 Smooth Maps and the category Man∞
3.3.2 Sheaves
3.3.3 de Rham Cohomology
A Geometric Framework for Olfactory Learning and Processing
4.2.1 R-Space
4.2.2 Glomerular-layer computations, R′
4.2.3 M-Space
4.2.4 S-Space
4.2.5 Forms and timescales of odor learning
4.2.6 The construction of hierarchical odor categories
4.2.7 "Olfactory space" is not hyperbolic

Chapter 1
Olfaction and the Problem of Learning

This section is intended to be a crash course in the neurobiology of the olfactory system and the various computational aspects of neuroscience. We assume a passing knowledge of general neuroscience: the broad organization of the brain, the structure of a neuron, the biochemistry of action potentials, the existence of neurotransmitters, the structure of a synapse, and feedback loops, at the level of [BW13]. The main goal of this section is to introduce the idea of categorical perception and apply it to olfaction.
Sensory systems are the backbone of human perception and form the only means by which humans (and all animals) can gain information about the outside world. Although the exact number of distinct senses is debated, it is generally agreed upon that humans have 6-7 main ones, which occupy a large portion of the brain and almost all of the cortical space devoted to perception [BW13]. One could spend an enormous number of pages discussing the intricacies of each of these sensory systems and their associated perceptual constructions. As the main focus of this thesis is to understand olfactory processing, we shall only give a broad introduction to the other sensory systems and leave the remaining details to the many references.
Remark 1.1.1.
For the remainder of this chapter, all definitions are operational (they may change between researchers) unless otherwise noted. We shall give some explanation of the definitions in the cases where we deviate from the standard references.

The easiest way to begin an analysis of these systems is to understand the basic neurophysiology.
Definition 1.1.2. A sensory system is a part of the nervous system consisting of sensory neurons, a neural pathway, and a cortical area.

Sensory systems play a key role in every action the body performs: from simple things like standing up straight and picking up a glass of water, to more complex tasks like skiing or identifying someone's face in a dimly lit room. To better understand these objects, let's investigate a few well-known examples.

Example 1.1.3.

(a) Vision: In this case, the sensory neurons are rods and cones. These light-sensitive cells transmit information to the optic nerve, which relays it to visual cortex. In fact, different cells along the pathways from the retina to the occipital lobe have varying receptive fields. This variance contributes to the processing of an image.

(b)
Audition: Vibrations in the basilar membrane, due to sound waves coming into contact with the ear drum, lead to the vibration of hair cells. This motion induces action potentials in the auditory nerve. From here the signal partially decussates on its way to the temporal lobes, where further processing occurs.

We can further divide the sensory systems into those which have chemical stimuli and those which do not. The chemical senses depend on molecular interactions in the sensory neurons to facilitate the transformation from stimulus to perception. In the case of gustation, there are five distinguished "tastes": salty, sweet, sour, bitter, and umami. These all correspond to different molecules interacting with the papillae on the tongue. Something which tastes more "salty" is directly related to the Na+ ions present in the solution of saliva and food. In contrast to this, we have audition. The pertinent objects here are pressure waves in the air, which vibrate the ear drum, which in turn vibrates the bones of the inner ear and causes waves in the basilar membrane. These waves perturb the ends of the hair cells and induce action potentials. The point of these examples is to show that sensory systems have a wide array of possible stimuli.

Now that we have a basic (and grossly vague) description of the structure of a sensory system, one may ask, "what is the purpose of such a system?" Beyond the obvious answer (we need a way to interact with our environment), there are some subtle and incredibly important operations that sensory systems accomplish. The main references for this section are [Har87] and [CL05b].

The main operations we will discuss are learning, representation, and categorization. They are closely related and, from some perspectives, even intertwined. In fact, we can think of learning and representation as disparate ideas, whereas categorical perception seeks, in some sense, to unify these ideas.
The motivation for studying such a construction first originated in vision and speech, with color perception and the categorization of various speech patterns. (We will discuss this at length in the next section; the intent here is to get some intuition for the problems we will be attacking in the later chapters.)

There is an obvious evolutionary advantage to the construction of categories. Typically, stimuli are continuous, or at least abundant enough that any model would function adequately by treating them as continuous. Categorical perception transforms this continuity into a discrete spectrum of perceptions organized by similarity with respect
to some metric. We take the following example from [Har87]. Consider a digital clock which presents the time in 12-hour increments with the use of am and pm. Then twice a day the clock presents 12:00, with the only difference being which signifier is shown. In this way, we can categorize the times on the clock as either times marked by am or times marked by pm.

Now consider another construction of categories, which will be revisited in Chapter 3. If asked to classify the capital letters of the English alphabet, what is an appropriate choice of category? Suppose we choose to split them by the number of holes: any letter with a closed loop has a hole, and having multiple closed loops should split up the categories further. In this schema (which is font specific), the letters are grouped as follows:

{C, E, F, G, H, I, J, K, L, M, N, S, T, U, V, W, X, Y, Z}  {A, D, O, P, Q, R}  {B}

So we have three distinct categories: no holes, one hole, two holes. Notice that we can be a bit more general about this, however. If we consider only letters that have a hole and letters which do not, we have just two categories; inside the holed category we get a sub-category consisting of letters which have multiple holes.

The point of these examples is to illuminate the idea that categories seek to simplify the stimuli. It is much easier to think about letters with or without holes than about all of the letters simultaneously. Due to this, it should be no surprise that a majority of current research in sensory systems is devoted to understanding the process of categorization. This is precisely what we will investigate in the upcoming sections, and it is the topic of Chapter 4. We can think of completing a category C by adding to it all of the points which are "infinitesimally close" to C.
If we think about this in a geometric way, completing an open disc amounts to adding the circle which bounds it.

Before diving into the world of olfaction, we need one more general function of sensory systems: generalization. In 1987, Shepard introduced the idea of generalization for perceptual spaces. It is a different method of categorization, but one which depends on minimal learning. We shall call this perceptual generalization. Here is a more formal definition.

Definition 1.1.4. Perceptual generalization is the process by which a sensory system (in particular the neural pathway) constructs a broad category for a given stimulus, based solely on the learning of one (or a few) stimuli.

In the figure below (Figure 1.1), we show the first examples of perceptual generalization. This process seems to be a method of producing categories for unlearned or partially learned stimuli. The method of generalization is to extrapolate information from one stimulus and use it to "learn" something about its nearest neighbors in the perceptual space.

Perceptual generalization can be thought of as a pseudo-prior to categorization. Before the system can split things into clean, discrete categories, it needs to build the objects of the perceptual space (the things to categorize). Once this is done, but still before discretization, the system has to understand the boundary of each perceptual object. (This will be interpreted in Chapters 3 and 4 as the boundary of a topological subspace of the perceptual space. This notion will allow us to give a more formal definition than the vague one given here and will also lead to a clean method of discretizing the perceptual space.)

Definition 1.1.5. The perceptual boundary (sometimes shortened to just boundary) of a category is the collection of points which are extreme in the category. That is, these are the points which are only present in the completion of the category.

One of the important themes of the research surrounding sensory systems is that of distinguishing the boundary of the perceptual space from the various percepts it contains [Har87, Chapter 1, Section 2]. This may seem like an easy task, but in the abstract it is surprisingly subtle.
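As a toy illustration, the letter-classification schema described earlier can be computed directly. This is a sketch of ours, not part of the original text; the hole counts are the font-specific assumptions stated above.

```python
import string
from collections import defaultdict

# Font-specific hole counts assumed in the text: B has two closed loops,
# A, D, O, P, Q, R have one, and every other capital letter has none.
HOLES = {"A": 1, "B": 2, "D": 1, "O": 1, "P": 1, "Q": 1, "R": 1}

def categorize_by_holes(letters):
    """Group letters into perceptual categories by number of holes."""
    groups = defaultdict(set)
    for ch in letters:
        groups[HOLES.get(ch, 0)].add(ch)
    return dict(groups)

categories = categorize_by_holes(string.ascii_uppercase)

# Coarsening: merge the one-hole and two-hole groups into a single
# "has a hole" category, recovering the two-category schema.
holed = categories[1] | categories[2]
```

Merging the one- and two-hole groups recovers the coarser two-category schema, with {B} surviving as a sub-category inside the holed class.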
The olfactory system can be broken up (coarsely) into two main regions: the olfactory bulb and piriform cortex. We shall focus on the olfactory bulb, as the piriform cortex is much less understood. As the following figure (Figure 1.2) shows, the olfactory bulb is divided into several layers. Each plays a key role in the transmutation of the physical stimulus into a usable perceptual object. As we still do not understand the full functionality of each layer individually, we will treat them as separate objects and present what we do know about each.

In the same flavor as the previous section, we need to specify what the sensory neurons are. In Figure 1.2, the layer marked OE (olfactory epithelium) and the colored receptors are precisely the olfactory sensory neurons (abbreviated OSNs). These are chemical receptors, and their level of activation (spike frequency) is directly proportional to the binding affinity of the odorant. This binding information is processed at a variety of places before being sent off to piriform cortex and other higher-order brain regions. The main layers we concern ourselves with here are GL, EPL, and
GCL. In contrast with other modalities (such as vision), olfaction is intrinsically high-dimensional, and this high dimensionality is consistent across species. In humans there are roughly 350 different types of OSNs, whereas in mice there are upwards of 1000 [Cle14]. Each distinct type of OSN converges to a tangle of neuropil which is roughly spherical in shape. We call these tangles glomeruli, and the layer consisting of all of them is GL.

The largest cells protruding from the glomeruli are mitral cells. These pyramidal cells embody the immediate connection of the OB to piriform cortex. In Figure 1.2 the mitral cells are drawn in one-to-one correspondence with the glomeruli and are seen to sample from only one glomerulus; this is false in general. It turns out that in most mammalian tetrapods the mitral cells do indeed sample from a single glomerulus, but in [MNS81a] and [MNS81b] it was shown that in some turtles and reptiles the mitral cells can sample from a variety of glomeruli. The benefits of this cross-sampling are still not well understood.

The final change of information in the OB is the modification by the granule cells. These make inhibitory synapses which delay the mitral cell action potential [Cle14]. It is thought that these synapses play a large role in the formation of the perception of an odorant, but no published research has looked at this yet. We do know, however, that learning is related to granule cell firing patterns.

Figure 1.2: Schematic of the mammalian olfactory bulb microcircuitry. Layers are identified, and are arranged bottom to top, external to internal.

With repeated trials of an odorant, the number of granule cells which fire decreases monotonically. Each consecutive trial leads to a more specific and more refined response. This fits with Shepard's ideas on generalization. This specialization implies, however, that granule cells can become specified fairly quickly, and thus the brain should "run out" of possible specificity.
That is, in theory, the granule cells could become specified to only one odorant. This, however, would be a poor allocation of energy and would require the genesis of a horde of new granule cells for each variant of the same odor. Some recent work [MLE+09] has shown that granule cells do exhibit adult neurogenesis, which is incited from the piriform cortex. This neurogenesis is the reason it is thought that granule cells play an important role in the building of perception, and the reason we can learn odors well into our adult years. It does not, however, remove the flawed idea that granule cells can become highly specialized.

Now that we are acquainted with the general form of the OB, we need to discuss the general schema of processing. In GL, the periglomerular cells (PGs) and superficial short-axon cells (sSAs) are thought to be the cells which begin the construction of perceptual categories. The evidence for this comes from recent work [BC19] showing that learning can occur at the glomerular layer and not just at the granule cell layer. This implies, at a minimum, that the purpose of the early (exterior) layers of the OB is to normalize data and to reduce noise in the sensation. Further, they increase contrast between similar odorants. A nice analogy is the existence of edges in visual perception. There is an enormous amount of cortical space allocated to the processing of edges. This helps build a better image, and in the same way, contrast enhancement in the bulb "builds a better odor."

The common theme to keep in mind for this system is that of sampling with noise. Every layer samples from the previous one in order to arrive at a more specific perceptual category at the end. With the introduction of some noise (variance), we can eliminate some of the theoretical problems of the system. For instance, the granule cell hyper-specification from before can be formally disregarded, as the inherent noise of the odorants will not allow for the accuracy necessary to determine "exactly" what the odorant is. What this does tell us, however, is that we can combine the notions of categorical perception and generalization for this system to arrive at what we shall call Categorical Generalization.
At a first pass, this idea is the construction of a generalized perceptual category: given some learning set, all contained in the same perceptual category, it is the extension of the learning set via the rules of generalization set forth in [She87].

Definition 1.2.1. Let O be an odorant and O the corresponding perceptual category generated by learning on O. Then the Generalized Category GO is the perceptual category which extends O by generalizing its boundary.

To understand this idea further, we shall investigate computational models of the olfactory system and see how the introduction of this concept motivates our model constructed in Chapter 4.

Models of the olfactory system come in two main types: anatomical and theoretical (perceptual). The anatomical models focus on understanding the biochemistry and spike timing of the OSNs and related cells of the OB, whereas perceptual models tend to be fantastical speculation on the "perceptual space" of the system [CL05a] [ET10]. In Chapter 4, we shall propose a model which has the advantage of being a mixture of the two, while also being mathematically elegant. Before then, let us understand some of the current problems with modeling and what makes the olfactory system significantly different from the other sensory systems.

Each flavor of OSN has a different receptive field, and thus we can consider the "space of possible physical inputs" to be some collection of points in a 350-dimensional space, with each "dimension" defined by a different OSN receptive field. Compared to the three dimensions of vision, this is monstrous. This aspect of the olfactory system makes studying it substantially different from other modalities. One feature, for example, is that distances tend to increase with the dimension. What we mean is the following: consider the unit sphere in an even-dimensional space of dimension 2k. The cube of side length 2 centered at the origin has volume 2^{2k}, whereas the volume of the unit ball sitting inside this box is π^k / k!. So as k increases, the volume of the unit ball actually decreases. What this tells us is that, proportionally, in higher dimensions more points lie outside the unit sphere than inside.
The importance of the above observation cannot be overstated. It implies that there is theoretically an incredibly large number of possible odorants detectable by the OSNs, and it decreases the probability that any two odorants which are chemically different will be identified as similar. We can go one step further and say that the physical marker of an odorant, and the sensation thereof, is a large determining factor in the construction of the perception of that odor.

The many thousands of OSNs converge onto glomeruli, of which there are exactly as many as there are receptor types (∼350 in humans).

Remark 1.2.2.
The remainder of this section will be dedicated to the modeling of mitral and granule cells. These two cell types occupy a majority of the mental theatre of researchers in this field, as they are the most mysterious cells in the olfactory bulb.

We begin with mitral cells. Compared to the roughly 350 glomeruli, there are about 3500 mitral cells (in humans), and even more in some mammalian tetrapods. The key feature of mammalian tetrapods is the independent sampling of each mitral cell from a distinct glomerulus. As mentioned above, this is not always the case, and due to this fact, modeling these cells is a delicate procedure. Most authors elect to simply ignore the potential cross-sampling.

As with most modeling, the early approaches were through linear algebra (see Chapter 2) and some form of calculus [ET10]. The type of modeling which makes extensive use of calculus is not particularly helpful for building an understanding of the perceptual space as a geometric object. The use of linear algebra, though, is quite important in the construction of a perceptual space. In [ZVM+13] and many others, mitral cells are modeled as vectors in a Euclidean geometry. The important part here is the type of geometry chosen. Euclidean geometry is inherently the most restrictive, as it assumes no curvature in the perceptual space.
Example 1.2.3.
To see why a Euclidean geometry is restrictive, consider two points on a piece of paper. Let d be the distance separating the points. Given any transformation of the paper which retains its flatness (a rotation or reflection), the distance between the points will stay the same. Now let us introduce a fold into the paper. This can bring the points closer together in the ambient three-dimensional space, but their distance along the paper will not change. If instead of a fold we make a smooth change, this is precisely the introduction of curvature.

Nonetheless, this choice of model has repeatedly been shown not to be useful. Simply speaking, perceptual distances do not sit well inside a linear space. It is convenient, however, to have the mathematical ease of a Euclidean space. For this reason, current research (such as [CPO18]) has begun to try to understand manifolds (see Chapter 3 for a definition) and their applications to sensory processing. These are objects which "look like" Euclidean space on a local scale. The advantage of these spaces is that we can introduce curvature to the perceptual space while still retaining a linear structure on the tangent space at every point. In fact, the problem with Euclidean space is not unique to it: any space of constant curvature will have the same deficit. We recommend running through the example above but exchanging the piece of paper for a ball or a saddle; this gives the other two types of spaces of constant curvature. Even though the above approach is flawed, some interesting results have appeared in other modalities [MR07] that imply we may want to consider vector-like mitral cells in olfactory system models. Furthermore, the use of some high-level algebra and differential geometry has led to the investigation of certain mathematical objects called Lie groups (see Chapter 3 for a definition).
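The paper-folding example can be made concrete with the simplest curved embedding, a circle in the plane (our illustration, not the author's): the ambient straight-line distance between two points generally disagrees with the intrinsic distance measured along the space itself.

```python
import math

def chord_distance(t1: float, t2: float, r: float = 1.0) -> float:
    """Ambient (straight-line) distance between two points on a circle
    of radius r, parameterized by angle."""
    return math.hypot(r * math.cos(t1) - r * math.cos(t2),
                      r * math.sin(t1) - r * math.sin(t2))

def geodesic_distance(t1: float, t2: float, r: float = 1.0) -> float:
    """Intrinsic distance measured along the circle itself (arc length)."""
    d = abs(t1 - t2) % (2 * math.pi)
    return r * min(d, 2 * math.pi - d)
```

For antipodal points the ambient distance is 2 while the intrinsic distance is π: a model that measures similarity with straight-line distances in the embedding space will systematically disagree with one that respects the geometry of the perceptual space.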
Lie groups play an important role in mathematics and physics, so it is no surprise that they have shown up in neuroscience as well.

We now turn our attention to granule cells. One large mystery surrounding them is the aforementioned adult neurogenesis. It was shown in [MLE+09] that in order for the olfactory system to function at its current level of accuracy, adult neurogenesis is necessary. Some have argued, however, that all evidence of adult neurogenesis is actually a remnant of embryonic stem-cell differentiation. We shall not contest either position here, as the data is inconclusive either way. On a different note, granule cells are believed to be the workhorses of olfactory learning [Cle14]. These cells inhibit the action potentials of the far larger mitral cells and contribute to the variance in spike timing seen across the bulb for different odors. It should also be noted that there are orders of magnitude more granule cells than mitral cells. The exact mechanism of mitral cell inhibition is up for debate; however, it is clear that the piriform cortex plays some critical role in the excitation-inhibition loop. Surprisingly, models tend not to deal with the subtle intricacies of granule cell inhibition. One possible explanation is that granule cells act only locally, in contrast to mitral cells, which can inhibit relatively far-away neighbors. This local action is not readily dealt with in computer models, and combining it with the relatively global action of mitral cells (sometimes having to intertwine the two) has been a blockade for some time now.

As one final question for this chapter, we want to define the perceptual categories in olfaction. Given an odorant, the generalized category associated to that odorant is the result of the generalization gradients above. In practice, one should think of this in the following way: suppose O is the odorant (or combination thereof) corresponding to an orange. Then the generalized category of unlearned oranges may encompass all citrus fruits. This is clearly too broad to be of use when differentiating particular species of orange, or even ripeness.
Therefore, we know that there must be some mechanism (granule cell interactions) which restricts the size of the generalized categories so that they are of use for identification. In fact, as we shall see in Chapter 4, we have proposed a way of generating specific hierarchies from such general data, given some non-zero amount of learning. Geometrically, we can view this as constructing a rough approximation of the perceptual space which somehow encodes the differences between distinct classes of odorants.

This completes the brief introduction to the computational neuroscience of olfaction.

Chapter 2
An Introduction to Algebra

Here we lay down the basics of set theory, its notation, and how it is used in practice. We start with a definition.
Definition 2.1.0. A set S is any collection of elements (normally denoted with the corresponding small letter) with cardinality some ordinal. The order (size/cardinality) of a set S is the number of elements of S, denoted |S|.

We have the natural notion of a subset, denoted T ⊆ S. If T is strictly smaller than S, then we write T ⊊ S. The collection of all subsets of a set S is called the power set and is denoted P(S). Some classic examples of sets are the natural numbers, denoted N = {0, 1, ...}, and the integers, denoted Z = {0, 1, −1, 2, −2, ...}.

Some more interesting sets are Q, R, C: the sets of rational, real, and complex numbers respectively. Notice that N ⊊ Z ⊊ Q ⊊ R ⊊ C. For this reason, unless specified, we will use C in examples.

Additionally, we can define intersections and unions of sets. If S, T are two sets, we define their intersection S ∩ T = {x : x ∈ S and x ∈ T} and their union S ∪ T = {x : x ∈ S or x ∈ T}. Further, if T ⊆ S, we can define the complement of T in S to be T^c = S − T = {s ∈ S : s ∉ T}.

Definition 2.1.1.
Let X, Y be two sets. We define the Cartesian product, denoted X × Y, as the set of all ordered pairs of elements of X and Y. That is,

X × Y = {(x, y) : x ∈ X, y ∈ Y}.

Example 2.1.2. Let X = {1, 2} and Y = {a, b}. Then X × Y = {(1, a), (1, b), (2, a), (2, b)}.

For finite sets, it is easy to see that |X × Y| = |X||Y|: for each element x ∈ X we can look at the subset {x} × Y ⊆ X × Y; each of these sets has size |Y|, and as there are |X| choices for x, the claim follows.

Definition 2.1.3. A function f : S → T is a mapping between sets which assigns to each element s of the source space S an element f(s) = t ∈ T. For this reason, we call S the domain of f, and T the codomain of f. Denote by f⁻¹(t) = {s ∈ S : f(s) = t} the pre-image of t under f.

We can compose functions, assuming the codomain of the first is contained in the domain of the second. We can actually relax this requirement: it suffices that the image of f, denoted Im f, is contained in the domain of g.

Notice that f may not hit every element of T: that is, there may exist some t ∈ T such that t ≠ f(s) for any s ∈ S. The following sister definitions provide us with insight into this exact situation.

Definition 2.1.4 (Injective, Surjective, and Bijective). Let f : S → T be a function.

f is called injective if f(s) = f(s′) implies (denoted =⇒) that s = s′.

f is called surjective if for all (denoted ∀) t ∈ T, there exists at least one s ∈ S such that f(s) = t.

A function which is both injective and surjective is called bijective.

Example 2.1.5.
Let f : Z → Z be defined by f(n) = 2n. Then f is injective trivially. f is not surjective, as an odd number l = 2k + 1 is not equal to 2n for any n ∈ Z.

For an example of a surjective map, consider the absolute value function |·| : Z → N, with f(z) = f(−z) = |z|. In more standard notation, one writes z ↦ |z|.

Proposition 2.1.6.
Let f : A → B and g : B → C be injective (respectively surjective, bijective) functions. Then g ∘ f : A → C is injective (resp. surjective, bijective).

Proof. (Injectivity) Suppose that (g ∘ f)(a) = (g ∘ f)(a′). As g is injective, we know that f(a) = f(a′). Now, as f is injective, we have that a = a′.

(Surjectivity) Let c ∈ C. As g is surjective, we know that c = g(b) for some b ∈ B. As f is surjective, we have that b = f(a) for some a ∈ A. Thus, for all c ∈ C there exists at least one a ∈ A such that g ∘ f(a) = c. As bijectivity is the combination of the previous two statements, this completes the proof.

(The symbol ∈ is to be read as "an element of." If we use the symbol ∉, the slash means "not." For example, −2 ∈ Z should be read as "−2 is an integer," while −2 ∉ N should be read as "−2 is not a natural number.")

Theorem 2.1.7. Let f : X → Y be a bijective function. Then there exists a map g : Y → X such that f ∘ g = Id_Y and g ∘ f = Id_X.

Proof.
Define g : Y → X as g(y) = f⁻¹(y). This is well defined, as f is bijective: y ∈ Im f and ∃! x ∈ X such that f(x) = y. Then g ∘ f(x) = f⁻¹(f(x)) = x by bijectivity of f. Further, f ∘ g(y) = f(f⁻¹(y)) = y by bijectivity. Hence, g satisfies the properties and we are done.

Definition 2.1.8.
Let X be a set. We say E ⊆ X × X is an Equivalence Relation on X if the following properties hold:

(a) (x, x) ∈ E for all x ∈ X.
(b) If (x, y) ∈ E then (y, x) ∈ E.
(c) If (x, y), (y, z) ∈ E then (x, z) ∈ E.

We call these properties reflexivity, symmetry, and transitivity respectively. It is common practice not to write E as a set of ordered pairs but rather to write x ∼ y if (x, y) ∈ E. We then say ∼ is an equivalence relation on X. Further, let [x] (also denoted x̄ in some cases) be the set of all elements y ∈ X such that x ∼ y. We call [x] the Equivalence Class of x. We denote the set of equivalence classes as X/∼.

Lemma 2.1.9.
Let ∼ be an equivalence relation on a set X. Then ∼ induces a partition of X via equivalence classes. This is equivalent to saying that for all elements x, y ∈ X, either [x] = [y] ∈ X/∼ or [x] ∩ [y] = ∅, the empty set.

Proof. Suppose [x] ≠ [y] and [x] ∩ [y] ≠ ∅. Let w ∈ [x] ∩ [y]. Then x ∼ w and y ∼ w. Using the symmetry and transitivity of ∼, we have that x ∼ y. Therefore [x] = [y], a contradiction. Hence, either [x] = [y] or [x] ∩ [y] = ∅ for all x, y ∈ X.

Example 2.1.10.
Let Z denote the set of integers as above. Fix some n ≥ 0. Define a ∼ b if a − b = kn for some integer k. The space Z/∼ := Zₙ is called the set of integers modulo n. Notice that Zₙ = {0, 1, 2, ..., n − 1}. Define the operation (·) mod n : Z → Zₙ, which sends k ∈ Z to [k], which is equivalent to its remainder after dividing by n.

We have opted to start this section with a few examples to introduce the idea of a group before giving the rigorous definition.
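These constructions are easy to experiment with computationally. The following sketch (ours, not from the original text) checks that the map (·) mod n identifies a ∼ b exactly when a − b is a multiple of n, and that the resulting set of classes is {0, 1, ..., n − 1}:

```python
def mod_n(k, n):
    """The canonical map Z -> Z_n, sending k to its remainder after dividing by n."""
    return k % n

n = 5
# Z_n consists of exactly the classes {0, 1, ..., n - 1}
assert {mod_n(k, n) for k in range(-50, 50)} == set(range(n))

# a ~ b  iff  a - b = k*n for some integer k  iff  mod_n(a, n) == mod_n(b, n)
assert all(((a - b) % n == 0) == (mod_n(a, n) == mod_n(b, n))
           for a in range(-20, 20) for b in range(-20, 20))
```

Note that Python's `%` already returns a representative in {0, ..., n − 1} even for negative inputs, which is exactly the canonical representative of the equivalence class.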
Example 2.2.1.

(a) Z. We can define + : Z × Z → Z by (a, b) ↦ a + b. Clearly if a ≠ 0, then −a exists and is different from a. Further, a + (−a) = 0. This makes 0 the additive identity in Z.

(b) Let Dₙ denote the set of symmetries of the regular n-gon. Then it is left as an exercise to the reader to prove that |Dₙ| = 2n. Note that we can compose two such symmetries. Take for example the case n = 4. Let the rotation by 90° counterclockwise be denoted r = R₉₀ and the vertical reflection s. Then rs is the reflection along the primary diagonal. There is an identity element e = R₀, the rotation by 0°.

(c) Let C× denote the set of all non-zero complex numbers. Then we can define · : C× × C× → C× by (w, z) ↦ w · z = wz, the standard complex multiplication. Here 1 is the multiplicative identity.

With these examples in mind, we can now define groups in more abstraction. In general, one can think of groups as symmetries of some object, be it an n-gon or some set. We will make this more precise.

Definition 2.2.2.
Let G be a set and µ : G × G → G a binary operation such that:

(a) For all x, y, z ∈ G, µ(x, µ(y, z)) = µ(µ(x, y), z).
(b) There exists e ∈ G such that µ(e, g) = g = µ(g, e) for all g ∈ G.
(c) For all g ∈ G there exists h ∈ G such that µ(g, h) = µ(h, g) = e.

We commonly denote µ(g, h) as gh when the operation is clear. Further, the last condition tells us that every element has an inverse, and we denote g⁻¹ := h from that condition. We call G equipped with µ a Group and denote it (G, µ). We say a group is Abelian if for all g, h ∈ G, we have that gh = hg.

Remark 2.2.3.
Other common notations for groups are (G, ·) and (G, ⋆), where · and ⋆ denote the multiplication operations.

It should now be obvious that (a) and (c) in Example 2.2.1 are examples of groups (i.e. every integer has an inverse, namely its negative, and every non-zero complex number is invertible). For (b), notice that applying r a further n − 1 times, we get e. Therefore rⁿ = e and rⁿ⁻¹ = r⁻¹. Further, s² = e, and so s is its own inverse.

Now we lay down some important non-examples. These, for various reasons, violate one or many of the group axioms.

Non-Example 2.2.4.

(a) Consider Z, Q, R under standard multiplication. Z is not a group, as all elements other than ±1 are not invertible, since 1/n is not an integer for |n| > 1. Why do Q and R fail?

(b) (Integers Modulo n) Let Zₙ denote the set of integers {0, 1, ..., n − 1} together with multiplication modulo n. Multiplying modulo n means that we first multiply the numbers using normal arithmetic and then "remove" n as many times as possible; the remaining number is their product. For an example, let n = 5; then 3 · 4 = 12 ≡ 2 (mod 5). That is, (·) mod n precisely gives the remainder when dividing by n. Under this multiplication operation not every element has an inverse, namely 0. For n ≠ p a prime number, we can find other elements which are not invertible. Take for instance n = 6, where 2 · 3 ≡ 0 (mod 6), so neither 2 nor 3 can be invertible.

Lemma 2.2.5.
For any group (G, ·), inverses are unique. Further, the identity element is unique.

Proof. Let g ∈ G. Suppose there exist h, h′ ∈ G, h′ ≠ h, both inverses for g. Then on one hand we have that h′gh = (h′g)h = eh = h; on the other hand we have that h′gh = h′(gh) = h′e = h′. Therefore h′ = h, a contradiction. Hence, h = h′ is the unique element such that gh = hg = e. To see that the identity element is unique, use the same process as above. This completes the proof.

Remark 2.2.6.
For the remainder of the text, we will refer to groups by the underlying set, (G, ·) := G, when the multiplication is understood and there is no room for confusion. This is standard notation, and in most cases the multiplication is well understood. We will specify the multiplication when we have a choice of operation.

Corollary 2.2.7.
Let g, h ∈ G be any elements. Then (gh)⁻¹ = h⁻¹g⁻¹.

Corollary 2.2.8.
Suppose G is a group such that every non-identity element is an involution (that is, g² = e). Then G is abelian.

Proof. This proof is left as an exercise to the reader. Hint: Using the fact that inverses are unique, realize that x² = e ⟹ x = x⁻¹ for all non-identity elements.

In practice, it can be hard to know if a given set is indeed a group. The following theorem is integral in identifying groups from abstract sets.

Theorem 2.2.9.
Let G be a set equipped with an associative binary operation and suppose there exists e ∈ G with the following properties:

(a) ge = g for all g ∈ G.
(b) For every g ∈ G there exists h ∈ G such that gh = e.

Then G is a group.

Proof. For g ∈ G, pick h ∈ G as in (b). Then it suffices to show that eg = g and hg = e. Using (b) again for h, we can find an element i ∈ G such that hi = e. Then

g = ge = g(hi) = (gh)i = ei = i.

Therefore, hg = h(ei) = (he)i = hi = e, as desired. Now, g = ge = g(hg) = (gh)g = eg. This completes the proof.

This theorem gives us a criterion to check whether or not a set X is actually a group. In practice, this is much more convenient to check than the entirety of the group axioms. An example of this is the set N under addition. We have an identity element 0. However, we cannot find n⁻¹ = −n for any non-zero element. Therefore N is not a group. However, N is the prototypical example of a Semi-Group: a set which has an associative, unital binary operation where not every element has an inverse. These objects will play a role in chapter 6 when discussing Toric Varieties.

The following lemma is provided for ease with later proofs. It gives a criterion for a subset H ⊆ G to be a subgroup.

Lemma 2.2.10 (Subgroup Criterion). Let H ⊆ G be any non-empty subset. If xy⁻¹ ∈ H for all x, y ∈ H, then H is a group and thus a subgroup of G.

Proof.
Assume xy⁻¹ ∈ H for all x, y ∈ H. If x = y, then xx⁻¹ = e ∈ H. Associativity is clear, as the multiplication is inherited from G. To show every element is invertible, consider x ∈ H and e; then ex⁻¹ = x⁻¹ ∈ H. Using this, consider x and y⁻¹. Then x(y⁻¹)⁻¹ = xy ∈ H. Therefore · : H × H → H defines an associative, binary, unital, and invertible map. Hence, H is a group, and in fact a subgroup of G.

Now that we have the basic objects of this section, we can consider maps between them. Note in the following definition, the maps act as you would expect: preserving the structure of both groups.
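The subgroup criterion is easy to test on a finite example. Below is a minimal sketch (ours, not from the original text; the concrete permutation representation of D₄ is our own choice) that builds D₄ as permutations of the square's vertices, confirms |D₄| = 2 · 4 = 8, and verifies the criterion xy⁻¹ ∈ H for the subgroup of rotations:

```python
# D4 as permutations of the square's vertices 0, 1, 2, 3 (listed in cyclic order).
# A permutation is a tuple p with p[i] = image of vertex i.
e = (0, 1, 2, 3)
r = (1, 2, 3, 0)   # rotation by 90 degrees counterclockwise
s = (1, 0, 3, 2)   # a reflection (swaps 0 <-> 1 and 2 <-> 3)

def compose(f, g):
    """(f o g)(i) = f(g(i))."""
    return tuple(f[g[i]] for i in range(4))

def inverse(f):
    inv = [0] * 4
    for i, fi in enumerate(f):
        inv[fi] = i
    return tuple(inv)

def generate(gens):
    """Close {e} under right-multiplication by the generators (enough for a finite group)."""
    G, frontier = {e}, {e}
    while frontier:
        frontier = {compose(g, h) for g in frontier for h in gens} - G
        G |= frontier
    return G

D4 = generate({r, s})
assert len(D4) == 8          # |D_n| = 2n with n = 4

# Subgroup criterion on H = <r>, the rotations: x * y^{-1} stays in H.
H = generate({r})
assert len(H) == 4
assert all(compose(x, inverse(y)) in H for x in H for y in H)
```

Closure under multiplication by the generators suffices here because in a finite group every element has finite order, so inverses arise automatically as powers.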
Definition 2.2.11.
Let ϕ : (G, ·) → (H, ⋆) be a map. ϕ is a Group Homomorphism (or a Morphism of groups, see Ch. 3) if for all g, g′ ∈ G, we have that

ϕ(g · g′) = ϕ(g) ⋆ ϕ(g′).

That is, ϕ is equivariant with respect to the multiplication operations on G and H. A group homomorphism which is bijective is called a Group Isomorphism. If such a map exists, then the domain and codomain groups are said to be isomorphic, denoted G ≅ H.

Example 2.2.12.
1. Then define f : Z → G by f ( m ) = i m . Thismakes f a group homomorphism as f ( m + n ) = i m + n = i m · i n = f ( m ) · f ( n ) (b) Let R denote the set of all real numbers and R × denote the set of non-zero realnumbers. Define g : R × → R t imes by x (cid:55)→ x . Then g is a homomorphism as ( xy ) = x y for real numbers. We encourage the reader to investigate how chang-ing the domain and/or range to R ≥ changes the properties of the homomorphism.We now lay down two definitions which are integral to the study of algebra and haveanalogs in all other branches of mathematics. Definition 2.2.13.
Let H ⊆ G be a set contained in a group G. We call H a Subgroup if for all h, h′ ∈ H, h · h′ ∈ H and h⁻¹ ∈ H. We denote subgroups using the notation H ≤ G. Denote by gH = {gh : h ∈ H} for any g ∈ G. Then we call a subgroup Normal if gHg⁻¹ = H for all g ∈ G, and write H ⊴ G. Denote by Z(G) = {g ∈ G : gh = hg, ∀h ∈ G} the Center of G. It should be obvious that Z(G) is a normal subgroup of G.

Definition 2.2.14.
Let ϕ : G → H be a group homomorphism. Define the Kernel of the homomorphism ϕ to be

ker ϕ = {g ∈ G : ϕ(g) = e_H}.

This is the set of all elements which are annihilated under the mapping ϕ.

Proposition 2.2.15.
The set ker ϕ is a group. In particular, it is a normal subgroup of G . Proof.
We first show that ker ϕ is non-empty. Let e_G ∈ G be the identity. We claim that ϕ(e_G) = e_H. To see this, recognize that

ϕ(e_G) = ϕ(gg⁻¹) = ϕ(g) ⋆ ϕ(g⁻¹) = ϕ(g) ⋆ ϕ(g)⁻¹ = e_H

for all g ∈ G. Thus, ker ϕ is nonempty. As ker ϕ ⊆ G, ker ϕ inherits multiplication from G. Notice that for x, y ∈ ker ϕ, we have that

ϕ(x · y) = ϕ(x) ⋆ ϕ(y) = e_H.

Therefore ker ϕ is closed under multiplication. Further, it is closed under inverses for the same reason. Hence, ker ϕ is a group and ker ϕ ≤ G. To check normality, notice that for x ∈ ker ϕ,

ϕ(gxg⁻¹) = ϕ(g) ⋆ ϕ(x) ⋆ ϕ(g)⁻¹ = ϕ(g) ⋆ e_H ⋆ ϕ(g)⁻¹ = e_H

for all g ∈ G. Hence, ker ϕ ⊴ G.

As shown by the proof above, the homomorphism condition is quite restricting and powerful. We used the fact that ϕ(g⁻¹) = ϕ(g)⁻¹ for group homomorphisms; it is left to the reader to check this fact. Now, we have the following result, which is important when proving other theorems.

Theorem 2.2.16.
Let ϕ : G → H be a group homomorphism. Then ϕ is injective if and only if ker ϕ = {e}.

Proof. (⇒) Assume that ϕ is injective; that is, ϕ(x) = ϕ(y) ⟹ x = y for all x, y ∈ G. Let g ∈ ker ϕ. Then ϕ(g) = e_H = ϕ(e), and by injectivity g = e. Hence ker ϕ = {e}.

(⇐) Assume now that ker ϕ = {e}, and suppose ϕ(g) = ϕ(h). This tells us that ϕ(gh⁻¹) = ϕ(g)ϕ(h)⁻¹ = e_H. Therefore gh⁻¹ ∈ ker ϕ. As ker ϕ = {e}, we know gh⁻¹ = e and hence g = h. This completes the proof.

Example 2.2.17.
Let G be a simple group (that is, the only normal subgroups are {e} and G itself). Then any homomorphism f : G → H is either injective or trivial. This follows from Proposition 2.2.15 and Theorem 2.2.16.

Just as with sets, we can build the Cartesian product of groups G and H, denoted G × H. As a set it is precisely the set G × H, but now we endow it with a group structure taken component-wise. That is,

(g₁, h₁)(g₂, h₂) = (g₁g₂, h₁ ⋆ h₂).

For a concrete example, consider the set Z × R× (R× is the group of all non-zero real numbers under multiplication). Here

(k₁, r₁)(k₂, r₂) = (k₁ + k₂, r₁r₂),

as the multiplication in Z is addition. At this point, we have the ability to construct a group, transition between groups, and "multiply" groups to make new ones. Just as with high-school algebra, we can now consider dividing, or taking quotients of groups.
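Kernels and Theorem 2.2.16 can be illustrated with the reduction map Z → Zₙ (a standard choice of example, ours rather than the author's):

```python
n = 6

def phi(k):
    """The reduction homomorphism Z -> Z_n."""
    return k % n

window = range(-30, 31)

# ker(phi), within a finite window of Z, is exactly the multiples of n
assert [k for k in window if phi(k) == 0] == [k for k in window if k % n == 0]

# homomorphism property: phi(a + b) = phi(a) + phi(b), computed in Z_n
assert all(phi(a + b) == (phi(a) + phi(b)) % n
           for a in range(-10, 10) for b in range(-10, 10))

# ker(phi) != {0}, so phi is not injective (Theorem 2.2.16): 1 and 1 + n collide
assert phi(1) == phi(1 + n)
```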
Definition 2.2.18.
Let G be a group and H any subgroup. We denote by G/H (resp. H\G) the set of all left (resp. right) Cosets

gH = {gh : h ∈ H}

under the equivalence relation that g ∼ g′ ⟺ g = g′h for some h ∈ H. This is in general not a group, as multiplication is not well defined.

Notice how the notation for this set of cosets is the same notation we use for equivalence relations on a set. The reason for this is that when we take left (right) cosets, we are essentially glueing G along the orbits of the subgroup H.

The first question one can ask about this set is when does it become a group? In other words, for what H ≤ G is G/H a group? The following theorem provides an answer.

Theorem 2.2.19.
Let G be a group and N a subgroup. Then G/N (read "G mod N") is a group under the operation (gN)(hN) = (gh)N if and only if N ⊴ G. Further, there is a canonical homomorphism G → G/N which sends g ↦ gN, such that ker(G → G/N) = N.

Proof.
We first need to show that the proposed group operation is well defined. Suppose xN = gN and yN = hN. These two statements are equivalent to x = gn and y = hn′ for some n, n′ ∈ N. Then, using normality of N (so that nh = hn″ for some n″ ∈ N),

(xy)N = (gnhn′)N = (ghn″n′)N = (gh)N.

Thus the multiplication is well defined.

(⇒) Now assume N is normal in G. Multiplication is associative by definition, and the unit element is eN. It remains to show that gN has an inverse and that it is unique. Let g⁻¹ be the inverse of g in G. Then

(gN)(g⁻¹N) = (gg⁻¹)N = eN,

so g⁻¹N is an inverse for gN. Suppose there exists some y ∈ G such that (gN)(yN) = (yN)(gN) = N. Then, starting from the middle,

yN = (eN)(yN) = (g⁻¹N)(gN)(yN) = (g⁻¹N)(eN) = g⁻¹N.

Hence, g⁻¹N = yN and G/N is a group.

We defer the other direction of the proof for a moment. Define ϕ : G → G/N by ϕ(g) = gN. This is a homomorphism by the multiplication in G/N. If x ∈ ker ϕ, then ϕ(x) = xN = N. Therefore x ∈ N and ker ϕ ⊆ N. The reverse inclusion is obvious, and thus ker ϕ = N.

(⇐) Now suppose G/N is a group. Consider the canonical projection ϕ : G → G/N. Then ker ϕ = N by the above, and by Proposition 2.2.15 we conclude that N is normal.

Corollary 2.2.20.
Let G be an abelian group. Then for every subgroup H ≤ G, G/H is an abelian group.

Proof.
The fact that G is abelian tells us that every subgroup is normal, since gH = Hg for all g ∈ G. To see that G/H is abelian, let g, g′ ∈ G. Then

(gH)(g′H) = (gg′)H = (g′g)H = (g′H)(gH).
Consider the commutative diagram G ϕ ( G ) G / ker ϕ ϕ q ˆ ϕ The top arrow is surjective by definition and the map q is the canonical quotient. Denotethe cosets in G / ker ϕ as [ g ] . We define the map ˆ ϕ ([ g ]) = ϕ ( g ) . To show that ˆ ϕ is welldefined, consider [ g ] = [ h ] that is g = ah where a ∈ ker ϕ . Then ϕ ( g ) = ϕ ( ah ) = ϕ ( a ) ϕ ( h ) = ϕ ( h ) Thus, ˆ ϕ is well defined. It is a homomorphism asˆ ϕ ([ g ][ g (cid:48) ]) = ˆ ϕ ([ gg (cid:48) ]) = ϕ ( gg (cid:48) ) = ϕ ( g ) ϕ ( g (cid:48) ) = ˆ ϕ ([ g ]) ˆ ϕ ([ g (cid:48) ]) By the commutativity of the diagram, ˆ ϕ is surjective. We computeker ˆ ϕ = { [ g ] ∈ G / ker ϕ : ϕ ( g ) = e H } = ker ϕ As ker ϕ is the identity element in the quotient space, ˆ ϕ is injective. Hence, ˆ ϕ is an iso-morphism. Corollary 2.2.23. If ϕ : G → H is a surjective homomorphism thenG / ker ϕ ∼ = H .25he next tool we will discuss is fundamental to the study of algebra. Definition 2.2.24.
Consider a sequence of groups and homomorphisms

⋯ → G_{i−1} → G_i → G_{i+1} → ⋯, with maps d_i : G_{i−1} → G_i and d_{i+1} : G_i → G_{i+1}.

We say the sequence is exact at G_i if ker d_{i+1} = Im d_i. If the sequence is exact at every G_i, we say the sequence is exact and we call it a Long Exact Sequence. If the sequence has the following form

{e} → G₁ → G₂ → G₃ → {e},

we say the sequence is a Short Exact Sequence.

Example 2.2.25.
Let G be a group and N a normal subgroup. We can rephrase the quotient construction as the unique (up to isomorphism) group H such that the following sequence is exact:

{e} → N → G → H → {e}.

Here, the arrow N → G is the inclusion. Exactness tells us that N → G is injective and that G → H is surjective. Thus, by the First Isomorphism Theorem, G/ker(G → H) ≅ H. As ker(G → H) = Im(N → G) = N, we have our result.

Let G be a group. Just as with the dihedral groups Dₙ, we can ask how a group may act on a set; that is, how does it permute the elements? The formalization of this, a group action, is essential when understanding the later topics in this section. We give the following two definitions.

Definition 2.2.26.
Let G be a group and X be a set. A (left) Group Action on X is a map · : G × X → X such that

(a) h · (g · x) = (hg) · x for all g, h ∈ G and x ∈ X.
(b) e · x = x for all x ∈ X, where e ∈ G is the identity.

Definition 2.2.27.
Let G be a group and X a set as above. A (left) group action is a group homomorphism

ϕ : G → Sym(X),

where Sym(X) = {f : X → X : f is bijective}. This is a group under composition; inversion is well defined as every map is bijective. This is called the permutation representation of the group G on X.

Lemma 2.2.28.
Definitions 2.2.26 and 2.2.27 are equivalent.

Proof. It is clear that 2.2.27 ⟹ 2.2.26. Conversely, define ψ_g to be the map on X such that ψ_g(x) = g · x. We know that ψ_g is invertible (with inverse ψ_{g⁻¹}) and thus ψ_g ∈ Sym(X). Define a map ϕ : G → Sym(X) by ϕ(g) = ψ_g. Then by the associative property of the action we get that

ϕ(gg′) = ψ_{gg′} = ψ_g ∘ ψ_{g′} = ϕ(g) ∘ ϕ(g′).

Hence, ϕ is a group homomorphism and the definitions are equivalent.

We can think of group actions as shuffling the elements of the set they act on. The kernel of an action is precisely the kernel of the resulting homomorphism. We say an action is faithful if the associated permutation representation is injective. Further, we call an action transitive if it has precisely one orbit; that is, for every pair (x, y) ∈ X × X, there exists g ∈ G such that g · x = y.

Remark 2.2.29.
We have been careful to refer to left and right multiplication. If G is non-abelian, these are different operations. When doing more advanced mathematics, one can consider multiplication or an action on both the left and the right. This has some major consequences, but as we will not make use of them, we have made the decision to omit such a discussion.

Lemma 2.2.30.
Let G be a group and suppose G acts on a set X.

(a) Let Stab_G(x) = {g ∈ G : g · x = x} and Orb_G(x) = {g · x : g ∈ G} denote the stabilizer and orbit of the point x ∈ X under the action of G. Then Stab_G(x) is a subgroup of G.

(b) If X = G, acting on itself by left multiplication, then the action is transitive and faithful. Further, any subgroup acts faithfully.

Proof. (a) It is clear that Stab_G(x) is a subset of G. It carries the standard group multiplication and is non-empty, as e ∈ Stab_G(x). It suffices to show that all non-identity elements have an inverse. Let g ∈ Stab_G(x). Then

x = e · x = (g⁻¹g) · x = g⁻¹ · (g · x) = g⁻¹ · x.

Thus, g⁻¹ ∈ Stab_G(x), and by the subgroup criterion, Stab_G(x) is a subgroup of G.

(b) Let G act on itself by left multiplication. To show the action is faithful, suppose g · h = g′ · h. Then

gh = g′h ⟺ ghh⁻¹ = g′hh⁻¹ ⟺ g = g′,

so the map G → Sym(G) is injective. To show it is transitive, let h, i ∈ G; we need to show that there is an element g ∈ G such that gh = i. Pick g = ih⁻¹. This is a group element and gh = ih⁻¹h = i. Therefore every element lies in the orbit of any single element, in particular of the identity element. Hence, the action is faithful and transitive. As H ≤ G, this faithful map restricts to any subgroup.

Corollary 2.2.31. Let G act on a set X. If this action is transitive, then it is equivalent to the action of G on G/H by left multiplication for some H ≤ G.

Proof.
Let x ∈ X and consider H = Stab_G(x). Transitivity gives us that for all y ∈ X, y = gx for some g ∈ G. Suppose gx = g′x; then (g′)⁻¹g ∈ H. This makes the map

f : G/H → X,  gH ↦ gx,

a bijection. It remains to show that this map is G-equivariant. Let g ∈ G and w ∈ X. We can write w = g₀x for some g₀. Then

f(gg₀H) = gg₀x = gw = g f(g₀H).

Hence f is G-equivariant and the actions are equivalent.

In this case, G/H is called the orbit space of the action, as no element is stabilized in the set. This will play an important role in the next chapter.

Linear algebra is one of the oldest and most central subjects in mathematics. It began as the study of solutions to linear systems of equations and has grown into the study of transformations on vector spaces. For example, given the following set of equations

3x + y = b₁
x + z = b₂
x + y − z = b₃,

which x, y, z satisfy them? There are a variety of ways to find solutions, but perhaps the simplest is to use matrices.

Definition 2.3.1. A Matrix is any rectangular array of numbers, symbols, operators, etc. arranged in rows and columns such that addition and multiplication are well defined. If A is a matrix of finite size, it is convention to read the lengths of the sides as "rows by columns"; that is, a matrix with 3 rows and 4 columns is a 3 × 4 matrix. Let A, B be m × n and n × k matrices. Then

(AB)_{ij} = ∑_{l=1}^{n} A_{il} B_{lj},

where A_{ij} is the element of A in the i-th row and j-th column. An n × n matrix A is invertible if there exists an n × n matrix B such that AB = BA = I_n, the matrix with 1s along the main diagonal and 0 elsewhere.

We can turn the system of equations above into the single matrix equation

⎡3 1  0⎤ ⎡x⎤   ⎡b₁⎤
⎢1 0  1⎥ ⎢y⎥ = ⎢b₂⎥
⎣1 1 −1⎦ ⎣z⎦   ⎣b₃⎦
We leave it to the reader to check that this system has a unique solution for any given right-hand side, since the coefficient matrix is invertible.

As we have just seen with groups, endowing a set with multiplication has some striking implications. In this section we consider a new algebraic object, a field. Broadly, this is a set equipped with two operations, addition and multiplication, which are compatible.
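The matrix formulation suggests an algorithmic solution. The sketch below (ours, not the author's) solves the system above by Gaussian elimination; only the coefficients appear in the text, so the right-hand side values here are hypothetical:

```python
# Coefficient matrix of the system 3x + y = b1, x + z = b2, x + y - z = b3.
A = [[3, 1, 0],
     [1, 0, 1],
     [1, 1, -1]]
b = [5, 3, 0]   # hypothetical right-hand side, for illustration only

def solve3(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    n = len(M)
    for col in range(n):
        # pick the largest pivot in this column, swap it into place
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

x, y, z = solve3(A, b)
# the solution satisfies all three original equations
assert abs(3 * x + y - b[0]) < 1e-9
assert abs(x + z - b[1]) < 1e-9
assert abs(x + y - z - b[2]) < 1e-9
```

The coefficient matrix has determinant −1 ≠ 0, so it is invertible and the elimination always succeeds.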
Definition 2.3.2.
Let F be a set and suppose it is equipped with two operations +, ·. Let F× denote the set of non-zero elements of F. Suppose (F, +) and (F×, ·) are abelian groups. If for all a, b, c ∈ F,

a(b + c) = ab + ac and (b + c)a = ba + ca,

then F is a Field. In a field, we denote the identity for the addition as 0 = 0_F and for multiplication as 1 = 1_F. For any field, we can define the characteristic of F, char F, to be the minimal n ∈ N such that n · 1 = 0. If no such n exists, we say that char F = 0.

Example 2.3.3.
The quintessential example of a field is the real numbers R. One can then construct C, the complex numbers, as a field which contains R. These fields both have char F = 0. For an example of positive characteristic, consider Z_p where p is prime. This is a field and has characteristic p. A good exercise to test your understanding is to prove that Zₙ, for n not prime, fails to be a field.

Example 2.3.4 (Polynomials). Let F be a field and denote by F[x] the set of all formal polynomials ∑ aᵢxⁱ, with aᵢ ∈ F. For any polynomial f ∈ F[x] define the degree of f, denoted deg f, to be deg f = max{i : aᵢ ≠ 0}. We define addition as

∑_{i=0}^{n} aᵢxⁱ + ∑_{i=0}^{m} bᵢxⁱ = ∑_{i=0}^{max{n,m}} (aᵢ + bᵢ)xⁱ,

where aᵢ (resp. bᵢ) is considered to be 0 if i > n (resp. i > m), and multiplication as

(∑_{i=0}^{n} aᵢxⁱ) · (∑_{i=0}^{m} bᵢxⁱ) = ∑_{i=0}^{n+m} (∑_{j+k=i} aⱼbₖ) xⁱ.

This makes F[x] a group under addition. It is not a group under multiplication, as the set of invertible elements is precisely the non-zero constant polynomials: xⁿ does not have an inverse for n ≥ 1. Therefore F[x] is not a field. As we will see later, F[x] is a ring (see Section 2.4). If f ∈ F[x] cannot be written as f = gh for g, h ∈ F[x] with deg g, deg h ≠ 0, then f is said to be irreducible.

We sometimes adjoin numbers to a field in the same way we do with formal variables. Let i = √−1. Then R[i], by the rules above, consists of all finite sums ∑ rⱼ iʲ. However, i² = −1, so

R[i] = {a + bi : a, b ∈ R} = C.

This is precisely the definition of the complex numbers.

Remark 2.3.5.
We will only concern ourselves with characteristic 0, as positive characteristic is a bit technical and does not play a role in the later chapters of this text.
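The coefficient formulas of Example 2.3.4 translate directly into code. Below is a small sketch (ours, not from the text) implementing polynomial addition and the convolution product, with polynomials stored as coefficient lists [a₀, a₁, ...], plus a check that adjoining i with i² = −1 reproduces complex multiplication:

```python
def poly_add(p, q):
    """Coefficient-wise sum, padding the shorter polynomial with zeros."""
    m = max(len(p), len(q))
    p = p + [0] * (m - len(p))
    q = q + [0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Convolution product: c_i = sum over j + k = i of a_j * b_k."""
    out = [0] * (len(p) + len(q) - 1)
    for j, a in enumerate(p):
        for k, b in enumerate(q):
            out[j + k] += a * b
    return out

# (1 + x) + (3 + 4x + 5x^2) and (1 + x)(1 - x) = 1 - x^2
assert poly_add([1, 1], [3, 4, 5]) == [4, 5, 5]
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]

# adjoining i with i^2 = -1 collapses R[i] to a + bi, i.e. to C:
# (1 + 2i)(3 - i) = 5 + 5i, matching Python's built-in complex arithmetic
assert (1 + 2j) * (3 - 1j) == 5 + 5j
```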
Definition 2.3.6.
Let E be a field which contains F as a subfield. Then we say that E is an extension of F and denote this E/F. Further, the degree of the extension, denoted [E : F], is the integer n such that E ≅ Fⁿ = ∏ⁿ F as vector spaces over F.

Similar to groups, we can define field homomorphisms.

Definition 2.3.7.
Let F, E be fields and f : F → E a map. If for all a, b ∈ F we have that

f(a + b) = f(a) + f(b) and f(ab) = f(a)f(b),

then f is a Field Homomorphism. A bijective field homomorphism is an isomorphism.

For groups, this was where the story ended. For fields, due to the added structure, we have the following lemma.
Lemma 2.3.8.
Every non-zero field homomorphism is injective.

Proof. Let F, E be fields and x, y ∈ F. Suppose α : F → E is a non-zero morphism. If α(x) = α(y), then

α(x) − α(y) = 0 ⟺ α(x − y) = 0 = α(0).

If x − y ≠ 0, then put q = x − y. Using the multiplication, α(q)α(q⁻¹) = α(1) = 1. But α(q) = 0, a contradiction. Thus x − y = 0 and α is injective.

In this proof we used the fact that for non-trivial field homomorphisms α(1) = 1. We leave it to the reader to check this.
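A quick numerical illustration of this lemma (our own choice of example, not the author's): complex conjugation is a field homomorphism C → C, and it respects both operations, sends 1 to 1, and is visibly injective:

```python
def conj(z):
    """Complex conjugation a + bi -> a - bi, a field homomorphism C -> C."""
    return z.conjugate()

samples = [1 + 2j, -3 + 0.5j, 0.25 - 4j, 7 + 0j]
for a in samples:
    for b in samples:
        assert conj(a + b) == conj(a) + conj(b)   # additive
        assert conj(a * b) == conj(a) * conj(b)   # multiplicative

assert conj(1 + 0j) == 1 + 0j                     # alpha(1) = 1
# injectivity on the samples: distinct inputs have distinct images
assert len({conj(z) for z in samples}) == len(samples)
```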
Definition 2.3.9.
Let F be a field and S any subset. Denote by F_S the smallest subfield of F containing S. It is a fairly simple exercise to show that this field always exists. For the special case that S = {1}, we call F_{1} = F′ the prime subfield of F, as it is the subfield generated by 1. A less trivial exercise is to prove that if char F = p, then F′ ≅ Z_p, and if char F = 0, then F′ ≅ Q.

Definition 2.3.10.
Let L/K be a field extension. An element a ∈ L is algebraic over K if there exists a non-zero f ∈ K[x] such that f(a) = 0. L is called an algebraic extension if every element is algebraic. A field L is called algebraically closed if for all f ∈ L[x], f(x) = 0 ⟹ x ∈ L; that is, every root of a polynomial with coefficients in L already lies in L.

Example 2.3.11.

(a) C/R is an algebraic field extension, as the degree of the extension is finite; by the Fundamental Theorem of Algebra, C is algebraically closed.

(b) Q(√2)/Q is an algebraic extension.

(c) R/Q is not an algebraic extension. Consider the element e = lim_{n→∞} (1 + 1/n)ⁿ. This is known to be transcendental.

Proposition 2.3.12. Let L/K be an algebraic extension. Then for every element α ∈ L, there exists a unique monic irreducible m_α ∈ K[x] such that m_α(α) = 0, and deg m_α is minimal among polynomials which have α as a root.

We shall omit the proof of this proposition, as it does not add to the text. The last theorem we shall prove on fields tells us that every intermediate set, closed under addition and multiplication, of an algebraic extension is a field. More precisely,
Theorem 2.3.13.
Let L/K be an algebraic extension and S a set such that S is a group under addition and is closed under multiplication. If L ⊇ S ⊇ K, then S is a field.

Proof. As S ⊆ L, multiplication in S is commutative and S contains a unit element. It suffices to show that for all s ≠ 0 in S, s⁻¹ exists and is contained in S. Existence follows from the fact that s ∈ L and is non-zero. To show it is contained in S, we use Proposition 2.3.12. As L/K is an algebraic extension, the minimal polynomial m_s of s over K exists. Let

m_s = xⁿ + a_{n−1}x^{n−1} + ... + a₁x + a₀,

with each aᵢ ∈ K; note a₀ ≠ 0, as otherwise m_s would not be minimal. Evaluating at x = s, we get

−a₀ = s(s^{n−1} + a_{n−1}s^{n−2} + ... + a₁) ⟹ s · (s^{n−1} + a_{n−1}s^{n−2} + ... + a₁)(−1/a₀) = 1,

so s⁻¹ = (s^{n−1} + a_{n−1}s^{n−2} + ... + a₁)(−1/a₀) ∈ S. Hence, S is a field.

Just as with groups, we can talk about actions of fields on sets. This does not vary from the theory of groups, however, as Sym(X) is not a field, so defining the action in this way is uninteresting. We thus need a different object to study.

Linear algebra has emerged from its concrete origins in systems of equations to the beautiful abstract algebra it is today. Vector spaces comprise the main objects of study. These objects, as we will see, are incredibly well understood and intersect every area of mathematics. The main references for this section are [Coo15] and [Kna06]. We begin with the definition.
Let V be a set and F a field. Equip V with two operations

+ : V × V → V,  · : F × V → V,

which are compatible in the sense that for all f ∈ F and v, w ∈ V, we have f(v + w) = fv + fw, and 1v = v. If under these operations V is an abelian group together with such an action of F, we say V is an F-Vector Space, with elements v ∈ V called vectors and elements f ∈ F called scalars. The element fv is a scaled vector. A subset W ⊆ V which is closed under the operations of addition and scalar multiplication is called an F-vector subspace. Typically we simply say subspace if the underlying field is understood.

(Definition: A monic polynomial is a polynomial whose highest degree term has coefficient 1.)

Example 2.3.15. We have already seen some examples of vector spaces and subspaces.

(a) Let F be a field and E a finite field extension. It is clear that a field satisfies the definition of a vector space over itself. Now, by the finiteness condition on E, we know that E ≅ Fⁿ, and therefore we can extend the action of F to each component of E. That is,

f · e = f · (e₁, ..., eₙ) = (fe₁, fe₂, ..., feₙ).

This is given by the diagonal inclusion F ↪ Fⁿ which sends f ↦ (f, f, ..., f) (n times).

(b) For a non-trivial example, consider the space F[x].

Definition 2.3.16. An F-linear combination of vectors is anything of the form v = ∑ aᵢvᵢ for finitely many i, with each aᵢ ∈ F. If v₁, ..., v_m is a collection of vectors in a vector space V, denote by ⟨v₁, ..., v_m⟩ the set consisting of all linear combinations of the vᵢ. This is canonically a subspace of V. We say that v₁, ..., v_m is a spanning set for a vector space V if every v ∈ V can be written as a linear combination of the vᵢ. Given a set B, we will denote by

Span_F(B)

the minimal vector space generated by the elements of B. We will omit F if it is clear from the situation and/or if the statement is true regardless of the field chosen.
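Membership in a span can be tested mechanically: w ∈ Span(v₁, ..., v_m) exactly when appending w to the list does not enlarge the space spanned, which we detect via row reduction. The vectors below are a hypothetical illustration of ours, not taken from the text:

```python
def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination (floats)."""
    rows = [list(map(float, r)) for r in rows]
    rk, col, n_cols = 0, 0, len(rows[0])
    while rk < len(rows) and col < n_cols:
        pivot = next((r for r in range(rk, len(rows)) if abs(rows[r][col]) > 1e-12), None)
        if pivot is None:
            col += 1
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and abs(rows[r][col]) > 1e-12:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk, col = rk + 1, col + 1
    return rk

v1, v2 = [1, 0, 2], [0, 1, -1]
w = [2, 3, 1]                     # w = 2*v1 + 3*v2, so w lies in <v1, v2>
assert rank([v1, v2]) == 2        # v1, v2 span a plane in R^3
assert rank([v1, v2, w]) == 2     # appending w does not enlarge the span
assert rank([v1, v2, [0, 0, 1]]) == 3   # this vector is NOT in <v1, v2>
```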
Corollary 2.3.17.
Every vector space admits a spanning set.
This follows immediately from the definition, as V is a spanning set for itself. A more interesting statement is that minimal spanning sets exist, and that any two of them have the same cardinality.

Definition 2.3.18.
Let v_1, ..., v_n be vectors in a vector space V. We say these vectors are linearly independent if

∑_{i=1}^{n} a_i v_i = 0 ⟺ a_i = 0 for all i.

We will commonly abuse the term linearly independent and refer to (possibly infinite) sets as linearly independent if all of their finite subsets are linearly independent.
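The definition above can be checked mechanically by row reduction. A minimal Python sketch over Q (the rank routine and the test vectors are our own, chosen only for illustration): a list of vectors is linearly independent exactly when the matrix they form has full rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors over Q, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    # v_1, ..., v_n are linearly independent iff their matrix has rank n
    return rank(vectors) == len(vectors)

print(independent([[1, 0, 1], [0, 1, 1]]))             # True
print(independent([[1, 0, 1], [0, 1, 1], [1, 1, 2]]))  # False: third = first + second
```

Exact `Fraction` arithmetic avoids the floating-point pitfalls that make numerical rank computations delicate.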
Example 2.3.19. (a) Let V = C, treated as a real vector space via the inclusion R ↪ C. Its elements are written as z = x + iy. Let z_1, z_2, z_3 be three pairwise non-colinear (z_i ≠ a z_j for any a ∈ R and distinct i, j ∈ {1, 2, 3}) complex numbers. It can be shown that z_3 can be written uniquely as a_1 z_1 + a_2 z_2.

(b) For a more concrete example, consider V = R^3. Let v_1 and v_2 be two linearly independent column vectors and put v_3 = v_1 + v_2. It should be easy to see that v_3 is a linear combination of v_1 and v_2. Notice that if we change the third coordinate of v_3, then in general v_3 is no longer a linear combination of v_1 and v_2.

Definition 2.3.20.
Let B = {v_i}_{i∈I} be a spanning set of the vector space V. We call B a basis if it is linearly independent. We denote elements of V with respect to this basis as column vectors (tuples) v = (k_1, ..., k_n, ...)^t, which means v = ∑_{i∈I} k_i v_i. It should be noted immediately that any basis B for a vector space is necessarily minimal among spanning sets.

Theorem 2.3.21.
Let B and C be two bases for the vector space V. Then |B| = |C|.

Proof.
We shall prove this in two cases: |B| finite and |B| infinite. Suppose first that |B| < ∞. We want to give bounds on the size of C.

Lemma 2.3.22.
Suppose that |C| > |B|. Then C is linearly dependent.

Proof. As B = {b_1, ..., b_n} is a basis, the set {c_1} ∪ B must be linearly dependent. Therefore, up to reordering, we can assume that b_n ∈ Span{c_1, b_1, ..., b_{n−1}}, and {c_1, b_1, ..., b_{n−1}} is still a spanning set. Notice that by assumption {c_j} is linearly independent. Therefore, repeating the above process with c_j for 2 ≤ j ≤ n and reordering, we conclude that {c_1, ..., c_n} is a linearly independent spanning set. As |C| > n, the remaining elements of C lie in Span{c_1, ..., c_n}, and we conclude that C is linearly dependent.

From this lemma we conclude that |C| ≤ |B|. The key step of the proof relied on the fact that B was a basis. We can similarly apply this logic to C and deduce that |B| ≤ |C|. Hence they must be equal.

Now assume |B| is infinite. The method above will not work, as for sets of infinite cardinality adding an element gives no information regarding linear dependence. We can rephrase this part of the proof, however, as the existence of a bijection f : B → C. We can construct such a function in the following way: let B = {b_i : i ∈ I} and C = {c_j : j ∈ J}, with I, J indexing sets of infinite cardinality. For an arbitrary element c_j ∈ C, we know that c_j ∈ Span(B); in particular, c_j ∈ Span(B_j) for some finite subset B_j ⊆ B. Put

B′ = ⋃_{j∈J} B_j.

As C is a basis, it is in particular a spanning set; therefore B′ is also a spanning set. As B′ ⊆ B and B is minimal, we know that B′ = B, and therefore

B = ⋃_{j∈J} B_j.

As each B_j is finite, we know that |⋃_{j∈J} B_j| ≤ |J| = |C|. Hence |B| ≤ |C| and, by applying the same logic in the other direction, |C| ≤ |B|. The proof is completed by the following theorem, a proof of which can be found in [Kna06, Appendix A.6].

Theorem 2.3.23 (Schroeder–Bernstein).
If A and B are sets such that there exists an injective function f : A → B and an injective function g : B → A, then |A| = |B|.

This now begs the question: "does every vector space admit a basis?" The next theorem will give an answer to this, but before giving a proof, we need the following famous lemma from logic.

Lemma 2.3.24 (Zorn’s Lemma). Let P be a partially ordered set. Suppose that every totally ordered subset of P has an upper bound in P. Then P contains a maximal element.
Definition 2.3.25. A partial order on a set X is a reflexive, antisymmetric, transitive binary relation ⪯. A total order is a partial order such that for all pairs (x, y), either x ⪯ y or y ⪯ x.

The remaining components of the lemma are self-explanatory. The proof of this lemma will be omitted as it does not add to the text. Although it seems innocuous, this lemma provides the technical support for many proofs in algebra. For example:

Theorem 2.3.26.
Let V be a vector space defined over the field F. Then:
(a) Every spanning set contains a basis.
(b) Every linearly independent subset can be extended to a basis.
(c) V has a basis.
We present the proof given in [Kna06].
Proof. (b) Let E be a linearly independent subset of V. Let S be the collection of all linearly independent subsets of V containing E. Then S is a partially ordered set under inclusion, and non-empty as E ∈ S. Let T be a totally ordered subset of S and consider

A = ⋃_{T∈T} T.

We claim that A ∈ S. It clearly contains E by construction; it remains to show it is linearly independent. To see this, suppose not. Then there exist v_1, ..., v_n ∈ A and scalars c_1, ..., c_n, not all 0, such that c_1 v_1 + ... + c_n v_n = 0. Let A_j ∈ T be an element which contains v_j. As T is totally ordered, there exists some A′ ∈ T such that A′ ⊇ A_j for all j ≤ n. As A′ is linearly independent, c_i = 0 for all i, a contradiction. Hence A is linearly independent and an upper bound for T. Thus all totally ordered subsets have an upper bound, and by Zorn’s Lemma there is a maximal element B ∈ S. It remains to be shown that B is a spanning set. Let v ∈ V be arbitrary and suppose v ∉ Span_F B. Then {v} ∪ B is a linearly dependent set by the maximality of B. Therefore there exist constants c, c_1, ..., c_m, not all 0, and vectors v_1, ..., v_m ∈ B such that cv + c_1 v_1 + ... + c_m v_m = 0. We know that c ≠ 0, since B is linearly independent. Therefore v = −c^{−1}(c_1 v_1 + ... + c_m v_m). Hence v ∈ Span_F B, a contradiction, and B is a spanning set.

(a) Now let E be a spanning set. Let S denote the partially ordered set of linearly independent subsets contained in E, ordered by inclusion. Let T be a totally ordered subset of S and let A be the union of all of the elements of T. Then A is an upper bound by the argument in (b) above. By Zorn’s Lemma, S contains a maximal element M, and by an easy modification of the argument in part (b) showing that B was a spanning set, we conclude that M spans E, hence V, and therefore M is a basis.

(c) now follows from (a) by taking E = V, and from (b) by taking E = ∅.

Now, by Theorems 2.3.26 and 2.3.21, we know bases exist and that their cardinality is unique. It is therefore an invariant of the vector space, which motivates the following definition.

Definition 2.3.27.
Let V be an F-vector space and B a basis. By the F-dimension of V we mean

dim_F V = |B|.

Here it is important to distinguish the field of definition.
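As a computational aside (our own illustration, anticipating Example 2.3.28 below), the field F_4 = F_2[x]/(x^2 + x + 1) can be modeled directly: an element a + b·x is stored as a pair (a, b) over F_2, making dim_{F_2} F_4 = 2 visible in the code. This is a sketch, assuming the standard irreducible polynomial x^2 + x + 1.

```python
from itertools import product

# F_4 = F_2[x]/(x^2 + x + 1); the element a + b*x is stored as the pair (a, b).
def mul(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = x + 1 in this quotient
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

elements = list(product([0, 1], repeat=2))  # 2^2 = 4 elements: a 2-dim F_2-space
one = (1, 0)
# every non-zero element has an inverse, so this 4-element ring is a field
inverse = {u: v for u in elements if u != (0, 0)
           for v in elements if mul(u, v) == one}
print(len(elements), len(inverse))  # 4 3
```

The same construction with an irreducible polynomial of degree n would model F_{p^n} as an n-dimensional F_p-space.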
Example 2.3.28.
Let F_p denote the field with p elements. It is a fun exercise to prove that for any natural number n ∈ N, there is a field extension F_{p^n}. Each of these fields, given by adjoining a root of an irreducible polynomial of degree n, is a vector space of dimension n over F_p and thus is isomorphic to F_p^n. We can see this isomorphism explicitly after we develop the theory of rings in the next section.

Example 2.3.29.
We now give an interesting example of an infinite dimensional vector space. Consider R defined over Q. At first glance, calling this infinite dimensional looks nonsensical, as Q is dense in R. However, suppose R ≅ Q^n for some n ∈ N. Then we can pick a basis {x_1, ..., x_n} of R over Q. By Cantor’s diagonalization argument, we know that |R| > |Q|; in fact, Q is countably infinite and R is uncountably infinite. Using the basis we have picked, the claim R ≅ Q^n would imply that R is countably infinite, as a finite product of countably infinite sets is necessarily countably infinite. This is a contradiction, and thus

dim_Q R ≠ n for all n ∈ N.

Another way to think about this is to look at the transcendental numbers t over Q (numbers such as π, e, etc.). If we look at Span_Q{t} ≅ Q ⊊ R, we get a distinct one-dimensional subspace (intersecting the others only at 0) for each transcendental number.

Lemma 2.3.30.
There are only countably many algebraic numbers.

Proof.
A real number r is algebraic if there exists a non-zero f ∈ Q[x] such that f(r) = 0. Therefore, we need a bound on the cardinality of Q[x], as each polynomial has only finitely many roots, so this gives an upper bound on the cardinality of the algebraic numbers. Notice that {x^i}_{i∈N} is a basis for Q[x] as a Q-vector space. This is a countable basis, and therefore Q[x] is a countably-infinite-dimensional vector space. Hence Q[x] is countably infinite as a set, and therefore the cardinality of the algebraic numbers is at most countably infinite.

Corollary 2.3.31.
There are uncountably many transcendental numbers. Inside R there are uncountably many copies of Q, each pair having trivial intersection, and thus R is an infinite dimensional vector space over Q.

Now that we have the notions of basis and dimension, we can introduce the idea of linear maps between vector spaces. These play a massive role in modern mathematics as well as many applied areas. The reason, as will be shown shortly, is that linear maps are in some sense the “easiest” functions to understand. Further, there is a natural association of a matrix to any linear map, regardless of dimension. This will give us a clear method to tackle problems like Example 2.3.19(b). First, we introduce the notion of quotient for vector spaces. This treatment will mirror the treatment for groups above, but will elucidate the differences that vector spaces bring.

Similar to the case of sets, we want to impose a notion of equivalence on a generic vector space V. We do this by identifying an entire subspace, not just a subset.

Definition 2.3.32.
Let W ⊆ V be a subspace. We define the quotient space V/W = V/∼, where v ∼ v′ if v − v′ ∈ W. It is easy to check that this is an equivalence relation. As V is an abelian group, we have that V/W is also an abelian group under the operation [x] + [y] = [x + y]. We define scalar multiplication as k[v] := [kv]. This turns V/W into a vector space.

We shall see some examples of these after Theorem 2.3.35 below. Before this, we give the first definition of linear maps and some first properties.

Definition 2.3.33.
Let K be a field and V, W two K-vector spaces. We say a function f : V → W is a linear transformation if for all v, v′ ∈ V and k, k′ ∈ K,

f(kv + k′v′) = f(kv) + f(k′v′) = k f(v) + k′ f(v′) ∈ W.

The set of all v ∈ V such that f(v) = 0 is called the kernel and is denoted ker f. Similarly, the image, denoted Im f, is defined as the set of w ∈ W such that w = f(v) for some v. We retain the same definition of isomorphism as for groups above.

Lemma 2.3.34.
The canonical map q : V → V/W is linear and surjective.

Proof.
By definition, q(kx + y) = [kx + y] = [kx] + [y] = k[x] + [y] = kq(x) + q(y). Therefore, q is a linear transformation. Now let C be a basis for V/W, and let C′ be a choice of representatives in V for the elements of C. Then C = q(C′) and, extending by linearity, we get that V/W = Span C = Span q(C′). Hence q is surjective.

Definition/Theorem 2.3.35.
Let f : V → W be a linear transformation. Then:
(a) ker f and Im f are vector subspaces of V and W respectively. We then call dim_K ker f the nullity and dim_K Im f the rank.
(b) f is injective if and only if ker f = 0.
(c) (First Isomorphism Theorem) V/ker f ≅ Im f.
(d) If dim_K V = dim_K W < ∞, then the following are equivalent:
(i) f is injective; (ii) f is surjective; (iii) f is an isomorphism.

Proof. (a), (b), and (c) follow from the fact that linear functions are additive group homomorphisms that also respect scalar multiplication. This implies that ker f and Im f are additive abelian groups closed under scalars by K-equivariance. What remains to be proven for (c) is that the triangle of linear maps

f = f̂ ∘ q : V → V/ker f → Im f

commutes, where q is the canonical projection and f̂([v]) = f(v). Forgetting the K-equivariance momentarily, the diagram commutes on the level of abelian groups by the proof of Theorem 2.2.22. Therefore, we need to show the K-equivariance of f̂. If k ∈ K, then

f̂(k[v]) = f̂([kv]) = f(kv) = k f(v) = k f̂([v]).

By the proof of Theorem 2.2.22, we know that f̂ is a bijective linear map and thus a vector space isomorphism.

(d) It suffices to prove (i) ⟺ (ii), as (iii) ⟹ (i), (ii) trivially, and (i) together with (ii) makes f a bijective linear map, hence an isomorphism.

(⇒) If f is injective, pick a basis B for V. Then f(B) is linearly independent by linearity. Since dim W = dim V, f(B) is a basis for W and f is surjective.

(⇐) If f is surjective, again let B = {v_1, ..., v_n} be a basis for V and f(B) the corresponding basis of W. Let u ∈ ker f; we need to show u = 0. As B is a basis, let u = k_1 v_1 + ... + k_n v_n be the unique expansion of u in the basis B. By the linearity of f, we know that

f(u) = k_1 f(v_1) + ... + k_n f(v_n) = 0 ∈ W.

However, f(B) is a basis for W and consequently k_i = 0 for all i. Thus u = 0. This completes the proof.
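The First Isomorphism Theorem in part (c) can be sanity-checked by brute force over a finite field; this sketch (the matrix and field are our own choice, not from the text) verifies that for f(v) = Av on V = F_2^3 the count |V| = |ker f| · |Im f| holds, which is exactly what V/ker f ≅ Im f forces.

```python
from itertools import product

A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]  # hypothetical 3x3 matrix over F_2; third row = sum of first two

def f(v):
    """The linear map v -> Av, computed over F_2."""
    return tuple(sum(a * x for a, x in zip(row, v)) % 2 for row in A)

V = list(product([0, 1], repeat=3))
kernel = [v for v in V if f(v) == (0, 0, 0)]
image = {f(v) for v in V}
# V/ker f is isomorphic to Im f, so the cosets of ker f partition V into |Im f| parts:
print(len(V) == len(kernel) * len(image))  # True
```

Here the kernel is the 1-dimensional subspace {(0,0,0), (1,1,1)} and the image is 2-dimensional, so nullity + rank = 1 + 2 = 3 = dim V, illustrating (a) as well.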
Corollary 2.3.36.
If V and W are finite dimensional vector spaces such that dim V = dim W, then V ≅ W.

Proof. Let B = {b_1, ..., b_n} be a basis for V and C = {c_1, ..., c_n} a basis of W, and let f : V → W be defined by

f(k_1 b_1 + ... + k_n b_n) = k_1 c_1 + ... + k_n c_n.

This is clearly injective and, by Theorem 2.3.35(d), an isomorphism.

We will not provide a proof for the following theorem, as it is more or less an exercise in category theory, which will be postponed until Chapter 3.
Theorem 2.3.37.
Let B be a basis for a vector space V, and let U be any other vector space. If f : B → U is any function, then there exists a unique linear transformation F : V → U such that F ∘ ι = f, where ι : B ↪ V is the inclusion; that is, the triangle B → V → U commutes.

This is an example of a universal mapping property. These types of theorems are abundant in algebra and will be seen to be instances of a more general schema in Chapter 3.
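Theorem 2.3.37 is constructive for V = Q^3 with the standard basis: a function on the basis extends to all of V by F(v) = ∑_i v_i f(e_i). A small sketch (the basis values below are a hypothetical choice of f : B → U, not from the text):

```python
from fractions import Fraction

# f on the standard basis e_0, e_1, e_2 of Q^3, with values in U = Q^2:
f_on_basis = {0: (1, 1), 1: (0, 2), 2: (3, 0)}  # hypothetical choice of f

def F(v):
    """The unique linear extension: F(v) = sum_i v_i * f(e_i)."""
    out = (Fraction(0), Fraction(0))
    for i, vi in enumerate(v):
        out = tuple(o + Fraction(vi) * u for o, u in zip(out, f_on_basis[i]))
    return out

# F restricted to the basis agrees with f (the triangle commutes) ...
print(F((1, 0, 0)) == (1, 1))  # True
# ... and F is additive, as linearity demands:
print(F((1, 2, 3)) == tuple(a + b for a, b in zip(F((1, 2, 0)), F((0, 0, 3)))))  # True
```

Uniqueness is visible in the code: once `f_on_basis` is fixed, `F` admits no further choices.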
Example 2.3.38.
We now give some examples of vector spaces that arise from the consideration of various linear maps.

(a) (Direct Sums and Direct Products) Let {V_i}_{i∈I} be a collection of vector spaces. We define two objects,

⊕_{i∈I} V_i = {(v_i)_{i∈I} : all but finitely many v_i = 0},  ∏_{i∈I} V_i = {(v_i)_{i∈I}},

the direct sum and direct product of vector spaces, respectively. These objects come with natural linear maps ι_j : V_j → ⊕ V_i and π_j : ∏ V_i → V_j. For a finite indexing set, ⊕ V_i = ∏ V_i, and thus the symbols ⊕ and × will be used interchangeably. In general, however, ⊕ V_i ↪ ∏ V_i. The key feature of ⊕ for finite indexing sets is that

dim(⊕ V_i) = ∑ dim V_i.

This follows from the fact that we can take individual bases in each coordinate space. As will be seen in the next chapter, ⊕ is a coproduct (or colimit) of vector spaces and × is a product (or limit) of vector spaces.

(b) (Hom and Dual Spaces) Let Hom_F(V, W) denote the set of all F-linear transformations V → W. This can be made into an F-vector space by defining addition and scalar multiplication point-wise. For the case W = F, we denote

Hom_F(V, F) = V*,

the dual space to V. If dim V < ∞, then there exist isomorphisms (non-canonically) V ≅ V* and (canonically) V ≅ V**, the double dual. We call elements of V* linear functionals on V. Let T : V → W be a linear transformation, and define T* = T^t, the transpose map, as T^t(g) = g ∘ T : W* → V*. Below, we will show the motivation behind such a naming and its relation to matrices. It can be seen that in general

dim Hom(V, W) = dim V · dim W.

To finish this subsection, we shall go back to the start and relate matrices to linear maps on vector spaces.

Theorem 2.3.39.
Let V and W be finite dimensional vector spaces over the field K and f : V → W a linear transformation. Then there exists a matrix A such that, with respect to the bases on V and W, f(v) = Av, where the right side is matrix multiplication of the (dim W × dim V) matrix A by the (dim V × 1) coordinate vector of v. Furthermore, any matrix M of size dim W × dim V corresponds to a linear map g : V → W. Moreover, this correspondence is bijective.

Proof.
Put M(V, W) to be the set of all matrices with respect to the bases B, C of V, W respectively. Notice that this is a vector space over K, and that the matrices E_ij, whose only non-zero entry is a 1 in position (i, j), form a basis. Define f_ij : V → W to be the unique linear extension (Theorem 2.3.37) of the map on the bases which sends b_i ↦ c_j and the remaining basis vectors to 0. This gives an inclusion of E, the basis of M(V, W), into Hom(V, W) via the map

ϕ : E → Hom(V, W),  ϕ(E_ij) = f_ij.

We claim that {f_ij} is a linearly independent set. To see this, consider the unique linear extension ϕ̂ : M(V, W) → Hom(V, W) and an arbitrary vanishing sum

0 = ∑_{i,j} a_ij f_ij.

Evaluating this at one of the b_i, we get that

0 = ∑_j a_ij c_j,

and by linear independence all a_ij = 0. Hence {f_ij} is linearly independent, and as there are dim W · dim V many elements, we know that it is a basis for Hom(V, W) by Example 2.3.38 and Corollary 2.3.36. Hence

ϕ̂ : M(V, W) → Hom(V, W)

is a surjection and, by Theorem 2.3.35, an isomorphism. This completes the proof.

What this tells us is that every matrix can be treated as a linear transformation, and thus the transpose map f^t : W* → V* has a matrix representation as the transpose matrix. As we will see later, this correspondence between matrices and linear maps can be exploited to prove a variety of theorems. One of the main theorems will be on determinants, to be defined in Section 2.5, which relates invertibility of a matrix (and of the corresponding linear map) to its determinant.

We now enter the belly of the algebraic beast. Ring and module (Section 2.5) theory generalizes both fields and vector spaces in a way which makes doing mathematics with them significantly more difficult. However, we are lucky in that for the main applications in Chapters 4 and 5, we only need sufficiently nice objects called local and/or noetherian rings. Modules over these rings are relatively controlled and thus are incredibly important for analyzing these objects. A majority of this section comes from [Kna06], [Rot15] and [DF04]. The material on commutative rings follows [Mat86] and [AM69]. Similar to the previous sections, we begin with some definitions:
Definition 2.4.1.
Let R be a set equipped with two associative binary operations (+, ×). We call R a ring if the following hold:
(a) R is an abelian group under +.
(b) R is closed under ×; that is, for all a, b ∈ R, a × b = ab ∈ R.
(c) For all a, b, c ∈ R, a(b + c) = ab + ac and (a + b)c = ac + bc.

If in addition there exists an element 1_R such that x × 1_R = 1_R × x = x for all x ∈ R, then we say that R is unital. We call R commutative if a × b = b × a for all a, b ∈ R. A ring homomorphism is a function f : R → S such that for all s, t ∈ R,

f(s + t) = f(s) + f(t),  f(st) = f(s) f(t).

If R and S are unital, then we also impose the condition that f(1_R) = 1_S. The set of units (multiplicatively invertible elements) is denoted R^×.

Remark 2.4.2.
It is common practice to assume that all rings are unital. This makes one’s job much easier when considering homomorphisms and related objects. We shall follow this convention for the remainder of the text and note the instances when an object does not contain a unit.
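Lemma 2.4.3 below asserts that R^× is a group under multiplication. As a quick illustration (the ring Z/12Z is our own choice): the units of Z/nZ are the residues coprime to n, and the computation confirms closure and the existence of inverses.

```python
from math import gcd

n = 12
# a class a in Z/nZ is a unit exactly when gcd(a, n) = 1
units = [a for a in range(1, n) if gcd(a, n) == 1]
print(units)  # [1, 5, 7, 11]

# group axioms we can check directly: closure and inverses (1 is clearly the identity)
closed = all((a * b) % n in units for a in units for b in units)
inverses = all(any((a * b) % n == 1 for b in units) for a in units)
print(closed and inverses)  # True
```

In this particular group every element squares to the identity, so (Z/12Z)^× ≅ Z/2Z × Z/2Z.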
Lemma 2.4.3.
Let R be a ring. Then the set R^× is a group under multiplication.

Example 2.4.4.
Rings play a key role in the later parts of this text, and therefore it is imperative that we have a wealth of examples to draw from.

(a) Let F be a field; then F is a commutative (unital) ring where every non-zero element has an inverse. Therefore F^× = F − {0}.

(b) Z, Q, R, C are rings with addition and multiplication defined as usual. In fact, Z is the prototypical example of a commutative ring which is not a field. For Q, R, C, the group of units is the set of non-zero elements. For Z, it is easy to see that Z^× = {±1} ≅ Z/2Z.

(c) Let V be a finite dimensional K-vector space of dim V = n; then M_n(K) := M(V, V), the set of n × n matrices, is a ring with identity element I_n = diag(1, ..., 1), the matrix with 1s along the main diagonal. Further, the group of units is special and gets its own symbol: GL_n(K) := M_n(K)^×.

(d) Consider the subspace of M_2(C) with basis

1 = [[1, 0], [0, 1]],  i = [[i, 0], [0, −i]],  j = [[0, 1], [−1, 0]],  k = [[0, i], [i, 0]].

We denote this space by H = Span_R{1, i, j, k}. These are the Hamiltonian quaternions and are an example of a division ring, one where every non-zero element has a multiplicative inverse.

(e) Let V be a vector space over a field F of char F ≠ 2, equipped with a bilinear map [−, −] : V ⊕ V → V which satisfies the following conditions for all x, y, z ∈ V:
(i) [x, y] = −[y, x] (anti-commutativity);
(ii) [x, [y, z]] + [z, [x, y]] + [y, [z, x]] = 0 (the Jacobi identity).

Then V is a Lie Algebra. Every Lie algebra is a non-commutative, non-associative, non-unital ring. These will play a part in the theory developed in Chapter 3.

(f) Consider R[x], the polynomial ring with coefficients in a ring R. This is a ring as discussed in Example 2.3.4. The group of units is necessarily R^×, as these are the only elements with formal inverses.

Proposition 2.4.5.
Let R be a ring. Then there exists a unique ring homomorphism ϕ : Z → R.

Proof.
Fix r ∈ R and define ϕ_r(n) = nr = r + r + ... + r (n times) in R. Each ϕ_r is a homomorphism of additive groups, determined completely and uniquely by where it sends 1. A unital ring homomorphism must send 1 ↦ 1_R, so put ϕ = ϕ_{1_R} : Z → R. This sends 1 ↦ 1_R and is therefore the desired, unique, homomorphism.

We define the kernel of a ring homomorphism in direct analogy to vector spaces and group homomorphisms. The following theorem is the ring version of Theorem 2.3.35. We leave the proof as an exercise to the reader, as it follows with slight modification from the proof of Theorem 2.3.35.

Theorem 2.4.6.
Let S and R be rings and ϕ : R → S a ring homomorphism. Then:
(a) ker ϕ is a subring with no unit, and ϕ(R) is a ring.
(b) ϕ is injective if and only if ker ϕ = 0.

Consider ker ϕ for a moment. It is a special example of an Ideal of R.

Definition 2.4.7.
Let R be a ring. A left ideal I ≤ (R, +) is a subgroup of the additive group of R such that

RI = {ri : r ∈ R, i ∈ I} ⊆ I.

Similarly, a right ideal is a subgroup J ⊆ R such that JR ⊆ J. We call an ideal m maximal if there are no proper ideals of R which properly contain m. We call an ideal p prime if ab ∈ p implies either a ∈ p or b ∈ p.

Remark 2.4.8.
Notice that in a commutative ring R, every left ideal is also a right ideal. An ideal which is both a left and a right ideal is called two-sided. Further, over a commutative ring we can think of ideals in the same way we thought about vector spaces. The main difference, however, is that we cannot normally pick a basis for I, as rings admit non-trivial ideals of a kind which do not exist for fields.

Example 2.4.9.
Let us consider some ideals in the rings given above.

(a) Every field has no proper non-zero ideals. This follows from the fact that an ideal I is necessarily a vector space over F; if I is non-zero it contains a unit, hence contains 1_F, and is therefore all of F.

(b) For any ring R, let S be a subset. We can form ⟨S⟩, the ideal generated by S, by taking

⟨S⟩ = ⋂_{S ⊆ I ⊆ R} I,

where I ranges over ideals. We leave it to the reader to check that the intersection of ideals is necessarily an ideal. We call an ideal principal if I = ⟨r⟩ for some element r ∈ R.

(c) Z is an example of a Principal Ideal Domain. This means that every ideal is principal, and thus the ideals mZ, for m ∈ Z, are all possible ideals. Because of this, any ideal which contains 1 ∈ Z must be Z itself. In fact, in any ring, an ideal which contains 1_R must be the entire ring.

Lemma 2.4.10.
Let ϕ : B → A be a ring homomorphism. Then if p is a prime ideal in A, ϕ^{−1}(p) is a prime ideal in B. This does not hold true for maximal ideals.

Proof.
Let ab ∈ ϕ^{−1}(p). Then ϕ(ab) = ϕ(a)ϕ(b) ∈ p, and thus one of ϕ(a) or ϕ(b) is an element of p. Hence either a or b is an element of ϕ^{−1}(p), and it is a prime ideal of B. For a counterexample in the maximal case, consider the canonical inclusion ι : Z ↪ Q. As Q is a field, its only proper ideal is 0, but ι^{−1}(0) = 0, which is prime but not maximal in Z.

Similar to vector spaces and groups, we can take quotients of rings by two-sided ideals. To see why these are the natural choice for quotients, consider that we want the quotient R/I to become a ring again. To do this, let I be an arbitrary subgroup of (R, +). A coset of I in R will be denoted r + I for r ∈ R. We define addition and multiplication in the obvious way:

(r + I) + (s + I) = (r + s) + I,  (r + I)(s + I) = (rs) + I.

(The word prime here comes from the notion of a prime integer. Normally a number is prime if its only factors are 1 and itself. An equivalent condition is that p ∈ Z is prime if and only if whenever p divides the product ab for some a, b ∈ Z, then either p divides a or p divides b.)

As (R, +) is an abelian group, we know that as groups R/I is well defined under +. We need to make sure it is well defined under ×: that is, for any r, s ∈ R and α, β ∈ I we should have that

(r + α)(s + β) + I = rs + I.

If we set r = s = 0, we see that I must be closed under multiplication; thus I is a subring (without unit) of R. By setting s = 0 and letting r vary, we see that rβ ∈ I for all r ∈ R and β ∈ I; therefore I is closed under multiplication on the left by R. Setting r = 0 and letting s vary, we also see that I must be closed under multiplication by R on the right. Conversely, if I is closed under left and right multiplication by R, then the relation above must be satisfied. Hence, being a two-sided ideal is a necessary and sufficient condition for R/I to be a ring. What we have just shown is the following technical lemma:

Lemma 2.4.11.
Let I ⊆ (R, +) be a subgroup. A necessary and sufficient condition for R/I to have the structure of a ring is that I is a two-sided ideal of R.

Proposition 2.4.12 (First Isomorphism Theorem for Rings). Let ϕ : R → S be a ring homomorphism. Then:
(a) ker ϕ is a two-sided ideal of R and R/ker ϕ ≅ ϕ(R).
(b) If I is any two-sided ideal of R, the map

π : R → R/I,  r ↦ r + I,

is a surjective ring homomorphism with kernel I. Hence, every two-sided ideal of R can be realized as the kernel of some homomorphism.

Proof. (a) The majority of this proof mirrors that of Theorem 2.2.22. What remains to be proven is that the map ϕ̂(r + ker ϕ) = ϕ(r) is a bijection between R/ker ϕ and ϕ(R). This follows immediately from the definitions of a ring homomorphism.

(b) We know that R/I is a ring from the discussion before the statement of the proposition. In particular, R and R/I are abelian groups, and therefore π : R → R/I is a group homomorphism. To see it is a ring homomorphism, consider two elements r, s ∈ R. Then

π(rs) = rs + I = (r + I)(s + I) = π(r)π(s).

Further, π(1) = 1 + I = 1_{R/I}. Hence π is a ring homomorphism.

The final proof of this subsection is possibly the most useful and interesting isomorphism theorem.

Theorem 2.4.13 (Fourth Isomorphism Theorem(s)).
(a) Let G be a group and N ⊴ G. Then the subgroups of G/N are in one-to-one correspondence with the subgroups of G containing N.
(b) Let R be a ring and I a two-sided ideal. Then the subrings of R/I are in one-to-one correspondence with the subrings of R containing I.

Proof.
We shall prove (b); (a) will follow immediately by the same argument. Let π : R → R/I be the canonical projection map and S a subring of R containing I. Then I is a two-sided ideal in S, and thus S/I is a ring contained in R/I. Now assume that P ⊆ R/I is a subring. Then π^{−1}(P) = {r ∈ R : r + I ∈ P}. We first check that this is a ring. If a, b, c ∈ π^{−1}(P), then

π(ab + c) = (ab + c) + I = (ab + I) + (c + I) = (a + I)(b + I) + (c + I).

This is an element of P by the definition of a subring. Thus π^{−1}(P) is a ring, and π^{−1}(P)/I = P, so Lemma 2.4.11 tells us that I is an ideal in π^{−1}(P). These two constructions are mutually inverse, which completes the proof.

An immediate corollary of this theorem is the following.

Corollary 2.4.14.
Let R be a commutative ring and m an ideal. Then m is maximal if and only if R/m is a field.

This single corollary will play a large role in the formulation of certain categorical and algebraic objects later in the text.
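Corollary 2.4.14 can be watched in action in Z, where ⟨m⟩ is maximal exactly when m is prime. The brute-force check below (our own illustration, not from the text) tests whether every non-zero class of Z/nZ has a multiplicative inverse, i.e. whether the quotient is a field:

```python
def is_field(n):
    """Z/nZ is a field iff every non-zero class has a multiplicative inverse."""
    return n > 1 and all(any((a * b) % n == 1 for b in range(n))
                         for a in range(1, n))

# <7> in Z is maximal (7 is prime), <12> is not:
print(is_field(7), is_field(12))  # True False
```

For composite n = ab the classes of a and b are zero-divisors, which is exactly what blocks the existence of inverses.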
Proposition 2.4.15.
Every commutative ring has a maximal ideal.

Proof.
Let P be the set of proper ideals of R, ordered by inclusion. Every chain C in P has an upper bound, namely ⋃_{C∈C} C. This is easily seen to be an ideal, and it is proper since it cannot contain 1 (no member of the chain does). Applying Zorn’s Lemma, P has a maximal element m. By definition, m is a maximal ideal.

Local and noetherian rings will take up the remainder of this section. These play a huge role in algebraic geometry and the theory of smooth manifolds; specifically, they form the basis on which sheaves (see Chapter 3) can be built. It is known that understanding sheaves on a space is equivalent to understanding the space itself. Therefore, to get a better grasp on the geometry later, we will need to understand sheaves. To do so, we start with commutative algebra and build our way up.
Remark 2.4.16.
For the remainder of this chapter, all rings are assumed to be commutative and unital. Ideals are two-sided (by Remark 2.4.8), and thus R/I can always be given the structure of a ring.

We start with Principal Ideal Domains (defined in Example 2.4.9) and their generalizations: Unique Factorization Domains and integral domains.

Definition 2.4.17.
Let R be a ring. A zero-divisor in R is an element a ∈ R such that ab = 0 or ba = 0 for some non-zero b ∈ R. A ring with no non-zero zero-divisors is called an integral domain. This amounts to being able to cancel elements from expressions such as

ab = cb, b ≠ 0 ⟹ a = c.

In an integral domain, an element r which is non-zero and not a unit is called irreducible if whenever r = ab, then one of a or b is a unit; otherwise r is reducible. An element p is called prime if the ideal ⟨p⟩ is a prime ideal in the sense of Definition 2.4.7. An integral domain is a principal ideal domain if every ideal is principal.

Prime elements and irreducible elements are closely related. In most of the examples we have presented, they are in fact the same! The following lemma asserts this.
In an integral domain R, every prime element is irreducible. If we assume further that R is a principal ideal domain (P.I.D.), then an element is prime if and only if it is irreducible.

Proof.
Assume p = ab. Then by definition p divides a or p divides b; assume without loss of generality that p divides a. Then a = px for some x ∈ R, so p = pxb. As R is an integral domain we may cancel p, so xb = 1 and b is a unit. Hence p is irreducible.

Now assume further that R is a P.I.D.; we need to show that irreducible elements are prime. Let r be an irreducible element, and suppose that r ∈ M for some ideal M of R. By the hypothesis, M = ⟨m⟩, and r ∈ ⟨m⟩ implies r = mx for some x ∈ R. By irreducibility, either m or x is a unit; thus either ⟨m⟩ = ⟨r⟩ or ⟨m⟩ = ⟨1⟩ = R. Hence ⟨r⟩ is a maximal ideal, and all maximal ideals are prime.

Definition 2.4.19.
An integral domain R is called a Unique Factorization Domain if every non-zero, non-unit element has a factorization into irreducible elements which is unique up to units and reordering.

Unique factorization is a topic that should be familiar to everyone. It is a standard result in high-school level mathematics that every integer can be written as a product of prime numbers. As Z is a P.I.D., this agrees with the definition above. The following result puts all of these rings into context with what we have done prior. We shall not prove it, as it does not add to the theory.

Theorem 2.4.20.
The following inclusions of integral domains hold, and each is strict:

Fields ⊊ Principal Ideal Domains ⊊ Unique Factorization Domains
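Unique factorization can be witnessed computationally. The sketch below assumes the sympy library is available (the specific numbers and polynomial are our own illustrative choices); it factors an integer in the P.I.D. Z and a polynomial in the P.I.D. Q[x], where the factorizations are unique up to units and reordering.

```python
# A computational glimpse of unique factorization (assumes sympy is installed).
from sympy import factorint, factor_list, symbols

# In the P.I.D. (hence U.F.D.) Z, every integer factors uniquely into primes.
assert factorint(360) == {2: 3, 3: 2, 5: 1}          # 360 = 2^3 * 3^2 * 5

# Q[x] is also a P.I.D.: polynomials factor uniquely into irreducibles,
# up to units (nonzero rational constants).
x = symbols('x')
content, factors = factor_list(x**4 - 1)
# x^4 - 1 = (x - 1)(x + 1)(x^2 + 1), each factor irreducible over Q
assert {f for f, mult in factors} == {x - 1, x + 1, x**2 + 1}
```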
Example 2.4.21.
Let F be a field and consider the polynomial ring F[x]. It is a well known fact that F[x] is a P.I.D. In fact, we can relax the restriction that F is a field and consider R[x], the polynomial ring with coefficients in a ring R. There is a nice result [DF04, Theorem 7, Chapter 9.3] which says that if R is a U.F.D. then so is R[x]. Shortly, we shall see another theorem of this variety, Hilbert's Basis Theorem, which asserts that if R is Noetherian then so is R[x].

Definition 2.4.22.
Let R be a ring and I an ideal. We say that I is finitely generated if there exists a finite set S such that I = ⟨S⟩. We call the ring R Noetherian if every ideal is finitely generated.

Theorem 2.4.23 (Hilbert's Basis Theorem). Let R be a Noetherian ring. Then R[x_1, …, x_n] is Noetherian for any n ∈ N.

The proof of this theorem is moderately technical and will be omitted. The main reason we consider Noetherian rings is the following proposition, which characterizes chains of ideals in R.

Proposition 2.4.24.
A ring R is Noetherian if and only if every ascending chain I_1 ⊆ I_2 ⊆ … of ideals stabilizes: that is, there exists n* ∈ N such that I_n = I_{n*} for all n ≥ n*.

Proof. (⇒) Let I_1 ⊆ I_2 ⊆ … be an ascending chain of ideals. Consider I = ∪_{n ∈ N} I_n. This is an ideal: any two elements lie in some common higher-indexed ideal, so addition is well defined, and closure under multiplication by R follows immediately. Thus I is an ideal and, by assumption, finitely generated. Let {a_1, …, a_k} be a generating set. Each a_i is contained in some I_{j_i}. Take j* = max{j_1, …, j_k}; then all of the generators lie in I_{j*} and the chain stabilizes at I_{j*}.

(⇐) Let I be an ideal and consider the set of all finitely generated ideals contained in I. This set has a maximal element m: otherwise we could pick elements x_1, x_2, … of I with Rx_1 ⊊ Rx_1 + Rx_2 ⊊ …, an ascending chain of ideals which does not stabilize, contradicting the hypothesis. We assert that m = I. If not, pick x ∈ I not in m; then m + Rx is a strictly larger finitely generated ideal contained in I, contradicting maximality. Hence I = m and I is finitely generated. This completes the proof.

We will see early in the next chapter that the notion of Noetherian can also be defined for topological spaces. We use this notion to relate Noetherian rings to a certain topology on Spec(R) called the Zariski topology. Another key class of rings are those with a single maximal ideal.

Definition 2.4.25.
Let R be a ring. We say that R is a local ring if it has a unique maximal ideal. It is customary to denote local rings as (R, m) or (R, m, k), where k = R/m is called the residue field of R. It is easy to show that m is precisely the set of all non-units in R.

We shall end this section with a discussion of localization and local rings; this will be related to some geometry in the next chapter. We postpone examples of local rings until the next chapter, where they arise quite naturally in the theory of manifolds and schemes.

Definition/Theorem 2.4.26.
A subset S ⊆ R containing 1 is called multiplicative if x, y ∈ S implies xy ∈ S. We define the localization with respect to the multiplicative set S as the set of symbols

S⁻¹R = { r/s : r ∈ R, s ∈ S } / ∼,  where r/s ∼ a/b if there exists t ∈ S such that t(rb − sa) = 0.

If S = R − p for some prime ideal p, then S⁻¹R = R_p is a local ring.

Proof. Using the standard definitions of addition and multiplication of fractions, one checks that S⁻¹R is indeed a ring. We now show that R_p is a local ring. We claim that pR_p is a maximal ideal in R_p. It is easily shown to be an ideal, so it suffices to show maximality. Suppose not; then by Proposition 2.4.15 there exists an ideal I of R_p with pR_p ⊊ I. As pR_p consists of all non-unit elements, I must contain a unit. Therefore I contains 1 and must be R_p itself. Hence pR_p is a maximal ideal. Uniqueness follows for the same reason: any ideal not contained in pR_p must contain a unit and is therefore the entire ring. Hence R_p is a local ring.

This theorem motivates calling the operation localization. As will be seen later, prime ideals are the most important ideals in a ring: they give a precise bijection between certain bits of geometry and algebra.

The final topic of this section is module theory. This is a generalization of vector spaces over a field, as we now allow the ground space to be a ring. It is far more common to come across modules over rings than vector spaces. For this reason, there is an entire theory of modules and their generalizations to categories which is used extensively in the next chapter. This section draws from [DF04], [Mat86], [Rot15], [Lan02], and [Kna06].

Definition 2.5.1.
Let R be a ring. An abelian group M is called an R-module if there exists an action map R × M → M which is associative and distributes over addition in each argument. We also impose that 1 · m = m for all m ∈ M. A module is said to be finitely generated if there exists a finite set S such that M = Span_R S; here we adopt the same notion of span as for vector spaces. A module homomorphism is an additive map which is R-equivariant. A module M is an R-algebra if M also has the structure of a ring: that is, a map µ : M × M → M which is a multiplication and satisfies the axioms of multiplication in a ring.

Let us look at some examples of modules and submodules.

Example 2.5.2. (a) Let M = R. Then R carries the structure of an R-module trivially. Further, every ideal I can be considered as an R-submodule. Moreover, we can define R^n = R ⊕ R ⊕ … ⊕ R as a module with multiplication in each coordinate.

(b) Let ϕ : A → B be a ring homomorphism. Then B can be given the structure of an A-module by defining a · b = ϕ(a)b. The properties of a ring homomorphism guarantee that this is indeed an action and satisfies the axioms of a module.

(c) Let F be a field and V a vector space over F. Then V is an F-module. In fact, F-modules are precisely the F-vector spaces; over a general ring this correspondence fails.

(d) Let R = Z and let M = Z ⊕ Z/mZ. Then M is an R-module with the multiplication defined by n · (a, [b]) = (na, [nb])
In fact, any abelian group G has a natural structure of a Z-module. This comes from the identification n · g = g^n or ng, depending on whether one uses multiplicative or additive notation.

(e) The polynomial ring R[x_1, …, x_n] is an R-module in the obvious way: r · f = rf.

The main difference between vector spaces and modules is that modules do not always have bases (in fact, they rarely have them). For example, over most rings the existence of quotient modules such as R/I shows that modules can look very different from R^n.

Let us consider some operations on modules.

Definition 2.5.3.
Let M, N be R-modules. We define M ⊕ N := {(m, n) : m ∈ M, n ∈ N} to be the external direct sum of modules. For a collection of modules {M_i} we can likewise take their direct sum; if the indexing set is infinite, we define ⊕_I M_i as the tuples with finitely many non-zero entries. The direct sum comes equipped with natural morphisms M, N ↪ M ⊕ N. We can also build the internal direct sum for two submodules of a larger module P. In this case we denote the internal direct sum as M + N := {m + n : m ∈ M, n ∈ N}. It is an easy exercise to show that the internal and external direct sums are isomorphic if M ∩ N = 0.

For M an R-module and N ⊆ M, we can define the quotient module M/N in the same way we defined V/W for vector spaces W ⊆ V. Furthermore, we can realize every submodule as the kernel of some module homomorphism via the short exact sequence

0 → N → M → M/N → 0

Definition/Proposition 2.5.4.
We say a short exact sequence 0 → M → N → P → 0 splits on the left if there exists a morphism N → M which, when composed with the inclusion, is the identity. It splits on the right if there exists a morphism P → N which composes with the projection to the identity. If a short exact sequence splits on the right, then N ≅ M ⊕ P. This gives a characterization of split short exact sequences.

Proof.
Let i : M → N and p : N → P be the arrows in the above exact sequence. As we assume the sequence is split, we know there exists j : P → N such that p ∘ j = 1_P. We will show that N = Im i ⊕ Im j. Let n ∈ N; then n − (j ∘ p)(n) ∈ ker p, as p(n − jpn) = p(n) − p(n) = 0. By exactness, ker p = Im i, so there exists m ∈ M such that i(m) = n − jpn. It follows that N is the internal direct sum N = Im i + Im j. We need to prove that Im i ∩ Im j = 0. Let a ∈ M and b ∈ P be such that i(a) = x = j(b). Applying p to both sides, we get 0 = p(i(a)) = p(j(b)) = b, so b = 0 and hence x = j(0) = 0. Hence, N ≅ M ⊕ P.

Corollary 2.5.5. If 0 → U → V → W → 0 is a short exact sequence of vector spaces, then the following conditions are equivalent:

(a) The sequence splits on the left.
(b) The sequence splits on the right.
(c) V ≅ U ⊕ W.

Corollary 2.5.6 (Rank–Nullity Theorem). For a linear map f : V → W between vector spaces, we have that rank(f) + nullity(f) = dim V.

Proof.
Set up the exact sequence

0 → ker f → V → Im(f) → 0

Every subspace of a vector space admits a complement, so this sequence splits and V ≅ ker f ⊕ Im(f). Taking dimensions gives rank(f) + nullity(f) = dim V.

Example 2.5.7.
Consider the exact sequence of Z-modules

0 → Z → Q → Q/Z → 0

This sequence does not split: Q ≇ Z ⊕ Q/Z. An easy calculation shows that Q/Z has elements of arbitrary finite order, while Q is torsion-free, so Q/Z cannot be a direct summand of Q.

In fact, it is a fairly standard exercise to show that a short exact sequence of R-modules splits if and only if the middle term is the internal direct sum of the image of the first map and a submodule mapping isomorphically onto the third term.

We now move on to some theorems for modules which we have seen before.

Theorem 2.5.8 (First and Fourth Isomorphism Theorems). (a) Let ϕ : M → N be an R-module homomorphism. Then M/ker ϕ ≅ ϕ(M).

(b) Let N ⊆ M. Then the submodules of M/N are in one-to-one correspondence with the submodules of M containing N.

Proof.
The proof of this follows immediately from the proof for the ring case: Theorems 2.4.12 and 2.4.13.

It should be no surprise at this point that this theorem is true. After all, we can regard modules as abelian groups, and the result held true there. The only thing needing to be checked is R-equivariance, but this follows immediately from the definitions.

Definition 2.5.9. We call an R-module M free if M ≅ R^n = R ⊕ R ⊕ … ⊕ R for some n ∈ N. We say that M is finitely presented if there exists a short exact sequence

0 → K → F → M → 0

where K, F are free, finitely generated R-modules. For any set S we can build the free R-module R⟨S⟩ with basis S. The following theorem gives a universal property for such modules.

Theorem 2.5.10 (Universal Property of Free Modules). Let S be a set and M an R-module such that there exists a map ϕ : S → M. Then there exists a unique R-module homomorphism ϕ̂ : R⟨S⟩ → M such that ϕ̂ ∘ i = ϕ, where i : S → R⟨S⟩ is the inclusion of the basis.

The idea of finitely presented modules becomes important for the theory of sheaves, which will be developed at the end of the next chapter. For now, we have a fundamental result on modules over a principal ideal domain.
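As a concrete preview, decompositions of this kind are effectively computable over Z: putting a relation matrix into Smith normal form reads off the invariant factors. A minimal sketch, assuming the sympy library is installed (the matrix A is our own illustrative choice):

```python
# Invariant-factor decomposition over the P.I.D. Z via the Smith normal form
# (assumes sympy is installed; the relation matrix A is an illustrative choice).
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# Relation matrix of a finitely generated abelian group (= Z-module):
# the quotient Z^3 / (column span of A).
A = Matrix([[2, 0, 0],
            [0, 4, 0],
            [0, 0, 0]])

D = smith_normal_form(A, domain=ZZ)
# Nonzero diagonal entries are the invariant factors r_i; zero rows/columns
# contribute free summands, so here the module is Z/2 ⊕ Z/4 ⊕ Z.
assert D == Matrix([[2, 0, 0], [0, 4, 0], [0, 0, 0]])
```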
Theorem 2.5.11 (Fundamental Theorem of Finitely Generated Modules over a P.I.D.). Let R be a P.I.D. and M a finitely generated R-module. Then

M ≅ R^k ⊕ R/⟨r_1⟩ ⊕ R/⟨r_2⟩ ⊕ … ⊕ R/⟨r_m⟩

for some non-unit elements {r_i}.

Proof.
We will show existence of such a decomposition. Let v_1, …, v_n be a generating set for M and consider R⟨x_1, …, x_n⟩, the free module on the same number of generators. There is a homomorphism ϕ : R⟨x_1, …, x_n⟩ → M which sends x_i ↦ v_i, and it is surjective by construction. Therefore, by Theorem 2.5.8, M ≅ R⟨x_1, …, x_n⟩/ker ϕ. One can show that the basis may be chosen so that r_1 x_1, …, r_m x_m is a generating set for ker ϕ, and therefore ker ϕ = ⊕_{i ≤ m} R r_i x_i. Taking the quotient, we get

M ≅ (⊕_{i ≤ n} R x_i) / (⊕_{i ≤ m} R r_i x_i) ≅ (⊕_{i ≤ m} R x_i / R r_i x_i) ⊕ R^{n−m}

The terms of the direct sum become R/⟨r_i⟩ under the natural identification. Hence,

M ≅ R^{n−m} ⊕ R/⟨r_1⟩ ⊕ R/⟨r_2⟩ ⊕ … ⊕ R/⟨r_m⟩

As Z is a principal ideal domain, this applies to abelian groups as well. As will be seen in the next chapter, this theorem becomes incredibly important in homology theory for finite cell complexes.

2.5.2 Multilinear algebra

The final part of this chapter concerns multilinear algebra. In this section we introduce another operation on modules which gives a way of building new modules from old ones, the tensor product, and we discuss related topics such as exterior powers of modules and vector spaces. For this section, let R be a ring, let L, M, N be R-modules, and let S ⊇ R be a ring containing R.

Definition 2.5.12.
We call a function θ : M ⊕ N → L bilinear if it is linear in each argument. More generally, a multilinear function f : ⊕_i M_i → L is one which is linear in each argument.

It should not come as a surprise that multilinear functions are more difficult to deal with than linear functions. There is a way to convert between the two, but it involves a new module.

Definition 2.5.13 (Definition/Construction). Let F(M × N) denote the free R-module generated by M × N. Consider the submodule G generated by the relations

(a + a′, b) ∼ (a, b) + (a′, b)
(a, b + b′) ∼ (a, b) + (a, b′)
(ar, b) ∼ (a, rb)   for r ∈ R

We define M ⊗_R N = F(M × N)/G. As R is commutative, M ⊗_R N is an R-module with scalar multiplication induced by the final relation. There is a canonical map ⊗ : M × N → M ⊗_R N which sends (m, n) ↦ m ⊗ n. Elements of M ⊗_R N are sums of the formal symbols m ⊗ n; a single such symbol is called a simple tensor.

Theorem 2.5.14 (Universal Property). For every bilinear map ϕ : M × N → L there exists a unique linear map ϕ̂ : M ⊗ N → L such that ϕ = ϕ̂ ∘ ⊗.

This theorem is sometimes given as the definition of the tensor product, as it implies the tensor product is unique up to isomorphism. The nice part of this theorem is that it gives a bijection

Bil(M, N; L) ≅ Hom_R(M ⊗ N, L)

where Bil(M, N; L) is the set of bilinear maps M × N → L. Therefore, we can turn multilinear functions into linear ones by using the appropriate number of tensors. This is also true for arbitrary collections of modules and multilinear maps. Let us look at some immediate applications of tensor products.

Proposition 2.5.15.
Let V, W be finite dimensional vector spaces over a field K. Then there is an isomorphism

V* ⊗ W ≅ Hom_K(V, W)

which sends Π(ϕ ⊗ w) = ϕ(−)w.

Lemma 2.5.16. For V, W as above, dim V ⊗ W = dim V · dim W.

Proof. Let B, C be bases for V and W respectively. Then we can take as a basis for V ⊗ W the set of simple tensors b_i ⊗ c_j for b_i ∈ B and c_j ∈ C. There are |B| · |C| of these.

Proof of Proposition 2.5.15.
It suffices to prove that the map Π is injective: the dimensions of the two spaces agree by Lemma 2.5.16, so we can then apply Theorem 2.3.35(d). So, let φ ⊗ w ∈ ker Π. Then Π(φ ⊗ w)(v) = φ(v)w = 0 for all v ∈ V. If w ≠ 0, then φ(v) = 0 for all v ∈ V, and therefore, by the uniqueness of 0 ∈ V*, we have φ = 0. In either case φ ⊗ w = 0. This completes the proof.
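Concretely, for V = R³ and W = R², the proposition says that a simple tensor φ ⊗ w acts as the rank-one map v ↦ φ(v)·w, and general linear maps are sums of such maps. A numpy sketch (numpy assumed available; the particular vectors are our own illustrative choices):

```python
# V* ⊗ W ≅ Hom(V, W) in coordinates (assumes numpy; vectors chosen for illustration).
import numpy as np

# A simple tensor φ ⊗ w becomes the rank-one map v ↦ φ(v)·w,
# whose matrix is the outer product of w with φ.
phi = np.array([1.0, 2.0, 3.0])           # a functional φ ∈ V*, V = R^3
w = np.array([1.0, -1.0])                 # a vector w ∈ W, W = R^2

T = np.outer(w, phi)                      # the matrix of Π(φ ⊗ w), shape (2, 3)
v = np.array([1.0, 0.0, 2.0])
assert np.allclose(T @ v, (phi @ v) * w)  # Π(φ ⊗ w)(v) = φ(v)·w

# Dimension count of Lemma 2.5.16: simple tensors of basis vectors,
# realized as Kronecker products, give dim V · dim W independent vectors.
basis = [np.kron(e, f) for e in np.eye(3) for f in np.eye(2)]
assert np.linalg.matrix_rank(np.stack(basis)) == 3 * 2
```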
Remark 2.5.17.
Notice that the map itself is canonical, but the choice of bases in the proof is not: on simple tensors ϕ ⊗ w ∈ V* ⊗ W the assignment is canonical, and bases are only needed to verify that the extension to the entire tensor product is an isomorphism.

Sometimes it is useful to consider the module M as an S-module instead of an R-module. The following lemma gives a way to do such a thing.

Lemma 2.5.18.
We can extend scalars from R to S by sending

M ↦ M ⊗_R S

where the R-module structure on S is given by the inclusion R ⊆ S. The result is an S-module.

Proof. Let s ∈ S. We need to define s(m ⊗ t) and then extend by linearity. Simply define s(m ⊗ t) = m ⊗ st. As R and S are commutative, this is a valid action.

Lemma 2.5.19.
There is a canonical isomorphism R ⊗_R M ≅ M for any R-module M.

Proof. Let ϕ : R ⊗_R M → M be given by ϕ(∑ r_i ⊗ m_i) = ∑ r_i m_i. We claim this is an isomorphism. Consider the map m ↦ 1 ⊗ m. This is an inverse for ϕ on both the left and right. Hence, ϕ is an isomorphism.

Now let us investigate the module M and its tensor powers M^⊗n = ⊗^n M. These spaces parametrize, in some sense, the multilinear maps ∏^n M → M. We can build an algebra out of these modules by taking a large direct sum.

Definition 2.5.20.
Let M be an R-module. The tensor algebra of M is the R-algebra

T•(M) = ⊕_{n ∈ N} M^⊗n

The algebra structure on T•(M) is given by concatenation: for v ∈ M^⊗n and w ∈ M^⊗m, the product is v ⊗ w ∈ M^⊗(m+n). (Some authors simply write T(M) or T*(M) for the tensor algebra; we do not, as it would become difficult to distinguish T(M) from TM in the next chapter.)

We have the following universal property of the tensor algebra.

Proposition 2.5.21 (Universal Mapping Property of the Tensor Algebra). Let A be an R-algebra and f : M → A an R-module homomorphism. Then there exists a unique R-algebra homomorphism f̂ : T•(M) → A extending f, that is, f̂ ∘ i = f for the inclusion i : M → T•(M).

The proof is of the same flavor as the other universal mapping properties and thus will not be reproduced here. What we will concern ourselves with, however, is a certain ideal of T•(M).

Definition 2.5.22.
A tensor in T•(M) is called alternating if it has the form

v = m_1 ⊗ … ⊗ m ⊗ … ⊗ m ⊗ … ⊗ m_n

that is, some element of M appears at least twice among the factors. Let J be the ideal of T•(M) generated by all such alternating tensors; equivalently, J = ⟨v ⊗ v : v ∈ M⟩.

Lemma 2.5.23. If Char(R) ≠ 2, then J coincides with the ideal L = ⟨x ⊗ y + y ⊗ x : x, y ∈ M⟩. Note that the identity (x + y) ⊗ (x + y) − x ⊗ x − y ⊗ y = x ⊗ y + y ⊗ x shows L ⊆ J in any characteristic.

Definition/Theorem 2.5.24.
Let R be a ring with Char(R) ≠ 2 and put ⋀•(M) = T•(M)/J. This is called the exterior algebra of M, and it has the following universal property: given any R-algebra A and a module map φ : M → A such that φ(m)φ(m) = 0 for all m ∈ M, there exists a unique algebra homomorphism ⋀•(M) → A which makes the associated diagram commute.

Proof. The universal property of the tensor algebra gives us a map Ψ : T•(M) → A. Since Ψ(m ⊗ m) = φ(m)φ(m) = 0 for all m ∈ M, the ideal J lies in the kernel of Ψ. Hence Ψ descends to a map on ⋀•(M). This completes the proof.

It is common practice to denote elements of ⋀•(M) with ∧ instead of ⊗. In this notation we get immediately that v ∧ w = −w ∧ v, which is equivalent (away from characteristic 2) to the condition v ∧ v = 0.

Remark 2.5.25.
We shall end this section with some nice properties of the exterior algebra, for ready use in the next chapter.

(a) We can build ⋀^k(M) in a similar way to ⋀•(M): simply quotient the tensor power M^⊗k by the alternating tensors it contains. In this vein, ⋀^k(M) ∧ ⋀^l(M) ⊆ ⋀^{k+l}(M), which gives ⋀•(M) its algebra structure.

(b) If V is a finite dimensional vector space of dimension n, then it can be shown that dim ⋀^k(V) = (n choose k). Therefore ⋀•(V) is a finite dimensional algebra.

(c) Recall the definition of a Lie algebra from above. A different way to state the conditions of a Lie algebra is that V is a vector space equipped with a map [−,−] : ⋀²(V) → V satisfying the Jacobi identity.

(d) As we will see in the next section, we can equivalently consider ⋀^k(V*) as the vector space of alternating k-forms on V. This allows us to do calculus on these spaces and is a bridge between the theory of manifolds (Chapter 3) and algebra, among others.

(e) (Determinants) Let V have dimension n and consider the top exterior power ⋀^n(V). This is a 1-dimensional space by (b) above. Consider any T ∈ Hom(V, V) := End(V) and define the extension

T : ⋀^n(V) → ⋀^n(V),  T(v_1 ∧ … ∧ v_n) = Tv_1 ∧ … ∧ Tv_n

As this is an endomorphism of a 1-dimensional space, it must be multiplication by some λ ∈ K. We therefore define the determinant of T to be the unique number λ such that

Tv_1 ∧ … ∧ Tv_n = (det T)(v_1 ∧ … ∧ v_n)

It then follows from the definition that for S, T ∈ End(V) we get det ST = det S · det T. Readers familiar with the determinant formula for a matrix should recognize this as the standard property of the determinant. Furthermore, we have the following lemma.

Lemma 2.5.26. A matrix M is invertible if and only if det M ≠ 0.

Proof. Abusing notation, by Theorem 2.3.39 we consider the linear transformation associated to the matrix M. Then det M ≠ 0 if and only if the induced map M : ⋀^n(V) → ⋀^n(V) is an isomorphism, which holds if and only if M itself is an isomorphism. Hence, M has an inverse as a linear transformation and thus as a matrix.

This also gives a nice way to think about the determinant: it is the signed volume of the parallelepiped spanned by the vectors Tv_1, …, Tv_n. This completes the chapter.

Chapter 3
Topology and Geometry: From Spaces to Sheaves

This chapter runs through the basics of category theory, (point-set) topology, differential geometry, and sheaf theory. The main goal is to define and give important properties of manifolds. To mathematicians, these are generalizations of Euclidean space and provide a natural context in which to do calculus on non-flat spaces (more on this in Section 3.3). There is some ambiguity in how "manifold" is used by psychologists, which causes technical problems when comparing computational models that claim to rely on "manifold" structure. We shall give the formal mathematical constructions of these objects and, in Chapter 4, use them to construct a perceptual space which encodes the generalized perceptual categories of Chapter 1. Before then, we want to bridge the gap from the previous chapter to this one by exploring category theory.
Category theory began with the observation that many well known results of algebra (such as the isomorphism theorems above) seemed to be linked. We now know that this is because they follow from general facts about what are known as Additive and Abelian categories. Although this theory is beautiful to those who fully understand the concepts, it can seem esoteric and impenetrable to beginners. As we are assuming little to no familiarity with these topics, we shall go into a bit more detail for most of the proofs in this section and provide several examples for each definition and theorem. For references, we make extensive use of [ML71], [Kna06], [Kna07], [Rot09], and [Lee12].
Before giving the definition of a category, we want to understand more precisely the language used in the previous chapter. The main goal is to understand the relationship between morphisms of groups, rings, and modules. Category theory provides a setting in which these are all intimately related.

Example 3.1.1. Let G, H be groups (not necessarily abelian). Denote by Hom(G, H) the set of all group homomorphisms. If G and H are assumed to be abelian, then Hom(G, H) can be endowed with the structure of an abelian group in a natural way, with n · f defined by (n · f)(g) = f(ng) = n f(g) ∈ H. Notice that for H non-abelian we can still define a Z-module structure on Hom(G, H) by (n · f)(g) = f(ng), and we can similarly define a Z-module structure if G is non-abelian and H is abelian. Thinking of Hom as a function on the set of all groups, we can ask how it interacts with group homomorphisms. To check this, let ϕ : G → G′ be a morphism of groups. Define

ϕ* : Hom(G′, H) → Hom(G, H),  f ↦ f ∘ ϕ

If instead we had a morphism ψ : H → H′, then there is a canonical map

ψ* : Hom(G, H) → Hom(G, H′)

defined as you would imagine. Therefore, Hom can detect which argument a morphism was taken in: in the first argument the direction is reversed, whereas in the second argument the direction is preserved.

If we generalize the above example to rings and ring homomorphisms, we get the exact same result. Let R, R′, S, S′ be rings and ϕ : R → R′, ψ : S → S′ be ring homomorphisms. Then ϕ* and ψ* are defined according to the definitions above. The same story works with modules as well. This should not be surprising, however, as every abelian group is a Z-module and we know how Hom works for abelian groups.

This refines the original observation about Hom: it can detect which argument is being manipulated, but it cannot (without some poking) detect group, ring, or module structures. What we do know is that it plays suitably nicely with morphisms of the appropriate objects. It is precisely this notion which categories and functors generalize.

Definition 3.1.2.
A (small) category is a triple C = (Obj(C), Hom_C(−,−), ∘), with Obj(C) a set, an assignment to any two objects A, B ∈ Obj(C) of a set Hom_C(A, B) of morphisms between A and B, and a composition function ∘ such that for all A, B, C ∈ Obj(C),

∘ : Hom_C(B, C) × Hom_C(A, B) → Hom_C(A, C)

These are subject to the following axioms:

(a) Hom sets are disjoint (that is, every morphism has a unique domain and codomain).
(b) For every A ∈ Obj(C) there exists 1_A ∈ Hom_C(A, A) such that 1_A ∘ f = f and g ∘ 1_A = g whenever the compositions are defined.
(c) ∘ is associative.

(We are intentionally being sloppy here. As will be seen shortly, Hom(−,−) is a functor Grp → Set.)

If it is clear from the context, we shall simply write Hom(A, B) for the set of morphisms. A subcategory of C is a triple D = (Obj(D), Hom_D(−,−), ∘) where Obj(D) is a subset of Obj(C) and Hom_D(A, B) ⊆ Hom_C(A, B). Composition is taken as in C.

Notice that this definition does not require the objects themselves to be sets. This distinction is what makes proving things in category theory particularly frustrating: one cannot reference elements of an object when defining a morphism.

Example 3.1.3. (a) Consider a small directed graph with three vertices and some arrows between them (the original figure is omitted here).
Define a category C whose objects are the vertices of the graph, whose morphisms are the arrows, and whose composition is concatenation of paths. Notice that the objects of this category have no notion of element (i.e. they are not sets); therefore, if we wish to prove something about this category, we have to rely on "arrow theoretic" proof. That is to say, we need to understand the morphisms in the category instead of the objects.

(b) We now return to the algebraic objects of the previous chapter. For your favorite object from that chapter, it should be obvious that the objects of that type form a category. We denote the categories as such:

(i) Grp: the category of groups.
(ii) Ring: the category of rings.
(iii) Field: the category of fields.
(iv) R−Mod: the category of R-modules for a fixed ring R.
(v) Vect_K: the category of K-vector spaces.
(vi) Ab: the category of abelian groups.

Notice that Ab is a subcategory of Grp. In fact, every category above can be realized as a subcategory of Grp!

(c) The "category" of sets is denoted Set. The quotation marks are for caution: the "collection of all sets" is not itself a set (try to prove this!) but instead a proper class. We are going to ignore almost all set theoretic problems that may arise. Nonetheless, this is an honest category (once you fix your model of set theory) and it is quite important. A majority of what will come up when we discuss functors can be realized as some generalization of something involving sets.

Remark 3.1.4.
For the remainder of this thesis, we shall denote categories by calligraphic or script letters such as C when we are in a general setting, or by a corresponding bold-face name such as Grp for the category of groups.
Definition 3.1.5.
Let C and D be two categories. We define the product category C × D as the category whose objects are pairs (C, D) with C ∈ Obj(C) and D ∈ Obj(D), and whose morphisms (C, D) → (C′, D′) are pairs (f, g) of morphisms f : C → C′ and g : D → D′, composed componentwise.

Now that we have the notion of a category, we may ask if there are any "special" morphisms in a category; what we mean by special will become apparent shortly. Consider the category Set. The following lemma gives a different characterization of injective and surjective functions which is easily generalizable.
Lemma 3.1.6.
Let f : A → B and g : A′ → B′ be two functions. Then f is injective if and only if for any two arrows i_1, i_2 : C → A, the equality

f ∘ i_1 = f ∘ i_2 implies i_1 = i_2.

Similarly, g is surjective if and only if for any two arrows s_1, s_2 : B′ → C′, the equality

s_1 ∘ g = s_2 ∘ g implies s_1 = s_2.

This means that injective maps are left cancellable and surjective maps are right cancellable.

Proof. We shall prove the injective case and leave the surjective case to the reader. (⇐) Assume that f is left cancellable. For any a, a′ ∈ A, let ϕ_a : {∗} → A be the function which picks out the element a. If f(a) = f(a′), then f ∘ ϕ_a = f ∘ ϕ_{a′}, so left cancellability gives ϕ_a = ϕ_{a′} and hence a = a′. Thus f is injective. (⇒) The other direction is obvious from the definition of injectivity. This completes the proof.

Notice that we can rewrite the injectivity condition as a diagram: a parallel pair of arrows i_1, i_2 : C ⇒ A followed by f : A → B. More generally, we can think of arrows in arbitrary categories which have the left (resp. right) cancellable property.
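The equivalence in the lemma can be checked by brute force on small finite sets. A Python sketch (the function name `left_cancellable` and the particular maps are our own illustrative choices):

```python
# Injectivity as left-cancellability, checked by brute force on finite sets.
from itertools import product

def left_cancellable(f, A, C):
    """f (a dict on A) is left-cancellable iff f∘i1 = f∘i2 forces i1 = i2
    for all functions i1, i2 : C -> A, enumerated as tuples of values."""
    arrows = list(product(A, repeat=len(C)))   # all functions C -> A
    return all(i1 == i2
               for i1, i2 in product(arrows, repeat=2)
               if all(f[i1[c]] == f[i2[c]] for c in range(len(C))))

C = [0]                                        # a one-point test object
injective = {0: 0, 1: 1}                       # injective map on {0, 1}
collapse = {0: 0, 1: 0, 2: 1}                  # not injective on {0, 1, 2}
assert left_cancellable(injective, [0, 1], C)
assert not left_cancellable(collapse, [0, 1, 2], C)
```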
Definition 3.1.7.

Let C be a category and f : A → B a morphism. We say that f is monic when for any pair of morphisms g, h : C ⇒ A, the equality f ∘ g = f ∘ h implies g = h. We say that f is epic when for any pair of morphisms p, q : B ⇒ D, the equality p ∘ f = q ∘ f implies p = q. We call f an isomorphism if there exists r : B → A such that f ∘ r = 1_B and r ∘ f = 1_A. Further, we denote isomorphisms by either A ≅ B or A ∼→ B.

In all concrete categories (ones which can be realized as subcategories of Set), injective maps are monic and surjective maps are epic, mirroring the result of Lemma 3.1.6. In fact, this is precisely why the usual definitions of isomorphism coincide with the categorical one for all of the algebraic objects in Chapter 2! In general, the converse is not true. Let R, S be two rings and UR, US their underlying sets. Then an injective function f : UR → US need not be a ring homomorphism. For an easy example, consider R = S = Z and the map

2 : Z → Z,  x ↦ 2x

This is a perfectly well defined injective function but is definitely not a ring homomorphism, as 1 cannot be written as 2z for any z ∈ Z.

Something else which needs generalization is the equivalence in Set between isomorphisms and bijections. In a general category, every isomorphism is necessarily monic and epic, but the converse may fail (take, for example, the map above, modified so that it sends the identity to the identity). We want to deal with categories where this is true.
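The claim about the doubling map is easy to check directly (plain Python; a quick illustrative sketch):

```python
# The doubling map x ↦ 2x on Z is injective and respects addition,
# but it is not a ring homomorphism: it fails on products and on 1.
double = lambda x: 2 * x

assert double(3 + 4) == double(3) + double(4)    # a homomorphism of (Z, +)
assert double(3 * 4) != double(3) * double(4)    # 24 ≠ 48: not multiplicative
assert double(1) != 1                            # does not preserve the identity
```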
Definition 3.1.8.
A category B is called balanced if every morphism which is both monic and epic is an isomorphism.

Many concrete categories are balanced; more often than not, this is something which needs to be proven, but the proof is usually not too hard.

Before moving forward, it is important to single out some distinguished objects of certain categories.

Definition 3.1.9.
An object T ∈ C is a terminal object if for every object A ∈ C there exists a unique morphism (denoted ∃!) A → T. An object I ∈ C is initial if for every object A ∈ C there exists a unique morphism I → A. A zero object is an object which is both terminal and initial.

Proposition 3.1.10. Initial, terminal, and zero objects are unique up to unique isomorphism.

Proof. The proofs for initial, terminal, and zero objects are essentially identical; for this reason, we shall only prove the initial case. Let I_1, I_2 be two initial objects. By definition there exist unique morphisms ι_1 : I_1 → I_2 and ι_2 : I_2 → I_1. It suffices to show that ι_2 ∘ ι_1 = 1_{I_1} and ι_1 ∘ ι_2 = 1_{I_2}. As the objects are initial, the set Hom(I_i, I_i) contains a single element, namely 1_{I_i}. As the composition ι_2 ∘ ι_1 lies in Hom(I_1, I_1), it must be 1_{I_1}. By the same reasoning, ι_1 ∘ ι_2 = 1_{I_2}. Hence I_1 ≅ I_2 and this isomorphism is unique.

Example 3.1.11.
Zero, initial, and terminal objects are incredibly important in the theory of abelian categories (section 3.1.4). For this reason, we give the following examples:

(a) In Grp the zero object is the trivial group G = {1}.
(b) In Ring the initial object is Z, while there is no terminal object.
(c) In R-Mod the zero object is the 0 module.

Functors
Now that we have the notion of a category, we want to define morphisms of categories. Similar to the restrictions on a ring homomorphism, we want a morphism of categories to preserve both the objects and the morphisms.
Definition 3.1.12.
Let C, D be two categories. A (covariant) functor F : C → D is an assignment on objects and morphisms subject to the following:

(a) For all A ∈ Obj(C), F(A) ∈ Obj(D), and similarly for morphisms.
(b) If A −f→ B −g→ C is a sequence of morphisms in C, then F(g ◦ f) = F(g) ◦ F(f) is a morphism in D.
(c) F(1_A) = 1_{F(A)}.

Dually, we have the notion of contravariant functors, for which F(g ◦ f) = F(f) ◦ F(g). It is common practice to write FX for an object as opposed to F(X). We shall use these notations interchangeably.

Functors play a core role in the rest of the theory presented in this thesis. Specifically, they will form an important class of objects called sheaves (see section 3.3.2 below) which will ease the technical burden of understanding the geometry of perceptual spaces.

Lemma 3.1.13.
Let F : C → D be a functor. If ϕ : A → B is an isomorphism in C, then F(ϕ) is an isomorphism in D.

Proof.
Let ψ be ϕ^{−1} in C. Computing F(ϕ ◦ ψ) and F(ψ ◦ ϕ), we see that by property (b) of the definition of a functor, we have

1_{F(A)} = F(1_A) = F(ψ) ◦ F(ϕ)
1_{F(B)} = F(1_B) = F(ϕ) ◦ F(ψ)

Hence, F(ϕ) is an isomorphism.

The following examples of functors will play an exceptional role in section 3.3 below.

Example 3.1.14.

(a) Let (−)^op : Cat → Cat be an endofunctor of the category of categories (the morphisms in this category are functors). This sends a category C to the opposite category C^op. The objects of this category are the objects of C, but the morphisms have their target and source flipped. That is, if f : A → B is a morphism in C then f^op : B → A is a morphism in C^op. This allows us to redefine contravariant functors as covariant functors from the opposite category. As an added fact, (C^op)^op = C.

(b) Consider Hom_C(−, −) : C^op × C → Set. This is a bifunctor; viewed on C, it is contravariant in the first argument and covariant in the second argument.

(c) In R-Mod, − ⊗_R − is a bifunctor, covariant in both arguments. As we assume R is commutative, ⊗ makes R-Mod into a symmetric monoidal category. Algebras are monoid objects in this category.

(d) Let U : Grp → Set be the forgetful functor which sends a group to its underlying set. In fact, in any concrete category we have a forgetful functor to Set.

If C and D are categories, then denote by

Fun(C, D) := {F : C → D}

We want to turn this into a category. In order to do this, we need to introduce the idea of a morphism of functors.

Definition 3.1.15.
Let F, G : C → D be two functors of the same variance. A natural transformation is a family of morphisms {τ_X} which intertwine the functors: for every morphism f : X → Y, the square with sides τ_X : F(X) → G(X), F(f), G(f), and τ_Y : F(Y) → G(Y) commutes, i.e. τ_Y ◦ F(f) = G(f) ◦ τ_X. In this case we write τ : F → G.

These define the morphisms in Fun(C, D) and make it a category. Isomorphisms are natural transformations for which every τ_X is an isomorphism in D. In this case, we say that the two functors are naturally equivalent. The following lemma gives a description of natural transformations involving the Hom(A, −) functor.

Lemma 3.1.16 (Yoneda Lemma). Let G : C → Set be a functor and A an object in C. Then there is a bijection

y : Nat(Hom(A, −), G) → G(A)

Proof.
Define y(τ) = τ_A(1_A). To show this is injective, suppose y(τ) = τ_A(1_A) = σ_A(1_A) = y(σ). For any object B ∈ C and ϕ ∈ Hom(A, B), naturality gives a commutative square relating ϕ_* : Hom(A, A) → Hom(A, B) and Gϕ : G(A) → G(B), so that

τ_B(ϕ) = Gϕ(τ_A(1_A)) = Gϕ(σ_A(1_A)) = σ_B(ϕ)

Hence, τ_B = σ_B for all B ∈ C and thus τ = σ. So y is injective. (We shall not define monoidal categories here, but instead suggest [Kas95, Chapter XI]. Kassel uses the term tensor category, which is equivalent to "monoidal category.")
To show it is surjective, let x ∈ G(A). For every object B ∈ C and ψ ∈ Hom(A, B), define τ_B(ψ) = (Gψ)(x). We claim that τ is a natural transformation. Indeed, for any θ ∈ Hom(B, C), consider the commuting square relating θ_* : Hom(A, B) → Hom(A, C) and Gθ : G(B) → G(C). Going clockwise we get Gθ(τ_B(ψ)) = Gθ(Gψ(x)). Going counter-clockwise we have τ_C(θ_*ψ) = τ_C(θψ) = G(θψ)(x). As G is a functor, these are equal. Thus, τ is a natural transformation and τ_A(1_A) = G(1_A)(x) = x. Hence y is bijective. This completes the proof.

Now let F : C → D be a functor and X, Y ∈ C. Then F induces a function on Hom-sets

F_{X,Y} : Hom_C(X, Y) → Hom_D(FX, FY)

which takes a morphism f to F(f).

Definition 3.1.17.
We say that F is:

(a) Full if F_{X,Y} is surjective for all X, Y.
(b) Faithful if F_{X,Y} is injective for all X, Y.
(c) Fully-Faithful if F_{X,Y} is bijective for all X, Y.

Therefore, concrete categories are those which admit a faithful functor into Set. In general, fully-faithful functors play the same role as bijective functions on sets. In Cat, isomorphisms are necessarily fully-faithful. In general, a bijection on the level of Hom-sets is incredibly important.
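Stepping back to Definition 3.1.12, the functor laws can be spot-checked in a toy model. A Python sketch (our own construction, not from the text) of the "list functor" on a fragment of Set, which sends a set X to lists over X and a function f to its elementwise map:

```python
# A toy model of a covariant functor: X ↦ lists over X, f ↦ elementwise map.
def fmap(f):
    return lambda xs: [f(x) for x in xs]

def compose(g, f):
    return lambda x: g(f(x))

f = lambda x: x + 1
g = lambda x: 2 * x
xs = [1, 2, 3]

# Functor law (b): F(g ∘ f) = F(g) ∘ F(f).
assert fmap(compose(g, f))(xs) == compose(fmap(g), fmap(f))(xs)  # [4, 6, 8]

# Functor law (c): identities go to identities.
identity = lambda x: x
assert fmap(identity)(xs) == xs
```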
We now explore the final claim of the previous part. Let F : C ⇄ D : G be functors such that there exists a natural transformation η : 1_C → GF. Then we want to understand the induced morphism

Hom_D(FX, Y) → Hom_C(X, GY)

Definition 3.1.18.
Let F : C ⇄ D : G. We say that (F, G) are an adjoint pair if

Hom_D(FX, Y) ∼→ Hom_C(X, GY)

for all X ∈ C, Y ∈ D, and the bijection is natural in X and Y. In this case, we say that F is left adjoint to G and G is right adjoint to F. We denote this by F ⊣ G.

Theorem 3.1.19. An adjoint pair (F, G) induces two natural transformations η : 1_C → GF and ε : FG → 1_D such that the compositions

F −Fη→ FGF −εF→ F        G −ηG→ GFG −Gε→ G

are the identity morphisms.

Proof.
Let ϕ_{X,Y} : Hom_D(FX, Y) ∼→ Hom_C(X, GY) be the bijection for the adjoint pair. Then if Y = FX, the element 1_{FX} ∈ Hom_D(FX, FX) induces a morphism

η_X := ϕ_{FX,FX}(1_{FX}) : X → GFX
Define η : 1_C → GF by the family {η_X}. We need to show that η is natural in X: for any f : X → Y, the square with sides η_X, f, GF(f), and η_Y commutes by the fact that ϕ is natural in both X and Y. Similarly, we define ε_Y := ϕ^{−1}_{GY,Y}(1_{GY}). Its naturality is checked in a similar manner. Now,

1_{GY} = ϕ_{GY,Y}(ε_Y) = G(ε_Y) ◦ η_{GY}

again by the naturality of ϕ. We have the respective statement for 1_{FX}. This completes the proof.

Remark 3.1.20.
The natural transformations η : 1_C → GF and ε : FG → 1_D are called the unit and counit of the adjunction. We then denote an adjunction as a quadruple (F, G, η, ε).

Corollary 3.1.21. If (F, G, η, ε) and (F′, G, η′, ε′) are adjoint pairs, then F and F′ are naturally isomorphic.

Proof. η and η′ are universal arrows for each X. Therefore, there exists a unique isomorphism θ_X : FX → F′X for all X ∈ C. This family of isomorphisms is natural in X by the universality of the units. Hence, F ≅ F′ naturally.

Adjoint functors play a large role in understanding isomorphisms of categories. In fact, two categories are equivalent if there exists an adjoint pair (F, G, η, ε) such that η and ε are natural isomorphisms. To build up some intuition, here are some examples of adjoint functors.

Example 3.1.22.

(a) Let ⟨−⟩ : Set → Grp be the free group functor and U the forgetful functor. This sends a set X to the group ⟨X⟩ generated by all words in the elements of X. It is characterized by the property that for any function f : X → G with G a group, there exists a unique group homomorphism f̂ : ⟨X⟩ → G. We claim this makes ⟨−⟩ ⊣ U. In fact, the universal property gives a bijection

Hom_Grp(⟨X⟩, G) ← Hom_Set(X, UG)

In fact, for any concrete algebraic object we get an adjunction between the free functor and the forgetful functor in the same way.

(b) Consider Hom(M, −) and − ⊗_R M as covariant endofunctors of R-Mod. Then for any objects A, B ∈ R-Mod, there is a bijection

Hom(A ⊗ M, B) → Hom(A, Hom(M, B)),  f ↦ f̃

where f̃(a)(m) = f(a ⊗ m). In this case, we have some additional facts that come from the adjunction. The two most interesting (and important) ones are:

Hom(M, ∏ A_i) = ∏ Hom(M, A_i)
M ⊗ (⊕ A_i) = ⊕ (M ⊗ A_i)

for arbitrary indexing sets. We will see shortly that this is more generally a property of adjoint functors between abelian categories.

Limits and Colimits
We now want to generalize the last example and understand products and coproducts in generic categories. These manifest as limits and colimits respectively. Recall that a product of two objects A, B is an object A × B together with two maps A × B → A and A × B → B. To be more precise, this is the universal such object: for any other object C with maps C → A and C → B, there exists a unique map C → A × B through which the maps to A and B factor. Let us now generalize this.
Definition 3.1.23. An inverse system in a category C is a collection indexed by a partially ordered set I,

{A_i, ϕ_ji : A_j → A_i}_{i ⪯ j}

such that ϕ_ji ◦ ϕ_kj = ϕ_ki for all i ⪯ j ⪯ k. Equivalently, an inverse system is a functor A : I^op → C such that A(i) = A_i and the arrows of I^op are sent to the morphisms ϕ_ji. Therefore, A ∈ C^{I^op} = Fun(I^op, C).

An inverse system is thus a diagram in the category C of shape I^op.

Example 3.1.24.

(a) Let I = {1, 2, 3} with the partial order generated by 1 ⪯ 2 and 1 ⪯ 3. Then diagrams of shape I^op look like the cospan A → C ← B.
(b) If I is discrete (that is, the only partial order is equality) then a diagram of shape I^op is an indexed family of objects. This is the case for products as above.
(c) Let M be a concrete object. Then the subsets of M are ordered under inclusion and thus give a diagram of shape M^op.

Definition 3.1.25.
Let A ∈ C^{I^op} be an inverse system. Then we define the inverse limit (projective limit or limit) as the universal object lim← A_i together with morphisms α_j : lim← A_i → A_j for all j satisfying the following compatibility conditions:

(a) ϕ_ji ◦ α_j = α_i for i ⪯ j.
(b) If C is an object of C together with morphisms {β_i} which are compatible with A, then there exists a unique morphism C → lim← A_i through which every β_i factors.

These objects are complicated to look at, but are so useful that it's worth the technicalities. The following examples tie together some previous topics which at first do not seem related, but are all examples of limits.
Example 3.1.26.

(a) Consider the diagram D in R-Mod given by the two parallel morphisms f, 0 : A ⇒ C. Then lim← D = ker f. In this case, we see that the limit has the set representation

lim← D = {x ∈ A : f(x) = 0}

In fact, arbitrary limits exist in R-Mod by a simple argument considering sets like those above.

(b) Clearly, products as above are now limits over the discrete set I = {1, 2}.

(c) We define the pullback of a diagram of the form Example 3.1.24 (a) to be its limit. Almost always, these have a set representation as in example (a) here. In this case, we denote lim← D = A ×_C B.

(d) If we want to define intersections without using elements, we can do it using limits. Let A → C and B → C be monic morphisms (they are subobjects). Taking the limit of this diagram we get A ∩ B together with morphisms i : A ∩ B → A and j : A ∩ B → B. The resulting morphisms are clearly monic.

We have the dual notion to the above construction.
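The set representation of Example 3.1.26(c) has a concrete shadow in Set, where the pullback of f : A → C and g : B → C is the set of pairs agreeing in C. A quick sketch (the maps f and g are choices of our own, not from the text):

```python
# Pullback in Set: A ×_C B = {(a, b) : f(a) = g(b)}.
A = range(10)
B = range(10)
f = lambda a: a % 3   # A -> C with C = {0, 1, 2}
g = lambda b: b % 3   # B -> C

pullback = {(a, b) for a in A for b in B if f(a) == g(b)}

# The two projections commute with f and g, as the universal
# property of the limit demands.
assert all(f(a) == g(b) for (a, b) in pullback)
assert (4, 7) in pullback and (4, 8) not in pullback
```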
Definition 3.1.27. A direct system in a category C is a collection indexed by a partially ordered set I,

{A_i, ϕ_ij : A_i → A_j}_{i ⪯ j}

such that ϕ_jk ◦ ϕ_ij = ϕ_ik for all i ⪯ j ⪯ k. Equivalently, a direct system is a functor A : I → C such that A(i) = A_i and A(i → j) = ϕ_ij. Therefore, A ∈ C^I = Fun(I, C).

Example 3.1.28.

(a) Let I = {1, 2, 3} with the partial order generated by 1 ⪯ 2 and 1 ⪯ 3. Then diagrams of shape I look like the span A ← C → B.
(b) If I is discrete (that is, the only partial order is equality) then a diagram of shape I is an indexed family of objects. This is the case for coproducts as above.
(c) Let M be a concrete object. Then the subsets of M are ordered under inclusion and thus give a diagram of shape M.

Definition 3.1.29.
Let A ∈ C^I be a direct system. Then we define the direct limit (inductive limit or colimit) as the universal object lim→ A_i together with morphisms α_j : A_j → lim→ A_i for all j satisfying the following compatibility conditions:

(a) α_j ◦ ϕ_ij = α_i for i ⪯ j.
(b) If C is an object of C together with morphisms {β_i} which are compatible with A, then there exists a unique morphism lim→ A_i → C through which every β_i factors.

Example 3.1.30.

(a) Consider the diagram D in R-Mod given by the two parallel morphisms f, 0 : A ⇒ B. Then lim→ D = coker f. In this case, we see that the colimit has the set representation

lim→ D = B / {f(x) : x ∈ A}

In fact, arbitrary colimits exist in R-Mod by a simple argument considering sets like those above.

(b) Clearly, coproducts as above are now colimits over the discrete set I = {1, 2}.

(c) We define the pushout of a diagram of the form Example 3.1.28 (a) to be its colimit. Almost always, these have a set representation as in example (a) here. In this case, we denote lim→ D = A ⊕_C B.

(d) If we want to define internal sums without using elements, we can do it using colimits. Let A, B be two objects. Then A ∩ B → A and A ∩ B → B are monic morphisms (they are subobjects). Taking the colimit of this diagram we get A + B together with morphisms i : A → A + B and j : B → A + B. The resulting morphisms are clearly monic.

The following proposition gives motivation for thinking of limits and colimits as functors.
Proposition 3.1.31.
Let I be a partially ordered set. Then all limits and colimits of shape I exist in R-Mod.

Proof.
We prove the case of limits. The case of colimits is then formally dual and left as a fun exercise. Consider L ⊆ ∏_{i ∈ I} A_i, the submodule of threads

L = {(a_i) : ϕ_ji(a_j) = a_i for all i ⪯ j}
By construction this comes with compatible maps α_i : L → A_i. Now let X be any module with compatible maps {β_i}. Define θ : X → ∏ A_i by

θ(x) = (β_i(x))

Then Im θ ⊆ L. Further, α_i ◦ θ : x ↦ (β_i(x)) ↦ β_i(x). Hence, the limit diagram commutes. To show that θ is unique, let π : X → L be another such morphism. Then π(x) = (a_i) and α_i(π(x)) = a_i. Thus if α_i(π(x)) = β_i(x) for all i, we have that π = θ and thus

L ≅ lim← A_i

This completes the proof.

This proposition says that R-Mod is complete and cocomplete (meaning that all limits and colimits exist). So clearly, lim→ : R-Mod^I → R-Mod is functorial. One might hope this holds in any category; however, it does not.
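The submodule of threads in the proof can be made concrete. A sketch for the inverse system Z/8 → Z/4 → Z/2 of reduction maps (a small example of our own choosing, not from the text):

```python
# Threads in the product Z/2 × Z/4 × Z/8 for the inverse system
# Z/8 -> Z/4 -> Z/2 given by reduction mod 2^n.  The limit L is the
# submodule of compatible tuples (a1, a2, a3) with a3 ≡ a2 (mod 4)
# and a2 ≡ a1 (mod 2).
from itertools import product

threads = [(a1, a2, a3)
           for a1, a2, a3 in product(range(2), range(4), range(8))
           if a3 % 4 == a2 and a2 % 2 == a1]

# Each a3 ∈ Z/8 determines the whole thread, so here L ≅ Z/8.
assert len(threads) == 8
assert (1, 3, 7) in threads
```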
Example 3.1.32.
Let Ring be the category of rings and {R_i}_{i ∈ I} an infinite family of nonzero rings, regarded as a diagram over the discrete index set I. Then the colimit computed as in R-Mod, the infinite direct sum ⊕ R_i, is not an object of Ring. Why is this? Well, the unit element would necessarily be (1, 1, ...). But this is non-zero in every entry and thus cannot be an element of the direct sum, whose elements have only finitely many non-zero entries. In fact, most categories are not complete or cocomplete. When they are, it is obvious that lim→ and lim← are functors. For more information, see [HS97].

R-Mod

We now move into the final subsection. Here we are interested in categories which generalize the category of R-modules or abelian groups. The defining characteristics of these categories are that we can:

• Always take kernels and cokernels.
• Have an object 0.
• Take arbitrary products and coproducts.
• Hom(A, B) is an abelian group (or R-module).

Which of these properties are necessary for the generalization? This section will give an answer. At the end, we will introduce some homological algebra. This will allow us to associate invariants to modules. We start with additive categories.

Definition 3.1.33.
A category A is additive if the following are true:

(a) Hom(A, B) is an abelian group for all A, B ∈ A.
(b) There exists a zero object 0.
(c) Composition is distributive. That is, f(g + h) = fg + fh and (g + h)i = gi + hi.
(d) Finite products and coproducts exist.

A functor F : A → B is additive if F(f + g) = F(f) + F(g). That is, the morphism F_{X,Y} : Hom(X, Y) → Hom(FX, FY) is a group homomorphism. The following proposition gives some properties of additive categories and additive functors.

Proposition 3.1.34.
Let A, B be additive categories. Then finite products and coproducts are isomorphic. Moreover, if T is an additive functor, then T(A ⊕ B) = T(A) ⊕ T(B).

For a proof of this statement see [Rot09]. Now, using the constructions of ker and coker from above, we can prove

Lemma 3.1.35.
Let f ∈ Hom_A(A, B) be a morphism in an additive category.

(a) If ker f exists, then f is monic if and only if ker f = 0.
(b) If coker f exists, then f is epic if and only if coker f = 0.

Proof.
Let ι : ker f → A be the morphism from the diagrammatic definition above. Suppose ker f = 0. If g : X → A satisfies f ◦ g = 0, then by the universal property of limits, there exists a morphism θ : X → ker f with g = ι ◦ θ = 0. Hence, f is monic.

For the opposite direction, consider the diagram K −ι→ A −f→ B. Since f ◦ ι = 0 = f ◦ 0 and f is monic, we have that ι = 0. The proof for cokernels is dual to this one.
Definition 3.1.36.
An additive category A is abelian if

(a) Every morphism has a kernel and cokernel.
(b) Every monomorphism is a kernel and every epimorphism is a cokernel.
Example 3.1.37. In R-Mod, we have that every submodule S ⊆ M can be realized as a kernel via the map M → M/S. Cokernels are then the projections as given by the first isomorphism theorem (Theorem 2.5.8). Therefore, the requirements of an abelian category make it look strikingly like R-Mod. We are now able to form the same definitions as in Chapter 2, but now in the context of abelian categories.
Definition 3.1.38.
A sequence of morphisms A −f→ B −g→ C in A is called exact if ker g = Im f as subobjects in A. Now let 0 → A → B → C → 0 be a short exact sequence. We say an additive functor F : A → B between abelian categories is

(a) Left Exact if 0 → FA → FB → FC is exact.
(b) Right Exact if FA → FB → FC → 0 is exact.
(c) Half Exact if FA → FB → FC is exact.
(d) Exact if 0 → FA → FB → FC → 0 is exact.

Lemma 3.1.39 (Snake Lemma). Consider a commuting diagram in an abelian category with exact rows

A′ → A → A′′ → 0
0 → B′ → B → B′′

and vertical morphisms ψ : A′ → B′, ϕ : A → B, and θ : A′′ → B′′. Then there exists a morphism ∂ : ker θ → coker ψ making the following sequence exact

ker ψ → ker ϕ → ker θ → coker ψ → coker ϕ → coker θ

Proof.
Extend the above diagram to include ker θ and coker ψ. Now form the pullback and pushout accordingly: adjoin A ×_{A′′} ker θ above the first row and coker ψ ⊕_{B′} B below the second. From this, we immediately see that the sequence

0 → A′ → A ×_{A′′} ker θ → ker θ → 0

is exact. Define σ := (A′ → A ×_{A′′} ker θ), γ := (coker ψ ⊕_{B′} B → B′′), and the composite morphism ẽ := (A ×_{A′′} ker θ → coker ψ ⊕_{B′} B). From the exactness of the rows in the above diagram, we get that γ ◦ ẽ = ẽ ◦ σ = 0, so ẽ factors through the cokernel of σ and the kernel of γ. As these two objects are ker θ and coker ψ, define δ : ker θ → coker ψ as this morphism.

This yields a sequence of morphisms

ker ψ → ker ϕ → ker θ −δ→ coker ψ → coker ϕ → coker θ

For all pairs of morphisms not involving δ, exactness follows immediately. For the remaining morphisms, note that it suffices to show that ker ϕ → ker θ → coker ψ is exact, as we can then dualize the argument to get the same result for the dual sequence. To show this, let S ∈ A and π : S → ker θ be any morphism such that δ ◦ π =
0. Form the pullback of S and A ×_{A′′} ker θ over ker θ and adjoin it to the diagram; the dashed morphism f : A ×_{A′′} ker θ → B′ exists by the fact that A ×_{A′′} ker θ → B → B′′ is the zero morphism. Now, the composition S → coker ψ is 0, and thus we can find an epic morphism S′ ↠ S such that the composition S′ → B′ factors through A′; denote by g the morphism S′ → A′. Define the composite morphism λ : S′ → A and then consider the difference λ − f ◦ k : S′ → A, where k is the morphism S′ → A ×_{A′′} ker θ obtained from the pullback. This must factor through ker ϕ by the commutativity of the diagram above. Hence, we get a commuting square with rows S′ → S and ker ϕ → ker θ. The existence of this commuting diagram is equivalent to the exactness of the sequence ker ϕ → ker θ → coker ψ. Dualizing this argument we get the exactness of the other morphisms. This completes the proof.

Now we can tie together adjoints and abelian categories.

Theorem 3.1.40. Let F : A ⇄ B : G be adjoint functors with F ⊣ G. Then F is right exact and G is left exact. Further, F(lim→ A_i) = lim→ F(A_i) and G(lim← A_i) = lim← G(A_i).

The proof relies on the Yoneda Embedding [ML71], which we will not cover. This theorem thus implies a stronger result than we stated before about Hom and ⊗.

Corollary 3.1.41.
Hom is left exact in both arguments and ⊗ is right exact in both arguments. Therefore, given a short exact sequence of R-modules 0 → A → B → C → 0, the resulting sequences

0 → Hom(Y, A) → Hom(Y, B) → Hom(Y, C)
Y ⊗ A → Y ⊗ B → Y ⊗ C → 0

are exact everywhere.

Thus, for the rest of this chapter, we shall assume we are working in R-modules. This may seem at first like we are becoming too specific to be of any use for category theory. The following theorem tells us that this is not correct.

Theorem 3.1.42 (Mitchell). Let A be a small abelian category. Then there exists an exact, fully-faithful functor A → R-Mod for some ring R.

See [Rot09] for details.
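The one-sided exactness in Corollary 3.1.41 is easy to witness numerically. Applying Hom(Z/2, −) to the exact sequence 0 → Z/2 → Z/4 → Z/2 → 0 (inclusion x ↦ 2x, reduction mod 2), the induced map on Hom-sets fails to be surjective. A sketch (our own worked example), identifying a homomorphism Z/m → Z/n with the image of 1:

```python
# Hom(Z/2, −) applied to 0 → Z/2 → Z/4 → Z/2 → 0.  The induced map
# Hom(Z/2, Z/4) → Hom(Z/2, Z/2) is NOT surjective: Hom is left
# exact but not right exact.
def homs(m, n):
    """Homomorphisms Z/m → Z/n, each recorded as the image k of 1."""
    return [k for k in range(n) if (m * k) % n == 0]

proj = lambda x: x % 2          # the projection Z/4 → Z/2

# Post-composition with proj sends h ∈ Hom(Z/2, Z/4) to proj ∘ h.
image = {proj(k) for k in homs(2, 4)}

assert homs(2, 4) == [0, 2]     # the zero map and x ↦ 2x
assert homs(2, 2) == [0, 1]     # Hom(Z/2, Z/2) ≅ Z/2
assert image == {0}             # the identity of Z/2 is never hit
```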
Projective, Injective, and Flat Modules

Definition 3.1.43. An R-module P is projective if for every surjective map M → N and any map P → N there exists a map P → M making the evident triangle commute. Dually, an R-module I is injective if for every injective map 0 → L → M and any morphism L → I there exists a morphism M → I making the evident triangle commute.

These definitions seem obtuse and out of nowhere. The following lemma makes them seem less arbitrary.
Lemma 3.1.44.
The functor Hom(P, −) is exact if and only if P is projective. Also, the functor Hom(−, I) is exact if and only if I is injective.

Proof. We prove the injective case and leave the projective one to the reader as it is the same argument. (⇒) Assume first that Hom(−, I) is exact. Then for any exact sequence of modules 0 → A → B → C → 0, the sequence

0 → Hom(C, I) → Hom(B, I) → Hom(A, I) → 0

is exact; in particular, Hom(B, I) → Hom(A, I) is surjective. Being surjective means that for any morphism ϕ : A → I, there exists a morphism ϕ̂ : B → I which restricts to ϕ along A → B. This is precisely the definition of I being injective.

Now assume I is injective. For any f ∈ Hom(A, I), the definition tells us that f = i*(g) for some g ∈ Hom(B, I), where i : A → B is the inclusion. Hence, i* : Hom(B, I) → Hom(A, I) is surjective and Hom(−, I) is exact. This completes the proof.

Definition 3.1.45.
A module M is called flat if − ⊗_R M is exact. Moreover, every projective module is flat [Rot09].

For a given module M, we want to understand how far M is from being projective, injective, or flat. Clearly the functors Hom and ⊗ alone will not tell us this: they imply only that M is or is not one of these special modules. To remedy this, we will find a free resolution of M which is quasi-isomorphic to M, so that we can measure how far M is from being one of the special modules above.

Definition 3.1.46. A free resolution of an R-module M is an exact sequence F• → M → 0: a collection of free modules F_i and morphisms α_i so that

... → F_1 → F_0 → M → 0

is exact (equivalently, M is the cokernel of the map F_1 → F_0). If every F_i is projective (resp. flat) then F• is a projective (resp. flat) resolution of M. As injective modules are dual to projective ones, we have that an injective resolution of M is an exact sequence 0 → M → I•.

We care about these resolutions because the quotients ker α_i / Im α_{i+1} vanish for i > 0, and if we truncate the sequence and only consider up to F_0, then the cokernel of α_0 is M. Therefore, this sequence is in some sense no different from M itself. The next part goes into more detail about this.

Derived Functors
For a general abelian category, we have the notion of short exact sequences. In addition to this, we have the notion of (co)chain complexes. These will be the central objects we want to consider when answering the questions posed in the previous section.
Definition 3.1.47.
Let (C•, d•) be a collection of objects in an abelian category A together with morphisms d_n : C_n → C_{n−1}. We call (C•, d) a chain complex if d_{n−1} ◦ d_n = 0. If instead we have an object (C•, ∂•) with ∂_n : C^n → C^{n+1} such that ∂_{n+1} ◦ ∂_n = 0, we call it a cochain complex. It is common practice to drop the index on the differential d• or ∂• and simply denote them d and ∂. We shall adopt this convention.

A morphism of (co)chain complexes (C•, d) and (D•, d′) is a chain map f• (resp. f^•), that is, a collection of maps f_n so that d′ ◦ f_n = f_{n−1} ◦ d for all n. With this notion of morphism, we can build a new category Ch(A) of chain complexes (and likewise for cochain complexes). Notice that because of the condition d ◦ d = 0, we have that Im d_n ⊆ ker d_{n−1}.

Definition 3.1.48.
Let (C•, d) be a chain complex. Define the n-th homology groups of C• as

H_n(C•) = ker d_n / Im d_{n+1}

These are in fact groups, as shown in [Rot09]. Two chain complexes are quasi-isomorphic if there exists a chain map f• : C• → D• such that (f_i)* : H_i(C•) ∼→ H_i(D•), where (f_i)* is defined as [α] ↦ [f_i ◦ α]. This is well-defined by the definition of a chain map; further, f_i ◦ α ∈ ker d′_i. We have, completely analogously, the definition of cohomology groups H^i(C•). We call a (co)chain complex exact if all of the (co)homology groups are identically 0.

We now return to the content of the previous section. Let M be an R-module and P• a projective resolution of M. It then follows from the discussion above that P• (truncated at P_0) and M are quasi-isomorphic as chain complexes (here M is considered as the trivial chain complex with differential 0 everywhere). We can use this to our advantage. For any R-module A, consider Hom(−, A). The resulting cochain complex

Hom(P_0, A) → Hom(P_1, A) → Hom(P_2, A) → ...

is no longer exact.

Definition 3.1.49.
The n-th cohomology groups or n-th Ext groups of M and A are denoted

Ext^n_R(M, A) := H^n(Hom(P•, A))

Remark 3.1.50.
It can be shown [HS97] that these groups do not depend on the resolution taken. In fact, it does not even matter if we resolve A or M: there is a dual construction of Ext^n(M, A) where, instead of a projective resolution of M, we take an injective resolution of A. For ⊗, we have the corresponding construction, but now we only use projective resolutions as ⊗ is covariant in both arguments.

Definition 3.1.51.
The n-th homology groups or n-th Tor groups of M are

Tor^R_n(M, A) := H_n(P• ⊗ A)
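For modules over Z these groups are computable directly from the resolution 0 → Z −×m→ Z → Z/m → 0. A brute-force sketch (our own example) confirming the classical fact Tor_1(Z/m, Z/n) ≅ Z/gcd(m, n):

```python
# Tor over Z from the free resolution 0 → Z --×m--> Z → Z/m → 0.
# Tensoring with Z/n gives the complex  Z/n --×m--> Z/n, so
# Tor_0(Z/m, Z/n) = coker(×m) and Tor_1(Z/m, Z/n) = ker(×m),
# both of order gcd(m, n).
from math import gcd

def tor_orders(m, n):
    mult = lambda x: (m * x) % n               # the map ×m on Z/n
    ker = [x for x in range(n) if mult(x) == 0]
    image = {mult(x) for x in range(n)}
    return len(ker), n // len(image)           # (|Tor_1|, |Tor_0|)

for m, n in [(4, 6), (2, 3), (12, 18)]:
    assert tor_orders(m, n) == (gcd(m, n), gcd(m, n))
```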
We now generalize to arbitrary abelian categories.
Definition 3.1.52.
An abelian category is said to have enough projectives if every object has a projective resolution (respectively, enough injectives and enough flats).

Let A be an abelian category with enough projectives and F : A → B a right exact functor. Then for any projective resolution of an object M, we can repeat the operation above to define the derived functors of F. To be more specific, let P• be a projective resolution of M.

Definition 3.1.53.
The functors

L_iF(M) = ker(FP_i → FP_{i−1}) / Im(FP_{i+1} → FP_i)

are called the left derived functors of F. Dually, if G is left exact and I• is an injective resolution, we define R^iG as the right derived functors of G.

One may ask why we do not consider the left derived functors of a left exact functor. The answer is that these are all zero, or at least uninteresting: they tell you nothing about exactness, as only zeros appear in the resulting sequences.

Proposition 3.1.54.
If F is exact then R^iF and L_iF are 0 for all i > 0.

Proof. As F is exact, the resulting long sequences are exact. Hence, the quotient groups are 0 and R^iF (resp. L_iF) is 0.

Remark 3.1.55.
The derived functors measure the extent to which M is not projective, injective, or flat. More generally, they measure how far F is from being exact. If R^iF is non-zero only for very large i, then F is very close to being exact, whereas if R^1F is non-zero, then F is nowhere close to being exact.

The final theorem we present in this section is the most useful for computing these functors.

Theorem 3.1.56.
Let 0 → A → B → C → 0 be exact in A and F : A → B be a right exact functor. Then there is a long exact sequence

... → L_iF(A) → L_iF(B) → L_iF(C) → L_{i−1}F(A) → ...

in the derived functors. The same is true for left exact functors.

The proof of this is immediate from the Snake Lemma 3.1.39. The reason it is so important is because if we know that either A, B, or C is F-acyclic (that is, L_iF vanishes on it for i > 0) then we get isomorphisms of the remaining groups! This single fact underlies most of homological algebra and will be integral in section 3.3.2. This completes this brisk tour of category theory.

3.2 Topology
We shall depart from category theory for the time being and return to it in section 3.2.2. For the meantime, we shall introduce the second major topic of this chapter: topological spaces. The purpose of these objects is to formalize the somewhat colloquial notions of connectedness, compactness, and other concepts. The culmination of all of this will be to define and give some basic properties of singular homology groups for a topological space. This concept will prove contentious in chapter 4, as some researchers have recently proposed using homology to discover geometric properties of the perceptual space.
The story of topology starts with the definition of a topological space. Before we give this, though, we want to motivate the study of such objects by looking at the familiar case of R^n, and in particular R. In high-school algebra, we call sets of the form (a, b) open and [a, b] closed. Similarly, sets of the form B_r(p) = {x ∈ R^n : |x − p| < r} are open in R^n, and if we change < to ≤, we get closed sets. In fact, we can have arbitrary open sets in R^n, but all of them are built out of sets of the form above. We want to generalize all of this and formalize what we mean by open and closed. Some good references for this section are [Lee11], [Mun00], and [FF16]. The last of these is a fairly recent and thorough treatment of the material in Section 3.2.2.

Definition 3.2.1.
Let X be a set. A topology on X is a collection of subsets T ⊆ P(X), the power set, subject to the following conditions:

(a) ∅, X ∈ T.
(b) T is closed under arbitrary unions. That is, if {U_i}_{i ∈ I} is a collection of elements of T with |I| arbitrary, then ⋃_{i ∈ I} U_i ∈ T.
(c) T is closed under finite intersections. That is, if {U_i}_{i ∈ I} is a collection of elements of T with |I| < ∞, then ⋂_{i ∈ I} U_i ∈ T.

Elements of the topology are called open sets. A subset V ⊆ X is called closed if X − V ∈ T. A set equipped with a topology is called a topological space.

Notice that open and closed are not mutually exclusive: X is always both closed and open (sometimes abbreviated to clopen), and some sets, such as [0, 1) in R, are neither closed nor open. Further, simply because a set is not open does not imply closure.

Example 3.2.2.
For any set $X$, we can give it the discrete topology, in which every subset is declared open. Dually, we can define the trivial topology, in which only $\emptyset$ and $X$ are open.

For any subset $A \subseteq (X, \mathcal{T})$, we can topologize $A$ by taking the open sets to be $A \cap \mathcal{T} := \{A \cap U : U \in \mathcal{T}\}$. This is called the subspace topology.

The topology generated by the open balls in $\mathbb{R}^n$ above is called the standard topology on $\mathbb{R}^n$.

We want to formalize the final example above. That is, we want to answer the question: what does it mean to generate a topology? Similar to a basis for a vector space, we want to define an analogous object for a topology.
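Since the axioms of Definition 3.2.1 are finitary on a finite set (closure under pairwise unions and intersections suffices there), they can be checked by brute force. A minimal illustrative sketch in Python; the function names are our own, not from any library:

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of X, as frozensets (the discrete topology)."""
    return [frozenset(s) for s in
            chain.from_iterable(combinations(sorted(X), r) for r in range(len(X) + 1))]

def is_topology(X, T):
    """Check Definition 3.2.1 on a finite space: contains the empty set and X,
    and is closed under (pairwise, hence all finite) unions and intersections."""
    T = {frozenset(U) for U in T}
    if frozenset() not in T or frozenset(X) not in T:
        return False
    return all(U | V in T and U & V in T for U in T for V in T)

X = {1, 2, 3}
print(is_topology(X, powerset(X)))           # discrete topology: True
print(is_topology(X, [set(), X]))            # trivial topology: True
print(is_topology(X, [set(), X, {1}, {2}]))  # missing {1} ∪ {2}: False
```

On a finite set, closure under pairwise unions already implies closure under arbitrary unions, so the check above is complete; for infinite spaces no such brute-force test exists.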
Definition 3.2.3.
Let $X$ be a topological space with topology $\mathcal{T}$. Then a collection $\mathcal{B}$ of subsets of $X$ is called a basis for the topology $\mathcal{T}$ if the following conditions are satisfied:

(a) Every $B \in \mathcal{B}$ is open in $X$.

(b) Every open set $U \in \mathcal{T}$ can be written as a union of some collection of elements of $\mathcal{B}$.

It should now be clear that the standard topology on $\mathbb{R}^n$ is the topology with basis consisting of the open balls. Now that we have this definition, we want to understand when it is applicable. Further, what conditions on a collection of subsets of a set make it a basis for some topology? The following proposition answers this in full.

Proposition 3.2.4.
Let $X$ be a set and $\mathcal{B}$ a collection of subsets. Then $\mathcal{B}$ is a basis of a topology on $X$ if and only if the following conditions are satisfied:

(a) $\bigcup_{B \in \mathcal{B}} B = X$.

(b) For every $B_1, B_2 \in \mathcal{B}$ with $B_1 \cap B_2 \neq \emptyset$ and every $x \in B_1 \cap B_2$, there exists $B_3 \in \mathcal{B}$ such that $x \in B_3 \subseteq B_1 \cap B_2$.

In fact, this topology is the unique topology generated by $\mathcal{B}$.

Proof.
Suppose $\mathcal{B}$ is a basis. Then (a) is satisfied immediately, as every open set is a union of basis elements and $X$ is open in any topology. For (b), as $B_1$ and $B_2$ are open, $B_1 \cap B_2$ is open. Therefore we can write $B_1 \cap B_2 = \bigcup_i B_i$, where the $B_i \in \mathcal{B}$ are basis elements. Given $x \in B_1 \cap B_2$, pick any $B_i$ containing $x$ to satisfy (b).

For the reverse direction, we need to show that the conditions above imply that $\mathcal{T}_\mathcal{B}$, the collection of all unions of elements of $\mathcal{B}$, is indeed a topology on $X$. By (a), $X, \emptyset \in \mathcal{T}_\mathcal{B}$. Let $\{U_i\}$ be an arbitrary collection of open sets. Then each $U_i = \bigcup_{j \in J_i} B_{ij}$ with each $B_{ij} \in \mathcal{B}$, so
$$\bigcup_i U_i = \bigcup_{I} \bigcup_{J_i} B_{ij}.$$
So $\mathcal{T}_\mathcal{B}$ is closed under arbitrary unions. To show it is closed under finite intersections, let $U_1, U_2 \in \mathcal{T}_\mathcal{B}$. For every $x \in U_1 \cap U_2$, there exist $B_1 \subseteq U_1$ and $B_2 \subseteq U_2$ with $x \in B_1 \cap B_2$. By condition (b), there exists some $B_3$ such that $x \in B_3 \subseteq B_1 \cap B_2 \subseteq U_1 \cap U_2$. Then $U_1 \cap U_2$ is the union of these basis elements as $x$ varies and hence is open. Therefore $\mathcal{T}_\mathcal{B}$ is closed under pairwise intersections and, by induction, all finite intersections. Hence $\mathcal{T}_\mathcal{B}$ is a topology on $X$. Uniqueness follows immediately from the definition of a basis. This completes the proof.

This proposition says that it suffices to define a topology by giving a basis. In Section 3.3, we will use this to topologize manifolds in a unique way so that they are sufficiently nice.

We need to step back a bit and think about how we topologize $\mathbb{R}^n$. We have given a basis for some topology on $\mathbb{R}^n$ above. What if we want to build a topology on $\mathbb{R}^n$ out of the topologies on $\mathbb{R}$? To answer this, we generalize to the notion of the product topology.

Definition 3.2.5.
Let $\{X_\alpha\}_{\alpha \in J}$ be a $J$-indexed family of topological spaces. As a basis for a topology on the product space $\prod_J X_\alpha$, we take the sets of the form $\prod U_\beta$, where each $U_\beta$ is open in $X_\beta$ and $U_\beta = X_\beta$ for all but finitely many $\beta \in J$. This topology is called the product topology.

There is a naive topology on the product which removes the final condition that $U_\beta = X_\beta$ for all but finitely many $\beta$. This is called the box topology. In the case of $J$ finite, the two are equivalent. The box topology is generally less useful than the product topology, as it is too fine; that is, too many sets are open. For this reason, whenever we have a product space, we assume it has the product topology.

In $\mathbb{R}^n$, it is relatively easy to distinguish whether or not a point lies within a given set. For a general topological space, this is daunting, as the topology may be particularly bad. We need to generalize to arbitrary spaces so that we can speak of boundaries of sets. To be more formal, let $X \subseteq Y$ be topological spaces. We say that $x \in \mathrm{Int}(X)$, the interior of $X$, if there exists an open set $U \subseteq X$ (open in $Y$) such that $x \in U$. The boundary of $X$, denoted $\partial X$, is the collection of points $y$ such that for every open set $P$ containing $y$, both $P \cap X$ and $P \cap (Y - X)$ are non-empty. We define the closure of $X$ to be
$$\overline{X} = \mathrm{Int}(X) \cup \partial X.$$
It should be noted that this only makes sense for topological subspaces. More generally it makes sense in the context of embeddings (see Example 3.2.8 below).
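On a finite topological space, the interior, boundary (in the usual sense: every open neighborhood meets both the set and its complement), and closure can be computed directly from the list of open sets. A hedged sketch; the names are ours, and the ambient space plays the role of $Y$:

```python
def interior(T, A):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    pts = set()
    for U in T:
        if frozenset(U) <= A:
            pts |= set(U)
    return frozenset(pts)

def boundary(Y, T, A):
    """Points y such that every open set containing y meets both A and Y - A."""
    Y, A = frozenset(Y), frozenset(A)
    return frozenset(y for y in Y
                     if all(U & A and U & (Y - A)
                            for U in map(frozenset, T) if y in U))

def closure(Y, T, A):
    return interior(T, A) | boundary(Y, T, A)

Y = {1, 2, 3}
T = [set(), Y, {1}, {1, 2}]      # a (non-Hausdorff) topology on Y
print(interior(T, {1}))          # frozenset({1})
print(boundary(Y, T, {1}))       # frozenset({2, 3})
print(closure(Y, T, {1}))        # frozenset({1, 2, 3})
```

Here the closed sets are $\emptyset$, $\{3\}$, $\{2,3\}$, and $Y$, so the smallest closed set containing $1$ is all of $Y$; the computed closure agrees with this.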
Proposition 3.2.6.
Let X be a topological space and A a subspace. Then
$\mathrm{Int}(A)$ is open, $\partial A$ is closed, and $\overline{A}$ is closed.

Proof. For each point $x \in \mathrm{Int}(A)$, let $U_x \subseteq \mathrm{Int}(A)$ be an open set containing $x$ (such a set exists by the definition of the interior). Then $\mathrm{Int}(A)$ is the union of these $U_x$ and is thus open. Now consider $X - \partial A$; we wish to show that this is open. From the definitions,
$$X - \partial A = \mathrm{Int}(A) \cup (X - \overline{A}).$$
Therefore, it suffices to show that $X - \overline{A}$ is open. Let $p \in X - \overline{A}$. As $p \notin \overline{A}$, there exists some open $V \subseteq X$ such that $V \cap A = \emptyset$ and $p \in V$. As $X - \overline{A}$ is a union of these open sets, it is open. Finally, $\overline{A} = \mathrm{Int}(A) \cup \partial A$ is closed since its complement $X - \overline{A}$ is open. This completes the proof.

It should be clear now that a set $A$ is open (resp. closed) if and only if $A = \mathrm{Int}(A)$ (resp. $A = \overline{A}$).

Now that we have the notions of topologies and bases, we can give a general definition of continuity.

Definition 3.2.7.
Let $f : X \to Y$ be a function between topological spaces. We call $f$ continuous if for all open $V \subseteq Y$, $f^{-1}(V)$ is open in $X$. We call $f$ an open map if for all open $U \subseteq X$, $f(U)$ is open in $Y$.

Together with continuous maps, topological spaces form a category denoted $\mathbf{Top}$. If we add the additional stipulation that every space be given a distinguished point, then we can define the category $\mathbf{Top}_*$ of pointed topological spaces and base-point-preserving maps.

The following examples of continuous maps are all fun exercises for the reader. They are incredibly important for later parts of this chapter.

Example 3.2.8.

(a) Let $(X, x_0)$ and $(Y, y_0)$ be pointed topological spaces. Then the constant map $x \mapsto y_0$ is continuous.

(b) Let $f : X \to Y$ be a continuous map. Then for any subspace $A \subseteq X$, the restriction map $f|_A : A \to Y$ is also continuous.

(c) Let $f : X \to Y$ be a continuous map, and denote the image by $f(X)$. Then for any subspace $Z \subseteq Y$ with $f(X) \subseteq Z$, the induced map $f_Z : X \to Z$ is continuous.

(d) The composition of continuous maps is continuous.

(e) Any inclusion map is continuous. That is, if $A \subseteq X$, then there is a map $A \hookrightarrow X$, and this map is continuous. In general, an injective continuous map is called a topological embedding if it is a homeomorphism onto its image.

Notice that the definition of continuity pays no mind to closed subsets. Could we possibly get a different definition if we replace open with closed? The following lemma gives a negative answer.

Lemma 3.2.9.
A function $f : X \to Y$ is continuous if and only if for all closed subsets $V \subseteq Y$, $f^{-1}(V)$ is closed in $X$.

Proof. ($\Leftarrow$) Let $B$ be a closed set in $Y$ and $C$ its complement, which is open by definition. We want to show that $f^{-1}(C)$ is open in $X$. Consider
$$f^{-1}(B) = f^{-1}(Y) - f^{-1}(C) = X - f^{-1}(C).$$
As $f^{-1}(B)$ is closed in $X$, we conclude that $f^{-1}(C)$ is open.

($\Rightarrow$) Let $B$ be closed in $Y$. We need to show that $f^{-1}(B)$ is closed, i.e., that $\overline{f^{-1}(B)} = f^{-1}(B)$. Let $x \in \overline{f^{-1}(B)}$. Then
$$f(x) \in f\left(\overline{f^{-1}(B)}\right) \subseteq \overline{B} = B,$$
where the inclusion follows from continuity. Therefore $x \in f^{-1}(B)$, and $\overline{f^{-1}(B)} \subseteq f^{-1}(B)$. Hence $f^{-1}(B)$ is closed.

Therefore, defining continuity in terms of closed sets is equivalent to defining it in terms of open sets.

We now want to define quotient objects in $\mathbf{Top}$. Let $A$ be a subspace of a topological space $X$. Define an equivalence relation on $X$ by $x \sim y$ if $x, y \in A$ (and $x \sim x$ for all $x$). Then we have the quotient space $X/{\sim}$, which is also written $X/A$. We want to topologize $X/A$ in a way which makes the canonical map $X \to X/A$ continuous.

Definition 3.2.10.
The quotient topology is defined as the finest topology for which the canonical morphism $\pi : X \to X/A$ is continuous. Equivalently, $P \subseteq X/A$ is open if and only if $\pi^{-1}(P)$ is open in $X$.

Remark 3.2.11. This will allow us to give a topological structure to the Generalized Categories from Chapter 1 and give a coarse categorization of the perceptual space.

The quotient topology can be particularly opaque, as it depends entirely on $X$ and $A$. To give some idea of how it can manifest, let's give some examples of quotient spaces.

Example 3.2.12.

(a) Let $S^1 := \{x \in \mathbb{C} : |x| = 1\}$ and consider the subspace $\{-1, 1\}$. It turns out that $S^1/\{-1, 1\}$ is equivalent to two circles which touch at a single point. The topology of this space is inherited from its embedding into $\mathbb{C}$, so the quotient topology in this case is easy to see.

(b) Consider $\mathbb{Z} \hookrightarrow \mathbb{R}$. Then $\mathbb{R}/\mathbb{Z}$ (here the quotient by the equivalence relation $x \sim y$ iff $x - y \in \mathbb{Z}$, not the collapse of the subspace $\mathbb{Z}$ to a point) is equivalent to the interval $[0,1]$ with the identification $0 \sim 1$. Hence, the quotient space is $S^1$.

What do we mean here by "equivalent"? We claimed above that $\mathbf{Top}$ is a category, and thus equivalent should mean isomorphic. What are the isomorphisms in this category?
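Before answering, note that the quotient topology of Definition 3.2.10 can be enumerated by brute force on a small finite example: collapsing a subspace $A$ to a point $*$, a subset of $X/A$ is open exactly when its preimage is. An illustrative sketch (our own encoding):

```python
from itertools import chain, combinations

def quotient_topology(X, T, A):
    """Open sets of X/A: those P whose preimage under the collapse map is open in X."""
    A = frozenset(A)
    opens = {frozenset(U) for U in T}
    pi = lambda x: '*' if x in A else x          # canonical projection X -> X/A
    Q = {pi(x) for x in X}
    candidates = chain.from_iterable(combinations(sorted(Q, key=str), r)
                                     for r in range(len(Q) + 1))
    return {frozenset(P) for P in candidates
            if frozenset(x for x in X if pi(x) in P) in opens}

X = {1, 2, 3}
T = [set(), X, {1}, {2}, {1, 2}]
print(quotient_topology(X, T, {1, 2}))
# three opens: {}, {'*'}, {'*', 3} — the Sierpiński space
```

Here $\{3\}$ fails to be open in the quotient because its preimage $\{3\}$ is not open in $X$, illustrating how the quotient topology can be strictly coarser than one might first guess.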
Definition 3.2.13.
Let $f : X \to Y$ be a continuous map. We call $f$ a homeomorphism if there exists a continuous $g : Y \to X$ such that $g \circ f = \mathrm{id}_X$ and $f \circ g = \mathrm{id}_Y$. Notice that every homeomorphism is necessarily a bijection.

"Spaces up to homeomorphism" is an equivalence relation; that is, we can speak of isomorphism classes of topological spaces. This is a large area of research for, say, curves and surfaces. Before we move on to other general topological properties, we shall give some generic properties of homeomorphisms.

Theorem 3.2.14. Let $f : X \to Y$ be a bijective function between topological spaces. Then $f$ is a homeomorphism if and only if $f(\mathcal{T}_X) = \mathcal{T}_Y$. Further, if $f$ is a homeomorphism, then $f$ is an open map.

Proof. Notice that the second statement follows immediately from the first.

($\Rightarrow$) Let $U \in \mathcal{T}_X$. Then $f(U) = (f^{-1})^{-1}(U)$. As $f$ is a homeomorphism, $f^{-1}$ is continuous, and so $f(U)$ is open in $Y$; thus $f(U) \in \mathcal{T}_Y$. Therefore we have an injection $f(\mathcal{T}_X) \hookrightarrow \mathcal{T}_Y$. This map is surjective as $f$ is continuous: for $V \in \mathcal{T}_Y$, $V = f(f^{-1}(V))$ with $f^{-1}(V) \in \mathcal{T}_X$. Thus $f(\mathcal{T}_X) = \mathcal{T}_Y$.

($\Leftarrow$) Assume now that $f(\mathcal{T}_X) = \mathcal{T}_Y$. Then $f$ is continuous: for any $V \in \mathcal{T}_Y$ we have $V = f(U)$ for some $U \in \mathcal{T}_X$, and by bijectivity $f^{-1}(V) = U \in \mathcal{T}_X$. Similarly, $f^{-1}$ is continuous. This completes the proof.

Example 3.2.15.
Some classic examples of homeomorphisms are translations and dilations of $\mathbb{R}^n$. These are maps of the form $f(x) = x + \lambda$ and $f(x) = cx$ for some $\lambda \in \mathbb{R}^n$ and nonzero $c \in \mathbb{R}$. More importantly, let $V, W$ be finite-dimensional vector spaces. Then any linear map $V \to W$ is necessarily continuous. In fact, as we will see in the next section, these maps are smooth!

Example 3.2.16.
We end this subsection with an interesting example of topological spaces.Let G be a group. Then we call G a topological group if multiplication and inversion arecontinuous maps. A morphism of topological groups is a continuous group homomor-phism. Connectedness, Compactness, and Hausdorff
Now we give some characterizations of certain topological spaces. These properties are important for many mathematical applications and will be intrinsically important for the next section and Chapter 4. We shall define them all in one pass and then go into some detail about their relationships to each other.
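For finite spaces, the properties defined below admit brute-force tests; every finite space is automatically compact (there are only finitely many open sets), so we sketch only connectedness and the Hausdorff condition. A minimal, purely illustrative sketch:

```python
from itertools import combinations

def is_connected(X, T):
    """No two disjoint non-empty open sets cover X."""
    opens = [frozenset(U) for U in T if U]
    return not any(not (U & V) and (U | V) == frozenset(X)
                   for U, V in combinations(opens, 2))

def is_hausdorff(X, T):
    """Every pair of distinct points is separated by disjoint open sets."""
    opens = [frozenset(U) for U in T]
    return all(any(x in U and y in V and not (U & V)
                   for U in opens for V in opens)
               for x, y in combinations(sorted(X), 2))

X = {1, 2, 3}
trivial = [set(), X]
discrete = [set(c) for r in range(4) for c in combinations(sorted(X), r)]

print(is_connected(X, trivial), is_hausdorff(X, trivial))    # True False
print(is_connected(X, discrete), is_hausdorff(X, discrete))  # False True
```

The output matches the remarks below: the trivial topology is connected but not Hausdorff, while the discrete topology on a set with more than one point is Hausdorff but disconnected.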
Definition 3.2.17.
Let $X$ be a topological space.

(a) $X$ is connected if there do not exist non-empty open sets $U_1, U_2$ such that $U_1 \cap U_2 = \emptyset$ and $U_1 \cup U_2 = X$.

(b) $X$ is compact if every open cover $\mathcal{U}$ of $X$ has a finite subcover. An open cover of a topological space is a collection of open sets $\mathcal{U} = \{U_i\}$ such that $X \subseteq \bigcup U_i$.

(c) $X$ is Hausdorff if for any two distinct points $x, y \in X$, there exist open sets $U_x, U_y \subseteq X$ such that $x \in U_x$, $y \in U_y$, and $U_x \cap U_y = \emptyset$. (Spaces in which, for any two distinct points, some open set contains one but not the other are sometimes called Kolmogorov spaces.)

For instance, every space is connected (resp. compact) if equipped with the trivial topology, and every space with at least two points is disconnected (and, if infinite, non-compact) if equipped with the discrete topology. In general, a space need not be connected, but it can be broken up into connected components. This partitions the set into distinct subsets, which can be of great use. There is another notion of connectedness which is slightly stronger.

Definition 3.2.18.
A topological space $X$ is path-connected if for each pair of points $a, b \in X$ there exists a continuous path $\gamma : [0,1] \to X$ such that $\gamma(0) = a$ and $\gamma(1) = b$.

Proposition 3.2.19. If $X$ is path-connected, then $X$ is connected.

Proof. Assume for the sake of contradiction that $X$ is disconnected, say $X = U \cup V$ with $U, V$ non-empty, open, and $U \cap V = \emptyset$. Let $a \in U$, $b \in V$, and let $\gamma$ be a path between them. Then $[0,1] = \gamma^{-1}(X) = \gamma^{-1}(U) \cup \gamma^{-1}(V)$, a disjoint union of non-empty open sets. This implies that $[0,1]$ is disconnected, which is a contradiction. Hence $X$ is connected.

This proposition proves our earlier assertion that path-connectedness is a stronger condition than connectedness. In fact, there are some highly non-trivial examples where the converse fails.

Example 3.2.20.
Let $X$ be the union of the line segments in $\mathbb{R}^2$ connecting the origin to the points $(1, \frac{1}{n})$ for $n \geq 1$, together with the point $(1, 0)$ (note this does not include the open segment from $(0,0)$ to $(1,0)$). Then $X$ is connected but not path-connected. See Figure 3.1.

Similar to connected components, we can define path-connected components. For a topological space $X$, we denote the set of path-connected components by $\pi_0(X)$.

We want to understand how each of these notions interacts with (1) the others and (2) continuous maps. Let us investigate (2) first.

Theorem 3.2.21.
Let $f : X \to Y$ be a continuous function. If $X$ is connected (resp. compact), then so is $f(X)$.

Proof. Let $X$ be connected, and assume for the sake of contradiction that $f(X)$ is disconnected, say $f(X) = A \cup B$ with $A, B$ disjoint, non-empty, and open in $f(X)$. Then $f^{-1}(A)$ and $f^{-1}(B)$ are open in $X$, disjoint, non-empty, and $f^{-1}(A) \cup f^{-1}(B) = X$. This contradicts the connectedness of $X$. Hence $f(X)$ is connected.

Now assume $X$ is compact, and let $\mathcal{V}$ be an open cover of $f(X)$. Then
$$X = \bigcup_{V_i \in \mathcal{V}} f^{-1}(V_i)$$
is an open cover of $X$. As $X$ is compact, there exist finitely many $V_i$ such that $X = \bigcup_1^n f^{-1}(V_i)$. Therefore $V_1, \ldots, V_n$ is a finite subcover of $\mathcal{V}$. Hence $f(X)$ is compact. This completes the proof.

Figure 3.1: The Witches' Broom, an example of a connected but not path-connected topological space. It is the union of the line segments $[(0,0), (1, \frac{1}{n})]$ together with the point $(1,0)$.

This theorem is highly important to any field of mathematics that concerns itself with topologies of any kind. As it turns out, many theorems only work for compact spaces, so knowing that compactness is preserved under continuous maps is crucial. Let's understand compact sets a bit better.

Proposition 3.2.22.
Let $X$ be a compact space.

(a) If $A \subseteq X$ is closed, then $A$ is compact.

(b) If $X \subseteq Y$ with $Y$ a Hausdorff space, then $X$ is closed in $Y$.

Proof. (a) Let $\mathcal{A}$ be an open cover of $A$. As $A$ is closed, $A^c = X - A$ is open in $X$, and $\mathcal{A} \cup \{A^c\}$ is an open cover of $X$. As $X$ is compact, there exists a finite subcover. If this subcover contains $A^c$, discard it; what remains is a finite cover of $A$. This proves (a).

(b) Let $y \in X^c$. We want to construct an open set $V$ containing $y$ such that $V \cap X = \emptyset$. As $Y$ is Hausdorff, for every $x \in X$ there exist disjoint open sets $U_x$ and $\widetilde{U}_x$ such that $x \in U_x$ and $y \in \widetilde{U}_x$. Then $\bigcup_{x \in X} U_x$ is an open cover of $X$. By compactness, there is a finite collection of points $\{x_i\}$ such that $X \subseteq \bigcup_i U_{x_i}$. Put
$$V = \bigcap_i \widetilde{U}_{x_i}.$$
Then $V$ is open, contains $y$, and $V \cap X = \emptyset$ by construction. Hence $X^c$ is open and thus $X$ is closed.

In a similar theme to topologies, we would like to know how connectedness, compactness, and Hausdorff-ness interact with products.

Proposition 3.2.23.
Let $\{X_i\}$ be a family of connected (resp. Hausdorff) spaces. Then $\prod X_i$ is connected (resp. Hausdorff).

We leave the proof of this proposition as an exercise to the reader, as it follows almost entirely from the definitions. For compactness, there are two results, and both are surprising.
Theorem 3.2.24 (Heine-Borel). A subset of $\mathbb{R}^n$ is compact if and only if it is closed and bounded.

Theorem 3.2.25 (Tychonoff). Let $\{X_i\}$ be an arbitrary collection of compact spaces. Then $\prod X_i$ is compact.

Although we shall not prove it, it is interesting to know that Tychonoff's theorem is equivalent to the axiom of choice; its standard proofs rely on Zorn's Lemma. It is arguably the most important theorem in all of point-set topology. For proofs of both theorems, see [Mun00].
Metric Spaces
We now give a brief introduction to metric spaces, which will allow us to formally discuss "perceptual metrics" in Chapter 4.
Definition 3.2.26.
Let $X$ be a set. A metric on $X$ is a function $d : X \times X \to \mathbb{R}_{\geq 0} \cup \{\infty\}$ such that:

(a) For all $x, y \in X$, $d(x, y) = d(y, x)$.

(b) For all $x, y \in X$, $d(x, y) = 0 \iff x = y$.

(c) For all $x, y, z \in X$, $d(x, z) \leq d(x, y) + d(y, z)$.

The final condition is called the triangle inequality and is the defining characteristic of metrics. The pair $(X, d)$ is called a metric space. A function $f : (X, d) \to (Y, g)$ between metric spaces is called a metric map if
$$g(f(x), f(y)) \leq d(x, y).$$
If equality holds for all $x, y$, then $f$ is called an isometry. The collection of all metric spaces and all metric maps forms a category denoted $\mathbf{Met}$.

Theorem 3.2.27.
Let ( X , d ) be a metric space. Then d induces a topology on X (called the metrictopology ). This gives a faithful functor Met (cid:44) → Top
The image is the category of metrizable spaces (those which are homeomorphic to metric spaces).

Proof. Let $x \in X$ and put
$$B_r(x) := \{y \in X : d(x, y) < r\}.$$
Let $\mathcal{B}$ be the collection of all such balls over all points $x \in X$ and radii $r > 0$. We claim that $\mathcal{B}$ is a basis; it suffices to check the conditions of Proposition 3.2.4. Clearly $X = \bigcup_{B \in \mathcal{B}} B$. Let $B_r(x)$ and $B_{r'}(x')$ be two elements of $\mathcal{B}$ such that $B_r(x) \cap B_{r'}(x') \neq \emptyset$. By the triangle inequality, for any $y \in B_r(x) \cap B_{r'}(x')$ we can find $\delta_1 < r$ and $\delta_2 < r'$ such that $B_{\delta_1}(y) \subseteq B_r(x)$ and $B_{\delta_2}(y) \subseteq B_{r'}(x')$. Pick $\delta = \min\{\delta_1, \delta_2\}$. Then $B_\delta(y)$ is contained in the intersection. Hence $\mathcal{B}$ is a basis for a topology on $X$.

The functor $\mathbf{Met} \to \mathbf{Top}$ is precisely the forgetful functor which sends $(X, d, \mathcal{T})$ to $(X, \mathcal{T})$.

Now consider a sequence of points $\{x_n\}$ in a metric space. We say that $\{x_n\}$ converges to a point $x$ if $d(x, x_n) \to 0$ as $n \to \infty$. A Cauchy sequence is a sequence $\{x_n\}$ such that for every $\epsilon > 0$ there exists $n^*$ such that for all $m, n > n^*$, $d(x_m, x_n) < \epsilon$.

Definition 3.2.28.
We call a metric space complete if every Cauchy sequence converges.
Theorem 3.2.29.
Let $(X, d)$ be a metric space. Then there exists a complete metric space $(\widehat{X}, \widehat{d})$ and a map $X \to \widehat{X}$ which is an isometry onto a dense subset of $\widehat{X}$. See [Kna05c] for a full proof of this statement.

With this theorem in mind, we want to give definitions and examples of some complete metric spaces and how they arise.
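The completion theorem can be illustrated concretely: $\mathbb{Q}$ with the usual metric is not complete, and its completion is $\mathbb{R}$. A sketch using Newton's iteration to produce a Cauchy sequence of rationals whose limit, $\sqrt{2}$, lies outside $\mathbb{Q}$:

```python
from fractions import Fraction

# Newton iteration for x^2 = 2 stays inside Q at every step ...
x, seq = Fraction(2), []
for _ in range(6):
    seq.append(x)
    x = (x + 2 / x) / 2

# ... and is Cauchy: successive distances d(x_n, x_{n+1}) shrink rapidly ...
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]
print(all(gaps[i + 1] < gaps[i] for i in range(len(gaps) - 1)))  # True

# ... yet its limit sqrt(2) is irrational, so the sequence has no limit
# in (Q, |.|); the completion R supplies one.
print(abs(float(seq[-1]) ** 2 - 2) < 1e-12)  # True
```

Every term here is an exact rational (a `Fraction`), so the sequence genuinely lives in $\mathbb{Q}$; the float conversion at the end is only for display.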
Definition/Example 3.2.30.
Let $V$ be a $\mathbb{C}$-vector space. A norm on $V$ is a map $\|\cdot\| : V \to \mathbb{R}_{\geq 0}$ such that:

(a) $\|x\| \geq 0$, with $\|x\| = 0$ if and only if $x = 0$.

(b) $\|ax\| = |a| \cdot \|x\|$ for all $a \in \mathbb{C}$.

(c) $\|x + y\| \leq \|x\| + \|y\|$ for all $x, y \in V$.

Clearly, a norm induces a metric $d(x, y) = \|x - y\|$ on $V$. We call $V$ a Banach space if $(V, d)$ is complete.

Similarly, we can define a hermitian inner product on $V$ as a positive-definite, sesquilinear (one-and-a-half linear) map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ such that $\langle x, y \rangle = \overline{\langle y, x \rangle}$, where the bar denotes complex conjugation. Setting $\|x\| = \sqrt{\langle x, x \rangle}$ defines a norm, and hence a metric, on $V$. If $(V, d)$ is complete with respect to this metric, then $V$ is a Hilbert space. These are some of the most important spaces for harmonic analysis and representation theory. We shall use Banach spaces and tensor products to understand manifolds better in Section 3.3. (Given a topological space $X$ and a subspace $A$, we call $A$ dense in $X$ if $\overline{A} = X$.)

3.2.2 Basic Algebraic Topology

In this section, we shall introduce a different approach to topology which considers a weaker form of equivalence but focuses on algebraic invariants attached to topological spaces. The main references for this subsection are [Hat01], [Rot88], and [FF16]. We start with the notion of homotopy.

Definition 3.2.31.
Definition 3.2.31.
Let $f, g : X \to Y$ be continuous maps. A homotopy between $f$ and $g$ is a continuous function $H : [0,1] \times X \to Y$ such that $H(0, x) = f(x)$ and $H(1, x) = g(x)$. If such a homotopy exists, we say that $f$ and $g$ are homotopic and write $f \simeq g$. Two spaces are said to be homotopy equivalent if there exist maps $f : X \to Y$ and $g : Y \to X$ such that $g \circ f \simeq \mathrm{id}_X$ and $f \circ g \simeq \mathrm{id}_Y$.

Notice that considering "spaces up to homotopy equivalence" is a weaker condition than "spaces up to homeomorphism": spaces which are homeomorphic are necessarily homotopy equivalent. If we consider pointed topological spaces, then there is a category $\mathbf{Htpy}$ whose morphisms are homotopy classes of maps. In this category, we consider the morphisms with source $S^1$ and a fixed target $X$; more generally, we consider morphisms with source $S^n$.

Definition 3.2.32.
The groups
$$\pi_n(X) := \mathrm{Hom}_{\mathbf{Htpy}}(S^n, X)$$
are called the $n$th homotopy groups. The group law is defined by concatenation in each coordinate. A topological space is called simply connected if $\pi_1(X) = 0$. Further, if $\pi_n(X) = 0$ for all $n \geq 1$, then $X$ is called weakly contractible; for CW complexes this is equivalent (by Whitehead's theorem) to $X$ being contractible, i.e., homotopy equivalent to a point.

For $n \geq 2$, these are abelian groups. They are algebraic invariants of the space $X$: if $X \simeq Y$, then $\pi_n(X) \cong \pi_n(Y)$ for all $n$ [Hat01]. The problem with homotopy groups is that they are very often not computable, and even when they are, the computation is incredibly difficult. For this reason, we want to consider a more tractable algebraic invariant: homology (and cohomology). These, in some sense, count the number of holes of each dimension in a space.

Example 3.2.33. Let $T = S^1 \times S^1$ be the torus.

It is clear that the torus has two loops which cannot be continuously deformed into one another: one goes around the large central hole and the other around the thickness of the torus. Are there any 2-dimensional holes? Before we give the answer, recall that topological tori are hollow, so there is some inner volume enclosed by the torus which stops certain loops from being contractible.

The answer to the above question is yes, and there is only one; there are no higher-dimensional holes. We shall see that a formal way to answer these questions is by computing the homology groups of $T$, which, given the statements above, should be
$$H_n(T, \mathbb{Z}) = \begin{cases} \mathbb{Z} & n = 0, 2 \\ \mathbb{Z}^2 & n = 1 \\ 0 & n \geq 3. \end{cases}$$

Simplicial Complexes
In a way, simplicial complexes are the most basic topological objects for which to define homology and cohomology. As such, we give a brief introduction to them here.
Definition 3.2.34. A $k$-simplex $\Delta^k$ is the convex hull of $k+1$ affinely independent points in $\mathbb{R}^n$. A simplicial complex is a union of copies of simplices glued along faces: the intersection of any two simplices in the complex is either empty or a common face of both.

To define homology one needs the language of chains.

Definition 3.2.35 (Chains). Let $K$ be a simplicial complex and denote by
$$C_n^\Delta(K) = \left\{ \sum_i m_i \Delta_i^n \;\middle|\; m_i \in \mathbb{Z} \right\}$$
the free abelian group generated by the $n$-simplices of $K$; this is called the group of simplicial $n$-chains. If $\Delta_i^n = [v_0, \ldots, v_n]$, then define the boundary map $\partial_n : C_n^\Delta(K) \to C_{n-1}^\Delta(K)$ on generators by
$$\partial_n(\Delta^n) = \sum_{i=0}^{n} (-1)^i [v_0, \ldots, \widehat{v_i}, \ldots, v_n],$$
where $\widehat{v_i}$ means $v_i$ is omitted, and extend linearly, making $\partial_n$ a group homomorphism. This yields, for any given $K$, the sequence
$$\cdots \xrightarrow{\partial_{n+1}} C_n^\Delta \xrightarrow{\partial_n} C_{n-1}^\Delta \xrightarrow{\partial_{n-1}} C_{n-2}^\Delta \xrightarrow{\partial_{n-2}} \cdots$$

Lemma 3.2.36. $\partial_n \circ \partial_{n+1} = 0$.

Remark 3.2.37.
We will drop the superscript $\Delta$ when it is clear that the chain complex is constructed from a simplicial complex.

Proof. We apply the definition twice to a generator of $C_{n+1}$:
$$\partial_n \partial_{n+1}([v_0, \ldots, v_{n+1}]) = \partial_n\!\left( \sum_{i=0}^{n+1} (-1)^i [v_0, \ldots, \widehat{v_i}, \ldots, v_{n+1}] \right)$$
$$= \sum_i \sum_{j < i} (-1)^{i+j} [v_0, \ldots, \widehat{v_j}, \ldots, \widehat{v_i}, \ldots, v_{n+1}] + \sum_i \sum_{j > i} (-1)^{i+j-1} [v_0, \ldots, \widehat{v_i}, \ldots, \widehat{v_j}, \ldots, v_{n+1}].$$
Each simplex with two omitted vertices appears exactly twice, once in each double sum, with opposite signs, so the whole expression vanishes. Hence $\mathrm{Im}\, \partial_{n+1} \subseteq \ker \partial_n$ for all $n$.

Definition 3.2.38.
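Lemma 3.2.36 can be checked numerically: encoding simplices as vertex tuples and chains as dictionaries of integer coefficients, the alternating-sign cancellation in the proof makes $\partial \circ \partial$ vanish identically. A minimal sketch (our own encoding, purely illustrative):

```python
def boundary(c):
    """Apply the simplicial boundary operator to a chain c:
    a dict mapping simplices (vertex tuples) to integer coefficients."""
    out = {}
    for simplex, m in c.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]       # omit vertex v_i
            out[face] = out.get(face, 0) + (-1) ** i * m
    return {s: k for s, k in out.items() if k != 0}    # drop cancelled terms

edge = {(0, 1): 1}
print(boundary(edge))                 # {(1,): 1, (0,): -1}, i.e. the boundary [v1] - [v0]

tetra = {(0, 1, 2, 3): 1}             # a single 3-simplex
print(boundary(boundary(tetra)))      # {} — every face cancels, so boundary of boundary = 0
```

Because faces are produced with alternating signs, each doubly-omitted face is generated twice with opposite signs and cancels, mirroring the double-sum argument in the proof above.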
For all $n$, put $Z_n^\Delta(K) = \ker \partial_n$, the group of cycles, and $B_n^\Delta(K) = \mathrm{Im}\, \partial_{n+1}$, the group of boundaries. Define the $n$-th homology group
$$H_n^\Delta(K) = Z_n^\Delta(K)/B_n^\Delta(K).$$
Let $f$ be a simplicial map (that is, a map determined by its values on vertices, with $f(\sum_i t_i v_i) = \sum_i t_i f(v_i)$ on each simplex) between two complexes $K, L$. Then $f$ induces a map on the chain complexes
$$f_\sharp : C_n^\Delta(K) \to C_n^\Delta(L), \qquad f_\sharp(\Delta^n) = f \circ \Delta^n,$$
and thus a map on the homology groups
$$f_* : H_n^\Delta(K) \to H_n^\Delta(L), \qquad f_*([z]) = [f \circ z].$$
This gives a commutative square: $f_\sharp$ intertwines the boundary maps of $K$ and $L$, i.e., $f_\sharp \circ \partial_n = \partial_n \circ f_\sharp$ as maps $C_n^\Delta(K) \to C_{n-1}^\Delta(L)$.

Lemma 3.2.39.
The boundary map is well defined on homology classes; that is,
$$\partial_n(z + \partial_{n+1} c) = \partial_n(z).$$

Proof. We use the homomorphism property to get
$$\partial_n(z + \partial_{n+1} c) = \partial_n(z) + \partial_n(\partial_{n+1} c) = \partial_n(z),$$
as desired.

Consider the torus from Example 3.2.33. To give the torus a simplicial structure, we want to realize it as a quotient space: $T \cong \mathbb{R}^2/\mathbb{Z}^2$. For this reason, we can view the torus as a square with opposite sides glued together; triangulating the square then yields a simplicial structure on $T$.

In general, a sequence of maps as in Definition 3.2.35 is called a chain complex, and we write
$$C_\bullet^\Delta = \cdots \xrightarrow{\partial_{n+1}} C_n^\Delta \xrightarrow{\partial_n} C_{n-1}^\Delta \xrightarrow{\partial_{n-1}} C_{n-2}^\Delta \xrightarrow{\partial_{n-2}} \cdots$$
Notice that for two simplicial complexes and a simplicial map $f : K \to L$, we get a map of complexes $f_\sharp : C_\bullet^\Delta(K) \to C_\bullet^\Delta(L)$.

Lemma 3.2.40.
Consider two chain complexes $C_\bullet, D_\bullet$ and a chain map $g = (g_n)_{n \geq 0}$. Then there exists a family of induced homomorphisms
$$g_{n,*} : H_n(C_\bullet) \to H_n(D_\bullet).$$

Proof. Let $\xi \in H_n(C_\bullet)$ and let $z \in Z_n(C_\bullet)$ be such that $\xi = [z]$. Consider $g_n(z)$. Then $\partial_n g_n(z) = g_{n-1}(\partial_n z) = g_{n-1}(0) = 0$, so $g_n(z) \in Z_n(D_\bullet)$; that is, $[g_n(z)] = \eta \in H_n(D_\bullet)$. Now let $z' \in C_n$ be such that $z \sim z'$, say $z' = z + \partial_{n+1}(c)$ for some $c \in C_{n+1}$. Then
$$g_n(z') = g_n(z) + g_n \partial_{n+1}(c) = g_n(z) + \partial_{n+1} g_{n+1}(c) = g_n(z) + b, \qquad b \in B_n(D_\bullet),$$
so $g_n(z') \sim g_n(z)$ and $g_{n,*}$ is well defined.

Singular Complexes

Definition 3.2.41.
Define the group of singular $n$-chains as the free abelian group
$$C_n(X) = \mathbb{Z}\big[\{\sigma_n : \Delta^n \to X \text{ continuous}\}\big]$$
on the set of singular $n$-simplices, i.e., all continuous maps (not necessarily embeddings) from the standard $n$-simplex into $X$. Each element of this group can be written as
$$c = m_1 \sigma_n^1 + \cdots + m_k \sigma_n^k, \qquad m_i \in \mathbb{Z}.$$
Further, we have a boundary operator
$$\partial_n : C_n(X) \to C_{n-1}(X), \qquad \partial_n(\sigma_n) = \sum_{i=0}^{n} (-1)^i\, \sigma_n|_{[v_0, \ldots, \widehat{v_i}, \ldots, v_n]},$$
satisfying the same relations as the simplicial boundary maps.

Furthermore, for every pair of maps $f : X \to Y$, $g : Y \to W$, the induced maps on chain complexes satisfy
$$g_{n,\sharp} \circ f_{n,\sharp} = (g \circ f)_\sharp.$$
This makes $H_n : \mathbf{Top} \to \mathbf{Ab}$, $X \mapsto H_n(C_\bullet(X))$, a functor.

Definition 3.2.42 (Chain Homotopy). If $C_\bullet$ and $D_\bullet$ are chain complexes and $f, g$ chain maps, then a chain homotopy $E = (E_n)_{n \geq 0}$ is a collection of homomorphisms $E_n : C_n \to D_{n+1}$ such that
$$\partial_{n+1} E_n + E_{n-1} \partial_n = g_n - f_n.$$

Lemma 3.2.43.
If $f$ and $g$ are chain homotopic, then
$$f_{n,*} = g_{n,*} : H_n(C_\bullet) \to H_n(D_\bullet).$$

Proof. Let $z \in Z_n(C_\bullet)$ and put $\xi = [z] \in H_n(C_\bullet)$. Since $\partial_n z = 0$,
$$g_n(z) = f_n(z) + \partial_{n+1} E_n(z).$$
Hence $g_n(z) \sim f_n(z)$, so $g_* \xi = f_* \xi$.

Theorem 3.2.44 (Homotopy Invariance). If $f \simeq g : X \to Y$, then on singular homology,
$$f_* = g_* : H_n(X) \to H_n(Y).$$

Proof. See [Hat01].

This theorem shows that singular homology is invariant under homotopy, which will become hugely important when classifying topological spaces.

We now want to explore the computability of $H_n(X)$. As we have noted, $\pi_n(X)$ is difficult to compute. It turns out that $H_n(X)$ is relatively easy to compute (in most cases) and is therefore used substantially more by various areas of mathematics as a way of attaching invariants to spaces. The following definition hints at one possible way of computing these groups explicitly.

Definition 3.2.45 (Relative Chain Groups). Let $A \subseteq X$ be a subspace. Define the relative chain groups
$$C_n(X, A) = C_n(X)/C_n(A).$$

Remark 3.2.46.
Notice that chains in $X$ descend to chains relative to $A$: the boundary map $\partial_n : C_n(X) \to C_{n-1}(X)$ carries $C_n(A)$ into $C_{n-1}(A)$, and therefore induces a boundary map $\partial_n : C_n(X, A) \to C_{n-1}(X, A)$ on the quotients which commutes with the projections $q : C_n(X) \to C_n(X, A)$.

Definition 3.2.47 (Relative Homology Groups). In light of the previous remark, let $A \subseteq X$ and let $C_n(X, A)$ be the relative chain groups. We define the relative homology groups
$$H_n(X, A) = Z_n(X, A)/B_n(X, A),$$
where $Z_n(X, A) = \partial_n^{-1}(C_{n-1}(A))/C_n(A)$ and $B_n(X, A) = [B_n(X) + C_n(A)]/C_n(A)$. We can rewrite $H_n(X, A)$ as
$$H_n(X, A) = \frac{\partial_n^{-1}(C_{n-1}(A))}{B_n(X) + C_n(A)}.$$
Now the question stands: how do we send an element of $H_n(X)$ to an element of $H_n(X, A)$? Simply pass it to the quotient; it will still be a cycle.

Conversely, suppose $\eta \in H_n(X, A)$ and $\bar{z} \in Z_n(X, A)$ with $\eta = [\bar{z}]$, represented by $z \in \partial_n^{-1}(C_{n-1}(A))$. Then $\partial_n z \in Z_{n-1}(A)$, since $\partial_{n-1}\partial_n = 0$. This gives us a map
$$\partial : H_n(X, A) \to H_{n-1}(A), \qquad [\bar{z}] \mapsto [\partial_n z],$$
which is well defined: changing the representative to $z + \partial_{n+1} c + d$ with $d \in C_n(A)$ changes $\partial_n z$ by $\partial_n d \in B_{n-1}(A)$. Piecing all of this information together gives the following theorem.

Theorem 3.2.48.
For any subspace $A \subseteq X$, there is a long exact sequence of homology groups
$$\cdots \to H_n(A) \xrightarrow{i_*} H_n(X) \xrightarrow{j_*} H_n(X, A) \xrightarrow{\partial} H_{n-1}(A) \xrightarrow{i_*} H_{n-1}(X) \to \cdots$$

Proof. This follows immediately from the Snake Lemma applied to the short exact sequence of chain complexes $0 \to C_\bullet(A) \to C_\bullet(X) \to C_\bullet(X, A) \to 0$. More generally, for a triple $B \subseteq A \subseteq X$ one has the short exact sequence
$$0 \to C_n(A, B) \to C_n(X, B) \to C_n(X, A) \to 0.$$

Theorem 3.2.49 (Excision Theorem). Let $X$ be a topological space and $A$ a subset. Suppose that $Z \subseteq \overline{Z} \subseteq \mathrm{Int}(A)$. Then
$$H_n(X, A) \cong H_n(X \setminus Z, A \setminus Z).$$

We shall not prove the excision theorem, but instead note its usefulness: it is a key ingredient in making $H_n(X, A)$ computable. If we consider $H_n(X, \{x_0\})$ for some point $x_0 \in X$, then this agrees with $H_n(X)$ in positive degrees, and therefore the homology groups themselves become computable. It can be shown further that $H_n(X)$ is finitely generated if $X$ has finitely many $n$-cells. Furthermore, under some mild conditions, we have an isomorphism
$$H_n(X, A) \cong H_n^\Delta(X, A).$$
This should not be surprising, as giving a simplicial structure to a topological space is precisely dictating the embeddings of simplices of various dimensions.

The final topic we shall introduce in this section is
Cohomology. This is formally dual to the notion of homology. It will seem a bit contrived in this setting, but in some fields (like algebraic geometry and differential geometry) cohomology is the most natural algebraic invariant on a space.

Consider $C_\bullet(X)$, the singular chain complex of $X$, and apply the functor $\mathrm{Hom}(-, G)$ for an abelian group $G$ (we could have chosen a ring $R$ here, considered as an abelian group). We get a new complex
$$C^\bullet(X) := \cdots \to C^{n-1}(X; G) \xrightarrow{d^n} C^n(X; G) \xrightarrow{d^{n+1}} C^{n+1}(X; G) \to \cdots$$
where $C^n(X; G) := \mathrm{Hom}_{\mathbb{Z}}(C_n(X), G)$ and $d^{n+1} := \mathrm{Hom}(\partial_{n+1}, G) = \partial_{n+1}^*$. This is a (co)chain complex since $\mathrm{Hom}(-, G)$ is a functor.

Definition 3.2.50.
The cohomology groups of a topological space $X$ are the groups
$$H^n_{\mathrm{sing}}(X; G) := \ker d^{n+1}/\mathrm{Im}\, d^n.$$
One immediate advantage of cohomology is that there is a canonical ring structure on
$$H^*(X; G) = \bigoplus_{n \in \mathbb{N}} H^n(X; G),$$
called the cup product, defined as follows (for $G$ a ring): let $\varphi \in C^k(X; G)$ and $\psi \in C^l(X; G)$. Then, for a singular simplex $\sigma = [v_0, \ldots, v_{k+l}]$,
$$(\varphi \smile \psi)(\sigma) = \varphi\big(\sigma|_{[v_0, \ldots, v_k]}\big) \cdot \psi\big(\sigma|_{[v_k, \ldots, v_{k+l}]}\big).$$
The cup product induces a map $H^k(X; G) \times H^l(X; G) \to H^{k+l}(X; G)$ which is compatible with the quotients.

The only theorem we will present here is the Universal Coefficient Theorem. Naively, one might assume that $H^n(X; G)$ is simply $\mathrm{Hom}(H_n(X), G)$. This is not necessarily true. What is true is that there exists a short exact sequence relating the two, as the following theorem dictates.

Theorem 3.2.51 (Universal Coefficient Theorem). Let $X$ be a topological space, $C_\bullet(X)$ its singular chain complex, and $C^\bullet(X; G)$ an associated cochain complex. Then the cohomology groups are determined by the split exact sequence
$$0 \to \mathrm{Ext}^1_{\mathbb{Z}}(H_{n-1}(X), G) \to H^n(X; G) \to \mathrm{Hom}(H_n(X), G) \to 0$$
[Rot88]. This ends the section on algebraic topology.

Manifolds

Manifolds pop up in every area of mathematics and play the starring role in the model we develop in Chapter 4. They are generalizations of Euclidean space ($\mathbb{R}^n$) and allow a variety of new geometry to occur. All together they form a category $\mathbf{Man}^\infty$, which gives a concrete example of a suitably bad category whose objects are easy to understand. This section will run through the basic theory of manifolds, vector bundles, and sheaves. We conclude with a discussion of de Rham theory, which ties together the topological information on a manifold. Good references for the first two sections are [Wed16], [Lee12], [Tu11], and [GP74].

$C^\infty$ Manifolds

There are two approaches to smooth manifolds which are commonly used: analytic and algebraic.
We shall focus on the algebraic theory, as it ties in more closely with the later sections. We will not entirely neglect the analytic theory, however, as we need the notion of differentiation, which is purely analytic. We start with the definition of an atlas:
Definition 3.3.1.
Let M be a topological space and α ∈ N ∪ {∞}. A chart at p ∈ M is a pair (ϕ, U) with p ∈ U ⊆ M open and ϕ : U → ϕ(U) ⊆ R^n (for n not depending on p) a homeomorphism. A collection A of charts is called a C^α-atlas on M if for all p, q ∈ M there are charts (ϕ_p, U_p) and (ϕ_q, U_q) which are compatible: the transition map

ϕ_p ∘ ϕ_q^{-1} : ϕ_q(U_p ∩ U_q) → ϕ_p(U_p ∩ U_q)

is a C^α-homeomorphism (each coordinate function is α-times differentiable in each variable). If α = ∞, then we call the chart maps and the transition maps smooth. In this case, the atlas is called smooth. Example 3.3.2.
Let S^n = {x ∈ R^{n+1} : |x| = 1}, where |x| = √(x_1² + ⋯ + x_{n+1}²), be the n-sphere equipped with the subspace topology. To construct a smooth atlas on S^n, we need to give charts. Consider the open subsets U_i^± := {x ∈ S^n : x_i > 0 (resp. < 0)} and the function f : B^n → R given on the open unit ball by f(u) = √(1 − |u|²). Then U_i^+ ∩ S^n is the graph of this function and U_i^− ∩ S^n is the graph of −f. Each x ∈ U_i^+ ∩ S^n can then be written with x_i = f(x_1, ..., x̂_i, ..., x_{n+1}). Define the maps ϕ_i^± : U_i^± → R^n by ϕ_i^±(x_1, ..., x_{n+1}) = (x_1, ..., x̂_i, ..., x_{n+1}). These are seen to be smooth, and they are easily checked to be compatible. Hence, A = {(ϕ_i^±, U_i^±)} is a smooth atlas on S^n. Definition 3.3.3.
Let M be a topological space equipped with a C^α-atlas A. We call M an n-dimensional C^α-manifold if M is Hausdorff and its topology has a countable basis. Remark 3.3.4.
Normally, the requirement of an atlas is stated as M is locally Euclidean. This is the key property distinguishing manifolds from ordinary Euclidean space: they need not be R^n, or even C^α-homeomorphic to R^n, only locally so. We shall study only smooth manifolds here. The non-smooth cases are important, however not for this thesis. For smooth manifolds, we would like to know that M does not depend on the atlas. Proposition 3.3.5.
Let M be a smooth manifold with atlas A. Then there exists a unique maximal atlas A♯ which contains A (and every atlas compatible with it). Proof.
Define A♯ as the set of all charts which are smoothly compatible with the charts in A. Let (ϕ, U) and (ψ, V) be charts in A♯ and put x = ϕ(p) ∈ ϕ(U ∩ V). As A is an atlas, there exists a chart (θ, W) ∈ A with p ∈ W. Since p ∈ U ∩ V ∩ W, the intersection is non-empty. Therefore, by construction, the map

(ψ ∘ θ^{-1}) ∘ (θ ∘ ϕ^{-1}) : ϕ(U ∩ V ∩ W) → ψ(U ∩ V ∩ W)

is smooth, and therefore ψ ∘ ϕ^{-1} is smooth. Hence, A♯ is an atlas on M containing A. To show it is unique, let B be another such atlas. Then, in particular, each of its charts is smoothly compatible with the charts in A. Hence, B ⊆ A♯, and by maximality they are equal. Example 3.3.6.
The following examples of manifolds show up everywhere and thus should be well understood.
(a) The unit sphere S^n from above was shown to exhibit a smooth atlas. The fact that it is Hausdorff and second countable follows from it being a compact subset of R^{n+1}.
(b) Consider the action of R^× on R^n by r·(x_1, ..., x_n) = (rx_1, ..., rx_n). The quotient space (R^n − {0})/R^× is called real projective space and is denoted P^{n−1}(R), or just P^{n−1} if the field is understood. We denote elements here as equivalence classes [x_1, ..., x_n]; these are the lines in R^n which pass through the origin. To give charts on P^{n−1}, we consider, on the locus where x_i ≠ 0, maps of the form

ϕ_i[x_1, ..., x_n] = (x_1/x_i, ..., x_{i−1}/x_i, x_{i+1}/x_i, ..., x_n/x_i) ∈ R^{n−1}.

An easy check shows that these are smooth and compatible. Hence, P^{n−1} is a smooth manifold. Moreover, it is compact!
(c) Let M and N be two smooth manifolds. Then M × N has the structure of a smooth manifold given by charts of the form (ϕ × ψ, U_M × V_N).
(d) Let M(m, n, R) be the m × n matrices with real entries. This is a smooth manifold via the identification M(m, n, R) ≅ R^{mn}. If m = n, we denote it by M(n, R) or M_n(R). Notice that M_n(R) comes equipped with a ring structure given by matrix multiplication. In this case, there are many distinguished open submanifolds, the most important being GL_n(R), the group of invertible linear transformations. We will return to this example later, as it is the principal example of a Lie Group. These will turn out to be group objects in the category of manifolds.
Due to the above proposition, we will assume without loss of generality that M is equipped with its maximal atlas. Now we can define morphisms of smooth manifolds. Definition 3.3.7.
Let M and N be two smooth manifolds. A function F : M → N is a smooth map if for all (ϕ, U) ∈ A_M and (ψ, V) ∈ A_N with F(U) ∩ V ≠ ∅, the map

ψ ∘ F ∘ ϕ^{-1} : ϕ(U) → ψ(V)

is smooth (this is the composite along the square formed by ϕ, F, and ψ). A bijective smooth map whose inverse is smooth is a diffeomorphism. The composition of smooth maps is smooth, by an extended version of the same diagram. Therefore, we have defined a category Man_∞ of smooth manifolds with morphisms the smooth maps. Using this, we can now define the functor

C^∞ : Man_∞^op → Alg_R, C^∞(M) = Hom_{Man_∞}(M, R),

where the operations are defined point-wise. It is contravariant by the following: if F : M → N is a morphism, then

F^* : C^∞(N) → C^∞(M), F^*(s) = s ∘ F.

Further, if M → N → P is a sequence of morphisms G and F, then (F ∘ G)^*(d) = d ∘ (F ∘ G) = (d ∘ F) ∘ G = G^*(F^*(d)), so (F ∘ G)^* = G^* ∘ F^*. A derivation of this ring at p ∈ M is a linear function d : C^∞(M) → R such that d(fg) = f(p)(dg) + (df)g(p). Remark 3.3.8.
In fact, C^∞(M) carries a natural topology (making it a Fréchet space, though not a Banach space). This changes the situation in a subtle way. If M × N is a product manifold, one would expect the smooth functions on it to be C^∞(M) ⊗_R C^∞(N). However, this cannot be true: a function such as sin(xy) on R × R, for instance, is not a finite sum of products f(x)g(y). Therefore, one needs to take a suitable completion of this tensor product. For more information see [Rya02]. Definition 3.3.9.
Let M be a smooth manifold. If p ∈ M, we define the tangent space to M at p to be

T_pM = {f : C^∞(M) → R : f is a derivation at p}.

This is clearly an R-vector space; in fact, it is finite dimensional. Elements of the tangent space should be thought of as vectors which are tangent to M at the point p. We put dim_p M = dim_R T_pM. The germ of a function f : M → R at the point p is an equivalence class [f], where two functions f, g are equivalent at p if there exists an open neighbourhood W of p such that f = g on W. Denote by C^∞_{M,p} the set of all germs at p. This is a local ring whose maximal ideal m_p consists of the germs which vanish at p. The following theorem gives equivalent formulations of the tangent space. Theorem 3.3.10. Let M be a smooth manifold and T_pM its tangent space at p. The space of germs C^∞_{M,p} is a local ring with maximal ideal m_p. The following are equivalent formulations of the tangent space:
(a) Let D_pM = Der(C^∞_{M,p}, R). Then D_pM ≅ T_pM, by the map sending a derivation d of germs to the derivation f ↦ d([f]).
(b) Let γ : (−1, 1) → M be a smooth curve with γ(0) = p. Set

C_pM = {γ′(0) : γ : (−1, 1) → M a smooth curve with γ(0) = p}/∼,

where γ ∼ δ if for all germs f ∈ C^∞_{M,p} we have (f ∘ γ)′(0) = (f ∘ δ)′(0). Then C_pM ≅ T_pM.
(c) (m_p/(m_p)²)^* ≅ T_pM.
See [Wed16] for a proof of this. The hardest part to prove is (c), and it relies heavily on the fact that M is C^∞; if M were only C^n, this would fail, and dim m_p/(m_p)² would be infinite. The operation of passing to the tangent space is functorial in M. That is, if F : M → N is a morphism, then T_pF : T_pM → T_{F(p)}N is the linear map defined by d ↦ d ∘ F^*. Therefore,

T_p(G ∘ F)(d) = d ∘ (G ∘ F)^* = (d ∘ F^*) ∘ G^* = (T_{F(p)}G ∘ T_pF)(d). Definition 3.3.11.
Let M be a smooth manifold. We define the tangent bundle of M as the disjoint union

TM = ⨆_{p∈M} T_pM = {(p, v) : p ∈ M, v ∈ T_pM}.

(The disjoint union is the coproduct in the category of sets; see [Lee12].) The cotangent bundle T^*M = ⨆_{p∈M} (T_pM)^* is defined analogously. There is a canonical projection π_M : TM → M given by (p, v) ↦ p, so that π_M^{-1}(p) = T_pM. Picking a basis of T_pM so that we may identify it with R^n, we find that in a neighbourhood U of p, π_M^{-1}(U) ≅ U × R^n. This property of the tangent bundle is called local trivialization. Further, using this identification, we get that TM (and likewise T^*M) is a smooth manifold of dimension 2 dim M. This definition makes T : Man_∞ → Man_∞ into an endofunctor [Lee12]. Using the tangent bundle, we can now study certain C^∞(M)-modules which arise naturally. Let s : M → TM be a smooth map such that π_M ∘ s = id_M. We call s a section of TM and denote the space of all sections by

Γ(M, TM) := {s : M → TM : π_M ∘ s = id_M}.

This is an R-vector space under point-wise addition. Moreover, it can be given the structure of a C^∞(M)-module: if s is a section and f ∈ C^∞(M), we define (f · s)(p) = (p, f(p)s(p)). If U ⊆ M is a submanifold, we define Γ(U, TM) as the sections of the bundle over U. (Recall that a derivation is a linear map f satisfying the Leibniz rule f(xy) = f(x)y + xf(y), and Der denotes the space of all such maps; for more details, see [Lee12] or [Wed16].) Definition 3.3.12.
A smooth section s ∈ Γ(M, TM) is called a smooth vector field on M. It associates to each point p ∈ M a tangent vector v ∈ T_pM. We call M parallelizable if there exist vector fields V_1, ..., V_n such that {V_1(p), ..., V_n(p)} is a basis for T_pM for all p. Proposition 3.3.13. Let M be parallelizable. Then TM ≅ M × R^n. Proof. Let {V_1, ..., V_n} be a parallelization of M. Then the map ϕ : TM → M × R^n given by

ϕ(p, ∑ a_i V_i(p)) = (p, ∑ a_i e_i)

is smooth. Further, as T_pM ≅ R^n via the isomorphism V_i(p) ↦ e_i, this map is a diffeomorphism. Hence, TM ≅ M × R^n is trivial. One important operation that vector fields admit is the Lie derivative. Given two vector fields V and W, we define

L_V(W) = [V, W] = VW − WV.

Here each vector field X acts on functions as a derivation of C^∞(M), via (Xf)(p) = X_p(f). It is readily checked that [V, W] is again a vector field. Hence, Γ(M, TM) admits the structure of a Lie algebra. Immersions and Submersions
Given the discussion above, we can now formulate some special morphisms in
Man ∞ . Definition 3.3.14.
Let F : M → N be a morphism in Man_∞. We define the rank of F at the point p ∈ M as

rk_p(F) := rk(T_pF : T_pM → T_{F(p)}N).

Choosing bases for T_pM and T_{F(p)}N, we obtain a smooth map M → M(m, n, R), p ↦ T_pF. Further, the matrix of T_pF is (up to a choice of bases)

$$\begin{pmatrix} I_{\operatorname{rk}(F)} & 0 \\ 0 & 0 \end{pmatrix}.$$

From this it follows that some rk_p(F) × rk_p(F) minor of the matrix of T_pF is non-zero, and by continuity that minor remains non-zero in a neighbourhood of p. We now have the immediate corollary: Corollary 3.3.15.
For every p ∈ M, there exists an open neighbourhood U of p such that rk_p(F) ≤ rk_q(F) for all q ∈ U. This tells us that the rank of a smooth map can only stay the same or increase in a neighbourhood of a point. If the rank is constant on a neighbourhood of p, we say F has constant rank at p. Corollary 3.3.16.
If F has constant rank at p, then there exist charts (ϕ, U) and (ψ, V) of p and F(p) respectively, such that

ψ ∘ F ∘ ϕ^{-1}(x_1, ..., x_m) = (x_1, ..., x_r, 0, ..., 0).

This corollary is incredibly important to the study of manifolds, as it gives a local representation of F in which we can disregard a significant number of variables. There are two extreme cases of the above corollary. Definition 3.3.17. F : M → N is called an:
(a) Immersion if T_pF is injective for all p.
(b) Submersion if T_pF is surjective for all p.
A smooth immersion which is also a topological embedding is called a smooth embedding. Embeddings are particularly useful, as they exhibit manifolds as sitting inside others. Immersions are also incredibly important. The following example is of an object which cannot be embedded into R³ but can be immersed in it. Example 3.3.18.
Let I² be the product of [0, 1] with itself. We are going to build K, the Klein bottle. Consider the relations (x, 0) ∼ (x, 1) and (0, y) ∼ (1, 1 − y). The resulting quotient is an object which cannot be embedded into R³ but can be immersed in it. It is the gluing of two Möbius bands along their boundaries, yielding a one-sided surface with no edges. To show it is a manifold is not particularly difficult, as we have an explicit representation of it above. We want to understand how immersions, submersions, and embeddings interact with surjective, injective, and bijective maps. Theorem 3.3.19 (Global Rank Theorem). Let F : M → N be a smooth map of smooth manifolds with constant rank. Then:
(a) If F is injective, then F is an immersion.
(b) If F is surjective, then F is a submersion.
(c) If F is bijective, then F is a diffeomorphism.
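The immersion and submersion conditions are rank conditions on a Jacobian matrix, so they can be spot-checked numerically. A minimal sketch (the maps F and G below are our own illustrative examples, not taken from the text): F is a graph map, hence an immersion, while G is a linear projection, hence a submersion.

```python
import numpy as np

def jacobian(f, p, h=1e-6):
    """Numerical Jacobian of f at p via central differences."""
    p = np.asarray(p, dtype=float)
    cols = []
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        cols.append((np.asarray(f(p + e)) - np.asarray(f(p - e))) / (2 * h))
    return np.column_stack(cols)

F = lambda p: (p[0], p[1], p[0] ** 2 + p[1] ** 2)  # R^2 -> R^3, graph map
G = lambda p: (p[0], p[2])                         # R^3 -> R^2, projection

# Immersion: T_pF injective, i.e. rank = 2 = dim of the source, at each sample p.
for q in [(0.0, 0.0), (1.0, -2.0), (0.3, 0.7)]:
    assert np.linalg.matrix_rank(jacobian(F, q)) == 2
# Submersion: T_pG surjective, i.e. rank = 2 = dim of the target.
for q in [(1.0, 2.0, 3.0), (-1.0, 0.0, 0.5)]:
    assert np.linalg.matrix_rank(jacobian(G, q)) == 2
```

A symbolic computation (e.g. with a computer algebra system) would verify the constant-rank hypothesis at every point rather than at sample points.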
The proof of Theorem 3.3.19 relies on a strong theorem from functional analysis. As we do not develop this theory here, the proof will be omitted; for a full treatment, see [Lee12] and [Kna05b]. This theorem gives a sufficient condition for a smooth map to be an immersion (resp. submersion), and it is much more easily checked than the immersion (submersion) condition itself. Figure 3.2: The Klein bottle immersed in R³. It can be embedded in R⁴. Vector Bundles
We now want to understand some generalizations of the (co)tangent bundle from above.
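Before the formal definition, the triviality question for a bundle can be illustrated on the simplest tangent bundle. A minimal numerical sketch (the sampling of the circle is an illustrative choice): by Proposition 3.3.13, exhibiting a nowhere-vanishing global tangent field on S¹ shows that T S¹ ≅ S¹ × R.

```python
import numpy as np

# T S^1 is trivial: S^1 is parallelizable via the single global vector field
# V(t) = (-sin t, cos t), which is tangent to the circle and nowhere zero.
t = np.linspace(0.0, 2.0 * np.pi, 400)
p = np.stack([np.cos(t), np.sin(t)], axis=1)    # sample points on S^1 in R^2
V = np.stack([-np.sin(t), np.cos(t)], axis=1)   # candidate global section of T S^1

assert np.allclose(np.sum(p * V, axis=1), 0.0)  # tangency: V(p) is orthogonal to p
assert np.min(np.linalg.norm(V, axis=1)) > 0.9  # nowhere vanishing (norm is 1 here)
```

Contrast this with the Möbius line bundle below, where no such nowhere-vanishing section exists.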
Definition 3.3.20.
Let M be a smooth manifold. We call a triple (E, π, V), consisting of a smooth manifold E, a projection map π, and a real vector space V, a real vector bundle of rank dim V over M if:
(a) π : E → M is smooth and surjective.
(b) For each p ∈ M, the fibre E_p := π^{-1}(p) ≅ {p} × V ≅ V is endowed with the structure of a dim V-dimensional real vector space.
(c) For each p ∈ M, there exists a neighbourhood U of p and a homeomorphism Φ : π^{-1}(U) → U × V satisfying:
(i) π_U ∘ Φ = π (where π_U : U × V → U is the projection);
(ii) for each q ∈ U, the restriction Φ_q : E_q → {q} × V is a vector space isomorphism.
Equivalently, we could have defined a vector bundle as E = ⨆_{p∈M} V_p where V_p = {p} × V. In this sense, we see that TM and T^*M are vector bundles, and, as before, Γ(M, E) is a C^∞(M)-module. The main purpose of this section is to understand transformations on bundles and transformations between them. Definition 3.3.21. Let (E, π) and (E′, π′) be vector bundles over M and M′ respectively. A bundle homomorphism is a map F : E → E′ which is linear on each fibre, such that there exists a map f : M → M′ making the square commute: π′ ∘ F = f ∘ π. Proposition 3.3.22.
If F is smooth, then f is smooth. Proof. f = π′ ∘ F ∘ ζ, where ζ : M → E is the zero section; this is a composition of smooth maps and therefore smooth. This lets us define a category Bun(M) whose objects are vector bundles over M and whose morphisms are bundle homomorphisms. The forgetful functor U : Bun(M) → Man_∞ is faithful. In general it is not full, as there exist smooth maps E → E′ which do not commute with the projection maps. We will denote by Bun(M)_{<∞} the category of finite rank vector bundles. This category will become interesting in the next section, when we relate it to categories of certain sheaves. Example 3.3.23.
We now construct some interesting bundles over various manifolds.
(a) Let M = S¹. Define an equivalence relation on R² by (x, y) ∼ (x′, y′) if (x′, y′) = (x + n, (−1)ⁿy) for some n ∈ Z. Put E = R²/∼. We claim E is a non-trivial bundle over S¹. Let q : R² → E be the quotient map and let ε : R → S¹ be ε(x) = e^{2πix}. The first-coordinate projection R² → R descends along q and ε to the map π : E → S¹ determined by the commutativity of the evident square. This makes (E, π) a real line bundle on S¹ which is non-trivial (by the twist of (−1)ⁿ): the Möbius bundle. This is the chief example of how local information can be deceptive when trying to understand something globally.
(b) Let M be a manifold and V any vector space. Then M × V has the canonical structure of a vector bundle on M.
(c) Let E, E′ be vector bundles over M. Then E ⊕ E′ is a vector bundle whose fibres are V ⊕ V′. This is called the Whitney sum of vector bundles.
If E and E′ are vector bundles on a smooth manifold M, denote their spaces of smooth sections by Γ(E) and Γ(E′). If F : E → E′ is a bundle homomorphism, it induces a map F̃ : Γ(E) → Γ(E′) given by

F̃(σ)(p) = F(σ(p)).

Because a bundle homomorphism is linear on fibres, F̃ is R-linear on sections. In fact, it is even C^∞(M)-linear. We can characterize all C^∞(M)-linear maps Γ(E) → Γ(E′) by the following theorem. Theorem 3.3.24.
Let E, E′ be vector bundles on M and T : Γ(E) → Γ(E′) a map. Then T is C^∞(M)-linear if and only if T = F̃ for some bundle homomorphism F : E → E′. The proof of this goes beyond the scope of this text; see [Lee12] for details. What this theorem tells us is that for vector bundles we have a bijective correspondence

Hom_{C^∞(M)}(Γ(E), Γ(E′)) ≅ Hom_{Bun(M)}(E, E′). Remark 3.3.25.
The key to vector bundles is that they encode both global and local information about the manifold. Further, understanding the category
Bun(M) is in some sense equivalent to understanding the slice category (see [ML71] for a definition) Man_∞/M. Overall, we shall use these objects to transfer information from the physical space of a sensory system to the perceptual space; in fact, this will be how we build the perceptual space. We could have equivalently defined fibre bundles and gone through this section in more generality. These are similar to vector bundles, but we do not require that the fibres be vector spaces. The story of these objects is largely mysterious, as they are nearly too general to say anything interesting about. Importantly though, they still have the property that all fibres are isomorphic. This concludes the section on manifolds. The story of sheaves begins where we just finished: fibre bundles. Notice that fibre bundles are characterized by the fact that the fibres over every point are necessarily isomorphic. Sheaves seek to generalize this idea by removing the restriction of constant fibres. Sheaves are key in nearly every area of mathematics, as they encode geometric information which is otherwise difficult to access. In the late 1960s, Alexander Grothendieck developed the idea that understanding the sheaves on a space is equivalent to (and in some sense better than) understanding the space itself. In this section we will give the first properties of (pre)sheaves, define ringed spaces, and construct the category O_X-Mod. We conclude the section with a brief introduction to sheaf cohomology which, in the same style as the cohomology of the previous section, will provide rich invariants of the associated manifolds. Most of the material of this section comes from [Ive86], [Har77], [Wed16], [EH00], and [Bre97]. As this forms the most technical material of this thesis, we shall only prove those statements which are fundamental to the reader's understanding, and will point to the appropriate reference otherwise.
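The gluing behaviour that separates sheaves from presheaves can be simulated on a toy cover ahead of the formal definitions. A minimal sketch (the finite "space", the cover {1, 2} ∪ {2, 3}, and the dictionary encoding of sections are our own illustrative choices, not constructions from the text): function-valued sections glue uniquely, while globally constant sections can fail to glue.

```python
# Toy model of the sheaf condition (Sh) over a finite "space" {1, 2, 3}.
# A section over an open set is encoded as a dict {point: value}.

def glue(s1, s2):
    """Glue two local sections; return None if they disagree on the overlap."""
    overlap = set(s1) & set(s2)
    if any(s1[x] != s2[x] for x in overlap):
        return None
    return {**s1, **s2}

# Function-valued sections over {1, 2} and {2, 3} glue uniquely:
assert glue({1: 10, 2: 20}, {2: 20, 3: 30}) == {1: 10, 2: 20, 3: 30}

# But constant sections over the disjoint sets {1} and {3} agree vacuously on
# the empty overlap, while no single constant restricts to both values -- the
# constant presheaf is not a sheaf on a disconnected space.
glued = glue({1: 10}, {3: 30})
assert glued == {1: 10, 3: 30} and len(set(glued.values())) == 2
```

This is exactly the behaviour formalized by condition (Sh) below, and the failure for constants reappears in the discussion of the constant presheaf later in the section.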
Remark 3.3.26.
Due to the technical density of this section, we encourage the reader to skip a majority of the proofs of the statements presented here. The proofs of a majority of these theorems can be opaque on a first pass, and should be revisited only if a deeper understanding is desired. Before we give the formal definitions of sheaves, recall some of the facts we proved about C^∞(−) as a functor Man_∞ → Ring. Fix M ∈ Man_∞. We know that C^∞_M(U) is a ring for any open submanifold U ⊆ M. Additionally, C^∞_{M,x} is a local ring for each x ∈ M. Further, we showed that given an open cover {U_i} of M and smooth functions f_i defined on each U_i such that f_i|_{U_i∩U_j} = f_j|_{U_i∩U_j}, there exists a unique global smooth function g with the property that g|_{U_i} = f_i. What we will see is that C^∞_M is the structure sheaf of M. For now, let us start, as always, with some definitions. Definition 3.3.27.
Let (X, T) be a topological space and C a category. A presheaf on X is a functor F : T^op → C, where T is the category whose objects are the open subsets of X and in which Hom(V, U) is a one-point set if V ⊆ U and empty otherwise. Morphisms of presheaves are natural transformations of functors. Remark 3.3.28.
Notice that for V ⊆ U, there is a unique morphism, denoted Res_{UV} : F(U) → F(V). We sometimes call F(U) the set of sections of F over U and denote it Γ(U, F). Additionally, instead of writing Res_{UV}(s) for the image of s in F(V), we write s|_V. Every presheaf is the same as a contravariant functor; we use the term presheaf when we want to discuss the gluing conditions which we will see later. A classical example of a presheaf is C^α_M(U) = {f : U → R : f is α-times differentiable}, for a real C^α-manifold M and α ∈ N ∪ {∞}. Definition 3.3.29.
Let X be a topological space and F a presheaf on X. F is a sheaf if the following condition is satisfied:
(Sh) If U ⊆ X is open, {U_i}_{i∈I} is an open cover of U, and we are given sections f_i ∈ F(U_i) such that f_i|_{U_i∩U_j} = f_j|_{U_i∩U_j} for all i ≠ j ∈ I, then there exists a unique f ∈ F(U) such that f|_{U_i} = f_i. Remark 3.3.30.
This definition can be generalized to more general categories; to do this correctly, however, one needs the language of sites. We will not cover these, but refer the reader to [Met03], [KS06], and [Car11] for an in-depth treatment. Example 3.3.31.
We have already seen an example of a sheaf, namely C^α; it is easy to check the gluing condition (Sh). Other common examples are Ω^p_M, the sheaf of differential forms of degree p, and L, the sheaf of locally constant functions on a space. Sheaves allow local information to be glued together into global information. What we mean by local here is up to some interpretation: either open neighbourhoods of points, or the points themselves. As points are almost never open (except in discrete sets), we need to figure out how to define F(x). The following definition gives an answer in any category which admits colimits. Definition 3.3.32.
Let X be a topological space and U(x) = {U ∈ Open(X) : x ∈ U}. Suppose F is a (pre)sheaf on X. We define the stalk of F at x to be

F_x = lim→_{U(x)} F(U).

Here, we interpret the colimit as being taken over successively smaller sets containing x. In fact, if there exists some minimal U_x contained in all neighbourhoods of x, then F_x = F(U_x). We now want to understand how morphisms of sheaves interact with the stalks. Remark 3.3.33.
For the remainder of this text, we shall consider only sheaves of rings or, more generally, of R-modules for some ring R. This simplifies the situation and also turns out to be the situation for most spaces. Proposition 3.3.34.
A morphism of sheaves on a space X, ϕ : F → G, is an isomorphism if and only if the induced map on stalks ϕ_x : F_x → G_x is an isomorphism for every x ∈ X.
Proof. (⇒) Let x ∈ X and let U(x) be as in Definition 3.3.32, considered as a partially ordered set, with the colimit functor lim→ : Ring^{U(x)} → Ring. As ϕ is a natural transformation, it gives a morphism between the two direct systems {F(U), Res_{UV}}_{U(x)} and {G(U), Res_{UV}}_{U(x)}. As ϕ is an isomorphism, ϕ_U is an isomorphism for all U ∈ U(x). Therefore

lim→_{U(x)} {ϕ_U : F(U) → G(U)} = ϕ_x : F_x → G_x

is an isomorphism. As x was arbitrary, ϕ_x is an isomorphism on all stalks.
(⇐) Now assume that ϕ_x is an isomorphism for all x ∈ X. We shall show that ϕ_U is a bijection for all U; taking ψ_U = ϕ_U^{-1} then makes ϕ an isomorphism of sheaves. Let us first show that ϕ_U is injective. If s ∈ F(U) is such that ϕ_U(s) = 0, then on all stalks ϕ_x(s_x) = 0. As ϕ_x is an isomorphism, s_x = 0 for all x ∈ U. Therefore, for each x ∈ U there exists some open W_x ⊆ U containing x such that s|_{W_x} = 0. As ⋃ W_x is a cover of U, the sheaf condition gives a unique s* ∈ F(U) such that s*|_{W_x} = s|_{W_x} = 0 for all x. By uniqueness, s* = s = 0, so ϕ_U is injective.
To show surjectivity, let t ∈ G(U). For x ∈ U, let t_x ∈ G_x be the germ of t at x. As ϕ_x is surjective, there exists s_x ∈ F_x such that ϕ_x(s_x) = t_x. Pick a representative section s(x) ∈ F(V_x) whose germ at x is s_x. Then ϕ_{V_x}(s(x)) and t|_{V_x} have the same germ in G_x. Possibly replacing V_x by a smaller open set, we may assume that ϕ_{V_x}(s(x)) = t|_{V_x}. The collection {V_x} forms an open cover of U, and on each V_x we have a section s(x). Let p, q ∈ U be distinct points. Then s(p)|_{V_p∩V_q} and s(q)|_{V_p∩V_q} are two sections of F(V_p ∩ V_q) which are sent by ϕ to t|_{V_p∩V_q}. As ϕ_{V_p∩V_q} is injective (by the first part), we conclude that

s(p)|_{V_p∩V_q} = s(q)|_{V_p∩V_q}.

By the sheaf condition, there exists s ∈ F(U) such that s|_{V_p} = s(p) for every p. Lastly, we need to check that ϕ_U(s) = t. By construction, ϕ_{V_x}(s|_{V_x}) = t|_{V_x} for all x ∈ U; applying the sheaf condition to ϕ_U(s) − t, we see that this difference must be 0, hence ϕ_U(s) = t and ϕ_U is surjective. This completes the proof.
The collection of all C-valued sheaves on a topological space forms a category, denoted Sh(X, C) (presheaves also form a category). Per the remark above, we shall write Sh(X) := Sh(X, R-Mod) when R is understood. Notice that for a morphism of sheaves the kernel presheaf defines a sheaf, but the cokernel presheaf does not. Further, we would like quotients to exist in this category. To remedy this, we come to the following definition. Definition/Proposition 3.3.35.
For any presheaf F there is a sheaf F̃ and a natural morphism θ : F → F̃ with the following universal property: for any sheaf G and morphism of presheaves ϕ : F → G, there exists a unique morphism of sheaves ϕ̂ : F̃ → G with ϕ̂ ∘ θ = ϕ. The sheaf F̃ is called the sheafification of F. One can prove that sheafification is functorial in presheaves; in fact, it is left adjoint to the forgetful functor Sh(X) → PSh(X). Lemma 3.3.36.
The canonical map θ : F → F̃ induces an isomorphism on stalks. Proof.
Consider the construction of the sheafification of F as

F̃(U) = {(s_x) ∈ ∏_{x∈U} F_x : for all x ∈ U there exist an open W with x ∈ W ⊆ U and a t ∈ F(W) such that t_w = s_w for all w ∈ W}.

The restriction maps are given by restriction on the products. Now, by definition, θ_x is necessarily the identity. Remark 3.3.37. There is another way to build the sheaf associated to a presheaf. Given a presheaf F on X, we can construct the space Spé(F) = ⨆_{p∈X} F_p. This has a natural projection π : Spé(F) → X which projects each stalk onto the point it lies over. We topologize this space by endowing it with the strongest topology such that the sections s ∈ F(U) are continuous. It can be shown that these definitions agree. The sheafification operation allows us to define cokernels, quotients, and constant sheaves. All of this together tells us that if A is an abelian category, then Sh(X, A) is also an abelian category [Ive86]. Specifically, Sh(X) is an abelian category. Example 3.3.38.
(a) Let A be a ring. Then A defines a presheaf A_X by A_X(U) = A for all U open. If X is connected, then this is a sheaf; if X is disconnected, it is not. Suppose X = X_1 ⊔ X_2 and a ≠ b ∈ A. Defining the sections a ∈ A_X(X_1) and b ∈ A_X(X_2), they agree trivially on the empty intersection, yet there is no element c ∈ A_X(X) such that c|_{X_1} = a and c|_{X_2} = b. Hence, A_X is not a sheaf in general. Therefore, we consider Ã_X, the sheaf of locally constant functions on X with values in A.
(b) We define the skyscraper sheaf i_{x,*}(A) by i_{x,*}(A)(U) = A if x ∈ U and 0 if x ∉ U. This is a sheaf on any topological space and plays a key role in the theory, as it provides good counterexamples to many conjectural relationships.
(c) Let F be a sheaf and G a subsheaf on X. Then the functor U ↦ F(U)/G(U) is a presheaf. It is not a sheaf in general, however.
Therefore, we can take the sheafification to get the quotient sheaf F/G on X. In general, (F/G)(U) does not agree with F(U)/G(U). As Sh(X) is an abelian category, we can consider exact sequences of sheaves. Definition 3.3.39.
A sequence of sheaves on a space X,

0 → F ↪ G ↠ H → 0,

is exact if Im[F → G] ≅ ker[G → H]. Equivalently, this sequence is exact if the corresponding sequence of stalks 0 → F_x → G_x → H_x → 0 is exact for every x ∈ X. It follows, by an argument identical to the one for Hom, that Γ(X, −) is a left exact functor Sh(X) → R-Mod. For this reason we define

H^i(X, F) := R^iΓ(X, F),

the sheaf cohomology groups of F. These will tie together the entire chapter in Section 3.2.4 via Theorem 3.3.62. Before then, however, we want to consider how sheaves behave under maps between spaces. Up until this point, we have considered a fixed space X. If we have a morphism of topological spaces f : X → Y, we want to build a sheaf on Y which comes from f in some way. Definition 3.3.40.
Let f : X → Y be a map of topological spaces. Suppose F is a sheaf on X. The direct image (or pushforward) sheaf on Y with respect to f is the sheaf

f_*F(V) := F(f^{-1}(V)).

Further, we define the inverse image sheaf on X of a sheaf G on Y as

f^{-1}G(U) = lim→_{f(U)⊆V} G(V). Remark 3.3.41.
In the previous definition, one may want to give a naive definition of the inverse image sheaf in the style of the pushforward, that is, f^{-1}G(U) = G(f(U)). This fails immediately, however, as we are not guaranteed that f(U) is open. Sometimes topological spaces come naturally equipped with sheaves; smooth manifolds are examples of this situation. To every real topological manifold M we can attach C_M, the sheaf of continuous functions M → R. Definition 3.3.42. A ringed space is a topological space X equipped with a sheaf of rings O_X, called the structure sheaf of X. A morphism of ringed spaces is a pair (f, f♯) with f : X → Y a continuous map and f♯ : O_Y → f_*O_X a map of sheaves. We call (X, O_X) a locally ringed space if the stalks O_{X,p} are local rings for all p ∈ X. A morphism of locally ringed spaces is a morphism of ringed spaces for which the induced maps on stalks are local homomorphisms of local rings (the induced map O_{Y,f(p)} → O_{X,p} carries the maximal ideal at f(p) into the maximal ideal at p). Proposition 3.3.43.
Let (M, O_M) be a locally ringed space. Then M is a smooth manifold in the sense of Definition 3.3.3 if and only if there exists an open cover M = ⋃ U_i such that for each U_i there exist an open Y ⊆ R^n and an isomorphism of locally ringed spaces (U_i, O_M|_{U_i}) → (Y, C^∞_{R^n}|_Y).
Proof. (⇐) This direction is immediate: define the charts of the atlas to be the first components of the morphisms (f_i, f_i♯) of ringed spaces; the sheaf condition then guarantees that the gluing axiom holds.
(⇒) This direction is a bit more subtle. Let M be a smooth manifold with atlas A, write M = ⋃ U_i, and let V ⊆ M be an open subset. Define

O_M(V) = {f : V → R : f|_{U_i∩V} ∘ ϕ_i^{-1} : ϕ_i(U_i ∩ V) → R is C^∞ for all i}.

This makes (M, O_M) a ringed space. Further, it follows immediately that the induced morphisms (U_i, O_M|_{U_i}) → (Y, C^∞_{R^n}|_Y) are isomorphisms of ringed spaces. As the target is locally ringed, so is (M, O_M).
Corollary 3.3.44. Let M be a smooth manifold with smooth atlas A. Then (M, C^∞_M) is a locally ringed space. In some sense, locally ringed spaces are the correct setting in which to study everything we have seen already. Manifolds and all of their analytic properties can be re-phrased in terms of operations on the sheaf C^∞_M. The only object which we have seen so far that needs some further discussion is vector bundles. We first discuss a generalization. Definition 3.3.45.
Let $(X, \mathcal{O}_X)$ be a ringed space. An $\mathcal{O}_X$-module is a sheaf $\mathcal{F}$ on $X$ such that for each open $U \subseteq X$ there is a map $\mathcal{O}_X(U) \times \mathcal{F}(U) \to \mathcal{F}(U)$ which turns $\mathcal{F}(U)$ into an $\mathcal{O}_X(U)$-module. A morphism of $\mathcal{O}_X$-modules is a morphism of sheaves which is $\mathcal{O}_X$-equivariant.

In direct analogy with $R$-modules, we can consider some operations on $\mathcal{O}_X$-modules. Example 3.3.46.
For this set of examples, let $\mathcal{F}$ and $\mathcal{G}$ be $\mathcal{O}_X$-modules.

(a) (Direct sums) We can define $(\mathcal{F} \oplus \mathcal{G})(U)$ by $\mathcal{F}(U) \oplus \mathcal{G}(U)$. It is nearly immediate that this is a sheaf. Therefore, if $I$ is a finite indexing set, we can define the direct sum over this set and it will be a sheaf. In the infinite case this no longer holds true, and therefore one must sheafify.

(b) (Tensor products) Consider the presheaf $T : U \mapsto \mathcal{F}(U) \otimes_{\mathcal{O}_X(U)} \mathcal{G}(U)$. This is not a sheaf in general (it takes some work to find a counterexample). Therefore, we define $\mathcal{F} \otimes_{\mathcal{O}_X} \mathcal{G} = \widetilde{T}$.

(c) (Hom) We can consider the presheaf $U \mapsto \mathrm{Hom}_{\mathcal{O}_X|_U}(\mathcal{F}|_U, \mathcal{G}|_U)$. This is actually a sheaf and is denoted $\mathcal{H}om_{\mathcal{O}_X}(\mathcal{F}, \mathcal{G})$. It also turns out that $\mathcal{H}om$ and $\otimes_{\mathcal{O}_X}$ are adjoint endofunctors.

(d) (Duals) We define $\mathcal{F}^* := \mathcal{H}om_{\mathcal{O}_X}(\mathcal{F}, \mathcal{O}_X)$. This is a sheaf on $X$. There is a canonical morphism $\mathcal{F} \to \mathcal{F}^{**}$ to the double dual, given on stalks by $s_x \mapsto \mathrm{ev}_{s_x} : \mathcal{F}_x \to \mathcal{O}_{X,x}$, the evaluation-at-$s_x$ map. Further, this gives another construction of the tangent and cotangent bundles.

Now we want to define "free" $\mathcal{O}_X$-modules. Remark 3.3.47.
For the remainder of this text, we shall write $\mathcal{O}_U$ for the restriction of the structure sheaf to an open $U \subseteq X$.

Definition 3.3.48. We call an $\mathcal{O}_X$-module $\mathcal{F}$ finite locally free if there exists an open cover $\mathcal{U} = \{U_i\}_{i \in I}$ such that each $\mathcal{F}|_{U_i}$ is isomorphic (as sheaves) to $\mathcal{O}_{U_i}^n$ for some $n \in \mathbb{N}$. In this case, each stalk $\mathcal{F}_x$ is a free $\mathcal{O}_{X,x}$-module. Define $\mathrm{rk}_x(\mathcal{F}) := \mathrm{rk}_{\mathcal{O}_{X,x}}(\mathcal{F}_x)$. This defines a locally constant function $X \to \mathbb{N}$, $x \mapsto \mathrm{rk}_x(\mathcal{F})$, called the rank of $\mathcal{F}$.

We can build a category $\mathrm{FLF}(X)$ of all finite locally free sheaves on $X$. It turns out that for FLF sheaves, the canonical morphism $j : \mathcal{F} \to \mathcal{F}^{**}$ is an isomorphism. These look surprisingly close to a generalization of vector bundles, and the following theorem explains why. Theorem 3.3.49.
There is an equivalence of categories $\mathrm{Bun}(X)_{<\infty} \rightleftarrows \mathrm{FLF}(X)$ for any ringed space $X$.

For a proof, see [Wed16]. What this theorem tells us is that we can assign to each vector bundle a finite locally free sheaf and vice versa. Therefore, as the tangent and cotangent bundles are finite rank vector bundles on a manifold $M$, we get corresponding sheaves $\mathcal{T}_M$ and $\Omega_M$. It turns out that we can define the cotangent bundle using the sheaf $\mathcal{H}om$ from above:
$$\Omega_M = \mathcal{T}_M^* = \mathcal{H}om(\mathcal{T}_M, C^\infty_M).$$
Furthermore, as $C^\infty_{M,x}$ is a local ring (the maximal ideal $\mathfrak{m}_x$ consists of all functions vanishing at $x$), we can define $T_xM = (\mathfrak{m}_x/\mathfrak{m}_x^2)^*$ and then $\mathcal{T}_{M,x} = (\mathfrak{m}_x/\mathfrak{m}_x^2)^*$. This gives an explicit description of the stalks of $\mathcal{T}_M$.

We now turn to some homological methods to end this subsection. Together with their morphisms, $\mathcal{O}_X\text{-}\mathbf{Mod} \hookrightarrow \mathbf{Sh}(X)$ is a full subcategory which can be shown to have enough injectives [Ive86]. Injective $\mathcal{O}_X$-modules are defined analogously to injective $R$-modules. For this reason, given $\mathcal{F}$ in $\mathcal{O}_X\text{-}\mathbf{Mod}$, we can find an injective resolution $\mathcal{J}^\bullet$ and thus a quasi-isomorphism $\mathcal{F} \xrightarrow{\mathrm{qis}} \mathcal{J}^\bullet$. This gives us a way of computing $H^i(X, \mathcal{F})$. In a similar manner to $R$-modules,
$$H^i(X, \mathcal{F}) \cong H^i(\Gamma(X, \mathcal{J}^\bullet)).$$
The following remarkable theorem gives yet another way to compute sheaf cohomology for constant sheaves corresponding to a ring $R$. Theorem 3.3.50.
Let $R$ be a ring and $\widetilde{R}_M$ the constant sheaf on $(M, C^\infty_M)$. Then there is an isomorphism $H^i(M, \widetilde{R}_M) \cong H^i_{\mathrm{sing}}(M; R)$ with singular cohomology. The idea of the proof is to sheafify the singular cochain complex; once this is done, the result follows nearly immediately. For a full proof, see [Wed16]. This concludes the section on sheaves. Remark 3.3.51.
The main point of sheaves is to facilitate the transfer of local information to global information via gluing. Notice that the axioms for sheaves, and thus everything else in this section, were designed so that, under the right conditions, local sections glue to global ones. This passage from local information to global information is precisely what needs to happen in the olfactory system. We have local actions of granule cells on mitral cells, and these "glue" together to form an action of the entire GC layer. As you can guess, the notion of sheaves will show up to help with the mathematical formulation of this property.
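The gluing axiom described above can be made concrete in a toy computation. The following sketch (all names are ours, not the text's) glues compatible local sections of the "sheaf of functions" on a finite space: sections given on an open cover determine a unique global section exactly when they agree on overlaps.

```python
# Toy illustration of the sheaf gluing axiom on a finite space.

def glue(cover, local_sections):
    """Glue local sections (dicts: point -> value) defined on the sets of
    `cover` into one global section, failing if they disagree on an overlap."""
    glued = {}
    for U, s in zip(cover, local_sections):
        for p in U:
            if p in glued and glued[p] != s[p]:
                raise ValueError("sections disagree on an overlap; cannot glue")
            glued[p] = s[p]
    return glued

# Two overlapping "opens" carrying restrictions of the same global function
U1, U2 = {0, 1, 2}, {2, 3, 4}
s1 = {0: 0.0, 1: 1.0, 2: 4.0}    # p -> p**2 on U1
s2 = {2: 4.0, 3: 9.0, 4: 16.0}   # p -> p**2 on U2; agrees with s1 on the overlap
f = glue([U1, U2], [s1, s2])     # the unique global section p -> p**2
```

Incompatible local data (two sections disagreeing at a shared point) raises an error, mirroring the fact that such data admit no global section.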
We end this chapter (and therefore all of the background material) with a short discussion of de Rham theory for manifolds. This centers on the construction of differential forms on a manifold and the exterior derivative. The main theorem we will prove is de Rham's theorem, which gives an isomorphism of sheaf cohomology with so-called de Rham cohomology. This, combined with Theorem 3.3.50, gives the grand conclusion that singular cohomology on manifolds can be computed via the de Rham complex. The main references here are [Lee12] and [Wed16].

This story begins with the construction of differential $k$-forms on a manifold. Before we can do this, though, we need to define and study smooth functors. These allow us to transform vector bundles and will extend to endofunctors of $\mathrm{Bun}(M)_{<\infty}$. Definition 3.3.52.
Let $F : \mathbf{Vect}_{\mathbb{R}} \to \mathbf{Vect}_{\mathbb{R}}$ be a functor (we will assume covariant, but this is not necessary). We say that $F$ is smooth if the induced map $F^\flat : \mathrm{Hom}(V, W) \to \mathrm{Hom}(F(V), F(W))$ is smooth as a map of smooth manifolds. Example 3.3.53.
Some common smooth functors which play a key role in the theory of smooth manifolds are presented below.

(a) The functor $(-)^{\otimes k}$ is a smooth functor via the Hom-tensor adjunction. Indeed, even taking $T^\bullet(-)$ is smooth. In general, most of the operations on vector spaces are smooth functors. Some care needs to be taken in the case of infinite indexing sets, but we shall ignore these cases.

(b) The functor $\bigwedge^k(-)$ is smooth. This will form the basis for all of de Rham theory. In general, if $F$ arises as a quotient of $\otimes^k$ by some homogeneous ideal (one generated by elements of the same degree), then $F$ is smooth.

(c) The functor $(-)^* := \mathrm{Hom}(-, \mathbb{R})$ is smooth. This follows from the previous example.

(d) If we fix a vector space $W$, then $\mathrm{Hom}(W, -)$ and $\mathrm{Hom}(-, W)$ are smooth functors. This follows from the first and third examples in the case of finite dimensional spaces. For infinite dimensional spaces this is more subtle and less useful.

Now let $M$ be a smooth manifold and $\pi : E \to M$ a vector bundle. If $F$ is a smooth functor, then $F$ admits an extension $\widehat{F} : \mathrm{Bun}(M)_{<\infty} \to \mathrm{Bun}(M)$ by sending $E \mapsto \widehat{F}(E)$, where $\widehat{F}(E)_p = F(E_p)$. If $F$ takes finite dimensional vector spaces to finite dimensional vector spaces, then $\widehat{F}$ lands in $\mathrm{Bun}(M)_{<\infty}$. Example 3.3.54.
Consider the cotangent bundle from before, $T^*M = \coprod_{p \in M} T_p^*M$. Then we realize this as the extension of the dual functor applied fibrewise: $T^*M = \widehat{(-)^*}(TM)$.

To construct differential forms, we need to consider $\bigwedge^k(T^*M)$. This is a smooth vector bundle on $M$ of rank $\binom{\dim M}{k}$. By Theorem 3.3.49, we can associate a finite locally free $C^\infty_M$-module to $T^*M$. What we would like to show is that this associated sheaf is the $\Omega_M$ from before.

We now give a second construction of $\Omega_M$. Definition 3.3.55.
Let $A$ be an $R$-algebra and $B$ an $A$-algebra. Then the module of derivations is $\Omega_{B/A} = \{db : b \in B\}/\!\sim$, where $\sim$ is defined by the relations for derivations as above ($A$-linearity and the Leibniz rule). For $(X, \mathcal{O}_X)$ a ringed space whose structure sheaf is a sheaf of $R$-algebras, we can define $\Omega_X(U) := \Omega_{\mathcal{O}_X(U)/R}$. For a manifold $(M, C^\infty_M)$, we have $\Omega_M(U) = \Omega_{C^\infty_M(U)/\mathbb{R}}$. This is
the cotangent sheaf of $M$, and the tangent sheaf is its dual as a $C^\infty_M$-module. We define differential $k$-forms again, now as sections of the sheaf $\bigwedge^k \Omega_M$. This is the locally free sheaf associated to the $k$-th exterior power of the cotangent bundle on $M$. Remark 3.3.56.
In general, if $(X, \mathcal{O}_X)$ is a locally ringed space, we cannot define $p$-forms as above. This is because $\Omega_X$ need not be an FLF sheaf. To remedy this, we use the canonical morphism $\Omega_X \to \Omega_X^{**}$ and take exterior powers of the double dual. Proposition 3.3.57.
The two constructions of $\Omega_M$ are equivalent.

Proof. This follows from Theorem 3.3.49 and Proposition 3.3.10.

Now, consider $T^*M$ as the vector bundle associated to $\Omega_M$. Then, since $\bigwedge^k T^*M$ is a vector bundle as above, we can sheafify it. As is expected,
$$\bigwedge^k T^*M \;\mapsto\; \Omega^k_M := \bigwedge^k \Omega_M.$$
Definition 3.3.58.
Using the constructions above, the module of differential $k$-forms is the $C^\infty_M(M)$-module
$$\Omega^k(M) := \Gamma\Big(M, \bigwedge^k T^*M\Big).$$
These modules come with a differential $d^k : \Omega^k(M) \to \Omega^{k+1}(M)$ called the exterior derivative. In the greatest generality, if $\omega \in \Omega^k(M)$ and $V_0, \ldots, V_k$ are smooth vector fields on $M$, then
$$d^k\omega(V_0, \ldots, V_k) = \sum_i (-1)^i V_i\big(\omega(V_0, \ldots, \widehat{V_i}, \ldots, V_k)\big) + \sum_{i<j} (-1)^{i+j} \omega\big([V_i, V_j], V_0, \ldots, \widehat{V_i}, \ldots, \widehat{V_j}, \ldots, V_k\big),$$
where $\widehat{V_i}$ means omission.

Lemma 3.3.59. $d^{k+1} \circ d^k = 0$, so $(\Omega^\bullet(M), d)$ is a cochain complex, the de Rham complex associated to $M$. Definition 3.3.60.
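As a sanity check on this formula (a standard computation supplied here for illustration, not part of the original text), take $\omega = x\,dy$ on $\mathbb{R}^2$ and the coordinate fields $V_0 = \partial_x$, $V_1 = \partial_y$:

```latex
d\omega(\partial_x,\partial_y)
  = \partial_x\bigl(\omega(\partial_y)\bigr)
  - \partial_y\bigl(\omega(\partial_x)\bigr)
  - \omega\bigl([\partial_x,\partial_y]\bigr)
  = \partial_x(x) - \partial_y(0) - 0 = 1,
\qquad\text{so}\qquad d(x\,dy) = dx \wedge dy .
```

Since $[\partial_x, \partial_y] = 0$, only the first sum contributes, and the result agrees with the coordinate formula $d(f\,dy) = df \wedge dy$.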
A differential $k$-form $\omega$ is called closed if $d\omega = 0$. It is called exact if $\omega = d\eta$ for some $(k-1)$-form $\eta$. The above lemma tells us that every exact form is closed. Therefore, we can define a cohomology theory for $M$ via this complex as
$$H^i_{\mathrm{DR}}(M) := \ker d^i / \operatorname{Im} d^{i-1}.$$
It then follows immediately that $H^0_{\mathrm{DR}}(M) \cong \mathbb{R}^{\pi_0(M)}$. Example 3.3.61.
For $\mathbb{R}^n$, the differential 1-forms are generated by the formal symbols $dx_i$, where $\{x_i\}$ are the standard coordinates on $\mathbb{R}^n$. For higher degrees, we have that
$$\omega = \sum \alpha_{i_1, \ldots, i_\ell}\, dx_{i_1} \wedge \cdots \wedge dx_{i_\ell}$$
with smooth coefficients $\alpha_{i_1, \ldots, i_\ell}$. Further, every closed $k$-form for $k \geq 1$ is exact (the Poincaré lemma), so $H^i_{\mathrm{DR}}(\mathbb{R}^n) = 0$ for $i \geq 1$ and $H^0_{\mathrm{DR}}(\mathbb{R}^n) = \mathbb{R}$.

Now that we have the notion of de Rham cohomology, we want to know its relation to sheaf cohomology with the corresponding complex of sheaves constructed in the examples above. Theorem 3.3.62.
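By contrast, removing a single point from the plane already yields nontrivial de Rham cohomology. The following standard example (supplied here for illustration) exhibits a closed 1-form on $\mathbb{R}^2 \setminus \{0\}$ that is not exact:

```latex
\omega \;=\; \frac{-y\,dx + x\,dy}{x^2 + y^2},
\qquad d\omega = 0,
\qquad \int_{S^1} \omega \;=\; 2\pi \;\neq\; 0 .
```

If $\omega$ were exact, Stokes' theorem would force its integral over the unit circle to vanish; hence $\omega$ represents a nonzero class and $H^1_{\mathrm{DR}}(\mathbb{R}^2 \setminus \{0\}) \neq 0$, in contrast with the Poincaré lemma on $\mathbb{R}^n$.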
Let $M$ be a $C^\infty$-manifold. Then we have the following isomorphism:
$$H^i_{\mathrm{DR}}(M) \cong H^i(M, \widetilde{\mathbb{R}}),$$
where $\widetilde{\mathbb{R}}$ is the constant sheaf on $M$.

This follows from the constructions above. For more details see [Wed16].

The reason we care about this theorem is that it gives an analytic interpretation of singular cohomology. By Theorem 3.3.50, de Rham cohomology is isomorphic to singular cohomology. Therefore, de Rham cohomology encodes topological information about the manifold. Further, this isomorphism gives another way to compute sheaf cohomology.

This ends the chapter as well as the background material. We encourage the motivated reader to spend some time understanding the final sections here, as they are both technical and widely applicable. They will be useful in understanding Chapter 4, as well as some recent claims of computational neuroscientists on the construction of geometric frameworks for perceptual spaces via homology and cohomology.

Chapter 4

A Geometric Framework for Olfactory Learning and Processing

Abstract
We present a generalized theoretical framework for olfactory representation, learning, and perception using the theory of smooth manifolds and sheaves. This framework enables the simultaneous depiction of sampling-based physical similarity and learning-dependent perceptual similarity, including related perceptual phenomena such as generalization gradients, hierarchical categorical perception, and the speed-accuracy tradeoff. Beginning with the space of all possible instantaneous afferent inputs to the olfactory system, we develop a dynamic model for perceptual learning that culminates in a perceptual space in which qualitatively discrete odor representations are hierarchically constructed, exhibiting statistically appropriate consequential regions ("boundaries") and clear relationships between the broader and narrower identities to which a given stimulus might be assigned. Individual training and experience generate correspondingly more sophisticated odor identification capabilities. Critically, because these idiosyncratic hierarchies are constructed from experience, geometries that fix curvature are insufficient to describe the capabilities of the system. In particular, the use of a hyperbolic geometry to map or describe odor spaces is contraindicated.
The task of sensory systems is to provide organisms with reliable, actionable information about their environments. However, such information is not readily available; the environmental features that are ecologically relevant to an organism are rarely directly evident in primary receptor activation patterns. Rather, these representations of interest must be constructed from the combined signals of populations of sensory receptors. This construction process is mediated by sophisticated networks of neural circuitry that draw out different aspects of potentially important information from the raw input patterns. We previously have proposed that these interactions and transformations can be most effectively modeled as a cascade of successive representations [Cle14], in which each neuronal ensemble constructs its representation by sampling the activity of its antecedents.

The representational cascade that underlies odor recognition and identification is impressively powerful and compact. Olfactory bulb circuits impose an internally generated temporal structure on afferent inputs [LC13a, LC13b, KSUM99, BLFL06] while also regulating contrast [CS06], normalizing neuronal activity levels [CCH+11, BMA+15, CBC20], and managing patterns of synaptic and structural plasticity [CPdLCPL+16, Str09, GS09]. Transient periods of synchronization with postbulbar networks such as piriform cortex are likely to govern interareal communication [Fri15, FBB+16, Kay14], including feedback effects on bulbar plasticity [Str09, GS09]. The resulting perceptual system learns rapidly and is conspicuously resistant to retroactive and compound interference [HE96, SCT07]. Odors of interest also can be readily identified despite direct interference from simultaneously encountered competing odorants; this is a major unsolved problem in olfactory neuroscience, as competition for receptor binding sites by multiple odorant species profoundly degrades the odorant-specific receptor activity profiles on which odor recognition ostensibly depends. We have constructed olfactory circuit models that learn rapidly, resist retroactive interference, and exhibit robust recall under high Bernoulli-Gaussian noise (which models a combination of sampling uncertainty, innate stimulus variance, and high levels of unpredictable competitive interference from other ambient odors) using a strategy of successive recurrent representations shaped by prior learning [IC20]. The success of this approach accentuates the implications of the profound plasticity of the early olfactory system: odor representations, and the basic function of olfactory perception itself, are fundamentally and critically dependent on learning [WS03, WS06, RPS+15, MSN+11, KSS+14, Her05, AK18, AK20, LKA+].

Theoretical frameworks for understanding sensory systems include perceptual spaces and hierarchical structures. Both are founded on metrics of similarity [ZVM+13, ES12, She87, Cla19], though the former presumes an essentially continuous space of some dimensionality into which individual stimulus representations are deployed, whereas the latter presumes some degree of qualitative category membership for each such representation, with intercategory similarities potentially being embedded in the hierarchical proximities among categories. Perceptual spaces can be defined using a variety of metrics, including both physical metrics such as wavelength (color) or frequency (pitch) and perceptual metrics such as those revealed by generalization gradients [She87, CNB09, CMYL02] or by ratings on continuous scales by test subjects. Indeed, study of the transformations between physical and perceptual metric spaces is foundational to understanding sensory systems from this perspective [ZVM+13, Mei15, VRC17]. In contrast, hierarchical structures arise from perceptual categorization processes, though relationships among the resulting categories still may respect underlying similarities in the physical properties of stimuli (see
Discussion). Critically, it is categories that are generally considered to be embedded with associative meaning (categorical perception) [Har87, GH10, AR18]; a useful theoretical framework must concern itself with the construction of these categories with respect to the physical similarity spaces that are sampled during sensory activity. That is, along their representational cascades, sensory systems can be effectively considered to transition from a physical similarity space metric to a perceptually modified space, arising from perceptual learning and within which hierarchical categorical representations can be constructed.

Interestingly, the olfactory modality lacks a clear, organism-independent physical metric such as wavelength or pitch along which the receptive fields of different sensory neuron populations can be deployed (and against which the nonuniform sampling properties of the sensory system can be measured) [Cle14]. However, olfaction does provide an objective basis for an organism-dependent physical similarity space. In this framework, the activity of each odorant receptor type – e.g., each of the ∼400 different odorant receptors of the human nose or the > […] (R-space; see below) are linearly independent of one another, and (2) every possible profile of receptor activation, including any occluding effects of multiple agonists and antagonists competing for common receptors, is interpretable.

Linear independence among the dimensions of R-space is important for analytical purposes, but their orthogonality is irrelevant [Coo15]. This is a vital distinction, not least because orthogonality depends on the statistics of the chemosensory input space and hence cannot be uniquely defined as a property of the olfactory system per se. In principle, each receptor type should have regions of its receptive field that distinguish it from any other single receptor type, such that activation of a given receptor need not always imply activation of a particular different receptor (that is, no two dimensions will be identical). However, within any given sensory world, as defined by a finite set of odorant stimuli with established probabilities of encounter, there will be reliable activity correlations among many pairs of receptor types that can support substantial dimensionality reduction [HWK+ …] signal sparse [BFC17] – but, perhaps more importantly, the process of odor learning itself directly affects perceived olfactory similarity relationships within a context of learned generalization gradients [CNB09, CCH+ …].

In addition to dimensionality, the second fundamental property of a sensory space is its intrinsic geometry [ZVM+ …] odorant representations by perceptual learning into meaningful, cognitive odor representations to which meaning can be ascribed. Key features include the simultaneous depiction of sampling-based physical similarity and learning-dependent perceptual similarity within the perceptual space, a basis for the speed-accuracy tradeoff [FBT+17, RKG06, ZKU+13, ASC+ …] odor representations are hierarchically constructed through experience, exhibiting statistically appropriate consequential regions with probabilistic boundaries that reflect learned generalization gradients [CNB09, CCH+11, She87]. Critically, individual training and experience generate progressively more sophisticated hierarchies and concomitantly superior odor identification capabilities [RPS+ …].

The receptor space (R-space) comprising N receptor types can be depicted as an N-dimensional unit cube. Transformations arising primarily from initial post-sampling computations generate a modified receptor space termed R′; this space inherits the dimensionality of R-space but respects the nonuniform likelihoods of different state points within that space. The subsequent transformation from R′ to S-space ("scent space") reflects the perceptual and categorical learning processes that construct perceptual representations of meaningful odors.

(Diagram 4.1: the representational cascade R →B R′ → S, together with the vector bundle M over R′, its section ξ, the map ∆ from R′ into M, and the action of C∞(Rm).)

Formally, R is a unit parallelepiped defined by primary olfactory receptor activation levels. R′ denotes a subspace of normalized points, following glomerular processing, and is the image B(R). M is a vector bundle over R′ of rank m; ξ denotes the input to mitral cells following glomerular processing, comprising a sparsened, statistically conservative manifold; it is a section of the vector bundle M. S denotes the perceptual space, and is realized as a transformation of R′-space that embeds odor learning.

Importantly, this theoretical model is broadly independent of precisely where in the olfactory representational cascade these computations take place. However, we consider that the map B from R-space to R′-space reflects signal conditioning computations performed within the glomerular layer of the olfactory bulb [Cle14, CBC20], whereas the subsequent transformation into S-space is mediated by computations within the olfactory bulb external plexiform layer network [IC20], inclusive of its reciprocal interactions with deeper olfactory cortices. Briefly, we propose that the construction of categorical odor representations through statistical experience arises from learning-dependent weight changes between mitral cell principal neurons and granule cell interneurons in the external plexiform layer of the olfactory bulb. In this theory, plastic interactions between these two populations construct meaningful, categorical odor representations from the continuous, physical odorant representations of R′-space based upon individual experience.
To construct this theoretical S-space, and attribute to it the capacities of generalization, speed-accuracy tradeoff, and experience-dependent hierarchical categorization, we first build a transitional space M based on mitral cell activity representations, inclusive of the actions performed on these representations via their interactions with granule cell interneurons (Diagram 1). The resulting S-space does not, indeed cannot, admit a single geometry, because of the essential requirement for locally adaptable curvature. We describe this generative process in detail below.

R-Space

The first representational structure in olfaction is directly derived from the ligands of the physical odorant stimulus interacting with the set of chemoreceptive fields presented by the animal's primary odorant receptor complement. Both vertebrate and arthropod olfactory systems are based on large numbers of receptor neurons, each of which expresses one primary odorant receptor out of a family of tens (in Drosophila) to over 1000 (in mice, rats, and dogs). The axons of primary sensory neurons expressing the same receptor converge together to form discrete glomeruli across the surface of the olfactory bulb (in vertebrates; the arthropod analogue is the antennal lobe), enabling second-order projection neurons (mitral cells) to sample selectively from one or a few receptor types. The response of each receptor type to an odor stimulus constitutes a unit vector that can range in magnitude from nonresponsive (0) to maximally activated (1). A complete representational space for instantaneous samples of this input stream consequently has a dimensionality equal to the number of odorant receptor types N. That is, in a species with three odorant receptors, the space containing all possible instantaneous input signals would be a three-dimensional unit parallelepiped (depending on the original placement of the vectors in the space), whereas the R-space of a mouse expressing 1000 receptor types would comprise a 1000-dimensional unit space. As noted above, it is not necessary that these vectors be orthogonal, only that they be linearly independent [Coo15]; indeed, the orthogonality of these vectors cannot even be defined without reference to the statistics of the particular physical environment in which they are deployed.

Formally, R-space is defined as the space of linear combinations of these vectors with coefficients in (0, 1). Consider the space of all possible odorant stimuli in a species expressing N odorant receptor classes. Each odorant stimulus s* corresponds to a unique instantaneous glomerular response profile that can be represented as a vector s* ∈ R^N. Normalizing the activation in each glomerulus enables us to consider s* ∈ ∏_N (0, 1), the unit cube in N dimensions. Denote this receptor activation-based representational space R. Because the tangent space at all points is T_x R ≅ R^N, R has dimension N as a manifold.

By considering a product of spaces, we are assuming that the responses of different glomeruli are orthogonal. In the greatest generality, we would need to consider points on a unit parallelepiped generated by the glomeruli. We can apply an invertible linear transformation (namely the matrix generated by the Gram-Schmidt process) to this parallelepiped to generate a cube (and vice versa); this is a mathematical formalism and does not affect the particulars of the situation. Consequently, for the remaining sections, we can assume without loss of generality that R = ∏_N (0, 1).

R′-Space

The first computational layer of the olfactory bulb – the glomerular layer – computes a number of transformations important for the integrity and utility of odor representations, including contrast enhancement [CS06] and global normalization [CCH+11, BMA+ …] R-space; for example, global feedback normalization in the deep glomerular layer ensures that the points at which most or all of the vectors have very high values will be improbable. The outcome of this transformation is represented as R′, essentially a manifold embedded in R-space.

In addition to the systematically unlikely points in R that are omitted from the manifold R′, it is also the case that, under natural circumstances, most of the possible sensory stimuli s* that could be encountered in R′ actually never will be encountered in an organism's lifetime. That is, odor representations within R-space are signal sparse [BFC17]. Moreover, we argue that odor sources s* are discrete, but inclusive of variance in quality and concentration, and hence constitute volumes (manifolds) within R′. To account for this, we denote this variance by s* = (x, U_x), where x ∈ R′ and U_x denotes an n-tuple of variances (i.e., one variance for each dimension of freedom in R′). That is to say,
U_x = (σ_1, ..., σ_n).
From this we arrive at the following definition:
Definition 4.2.1. A pair (x, U_x) constitutes an odor source volume in R′ if U_x ≠ (0, ..., 0) and (x, U_x) = s* for some odorant s*.

That is, an odor source volume corresponds to a manifold within R′ that comprises the population of odorant stimulus vectors arising from the range of variance in receptor activation patterns exhibited by a particular, potentially meaningful, odor source. This includes variance arising from nonlinearities in concentration tolerance mechanisms that cannot be completely avoided [CCH+11] as well as genuine quality variance across different examples of a source. For example, the odors of oranges vary across cultivars and degrees of ripeness; the odors of red wines vary across grape cultivars, terroir, and production methods. The source representation in R′ thereby corresponds to an odor source (e.g., orange, red wine), inclusive of its variance, and delineates the consequential region of the corresponding odor category that will be developed via perceptual learning. Critically, it is not important at this stage to specify multiple levels of organization within odor sources (e.g., red wine, resolved into Malbec, Cabernet, Montepulciano, etc., then resolved further by producer and season); it is the process of odor learning itself that will progressively construct this hierarchy of representations at a level of sophistication corresponding to individual training and experience.

M-Space

The transformation from R′ to S-space depicted in Diagram 1 is mediated by the interactions of mitral and granule cells. In this framework, mitral cells directly inherit afferent glomerular activity from R′ (Diagram 1, ∆), but their activity also is modified substantially by patterns of granule cell inhibition that, via experience-dependent plasticity, effectively modify mitral cell receptive fields to also incorporate higher-order statistical dependencies sourced from the entire multiglomerular field. (A simplified computational implementation of this constructive plasticity is presented in the learning rules of Imam and Cleland, 2020.)
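Definition 4.2.1 can be sketched in code. In the toy model below, the class name, the use of standard deviations, and the axis-aligned "box" membership rule are our own illustrative assumptions; the text itself only requires that an odor source be a centre x in R′ together with a nonzero variance tuple U_x.

```python
from dataclasses import dataclass

# Hypothetical sketch of an odor source volume (x, U_x) in R'.

@dataclass(frozen=True)
class OdorSource:
    x: tuple   # centre point in R' (normalized receptor activations)
    U: tuple   # per-dimension variances (sigma_1, ..., sigma_n)

    def is_valid(self):
        # U_x must match the dimension of x and be nonzero
        return len(self.U) == len(self.x) and any(s > 0 for s in self.U)

    def contains(self, y, k=2.0):
        # Crude consequential region: within k standard deviations of x
        # in every receptor dimension (an axis-aligned box around x).
        return all(abs(yi - xi) <= k * si ** 0.5
                   for yi, xi, si in zip(y, self.x, self.U))

# e.g. an "orange" source, tightly constrained in receptors 1 and 3
orange = OdorSource(x=(0.7, 0.2, 0.5), U=(0.01, 0.04, 0.01))
```

Samples near the centre fall inside the source volume, while distant receptor profiles fall outside it, which is the behavioral content of a consequential region.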
This is depicted in Diagram 1 as an effect C∞(Rm) of a mitral cell product space M which contributes to the construction of S, in order to highlight the smooth deformations of R′ into S via passage to M.

The effects of mitral cell interactions, arising from experience, are modeled locally as a product space M based on the principle that each glomerulus – corresponding to a receptor type in R′ – directly contributes to the activity of some number of distinct mitral cells. In the mammalian architecture (shared by some insects, including honeybees), mitral cells receive direct afferent input from only a single glomerulus, such that the afferent activity in each mitral cell (or group of sister mitral cells) corresponds directly to a single receptor type. In this "naive" case, M-space is globally a product. To formalize this, we label the glomeruli g_1, ..., g_q. To each, we associate the number of mitral cells to which it projects, denoted m_i ∈ Z_{>0}. Let k = ∑_{i=1}^q m_i. Then, the naive space constructed from these data is
R′ × R^k = {(r, v) : r ∈ R′, v ∈ R^k}.
The interpretation of this space is as follows: to each point in R′, we can associate a vector that is an identifier for how subsequent mitral-granule cell interactions in the olfactory bulb will transform the input in service to identifying it as a known percept. The manifolds associated with particular odor source volumes in R′ will, owing to experience-dependent plasticity, come to exhibit related vectors that, in concert, manifest source-associated consequential regions. These regions reflect categorical perceptual representations and are measurable as odor generalization gradients.
Simplified computational implementations have depicted these acquired representations as fixed-point attractors, tolerant of background interference and sampling error but lacking explicit consequential regions [IC20].

We refer to this space as naive because it is globally a product space only for the mammalian architecture, in which the dimensionality of mitral cell output m (the number of distinct mitral cells, grouping sister mitral cells together) is identical to that of glomerular output k. However, this network architecture is not general; in nonmammalian tetrapods, for example, individual mitral cells may sample from more than one glomerulus [MNS81a, MNS81b]. This introduces a twist into the product space and ruins the naive structure, as m now can be less than k. In this general case where m ≤ k, the mitral cell space becomes a rank m vector bundle R^m ↪ M →π R′ over R′. Nevertheless, it can be depicted locally as a product space because vector bundles are locally trivializable: given any odor source volume (x, U_x), either there exists a subset U′ ⊂ U_x such that π^{-1}(U′) ≅ U′ × R^m, or U′ ⊇ U_x, in which case we can restrict to U_x itself and π^{-1}(U_x) ≅ U_x × R^m is a trivial bundle over the base.

For simplicity, we here analyze the mammalian architecture case. In this architecture, the vector bundle is trivial because m = k; no mitral cells innervate multiple glomeruli, and there is no possible twisting of the fibers. Therefore, in mammals, M is globally a product space,
M = R′ × R^m,
rendering M a smooth manifold with the convenient property that to every input x ∈ R′ we associate a point (x, v), where v is a vector whose i-th component is the value of the output of the i-th mitral cell. Formally, we say that M is a (trivial) vector bundle over R′ with fibre R^m.
Then, the smooth maps which send $x \mapsto (x, v)$, such that composition with projection onto the first coordinate is the identity, are called global smooth sections of the bundle, and the set of these is denoted $\Gamma(R', M)$. To any smooth manifold $P$, we can associate the ring of smooth functions
$$C^\infty(P) = \{f : P \to \mathbb{R} : f \text{ is smooth}\}.$$
To any open subset $U \subseteq P$, we have a restriction map $\mathrm{Res}^P_U : C^\infty(P) \to C^\infty(U)$. In general, if $U \subseteq P$ is open, then $\Gamma(U, E)$ is a $C^\infty(U)$-module for any bundle $\pi : E \to P$. $C^\infty(-)$ makes $P$ into a locally ringed space, and $\Gamma(-, E)$ is a sheaf of $C^\infty(-)$-modules.

$S$-Space

$S$-space, or scent space, is a constructed perceptual space tasked with preserving physical relationships among odorants while also embedding the transformations arising from perceptual learning, specifically including those forming incipient categorical odors. To do this, we embed $R'$ into a higher-dimensional space $S$ (with dimension $N+1$) by growing $U_x$ in the positive $(N+1)$th direction around odor source volumes in $R'$, which does not affect distance relationships in $\mathbb{R}^N$ (Figure 4.1A). (Discrimination training also can grow $U_x$ in the negative $(N+1)$th direction.) To quantify this transformation, we construct two distance metrics, $d_{\mathrm{phys}}$ and $d_{\mathrm{per}}$, on $S$.

Definition 4.2.2.
Let $x, y \in S$ be two points. We define the physical metric between the two points as the Euclidean distance between their projections in $\mathbb{R}^N$. In notation,
$$d_{\mathrm{phys}}(x, y) = \|\pi_{\mathbb{R}^N}(x) - \pi_{\mathbb{R}^N}(y)\|.$$
This metric reflects the physical similarities of the objects in the receptor space, which are not affected by perceptual learning (i.e., by distension in the $(N+1)$th dimension).

Definition 4.2.3.
Let $x, y \in S$. Consider $x$ and $y$ as vectors in $\mathbb{R}^{N+1}$. Then, let $\gamma : [0, 1] \to S$ be the curve defined by $\gamma(0) = x$, $\gamma(1) = y$, and $\pi_{\mathbb{R}^N}(\gamma'(t)) = w \cdot [\pi_{\mathbb{R}^N}(\gamma(1) - \gamma(0))]$, with $w$ some real number dependent on $t$. The perceptual metric,
$$d_{\mathrm{per}}(x, y) = \int_0^1 \|\gamma'(t)\|\, dt,$$
is the arc-length along the surface of $S$ between the points $x$ and $y$ (Figure 4.1A). Notice that $\pi_{\mathbb{R}^N}(\gamma')$ is well defined, as $S \hookrightarrow \mathbb{R}^{N+1}$ and thus the tangent space $T_{\gamma(t)}S \subseteq T_{\gamma(t)}\mathbb{R}^{N+1} = \mathbb{R}^{N+1}$.

The relationship between these two metrics tracks the changes in $S$ induced by the construction of odor representations; specifically, $d_{\mathrm{per}}$ reflects experience-dependent changes in the perceptual distance between $x, y \in S$ that are excluded from the $d_{\mathrm{phys}}$ metric (Figure 4.1A). Learning about an odor source $(x, U_x)$ progressively distends the volume (in $\mathbb{R}^N$) in the $(N+1)$th dimension above $R'$. That is, over time, the breadths (in each of the $N$ dimensions) of the distension into the additional ($(N+1)$th) dimension will come to reflect the actual variances $U_x$ of the odor source $s^* = (x, U_x)$ as naturally encountered. The quasi-discrete distensions formed in the additional dimension correspond to incipient categories – i.e., categorically perceived odors – and their breadths and gradients can be measured behaviorally as generalization gradients [CNB09, CCH+]. Each dimension of the variance $U_x = (\sigma_1, \ldots, \sigma_n)$ in $R'$ is independent; that is, different samples of a given natural odor source may vary substantially in some aspects of quality but not others, where an aspect of quality refers to the relative levels of activation of a given odorant receptor type (Figure 4.1B).

Formally, to construct the perceptual space $S$ in such a way that there exists a perceptual metric $d_{\mathrm{per}}$ that interacts with the natural physical metric $d_{\mathrm{phys}}$ of $R'$, we consider the embedding $R' \hookrightarrow \mathbb{R}^{N+1}$. The open neighborhoods for each odor source volume define open sets in the subspace topology.
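A numerical sketch of the two metrics for $N = 1$ (the landscape functions and all names here are hypothetical, introduced only for illustration): $d_{\mathrm{phys}}$ compares projections to $\mathbb{R}^N$, while $d_{\mathrm{per}}$ approximates the arc length along the graph of a distension; learning increases the latter but not the former.

```python
import math

def d_phys(x, y):
    """Euclidean distance between the projections to R^N (here N = 1)."""
    return abs(x - y)

def d_per(f, x, y, steps=10_000):
    """Polygonal approximation of the arc length of the graph of f between
    x and y: the perceptual metric along the surface of S for N = 1."""
    a, b = min(x, y), max(x, y)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        total += math.hypot(t1 - t0, f(t1) - f(t0))
    return total

flat = lambda t: 0.0                               # unlearned: R' is flat
learned = lambda t: math.exp(-20 * (t - 0.5) ** 2)  # hypothetical distension

print(d_phys(0.0, 1.0))          # 1.0
print(d_per(flat, 0.0, 1.0))     # ~1.0 (no distension: d_per = d_phys)
print(d_per(learned, 0.0, 1.0))  # > 1.0 (distension lengthens d_per only)
```

The same `d_phys` value is returned before and after "learning," which is the sense in which the physical metric is unaffected by plasticity.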
If we embed $R'$ by the canonical inclusion $\mathbb{R}^N \to \mathbb{R}^{N+1}$, then $R'$ is flat in $\mathbb{R}^{N+1}$ because the final coordinate of its elements is 0. Therefore, we can consider transformations of $R'$ that smoothly vary the final coordinate. For each transformation $f$, denote the resulting space as $S := S(f)$. This constitutes the evolving perceptual space.

Figure 4.1: Depictions of $S$-space in the cases of $N = 1, 2$. (A) Three distinct odors in $S$-space in the case of $N = 1$. Going left to right, the first odor is highly learned, with many distinct sub-odors. Further, it is decorated with a distinction of a specific odor and the time axis. Per the discussion below, each red dotted line represents the formation of equivalence classes of odors at a given time. As time increases, specificity increases, and this is reflected in the diagram. The second odor is overall less learned than the first, yet the first two sub-odors are known to be distinct, as shown by the large valley between them. The third odor is poorly learned. (B) After learning has occurred, a valley has been created between the two sub-odor classes in the second odor. As the valley extends below the original line, we know that these two sub-odors are perceptually very different. (C)-(D) Depictions of a part of $S$-space for the case $N = 2$. Various amounts of learning have generated the landscapes presented.

Define the map $\Delta : R' \to S$ as the distension of $R'$ in the $(N+1)$th direction; it involves $M$ and $R'$ simultaneously, and is a diffeomorphism trivially. To better understand the map $\Delta$, we here construct it as the composition of maps among the spaces already described, specifically showing how the (acquired) properties of $M$ govern the mapping of $R'$ to $S$. The map $B : R \to R'$ reflects glomerular-layer transformations as described above. For a fixed smooth section $\xi : R' \to M$ (which always exists by the triviality of $M$), we generate Diagram 2 (an elaboration of Diagram 1),
$$R \xrightarrow{\;B\;} R' \xrightarrow{\;\xi\;} M \xrightarrow{\;\mathrm{id}_{R'} \times f\;} S, \qquad \Delta(f) : R' \to S, \tag{4.2}$$
where $\Delta(f)$ is defined to be the map that makes the diagram commute. Note that $\Delta$ depends on $f$, and, therefore, so does $S$. That is, $S$ depends on the functions $\mathbb{R}^m \to \mathbb{R}$ from $M$, which are smooth. To allow for ongoing plasticity, it is more correct to denote the perceptual space as $S := S(f)$; however, as it will always be clear from context whether or not $f$ is fixed, we will simply refer to it as $S$. The map $\mathrm{id} \times f$ reflects the fact that $R' \subseteq \mathbb{R}^{N+1}$, and by construction $x_{N+1} = 0$ for $x \in R'$. As $M = R' \times \mathbb{R}^m$, it follows that a dense set of maps $M \to \mathbb{R}^{N+1}$ which are the identity on $R'$ can be split as maps $i : R' \to \mathbb{R}^N$ and $f : \mathbb{R}^m \to \mathbb{R}$. Therefore, because $\mathrm{id}_{R'} \times C^\infty(\mathbb{R}^m) = C^\infty(\mathbb{R}^m)$, we abbreviate the collection of all maps $M \to S$ as $C^\infty(\mathbb{R}^m)$, as depicted in Diagram 1.

The outcome of these transformations is a formal definition for the construction of categorical odor representations in $S$:

Definition 4.2.4.
Let $(x, U_x)$ be an odor source volume in $R'$. We denote the image of this volume in $S$ as $(x, \widetilde{U}_x)$. This image denotes an odor representation, also referred to as an odor percept, or simply an odor.

The construction of odor representations $(x, \widetilde{U}_x)$ in $S$ enables the depiction of learning as a geometric object, naturally encompassing the transition between the physical and perceptual space depictions of the olfactory landscape and illustrating the construction of meaningful categorical odor representations based on individual experience. As we describe below, these odor representations admit hierarchy and exhibit the advantages of categorical perception. However, they remain continuous in $S$, with consequential regions that are not discretely delimited; i.e., olfactory perceptual categorization is ultimately heuristic. This affords some powerful advantages. For example, it provides a natural basis for behaviorally observed odor generalization gradients [LH99, CMYL02, CNB09, CCH+], in which the variance $U_x$ of the odor source indicates that different samples fall within a common, relatively broad distribution with shared implications [CCH+], while discrimination learning instead increases the perceptual distance $d_{\mathrm{per}}$ between them. Each of these distinct and specialized modes of learning is considered to transform the plasticity-dependent distensions into dimension $N+1$. Distensions into the $(N+1)$th dimension of $S$-space will be variously persistent, either fading back towards flatness with a given time constant or enduring indefinitely, according to learning-dependent temporal tags that are not explicitly discussed herein.

The geometry of local plasticity
Plasticity in neural systems in general, and in the olfactory bulb in particular, is locally governed. Changes in cellular and synaptic functional properties rely substantially on the synaptic interactions of directly connected neurons and the locally regulated release of neurochemicals. These local effects, coordinated by sophisticated network interactions, collectively generate global systemic performance at the network level. The present odor learning framework also arises from localized plasticity: distensions into the additional ($(N+1)$th) dimension of $S$ arise from learning the activity profiles of individual sensory inputs, and are not globally governed (specifically, we argue that this arises from learned patterns of granule cell feedback onto mitral cells in the olfactory bulb; for a simplified computational implementation of this process, see [IC20]). However, to characterize the functionality of the olfactory system as a whole, it is necessary to formally glue such local plasticity operations together, along with any relevant global processes, within a single analytical framework. To do this, we employ the theory of sheaves [Wed16].

Sheaves enable localized learning
We formally consider the local actions of granule cells onto mitral cells, and their concomitant modification of mitral cell output, as follows, considering that these actions may rely both on afferent sensory information and on additional inputs delivered onto granule cells by piriform cortex and other association cortices [IS98]. Recall from the previous section that for any vector bundle $\pi : E \to P$, we generate $(C^\infty(-), \Gamma(-, E))$, a pair of sheaves on $P$ such that $\Gamma(U, E)$ comes equipped with an action of $C^\infty(U)$ for all open $U \subseteq P$. We here define an analogous pairing of sheaves to describe the modification by granule cells of afferent information contained in the mitral cell ensemble. The first step in this definition is to define a functor
$$\mu : \mathcal{T} \to \mathcal{R}^m,$$
where $\mathcal{T}$ is the category defined by the topology on $R'$, and $\mathcal{R}^m$ is the category whose objects are linear subspaces of $\mathbb{R}^m$ and whose morphisms are
$$\mathrm{Mor}_{\mathcal{R}^m}(U, V) = \begin{cases} \emptyset & U \not\subseteq V \\ \{\ast\} & U \subseteq V. \end{cases}$$
To describe what this functor does, we need to turn to the anatomy of the bulb. For a given odorant, the induced signal passed from glomeruli to mitral cells may not excite some mitral cells. This corresponds to the situation where $\xi(s) = (s, v)$ and $v$ has some coordinates equal to 0. The non-zero coordinates form a basis for some subspace of $\mathbb{R}^m$. Let $n(\xi, s)$ be the number of non-zero coordinates in $v$. Let $O$ be any open subset of $R'$. Then
$$\mu(O) = \mathbb{R}^\ell, \quad \text{where } \ell = \max\{n(\xi, p) : p \in O\}.$$
Composing $\mu$ and $C^\infty$ and using the sheaf condition of $C^\infty$, we conclude that $C^\infty(\mu(-)) \in \mathrm{Sh}(R')$. Now, we define $G(-)$ as a flabby (flasque) sheaf of rings on $R'$ which acts on $C^\infty(\mu(-))$. This action is precisely the interaction of local inhibition on mitral cells, and in particular on those mitral cells that are activated by a given odorant stimulus. This makes $C^\infty(\mu(-))$ a $G$-module (as sheaves).

Localized discrimination learning
Learning about an odor is generally modeled as growing a distension into the additional ($(N+1)$th) dimension of $S$, with the breadths of the distension across its $N$ dimensions ultimately reflecting the physical profile of quality variance $U_x$ associated with the corresponding odor source $s^* = (x, U_x)$. This category-construction framework can be considered common to diverse forms of odor learning (e.g., nonassociative, reinforcement), despite their differences in other properties as noted above. However, explicit discrimination learning – in which animals are rewarded for distinguishing physically similar odorants from one another by associating them with different outcomes – requires that these distensions into the additional dimension also be locally retractable, so as to reduce or eliminate the similarity-based categorical overlap that may exist between the odor source volumes a priori. This is particularly important given the remarkable olfactory discrimination capabilities exhibited by appropriately trained animals [MBB19].

Consider two physically similar odorants $s^* = (x, \widetilde{U}_x)$ and $t^* = (y, \widetilde{U}_y)$ in $S$. Because the early stages of odor learning are characterized by broadened generalization gradients [CNB09], presumably reflecting sampling uncertainty, odor representations (distensions in $S$) at this stage are likely to overlap: $\widetilde{U}_x \cap \widetilde{U}_y \neq \emptyset$. This is appropriate, given the likelihood (prior to discrimination training) that two highly similar odor stimuli, sampled in close succession, simply constitute two samples from the same odor source volume.
However, discrimination training is capable of rapidly and strongly separating highly similar odors, and the between-category separation principle of category learning [PGJSTH19] indicates that we need to move them further apart than they would be prior to learning. Hence, discrimination learning needs to be able not only to retract distensions to zero, but to expand them in the negative direction if need be (see Figure 4.1B).

To do this, we construct a map that decreases only those values of $f$ which are sufficiently close (within some small $\varepsilon > 0$) to a distance-minimizing path $\gamma$ connecting $x$ and $y$. Its existence follows from the existence of smooth bump functions on $M$. Fix $f \in C^\infty(\mathbb{R}^m)$ so that $S = S(f)$. We consider functions $\alpha \in C^\infty(\mathbb{R})$. Then, by defining the learning operation as $S \mapsto S(\alpha \circ f)$, we have a realization of this transformation of learning two odors apart. In fact, what we have done here is defined a $\widetilde{C^\infty(\mathbb{R})}$-module structure on $C^\infty(\mu)$. Therefore, by considering only the interaction of $\alpha$ and $f$ over $\gamma$, we have reduced the problem of discrimination learning to a 1-dimensional problem, depicted in Figure 4.1A-B. The map resulting from discrimination learning lengthens the perceptual metric $d_{\mathrm{per}}$ between two similar odor source volumes, partitioning and expanding the previously shared space between the two representations so as to arbitrarily increase their perceptual dissimilarity, all without altering the physical distance $d_{\mathrm{phys}}$ between their centers. Importantly, discrimination learning inherently depends on at least two odor sources, so it can be targeted even more specifically between them. In high-dimensional space, we can separate two such sources nearly arbitrarily without affecting similarity relationships among other nearby odor representations. This cannot be depicted in our lower-dimensional plots, as the number of dimensions is too small for all of the odors to be essentially independent.

Remark 4.2.5.
Based on the construction above, we can take $\widetilde{C^\infty(\mathbb{R})}$ to be a rough approximation of $G$ as a sheaf. We cannot conclude that they are precisely equal, as this would need more analysis, which we have not presented here.

Putting all of this together, we arrive at the final (for now) version of the model. We now have $R, R', M, G, S$ and can complete the picture of the model (reference Diagram 4.1). The appearance of $G$ and $C^\infty(\mu)$ encodes the local-to-global transformations of granule cells and their interaction with the maps $M \to S$ which preserve $R'$:
$$R \xrightarrow{\;B\;} (R', G, C^\infty(\mu)) \xrightarrow{\;\Delta(f)\;} S, \qquad \xi : R' \to M = R' \times \mathbb{R}^m, \qquad \mathrm{Id}_{R'} \times f : M \to S. \tag{4.3}$$
All together, this diagram encodes everything which we have constructed above and the relations of the various spaces.

4.2.6 The construction of hierarchical odor categories

The last original part of this section is the construction of hierarchical categories from the continuous spaces we have built above. The surprising advantage of the process above is that it gives a geometric interpretation of the speed-accuracy tradeoff for identifying odors in the wild.

Suppose now that we need to identify a given odor. For example, a fox in the wild may be hunting an animal and tracking it by scent, or a human may be trying to discern a specific spice in a dish while at a restaurant. What is the mathematical interpretation of such a situation, and how does the model deal with this interpretation? We first view each peak as a continuous categorization for that stimulus (this is the image of a fully learned system). For instance, we may have a peak defined for "oranges". As we move up the peak, we refine the categorization. Here, refinement means entering a subcategory. From the discussion above, we know that the peak will be parsed into a variety of sub-peaks which correspond to physically similar but perceptually different types of orange.
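The peak-refinement idea can be sketched numerically (a hypothetical sampled landscape for $N = 1$; the function `components` is introduced here purely for illustration): thresholding the landscape at increasing heights splits one coarse category into nested subcategories.

```python
# Sketch (not from the text): hierarchical categorization by thresholding a
# hypothetical N = 1 landscape sampled on a grid. Intersecting with the
# half-space x_{n+1} >= t and taking connected components yields categories;
# raising t refines each category into subcategories.

def components(samples, t):
    """Connected components, as index intervals, of {i : samples[i] >= t}."""
    comps, start = [], None
    for i, v in enumerate(samples):
        if v >= t and start is None:
            start = i
        elif v < t and start is not None:
            comps.append((start, i - 1))
            start = None
    if start is not None:
        comps.append((start, len(samples) - 1))
    return comps

# One peak with two subpeaks separated by a shallow valley.
landscape = [0.0, 0.2, 0.9, 0.5, 0.8, 0.1, 0.0]

coarse = components(landscape, 0.15)   # one broad category: [(1, 4)]
fine = components(landscape, 0.6)      # two subcategories: [(2, 2), (4, 4)]
```

Each fine component lies inside a coarse component, which is the tree (partial-order) structure described below.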
Pictured below is a complex of categories, ordered by inclusion:
Citrus Fruit ⊇ Oranges ⊇ Ripe Oranges ⊇ Ripe Valencia Oranges
Although this example is linearly ordered, there is no need for there to be only one chain of inclusions. Every peak can break up into at most finitely many distinct subpeaks, and thus the decomposition can become arbitrarily complicated. Now we shall construct the categorization by successively taking intersections with an affine hyperplane (see Figure 4.1(A) for an illustration in the case $N = 1$). Suppose $P$ is a peak, determined by some odor source volume $(x, U_x)$, with several subpeaks $\{P_i\}_{i \in I}$. Then, as each sub-peak has a boundary, we can define the minimum value attained in $P_i$. Let $P^*_i \subseteq P_i$ be the subset consisting of all points of $P_i$ with minimal $x_{n+1}$ value. Let $H = \{x \in \mathbb{R}^{n+1} : x_{n+1} = 0\}$ be a hyperplane in $\mathbb{R}^{n+1}$ and define $H_t = H + (0, 0, \ldots, 0, t)$. This is an affine transformation of $H$ and geometrically is the translation of $H$ in the $(n+1)$th direction.

Lemma 4.2.6. $P^*_i = P_i \cap H_t$ for some $t > 0$. Further, if $n \geq 2$, $P^*_i$ is connected.

Proof. Let $t^*$ be the $(n+1)$th coordinate of all elements in $P^*_i$. Then by construction
$$P^*_i \subseteq P_i \cap H_{t^*}.$$
For the reverse inclusion, let $y \in P_i \cap H_{t^*}$. Then $y \in P_i$ and $y_{n+1} = t^*$, and therefore $y \in P^*_i$. Hence, $P^*_i = P_i \cap H_{t^*}$. The connectedness of $P^*_i$ follows immediately from the fact that $P^*_i = \partial P_i$, the boundary, and that $P_i$ is homeomorphic to $D^n$, the $n$-dimensional disk. For $n \geq 2$, $\partial D^n = S^{n-1}$ and is thus connected.

Using this lemma, we can now define a coarse categorization of $S$. Let $t \in (0, 1)$ be arbitrary. By the previous lemma, we know that if we consider $H_t \cap S$ we will get disjoint connected subsets of $S$. So, consider the closed half space $H^*_t = \{x \in \mathbb{R}^{n+1} : x_{n+1} \geq t\}$. Then $\partial H^*_t = H_t$, and $H^*_t \cap S$ is also a collection of disjoint connected subsets of $S$. Let $\{S_{ti}\}_{i \in I_t}$ be an enumeration of these subsets by the set $I_t$. Now let $\mathcal{P}$ be a partition of $(0, 1)$. Then for each $p_j \in \mathcal{P}$ we have the associated collection $\{S_{p_j i}\}$ of subsets. We know by construction that for $j < j'$, $\{S_{p_j i}\} \supseteq \{S_{p_{j'} i}\}$. Therefore, we have built a method to break $S$ into discrete categories, organized in the local structure of a tree. Using this, we arrive immediately at a hierarchical categorization of odors which is solely dependent on the amount of information learned about a class of odors.

The method we have built above prioritizes the construction of a coarse categorization (partial order) from a geometric structure. One may ask if it is possible to proceed in the other direction, that is, to build a geometric structure out of some form of categorization. This approach has been attempted by many researchers, and in every case there is a fundamental assumption made which makes the model unhelpful and, in some cases, invalid. In [ZSS18] they make the claim that the human perceptual odor space (the analogue of $S$) is three dimensional and hyperbolic (constant negative curvature $-1$). Consider $TS^2$, where $S^2$ is the unit sphere in $\mathbb{R}^3$ and $TS^2$ is the tangent bundle. The Hairy Ball Theorem [EG79] tells us that $TS^2$ is not a trivial bundle, and yet we can always locally trivialize a vector bundle. Therefore, the local structure tells you little about the global structure. This is one of the reasons their conclusion was flawed. Their claim also hinged on the computation of some homology groups for certain simplicial complexes generated by "similarity matrices" and showing that the distributions of the ranks of these groups closely match simulation estimates for hyperbolic space. This would have worked, had they not stopped computing the homology in degree 3.
It does not take much thinking to concoct a graph (and thus a simplicial complex) whose homology groups are zero for $n = 1, 2, 3$ and are non-trivial in some higher degree (for example, iterated suspensions of two points yield simplicial complexes which are homotopy equivalent to spheres). This implies that the structure which they are trying to detect may have some higher-dimensional pieces. Simply not considering these (possibly because of the method used in [GPCI15]) leads to a false conclusion. Hence, the conclusion that the olfactory perceptual space is hyperbolic is simply unfounded. More interestingly, should the perceptual space be related to the physical odorant space at all, there is no possible way to have constant curvature! In this situation, when learning occurred, it would be impossible to preserve the physical metric and the perceptual metric simultaneously.

Chapter 5

Future Directions

This chapter will serve to present those ideas which we have not incorporated into the model but believe are useful. Most of these topics are central to any field of mathematics, and thus we should expect them to show up here too. Additionally, we close with a conjectural method to deal with noisy input odors and show its relation to some of the topics introduced in the first few chapters.
The representation theory of Lie groups is a fundamental field of mathematics. So fundamental, in fact, that one would be hard-pressed to find an area of mathematics in which it does not appear in the usual course of study. In this short chapter, we shall study one of the theorems which lie in the intersection of complex analysis and representation theory: the Borel-Weil theorem.
Theorem 5.1.1 (Borel-Weil). Let $K$ be a compact, connected Lie group and $T \subseteq K$ a maximal torus. Let $G = K_{\mathbb{C}}$ be the complexification and $B = MAN$ a Borel subgroup. Then the irreducible finite-dimensional representations of $K$ stand in one-to-one correspondence with the dominant, analytically integral weights $\lambda \in \mathfrak{t}^*$, with the correspondence given by
$$\lambda \mapsto \Gamma_{\mathrm{Hol}}(K/T, L_\lambda) \cong F^{\mathrm{Hol}}_{B, \chi_\lambda},$$
where $\Gamma_{\mathrm{Hol}}(K/T, L_\lambda)$ denotes the set of holomorphic sections of the bundle and
$$F^{\mathrm{Hol}}_{B, \chi_\lambda} = \left\{ f : G \to \mathbb{C} \;\middle|\; f(gb) = \chi_\lambda(b)^{-1} f(g),\ f \text{ holomorphic} \right\}$$
with $\chi_\lambda$ the character of $B$ associated to the analytically integral weight $\lambda$.

This was proven independently by Borel and Weil in [Ser54] and then by Harish-Chandra in [HC56]. The proof we shall present in Section 5.5 is a combination of those presented in [Kna88], [Kna86], and [Hel08]. As will be seen later, this theorem gives a geometric realization of a purely algebraic object and vice versa. Therefore, we may be able to apply similar methods to the model above and arrive at some striking consequences.

Recall that a smooth manifold is a second-countable, Hausdorff topological space equipped with an atlas of (smooth) $C^\infty$-charts $\{\varphi_U : U \to \mathbb{R}^n\}$ which are injective. Morphisms of smooth manifolds are smooth maps which are compatible with the atlases. Putting these two together, we get the category $\mathcal{M}$ of smooth manifolds. There is a functor $C^\infty(-) : \mathcal{M} \to \mathbb{R}\text{-}\mathbf{Alg}$ which assigns to any smooth manifold $M$ an $\mathbb{R}$-algebra $C^\infty(M) := \{f : M \to \mathbb{R} \mid f \text{ smooth}\}$, with addition and multiplication defined pointwise. For each $p \in M$ we can define $C^\infty_{M,p} := \varinjlim_{U \ni p} C^\infty(U)$.

Let $M$ be a manifold and $p \in M$ a point. We define the tangent space at $p$ to be $T_pM := \mathrm{Der}(C^\infty_{M,p})$. This becomes an $\mathbb{R}$-vector space if we equip it with addition. The collection of all tangent spaces is called the tangent bundle and is denoted $TM$.
This admits a smooth structure and becomes a smooth manifold of dimension $2\dim M$. The elements of $TM$ can be given as pairs $(p, v)$ where $p \in M$ and $v \in T_pM$. There is a canonical projection $\pi_M : TM \to M$, which is a smooth surjective submersion. A manifold is called parallelizable if $TM = M \times \mathbb{R}^n$ for $n = \dim M$.

A section of the canonical projection is a smooth map $f : M \to TM$ such that $\pi_M \circ f = \mathrm{id}_M$. The set of all smooth sections is denoted $\Gamma(M, TM)$ or $\mathfrak{X}(M)$ and can be identified with the collection of smooth vector fields on $M$. This has a natural structure as a $C^\infty(M)$-module.

Definition 5.1.2. A Lie Group is a group object in the category $\mathcal{M}$. More explicitly, it is a smooth manifold $G$ equipped with two operations: multiplication $G \times G \to G$, which is smooth, and inversion $(-)^{-1} : G \to G$, which is also smooth. A Lie group homomorphism is a smooth map which respects the group structure.

Let $G$ be a Lie group and $x \in G$. Then $x$ defines a smooth automorphism $L_x : G \to G$ such that $L_x(y) = xy$. An element $q \in \Gamma(G, TG)$ is called left-invariant if for all $x, y \in G$ we have
$$T_yL_x(q_y) = q_{xy},$$
where $T_yL_x : T_yG \to T_{xy}G$ is the tangent map. The space of all left-invariant vector fields on $G$ will be denoted $\mathfrak{X}_L(G)$.

Definition 5.1.3. A Lie Algebra is a vector space $V$ equipped with an alternating, bilinear form $[-,-] : V \times V \to V$, called the Lie bracket, satisfying the Jacobi identity
$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0.$$
A Lie algebra homomorphism is a linear map $T : \mathfrak{g} \to \mathfrak{h}$ such that $T([X, Y]) = [T(X), T(Y)]$, where the first bracket is in $\mathfrak{g}$ and the second is taken in $\mathfrak{h}$.

Lemma 5.1.4.
The map $(-)_e : \mathfrak{X}_L(G) \to T_e(G)$ is a vector space isomorphism. Further, we may endow $\mathfrak{X}_L(G)$ with the operation $[X, Y] = XY - YX$; this makes $\mathfrak{X}_L(G)$ a Lie algebra over $\mathbb{R}$. Further, $(-)_e$ respects the bracket operation and gives $T_e(G)$ the structure of a Lie algebra.

Proof. The map has an inverse sending $X \in T_e(G)$ to the left-invariant vector field defined by $Xf(x) = X(L_{x^{-1}}f)$, where $L_{x^{-1}}f(y) = f(xy)$. The fact that this map respects the Lie bracket is obvious.

Corollary 5.1.5. $TG \cong G \times T_e(G)$.

Proof.
Every basis of $T_e(G)$ extends to a set of global left-invariant vector fields, and hence $G$ is parallelizable.

For the remainder of this section we shall denote Lie algebras by the corresponding lower-case gothic letters; that is, if $G$ is a Lie group, then its Lie algebra is $\mathfrak{g}$.

Example 5.1.6.
(a) Let $G = \mathbb{R}^n$ together with addition. This is a Lie group with Lie algebra $\mathbb{R}^n$. In general, any finite-dimensional real vector space is non-canonically isomorphic to $\mathbb{R}^n$ for some $n$ and therefore carries a smooth manifold structure, and therefore a Lie group structure.
(b) Recall that a matrix is invertible if $\det X \neq 0$. Then $GL_n(\mathbb{R})$ (resp. $\mathbb{C}$) is the collection of all invertible $n \times n$ matrices with entries in $\mathbb{R}$ (resp. $\mathbb{C}$). It is called the General Linear group. This is an open subset of $M_n(\mathbb{R})$ (resp. $\mathbb{C}$) and therefore carries an obvious manifold structure. In fact, matrix multiplication and matrix inversion are smooth operations. This makes $GL_n(\mathbb{R})$ (resp. $\mathbb{C}$) a real Lie group of dimension $n^2$ (resp. $2n^2$). Its Lie algebra is $\mathfrak{gl}_n(\mathbb{R}) = M_n(\mathbb{R})$ (resp. $\mathbb{C}$).
(c) Define the operation $(-)^* : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ by $X \mapsto \overline{X}^T$. The matrix $X^*$ is called the adjoint matrix to $X$. Let $U(n) \subseteq GL_n(\mathbb{C})$ be the set of matrices such that $X^*X = I_n$, the $n \times n$ identity matrix. This is the Unitary group and is a closed subgroup of $GL_n(\mathbb{C})$, and thus inherits a Lie group structure. To find its dimension we pass to the Lie algebra $\mathfrak{u}(n)$. An easy computation shows that $\mathfrak{u}(n)$ consists of all skew-hermitian matrices ($X^* = -X$), and thus $\dim \mathfrak{u}(n) = n^2$. Further, $U(n)$ is a real Lie group. To see this, see what happens when we take $i\mathfrak{u}(n)$.
(d) Let $S^1$ be the circle embedded as a submanifold of $\mathbb{C}$. Then $S^1$ carries a Lie group structure by writing its entries in polar coordinates. Define the Torus $T^n = \prod^n S^1$. This carries a natural Lie group structure under component-wise multiplication. Its Lie algebra is $i\mathbb{R}^n$.

Lie algebras are significantly easier to deal with than Lie groups because they are essentially generalized vector spaces. Therefore, we want to understand the structure of various types of Lie algebras so that we may possibly deduce some information about the associated Lie group.
Definition 5.1.7. A Lie subalgebra (normally shortened to simply subalgebra) of a Lie algebra $\mathfrak{g}$ is a vector subspace $\mathfrak{h}$ such that $[\mathfrak{h}, \mathfrak{h}] \subseteq \mathfrak{h}$, where the bracket of Lie algebras is shorthand for the set of all $[X, Y]$. An ideal of $\mathfrak{g}$ is a subset $\mathfrak{i}$ such that $[\mathfrak{g}, \mathfrak{i}] \subseteq \mathfrak{i}$. A subalgebra $\mathfrak{a}$ is called abelian if $[\mathfrak{a}, \mathfrak{a}] = 0$.

We will denote ideals of $\mathfrak{g}$ as $\mathfrak{i} \trianglelefteq \mathfrak{g}$ and subalgebras as $\mathfrak{h} \subseteq \mathfrak{g}$. Notice that $[\mathfrak{i}, \mathfrak{g}] \subseteq \mathfrak{i}$ is equivalent to the definition given above, as this amounts to putting a negative sign everywhere, but $-\mathfrak{i} = \mathfrak{i}$.

Proposition 5.1.8. If $\mathfrak{g}$ is a Lie algebra and $\mathfrak{i}$ is an ideal, then $\mathfrak{g}/\mathfrak{i}$ has the structure of a Lie algebra.

Proof. As a set, $\mathfrak{g}/\mathfrak{i}$ is simply the vector space quotient. To show that the Lie bracket descends to the quotient, we consider two classes $X + \mathfrak{i}, Y + \mathfrak{i} \in \mathfrak{g}/\mathfrak{i}$. Then
$$[X + \mathfrak{i}, Y + \mathfrak{i}] = [X, Y] + \mathfrak{i}$$
by the bilinearity of the bracket. It then follows immediately that this bracket satisfies the Jacobi identity. Hence, $\mathfrak{g}/\mathfrak{i}$ is a Lie algebra.

Similar to the case of ideals of a ring, it can be shown (quite easily) that any ideal can be realized as the kernel of a Lie algebra homomorphism, namely $\varphi : \mathfrak{g} \to \mathfrak{g}/\mathfrak{i}$.

Definition 5.1.9.
A Lie algebra $\mathfrak{g}$ is simple if it has no non-zero proper ideals. It is semisimple if it has no non-zero solvable ideals. We say that a Lie group $G$ is semisimple (resp. simple) if $\mathfrak{g}$ is semisimple (resp. simple).

A fact which we will not prove is that all semisimple Lie algebras have no center, and therefore all semisimple Lie groups have a 0-dimensional center. Further, one can prove (say, by Cartan's criterion for semisimplicity) that all semisimple Lie algebras can be realized as a direct sum of simple Lie algebras [Kna96, Chapter 1].

Semisimple Lie groups are of interest to many areas of mathematics and are fairly well understood. The small piece of the theory of Lie groups that we need for the rest of this section is the representation theory of semisimple Lie groups and Lie algebras. Before we get into this, we want to understand where representation theory comes from in the first place. Why might we care about representations? Suppose $G$ is a finite group (not assumed to be of Lie type) and let $G$ act on a set $X$. Denote by $F(X)$ the set of all complex-valued functions on $X$. Then $F(X)$ is naturally a $\mathbb{C}$-vector space under pointwise addition and scalar multiplication. We can extend the action of $G$ on $X$ to an action on all of $F(X)$ by
$$(g \cdot f)(x) = f(g^{-1} \cdot x).$$
This representation will break up into a direct sum of irreducible representations of $G$ with some multiplicities (by Maschke's Theorem). Precisely how this representation breaks up tells us something about the structure of $X$. In particular, if we put some conditions on the functions (that they are all $L^2$, for instance) then we can better understand $X$ and its symmetries. This has a similar flavour to understanding $\mathrm{Aut}(X)$ for $X$ in an arbitrary category.

Definition 5.1.10.
Let $\mathfrak{g}$ be a Lie algebra over an arbitrary field. The commutator series for $\mathfrak{g}$ is defined by $\mathfrak{g}^1 = [\mathfrak{g}, \mathfrak{g}]$ and $\mathfrak{g}^{n+1} = [\mathfrak{g}^n, \mathfrak{g}^n]$. We get a chain of Lie subalgebras
$$\mathfrak{g} = \mathfrak{g}^0 \supseteq \mathfrak{g}^1 \supseteq \mathfrak{g}^2 \supseteq \ldots$$
We say that $\mathfrak{g}$ is solvable if $\mathfrak{g}^n = 0$ for some $n$.

Definition 5.1.11. Let $\mathfrak{g}$ be a Lie algebra over an arbitrary field. The lower central series for $\mathfrak{g}$ is defined by $\mathfrak{g}_1 = [\mathfrak{g}, \mathfrak{g}]$ and $\mathfrak{g}_{n+1} = [\mathfrak{g}, \mathfrak{g}_n]$. We get a chain of ideals
$$\mathfrak{g} = \mathfrak{g}_0 \supseteq \mathfrak{g}_1 \supseteq \mathfrak{g}_2 \supseteq \ldots$$
We say that $\mathfrak{g}$ is nilpotent if $\mathfrak{g}_n = 0$ for some $n$.

Corollary 5.1.12. If $\mathfrak{g}$ is nilpotent then it is solvable.

Lemma 5.1.13.
Every subalgebra of a solvable (resp. nilpotent) Lie algebra is solvable (resp. nilpotent).

Proof. Clearly, for each subalgebra $\mathfrak{h} \subseteq \mathfrak{g}$ the commutator series satisfies $\mathfrak{h}^n \subseteq \mathfrak{g}^n$, and likewise for the lower central series.

Theorem 5.1.14 (Lie's Theorem). Let $\mathfrak{g}$ be a complex solvable Lie algebra and $(\pi, V)$ a representation. Then there exists a simultaneous eigenvector for all elements of $\pi(\mathfrak{g})$.

This implies, for instance, that all elements of $\pi(\mathfrak{g})$ act by upper triangular matrices in a suitable basis of any $\pi(\mathfrak{g})$-invariant subspace, with the diagonal entries being the generalized eigenvalues of the matrices. For proofs of this theorem see [Kna86] or [Bum13]. For this entire section, all statements not proven are presented in [Kna05a] with incredible detail.
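Lie's theorem can be checked by hand in a tiny case. The sketch below (our own construction, not the text's: the names `H`, `E`, and `bracket` are ours) takes the two-dimensional solvable subalgebra of $\mathfrak{gl}_2(\mathbb{C})$ spanned by $\operatorname{diag}(1, -1)$ and the elementary matrix $E_{12}$, verifies closure under the bracket and solvability of the derived series, and exhibits a simultaneous eigenvector:

```python
# Numerical sketch of Lie's theorem for a 2-dimensional solvable subalgebra
# b = span{H, E} of gl(2, C); names H, E, bracket are our own choices.
import numpy as np

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])

# b is a subalgebra: [H, E] = 2E lies back in b.
assert np.allclose(bracket(H, E), 2 * E)

# b is solvable: b^1 = [b, b] = span{E}, and b^2 = [b^1, b^1] = 0.
assert np.allclose(bracket(E, E), 0)

# Lie's theorem predicts a simultaneous eigenvector; here v = (1, 0)^T works:
v = np.array([1.0, 0.0])
assert np.allclose(H @ v, 1.0 * v)   # eigenvalue 1 for H
assert np.allclose(E @ v, 0.0 * v)   # eigenvalue 0 for E
print("simultaneous eigenvector found")
```

Both matrices are upper triangular in this basis, as the remark after the theorem predicts.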
Definition 5.1.15.
Let $\mathfrak{g}$ be a Lie algebra and $(\pi, V)$ a representation. For $\alpha \in \mathfrak{g}^*$ put
$$V_\alpha = \{ v \in V : (\pi(H) - \alpha(H))^n v = 0 \ \forall H \in \mathfrak{g}, \ n = n(v, H) \}$$
If $V_\alpha \neq 0$, then $V_\alpha$ is called a generalized weight space and $\alpha$ a weight. We will denote the set of weights by $\Lambda(\mathfrak{g}, \pi)$. If $V$ is finite dimensional then $\pi(H) - \alpha(H)$ is nilpotent on $V_\alpha$ via the theory of Jordan normal forms. Therefore, we may assume that $n(v, H) = \dim V$. In this case, we would like to somehow deduce information about $\pi$ from the generalized weight spaces.

Theorem 5.1.16.
Let $\mathfrak{h}$ be a nilpotent Lie algebra and $(\pi, V)$ a finite dimensional complex representation. Then there are finitely many generalized weights of $\pi$. Further, each generalized weight space is stable under $\pi(\mathfrak{h})$ and $V = \bigoplus_{\alpha \in \Lambda(\mathfrak{h}, \pi)} V_\alpha$.

Proof.
We first prove that $V_\alpha$ is invariant under $\pi(\mathfrak{h})$. Fix $H \in \mathfrak{h}$ and put
$$V_{\alpha, H} = \{ v \in V : (\pi(H) - \alpha(H))^n v = 0, \ n = n(v) \}$$
By construction $V_\alpha = \bigcap_{H \in \mathfrak{h}} V_{\alpha, H}$, so it suffices to prove that each $V_{\alpha, H}$ is $\pi(\mathfrak{h})$-invariant. Now, as $\mathfrak{h}$ is nilpotent, $\operatorname{ad} H$ is nilpotent for all $H$. Put $\mathfrak{h}(m) = \{ Y \in \mathfrak{h} : (\operatorname{ad} H)^m Y = 0 \}$ so that $\mathfrak{h} = \bigcup_{m=0}^{\dim \mathfrak{h}} \mathfrak{h}(m)$. We prove that $\pi(Y) V_{\alpha, H} \subseteq V_{\alpha, H}$ for $Y \in \mathfrak{h}(m)$ by induction on $m$. For the case $m = 0$ we have $\mathfrak{h}(0) = 0$ and there is nothing to prove. Therefore, assume the claim holds for all $Z \in \mathfrak{h}(m-1)$. If $Y \in \mathfrak{h}(m)$, then $[H, Y] \in \mathfrak{h}(m-1)$ by construction. Therefore,
$$(\pi(H) - \alpha(H)) \pi(Y) = \pi(Y)(\pi(H) - \alpha(H)) + \pi([H, Y])$$
and
$$(\pi(H) - \alpha(H))^2 \pi(Y) = \pi(Y)(\pi(H) - \alpha(H))^2 + (\pi(H) - \alpha(H))\pi([H, Y]) + \pi([H, Y])(\pi(H) - \alpha(H))$$
Iterating this computation, we get the general formula
$$(\pi(H) - \alpha(H))^{\ell} \pi(Y) = \pi(Y)(\pi(H) - \alpha(H))^{\ell} + \sum_{s=0}^{\ell - 1} (\pi(H) - \alpha(H))^{\ell - 1 - s} \pi([H, Y]) (\pi(H) - \alpha(H))^{s}$$
For $v \in V_{\alpha, H}$, we know that $(\pi(H) - \alpha(H))^N v = 0$ for $N \geq \dim V$. Take $\ell = 2N$ in the above expression and apply it to $v$. The only terms which could survive are those for which $s < N$. In that case $\ell - 1 - s \geq N$; since $(\pi(H) - \alpha(H))^s v \in V_{\alpha, H}$ and $\pi([H, Y])$ preserves $V_{\alpha, H}$ by the induction hypothesis, each such term
$$(\pi(H) - \alpha(H))^{\ell - 1 - s} \pi([H, Y]) (\pi(H) - \alpha(H))^{s} v = 0$$
as well. Hence $(\pi(H) - \alpha(H))^{2N} \pi(Y) v = 0$ and $V_{\alpha, H}$ is stable under $\pi(Y)$. This completes the induction, and $V_\alpha$ is invariant under $\pi(\mathfrak{h})$.

Now we can obtain the decomposition. Let $H_1, \dots, H_d$ be a basis for $\mathfrak{h}$. The Jordan decomposition for $\pi(H_1)$ gives a generalized eigenspace decomposition that we can write as $V = \bigoplus_{\lambda} V_{\lambda, H_1}$. We can regard the complex numbers $\lambda$ as running over all values of $\alpha(H_1)$ for $\alpha \in \mathfrak{h}^*$ arbitrary. Therefore, we can re-write the decomposition as $V = \bigoplus_{\alpha(H_1)} V_{\alpha(H_1), H_1}$. However, $V_{\alpha(H_1), H_1} = V_{\alpha, H_1}$, which we defined at the beginning of the proof, so each of these spaces is stable under $\pi(\mathfrak{h})$. Therefore, we can further decompose under $\pi(H_2)$ to get $V = \bigoplus_{\alpha(H_1)} \bigoplus_{\alpha(H_2)} (V_{\alpha, H_1} \cap V_{\alpha, H_2})$ and, continuing through the basis of $\mathfrak{h}$,
$$V = \bigoplus_{\alpha(H_1), \dots, \alpha(H_d)} \bigcap_{j=1}^{d} V_{\alpha, H_j}$$
with each of these spaces $\pi(\mathfrak{h})$-invariant.
By Lie's theorem, each $\pi(H_i)$ acts simultaneously by upper-triangular matrices on $\bigcap_{i=1}^d V_{\alpha, H_i}$ with diagonal entries evidently $\alpha(H_i)$. Then $\pi(\sum c_i H_i)$ acts with generalized eigenvalue $\sum c_i \alpha(H_i)$. Thus, if we define $\alpha(\sum c_i H_i) = \sum c_i \alpha(H_i)$, we see that $\bigcap_{i=1}^d V_{\alpha, H_i} = V_\alpha$ and $V = \bigoplus V_\alpha$. In particular there are only finitely many $\alpha$ which satisfy this property. This completes the proof.

Now let $\mathfrak{g}$ be a semisimple Lie algebra and $\mathfrak{h}$ a nilpotent subalgebra. Let $\mathfrak{h}^*$ denote its dual space. Then for all $\lambda \in \mathfrak{h}^*$, define
$$\mathfrak{g}_\lambda = \{ X \in \mathfrak{g} : (\operatorname{ad} H - \lambda(H))^n X = 0 \ \forall H \in \mathfrak{h}, \ n = n(X, H) \}$$
As $\mathfrak{h}$ is nilpotent, we know that $\mathfrak{g} = \bigoplus_{\lambda \in \mathfrak{h}^*} \mathfrak{g}_\lambda$. Further, there exist only finitely many $\lambda$ such that $\mathfrak{g}_\lambda$ is non-zero. Let $\Delta(\mathfrak{g}, \mathfrak{h})$ be the set of weights.

Proposition 5.1.17.
In the setting above:
(a) $\mathfrak{g} = \bigoplus_{\alpha \in \Delta(\mathfrak{g}, \mathfrak{h})} \mathfrak{g}_\alpha$
(b) $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha + \beta}$ (this space is understood to be $0$ if $\alpha + \beta \notin \Delta(\mathfrak{g}, \mathfrak{h})$)
(c) $\mathfrak{h} \subseteq \mathfrak{g}_0$

Proof.
This all follows from the previous theorem by replacing V with g . Definition 5.1.18.
A nilpotent Lie subalgebra $\mathfrak{h}$ is a Cartan subalgebra if $\mathfrak{h} = \mathfrak{g}_0$.

This definition is in general hard to check. Therefore, we would like an equivalent way of defining Cartan subalgebras so that this condition is not too abstract.

Proposition 5.1.19.
Let $\mathfrak{g}$ be a Lie algebra and $\mathfrak{h}$ a nilpotent subalgebra. Then $\mathfrak{h}$ is a Cartan subalgebra if and only if $N_{\mathfrak{g}}(\mathfrak{h}) = \mathfrak{h}$. Here $N_{\mathfrak{g}}(\mathfrak{h}) = \{ X \in \mathfrak{g} : [X, \mathfrak{h}] \subseteq \mathfrak{h} \}$ is the normalizer of $\mathfrak{h}$.

Proof.
See [Kna05a]
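Proposition 5.1.19 is easy to see concretely for $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ with $\mathfrak{h}$ the diagonal subalgebra. The following sketch (our own construction, not from the text) checks which standard basis elements normalize $\mathfrak{h}$:

```python
# Numerical check (ours) that N_g(h) = h for g = sl(2, C) and h = span{H}.
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])

def bracket(X, Y):
    return X @ Y - Y @ X

def in_h(X):
    """A matrix lies in h = span{H} iff it is diagonal and traceless."""
    return np.allclose(X, np.diag(np.diag(X))) and abs(np.trace(X)) < 1e-12

# H normalizes h, while [E, H] = -2E and [F, H] = 2F leave h:
normalizers = [name for name, X in [("H", H), ("E", E), ("F", F)]
               if in_h(bracket(X, H))]
print(normalizers)  # only "H" survives, consistent with N_g(h) = h
```

Since no combination of $E$ and $F$ can bracket $\mathfrak{h}$ back into $\mathfrak{h}$, the diagonal subalgebra is its own normalizer and hence a Cartan subalgebra.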
Theorem 5.1.20.
Let $\mathfrak{g}$ be a complex finite-dimensional Lie algebra. Then there exists a Cartan subalgebra $\mathfrak{h} \subseteq \mathfrak{g}$. Further, any two Cartan subalgebras are conjugate.

Proof.
See [Kna86], [Kna05a], and [Hel78] for separate proofs of this theorem.

For the remainder of this section, we shall only give sketches of the proofs for the big theorems as there are much more important topics to cover. For a full treatment see [Lor18, Chapter 7].

Definition 5.1.21.
Let $\mathfrak{g}$ be a complex semisimple Lie algebra and $\mathfrak{h}$ a Cartan subalgebra. We call the weights of the adjoint representation of $\mathfrak{h}$ on $\mathfrak{g}$ roots. The decomposition
$$\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta(\mathfrak{g}, \mathfrak{h})} \mathfrak{g}_\alpha$$
is called the root space decomposition. We want to understand $\Delta(\mathfrak{g}, \mathfrak{h})$.

Proposition 5.1.22.
Consider the situation above, with $B$ the Killing form.
(a) If $\alpha, \beta \in \Delta \cup \{0\}$ and $\alpha + \beta \neq 0$ then $B(\mathfrak{g}_\alpha, \mathfrak{g}_\beta) = 0$.
(b) If $\alpha \in \Delta \cup \{0\}$, then $B$ is non-singular on $\mathfrak{g}_\alpha \times \mathfrak{g}_{-\alpha}$.
(c) If $\alpha \in \Delta$ then $-\alpha \in \Delta$.
(d) $B|_{\mathfrak{h} \times \mathfrak{h}}$ is non-degenerate, and thus for each $\alpha$ there exists $H_\alpha$ so that $B(H_\alpha, H) = \alpha(H)$ for all $H \in \mathfrak{h}$.
(e) $\Delta$ spans $\mathfrak{h}^*$.

Proof.
See [Kna05a, Chapter 2].

The following proposition sharpens the root space decomposition nicely.
Proposition 5.1.23. If $\alpha \in \Delta$, then $\dim \mathfrak{g}_\alpha = 1$. Further, $n\alpha \notin \Delta$ for $n \geq 2$.

Proof.
See [Kna05a, Chapter 2].

All of this together shows that $\Delta(\mathfrak{g}, \mathfrak{h})$ is an abstract, reduced root system. We can thus define a notion of positivity.

Definition 5.1.24.
Let $V$ be a finite dimensional inner product space. Fix a spanning set $\varphi_1, \dots, \varphi_m$. Then a vector $\varphi$ is positive (denoted $\varphi > 0$) if there exists an integer $k \geq 1$ such that $\langle \varphi, \varphi_i \rangle = 0$ for $1 \leq i \leq k - 1$ and $\langle \varphi, \varphi_k \rangle > 0$.

Lemma 5.1.25. If $\varphi \in \Delta$, then exactly one of $\varphi$ or $-\varphi$ is positive.

Proof. See [Lor18, Chapter 7].
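Definition 5.1.24 and Lemma 5.1.25 can be tested concretely on the $A_2$ root system in $\mathbb{R}^2$. In the sketch below (our own; the spanning vectors `phis` are our choice, and different choices give different positive systems) exactly three of the six roots come out positive:

```python
# Lexicographic positivity (Definition 5.1.24) for the A2 root system; the
# roots a1, a2 and the spanning set phis are our own choices.
import numpy as np

a1 = np.array([1.0, 0.0])
a2 = np.array([-0.5, np.sqrt(3) / 2])
roots = [a1, a2, a1 + a2, -a1, -a2, -(a1 + a2)]

phis = [np.array([0.5, np.sqrt(3) / 2]), np.array([1.0, 0.0])]

def positive(phi, tol=1e-12):
    """phi > 0 iff the first non-zero inner product <phi, phi_i> is positive."""
    for p in phis:
        ip = float(np.dot(phi, p))
        if abs(ip) > tol:
            return ip > 0
    return False  # the zero vector is not positive

pos = [r for r in roots if positive(r)]
print(len(pos))  # 3 of the 6 roots are positive

# Lemma 5.1.25: exactly one of phi, -phi is positive for every root.
assert all(positive(r) != positive(-r) for r in roots)
```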
Definition 5.1.26. A basis $\Pi$ for $\Delta$ is a choice of elements such that:
(a) $\Pi$ is a basis of $\mathfrak{h}^*$.
(b) For any $\beta \in \Delta$, we can write $\beta = \sum n_i \alpha_i$ with $\alpha_i \in \Pi$ and the $n_i \in \mathbb{Z}$ all positive or all negative (by Lemma 5.1.25).
We call the elements of $\Pi$ simple, and normally say "choose a simple system" for $\Delta$.

Definition 5.1.27. Let $\alpha, \beta \in \mathfrak{h}^*$. With respect to the inner product $\langle \cdot, \cdot \rangle$ on $\mathfrak{h}^*$, put
$$(\alpha, \beta) = \frac{2 \langle \alpha, \beta \rangle}{\langle \beta, \beta \rangle} = \frac{2 \|\alpha\|}{\|\beta\|} \cos \theta$$
where $\theta$ is the angle between the functionals. Then the reflection of $\beta$ by $\alpha$, denoted $s_\alpha \beta$, is defined by
$$s_\alpha \beta = \beta - (\beta, \alpha) \alpha$$
The
Weyl group is $W(\mathfrak{g}) := \langle s_\alpha : \alpha \in \Delta \rangle$.

Theorem 5.1.28. $W(\mathfrak{g})$ acts transitively on the set of simple systems for $\Delta$.

Proof.
See [Kna05a, Chapter 2, Section 6].

This final theorem eases the concern that picking positive elements is arbitrary and could possibly lead to different results. Now put $\mathfrak{h}^\circ = \mathfrak{h}^* \setminus \bigcup_{\alpha \in \Delta} \alpha^\perp$. The connected components of $\mathfrak{h}^\circ$ are called Weyl chambers, and given a choice of simple system $\Pi$ there is a natural choice of Weyl chamber associated to $\Pi$, called the positive Weyl chamber:
$$C(\Pi) = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) > 0 \ \forall \beta \in \Delta^+ \} = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) > 0 \ \forall \beta \in \Pi \}$$
Associated to any $\Delta(\mathfrak{g}, \mathfrak{h})$ is a lattice $\Lambda = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) \in \mathbb{Z} \ \forall \beta \in \Delta \}$. This is the weight lattice associated to $\Delta$.

Definition 5.1.29.
An element $\alpha \in \mathfrak{h}^*$ is called dominant and algebraically integral if $\alpha \in \Lambda \cap C(\Pi)$.

Lie algebras are easier to deal with than Lie groups, but still the fact that they are non-associative makes the situation a bit difficult. What we would like is to find an associative algebra $A$ such that the representation theory of $\mathfrak{g}$ is the same as the representation theory of $A$ in some semi-canonical sense. As a first guess, we could take the tensor algebra. Let $\mathfrak{g}$ be a complex Lie algebra assumed to be finite dimensional (this construction works for the infinite dimensional case as well), and let $T^\bullet(\mathfrak{g}) = \bigoplus_{k \geq 0} \mathfrak{g}^{\otimes k}$ denote the tensor algebra of $\mathfrak{g}$. This does not force the resulting map $A \to \operatorname{End}(V)$ to be a Lie algebra homomorphism and thus is not the correct choice. Therefore, let
$$U(\mathfrak{g}) = T^\bullet(\mathfrak{g}) / \langle X \otimes Y - Y \otimes X - [X, Y] \rangle$$
with $X, Y \in \mathfrak{g}$. This is the universal enveloping algebra of $\mathfrak{g}$. Then the canonical map $i: \mathfrak{g} \to U(\mathfrak{g})$ is a Lie algebra homomorphism. It is universal in the sense that given any unital associative algebra $A$ and a Lie algebra homomorphism $\varphi: \mathfrak{g} \to A$, there is a unique algebra homomorphism $\hat{\varphi}: U(\mathfrak{g}) \to A$ such that $\hat{\varphi} \circ i = \varphi$. The following theorem gives an algebraic description of the universal enveloping algebra.

Theorem 5.1.30 (Poincaré-Birkhoff-Witt). Let $\mathfrak{g}$ be a complex Lie algebra with basis $\{X_i\}$. Then the monomials $X_1^{p_1} \cdots X_n^{p_n}$ form a basis for $U(\mathfrak{g})$. If in addition we assume $\mathfrak{g}$ is semisimple, let $\{X_{-\alpha}, H_\alpha, X_\alpha\}$ be a basis for $\mathfrak{g}$ with respect to a set of roots $\Delta(\mathfrak{g}, \mathfrak{h})$ and a choice of simple system $\Pi$. Then the monomials
$$X_{-\alpha_1}^{i_1} \cdots X_{-\alpha_p}^{i_p} H_{\alpha_1}^{j_1} \cdots H_{\alpha_q}^{j_q} X_{\alpha_1}^{k_1} \cdots X_{\alpha_r}^{k_r}$$
form a basis for $U(\mathfrak{g})$.

Corollary 5.1.31.
The canonical map i : g → U ( g ) is an injective Lie algebra homomorphism. Proposition 5.1.32.
Every representation of $\mathfrak{g}$ extends to a representation of $U(\mathfrak{g})$, and every $U(\mathfrak{g})$-module descends to a representation of $\mathfrak{g}$.

Proof.
The inclusion of $U(\mathfrak{g})$-modules into $\mathfrak{g}$-representations is given by the corollary above. Therefore, it suffices to show that every $\mathfrak{g}$-representation extends to an associative algebra homomorphism $U(\mathfrak{g}) \to \operatorname{End}(V)$. Any representation $\mathfrak{g} \to \operatorname{End}(V)$ can be extended to an algebra homomorphism $T^\bullet(\mathfrak{g}) \to \operatorname{End}(V)$. The kernel of this map contains the ideal defining $U(\mathfrak{g})$, and therefore the map descends to $U(\mathfrak{g}) \to \operatorname{End}(V)$.

We want to give a more analytic interpretation of the universal enveloping algebra. Let $G$ be a semisimple (or reductive) Lie group with Lie algebra $\mathfrak{g}$. Then $G$ acts on the space of smooth functions $C^\infty(G)$ in two ways:
$$L(g) f(x) = f(g^{-1} x) \qquad R(g) f(x) = f(xg)$$
An easy consequence of the definitions is that the differentiated action $dL$ commutes with $R$, so
$$L(g) \, dR(X) = dR(X) \, L(g)$$
for all $X \in \mathfrak{g}$ and $g \in G$. This exhibits $\mathfrak{g}$ as left invariant differential operators on $G$. In fact, it is a faithful representation $\mathfrak{g} \to \operatorname{End}(C^\infty(G))$. We can extend this action to $U(\mathfrak{g})$ and thereby realize $U(\mathfrak{g})$ as a ring of left invariant differential operators on $G$. As it turns out, much of the representation theory of $G$ is determined by how certain differential operators (namely the Laplacian or Casimir element) act on representations. If the representation is irreducible, for instance, then the center $Z(\mathfrak{g})$ of the universal enveloping algebra acts by scalars. This parametrizes the irreducible representations of $G$.

Let $\mathfrak{g}$ be a complex semisimple Lie algebra with Cartan subalgebra $\mathfrak{h}$ and root system $\Delta := \Delta(\mathfrak{g}, \mathfrak{h})$. Let $\Delta^+$ denote the set of positive roots and $\Pi$ a system of simple ones. It is known that the finite dimensional representation theory of semisimple Lie algebras is semisimple. In the case of complex representations, we have that for every finite dimensional representation $\varphi: \mathfrak{g} \to \mathfrak{gl}(V) = \operatorname{End}_{\mathbb{C}}(V)$, we can decompose $V = \bigoplus V_i$ where each $V_i$ is irreducible.
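In the smallest case $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ the irreducible pieces are completely explicit. The sketch below (our own conventions: basis $v_0, \dots, v_m$ with $v_0$ of highest weight $m$) builds the $(m+1)$-dimensional irreducible representation as matrices and verifies the defining bracket relations and the weight string:

```python
# Explicit (m+1)-dimensional irreducible representation of sl(2, C);
# the normalization (v_0 highest, f v_k = v_{k+1}) is our own choice.
import numpy as np

def sl2_irrep(m):
    """Matrices (e, f, h) on basis v_0, ..., v_m with h v_k = (m - 2k) v_k,
    f v_k = v_{k+1}, and e v_k = k (m - k + 1) v_{k-1}."""
    n = m + 1
    e = np.zeros((n, n)); f = np.zeros((n, n)); h = np.zeros((n, n))
    for k in range(n):
        h[k, k] = m - 2 * k
        if k + 1 < n:
            f[k + 1, k] = 1.0                 # f lowers the weight by 2
        if k - 1 >= 0:
            e[k - 1, k] = k * (m - k + 1)     # e raises the weight by 2
    return e, f, h

def bracket(X, Y):
    return X @ Y - Y @ X

e, f, h = sl2_irrep(3)  # the 4-dimensional irreducible
# The defining relations of sl(2, C) hold:
assert np.allclose(bracket(h, e), 2 * e)
assert np.allclose(bracket(h, f), -2 * f)
assert np.allclose(bracket(e, f), h)
# v_0 is a highest weight vector: e kills it and h acts by the weight m = 3.
v0 = np.eye(4)[0]
assert np.allclose(e @ v0, 0)
assert np.allclose(h @ v0, 3 * v0)
print(sorted(np.diag(h)))  # the weight string -3, -1, 1, 3
```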
Therefore we want to classify all irreducible finite dimensional representations, and this will yield all finite dimensional representations of $\mathfrak{g}$. We have the following theorem which does precisely this.

Theorem 5.1.33 (Theorem of Highest Weights). Let $\mathfrak{g}$ be a complex semisimple Lie algebra, $\mathfrak{h}$ a Cartan subalgebra and $\Delta(\mathfrak{g}, \mathfrak{h})$ the roots with respect to $\mathfrak{h}$. Let $C^+$ be the positive Weyl chamber. Then the irreducible, finite-dimensional representations of $\mathfrak{g}$ stand in one-one correspondence with the set of algebraically integral, dominant weights. The correspondence is given in one direction by $V \mapsto \lambda$, its highest weight.

The difficult step in the proof of this theorem is the construction of the correspondence in the "$\leftarrow$" direction. To do this, we must build finite dimensional irreducible representations which have highest weight $\lambda$. These are seen as quotients of Verma modules (to be defined below), which are infinite dimensional representations of $\mathfrak{g}$ that are universal in some sense (see Proposition 5.1.38).

The setup to the construction of such representations makes use of the root space decomposition of $\mathfrak{g}$. If $\alpha \in \Delta$, define
$$\mathfrak{g}_\alpha := \{ X \in \mathfrak{g} \mid (\operatorname{ad} H - \alpha(H))^n X = 0 \ \forall H \in \mathfrak{h}, \text{ and some } n = n(H, X) \}$$
Then it is easy to see that
$$\mathfrak{g} = \bigoplus_{\alpha \in \Delta \cup \{0\}} \mathfrak{g}_\alpha = \mathfrak{h} \oplus \bigoplus_{\alpha \neq 0} \mathfrak{g}_\alpha$$
By definition the zero root space is the Cartan subalgebra. If we pick an order on $\mathfrak{h}^*$ we can then decompose $\mathfrak{g}$ further into positive and negative root spaces
$$\mathfrak{n} = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_\alpha \qquad \mathfrak{n}^- = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_{-\alpha}$$
These are both Lie subalgebras by construction.
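The decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{n} \oplus \mathfrak{n}^-$ can be computed numerically. The sketch below (our own basis and choice of regular element $H$) reads off the eigenvalues $\alpha(H)$ of $\operatorname{ad}(H)$ on $\mathfrak{sl}_3(\mathbb{C})$, recovering three positive roots, three negative roots, and the two-dimensional Cartan:

```python
# Root space decomposition of sl(3) via a regular Cartan element (our setup).
import numpy as np

def E(i, j):
    M = np.zeros((3, 3)); M[i, j] = 1.0
    return M

# Basis of sl(3): two diagonal Cartan elements plus the six E_ij, i != j.
basis = [np.diag([1.0, -1.0, 0.0]), np.diag([0.0, 1.0, -1.0])] + \
        [E(i, j) for i in range(3) for j in range(3) if i != j]

H = np.diag([1.0, 0.0, -1.0])  # a regular element of the Cartan subalgebra

# Each basis element X satisfies [H, X] = lambda X with lambda = alpha(H)
# (and lambda = 0 on the Cartan); extract lambda by an inner product.
eigs = sorted(round(float(np.trace((H @ X - X @ H) @ X.T)) /
                    float(np.trace(X @ X.T))) for X in basis)
print(eigs)  # [-2, -1, -1, 0, 0, 1, 1, 2]: the values alpha(H), with 0 on h
```

The three strictly upper triangular $E_{ij}$ span $\mathfrak{n}$ (positive values of $\alpha(H)$) and the strictly lower triangular ones span $\mathfrak{n}^-$.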
Definition 5.1.34.
The Lie subalgebra spanned by the Cartan subalgebra together with all of the positive root spaces is called the Borel subalgebra of $\mathfrak{g}$. We denote this as $\mathfrak{b} = \mathfrak{h} \oplus \mathfrak{n}$. Any Lie subalgebra $\mathfrak{p}$ such that $\mathfrak{b} \subseteq \mathfrak{p} \subsetneq \mathfrak{g}$ is called a parabolic subalgebra.

Before we head into the theory of highest weight modules, we recall some facts about $\mathfrak{sl}_2(\mathbb{C})$. If we let $\{e, f, h\}$ be a basis, then on any irreducible finite dimensional representation we have a weight space decomposition with basis vectors $u_0, u_1, \dots, u_m$: the element $h$ acts on each $u_i$ by a scalar (its weight), $f$ maps $u_i$ to a multiple of $u_{i-1}$, and $e$ maps $u_i$ to a multiple of $u_{i+1}$, terminating in a vector $u_m$ such that $e(u_m) = 0$. We say that $u_m$ is the highest weight vector of this representation. In this same style we have the following definition.

Definition 5.1.35.
Let $V$ be a left $U(\mathfrak{g})$-module. A vector $v \in V$ is called a highest weight vector if $\mathfrak{n} \cdot v = 0$. The left $U(\mathfrak{g})$-submodule generated by a highest weight vector is called a highest weight module.

The following proposition gives some properties of highest weight modules.

Proposition 5.1.36.
Let $M$ be a highest weight module for $U(\mathfrak{g})$, and let $v$ be a highest weight vector generating $M$. Suppose $v$ is of weight $\lambda$. Then the following hold:
(a) $M = U(\mathfrak{n}^-) v$
(b) $M = \bigoplus_{\mu \in \mathfrak{h}^*} M_\mu$ with each $M_\mu$ finite-dimensional and with $\dim_{\mathbb{C}} M_\lambda = 1$
(c) Every weight of $M$ is of the form $\lambda - \sum n_i \alpha_i$ with $\alpha_i \in \Pi$ and $n_i \in \mathbb{Z}^+$.

Proof. (a) As above, we have the decomposition $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{b}$. The Poincaré-Birkhoff-Witt Theorem gives a basis for $U(\mathfrak{g})$ which gives us the decomposition $U(\mathfrak{g}) = U(\mathfrak{n}^-) \otimes U(\mathfrak{b}) = U(\mathfrak{n}^-) \otimes U(\mathfrak{h}) \otimes U(\mathfrak{n})$. On the vector $v$, $U(\mathfrak{b})$ acts by scalars. This follows from the fact that $U(\mathfrak{n}) v = \mathbb{C} v$ and that $U(\mathfrak{h})$ does not increase or decrease the weight. Therefore $U(\mathfrak{g}) v = U(\mathfrak{n}^-) v$, and as $M$ is generated by $v$, we conclude that $M = U(\mathfrak{n}^-) v$.

(b, c) It is clear that $\bigoplus M_\mu$ is stable under the left $U(\mathfrak{g})$-action. As $v \in \bigoplus M_\mu$, we have that $M \subseteq \bigoplus M_\mu$. It is true by construction that $\bigoplus M_\mu \subseteq M$, and therefore $M = \bigoplus M_\mu$. By (a) we know that $M = U(\mathfrak{n}^-) v$. Any monomial $E_{-\beta_1}^{i_1} \cdots E_{-\beta_k}^{i_k}$ maps $M_\mu$ into the weight space of weight $\mu - \sum_j i_j \beta_j$. As $\lambda$ is the highest weight, there are finitely many ways to write $\mu = \lambda - \sum_j i_j \beta_j$ and a unique way to write $\lambda$ itself. Therefore $M_\mu$ is finite-dimensional and $M_\lambda$ is 1-dimensional. The weights are all of the form $\lambda - \sum_j i_j \beta_j = \lambda - \sum n_i \alpha_i$, as each $\beta_p = \sum_i n_{i_p} \alpha_i$ for $\alpha_i \in \Pi$. This completes the proof.

We will define Verma modules shortly. These will turn out to be highest weight modules which are universal in some sense. Before then, let $\lambda \in \mathfrak{h}^*$, and put $\delta = \frac{1}{2} \sum_{\alpha \in \Delta^+} \alpha$. We can make $\mathbb{C}$ into a $U(\mathfrak{b})$-module by defining how elements of $\mathfrak{h}$ and $\mathfrak{n}$ act; by the Poincaré-Birkhoff-Witt Theorem we will then have defined how all of $U(\mathfrak{b})$ acts. Define the action of $\mathfrak{b}$ on $\mathbb{C}$ by
$$H z = (\lambda - \delta)(H) z \ \ \forall H \in \mathfrak{h} \qquad X z = 0 \ \ \forall X \in \mathfrak{n}$$
We denote $\mathbb{C}$ under this action as $\mathbb{C}_{\lambda - \delta}$.
Define a functor $\operatorname{Ind}_{\mathfrak{b}}^{\mathfrak{g}}: U(\mathfrak{b})\text{-Mod} \to U(\mathfrak{g})\text{-Mod}$ by
$$V \mapsto U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} V$$
where we regard $U(\mathfrak{g})$ as a right $U(\mathfrak{b})$-module, that is, as a module over the universal enveloping algebra of the subalgebra.

Definition 5.1.37.
The
Verma module corresponding to the weight $\lambda$ is
$$V(\lambda) = \operatorname{Ind}_{\mathfrak{b}}^{\mathfrak{g}}(\mathbb{C}_{\lambda - \delta}) = U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} \mathbb{C}_{\lambda - \delta}$$
The following theorem characterizes Verma modules. Using these modules, one can prove the "$\leftarrow$" direction of the theorem of highest weights.

Proposition 5.1.38.
Let $\lambda \in \mathfrak{h}^*$.
(a) $V(\lambda)$ is a highest weight module with weight $\lambda - \delta$, generated by $1 \otimes 1$.
(b) Let $M$ be a highest weight module of weight $\lambda - \delta$ generated by a highest weight vector $v$. Then there exists a unique $U(\mathfrak{g})$-module map $\psi: V(\lambda) \to M$ with $\psi(1 \otimes 1) = v$, and $\psi$ is onto. In particular $M \cong V(\lambda)$ if and only if $\ker \psi = 0$.

Part (b) follows from the universal mapping property for tensor products. Notice that $V(\lambda)$ is infinite dimensional over $\mathbb{C}$.

Proposition 5.1.39.
Let $\lambda \in \mathfrak{h}^*$, $V(\lambda)$ the associated Verma module, and $S$ the sum of all proper $U(\mathfrak{g})$-submodules of $V(\lambda)$. Then $L(\lambda) = V(\lambda)/S$ is an irreducible $U(\mathfrak{g})$-module and is a highest weight module with weight $\lambda - \delta$.

This follows immediately from the definition and the fact that the image of $1 \otimes 1$ in $L(\lambda)$ is non-zero. The following theorem completes the proof of the Theorem of Highest Weights.

Theorem 5.1.40.
Let $\lambda \in \mathfrak{h}^*$ be real on $\mathfrak{h}$, dominant, and algebraically integral. Then $L(\lambda + \delta)$ is an irreducible finite-dimensional representation of $\mathfrak{g}$ with highest weight $\lambda$.

For a proof of this see [Kna05a, Chapter V, Section 3].

Remark 5.1.41.
The exact same result holds on the group level as well. There, the proof follows from the theorem on the level of Lie algebras by differentiating the representations and then following the same steps. The only difference is the replacement of algebraic integrality with analytic integrality (defined below). For more details see [Kna86, Chapter IV, Section 7].

Now that we know these representations exist and are parametrized by dominant, algebraically integral weights, we want to find an explicit realization of the $L(\lambda + \delta)$. To do this, we make use of the theory of holomorphic vector bundles.

5.2 Compact Groups and Tori

The key to understanding a majority of the representation theory of reductive, semisimple, or compact Lie groups is the existence of a
Haar measure. This is a left invariant Borel measure on $G$. The existence of such a measure implies, as an example, that all representations of compact Lie groups can be taken to be unitary without loss of generality. Additionally, combined with the Iwasawa decomposition, we get a variety of strong results. This will play a key role in the proof of the Borel-Weil theorem. Let us first show that such a measure exists.

Let $G$ be a Lie group of dimension $n$ with Lie algebra $\mathfrak{g}$. Then $T_e(G) = \mathfrak{g}$ and there is an isomorphism from $\mathfrak{g}$ to $\Gamma_L(G, TG)$, the set of left-invariant smooth vector fields on $G$. From this we conclude that $G$ is parallelizable. For this reason, we know that there exists an $n$-form $\omega \in \Omega^n(G)$ which is positive relative to a chosen atlas on $G$, nowhere vanishing, and left-invariant. Further, by the Riesz Representation theorem, there exists a Borel measure $d\mu_\omega$ on $G$ such that $\int_G f \omega = \int_G f \, d\mu_\omega$ for all $f \in C_c(G)$.

Lemma 5.2.1. $d\mu_\omega$ is left invariant in the sense that $d\mu_\omega(L_g E) = d\mu_\omega(E)$ for all Borel sets $E \subseteq G$ and all $g \in G$.

Proof. As $\omega$ is left-invariant, we know that $L_g^* \omega = \omega$. Therefore, we have that
$$\int_G f \omega = \int_G f(gx) \, L_g^* \omega = \int_G f(gx) \, d\mu_\omega(x) = \int_G f(x) \, d\mu_\omega(x)$$
Hence, $d\mu_\omega$ is left-invariant. If $K \subseteq G$ is compact, we apply the above integral formula to all $f \geq \chi_K$. Taking the infimum over these functions we see that $d\mu_\omega(L_g K) = d\mu_\omega(K)$. Since $G$ has a countable base, $d\mu_\omega$ is regular and the lemma follows.

Definition 5.2.2.
A left-invariant, positive, Borel measure on $G$ is called a left Haar measure.

Proposition 5.2.3.
Any two left Haar measures on $G$ are proportional.

Proof.
See [Kna05a, Theorem 8.23].

We could have equivalently defined right Haar measures. For most groups these are different from the left Haar measures. Let $d_l x$ denote a left Haar measure and $d_r x$ a right Haar measure, and notice that $L_g$ and $R_g$ commute with one another. Then, for any $t \in G$, the measure $d_l(\cdot \, t)$ is again a left Haar measure. For this reason, we get a function $\Delta: G \to \mathbb{R}^+$, called the modular homomorphism, which satisfies
$$d_l(\cdot \, t) = \Delta^{-1}(t) \, d_l(\cdot)$$
This is a smooth function.
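A group is unimodular exactly when $|\det \operatorname{Ad}(t)| = 1$ for all $t$; up to a convention-dependent inversion, $\Delta(t)$ itself can be computed as $|\det \operatorname{Ad}(t)|$, so this gives a hands-on test. The sketch below (our own helper `ad_matrix` and our own choice of group elements) contrasts the non-unimodular "$ax+b$" group with the semisimple $SL_2(\mathbb{R})$:

```python
# Unimodularity via |det Ad(t)|; the helper and test elements are our own.
import numpy as np

def ad_matrix(g, basis):
    """Matrix of Ad(g): X -> g X g^{-1} in the given basis of the Lie algebra."""
    ginv = np.linalg.inv(g)
    A = np.column_stack([B.flatten() for B in basis])
    cols = [np.linalg.lstsq(A, (g @ X @ ginv).flatten(), rcond=None)[0]
            for X in basis]
    return np.column_stack(cols)

# The "ax + b" group (affine line), embedded as [[a, b], [0, 1]]:
affine_basis = [np.array([[1.0, 0], [0, 0]]), np.array([[0, 1.0], [0, 0]])]
g = np.array([[2.0, 3.0], [0.0, 1.0]])
print(abs(np.linalg.det(ad_matrix(g, affine_basis))))  # 2.0: not unimodular

# sl(2, R) basis; SL(2, R) is semisimple, so |det Ad| should be identically 1:
sl2_basis = [np.array([[1.0, 0], [0, -1.0]]),
             np.array([[0, 1.0], [0, 0]]), np.array([[0, 0], [1.0, 0]])]
s = np.array([[2.0, 1.0], [0.0, 0.5]])  # an element of SL(2, R)
assert np.isclose(abs(np.linalg.det(ad_matrix(s, sl2_basis))), 1.0)
```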
Lemma 5.2.4. $\Delta(t) = 1$ for all $t$ in a compact subgroup $K$ of $G$.

Proof. As $\Delta$ is smooth, $\Delta(K)$ is a compact subgroup of $\mathbb{R}^+$. Therefore $\Delta(K) = \{1\}$.

Definition 5.2.5. A Lie group $G$ is called unimodular if $\Delta =$
1. Equivalently, if $d_r(x) = d_l(x)$. We now want to know which groups are unimodular; then, when integration arises on these groups, we do not have to worry about the choice of Haar measure.

Theorem 5.2.6.
The following groups are unimodular:
(a) compact groups
(b) semisimple groups
(c) reductive groups
We will not prove this as it requires the development of reductive Lie groups, which we do not present. See [Kna05a] for a proof in full generality.

Now we turn to general representation theory for compact groups. A representation is a continuous group homomorphism $\Pi: K \to \operatorname{Aut}(V)$ for some Hilbert space $V$. (The assumption that $V$ is a Hilbert space is unnecessary for $\dim V < \infty$. As we want the greatest generality, we do not place this finiteness assumption on $V$.) A representation is called unitary if $\Pi(k)$ is a unitary operator for all $k \in K$.

Lemma 5.2.7.
Let $K$ be a compact Lie group and $(\Pi, V)$ a representation. Then there exists a Hermitian inner product $\langle \cdot, \cdot \rangle$ on $V$ so that the representation is unitary.

Proof. As $K$ is compact, every continuous function is integrable. Define
$$(u, v) = \int_K \langle \Pi(k) u, \Pi(k) v \rangle \, dk$$
where $dk$ is the Haar measure on $K$. By the invariance of $dk$, each $\Pi(k')$ is a unitary operator with respect to this new Hermitian inner product. Further, by the Principle of Uniform Boundedness we conclude that the topology on $V$ is the same as the topology generated by $(\cdot, \cdot)$.

Therefore, we can assume without loss of generality that every representation of a compact Lie group is unitary. Another interesting feature of compact Lie groups is the existence of a maximal abelian subgroup.

Proposition 5.2.8 (Cartan). Let $K$ be a compact, connected Lie group. Then there exists a maximal abelian subgroup which can be identified with a torus. Further, any two maximal tori are conjugate.

Proof.
See [Bum13].

In a similar style to semisimple Lie algebras, we can define roots with respect to $\mathfrak{t}$, the Lie algebra of a maximal torus $T \subseteq K$. As $\mathfrak{t}$ is abelian, the adjoint representation on $\mathfrak{k}$ breaks up (as a direct sum) into one-dimensional irreducible representations. Each of these representations corresponds to a linear functional on $\mathfrak{t}$. We define the roots as those functionals which yield non-zero spaces $\mathfrak{k}_\alpha$.

Definition 5.2.9. Let $\lambda \in \mathfrak{t}^*$. Then we say $\lambda$ is analytically integral if $\lambda(H) \in 2\pi i \mathbb{Z}$ for every $H \in \mathfrak{t}$ with $\exp H = 1$. By a simple argument it can be shown that this condition is equivalent to the existence of a character $\xi_\lambda: T \to \mathbb{C}^\times$ such that $\xi_\lambda(\exp H) = e^{\lambda(H)}$ for all $H \in \mathfrak{t}$.

Proposition 5.2.10. If $\lambda$ is analytically integral, then $\lambda$ is algebraically integral. That is, $(\lambda, \alpha) \in \mathbb{Z}$ for each $\alpha \in \Delta(\mathfrak{k}, \mathfrak{t})$.

Proof.
See [Kna86].
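Definition 5.2.9 is transparent for the circle $T = U(1)$, where $\mathfrak{t} = i\mathbb{R}$ and the kernel of $\exp$ is $2\pi i \mathbb{Z}$: the functional $\lambda_n(i\theta) = in\theta$ is analytically integral exactly when $n \in \mathbb{Z}$, i.e. when $e^{in\theta}$ is a well-defined character. A sketch (the function name and the sample values of $n$ are our own):

```python
# Analytic integrality on T = U(1) (Definition 5.2.9); our own sketch.
import cmath

def analytically_integral(n, tol=1e-9):
    """Check lambda_n(H) in 2*pi*i*Z on the generator H = 2*pi*i of ker(exp)."""
    lam_H = n * 2 * cmath.pi * 1j            # lambda_n(2*pi*i)
    return abs(cmath.exp(lam_H) - 1) < tol   # e^{lambda(H)} = 1 iff n in Z

print([n for n in (0, 1, 2, 0.5, 1.5) if analytically_integral(n)])  # [0, 1, 2]
```

Non-integer $n$ fails because $\xi_\lambda$ would be multi-valued on the circle, which is exactly the obstruction the definition rules out.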
We now depart from compact groups momentarily to set up the remaining backgroundfor the Borel-Weil theorem.
Let G be a real Lie group. We would like to find a complex Lie group G C which extends G in some meaningful way. Definition 5.3.1.
The complexification of a real Lie group $G$ is a complex Lie group $G_{\mathbb{C}}$, together with an analytic map $G \to G_{\mathbb{C}}$, such that the Lie algebra of $G_{\mathbb{C}}$ is $\mathfrak{g}_{\mathbb{C}} = \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C}$ and $G_{\mathbb{C}}$ is universal in the following sense: if $H$ is a complex Lie group and $\varphi: G \to H$ is a smooth homomorphism, then there exists a unique holomorphic homomorphism $G_{\mathbb{C}} \to H$ making the appropriate diagram commute.

Remark 5.3.2.
Note that not all Lie groups admit a complexification. In fact, the double (universal) cover of $SL_2(\mathbb{R})$ does not admit a complexification. Even if a complexification exists, it is not necessarily unique up to isomorphism. The following theorem gives us another convenient property of compact groups: they always admit a complexification!

Theorem 5.3.3.
Let $K$ be a compact Lie group. Then $K$ admits a complexification which is unique up to isomorphism.

Proof.
See [Kna05a, Theorem 4.69 and Proposition 7.5].

It turns out that the finite-dimensional complex representations of a compact Lie group $K$ are in bijective correspondence with the finite-dimensional holomorphic representations of $K_{\mathbb{C}}$. Irreducibility need not be preserved by restriction. We now come to arguably the most important decomposition of complex Lie algebras and the Lie groups associated to them. It is responsible for nearly all of the structure theory for semisimple Lie groups.

Theorem 5.3.4 (Iwasawa Decomposition). Let $\mathfrak{g}$ be a real semisimple Lie algebra and $G$ a connected Lie group with Lie algebra $\mathfrak{g}$. Then there exist Lie subalgebras $\mathfrak{k}, \mathfrak{a}, \mathfrak{n}$ and associated analytic subgroups $K, A, N$, such that
$$\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{a} \oplus \mathfrak{n} \qquad \text{and} \qquad G = KAN$$
where $K$ is compact, $A$ is abelian, and $N$ is nilpotent, and similarly for the Lie algebras.

Proof.
For the Lie algebra decomposition, let $(\mathfrak{g}, \theta)$ be a semisimple Lie algebra together with a Cartan involution. Put $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, the associated Cartan decomposition, and let $\mathfrak{h}_\mathfrak{p}$ be a maximal abelian subspace of $\mathfrak{p}$. As $\mathfrak{h}_\mathfrak{p}$ is maximal abelian, we can simultaneously diagonalize all the operators $\operatorname{ad} H$, $H \in \mathfrak{h}_\mathfrak{p}$. Let
$$\mathfrak{g}_\lambda = \{ X \in \mathfrak{g} : [X, H] = \lambda(H) X \ \forall H \in \mathfrak{h}_\mathfrak{p} \}, \qquad \lambda \in \mathfrak{h}_\mathfrak{p}^*$$
Notice that $\theta(\mathfrak{g}_\lambda) = \mathfrak{g}_{-\lambda}$. Pick an ordering on $\mathfrak{h}_\mathfrak{p}^*$ and let $\mathfrak{n} = \bigoplus_{\alpha > 0} \mathfrak{g}_\alpha$. Since $\mathfrak{h}_\mathfrak{p}$ is $\theta$-invariant and maximal abelian, we have that $\mathfrak{g}_0 = (\mathfrak{g}_0 \cap \mathfrak{k}) + \mathfrak{h}_\mathfrak{p}$. Now if $X \in \bigoplus_{\alpha < 0} \mathfrak{g}_\alpha$, we can write it as $X = (X + \theta(X)) - \theta(X)$, which decomposes $X$ into a piece of $\mathfrak{k}$ and a piece of $\mathfrak{n}$. Therefore, we have a decomposition $\mathfrak{g} = \mathfrak{k} + \mathfrak{h}_\mathfrak{p} + \mathfrak{n}$. Applying $\theta$ we conclude that this decomposition is direct. For the Lie group decomposition see [Hel78].

This theorem also holds in the complex case. There is some slight modification that needs to be done to the proof above, but the big steps are identical.

Example 5.3.5. (a) Let $\mathfrak{g} = \mathfrak{sl}_n(\mathbb{R})$. $SO(n) \hookrightarrow SL_n(\mathbb{R})$ is a maximal compact subgroup and therefore $\mathfrak{so}(n)$ is the corresponding compact Lie algebra. Let $\mathfrak{a}$ be the traceless diagonal matrices and $\mathfrak{n}$ the strictly upper triangular matrices. Then
$$\mathfrak{sl}_n(\mathbb{R}) = \mathfrak{so}(n) \oplus \mathfrak{a} \oplus \mathfrak{n}$$
We can equivalently realize this on the group level as $SL_n(\mathbb{R}) = SO(n) \cdot T \cdot N$, where $N$ is the group of upper triangular unipotent matrices and $T$ is the maximal torus. Notice that this is equivalent to the Gram-Schmidt orthogonalization of a matrix in $SL_n(\mathbb{R})$.

Now let us consider the Cartan decomposition $\mathfrak{sl}_n(\mathbb{R}) = \mathfrak{so}(n) \oplus \mathfrak{p}$ where $\mathfrak{p}$ is the space of symmetric matrices. Notice that $\mathfrak{so}(n)$ appears in both decompositions, yet in the Cartan decomposition we have no Lie algebra structure on $\mathfrak{p}$. This should not be surprising, however, as both decompositions are only equivalences of vector spaces.

(b) Now consider $\mathfrak{sp}_n(\mathbb{C})$.
We have that
$$\mathfrak{k} = \left\{ \begin{pmatrix} U & V \\ -\bar{V} & \bar{U} \end{pmatrix} : U \text{ skew-Hermitian}, \ V \text{ symmetric} \right\}$$
Similar to $\mathfrak{sl}_n$ we have $\mathfrak{a} = \left\{ \begin{pmatrix} A & 0 \\ 0 & -A \end{pmatrix} : A \text{ a real diagonal matrix} \right\}$, which are diagonal matrices, and the nilpotent Lie algebra consists of upper triangular matrices, which we can decompose further as
$$\mathfrak{n} = \left\{ \begin{pmatrix} Z_1 & Z_2 \\ 0 & -Z_1^T \end{pmatrix} : Z_1 \text{ strictly upper triangular}, \ Z_2 \text{ symmetric} \right\}$$
Then $\mathfrak{sp}_n(\mathbb{C}) = \mathfrak{k} \oplus \mathfrak{a} \oplus \mathfrak{n}$.

Theorem 5.3.6.
Let $G$ be the complexification of a compact Lie group $K$, and $T \subseteq K$ a maximal torus with complexification $T_{\mathbb{C}}$. Let $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ be the complexified Lie algebra of $\mathfrak{k}$ and $\mathfrak{t}_{\mathbb{C}}$ the Lie algebra of $T_{\mathbb{C}}$. Denote the set of roots of $\mathfrak{g}$ with respect to $\mathfrak{t}_{\mathbb{C}}$ by $\Delta$. Fix an ordering on $\mathfrak{t}_{\mathbb{C}}^*$ and write $\Delta^+$ for the set of positive roots. Put $\mathfrak{n} = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_\alpha$ and let $\mathfrak{b} = \mathfrak{t}_{\mathbb{C}} \oplus \mathfrak{n}$. If we denote $N = \exp(\mathfrak{n})$ and $B = T_{\mathbb{C}} N$, then $N$ and $B$ are closed subgroups of $G$. Further, there exists $n > 0$ and an embedding $G \hookrightarrow GL_n(\mathbb{C})$ such that $K$ consists of unitary matrices, $T_{\mathbb{C}}$ consists of diagonal matrices, and $B$ consists of upper triangular matrices.

Proof.
Then N is characterized as the set of A ∈ GL n ( C ) such that λ i ( log ( g )) = N as a closed subgroup (sub-variety) of GL n ( C ) .Now, since [ t C , n ] ⊆ n , we know that T C normalizes N and thus B = T C N is a closesubgroup of GL n ( C ) . Further its Lie algebra is b by construction. This completes theproof. 146he group B is a bit too big for the Iwasawa decomposition of G above. Let a = i t .It is the Lie algebra of a closed, connected subgroup A or T . If we embed K and G into GL n ( C ) , then T is the group of diagonal matrices and A is the group of diagonal matriceswith positive real entries. Put B = AN . Then by the Iwasawa decomposition G = KB as a direct product. Corollary 5.3.7.
Let $K$ be a compact Lie group and $T$ a maximal torus. If we denote by $G$ the complexification of $K$, then there is a bijection $K/T \cong G/B$ where $B = T_{\mathbb{C}} N$. This gives $K/T$ the structure of a complex manifold.

Proof.
From the Iwasawa decomposition, we have that $G = KB$ with $B \cap K = T$. Note that this decomposition is not direct, as $\mathfrak{b} + \mathfrak{k} = \mathfrak{g}$ is not a direct sum. We thus obtain a diffeomorphism
$$G/B \to K/T$$
which is $K$-equivariant. Now, as $G$ is a complex Lie group and $B$ is a complex analytic submanifold, the quotient $G/B$ has the structure of a complex manifold. Further, the action of $K$ on $K/T$ is via holomorphic maps.

As we will see later, the proof of the Borel-Weil theorem uses the Iwasawa decomposition in a fundamental way. In fact, nearly all of the structure theory for semisimple Lie groups is due to the Iwasawa decomposition.

Definition 5.4.1.
Let M be a complex manifold. We call a triple (E, π, V), consisting of a complex manifold, a holomorphic projection map, and a complex vector space, a holomorphic vector bundle of rank dim V over M if:

(a) π : E → M is surjective and a local isomorphism.
(b) There exist biholomorphic local trivializations π⁻¹(U) → U × V.
(c) Each fibre π⁻¹(p) ≅ {p} × V ≅ V is endowed with a vector space structure.

Similarly, we could have defined vector bundles as E = ⨿_{p ∈ M} V_p, where V_p = {p} × V. In this sense, we see that TM and T*M are vector bundles over smooth manifolds. Similar to those, Γ(M, E) is an O_M-module. The main purpose of this section is to understand transformations on bundles and transformations between them. Definition 5.4.2.
Let (E, π) and (E′, π′) be holomorphic vector bundles over M and M′ respectively. Then a holomorphic bundle homomorphism is a map F : E → E′ such that there exists a map f : M → M′ making the following diagram commute:

    E ---F--> E′
    |π        |π′
    v         v
    M ---f--> M′

Proposition 5.4.3. If F is holomorphic, then f is holomorphic.

Proof. f = π′ ∘ F ∘ ζ, where ζ is the zero section. This is a composition of holomorphic maps and is therefore holomorphic.

This lets us define a category Bun_H(M) whose objects are holomorphic vector bundles over M and whose morphisms are holomorphic bundle homomorphisms. The forgetful functor U : Bun_H(M) → Man_C (with Man_C the category of complex manifolds) is faithful. In general, it is not full, as there exist holomorphic maps E → E′ which do not commute with the projection maps. We will denote by Bun_H(M)_{<∞} the category of finite rank vector bundles. In some more recent treatments of this material (say, in [Wed16]) this category is treated as finite locally free sheaves over M. This is not useful for the theory presented below. Example 5.4.4.
Let TM denote the real tangent bundle of the complex manifold M. It is a real vector bundle of rank 2 dim M. The complex structure on M induces an almost complex structure J on TM, that is, an endomorphism J : TM → TM such that J² = −1. This can be extended to an endomorphism TM ⊗ C → TM ⊗ C defined on fibres by J(X + iY) = J(X) + iJ(Y). As J² = −1, we get a decomposition of TM ⊗ C into two eigenspaces for J corresponding to the eigenvalues i and −i:

TM ⊗ C = TM_i ⊕ TM_{−i}.

Then TM_i is the holomorphic tangent bundle to M. The bundle TM_{−i} is called the anti-holomorphic tangent bundle.

If E and E′ are holomorphic vector bundles on a complex manifold M, denote their spaces of holomorphic sections by Γ(E) and Γ(E′). If F : E → E′ is a bundle homomorphism, it induces a map F̃ : Γ(E) → Γ(E′) given by F̃(σ)(p) = F(σ(p)). Because a bundle homomorphism is linear on fibres, F̃ is C-linear on sections.

We now want to construct some holomorphic vector bundles on a complex Lie group and on complex homogeneous spaces G/H. Proposition 5.4.5.
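The simplest instance of this eigenspace splitting, on R² = C with the standard almost complex structure, can be verified directly (a small illustrative computation, not taken from the text):

```python
import numpy as np

# Standard almost complex structure on R^2 = C: J(x, y) = (-y, x), J^2 = -1.
# Complexifying, C^2 splits into the +i and -i eigenspaces of J, spanned by
# (1, -i) and (1, i) (the d/dz and d/dzbar directions).
J = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.allclose(J @ J, -np.eye(2))

v_plus = np.array([1.0, -1.0j])    # eigenvector for eigenvalue +i
v_minus = np.array([1.0, 1.0j])    # eigenvector for eigenvalue -i
assert np.allclose(J @ v_plus, 1j * v_plus)
assert np.allclose(J @ v_minus, -1j * v_minus)
```

Fibrewise, the decomposition TM ⊗ C = TM_i ⊕ TM_{−i} is exactly this eigenspace splitting carried out at every point.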
Let G be a complex Lie group and (π, W) a complex representation of a closed subgroup H. Then there exists a holomorphic vector bundle V over G/H such that G acts on the space of sections.

Proof. The canonical map G → G/H is a principal H-bundle. Any complex representation π : H → GL(W) induces an action of H on the space G × W by (g, w)·h = (gh, π(h⁻¹)w). Put V = G ×_H W = (G × W)/H. Then [gh, w] = [g, π(h)w] ∈ V. The map q : V → G/H given by [g, w] ↦ gH is well defined, surjective, and q⁻¹(gH) ≅ W. This is a fibre bundle with transition maps given by the transition maps for the principal bundle. Further, as the fibres are complex vector spaces and the canonical map is holomorphic, we have that V is a holomorphic vector bundle over G/H. Let Γ(G/H, V) denote the set of sections s : G/H → V. We can identify

Γ(G/H, V) ≅ F_{H,π} := { f : G → W | f(gh) = π(h)⁻¹ f(g) }.

Then G acts on this space by (g·f)(x) = f(g⁻¹x). This completes the proof.

Even for one-dimensional representations χ of H, the space F_{H,χ} is unbelievably massive. We may hope that if we restrict to some subset (say, impose more restrictions on f ∈ F_{H,χ}), then we may be able to get a handle on what these representations are. As it will turn out in the next section, we can restrict ourselves to holomorphic sections of V. This restriction will turn out to be enough to realize all of the finite dimensional irreducible representations of K a compact Lie group and G = K_C its complexification.

In this short subsection, we shall show that there is some interesting geometry happening behind the scenes here involving the quotients G/B, or more generally G/P for any closed subgroup containing B. This is done through the language of flag manifolds. Before we get to flag manifolds, we need to discuss the Grassmann manifolds (also called Grassmannians). Definition 5.4.6.
Let V be a real (or complex) vector space of dimension n. The Grassmannian of k-planes in V is the set of all k-dimensional subspaces of V and is denoted Gr(k, V).

Let G = Aut(V) be the group of automorphisms of V. By choosing a basis for V, we can identify Aut(V) ≅ GL_n(R) (resp. GL_n(C)). Now, let A and A′ be two different elements of Gr(k, V). By choosing bases of A and A′ and extending these to full bases of V, we can find a matrix X ∈ GL_n(R) such that XA = A′. Therefore, GL_n(R) acts transitively on Gr(k, V). Let {v_1, ..., v_n} be the basis of V and S = Span_R{v_1, ..., v_k} the standard k-plane. Then the isotropy subgroup of S is the closed subgroup

H = { ( P Q ; 0 R ) : P ∈ GL_k(R), Q ∈ M_{k,n−k}(R), R ∈ GL_{n−k}(R) }

of block upper-triangular matrices (written in block rows). This gives an identification Gr(k, V) = GL_n(R)/H. We call H a parabolic subgroup of G. This exhibits Gr(k, V) as a real (resp. complex) manifold.

Now let (n_1, ..., n_j) ∈ Z^j, j ≤ n, be an increasing tuple of integers with n_j = n = dim V. A flag of type (n_1, ..., n_j) is a chain of subspaces

0 = V_0 ⊆ V_1 ⊆ V_2 ⊆ ... ⊆ V_j = V

with dim V_i = n_i. Equivalently, we could require that dim V_i/V_{i−1} = n_i − n_{i−1}. A full flag corresponds to the tuple (1, 2, 3, ..., n), and thus a chain

0 = V_0 ⊆ V_1 ⊆ ... ⊆ V_n = V with dim V_i/V_{i−1} = 1.

Definition 5.4.7. The partial flag manifold of type (n_1, ..., n_j) is the collection of all flags of type (n_1, ..., n_j) in V and is denoted Fl(n_1, ..., n_j; V). The full flag manifold of V will be denoted Fl(V).

By choosing a basis for V and thus identifying it with R^n, we have a natural action of GL_n(R) on Fl(n_1, ..., n_j; V). Now, let F and F′ be two distinct flags. There exists X ∈ GL_n(R) such that XF = F′, and the action is transitive. The stabilizer of F is a closed subgroup P of GL_n(R), and we identify Fl(n_1, ..., n_j; V) = G/P. This exhibits Fl(n_1, ..., n_j; V) as a smooth manifold. The stabilizer of the standard full flag is the subgroup B of upper-triangular matrices. Thus Fl(V) = GL_n(R)/B. Remark 5.4.8.
The groups P and B are called the standard parabolic and standard Borel subgroups respectively. An alternative definition of the standard Borel subgroup is as a standard minimal parabolic subgroup. We call the conjugates of B Borel subgroups and the conjugates of P parabolic subgroups. Notice that every parabolic subgroup contains a Borel subgroup.

In the case of a complex vector space, we see that Fl(V) = GL_n(C)/B. By Corollary 5.3.7, we can realize Fl(V) = K/T for K = U(n). In more generality, for a connected Lie group C, there exists a maximal torus S and the quotient space C/S is a flag manifold.

Example 5.4.9. (a) As seen above, if V is a complex vector space then Gr(k, V) is a flag manifold corresponding to the tuple (k, n) ∈ Z². It is realized as the quotient GL_n(C)/H with H the complex analog of the group defined above.
(b) Let CP^n (or P^n(C)) denote the orbit space (C^{n+1} − {0})/C^×. This is realized as the space of all lines in C^{n+1}. In the language we have seen above, we can realize this as Gr(1, C^{n+1}). Definition 5.4.10.
Let G be a complex connected Lie group and H a closed subgroup. Then G/H is a complex homogeneous space. Let p : V → G/H be a holomorphic vector bundle. V is homogeneous if the group of bundle automorphisms acts transitively on the set of fibres of V. We call V homogeneous with respect to G if the G-action on G/H lifts to a G-action on V by bundle automorphisms. We will sometimes refer to these as G-homogeneous vector bundles.

Let us now characterize all vector bundles on flag manifolds which are homogeneous with respect to K_C. Proposition 5.4.11.
Let K be a compact, connected Lie group and G its complexification. Let (π, W) be a representation of a parabolic subgroup P ⊆ G. Then this gives rise to a holomorphic vector bundle over the partial flag manifold G/P which is homogeneous with respect to G. Further, every holomorphic vector bundle which is homogeneous with respect to G arises in this way.

Proof.
The existence of such a vector bundle was proven in Proposition 5.4.5. The homogeneity condition is readily checked. Therefore, we shall show that every homogeneous vector bundle arises in this way. Let V be a G-homogeneous vector bundle and V_P the fibre p⁻¹(P). V_P comes naturally equipped with the structure of a representation P → Aut(V_P). The map µ : G × V_P → V defined by µ(g, z) = g·z is surjective, as G acts transitively on G/P. The fibres of µ are precisely the P-orbits on G × V_P via the diagonal action (g, z) ↦ (gp⁻¹, p·z). Therefore, we may represent any element uniquely as an equivalence class [g, z], where [gp, z] = [g, p·z]. Hence, we can make the identification V = G ×_P V_P. This completes the proof.

We will motivate the theorem by starting with some facts about G = GL_n(C). The natural action of G on C^n − {0} commutes with the action of C^× and therefore descends to an action on CP^{n−1}. Moreover, this action is transitive. Now, the isotropy subgroup in G of the class [0 : ... : 0 : 1] consists of all g ∈ G such that g·(0, ..., 0, 1)^T = (0, ..., 0, λ)^T for some λ ∈ C^×. Let Q be this group. Then

Q = { ( A 0 ; w^T λ ) } ∩ GL_n(C)

with λ ∈ C, w ∈ C^{n−1}, and A ∈ M_{n−1}(C) (block matrices written in block rows). Then Q is a complex subgroup of G, as its Lie algebra is complex. Therefore the quotient G/Q becomes a complex manifold which is biholomorphic to CP^{n−1}.

Now fix N ≥ 0 and let χ : Q → C^× be the character of Q of the form

χ( ( A 0 ; w^T λ ) ) = λ^{−N}.

Then χ induces a holomorphic action of Q on C by q·z = χ(q)z. Using this, we can build the associated bundle G ×_Q C → G/Q in the style of the previous section. Now, per the proof of Proposition 5.4.5, we can identify the C^∞ sections of this bundle with the space of functions

F^∞_{Q,χ} = { f : G → C | f(gq) = χ(q)⁻¹ f(g), f smooth }.

Now, let V_N be the space of homogeneous polynomials of degree N in n complex variables. Then for any f ∈ V_N define φ_f(g) = f(g·e_n), where e_n = (0, ..., 0, 1)^T. For q ∈ Q we have q·e_n = λe_n, so that

φ_f(gq) = f(gq·e_n) = f(λ g·e_n) = λ^N φ_f(g) = χ(q)⁻¹ φ_f(g).

Therefore φ_f ∈ F^∞_{Q,χ}. In fact, this function is holomorphic, and therefore φ_f ∈ F^Hol_{Q,χ}, the space of holomorphic sections. For the rest of this section, let ℓ = e_n. Proposition 5.5.1.
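The defining equivariance φ_f(gq) = λ^N φ_f(g) of these sections can be verified numerically. In the sketch below the sizes, the monomial f, and the numpy dependency are illustrative assumptions, not from the text.

```python
import numpy as np

n, N = 3, 4
f = lambda z: z[0] ** 2 * z[1] * z[2] + z[2] ** 4  # homogeneous of degree N = 4

def phi(f, g):
    # phi_f(g) = f(g e_n): applying g to e_n picks out the last column of g
    return f(g[:, -1])

g = np.random.randn(n, n) + 1j * np.random.randn(n, n)
lam = 0.7 - 1.2j
q = np.random.randn(n, n) + 1j * np.random.randn(n, n)
q[:-1, -1] = 0.0   # force q into Q: its last column becomes (0, ..., 0, lam)^T
q[-1, -1] = lam    # so q . e_n = lam e_n
assert np.isclose(phi(f, g @ q), lam ** N * phi(f, g))
```

The only property used is homogeneity of f, so the same check passes for any f in V_N.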
The only holomorphic sections of G ×_Q C → G/Q are those φ_f.

Proof. Let φ : G → C be the function corresponding to a holomorphic section of the bundle. We want to define a polynomial P(z_1, ..., z_n) on C^n − {0}. Let g ∈ G be such that gℓ = (z_1, ..., z_n)^T. Then define P(z_1, ..., z_n) = φ(g). To see this is well-defined, let g′ be another element of G satisfying g′ℓ = (z_1, ..., z_n)^T. Then g⁻¹g′ fixes ℓ = e_n and is therefore an element q of Q with λ = 1, so χ(q) = 1. Writing g′ = gq, we have that φ(g′) = χ(q)⁻¹φ(g) = φ(g), and P is well-defined. Moreover, by construction P is homogeneous of degree N. Since we can define P using open sets of G, we have that P is holomorphic on C^n − {0}. The homogeneity condition implies that P is bounded near 0; hence, P admits a holomorphic extension to C^n. Now, the C^∞ behaviour combined with the homogeneity implies that |P(z)| ≤ C|z|^N, and similarly |∂^α_z P(z)| ≤ C_α |z|^{N−|α|} for any multi-index α and z ∈ C^n − {0}. If |α| > N, then ∂^α P vanishes at ∞ and, by Liouville's theorem, is 0. Therefore, the Taylor expansion of P about 0 vanishes in all degrees > N. Hence, P is a polynomial.

This implies that the representation of G on V_N can be realized as the space of sections F^Hol_{Q,χ}. In different terminology, we say that V_N = Ind^G_Q(χ) is the induced representation of G from the representation χ of Q. Now, we can turn to the general situation.

Let K be a compact Lie group with maximal torus T. If G = K_C is the complexification, then the Iwasawa decomposition implies that G = KAN̄ and B = T_C N̄, where N̄ consists of the lower-triangular unipotent matrices. Then by Corollary 5.3.7, we know that G/B ≅ K/T and both are complex manifolds. For any character λ : T → C^×, we can extend λ to a character of T_C and then to B by declaring χ(n̄) =
1. Therefore, we get two line bundles G ×_B C ≅ K ×_T C which are isomorphic as complex manifolds.

Theorem 5.5.2 (Borel-Weil). Let K be a compact, connected Lie group and T ⊆ K a maximal torus. Let G = K_C be the complexification and B = MAN̄ a Borel subgroup. Then the irreducible finite dimensional representations of K stand in one-to-one correspondence with the dominant, analytically integral weights λ ∈ t*, with the correspondence given by

λ ↦ Γ_H(K/T, L_λ) ≅ F^Hol_{B,χ_λ},

where Γ_H(K/T, L_λ) denotes the set of holomorphic sections of the bundle and

F^Hol_{B,χ_λ} = { f : G → C | f(gb) = χ_λ(b)⁻¹ f(g), f holomorphic },

with χ_λ the character of B associated to the analytically integral weight λ.

We present a combination of the proofs presented in [Kna86], [Hel78], and [Hel08]. The proof will proceed in two main steps: 1) show that Γ_H(K/T, L_λ) is finite dimensional, and 2) show that it is irreducible. Throughout the proof, we shall make use of the isomorphism Γ_H(K/T, L_λ) → F^Hol_{T,χ_λ} ≅ F^Hol_{B,χ_λ}. Remark 5.5.3.
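As a sanity check of the statement, specializing to K = SU(2) recovers the V_N construction above. The identifications below are standard but not spelled out in the text, so they should be read as an illustrative aside:

```latex
% K = SU(2), T = the diagonal torus, G = K_C = SL_2(C), B the lower-triangular
% Borel subgroup. The dominant analytically integral weights are labelled by
% integers N >= 0, and K/T \cong G/B \cong \mathbb{CP}^1. Borel-Weil gives
\Gamma_H(\mathbb{CP}^1, L_N) \;\cong\; F^{\mathrm{Hol}}_{B,\chi_N}
  \;\cong\; V_N
  = \{\text{homogeneous polynomials of degree } N \text{ in } z_1, z_2\},
% a space of dimension N + 1: exactly the list of irreducible
% finite-dimensional representations of SU(2), one in each dimension.
```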
Another way of thinking about this theorem is as a classification result for various sheaves on the flag varieties (manifolds) Fl(C^n). Every finite rank vector bundle on Fl(C^n) corresponds to a finite locally free sheaf, with the correspondence given by taking global sections. The theorem above classifies all of the line bundles (considered as sheaves) on Fl(C^n) which admit global sections.

The Lie algebra of G has a Cartan decomposition g = k ⊕ ik corresponding to the Cartan involution θ : g → g. Let Θ be the corresponding involution of G. This gives an Iwasawa decomposition g = k ⊕ a ⊕ n̄. Pick a maximal abelian subalgebra a of ik and form m = Z_k(a), the centralizer of a in k. Then m is a Cartan subalgebra of k and m_C = a ⊕ m. With respect to the roots ∆(k_C, m_C), put

b = m ⊕ a ⊕ ⊕_{α ∈ ∆⁺} (k_C)_{−α}.

Then B = MAN̄ is the corresponding Iwasawa decomposition of the Borel subgroup. Now, let λ ∈ t* be a dominant, analytically integral weight and (Φ_λ, V) the irreducible, finite dimensional highest weight representation of K with highest weight λ. Let v_λ ∈ V be a highest weight vector. This representation extends to a holomorphic representation (also denoted Φ_λ) of G via the universal property of the complexification. For each v ∈ V, define a function ψ_v(x) on G by

ψ_v(x) = (Φ_λ(x)⁻¹ v, v_λ),

where ( , ) is the inner product on V induced via the isomorphism with C^n. Lemma 5.5.4.
For each v ∈ V, ψ_v ∈ F^Hol_{B,χ_λ}. Moreover, if L denotes the left regular action, then L(k)ψ_v = ψ_{Φ_λ(k)v}, and the collection {ψ_v : v ∈ V} is an irreducible subrepresentation of F^Hol_{B,χ_λ} which is equivalent to Φ_λ.

Proof of Lemma 5.5.4. Let ϕ_λ be the differential of Φ_λ. Since Φ_λ is unitary on V, ϕ_λ is skew-hermitian on k and complex-linear on g. Therefore, ϕ_λ(θX) = −ϕ_λ(X)* and Φ_λ(Θx) = Φ_λ(x⁻¹)* for all X ∈ g and x ∈ G. Now if b = man̄ ∈ B = MAN̄, we have that for all x ∈ G

ψ_v(xman̄) = (Φ_λ(man̄)⁻¹ Φ_λ(x)⁻¹ v, v_λ)
           = (Φ_λ(x)⁻¹ v, Φ_λ(ma⁻¹n) v_λ)        (as Θ(man̄) = ma⁻¹n ∈ MAN)
           = (Φ_λ(x)⁻¹ v, Φ_λ(ma⁻¹) v_λ)          (as v_λ is a highest weight vector)
           = (Φ_λ(x)⁻¹ v, χ_λ(m) χ_λ(a)⁻¹ v_λ)    (as v_λ has weight λ)
           = χ_λ(m)⁻¹ χ_λ(a)⁻¹ (Φ_λ(x)⁻¹ v, v_λ)  (χ_λ(m) has unit modulus, χ_λ(a) > 0)
           = χ_λ(b)⁻¹ ψ_v(x).

Further, ψ_v is clearly holomorphic, as it is defined by a holomorphic representation. Hence, ψ_v ∈ F^Hol_{B,χ_λ}. Finally,

ψ_{Φ_λ(k)v}(x) = (Φ_λ(x)⁻¹ Φ_λ(k) v, v_λ) = (Φ_λ(k⁻¹x)⁻¹ v, v_λ) = ψ_v(k⁻¹x) = L(k)ψ_v(x).

This completes the proof of the lemma.

Now we wish to show that V → F^Hol_{B,χ_λ} is onto. Put ψ_λ := ψ_{v_λ} and F_λ := F^Hol_{B,χ_λ}. Lemma 5.5.5.
Let F ∈ F_λ. Then

∫_M F(mxm⁻¹) dm = F(1) ψ_λ(x)

for all x ∈ G. (Here dm is the normalized Haar measure on M.)

The idea of the proof is to show that the left side is a multiple of F(1) independent of F. This multiple is a power series in x and, evaluating at F = ψ_λ, we see that they are equal near 1. By holomorphicity, the functions are thus equal everywhere. Proof of Lemma 5.5.5.
Let X ∈ g and X̃ the corresponding left invariant vector field on G. Since F is holomorphic, it is real-analytic, and thus the Taylor series of F converges to F in a neighbourhood of 1. Thus

F(exp X) = ∑_n (1/n!) (X̃ⁿ F)(1).

Conjugating by m and integrating, we see that

∫_M F(m exp(X) m⁻¹) dm = ∑_n (1/n!) ( { ∫_M Ad(m) X̃ⁿ dm } F )(1).

Now let {X_α, H_α, X_{−α}} be a basis of g with respect to a positive choice of roots. Writing X in terms of this basis and expanding, we get integrals of monomials. The coefficients can be factored out, as X̃ is complex-linear as an endomorphism of F_λ. Now, by the Poincaré-Birkhoff-Witt Theorem, we can rewrite the expression as a linear combination of Ad(m) applied to monomials of the form X^{i_1}_{−α_1} ⋯ X^{i_p}_{−α_p} H^{j_1}_{α_1} ⋯ H^{j_q}_{α_q} X^{k_1}_{α_1} ⋯ X^{k_r}_{α_r}. The integral of each monomial is now Ad(m)-invariant and a monomial. If this new monomial has no X_{−α} term for α ∈ ∆⁺, then by Ad(m)-invariance it cannot have any X_α term. On the other hand, any Ad(m)-invariant polynomial cannot have any X_{−α} terms, as the vector field X̃_{−α} annihilates F (exp(tX_{−α}) ∈ N̄ and χ_λ is trivial on N̄). Hence, all of the Ad(m)-invariant polynomials lie in U(m_C), and as exp(m_C) ⊆ MA ⊆ B, each member of U(m_C) acts by scalars depending only on λ. Hence, any expression of the form H^{j_1}_{α_1} ⋯ H^{j_n}_{α_n} F(1) is a scalar multiple of F(1) independent of F. This implies the lemma.

Now we can prove Theorem 5.5.2 in a few easy steps. Proof of Theorem 5.5.2.
Define an inner product on F_λ by

⟨F_1, F_2⟩ = ∫_K F_1(k) \overline{F_2(k)} dk.

Claim 5.5.6. |F(1)| ≤ ||ψ_λ||⁻¹ · ||F||.

In fact,

||F||² = ∫_K |F(k)|² dk = ∫_K |F(mkm⁻¹)|² dk = ∫_K ∫_M |F(mkm⁻¹)|² dm dk
       ≥ ∫_K | ∫_M F(mkm⁻¹) dm |² dk = |F(1)|² ∫_K |ψ_λ(k)|² dk = |F(1)|² ||ψ_λ||².

As a direct corollary of this, for every compact E ⊆ G, there exists a C_E < ∞ such that

|F(x)| ≤ C_E ||F||

for all F ∈ F_λ and x ∈ E. Therefore, F_λ is complete (Cauchy sequences converge on compact sets by the previous line, and their limit is holomorphic and satisfies the desired relation). Now, F_λ is finite-dimensional, as it is a locally compact Banach space.

It remains to be shown that F_λ is irreducible as a representation of K. Let U ⊆ F_λ be a closed, invariant subspace. Then for F ≠ 0 in U, by applying some L(k), we can assume that F(1) ≠ 0. Therefore by completeness

∫_M χ_λ(m) L(m) F dm

is an element of U. However, Lemma 5.5.5 says that this is equal to F(1)ψ_λ. Hence, ψ_λ ∈ U. Similarly, if U⊥ were non-zero, we would see that ψ_λ ∈ U⊥. Since U ∩ U⊥ = 0, this is a contradiction, and thus U = F_λ; hence F_λ is an irreducible, finite-dimensional representation of K. By Lemma 5.5.4 the map V → F_λ is a K-equivariant isomorphism. This completes the proof.

This result shows us that we can derive some algebraic information from a geometric object. In the language of Chapter 3, this theorem can be restated as: H⁰(G/B, F_λ) ≠ 0 if and only if λ is dominant and analytically integral. In fact, a stronger form of this theorem due to Bott [FH04] says that the sheaf cohomology of the associated bundle is non-zero in only one degree. This surprising appearance of sheaf cohomology indicates that it may prove to be useful in understanding the sheaf G of Chapter 4, as well as in understanding C^∞(µ(−)) as a G-module. Some care needs to be taken here, as we do not know much about the category G-Mod. In fact, the case of O_X-modules for a locally ringed space may deviate highly from this situation in some critical ways; one being that there is no reason a priori that G_x is a local ring. We do not provide a resolution to this here, and thus there is still much work to be done.

One major deficit of the model in Chapter 4 is its dependence on the odor source representations being clean and precise. What should happen if an odor is presented in an environment which is particularly noisy? For example, consider a fox in the wilderness. If the fox is eating a meal, the odors are in high concentration and thus can be distinguished. If instead it is trotting along and the odor of rabbit wafts through the air, how may it determine what this odor is? There are clearly many other odors present in the second situation, which should make identification nearly impossible. This contradicts experimental and observational evidence, however!
We know that foxes can find their prey with minimal odor stimulation; this implies the existence of some mechanism which produces a "best guess" for what a given noisy odor may be. As we shall see below, there is a naive way of modeling such a problem which we conjecture is indeed the correct approach. This naive method relies on vector fields on S and generates an attractor basin for the various odors. This has been shown to have some relation to Čech cohomology, which can be viewed as a refinement of sheaf cohomology. This ties together all of the ideas presented. We will not go through the construction of Čech cohomology, as it is a bit involved and the main idea behind it is to serve as a computational tool for sheaf cohomology on suitably nice spaces (of which manifolds are examples).

In general, the theory of flows is a generalization of the theory of ordinary differential equations. Now, the equations are defined on manifolds by vector fields ξ : M → TM. We shall not treat the general case here but refer the reader to [Lee12]. Our situation is significantly eased, as R′ and therefore S are assumed to be diffeomorphic to open submanifolds of R^n, and therefore TR′ ≅ TS ≅ S × R^n. So there exist vector fields {V_1, ..., V_n} which span the tangent space at each s ∈ S. As a result, there exists a non-trivial vector field ξ which is complete, meaning that every trajectory can be given R as a domain. By a trajectory we mean a smooth map γ : R → S such that γ′(t) = ξ(γ(t)). In general, this is the solution of a differential equation, and these trajectories are called maximal.

Definition 5.6.1. Let ξ ∈ Γ(S, TS). The flow of ξ is the mapping θ : S × R → S given by (s, t) ↦ γ_s(t), where γ_s(0) = s and γ_s is the maximal trajectory.

We can use flows to understand noisy inputs into the olfactory system. Let K be the collection of s ∈ S such that s is a local maximum of the function f defining S.
That is, these are the "tops" of the peaks. Now define a smooth vector field on S which makes K an attractor. Then the attractor basin is the disjoint union of a finite number of contractible open sets. What we would like to know is that the attractor basin is a cover of S, so that any point can be drawn into one of the peaks and identified as in the classification scheme of Chapter 4. This would allow us to identify any noisy odor (one for which Ũ_x is particularly large) with some degree of accuracy. Sadly, this cannot be guaranteed, as a cover would rely on exposure to an enormous number of different odors (then we can assume that the U_x form a cover of R′ and thus the Ũ_x form a cover of S) or some nearly equivalent requirement. As a consolation, we can still identify noisy odors which fall within the attractor basin of the learned odors.

Let us connect the idea of flows to representations. Let X ∈ Γ(M, TM) be a complete smooth vector field and θ : R × M → M the associated flow. This is equivalent to defining an action of the Lie group (R, +) on M, and therefore a non-linear representation of R, here as diffeomorphisms of M. By differentiating this action, we get a non-linear Lie algebra representation R → X(M). As we can realize flows as solutions to certain non-linear partial differential equations, we can equivalently understand these solutions by understanding the corresponding representation on either the Lie group or Lie algebra level. This is one reason representation theory may play a key role in the further development of this theory and in understanding the identification of noisy odors.

Thanks for reading!

Bibliography

[AK18] A. J. Aqrabawi and J. C. Kim. Hippocampal projections to the anterior olfactory nucleus differentially convey spatiotemporal information during episodic odour memory. Nat Commun, 9(1):2735, 07 2018.
[AK20] A. J. Aqrabawi and J. C. Kim.
Olfactory memory representations are stored in the anterior olfactory nucleus. Nat Commun, 11(1):1246, Mar 2020.
[AM69] M. F. Atiyah and I. G. Macdonald. Introduction to Commutative Algebra. Advanced Book Program. Westview Press, 1969.
[AR18] D. Aschauer and S. Rumpel. The sensory neocortex and associative memory.
Current Topics in Behavioral Neurosciences, 37:177–211, 2018.
[ASC+04] N. M. Abraham, H. Spors, A. Carleton, T. W. Margrie, T. Kuner, and A. T. Schaefer. Maintaining accuracy at the expense of speed: stimulus similarity defines odor discrimination time in mice. Neuron, 44(5):865–876, Dec 2004.
[BC19] Ayon Borthakur and Thomas A. Cleland. Signal conditioning for learning in the wild. In Proceedings of the 7th Annual Neuro-Inspired Computational Elements Workshop, NICE ’19, New York, NY, USA, 2019. Association for Computing Machinery.
[BFC17] M. D. Berke, D. J. Field, and T. A. Cleland. The sparse structure of natural chemical environments. In , pages 1–3, May 2017.
[BLFL06] B. Bathellier, S. Lagier, P. Faure, and P. M. Lledo. Circuit properties generating gamma oscillations in a network model of the olfactory bulb. J. Neurophysiol., 95(4):2678–2691, Apr 2006.
[BMA+15] A. Banerjee, F. Marbach, F. Anselmi, M. S. Koh, M. B. Davis, P. Garcia da Silva, K. Delevich, H. K. Oyibo, P. Gupta, B. Li, and D. F. Albeanu. An Interglomerular Circuit Gates Glomerular Output and Implements Gain Control in the Mouse Olfactory Bulb.
Neuron , 87(1):193–207, Jul 2015.[Bre97] Glen E. Bredon.
Sheaf Theory , volume 170 of
Graduate Texts in Mathematics. Springer Science+Business Media LLC, 1997.
[Bum13] Daniel Bump.
Lie Groups , volume 225 of
Graduate Texts in Mathematics. Springer Science+Business Media New York, LLC, 2nd edition, 2013.
[BW13] S. Marc Breedlove and Neil V. Watson. Biological psychology: An introduction to behavioral, cognitive, and clinical neuroscience, 7th ed. Sinauer Associates, 2013.
[Car11] David Joseph Carchedi. Categorical Properties of Topological and Differentiable Stacks. PhD thesis, Utrecht University, 2011.
[CBC20] T. A. Cleland, A. Borthakur, and A. Calambur. TBD: Biological insights from engineered systems. Frontiers in Computational Neuroscience, In preparation, 2020.
[CCH+11] T. A. Cleland, S. Y. Chen, K. W. Hozer, H. N. Ukatu, K. J. Wong, and F. Zheng. Sequential mechanisms underlying concentration invariance in biological olfaction.
Front Neuroeng, 4:21, Nov 2011.
[CL05a] Thomas A. Cleland and Christiane Linster. Computation in the Olfactory System. Chemical Senses, 30(9):801–813, 11 2005.
[CL05b] Henri Cohen and Claire Lefebvre, editors. Handbook of Categorization in Cognitive Science. Elsevier, 2005.
[Cla19] J. P. Clapper. Graded similarity in free categorization. Cognition, 190:1–19, Sep 2019.
[Cle14] Thomas A. Cleland. Chapter 7 - construction of odor representations by olfactory bulb microcircuits. In Edi Barkai and Donald A. Wilson, editors, Odor Memory and Perception, volume 208 of Progress in Brain Research, pages 177–203. Elsevier, 2014.
[CMYL02] T. A. Cleland, A. Morse, E. L. Yue, and C. Linster. Behavioral models of odor similarity. Behav. Neurosci., 116(2):222–231, Apr 2002.
[CNB09] T. A. Cleland, V. A. Narla, and K. Boudadi. Multiple learning parameters differentially regulate olfactory generalization. Behav. Neurosci., 123(1):26–35, Feb 2009.
[Coo15] Bruce N. Cooperstein. Advanced Linear Algebra. Textbooks in Mathematics. CRC Press, 2nd edition, 2015.
[CPdLCPL+16] M. Chatterjee, F. Perez de Los Cobos Pallares, A. Loebel, M. Lukas, and V. Egger. Sniff-Like Patterned Input Results in Long-Term Plasticity at the Rat Olfactory Bulb Mitral and Tufted Cell to Granule Cell Synapse. Neural Plast., 2016:9124986, 2016.
[CPO18] Yubei Chen, Dylan M. Paiton, and Bruno A. Olshausen. The sparse manifold transform, 2018.
[CRC13] Jason B. Castro, Arvind Ramanathan, and Chakra S. Chennubhotla. Categorical dimensions of human odor descriptor space revealed by non-negative matrix factorization. PLoS ONE, 8(9):1–16, 09 2013.
[CS06] T. A. Cleland and P. Sethupathy. Non-topographical contrast enhancement in the olfactory bulb.
BMC Neurosci , 7:7, Jan 2006.[DF04] David S. Dummit and Richard M. Foote.
Abstract Algebra. John Wiley & Sons Inc., 3rd edition, 2004.
[DR08] W. Doucette and D. Restrepo. Profound context-dependent plasticity of mitral cell responses in olfactory bulb.
PLoS Biol. , 6(10):e258, Oct 2008.[EG79] Murray Eisenberg and Robert Guy. A proof of the hairy ball theorem.
The American Mathematical Monthly , 86(7):571–574, 1979.[EH00] David Eisenbud and Joe Harris.
The Geometry of Schemes, volume 197 of Graduate Texts in Mathematics. Springer Science+Business Media New York, 2000.
[ES12] S. Edelman and R. Shahbazi. Renewing the respect for similarity. Front Comput Neurosci, 6:45, 2012.
[ET10] G. Bard Ermentrout and David Terman. Mathematical Foundations of Neuroscience. Springer Science+Business Media, 2010.
[FBB+16] D. E. Frederick, A. Brown, E. Brim, N. Mehta, M. Vujovic, and L. M. Kay. Gamma and Beta Oscillations Define a Sequence of Neurocognitive Modes Present in Odor Processing.
J. Neurosci., 36(29):7750–7767, 07 2016.
[FBT+17] D. E. Frederick, A. Brown, S. Tacopina, N. Mehta, M. Vujovic, E. Brim, T. Amina, B. Fixsen, and L. M. Kay. Task-Dependent Behavioral Dynamics Make the Case for Temporal Integration in Multiple Strategies during Odor Processing.
J. Neurosci. , 37(16):4416–4426, 04 2017.[FF16] Anatoly Fomenko and Dmitry Fuchs.
Homotopical Topology, volume 273 of Graduate Texts in Mathematics. Springer International Publishing Switzerland, 2016.
[FH04] William Fulton and Joe Harris. Representation Theory: A First Course, volume 129 of Graduate Texts in Mathematics. Springer Science+Business Media New York, 2004.
[Fri15] P. Fries. Rhythms for Cognition: Communication through Coherence. Neuron, 88(1):220–235, Oct 2015.
[GH10] Robert L. Goldstone and Andrew T. Hendrickson. Categorical perception.
WIREs Cognitive Science , 1(1):69–78, 2010.[GP74] Victor Guillemin and Alan Pollack.
Differential Topology. American Mathematical Society Chelsea Publishing, 1974.
[GPCI15] Chad Giusti, Eva Pastalkova, Carina Curto, and Vladimir Itskov. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44):13455–13460, 2015.
[GS09] Y. Gao and B. W. Strowbridge. Long-term plasticity of excitatory inputs to granule cells in the rat olfactory bulb. Nat. Neurosci., 12(6):731–733, Jun 2009.
[Har77] Robin Hartshorne.
Algebraic Geometry , volume 52 of
Graduate Texts inMathematics . Springer Science+Business Media LLC, 1977.[Har87] Stevan Harnad, editor.
Categorical Perception: The Groundwork of Cogni-tion . Cambridge University Press, 1987.[Hat01] Allen Hatcher.
Algebraic Topology . Cambridge University Press, 2001.[HC56] Harish-Chandra. Representations of semisimple lie groups, v.
AmericanJournal of Mathematics , 78(1):1–41, 1956.[HE96] Rachel S. Herz and Trygg Engen. Odor memory: review and analysis.
Psychonomic Bulletin & Review , 3(3):300–313, 1996.[Hel78] Sigurdur Helgason.
Differential Geometry, Lie Groups, and SymmetricSpaces , volume 34 of
Graduate Studies in Mathematics . American Math-ematical Society, 1978.[Hel08] Sigurder Helgason.
Geometric Analysis on Symmetric Spaces , volume 39 of
Mathematical Surveys and Monographs . American Mathematical Society,2008.[Her05] R. S. Herz. Odor-associative learning and emotion: effects on perceptionand behavior.
Chem. Senses , 30 Suppl 1:i250–251, Jan 2005.[HS97] P.J. Hilton and U. Stammbach.
A Course in Homological Algebra , volume 4of
Graduate Texts in Mathematics . Springer Science+Business Media NewYork, 1997.[HWK +
10] Rafi Haddad, Tali Weiss, Rehan Khan, Boaz Nadler, Nathalie Mandairon,Moustafa Bensafi, Elad Schneidman, and Noam Sobel. Global features ofneural activity in the olfactory system form a parallel code that predictsolfactory behavior and perception.
Journal of Neuroscience , 30(27):9017–9026, 2010. 162IC20] Nabil Imam and Thomas A. Cleland. Rapid online learning and robustrecall in a neuromorphic olfactory circuit.
Nature Machine Intelligence ,2:181–191, 2020.[IS98] J. S. Isaacson and B. W. Strowbridge. Olfactory reciprocal synapses: den-dritic signaling in the cns.
Neuron , 20(4):749–761, Apr 1998.[Ive86] Birger Iversen.
Cohomology of Sheaves . Universitext. Springer-VerlagBerlin Heidelberg, 1986.[Kas95] Christian Kassel.
Quantum Groups , volume 155 of
Graduate Texts in Math-ematics . Springer Science+Business Media New York, 1995.[Kay14] L. M. Kay. Circuit oscillations in odor perception and memory.
Prog.Brain Res. , 208:223–251, 2014.[KKER11] Alexei Koulakov, Brian Kolterman, Armen Enikolopov, and Dmitry Rin-berg. In search of the structure of human olfactory space.
Frontiers inSystems Neuroscience , 5:65, 2011.[Kna86] Anthony W. Knapp.
Representation Theory of Semisimple Groups: AnOverview Based on Examples (PMS-36) . Princeton University Press, rev- revised edition, 1986.[Kna88] Anthony K. Knapp.
Lie Groups, Lie Algebras, and Cohomology , volume 34of
Mathematical Notes . Princeton University Press, 1988.[Kna96] Anthony K. Knapp.
Lie Groups Beyond an Introduction , volume 140 of
Progress in Mathematics . Springer Sceince+Business Media, 1996.[Kna05a] Anthony K. Knapp.
Lie Groups Beyond an Introduction , volume 140 of
Progress in Mathematics . Springer Sceince+Business Media, 2nd edition,2005.[Kna05b] Anthony W. Knapp.
Advanced Real Analysis . Cornerstones in Mathemat-ics. Birkhäuser Boston, 2005.[Kna05c] Anthony W. Knapp.
Basic Real Analysis . Cornerstones in Mathematics.Birkhäuser Boston, 2005.[Kna06] Anthony W. Knapp.
Basic Algebra . Cornerstones in Mathematics.Birkhauser Boston, 2006.[Kna07] Anthony W. Knapp.
Advanced Algebra . Cornerstones in Mathematics.Birkhäuser Boston, 2007.[KS06] Masaki Kashiwara and Pierre Schapira.
Categories and Sheaves , volume332 of
Grundlehren der mathematischen Wissenschaften . Springer, 2006.163KSS +
10] F. Kermen, S. Sultan, J. Sacquet, N. Mandairon, and A. Didier. Consoli-dation of an olfactory memory trace in the olfactory bulb is required forlearning-induced survival of adult-born neurons and long-term memory.
PLoS ONE , 5(8):e12118, Aug 2010.[KSUM99] H. Kashiwadani, Y. F. Sasaki, N. Uchida, and K. Mori. Synchronized os-cillatory discharges of mitral/tufted cells with different molecular recep-tive ranges in the rabbit olfactory bulb.
J. Neurophysiol. , 82(4):1786–1792,Oct 1999.[Lan02] Serge Lang.
Algebra , volume 211 of
Graduate Texts in Mathematics .Springer Science+Business Media LLC, 3 edition, 2002.[LC13a] G. Li and T. A. Cleland. A two-layer biophysical model of choliner-gic neuromodulation in olfactory bulb.
J. Neurosci. , 33(7):3037–3058, Feb2013.[LC13b] Goushi Li and Thomas A. Cleland. A two-layer biophysical model ofcholinergic neuromodulation in olfactory bulb.
The Journal of Neuro-science , 33(7):3037–3058, 2013.[Lee11] John M. Lee.
Introduction to Topological Manifolds , volume 202 of
GraduateTexts in Mathematics . Springer Science+Business Media LLC, 2011.[Lee12] John M. Lee.
Introduction to Smooth Manifolds , volume 218 of
GraduateTexts in Mathematics . Springer Science+Business Media LLC, 2012.[LH99] C. Linster and M. E. Hasselmo. Behavioral responses to aliphatic aldehy-des can be predicted from known electrophysiological responses of mi-tral cells in the olfactory bulb.
Physiol. Behav. , 66(3):497–502, May 1999.[LKA +
20] M. Levinson, J. P. Kolenda, G. J. Alexandrou, O. Escanilla, D. M. Smith,T. A. Cleland, and C. Linster. Context-dependent odor learning requiresthe anterior olfactory nucleus.
Behav. Neurosci. , in press, 2020.[LMSW09] Christiane Linster, Alka V. Menon, Christopher Y. Singh, and Donald A.Wilson. Odor-specific habituation arises from interaction of afferentsynaptic adaptation and intrinsic synaptic potentiation in olfactory cor-tex.
Learning & Memory , 16(7):452–459, Jul 2009.[Lor18] Martin Lorenz.
A Tour of Representation Theory , volume 193 of
GraduateStudies in Mathematics . American Mathematical Society, 2018.[Mat86] Hideyuki Matsamura.
Commutative Ring Theory , volume 8 of
Cambridgestudies in advanced mathematics . Cambridge University Press, 1986.[MBB19] Ariella Y. Moser, Lewis Bizo, and Wendy Y. Brown. Olfactory generaliza-tion in detector dogs.
Animals: an Open Access Journal from MDPI , 9(9),Sep 2019. 164Mei15] M. Meister. On the dimensionality of odor space. eLife , 4:e07865, 2015.[Met03] David Metzler. Topological and Smooth Stacks. arXiv Mathematics e-prints , page math/0306176, Jun 2003.[MKC +
14] N. Mandairon, F. Kermen, C. Charpentier, J. Sacquet, C. Linster, andA. Didier. Context-driven activation of odor representations in the ab-sence of olfactory stimuli in the olfactory bulb and piriform cortex.
FrontBehav Neurosci , 8:138, 2014.[ML71] Saunders Mac Lane.
Categories for the Working Mathematician , volume 5of
Graduate Texts in Mathematics . Springer-Verlag New York Inc., 1971.[MLE +
09] Mélissa Moreno, Christiane Linster, Olga Escanilla, Joëlle Sacquet, AnneDidier, and Nathalie Mandairon. Olfactory perceptual learning requiresadult neurogenesis.
Proceedings of the National Academy of Sciences of theUnited States of America , 106:17980–5, 10 2009.[MNS81a] K. Mori, M. C. Nowycky, and G. M. Shepherd. Analysis of synapticpotentials in mitral cells in the isolated turtle olfactory bulb.
J. Physiol.(Lond.) , 314:295–309, May 1981.[MNS81b] K. Mori, M. C. Nowycky, and G. M. Shepherd. Electrophysiological anal-ysis of mitral cells in the isolated turtle olfactory bulb.
J. Physiol. (Lond.) ,314:281–294, May 1981.[MR07] X. Maio and R.P. Rao. Learning the lie groups of visual invariance.
NeuralComputations , 19:2665–2693, 2007.[MSN +
11] N. Mandairon, S. Sultan, M. Nouvian, J. Sacquet, and A. Didier. Involve-ment of newborn neurons in olfactory associative learning? The operantor non-operant component of the task makes all the difference.
J. Neu-rosci. , 31(35):12455–12460, Aug 2011.[Mun00] James R. Munkres.
Topology . Pearson, 2000.[NPLR14] A. Nunez-Parra, A. Li, and D. Restrepo. Coding odor identity and odorvalue in awake rodents.
Prog. Brain Res. , 208:205–222, 2014.[PGJSTH19] Fernanda Pérez-Gay Juárez, Tomy Sicotte, Christian Thériault, and Ste-van Harnad. Category learning can alter perception and its neural corre-lates.
PLoS ONE , 14(12):1–29, 12 2019.[RGMR18] D. Ramirez-Gordillo, M. Ma, and D. Restrepo. Precision of Classifica-tion of Odorant Value by the Power of Olfactory Bulb Oscillations Is Al-tered by Optogenetic Silencing of Local Adrenergic Innervation.
FrontCell Neurosci , 12:48, 2018. 165RKG06] D. Rinberg, A. Koulakov, and A. Gelperin. Speed-accuracy tradeoff inolfaction.
Neuron , 51(3):351–358, Aug 2006.[Rot88] Joseph J. Rotman.
An Introduction to Algebraic Topology , volume 119 of
Graduate Texts in Mathematics . Springer-Verlag New York Inc., 1988.[Rot09] Joseph J. Rotman.
An Introduction to Homological Algebra . Universitext.Springer Science+Business Media LLC, 2009.[Rot15] Joseph J Rotman.
Advanced Modern Algebra: Part 1 , volume 165 of
Gradu-ate Studies in Mathematics . American Mathematical Society, third edition,2015.[RPS +
13] J. P. Royet, J. Plailly, A. L. Saive, A. Veyrac, and C. Delon-Martin. Theimpact of expertise in olfaction.
Front Psychol , 4:928, Dec 2013.[Rya02] Raymond A. Ryan.
Introduction to Tensor Products of Banach Spaces .Springer Monographs in Mathematics. Springer-Verlag London, 2002.[SCT07] Richard J. Stevenson, Trevor I. Case, and Caroline Tomiczek. Resistanceto interference of olfactory perceptual learning.
The Psychological Record ,57:103–116, 2007.[SdCSL04] Armen Saghatelyan, Antoine de Chevigny, Melitta Schachner, andPierre-Marie Lledo. Tenascin-r mediates activity-dependent recruit-ment of neuroblasts in the adult mouse forebrain.
Nature Neuroscience ,7(4):347–356, Apr 2004.[Ser54] Jean-Pierre Serre. Représentations linéaires et espaces homogènes käh-lériens des groupes de lie compacts. In
Séminaire Bourbaki : années 1951/52- 1952/53 - 1953/54, exposés 50-100 , number 2 in Séminaire Bourbaki, pages447–454. Société mathématique de France, 1954. talk:100.[She87] RN Shepard. Toward a universal law of generalization for psychologicalscience.
Science , 237(4820):1317–1323, 1987.[Str09] B. W. Strowbridge. Role of cortical feedback in regulating inhibitory mi-crocircuits.
Ann. N. Y. Acad. Sci. , 1170:270–274, Jul 2009.[TPC14a] M. T. Tong, S. T. Peace, and T. A. Cleland. Properties and mechanisms ofolfactory learning and memory.
Front Behav Neurosci , 8:238, 2014.[TPC14b] Michelle T. Tong, Shane T. Peace, and Thomas A. Cleland. Properties andmechanisms of olfactory learning and memory.
Frontiers in BehavioralNeuroscience , 8, Jul 2014.[Tu11] Loring W. Tu.
An Introduction to Manifolds . Universitext. Springer Sci-ence+Business Media LLC, 2011.166VKS +
15] J. Vinera, F. Kermen, J. Sacquet, A. Didier, N. Mandairon, and M. Richard.Olfactory perceptual learning requires action of noradrenaline in theolfactory bulb: comparison with olfactory associative learning.
Learn.Mem. , 22(3):192–196, Mar 2015.[VRC17] Jonathan D. Victor, Syed M. Rizvi, and Mary M. Conte. Two representa-tions of a high-dimensional perceptual space.
Vision Research , 137:1–23,Aug 2017.[Wed16] Torsten Wedhorn.
Manifolds, Sheaves, and Cohomology . Springer StadiumMathematik-Master. Springer Fachmedien Wiesbaden, 2016.[WS03] D. A. Wilson and R. J. Stevenson. The fundamental role of memory inolfactory perception.
Trends Neurosci. , 26(5):243–247, May 2003.[WS06] Donald A. Wilson and Richard J. Stevenson.
Learning to smell: olfactoryperception from neurobiology to behavior . Johns Hopkins University Press,United States, 2006.[ZKU +
13] H. A. Zariwala, A. Kepecs, N. Uchida, J. Hirokawa, and Z. F. Mainen. Thelimits of deliberation in a perceptual decision task.
Neuron , 78(2):339–351,Apr 2013.[ZS06] Manuel Zarzo and David T. Stanton. Identification of Latent Variables ina Semantic Odor Profile Database Using Principal Component Analysis.
Chemical Senses , 31(8):713–724, 07 2006.[ZSS18] Yuansheng Zhou, Brian H. Smith, and Tatyana O. Sharpee. Hyperbolicgeometry of the olfactory space.
Science Advances , 4(8), 2018.[ZVM +
13] Q. Zaidi, J. Victor, J. McDermott, M. Geffen, S. Bensmaia, and T. A. Cle-land. Perceptual spaces: mathematical structures to neural mechanisms.