A Differential Topological Model for Olfactory Learning and Representation
It’s Amazing it Works at All
Jack Alexander Cook

arXiv preprint [q-bio.NC]

To Janet and Ralph

Acknowledgements

I would like to thank my advisor Thomas Cleland for his patience and guidance throughout my undergraduate career. Without him, this thesis would not have materialized. And to my parents, who supported me always: thank you.

Preface
This thesis is designed to be a self-contained exposition of the neurobiological and mathematical aspects of sensory perception, memory, and learning, with a bias towards olfaction. The final chapters introduce a new approach to modeling, focusing on the geometry of the system as opposed to element-wise dynamics. Additionally, we construct an organism-independent model for olfactory processing: something which is currently missing from the literature.

Chapter 1 serves as an introduction to the basic biology, structure, and functions of the olfactory system and the related regions of the brain. Starting with the nasal cavity, odors excite receptors which in turn relay information to the olfactory bulb (we will often refer to this simply as the bulb). From the bulb, information is sent to piriform cortex, which projects onto a myriad of structures, among them the hippocampus, anterior olfactory nucleus, and amygdala. We discuss neuromodulation and some conjectures about higher-order processing (post bulb).

In Chapter 2, we take a brief aside to discuss some basic algebra, which makes up the first half of the mathematical material needed to understand the later chapters. We begin the tour with set theory, where we lay down the preliminaries on functions, set-theoretic notation, and various definitions which will appear consistently throughout this text. The next stop is group theory, where we study the symmetries of objects and build the notion of an action on a set. We then pass to ring theory, where we discuss ideals, morphisms, and hidden group structures. Rings show up naturally in Chapter 3 and play an important role in the theory of sections. We end the tour of basic structures with a discussion of fields and polynomial rings with coefficients in a field. This leads to a natural discussion of higher-order structures such as vector spaces and modules, the latter being an integral component of the model.

Chapter 3 forms the second half of the mathematical underpinnings for Chapters 4 and 5.
Here we discuss geometry and topology, and give a brief introduction to the theory of categories, sheaves, and differentiable stacks. Topology studies the intrinsic properties of a space endowed with a topology; it concerns itself with ideas such as connectedness, compactness, and continuity. Geometry studies calculus on these spaces and very quickly leads to the ideas of flows, geodesics, and Lie groups. The terminal topics are abstractions of the notions of set and function. These provide a convenient language in which to discuss some of the algebraic invariants attached to a topological space.

Chapter 4 makes up the entirety of the original research of this thesis. We first explore the topological and geometric properties of the physical and perceptual spaces involved in the olfactory system, and discuss how the use of vector bundles and non-canonical maps from a bundle to its base space provides insight into the geometry of the system as a whole. We conclude with future directions of research and unanswered questions, along with some conjectures about the model.

Chapter 5 will focus on potential new areas of investigation. The majority of this chapter covers representation theory and culminates with the Borel-Weil theorem, which realizes representations of certain groups as sections of line bundles. This geometric view of the situation makes it natural to consider sheaves. As the ultimate theorem will tell us, there is some interesting information contained in sheaf cohomology that cannot be accessed through other means.

Contents (excerpt)

R-Mod
3.2 Topology
3.2.1 Topological Spaces and Continuous Maps
3.2.2 Basic Algebraic Topology
3.3 Differentiable Manifolds and Vector Bundles
3.3.1 Smooth Maps and the category Man∞
3.3.2 Sheaves
3.3.3 de Rham Cohomology
A Geometric Framework for Olfactory Learning and Processing
4.2.1 R-Space
4.2.2 Glomerular-layer computations, R′
4.2.3 M-Space
4.2.4 S-Space
4.2.5 Forms and timescales of odor learning
4.2.6 The construction of hierarchical odor categories
4.2.7 "Olfactory space" is not hyperbolic

Chapter 1
Olfaction and the Problem of Learning

This section is intended to be a crash course in the neurobiology of the olfactory system and the various computational aspects of neuroscience. We assume a passing knowledge of general neuroscience: the broad organization of the brain, the structure of a neuron, the biochemistry of action potentials, the existence of neurotransmitters, the structure of a synapse, and feedback loops, at the level of [BW13]. The main goal of this section is to introduce the idea of categorical perception and apply it to olfaction.
Sensory systems are the backbone of human perception and form the only means by which humans (and all animals) can gain information about the outside world. Although the exact number of distinct senses is debated, it is generally agreed upon that humans have 6-7 main ones, which occupy a large portion of the brain and almost all of the cortical space devoted to perception [BW13]. One could spend an enormous number of pages discussing the intricacies of each of these sensory systems and their associated perceptual constructions. As the main focus of this thesis is to understand olfactory processing, we shall only give a broad introduction to the other sensory systems and leave the remaining details to the many references.
Remark 1.1.1.
For the remainder of this chapter, all definitions are operational (they may change between researchers) unless otherwise noted. We shall give some explanation of the definitions in the cases where we deviate from the standard references.

The easiest way to begin an analysis of these systems is to understand the basic neurophysiology.
Definition 1.1.2. A sensory system is a part of the nervous system consisting of sensory neurons, a neural pathway, and a cortical area.

Sensory systems play a key role in every action the body performs: from simple things like standing up straight and picking up a glass of water, to more complex tasks like skiing or identifying someone's face in a dimly lit room. To better understand these objects, let's investigate a few well-known examples.

Example 1.1.3.

(a) Vision: In this case, the sensory neurons are rods and cones. These light-sensitive cells transmit information to the optic nerve, which relays it to visual cortex. In fact, different cells along the pathways from the retina to the occipital lobe have varying receptive fields. This variance contributes to the processing of an image.

(b)
Audition: Vibrations in the basilar membrane, due to sound waves coming into contact with the ear drum, lead to the vibration of hair cells. This motion induces action potentials in the auditory nerve. From here the signal partially decussates on its way to the temporal lobes, where further processing occurs.

We can further divide the sensory systems into those which have chemical stimuli and those which do not. The chemical senses depend on molecular interactions in the sensory neurons to facilitate the transformation from stimulus to perception. In the case of gustation, there are five distinguished "tastes": salty, sweet, sour, bitter, and umami. These all correspond to different molecules interacting with the papillae on the tongue. Something which tastes more "salty" is directly related to the Na+ ions present in the solution of saliva and food. In contrast to this, we have audition. The pertinent objects here are pressure waves in the air, which vibrate the ear drum, which in turn vibrates the bones of the inner ear and causes waves in the basilar membrane. These waves perturb the ends of the hair cells and induce action potentials. The point of these examples is to show that sensory systems have a wide array of possible stimuli.

Now that we have a basic (and grossly vague) description of the structure of a sensory system, one may ask, "what is the purpose of such a system?" Beyond the obvious answer (we need a way to interact with our environment), there are some subtle and incredibly important operations that sensory systems accomplish. The main references for this section are [Har87] and [CL05b].

The main operations we will discuss are learning, representation, and categorization. They are closely related and, from some perspectives, even intertwined. In fact, we can think of learning and representation as disparate ideas, whereas categorical perception seeks, in some sense, to unify these ideas.
The motivation for studying such a construction first originated in vision and speech, with color perception and the categorization of various speech patterns. (We will discuss this at length in the next section; the intent here is to get some intuition for the problems we will be attacking in the later chapters.)

There is an obvious evolutionary advantage to the construction of categories. Typically, stimuli are continuous, or at least abundant enough that any model would function adequately by treating them as continuous. Categorical perception transforms this continuity into a discrete spectrum of perceptions organized by similarity with respect
to some metric. We take the following example from [Har87]. Consider a digital clock which presents the time in 12-hour increments with the use of am and pm. Then twice a day the clock presents 12:00, with the only difference being which signifier is shown. In this way, we can categorize the times on the clock as either times marked by am or times marked by pm.

Now consider another construction of categories, which will be revisited in Chapter 3. If asked to classify the capital letters of the English alphabet, what is an appropriate choice of category? Suppose we choose to split them by the number of holes: any letter with a closed loop has a hole, and having multiple closed loops should split up the categories further. In this schema (which is font specific), the letters are grouped as follows:

{C, E, F, G, H, I, J, K, L, M, N, S, T, U, V, W, X, Y, Z}  {A, D, O, P, Q, R}  {B}

So we have three distinct categories: no holes, one hole, two holes. Notice that we can be a bit more general about this, however. If we consider only letters that have a hole and letters which do not, we have just two categories; inside the holed category we get a sub-category consisting of letters which have multiple holes.

The point of these examples is to illuminate the idea that categories seek to simplify the stimuli. It is much easier to think about letters with or without holes than about all of the letters simultaneously. Due to this, it should be no surprise that a majority of current research in sensory systems is devoted to understanding the process of categorization. This is precisely what we will investigate in the upcoming sections, and it is the topic of Chapter 4. We can think of completing a category C by adding to it all of the points which are "infinitesimally close" to C.
If we think about this in a geometric way, completing an open disc amounts to adding the circle which bounds it.

Before diving into the world of olfaction, we need one more general function of sensory systems: generalization. In 1987, Shepard introduced the idea of generalization for perceptual spaces. It is a different method of categorization, but one which depends on minimal learning. We shall call this perceptual generalization. Here is a more formal definition.

Definition 1.1.4. Perceptual generalization is the process by which a sensory system (in particular the neural pathway) constructs a broad category for a given stimulus, based solely on the learning of one (or a few) stimuli.

In the figure below (Figure 1.1), we show the first examples of perceptual generalization. This process seems to be a method of producing categories for unlearned or partially learned stimuli. The method of generalization is to extrapolate information from one stimulus and use it to "learn" something about its nearest neighbors in the perceptual space.

Perceptual generalization can be thought of as a pseudo-prior to categorization. Before the system can split things into clean, discrete categories, it needs to build the objects of the perceptual space (the things to categorize). Once this is done, but still before discretization, the system has to understand the boundary of each perceptual object. (This will be interpreted in Chapters 3 and 4 as the boundary of a topological subspace of the perceptual space. This notion will allow us to give a more formal definition than the vague one given here and will also lead to a clean method of discretizing the perceptual space.)

Definition 1.1.5. The perceptual boundary (sometimes shortened to just boundary) of a category is the collection of points which are extreme in the category. That is, these are the points which are only present in the completion of the category.

One of the important themes of the research surrounding sensory systems is that of distinguishing the boundary of the perceptual space from the various percepts it contains [Har87, Chapter 1, Section 2]. This may seem like an easy task, but in the abstract it is surprisingly subtle.
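As a toy illustration, the letter-classification schema described earlier can be computed directly. This is a sketch of ours, not part of the original text; the hole counts are the font-specific assumptions stated above.

```python
import string
from collections import defaultdict

# Font-specific hole counts assumed in the text: B has two closed loops,
# A, D, O, P, Q, R have one, and every other capital letter has none.
HOLES = {"A": 1, "B": 2, "D": 1, "O": 1, "P": 1, "Q": 1, "R": 1}

def categorize_by_holes(letters):
    """Group letters into perceptual categories by number of holes."""
    groups = defaultdict(set)
    for ch in letters:
        groups[HOLES.get(ch, 0)].add(ch)
    return dict(groups)

categories = categorize_by_holes(string.ascii_uppercase)

# Coarsening: merge the one-hole and two-hole groups into a single
# "has a hole" category, recovering the two-category schema.
holed = categories[1] | categories[2]
```

Merging the one- and two-hole groups recovers the coarser two-category schema, with {B} surviving as a sub-category inside the holed class.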
The olfactory system can be broken up (coarsely) into two main regions: the olfactory bulb and piriform cortex. We shall focus on the olfactory bulb, as the piriform cortex is much less understood. As the following figure (Figure 1.2) shows, the olfactory bulb is divided into several layers. Each plays a key role in the transmutation of the physical stimulus into a usable perceptual object. As we still do not understand the full functionality of each layer individually, we will treat them as separate objects and present what we do know about each.

In the same flavor as the previous section, we need to specify what the sensory neurons are. In Figure 1.2, the layer marked OE (olfactory epithelium) and the colored receptors are precisely the olfactory sensory neurons (abbreviated OSNs). These are chemical receptors, and their level of activation (spike frequency) is directly proportional to the binding affinity of the odorant. This binding information is processed at a variety of places before being sent off to piriform cortex and other higher-order brain regions. The main layers we concern ourselves with here are GL, EPL, and
GCL. In contrast with other modalities (such as vision), olfaction is intrinsically high-dimensional, and this high dimensionality is consistent across species. In humans there are roughly 350 different types of OSNs, whereas in mice there are upwards of 1000 [Cle14]. Each distinct type of OSN converges to a tangle of neuropil which is roughly spherical in shape. We call these tangles glomeruli, and the layer consisting of all of them is GL.

The largest cells protruding from the glomeruli are mitral cells. These pyramidal cells embody the immediate connection of the OB to piriform cortex. In Figure 1.2 the mitral cells are drawn in one-to-one correspondence with the glomeruli and are seen to sample from only one glomerulus; this is false in general. It turns out that in most mammalian tetrapods the mitral cells do indeed sample from a single glomerulus, but in [MNS81a] and [MNS81b] it was shown that in some turtles and reptiles the mitral cells can sample from a variety of glomeruli. The benefits of this cross-sampling are still not well understood.

The final change of information in the OB is the modification by the granule cells. These make inhibitory synapses which delay the mitral cell action potential [Cle14]. It is thought that these synapses play a large role in the formation of the perception of an odorant, but no published research has looked at this yet. We do know, however, that learning is related to granule cell firing patterns.

Figure 1.2: Schematic of the mammalian olfactory bulb microcircuitry. Layers are identified, and are arranged bottom to top, external to internal.

With repeated trials of an odorant, the number of granule cells which fire decreases monotonically. Each consecutive trial leads to a more specific and more refined response. This fits with Shepard's ideas on generalization. This specialization implies, however, that granule cells can become specified fairly quickly, and thus the brain should "run out" of possible specificity.
That is, in theory, the granule cells could become specified to only one odorant. This, however, would be a poor allocation of energy and would require the genesis of a horde of new granule cells for each variant of the same odor. Some recent work [MLE+09] has shown that granule cells do exhibit adult neurogenesis, which is incited from the piriform cortex. This neurogenesis is the reason it is thought that granule cells play an important role in the building of perception, and the reason we can learn odors well into our adult years. It does not, however, remove the flawed idea that granule cells can become highly specialized.

Now that we are acquainted with the general form of the OB, we need to discuss the general schema of processing. In GL, the periglomerular cells (PGs) and superficial short-axon cells (sSAs) are thought to be the cells which begin the construction of perceptual categories. The evidence for this comes from recent work [BC19] showing that learning can occur at the glomerular layer and not just at the granule cell layer. This implies, at a minimum, that the purpose of the early (exterior) layers of the OB is to normalize data and to reduce noise in the sensation. Further, they increase contrast between similar odorants. A nice analogy is the existence of edges in visual perception. There is an enormous amount of cortical space allocated to the processing of edges. This helps build a better image, and in the same way, contrast enhancement in the bulb "builds a better odor."

The common theme to keep in mind for this system is that of sampling with noise. Every layer samples from the previous one in order to arrive at a more specific perceptual category at the end. With the introduction of some noise (variance), we can eliminate some of the theoretical problems of the system. For instance, the granule cell hyper-specification from before can be formally disregarded, as the inherent noise of the odorants will not allow for the accuracy necessary to determine "exactly" what the odorant is. What this does tell us, however, is that we can combine the notions of categorical perception and generalization for this system to arrive at what we shall call Categorical Generalization.
At a first pass, this idea is the construction of a generalized perceptual category: given some learning set, all contained in the same perceptual category, it is the extension of the learning set via the rules of generalization set forth in [She87].

Definition 1.2.1. Let O be an odorant and O the corresponding perceptual category generated by learning on O. Then the Generalized Category GO is the perceptual category which extends O by generalizing its boundary.

To understand this idea further, we shall investigate computational models of the olfactory system and see how the introduction of this concept motivates our model constructed in Chapter 4.

Models of the olfactory system come in two main types: anatomical and theoretical (perceptual). The anatomical models focus on understanding the biochemistry and spike timing of the OSNs and related cells of the OB, whereas perceptual models tend to be fantastical speculation on the "perceptual space" of the system [CL05a] [ET10]. In Chapter 4, we shall propose a model which has the advantage of being a mixture of the two, while also being mathematically elegant. Before then, let us understand some of the current problems with modeling and what makes the olfactory system significantly different from the other sensory systems.

Each flavor of OSN has a different receptive field, and thus we can consider the "space of possible physical inputs" to be some collection of points in a 350-dimensional space, with each "dimension" defined by a different OSN receptive field. Compared to the three dimensions of vision, this is monstrous. This aspect of the olfactory system makes studying it substantially different from other modalities. One feature, for example, is that distances tend to increase with the dimension. What we mean is the following: consider the unit sphere in an even-dimensional space of dimension 2k. The cube of side length 2 centered at the origin has volume 2^{2k}, whereas the volume of the unit ball sitting inside this box is π^k / k!. So as k increases, the volume of the unit ball actually decreases. What this tells us is that, proportionally, in higher dimensions more points lie outside the unit sphere than inside.
The importance of the above observation cannot be overstated. It implies that there is theoretically an incredibly large number of possible odorants detectable by the OSNs, and it decreases the probability that any two odorants which are chemically different will be identified as similar. We can go one step further and say that the physical marker of an odorant, and the sensation thereof, is a large determining factor in the construction of the perception of that odor.

The many thousands of OSNs converge onto glomeruli, of which there are exactly as many as there are receptor types (∼350 in humans).

Remark 1.2.2.
The remainder of this section will be dedicated to the modeling of mitral and granule cells. These two cell types occupy a majority of the mental theatre of researchers in this field, as they are the most mysterious cells in the olfactory bulb.

We begin with mitral cells. Compared to the roughly 350 glomeruli, there are about 3500 mitral cells (in humans), and even more in some mammalian tetrapods. The key feature of mammalian tetrapods is the independent sampling of each mitral cell from a distinct glomerulus. As mentioned above, this is not always the case, and due to this fact, modeling these cells is a delicate procedure. Most authors elect to simply ignore the potential cross-sampling.

As with most modeling, the early approaches were through linear algebra (see Chapter 2) and some form of calculus [ET10]. The type of modeling which makes extensive use of calculus is not particularly helpful for building an understanding of the perceptual space as a geometric object. The use of linear algebra, though, is quite important in the construction of a perceptual space. In [ZVM+13] and many others, mitral cells are modeled as vectors in a Euclidean geometry. The important part here is the type of geometry chosen. Euclidean geometry is inherently the most restrictive, as it assumes no curvature in the perceptual space.
Example 1.2.3.
To see why a Euclidean geometry is restrictive, consider two points on a piece of paper. Let d be the distance separating the points. Given any transformation of the paper which retains its flatness (a rotation or reflection), the distance between the points will stay the same. Now let us introduce a fold into the paper. This can bring the points closer together in the ambient three-dimensional space, but their distance along the paper will not change. If instead of a fold we make a smooth change, this is precisely the introduction of curvature.

Nonetheless, this choice of model has repeatedly been shown not to be useful. Simply speaking, perceptual distances do not sit well inside a linear space. It is convenient, however, to have the mathematical ease of a Euclidean space. For this reason, current research (such as [CPO18]) has begun to try to understand manifolds (see Chapter 3 for a definition) and their applications to sensory processing. These are objects which "look like" Euclidean space on a local scale. The advantage of these spaces is that we can introduce curvature to the perceptual space while still retaining a linear structure on the tangent space at every point. In fact, the problem with Euclidean space is not unique to it: any space of constant curvature will have the same deficit. We recommend running through the example above but exchanging the piece of paper for a ball or a saddle; this gives the other two types of spaces of constant curvature. Even though the above approach is flawed, some interesting results have appeared in other modalities [MR07] that imply we may want to consider vector-like mitral cells in olfactory system models. Furthermore, the use of some high-level algebra and differential geometry has led to the investigation of certain mathematical objects called Lie groups (see Chapter 3 for a definition).
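The paper-folding example can be made concrete with the simplest curved embedding, a circle in the plane (our illustration, not the author's): the ambient straight-line distance between two points generally disagrees with the intrinsic distance measured along the space itself.

```python
import math

def chord_distance(t1: float, t2: float, r: float = 1.0) -> float:
    """Ambient (straight-line) distance between two points on a circle
    of radius r, parameterized by angle."""
    return math.hypot(r * math.cos(t1) - r * math.cos(t2),
                      r * math.sin(t1) - r * math.sin(t2))

def geodesic_distance(t1: float, t2: float, r: float = 1.0) -> float:
    """Intrinsic distance measured along the circle itself (arc length)."""
    d = abs(t1 - t2) % (2 * math.pi)
    return r * min(d, 2 * math.pi - d)
```

For antipodal points the ambient distance is 2 while the intrinsic distance is π: a model that measures similarity with straight-line distances in the embedding space will systematically disagree with one that respects the geometry of the perceptual space.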
Lie groups play an important role in mathematics and physics, so it is no surprise that they have shown up in neuroscience as well.

We now turn our attention to granule cells. One large mystery surrounding them is the aforementioned adult neurogenesis. It was shown in [MLE+09] that in order for the olfactory system to function at its current level of accuracy, adult neurogenesis is necessary. Some have argued, however, that all evidence of adult neurogenesis is actually a remnant of embryonic stem-cell differentiation. We shall not contest either position here, as the data is inconclusive either way. On a different note, granule cells are believed to be the workhorses of olfactory learning [Cle14]. These cells inhibit the action potentials of the far larger mitral cells and contribute to the variance in spike timing seen across the bulb for different odors. It should also be noted that there are orders of magnitude more granule cells than mitral cells. The exact mechanism of mitral cell inhibition is up for debate; however, it is clear that the piriform cortex plays some critical role in the excitation-inhibition loop. Surprisingly, models tend not to deal with the subtle intricacies of granule cell inhibition. One possible explanation is that granule cells act only locally, in contrast to mitral cells, which can inhibit relatively far-away neighbors. This local action is not readily dealt with in computer models, and combining it with the relatively global action of mitral cells (sometimes having to intertwine the two) has been a blockade for some time now.

As one final question for this chapter, we want to define the perceptual categories in olfaction. Given an odorant, the generalized category associated to that odorant is the result of the generalization gradients above. In practice, one should think of this in the following way: suppose O is the odorant (or combination thereof) corresponding to an orange. Then the generalized category of unlearned oranges may encompass all citrus fruits. This is clearly too broad to be of use when differentiating particular species of orange, or even ripeness.
Therefore, we know that there must be some mechanism (granule cell interactions) which restricts the size of the generalized categories so that they are of use for identification. In fact, as we shall see in Chapter 4, we have proposed a way of generating specific hierarchies from such general data, given some non-zero amount of learning. Geometrically, we can view this as constructing a rough approximation of the perceptual space which somehow encodes the differences between distinct classes of odorants.

This completes the brief introduction to the computational neuroscience of olfaction.

Chapter 2
An Introduction to Algebra

Here we lay down the basics of set theory, its notation, and how it is used in practice. We start with a definition.
Definition 2.1.0. A set S is any collection of elements (normally denoted with the corresponding small letter) with cardinality some ordinal. The order (size/cardinality) of a set S is the number of elements of S, denoted |S|.

We have the natural notion of a subset, denoted T ⊆ S. If T is strictly smaller than S, then we write T ⊊ S. The collection of all subsets of a set S is called the power set and is denoted P(S). Some classic examples of sets are the natural numbers, denoted N = {0, 1, ...}, and the integers, denoted Z = {0, 1, −1, 2, −2, ...}.

Some more interesting sets are Q, R, C: the sets of rational, real, and complex numbers respectively. Notice that N ⊊ Z ⊊ Q ⊊ R ⊊ C. For this reason, unless specified, we will use C in examples.

Additionally, we can define intersections and unions of sets. If S, T are two sets, we define their intersection S ∩ T = {x : x ∈ S and x ∈ T} and their union S ∪ T = {x : x ∈ S or x ∈ T}. Further, if T ⊆ S, we can define the complement of T in S to be T^c = S − T = {s ∈ S : s ∉ T}.

Definition 2.1.1.
Let X, Y be two sets. We define the Cartesian product, denoted X × Y, as the set of all ordered pairs of elements of X and Y. That is,

X × Y = {(x, y) : x ∈ X, y ∈ Y}.

Example 2.1.2. Let X = {1, 2} and Y = {a, b}. Then X × Y = {(1, a), (1, b), (2, a), (2, b)}.

For finite sets, it is easy to see that |X × Y| = |X||Y|: for each element x ∈ X we can look at the subset {x} × Y ⊆ X × Y; each of these sets has size |Y|, and as there are |X| choices for x, the claim follows.

Definition 2.1.3. A function f : S → T is a mapping between sets which assigns to each element s of the source space S an element f(s) = t ∈ T. For this reason, we call S the domain of f, and T the codomain of f. Denote by f⁻¹(t) = {s ∈ S : f(s) = t} the pre-image of t under f.

We can compose functions, assuming the codomain of the first is contained in the domain of the second. We can actually relax this requirement: it suffices that the image of f, denoted Im f, is contained in the domain of g.

Notice that f may not hit every element of T: that is, there may exist some t ∈ T such that t ≠ f(s) for any s ∈ S. The following sister definitions provide us with insight into this exact situation.

Definition 2.1.4 (Injective, Surjective, and Bijective). Let f : S → T be a function.

f is called injective if f(s) = f(s′) implies (denoted =⇒) that s = s′.

f is called surjective if for all (denoted ∀) t ∈ T, there exists at least one s ∈ S such that f(s) = t.

A function which is both injective and surjective is called bijective.

Example 2.1.5.
Let f : Z → Z be defined by f(n) = 2n. Then f is injective trivially. f is not surjective, as an odd number l = 2k + 1 is not equal to 2n for any n ∈ Z.

For an example of a surjective map, consider the absolute value function |·| : Z → N, with f(z) = f(−z) = |z|. In more standard notation, one writes z ↦ |z|.

Proposition 2.1.6.
Let f : A → B and g : B → C be injective (respectively surjective, bijective) functions. Then g ∘ f : A → C is injective (resp. surjective, bijective).

Proof. (Injectivity) Suppose that (g ∘ f)(a) = (g ∘ f)(a′). As g is injective, we know that f(a) = f(a′). Now, as f is injective, we have that a = a′.

(Surjectivity) Let c ∈ C. As g is surjective, we know that c = g(b) for some b ∈ B. As f is surjective, we have that b = f(a) for some a ∈ A. Thus, for all c ∈ C there exists at least one a ∈ A such that g ∘ f(a) = c. As bijectivity is the combination of the previous two statements, this completes the proof.

(The symbol ∈ is to be read as "an element of." If we use the symbol ∉, the slash means "not." For example, −2 ∈ Z should be read as "−2 is an integer," while −2 ∉ N should be read as "−2 is not a natural number.")

Theorem 2.1.7. Let f : X → Y be a bijective function. Then there exists a map g : Y → X such that f ∘ g = Id_Y and g ∘ f = Id_X.

Proof.
Define g : Y → X as g(y) = f⁻¹(y). This is well defined, as f is bijective: y ∈ Im f and ∃! x ∈ X such that f(x) = y. Then g ∘ f(x) = f⁻¹(f(x)) = x by bijectivity of f. Further, f ∘ g(y) = f(f⁻¹(y)) = y by bijectivity. Hence, g satisfies the properties and we are done.

Definition 2.1.8.
Let X be a set. We say E ⊆ X × X is an Equivalence Relation on X if the following properties hold:

(a) (x, x) ∈ E for all x ∈ X.
(b) If (x, y) ∈ E then (y, x) ∈ E.
(c) If (x, y), (y, z) ∈ E then (x, z) ∈ E.

We call these properties reflexivity, symmetry, and transitivity respectively. It is common practice not to write E as a set of ordered pairs but rather to write x ∼ y if (x, y) ∈ E. We then say ∼ is an equivalence relation on X. Further, let [x] (also denoted x̄ in some cases) be the set of all elements y ∈ X such that x ∼ y. We call [x] the Equivalence Class of x. We denote the set of equivalence classes as X/∼.

Lemma 2.1.9.
Let ∼ be an equivalence relation on a set X. Then ∼ induces a partition of X via equivalence classes. This is equivalent to saying that for all elements x, y ∈ X, either [x] = [y] ∈ X/∼ or [x] ∩ [y] = ∅, the empty set.

Proof. Suppose [x] ≠ [y] and [x] ∩ [y] ≠ ∅. Let w ∈ [x] ∩ [y]. Then x ∼ w and y ∼ w. Using the symmetry and transitivity of ∼, we have that x ∼ y. Therefore [x] = [y], a contradiction. Hence, either [x] = [y] or [x] ∩ [y] = ∅ for all x, y ∈ X.

Example 2.1.10.
Let Z denote the set of integers as above. Fix some n ≥ 0. Define a ∼ b if a − b = kn for some integer k. The space Z/∼ := Zₙ is called the set of integers modulo n. Notice that Zₙ = {0, 1, 2, ..., n − 1}. Define the operation (·) mod n : Z → Zₙ, which sends k ∈ Z to [k], which is equivalent to its remainder after dividing by n.

We have opted to start this section with a few examples to introduce the idea of a group before giving the rigorous definition.
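These constructions are easy to experiment with computationally. The following sketch (ours, not from the original text) checks that the map (·) mod n identifies a ∼ b exactly when a − b is a multiple of n, and that the resulting set of classes is {0, 1, ..., n − 1}:

```python
def mod_n(k, n):
    """The canonical map Z -> Z_n, sending k to its remainder after dividing by n."""
    return k % n

n = 5
# Z_n consists of exactly the classes {0, 1, ..., n - 1}
assert {mod_n(k, n) for k in range(-50, 50)} == set(range(n))

# a ~ b  iff  a - b = k*n for some integer k  iff  mod_n(a, n) == mod_n(b, n)
assert all(((a - b) % n == 0) == (mod_n(a, n) == mod_n(b, n))
           for a in range(-20, 20) for b in range(-20, 20))
```

Note that Python's `%` already returns a representative in {0, ..., n − 1} even for negative inputs, which is exactly the canonical representative of the equivalence class.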
Example 2.2.1.

(a) Z. We can define + : Z × Z → Z by (a, b) ↦ a + b. Clearly if a ≠ 0, then −a exists and is different from a. Further, a + (−a) = 0. This makes 0 the additive identity in Z.

(b) Let Dₙ denote the set of symmetries of the regular n-gon. Then it is left as an exercise to the reader to prove that |Dₙ| = 2n. Note that we can compose two such symmetries. Take for example the case n = 4. Let the rotation by 90° counterclockwise be denoted r = R₉₀ and the vertical reflection s. Then rs is the reflection along the primary diagonal. There is an identity element e = R₀, the rotation by 0°.

(c) Let C× denote the set of all non-zero complex numbers. Then we can define · : C× × C× → C× by (w, z) ↦ w · z = wz, the standard complex multiplication. Here 1 is the multiplicative identity.

With these examples in mind, we can now define groups in more abstraction. In general, one can think of groups as symmetries of some object, be it an n-gon or some set. We will make this more precise.

Definition 2.2.2.
Let G be a set and µ : G × G → G a binary operation such that:

(a) For all x, y, z ∈ G, µ(x, µ(y, z)) = µ(µ(x, y), z).
(b) There exists e ∈ G such that µ(e, g) = g = µ(g, e) for all g ∈ G.
(c) For all g ∈ G there exists h ∈ G such that µ(g, h) = µ(h, g) = e.

We commonly denote µ(g, h) as gh when the operation is clear. Further, the last condition tells us that every element has an inverse, and we denote g⁻¹ := h from that condition. We call G equipped with µ a Group and denote it (G, µ). We say a group is Abelian if for all g, h ∈ G, we have that gh = hg.

Remark 2.2.3.
Other common notations for groups are (G, ·) and (G, ⋆), where · and ⋆ denote the multiplication operations.

It should now be obvious that (a) and (c) in Example 2.2.1 are examples of groups (i.e. every integer has an inverse, namely its negative, and every non-zero complex number is invertible). For (b), notice that applying r a further n − 1 times, we get e. Therefore rⁿ = e and rⁿ⁻¹ = r⁻¹. Further, s² = e, and so s is its own inverse.

Now we lay down some important non-examples. These, for various reasons, violate one or many of the group axioms.

Non-Example 2.2.4.

(a) Consider Z, Q, R under standard multiplication. Z is not a group, as all elements other than ±1 are not invertible, since 1/n is not an integer for |n| > 1. Why do Q and R fail?

(b) (Integers Modulo n) Let Zₙ denote the set of integers {0, 1, ..., n − 1} together with multiplication modulo n. Multiplying modulo n means that we first multiply the numbers using normal arithmetic and then "remove" n as many times as possible; the remaining number is their product. For an example, let n = 5; then 3 · 4 = 12 ≡ 2 (mod 5). That is, (·) mod n precisely gives the remainder when dividing by n. Under this multiplication operation not every element has an inverse, namely 0. For n ≠ p a prime number, we can find other elements which are not invertible. Take for instance n = 6, where 2 · 3 ≡ 0 (mod 6), so neither 2 nor 3 can be invertible.

Lemma 2.2.5.
For any group (G, ·), inverses are unique. Further, the identity element is unique.

Proof. Let g ∈ G. Suppose there exist h, h′ ∈ G, h′ ≠ h, both inverses for g. Then on one hand we have that h′gh = (h′g)h = eh = h; on the other hand we have that h′gh = h′(gh) = h′e = h′. Therefore h′ = h, a contradiction. Hence, h = h′ is the unique element such that gh = hg = e. To see that the identity element is unique, use the same process as above. This completes the proof.

Remark 2.2.6.
For the remainder of the text, we will refer to groups by the underlying set, (G, ·) := G, when the multiplication is understood and there is no room for confusion. This is standard notation, and in most cases the multiplication is well understood. We will specify the multiplication when we have a choice of operation.

Corollary 2.2.7.
Let g, h ∈ G be any elements. Then (gh)⁻¹ = h⁻¹g⁻¹.

Corollary 2.2.8.
Suppose G is a group such that every non-identity element is an involution (that is, g² = e). Then G is abelian.

Proof. This proof is left as an exercise to the reader. Hint: Using the fact that inverses are unique, realize that x² = e ⟹ x = x⁻¹ for all non-identity elements.

In practice, it can be hard to know if a given set is indeed a group. The following theorem is integral in identifying groups from abstract sets.

Theorem 2.2.9.
Let G be a set equipped with an associative binary operation and suppose there exists e ∈ G with the following properties:

(a) ge = g for all g ∈ G.
(b) For every g ∈ G there exists h ∈ G such that gh = e.

Then G is a group.

Proof. For g ∈ G, pick h ∈ G as in (b). Then it suffices to show that eg = g and hg = e. Using (b) again for h, we can find an element i ∈ G such that hi = e. Then

g = ge = g(hi) = (gh)i = ei = i.

Therefore, hg = h(ei) = (he)i = hi = e, as desired. Now, g = ge = g(hg) = (gh)g = eg. This completes the proof.

This theorem gives us a criterion to check whether or not a set X is actually a group. In practice, this is much more convenient to check than the entirety of the group axioms. An example of this is the set N under addition. We have an identity element 0. However, we cannot find n⁻¹ = −n for any non-zero element. Therefore N is not a group. However, N is the prototypical example of a Semi-Group: a set which has an associative, unital binary operation where not every element has an inverse. These objects will play a role in chapter 6 when discussing Toric Varieties.

The following lemma is provided for ease with later proofs. It gives a criterion for a subset H ⊆ G to be a subgroup.

Lemma 2.2.10 (Subgroup Criterion). Let H ⊆ G be any non-empty subset. If xy⁻¹ ∈ H for all x, y ∈ H, then H is a group and thus a subgroup of G.

Proof.
Assume xy⁻¹ ∈ H for all x, y ∈ H. If x = y, then xx⁻¹ = e ∈ H. Associativity is clear, as the multiplication is inherited from G. To show every element is invertible, consider x ∈ H and e; then ex⁻¹ = x⁻¹ ∈ H. Using this, consider x and y⁻¹. Then x(y⁻¹)⁻¹ = xy ∈ H. Therefore · : H × H → H defines an associative, binary, unital, and invertible map. Hence, H is a group, and in fact a subgroup of G.

Now that we have the basic objects of this section, we can consider maps between them. Note in the following definition, the maps act as you would expect: preserving the structure of both groups.
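The subgroup criterion is easy to test on a finite example. Below is a minimal sketch (ours, not from the original text; the concrete permutation representation of D₄ is our own choice) that builds D₄ as permutations of the square's vertices, confirms |D₄| = 2 · 4 = 8, and verifies the criterion xy⁻¹ ∈ H for the subgroup of rotations:

```python
# D4 as permutations of the square's vertices 0, 1, 2, 3 (listed in cyclic order).
# A permutation is a tuple p with p[i] = image of vertex i.
e = (0, 1, 2, 3)
r = (1, 2, 3, 0)   # rotation by 90 degrees counterclockwise
s = (1, 0, 3, 2)   # a reflection (swaps 0 <-> 1 and 2 <-> 3)

def compose(f, g):
    """(f o g)(i) = f(g(i))."""
    return tuple(f[g[i]] for i in range(4))

def inverse(f):
    inv = [0] * 4
    for i, fi in enumerate(f):
        inv[fi] = i
    return tuple(inv)

def generate(gens):
    """Close {e} under right-multiplication by the generators (enough for a finite group)."""
    G, frontier = {e}, {e}
    while frontier:
        frontier = {compose(g, h) for g in frontier for h in gens} - G
        G |= frontier
    return G

D4 = generate({r, s})
assert len(D4) == 8          # |D_n| = 2n with n = 4

# Subgroup criterion on H = <r>, the rotations: x * y^{-1} stays in H.
H = generate({r})
assert len(H) == 4
assert all(compose(x, inverse(y)) in H for x in H for y in H)
```

Closure under multiplication by the generators suffices here because in a finite group every element has finite order, so inverses arise automatically as powers.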
Definition 2.2.11.
Let ϕ : (G, ·) → (H, ⋆) be a map. ϕ is a Group Homomorphism (or a Morphism of groups, see Ch. 3) if for all g, g′ ∈ G, we have that

ϕ(g · g′) = ϕ(g) ⋆ ϕ(g′).

That is, ϕ is equivariant with respect to the multiplication operations on G and H. A group homomorphism which is bijective is called a Group Isomorphism. If such a map exists, then the domain and codomain groups are said to be isomorphic, denoted G ≅ H.

Example 2.2.12.
1. Then define f : Z → G by f ( m ) = i m . Thismakes f a group homomorphism as f ( m + n ) = i m + n = i m · i n = f ( m ) · f ( n ) (b) Let R denote the set of all real numbers and R × denote the set of non-zero realnumbers. Define g : R × → R t imes by x (cid:55)→ x . Then g is a homomorphism as ( xy ) = x y for real numbers. We encourage the reader to investigate how chang-ing the domain and/or range to R ≥ changes the properties of the homomorphism.We now lay down two definitions which are integral to the study of algebra and haveanalogs in all other branches of mathematics. Definition 2.2.13.
Let H ⊆ G be a set contained in a group G. We call H a Subgroup if for all h, h′ ∈ H, h · h′ ∈ H and h⁻¹ ∈ H. We denote subgroups using the notation H ≤ G. Denote by gH = {gh : h ∈ H} for any g ∈ G. Then we call a subgroup Normal if gHg⁻¹ = H for all g ∈ G, and write H ⊴ G. Denote by Z(G) = {g ∈ G : gh = hg, ∀h ∈ G} the Center of G. It should be obvious that Z(G) is a normal subgroup of G.

Definition 2.2.14.
Let ϕ : G → H be a group homomorphism. Define the Kernel of the homomorphism ϕ to be

ker ϕ = {g ∈ G : ϕ(g) = e_H}.

This is the set of all elements which are annihilated under the mapping ϕ.

Proposition 2.2.15.
The set ker ϕ is a group. In particular, it is a normal subgroup of G . Proof.
We first show that ker ϕ is non-empty. Let e_G ∈ G be the identity. We claim that ϕ(e_G) = e_H. To see this, recognize that

ϕ(e_G) = ϕ(gg⁻¹) = ϕ(g) ⋆ ϕ(g⁻¹) = ϕ(g) ⋆ ϕ(g)⁻¹ = e_H

for all g ∈ G. Thus, ker ϕ is nonempty. As ker ϕ ⊆ G, ker ϕ inherits multiplication from G. Notice that for x, y ∈ ker ϕ, we have that

ϕ(x · y) = ϕ(x) ⋆ ϕ(y) = e_H.

Therefore ker ϕ is closed under multiplication. Further, it is closed under inverses for the same reason. Hence, ker ϕ is a group and ker ϕ ≤ G. To check normality, notice that for x ∈ ker ϕ,

ϕ(gxg⁻¹) = ϕ(g) ⋆ ϕ(x) ⋆ ϕ(g)⁻¹ = ϕ(g) ⋆ e_H ⋆ ϕ(g)⁻¹ = e_H

for all g ∈ G. Hence, ker ϕ ⊴ G.

As shown by the proof above, the homomorphism condition is quite restricting and powerful. We used the fact that ϕ(g⁻¹) = ϕ(g)⁻¹ for group homomorphisms; it is left to the reader to check this fact. Now, we have the following result, which is important when proving other theorems.

Theorem 2.2.16.
Let ϕ : G → H be a group homomorphism. Then ϕ is injective if and only if ker ϕ = {e}.

Proof. (⇒) Assume that ϕ is injective; that is, ϕ(x) = ϕ(y) ⟹ x = y for all x, y ∈ G. Let g ∈ ker ϕ. Then ϕ(g) = e_H = ϕ(e), and by injectivity g = e. Hence ker ϕ = {e}.

(⇐) Assume now that ker ϕ = {e}, and suppose ϕ(g) = ϕ(h). This tells us that ϕ(gh⁻¹) = ϕ(g)ϕ(h)⁻¹ = e_H. Therefore gh⁻¹ ∈ ker ϕ. As ker ϕ = {e}, we know gh⁻¹ = e and hence g = h. This completes the proof.

Example 2.2.17.
Let G be a simple group (that is, the only normal subgroups are {e} and G itself). Then any homomorphism f : G → H is either injective or trivial. This follows from Proposition 2.2.15 and Theorem 2.2.16.

Just as with sets, we can build the Cartesian product of groups G and H, denoted G × H. As a set it is precisely the set G × H, but now we endow it with a group structure taken component-wise. That is,

(g₁, h₁)(g₂, h₂) = (g₁g₂, h₁ ⋆ h₂).

For a concrete example, consider the set Z × R× (R× is the group of all non-zero real numbers under multiplication). Here

(k₁, r₁)(k₂, r₂) = (k₁ + k₂, r₁r₂),

as the multiplication in Z is addition. At this point, we have the ability to construct a group, transition between groups, and "multiply" groups to make new ones. Just as with high-school algebra, we can now consider dividing, or taking quotients of groups.
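Kernels and Theorem 2.2.16 can be illustrated with the reduction map Z → Zₙ (a standard choice of example, ours rather than the author's):

```python
n = 6

def phi(k):
    """The reduction homomorphism Z -> Z_n."""
    return k % n

window = range(-30, 31)

# ker(phi), within a finite window of Z, is exactly the multiples of n
assert [k for k in window if phi(k) == 0] == [k for k in window if k % n == 0]

# homomorphism property: phi(a + b) = phi(a) + phi(b), computed in Z_n
assert all(phi(a + b) == (phi(a) + phi(b)) % n
           for a in range(-10, 10) for b in range(-10, 10))

# ker(phi) != {0}, so phi is not injective (Theorem 2.2.16): 1 and 1 + n collide
assert phi(1) == phi(1 + n)
```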
Definition 2.2.18.
Let G be a group and H any subgroup. We denote by G/H (resp. H\G) the set of all left (resp. right) Cosets

gH = {gh : h ∈ H}

under the equivalence relation that g ∼ g′ ⟺ g = g′h for some h ∈ H. This is in general not a group, as multiplication is not well defined.

Notice how the notation for this set of cosets is the same notation we use for equivalence relations on a set. The reason for this is that when we take left (right) cosets, we are essentially glueing G along the orbits of the subgroup H.

The first question one can ask about this set is when does it become a group? In other words, for what H ≤ G is G/H a group? The following theorem provides an answer.

Theorem 2.2.19.
Let G be a group and N a subgroup. Then G/N (read "G mod N") is a group under the operation (gN)(hN) = (gh)N if and only if N ⊴ G. Further, there is a canonical homomorphism G → G/N which sends g ↦ gN, such that ker(G → G/N) = N.

Proof.
We first need to show that the proposed group operation is well defined. Suppose xN = gN and yN = hN. These two statements are equivalent to x = gn and y = hn′ for some n, n′ ∈ N. Then, using normality of N (so that nh = hn″ for some n″ ∈ N),

(xy)N = (gnhn′)N = (ghn″n′)N = (gh)N.

Thus the multiplication is well defined.

(⇒) Now assume N is normal in G. Multiplication is associative by definition, and the unit element is eN. It remains to show that gN has an inverse and that it is unique. Let g⁻¹ be the inverse of g in G. Then

(gN)(g⁻¹N) = (gg⁻¹)N = eN,

so g⁻¹N is an inverse for gN. Suppose there exists some y ∈ G such that (gN)(yN) = (yN)(gN) = N. Then, starting from the middle,

yN = (eN)(yN) = (g⁻¹N)(gN)(yN) = (g⁻¹N)(eN) = g⁻¹N.

Hence, g⁻¹N = yN and G/N is a group.

We defer the other direction of the proof for a moment. Define ϕ : G → G/N by ϕ(g) = gN. This is a homomorphism by the multiplication in G/N. If x ∈ ker ϕ, then ϕ(x) = xN = N. Therefore x ∈ N and ker ϕ ⊆ N. The reverse inclusion is obvious, and thus ker ϕ = N.

(⇐) Now suppose G/N is a group. Consider the canonical projection ϕ : G → G/N. Then ker ϕ = N by the above, and by Proposition 2.2.15 we conclude that N is normal.

Corollary 2.2.20.
Let G be an abelian group. Then for every subgroup H ≤ G, G/H is an abelian group.

Proof.
The fact that G is abelian tells us that every subgroup is normal, since gH = Hg for all g ∈ G. To see that G/H is abelian, let g, g′ ∈ G. Then

(gH)(g′H) = (gg′)H = (g′g)H = (g′H)(gH).
Consider the commutative diagram G ϕ ( G ) G / ker ϕ ϕ q ˆ ϕ The top arrow is surjective by definition and the map q is the canonical quotient. Denotethe cosets in G / ker ϕ as [ g ] . We define the map ˆ ϕ ([ g ]) = ϕ ( g ) . To show that ˆ ϕ is welldefined, consider [ g ] = [ h ] that is g = ah where a ∈ ker ϕ . Then ϕ ( g ) = ϕ ( ah ) = ϕ ( a ) ϕ ( h ) = ϕ ( h ) Thus, ˆ ϕ is well defined. It is a homomorphism asˆ ϕ ([ g ][ g (cid:48) ]) = ˆ ϕ ([ gg (cid:48) ]) = ϕ ( gg (cid:48) ) = ϕ ( g ) ϕ ( g (cid:48) ) = ˆ ϕ ([ g ]) ˆ ϕ ([ g (cid:48) ]) By the commutativity of the diagram, ˆ ϕ is surjective. We computeker ˆ ϕ = { [ g ] ∈ G / ker ϕ : ϕ ( g ) = e H } = ker ϕ As ker ϕ is the identity element in the quotient space, ˆ ϕ is injective. Hence, ˆ ϕ is an iso-morphism. Corollary 2.2.23. If ϕ : G → H is a surjective homomorphism thenG / ker ϕ ∼ = H .25he next tool we will discuss is fundamental to the study of algebra. Definition 2.2.24.
Consider a sequence of groups and homomorphisms

⋯ → G_{i−1} → G_i → G_{i+1} → ⋯, with maps d_i : G_{i−1} → G_i and d_{i+1} : G_i → G_{i+1}.

We say the sequence is exact at G_i if ker d_{i+1} = Im d_i. If the sequence is exact at every G_i, we say the sequence is exact and we call it a Long Exact Sequence. If the sequence has the following form

{e} → G₁ → G₂ → G₃ → {e},

we say the sequence is a Short Exact Sequence.

Example 2.2.25.
Let G be a group and N a normal subgroup. We can rephrase the quotient construction as the unique (up to isomorphism) group H such that the following sequence is exact:

{e} → N → G → H → {e}.

Here, the arrow N → G is the inclusion. Exactness tells us that N → G is injective and that G → H is surjective. Thus, by the First Isomorphism Theorem, G/ker(G → H) ≅ H. As ker(G → H) = Im(N → G) = N, we have our result.

Let G be a group. Just as with the dihedral groups Dₙ, we can ask how a group may act on a set; that is, how does it permute the elements? The formalization of this, a group action, is essential when understanding the later topics in this section. We give the following two definitions.

Definition 2.2.26.
Let G be a group and X be a set. A (left) Group Action on X is a map · : G × X → X such that

(a) h · (g · x) = (hg) · x for all g, h ∈ G and x ∈ X.
(b) e · x = x for all x ∈ X, where e ∈ G is the identity.

Definition 2.2.27.
Let G be a group and X a set as above. A (left) group action is a group homomorphism

ϕ : G → Sym(X),

where Sym(X) = {f : X → X : f is bijective}. This is a group under composition; inversion is well defined as every map is bijective. This is called the permutation representation of the group G on X.

Lemma 2.2.28.
Definitions 2.2.26 and 2.2.27 are equivalent.

Proof. It is clear that 2.2.27 ⟹ 2.2.26. Conversely, define ψ_g to be the map on X such that ψ_g(x) = g · x. We know that ψ_g is invertible (with inverse ψ_{g⁻¹}) and thus ψ_g ∈ Sym(X). Define a map ϕ : G → Sym(X) by ϕ(g) = ψ_g. Then by the associative property of the action we get that

ϕ(gg′) = ψ_{gg′} = ψ_g ∘ ψ_{g′} = ϕ(g) ∘ ϕ(g′).

Hence, ϕ is a group homomorphism and the definitions are equivalent.

We can think of group actions as shuffling the elements of the set they act on. The kernel of an action is precisely the kernel of the resulting homomorphism. We say an action is faithful if the associated permutation representation is injective. Further, we call an action transitive if it has precisely one orbit; that is, for every pair (x, y) ∈ X × X, there exists g ∈ G such that g · x = y.

Remark 2.2.29.
We have been careful to refer to left and right multiplication. If G is non-abelian, these are different operations. When doing more advanced mathematics, one can consider multiplication or an action on both the left and the right. This has some major consequences, but as we will not make use of them, we have made the decision to omit such a discussion.

Lemma 2.2.30.
Let G be a group and suppose G acts on a set X.

(a) Let Stab_G(x) = {g ∈ G : g · x = x} and Orb_G(x) = {g · x : g ∈ G} denote the stabilizer and orbit of the point x ∈ X under the action of G. Then Stab_G(x) is a subgroup of G.

(b) If X = G, acting on itself by left multiplication, then the action is transitive and faithful. Further, any subgroup acts faithfully.

Proof. (a) It is clear that Stab_G(x) is a subset of G. It carries the standard group multiplication and is non-empty, as e ∈ Stab_G(x). It suffices to show that all non-identity elements have an inverse. Let g ∈ Stab_G(x). Then

x = e · x = (g⁻¹g) · x = g⁻¹ · (g · x) = g⁻¹ · x.

Thus, g⁻¹ ∈ Stab_G(x), and by the subgroup criterion, Stab_G(x) is a subgroup of G.

(b) Let G act on itself by left multiplication. To show the action is faithful, suppose g · h = g′ · h. Then

gh = g′h ⟺ ghh⁻¹ = g′hh⁻¹ ⟺ g = g′,

so the map G → Sym(G) is injective. To show it is transitive, let h, i ∈ G; we need to show that there is an element g ∈ G such that gh = i. Pick g = ih⁻¹. This is a group element and gh = ih⁻¹h = i. Therefore every element lies in the orbit of any single element, in particular of the identity element. Hence, the action is faithful and transitive. As H ≤ G, this faithful map restricts to any subgroup.

Corollary 2.2.31. Let G act on a set X. If this action is transitive, then it is equivalent to the action of G on G/H by left multiplication for some H ≤ G.

Proof.
Let x ∈ X and consider H = Stab_G(x). Transitivity gives us that for all y ∈ X, y = gx for some g ∈ G. Suppose gx = g′x; then (g′)⁻¹g ∈ H. This makes the map

f : G/H → X,  gH ↦ gx,

a bijection. It remains to show that this map is G-equivariant. Let g ∈ G and w ∈ X. We can write w = g₀x for some g₀. Then

f(gg₀H) = gg₀x = gw = g f(g₀H).

Hence f is G-equivariant and the actions are equivalent.

In this case, G/H is called the orbit space of the action, as no element is stabilized in the set. This will play an important role in the next chapter.

Linear algebra is one of the oldest and most central subjects in mathematics. It began as the study of solutions to linear systems of equations and has grown into the study of transformations on vector spaces. For example, given the following set of equations

3x + y = b₁
x + z = b₂
x + y − z = b₃,

which x, y, z satisfy them? There are a variety of ways to find solutions, but perhaps the simplest is to use matrices.

Definition 2.3.1. A Matrix is any rectangular array of numbers, symbols, operators, etc. arranged in rows and columns such that addition and multiplication are well defined. If A is a matrix of finite size, it is convention to read the lengths of the sides as "rows by columns"; that is, a matrix with 3 rows and 4 columns is a 3 × 4 matrix. Let A, B be m × n and n × k matrices. Then

(AB)_{ij} = ∑_{l=1}^{n} A_{il} B_{lj},

where A_{ij} is the element of A in the i-th row and j-th column. An n × n matrix A is invertible if there exists an n × n matrix B such that AB = BA = I_n, the matrix with 1s along the main diagonal and 0 elsewhere.

We can turn the system of equations above into the single matrix equation

⎡3 1  0⎤ ⎡x⎤   ⎡b₁⎤
⎢1 0  1⎥ ⎢y⎥ = ⎢b₂⎥
⎣1 1 −1⎦ ⎣z⎦   ⎣b₃⎦
We leave it to the reader to check that this system has a unique solution for any given right-hand side, since the coefficient matrix is invertible.

As we have just seen with groups, endowing a set with multiplication has some striking implications. In this section we consider a new algebraic object, a field. Broadly, this is a set equipped with two operations, addition and multiplication, which are compatible.
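The matrix formulation suggests an algorithmic solution. The sketch below (ours, not the author's) solves the system above by Gaussian elimination; only the coefficients appear in the text, so the right-hand side values here are hypothetical:

```python
# Coefficient matrix of the system 3x + y = b1, x + z = b2, x + y - z = b3.
A = [[3, 1, 0],
     [1, 0, 1],
     [1, 1, -1]]
b = [5, 3, 0]   # hypothetical right-hand side, for illustration only

def solve3(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    n = len(M)
    for col in range(n):
        # pick the largest pivot in this column, swap it into place
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

x, y, z = solve3(A, b)
# the solution satisfies all three original equations
assert abs(3 * x + y - b[0]) < 1e-9
assert abs(x + z - b[1]) < 1e-9
assert abs(x + y - z - b[2]) < 1e-9
```

The coefficient matrix has determinant −1 ≠ 0, so it is invertible and the elimination always succeeds.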
Definition 2.3.2.
Let F be a set and suppose it is equipped with two operations +, ·. Let F× denote the set of non-zero elements of F. Suppose (F, +) and (F×, ·) are abelian groups. If for all a, b, c ∈ F,

a(b + c) = ab + ac and (b + c)a = ba + ca,

then F is a Field. In a field, we denote the identity for the addition as 0 = 0_F and for multiplication as 1 = 1_F. For any field, we can define the characteristic of F, char F, to be the minimal n ∈ N such that n · 1 = 0. If no such n exists, we say that char F = 0.

Example 2.3.3.
The quintessential example of a field is the real numbers R. One can then construct C, the complex numbers, as a field which contains R. These fields both have char F = 0. For an example of positive characteristic, consider Z_p where p is prime. This is a field and has characteristic p. A good exercise to test your understanding is to prove that Zₙ, for n not prime, fails to be a field.

Example 2.3.4 (Polynomials). Let F be a field and denote by F[x] the set of all formal polynomials ∑ aᵢxⁱ, with aᵢ ∈ F. For any polynomial f ∈ F[x] define the degree of f, denoted deg f, to be deg f = max{i : aᵢ ≠ 0}. We define addition as

∑_{i=0}^{n} aᵢxⁱ + ∑_{i=0}^{m} bᵢxⁱ = ∑_{i=0}^{max{n,m}} (aᵢ + bᵢ)xⁱ,

where aᵢ (resp. bᵢ) is considered to be 0 if i > n (resp. i > m), and multiplication as

(∑_{i=0}^{n} aᵢxⁱ) · (∑_{i=0}^{m} bᵢxⁱ) = ∑_{i=0}^{n+m} (∑_{j+k=i} aⱼbₖ) xⁱ.

This makes F[x] a group under addition. It is not a group under multiplication, as the set of invertible elements is precisely the non-zero constant polynomials: xⁿ does not have an inverse for n ≥ 1. Therefore F[x] is not a field. As we will see later, F[x] is a ring (see Section 2.4). If f ∈ F[x] cannot be written as f = gh for g, h ∈ F[x] with deg g, deg h ≠ 0, then f is said to be irreducible.

We sometimes adjoin numbers to a field in the same way we do with formal variables. Let i = √−1. Then R[i], by the rules above, consists of all finite sums ∑ rⱼ iʲ. However, i² = −1, so

R[i] = {a + bi : a, b ∈ R} = C.

This is precisely the definition of the complex numbers.

Remark 2.3.5.
We will only concern ourselves with characteristic 0, as positive characteristic is a bit technical and does not play a role in the later chapters of this text.
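The coefficient formulas of Example 2.3.4 translate directly into code. Below is a small sketch (ours, not from the text) implementing polynomial addition and the convolution product, with polynomials stored as coefficient lists [a₀, a₁, ...], plus a check that adjoining i with i² = −1 reproduces complex multiplication:

```python
def poly_add(p, q):
    """Coefficient-wise sum, padding the shorter polynomial with zeros."""
    m = max(len(p), len(q))
    p = p + [0] * (m - len(p))
    q = q + [0] * (m - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Convolution product: c_i = sum over j + k = i of a_j * b_k."""
    out = [0] * (len(p) + len(q) - 1)
    for j, a in enumerate(p):
        for k, b in enumerate(q):
            out[j + k] += a * b
    return out

# (1 + x) + (3 + 4x + 5x^2) and (1 + x)(1 - x) = 1 - x^2
assert poly_add([1, 1], [3, 4, 5]) == [4, 5, 5]
assert poly_mul([1, 1], [1, -1]) == [1, 0, -1]

# adjoining i with i^2 = -1 collapses R[i] to a + bi, i.e. to C:
# (1 + 2i)(3 - i) = 5 + 5i, matching Python's built-in complex arithmetic
assert (1 + 2j) * (3 - 1j) == 5 + 5j
```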
Definition 2.3.6.
Let E be a field which contains F as a subfield. Then we say that E is an extension of F and denote this E/F. Further, the degree of the extension, denoted [E : F], is the integer n such that E ≅ Fⁿ = ∏ⁿ F as vector spaces over F.

Similar to groups, we can define field homomorphisms.

Definition 2.3.7.
Let F, E be fields and f : F → E a map. If for all a, b ∈ F we have that

f(a + b) = f(a) + f(b) and f(ab) = f(a)f(b),

then f is a Field Homomorphism. A bijective field homomorphism is an isomorphism.

For groups, this was where the story ended. For fields, due to the added structure, we have the following lemma.
Lemma 2.3.8.
Every non-zero field homomorphism is injective.

Proof. Let F, E be fields and x, y ∈ F. Suppose α : F → E is a non-zero morphism. If α(x) = α(y), then

α(x) − α(y) = 0 ⟺ α(x − y) = 0 = α(0).

If x − y ≠ 0, then put q = x − y. Using the multiplication, α(q)α(q⁻¹) = α(1) = 1. But α(q) = 0, a contradiction. Thus x − y = 0 and α is injective.

In this proof we used the fact that for non-trivial field homomorphisms α(1) = 1. We leave it to the reader to check this.
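A quick numerical illustration of this lemma (our own choice of example, not the author's): complex conjugation is a field homomorphism C → C, and it respects both operations, sends 1 to 1, and is visibly injective:

```python
def conj(z):
    """Complex conjugation a + bi -> a - bi, a field homomorphism C -> C."""
    return z.conjugate()

samples = [1 + 2j, -3 + 0.5j, 0.25 - 4j, 7 + 0j]
for a in samples:
    for b in samples:
        assert conj(a + b) == conj(a) + conj(b)   # additive
        assert conj(a * b) == conj(a) * conj(b)   # multiplicative

assert conj(1 + 0j) == 1 + 0j                     # alpha(1) = 1
# injectivity on the samples: distinct inputs have distinct images
assert len({conj(z) for z in samples}) == len(samples)
```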
Definition 2.3.9.
Let F be a field and S any subset. Denote by F_S the smallest subfield of F containing S. It is a fairly simple exercise to show that this field always exists. For the special case that S = {1}, we call F_{1} = F′ the prime subfield of F, as it is the subfield generated by 1. A less trivial exercise is to prove that if char F = p, then F′ ≅ Z_p, and if char F = 0, then F′ ≅ Q.

Definition 2.3.10.
Let L/K be a field extension. An element a ∈ L is algebraic over K if there exists a non-zero f ∈ K[x] such that f(a) = 0. L is called an algebraic extension if every element is algebraic. A field L is called algebraically closed if for all f ∈ L[x], f(x) = 0 ⟹ x ∈ L; that is, every root of a polynomial with coefficients in L already lies in L.

Example 2.3.11.

(a) C/R is an algebraic field extension, as the degree of the extension is finite; by the Fundamental Theorem of Algebra, C is algebraically closed.

(b) Q(√2)/Q is an algebraic extension.

(c) R/Q is not an algebraic extension. Consider the element e = lim_{n→∞} (1 + 1/n)ⁿ. This is known to be transcendental.

Proposition 2.3.12. Let L/K be an algebraic extension. Then for every element α ∈ L, there exists a unique monic irreducible m_α ∈ K[x] such that m_α(α) = 0, and deg m_α is minimal among polynomials which have α as a root.

We shall omit the proof of this proposition, as it does not add to the text. The last theorem we shall prove on fields tells us that every intermediate set, closed under addition and multiplication, of an algebraic extension is a field. More precisely,
Theorem 2.3.13.
Let L/K be an algebraic extension and S a set such that S is a group under addition and is closed under multiplication. If L ⊇ S ⊇ K, then S is a field.

Proof. As S ⊆ L, multiplication in S is commutative and S contains a unit element. It suffices to show that for all s ≠ 0 in S, s⁻¹ exists and is contained in S. Existence follows from the fact that s ∈ L and is non-zero. To show it is contained in S, we use Proposition 2.3.12. As L/K is an algebraic extension, the minimal polynomial m_s of s over K exists. Let

m_s = xⁿ + a_{n−1}x^{n−1} + ... + a₁x + a₀,

with each aᵢ ∈ K; note a₀ ≠ 0, as otherwise m_s would not be minimal. Evaluating at x = s, we get

−a₀ = s(s^{n−1} + a_{n−1}s^{n−2} + ... + a₁) ⟹ s · (s^{n−1} + a_{n−1}s^{n−2} + ... + a₁)(−1/a₀) = 1,

so s⁻¹ = (s^{n−1} + a_{n−1}s^{n−2} + ... + a₁)(−1/a₀) ∈ S. Hence, S is a field.

Just as with groups, we can talk about actions of fields on sets. This does not vary from the theory of groups, however, as Sym(X) is not a field, so defining the action in this way is uninteresting. We thus need a different object to study.

Linear algebra has emerged from its concrete origins in systems of equations to the beautiful abstract algebra it is today. Vector spaces comprise the main objects of study. These objects, as we will see, are incredibly well understood and intersect every area of mathematics. The main references for this section are [Coo15] and [Kna06]. We begin with the definition.
Let V be a set and F a field. Equip V with two operations

+ : V × V → V,  · : F × V → V,

which are compatible in the sense that for all f ∈ F and v, w ∈ V, we have f(v + w) = fv + fw, and 1v = v. If under these operations V is an abelian group together with such an action of F, we say V is an F-Vector Space, with elements v ∈ V called vectors and elements f ∈ F called scalars. The element fv is a scaled vector. A subset W ⊆ V which is closed under the operations of addition and scalar multiplication is called an F-vector subspace. Typically we simply say subspace if the underlying field is understood.

(Definition: A monic polynomial is a polynomial whose highest degree term has coefficient 1.)

Example 2.3.15. We have already seen some examples of vector spaces and subspaces.

(a) Let F be a field and E a finite field extension. It is clear that a field satisfies the definition of a vector space over itself. Now, by the finiteness condition on E, we know that E ≅ Fⁿ, and therefore we can extend the action of F to each component of E. That is,

f · e = f · (e₁, ..., eₙ) = (fe₁, fe₂, ..., feₙ).

This is given by the diagonal inclusion F ↪ Fⁿ which sends f ↦ (f, f, ..., f) (n times).

(b) For a non-trivial example, consider the space F[x].

Definition 2.3.16. An F-linear combination of vectors is anything of the form v = ∑ aᵢvᵢ for finitely many i, with each aᵢ ∈ F. If v₁, ..., v_m is a collection of vectors in a vector space V, denote by ⟨v₁, ..., v_m⟩ the set consisting of all linear combinations of the vᵢ. This is canonically a subspace of V. We say that v₁, ..., v_m is a spanning set for a vector space V if every v ∈ V can be written as a linear combination of the vᵢ. Given a set B, we will denote by

Span_F(B)

the minimal vector space generated by the elements of B. We will omit F if it is clear from the situation and/or if the statement is true regardless of the field chosen.
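Membership in a span can be tested mechanically: w ∈ Span(v₁, ..., v_m) exactly when appending w to the list does not enlarge the space spanned, which we detect via row reduction. The vectors below are a hypothetical illustration of ours, not taken from the text:

```python
def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination (floats)."""
    rows = [list(map(float, r)) for r in rows]
    rk, col, n_cols = 0, 0, len(rows[0])
    while rk < len(rows) and col < n_cols:
        pivot = next((r for r in range(rk, len(rows)) if abs(rows[r][col]) > 1e-12), None)
        if pivot is None:
            col += 1
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and abs(rows[r][col]) > 1e-12:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk, col = rk + 1, col + 1
    return rk

v1, v2 = [1, 0, 2], [0, 1, -1]
w = [2, 3, 1]                     # w = 2*v1 + 3*v2, so w lies in <v1, v2>
assert rank([v1, v2]) == 2        # v1, v2 span a plane in R^3
assert rank([v1, v2, w]) == 2     # appending w does not enlarge the span
assert rank([v1, v2, [0, 0, 1]]) == 3   # this vector is NOT in <v1, v2>
```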
Corollary 2.3.17.
Every vector space admits a spanning set.
This follows immediately from the definition, as V is a spanning set for itself. A more interesting statement is that minimal spanning sets exist, and that any two of them have the same cardinality.

Definition 2.3.18.
Let v_1, ..., v_n be vectors in a vector space V. We say these vectors are linearly independent if

∑_{i=1}^{n} a_i v_i = 0 ⟺ a_i = 0 for all i.

We will commonly abuse the term linearly independent and refer to (possibly infinite) sets as linearly independent if all of their finite subsets are linearly independent.
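The definition above can be checked mechanically by row reduction. A minimal Python sketch over Q (the rank routine and the test vectors are our own, chosen only for illustration): a list of vectors is linearly independent exactly when the matrix they form has full rank.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors over Q, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def independent(vectors):
    # v_1, ..., v_n are linearly independent iff their matrix has rank n
    return rank(vectors) == len(vectors)

print(independent([[1, 0, 1], [0, 1, 1]]))             # True
print(independent([[1, 0, 1], [0, 1, 1], [1, 1, 2]]))  # False: third = first + second
```

Exact `Fraction` arithmetic avoids the floating-point pitfalls that make numerical rank computations delicate.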
Example 2.3.19. (a) Let V = C, treated as a real vector space via the inclusion R ↪ C. Its elements are written as z = x + iy. Let z_1, z_2, z_3 be three pairwise non-colinear (z_i ≠ a z_j for any a ∈ R and distinct i, j ∈ {1, 2, 3}) complex numbers. It can be shown that z_3 can be written uniquely as a_1 z_1 + a_2 z_2.

(b) For a more concrete example, consider V = R^3. Let v_1 and v_2 be two linearly independent column vectors and put v_3 = v_1 + v_2. It should be easy to see that v_3 is a linear combination of v_1 and v_2. Notice that if we change the third coordinate of v_3, then in general v_3 is no longer a linear combination of v_1 and v_2.

Definition 2.3.20.
Let B = {v_i}_{i∈I} be a spanning set of the vector space V. We call B a basis if it is linearly independent. We denote elements of V with respect to this basis as column vectors (tuples) v = (k_1, ..., k_n, ...)^t, which means v = ∑_{i∈I} k_i v_i. It should be noted immediately that any basis B for a vector space is necessarily minimal among spanning sets.

Theorem 2.3.21.
Let B and C be two bases for the vector space V. Then |B| = |C|.

Proof.
We shall prove this in two cases: |B| finite and |B| infinite. Suppose first that |B| < ∞. We want to give bounds on the size of C.

Lemma 2.3.22.
Suppose that |C| > |B|. Then C is linearly dependent.

Proof. As B = {b_1, ..., b_n} is a basis, the set {c_1} ∪ B must be linearly dependent. Therefore, up to reordering, we can assume that b_n ∈ Span{c_1, b_1, ..., b_{n−1}}, and {c_1, b_1, ..., b_{n−1}} is still a spanning set. Notice that by assumption {c_j} is linearly independent. Therefore, repeating the above process with c_j for 2 ≤ j ≤ n and reordering, we conclude that {c_1, ..., c_n} is a linearly independent spanning set. As |C| > n, the remaining elements of C lie in Span{c_1, ..., c_n}, and we conclude that C is linearly dependent.

From this lemma we conclude that |C| ≤ |B|. The key step of the proof relied on the fact that B was a basis. We can similarly apply this logic to C and deduce that |B| ≤ |C|. Hence they must be equal.

Now assume |B| is infinite. The method above will not work, as for sets of infinite cardinality adding an element gives no information regarding linear dependence. We can rephrase this part of the proof, however, as the existence of a bijection f : B → C. We can construct such a function in the following way: let B = {b_i : i ∈ I} and C = {c_j : j ∈ J}, with I, J indexing sets of infinite cardinality. For an arbitrary element c_j ∈ C, we know that c_j ∈ Span(B); in particular, c_j ∈ Span(B_j) for some finite subset B_j ⊆ B. Put

B′ = ⋃_{j∈J} B_j.

As C is a basis, it is in particular a spanning set; therefore B′ is also a spanning set. As B′ ⊆ B and B is minimal, we know that B′ = B, and therefore

B = ⋃_{j∈J} B_j.

As each B_j is finite, we know that |⋃_{j∈J} B_j| ≤ |J| = |C|. Hence |B| ≤ |C| and, by applying the same logic in the other direction, |C| ≤ |B|. The proof is completed by the following theorem, a proof of which can be found in [Kna06, Appendix A.6].

Theorem 2.3.23 (Schroeder–Bernstein).
If A and B are sets such that there exists an injective function f : A → B and an injective function g : B → A, then |A| = |B|.

This now begs the question: "does every vector space admit a basis?" The next theorem will give an answer to this, but before giving a proof, we need the following famous lemma from logic.

Lemma 2.3.24 (Zorn’s Lemma). Let P be a partially ordered set. Suppose that every totally ordered subset of P has an upper bound in P. Then P contains a maximal element.
Definition 2.3.25. A partial order on a set X is a reflexive, antisymmetric, transitive binary relation ⪯. A total order is a partial order such that for all pairs (x, y), either x ⪯ y or y ⪯ x.

The remaining components of the lemma are self-explanatory. The proof of this lemma will be omitted as it does not add to the text. Although it seems innocuous, this lemma provides the technical support for many proofs in algebra. For example:

Theorem 2.3.26.
Let V be a vector space defined over the field F. Then:
(a) Every spanning set contains a basis.
(b) Every linearly independent subset can be extended to a basis.
(c) V has a basis.
We present the proof given in [Kna06].
Proof. (b) Let E be a linearly independent subset of V. Let S be the collection of all linearly independent subsets of V containing E. Then S is a partially ordered set under inclusion, and non-empty as E ∈ S. Let T be a totally ordered subset of S and consider

A = ⋃_{T∈T} T.

We claim that A ∈ S. It clearly contains E by construction; it remains to show it is linearly independent. To see this, suppose not. Then there exist v_1, ..., v_n ∈ A and scalars c_1, ..., c_n, not all 0, such that c_1 v_1 + ... + c_n v_n = 0. Let A_j ∈ T be an element which contains v_j. As T is totally ordered, there exists some A′ ∈ T such that A′ ⊇ A_j for all j ≤ n. As A′ is linearly independent, c_i = 0 for all i, a contradiction. Hence A is linearly independent and an upper bound for T. Thus all totally ordered subsets have an upper bound, and by Zorn’s Lemma there is a maximal element B ∈ S. It remains to be shown that B is a spanning set. Let v ∈ V be arbitrary and suppose v ∉ Span_F B. Then {v} ∪ B is a linearly dependent set by the maximality of B. Therefore there exist constants c, c_1, ..., c_m, not all 0, and vectors v_1, ..., v_m ∈ B such that cv + c_1 v_1 + ... + c_m v_m = 0. We know that c ≠ 0, since B is linearly independent. Therefore v = −c^{−1}(c_1 v_1 + ... + c_m v_m). Hence v ∈ Span_F B, a contradiction, and B is a spanning set.

(a) Now let E be a spanning set. Let S denote the partially ordered set of linearly independent subsets contained in E, ordered by inclusion. Let T be a totally ordered subset of S and let A be the union of all of the elements of T. Then A is an upper bound by the argument in (b) above. By Zorn’s Lemma, S contains a maximal element M, and by an easy modification of the argument in part (b) showing that B was a spanning set, we conclude that M spans E, hence V, and therefore M is a basis.

(c) now follows from (a) by taking E = V, and from (b) by taking E = ∅.

Now, by Theorems 2.3.26 and 2.3.21, we know bases exist and that their cardinality is unique. It is therefore an invariant of the vector space, which motivates the following definition.

Definition 2.3.27.
Let V be an F-vector space and B a basis. By the F-dimension of V we mean

dim_F V = |B|.

Here it is important to distinguish the field of definition.
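As a computational aside (our own illustration, anticipating Example 2.3.28 below), the field F_4 = F_2[x]/(x^2 + x + 1) can be modeled directly: an element a + b·x is stored as a pair (a, b) over F_2, making dim_{F_2} F_4 = 2 visible in the code. This is a sketch, assuming the standard irreducible polynomial x^2 + x + 1.

```python
from itertools import product

# F_4 = F_2[x]/(x^2 + x + 1); the element a + b*x is stored as the pair (a, b).
def mul(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2, and x^2 = x + 1 in this quotient
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

elements = list(product([0, 1], repeat=2))  # 2^2 = 4 elements: a 2-dim F_2-space
one = (1, 0)
# every non-zero element has an inverse, so this 4-element ring is a field
inverse = {u: v for u in elements if u != (0, 0)
           for v in elements if mul(u, v) == one}
print(len(elements), len(inverse))  # 4 3
```

The same construction with an irreducible polynomial of degree n would model F_{p^n} as an n-dimensional F_p-space.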
Example 2.3.28.
Let F_p denote the field with p elements. It is a fun exercise to prove that for any natural number n ∈ N, there is a field extension F_{p^n}. Each of these fields, given by adjoining a root of an irreducible polynomial of degree n, is a vector space of dimension n over F_p and thus is isomorphic to F_p^n. We can see this isomorphism explicitly after we develop the theory of rings in the next section.

Example 2.3.29.
We now give an interesting example of an infinite dimensional vector space. Consider R defined over Q. At first glance, calling this infinite dimensional looks nonsensical, as Q is dense in R. However, suppose R ≅ Q^n for some n ∈ N. Then we can pick a basis {x_1, ..., x_n} of R over Q. By Cantor’s diagonalization argument, we know that |R| > |Q|; in fact, Q is countably infinite and R is uncountably infinite. Using the basis we have picked, the claim R ≅ Q^n would imply that R is countably infinite, as a finite product of countably infinite sets is necessarily countably infinite. This is a contradiction, and thus

dim_Q R ≠ n for all n ∈ N.

Another way to think about this is to look at the transcendental numbers t over Q (numbers such as π, e, etc.). If we look at Span_Q{t} ≅ Q ⊊ R, we get a distinct one-dimensional subspace (intersecting the others only at 0) for each transcendental number.

Lemma 2.3.30.
There are only countably many algebraic numbers.

Proof.
A real number r is algebraic if there exists a non-zero f ∈ Q[x] such that f(r) = 0. Therefore, we need a bound on the cardinality of Q[x], as each polynomial has only finitely many roots, so this gives an upper bound on the cardinality of the algebraic numbers. Notice that {x^i}_{i∈N} is a basis for Q[x] as a Q-vector space. This is a countable basis, and therefore Q[x] is a countably-infinite-dimensional vector space. Hence Q[x] is countably infinite as a set, and therefore the cardinality of the algebraic numbers is at most countably infinite.

Corollary 2.3.31.
There are uncountably many transcendental numbers. Inside R there are uncountably many copies of Q, each pair having trivial intersection, and thus R is an infinite dimensional vector space over Q.

Now that we have the notions of basis and dimension, we can introduce the idea of linear maps between vector spaces. These play a massive role in modern mathematics as well as many applied areas. The reason, as will be shown shortly, is that linear maps are in some sense the “easiest” functions to understand. Further, there is a natural association of a matrix to any linear map, regardless of dimension. This will give us a clear method to tackle problems like Example 2.3.19(b). First, we introduce the notion of quotient for vector spaces. This treatment will mirror the treatment for groups above, but will elucidate the differences that vector spaces bring.

Similar to the case of sets, we want to impose a notion of equivalence on a generic vector space V. We do this by identifying an entire subspace, not just a subset.

Definition 2.3.32.
Let W ⊆ V be a subspace. We define the quotient space V/W = V/∼, where v ∼ v′ if v − v′ ∈ W. It is easy to check that this is an equivalence relation. As V is an abelian group, we have that V/W is also an abelian group under the operation [x] + [y] = [x + y]. We define scalar multiplication as k[v] := [kv]. This turns V/W into a vector space.

We shall see some examples of these after Theorem 2.3.35 below. Before this, we give the first definition of linear maps and some first properties.

Definition 2.3.33.
Let K be a field and V, W two K-vector spaces. We say a function f : V → W is a linear transformation if for all v, v′ ∈ V and k, k′ ∈ K,

f(kv + k′v′) = f(kv) + f(k′v′) = k f(v) + k′ f(v′) ∈ W.

The set of all v ∈ V such that f(v) = 0 is called the kernel and is denoted ker f. Similarly, the image, denoted Im f, is defined as the set of w ∈ W such that w = f(v) for some v. We retain the same definition of isomorphism as for groups above.

Lemma 2.3.34.
The canonical map q : V → V/W is linear and surjective.

Proof.
By definition, q(kx + y) = [kx + y] = [kx] + [y] = k[x] + [y] = kq(x) + q(y). Therefore, q is a linear transformation. Now let C be a basis for V/W, and let C′ be a choice of representatives in V for the elements of C. Then C = q(C′) and, extending by linearity, we get that V/W = Span C = Span q(C′). Hence q is surjective.

Definition/Theorem 2.3.35.
Let f : V → W be a linear transformation. Then:
(a) ker f and Im f are vector subspaces of V and W respectively. We then call dim_K ker f the nullity and dim_K Im f the rank.
(b) f is injective if and only if ker f = 0.
(c) (First Isomorphism Theorem) V/ker f ≅ Im f.
(d) If dim_K V = dim_K W < ∞, then the following are equivalent:
(i) f is injective; (ii) f is surjective; (iii) f is an isomorphism.

Proof. (a), (b), and (c) follow from the fact that linear functions are additive group homomorphisms that also respect scalar multiplication. This implies that ker f and Im f are additive abelian groups closed under scalars by K-equivariance. What remains to be proven for (c) is that the triangle of linear maps

f = f̂ ∘ q : V → V/ker f → Im f

commutes, where q is the canonical projection and f̂([v]) = f(v). Forgetting the K-equivariance momentarily, the diagram commutes on the level of abelian groups by the proof of Theorem 2.2.22. Therefore, we need to show the K-equivariance of f̂. If k ∈ K, then

f̂(k[v]) = f̂([kv]) = f(kv) = k f(v) = k f̂([v]).

By the proof of Theorem 2.2.22, we know that f̂ is a bijective linear map and thus a vector space isomorphism.

(d) It suffices to prove (i) ⟺ (ii), as (iii) ⟹ (i), (ii) trivially, and (i) together with (ii) makes f a bijective linear map, hence an isomorphism.

(⇒) If f is injective, pick a basis B for V. Then f(B) is linearly independent by linearity. Since dim W = dim V, f(B) is a basis for W and f is surjective.

(⇐) If f is surjective, again let B = {v_1, ..., v_n} be a basis for V and f(B) the corresponding basis of W. Let u ∈ ker f; we need to show u = 0. As B is a basis, let u = k_1 v_1 + ... + k_n v_n be the unique expansion of u in the basis B. By the linearity of f, we know that

f(u) = k_1 f(v_1) + ... + k_n f(v_n) = 0 ∈ W.

However, f(B) is a basis for W and consequently k_i = 0 for all i. Thus u = 0. This completes the proof.
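The First Isomorphism Theorem in part (c) can be sanity-checked by brute force over a finite field; this sketch (the matrix and field are our own choice, not from the text) verifies that for f(v) = Av on V = F_2^3 the count |V| = |ker f| · |Im f| holds, which is exactly what V/ker f ≅ Im f forces.

```python
from itertools import product

A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]  # hypothetical 3x3 matrix over F_2; third row = sum of first two

def f(v):
    """The linear map v -> Av, computed over F_2."""
    return tuple(sum(a * x for a, x in zip(row, v)) % 2 for row in A)

V = list(product([0, 1], repeat=3))
kernel = [v for v in V if f(v) == (0, 0, 0)]
image = {f(v) for v in V}
# V/ker f is isomorphic to Im f, so the cosets of ker f partition V into |Im f| parts:
print(len(V) == len(kernel) * len(image))  # True
```

Here the kernel is the 1-dimensional subspace {(0,0,0), (1,1,1)} and the image is 2-dimensional, so nullity + rank = 1 + 2 = 3 = dim V, illustrating (a) as well.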
Corollary 2.3.36.
If V and W are finite dimensional vector spaces such that dim V = dim W, then V ≅ W.

Proof. Let B = {b_1, ..., b_n} be a basis for V and C = {c_1, ..., c_n} a basis of W, and let f : V → W be defined by

f(k_1 b_1 + ... + k_n b_n) = k_1 c_1 + ... + k_n c_n.

This is clearly injective and, by Theorem 2.3.35(d), an isomorphism.

We will not provide a proof for the following theorem, as it is more or less an exercise in category theory, which will be postponed until Chapter 3.
Theorem 2.3.37.
Let B be a basis for a vector space V, and let U be any other vector space. If f : B → U is any function, then there exists a unique linear transformation F : V → U such that F ∘ ι = f, where ι : B ↪ V is the inclusion; that is, the triangle B → V → U commutes.

This is an example of a universal mapping property. These types of theorems are abundant in algebra and will be seen to be instances of a more general schema in Chapter 3.
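Theorem 2.3.37 is constructive for V = Q^3 with the standard basis: a function on the basis extends to all of V by F(v) = ∑_i v_i f(e_i). A small sketch (the basis values below are a hypothetical choice of f : B → U, not from the text):

```python
from fractions import Fraction

# f on the standard basis e_0, e_1, e_2 of Q^3, with values in U = Q^2:
f_on_basis = {0: (1, 1), 1: (0, 2), 2: (3, 0)}  # hypothetical choice of f

def F(v):
    """The unique linear extension: F(v) = sum_i v_i * f(e_i)."""
    out = (Fraction(0), Fraction(0))
    for i, vi in enumerate(v):
        out = tuple(o + Fraction(vi) * u for o, u in zip(out, f_on_basis[i]))
    return out

# F restricted to the basis agrees with f (the triangle commutes) ...
print(F((1, 0, 0)) == (1, 1))  # True
# ... and F is additive, as linearity demands:
print(F((1, 2, 3)) == tuple(a + b for a, b in zip(F((1, 2, 0)), F((0, 0, 3)))))  # True
```

Uniqueness is visible in the code: once `f_on_basis` is fixed, `F` admits no further choices.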
Example 2.3.38.
We now give some examples of vector spaces that arise from the consideration of various linear maps.

(a) (Direct Sums and Direct Products) Let {V_i}_{i∈I} be a collection of vector spaces. We define two objects,

⊕_{i∈I} V_i = {(v_i)_{i∈I} : all but finitely many v_i = 0},  ∏_{i∈I} V_i = {(v_i)_{i∈I}},

the direct sum and direct product of vector spaces, respectively. These objects come with natural linear maps ι_j : V_j → ⊕ V_i and π_j : ∏ V_i → V_j. For a finite indexing set, ⊕ V_i = ∏ V_i, and thus the symbols ⊕ and × will be used interchangeably. In general, however, ⊕ V_i ↪ ∏ V_i. The key feature of ⊕ for finite indexing sets is that

dim(⊕ V_i) = ∑ dim V_i.

This follows from the fact that we can take individual bases in each coordinate space. As will be seen in the next chapter, ⊕ is a coproduct (or colimit) of vector spaces and × is a product (or limit) of vector spaces.

(b) (Hom and Dual Spaces) Let Hom_F(V, W) denote the set of all F-linear transformations V → W. This can be made into an F-vector space by defining addition and scalar multiplication point-wise. For the case W = F, we denote

Hom_F(V, F) = V*,

the dual space to V. If dim V < ∞, then there exist isomorphisms (non-canonically) V ≅ V* and (canonically) V ≅ V**, the double dual. We call elements of V* linear functionals on V. Let T : V → W be a linear transformation, and define T* = T^t, the transpose map, as T^t(g) = g ∘ T : W* → V*. Below, we will show the motivation behind such a naming and its relation to matrices. It can be seen that in general

dim Hom(V, W) = dim V · dim W.

To finish this subsection, we shall go back to the start and relate matrices to linear maps on vector spaces.

Theorem 2.3.39.
Let V and W be finite dimensional vector spaces over the field K and f : V → W a linear transformation. Then there exists a matrix A such that, with respect to the bases on V and W, f(v) = Av, where the right side is matrix multiplication of the (dim W × dim V) matrix A by the (dim V × 1) coordinate vector of v. Furthermore, any matrix M of size dim W × dim V corresponds to a linear map g : V → W. Moreover, this correspondence is bijective.

Proof.
Put M(V, W) to be the set of all matrices with respect to the bases B, C of V, W respectively. Notice that this is a vector space over K, and that the matrices E_ij, whose only non-zero entry is a 1 in position (i, j), form a basis. Define f_ij : V → W to be the unique linear extension (Theorem 2.3.37) of the map on the bases which sends b_i ↦ c_j and the remaining basis vectors to 0. This gives an inclusion of E, the basis of M(V, W), into Hom(V, W) via the map

ϕ : E → Hom(V, W),  ϕ(E_ij) = f_ij.

We claim that {f_ij} is a linearly independent set. To see this, consider the unique linear extension ϕ̂ : M(V, W) → Hom(V, W) and an arbitrary vanishing sum

0 = ∑_{i,j} a_ij f_ij.

Evaluating this at one of the b_i, we get that

0 = ∑_j a_ij c_j,

and by linear independence all a_ij = 0. Hence {f_ij} is linearly independent, and as there are dim W · dim V many elements, we know that it is a basis for Hom(V, W) by Example 2.3.38 and Corollary 2.3.36. Hence

ϕ̂ : M(V, W) → Hom(V, W)

is a surjection and, by Theorem 2.3.35, an isomorphism. This completes the proof.

What this tells us is that every matrix can be treated as a linear transformation, and thus the transpose map f^t : W* → V* has a matrix representation as the transpose matrix. As we will see later, this correspondence between matrices and linear maps can be exploited to prove a variety of theorems. One of the main theorems will be on determinants, to be defined in Section 2.5, which relates invertibility of a matrix (and of the corresponding linear map) to its determinant.

We now enter the belly of the algebraic beast. Ring and module (Section 2.5) theory generalizes both fields and vector spaces in a way which makes doing mathematics with them significantly more difficult. However, we are lucky in that for the main applications in Chapters 4 and 5, we only need sufficiently nice objects called local and/or noetherian rings. Modules over these rings are relatively controlled and thus are incredibly important for analyzing these objects. A majority of this section comes from [Kna06], [Rot15] and [DF04]. The material on commutative rings follows [Mat86] and [AM69]. Similar to the previous sections, we begin with some definitions:
Definition 2.4.1.
Let R be a set equipped with two associative binary operations (+, ×). We call R a ring if the following hold:
(a) R is an abelian group under +.
(b) R is closed under ×; that is, for all a, b ∈ R, a × b = ab ∈ R.
(c) For all a, b, c ∈ R, a(b + c) = ab + ac and (a + b)c = ac + bc.

If in addition there exists an element 1_R such that x × 1_R = 1_R × x = x for all x ∈ R, then we say that R is unital. We call R commutative if a × b = b × a for all a, b ∈ R. A ring homomorphism is a function f : R → S such that for all s, t ∈ R,

f(s + t) = f(s) + f(t),  f(st) = f(s) f(t).

If R and S are unital, then we also impose the condition that f(1_R) = 1_S. The set of units (multiplicatively invertible elements) is denoted R^×.

Remark 2.4.2.
It is common practice to assume that all rings are unital. This makes one’s job much easier when considering homomorphisms and related objects. We shall follow this convention for the remainder of the text and note the instances when an object does not contain a unit.
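Lemma 2.4.3 below asserts that R^× is a group under multiplication. As a quick illustration (the ring Z/12Z is our own choice): the units of Z/nZ are the residues coprime to n, and the computation confirms closure and the existence of inverses.

```python
from math import gcd

n = 12
# a class a in Z/nZ is a unit exactly when gcd(a, n) = 1
units = [a for a in range(1, n) if gcd(a, n) == 1]
print(units)  # [1, 5, 7, 11]

# group axioms we can check directly: closure and inverses (1 is clearly the identity)
closed = all((a * b) % n in units for a in units for b in units)
inverses = all(any((a * b) % n == 1 for b in units) for a in units)
print(closed and inverses)  # True
```

In this particular group every element squares to the identity, so (Z/12Z)^× ≅ Z/2Z × Z/2Z.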
Lemma 2.4.3.
Let R be a ring. Then the set R^× is a group under multiplication.

Example 2.4.4.
Rings play a key role in the later parts of this text, and therefore it is imperative that we have a wealth of examples to draw from.

(a) Let F be a field; then F is a commutative (unital) ring where every non-zero element has an inverse. Therefore F^× = F − {0}.

(b) Z, Q, R, C are rings with addition and multiplication defined as usual. In fact, Z is the prototypical example of a commutative ring which is not a field. For Q, R, C, the group of units is the set of non-zero elements. For Z, it is easy to see that Z^× = {±1} ≅ Z/2Z.

(c) Let V be a finite dimensional K-vector space of dim V = n; then M_n(K) := M(V, V), the set of n × n matrices, is a ring with identity element I_n = diag(1, ..., 1), the matrix with 1s along the main diagonal. Further, the group of units is special and gets its own symbol: GL_n(K) := M_n(K)^×.

(d) Consider the subspace of M_2(C) with basis

1 = [[1, 0], [0, 1]],  i = [[i, 0], [0, −i]],  j = [[0, 1], [−1, 0]],  k = [[0, i], [i, 0]].

We denote this space by H = Span_R{1, i, j, k}. These are the Hamiltonian quaternions and are an example of a division ring, one where every non-zero element has a multiplicative inverse.

(e) Let V be a vector space over a field F of char F ≠ 2, equipped with a bilinear map [−, −] : V ⊕ V → V which satisfies the following conditions for all x, y, z ∈ V:
(i) [x, y] = −[y, x] (anti-commutativity);
(ii) [x, [y, z]] + [z, [x, y]] + [y, [z, x]] = 0 (the Jacobi identity).

Then V is a Lie Algebra. Every Lie algebra is a non-commutative, non-associative, non-unital ring. These will play a part in the theory developed in Chapter 3.

(f) Consider R[x], the polynomial ring with coefficients in a ring R. This is a ring as discussed in Example 2.3.4. The group of units is necessarily R^×, as these are the only elements with formal inverses.

Proposition 2.4.5.
Let R be a ring. Then there exists a unique ring homomorphism ϕ : Z → R.

Proof.
Fix r ∈ R and define ϕ_r(n) = nr = r + r + ... + r (n times) in R. Each ϕ_r is a homomorphism of additive groups, determined completely and uniquely by where it sends 1. A unital ring homomorphism must send 1 ↦ 1_R, so put ϕ = ϕ_{1_R} : Z → R. This sends 1 ↦ 1_R and is therefore the desired, unique, homomorphism.

We define the kernel of a ring homomorphism in direct analogy to vector spaces and group homomorphisms. The following theorem is the ring version of Theorem 2.3.35. We leave the proof as an exercise to the reader, as it follows with slight modification from the proof of Theorem 2.3.35.

Theorem 2.4.6.
Let S and R be rings and ϕ : R → S a ring homomorphism. Then:
(a) ker ϕ is a subring with no unit, and ϕ(R) is a ring.
(b) ϕ is injective if and only if ker ϕ = 0.

Consider ker ϕ for a moment. It is a special example of an Ideal of R.

Definition 2.4.7.
Let R be a ring. A left ideal I ≤ (R, +) is a subgroup of the additive group of R such that

RI = {ri : r ∈ R, i ∈ I} ⊆ I.

Similarly, a right ideal is a subgroup J ⊆ R such that JR ⊆ J. We call an ideal m maximal if there are no proper ideals of R which properly contain m. We call an ideal p prime if ab ∈ p implies either a ∈ p or b ∈ p.

Remark 2.4.8.
Notice that in a commutative ring R, every left ideal is also a right ideal. An ideal which is both a left and a right ideal is called two-sided. Further, over a commutative ring we can think of ideals in the same way we thought about vector spaces. The main difference, however, is that we cannot normally pick a basis for I, as rings admit non-trivial ideals of a kind which do not exist for fields.

Example 2.4.9.
Let us consider some ideals in the rings given above.

(a) Every field has no proper non-zero ideals. This follows from the fact that an ideal I is necessarily a vector space over F; if I is non-zero it contains a unit, hence contains 1_F, and is therefore all of F.

(b) For any ring R, let S be a subset. We can form ⟨S⟩, the ideal generated by S, by taking

⟨S⟩ = ⋂_{S ⊆ I ⊆ R} I,

where I ranges over ideals. We leave it to the reader to check that the intersection of ideals is necessarily an ideal. We call an ideal principal if I = ⟨r⟩ for some element r ∈ R.

(c) Z is an example of a Principal Ideal Domain. This means that every ideal is principal, and thus the ideals mZ, for m ∈ Z, are all possible ideals. Because of this, any ideal which contains 1 ∈ Z must be Z itself. In fact, in any ring, an ideal which contains 1_R must be the entire ring.

Lemma 2.4.10.
Let ϕ : B → A be a ring homomorphism. Then if p is a prime ideal in A, ϕ^{−1}(p) is a prime ideal in B. This does not hold true for maximal ideals.

Proof.
Let ab ∈ ϕ^{−1}(p). Then ϕ(ab) = ϕ(a)ϕ(b) ∈ p, and thus one of ϕ(a) or ϕ(b) is an element of p. Hence either a or b is an element of ϕ^{−1}(p), and it is a prime ideal of B. For a counterexample in the maximal case, consider the canonical inclusion ι : Z ↪ Q. As Q is a field, its only proper ideal is 0, but ι^{−1}(0) = 0, which is prime but not maximal in Z.

Similar to vector spaces and groups, we can take quotients of rings by two-sided ideals. To see why these are the natural choice for quotients, consider that we want the quotient R/I to become a ring again. To do this, let I be an arbitrary subgroup of (R, +). A coset of I in R will be denoted r + I for r ∈ R. We define addition and multiplication in the obvious way:

(r + I) + (s + I) = (r + s) + I,  (r + I)(s + I) = (rs) + I.

(The word prime here comes from the notion of a prime integer. Normally a number is prime if its only factors are 1 and itself. An equivalent condition is that p ∈ Z is prime if and only if whenever p divides the product ab for some a, b ∈ Z, then either p divides a or p divides b.)

As (R, +) is an abelian group, we know that as groups R/I is well defined under +. We need to make sure it is well defined under ×: that is, for any r, s ∈ R and α, β ∈ I we should have that

(r + α)(s + β) + I = rs + I.

If we set r = s = 0, we see that I must be closed under multiplication; thus I is a subring (without unit) of R. By setting s = 0 and letting r vary, we see that rβ ∈ I for all r ∈ R and β ∈ I; therefore I is closed under multiplication on the left by R. Setting r = 0 and letting s vary, we also see that I must be closed under multiplication by R on the right. Conversely, if I is closed under left and right multiplication by R, then the relation above must be satisfied. Hence, being a two-sided ideal is a necessary and sufficient condition for R/I to be a ring. What we have just shown is the following technical lemma:

Lemma 2.4.11.
Let I ⊆ (R, +) be a subgroup. A necessary and sufficient condition for R/I to have the structure of a ring is that I is a two-sided ideal of R.

Proposition 2.4.12 (First Isomorphism Theorem for Rings). Let ϕ : R → S be a ring homomorphism. Then:
(a) ker ϕ is a two-sided ideal of R and R/ker ϕ ≅ ϕ(R).
(b) If I is any two-sided ideal of R, the map

π : R → R/I,  r ↦ r + I,

is a surjective ring homomorphism with kernel I. Hence, every two-sided ideal of R can be realized as the kernel of some homomorphism.

Proof. (a) The majority of this proof mirrors that of Theorem 2.2.22. What remains to be proven is that the map ϕ̂(r + ker ϕ) = ϕ(r) is a bijection between R/ker ϕ and ϕ(R). This follows immediately from the definitions of a ring homomorphism.

(b) We know that R/I is a ring from the discussion before the statement of the proposition. In particular, R and R/I are abelian groups, and therefore π : R → R/I is a group homomorphism. To see it is a ring homomorphism, consider two elements r, s ∈ R. Then

π(rs) = rs + I = (r + I)(s + I) = π(r)π(s).

Further, π(1) = 1 + I = 1_{R/I}. Hence π is a ring homomorphism.

The final proof of this subsection is possibly the most useful and interesting isomorphism theorem.

Theorem 2.4.13 (Fourth Isomorphism Theorem(s)).
(a) Let G be a group and N ⊴ G. Then the subgroups of G/N are in one-to-one correspondence with the subgroups of G containing N.
(b) Let R be a ring and I a two-sided ideal. Then the subrings of R/I are in one-to-one correspondence with the subrings of R containing I.

Proof.
We shall prove (b); (a) will follow immediately by the same argument. Let π : R → R/I be the canonical projection map and S a subring of R containing I. Then I is a two-sided ideal in S, and thus S/I is a ring contained in R/I. Now assume that P ⊆ R/I is a subring. Then π^{−1}(P) = {r ∈ R : r + I ∈ P}. We first check that this is a ring. If a, b, c ∈ π^{−1}(P), then

π(ab + c) = (ab + c) + I = (ab + I) + (c + I) = (a + I)(b + I) + (c + I).

This is an element of P by the definition of a subring. Thus π^{−1}(P) is a ring, and π^{−1}(P)/I = P, so Lemma 2.4.11 tells us that I is an ideal in π^{−1}(P). These two constructions are mutually inverse, which completes the proof.

An immediate corollary of this theorem is the following.

Corollary 2.4.14.
Let R be a commutative ring and m an ideal. Then m is maximal if and only if R/m is a field.

This single corollary will play a large role in the formulation of certain categorical and algebraic objects later in the text.
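Corollary 2.4.14 can be watched in action in Z, where ⟨m⟩ is maximal exactly when m is prime. The brute-force check below (our own illustration, not from the text) tests whether every non-zero class of Z/nZ has a multiplicative inverse, i.e. whether the quotient is a field:

```python
def is_field(n):
    """Z/nZ is a field iff every non-zero class has a multiplicative inverse."""
    return n > 1 and all(any((a * b) % n == 1 for b in range(n))
                         for a in range(1, n))

# <7> in Z is maximal (7 is prime), <12> is not:
print(is_field(7), is_field(12))  # True False
```

For composite n = ab the classes of a and b are zero-divisors, which is exactly what blocks the existence of inverses.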
Proposition 2.4.15.
Every commutative ring has a maximal ideal.

Proof.
Let P be the set of proper ideals of R, ordered by inclusion. Every chain C in P has an upper bound, namely ⋃_{C∈C} C. This is easily seen to be an ideal, and it is proper since it cannot contain 1 (no member of the chain does). Applying Zorn’s Lemma, P has a maximal element m. By definition, m is a maximal ideal.

Local and noetherian rings will take up the remainder of this section. These play a huge role in algebraic geometry and the theory of smooth manifolds; specifically, they form the basis on which sheaves (see Chapter 3) can be built. It is known that understanding sheaves on a space is equivalent to understanding the space itself. Therefore, to get a better grasp on the geometry later, we will need to understand sheaves. To do so, we start with commutative algebra and build our way up.
Remark 2.4.16.
For the remainder of this chapter, all rings are assumed to be commutative and unital. Ideals are two-sided (by Remark 2.4.8), and thus R/I can always be given the structure of a ring.

We start with Principal Ideal Domains (defined in Example 2.4.9) and their generalizations: Unique Factorization Domains and integral domains.

Definition 2.4.17.
Let R be a ring. A zero-divisor in R is an element a ∈ R such that ab = 0 or ba = 0 for some non-zero b ∈ R. A ring with no non-zero zero-divisors is called an integral domain. This amounts to being able to cancel elements from expressions such as

ab = cb, b ≠ 0 ⟹ a = c.

In an integral domain, an element r which is non-zero and not a unit is called irreducible if whenever r = ab, then one of a or b is a unit; otherwise r is reducible. An element p is called prime if the ideal ⟨p⟩ is a prime ideal in the sense of Definition 2.4.7. An integral domain is a principal ideal domain if every ideal is principal.

Prime elements and irreducible elements are closely related. In most of the examples we have presented, they are in fact the same! The following lemma asserts this.
In an integral domain R, every prime element is irreducible. If we assume further that R is a principal ideal domain (P.I.D.), then an element is prime if and only if it is irreducible.

Proof.
Assume p = ab. Then by definition p divides a or p divides b; assume without loss of generality that p divides a. Then a = px for some x ∈ R, so p = pxb. As R is an integral domain we may cancel p, so xb = 1 and b is a unit. Hence p is irreducible.

Now assume further that R is a P.I.D.; we need to show that irreducible elements are prime. Let r be an irreducible element, and suppose that r ∈ M for some ideal M of R. By the hypothesis, M = ⟨m⟩, and r ∈ ⟨m⟩ implies r = mx for some x ∈ R. By irreducibility, either m or x is a unit; thus either ⟨m⟩ = ⟨r⟩ or ⟨m⟩ = ⟨1⟩ = R. Hence ⟨r⟩ is a maximal ideal, and all maximal ideals are prime.

Definition 2.4.19.
An integral domain R is called a Unique Factorization Domain if every non-zero, non-unit element has a factorization into irreducible elements which is unique up to units and reordering.

Unique factorization is a topic that should be familiar to everyone. It is a standard result in high-school level mathematics that every integer can be written as a product of prime numbers. As Z is a P.I.D., this agrees with the definition above. The following result puts all of these rings into context with what we have done prior. We shall not prove it, as it does not add to the theory.

Theorem 2.4.20.
The following inclusions of integral domains hold, and each is strict:

Fields ⊊ Principal Ideal Domains ⊊ Unique Factorization Domains
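Unique factorization can be witnessed computationally. The sketch below assumes the sympy library is available (the specific numbers and polynomial are our own illustrative choices); it factors an integer in the P.I.D. Z and a polynomial in the P.I.D. Q[x], where the factorizations are unique up to units and reordering.

```python
# A computational glimpse of unique factorization (assumes sympy is installed).
from sympy import factorint, factor_list, symbols

# In the P.I.D. (hence U.F.D.) Z, every integer factors uniquely into primes.
assert factorint(360) == {2: 3, 3: 2, 5: 1}          # 360 = 2^3 * 3^2 * 5

# Q[x] is also a P.I.D.: polynomials factor uniquely into irreducibles,
# up to units (nonzero rational constants).
x = symbols('x')
content, factors = factor_list(x**4 - 1)
# x^4 - 1 = (x - 1)(x + 1)(x^2 + 1), each factor irreducible over Q
assert {f for f, mult in factors} == {x - 1, x + 1, x**2 + 1}
```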
Example 2.4.21.
Let F be a field and consider the polynomial ring F[x]. It is a well known fact that F[x] is a P.I.D. In fact, we can relax the restriction that F is a field and consider R[x], the polynomial ring with coefficients in a ring R. There is a nice result [DF04, Theorem 7, Chapter 9.3] which says that if R is a U.F.D. then so is R[x]. Shortly, we shall see another theorem of this variety, Hilbert's Basis Theorem, which asserts that if R is Noetherian then so is R[x].

Definition 2.4.22.
Let R be a ring and I an ideal. We say that I is finitely generated if there exists a finite set S such that I = ⟨S⟩. We call the ring R Noetherian if every ideal is finitely generated.

Theorem 2.4.23 (Hilbert's Basis Theorem). Let R be a Noetherian ring. Then R[x_1, …, x_n] is Noetherian for any n ∈ N.

The proof of this theorem is moderately technical and will be omitted. The main reason we consider Noetherian rings is the following proposition, which characterizes chains of ideals in R.

Proposition 2.4.24.
A ring R is Noetherian if and only if every ascending chain I_1 ⊆ I_2 ⊆ … of ideals stabilizes: that is, there exists n* ∈ N such that I_n = I_{n*} for all n ≥ n*.

Proof. (⇒) Let I_1 ⊆ I_2 ⊆ … be an ascending chain of ideals. Consider I = ∪_{n ∈ N} I_n. This is an ideal: any two elements lie in some common higher-indexed ideal, so addition is well defined, and closure under multiplication by R follows immediately. Thus I is an ideal and, by assumption, finitely generated. Let {a_1, …, a_k} be a generating set. Each a_i is contained in some I_{j_i}. Take j* = max{j_1, …, j_k}; then all of the generators lie in I_{j*} and the chain stabilizes at I_{j*}.

(⇐) Let I be an ideal and consider the set of all finitely generated ideals contained in I. This set has a maximal element m: otherwise we could pick elements x_1, x_2, … of I with Rx_1 ⊊ Rx_1 + Rx_2 ⊊ …, an ascending chain of ideals which does not stabilize, contradicting the hypothesis. We assert that m = I. If not, pick x ∈ I not in m; then m + Rx is a strictly larger finitely generated ideal contained in I, contradicting maximality. Hence I = m and I is finitely generated. This completes the proof.

We will see early in the next chapter that the notion of Noetherian can also be defined for topological spaces. We use this notion to relate Noetherian rings to a certain topology on Spec(R) called the Zariski topology. Another key class of rings are those with a single maximal ideal.

Definition 2.4.25.
Let R be a ring. We say that R is a local ring if it has a unique maximal ideal. It is customary to denote local rings as (R, m) or (R, m, k), where k = R/m is called the residue field of R. It is easy to show that m is precisely the set of all non-units in R.

We shall end this section with a discussion of localization and local rings; this will be related to some geometry in the next chapter. We postpone examples of local rings until the next chapter, where they arise quite naturally in the theory of manifolds and schemes.

Definition/Theorem 2.4.26.
A subset S ⊆ R containing 1 is called multiplicative if x, y ∈ S implies xy ∈ S. We define the localization with respect to the multiplicative set S as the set of symbols

S⁻¹R = { r/s : r ∈ R, s ∈ S } / ∼,  where r/s ∼ a/b if there exists t ∈ S such that t(rb − sa) = 0.

If S = R − p for some prime ideal p, then S⁻¹R = R_p is a local ring.

Proof. Using the standard definitions of addition and multiplication of fractions, one checks that S⁻¹R is indeed a ring. We now show that R_p is a local ring. We claim that pR_p is a maximal ideal in R_p. It is easily shown to be an ideal, so it suffices to show maximality. Suppose not; then by Proposition 2.4.15 there exists an ideal I of R_p with pR_p ⊊ I. As pR_p consists of all non-unit elements, I must contain a unit. Therefore I contains 1 and must be R_p itself. Hence pR_p is a maximal ideal. Uniqueness follows for the same reason: any ideal not contained in pR_p must contain a unit and is therefore the entire ring. Hence R_p is a local ring.

This theorem motivates calling the operation localization. As will be seen later, prime ideals are the most important ideals in a ring: they give a precise bijection between certain bits of geometry and algebra.

The final topic of this section is module theory. This is a generalization of vector spaces over a field, as we now allow the ground space to be a ring. It is far more common to come across modules over rings than vector spaces. For this reason, there is an entire theory of modules and their generalizations to categories which is used extensively in the next chapter. This section draws from [DF04], [Mat86], [Rot15], [Lan02], and [Kna06].

Definition 2.5.1.
Let R be a ring. An abelian group M is called an R-module if there exists an action map R × M → M which is associative and distributes over addition in each argument. We also impose that 1 · m = m for all m ∈ M. A module is said to be finitely generated if there exists a finite set S such that M = Span_R S; here we adopt the same notion of span as for vector spaces. A module homomorphism is an additive map which is R-equivariant. A module M is an R-algebra if M also has the structure of a ring: that is, a map µ : M × M → M which is a multiplication and satisfies the axioms of multiplication in a ring.

Let us look at some examples of modules and submodules.

Example 2.5.2. (a) Let M = R. Then R carries the structure of an R-module trivially. Further, every ideal I can be considered as an R-submodule. Moreover, we can define R^n = R ⊕ R ⊕ … ⊕ R as a module with multiplication in each coordinate.

(b) Let ϕ : A → B be a ring homomorphism. Then B can be given the structure of an A-module by defining a · b = ϕ(a)b. The properties of a ring homomorphism guarantee that this is indeed an action and satisfies the axioms of a module.

(c) Let F be a field and V a vector space over F. Then V is an F-module. In fact, F-modules are precisely the F-vector spaces; over a general ring this correspondence fails.

(d) Let R = Z and let M = Z ⊕ Z/mZ. Then M is an R-module with the multiplication defined by n · (a, [b]) = (na, [nb])
In fact, any abelian group G has a natural structure of a Z-module. This comes from the identification n · g = g^n or ng, depending on whether one uses multiplicative or additive notation.

(e) The polynomial ring R[x_1, …, x_n] is an R-module in the obvious way: r · f = rf.

The main difference between vector spaces and modules is that modules do not always have bases (in fact, they rarely have them). For example, over most rings the existence of quotient modules such as R/I shows that modules can look very different from R^n.

Let us consider some operations on modules.

Definition 2.5.3.
Let M, N be R-modules. We define M ⊕ N := {(m, n) : m ∈ M, n ∈ N} to be the external direct sum of modules. For a collection of modules {M_i} we can likewise take their direct sum; if the indexing set is infinite, we define ⊕_I M_i as the tuples with finitely many non-zero entries. The direct sum comes equipped with natural morphisms M, N ↪ M ⊕ N. We can also build the internal direct sum for two submodules of a larger module P. In this case we denote the internal direct sum as M + N := {m + n : m ∈ M, n ∈ N}. It is an easy exercise to show that the internal and external direct sums are isomorphic if M ∩ N = 0.

For M an R-module and N ⊆ M, we can define the quotient module M/N in the same way we defined V/W for vector spaces W ⊆ V. Furthermore, we can realize every submodule as the kernel of some module homomorphism via the short exact sequence

0 → N → M → M/N → 0

Definition/Proposition 2.5.4.
We say a short exact sequence 0 → M → N → P → 0 splits on the left if there exists a morphism N → M which, when composed with the inclusion, is the identity. It splits on the right if there exists a morphism P → N which composes with the projection to the identity. If a short exact sequence splits on the right, then N ≅ M ⊕ P. This gives a characterization of split short exact sequences.

Proof.
Let i : M → N and p : N → P be the arrows in the above exact sequence. As we assume the sequence is split, we know there exists j : P → N such that p ∘ j = 1_P. We will show that N = Im i ⊕ Im j. Let n ∈ N; then n − (j ∘ p)(n) ∈ ker p, as p(n − jpn) = p(n) − p(n) = 0. By exactness, ker p = Im i, so there exists m ∈ M such that i(m) = n − jpn. It follows that N is the internal direct sum N = Im i + Im j. We need to prove that Im i ∩ Im j = 0. Let a ∈ M and b ∈ P be such that i(a) = x = j(b). Applying p to both sides, we get 0 = p(i(a)) = p(j(b)) = b, so b = 0 and hence x = j(0) = 0. Hence, N ≅ M ⊕ P.

Corollary 2.5.5. If 0 → U → V → W → 0 is a short exact sequence of vector spaces, then the following conditions are equivalent:

(a) The sequence splits on the left.
(b) The sequence splits on the right.
(c) V ≅ U ⊕ W.

Corollary 2.5.6 (Rank–Nullity Theorem). For a linear map f : V → W between vector spaces, we have that rank(f) + nullity(f) = dim V.

Proof.
Set up the exact sequence

0 → ker f → V → Im(f) → 0

Every subspace of a vector space admits a complement, so this sequence splits and V ≅ ker f ⊕ Im(f). Taking dimensions gives rank(f) + nullity(f) = dim V.

Example 2.5.7.
Consider the exact sequence of Z-modules

0 → Z → Q → Q/Z → 0

This sequence does not split: Q ≇ Z ⊕ Q/Z. An easy calculation shows that Q/Z has elements of arbitrary finite order, while Q is torsion-free, so Q/Z cannot be a direct summand of Q.

In fact, it is a fairly standard exercise to show that a short exact sequence of R-modules splits if and only if the middle term is the internal direct sum of the image of the first map and a submodule mapping isomorphically onto the third term.

We now move on to some theorems for modules which we have seen before.

Theorem 2.5.8 (First and Fourth Isomorphism Theorems). (a) Let ϕ : M → N be an R-module homomorphism. Then M/ker ϕ ≅ ϕ(M).

(b) Let N ⊆ M. Then the submodules of M/N are in one-to-one correspondence with the submodules of M containing N.

Proof.
The proof of this follows immediately from the proof for the ring case: Theorems 2.4.12 and 2.4.13.

It should be no surprise at this point that this theorem is true. After all, we can regard modules as abelian groups, and the result held true there. The only thing needing to be checked is R-equivariance, but this follows immediately from the definitions.

Definition 2.5.9. We call an R-module M free if M ≅ R^n = R ⊕ R ⊕ … ⊕ R for some n ∈ N. We say that M is finitely presented if there exists a short exact sequence

0 → K → F → M → 0

where K, F are free, finitely generated R-modules. For any set S we can build the free R-module R⟨S⟩ with basis S. The following theorem gives a universal property for such modules.

Theorem 2.5.10 (Universal Property of Free Modules). Let S be a set and M an R-module such that there exists a map ϕ : S → M. Then there exists a unique R-module homomorphism ϕ̂ : R⟨S⟩ → M such that ϕ̂ ∘ i = ϕ, where i : S → R⟨S⟩ is the inclusion of the basis.

The idea of finitely presented modules becomes important for the theory of sheaves, which will be developed at the end of the next chapter. For now, we have a fundamental result on modules over a principal ideal domain.
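As a concrete preview, decompositions of this kind are effectively computable over Z: putting a relation matrix into Smith normal form reads off the invariant factors. A minimal sketch, assuming the sympy library is installed (the matrix A is our own illustrative choice):

```python
# Invariant-factor decomposition over the P.I.D. Z via the Smith normal form
# (assumes sympy is installed; the relation matrix A is an illustrative choice).
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# Relation matrix of a finitely generated abelian group (= Z-module):
# the quotient Z^3 / (column span of A).
A = Matrix([[2, 0, 0],
            [0, 4, 0],
            [0, 0, 0]])

D = smith_normal_form(A, domain=ZZ)
# Nonzero diagonal entries are the invariant factors r_i; zero rows/columns
# contribute free summands, so here the module is Z/2 ⊕ Z/4 ⊕ Z.
assert D == Matrix([[2, 0, 0], [0, 4, 0], [0, 0, 0]])
```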
Theorem 2.5.11 (Fundamental Theorem of Finitely Generated Modules over a P.I.D.). Let R be a P.I.D. and M a finitely generated R-module. Then

M ≅ R^k ⊕ R/⟨r_1⟩ ⊕ R/⟨r_2⟩ ⊕ … ⊕ R/⟨r_m⟩

for some non-unit elements {r_i}.

Proof.
We will show existence of such a decomposition. Let v_1, …, v_n be a generating set for M and consider R⟨x_1, …, x_n⟩, the free module on the same number of generators. There is a homomorphism ϕ : R⟨x_1, …, x_n⟩ → M which sends x_i ↦ v_i, and it is surjective by construction. Therefore, by Theorem 2.5.8, M ≅ R⟨x_1, …, x_n⟩/ker ϕ. One can show that the basis may be chosen so that r_1 x_1, …, r_m x_m is a generating set for ker ϕ, and therefore ker ϕ = ⊕_{i ≤ m} R r_i x_i. Taking the quotient, we get

M ≅ (⊕_{i ≤ n} R x_i) / (⊕_{i ≤ m} R r_i x_i) ≅ (⊕_{i ≤ m} R x_i / R r_i x_i) ⊕ R^{n−m}

The terms of the direct sum become R/⟨r_i⟩ under the natural identification. Hence,

M ≅ R^{n−m} ⊕ R/⟨r_1⟩ ⊕ R/⟨r_2⟩ ⊕ … ⊕ R/⟨r_m⟩

As Z is a principal ideal domain, this applies to abelian groups as well. As will be seen in the next chapter, this theorem becomes incredibly important in homology theory for finite cell complexes.

2.5.2 Multilinear algebra

The final part of this chapter concerns multilinear algebra. In this section we introduce another operation on modules which gives a way of building new modules from old ones, the tensor product, and we discuss related topics such as exterior powers of modules and vector spaces. For this section, let R be a ring, let L, M, N be R-modules, and let S ⊇ R be a ring containing R.

Definition 2.5.12.
We call a function θ : M ⊕ N → L bilinear if it is linear in each argument. More generally, a multilinear function f : ⊕_i M_i → L is one which is linear in each argument.

It should not come as a surprise that multilinear functions are more difficult to deal with than linear functions. There is a way to convert between the two, but it involves a new module.

Definition 2.5.13 (Definition/Construction). Let F(M × N) denote the free R-module generated by M × N. Consider the submodule G generated by the relations

(a + a′, b) ∼ (a, b) + (a′, b)
(a, b + b′) ∼ (a, b) + (a, b′)
(ar, b) ∼ (a, rb)   for r ∈ R

We define M ⊗_R N = F(M × N)/G. As R is commutative, M ⊗_R N is an R-module with scalar multiplication induced by the final relation. There is a canonical map ⊗ : M × N → M ⊗_R N which sends (m, n) ↦ m ⊗ n. Elements of M ⊗_R N are sums of the formal symbols m ⊗ n; a single such symbol is called a simple tensor.

Theorem 2.5.14 (Universal Property). For every bilinear map ϕ : M × N → L there exists a unique linear map ϕ̂ : M ⊗ N → L such that ϕ = ϕ̂ ∘ ⊗.

This theorem is sometimes given as the definition of the tensor product, as it implies the tensor product is unique up to isomorphism. The nice part of this theorem is that it gives a bijection

Bil(M, N; L) ≅ Hom_R(M ⊗ N, L)

where Bil(M, N; L) is the set of bilinear maps M × N → L. Therefore, we can turn multilinear functions into linear ones by using the appropriate number of tensors. This is also true for arbitrary collections of modules and multilinear maps. Let us look at some immediate applications of tensor products.

Proposition 2.5.15.
Let V, W be finite dimensional vector spaces over a field K. Then there is an isomorphism

V* ⊗ W ≅ Hom_K(V, W)

which sends Π(ϕ ⊗ w) = ϕ(−)w.

Lemma 2.5.16. For V, W as above, dim V ⊗ W = dim V · dim W.

Proof. Let B, C be bases for V and W respectively. Then we can take as a basis for V ⊗ W the set of simple tensors b_i ⊗ c_j for b_i ∈ B and c_j ∈ C. There are |B| · |C| of these.

Proof of Proposition 2.5.15.
It suffices to prove that the map Π is injective: the dimensions of the two spaces agree by Lemma 2.5.16, so we can then apply Theorem 2.3.35(d). So, let φ ⊗ w ∈ ker Π. Then Π(φ ⊗ w)(v) = φ(v)w = 0 for all v ∈ V. If w ≠ 0, then φ(v) = 0 for all v ∈ V, and therefore, by the uniqueness of 0 ∈ V*, we have φ = 0. In either case φ ⊗ w = 0. This completes the proof.
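Concretely, for V = R³ and W = R², the proposition says that a simple tensor φ ⊗ w acts as the rank-one map v ↦ φ(v)·w, and general linear maps are sums of such maps. A numpy sketch (numpy assumed available; the particular vectors are our own illustrative choices):

```python
# V* ⊗ W ≅ Hom(V, W) in coordinates (assumes numpy; vectors chosen for illustration).
import numpy as np

# A simple tensor φ ⊗ w becomes the rank-one map v ↦ φ(v)·w,
# whose matrix is the outer product of w with φ.
phi = np.array([1.0, 2.0, 3.0])           # a functional φ ∈ V*, V = R^3
w = np.array([1.0, -1.0])                 # a vector w ∈ W, W = R^2

T = np.outer(w, phi)                      # the matrix of Π(φ ⊗ w), shape (2, 3)
v = np.array([1.0, 0.0, 2.0])
assert np.allclose(T @ v, (phi @ v) * w)  # Π(φ ⊗ w)(v) = φ(v)·w

# Dimension count of Lemma 2.5.16: simple tensors of basis vectors,
# realized as Kronecker products, give dim V · dim W independent vectors.
basis = [np.kron(e, f) for e in np.eye(3) for f in np.eye(2)]
assert np.linalg.matrix_rank(np.stack(basis)) == 3 * 2
```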
Remark 2.5.17.
Notice that the map itself is canonical, but the choice of bases in the proof is not: on simple tensors ϕ ⊗ w ∈ V* ⊗ W the assignment is canonical, and bases are only needed to verify that the extension to the entire tensor product is an isomorphism.

Sometimes it is useful to consider the module M as an S-module instead of an R-module. The following lemma gives a way to do such a thing.

Lemma 2.5.18.
We can extend scalars from R to S by sending

M ↦ M ⊗_R S

where the R-module structure on S is given by the inclusion R ⊆ S. The result is an S-module.

Proof. Let s ∈ S. We need to define s(m ⊗ t) and then extend by linearity. Simply define s(m ⊗ t) = m ⊗ st. As R and S are commutative, this is a valid action.

Lemma 2.5.19.
There is a canonical isomorphism R ⊗_R M ≅ M for any R-module M.

Proof. Let ϕ : R ⊗_R M → M be given by ϕ(∑ r_i ⊗ m_i) = ∑ r_i m_i. We claim this is an isomorphism. Consider the map m ↦ 1 ⊗ m. This is an inverse for ϕ on both the left and right. Hence, ϕ is an isomorphism.

Now let us investigate the module M and its tensor powers M^⊗n = ⊗^n M. These spaces parametrize, in some sense, the multilinear maps ∏^n M → M. We can build an algebra out of these modules by taking a large direct sum.

Definition 2.5.20.
Let M be an R-module. The tensor algebra of M is the R-algebra

T•(M) = ⊕_{n ∈ N} M^⊗n

The algebra structure on T•(M) is given by concatenation: for v ∈ M^⊗n and w ∈ M^⊗m, the product is v ⊗ w ∈ M^⊗(m+n). (Some authors simply write T(M) or T*(M) for the tensor algebra; we do not, as it would become difficult to distinguish T(M) from TM in the next chapter.)

We have the following universal property of the tensor algebra.

Proposition 2.5.21 (Universal Mapping Property of the Tensor Algebra). Let A be an R-algebra and f : M → A an R-module homomorphism. Then there exists a unique R-algebra homomorphism f̂ : T•(M) → A extending f, that is, f̂ ∘ i = f for the inclusion i : M → T•(M).

The proof is of the same flavor as the other universal mapping properties and thus will not be reproduced here. What we will concern ourselves with, however, is a certain ideal of T•(M).

Definition 2.5.22.
A tensor in T•(M) is called alternating if it has the form

v = m_1 ⊗ … ⊗ m ⊗ … ⊗ m ⊗ … ⊗ m_n

that is, some element of M appears at least twice among the factors. Let J be the ideal of T•(M) generated by all such alternating tensors; equivalently, J = ⟨v ⊗ v : v ∈ M⟩.

Lemma 2.5.23. If Char(R) ≠ 2, then J coincides with the ideal L = ⟨x ⊗ y + y ⊗ x : x, y ∈ M⟩. Note that the identity (x + y) ⊗ (x + y) − x ⊗ x − y ⊗ y = x ⊗ y + y ⊗ x shows L ⊆ J in any characteristic.

Definition/Theorem 2.5.24.
Let R be a ring with Char(R) ≠ 2 and put ⋀•(M) = T•(M)/J. This is called the exterior algebra of M, and it has the following universal property: given any R-algebra A and a module map φ : M → A such that φ(m)φ(m) = 0 for all m ∈ M, there exists a unique algebra homomorphism ⋀•(M) → A which makes the associated diagram commute.

Proof. The universal property of the tensor algebra gives us a map Ψ : T•(M) → A. Since Ψ(m ⊗ m) = φ(m)φ(m) = 0 for all m ∈ M, the ideal J lies in the kernel of Ψ. Hence Ψ descends to a map on ⋀•(M). This completes the proof.

It is common practice to denote elements of ⋀•(M) with ∧ instead of ⊗. In this notation we get immediately that v ∧ w = −w ∧ v, which is equivalent (away from characteristic 2) to the condition v ∧ v = 0.

Remark 2.5.25.
We shall end this section with some nice properties of the exterior algebra, for ready use in the next chapter.

(a) We can build ⋀^k(M) in a similar way to ⋀•(M): simply quotient the tensor power M^⊗k by the alternating tensors it contains. In this vein, ⋀^k(M) ∧ ⋀^l(M) ⊆ ⋀^{k+l}(M), which gives ⋀•(M) its algebra structure.

(b) If V is a finite dimensional vector space of dimension n, then it can be shown that dim ⋀^k(V) = (n choose k). Therefore ⋀•(V) is a finite dimensional algebra.

(c) Recall the definition of a Lie algebra from above. A different way to state the conditions of a Lie algebra is that V is a vector space equipped with a map [−,−] : ⋀²(V) → V satisfying the Jacobi identity.

(d) As we will see in the next section, we can equivalently consider ⋀^k(V*) as the vector space of alternating k-forms on V. This allows us to do calculus on these spaces and is a bridge between the theory of manifolds (Chapter 3) and algebra, among others.

(e) (Determinants) Let V have dimension n and consider the top exterior power ⋀^n(V). This is a 1-dimensional space by (b) above. Consider any T ∈ Hom(V, V) := End(V) and define the extension

T : ⋀^n(V) → ⋀^n(V),  T(v_1 ∧ … ∧ v_n) = Tv_1 ∧ … ∧ Tv_n

As this is an endomorphism of a 1-dimensional space, it must be multiplication by some λ ∈ K. We therefore define the determinant of T to be the unique number λ such that

Tv_1 ∧ … ∧ Tv_n = (det T)(v_1 ∧ … ∧ v_n)

It then follows from the definition that for S, T ∈ End(V) we get det ST = det S · det T. Readers familiar with the determinant formula for a matrix should recognize this as the standard property of the determinant. Furthermore, we have the following lemma.

Lemma 2.5.26. A matrix M is invertible if and only if det M ≠ 0.

Proof. Abusing notation, by Theorem 2.3.39 we consider the linear transformation associated to the matrix M. Then det M ≠ 0 if and only if the induced map M : ⋀^n(V) → ⋀^n(V) is an isomorphism, which holds if and only if M itself is an isomorphism. Hence, M has an inverse as a linear transformation and thus as a matrix.

This also gives a nice way to think about the determinant: it is the signed volume of the parallelepiped spanned by the vectors Tv_1, …, Tv_n. This completes the chapter.

Chapter 3
Topology and Geometry: From Spaces to Sheaves

This chapter runs through the basics of category theory, (point-set) topology, differential geometry, and sheaf theory. The main goal is to define and give important properties of manifolds. To mathematicians, these are generalizations of Euclidean space and provide a natural context in which to do calculus on non-flat spaces (more on this in Section 3.3). There is some ambiguity in how "manifold" is used by psychologists, which causes technical problems when comparing computational models that claim to rely on "manifold" structure. We shall give the formal mathematical constructions of these objects and, in Chapter 4, use them to construct a perceptual space which encodes the generalized perceptual categories of Chapter 1. Before then, we want to bridge the gap from the previous chapter to this one by exploring category theory.
Category theory began with the observation that many well known results of algebra (such as the isomorphism theorems above) seemed to be linked. We now know that this is because they follow from general facts about what are known as Additive and Abelian categories. Although this theory is beautiful to those who fully understand the concepts, it can seem esoteric and impenetrable to beginners. As we are assuming little to no familiarity with these topics, we shall go into a bit more detail for most of the proofs in this section and provide several examples for each definition and theorem. For references, we make extensive use of [ML71], [Kna06], [Kna07], [Rot09], and [Lee12].
Before giving the definition of a category, we want to understand more precisely the language used in the previous chapter. The main goal is to understand the relationship between morphisms of groups, rings, and modules. Category theory provides a setting in which these are all intimately related.

Example 3.1.1. Let G, H be groups (not necessarily abelian). Denote by Hom(G, H) the set of all group homomorphisms. If G and H are assumed to be abelian, then Hom(G, H) can be endowed with the structure of an abelian group in a natural way, with n · f defined by (n · f)(g) = f(ng) = n f(g) ∈ H. Notice that for H non-abelian we can still define a Z-module structure on Hom(G, H) by (n · f)(g) = f(ng), and we can similarly define a Z-module structure if G is non-abelian and H is abelian. Thinking of Hom as a function on the set of all groups, we can ask how it interacts with group homomorphisms. To check this, let ϕ : G → G′ be a morphism of groups. Define

ϕ* : Hom(G′, H) → Hom(G, H),  f ↦ f ∘ ϕ

If instead we had a morphism ψ : H → H′, then there is a canonical map

ψ* : Hom(G, H) → Hom(G, H′)

defined as you would imagine. Therefore, Hom can detect which argument a morphism was taken in: in the first argument the direction is reversed, whereas in the second argument the direction is preserved.

If we generalize the above example to rings and ring homomorphisms, we get the exact same result. Let R, R′, S, S′ be rings and ϕ : R → R′, ψ : S → S′ be ring homomorphisms. Then ϕ* and ψ* are defined according to the definitions above. The same story works with modules as well. This should not be surprising, however, as every abelian group is a Z-module and we know how Hom works for abelian groups.

This refines the original observation about Hom: it can detect which argument is being manipulated, but it cannot (without some poking) detect group, ring, or module structures. What we do know is that it plays suitably nicely with morphisms of the appropriate objects. It is precisely this notion which categories and functors generalize.

Definition 3.1.2.
A (small) category is a triple C = (Obj(C), Hom_C(−,−), ∘), with Obj(C) a set, an assignment to any two objects A, B ∈ Obj(C) of a set Hom_C(A, B) of morphisms between A and B, and a composition function ∘ such that for all A, B, C ∈ Obj(C),

∘ : Hom_C(B, C) × Hom_C(A, B) → Hom_C(A, C)

These are subject to the following axioms:

(a) Hom sets are disjoint (that is, every morphism has a unique domain and codomain).
(b) For every A ∈ Obj(C) there exists 1_A ∈ Hom_C(A, A) such that 1_A ∘ f = f and g ∘ 1_A = g whenever the compositions are defined.
(c) ∘ is associative.

(We are intentionally being sloppy here. As will be seen shortly, Hom(−,−) is a functor Grp → Set.)

If it is clear from the context, we shall simply write Hom(A, B) for the set of morphisms. A subcategory of C is a triple D = (Obj(D), Hom_D(−,−), ∘) where Obj(D) is a subset of Obj(C) and Hom_D(A, B) ⊆ Hom_C(A, B). Composition is taken as in C.

Notice that this definition does not require the objects themselves to be sets. This distinction is what makes proving things in category theory particularly frustrating: one cannot reference elements of an object when defining a morphism.

Example 3.1.3. (a) Consider a small directed graph with three vertices and some arrows between them (the original figure is omitted here).
Define a category C whose objects are the vertices of the graph, whose morphisms are the arrows, and whose composition is concatenation of paths. Notice that the objects of this category have no notion of element (i.e. they are not sets); therefore, if we wish to prove something about this category, we have to rely on "arrow theoretic" proof. That is to say, we need to understand the morphisms in the category instead of the objects.

(b) We now return to the algebraic objects of the previous chapter. For your favorite object from that chapter, it should be obvious that the objects of that type form a category. We denote the categories as such:

(i) Grp: the category of groups.
(ii) Ring: the category of rings.
(iii) Field: the category of fields.
(iv) R−Mod: the category of R-modules for a fixed ring R.
(v) Vect_K: the category of K-vector spaces.
(vi) Ab: the category of abelian groups.

Notice that Ab is a subcategory of Grp. In fact, every category above can be realized as a subcategory of Grp!

(c) The "category" of sets is denoted Set. The quotation marks are for caution: the "collection of all sets" is not itself a set (try to prove this!) but instead a proper class. We are going to ignore almost all set theoretic problems that may arise. Nonetheless, this is an honest category (once you fix your model of set theory) and it is quite important. A majority of what will come up when we discuss functors can be realized as some generalization of something involving sets.

Remark 3.1.4.
For the remainder of this thesis, we shall denote categories by calligraphic or script letters such as C when we are in a general setting, or by a corresponding bold-face name such as Grp for the category of groups.
Definition 3.1.5.
Let C and D be two categories. We define the product category C × D as the category whose objects are pairs (C, D) with C ∈ Obj(C) and D ∈ Obj(D), and whose morphisms (C, D) → (C′, D′) are pairs (f, g) of morphisms f : C → C′ and g : D → D′, composed componentwise.

Now that we have the notion of a category, we may ask if there are any "special" morphisms in a category; what we mean by special will become apparent shortly. Consider the category Set. The following lemma gives a different characterization of injective and surjective functions which is easily generalizable.
Lemma 3.1.6.
Let f : A → B and g : A′ → B′ be two functions. Then f is injective if and only if for any two arrows i_1, i_2 : C → A, the equality

f ∘ i_1 = f ∘ i_2 implies i_1 = i_2.

Similarly, g is surjective if and only if for any two arrows s_1, s_2 : B′ → C′, the equality

s_1 ∘ g = s_2 ∘ g implies s_1 = s_2.

This means that injective maps are left cancellable and surjective maps are right cancellable.

Proof. We shall prove the injective case and leave the surjective case to the reader. (⇐) Assume that f is left cancellable. For any a, a′ ∈ A, let ϕ_a : {∗} → A be the function which picks out the element a. If f(a) = f(a′), then f ∘ ϕ_a = f ∘ ϕ_{a′}, so left cancellability gives ϕ_a = ϕ_{a′} and hence a = a′. Thus f is injective. (⇒) The other direction is obvious from the definition of injectivity. This completes the proof.

Notice that we can rewrite the injectivity condition as a diagram: a parallel pair of arrows i_1, i_2 : C ⇒ A followed by f : A → B. More generally, we can think of arrows in arbitrary categories which have the left (resp. right) cancellable property.
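The equivalence in the lemma can be checked by brute force on small finite sets. A Python sketch (the function name `left_cancellable` and the particular maps are our own illustrative choices):

```python
# Injectivity as left-cancellability, checked by brute force on finite sets.
from itertools import product

def left_cancellable(f, A, C):
    """f (a dict on A) is left-cancellable iff f∘i1 = f∘i2 forces i1 = i2
    for all functions i1, i2 : C -> A, enumerated as tuples of values."""
    arrows = list(product(A, repeat=len(C)))   # all functions C -> A
    return all(i1 == i2
               for i1, i2 in product(arrows, repeat=2)
               if all(f[i1[c]] == f[i2[c]] for c in range(len(C))))

C = [0]                                        # a one-point test object
injective = {0: 0, 1: 1}                       # injective map on {0, 1}
collapse = {0: 0, 1: 0, 2: 1}                  # not injective on {0, 1, 2}
assert left_cancellable(injective, [0, 1], C)
assert not left_cancellable(collapse, [0, 1, 2], C)
```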
Definition 3.1.7.

Let C be a category and f : A → B a morphism. We say that f is monic when for any pair of morphisms g, h : C ⇒ A, the equality f ∘ g = f ∘ h implies g = h. We say that f is epic when for any pair of morphisms p, q : B ⇒ D, the equality p ∘ f = q ∘ f implies p = q. We call f an isomorphism if there exists r : B → A such that f ∘ r = 1_B and r ∘ f = 1_A. Further, we denote isomorphisms by either A ≅ B or A ∼→ B.

In all concrete categories (ones which can be realized as subcategories of Set), injective maps are monic and surjective maps are epic, mirroring the result of Lemma 3.1.6. In fact, this is precisely why the usual definitions of isomorphism coincide with the categorical one for all of the algebraic objects in Chapter 2! In general, the converse is not true. Let R, S be two rings and UR, US their underlying sets. Then an injective function f : UR → US need not be a ring homomorphism. For an easy example, consider R = S = Z and the map

2 : Z → Z,  x ↦ 2x

This is a perfectly well defined injective function but is definitely not a ring homomorphism, as 1 cannot be written as 2z for any z ∈ Z.

Something else which needs generalization is the equivalence in Set between isomorphisms and bijections. In a general category, every isomorphism is necessarily monic and epic, but the converse may fail (take, for example, the map above, modified so that it sends the identity to the identity). We want to deal with categories where this is true.
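The claim about the doubling map is easy to check directly (plain Python; a quick illustrative sketch):

```python
# The doubling map x ↦ 2x on Z is injective and respects addition,
# but it is not a ring homomorphism: it fails on products and on 1.
double = lambda x: 2 * x

assert double(3 + 4) == double(3) + double(4)    # a homomorphism of (Z, +)
assert double(3 * 4) != double(3) * double(4)    # 24 ≠ 48: not multiplicative
assert double(1) != 1                            # does not preserve the identity
```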
Definition 3.1.8.
A category B is called balanced if every morphism which is both monic and epic is an isomorphism.

Many concrete categories are balanced; more often than not, this is something which needs to be proven, but the proof is usually not too hard.

Before moving forward, it is important to single out some distinguished objects of certain categories.

Definition 3.1.9.
An object T ∈ C is a terminal object if for every object A ∈ C there exists a unique morphism (denoted ∃!) A → T. An object I ∈ C is initial if for every object A ∈ C there exists a unique morphism I → A. A zero object is an object which is both terminal and initial.

Proposition 3.1.10. Initial, terminal, and zero objects are unique up to unique isomorphism.

Proof. The proofs for initial, terminal, and zero objects are essentially identical; for this reason, we shall only prove the initial case. Let I_1, I_2 be two initial objects. By definition there exist unique morphisms ι_1 : I_1 → I_2 and ι_2 : I_2 → I_1. It suffices to show that ι_2 ∘ ι_1 = 1_{I_1} and ι_1 ∘ ι_2 = 1_{I_2}. As the objects are initial, the set Hom(I_i, I_i) contains a single element, namely 1_{I_i}. As the composition ι_2 ∘ ι_1 lies in Hom(I_1, I_1), it must be 1_{I_1}. By the same reasoning, ι_1 ∘ ι_2 = 1_{I_2}. Hence I_1 ≅ I_2 and this isomorphism is unique.

Example 3.1.11.
Zero, initial, and terminal objects are incredibly important in the theory of abelian categories (section 3.1.4). For this reason, we give the following examples:

(a) In Grp the zero object is the trivial group G = {1}.
(b) In Ring the initial object is Z, while there is no terminal object.
(c) In R-Mod the zero object is the 0 module.

Functors
Now that we have the notion of a category, we want to define morphisms of categories. Similar to the restrictions on a ring homomorphism, we want a morphism of categories to preserve both the objects and the morphisms.
Definition 3.1.12.
Let C, D be two categories. A (covariant) functor F : C → D is an assignment on objects and morphisms subject to the following:

(a) For all A ∈ Obj(C), F(A) ∈ Obj(D), and similarly for morphisms.
(b) If A −f→ B −g→ C is a sequence of morphisms in C, then F(g ◦ f) = F(g) ◦ F(f) is a morphism in D.
(c) F(1_A) = 1_{F(A)}.

Dually, we have the notion of contravariant functors, for which F(g ◦ f) = F(f) ◦ F(g). It is common practice to write FX for an object as opposed to F(X). We shall use these notations interchangeably.

Functors play a core role in the rest of the theory presented in this thesis. Specifically, they will form an important class of objects called sheaves (see section 3.3.2 below) which will ease the technical burden of understanding the geometry of perceptual spaces.

Lemma 3.1.13.
Let F : C → D be a functor. If ϕ : A → B is an isomorphism in C, then F(ϕ) is an isomorphism in D.

Proof.
Let ψ be ϕ^{−1} in C. Computing F(ϕ ◦ ψ) and F(ψ ◦ ϕ), we see that by property (b) of the definition of a functor, we have

1_{F(A)} = F(1_A) = F(ψ) ◦ F(ϕ)
1_{F(B)} = F(1_B) = F(ϕ) ◦ F(ψ)

Hence, F(ϕ) is an isomorphism.

The following examples of functors will play an exceptional role in section 3.3 below.

Example 3.1.14.

(a) Let (−)^op : Cat → Cat be an endofunctor of the category of categories (the morphisms in this category are functors). This sends a category C to the opposite category C^op. The objects of this category are the objects of C, but the morphisms have their target and source flipped. That is, if f : A → B is a morphism in C then f^op : B → A is a morphism in C^op. This allows us to redefine contravariant functors as covariant functors from the opposite category. As an added fact, (C^op)^op = C.

(b) Consider Hom_C(−, −) : C^op × C → Set. This is a bifunctor; viewed on C, it is contravariant in the first argument and covariant in the second argument.

(c) In R-Mod, − ⊗_R − is a bifunctor, covariant in both arguments. As we assume R is commutative, ⊗ makes R-Mod into a symmetric monoidal category. Algebras are monoid objects in this category.

(d) Let U : Grp → Set be the forgetful functor which sends a group to its underlying set. In fact, in any concrete category we have a forgetful functor to Set.

If C and D are categories, then denote by

Fun(C, D) := {F : C → D}

We want to turn this into a category. In order to do this, we need to introduce the idea of a morphism of functors.

Definition 3.1.15.
Let F, G : C → D be two functors of the same variance. A natural transformation is a family of morphisms {τ_X} which intertwine the functors: for every morphism f : X → Y, the square with sides τ_X : F(X) → G(X), F(f), G(f), and τ_Y : F(Y) → G(Y) commutes, i.e. τ_Y ◦ F(f) = G(f) ◦ τ_X. In this case we write τ : F → G.

These define the morphisms in Fun(C, D) and make it a category. Isomorphisms are natural transformations for which every τ_X is an isomorphism in D. In this case, we say that the two functors are naturally equivalent. The following lemma gives a description of natural transformations involving the Hom(A, −) functor.

Lemma 3.1.16 (Yoneda Lemma). Let G : C → Set be a functor and A an object in C. Then there is a bijection

y : Nat(Hom(A, −), G) → G(A)

Proof.
Define y(τ) = τ_A(1_A). To show this is injective, suppose y(τ) = τ_A(1_A) = σ_A(1_A) = y(σ). For any object B ∈ C and ϕ ∈ Hom(A, B), naturality gives a commutative square relating ϕ_* : Hom(A, A) → Hom(A, B) and Gϕ : G(A) → G(B), so that

τ_B(ϕ) = Gϕ(τ_A(1_A)) = Gϕ(σ_A(1_A)) = σ_B(ϕ)

Hence, τ_B = σ_B for all B ∈ C and thus τ = σ. So y is injective. (We shall not define monoidal categories here, but instead suggest [Kas95, Chapter XI]. Kassel uses the term tensor category, which is equivalent to "monoidal category.")
To show it is surjective, let x ∈ G(A). For every object B ∈ C and ψ ∈ Hom(A, B), define τ_B(ψ) = (Gψ)(x). We claim that τ is a natural transformation. Indeed, for any θ ∈ Hom(B, C), consider the commuting square relating θ_* : Hom(A, B) → Hom(A, C) and Gθ : G(B) → G(C). Going clockwise we get Gθ(τ_B(ψ)) = Gθ(Gψ(x)). Going counter-clockwise we have τ_C(θ_*ψ) = τ_C(θψ) = G(θψ)(x). As G is a functor, these are equal. Thus, τ is a natural transformation and τ_A(1_A) = G(1_A)(x) = x. Hence y is bijective. This completes the proof.

Now let F : C → D be a functor and X, Y ∈ C. Then F induces a function on Hom-sets

F_{X,Y} : Hom_C(X, Y) → Hom_D(FX, FY)

which takes a morphism f to F(f).

Definition 3.1.17.
We say that F is:

(a) Full if F_{X,Y} is surjective for all X, Y.
(b) Faithful if F_{X,Y} is injective for all X, Y.
(c) Fully-Faithful if F_{X,Y} is bijective for all X, Y.

Therefore, concrete categories are those which admit a faithful functor into Set. In general, fully-faithful functors play the same role as bijective functions on sets. In Cat, isomorphisms are necessarily fully-faithful. In general, a bijection on the level of Hom-sets is incredibly important.
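Stepping back to Definition 3.1.12, the functor laws can be spot-checked in a toy model. A Python sketch (our own construction, not from the text) of the "list functor" on a fragment of Set, which sends a set X to lists over X and a function f to its elementwise map:

```python
# A toy model of a covariant functor: X ↦ lists over X, f ↦ elementwise map.
def fmap(f):
    return lambda xs: [f(x) for x in xs]

def compose(g, f):
    return lambda x: g(f(x))

f = lambda x: x + 1
g = lambda x: 2 * x
xs = [1, 2, 3]

# Functor law (b): F(g ∘ f) = F(g) ∘ F(f).
assert fmap(compose(g, f))(xs) == compose(fmap(g), fmap(f))(xs)  # [4, 6, 8]

# Functor law (c): identities go to identities.
identity = lambda x: x
assert fmap(identity)(xs) == xs
```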
We now explore the final claim of the previous part. Let F : C ⇄ D : G be functors such that there exists a natural transformation η : 1_C → GF. Then we want to understand the induced morphism

Hom_D(FX, Y) → Hom_C(X, GY)

Definition 3.1.18.
Let F : C ⇄ D : G. We say that (F, G) are an adjoint pair if

Hom_D(FX, Y) ∼→ Hom_C(X, GY)

for all X ∈ C, Y ∈ D, and the bijection is natural in X and Y. In this case, we say that F is left adjoint to G and G is right adjoint to F. We denote this by F ⊣ G.

Theorem 3.1.19. An adjoint pair (F, G) induces two natural transformations η : 1_C → GF and ε : FG → 1_D such that the compositions

F −Fη→ FGF −εF→ F        G −ηG→ GFG −Gε→ G

are the identity morphisms.

Proof.
Let ϕ_{X,Y} : Hom_D(FX, Y) ∼→ Hom_C(X, GY) be the bijection for the adjoint pair. Then if Y = FX, the element 1_{FX} ∈ Hom_D(FX, FX) induces a morphism

η_X := ϕ_{FX,FX}(1_{FX}) : X → GFX
Define η : 1_C → GF by the family {η_X}. We need to show that η is natural in X: for any f : X → Y, the square with sides η_X, f, GF(f), and η_Y commutes by the fact that ϕ is natural in both X and Y. Similarly, we define ε_Y := ϕ^{−1}_{GY,Y}(1_{GY}). Its naturality is checked in a similar manner. Now,

1_{GY} = ϕ_{GY,Y}(ε_Y) = G(ε_Y) ◦ η_{GY}

again by the naturality of ϕ. We have the respective statement for 1_{FX}. This completes the proof.

Remark 3.1.20.
The natural transformations η : 1_C → GF and ε : FG → 1_D are called the unit and counit of the adjunction. We then denote an adjunction as a quadruple (F, G, η, ε).

Corollary 3.1.21. If (F, G, η, ε) and (F′, G, η′, ε′) are adjoint pairs, then F and F′ are naturally isomorphic.

Proof. η and η′ are universal arrows for each X. Therefore, there exists a unique isomorphism θ_X : FX → F′X for all X ∈ C. This family of isomorphisms is natural in X by the universality of the units. Hence, F ≅ F′ naturally.

Adjoint functors play a large role in understanding isomorphisms of categories. In fact, two categories are equivalent if there exists an adjoint pair (F, G, η, ε) such that η and ε are natural isomorphisms. To build up some intuition, here are some examples of adjoint functors.

Example 3.1.22.

(a) Let ⟨−⟩ : Set → Grp be the free group functor and U the forgetful functor. This sends a set X to the group ⟨X⟩ generated by all words in the elements of X. It is characterized by the property that for any function f : X → G with G a group, there exists a unique group homomorphism f̂ : ⟨X⟩ → G. We claim this makes ⟨−⟩ ⊣ U. In fact, the universal property gives a bijection

Hom_Grp(⟨X⟩, G) ← Hom_Set(X, UG)

In fact, for any concrete algebraic object we get an adjunction between the free functor and the forgetful functor in the same way.

(b) Consider Hom(M, −) and − ⊗_R M as covariant endofunctors of R-Mod. Then for any objects A, B ∈ R-Mod, there is a bijection

Hom(A ⊗ M, B) → Hom(A, Hom(M, B)),  f ↦ f̃

where f̃(a)(m) = f(a ⊗ m). In this case, we have some additional facts that come from the adjunction. The two most interesting (and important) ones are:

Hom(M, ∏ A_i) = ∏ Hom(M, A_i)
M ⊗ (⊕ A_i) = ⊕ (M ⊗ A_i)

for arbitrary indexing sets. We will see shortly that this is more generally a property of adjoint functors between abelian categories.

Limits and Colimits
We now want to generalize the last example and understand products and coproducts in generic categories. These manifest as limits and colimits respectively. Recall that a product of two objects A, B is an object A × B together with two maps A × B → A and A × B → B. To be more precise, this is the universal such object: for any other object C with maps C → A and C → B, there exists a unique map C → A × B through which the maps to A and B factor. Let us now generalize this.
Definition 3.1.23. An inverse system in a category C is a collection indexed by a partially ordered set I,

{A_i, ϕ_ji : A_j → A_i}_{i ⪯ j}

such that ϕ_ji ◦ ϕ_kj = ϕ_ki for all i ⪯ j ⪯ k. Equivalently, an inverse system is a functor A : I^op → C such that A(i) = A_i and the arrows of I^op are sent to the morphisms ϕ_ji. Therefore, A ∈ C^{I^op} = Fun(I^op, C).

An inverse system is thus a diagram in the category C of shape I^op.

Example 3.1.24.

(a) Let I = {1, 2, 3} with the partial order generated by 1 ⪯ 2 and 1 ⪯ 3. Then diagrams of shape I^op look like the cospan A → C ← B.
(b) If I is discrete (that is, the only partial order is equality) then a diagram of shape I^op is an indexed family of objects. This is the case for products as above.
(c) Let M be a concrete object. Then the subsets of M are ordered under inclusion and thus give a diagram of shape M^op.

Definition 3.1.25.
Let A ∈ C^{I^op} be an inverse system. Then we define the inverse limit (projective limit or limit) as the universal object lim← A_i together with morphisms α_j : lim← A_i → A_j for all j satisfying the following compatibility conditions:

(a) ϕ_ji ◦ α_j = α_i for i ⪯ j.
(b) If C is an object of C together with morphisms {β_i} which are compatible with A, then there exists a unique morphism C → lim← A_i through which every β_i factors.

These objects are complicated to look at, but are so useful that it's worth the technicalities. The following examples tie together some previous topics which at first do not seem related, but are all examples of limits.
Example 3.1.26.

(a) Consider the diagram D in R-Mod given by the two parallel morphisms f, 0 : A ⇒ C. Then lim← D = ker f. In this case, we see that the limit has the set representation

lim← D = {x ∈ A : f(x) = 0}

In fact, arbitrary limits exist in R-Mod by a simple argument considering sets like those above.

(b) Clearly, products as above are now limits over the discrete set I = {1, 2}.

(c) We define the pullback of a diagram of the form Example 3.1.24 (a) to be its limit. Almost always, these have a set representation as in example (a) here. In this case, we denote lim← D = A ×_C B.

(d) If we want to define intersections without using elements, we can do it using limits. Let A → C and B → C be monic morphisms (they are subobjects). Taking the limit of this diagram we get A ∩ B together with morphisms i : A ∩ B → A and j : A ∩ B → B. The resulting morphisms are clearly monic.

We have the dual notion to the above construction.
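The set representation of Example 3.1.26(c) has a concrete shadow in Set, where the pullback of f : A → C and g : B → C is the set of pairs agreeing in C. A quick sketch (the maps f and g are choices of our own, not from the text):

```python
# Pullback in Set: A ×_C B = {(a, b) : f(a) = g(b)}.
A = range(10)
B = range(10)
f = lambda a: a % 3   # A -> C with C = {0, 1, 2}
g = lambda b: b % 3   # B -> C

pullback = {(a, b) for a in A for b in B if f(a) == g(b)}

# The two projections commute with f and g, as the universal
# property of the limit demands.
assert all(f(a) == g(b) for (a, b) in pullback)
assert (4, 7) in pullback and (4, 8) not in pullback
```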
Definition 3.1.27. A direct system in a category C is a collection indexed by a partially ordered set I,

{A_i, ϕ_ij : A_i → A_j}_{i ⪯ j}

such that ϕ_jk ◦ ϕ_ij = ϕ_ik for all i ⪯ j ⪯ k. Equivalently, a direct system is a functor A : I → C such that A(i) = A_i and A(i → j) = ϕ_ij. Therefore, A ∈ C^I = Fun(I, C).

Example 3.1.28.

(a) Let I = {1, 2, 3} with the partial order generated by 1 ⪯ 2 and 1 ⪯ 3. Then diagrams of shape I look like the span A ← C → B.
(b) If I is discrete (that is, the only partial order is equality) then a diagram of shape I is an indexed family of objects. This is the case for coproducts as above.
(c) Let M be a concrete object. Then the subsets of M are ordered under inclusion and thus give a diagram of shape M.

Definition 3.1.29.
Let A ∈ C^I be a direct system. Then we define the direct limit (inductive limit or colimit) as the universal object lim→ A_i together with morphisms α_j : A_j → lim→ A_i for all j satisfying the following compatibility conditions:

(a) α_j ◦ ϕ_ij = α_i for i ⪯ j.
(b) If C is an object of C together with morphisms {β_i} which are compatible with A, then there exists a unique morphism lim→ A_i → C through which every β_i factors.

Example 3.1.30.

(a) Consider the diagram D in R-Mod given by the two parallel morphisms f, 0 : A ⇒ B. Then lim→ D = coker f. In this case, we see that the colimit has the set representation

lim→ D = B / {f(x) : x ∈ A}

In fact, arbitrary colimits exist in R-Mod by a simple argument considering sets like those above.

(b) Clearly, coproducts as above are now colimits over the discrete set I = {1, 2}.

(c) We define the pushout of a diagram of the form Example 3.1.28 (a) to be its colimit. Almost always, these have a set representation as in example (a) here. In this case, we denote lim→ D = A ⊕_C B.

(d) If we want to define internal sums without using elements, we can do it using colimits. Let A, B be two objects. Then A ∩ B → A and A ∩ B → B are monic morphisms (they are subobjects). Taking the colimit of this diagram we get A + B together with morphisms i : A → A + B and j : B → A + B. The resulting morphisms are clearly monic.

The following proposition gives motivation for thinking of limits and colimits as functors.
Proposition 3.1.31.
Let I be a partially ordered set. Then all limits and colimits of shape I exist in R-Mod.

Proof.
We prove the case of limits. The case of colimits is then formally dual and left as a fun exercise. Consider L ⊆ ∏_{i ∈ I} A_i, the submodule of threads

L = {(a_i) : ϕ_ji(a_j) = a_i for all i ⪯ j}
By construction this comes with compatible maps α_i : L → A_i. Now let X be any module with compatible maps {β_i}. Define θ : X → ∏ A_i by

θ(x) = (β_i(x))

Then Im θ ⊆ L. Further, α_i ◦ θ : x ↦ (β_i(x)) ↦ β_i(x). Hence, the limit diagram commutes. To show that θ is unique, let π : X → L be another such morphism. Then π(x) = (a_i) and α_i(π(x)) = a_i. Thus if α_i(π(x)) = β_i(x) for all i, we have that π = θ and thus

L ≅ lim← A_i

This completes the proof.

This proposition says that R-Mod is complete and cocomplete (meaning that all limits and colimits exist). So clearly, lim→ : R-Mod^I → R-Mod is functorial. One might hope this holds in any category; however, it does not.
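The submodule of threads in the proof can be made concrete. A sketch for the inverse system Z/8 → Z/4 → Z/2 of reduction maps (a small example of our own choosing, not from the text):

```python
# Threads in the product Z/2 × Z/4 × Z/8 for the inverse system
# Z/8 -> Z/4 -> Z/2 given by reduction mod 2^n.  The limit L is the
# submodule of compatible tuples (a1, a2, a3) with a3 ≡ a2 (mod 4)
# and a2 ≡ a1 (mod 2).
from itertools import product

threads = [(a1, a2, a3)
           for a1, a2, a3 in product(range(2), range(4), range(8))
           if a3 % 4 == a2 and a2 % 2 == a1]

# Each a3 ∈ Z/8 determines the whole thread, so here L ≅ Z/8.
assert len(threads) == 8
assert (1, 3, 7) in threads
```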
Example 3.1.32.
Let Ring be the category of rings and {R_i}_{i ∈ I} an infinite family of nonzero rings, regarded as a diagram over the discrete index set I. Then the colimit computed as in R-Mod, the infinite direct sum ⊕ R_i, is not an object of Ring. Why is this? Well, the unit element would necessarily be (1, 1, ...). But this is non-zero in every entry and thus cannot be an element of the direct sum, whose elements have only finitely many non-zero entries. In fact, most categories are not complete or cocomplete. When they are, it is obvious that lim→ and lim← are functors. For more information, see [HS97].

R-Mod

We now move into the final subsection. Here we are interested in categories which generalize the category of R-modules or abelian groups. The defining characteristics of these categories are that we can:

• Always take kernels and cokernels.
• Have an object 0.
• Take arbitrary products and coproducts.
• Hom(A, B) is an abelian group (or R-module).

Which of these properties are necessary for the generalization? This section will give an answer. At the end, we will introduce some homological algebra. This will allow us to associate invariants to modules. We start with additive categories.

Definition 3.1.33.
A category A is additive if the following are true:

(a) Hom(A, B) is an abelian group for all A, B ∈ A.
(b) There exists a zero object 0.
(c) Composition is distributive. That is, f(g + h) = fg + fh and (g + h)i = gi + hi.
(d) Finite products and coproducts exist.

A functor F : A → B is additive if F(f + g) = F(f) + F(g). That is, the morphism F_{X,Y} : Hom(X, Y) → Hom(FX, FY) is a group homomorphism. The following proposition gives some properties of additive categories and additive functors.

Proposition 3.1.34.
Let A, B be additive categories. Then finite products and coproducts are isomorphic. Moreover, if T is an additive functor, then T(A ⊕ B) = T(A) ⊕ T(B).

For a proof of this statement see [Rot09]. Now, using the constructions of ker and coker from above, we can prove

Lemma 3.1.35.
Let f ∈ Hom_A(A, B) be a morphism in an additive category.

(a) If ker f exists, then f is monic if and only if ker f = 0.
(b) If coker f exists, then f is epic if and only if coker f = 0.

Proof.
Let ι : ker f → A be the morphism from the diagrammatic definition above. Suppose ker f = 0. If g : X → A satisfies f ◦ g = 0, then by the universal property of limits, there exists a morphism θ : X → ker f with g = ι ◦ θ = 0. Hence, f is monic.

For the opposite direction, consider the diagram K −ι→ A −f→ B. Since f ◦ ι = 0 = f ◦ 0 and f is monic, we have that ι = 0. The proof for cokernels is dual to this one.
Definition 3.1.36.
An additive category A is abelian if

(a) Every morphism has a kernel and cokernel.
(b) Every monomorphism is a kernel and every epimorphism is a cokernel.
Example 3.1.37. In R-Mod, we have that every submodule S ⊆ M can be realized as a kernel via the map M → M/S. Cokernels are then the projections as given by the first isomorphism theorem (Theorem 2.5.8). Therefore, the requirements of an abelian category make it look strikingly like R-Mod. We are now able to form the same definitions as in Chapter 2, but now in the context of abelian categories.
Definition 3.1.38.
A sequence of morphisms A −f→ B −g→ C in A is called exact if ker g = Im f as subobjects in A. Now let 0 → A → B → C → 0 be a short exact sequence. We say an additive functor F : A → B between abelian categories is

(a) Left Exact if 0 → FA → FB → FC is exact.
(b) Right Exact if FA → FB → FC → 0 is exact.
(c) Half Exact if FA → FB → FC is exact.
(d) Exact if 0 → FA → FB → FC → 0 is exact.

Lemma 3.1.39 (Snake Lemma). Consider a commuting diagram in an abelian category with exact rows

A′ → A → A′′ → 0
0 → B′ → B → B′′

and vertical morphisms ψ : A′ → B′, ϕ : A → B, and θ : A′′ → B′′. Then there exists a morphism ∂ : ker θ → coker ψ making the following sequence exact

ker ψ → ker ϕ → ker θ → coker ψ → coker ϕ → coker θ

Proof.
Extend the above diagram to include ker θ and coker ψ. Now form the pullback and pushout accordingly: adjoin A ×_{A′′} ker θ above the first row and coker ψ ⊕_{B′} B below the second. From this, we immediately see that the sequence

0 → A′ → A ×_{A′′} ker θ → ker θ → 0

is exact. Define σ := (A′ → A ×_{A′′} ker θ), γ := (coker ψ ⊕_{B′} B → B′′), and the composite morphism ẽ := (A ×_{A′′} ker θ → coker ψ ⊕_{B′} B). From the exactness of the rows in the above diagram, we get that γ ◦ ẽ = ẽ ◦ σ = 0, so ẽ factors through the cokernel of σ and the kernel of γ. As these two objects are ker θ and coker ψ, define δ : ker θ → coker ψ as this morphism.

This yields a sequence of morphisms

ker ψ → ker ϕ → ker θ −δ→ coker ψ → coker ϕ → coker θ

For all pairs of morphisms not involving δ, exactness follows immediately. For the remaining morphisms, note that it suffices to show that ker ϕ → ker θ → coker ψ is exact, as we can then dualize the argument to get the same result for the dual sequence. To show this, let S ∈ A and π : S → ker θ be any morphism such that δ ◦ π =
0. Form the pullback of S and A ×_{A′′} ker θ over ker θ and adjoin it to the diagram; the dashed morphism f : A ×_{A′′} ker θ → B′ exists by the fact that A ×_{A′′} ker θ → B → B′′ is the zero morphism. Now, the composition S → coker ψ is 0, and thus we can find an epic morphism S′ ↠ S such that the composition S′ → B′ factors through A′; denote by g the morphism S′ → A′. Define the composite morphism λ : S′ → A and then consider the difference λ − f ◦ k : S′ → A, where k is the morphism S′ → A ×_{A′′} ker θ obtained from the pullback. This must factor through ker ϕ by the commutativity of the diagram above. Hence, we get a commuting square with rows S′ → S and ker ϕ → ker θ. The existence of this commuting diagram is equivalent to the exactness of the sequence ker ϕ → ker θ → coker ψ. Dualizing this argument we get the exactness of the other morphisms. This completes the proof.

Now we can tie together adjoints and abelian categories.

Theorem 3.1.40. Let F : A ⇄ B : G be adjoint functors with F ⊣ G. Then F is right exact and G is left exact. Further, F(lim→ A_i) = lim→ F(A_i) and G(lim← A_i) = lim← G(A_i).

The proof relies on the Yoneda Embedding [ML71], which we will not cover. This theorem thus implies a stronger result than we stated before about Hom and ⊗.

Corollary 3.1.41.
Hom is left exact in both arguments and ⊗ is right exact in both arguments. Therefore, given a short exact sequence of R-modules 0 → A → B → C → 0, the resulting sequences

0 → Hom(Y, A) → Hom(Y, B) → Hom(Y, C)
Y ⊗ A → Y ⊗ B → Y ⊗ C → 0

are exact everywhere.

Thus, for the rest of this chapter, we shall assume we are working in R-modules. This may seem at first like we are becoming too specific to be of any use for category theory. The following theorem tells us that this is not correct.

Theorem 3.1.42 (Mitchell). Let A be a small abelian category. Then there exists an exact, fully-faithful functor A → R-Mod for some ring R.

See [Rot09] for details.
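The one-sided exactness in Corollary 3.1.41 is easy to witness numerically. Applying Hom(Z/2, −) to the exact sequence 0 → Z/2 → Z/4 → Z/2 → 0 (inclusion x ↦ 2x, reduction mod 2), the induced map on Hom-sets fails to be surjective. A sketch (our own worked example), identifying a homomorphism Z/m → Z/n with the image of 1:

```python
# Hom(Z/2, −) applied to 0 → Z/2 → Z/4 → Z/2 → 0.  The induced map
# Hom(Z/2, Z/4) → Hom(Z/2, Z/2) is NOT surjective: Hom is left
# exact but not right exact.
def homs(m, n):
    """Homomorphisms Z/m → Z/n, each recorded as the image k of 1."""
    return [k for k in range(n) if (m * k) % n == 0]

proj = lambda x: x % 2          # the projection Z/4 → Z/2

# Post-composition with proj sends h ∈ Hom(Z/2, Z/4) to proj ∘ h.
image = {proj(k) for k in homs(2, 4)}

assert homs(2, 4) == [0, 2]     # the zero map and x ↦ 2x
assert homs(2, 2) == [0, 1]     # Hom(Z/2, Z/2) ≅ Z/2
assert image == {0}             # the identity of Z/2 is never hit
```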
Projective, Injective, and Flat Modules

Definition 3.1.43. An R-module P is projective if for every surjective map M → N and any map P → N there exists a map P → M making the evident triangle commute. Dually, an R-module I is injective if for every injective map 0 → L → M and any morphism L → I there exists a morphism M → I making the evident triangle commute.

These definitions seem obtuse and out of nowhere. The following lemma makes them seem less arbitrary.
Lemma 3.1.44.
The functor Hom(P, −) is exact if and only if P is projective. Also, the functor Hom(−, I) is exact if and only if I is injective.

Proof. We prove the injective case and leave the projective one to the reader as it is the same argument. (⇒) Assume first that Hom(−, I) is exact. Then for any exact sequence of modules 0 → A → B → C → 0, the sequence

0 → Hom(C, I) → Hom(B, I) → Hom(A, I) → 0

is exact; in particular, Hom(B, I) → Hom(A, I) is surjective. Being surjective means that for any morphism ϕ : A → I, there exists a morphism ϕ̂ : B → I which restricts to ϕ along A → B. This is precisely the definition of I being injective.

Now assume I is injective. For any f ∈ Hom(A, I), the definition tells us that f = i*(g) for some g ∈ Hom(B, I), where i : A → B is the inclusion. Hence, i* : Hom(B, I) → Hom(A, I) is surjective and Hom(−, I) is exact. This completes the proof.

Definition 3.1.45.
A module M is called flat if − ⊗_R M is exact. Moreover, every projective module is flat [Rot09].

For a given module M, we want to understand how far M is from being projective, injective, or flat. Clearly the functors Hom and ⊗ alone will not tell us this: they imply only that M is or is not one of these special modules. To remedy this, we will find a free resolution of M which is quasi-isomorphic to M, so that we can measure how far M is from being one of the special modules above.

Definition 3.1.46. A free resolution of an R-module M is an exact sequence F• → M → 0: a collection of free modules F_i and morphisms α_i so that

... → F_1 → F_0 → M → 0

is exact (equivalently, M is the cokernel of the map F_1 → F_0). If every F_i is projective (resp. flat) then F• is a projective (resp. flat) resolution of M. As injective modules are dual to projective ones, we have that an injective resolution of M is an exact sequence 0 → M → I•.

We care about these resolutions because the quotients ker α_i / Im α_{i+1} vanish for i > 0, and if we truncate the sequence and only consider up to F_0, then the cokernel of α_0 is M. Therefore, this sequence is in some sense no different from M itself. The next part goes into more detail about this.

Derived Functors
For a general abelian category, we have the notion of short exact sequences. In addition to this, we have the notion of (co)chain complexes. These will be the central objects we want to consider when answering the questions posed in the previous section.
Definition 3.1.47.
Let (C•, d•) be a collection of objects in an abelian category A together with morphisms d_n : C_n → C_{n−1}. We call (C•, d) a chain complex if d_{n−1} ◦ d_n = 0. If instead we have an object (C•, ∂•) with ∂_n : C^n → C^{n+1} such that ∂_{n+1} ◦ ∂_n = 0, we call it a cochain complex. It is common practice to drop the index on the differential d• or ∂• and simply denote them d and ∂. We shall adopt this convention.

A morphism of (co)chain complexes (C•, d) and (D•, d′) is a chain map f• (resp. f^•), that is, a collection of maps f_n so that d′ ◦ f_n = f_{n−1} ◦ d for all n. With this notion of morphism, we can build a new category Ch(A) of chain complexes (and likewise for cochain complexes). Notice that because of the condition d ◦ d = 0, we have that Im d_n ⊆ ker d_{n−1}.

Definition 3.1.48.
Let (C•, d) be a chain complex. Define the n-th homology groups of C• as

H_n(C•) = ker d_n / Im d_{n+1}

These are in fact groups, as shown in [Rot09]. Two chain complexes are quasi-isomorphic if there exists a chain map f• : C• → D• such that (f_i)* : H_i(C•) ∼→ H_i(D•), where (f_i)* is defined as [α] ↦ [f_i ◦ α]. This is well-defined by the definition of a chain map; further, f_i ◦ α ∈ ker d′_i. We have, completely analogously, the definition of cohomology groups H^i(C•). We call a (co)chain complex exact if all of the (co)homology groups are identically 0.

We now return to the content of the previous section. Let M be an R-module and P• a projective resolution of M. It then follows from the discussion above that P• (truncated at P_0) and M are quasi-isomorphic as chain complexes (here M is considered as the trivial chain complex with differential 0 everywhere). We can use this to our advantage. For any R-module A, consider Hom(−, A). The resulting cochain complex

Hom(P_0, A) → Hom(P_1, A) → Hom(P_2, A) → ...

is no longer exact.

Definition 3.1.49.
The n-th cohomology groups or n-th Ext groups of M and A are denoted

Ext^n_R(M, A) := H^n(Hom(P•, A))

Remark 3.1.50.
It can be shown [HS97] that these groups do not depend on the resolution taken. In fact, it does not even matter if we resolve A or M: there is a dual construction of Ext^n(M, A) where, instead of a projective resolution of M, we take an injective resolution of A. For ⊗, we have the corresponding construction, but now we only use projective resolutions as ⊗ is covariant in both arguments.

Definition 3.1.51.
The n-th homology groups or n-th Tor groups of M are

Tor^R_n(M, A) := H_n(P• ⊗ A)
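For modules over Z these groups are computable directly from the resolution 0 → Z −×m→ Z → Z/m → 0. A brute-force sketch (our own example) confirming the classical fact Tor_1(Z/m, Z/n) ≅ Z/gcd(m, n):

```python
# Tor over Z from the free resolution 0 → Z --×m--> Z → Z/m → 0.
# Tensoring with Z/n gives the complex  Z/n --×m--> Z/n, so
# Tor_0(Z/m, Z/n) = coker(×m) and Tor_1(Z/m, Z/n) = ker(×m),
# both of order gcd(m, n).
from math import gcd

def tor_orders(m, n):
    mult = lambda x: (m * x) % n               # the map ×m on Z/n
    ker = [x for x in range(n) if mult(x) == 0]
    image = {mult(x) for x in range(n)}
    return len(ker), n // len(image)           # (|Tor_1|, |Tor_0|)

for m, n in [(4, 6), (2, 3), (12, 18)]:
    assert tor_orders(m, n) == (gcd(m, n), gcd(m, n))
```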
We now generalize to arbitrary abelian categories.
Definition 3.1.52.
An abelian category is said to have enough projectives if every object has a projective resolution (respectively, enough injectives and enough flats).

Let A be an abelian category with enough projectives and F : A → B a right exact functor. Then for any projective resolution of an object M, we can repeat the operation above to define the derived functors of F. To be more specific, let P• be a projective resolution of M.

Definition 3.1.53.
The functors

L_iF(M) = ker(FP_i → FP_{i−1}) / Im(FP_{i+1} → FP_i)

are called the left derived functors of F. Dually, if G is left exact and I• is an injective resolution, we define R^iG as the right derived functors of G.

One may ask why we do not consider the left derived functors of a left exact functor. The answer is that these are all zero, or at least uninteresting: they tell you nothing about exactness, as only zeros appear in the resulting sequences.

Proposition 3.1.54.
If F is exact then R^iF and L_iF are 0 for all i > 0.

Proof. As F is exact, the resulting long sequences are exact. Hence, the quotient groups are 0 and R^iF (resp. L_iF) is 0.

Remark 3.1.55.
The derived functors measure the extent to which M is not projective, injective, or flat. More generally, they measure how far F is from being exact. If R^iF is non-zero only for very large i, then F is very close to being exact, whereas if R^1F is non-zero, then F is nowhere close to being exact.

The final theorem we present in this section is the most useful for computing these functors.

Theorem 3.1.56.
Let 0 → A → B → C → 0 be exact in A and F : A → B be a right exact functor. Then there is a long exact sequence

... → L_iF(A) → L_iF(B) → L_iF(C) → L_{i−1}F(A) → ...

in the derived functors. The same is true for left exact functors.

The proof of this is immediate from the Snake Lemma 3.1.39. The reason it is so important is because if we know that either A, B, or C is F-acyclic (that is, L_iF vanishes on it for i > 0) then we get isomorphisms of the remaining groups! This single fact underlies most of homological algebra and will be integral in section 3.3.2. This completes this brisk tour of category theory.

3.2 Topology
We shall depart from category theory for the time being and return to it in section 3.2.2. For the meantime, we shall introduce the second major topic of this chapter: topological spaces. The purpose of these objects is to formalize the somewhat colloquial notions of connectedness, compactness, and other concepts. The culmination of all of this will be to define and give some basic properties of singular homology groups for a topological space. This concept will prove contentious in chapter 4, as some researchers have recently proposed using homology to discover geometric properties of the perceptual space.
The story of topology starts with the definition of a topological space. Before we give this, though, we want to motivate the study of such objects by looking at the familiar case of R^n, and in particular R. In high-school algebra, we call sets of the form (a, b) open and [a, b] closed. Similarly, sets of the form B_r(p) = {x ∈ R^n : |x − p| < r} are open in R^n, and if we change < to ≤, we get closed sets. In fact, we can have arbitrary open sets in R^n, but all of them are built out of sets of the form above. We want to generalize all of this and formalize what we mean by open and closed. Some good references for this section are [Lee11], [Mun00], and [FF16]. The last of these is a fairly recent and thorough treatment of the material in Section 3.2.2.

Definition 3.2.1.
Let X be a set. A topology on X is a collection of subsets T ⊆ P(X), the power set, subject to the following conditions:

(a) ∅, X ∈ T.
(b) T is closed under arbitrary unions. That is, if {U_i}_{i ∈ I} is a collection of elements of T with |I| arbitrary, then ⋃_{i ∈ I} U_i ∈ T.
(c) T is closed under finite intersections. That is, if {U_i}_{i ∈ I} is a collection of elements of T with |I| < ∞, then ⋂_{i ∈ I} U_i ∈ T.

Elements of the topology are called open sets. A subset V ⊆ X is called closed if X − V ∈ T. A set equipped with a topology is called a topological space.

Notice that open and closed are not mutually exclusive: X is always both closed and open (sometimes abbreviated to clopen), and some sets, such as [0, 1) in R, are neither closed nor open. Further, simply because a set is not open does not imply closure.

Example 3.2.2.
For any set $X$, we can give it the discrete topology, in which every subset is declared open. Dually, we can define the trivial topology, in which only $\emptyset$ and $X$ are open.

For any subset $A \subseteq (X, \mathcal{T})$, we can topologize $A$ by taking the open sets to be $A \cap \mathcal{T} := \{A \cap U : U \in \mathcal{T}\}$. This is called the subspace topology.

The topology generated by the open balls in $\mathbb{R}^n$ above is called the standard topology on $\mathbb{R}^n$.

We want to formalize the final example above. That is, we want to answer the question: what does it mean to generate a topology? Similar to a basis for a vector space, we want to define an analogous object for a topology.
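Since the axioms of Definition 3.2.1 are finitary on a finite set (closure under pairwise unions and intersections suffices there), they can be checked by brute force. A minimal illustrative sketch in Python; the function names are our own, not from any library:

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of X, as frozensets (the discrete topology)."""
    return [frozenset(s) for s in
            chain.from_iterable(combinations(sorted(X), r) for r in range(len(X) + 1))]

def is_topology(X, T):
    """Check Definition 3.2.1 on a finite space: contains the empty set and X,
    and is closed under (pairwise, hence all finite) unions and intersections."""
    T = {frozenset(U) for U in T}
    if frozenset() not in T or frozenset(X) not in T:
        return False
    return all(U | V in T and U & V in T for U in T for V in T)

X = {1, 2, 3}
print(is_topology(X, powerset(X)))           # discrete topology: True
print(is_topology(X, [set(), X]))            # trivial topology: True
print(is_topology(X, [set(), X, {1}, {2}]))  # missing {1} ∪ {2}: False
```

On a finite set, closure under pairwise unions already implies closure under arbitrary unions, so the check above is complete; for infinite spaces no such brute-force test exists.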
Definition 3.2.3.
Let $X$ be a topological space with topology $\mathcal{T}$. Then a collection $\mathcal{B}$ of subsets of $X$ is called a basis for the topology $\mathcal{T}$ if the following conditions are satisfied:

(a) Every $B \in \mathcal{B}$ is open in $X$.

(b) Every open set $U \in \mathcal{T}$ can be written as a union of some collection of elements of $\mathcal{B}$.

It should now be clear that the standard topology on $\mathbb{R}^n$ is the topology with basis consisting of the open balls. Now that we have this definition, we want to understand when it is applicable. Further, what conditions on a collection of subsets of a set make it a basis for some topology? The following proposition answers this in full.

Proposition 3.2.4.
Let $X$ be a set and $\mathcal{B}$ a collection of subsets. Then $\mathcal{B}$ is a basis of a topology on $X$ if and only if the following conditions are satisfied:

(a) $\bigcup_{B \in \mathcal{B}} B = X$.

(b) For every $B_1, B_2 \in \mathcal{B}$ with $B_1 \cap B_2 \neq \emptyset$ and every $x \in B_1 \cap B_2$, there exists $B_3 \in \mathcal{B}$ such that $x \in B_3 \subseteq B_1 \cap B_2$.

In fact, this topology is the unique topology generated by $\mathcal{B}$.

Proof.
Suppose $\mathcal{B}$ is a basis. Then (a) is satisfied immediately, as every open set is a union of basis elements and $X$ is open in any topology. For (b), as $B_1$ and $B_2$ are open, $B_1 \cap B_2$ is open. Therefore we can write $B_1 \cap B_2 = \bigcup_i B_i$, where the $B_i \in \mathcal{B}$ are basis elements. Given $x \in B_1 \cap B_2$, pick any $B_i$ containing $x$ to satisfy (b).

For the reverse direction, we need to show that the conditions above imply that $\mathcal{T}_\mathcal{B}$, the collection of all unions of elements of $\mathcal{B}$, is indeed a topology on $X$. By (a), $X, \emptyset \in \mathcal{T}_\mathcal{B}$. Let $\{U_i\}$ be an arbitrary collection of open sets. Then each $U_i = \bigcup_{j \in J_i} B_{ij}$ with each $B_{ij} \in \mathcal{B}$, so
$$\bigcup_i U_i = \bigcup_{I} \bigcup_{J_i} B_{ij}.$$
So $\mathcal{T}_\mathcal{B}$ is closed under arbitrary unions. To show it is closed under finite intersections, let $U_1, U_2 \in \mathcal{T}_\mathcal{B}$. For every $x \in U_1 \cap U_2$, there exist $B_1 \subseteq U_1$ and $B_2 \subseteq U_2$ with $x \in B_1 \cap B_2$. By condition (b), there exists some $B_3$ such that $x \in B_3 \subseteq B_1 \cap B_2 \subseteq U_1 \cap U_2$. Then $U_1 \cap U_2$ is the union of these basis elements as $x$ varies and hence is open. Therefore $\mathcal{T}_\mathcal{B}$ is closed under pairwise intersections and, by induction, all finite intersections. Hence $\mathcal{T}_\mathcal{B}$ is a topology on $X$. Uniqueness follows immediately from the definition of a basis. This completes the proof.

This proposition says that it suffices to define a topology by giving a basis. In Section 3.3, we will use this to topologize manifolds in a unique way so that they are sufficiently nice.

We need to step back a bit and think about how we topologize $\mathbb{R}^n$. We have given a basis for some topology on $\mathbb{R}^n$ above. What if we want to build a topology on $\mathbb{R}^n$ out of the topologies on $\mathbb{R}$? To answer this, we generalize to the notion of the product topology.

Definition 3.2.5.
Let $\{X_\alpha\}_{\alpha \in J}$ be a $J$-indexed family of topological spaces. As a basis for a topology on the product space $\prod_J X_\alpha$, we take the sets of the form $\prod U_\beta$, where each $U_\beta$ is open in $X_\beta$ and $U_\beta = X_\beta$ for all but finitely many $\beta \in J$. This topology is called the product topology.

There is a naive topology on the product which removes the final condition that $U_\beta = X_\beta$ for all but finitely many $\beta$. This is called the box topology. In the case of $J$ finite, the two are equivalent. The box topology is generally less useful than the product topology, as it is too fine; that is, too many sets are open. For this reason, whenever we have a product space, we assume it has the product topology.

In $\mathbb{R}^n$, it is relatively easy to distinguish whether or not a point lies within a given set. For a general topological space, this is daunting, as the topology may be particularly bad. We need to generalize to arbitrary spaces so that we can speak of boundaries of sets. To be more formal, let $X \subseteq Y$ be topological spaces. We say that $x \in \mathrm{Int}(X)$, the interior of $X$, if there exists an open set $U \subseteq X$ (open in $Y$) such that $x \in U$. The boundary of $X$, denoted $\partial X$, is the collection of points $y$ such that for every open set $P$ containing $y$, both $P \cap X$ and $P \cap (Y - X)$ are non-empty. We define the closure of $X$ to be
$$\overline{X} = \mathrm{Int}(X) \cup \partial X.$$
It should be noted that this only makes sense for topological subspaces. More generally it makes sense in the context of embeddings (see Example 3.2.8 below).
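On a finite topological space, the interior, boundary (in the usual sense: every open neighborhood meets both the set and its complement), and closure can be computed directly from the list of open sets. A hedged sketch; the names are ours, and the ambient space plays the role of $Y$:

```python
def interior(T, A):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    pts = set()
    for U in T:
        if frozenset(U) <= A:
            pts |= set(U)
    return frozenset(pts)

def boundary(Y, T, A):
    """Points y such that every open set containing y meets both A and Y - A."""
    Y, A = frozenset(Y), frozenset(A)
    return frozenset(y for y in Y
                     if all(U & A and U & (Y - A)
                            for U in map(frozenset, T) if y in U))

def closure(Y, T, A):
    return interior(T, A) | boundary(Y, T, A)

Y = {1, 2, 3}
T = [set(), Y, {1}, {1, 2}]      # a (non-Hausdorff) topology on Y
print(interior(T, {1}))          # frozenset({1})
print(boundary(Y, T, {1}))       # frozenset({2, 3})
print(closure(Y, T, {1}))        # frozenset({1, 2, 3})
```

Here the closed sets are $\emptyset$, $\{3\}$, $\{2,3\}$, and $Y$, so the smallest closed set containing $1$ is all of $Y$; the computed closure agrees with this.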
Proposition 3.2.6.
Let X be a topological space and A a subspace. Then
$\mathrm{Int}(A)$ is open, $\partial A$ is closed, and $\overline{A}$ is closed.

Proof. For each point $x \in \mathrm{Int}(A)$, let $U_x \subseteq \mathrm{Int}(A)$ be an open set containing $x$ (such a set exists by the definition of the interior). Then $\mathrm{Int}(A)$ is the union of these $U_x$ and is thus open. Now consider $X - \partial A$; we wish to show that this is open. From the definitions,
$$X - \partial A = \mathrm{Int}(A) \cup (X - \overline{A}).$$
Therefore, it suffices to show that $X - \overline{A}$ is open. Let $p \in X - \overline{A}$. As $p \notin \overline{A}$, there exists some open $V \subseteq X$ such that $V \cap A = \emptyset$ and $p \in V$. As $X - \overline{A}$ is a union of these open sets, it is open. Finally, $\overline{A} = \mathrm{Int}(A) \cup \partial A$ is closed since its complement $X - \overline{A}$ is open. This completes the proof.

It should be clear now that a set $A$ is open (resp. closed) if and only if $A = \mathrm{Int}(A)$ (resp. $A = \overline{A}$).

Now that we have the notions of topologies and bases, we can give a general definition of continuity.

Definition 3.2.7.
Let $f : X \to Y$ be a function between topological spaces. We call $f$ continuous if for all open $V \subseteq Y$, $f^{-1}(V)$ is open in $X$. We call $f$ an open map if for all open $U \subseteq X$, $f(U)$ is open in $Y$.

Together with continuous maps, topological spaces form a category denoted $\mathbf{Top}$. If we add the additional stipulation that every space be given a distinguished point, then we can define the category $\mathbf{Top}_*$ of pointed topological spaces and base-point-preserving maps.

The following examples of continuous maps are all fun exercises for the reader. They are incredibly important for later parts of this chapter.

Example 3.2.8.

(a) Let $(X, x_0)$ and $(Y, y_0)$ be pointed topological spaces. Then the constant map $x \mapsto y_0$ is continuous.

(b) Let $f : X \to Y$ be a continuous map. Then for any subspace $A \subseteq X$, the restriction map $f|_A : A \to Y$ is also continuous.

(c) Let $f : X \to Y$ be a continuous map, and denote the image by $f(X)$. Then for any subspace $Z \subseteq Y$ with $f(X) \subseteq Z$, the induced map $f_Z : X \to Z$ is continuous.

(d) The composition of continuous maps is continuous.

(e) Any inclusion map is continuous. That is, if $A \subseteq X$, then there is a map $A \hookrightarrow X$, and this map is continuous. In general, an injective continuous map is called a topological embedding if it is a homeomorphism onto its image.

Notice that the definition of continuity pays no mind to closed subsets. Could we possibly get a different definition if we replace open with closed? The following lemma gives a negative answer.

Lemma 3.2.9.
A function $f : X \to Y$ is continuous if and only if for all closed subsets $V \subseteq Y$, $f^{-1}(V)$ is closed in $X$.

Proof. ($\Leftarrow$) Let $B$ be a closed set in $Y$ and $C$ its complement, which is open by definition. We want to show that $f^{-1}(C)$ is open in $X$. Consider
$$f^{-1}(B) = f^{-1}(Y) - f^{-1}(C) = X - f^{-1}(C).$$
As $f^{-1}(B)$ is closed in $X$, we conclude that $f^{-1}(C)$ is open.

($\Rightarrow$) Let $B$ be closed in $Y$. We need to show that $f^{-1}(B)$ is closed, i.e., that $\overline{f^{-1}(B)} = f^{-1}(B)$. Let $x \in \overline{f^{-1}(B)}$. Then
$$f(x) \in f\left(\overline{f^{-1}(B)}\right) \subseteq \overline{B} = B,$$
where the inclusion follows from continuity. Therefore $x \in f^{-1}(B)$, and $\overline{f^{-1}(B)} \subseteq f^{-1}(B)$. Hence $f^{-1}(B)$ is closed.

Therefore, defining continuity in terms of closed sets is equivalent to defining it in terms of open sets.

We now want to define quotient objects in $\mathbf{Top}$. Let $A$ be a subspace of a topological space $X$. Define an equivalence relation on $X$ by $x \sim y$ if $x, y \in A$ (and $x \sim x$ for all $x$). Then we have the quotient space $X/{\sim}$, which is also written $X/A$. We want to topologize $X/A$ in a way which makes the canonical map $X \to X/A$ continuous.

Definition 3.2.10.
The quotient topology is defined as the finest topology for which the canonical morphism $\pi : X \to X/A$ is continuous. Equivalently, $P \subseteq X/A$ is open if and only if $\pi^{-1}(P)$ is open in $X$.

Remark 3.2.11. This will allow us to give a topological structure to the Generalized Categories from Chapter 1 and give a coarse categorization of the perceptual space.

The quotient topology can be particularly opaque, as it depends entirely on $X$ and $A$. To give some idea of how it can manifest, let's give some examples of quotient spaces.

Example 3.2.12.

(a) Let $S^1 := \{x \in \mathbb{C} : |x| = 1\}$ and consider the subspace $\{-1, 1\}$. It turns out that $S^1/\{-1, 1\}$ is equivalent to two circles which touch at a single point. The topology of this space is inherited from its embedding into $\mathbb{C}$, so the quotient topology in this case is easy to see.

(b) Consider $\mathbb{Z} \hookrightarrow \mathbb{R}$. Then $\mathbb{R}/\mathbb{Z}$ (here the quotient by the equivalence relation $x \sim y$ iff $x - y \in \mathbb{Z}$, not the collapse of the subspace $\mathbb{Z}$ to a point) is equivalent to the interval $[0,1]$ with the identification $0 \sim 1$. Hence, the quotient space is $S^1$.

What do we mean here by "equivalent"? We claimed above that $\mathbf{Top}$ is a category, and thus equivalent should mean isomorphic. What are the isomorphisms in this category?
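Before answering, note that the quotient topology of Definition 3.2.10 can be enumerated by brute force on a small finite example: collapsing a subspace $A$ to a point $*$, a subset of $X/A$ is open exactly when its preimage is. An illustrative sketch (our own encoding):

```python
from itertools import chain, combinations

def quotient_topology(X, T, A):
    """Open sets of X/A: those P whose preimage under the collapse map is open in X."""
    A = frozenset(A)
    opens = {frozenset(U) for U in T}
    pi = lambda x: '*' if x in A else x          # canonical projection X -> X/A
    Q = {pi(x) for x in X}
    candidates = chain.from_iterable(combinations(sorted(Q, key=str), r)
                                     for r in range(len(Q) + 1))
    return {frozenset(P) for P in candidates
            if frozenset(x for x in X if pi(x) in P) in opens}

X = {1, 2, 3}
T = [set(), X, {1}, {2}, {1, 2}]
print(quotient_topology(X, T, {1, 2}))
# three opens: {}, {'*'}, {'*', 3} — the Sierpiński space
```

Here $\{3\}$ fails to be open in the quotient because its preimage $\{3\}$ is not open in $X$, illustrating how the quotient topology can be strictly coarser than one might first guess.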
Definition 3.2.13.
Let $f : X \to Y$ be a continuous map. We call $f$ a homeomorphism if there exists a continuous $g : Y \to X$ such that $g \circ f = \mathrm{id}_X$ and $f \circ g = \mathrm{id}_Y$. Notice that every homeomorphism is necessarily a bijection.

"Spaces up to homeomorphism" is an equivalence relation; that is, we can speak of isomorphism classes of topological spaces. This is a large area of research for, say, curves and surfaces. Before we move on to other general topological properties, we shall give some generic properties of homeomorphisms.

Theorem 3.2.14. Let $f : X \to Y$ be a bijective function between topological spaces. Then $f$ is a homeomorphism if and only if $f(\mathcal{T}_X) = \mathcal{T}_Y$. Further, if $f$ is a homeomorphism, then $f$ is an open map.

Proof. Notice that the second statement follows immediately from the first.

($\Rightarrow$) Let $U \in \mathcal{T}_X$. Then $f(U) = (f^{-1})^{-1}(U)$. As $f$ is a homeomorphism, $f^{-1}$ is continuous, and so $f(U)$ is open in $Y$; thus $f(U) \in \mathcal{T}_Y$. Therefore we have an injection $f(\mathcal{T}_X) \hookrightarrow \mathcal{T}_Y$. This map is surjective as $f$ is continuous: for $V \in \mathcal{T}_Y$, $V = f(f^{-1}(V))$ with $f^{-1}(V) \in \mathcal{T}_X$. Thus $f(\mathcal{T}_X) = \mathcal{T}_Y$.

($\Leftarrow$) Assume now that $f(\mathcal{T}_X) = \mathcal{T}_Y$. Then $f$ is continuous: for any $V \in \mathcal{T}_Y$ we have $V = f(U)$ for some $U \in \mathcal{T}_X$, and by bijectivity $f^{-1}(V) = U \in \mathcal{T}_X$. Similarly, $f^{-1}$ is continuous. This completes the proof.

Example 3.2.15.
Some classic examples of homeomorphisms are translations and dilations of $\mathbb{R}^n$. These are maps of the form $f(x) = x + \lambda$ and $f(x) = cx$ for some $\lambda \in \mathbb{R}^n$ and nonzero $c \in \mathbb{R}$. More importantly, let $V, W$ be finite-dimensional vector spaces. Then any linear map $V \to W$ is necessarily continuous. In fact, as we will see in the next section, these maps are smooth!

Example 3.2.16.
We end this subsection with an interesting example of topological spaces.Let G be a group. Then we call G a topological group if multiplication and inversion arecontinuous maps. A morphism of topological groups is a continuous group homomor-phism. Connectedness, Compactness, and Hausdorff
Now we give some characterizations of certain topological spaces. These properties are important for many mathematical applications and will be intrinsically important for the next section and Chapter 4. We shall define them all in one pass and then go into some detail about their relationships to each other.
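For finite spaces, the properties defined below admit brute-force tests; every finite space is automatically compact (there are only finitely many open sets), so we sketch only connectedness and the Hausdorff condition. A minimal, purely illustrative sketch:

```python
from itertools import combinations

def is_connected(X, T):
    """No two disjoint non-empty open sets cover X."""
    opens = [frozenset(U) for U in T if U]
    return not any(not (U & V) and (U | V) == frozenset(X)
                   for U, V in combinations(opens, 2))

def is_hausdorff(X, T):
    """Every pair of distinct points is separated by disjoint open sets."""
    opens = [frozenset(U) for U in T]
    return all(any(x in U and y in V and not (U & V)
                   for U in opens for V in opens)
               for x, y in combinations(sorted(X), 2))

X = {1, 2, 3}
trivial = [set(), X]
discrete = [set(c) for r in range(4) for c in combinations(sorted(X), r)]

print(is_connected(X, trivial), is_hausdorff(X, trivial))    # True False
print(is_connected(X, discrete), is_hausdorff(X, discrete))  # False True
```

The output matches the remarks below: the trivial topology is connected but not Hausdorff, while the discrete topology on a set with more than one point is Hausdorff but disconnected.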
Definition 3.2.17.
Let $X$ be a topological space.

(a) $X$ is connected if there do not exist non-empty open sets $U_1, U_2$ such that $U_1 \cap U_2 = \emptyset$ and $U_1 \cup U_2 = X$.

(b) $X$ is compact if every open cover $\mathcal{U}$ of $X$ has a finite subcover. An open cover of a topological space is a collection of open sets $\mathcal{U} = \{U_i\}$ such that $X \subseteq \bigcup U_i$.

(c) $X$ is Hausdorff if for any two distinct points $x, y \in X$, there exist open sets $U_x, U_y \subseteq X$ such that $x \in U_x$, $y \in U_y$, and $U_x \cap U_y = \emptyset$. (Spaces in which, for any two distinct points, some open set contains one but not the other are sometimes called Kolmogorov spaces.)

For instance, every space is connected (resp. compact) if equipped with the trivial topology, and every space with at least two points is disconnected (and, if infinite, non-compact) if equipped with the discrete topology. In general, a space need not be connected, but it can be broken up into connected components. This partitions the set into distinct subsets, which can be of great use. There is another notion of connectedness which is slightly stronger.

Definition 3.2.18.
A topological space $X$ is path-connected if for each pair of points $a, b \in X$ there exists a continuous path $\gamma : [0,1] \to X$ such that $\gamma(0) = a$ and $\gamma(1) = b$.

Proposition 3.2.19. If $X$ is path-connected, then $X$ is connected.

Proof. Assume for the sake of contradiction that $X$ is disconnected, say $X = U \cup V$ with $U, V$ non-empty, open, and $U \cap V = \emptyset$. Let $a \in U$, $b \in V$, and let $\gamma$ be a path between them. Then $[0,1] = \gamma^{-1}(X) = \gamma^{-1}(U) \cup \gamma^{-1}(V)$, a disjoint union of non-empty open sets. This implies that $[0,1]$ is disconnected, which is a contradiction. Hence $X$ is connected.

This proposition proves our earlier assertion that path-connectedness is a stronger condition than connectedness. In fact, there are some highly non-trivial examples where the converse fails.

Example 3.2.20.
Let $X$ be the union of the line segments in $\mathbb{R}^2$ connecting the origin to the points $(1, \frac{1}{n})$ for $n \geq 1$, together with the point $(1, 0)$ (note this does not include the open segment from $(0,0)$ to $(1,0)$). Then $X$ is connected but not path-connected. See Figure 3.1.

Similar to connected components, we can define path-connected components. For a topological space $X$, we denote the set of path-connected components by $\pi_0(X)$.

We want to understand how each of these notions interacts with (1) the others and (2) continuous maps. Let us investigate (2) first.

Theorem 3.2.21.
Let $f : X \to Y$ be a continuous function. If $X$ is connected (resp. compact), then so is $f(X)$.

Proof. Let $X$ be connected, and assume for the sake of contradiction that $f(X)$ is disconnected, say $f(X) = A \cup B$ with $A, B$ disjoint, non-empty, and open in $f(X)$. Then $f^{-1}(A)$ and $f^{-1}(B)$ are open in $X$, disjoint, non-empty, and $f^{-1}(A) \cup f^{-1}(B) = X$. This contradicts the connectedness of $X$. Hence $f(X)$ is connected.

Now assume $X$ is compact, and let $\mathcal{V}$ be an open cover of $f(X)$. Then
$$X = \bigcup_{V_i \in \mathcal{V}} f^{-1}(V_i)$$
is an open cover of $X$. As $X$ is compact, there exist finitely many $V_i$ such that $X = \bigcup_1^n f^{-1}(V_i)$. Therefore $V_1, \ldots, V_n$ is a finite subcover of $\mathcal{V}$. Hence $f(X)$ is compact. This completes the proof.

Figure 3.1: The Witches' Broom, an example of a connected but not path-connected topological space. It is the union of the line segments $[(0,0), (1, \frac{1}{n})]$ together with the point $(1,0)$.

This theorem is highly important to any field of mathematics that concerns itself with topologies of any kind. As it turns out, many theorems only work for compact spaces, so knowing that compactness is preserved under continuous maps is crucial. Let's understand compact sets a bit better.

Proposition 3.2.22.
Let $X$ be a compact space.

(a) If $A \subseteq X$ is closed, then $A$ is compact.

(b) If $X \subseteq Y$ with $Y$ a Hausdorff space, then $X$ is closed in $Y$.

Proof. (a) Let $\mathcal{A}$ be an open cover of $A$. As $A$ is closed, $A^c = X - A$ is open in $X$, and $\mathcal{A} \cup \{A^c\}$ is an open cover of $X$. As $X$ is compact, there exists a finite subcover. If this subcover contains $A^c$, discard it; what remains is a finite cover of $A$. This proves (a).

(b) Let $y \in X^c$. We want to construct an open set $V$ containing $y$ such that $V \cap X = \emptyset$. As $Y$ is Hausdorff, for every $x \in X$ there exist disjoint open sets $U_x$ and $\widetilde{U}_x$ such that $x \in U_x$ and $y \in \widetilde{U}_x$. Then $\bigcup_{x \in X} U_x$ is an open cover of $X$. By compactness, there is a finite collection of points $\{x_i\}$ such that $X \subseteq \bigcup_i U_{x_i}$. Put
$$V = \bigcap_i \widetilde{U}_{x_i}.$$
Then $V$ is open, contains $y$, and $V \cap X = \emptyset$ by construction. Hence $X^c$ is open and thus $X$ is closed.

In a similar theme to topologies, we would like to know how connectedness, compactness, and Hausdorff-ness interact with products.

Proposition 3.2.23.
Let $\{X_i\}$ be a family of connected (resp. Hausdorff) spaces. Then $\prod X_i$ is connected (resp. Hausdorff).

We leave the proof of this proposition as an exercise to the reader, as it follows almost entirely from the definitions. For compactness, there are two results, and both are surprising.
Theorem 3.2.24 (Heine-Borel). A subset of $\mathbb{R}^n$ is compact if and only if it is closed and bounded.

Theorem 3.2.25 (Tychonoff). Let $\{X_i\}$ be an arbitrary collection of compact spaces. Then $\prod X_i$ is compact.

Although we shall not prove it, it is interesting to know that Tychonoff's theorem is equivalent to the axiom of choice; its standard proofs rely on Zorn's Lemma. It is arguably the most important theorem in all of point-set topology. For proofs of both theorems, see [Mun00].
Metric Spaces
We now give a brief introduction to metric spaces, which will allow us to formally discuss "perceptual metrics" in Chapter 4.
Definition 3.2.26.
Let $X$ be a set. A metric on $X$ is a function $d : X \times X \to \mathbb{R}_{\geq 0} \cup \{\infty\}$ such that:

(a) For all $x, y \in X$, $d(x, y) = d(y, x)$.

(b) For all $x, y \in X$, $d(x, y) = 0 \iff x = y$.

(c) For all $x, y, z \in X$, $d(x, z) \leq d(x, y) + d(y, z)$.

The final condition is called the triangle inequality and is the defining characteristic of metrics. The pair $(X, d)$ is called a metric space. A function $f : (X, d) \to (Y, g)$ between metric spaces is called a metric map if
$$g(f(x), f(y)) \leq d(x, y).$$
If equality holds for all $x, y$, then $f$ is called an isometry. The collection of all metric spaces and all metric maps forms a category denoted $\mathbf{Met}$.

Theorem 3.2.27.
Let ( X , d ) be a metric space. Then d induces a topology on X (called the metrictopology ). This gives a faithful functor Met (cid:44) → Top
The image is the category of metrizable spaces (those which are homeomorphic to metric spaces).

Proof. Let $x \in X$ and put
$$B_r(x) := \{y \in X : d(x, y) < r\}.$$
Let $\mathcal{B}$ be the collection of all such balls over all points $x \in X$ and radii $r > 0$. We claim that $\mathcal{B}$ is a basis; it suffices to check the conditions of Proposition 3.2.4. Clearly $X = \bigcup_{B \in \mathcal{B}} B$. Let $B_r(x)$ and $B_{r'}(x')$ be two elements of $\mathcal{B}$ such that $B_r(x) \cap B_{r'}(x') \neq \emptyset$. By the triangle inequality, for any $y \in B_r(x) \cap B_{r'}(x')$ we can find $\delta_1 < r$ and $\delta_2 < r'$ such that $B_{\delta_1}(y) \subseteq B_r(x)$ and $B_{\delta_2}(y) \subseteq B_{r'}(x')$. Pick $\delta = \min\{\delta_1, \delta_2\}$. Then $B_\delta(y)$ is contained in the intersection. Hence $\mathcal{B}$ is a basis for a topology on $X$.

The functor $\mathbf{Met} \to \mathbf{Top}$ is precisely the forgetful functor which sends $(X, d, \mathcal{T})$ to $(X, \mathcal{T})$.

Now consider a sequence of points $\{x_n\}$ in a metric space. We say that $\{x_n\}$ converges to a point $x$ if $d(x, x_n) \to 0$ as $n \to \infty$. A Cauchy sequence is a sequence $\{x_n\}$ such that for every $\epsilon > 0$ there exists $n^*$ such that for all $m, n > n^*$, $d(x_m, x_n) < \epsilon$.

Definition 3.2.28.
We call a metric space complete if every Cauchy sequence converges.
Theorem 3.2.29.
Let $(X, d)$ be a metric space. Then there exists a complete metric space $(\widehat{X}, \widehat{d})$ and a map $X \to \widehat{X}$ which is an isometry onto a dense subset of $\widehat{X}$. See [Kna05c] for a full proof of this statement.

With this theorem in mind, we want to give definitions and examples of some complete metric spaces and how they arise.
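The completion theorem can be illustrated concretely: $\mathbb{Q}$ with the usual metric is not complete, and its completion is $\mathbb{R}$. A sketch using Newton's iteration to produce a Cauchy sequence of rationals whose limit, $\sqrt{2}$, lies outside $\mathbb{Q}$:

```python
from fractions import Fraction

# Newton iteration for x^2 = 2 stays inside Q at every step ...
x, seq = Fraction(2), []
for _ in range(6):
    seq.append(x)
    x = (x + 2 / x) / 2

# ... and is Cauchy: successive distances d(x_n, x_{n+1}) shrink rapidly ...
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]
print(all(gaps[i + 1] < gaps[i] for i in range(len(gaps) - 1)))  # True

# ... yet its limit sqrt(2) is irrational, so the sequence has no limit
# in (Q, |.|); the completion R supplies one.
print(abs(float(seq[-1]) ** 2 - 2) < 1e-12)  # True
```

Every term here is an exact rational (a `Fraction`), so the sequence genuinely lives in $\mathbb{Q}$; the float conversion at the end is only for display.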
Definition/Example 3.2.30.
Let $V$ be a $\mathbb{C}$-vector space. A norm on $V$ is a map $\|\cdot\| : V \to \mathbb{R}_{\geq 0}$ such that:

(a) $\|x\| \geq 0$, with $\|x\| = 0$ if and only if $x = 0$.

(b) $\|ax\| = |a| \cdot \|x\|$ for all $a \in \mathbb{C}$.

(c) $\|x + y\| \leq \|x\| + \|y\|$ for all $x, y \in V$.

Clearly, a norm induces a metric $d(x, y) = \|x - y\|$ on $V$. We call $V$ a Banach space if $(V, d)$ is complete.

Similarly, we can define a hermitian inner product on $V$ as a positive-definite, sesquilinear (one-and-a-half linear) map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{C}$ such that $\langle x, y \rangle = \overline{\langle y, x \rangle}$, where the bar denotes complex conjugation. Setting $\|x\| = \sqrt{\langle x, x \rangle}$ defines a norm, and hence a metric, on $V$. If $(V, d)$ is complete with respect to this metric, then $V$ is a Hilbert space. These are some of the most important spaces for harmonic analysis and representation theory. We shall use Banach spaces and tensor products to understand manifolds better in Section 3.3. (Given a topological space $X$ and a subspace $A$, we call $A$ dense in $X$ if $\overline{A} = X$.)

3.2.2 Basic Algebraic Topology

In this section, we shall introduce a different approach to topology which considers a weaker form of equivalence but focuses on algebraic invariants attached to topological spaces. The main references for this subsection are [Hat01], [Rot88], and [FF16]. We start with the notion of homotopy.

Definition 3.2.31.
Definition 3.2.31.
Let $f, g : X \to Y$ be continuous maps. A homotopy between $f$ and $g$ is a continuous function $H : [0,1] \times X \to Y$ such that $H(0, x) = f(x)$ and $H(1, x) = g(x)$. If such a homotopy exists, we say that $f$ and $g$ are homotopic and write $f \simeq g$. Two spaces are said to be homotopy equivalent if there exist maps $f : X \to Y$ and $g : Y \to X$ such that $g \circ f \simeq \mathrm{id}_X$ and $f \circ g \simeq \mathrm{id}_Y$.

Notice that considering "spaces up to homotopy equivalence" is a weaker condition than "spaces up to homeomorphism": spaces which are homeomorphic are necessarily homotopy equivalent. If we consider pointed topological spaces, then there is a category $\mathbf{Htpy}$ whose morphisms are homotopy classes of maps. In this category, we consider the morphisms with source $S^1$ and a fixed target $X$; more generally, we consider morphisms with source $S^n$.

Definition 3.2.32.
The groups
$$\pi_n(X) := \mathrm{Hom}_{\mathbf{Htpy}}(S^n, X)$$
are called the $n$th homotopy groups. The group law is defined by concatenation in each coordinate. A topological space is called simply connected if $\pi_1(X) = 0$. Further, if $\pi_n(X) = 0$ for all $n \geq 1$, then $X$ is called weakly contractible; for CW complexes this is equivalent (by Whitehead's theorem) to $X$ being contractible, i.e., homotopy equivalent to a point.

For $n \geq 2$, these are abelian groups. They are algebraic invariants of the space $X$: if $X \simeq Y$, then $\pi_n(X) \cong \pi_n(Y)$ for all $n$ [Hat01]. The problem with homotopy groups is that they are very often not computable, and even when they are, the computation is incredibly difficult. For this reason, we want to consider a more tractable algebraic invariant: homology (and cohomology). These, in some sense, count the number of holes of each dimension in a space.

Example 3.2.33. Let $T = S^1 \times S^1$ be the torus.

It is clear that the torus has two loops which cannot be continuously deformed into one another: one goes around the large central hole and the other around the thickness of the torus. Are there any 2-dimensional holes? Before we give the answer, recall that topological tori are hollow, so there is some inner volume enclosed by the torus which stops certain loops from being contractible.

The answer to the above question is yes, and there is only one; there are no higher-dimensional holes. We shall see that a formal way to answer these questions is by computing the homology groups of $T$, which, given the statements above, should be
$$H_n(T, \mathbb{Z}) = \begin{cases} \mathbb{Z} & n = 0, 2 \\ \mathbb{Z}^2 & n = 1 \\ 0 & n \geq 3. \end{cases}$$

Simplicial Complexes
In a way, simplicial complexes are the most basic topological objects for which to define homology and cohomology. As such, we give a brief introduction to them here.
Definition 3.2.34. A $k$-simplex $\Delta^k$ is the convex hull of $k+1$ affinely independent points in $\mathbb{R}^n$. A simplicial complex is a union of copies of simplices glued along faces: the intersection of any two simplices in the complex is either empty or a common face of both.

To define homology one needs the language of chains.

Definition 3.2.35 (Chains). Let $K$ be a simplicial complex and denote by
$$C_n^\Delta(K) = \left\{ \sum_i m_i \Delta_i^n \;\middle|\; m_i \in \mathbb{Z} \right\}$$
the free abelian group generated by the $n$-simplices of $K$; this is called the group of simplicial $n$-chains. If $\Delta_i^n = [v_0, \ldots, v_n]$, then define the boundary map $\partial_n : C_n^\Delta(K) \to C_{n-1}^\Delta(K)$ on generators by
$$\partial_n(\Delta^n) = \sum_{i=0}^{n} (-1)^i [v_0, \ldots, \widehat{v_i}, \ldots, v_n],$$
where $\widehat{v_i}$ means $v_i$ is omitted, and extend linearly, making $\partial_n$ a group homomorphism. This yields, for any given $K$, the sequence
$$\cdots \xrightarrow{\partial_{n+1}} C_n^\Delta \xrightarrow{\partial_n} C_{n-1}^\Delta \xrightarrow{\partial_{n-1}} C_{n-2}^\Delta \xrightarrow{\partial_{n-2}} \cdots$$

Lemma 3.2.36. $\partial_n \circ \partial_{n+1} = 0$.

Remark 3.2.37.
We will drop the superscript $\Delta$ when it is clear that the chain complex is constructed from a simplicial complex.

Proof. We apply the definition twice to a generator of $C_{n+1}$:
$$\partial_n \partial_{n+1}([v_0, \ldots, v_{n+1}]) = \partial_n\!\left( \sum_{i=0}^{n+1} (-1)^i [v_0, \ldots, \widehat{v_i}, \ldots, v_{n+1}] \right)$$
$$= \sum_i \sum_{j < i} (-1)^{i+j} [v_0, \ldots, \widehat{v_j}, \ldots, \widehat{v_i}, \ldots, v_{n+1}] + \sum_i \sum_{j > i} (-1)^{i+j-1} [v_0, \ldots, \widehat{v_i}, \ldots, \widehat{v_j}, \ldots, v_{n+1}].$$
Each simplex with two omitted vertices appears exactly twice, once in each double sum, with opposite signs, so the whole expression vanishes. Hence $\mathrm{Im}\, \partial_{n+1} \subseteq \ker \partial_n$ for all $n$.

Definition 3.2.38.
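Lemma 3.2.36 can be checked numerically: encoding simplices as vertex tuples and chains as dictionaries of integer coefficients, the alternating-sign cancellation in the proof makes $\partial \circ \partial$ vanish identically. A minimal sketch (our own encoding, purely illustrative):

```python
def boundary(c):
    """Apply the simplicial boundary operator to a chain c:
    a dict mapping simplices (vertex tuples) to integer coefficients."""
    out = {}
    for simplex, m in c.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]       # omit vertex v_i
            out[face] = out.get(face, 0) + (-1) ** i * m
    return {s: k for s, k in out.items() if k != 0}    # drop cancelled terms

edge = {(0, 1): 1}
print(boundary(edge))                 # {(1,): 1, (0,): -1}, i.e. the boundary [v1] - [v0]

tetra = {(0, 1, 2, 3): 1}             # a single 3-simplex
print(boundary(boundary(tetra)))      # {} — every face cancels, so boundary of boundary = 0
```

Because faces are produced with alternating signs, each doubly-omitted face is generated twice with opposite signs and cancels, mirroring the double-sum argument in the proof above.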
For all $n$, put $Z_n^\Delta(K) = \ker \partial_n$, the group of cycles, and $B_n^\Delta(K) = \mathrm{Im}\, \partial_{n+1}$, the group of boundaries. Define the $n$-th homology group
$$H_n^\Delta(K) = Z_n^\Delta(K)/B_n^\Delta(K).$$
Let $f$ be a simplicial map (that is, a map determined by its values on vertices, with $f(\sum_i t_i v_i) = \sum_i t_i f(v_i)$ on each simplex) between two complexes $K, L$. Then $f$ induces a map on the chain complexes
$$f_\sharp : C_n^\Delta(K) \to C_n^\Delta(L), \qquad f_\sharp(\Delta^n) = f \circ \Delta^n,$$
and thus a map on the homology groups
$$f_* : H_n^\Delta(K) \to H_n^\Delta(L), \qquad f_*([z]) = [f \circ z].$$
This gives a commutative square: $f_\sharp$ intertwines the boundary maps of $K$ and $L$, i.e., $f_\sharp \circ \partial_n = \partial_n \circ f_\sharp$ as maps $C_n^\Delta(K) \to C_{n-1}^\Delta(L)$.

Lemma 3.2.39.
The boundary map is well defined on homology classes; that is,
$$\partial_n(z + \partial_{n+1} c) = \partial_n(z).$$

Proof. We use the homomorphism property to get
$$\partial_n(z + \partial_{n+1} c) = \partial_n(z) + \partial_n(\partial_{n+1} c) = \partial_n(z),$$
as desired.

Consider the torus from Example 3.2.33. To give the torus a simplicial structure, we want to realize it as a quotient space: $T \cong \mathbb{R}^2/\mathbb{Z}^2$. For this reason, we can view the torus as a square with opposite sides glued together; triangulating the square then yields a simplicial structure on $T$.

In general, a sequence of maps as in Definition 3.2.35 is called a chain complex, and we write
$$C_\bullet^\Delta = \cdots \xrightarrow{\partial_{n+1}} C_n^\Delta \xrightarrow{\partial_n} C_{n-1}^\Delta \xrightarrow{\partial_{n-1}} C_{n-2}^\Delta \xrightarrow{\partial_{n-2}} \cdots$$
Notice that for two simplicial complexes and a simplicial map $f : K \to L$, we get a map of complexes $f_\sharp : C_\bullet^\Delta(K) \to C_\bullet^\Delta(L)$.

Lemma 3.2.40.
Consider two chain complexes $C_\bullet, D_\bullet$ and a chain map $g = (g_n)_{n \geq 0}$. Then there exists a family of induced homomorphisms
$$g_{n,*} : H_n(C_\bullet) \to H_n(D_\bullet).$$

Proof. Let $\xi \in H_n(C_\bullet)$ and let $z \in Z_n(C_\bullet)$ be such that $\xi = [z]$. Consider $g_n(z)$. Then $\partial_n g_n(z) = g_{n-1}(\partial_n z) = g_{n-1}(0) = 0$, so $g_n(z) \in Z_n(D_\bullet)$; that is, $[g_n(z)] = \eta \in H_n(D_\bullet)$. Now let $z' \in C_n$ be such that $z \sim z'$, say $z' = z + \partial_{n+1}(c)$ for some $c \in C_{n+1}$. Then
$$g_n(z') = g_n(z) + g_n \partial_{n+1}(c) = g_n(z) + \partial_{n+1} g_{n+1}(c) = g_n(z) + b, \qquad b \in B_n(D_\bullet),$$
so $g_n(z') \sim g_n(z)$ and $g_{n,*}$ is well defined.

Singular Complexes

Definition 3.2.41.
Define the group of singular $n$-chains as the free abelian group
$$C_n(X) = \mathbb{Z}\big[\{\sigma_n : \Delta^n \to X \text{ continuous}\}\big]$$
on the set of singular $n$-simplices, i.e., all continuous maps (not necessarily embeddings) from the standard $n$-simplex into $X$. Each element of this group can be written as
$$c = m_1 \sigma_n^1 + \cdots + m_k \sigma_n^k, \qquad m_i \in \mathbb{Z}.$$
Further, we have a boundary operator
$$\partial_n : C_n(X) \to C_{n-1}(X), \qquad \partial_n(\sigma_n) = \sum_{i=0}^{n} (-1)^i\, \sigma_n|_{[v_0, \ldots, \widehat{v_i}, \ldots, v_n]},$$
satisfying the same relations as the simplicial boundary maps.

Furthermore, for every pair of maps $f : X \to Y$, $g : Y \to W$, the induced maps on chain complexes satisfy
$$g_{n,\sharp} \circ f_{n,\sharp} = (g \circ f)_\sharp.$$
This makes $H_n : \mathbf{Top} \to \mathbf{Ab}$, $X \mapsto H_n(C_\bullet(X))$, a functor.

Definition 3.2.42 (Chain Homotopy). If $C_\bullet$ and $D_\bullet$ are chain complexes and $f, g$ chain maps, then a chain homotopy $E = (E_n)_{n \geq 0}$ is a collection of homomorphisms $E_n : C_n \to D_{n+1}$ such that
$$\partial_{n+1} E_n + E_{n-1} \partial_n = g_n - f_n.$$

Lemma 3.2.43.
If $f$ and $g$ are chain homotopic, then
$$f_{n,*} = g_{n,*} : H_n(C_\bullet) \to H_n(D_\bullet).$$

Proof. Let $z \in Z_n(C_\bullet)$ and put $\xi = [z] \in H_n(C_\bullet)$. Since $\partial_n z = 0$,
$$g_n(z) = f_n(z) + \partial_{n+1} E_n(z).$$
Hence $g_n(z) \sim f_n(z)$, so $g_* \xi = f_* \xi$.

Theorem 3.2.44 (Homotopy Invariance). If $f \simeq g : X \to Y$, then on singular homology,
$$f_* = g_* : H_n(X) \to H_n(Y).$$

Proof. See [Hat01].

This theorem shows that singular homology is invariant under homotopy, which will become hugely important when classifying topological spaces.

We now want to explore the computability of $H_n(X)$. As we have noted, $\pi_n(X)$ is difficult to compute. It turns out that $H_n(X)$ is relatively easy to compute (in most cases) and is therefore used substantially more by various areas of mathematics as a way of attaching invariants to spaces. The following definition hints at one possible way of computing these groups explicitly.

Definition 3.2.45 (Relative Chain Groups). Let $A \subseteq X$ be a subspace. Define the relative chain groups
$$C_n(X, A) = C_n(X)/C_n(A).$$

Remark 3.2.46.
Notice that chains in $X$ descend to chains relative to $A$: the boundary map $\partial_n : C_n(X) \to C_{n-1}(X)$ carries $C_n(A)$ into $C_{n-1}(A)$, and therefore induces a boundary map $\partial_n : C_n(X, A) \to C_{n-1}(X, A)$ on the quotients which commutes with the projections $q : C_n(X) \to C_n(X, A)$.

Definition 3.2.47 (Relative Homology Groups). In light of the previous remark, let $A \subseteq X$ and let $C_n(X, A)$ be the relative chain groups. We define the relative homology groups
$$H_n(X, A) = Z_n(X, A)/B_n(X, A),$$
where $Z_n(X, A) = \partial_n^{-1}(C_{n-1}(A))/C_n(A)$ and $B_n(X, A) = [B_n(X) + C_n(A)]/C_n(A)$. We can rewrite $H_n(X, A)$ as
$$H_n(X, A) = \frac{\partial_n^{-1}(C_{n-1}(A))}{B_n(X) + C_n(A)}.$$
Now the question stands: how do we send an element of $H_n(X)$ to an element of $H_n(X, A)$? Simply pass it to the quotient; it will still be a cycle.

Conversely, suppose $\eta \in H_n(X, A)$ and $\bar{z} \in Z_n(X, A)$ with $\eta = [\bar{z}]$, represented by $z \in \partial_n^{-1}(C_{n-1}(A))$. Then $\partial_n z \in Z_{n-1}(A)$, since $\partial_{n-1}\partial_n = 0$. This gives us a map
$$\partial : H_n(X, A) \to H_{n-1}(A), \qquad [\bar{z}] \mapsto [\partial_n z],$$
which is well defined: changing the representative to $z + \partial_{n+1} c + d$ with $d \in C_n(A)$ changes $\partial_n z$ by $\partial_n d \in B_{n-1}(A)$. Piecing all of this information together gives the following theorem.

Theorem 3.2.48.
For any subspace $A \subseteq X$, there is a long exact sequence of homology groups
$$\cdots \to H_n(A) \xrightarrow{i_*} H_n(X) \xrightarrow{j_*} H_n(X, A) \xrightarrow{\partial} H_{n-1}(A) \xrightarrow{i_*} H_{n-1}(X) \to \cdots$$

Proof. This follows immediately from the Snake Lemma applied to the short exact sequence of chain complexes $0 \to C_\bullet(A) \to C_\bullet(X) \to C_\bullet(X, A) \to 0$. More generally, for a triple $B \subseteq A \subseteq X$ one has the short exact sequence
$$0 \to C_n(A, B) \to C_n(X, B) \to C_n(X, A) \to 0.$$

Theorem 3.2.49 (Excision Theorem). Let $X$ be a topological space and $A$ a subset. Suppose that $Z \subseteq \overline{Z} \subseteq \mathrm{Int}(A)$. Then
$$H_n(X, A) \cong H_n(X \setminus Z, A \setminus Z).$$

We shall not prove the excision theorem, but instead note its usefulness: it is a key ingredient in making $H_n(X, A)$ computable. If we consider $H_n(X, \{x_0\})$ for some point $x_0 \in X$, then this agrees with $H_n(X)$ in positive degrees, and therefore the homology groups themselves become computable. It can be shown further that $H_n(X)$ is finitely generated if $X$ has finitely many $n$-cells. Furthermore, under some mild conditions, we have an isomorphism
$$H_n(X, A) \cong H_n^\Delta(X, A).$$
This should not be surprising, as giving a simplicial structure to a topological space is precisely dictating the embeddings of simplices of various dimensions.

The final topic we shall introduce in this section is
Cohomology. This is formally dual to the notion of homology. It will seem a bit contrived in this setting, but in some fields (like algebraic geometry and differential geometry) cohomology is the most natural algebraic invariant on a space.

Consider $C_\bullet(X)$, the singular chain complex of $X$, and apply the functor $\mathrm{Hom}(-, G)$ for an abelian group $G$ (we could have chosen a ring $R$ here, considered as an abelian group). We get a new complex
$$C^\bullet(X) := \cdots \to C^{n-1}(X; G) \xrightarrow{d^n} C^n(X; G) \xrightarrow{d^{n+1}} C^{n+1}(X; G) \to \cdots$$
where $C^n(X; G) := \mathrm{Hom}_{\mathbb{Z}}(C_n(X), G)$ and $d^{n+1} := \mathrm{Hom}(\partial_{n+1}, G) = \partial_{n+1}^*$. This is a (co)chain complex since $\mathrm{Hom}(-, G)$ is a functor.

Definition 3.2.50.
The cohomology groups of a topological space $X$ are the groups
$$H^n_{\mathrm{sing}}(X; G) := \ker d^{n+1}/\mathrm{Im}\, d^n.$$
One immediate advantage of cohomology is that there is a canonical ring structure on
$$H^*(X; G) = \bigoplus_{n \in \mathbb{N}} H^n(X; G),$$
called the cup product, defined as follows (for $G$ a ring): let $\varphi \in C^k(X; G)$ and $\psi \in C^l(X; G)$. Then, for a singular simplex $\sigma = [v_0, \ldots, v_{k+l}]$,
$$(\varphi \smile \psi)(\sigma) = \varphi\big(\sigma|_{[v_0, \ldots, v_k]}\big) \cdot \psi\big(\sigma|_{[v_k, \ldots, v_{k+l}]}\big).$$
The cup product induces a map $H^k(X; G) \times H^l(X; G) \to H^{k+l}(X; G)$ which is compatible with the quotients.

The only theorem we will present here is the Universal Coefficient Theorem. Naively, one might assume that $H^n(X; G)$ is simply $\mathrm{Hom}(H_n(X), G)$. This is not necessarily true. What is true is that there exists a short exact sequence relating the two, as the following theorem dictates.

Theorem 3.2.51 (Universal Coefficient Theorem). Let $X$ be a topological space, $C_\bullet(X)$ its singular chain complex, and $C^\bullet(X; G)$ an associated cochain complex. Then the cohomology groups are determined by the split exact sequence
$$0 \to \mathrm{Ext}^1_{\mathbb{Z}}(H_{n-1}(X), G) \to H^n(X; G) \to \mathrm{Hom}(H_n(X), G) \to 0$$
[Rot88]. This ends the section on algebraic topology.

Manifolds

Manifolds pop up in every area of mathematics and play the starring role in the model we develop in Chapter 4. They are generalizations of Euclidean space ($\mathbb{R}^n$) and allow a variety of new geometry to occur. All together they form a category $\mathbf{Man}^\infty$, which gives a concrete example of a suitably bad category whose objects are easy to understand. This section will run through the basic theory of manifolds, vector bundles, and sheaves. We conclude with a discussion of de Rham theory, which ties together the topological information on a manifold. Good references for the first two sections are [Wed16], [Lee12], [Tu11], and [GP74].

$C^\infty$ Manifolds

There are two approaches to smooth manifolds which are commonly used: analytic and algebraic.
We shall focus on the algebraic theory, as it ties in more closely with the later sections. We will not entirely neglect the analytic theory, however, as we need the notion of differentiation, which is purely analytic. We start with the definition of an atlas:
Definition 3.3.1.
Let M be a topological space and α ∈ N ∪ {∞}. A chart at p ∈ M is a pair (ϕ, U) with p ∈ U ⊆ M open and ϕ : U → ϕ(U) ⊆ R^n (for n not depending on p) a homeomorphism. A collection A of charts is called a C^α-atlas on M if for all p, q ∈ M there are charts (ϕ_p, U_p) and (ϕ_q, U_q) which are compatible: the transition map

ϕ_p ∘ ϕ_q^{-1} : ϕ_q(U_p ∩ U_q) → ϕ_p(U_p ∩ U_q)

is a C^α-homeomorphism (each coordinate function is α-times differentiable in each variable). If α = ∞, then we call the chart maps and the transition maps smooth. In this case, the atlas is called smooth. Example 3.3.2.
Let S^n = {x ∈ R^{n+1} : |x| = 1}, where |x| = √(x_1² + ⋯ + x_{n+1}²), be the n-sphere equipped with the subspace topology. To construct a smooth atlas on S^n, we need to give charts. Consider the open subsets U_i^± := {x ∈ S^n : x_i > 0 (resp. < 0)} and the function f : B^n → R given on the open unit ball by f(u) = √(1 − |u|²). Then U_i^+ ∩ S^n is the graph of this function and U_i^− ∩ S^n is the graph of −f. Each x ∈ U_i^+ ∩ S^n can then be written with x_i = f(x_1, ..., x̂_i, ..., x_{n+1}). Define the maps ϕ_i^± : U_i^± → R^n by ϕ_i^±(x_1, ..., x_{n+1}) = (x_1, ..., x̂_i, ..., x_{n+1}). These are seen to be smooth, and they are easily checked to be compatible. Hence, A = {(ϕ_i^±, U_i^±)} is a smooth atlas on S^n. Definition 3.3.3.
Let M be a topological space equipped with a C^α-atlas A. We call M an n-dimensional C^α-manifold if M is Hausdorff and its topology has a countable basis. Remark 3.3.4.
Normally, the requirement of an atlas is stated as M is locally Euclidean. This is the key property distinguishing manifolds from ordinary Euclidean space: they need not be R^n, or even C^α-homeomorphic to R^n, only locally so. We shall study only smooth manifolds here. The non-smooth cases are important, however not for this thesis. For smooth manifolds, we would like to know that M does not depend on the atlas. Proposition 3.3.5.
Let M be a smooth manifold with atlas A. Then there exists a unique maximal atlas A♯ which contains A (and every atlas compatible with it). Proof.
Define A♯ as the set of all charts which are smoothly compatible with the charts in A. Let (ϕ, U) and (ψ, V) be charts in A♯ and put x = ϕ(p) ∈ ϕ(U ∩ V). As A is an atlas, there exists a chart (θ, W) ∈ A with p ∈ W. Since p ∈ U ∩ V ∩ W, the intersection is non-empty. Therefore, by construction, the map

(ψ ∘ θ^{-1}) ∘ (θ ∘ ϕ^{-1}) : ϕ(U ∩ V ∩ W) → ψ(U ∩ V ∩ W)

is smooth, and therefore ψ ∘ ϕ^{-1} is smooth. Hence, A♯ is an atlas on M containing A. To show it is unique, let B be another such atlas. Then, in particular, each of its charts is smoothly compatible with the charts in A. Hence, B ⊆ A♯, and by maximality they are equal. Example 3.3.6.
The following examples of manifolds show up everywhere and thus should be well understood.
(a) The unit sphere S^n from above was shown to exhibit a smooth atlas. The fact that it is Hausdorff and second countable follows from it being a compact subset of R^{n+1}.
(b) Consider the action of R^× on R^n by r·(x_1, ..., x_n) = (rx_1, ..., rx_n). The quotient space (R^n − {0})/R^× is called real projective space and is denoted P^{n−1}(R), or just P^{n−1} if the field is understood. We denote elements here as equivalence classes [x_1, ..., x_n]; these are the lines in R^n which pass through the origin. To give charts on P^{n−1}, we consider, on the locus where x_i ≠ 0, maps of the form

ϕ_i[x_1, ..., x_n] = (x_1/x_i, ..., x_{i−1}/x_i, x_{i+1}/x_i, ..., x_n/x_i) ∈ R^{n−1}.

An easy check shows that these are smooth and compatible. Hence, P^{n−1} is a smooth manifold. Moreover, it is compact!
(c) Let M and N be two smooth manifolds. Then M × N has the structure of a smooth manifold given by charts of the form (ϕ × ψ, U_M × V_N).
(d) Let M(m, n, R) be the m × n matrices with real entries. This is a smooth manifold via the identification M(m, n, R) ≅ R^{mn}. If m = n, we denote it by M(n, R) or M_n(R). Notice that M_n(R) comes equipped with a ring structure given by matrix multiplication. In this case, there are many distinguished open submanifolds, the most important being GL_n(R), the group of invertible linear transformations. We will return to this example later, as it is the principal example of a Lie Group. These will turn out to be group objects in the category of manifolds.
Due to the above proposition, we will assume without loss of generality that M is equipped with its maximal atlas. Now we can define morphisms of smooth manifolds. Definition 3.3.7.
Let M and N be two smooth manifolds. A function F : M → N is a smooth map if for all (ϕ, U) ∈ A_M and (ψ, V) ∈ A_N with F(U) ∩ V ≠ ∅, the map

ψ ∘ F ∘ ϕ^{-1} : ϕ(U) → ψ(V)

is smooth (this is the composite along the square formed by ϕ, F, and ψ). A bijective smooth map whose inverse is smooth is a diffeomorphism. The composition of smooth maps is smooth, by an extended version of the same diagram. Therefore, we have defined a category Man_∞ of smooth manifolds with morphisms the smooth maps. Using this, we can now define the functor

C^∞ : Man_∞^op → Alg_R, C^∞(M) = Hom_{Man_∞}(M, R),

where the operations are defined point-wise. It is contravariant by the following: if F : M → N is a morphism, then

F^* : C^∞(N) → C^∞(M), F^*(s) = s ∘ F.

Further, if M → N → P is a sequence of morphisms G and F, then (F ∘ G)^*(d) = d ∘ (F ∘ G) = (d ∘ F) ∘ G = G^*(F^*(d)), so (F ∘ G)^* = G^* ∘ F^*. A derivation of this ring at p ∈ M is a linear function d : C^∞(M) → R such that d(fg) = f(p)(dg) + (df)g(p). Remark 3.3.8.
In fact, C^∞(M) carries a natural topology (making it a Fréchet space, though not a Banach space). This changes the situation in a subtle way. If M × N is a product manifold, one would expect the smooth functions on it to be C^∞(M) ⊗_R C^∞(N). However, this cannot be true: a function such as sin(xy) on R × R, for instance, is not a finite sum of products f(x)g(y). Therefore, one needs to take a suitable completion of this tensor product. For more information see [Rya02]. Definition 3.3.9.
Let M be a smooth manifold. If p ∈ M, we define the tangent space to M at p to be

T_pM = {f : C^∞(M) → R : f is a derivation at p}.

This is clearly an R-vector space; in fact, it is finite dimensional. Elements of the tangent space should be thought of as vectors which are tangent to M at the point p. We put dim_p M = dim_R T_pM. The germ of a function f : M → R at the point p is an equivalence class [f], where two functions f, g are equivalent at p if there exists an open neighbourhood W of p such that f = g on W. Denote by C^∞_{M,p} the set of all germs at p. This is a local ring whose maximal ideal m_p consists of the germs which vanish at p. The following theorem gives equivalent formulations of the tangent space. Theorem 3.3.10. Let M be a smooth manifold and T_pM its tangent space at p. The space of germs C^∞_{M,p} is a local ring with maximal ideal m_p. The following are equivalent formulations of the tangent space:
(a) Let D_pM = Der(C^∞_{M,p}, R). Then D_pM ≅ T_pM, by the map sending a derivation d of germs to the derivation f ↦ d([f]).
(b) Let γ : (−1, 1) → M be a smooth curve with γ(0) = p. Set

C_pM = {γ′(0) : γ : (−1, 1) → M a smooth curve with γ(0) = p}/∼,

where γ ∼ δ if for all germs f ∈ C^∞_{M,p} we have (f ∘ γ)′(0) = (f ∘ δ)′(0). Then C_pM ≅ T_pM.
(c) (m_p/(m_p)²)^* ≅ T_pM.
See [Wed16] for a proof of this. The hardest part to prove is (c), and it relies heavily on the fact that M is C^∞; if M were only C^n, this would fail, and dim m_p/(m_p)² would be infinite. The operation of passing to the tangent space is functorial in M. That is, if F : M → N is a morphism, then T_pF : T_pM → T_{F(p)}N is the linear map defined by d ↦ d ∘ F^*. Therefore,

T_p(G ∘ F)(d) = d ∘ (G ∘ F)^* = (d ∘ F^*) ∘ G^* = (T_{F(p)}G ∘ T_pF)(d). Definition 3.3.11.
Let M be a smooth manifold. We define the tangent bundle of M as the disjoint union

TM = ⨆_{p∈M} T_pM = {(p, v) : p ∈ M, v ∈ T_pM}.

(The disjoint union is the coproduct in the category of sets; see [Lee12].) The cotangent bundle T^*M = ⨆_{p∈M} (T_pM)^* is defined analogously. There is a canonical projection π_M : TM → M given by (p, v) ↦ p, so that π_M^{-1}(p) = T_pM. Picking a basis of T_pM so that we may identify it with R^n, we find that in a neighbourhood U of p, π_M^{-1}(U) ≅ U × R^n. This property of the tangent bundle is called local trivialization. Further, using this identification, we get that TM (and likewise T^*M) is a smooth manifold of dimension 2 dim M. This definition makes T : Man_∞ → Man_∞ into an endofunctor [Lee12]. Using the tangent bundle, we can now study certain C^∞(M)-modules which arise naturally. Let s : M → TM be a smooth map such that π_M ∘ s = id_M. We call s a section of TM and denote the space of all sections by

Γ(M, TM) := {s : M → TM : π_M ∘ s = id_M}.

This is an R-vector space under point-wise addition. Moreover, it can be given the structure of a C^∞(M)-module: if s is a section and f ∈ C^∞(M), we define (f · s)(p) = (p, f(p)s(p)). If U ⊆ M is a submanifold, we define Γ(U, TM) as the sections of the bundle over U. (Recall that a derivation is a linear map f satisfying the Leibniz rule f(xy) = f(x)y + xf(y), and Der denotes the space of all such maps; for more details, see [Lee12] or [Wed16].) Definition 3.3.12.
A smooth section s ∈ Γ(M, TM) is called a smooth vector field on M. It associates to each point p ∈ M a tangent vector v ∈ T_pM. We call M parallelizable if there exist vector fields V_1, ..., V_n such that {V_1(p), ..., V_n(p)} is a basis for T_pM for all p. Proposition 3.3.13. Let M be parallelizable. Then TM ≅ M × R^n. Proof. Let {V_1, ..., V_n} be a parallelization of M. Then the map ϕ : TM → M × R^n given by

ϕ(p, ∑ a_i V_i(p)) = (p, ∑ a_i e_i)

is smooth. Further, as T_pM ≅ R^n via the isomorphism V_i(p) ↦ e_i, this map is a diffeomorphism. Hence, TM ≅ M × R^n is trivial. One important operation that vector fields admit is the Lie derivative. Given two vector fields V and W, we define

L_V(W) = [V, W] = VW − WV.

Here each vector field X acts on functions as a derivation of C^∞(M), via (Xf)(p) = X_p(f). It is readily checked that [V, W] is again a vector field. Hence, Γ(M, TM) admits the structure of a Lie algebra. Immersions and Submersions
Given the discussion above, we can now formulate some special morphisms in
Man ∞ . Definition 3.3.14.
Let F : M → N be a morphism in Man_∞. We define the rank of F at the point p ∈ M as

rk_p(F) := rk(T_pF : T_pM → T_{F(p)}N).

Choosing bases for T_pM and T_{F(p)}N, we obtain a smooth map M → M(m, n, R), p ↦ T_pF. Further, the matrix of T_pF is (up to a choice of bases)

$$\begin{pmatrix} I_{\operatorname{rk}(F)} & 0 \\ 0 & 0 \end{pmatrix}.$$

From this it follows that some rk_p(F) × rk_p(F) minor of the matrix of T_pF is non-zero, and by continuity that minor remains non-zero in a neighbourhood of p. We now have the immediate corollary: Corollary 3.3.15.
For every p ∈ M, there exists an open neighbourhood U of p such that rk_p(F) ≤ rk_q(F) for all q ∈ U. This tells us that the rank of a smooth map can only stay the same or increase in a neighbourhood of a point. If the rank is constant on a neighbourhood of p, we say F has constant rank at p. Corollary 3.3.16.
If F has constant rank at p, then there exist charts (ϕ, U) and (ψ, V) of p and F(p) respectively, such that

ψ ∘ F ∘ ϕ^{-1}(x_1, ..., x_m) = (x_1, ..., x_r, 0, ..., 0).

This corollary is incredibly important to the study of manifolds, as it gives a local representation of F in which we can disregard a significant number of variables. There are two extreme cases of the above corollary. Definition 3.3.17. F : M → N is called an:
(a) Immersion if T_pF is injective for all p.
(b) Submersion if T_pF is surjective for all p.
A smooth immersion which is also a topological embedding is called a smooth embedding. Embeddings are particularly useful, as they exhibit manifolds as sitting inside others. Immersions are also incredibly important. The following example is of an object which cannot be embedded into R³ but can be immersed in it. Example 3.3.18.
Let I² be the product of [0, 1] with itself. We are going to build K, the Klein bottle. Consider the relations (x, 0) ∼ (x, 1) and (0, y) ∼ (1, 1 − y). The resulting quotient is an object which cannot be embedded into R³ but can be immersed in it. It is the gluing of two Möbius bands along their boundaries, yielding a one-sided surface with no edges. To show it is a manifold is not particularly difficult, as we have an explicit representation of it above. We want to understand how immersions, submersions, and embeddings interact with surjective, injective, and bijective maps. Theorem 3.3.19 (Global Rank Theorem). Let F : M → N be a smooth map of smooth manifolds with constant rank. Then:
(a) If F is injective, then F is an immersion.
(b) If F is surjective, then F is a submersion.
(c) If F is bijective, then F is a diffeomorphism.
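The immersion and submersion conditions are rank conditions on a Jacobian matrix, so they can be spot-checked numerically. A minimal sketch (the maps F and G below are our own illustrative examples, not taken from the text): F is a graph map, hence an immersion, while G is a linear projection, hence a submersion.

```python
import numpy as np

def jacobian(f, p, h=1e-6):
    """Numerical Jacobian of f at p via central differences."""
    p = np.asarray(p, dtype=float)
    cols = []
    for i in range(p.size):
        e = np.zeros_like(p)
        e[i] = h
        cols.append((np.asarray(f(p + e)) - np.asarray(f(p - e))) / (2 * h))
    return np.column_stack(cols)

F = lambda p: (p[0], p[1], p[0] ** 2 + p[1] ** 2)  # R^2 -> R^3, graph map
G = lambda p: (p[0], p[2])                         # R^3 -> R^2, projection

# Immersion: T_pF injective, i.e. rank = 2 = dim of the source, at each sample p.
for q in [(0.0, 0.0), (1.0, -2.0), (0.3, 0.7)]:
    assert np.linalg.matrix_rank(jacobian(F, q)) == 2
# Submersion: T_pG surjective, i.e. rank = 2 = dim of the target.
for q in [(1.0, 2.0, 3.0), (-1.0, 0.0, 0.5)]:
    assert np.linalg.matrix_rank(jacobian(G, q)) == 2
```

A symbolic computation (e.g. with a computer algebra system) would verify the constant-rank hypothesis at every point rather than at sample points.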
The proof of Theorem 3.3.19 relies on a strong theorem from functional analysis. As we do not develop this theory here, the proof will be omitted; for a full treatment, see [Lee12] and [Kna05b]. This theorem gives a sufficient condition for a smooth map to be an immersion (resp. submersion), and it is much more easily checked than the immersion (submersion) condition itself. Figure 3.2: The Klein bottle immersed in R³. It can be embedded in R⁴. Vector Bundles
We now want to understand some generalizations of the (co)tangent bundle from above.
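Before the formal definition, the triviality question for a bundle can be illustrated on the simplest tangent bundle. A minimal numerical sketch (the sampling of the circle is an illustrative choice): by Proposition 3.3.13, exhibiting a nowhere-vanishing global tangent field on S¹ shows that T S¹ ≅ S¹ × R.

```python
import numpy as np

# T S^1 is trivial: S^1 is parallelizable via the single global vector field
# V(t) = (-sin t, cos t), which is tangent to the circle and nowhere zero.
t = np.linspace(0.0, 2.0 * np.pi, 400)
p = np.stack([np.cos(t), np.sin(t)], axis=1)    # sample points on S^1 in R^2
V = np.stack([-np.sin(t), np.cos(t)], axis=1)   # candidate global section of T S^1

assert np.allclose(np.sum(p * V, axis=1), 0.0)  # tangency: V(p) is orthogonal to p
assert np.min(np.linalg.norm(V, axis=1)) > 0.9  # nowhere vanishing (norm is 1 here)
```

Contrast this with the Möbius line bundle below, where no such nowhere-vanishing section exists.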
Definition 3.3.20.
Let M be a smooth manifold. We call a triple (E, π, V), consisting of a smooth manifold E, a projection map π, and a real vector space V, a real vector bundle of rank dim V over M if:
(a) π : E → M is smooth and surjective.
(b) For each p ∈ M, the fibre E_p := π^{-1}(p) ≅ {p} × V ≅ V is endowed with the structure of a dim V-dimensional real vector space.
(c) For each p ∈ M, there exists a neighbourhood U of p and a homeomorphism Φ : π^{-1}(U) → U × V satisfying:
(i) π_U ∘ Φ = π (where π_U : U × V → U is the projection);
(ii) for each q ∈ U, the restriction Φ_q : E_q → {q} × V is a vector space isomorphism.
Equivalently, we could have defined a vector bundle as E = ⨆_{p∈M} V_p where V_p = {p} × V. In this sense, we see that TM and T^*M are vector bundles, and, as before, Γ(M, E) is a C^∞(M)-module. The main purpose of this section is to understand transformations on bundles and transformations between them. Definition 3.3.21. Let (E, π) and (E′, π′) be vector bundles over M and M′ respectively. A bundle homomorphism is a map F : E → E′ which is linear on each fibre, such that there exists a map f : M → M′ making the square commute: π′ ∘ F = f ∘ π. Proposition 3.3.22.
If F is smooth, then f is smooth. Proof. f = π′ ∘ F ∘ ζ, where ζ : M → E is the zero section; this is a composition of smooth maps and therefore smooth. This lets us define a category Bun(M) whose objects are vector bundles over M and whose morphisms are bundle homomorphisms. The forgetful functor U : Bun(M) → Man_∞ is faithful. In general it is not full, as there exist smooth maps E → E′ which do not commute with the projection maps. We will denote by Bun(M)_{<∞} the category of finite rank vector bundles. This category will become interesting in the next section, when we relate it to categories of certain sheaves. Example 3.3.23.
We now construct some interesting bundles over various manifolds.
(a) Let M = S¹. Define an equivalence relation on R² by (x, y) ∼ (x′, y′) if (x′, y′) = (x + n, (−1)ⁿy) for some n ∈ Z. Put E = R²/∼. We claim E is a non-trivial bundle over S¹. Let q : R² → E be the quotient map and let ε : R → S¹ be ε(x) = e^{2πix}. The first-coordinate projection R² → R descends along q and ε to the map π : E → S¹ determined by the commutativity of the evident square. This makes (E, π) a real line bundle on S¹ which is non-trivial (by the twist of (−1)ⁿ): the Möbius bundle. This is the chief example of how local information can be deceptive when trying to understand something globally.
(b) Let M be a manifold and V any vector space. Then M × V has the canonical structure of a vector bundle on M.
(c) Let E, E′ be vector bundles over M. Then E ⊕ E′ is a vector bundle whose fibres are V ⊕ V′. This is called the Whitney sum of vector bundles.
If E and E′ are vector bundles on a smooth manifold M, denote their spaces of smooth sections by Γ(E) and Γ(E′). If F : E → E′ is a bundle homomorphism, it induces a map F̃ : Γ(E) → Γ(E′) given by

F̃(σ)(p) = F(σ(p)).

Because a bundle homomorphism is linear on fibres, F̃ is R-linear on sections. In fact, it is even C^∞(M)-linear. We can characterize all C^∞(M)-linear maps Γ(E) → Γ(E′) by the following theorem. Theorem 3.3.24.
Let E, E′ be vector bundles on M and T : Γ(E) → Γ(E′) a map. Then T is C^∞(M)-linear if and only if T = F̃ for some bundle homomorphism F : E → E′. The proof of this goes beyond the scope of this text; see [Lee12] for details. What this theorem tells us is that for vector bundles we have a bijective correspondence

Hom_{C^∞(M)}(Γ(E), Γ(E′)) ≅ Hom_{Bun(M)}(E, E′). Remark 3.3.25.
The key to vector bundles is that they encode both global and local information about the manifold. Further, understanding the category
Bun(M) is in some sense equivalent to understanding the slice category (see [ML71] for a definition) Man_∞/M. Overall, we shall use these objects to transfer information from the physical space of a sensory system to the perceptual space; in fact, this will be how we build the perceptual space. We could have equivalently defined fibre bundles and gone through this section in more generality. These are similar to vector bundles, but we do not require that the fibres be vector spaces. The story of these objects is largely mysterious, as they are nearly too general to say anything interesting about. Importantly though, they still have the property that all fibres are isomorphic. This concludes the section on manifolds. The story of sheaves begins where we just finished: fibre bundles. Notice that fibre bundles are characterized by the fact that the fibres over every point are necessarily isomorphic. Sheaves seek to generalize this idea by removing the restriction of constant fibres. Sheaves are key in nearly every area of mathematics, as they encode geometric information which is otherwise difficult to access. In the late 1960s, Alexander Grothendieck developed the idea that understanding the sheaves on a space is equivalent to (and in some sense better than) understanding the space itself. In this section we will give the first properties of (pre)sheaves, define ringed spaces, and construct the category O_X-Mod. We conclude the section with a brief introduction to sheaf cohomology which, in the same style as the cohomology of the previous section, will provide rich invariants of the associated manifolds. Most of the material of this section comes from [Ive86], [Har77], [Wed16], [EH00], and [Bre97]. As this forms the most technical material of this thesis, we shall only prove those statements which are fundamental to the reader's understanding, and will point to the appropriate reference otherwise.
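The gluing behaviour that separates sheaves from presheaves can be simulated on a toy cover ahead of the formal definitions. A minimal sketch (the finite "space", the cover {1, 2} ∪ {2, 3}, and the dictionary encoding of sections are our own illustrative choices, not constructions from the text): function-valued sections glue uniquely, while globally constant sections can fail to glue.

```python
# Toy model of the sheaf condition (Sh) over a finite "space" {1, 2, 3}.
# A section over an open set is encoded as a dict {point: value}.

def glue(s1, s2):
    """Glue two local sections; return None if they disagree on the overlap."""
    overlap = set(s1) & set(s2)
    if any(s1[x] != s2[x] for x in overlap):
        return None
    return {**s1, **s2}

# Function-valued sections over {1, 2} and {2, 3} glue uniquely:
assert glue({1: 10, 2: 20}, {2: 20, 3: 30}) == {1: 10, 2: 20, 3: 30}

# But constant sections over the disjoint sets {1} and {3} agree vacuously on
# the empty overlap, while no single constant restricts to both values -- the
# constant presheaf is not a sheaf on a disconnected space.
glued = glue({1: 10}, {3: 30})
assert glued == {1: 10, 3: 30} and len(set(glued.values())) == 2
```

This is exactly the behaviour formalized by condition (Sh) below, and the failure for constants reappears in the discussion of the constant presheaf later in the section.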
Remark 3.3.26.
Due to the technical density of this section, we encourage the reader to skip a majority of the proofs of the statements presented here. The proofs of a majority of these theorems can be opaque on a first pass, and should be revisited only if a deeper understanding is desired. Before we give the formal definitions of sheaves, recall some of the facts we proved about C^∞(−) as a functor Man_∞ → Ring. Fix M ∈ Man_∞. We know that C^∞_M(U) is a ring for any open submanifold U ⊆ M. Additionally, C^∞_{M,x} is a local ring for each x ∈ M. Further, we showed that given an open cover {U_i} of M and smooth functions f_i defined on each U_i such that f_i|_{U_i∩U_j} = f_j|_{U_i∩U_j}, there exists a unique global smooth function g with the property that g|_{U_i} = f_i. What we will see is that C^∞_M is the structure sheaf of M. For now, let us start, as always, with some definitions. Definition 3.3.27.
Let (X, T) be a topological space and C a category. A presheaf on X is a functor F : T^op → C, where T is the category whose objects are the open subsets of X and in which Hom(V, U) is a one-point set if V ⊆ U and empty otherwise. Morphisms of presheaves are natural transformations of functors. Remark 3.3.28.
Notice that for V ⊆ U, there is a unique morphism, denoted Res_{UV} : F(U) → F(V). We sometimes call F(U) the set of sections of F over U and denote it Γ(U, F). Additionally, instead of writing Res_{UV}(s) for the image of s in F(V), we write s|_V. Every presheaf is the same as a contravariant functor; we use the term presheaf when we want to discuss the gluing conditions which we will see later. A classical example of a presheaf is C^α_M(U) = {f : U → R : f is α-times differentiable}, for a real C^α-manifold M and α ∈ N ∪ {∞}. Definition 3.3.29.
Let X be a topological space and F a presheaf on X. F is a sheaf if the following condition is satisfied:
(Sh) If U ⊆ X is open, {U_i}_{i∈I} is an open cover of U, and we are given sections f_i ∈ F(U_i) such that f_i|_{U_i∩U_j} = f_j|_{U_i∩U_j} for all i ≠ j ∈ I, then there exists a unique f ∈ F(U) such that f|_{U_i} = f_i. Remark 3.3.30.
This definition can be generalized to more general categories; to do this correctly, however, one needs the language of sites. We will not cover these, but refer the reader to [Met03], [KS06], and [Car11] for an in-depth treatment. Example 3.3.31.
We have already seen an example of a sheaf, namely C^α; it is easy to check the gluing condition (Sh). Other common examples are Ω^p_M, the sheaf of differential forms of degree p, and L, the sheaf of locally constant functions on a space. Sheaves allow local information to be glued together into global information. What we mean by local here is up to some interpretation: either open neighbourhoods of points, or the points themselves. As points are almost never open (except in discrete sets), we need to figure out how to define F(x). The following definition gives an answer in any category which admits colimits. Definition 3.3.32.
Let X be a topological space and U(x) = {U ∈ Open(X) : x ∈ U}. Suppose F is a (pre)sheaf on X. We define the stalk of F at x to be

F_x = lim→_{U(x)} F(U).

Here, we interpret the colimit as being taken over successively smaller sets containing x. In fact, if there exists some minimal U_x contained in all neighbourhoods of x, then F_x = F(U_x). We now want to understand how morphisms of sheaves interact with the stalks. Remark 3.3.33.
For the remainder of this text, we shall consider only sheaves of rings or, more generally, of R-modules for some ring R. This simplifies the situation and also turns out to be the situation for most spaces. Proposition 3.3.34.
A morphism of sheaves on a space X, ϕ : F → G, is an isomorphism if and only if the induced map on stalks ϕ_x : F_x → G_x is an isomorphism for every x ∈ X.
Proof. (⇒) Let x ∈ X and let U(x) be as in Definition 3.3.32, considered as a partially ordered set, with the colimit functor lim→ : Ring^{U(x)} → Ring. As ϕ is a natural transformation, it gives a morphism between the two direct systems {F(U), Res_{UV}}_{U(x)} and {G(U), Res_{UV}}_{U(x)}. As ϕ is an isomorphism, ϕ_U is an isomorphism for all U ∈ U(x). Therefore

lim→_{U(x)} {ϕ_U : F(U) → G(U)} = ϕ_x : F_x → G_x

is an isomorphism. As x was arbitrary, ϕ_x is an isomorphism on all stalks.
(⇐) Now assume that ϕ_x is an isomorphism for all x ∈ X. We shall show that ϕ_U is a bijection for all U; taking ψ_U = ϕ_U^{-1} then makes ϕ an isomorphism of sheaves. Let us first show that ϕ_U is injective. If s ∈ F(U) is such that ϕ_U(s) = 0, then on all stalks ϕ_x(s_x) = 0. As ϕ_x is an isomorphism, s_x = 0 for all x ∈ U. Therefore, for each x ∈ U there exists some open W_x ⊆ U containing x such that s|_{W_x} = 0. As ⋃ W_x is a cover of U, the sheaf condition gives a unique s* ∈ F(U) such that s*|_{W_x} = s|_{W_x} = 0 for all x. By uniqueness, s* = s = 0, so ϕ_U is injective.
To show surjectivity, let t ∈ G(U). For x ∈ U, let t_x ∈ G_x be the germ of t at x. As ϕ_x is surjective, there exists s_x ∈ F_x such that ϕ_x(s_x) = t_x. Pick a representative section s(x) ∈ F(V_x) whose germ at x is s_x. Then ϕ_{V_x}(s(x)) and t|_{V_x} have the same germ in G_x. Possibly replacing V_x by a smaller open set, we may assume that ϕ_{V_x}(s(x)) = t|_{V_x}. The collection {V_x} forms an open cover of U, and on each V_x we have a section s(x). Let p, q ∈ U be distinct points. Then s(p)|_{V_p∩V_q} and s(q)|_{V_p∩V_q} are two sections of F(V_p ∩ V_q) which are sent by ϕ to t|_{V_p∩V_q}. As ϕ_{V_p∩V_q} is injective (by the first part), we conclude that

s(p)|_{V_p∩V_q} = s(q)|_{V_p∩V_q}.

By the sheaf condition, there exists s ∈ F(U) such that s|_{V_p} = s(p) for every p. Lastly, we need to check that ϕ_U(s) = t. By construction, ϕ_{V_x}(s|_{V_x}) = t|_{V_x} for all x ∈ U; applying the sheaf condition to ϕ_U(s) − t, we see that this difference must be 0, hence ϕ_U(s) = t and ϕ_U is surjective. This completes the proof.
The collection of all C-valued sheaves on a topological space forms a category, denoted Sh(X, C) (presheaves also form a category). Per the remark above, we shall write Sh(X) := Sh(X, R-Mod) when R is understood. Notice that for a morphism of sheaves the kernel presheaf defines a sheaf, but the cokernel presheaf does not. Further, we would like quotients to exist in this category. To remedy this, we come to the following definition. Definition/Proposition 3.3.35.
For any presheaf F there is a sheaf F̃ and a natural morphism θ : F → F̃ with the following universal property: for any sheaf G and morphism of presheaves ϕ : F → G, there exists a unique morphism of sheaves ϕ̂ : F̃ → G with ϕ̂ ∘ θ = ϕ. The sheaf F̃ is called the sheafification of F. One can prove that sheafification is functorial in presheaves; in fact, it is left adjoint to the forgetful functor Sh(X) → PSh(X). Lemma 3.3.36.
The canonical map θ : F → F̃ induces an isomorphism on stalks. Proof.
Consider the construction of the sheafification of F as

F̃(U) = {(s_x) ∈ ∏_{x∈U} F_x : for all x ∈ U there exist an open W with x ∈ W ⊆ U and a t ∈ F(W) such that t_w = s_w for all w ∈ W}.

The restriction maps are given by restriction on the products. Now, by definition, θ_x is necessarily the identity. Remark 3.3.37. There is another way to build the sheaf associated to a presheaf. Given a presheaf F on X, we can construct the space Spé(F) = ⨆_{p∈X} F_p. This has a natural projection π : Spé(F) → X which projects each stalk onto the point it lies over. We topologize this space by endowing it with the strongest topology such that the sections s ∈ F(U) are continuous. It can be shown that these definitions agree. The sheafification operation allows us to define cokernels, quotients, and constant sheaves. All of this together tells us that if A is an abelian category, then Sh(X, A) is also an abelian category [Ive86]. Specifically, Sh(X) is an abelian category. Example 3.3.38.
(a) Let A be a ring. Then A defines a presheaf A_X by A_X(U) = A for all U open. If X is connected, then this is a sheaf; if X is disconnected, it is not. Suppose X = X_1 ⊔ X_2 and a ≠ b ∈ A. Defining the sections a ∈ A_X(X_1) and b ∈ A_X(X_2), they agree trivially on the empty intersection, yet there is no element c ∈ A_X(X) such that c|_{X_1} = a and c|_{X_2} = b. Hence, A_X is not a sheaf in general. Therefore, we consider Ã_X, the sheaf of locally constant functions on X with values in A.
(b) We define the skyscraper sheaf i_{x,*}(A) by i_{x,*}(A)(U) = A if x ∈ U and 0 if x ∉ U. This is a sheaf on any topological space and plays a key role in the theory, as it provides good counterexamples to many conjectural relationships.
(c) Let F be a sheaf and G a subsheaf on X. Then the functor U ↦ F(U)/G(U) is a presheaf. It is not a sheaf in general, however.
Therefore, we can take the sheafification to get the quotient sheaf F/G on X. In general, (F/G)(U) does not agree with F(U)/G(U). As Sh(X) is an abelian category, we can consider exact sequences of sheaves. Definition 3.3.39.
A sequence of sheaves on a space X,

0 → F ↪ G ↠ H → 0,

is exact if Im[F → G] ≅ ker[G → H]. Equivalently, this sequence is exact if the corresponding sequence of stalks 0 → F_x → G_x → H_x → 0 is exact for every x ∈ X. It follows, by an argument identical to the one for Hom, that Γ(X, −) is a left exact functor Sh(X) → R-Mod. For this reason we define

H^i(X, F) := R^iΓ(X, F),

the sheaf cohomology groups of F. These will tie together the entire chapter in Section 3.2.4 via Theorem 3.3.62. Before then, however, we want to consider how sheaves behave under maps between spaces. Up until this point, we have considered a fixed space X. If we have a morphism of topological spaces f : X → Y, we want to build a sheaf on Y which comes from f in some way. Definition 3.3.40.
Let f : X → Y be a map of topological spaces. Suppose F is a sheaf on X. The direct image (or pushforward) sheaf on Y with respect to f is the sheaf

f_*F(V) := F(f^{-1}(V)).

Further, we define the inverse image sheaf on X of a sheaf G on Y as

f^{-1}G(U) = lim→_{f(U)⊆V} G(V). Remark 3.3.41.
In the previous definition, one may want to give a naive definition of the inverse image sheaf in the style of the pushforward, that is, f^{-1}G(U) = G(f(U)). This fails immediately, however, as we are not guaranteed that f(U) is open. Sometimes topological spaces come naturally equipped with sheaves; smooth manifolds are examples of this situation. To every real topological manifold M we can attach C_M, the sheaf of continuous functions M → R. Definition 3.3.42. A ringed space is a topological space X equipped with a sheaf of rings O_X, called the structure sheaf of X. A morphism of ringed spaces is a pair (f, f♯) with f : X → Y a continuous map and f♯ : O_Y → f_*O_X a map of sheaves. We call (X, O_X) a locally ringed space if the stalks O_{X,p} are local rings for all p ∈ X. A morphism of locally ringed spaces is a morphism of ringed spaces for which the induced maps on stalks are local homomorphisms of local rings (the induced map O_{Y,f(p)} → O_{X,p} carries the maximal ideal at f(p) into the maximal ideal at p). Proposition 3.3.43.
Let (M, O_M) be a locally ringed space. Then M is a smooth manifold in the sense of Definition 3.3.3 if and only if there exists an open cover M = ⋃ U_i such that for each U_i there exist an open Y ⊆ R^n and an isomorphism of locally ringed spaces (U_i, O_M|_{U_i}) → (Y, C^∞_{R^n}|_Y).
Proof. (⇐) This direction is immediate: define the charts of the atlas to be the first components of the morphisms (f_i, f_i♯) of ringed spaces; the sheaf condition then guarantees that the gluing axiom holds.
(⇒) This direction is a bit more subtle. Let M be a smooth manifold with atlas A, write M = ⋃ U_i, and let V ⊆ M be an open subset. Define

O_M(V) = {f : V → R : f|_{U_i∩V} ∘ ϕ_i^{-1} : ϕ_i(U_i ∩ V) → R is C^∞ for all i}.

This makes (M, O_M) a ringed space. Further, it follows immediately that the induced morphisms (U_i, O_M|_{U_i}) → (Y, C^∞_{R^n}|_Y) are isomorphisms of ringed spaces. As the target is locally ringed, so is (M, O_M).
Corollary 3.3.44. Let M be a smooth manifold with smooth atlas A. Then (M, C^∞_M) is a locally ringed space. In some sense, locally ringed spaces are the correct setting in which to study everything we have seen already. Manifolds and all of their analytic properties can be re-phrased in terms of operations on the sheaf C^∞_M. The only object which we have seen so far that needs some further discussion is vector bundles. We first discuss a generalization. Definition 3.3.45.
Let $(X, \mathcal{O}_X)$ be a ringed space. An $\mathcal{O}_X$-module is a sheaf $\mathcal{F}$ on $X$ such that for each open $U \subseteq X$ there is a map $\mathcal{O}_X(U) \times \mathcal{F}(U) \to \mathcal{F}(U)$ which turns $\mathcal{F}(U)$ into an $\mathcal{O}_X(U)$-module. A morphism of $\mathcal{O}_X$-modules is a morphism of sheaves which is $\mathcal{O}_X$-equivariant.

In direct analogy with $R$-modules, we can consider some operations on $\mathcal{O}_X$-modules. Example 3.3.46.
For this set of examples, let $\mathcal{F}$ and $\mathcal{G}$ be $\mathcal{O}_X$-modules.

(a) (Direct sums) We can define $(\mathcal{F} \oplus \mathcal{G})(U)$ by $\mathcal{F}(U) \oplus \mathcal{G}(U)$. It is nearly immediate that this is a sheaf. Therefore, if $I$ is a finite indexing set, we can define the direct sum over this set and it will be a sheaf. In the infinite case this no longer holds true, and therefore one must sheafify.

(b) (Tensor products) Consider the presheaf $T : U \mapsto \mathcal{F}(U) \otimes_{\mathcal{O}_X(U)} \mathcal{G}(U)$. This is not a sheaf in general (it takes some work to find a counterexample). Therefore, we define $\mathcal{F} \otimes_{\mathcal{O}_X} \mathcal{G} = \widetilde{T}$.

(c) (Hom) We can consider the presheaf $U \mapsto \mathrm{Hom}_{\mathcal{O}_X|_U}(\mathcal{F}|_U, \mathcal{G}|_U)$. This is actually a sheaf and is denoted $\mathcal{H}om_{\mathcal{O}_X}(\mathcal{F}, \mathcal{G})$. It also turns out that $\mathcal{H}om$ and $\otimes_{\mathcal{O}_X}$ are adjoint endofunctors.

(d) (Duals) We define $\mathcal{F}^* := \mathcal{H}om_{\mathcal{O}_X}(\mathcal{F}, \mathcal{O}_X)$. This is a sheaf on $X$. There is a canonical morphism $\mathcal{F} \to \mathcal{F}^{**}$ to the double dual, given on stalks by $s_x \mapsto \mathrm{ev}_{s_x} : \mathcal{F}_x \to \mathcal{O}_{X,x}$, the evaluation-at-$s_x$ map. Further, this gives another construction of the tangent and cotangent bundles.

Now we want to define "free" $\mathcal{O}_X$-modules. Remark 3.3.47.
For the remainder of this text, we shall write $\mathcal{O}_U$ for the restriction of the structure sheaf to an open $U \subseteq X$.

Definition 3.3.48. We call an $\mathcal{O}_X$-module $\mathcal{F}$ finite locally free if there exists an open cover $\mathcal{U} = \{U_i\}_{i \in I}$ such that each $\mathcal{F}|_{U_i}$ is isomorphic (as sheaves) to $\mathcal{O}_{U_i}^n$ for some $n \in \mathbb{N}$. In this case, each stalk $\mathcal{F}_x$ is a free $\mathcal{O}_{X,x}$-module. Define $\mathrm{rk}_x(\mathcal{F}) := \mathrm{rk}_{\mathcal{O}_{X,x}}(\mathcal{F}_x)$. This defines a locally constant function $X \to \mathbb{N}$, $x \mapsto \mathrm{rk}_x(\mathcal{F})$, called the rank of $\mathcal{F}$.

We can build a category $\mathrm{FLF}(X)$ of all finite locally free sheaves on $X$. It turns out that for FLF sheaves, the canonical morphism $j : \mathcal{F} \to \mathcal{F}^{**}$ is an isomorphism. These look surprisingly close to a generalization of vector bundles, and the following theorem explains why. Theorem 3.3.49.
There is an equivalence of categories $\mathrm{Bun}(X)_{<\infty} \rightleftarrows \mathrm{FLF}(X)$ for any ringed space $X$.

For a proof, see [Wed16]. What this theorem tells us is that we can assign to each vector bundle a finite locally free sheaf and vice versa. Therefore, as the tangent and cotangent bundles are finite rank vector bundles on a manifold $M$, we get corresponding sheaves $\mathcal{T}_M$ and $\Omega_M$. It turns out that we can define the cotangent bundle using the sheaf $\mathcal{H}om$ from above:
$$\Omega_M = \mathcal{T}_M^* = \mathcal{H}om(\mathcal{T}_M, C^\infty_M).$$
Furthermore, as $C^\infty_{M,x}$ is a local ring (the maximal ideal $\mathfrak{m}_x$ consists of all functions vanishing at $x$), we can define $T_xM = (\mathfrak{m}_x/\mathfrak{m}_x^2)^*$ and then $\mathcal{T}_{M,x} = (\mathfrak{m}_x/\mathfrak{m}_x^2)^*$. This gives an explicit description of the stalks of $\mathcal{T}_M$.

We now turn to some homological methods to end this subsection. Together with their morphisms, $\mathcal{O}_X\text{-}\mathbf{Mod} \hookrightarrow \mathbf{Sh}(X)$ is a full subcategory which can be shown to have enough injectives [Ive86]. Injective $\mathcal{O}_X$-modules are defined analogously to injective $R$-modules. For this reason, given $\mathcal{F}$ in $\mathcal{O}_X\text{-}\mathbf{Mod}$, we can find an injective resolution $\mathcal{J}^\bullet$ and thus a quasi-isomorphism $\mathcal{F} \xrightarrow{\mathrm{qis}} \mathcal{J}^\bullet$. This gives us a way of computing $H^i(X, \mathcal{F})$. In a similar manner to $R$-modules,
$$H^i(X, \mathcal{F}) \cong H^i(\Gamma(X, \mathcal{J}^\bullet)).$$
The following remarkable theorem gives yet another way to compute sheaf cohomology for constant sheaves corresponding to a ring $R$. Theorem 3.3.50.
Let $R$ be a ring and $\widetilde{R}_M$ the constant sheaf on $(M, C^\infty_M)$. Then there is an isomorphism $H^i(M, \widetilde{R}_M) \cong H^i_{\mathrm{sing}}(M; R)$ with singular cohomology. The idea of the proof is to sheafify the singular cochain complex; once this is done, the result follows nearly immediately. For a full proof, see [Wed16]. This concludes the section on sheaves. Remark 3.3.51.
The main point of sheaves is to facilitate the transfer of local information to global information via gluing. Notice that the axioms for sheaves, and thus everything else in this section, were designed so that, under the right conditions, local sections glue to global ones. This passage from local information to global information is precisely what needs to happen in the olfactory system. We have local actions of granule cells on mitral cells, and these "glue" together to form an action of the entire GC layer. As you can guess, the notion of sheaves will show up to help with the mathematical formulation of this property.
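The gluing axiom described above can be made concrete in a toy computation. The following sketch (all names are ours, not the text's) glues compatible local sections of the "sheaf of functions" on a finite space: sections given on an open cover determine a unique global section exactly when they agree on overlaps.

```python
# Toy illustration of the sheaf gluing axiom on a finite space.

def glue(cover, local_sections):
    """Glue local sections (dicts: point -> value) defined on the sets of
    `cover` into one global section, failing if they disagree on an overlap."""
    glued = {}
    for U, s in zip(cover, local_sections):
        for p in U:
            if p in glued and glued[p] != s[p]:
                raise ValueError("sections disagree on an overlap; cannot glue")
            glued[p] = s[p]
    return glued

# Two overlapping "opens" carrying restrictions of the same global function
U1, U2 = {0, 1, 2}, {2, 3, 4}
s1 = {0: 0.0, 1: 1.0, 2: 4.0}    # p -> p**2 on U1
s2 = {2: 4.0, 3: 9.0, 4: 16.0}   # p -> p**2 on U2; agrees with s1 on the overlap
f = glue([U1, U2], [s1, s2])     # the unique global section p -> p**2
```

Incompatible local data (two sections disagreeing at a shared point) raises an error, mirroring the fact that such data admit no global section.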
We end this chapter (and therefore all of the background material) with a short discussion of de Rham theory for manifolds. This centers on the construction of differential forms on a manifold and the exterior derivative. The main theorem we will prove is de Rham's theorem, which gives an isomorphism of sheaf cohomology with so-called de Rham cohomology. This, combined with Theorem 3.3.50, gives the grand conclusion that singular cohomology on manifolds can be computed via the de Rham complex. The main references here are [Lee12] and [Wed16].

This story begins with the construction of differential $k$-forms on a manifold. Before we can do this, though, we need to define and study smooth functors. These allow us to transform vector bundles and will extend to endofunctors of $\mathrm{Bun}(M)_{<\infty}$. Definition 3.3.52.
Let $F : \mathbf{Vect}_{\mathbb{R}} \to \mathbf{Vect}_{\mathbb{R}}$ be a functor (we will assume covariant, but this is not necessary). We say that $F$ is smooth if the induced map $F^\flat : \mathrm{Hom}(V, W) \to \mathrm{Hom}(F(V), F(W))$ is smooth as a map of smooth manifolds. Example 3.3.53.
Some common smooth functors which play a key role in the theory of smooth manifolds are presented below.

(a) The functor $(-)^{\otimes k}$ is a smooth functor via the Hom-tensor adjunction. Indeed, even taking $T^\bullet(-)$ is smooth. In general, most of the operations on vector spaces are smooth functors. Some care needs to be taken in the case of infinite indexing sets, but we shall ignore these cases.

(b) The functor $\bigwedge^k(-)$ is smooth. This will form the basis for all of de Rham theory. In general, if $F$ arises as a quotient of $\otimes^k$ by some homogeneous ideal (one generated by elements of the same degree), then $F$ is smooth.

(c) The functor $(-)^* := \mathrm{Hom}(-, \mathbb{R})$ is smooth. This follows from the previous example.

(d) If we fix a vector space $W$, then $\mathrm{Hom}(W, -)$ and $\mathrm{Hom}(-, W)$ are smooth functors. This follows from the first and third examples in the case of finite dimensional spaces. For infinite dimensional spaces this is more subtle and less useful.

Now let $M$ be a smooth manifold and $\pi : E \to M$ a vector bundle. If $F$ is a smooth functor, then $F$ admits an extension $\widehat{F} : \mathrm{Bun}(M)_{<\infty} \to \mathrm{Bun}(M)$ by sending $E \mapsto \widehat{F}(E)$, where $\widehat{F}(E)_p = F(E_p)$. If $F$ takes finite dimensional vector spaces to finite dimensional vector spaces, then $\widehat{F}$ lands in $\mathrm{Bun}(M)_{<\infty}$. Example 3.3.54.
Consider the cotangent bundle from before, $T^*M = \coprod_{p \in M} T_p^*M$. Then we realize this as the extension of the dual functor applied fibrewise: $T^*M = \widehat{(-)^*}(TM)$.

To construct differential forms, we need to consider $\bigwedge^k(T^*M)$. This is a smooth vector bundle on $M$ of rank $\binom{\dim M}{k}$. By Theorem 3.3.49, we can associate a finite locally free $C^\infty_M$-module to $T^*M$. What we would like to show is that this associated sheaf is the $\Omega_M$ from before.

We now give a second construction of $\Omega_M$. Definition 3.3.55.
Let $A$ be an $R$-algebra and $B$ an $A$-algebra. Then the module of derivations is $\Omega_{B/A} = \{db : b \in B\}/\!\sim$, where $\sim$ is defined by the relations for derivations as above ($A$-linearity and the Leibniz rule). For $(X, \mathcal{O}_X)$ a ringed space whose structure sheaf is a sheaf of $R$-algebras, we can define $\Omega_X(U) := \Omega_{\mathcal{O}_X(U)/R}$. For a manifold $(M, C^\infty_M)$, we have $\Omega_M(U) = \Omega_{C^\infty_M(U)/\mathbb{R}}$. This is
the cotangent sheaf of $M$, and the tangent sheaf is its dual as a $C^\infty_M$-module. We define differential $k$-forms again, now as sections of the sheaf $\bigwedge^k \Omega_M$. This is the locally free sheaf associated to the $k$-th exterior power of the cotangent bundle on $M$. Remark 3.3.56.
In general, if $(X, \mathcal{O}_X)$ is a locally ringed space, we cannot define $p$-forms as above. This is because $\Omega_X$ need not be an FLF sheaf. To remedy this, we use the canonical morphism $\Omega_X \to \Omega_X^{**}$ and take exterior powers of the double dual. Proposition 3.3.57.
The two constructions of $\Omega_M$ are equivalent.

Proof. This follows from Theorem 3.3.49 and Proposition 3.3.10.

Now, consider $T^*M$ as the vector bundle associated to $\Omega_M$. Then, since $\bigwedge^k T^*M$ is a vector bundle as above, we can sheafify it. As is expected,
$$\bigwedge^k T^*M \;\mapsto\; \Omega^k_M := \bigwedge^k \Omega_M.$$
Definition 3.3.58.
Using the constructions above, the module of differential $k$-forms is the $C^\infty_M(M)$-module
$$\Omega^k(M) := \Gamma\Big(M, \bigwedge^k T^*M\Big).$$
These modules come with a differential $d^k : \Omega^k(M) \to \Omega^{k+1}(M)$ called the exterior derivative. In the greatest generality, if $\omega \in \Omega^k(M)$ and $V_0, \ldots, V_k$ are smooth vector fields on $M$, then
$$d^k\omega(V_0, \ldots, V_k) = \sum_i (-1)^i V_i\big(\omega(V_0, \ldots, \widehat{V_i}, \ldots, V_k)\big) + \sum_{i<j} (-1)^{i+j} \omega\big([V_i, V_j], V_0, \ldots, \widehat{V_i}, \ldots, \widehat{V_j}, \ldots, V_k\big),$$
where $\widehat{V_i}$ means omission.

Lemma 3.3.59. $d^{k+1} \circ d^k = 0$, so $(\Omega^\bullet(M), d)$ is a cochain complex, the de Rham complex associated to $M$. Definition 3.3.60.
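As a sanity check on this formula (a standard computation supplied here for illustration, not part of the original text), take $\omega = x\,dy$ on $\mathbb{R}^2$ and the coordinate fields $V_0 = \partial_x$, $V_1 = \partial_y$:

```latex
d\omega(\partial_x,\partial_y)
  = \partial_x\bigl(\omega(\partial_y)\bigr)
  - \partial_y\bigl(\omega(\partial_x)\bigr)
  - \omega\bigl([\partial_x,\partial_y]\bigr)
  = \partial_x(x) - \partial_y(0) - 0 = 1,
\qquad\text{so}\qquad d(x\,dy) = dx \wedge dy .
```

Since $[\partial_x, \partial_y] = 0$, only the first sum contributes, and the result agrees with the coordinate formula $d(f\,dy) = df \wedge dy$.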
A differential $k$-form $\omega$ is called closed if $d\omega = 0$. It is called exact if $\omega = d\eta$ for some $(k-1)$-form $\eta$. The above lemma tells us that every exact form is closed. Therefore, we can define a cohomology theory for $M$ via this complex as
$$H^i_{\mathrm{DR}}(M) := \ker d^i / \operatorname{Im} d^{i-1}.$$
It then follows immediately that $H^0_{\mathrm{DR}}(M) \cong \mathbb{R}^{\pi_0(M)}$. Example 3.3.61.
For $\mathbb{R}^n$, the differential 1-forms are generated by the formal symbols $dx_i$, where $\{x_i\}$ are the standard coordinates on $\mathbb{R}^n$. For higher degrees, we have that
$$\omega = \sum \alpha_{i_1, \ldots, i_\ell}\, dx_{i_1} \wedge \cdots \wedge dx_{i_\ell}$$
with smooth coefficients $\alpha_{i_1, \ldots, i_\ell}$. Further, every closed $k$-form for $k \geq 1$ is exact (the Poincaré lemma), so $H^i_{\mathrm{DR}}(\mathbb{R}^n) = 0$ for $i \geq 1$ and $H^0_{\mathrm{DR}}(\mathbb{R}^n) = \mathbb{R}$.

Now that we have the notion of de Rham cohomology, we want to know its relation to sheaf cohomology with the corresponding complex of sheaves constructed in the examples above. Theorem 3.3.62.
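By contrast, removing a single point from the plane already yields nontrivial de Rham cohomology. The following standard example (supplied here for illustration) exhibits a closed 1-form on $\mathbb{R}^2 \setminus \{0\}$ that is not exact:

```latex
\omega \;=\; \frac{-y\,dx + x\,dy}{x^2 + y^2},
\qquad d\omega = 0,
\qquad \int_{S^1} \omega \;=\; 2\pi \;\neq\; 0 .
```

If $\omega$ were exact, Stokes' theorem would force its integral over the unit circle to vanish; hence $\omega$ represents a nonzero class and $H^1_{\mathrm{DR}}(\mathbb{R}^2 \setminus \{0\}) \neq 0$, in contrast with the Poincaré lemma on $\mathbb{R}^n$.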
Let $M$ be a $C^\infty$-manifold. Then we have the following isomorphism:
$$H^i_{\mathrm{DR}}(M) \cong H^i(M, \widetilde{\mathbb{R}}),$$
where $\widetilde{\mathbb{R}}$ is the constant sheaf on $M$.

This follows from the constructions above. For more details see [Wed16].

The reason we care about this theorem is that it gives an analytic interpretation of singular cohomology. By Theorem 3.3.50, de Rham cohomology is isomorphic to singular cohomology. Therefore, de Rham cohomology encodes topological information about the manifold. Further, this isomorphism gives another way to compute sheaf cohomology.

This ends the chapter as well as the background material. We encourage the motivated reader to spend some time understanding the final sections here, as they are both technical and widely applicable. They will be useful in understanding Chapter 4, as well as some recent claims of computational neuroscientists on the construction of geometric frameworks for perceptual spaces via homology and cohomology.

Chapter 4

A Geometric Framework for Olfactory Learning and Processing

Abstract
We present a generalized theoretical framework for olfactory representation, learning, and perception using the theory of smooth manifolds and sheaves. This framework enables the simultaneous depiction of sampling-based physical similarity and learning-dependent perceptual similarity, including related perceptual phenomena such as generalization gradients, hierarchical categorical perception, and the speed-accuracy tradeoff. Beginning with the space of all possible instantaneous afferent inputs to the olfactory system, we develop a dynamic model for perceptual learning that culminates in a perceptual space in which qualitatively discrete odor representations are hierarchically constructed, exhibiting statistically appropriate consequential regions ("boundaries") and clear relationships between the broader and narrower identities to which a given stimulus might be assigned. Individual training and experience generate correspondingly more sophisticated odor identification capabilities. Critically, because these idiosyncratic hierarchies are constructed from experience, geometries that fix curvature are insufficient to describe the capabilities of the system. In particular, the use of a hyperbolic geometry to map or describe odor spaces is contraindicated.
The task of sensory systems is to provide organisms with reliable, actionable information about their environments. However, such information is not readily available; the environmental features that are ecologically relevant to an organism are rarely directly evident in primary receptor activation patterns. Rather, these representations of interest must be constructed from the combined signals of populations of sensory receptors. This construction process is mediated by sophisticated networks of neural circuitry that draw out different aspects of potentially important information from the raw input patterns. We previously have proposed that these interactions and transformations can be most effectively modeled as a cascade of successive representations [Cle14], in which each neuronal ensemble constructs its representation by sampling the activity of its antecedents.

The representational cascade that underlies odor recognition and identification is impressively powerful and compact. Olfactory bulb circuits impose an internally generated temporal structure on afferent inputs [LC13a, LC13b, KSUM99, BLFL06] while also regulating contrast [CS06], normalizing neuronal activity levels [CCH+11, BMA+15, CBC20], and managing patterns of synaptic and structural plasticity [CPdLCPL+16, Str09, GS09]. Transient periods of synchronization with postbulbar networks such as piriform cortex are likely to govern interareal communication [Fri15, FBB+16, Kay14], including feedback effects on bulbar plasticity [Str09, GS09]. The resulting perceptual system learns rapidly and is conspicuously resistant to retroactive and compound interference [HE96, SCT07]. Odors of interest also can be readily identified despite direct interference from simultaneously encountered competing odorants; this is a major unsolved problem in olfactory neuroscience, as competition for receptor binding sites by multiple odorant species profoundly degrades the odorant-specific receptor activity profiles on which odor recognition ostensibly depends. We have constructed olfactory circuit models that learn rapidly, resist retroactive interference, and exhibit robust recall under high Bernoulli-Gaussian noise (which models a combination of sampling uncertainty, innate stimulus variance, and high levels of unpredictable competitive interference from other ambient odors) using a strategy of successive recurrent representations shaped by prior learning [IC20]. The success of this approach accentuates the implications of the profound plasticity of the early olfactory system: odor representations, and the basic function of olfactory perception itself, are fundamentally and critically dependent on learning [WS03, WS06, RPS+15, MSN+11, KSS+14, Her05, AK18, AK20, LKA+].

Theoretical frameworks for understanding sensory systems include perceptual spaces and hierarchical structures. Both are founded on metrics of similarity [ZVM+13, ES12, She87, Cla19], though the former presumes an essentially continuous space of some dimensionality into which individual stimulus representations are deployed, whereas the latter presumes some degree of qualitative category membership for each such representation, with intercategory similarities potentially being embedded in the hierarchical proximities among categories. Perceptual spaces can be defined using a variety of metrics, including both physical metrics such as wavelength (color) or frequency (pitch) and perceptual metrics such as those revealed by generalization gradients [She87, CNB09, CMYL02] or by ratings on continuous scales by test subjects. Indeed, study of the transformations between physical and perceptual metric spaces is foundational to understanding sensory systems from this perspective [ZVM+13, Mei15, VRC17]. In contrast, hierarchical structures arise from perceptual categorization processes, though relationships among the resulting categories still may respect underlying similarities in the physical properties of stimuli (see
Discussion). Critically, it is categories that are generally considered to be embedded with associative meaning (categorical perception) [Har87, GH10, AR18]; a useful theoretical framework must concern itself with the construction of these categories with respect to the physical similarity spaces that are sampled during sensory activity. That is, along their representational cascades, sensory systems can be effectively considered to transition from a physical similarity space metric to a perceptually modified space, arising from perceptual learning and within which hierarchical categorical representations can be constructed.

Interestingly, the olfactory modality lacks a clear, organism-independent physical metric such as wavelength or pitch along which the receptive fields of different sensory neuron populations can be deployed (and against which the nonuniform sampling properties of the sensory system can be measured) [Cle14]. However, olfaction does provide an objective basis for an organism-dependent physical similarity space. In this framework, the activity of each odorant receptor type – e.g., each of the ∼400 different odorant receptors of the human nose or the > […] (R-space; see below) are linearly independent of one another, and (2) every possible profile of receptor activation, including any occluding effects of multiple agonists and antagonists competing for common receptors, is interpretable.

Linear independence among the dimensions of R-space is important for analytical purposes, but their orthogonality is irrelevant [Coo15]. This is a vital distinction, not least because orthogonality depends on the statistics of the chemosensory input space and hence cannot be uniquely defined as a property of the olfactory system per se. In principle, each receptor type should have regions of its receptive field that distinguish it from any other single receptor type, such that activation of a given receptor need not always imply activation of a particular different receptor (that is, no two dimensions will be identical). However, within any given sensory world, as defined by a finite set of odorant stimuli with established probabilities of encounter, there will be reliable activity correlations among many pairs of receptor types that can support substantial dimensionality reduction [HWK+ …] signal sparse [BFC17] – but, perhaps more importantly, the process of odor learning itself directly affects perceived olfactory similarity relationships within a context of learned generalization gradients [CNB09, CCH+ …].

In addition to dimensionality, the second fundamental property of a sensory space is its intrinsic geometry [ZVM+ …] odorant representations by perceptual learning into meaningful, cognitive odor representations to which meaning can be ascribed. Key features include the simultaneous depiction of sampling-based physical similarity and learning-dependent perceptual similarity within the perceptual space, a basis for the speed-accuracy tradeoff [FBT+17, RKG06, ZKU+13, ASC+ …] odor representations are hierarchically constructed through experience, exhibiting statistically appropriate consequential regions with probabilistic boundaries that reflect learned generalization gradients [CNB09, CCH+11, She87]. Critically, individual training and experience generate progressively more sophisticated hierarchies and concomitantly superior odor identification capabilities [RPS+ …].

The receptor space (R-space) comprising N receptor types can be depicted as an N-dimensional unit cube. Transformations arising primarily from initial post-sampling computations generate a modified receptor space termed R′; this space inherits the dimensionality of R-space but respects the nonuniform likelihoods of different state points within that space. The subsequent transformation from R′ to S-space ("scent space") reflects the perceptual and categorical learning processes that construct perceptual representations of meaningful odors.

(Diagram 4.1: the representational cascade R →B R′ → S, together with the vector bundle M over R′, its section ξ, the map ∆ from R′ into M, and the action of C∞(Rm).)

Formally, R is a unit parallelepiped defined by primary olfactory receptor activation levels. R′ denotes a subspace of normalized points, following glomerular processing, and is the image B(R). M is a vector bundle over R′ of rank m; ξ denotes the input to mitral cells following glomerular processing, comprising a sparsened, statistically conservative manifold; it is a section of the vector bundle M. S denotes the perceptual space, and is realized as a transformation of R′-space that embeds odor learning.

Importantly, this theoretical model is broadly independent of precisely where in the olfactory representational cascade these computations take place. However, we consider that the map B from R-space to R′-space reflects signal conditioning computations performed within the glomerular layer of the olfactory bulb [Cle14, CBC20], whereas the subsequent transformation into S-space is mediated by computations within the olfactory bulb external plexiform layer network [IC20], inclusive of its reciprocal interactions with deeper olfactory cortices. Briefly, we propose that the construction of categorical odor representations through statistical experience arises from learning-dependent weight changes between mitral cell principal neurons and granule cell interneurons in the external plexiform layer of the olfactory bulb. In this theory, plastic interactions between these two populations construct meaningful, categorical odor representations from the continuous, physical odorant representations of R′-space based upon individual experience.
To construct this theoretical S-space, and attribute to it the capacities of generalization, speed-accuracy tradeoff, and experience-dependent hierarchical categorization, we first build a transitional space M based on mitral cell activity representations, inclusive of the actions performed on these representations via their interactions with granule cell interneurons (Diagram 1). The resulting S-space does not, indeed cannot, admit a single geometry, because of the essential requirement for locally adaptable curvature. We describe this generative process in detail below.

R-Space

The first representational structure in olfaction is directly derived from the ligands of the physical odorant stimulus interacting with the set of chemoreceptive fields presented by the animal's primary odorant receptor complement. Both vertebrate and arthropod olfactory systems are based on large numbers of receptor neurons, each of which expresses one primary odorant receptor out of a family of tens (in Drosophila) to over 1000 (in mice, rats, and dogs). The axons of primary sensory neurons expressing the same receptor converge together to form discrete glomeruli across the surface of the olfactory bulb (in vertebrates; the arthropod analogue is the antennal lobe), enabling second-order projection neurons (mitral cells) to sample selectively from one or a few receptor types. The response of each receptor type to an odor stimulus constitutes a unit vector that can range in magnitude from nonresponsive (0) to maximally activated (1). A complete representational space for instantaneous samples of this input stream consequently has a dimensionality equal to the number of odorant receptor types N. That is, in a species with three odorant receptors, the space containing all possible instantaneous input signals would be a three-dimensional unit parallelepiped (depending on the original placement of the vectors in the space), whereas the R-space of a mouse expressing 1000 receptor types would comprise a 1000-dimensional unit space. As noted above, it is not necessary that these vectors be orthogonal, only that they be linearly independent [Coo15]; indeed, the orthogonality of these vectors cannot even be defined without reference to the statistics of the particular physical environment in which they are deployed.

Formally, R-space is defined as the space of linear combinations of these vectors with coefficients in (0, 1). Consider the space of all possible odorant stimuli in a species expressing N odorant receptor classes. Each odorant stimulus s* corresponds to a unique instantaneous glomerular response profile that can be represented as a vector s* ∈ R^N. Normalizing the activation in each glomerulus enables us to consider s* ∈ ∏_N (0, 1), the unit cube in N dimensions. Denote this receptor activation-based representational space R. Because the tangent space at all points is T_x R ≅ R^N, R has dimension N as a manifold.

By considering a product of spaces, we are assuming that the responses of different glomeruli are orthogonal. In the greatest generality, we would need to consider points on a unit parallelepiped generated by the glomeruli. We can apply an invertible linear transformation (namely the matrix generated by the Gram-Schmidt process) to this parallelepiped to generate a cube (and vice versa); this is a mathematical formalism and does not affect the particulars of the situation. Consequently, for the remaining sections, we can assume without loss of generality that R = ∏_N (0, 1).

R′-Space

The first computational layer of the olfactory bulb – the glomerular layer – computes a number of transformations important for the integrity and utility of odor representations, including contrast enhancement [CS06] and global normalization [CCH+11, BMA+ …] R-space; for example, global feedback normalization in the deep glomerular layer ensures that the points at which most or all of the vectors have very high values will be improbable. The outcome of this transformation is represented as R′, essentially a manifold embedded in R-space.

In addition to the systematically unlikely points in R that are omitted from the manifold R′, it is also the case that, under natural circumstances, most of the possible sensory stimuli s* that could be encountered in R′ actually never will be encountered in an organism's lifetime. That is, odor representations within R-space are signal sparse [BFC17]. Moreover, we argue that odor sources s* are discrete, but inclusive of variance in quality and concentration, and hence constitute volumes (manifolds) within R′. To account for this, we denote this variance by s* = (x, U_x), where x ∈ R′ and U_x denotes an n-tuple of variances (i.e., one variance for each dimension of freedom in R′). That is to say,
U_x = (σ_1, ..., σ_n).
From this we arrive at the following definition:
Definition 4.2.1. A pair (x, U_x) constitutes an odor source volume in R′ if U_x ≠ (0, ..., 0) and (x, U_x) = s* for some odorant s*.

That is, an odor source volume corresponds to a manifold within R′ that comprises the population of odorant stimulus vectors arising from the range of variance in receptor activation patterns exhibited by a particular, potentially meaningful, odor source. This includes variance arising from nonlinearities in concentration tolerance mechanisms that cannot be completely avoided [CCH+11] as well as genuine quality variance across different examples of a source. For example, the odors of oranges vary across cultivars and degrees of ripeness; the odors of red wines vary across grape cultivars, terroir, and production methods. The source representation in R′ thereby corresponds to an odor source (e.g., orange, red wine), inclusive of its variance, and delineates the consequential region of the corresponding odor category that will be developed via perceptual learning. Critically, it is not important at this stage to specify multiple levels of organization within odor sources (e.g., red wine, resolved into Malbec, Cabernet, Montepulciano, etc., then resolved further by producer and season); it is the process of odor learning itself that will progressively construct this hierarchy of representations at a level of sophistication corresponding to individual training and experience.

M-Space

The transformation from R′ to S-space depicted in Diagram 1 is mediated by the interactions of mitral and granule cells. In this framework, mitral cells directly inherit afferent glomerular activity from R′ (Diagram 1, ∆), but their activity also is modified substantially by patterns of granule cell inhibition that, via experience-dependent plasticity, effectively modify mitral cell receptive fields to also incorporate higher-order statistical dependencies sourced from the entire multiglomerular field. (A simplified computational implementation of this constructive plasticity is presented in the learning rules of Imam and Cleland, 2020.)
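Definition 4.2.1 can be sketched in code. In the toy model below, the class name, the use of standard deviations, and the axis-aligned "box" membership rule are our own illustrative assumptions; the text itself only requires that an odor source be a centre x in R′ together with a nonzero variance tuple U_x.

```python
from dataclasses import dataclass

# Hypothetical sketch of an odor source volume (x, U_x) in R'.

@dataclass(frozen=True)
class OdorSource:
    x: tuple   # centre point in R' (normalized receptor activations)
    U: tuple   # per-dimension variances (sigma_1, ..., sigma_n)

    def is_valid(self):
        # U_x must match the dimension of x and be nonzero
        return len(self.U) == len(self.x) and any(s > 0 for s in self.U)

    def contains(self, y, k=2.0):
        # Crude consequential region: within k standard deviations of x
        # in every receptor dimension (an axis-aligned box around x).
        return all(abs(yi - xi) <= k * si ** 0.5
                   for yi, xi, si in zip(y, self.x, self.U))

# e.g. an "orange" source, tightly constrained in receptors 1 and 3
orange = OdorSource(x=(0.7, 0.2, 0.5), U=(0.01, 0.04, 0.01))
```

Samples near the centre fall inside the source volume, while distant receptor profiles fall outside it, which is the behavioral content of a consequential region.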
This is depicted in Diagram 1 as an effect C∞(Rm) of a mitral cell product space M which contributes to the construction of S, in order to highlight the smooth deformations of R′ into S via passage to M.

The effects of mitral cell interactions, arising from experience, are modeled locally as a product space M based on the principle that each glomerulus – corresponding to a receptor type in R′ – directly contributes to the activity of some number of distinct mitral cells. In the mammalian architecture (shared by some insects, including honeybees), mitral cells receive direct afferent input from only a single glomerulus, such that the afferent activity in each mitral cell (or group of sister mitral cells) corresponds directly to a single receptor type. In this "naive" case, M-space is globally a product. To formalize this, we label the glomeruli g_1, ..., g_q. To each, we associate the number of mitral cells to which it projects, denoted m_i ∈ Z_{>0}. Let k = ∑_{i=1}^q m_i. Then, the naive space constructed from these data is
R′ × R^k = {(r, v) : r ∈ R′, v ∈ R^k}.
The interpretation of this space is as follows: to each point in R′, we can associate a vector that is an identifier for how subsequent mitral-granule cell interactions in the olfactory bulb will transform the input in service to identifying it as a known percept. The manifolds associated with particular odor source volumes in R′ will, owing to experience-dependent plasticity, come to exhibit related vectors that, in concert, manifest source-associated consequential regions. These regions reflect categorical perceptual representations and are measurable as odor generalization gradients.
Simplified computational implementations have depicted these acquired representations as fixed-point attractors, tolerant of background interference and sampling error but lacking explicit consequential regions [IC20].

We refer to this space as naive because it is globally a product space only for the mammalian architecture, in which the dimensionality of mitral cell output m (the number of distinct mitral cells, grouping sister mitral cells together) is identical to that of glomerular output k. However, this network architecture is not general; in nonmammalian tetrapods, for example, individual mitral cells may sample from more than one glomerulus [MNS81a, MNS81b]. This introduces a twist into the product space and ruins the naive structure, as m now can be less than k. In this general case where m ≤ k, the mitral cell space becomes a rank m vector bundle R^m ↪ M →π R′ over R′. Nevertheless, it can be depicted locally as a product space because vector bundles are locally trivializable: given any odor source volume (x, U_x), either there exists a subset U′ ⊂ U_x such that π^{-1}(U′) ≅ U′ × R^m, or U′ ⊇ U_x, in which case we can restrict to U_x itself and π^{-1}(U_x) ≅ U_x × R^m is a trivial bundle over the base.

For simplicity, we here analyze the mammalian architecture case. In this architecture, the vector bundle is trivial because m = k; no mitral cells innervate multiple glomeruli, and there is no possible twisting of the fibers. Therefore, in mammals, M is globally a product space,
M = R′ × R^m,
rendering M a smooth manifold with the convenient property that to every input x ∈ R′ we associate a point (x, v), where v is a vector whose i-th component is the value of the output of the i-th mitral cell. Formally, we say that M is a (trivial) vector bundle over R′ with fibre R^m.
Then, the smooth maps which send $x \mapsto (x, v)$, such that composition with projection onto the first coordinate is the identity, are called global smooth sections of the bundle, and the set of these is denoted $\Gamma(R', M)$. To any smooth manifold $P$, we can associate the ring of smooth functions
$$C^\infty(P) = \{f : P \to \mathbb{R} : f \text{ is smooth}\}.$$
To any open subset $U \subseteq P$, we have a restriction map $\mathrm{Res}^P_U : C^\infty(P) \to C^\infty(U)$. In general, if $U \subseteq P$ is open, then $\Gamma(U, E)$ is a $C^\infty(U)$-module for any bundle $\pi : E \to P$. $C^\infty(-)$ makes $P$ into a locally ringed space, and $\Gamma(-, E)$ is a sheaf of $C^\infty(-)$-modules.

$S$-Space

$S$-space, or scent space, is a constructed perceptual space tasked with preserving physical relationships among odorants while also embedding the transformations arising from perceptual learning, specifically including those forming incipient categorical odors. To do this, we embed $R'$ into a higher-dimensional space $S$ (with dimension $N+1$) by growing $U_x$ in the positive $(N+1)$th direction around odor source volumes in $R'$, which does not affect distance relationships in $\mathbb{R}^N$ (Figure 4.1A). (Discrimination training also can grow $U_x$ in the negative $(N+1)$th direction.) To quantify this transformation, we construct two distance metrics, $d_{\mathrm{phys}}$ and $d_{\mathrm{per}}$, on $S$.

Definition 4.2.2.
Let $x, y \in S$ be two points. We define the physical metric between the two points as the Euclidean distance between their projections in $\mathbb{R}^N$. In notation,
$$d_{\mathrm{phys}}(x, y) = \|\pi_{\mathbb{R}^N}(x) - \pi_{\mathbb{R}^N}(y)\|.$$
This metric reflects the physical similarities of the objects in the receptor space, which are not affected by perceptual learning (i.e., by distension in the $(N+1)$th dimension).

Definition 4.2.3.
Let $x, y \in S$. Consider $x$ and $y$ as vectors in $\mathbb{R}^{N+1}$. Then, let $\gamma : [0, 1] \to S$ be the curve defined by $\gamma(0) = x$, $\gamma(1) = y$, and $\pi_{\mathbb{R}^N}(\gamma'(t)) = w \cdot [\pi_{\mathbb{R}^N}(\gamma(1) - \gamma(0))]$, with $w$ some real number dependent on $t$. The perceptual metric,
$$d_{\mathrm{per}}(x, y) = \int_0^1 \|\gamma'(t)\|\, dt,$$
is the arc-length along the surface of $S$ between the points $x$ and $y$ (Figure 4.1A). Notice that $\pi_{\mathbb{R}^N}(\gamma')$ is well defined, as $S \hookrightarrow \mathbb{R}^{N+1}$ and thus the tangent space $T_{\gamma(t)}S \subseteq T_{\gamma(t)}\mathbb{R}^{N+1} = \mathbb{R}^{N+1}$.

The relationship between these two metrics tracks the changes in $S$ induced by the construction of odor representations; specifically, $d_{\mathrm{per}}$ reflects experience-dependent changes in the perceptual distance between $x, y \in S$ that are excluded from the $d_{\mathrm{phys}}$ metric (Figure 4.1A). Learning about an odor source $(x, U_x)$ progressively distends the volume (in $\mathbb{R}^N$) in the $(N+1)$th dimension above $R'$. That is, over time, the breadths (in each of the $N$ dimensions) of the distension into the additional ($(N+1)$th) dimension will come to reflect the actual variances $U_x$ of the odor source $s^* = (x, U_x)$ as naturally encountered. The quasi-discrete distensions formed in the additional dimension correspond to incipient categories – i.e., categorically perceived odors – and their breadths and gradients can be measured behaviorally as generalization gradients [CNB09, CCH+]. Each dimension of the variance $U_x = (\sigma_1, \ldots, \sigma_n)$ in $R'$ is independent; that is, different samples of a given natural odor source may vary substantially in some aspects of quality but not others, where an aspect of quality refers to the relative levels of activation of a given odorant receptor type (Figure 4.1B).

Formally, to construct the perceptual space $S$ in such a way that there exists a perceptual metric $d_{\mathrm{per}}$ that interacts with the natural physical metric $d_{\mathrm{phys}}$ of $R'$, we consider the embedding $R' \hookrightarrow \mathbb{R}^{N+1}$. The open neighborhoods for each odor source volume define open sets in the subspace topology.
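A numerical sketch of the two metrics for $N = 1$ (the landscape functions and all names here are hypothetical, introduced only for illustration): $d_{\mathrm{phys}}$ compares projections to $\mathbb{R}^N$, while $d_{\mathrm{per}}$ approximates the arc length along the graph of a distension; learning increases the latter but not the former.

```python
import math

def d_phys(x, y):
    """Euclidean distance between the projections to R^N (here N = 1)."""
    return abs(x - y)

def d_per(f, x, y, steps=10_000):
    """Polygonal approximation of the arc length of the graph of f between
    x and y: the perceptual metric along the surface of S for N = 1."""
    a, b = min(x, y), max(x, y)
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t0, t1 = a + i * h, a + (i + 1) * h
        total += math.hypot(t1 - t0, f(t1) - f(t0))
    return total

flat = lambda t: 0.0                               # unlearned: R' is flat
learned = lambda t: math.exp(-20 * (t - 0.5) ** 2)  # hypothetical distension

print(d_phys(0.0, 1.0))          # 1.0
print(d_per(flat, 0.0, 1.0))     # ~1.0 (no distension: d_per = d_phys)
print(d_per(learned, 0.0, 1.0))  # > 1.0 (distension lengthens d_per only)
```

The same `d_phys` value is returned before and after "learning," which is the sense in which the physical metric is unaffected by plasticity.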
If we embed $R'$ by the canonical inclusion $\mathbb{R}^N \to \mathbb{R}^{N+1}$, then $R'$ is flat in $\mathbb{R}^{N+1}$ because the final coordinate of its elements is 0. Therefore, we can consider transformations of $R'$ that smoothly vary the final coordinate. For each transformation $f$, denote the resulting space as $S := S(f)$. This constitutes the evolving perceptual space.

Figure 4.1: Depictions of $S$-space in the cases of $N = 1, 2$. (A) Three distinct odors in $S$-space in the case of $N = 1$. Going left to right, the first odor is highly learned, with many distinct sub-odors. Further, it is decorated with a distinction of a specific odor and the time axis. Per the discussion below, each red dotted line represents the formation of equivalence classes of odors at a given time. As time increases, specificity increases, and this is reflected in the diagram. The second odor is overall less learned than the first, yet the first two sub-odors are known to be distinct, as shown by the large valley between them. The third odor is poorly learned. (B) After learning has occurred, a valley has been created between the two sub-odor classes in the second odor. As the valley extends below the original line, we know that these two sub-odors are perceptually very different. (C)-(D) Depictions of a part of $S$-space for the case $N = 2$. Various amounts of learning have generated the landscapes presented.

Define the map $\Delta : R' \to S$ as the distension of $R'$ in the $(N+1)$th direction; it involves $M$ and $R'$ simultaneously, and is a diffeomorphism trivially. To better understand the map $\Delta$, we here construct it as the composition of maps among the spaces already described, specifically showing how the (acquired) properties of $M$ govern the mapping of $R'$ to $S$. The map $B : R \to R'$ reflects glomerular-layer transformations as described above. For a fixed smooth section $\xi : R' \to M$ (which always exists by the triviality of $M$), we generate Diagram 2 (an elaboration of Diagram 1),
$$R \xrightarrow{\;B\;} R' \xrightarrow{\;\xi\;} M \xrightarrow{\;\mathrm{id}_{R'} \times f\;} S, \qquad \Delta(f) : R' \to S, \tag{4.2}$$
where $\Delta(f)$ is defined to be the map that makes the diagram commute. Note that $\Delta$ depends on $f$, and, therefore, so does $S$. That is, $S$ depends on the functions $\mathbb{R}^m \to \mathbb{R}$ from $M$, which are smooth. To allow for ongoing plasticity, it is more correct to denote the perceptual space as $S := S(f)$; however, as it will always be clear from context whether or not $f$ is fixed, we will simply refer to it as $S$. The map $\mathrm{id} \times f$ reflects the fact that $R' \subseteq \mathbb{R}^{N+1}$, and by construction $x_{N+1} = 0$ for $x \in R'$. As $M = R' \times \mathbb{R}^m$, it follows that a dense set of maps $M \to \mathbb{R}^{N+1}$ which are the identity on $R'$ can be split as maps $i : R' \to \mathbb{R}^N$ and $f : \mathbb{R}^m \to \mathbb{R}$. Therefore, because $\mathrm{id}_{R'} \times C^\infty(\mathbb{R}^m) = C^\infty(\mathbb{R}^m)$, we abbreviate the collection of all maps $M \to S$ as $C^\infty(\mathbb{R}^m)$, as depicted in Diagram 1.

The outcome of these transformations is a formal definition for the construction of categorical odor representations in $S$:

Definition 4.2.4.
Let $(x, U_x)$ be an odor source volume in $R'$. We denote the image of this volume in $S$ as $(x, \widetilde{U}_x)$. This image denotes an odor representation, also referred to as an odor percept, or simply an odor.

The construction of odor representations $(x, \widetilde{U}_x)$ in $S$ enables the depiction of learning as a geometric object, naturally encompassing the transition between the physical and perceptual space depictions of the olfactory landscape and illustrating the construction of meaningful categorical odor representations based on individual experience. As we describe below, these odor representations admit hierarchy and exhibit the advantages of categorical perception. However, they remain continuous in $S$, with consequential regions that are not discretely delimited; i.e., olfactory perceptual categorization is ultimately heuristic. This affords some powerful advantages. For example, it provides a natural basis for behaviorally observed odor generalization gradients [LH99, CMYL02, CNB09, CCH+], in which the variance $U_x$ of the odor source indicates that different samples fall within a common, relatively broad distribution with shared implications [CCH+], while discrimination learning instead increases the perceptual distance $d_{\mathrm{per}}$ between them. Each of these distinct and specialized modes of learning is considered to transform the plasticity-dependent distensions into dimension $N+1$. Distensions into the $(N+1)$th dimension of $S$-space will be variously persistent, either fading back towards flatness with a given time constant or enduring indefinitely, according to learning-dependent temporal tags that are not explicitly discussed herein.

The geometry of local plasticity
Plasticity in neural systems in general, and in the olfactory bulb in particular, is locally governed. Changes in cellular and synaptic functional properties rely substantially on the synaptic interactions of directly connected neurons and the locally regulated release of neurochemicals. These local effects, coordinated by sophisticated network interactions, collectively generate global systemic performance at the network level. The present odor learning framework also arises from localized plasticity: distensions into the additional ($(N+1)$th) dimension of $S$ arise from learning the activity profiles of individual sensory inputs, and are not globally governed (specifically, we argue that this arises from learned patterns of granule cell feedback onto mitral cells in the olfactory bulb; for a simplified computational implementation of this process, see [IC20]). However, to characterize the functionality of the olfactory system as a whole, it is necessary to formally glue such local plasticity operations together, along with any relevant global processes, within a single analytical framework. To do this, we employ the theory of sheaves [Wed16].

Sheaves enable localized learning
We formally consider the local actions of granule cells onto mitral cells, and their concomitant modification of mitral cell output, as follows, considering that these actions may rely both on afferent sensory information and on additional inputs delivered onto granule cells by piriform cortex and other association cortices [IS98]. Recall from the previous section that for any vector bundle $\pi : E \to P$, we generate $(C^\infty(-), \Gamma(-, E))$, a pair of sheaves on $P$ such that $\Gamma(U, E)$ comes equipped with an action of $C^\infty(U)$ for all open $U \subseteq P$. We here define an analogous pairing of sheaves to describe the modification by granule cells of afferent information contained in the mitral cell ensemble. The first step in this definition is to define a functor
$$\mu : \mathcal{T} \to \mathcal{R}^m,$$
where $\mathcal{T}$ is the category defined by the topology on $R'$, and $\mathcal{R}^m$ is the category whose objects are linear subspaces of $\mathbb{R}^m$ and whose morphisms are
$$\mathrm{Mor}_{\mathcal{R}^m}(U, V) = \begin{cases} \emptyset & U \not\subseteq V \\ \{\ast\} & U \subseteq V. \end{cases}$$
To describe what this functor does, we need to turn to the anatomy of the bulb. For a given odorant, the induced signal passed from glomeruli to mitral cells may not excite some mitral cells. This corresponds to the situation where $\xi(s) = (s, v)$ and $v$ has some coordinates equal to 0. The non-zero coordinates form a basis for some subspace of $\mathbb{R}^m$. Let $n(\xi, s)$ be the number of non-zero coordinates in $v$. Let $O$ be any open subset of $R'$. Then
$$\mu(O) = \mathbb{R}^\ell, \quad \text{where } \ell = \max\{n(\xi, p) : p \in O\}.$$
Composing $\mu$ and $C^\infty$ and using the sheaf condition of $C^\infty$, we conclude that $C^\infty(\mu(-)) \in \mathrm{Sh}(R')$. Now, we define $G(-)$ as a flabby (flasque) sheaf of rings on $R'$ which acts on $C^\infty(\mu(-))$. This action is precisely the interaction of local inhibition on mitral cells, and in particular on those mitral cells that are activated by a given odorant stimulus. This makes $C^\infty(\mu(-))$ a $G$-module (as sheaves).

Localized discrimination learning
Learning about an odor is generally modeled as growing a distension into the additional ($(N+1)$th) dimension of $S$, with the breadths of the distension across its $N$ dimensions ultimately reflecting the physical profile of quality variance $U_x$ associated with the corresponding odor source $s^* = (x, U_x)$. This category-construction framework can be considered common to diverse forms of odor learning (e.g., nonassociative, reinforcement), despite their differences in other properties as noted above. However, explicit discrimination learning – in which animals are rewarded for distinguishing physically similar odorants from one another by associating them with different outcomes – requires that these distensions into the additional dimension also be locally retractable, so as to reduce or eliminate the similarity-based categorical overlap that may exist between the odor source volumes a priori. This is particularly important given the remarkable olfactory discrimination capabilities exhibited by appropriately trained animals [MBB19].

Consider two physically similar odorants $s^* = (x, \widetilde{U}_x)$ and $t^* = (y, \widetilde{U}_y)$ in $S$. Because the early stages of odor learning are characterized by broadened generalization gradients [CNB09], presumably reflecting sampling uncertainty, odor representations (distensions in $S$) at this stage are likely to overlap: $\widetilde{U}_x \cap \widetilde{U}_y \neq \emptyset$. This is appropriate, given the likelihood (prior to discrimination training) that two highly similar odor stimuli, sampled in close succession, simply constitute two samples from the same odor source volume.
However, discrimination training is capable of rapidly and strongly separating highly similar odors, and the between-category separation principle of category learning [PGJSTH19] indicates that we need to move them further apart than they would be prior to learning. Hence, discrimination learning needs to be able not only to retract distensions to zero, but to expand them in the negative direction if need be (see Figure 4.1B).

To do this, we construct a map that decreases only those values of $f$ which are sufficiently close (within some small $\varepsilon > 0$) to a distance-minimizing path $\gamma$ connecting $x$ and $y$. Its existence follows from the existence of smooth bump functions on $M$. Fix $f \in C^\infty(\mathbb{R}^m)$ so that $S = S(f)$. We consider functions $\alpha \in C^\infty(\mathbb{R})$. Then, by defining the learning operation as $S \mapsto S(\alpha \circ f)$, we have a realization of this transformation of learning two odors apart. In fact, what we have done here is defined a $\widetilde{C^\infty(\mathbb{R})}$-module structure on $C^\infty(\mu)$. Therefore, by considering only the interaction of $\alpha$ and $f$ over $\gamma$, we have reduced the problem of discrimination learning to a 1-dimensional problem, depicted in Figure 4.1A-B. The map resulting from discrimination learning lengthens the perceptual metric $d_{\mathrm{per}}$ between two similar odor source volumes, partitioning and expanding the previously shared space between the two representations so as to arbitrarily increase their perceptual dissimilarity, all without altering the physical distance $d_{\mathrm{phys}}$ between their centers. Importantly, discrimination learning inherently depends on at least two odor sources, so it can be targeted even more specifically between them. In high-dimensional space, we can separate two such sources nearly arbitrarily without affecting similarity relationships among other nearby odor representations. This cannot be depicted in our lower-dimensional plots, as the number of dimensions is too small for all of the odors to be essentially independent.

Remark 4.2.5.
Based on the construction above, we can take $\widetilde{C^\infty(\mathbb{R})}$ to be a rough approximation of $G$ as a sheaf. We cannot conclude that they are precisely equal, as this would need more analysis, which we have not presented here.

Putting all of this together, we arrive at the final (for now) version of the model. We now have $R, R', M, G, S$ and can complete the picture of the model (reference Diagram 4.1). The appearance of $G$ and $C^\infty(\mu)$ encodes the local-to-global transformations of granule cells and their interaction with the maps $M \to S$ which preserve $R'$:
$$R \xrightarrow{\;B\;} (R', G, C^\infty(\mu)) \xrightarrow{\;\Delta(f)\;} S, \qquad \xi : R' \to M = R' \times \mathbb{R}^m, \qquad \mathrm{Id}_{R'} \times f : M \to S. \tag{4.3}$$
All together, this diagram encodes everything which we have constructed above and the relations of the various spaces.

4.2.6 The construction of hierarchical odor categories

The last original part of this section is the construction of hierarchical categories from the continuous spaces we have built above. The surprising advantage of the process above is that it gives a geometric interpretation of the speed-accuracy tradeoff for identifying odors in the wild.

Suppose now that we need to identify a given odor. For example, a fox in the wild may be hunting an animal and tracking it by scent, or a human may be trying to discern a specific spice in a dish while at a restaurant. What is the mathematical interpretation of such a situation, and how does the model deal with this interpretation? We first view each peak as a continuous categorization for that stimulus (this is the image of a fully learned system). For instance, we may have a peak defined for "oranges". As we move up the peak, we refine the categorization. Here, refinement means entering a subcategory. From the discussion above, we know that the peak will be parsed into a variety of sub-peaks which correspond to physically similar but perceptually different types of orange.
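The peak-refinement idea can be sketched numerically (a hypothetical sampled landscape for $N = 1$; the function `components` is introduced here purely for illustration): thresholding the landscape at increasing heights splits one coarse category into nested subcategories.

```python
# Sketch (not from the text): hierarchical categorization by thresholding a
# hypothetical N = 1 landscape sampled on a grid. Intersecting with the
# half-space x_{n+1} >= t and taking connected components yields categories;
# raising t refines each category into subcategories.

def components(samples, t):
    """Connected components, as index intervals, of {i : samples[i] >= t}."""
    comps, start = [], None
    for i, v in enumerate(samples):
        if v >= t and start is None:
            start = i
        elif v < t and start is not None:
            comps.append((start, i - 1))
            start = None
    if start is not None:
        comps.append((start, len(samples) - 1))
    return comps

# One peak with two subpeaks separated by a shallow valley.
landscape = [0.0, 0.2, 0.9, 0.5, 0.8, 0.1, 0.0]

coarse = components(landscape, 0.15)   # one broad category: [(1, 4)]
fine = components(landscape, 0.6)      # two subcategories: [(2, 2), (4, 4)]
```

Each fine component lies inside a coarse component, which is the tree (partial-order) structure described below.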
Pictured below is a complex of categories, ordered by inclusion:
Citrus Fruit ⊇ Oranges ⊇ Ripe Oranges ⊇ Ripe Valencia Oranges
Although this example is linearly ordered, there is no need for there to be only one chain of inclusions. Every peak can break up into at most finitely many distinct subpeaks, and thus the decomposition can become arbitrarily complicated. Now we shall construct the categorization by successively taking intersections with an affine hyperplane (see Figure 4.1(A) for an illustration in the case $N = 1$). Suppose $P$ is a peak, determined by some odor source volume $(x, U_x)$, with several subpeaks $\{P_i\}_{i \in I}$. Then, as each sub-peak has a boundary, we can define the minimum value attained in $P_i$. Let $P^*_i \subseteq P_i$ be the subset consisting of all points of $P_i$ with minimal $x_{n+1}$ value. Let $H = \{x \in \mathbb{R}^{n+1} : x_{n+1} = 0\}$ be a hyperplane in $\mathbb{R}^{n+1}$ and define $H_t = H + (0, 0, \ldots, 0, t)$. This is an affine transformation of $H$ and geometrically is the translation of $H$ in the $(n+1)$th direction.

Lemma 4.2.6. $P^*_i = P_i \cap H_t$ for some $t > 0$. Further, if $n \geq 2$, $P^*_i$ is connected.

Proof. Let $t^*$ be the $(n+1)$th coordinate of all elements in $P^*_i$. Then by construction
$$P^*_i \subseteq P_i \cap H_{t^*}.$$
For the reverse inclusion, let $y \in P_i \cap H_{t^*}$. Then $y \in P_i$ and $y_{n+1} = t^*$, and therefore $y \in P^*_i$. Hence, $P^*_i = P_i \cap H_{t^*}$. The connectedness of $P^*_i$ follows immediately from the fact that $P^*_i = \partial P_i$, the boundary, and that $P_i$ is homeomorphic to $D^n$, the $n$-dimensional disk. For $n \geq 2$, $\partial D^n = S^{n-1}$ and is thus connected.

Using this lemma, we can now define a coarse categorization of $S$. Let $t \in (0, 1)$ be arbitrary. By the previous lemma, we know that if we consider $H_t \cap S$ we will get disjoint connected subsets of $S$. So, consider the closed half space $H^*_t = \{x \in \mathbb{R}^{n+1} : x_{n+1} \geq t\}$. Then $\partial H^*_t = H_t$, and $H^*_t \cap S$ is also a collection of disjoint connected subsets of $S$. Let $\{S_{ti}\}_{i \in I_t}$ be an enumeration of these subsets by the set $I_t$. Now let $\mathcal{P}$ be a partition of $(0, 1)$. Then for each $p_j \in \mathcal{P}$ we have the associated collection $\{S_{p_j i}\}$ of subsets. We know by construction that for $j < j'$, $\{S_{p_j i}\} \supseteq \{S_{p_{j'} i}\}$. Therefore, we have built a method to break $S$ into discrete categories, organized in the local structure of a tree. Using this, we arrive immediately at a hierarchical categorization of odors which is solely dependent on the amount of information learned about a class of odors.

The method we have built above prioritizes the construction of a coarse categorization (partial order) from a geometric structure. One may ask if it is possible to proceed in the other direction, that is, to build a geometric structure out of some form of categorization. This approach has been attempted by many researchers, and in every case there is a fundamental assumption made which makes the model unhelpful and, in some cases, invalid. In [ZSS18] they make the claim that the human perceptual odor space (the analogue of $S$) is three dimensional and hyperbolic (constant negative curvature $-1$). Consider $TS^2$, where $S^2$ is the unit sphere in $\mathbb{R}^3$ and $TS^2$ is the tangent bundle. The Hairy Ball Theorem [EG79] tells us that $TS^2$ is not a trivial bundle, and yet we can always locally trivialize a vector bundle. Therefore, the local structure tells you little about the global structure. This is one of the reasons their conclusion was flawed. Their claim also hinged on the computation of some homology groups for certain simplicial complexes generated by "similarity matrices" and showing that the distributions of the ranks of these groups closely match simulation estimates for hyperbolic space. This would have worked, had they not stopped computing the homology in degree 3.
It does not take much thinking to concoct a graph (and thus a simplicial complex) whose homology groups are zero for $n = 1, 2, 3$ and are non-trivial in some higher degree (for example, iterated suspensions of two points yield simplicial complexes which are homotopy equivalent to spheres). This implies that the structure which they are trying to detect may have some higher-dimensional pieces. Simply not considering these (possibly because of the method used in [GPCI15]) leads to a false conclusion. Hence, the conclusion that the olfactory perceptual space is hyperbolic is simply unfounded. More interestingly, should the perceptual space be related to the physical odorant space at all, there is no possible way to have constant curvature! In this situation, when learning occurred, it would be impossible to preserve the physical metric and the perceptual metric simultaneously.

Chapter 5

Future Directions

This chapter will serve to present those ideas which we have not incorporated into the model but believe are useful. Most of these topics are central to any field of mathematics, and thus we should expect them to show up here too. Additionally, we close with a conjectural method to deal with noisy input odors and show its relation to some of the topics introduced in the first few chapters.
The representation theory of Lie groups is a fundamental field of mathematics. So fundamental, in fact, that one would be hard-pressed to find an area of mathematics in which it does not appear in the usual course of study. In this short chapter, we shall study one of the theorems which lie in the intersection of complex analysis and representation theory: the Borel-Weil theorem.
Theorem 5.1.1 (Borel-Weil). Let $K$ be a compact, connected Lie group and $T \subseteq K$ a maximal torus. Let $G = K_{\mathbb{C}}$ be the complexification and $B = MAN$ a Borel subgroup. Then the irreducible finite-dimensional representations of $K$ stand in one-to-one correspondence with the dominant, analytically integral weights $\lambda \in \mathfrak{t}^*$, with the correspondence given by
$$\lambda \mapsto \Gamma_{\mathrm{Hol}}(K/T, L_\lambda) \cong F^{\mathrm{Hol}}_{B, \chi_\lambda},$$
where $\Gamma_{\mathrm{Hol}}(K/T, L_\lambda)$ denotes the set of holomorphic sections of the bundle and
$$F^{\mathrm{Hol}}_{B, \chi_\lambda} = \left\{ f : G \to \mathbb{C} \;\middle|\; f(gb) = \chi_\lambda(b)^{-1} f(g),\ f \text{ holomorphic} \right\}$$
with $\chi_\lambda$ the character of $B$ associated to the analytically integral weight $\lambda$.

This was proven independently by Borel and Weil in [Ser54] and then by Harish-Chandra in [HC56]. The proof we shall present in Section 5.5 is a combination of those presented in [Kna88], [Kna86], and [Hel08]. As will be seen later, this theorem gives a geometric realization of a purely algebraic object and vice versa. Therefore, we may be able to apply similar methods to the model above and arrive at some striking consequences.

Recall that a smooth manifold is a second-countable, Hausdorff topological space equipped with an atlas of (smooth) $C^\infty$-charts $\{\varphi_U : U \to \mathbb{R}^n\}$ which are injective. Morphisms of smooth manifolds are smooth maps which are compatible with the atlases. Putting these two together, we get the category $\mathcal{M}$ of smooth manifolds. There is a functor $C^\infty(-) : \mathcal{M} \to \mathbb{R}\text{-}\mathbf{Alg}$ which assigns to any smooth manifold $M$ an $\mathbb{R}$-algebra $C^\infty(M) := \{f : M \to \mathbb{R} \mid f \text{ smooth}\}$, with addition and multiplication defined pointwise. For each $p \in M$ we can define $C^\infty_{M,p} := \varinjlim_{U \ni p} C^\infty(U)$.

Let $M$ be a manifold and $p \in M$ a point. We define the tangent space at $p$ to be $T_pM := \mathrm{Der}(C^\infty_{M,p})$. This becomes an $\mathbb{R}$-vector space if we equip it with addition. The collection of all tangent spaces is called the tangent bundle and is denoted $TM$.
This admits a smooth structure and becomes a smooth manifold of dimension $2\dim M$. The elements of $TM$ can be given as pairs $(p, v)$ where $p \in M$ and $v \in T_pM$. There is a canonical projection $\pi_M : TM \to M$, which is a smooth surjective submersion. A manifold is called parallelizable if $TM = M \times \mathbb{R}^n$ for $n = \dim M$.

A section of the canonical projection is a smooth map $f : M \to TM$ such that $\pi_M \circ f = \mathrm{id}_M$. The set of all smooth sections is denoted $\Gamma(M, TM)$ or $\mathfrak{X}(M)$ and can be identified with the collection of smooth vector fields on $M$. This has a natural structure as a $C^\infty(M)$-module.

Definition 5.1.2. A Lie Group is a group object in the category $\mathcal{M}$. More explicitly, it is a smooth manifold $G$ equipped with two operations: multiplication $G \times G \to G$, which is smooth, and inversion $(-)^{-1} : G \to G$, which is also smooth. A Lie group homomorphism is a smooth map which respects the group structure.

Let $G$ be a Lie group and $x \in G$. Then $x$ defines a smooth automorphism $L_x : G \to G$ such that $L_x(y) = xy$. An element $q \in \Gamma(G, TG)$ is called left-invariant if for all $x, y \in G$ we have
$$T_yL_x(q_y) = q_{xy},$$
where $T_yL_x : T_yG \to T_{xy}G$ is the tangent map. The space of all left-invariant vector fields on $G$ will be denoted $\mathfrak{X}_L(G)$.

Definition 5.1.3. A Lie Algebra is a vector space $V$ equipped with an alternating, bilinear form $[-,-] : V \times V \to V$, called the Lie bracket, satisfying the Jacobi identity
$$[X, [Y, Z]] + [Z, [X, Y]] + [Y, [Z, X]] = 0.$$
A Lie algebra homomorphism is a linear map $T : \mathfrak{g} \to \mathfrak{h}$ such that $T([X, Y]) = [T(X), T(Y)]$, where the first bracket is in $\mathfrak{g}$ and the second is taken in $\mathfrak{h}$.

Lemma 5.1.4.
The map $(-)_e : \mathfrak{X}_L(G) \to T_e(G)$ is a vector space isomorphism. Further, we may endow $\mathfrak{X}_L(G)$ with the operation $[X, Y] = XY - YX$; this makes $\mathfrak{X}_L(G)$ a Lie algebra over $\mathbb{R}$. Further, $(-)_e$ respects the bracket operation and gives $T_e(G)$ the structure of a Lie algebra.

Proof. The map has an inverse sending $X \in T_e(G)$ to the left-invariant vector field defined by $Xf(x) = X(L_{x^{-1}}f)$, where $L_{x^{-1}}f(y) = f(xy)$. The fact that this map respects the Lie bracket is obvious.

Corollary 5.1.5. $TG \cong G \times T_e(G)$.

Proof.
Every basis of $T_e(G)$ extends to a set of global left-invariant vector fields, and hence $G$ is parallelizable.

For the remainder of this section we shall denote Lie algebras by the corresponding lower-case gothic letters; that is, if $G$ is a Lie group, then its Lie algebra is $\mathfrak{g}$.

Example 5.1.6.
(a) Let $G = \mathbb{R}^n$ together with addition. This is a Lie group with Lie algebra $\mathbb{R}^n$. In general, any finite-dimensional real vector space is non-canonically isomorphic to $\mathbb{R}^n$ for some $n$ and therefore carries a smooth manifold structure, and therefore a Lie group structure.
(b) Recall that a matrix is invertible if $\det X \neq 0$. Then $GL_n(\mathbb{R})$ (resp. $\mathbb{C}$) is the collection of all invertible $n \times n$ matrices with entries in $\mathbb{R}$ (resp. $\mathbb{C}$). It is called the General Linear group. This is an open subset of $M_n(\mathbb{R})$ (resp. $\mathbb{C}$) and therefore carries an obvious manifold structure. In fact, matrix multiplication and matrix inversion are smooth operations. This makes $GL_n(\mathbb{R})$ (resp. $\mathbb{C}$) a real Lie group of dimension $n^2$ (resp. $2n^2$). Its Lie algebra is $\mathfrak{gl}_n(\mathbb{R}) = M_n(\mathbb{R})$ (resp. $\mathbb{C}$).
(c) Define the operation $(-)^* : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ by $X \mapsto \overline{X}^T$. The matrix $X^*$ is called the adjoint matrix to $X$. Let $U(n) \subseteq GL_n(\mathbb{C})$ be the set of matrices such that $X^*X = I_n$, the $n \times n$ identity matrix. This is the Unitary group and is a closed subgroup of $GL_n(\mathbb{C})$, and thus inherits a Lie group structure. To find its dimension we pass to the Lie algebra $\mathfrak{u}(n)$. An easy computation shows that $\mathfrak{u}(n)$ consists of all skew-hermitian matrices ($X^* = -X$), and thus $\dim \mathfrak{u}(n) = n^2$. Further, $U(n)$ is a real Lie group. To see this, see what happens when we take $i\mathfrak{u}(n)$.
(d) Let $S^1$ be the circle embedded as a submanifold of $\mathbb{C}$. Then $S^1$ carries a Lie group structure by writing its entries in polar coordinates. Define the Torus $T^n = \prod^n S^1$. This carries a natural Lie group structure under component-wise multiplication. Its Lie algebra is $i\mathbb{R}^n$.

Lie algebras are significantly easier to deal with than Lie groups because they are essentially generalized vector spaces. Therefore, we want to understand the structure of various types of Lie algebras so that we may possibly deduce some information about the associated Lie group.
Definition 5.1.7. A Lie subalgebra (normally shortened to simply subalgebra) of a Lie algebra $\mathfrak{g}$ is a vector subspace $\mathfrak{h}$ such that $[\mathfrak{h}, \mathfrak{h}] \subseteq \mathfrak{h}$, where the bracket of Lie algebras is shorthand for the set of all $[X, Y]$. An ideal of $\mathfrak{g}$ is a subset $\mathfrak{i}$ such that $[\mathfrak{g}, \mathfrak{i}] \subseteq \mathfrak{i}$. A subalgebra $\mathfrak{a}$ is called abelian if $[\mathfrak{a}, \mathfrak{a}] = 0$.

We will denote ideals of $\mathfrak{g}$ as $\mathfrak{i} \trianglelefteq \mathfrak{g}$ and subalgebras as $\mathfrak{h} \subseteq \mathfrak{g}$. Notice that $[\mathfrak{i}, \mathfrak{g}] \subseteq \mathfrak{i}$ is equivalent to the definition given above, as this amounts to putting a negative sign everywhere, but $-\mathfrak{i} = \mathfrak{i}$.

Proposition 5.1.8. If $\mathfrak{g}$ is a Lie algebra and $\mathfrak{i}$ is an ideal, then $\mathfrak{g}/\mathfrak{i}$ has the structure of a Lie algebra.

Proof. As a set, $\mathfrak{g}/\mathfrak{i}$ is simply the vector space quotient. To show that the Lie bracket descends to the quotient, we consider two classes $X + \mathfrak{i}, Y + \mathfrak{i} \in \mathfrak{g}/\mathfrak{i}$. Then
$$[X + \mathfrak{i}, Y + \mathfrak{i}] = [X, Y] + \mathfrak{i}$$
by the bilinearity of the bracket. It then follows immediately that this bracket satisfies the Jacobi identity. Hence, $\mathfrak{g}/\mathfrak{i}$ is a Lie algebra.

Similar to the case of ideals of a ring, it can be shown (quite easily) that any ideal can be realized as the kernel of a Lie algebra homomorphism, namely $\varphi : \mathfrak{g} \to \mathfrak{g}/\mathfrak{i}$.

Definition 5.1.9.
A Lie algebra $\mathfrak{g}$ is simple if it has no non-zero proper ideals. It is semisimple if it has no non-zero solvable ideals. We say that a Lie group $G$ is semisimple (resp. simple) if $\mathfrak{g}$ is semisimple (resp. simple).

A fact which we will not prove is that all semisimple Lie algebras have no center, and therefore all semisimple Lie groups have a 0-dimensional center. Further, one can prove (say, by Cartan's criterion for semisimplicity) that all semisimple Lie algebras can be realized as a direct sum of simple Lie algebras [Kna96, Chapter 1].

Semisimple Lie groups are of interest to many areas of mathematics and are fairly well understood. The small piece of the theory of Lie groups that we need for the rest of this section is the representation theory of semisimple Lie groups and Lie algebras. Before we get into this, we want to understand where representation theory comes from in the first place. Why might we care about representations? Suppose $G$ is a finite group (not assumed to be of Lie type) and let $G$ act on a set $X$. Denote by $F(X)$ the set of all complex-valued functions on $X$. Then $F(X)$ is naturally a $\mathbb{C}$-vector space under pointwise addition and scalar multiplication. We can extend the action of $G$ on $X$ to an action on all of $F(X)$ by
$$(g \cdot f)(x) = f(g^{-1} \cdot x).$$
This representation will break up into a direct sum of irreducible representations of $G$ with some multiplicities (by Maschke's Theorem). Precisely how this representation breaks up tells us something about the structure of $X$. In particular, if we put some conditions on the functions (that they are all $L^2$, for instance) then we can better understand $X$ and its symmetries. This has a similar flavour to understanding $\mathrm{Aut}(X)$ for $X$ in an arbitrary category.

Definition 5.1.10.
Let $\mathfrak{g}$ be a Lie algebra over an arbitrary field. The commutator series for $\mathfrak{g}$ is defined by $\mathfrak{g}^1 = [\mathfrak{g}, \mathfrak{g}]$ and $\mathfrak{g}^{n+1} = [\mathfrak{g}^n, \mathfrak{g}^n]$. We get a chain of Lie subalgebras
$$\mathfrak{g} = \mathfrak{g}^0 \supseteq \mathfrak{g}^1 \supseteq \mathfrak{g}^2 \supseteq \ldots$$
We say that $\mathfrak{g}$ is solvable if $\mathfrak{g}^n = 0$ for some $n$.

Definition 5.1.11. Let $\mathfrak{g}$ be a Lie algebra over an arbitrary field. The lower central series for $\mathfrak{g}$ is defined by $\mathfrak{g}_1 = [\mathfrak{g}, \mathfrak{g}]$ and $\mathfrak{g}_{n+1} = [\mathfrak{g}, \mathfrak{g}_n]$. We get a chain of ideals
$$\mathfrak{g} = \mathfrak{g}_0 \supseteq \mathfrak{g}_1 \supseteq \mathfrak{g}_2 \supseteq \ldots$$
We say that $\mathfrak{g}$ is nilpotent if $\mathfrak{g}_n = 0$ for some $n$.

Corollary 5.1.12. If $\mathfrak{g}$ is nilpotent then it is solvable.

Lemma 5.1.13.
Every subalgebra of a solvable (resp. nilpotent) Lie algebra is solvable (resp. nilpotent).

Proof. Clearly, for each subalgebra $\mathfrak{h} \subseteq \mathfrak{g}$ the commutator series satisfies $\mathfrak{h}^n \subseteq \mathfrak{g}^n$, and likewise for the lower central series.

Theorem 5.1.14 (Lie's Theorem). Let $\mathfrak{g}$ be a complex solvable Lie algebra and $(\pi, V)$ a representation. Then there exists a simultaneous eigenvector for all elements of $\pi(\mathfrak{g})$.

This implies, for instance, that all elements of $\pi(\mathfrak{g})$ act by upper triangular matrices in a suitable basis of any $\pi(\mathfrak{g})$-invariant subspace, with the diagonal entries being the generalized eigenvalues of the matrices. For proofs of this theorem see [Kna86] or [Bum13]. For this entire section, all statements not proven are presented in [Kna05a] with incredible detail.
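Lie's theorem can be checked by hand in a tiny case. The sketch below (our own construction, not the text's: the names `H`, `E`, and `bracket` are ours) takes the two-dimensional solvable subalgebra of $\mathfrak{gl}_2(\mathbb{C})$ spanned by $\operatorname{diag}(1, -1)$ and the elementary matrix $E_{12}$, verifies closure under the bracket and solvability of the derived series, and exhibits a simultaneous eigenvector:

```python
# Numerical sketch of Lie's theorem for a 2-dimensional solvable subalgebra
# b = span{H, E} of gl(2, C); names H, E, bracket are our own choices.
import numpy as np

def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])

# b is a subalgebra: [H, E] = 2E lies back in b.
assert np.allclose(bracket(H, E), 2 * E)

# b is solvable: b^1 = [b, b] = span{E}, and b^2 = [b^1, b^1] = 0.
assert np.allclose(bracket(E, E), 0)

# Lie's theorem predicts a simultaneous eigenvector; here v = (1, 0)^T works:
v = np.array([1.0, 0.0])
assert np.allclose(H @ v, 1.0 * v)   # eigenvalue 1 for H
assert np.allclose(E @ v, 0.0 * v)   # eigenvalue 0 for E
print("simultaneous eigenvector found")
```

Both matrices are upper triangular in this basis, as the remark after the theorem predicts.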
Definition 5.1.15.
Let $\mathfrak{g}$ be a Lie algebra and $(\pi, V)$ a representation. For $\alpha \in \mathfrak{g}^*$ put
$$V_\alpha = \{ v \in V : (\pi(H) - \alpha(H))^n v = 0 \ \forall H \in \mathfrak{g}, \ n = n(v, H) \}$$
If $V_\alpha \neq 0$, then $V_\alpha$ is called a generalized weight space and $\alpha$ a weight. We will denote the set of weights by $\Lambda(\mathfrak{g}, \pi)$. If $V$ is finite dimensional then $\pi(H) - \alpha(H)$ is nilpotent on $V_\alpha$ via the theory of Jordan normal forms. Therefore, we may assume that $n(v, H) = \dim V$. In this case, we would like to somehow deduce information about $\pi$ from the generalized weight spaces.

Theorem 5.1.16.
Let $\mathfrak{h}$ be a nilpotent Lie algebra and $(\pi, V)$ a finite dimensional complex representation. Then there are finitely many generalized weights of $\pi$. Further, each generalized weight space is stable under $\pi(\mathfrak{h})$ and $V = \bigoplus_{\alpha \in \Lambda(\mathfrak{h}, \pi)} V_\alpha$.

Proof.
We first prove that $V_\alpha$ is invariant under $\pi(\mathfrak{h})$. Fix $H \in \mathfrak{h}$ and put
$$V_{\alpha, H} = \{ v \in V : (\pi(H) - \alpha(H))^n v = 0, \ n = n(v) \}$$
By construction $V_\alpha = \bigcap_{H \in \mathfrak{h}} V_{\alpha, H}$, so it suffices to prove that each $V_{\alpha, H}$ is $\pi(\mathfrak{h})$-invariant. Now, as $\mathfrak{h}$ is nilpotent, $\operatorname{ad} H$ is nilpotent for all $H$. Put $\mathfrak{h}(m) = \{ Y \in \mathfrak{h} : (\operatorname{ad} H)^m Y = 0 \}$ so that $\mathfrak{h} = \bigcup_{m=0}^{\dim \mathfrak{h}} \mathfrak{h}(m)$. We prove that $\pi(Y) V_{\alpha, H} \subseteq V_{\alpha, H}$ for $Y \in \mathfrak{h}(m)$ by induction on $m$. For the case $m = 0$ we have $\mathfrak{h}(0) = 0$ and there is nothing to prove. Therefore, assume the claim holds for all $Z \in \mathfrak{h}(m-1)$. If $Y \in \mathfrak{h}(m)$, then $[H, Y] \in \mathfrak{h}(m-1)$ by construction. Therefore,
$$(\pi(H) - \alpha(H)) \pi(Y) = \pi(Y)(\pi(H) - \alpha(H)) + \pi([H, Y])$$
and
$$(\pi(H) - \alpha(H))^2 \pi(Y) = \pi(Y)(\pi(H) - \alpha(H))^2 + (\pi(H) - \alpha(H))\pi([H, Y]) + \pi([H, Y])(\pi(H) - \alpha(H))$$
Iterating this computation, we get the general formula
$$(\pi(H) - \alpha(H))^{\ell} \pi(Y) = \pi(Y)(\pi(H) - \alpha(H))^{\ell} + \sum_{s=0}^{\ell - 1} (\pi(H) - \alpha(H))^{\ell - 1 - s} \pi([H, Y]) (\pi(H) - \alpha(H))^{s}$$
For $v \in V_{\alpha, H}$, we know that $(\pi(H) - \alpha(H))^N v = 0$ for $N \geq \dim V$. Take $\ell = 2N$ in the above expression and apply it to $v$. The only terms which could survive are those for which $s < N$. In that case $\ell - 1 - s \geq N$; since $(\pi(H) - \alpha(H))^s v \in V_{\alpha, H}$ and $\pi([H, Y])$ preserves $V_{\alpha, H}$ by the induction hypothesis, each such term
$$(\pi(H) - \alpha(H))^{\ell - 1 - s} \pi([H, Y]) (\pi(H) - \alpha(H))^{s} v = 0$$
as well. Hence $(\pi(H) - \alpha(H))^{2N} \pi(Y) v = 0$ and $V_{\alpha, H}$ is stable under $\pi(Y)$. This completes the induction, and $V_\alpha$ is invariant under $\pi(\mathfrak{h})$.

Now we can obtain the decomposition. Let $H_1, \dots, H_d$ be a basis for $\mathfrak{h}$. The Jordan decomposition for $\pi(H_1)$ gives a generalized eigenspace decomposition that we can write as $V = \bigoplus_{\lambda} V_{\lambda, H_1}$. We can regard the complex numbers $\lambda$ as running over all values of $\alpha(H_1)$ for $\alpha \in \mathfrak{h}^*$ arbitrary. Therefore, we can re-write the decomposition as $V = \bigoplus_{\alpha(H_1)} V_{\alpha(H_1), H_1}$. However, $V_{\alpha(H_1), H_1} = V_{\alpha, H_1}$, which we defined at the beginning of the proof, so each of these spaces is stable under $\pi(\mathfrak{h})$. Therefore, we can further decompose under $\pi(H_2)$ to get $V = \bigoplus_{\alpha(H_1)} \bigoplus_{\alpha(H_2)} (V_{\alpha, H_1} \cap V_{\alpha, H_2})$ and, continuing through the basis of $\mathfrak{h}$,
$$V = \bigoplus_{\alpha(H_1), \dots, \alpha(H_d)} \bigcap_{j=1}^{d} V_{\alpha, H_j}$$
with each of these spaces $\pi(\mathfrak{h})$-invariant.
By Lie's theorem, each $\pi(H_i)$ acts simultaneously by upper-triangular matrices on $\bigcap_{i=1}^d V_{\alpha, H_i}$ with diagonal entries evidently $\alpha(H_i)$. Then $\pi(\sum c_i H_i)$ acts with generalized eigenvalue $\sum c_i \alpha(H_i)$. Thus, if we define $\alpha(\sum c_i H_i) = \sum c_i \alpha(H_i)$, we see that $\bigcap_{i=1}^d V_{\alpha, H_i} = V_\alpha$ and $V = \bigoplus V_\alpha$. In particular there are only finitely many $\alpha$ which satisfy this property. This completes the proof.

Now let $\mathfrak{g}$ be a semisimple Lie algebra and $\mathfrak{h}$ a nilpotent subalgebra. Let $\mathfrak{h}^*$ denote its dual space. Then for all $\lambda \in \mathfrak{h}^*$, define
$$\mathfrak{g}_\lambda = \{ X \in \mathfrak{g} : (\operatorname{ad} H - \lambda(H))^n X = 0 \ \forall H \in \mathfrak{h}, \ n = n(X, H) \}$$
As $\mathfrak{h}$ is nilpotent, we know that $\mathfrak{g} = \bigoplus_{\lambda \in \mathfrak{h}^*} \mathfrak{g}_\lambda$. Further, there exist only finitely many $\lambda$ such that $\mathfrak{g}_\lambda$ is non-zero. Let $\Delta(\mathfrak{g}, \mathfrak{h})$ be the set of weights.

Proposition 5.1.17.
In the setting above:
(a) $\mathfrak{g} = \bigoplus_{\alpha \in \Delta(\mathfrak{g}, \mathfrak{h})} \mathfrak{g}_\alpha$
(b) $[\mathfrak{g}_\alpha, \mathfrak{g}_\beta] \subseteq \mathfrak{g}_{\alpha + \beta}$ (this space is understood to be $0$ if $\alpha + \beta \notin \Delta(\mathfrak{g}, \mathfrak{h})$)
(c) $\mathfrak{h} \subseteq \mathfrak{g}_0$

Proof.
This all follows from the previous theorem by replacing V with g . Definition 5.1.18.
A nilpotent Lie subalgebra $\mathfrak{h}$ is a Cartan subalgebra if $\mathfrak{h} = \mathfrak{g}_0$.

This definition is in general hard to check. Therefore, we would like an equivalent way of defining Cartan subalgebras so that this condition is not too abstract.

Proposition 5.1.19.
Let $\mathfrak{g}$ be a Lie algebra and $\mathfrak{h}$ a nilpotent subalgebra. Then $\mathfrak{h}$ is a Cartan subalgebra if and only if $N_{\mathfrak{g}}(\mathfrak{h}) = \mathfrak{h}$. Here $N_{\mathfrak{g}}(\mathfrak{h}) = \{ X \in \mathfrak{g} : [X, \mathfrak{h}] \subseteq \mathfrak{h} \}$ is the normalizer of $\mathfrak{h}$.

Proof.
See [Kna05a]
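Proposition 5.1.19 is easy to see concretely for $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ with $\mathfrak{h}$ the diagonal subalgebra. The following sketch (our own construction, not from the text) checks which standard basis elements normalize $\mathfrak{h}$:

```python
# Numerical check (ours) that N_g(h) = h for g = sl(2, C) and h = span{H}.
import numpy as np

H = np.array([[1.0, 0.0], [0.0, -1.0]])
E = np.array([[0.0, 1.0], [0.0, 0.0]])
F = np.array([[0.0, 0.0], [1.0, 0.0]])

def bracket(X, Y):
    return X @ Y - Y @ X

def in_h(X):
    """A matrix lies in h = span{H} iff it is diagonal and traceless."""
    return np.allclose(X, np.diag(np.diag(X))) and abs(np.trace(X)) < 1e-12

# H normalizes h, while [E, H] = -2E and [F, H] = 2F leave h:
normalizers = [name for name, X in [("H", H), ("E", E), ("F", F)]
               if in_h(bracket(X, H))]
print(normalizers)  # only "H" survives, consistent with N_g(h) = h
```

Since no combination of $E$ and $F$ can bracket $\mathfrak{h}$ back into $\mathfrak{h}$, the diagonal subalgebra is its own normalizer and hence a Cartan subalgebra.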
Theorem 5.1.20.
Let $\mathfrak{g}$ be a complex finite-dimensional Lie algebra. Then there exists a Cartan subalgebra $\mathfrak{h} \subseteq \mathfrak{g}$. Further, any two Cartan subalgebras are conjugate.

Proof.
See [Kna86], [Kna05a], and [Hel78] for separate proofs of this theorem.

For the remainder of this section, we shall only give sketches of the proofs for the big theorems as there are much more important topics to cover. For a full treatment see [Lor18, Chapter 7].

Definition 5.1.21.
Let $\mathfrak{g}$ be a complex semisimple Lie algebra and $\mathfrak{h}$ a Cartan subalgebra. We call the weights of the adjoint representation of $\mathfrak{h}$ on $\mathfrak{g}$ roots. The decomposition
$$\mathfrak{g} = \mathfrak{h} \oplus \bigoplus_{\alpha \in \Delta(\mathfrak{g}, \mathfrak{h})} \mathfrak{g}_\alpha$$
is called the root space decomposition. We want to understand $\Delta(\mathfrak{g}, \mathfrak{h})$.

Proposition 5.1.22.
Consider the situation above, with $B$ the Killing form.
(a) If $\alpha, \beta \in \Delta \cup \{0\}$ and $\alpha + \beta \neq 0$ then $B(\mathfrak{g}_\alpha, \mathfrak{g}_\beta) = 0$.
(b) If $\alpha \in \Delta \cup \{0\}$, then $B$ is non-singular on $\mathfrak{g}_\alpha \times \mathfrak{g}_{-\alpha}$.
(c) If $\alpha \in \Delta$ then $-\alpha \in \Delta$.
(d) $B|_{\mathfrak{h} \times \mathfrak{h}}$ is non-degenerate, and thus for each $\alpha$ there exists $H_\alpha$ so that $B(H_\alpha, H) = \alpha(H)$ for all $H \in \mathfrak{h}$.
(e) $\Delta$ spans $\mathfrak{h}^*$.

Proof.
See [Kna05a, Chapter 2].

The following proposition sharpens the root space decomposition nicely.
Proposition 5.1.23. If $\alpha \in \Delta$, then $\dim \mathfrak{g}_\alpha = 1$. Further, $n\alpha \notin \Delta$ for $n \geq 2$.

Proof.
See [Kna05a, Chapter 2].

All of this together shows that $\Delta(\mathfrak{g}, \mathfrak{h})$ is an abstract, reduced root system. We can thus define a notion of positivity.

Definition 5.1.24.
Let $V$ be a finite dimensional inner product space. Fix a spanning set $\varphi_1, \dots, \varphi_m$. Then a vector $\varphi$ is positive (denoted $\varphi > 0$) if there exists an integer $k \geq 1$ such that $\langle \varphi, \varphi_i \rangle = 0$ for $1 \leq i \leq k - 1$ and $\langle \varphi, \varphi_k \rangle > 0$.

Lemma 5.1.25. If $\varphi \in \Delta$, then exactly one of $\varphi$ or $-\varphi$ is positive.

Proof. See [Lor18, Chapter 7].
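Definition 5.1.24 and Lemma 5.1.25 can be tested concretely on the $A_2$ root system in $\mathbb{R}^2$. In the sketch below (our own; the spanning vectors `phis` are our choice, and different choices give different positive systems) exactly three of the six roots come out positive:

```python
# Lexicographic positivity (Definition 5.1.24) for the A2 root system; the
# roots a1, a2 and the spanning set phis are our own choices.
import numpy as np

a1 = np.array([1.0, 0.0])
a2 = np.array([-0.5, np.sqrt(3) / 2])
roots = [a1, a2, a1 + a2, -a1, -a2, -(a1 + a2)]

phis = [np.array([0.5, np.sqrt(3) / 2]), np.array([1.0, 0.0])]

def positive(phi, tol=1e-12):
    """phi > 0 iff the first non-zero inner product <phi, phi_i> is positive."""
    for p in phis:
        ip = float(np.dot(phi, p))
        if abs(ip) > tol:
            return ip > 0
    return False  # the zero vector is not positive

pos = [r for r in roots if positive(r)]
print(len(pos))  # 3 of the 6 roots are positive

# Lemma 5.1.25: exactly one of phi, -phi is positive for every root.
assert all(positive(r) != positive(-r) for r in roots)
```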
Definition 5.1.26. A basis $\Pi$ for $\Delta$ is a choice of elements such that:
(a) $\Pi$ is a basis of $\mathfrak{h}^*$.
(b) For any $\beta \in \Delta$, we can write $\beta = \sum n_i \alpha_i$ with $\alpha_i \in \Pi$ and the $n_i \in \mathbb{Z}$ all positive or all negative (by Lemma 5.1.25).
We call the elements of $\Pi$ simple, and normally say "choose a simple system" for $\Delta$.

Definition 5.1.27. Let $\alpha, \beta \in \mathfrak{h}^*$. With respect to the inner product $\langle \cdot, \cdot \rangle$ on $\mathfrak{h}^*$, put
$$(\alpha, \beta) = \frac{2 \langle \alpha, \beta \rangle}{\langle \beta, \beta \rangle} = \frac{2 \|\alpha\|}{\|\beta\|} \cos \theta$$
where $\theta$ is the angle between the functionals. Then the reflection of $\beta$ by $\alpha$, denoted $s_\alpha \beta$, is defined by
$$s_\alpha \beta = \beta - (\beta, \alpha) \alpha$$
The
Weyl group is $W(\mathfrak{g}) := \langle s_\alpha : \alpha \in \Delta \rangle$.

Theorem 5.1.28. $W(\mathfrak{g})$ acts transitively on the set of simple systems for $\Delta$.

Proof.
See [Kna05a, Chapter 2, Section 6].

This final theorem eases the concern that picking positive elements is arbitrary and could possibly lead to different results. Now put $\mathfrak{h}^\circ = \mathfrak{h}^* \setminus \bigcup_{\alpha \in \Delta} \alpha^\perp$. The connected components of $\mathfrak{h}^\circ$ are called Weyl chambers, and given a choice of simple system $\Pi$ there is a natural choice of Weyl chamber associated to $\Pi$, called the positive Weyl chamber:
$$C(\Pi) = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) > 0 \ \forall \beta \in \Delta^+ \} = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) > 0 \ \forall \beta \in \Pi \}$$
Associated to any $\Delta(\mathfrak{g}, \mathfrak{h})$ is a lattice $\Lambda = \{ \alpha \in \mathfrak{h}^* : (\alpha, \beta) \in \mathbb{Z} \ \forall \beta \in \Delta \}$. This is the weight lattice associated to $\Delta$.

Definition 5.1.29.
An element $\alpha \in \mathfrak{h}^*$ is called dominant and algebraically integral if $\alpha \in \Lambda \cap C(\Pi)$.

Lie algebras are easier to deal with than Lie groups, but still the fact that they are non-associative makes the situation a bit difficult. What we would like is to find an associative algebra $A$ such that the representation theory of $\mathfrak{g}$ is the same as the representation theory of $A$ in some semi-canonical sense. As a first guess, we could take the tensor algebra. Let $\mathfrak{g}$ be a complex Lie algebra assumed to be finite dimensional (this construction works for the infinite dimensional case as well), and let $T^\bullet(\mathfrak{g}) = \bigoplus_{k \geq 0} \mathfrak{g}^{\otimes k}$ denote the tensor algebra of $\mathfrak{g}$. This does not force the resulting map $A \to \operatorname{End}(V)$ to be a Lie algebra homomorphism and thus is not the correct choice. Therefore, let
$$U(\mathfrak{g}) = T^\bullet(\mathfrak{g}) / \langle X \otimes Y - Y \otimes X - [X, Y] \rangle$$
with $X, Y \in \mathfrak{g}$. This is the universal enveloping algebra of $\mathfrak{g}$. Then the canonical map $i: \mathfrak{g} \to U(\mathfrak{g})$ is a Lie algebra homomorphism. It is universal in the sense that given any unital associative algebra $A$ and a Lie algebra homomorphism $\varphi: \mathfrak{g} \to A$, there is a unique algebra homomorphism $\hat{\varphi}: U(\mathfrak{g}) \to A$ such that $\hat{\varphi} \circ i = \varphi$. The following theorem gives an algebraic description of the universal enveloping algebra.

Theorem 5.1.30 (Poincaré-Birkhoff-Witt). Let $\mathfrak{g}$ be a complex Lie algebra with basis $\{X_i\}$. Then the monomials $X_1^{p_1} \cdots X_n^{p_n}$ form a basis for $U(\mathfrak{g})$. If in addition we assume $\mathfrak{g}$ is semisimple, let $\{X_{-\alpha}, H_\alpha, X_\alpha\}$ be a basis for $\mathfrak{g}$ with respect to a set of roots $\Delta(\mathfrak{g}, \mathfrak{h})$ and a choice of simple system $\Pi$. Then the monomials
$$X_{-\alpha_1}^{i_1} \cdots X_{-\alpha_p}^{i_p} H_{\alpha_1}^{j_1} \cdots H_{\alpha_q}^{j_q} X_{\alpha_1}^{k_1} \cdots X_{\alpha_r}^{k_r}$$
form a basis for $U(\mathfrak{g})$.

Corollary 5.1.31.
The canonical map i : g → U ( g ) is an injective Lie algebra homomorphism. Proposition 5.1.32.
Every representation of $\mathfrak{g}$ extends to a representation of $U(\mathfrak{g})$, and every $U(\mathfrak{g})$-module descends to a representation of $\mathfrak{g}$.

Proof.
The inclusion of $U(\mathfrak{g})$-modules into $\mathfrak{g}$-representations is given by the corollary above. Therefore, it suffices to show that every $\mathfrak{g}$-representation extends to an associative algebra homomorphism $U(\mathfrak{g}) \to \operatorname{End}(V)$. Any representation $\mathfrak{g} \to \operatorname{End}(V)$ can be extended to an algebra homomorphism $T^\bullet(\mathfrak{g}) \to \operatorname{End}(V)$. The kernel of this map contains the ideal defining $U(\mathfrak{g})$, and therefore the map descends to $U(\mathfrak{g}) \to \operatorname{End}(V)$.

We want to give a more analytic interpretation of the universal enveloping algebra. Let $G$ be a semisimple (or reductive) Lie group with Lie algebra $\mathfrak{g}$. Then $G$ acts on the space of smooth functions $C^\infty(G)$ in two ways:
$$L(g) f(x) = f(g^{-1} x) \qquad R(g) f(x) = f(xg)$$
An easy consequence of the definitions is that the differentiated action $dL$ commutes with $R$, so
$$L(g) \, dR(X) = dR(X) \, L(g)$$
for all $X \in \mathfrak{g}$ and $g \in G$. This exhibits $\mathfrak{g}$ as left invariant differential operators on $G$. In fact, it is a faithful representation $\mathfrak{g} \to \operatorname{End}(C^\infty(G))$. We can extend this action to $U(\mathfrak{g})$ and thereby realize $U(\mathfrak{g})$ as a ring of left invariant differential operators on $G$. As it turns out, much of the representation theory of $G$ is determined by how certain differential operators (namely the Laplacian or Casimir element) act on representations. If the representation is irreducible, for instance, then the center $Z(\mathfrak{g})$ of the universal enveloping algebra acts by scalars. This parametrizes the irreducible representations of $G$.

Let $\mathfrak{g}$ be a complex semisimple Lie algebra with Cartan subalgebra $\mathfrak{h}$ and root system $\Delta := \Delta(\mathfrak{g}, \mathfrak{h})$. Let $\Delta^+$ denote the set of positive roots and $\Pi$ a system of simple ones. It is known that the finite dimensional representation theory of semisimple Lie algebras is semisimple. In the case of complex representations, we have that for every finite dimensional representation $\varphi: \mathfrak{g} \to \mathfrak{gl}(V) = \operatorname{End}_{\mathbb{C}}(V)$, we can decompose $V = \bigoplus V_i$ where each $V_i$ is irreducible.
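In the smallest case $\mathfrak{g} = \mathfrak{sl}_2(\mathbb{C})$ the irreducible pieces are completely explicit. The sketch below (our own conventions: basis $v_0, \dots, v_m$ with $v_0$ of highest weight $m$) builds the $(m+1)$-dimensional irreducible representation as matrices and verifies the defining bracket relations and the weight string:

```python
# Explicit (m+1)-dimensional irreducible representation of sl(2, C);
# the normalization (v_0 highest, f v_k = v_{k+1}) is our own choice.
import numpy as np

def sl2_irrep(m):
    """Matrices (e, f, h) on basis v_0, ..., v_m with h v_k = (m - 2k) v_k,
    f v_k = v_{k+1}, and e v_k = k (m - k + 1) v_{k-1}."""
    n = m + 1
    e = np.zeros((n, n)); f = np.zeros((n, n)); h = np.zeros((n, n))
    for k in range(n):
        h[k, k] = m - 2 * k
        if k + 1 < n:
            f[k + 1, k] = 1.0                 # f lowers the weight by 2
        if k - 1 >= 0:
            e[k - 1, k] = k * (m - k + 1)     # e raises the weight by 2
    return e, f, h

def bracket(X, Y):
    return X @ Y - Y @ X

e, f, h = sl2_irrep(3)  # the 4-dimensional irreducible
# The defining relations of sl(2, C) hold:
assert np.allclose(bracket(h, e), 2 * e)
assert np.allclose(bracket(h, f), -2 * f)
assert np.allclose(bracket(e, f), h)
# v_0 is a highest weight vector: e kills it and h acts by the weight m = 3.
v0 = np.eye(4)[0]
assert np.allclose(e @ v0, 0)
assert np.allclose(h @ v0, 3 * v0)
print(sorted(np.diag(h)))  # the weight string -3, -1, 1, 3
```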
Therefore we want to classify all irreducible finite dimensional representations, and this will yield all finite dimensional representations of $\mathfrak{g}$. We have the following theorem which does precisely this.

Theorem 5.1.33 (Theorem of Highest Weights). Let $\mathfrak{g}$ be a complex semisimple Lie algebra, $\mathfrak{h}$ a Cartan subalgebra and $\Delta(\mathfrak{g}, \mathfrak{h})$ the roots with respect to $\mathfrak{h}$. Let $C^+$ be the positive Weyl chamber. Then the irreducible, finite-dimensional representations of $\mathfrak{g}$ stand in one-one correspondence with the set of algebraically integral, dominant weights. The correspondence is given in one direction by $V \mapsto \lambda$, its highest weight.

The difficult step in the proof of this theorem is the construction of the correspondence in the "$\leftarrow$" direction. To do this, we must build finite dimensional irreducible representations which have highest weight $\lambda$. These are seen as quotients of Verma modules (to be defined below), which are infinite dimensional representations of $\mathfrak{g}$ that are universal in some sense (see Proposition 5.1.38).

The setup to the construction of such representations makes use of the root space decomposition of $\mathfrak{g}$. If $\alpha \in \Delta$, define
$$\mathfrak{g}_\alpha := \{ X \in \mathfrak{g} \mid (\operatorname{ad} H - \alpha(H))^n X = 0 \ \forall H \in \mathfrak{h}, \text{ and some } n = n(H, X) \}$$
Then it is easy to see that
$$\mathfrak{g} = \bigoplus_{\alpha \in \Delta \cup \{0\}} \mathfrak{g}_\alpha = \mathfrak{h} \oplus \bigoplus_{\alpha \neq 0} \mathfrak{g}_\alpha$$
By definition the zero root space is the Cartan subalgebra. If we pick an order on $\mathfrak{h}^*$ we can then decompose $\mathfrak{g}$ further into positive and negative root spaces
$$\mathfrak{n} = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_\alpha \qquad \mathfrak{n}^- = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_{-\alpha}$$
These are both Lie subalgebras by construction.
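The decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{n} \oplus \mathfrak{n}^-$ can be computed numerically. The sketch below (our own basis and choice of regular element $H$) reads off the eigenvalues $\alpha(H)$ of $\operatorname{ad}(H)$ on $\mathfrak{sl}_3(\mathbb{C})$, recovering three positive roots, three negative roots, and the two-dimensional Cartan:

```python
# Root space decomposition of sl(3) via a regular Cartan element (our setup).
import numpy as np

def E(i, j):
    M = np.zeros((3, 3)); M[i, j] = 1.0
    return M

# Basis of sl(3): two diagonal Cartan elements plus the six E_ij, i != j.
basis = [np.diag([1.0, -1.0, 0.0]), np.diag([0.0, 1.0, -1.0])] + \
        [E(i, j) for i in range(3) for j in range(3) if i != j]

H = np.diag([1.0, 0.0, -1.0])  # a regular element of the Cartan subalgebra

# Each basis element X satisfies [H, X] = lambda X with lambda = alpha(H)
# (and lambda = 0 on the Cartan); extract lambda by an inner product.
eigs = sorted(round(float(np.trace((H @ X - X @ H) @ X.T)) /
                    float(np.trace(X @ X.T))) for X in basis)
print(eigs)  # [-2, -1, -1, 0, 0, 1, 1, 2]: the values alpha(H), with 0 on h
```

The three strictly upper triangular $E_{ij}$ span $\mathfrak{n}$ (positive values of $\alpha(H)$) and the strictly lower triangular ones span $\mathfrak{n}^-$.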
Definition 5.1.34.
The Lie subalgebra spanned by the Cartan subalgebra together with all of the positive root spaces is called the Borel subalgebra of $\mathfrak{g}$. We denote this as $\mathfrak{b} = \mathfrak{h} \oplus \mathfrak{n}$. Any Lie subalgebra $\mathfrak{p}$ such that $\mathfrak{b} \subseteq \mathfrak{p} \subsetneq \mathfrak{g}$ is called a parabolic subalgebra.

Before we head into the theory of highest weight modules, we recall some facts about $\mathfrak{sl}_2(\mathbb{C})$. If we let $\{e, f, h\}$ be a basis, then on any irreducible finite dimensional representation we have a weight space decomposition with basis vectors $u_0, u_1, \dots, u_m$: the element $h$ acts on each $u_i$ by a scalar (its weight), $f$ maps $u_i$ to a multiple of $u_{i-1}$, and $e$ maps $u_i$ to a multiple of $u_{i+1}$, terminating in a vector $u_m$ such that $e(u_m) = 0$. We say that $u_m$ is the highest weight vector of this representation. In this same style we have the following definition.

Definition 5.1.35.
Let $V$ be a left $U(\mathfrak{g})$-module. A vector $v \in V$ is called a highest weight vector if $\mathfrak{n} \cdot v = 0$. The left $U(\mathfrak{g})$-submodule generated by a highest weight vector is called a highest weight module.

The following proposition gives some properties of highest weight modules.

Proposition 5.1.36.
Let $M$ be a highest weight module for $U(\mathfrak{g})$, and let $v$ be a highest weight vector generating $M$. Suppose $v$ is of weight $\lambda$. Then the following hold:
(a) $M = U(\mathfrak{n}^-) v$
(b) $M = \bigoplus_{\mu \in \mathfrak{h}^*} M_\mu$ with each $M_\mu$ finite-dimensional and with $\dim_{\mathbb{C}} M_\lambda = 1$
(c) Every weight of $M$ is of the form $\lambda - \sum n_i \alpha_i$ with $\alpha_i \in \Pi$ and $n_i \in \mathbb{Z}^+$.

Proof. (a) As above, we have the decomposition $\mathfrak{g} = \mathfrak{n}^- \oplus \mathfrak{b}$. The Poincaré-Birkhoff-Witt Theorem gives a basis for $U(\mathfrak{g})$ which gives us the decomposition $U(\mathfrak{g}) = U(\mathfrak{n}^-) \otimes U(\mathfrak{b}) = U(\mathfrak{n}^-) \otimes U(\mathfrak{h}) \otimes U(\mathfrak{n})$. On the vector $v$, $U(\mathfrak{b})$ acts by scalars. This follows from the fact that $U(\mathfrak{n}) v = \mathbb{C} v$ and that $U(\mathfrak{h})$ does not increase or decrease the weight. Therefore $U(\mathfrak{g}) v = U(\mathfrak{n}^-) v$, and as $M$ is generated by $v$, we conclude that $M = U(\mathfrak{n}^-) v$.

(b, c) It is clear that $\bigoplus M_\mu$ is stable under the left $U(\mathfrak{g})$-action. As $v \in \bigoplus M_\mu$, we have that $M \subseteq \bigoplus M_\mu$. It is true by construction that $\bigoplus M_\mu \subseteq M$, and therefore $M = \bigoplus M_\mu$. By (a) we know that $M = U(\mathfrak{n}^-) v$. Any monomial $E_{-\beta_1}^{i_1} \cdots E_{-\beta_k}^{i_k}$ maps $M_\mu$ into the weight space of weight $\mu - \sum_j i_j \beta_j$. As $\lambda$ is the highest weight, there are finitely many ways to write $\mu = \lambda - \sum_j i_j \beta_j$ and a unique way to write $\lambda$ itself. Therefore $M_\mu$ is finite-dimensional and $M_\lambda$ is 1-dimensional. The weights are all of the form $\lambda - \sum_j i_j \beta_j = \lambda - \sum n_i \alpha_i$, as each $\beta_p = \sum_i n_{i_p} \alpha_i$ for $\alpha_i \in \Pi$. This completes the proof.

We will define Verma modules shortly. These will turn out to be highest weight modules which are universal in some sense. Before then, let $\lambda \in \mathfrak{h}^*$, and put $\delta = \frac{1}{2} \sum_{\alpha \in \Delta^+} \alpha$. We can make $\mathbb{C}$ into a $U(\mathfrak{b})$-module by defining how elements of $\mathfrak{h}$ and $\mathfrak{n}$ act; by the Poincaré-Birkhoff-Witt Theorem we will then have defined how all of $U(\mathfrak{b})$ acts. Define the action of $\mathfrak{b}$ on $\mathbb{C}$ by
$$H z = (\lambda - \delta)(H) z \ \ \forall H \in \mathfrak{h} \qquad X z = 0 \ \ \forall X \in \mathfrak{n}$$
We denote $\mathbb{C}$ under this action as $\mathbb{C}_{\lambda - \delta}$.
Define a functor $\operatorname{Ind}_{\mathfrak{b}}^{\mathfrak{g}}: U(\mathfrak{b})\text{-Mod} \to U(\mathfrak{g})\text{-Mod}$ by
$$V \mapsto U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} V$$
where we regard $U(\mathfrak{g})$ as a right $U(\mathfrak{b})$-module, that is, as a module over the universal enveloping algebra of the subalgebra.

Definition 5.1.37.
The
Verma module corresponding to the weight $\lambda$ is
$$V(\lambda) = \operatorname{Ind}_{\mathfrak{b}}^{\mathfrak{g}}(\mathbb{C}_{\lambda - \delta}) = U(\mathfrak{g}) \otimes_{U(\mathfrak{b})} \mathbb{C}_{\lambda - \delta}$$
The following theorem characterizes Verma modules. Using these modules, one can prove the "$\leftarrow$" direction of the theorem of highest weights.

Proposition 5.1.38.
Let $\lambda \in \mathfrak{h}^*$.
(a) $V(\lambda)$ is a highest weight module with weight $\lambda - \delta$, generated by $1 \otimes 1$.
(b) Let $M$ be a highest weight module of weight $\lambda - \delta$ generated by a highest weight vector $v$. Then there exists a unique $U(\mathfrak{g})$-module map $\psi: V(\lambda) \to M$ with $\psi(1 \otimes 1) = v$, and $\psi$ is onto. In particular $M \cong V(\lambda)$ if and only if $\ker \psi = 0$.

Part (b) follows from the universal mapping property for tensor products. Notice that $V(\lambda)$ is infinite dimensional over $\mathbb{C}$.

Proposition 5.1.39.
Let $\lambda \in \mathfrak{h}^*$, $V(\lambda)$ the associated Verma module, and $S$ the sum of all proper $U(\mathfrak{g})$-submodules of $V(\lambda)$. Then $L(\lambda) = V(\lambda)/S$ is an irreducible $U(\mathfrak{g})$-module and is a highest weight module with weight $\lambda - \delta$.

This follows immediately from the definition and the fact that the image of $1 \otimes 1$ in $L(\lambda)$ is non-zero. The following theorem completes the proof of the Theorem of Highest Weights.

Theorem 5.1.40.
Let $\lambda \in \mathfrak{h}^*$ be real on $\mathfrak{h}$, dominant, and algebraically integral. Then $L(\lambda + \delta)$ is an irreducible finite-dimensional representation of $\mathfrak{g}$ with highest weight $\lambda$.

For a proof of this see [Kna05a, Chapter V, Section 3].

Remark 5.1.41.
The exact same result holds on the group level as well. There, the proof follows from the theorem on the level of Lie algebras by differentiating the representations and then following the same steps. The only difference is the replacement of algebraic integrality with analytic integrality (defined below). For more details see [Kna86, Chapter IV, Section 7].

Now that we know these representations exist and are parametrized by dominant, algebraically integral weights, we want to find an explicit realization of the $L(\lambda + \delta)$. To do this, we make use of the theory of holomorphic vector bundles.

5.2 Compact Groups and Tori

The key to understanding a majority of the representation theory of reductive, semisimple, or compact Lie groups is the existence of a
Haar measure. This is a left invariant Borel measure on $G$. The existence of such a measure implies, as an example, that all representations of compact Lie groups can be taken to be unitary without loss of generality. Additionally, combined with the Iwasawa decomposition, we get a variety of strong results. This will play a key role in the proof of the Borel-Weil theorem. Let us first show that such a measure exists.

Let $G$ be a Lie group of dimension $n$ with Lie algebra $\mathfrak{g}$. Then $T_e(G) = \mathfrak{g}$ and there is an isomorphism from $\mathfrak{g}$ to $\Gamma_L(G, TG)$, the set of left-invariant smooth vector fields on $G$. From this we conclude that $G$ is parallelizable. For this reason, we know that there exists an $n$-form $\omega \in \Omega^n(G)$ which is positive relative to a chosen atlas on $G$, nowhere vanishing, and left-invariant. Further, by the Riesz Representation theorem, there exists a Borel measure $d\mu_\omega$ on $G$ such that $\int_G f \omega = \int_G f \, d\mu_\omega$ for all $f \in C_c(G)$.

Lemma 5.2.1. $d\mu_\omega$ is left invariant in the sense that $d\mu_\omega(L_g E) = d\mu_\omega(E)$ for all Borel sets $E \subseteq G$ and all $g \in G$.

Proof. As $\omega$ is left-invariant, we know that $L_g^* \omega = \omega$. Therefore, we have that
$$\int_G f \omega = \int_G f(gx) \, L_g^* \omega = \int_G f(gx) \, d\mu_\omega(x) = \int_G f(x) \, d\mu_\omega(x)$$
Hence, $d\mu_\omega$ is left-invariant. If $K \subseteq G$ is compact, we apply the above integral formula to all $f \geq \chi_K$. Taking the infimum over these functions we see that $d\mu_\omega(L_g K) = d\mu_\omega(K)$. Since $G$ has a countable base, $d\mu_\omega$ is regular and the lemma follows.

Definition 5.2.2.
A left-invariant, positive, Borel measure on $G$ is called a left Haar measure.

Proposition 5.2.3.
Any two left Haar measures on $G$ are proportional.

Proof.
See [Kna05a, Theorem 8.23].

We could have equivalently defined right Haar measures. For most groups these are different from the left Haar measures. Let $d_l x$ denote a left Haar measure and $d_r x$ a right Haar measure, and notice that $L_g$ and $R_g$ commute with one another. Then, for any $t \in G$, the measure $d_l(\cdot \, t)$ is again a left Haar measure. For this reason, we get a function $\Delta: G \to \mathbb{R}^+$, called the modular homomorphism, which satisfies
$$d_l(\cdot \, t) = \Delta^{-1}(t) \, d_l(\cdot)$$
This is a smooth function.
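A group is unimodular exactly when $|\det \operatorname{Ad}(t)| = 1$ for all $t$; up to a convention-dependent inversion, $\Delta(t)$ itself can be computed as $|\det \operatorname{Ad}(t)|$, so this gives a hands-on test. The sketch below (our own helper `ad_matrix` and our own choice of group elements) contrasts the non-unimodular "$ax+b$" group with the semisimple $SL_2(\mathbb{R})$:

```python
# Unimodularity via |det Ad(t)|; the helper and test elements are our own.
import numpy as np

def ad_matrix(g, basis):
    """Matrix of Ad(g): X -> g X g^{-1} in the given basis of the Lie algebra."""
    ginv = np.linalg.inv(g)
    A = np.column_stack([B.flatten() for B in basis])
    cols = [np.linalg.lstsq(A, (g @ X @ ginv).flatten(), rcond=None)[0]
            for X in basis]
    return np.column_stack(cols)

# The "ax + b" group (affine line), embedded as [[a, b], [0, 1]]:
affine_basis = [np.array([[1.0, 0], [0, 0]]), np.array([[0, 1.0], [0, 0]])]
g = np.array([[2.0, 3.0], [0.0, 1.0]])
print(abs(np.linalg.det(ad_matrix(g, affine_basis))))  # 2.0: not unimodular

# sl(2, R) basis; SL(2, R) is semisimple, so |det Ad| should be identically 1:
sl2_basis = [np.array([[1.0, 0], [0, -1.0]]),
             np.array([[0, 1.0], [0, 0]]), np.array([[0, 0], [1.0, 0]])]
s = np.array([[2.0, 1.0], [0.0, 0.5]])  # an element of SL(2, R)
assert np.isclose(abs(np.linalg.det(ad_matrix(s, sl2_basis))), 1.0)
```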
Lemma 5.2.4. $\Delta(t) = 1$ for all $t$ in a compact subgroup $K$ of $G$.

Proof. As $\Delta$ is smooth, $\Delta(K)$ is a compact subgroup of $\mathbb{R}^+$. Therefore $\Delta(K) = \{1\}$.

Definition 5.2.5. A Lie group $G$ is called unimodular if $\Delta =$
1. Equivalently, if $d_r(x) = d_l(x)$. We now want to know which groups are unimodular; then, when integration arises on these groups, we do not have to worry about the choice of Haar measure.

Theorem 5.2.6.
The following groups are unimodular:
(a) compact groups
(b) semisimple groups
(c) reductive groups
We will not prove this as it requires the development of reductive Lie groups, which we do not present. See [Kna05a] for a proof in full generality.

Now we turn to general representation theory for compact groups. A representation is a continuous group homomorphism $\Pi: K \to \operatorname{Aut}(V)$ for some Hilbert space $V$. (The assumption that $V$ is a Hilbert space is unnecessary for $\dim V < \infty$. As we want the greatest generality, we do not place this finiteness assumption on $V$.) A representation is called unitary if $\Pi(k)$ is a unitary operator for all $k \in K$.

Lemma 5.2.7.
Let $K$ be a compact Lie group and $(\Pi, V)$ a representation. Then there exists a Hermitian inner product $\langle \cdot, \cdot \rangle$ on $V$ so that the representation is unitary.

Proof. As $K$ is compact, every continuous function is integrable. Define
$$(u, v) = \int_K \langle \Pi(k) u, \Pi(k) v \rangle \, dk$$
where $dk$ is the Haar measure on $K$. By the invariance of $dk$, each $\Pi(k')$ is a unitary operator with respect to this new Hermitian inner product. Further, by the Principle of Uniform Boundedness we conclude that the topology on $V$ is the same as the topology generated by $(\cdot, \cdot)$.

Therefore, we can assume without loss of generality that every representation of a compact Lie group is unitary. Another interesting feature of compact Lie groups is the existence of a maximal abelian subgroup.

Proposition 5.2.8 (Cartan). Let $K$ be a compact, connected Lie group. Then there exists a maximal abelian subgroup which can be identified with a torus. Further, any two maximal tori are conjugate.

Proof.
See [Bum13].

In a similar style to semisimple Lie algebras, we can define roots with respect to $\mathfrak{t}$, the Lie algebra of a maximal torus $T \subseteq K$. As $\mathfrak{t}$ is abelian, the adjoint representation on $\mathfrak{k}$ breaks up (as a direct sum) into one-dimensional irreducible representations. Each of these representations corresponds to a linear functional on $\mathfrak{t}$. We define the roots as those functionals which yield non-zero spaces $\mathfrak{k}_\alpha$.

Definition 5.2.9. Let $\lambda \in \mathfrak{t}^*$. Then we say $\lambda$ is analytically integral if $\lambda(H) \in 2\pi i \mathbb{Z}$ for every $H \in \mathfrak{t}$ with $\exp H = 1$. By a simple argument it can be shown that this condition is equivalent to the existence of a character $\xi_\lambda: T \to \mathbb{C}^\times$ such that $\xi_\lambda(\exp H) = e^{\lambda(H)}$ for all $H \in \mathfrak{t}$.

Proposition 5.2.10. If $\lambda$ is analytically integral, then $\lambda$ is algebraically integral. That is, $(\lambda, \alpha) \in \mathbb{Z}$ for each $\alpha \in \Delta(\mathfrak{k}, \mathfrak{t})$.

Proof.
See [Kna86].
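Definition 5.2.9 is transparent for the circle $T = U(1)$, where $\mathfrak{t} = i\mathbb{R}$ and the kernel of $\exp$ is $2\pi i \mathbb{Z}$: the functional $\lambda_n(i\theta) = in\theta$ is analytically integral exactly when $n \in \mathbb{Z}$, i.e. when $e^{in\theta}$ is a well-defined character. A sketch (the function name and the sample values of $n$ are our own):

```python
# Analytic integrality on T = U(1) (Definition 5.2.9); our own sketch.
import cmath

def analytically_integral(n, tol=1e-9):
    """Check lambda_n(H) in 2*pi*i*Z on the generator H = 2*pi*i of ker(exp)."""
    lam_H = n * 2 * cmath.pi * 1j            # lambda_n(2*pi*i)
    return abs(cmath.exp(lam_H) - 1) < tol   # e^{lambda(H)} = 1 iff n in Z

print([n for n in (0, 1, 2, 0.5, 1.5) if analytically_integral(n)])  # [0, 1, 2]
```

Non-integer $n$ fails because $\xi_\lambda$ would be multi-valued on the circle, which is exactly the obstruction the definition rules out.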
We now depart from compact groups momentarily to set up the remaining backgroundfor the Borel-Weil theorem.
Let G be a real Lie group. We would like to find a complex Lie group G C which extends G in some meaningful way. Definition 5.3.1.
The complexification of a real Lie group $G$ is a complex Lie group $G_{\mathbb{C}}$, together with an analytic map $G \to G_{\mathbb{C}}$, such that the Lie algebra of $G_{\mathbb{C}}$ is $\mathfrak{g}_{\mathbb{C}} = \mathfrak{g} \otimes_{\mathbb{R}} \mathbb{C}$ and $G_{\mathbb{C}}$ is universal in the following sense: if $H$ is a complex Lie group and $\varphi: G \to H$ is a smooth homomorphism, then there exists a unique holomorphic homomorphism $G_{\mathbb{C}} \to H$ making the appropriate diagram commute.

Remark 5.3.2.
Note that not all Lie groups admit a complexification. In fact, the double (universal) cover of $SL_2(\mathbb{R})$ does not admit a complexification. Even if a complexification exists, it is not necessarily unique up to isomorphism. The following theorem gives us another convenient property of compact groups: they always admit a complexification!

Theorem 5.3.3.
Let $K$ be a compact Lie group. Then $K$ admits a complexification which is unique up to isomorphism.

Proof.
See [Kna05a, Theorem 4.69 and Proposition 7.5].

It turns out that the finite-dimensional complex representations of a compact Lie group $K$ are in bijective correspondence with the finite-dimensional holomorphic representations of $K_{\mathbb{C}}$. Irreducibility need not be preserved by restriction. We now come to arguably the most important decomposition of complex Lie algebras and the Lie groups associated to them. It is responsible for nearly all of the structure theory for semisimple Lie groups.

Theorem 5.3.4 (Iwasawa Decomposition). Let $\mathfrak{g}$ be a real semisimple Lie algebra and $G$ a connected Lie group with Lie algebra $\mathfrak{g}$. Then there exist Lie subalgebras $\mathfrak{k}, \mathfrak{a}, \mathfrak{n}$ and associated analytic subgroups $K, A, N$, such that
$$\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{a} \oplus \mathfrak{n} \qquad \text{and} \qquad G = KAN$$
where $K$ is compact, $A$ is abelian, and $N$ is nilpotent, and similarly for the Lie algebras.

Proof.
For the Lie algebra decomposition, let $(\mathfrak{g}, \theta)$ be a semisimple Lie algebra together with a Cartan involution. Put $\mathfrak{g} = \mathfrak{k} \oplus \mathfrak{p}$, the associated Cartan decomposition, and let $\mathfrak{h}_\mathfrak{p}$ be a maximal abelian subspace of $\mathfrak{p}$. As $\mathfrak{h}_\mathfrak{p}$ is maximal abelian, we can simultaneously diagonalize all the operators $\operatorname{ad} H$, $H \in \mathfrak{h}_\mathfrak{p}$. Let
$$\mathfrak{g}_\lambda = \{ X \in \mathfrak{g} : [X, H] = \lambda(H) X \ \forall H \in \mathfrak{h}_\mathfrak{p} \}, \qquad \lambda \in \mathfrak{h}_\mathfrak{p}^*$$
Notice that $\theta(\mathfrak{g}_\lambda) = \mathfrak{g}_{-\lambda}$. Pick an ordering on $\mathfrak{h}_\mathfrak{p}^*$ and let $\mathfrak{n} = \bigoplus_{\alpha > 0} \mathfrak{g}_\alpha$. Since $\mathfrak{h}_\mathfrak{p}$ is $\theta$-invariant and maximal abelian, we have that $\mathfrak{g}_0 = (\mathfrak{g}_0 \cap \mathfrak{k}) + \mathfrak{h}_\mathfrak{p}$. Now if $X \in \bigoplus_{\alpha < 0} \mathfrak{g}_\alpha$, we can write it as $X = (X + \theta(X)) - \theta(X)$, which decomposes $X$ into a piece of $\mathfrak{k}$ and a piece of $\mathfrak{n}$. Therefore, we have a decomposition $\mathfrak{g} = \mathfrak{k} + \mathfrak{h}_\mathfrak{p} + \mathfrak{n}$. Applying $\theta$ we conclude that this decomposition is direct. For the Lie group decomposition see [Hel78].

This theorem also holds in the complex case. There is some slight modification that needs to be done to the proof above, but the big steps are identical.

Example 5.3.5. (a) Let $\mathfrak{g} = \mathfrak{sl}_n(\mathbb{R})$. $SO(n) \hookrightarrow SL_n(\mathbb{R})$ is a maximal compact subgroup and therefore $\mathfrak{so}(n)$ is the corresponding compact Lie algebra. Let $\mathfrak{a}$ be the traceless diagonal matrices and $\mathfrak{n}$ the strictly upper triangular matrices. Then
$$\mathfrak{sl}_n(\mathbb{R}) = \mathfrak{so}(n) \oplus \mathfrak{a} \oplus \mathfrak{n}$$
We can equivalently realize this on the group level as $SL_n(\mathbb{R}) = SO(n) \cdot T \cdot N$, where $N$ is the group of upper triangular unipotent matrices and $T$ is the maximal torus. Notice that this is equivalent to the Gram-Schmidt orthogonalization of a matrix in $SL_n(\mathbb{R})$.

Now let us consider the Cartan decomposition $\mathfrak{sl}_n(\mathbb{R}) = \mathfrak{so}(n) \oplus \mathfrak{p}$ where $\mathfrak{p}$ is the space of symmetric matrices. Notice that $\mathfrak{so}(n)$ appears in both decompositions, yet in the Cartan decomposition we have no Lie algebra structure on $\mathfrak{p}$. This should not be surprising, however, as both decompositions are only equivalences of vector spaces.

(b) Now consider $\mathfrak{sp}_n(\mathbb{C})$.
We have that
$$\mathfrak{k} = \left\{ \begin{pmatrix} U & V \\ -\bar{V} & \bar{U} \end{pmatrix} : U \text{ skew-Hermitian}, \ V \text{ symmetric} \right\}$$
Similar to $\mathfrak{sl}_n$ we have $\mathfrak{a} = \left\{ \begin{pmatrix} A & 0 \\ 0 & -A \end{pmatrix} : A \text{ a real diagonal matrix} \right\}$, which are diagonal matrices, and the nilpotent Lie algebra consists of upper triangular matrices, which we can decompose further as
$$\mathfrak{n} = \left\{ \begin{pmatrix} Z_1 & Z_2 \\ 0 & -Z_1^T \end{pmatrix} : Z_1 \text{ strictly upper triangular}, \ Z_2 \text{ symmetric} \right\}$$
Then $\mathfrak{sp}_n(\mathbb{C}) = \mathfrak{k} \oplus \mathfrak{a} \oplus \mathfrak{n}$.

Theorem 5.3.6.
Let $G$ be the complexification of a compact Lie group $K$, and $T \subseteq K$ a maximal torus with complexification $T_{\mathbb{C}}$. Let $\mathfrak{g} = \mathfrak{k}_{\mathbb{C}}$ be the complexified Lie algebra of $\mathfrak{k}$ and $\mathfrak{t}_{\mathbb{C}}$ the Lie algebra of $T_{\mathbb{C}}$. Denote the set of roots of $\mathfrak{g}$ with respect to $\mathfrak{t}_{\mathbb{C}}$ by $\Delta$. Fix an ordering on $\mathfrak{t}_{\mathbb{C}}^*$ and write $\Delta^+$ for the set of positive roots. Put $\mathfrak{n} = \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_\alpha$ and let $\mathfrak{b} = \mathfrak{t}_{\mathbb{C}} \oplus \mathfrak{n}$. If we denote $N = \exp(\mathfrak{n})$ and $B = T_{\mathbb{C}} N$, then $N$ and $B$ are closed subgroups of $G$. Further, there exists $n > 0$ and an embedding $G \hookrightarrow GL_n(\mathbb{C})$ such that $K$ consists of unitary matrices, $T_{\mathbb{C}}$ consists of diagonal matrices, and $B$ consists of upper triangular matrices.

Proof.
Then N is characterized as the set of A ∈ GL n ( C ) such that λ i ( log ( g )) = N as a closed subgroup (sub-variety) of GL n ( C ) .Now, since [ t C , n ] ⊆ n , we know that T C normalizes N and thus B = T C N is a closesubgroup of GL n ( C ) . Further its Lie algebra is b by construction. This completes theproof. 146he group B is a bit too big for the Iwasawa decomposition of G above. Let a = i t .It is the Lie algebra of a closed, connected subgroup A or T . If we embed K and G into GL n ( C ) , then T is the group of diagonal matrices and A is the group of diagonal matriceswith positive real entries. Put B = AN . Then by the Iwasawa decomposition G = KB as a direct product. Corollary 5.3.7.
Let $K$ be a compact Lie group and $T$ a maximal torus. If we denote by $G$ the complexification of $K$, then there is a bijection $K/T \cong G/B$ where $B = T_{\mathbb{C}} N$. This gives $K/T$ the structure of a complex manifold.

Proof.
From the Iwasawa decomposition, we have that $G = KB$ with $B \cap K = T$. Note that this decomposition is not direct, as $\mathfrak{b} + \mathfrak{k} = \mathfrak{g}$ is not a direct sum. We thus obtain a diffeomorphism
$$G/B \to K/T$$
which is $K$-equivariant. Now, as $G$ is a complex Lie group and $B$ is a complex analytic submanifold, the quotient $G/B$ has the structure of a complex manifold. Further, the action of $K$ on $K/T$ is via holomorphic maps.

As we will see later, the proof of the Borel-Weil theorem uses the Iwasawa decomposition in a fundamental way. In fact, nearly all of the structure theory for semisimple Lie groups is due to the Iwasawa decomposition.

Definition 5.4.1.
Let M be a complex manifold. We call a triple (E, π, V), consisting of a complex manifold, a holomorphic projection map, and a complex vector space, a holomorphic vector bundle of rank dim V over M if:

(a) π : E → M is surjective and a local isomorphism.
(b) There exist biholomorphic local trivializations π⁻¹(U) → U × V.
(c) Each fibre π⁻¹(p) ≅ {p} × V ≅ V is endowed with a vector space structure.

Similarly, we could have defined vector bundles as E = ⨿_{p ∈ M} V_p, where V_p = {p} × V. In this sense, we see that TM and T*M are vector bundles over smooth manifolds. Similar to those, Γ(M, E) is an O_M-module. The main purpose of this section is to understand transformations on bundles and transformations between them. Definition 5.4.2.
Let (E, π) and (E′, π′) be holomorphic vector bundles over M and M′ respectively. Then a holomorphic bundle homomorphism is a map F : E → E′ such that there exists a map f : M → M′ making the following diagram commute:

    E ---F--> E′
    |π        |π′
    v         v
    M ---f--> M′

Proposition 5.4.3. If F is holomorphic, then f is holomorphic.

Proof. f = π′ ∘ F ∘ ζ, where ζ is the zero section. This is a composition of holomorphic maps and is therefore holomorphic.

This lets us define a category Bun_H(M) whose objects are holomorphic vector bundles over M and whose morphisms are holomorphic bundle homomorphisms. The forgetful functor U : Bun_H(M) → Man_C (with Man_C the category of complex manifolds) is faithful. In general, it is not full, as there exist holomorphic maps E → E′ which do not commute with the projection maps. We will denote by Bun_H(M)_{<∞} the category of finite rank vector bundles. In some more recent treatments of this material (say, in [Wed16]) this category is treated as finite locally free sheaves over M. This is not useful for the theory presented below. Example 5.4.4.
Let TM denote the real tangent bundle of the complex manifold M. It is a real vector bundle of rank 2 dim M. The complex structure on M induces an almost complex structure J on TM, that is, an endomorphism J : TM → TM such that J² = −1. This can be extended to an endomorphism TM ⊗ C → TM ⊗ C defined on fibres by J(X + iY) = J(X) + iJ(Y). As J² = −1, we get a decomposition of TM ⊗ C into two eigenspaces for J corresponding to the eigenvalues i and −i:

TM ⊗ C = TM_i ⊕ TM_{−i}.

Then TM_i is the holomorphic tangent bundle to M. The bundle TM_{−i} is called the anti-holomorphic tangent bundle.

If E and E′ are holomorphic vector bundles on a complex manifold M, denote their spaces of holomorphic sections by Γ(E) and Γ(E′). If F : E → E′ is a bundle homomorphism, it induces a map F̃ : Γ(E) → Γ(E′) given by F̃(σ)(p) = F(σ(p)). Because a bundle homomorphism is linear on fibres, F̃ is C-linear on sections.

We now want to construct some holomorphic vector bundles on a complex Lie group and on complex homogeneous spaces G/H. Proposition 5.4.5.
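The simplest instance of this eigenspace splitting, on R² = C with the standard almost complex structure, can be verified directly (a small illustrative computation, not taken from the text):

```python
import numpy as np

# Standard almost complex structure on R^2 = C: J(x, y) = (-y, x), J^2 = -1.
# Complexifying, C^2 splits into the +i and -i eigenspaces of J, spanned by
# (1, -i) and (1, i) (the d/dz and d/dzbar directions).
J = np.array([[0.0, -1.0], [1.0, 0.0]])
assert np.allclose(J @ J, -np.eye(2))

v_plus = np.array([1.0, -1.0j])    # eigenvector for eigenvalue +i
v_minus = np.array([1.0, 1.0j])    # eigenvector for eigenvalue -i
assert np.allclose(J @ v_plus, 1j * v_plus)
assert np.allclose(J @ v_minus, -1j * v_minus)
```

Fibrewise, the decomposition TM ⊗ C = TM_i ⊕ TM_{−i} is exactly this eigenspace splitting carried out at every point.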
Let G be a complex Lie group and (π, W) a complex representation of a closed subgroup H. Then there exists a holomorphic vector bundle V over G/H such that G acts on the space of sections.

Proof. The canonical map G → G/H is a principal H-bundle. Any complex representation π : H → GL(W) induces an action of H on the space G × W by (g, w)·h = (gh, π(h⁻¹)w). Put V = G ×_H W = (G × W)/H. Then [gh, w] = [g, π(h)w] ∈ V. The map q : V → G/H given by [g, w] ↦ gH is well defined, surjective, and q⁻¹(gH) ≅ W. This is a fibre bundle with transition maps given by the transition maps for the principal bundle. Further, as the fibres are complex vector spaces and the canonical map is holomorphic, we have that V is a holomorphic vector bundle over G/H. Let Γ(G/H, V) denote the set of sections s : G/H → V. We can identify

Γ(G/H, V) ≅ F_{H,π} := { f : G → W | f(gh) = π(h)⁻¹ f(g) }.

Then G acts on this space by (g·f)(x) = f(g⁻¹x). This completes the proof.

Even for one-dimensional representations χ of H, the space F_{H,χ} is unbelievably massive. We may hope that if we restrict to some subset (say, impose more restrictions on f ∈ F_{H,χ}), then we may be able to get a handle on what these representations are. As it will turn out in the next section, we can restrict ourselves to holomorphic sections of V. This restriction will turn out to be enough to realize all of the finite dimensional irreducible representations of K a compact Lie group and G = K_C its complexification.

In this short subsection, we shall show that there is some interesting geometry happening behind the scenes here involving the quotients G/B, or more generally G/P for any closed subgroup containing B. This is done through the language of flag manifolds. Before we get to flag manifolds, we need to discuss the Grassmann manifolds (also called Grassmannians). Definition 5.4.6.
Let V be a real (or complex) vector space of dimension n. The Grassmannian of k-planes in V is the set of all k-dimensional subspaces of V and is denoted Gr(k, V).

Let G = Aut(V) be the group of automorphisms of V. By choosing a basis for V, we can identify Aut(V) ≅ GL_n(R) (resp. GL_n(C)). Now, let A and A′ be two different elements of Gr(k, V). By choosing bases of A and A′ and extending these to full bases of V, we can find a matrix X ∈ GL_n(R) such that XA = A′. Therefore, GL_n(R) acts transitively on Gr(k, V). Let {v_1, ..., v_n} be the basis of V and S = Span_R{v_1, ..., v_k} the standard k-plane. Then the isotropy subgroup of S is the closed subgroup

H = { ( P Q ; 0 R ) : P ∈ GL_k(R), Q ∈ M_{k,n−k}(R), R ∈ GL_{n−k}(R) }

of block upper-triangular matrices (written in block rows). This gives an identification Gr(k, V) = GL_n(R)/H. We call H a parabolic subgroup of G. This exhibits Gr(k, V) as a real (resp. complex) manifold.

Now let (n_1, ..., n_j) ∈ Z^j, j ≤ n, be an increasing tuple of integers with n_j = n = dim V. A flag of type (n_1, ..., n_j) is a chain of subspaces

0 = V_0 ⊆ V_1 ⊆ V_2 ⊆ ... ⊆ V_j = V

with dim V_i = n_i. Equivalently, we could require that dim V_i/V_{i−1} = n_i − n_{i−1}. A full flag corresponds to the tuple (1, 2, 3, ..., n), and thus a chain

0 = V_0 ⊆ V_1 ⊆ ... ⊆ V_n = V with dim V_i/V_{i−1} = 1.

Definition 5.4.7. The partial flag manifold of type (n_1, ..., n_j) is the collection of all flags of type (n_1, ..., n_j) in V and is denoted Fl(n_1, ..., n_j; V). The full flag manifold of V will be denoted Fl(V).

By choosing a basis for V and thus identifying it with R^n, we have a natural action of GL_n(R) on Fl(n_1, ..., n_j; V). Now, let F and F′ be two distinct flags. There exists X ∈ GL_n(R) such that XF = F′, and the action is transitive. The stabilizer of F is a closed subgroup P of GL_n(R), and we identify Fl(n_1, ..., n_j; V) = G/P. This exhibits Fl(n_1, ..., n_j; V) as a smooth manifold. The stabilizer of the standard full flag is the subgroup B of upper-triangular matrices. Thus Fl(V) = GL_n(R)/B. Remark 5.4.8.
The groups P and B are called the standard parabolic and standard Borel subgroups respectively. An alternative definition of the standard Borel subgroup is as a standard minimal parabolic subgroup. We call the conjugates of B Borel subgroups and the conjugates of P parabolic subgroups. Notice that every parabolic subgroup contains a Borel subgroup.

In the case of a complex vector space, we see that Fl(V) = GL_n(C)/B. By Corollary 5.3.7, we can realize Fl(V) = K/T for K = U(n). In more generality, for a connected Lie group C, there exists a maximal torus S and the quotient space C/S is a flag manifold.

Example 5.4.9. (a) As seen above, if V is a complex vector space then Gr(k, V) is a flag manifold corresponding to the tuple (k, n) ∈ Z². It is realized as the quotient GL_n(C)/H with H the complex analog of the group defined above.
(b) Let CP^n (or P^n(C)) denote the orbit space (C^{n+1} − {0})/C^×. This is realized as the space of all lines in C^{n+1}. In the language we have seen above, we can realize this as Gr(1, C^{n+1}). Definition 5.4.10.
Let G be a complex connected Lie group and H a closed subgroup. Then G/H is a complex homogeneous space. Let p : V → G/H be a holomorphic vector bundle. V is homogeneous if the group of bundle automorphisms acts transitively on the set of fibres of V. We call V homogeneous with respect to G if the G-action on G/H lifts to a G-action on V by bundle automorphisms. We will sometimes refer to these as G-homogeneous vector bundles.

Let us now characterize all vector bundles on flag manifolds which are homogeneous with respect to K_C. Proposition 5.4.11.
Let K be a compact, connected Lie group and G its complexification. Let (π, W) be a representation of a parabolic subgroup P ⊆ G. Then this gives rise to a holomorphic vector bundle over the partial flag manifold G/P which is homogeneous with respect to G. Further, every holomorphic vector bundle which is homogeneous with respect to G arises in this way.

Proof.
The existence of such a vector bundle was proven in Proposition 5.4.5. The homogeneity condition is readily checked. Therefore, we shall show that every homogeneous vector bundle arises in this way. Let V be a G-homogeneous vector bundle and V_P the fibre p⁻¹(P). V_P comes naturally equipped with the structure of a representation P → Aut(V_P). The map µ : G × V_P → V defined by µ(g, z) = g·z is surjective, as G acts transitively on G/P. The fibres of µ are precisely the P-orbits on G × V_P via the diagonal action (g, z) ↦ (gp⁻¹, p·z). Therefore, we may represent any element uniquely as an equivalence class [g, z], where [gp, z] = [g, p·z]. Hence, we can make the identification V = G ×_P V_P. This completes the proof.

We will motivate the theorem by starting with some facts about G = GL_n(C). The natural action of G on C^n − {0} commutes with the action of C^× and therefore descends to an action on CP^{n−1}. Moreover, this action is transitive. Now, the isotropy subgroup in G of the class [0 : ... : 0 : 1] consists of all g ∈ G such that g·(0, ..., 0, 1)^T = (0, ..., 0, λ)^T for some λ ∈ C^×. Let Q be this group. Then

Q = { ( A 0 ; w^T λ ) } ∩ GL_n(C)

with λ ∈ C, w ∈ C^{n−1}, and A ∈ M_{n−1}(C) (block matrices written in block rows). Then Q is a complex subgroup of G, as its Lie algebra is complex. Therefore the quotient G/Q becomes a complex manifold which is biholomorphic to CP^{n−1}.

Now fix N ≥ 0 and let χ : Q → C^× be the character of Q of the form

χ( ( A 0 ; w^T λ ) ) = λ^{−N}.

Then χ induces a holomorphic action of Q on C by q·z = χ(q)z. Using this, we can build the associated bundle G ×_Q C → G/Q in the style of the previous section. Now, per the proof of Proposition 5.4.5, we can identify the C^∞ sections of this bundle with the space of functions

F^∞_{Q,χ} = { f : G → C | f(gq) = χ(q)⁻¹ f(g), f smooth }.

Now, let V_N be the space of homogeneous polynomials of degree N in n complex variables. Then for any f ∈ V_N define φ_f(g) = f(g·e_n), where e_n = (0, ..., 0, 1)^T. For q ∈ Q we have q·e_n = λe_n, so that

φ_f(gq) = f(gq·e_n) = f(λ g·e_n) = λ^N φ_f(g) = χ(q)⁻¹ φ_f(g).

Therefore φ_f ∈ F^∞_{Q,χ}. In fact, this function is holomorphic, and therefore φ_f ∈ F^Hol_{Q,χ}, the space of holomorphic sections. For the rest of this section, let ℓ = e_n. Proposition 5.5.1.
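The defining equivariance φ_f(gq) = λ^N φ_f(g) of these sections can be verified numerically. In the sketch below the sizes, the monomial f, and the numpy dependency are illustrative assumptions, not from the text.

```python
import numpy as np

n, N = 3, 4
f = lambda z: z[0] ** 2 * z[1] * z[2] + z[2] ** 4  # homogeneous of degree N = 4

def phi(f, g):
    # phi_f(g) = f(g e_n): applying g to e_n picks out the last column of g
    return f(g[:, -1])

g = np.random.randn(n, n) + 1j * np.random.randn(n, n)
lam = 0.7 - 1.2j
q = np.random.randn(n, n) + 1j * np.random.randn(n, n)
q[:-1, -1] = 0.0   # force q into Q: its last column becomes (0, ..., 0, lam)^T
q[-1, -1] = lam    # so q . e_n = lam e_n
assert np.isclose(phi(f, g @ q), lam ** N * phi(f, g))
```

The only property used is homogeneity of f, so the same check passes for any f in V_N.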
The only holomorphic sections of G ×_Q C → G/Q are those φ_f.

Proof. Let φ : G → C be the function corresponding to a holomorphic section of the bundle. We want to define a polynomial P(z_1, ..., z_n) on C^n − {0}. Let g ∈ G be such that gℓ = (z_1, ..., z_n)^T. Then define P(z_1, ..., z_n) = φ(g). To see this is well-defined, let g′ be another element of G satisfying g′ℓ = (z_1, ..., z_n)^T. Then g⁻¹g′ fixes ℓ = e_n and is therefore an element q of Q with λ = 1, so χ(q) = 1. Writing g′ = gq, we have that φ(g′) = χ(q)⁻¹φ(g) = φ(g), and P is well-defined. Moreover, by construction P is homogeneous of degree N. Since we can define P using open sets of G, we have that P is holomorphic on C^n − {0}. The homogeneity condition implies that P is bounded near 0; hence, P admits a holomorphic extension to C^n. Now, the C^∞ behaviour combined with the homogeneity implies that |P(z)| ≤ C|z|^N, and similarly |∂^α_z P(z)| ≤ C_α |z|^{N−|α|} for any multi-index α and z ∈ C^n − {0}. If |α| > N, then ∂^α P vanishes at ∞ and, by Liouville's theorem, is 0. Therefore, the Taylor expansion of P about 0 vanishes in all degrees > N. Hence, P is a polynomial.

This implies that the representation of G on V_N can be realized as the space of sections F^Hol_{Q,χ}. In different terminology, we say that V_N = Ind^G_Q(χ) is the induced representation of G from the representation χ of Q. Now, we can turn to the general situation.

Let K be a compact Lie group with maximal torus T. If G = K_C is the complexification, then the Iwasawa decomposition implies that G = KAN̄ and B = T_C N̄, where N̄ consists of the lower-triangular unipotent matrices. Then by Corollary 5.3.7, we know that G/B ≅ K/T and both are complex manifolds. For any character λ : T → C^×, we can extend λ to a character of T_C and then to B by declaring χ(n̄) =
1. Therefore, we get two line bundles G ×_B C ≅ K ×_T C which are isomorphic as complex manifolds.

Theorem 5.5.2 (Borel-Weil). Let K be a compact, connected Lie group and T ⊆ K a maximal torus. Let G = K_C be the complexification and B = MAN̄ a Borel subgroup. Then the irreducible finite dimensional representations of K stand in one-to-one correspondence with the dominant, analytically integral weights λ ∈ t*, with the correspondence given by

λ ↦ Γ_H(K/T, L_λ) ≅ F^Hol_{B,χ_λ},

where Γ_H(K/T, L_λ) denotes the set of holomorphic sections of the bundle and

F^Hol_{B,χ_λ} = { f : G → C | f(gb) = χ_λ(b)⁻¹ f(g), f holomorphic },

with χ_λ the character of B associated to the analytically integral weight λ.

We present a combination of the proofs presented in [Kna86], [Hel78], and [Hel08]. The proof will proceed in two main steps: 1) show that Γ_H(K/T, L_λ) is finite dimensional, and 2) show that it is irreducible. Throughout the proof, we shall make use of the isomorphism Γ_H(K/T, L_λ) → F^Hol_{T,χ_λ} ≅ F^Hol_{B,χ_λ}. Remark 5.5.3.
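As a sanity check of the statement, specializing to K = SU(2) recovers the V_N construction above. The identifications below are standard but not spelled out in the text, so they should be read as an illustrative aside:

```latex
% K = SU(2), T = the diagonal torus, G = K_C = SL_2(C), B the lower-triangular
% Borel subgroup. The dominant analytically integral weights are labelled by
% integers N >= 0, and K/T \cong G/B \cong \mathbb{CP}^1. Borel-Weil gives
\Gamma_H(\mathbb{CP}^1, L_N) \;\cong\; F^{\mathrm{Hol}}_{B,\chi_N}
  \;\cong\; V_N
  = \{\text{homogeneous polynomials of degree } N \text{ in } z_1, z_2\},
% a space of dimension N + 1: exactly the list of irreducible
% finite-dimensional representations of SU(2), one in each dimension.
```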
Another way of thinking about this theorem is as a classification result for various sheaves on the flag varieties (manifolds) Fl(C^n). Every finite rank vector bundle on Fl(C^n) corresponds to a finite locally free sheaf, with the correspondence given by taking global sections. The theorem above classifies all of the line bundles (considered as sheaves) on Fl(C^n) which admit global sections.

The Lie algebra of G has a Cartan decomposition g = k ⊕ ik corresponding to the Cartan involution θ : g → g. Let Θ be the corresponding involution of G. This gives an Iwasawa decomposition g = k ⊕ a ⊕ n̄. Pick a maximal abelian subalgebra a of ik and form m = Z_k(a), the centralizer of a in k. Then m is a Cartan subalgebra of k and m_C = a ⊕ m. With respect to the roots ∆(k_C, m_C), put

b = m ⊕ a ⊕ ⊕_{α ∈ ∆⁺} (k_C)_{−α}.

Then B = MAN̄ is the corresponding Iwasawa decomposition of the Borel subgroup. Now, let λ ∈ t* be a dominant, analytically integral weight and (Φ_λ, V) the irreducible, finite dimensional highest weight representation of K with highest weight λ. Let v_λ ∈ V be a highest weight vector. This representation extends to a holomorphic representation (also denoted Φ_λ) of G via the universal property of the complexification. For each v ∈ V, define a function ψ_v(x) on G by

ψ_v(x) = (Φ_λ(x)⁻¹ v, v_λ),

where ( , ) is the inner product on V induced via the isomorphism with C^n. Lemma 5.5.4.
For each v ∈ V, ψ_v ∈ F^Hol_{B,χ_λ}. Moreover, if L denotes the left regular action, then L(k)ψ_v = ψ_{Φ_λ(k)v}, and the collection {ψ_v : v ∈ V} is an irreducible subrepresentation of F^Hol_{B,χ_λ} which is equivalent to Φ_λ.

Proof of Lemma 5.5.4. Let ϕ_λ be the differential of Φ_λ. Since Φ_λ is unitary on V, ϕ_λ is skew-hermitian on k and complex-linear on g. Therefore, ϕ_λ(θX) = −ϕ_λ(X)* and Φ_λ(Θx) = Φ_λ(x⁻¹)* for all X ∈ g and x ∈ G. Now if b = man̄ ∈ B = MAN̄, we have that for all x ∈ G

ψ_v(xman̄) = (Φ_λ(man̄)⁻¹ Φ_λ(x)⁻¹ v, v_λ)
           = (Φ_λ(x)⁻¹ v, Φ_λ(ma⁻¹n) v_λ)        (as Θ(man̄) = ma⁻¹n ∈ MAN)
           = (Φ_λ(x)⁻¹ v, Φ_λ(ma⁻¹) v_λ)          (as v_λ is a highest weight vector)
           = (Φ_λ(x)⁻¹ v, χ_λ(m) χ_λ(a)⁻¹ v_λ)    (as v_λ has weight λ)
           = χ_λ(m)⁻¹ χ_λ(a)⁻¹ (Φ_λ(x)⁻¹ v, v_λ)  (χ_λ(m) has unit modulus, χ_λ(a) > 0)
           = χ_λ(b)⁻¹ ψ_v(x).

Further, ψ_v is clearly holomorphic, as it is defined by a holomorphic representation. Hence, ψ_v ∈ F^Hol_{B,χ_λ}. Finally,

ψ_{Φ_λ(k)v}(x) = (Φ_λ(x)⁻¹ Φ_λ(k) v, v_λ) = (Φ_λ(k⁻¹x)⁻¹ v, v_λ) = ψ_v(k⁻¹x) = L(k)ψ_v(x).

This completes the proof of the lemma.

Now we wish to show that V → F^Hol_{B,χ_λ} is onto. Put ψ_λ := ψ_{v_λ} and F_λ := F^Hol_{B,χ_λ}. Lemma 5.5.5.
Let F ∈ F_λ. Then

∫_M F(mxm⁻¹) dm = F(1) ψ_λ(x)

for all x ∈ G. (Here dm is the normalized Haar measure on M.)

The idea of the proof is to show that the left side is a multiple of F(1) independent of F. This multiple is a power series in x and, evaluating at F = ψ_λ, we see that they are equal near 1. By holomorphicity, the functions are thus equal everywhere. Proof of Lemma 5.5.5.
Let X ∈ g and X̃ the corresponding left invariant vector field on G. Since F is holomorphic, it is real-analytic, and thus the Taylor series of F converges to F in a neighbourhood of 1. Thus

F(exp X) = ∑_n (1/n!) (X̃ⁿ F)(1).

Conjugating by m and integrating, we see that

∫_M F(m exp(X) m⁻¹) dm = ∑_n (1/n!) ( { ∫_M Ad(m) X̃ⁿ dm } F )(1).

Now let {X_α, H_α, X_{−α}} be a basis of g with respect to a positive choice of roots. Writing X in terms of this basis and expanding, we get integrals of monomials. The coefficients can be factored out, as X̃ is complex-linear as an endomorphism of F_λ. Now, by the Poincaré-Birkhoff-Witt Theorem, we can rewrite the expression as a linear combination of Ad(m) applied to monomials of the form X^{i_1}_{−α_1} ⋯ X^{i_p}_{−α_p} H^{j_1}_{α_1} ⋯ H^{j_q}_{α_q} X^{k_1}_{α_1} ⋯ X^{k_r}_{α_r}. The integral of each monomial is now Ad(m)-invariant and a monomial. If this new monomial has no X_{−α} term for α ∈ ∆⁺, then by Ad(m)-invariance it cannot have any X_α term. On the other hand, any Ad(m)-invariant polynomial cannot have any X_{−α} terms, as the vector field X̃_{−α} annihilates F (exp(tX_{−α}) ∈ N̄ and χ_λ is trivial on N̄). Hence, all of the Ad(m)-invariant polynomials lie in U(m_C), and as exp(m_C) ⊆ MA ⊆ B, each member of U(m_C) acts by scalars depending only on λ. Hence, any expression of the form H^{j_1}_{α_1} ⋯ H^{j_n}_{α_n} F(1) is a scalar multiple of F(1) independent of F. This implies the lemma.

Now we can prove Theorem 5.5.2 in a few easy steps. Proof of Theorem 5.5.2.
Define an inner product on F_λ by

⟨F_1, F_2⟩ = ∫_K F_1(k) \overline{F_2(k)} dk.

Claim 5.5.6. |F(1)| ≤ ||ψ_λ||⁻¹ · ||F||.

In fact,

||F||² = ∫_K |F(k)|² dk = ∫_K |F(mkm⁻¹)|² dk = ∫_K ∫_M |F(mkm⁻¹)|² dm dk
       ≥ ∫_K | ∫_M F(mkm⁻¹) dm |² dk = |F(1)|² ∫_K |ψ_λ(k)|² dk = |F(1)|² ||ψ_λ||².

As a direct corollary of this, for every compact E ⊆ G, there exists a C_E < ∞ such that

|F(x)| ≤ C_E ||F||

for all F ∈ F_λ and x ∈ E. Therefore, F_λ is complete (Cauchy sequences converge on compact sets by the previous line, and their limit is holomorphic and satisfies the desired relation). Now, F_λ is finite-dimensional, as it is a locally compact Banach space.

It remains to be shown that F_λ is irreducible as a representation of K. Let U ⊆ F_λ be a closed, invariant subspace. Then for F ≠ 0 in U, by applying some L(k), we can assume that F(1) ≠ 0. Therefore by completeness

∫_M χ_λ(m) L(m) F dm

is an element of U. However, Lemma 5.5.5 says that this is equal to F(1)ψ_λ. Hence, ψ_λ ∈ U. Similarly, if U⊥ were non-zero, we would see that ψ_λ ∈ U⊥. Since U ∩ U⊥ = 0, this is a contradiction, and thus U = F_λ; hence F_λ is an irreducible, finite-dimensional representation of K. By Lemma 5.5.4 the map V → F_λ is a K-equivariant isomorphism. This completes the proof.

This result shows us that we can derive some algebraic information from a geometric object. In the language of Chapter 3, this theorem can be restated as: H⁰(G/B, F_λ) ≠ 0 if and only if λ is dominant and analytically integral. In fact, a stronger form of this theorem due to Bott [FH04] says that the sheaf cohomology of the associated bundle is non-zero in only one degree. This surprising appearance of sheaf cohomology indicates that it may prove to be useful in understanding the sheaf G of Chapter 4, as well as in understanding C^∞(µ(−)) as a G-module. Some care needs to be taken here, as we do not know much about the category G-Mod. In fact, the case of O_X-modules for a locally ringed space may deviate highly from this situation in some critical ways; one being that there is no reason a priori that G_x is a local ring. We do not provide a resolution to this here, and thus there is still much work to be done.

One major deficit of the model in Chapter 4 is its dependence on the odor source representations being clean and precise. What should happen if an odor is presented in an environment which is particularly noisy? For example, consider a fox in the wilderness. If the fox is eating a meal, the odors are in high concentration and thus can be distinguished. If instead it is trotting along and the odor of rabbit wafts through the air, how may it determine what this odor is? There are clearly many other odors present in the second situation, which should make identification nearly impossible. This contradicts experimental and observational evidence, however!
We know that foxes can find their prey with minimal odor stimulation; this implies the existence of some mechanism which produces a "best guess" for what a given noisy odor may be. As we shall see below, there is a naive way of modeling such a problem which we conjecture is indeed the correct approach. This naive method relies on vector fields on S and generates an attractor basin for the various odors. This has been shown to have some relation to Čech cohomology, which can be viewed as a refinement of sheaf cohomology. This ties together all of the ideas presented. We will not go through the construction of Čech cohomology, as it is a bit involved and the main idea behind it is to serve as a computational tool for sheaf cohomology on suitably nice spaces (of which manifolds are examples).

In general, the theory of flows is a generalization of the theory of ordinary differential equations. Now, the equations are defined on manifolds by vector fields ξ : M → TM. We shall not treat the general case here but refer the reader to [Lee12]. Our situation is significantly eased, as R′ and therefore S are assumed to be diffeomorphic to open submanifolds of R^n, and therefore TR′ ≅ TS ≅ S × R^n. So there exist vector fields {V_1, ..., V_n} which span the tangent space at each s ∈ S. As a result, there exists a non-trivial vector field ξ which is complete, meaning that every trajectory can be given R as a domain. By a trajectory we mean a smooth map γ : R → S such that γ′(t) = ξ(γ(t)). In general, this is the solution of a differential equation, and these trajectories are called maximal.

Definition 5.6.1. Let ξ ∈ Γ(S, TS). The flow of ξ is the mapping θ : S × R → S given by (s, t) ↦ γ_s(t), where γ_s(0) = s and γ_s is the maximal trajectory.

We can use flows to understand noisy inputs into the olfactory system. Let K be the collection of s ∈ S such that s is a local maximum of the function f defining S.
That is, these are the "tops" of the peaks. Now define a smooth vector field on S which makes K an attractor. Then the attractor basin is the disjoint union of a finite number of contractible open sets. What we would like to know is that the attractor basin is a cover of S, so that any point can be drawn into one of the peaks and identified as in the classification scheme of Chapter 4. This would allow us to identify any noisy odor (one for which Ũ_x is particularly large) with some degree of accuracy. Sadly, this cannot be guaranteed, as a cover would rely on exposure to an enormous number of different odors (then we can assume that the U_x form a cover of R′ and thus the Ũ_x form a cover of S) or some nearly equivalent requirement. As a consolation, we can still identify noisy odors which fall within the attractor basin of the learned odors.

Let us connect the idea of flows to representations. Let X ∈ Γ(M, TM) be a complete smooth vector field and θ : R × M → M the associated flow. This is equivalent to defining an action of the Lie group (R, +) on M, and therefore a non-linear representation of R, here as diffeomorphisms of M. By differentiating this action, we get a non-linear Lie algebra representation R → X(M). As we can realize flows as solutions to certain non-linear partial differential equations, we can equivalently understand these solutions by understanding the corresponding representation on either the Lie group or Lie algebra level. This is one reason representation theory may play a key role in the further development of this theory and in understanding the identification of noisy odors.

Thanks for reading!

Bibliography

[AK18] A. J. Aqrabawi and J. C. Kim. Hippocampal projections to the anterior olfactory nucleus differentially convey spatiotemporal information during episodic odour memory. Nat Commun, 9(1):2735, 07 2018.
[AK20] A. J. Aqrabawi and J. C. Kim.
Olfactory memory representations are stored in the anterior olfactory nucleus. Nat Commun, 11(1):1246, Mar 2020.
[AM69] M. F. Atiyah and I. G. Macdonald. Introduction to Commutative Algebra. Advanced Book Program. Westview Press, 1969.
[AR18] D. Aschauer and S. Rumpel. The sensory neocortex and associative memory.
Current Topics in Behavioral Neurosciences, 37:177–211, 2018.
[ASC+04] N. M. Abraham, H. Spors, A. Carleton, T. W. Margrie, T. Kuner, and A. T. Schaefer. Maintaining accuracy at the expense of speed: stimulus similarity defines odor discrimination time in mice. Neuron, 44(5):865–876, Dec 2004.
[BC19] Ayon Borthakur and Thomas A. Cleland. Signal conditioning for learning in the wild. In Proceedings of the 7th Annual Neuro-Inspired Computational Elements Workshop, NICE ’19, New York, NY, USA, 2019. Association for Computing Machinery.
[BFC17] M. D. Berke, D. J. Field, and T. A. Cleland. The sparse structure of natural chemical environments. In , pages 1–3, May 2017.
[BLFL06] B. Bathellier, S. Lagier, P. Faure, and P. M. Lledo. Circuit properties generating gamma oscillations in a network model of the olfactory bulb. J. Neurophysiol., 95(4):2678–2691, Apr 2006.
[BMA+15] A. Banerjee, F. Marbach, F. Anselmi, M. S. Koh, M. B. Davis, P. Garcia da Silva, K. Delevich, H. K. Oyibo, P. Gupta, B. Li, and D. F. Albeanu. An Interglomerular Circuit Gates Glomerular Output and Implements Gain Control in the Mouse Olfactory Bulb.
Neuron , 87(1):193–207, Jul 2015.[Bre97] Glen E. Bredon.
Sheaf Theory , volume 170 of
Graduate Texts in Mathematics. Springer Science+Business Media LLC, 1997.
[Bum13] Daniel Bump.
Lie Groups , volume 225 of
Graduate Texts in Mathematics. Springer Science+Business Media New York, LLC, 2nd edition, 2013.
[BW13] S. Marc Breedlove and Neil V. Watson. Biological psychology: An introduction to behavioral, cognitive, and clinical neuroscience, 7th ed. Sinauer Associates, 2013.
[Car11] David Joseph Carchedi. Categorical Properties of Topological and Differentiable Stacks. PhD thesis, Utrecht University, 2011.
[CBC20] T. A. Cleland, A. Borthakur, and A. Calambur. TBD: Biological insights from engineered systems. Frontiers in Computational Neuroscience, In preparation, 2020.
[CCH+11] T. A. Cleland, S. Y. Chen, K. W. Hozer, H. N. Ukatu, K. J. Wong, and F. Zheng. Sequential mechanisms underlying concentration invariance in biological olfaction.
Front Neuroeng, 4:21, Nov 2011.
[CL05a] Thomas A. Cleland and Christiane Linster. Computation in the Olfactory System. Chemical Senses, 30(9):801–813, 11 2005.
[CL05b] Henri Cohen and Claire Lefebvre, editors. Handbook of Categorization in Cognitive Science. Elsevier, 2005.
[Cla19] J. P. Clapper. Graded similarity in free categorization. Cognition, 190:1–19, Sep 2019.
[Cle14] Thomas A. Cleland. Chapter 7 - construction of odor representations by olfactory bulb microcircuits. In Edi Barkai and Donald A. Wilson, editors, Odor Memory and Perception, volume 208 of Progress in Brain Research, pages 177–203. Elsevier, 2014.
[CMYL02] T. A. Cleland, A. Morse, E. L. Yue, and C. Linster. Behavioral models of odor similarity. Behav. Neurosci., 116(2):222–231, Apr 2002.
[CNB09] T. A. Cleland, V. A. Narla, and K. Boudadi. Multiple learning parameters differentially regulate olfactory generalization. Behav. Neurosci., 123(1):26–35, Feb 2009.
[Coo15] Bruce N. Cooperstein. Advanced Linear Algebra. Textbooks in Mathematics. CRC Press, 2nd edition, 2015.
[CPdLCPL+16] M. Chatterjee, F. Perez de Los Cobos Pallares, A. Loebel, M. Lukas, and V. Egger. Sniff-Like Patterned Input Results in Long-Term Plasticity at the Rat Olfactory Bulb Mitral and Tufted Cell to Granule Cell Synapse. Neural Plast., 2016:9124986, 2016.
[CPO18] Yubei Chen, Dylan M. Paiton, and Bruno A. Olshausen. The sparse manifold transform, 2018.
[CRC13] Jason B. Castro, Arvind Ramanathan, and Chakra S. Chennubhotla. Categorical dimensions of human odor descriptor space revealed by non-negative matrix factorization. PLoS ONE, 8(9):1–16, 09 2013.
[CS06] T. A. Cleland and P. Sethupathy. Non-topographical contrast enhancement in the olfactory bulb.
BMC Neurosci , 7:7, Jan 2006.[DF04] David S. Dummit and Richard M. Foote.
Abstract Algebra. John Wiley & Sons Inc., 3rd edition, 2004.
[DR08] W. Doucette and D. Restrepo. Profound context-dependent plasticity of mitral cell responses in olfactory bulb.
PLoS Biol. , 6(10):e258, Oct 2008.[EG79] Murray Eisenberg and Robert Guy. A proof of the hairy ball theorem.
The American Mathematical Monthly , 86(7):571–574, 1979.[EH00] David Eisenbud and Joe Harris.
The Geometry of Schemes, volume 197 of Graduate Texts in Mathematics. Springer Science+Business Media New York, 2000.
[ES12] S. Edelman and R. Shahbazi. Renewing the respect for similarity. Front Comput Neurosci, 6:45, 2012.
[ET10] G. Bard Ermentrout and David Terman. Mathematical Foundations of Neuroscience. Springer Science+Business Media, 2010.
[FBB+16] D. E. Frederick, A. Brown, E. Brim, N. Mehta, M. Vujovic, and L. M. Kay. Gamma and Beta Oscillations Define a Sequence of Neurocognitive Modes Present in Odor Processing.
J. Neurosci., 36(29):7750–7767, 07 2016.
[FBT+17] D. E. Frederick, A. Brown, S. Tacopina, N. Mehta, M. Vujovic, E. Brim, T. Amina, B. Fixsen, and L. M. Kay. Task-Dependent Behavioral Dynamics Make the Case for Temporal Integration in Multiple Strategies during Odor Processing.
J. Neurosci. , 37(16):4416–4426, 04 2017.[FF16] Anatoly Fomenko and Dmitry Fuchs.
Homotopical Topology, volume 273 of Graduate Texts in Mathematics. Springer International Publishing Switzerland, 2016.
[FH04] William Fulton and Joe Harris. Representation Theory: A First Course, volume 129 of Graduate Texts in Mathematics. Springer Science+Business Media New York, 2004.
[Fri15] P. Fries. Rhythms for Cognition: Communication through Coherence. Neuron, 88(1):220–235, Oct 2015.
[GH10] Robert L. Goldstone and Andrew T. Hendrickson. Categorical perception.
WIREs Cognitive Science , 1(1):69–78, 2010.[GP74] Victor Guillemin and Alan Pollack.
Differential Topology. American Mathematical Society Chelsea Publishing, 1974.
[GPCI15] Chad Giusti, Eva Pastalkova, Carina Curto, and Vladimir Itskov. Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44):13455–13460, 2015.
[GS09] Y. Gao and B. W. Strowbridge. Long-term plasticity of excitatory inputs to granule cells in the rat olfactory bulb. Nat. Neurosci., 12(6):731–733, Jun 2009.
[Har77] Robin Hartshorne.
Algebraic Geometry , volume 52 of
Graduate Texts inMathematics . Springer Science+Business Media LLC, 1977.[Har87] Stevan Harnad, editor.
Categorical Perception: The Groundwork of Cogni-tion . Cambridge University Press, 1987.[Hat01] Allen Hatcher.
Algebraic Topology . Cambridge University Press, 2001.[HC56] Harish-Chandra. Representations of semisimple lie groups, v.
AmericanJournal of Mathematics , 78(1):1–41, 1956.[HE96] Rachel S. Herz and Trygg Engen. Odor memory: review and analysis.
Psychonomic Bulletin & Review , 3(3):300–313, 1996.[Hel78] Sigurdur Helgason.
Differential Geometry, Lie Groups, and SymmetricSpaces , volume 34 of
Graduate Studies in Mathematics . American Math-ematical Society, 1978.[Hel08] Sigurder Helgason.
Geometric Analysis on Symmetric Spaces , volume 39 of
Mathematical Surveys and Monographs . American Mathematical Society,2008.[Her05] R. S. Herz. Odor-associative learning and emotion: effects on perceptionand behavior.
Chem. Senses , 30 Suppl 1:i250–251, Jan 2005.[HS97] P.J. Hilton and U. Stammbach.
A Course in Homological Algebra , volume 4of
Graduate Texts in Mathematics . Springer Science+Business Media NewYork, 1997.[HWK +
10] Rafi Haddad, Tali Weiss, Rehan Khan, Boaz Nadler, Nathalie Mandairon,Moustafa Bensafi, Elad Schneidman, and Noam Sobel. Global features ofneural activity in the olfactory system form a parallel code that predictsolfactory behavior and perception.
Journal of Neuroscience , 30(27):9017–9026, 2010. 162IC20] Nabil Imam and Thomas A. Cleland. Rapid online learning and robustrecall in a neuromorphic olfactory circuit.
Nature Machine Intelligence ,2:181–191, 2020.[IS98] J. S. Isaacson and B. W. Strowbridge. Olfactory reciprocal synapses: den-dritic signaling in the cns.
Neuron , 20(4):749–761, Apr 1998.[Ive86] Birger Iversen.
Cohomology of Sheaves . Universitext. Springer-VerlagBerlin Heidelberg, 1986.[Kas95] Christian Kassel.
Quantum Groups , volume 155 of
Graduate Texts in Math-ematics . Springer Science+Business Media New York, 1995.[Kay14] L. M. Kay. Circuit oscillations in odor perception and memory.
Prog.Brain Res. , 208:223–251, 2014.[KKER11] Alexei Koulakov, Brian Kolterman, Armen Enikolopov, and Dmitry Rin-berg. In search of the structure of human olfactory space.
Frontiers inSystems Neuroscience , 5:65, 2011.[Kna86] Anthony W. Knapp.
Representation Theory of Semisimple Groups: AnOverview Based on Examples (PMS-36) . Princeton University Press, rev- revised edition, 1986.[Kna88] Anthony K. Knapp.
Lie Groups, Lie Algebras, and Cohomology , volume 34of
Mathematical Notes . Princeton University Press, 1988.[Kna96] Anthony K. Knapp.
Lie Groups Beyond an Introduction , volume 140 of
Progress in Mathematics . Springer Sceince+Business Media, 1996.[Kna05a] Anthony K. Knapp.
Lie Groups Beyond an Introduction , volume 140 of
Progress in Mathematics . Springer Sceince+Business Media, 2nd edition,2005.[Kna05b] Anthony W. Knapp.
Advanced Real Analysis . Cornerstones in Mathemat-ics. Birkhäuser Boston, 2005.[Kna05c] Anthony W. Knapp.
Basic Real Analysis . Cornerstones in Mathematics.Birkhäuser Boston, 2005.[Kna06] Anthony W. Knapp.
Basic Algebra . Cornerstones in Mathematics.Birkhauser Boston, 2006.[Kna07] Anthony W. Knapp.
Advanced Algebra . Cornerstones in Mathematics.Birkhäuser Boston, 2007.[KS06] Masaki Kashiwara and Pierre Schapira.
Categories and Sheaves , volume332 of
Grundlehren der mathematischen Wissenschaften . Springer, 2006.163KSS +
10] F. Kermen, S. Sultan, J. Sacquet, N. Mandairon, and A. Didier. Consoli-dation of an olfactory memory trace in the olfactory bulb is required forlearning-induced survival of adult-born neurons and long-term memory.
PLoS ONE , 5(8):e12118, Aug 2010.[KSUM99] H. Kashiwadani, Y. F. Sasaki, N. Uchida, and K. Mori. Synchronized os-cillatory discharges of mitral/tufted cells with different molecular recep-tive ranges in the rabbit olfactory bulb.
J. Neurophysiol. , 82(4):1786–1792,Oct 1999.[Lan02] Serge Lang.
Algebra , volume 211 of
Graduate Texts in Mathematics .Springer Science+Business Media LLC, 3 edition, 2002.[LC13a] G. Li and T. A. Cleland. A two-layer biophysical model of choliner-gic neuromodulation in olfactory bulb.
J. Neurosci. , 33(7):3037–3058, Feb2013.[LC13b] Goushi Li and Thomas A. Cleland. A two-layer biophysical model ofcholinergic neuromodulation in olfactory bulb.
The Journal of Neuro-science , 33(7):3037–3058, 2013.[Lee11] John M. Lee.
Introduction to Topological Manifolds , volume 202 of
GraduateTexts in Mathematics . Springer Science+Business Media LLC, 2011.[Lee12] John M. Lee.
Introduction to Smooth Manifolds , volume 218 of
GraduateTexts in Mathematics . Springer Science+Business Media LLC, 2012.[LH99] C. Linster and M. E. Hasselmo. Behavioral responses to aliphatic aldehy-des can be predicted from known electrophysiological responses of mi-tral cells in the olfactory bulb.
Physiol. Behav. , 66(3):497–502, May 1999.[LKA +
20] M. Levinson, J. P. Kolenda, G. J. Alexandrou, O. Escanilla, D. M. Smith,T. A. Cleland, and C. Linster. Context-dependent odor learning requiresthe anterior olfactory nucleus.
Behav. Neurosci. , in press, 2020.[LMSW09] Christiane Linster, Alka V. Menon, Christopher Y. Singh, and Donald A.Wilson. Odor-specific habituation arises from interaction of afferentsynaptic adaptation and intrinsic synaptic potentiation in olfactory cor-tex.
Learning & Memory , 16(7):452–459, Jul 2009.[Lor18] Martin Lorenz.
A Tour of Representation Theory , volume 193 of
GraduateStudies in Mathematics . American Mathematical Society, 2018.[Mat86] Hideyuki Matsamura.
Commutative Ring Theory , volume 8 of
Cambridgestudies in advanced mathematics . Cambridge University Press, 1986.[MBB19] Ariella Y. Moser, Lewis Bizo, and Wendy Y. Brown. Olfactory generaliza-tion in detector dogs.
Animals: an Open Access Journal from MDPI , 9(9),Sep 2019. 164Mei15] M. Meister. On the dimensionality of odor space. eLife , 4:e07865, 2015.[Met03] David Metzler. Topological and Smooth Stacks. arXiv Mathematics e-prints , page math/0306176, Jun 2003.[MKC +
14] N. Mandairon, F. Kermen, C. Charpentier, J. Sacquet, C. Linster, andA. Didier. Context-driven activation of odor representations in the ab-sence of olfactory stimuli in the olfactory bulb and piriform cortex.
FrontBehav Neurosci , 8:138, 2014.[ML71] Saunders Mac Lane.
Categories for the Working Mathematician , volume 5of
Graduate Texts in Mathematics . Springer-Verlag New York Inc., 1971.[MLE +
09] Mélissa Moreno, Christiane Linster, Olga Escanilla, Joëlle Sacquet, AnneDidier, and Nathalie Mandairon. Olfactory perceptual learning requiresadult neurogenesis.
Proceedings of the National Academy of Sciences of theUnited States of America , 106:17980–5, 10 2009.[MNS81a] K. Mori, M. C. Nowycky, and G. M. Shepherd. Analysis of synapticpotentials in mitral cells in the isolated turtle olfactory bulb.
J. Physiol.(Lond.) , 314:295–309, May 1981.[MNS81b] K. Mori, M. C. Nowycky, and G. M. Shepherd. Electrophysiological anal-ysis of mitral cells in the isolated turtle olfactory bulb.
J. Physiol. (Lond.) ,314:281–294, May 1981.[MR07] X. Maio and R.P. Rao. Learning the lie groups of visual invariance.
NeuralComputations , 19:2665–2693, 2007.[MSN +
11] N. Mandairon, S. Sultan, M. Nouvian, J. Sacquet, and A. Didier. Involve-ment of newborn neurons in olfactory associative learning? The operantor non-operant component of the task makes all the difference.
J. Neu-rosci. , 31(35):12455–12460, Aug 2011.[Mun00] James R. Munkres.
Topology . Pearson, 2000.[NPLR14] A. Nunez-Parra, A. Li, and D. Restrepo. Coding odor identity and odorvalue in awake rodents.
Prog. Brain Res. , 208:205–222, 2014.[PGJSTH19] Fernanda Pérez-Gay Juárez, Tomy Sicotte, Christian Thériault, and Ste-van Harnad. Category learning can alter perception and its neural corre-lates.
PLoS ONE , 14(12):1–29, 12 2019.[RGMR18] D. Ramirez-Gordillo, M. Ma, and D. Restrepo. Precision of Classifica-tion of Odorant Value by the Power of Olfactory Bulb Oscillations Is Al-tered by Optogenetic Silencing of Local Adrenergic Innervation.
FrontCell Neurosci , 12:48, 2018. 165RKG06] D. Rinberg, A. Koulakov, and A. Gelperin. Speed-accuracy tradeoff inolfaction.
Neuron , 51(3):351–358, Aug 2006.[Rot88] Joseph J. Rotman.
An Introduction to Algebraic Topology , volume 119 of
Graduate Texts in Mathematics . Springer-Verlag New York Inc., 1988.[Rot09] Joseph J. Rotman.
An Introduction to Homological Algebra . Universitext.Springer Science+Business Media LLC, 2009.[Rot15] Joseph J Rotman.
Advanced Modern Algebra: Part 1 , volume 165 of
Gradu-ate Studies in Mathematics . American Mathematical Society, third edition,2015.[RPS +
13] J. P. Royet, J. Plailly, A. L. Saive, A. Veyrac, and C. Delon-Martin. Theimpact of expertise in olfaction.
Front Psychol , 4:928, Dec 2013.[Rya02] Raymond A. Ryan.
Introduction to Tensor Products of Banach Spaces .Springer Monographs in Mathematics. Springer-Verlag London, 2002.[SCT07] Richard J. Stevenson, Trevor I. Case, and Caroline Tomiczek. Resistanceto interference of olfactory perceptual learning.
The Psychological Record ,57:103–116, 2007.[SdCSL04] Armen Saghatelyan, Antoine de Chevigny, Melitta Schachner, andPierre-Marie Lledo. Tenascin-r mediates activity-dependent recruit-ment of neuroblasts in the adult mouse forebrain.
Nature Neuroscience ,7(4):347–356, Apr 2004.[Ser54] Jean-Pierre Serre. Représentations linéaires et espaces homogènes käh-lériens des groupes de lie compacts. In
Séminaire Bourbaki : années 1951/52- 1952/53 - 1953/54, exposés 50-100 , number 2 in Séminaire Bourbaki, pages447–454. Société mathématique de France, 1954. talk:100.[She87] RN Shepard. Toward a universal law of generalization for psychologicalscience.
Science , 237(4820):1317–1323, 1987.[Str09] B. W. Strowbridge. Role of cortical feedback in regulating inhibitory mi-crocircuits.
Ann. N. Y. Acad. Sci. , 1170:270–274, Jul 2009.[TPC14a] M. T. Tong, S. T. Peace, and T. A. Cleland. Properties and mechanisms ofolfactory learning and memory.
Front Behav Neurosci , 8:238, 2014.[TPC14b] Michelle T. Tong, Shane T. Peace, and Thomas A. Cleland. Properties andmechanisms of olfactory learning and memory.
Frontiers in BehavioralNeuroscience , 8, Jul 2014.[Tu11] Loring W. Tu.
An Introduction to Manifolds . Universitext. Springer Sci-ence+Business Media LLC, 2011.166VKS +
15] J. Vinera, F. Kermen, J. Sacquet, A. Didier, N. Mandairon, and M. Richard.Olfactory perceptual learning requires action of noradrenaline in theolfactory bulb: comparison with olfactory associative learning.
Learn.Mem. , 22(3):192–196, Mar 2015.[VRC17] Jonathan D. Victor, Syed M. Rizvi, and Mary M. Conte. Two representa-tions of a high-dimensional perceptual space.
Vision Research , 137:1–23,Aug 2017.[Wed16] Torsten Wedhorn.
Manifolds, Sheaves, and Cohomology . Springer StadiumMathematik-Master. Springer Fachmedien Wiesbaden, 2016.[WS03] D. A. Wilson and R. J. Stevenson. The fundamental role of memory inolfactory perception.
Trends Neurosci. , 26(5):243–247, May 2003.[WS06] Donald A. Wilson and Richard J. Stevenson.
Learning to smell: olfactoryperception from neurobiology to behavior . Johns Hopkins University Press,United States, 2006.[ZKU +
13] H. A. Zariwala, A. Kepecs, N. Uchida, J. Hirokawa, and Z. F. Mainen. Thelimits of deliberation in a perceptual decision task.
Neuron , 78(2):339–351,Apr 2013.[ZS06] Manuel Zarzo and David T. Stanton. Identification of Latent Variables ina Semantic Odor Profile Database Using Principal Component Analysis.
Chemical Senses , 31(8):713–724, 07 2006.[ZSS18] Yuansheng Zhou, Brian H. Smith, and Tatyana O. Sharpee. Hyperbolicgeometry of the olfactory space.
Science Advances , 4(8), 2018.[ZVM +
13] Q. Zaidi, J. Victor, J. McDermott, M. Geffen, S. Bensmaia, and T. A. Cle-land. Perceptual spaces: mathematical structures to neural mechanisms.