A General Approach to Coding in Early Olfactory and Visual Neural Populations
William T. Redman
University of California, Santa Barbara
[email protected]
Abstract
Recent experimental and theoretical work on neural populations belonging to two separate early sensory systems, olfaction and vision, has challenged the notion that the two operate under different computational paradigms by providing evidence that the respective neural population codes share three central features: they are highly redundant; they are organized such that information is carried in the identity, and not the relative timing, of the active neurons; and they are capable of error correction. We present the first model that captures these three properties in a general manner, making it possible to investigate whether similar structure is present in other population codes. Our model also makes specific predictions about additional, as yet unseen, structure in such codes. If these predictions are found in real data, this would provide new evidence that such population codes are operating under more general computational principles.
Introduction
Because of their relative ease to record from and their importance as the input to upstream brain areas, the coding properties of neural populations belonging to early sensory systems have been the focus of a large body of experimental and theoretical literature. This work has explored the precision of retinal spike trains [1], the correlation structure of individual retinal ganglion cells and their population-wide collective states [2], [3], the possibility of criticality in retinal populations [4], [5], the preservation of odor identity representation by glomeruli (structures in the olfactory bulb that receive projections from olfactory sensory neurons - OSNs) across varying odor concentrations [6], [7], [8], [9], and the combinatorial nature of the glomeruli code [10], [11]. While the respective work has followed a similar mission (to understand the nature of the population codes), the results have caused a divergence in the belief of the coding principles at play. Yet, in both systems, the exact nature of the population code has been unclear.

Recent work has taken advantage of advances in recording and manipulation technology, as well as statistical methods, to probe more intricately at this question of the exact structure of the retinal ganglion and the glomeruli population codes [12], [13], [14], [15]. Despite the differences in system and approach, the two have converged on three similar principles for the respective codes. First, the codes are highly redundant, as only a small subset of the neurons is responsible for carrying the information about the identity of any given stimulus (whether it be some feature of the visual field or an odor). Second, it is the identity of the neurons in this specific subset (and not their relative timing) that carries the relevant information. Finally, a neural population code with those two features is endowed with the capability to be robust to noise, in the sense that the code is capable of error correction.
These core similarities clearly challenge the assumption that the two systems are operating under different coding paradigms.

In the retinal ganglion population code, these features come about from the fact that the probability space of population responses is populated by geometric objects identified as ridges. These ridges correspond to unique "codewords" that a downstream system maps all responses in the ridge to [13]. The identity of these ridges is determined by an active set of neurons (those neurons that were active in the states that make up a given ridge) and a silent set (those neurons that were not active). The mapping from a given neural response to the appropriate, ridge-specific codeword was hypothesized to be achieved by an additional layer of neurons, each one firing if a certain fraction of neurons in a given active set fire and none of the neurons in the corresponding silent set fire (see [13], Fig. 13). Under this simple model, it is easy to see that there can be a number of population response states that get mapped to the same codeword.

In the glomeruli population code, these features arise from the fact that the code has been found to operate under the "Primacy Hypothesis", namely that it is the first n active glomeruli that are responsible for encoding odor identity [14]. Therefore, all population responses that have the same first n active glomeruli (where the order of the first n active glomeruli does not matter [14]) are recognized as being the same odor. Any one of those states can therefore be seen as a codeword. The relevant time scale (which determines the relevant n) was found to be < 100 ms [14]. Whether n is fixed, or whether it can vary across odors and odor mixtures, has yet to be determined.

Recent theoretical work has suggested that the aforementioned three properties might be a more universal feature of neural population codes [16]. In particular, brain regions such as MT and V1, which have similar firing rates and pairwise correlations to the retinal ganglion cells, were hypothesized to have these same properties in their neural codes. A more general framework in which to talk about neural population codes with these three properties is therefore desirable, especially if it can make predictions about specific structures that these codes should have. The rest of this paper focuses on such a framework. In the Model section, we outline our model for mapping arbitrary neural population responses to codewords. Our model utilizes the universal property of free groups (UPFG). We then provide an example of our model by mapping a specific set of neural responses to a given set of codewords. We compare our method's mapping with the mapping generated by using Generalized Minimum Distance (GMD) decoding [17], a well-known method in coding theory. This example is given to provide a more intuitive feel for our model (especially for those unfamiliar with group theory) and to illustrate how it compares to other known (but not directly implicated in neural population coding) error correction algorithms. Finally, in the Discussion section, we discuss the implications of our model with regards to real neural population data and possible future directions for this work.
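To make the identity-based, order-free character of the primacy code concrete, here is a toy Python sketch. The glomerulus labels, the trial data, and the choice n = 3 are invented for illustration; nothing here is taken from the experimental papers.

```python
# Toy sketch of the Primacy Hypothesis as a decoder: two responses are read
# out as the same odor whenever the *identity* of their first n active
# glomeruli matches, regardless of activation order or of any later activity.
# The labels and n = 3 are made-up values for illustration only.

def primacy_codeword(activation_order, n=3):
    """Return the unordered identity of the first n active glomeruli."""
    return frozenset(activation_order[:n])

# Two noisy trials of the "same" odor: same first-3 set, different order
# and different late activity (redundancy + error correction).
trial_a = [4, 7, 1, 9, 2]
trial_b = [1, 4, 7, 3]
assert primacy_codeword(trial_a) == primacy_codeword(trial_b)

# A trial whose early identity differs is a different codeword.
trial_c = [4, 7, 2, 1]
assert primacy_codeword(trial_a) != primacy_codeword(trial_c)
```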
Why use the UPFG?
There are two reasons why we turn to such a formal, and mathematically abstract, language as group theory for this problem. First, for those with a background in group theory, we believe that our model is fairly intuitive. Second, and more pertinently, theoretical neuroscience has seen great advances when appropriate theories and descriptions from formal physics and mathematics have been applied to neural problems. For instance, by borrowing ideas and analysis techniques from statistical physics, attractor neural networks (ANNs) were able to be developed and thoroughly explored [18], [19]. In this vein, we feel that when considering redundant neural codes that "collapse" different population response states onto the same output (or recognized) state, the theory associated with homomorphisms (and group theory) is an appropriate language to use. For those unfamiliar with group theory, the Materials and Methods section gives a brief discussion of the simple group theoretic definitions and notions used in our model.
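The "collapsing" role of homomorphisms invoked above can be made concrete with the parity map discussed in the Materials and Methods. The following minimal Python sketch checks the homomorphism property numerically; the range of test integers is arbitrary.

```python
# The parity map phi from the integers under addition to ({0, 1}, + mod 2):
# many inputs collapse onto the same output, yet structure is preserved,
# in the sense that phi(x + y) = phi(x) + phi(y) mod 2.

def phi(x: int) -> int:
    """Parity homomorphism: 0 if x is even, 1 if x is odd."""
    return x % 2

# Homomorphism property, checked on a small grid of integers.
for x in range(-5, 6):
    for y in range(-5, 6):
        assert phi(x + y) == (phi(x) + phi(y)) % 2

# Collapse: infinitely many states map to the same element.
assert phi(2) == phi(100) == 0
```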
Model
We start by defining the set of generators, G, as G = S + B, where S = {1, 2, ..., n}, B is a subset of the power set of S (i.e. B ⊂ P(S)), and + is the set concatenation operator. We refer to B as the basis set. The relevance of B will be discussed below.

A simple example of this is, for n = 5, G = {1, 2, 3, 4, 5, 12, 45}, where the last two elements are the elements of the basis set (here 12 stands for {1, 2} and 45 for {4, 5}).

We define the group of codewords, (C, *_C), as C = {c_1, c_2, ..., c_m}, where *_C is some operation on the elements of C that meets the standard group criteria [20].

Finally, the restricted free group of G, F~(G), is defined to be the abelianized group of all elements in the free group of G, F(G), that are made up of each element of G at most once. For instance, while 121 ∈ F(G), 121 ∉ F~(G), because it contains 1 twice.

Using the universal property of free groups [20], we have the following diagram,

    G ---i---> F(G)
     \          |
      g\        | ϕ
        v       v
         C <----'          (1)

where i is the inclusion map (i.e. i(x) = x for all x ∈ G), g is a group function determining C from G, and ϕ is a unique homomorphism (given a specific g) from F(G) to C. The UPFG tells us that we can relate g to ϕ by

    ϕ(A) = g(a_1) *_C ... *_C g(a_n)          (2)

where A ∈ F(G) and A = a_1...a_n, such that a_i ∈ G for all 1 ≤ i ≤ n.

A final point on our model must be made. We define, for all s ∈ S,

    g(s) = id_C          (3)

where id_C is the identity element of the group of codewords (i.e. for all a ∈ C, a *_C id_C = a). With this definition, the role of the basis set, B, becomes clearer (as it is unnecessary for "building" F(G)): it is the generator of (C, *_C) under g.

We now consider an element (or "word"), ω, of F~(G) (we consider F~(G) because it represents all possible neural responses that were considered in the neural data from [12] - [15], but the same holds true, with small modifications, for any element in F(G)). We want to find the codeword, i.e. the element in (C, *_C), that corresponds to ω. In particular, for any given ω, there exists a subset of B, {b_1, ..., b_k}, and a subset of S, {s_1, ..., s_m} (s_j ∈ S), such that

    ω = b_1...b_k s_1...s_m          (4)

(the exact ordering of the b_i's and s_j's does not matter because F~(G) is abelian). Of course, this decomposition of ω is by no means unique. We therefore define the decomposition of ω as the pair ({b_i}, {s_j}) such that the number of elements of {b_i} is the (if possible) non-zero minimum over all possible decompositions of ω. From the neural perspective, this is equivalent to demanding that every word is represented as simply as possible by the activity states that make up the basis set.

With this, we can now look at the mapping of ω to its relevant codeword, which is given by ϕ(ω) (since ϕ : F~(G) → C):

    ϕ(ω) = ϕ(b_1...b_k s_1...s_m)          (5)
         = g(b_1) *_C ... *_C g(b_k) *_C g(s_1) *_C ... *_C g(s_m)
         = g(b_1) *_C ... *_C g(b_k)

By the defined decomposition of ω, this decoding is unique. Note therefore that if two elements, ω_1 and ω_2, in F~(G) have the same basis elements, {b_i}, in their decomposition, then the decoding of the two elements is equivalent:

    ϕ(ω_1) = g(b_1) *_C ... *_C g(b_k) = ϕ(ω_2)          (6)

Example
To illustrate our model, we map an example neural response space onto example codewords. We also compare this mapping to an existing error correction method, Generalized Minimum Distance (GMD) decoding [17], as a way to show the possible strengths of our method relative to existing methods.

We will consider G = {1, 2, 3, 4, 12, 14} and C = ({id, 12, 14, 24}, +). For simplicity, we will convert each element of G and C into a binary string. This corresponds to G = {1000, 0100, 0010, 0001, 1100, 1001} and C = ({0000, 1100, 1001, 0101}, + mod(2)), if we take each number, 1, .., 4, to be a position in a four bit string that has a value of 1.

The result of applying our mapping method and applying GMD decoding (where the decoding is determined by the codeword that has the minimal Hamming distance from the word we are trying to decode) is given in Table 1. Note that we are assuming that every word is equally likely to be received. From this, we see, first and foremost, that the number of three-way ties (denoted by ?) is significantly smaller using UPFG decoding as opposed to GMD decoding (one vs. eight). Additionally, in every determined decoding, the two methods agree.

Table 1: Comparison of our model (UPFG mapping) and GMD decoding on an example response space

Word   UPFG decoding   GMD decoding
1111   ?               ?
1110   1100            1100
1101   1100            ?
1011   1001            1001
0111   0101            0101
1100   1100            1100
1001   1001            1001
0011   0000            ?
1010   0000            ?
0101   0101            0101
0110   0000            ?
1000   0000            ?
0100   0000            ?
0010   0000            0000
0001   0000            ?
0000   0000            0000
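Both decoders can be sketched in a few lines of Python. The basis set is not stated explicitly above, so the sketch assumes B = {12, 14} (i.e. {1, 2} and {1, 4}), whose images under g generate the codeword group; with a different assumed basis, some ties resolve differently than in Table 1.

```python
from itertools import combinations

# Sketch of the two decoders compared in Table 1. The basis set is an
# assumption: B = {{1, 2}, {1, 4}}, so g maps the basis onto generators
# of the codeword group C = ({0000, 1100, 1001, 0101}, + mod 2).
B = [frozenset({1, 2}), frozenset({1, 4})]
ID = '0000'
CODEWORDS = {'0000', '1100', '1001', '0101'}

def g(b):
    """g on a basis element: a 4-bit string with 1s at the positions in b."""
    return ''.join('1' if i in b else '0' for i in range(1, 5))

def xor(c1, c2):
    """The codeword group operation *_C: bitwise addition mod 2."""
    return ''.join(str(int(a) ^ int(b)) for a, b in zip(c1, c2))

def decode_upfg(word):
    """Decompose the active set into disjoint basis elements plus leftover
    singletons, using the smallest possible non-zero number of basis
    elements (eq. 4), then apply phi (eq. 5). Unresolvable ties give '?'."""
    active = {i + 1 for i, bit in enumerate(word) if bit == '1'}
    for k in range(1, len(B) + 1):          # smallest non-zero k first
        results = set()
        for combo in combinations(B, k):
            union = frozenset().union(*combo)
            disjoint = len(union) == sum(len(b) for b in combo)
            if disjoint and union <= active:
                cw = ID
                for b in combo:              # phi sends singletons to id_C,
                    cw = xor(cw, g(b))       # so only basis elements count
                results.add(cw)
        if results:
            return results.pop() if len(results) == 1 else '?'
    return ID  # no basis element fits: only singletons, phi(word) = id_C

def decode_gmd(word):
    """GMD-flavoured decoding: nearest codeword by Hamming distance,
    '?' on ties (every word assumed equally likely to be received)."""
    dist = {c: sum(a != b for a, b in zip(word, c)) for c in CODEWORDS}
    best = min(dist.values())
    winners = [c for c, d in dist.items() if d == best]
    return winners[0] if len(winners) == 1 else '?'
```

For instance, decode_upfg('1110') and decode_gmd('1110') both return '1100', while decode_gmd('1000') is a three-way tie and returns '?'.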
Discussion
Recent work investigating the structure of the retinal ganglion and glomeruli population codes, [12] - [15], has challenged a number of widely held theoretical biophysical and neural beliefs with its three central (and convergent) conclusions: the respective neural population codes are redundant, in the sense that multiple neural response states are interpreted as encoding the same information; it is the identity of the relevant subset of neurons that fire, and not the relative timing of their firing, that encodes the identity of the stimulus; and codes with the previous two properties are endowed with the capability of error correction.

The fact that these neural population codes were found to be redundant contradicts the belief that neural codes should be very efficient. Such conclusions will force theorists to reconsider what, if anything, neural population codes are optimized for. Additionally, the finding that the first two properties allow for error correction marks a transition from the focus of biophysical and neural theory on how systems can output robustly despite noise (for a few of many examples, see [21], [22], [23], [24]) to the idea that the output of neural systems can be noisy, but it is the subsequent mapping (or "interpretation") of that output that is robust to noise.

Finally, that these conclusions are reached by very different experimental and theoretical methods in the retinal ganglion and glomeruli populations suggests that the two sensory systems (vision and olfaction), despite previously being believed to operate under different coding principles, are, in fact, using the same principles.
This surprising (and powerful) fact, coupled with theoretical work [16] arguing that these coding principles might be used beyond early sensory systems, in higher brain areas like MT and V1, highlights the need for a general framework in which to discuss codes with these three properties.

To see that our model is indeed a general framework that captures all three of these facets of neural population codes, note first that it is only the basis elements that determine the decoding of any word. Therefore, words that vary by elements that do not affect the basis elements are seen as equivalent (i.e. eq. 6). This can also be clearly seen in Table 1 (e.g. 1100 and 1101 have the same mapping). Second, by restricting ourselves to the restricted free group of G, F~(G), we not only restrict ourselves to the more reasonable neural case where neurons are considered to have fired at most once in a time bin, we also restrict ourselves to decoding in which only the identity of the basis elements matters and not their ordering, as F~(G) is abelian and, hence, the words ω_1ω_2 and ω_2ω_1 are equivalent. Finally, as noted before, because only the basis elements in the decomposition of a given word determine the decoding, our method is capable of error correction.

Our model, while capturing the three main principles of the aforementioned work, also makes predictions about aspects of the early sensory systems' coding. In particular, it predicts the existence of a basis set. In the context of the glomeruli population code, there exist two possibilities. Either n (the number of relevant active glomeruli for odor classification) is fixed and the basis set is the set of codewords, which gives no new insight into the code; or, if n is not fixed, then there exists the possibility of all codewords being able to be built from some smaller set of response states that are themselves codewords. For example, if n = 2, 3, and 4 are all allowed (e.g. if n is a function of odor complexity), then any n = 4 codeword could be built from two codewords with n = 2. This greatly reduces the number of elements needed to describe the codewords, and sheds light on possible downstream decoding mechanisms.

Similarly, for the retinal population code, the existence of a non-trivial basis set would shed light on downstream decoding schemes. In particular, it would suggest possible adjustments to the model hypothesized in [13] (i.e. Fig. 13). It is important to note that looking for the basis set in the retinal population code may require a switch in perspective, where it is the neurons in a given codeword's silent set that might be the relevant feature. Future work will focus on methods for searching for basis sets, as well as on specific implications of the existence of basis sets for downstream decoding schemes and how such schemes might develop in a natural way. Finding such basis sets in both the glomeruli and retinal population codes would extend even further the growing understanding of how similar the two codes are, and provide more evidence for the two operating under more universal computing principles.

One clear failure of our model is that it relies on group theoretic notions that most in the neuroscience community are not familiar with.
While we believe that continuing to think in terms of this language will be useful (and, indeed, we hope that our model convinces others in the mathematical community to consider the possible role free groups, and the UPFG, might play in error correction and neural coding), we hope to translate this model into a clearer, and less mathematically technical, model that still captures the main principles but can more easily be communicated to others.

We hope that we have made clear the similarities of the two early sensory systems' population codes, the need for a general framework in which to explore arbitrary codes that share the properties exhibited by those two codes, and the possible utility of using group theory as a language to talk about such codes.

Materials and Methods
We provide here the basic group theory definitions and notions that are required for understanding our model. We have tried to make this as easy to understand as possible for the reader not familiar with group theory. Note that, therefore, some of the extra complications or subtleties are swept under the rug at the discretion of the author, especially if they are not believed to be relevant for understanding our model. Everything written below can be found in the following two references: [20], [25].
Groups
A group is defined as the pair (G, *_G), where G is a set of elements and *_G is a binary operation that satisfies the following three properties:

1. There exists an identity element in G under *_G. That is, there is an element id_G such that id_G *_G x = x for all x ∈ G.

2. The elements of G are closed under *_G. That is, for all x, y ∈ G, (x *_G y) ∈ G.

3. There exists an inverse element for every element of G. That is, for all x ∈ G, there exists x^(-1) such that x *_G x^(-1) = id_G.

An example of a group is the integers under addition. It is easily verified that all three conditions are met, where the identity element is 0 and the inverse of n is -n.

For simplicity, groups will now be referred to as just G, where *_G is implicitly assumed.

Abelian
A group is said to be abelian if the elements of G commute. That is, if for all x, y ∈ G, x *_G y = y *_G x.

For the integers under addition, this is clearly the case. But many groups are not abelian (e.g. the group of invertible matrices under matrix multiplication, whose elements do not, in general, commute).

Homomorphisms
A homomorphism is a map, ϕ, between two groups, G and H, such that ϕ(x *_G y) = ϕ(x) *_H ϕ(y) for all x, y ∈ G. Note that the binary operations are different on each side of the equation.

In simpler terms, a homomorphism is a map that collapses one group onto another while preserving some structure. For instance, the parity map (i.e. the map that returns 0 if the argument is even and 1 if the argument is odd), ϕ, from the integers to the integers modulo 2 (i.e. ({0, 1}, + mod(2)), where 1 + 1 = 0 mod(2)) is a homomorphism. ϕ clearly collapses the integers (it reduces them to a set with only two elements), but it preserves some structure (namely, parity).

Free groups
A free group, F(G), is an infinite group (in the sense that the set that comprises the elements of the free group is infinite) that is comprised of every possible combination of the elements of the set G, using the binary operation *. For instance, if G = {a, b, a^(-1), b^(-1)}, then the set comprising F(G) is given by {a, b, a * a, a * b, b * a, a^(-1) * b, a * b^(-1), a^(-1) * b^(-1), a * a * a, ...}.

Acknowledgements

We thank Sylvain Cappell for introducing us to free groups and for his clear explanation of the UPFG, Sanchit Chaturvedi and Roy Rinberg for their insightful discussions, and Nick Verga for inviting us to present our work early on. Finally, we thank Michael Berry and Dima Rinberg for discussing their work with us.
References

[1] Berry M, Warland D, Meister M (1997) The structure and precision of retinal spike trains. Proc. Natl. Acad. Sci.
[2] Schneidman E, Berry II M, Segev R, Bialek W (2006) Weak pairwise correlations imply strongly correlated network states in a neural population. Nature.
[3] Tkacik G, et al. (2014) Searching for collective behavior in a large network of sensory neurons. PLoS Computational Biology.
[4] Tkacik G, et al. (2015) Thermodynamics and signatures of criticality in a network of neurons. Proc. Natl. Acad. Sci.
[5] Mora T, Deny S, Marre O (2015) Dynamical criticality in the collective activity of a population of retinal neurons. Physical Review Letters.
[6] Gross-Isseroff R, Smith B (1988) Concentration-dependent changes of perceived odor quality. Chem. Senses.
[7] Bhagavan S, Smith B (1997) Olfactory conditioning in the honey bee, Apis mellifera: effects of odor intensity. Physiol. Behav.
[8] Uchida N, Mainen Z (2007) Odor concentration invariance by chemical ratio coding. Front. Syst. Neurosci.
[9] Cleland T, et al. (2011) Sequential mechanisms underlying concentration invariance in biological olfaction. Front. Neuroeng.
[10] Malnic B, Hirono J, Sato T, Buck LB (1999) Combinatorial receptor codes for odors. Cell.
[11] Saito H, Chi Q, Zhuang H, Matsunami H, Mainland JD (2009) Odor coding by a mammalian receptor repertoire. Sci. Signal.
[12] Prentice J, et al. (2016) Error-robust modes of the retinal population code. PLoS Computational Biology.
[13] Loback A, Prentice J, Ioffe M, Berry II M (2017) Noise-robust modes of the retinal population code have the geometry of ridges and correspond to neuronal communities. Neural Computation 29(12):3119-3180.
[14] Wilson C, Serrano G, Koulakov A, Rinberg D (2017) A primacy code for odor identity. Nature Communications 8(1).
[15] Giaffar H, Rinberg D, Koulakov A (2018) Primacy model and the evolution of the olfactory receptor repertoire. bioRxiv.
[16] Ioffe M, Berry II M (2017) The structured low temperature phase of the retinal population code. PLoS Computational Biology.
[17] Forney G (1966) Generalized minimum distance decoding. IEEE Transactions on Information Theory.
[18] Hopfield J (1982) Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci.
[19] Amit D (1989) Modeling Brain Function: The World of Attractor Neural Networks. (Cambridge University Press).
[20] Dummit D, Foote R (1991) Abstract Algebra. (Prentice Hall, Englewood Cliffs, N.J.).
[21] Wolpert L (1969) Positional information and the spatial pattern of cellular differentiation. J. Theor. Biol.
[22] Gregor T, Tank D, Wieschaus E, Bialek W (2007) Probing the limits to positional information. Cell.
[23] Elowitz M, Levine A, Siggia E, Swain P (2002) Stochastic gene expression in a single cell. Science.
[24] Losick R, Desplan C (2008) Stochasticity and cell fate. Science.
[25] Herstein IN (1975) Topics in Algebra. (John Wiley & Sons), 2nd edition.