Model of the hippocampal formation explains the coexistence of grid cells and place cells
arXiv: [q-bio.NC] Apr

ANDRÁS LŐRINCZ, MELINDA KISZLINGER AND GÁBOR SZIRTES

Abstract. In this paper we explain the strikingly regular activity of the 'grid' cells in rodent dorsal medial entorhinal cortex (dMEC) and the spatially localized activity of the hippocampal place cells in CA3 and CA1 by assuming that the hippocampal region is constructed to support an internal dynamical model of the sensory information. The functioning of the different areas of the hippocampal-entorhinal loop and their interaction are derived from a set of information theoretical principles. We demonstrate through simple transformations of the stimulus representations that the double form of space representation (i.e. place field and regular grid tiling) can be seen as a computational 'by-product' of the circuit. In contrast to other theoretical or computational models we can also explain how place and grid activity may emerge at the respective areas simultaneously. In accord with recent views, our results point toward a close relation between the formation of episodic memory and spatial navigation.

1. INTRODUCTION

When we enter a new place, even without having immediately recognized each of the objects surrounding us, we need only a moment to perceive the particular configuration of these objects within their environment (the mapping) and define our own relative position (localization or egocentric description) in the same environment. Sizing up distances is not of great difficulty either. In doing so we can use approximate learnt metric or intrinsic, idiothetic (self-motion based) cues, e.g., the number of steps needed to reach the wall. Why does spatial navigation, i.e., mapping, localization and remembering places seem so easy for animals, whereas it still constitutes a major challenge in robotics?
What are the underlying computations that provide us with a metric required to gain not only topological, but also geometrical perception of our environment? An explanation of the surprising discovery of 'grid' cells [Hafting et al., 2005a] in the rodent dorsal medial entorhinal cortex (dMEC) may offer some answers to these questions.

In contrast to the spatially localized unimodal activity distribution of the place cells found most prominently in the subfields CA3 and CA1 of the rodent hippocampus (HC) [O'Keefe and Nadel, 1978] or, for example, in humans [Ekstrom et al., 2003], the activity of these grid cells shows more or less regular, multi-peaked activity that forms 'hexagrid' tiling of the space. Interestingly, in different layers within the dMEC, while preserving this compact covering structure, the activity is also modulated by velocity and directional information [Sargolini et al., 2006]. Due to this regularity, these cells are thought to maintain a metric, and thus provide a basis for self-motion information or 'path-integration' (for a review, see [McNaughton et al., 2006]). Although this construct is very appealing, the finding [Barry et al., 2007] that grids may faithfully follow the distortion of the (familiar) environment casts doubt on the straightforward link between grids and path-integration, as such distortions may point to a topological description instead of a metric one [Dabaghian et al., 2007a]. Acknowledging that the functional explanation of these grid structures has yet to be found, attention has recently been focused on (1) functional links between grid and place cells, and (2) possible mechanisms that would be able to generate such regular structures.
As a complete review is beyond the scope of this paper, here we only list some of the most recent proposals corresponding to these two directions.

Several models in the first group elaborate on the ideas described in [Sharp, 1991]: competitive learning resulting in sparse representations may explain the formation of place cells in the dentate gyrus (DG) [Rolls et al., 2006] and in CA3 and CA1 [Franzius et al., 2007]. In these models the existence of an appropriately defined set of regular grid inputs is the most stringent hypothesis. Another route is based on the ideas of [Cash and Yuste, 1999] on linearity and the proposals in [O'Keefe and Burgess, 2005, McNaughton et al., 2006]: it has been shown [Solstad et al., 2006] that place cells can easily be formed if anatomically and physiologically sound constraints are taken into account. The problem with this model is that it requires grids with diverse orientations, but recent reports [Barry et al., 2007, Fyhn et al., 2007] show more uniformly oriented grids.

Key words and phrases: entorhinal cortex, neural computations, grid cells, spatial representation.

Similar ideas provide the basis for models of the second group: Linear summation of harmonic functions forms the core idea of different oscillatory interference models [O'Keefe and Burgess, 2005, Burgess et al., 2007]. In this dynamic model grid cells receive directionally modulated oscillating dendritic inputs superimposed on somatic large scale oscillations occurring at 4-10 Hz (theta-oscillation). With appropriate directional modulation provided by subicular head-direction cells [Ranck, Jr., 1984, Taube et al., 1990] this model yields regular interference patterns. To enable path-integration, grid patterns should be precisely bound to environmental cues, because error can be accumulated in both motor signals (speed) and direction signals.
Feedback from CA1 has been suggested to provide the necessary correction and thus to maintain the coherence of the oscillations by regulating phase resetting. However, as CA1 is one step downstream of the superficial layers of the entorhinal cortex (EC), it is not obvious why it would receive at the same time a more direct sensory stimulus compared to the information available at the entorhinal cortex. A specific class of continuous attractor models has also been proposed either with periodic boundary conditions [McNaughton et al., 2006] or with aperiodic boundaries, but with highly restrictive symmetric constraints on the synaptic connection matrix [Fuhs and Touretzky, 2006]. These models achieve path-integration using the grids and can explain many important aspects of the biological system, e.g., the similar orientation of the grids, scaling and phase properties. However, the correct integration of signals needed for path-integration is very sensitive to factors that relate only to the model setup, not to the system at hand [Burak and Fiete, 2006].

In this paper we sketch an alternative view of the problem of grid cells. Unlike the models described above, which attempt to explain a particular phenomenon or computation assigned to a given area, we describe a functional model of the hippocampal region (HR, comprising the entorhinal cortex, the dentate gyrus, areas CA3 and CA1, para- and presubiculum and the subiculum; see [Witter and Amaral, 2004, Mohedano-Moriano et al., 2007]) in which spatial navigation and space representation are addressed within the more general context of efficient memory systems. Explanation of the connections among different memory functions, such as the formation of episodic memories, memory consolidation and retrieval, has long been recognized as one of the major challenges in neuroscience, and several attempts have already been made to provide a unifying view [Levy, 1996, Recce and Harris, 1996, Wallenstein et al., 1998, Gaffan, 1998, Redish, 1999].
Albeit with different emphases, similar motifs emerge in most models. One such motif is that the context for separate episodic memory traces corresponds to the environment of the actual position. While this metaphor may help to conceptualize the acquisition of new memory traces, it does little to further our understanding of retrieval (that is the actual usage) and consolidation [Nadel et al., 2007] of this knowledge, as well as the role of the HR in these tasks. Here we show that the information theoretic notion of efficient representation may link these diverse functions and lead to a large-scale computational model of the hippocampal region in which the intriguing grid-like activity pattern may naturally emerge. The proposed architecture is partly rooted in the functional comparator model described in [Lőrincz and Buzsáki, 2000, Lőrincz et al., 2002] and is strongly motivated by new theoretical results on blind source separation problems [Póczos and Lőrincz, 2005, Póczos and Lőrincz, 2006, Szabó et al., 2007].

In the Methods section, theoretical motivations about efficient representation are exposed. Afterwards, relevant anatomical and physiological properties of the hippocampal region are highlighted to support the resulting mapping. In the Results section (1) we formalize our model according to the motivations described, (2) explain the functional correspondence between the theoretical construct and the neural substrate (functional mapping) and (3) present model verifying simulations that show how our model exhibits characteristic spatial behavior similar to that found in different parts of the HR. In the last section we discuss the relevance of our findings, interpret our results and make predictions concerning the functioning of the HR. Finally, some relevant but unresolved issues are enumerated.

2. Methods

We begin with some definitions that we use throughout the paper. Then we highlight the central motivations behind our large-scale functional model.
The model is not yet extended to low-level cellular and network mechanisms and thus the mapping of the proposed function to the neurobiological structure is essentially a logical arrangement of known anatomical and physiological findings.

Theoretical motivations

We propose a hypothesis set based on theoretical considerations. Then we enumerate the supporting arguments for each hypothesis and explain the essential statistical concepts that form the core of our proposal. We use the term 'memory' for internal representations of spatio-temporal patterns of observations that in some way help the system (agent or animal) to analyze, predict and react to changes (used in a very broad sense). Here observation incorporates not only the perception of the external world, but also the registering of the internal states of the self: motor commands, emotions, goal-oriented behavior and so on. In this framework, sensory-motor binding, for example, amounts to forming an intermediate representation that can faithfully represent the complex observations in a compressed form, which is then used to define the response to those observations.

Motivated by ideas in machine learning, information theory and goal-oriented reinforcement learning, one can make the following hypotheses about an efficient memory system:
• Prediction: In order to increase the chance of survival under varying conditions, memory creation should serve detection of novelty or change.
• Probabilistic interpretation: Due to the stochastic nature of changes, representations may only be interpreted within a probabilistic framework.
• Information separation and fusion: For tractable probabilistic inference, the effect of the 'curse of dimensionality' has to be efficiently diminished through the discovery of the independence of the underlying causes of the changes experienced.

Prediction

In line with [Rao and Ballard, 1997, Friston, 2005] we hypothesize that the goal of the memory system is to help maintain, accelerate and fine-tune a predictive coding mechanism (for a review on predictive coding in the brain, see [Kveraga et al., 2007]). The predictive faculty is needed for two reasons: not only does the agent/animal have to interact with a changeable environment, but functional delays (reaction time, internal functioning, synaptic delays) also have to be compensated. Models of predictive coding usually employ loops that allow comparison of bottom-up signals ('input') and expected signals ('output') of the internal dynamical model of the observations. It has already been proposed that the HR [Szirtes et al., 2005] realizes a Kalman-filter like internal model to predict sensory signals. Interestingly, some recent results [Lőrincz and Szabó, 2007] on the approximation of independent processes (that is dynamical models that assume independent noise as opposed to the Gaussian noise assumption of the Kalman-filter approach) may provide a natural combination of efficient prediction and information extraction, thus serving both the first and the third hypotheses.

Probabilistic interpretation

Alternatively, the expected signals may come from a generative model [Hinton and Ghahramani, 1997] which seeks probabilistic sources that could make up or cause the perceived signals: the hidden sources 'explain' the observed signals. Such a statistical approach is useful in that the system has to cope with multiple uncertainties: noisy signals, hidden causes, faulty internal working, multiple potential interpretations.
The learned spatio-temporal structure of the hidden sources restricts the representations of the world and, in turn, can be used for inference in a Bayesian manner [Körding and Wolpert, 2004]. The computational motivation for seeking the hidden causes is to reduce the daunting problem of inference: the detected temporal changes are either causally related and can thus be predicted or are intrinsically independent. If the causes are statistically independent then their joint probability distribution may be factored.

The probabilistic framework has an added advantage compared to a deterministic encoding mechanism: the belief of the system in its own judgment (e.g. about the existence of a particular source) may also be explicitly encoded or maintained to support further inference [Yu and Dayan, 2003]. Reconstruction networks [Grossberg, 1980, Ullman, 1995] try to integrate the 'best of both worlds': by maintaining an internal model of the external world, fast manipulations of the sensory-motor integration (modulation, planning, and so on) can be achieved. On the other hand, by extracting useful statistics of the incoming signals, robustness against noise and novelty detection may also be realized. To the best of our knowledge, the first reconstruction network model for brain modeling that suggested approximate pseudo-inverse computation for information processing between neocortical areas was published by Kawato et al. [Kawato et al., 1993]. The computational model of the neocortex was extended by Rao and Ballard [Rao and Ballard, 1997, Rao and Ballard, 1999], who considered neocortical sensory processing as a hierarchy of Kalman-filters.

The reconstruction idea has also appeared in hippocampal models [Lőrincz, 1998]. An extension of that model [Lőrincz and Buzsáki, 2000] suggested the integration of the early comparator idea [Sokolov, 1963, Vinogradova, 1975].
In these models, the whole EC-HC circuitry forms a 'novelty'-detecting network, in which novelty or reconstruction error is the difference between the expected (top-down) and experienced (bottom-up) neuronal representations. The proposed model successfully predicted independence in the cellular activity in CA1 [Redish et al., 2001] and was the first to suggest distinct roles for the direct and tri-synaptic pathways [Kloosterman et al., 2004]. A reconstruction-network like mechanism [Hasselmo et al., 2002] connecting CA3 and CA1 has been suggested that directs the information flow during encoding and retrieval. In another study [Becker, 2005], each hippocampal layer forms a separate representation that could be transformed linearly to reconstruct the original activation patterns in the EC. These lines of argument lead to the first assumption about functional mapping: the HR may be considered as a reconstruction-network with predictive capacity.

Information separation and information fusion

For any probabilistic reasoning, we have to define the elementary events that make up all possible outcomes. Without knowing their true probability distribution, we need to sample them (by experiencing different outcomes) and approximate the unknown distribution. This task becomes computationally intractable with the increasing number of possible events. Furthermore, discretization of the space-time continuum (e.g., sampling) is another source of noise and computational explosion.

Consider, for example, the problem of sequence learning [Fusi et al., 2007]. If we want to take into account all pieces of sensory information at each moment during which the system is able to take a sample, we can only store sequences of limited temporal duration. Furthermore, the number of patterns that make up the sequence is not known beforehand.
In turn, the system should be able to flexibly compress the spatiotemporal patterns into an internal form which (1) is subject to memory capacity constraints, but (2) still preserves all relevant information concerning the ongoing events. To do so, information should be collected, represented, and possibly compressed over time, because (spatial) changes take place at different temporal scales compared to the internal clock. Motion induced visual changes, for example, imply that part of the information is lost unless it is remembered in some economical forms.

Temporal compression can be achieved by implementing a predictive system which can recover (explain in simpler terms) the deterministic parts of a stochastic process. If the predictable part is extracted, the rest of the available information (the so called 'innovation') has reduced temporal correlation.

On the other hand, if the independence of the underlying causes may be assumed (as noted above), information transfer can be optimized by forcing independence among the components of the emerging representation [Jutten and Herault, 1991, Comon, 1994, Cichocki et al., 1994, Laheld and Cardoso, 1994, Bell and Sejnowski, 1995, Amari et al., 1996].

Importantly, the very same assumption may greatly simplify the predictive modeling as well. This is in line with Barlow's revised formulation of the redundancy reduction principle [Barlow, 2001]: representations should not be rigid structures but rather tools that serve the animal's current (that is variable) goals. They should therefore appropriately map the changing statistics of the world they represent.

Elaborating on his idea about obvious (simple) and 'hidden' forms of redundancy (see [Barlow, 2001] and the references therein), the second main functional conjecture in our model is that HR maximizes information transfer throughout the neural circuitry (by reducing the obvious redundancy) and at the same time reveals the hidden structures by separating them into independent subspaces.
That is, the learning system reveals the types of approximately independent sources and their own intrinsic dimensionality.

To highlight this issue, consider the problem of space representation formed by the HR. The main input through MEC to the HR is primarily multimodal sensory information with implicit and limited spatial information content [Fyhn et al., 2004], such as direction or configuration. Configuration of objects can be interpreted as one of the independent descriptors of the environment. However, its true or approximate dimension can only be revealed if the system is able to detach the corresponding correlations among the components of the representation from those that carry information about other physical aspects, such as texture or color. Such separation may reveal that configuration may best be described in a 2 or 3 dimensional space that actually corresponds to our abstract notion of Euclidean space [Dabaghian et al., 2007a, Dabaghian et al., 2007b].

Dimensionality, in general, may not be well defined for the other physical aspects, such as texture or color; see, e.g., [Ben-Shahar and Zucker, 2004]. Note that these descriptors, or 'factors', presuppose each other, but they are also highly independent. This dichotomy can be exploited in the following way. On the one hand, there is combinatorial gain in the description of events if characterization, categorization and prediction of the factors take place separately. For instance, a snapshot of an animal is a static image containing no direct information concerning motion. Still, the particular combination of factors or components of the animal may help to draw inferences concerning the unseen parts of the animal and the (intended) direction of motion. Pattern completion can be seen as a particular inference problem that occurs in space and time. Interestingly, as new results [Póczos and Lőrincz, 2005] on blind source separation show, factorial coding and subspace separation can be achieved simultaneously.
In blind source separation problems, not only the sources, but also the mixing process that generates the received signals are unknown. In general, this problem cannot be solved without regularization. Assuming independence in time seems plausible in many problems. For a special case of instantaneous linear mixtures of (statistically) independent and identically distributed (i.i.d.), one dimensional sources, where the dimension of the signal is larger than or equal to the dimension of the sources, there exist efficient, neurally plausible Independent Component Analysis (ICA) algorithms [Giannakopoulos et al., 1998, Linsker, 1999] that can recover the true non-Gaussian lower dimensional sources by demixing the signal.

ICA can be significantly faster [Amari et al., 1996] if separation is preceded by whitening. This intermediate transformation reduces the instantaneous (zero time lag or spatial) second-order correlations (i.e., it decorrelates) and it also normalizes the signals. Informally, decorrelation transforms the data onto an orthogonal subspace such that the projection of the data onto the first (principal) direction of the subspace has the greatest variance, projection on the second principal direction has the second greatest variance and so on. The decorrelation part is also called Principal Component Analysis (PCA) and may be used for dimension reduction in an informed way as it provides a measure of how much information (at the second-order correlation level) is lost by ignoring the last k directions or components. Whitening admits that all sources may equally be important, so after the decorrelation step it equalizes the variances of the components.
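
The decorrelation (PCA) and variance-equalization (whitening) steps described here can be illustrated with a small numerical sketch. It is only an illustration under stated assumptions, not the procedure used in the paper: the mixing matrix `A`, the uniform sources and the helper name `whiten` are hypothetical choices made for this example.

```python
import numpy as np

def whiten(X):
    """PCA-decorrelate and whiten data X (one sample per row)."""
    Xc = X - X.mean(axis=0)                   # center each component
    cov = np.cov(Xc, rowvar=False)            # zero-lag second-order correlations
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # sort directions by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z_pca = Xc @ eigvecs                      # decorrelation step (PCA projection)
    Z_white = Z_pca / np.sqrt(eigvals)        # whitening: equalize the variances
    return Z_pca, Z_white, eigvals

rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(5000, 2))    # two independent non-Gaussian sources
A = np.array([[2.0, 1.0], [0.5, 1.5]])        # hypothetical mixing matrix
X = S @ A.T                                   # observed mixtures

Z_pca, Z_white, eigvals = whiten(X)
cov_pca = np.cov(Z_pca, rowvar=False)         # diagonal, variances in decreasing order
cov_white = np.cov(Z_white, rowvar=False)     # identity matrix up to rounding
lost = eigvals[1:].sum() / eigvals.sum()      # variance fraction lost if only the
                                              # first principal direction is kept
```

After PCA the covariance is diagonal but the variances still differ; after whitening it is the identity, so a subsequent ICA step only has to find a rotation.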
The terms 'whitening' and 'decorrelation' are often used interchangeably, although the two transformations scale the results differently.

For more general cases of ICA, there is no trivial solution, but as both experience [Cardoso, 1998, Póczos and Lőrincz, 2005] and several theoretical advances [Szabó et al., 2007, Póczos et al., 2007] have indicated, sources can in many cases be recovered even if conditions (independence, i.i.d. properties or equal dimensions) are not met. The recovered components can be grouped by their mutual information (that is, using the 'non-independence' information), thus revealing the number of separable sources and the dimensions of their subspaces. This procedure factorizes the information and gives rise to combinatorial gains in the storage requirements. In addition, recent theoretical findings suggest that the search for these factors can be accelerated in a non-combinatorial way [Póczos and Lőrincz, 2006, Lőrincz and Szabó, 2007, Szabó et al., 2008] even if the dimensions of the subspaces are not known beforehand [Póczos et al., 2007].

3. Known anatomical and physiological constraints

In this section, we describe those characteristics of the HR that guide and constrain our model. The circuitry of HR (left panel of Fig. 1) has several unique properties that probably contribute to its central role in all memory functions. Here we highlight features that seem relevant for mapping the functions onto the neural substrate.
Figure 1: (a): Diagram of the main connections of HR. Arrows denote excitatory connections and solid circles denote mostly inhibitory connections. (b): Connections playing a role in the model. Roman numerals denote the layers in the entorhinal cortex (EC), $x$: signal from cortex, $y$: whitened input at EC III, $n_z$: whitened novelty (or innovation) of the input at EC II, $h$: hidden model at EC deep layers, $n_h$: innovation of the hidden model at EC deep layers, $s$: ICA output at CA1 during positive theta phase, $e$: ICA output at CA1 during negative theta phase, $R_{\mathrm{II}}$ and $R_{\mathrm{III}}$: postrhinal to EC II and postrhinal to EC III efferents, respectively, $Q_{\mathrm{II}}$ and $Q_{\mathrm{III}}$: EC deep layers to EC II and EC III connections, respectively, $K$: inhibitory feedback from EC III to EC II, $V$: CA1 to EC deep layer efferents, $M_h$: recurrent collaterals at the deep layers of the EC, $W_{\mathrm{tri}}$ and $W_{\mathrm{dir}}$: tri-synaptic and direct connections between EC superficial layers and the CA1 subfield, respectively.

Direction of information flow

First, there is a dominantly unidirectional ([Naber et al., 2001], but see [Shao and Dudek, 2005]) and parallel connection system among all parts: superficial layers of EC receive input from adjacent cortical regions and transmit the signals toward CA1 and the subiculum mediated by CA3. This transmission, however, is not a simple relay: it takes place in a tightly controlled way using two separate routes: the so-called tri-synaptic connection system (EC II–DG–CA3–CA1) and the direct route from EC III to CA1. As the exact nature of the input received by EC II and EC III is not known and we want to focus on the functioning within the HR, we assume that the superficial layers share the same cortical input.
We also assume that differences in the activity of these layers stem from their differing intrinsic physiology (e.g. the ratio of interneurons that enables strong feedforward inhibition in EC II), anatomy (role of recurrent collaterals) and the received feedback (EC layers V/VI project back to both layers and EC III receives signals from the subiculum, too).

CA1 and the subiculum, which are considered to be the main output regions of the HR, project back to the deep layers of EC. In parallel with the subicular pathway, CA1 is linked to the deep layers directly as well. The parallel systems in part preserve topographical arrangement [Witter, 2006] but there exists a separation along the lateral to medial direction. The lateral and medial parts of the entorhinal cortex (LEC and MEC, respectively) receive input from different cortical areas and, in turn, project to non-overlapping portions of CA1 and the subiculum. In contrast, DG and CA3 receive convergent input from both LEC and MEC. An important functional consequence is that the fusion of spatial and non-spatial information may be strictly controlled within HR [Gigg, 2006, Witter and Moser, 2006].

The EC deep layers, which presumably also receive modulatory or control signals from different cortical areas, close the loop: they send mostly excitatory [van Haeften et al., 2003] feedback to the superficial layers.

Unique intra-regional interactions in each area

Although place cells can be found everywhere in DG, CA3 and CA1, their coding mechanism may be quite different, as the underlying connection systems have significantly distinct features. DG is unique for its temporally tunable connections [Henze et al., 2002]. CA3 has a dense collateral system which has a particular role in memory replay [Louie and Wilson, 2001, Foster and Wilson, 2006, Diba and Buzsáki, 2007, Csicsvari et al., 2007, O'Neill et al., 2008].
CA1, as a single exception in the whole circuitry, has no recurrent collaterals and the activity of the principal cells seems to be independent [Redish et al., 2001].

Temporal synchrony across and within different areas

In addition to the intricate anatomy, the physiology of the separate modules is also striking. The most prominent feature is the interplay between different forms of oscillatory activities, the synchronized membrane potential oscillation between the 4-10 Hz theta and the 40-100 Hz gamma frequency bands [Bragin et al., 1995, Canolty et al., 2006], which have differential effects on the different modules. Several functional roles have already been assigned to these activity forms, such as the control of synchrony throughout the circuitry [Denham and Borisyuk, 2000] or the provision of an internal reference clock [Jefferys et al., 1996, Jensen et al., 1996].

The main generator of theta is thought to be in the septum (which is the only extra-hippocampal target of CA3), but layer EC II may also be able to initiate theta activity. The reciprocity between the subiculum and the HR via CA3 may suggest that HR has a sophisticated mechanism for self-regulating synchrony. In addition, EC II neurons are theta modulated and show phase precession, similarly to the place cells in the hippocampus [Hafting et al., 2005b].

EC III, which is very close to layer EC II, however, is phase locked to the main theta and can maintain persistent activity [Tahvildari et al., 2007]. Deep layers of the EC show peculiar functioning as well. In contrast to the superficial layers, EC V can generate input specific graded persistent activity in individual neurons [Egorov et al., 2002] which is generally considered the underlying neural mechanism of working memory [Goldman-Rakic, 1995].
Furthermore, the relative homogeneity of the CA1 response to changing inputs as compared to that seen in the deep EC may suggest [Frank et al., 2006] that active CA1 neurons are engaged in representing one environment, while deep EC may contain multiple subpopulations, some tied to CA1 output while others are more independent of CA1. Interestingly, separate modules or 'cell islands' can be found in EC II as well [Witter and Moser, 2006]. As a consequence, if deep layers can represent several likely models concerning the world, there should be a switching mechanism that can help select the one that best serves correct predictive coding. It is intriguing that layer III of the EC has been found to receive such switching signals [Tahvildari et al., 2007]. Last, but not least, signals carrying different aspects of spatial information, such as position, head-direction or speed, seem to interfere at several stages. While activity in CA3 and CA1 doesn't show correlation with directional information, postsubicular head-direction cells directly innervate the deep layers of EC, which in turn send this information to the superficial layers. According to this scenario, grid cells in EC III show clear conjunctive correlation representing mixed information at the same time [Sargolini et al., 2006]. However, the activity of neurons in EC layer II is free of directional modulation.

4. Results

In the first part we formalize the proposed functions by providing a mathematical construction. In the resulting computational model the different functional modules are not yet anchored to the real system. Since this reverse-engineering approach (assignment of the functions precedes the description of the structure) is essentially ill-posed (offering several solutions), in the second part we attempt to map the modules onto the real neural system by taking into account the biological constraints collected in the previous section.
Finally, simulations are presented in which the function of the model is demonstrated on inputs that can be related to signals received by the hippocampal region.

Results I: Formal description of the functional model

Let us assume the system's goal is to form efficient representation of the sensory information which can be used for prediction. Efficiency refers to storage capacity (a small number of 'factors' should be used to reconstruct large number of possible inputs) and speed (the system should try out only a few combinations of the factors). Prediction is the ability to generate expected inputs. Let us begin with an abstract description of the observation of the external world. (At this point we don't model different sensory modalities. The input variable is simply a description of the external world.) The sensory input $x(t)$ to the system may be assumed to be a mixture of hidden source signals or causes:

\[ x(t) = A\,s(t), \tag{4.1} \]

where $A \in \mathbb{R}^{n \times n}$ is a mixing matrix, and $s(t) \in \mathbb{R}^{n}$ are the sources to extract. Regarding our hypothesis (3), ICA is designed to solve a similar problem under the condition that the components of $s$ are i.i.d., and statistically independent. However, the observed quantities may not be i.i.d.,

\[ s(t+1) = F\,s(t) + e(t+1), \tag{4.2} \]

where $e(t)$ is called the 'driving noise', 'true source', or 'innovation'. The expression 'driving noise' refers to the fact that process $s$ is maintained by the 'true source' $e$: without this input, $s(t)$ would decay. Due to the mixing effect of matrix $F$ which describes the deterministic part of the process, the components in $s(t)$ are not independent anymore. Obviously one can envision more sophisticated systems. Nevertheless, for higher order processes or signals with echoes, the formalism can be brought to very similar forms [Lőrincz and Buzsáki, 2000, Szabó et al., 2007, Póczos et al., 2007].
As long as the components of the true source, $e(t)$, can be considered independent, the efficient representation can again be achieved by extracting these components. If the dynamics are `weak' in the sense that only weak temporal correlations are introduced by $F$, then we arrive at the original ICA problem. Because we are interested in the causes, i.e., in the driving noise, we need to learn both the autoregressive process ($F$) and the mixing process ($A$). This can be achieved [Lőrincz and Szabó, 2007] only if the components of the true driving noise are independent. Under the normal (Gaussian) noise assumption the effects of these processes cannot be distinguished. We need to carry out some manipulations in order not to misguide ICA.

We make use of the identities

$$x(t+1) = A s(t+1) = A F s(t) + A e(t+1) \tag{4.3}$$

to get

$$x(t+1) = M x(t) + n(t+1), \tag{4.4}$$

where $n(t+1) \doteq A e(t+1)$ and $M \doteq A F A^{-1}$ under the assumption that matrix $A$ can be inverted. Thus, both Eq. (4.2) and Eq. (4.4) have autoregressive forms. Due to the mixing effect of $A$ (Central Limit Theorem), the distribution of $A e(t+1)$ is more Gaussian-like compared to the true sources. It implies that the standard solution of Gaussian autoregressive processes can be applied as the first step to unfold the hidden processes.

Now let us suppose we have a tunable system and our task is to find the hidden process $s$ and the driving source $e$ using only the observation $x(t)$. In what follows, we distinguish approximations of the true quantities by a small hat.

First, one can remove the autoregressive part by estimating matrix $\hat{M}$ through the minimization of the following cost function

$$J(\hat{M}) = \frac{1}{2} \sum_t \left| x(t+1) - \hat{M}(t)\, x(t) \right|^2 \tag{4.5}$$

for all available data pairs $(x(t+1), x(t))$. Then, we have a model that predicts the next expected input

$$\hat{x}(t+1) = \hat{M}(t)\, x(t) \tag{4.6}$$

and the innovation can be estimated at time $t$:

$$\hat{n}_x(t) = x(t) - \hat{x}(t). \tag{4.7}$$
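The construction of Eqs. (4.1)-(4.7) can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration only: the 2-dimensional matrices `F` and `A`, the Laplace-distributed driving noise, and the use of a batch least-squares fit (via NumPy) as the minimizer of the cost in Eq. (4.5) instead of an incremental rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D example; F, A and the noise law are illustrative choices.
F = np.array([[0.8, -0.3],
              [0.3,  0.8]])          # stable first-order dynamics, Eq. (4.2)
A = np.array([[1.0, 0.5],
              [0.2, 1.0]])           # invertible mixing matrix, Eq. (4.1)

T = 20000
e = rng.laplace(size=(T, 2))         # independent, non-Gaussian driving noise
s = np.zeros((T, 2))
for t in range(T - 1):
    s[t + 1] = F @ s[t] + e[t + 1]   # hidden process s(t+1) = F s(t) + e(t+1)
x = s @ A.T                          # observations x(t) = A s(t)

# Batch minimizer of Eq. (4.5): least-squares fit of x(t+1) ~ M x(t).
X_prev, X_next = x[:-1], x[1:]
M_hat = np.linalg.lstsq(X_prev, X_next, rcond=None)[0].T

M_true = A @ F @ np.linalg.inv(A)    # M = A F A^{-1}, Eq. (4.4)
n_hat = X_next - X_prev @ M_hat.T    # estimated innovation, Eqs. (4.6)-(4.7)
```

With enough samples `M_hat` should approach `M_true`, and `n_hat` then approximates the mixed driving noise $A e(t)$ that the later separation stage operates on.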
For Gaussian $\hat{n}_x(t)$, the minimization of Eq. (4.5) leads to the following gradient rule:

$$\Delta \hat{M}(t+1) = \alpha_t \left( x(t+1) - \hat{M}(t)\, x(t) \right) x(t)' = \alpha_t\, \hat{n}_x(t+1)\, x(t)' \tag{4.8}$$

where prime $'$ denotes the transposed form for vectors and also for matrices, and $\alpha_t$ is the learning rate. If $\alpha_t$ diminishes according to some suitable schedule then $\hat{M}(t)$ converges to the real $M$ [Robbins and Monro, 1951]. In what follows, the learning rules will be written as

$$\Delta \hat{M}(t+1) \propto \hat{n}_x(t+1)\, x(t)' \tag{4.9}$$

where the sign `$\propto$' denotes the Robbins-Monro schedule. Note, however, that if the world is changing then it is better to maintain adaptation forever.

So far we have exploited the Gaussianity property of the driving noise to learn the dynamical system. Now we can make use of the fact that upon convergence, the innovation term also converges to the mixed true sources of Eq. (4.1) ($n_x(t) \to A e(t)$). In turn, simple separation of the innovation yields the demixing process $W$, which is the approximation of the inverse of the mixing matrix: $W = \hat{A}^{-1}$. Then $\hat{e}(t) = W \hat{n}_x(t)$ is the approximation of the true sources, whereas $\hat{s}(t) = W x(t)$ approximates the hidden process.

One can approximate the autoregressive matrix $F$ using quantities $x$, $\hat{M}$, and $\hat{e}$. The goal of the approximation is to optimize prediction, that is, to minimize the following cost function:

$$J(\hat{F}) = \frac{1}{2} \sum_t \left| \hat{s}(t+1) - \hat{F}(t)\, \hat{s}(t) \right|^2. \tag{4.10}$$

As with matrix $M$, matrix $F$ can be learned through the following gradient rule:

$$\Delta \hat{F}(t+1) \propto \left( \hat{s}(t+1) - \hat{F}(t)\, \hat{s}(t) \right) \hat{s}(t)' = \hat{e}(t+1)\, \hat{s}(t)', \tag{4.11}$$

that is,

$$\Delta \hat{F}(t+1) \propto W n_x(t+1) \left( W x(t) \right)' \tag{4.12}$$

This strategy has been detailed in [Lőrincz and Szabó, 2007].

Let us note that the gradient learning rules of Eqs. (4.8) and (4.11) may have plausible neural implementations as they are incremental and the change in one synapse does not depend on the change in all the other synapses.
If this latter condition is met, then we say that learning is Hebbian, or alternatively, the learning rule is `local'.

As signals should be separated and, as was argued before, separation can be facilitated if whitening takes place first, a decorrelation stage might be introduced. According to [Cardoso and Laheld, 1996], signals $y = P_y x$ become decorrelated if

$$\Delta P_y(t+1)' \propto P_y(t)' \left( I - y(t)\, y(t)' \right) \tag{4.13}$$

for all times $t = 1, 2, \ldots$ and under suitable conditions. Note that here, in Eq. (4.13), and in similar equations later, the learning rule contains the transposed form of matrix $P_y$ and thus dimension reduction ($\dim(y) \le \dim(x)$) is possible. Intuitively this serial update algorithm pushes the covariance matrix of $y(t)$ ($E(y y')$, where $E(\cdot)$ denotes expectation) to become identity. Let us remark that there are many artificial neuronal implementations of such algorithms [Foldiak, 1990, Hyvärinen and Oja, 1998, Linsker, 1999, Basalyga and Rattray, 2003].

For the very same reason, innovation $n_x(t)$ should also be decorrelated. The linear transformation $n_z = P_{n_z} n_x(t)$ of innovation $n_x(t)$ becomes white if tuning of $P_{n_z}$ is as follows:

$$\Delta P_{n_z}(t+1)' \propto P_{n_z}(t)' \left( I - n_z(t)\, n_z(t)' \right) \tag{4.14}$$

at time $t$.

Statistically independent sources from $n_z$ can be extracted via a nonlinear modification [Cardoso and Laheld, 1996] of update rule Eq. (4.14). There are many variants for this non-linear learning rule and we provide the simplest of these here:

$$\Delta W_{n_z}(t+1)' \propto W_{n_z}(t)' \left( I - \hat{e}(t)\, f(\hat{e}(t))' \right). \tag{4.15}$$

Here, $f(\cdot)$ is an (almost) arbitrary component-wise nonlinear function. Upon convergence, the components of $\hat{e}(t) = W_{n_z} n_z(t)$ approximate the components of the independent source $e(t)$ apart from an arbitrary permutation in the order of the components, their scale and sign.
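The whitening rule of Eq. (4.13) can be tried out in a few lines. The sketch below is not the serial update described in the text: as a simplifying assumption it replaces the sample-by-sample term $y(t) y(t)'$ with the sample covariance of $y$, which makes the iteration deterministic. The mixing matrix, the step size `lr` and the iteration count are hypothetical values chosen so that the iteration is stable.

```python
import numpy as np

rng = np.random.default_rng(1)

A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.1, 0.2, 1.0]])          # hypothetical mixing matrix
x = rng.standard_normal((5000, 3)) @ A.T
x -= x.mean(axis=0)                      # the rule assumes zero-mean signals
cov_x = x.T @ x / len(x)

P = np.eye(3)                            # decorrelating matrix, y = P x
lr = 0.05                                # illustrative step size
for _ in range(500):
    cov_y = P @ cov_x @ P.T              # sample estimate of E(y y')
    # Transposing Eq. (4.13) gives: Delta P  ~  (I - y y') P
    P += lr * (np.eye(3) - cov_y) @ P

cov_y = P @ cov_x @ P.T                  # should now be close to the identity
```

The fixed point of the update is reached exactly when the covariance of $y$ is the identity, which is the sense in which the rule `pushes' $E(y y')$ towards $I$. The same loop applies unchanged to the innovation in Eq. (4.14).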
Interestingly, spike timing dependent plasticity has been suggested to realize the non-linear learning rule of Eq. (4.15) [Bell and Parra, 2005].

The learning equations of the whitening and separation processes have several implications concerning possible mappings.

Two stages: Removal of the temporal correlations precedes the extraction of the independent factors.
Two channels: According to Eq. (4.11), the process of learning the predictive system requires concurrent access to the input and the innovation. These variables may be stored separately and conveyed to the predictive layer via separate channels.
Identical separation: It can be seen from Eq. (4.12) that both $\hat{e}(t)$ and $\hat{s}(t)$ are demixed by the same matrix, so they should be processed in the same demixing channel (violating the conjecture above) or there should be a mechanism that can compensate for the differences (e.g., sign and permutation of the components) in the linear transformations in the two channels for proper demixing.

Results II: Functional mapping of the model

Since both the computational considerations and the anatomical findings are quite complex, we need to introduce some simplifications:

Rate coding: How information is actually transmitted by the neurons is neglected. The key issue is that once the particular form is given, the function of the system can be analyzed as an information processing system. (On the controversies concerning the potential forms of information processing, however, see e.g. [Reyes, 2003, Masuda and Aihara, 2007].) Our system description becomes simpler if we use analog values, which corresponds to the concept of rate coding as opposed to spike based temporal coding. The supposed low-pass filtering effect of the theta oscillation also suggests that for some functions fine scale temporal precision might be neglected.
Laminar homogeneity: We neglect the complexity and richness at the cellular level and consider neurons as computational units.
The computations may change from layer to layer, but within a layer the nature of the computation is the same for all neurons. This corresponds to the terminology of standard artificial neural networks.
Apparent linearity: Although strong nonlinearities are present everywhere, from the subcellular level to the network level, there are nonetheless many cases in which the overall response of the system is approximately linear, see, e.g., [Linsker, 1999], [Hsu et al., 2004] and [Escabi et al., 2005] and the cited references. The complex contrast normalization mechanisms in visual sensory processing may constitute a specific example [Finn et al., 2007].

From now on, matrices denote synaptic weights (connection strength) between layers and vectors denote the activity at a given layer. We shall slightly abuse notation and will discard the hats from our equations, as all learned quantities are approximations.

Figure 1 may help to understand the modular structure of our model and its relation to the hippocampal region. While the left panel depicts the gross anatomy of the areas, including the different connection systems, the right panel of Fig. 1 shows the simplified architecture and the functional correspondences. The following areas of the hippocampal region are considered in the functional mapping: deep layers of the medial entorhinal cortex (denoted by EC V/VI), superficial layers (EC II and EC III) and subfield CA1 of the hippocampus. The tri-synaptic path (denoted as $W_{tri}$ on Fig. 1) involving the Dentate Gyrus (DG) and CA3 will be collapsed into an integrated transformation. The potential role of the DG, CA3 as well as the Subiculum (SUB) will be discussed in the last section. For simplicity, all areas and subfields will be referred to as `layers'.

As all computations described above require statistical characterization of input ensembles, sampling and processing of the sensory input and incremental tuning (learning) are also necessary.
Input processing and learning, i.e., fine tuning of the synaptic weights that actually filter the information, are discussed separately.

Characterization of the input to the hippocampal region

Let $x(t) \in \mathbb{R}^{n}$ denote the analog valued postrhinal input to the entorhinal cortex at discrete time $t$, where $n$ is the dimension of the input. In this model, we limit ourselves to square problems, which is to say that $n$ may be considered as both the number of postrhinal neurons and the number of entorhinal neurons of the targeted layer. Let us also assume that the input follows the dynamics described above. The postrhinal input enters the circuitry at the superficial layers of EC through two parallel connection systems $R_{II} \in \mathbb{R}^{n \times n}$ and $R_{III} \in \mathbb{R}^{n \times n}$, so we assume that the number of principal cells in each superficial layer is equal and is also $n$. These connection systems may only transmit cortical input to HR, so their tuning is omitted: admitting the lack of knowledge concerning the exact nature of the parallel postrhinal inputs, we may suppose that $R_{II} = R_{III} = I$, where $I \in \mathbb{R}^{n \times n}$ denotes the $n \times n$ identity matrix. When the process of learning the matrices is considered, a temporal index is shown in most cases. For better readability the time index is dropped for non-tunable matrices and in the dynamical equations.

From EC II/III the signals are sent to the hippocampus through the direct, i.e., EC III → CA1, and the indirect, tri-synaptic, i.e., EC II → CA1, pathways (denoted by subscripts `dir' and `tri' on the right hand side of Fig. 1, respectively).

Detailed correspondence between the functional model and the neural layers of the HR

The formal description has some direct consequences concerning the potential roles of the different layers of the HR.
First, innovation (that is, the comparison of the predicted and actual inputs) can only be stored in a layer that not only receives the input, but is also the target of inhibitory feedback. Due to its widespread inhibitory network, EC II is assigned to hold the innovation. The activity at EC II is as follows:

$$n_z(t+1) = R_{II}\, x(t+1) + Q_{II}\, h(t) - K y(t), \tag{4.16}$$

where $y(t)$ and $h(t)$ denote the activity at EC III and EC V/VI, respectively. (Roman subscripts of the connection matrices denote the number of the targeted layer.) Connections from EC III to EC II, denoted by $K$, are assumed to be mostly inhibitory. The reason for this assumption is that the vast majority of the deep to superficial connections are excitatory and mostly target principal cells in EC II [van Haeften et al., 2003] and the cortical inputs are also of excitatory nature. In turn, $K$ is the candidate connection system that effectively targets the inhibitory network of EC II. Here, the role of $Q_{II}$ is to whiten the innovation, whereas the role of $K$ is to ensure that the emerging activity pattern is indeed proportional to the required innovation.

Equation (4.16) and the connectivity of the HR imply that $y(t)$ should be proportional to the input and is made of two terms from bottom-up and top-down contributions. The activity of EC III is thus the following:

$$y(t) = R_{III}\, x(t) + Q_{III}\, h(t), \tag{4.17}$$

where $Q_{III}$, in accordance with the redundancy reduction principle, is assumed to decorrelate the activity at the targeted layer, EC III. However, decorrelation of quantity $y(t)$ may influence (distort) the innovation in EC II. This raises some doubts, because quantity $n_z(t+1)$ might be contaminated by predictable components, or its whiteness might be spoiled. In turn, tuning of matrix $K$ should somehow counteract both problems under the constraint that learning is Hebbian.
The solution to this threefold problem is an emerging property in our model.

As was noted earlier, CA1 has a central location since it is targeted by both layers EC III and EC II via $W_{dir}(t) \in \mathbb{R}^{n \times n}$ and $W_{tri}(t) \in \mathbb{R}^{n \times n}$, respectively:

$$s(t) = W_{dir}(t)\, y(t),$$

where $s(t) \in \mathbb{R}^{n}$ denotes the activity of CA1 if its driving input is projected from EC III, and

$$e(t) = W_{tri}(t)\, n_z(t), \tag{4.18}$$

where $e(t) \in \mathbb{R}^{n}$ denotes the activity of CA1 if its driving input is projected from EC II. Following the proposal of [Lőrincz, 1998, Lőrincz and Buzsáki, 2000] and supported by the experimental findings of [Redish et al., 2001], independent components should be expressed in CA1. In turn, we believe transformations $W_{dir}(t)$ and $W_{tri}(t)$ realize the actual signal separation and provide approximate independent components. We note that according to Eq. (4.2) $e(t)$ should be equal to the innovation of $s(t)$. However, unlike in the superficial layers, there are no recurrent collaterals in CA1. This means that for properly tuned $Q_{II}$, $Q_{III}$ and $K$, the two bottom-up transformations, i.e., $W_{tri}$ and $W_{dir}$, should become effectively identical in the absence of recurrent collaterals.

CA1 signals may leave the loop through the subiculum or they may be sent back to the deep layers of EC via the connection system denoted by $V \in \mathbb{R}^{n \times n}$. (On the intriguing properties of $V$ (not modeled here), see [Naber et al., 2001].)

In line with [Lőrincz et al., 2002], a central function of the deep layers of EC may be pattern completion. However, as was already noted, forcing independence does not support pattern completion. It is also known that activity patterns of the deep layers of EC are not in fact independent [Sargolini et al., 2006]. This implies that `remixing' of the components is advantageous. Of the many possibilities, whitening seems the most straightforward transformation, as it does not increase the number of transformations within the EC-HC circuitry.
The resulting patterns may show higher-order correlations supporting the task of pattern completion. Since the internal predictive system is based on the intensive use of recurrent connections, only CA3 and EC V/VI may be considered. If our assumptions about the roles of the superficial layers are valid, then EC V/VI should realize the predictive system since CA3 is not supposed to receive significant input from EC III.

Consequently, the activity at EC V/VI can be written as:

$$h(t+1) = M_h\, h(t) + n_h(t), \tag{4.19}$$

where predictive system $M_h$ can propagate activity $h(t)$ in time, $h(t) = V s(t)$ and $n_h(t) = V e(t)$. In addition to conveying information from CA1, $V$ is responsible for the decorrelation of the activity patterns. $M_h$ is an approximation of the dynamical model underlying the observations (see Eq. (4.2)). The queuing of the arrival of the two different inputs ($s(t)$ and $e(t)$) requires a mechanism that can maintain activity long enough to enable integration. Experimental findings on gradually modifiable persistent activity in EC V [Egorov et al., 2002] may support this proposal.

Finally, the deep layers project back to EC II and EC III via $Q_{II}$ and $Q_{III}$, respectively.

Learning processes

For different reasons, three connection systems are assumed to decorrelate the activity of their targeted layer: $Q_{II}$, $Q_{III}$, and $V$. Their tuning follows the form given in Eqs. (4.13) or (4.14). For example, learning of $Q_{II}$ can be given as:

$$\Delta Q_{II}(t+1)' \propto Q_{II}(t)' \left( I - n_z(t)\, n_z(t)' \right) \tag{4.20}$$

where $n_z(t)$ is the emerging activity of the targeted layer, EC II.

To arrive at the right form of innovation, connections between EC III and EC II need to be tuned. The learning rule of $K(t)$ is supposed to satisfy a Hebbian form, similar to Eq. (4.9):

$$\Delta K(t+1) \propto n_z(t+1)\, y(t)'. \tag{4.21}$$
This is the appropriate learning rule, because it minimizes the cost function

$$J(K) = \frac{1}{2} \sum_t \left| R_{II}\, x(t+1) + Q_{II}\, h(t) - K y(t) \right|^2 = \frac{1}{2} \sum_t \left| n_z(t+1) \right|^2, \tag{4.22}$$

which is the squared Euclidean norm of $n_z$. In this expression each term is a linear transform of $x$ with different time lags. The result of the learning rule is that, apart from an arbitrary linear transformation, $K y(t) = Q_{II}\, h(t) + R_{II}\, x(t)$ is satisfied in all instances. This is the net result, i.e., $n_z(t)$ is indeed a linear transform of innovation $n_x(t)$. Quantity $n_z(t)$ will be white given the learning rule for $Q_{II}$ detailed in Eq. (4.20). In sum, learning rule Eq. (4.21) is Hebbian and adjusts the inhibitory contribution until $n_z(t)$ becomes a linear transformation of the innovation, subject to the constraint that both $y(t)$ and $n_z(t)$ are white.

Separation takes place in both the direct and the indirect pathways, so $W_{dir}$ and $W_{tri}$ should undergo tuning similar to Eq. (4.15). At this point some remarks are in order. We expect to have two separate channels, one for the input and one for the innovation, which can basically reverse the mixing effect of the very same mixing process (see Eq. (4.1)). We have also seen that both separation processes would probably end up creating approximately independent components in the same layer (CA1). First, it is necessary to ensure that learning in the two separation pathways converges to approximately the same solution. Second, it is necessary to schedule the activity at CA1 to avoid interference between the patterns corresponding to the independent components of the input or the innovation.
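The effect of the rule in Eq. (4.21) can be checked on synthetic data. In this sketch the vector `u` stands in for the excitatory drive $R_{II} x(t+1) + Q_{II} h(t)$ of Eq. (4.16); it is built, purely as an assumption, from a predictable part `K_true @ y` plus an unpredictable innovation. The activity `y` is drawn as white noise, matching the whiteness constraint stated above, and the update is averaged over all samples instead of being applied step by step.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 5000, 3

y = rng.standard_normal((T, n))           # EC III activity, assumed white
K_true = rng.standard_normal((n, n))      # hypothetical predictable mapping
innovation = rng.laplace(size=(T, n))     # unpredictable part of the drive
u = y @ K_true.T + innovation             # stand-in for R_II x(t+1) + Q_II h(t)

K = np.zeros((n, n))
lr = 0.5                                  # illustrative step size
for _ in range(100):
    n_z = u - y @ K.T                     # EC II activity, Eq. (4.16)
    K += lr * (n_z.T @ y) / T             # sample average of Eq. (4.21)

n_z = u - y @ K.T
cross = n_z.T @ y / T                     # correlation between n_z and y
```

At convergence the residual `n_z` is uncorrelated with `y`, i.e., the inhibitory term $K y(t)$ has removed everything in the drive that `y` can account for, which is the sense in which $n_z$ carries only the innovation.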
Regarding the interaction between $W_{dir}$ and $W_{tri}$, it is intriguing that while the original problem of ICA (that is, when the mixing process and the components are unknown) is truly unsupervised, by constraining the outputs of the tunable matrix to some prescribed outputs the learning algorithm becomes supervised. Thus, the two matrices may become identical if one channel dominates (supervises) the other. Physiological considerations seem to suggest a possible mechanism.

Regarding separation, in the beginning the faster direct pathway may supervise the indirect one by providing approximately independent components in CA1. We note that there is a temporal coordination between the firing of the neurons that send information through the direct and the indirect paths [Dragoi and Buzsáki, 2006]. It is also possible that supervising signals may reach CA1 at one phase of the theta oscillations, while the signals from the tri-synaptic pathway may reach CA1 at the other phase. Another argument is that although place fields in CA1 begin to stabilize early (compared to the place fields in CA3) and even without input from the tri-synaptic route, full stabilization takes much longer. We suggest that the two routes work together. The early stabilization results in approximate independent components if the signal from EC III is contaminated by large temporal correlations. The task of the indirect route may be to diminish this kind of temporal dependence and to proceed with the separation of the sources, but this is a slower process.

Following our hypothesis, tuning of $W_{dir}(t)$ and $W_{tri}(t)$ may assume two different forms during the course of learning:

$$\Delta W_{dir/tri}(t+1)' \propto W_{dir/tri}(t)' \left( I - s(t)\, f(s(t))' \right) \tag{4.23}$$

$$\Delta W_{dir/tri}(t+1)' \propto W_{dir/tri}(t)' \left( I - e(t)\, f(e(t))' \right) \tag{4.24}$$

where $f(\cdot)$ is an (almost) arbitrary component-wise nonlinear function.

In the formal model we have seen that all these transformations are required to provide the right information for the internal predictive model. However, this model also needs tuning in order to match the observed signals.

The approximation of predictive matrix $M_h$ (as with all predictive matrices in the model) can be written as follows:

$$\Delta M_h(t+1) \propto n_h(t+1)\, h(t)' \tag{4.25}$$

This rule trains matrix $M_h$ to optimize prediction in the Euclidean norm. Due to the scheduled arrival, we need to suppose that the time window is broad enough to enable interaction of the transformed input signal and the innovation. As we see, training is Hebbian, but a detailed mechanism that would actually be able to carry out this tuning is missing. Nevertheless, we conjecture that the double loops of the direct and indirect pathways have a fundamental role in channeling the right information at the right time. It is worth noting that this assumption is also supported by the experimental finding that activity in CA3 under one theta oscillation (50-80 ms) may correspond to 1 second of the external sensory flow. Unfortunately, available experimental data are not sufficient to better model this interplay.

In summary, if all transformations are optimally tuned, then (1) temporal correlations $F$ are learnt and represented in the internal model through matrix $M_h$, (2) the hidden processes $h$ can be estimated by the learnt model and (3) the true independent causes $e$ can also be revealed. Note that two main goals are achieved: the independent causes ($e$) are revealed up to an arbitrary permutation, scale and sign, and the predictive matrix $F$ is learnt up to a linear transformation.

Figure 2: (a): Circular maze, diameter: 2 m, with a short sample trajectory. Step size varies between 11 and 55 cm. (b): Sample input to the loop in the form of an activity map within the maze (see Eq.
(4.27) for details). Activity map is shown in arbitrary units.

In the next section we turn back to the original problem of the emergence of particular spatial activity at different parts of the HR. In the simulations the transformations assigned to different parts of the loop are implemented and applied on structured high dimensional inputs containing spatial information. The goal is to study whether the emergent activity at the different modules corresponding to e.g. CA1 and EC II/EC III resembles that found experimentally.

Results III: Simulations

We present a series of simulations with inputs of increasing complexity. The more realistic the inputs, the more complex the computations that are required to extract spatial information. In doing so, the role of different modules can be highlighted.

In our sample simulations a virtual rat has explored a 2 m wide, open-field circular maze. Similar results were reached using a square maze. The path has been generated as follows: the rat runs on a linear path at a constant speed and makes a small random turn at each step with a given chance. It also makes a random turn if it `senses' that it may collide with the wall. Input sampling has been fixed to 55 cm. The length of this random trajectory and input sampling were chosen to get a fair coverage of the full area of the maze with a reasonable number of samples. The maze and a sample trajectory are shown in Fig. 2. Inputs corresponding to turns may only be interpreted by higher order autoregressive processes for which the order would be about the average number of steps in a single direction. As the implemented internal model assumes first order processes (see the comment at Eq. (4.2)), such inputs have been excluded. We shall come back to this point in Section 5.

The most restrictive approximation in our simulations is that the input contains information about the local cues only; no distal information is included.
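The trajectory generation described above (straight runs at constant speed, occasional small random turns, and a random turn when the wall is ahead) can be sketched as follows. The function name `simulate_path` and all parameter values (step length, turn probability, maximal turn angle) are hypothetical; the text does not specify them in this form.

```python
import math
import random


def simulate_path(n_steps, radius=1.0, step=0.05, turn_prob=0.2,
                  max_turn=math.pi / 6, seed=0):
    """Random walk of a virtual rat inside a circular maze (values are illustrative)."""
    rnd = random.Random(seed)
    x, y = 0.0, 0.0
    heading = rnd.uniform(0.0, 2.0 * math.pi)
    path = [(x, y)]
    for _ in range(n_steps):
        if rnd.random() < turn_prob:                 # occasional small random turn
            heading += rnd.uniform(-max_turn, max_turn)
        nx = x + step * math.cos(heading)
        ny = y + step * math.sin(heading)
        while math.hypot(nx, ny) > radius:           # wall ahead: draw a new direction
            heading = rnd.uniform(0.0, 2.0 * math.pi)
            nx = x + step * math.cos(heading)
            ny = y + step * math.sin(heading)
        x, y = nx, ny
        path.append((x, y))
    return path


path = simulate_path(2000)   # 2 m wide maze: radius of 1 m
```

Each step has the same length, so the speed is constant, and the resampling loop guarantees that every visited position lies inside the maze.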
One might think that the input is a mixture of smells that differs from point to point. This local nature implies that parametric maze distortions cannot be modeled in this framework. On the other hand, this simplification excluded any artifact that would result from arbitrary modeling of low-level sensory processing. Instead, we simply mimicked postrhinal (`parahippocampal' in primates) [Burwell and Hafeman, 2003] inputs. In contrast to perirhinal input [Eacott and Gaffan, 2005], postrhinal input is assumed to reflect changes of spatial properties or directly carry spatial information (albeit in weak correlations, [Fyhn et al., 2004]). Such spatial dependence of the postrhinal activity was approximated by first creating n Gaussian patches with each Gaussian having a maximum amplitude of 1:

$$g_i(p) = \exp\left( -\frac{(p - c_i)^2}{2\sigma_i^2} \right), \tag{4.26}$$

where $p \in \mathbb{R}^2$ denotes the coordinate vector of the rat, $c_i \in \mathbb{R}^2$ is the coordinate vector of the center of the $i$th Gaussian, and i ∈ { , . . . , }. Centers $c_i$ were drawn from the uniform distribution over the full maze while $\sigma_i$ were uniformly drawn from the range [20 cm, 40 cm].

Input x was created by using a random, binary mixing matrix G ∈ [0 , × over the set of the Gaussians:

$$x(t) = G\, g(p(t)), \tag{4.27}$$

where $p(t)$ denotes the coordinates of the rat in the maze at time $t$ and the $i$th component of vector g ∈ R is $g_i(p(t))$ at time $t$. Each row of matrix $G$ contains 20 positive non-zero elements on average. The resulting activity map for a single component of x ( t ) ∈ R , i.e., for one of our `sensors', is shown in Fig. 2(b).

In simulation x ( t ) was 1050. The new units `sensed' directions and had no spatial dependence. The direction sensitivity has been defined as:

(4.28) x_i = f_i(φ) = max(0, cos(φ − φ_i)), for i ∈ { , . . . },

where φ denotes the direction between the last and the current positions and φ_i denotes the direction for which the $i$th component ( < i ≤ ) is the most sensitive.
This particular choice results in broadly tuned ( ∼ π/ ) directional activities.

In simulation x_i(t) = f_i(φ(t)) [G g(x(t), y(t))]_i where φ(t) is the direction of the rat at time t.

Last, in simulation

x^(tc)(t + 1) = (1 − α) x^(tc)(t) + α x(t + 1)

where superscript `tc' stands for `temporally convolved'. This is essentially the simplest autoregressive process regarding Eq. (4.2).

Spatial analysis

As opposed to real spiking data, linear transformations may give rise to negative signals. In turn, the correspondence between the unit activity values after each transformation and the neurons' responses is not straightforward. In order to generate the activity maps of the input units, first we discretized the space (the resolution was so a bin is . m × . m, which is comparable to [Hafting et al., 2005a]), and for each bin we summed up the activity measured in those steps that ended in the given bin. This spatial averaging smoothes out the artifacts caused by unattended spots. The activity after, e.g., decorrelation may assume negative values, so the data were half-wave rectified (clipped) and scaled to range [0 , .

ICA is invariant for the change of sign [Jutten and Herault, 1991]. In turn, the sign of an activity map has been defined by the average sign of the first 10 bins with the highest absolute value. That is, if more than 5 of these bins were negative, we simply flipped the sign of the map. The resulting maps were then half-wave rectified. We also computed the 2 dimensional normalized autocorrelation for each activity map.

The spatial analysis of the peak activity regions for the autocorrelation image has been done by fitting a grid on the locally maximal points using Delaunay-triangulation [Markus et al., 1995, Takács and Lőrincz, 2007]. Border vertices and nodes have been excluded from the analysis.
Vertices are considered as internal if they belong to two triangles and nodes are internal if they only connect to other nodes through internal vertices. To characterize the regularity of the resulting grids, we calculated the vertex length and the angle distribution. Discretization, however, defines a lower bound of the edge length, which is about 2 bins, that is ∼ . cm. Because the mean angle in Delaunay-triangulation is obviously 60 degrees, the spread around this value (that is the standard deviation, or std for short) can be used to quantify regularity. The distribution of the mean vertex lengths and the distribution of the std of the angle values for the whole population have been used to compare the spatial characteristics of the input set and the set of the transformed signals.

For simulations × bins and in each bin we collected those steps that ended in that bin. Their direction, weighted by the response value at the end point, was then added up. The resulting directed activity values can be visualized in a `direction-field' plot. In order to characterize the spatial heterogeneity of the directional selectivity, the directed values may also be grouped according to their direction and these lumped sum values will be presented on a polar plot. These analyses serve to characterize the strength of spatial heterogeneity in direction selectivity.
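The activity-map steps of the spatial analysis (binning, summing the activity of the steps ending in each bin, resolving the sign ambiguity of ICA, half-wave rectification and scaling) can be sketched as below. The helper names `activity_map` and `fix_sign_and_rectify`, the 30 × 30 grid, and the synthetic place-like unit are assumptions for illustration; the scaling to a unit maximum is likewise an illustrative choice.

```python
import numpy as np


def activity_map(positions, activity, n_bins=30, extent=1.0):
    """Sum a unit's activity over the steps that end in each spatial bin."""
    edges = np.linspace(-extent, extent, n_bins + 1)
    amap, _, _ = np.histogram2d(positions[:, 0], positions[:, 1],
                                bins=[edges, edges], weights=activity)
    return amap


def fix_sign_and_rectify(amap, n_top=10):
    """Flip the map if most of its n_top largest-magnitude bins are negative,
    then half-wave rectify and scale so that the maximum is 1."""
    flat = amap.ravel()
    top = np.argsort(np.abs(flat))[-n_top:]      # bins with the highest |value|
    if np.sum(flat[top] < 0) > n_top / 2:
        amap = -amap
    rect = np.clip(amap, 0.0, None)              # half-wave rectification
    peak = rect.max()
    return rect / peak if peak > 0 else rect


rng = np.random.default_rng(3)
pos = rng.uniform(-1.0, 1.0, size=(5000, 2))     # stand-in for visited positions
# A place-like unit whose sign was inverted, as can happen after ICA.
act = -np.exp(-np.sum((pos - 0.3) ** 2, axis=1) / 0.05)
amap = fix_sign_and_rectify(activity_map(pos, act))
```

Because the synthetic unit is negative everywhere, the sign rule flips it before rectification, so the resulting map is non-negative with a single localized bump.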
Figure 3: Simulation. Panels (a)-(j); the histograms compare the input and the decorrelated input (horizontal axis: mean edge length [cm]).
Figure 4: Simulation. Panels (a)-(g).
Figure 5: Simulation. Panels (a)-(g), including a correlation map.
Figure 6: Innovation on the decorrelated conjunctive inputs. (a-c): columns correspond to the output of different separating (ICA) units. First row: half-wave rectified activity maps. Second row: spatial distribution of the direction selectivity is shown on a square grid of size 10x10. Third row: overall direction selectivity in the form of a polar plot.
Figure 7: Separation on temporally convolved, position and direction dependent inputs. (a-c): each column corresponds to different ICA output units. First row: sign flipped and half-wave rectified activity maps. Second row: spatial distribution of the direction selectivity is shown on a square grid of size 10x10. Third row: overall direction selectivity in the form of a polar plot.

Interpretation of the simulation results

Although our model construct is based on general ideas about efficient representation of sensory events, when applied to spatially anchored inputs it has shown some intriguing properties that can directly correspond to experimental data.

The model correspondences have already been supported by the first simulation in that grid-like activity has appeared in exactly those modules the neural substrates of which were reported to present this particular activity. Once grid-like activity is present, forcing independence results in localized activity, as was shown, for example, in [Franzius et al., 2007]. What is more interesting, though, is that reciprocity (i.e., place cells are needed to get stable grid cells) can also be explained by the loopy structure of our model. Another observation is that the weak overlap among the resulting place fields can be considered as discretization of the space. Similarly to what was found first in [Lőrincz et al., 2001], this is what ICA seems to do if there is a small dimensional space behind the high dimensional inputs.

In Simulation V ) may be resolved. Decorrelation seems appropriate. Another consequence is that depending on the temporal structure of the input, after realizing these particular correlations separating transformations may efficiently channel the information.
Although we omitted the modeling of the subicular complex, this observation may explain the existence of the distal/proximal loops between CA1, subiculum and EC [Gigg, 2006].

Before presenting our conjectures and predictions, let us recap the logic behind them. First we claimed that a memory system is efficient if the resulting representations (1) support a predictive internal model of sensory events, (2) can be interpreted in a probabilistic framework to cope with uncertainties and (3) can be factored to maintain the redundancy reduction principle, but also help reveal relevant subspaces. These high level functional motivations lead to a computational model that can explain the sensory input in terms of independent causes and can also predict the temporal changes of these causes and their interactions. The predictive faculty of the proposed structure is realized in an internal model that can take into account intrinsic (e.g. self-motion induced) and extrinsic changes in the observed signals. It is worth noting here that such distinctions are only meaningful if control of the intrinsic changes (for instance, changing the pace through appropriate motor commands) is possible. The required computational stages form a loop in which learning (tuning) and functioning are tightly coupled. The loopy structure implies that the HR connects the downstream and upstream information flow between the efferent and afferent pathways.

Next we attempted to map the proposed functional model onto the neuronal substrate by enumerating supporting anatomical, physiological and behavioral data. Due to the complexity of the problem a series of simplifications had to be introduced. Our large scale functional model ignores fine temporal scales, thus (1) rate based coding of information is sufficient.
We also reduce the difficulty by focusing only on (2) linear transformations (apart from the rectification of the neuronal outputs), although each stage can also be extended to be nonlinear. We intend to provide (3) a network level description only, in which the transformations are carried out by similar computational units. These considerations, together with the validating simulations, which were specifically aimed at studying spatial dependence, may lead to the following conjectures:

(1) The core transformation of the circuitry may be seen as a realization of independent process analysis, which provides a two stage solution to recover hidden components as well as the dynamics.
    (a) In one stage, separation of independent (hidden) causes and their corresponding subspaces may take place. The HC plays a crucial role in mapping independent coordinates such as position and direction to different areas. Grouping of the components of the non-independent factors may occur by using the information about their `non-independence', i.e., within the subspaces themselves.
    (b) In the other stage, a predictive system is implemented that can be fine tuned to fit the temporal scale of the evolution of the observed signals. Due to the interplay between these two functions, decorrelation and separation take place repeatedly along both the direct and the tri-synaptic routes.
(2) Depending on the capacity of the available resources, separation can be seen as a means (i) of finding separable subspaces and (ii) of discretizing these low dimensional subspaces. Position and direction, for instance, can be seen as two complementary but independent pieces of information. In turn,
separation has a central role in shaping the responses of both the place and the head direction cells.
(3) The predictive internal model is maintained by EC V/VI.
(4) The innovation term is formed in EC II through a complex interaction of at least 3 different areas projecting to the given layer.
(5) The main input is held in EC III.
(6) The innovation term is the net result of the comparison of the expected input produced by the predictive system and the real input. Such comparison is made possible through the activation (by the EC III to EC II connections) of the widespread inhibitory network of EC II.
(7) For both the innovation and the input, bottom-up and top-down connections work in concert to achieve decorrelation.
(8) Actual separation is carried out in both the direct and the tri-synaptic pathways, resulting in independent activity in CA1. The two processes interact during learning as well as functioning.
(9) Forcing independence may interfere with prediction, so some remixing is needed. Whitening seemed to be a natural choice, and this was supported by the comparison of the simulation results (grid activity emerges by decorrelation) with the experimental findings (grid activity can be found in all layers of dMEC).
(10) The loopy structure and the whitening role of EC deep to EC superficial connections explain the fact that when the HC is removed, signals of both superficial layers of EC change [Fyhn et al., 2004].

The resulting mapping is an improvement over the one proposed in [Lőrincz and Buzsáki, 2000, Lőrincz et al., 2002], where decorrelation was assigned to CA3. This modification is necessary [Takács and Lőrincz, 2007], because in applying decorrelation to spatially defined inputs grid-like activity emerges, and such grids were found in the entorhinal cortex and not in the CA3.
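
As a rough illustration of the decorrelation (whitening) step referred to in conjecture (9), the following NumPy sketch whitens a set of correlated rate signals so that their covariance becomes the identity matrix. The function name `whiten`, the symmetric (ZCA) form of the whitening matrix, the Laplace-distributed toy sources and the mixing matrix are all assumptions made for this illustration; they are not taken from the simulations reported here.

```python
import numpy as np

def whiten(x, eps=1e-9):
    """Symmetric (ZCA) whitening; x has shape (channels, samples). Hypothetical helper."""
    x_centered = x - x.mean(axis=1, keepdims=True)
    cov = x_centered @ x_centered.T / x_centered.shape[1]
    # Eigendecomposition of the symmetric covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    # W = C^(-1/2); eps guards against division by a vanishing eigenvalue
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return w @ x_centered, w

# Toy data (assumed): three independent sources, linearly mixed
rng = np.random.default_rng(0)
sources = rng.laplace(size=(3, 5000))
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.1, 0.6, 1.0]])
observed = mixing @ sources

whitened, w = whiten(observed)
cov_whitened = whitened @ whitened.T / whitened.shape[1]
```

After whitening, the channels are uncorrelated with unit variance but not necessarily independent; a separating (ICA) stage would still have to find the remaining rotation.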
In our model the main role of the deep-to-superficial connections is whitening, whereas the comparator role of EC layer II [Lőrincz and Buzsáki, 2000, Lőrincz et al., 2002] has not been modified.

Regarding spatial information, simulations revealed that decoupling of directional and positional information is viable in our model framework. If the neuronal mapping is correct, this decoupling defines the interplay between the hippocampus proper (responsible for shaping and maintaining primarily positional information) and the subicular complex (responsible for directional information). It may also explain the necessity of the two parallel routes. Because the proximal and distant targets differ in the two areas, it is possible that computations are similar at the CA1 and at the subiculum.

These considerations imply the following predictions:

Subspace separation: At the initial stage of place field stabilization in CA1, cells may show gradually diminishing direction sensitivity. If this conjecture is not supported by experimental findings, then place field formation cannot be explained by applying purely statistical considerations.

Top-down influences: The key role of the deep layers of EC in extracting temporal dependencies (i.e. separating the predictable parts) implies that perturbation at these layers would result in a faulty prediction system and a weaker representation of directions in the subicular complex. In particular, changes in the activity of EC superficial layer neurons are expected.
If such changes indeed exist, then the characteristic properties of these changes provide information about top-down influences on input filtering: modulation of the internal dynamical model may change the information that traverses to the CA1 subfield.

Distortions: In the model, parametric distortion of the grids ([Barry et al., 2007]) may only be demonstrated by providing information about the motor efferents or by providing access to control the processing of sensory information.

Processing along the direct pathway is faster, as fewer transformations are involved. However, when temporal correlations are present, the resulting components may be distorted. In this case, the tri-synaptic pathway is expected to become dominant, as it can diminish these correlations. In sum, the varying influence of the two pathways may cause the temporary direction selectivity of the emerging place fields.

In our proposal, learning in the direct and tri-synaptic pathways takes place at different speeds. As independent sources should be developed on i.i.d. sources, we would expect that the CA1 responses are defined by the tri-synaptic pathway, at least at the fine tuning stage of learning. According to the experiments [Sybirska et al., 2000], each pathway can form stable place fields in the absence of the other. The processing along the direct pathway is probably faster [Leutgeb et al., 2004] and we think that this is due to the lack of temporal decorrelation in this pathway. However, when temporal decorrelation is present, the tri-synaptic route may take the lead in tuning CA1. Both proximal and distant dendrites may play a role in learning the separation transformations, especially in the coordination of the ICA components.

The second prediction emphasizes the fact that physical constraints of the animal's motion set the temporal scale of changes in direction.
If the predictive internal model cannot correctly register this timescale, then extraction of this kind of information will be impaired while recovery of positional information remains intact.

The last prediction deserves some comments. As the main computations in our model are aimed at characterizing a set of inputs by extracting statistical information, any change in the underlying statistics would result in strong distortions of the emerging activity pattern (see Simulations [...] W dir and W tri ) is quite restrictive. If this constraint is not experimentally supported, then serious reconsideration seems necessary.

The other issue regards the effect of goal-oriented behavior. Although we have seen that our model yields orthogonal hexagrid tiling, the resulting grids are not oriented. Oriented grids, however, may not be formed without additional constraints in our model. Based on the arguments concerning the differentiation of internal and external observation signals, we believe that integration with control [Szita and Lőrincz, 2004] over the observation process could yield the desired property.

Open issues

While the model we have proposed successfully replicated the reported space-dependent activity at different areas of the HR, several questions remain unanswered. First, we enumerate issues related either to the current stage of our model construction or to the particular form of the presented simulations.

In our simulations we used locally defined inputs and did not model sensory associations between local and distal cues. Such binding is not trivial and remains an open problem, for example in computer vision.

Although we showed that separation of relevant low-dimensional subspaces is possible, the mechanism of regrouping or fusion of the factors belonging to the same subspace is not yet known.
We suppose that the particular cross talk between CA1 and the subiculum [Gigg, 2006] may provide a clue.

As regards prediction, even for the simplest case of the first order autoregressive process the training of the predictive matrix is quite involved, as the required innovation and signal terms are supposedly stored in different areas. In turn, queuing their arrival is fundamental. At present it is not known what kind of network mechanism may set the timing. As was suggested in [Dragoi and Buzsáki, 2006], one candidate would be the network level theta-oscillation that may gate information transfer to the deep layers of EC. In favor of this proposal, it is known that deep layer principal cells have distinctive theta modulation properties (see, e.g., [Chrobak et al., 2000] and references therein) and LTP in the deep layers of the EC may be preferentially responsive to slow patterned activity [Yun et al., 2002].

In the following, we name a few important properties of the hippocampal region not yet integrated into the model.

One relevant question concerns how the memory system can store information after one encounter (`one-shot' learning). This phenomenon probably requires an additional mechanism not yet incorporated into our model, since it is not based on statistical learning principles. Such a mechanism could be simple and Hebbian [Körmendy-Rácz et al., 1999].

Setting aside this prompt learning, consolidating the acquired knowledge usually takes more time. Presumably, sequential replay of previously formed activity patterns in CA3 may facilitate this process. In line with our initial assumptions we conjecture that forward replay may actually help shape the predictive system, while reverse replay is required to form better strategies for goal-oriented behavior [Sutton and Barto, 1998]. To define goals and behavior for our system, first a control mechanism should be integrated. Such a mechanism would affect the sampling of the available inputs by changing the trajectory.

In the simulations, we introduced one form of temporal convolution, but it is known that HR is able to represent sequences of spatiotemporal activity patterns in a temporally compressed form of varying timescales. Such highly versatile convolution makes decoding even harder. It was suggested [Lőrincz and Buzsáki, 2000] that this task is assigned to the EC-DG-CA3 loop. A further improvement of our model would be to incorporate this loop as well.

6. Acknowledgments

G. Sz. is supported by the Zoltán Magyary fellowship of the Hungarian Ministry of Education. We are grateful to Zoltán Szabó for his help in running some of the computer experiments.

This material is based upon work supported partially by the EC FET grant, the `New Ties project' and EC NEST grant, the `Percept project' under contracts No. 003752 and No. 043261, respectively. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the EC, or other members of the EC New Ties or Percept projects.

References

[Amari et al., 1996] Amari, S. I., Cichocki, A., and Yang, H. (1996). A new learning algorithm for blind signal separation. In Advances in Neural Information Processing Systems, pages 757–763. Morgan Kaufmann, San Mateo, CA.
[Barlow, 2001] Barlow, H. (2001). Redundancy reduction revisited. Network: Comput. Neural Syst., 12:241–253.
[Barry et al., 2007] Barry, C., Hayman, R., Burgess, N., and Jeffery, K. J. (2007). Experience-dependent rescaling of entorhinal grids. Nature Neuroscience, 10(6):682–684.
[Basalyga and Rattray, 2003] Basalyga, G. and Rattray, M. (2003). Statistical dynamics of on-line independent component analysis. Journal of Machine Learning Research, 4:1393–1410.
[Becker, 2005] Becker, S. (2005). A computational principle for hippocampal learning and neurogenesis.
Hippocampus, 15:722–738.
[Bell and Parra, 2005] Bell, A. J. and Parra, L. C. (2005). Maximising sensitivity in a spiking network. In Saul, L. K., Weiss, Y., and Bottou, L., editors, Advances in Neural Information Processing Systems 17, pages 121–128. MIT Press, Cambridge, MA.
[Bell and Sejnowski, 1995] Bell, A. J. and Sejnowski, T. J. (1995). An information-maximization approach to blind separation and blind deconvolution. Neural Computation, 7:1129–1159.
[Ben-Shahar and Zucker, 2004] Ben-Shahar, O. and Zucker, S. (2004). Geometrical computations explain projection patterns of long-range horizontal connections in visual cortex. Neural Computation, 16:445–476.
[Bragin et al., 1995] Bragin, A., Jandó, G., Nádasdy, Z., Hetke, J., Wise, K., and Buzsáki, G. (1995). Gamma (40-100 Hz) oscillation in the hippocampus of the behaving rat. Journal of Neuroscience, 15:47–60.
[Burak and Fiete, 2006] Burak, Y. and Fiete, I. (2006). Do we understand the emergent dynamics of grid cells? Journal of Neuroscience, 26(37):9352–9354.
[Burgess et al., 2007] Burgess, N., Barry, C., and O'Keefe, J. (2007). An oscillatory interference model of grid cell firing. Hippocampus, 17:801–812.
[Burwell and Hafeman, 2003] Burwell, R. D. and Hafeman, D. M. (2003). Positional firing properties of postrhinal cortex neurons. Neuroscience, 119:577–588.
[Buzsáki, 2006] Buzsáki, G. (2006). Rhythms of the Brain. Oxford University Press, Oxford, UK.
[Canolty et al., 2006] Canolty, R., Edwards, E., Soltani, M., Dalal, S. S., Barbaro, H. E. K. N. M., Berger, M. S., and Knight, R. T. (2006). High gamma power is phase-locked to theta oscillations in human neocortex. Science, 313:1626–1628.
[Cash and Yuste, 1999] Cash, S. and Yuste, R. (1999). Linear summation of excitatory inputs by CA1 pyramidal neurons. Neuron, 22:383–394.
[Cardoso, 1998] Cardoso, J.-F. (1998). Multidimensional independent component analysis. In Proceedings of International Conference on Acoustics, Speech, and Signal Processing (ICASSP '98), volume 4, pages 1941–1944, Seattle, WA, USA.
[Cardoso and Laheld, 1996] Cardoso, J.-F. and Laheld, B. (1996). Equivariant adaptive source separation. IEEE Transactions on Signal Processing, 44(12):3017–3030.
[Chrobak et al., 2000] Chrobak, J. J., Lőrincz, A., and Buzsáki, G. (2000). Physiological patterns in the hippocampo-entorhinal cortex system. Hippocampus, 10:457–465.
[Cichocki et al., 1994] Cichocki, A., Unbehauen, R., and Rummert, E. (1994). Robust learning algorithm for blind separation of signals. Electronics Letters, 30:1386–1387.
[Comon, 1994] Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36(3):287–314.
[Csicsvari et al., 2007] Csicsvari, J., O'Neill, J., Allen, K., and Senior, T. (2007). Place-selective firing contributes to the reverse-order reactivation of CA1 pyramidal cells during sharp waves in open-field exploration. European Journal of Neuroscience, 26:704–716.
[Dabaghian et al., 2007a] Dabaghian, Y., Cohn, A. G., and Frank, L. (2007a). Topological coding in hippocampus. http://uk.arxiv.org/abs/q-bio/0702052v1.
[Dabaghian et al., 2007b] Dabaghian, Y., Cohn, A. G., and Frank, L. (2007b). Topological maps from signals. In Proceedings of the 15th ACM International Symposium ACM GIS, pages 392–395.
[Denham and Borisyuk, 2000] Denham, M. J. and Borisyuk, R. M. (2000). A model of theta rhythm production in the septal-hippocampal system and its modulation by ascending brain stem pathways. Hippocampus, 10:698–716.
[Diba and Buzsáki, 2007] Diba, K. and Buzsáki, G. (2007). Forward and reverse hippocampal place-cell sequences during ripples. Nature Neuroscience, 10:1241–1242.
[Dragoi and Buzsáki, 2006] Dragoi, G. and Buzsáki, G. (2006). Temporal encoding of place sequences by hippocampal cell assemblies.
Neuron, 50:145–157.
[Eacott and Gaffan, 2005] Eacott, M. J. and Gaffan, E. A. (2005). The roles of perirhinal cortex, postrhinal cortex, and the fornix in memory for objects, contexts, and events in the rat. The Quarterly Journal of Experimental Psychology B, 58(3-4):202–217.
[Egorov et al., 2002] Egorov, A. V., Hamam, B. N., Fransen, E., Hasselmo, M. E., and Alonso, A. A. (2002). Graded persistent activity in entorhinal cortex neurons. Nature, 420:173–178.
[Ekstrom et al., 2003] Ekstrom, A. D., Kahana, M. J., Caplan, J. B., Fields, T. A., Isham, E. A., Newman, E. L., and Fried, I. (2003). Cellular networks underlying human spatial navigation. Nature, 425(11):184–187.
[Escabi et al., 2005] Escabi, M. A., Nassiri, R., Miller, L. M., Schreiner, C. E., and Read, H. L. (2005). The contribution of spike threshold to acoustic feature selectivity, spike information content, and information throughput. Journal of Neuroscience, 41:9524–9534.
[Finn et al., 2007] Finn, I. M., Priebe, N. J., and Ferster, D. (2007). The emergence of contrast-invariant orientation tuning in simple cells of cat visual cortex. Neuron, 54:137–152.
[Foldiak, 1990] Foldiak, P. (1990). Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics, 64:165–170.
[Foster and Wilson, 2006] Foster, D. J. and Wilson, M. A. (2006). Reverse replay of behavioural sequences in hippocampal place cells during the awake state. Nature, 440:680–683.
[Frank et al., 2006] Frank, L. M., Brown, E. N., and Stanley, G. B. (2006). Hippocampal and cortical place cell plasticity: Implications for episodic memory. Hippocampus, 16:775–784.
[Franzius et al., 2007] Franzius, M., Vollgraf, R., and Wiskott, L. (2007). From grids to places. Journal of Computational Neuroscience, 22:297–299.
[Friston, 2005] Friston, K. (2005). A theory of cortical responses.
Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 360(1456):815–836.
[Fuhs and Touretzky, 2006] Fuhs, M. and Touretzky, D. (2006). A spin glass model of path integration in rat medial entorhinal cortex. Journal of Neuroscience, 26:4266–4276.
[Fusi et al., 2007] Fusi, S., Asaad, W. F., Miller, E. K., and Wang, X.-J. (2007). A neural circuit model of flexible sensorimotor mapping: Learning and forgetting on multiple timescales. Neuron, 54:319–333.
[Fyhn et al., 2007] Fyhn, M., Hafting, T., Treves, A., Moser, M.-B., and Moser, E. I. (2007). Hippocampal remapping and grid realignment in entorhinal cortex. Nature, 446:190–194.
[Fyhn et al., 2004] Fyhn, M., Molden, S., Witter, M. P., Moser, E. I., and Moser, M.-B. (2004). Spatial representation in the entorhinal cortex. Science, 305:1258–1264.
[Gaffan, 1998] Gaffan, D. (1998). Idiothetic input into object-place configuration as the contribution to memory of the monkey and human hippocampus: A review. Experimental Brain Research, 123:201–209.
[Giannakopoulos et al., 1998] Giannakopoulos, X., Karhunen, J., and Oja, E. (1998). Experimental comparison of neural ICA algorithms. In Proc. Int. Conf. on Artificial Neural Networks (ICANN'98), pages 651–656, Skovde, Sweden.
[Gigg, 2006] Gigg, J. (2006). Constraints on hippocampal processing imposed by the connectivity between CA1, subiculum and subicular targets. Behavioural Brain Research, 174:265–271.
[Goldman-Rakic, 1995] Goldman-Rakic, P. S. (1995). Cellular basis of working memory. Neuron, 14:477–485.
[Grossberg, 1980] Grossberg, S. (1980). How does a brain build a cognitive code? Psychological Review, 87(1):1–51.
[Hafting et al., 2005a] Hafting, T., Fyhn, M., Molden, S., Moser, M.-B., and Moser, E. I. (2005a). Microstructure of a spatial map in the entorhinal cortex.
Nature, 436:801–806.
[Hafting et al., 2005b] Hafting, T., Fyhn, M., Molden, S., Moser, M.-B., and Moser, E. I. (2005b). Topographic organization of a spatial map in the entorhinal cortex. Neuroscience 2005, page 198.3. SfN.
[Hasselmo et al., 2002] Hasselmo, M. E., Bodelon, C., and Wyble, B. (2002). A proposed function for hippocampal theta rhythm: Separate phases of encoding and retrieval enhance reversal of prior learning. Neural Computation, 14(4):793–817.
[Henze et al., 2002] Henze, D. A., Wittner, L., and Buzsáki, G. (2002). Single granule cells reliably discharge targets in the hippocampal CA3 network in vivo. Nature Neuroscience, 5:790–795.
[Hinton and Ghahramani, 1997] Hinton, G. E. and Ghahramani, Z. (1997). Generative models for discovering sparse distributed representations. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, 352:1177–1190.
[Hsu et al., 2004] Hsu, A., Borst, A., and Theunissen, F. E. (2004). Quantifying variability in neural responses and its application for the validation of model predictions. Network: Computation in Neural Systems, 15:91–109.
[Hyvärinen and Oja, 1998] Hyvärinen, A. and Oja, E. (1998). Independent component analysis by general non-linear Hebbian-like learning rules. Signal Processing, 64:301–313.
[Jefferys et al., 1996] Jefferys, J., Traub, R., and Whittington, M. (1996). Neuronal networks for induced '40 Hz' rhythms. Trends in Neurosciences, 19:202–208.
[Jensen et al., 1996] Jensen, O., Idiart, M., and Lisman, J. (1996). Physiologically realistic formation of autoassociative memory in networks with theta/gamma oscillations: role of fast NMDA channels. Learning and Memory, 3:243–256.
[Jutten and Herault, 1991] Jutten, C. and Herault, J. (1991). Blind separation of sources. Part I: An adaptive algorithm based on neuromimetic architecture. Signal Processing, 24:1–10.
[Kawato et al., 1993] Kawato, M., Hayakawa, H., and Inui, T.
(1993). A forward-inverse model of reciprocal connections between visual neocortical areas. Network, 4:415–422.
[Kloosterman et al., 2004] Kloosterman, F., van Haeften, T., and da Silva, F. H. L. (2004). Two reentrant pathways in the hippocampal-entorhinal system. Hippocampus, 14:1026–1039.
[Körding and Wolpert, 2004] Körding, K. P. and Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427:244–247.
[Körmendy-Rácz et al., 1999] Körmendy-Rácz, J., Szabó, S., Lőrincz, J., Antal, G., Kovács, G., and Lőrincz, A. (1999). Winner-take-all network utilizing pseudoinverse reconstruction subnets demonstrates robustness on the handprinted character recognition problem. Neural Computing and Applications, 8:163–176.
[Kveraga et al., 2007] Kveraga, K., Ghuman, A. S., and Bar, M. (2007). Top-down predictions in the cognitive brain. Brain and Cognition, 65:145–168.
[Laheld and Cardoso, 1994] Laheld, B. and Cardoso, J. (1994). Adaptive source separation with uniform performance. In et al., M. J. J. H., editor, Signal Processing VII: Theories and applications, volume 2, pages 183–186, Edinburgh, UK. EURASIP, EUSIPCO94.
[Leutgeb et al., 2004] Leutgeb, S., Leutgeb, J. K., Treves, A., Moser, M.-B., and Moser, E. I. (2004). Distinct ensemble codes in hippocampal areas CA3 and CA1. Science, 305:1295–1298.
[Levy, 1996] Levy, W. B. (1996). A sequence predicting CA3 is a flexible associator that learns and uses context to solve hippocampal-like tasks. Hippocampus, 6:579–590.
[Linsker, 1999] Linsker, R. (1999). Unsupervised learning, chapter Local synaptic learning rules suffice to maximize mutual information in a linear network, pages 19–30. Computational Neuroscience. MIT Press, CA.
[Louie and Wilson, 2001] Louie, K. and Wilson, M. A. (2001). Temporally structured replay of awake hippocampal ensemble activity during rapid eye movement sleep. Neuron, 29:145–156.
[Lőrincz, 1998] Lőrincz, A. (1998).
Forming independent components via temporal locking of reconstruction architectures: A functional model of the hippocampus. Biological Cybernetics, 79:263–275.
[Lőrincz and Buzsáki, 2000] Lőrincz, A. and Buzsáki, G. (2000). The parahippocampal region: Implications for neurological and psychiatric diseases, chapter Two-phase computational model of the entorhinal-hippocampal region, pages 83–111. Number 911 in Annals of the New York Academy of Sciences.
[Lőrincz and Szabó, 2007] Lőrincz, A. and Szabó, Z. (2007). Neurally plausible, non-combinatorial iterative independent process analysis. Neurocomputing, 70:1569–1573.
[Lőrincz et al., 2002] Lőrincz, A., Szatmáry, B., and Szirtes, G. (2002). Mystery of structure and function of sensory processing areas of the neocortex: A resolution. Journal of Computational Neuroscience, 13:187–205.
[Lőrincz et al., 2001] Lőrincz, A., Szirtes, G., Takács, B., and Buzsáki, G. (2001). Independent component analysis of temporal sequences forms place cells. Neurocomputing, 38:769–774.
[Markus et al., 1995] Markus, E. J., Qin, Y.-L., Leonard, B., Skaggs, W. E., McNaughton, B. L., and Barnes, C. A. (1995). Interactions between location and task affect the spatial and directional firing of hippocampal neurons. Journal of Neuroscience, 15:7079–7094.
[Masuda and Aihara, 2007] Masuda, N. and Aihara, K. (2007). Dual coding hypotheses for neural information representation. Mathematical Biosciences, 207:312–321.
[McNaughton et al., 2006] McNaughton, B. L., Battaglia, F. P., Jensen, O., Moser, E. I., and Moser, M. (2006). Path integration and the neural basis of the 'cognitive map'. Nature Reviews Neuroscience, 7:663–678.
[Mohedano-Moriano et al., 2007] Mohedano-Moriano, A., Pro-Sistiaga, P., Arroyo-Jimenez, M. M., Artacho-Pérula, E., Insausti, A. M., Marcos, P., Cebada-Sánchez, S., Martinez-Ruiz, J., Muñoz, M., Blaizot, X., Martinez-Marcos, A., Amaral, D. G., and Insausti, R. (2007).
Topographical and laminar distribution of cortical input to the monkey entorhinal cortex. Journal of Anatomy, 211:250–260.
[Naber et al., 2001] Naber, P. A., Lopes da Silva, F. H., and Witter, M. P. (2001). Reciprocal connections between the entorhinal cortex and hippocampal fields CA1 and the subiculum are in register with the projections from CA1 to the subiculum. Hippocampus, 11:99–104.
[Nadel et al., 2007] Nadel, L., Winocur, G., Ryan, L., and Moscovitch, M. (2007). Systems consolidation and hippocampus: two views. Debates in Neuroscience, 1(2-4):55–66.
[O'Keefe and Burgess, 2005] O'Keefe, J. and Burgess, N. (2005). Dual phase and rate coding in hippocampal place cells: Theoretical significance and relationship to entorhinal grid cells. Hippocampus, 15:853–866.
[O'Keefe and Nadel, 1978] O'Keefe, J. and Nadel, L. (1978). The Hippocampus as a Cognitive Map. Clarendon, Oxford.
[O'Neill et al., 2008] O'Neill, J., Senior, T. J., Allen, K., Huxter, J. R., and Csicsvari, J. (2008). Reactivation of experience-dependent cell assembly patterns in the hippocampus. Nature Neuroscience, 11:209–215.
[Póczos and Lőrincz, 2005] Póczos, B. and Lőrincz, A. (2005). Independent subspace analysis using geodesic spanning trees. In Raedt, L. D. and Wrobel, S., editors, Machine Learning, Proceedings of the Twenty-Second International Conference, ICML 2005, volume 22, pages 673–680.
[Póczos et al., 2007] Póczos, B., Szabó, Z., Kiszlinger, M., and Lőrincz, A. (2007). Independent process analysis without a priori dimensional information. Lecture Notes in Computer Science, 4666:252–259.
[Póczos and Lőrincz, 2006] Póczos, B. and Lőrincz, A. (2006). Non-combinatorial estimation of independent autoregressive sources. Neurocomputing, 69:2416–2419.
[Ranck, Jr., 1984] Ranck, Jr., J. B. (1984). Head-direction cells in the deep cell layers of the dorsal presubiculum in freely moving rats.
In Soc Neurosci Abstr, volume 10, page 599.
[Rao and Ballard, 1997] Rao, R. P. N. and Ballard, D. H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9:721–763.
[Rao and Ballard, 1999] Rao, R. P. N. and Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neurosci., 2:79–87.
[Recce and Harris, 1996] Recce, M. and Harris, K. D. (1996). Memory for places: A navigational model in support of Marr's theory of hippocampal function. Hippocampus, 6:735–748.
[Redish, 1999] Redish, A. D. (1999). Beyond the cognitive map: From place cells to episodic memory. MIT Press, Cambridge, MA.
[Redish et al., 2001] Redish, A. D., Battaglia, F. P., Chawla, M. K., Ekstrom, A. D., Gerrard, J. L., Lipa, P., Rosenzweig, E. S., Worley, P. F., Guzowski, J. F., McNaughton, B. L., and Barnes, C. A. (2001). Independence of firing correlates of anatomically proximate hippocampal pyramidal cells. Journal of Neuroscience, 21:1–6.
[Reyes, 2003] Reyes, A. (2003). Synchrony-dependent propagation of firing rate in iteratively constructed networks in vitro. Nature Neuroscience, 6:593–599.
[Robbins and Monro, 1951] Robbins, H. and Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22:400–407.
[Rolls et al., 2006] Rolls, E., Stringer, S., and Elliot, T. (2006). Entorhinal cortex grid cells can map to hippocampal place cells by competitive learning. Network: Computation in Neural Systems, 447:447–465.
[Sargolini et al., 2006] Sargolini, F., Fyhn, M., Hafting, T., McNaughton, B. L., Witter, M. P., Moser, M.-B., and Moser, E. I. (2006). Conjunctive representation of position, direction, and velocity in entorhinal cortex. Science, 312:758–762.
[Shao and Dudek, 2005] Shao, L.-R. and Dudek, F. E. (2005).
Electrophysiological evidence using focal flash photolysis of caged glutamate that CA1 pyramidal cells receive excitatory synaptic input from the subiculum. Journal of Neurophysiology, 93:3007–3011.
[Sharp, 1996] Sharp, P. (1996). Multiple spatial/behavioral correlates for cells in the rat postsubiculum: Multiple regression analysis and comparison to other hippocampal areas. Cerebral Cortex, 6:238–259.
[Sharp, 1991] Sharp, P. E. (1991). Computer simulation of hippocampal place cells. Psychobiology, 19:103–115.
[Sokolov, 1963] Sokolov, E. N. (1963). Perception and the conditioned reflex. Pergamon Press, London, UK.
[Solstad et al., 2006] Solstad, T., Moser, E., and Einevoll, G. T. (2006). From grid cells to place cells: A mathematical model. Hippocampus, 16:1026–1031.
[Sutton and Barto, 1998] Sutton, R. S. and Barto, A. G. (1998). Reinforcement Learning: An Introduction. MIT Press, Cambridge.
[Sybirska et al., 2000] Sybirska, E., Davachi, L., and Goldman-Rakic, P. S. (2000). Prominence of direct entorhinal-CA1 pathway activation in sensorimotor and cognitive tasks revealed by 2-DG functional mapping in nonhuman primate. Journal of Neuroscience, 20:5827–5834.
[Szabó et al., 2008] Szabó, Z., Póczos, B., and Lőrincz, A. (2008). Auto-regressive independent process analysis without combinatorial efforts. Pattern Analysis and Applications. accepted.
[Szabó et al., 2007] Szabó, Z., Póczos, B., and Lőrincz, A. (2007). Undercomplete blind subspace deconvolution. Journal of Machine Learning Research, 8:1063–1095.
[Szirtes et al., 2005] Szirtes, G., Póczos, B., and Lőrincz, A. (2005). Neural Kalman filter. Neurocomputing, 65-66:349–355.
[Szita and Lőrincz, 2004] Szita, I. and Lőrincz, A. (2004). Kalman filter control embedded into the reinforcement learning framework. Neural Computation, 16:491–499.
[Tahvildari et al., 2007] Tahvildari, B., Fransen, E., Alonso, A. A., and Hasselmo, M. E. (2007).
Switching between "on" and "off" states of persistent activity in lateral entorhinal layer III neurons. Hippocampus, 17:257–263.
[Takács and Lőrincz, 2007] Takács, B. and Lőrincz, A. (2007). Simple conditions for forming triangular grids. Neurocomputing, 70:1741–1747.
[Taube et al., 1990] Taube, J. S., Muller, R. U., and Ranck Jr., J. (1990). Head-direction cells recorded from the postsubiculum in freely moving rats. I. Description and quantitative analysis. Journal of Neuroscience, 10:420–435.
[Ullman, 1995] Ullman, S. (1995). Sequence seeking and counter streams: A computational model for bidirectional information flow in the visual cortex. Cerebral Cortex, pages 1–11.
[van Haeften et al., 2003] van Haeften, T., te Bulte, L. B., Goede, P. H., Wouterlood, F. G., and Witter, M. P. (2003). Morphological and numerical analysis of synaptic interactions between neurons in deep and superficial layers of the entorhinal cortex of the rat. Hippocampus, 13:943–952.
[Vinogradova, 1975] Vinogradova, O. S. (1975). Registration of information and the limbic system. In Horn, G. and Hinde, R., editors, Short-term changes in the neural activity and behavior, pages 95–148. Univ. Press, Cambridge, UK.
[Wallenstein et al., 1998] Wallenstein, G. V., Eichenbaum, H., and Hasselmo, M. E. (1998). The hippocampus as an associator of discontiguous events. Trends in Neurosciences, 21:317–323.
[Witter, 2006] Witter, M. P. (2006). Connections of the subiculum of the rat: Topography in relation to columnar and laminar organization. Behavioural Brain Research, 174:251–264.
[Witter and Amaral, 2004] Witter, M. P. and Amaral, D. G. (2004). The Rat Nervous System, chapter Hippocampal Formation, pages 635–704. Academic Press, San Diego, CA, 3rd edition.
[Witter and Moser, 2006] Witter, M. P. and Moser, E. I. (2006). Spatial representation and the architecture of the entorhinal cortex.
Trends in Neurosciences, 29:671–678.
[Yu and Dayan, 2003] Yu, A. J. and Dayan, P. (2003). Expected and unexpected uncertainty: ACh and NE in the neocortex. In Becker, S. and Obermayer, K., editors, Advances in Neural Information Processing Systems, volume 15, pages 157–164. Cambridge, MA: MIT Press.
[Yun et al., 2002] Yun, S. H., Mook-Jung, I., and Jung, M. W. (2002). Variation in effective stimulus patterns for induction of long-term potentiation across different layers of rat entorhinal cortex. Journal of Neuroscience, 22:RC214–RC218.
Department of Information Systems and