Before and beyond the Wilson-Cowan equations
Carson C. Chow and Yahya Karimipanah
Laboratory of Biological Modeling, NIDDK, NIH
August 1, 2019
Abstract
The Wilson-Cowan equations represent a landmark in the history of computational neuroscience. Among the insights Wilson and Cowan offered for neuroscience, they crystallized an approach to modeling neural dynamics and brain function. Although their iconic equations are used in various guises today, the ideas that led to their formulation and the relationship to other approaches are not well known. Here, we give a little context to some of the biological and theoretical concepts that led to the Wilson-Cowan equations and discuss how to extend beyond them.
Wilson and Cowan published the first of two classic papers in 1972 [1], wherein they obtained a set of coupled mean field equations providing a coarse-grained description of the dynamics of a network of excitatory and inhibitory neuron populations. The second paper in 1973 included spatial dependence [2]. Their work provided a demarcation in the practice of theoretical neuroscience. Prior to Wilson and Cowan, the nascent field of computational neuroscience was still grappling with how best to model neural dynamics. Thereafter, it began to coalesce onto a common set of neural network equations, and studying their properties became the focus.

By the mid twentieth century, it had become an established idea that at least some information processing in the brain is performed via global (population) activity, as opposed to the single neuron level [3]. A number of studies looked into the dynamics of neuronal populations in order to find a theoretical framework for studying the collective behavior of neurons [4, 5, 6]. Given the complex nature of the brain and our lack of access to details of the microscopic processes, building a dynamical theory naturally called for the use of statistical approaches, including coarse-graining and mean field theory.

Coarse-graining considers the system at a lower resolution, thereby reducing the degrees of freedom of the system. However, some information will be lost. The strategy is to ensure that the lost information is not crucial to understanding the phenomenon of interest. The classic example is the dynamics of gas molecules in a closed room. If we are only interested in whether the room will be comfortable, then the coarse-grained measures of temperature, pressure, and relative densities of the constituent molecules (e.g. water, nitrogen, oxygen, carbon dioxide, etc.) are sufficient. The actual microscopic dynamics of the astronomically many molecules is irrelevant. However, in systems that are highly coupled, such as the brain, it is not clear what information is important. Mean field theory is a coarse-graining approach that captures the relevant mean dynamics while discounting (hopefully irrelevant) fluctuations around the mean. Mean field theory was originally developed for magnetism [7, 8], which involves the alignment of atomic spins under the competing effects of coupling between the spin magnetic moments of each atom and disorder quantified by the temperature. Reminiscent of neurons in the brain, the dynamics are highly complex since a perturbation of one spin may lead to unending reverberating loops where each spin influences other spins, which in turn affect themselves. Mean field theory cuts through this complexity by imposing self-consistency between the role of a spin as the influencer of other spins and the one being influenced by other spins. More concretely, the mean field equations are derived by computing the mean state of one spin given fixed yet unknown values of all the coupled spins and then equating the values of the other spins to that of the first spin. If the number of coupled spins is large enough and the spins are uncorrelated enough then the variance of the spin state will go to zero (by the central limit theorem) and the mean will be the only relevant quantity. The neglect of correlations and higher order statistics is the hallmark of mean field theory.
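The self-consistency idea can be made concrete with a small numerical sketch. The snippet below (our own illustration; the inverse temperature beta, coupling J, and coordination number z are arbitrary and not taken from any of the works discussed) solves the textbook mean field condition m = tanh(beta J z m) for the mean spin by fixed-point iteration. Below the critical coupling the only solution is m = 0; above it a nonzero mean appears.

import numpy as np

def mean_field_magnetization(beta, J, z, tol=1e-10, max_iter=100000):
    # Iterate the self-consistency condition m = tanh(beta * J * z * m)
    m = 0.5  # arbitrary nonzero initial guess
    for _ in range(max_iter):
        m_new = np.tanh(beta * J * z * m)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

print(mean_field_magnetization(beta=1.5, J=1.0, z=1))  # ordered phase: m near 0.86
print(mean_field_magnetization(beta=0.5, J=1.0, z=1))  # disordered phase: m near 0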
The original Wilson-Cowan equations were a coarse-grained mean field system for a continuous activity variable representing the proportion of a local population of neurons that is firing or active at any moment of time. However, this interpretation is not rigidly adhered to, and often the activity variable is deemed a physical quantity like the mean firing rate or an electrochemical potential of a single neuron or small group of neurons. The Wilson-Cowan and related equations became ubiquitous because they were able to model many elements of observed neural phenomena. Here we present a brief (and grossly incomplete) review of the history of some key ideas that led to their inception. We discuss when the assumptions of the equations are valid, when they break down, and how to go beyond them to take into account fluctuations and correlations.

The use of neural network equations is pervasive. They have been referred to as Wilson-Cowan equations [1, 9, 10], Amari equations [11], Cohen-Grossberg equations [12], rate equations [13, 14, 15], graded-response equations [16, 17], continuous short term memory (STM) equations [18, 19, 20], and neural field equations [21, 22]. Here, we will adopt the term activity equations. Most of these equations come in one of three forms. The actual activity equations in Wilson and Cowan's 1972 paper are

$$\tau \dot a_i = -a_i + (1 - r a_i)\, f_i\Bigl(\sum_j w_{ij} a_j + I_i\Bigr) \qquad (1)$$

where $a_i$ is the neural activity of a local population indexed by $i$, $f_i$ is a nonlinear response or gain function, $w_{ij}$ is a coupling weight matrix, $I_i$ is an external input, $\tau$ is a decay time constant, and $r$ is a refractory time constant. Wilson and Cowan explicitly separated the effects of excitatory and inhibitory neurons. Their 1973 paper introduced a continuum version:

$$\tau \dot a(x) = -a(x) + \bigl(1 - r a(x)\bigr)\, f\Bigl(\int_\Omega w(x - y)\, a(y)\, dy + I(x)\Bigr)$$

where $x$ is a spatial variable on a domain $\Omega$. The other two common forms are

$$\tau \dot s_i = -s_i + \sum_j W_{ij} f_j(s_j) + I_i \qquad (2)$$

$$\tau \dot s_i = -s_i + (B_i - s_i) \sum_j W_{ij} f_j(s_j) + I_i \qquad (3)$$

which Grossberg [20] calls the additive and shunting short term memory (STM) equations respectively. Grossberg and colleagues use (2) and (3) with excitatory and inhibitory effects separated. Amari and subsequent neural field papers use the continuum version of (2), often with excitation and inhibition lumped together so that neurons no longer obey Dale's law, which simplifies the analysis.

The term Wilson-Cowan equations often refers to either (1) with $r = 0$ or (2). For constant external input, these two equations are actually identical [20], as can be seen by setting $s_i = I_i + \sum_j w_{ij} a_j$ and taking the derivative:

$$\dot s_i = \sum_j w_{ij} \dot a_j = \sum_j w_{ij}\Bigl[-a_j + f_j\Bigl(\sum_k w_{jk} a_k + I_j\Bigr)\Bigr] \qquad (4)$$

$$= -s_i + \sum_j w_{ij} f_j(s_j) + I_i \qquad (5)$$

From this perspective, if $a$ is neural activity then $s$ is best viewed as a synaptic input or drive [23]. We will use the term Wilson-Cowan (WC) equations to refer exclusively to (1). For (2) we will use the term graded-response model. For a quasi-stationary state where $s_i \approx \sum_j w_{ij} f_j(s_j) + I_i$, $a_i$ can be interpreted as a neuronal mean firing rate

$$a_i = f_i(s_i) \qquad (6)$$

and $f_i(s)$ is a single-neuron gain function (FI curve) specifying rate as a function of input.
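As a concrete illustration of how the additive form (2) is integrated in practice, here is a minimal Euler-scheme sketch for a two-unit network with one excitatory and one inhibitory population. The weights, input, gain function, and time constants are arbitrary choices made for illustration and are not taken from the papers discussed.

import numpy as np

def f(x):
    # Sigmoidal gain function (illustrative choice)
    return 1.0 / (1.0 + np.exp(-x))

# Unit 0 is excitatory, unit 1 is inhibitory; W[i, j] couples unit j onto unit i
W = np.array([[1.5, -1.0],
              [1.0,  0.0]])
I_ext = np.array([0.5, 0.0])     # constant external drive
tau, dt, T = 10.0, 0.1, 200.0

s = np.zeros(2)
for _ in range(int(T / dt)):
    # Graded-response (additive STM) dynamics: tau * ds/dt = -s + W f(s) + I
    s += dt / tau * (-s + W @ f(s) + I_ext)

print("steady-state synaptic drives:", s)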
Equation (1) would then be considered a rate-based model equivalent to the graded-response model (2).

There are three essential ingredients that activity equations include: 1) a continuous time dependent activity variable, 2) a linear weighted sum over these variables in the input, and 3) a nonlinear gain or activation function linking input to output. Underlying these ingredients are coarse-graining and mean field theory. Here we explore some of the history behind these ideas. A list of milestones is in the box.

It was well known throughout the last century that neurons fire action potentials when given sufficient stimulus. Thus it was reasonable to model them as binary state machines that activate when their inputs exceed a threshold. In 1943, McCulloch and Pitts [24] showed that a network of such binary threshold neurons with both excitatory and inhibitory connections is capable of producing any logical operation and thus performing universal computations. This launched the study of neural networks in computer science, artificial intelligence, and neuroscience.
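A minimal sketch of the McCulloch-Pitts picture: a unit emits 1 when the weighted sum of its binary inputs reaches a threshold, and such units can be composed into arbitrary Boolean functions. The weights and thresholds below are illustrative choices, not values from the original paper.

def mp_neuron(inputs, weights, threshold):
    # Binary threshold unit: fire (1) if the weighted input sum reaches threshold
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

AND = lambda a, b: mp_neuron([a, b], [1, 1], 2)
OR  = lambda a, b: mp_neuron([a, b], [1, 1], 1)
NOT = lambda a:    mp_neuron([a],    [-1],   0)

# Composing units yields any logical operation, e.g. XOR:
XOR = lambda a, b: AND(OR(a, b), NOT(AND(a, b)))
print([XOR(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 0]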
Shortly thereafter a dissenting opinion emerged that sprouted a branch of research that would eventually lead to the Wilson-Cowan equations and dynamical systems modeling of the brain. Shimbel and Rapoport [25] argued that the McCulloch and Pitts "neural net" approach fell short of building plausible theories for neural systems. They argued that countless numbers of equivalent networks could produce the same desired output and that neural nets were not robust to the failure of a few neurons. Additionally, it was unlikely that genes predetermined the details of the microscopic structure of biological systems; rather, they imposed statistical traces on the macroscopic properties. They proposed that the goal should be to seek the statistical properties that govern the evolution and function of a biological system, rather than proposing specific network connectivity to perform a particular task.

They derived a dynamic equation governing the firing probability of neurons located in a given region of a brain network in terms of a recursive map. Like McCulloch and Pitts they assumed binary neurons and supposed that a particular synapse receives input from $n$ axon terminals, which they called "bulbs", partitioned into $p$ "bulb groups". The probability that a neuron will fire depends on an integer number of bulbs exceeding a threshold $h$. Neuron firing dynamics are decomposed into a set of probabilities governing the connectivity and synaptic properties. The fraction of neurons at location $Q$ firing at time $t = 1$ was presumed to be

$$f(Q) = \sum_{n,p,h} P(Q; n, p, h)\, P_{nph}[I(Q)], \qquad I(Q) = \int_\Omega O(Q, X)\, f(X)\, dX \qquad (7)$$

where $f(X)$ is the firing probability of neurons at $X$ on a domain $\Omega$ at time $t = 0$, $O(Q, X)$ is the probability that a bulb group at $Q$ originates from $X$, $P(Q; n, p, h)$ is the probability that a neuron at $Q$ has a firing threshold $h$ and synapses of type $(n, p)$, and $P_{nph}[I(Q)]$ denotes the conditional probability that an active neuron at $Q$ is of $(n, p, h)$ type. A crucial idea for these equations is that the probability of firing at a given time only depends on the probability of firing at the previous time and not on any higher order statistic. This is an implicit mean field theory assumption. Although $P_{nph}[I(Q)]$ is derivable in principle, it would be a daunting task because the probability of activation depends on the microscopic details of synaptic configurations.

Beurle [4] introduced coarse-graining, which resolved the synaptic structure problem. He was inspired by experimental findings on the distribution of dendritic fibers as a function of distance from the cell body [26, 27]. Anatomical studies had started to focus on connectivity patterns of neuronal aggregates (as opposed to the connections of individual neurons), providing probability distributions instead of specific circuits [28]. Beurle's coarse-graining scheme represented neural activity as a continuous field in space and time and was the first neural field theory (see [22] for a review). Beurle assumed a network of purely excitatory cells with uniform thresholds and refractory period. A neuron becomes active if it receives input from a threshold number of active neurons. If $q$ is the threshold then the rate at which the cells become active at a time $t + \tau$ is

$$F(t + \tau) = R\, P_{q-1}\, F\, k \qquad (8)$$

where $F(x, t)$ is the proportion of cells active in any region per unit time, $k$ is a scale constant, $\tau$ is the cell time constant, $R$ is the proportion of cells which are sensitive to input, and

$$P_{q-1} = \frac{e^{-\bar N}\, \bar N^{\,q-1}}{(q-1)!} \qquad (9)$$

is the probability that cells have already received $q - 1$ units of excitation and are ready to spike with the arrival of one further impulse.
The mean number of activated neurons given some input, $\bar N$, obeyed

$$\bar N(x, t) = \int_{-\infty}^{t} \int_{-\infty}^{\infty} w(x - x')\, F(x', t')\, \alpha(t - t')\, dx'\, dt' \qquad (10)$$

where $\alpha(t)$ is the temporal response function and $w(x) = b e^{-|x|/a}$ is a spatially dependent connection function inspired by experimental studies [26, 27]. Again, this was a mean field theory formulation, as activity depends only on past activity and not on higher moments (cumulants) of the activity.

Beurle analyzed spontaneous random activity and waves. He was interested in waves of activity because of then-recent electroencephalographic studies [29, 30, 31, 32]. He showed that his model with minimal structure was capable of supporting plane, spherical and circular waves as well as vortices. In addition, he was driven by the idea that switching of neural waves may be related to shifting of attention (or other perceptions). However, his model also showed that the stationary fixed point is unstable; a slight deviation from the critical point either leads the activity to cease or to saturate (due to refractoriness). Therefore, he erroneously concluded that there is no stable continuous random activity. Despite the achievements of the theory, the lack of inhibition as well as some technical flaws drove others to look for alternative formulations. Nevertheless, Beurle established most of the concepts that eventually led to the Wilson-Cowan equations.

The next milestone was Griffith [5, 6]. After establishing the role of inhibition in stabilizing brain activity [33], Griffith took a different approach and derived a reaction-diffusion equation for neural activity. He desired a classical field theory with a field representing the synaptic excitation or input to the cells and a source that quantifies the activity of the cells (e.g. spiking frequency). He built a phenomenological model by enforcing a set of constraints on the equations.

Griffith adopted a linear inhomogeneous differential equation for $\psi$ of the form

$$H\psi = \kappa F \qquad (11)$$

where $H$ is a linear operator and $\kappa$ is a constant. He defined $F$ as

$$F(x, t) \equiv f\Bigl[\int_{-\infty}^{+\infty} \psi(x, t - \tau)\, \chi(\tau)\, d\tau\Bigr] \qquad (12)$$

where $f$ is a nonlinear activation function and the temporal response function $\chi(\tau)$ obeys causality. Assuming that $\psi$ is differentiable and quasi-stationary ($\partial f/\partial \psi$ slowly varying), he approximated $F$ by Taylor expanding to second order:

$$F \approx f(\psi) + A_1 \frac{\partial \psi}{\partial t} + A_2 \frac{\partial^2 \psi}{\partial t^2}. \qquad (13)$$

He then assumed that the field operator $H$ obeyed translational and rotational invariance, which implied

$$H = a + b \frac{\partial}{\partial t} + c \frac{\partial^2}{\partial t^2} + \nabla^2. \qquad (14)$$

Substituting a field solution ansatz with constant velocity $v$,

$$\psi(\mathbf{r}, t) = \int F(\mathbf{R}, t - p/v)\, w(|\mathbf{r} - \mathbf{R}|)\, d\mathbf{R}, \qquad (15)$$

where $w$ is a connectivity function and $p = |\mathbf{r} - \mathbf{R}|$, into equation (14), he obtained

$$\nabla^2 \psi = \tfrac{1}{4}\beta^2 \psi + \Bigl(\frac{\beta}{v} - \pi A_1\Bigr)\frac{\partial \psi}{\partial t} + \Bigl(\frac{1}{v^2} - \pi A_2\Bigr)\frac{\partial^2 \psi}{\partial t^2} - \pi f(\psi) \qquad (16)$$

for $w(p) = A p^{-1} e^{-\beta p}$, which Griffith argued was compatible with Sholl's experimental findings [26]. Rescaling and assuming a shallow activation function where $\partial f/\partial \psi$ varies slowly, in the limit of $v \to \infty$ equation (16) can be reduced to

$$\nabla^2 \psi = \mu \psi + \dot\psi - f(\psi) \qquad (17)$$

where $\mu > 0$ and $f$ is also redefined as a general nonlinear function.
From (17) it is apparent that for spatial homogeneity one finds a mean field equation

$$\dot\psi = -\mu \psi + f(\psi) \qquad (18)$$

which is the graded-response model (2) for uniform connectivity, which as shown above is equivalent to the Wilson-Cowan equation.

Griffith showed that Beurle's formalism is equivalent to his if the activation function $f(\psi)$ is adjusted appropriately. He originally assumed a linear response function, namely $f(\psi) = f(\psi_c) + b(\psi - \psi_c)$, but argued that a sigmoidal $f$ could have two stable critical points and an unstable fixed point in between that corresponded to Beurle's unstable steady state. Griffith was the first to show the possibility of stable spontaneous activity, depending on the gain function. However, he did not account for excitatory and inhibitory synaptic connections explicitly.

By the mid 1960's most of the concepts that would be incorporated into the Wilson-Cowan equations had been proposed, although no one had put them together into a cohesive whole.
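The fixed point structure Griffith described is easy to see numerically. The sketch below (an illustration with an arbitrarily chosen sigmoid and decay constant, not parameters from Griffith) locates the fixed points of the homogeneous equation (18) and classifies their stability from the sign of the slope of the right-hand side; a sufficiently steep sigmoid yields two stable states separated by an unstable one.

import numpy as np

mu = 1.0
f = lambda psi: 5.0 / (1.0 + np.exp(-4.0 * (psi - 2.5)))   # illustrative sigmoid

psi = np.linspace(0.0, 6.0, 1000)
rhs = -mu * psi + f(psi)                     # right-hand side of eq. (18)
crossings = np.where(np.diff(np.sign(rhs)) != 0)[0]
for c in crossings:
    slope = (rhs[c + 1] - rhs[c]) / (psi[c + 1] - psi[c])
    kind = "stable" if slope < 0 else "unstable"
    print(f"fixed point near psi = {psi[c]:.2f} ({kind})")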
Historical milestones in mean field activity equations:

• McCulloch-Pitts: Neural networks as computational units; foundation of neural networks
• Shimbel and Rapoport: Statistical approach to theory of central nervous system
• Beurle: First neural field theory of purely excitatory neurons
• Griffith: Role of inhibition in stabilizing neural activity
• Griffith: Reaction-diffusion neural field theory
• Grossberg: Continuous models of associative memory and pattern recognition
• Wilson and Cowan: Mean field theory of coarse-grained neural activity with excitation and inhibition
• Wilson and Cowan: Wilson-Cowan equations with spatial dependence
• Feldman and Cowan: Wilson-Cowan equations can be derived from microscopic activity under quasi-stationarity
• Amari: Neural field theory with local excitation and lateral inhibition
• Cohen and Grossberg: Lyapunov function for general class of continuous activity equations
• Gerstner: Spike response formalism
Figure 1: Milestones in the development of activity equations.

Wilson and Cowan developed a coarse-grained description of neuronal activity where the distinction between excitatory and inhibitory cells was taken into account explicitly. They were motivated by physiological evidence from Hubel and Wiesel [34, 35], Mountcastle [36], Szentagothai and Lissak [37], and Colonnier [38], which suggested the existence of certain populations of neurons with similar responses to external stimuli.

In line with Sholl [3] and Beurle [4], Wilson and Cowan argued that a microscopic (single-neuron) description of neural activity is probably not well suited for understanding higher level functions that entail more complex processes such as sensory processing, memory and learning. Using dynamical systems analysis, they showed that their equations exhibit multi-stability and hysteresis, which could serve as a substrate for memory [39, 40], and limit cycles, where the frequency of the oscillation is a monotonic function of stimulus intensity. The combination of mathematical tractability and dynamical richness is the reason for the lasting legacy of their equations.

Wilson and Cowan derived their activity equations from first principles using a probabilistic framework on an aggregate of heterogeneous threshold neurons coupled with excitatory and inhibitory synapses with random connectivity, by which spatial interactions could be neglected. They extended the model to the case of spatial inhomogeneity the following year [2].

They assumed that neurons have a sensitive phase, where input exceeding a threshold would cause them to fire, and a refractory period, in which they would not fire regardless of input. Defining $A_i$ as the proportion of cells of type $i \in \{E, I\}$ (excitatory or inhibitory) active at time $t$, the fraction that are refractory will be $\int_{t-r}^{t} A_i(t')\, dt'$, with refractory period $r$, and the fraction that are sensitive is then $1 - \int_{t-r}^{t} A_i(t')\, dt'$.

The activation function $f(z)$ gives the expected proportion of cells that would respond to a given level of input $z$ if nonrefractory. If this and the fraction of sensitive neurons are independent, then the updated fraction of active cells would be

$$\Bigl[1 - \int_{t-r}^{t} A_i(t')\, dt'\Bigr] f_i(z_i)\, \delta t, \qquad (19)$$

where

$$z_i \equiv \int_{-\infty}^{t} \alpha(t - t')\bigl[W_{iE} A_E(t') - W_{iI} A_I(t') + Q_i(t')\bigr]\, dt',$$

$Q_i$ is an external input, $\alpha(t - t')$ is a response function governing the time evolution of the spike arrivals, and $W_{jk}$ is the average number of synapses from type $k$ to $j$. Wilson and Cowan noted that the input to a cell and the sensitive fraction could be correlated, which would violate mean field theory since both involve $A_i$. They argued that this correlation could be negligible for highly interconnected networks due to the presence of spatial and temporal fluctuations in the average excitation within the population, and also due to the variability of thresholds supported by experiments [41, 42, 43].

To estimate the shape of the activation function, they assumed that even if all the neurons receive the same input, any variability in other parameters such as the firing threshold or the number of afferent connections could lead to a probability distribution for the number of spiking neurons. Assuming that the variability mainly stems from the threshold, the expected proportion of neurons that receive at least a threshold excitation (per unit time) would be

$$f(x) = \int_0^{x(t)} p(\theta)\, d\theta, \qquad (20)$$

where $p(\theta)$ is the threshold probability density. If $p(\theta)$ is unimodal then $f(x)$ would be a sigmoid function. (A similar argument holds for a uniform threshold and a distribution of neuronal afferent synapses, for which $f(x) = \int_{\theta/x(t)}^{\infty} c(w)\, dw$, where $c(w)$ is the probability density for synaptic connections; the lower bound on the integral arises because all neurons with at least $\theta/x(t)$ connections would cross the threshold.)
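A quick numerical check of this argument: integrating a unimodal threshold density from 0 up to the input level produces a sigmoidal gain curve. The Gaussian density and its parameters below are arbitrary illustrative choices.

import math

mu, sigma = 1.0, 0.3      # mean and width of the threshold density (illustrative)
Phi = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

# f(x) = integral from 0 to x of a Gaussian threshold density p(theta)
for x in [0.0, 0.5, 0.8, 1.0, 1.2, 1.5, 2.0]:
    f_x = Phi((x - mu) / sigma) - Phi((0.0 - mu) / sigma)
    print(f"x = {x:3.1f}   f(x) = {f_x:.3f}")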
Later, Feldman and Cowan [44] expanded on that interpretation, showing that the activation function in the Wilson-Cowan equations can be regarded as the average of single-neuron activation functions.

Putting this all together gives

$$A_E(t + \tau) = \Bigl[1 - \int_{t-\tau}^{t} A_E(t')\, dt'\Bigr] f_E\Bigl(\int_{-\infty}^{t} \alpha(t - t')\bigl[W_{EE} A_E(t') - W_{EI} A_I(t') + Q_E(t')\bigr]\, dt'\Bigr) \qquad (21)$$

$$A_I(t + \tau) = \Bigl[1 - \int_{t-\tau}^{t} A_I(t')\, dt'\Bigr] f_I\Bigl(\int_{-\infty}^{t} \alpha(t - t')\bigl[W_{IE} A_E(t') - W_{II} A_I(t') + Q_I(t')\bigr]\, dt'\Bigr) \qquad (22)$$

Defining the time coarse-grained activity

$$a_i(t) = \frac{1}{s} \int_{t-s}^{t} A_i(t')\, dt' \qquad (23)$$

they argued that if $\alpha(t) \approx 1$ for $0 \le t \le r$ and decays rapidly for $t > r$, then

$$\int_{t-r}^{t} A(t')\, dt' \to r a(t) \qquad (24)$$

$$\int_{-\infty}^{t} \alpha(t - t')\, A(t')\, dt' \to k a(t) \qquad (25)$$

where $k$ and $r$ are constants. Inserting into (21) and (22), Taylor expanding, and rescaling gives

$$\tau_E \frac{da_E}{dt} = -a_E + (1 - r a_E)\, f_E\bigl[W_{EE} a_E - W_{EI} a_I + Q_E(t)\bigr] \qquad (26)$$

$$\tau_I \frac{da_I}{dt} = -a_I + (1 - r a_I)\, f_I\bigl[W_{IE} a_E - W_{II} a_I + Q_I(t)\bigr] \qquad (27)$$

After deriving the original mean field equations, Wilson and Cowan considered spatial inhomogeneity. Inspired by Beurle [4], they extended their original equations to

$$\tau \frac{\partial A_E(\mathbf{x}, t)}{\partial t} = -A_E(\mathbf{x}, t) + \bigl(1 - r A_E(\mathbf{x}, t)\bigr)\, f_E\Bigl[\int_\Omega d\mathbf{x}'\, \rho_E(\mathbf{x}')\, W_{EE}(\mathbf{x} - \mathbf{x}')\, A_E(\mathbf{x}', t) - \int_\Omega d\mathbf{x}'\, \rho_I(\mathbf{x}')\, W_{EI}(\mathbf{x} - \mathbf{x}')\, A_I(\mathbf{x}', t) + I_E(\mathbf{x}, t)\Bigr] \qquad (28)$$

$$\tau \frac{\partial A_I(\mathbf{x}, t)}{\partial t} = -A_I(\mathbf{x}, t) + \bigl(1 - r A_I(\mathbf{x}, t)\bigr)\, f_I\Bigl[\int_\Omega d\mathbf{x}'\, \rho_E(\mathbf{x}')\, W_{IE}(\mathbf{x} - \mathbf{x}')\, A_E(\mathbf{x}', t) - \int_\Omega d\mathbf{x}'\, \rho_I(\mathbf{x}')\, W_{II}(\mathbf{x} - \mathbf{x}')\, A_I(\mathbf{x}', t) + I_I(\mathbf{x}, t)\Bigr] \qquad (29)$$

where the activity variables are

$$A_E(\mathbf{x}, t) = \int_{-\infty}^{t} dt'\, \alpha(t - t')\, n_E(\mathbf{x}, t'), \qquad A_I(\mathbf{x}, t) = \int_{-\infty}^{t} dt'\, \alpha(t - t')\, n_I(\mathbf{x}, t') \qquad (30)$$

where $\alpha(t - t') = \alpha_0 e^{-(t - t')/\tau}$, $W_{ij}(\mathbf{x} - \mathbf{x}')$ is the spatially dependent connection weight, $\rho_i(\mathbf{x})$ defines the density of neurons in a small volume around $\mathbf{x}$, and $n_E$ and $n_I$ represent the proportions of excitatory and inhibitory neurons activated per unit time. Although Wilson and Cowan developed the model to incorporate spatial dependence of connections, the formulation could go beyond spatial dependency, as $x$ could in general be any abstract quantity.
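For concreteness, a minimal Euler integration of the excitatory-inhibitory equations (26) and (27) is sketched below. The gain function and all parameter values are illustrative choices in the spirit of the 1972 paper's examples; they are not claimed to reproduce any particular published result. Depending on the drive Q_E, the trajectory can settle to a fixed point or oscillate.

import numpy as np

def gain(x, a, theta):
    # Logistic gain shifted so that gain(0) = 0
    return 1.0 / (1.0 + np.exp(-a * (x - theta))) - 1.0 / (1.0 + np.exp(a * theta))

# Illustrative parameters
tauE = tauI = 1.0
r = 1.0
wEE, wEI, wIE, wII = 16.0, 12.0, 15.0, 3.0
QE, QI = 1.25, 0.0
aE_slope, thE, aI_slope, thI = 1.3, 4.0, 2.0, 3.7

aE, aI = 0.1, 0.05
traj = []
dt, T = 0.01, 100.0
for _ in range(int(T / dt)):
    fE = gain(wEE * aE - wEI * aI + QE, aE_slope, thE)
    fI = gain(wIE * aE - wII * aI + QI, aI_slope, thI)
    aE += dt / tauE * (-aE + (1.0 - r * aE) * fE)
    aI += dt / tauI * (-aI + (1.0 - r * aI) * fI)
    traj.append(aE)

traj = np.array(traj)
print("a_E min/max over the run:", traj.min(), traj.max())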
The artificial intelligence branch of neural networks following McCulloch and Pitts focused initially on the computational capabilities of the binary neuron, e.g. [45, 46, 47]. However, Hartline and Ratliff [48] considered the continuous firing rate of single cells in a neural network to successfully model the Limulus retina, which could be directly measured. This was followed by Grossberg, whose initial goal was to understand how the behavior of an individual can adapt stably in real time to complex and changing environmental conditions [20]. The core of his approach led to a class of continuous neural networks defined by the nonlinear coupling between activity and synaptic (adaptive) weights. He proved that the computational units of these networks are not individual activities or connections, but the patterns of these variables (Grossberg, 1968b, 1969a, 1969b, 1970a) [49, 50, 18, 20].

Grossberg first considered model (2), which he termed the additive STM equation; the equation exhibits bistability, as shown by Griffith. He then added dynamics for the synaptic weights to model long term memory. Inspired by the structure of the Hodgkin-Huxley model, Grossberg next derived an equation for neural networks that more closely modeled the shunting dynamics of individual neurons, resulting in (3) [20]. The shunting STM model is approximated by the additive STM model when the activities $s_i$ in (3) are far from saturation. These networks are capable of performing a rich repertoire of behaviors including content addressable memory and oscillations [19]. Cohen and Grossberg (1983) proved that under general conditions, including symmetric weights, general activity equations possess a Lyapunov function, indicating that all orbits will flow to local minima and can serve as a basis for models of associative memory.

In 1982, Hopfield, using the analogy to the statistical mechanics of spin glasses, showed that a discrete time binary neuron model with symmetric connections has a non-increasing energy function (Lyapunov function) and thus stable attractors that can act as memory states [51]. The foundations of this work had been set in the early 1960's, when several studies demonstrated that (artificial) adaptive networks could perform recognition tasks [52, 53]. These advances were followed by a number of studies on associative memory and pattern recognition, including those by Grossberg [49, 50, 18] and Amari [54, 55, 11]. Little and Shaw [56, 57] pointed out the analogy between neural networks and spin glasses. Hopfield's contributions [17] attracted a lot of interest from the physics community [58, 59], but it is rarely acknowledged that his discovery was a special case of the general Cohen-Grossberg theorem and that there was much work in this area that preceded it.
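The energy argument is easy to demonstrate in a toy simulation. The sketch below (our own illustrative construction, not code from any cited work) stores a few random binary patterns with a Hebbian rule and runs asynchronous threshold updates; for symmetric weights with zero diagonal the energy is non-increasing and the state typically settles into a stored pattern.

import numpy as np

rng = np.random.default_rng(0)

N, P = 100, 3
patterns = rng.choice([-1, 1], size=(P, N))     # patterns to store
W = (patterns.T @ patterns) / N                 # Hebbian weights (symmetric)
np.fill_diagonal(W, 0.0)                        # no self-coupling

def energy(s):
    # Hopfield energy function E = -1/2 s^T W s
    return -0.5 * s @ W @ s

# Start from a corrupted copy of pattern 0 and update asynchronously
s = patterns[0].copy()
s[rng.choice(N, size=20, replace=False)] *= -1
for sweep in range(5):
    for i in rng.permutation(N):
        s[i] = 1 if W[i] @ s >= 0 else -1       # binary threshold update
    print(f"sweep {sweep}: energy = {energy(s):.3f}")

print("overlap with stored pattern:", (s @ patterns[0]) / N)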
His approachechoes the ideas underlying the Wilson-Cowan equations, with a focus on arealistic model of single-neuron dynamics.In the spike-response formalism, the membrane potential of a neuron ismodeled by the combination of the time dependent refractory response ofthe neuron to its own activity and the summed responses to the incomingspikes [62, 60]. As a result the synaptic potential of a single neuron can be14escribed by the integro-differential equation h i ( t ) = h ext i ( t ) + F (cid:88) f =1 η refr (cid:16) t − t fi (cid:17) + (cid:88) j ∈ Γ i J ij (cid:90) ∞ κ ( s, s (cid:48) ) S ( F ) j ( t − s (cid:48) ) ds (cid:48) (31)where h i is the membrane potential, η refr is the refractory function, t fi denotesthe spike times for neuron i , and κ ( s, s (cid:48) ) denotes the response kernel in which s is the time that has passed since the last postsynaptic spike representedby S ( F ) j [62]. Different models of single-neuron dynamics can be reduced toa spike-response model with appropriate kernel functions [60, 63, 61].In order to find the connection to rate-based models, Gerstner appliedmean field theory. He considered a uniform population of neurons for whichthe activity A p ( t ) of a single pool p is defined as the proportion of spikingneurons per unit time within the pool. Neurons in a pool are equivalent;connection weights and the response kernel κ ( s, s (cid:48) ) only depend the poolidentity. However, the addition of noise, which turned out to be essentialin his formulation, does cause variability among spike trains within a singlepool. To formulate his pool dynamics, Gerstner assumed that the spiketrains could be approximated as renewal processes if the synapses are weak | W pq | (cid:28)
This led to a closed set of mean field equations describing the synaptic input $h(p, t)$, the pool activity $A_p(t)$, and the firing noise probability imposed by the dynamics. His formalism is suited to model neural network dynamics at arbitrarily short time scales.

In line with previous studies [44, 60], Gerstner showed that differential equation activity models, while excellent for modeling asynchronous firing, break down for fast transients and coherent oscillations. He derived a correction to the quasi-stationarity assumption, estimating the error of rate-based models. He showed that for the rate-based description to be valid one should have $\langle s \rangle \dot A(t)/A(t) \ll 1$, where $\langle s \rangle$ denotes the inter-spike interval (inverse of the mean firing rate). In addition, he also re-derived the Wilson-Cowan equations by adjusting the appropriate kernels and provided a more rigorous derivation wherein the noise-induced neuron firing probability plays the same role as the sigmoidal activation function in the original Wilson-Cowan derivation.
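As an illustration of the spike-response picture in equation (31), the sketch below builds a membrane potential trace as a sum of a refractory kernel triggered by the neuron's own spikes and synaptic kernels triggered by presynaptic spikes. The kernel shapes, spike times, and efficacies are all arbitrary toy choices, not Gerstner's.

import numpy as np

def eta_refr(t):
    # After-spike refractory response (illustrative exponential reset)
    return np.where(t > 0, -5.0 * np.exp(-t / 5.0), 0.0)

def kappa(t):
    # Postsynaptic potential kernel (illustrative alpha function)
    return np.where(t > 0, (t / 2.0) * np.exp(1.0 - t / 2.0), 0.0)

t = np.arange(0.0, 60.0, 0.1)
own_spikes = [10.0, 30.0]                              # the neuron's own spike times
pre_spikes = {0: [5.0, 12.0], 1: [20.0, 22.0, 40.0]}   # presynaptic spike times by index
J = {0: 0.8, 1: -0.4}                                  # synaptic efficacies (one inhibitory)

h = np.zeros_like(t)                                   # external input taken to be zero
for tf in own_spikes:
    h += eta_refr(t - tf)
for j, spikes in pre_spikes.items():
    for tf in spikes:
        h += J[j] * kappa(t - tf)

print("membrane potential at t = 25:", h[np.searchsorted(t, 25.0)])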
Mean field theory for known microscopic dynamics

The Wilson-Cowan equations and variants have successfully modeled and predicted a variety of neural phenomena. However, the question remains as to how quantitatively accurate they are in modeling a neural system with known microscopic neuron and synaptic dynamics, the spike response formalism notwithstanding. Here, we explicitly derive a mean field theory for the neural activity of a deterministic network of coupled spiking neurons [64, 65].

Consider a network of $N$ conductance-based neurons:

$$\tau_V \frac{dV_i}{dt} = h_V(V_i, m_i, s_i)$$
$$\tau_m \frac{dm_i}{dt} = h_m(V_i, m_i, s_i)$$
$$\tau_s \frac{ds_i}{dt} = -s_i + \sum_{j=1}^{N} w_{ij} A(V_j)$$

where $V_i$ is the membrane potential of neuron $i$, $m_i$ represents a single or a set of auxiliary variables, $s_i$ represents the synaptic input or drive, the $h$'s are continuous functions specifying the dynamics of the respective variables, the $\tau$'s are time constants, $w_{ij}$ is a matrix of synaptic weights between neurons, and $A$ is a function that activates whenever $V_j$ exceeds a threshold, indicating an action potential. We do not explicitly distinguish between excitatory and inhibitory neurons, but this is reflected in the parameter values, which can vary from neuron to neuron, and the synaptic weight matrix (which can, e.g., obey Dale's law).

If the individual neurons have limit cycle dynamics, as expected for spiking neurons, and the coupling between individual neurons is not excessively strong (i.e. a single spike does not strongly perturb the spiking dynamics, although many can and will), then the neuron dynamics can be reduced to a phase variable around or near the limit cycle [66]. The system takes the simpler form of

$$\dot\theta_i = F_i(\theta_i, s_i) \qquad (32)$$

$$\tau_s \dot s_i = -s_i + \frac{1}{N} \sum_j w_{ij}\, \delta(t - t_j^s) \qquad (33)$$

where $\theta_i$ is the phase of neuron $i$, $F_i$ is the phase velocity, and $t_j^s$ are the spiking times of neuron $j$, which we set to occur at $\theta_j = \pi$.

Our goal is to generate a coarse-grained mean field description of the network specified by (32) and (33). We quantify the neuron activity in terms of a density of the neuron phase:

$$\eta_i(\theta, t) = \delta\bigl(\theta - \theta_i(t)\bigr) \qquad (34)$$

where $\delta(\cdot)$ is the Dirac delta function. This allows us to write

$$\delta(t - t_j^s) = \dot\theta_j\, \eta_j(\pi, t) = F_j(\pi, s_j)\, \eta_j(\pi, t) \qquad (35)$$

which is also the firing rate of the neuron, i.e. the neuron velocity or flux at $\pi$. Inserting (35) into (33) gives

$$\tau_s \dot s_i = -s_i + \frac{1}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) \qquad (36)$$

We obtain the dynamics of $\eta_i$ by imposing local neuron conservation. Although $\eta_i$ is not differentiable, we can still formally write

$$\partial_t \eta_i(\theta, t) + \partial_\theta \bigl[F_i(\theta, s_i)\, \eta_i(\theta, t)\bigr] = 0 \qquad (37)$$

which is called the Klimontovich equation in the kinetic theory of plasmas [67, 68]. Equations (37) and (36) fully specify the spiking neural network dynamics but are no simpler to solve than the original equations. They need a regularization scheme to make them useful. One approach is to average over a selected population. This was the strategy employed in sections 3, 4, and 6 and in previous mean field treatments of networks [69, 70], where a population average is taken. In the limit of $N \to \infty$, the phases are assumed to be sufficiently asynchronous so that $\eta$ is differentiable and (37) becomes a well behaved partial differential equation. With the addition of Gaussian white noise, it takes the form of a Fokker-Planck equation. An alternative is to average over an ensemble of networks, each prepared with a different initial condition drawn from a specified distribution [71, 72, 73, 64, 74, 75, 76].
Taking the ensemble average $\langle \cdot \rangle$ over (36) and (37) gives

$$\partial_t \langle \eta_i(\theta, t) \rangle + \partial_\theta \bigl\langle F_i(\theta, s_i)\, \eta_i(\theta, t) \bigr\rangle = 0$$

$$\tau_s \langle \dot s_i \rangle = -\langle s_i \rangle + \frac{1}{N} \sum_j w_{ij}\, \bigl\langle F_j(\pi, s_j)\, \eta_j(\pi, t) \bigr\rangle$$
We see that the dynamics of the means of $\eta_i$ and $s_i$ depend on the higher order moment $\langle F_i(\theta, s_i)\, \eta_i \rangle$. We can construct an equation for this moment by differentiating $F_i(\theta, s_i)\, \eta_i$, inserting (36) and (37), and then taking the average again, but this will include even higher order moments. Repeating will result in a moment hierarchy that is more complicated than the original equations.

However, if all higher order moments factorize into products of the means (i.e. the cumulants are zero) then we obtain the mean field theory of the spiking network (32) and (33):

$$\partial_t \langle \eta_i(\theta, t) \rangle + \partial_\theta\, F_i\bigl(\theta, \langle s_i \rangle\bigr)\, \langle \eta_i(\theta, t) \rangle = 0 \qquad (38)$$

$$\tau_s \langle \dot s_i \rangle = -\langle s_i \rangle + \frac{1}{N} \sum_j w_{ij}\, F_j\bigl(\pi, \langle s_j \rangle\bigr)\, \langle \eta_j(\pi, t) \rangle \qquad (39)$$

which does not match any of the mean field activity equations in Sec 2. However, since $F_j(\pi, \langle s_j \rangle)\, \langle \eta_j(\pi, t) \rangle$ is the ensemble mean firing rate of neuron $j$, we see that (39) has the form of (2) with time dependent firing rate dynamics given by (38). If $\langle \eta_i \rangle$ were to go to steady state then (38) with $\partial_t \langle \eta_i(\theta, t) \rangle$ set to zero can be solved to yield

$$\langle \eta_j(\theta) \rangle = \frac{C_j}{F_j\bigl(\theta, \langle s_j \rangle\bigr)}$$

where $C_j$ is determined by the normalization condition $\int \langle \eta_j(\theta) \rangle\, d\theta = 1$. We can then obtain (2) with gain function $f_j(\langle s_j \rangle) = F_j(\pi, \langle s_j \rangle)\, \langle \eta_j(\pi) \rangle = C_j$. However, (38) is dissipation free and thus will not relax to steady state from all initial conditions without a noise source or finite size effects [72, 73, 64].

The formalism can be applied to any type of neuron and synaptic dynamics. The neuron model (32) and (33) was chosen for simplicity and also so that the mean field equations would be similar to those of Sec 2. The inclusion of higher order time derivatives or nonlinear terms could result in very different mean field equations. A similar argument to arrive at mean field theory could be applied with the population average, although this will be further complicated by the neuron dependent weight $w_{ij}$, which we will address below. No matter what average is used, we still do not know if or when mean field theory is valid. To answer this question we need to compute the corrections to mean field theory and see when they are small.
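To make the setup of equations (32) and (33) concrete, the following toy simulation (our own construction; the phase model, coupling, and all numbers are illustrative assumptions) integrates a globally coupled population of theta neurons, a standard phase oscillator model that spikes when the phase crosses pi, and reports the population firing rate that a mean field description such as (38) and (39) would aim to capture.

import numpy as np

rng = np.random.default_rng(1)

# theta_dot = (1 - cos theta) + (1 + cos theta) * (I0 + s)   (phase velocity F)
# tau_s * s_dot = -s + (w / N) * sum over spikes (a spike when theta crosses pi)
N, w, I0, tau_s = 1000, 1.0, 0.2, 1.0
dt, T = 0.001, 20.0

theta = rng.uniform(-np.pi, np.pi, N)    # ensemble of initial phases
s = 0.0
rates = []
for _ in range(int(T / dt)):
    F = (1.0 - np.cos(theta)) + (1.0 + np.cos(theta)) * (I0 + s)
    theta_new = theta + dt * F
    spikes = theta_new >= np.pi                        # crossing pi counts as a spike
    theta = np.where(spikes, theta_new - 2.0 * np.pi, theta_new)
    s += dt / tau_s * (-s + (w / dt) * spikes.mean())  # delta-function input per spike
    rates.append(spikes.mean() / dt)

print("population firing rate, last quarter of run:",
      np.mean(rates[-len(rates) // 4:]))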
Beyond mean field theory

Consider again the Klimontovich formulation of the spiking network equations (37) and (36),

$$\mathcal{L}_\eta[\eta_i] \equiv \partial_t \eta_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \eta_i(\theta, t) - \delta(t - t_0)\, \eta_i^0(\theta) = 0 \qquad (40)$$

$$\mathcal{L}_s[s_i] \equiv \dot s_i(t) + \beta s_i(t) - \frac{\beta}{N} \sum_{j=1}^{N} w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0 = 0 \qquad (41)$$

where we have expressed the initial conditions as forcing terms and $\beta = 1/\tau_s$.

We quantify the ensemble average by defining an empirical probability density functional for $\eta_i(\theta, t)$ and $s_i(t)$ as functions constrained to (40) and (41), to wit

$$P[\eta, s \mid \eta^0, s^0] = \prod_i \delta\Bigl[\partial_t \eta_i + \partial_\theta F_i(\theta, s_i)\, \eta_i - \delta(t - t_0)\, \eta_i^0(\theta)\Bigr] \times \delta\Bigl[\dot s_i + \beta s_i - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0\Bigr] \qquad (42)$$

$P[\eta, s \mid \eta^0, s^0]$ is a Dirac delta functional in function space. This formal expression can be rendered useful with the functional Fourier transform expression of the Dirac delta functional [77]:

$$\delta[\mathcal{L}_\eta] = \int \mathcal{D}\tilde\eta_i\; e^{-\int dt\, d\theta\, \tilde\eta_i(\theta, t)\, \mathcal{L}_\eta[\eta_i(\theta, t)]}$$

$$\delta[\mathcal{L}_s] = \int \mathcal{D}\tilde s_i\; e^{-\int dt\, \tilde s_i\, \mathcal{L}_s[s_i(t)]}$$

where $\tilde\eta_i(\theta, t)$ and $\tilde s_i(t)$ are transform variables, called response fields, and the functional integral is over all possible paths or histories of these functions. Applying the functional Fourier transform gives the functional or path integral expression

$$P[\eta, s \mid \eta^0, s^0] = \int \mathcal{D}\tilde\eta\, \mathcal{D}\tilde s\; e^{-\sum_i \int dt\, d\theta\, \tilde\eta_i \mathcal{L}_\eta[\eta_i] - \sum_i \int dt\, \tilde s_i \mathcal{L}_s[s_i]} \qquad (43)$$

Assuming fixed $s_i^0$ and a distribution $P[\eta^0]$ for the initial density, we can integrate or marginalize over the initial condition densities to obtain

$$P[\eta, s] = \int \mathcal{D}\eta^0\; P[\eta, s \mid \eta^0, s^0]\, P[\eta^0] \qquad (44)$$

If we set $\eta_i^0(\theta) = \delta\bigl(\theta - \theta_i(t = 0)\bigr)$, the distribution over initial densities is given by the distribution over the initial phase $\rho_i(\theta)$. Thus we can write $\int \mathcal{D}\eta^0\, P[\eta^0] = \int \prod_i d\theta\, \rho_i(\theta)$. The initial condition contribution is given by the integral

$$e^{W[\tilde\eta]} = \int \prod_i d\theta_i\, \rho_i(\theta_i)\, e^{\sum_i \tilde\eta_i(\theta_i, t_0)} = \prod_i \int d\theta\, \rho_i(\theta)\, e^{\tilde\eta_i(\theta, t_0)} = e^{\sum_i \ln\bigl(1 + \int d\theta\, \rho_i(\theta)(e^{\tilde\eta_i(\theta, t_0)} - 1)\bigr)}$$

Hence, the system given by (40) and (41) can be mapped to the distribution functional

$$P[\eta, s] = \int \mathcal{D}\tilde\eta\, \mathcal{D}\tilde s\; e^{-S} \qquad (45)$$

with action $S = S_\eta + S_s$ given by

$$S_\eta = \sum_i \int_{t_0}^{t_1} dt \int_{-\pi}^{\pi} d\theta\; \tilde\eta_i(\theta, t)\bigl[\partial_t \eta_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \eta_i(\theta, t)\bigr] - \sum_i \ln\Bigl(1 + \int d\theta\, \rho_i(\theta)\bigl(e^{\tilde\eta_i(\theta, t_0)} - 1\bigr)\Bigr)$$

$$S_s = \sum_i \int_{t_0}^{t_1} dt\; \tilde s_i(t)\Bigl(\dot s_i + \beta s_i - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0\Bigr)$$

The exponential in the initial data contribution to the action (which is a generating function for a Poisson distribution) can be simplified via the Doi-Peliti-Janssen transformation [71, 72, 73, 64, 74, 75, 76]:

$$\psi_i = \eta_i \exp(-\tilde\eta_i), \qquad \tilde\psi_i = \exp(\tilde\eta_i) - 1$$

which gives

$$S_\psi = \sum_i \int d\theta\, dt\; \tilde\psi_i(\theta, t)\bigl[\partial_t \psi_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \psi_i(\theta, t)\bigr] - \sum_i \ln\Bigl(1 + \int d\theta\, \rho_i(\theta)\, \tilde\psi_i(\theta, t_0)\Bigr) \qquad (46)$$

$$S_s = \sum_i \int dt\; \tilde s_i(t)\Bigl(\dot s_i(t) + \beta s_i - \delta(t - t_0)\, s_i^0 - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\bigl(\tilde\psi_j(\pi, t) + 1\bigr)\, \psi_j(\pi, t)\Bigr) \qquad (47)$$

where we have not included noncontributing terms that arise after integration by parts.

The probability density functional (44) with the action (46) and (47) describes the ensemble statistics for the entire spiking and synaptic drive history of the network.
By marginalizing over initial conditions, the functional density is no longer sharply peaked in function space. The expression cannot be evaluated explicitly but can be computed perturbatively by employing Laplace's method and expanding around the extremum of the integrand. This extremum is given by the point in function space with minimal variation, $\delta S = 0$ (i.e. the least action principle). The resulting equations of motion, if they exist, are the mean field equations.

The accuracy of mean field theory can be assessed by computing the variance around the mean field solution using perturbation theory. The inverse network size $1/N$ is a candidate expansion parameter. The action $S$ is comprised of the transformed phase density $\psi_i$, which is not differentiable, but the sum over neurons provides an opportunity to regularize it. Consider first a homogeneous network, $F_i = F$, and uniform global coupling, $w_{ij} = w$. Then (40) and (41) become

$$\partial_t \eta_i(\theta, t) + \partial_\theta F(\theta, s_i)\, \eta_i(\theta, t) - \delta(t - t_0)\, \eta_i^0(\theta) = 0 \qquad (48)$$

$$\dot s_i(t) + \beta s_i(t) - \frac{w \beta}{N}\, F(\pi, s) \sum_{j=1}^{N} \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0 = 0 \qquad (49)$$

The dynamics of all neurons are identical. We can drop the neuron index and rewrite the action as

$$S_\psi = N \int d\theta\, dt\; \tilde\psi(\theta, t)\bigl[\partial_t \psi(\theta, t) + \partial_\theta F(\theta, s)\, \psi(\theta, t)\bigr] - N \ln\Bigl(1 + \int d\theta\, \rho(\theta)\, \tilde\psi(\theta, t_0)\Bigr) \qquad (50)$$

$$S_s = N \int dt\; \tilde s(t)\Bigl(\dot s(t) + \beta s - \delta(t - t_0)\, s^0 - \beta w\, F(\pi, s)\bigl(\tilde\psi(\pi, t) + 1\bigr)\, \psi(\pi, t)\Bigr) \qquad (51)$$

Applying the least action principle, which amounts to setting the functional derivatives to zero [74, 65], returns the mean field theory equations

$$\partial_t \langle \rho(\theta, t) \rangle + \partial_\theta\bigl[F\bigl(\theta, \langle s \rangle\bigr)\, \langle \rho(\theta, t) \rangle\bigr] = 0$$

$$\langle \dot s \rangle = -\beta \langle s \rangle + \beta w\, F\bigl(\pi, \langle s \rangle\bigr)\, \langle \rho(\pi, t) \rangle$$

where

$$\rho(\theta, t) = \frac{1}{N} \sum_j \eta_j(\theta, t) \qquad (52)$$

The factor of $N$ in the action indicates that deviations away from mean field will be suppressed as $N \to \infty$. A perturbative expansion in $1/N$ can be constructed using Laplace's method to compute all higher order cumulants, which go to zero as $N \to \infty$ [64, 65].

The same argument can be applied to a heterogeneous network with spatially dependent coupling. If the coupling function and the neuron specific parameters are continuous functions of the neuron index, then mean field theory applies for the density and synaptic drive averaged over a local population of neurons at each location in the network in the limit of $N \to \infty$ [65].

We have provided a brief and biased view of the ideas behind the Wilson-Cowan equations and their descendants. We show that one can arrive at a similar set of activity equations, with first order time derivatives, a nonlinear activation (gain) function, and a linear sum over inputs, from various starting points. Despite the fact that the equations are not generally quantitatively accurate, they seem to capture many facets of neural activity and function. They are a set of effective equations that have stood the test of time. Mean field theory is implicitly assumed in all of these equations. Correlations are deliberately ignored to make the equations tractable. However, they are omnipresent in the brain and there is much ongoing effort to account for them. Nonetheless, we predict that forty seven years from now, the Wilson-Cowan equations will still have great utility.
Acknowledgments
This work was supported by the Intramural Research Program of the NIH, NIDDK.
References

[1] H. R. Wilson and J. D. Cowan, "Excitatory and Inhibitory Interactions in Localized Populations of Model Neurons," Biophysical Journal, vol. 12, pp. 1-24, Jan. 1972.
[2] H. R. Wilson and J. D. Cowan, "A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue," Kybernetik, vol. 13, pp. 55-80, Sept. 1973.
[3] H. Hosokawa, "Sholl, D. A.: The Organization of the Cerebral Cortex (London & New York, 1956)," vol. 2, no. 4, pp. 810-819, 1958.
[4] R. L. Beurle, "Properties of a mass of cells capable of regenerating pulses," Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, pp. 55-94, 1956.
[5] J. S. Griffith, "A field theory of neural nets: I. Derivation of field equations," The Bulletin of Mathematical Biophysics, vol. 25, pp. 111-120, Mar. 1963.
[6] J. S. Griffith, "A field theory of neural nets: II. Properties of the field equations," The Bulletin of Mathematical Biophysics, vol. 27, pp. 187-195, June 1965.
[7] M. Le Bellac, F. Mortessagne, and G. G. Batrouni, Equilibrium and Non-Equilibrium Statistical Thermodynamics. Cambridge University Press, 2004.
[8] L. Peliti, Statistical Mechanics in a Nutshell, vol. 10. Princeton University Press, 2011.
[9] J. D. Cowan, J. Neuman, and W. van Drongelen, "Wilson-Cowan Equations for Neocortical Dynamics," The Journal of Mathematical Neuroscience, vol. 6, p. 1, Dec. 2016.
[10] Z. P. Kilpatrick, Wilson-Cowan Model, pp. 1-5. New York, NY: Springer New York, 2013.
[11] R. Potthast, Amari Model, pp. 1-6. New York, NY: Springer New York, 2013.
[12] W. Lu and T. Chen, "New conditions on global stability of Cohen-Grossberg neural networks," Neural Computation, vol. 15, no. 5, pp. 1173-1189, 2003.
[13] H. Sompolinsky, A. Crisanti, and H. J. Sommers, "Chaos in random neural networks," Physical Review Letters, vol. 61, pp. 259-262, July 1988.
[14] O. Shriki, D. Hansel, and H. Sompolinsky, "Rate models for conductance-based cortical neuronal networks," Neural Computation, vol. 15, pp. 1809-1841, Aug. 2003.
[15] T. P. Vogels, K. Rajan, and L. F. Abbott, "Neural network dynamics," Annual Review of Neuroscience, vol. 28, pp. 357-376, 2005.
[16] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proceedings of the National Academy of Sciences, vol. 81, pp. 3088-3092, May 1984.
[17] J. Hopfield and D. Tank, "Computing with neural circuits: a model," Science, vol. 233, pp. 625-633, Aug. 1986.
[18] S. Grossberg, "Some networks that can learn, remember, and reproduce any number of complicated space-time patterns, I," Journal of Mathematics and Mechanics, vol. 19, no. 1, pp. 53-91, 1969.
[19] M. A. Cohen and S. Grossberg, "Absolute stability of global pattern formation and parallel memory storage by competitive neural networks," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-13, pp. 815-826, Sept. 1983.
[20] S. Grossberg, "Nonlinear neural networks: Principles, mechanisms, and architectures," Neural Networks, vol. 1, no. 1, pp. 17-61, 1988.
[21] P. C. Bressloff, "Spatiotemporal dynamics of continuum neural fields," Journal of Physics A: Mathematical and Theoretical, vol. 45, no. 3, p. 033001, 2011.
[22] S. Coombes, "Waves, bumps, and patterns in neural field theories," Biological Cybernetics, vol. 93, pp. 91-108, Aug. 2005.
[23] D. J. Pinto, J. C. Brumberg, D. J. Simons, G. B. Ermentrout, and R. Traub, "A quantitative population model of whisker barrels: Re-examining the Wilson-Cowan equations," Journal of Computational Neuroscience, vol. 3, pp. 247-264, Sept. 1996.
[24] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.
[25] A. Shimbel and A. Rapoport, "A statistical approach to the theory of the central nervous system," The Bulletin of Mathematical Biophysics, vol. 10, pp. 41-55, Mar. 1948.
[26] D. Sholl, "The organization of the visual cortex in the cat," Journal of Anatomy, vol. 89, no. Pt 1, p. 33, 1955.
[27] A. Uttley, "The probability of neural connexions," Proceedings of the Royal Society of London, Series B, Biological Sciences, vol. 144, no. 915, pp. 229-240, 1955.
[28] J. L. Feldman and J. D. Cowan, "Large-scale activity in neural nets I: Theory with application to motoneuron pool responses," Biological Cybernetics, vol. 17, pp. 29-38, Jan. 1975.
[29] H.-T. Chang, "Dendritic potential of cortical neurons produced by direct electrical stimulation of the cerebral cortex," Journal of Neurophysiology, vol. 14, no. 1, pp. 1-21, 1951.
[30] B. D. Burns, "Some properties of isolated cerebral cortex in the unanaesthetized cat," The Journal of Physiology, vol. 112, no. 1-2, pp. 156-175, 1951.
[31] W. G. Walter and H. Shipton, "A new toposcopic display system," Electroencephalography and Clinical Neurophysiology, vol. 3, no. 3, pp. 281-292, 1951.
[32] J. C. Lilly and R. B. Cherry, "Surface movements of click responses from acoustic cerebral cortex of cat: leading and trailing edges of a response figure," Journal of Neurophysiology, vol. 17, no. 6, pp. 521-532, 1954.
[33] J. Griffith, "On the stability of brain-like structures," Biophysical Journal, vol. 3, pp. 299-308, July 1963.
[34] T. N. Wiesel and D. H. Hubel, "Single-cell responses in striate cortex of kittens deprived of vision in one eye," Journal of Neurophysiology, vol. 26, no. 6, pp. 1003-1017, 1963.
[35] D. H. Hubel and T. N. Wiesel, "Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat," Journal of Neurophysiology, vol. 28, no. 2, pp. 229-289, 1965.
[36] V. B. Mountcastle, "Modality and topographic properties of single neurons of cat's somatic sensory cortex," Journal of Neurophysiology, vol. 20, no. 4, pp. 408-434, 1957.
[37] J. Szentagothai and K. Lissak, "Recent developments of neurobiology in Hungary," Akademiai Kiado, Budapest, 1967.
[38] M. Colonnier, "The structural design of the neocortex," in Brain and Conscious Experience, pp. 1-23, Springer, 1965.
[39] B. G. Cragg and H. N. V. Temperley, "Memory: The analogy with ferromagnetic hysteresis," Brain, vol. 78, no. 2, pp. 304-316, 1955.
[40] D. Fender and B. Julesz, "Extension of Panum's fusional area in binocularly stabilized vision," JOSA, vol. 57, no. 6, pp. 819-830, 1967.
[41] C. Geisler, L. Frishkopf, and W. Rosenblith, "Extracranial responses to acoustic clicks in man," Science, vol. 128, no. 3333, pp. 1210-1211, 1958.
[42] A. Verveen and H. Derksen, "Amplitude distribution of axon membrane noise voltage," Acta Physiologica et Pharmacologica Neerlandica, vol. 15, no. 3, p. 353, 1969.
[43] W. Rall and C. C. Hunt, "Analysis of reflex variability in terms of partially correlated excitability fluctuation in a population of motoneurons," The Journal of General Physiology, vol. 39, no. 3, p. 397, 1956.
[44] J. L. Feldman and J. D. Cowan, "Large-scale activity in neural nets II: A model for the brainstem respiratory oscillator," Biological Cybernetics, vol. 17, pp. 39-51, Jan. 1975.
[45] E. R. Caianiello, "Outline of a theory of thought-processes and thinking machines," Journal of Theoretical Biology, vol. 1, no. 2, pp. 204-235, 1961.
[46] J. Orbach, "Principles of neurodynamics. Perceptrons and the theory of brain mechanisms," Archives of General Psychiatry, vol. 7, no. 3, pp. 218-219, 1962.
[47] B. Widrow, "Generalization and information storage in networks of adaline 'neurons'," Self-Organizing Systems 1962, pp. 435-462, 1962.
[48] K. Hadeler and D. Kuhn, "Stationary states of the Hartline-Ratliff model," Biological Cybernetics, vol. 56, no. 5, pp. 411-417, 1987.
[49] S. Grossberg, "Some nonlinear networks capable of learning a spatial pattern of arbitrary complexity," Proceedings of the National Academy of Sciences of the United States of America, vol. 59, no. 2, p. 368, 1968.
[50] S. Grossberg, "On learning and energy-entropy dependence in recurrent and nonrecurrent signed networks," Journal of Statistical Physics, vol. 1, no. 2, pp. 319-350, 1969.
[51] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554-2558, 1982.
[52] F. Rosenblatt, "Principles of neurodynamics. Perceptrons and the theory of brain mechanisms," tech. rep., Cornell Aeronautical Lab Inc, Buffalo, NY, 1961.
[53] B. Widrow, "An adaptive 'adaline' neuron using chemical 'memistors', 1553-1552," 1960.
[54] S.-i. Amari, "Dynamics of pattern formation in lateral-inhibition type neural fields," Biological Cybernetics, vol. 27, no. 2, pp. 77-87, 1977.
[55] S.-I. Amari, "Field theory of self-organizing neural nets," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-13, pp. 741-748, Sept. 1983.
[56] W. A. Little, "The existence of persistent states in the brain," in From High-Temperature Superconductivity to Microminiature Refrigeration, pp. 145-164, Springer, 1974.
[57] W. A. Little and G. L. Shaw, "Analytic study of the memory storage capacity of a neural network," Mathematical Biosciences, vol. 39, no. 3-4, pp. 281-290, 1978.
[58] D. J. Amit, H. Gutfreund, and H. Sompolinsky, "Spin-glass models of neural networks," Physical Review A, vol. 32, pp. 1007-1018, Aug. 1985.
[59] H. Sompolinsky, "Statistical mechanics of neural networks," Physics Today, vol. 41, no. 12, pp. 70-80, 1988.
[60] W. Gerstner and J. L. van Hemmen, "Universality in neural networks: the importance of the mean firing rate," Biological Cybernetics, vol. 67, pp. 195-205, July 1992.
[61] W. Gerstner, "Time structure of the activity in neural network models," Physical Review E, vol. 51, pp. 738-758, Jan. 1995.
[62] W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press, 2014.
[63] W. Gerstner and J. L. van Hemmen, "Associative memory in a network of spiking neurons," Network: Computation in Neural Systems, vol. 3, pp. 139-164, Jan. 1992.
[64] M. A. Buice and C. C. Chow, "Dynamic finite size effects in spiking neural networks," PLoS Computational Biology, vol. 9, p. e1002872, Jan. 2013.
[65] S.-W. Qiu and C. C. Chow, "Finite-size effects for spiking neural networks with spatially dependent coupling," Physical Review E, vol. 98, no. 6, p. 062414, 2018.
[66] E. Brown, J. Moehlis, and P. Holmes, "On the phase reduction and response dynamics of neural oscillator populations," Neural Computation, vol. 16, pp. 673-715, Apr. 2004.
[67] S. Ichimaru, Basic Principles of Plasma Physics: A Statistical Approach. A Lecture Note and Reprint Series, W. A. Benjamin, 1973.
[68] D. R. Nicholson, Introduction to Plasma Theory, vol. 2. Cambridge University Press, 1984.
[69] S. H. Strogatz and R. E. Mirollo, "Stability of incoherence in a population of coupled oscillators," Journal of Statistical Physics, vol. 63, pp. 613-635, May 1991.
[70] N. Brunel, "Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons," Journal of Computational Neuroscience, vol. 8, no. 3, pp. 183-208, 2000.
[71] M. A. Buice and J. D. Cowan, "Field-theoretic approach to fluctuation effects in neural networks," Physical Review E, vol. 75, p. 051919, May 2007.
[72] E. J. Hildebrand, M. A. Buice, and C. C. Chow, "Kinetic theory of coupled oscillators," Physical Review Letters, vol. 98, p. 054101, Jan. 2007.
[73] M. A. Buice and C. C. Chow, "Correlations, fluctuations, and stability of a finite-size network of coupled oscillators," Physical Review E, vol. 76, p. 031118, Sept. 2007.
[74] M. A. Buice and C. C. Chow, "Generalized activity equations for spiking neural network dynamics," Frontiers in Computational Neuroscience, vol. 7, p. 162, 2013.
[75] M. A. Buice, J. D. Cowan, and C. C. Chow, "Systematic fluctuation expansion for neural network activity equations," Neural Computation, vol. 22, no. 2, pp. 377-426, 2010. PMID: 19852585.
[76] M. A. Buice and C. C. Chow, "Beyond mean field theory: statistical field theory for neural networks," Journal of Statistical Mechanics: Theory and Experiment, vol. 2013, no. 03, p. P03003, 2013.
[77] C. C. Chow and M. A. Buice, "Path integral methods for stochastic differential equations."