Before and beyond the Wilson-Cowan equations
Carson C. Chow and Yahya Karimipanah
Laboratory of Biological Modeling, NIDDK, NIH
August 1, 2019
Abstract
The Wilson-Cowan equations represent a landmark in the history of computational neuroscience. Among the insights Wilson and Cowan offered for neuroscience, they crystallized an approach to modeling neural dynamics and brain function. Although their iconic equations are used in various guises today, the ideas that led to their formulation and the relationship to other approaches are not well known. Here, we give a little context to some of the biological and theoretical concepts that led to the Wilson-Cowan equations and discuss how to extend beyond them.
Wilson and Cowan published the first of two classic papers in 1972 [1], wherein they obtained a set of coupled mean field equations providing a coarse-grained description of the dynamics of a network of excitatory and inhibitory neuron populations. The second paper in 1973 included spatial dependence [2]. Their work provided a demarcation in the practice of theoretical neuroscience. Prior to Wilson and Cowan, the nascent field of computational neuroscience was still grappling with how best to model neural dynamics. Thereafter, it began to coalesce onto a common set of neural network equations, and studying their properties became the focus.

By the mid twentieth century, it had become an established idea that at least some information processing in the brain is performed via global (population) activity, as opposed to the single neuron level [3]. A number of studies looked into the dynamics of neuronal populations in order to find a theoretical framework for studying the collective behavior of neurons [4, 5, 6]. Given the complex nature of the brain and our lack of access to details of the microscopic processes, building a dynamical theory naturally called for the use of statistical approaches, including coarse-graining and mean field theory.

Coarse-graining considers the system at a lower resolution, thereby reducing the degrees of freedom of the system. However, some information will be lost. The strategy is to ensure that the lost information is not crucial to understanding the phenomenon of interest. The classic example is the dynamics of gas molecules in a closed room. If we are only interested in whether the room will be comfortable, then the coarse-grained measures of temperature, pressure, and relative densities of the constituent molecules (e.g. water, nitrogen, oxygen, carbon dioxide, etc.) are sufficient. The actual microscopic dynamics of the astronomically many molecules is irrelevant. However, in systems that are highly coupled, such as the brain, it is not clear what information is important. Mean field theory is a coarse-graining approach that captures the relevant mean dynamics while discounting (hopefully irrelevant) fluctuations around the mean. Mean field theory was originally developed for magnetism [7, 8], which involves the alignment of atomic spins under the competing effects of coupling between the spin magnetic moments of each atom and disorder quantified by the temperature. Reminiscent of neurons in the brain, the dynamics are highly complex since a perturbation of one spin may lead to unending reverberating loops where each spin influences other spins, which in turn affect themselves. Mean field theory cuts through this complexity by imposing self-consistency between the role of a spin as the influencer of other spins and the one being influenced by other spins. More concretely, the mean field equations are derived by computing the mean state of one spin given fixed yet unknown values of all the coupled spins and then equating the values of the other spins to that of the first spin. If the number of coupled spins is large enough and the spins are uncorrelated enough then the variance of the spin state will go to zero (by the central limit theorem) and the mean will be the only relevant quantity. The neglect of correlations and higher order statistics is the hallmark of mean field theory.
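The self-consistency idea can be made concrete with a small numerical sketch. The snippet below (our own illustration; the inverse temperature beta, coupling J, and coordination number z are arbitrary and not taken from any of the works discussed) solves the textbook mean field condition m = tanh(beta J z m) for the mean spin by fixed-point iteration. Below the critical coupling the only solution is m = 0; above it a nonzero mean appears.

import numpy as np

def mean_field_magnetization(beta, J, z, tol=1e-10, max_iter=100000):
    # Iterate the self-consistency condition m = tanh(beta * J * z * m)
    m = 0.5  # arbitrary nonzero initial guess
    for _ in range(max_iter):
        m_new = np.tanh(beta * J * z * m)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m

print(mean_field_magnetization(beta=1.5, J=1.0, z=1))  # ordered phase: m near 0.86
print(mean_field_magnetization(beta=0.5, J=1.0, z=1))  # disordered phase: m near 0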
The original Wilson-Cowan equations were a coarse-grained mean field system for a continuous activity variable representing the proportion of a local population of neurons that is firing or active at any moment of time. However, this interpretation is not rigidly adhered to, and often the activity variable is deemed a physical quantity like the mean firing rate or an electrochemical potential of a single neuron or small group of neurons. The Wilson-Cowan and related equations became ubiquitous because they were able to model many elements of observed neural phenomena. Here we present a brief (and grossly incomplete) review of the history of some key ideas that led to their inception. We discuss when the assumptions of the equations are valid, when they break down, and how to go beyond them to take into account fluctuations and correlations.

The use of neural network equations is pervasive. They have been referred to as Wilson-Cowan equations [1, 9, 10], Amari equations [11], Cohen-Grossberg equations [12], rate equations [13, 14, 15], graded-response equations [16, 17], continuous short term memory (STM) equations [18, 19, 20], and neural field equations [21, 22]. Here, we will adopt the term activity equations. Most of these equations come in one of three forms. The actual activity equations in Wilson and Cowan's 1972 paper are

$$\tau \dot a_i = -a_i + (1 - r a_i)\, f_i\Bigl(\sum_j w_{ij} a_j + I_i\Bigr) \qquad (1)$$

where $a_i$ is the neural activity of a local population indexed by $i$, $f_i$ is a nonlinear response or gain function, $w_{ij}$ is a coupling weight matrix, $I_i$ is an external input, $\tau$ is a decay time constant, and $r$ is a refractory time constant. Wilson and Cowan explicitly separated the effects of excitatory and inhibitory neurons. Their 1973 paper introduced a continuum version:

$$\tau \dot a(x) = -a(x) + \bigl(1 - r a(x)\bigr)\, f\Bigl(\int_\Omega w(x - y)\, a(y)\, dy + I(x)\Bigr)$$

where $x$ is a spatial variable on a domain $\Omega$. The other two common forms are

$$\tau \dot s_i = -s_i + \sum_j W_{ij} f_j(s_j) + I_i \qquad (2)$$

$$\tau \dot s_i = -s_i + (B_i - s_i) \sum_j W_{ij} f_j(s_j) + I_i \qquad (3)$$

which Grossberg [20] calls the additive and shunting short term memory (STM) equations respectively. Grossberg and colleagues use (2) and (3) with excitatory and inhibitory effects separated. Amari and subsequent neural field papers use the continuum version of (2), often with excitation and inhibition lumped together so that neurons no longer obey Dale's law, which simplifies the analysis.

The term Wilson-Cowan equations often refers to either (1) with $r = 0$ or (2). For constant external input, these two equations are actually identical [20], as can be seen by setting $s_i = I_i + \sum_j w_{ij} a_j$ and taking the derivative:

$$\dot s_i = \sum_j w_{ij} \dot a_j = \sum_j w_{ij}\Bigl[-a_j + f_j\Bigl(\sum_k w_{jk} a_k + I_j\Bigr)\Bigr] \qquad (4)$$

$$= -s_i + \sum_j w_{ij} f_j(s_j) + I_i \qquad (5)$$

From this perspective, if $a$ is neural activity then $s$ is best viewed as a synaptic input or drive [23]. We will use the term Wilson-Cowan (WC) equations to refer exclusively to (1). For (2) we will use the term graded-response model. For a quasi-stationary state where $s_i \approx \sum_j w_{ij} f_j(s_j) + I_i$, $a_i$ can be interpreted as a neuronal mean firing rate

$$a_i = f_i(s_i) \qquad (6)$$

and $f_i(s)$ is a single-neuron gain function (FI curve) specifying rate as a function of input.
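As a concrete illustration of how the additive form (2) is integrated in practice, here is a minimal Euler-scheme sketch for a two-unit network with one excitatory and one inhibitory population. The weights, input, gain function, and time constants are arbitrary choices made for illustration and are not taken from the papers discussed.

import numpy as np

def f(x):
    # Sigmoidal gain function (illustrative choice)
    return 1.0 / (1.0 + np.exp(-x))

# Unit 0 is excitatory, unit 1 is inhibitory; W[i, j] couples unit j onto unit i
W = np.array([[1.5, -1.0],
              [1.0,  0.0]])
I_ext = np.array([0.5, 0.0])     # constant external drive
tau, dt, T = 10.0, 0.1, 200.0

s = np.zeros(2)
for _ in range(int(T / dt)):
    # Graded-response (additive STM) dynamics: tau * ds/dt = -s + W f(s) + I
    s += dt / tau * (-s + W @ f(s) + I_ext)

print("steady-state synaptic drives:", s)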
Equation (1) would then be considered a rate-based model equivalent to the graded-response model (2).

There are three essential ingredients that activity equations include: 1) a continuous time dependent activity variable, 2) a linear weighted sum over these variables in the input, and 3) a nonlinear gain or activation function linking input to output. Underlying these ingredients are coarse-graining and mean field theory. Here we explore some of the history behind these ideas. A list of milestones is in the box.

It was well known throughout the last century that neurons fire action potentials when given sufficient stimulus. Thus it was reasonable to model them as binary state machines that activate when their inputs exceed a threshold. In 1943, McCulloch and Pitts [24] showed that a network of such binary threshold neurons with both excitatory and inhibitory connections is capable of producing any logical operation and thus performing universal computations. This launched the study of neural networks in computer science, artificial intelligence, and neuroscience.
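A minimal sketch of the McCulloch-Pitts picture: a unit emits 1 when the weighted sum of its binary inputs reaches a threshold, and such units can be composed into arbitrary Boolean functions. The weights and thresholds below are illustrative choices, not values from the original paper.

def mp_neuron(inputs, weights, threshold):
    # Binary threshold unit: fire (1) if the weighted input sum reaches threshold
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

AND = lambda a, b: mp_neuron([a, b], [1, 1], 2)
OR  = lambda a, b: mp_neuron([a, b], [1, 1], 1)
NOT = lambda a:    mp_neuron([a],    [-1],   0)

# Composing units yields any logical operation, e.g. XOR:
XOR = lambda a, b: AND(OR(a, b), NOT(AND(a, b)))
print([XOR(a, b) for a in (0, 1) for b in (0, 1)])   # [0, 1, 1, 0]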
Shortly thereafter a dissenting opinion emerged that sprouted a branch of research that would eventually lead to the Wilson-Cowan equations and dynamical systems modeling of the brain. Shimbel and Rapoport [25] argued that the McCulloch and Pitts "neural net" approach fell short of building plausible theories for neural systems. They argued that countless numbers of equivalent networks could produce the same desired output and that neural nets were not robust to the failure of a few neurons. Additionally, it was unlikely that genes predetermined the details of the microscopic structure of biological systems; rather, they imposed statistical traces on the macroscopic properties. They proposed that the goal should be to seek the statistical properties that govern the evolution and function of a biological system, rather than proposing specific network connectivity to perform a particular task.

They derived a dynamic equation governing the firing probability of neurons located in a given region of a brain network in terms of a recursive map. Like McCulloch and Pitts they assumed binary neurons and supposed that a particular synapse receives input from $n$ axon terminals, which they called "bulbs", partitioned into $p$ "bulb groups". The probability that a neuron will fire depends on an integer number of bulbs exceeding a threshold $h$. Neuron firing dynamics are decomposed into a set of probabilities governing the connectivity and synaptic properties. The fraction of neurons at location $Q$ firing at time $t = 1$ was presumed to be

$$f(Q) = \sum_{n,p,h} P(Q; n, p, h)\, P_{nph}[I(Q)], \qquad I(Q) = \int_\Omega O(Q, X)\, f(X)\, dX \qquad (7)$$

where $f(X)$ is the firing probability of neurons at $X$ on a domain $\Omega$ at time $t = 0$, $O(Q, X)$ is the probability that a bulb group at $Q$ originates from $X$, $P(Q; n, p, h)$ is the probability that a neuron at $Q$ has a firing threshold $h$ and synapses of type $(n, p)$, and $P_{nph}[I(Q)]$ denotes the conditional probability that an active neuron at $Q$ is of $(n, p, h)$ type. A crucial idea for these equations is that the probability of firing at a given time only depends on the probability of firing at the previous time and not on any higher order statistic. This is an implicit mean field theory assumption. Although $P_{nph}[I(Q)]$ is derivable in principle, it would be a daunting task because the probability of activation depends on the microscopic details of synaptic configurations.

Beurle [4] introduced coarse-graining, which resolved the synaptic structure problem. He was inspired by experimental findings on the distribution of dendritic fibers as a function of distance from the cell body [26, 27]. Anatomical studies had started to focus on connectivity patterns of neuronal aggregates (as opposed to the connections of individual neurons), providing probability distributions instead of specific circuits [28]. Beurle's coarse-graining scheme represented neural activity as a continuous field in space and time and was the first neural field theory (see [22] for a review). Beurle assumed a network of purely excitatory cells with uniform thresholds and refractory period. A neuron becomes active if it receives input from a threshold number of active neurons. If $q$ is the threshold then the rate at which the cells become active at a time $t + \tau$ is

$$F(t + \tau) = R\, P_{q-1}\, F\, k \qquad (8)$$

where $F(x, t)$ is the proportion of cells active in any region per unit time, $k$ is a scale constant, $\tau$ is the cell time constant, $R$ is the proportion of cells which are sensitive to input, and

$$P_{q-1} = \frac{e^{-\bar N}\, \bar N^{\,q-1}}{(q-1)!} \qquad (9)$$

is the probability that cells have already received $q - 1$ units of excitation and are ready to spike with the arrival of one further impulse.
The mean number of activated neurons given some input, $\bar N$, obeyed

$$\bar N(x, t) = \int_{-\infty}^{t} \int_{-\infty}^{\infty} w(x - x')\, F(x', t')\, \alpha(t - t')\, dx'\, dt' \qquad (10)$$

where $\alpha(t)$ is the temporal response function and $w(x) = b e^{-|x|/a}$ is a spatially dependent connection function inspired by experimental studies [26, 27]. Again, this was a mean field theory formulation, as activity depends only on past activity and not on higher moments (cumulants) of the activity.

Beurle analyzed spontaneous random activity and waves. He was interested in waves of activity because of then-recent electroencephalographic studies [29, 30, 31, 32]. He showed that his model with minimal structure was capable of supporting plane, spherical and circular waves as well as vortices. In addition, he was driven by the idea that switching of neural waves may be related to shifting of attention (or other perceptions). However, his model also showed that the stationary fixed point is unstable; a slight deviation from the critical point either leads the activity to cease or to saturate (due to refractoriness). Therefore, he erroneously concluded that there is no stable continuous random activity. Despite the achievements of the theory, the lack of inhibition as well as some technical flaws drove others to look for alternative formulations. Nevertheless, Beurle established most of the concepts that eventually led to the Wilson-Cowan equations.

The next milestone was Griffith [5, 6]. After establishing the role of inhibition in stabilizing brain activity [33], Griffith took a different approach and derived a reaction-diffusion equation for neural activity. He desired a classical field theory with a field representing the synaptic excitation or input to the cells and a source that quantifies the activity of the cells (e.g. spiking frequency). He built a phenomenological model by enforcing a set of constraints on the equations.

Griffith adopted a linear inhomogeneous differential equation for $\psi$ of the form

$$H\psi = \kappa F \qquad (11)$$

where $H$ is a linear operator and $\kappa$ is a constant. He defined $F$ as

$$F(x, t) \equiv f\Bigl[\int_{-\infty}^{+\infty} \psi(x, t - \tau)\, \chi(\tau)\, d\tau\Bigr] \qquad (12)$$

where $f$ is a nonlinear activation function and the temporal response function $\chi(\tau)$ obeys causality. Assuming that $\psi$ is differentiable and quasi-stationary ($\partial f/\partial \psi$ slowly varying), he approximated $F$ by Taylor expanding to second order:

$$F \approx f(\psi) + A_1 \frac{\partial \psi}{\partial t} + A_2 \frac{\partial^2 \psi}{\partial t^2}. \qquad (13)$$

He then assumed that the field operator $H$ obeyed translational and rotational invariance, which implied

$$H = a + b \frac{\partial}{\partial t} + c \frac{\partial^2}{\partial t^2} + \nabla^2. \qquad (14)$$

Substituting a field solution ansatz with constant velocity $v$,

$$\psi(\mathbf{r}, t) = \int F(\mathbf{R}, t - p/v)\, w(|\mathbf{r} - \mathbf{R}|)\, d\mathbf{R}, \qquad (15)$$

where $w$ is a connectivity function and $p = |\mathbf{r} - \mathbf{R}|$, into equation (14), he obtained

$$\nabla^2 \psi = \tfrac{1}{4}\beta^2 \psi + \Bigl(\frac{\beta}{v} - \pi A_1\Bigr)\frac{\partial \psi}{\partial t} + \Bigl(\frac{1}{v^2} - \pi A_2\Bigr)\frac{\partial^2 \psi}{\partial t^2} - \pi f(\psi) \qquad (16)$$

for $w(p) = A p^{-1} e^{-\beta p}$, which Griffith argued was compatible with Sholl's experimental findings [26]. Rescaling and assuming a shallow activation function where $\partial f/\partial \psi$ varies slowly, in the limit of $v \to \infty$ equation (16) can be reduced to

$$\nabla^2 \psi = \mu \psi + \dot\psi - f(\psi) \qquad (17)$$

where $\mu > 0$ and $f$ is also redefined as a general nonlinear function.
From (17) it is apparent that for spatial homogeneity one finds a mean field equation

$$\dot\psi = -\mu \psi + f(\psi) \qquad (18)$$

which is the graded-response model (2) for uniform connectivity, which as shown above is equivalent to the Wilson-Cowan equation.

Griffith showed that Beurle's formalism is equivalent to his if the activation function $f(\psi)$ is adjusted appropriately. He originally assumed a linear response function, namely $f(\psi) = f(\psi_c) + b(\psi - \psi_c)$, but argued that a sigmoidal $f$ could have two stable critical points and an unstable fixed point in between that corresponded to Beurle's unstable steady state. Griffith was the first to show the possibility of stable spontaneous activity, depending on the gain function. However, he did not account for excitatory and inhibitory synaptic connections explicitly.

By the mid 1960's most of the concepts that would be incorporated into the Wilson-Cowan equations had been proposed, although no one had put them together into a cohesive whole.
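The fixed point structure Griffith described is easy to see numerically. The sketch below (an illustration with an arbitrarily chosen sigmoid and decay constant, not parameters from Griffith) locates the fixed points of the homogeneous equation (18) and classifies their stability from the sign of the slope of the right-hand side; a sufficiently steep sigmoid yields two stable states separated by an unstable one.

import numpy as np

mu = 1.0
f = lambda psi: 5.0 / (1.0 + np.exp(-4.0 * (psi - 2.5)))   # illustrative sigmoid

psi = np.linspace(0.0, 6.0, 1000)
rhs = -mu * psi + f(psi)                     # right-hand side of eq. (18)
crossings = np.where(np.diff(np.sign(rhs)) != 0)[0]
for c in crossings:
    slope = (rhs[c + 1] - rhs[c]) / (psi[c + 1] - psi[c])
    kind = "stable" if slope < 0 else "unstable"
    print(f"fixed point near psi = {psi[c]:.2f} ({kind})")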
Historical milestones in mean field activity equations:

• McCulloch-Pitts: Neural networks as computational units; foundation of neural networks
• Shimbel and Rapoport: Statistical approach to theory of central nervous system
• Beurle: First neural field theory of purely excitatory neurons
• Griffith: Role of inhibition in stabilizing neural activity
• Griffith: Reaction-diffusion neural field theory
• Grossberg: Continuous models of associative memory and pattern recognition
• Wilson and Cowan: Mean field theory of coarse-grained neural activity with excitation and inhibition
• Wilson and Cowan: Wilson-Cowan equations with spatial dependence
• Feldman and Cowan: Wilson-Cowan equations can be derived from microscopic activity under quasi-stationarity
• Amari: Neural field theory with local excitation and lateral inhibition
• Cohen and Grossberg: Lyapunov function for general class of continuous activity equations
• Gerstner: Spike response formalism
Figure 1: Milestones in the development of activity equations.

Wilson and Cowan developed a coarse-grained description of neuronal activity where the distinction between excitatory and inhibitory cells was taken into account explicitly. They were motivated by physiological evidence from Hubel and Wiesel [34, 35], Mountcastle [36], Szentagothai and Lissak [37], and Colonnier [38], which suggested the existence of certain populations of neurons with similar responses to external stimuli.

In line with Sholl [3] and Beurle [4], Wilson and Cowan argued that a microscopic (single-neuron) description of neural activity is probably not well suited for understanding higher level functions that entail more complex processes such as sensory processing, memory and learning. Using dynamical systems analysis, they showed that their equations exhibit multi-stability and hysteresis, which could serve as a substrate for memory [39, 40], and limit cycles, where the frequency of the oscillation is a monotonic function of stimulus intensity. The combination of mathematical tractability and dynamical richness is the reason for the lasting legacy of their equations.

Wilson and Cowan derived their activity equations from first principles using a probabilistic framework on an aggregate of heterogeneous threshold neurons coupled with excitatory and inhibitory synapses with random connectivity, by which spatial interactions could be neglected. They extended the model to the case of spatial inhomogeneity the following year [2].

They assumed that neurons have a sensitive phase, where input exceeding a threshold would cause them to fire, and a refractory period, in which they would not fire regardless of input. Defining $A_i$ as the proportion of cells of type $i \in \{E, I\}$ (excitatory or inhibitory) active at time $t$, the fraction that are refractory will be $\int_{t-r}^{t} A_i(t')\, dt'$, with refractory period $r$, and the fraction that are sensitive is then $1 - \int_{t-r}^{t} A_i(t')\, dt'$.

The activation function $f(z)$ gives the expected proportion of cells that would respond to a given level of input $z$ if nonrefractory. If this and the fraction of sensitive neurons are independent, then the updated fraction of active cells would be

$$\Bigl[1 - \int_{t-r}^{t} A_i(t')\, dt'\Bigr] f_i(z_i)\, \delta t, \qquad (19)$$

where

$$z_i \equiv \int_{-\infty}^{t} \alpha(t - t')\bigl[W_{iE} A_E(t') - W_{iI} A_I(t') + Q_i(t')\bigr]\, dt',$$

$Q_i$ is an external input, $\alpha(t - t')$ is a response function governing the time evolution of the spike arrivals, and $W_{jk}$ is the average number of synapses from type $k$ to $j$. Wilson and Cowan noted that the input to a cell and the sensitive fraction could be correlated, which would violate mean field theory since both involve $A_i$. They argued that this correlation could be negligible for highly interconnected networks due to the presence of spatial and temporal fluctuations in the average excitation within the population, and also due to the variability of thresholds supported by experiments [41, 42, 43].

To estimate the shape of the activation function, they assumed that even if all the neurons receive the same input, any variability in other parameters such as the firing threshold or the number of afferent connections could lead to a probability distribution for the number of spiking neurons. Assuming that the variability mainly stems from the threshold, the expected proportion of neurons that receive at least a threshold excitation (per unit time) would be

$$f(x) = \int_0^{x(t)} p(\theta)\, d\theta, \qquad (20)$$

where $p(\theta)$ is the threshold probability density. If $p(\theta)$ is unimodal then $f(x)$ would be a sigmoid function. (A similar argument holds for a uniform threshold and a distribution of neuronal afferent synapses, for which $f(x) = \int_{\theta/x(t)}^{\infty} c(w)\, dw$, where $c(w)$ is the probability density for synaptic connections; the lower bound on the integral arises because all neurons with at least $\theta/x(t)$ connections would cross the threshold.)
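A quick numerical check of this argument: integrating a unimodal threshold density from 0 up to the input level produces a sigmoidal gain curve. The Gaussian density and its parameters below are arbitrary illustrative choices.

import math

mu, sigma = 1.0, 0.3      # mean and width of the threshold density (illustrative)
Phi = lambda y: 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

# f(x) = integral from 0 to x of a Gaussian threshold density p(theta)
for x in [0.0, 0.5, 0.8, 1.0, 1.2, 1.5, 2.0]:
    f_x = Phi((x - mu) / sigma) - Phi((0.0 - mu) / sigma)
    print(f"x = {x:3.1f}   f(x) = {f_x:.3f}")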
Later, Feldman and Cowan [44] expanded on that interpretation, showing that the activation function in the Wilson-Cowan equations can be regarded as the average of single-neuron activation functions.

Putting this all together gives

$$A_E(t + \tau) = \Bigl[1 - \int_{t-\tau}^{t} A_E(t')\, dt'\Bigr] f_E\Bigl(\int_{-\infty}^{t} \alpha(t - t')\bigl[W_{EE} A_E(t') - W_{EI} A_I(t') + Q_E(t')\bigr]\, dt'\Bigr) \qquad (21)$$

$$A_I(t + \tau) = \Bigl[1 - \int_{t-\tau}^{t} A_I(t')\, dt'\Bigr] f_I\Bigl(\int_{-\infty}^{t} \alpha(t - t')\bigl[W_{IE} A_E(t') - W_{II} A_I(t') + Q_I(t')\bigr]\, dt'\Bigr) \qquad (22)$$

Defining the time coarse-grained activity

$$a_i(t) = \frac{1}{s} \int_{t-s}^{t} A_i(t')\, dt' \qquad (23)$$

they argued that if $\alpha(t) \approx 1$ for $0 \le t \le r$ and decays rapidly for $t > r$, then

$$\int_{t-r}^{t} A(t')\, dt' \to r a(t) \qquad (24)$$

$$\int_{-\infty}^{t} \alpha(t - t')\, A(t')\, dt' \to k a(t) \qquad (25)$$

where $k$ and $r$ are constants. Inserting into (21) and (22), Taylor expanding, and rescaling gives

$$\tau_E \frac{da_E}{dt} = -a_E + (1 - r a_E)\, f_E\bigl[W_{EE} a_E - W_{EI} a_I + Q_E(t)\bigr] \qquad (26)$$

$$\tau_I \frac{da_I}{dt} = -a_I + (1 - r a_I)\, f_I\bigl[W_{IE} a_E - W_{II} a_I + Q_I(t)\bigr] \qquad (27)$$

After deriving the original mean field equations, Wilson and Cowan considered spatial inhomogeneity. Inspired by Beurle [4], they extended their original equations to

$$\tau \frac{\partial A_E(\mathbf{x}, t)}{\partial t} = -A_E(\mathbf{x}, t) + \bigl(1 - r A_E(\mathbf{x}, t)\bigr)\, f_E\Bigl[\int_\Omega d\mathbf{x}'\, \rho_E(\mathbf{x}')\, W_{EE}(\mathbf{x} - \mathbf{x}')\, A_E(\mathbf{x}', t) - \int_\Omega d\mathbf{x}'\, \rho_I(\mathbf{x}')\, W_{EI}(\mathbf{x} - \mathbf{x}')\, A_I(\mathbf{x}', t) + I_E(\mathbf{x}, t)\Bigr] \qquad (28)$$

$$\tau \frac{\partial A_I(\mathbf{x}, t)}{\partial t} = -A_I(\mathbf{x}, t) + \bigl(1 - r A_I(\mathbf{x}, t)\bigr)\, f_I\Bigl[\int_\Omega d\mathbf{x}'\, \rho_E(\mathbf{x}')\, W_{IE}(\mathbf{x} - \mathbf{x}')\, A_E(\mathbf{x}', t) - \int_\Omega d\mathbf{x}'\, \rho_I(\mathbf{x}')\, W_{II}(\mathbf{x} - \mathbf{x}')\, A_I(\mathbf{x}', t) + I_I(\mathbf{x}, t)\Bigr] \qquad (29)$$

where the activity variables are

$$A_E(\mathbf{x}, t) = \int_{-\infty}^{t} dt'\, \alpha(t - t')\, n_E(\mathbf{x}, t'), \qquad A_I(\mathbf{x}, t) = \int_{-\infty}^{t} dt'\, \alpha(t - t')\, n_I(\mathbf{x}, t') \qquad (30)$$

where $\alpha(t - t') = \alpha_0 e^{-(t - t')/\tau}$, $W_{ij}(\mathbf{x} - \mathbf{x}')$ is the spatially dependent connection weight, $\rho_i(\mathbf{x})$ defines the density of neurons in a small volume around $\mathbf{x}$, and $n_E$ and $n_I$ represent the proportions of excitatory and inhibitory neurons activated per unit time. Although Wilson and Cowan developed the model to incorporate spatial dependence of connections, the formulation could go beyond spatial dependency, as $x$ could in general be any abstract quantity.
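For concreteness, a minimal Euler integration of the excitatory-inhibitory equations (26) and (27) is sketched below. The gain function and all parameter values are illustrative choices in the spirit of the 1972 paper's examples; they are not claimed to reproduce any particular published result. Depending on the drive Q_E, the trajectory can settle to a fixed point or oscillate.

import numpy as np

def gain(x, a, theta):
    # Logistic gain shifted so that gain(0) = 0
    return 1.0 / (1.0 + np.exp(-a * (x - theta))) - 1.0 / (1.0 + np.exp(a * theta))

# Illustrative parameters
tauE = tauI = 1.0
r = 1.0
wEE, wEI, wIE, wII = 16.0, 12.0, 15.0, 3.0
QE, QI = 1.25, 0.0
aE_slope, thE, aI_slope, thI = 1.3, 4.0, 2.0, 3.7

aE, aI = 0.1, 0.05
traj = []
dt, T = 0.01, 100.0
for _ in range(int(T / dt)):
    fE = gain(wEE * aE - wEI * aI + QE, aE_slope, thE)
    fI = gain(wIE * aE - wII * aI + QI, aI_slope, thI)
    aE += dt / tauE * (-aE + (1.0 - r * aE) * fE)
    aI += dt / tauI * (-aI + (1.0 - r * aI) * fI)
    traj.append(aE)

traj = np.array(traj)
print("a_E min/max over the run:", traj.min(), traj.max())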
The artificial intelligence branch of neural networks following McCulloch and Pitts focused initially on the computational capabilities of the binary neuron, e.g. [45, 46, 47]. However, Hartline and Ratliff [48] considered the continuous firing rate of single cells in a neural network to successfully model the Limulus retina, which could be directly measured. This was followed by Grossberg, whose initial goal was to understand how the behavior of an individual can adapt stably in real time to complex and changing environmental conditions [20]. The core of his approach led to a class of continuous neural networks defined by the nonlinear coupling between activity and synaptic (adaptive) weights. He proved that the computational units of these networks are not individual activities or connections, but the patterns of these variables (Grossberg, 1968b, 1969a, 1969b, 1970a) [49, 50, 18, 20].

Grossberg first considered model (2), which he termed the additive STM equation; the equation exhibits bistability, as shown by Griffith. He then added dynamics for the synaptic weights to model long term memory. Inspired by the structure of the Hodgkin-Huxley model, Grossberg next derived an equation for neural networks that more closely modeled the shunting dynamics of individual neurons, resulting in (3) [20]. The shunting STM model is approximated by the additive STM model when the activities $s_i$ in (3) are far from saturation. These networks are capable of performing a rich repertoire of behaviors including content addressable memory and oscillations [19]. Cohen and Grossberg (1983) proved that under general conditions, including symmetric weights, general activity equations possess a Lyapunov function, indicating that all orbits will flow to local minima and can serve as a basis for models of associative memory.

In 1982, Hopfield, using the analogy to the statistical mechanics of spin glasses, showed that a discrete time binary neuron model with symmetric connections has a non-increasing energy function (Lyapunov function) and thus stable attractors that can act as memory states [51]. The foundations of this work had been set in the early 1960's, when several studies demonstrated that (artificial) adaptive networks could perform recognition tasks [52, 53]. These advances were followed by a number of studies on associative memory and pattern recognition, including those by Grossberg [49, 50, 18] and Amari [54, 55, 11]. Little and Shaw [56, 57] pointed out the analogy between neural networks and spin glasses. Hopfield's contributions [17] attracted a lot of interest from the physics community [58, 59], but it is rarely acknowledged that his discovery was a special case of the general Cohen-Grossberg theorem and that there was much work in this area that preceded it.
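The energy argument is easy to demonstrate in a toy simulation. The sketch below (our own illustrative construction, not code from any cited work) stores a few random binary patterns with a Hebbian rule and runs asynchronous threshold updates; for symmetric weights with zero diagonal the energy is non-increasing and the state typically settles into a stored pattern.

import numpy as np

rng = np.random.default_rng(0)

N, P = 100, 3
patterns = rng.choice([-1, 1], size=(P, N))     # patterns to store
W = (patterns.T @ patterns) / N                 # Hebbian weights (symmetric)
np.fill_diagonal(W, 0.0)                        # no self-coupling

def energy(s):
    # Hopfield energy function E = -1/2 s^T W s
    return -0.5 * s @ W @ s

# Start from a corrupted copy of pattern 0 and update asynchronously
s = patterns[0].copy()
s[rng.choice(N, size=20, replace=False)] *= -1
for sweep in range(5):
    for i in rng.permutation(N):
        s[i] = 1 if W[i] @ s >= 0 else -1       # binary threshold update
    print(f"sweep {sweep}: energy = {energy(s):.3f}")

print("overlap with stored pattern:", (s @ patterns[0]) / N)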
His approachechoes the ideas underlying the Wilson-Cowan equations, with a focus on arealistic model of single-neuron dynamics.In the spike-response formalism, the membrane potential of a neuron ismodeled by the combination of the time dependent refractory response ofthe neuron to its own activity and the summed responses to the incomingspikes [62, 60]. As a result the synaptic potential of a single neuron can be14escribed by the integro-differential equation h i ( t ) = h ext i ( t ) + F (cid:88) f =1 η refr (cid:16) t − t fi (cid:17) + (cid:88) j ∈ Γ i J ij (cid:90) ∞ κ ( s, s (cid:48) ) S ( F ) j ( t − s (cid:48) ) ds (cid:48) (31)where h i is the membrane potential, η refr is the refractory function, t fi denotesthe spike times for neuron i , and κ ( s, s (cid:48) ) denotes the response kernel in which s is the time that has passed since the last postsynaptic spike representedby S ( F ) j [62]. Different models of single-neuron dynamics can be reduced toa spike-response model with appropriate kernel functions [60, 63, 61].In order to find the connection to rate-based models, Gerstner appliedmean field theory. He considered a uniform population of neurons for whichthe activity A p ( t ) of a single pool p is defined as the proportion of spikingneurons per unit time within the pool. Neurons in a pool are equivalent;connection weights and the response kernel κ ( s, s (cid:48) ) only depend the poolidentity. However, the addition of noise, which turned out to be essentialin his formulation, does cause variability among spike trains within a singlepool. To formulate his pool dynamics, Gerstner assumed that the spiketrains could be approximated as renewal processes if the synapses are weak | W pq | (cid:28)
This led to a closed set of mean field equations describing the synaptic input $h(p, t)$, the pool activity $A_p(t)$, and the firing noise probability imposed by the dynamics. His formalism is suited to model neural network dynamics at arbitrarily short time scales.

In line with previous studies [44, 60], Gerstner showed that differential equation activity models, while excellent for modeling asynchronous firing, break down for fast transients and coherent oscillations. He derived a correction to the quasi-stationarity assumption, estimating the error of rate-based models. He showed that for the rate-based description to be valid one should have $\langle s \rangle \dot A(t)/A(t) \ll 1$, where $\langle s \rangle$ denotes the inter-spike interval (inverse of the mean firing rate). In addition, he also re-derived the Wilson-Cowan equations by adjusting the appropriate kernels and provided a more rigorous derivation wherein the noise-induced neuron firing probability plays the same role as the sigmoidal activation function in the original Wilson-Cowan derivation.
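As an illustration of the spike-response picture in equation (31), the sketch below builds a membrane potential trace as a sum of a refractory kernel triggered by the neuron's own spikes and synaptic kernels triggered by presynaptic spikes. The kernel shapes, spike times, and efficacies are all arbitrary toy choices, not Gerstner's.

import numpy as np

def eta_refr(t):
    # After-spike refractory response (illustrative exponential reset)
    return np.where(t > 0, -5.0 * np.exp(-t / 5.0), 0.0)

def kappa(t):
    # Postsynaptic potential kernel (illustrative alpha function)
    return np.where(t > 0, (t / 2.0) * np.exp(1.0 - t / 2.0), 0.0)

t = np.arange(0.0, 60.0, 0.1)
own_spikes = [10.0, 30.0]                              # the neuron's own spike times
pre_spikes = {0: [5.0, 12.0], 1: [20.0, 22.0, 40.0]}   # presynaptic spike times by index
J = {0: 0.8, 1: -0.4}                                  # synaptic efficacies (one inhibitory)

h = np.zeros_like(t)                                   # external input taken to be zero
for tf in own_spikes:
    h += eta_refr(t - tf)
for j, spikes in pre_spikes.items():
    for tf in spikes:
        h += J[j] * kappa(t - tf)

print("membrane potential at t = 25:", h[np.searchsorted(t, 25.0)])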
Mean field theory for known microscopic dynamics

The Wilson-Cowan equations and variants have successfully modeled and predicted a variety of neural phenomena. However, the question remains as to how quantitatively accurate they are in modeling a neural system with known microscopic neuron and synaptic dynamics, the spike response formalism notwithstanding. Here, we explicitly derive a mean field theory for the neural activity of a deterministic network of coupled spiking neurons [64, 65].

Consider a network of $N$ conductance-based neurons:

$$\tau_V \frac{dV_i}{dt} = h_V(V_i, m_i, s_i)$$
$$\tau_m \frac{dm_i}{dt} = h_m(V_i, m_i, s_i)$$
$$\tau_s \frac{ds_i}{dt} = -s_i + \sum_{j=1}^{N} w_{ij} A(V_j)$$

where $V_i$ is the membrane potential of neuron $i$, $m_i$ represents a single or a set of auxiliary variables, $s_i$ represents the synaptic input or drive, the $h$'s are continuous functions specifying the dynamics of the respective variables, the $\tau$'s are time constants, $w_{ij}$ is a matrix of synaptic weights between neurons, and $A$ is a function that activates whenever $V_j$ exceeds a threshold, indicating an action potential. We do not explicitly distinguish between excitatory and inhibitory neurons, but this is reflected in the parameter values, which can vary from neuron to neuron, and the synaptic weight matrix (which can, e.g., obey Dale's law).

If the individual neurons have limit cycle dynamics, as expected for spiking neurons, and the coupling between individual neurons is not excessively strong (i.e. a single spike does not strongly perturb the spiking dynamics, although many can and will), then the neuron dynamics can be reduced to a phase variable around or near the limit cycle [66]. The system takes the simpler form of

$$\dot\theta_i = F_i(\theta_i, s_i) \qquad (32)$$

$$\tau_s \dot s_i = -s_i + \frac{1}{N} \sum_j w_{ij}\, \delta(t - t_j^s) \qquad (33)$$

where $\theta_i$ is the phase of neuron $i$, $F_i$ is the phase velocity, and $t_j^s$ are the spiking times of neuron $j$, which we set to occur at $\theta_j = \pi$.

Our goal is to generate a coarse-grained mean field description of the network specified by (32) and (33). We quantify the neuron activity in terms of a density of the neuron phase:

$$\eta_i(\theta, t) = \delta\bigl(\theta - \theta_i(t)\bigr) \qquad (34)$$

where $\delta(\cdot)$ is the Dirac delta function. This allows us to write

$$\delta(t - t_j^s) = \dot\theta_j\, \eta_j(\pi, t) = F_j(\pi, s_j)\, \eta_j(\pi, t) \qquad (35)$$

which is also the firing rate of the neuron, i.e. the neuron velocity or flux at $\pi$. Inserting (35) into (33) gives

$$\tau_s \dot s_i = -s_i + \frac{1}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) \qquad (36)$$

We obtain the dynamics of $\eta_i$ by imposing local neuron conservation. Although $\eta_i$ is not differentiable, we can still formally write

$$\partial_t \eta_i(\theta, t) + \partial_\theta \bigl[F_i(\theta, s_i)\, \eta_i(\theta, t)\bigr] = 0 \qquad (37)$$

which is called the Klimontovich equation in the kinetic theory of plasmas [67, 68]. Equations (37) and (36) fully specify the spiking neural network dynamics but are no simpler to solve than the original equations. They need a regularization scheme to make them useful. One approach is to average over a selected population. This was the strategy employed in sections 3, 4, and 6 and in previous mean field treatments of networks [69, 70], where a population average is taken. In the limit of $N \to \infty$, the phases are assumed to be sufficiently asynchronous so that $\eta$ is differentiable and (37) becomes a well behaved partial differential equation. With the addition of Gaussian white noise, it takes the form of a Fokker-Planck equation. An alternative is to average over an ensemble of networks, each prepared with a different initial condition drawn from a specified distribution [71, 72, 73, 64, 74, 75, 76].
Taking the ensemble average $\langle \cdot \rangle$ over (36) and (37) gives

$$\partial_t \langle \eta_i(\theta, t) \rangle + \partial_\theta \bigl\langle F_i(\theta, s_i)\, \eta_i(\theta, t) \bigr\rangle = 0$$

$$\tau_s \langle \dot s_i \rangle = -\langle s_i \rangle + \frac{1}{N} \sum_j w_{ij}\, \bigl\langle F_j(\pi, s_j)\, \eta_j(\pi, t) \bigr\rangle$$
We see that the dynamics of the means of $\eta_i$ and $s_i$ depend on the higher order moment $\langle F_i(\theta, s_i)\, \eta_i \rangle$. We can construct an equation for this moment by differentiating $F_i(\theta, s_i)\, \eta_i$, inserting (36) and (37), and then taking the average again, but this will include even higher order moments. Repeating will result in a moment hierarchy that is more complicated than the original equations.

However, if all higher order moments factorize into products of the means (i.e. the cumulants are zero) then we obtain the mean field theory of the spiking network (32) and (33):

$$\partial_t \langle \eta_i(\theta, t) \rangle + \partial_\theta\, F_i\bigl(\theta, \langle s_i \rangle\bigr)\, \langle \eta_i(\theta, t) \rangle = 0 \qquad (38)$$

$$\tau_s \langle \dot s_i \rangle = -\langle s_i \rangle + \frac{1}{N} \sum_j w_{ij}\, F_j\bigl(\pi, \langle s_j \rangle\bigr)\, \langle \eta_j(\pi, t) \rangle \qquad (39)$$

which does not match any of the mean field activity equations in Sec 2. However, since $F_j(\pi, \langle s_j \rangle)\, \langle \eta_j(\pi, t) \rangle$ is the ensemble mean firing rate of neuron $j$, we see that (39) has the form of (2) with time dependent firing rate dynamics given by (38). If $\langle \eta_i \rangle$ were to go to steady state then (38) with $\partial_t \langle \eta_i(\theta, t) \rangle$ set to zero can be solved to yield

$$\langle \eta_j(\theta) \rangle = \frac{C_j}{F_j\bigl(\theta, \langle s_j \rangle\bigr)}$$

where $C_j$ is determined by the normalization condition $\int \langle \eta_j(\theta) \rangle\, d\theta = 1$. We can then obtain (2) with gain function $f_j(\langle s_j \rangle) = F_j(\pi, \langle s_j \rangle)\, \langle \eta_j(\pi) \rangle = C_j$. However, (38) is dissipation free and thus will not relax to steady state from all initial conditions without a noise source or finite size effects [72, 73, 64].

The formalism can be applied to any type of neuron and synaptic dynamics. The neuron model (32) and (33) was chosen for simplicity and also so that the mean field equations would be similar to those of Sec 2. The inclusion of higher order time derivatives or nonlinear terms could result in very different mean field equations. A similar argument to arrive at mean field theory could be applied with the population average, although this will be further complicated by the neuron dependent weight $w_{ij}$, which we will address below. No matter what average is used, we still do not know if or when mean field theory is valid. To answer this question we need to compute the corrections to mean field theory and see when they are small.
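To make the setup of equations (32) and (33) concrete, the following toy simulation (our own construction; the phase model, coupling, and all numbers are illustrative assumptions) integrates a globally coupled population of theta neurons, a standard phase oscillator model that spikes when the phase crosses pi, and reports the population firing rate that a mean field description such as (38) and (39) would aim to capture.

import numpy as np

rng = np.random.default_rng(1)

# theta_dot = (1 - cos theta) + (1 + cos theta) * (I0 + s)   (phase velocity F)
# tau_s * s_dot = -s + (w / N) * sum over spikes (a spike when theta crosses pi)
N, w, I0, tau_s = 1000, 1.0, 0.2, 1.0
dt, T = 0.001, 20.0

theta = rng.uniform(-np.pi, np.pi, N)    # ensemble of initial phases
s = 0.0
rates = []
for _ in range(int(T / dt)):
    F = (1.0 - np.cos(theta)) + (1.0 + np.cos(theta)) * (I0 + s)
    theta_new = theta + dt * F
    spikes = theta_new >= np.pi                        # crossing pi counts as a spike
    theta = np.where(spikes, theta_new - 2.0 * np.pi, theta_new)
    s += dt / tau_s * (-s + (w / dt) * spikes.mean())  # delta-function input per spike
    rates.append(spikes.mean() / dt)

print("population firing rate, last quarter of run:",
      np.mean(rates[-len(rates) // 4:]))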
Beyond mean field theory

Consider again the Klimontovich formulation of the spiking network equations (37) and (36),

$$\mathcal{L}_\eta[\eta_i] \equiv \partial_t \eta_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \eta_i(\theta, t) - \delta(t - t_0)\, \eta_i^0(\theta) = 0 \qquad (40)$$

$$\mathcal{L}_s[s_i] \equiv \dot s_i(t) + \beta s_i(t) - \frac{\beta}{N} \sum_{j=1}^{N} w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0 = 0 \qquad (41)$$

where we have expressed the initial conditions as forcing terms and $\beta = 1/\tau_s$.

We quantify the ensemble average by defining an empirical probability density functional for $\eta_i(\theta, t)$ and $s_i(t)$ as functions constrained to (40) and (41), to wit

$$P[\eta, s \mid \eta^0, s^0] = \prod_i \delta\Bigl[\partial_t \eta_i + \partial_\theta F_i(\theta, s_i)\, \eta_i - \delta(t - t_0)\, \eta_i^0(\theta)\Bigr] \times \delta\Bigl[\dot s_i + \beta s_i - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0\Bigr] \qquad (42)$$

$P[\eta, s \mid \eta^0, s^0]$ is a Dirac delta functional in function space. This formal expression can be rendered useful with the functional Fourier transform expression of the Dirac delta functional [77]:

$$\delta[\mathcal{L}_\eta] = \int \mathcal{D}\tilde\eta_i\; e^{-\int dt\, d\theta\, \tilde\eta_i(\theta, t)\, \mathcal{L}_\eta[\eta_i(\theta, t)]}$$

$$\delta[\mathcal{L}_s] = \int \mathcal{D}\tilde s_i\; e^{-\int dt\, \tilde s_i\, \mathcal{L}_s[s_i(t)]}$$

where $\tilde\eta_i(\theta, t)$ and $\tilde s_i(t)$ are transform variables, called response fields, and the functional integral is over all possible paths or histories of these functions. Applying the functional Fourier transform gives the functional or path integral expression

$$P[\eta, s \mid \eta^0, s^0] = \int \mathcal{D}\tilde\eta\, \mathcal{D}\tilde s\; e^{-\sum_i \int dt\, d\theta\, \tilde\eta_i \mathcal{L}_\eta[\eta_i] - \sum_i \int dt\, \tilde s_i \mathcal{L}_s[s_i]} \qquad (43)$$

Assuming fixed $s_i^0$ and a distribution $P[\eta^0]$ for the initial density, we can integrate or marginalize over the initial condition densities to obtain

$$P[\eta, s] = \int \mathcal{D}\eta^0\; P[\eta, s \mid \eta^0, s^0]\, P[\eta^0] \qquad (44)$$

If we set $\eta_i^0(\theta) = \delta\bigl(\theta - \theta_i(t = 0)\bigr)$, the distribution over initial densities is given by the distribution over the initial phase $\rho_i(\theta)$. Thus we can write $\int \mathcal{D}\eta^0\, P[\eta^0] = \int \prod_i d\theta\, \rho_i(\theta)$. The initial condition contribution is given by the integral

$$e^{W[\tilde\eta]} = \int \prod_i d\theta_i\, \rho_i(\theta_i)\, e^{\sum_i \tilde\eta_i(\theta_i, t_0)} = \prod_i \int d\theta\, \rho_i(\theta)\, e^{\tilde\eta_i(\theta, t_0)} = e^{\sum_i \ln\bigl(1 + \int d\theta\, \rho_i(\theta)(e^{\tilde\eta_i(\theta, t_0)} - 1)\bigr)}$$

Hence, the system given by (40) and (41) can be mapped to the distribution functional

$$P[\eta, s] = \int \mathcal{D}\tilde\eta\, \mathcal{D}\tilde s\; e^{-S} \qquad (45)$$

with action $S = S_\eta + S_s$ given by

$$S_\eta = \sum_i \int_{t_0}^{t_1} dt \int_{-\pi}^{\pi} d\theta\; \tilde\eta_i(\theta, t)\bigl[\partial_t \eta_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \eta_i(\theta, t)\bigr] - \sum_i \ln\Bigl(1 + \int d\theta\, \rho_i(\theta)\bigl(e^{\tilde\eta_i(\theta, t_0)} - 1\bigr)\Bigr)$$

$$S_s = \sum_i \int_{t_0}^{t_1} dt\; \tilde s_i(t)\Bigl(\dot s_i + \beta s_i - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\, \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0\Bigr)$$

The exponential in the initial data contribution to the action (which is a generating function for a Poisson distribution) can be simplified via the Doi-Peliti-Janssen transformation [71, 72, 73, 64, 74, 75, 76]:

$$\psi_i = \eta_i \exp(-\tilde\eta_i), \qquad \tilde\psi_i = \exp(\tilde\eta_i) - 1$$

which gives

$$S_\psi = \sum_i \int d\theta\, dt\; \tilde\psi_i(\theta, t)\bigl[\partial_t \psi_i(\theta, t) + \partial_\theta F_i(\theta, s_i)\, \psi_i(\theta, t)\bigr] - \sum_i \ln\Bigl(1 + \int d\theta\, \rho_i(\theta)\, \tilde\psi_i(\theta, t_0)\Bigr) \qquad (46)$$

$$S_s = \sum_i \int dt\; \tilde s_i(t)\Bigl(\dot s_i(t) + \beta s_i - \delta(t - t_0)\, s_i^0 - \frac{\beta}{N} \sum_j w_{ij}\, F_j(\pi, s_j)\bigl(\tilde\psi_j(\pi, t) + 1\bigr)\, \psi_j(\pi, t)\Bigr) \qquad (47)$$

where we have not included noncontributing terms that arise after integration by parts.

The probability density functional (44) with the action (46) and (47) describes the ensemble statistics for the entire spiking and synaptic drive history of the network.
By marginalizing over initial conditions, the functional density is no longer sharply peaked in function space. The expression cannot be evaluated explicitly but can be computed perturbatively by employing Laplace's method and expanding around the extremum of the integrand. This extremum is given by the point in function space with minimal variation, $\delta S = 0$ (i.e. the least action principle). The resulting equations of motion, if they exist, are the mean field equations.

The accuracy of mean field theory can be assessed by computing the variance around the mean field solution using perturbation theory. The inverse network size $1/N$ is a candidate expansion parameter. The action $S$ is comprised of the transformed phase density $\psi_i$, which is not differentiable, but the sum over neurons provides an opportunity to regularize it. Consider first a homogeneous network, $F_i = F$, and uniform global coupling, $w_{ij} = w$. Then (40) and (41) become

$$\partial_t \eta_i(\theta, t) + \partial_\theta F(\theta, s_i)\, \eta_i(\theta, t) - \delta(t - t_0)\, \eta_i^0(\theta) = 0 \qquad (48)$$

$$\dot s_i(t) + \beta s_i(t) - \frac{w \beta}{N}\, F(\pi, s) \sum_{j=1}^{N} \eta_j(\pi, t) - \delta(t - t_0)\, s_i^0 = 0 \qquad (49)$$

The dynamics of all neurons are identical. We can drop the neuron index and rewrite the action as

$$S_\psi = N \int d\theta\, dt\; \tilde\psi(\theta, t)\bigl[\partial_t \psi(\theta, t) + \partial_\theta F(\theta, s)\, \psi(\theta, t)\bigr] - N \ln\Bigl(1 + \int d\theta\, \rho(\theta)\, \tilde\psi(\theta, t_0)\Bigr) \qquad (50)$$

$$S_s = N \int dt\; \tilde s(t)\Bigl(\dot s(t) + \beta s - \delta(t - t_0)\, s^0 - \beta w\, F(\pi, s)\bigl(\tilde\psi(\pi, t) + 1\bigr)\, \psi(\pi, t)\Bigr) \qquad (51)$$

Applying the least action principle, which amounts to setting the functional derivatives to zero [74, 65], returns the mean field theory equations

$$\partial_t \langle \rho(\theta, t) \rangle + \partial_\theta\bigl[F\bigl(\theta, \langle s \rangle\bigr)\, \langle \rho(\theta, t) \rangle\bigr] = 0$$

$$\langle \dot s \rangle = -\beta \langle s \rangle + \beta w\, F\bigl(\pi, \langle s \rangle\bigr)\, \langle \rho(\pi, t) \rangle$$

where

$$\rho(\theta, t) = \frac{1}{N} \sum_j \eta_j(\theta, t) \qquad (52)$$

The factor of $N$ in the action indicates that deviations away from mean field will be suppressed as $N \to \infty$. A perturbative expansion in $1/N$ can be constructed using Laplace's method to compute all higher order cumulants, which go to zero as $N \to \infty$ [64, 65].

The same argument can be applied to a heterogeneous network with spatially dependent coupling. If the coupling function and the neuron specific parameters are continuous functions of the neuron index, then mean field theory applies for the density and synaptic drive averaged over a local population of neurons at each location in the network in the limit of $N \to \infty$ [65].

We have provided a brief and biased view of the ideas behind the Wilson-Cowan equations and their descendants. We show that one can arrive at a similar set of activity equations, with first order time derivatives, a nonlinear activation (gain) function, and a linear sum over inputs, from various starting points. Despite the fact that the equations are not generally quantitatively accurate, they seem to capture many facets of neural activity and function. They are a set of effective equations that have stood the test of time. Mean field theory is implicitly assumed in all of these equations. Correlations are deliberately ignored to make the equations tractable. However, they are omnipresent in the brain and there is much ongoing effort to account for them. Nonetheless, we predict that forty seven years from now, the Wilson-Cowan equations will still have great utility.
Acknowledgments
This work was supported by the Intramural Research Program of the NIH, NIDDK.
References

[1] H. R. Wilson and J. D. Cowan, "Excitatory and Inhibitory Interactions in Localized Populations of Model Neurons," Biophysical Journal, vol. 12, pp. 1-24, Jan. 1972.
[2] H. R. Wilson and J. D. Cowan, "A mathematical theory of the functional dynamics of cortical and thalamic nervous tissue," Kybernetik, vol. 13, pp. 55-80, Sept. 1973.
[3] H. Hosokawa, "Sholl, D. A.: The Organization of the Cerebral Cortex (London & New York, 1956)," vol. 2, no. 4, pp. 810-819, 1958.
[4] R. L. Beurle, "Properties of a mass of cells capable of regenerating pulses," Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences, pp. 55-94, 1956.
[5] J. S. Griffith, "A field theory of neural nets: I. Derivation of field equations," The Bulletin of Mathematical Biophysics, vol. 25, pp. 111-120, Mar. 1963.
[6] J. S. Griffith, "A field theory of neural nets: II. Properties of the field equations," The Bulletin of Mathematical Biophysics, vol. 27, pp. 187-195, June 1965.
[7] M. Le Bellac, F. Mortessagne, and G. G. Batrouni, Equilibrium and Non-Equilibrium Statistical Thermodynamics. Cambridge University Press, 2004.
[8] L. Peliti, Statistical Mechanics in a Nutshell, vol. 10. Princeton University Press, 2011.
[9] J. D. Cowan, J. Neuman, and W. van Drongelen, "Wilson-Cowan Equations for Neocortical Dynamics," The Journal of Mathematical Neuroscience, vol. 6, p. 1, Dec. 2016.
[10] Z. P. Kilpatrick, Wilson-Cowan Model, pp. 1-5. New York, NY: Springer New York, 2013.
[11] R. Potthast, Amari Model, pp. 1-6. New York, NY: Springer New York, 2013.
[12] W. Lu and T. Chen, "New conditions on global stability of Cohen-Grossberg neural networks," Neural Computation, vol. 15, no. 5, pp. 1173-1189, 2003.
[13] H. Sompolinsky, A. Crisanti, and H. J. Sommers, "Chaos in random neural networks," Physical Review Letters, vol. 61, pp. 259-262, July 1988.
[14] O. Shriki, D. Hansel, and H. Sompolinsky, "Rate models for conductance-based cortical neuronal networks," Neural Computation, vol. 15, pp. 1809-1841, Aug. 2003.
[15] T. P. Vogels, K. Rajan, and L. F. Abbott, "Neural network dynamics," Annual Review of Neuroscience, vol. 28, pp. 357-376, 2005.
[16] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proceedings of the National Academy of Sciences, vol. 81, pp. 3088-3092, May 1984.
[17] J. Hopfield and D. Tank, "Computing with neural circuits: a model," Science, vol. 233, pp. 625-633, Aug. 1986.
[18] S. Grossberg, "Some networks that can learn, remember, and reproduce any number of complicated space-time patterns, I," Journal of Mathematics and Mechanics, vol. 19, no. 1, pp. 53-91, 1969.
[19] M. A. Cohen and S. Grossberg, "Absolute stability of global pattern formation and parallel memory storage by competitive neural networks," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-13, pp. 815-826, Sept. 1983.
[20] S. Grossberg, "Nonlinear neural networks: Principles, mechanisms, and architectures," Neural Networks, vol. 1, no. 1, pp. 17-61, 1988.
[21] P. C. Bressloff, "Spatiotemporal dynamics of continuum neural fields," Journal of Physics A: Mathematical and Theoretical, vol. 45, no. 3, p. 033001, 2011.
[22] S. Coombes, "Waves, bumps, and patterns in neural field theories," Biological Cybernetics, vol. 93, pp. 91-108, Aug. 2005.
[23] D. J. Pinto, J. C. Brumberg, D. J. Simons, G. B. Ermentrout, and R. Traub, "A quantitative population model of whisker barrels: Re-examining the Wilson-Cowan equations," Journal of Computational Neuroscience, vol. 3, pp. 247-264, Sept. 1996.
[24] W. S. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.
[25] A. Shimbel and A. Rapoport, "A statistical approach to the theory of the central nervous system," The Bulletin of Mathematical Biophysics, vol. 10, pp. 41-55, Mar. 1948.
[26] D. Sholl, "The organization of the visual cortex in the cat," Journal of Anatomy, vol. 89, no. Pt 1, p. 33, 1955.
[27] A. Uttley, "The probability of neural connexions," Proceedings of the Royal Society of London, Series B, Biological Sciences, vol. 144, no. 915, pp. 229-240, 1955.
[28] J. L. Feldman and J. D. Cowan, "Large-scale activity in neural nets I: Theory with application to motoneuron pool responses," Biological Cybernetics, vol. 17, pp. 29-38, Jan. 1975.
[29] H.-T. Chang, "Dendritic potential of cortical neurons produced by direct electrical stimulation of the cerebral cortex," Journal of Neurophysiology, vol. 14, no. 1, pp. 1-21, 1951.
[30] B. D. Burns, "Some properties of isolated cerebral cortex in the unanaesthetized cat," The Journal of Physiology, vol. 112, no. 1-2, pp. 156-175, 1951.
[31] W. G. Walter and H. Shipton, "A new toposcopic display system," Electroencephalography and Clinical Neurophysiology, vol. 3, no. 3, pp. 281-292, 1951.
[32] J. C. Lilly and R. B. Cherry, "Surface movements of click responses from acoustic cerebral cortex of cat: leading and trailing edges of a response figure," Journal of Neurophysiology, vol. 17, no. 6, pp. 521-532, 1954.
[33] J. Griffith, "On the stability of brain-like structures," Biophysical Journal, vol. 3, pp. 299-308, July 1963.
[34] T. N. Wiesel and D. H. Hubel, "Single-cell responses in striate cortex of kittens deprived of vision in one eye," Journal of Neurophysiology, vol. 26, no. 6, pp. 1003-1017, 1963.
[35] D. H. Hubel and T. N. Wiesel, "Receptive fields and functional architecture in two nonstriate visual areas (18 and 19) of the cat," Journal of Neurophysiology, vol. 28, no. 2, pp. 229-289, 1965.
[36] V. B. Mountcastle, "Modality and topographic properties of single neurons of cat's somatic sensory cortex," Journal of Neurophysiology, vol. 20, no. 4, pp. 408-434, 1957.
[37] J. Szentagothai and K. Lissak, "Recent developments of neurobiology in Hungary," Akademiai Kiado, Budapest, 1967.
[38] M. Colonnier, "The structural design of the neocortex," in Brain and Conscious Experience, pp. 1-23, Springer, 1965.
[39] B. G. Cragg and H. N. V. Temperley, "Memory: The analogy with ferromagnetic hysteresis," Brain, vol. 78, no. 2, pp. 304-316, 1955.
[40] D. Fender and B. Julesz, "Extension of Panum's fusional area in binocularly stabilized vision," JOSA, vol. 57, no. 6, pp. 819-830, 1967.
[41] C. Geisler, L. Frishkopf, and W. Rosenblith, "Extracranial responses to acoustic clicks in man," Science, vol. 128, no. 3333, pp. 1210-1211, 1958.
[42] A. Verveen and H. Derksen, "Amplitude distribution of axon membrane noise voltage," Acta Physiologica et Pharmacologica Neerlandica, vol. 15, no. 3, p. 353, 1969.
[43] W. Rall and C. C. Hunt, "Analysis of reflex variability in terms of partially correlated excitability fluctuation in a population of motoneurons," The Journal of General Physiology, vol. 39, no. 3, p. 397, 1956.
[44] J. L. Feldman and J. D. Cowan, "Large-scale activity in neural nets II: A model for the brainstem respiratory oscillator," Biological Cybernetics, vol. 17, pp. 39-51, Jan. 1975.
[45] E. R. Caianiello, "Outline of a theory of thought-processes and thinking machines," Journal of Theoretical Biology, vol. 1, no. 2, pp. 204-235, 1961.
[46] J. Orbach, "Principles of neurodynamics. Perceptrons and the theory of brain mechanisms," Archives of General Psychiatry, vol. 7, no. 3, pp. 218-219, 1962.
[47] B. Widrow, "Generalization and information storage in networks of adaline 'neurons'," Self-Organizing Systems 1962, pp. 435-462, 1962.
[48] K. Hadeler and D. Kuhn, "Stationary states of the Hartline-Ratliff model," Biological Cybernetics, vol. 56, no. 5, pp. 411-417, 1987.
[49] S. Grossberg, "Some nonlinear networks capable of learning a spatial pattern of arbitrary complexity," Proceedings of the National Academy of Sciences of the United States of America, vol. 59, no. 2, p. 368, 1968.
[50] S. Grossberg, "On learning and energy-entropy dependence in recurrent and nonrecurrent signed networks," Journal of Statistical Physics, vol. 1, no. 2, pp. 319-350, 1969.
[51] J. J. Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, vol. 79, no. 8, pp. 2554-2558, 1982.
[52] F. Rosenblatt, "Principles of neurodynamics. Perceptrons and the theory of brain mechanisms," tech. rep., Cornell Aeronautical Lab Inc, Buffalo, NY, 1961.
[53] B. Widrow, "An adaptive 'adaline' neuron using chemical 'memistors', 1553-1552," 1960.
[54] S.-i. Amari, "Dynamics of pattern formation in lateral-inhibition type neural fields," Biological Cybernetics, vol. 27, no. 2, pp. 77-87, 1977.
[55] S.-I. Amari, "Field theory of self-organizing neural nets," IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-13, pp. 741-748, Sept. 1983.
[56] W. A. Little, "The existence of persistent states in the brain," in From High-Temperature Superconductivity to Microminiature Refrigeration, pp. 145-164, Springer, 1974.
[57] W. A. Little and G. L. Shaw, "Analytic study of the memory storage capacity of a neural network," Mathematical Biosciences, vol. 39, no. 3-4, pp. 281-290, 1978.
[58] D. J. Amit, H. Gutfreund, and H. Sompolinsky, "Spin-glass models of neural networks," Physical Review A, vol. 32, pp. 1007-1018, Aug. 1985.
[59] H. Sompolinsky, "Statistical mechanics of neural networks," Physics Today, vol. 41, no. 12, pp. 70-80, 1988.
[60] W. Gerstner and J. L. van Hemmen, "Universality in neural networks: the importance of the mean firing rate," Biological Cybernetics, vol. 67, pp. 195-205, July 1992.
[61] W. Gerstner, "Time structure of the activity in neural network models," Physical Review E, vol. 51, pp. 738-758, Jan. 1995.
[62] W. Gerstner, W. M. Kistler, R. Naud, and L. Paninski, Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition. Cambridge University Press, 2014.
[63] W. Gerstner and J. L. van Hemmen, "Associative memory in a network of spiking neurons," Network: Computation in Neural Systems, vol. 3, pp. 139-164, Jan. 1992.
[64] M. A. Buice and C. C. Chow, "Dynamic finite size effects in spiking neural networks," PLoS Computational Biology, vol. 9, p. e1002872, Jan. 2013.
[65] S.-W. Qiu and C. C. Chow, "Finite-size effects for spiking neural networks with spatially dependent coupling," Physical Review E, vol. 98, no. 6, p. 062414, 2018.
[66] E. Brown, J. Moehlis, and P. Holmes, "On the phase reduction and response dynamics of neural oscillator populations," Neural Computation, vol. 16, pp. 673-715, Apr. 2004.
[67] S. Ichimaru, Basic Principles of Plasma Physics: A Statistical Approach. A Lecture Note and Reprint Series, W. A. Benjamin, 1973.
[68] D. R. Nicholson, Introduction to Plasma Theory, vol. 2. Cambridge University Press, 1984.
[69] S. H. Strogatz and R. E. Mirollo, "Stability of incoherence in a population of coupled oscillators," Journal of Statistical Physics, vol. 63, pp. 613-635, May 1991.
[70] N. Brunel, "Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons," Journal of Computational Neuroscience, vol. 8, no. 3, pp. 183-208, 2000.
[71] M. A. Buice and J. D. Cowan, "Field-theoretic approach to fluctuation effects in neural networks," Physical Review E, vol. 75, p. 051919, May 2007.
[72] E. J. Hildebrand, M. A. Buice, and C. C. Chow, "Kinetic theory of coupled oscillators," Physical Review Letters, vol. 98, p. 054101, Jan. 2007.
[73] M. A. Buice and C. C. Chow, "Correlations, fluctuations, and stability of a finite-size network of coupled oscillators," Physical Review E, vol. 76, p. 031118, Sept. 2007.
[74] M. A. Buice and C. C. Chow, "Generalized activity equations for spiking neural network dynamics," Frontiers in Computational Neuroscience, vol. 7, p. 162, 2013.
[75] M. A. Buice, J. D. Cowan, and C. C. Chow, "Systematic fluctuation expansion for neural network activity equations," Neural Computation, vol. 22, no. 2, pp. 377-426, 2010. PMID: 19852585.
[76] M. A. Buice and C. C. Chow, "Beyond mean field theory: statistical field theory for neural networks," Journal of Statistical Mechanics: Theory and Experiment, vol. 2013, no. 03, p. P03003, 2013.
[77] C. C. Chow and M. A. Buice, "Path integral methods for stochastic differential equations."