Model Reduction Captures Stochastic Gamma Oscillations on Low-Dimensional Manifolds
Yuhang Cai∗ · Tianyi Wu∗ · Louis Tao† · Zhuo-Cheng Xiao†
Abstract
Gamma frequency oscillations (25-140 Hz), observed in the neural activities within many brain regions, have long been regarded as a physiological basis underlying many brain functions, such as memory and attention. Among numerous theoretical and computational modeling studies, gamma oscillations have been found in biologically realistic spiking network models of the primary visual cortex. However, due to its high dimensionality and strong nonlinearity, it is generally difficult to perform detailed theoretical analysis of the emergent gamma dynamics. Here we propose a suite of Markovian model reduction methods with varying levels of complexity and apply it to spiking network models exhibiting heterogeneous dynamical regimes, ranging from nearly homogeneous firing to strong synchrony in the gamma band. The reduced models not only successfully reproduce gamma oscillations in the full model, but also exhibit the same dynamical features as we vary parameters. Most remarkably, the invariant measure of the coarse-grained Markov process reveals a two-dimensional surface in state space upon which the gamma dynamics mainly resides. Our results suggest that the statistical features of gamma oscillations strongly depend on the subthreshold neuronal distributions. Because of the generality of the Markovian assumptions, our dimensional reduction methods offer a powerful toolbox for theoretical examinations of other complex cortical spatio-temporal behaviors observed in both neurophysiological experiments and numerical simulations.
Keywords
Gamma oscillations · Synchrony · Homogeneity · Coarse-graining method
Emergent oscillations in the brain may be the neural basis for many brain functions, such as memory and attention. In particular, gamma band oscillations have been found concomitant with improvements in sensory and cognitive tasks (and their disruption has been associated with various brain disorders). A major goal of theoretical and computational neuroscience is to understand how complex dynamical neural activities can emerge from the interplay of cellular and network elements. Here we present a sequence of reduction methods to examine the development of gamma dynamics in an idealized Integrate-and-Fire network model. Through coarse-graining, we find that the appearance of gamma band oscillations can be represented on a suitable, low-dimensional manifold. The reduced systems can recapitulate many of the statistics of the full model, including the transition from homogeneous firing to gamma synchrony. Furthermore, our coarse-graining reveals the importance of subthreshold neuronal distributions that may account for the emergence of gamma oscillations in the brain.

Department of Statistics, University of Chicago, IL 60637, USA; School of Mathematical Sciences, Peking University, Beijing 100871, China; Center for Bioinformatics, National Laboratory of Protein Engineering and Plant Genetic Engineering, School of Life Sciences, Peking University, Beijing 100871, China; Center for Quantitative Biology, Peking University, Beijing 100871, China; Courant Institute of Mathematical Sciences, New York University, NY 10003, USA.
∗ These two authors contributed equally to this paper and are listed in alphabetical order.
† Corresponding authors. L. Tao E-mail: [email protected]; Z.-C. Xiao E-mail: [email protected]

Modern experimental techniques have revealed a vast diversity of coherent spatiotemporal activity patterns in the brain, reflecting the many possible interactions between excitation and inhibition, between cellular and synaptic time-scales, and between local and long-range circuits. Prominent amongst these patterns is the rich repertoire of neuronal oscillations that can be stimulus driven or internally generated and are likely to be responsible for sensory perception and cognitive tasks [29, 70]. In particular, gamma band oscillations (25-140 Hz), observed in multi-unit activity (MUA) and local field potential (LFP) measurements [59], have been found in many brain regions (visual cortex [3, 34, 36, 47], auditory cortex [12], somatosensory cortex [6], parietal cortex [15, 51, 54], frontal cortex [15, 18, 35, 66-68, 74], hippocampus [10, 22, 23, 25], amygdala [56] and striatum [73]). Much experimental evidence correlates gamma oscillations to behavior and enhanced sensory or cognitive performances. For instance, gamma dynamics has been shown to sharpen orientation tuning in V1 and speed and direction tuning in MT [2, 3, 27, 46, 82]. During cognitive tasks, increases in gamma power in the visual pathway have been shown to correlate with attention [28, 30]. Experiments have implicated gamma oscillations during learning [5] and memory [54]. Numerical studies have demonstrated that coherent gamma oscillations between neuronal populations can provide temporal windows during which information transfer can be enhanced [81].
The disruption of gamma frequency synchronization is also concomitant with multiple brain disorders [4, 11, 40, 48, 50]. Many large-scale network simulations [20, 72] and firing rate models [13, 39] have been used to capture the wide range of the experimentally observed gamma band activity. The main mechanism underlying the emergent gamma dynamics appears to be the strong coupling between network populations, either via synchronizing inhibition [78] or via competition between excitation and inhibition [20, 78]. However, a theoretical account of how the collective behavior emerges from the detailed neuronal properties, local network properties and cortical architecture remains incomplete.

Recently, Young and collaborators have examined the dynamical properties of gamma oscillations in a large-scale neuronal network model of monkey V1 [19]. To further theoretical understanding, in [43], Li, Chariker and Young introduced a relatively tractable stochastic model of interacting neuronal populations designed to capture the essential network features underlying gamma dynamics. Through numerical simulations and analysis of three dynamical regimes ("homogeneous", "regular" and "synchronized"), they identified how conductance properties (essentially, how long after each spike the synaptic interactions are fully felt) can regulate the emergence of gamma frequency synchronization.

Here, we present a sequence of model reductions of the spiking network models based on Li, Chariker and Young [43]. We first present in detail our methods on a small, homogeneous network of 100 neurons (75 excitatory and 25 inhibitory) exhibiting gamma frequency oscillations. Inspired by [43, 44], to achieve dimensional reduction, we assume that the spiking activities during gamma oscillations and their temporal organization are mainly governed by one simple variable, namely, which subthreshold neurons are only a few postsynaptic spikes from firing. Thus, in terms of dynamical dimensions, the number of network states is drastically reduced (although still too large for any meaningful analytical work, since 2^n is astronomical even for n = 100). Therefore, we further coarse-grain by keeping count of the numbers of neurons that are only a few postsynaptic spikes from threshold. The number of effective states is then reduced to the order of millions. By restricting the dynamics onto this dimensionally-reduced state space, it is now possible to make use of the classical tools of stochastic models to analyze the emergence and statistical properties of gamma frequency synchronization. The reduced models not only successfully capture the key features of gamma oscillations but, strikingly, they also reveal a simple, low-dimensional manifold structure of the emergent dynamics.

The outline of this paper is as follows. In Section 3, we present our model reductions, i.e., the sequence of models going from the full model, to the two-state, reduced network model, and to its coarse-grained simplification. In particular, we show the low-dimensional manifold representing the gamma dynamics. We discuss the connection of our work to previous studies and how it may benefit future research of gamma dynamics in Section 4. Computational methods and technical details are provided in the Methods section.
Fig. 1 Structures and important features of the three models. A. The structure of the full Markovian integrate-and-fire network. B. The structure of the reduced network model, where the membrane potentials only take values in {Base, Gate}. C. The structure of the coarse-grained model, in which the pending E-kick pools of all neurons are merged into one for the whole network, and so are the pending I-kick pools.

In spiking neuronal networks, gamma frequency oscillations appear as temporally repeating, stochastic spiking clusters in the firing patterns. Different mechanisms have been proposed to explain this phenomenon. As a minimal mechanism, interneuron network gamma (ING) proposes that gamma oscillations can be produced by the fast interactions between inhibitory (I) neurons alone [78]. When inhibitory neurons have intrinsic firing rates higher than the gamma band, they may exhibit gamma firing frequency when mutually inhibited. ING does not require the existence of excitatory (E) neurons, but studies showed that it loses its coherence in systems where neurons are heterogeneously driven [76]. Another class of theories views the repeating collective spiking clusters as the outcome of competition between E neurons and I neurons, such as the pyramidal-interneuron network gamma (PING, [9, 78]) and recurrent excitation-inhibition (REI, [19]).

Though sharing many common qualitative features, PING and REI provide substantially different explanations for the formation of gamma oscillations. Most remarkably, the collective spiking clusters in PING are usually whole-population spikes arising from steady external input. The E-population spikes induce I-population spiking activities that are offset in time and strong enough to suppress the entire network. A new E-population spiking event occurs after the inhibition wears off, leading to nearly periodic, whole-population oscillatory activity.

On the other hand, the REI mechanism depicts a highly stochastic network dynamics: driven by noisy stimulus, a few E-neurons cross the threshold, and the subsequent recurrent excitation recruits more spikes from other excitatory neurons, leading to rapidly rising spike clusters. But this cannot go on forever, since a few of the inhibitory neurons are excited at the same time, pushing the whole population to a less excitable condition. The next collective spiking event then emerges from the victory of excitation in its competition with inhibition. Therefore, the important features of the transient spike clusters are highly temporally variable, and the gamma frequency band of the oscillation is mainly a statistical feature of the emergent complex temporal firing patterns.

Our primary goal is to provide a better understanding of the emergence of gamma oscillations from the interaction between neurons in this nonlinear, stochastic, and high-dimensional context. In general, the biggest difficulty in analyzing spiking network dynamics is its dimensionality. Consider a network of N neurons: the number of possible states grows exponentially as N → ∞, no matter whether we use a single-neuron model as complex as Hodgkin-Huxley [37] or as simple as binary-state [24]. Previously, many attempts have been made for model reduction by capturing collective network dynamics in specific dynamical regimes.
Successful examples include kinetic theories [17] for mean-field firing patterns, statistical field theories [14] for higher-order correlations, and the Wilson-Cowan neural field model [79, 80] for spatio-temporal patterns, to name a few. In this study, our reduced models are developed with similar motivations to capture gamma oscillations. Generally, models incorporating many biologically realistic details can be very complicated; reduced models are much easier to analyze, but the neglected information can lead to biases in many aspects. Here, we aim for a balance between realism and abstraction. Specifically, we present a sequence of reduced models, between which the connections are well defined (and can be mathematically analyzed in future studies). Importantly, we find that even the coarsest model preserves important features and statistics of the gamma oscillations found in the full model.

Our reduced models are based on the integrate-and-fire (IF) model, which is widely employed in previous models of spiking networks (see Methods). In a recent study, gamma oscillations have been found emerging from the simulation of a large-scale IF network model of layer 4Cα of V1 [19]. Later theoretical studies suggest that gamma oscillations in different spatial regions decorrelate quickly on the scale of a couple of hypercolumns, echoing experimental observations that gamma oscillations are very local in cortex [33, 42, 52]. Therefore, we first focus on a small homogeneous network with 75 excitatory (E) neurons and 25 inhibitory (I) neurons as an analogy of an orientation-preference domain of a hypercolumn in V1.

3.1 Reduced Models Capture Statistics of Gamma Oscillations

We start with the Markovian integrate-and-fire model (MIF) first proposed in [43], hereafter referred to as the "full model." As an analogy of the conventional IF model (see Methods), the MIF brings us two additional conveniences: First, the discretized states of Markovian dynamics make theoretical analysis easier, as the probability flow from one state to another is now straightforward;
Second, the Markov properties of the MIF enable the computation of the invariant measure of gamma oscillations directly from the probability transition matrix.

Our MIF model network is composed of 100 interacting neurons (75 E-neurons and 25 I-neurons) driven by external Poissonian stimulus (Fig. 1A). The state of neuron i is described by three variables (v_i, H^E_i, H^I_i), where v_i represents the individual membrane potential and H^{E,I}_i are analogies of the E and I conductances (see below). We use the size of external kicks to discretize the membrane potential, and let v_i range from V_I = -66, the reversal potential of inhibitory synapses, to V_th = 100, the spiking threshold. Immediately after reaching V_th, v_i enters the refractory state R and, at the same time, sends a spike to its postsynaptic targets. After an exponentially distributed waiting time τ_R, v_i resets to the rest potential V_r = 0. The 'integrate' part of the MIF is separated into two components, external and recurrent. Each external kick increases v_i by 1. To model the effects of recurrent network spikes, we use H^{E,I}_i to denote the numbers of postsynaptic spikes (forming 'pools' of pending kicks) received by neuron i that have not yet taken effect. When neuron i receives an {E, I}-spike, the corresponding pending-kick pool H^{E,I}_i increases by 1. Each postsynaptic spike affects v_i independently: after an exponentially distributed waiting time (i.e., the synaptic time-scale) τ_{E,I}, it increases or decreases v_i, depending on the {E, I} type of the presynaptic neuron. The specific increment/decrement depends on the synaptic strength and the state of v_i. The connections between neurons are homogeneous: whether a spike released by neuron i is received by neuron j is determined by an independent coin flip, whose probability depends only on the types of neurons i and j. We leave the details and choices of parameters to Methods.

This 100-neuron MIF network exhibits gamma-band oscillations, as demonstrated in [43] (Fig. 2). By varying the synaptic time scales τ_{E,I}, we examine three regimes with different degrees of synchrony: homogeneous ("Hom"), regular ("Reg"), and synchronized ("Syn"). (Specifically, we fix the expected waiting time for I-kicks, τ_I, and manipulate separately the expected waiting times for E-spikes on E and I neurons, τ_EE and τ_IE; for full details, see Methods.) In the raster plot of the "Hom" regime, the MIF network produces a firing pattern in which spikes do not exhibit strong temporal correlations (Fig. 2A, top). (This is also verified by the spike-time correlations conditioned on each {E, I}-spike; see Fig. 2A, bottom.) Meanwhile, no strong spectral density peak is seen in the spectrogram (Fig. 2A, middle). In the "Reg" regime, however, spikes begin to cluster in time as multiple firing events (hereafter, MFEs) and exhibit stronger spike-time correlations [57, 58]. Namely, an MFE is a temporally transient phenomenon lying between homogeneity and total synchrony, in which a part (but not all) of the neuronal population fires during a relatively short time window, as widely reported in previous experimental and modeling studies [7, 21, 55, 64, 85, 86]. Affected by MFEs, the mass of the spectral density is primarily located in the gamma band, especially around 40-60 Hz (Fig. 2B). Many more and stronger synchronized firing patterns, higher spike-time correlations, and stronger gamma-band spectral peaks are observed in the "Syn" regime. These dynamics and statistics are consistent with the results of [43] and with IF neuronal network simulations [87-89]. Furthermore, the gamma-band spectrograms are comparable with experimental studies [84].

Fig. 2 Gamma oscillations exhibited by the MIF network in three different regimes. A. Homogeneous regime ("Hom"; firing rates E = 30.86, I = 69.69 spikes/sec). Top: raster plots of all 75 E-neurons (blue) and 25 I-neurons (red), with firing rates noted in the title. Middle: the spectrogram of the firing pattern, exhibiting low density in the gamma band (30-80 Hz). Bottom: the spiking correlation diagrams for {E, I}-spikes. B and C. Same as A for the regular regime ("Reg"; E = 34.73, I = 74.91) and the synchronized regime ("Syn"; E = 39.87, I = 83.24), which exhibit much more synchronous firing patterns and stronger gamma-band activity.

We note that although our MIF network only consists of N = 100 neurons, the number of states in the Markov chain is (168 · n_{H_E} · n_{H_I})^N, where n_{H_E} and n_{H_I} are the largest possible sizes of the pending-spike pools (see Methods for the precise definition). Therefore, it is computationally unrealistic to do any meaningful analytical work to understand the dynamics, especially how gamma oscillations can emerge from the probability flows between different states. We therefore regard this MIF network as the "full model," and apply dimensional reduction methods for further analysis.
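To make the pending-kick bookkeeping concrete, here is a minimal Python sketch of how a single pending kick takes effect on one MIF neuron, following the description above. This is a sketch, not the authors' code: the rounding of the I-kick decrement back onto the integer potential grid is our assumption, and the event-selection loop deciding which kick fires next is omitted (see "Exact timing of events" in Methods).

```python
V_I, V_R, V_TH = -66, 0, 100   # inhibitory reversal, reset, and threshold potentials
S_EE = 20                      # E-to-E synaptic strength (Methods)

def apply_E_kick(v, h_E):
    """One pending E-kick takes effect on an E-neuron: v jumps by S_EE;
    refractory neurons ('R') ignore it. Returns (v, h_E, spiked)."""
    h_E -= 1                             # the kick leaves the pending pool
    if v == 'R':
        return v, h_E, False
    v += S_EE
    if v >= V_TH:                        # threshold crossed: spike, enter R
        return 'R', h_E, True
    return v, h_E, False

def apply_I_kick(v, h_I, S_QI=20):
    """One pending I-kick: decrement scaled by (v - V_I)/(V_TH - V_I),
    rounded back to the discrete grid (rounding is our assumption)."""
    h_I -= 1
    if v == 'R':
        return v, h_I
    v = round(v - (v - V_I) / (V_TH - V_I) * S_QI)
    return v, h_I
```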
First, we introduce a reduced network (RN) consisting of two-state neurons. That is, the RN model has the same setup as the MIF network, except that the membrane potentials only take two states: base and gate (Fig. 1B). From the perspective of the full model, a neuron is deemed a base or a gate neuron according to how far it is from firing. Specifically, we set a cutoff V_c below the threshold V_th, and neuron i is regarded as a gate neuron if v_i > V_c; otherwise it is a base neuron (including v_i ≤ V_c and v_i = R). Neuron i flips between the base and gate states when

1. v_i crosses the cutoff V_c, or
2. neuron i fires and enters the refractory state R.

However, the reduction to two-state neurons immediately raises a question: without the consecutive discrete states in [V_I, V_th], how can we represent the effect of (external, E/I-) kicks on individual neurons when v_i only takes two possible states?

Consider an E-neuron i in the gate state, i.e., v_i > V_c in the corresponding MIF network. Since we do not know the exact value a priori, v_i can take any value in (V_c, V_th] with a probability determined by the empirical distribution of the whole E-population. When an E-kick takes effect and increases v_i by a synaptic strength of S_EE, neuron i fires and changes state to "base" if and only if v_i is located in the excitable region (V_th - S_EE, V_th]; otherwise it stays in the gate state (and v_i ∈ (V_c, V_th - S_EE]). Therefore, a single E-kick has a probability

P_GEE = P( v_i ∈ (V_th - S_EE, V_th] | v_i ∈ (V_c, V_th] ) = P( v_i ∈ (V_th - S_EE, V_th] ) / P( v_i ∈ (V_c, V_th] )   (1)

of exciting a gate E-neuron, leading to its spike and its transition to the base state. That is, P_GEE is the probability of neuron i lying in the excitable region, conditioned on neuron i being a gate neuron. A priori, we do not have the full distribution of neuronal states; therefore, in order to close the RN model, we use statistical learning methods, inferring P_GEE from a long-term simulation of the full MIF network. Likewise, we acquire all other transition probabilities induced by kicks (see Methods).

This reduction, classifying spiking neurons into two states, is based on one simple assumption: the emergence of MFEs in the collective dynamics is mostly sensitive to one variable, namely the number of subthreshold neurons that are only a few spikes from firing (gate neurons). On the other hand, the distribution of neurons with lower membrane potentials (base neurons) is less immediately relevant, since they are much less likely to generate spikes in the next few milliseconds. Therefore, the initiation and maintenance of gamma oscillations are dominated by the probability flow between gate and base states.

Even with such a drastic simplification, the RN model provides a remarkably good approximation of the dynamics produced by the MIF network (Fig. 3), as verified by raster plots, firing rates, and spectral densities similar to those of the full model in Fig. 2. On the other hand, we notice that the spike-time correlations in the "Syn" regime are slightly lower than the corresponding values in the full model. This is not very surprising: when the spiking pattern is synchronized, the dynamics is more sensitive to the details of the probability distribution of neurons in the excitable region, since one spike may trigger further spikes from other neurons. One may, of course, consider using more information to describe the full distribution of the membrane potentials (perhaps by using three or more states instead of two). Though a more detailed model may provide better numerical approximations, our primary goal of capturing the key features of gamma oscillations is well served by the current version of the RN model with two-state neurons.

Fig. 3 Gamma oscillations captured by the reduced network consisting of two-state neurons. A-C. Same regimes and statistics as investigated in Fig. 2 (firing rates: Hom, E = 35.57, I = 72.79; Reg, E = 40.56, I = 76.35; Syn, E = 45.47, I = 82.06).
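To illustrate how such transition probabilities can be inferred in practice, the sketch below estimates P_GEE of Eq. (1) as an empirical conditional probability over membrane potentials pooled from a long simulation of the full model. The cutoff V_C used here is an arbitrary placeholder (the excerpt does not fix its value), and `record_potentials` is a hypothetical helper standing in for such a simulation.

```python
import numpy as np

V_TH, S_EE = 100, 20
V_C = 50   # gate cutoff; placeholder value, assumed for illustration

def estimate_P_GEE(v_samples):
    """Empirical Eq. (1): among gate E-neurons (v > V_C), the fraction
    lying in the excitable region (V_TH - S_EE, V_TH]."""
    v = np.asarray(v_samples, dtype=float)
    gate = v > V_C
    excitable = v > V_TH - S_EE
    if gate.sum() == 0:
        return 0.0
    return float((gate & excitable).sum() / gate.sum())

# v_samples would be pooled from a long MIF simulation, e.g.
# v_samples = record_potentials(T=50_000)   # hypothetical helper
```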
Though radically simplified, our RN model is still a Markov chain, with (2 · n_{H_E} · n_{H_I})^N states. However, the setup and success of the RN model provide important insights for further model reduction. When we infer the transition probabilities between gate and base states, we are uncertain of the full distribution of v_i in the network (beyond the numbers of neurons in the base and gate states). The success of the RN model therefore suggests that we may treat the transition probabilities as functions of the numbers of {E, I} neurons in each state: N_GE, N_GI, N_BE, N_BI. Thus, the core idea of further simplification by coarse-grained (CG) approximation is this: instead of tracking the state of each individual neuron, we study the state of population statistics. Below we summarize the setup of our CG model and leave the details to Methods.

Let us first consider the pending-kick pools, taking the I-to-E and I-to-I projections as an example. Because of the homogeneity of the network, each postsynaptic neuron receives an I-spike independently, with a probability that depends only on its neuronal type {E, I}. In addition, each spike takes effect independently, with the same waiting-time distribution (exponential with mean τ_I). Therefore, this is equivalent to a collective I-kick pool of size H_I, representing the sum of all I-kick pools in the entire network. In this pool, each pending kick takes effect independently and is randomly assigned to a specific neuron with a probability depending only on its E/I type. With similar considerations, the E-kick pools are also merged into one, of size H_E. However, since (τ_EE, τ_IE) are separately manipulated in the full model, we have to ignore the subtle differences between the consumption rates of pending E-kicks on E and I neurons. Specifically, we assume that a constant portion of H_E is distributed to E (respectively I) neurons (i.e., H_EE = a_EE · H_E; see Methods), which introduces a bias in our CG model.

Since each neuron is driven by the same coarse-grained {E, I}-kick pools, the interchangeable I neurons are now differentiated only by their current state (base or gate). The state of the whole I population is thus represented by the numbers (N_BI, N_GI); since the total number of I neurons is fixed, we only need N_GI. These considerations also apply to the E neurons. Thus, our CG model becomes a Markov chain with four variables:

(N_GE, N_GI, H_E, H_I).   (2)

Kicks taking effect may change N_GE and N_GI, and spikes released by neurons increase H_E and H_I. The CG model contains N_E · N_I · N_{H_E} · N_{H_I} states (N_{H_E} = n_{H_E} · N, N_{H_I} = n_{H_I} · N). For each state, there are at most 12 possible transitions to other states (see Methods). Therefore, the CG model becomes an O(N) problem, allowing us to analyze the dynamics in detail.

Fig. 4 Gamma oscillations captured by the coarse-grained model. A-C. Same regimes and statistics as investigated in Figs. 2 and 3 (firing rates: Hom, E = 29.38, I = 68.28; Reg, E = 32.47, I = 69.13; Syn, E = 41.61, I = 79.48). The fake raster plots are produced by randomly assigning spikes to each neuron.

The CG model is a natural simplification of the RN model, and it is reasonable to expect CG to capture the main features of gamma oscillations. Indeed, we find that in all three regimes considered, the behaviors of the CG model are in good agreement with both the MIF and RN models (Fig. 4). Note that, since we do not track the dynamics of individual neurons in CG, the (fake) raster plots are generated by randomly assigning spikes to neurons.
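Because the CG chain (2) allows only a handful of transitions per state (Tables 1 and 2 in Methods), it can be simulated with a standard Gillespie-type step. The following is a sketch, not the authors' implementation: `transition_rates(state)` is a hypothetical function returning the (rate, state-increment) pairs of Table 2, and the exponential-race logic follows "Exact timing of events" in Methods.

```python
import random

def gillespie_step(state, transition_rates, t):
    """Advance the CG chain (N_GE, N_GI, H_E, H_I) by one event.
    `transition_rates(state)` -> list of (rate, delta) pairs, where
    delta is the increment applied to the 4-tuple (Table 2, Methods)."""
    events = transition_rates(state)
    total = sum(rate for rate, _ in events)
    if total == 0.0:                      # nothing can happen from this state
        return state, float('inf')
    t += random.expovariate(total)        # waiting time of the exponential race
    r, acc = random.uniform(0.0, total), 0.0
    for rate, delta in events:
        acc += rate
        if r <= acc:                      # event chosen with probability rate/total
            state = tuple(s + d for s, d in zip(state, delta))
            break
    return state, t
```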
3.2 Gamma Dynamical Features in Reduced Models

The reduced models are not designed to reproduce every detail of the full model. But how well can our model reduction capture the dynamics and key statistical features? In addition to the firing rates, spectral densities, and spike-time correlations for the three selected regimes, we examine the RN and CG models as we change parameters continuously. In this section, we test two dynamical features observed in the full MIF model. For each parameter set involved, we numerically simulate each of the three models (MIF, RN, and CG) for 10 seconds, and divide the dynamics into 10 batches. We then show the batch means and standard errors of the statistics collected from the batches (Fig. 5).

First, when the frequency of the external Poisson stimulus (λ) increases, MFEs appear more frequently in the firing patterns. According to [19], a new MFE is initiated by excitatory stimulation, or by chance, when the inhibition from the last MFE fades. Therefore, stronger external stimuli result in faster initiation of a new MFE. From the firing patterns of our three models, we use a spike-cluster detection algorithm [20] to recognize individual MFEs (see Methods) and examine how their emergence is regulated by the external stimulus. We find that the RN and CG models capture the same trend exhibited in the full model (Fig. 5A), namely, the average waiting time between MFEs (1/MFE frequency) is linearly related to the waiting time between external kicks (λ^{-1}). However, while the trend is captured semi-quantitatively, the reduced models exhibit lower MFE frequencies (i.e., higher MFE waiting times; Fig. 5A, red and yellow). On the other hand, we find that the MFEs produced by the RN and CG models have longer durations than those of the MIF model (see Supplementary Materials). These results suggest that, on average, the reduced models have slower probability flows between states.

Second, when the ratio τ_EE/τ_I becomes smaller, the E-kicks take effect on E neurons relatively faster, and thus the recurrent network excitation recruits other E-spikes on a shorter time scale. Therefore, the whole network exhibits more synchronized firing patterns. This phenomenon has been observed in many previous computational models (see, for instance, [39]), and is also verified by the comparison between the three regimes in this paper. Here we go beyond these three regimes and change τ_EE continuously while fixing τ_I (Fig. 5B). In the full model, the degree of synchrony (measured by the spike synchrony index, see Methods) exhibits a clear decreasing trend as τ_EE goes up, which is well echoed by the RN model. For the CG model, however, although the same trend is captured, the degree of synchrony is generally higher than in the full model. This bias is introduced when we merge all E-kick pools into one and assume that H_EE/H_IE is a constant: the underestimation of H_IE delays the spikes of I neurons, and the MFEs are artificially prolonged, leading to higher synchrony. To verify this point, we carry out a CG model reduction with five variables (N_GE, N_GI, H_EE, H_IE, H_I), keeping the pending E-kick pools of the E and I populations separate. In this five-variable CG model, the degree of synchrony is indeed much closer to the full model (see Supplementary Materials).

Fig. 5 Two gamma features captured by the reduced models. A. The MFE waiting time (1/frequency, s/volley) is linearly related to the external stimulation waiting time (1/λ, ms/spike). Left: Syn regime; Right: Reg regime. B. The degree of synchrony (spike synchrony index, SSI) decreases as τ_EE increases. Results are shown for the full model, the reduced network model, and the coarse-grained model.

3.3 Gamma Dynamics on a Low-Dimensional Manifold

We collect the trajectories of (N_GE, N_GI, H_E, H_I) from the full model dynamics (Fig. 6A; see also Fig. 7A). Since N_GE and N_GI are positively correlated (Fig. 6A, middle; see also Fig. 7A, bottom panel), we examine the three-dimensional subspace (N_GE, H_E, H_I). Strikingly, the full model trajectories suggest a low-dimensional dynamical structure of gamma oscillations. This observation is also verified by the mass estimation from trajectories collected in a long (50 s) simulation of the full model (Fig. 6B). On the other hand, we also simulate the CG model for 10 seconds and find similar trajectories in state space (Fig. 6C), accounting for its successful reproduction of the emergent gamma dynamics. Since the CG model only consists of O(N) states, its invariant probability distribution becomes computable from the Markov transition probability matrix. Here we present the mass of the distribution after we further "shrink" the CG model and reduce the number of states to millions (Fig. 6D; see Methods for the "shrunk" CG model). The invariant probability distribution not only displays a mass of density similar to the full model, but the trajectories of the full model also remain near it (see Supplementary Materials).
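Computationally, the invariant distribution mentioned above is the leading left eigenvector of the (sparse) transition matrix. A minimal SciPy sketch, assuming a row-stochastic matrix `P` has already been assembled:

```python
import numpy as np
import scipy.sparse.linalg as spla

def stationary_distribution(P):
    """Invariant distribution of a row-stochastic sparse matrix P:
    the left eigenvector of eigenvalue 1, i.e. pi @ P = pi."""
    vals, vecs = spla.eigs(P.T, k=1, which='LM')   # leading eigenpair of P^T
    pi = np.abs(np.real(vecs[:, 0]))
    return pi / pi.sum()
```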
The trajectories and the probability distributions of both the full MIF model and the corresponding CG model reveal a two-dimensional manifold structure with a certain thickness in the orthogonal direction (Fig. 6A-D). To verify this point, we carry out a local dimensionality estimation of the full model data (see Methods). Notably, the manifold exhibits a nearly two-dimensional local structure at most places (Fig. 6E, cyan and green parts), except for a region with low H_E, H_I and low-to-medium N_GE (Fig. 6E, red part). This region corresponds to the inter-MFE periods (where 0 < N_GE < 15) and the initiation of MFEs (where 15 < N_GE), when E-neurons are recurrently recruited by the E-spikes at the very beginning. Both processes are highly stochastic, resulting in higher dimensionalities. We note that this local two-dimensional structure is also verified by the results of locally linear embedding (LLE). When we apply LLE to the 10-second full model trajectories (from four dimensions to three), they display a two-dimensional manifold in the three-dimensional LLE space as well (Fig. 6F).

The trajectories of the MIF model give us a clear view of the temporal organization of the emergent gamma dynamics. First of all, we observe that (N_GE, N_GI) are strongly positively correlated (Fig. 7A, bottom panel), although N_GI rises slightly faster than N_GE at the initiation of each MFE. In other words, the probability flows of the membrane potential distributions of the E and I populations are mostly temporally synchronized, as also observed in previous computational models [19, 57, 89]. In the three-dimensional subspace (N_GE, H_E, H_I), the low-dimensional manifolds help us interpret the gamma dynamics as follows:

1. At the beginning of each MFE, due to the external stimulus and the faded recurrent inhibition, the membrane potentials move up, represented by increasing (N_GE, N_GI) (Fig. 6A-D, middle). Note that N_GI moves up slightly faster due to the lower I-to-I connectivity compared to I-to-E.
2. Some E neurons fire, possibly eliciting more spikes from other E neurons, so H_E increases quickly during this phase.
3. I neurons are excited by H_E at the same time, and H_I increases as well. Note that H_E is always consumed faster due to the smaller τ_E. In this phase, H_E attains its peak while H_I still increases rapidly.
4. H_E is then consumed, and the system becomes dominated by the inhibition brought by H_I, leading to the end of the MFE. High H_I brings down (N_GE, N_GI), terminating the MFE and leading to the inter-MFE period. The next MFE is unlikely to start until H_I is mostly consumed.

Discussion

The vast range of observed neuronal network dynamics presents a tremendous challenge to systems neurophysiologists, data analysts, and computational neuroscientists. Ever more detailed neurophysiological datasets and large-scale neuronal network simulations reveal dynamical interactions on multiple spatial and temporal scales that also participate in essential brain functions. The observed dynamics exhibits rapid, stochastic fluctuations incorporating strong, transient correlations, quite possibly the source of its complexity. Unfortunately, the emergent fluctuating activity cannot be described effectively by standard ensemble averages, as many population methods cannot capture the low-order statistics associated with the observed dynamical regimes.

The diversity and hierarchy of experimentally observed dynamics in the brain pose an immediate question: what concise, unified mathematical framework can reproduce the co-existence of transient, heterogeneous dynamical states emerging from a high-dimensional, strongly recurrent neuronal network? Here we focus on gamma frequency oscillations because of their role underlying the transient fluctuations within the stationary states of complex neuronal networks (see, for instance, [16, 65]), and the belief that they are significant contributors to neural information processing [29, 75].

Important theoretical progress has been made by coarse-graining sparsely coupled networks (see, for instance, [13, 17], and references therein).
These approaches were able to capture spatio-temporal network dynamics dominated by mean-field and uncorrelated synaptic fluctuations. Also influential were studies focusing on weakly coupled oscillators. Recent work has made significant theoretical advances by mapping populations of quadratic IF neurons and the so-called theta neurons to systems of Kuramoto oscillators [41, 53], where the tools of phase oscillator theory [1, 8] can be used to give insight into network synchronization. We view our approach here as complementary to these studies.

Here we demonstrate that, by starting with a Markovian IF model, we can bring the tools of classical Markov processes to bear on stochastic gamma oscillations. Using a model first studied by [43], we show that we can dimensionally reduce and coarse-grain the model network, first to a two-state reduced network model and then to a model of transition probabilities between the numbers of neurons in the two-state, reduced model. The full repertoire of network dynamics between homogeneity and synchrony is faithfully reproduced by our reduced models. Furthermore, by preserving the Markovian dynamics at each step, and combining with data-driven approaches, we were able to reveal locally two-dimensional invariant manifolds underlying the type of gamma oscillations observed in large-scale numerical simulations.

A tremendous amount of experimental and theoretical literature implicates oscillatory, coherent neural activity as a crucial element of cognition. Here we provided a simple framework through which the collective behavior of populations of neurons can be coarse-grained to counting statistics of neurons in specific neuronal states. This dynamical perspective not only affords a concise handle for systems neuroscientists, with experimental access to neuronal circuits and populations, but also for theoreticians, who may wish to build computational and information processing frameworks on top of these coarse-grained, low-dimensional representations.

While we have detailed our methodology for a system with a small number of neurons and homogeneous connectivities, preliminary studies show that we can scale up to much larger networks (see Supplementary Information). We can extend our framework to networks with slowly varying spatial inhomogeneities (e.g., V1 orientation hypercolumns coupled by long-range excitatory connections) and to capturing interneuronal correlations in predominantly feedforward networks, like synfire chains [26, 77, 83], with each local, nearly homogeneous population described by CG modules (of 2-3 states plus the relevant pending-spike pools). At the same time, we are examining data-driven and machine learning approaches to estimate the various states and transition probabilities directly from numerical simulations, with the hope of applying them to neurophysiological data sets. Finally, we are already looking to incorporate higher-order structural motifs in the network connectivity. Higher-order
motifs [69] are likely to substantially influence the types of dynamical correlations within complex networks and, consequently, activate a multitude of spatio-temporal spiking patterns that may have important consequences for information processing and coding in the mammalian brain [38, 90].

Fig. 6 Gamma dynamics restricted to a low-dimensional manifold. A. The trajectories of the full model (Syn regime) presented in the state space (N_GE, N_GI, H_E, H_I). Left: the projection onto the subspace (N_GE, H_E, H_I); Middle: the projection onto the subspace (N_GE, N_GI); Right: the projection onto the subspace (N_GE, H_I). B-D depict the same subspaces as A. B. Mass of the trajectory density of the full model. C. The trajectories of the CG model (Syn regime). D. The stationary probability distribution of the CG model. E. Local dimensionality of the data for the full model trajectories, displayed in the same view as A. F. The full model trajectories locally linearly embedded in a three-dimensional space.

Methods
Integrate-and-Fire Network

Consider an N-neuron Integrate-and-Fire (IF) neuronal network with N_E excitatory (E) neurons and N_I inhibitory (I) neurons, where the membrane potential v_i of each neuron is driven by a sum of synaptic currents:

dv_i/dt = (g_i^ext + g_i^E) · (V_E - v_i) + g_i^I · (V_I - v_i),
g_i^ext = S_i^ext Σ_{μ_i^ext} G_E(t - t_{μ_i^ext}),
g_i^E = Σ_{j∈E, j≠i} S_ij^E Σ_{μ_j^E} G_E(t - t_{μ_j^E}),
g_i^I = Σ_{j∈I, j≠i} S_ij^I Σ_{μ_j^I} G_I(t - t_{μ_j^I}),   (3)

where g_i^{ext,E,I} are the external, excitatory and inhibitory conductances of neuron i. Each neuron receives excitatory spiking stimulus from an external source (μ_i^ext) and from other excitatory/inhibitory neurons in the network (μ_j^{E,I}), where the strengths of the synaptic couplings are represented by S_i^ext and S_ij^{E,I}, respectively. A spike is released by neuron i when its membrane potential v_i reaches the threshold V_th. After this, neuron i immediately enters the refractory period, and remains there for a fixed time τ_R before resetting to the rest potential V_r. It is conventional in many previous studies to choose V_th = 1 and V_r = 0 [17]. Accordingly, V_E = 14/3 and V_I = -2/3 are the excitatory and inhibitory reversal potentials. Each spike changes the postsynaptic conductance through a Green's function,

G_E(t) = (1/τ_E) e^{-t/τ_E} h(t),    G_I(t) = (1/τ_I) e^{-t/τ_I} h(t),   (4)

where h(t) is the Heaviside function. The time constants τ_{E,I} model the time scales of the conductances of the excitatory and inhibitory synapses (such as AMPA and GABA [32]).

While Eq. (3) can model a network with arbitrary connectivity structure, in this paper we focus on homogeneous networks. That is to say, whether a certain spike released by a neuron of type Q is received by another neuron of type Q' is determined by an independent coin flip with probability P_{Q'Q}, where Q, Q' ∈ {E, I}. Furthermore, S_ij^{E,I} are also considered constants independent of i, j. Three different levels of models are illustrated below. The Markovian integrate-and-fire network approximates Eq. (3) with a Markov process, and the subsequent reduced network and coarse-grained models are reductions of the full Markovian model.
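The following sketch shows one way to integrate Eqs. (3)-(4) numerically, under the common simplification that each conductance obeys the exponential-decay ODE whose impulse response is the Green's function G(t). It is a sketch under stated assumptions: the time constants are illustrative, and spike delivery and refractory handling are omitted.

```python
def euler_step(v, g_ext, g_E, g_I, dt,
               V_E=14/3, V_I=-2/3, tau_E=2.0, tau_I=4.5):
    """One forward-Euler step of Eq. (3); conductances decay per Eq. (4).
    tau_E, tau_I (ms) are illustrative, not the paper's exact values."""
    dv = (g_ext + g_E) * (V_E - v) + g_I * (V_I - v)
    v = v + dt * dv
    # between spikes each conductance decays with its synaptic time constant;
    # an arriving spike of strength S would add S/tau (the jump in Eq. (4))
    g_ext = g_ext - dt * g_ext / tau_E
    g_E = g_E - dt * g_E / tau_E
    g_I = g_I - dt * g_I / tau_I
    return v, g_ext, g_E, g_I
```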
Full Model: A Markovian Integrate-and-Fire Network

Following a previous study [43], we rewrite Eq. (3) as a Markov process to facilitate theoretical analysis. To do so, we need to minimize the effects of the memory terms and discretize the membrane potentials and conductances. Specifically, v_i takes values in

Γ := { V_I, V_I + 1, ..., V_r - 1, V_r, V_r + 1, ..., V_th } ∪ {R}.   (5)

To be consistent with the IF network (3), we choose V_I = -66, V_r = 0, and V_th = 100. As before, v_i enters the refractory state R immediately after reaching V_th. However, in this Markovian IF (MIF) network, the total time spent in R is no longer fixed, but is an exponentially distributed random variable τ_R.

Each neuron receives external input as an independent Poisson process with rate λ_{E,I}, and v_i goes up by 1 when an external kick arrives. On the other hand, the synaptic conductances g_i^{E,I} of each neuron are replaced by "pending-kick pools" H^{E,I}_i. Consider excitatory spikes as an example: instead of updating g_i^E with Green's functions when neuron i receives an E-spike, we add the new spike to an existing spike pool H^E_i. Each spike in the pool affects v_i independently after an exponentially distributed waiting time τ_E; H^E_i is the number of spikes that have not yet taken effect. Therefore, for a sequence of E-spikes received by neuron i, it is not hard to see that E[H^E_i(t)] = E[g_i^E(t)].

How each spike changes v_i is determined by the type of the spike (Q'), the type of neuron i (Q), and the state of v_i. When a spike takes effect, the membrane potential stays unchanged if v_i = R; otherwise, v_i may jump up (for an E-spike) or down (for an I-spike). The size of each jump depends on the membrane potential v_i and the synaptic coupling strengths S_{QQ'}. For an E-spike, v_i increases by S_{QE}. For an I-spike, however, the size of the decrement is (v_i - V_I)/(V_th - V_I) · S_{QI}. The difference between the effects of {E, I}-spikes is due to the relative values of the reversal potentials: the currents induced by g_i^I are sensitive to v_i, while the currents induced by g_i^E are much less so, since in Eq. (3)

0 ≤ |v_i - V_I| ≤ 166, while |v_i - V_E| remains much larger throughout.   (6)

In our MIF network, we take most of the system (synaptic and stimulus) parameters directly from [43], but modify a few to accommodate the smaller network studied here (N_E = 75 vs. N_E ∼ 300 in [43]):

– Frequencies of external input: λ_E = λ_I = 7000 Hz;
– Synaptic strengths: S_EE = S_EI = S_II = 20, and S_IE = 8;
– Probabilities of spike projections: P_EE = 0. , P_IE = P_EI = 0.5, and P_II = 0. ;
– Synaptic time-scales: τ_I = 4. , and τ_E < τ_I, to reflect the fact that AMPA is faster than GABA.

We use different choices of τ_E for the regimes exhibiting different dynamics (all indicated in ms):

1. Homogeneous ("Hom"): τ_EE = 4, τ_IE = 1. ;
2. Regular ("Reg"): τ_EE = 1. , τ_IE = 1. ;
3. Synchronized ("Syn"): τ_EE = 1. , τ_IE = 1. .

Reduced Network Model

The reduced network (RN) is a direct reduction of the MIF network, obtained by shrinking the state space of the membrane potentials. In the RN, each neuron i is a two-state neuron flipping between the "base" and "gate" states; i.e., instead of taking values in the state space Γ, now v_i ∈ Γ̃ = {B, G}. A neuron in the MIF network is deemed "base" or "gate" depending on how likely it is to fire in the next couple of milliseconds: given a cutoff V_c ∈ Γ, neuron i is a gate neuron if v_i ≥ V_c, since it is closer to the threshold and only a couple of E-kicks away from spiking. Otherwise, neuron i is a base neuron (v_i < V_c or v_i = R). Therefore, a flip from base to gate can take place when an E-spike or an external stimulus takes effect and v_i crosses V_c from below; on the other hand, a flip from gate to base occurs when v_i crosses V_c from above, either because 1) an I-spike takes effect, or 2) the neuron fires and enters the refractory period.

The network with two-state neurons is reduced from the MIF network by lumping states together, but in general we do not expect the full MIF model to be a lumpable Markov process [71]. Therefore, the transition probabilities between the base and gate states should be carefully estimated, so that the RN correctly captures the dynamics of the MIF network. Since a flip between states can only take place after certain spikes, the possibilities are:

– Effects of external stimuli: When a kick arrives, a base {E, I} neuron becomes a gate {E, I} neuron with probability {P_BEex, P_BIex}, while a gate {E, I} neuron fires and becomes a base {E, I} neuron with probability {P_GEex, P_GIex}.
– Effects of E-kicks: Similar types of transitions occur, but with different probabilities due to the different sizes of the kicks: {P_GEE, P_GIE, P_BIE, P_BEE}.
– Effects of I-kicks: An I-kick has no effect on a base neuron, but it depresses a gate neuron to a base neuron with probability {P_GEI, P_GII}.

All transition probabilities listed above are time-dependent and determined by the distribution of the membrane potentials of the neurons in the network. As a first approximation, we collect their statistics from a long-time simulation of the MIF network. Here we illustrate how to compute P_BEE as an example; everything else follows analogously. Consider the distribution of membrane potentials of E-neurons, p_E(v). Then P_BEE is the conditional probability that a base E-neuron goes across V_c within one E-kick, which is expressed as

P_BEE{p_E} = ( ∫_{V_c - S_EE}^{V_c} p_E(v) dv ) / ( ∫_{v < V_c} p_E(v) dv ).

Table 1 All possible transitions from a state X = (N_GE, N_GI, H_E, H_I) to another. There are 13 cases in all, grouped by the triggering event (an external kick, a pending E-kick, or a pending I-kick taking effect).

Table 2 The transition rates of all transitions in Table 1.

External kick takes effect:
– base E becomes gate E: P_BEex · N_BE λ_E;
– base I becomes gate I: P_BIex · N_BI λ_I;
– gate E fires: P_GEex · N_GE λ_E;
– gate I fires: P_GIex · N_GI λ_I;
– no state change: (1 - P_BEex) N_BE λ_E + (1 - P_BIex) N_BI λ_I + (1 - P_GEex) N_GE λ_E + (1 - P_GIex) N_GI λ_I.

One E-kick takes effect:
– base E becomes gate E: P_BEE · a_EE (N_BE/N_E) H_E/τ_EE;
– base I becomes gate I: P_BIE · a_IE (N_BI/N_I) H_E/τ_IE;
– gate E fires: P_GEE · a_EE (N_GE/N_E) H_E/τ_EE;
– gate I fires: P_GIE · a_IE (N_GI/N_I) H_E/τ_IE;
– no state change: (1 - P_BEE) a_EE (N_BE/N_E) H_E/τ_EE + (1 - P_BIE) a_IE (N_BI/N_I) H_E/τ_IE + (1 - P_GEE) a_EE (N_GE/N_E) H_E/τ_EE + (1 - P_GIE) a_IE (N_GI/N_I) H_E/τ_IE.

One I-kick takes effect:
– gate E becomes base E: P_GEI · a_EI (N_GE/N_E) H_I/τ_I;
– gate I becomes base I: P_GII · a_II (N_GI/N_I) H_I/τ_I;
– no state change: (1 - P_GEI a_EI N_GE/N_E - P_GII a_II N_GI/N_I) · H_I/τ_I.

Statistics

To quantify how well the reduced models (RN and CG) capture the dynamical features of the full model (MIF), we compare several statistics of the network dynamics (or, more precisely, of the spiking pattern produced by the network) collected from the simulations of the MIF, RN, and CG models. The reader should note that we cannot tell the specific neuron indices of firing events in the CG model; this, however, does not affect the computation of the statistics below. The raster plots produced for the CG model are mock-up raster plots, drawn by assigning spikes randomly to neurons within the appropriate {E, I} population.

Fig. 7 A 2 s simulation of the full model in the Syn regime. A. Raster plot (top) and state-space statistics (other three panels): numbers of pending spikes (H_IE, H_EE, H_EI, H_II), ratios of gate neurons (N_GE/N_E, N_GI/N_I), and spike density. B. MFEs recognized by the spiking-volley detection algorithm.

Firing rates. Spikes from {E, I}-cells are collected separately, and the firing rates (fr_E, fr_I) are computed as the average numbers of spikes per neuron per second. All three models are simulated for over 3 seconds, and spikes are collected from the 2nd second onward, to rule out possible influences of the choice of initial conditions.

Spike synchrony index. We borrow the definition of the spike synchrony index (SSI) from [19]. SSI describes the degree of synchrony of the firing events as follows: for each spike occurring at time t, consider a w-ms time window centered on the spike, (t - w/2, t + w/2), and count the fraction of neurons in the whole network that fire in this window. The SSI is this fraction averaged over all spikes. It is not hard to see that SSI is larger for more synchronous spiking patterns. For completely synchronized dynamics, every other neuron fires within the time window of each spike, hence SSI = 1. For completely uncorrelated firing patterns such as Poisson, SSI is a small number close to 0. One should note that the absolute value of SSI depends on the choice of the window size; we choose w = 5 ms (the same as [19]).
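A direct NumPy implementation of this definition is a few lines; the sketch below assumes `spike_times` (in ms) and `spike_ids` are aligned arrays recorded from a simulation.

```python
import numpy as np

def spike_synchrony_index(spike_times, spike_ids, N, w=5.0):
    """SSI per [19]: for each spike at time t (ms), the fraction of the
    N neurons firing within (t - w/2, t + w/2), averaged over spikes."""
    order = np.argsort(spike_times)
    t = np.asarray(spike_times, dtype=float)[order]
    ids = np.asarray(spike_ids)[order]
    fractions = []
    for ti in t:
        mask = (t > ti - w / 2) & (t < ti + w / 2)
        fractions.append(np.unique(ids[mask]).size / N)   # distinct neurons
    return float(np.mean(fractions))
```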
Spectrogram. The power spectral density (PSD) measures the variance of a signal as a function of frequency. In this study, the PSD is computed as follows. A time interval (0, T) is divided into time bins B_n = ((n-1)Δt, nΔt], n = 1, 2, ..., and the spike density μ_n per neuron in B_n is given by μ_n = m_n/(NΔt), where m_n is the total number of spikes fired in bin B_n. The discrete Fourier transform of {μ_n} on (0, T) is then given as

μ̂(k) = (1/√T) Σ_{n=1}^{T/Δt} μ_n Δt e^{-k·(2πi)·(nΔt)}.   (13)

Finally, as a function of k, the PSD is the "power" concentrated at frequency k, i.e., |μ̂(k)|².
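Since Eq. (13) is a discrete Fourier transform of the binned spike density, it can be computed directly with `numpy.fft`. A sketch, assuming spike times in ms and Δt = 1 ms:

```python
import numpy as np

def spike_density_psd(spike_times, N, T, dt=1.0):
    """PSD of the per-neuron spike density, following Eq. (13):
    bin spikes, Fourier-transform the density, return |mu_hat|^2."""
    n_bins = int(T / dt)
    counts, _ = np.histogram(spike_times, bins=n_bins, range=(0.0, T))
    mu = counts / (N * dt)                  # spike density per neuron
    mu_hat = np.fft.rfft(mu * dt) / np.sqrt(T)
    freqs = np.fft.rfftfreq(n_bins, d=dt)   # cycles per ms; *1000 for Hz
    return freqs * 1000.0, np.abs(mu_hat) ** 2
```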
Spike timing correlations. The correlation diagrams describe the averaged correlation between each spike and the others. Consider the correlation with I-spikes conditioned on an E-spike at t = 0: for each E-spike at time t, we take the I-spikes within the time window [t - 15 ms, t + 15 ms] and compute the fraction of I-spikes in each 1-ms time bin. The correlation diagram is then averaged over all E-spikes in the simulation.

Spiking volley detection. The method defining spiking volleys (or MFEs) is borrowed from [20]. The core idea of this method is to find time intervals, subject to some length constraints, in which the firing rate of each time bin is a certain amount higher than the average firing rate; such an interval is then defined as a spiking volley. We choose 1-ms time bins and δ = 0. , ε = 8 as parameters (Fig. 7B).

Computational Methods

Exact timing of events. During our simulations, we can compute the exact timing of all events, including firings, kicks taking effect, etc. We note that the MIF, RN, and CG models are all Markov processes whose randomness is mainly due to the exponential distributions of various waiting times. Consider two independent events A and B with waiting times X_A ∼ exp(λ_A) and X_B ∼ exp(λ_B); then min{X_A, X_B} ∼ exp(λ_A + λ_B), i.e., the waiting time for the first of the two events is also exponentially distributed. Furthermore, the probability that the first occurring event is A is λ_A/(λ_A + λ_B). Similar arguments extend to m events. Since exponentially distributed waiting times are memoryless, we can simulate all three models by repeatedly selecting the first occurring event and generating the actual waiting time by sampling from the appropriate exponential distribution.

Invariant probability distributions. After determining the total number of states and the transition probabilities between them, we can calculate the invariant probability distribution of the CG model by computing the eigenvectors and eigenvalues of the transition probability matrix. Such methods can be found in standard linear algebra textbooks such as [61]. Readers should note that, theoretically, there is no upper bound for {H_E, H_I}. In order to close the computation of the invariant probability distribution, we set n_{H_E} and n_{H_I}, the largest numbers of pending spikes observed in the simulations, as the "boundaries" of the state space. Specifically, the transition probability from state X to Y is 0 if

1. X = (N_GE, N_GI, n_{H_E}, H_I) and Y = (N_GE, N_GI, n_{H_E} + a, H_I), or
2. X = (N_GE, N_GI, H_E, n_{H_I}) and Y = (N_GE, N_GI, H_E, n_{H_I} + b),

where a, b > 0.

A "Shrunk" Coarse-Grained Model. After the reductions, the CG model studied in this paper still has M = 5. × states. Though the first left eigenvector (i.e., the stationary probability distribution, corresponding to eigenvalue 1) of a sparse M-by-M probability transition matrix is computable, the cost can be high for a desktop computer. Therefore, we aim at an even coarser version of the CG model: a "shrunk" coarse-grained (SCG) model. Since H_{E,I} are much larger than N_{GE,GI}, we shrink the CG model by combining every K states of pending kicks into one; e.g., all states with 1 ≤ H_E^cg ≤ K in the CG model are considered as H_E^scg = 1 in the SCG model. The intuition is that there is no need to characterize the states of the pending spikes very precisely when their number is very large (the difference between 3000 and 3001 pending spikes is very small). We choose K = 24 to keep it the same as S_EE.

Every state of the SCG model can be represented as a quadruplet Q_scg = (N_GE, N_GI, H_E^scg, H_I^scg). The SCG model works as follows. First, the quadruplet Q_scg is lifted to another quadruplet Q_cg = (N_GE, N_GI, H_E^cg, H_I^cg), where H_{E,I}^cg = (H_{E,I}^scg - 0.5) · K. Then the quadruplet Q_cg evolves following the same rules as the CG model. Lastly, the change of Q_cg is projected back onto Q_scg:

1. the changes of N_GE and N_GI in Q_cg are kept the same on the corresponding elements of Q_scg;
2. a change x of H_{E,I}^cg is replaced by a change y = ⌊x/K⌋ + b of H_{E,I}^scg, where ⌊·⌋ denotes the integer part (floor) and b is a Bernoulli variable with probability p = x/K - ⌊x/K⌋.

Through this construction, we further reduce the M states of the CG model to M/K states of the SCG model.

Locally-linear embedding. LLE is a nonlinear dimension-reduction method which discovers the low-dimensional structure of high-dimensional data [62]. More precisely, it maps the high-dimensional input data into a low-dimensional space. The core idea of LLE is to maintain the local linear structure through the mapping, which is achieved by a two-step optimization. Given N input data points, denote the high-dimensional inputs by X_i and the low-dimensional outputs by Y_i. The algorithm is described below; for more details, see [62, 63].

1. Find the k nearest neighbors S^k_i of each data point X_i;
2. W = argmin_{W'} Σ_i ‖X_i - Σ_j W'_ij X_j‖², where W'_ij = 0 if X_j ∉ S^k_i;
3. Y = argmin_{Y'} Σ_i ‖Y'_i - Σ_j W_ij Y'_j‖², subject to Σ_i Y'_i = 0 and (1/N) Σ_i Y'_i Y'^T_i = I.

Dimensionality of data. We use the local principal component analysis (local PCA) method to compute the local dimensionality of the data at each data point x, i.e., how its neighbors cluster around x. The data points processed here are subsampled from the original data points with probability proportional to the squared distance from the location of the highest density mass, in order to make the distribution of data points more uniform, which results in a more precise characterization of the local dimensionality. For each x, consider the correlation matrix C_x of x and its K nearest neighbors (K = 100). Then the dimensionality at x is computed as

Dim(x) = Tr(C_x)² / Tr(C_x²) = (Σ_i λ_i)² / (Σ_i λ_i²),   (14)

where λ_i is the i-th eigenvalue of the correlation matrix C_x. Eq. (14) is widely used as a dimensionality definition in theoretical and experimental neuroscience studies [31, 45, 49, 60].
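A compact sketch of the participation-ratio dimensionality of Eq. (14), assuming `data` is an (n_points × 4) array of recorded CG-state samples and using the local covariance spectrum in place of the correlation matrix:

```python
import numpy as np

def local_dimensionality(data, x_index, K=100):
    """Participation-ratio dimensionality of Eq. (14) at one point:
    (sum lambda_i)^2 / sum lambda_i^2 over the local PCA spectrum."""
    data = np.asarray(data, dtype=float)
    x = data[x_index]
    dists = np.linalg.norm(data - x, axis=1)
    neighbors = data[np.argsort(dists)[:K + 1]]   # K nearest neighbors plus x
    C = np.cov(neighbors, rowvar=False)           # local second-moment matrix
    lam = np.linalg.eigvalsh(C)
    return lam.sum() ** 2 / (lam ** 2).sum()      # = Tr(C)^2 / Tr(C^2)
```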
Acknowledgments

This work was partially supported by the Natural Science Foundation of China through grants 31771147 (T.W., L.T.) and 91232715 (L.T.), and by the Open Research Fund of the State Key Laboratory of Cognitive Neuroscience and Learning, grant CNLZD1404 (T.W., L.T.). Z.X. is supported by the Swartz Foundation through the Swartz Postdoctoral Fellowship. We thank Professors Yao Li (University of Massachusetts, Amherst), Kevin Lin (University of Arizona), and Lai-Sang Young (New York University) for their useful comments.

References

1. Ashwin, P., Coombes, S. & Nicks, R. Mathematical frameworks for oscillatory network dynamics in neuroscience. The Journal of Mathematical Neuroscience.
2. Neuron.
3. Proceedings of the National Academy of Sciences.
4. Başar, E. A review of gamma oscillations in healthy subjects and in cognitive impairment. International Journal of Psychophysiology, ISSN 0167-8760 (2013).
5. Bauer, E. P., Paz, R. & Paré, D. Gamma oscillations coordinate amygdalo-rhinal interactions during learning. Journal of Neuroscience.
6. Journal of Neuroscience.
7. Journal of Neuroscience.
8. The Journal of Mathematical Neuroscience.
9. Neural Computation.
10. et al. Gamma (40–100 Hz) oscillation in the hippocampus of the behaving rat. Journal of Neuroscience.
11. Behavioral and Brain Sciences.
12. Journal of Neurophysiology.
13. Neural Computation.
14. Journal of Statistical Mechanics: Theory and Experiment, P03003 (2013).
15. Buschman, T. & Miller, E. Top-down versus bottom-up control of attention in the prefrontal and posterior parietal cortices. Science.
16. Annual Review of Neuroscience.
17. et al. Kinetic theory for neuronal network dynamics. Communications in Mathematical Sciences.
18. et al. Oscillatory phase coupling coordinates anatomically dispersed functional cell assemblies. Proceedings of the National Academy of Sciences.
19. Journal of Neuroscience.
20. Journal of Computational Neuroscience.
21. et al. Stimulus onset quenches neural variability: a widespread cortical phenomenon. Nature Neuroscience.
22. et al. Frequency of gamma oscillations routes flow of information in the hippocampus. Nature.
23. Nature Reviews Neuroscience.
24. Stochastic neurodynamics. In Advances in Neural Information Processing Systems, 62–69 (1991).
25. Csicsvari, J., Jamieson, B., Wise, K. & Buzsáki, G. Mechanisms of gamma oscillations in the hippocampus of the behaving rat. Neuron.
26. Diesmann, M., Gewaltig, M. O. & Aertsen, A. Stable propagation of synchronous spiking in cortical neural networks. Nature.
27. European Journal of Neuroscience.
28. Science.
29. Annual Review of Neuroscience.
30. Fries, P., Womelsdorf, T., Oostenveld, R. & Desimone, R. The effects of visual stimulation and selective visual attention on rhythmic neuronal synchronization in macaque area V4. Journal of Neuroscience.
31. et al. A theory of multineuronal dimensionality, dynamics and measurement. bioRxiv.
32. Neuronal Dynamics: From Single Neurons to Networks and Models of Cognition (Cambridge University Press, 2014).
33. Goddard, C. A., Sridharan, D., Huguenard, J. R. & Knudsen, E. I. Gamma oscillations are generated locally in an attention-related midbrain network. Neuron.
34. Nature.
35. Science.
36. Journal of Neurophysiology.
37. The Journal of Physiology, 500 (1952).
38. Hu, Y., Trousdale, J., Josić, K. & Shea-Brown, E. Motif statistics and spike correlations in neuronal networks. Journal of Statistical Mechanics: Theory and Experiment, P03012 (2013).
39. Keeley, S., Byrne, Á., Fenton, A. & Rinzel, J. Firing rate models for gamma oscillations. Journal of Neurophysiology.
40. et al. Impaired tuning of neural ensembles and the pathophysiology of schizophrenia: a translational and computational neuroscience perspective. Biological Psychiatry.
41. Journal of Mathematical Neuroscience.
42. Brain Research Reviews.
43. Journal of Mathematical Biology.
44. Journal of Mathematical Biology.
45. Neuron.
46. Journal of Neuroscience.
47. Nature.
48. Current Opinion in Neurobiology.
49. Frontiers in Systems Neuroscience, 11 (2016).
50. McNally, J. M. & McCarley, R. W. Gamma band oscillations: a key to understanding schizophrenia symptoms and neural circuit abnormalities. Current Opinion in Psychiatry, 202 (2016).
51. Medendorp, W. P. et al. Oscillatory activity in human parietal and occipital cortex shows hemispheric lateralization and memory effects in a delayed double-step saccade task. Cerebral Cortex.
52. et al. Spatio-temporal correlations in human gamma band electrocorticograms. Electroencephalography and Clinical Neurophysiology.
53. Physical Review X.
54. Nature Neuroscience.
55. Plenz, D. et al. Multi-electrode array recordings of neuronal avalanches in organotypic cultures. JoVE (Journal of Visualized Experiments), e2949 (2011).
56. Popescu, A. T., Popa, D. & Paré, D. Coherent gamma oscillations couple the amygdala and striatum during learning. Nature Neuroscience.
57. Journal of Computational Neuroscience.
58. Journal of Computational Neuroscience.
59. Trends in Cognitive Sciences.
60. PLoS Computational Biology, e1006446 (2019).
61. Roman, S., Axler, S. & Gehring, F. Advanced Linear Algebra (Springer, 2005).
62. Roweis, S. T. & Saul, L. K. Nonlinear dimensionality reduction by locally linear embedding. Science (2000).
64. Shew, W. L., Yang, H., Yu, S., Roy, R. & Plenz, D. Information capacity and transmission are maximized in balanced cortical networks with neuronal avalanches. Journal of Neuroscience.
65. Nature Reviews Neuroscience.
66. Proceedings of the National Academy of Sciences.
67. Nature.
68. Nature.
69. PLoS Biology, e68 (2005).
70. Tallon-Baudry, C. The roles of gamma-band oscillatory synchrony in human visual cognition. Frontiers in Bioscience (Landmark Ed).
71. Stochastic Analysis and Applications.
72. et al. Single-column thalamocortical network model exhibiting gamma oscillations, sleep spindles, and epileptogenic bursts. Journal of Neurophysiology.
73. Frontiers in Integrative Neuroscience.
74. Journal of Neuroscience.
75. Physiological Reviews.
76. …neuronal network model. Journal of Neuroscience.
77. PLoS Computational Biology, e1004979 (2016).
78. Whittington, M. A., Traub, R., Kopell, N., Ermentrout, B. & Buhl, E. Inhibition-based rhythms: experimental and mathematical observations on network dynamics. International Journal of Psychophysiology.
79. Biophysical Journal.
80. Kybernetik.
81. Womelsdorf, T. et al. Modulation of neuronal interactions through neuronal synchronization. Science.
82. et al. Orientation selectivity and noise correlation in awake monkey area V1 are modulated by the gamma cycle. Proceedings of the National Academy of Sciences.
83. Physical Review E.
84. et al. Stochastic generation of gamma-band activity in primary visual cortex of awake and anesthetized monkeys. Journal of Neuroscience.
85. Neuron.
86. et al. Higher-order interactions characterized in cortical activity. Journal of Neuroscience.
87. Journal of Computational Neuroscience.
88. Journal of Computational Neuroscience.
89. Journal of Computational Neuroscience.
90. Frontiers in Computational Neuroscience, 28 (2011).

Supplementary Materials

S1 Low-Dimensional Manifolds

Trajectories of MIF and CG Models.
Although the MIF and CG models exhibit subtle differences in the statistics of their dynamics, we find that their trajectories in the state space (N_GE, N_GI, H_E, H_I) stay close to similar manifolds (Fig. S1).

Fig. S1 MIF and CG models produce trajectories on similar surfaces. Left: projection onto the subspace (N_GE, H_E, H_I); Middle: projection onto the subspace (N_GE, N_GI); Right: projection onto the subspace (N_GE, H_I).

MIF Trajectories in the Syn, Reg, and Hom Regimes. The low-dimensional manifolds in state space are powerful tools for visualizing and analyzing gamma dynamics. Here we compare the manifolds for the three regimes studied in the manuscript. In the Syn regime, the state X consumes the pending E-kicks at a faster rate (compared to Hom and Reg), i.e., E-neurons are recruited faster by the recurrent excitation, resulting in larger MFEs. These features are reflected by the full-model trajectories in the three different regimes (Fig. S2).

Fig. S2 Full-model trajectories in the three regimes (Syn, Reg, and Hom). Left: projection onto the subspace (N_GE, H_E, H_I); Middle: projection onto the subspace (N_GE, N_GI); Right: projection onto the subspace (N_GE, H_I).

S2 A Larger MIF Network

We repeat the model reduction for a 400-neuron MIF network and collect the state-space variables (N_GE, N_GI, H_E, H_I) from a 100-second simulation (Fig. S3). A similar manifold is revealed by the trajectories in the state space and is also verified by the mass of the trajectory density (Fig. S3A,B). The two-dimensional local structure is likewise verified by the local dimensionality estimation and the locally-linear embedding (Fig. S3C,D). For this network, we simulate the Syn regime with parameters identical to [43].

Fig. S3 Gamma dynamics of a 400-neuron MIF network restricted to a low-dimensional manifold. Panels A-D are analogous to panels A, B, E, F of Fig. 6.

S3 Issues of Model Reduction

Selection of the cutoff. How faithfully the RN model captures the full-model dynamics depends on the selection of the cutoff V_c. Our assumption in the RN model is that the gamma dynamics is mostly sensitive to the subthreshold distribution of the neurons about to fire. On the other hand, a high V_c provides an accurate description of the gate neurons but only a very rough one of the base neurons, which is a potential issue for the RN model. For the RN model in this paper, we tuned the selection of the cutoff systematically and finally chose V_c = 40. Here we present the firing rates and raster plots of the RN model for V_c = 70 and V_c = 40; the former results in much larger firing rates and MFEs.

Fig. S4 Raster plots for different cutoffs V_c in the RN model. Upper: V_c = 70 (firing rates E = 89.90, I = 153.26); Lower: V_c = 40 (firing rates E = 45.47, I = 82.06).

The duration of MFEs. The MFEs of the reduced models have longer durations than those of the full model, possibly due to slower probability flows, since we combine multiple states of the full model into one in the reduced models.

Fig. S5 The duration of MFEs in different models (full model, reduced network model, coarse-grained model). For each model, the box plot indicates the mean (red line), first and third quartiles (lower and upper bounds of the blue box), minimum and maximum (black lines), and outliers (red crosses). The durations of MFEs for the RN and CG models are significantly longer than for the full model (p = 0.004 and 0.038, respectively).
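Significance values of this kind come from a standard two-sample comparison of the MFE-duration samples. The manuscript does not specify which test was used, so the sketch below assumes a two-sided Mann-Whitney U test on hypothetical duration arrays (the gamma-distributed samples are stand-ins for measured values):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(2)

# Hypothetical MFE-duration samples (ms), one array per model.
dur_full = rng.gamma(4.0, 1.5, size=200)   # full model
dur_rn   = rng.gamma(5.0, 1.5, size=200)   # reduced network model
dur_cg   = rng.gamma(4.5, 1.5, size=200)   # coarse-grained model

for name, dur in [("RN", dur_rn), ("CG", dur_cg)]:
    stat, p = mannwhitneyu(dur, dur_full, alternative="two-sided")
    print(f"{name} vs full: U = {stat:.1f}, p = {p:.3g}")
```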
A 5-variable CG model. The 4-variable CG model combines the pending E-kick pools of all neurons together, which may cause a problem, since the pending E-kicks for E-neurons and for I-neurons are consumed at different speeds. Hence we devised a 5-variable version in which H_EE and H_IE are counted separately.

Fig. S6 E-spike synchrony index (SSI) for the full model, the reduced network model, and the 5-variable CG model.
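To make the bookkeeping concrete, here is a minimal sketch of the two state layouts (the type and field names are ours, not the manuscript's; N_GE and N_GI are read as gate-neuron counts, following the RN/CG construction). The 5-variable model splits the pending excitatory pool H_E by target population, so E-to-E and E-to-I kicks can be consumed at their own rates.

```python
from typing import NamedTuple

class StateCG4(NamedTuple):
    """4-variable CG state: one shared pool of pending E-kicks."""
    N_GE: int   # excitatory gate-neuron count
    N_GI: int   # inhibitory gate-neuron count
    H_E: int    # pending E-kicks, all targets pooled
    H_I: int    # pending I-kicks

class StateCG5(NamedTuple):
    """5-variable CG state: pending E-kicks split by target population."""
    N_GE: int
    N_GI: int
    H_EE: int   # pending E-kicks targeting E-neurons
    H_IE: int   # pending E-kicks targeting I-neurons
    H_I: int    # pending I-kicks

# The split lets H_EE and H_IE be depleted at different rates during an MFE.
s = StateCG5(N_GE=3, N_GI=2, H_EE=120, H_IE=40, H_I=15)
```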