Building population models for large-scale neural recordings: opportunities and pitfalls
Cole Hurwitz, Nina Kudryashova, Arno Onken and Matthias H. Hennig*
University of Edinburgh, Institute for Adaptive and Neural Computation, Edinburgh, EH8 9AB, United Kingdom
Correspondence: * [email protected]
February, 2021
Abstract
Modern extracellular recording technologies now enable simultaneous recording from large numbers of neurons. This has driven the development of new statistical models for analyzing and interpreting neural population activity. Here we provide a broad overview of recent developments in this area. We compare and contrast different approaches, highlight strengths and limitations, and discuss biological and mechanistic insights that these methods provide. While still an area of active development, there are already a number of powerful models for interpreting large-scale neural recordings even in complex experimental settings.
Introduction
The availability of dense microelectrode arrays with high channel counts now allows for studying populations of neurons with the temporal precision of single spike times [41, 68]. Statistical models have emerged as an essential tool to represent selected characteristics of neural population activity.

Informally, statistical models of population activity can be grouped into two categories: correlation models and latent variable models. Correlation models aim to capture the statistical structure of the population activity, a formidable task given the large space of all possible activity patterns. For just 20 neurons, there are over one million possible spiking patterns; this number grows to over one billion for a population of 30 neurons. A key ingredient of tractable correlation models is a restricted parameter set modeling relevant quantities such as pairwise couplings.

Latent variable models take the opposite approach, aiming to reduce the complexity of the data by identifying a small number of essential factors that underlie the observed activity. This is also a difficult task as there are a myriad of decisions when designing such a model. For instance, it is not clear how many factors underlie the observed data, how these factors should be modeled, and what their contribution is to the observed neural population activity.

In this review, we first discuss the often neglected process of obtaining single neuron activity (and other modeling data) from high-density recordings. Next, we provide an overview of the most recent developments in correlation and latent variable modeling. Finally, we address the important question of how the predictions and parameters of these models give rise to mechanistic insights about the underlying neural circuitry.
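To make the pattern-counting argument above concrete: treating each of N neurons as either spiking or silent within a time bin (the binarization behind these counts), the number of possible population patterns is
\[ 2^{N}, \qquad 2^{20} = 1{,}048{,}576 \approx 10^{6}, \qquad 2^{30} \approx 1.07 \times 10^{9}, \]
which is why the full joint distribution cannot be estimated directly from realistic amounts of data and restricted parameterizations are needed.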
Figure 1:
The process of extracting the activity of single neurons with a spike sorter is error-prone, with different sorters having different failure modes. It is, therefore, recommended to make use of different sorters before performing model-based analysis. A first method is to perform the same analysis on the output of different sorters to see if functional insights are independent of the chosen spike sorter (bottom). Alternatively, one could obtain a consensus sorting (top) where false positive units are largely removed by taking the agreement between different sorters. Any analyses could then be performed on this consensus sorting. These pipelines, together with various pre- and postprocessing steps as well as automated curation, can be easily implemented using the SpikeInterface framework [10], which supports all commonly used spike sorters.
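As an illustration of the pipelines in Figure 1, the sketch below runs several sorters on the same recording with SpikeInterface and keeps only units on which at least two sorters agree. This is a minimal example, not a prescription: the function names follow the SpikeInterface Python API at the time of writing and may differ between versions, the listed sorters must be installed separately, and the toy recording is a stand-in for real data.

import spikeinterface.extractors as se
import spikeinterface.sorters as ss
import spikeinterface.comparison as sc

# Small simulated recording; replace with a reader for your probe/file format.
recording, sorting_true = se.toy_example(num_channels=32, duration=60, seed=0)

# Run several independent sorters on the same recording.
sorter_names = ["herdingspikes", "tridesclous", "spykingcircus"]
sortings = [ss.run_sorter(name, recording, output_folder="sorting_" + name)
            for name in sorter_names]

# Compare the outputs and keep units found by at least two of the three sorters.
comparison = sc.compare_multiple_sorters(sorting_list=sortings, name_list=sorter_names)
consensus = comparison.get_agreement_sorting(minimum_agreement_count=2)
print("Consensus units:", consensus.get_unit_ids())

Alternatively, the individual sortings can each be curated and analyzed separately, following the bottom pipeline in Figure 1, to check that conclusions do not depend on the choice of sorter.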
Extracting modeling data
Precisely recording spiking activity from many neurons at once is essential for understanding neural circuits, which can be highly diverse and heterogeneous [13]. Extracting pertinent information from these recordings for downstream modeling, however, is difficult due to the size and complexity of the datasets.

Spike trains are extracted through a process called spike sorting, where detected spikes are algorithmically assigned to individual neurons. On dense microelectrode arrays and probes, spike sorting is a complex de-mixing problem as the extracellular signals from a neuron are recorded on multiple channels and each channel reports activity from multiple neurons. There are several algorithmic approaches for spike sorting which are reviewed elsewhere [71, 33, 16, 46]. Accurately extracting spike trains is essential since mistakes, which include incorrectly merged or split units, unresolved overlapping spikes, or incorrect clustering due to probe drift, have been shown to bias subsequent analysis [3, 92, 93].

It is important to acknowledge that none of the existing spike sorting algorithms are error-free. A collection of ground truth comparisons for all major spike sorters is available on the SpikeForest website [50] to aid selection of an appropriate spike sorter for a given probe type. Moreover, the comparison of spike sorter outputs has shown that there is surprisingly low agreement, in part due to a large number of false positive units [10]. A viable approach to mitigate against spurious findings is, therefore, to perform the same analysis on multiple sorter outputs or on an ensemble agreement among multiple sorters. Both types of analysis can be easily implemented using the SpikeInterface framework (Figure 1) [10].

High-density microelectrode arrays also allow for estimating the source location of each recorded spike, which roughly corresponds to the location of the firing neuron's soma. These location estimates can be used to improve the accuracy of spike sorting, to register recorded neurons with anatomical information, and to identify the same units in longer recordings or across multiple sessions [66, 77, 40, 35, 11, 36].

Finally, it is potentially possible to classify the cell type of a neuron using its spiking activity and the corresponding spatio-temporal pattern recorded on the array. The morphology and ion channel expression of a neuron lead to (relatively) unique extracellular potentials that unsupervised or supervised approaches can use to classify the neuron's cell type [77, 47, 11, 72, 39].

Correlation models

Understanding circuit function requires models that can capture the correlations that arise due to interactions between neurons [2, 21, 98]. To this end, neural population activity can be approximated by different multivariate probability distributions with varying levels of detail (see Fig. 2).
Maximum entropy models (MaxEnt) provide a principled way to construct such distributions. These models are characterized by a set of given constraints but otherwise make as few assumptions about the joint population statistics as possible [80]. Constraints enforce which aspects of the data are reproduced, such as the firing rates and pairwise correlations between the neurons [81, 84], the population firing statistics [88], and the dependencies between individual neurons and the overall population activity [26, 57, 55]. While MaxEnt models can capture population spiking well [81, 84], the computational cost of fitting them is prohibitive for large populations. A more scalable approach is to discretize the multivariate Gaussian distribution [1, 48]. This model is called the
Dichotomized Gaussian model and in its general form makes it possible to model spike count variables with arbitrary distributions [48].

When given well-specified external inputs, generalized linear models (GLMs) can also successfully describe the activity of large neuronal populations [90, 63, 64]. In these models, external inputs, inputs from other neurons, and each neuron's spiking history are summed up and transformed by a rate function. The resulting rate is then used in a stochastic process to model spiking activity. Importantly, the rate is calculated from the inputs of previous time bins but not the same time bin, making it difficult to model zero-lag correlations [49]. For typical rate functions, fitting GLMs is a convex optimization problem which can be solved efficiently, making GLMs tractable for large-scale neural recordings [90, 64]. Along with modeling the observed neural population, the activity of unobserved neurons can be inferred using the algorithm proposed in Battistin et al. [4]. Knowing the topology of the external covariates can further aid in inferring the activity of the unseen neurons [86]. Despite their flexibility, GLMs cannot model networks with neurons that perform non-linear integration of multiple external inputs [5].

A more direct representation of statistical dependencies can be provided by copula models. Every distribution representing neural population activity can be decomposed into the distributions of individual neuron activities and the dependence between these neurons, represented by a copula [38]. Importantly, copula models can represent inter-neuronal dependencies that go beyond linear (Gaussian) correlations [7], as well as non-linear stimulus-response relationships [43]. Explicit copula models make it possible to construct multivariate distributions with complex dependencies and radically different statistics for each element. For example, copula models can be used to jointly model different recording modalities such as local field potentials and individual neurons' spiking activities [58].
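To make the model classes above concrete, the following schematic forms are standard in the literature; the notation is chosen here for illustration rather than taken from any one of the cited papers. A pairwise MaxEnt (Ising) model over binarized activity $x_i \in \{0,1\}$, a Poisson GLM with conditional intensity $\lambda_i(t)$, and a copula decomposition of the joint cumulative distribution $F$ read, respectively,
\[ P(\mathbf{x}) = \frac{1}{Z}\exp\Big(\sum_i h_i x_i + \sum_{i<j} J_{ij}\, x_i x_j\Big), \]
\[ \lambda_i(t) = f\Big(b_i + \mathbf{k}_i \cdot \mathbf{s}(t) + \sum_j (h_{ij} \ast y_j)(t)\Big), \qquad y_i(t) \sim \mathrm{Poisson}\big(\lambda_i(t)\,\Delta t\big), \]
\[ F(x_1, \dots, x_N) = C\big(F_1(x_1), \dots, F_N(x_N)\big), \]
where $h_i$ and $J_{ij}$ are single-neuron fields and pairwise couplings, $\mathbf{k}_i$ is a stimulus filter, the $h_{ij}$ are causal coupling and spike-history filters acting on past spikes $y_j$, $f$ is the rate non-linearity, and the copula $C$ links the single-neuron marginals $F_i$.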
Figure 2:
Correlation models construct a joint distribution of population activity with various levels of detail. The horizontal arrow orders the methods according to the complexity of their statistical distribution (i.e. the number of parameters).
Figure 3:
Latent variable models assume that neural activity is generated by a set of low-dimensional latent states. (a) When modeling the temporal evolution of the latent states, one can explicitly model the dynamics with a state-space model or can instead model the statistics of the latent trajectories with a Gaussian process (GP). (b) The latent trajectories can be related back to neural activity (e.g. the firing rates) with a variety of mapping functions including linear functions, neural networks, and GPs. (c) An additional step to further constrain the dynamics is to reconstruct the spiking activity using a point-process distribution such as the Poisson distribution. (d) Recently, it has been shown that explicitly modeling the relationship between the latent factors and the observed behavioral variables can further constrain the factors to be low-dimensional and 'behaviorally relevant' [79].

Recently, copula models have also been applied to forward and backward spike intervals in order to identify the directions of synaptic connections [94]. Such models are particularly promising for analyzing recordings of neural data and exploring its relation to behavior and stimuli, which typically follow different statistics and tend to have complex dependencies [43, 37].

Correlation modeling approaches need sufficient flexibility to account for the complexity of neural activity. However, since the amount of recorded data points available for fitting these models is limited, this flexibility can lead to identifiability issues. One approach to alleviate this problem is to make assumptions about the structure of the population activity (e.g. restrict the model to pairwise functional interactions only [81]). Alternatively, the uncertainty of the model parameters can be quantified with approximate Bayesian inference methods [97]. A recently developed method, LR-GLM, utilizes low-rank data approximations to scale approximate Bayesian inference methods to high-dimensional real datasets, making them more applicable to large-scale neural recordings [89].

Application to complex stimuli with strong correlations can also be challenging for correlation models. By default, these models assume that the given stimuli are independent. However, for correlated stimuli the inferred correlations between neurons consist of two parts: 'stimulus correlations' that depend on the set of presented stimuli and their statistics, and 'noise correlations' that arise from circuit interactions. A modified maximum entropy model developed by Granot-Atedgi et al. [28] showed that distinguishing between these correlation components can lead to a significant improvement when predicting population activity patterns. Similarly, Mahuas et al. [51] introduced a two-step inference procedure for GLMs that decouples the inference of inter-neuron couplings from the inference of stimulus filters. They showed that without this correction, the inferred couplings were overestimated and produced feedback loops leading to unnatural simulated dynamics.
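Schematically, and in the notation used above, the stimulus-dependent maximum entropy approach of Granot-Atedgi et al. [28] can be thought of as letting the single-neuron fields depend on the stimulus $s$ while keeping the pairwise couplings stimulus-independent (this is a simplified sketch rather than the full model):
\[ P(\mathbf{x} \mid s) = \frac{1}{Z(s)}\exp\Big(\sum_i h_i(s)\, x_i + \sum_{i<j} J_{ij}\, x_i x_j\Big), \]
so that the $J_{ij}$ capture noise correlations arising from circuit interactions, while stimulus-driven covariation is absorbed into the $h_i(s)$.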
Latent variable models

As neural population activity is characterized by the time-varying firing rate of each neuron, the activity at any given time point resides in a high-dimensional ambient space. Despite this apparent high dimensionality, recent work suggests that the set of possible activity patterns is confined to a low-dimensional manifold [52, 17, 78, 31, 23, 20]. In other words, population-wide activity patterns are generated from a small set of latent variables whose temporal evolutions are referred to as the latent trajectories of the neural activity. The minimum number of latent variables required to fully characterize the activity of a neural population is known as its intrinsic dimensionality [6]. The intrinsic dimensionality of neural population activity is thought to be significantly smaller than the number of recorded neurons, either due to the constraints set by the underlying network connectivity [91, 78, 57, 23, 32] or due to the low stimulus or task complexity [24, 87].

There are two main methodological approaches for modeling low-dimensional neural trajectories (shown in Fig. 3). The first models latent trajectories using state-space representations, where the dynamics of the trajectories are modeled explicitly with a transition function that relates past latent states to future latent states. The second models latent trajectories using moment representations where, rather than having an explicit model of the dynamics, the statistics of the latent trajectories over time are modeled. With both approaches, there is an observation model which relates the neural trajectories back to the observed variables (e.g. neural activity).

Traditional dimensionality reduction techniques such as principal component analysis (PCA) and factor analysis (FA) can be re-interpreted as static state-space models without temporal dependence between states [73]. These models are useful for extracting low-dimensional structure from neural recordings [22]; however, their lack of temporal structure makes them less appealing for modeling more complicated phenomena such as rotational dynamics [17]. Beyond static models, linear dynamical state-space models can capture simple temporal dependencies between latent states. Common linear methods include linear dynamical systems (LDS) [85, 44, 12, 62] and jPCA [17, 83, 54]. While efficient and interpretable, linear state-space methods cannot capture non-linear dynamical phenomena which are thought to underlie rhythmic motor patterns [75, 30], decision making [67], and pathological conditions such as epilepsy [29]. To capture these dynamics, non-linear dynamical state-space models have been introduced, including approaches that model dynamics with recurrent neural networks (RNNs) [14, 82], switching linear dynamics [61], and recurrent switching linear dynamics [96]. A particularly efficient RNN-based model is the recently introduced latent factor analysis via dynamical systems (LFADS) [60]. This approach utilizes a variational autoencoder to amortize the cost of inference over the dynamics, making it more practical for large-scale neural recordings. AutoLFADS has further improved the generalizability and efficiency of this approach using a novel regularization technique and hyperparameter search algorithm [42].
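The two approaches in Fig. 3a can be summarized schematically as follows (illustrative notation, with a Poisson observation model for the state-space case and a linear-Gaussian one for the GP-based case discussed below):
\[ \text{state-space:}\quad \mathbf{z}_{t+1} = A\,\mathbf{z}_t + \boldsymbol{\epsilon}_t,\ \ \boldsymbol{\epsilon}_t \sim \mathcal{N}(0, Q), \qquad y_{i,t} \sim \mathrm{Poisson}\big(\exp(\mathbf{c}_i^{\top}\mathbf{z}_t + d_i)\big), \]
\[ \text{moment representation:}\quad z^{(i)}_{1:T} \sim \mathcal{N}\big(0, K^{(i)}\big), \qquad \mathbf{y}_t = C\,\mathbf{z}_t + \mathbf{d} + \boldsymbol{\eta}_t,\ \ \boldsymbol{\eta}_t \sim \mathcal{N}(0, R), \]
where $A$ is the dynamics matrix (replaced by an RNN or a switching system in the non-linear models above), $K^{(i)}$ is the kernel matrix of the Gaussian process governing the $i$-th latent dimension, and $C$, $\mathbf{d}$ map the latent state to the observed activity.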
An alternative to explicitly learning a dynamical model for the neural trajectories is to model their statistics directly using a moment representation of the trajectories. The most common model used for this approach is Gaussian-process factor analysis (GPFA) [15]. In this model, the trajectory of a latent state is generated by an independent Gaussian process (GP), which defines a distribution over functions. Beyond simple properties like smoothness (set by the chosen kernel), GPs make minimal assumptions about the underlying function they are modeling, so they can be used to model arbitrary non-linear trajectories. They can also provide uncertainty estimates for the trajectories since they learn a full distribution over the entire set of possible functions. The strong independence assumptions of GPFA provide few constraints on the functional form of the latent trajectories, making it more useful for exploratory data analyses where the parametric form of the trajectories is unknown. Recent work has focused on adding 'dynamical' properties to the latent trajectories learned by GPFA. This has led to the development of Gaussian Process Factor Analysis with Dynamical Structure (GPFADS), which utilizes the insight that autonomous dynamics often result in observed trajectories that could not occur in reverse (i.e. temporal non-reversibility) [76]. To impose this constraint on the learned trajectories, GPFADS introduces a novel multi-output non-reversible kernel which yields samples that have lower probabilities of occurring in reverse. As a result, GPFADS can disentangle the trajectories generated by non-linear, non-reversible dynamical systems, such as primary motor cortex arm reach data.

There has also been recent work to improve the observation models which relate the trajectories back to the neural activity. Common approaches include modeling neural activity with a Gaussian model or the more plausible Poisson model [49]. The mapping between the latent trajectories and neural activity can take on a number of functional forms, including linear maps [85, 15, 60], GPs [45, 95, 82], or neural networks [25]. Most recently, observation models that take into account external behavioral variables have been introduced for capturing 'behaviorally relevant' latent trajectories. Sani et al. [79] introduced a novel linear Gaussian state-space model (PSID) that jointly models the neural activity and the recorded behavioral variables. Intriguingly, PSID found that behaviorally relevant latent trajectories are significantly lower-dimensional than previously thought and that rotational dynamics in the primary motor cortex appear to reverse direction when the arm is returning versus when the arm is reaching.
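As a sketch of this last idea (with illustrative notation, not the exact formulation of [79]), a behavior-aware state-space model adds a second readout from the shared latent state, so that neural activity $\mathbf{y}_t$ and behavioral variables $\mathbf{b}_t$ are modeled jointly:
\[ \mathbf{z}_{t+1} = A\,\mathbf{z}_t + \mathbf{w}_t, \qquad \mathbf{y}_t = C_y\,\mathbf{z}_t + \mathbf{v}_t, \qquad \mathbf{b}_t = C_b\,\mathbf{z}_t + \mathbf{e}_t, \]
with Gaussian noise terms $\mathbf{w}_t$, $\mathbf{v}_t$, $\mathbf{e}_t$; the subspace of $\mathbf{z}_t$ that drives $\mathbf{b}_t$ is then the 'behaviorally relevant' part of the latent dynamics, which can be prioritized during model fitting.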
Mechanistic interpretations of statistical models

Correlation models of neural activity can yield two main insights into neural circuits. First, they can be used as a proxy for the full joint distribution of neural activity, which cannot be estimated directly from data. This is useful for information theoretic analyses of neural populations to measure the relationships between activity and external covariates such as stimuli and behavior [74]. Second, the parameters in these models can be interpreted in terms of the functional circuit connectivity. For instance, connectivity analysis has been used to uncover the spatial extent of interactions in the retina [84, 56] and to characterize interactions between neurons with different selectivity in the entorhinal cortex [19]. One must be careful with this analysis, however, as unobserved neurons can systematically bias the estimated couplings. While this effect is weak for weakly coupled populations, it can become a significant problem for strongly coupled populations [18, 9].

It has also been shown that inferred functional connectivity is not well constrained by the recorded spike trains and that many coupling parameters, even those with large absolute values, can be altered without significantly changing the network activity predicted by the model [59, 65]. This may not be a limitation of the models but rather an intrinsic property of neural circuits. Simulations of spiking networks have shown that re-wiring even a large number of synapses may lead to minor functional consequences [53]. As a result, the corresponding inferred functional connections are also poorly constrained. Analysis of long-term recordings has shown that these less important connections are also subject to more intensive re-modeling over time [59, 65]. Therefore, taking into account the reliability of the inferred couplings in functional connectivity models allows for the identification of the neurons, connections and circuit motifs that most determine the population activity statistics [34, 59, 65].

In comparison to correlation models, latent variable models of neural activity provide a succinct and low-dimensional description of joint population activity in the form of latent trajectories. One main way to gain insight into circuit function using these models is to correlate the learned trajectories back to the stimuli or behavior in the experiment. The success of this approach is demonstrated best in studies that correlate latent trajectories in the premotor and motor cortex to arm reaching tasks. Churchland et al. [17] found that low-dimensional rotational trajectories in the primary motor cortex were correlated with arm reaches. Gallego et al. [22] extended this finding, demonstrating that these low-dimensional trajectories are a stable neural correlate for consistent execution of arm reaches over the course of many years. Finally, Ramanathan et al. [69] showed that following a stroke, diminished reaching function in the contralesional arm was correlated with a loss of motor cortical neural trajectories. The trajectories only reemerged after motor recovery and proved to be a useful neuromodulatory target for therapeutic electrical stimulation.

Despite the success of latent variable models, they make a number of assumptions that may affect interpretation. The first assumption is that the neuronal population in question is statistically homogeneous. This might be violated when modeling populations that span multiple brain regions and contain different cell types [24].
Designing latent variable models for modeling the dynamical interactions of multiple brain areas, sub-populations, or functional cell types is still an active area of research [27, 8, 70]. Another assumption implicit in many latent variable models is that the factors that capture the most neural variability are also the most important for circuit function. This may not be true, however, as neural circuits also encode information that is not relevant for the specific task or stimuli being studied. This can be alleviated by constraining latent variable models with the observed behavior of interest [79]. Finally, latent variable models assume that neural data is, in fact, low-dimensional. While a number of studies have independently found this to be the case for their datasets, recent work has potentially challenged this viewpoint. A recent analysis of stimulus-evoked activity in the visual cortex suggests that the manifold dimensionality is actually as high as it can be without becoming non-differentiable [87]. Moreover, Gao et al. [24] suggest that population activity is actually as high-dimensional as possible given the simplicity of the given stimuli or tasks.

Outlook
The work summarized here shows that analyzing and interpreting large-scale recordings requires model-based approaches, and that statistical models have emerged as a particularly suitable model class. A central issue is the explicit or implicit assumptions made in each class of model. On the one hand, strong constraints and modeling choices are a key requirement to obtain tractable models that can be successfully fit to large-scale datasets. At the same time, it is important that the model be expressive enough to capture the biologically relevant information in the data. Striking a balance between these two factors requires a close interaction between experimental design and model development.
Acknowledgments
AU is supported by the Engineering and Physical Sciences Research Council (EP/S005692/1). CLH is supported by the Thouron Award and by the School of Informatics, University of Edinburgh.
References
[1] Amari, S. I., Nakahara, H., Wu, S., and Sakai, Y. (2003). Synchronous firing and higher-order interactions in neuron pool.
Neural Comput , 15.[2] Averbeck, B. B., Latham, P. E., and Pouget, A. (2006). Neural correlations, population codingand computation.
Nat Rev Neurosci , 7(5):358–366.[3] Bar-Gad, I., Ritov, Y., Vaadia, E., and Bergman, H. (2001). Failure in identification of overlapping spikes from multiple neuron activity causes artificial correlations.
J Neurosci Methods ,107(1-2):1–13.[4] Battistin, C., Hertz, J., Tyrcha, J., and Roudi, Y. (2015). Belief propagation and replicas forinference and learning in a kinetic ising model with hidden spins.
J Stat Mech Theory Exp , 2015(5):P05021.[5] Benjamin, A. S., Fernandes, H. L., Tomlinson, T., Ramkumar, P., VerSteeg, C., Chowdhury, R. H., Miller, L. E., and Kording, K. P. (2018). Modern machine learning as a benchmark for fitting neural responses.
Front Comput Neurosc , 12:56.
[6] Bennett, R. (1969). The intrinsic dimensionality of signal collections. IEEE Trans Inf Theory, 15(5):517–525.[7] Berkes, P., Wood, F., and Pillow, J. (2008). Characterizing neural dependencies with copula models. In
Advances in Neural Information Processing Systems , volume 21, pages 129–136.[8] Bong, H., Liu, Z., Ren, Z., Smith, M., Ventura, V., and Robert, K. E. (2020). Latent dynamicfactor analysis of high-dimensional neural recordings.
Advances in Neural Information Pro-cessing Systems , 33.[9] Brinkman, B. A., Rieke, F., Shea-Brown, E., and Buice, M. A. (2018). Predicting how and whenhidden neurons skew measured synaptic interactions.
PLoS Comput Biol , 14(10):e1006490.[10] Buccino, A. P., Hurwitz, C. L., Garcia, S., Magland, J., Siegle, J. H., Hurwitz, R., and Hennig, M. H. (2020). SpikeInterface, a unified framework for spike sorting. eLife , 9:e61834.* This paper describes an open source framework to create fully reproducible analysis pipelines for extracellular recordings. It allows running all major current spike sorting algorithms on the same data sets, includes common methods for pre- and postprocessing and automated curation, and a systematic comparison between sorter outputs. Use of this toolkit can both help to avoid failure modes of common spike sorters, and aid more reproducible analysis.[11] Buccino, A. P., Kordovan, M., Ness, T. V., Merkt, B., Häfliger, P. D., Fyhn, M., Cauwenberghs, G., Rotter, S., and Einevoll, G. T. (2018). Combining biophysical modeling and deep learning for multielectrode array neuron localization and classification.
J Neurophysiol , 120(3):1212–1232.[12] Buesing, L., Macke, J. H., and Sahani, M. (2013). Spectral learning of linear dynamics fromgeneralised-linear observations with application to neural population data. In
Advances inNeural Information Processing Systems 25: 26th Conference on Neural Information ProcessingSystems (NIPS 2012) , pages 1691–1699.[13] Buzsáki, G. (2004). Large-scale recording of neuronal ensembles.
Nat Neurosci , 7(5):446–451.[14] Byron, M. Y., Afshar, A., Santhanam, G., Ryu, S. I., Shenoy, K. V., and Sahani, M. (2006). Ex-tracting dynamical structure embedded in neural activity. In
Advances in Neural InformationProcessing Systems , pages 1545–1552.[15] Byron, M. Y., Cunningham, J. P., Santhanam, G., Ryu, S. I., Shenoy, K. V., and Sahani, M.(2009). Gaussian-process factor analysis for low-dimensional single-trial analysis of neuralpopulation activity. In
Advances in Neural Information Processing Systems , pages 1881–1888.[16] Carlson, D. and Carin, L. (2019). Continuing progress of spike sorting in the era of big data.
Curr Opin Neurobiol , 55:90–96.[17] Churchland, M. M., Cunningham, J. P., Kaufman, M. T., Foster, J. D., Nuyujukian, P.,Ryu, S. I., and Shenoy, K. V. (2012). Neural population dynamics during reaching.
Nature ,487(7405):51–56.[18] Dunn, B. and Battistin, C. (2017). The appropriateness of ignorance in the inverse kineticIsing model.
J Phys A: Math Theor , 50(12):124002.[19] Dunn, B., Mørreaunet, M., and Roudi, Y. (2015). Correlations and Functional Connectionsin a Population of Grid Cells.
PLOS Comput Biol , 11(2):e1004052.[20] Elsayed, G. F. and Cunningham, J. P. (2017). Structure in neural population recordings: anexpected byproduct of simpler phenomena?
Nat Neurosci , 20(9):1310.
[21] Franke, F., Fiscella, M., Sevelev, M., Roska, B., Hierlemann, A., and da Silveira, R. A. (2016). Structures of neural correlation and how they favor coding.
Neuron , 89.[22] Gallego, J. A., Perich, M. G., Chowdhury, R. H., Solla, S. A., and Miller, L. E. (2020). Long-term stability of cortical population dynamics underlying consistent behavior.
Nat Neurosci , 23(2):260–270.** The paper demonstrates that latent dynamics are a stable neural correlate for the execution of learned behaviors. They show that even across years of recordings, latent dynamics in the primary motor cortex during reach remain surprisingly consistent despite significant changes in single neuron activities.[23] Gallego, J. A., Perich, M. G., Miller, L. E., and Solla, S. A. (2017). Neural manifolds for the control of movement.
Neuron , 94(5):978–984.[24] Gao, P., Trautmann, E., Yu, B., Santhanam, G., Ryu, S., Shenoy, K., and Ganguli, S. (2017).A theory of multineuronal dimensionality, dynamics and measurement.
BioRxiv .[25] Gao, Y., Archer, E., Paninski, L., and Cunningham, J. P. (2016). Linear dynamical neuralpopulation models through nonlinear embeddings. arXiv preprint arXiv:1605.08454 .[26] Gardella, C., Marre, O., and Mora, T. (2016). A tractable method for describing complexcouplings between neurons and population rate. eNeuro , 3.[27] Glaser, J. I., Whiteway, M. R., Cunningham, J. P., Paninski, L., and Linderman, S. W. (2020).Recurrent switching dynamical systems models for multiple interacting neural populations. bioRxiv .[28] Granot-Atedgi, E., Tkačik, G., Segev, R., and Schneidman, E. (2013). Stimulus-dependentmaximum entropy models of neural population codes.
PLoS Comput Biol , 9(3):e1002922.[29] Haghighi, H. S. and Markazi, A. (2017). A new description of epileptic seizures based ondynamic analysis of a thalamocortical model.
Sci Rep , 7(1):1–10.[30] Hall, T. M., de Carvalho, F., and Jackson, A. (2014a). A common structure underlies low-frequency cortical dynamics in movement, sleep, and sedation.
Neuron , 83(5):1185–1199.[31] Hall, T. M., Nazarpour, K., and Jackson, A. (2014b). Real-time estimation and biofeedback of single-neuron firing rates using local field potentials.
Nat Commun , 5(1):1–12.[32] Hennig, J. A., Golub, M. D., Lund, P. J., Sadtler, P. T., Oby, E. R., Quick, K. M., Ryu, S. I., Tyler-Kabara, E. C., Batista, A. P., Byron, M. Y., et al. (2018). Constraints on neural redundancy.
Elife ,7:e36774.[33] Hennig, M. H., Hurwitz, C., and Sorbaro, M. (2019). Scaling Spike Detection and Sortingfor Next-Generation Electrophysiology. In Chiappalone, M., Pasquale, V., and Frega, M., edi-tors,
In Vitro Neuronal Networks: From Culturing Methods to Neuro-Technological Applications ,Advances in Neurobiology, pages 171–184. Springer International Publishing, Cham.[34] Herzog, R., Escobar, M.-J., Cofre, R., Palacios, A. G., and Cessac, B. (2018). DimensionalityReduction on Spatio-Temporal Maximum Entropy Models of Spiking Networks. bioRxiv , page278606.[35] Hilgen, G., Sorbaro, M., Pirmoradian, S., Muthmann, J.-O., Kepiro, I. E., Ullo, S., Ramirez,C. J., Puente Encinas, A., Maccione, A., Berdondini, L., Murino, V., Sona, D., Cella Zanacchi,F., Sernagor, E., and Hennig, M. H. (2017). Unsupervised Spike Sorting for Large-Scale, High-Density Multielectrode Arrays.
Cell Rep , 18(10):2521–2532.
[36] Hurwitz, C., Xu, K., Srivastava, A., Buccino, A., and Hennig, M. (2019). Scalable spike source localization in extracellular recordings using amortized variational inference. In
Advances inNeural Information Processing Systems , pages 4724–4736.[37] Ince, R. A., Giordano, B. L., Kayser, C., Rousselet, G. A., Gross, J., and Schyns, P. G. (2017). Astatistical framework for neuroimaging data analysis based on mutual information estimatedvia a gaussian copula.
Hum Brain Mapp , 38(3):1541–1573.[38] Jenison, R. L. and Reale, R. A. (2004). The shape of neural dependence.
Neural Comput , 16.[39] Jia, X., Siegle, J. H., Bennett, C., Gale, S. D., Denman, D. J., Koch, C., and Olsen, S. R. (2019). High-density extracellular probes reveal dendritic backpropagation and facilitate neuron classification.
J Neurophysiol , 121(5):1831–1847.[40] Jun, J. J., Mitelut, C., Lai, C., Gratiy, S. L., Anastassiou, C. A., and Harris, T. D. (2017a). Real-time spike sorting platform for high-density extracellular probes with ground-truth valida-tion and drift correction. bioRxiv , page 101030.[41] Jun, J. J., Steinmetz, N. A., Siegle, J. H., Denman, D. J., Bauza, M., Barbarits, B., Lee, A. K.,Anastassiou, C. A., Andrei, A., Aydın, Ç., Barbic, M., Blanche, T. J., Bonin, V., Couto, J., Dutta,B., Gratiy, S. L., Gutnisky, D. A., Häusser, M., Karsh, B., Ledochowitsch, P., Lopez, C. M.,Mitelut, C., Musa, S., Okun, M., Pachitariu, M., Putzeys, J., Rich, P. D., Rossant, C., Sun, W.-l.,Svoboda, K., Carandini, M., Harris, K. D., Koch, C., O’Keefe, J., and Harris, T. D. (2017b). Fullyintegrated silicon probes for high-density recording of neural activity.
Nature , 551(7679):232–236.[42] Keshtkaran, M. R., Sedler, A. R., Chowdhury, R. H., Tandon, R., Basrai, D., Nguyen, S. L., Sohn, H., Jazayeri, M., Miller, L. E., and Pandarinath, C. (2021). A large-scale neural network training framework for generalized estimation of single-trial population dynamics. bioRxiv , pages 2021–01.* The paper introduces AutoLFADS, an extension of LFADS which includes a novel regularization technique and hyperparameter search algorithm. The authors show that these extensions lead to improved generalizability and efficiency over the vanilla version of LFADS.[43] Kudryashova, N., Amvrosiadis, T., Dupuy, N., Rochefort, N., and Onken, A. (2020). Parametric Copula-GP model for analyzing multidimensional neuronal and behavioral relationships. arXiv preprint arXiv:2008.01007.* The paper proposes a scalable framework for modeling neuronal population activity and external covariates (e.g. behavior). The advantage of the method is the mutual information estimation: it provides more accurate estimates than non-copula methods, especially for large datasets. The GP parametrization allows for conditioning on continuous variables, e.g. on time or position, making the method suitable for studying dynamic trajectories.
Network: Computation in Neural Systems , 18(4):375–407.[45] Lawrence, N. and Hyvärinen, A. (2005). Probabilistic non-linear principal component anal-ysis with gaussian process latent variable models.
J Mach Learn Res , 6(11).[46] Lefebvre, B., Yger, P., and Marre, O. (2016). Recent progress in multi-electrode spike sortingmethods.
Journal of Physiology-Paris , 110(4, Part A):327–335.[47] Li, P. H., Gauthier, J. L., Schiff, M., Sher, A., Ahn, D., Field, G. D., Greschner, M., Callaway, E. M., Litke, A. M., and Chichilnisky, E. J. (2015). Anatomical Identification of Extracellularly Recorded Cells in Large-Scale Multielectrode Recordings.
J Neurosci , 35(11):4663–4675.
[48] Macke, J. H., Berens, P., Ecker, A. S., Tolias, A. S., and Bethge, M. (2009). Generating spike trains with specified correlation coefficients.
Neural Comput , 21.[49] Macke, J. H., Buesing, L., Cunningham, J. P., Yu, B. M., Shenoy, K. V., and Sahani, M. (2012).Empirical models of spiking in neural populations. In
Advances in Neural Information Process-ing Systems 24: 25th conference on Neural Information Processing Systems (NIPS 2011) , pages1350–1358.[50] Magland, J., Jun, J. J., Lovero, E., Morley, A. J., Hurwitz, C. L., Buccino, A. P., Garcia, S.,and Barnett, A. H. (2020). SpikeForest, reproducible web-facing ground-truth validation ofautomated neural spike sorters. eLife , 9:e55167.[51] Mahuas, G., Isacchini, G., Marre, O., Ferrari, U., and Mora, T. (2020). A new inference ap-proach for training shallow and deep generalized linear models of noisy interacting neurons. arXiv preprint arXiv:2006.06497 .** The paper proposes a method for training GLMs with complex correlated stimuli. It dis-entangles stimulus correlations from couplings, which improves generalization and preventsself-excitation in the network dynamics.[52] Mante, V., Sussillo, D., Shenoy, K. V., and Newsome, W. T. (2013). Context-dependent com-putation by recurrent dynamics in prefrontal cortex.
Nature , 503(7474):78–84.[53] Mongillo, G., Rumpel, S., and Loewenstein, Y. (2018). Inhibitory connectivity defines the realm of excitatory plasticity.
Nat Neurosci , 21(10):1463–1470.[54] Nemati, S., Linderman, S. W., Chen, Z., et al. (2014). A probabilistic modeling approachfor uncovering neural population rotational dynamics. In
Computational and Systems Neuroscience (Cosyne).[55] O'Donnell, C., Gonçalves, J. T., Whiteley, N., Portera-Cailliau, C., and Sejnowski, T. J. (2017). The Population Tracking Model: A Simple, Scalable Statistical Model for Neural Population Data.
Neural Comput , 29(1):50–93.[56] Ohiorhenuan, I. E., Mechler, F., Purpura, K. P., Schmid, A. M., Hu, Q., and Victor, J. D. (2010). Sparse coding and high-order correlations in fine-scale cortical networks.
Nature , pages 1–6.[57] Okun, M., Steinmetz, N. A., Cossell, L., Iacaruso, M. F., Ko, H., Barthó, P., Moore, T., Hofer,S. B., Mrsic-Flogel, T. D., Carandini, M., et al. (2015). Diverse coupling of neurons to popula-tions in sensory cortex.
Nature , 521(7553):511–515.[58] Onken, A. and Panzeri, S. (2016). Mixed vine copulas as joint models of spike counts andlocal eld potentials. In
Advances in Neural Information Processing Systems .[59] Panas, D., Amin, H., Maccione, A., Muthmann, O., van Rossum, M., Berdondini, L., andHennig, M. H. (2015). Sloppiness in spontaneously active neuronal networks.
J Neurosci ,35(22):8480–8492.[60] Pandarinath, C., O’Shea, D. J., Collins, J., Jozefowicz, R., Stavisky, S. D., Kao, J. C., Traut-mann, E. M., Kaufman, M. T., Ryu, S. I., Hochberg, L. R., Henderson, J. M., Shenoy, K. V.,Abbott, L. F., and Sussillo, D. (2018). Inferring single-trial neural population dynamics usingsequential auto-encoders.
Nat Methods , 15(10):805–815.** This paper introduces a novel RNN-based, state-space model termed latent factor analysisvia dynamical systems (LFADS). LFADS utilizes a variational autoencoder to amortize the costof inference over the dynamics making it more practical for large-scale neural recordings.
[61] Petreska, B., Yu, B. M., Cunningham, J. P., Santhanam, G., Ryu, S. I., Shenoy, K. V., and Sahani, M. (2011). Dynamical segmentation of single trials from population neural data. In
Proceedings of the 24th International Conference on Neural Information Processing Systems ,NIPS’11, pages 756–764, Red Hook, NY, USA. Curran Associates Inc.[62] Pfau, D., Pnevmatikakis, E. A., and Paninski, L. (2013). Robust learning of low-dimensionaldynamics from large neural ensembles. In
Advances in Neural Information Processing Systems .[63] Pillow, J. W., Paninski, L., Uzzell, V. J., Simoncelli, E. P., and Chichilnisky, E. J. (2005). Pre-diction and decoding of retinal ganglion cell responses with a probabilistic spiking model.
JNeurosci , 25(47):11003–13.[64] Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., and Simon-celli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronalpopulation.
Nature , 454(7207):995–999.[65] Ponce-Alvarez, A., Mochol, G., Hermoso-Mendizabal, A., de la Rocha, J., and Deco, G. (2020). Cortical state transitions and stimulus response evolve along stiff and sloppy parameter dimensions, respectively. eLife , 9:e53268.* Using maximum entropy models, this paper confirms that population activity in cortical networks is constrained only by a small fraction of highly sensitive coupling parameters. In in vivo recordings from auditory cortex, the authors show that transitions between synchronised and de-synchronised states reflect changes in sensitive parameters. In contrast, auditory stimuli affect primarily the insensitive couplings, hence stimulation has only a minor effect on the overall statistics of cortical activity.[66] Prentice, J. S., Homann, J., Simmons, K. D., Tkačik, G., Balasubramanian, V., and Nelson, P. C. (2011). Fast, scalable, Bayesian spike identification for multi-electrode arrays.
PloS one ,6(7):e19884.[67] Rabinovich, M. I., Huerta, R., Varona, P., and Afraimovich, V. S. (2008). Transient cognitivedynamics, metastability, and decision making.
PLoS Comput Biol , 4(5):e1000072.[68] Raducanu, B. C., Yazicioglu, R. F., Lopez, C. M., Ballini, M., Putzeys, J., Wang, S., Andrei, A.,Rochus, V., Welkenhuysen, M., Helleputte, N. v., et al. (2017). Time multiplexed active neuralprobe with 1356 parallel recording sites.
Sensors , 17(10):2388.[69] Ramanathan, D. S., Guo, L., Gulati, T., Davidson, G., Hishinuma, A. K., Won, S.-J., Knight, R. T., Chang, E. F., Swanson, R. A., and Ganguly, K. (2018). Low-frequency cortical activity is a neuromodulatory target that tracks recovery after stroke.
Nat Med , 24(8):1257–1267.[70] René, A., Longtin, A., and Macke, J. H. (2020). Inference of a mesoscopic population modelfrom population spike trains.
Neural Comput , 32(8):1448–1498.[71] Rey, H. G., Pedreira, C., and Quian Quiroga, R. (2015). Past, present and future of spikesorting techniques.
Brain Res Bull , 119:106–117.[72] Rohan, P. (2018). Large-scale neuron cell classification of single-channel and multi-channel extracellular recordings in the anterior lateral motor cortex. bioRxiv , page 445700.[73] Roweis, S. and Ghahramani, Z. (1999). A unifying review of linear gaussian models.
NeuralComput , 11(2):305–345.[74] Runyan, C. A., Piasini, E., Panzeri, S., and Harvey, C. D. (2017). Distinct timescales ofpopulation coding across cortex.
Nature , 548.
[75] Russo, A. A., Khajeh, R., Bittner, S. R., Perkins, S. M., Cunningham, J. P., Abbott, L. F., and Churchland, M. M. (2020). Neural trajectories in the supplementary motor area and motor cortex exhibit distinct geometries, compatible with different classes of computation.
Neuron ,107(4):745–758.* The paper explores a new non-linear pattern of activity in supplementary motor area dur-ing multi-cycle rhythmic movement: helical population-trajectory geometry. The separationbetween single-cycles in these trajectories allows for counting the number of revolutions.[76] Rutten, V., Bernacchia, A., Sahani, M., and Hennequin, G. (2020). Non-reversible gaussianprocesses for identifying latent dynamical structure in neural data. In
Advances in Neural Information Processing Systems , volume 33.** The first paper to consider time-irreversibility in GPFA models by constructing a new kernel function. This method, termed GPFADS, assigns lower probabilities to the time-reversed variants of the true trajectories. Unlike traditional GPFA, it can disentangle trajectories in the arm reach dataset [17].[77] Ruz, I. D. and Schultz, S. R. (2014). Localising and classifying neurons from high density MEA recordings.
J Neurosci Methods , 233:115–128.[78] Sadtler, P. T., Quick, K. M., Golub, M. D., Chase, S. M., Ryu, S. I., Tyler-Kabara, E. C., Byron,M. Y., and Batista, A. P. (2014). Neural constraints on learning.
Nature , 512(7515):423–426.[79] Sani, O. G., Abbaspourazad, H., Wong, Y. T., Pesaran, B., and Shanechi, M. M. (2020). Mod-eling behaviorally relevant neural dynamics enabled by preferential subspace identication.
Nat Neurosci , pages 1–10.** This paper introduces a novel linear state-space model (PSID) that jointly models the neural activity and the recorded behavioural variables as generated by the behaviourally relevant and irrelevant latent dynamics. The authors show that PSID discovers significantly lower-dimensional latent dynamics than other models and is the only one that can capture reverse directional rotational dynamics in the primary motor cortex when the arm is returning versus when the arm is reaching.
Curr Opin Neurobiol , 46:120–126.[81] Schneidman, E., Berry, M. J., Segev, R., and Bialek, W. (2006). Weak pairwise correlationsimply strongly correlated network states in a neural population.
Nature , 440.[82] She, Q. and Wu, A. (2020). Neural dynamics discovery via gaussian process recurrent neuralnetworks. In
Uncertainty in Artificial Intelligence , pages 454–464. PMLR.* This paper extends RNN-based dynamical modelling approaches with a Gaussian Process observation model. They show that their more expressive observation model is able to extract insightful low-dimensional dynamics and better fit neural datasets.
Annu Rev Neurosci , 36.[84] Shlens, J., Field, G. D., Gauthier, J. L., Grivich, M. I., Petrusca, D., Sher, A., Litke, A. M., andChichilnisky, E. J. (2006). The structure of multi-neuron ring patterns in primate retina.
JNeurosci , 26(32):8254–66.[85] Smith, A. C. and Brown, E. N. (2003). Estimating a state-space model from point processobservations.
Neural Comput , 15(5):965–991.
[86] Spreemann, G., Dunn, B., Botnan, M. B., and Baas, N. A. (2018). Using persistent homology to reveal hidden covariates in systems governed by the kinetic Ising model.
Phys Rev E ,97(3):032313.[87] Stringer, C., Pachitariu, M., Steinmetz, N., Carandini, M., and Harris, K. D. (2019). High-dimensional geometry of population responses in visual cortex.
Nature , 571(7765):361–365.[88] Tkačik, G., Marre, O., Mora, T., Amodei, D., Berry, M. J., and Bialek, W. (2013). The simplestmaximum entropy model for collective behavior in a neural network.
J Stat Mech Theory Exp ,2013.[89] Trippe, B., Huggins, J., Agrawal, R., and Broderick, T. (2019). Lr-glm: High-dimensionalBayesian inference using low-rank data approximations. In
International Conference on Ma-chine Learning , pages 6315–6324. PMLR.[90] Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P., and Brown, E. N. (2005). A pointprocess framework for relating neural spiking activity to spiking history, neural ensemble,and extrinsic covariate eects.
J Neurophysiol , 93(2):1074–1089.[91] Tsodyks, M., Kenet, T., Grinvald, A., and Arieli, A. (1999). Linking spontaneous activity ofsingle cortical neurons and the underlying functional architecture.
Science , 286(5446):1943–1946.[92] Ventura, V. (2009). Traditional waveform based spike sorting yields biased rate code esti-mates.
Proc Natl Acad Sci , 106(17):6921–6926.[93] Ventura, V. and Gerkin, R. C. (2012). Accurately estimating neuronal correlation requiresa new spike-sorting paradigm.
Proc Natl Acad Sci , 109(19):7230–7235.[94] Verzelli, P. and Sacerdote, L. (2019). A study of dependency features of spike trains throughcopulas.
BioSystems , 184.[95] Wu, A., Roy, N. A., Keeley, S., and Pillow, J. W. (2017). Gaussian process based nonlinearlatent structure discovery in multivariate spike train data.
Adv Neur In , 30:3496.[96] Zoltowski, D., Pillow, J., and Linderman, S. (2020). A general recurrent state space frame-work for modeling neural dynamics during decision-making. In
International Conference onMachine Learning , pages 11680–11691. PMLR.[97] Zoltowski, D. M. and Pillow, J. W. (2018). Scaling the poisson glm to massive neural datasetsthrough polynomial approximations. In
Advances in Neural Information Processing Systems ,volume 31, page 3517. NIH Public Access.[98] Zylberberg, J., Cafaro, J., Turner, M. H., Shea-Brown, E., and Rieke, F. (2016). Direction-selective circuits shape noise to ensure a precise population code.
Neuron , 89.