Event-based update of synapses in voltage-based learning rules
Jonas Stapmanns, Jan Hahne, Moritz Helias, Matthias Bolten, Markus Diesmann, David Dahmen
Institute of Neuroscience and Medicine (INM-6), Institute for Advanced Simulation (IAS-6), and JARA Institute Brain Structure-Function Relationship (INM-10), Jülich Research Centre, Jülich, Germany
Institute for Theoretical Solid State Physics, RWTH Aachen University, Aachen, Germany
School of Mathematics and Natural Sciences, Bergische Universität Wuppertal, Wuppertal, Germany
Department of Physics, Faculty 1, RWTH Aachen University, Aachen, Germany
Department of Psychiatry, Psychotherapy and Psychosomatics, Medical Faculty, RWTH Aachen University, Aachen, Germany
Correspondence: Jonas Stapmanns, [email protected]
ABSTRACT
Due to the point-like nature of neuronal spiking, efficient neural network simulators often employ event-based simulation schemes for synapses. Yet many types of synaptic plasticity rely on the membrane potential of the postsynaptic cell as a third factor in addition to pre- and postsynaptic spike times. Synapses therefore require continuous information to update their strength, which a priori necessitates a continuous update in a time-driven manner. The latter hinders scaling of simulations to realistic cortical network sizes and relevant time scales for learning. Here, we derive two efficient algorithms for archiving postsynaptic membrane potentials, both compatible with modern simulation engines based on event-based synapse updates. We theoretically contrast the two algorithms with a time-driven synapse update scheme to analyze advantages in terms of memory and computations. We further present a reference implementation in the spiking neural network simulator NEST for two prototypical voltage-based plasticity rules: the Clopath rule and the Urbanczik-Senn rule. For both rules, the two event-based algorithms significantly outperform the time-driven scheme. Depending on the amount of data to be stored for plasticity, which heavily differs between the rules, a strong performance increase can be achieved by compressing or sampling of information on membrane potentials. Our results on computational efficiency related to archiving of information provide guiding principles for the future design of learning rules in order to make them practically usable in large-scale networks.
Keywords: voltage-based plasticity rules, event-based simulation, spiking neural network simulator, NEST, Clopath rule, Urbanczik-Senn rule
1 INTRODUCTION
One mechanism for learning in the brain is implemented by changing the strengths of connections between neurons, known as synaptic plasticity. Already early on, such plasticity was found to depend on the activity of the connected neurons. Donald Hebb postulated the principle 'Cells that fire together, wire together' (Hebb, 1949). Later on, it was shown that plasticity is shaped by temporal coordination of activities even down to the level of individual spikes (Markram et al., 1997; Bi and Poo, 1998). Synaptic plasticity rules for spiking neural networks, such as spike timing-dependent plasticity (STDP, Gerstner et al. (1996)), consequently employ spike times of pre- and postsynaptic cells to predict the change in connections.

In recent years, a new class of biologically inspired plasticity rules has been developed that take into account the membrane potential of the postsynaptic neuron as an additional factor (for a review, see Mayr and Partzsch, 2010; Gerstner et al., 2014). The rule by Clopath et al. (2010) can be seen as a prototypical example for a voltage-based plasticity rule since long-term potentiation of synapses depends on the presynaptic spike arrival and a filtered version of the postsynaptic membrane potential. This additional voltage dependence enables the Clopath rule to describe phenomena that are not covered by ordinary STDP but can be observed in experimental data, such as the complex frequency dependence of the synaptic weight changes in spike pairing experiments (Sjöström et al., 2001). Furthermore it provides a mechanism for the creation of strong bidirectional connections in networks, which have been found to be overrepresented in some cortical areas (Song et al., 2005).

Further inspiration for recently proposed plasticity rules originates in the field of artificial neural networks. The latter showed great success in the past decade, for example in image or speech recognition tasks (Hinton et al., 2006; Krizhevsky et al., 2012; Hannun et al., 2014; LeCun et al., 2015). The involved learning paradigms, for example the backpropagation algorithm (Werbos, 1974; Parker, 1985; LeCun, 1985; Rumelhart et al., 1986), are, however, often not compatible with biological constraints such as locality of information for weight updates. To bridge the gap to biology, different biologically inspired approximations and alternatives to the backpropagation algorithm have been proposed (Sacramento et al., 2018; Bellec et al., 2019). A common feature of these rules is that weight updates not only depend on the output activity of pre- and postsynaptic cells, but also on a third factor, which is a time-continuous signal. A prominent example of such biologically and functionally inspired rules is the voltage-based plasticity rule proposed by Urbanczik and Senn (2014), where the difference between somatic and dendritic membrane potential serves as an error signal that drives learning. This rule, incorporated in complex microcircuits of multi-compartment neurons, implements local error-backpropagation (Sacramento et al., 2018).

Research on functionally inspired learning rules in biological neural networks is often led by the requirement to implement a particular function rather than implementation efficiency. Present studies are therefore primarily designed to prove that networks with a proposed learning rule minimize a given objective function. Indeed many learning rules are rather simple to implement and to test in ad-hoc implementations where at any point the algorithm has access to all state variables. While the latter implementations are sufficient for a proof of principle, they are hard to reuse, reproduce and generalize. In particular, simulations are restricted to small network sizes, as the simulation code cannot be straightforwardly distributed across compute nodes and thus parallelized. This is, in particular, problematic given that successful learning requires extended simulation times compared to the update interval.

In parallel to the above efforts are long-term developments of simulation software for biological neural networks (Brette et al., 2007).
Such open-source software, combined with interfaces and simulator-independent languages (Davison et al., 2008; Djurfeldt et al., 2010, 2014), supports maintainability and reproducibility, as well as community-driven development. The design of such simulators is primarily led by implementation efficiency. Code is optimized for neuron and synapse dynamics, with the aim to upscale simulations to biologically realistic network sizes. A modular structure of the code facilitates reuse and extensions in functionality. Therefore, one aim of the community should be the transfer of ad-hoc proof-of-principle implementations to these well-tested platforms. Given the differences in design principles behind the exploratory development of specific models and general-purpose simulation technology, this transfer is not trivial. In the current study, we show how to make voltage-based learning rules compatible with spiking neural network simulators that employ an event-driven update scheme of synapses.

Modern network simulators use individual objects for different neurons and synapses, which allows the distribution of large networks across many compute processes (Jordan et al., 2018; Lytton et al., 2016). Spiking simulators in addition idealize spikes as point-like events. In the absence of gap junctions, there is no neuronal interaction in between such spike events, such that neuronal and synaptic state variables can be propagated independently in time. This led to the development of event-based simulation schemes, where synapses are only updated in their state at the times of incoming spikes (Morrison et al., 2005). Since spike events at single synapses are rare for physiological brain states, this significantly reduces the number of function calls to synapse code and optimizes computational performance in network simulations. Such an event-based scheme is perfectly suitable for plasticity rules like STDP, which rely on a comparison between two point-like events. However, in voltage-based learning rules, synapses continuously require information from the postsynaptic neurons in order to update their weights. This a priori breaks the idea behind an event-based update scheme.

In this study we present an efficient archiving of the history of postsynaptic state variables that allows for an event-based update of synapses and thus makes voltage-based plasticity rules compatible with state-of-the-art simulation technology for spiking neural networks. In particular, we derive two event-based algorithms that store a time-continuous or discontinuous history, respectively. The latter relies on a compression of information at the time of spike events and is beneficial for learning rules that require the storage of long histories. We theoretically analyze advantages of the two event-driven algorithms with respect to each other and compared to a straightforward time-driven algorithm.

The presented simulation concepts are exemplified and evaluated in a reference implementation in the open-source simulation code NEST (Gewaltig and Diesmann, 2007; Jordan et al., 2019). The reference implementation thereby exploits existing functionality of a scalable software platform which can be used on laptops as well as supercomputers. NEST is employed by a considerable user community and equipped with an interface to the programming language Python (Eppler et al., 2009), on which the field of computational neuroscience has agreed. It supports relevant neuron models and connection routines for the construction of complex networks. Despite this flexibility, the simulation engine shields the researcher from the difficulties of handling a model description in a distributed setting (Morrison et al., 2005; Plesser et al., 2015).

We here focus on the voltage-based plasticity rules by Clopath et al. (2010) and Urbanczik and Senn (2014). The two rules represent opposing ends of a family of learning rules in the amount of data required to compute weight updates. The Clopath rule by design only triggers plasticity in the vicinity of postsynaptic spike events; storing a history, which is non-continuous in time, thus becomes beneficial. In contrast, the Urbanczik-Senn rule considers noisy prediction errors based on postsynaptic membrane voltages and spikes. Such prediction errors never vanish and therefore always need to be stored to update the weights, leading to time-continuous histories. For a given span of biological time, simulations of the Urbanczik-Senn rule are therefore by design less efficient than those of the Clopath rule. However, we show that a compression of membrane potential information reduces this performance gap. Changing the learning rule to include a sparse sampling of the membrane voltage further increases efficiency and makes performance comparable to simulations with ordinary STDP.

Our study begins with a specification of the mathematical form of the learning rules that we consider and a comparison between voltage-based rules and ordinary STDP (Section 2.1). Subsequently, we review benefits of time- and event-driven schemes for updating neurons and synapses, respectively (Section 2.2), and present two different algorithms for archiving histories of postsynaptic state variables (Section 2.3). In Section 2.4 we detail the reference implementation of the algorithms in NEST, and evaluate their performance in Section 3. There we discuss the Clopath rule (Section 3.1) and the Urbanczik-Senn rule (Section 3.2) in terms of implementation, reproduction of results from the literature, and performance. Conclusions from the implementation of the two rules are drawn in Section 3.3, followed by a general Discussion in Section 4. The technology described in the present article will be made available with the next major releases of the simulation software NEST as open source. The conceptual and algorithmic work is a module in our long-term collaborative project to provide the technology for neural systems simulations (Gewaltig and Diesmann, 2007).
2 METHODS

2.1 Voltage-based plasticity rules

The focus of this study is on plasticity models of the general form

  dW_ij(t)/dt = F(W_ij(t), s*_i(t), s*_j(t), V*_i(t)),    (1)

where the change dW_ij(t)/dt of the synaptic weight W_ij between the presynaptic neuron j and the postsynaptic neuron i is given by a function F that potentially depends on the current synaptic weight W_ij(t), as well as on s*_i(t), s*_j(t), V*_i(t), which are causal functionals of the postsynaptic spike train s_i, the presynaptic spike train s_j, and the postsynaptic membrane potential V_i, respectively (Figure 1). Note that for simplicity of the notation, we only show one function F on the right-hand side of (1), while generally there could be a sum of multiple functions or functionals F_α, where each one depends on spike trains and membrane potentials in a different manner. Note also that F mixes information of pre- and postsynaptic neurons, while the functionals denoted by * only need to take into account information of either the pre- or the postsynaptic neuron. In cases where F is a functional, it can take into account an additional joint history dependence on s*_i, s*_j and V*_i. A special case, the Urbanczik-Senn learning rule, is discussed in the results.

Figure 1. Voltage-based plasticity rules. The change ΔW_ij in synaptic strength between presynaptic neuron j and postsynaptic neuron i depends on the presynaptic spike train s_j, the postsynaptic spike train s_i and the postsynaptic membrane potential V_i.

One can formally integrate (1) to obtain the weight change between two arbitrary time points t and T,

  ΔW_ij(t, T) = ∫_t^T dt' F(W_ij(t'), s*_i(t'), s*_j(t'), V*_i(t')).    (2)

In general, the integral on the right-hand side of the equation cannot be calculated analytically. There is, however, a notable exception, which is the model of spike-timing dependent plasticity (STDP). This model is a form of Hebbian plasticity that relies on the exact spike times of pre- and postsynaptic neurons and ignores any effect of the postsynaptic membrane potential. The dependence on the exact spike times becomes apparent by the fact that either the pre- or postsynaptic spike functional is the spike train itself, for example

  s*_i(t) = s_i(t) = Σ_k δ(t − t_i^k),    (3)

where t_i^k is the k-th spike of the i-th neuron. This yields a plasticity rule that reads (Morrison et al., 2008)

  dW_ij(t)/dt = −f_−(W_ij(t)) s*_{−,i}(t) s_j(t) + f_+(W_ij(t)) s*_{+,j}(t) s_i(t),    (4)

with functions f_± that model the weight dependence, and functionals s*_±(t) = (κ_± ∗ s)(t) with exponential kernels κ_±. The appearance of the raw spike trains (delta distributions) in the differential equation of the STDP model renders the integration of the ODE trivial,

  ΔW_ij(t, T) = −Σ_{spikes k} f_−(W_ij(t_j^k)) s*_{−,i}(t_j^k) + Σ_{spikes l} f_+(W_ij(t_i^l)) s*_{+,j}(t_i^l),    (5)

where t_j^k, t_i^l ∈ [t, T]. An update of the synaptic weight between any two time points only requires knowledge of the weight and spike functionals at the timing of the pre- and postsynaptic spikes.

For models that do not solely rely on exact spike times, but for example on filtered versions of the spike trains, much more information is needed in order to calculate a weight update ΔW_ij(t, T) between any two time points. This makes the computation more involved: the synapse needs all values of W_ij(t'), s*_i(t'), s*_j(t'), V*_i(t') for t' ∈ [t, T] to update its weight. The remainder of this study describes different approaches to this problem and their advantages and disadvantages.

2.2 Time-driven vs event-driven update schemes

Modern neural network simulators have a modular code structure with individual objects for each synapse and neuron. Given this modularity, the question arises when in a simulation a specific part of the code needs to be executed. Spikes at individual synapses are rare (Figure 2A). Only at these points in time does the synaptic weight need to be known, as any weight dynamics in between two spikes is invisible to the rest of the network. This suggests to execute synapse code only when an actual event, i.e. a spike, is happening. The speedup that is achieved by this rare code execution becomes evident by considering that each neuron can have up to thousands of synapses in cortical networks, which for large networks quickly amounts to an immense number of synapses in total. The large in-degree of individual neurons, however, changes the picture for neurons: they each receive a large number of spikes in rapid succession (Figure 2B). This suggests a time-driven update of neurons.
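To make the contrast concrete, the following minimal sketch shows an event-driven STDP synapse in the spirit of (5). It is illustrative C++ only, not the NEST sources; the struct name, the archive passed as a plain vector of postsynaptic spike times (assumed sorted in ascending order), and all parameter values are hypothetical, and the weight dependence f_± is reduced to constant prefactors.

#include <cmath>
#include <vector>

struct STDPSynapseSketch {
  double w = 1.0;        // synaptic weight W_ij
  double t_last = 0.0;   // time of the last presynaptic spike
  double x_pre = 0.0;    // presynaptic trace s*_{+,j}
  double tau_plus = 20.0, tau_minus = 20.0, A_plus = 0.01, A_minus = 0.012;

  // postsynaptic trace s*_{-,i}(t), rebuilt from the archived spike times
  static double post_trace(const std::vector<double>& post_spikes, double t,
                           double tau) {
    double s = 0.0;
    for (double ts : post_spikes)
      if (ts <= t) s += std::exp(-(t - ts) / tau);
    return s;
  }

  // called only when a presynaptic spike at time t crosses the synapse
  void on_pre_spike(double t, const std::vector<double>& post_spikes) {
    double t_prev = t_last;
    for (double ts : post_spikes) {        // facilitation at post spikes
      if (ts <= t_last || ts > t) continue;
      x_pre *= std::exp(-(ts - t_prev) / tau_plus);
      w += A_plus * x_pre;                 // f_+ contribution of (5)
      t_prev = ts;
    }
    x_pre *= std::exp(-(t - t_prev) / tau_plus);
    w -= A_minus * post_trace(post_spikes, t, tau_minus);  // f_- of (5)
    x_pre += 1.0;                          // trace of the current spike
    t_last = t;
  }
};

The essential point is that the synapse code is touched only at presynaptic spikes; between events, nothing about the synapse needs to be computed.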
Figure 2. Update schemes for neurons and synapses. A A spike crosses a synapse from the presynaptic (pre) to the postsynaptic (post) neuron. Since this is a rare event, the synaptic weight is computed only when the send() function of the synapse is called (event-driven update). B Neurons with a large in-degree receive spikes in most of the time steps, which suggests a time-driven update of the neuron's state. C Since the computation of the synaptic weights requires information from the postsynaptic neuron, storing the synapses on the same compute node reduces the amount of expensive communication between compute processes.

As a consequence, the membrane potential in many simulators is computed at each simulation step t_α = α · h, where h is the simulation stepsize and α ∈ ℕ. For plasticity models that rely on the membrane potential, the time discretization of (2) therefore yields

  ΔW_ij(t, T) = Σ_{steps α} ΔW_ij(t_α, t_{α+1}),    (6)

  ΔW_ij(t_α, t_{α+1}) = ∫_{t_α}^{t_{α+1}} dt' F(W_ij(t'), s*_i(t'), s*_j(t'), V*_i(t_α)),    (7)

which, in comparison to the small sum over spikes in the STDP rule (5), contains a large sum over all time steps t_α in between time points t and T. Here, the membrane potential enters in a piecewise constant manner – hence the argument V*_i(t_α). The synapse therefore predominantly needs information of the postsynaptic neuron in order to update its weight.
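As an illustration, a minimal sketch of the time-driven scheme (6)-(7); all names are hypothetical and the member dW() is a placeholder for the rule-specific integrand F.

#include <cstddef>
#include <vector>

struct TimeDrivenSynapse {
  double w = 1.0;
  // placeholder for the rule-specific integrand F of (7) over one step h
  double dW(double s_pre, double v_post, double h) const {
    return h * s_pre * v_post;
  }
};

// called once per step t_alpha; cost C = K * T over the whole simulation
void update_all(std::vector<TimeDrivenSynapse>& syn,
                const std::vector<double>& s_pre, double v_post, double h) {
  for (std::size_t j = 0; j < syn.size(); ++j)
    syn[j].w += syn[j].dW(s_pre[j], v_post, h);
}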
Thus, in a distributed simulation framework, where neurons are split across multiple compute processes, it is beneficial to store the synapses at the site of the postsynaptic neurons in order to reduce communication (Figure 2C). This confirms the earlier design decision of Morrison et al. (2005), who place synapses at the site of the postsynaptic neuron to reduce the amount of data communicated by the presynaptic site.

If weight changes ΔW_ij depend on the synaptic weight themselves, then (7) cannot be used in practice, as the intermediate weights W_ij(t') for t_α < t' < t_{α+1} are not known. In this scenario, weight changes have to be calculated on the simulation grid with W_ij(t') → W_ij(t_α) in case of a forward Euler scheme, or W_ij(t') → W_ij(t_{α+1}) in case of a backward Euler scheme. In the following we, for simplicity, stick to the forward Euler setting and arrive at the core computation for voltage-based plasticity rules,

  ΔW_ij(t_α, t_{α+1}) = ∫_{t_α}^{t_{α+1}} dt' F(W_ij(t_α), s*_i(t'), s*_j(t'), V*_i(t_α)).    (8)

Given that s_i and s_j are spike trains, the functionals s*_i and s*_j are obtained trivially from their corresponding convolution kernels. If F in addition does not depend on s*_i and s*_j in a too complicated manner, which is usually the case (see examples in Section 3), the integral in (8) can be calculated analytically.

Let us assume in the following that t and T denote two consecutive spike times of the presynaptic neuron j. In this case, the synaptic weight corresponding to the spike at time T can be obtained from the weight W_ij(t) at the time of the previous spike at t and (6), by employing (8) to calculate the latter. As F mixes information of the pre- and postsynaptic neurons, this computation should be done in the synapse. Since there are no spikes in between t and T, it does not matter when the synapse performs the updates of its weight. Two possibilities are: 1) Neurons calculate their own s* and V* for the current time step and make it accessible to the synapse to enable direct readout and update according to (8) in every time step. This method corresponds to a time-driven update of synapses (Figure 3A). 2) Neurons store a history of s* and V* and the synapse reads out this information at T, i.e. at the time where the weight update becomes relevant for the network. This method corresponds to an event-driven update of synapses (Figure 3B). Both methods have their advantages and disadvantages, analyzed in the following section.

                                 time-driven | event-driven | event-driven & compression
  history length L                    1      |      I       |          i
  synapse function calls M          K · T    |   K · T/I    |       K · T/I
  weight change computations C      K · T    |    K · T     |          T
  history entry manipulations H       T      |      T       |     K · T/I · i

Table 1. Comparison of synapse update schemes. From the viewpoint of a postsynaptic neuron, the table shows the maximal length of the history L, the number of function calls M of synapse code, the number of computations C of infinitesimal weight changes ΔW_ij(t_α, t_{α+1}), and the number of history entry manipulations H, for a simulation of T time steps, a uniform inter-spike interval I between spikes of a single presynaptic neuron, an in-degree K for each neuron, and no delays. For the event-driven compression scheme the entries show the length of the compressed history, where i is the number of different spike times within the last inter-spike interval I.
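A minimal sketch of possibility 2), with hypothetical types: on a presynaptic spike at step T, the synapse integrates the archived V* values from its previous spike onwards, realizing (6) with (8) in an event-driven way.

#include <deque>

struct HistEntry { long step; double v_star; };

struct EventDrivenSynapse {
  double w = 1.0;
  long last_step = 0;   // previous presynaptic spike of this synapse
  void send(long step, const std::deque<HistEntry>& history, double h) {
    for (const HistEntry& e : history)
      if (e.step >= last_step && e.step < step)
        w += h * e.v_star;   // placeholder for the integrand F of (8)
    last_step = step;
  }
};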
2.3 Simulation concepts

Figure 3. Simulation concepts. A In the time-driven update scheme the synaptic weight change is evaluated in every time step of the simulation for all the synapses. Thus, the postsynaptic quantities, e.g. the membrane potential V_m,post, are processed immediately. No storage of information is needed and the weight is always available if a spike is sent from a presynaptic (pre) to a postsynaptic (post) neuron using the send()-function of the synapse. B In the event-driven update scheme the computation of the synaptic weight change is performed only if a spike crosses the synapse. Therefore, the postsynaptic neuron needs to store the time trace of V_m,post in a buffer (history, green). Whenever a spike is delivered, the new synaptic weight is computed inside the send()-function. To this end the synapse reads out the part of the history that corresponds to the period from its last spike to the current one. C In the compressed event-driven update scheme a synapse processes the history from the last spike that arrived at the postsynaptic neuron from any synapse to the current spike, e.g. from the first to the second (yellow) or from the second to the third (green). The result (red and blue, respectively) is used to update the weight changes of all synapses, which are stored in a buffer called compressed history. After this operation, the active synapse reads out its weight change. Within this scheme the postsynaptic neuron needs to record V_m,post from the time of the last incoming spike.

Figure 4. Illustration of buffer sizes for different simulation schemes in case of fully synchronous or asynchronous spikes. A All incoming spikes arrive synchronously: In the time-driven scheme the synaptic weight is updated in every time step of the simulation, so that only the current value of V_m,post needs to be available (green). In the event-driven scheme every synapse processes V_m,post from the last spike to the current one. Therefore, the relevant time trace needs to be stored (yellow). In the compressed event-driven scheme this part of V_m,post is processed only once and used to update the weights of all the synapses. Since the weight change is a function of the last spike time, which is the same for all the synapses, only one value needs to be updated (red). In this situation the length L of the compressed history is i = 1, see Table 1. B All incoming spikes arrive in different time bins: For the time-driven and the event-driven scheme the scenario is similar to panel A. For the compressed event-driven scheme the number of values that need to be updated equals the number of incoming synapses K, so that i = K.

2.3.1 Time-driven update

An event-based update of synapses requires each neuron to store its membrane potential at the resolution of the simulation time step. This amounts to an increase in memory consumption as opposed to simulations without voltage-based plasticity. A time-driven update circumvents this problem, as the information on the membrane potential is directly processed by the synapses such that only the current value of the membrane potential needs to be stored, corresponding to a membrane potential history of length L = 1 (Figure 4 and Table 1). The price for this is that synapses need to be updated as often as neurons. For a simulation of T time steps this amounts to M = K · T function calls to synapse code for each neuron. Here K denotes the in-degree, i.e. the number of incoming connections per neuron. Each function call of synapse code causes a single computation of ΔW_ij(t_α, t_{α+1}), giving rise to in total C = K · T computations per neuron. The membrane potential trace is thus effectively integrated K times; once for each synapse. As both K and T are large numbers in typical simulations of plastic cortical networks, the number of function calls and computations is therefore large in this setting. Another disadvantage is that the synapse is updated also at time steps where s*_i, s*_j, V*_i take values for which ΔW_ij(t_α, t_{α+1}) = 0, i.e. where no update is required. In addition, for delayed connections a history of V*_i of length L = d_max of the maximal delay d_max measured in simulation steps needs to be stored. We here assume the delay to be on the postsynaptic side; it represents the time the fluctuations of the somatic membrane potential take to propagate back through the dendrites to the synapses. Therefore, F does not depend on V*_i(t), but on V*_i(t − d_j), with a delay d_j encoding the location of the synapse with presynaptic neuron j.

2.3.2 Event-driven update

In an event-driven update scheme of synapses, the time trace of the membrane potential V*_i needs to be stored until all presynaptic synapses have read out the information to update their weight for a given period. The storage and management of such a history can be expensive in terms of memory and runtime. Assuming for simplicity a homogeneous inter-spike interval of I time steps between consecutive spikes of single neurons, we in the following showcase some qualitative history sizes. As synapses need all values of V*_i in between two consecutive spikes, the maximum history length is L = I (Figure 4). In case of different firing rates, I corresponds to the maximum inter-spike interval of any of the presynaptic neurons.
Synapse code in this scheme is, however, only called in the event of a spike, leading to only M = K · T/I function calls per neuron, where T/I is the number of spikes passing a single synapse during simulation time T. The total amount of computations C of weight changes ΔW_ij(t_α, t_{α+1}) is of course unchanged with respect to the time-driven scheme; they are just split across fewer function calls (C = M · L = K · T). Table 1 immediately shows the trade-off between memory consumption (length of history) and run time (number of function calls): the event-based scheme consumes more memory, but is faster than the time-driven scheme. Note that since a history of the membrane potential is stored anyway, this scheme is naturally applicable to connections with different delays. A further performance increase can be achieved in plasticity rules where weight changes only happen under certain conditions on V*_i: if values ΔW_ij(t_α, t_{α+1}) ≠ 0 are rare, a non-continuous history can be stored. In such a scenario, time stamps need to be stored alongside the membrane potential to enable synapses to read out the correct time intervals (see Section 3.1).

2.3.3 Event-driven update with compression

The major operation of the plasticity scheme in terms of frequency and complexity is the computation of infinitesimal weight changes ΔW_ij(t_α, t_{α+1}). Since the presynaptic spike train s*_j enters F in (8), the same postsynaptic information on s*_i and V*_i is used many times for very similar computations: the membrane potential trace of each neuron is effectively integrated K times. Is there a way to employ the result of the computation ΔW_ij(t_α, t_{α+1}) for neuron j for the computations ΔW_ik(t_α, t_{α+1}) for other neurons k ≠ j? In a simple setting where F factorizes into

  F(W_ij(t), s*_i(t), s*_j(t), V*_i(t)) = s*_j(t) G(s*_i(t), V*_i(t)) with s*_j(t) = (κ ∗ s_j)(t) and κ(t) = H(t) (1/τ) e^{−t/τ},    (9)

defined via the Heaviside step function H(x), we can make use of the property

  s*_j(t) = (s*_j(t_LS) + τ^{−1}) e^{−(t − t_LS)/τ},    (10)

where t > t_LS and t_LS denotes the last spike time of the presynaptic neuron. In this case the weight update factorizes,

  ΔW_ij(t_LS, T) = (s*_j(t_LS) + τ^{−1}) · ∫_{t_LS}^{T} dt' e^{−(t' − t_LS)/τ} G(s*_i(t'), V*_i(t')),    (11)

where we abbreviate the first factor as x̄_j(t_LS) := s*_j(t_LS) + τ^{−1} and the integral as ΔW_i(t_LS, T); the latter factor ΔW_i(t_LS, T) is independent of the presynaptic spike train s*_j. This opens the possibility to directly compute and store ΔW_i(t_LS, T) at the postsynaptic neuron i. The advantage is that the integration in (11) only needs to be performed once. At the time of an incoming spike event, ΔW_i(t_LS, T) can be read out by the synapse for the correct t_LS of that synapse and combined with the stored presynaptic spike trace x̄_j(t_LS).

This scheme a priori amounts to the same number of function calls to synapse code (M = K · T/I) and the same maximum history length (L = I) as in the ordinary event-driven scheme. For each time step, ΔW_i(t_α, t_{α+1}) is computed once in the neuron rather than in the synapse, but its value is used to update all entries in the history ΔW_i(t_LS, T). The latter operation is costly, as the history is potentially long (L = I). This can be improved by noting that ΔW_i(t_LS, T) does not need to be stored for all t_LS ∈ [T − I, T], but only for those times where presynaptic neurons fired: for in-degree K < I or synchronous spiking, the number of time bins with spikes can be much lower than I (Figure 4). Storing only relevant ΔW_i(t_LS, T) therefore requires knowledge of the postsynaptic neuron on the last spike times of all its presynaptic neurons.

The time-driven update of the history can be avoided by updating the history ΔW_i(t_LS, T) in an event-driven manner (Figure 3C): In between any two incoming spikes at T_1 and T_2, the postsynaptic neuron stores V*_i(t). At the time T_2 of the new spike, the stored V*_i(t) is read out and integrated for the range [T_1, T_2),

  δW_i(T_1, T_2) = ∫_{T_1}^{T_2} dt' e^{−t'/τ} G(s*_i(t'), V*_i(t')),    (12)

such that all entries in the history ΔW_i(t_LS, T) can be updated as ΔW_i(t_LS, T_2) = ΔW_i(t_LS, T_1) + e^{t_LS/τ} δW_i(T_1, T_2). To avoid too large or small arguments of exponentials in numerical implementations, one equivalently calculates δ̄W_i(T_1, T_2) = ∫_0^{T_2 − T_1} dt' e^{−t'/τ} G(s*_i(t' + T_1), V*_i(t' + T_1)), such that δW_i(T_1, T_2) = e^{−T_1/τ} δ̄W_i(T_1, T_2) and ΔW_i(t_LS, T_2) = ΔW_i(t_LS, T_1) + e^{−(T_1 − t_LS)/τ} δ̄W_i(T_1, T_2). Afterwards the history of V*_i(t) is deleted. This way the latter history is always short, as the total rate of incoming spikes is high in physiological network states.

This event-driven compression amounts to a compressed history of length L = i, where i is the number of different spike times within the last inter-spike interval I. It is consequently never larger than the history length L = I of the ordinary event-driven scheme. Still, synapse code is executed at every spike event, giving rise to M = K · T/I function calls. For synchronous spikes, the history, however, only needs to be updated once. The membrane potential trace is therefore effectively only integrated once, amounting to C = T infinitesimal weight change computations (Table 1). The price for this is that history updates are more expensive: instead of appending a single entry in each time step, at each spike event the full history is updated, giving rise to in total H = K · T/I · i history entry manipulations, as opposed to H = T in the time-driven and ordinary event-driven schemes (Table 1). In practice, infinitesimal weight change computations are, however, often more costly than history updates, such that there is a performance increase achieved by the compression algorithm (see Section 3.2).

Finally, a drawback of the event-driven compression is that it relies on the fact that all synapses use the same processed membrane potential V*_i. For distributed delays, ΔW_i(t_LS, T) has a dependence on the presynaptic neuron j via V*_i(t − d_j). In this case, a separate compressed history needs to be stored for every different delay of connections to the neuron.

The considerations above illustrate conceptual differences between the schemes rather than quantitative bounds for efficiency. The remainder of the work describes reference implementations of the three schemes and examines their performance for two example plasticity rules from the literature.
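The update logic of the compression scheme can be condensed into a short sketch (hypothetical names, assuming the factorized form (9), delays equal to the resolution h, and the G values buffered on the simulation grid since the previous incoming spike at T_1):

#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

struct CompressedArchive {
  double tau = 20.0;   // kernel time constant of (9) [ms]
  double h = 0.1;      // simulation resolution [ms]
  std::vector<double> g_vals;    // G(s*_i, V*_i) sampled since T1
  std::map<double, double> dW;   // t_LS -> DeltaW_i(t_LS, now)

  // called at an incoming spike at T2; T1 is the previous incoming spike
  void compress_history(double T1, double T2) {
    double d_bar = 0.0;          // the quantity written above as d-bar W_i(T1, T2)
    for (std::size_t k = 0; k < g_vals.size(); ++k)
      d_bar += h * std::exp(-(static_cast<double>(k) * h) / tau) * g_vals[k];
    for (auto& entry : dW)       // one manipulation per stored t_LS
      entry.second += std::exp(-(T1 - entry.first) / tau) * d_bar;
    g_vals.clear();              // the V* history can now be deleted
  }
};

Note how the decay factor exp(−(T_1 − t_LS)/τ) keeps all exponential arguments small, in line with the numerically stable formulation given above.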
2.4 Reference implementation in network simulator with event-based synapse updates

This section describes the implementation of two example voltage-based plasticity rules by Clopath et al. (2010) and Urbanczik and Senn (2014) in a spiking neural network simulator that employs a time-driven update of neurons and an event-based update of synapses. While the naming conventions refer to our reference implementation in the simulation software NEST, the algorithms and concepts presented below are portable to other parallel spiking network simulators.

The Clopath and Urbanczik-Senn rules are chosen as widely used prototypical models of voltage-based plasticity. The differences between the two rules help to exemplify the advantages and disadvantages of the algorithms discussed in Section 2.3. As originally proposed, they are implemented here for two different types of neuron models: Ad-ex/Hodgkin-Huxley point neurons for the Clopath rule (aeif_psc_delta_clopath / hh_psc_alpha_clopath) and two-compartment Poisson neurons (pp_cond_exp_mc_urbanczik) for the Urbanczik-Senn rule. The rules differ in the state variable that is being stored and its interpretation. For the Clopath rule, the stored variable is a thresholded and filtered version of the membrane potential that takes into account characteristics of the membrane potential evolution within cells in the vicinity of spike events. The latter restriction to temporal periods around spikes suggests to implement a history that is non-continuous in time. In contrast, the Urbanczik-Senn rule uses the dendritic membrane potential to predict the somatic spiking; the resulting difference is taken as an error signal that drives learning. This error signal never vanishes and thus needs to be stored in a time-continuous history.

Finally, the proposed infrastructure for storing both continuous and non-continuous histories is generic, so that it can also be used and extended to store other types of signals such as external teacher signals. Details on the Clopath and Urbanczik-Senn rules are discussed in Section 3.

The implementation of voltage-based plasticity rules in NEST follows the modular structure of NEST, a key part of which is the separation between neuron and synapse models. This separation makes it easy for a newly added neuron model to be compatible with existing synapse models and vice versa. A downside is that information, such as values of parameters and state variables, is encapsulated within the respective objects. Simulations in NEST employ a hybrid parallelization scheme: OpenMP threads are used for intra-node parallelization and the Message Passing Interface (MPI) for inter-node communication. In parallel simulations, synapses are located at the same MPI process as the postsynaptic neurons (Morrison et al., 2005).
Thereby, no communication between MPI processes is needed for the required transfer of information between postsynaptic neurons and synapses to compute weight changes of connections, and only one spike needs to be communicated by a given source neuron for all target neurons living on the same MPI process.

The model of STDP requires synapses to access spike times of postsynaptic neurons. In order to provide a standardized transfer of this information across all neuron models that support STDP, in recent years the so-called Archiving_Node has been introduced as a parent class of the respective neuron models (Morrison et al., 2007a). It provides member functions to store and access spike histories. If a neuron model supports STDP, it only needs to be a child of Archiving_Node and contain one additional line of code, namely a call of the function set_spiketime(), which stores the time of outgoing spike events. We here extended this framework for voltage-based plasticity rules and enhanced the functionality of the archiving node by the member functions write_history() and get_history() to additionally store voltage traces or other continuous signals. To avoid overhead for simulations with only STDP synapses, we introduced two child classes of Archiving_Node, Clopath_Archiving_Node and Urbanczik_Archiving_Node, that each provide containers and functions for the specific histories required for the two plasticity rules. Neuron models that support the respective synapse model then derive from the child classes instead of the root-level archiving node.

One complication arises if the calculation of the processed membrane potential V* inside the archiving node requires parameters or functions that are specific to the neuron model. One example is the gain function φ that translates the membrane potential into a firing rate in case of the Poisson neuron model used in Urbanczik and Senn (2014) and here to demonstrate the Urbanczik-Senn rule: φ as well as its parameters have to be specified for the neuron, but they also enter the plasticity rule through V*. Creating an additional helper class (pp_urbanczik_parameters) as a template argument for the corresponding archiving node (Urbanczik_Archiving_Node) and neuron model (pp_cond_exp_mc_urbanczik) solves this problem (Figure 5): it contains all parameters and functions required by both classes.
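The pattern can be sketched as follows; the concrete sigmoidal gain function and all parameter values shown here are illustrative assumptions and do not reproduce the actual NEST sources:

#include <cmath>

struct pp_urbanczik_parameters_sketch {
  // illustrative parameter values only
  double phi_max = 0.15, k = 0.5, beta = 1.0 / 3.0, theta = -55.0;
  double phi(double u) const {  // membrane potential -> firing rate
    return phi_max / (1.0 + k * std::exp(beta * (theta - u)));
  }
};

// the archiving node receives the neuron-specific helper as a template
// argument and can thus evaluate phi without knowing the neuron model
template <class Params>
struct urbanczik_archive_sketch {
  const Params* p = nullptr;  // set by the neuron that owns the archive
  double prediction_error(double spike_indicator, double v_dend) const {
    return spike_indicator - p->phi(v_dend);  // enters V* of the rule
  }
};

Because both the neuron model and the archiving node see the same Params type, φ and its parameters are defined exactly once.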
All synapses implemented in NEST are so far purely event-driven. To assess the performance of the time-driven update scheme of synapses with voltage-based plasticity, we also implemented a time-driven version of the Clopath and Urbanczik-Senn synapses. Spiking network simulators exploit the delay of connections to reduce communication between compute processes: for the period of the minimal delay, neurons are decoupled and state variables can be propagated forward in time independently of each other. Spikes are therefore buffered and sent at the end of a min_delay period, and neuronal updates are batched for all time steps within one min_delay. We implemented the same min_delay update scheme for synapses, by imposing a function call to time-driven synapses in every min_delay period to update their synaptic weight. If min_delay equals the simulation step size h, this scheme corresponds to the scheme explained in Section 2.3.1. Making use of the min_delay infrastructure in NEST speeds up simulations with time-driven synapses in the case d > h, as fewer function calls to synapses are needed (see Section 3). In case of simulations with synaptic delays, the time-driven update scheme requires the storage of a history of the membrane potential of length max_delay.

Storing state variables in event-driven schemes is more complex, as the history does not have a fixed length max_delay. Instead it needs to be dynamically extended and shortened. A long history can occupy a large amount of memory and its processing by the synapses becomes computationally expensive. Therefore, it is advantageous to optimize the way information is stored and accessed, and how entries that are no longer needed can be identified for deletion.
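A sketch of the batched update (hypothetical interface): one function call advances the weight over all steps of a min_delay period, with the V* values buffered by the postsynaptic neuron.

#include <vector>

struct TimeDrivenSynapseBatch {
  double w = 1.0;
  // advance the weight over one min_delay period; v_star holds one value
  // of the processed membrane potential per simulation step of size h
  void update_batch(const std::vector<double>& v_star, double h) {
    for (double v : v_star)
      w += h * v;   // placeholder for the rule-specific integrand F
  }
};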
Figure 5. Class diagram of NEST classes and functions. Simplified class diagram for embedding the Clopath (left) and Urbanczik-Senn rule (right) in the NEST infrastructure. The code is distributed across the nestkernel and the neuron models. The nestkernel contains the base class Node of all neuron models. Models that support ordinary STDP are derived from the Archiving_Node; models that can use the Clopath synapse (aeif_psc_delta_clopath and hh_psc_alpha_clopath) or the Urbanczik-Senn synapse (pp_cond_exp_mc_urbanczik) are derived from the Clopath_Archiving_Node or the Urbanczik_Archiving_Node, respectively. The latter add the required functions for storing and managing the history of continuous quantities. The model pp_cond_exp_mc_urbanczik requires a helping class pp_urbanczik_parameters because the Urbanczik_Archiving_Node needs to access functions and parameters that are specific to the neuron model; to keep the implementation of the Urbanczik_Archiving_Node more general, these are therefore not located in the archiving node itself.
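In strongly simplified form, the hierarchy of Figure 5 corresponds to the following skeleton (illustrative only; the real classes carry many more members and, in case of the Urbanczik archiving node, the template parameter for the helper class discussed above):

struct Node { virtual ~Node() = default; };
struct Archiving_Node : Node { /* spike history, set_spiketime() */ };
struct Clopath_Archiving_Node : Archiving_Node { /* LTD/LTP histories */ };
struct Urbanczik_Archiving_Node : Archiving_Node { /* V* history, compression */ };
struct aeif_psc_delta_clopath : Clopath_Archiving_Node { /* update() */ };
struct pp_cond_exp_mc_urbanczik : Urbanczik_Archiving_Node { /* update() */ };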
There are three points that need to be considered in the context of history management: firstly, which information needs to be stored; secondly, how to search and read out the history; thirdly, how to identify and remove information that is no longer needed. The first and third point mainly affect memory usage, the second the simulation time, as searching in shorter histories is faster.
There are four different histories to which our considerations apply: the one to store the membrane potential V*_i, the compressed history ΔW_i(t_LS, T) used only for the compressed event-driven scheme, the history to store the spike times s_i of the postsynaptic neuron (also used for ordinary STDP), and finally one might need a history that stores the last spike time for every incoming synapse (see below for details).

This paragraph concerns only the history that stores the time trace of V*_i. In every time step of the simulation, neurons call the protected function write_history() of the archiving node and pass the current value of the (low-pass filtered) membrane potential. The archiving node then computes the derived quantities V*_i and potentially combinations of V*_i and s*_i, and saves them in the history, which is of type vector. It is more efficient to do the computations inside the archiving node and not in the synapse for two reasons: firstly, the computation is done once for all incoming synapses, and secondly, if the number of derived quantities is smaller than that of the bare quantities, less memory is used to store the history.

Let us assume that a synapse requests a part from t_1 to t_2 > t_1 of the history that ranges from t_start < t_1 to t_end > t_2. In case every time step of the simulation adds a new entry to the history, one can easily compute the positions of the entries corresponding to t_{1/2} by just knowing t_start and t_end. We will learn in Section 3.2 that this is the case for the Urbanczik-Senn plasticity rule. If the history is not continuous in time, like in case of the Clopath rule, this scheme is not applicable. Instead, we add a time stamp s as an additional variable to each entry and search for those with the smallest/greatest s within the interval (t_1, t_2) using e.g. a linear or a binary search. Searching for the positions that define the start and the end of the requested interval is slower than computing them directly. Nevertheless, a non-continuous history can be advantageous, as we will see in case of the Clopath rule (Section 3.1). Here, only values of the membrane potential in the vicinity of a spike of the postsynaptic neuron are needed, so that neglecting the majority of values in between leads to a non-continuous history but saves memory.

Technically, the archiving node contains a function called get_history() which expects two iterators start and finish and two times t_1 and t_2. When executed, the function sets the iterators to point to the entries corresponding to t_1 and t_2, respectively. In the event-driven scheme this is used by the synapse, which provides the time of the last spike t_1 = t_LS − d and that of the current spike t_2 = t_S − d, to set two pointers to the correct position of the history of the postsynaptic neuron. Here, d accounts for the delay; the information of the postsynaptic neuron is shifted by d backwards in time at the synapse. Having received the correct position of the pointers, the synapse evaluates the integral (6). In the event-driven compression scheme, the integration (12) is not done inside the synapse but inside the archiving_node. The reason for this is that the compressed history ΔW_i(t_LS, T), which is updated in case of an incoming spike, is stored inside the archiving_node. This way no exchange of information is needed. The synapse only triggers the updating process by calling the function compress_history() of the archiving_node. Internally, the archiving_node can use get_history() to obtain the part of the history that has to be integrated.
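A sketch of this access pattern with simplified signatures (the actual NEST interface differs in details); the linear search also increments the access counters used for deletion, as explained below:

#include <deque>

struct histentry_sketch { double t; double v_star; unsigned access_counter; };

using hist_iter = std::deque<histentry_sketch>::iterator;

// sets *start/*finish to delimit entries with t1 < t <= t2 (linear search)
void get_history(std::deque<histentry_sketch>& hist, double t1, double t2,
                 hist_iter* start, hist_iter* finish) {
  *start = hist.begin();
  while (*start != hist.end() && (*start)->t <= t1)
    ++(*start);
  *finish = *start;
  while (*finish != hist.end() && (*finish)->t <= t2) {
    (*finish)->access_counter += 1;  // mark entry as read by one synapse
    ++(*finish);
  }
}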
Even though the linear search a priori might seem less efficient than a binary search or a direct computation of the position, it turns out that it has an advantage in that it iterates consecutively over the history entries, which can be employed to identify data no longer needed. Therefore, especially for short histories, a simple iteration that comes without any overhead is fastest (see Section 3.1).
To prevent the history from occupying an unnecessary amount of memory, it is crucial to have a mechanism to delete those entries that have been used by all incoming synapses. The simplest implementation to identify these entries is to add one additional variable to each entry, called access counter, initialized to zero when the entry is created. When a synapse requests a part from t_1 to t_2 of the history, the algorithm iterates over all entries t_1 < t < t_2 and increases the access counters by one. After the update of the synaptic weight, all entries whose access counters are equal to the number of incoming synapses are deleted. This scheme can be combined easily with a linear search starting the iteration from the oldest entry of the history.

For long histories a linear search is inefficient and should be replaced by a binary search or direct computation of positions if applicable. To avoid iteration within long histories, we replace access counters by a vector that stores the last spike time t_LS for every incoming synapse. If a synapse delivers a spike, it updates its entry in that vector by replacing t_LS by the time stamp of the current spike. After each weight update, searching the vector for the smallest t_LS allows us to safely remove all membrane potentials with time stamps t < min({t_LS,i}). In practice, we can further improve this mechanism with two technical details. Firstly, n incoming spikes with the same time stamp can share the same entry t_LS, which we then have to provide with a counter that goes down from n to zero in steps of one whenever one of the n synapses sends a new spike for a time t > t_LS. Secondly, we can avoid the search for the smallest t_LS by making sure that the entries t_LS are in chronological order. This can be easily achieved if a synapse does not update its entry in the vector but removes it and appends a new one at the end of the vector.

As discussed in Section 2.3.3, the event-based compression scheme relies on the fact that all synapses to one postsynaptic neuron employ the same V*_i. This is not the case if delays of the corresponding connections are distributed. The compression scheme can therefore only be efficient if all delays have a fixed value. If spikes are processed and synapses are updated in chronological order, then a well-defined segment of the history of V*_i can be integrated and the compressed history can be updated. In NEST, spikes are, however, buffered within a period of min_delay before being sent and processed. Consequently, synapses are not updated in chronological order. Therefore, the event-based compression scheme can only be implemented in NEST in the case where delays equal the simulation time step.
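A sketch of this pruning step (hypothetical names); since the vector of last spike times is kept in chronological order, its front entry bounds what may be deleted:

#include <deque>
#include <vector>

struct LastSpike { double t_ls; unsigned n; };  // n synapses share this entry

void prune(std::deque<double>& v_history_times,
           const std::vector<LastSpike>& last_spikes) {
  if (last_spikes.empty()) return;
  const double t_min = last_spikes.front().t_ls;  // oldest entry
  while (!v_history_times.empty() && v_history_times.front() < t_min)
    v_history_times.pop_front();  // no synapse can request these anymore
}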
3 RESULTS

In the following, we evaluate the different simulation concepts of Figure 2 for the Clopath rule (Clopath et al., 2010) and the Urbanczik-Senn rule (Urbanczik and Senn, 2014) and our reference implementation in the spiking network simulator NEST (Jordan et al., 2019). For each of the voltage-based plasticity rules, we first detail the rule itself and discuss specificities of its implementations, before assessing their performance on a distributed computing architecture.
3.1 Clopath plasticity

The Clopath rule (Clopath et al., 2010) was designed as a voltage-based STDP rule that accounts for nonlinear effects of spike frequency on weight changes which had been previously observed in experiments (Sjöström et al., 2001). It does so by using the evolution of the postsynaptic membrane voltage around postsynaptic spike events instead of the postsynaptic spikes themselves. This requires a neuron model that takes into account features of membrane potential excursions near spike events, such as the modified adaptive exponential integrate-and-fire (aeif) model neurons that are used in the original publication (Clopath et al. (2010), see Section 5.1) or the Hodgkin-Huxley (hh) neurons that are used in a reference implementation by B. Torben-Nielsen on ModelDB (Hines et al., 2004).

The plasticity rule is of the general form (1) with a sum of two different functions F_α on the right-hand side. It treats long-term depression (LTD) and potentiation (LTP) of the synaptic weight in the two terms F_LTD and F_LTP, with

  F_LTD(s_j(t), V*_{i,LTD}(t)) = −A_LTD s_j(t) V*_{i,LTD}(t)    (13)
  with V*_{i,LTD} = (ū_− − θ_−)_+ , ū_−(t) = (κ_− ∗ V_i)(t − d_s)

and

  F_LTP(s*_j(t), V*_{i,LTP}(t)) = A_LTP s*_j(t) V*_{i,LTP}(t)    (14)
  with s*_j = κ_s ∗ s_j , V*_{i,LTP} = (ū_+ − θ_−)_+ (V_i − θ_+)_+ , ū_+(t) = (κ_+ ∗ V_i)(t − d_s).

Here (x − x_0)_+ = H(x − x_0)(x − x_0) is the threshold-linear function and H(x) is the Heaviside step function. A_LTD and A_LTP are prefactors controlling the relative strength of the two contributions. κ_± are exponential kernels of the form (9), which are applied to the postsynaptic membrane potential, and κ_s is an exponential kernel applied to the presynaptic spike train. The time-independent parameters θ_± serve as thresholds below which the (low-pass filtered) membrane potential does not cause any weight change (Figure 6). Note that A_LTP can also depend on the membrane potential. This case is described in Appendix Section 5.3.
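For illustration, the two terms can be evaluated by small helper functions (a sketch under the definitions (13) and (14); the names are not those of the NEST sources):

#include <algorithm>

inline double relu(double x) { return std::max(x, 0.0); }  // (x)_+

// LTD contribution at a presynaptic spike: -A_LTD * (u_minus - theta_minus)_+
double ltd_term(double A_LTD, double u_bar_minus, double theta_minus) {
  return -A_LTD * relu(u_bar_minus - theta_minus);
}

// LTP integrand: A_LTP * s*_j * (u_plus - theta_minus)_+ * (V - theta_plus)_+
double ltp_term(double A_LTP, double s_star_j, double u_bar_plus,
                double V, double theta_minus, double theta_plus) {
  return A_LTP * s_star_j * relu(u_bar_plus - theta_minus) * relu(V - theta_plus);
}

The two threshold-linear factors make the LTP integrand vanish for most time steps, which is what motivates the non-continuous history discussed below.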
In a reference implementation of the Clopath rule by C. Clopath and B. Torben-Nielsen available on ModelDB (Hines et al., 2004), there is a subtle detail not explicitly addressed in the original journal article. In their implementation the authors introduce an additional delay d_s between the convolved versions of the membrane potential ū_± and the bare one (cf. parameter d_s in (13) and (14)). The convolved potentials are shifted backwards in time by the duration of a spike d_s (see Table 2 and Table 4). As a result, the detailed shape of the excursion of the membrane potential during a spike of the postsynaptic neuron does not affect the LTP directly but only indirectly via the low-pass filtered version ū_+, see green background in Figure 6B. Incorporating this time shift in ū_± is essential to reproduce the results from Clopath et al. (2010) on spike-pairing experiments (Figure 7A,B).

The depression term F_LTD depends on the unfiltered spike train s_j. It can thus be treated analogously to ordinary STDP rules (cf. (4)ff). In particular, V*_{i,LTD} only needs to be available at the time points of presynaptic spikes (potentially taking into account additional delays of the connection). The potentiation term F_LTP, however, depends on the filtered spike train s*_j; V*_{i,LTP} consequently needs to be stored also for times in between spike events.

Figure 6. Illustration of the LTP contribution to the Clopath rule. A presynaptic neuron (panel A) and a postsynaptic neuron (panel B) emit a spike at t_sp,pre = 4 ms and t_sp,post = 6 ms, respectively. The presynaptic spike elicits a trace s*_j (gray) at the synapse. The excursion of the postsynaptic membrane potential V_i (panel B, blue) elevates the low-pass filtered potential ū_+ (green) so that both V_i and ū_+ exceed the respective thresholds θ_+ (dash-dotted, dark blue) and θ_− (dash-dotted, dark green), cf. (14), between t_1 and t_2. Only within this period, shifted by d_s = 3 ms, i.e. for times t_1 + 3 ms < t < t_2 + 3 ms (panel B, green background; see Section 3.1.1 for details), is the LTP of the synaptic weight non-vanishing, because of the threshold-linear functions in (14). The shift by d_s = 3 ms does not apply to the spike trace (panel A, green background). The rectangular shape of the spikes is achieved by a clamping of the membrane potential to V_clamp = 33 mV for a period of t_clamp = 2 ms.

We implement both an adaptive exponential integrate-and-fire neuron model (aeif_psc_delta_clopath) and a Hodgkin-Huxley neuron model (hh_psc_alpha_clopath) supporting Clopath plasticity. Our implementation of aeif_psc_delta_clopath follows the reference implementation on ModelDB, which introduced a clamping of the membrane potential after crossing the spiking threshold to mimic an action potential. Details can be found in Section 5.1. The implementations of aeif_psc_delta_clopath and hh_psc_alpha_clopath consider the filtered versions ū_± of the membrane potential as additional state variables of the neuron. Thereby, they can be included in the differential equation solver of the neurons to compute their temporal evolution. Parameters of κ_± consequently need to be parameters of the neuron rather than the synapse.
The same is true for the values of θ±; they are used in the neuron to determine whether V*_{i,LTP} and V*_{i,LTD} evaluate to zero, which happens for many time steps due to the Heaviside functions in their definitions.

The LTD mechanism is convenient to implement within the event-driven framework: when the synapse is updated at time t, it reads the values ū₋(t − d) and θ₋ from its target and computes the new weight. Here, d denotes the dendritic delay of the connection, which accounts for the time it takes to propagate somatic membrane potential fluctuations to the synapse. The archiving node contains a ring buffer that stores the history of ū₋ for the past max_delay time steps, so that the synapse can access a past value of this quantity. Consequently, the LTD history is always short and can be forgotten in a deterministic fashion.

The computation of the weight change due to LTP requires the evaluation of the integral over V*_{i,LTP}(t). The latter is stored in the archiving node as a vector whose elements are objects containing three values: the corresponding time t, the value of V*_{i,LTP}, and an access counter that is initially set to zero.
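The LTD-side storage can be sketched as a simple fixed-length ring buffer; the class and method names are illustrative, not the NEST interface.

```python
class UBarMinusRingBuffer:
    """Fixed-length ring buffer holding the last max_delay values of u_bar_minus."""

    def __init__(self, max_delay_steps):
        self.buf = [0.0] * max_delay_steps
        self.cur = 0  # index of the slot that will be overwritten next

    def push(self, value):
        # called by the neuron once per time step
        self.buf[self.cur] = value
        self.cur = (self.cur + 1) % len(self.buf)

    def get(self, steps_back):
        # called by a synapse to read u_bar_minus(t - d), d = steps_back * h
        assert 0 < steps_back <= len(self.buf)
        return self.buf[(self.cur - steps_back) % len(self.buf)]
```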
For simulations with delay equal to the simulation time step, the history of V*_{i,LTP} always contains only a single value, as it is read out in every time step by all synapses. For larger delays, the history is of length max_delay, and each synapse reads out a segment of length min_delay, increasing the access counter of the corresponding entries by one. For the last synapse that requests a certain segment, the access counter then equals the in-degree K, which is the criterion to delete the corresponding entries from the history. The time-driven scheme does not require storing the time stamp t of each history entry, although for simplicity we do so in our reference implementation; the overhead of this additional number is negligible.

In event-driven schemes, the history of V*_{i,LTP} dynamically grows and shrinks depending on the spikes of the presynaptic neurons. Since many values of V*_{i,LTP} are zero, it is beneficial to store only the non-zero values. In this case, a time stamp for each entry is required to assign the values of the non-continuous history of V*_{i,LTP} to their correct times. When a synapse j is updated at the time t_S of a spike, it requests the part of the history between the last spike t_LS and the current spike t_S (minus the dendritic delay) from the archiving node. In the case of the uncompressed scheme, t_LS refers to the last spike of the same synapse j. This history segment is then integrated in synapse j and used for the update of synapse j. Each synapse thus integrates the history of V*_{i,LTP} anew (Section 2.3.2). For the compressed scheme, t_LS refers to the last spike of any synapse to neuron i. In this case, the history is integrated in the archiving node, the weight of synapse j is updated, and the compressed history for all other last spike times is updated. Afterwards, the history of V*_{i,LTP} is deleted. Thereby, V*_{i,LTP} is integrated only once for all synapses.

In any case, the integrated history of V*_{i,LTP} needs to be combined with the presynaptic spike trace s*_j. The latter is easily computed analytically inside the synapse, because it is an exponential decay of the corresponding value at the time of the last spike. At the end of the update process, the trace is increased by τ_s⁻¹ to account for the trace of the current spike, where τ_s is the time constant of the kernel κ_s.

The reference implementation reproduces the results from Clopath et al. (2010) on spike-pairing experiments and the emergence of bidirectional connections in orientation-selective networks (Figure 7). See Appendix Section 5.2 for details on the setup of both experiments as implemented in NEST.
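The event-driven bookkeeping described above can be illustrated by the following simplified, single-threaded sketch: non-zero values of V*_{i,LTP} are appended together with a time stamp and an access counter, a synapse reads out the segment since its last spike, and entries are discarded once all K incoming synapses have seen them. This is a model of the algorithm under these assumptions, not the actual NEST data structures.

```python
from dataclasses import dataclass

@dataclass
class HistEntry:
    t: float           # time stamp of the entry
    v_ltp: float       # non-zero value of V*_{i,LTP} at time t
    accesses: int = 0  # how many synapses have read this entry so far

class ArchivingNode:
    """Stores the discontinuous V*_{i,LTP} history of one postsynaptic neuron."""

    def __init__(self, in_degree):
        self.K = in_degree
        self.hist = []  # chronologically ordered HistEntry objects

    def record(self, t, v_ltp):
        # neuron side: store only non-zero values (thresholded potentials)
        if v_ltp != 0.0:
            self.hist.append(HistEntry(t, v_ltp))

    def read_segment(self, t_last_spike, t_spike):
        """Synapse side: return entries in (t_last_spike, t_spike], then
        drop every entry that all K synapses have already consumed."""
        segment = [e for e in self.hist if t_last_spike < e.t <= t_spike]
        for e in segment:
            e.accesses += 1
        self.hist = [e for e in self.hist if e.accesses < self.K]
        return segment
```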
In order to evaluate the performance of the implementation of the Clopath rule in NEST, we simulate, in a weak-scaling setup, excitatory-inhibitory networks of increasing size but fixed in-degree K. As we expect the performance to depend critically on the number of synapses, we examine two scenarios: a small in-degree K = 100 and a rather large in-degree K = 5000. While the first case might be suitable for small functional networks, the latter in-degree represents a typical number for cortical networks. Further details on network and simulation parameters are given in Table 6.

Figure 7. Reproduction of results with the Clopath rule. A Setup of the spike-pairing experiment. Two neurons ("pre" and "post") that are connected by a plastic synapse receive input so that they spike one after another with a delay ∆t. The change of the synaptic weight is computed according to the Clopath rule as a function of the frequency f_pair with which the spike pairs are induced. B Result of the spike-pairing experiment. The relative change of the synaptic weight after five spike pairs is shown as a function of f_pair for two different neuron models (aeif: solid lines, Hodgkin-Huxley: dashed lines). The blue lines represent a setup where the postsynaptic neuron fires after the presynaptic one (pre-post, ∆t = 10 ms) and the green lines the opposite case (post-pre, ∆t = −10 ms). This panel corresponds to figure 2b in Clopath et al. (2010). C Setup of the network that produces strong bidirectional couplings. The network consists of an inhibitory (I) and an excitatory (E) population which receive Poisson spike trains (P) as external input. The firing rate of the latter is modulated with a Gaussian profile whose center is shifted every 100 ms. The external input connections to the excitatory population are plastic, as are the connections within the excitatory population (indicated by blue arrows). D Synaptic weights of the all-to-all connected excitatory neurons after the simulation of the network. Strong bidirectional couplings can be found, e.g., between neurons 2 and 3, 2 and 7, and 2 and 10. This panel corresponds to figure 5 in Clopath et al. (2010). A more detailed description of the two experiments can be found in Section 5.2.

As a reference, we also simulate the same network with STDP synapses, which require far fewer computations as they rely solely on spike times. To achieve the same network state, that is, the same spikes, for the different connectivity rules, we impose the weights to stay constant over time by setting the learning rates to zero. This way, all computations for weight changes are performed but not applied. This has the additional advantage that reasonable asynchronous irregular network states are simple to find based on predictions for static synapses (Brunel, 2000).

The Clopath rule has originally been proposed for connections without delays (
Clopath et al., 2010). Therefore, we first evaluate its performance in this setting (delay equal to the simulation time step), which is, however, not the natural setting for a simulator like NEST that makes use of delays to speed up communication between compute processes.

Figure 8. Comparison of state propagation times T_sim for excitatory-inhibitory networks with different implementations of the Clopath plasticity in NEST. The following implementations are shown: "stdp": standard implementation of the STDP synapse; "td": time-driven implementation of the Clopath synapse; "ed": event-driven scheme as included in NEST 2.18; "edc": event-driven compression. A Network with small in-degree K = 100; all synapses have a delay d equal to the resolution h of the simulation. B Network with large in-degree K = 5000 and d = h. C Same network as in panel B but with d = 1.0 ms (for d > h, "edc" is not compatible with NEST, see Section 2.3.3). In A, B, and C, both "ed" and "edc" use linear search of the history and access counters, see Section 2.4.3. D Comparison between "ed" implementations using linear search and binary search of the history. All simulations used several compute nodes, each running one MPI process with multiple threads. Further parameters as in Table 6.

The first observation is that, as expected, simulations with Clopath synapses are slower than those with ordinary STDP (Figure 8). Given the update of synapses in every simulation step, the time-driven scheme for Clopath synapses is much slower than the event-driven scheme (Figure 8A). The difference becomes larger the more synapses there are (Figure 8B). Introducing a delay leads to fewer function calls to synapses (once every min_delay) and therefore increases the speed of the time-driven scheme (Figure 8C). Its performance, however, stays far below the event-driven scheme. This comparison illustrates the benefit of event-driven updates for Clopath synapses.

How does compression of the history change the picture? As discussed in Section 2.3.3, compression has the advantage of not integrating the membrane potential history for each synapse separately. A downside of the event-based compression is that it requires storing one history entry for each last spike time of the presynaptic neurons. For large in-degrees, this history is therefore longer than the history of V*_{i,LTP}, which we implemented as non-continuous for the Clopath rule. Consequently, the event-based compression scheme only outperforms the ordinary event-driven scheme for small in-degrees (Figure 8A), but not for large in-degrees (Figure 8B). Given that the compression can only be implemented in NEST for connections with delay equal to the resolution of the simulation (see Section 2.4.4), the method of choice is therefore
the ordinary event-driven scheme (Section 2.3.2). Although a bit slower, its run time is of the same order of magnitude as that of the ordinary STDP synapse, with similar weak-scaling behavior (Figure 9A). The additional computations with respect to STDP result in a constant overhead.

Another advantage of having short, non-continuous histories is that searching the history at readout is fast. A simple linear iteration scheme is therefore even faster than a binary search (Figure 8D), because the latter requires an additional list of presynaptic spike times (see Section 2.4.3), which is unnecessary overhead in this scenario. As a result, the ordinary event-driven scheme with linear history iteration is the most general and efficient scheme and is therefore integrated into NEST 2.18 (Jordan et al., 2019).

Figure 9. Scaling of state propagation times T_sim with network size for a fixed duration of biological time. Weak scaling: computational resources (horizontal axes) are increased proportionally to the network size N (black curve and triangles, right vertical axes). (A) Event-driven scheme for the Clopath rule compared to static and STDP synapses. (B) Event-driven Urbanczik-Senn rule compared to static and STDP synapses. Due to the scale of the y-axis, the green curve (static synapses) coincides with the red one (STDP synapses). (C) Spike-spike version of the Urbanczik-Senn rule compared to static and STDP synapses. Network and simulation parameters as in Table 6 with in-degree K = 5000. N_VP denotes the number of threads, where every MPI process occupies one compute node and runs 24 threads.
The Urbanczik-Senn rule (Urbanczik and Senn, 2014) applies to synapses that connect to dendrites of multicompartment model neurons. The main idea of this learning rule is to adjust the weights of dendritic synapses such that the dendrite can predict the firing rate of the soma. The dendrite expects the firing rate to be high when the dendrite's membrane potential is elevated due to many incoming spikes at the dendrite, and to be low if there are only a few incoming spikes. Thus, for this prediction to be true, synapses that transmit a spike towards the dendrite while the firing rate of the soma is low are depressed, and those that provide input while the soma's firing rate is high are facilitated. Learning can be triggered by applying a teacher signal to the neuron via somatic synapses such that the actual somatic firing deviates from the dendritic prediction.

The plasticity rule is again of the general form (1), with a functional F on the right-hand side that reads

$$F[s^*_j, V^*_i] = \eta\,\kappa * \left(V^*_i\, s^*_j\right) \qquad (15)$$

with

$$V^*_i = \left(s_i - \phi(V_i)\right) h(V_i), \qquad s^*_j = \kappa_s * s_j, \qquad (16)$$

with exponential filter kernels κ and κ_s and nonlinearities φ and h. Note that F depends on the postsynaptic spike train s_i via V*_i. The latter can be interpreted as a prediction error, which never vanishes, as spikes s_i (a point process) are compared against a rate prediction φ(V_i) (a continuous signal).

Following the original publication (
Urbanczik and Senn, 2014), we here exemplify the Urbanczik-Senn rule with a two-compartment Poisson model neuron consisting of one somatic and one dendritic compartment (pp_cond_exp_mc_urbanczik). Extensions to multiple dendritic compartments are straightforward. In contrast to the implementation of a non-continuous history for the Clopath rule, a non-vanishing prediction error (V*_i(t) ≠ 0) here requires the storage of a continuous history. In order to solve (1), we need to integrate over F[s*_j, V*_i] (cf. (2)). Writing out the convolution with κ explicitly, we obtain

$$\Delta W_{ij}(t, T) = \int_t^T dt'\, F[s^*_j, V^*_i](t') = \int_t^T dt'\, \eta \int_0^{t'} dt''\, \kappa(t' - t'')\, V^*_i(t'')\, s^*_j(t'').$$

A straightforward implementation of this expression is very inefficient in terms of memory usage and computations because of the two nested integrals: the inner integral over t'' always starts at t'' = 0, which requires the synapse to evaluate the entire history of V*_i and s*_j whenever it performs a weight update. However, since the kernels κ and κ_s are exponentials, one of the integrations can be performed analytically (see Appendix Section 5.4 for a derivation) to rewrite the weight change as

$$\Delta W_{ij}(t, T) = \eta \left[ I_1(t, T) - I_2(t, T) + I_2(0, t)\left(1 - e^{-\frac{T - t}{\tau_\kappa}}\right) \right], \qquad (17)$$

with

$$I_1(a, b) = \int_a^b dt\, V^*_i(t)\, s^*_j(t), \qquad I_2(a, b) = \int_a^b dt\, e^{-\frac{b - t}{\tau_\kappa}}\, V^*_i(t)\, s^*_j(t).$$

The first two integrals in (17) only extend from t to T; history entries for times smaller than t are not needed and can be deleted after the corresponding update. The dependence on the full history back until 0, arising from the convolution with κ, is accumulated in the last term of (17), which the synapse can compute by storing the single additional value I_2(0, t). At the end of a weight update, this value is overwritten by the new value I_2(0, T) = e^{-(T-t)/τ_κ} I_2(0, t) + I_2(t, T), which is then used in the next update.

(17) corresponds to the general formulation discussed in Figure 2 and can be dealt with as shown there. In particular, a history of V*_i(t) is stored, and the integrals I_1 and I_2 are calculated within the synapse (time- and event-driven update) or the archiving node (event-driven update and compression).
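For concreteness, a minimal sketch of this incremental update on a discrete time grid follows; it keeps only the history segment since the last update plus the single carry-over value I_2(0, t). The names and the rectangle-rule discretization are illustrative assumptions, not the NEST implementation.

```python
import numpy as np

class UrbanczikSynapseSketch:
    """Incremental weight update of eq. (17) with an exponential kernel kappa."""

    def __init__(self, eta, tau_kappa, h):
        self.eta = eta
        self.tau = tau_kappa
        self.h = h            # time step of the stored history (ms)
        self.I2_acc = 0.0     # running value of I2(0, t)

    def update(self, v_err, s_trace):
        """Consume the history segment (arrays of V*_i and s*_j) since the
        last update and return the weight change Delta W over that segment."""
        n = len(v_err)
        T_minus_t = n * self.h
        prod = v_err * s_trace
        # rectangle-rule discretization of I1(t, T) and I2(t, T)
        I1 = self.h * np.sum(prod)
        decay = np.exp(-(T_minus_t - self.h * np.arange(1, n + 1)) / self.tau)
        I2 = self.h * np.sum(decay * prod)
        dW = self.eta * (I1 - I2 + self.I2_acc * (1.0 - np.exp(-T_minus_t / self.tau)))
        # carry the full-history term forward: I2(0,T) = e^{-(T-t)/tau} I2(0,t) + I2(t,T)
        self.I2_acc = np.exp(-T_minus_t / self.tau) * self.I2_acc + I2
        return dW
```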
Figure 10. Reproduction of results with the Urbanczik-Senn rule. A Setup of a simple learning task using the Urbanczik-Senn plasticity rule. The somatic conductances g_I and g_E of a two-compartment neuron are modulated such that they induce a teaching signal with sinusoidal shape. The dendrite receives a repeating spike pattern as input via plastic synapses (blue arrows). B The synapses adapt their weights so that the somatic membrane potential U (dark blue) and the dendritic prediction V_i (light blue) follow the matching potential U_M (red) after learning. C Excitatory (g_E) and inhibitory (g_I) somatic conductances that produce the teaching signal. Panels A and B correspond to figure 1 in (Urbanczik and Senn, 2014). D Temporal evolution of the synaptic weights during learning. For the sake of a better overview, only a subset of weights is shown (gray), with three randomly chosen time traces highlighted in blue. Synapses in NEST fulfill Dale's principle, which means that a weight update cannot convert an excitatory into an inhibitory synapse and vice versa, giving rise to the rectification at zero.

Since parameters of the functions φ and h in (16) are not only used for the plasticity rule but also for the neuron dynamics, they are passed via a template class pp_urbanczik_parameters as an argument to the archiving node and the neuron model (see Section 2.4.1).

The basic use of the Urbanczik-Senn rule in NEST is exemplified in Appendix Section 5.5, which details the NEST setup and a reproduction of a simple learning experiment (Figure 10) from the original publication (Urbanczik and Senn, 2014).
As for the Clopath rule, we employ a weak-scaling setup with the same excitatory-inhibitory networks of increasing size and fixed in-degree K = 100 or K = 5000, respectively (Table 6). We compare the results for the Urbanczik-Senn plasticity rule to networks with ordinary STDP synapses, again setting all learning rates to zero to achieve the same network state across the different types of plasticity.

The Urbanczik-Senn rule, in its original version, does not account for delays in connections (Urbanczik and Senn, 2014). As for the Clopath rule, we therefore first evaluate its performance for connections with delays that equal the simulation time step. Naturally, the processing of the membrane potential information makes the Urbanczik-Senn plasticity less efficient to simulate than networks with ordinary STDP synapses (Figure 11). Note that the absolute simulation times are not directly comparable to the simulations with Clopath plasticity (Figure 8), as the network sizes are smaller here (Table 6). Networks with small and large in-degrees behave qualitatively similarly: given the long continuous history that needs to be stored and read out, the event-driven scheme does not significantly outperform the time-driven scheme (Figure 11A,B). In the network with small in-degree, the time-driven scheme is even slightly faster (Figure 11A). This behavior reverses for large in-degrees, as the number of synapse calls grows faster than the length of the history (Figure 11B). However, given that the length of the history is so critical for this rule, the compression algorithm achieves a significant increase in performance in both cases (Figure 11A,B). This performance increase is larger the smaller the in-degree, as the compressed history becomes shorter (Figure 11A). Due to NEST specificities (see Section 2.4.4), the compression algorithm cannot be used in settings with delays that are larger than the simulation time step (Figure 11C): here, as expected, the time-driven scheme becomes faster than in the d = h case, but it is in general still comparable in performance to the event-driven scheme.

Figure 11. Comparison of state propagation times T_sim for excitatory-inhibitory networks with different implementations of the Urbanczik-Senn plasticity in NEST. The following implementations are shown: "stdp": standard implementation of the STDP synapse in NEST; "td": time-driven implementation of the Urbanczik-Senn synapse; "ed": event-driven scheme; "edc": event-driven compression. A Network with small in-degree K = 100; all synapses have a delay d equal to the resolution h of the simulation. B Network with large in-degree K = 5000 and d = h. C Same network as in panel B but with d = 1.0 ms (for d > h, "edc" is not compatible with NEST, see Section 2.3.3). In A, B, and C, both "ed" and "edc" use linear search of the history and the access counters, see Section 2.4.3. D Comparison between "ed" implementations using linear search and direct computation of the positions of history entries. All simulations used 32 compute nodes, each running one MPI process with multiple threads. Details about the parameters used in these networks can be found in Table 6.
The latter is therefore the method of choice for simulations with delayed connections; for zero-delay connections, the compression algorithm performs best. Consequently, both schemes are permanently integrated in NEST. Whether the history readout is done via linear iteration or via computing the positions of history entries has no significant impact on the performance (Figure 11D). Therefore, the simple linear iteration is integrated in NEST 3.

The analyses of the Clopath and Urbanczik-Senn plasticity as prototypical examples of rules that rely on the storage of discontinuous versus continuous histories show that the former are much faster to simulate, in particular for large networks that require distributed computing architectures. For discontinuous histories, the event-driven scheme is the most generally applicable and efficient, which makes the corresponding rules easy to integrate into modern simulators with event-based synapses. The performance gap between the different rules should be kept in mind in the design of new learning rules. Furthermore, it is worthwhile to test modifications of existing learning rules that decrease the amount of stored information.

For illustration, we here test a spike-based alternative to the original Urbanczik-Senn rule, in which we replace the rate prediction φ(V_i(t)) in V* of (16) by a noisy estimate, generated by a non-homogeneous Poisson generator with rate φ(V_i(t)), see Section 5.6. The prediction error then results from a comparison of somatic and dendritic spikes, s_i and s_i^dend, respectively; it is therefore purely based on point processes. In terms of storage and computations, the rule thereby becomes similar to ordinary STDP (cf. (5)). This becomes apparent in the weak-scaling experiment in Figure 9C, which shows that the modification of the learning rule results in a substantial speedup, arriving at essentially the same run time as the ordinary STDP rule.

When changing learning rules to improve the efficiency of an implementation, the question is to what extent the modified rule, in our example including the noisy estimate of the dendritic prediction, still fulfills the functionality that the original rule was designed for. Generally, without control of the error, any simulation can be made arbitrarily fast. Therefore, Morrison et al. (2007b) define efficiency as the wall-clock time required to achieve a given accuracy. We test in the appendix (Figure 12) whether the dynamics is still robust enough to achieve proper learning and function in the reproduced task of Figure 10. The learning works as well as with the original Urbanczik-Senn rule. However, given the simplicity of the chosen task, this result may not generalize to other, more natural tasks. We leave a more detailed investigation of this issue to future studies. The basic exploration here, however, illustrates how taking the efficiency of implementations into account can guide the future development of learning rules to make them practically usable for large-scale simulations of brain networks.
This work presents efficient algorithms to implement voltage-based plasticity in modern neural network simulators that rely on event-based updates of synapses (for a review, see Brette et al., 2007). This update scheme restricts function calls of synapse code to the time points of spike events and thereby improves performance in simulations of biologically plausible networks, where spike events at individual synapses are rare and the total number of synapses is large compared to the number of neurons. In voltage-based plasticity rules, however, synapses rely on continuous information on state variables of postsynaptic cells to update their strength, which naturally suggests a time-driven update. Instead, we here propose an efficient archiving of voltage traces to enable event-based synapse updates and detail two schemes for the storage, readout, and post-processing of time-continuous or discontinuous information. We show their superior performance with respect to the time-driven update both theoretically and with a reference implementation in the neural network simulation code NEST for the rules proposed in Clopath et al. (2010) and Urbanczik and Senn (2014).

Event-driven update schemes for voltage-based plasticity come at the expense of storing possibly long histories of a priori continuous state variables. Such histories not only require space in memory but also affect the runtime of simulations, which we focus on here. The time spent on searching and post-processing the history to calculate weight updates increases with its length, and these operations have to be performed for each synapse. Therefore, in addition to an ordinary event-driven scheme, we devised a compression scheme that becomes superior for long histories, as occurring in the Urbanczik-Senn rule. In particular for networks with small in-degrees or synchronous spiking, the compression scheme results in a shorter history. It further reduces the total amount of computations for weight changes by partially re-using results from other synapses, thereby avoiding multiple processing of the history. For short histories, as occurring in the Clopath rule, the compression results in unnecessary overhead and an increase in history size, as one entry per last presynaptic spike time needs to be stored instead of a discontinuous membrane potential around sparse postsynaptic spike events. We here, for simplicity, contrasted time- and event-driven update schemes. Further work could, however, also investigate hybrid schemes, where synapses are updated not only at spike events but also on a predefined, coarse time grid to avoid long histories and the corresponding extensive management. A similar mechanism is used in Kunkel et al. (2011) to implement a normalization of synaptic weights; the corresponding technical details can be found in Kunkel (2015, ch. 5.2).

The storage and management of the history as well as complex weight-change computations naturally reduce the performance of simulations with voltage-based plasticity in comparison to static or STDP synapses. The latter only require information on spike times, which is much less data than continuous signals. Nevertheless, given that the Clopath rule is based on thresholded membrane potentials and consequently short, discontinuous histories, the performance and scaling of the event-driven algorithms are only slightly worse than for ordinary STDP. Time-driven implementations cannot exploit this model feature and update weights also in time steps where no adjustment would be required, leading to significantly slower simulations. The performance gain of event-driven schemes is less pronounced for the Urbanczik-Senn rule as, by design, histories are typically long. In this case, the compression scheme naturally yields better results in terms of runtime. Our own sampling-based modification of the Urbanczik-Senn rule only requires storage of the membrane potentials at spike events, giving rise to the same performance as STDP. Generally, an algorithm is faster if it requires fewer computations. However, opportunities for vectorization and cache-efficient processing, outside the scope of the present manuscript, may change the picture.

We chose the Clopath and the Urbanczik-Senn rule as two prototypical models of voltage-based plasticity. While both rules describe a voltage dependence of weight updates, their original motivation as well as their specific form differ: the Clopath rule refines standard STDP models to capture biologically observed phenomena such as the frequency dependence of weight changes (
Sjöström et al., 2001). For this it is sufficient to take into account membrane potential traces in the vicinity of spike events, leading to the storage of time-discontinuous histories in our implementation. In contrast, the Urbanczik-Senn rule is functionally inspired by segregating dendritic and somatic compartments of cells and using the difference between somatic output and dendritic prediction as a teacher signal for dendritic synapses. The teacher signal is by construction never vanishing, imposing the need to store a time-continuous history. The original publications of both rules had a great and long-lasting impact on the field. The Clopath rule has been used in a variety of studies (Clopath and Gerstner, 2010; Ko et al., 2013; Litwin-Kumar and Doiron, 2014; Sadeh et al., 2015; Bono and Clopath, 2017; Maes et al., 2020), partly in modified versions which are, however, still compatible with the simulation algorithms presented here. The same holds for the Urbanczik-Senn rule (Brea et al., 2016; Sacramento et al., 2018).

The current implementation in NEST supports an adaptive exponential integrate-and-fire and a Hodgkin-Huxley neuron model for the Clopath rule. The former is used in the original publication (Clopath et al., 2010) and the latter appears on ModelDB (Hines et al., 2004) in code for the Clopath rule for the NEURON simulator (Hines and Carnevale, 2001). For the Urbanczik-Senn rule, NEST currently supports the two-compartment Poisson model neuron of the original publication (Urbanczik and Senn, 2014). A three-compartment version as used in Sacramento et al. (2018) or other models are straightforward to integrate into the current simulation framework. However, with voltage-based plasticity rules, the borders between neurons and synapses become blurred, as these rules often depend on specificities of the employed neuron models rather than only on spike times, as for standard STDP. Consequently, archiving nodes might need to have specific functionalities, which, in light of the zoo of existing neuron models, could easily lead to a combinatorial explosion of code. These problems can in the future be overcome with automatic code generation using NESTML, which only creates and compiles the code that is needed for the specified model simulations (
Plotnikov et al., 2016).

While the implementation presented here refers to the neural network simulator NEST (Gewaltig and Diesmann, 2007), the proposed algorithms and simulation infrastructure are compatible with any network simulator with event-driven update of synapses, such as, for example, NEURON (Lytton et al., 2016, cf. ch. 2.4) and Brian2 (Stimberg et al., 2014). Furthermore, the applicability is not restricted to the Clopath and Urbanczik-Senn rules; the framework can be adapted to any other learning rule that relies on state variables of postsynaptic neurons. These state variables not only encompass membrane potentials, as, for example, in the LCP rule by Mayr and Partzsch (2010), the Convallis rule by Yger and Harris (2013), the voltage-triple rule by Brea et al. (2013), the MPDP rule by Albers et al. (2016), the neuromorphic learning rules by Sheik et al. (2016) and Diederich et al. (2018), or the branch-specific rule by Legenstein and Maass (2011), but also, for example, firing rates of stochastic neuron models or rate models (Brea et al., 2016; Sacramento et al., 2018), or other learning signals (Bellec et al., 2019). The same machinery could also be used to store external teacher signals that are provided to model neurons by stimulation devices. Since synapses are located at the compute process of the postsynaptic neuron, readout of state variables from presynaptic neurons comes with large costs in simulations on distributed computing architectures and is therefore not considered here. Due to specificities of the present NEST code for spike delivery, the event-driven compression proposed here is only applicable in NEST for delays that equal the simulation time step. This restriction can be readily overcome in a simulation algorithm that performs a chronological update of synapses.

In general, one has to distinguish two types of efficiency in the context of simulating plastic networks: first, the biological time it takes the network to learn a task by adapting the weights of connections; second, the wall-clock time it takes to simulate this learning process. Both times crucially depend on the employed plasticity rule. In this study, we focused on the wall-clock time and argue that it can be optimized by designing learning rules that require storing only minimal information on postsynaptic state variables. Ideally, the plasticity rule contains unfiltered presynaptic or postsynaptic spike trains to reach the same performance as in ordinary STDP simulations. If rules, however, need to capture the pre- and post-spike dynamics of postsynaptic neurons, it is beneficial to make use of thresholded state variables, as in the example of the Clopath rule, as this yields short, time-discontinuous histories. Reducing the amount of information available to synapses for adjusting their weights can in general slow down the learning. We presented a modification of the Urbanczik-Senn rule in which the dendritic prediction of the somatic firing contains an additional sampling step with Poisson spike generation. This modification significantly reduces the simulation time. For the simple task presented here, the learning speed is largely unaffected, but in general a performance decrease is to be expected when error signals become more noisy. There is therefore a trade-off between learning speed and simulation speed, which should be considered in the design process of new learning rules.
For the plasticity rules by Clopath et al. (2010) and Urbanczik and Senn (2014), we present a highly scalable reference implementation in NEST. The parallelism of the NEST implementation enables simulations of plastic networks of realistic size on biologically plausible time scales for learning. The field of computational neuroscience recently entered a new era with the development of large-scale network models (Markram et al., 2015; Schmidt et al., 2018; Billeh et al., 2020). Emulating the dynamics of cortical networks, such models are so far restricted to static connections. We here provide simulation algorithms for plasticity mechanisms that are required for augmenting such complex models with functionality. It is our hope that incorporating both biologically and functionally inspired plasticity models in a single simulation engine fosters the exchange of ideas between communities towards the common goal of understanding system-level learning in the brain.
CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financialrelationships that could be construed as a potential conflict of interest.
ACKNOWLEDGMENTS
We thank Claudia Clopath and Wulfram Gerstner for explaining details of their reference implementation and the underlying biological motivation. Moreover, we thank Hedyeh Rezaei and Ad Aertsen for suggesting to implement the Clopath rule in NEST, and Charl Linssen, Alexander Seeholzer, and Renato Duarte for carefully reviewing our implementation. Finally, we thank Walter Senn, Mihai A. Petrovici, Laura Kriener, and Jakob Jordan for fruitful discussions on the Urbanczik-Senn rule and our proposed spike-based version. We further gratefully acknowledge the computing time granted on the supercomputer JURECA (Jülich Supercomputing Centre).
APPENDIX

For the Clopath rule, the change of the synaptic weight strongly depends on the excursion of the membrane potential V_m around a spike of the postsynaptic neuron, which causes ū± to cross the respective thresholds θ± so that (13) and (14) yield non-vanishing results. Within the original neuron model (Brette and Gerstner, 2005), u is reset immediately after it reaches the spiking threshold, so that the shape of the action potential is not described accurately. In our NEST implementation of aeif_psc_delta_clopath we adapted the approach of the reference implementation on ModelDB (Hines et al., 2004) and introduced a clamping of u to a fixed value V_clamp for a period t_clamp before it is reset. The reference implementation is restricted to one specific simulation resolution and sets u to two different values in the two subsequent simulation steps after a spike. To be independent of the resolution of the simulation, the implementation in NEST uses a constant V_clamp. In the simulations we set t_clamp to 2 ms and V_clamp to 33 mV. These values are chosen to match the behavior of the reference implementation.
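A minimal sketch of this clamping logic within a time-driven neuron update follows, assuming a fixed step h; the threshold, reset, and subthreshold dynamics are placeholders, while V_clamp and t_clamp follow the values above.

```python
def integrate_subthreshold(u, h, E_L=-70.6, tau_m=9.37):
    # placeholder leaky integration towards the resting potential E_L
    return u + h * (E_L - u) / tau_m

def step_membrane(state, h, V_th=-50.4, V_clamp=33.0, t_clamp=2.0, V_reset=-60.0):
    """One update step of the membrane potential u including spike clamping.

    state: dict with keys 'u' (potential) and 'clamp_left' (remaining clamp time).
    """
    if state["clamp_left"] > 0.0:
        # action potential in progress: hold u at V_clamp, then reset
        state["clamp_left"] -= h
        state["u"] = V_clamp if state["clamp_left"] > 0.0 else V_reset
        return state
    state["u"] = integrate_subthreshold(state["u"], h)
    if state["u"] >= V_th:  # threshold crossing starts the clamp
        state["u"] = V_clamp
        state["clamp_left"] = t_clamp
    return state
```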
The setup of the spike-pairing experiment from Clopath et al. (2010) presented in Figure 7A,B consists of two neurons connected via a plastic synapse. The pre- and postsynaptic neuron are forced to spike with a time shift ∆t by stimulation with spike_generators at times t_pre^(i) = t^(i) and t_post^(i) = t^(i) + ∆t, respectively. A positive time shift ∆t > 0 means that the presynaptic neuron spikes before the postsynaptic one (pre-post pairing, blue lines in Figure 7) and vice versa. Spike pairs (t_pre^(i), t_post^(i)) are induced with frequency f_pair = 1/(t^(i+1) − t^(i)), and the weight change of the synapse is measured after a set of five pairs. In our simulation using NEST, the presynaptic neuron is modeled as a parrot_neuron and the postsynaptic neuron is either of type aeif_psc_delta_clopath or hh_psc_alpha_clopath. In NEST, parrot_neurons are model neurons that emit a spike whenever they receive one. In this setup they are required because devices like spike_generators support only static synapses in NEST, so that we cannot connect the postsynaptic neuron directly to the spike_generator via a plastic synapse. The initial weight of the clopath_synapse connecting the two neurons is given by w_init. In this experiment we use the Clopath rule with fixed amplitude A_LTD. A list of all parameters can be found in Table 2.
In this experiment after Clopath et al. (2010), a small network of N_I inhibitory and N_E excitatory neurons subject to an external input develops strong bidirectional couplings between neurons of the excitatory population. The input is given by N_p Poisson spike trains with rates

$$f_p^{(j)} = A_p\, e^{-\frac{(j - \mu_p)^2}{2\sigma_p^2}} + c_p, \qquad (18)$$

where j = 1, ..., N_p. The center µ_p of the Gaussian is drawn randomly from a set s_p of possible values, and a new value is drawn after each time interval t_µ. The total number of intervals is N_µ. In our simulation with NEST we use aeif_psc_delta model neurons with the same parameters (cf. Table 4) for both the inhibitory and the excitatory population. The simulation is divided into N_µ intervals, between which the rates of the N_p poisson_generators are set according to (18). The poisson_generators are connected in a one-to-one manner to N_p parrot_neurons, which in turn are connected to the network. The details of the latter connectivity can be found in Table 3. In NEST, a poisson_generator that is connected to several target neurons generates an independent Poisson spike train for each of these neurons. Thus, the intermediate step via parrot_neurons is required to provide the neurons in the network with common Poisson input. Moreover, a direct connection from a device like a poisson_generator to a model neuron via a plastic synapse is not possible in NEST. In this experiment, the Clopath rule with homeostasis (time-dependent prefactor of LTD, cf. Section 5.3) is used. Figure 7D shows the weights of the synapses among the excitatory population after the simulation.
Clopath et al. (2010), the authors use a slightly modified version of the Clopath rule defined in (13): the constant factor A_LTD is replaced by a voltage-dependent term

$$A_{\mathrm{LTD}}(\bar{\bar u}) = A_{\mathrm{LTD}} \left( \frac{\bar{\bar u}}{u_{\mathrm{ref}}} \right)^2$$

to take homeostatic processes into account. The quantity ū̄ is a temporal average of ū₋(t) over a time window of T = 1 s, and u_ref is a reference value. An exact temporal average requires storing the time trace of ū₋(t) for the entire interval T. This would cancel the advantage of the implementation discussed in Section 3.1, where storage of time traces is needed only in the vicinity of spikes. Therefore, deviating from the original work by Clopath et al. (2010), we instead implement an additional low-pass filtering ū̄(t) = (κ_low ∗ ū₋)(t) with an exponential kernel κ_low(t) = H(t) exp(−t/τ̄̄). Like ū±, ū̄ is passed as an additional state variable to the solver.

To derive (17), it is convenient to first investigate ∆W_ij(0, t) and ∆W_ij(0, T) and then compute ∆W_ij(t, T) = ∆W_ij(0, T) − ∆W_ij(0, t). Assuming that the simulation starts at t = 0, the weight change from the start to time t is given by

$$\begin{aligned} \Delta W_{ij}(0, t) &= \eta \int_0^t dt' \int_0^{t'} dt''\, \kappa(t' - t'')\, V^*_i(t'')\, s^*_j(t'') \\ &= \eta \int_0^t dt'' \int_{t''}^{t} dt'\, \kappa(t' - t'')\, V^*_i(t'')\, s^*_j(t'') \\ &= \eta \int_0^t dt''\, \left[\tilde\kappa(t - t'') - \tilde\kappa(0)\right] V^*_i(t'')\, s^*_j(t'') \\ &= \eta \left[ I_1(0, t) - I_2(0, t) \right], \end{aligned}$$

where we exchanged the order of integration from the first to the second line. In the third line we introduced κ̃(t), defined by κ(t) = ∂κ̃(t)/∂t, and in the fourth line we defined the two integrals

$$I_1(t_0, t_1) = -\int_{t_0}^{t_1} dt'\, \tilde\kappa(0)\, V^*_i(t')\, s^*_j(t'), \qquad I_2(t_0, t_1) = -\int_{t_0}^{t_1} dt'\, \tilde\kappa(t_1 - t')\, V^*_i(t')\, s^*_j(t').$$

In the case of the Urbanczik-Senn rule κ̃(t) = −e^{−t/τ_κ}, which implies the identities

$$I_1(t_0, t_1 + \Delta t) = I_1(t_0, t_1) + I_1(t_1, t_1 + \Delta t), \qquad I_2(t_0, t_1 + \Delta t) = e^{-\frac{\Delta t}{\tau_\kappa}}\, I_2(t_0, t_1) + I_2(t_1, t_1 + \Delta t),$$

which we use to write the weight change from t to T as

$$\begin{aligned} \Delta W_{ij}(t, T) &= \Delta W_{ij}(0, T) - \Delta W_{ij}(0, t) \\ &= \eta \left[ I_1(0, T) - I_2(0, T) - I_1(0, t) + I_2(0, t) \right] \\ &= \eta \left[ I_1(t, T) - I_2(t, T) + I_2(0, t) \left( 1 - e^{-\frac{T - t}{\tau_\kappa}} \right) \right]. \end{aligned}$$

This is the result (17).
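The carry-over identity for I_2 used in the synapse update holds exactly even after discretization; the following short check illustrates this on a random history (the grid and product samples are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(0)
h, tau = 0.1, 30.0
n, k = 400, 250              # total grid points and the split point t
prod = rng.normal(size=n)    # samples of V*_i(t) * s*_j(t) on the grid
ts = h * np.arange(n)

def I2(k0, k1):
    """Discrete I2(a, b) = h * sum_{t'} e^{-(b - t')/tau} V*_i(t') s*_j(t'),
    with b identified with the last grid point of the segment."""
    return h * np.sum(np.exp(-(ts[k1 - 1] - ts[k0:k1]) / tau) * prod[k0:k1])

# recursion used by the synapse: I2(0, T) = e^{-(T - t)/tau} I2(0, t) + I2(t, T)
lhs = I2(0, n)
rhs = np.exp(-(ts[n - 1] - ts[k - 1]) / tau) * I2(0, k) + I2(k, n)
print(np.isclose(lhs, rhs))  # True: the carry-over term is exact
```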
In the simulation experiment shown in Figure 10, the dendrite of a conductance-based two-compartment model neuron receives a spike pattern of duration T as input via plastic synapses. The pattern consists of N_p independent Poisson spike trains with a firing rate f_p. For learning, the pattern is repeated N_rep times. The dendritic synapses adapt their weights so that after learning the somatic membrane potential U and the dendritic prediction V*_w follow a matching potential U_M. The latter is created by somatic input via two spike_generators that are connected via a static excitatory or inhibitory connection, respectively. Both spike generators send spikes in every simulation step. Inhibitory input spikes have a constant weight to generate a constant somatic inhibitory conductance g_I. Excitatory spikes have a modulated weight to generate a periodic excitatory conductance g_E. The input to the dendritic compartment is provided by N_p spike_generators, each of which is connected to one parrot_neuron, which in turn is connected to the dendrite via a plastic urbanczik_synapse. The intermediate parrot_neurons are required since in NEST the spike_generators can have only static synapses as outgoing connections. The spike times of the spike_generators are set to repeatedly generate the spike pattern created before the start of the actual simulation. The neuron's state variables are read out by a multimeter and the synaptic weights by a weight_recorder.

The weight change of the Urbanczik-Senn rule as presented in Section 3.2, in line with the original publication, is driven by the prediction error

$$V^*_i = (s_i - \phi(V_i))\, h(V_i),$$

where s_i is the somatic spike train and V_i the dendritic prediction of the somatic membrane potential U_i. Instead of integrating over the difference between the spike train and the rate φ(V_i) (spike-rate), one can derive two variants,

$$V^*_i = \left(s_i - s_i^{\mathrm{dend}}\right) h(V_i) \quad \text{(spike-spike)} \qquad \text{and} \qquad V^*_i = \left(\phi(U_i) - \phi(V_i)\right) h(V_i) \quad \text{(rate-rate)}.$$

In the first one (spike-spike) we replace the dendritic rate prediction by a noisy realization s_i^dend using an inhomogeneous Poisson process with rate φ(V_i). In the second one (rate-rate) the somatic spike train is replaced by the rate of the underlying Poisson process, which is computed by applying the rate function φ to the somatic potential U_i. The learning of a matching potential U_M as described in Section 5.5 also works in these two cases. Figure 12 shows the learning curves for all three variants of the Urbanczik-Senn rule. The loss is defined as the mismatch between U_i and U_M averaged over one period T_p of the input pattern,

$$\frac{1}{T_p} \int_0^{T_p} dt\, \left( U(t) - U_M(t) \right)^2.$$

The decrease of the loss as a function of the pattern repetitions has a similar shape for all three variants, with a significantly higher variance in the case of the spike-spike version.

Figure 12. Comparison of learning curves in the experiment described in Section 5.5 for different variants of the Urbanczik-Senn plasticity rule. The loss is averaged over trials with different input patterns. Solid curves denote the mean value and the shaded area the corresponding standard deviation of the loss.
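The three variants of the prediction error can be sketched for a single time step as follows; phi and h_fn are placeholder nonlinearities, and the treatment of spikes as counts per step is an illustrative discretization.

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(v):
    # placeholder rate function (Hz), not the one used in the NEST models
    return 100.0 / (1.0 + np.exp(-(v + 55.0)))

def h_fn(v):
    return 1.0  # placeholder weighting factor

def prediction_error(variant, s_soma, V_dend, U_soma, dt):
    """Return V*_i for one time step of length dt (s); s_soma is a spike count."""
    if variant == "spike-rate":        # original rule: spikes vs. rate prediction
        return (s_soma / dt - phi(V_dend)) * h_fn(V_dend)
    if variant == "spike-spike":       # sample the dendritic prediction as spikes
        s_dend = rng.poisson(phi(V_dend) * dt)
        return (s_soma - s_dend) / dt * h_fn(V_dend)
    if variant == "rate-rate":         # replace spikes by the underlying rate
        return (phi(U_soma) - phi(V_dend)) * h_fn(V_dend)
    raise ValueError(variant)
```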
REFERENCES
Albers, C., Westkott, M., and Pawelzik, K. (2016). Learning of precise spike times with homeostatic membrane potential dependent synaptic plasticity. PLOS ONE 11(2), 1–28. doi:10.1371/journal.pone.0148948
Bellec, G., Scherr, F., Subramoney, A., Hajek, E., Salaj, D., Legenstein, R., et al. (2019). A solution to the learning dilemma for recurrent networks of spiking neurons. bioRxiv, 738385
Bi, G. and Poo, M. (1998). Synaptic modifications in cultured hippocampal neurons: dependence on spike timing, synaptic strength, and postsynaptic cell type. J. Neurosci. 18, 10464–10472
Billeh, Y. N., Cai, B., Gratiy, S. L., Dai, K., Iyer, R., Gouwens, N. W., et al. (2020). Systematic integration of structural and functional data into multi-scale models of mouse primary visual cortex. Neuron
Bono, J. and Clopath, C. (2017). Modeling somatic and dendritic spike mediated plasticity at the single neuron and network level. Nat. Commun. 8(1), 1–17
Brea, J., Gaal, A. T., Urbanczik, R., and Senn, W. (2016). Prospective coding by spiking neurons. PLOS Comput. Biol. 12(6), 1–25. doi:10.1371/journal.pcbi.1005003
Brea, J., Senn, W., and Pfister, J.-P. (2013). Matching recall and storage in sequence learning with spiking neural networks. J. Neurosci. 33(23), 9565–9575
Brette, R. and Gerstner, W. (2005). Adaptive exponential integrate-and-fire model as an effective description of neuronal activity. J. Neurophysiol. 94(5), 3637–3642
Brette, R., Rudolph, M., Carnevale, T., Hines, M., Beeman, D., Bower, J. M., et al. (2007). Simulation of networks of spiking neurons: a review of tools and strategies. J. Comput. Neurosci. 23(3), 349–398. doi:10.1007/s10827-007-0038-6
Brunel, N. (2000). Dynamics of sparsely connected networks of excitatory and inhibitory spiking neurons. J. Comput. Neurosci. 8(3), 183–208. doi:10.1023/a:1008925309027
Clopath, C., Büsing, L., Vasilaki, E., and Gerstner, W. (2010). Connectivity reflects coding: a model of voltage-based STDP with homeostasis. Nat. Neurosci. 13, 344–352
Clopath, C. and Gerstner, W. (2010). Voltage and spike timing interact in STDP – a unified model. Front. Synaptic Neurosci. 2, 25
Davison, A., Brüderle, D., Eppler, J., Kremkow, J., Muller, E., Pecevski, D., et al. (2008). PyNN: a common interface for neuronal network simulators. Front. Neuroinformatics 2, 11. doi:10.3389/neuro.11.011.2008
Diederich, N., Bartsch, T., Kohlstedt, H., and Ziegler, M. (2018). A memristive plasticity model of voltage-based STDP suitable for recurrent bidirectional neural networks in the hippocampus. Scientific Reports 8(1), 1–12
Djurfeldt, M., Davison, A. P., and Eppler, J. M. (2014). Efficient generation of connectivity in neuronal networks from simulator-independent descriptions. Front. Neuroinformatics 8, 43. doi:10.3389/fninf.2014.00043
Djurfeldt, M., Hjorth, J., Eppler, J. M., Dudani, N., Helias, M., Potjans, T. C., et al. (2010). Run-time interoperability between neuronal network simulators based on the MUSIC framework. Neuroinformatics 8, 43–60
Eppler, J. M., Helias, M., Muller, E., Diesmann, M., and Gewaltig, M. (2009). PyNEST: a convenient interface to the NEST simulator. Front. Neuroinformatics 2, 12. doi:10.3389/neuro.11.012.2008
Gerstner, W., Kempter, R., van Hemmen, J. L., and Wagner, H. (1996). A neuronal learning rule for sub-millisecond temporal coding. Nature 383, 76–78
Gerstner, W., Kistler, W. M., Naud, R., and Paninski, L. (2014). Neuronal Dynamics. From Single Neurons to Networks and Models of Cognition (Cambridge University Press, Cambridge)
Gewaltig, M.-O. and Diesmann, M. (2007). NEST (NEural Simulation Tool). Scholarpedia 2(4), 1430. doi:10.4249/scholarpedia.1430
Hannun, A., Case, C., Casper, J., Catanzaro, B., Diamos, G., Elsen, E., et al. (2014). Deep Speech: scaling up end-to-end speech recognition. arXiv preprint arXiv:1412.5567
Hebb, D. O. (1949). The Organization of Behavior: A Neuropsychological Theory (John Wiley & Sons, New York)
Hines, M. L. and Carnevale, N. T. (2001). NEURON: a tool for neuroscientists. Neuroscientist 7(2), 123–135. doi:10.1177/107385840100700207
Hines, M. L., Morse, T., Migliore, M., Carnevale, N. T., and Shepherd, G. M. (2004). ModelDB: a database to support computational neuroscience. J. Comput. Neurosci. 17(1), 7–11. doi:10.1023/B:JCNS.0000023869.22017.2e
Hinton, G. E., Osindero, S., and Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation 18(7), 1527–1554
Jordan, J., Ippen, T., Helias, M., Kitayama, I., Sato, M., Igarashi, J., et al. (2018). Extremely scalable spiking neuronal network simulation code: from laptops to exascale computers. Front. Neuroinformatics 12, 2. doi:10.3389/fninf.2018.00002
Jordan, J., Mørk, H., Vennemo, S. B., Terhorst, D., Peyser, A., Ippen, T., et al. (2019). NEST 2.18.0. doi:10.5281/zenodo.2605422
Jülich Supercomputing Centre (2015). JUQUEEN: IBM Blue Gene/Q supercomputer system at the Jülich Supercomputing Centre
Ko, H., Cossell, L., Baragli, C., Antolik, J., Clopath, C., Hofer, S. B., et al. (2013). The emergence of functional microcircuits in visual cortex. Nature 496(7443), 96–100
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 1097–1105
Kunkel, S. (2015). Simulation technology for plastic neuronal networks on high-performance computers. 1–188. doi:10.6094/UNIFR/10419
Kunkel, S., Diesmann, M., and Morrison, A. (2011). Limits to the development of feed-forward structures in large recurrent neuronal networks. Front. Comput. Neurosci. 4
LeCun, Y. (1985). Une procédure d'apprentissage pour réseau à seuil asymétrique (a learning scheme for asymmetric threshold networks). In Proceedings of Cognitiva 85, Paris, France, 599–604
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521(7553), 436–444
Legenstein, R. and Maass, W. (2011). Branch-specific plasticity enables self-organization of nonlinear computation in single neurons. J. Neurosci. 31(30), 10787–10802. doi:10.1523/JNEUROSCI.5684-10.2011
Litwin-Kumar, A. and Doiron, B. (2014). Formation and maintenance of neuronal assemblies through synaptic plasticity. Nat. Commun. 5(1), 1–12
Lytton, W. W., Seidenstein, A. H., Dura-Bernal, S., McDougal, R. A., Schürmann, F., and Hines, M. L. (2016). Simulation neurotechnologies for advancing brain research: parallelizing large networks in NEURON. Neural Computation 28(10), 2063–2090. doi:10.1162/neco_a_00876
Maes, A., Barahona, M., and Clopath, C. (2020). Learning spatiotemporal signals using a recurrent spiking network that discretizes time. PLOS Comput. Biol. 16(1), e1007606
Markram, H., Lübke, J., Frotscher, M., and Sakmann, B. (1997). Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science 275, 213–215
Markram, H., Muller, E., Ramaswamy, S., Reimann, M. W., Abdellah, M., Sanchez, C. A., et al. (2015). Reconstruction and simulation of neocortical microcircuitry. Cell 163(2), 456–492. doi:10.1016/j.cell.2015.09.029
Mayr, C. G. and Partzsch, J. (2010). Rate and pulse based plasticity governed by local synaptic state variables. Front. Synaptic Neurosci. 2, 33
Morrison, A., Aertsen, A., and Diesmann, M. (2007a). Spike-timing-dependent plasticity in balanced random networks. Neural Comput. 19, 1437–1467. doi:10.1162/neco.2007.19.6.1437
Morrison, A., Diesmann, M., and Gerstner, W. (2008). Phenomenological models of synaptic plasticity based on spike timing. Biol. Cybern. 98, 459–478
Morrison, A., Mehring, C., Geisel, T., Aertsen, A., and Diesmann, M. (2005). Advancing the boundaries of high-connectivity network simulation with distributed computing. Neural Comput. 17(8), 1776–1801. doi:10.1162/0899766054026648
Morrison, A., Straube, S., Plesser, H. E., and Diesmann, M. (2007b). Exact subthreshold integration with continuous spike times in discrete-time neural network simulations. Neural Comput. 19(1), 47–79. doi:10.1162/neco.2007.19.1.47
Nordlie, E., Gewaltig, M.-O., and Plesser, H. E. (2009). Towards reproducible descriptions of neuronal network models. PLOS Comput. Biol. 5(8), e1000456. doi:10.1371/journal.pcbi.1000456
Parker, D. (1985). Learning logic. Technical Report TR-47
Plesser, H., Diesmann, M., Gewaltig, M.-O., and Morrison, A. (2015). NEST: the neural simulation tool. In D. Jaeger and R. Jung, eds., Encyclopedia of Computational Neuroscience (Springer New York), 1849–1852. doi:10.1007/978-1-4614-6675-8_258
Plotnikov, D., Blundell, I., Ippen, T., Eppler, J. M., Morrison, A., and Rumpe, B. (2016). NESTML: a modeling language for spiking neurons. In Modellierung 2016 Conference, volume 254 of LNI (Bonner Köllen Verlag, Bonn), 93–108
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). Learning representations by back-propagating errors. Nature 323, 533–536. doi:10.1038/323533a0
Sacramento, J., Costa, R. P., Bengio, Y., and Senn, W. (2018). Dendritic cortical microcircuits approximate the backpropagation algorithm. In Advances in Neural Information Processing Systems, 8721–8732
Sadeh, S., Clopath, C., and Rotter, S. (2015). Emergence of functional specificity in balanced networks with synaptic plasticity. PLOS Comput. Biol. 11(6), 1–27. doi:10.1371/journal.pcbi.1004307
Schmidt, M., Bakker, R., Shen, K., Bezgin, G., Diesmann, M., and van Albada, S. J. (2018). A multi-scale layer-resolved spiking network model of resting-state dynamics in macaque visual cortical areas. PLOS Comput. Biol. 14(10), e1006359. doi:10.1371/journal.pcbi.1006359
Sheik, S., Paul, S., Augustine, C., and Cauwenberghs, G. (2016). Membrane-dependent neuromorphic learning rule for unsupervised spike pattern detection. In 2016 IEEE Biomedical Circuits and Systems Conference (BioCAS) (IEEE), 164–167
Sjöström, P., Turrigiano, G., and Nelson, S. (2001). Rate, timing, and cooperativity jointly determine cortical synaptic plasticity. Neuron 32, 1149–1164
Song, S., Sjöström, P., Reigl, M., Nelson, S., and Chklovskii, D. (2005). Highly nonrandom features of synaptic connectivity in local cortical circuits. PLoS Biol. 3(3), e68
Stimberg, M., Goodman, D., Benichoux, V., and Brette, R. (2014). Equation-oriented specification of neural models for simulations. Front. Neuroinformatics 8, 6. doi:10.3389/fninf.2014.00006
Urbanczik, R. and Senn, W. (2014). Learning by the dendritic prediction of somatic spiking. Neuron 81(3), 521–528
Werbos, P. (1974). Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences (Harvard University)
Yger, P. and Harris, K. D. (2013). The Convallis rule for unsupervised learning in cortical networks. PLOS Comput. Biol. 9(10), 1–16. doi:10.1371/journal.pcbi.1003272
A: Simulation parameters
Symbol   Value            Description
f_pair   [10, …, 50] Hz   frequency of occurrence of spike pairs
Δt       ±10 ms           time shift of spike pair
w_init   0.…              initial weight (unit is mV for the aeif and pA/ms for the hh neuron)

B: Parameters of aeif_psc_delta_clopath
Symbol     Value    Description
E_L        −… mV    leak reversal potential
g_L        30 nS    leak conductance
C_m        281 pF   membrane capacitance
V_reset    −60 mV   reset value of membr. pot. after spike
V_peak     33 mV    spike detection threshold
Δ_T        …        slope factor
τ_w        144 ms   spike adaptation time constant
τ_z        40 ms    spike adaptation time constant
V_th,max   …        threshold potential after spike
τ_V,th     50 ms    threshold potential time constant
a          …        subthreshold adaptation
b          …        spike triggered adaptation
θ−         −… mV    threshold of ū−
θ+         −… mV    threshold of ū+
A_LTD      … mV⁻¹   amplitude of LTD
A_LTP      … mV⁻¹   amplitude of LTP
τ−         10 ms    time constant of ū−
τ+         … ms     time constant of ū+
τ_s        15 ms    time constant of s*_j
d_s        …        delay of ū±

C: Parameters of hh_psc_alpha_clopath
Symbol   Value        Description
E_L      −54.402 mV   leak reversal potential
E_Na     … mV         sodium reversal potential
E_K      −… mV        potassium reversal potential
g_L      … nS         leak conductance
g_Na     … nS         sodium peak conductance
g_K      … nS         potassium peak conductance
C_m      100 pF       membrane capacitance
τ_ex     … ms         rise time of the exc. synaptic alpha function
τ_in     … ms         rise time of the inh. synaptic alpha function
θ−       −… mV        threshold of ū−
θ+       −35 mV       threshold of ū+
A_LTD    … mV⁻¹       amplitude of LTD
A_LTP    … mV⁻¹       amplitude of LTP
τ−       10 ms        time constant of ū−
τ+       114 ms       time constant of ū+
τ_s      15 ms        time constant of s*_j
d_s      …            delay of ū±

Table 2. Parameters of the spike pairing experiment using the Clopath rule. The values for the aeif model are taken from (Clopath et al., 2010, Tab. 1 and appendix) and those for the hh model are extracted from the reference implementation by B. Torben-Nielsen on ModelDB (Hines et al., 2004).
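The pairing protocol of Table 2 translates directly into a short PyNEST script. The sketch below uses NEST 3 syntax and sets only the parameter values recovered in Table 2; the number of pairs, the initial and drive weights are illustrative placeholders, and the exact parameter keys should be checked against the documentation of aeif_psc_delta_clopath.

```python
import nest

nest.ResetKernel()

delta_t = 10.0    # time shift of spike pair (ms), Table 2A
f_pair = 10.0     # pairing frequency (Hz), Table 2A
n_pairs = 5       # number of pairs (illustrative)
isi = 1000.0 / f_pair

# Postsynaptic Clopath neuron; only the values recovered in Table 2B are set.
post = nest.Create("aeif_psc_delta_clopath", params={
    "C_m": 281.0,      # membrane capacitance (pF)
    "g_L": 30.0,       # leak conductance (nS)
    "V_reset": -60.0,  # reset after spike (mV)
    "tau_w": 144.0,    # adaptation time constant (ms)
})

pre_times = [10.0 + i * isi for i in range(n_pairs)]
post_times = [t + delta_t for t in pre_times]

pre_gen = nest.Create("spike_generator", params={"spike_times": pre_times})
drive = nest.Create("spike_generator", params={"spike_times": post_times})

# Plastic synapses cannot originate from devices; relay via a parrot neuron.
pre = nest.Create("parrot_neuron")
nest.Connect(pre_gen, pre)

# Connection under test (initial weight is a placeholder).
nest.Connect(pre, post, syn_spec={"synapse_model": "clopath_synapse",
                                  "weight": 0.5})
# Strong static connection that enforces the postsynaptic spike of each pair.
nest.Connect(drive, post, syn_spec={"synapse_model": "static_synapse",
                                    "weight": 80.0})

nest.Simulate(n_pairs * isi + 100.0)
print(nest.GetConnections(pre, post).get("weight"))
```

Reading out the weight after the simulation, as in the last line, reproduces one data point of the pairing curve; sweeping f_pair and the sign of Δt yields the full protocol.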
A: Model summary
Populations    three: excitatory, inhibitory, external input
Connectivity   all-to-all, fixed out-degree, fixed in-degree
Neuron model   adaptive exponential integrate-and-fire (aeif, Clopath)
Plasticity     Clopath synapse
Input          independent homogeneous Poisson spike trains
Measurements   synapse weight

B: Populations
Name    Elements            Population size
E       aeif                N_E = 10
I       aeif                N_I = 3
E_ext   Poisson generator   N_p = 500

C: Connectivity
Name     Source   Target   Pattern
ExcExc   E        E        all-to-all (no autapses)
ExcInh   E        I        fixed in-degree C_E = 8
InhExc   I        E        fixed out-degree C_I = 6
ExtExc   E_ext    E        all-to-all
ExtInh   E_ext    I        all-to-all

D: Neurons
Name         aeif_psc_delta_clopath
Type         adaptive exponential integrate-and-fire
Details      see Clopath et al. (2010)
Parameters   see Table 4

E: Synapses
Name     Model             Initial weight [mV]              Max. weight [mV]
ExcExc   clopath_synapse   0.25                             0.…
ExcInh   static_synapse    0.…                              −
InhExc   static_synapse    0.…                              −
ExtExc   clopath_synapse   random uniform from [0.…, ….5]   3.…
ExtInh   static_synapse    random uniform from [0.…, …]     −

F: Input
Type          Poisson generator
Description   homogeneous Poisson spike trains, independent for each neuron, modulated rate
Parameters    see Table 4
Table 3. Model description of the network producing bidirectional connections using the Clopath rule, after Nordlie et al. (2009). The neuron and input parameters are given in Table 4.
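A network as described in Table 3 can be assembled with NEST's built-in connection rules. The following sketch (NEST 3 syntax) uses the recovered population sizes, degrees, and the ExcExc initial weight; the Poisson rate and all weights lost in Table 3E are placeholders.

```python
import nest

nest.ResetKernel()

NE, NI, NP = 10, 3, 500   # population sizes, Table 3B

E = nest.Create("aeif_psc_delta_clopath", NE)
I = nest.Create("aeif_psc_delta_clopath", NI)

# Plastic synapses cannot originate from devices, so each Poisson input is
# relayed through a parrot neuron. The input rate is a placeholder.
ext = nest.Create("poisson_generator", NP, params={"rate": 2.0})
relays = nest.Create("parrot_neuron", NP)
nest.Connect(ext, relays, "one_to_one")

# Recurrent excitation: all-to-all without self-connections, plastic.
nest.Connect(E, E, {"rule": "all_to_all", "allow_autapses": False},
             {"synapse_model": "clopath_synapse", "weight": 0.25})
# E -> I with fixed in-degree, I -> E with fixed out-degree (Table 3C);
# the static weights stand in for the values lost in Table 3E.
nest.Connect(E, I, {"rule": "fixed_indegree", "indegree": 8},
             {"synapse_model": "static_synapse", "weight": 0.5})
nest.Connect(I, E, {"rule": "fixed_outdegree", "outdegree": 6},
             {"synapse_model": "static_synapse", "weight": -0.5})
# External drive: plastic onto E, static onto I, uniformly random weights.
nest.Connect(relays, E, "all_to_all",
             {"synapse_model": "clopath_synapse",
              "weight": nest.random.uniform(0.0, 0.5)})
nest.Connect(relays, I, "all_to_all",
             {"synapse_model": "static_synapse",
              "weight": nest.random.uniform(0.0, 0.5)})

nest.Simulate(1000.0)
```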
A: Parameters of aeif_psc_delta_clopath
Symbol     Value    Description
E_L        −… mV    leak reversal potential
g_L        30 nS    leak conductance
C_m        281 pF   membrane capacitance
V_reset    −60 mV   reset value of membr. pot. after spike
V_peak     20 mV    spike detection threshold
Δ_T        …        slope factor
τ_w        144 ms   spike adaptation time constant
τ_z        40 ms    spike adaptation time constant
V_th,max   …        threshold potential after spike
τ_V,th     50 ms    threshold potential time constant
a          …        subthreshold adaptation
b          …        spike triggered adaptation
θ−         −… mV    threshold of ū−
θ+         −… mV    threshold of ū+
A_LTD      … mV⁻¹   amplitude of LTD
u_ref      …        reference value for ū̄
τ_ū̄        … ms     time constant of ū̄
A_LTP      … mV⁻¹   amplitude of LTP
τ−         10 ms    time constant of ū−
τ+         … ms     time constant of ū+
τ_s        15 ms    time constant of s*_j
d_s        …        delay of ū±

B: Input parameters
Symbol   Value        Description
A_p      60 Hz        amplitude of Gaussian rate profile
σ_p      …            width of Gaussian rate profile
c_p      ….48 Hz      offset of Gaussian rate profile
s_p      [25, …, …]   set of possible values for the center μ_p of the Gaussian
t_μ      100 ms       interval after which a new value for μ_p is drawn
N_μ      …            number of intervals t_μ

Table 4. Neuron and input parameters for the simulation of the network producing bidirectional connections using the Clopath rule. The values are taken from (Clopath et al., 2010, Tab. 1 and appendix). The same values are used for the performance measurements shown in Figure 8 and Figure 9.
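The input of Table 4B, a Gaussian rate profile over the input channels whose center μ_p is redrawn every t_μ, can be realized with NEST's inhomogeneous_poisson_generator. In the sketch below, the number of channels, the width σ_p, the grid of possible centers, and the reading c_p = 0.48 Hz are assumptions where the table leaves gaps.

```python
import numpy as np
import nest

nest.ResetKernel()

N = 500                  # number of input channels (assumed)
A_p = 60.0               # amplitude of the Gaussian rate profile (Hz)
c_p = 0.48               # offset of the rate profile (Hz, assumed reading)
sigma_p = 10.0           # width of the Gaussian (assumed)
t_mu = 100.0             # interval after which mu_p is redrawn (ms)
N_mu = 50                # number of intervals (assumed)
centers = np.arange(25, N, 50)   # possible centers mu_p (assumed grid)

rng = np.random.default_rng(1)
mu_draws = rng.choice(centers, size=N_mu)   # one center per interval
rate_times = np.arange(N_mu) * t_mu + 0.1   # change points, must be > 0

gens = nest.Create("inhomogeneous_poisson_generator", N)
for i, g in enumerate(gens):
    # Rate of channel i in every interval: Gaussian bump around mu_p plus offset.
    rates = A_p * np.exp(-((i - mu_draws) ** 2) / (2.0 * sigma_p ** 2)) + c_p
    g.set(rate_times=rate_times.tolist(), rate_values=rates.tolist())

nest.Simulate(N_mu * t_mu)
```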
A: Simulation parameters
Symbol   Value                    Description
T_p      …                        pattern duration
N_rep    …                        number of pattern repetitions
N_p      …                        number of input spike trains
f_p      10 Hz                    input firing rate
w_gE     (18 sin(2πt) + 4.8) nS   weights to generate periodic excitatory conductance
w_gI     …                        weights to generate constant inhibitory conductance

B: Parameters of pp_cond_exp_mc_urbanczik (soma)
Symbol   Value    Description
C_m      300 pF   membrane capacitance
E_L      −70 mV   leak reversal potential
g_L      …        leak conductance
E_ex     …        exc. reversal potential
E_in     −…       inh. reversal potential
τ_ex     …        rise time of the exc. synaptic alpha function
τ_in     …        rise time of the inh. synaptic alpha function
t_ref    …        refractory time

C: Parameters of pp_cond_exp_mc_urbanczik (dendrite)
Symbol   Value    Description
C_m      300 pF   membrane capacitance
E_L      −70 mV   leak reversal potential
g_L      …        leak conductance
τ_ex     …        rise time of the exc. synaptic alpha function
τ_in     …        rise time of the inh. synaptic alpha function
φ(U)     0.15 kHz / (1 + exp(−55 mV − U))   rate function
g_sp     …        coupling dendrite to soma

Table 5. Parameters of the simulation of the learning experiment using the Urbanczik-Senn rule. The same values of the neuron parameters are used for the performance measurements shown in Figure 11 and Figure 9.
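To make the role of the dendritic transfer function concrete, the following self-contained sketch evaluates φ(U) as given in Table 5C (with the per-mV scaling of the exponent implied) and draws spikes from the resulting inhomogeneous Poisson process; the membrane-potential trace is a toy signal, not the simulated neuron.

```python
import numpy as np

def phi(U):
    """Dendritic rate function of Table 5C: rate in kHz for U in mV."""
    return 0.15 / (1.0 + np.exp(-55.0 - U))

# Inhomogeneous Poisson spiking: in a timestep of length dt (ms) the neuron
# fires with probability phi(U) * dt, since phi is given in kHz = 1/ms.
rng = np.random.default_rng(42)
dt = 0.1                                          # ms
U = -50.0 + 5.0 * rng.standard_normal(100_000)    # toy membrane-potential trace
spikes = rng.random(U.size) < phi(U) * dt
print(f"mean rate: {spikes.mean() / dt * 1000.0:.1f} Hz")
```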
A: Model summary
Populations    three: excitatory, inhibitory, external input
Connectivity   random with fixed in-degree
Neuron model   adaptive exponential integrate-and-fire (aeif, Clopath) / two-compartment Poisson (two-comp., Urbanczik-Senn)
Plasticity     Clopath synapse / Urbanczik-Senn synapse
Input          independent homogeneous Poisson spike trains
Measurements   —

B: Populations
Name    Elements            Population size
E       aeif/two-comp.      N_E = 4 N_I
I       aeif/two-comp.      N_I
E_ext   Poisson generator   …

C: Connectivity
Name   Source   Target   Pattern               Weight
Exc    E        E+I      fixed in-degree C_E   J_ex
Inh    I        E+I      fixed in-degree C_I   J_in
Ext    E_ext    E+I      all-to-all            J

D: Neurons
Name         aeif_psc_delta_clopath
Type         adaptive exponential integrate-and-fire
Details      see Clopath et al. (2010)
Parameters   see Table 4

Name         pp_cond_exp_mc_urbanczik
Type         two-compartment neuron with spike generation via inhomogeneous Poisson process
Details      see Urbanczik and Senn (2014)
Parameters   see Table 5

E: Synapses
Name   Model
Exc    clopath/urbanczik_synapse
Inh    clopath/urbanczik_synapse
Ext    static_synapse

F: Input
Type                Description
Poisson generator   homogeneous Poisson spike trains, independent for each neuron, rate f_ext = ν_ext C_E

Table 6. Model description of the Brunel network after Nordlie et al. (2009). This network is used to produce the performance measurements shown in Figure 8, Figure 9, and Figure 11. The values of the parameters are shown in Table 7.
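The connectivity of Table 6C corresponds to NEST's fixed_indegree rule, and a single poisson_generator suffices for the external drive because NEST devices send an independent spike-train realization to every target. In the sketch below (NEST 3 syntax, Clopath variant), population sizes, in-degrees, weights, and the external rate are placeholders standing in for the values of Table 7.

```python
import nest

nest.ResetKernel()

NI = 250
NE = 4 * NI             # N_E = 4 N_I, Table 6B
CE, CI = 100, 25        # fixed in-degrees (placeholders for Table 7 values)
J_ex, J_in = 0.1, -0.5  # synaptic weights (placeholders)

E = nest.Create("aeif_psc_delta_clopath", NE)
I = nest.Create("aeif_psc_delta_clopath", NI)
# External drive with rate f_ext = nu_ext * C_E (placeholder value).
ext = nest.Create("poisson_generator", params={"rate": 8000.0})

# Excitatory projections are plastic; Table 6E lists plastic models for the
# inhibitory projections as well, kept static here to keep the sketch minimal.
nest.Connect(E, E + I, {"rule": "fixed_indegree", "indegree": CE},
             {"synapse_model": "clopath_synapse", "weight": J_ex})
nest.Connect(I, E + I, {"rule": "fixed_indegree", "indegree": CI},
             {"synapse_model": "static_synapse", "weight": J_in})
nest.Connect(ext, E + I, "all_to_all",
             {"synapse_model": "static_synapse", "weight": J_ex})

nest.Simulate(1000.0)
```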