Nonlinear computations in spiking neural networks through multiplicative synapses
Michele Nardin*, James W Phillips, William F Podlaski, Sander W Keemink*
Institute of Science and Technology Austria, Klosterneuburg, Austria
Champalimaud Research, Champalimaud Centre for the Unknown, Lisbon, Portugal
September 9, 2020
Abstract
The brain performs many nonlinear computations through intricate spiking neural networks (SNNs). How neural network dynamics relate to arbitrary computations under these constraints is still an open question. As a strong constraint, these networks are hypothesized to be robust to perturbations and to use minimal energy. The theory of Spike Coding Networks (SCNs) derives the required connectivity and dynamics for both information representation and linear dynamical systems from first principles, and achieves robustness and efficiency. Nonlinear dynamical systems have thus far only been implemented in SCNs by filtering neural inputs through sets of nonlinear dendritic basis functions. While this approach works well, it relies on providing a rich enough basis set as well as supervised training of the connectivity weights. Another way to implement nonlinear computations is through multiplicatively interacting synapses. However, there is currently no principled way to implement such synapses in SCNs. Here, we extend the core SCN derivations to implement polynomial dynamical systems, from which the need for such multiplicatively interacting synapses also arises. We demonstrate our approach with a highly accurate Lorenz attractor implementation, as well as a second-order approximation of a double pendulum. We additionally demonstrate how to implement higher-order polynomials using sequential networks with only pair-wise synapses. Finally, we derive upper bounds and expected numbers of connections based on the sparsity of the underlying representation. Overall, our work provides an alternative way to directly implement nonlinear computations in spike coding networks, and expands our understanding of the potential functions of multiplicative synapses. Furthermore, due to the high accuracy and low energy usage of our approach, this work may be of interest for neuromorphic computing.
A central quest in neuroscience is to understand how spiking neural networks (SNNs) are able to perform nonlinear computations accurately, efficiently, and robustly. An important hint is that one can often make sense of complex population responses with purely linear read-outs such as PCA (Keemink and Machens, 2019). In the theory of spike-coding networks (SCNs) (Boerlin et al., 2013; Barrett et al., 2016; Denève and Machens, 2016; Calaim et al., 2020), the core assumptions are that signal representations can indeed be read out linearly from the network, and that spiking should be as efficient as possible (with each spike contributing to the read-out). SCNs (a sub-class of SNNs) are derived from these first principles, with efficient representation being achieved through fast recurrent connections, and linear dynamical computations through slow recurrent connections (Boerlin et al., 2013). SCNs are consistent with many features of biology (such as sparse and irregular activity, robustness to cell loss, and excitation/inhibition balance). However, only linear computations are possible with the standard SCN derivation.

∗ Correspondence: [email protected]; [email protected]

Previous theories of nonlinear computations in SNNs relied on a basis-function approach through neural (Eliasmith, 2005) or dendritic nonlinearities (in SCNs) (Thalmeier et al., 2016; Alemi et al., 2018). This approach, though powerful, depends on a suitable choice of basis functions and supervised training of the weights. As of yet there is no principled way to directly derive the connectivity required for a given nonlinear dynamics, which would be useful both for achieving an optimal network implementation of a particular system, and for gaining a more intuitive understanding of how the network dynamics operate.
A promising alternative approach to nonlinear computation is to use multiplicative synapses, which in principle enable computations of polynomial form (Koch and Poggio, 1992; Salinas and Abbott, 1996; Nezis and Rossum, 2011). However, how to combine such multiplicative synapses optimally for a specific computation in SNNs is not currently known.

Here we extend the original SCN derivation for linear dynamics (Boerlin et al., 2013) to polynomial dynamical systems. We demonstrate that the resulting connectivity predicts the existence of an additional set of slow connections with multiplicatively interacting synapses. We demonstrate the applicability of our approach through an exact implementation of the Lorenz system, as well as a second-order implementation of a double pendulum. While polynomial systems can in principle approximate any other system (De Branges, 1959), this can quickly become infeasible, as higher-order computations require higher-order synapses (pair-wise, triplets, quadruplets, etc.), resulting in a dense and complicated all-to-all connectivity structure. We solve this by demonstrating how higher-order computations can be approximated by successive network layers with solely pair-wise synapses, and we additionally demonstrate (through a precise quantification of the connection density) that the connectivity does not need to be all-to-all.

Our approach thus provides a novel understanding of how multiplicative synapses might be used to implement nonlinear computations at the network level. We do not claim that multiplicative-synapse SCNs are more realistic or effective than the basis-function approach, but they provide an alternative hypothesis for how nonlinear computations might be done. Finally, in terms of possible neuromorphic applications, whenever a system (such as the Lorenz attractor) is well described by a lower-order polynomial, multiplicative synapses and SCNs offer an efficient solution.
Here and in the following sections we present the main results, and refer the reader to the Methods section for details. Consider a network of N spiking neurons, which emit spike trains of the form s_i(t) = Σ_k δ(t_ik − t), where δ is the Dirac delta function and {t_ik ≥ 0} is the set of discrete times at which a spike was emitted. The population spike train is described by the vector s(t) = [s_1(t), ..., s_N(t)]ᵀ. Vectors will be denoted by lower-case bold letters, and wherever possible we will omit the time index for the sake of clarity.

Suppose that a given K-dimensional signal x(t) ∈ R^K should be represented by the output activity of the network. How should the neurons then spike to accomplish this task? The theory of spike-coding networks approaches this through two core assumptions (Boerlin et al., 2013; Denève and Machens, 2016; Barrett et al., 2016):

(a) Linear decoding: the network representation x̂(t) is read out as x̂ = Dr, where D ∈ R^{K×N} is the decoding matrix, and r(t) = [r_1(t), ..., r_N(t)]ᵀ are filtered spike trains, ṙ = −λr + s, where λ is the leak time-constant. The variable r can be seen as a neuron's time-dependent rate, or equivalently as the effect of a neuron's spikes on the post-synaptic potential of other neurons.

(b) Efficient spiking: D_i (the i-th column vector of the matrix D) represents the contribution that a spike from neuron i has on each dimension of the read-out signal; more specifically, a spike at time t will update the current readout as x̂(t) → x̂(t) + D_i. Assumption (b) requires that this spike should only occur if it improves the read-out. Formally, we require that a spike reduces the ℓ₂-error between the readout and the signal.
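As a minimal sketch of assumption (a), the filtered spike trains r and the linear readout x̂ = Dr can be simulated directly. All names and parameter values below are illustrative choices, not taken from the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 10, 2                                 # neurons, signal dimensions
T, dt, lam = 2000, 1e-3, 10.0                # steps, step size, leak constant
D = 0.1 * rng.normal(size=(K, N))            # decoding matrix
spikes = (rng.random((T, N)) < 0.02) / dt    # random spike trains s(t)
r = np.zeros(N)
xhat = np.zeros((T, K))
for t in range(T):
    r = r + dt * (-lam * r + spikes[t])      # r_dot = -lam * r + s
    xhat[t] = D @ r                          # linear readout x_hat = D r
print(xhat.shape)
```

Each row of `xhat` is the decoded signal at one time step; in a real SCN the spikes would of course come from the greedy rule below rather than at random.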
Thus, neuron i will fire at time t iff

    ‖x(t) − (x̂(t) + D_i)‖² < ‖x(t) − x̂(t)‖².

After some algebra (see Methods 7.1), and defining the membrane potential of each neuron as V_i = D_iᵀ(x − x̂), one finds an underlying dynamical description of the system in which each neuron spikes whenever V_i > T_i, with T_i = D_iᵀD_i / 2, and membrane potential dynamics

    v̇ = −λv + Dᵀ(ẋ + λx) − DᵀDs,   (1)

where v = (V_1, ..., V_N)ᵀ. Thus, starting from the two core assumptions, we have derived a recurrently connected network of leaky integrate-and-fire neurons. Through the recurrent connections (given by −DᵀD) the network accurately tracks its input signal x (Fig. 1B+C, top row). These networks exhibit many common properties observed in biological networks, such as irregular yet highly efficient spiking activity (Boerlin et al., 2013; Denève and Machens, 2016; Barrett et al., 2016).

In the above derivation the signal x was provided directly to the network, but this is not strictly necessary. If x follows some known linear dynamics ẋ = Ax (with A ∈ R^{K×K}), then its trajectory can be computed by the network (Boerlin et al., 2013). The derivation uses the fact that x ≈ x̂, so that (1) can be approximated as

    v̇ = −λv + Dᵀ(Ax̂ + λx̂) − DᵀDs = −λv + Ω_f s + Ω_s r,   (2)

where the so-called "fast" connections Ω_f = −DᵀD keep the error constrained on a short time-scale, and the "slow" connections Ω_s = Dᵀ(A + λI)D implement the dynamical computation using the filtered spikes r. While technically an approximation, this implementation works well in practice and can closely reproduce a given linear dynamical system (Fig. 1B, middle).

The approximation in (2) was originally conceived for linear systems, but can in principle be extended to any arbitrary dynamical system ẋ = F(x) (with F: R^K → R^K).
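Before turning to nonlinear dynamics, the machinery of eq. (1) can be sketched end-to-end: the network below tracks an externally supplied sinusoid through its fast connections, spiking greedily whenever a voltage crosses threshold. The parameters and the one-spike-per-step discretization are illustrative choices, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, dt, lam = 40, 2, 1e-4, 10.0
D = rng.normal(size=(K, N))
D = 0.1 * D / np.linalg.norm(D, axis=0)      # decoding columns of norm 0.1
Thr = np.sum(D**2, axis=0) / 2               # thresholds T_i = ||D_i||^2 / 2
Of = -D.T @ D                                # fast connections (incl. self-reset)
v, r = np.zeros(N), np.zeros(N)
steps = 10000                                # 1 s of simulated time
err = np.zeros(steps)
for t in range(steps):
    tt = t * dt
    x = np.array([np.sin(2*np.pi*tt), np.cos(2*np.pi*tt)])
    xdot = 2*np.pi*np.array([np.cos(2*np.pi*tt), -np.sin(2*np.pi*tt)])
    v += dt * (-lam * v + D.T @ (xdot + lam * x))   # input term of eq. (1)
    i = np.argmax(v - Thr)
    if v[i] > Thr[i]:                        # greedy rule: at most one spike/step
        v += Of[:, i]                        # fast recurrence, applied at spikes
        r[i] += 1.0
    r += dt * (-lam) * r                     # leak of the filtered spike trains
    err[t] = np.linalg.norm(D @ r - x)
print(err[steps//2:].mean())
```

After a short transient the readout Dr hugs the signal, with a residual error set by the decoder magnitudes (here on the order of ‖D_i‖).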
The full network dynamics then become

    v̇ = −λv + Dᵀ(F(x̂) + λx̂) − DᵀDs,   (3)

with the problem that the nonlinear function F(·) has to somehow be computed by the network (or by individual neurons). Previous work approximated this computation through a set of basis functions (Thalmeier et al. (2016); Alemi et al. (2018); see Methods 7.5), which can be interpreted as dendritic nonlinearities. Here we take a different approach. We note that any smooth nonlinear function can in principle be approximated by a polynomial transformation (De Branges, 1959). Furthermore, any polynomial function F: R^K → R^K containing terms with maximum degree g can be written in the form

    F(x) = Σ_{d=0}^{g} A_d x^{⊗d},   (4)
Figure 1: Multiplicative synapses in Spike Coding Networks can implement polynomial dynamics. (A) Schematic representation of the network. (B) The network has three types of synapses, illustrated for two neurons (red and green) connecting to another (blue). The postsynaptic potential (PSP) of a cell endowed with multiplicative synapses will be affected only if the two presynaptic neurons fire very close in time to each other. (C) Example computations: (top) the network represents the two inputs (x and y) in the first two dimensions; the blue line represents the output of the network, the dashed black lines the real input. (Middle) A network which computes the dynamical system z = x + y; the black dashed line represents the real sum of the inputs, the blue line the output of the network. (Bottom) A network which computes the nonlinear dynamical system ż = x·y, and thus integrates the product of x and y.

where A_d ∈ R^{K×K^d} is the matrix of coefficients for the polynomials of degree d, and we define M^{⊗d} = M ⊗ M ⊗ ⋯ ⊗ M as the Kronecker product applied d times, with the convention that M^{⊗0} = 1 and M^{⊗1} = M. The Kronecker product is closely related to the outer product, and computes all possible pair-wise multiplications between the elements of two matrices. For example, the Kronecker product of two vectors of length l is itself a new vector, of length l² (see Methods 7.3 for a detailed example).

Using this notation, the connectivity and dynamics for a network implementing a polynomial function can be directly derived as

    v̇ = −λv − DᵀDs + Dᵀ(Σ_{d=0}^{g} A_d D^{⊗d} r^{⊗d} + λx̂)
      = −λv + Ω_f s + Ω^m_{0s} + Ω^m_{1s} r + Ω^m_{2s} r^{⊗2} + ⋯ + Ω^m_{gs} r^{⊗g}
      = −λv + Ω_f s + Σ_{d=0}^{g} Ω^m_{ds} r^{⊗d},   (5)

where Ω_f = −DᵀD, Ω^m_{1s} = Dᵀ(A_1 + λI)D, and Ω^m_{ds} = DᵀA_d D^{⊗d} for d ∈ {0, 2, 3, ..., g}. The matrix Ω^m_{ds} represents the d-th degree multiplicative interactions between cells, through multiplicative synapses (Fig. 1B, bottom). While higher-order interactions are unlikely to be feasible, lower-order multiplicative interactions may indeed be possible in biology, and have been hypothesized before (Koch and Poggio, 1992). In particular, Ω^m_{2s} represents the connectivity required for each cell to multiply each pair of its inputs (Fig. 1B, bottom row). Compared to linear dynamics, multiplicative synapses enable nonlinear computations such as AND gates (Fig. 1C,D).
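The Kronecker product at the heart of eqs. (4)-(5) is easy to illustrate; numpy's `np.kron` implements it, and for a vector of length 3 the product x ⊗ x lists all nine pairwise products:

```python
import numpy as np

x = np.array([2., 3., 5.])   # e.g. a three-dimensional state (x, y, z)
xx = np.kron(x, x)           # x ⊗ x: all pairwise products
print(xx)                    # [ 4.  6. 10.  6.  9. 15. 10. 15. 25.]
# length 3 -> length 9 = 3^2; entries are
# (x*x, x*y, x*z, y*x, y*y, y*z, z*x, z*y, z*z)
```

Equivalently, x ⊗ x is the flattened outer product of x with itself, which is why a row of A_2 acting on it can pick out any quadratic monomial.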
Therefore, the presence of multiplicative synapses arises naturally from extending the spike-coding framework to polynomial dynamical systems.

In principle, increasingly complex nonlinear dynamics may be implemented through the inclusion of higher-order terms in (5) (Ω^m_d for d > 2), though this flexibility comes at an increased cost in the number of synapses and interactions thereof. In a later section we will show how to avoid interactions beyond pair-wise synapses, and we will derive the expected number of connections for each type of synapse.

We illustrate the functionality of this extended SCN formalism through an implementation of a simple dynamical system, the Lorenz attractor. The Lorenz attractor is a system of ordinary differential equations first studied by Edward Lorenz, which may lead to chaotic solutions (Lorenz, 1963; Strogatz, 2018). Notably, this system contains multiplicative terms of the state variables, thereby making it a polynomial (and nonlinear) dynamical system. It is defined as

    ẋ = σ(y − x),   ẏ = x(ρ − z) − y,   ż = xy − βz.

In the following we use the "classical" parameter values σ = 10, β = 8/3, and ρ = 28, for which the system is in a chaotic regime. This is a useful case study, as the resulting behavior is very sensitive to small representation errors, and it has been used to test previous spiking network implementations (e.g., Thalmeier et al. (2016)). Denoting x = (x, y, z)ᵀ, we can rewrite the Lorenz attractor in the format of (4), such that ẋ = Ax + Bx ⊗ x, with

    A = ( −σ   σ   0
           ρ  −1   0
           0   0  −β ),   (6)

and B ∈ R^{3×9} with B_{2,3} = −1 (the −xz term in ẏ), B_{3,2} = 1 (the xy term in ż), and all other elements of B being zero.

We implemented the Lorenz system in three ways, each using networks of N = 100 neurons. First, we simulated the Lorenz system in a standard numerical simulation (Runge-Kutta method), and fed the dynamic variables x directly as input into an autoencoding network with only fast synapses.
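The rewriting in eq. (6) can be checked numerically: with the B entries placed as above (0-based indexing in code), Ax + B(x ⊗ x) reproduces the Lorenz vector field exactly. A small sketch:

```python
import numpy as np

sigma, rho, beta = 10.0, 28.0, 8.0/3.0
A = np.array([[-sigma, sigma,  0.0],
              [rho,    -1.0,   0.0],
              [0.0,     0.0, -beta]])
B = np.zeros((3, 9))
# x ⊗ x = (x², xy, xz, yx, y², yz, zx, zy, z²): xz is entry 2, xy is entry 1
B[1, 2] = -1.0        # the -xz term in y_dot
B[2, 1] = 1.0         # the +xy term in z_dot

def f_poly(u):
    return A @ u + B @ np.kron(u, u)

def f_lorenz(u):
    x, y, z = u
    return np.array([sigma*(y - x), x*(rho - z) - y, x*y - beta*z])

u = np.array([1.3, -0.7, 22.0])
print(np.allclose(f_poly(u), f_lorenz(u)))   # True
```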
This control acted as an upper bound on the accuracy of representation achievable with a spiking network of a given size and read-out weight magnitude. As expected, the network represented the system with high fidelity; only small deviations arose compared to the standard numerical simulation, due to the discrete spiking representation of the network (Fig. 2Aii, dotted lines). Also note that the correct trajectory is continuously being fed into this network. To get a better idea of the accuracy of the representation, we followed Thalmeier et al. (2016) and compared the values of neighboring peaks in the dynamics of the z variable, which closely tracked the well-defined function traced out by the pure Lorenz simulation (Fig. 2Aiii).

Next, we implemented the SCN with multiplicative synapses as defined above, which computes the Lorenz dynamics internally. The corresponding voltage dynamics were

    v̇ = −λv + Ω_f s + Ω^m_{1s} r + Ω^m_{2s} r^{⊗2},

where Ω_f = −DᵀD, Ω^m_{1s} = Dᵀ(A + λI)D, and Ω^m_{2s} = DᵀB(D ⊗ D). The resulting network is able to compute the Lorenz dynamics with high accuracy (Fig. 2B). The representation tracked the dynamics of the standard numerical simulation (dotted lines) for a reasonable amount of time, despite the attractor's chaotic nature (Fig. 2Biii), though this depends on the simulation time step (set to . ms here). Additionally, despite the deviations from the 'true' trajectory, the peak analysis demonstrates that qualitatively the implementation is near perfect (Fig. 2Biv). Furthermore, the network simulation can be implemented with a small number of neurons, and thus still displays the stereotypical robustness of SCNs (Supp. Fig. 1).

Lastly, for comparison's sake, we implemented the same dynamical system using basis functions with trained weights (Methods 7.5), in order to understand the benefits and drawbacks of each approach. We used basis functions per neuron. In contrast to the two previous schemes, the implementation with basis functions led to more inaccuracies (Fig.
2C), quickly resulting in missed shifts in the dynamics. This was likely due to the approximate nature of the basis function implementation, and we note that more precise simulations might be possible with different basis functions (see e.g. Fig. 3f-h in Thalmeier et al. (2016); though even there, outliers are still present).

This might suggest that the direct implementation of the Lorenz system with multiplicative synapses is capable of more accurate dynamics than a basis function implementation. However, we note three caveats here. First, the accuracy of the dynamics depends on the nature of the underlying system; e.g., small inaccuracies would matter less for a system with stable fixed points. Second, the Lorenz attractor is perfectly described by a polynomial, and other dynamical systems might be better described by a basis-function implementation (of similar complexity as a given multiplicative-synapse SCN). Third, the differences in accuracy and scaling of the two implementations suggest that each may be more suitable depending upon the specific problem at hand (on the order of N³ parameters are needed for the multiplicative synapse implementation and bN² for the basis function implementation, where b is the number of basis functions).

Figure 2: Implementation of a Lorenz dynamical system. Across columns: (A) The numerical solution was found using an explicit Runge-Kutta method of order 4. The network contains only fast synapses and receives as input the output of the Lorenz simulation, which the network read-out tracks closely. (B) A network with multiplicative synapses computing the Lorenz attractor through its network dynamics. (C) A network with nonlinear basis functions computing the Lorenz attractor through its network dynamics. Across rows: (i), (ii) Each network readout dimension (blue) across time vs. the 'true' solution (black dotted); the gray region indicates the 2-5 s period used in panels (i) and (iii). (iii) Raster plot with spikes emitted by the neurons in the time interval 2-5 s. (iv) Peak analysis over 100 s: blue = network output, black = 'true' Lorenz simulation.
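The peak analysis of panel (iv) is easy to reproduce for the reference simulation: integrate the Lorenz system, collect successive local maxima of z, and compare z_{n+1} against z_n. The RK4 step size and initial condition below are illustrative choices:

```python
import numpy as np

sigma, rho, beta = 10.0, 28.0, 8.0/3.0

def f(u):
    x, y, z = u
    return np.array([sigma*(y - x), x*(rho - z) - y, x*y - beta*z])

def rk4_step(u, dt):
    k1 = f(u); k2 = f(u + dt/2*k1); k3 = f(u + dt/2*k2); k4 = f(u + dt*k3)
    return u + dt/6 * (k1 + 2*k2 + 2*k3 + k4)

dt, steps = 1e-3, 50000                # 50 s of simulated time
u = np.array([1.0, 1.0, 1.05])
z = np.empty(steps)
for i in range(steps):
    u = rk4_step(u, dt)
    z[i] = u[2]
# successive local maxima of z(t): the classic Lorenz "peak map"
peaks = [z[i] for i in range(1, steps - 1) if z[i-1] < z[i] > z[i+1]]
pairs = list(zip(peaks[:-1], peaks[1:]))   # points (z_n, z_{n+1}) of the map
print(len(peaks))
```

Plotting `pairs` yields the tent-like curve against which the network readouts are compared in Fig. 2 (iv).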
Higher-order polynomials as sequential pairwise multiplications
While the Lorenz system is a good case study for demonstrating the power and accuracy of SCNs with multiplicative synapses, a core problem remains: higher-order polynomials necessitate higher-order multiplicative interactions. E.g., a polynomial of order 3 would require an r^{⊗3} term, with on the order of N⁴ synapses. Such precise higher-order interactions may not always be feasible, either biologically or on a neuromorphic substrate. However, as we show here, this is not strictly necessary: across populations, it is possible to combine many sequential pairwise interactions to achieve multiplications of any other order.

In principle, an SCN can also represent a nonlinear transformation of a signal, G(x), where G is a smooth function G: R^K → R^M, M ≥ 1. For this, consider a new SCN with decoder W, spikes σ, and filtered spike trains ρ. In that case v evolves according to

    v̇ = −λv + Wᵀ(J_G(x)ẋ + λG(x)) − WᵀWσ,   (7)

where J_G is the Jacobian of G (Methods 7.1). The problem here is that the transformation function has to be either provided or computed by the network. However, if G is a polynomial function, the computation can be implemented using multiplicative synapses; e.g., for quadratic terms we can use G(x) = x ⊗ x. Specifically, we need a network that takes as input the output of another network, x̂ = Dr (with spikes s), and returns Wρ ≈ x̂^{⊗2} (Supp. Fig. 2, Methods 7.2). Using the fact that ṙ = −λr + s, we obtain dynamics

    v̇ = −λv + Ω_x(r ⊗ s + s ⊗ r − λ r ⊗ r) + Ω^W_f σ,

with Ω_x = Wᵀ(D ⊗ D) and Ω^W_f = −WᵀW.

Now, the combination of this network with another of the form described in eq. (5) results in a system with the ability to compute third-order multiplications with only pairwise (second-order) synapses. Notably, one network computes the pairwise multiplications, and the other computes the desired third-order dynamic equation using another pairwise multiplication of those new variables (Fig. 3A, Supp. Fig.
2).

More concretely, we consider a polynomial function F: R^K → R^K with maximum degree g = 3. We can write F(x) = Ax + Bx^{⊗2} + Cx^{⊗3} using eq. (4). Naively, a network of neurons that approximates the solution to ẋ = F(x) can be written using eq. (5) as

    v̇ = −λv + Ω_f s + Ω^m_{1s} r + Ω^m_{2s} r^{⊗2} + Ω^m_{3s} r^{⊗3},

which contains the third-order synapses in the last term. However, we now reintroduce the first network (from Section 4.1), with outputs Wρ ≈ x̂^{⊗2}, which allows the term Ω^m_{3s} r^{⊗3} to be replaced by DᵀC(D ⊗ W)(r ⊗ ρ), yielding

    v̇ = −λv + Ω_f s + Ω^m_{1s} r + Ω^m_{2s} r^{⊗2} + Ω^ext_s (r ⊗ ρ),   (8)

where Ω^ext_s = DᵀC(D ⊗ W). The same argument can be extended to higher-order multiplications, at the cost of having ⌈log₂(g) − 1⌉ support networks, where g is the maximum degree of F.

Figure 3: Third-order polynomial dynamics solved by sequential pairwise multiplications. (A)
To avoid third-order multiplications, another network can be used whose output is the pairwise multiplication of any two input dimensions (which can be done through only pair-wise synapses). (B) Example output of a network computing the double pendulum and using an external network to avoid the third-order multiplications. (C) Solution computed by employing a neural network with third-order synapses (dashed line) or employing two neural networks to avoid the third-order multiplications (dash-dotted line), compared to the numerical solution of the dynamical system (dotted line). All solutions almost perfectly overlap.
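The replacement of Ω^m_{3s} r^{⊗3} by DᵀC(D ⊗ W)(r ⊗ ρ) rests on the Kronecker mixed-product property, (D ⊗ W)(r ⊗ ρ) = (Dr) ⊗ (Wρ), which is quick to verify numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
D = rng.normal(size=(3, 5)); W = rng.normal(size=(9, 7))
r = rng.normal(size=5);      rho = rng.normal(size=7)

lhs = np.kron(D, W) @ np.kron(r, rho)    # (D ⊗ W)(r ⊗ ρ)
rhs = np.kron(D @ r, W @ rho)            # (D r) ⊗ (W ρ)
print(np.allclose(lhs, rhs))             # True
```

So whenever the support network delivers Wρ ≈ x̂^{⊗2} = (Dr) ⊗ (Dr), the term (D ⊗ W)(r ⊗ ρ) approximates x̂^{⊗3} using only a pairwise product of the two populations' filtered spike trains.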
We illustrate the use of the higher-order polynomial implementation using the double pendulum as an example (Fig. 3B). Suppose that each pendulum has length l and mass m. We denote by θ₁, θ₂ the angles of the first and second pendulum with respect to the vertical axis (i.e., θ_i = 0 when the pendulum is pointing downwards), and by p_θ₁ and p_θ₂ their momenta. The full double-pendulum dynamics can be derived using the Lagrangian (Methods 7.4; e.g., Levien and Tan (1993)). For small angles θ₁, θ₂ one can consider the approximation sin x ≈ x and cos x ≈ 1, which leads to the following approximated dynamics:

    θ̇₁ = 6/(7ml²) (2p_θ₁ − 3p_θ₂)
    θ̇₂ = 6/(7ml²) (8p_θ₂ − 3p_θ₁)
    ṗ_θ₁ = −(ml²/2) ( θ̇₁θ̇₂(θ₁ − θ₂) + 3(g/l)θ₁ )
    ṗ_θ₂ = −(ml²/2) ( −θ̇₁θ̇₂(θ₁ − θ₂) + (g/l)θ₂ ).

We denote x = (θ₁, θ₂, p_θ₁, p_θ₂)ᵀ, and rewrite the system as ẋ = Ax + Cx^{⊗3} (A and C are defined explicitly in Methods 7.4).

We implemented the double pendulum system in three distinct ways. As before for the Lorenz system, we first simulated a control network that was simply asked to autoencode the dynamics directly, which were computed externally. Next, we implemented two SCNs: one network computed the dynamics through explicit third-order multiplicative synapses (as in (5)). The other implementation utilized the trick that one network computes the pairwise multiplication of the variables, and the other network uses that to compute the third-order multiplication of the variables (as in (8)). We found that these two dynamic SCN implementations produced accurate representations of the dynamics, closely following the autoencoder network which received the "true" solution directly.
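The small-angle dynamics above can be integrated directly as a sanity check. This is a sketch assuming unit masses and lengths (the text leaves the numerical values unspecified), with an RK4 step chosen for illustration:

```python
import numpy as np

m, l, g = 1.0, 1.0, 9.81   # assumed values, not from the paper

def f(u):
    th1, th2, p1, p2 = u
    dth1 = 6.0 / (7*m*l**2) * (2*p1 - 3*p2)
    dth2 = 6.0 / (7*m*l**2) * (8*p2 - 3*p1)
    dp1 = -0.5*m*l**2 * ( dth1*dth2*(th1 - th2) + 3*(g/l)*th1)
    dp2 = -0.5*m*l**2 * (-dth1*dth2*(th1 - th2) + (g/l)*th2)
    return np.array([dth1, dth2, dp1, dp2])

def rk4_step(u, dt):
    k1 = f(u); k2 = f(u + dt/2*k1); k3 = f(u + dt/2*k2); k4 = f(u + dt*k3)
    return u + dt/6 * (k1 + 2*k2 + 2*k3 + k4)

u, dt, steps = np.array([0.1, -0.05, 0.0, 0.0]), 1e-3, 10000   # 10 s
traj = np.empty((steps, 4))
for i in range(steps):
    u = rk4_step(u, dt)
    traj[i] = u
print(np.abs(traj[:, :2]).max())   # angles stay small, as the approximation assumes
```

The linear part dominates for these initial conditions, so the trajectory is a bounded quasi-periodic oscillation, consistent with the small-angle assumption under which the cubic terms were derived.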
Figure 4: Expected number of connections for fast synapses (A), slow synapses (B), and slow multiplicative (quadratic) synapses (C), as a function of the decoder density p for different signal dimensions (K) and dynamical-system densities (N_A for the linear part and N_B for the quadratic part, representing the number of non-zero entries of the matrices A and B, respectively). Dashed lines represent the theoretical upper bounds, whereas shaded areas represent ± standard deviation around the average of the simulated connectivity (see Methods 7.6).

As shown in the previous sections, SCNs with multiplicative synapses offer a powerful and intuitive way of implementing polynomial dynamical systems in spiking networks. However, though they may be efficient with respect to the number of neurons and spikes required, they can require dense synaptic connections (sometimes with several connections for each pair of neurons). Specifically, in the SCN framework, any two neurons with an overlap in signal representation will be connected by fast and slow synapses, along with the possibility of additional slow multiplicative connections with the derivations introduced here. So far we have assumed that all neurons share signal space, resulting in full all-to-all connectivity for all types of connections (i.e., N² fast connections, N² slow connections, N³ pair-wise multiplicative synapses, and so on). However, connectivity in the brain is known to be sparse (Song et al., 2005; Lefort et al., 2009), and it is reasonable to assume that each neuron is not likely to code for all possible stimulus-space features. Given this constraint, it is important to address the expected density of the various connectivities, especially the costly multiplicative ones.

We consider the three connectivity matrices required for second-order multiplicative computations: the fast connections Ω_f = −DᵀD, the slow connections Ω^m_{1s} = Dᵀ(A + λI)D, and the multiplicative connections Ω^m_{2s} = DᵀB(D ⊗ D).
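For the fast connections, a closed form is easy to state under the independence assumption used below (each neuron codes for each dimension independently with probability p): two neurons are connected iff their coding sets overlap, which happens with probability 1 − (1 − p²)^K. This is the kind of quantity plotted in Fig. 4A; the Monte Carlo sketch below illustrates the logic, not the paper's exact derivation:

```python
import numpy as np

rng = np.random.default_rng(3)
p, K, pairs = 0.3, 5, 20000
# coding sets of two neurons: dimension k is coded with probability p each
a = rng.random((pairs, K)) < p
b = rng.random((pairs, K)) < p
mc = np.mean(np.any(a & b, axis=1))    # fraction of pairs sharing >= 1 dimension
pred = 1 - (1 - p**2)**K               # closed-form overlap probability
print(mc, pred)
```

With continuous random decoder entries, a shared dimension makes the inner product D_iᵀD_j non-zero almost surely, so overlap probability and fast-connection density coincide.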
The fast connections only depend on the density of the decoder, and any two neurons are connected whenever they share a decoding dimension. For the slow and multiplicative connections this density additionally depends on the 'density' of the dynamical-system interactions, i.e., the number of non-zero elements in the matrices A and B. Considering a fixed probability p that a given neuron codes for a given signal dimension, the full density then depends on both p and the signal dimension K for the fast connections (Fig. 4A), and additionally on the dynamical-system density for the slow and multiplicative connections (Fig. 4B,C; see Methods 7.6). Although this connection probability rises sharply as the signal dimensionality increases, we see that for low to moderate numbers of signals the connectivity density is far below all-to-all. Notably, the slow and multiplicative connections have a slower rise compared to the fast connections, which may push the network closer to a biologically-plausible regime (though this merits further analysis).

In this report, we extended the spike-coding network (SCN) framework (Boerlin et al., 2013; Denève and Machens, 2016; Calaim et al., 2020) to derive the network connectivity required to implement arbitrary polynomial dynamics, for which fast, slow, and multiplicative connections are necessary. For second-order systems the connectivity requires pair-wise multiplicative synapses, and we demonstrated how higher-order multiplications can be implemented in several network stages with only pair-wise synapses. We illustrated the use of this formalism by simulating the Lorenz system and a third-order approximation of a double pendulum. Furthermore, due to the rich flexibility of polynomials for approximating arbitrary nonlinear functions (Stone-Weierstrass theorem, or Taylor expansion), this approach could in principle be extended to many other systems for which a lower-order polynomial approximation is sufficient.
Lastly, we also analyzed the relationship between the sparsity of signal coding per neuron and the connectivity sparsity (for fast, slow, and multiplicative connections), and showed how the need for all-to-all connectivity can be relaxed.

The previous approach to implementing nonlinear dynamical systems in SCNs was to use sets of basis functions to filter a neuron's input (Thalmeier et al., 2016; Alemi et al., 2018), with the connectivity required for a given dynamical system implemented through supervised training (as we also implemented in Fig. 2). We do not claim that our approach is superior to basis functions, but in certain situations it can offer advantages. Notably, it offers a direct derivation of the required connectivity for any polynomial system, and thus results in well-defined and interpretable connectivity architectures which directly relate to the underlying dynamics. Furthermore, it avoids the potentially computationally intensive task of training the connectivity, as well as choosing meta-parameters such as basis-function shapes. Finally, in scaling up to larger, high-dimensional problems, we suspect that the training algorithms may fare poorly, in which case having a direct and interpretable derivation would be highly useful.

Previous studies have noted the wide range of nonlinear computations that multiplicative synapses enable (Koch and Poggio, 1992; Nezis and Rossum, 2011). Our contribution here was to present an optimal derivation in the SCN framework, in which multiplicative synapses arise naturally out of the extension to polynomial dynamical systems. The biological plausibility of such multiplicative synapses is questionable, however, and they are unlikely to exist in this exact form biologically. Many types of effective multiplication are known to exist in the dendritic tree (Koch and Poggio, 1992), and future work would be needed to verify whether or not accurate polynomial computations are still possible in such a setting.
More generally, other studies have characterized multiplicative operations in biological networks (Peña and Konishi, 2001; Gabbiani et al., 2002; Zhou et al., 2007), indicating that there may be a role for multiplicative synaptic interactions in neural circuits.

Despite these limitations, we contend that SCNs with multiplicative synapses offer two benefits even if their biological plausibility is called into question. First, the fact that the implementation of polynomial dynamics is direct and explicit implies that this technique offers a useful comparative control when considering the possible computations that spiking networks can perform, as well as the limits of their accuracy. This model may therefore serve as a useful reference for future studies. Second, as many neuromorphic hardware applications rely on networks of integrate-and-fire type neurons, this setting is in principle less constrained by biological plausibility. In this context, the constraint of high pair-wise connectivity may be an acceptable cost to pay for a precise recipe for the connectivity required for a given nonlinear computation. Unlike the approximate basis-function approach, an implementation with multiplicative synapses would not need to undergo a possibly time-consuming training phase.

In sum, we have provided a proof of concept of direct and explicit polynomial dynamics implemented in spiking networks. Future directions include the application of this framework to other biologically-plausible and neuromorphic computations, a study of the efficiency of this framework, and the potential for biologically-plausible learning of the connectivity.
Methods
We will show here a generalization of the derivation of spike coding networks (SCNs) given in Barrett et al. (2016), ignoring the constraint that neurons need to be either excitatory or inhibitory. Consider a network of $N$ leaky integrate-and-fire neurons receiving time-varying inputs $x(t) = (x_1(t), \ldots, x_K(t))$, where $K$ is the dimension of the input. For each neuron $i$ we denote by $s_i(t) = \sum_k \delta(t_{ik} - t)$ the spike train, where $\delta$ is the Dirac delta function and $\{t_{ik} \ge 0\}$ is the set of discrete times at which a spike was emitted. The population spike train is described by the vector $s(t) = (s_1(t), \ldots, s_N(t))^T$. The postsynaptic potential (loosely called firing rate) of neuron $i$ is defined as a convolution of the spike train with an exponentially decaying kernel,

$$r_i(t) = \int_0^t e^{-\lambda t'} s_i(t - t')\, dt' = \sum_{t_{ik} \le t} e^{-\lambda (t - t_{ik})} \tag{9}$$

with leak constant $\lambda$, or, equivalently, in the differential form $\dot{r}_i(t) = -\lambda r_i(t) + s_i(t)$. We denote the firing rates of the population by the vector $r(t) = (r_1(t), \ldots, r_N(t))^T$. Vectors will be denoted by bold letters, and wherever possible we will omit the explicit dependence on time for the sake of clarity. A neuron $i$ fires a spike whenever its membrane potential $V_i$ exceeds a spiking threshold $T_i$, and is then reset to the value $V_i = R_i$.

Consider a generic smooth function $G : \mathbb{R}^K \to \mathbb{R}^M$, $M \ge 1$. Our goal is to derive the dynamics and connectivity of the network such that its output activity provides an accurate representation of the transformed incoming signal $y = G(x) \in \mathbb{R}^M$. Notice that, using the identity function, one recovers the form considered in Barrett et al. (2016). Following the assumptions made in the main text, we require the signals to be linearly decodable, so that the readout can simply be written as $\hat{y} = Dr \approx G(x)$.
The matrix $D \in \mathbb{R}^{M \times N}$ is called the decoding matrix, and its $i$-th column vector $D_i \in \mathbb{R}^M$ is the fixed contribution of neuron $i$ to the signal. The accuracy of the representation is measured using a squared-error loss function, $E = \|y - \hat{y}\|^2 = \|G(x) - \hat{y}\|^2$. The second assumption made in the main text requires the network to be efficient, and can be formalized by asking that a neuron fire a spike only if its effect on the readout will reduce the loss function: $E(\text{spike}) < E(\text{no spike})$. Noticing that a spike of neuron $i$ changes the readout by $\hat{y} \to \hat{y} + D_i$, this condition reads

$$\|G(x) - (\hat{y} + D_i)\|^2 < \|G(x) - \hat{y}\|^2 \tag{10}$$

After expanding the squares and canceling equal terms we obtain

$$\|D_i\|^2 - 2 D_i^T (G(x) - \hat{y}) < 0 \tag{11}$$

which can be rearranged into

$$D_i^T (G(x) - \hat{y}) > \frac{\|D_i\|^2}{2} \tag{12}$$

This equation is crucial: it describes a spiking rule under which the loss function is reduced, and it offers an enticing geometric interpretation of the behavior of the network (Calaim et al., 2020).¹ The right-hand side of the equation is fixed, and can be interpreted as the spiking threshold of neuron $i$:

$$T_i = \frac{\|D_i\|^2}{2}$$

The left-hand side of the equation, similarly to the derivation shown in Barrett et al. (2016), is used to define the voltage of neuron $i$,

$$V_i = D_i^T (G(x) - \hat{y}), \tag{13}$$

which, taking the derivative, yields

$$\dot{V}_i = D_i^T \left(\frac{dG(x)}{dt} - \frac{d\hat{y}}{dt}\right) = D_i^T (J_G(x)\dot{x}) + \lambda D_i^T \hat{y} - \sum_k D_i^T D_k s_k, \tag{14}$$

where $J_G$ denotes the Jacobian of the function $G$. Using (13), we have that $D_i^T \hat{y} = D_i^T G(x) - V_i$, and substituting this into (14) we obtain

$$\dot{V}_i = -\lambda V_i + D_i^T (J_G(x)\dot{x} + \lambda G(x)) - \sum_k D_i^T D_k s_k \tag{15}$$

This equation describes the dynamic behavior of the voltage of a neuron in a network that represents $G(x)$. We will use the vector form

$$\dot{v} = -\lambda v + D^T (J_G(x)\dot{x} + \lambda G(x)) + \Omega_f s \tag{16}$$

where $\Omega_f = -D^T D$ represents the fast connections among units, and also includes the reset terms on the diagonal.

Consider the function $G : \mathbb{R}^K \to \mathbb{R}^{K^2}$ defined as $G(x) = x \otimes x$, where $\otimes$ is the Kronecker product. Suppose that the input $x = (x_1, \ldots, x_K)$ is given as a linearly decodable signal from an upstream network, such that $x = Dr$, where $r$ describes the filtered spike trains of the upstream neurons and $D$ is their decoding matrix. This generalization requires us to keep track of $\dot{x}$: if the upstream neurons follow equation (9), and we denote by $s$ their spike trains, we have $\dot{x} = D\dot{r} = D(s - \lambda r)$, where $\lambda$ is their leak constant. In order to use eq. (16), we need to compute the Jacobian of the function $G$. It is given by the $K^2 \times K$ matrix $J_G(x)$, with column $i$ given by $\frac{\partial G}{\partial x_i}$. Denote by $D_i$ the $i$-th row of the matrix $D$, and by $[J_G(x)\dot{x}]_{iK+j}$ the $(iK+j)$-th entry of the matrix-vector product $J_G(x)\dot{x} \in \mathbb{R}^{K^2}$, for $0 < i \le j \le K$.

¹Throughout the text, the input signals, the membrane voltages, and the spike trains are all time-dependent quantities, whereas the thresholds, the decay constants, and the connection strengths are all constants.
We have:

$$[J_G(x)\dot{x}]_{iK+j} = \frac{\partial (x_i x_j)}{\partial x_i}\frac{dx_i}{dt} + \frac{\partial (x_i x_j)}{\partial x_j}\frac{dx_j}{dt} = x_j \dot{x}_i + x_i \dot{x}_j = (D_j r)(D_i(s - \lambda r)) + (D_i r)(D_j(s - \lambda r)) = (D_j \otimes D_i + D_i \otimes D_j)(r \otimes (s - \lambda r)) = (D_i \otimes D_j)(r \otimes s + s \otimes r - 2\lambda\, r \otimes r)$$

and

$$[G(x)]_{iK+j} = (D_i r)(D_j r) = (D_i \otimes D_j)(r \otimes r)$$

We can now derive the voltage equations of a network of neurons that represents the product of any pair of input dimensions, using the equation derived in the previous section. Denote by $\sigma$ the spike train of this network, and by $\rho$ its filtered spike train with leak constant $\alpha$. Using eq. (16) we have

$$\dot{v} = -\alpha v + \Omega_x (r \otimes s + s \otimes r + (\alpha - 2\lambda)\, r \otimes r) + \Omega_f^W \sigma,$$

with $\Omega_x = W^T (D \otimes D)$, $\Omega_f^W = -W^T W$, and $W$ being the decoding matrix of this network. An example of the output of such a network can be seen in Supp. Fig. 1. In that case the input was 3-dimensional, and the 9-dimensional output faithfully represented the product of each pair of input dimensions.

By using the identity function $G(x) = x$ in (16) we obtain the "classical" equation

$$\dot{v} = -\lambda v + D^T (\dot{x} + \lambda x) + \Omega_f s \tag{17}$$

This will be the starting point to implement linear and nonlinear computations. Linear dynamical systems were already considered in Boerlin et al. (2013). Here we will focus on a more general class of nonlinearities, namely polynomial nonlinearities, and show that the original formulation can be analytically extended to implement any polynomial nonlinearity. Denote by $F : \mathbb{R}^K \to \mathbb{R}^K$ the dynamics under study, so that $\dot{x} = F(x)$.
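The representation dynamics of eq. (17) can be made concrete with a minimal simulation (our own illustrative sketch, not part of the original methods; the network size, decoder scale, leak constant, and the greedy one-spike-per-step rule are all our choices):

```python
import numpy as np

def run_scn(x, dt=1e-4, lam=50.0, N=20, scale=0.1, seed=0):
    """Euler simulation of eq. (17): v' = -lam*v + D^T(x' + lam*x) + Om_f s,
    with a spike whenever v_i exceeds T_i = ||D_i||^2 / 2."""
    rng = np.random.default_rng(seed)
    K, T = x.shape
    D = rng.standard_normal((K, N))
    D *= scale / np.linalg.norm(D, axis=0)   # decoder columns of norm `scale`
    Om_f = -D.T @ D                          # fast connections (reset on diagonal)
    Th = np.diag(D.T @ D) / 2                # spiking thresholds T_i
    xdot = np.gradient(x, dt, axis=1)        # signal derivative, assumed known
    v = np.zeros(N)
    r = np.zeros(N)
    xhat = np.zeros_like(x)
    for t in range(T):
        s = np.zeros(N)
        i = np.argmax(v - Th)                # greedy: at most one spike per step
        if v[i] > Th[i]:
            s[i] = 1.0
        v = v + dt * (-lam * v + D.T @ (xdot[:, t] + lam * x[:, t])) + Om_f @ s
        r = r - dt * lam * r + s             # filtered spike trains, eq. (9)
        xhat[:, t] = D @ r                   # linear readout
    return xhat
```

The decoded output $Dr$ then tracks the input up to an error on the order of the decoder norm, illustrating the bounded-error coding implied by the spiking rule (12).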
Starting from (17) and knowing that $x \approx \hat{x}$, we can consider the following approximation:

$$\dot{v} = -\lambda v + D^T (F(\hat{x}) + \lambda \hat{x}) + \Omega_f s \tag{18}$$

If $F$ is a linear dynamical system of the form $F(x) = Ax$, with $A \in \mathbb{R}^{K \times K}$, we recover the form considered in Boerlin et al. (2013):

$$\dot{v} = -\lambda v + D^T (A\hat{x} + \lambda \hat{x}) + \Omega_f s = -\lambda v + D^T (ADr + \lambda Dr) + \Omega_f s = -\lambda v + \Omega_s r + \Omega_f s \tag{19}$$

where $\Omega_f = -D^T D$ and $\Omega_s = D^T (A + \lambda I) D$ represent the fast and slow connections, respectively.

If $F$ is polynomial, we proceed as follows. Let $\otimes$ denote the Kronecker product, defined for any pair of matrices $A$, $B$ of arbitrary size as

$$A \otimes B = \begin{pmatrix} a_{11} B & \cdots & a_{1n} B \\ \vdots & \ddots & \vdots \\ a_{m1} B & \cdots & a_{mn} B \end{pmatrix}$$

Among its properties, the one we will use is the mixed-product property: if $A$, $B$, $C$ and $D$ are matrices of such size that one can form the matrix products $AC$ and $BD$, then $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.

Using Kronecker notation, any polynomial $F : \mathbb{R}^K \to \mathbb{R}^K$ of maximum degree $g$ can be written in the form

$$F(x) = \sum_{d=0}^{g} A_d\, x^{\otimes d} \tag{20}$$

where $A_d \in \mathbb{R}^{K \times K^d}$ is the matrix of coefficients for the terms of degree $d$, and we define $M^{\otimes d} = M \otimes M \otimes \cdots \otimes M$ as the Kronecker product applied $d$ times, with the convention that $M^{\otimes 0} = 1$ and $M^{\otimes 1} = M$. Using the notation introduced in (20) and the mixed-product property, we can work out the following:

$$F(\hat{x}) = \sum_{d=0}^{g} A_d\, \hat{x}^{\otimes d} = A_0 + A_1 \hat{x} + A_2 \hat{x}^{\otimes 2} + A_3 \hat{x}^{\otimes 3} + \ldots = A_0 + A_1 Dr + A_2 (Dr)^{\otimes 2} + A_3 (Dr)^{\otimes 3} + \ldots = A_0 + A_1 Dr + A_2 D^{\otimes 2} r^{\otimes 2} + A_3 D^{\otimes 3} r^{\otimes 3} + \ldots = \sum_{d=0}^{g} A_d D^{\otimes d} r^{\otimes d}$$

Inserting this into (18), one obtains the equations describing a network of integrate-and-fire neurons that approximates the solution of a polynomial dynamical system:

$$\dot{v} = -\lambda v - D^T D s + D^T \Big(\sum_{d=0}^{g} A_d D^{\otimes d} r^{\otimes d} + \lambda \hat{x}\Big) = -\lambda v + \Omega_f s + \Omega_{m_0 s} + \Omega_{m_1 s}\, r + \Omega_{m_2 s}\, r^{\otimes 2} + \cdots + \Omega_{m_g s}\, r^{\otimes g} = -\lambda v + \Omega_f s + \sum_{d=0}^{g} \Omega_{m_d s}\, r^{\otimes d} \tag{21}$$

where $\Omega_f = -D^T D$, $\Omega_{m_1 s} = D^T (A_1 + \lambda I) D$, and $\Omega_{m_d s} = D^T A_d D^{\otimes d}$ for $d \in \{0, 2, 3, \ldots, g\}$.

The equations describing the time evolution of the double pendulum, with both rods of length $l$ and both masses $m$, can be derived using the Lagrangian (Levien and Tan, 1993). $\theta_1, \theta_2$ describe the angles of the first and second pendulum with respect to the vertical axis (i.e. $\theta_i = 0$ when the pendulum is pointing downwards). The positions of the centers of mass can be written in terms of these two coordinates: assuming that the origin is at the point of suspension of the first pendulum, its center of mass is at

$$x_1 = \frac{l}{2}\sin\theta_1, \qquad y_1 = -\frac{l}{2}\cos\theta_1$$

and the center of mass of the second pendulum is at

$$x_2 = l\left(\sin\theta_1 + \frac{1}{2}\sin\theta_2\right), \qquad y_2 = -l\left(\cos\theta_1 + \frac{1}{2}\cos\theta_2\right)$$

The full dynamics can be described by a 4-dimensional dynamical system representing the two angles and the two momenta:

$$\dot{\theta}_1 = \frac{6}{ml^2}\, \frac{2 p_{\theta_1} - 3\cos(\theta_1 - \theta_2)\, p_{\theta_2}}{16 - 9\cos^2(\theta_1 - \theta_2)}$$

$$\dot{\theta}_2 = \frac{6}{ml^2}\, \frac{8 p_{\theta_2} - 3\cos(\theta_1 - \theta_2)\, p_{\theta_1}}{16 - 9\cos^2(\theta_1 - \theta_2)}$$

$$\dot{p}_{\theta_1} = -\frac{1}{2} m l^2 \left(\dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) + 3\frac{g}{l}\sin\theta_1\right)$$

$$\dot{p}_{\theta_2} = -\frac{1}{2} m l^2 \left(-\dot{\theta}_1 \dot{\theta}_2 \sin(\theta_1 - \theta_2) + \frac{g}{l}\sin\theta_2\right)$$
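As a numerical sanity check on the equations above (our own sketch, not part of the original methods; the fixed-step RK4 integrator, step size, and initial conditions are arbitrary choices), the system can be integrated directly; for a correct implementation, integrating forward and then backward in time should return to the initial state:

```python
import numpy as np

m, l, g = 1.0, 1.0, 9.81  # assumed unit masses and lengths

def f(y):
    """Right-hand side of the double-pendulum equations (Levien and Tan, 1993);
    y = (theta1, theta2, p_theta1, p_theta2)."""
    th1, th2, p1, p2 = y
    cD, sD = np.cos(th1 - th2), np.sin(th1 - th2)
    den = m * l**2 * (16 - 9 * cD**2)
    dth1 = 6 * (2 * p1 - 3 * cD * p2) / den
    dth2 = 6 * (8 * p2 - 3 * cD * p1) / den
    dp1 = -0.5 * m * l**2 * (dth1 * dth2 * sD + 3 * (g / l) * np.sin(th1))
    dp2 = -0.5 * m * l**2 * (-dth1 * dth2 * sD + (g / l) * np.sin(th2))
    return np.array([dth1, dth2, dp1, dp2])

def rk4(y, dt, steps):
    """Classical fixed-step fourth-order Runge-Kutta integrator."""
    for _ in range(steps):
        k1 = f(y)
        k2 = f(y + 0.5 * dt * k1)
        k3 = f(y + 0.5 * dt * k2)
        k4 = f(y + dt * k3)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

y0 = np.array([0.3, -0.2, 0.0, 0.0])  # small-angle initial condition
yT = rk4(y0, 1e-3, 2000)              # 2 s forward
yb = rk4(yT, -1e-3, 2000)             # ... and back
```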
We will use a small-angle approximation of the above equations: if $\theta \approx 0$, the functions $\sin x$, $\cos x$ are well approximated by $x$ and $1$, respectively. This simplifying assumption turns the above equations into

$$\dot{\theta}_1 = \frac{6}{7ml^2}(2 p_{\theta_1} - 3 p_{\theta_2})$$

$$\dot{\theta}_2 = \frac{6}{7ml^2}(8 p_{\theta_2} - 3 p_{\theta_1})$$

$$\dot{p}_{\theta_1} = -\frac{1}{2} m l^2 \left(\dot{\theta}_1 \dot{\theta}_2 (\theta_1 - \theta_2) + 3\frac{g}{l}\theta_1\right)$$

$$\dot{p}_{\theta_2} = -\frac{1}{2} m l^2 \left(-\dot{\theta}_1 \dot{\theta}_2 (\theta_1 - \theta_2) + \frac{g}{l}\theta_2\right) \tag{22}$$

These can be implemented using either equation (21) or (8) by considering $x = (\theta_1, \theta_2, p_{\theta_1}, p_{\theta_2})$ and rewriting the dynamical system as $\dot{x} = A_1 x + C x^{\otimes 3}$, where

$$A_1 = \begin{pmatrix} 0 & 0 & 2k & -3k \\ 0 & 0 & -3k & 8k \\ 3cg/l & 0 & 0 & 0 \\ 0 & cg/l & 0 & 0 \end{pmatrix}, \tag{23}$$

and $C \in \mathbb{R}^{4 \times 64}$, with $k = 6/(7ml^2)$ and $c = -\frac{1}{2}ml^2$. The third row of $C$ carries the cubic term $c\,\dot{\theta}_1\dot{\theta}_2(\theta_1 - \theta_2) = ck^2(\theta_1 - \theta_2)(-6 p_{\theta_1}^2 + 25 p_{\theta_1} p_{\theta_2} - 24 p_{\theta_2}^2)$: its nonzero entries are $-6ck^2$, $25ck^2$ and $-24ck^2$ at the positions of the monomials $\theta_1 p_{\theta_1}^2$, $\theta_1 p_{\theta_1} p_{\theta_2}$ and $\theta_1 p_{\theta_2}^2$, and $6ck^2$, $-25ck^2$ and $24ck^2$ at the positions of $\theta_2 p_{\theta_1}^2$, $\theta_2 p_{\theta_1} p_{\theta_2}$ and $\theta_2 p_{\theta_2}^2$. The fourth row is $C_4 = -C_3$, and all other entries are zero.

Previously, the original derivation was extended to implement arbitrary nonlinear dynamical systems through weighted basis functions, meant to model nonlinear synapses or dendrites (Alemi et al., 2018; Thalmeier et al., 2016). The trick consists in replacing the function $F$ with a weighted set of basis functions $g_i$, such that $\sum_i a_i g_i(x) \approx F(x)$, so that eq. (18) can be rewritten as

$$\dot{v} = -\lambda v + D^T (A g(\hat{x}) + \lambda \hat{x}) + \Omega_f s. \tag{24}$$

In previous work the weights $a_i$ were found through supervised local learning rules. For brevity and comparison's sake we will instead find the optimal weights through regression (following Eliasmith and Anderson, 2004). We can find the weights $a_i$ by solving the following optimization problem:

$$\min_A \|AG - F(X)\|^2, \tag{25}$$

where $X$ are the sampled inputs, $G$ are the resulting basis function activations, and $F(\cdot)$ is the target function. The solution that minimizes this is

$$A = F(X) G^T (G G^T)^{-1}. \tag{26}$$

In previous work (Alemi et al., 2018; Thalmeier et al., 2016) online learning rules were used to minimize this cost, but as learning rules are not the focus of this paper, we used the closed-form solution above.
As a nonlinearity we used a simple rectification function of the input, $g(x) = [bx + c]_+$, with randomly distributed $b \in [-1, 1]$ and $c \in [-1, 1)$, but many types of nonlinearities would work.

We consider networks of $N$ neurons representing a $K$-dimensional signal space, with decoding matrix $D \in \mathbb{R}^{K \times N}$. The $i$-th column vector of the matrix $D$, denoted by $D_i$, contains the decoding weights associated with neuron $i$. Any given unit in this setting can participate in the coding of an arbitrary number of dimensions, given by the number of nonzero entries of the decoding weights $D_i$. Here and in the following we will say that "neuron $i$ codes for dimension $x$" to mean that $D_{xi} \neq 0$. So far, we have assumed that each neuron codes for every signal dimension, meaning that $D$ is dense, resulting in all-to-all connectivity. We will analyze here the expected number of connections as a function of the sparsity of the decoding matrix $D$ of a network implementing a generic dynamical system $\dot{y} = Ay + B\, y \otimes y$.

$\Omega_f$. For the fast connections, the connectivity matrix is given by $D^T D$. For any pair of neurons $m, n$, a (fast) connection exists if $D_m^T D_n \neq 0$: if these two neurons "share a dimension" (i.e. $D_m, D_n$ have nonzero entries in at least one common position and are not orthogonal), they need a fast connection between them. Denote by $F$ the number of fast connections the system needs to make, and by $0 \le p_{nd} \le 1$ the probability that a neuron $n$ participates in the representation of the $d$-th dimension (i.e. $p_{nd} = P(D_{dn} \neq 0)$). Let us assume that these events are all independent.
Then the probability that any given pair of neurons $n, m$ needs a connection is given by the probability that they both code for at least one common dimension, which is

$$1 - \prod_{d=1}^{K} (1 - p_{nd}\, p_{md})$$

In the case where $p_{nd} = p$ for all $n$ and $d$, we can compute the expected number of fast connections:

$$E(F) = \frac{N(N-1)}{2}\left(1 - (1 - p^2)^K\right)$$

$\Omega_s$. Slow connections have the form $\Omega_s = D^T (A + \lambda I) D = D^T A D + \lambda D^T D$. The second part, $\lambda D^T D$, has exactly the form of the fast-connection case considered above. So the interesting part is $D^T A D$, which adds further connections to allow the network to solve linear dynamical systems. Since $A$ is in general not symmetric, $\Omega_s$ can be non-symmetric too; hence the total possible number of slow connections is $N^2$, which is reached when the decoding matrix $D$ is not sparse. If the matrix $A$ has a nonzero entry at location $(d, e)$, all the neurons that code for dimension $d$ have to connect to all the neurons that code for dimension $e$. The probability that two neurons $n, m$ form such a slow connection is $p_{nd}\, p_{me}$, or simply $p^2$ if the probability is uniform across dimensions and neurons. The expected number of slow connections (due to that single nonzero entry) is

$$\sum_{n=1}^{N} \sum_{m=1}^{N} p_{nd}\, p_{me} = (Np)^2$$

where the last equality holds only in the case of uniform probability. In that case we also have

$$E(S) \le E(F) + N_A (Np)^2$$

where $S$ is the number of slow connections and $N_A$ is the number of nonzero entries in $A$.

$\Omega_{nl}$. The quadratic connections take the form $\Omega_{nl} = D^T B (D \otimes D)$. If the decoding matrix is not sparse, the number of quadratic connections is $\propto N^3$. In fact, the maximum possible number is $N^2(N-1)/2$, corresponding to each neuron ($N$) being connected to each possible pair ($N(N-1)/2$). On the other hand, if $D$ is sufficiently sparse, we can reason as follows. Denote by $G_d$ the group of neurons such that $D_{di} \neq 0$.
Let us assume that our dynamical system requires dimension $d$ to depend nonlinearly on dimensions $e$ and $f$, i.e. $\dot{x}_d \propto x_e x_f$, or equivalently $B_{d, eK+f} \neq 0$. Then each neuron in $G_d$ needs to keep track of coincident firing of any neuron in $G_e$ with any neuron in $G_f$. The probability that a neuron in $G_d$ needs to take care of coincident spiking of the pair of neurons $m, n$ is $1 - (1 - p_{ne}\, p_{mf})(1 - p_{me}\, p_{nf})$, corresponding to the probability that at least one of the two neurons codes for dimension $e$ and the other for dimension $f$. In the case of uniform $p$ this reduces to $2p^2 - p^4$, so each neuron in $G_d$ needs an average of $\frac{N(N-1)}{2}(2p^2 - p^4)$ coincidence detectors, leading to an upper bound for the expected total number of multiplicative synapses

$$E(Q) \le N_B\, n_d\, \frac{N(N-1)}{2}(2p^2 - p^4) \approx N_B (Np)^3$$

where $n_d = |G_d| \approx Np$ and $N_B$ is the number of nonzero entries in $B$. The equality sign holds only in the case $N_B \le 1$.

To simulate the connectivity, we fixed a decoder density $p$ and randomly filled the decoding matrix using a Bernoulli distribution $\mathcal{B}(p)$ in each entry, repeated 1000 times. For the fast connections we varied the size of the output signal, i.e. the size of the decoding matrix. For slow and multiplicative synapses, the dimensionality of the signal $K$ did not affect the density of the resulting connections (not shown). What influenced the number of slow and multiplicative synapses was the number of nonzero entries in the matrices $A$ and $B$, respectively.

Author Contributions
MN contributed to discussions, initial and follow-up implementations, mathematical derivations, and writing of the paper; JWP contributed to discussions and initial implementations; WFP contributed to discussions, supervision, and writing of the paper; SWK conceived the initial idea and contributed to supervision, discussions, initial implementation, and writing of the paper.
Acknowledgements
We thank Christian Machens for useful discussions on the project. This report came out of a collaboration started at the CAJAL Advanced Neuroscience Training Programme in Computational Neuroscience in Lisbon, Portugal, during the summer of 2019. The authors would like to thank the participants, TAs, lecturers, and organizers of the summer school. SWK was supported by the Simons Collaboration on the Global Brain (543009). WFP was supported by FCT (032077). MN was supported by European Union Horizon 2020 (665385).
References
Alemi, A., Machens, C. K., Deneve, S., and Slotine, J.-J. (2018). Learning nonlinear dynamics in efficient, balanced spiking networks using local plasticity rules. In Thirty-Second AAAI Conference on Artificial Intelligence.

Barrett, D. G., Deneve, S., and Machens, C. K. (2016). Optimal compensation for neuron loss. eLife, 5:e12454.

Boerlin, M., Machens, C. K., and Denève, S. (2013). Predictive coding of dynamical variables in balanced spiking networks. PLoS Computational Biology, 9(11):e1003258.

Calaim, N., Dehmelt, F. A., Gonçalves, P. J., and Machens, C. K. (2020). Robust coding with spiking networks: a geometric perspective. bioRxiv.

de Branges, L. (1959). The Stone-Weierstrass theorem. Proceedings of the American Mathematical Society, 10(5):822–824.

Denève, S. and Machens, C. K. (2016). Efficient codes and balanced networks. Nature Neuroscience, 19(3):375–382.

Eliasmith, C. (2005). A unified approach to building and controlling spiking attractor networks. Neural Computation, 17(6):1276–1314.

Eliasmith, C. and Anderson, C. H. (2004). Neural Engineering: Computation, Representation, and Dynamics in Neurobiological Systems. MIT Press.

Gabbiani, F., Krapp, H. G., Koch, C., and Laurent, G. (2002). Multiplicative computation in a visual neuron sensitive to looming. Nature, 420(6913):320–324.

Keemink, S. W. and Machens, C. K. (2019). Decoding and encoding (de)mixed population responses. Current Opinion in Neurobiology, 58:112–121.

Koch, C. and Poggio, T. (1992). Multiplying with synapses and neurons. In Single Neuron Computation, pages 315–345. Elsevier.

Lefort, S., Tomm, C., Sarria, J.-C. F., and Petersen, C. C. (2009). The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex. Neuron, 61(2):301–316.

Levien, R. and Tan, S. (1993). Double pendulum: An experiment in chaos. American Journal of Physics, 61(11):1038–1044.

Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20(2):130–141.

Nezis, P. and van Rossum, M. C. W. (2011). Accurate multiplication with noisy spiking neurons. Journal of Neural Engineering, 8(3):034005.

Peña, J. L. and Konishi, M. (2001). Auditory spatial receptive fields created by multiplication. Science, 292(5515):249–252.

Salinas, E. and Abbott, L. F. (1996). A model of multiplicative neural responses in parietal cortex. Proceedings of the National Academy of Sciences, 93(21):11956–11961.

Song, S., Sjöström, P. J., Reigl, M., Nelson, S., and Chklovskii, D. B. (2005). Highly nonrandom features of synaptic connectivity in local cortical circuits. PLoS Biology, 3(3):e68.

Strogatz, S. H. (2018). Nonlinear Dynamics and Chaos: With Applications to Physics, Biology, Chemistry, and Engineering. CRC Press.

Thalmeier, D., Uhlmann, M., Kappen, H. J., and Memmesheimer, R.-M. (2016). Learning universal computations with spikes. PLoS Computational Biology, 12(6):e1004895.

Zhou, W., Xu, Y., Simpson, I., and Cai, Y. (2007). Multiplicative computation in the vestibulo-ocular reflex (VOR). Journal of Neurophysiology, 97(4):2780–2789.

Supplementary Figures
Supp. Fig. 1:
Robustness of SCN.
Implementation of a Lorenz dynamical system in a network with multiplicative synapses. For the first 9 seconds, 10 neurons were artificially killed every second. The peak analysis was executed on a 200-second simulation with either 100 or 10 cells.
Supp. Fig. 2:
Representation of the Kronecker product of the input. The input $\hat{x}$ was given by a network which computed a Lorenz system. The second network, using eq. (21), outputs a signal $\approx \hat{x} \otimes \hat{x}$. Blue lines represent the network output, black dotted lines the true $\hat{x} \otimes \hat{x}$.