Local homeostatic regulation of the spectral radius of echo-state networks
Fabian Schubert∗ and Claudius Gros
Institute for Theoretical Physics, Goethe University Frankfurt am Main, Germany
∗Correspondence: Institute for Theoretical Physics, Goethe University Frankfurt am Main, Max-von-Laue-Str. 1, 60438 Frankfurt am Main, Germany, [email protected]
ABSTRACT
Recurrent cortical networks provide reservoirs of states that are thought to play a crucial role for sequential information processing in the brain. However, classical reservoir computing requires manual adjustments of global network parameters, particularly of the spectral radius of the recurrent synaptic weight matrix. It is hence not clear if the spectral radius is accessible to biological neural networks.

Using random matrix theory, we show that the spectral radius is related to local properties of the neuronal dynamics whenever the overall dynamical state is only weakly correlated. This result allows us to introduce two local homeostatic synaptic scaling mechanisms, termed flow control and variance control, that implicitly drive the spectral radius towards the desired value. For both mechanisms the spectral radius is autonomously adapted while the network receives and processes inputs under working conditions.

We demonstrate the effectiveness of the two adaptation mechanisms under different external input protocols. Moreover, we evaluated the network performance after adaptation by training the network to perform a time-delayed XOR operation on binary sequences. As our main result, we found that flow control reliably regulates the spectral radius for different types of input statistics. Precise tuning is however negatively affected when interneural correlations are substantial. Furthermore, we found a consistent task performance over a wide range of input strengths/variances. Variance control did however not yield the desired spectral radii with the same precision, being less consistent across different input strengths.

Given the effectiveness and remarkably simple mathematical form of flow control, we conclude that self-consistent local control of the spectral radius via an implicit adaptation scheme is an interesting and biologically plausible alternative to conventional methods using set-point homeostatic feedback controls of neural firing.
Keywords: recurrent networks, homeostasis, synaptic scaling, echo-state networks, reservoir computing, spectral radius
Cortical networks are highly recurrent, a property that is considered to be crucial for processing and storing temporal information. For recurrent networks to remain stable and functioning, the neuronal firing activity has to be kept within a certain range by autonomously active homeostatic mechanisms. It is hence important to study homeostatic mechanisms on the level of single neurons, as well as the more theoretic question of characterizing the dynamic state that is to be attained on a global network level. It is common to roughly divide adaptation mechanisms into intrinsic homeostasis, synaptic homeostasis, and metaplasticity.

Synaptic scaling was identified as a mechanism that can postsynaptically regulate neural firing by adjusting synaptic efficacies in a proportional, multiplicative way. This finding has led to numerous studies investigating the role of synaptic scaling in controlling neural network activity (Turrigiano et al., 1998; Turrigiano and Nelson, 2000; Turrigiano, 2008) and in stabilizing other plasticity mechanisms (van Rossum et al., 2000; Stellwagen and Malenka, 2006; Tetzlaff, 2011; Toyoizumi et al., 2014). Indeed, synaptic scaling has proven successful in stabilizing activity in recurrent neural networks (Lazar et al., 2009; Remme and Wadman, 2012; Zenke et al., 2013; Effenberger and Jost, 2015; Miner and Triesch, 2016). However, these studies either used synaptic scaling as the sole homeostatic mechanism (Zenke et al., 2013; Remme and Wadman, 2012) or resorted to a variant of synaptic scaling where the scaling is not dynamically determined through a control loop using a particular target activity, but rather by a fixed multiplicative normalization rule (Lazar et al., 2009; Effenberger and Jost, 2015; Miner and Triesch, 2016).
Therefore, these homeostatic models cannot account for higher moments of temporal activity patterns, i.e., their variance, as this would require at least the tuning of two parameters (Cannon and Miller, 2017).

Within more abstract models of rate-encoding neurons, intrinsic homeostasis and synaptic scaling essentially correspond to adjusting a bias and a gain factor on the input entering a nonlinear transfer function. Within this framework, multiple dual-homeostatic adaptation rules have been investigated concerning their effect on network performance. The adaptation of the bias acts as an intrinsic plasticity mechanism for the control of the internal excitability of a neuron (Franklin et al., 1992; Abbott and LeMasson, 1993; Borde et al., 1995), while the gain factors functionally correspond to a synaptic scaling of the recurrent weights. Learning rules for these types of models were usually derived by defining a target output distribution that each neuron attempts to reproduce by changing neural gains and biases (Triesch, 2007; Steil, 2007; Schrauwen et al., 2008; Boedecker et al., 2009), or were directly derived from an information-theoretic measure (Bell and Sejnowski, 1995).

While these studies did indeed show performance improvements by optimizing local information-transmission measures, optimal performance can apparently be traced back to a global parameter, the spectral radius of the recurrent weight matrix (Schrauwen et al., 2008). Interestingly, to our knowledge, theoretical studies on spiking neural networks did not explicitly consider the spectral radius as a parameter affecting network dynamics. Still, the theory of balanced states in spiking recurrent networks established the idea that synaptic strengths should scale with 1/√k, where k is the average number of afferent connections (Van Vreeswijk and Sompolinsky, 1998).
According to the circular law of random matrix theory, this scaling rule simply implies that the spectral radius of the recurrent weight matrix remains finite as the number of neurons N increases. More recent experiments on cortical cultures confirm this scaling (Barral and D'Reyes, 2016).

In the present study, we investigated whether the spectral radius of the weight matrix in a random recurrent network can be regulated by a combination of intrinsic homeostasis and synaptic scaling. Following the standard echo-state framework, we used rate-encoding tanh-neurons as the model of choice. Aside from their applications as efficient machine-learning algorithms, echo-state networks are potentially relevant as models of information processing in the brain (Nikolić et al., 2009; Hinaut et al., 2015; Enel et al., 2016). Note in this context that extensions to layered ESN architectures have been presented by Gallicchio and Micheli (2017), which bear a somewhat greater resemblance to the hierarchical structure of cortical networks than the usual shallow ESN architecture. This line of research illustrates the importance of examining whether local and biologically plausible principles exist that would allow tuning the properties of the neural reservoir to the "edge of chaos" (Livi et al., 2018), particularly when a continuous stream of inputs is present. The rule has to be independent of both the network topology, which is not locally accessible information, and the distribution of synaptic weights.

We propose and compare two unsupervised homeostatic mechanisms, which we denote flow control and variance control. Both regulate the mean and variance of neuronal firing such that the network works in an optimal regime with respect to sequence learning tasks.
The mechanisms act on two sets of node-specific parameters, the biases b_i and the neural gain factors a_i. We restricted ourselves to biologically plausible adaptation mechanisms, viz adaptation rules for which the dynamics of all variables are local, i.e., bound to a specific neuron. Additional variables enter only when locally accessible. In a strict sense, this implies that local dynamics are determined exclusively by the neuron's dynamical variables and by information about the activity of afferent neurons. Being less restrictive, one could claim that it should also be possible to access aggregate or mean-field quantities that average a variable of interest over the population. For example, nitric oxide is a diffusive neurotransmitter that can act as a measure for the population average of neural firing rates (Sweeney et al., 2015).

Following a general description of the network model, we introduce both adaptation rules and evaluate their effectiveness in tuning the spectral radius in Sections 2.4 and 2.5. We assess the performance of networks that were subject to adaptation in Section 2.7, using a nonlinear sequential memory task. Finally, we discuss the influence of node-to-node cross-correlations within the population in Section 2.8.

A full description of the network model and parameters can be found in the methods section. We briefly introduce the network dynamics as

    x_i(t) = x_{r,i}(t) + I_i(t),    (1)
    x_{r,i}(t) := a_i Σ_{j=1}^N W_ij y_j(t−1),    (2)
    y_i(t) = tanh(x_i(t) − b_i).    (3)

Each neuron's membrane potential x_i consists of a recurrent contribution x_{r,i}(t) and an external input I_i(t). The biases b_i are subject to the following homeostatic adaptation:

    b_i(t) = b_i(t−1) + ε_b [y_i(t) − μ_t].    (4)

Here, μ_t defines a target for the average activity and ε_b is the adaptation rate. The local parameters a_i act as scaling factors on the recurrent weights. We considered two different forms of update rules. Loosely speaking, both drive the network towards a certain dynamical state which corresponds to the desired spectral radius. The difference between them lies in the variables characterizing
this state: while flow control defines a relation between the variance of neural activity and the variance of the total recurrent synaptic current, variance control does so by a more complex relation between the variance of neural activity and the variance of the synaptic current from the external input.
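For concreteness, the discrete-time dynamics of Eqs. (1)-(3) together with the bias rule (4) can be sketched in a few lines of Python. All parameter values below are illustrative stand-ins, not the values used in the simulations:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                    # network size (small, for illustration only)
W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # bare recurrent weights
a = np.ones(N)             # synaptic scaling factors a_i
b = np.zeros(N)            # biases b_i
y = rng.uniform(-0.1, 0.1, N)
mu_t = 0.05                # target mean activity
eps_b = 1e-3               # bias adaptation rate

def step(y, b, I):
    """One update of Eqs. (1)-(3), plus the bias rule, Eq. (4)."""
    x_r = a * (W @ y)          # recurrent contribution, Eq. (2)
    x = x_r + I                # membrane potential, Eq. (1)
    y_new = np.tanh(x - b)     # neural activity, Eq. (3)
    b_new = b + eps_b * (y_new - mu_t)   # homeostatic bias update, Eq. (4)
    return y_new, b_new, x_r

for t in range(1000):
    I = rng.normal(0.0, 0.5, N)   # external input current
    y, b, _ = step(y, b, I)
```

Note that the bias update uses only quantities local to each neuron, in line with the locality requirement stated above.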
The first adaptation rule, flow control, is given by

    a_i(t) = a_i(t−1) [1 + ε_a ΔR_i(t)],    ΔR_i(t) = R_t² y_i²(t−1) − x_{r,i}²(t).    (5)

The parameter R_t is the desired target spectral radius and ε_a the adaptation rate of the scaling factor. The dynamical variables y_i and x_{r,i} have been defined before in Eqs. (1) and (2). We also considered an alternative global update rule where ΔR_i(t) is given by

    ΔR_i(t) = (1/N) [R_t² ||y(t−1)||² − ||x_r(t)||²],    (6)

where ||·|| denotes the Euclidean vector norm. However, since this is a non-local rule, it only served as a comparative model to Eq. (5) when we investigated the effectiveness of the adaptation mechanism. Three key assumptions enter flow control, Eq. (5):

• Represented by x_{r,i}(t), we assume that there is a physical separation between the recurrent input that a neuron receives and its external inputs. This is necessary because x_{r,i}(t) is explicitly used in the update rule of the synaptic scaling factors.

• Synaptic scaling only affects the weights of recurrent connections. However, this assumption is not crucial for the effectiveness of our plasticity rule, as we were mostly concerned with achieving a preset spectral radius for the recurrent weight matrix. If instead the scaling factors acted on both the recurrent and external inputs, this would lead to an "effective" external input I′_i(t) := a_i I_i(t). However, a_i only affecting the recurrent input facilitated the parameterization of the external input by means of its variance, see Section 2.7, a choice of convenience.

• For (5) to function, neurons need to be able to represent and store squared neural activities.

Whether these three preconditions are satisfied by biological neurons needs to be addressed in future studies.
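A minimal sketch of flow control, Eq. (5), for uncorrelated Gaussian input. The parameter values and simulation length are illustrative choices of ours, not reference values from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))  # bare weights, radius ~ 1
a = np.full(N, 0.5)        # scaling factors, deliberately off target
R_t = 1.0                  # target spectral radius
eps_a = 1e-3               # adaptation rate (illustrative)

y = rng.uniform(-0.5, 0.5, N)
for t in range(15000):
    x_r = a * (W @ y)                 # recurrent input, Eq. (2)
    dR = R_t**2 * y**2 - x_r**2       # Delta R_i of Eq. (5); y is y(t-1) here
    a *= 1.0 + eps_a * dR             # multiplicative local update
    I = rng.normal(0.0, 0.5, N)       # uncorrelated Gaussian driving
    y = np.tanh(x_r + I)

# measure the resulting spectral radius of the effective matrix a_i * W_ij
R_a = np.max(np.abs(np.linalg.eigvals(a[:, None] * W)))
```

For this uncorrelated input protocol, R_a should end up close to the target R_t, up to finite-size fluctuations.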
The second adaptation rule, variance control, has the form

    a_i(t) = a_i(t−1) + ε_a [σ_{t,i}²(t) − (y_i(t) − μ_{y_i}(t))²],    (7)
    σ_{t,i}²(t) = 1 − 1/√(1 + 2R_t² ȳ_i²(t) + 2σ̄_{ext,i}²(t)).    (8)

Eq. (7) drives the average variance of each neuron towards a desired target variance σ_{t,i}²(t) at an adaptation rate ε_a by calculating the momentary squared difference between the local activity y_i(t) and its trailing average μ_{y_i}(t). Eq. (8) calculates the target variance as a function of the target spectral radius R_t, the current local squared activity ȳ_i²(t) and a trailing average σ̄_{ext,i}²(t) of the local variance of the external input signal. When all a_i(t) reach a steady state, the average neural variance equals the target given by (8). According to a mean-field approach that is described in Section 5.6, reaching this state then results in a spectral radius R_a that is equal to the target R_t entering (8). Intuitively, it is to be expected that σ_{t,i}² is a function of both the spectral radius and the external driving variance: the amount of fluctuations in the network activity is determined by the dynamic interplay between the strength of the external input and the recurrent coupling.

A full description of the auxiliary equations and variables used to calculate μ_{y_i}(t) and σ̄_{ext,i}²(t) can be found in Section 5.1.

Similar to flow control, we also considered a non-local version for comparative reasons, where (8) is replaced with

    σ_{t,i}²(t) = 1 − 1/√(1 + 2R_t² ||y(t)||²/N + 2σ̄_{ext,i}²(t)).    (9)

Again, ||·|| denotes the Euclidean norm. Before proceeding to the results, we discuss the mathematical background of the proposed adaptation rules in some detail.
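For comparison, a sketch of variance control, Eqs. (7)-(8). The trailing-average scheme and all rate parameters below are simplified stand-ins for the auxiliary definitions of Section 5.1:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 200
W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
a = np.full(N, 0.5)             # scaling factors
R_t, eps_a, eps_mu = 1.0, 1e-3, 1e-2   # illustrative rates
sigma_ext = 0.5

y = np.zeros(N)
mu_y = np.zeros(N)                       # trailing average of y_i
sig_ext2 = np.full(N, sigma_ext**2)      # (here: fixed) input-variance estimate

for t in range(10000):
    I = rng.normal(0.0, sigma_ext, N)
    y = np.tanh(a * (W @ y) + I)
    mu_y += eps_mu * (y - mu_y)          # trailing mean of the activity
    # target variance, Eq. (8): mean-field expression for tanh units
    sig_t2 = 1.0 - 1.0 / np.sqrt(1.0 + 2.0 * (R_t**2 * y**2 + sig_ext2))
    a += eps_a * (sig_t2 - (y - mu_y)**2)   # variance-matching update, Eq. (7)
```

The update compares the momentary squared deviation of the activity from its trailing mean against the mean-field target, rather than comparing recurrent input and activity directly as flow control does.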
There are some interesting aspects to the theoretical framework at the basis of the here proposed regulatory scaling mechanisms.

The circular law of random matrix theory states that the eigenvalues λ_j are distributed uniformly on the complex unit disc if the elements of a real N × N matrix are drawn from distributions having zero mean and standard deviation 1/√N (Tao and Vu, 2008). Given that the internal weight matrix Ŵ (ˆ· denoting matrices) with entries W_ij has p_r N non-zero elements per row (p_r is the connection probability), the circular law implies that the spectral radius of a_i W_ij, the maximum of |λ_j|, is unity when the synaptic scaling factors a_i are set uniformly to 1/σ_w, where σ_w is the standard deviation of the sum of afferent weights, Σ_j W_ij. Our goal is to investigate adaptation rules for the synaptic scaling factors that are based on dynamic quantities, which includes the membrane potential x_i, the neural activity y_i and the input I_i.

The circular law, i.e. that an N × N matrix with i.i.d. entries of zero mean and variance 1/N approaches a spectral radius of one as N → ∞, can be generalized. Rajan and Abbott (2006) investigated the case where the statistics of the columns of the matrix differ in their means and variances: given row-wise E-I balance for the recurrent weights, the square of the spectral radius of a random N × N matrix whose columns have variances σ_i² is N⟨σ_i²⟩_i for N → ∞. Since the eigenvalues are invariant under transposition, this result also holds for row-wise variations of the variances and column-wise E-I balance. While the latter is not explicitly enforced in our case, deviations from this balance are expected to tend to zero for large N, given the statistical assumptions that we made about the matrix elements W_ij. Therefore, the result can be applied to our model, where node-specific gain factors a_i are applied to each row of the recurrent weight matrix.
Thus, the spectral radius R_a of the effective random matrix Ŵ_a with entries a_i W_ij (as entering (2)) is

    R_a² ≃ (1/N) Σ_i R_{a,i}²,    R_{a,i}² := a_i² Σ_j (W_ij)²,    (10)

for large N, when assuming that the distribution underlying the bare weight matrix Ŵ with entries W_ij has zero mean. Note that R_a² can be expressed alternatively in terms of the Frobenius norm ||Ŵ_a||_F, via

    R_a² ≃ ||Ŵ_a||_F² / N.    (11)
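The Frobenius-norm estimate (11) is easy to check numerically against the exact spectrum; a sketch, with sparsity and gain distributions chosen by us for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500
p_r = 0.1   # illustrative connection probability
# sparse bare weights: nonzero entries ~ N(0, 1/(p_r * N))
mask = rng.random((N, N)) < p_r
W = mask * rng.normal(0.0, 1.0 / np.sqrt(p_r * N), (N, N))
a = rng.uniform(0.0, 1.0, N)          # heterogeneous gain factors

W_a = a[:, None] * W                  # effective matrix, entries a_i * W_ij
R_est = np.linalg.norm(W_a, 'fro') / np.sqrt(N)        # Eq. (11)
R_true = np.max(np.abs(np.linalg.eigvals(W_a)))        # exact spectral radius
rel_err = abs(R_est - R_true) / R_true                 # small for large N
```

At N = 500 the relative error between the two estimates is typically on the order of a few percent.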
We numerically tested Eq. (10) for N = 500 and heterogeneous random sets of a_i drawn from a uniform distribution, and found a very close match to the actual spectral radii, with only a small relative error. Given that the R_{a,i} can be interpreted as per-site estimates for the spectral radius, one can use the generalized circular law (10) to regulate R_a on the basis of local adaptation rules, one for every a_i.

For the case of flow control, the adaptation rule is derived from a comparison between the variance of the neural activity present in the network and the recurrent contribution to the membrane potential. A detailed explanation is presented in Section 2.6 and Section 5.5. In short, we propose that

    ⟨||x_r(t)||²⟩_t ≃ R_a² ⟨||y(t−1)||²⟩_t,    (12)

where x_{r,i} is the recurrent contribution to the membrane potential x_i. This stationarity condition leads to the adaptation rule given in Eq. (5). An analysis of the dynamics of this adaptation mechanism can be found in Section 5.5. Instead of directly imposing Eq. (12) via an appropriate adaptation mechanism, we also considered the possibility of transferring this condition into a set point for the variance of neural activities as a function of the external driving. To do so, we used a mean-field approach to describe the effect of recurrent input onto the resulting neural activity variance. An in-depth discussion is given in Section 5.6. This led to the update rules given by Eqs. (7) and (8) for variance control.

We used several types of input protocols for testing the here proposed adaptation mechanisms, as well as for assessing the task performance discussed in Section 2.7. The first two variants concern distinct biological scenarios:

• Binary.
Binary input sequences correspond to a situation where a neural ensemble receives input dominantly from a singular source, which itself has only two states, being either active or inactive. Using binary input sequences during learning is furthermore consistent with the non-linear performance test considered here for the echo-state network as a whole, the delayed XOR-task, see Section 2.7. For symmetric binary inputs, as used, the source signal u(t) is drawn from ±1.

• Gaussian.
Alternatively, one can consider the situation that a large number of essentially uncorrelated input streams are integrated. This implies random Gaussian input signals; neurons receive in this case zero-mean independent Gaussian noise.

Another categorical dimension concerns the distribution of the afferent synaptic weights: do all neurons receive inputs with the same strength, or not? As a quantifier for the individual external input strengths, the variances σ_{ext,i}² of the local external input currents were taken into account. We distinguished two cases:

• Heterogeneous.
In the first case, the σ_{ext,i} are quenched random variables. This means that each neuron is assigned a random value σ_{ext,i} before the start of the simulation, drawn from a half-normal distribution parameterized by σ = σ_ext. This ensures that the expected average variance ⟨σ_{ext,i}²⟩ is given by σ_ext².

• Homogeneous.
In the second case, all σ_{ext,i} are assigned the identical global value σ_ext.

Overall, pairing "binary" vs. "Gaussian" and "heterogeneous" vs. "homogeneous" leads to a total of four different input protocols, i.e. "heterogeneous binary", "homogeneous binary", "heterogeneous Gaussian" and "homogeneous Gaussian".
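The four protocols can be generated as follows. This is a sketch; the function and variable names are ours, and the values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 500, 1000          # network size and number of time steps
sigma_ext = 0.5           # illustrative global input strength

def make_input(protocol):
    """Return a (T, N) array of external inputs for one of the four protocols."""
    if "heterogeneous" in protocol:
        # quenched per-neuron strengths, half-normal with <sigma_i^2> = sigma_ext^2
        sig_i = np.abs(rng.normal(0.0, sigma_ext, N))
    else:
        sig_i = np.full(N, sigma_ext)
    if "binary" in protocol:
        u = rng.choice([-1.0, 1.0], T)           # one shared +-1 source signal
        return u[:, None] * sig_i[None, :]       # synchronous across all sites
    # independent zero-mean Gaussian noise per neuron
    return rng.normal(0.0, 1.0, (T, N)) * sig_i[None, :]

I_hb = make_input("heterogeneous binary")
I_hg = make_input("homogeneous gaussian")
```

Note that the binary protocols drive all neurons with the same source u(t) and therefore induce cross-correlations, which is exactly the property examined in Section 2.8.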
Figure 1. Online spectral radius regulation using flow control.
The spectral radius R_a and the respective local estimates R_{a,i}, as defined by (10). For the input protocol see Section 5.3. A: Dynamics of R_{a,i} and R_a in the presence of heterogeneous independent Gaussian inputs. Local adaptation. B: Distribution of eigenvalues of the corresponding effective synaptic matrix Ŵ_a, after adaptation. The circle denotes the spectral radius.

If not otherwise stated, numerical simulations were done using networks with N = 500 sites; the connection probability p_r and all remaining parameters are specified in the methods section.

In Figure 1, we present a simulation using flow control for heterogeneous Gaussian input, with the adaptation rate ε_a and the standard deviation σ_ext of the external driving as specified in the methods section. The spectral radius R_a of Ŵ_a was tuned to the target R_t = 1 with high precision, even though the local, row-wise estimates R_{a,i} showed substantial deviations from the target. We further tested the adaptation with other input protocols, see Section 2.3 and Figure 11. We found that flow control robustly led to the desired spectral radius R_t under all Gaussian input protocols, while binary input caused R_a to converge to higher values than R_t. However, when using global adaptation, as given by Eq. (6), all input protocols resulted in the correctly tuned spectral radius, see Figure 12.

Numerically, we found that the time needed to converge to the stationary states depended substantially on R_t, slowing down when the spectral radius becomes small. It is then advantageous, as we have done, to scale the adaptation rate ε_a inversely with the trailing average x̄_r² of ||x_r||², viz as ε_a → ε_a/x̄_r². An exemplary plot showing the effect of this scaling is shown in Fig.
7, see Section 5.2 for further details.

To evaluate the amount of deviation from the target spectral radius under different input strengths and protocols, we plotted the difference between the resulting spectral radius and the target spectral radius for a range of external input strengths, quantified by their standard deviation σ_ext. Results for different input protocols are shown in Figure 15 in the supplementary material. For correlated binary input, increasing the
Figure 2. Online spectral radius regulation using variance control.
The spectral radius R_a and the respective local estimates R_{a,i}, as defined by (10). For the input protocol see Section 5.3. A: Dynamics of R_{a,i} and R_a in the presence of heterogeneous independent Gaussian inputs. Local adaptation. B: Distribution of eigenvalues of the corresponding effective synaptic matrix Ŵ_a. The circles denote the respective spectral radius.

input strength resulted in stronger deviations from the target spectral radius. On the other hand, uncorrelated Gaussian inputs resulted in perfect alignment for the entire range of input strengths that we tested.

In comparison, variance control, shown in Figure 2 and Figure 13, resulted in notable deviations from R_t, for both uncorrelated Gaussian and correlated binary input. As for flow control, we also calculated the deviations from R_t as a function of σ_ext, see Figure 16. For heterogeneous binary input, deviations from the target spectral radius did not increase monotonically as a function of the input strength (Figure 16A), but reached a peak at intermediate input strengths for larger target spectral radii. For homogeneous binary input, we observed a substantial negative mismatch of the spectral radius for strong external inputs, see Figure 16C. Overall, we found that variance control did not exhibit the same level of consistency in tuning the system towards a desired spectral radius, even though it did perform better in some particular cases (compare Figure 15A for large σ_ext with Figure 16). Moreover, variance control exhibited deviations from the target (shown in Figure 14) even when a global adaptation rule was used, as defined in (9). This is in contrast to the global variant of flow control, which, as stated in the previous section, robustly tuned the spectral radius to the desired value even in the presence of strongly correlated inputs.
Apart from the spectral radius R_a of the matrix Ŵ_a, one may consider the relation between the adaptation dynamics and the respective singular values σ_i of Ŵ_a. We recall that the spectrum of Û_a = Ŵ_a†Ŵ_a is given by the squared singular values σ_i², and that the relation ||x_r||² = y†Ŵ_a†Ŵ_a y holds. Now, assume that the time-averaged projection of the neural activity y = y(t) onto all eigenvectors of Û_a is approximately the same, that is, there is no preferred direction of neural activity in phase space. For this idealized case, it follows that the time average of the recurrent contribution to the membrane potential can be expressed as

    ⟨||x_r||²⟩_t ≈ ⟨||y||²⟩_t (1/N) Σ_i σ_i² = ⟨||y||²⟩_t (1/N) Σ_{i,j} (a_i W_ij)²,    (13)

the rescaled average of the σ_i². For the second equation, we used the fact that Σ_i σ_i² equals the sum of all matrix elements squared (Sengupta and Mitra, 1999; Shen, 2001). With (10), one finds that (13) is equivalent to ⟨||x_r||²⟩_t = R_a² ⟨||y||²⟩_t and hence to the flow condition (12). This result can be generalized, as done in Section 5.5, to the case that the neural activities have inhomogeneous variances, while still being uncorrelated with zero mean. We have thus shown that the stationarity condition leads to a spectral radius of (approximately) unity.

It is worthwhile to note that the singular values of Ŵ_a do exceed unity when R_a = 1. More precisely, for a random matrix with i.i.d. entries, one finds in the limit of large N that the largest singular value is given by σ_max = 2R_a, in accordance with the Marchenko-Pastur law for random matrices (Marčenko and Pastur, 1967).
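The relation σ_max = 2R_a is straightforward to verify numerically for an i.i.d. Gaussian matrix with unit spectral radius:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 1000
# i.i.d. entries with variance 1/N: spectral radius close to 1 for large N
W = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))

s_max = np.linalg.svd(W, compute_uv=False)[0]   # largest singular value
rho = np.max(np.abs(np.linalg.eigvals(W)))      # spectral radius
# Marchenko-Pastur prediction: s_max ~ 2 * rho for large N
```

At N = 1000 the largest singular value lies close to 2 while the spectral radius lies close to 1, illustrating that individual phase-space directions can be stretched even at the critical point.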
Consequently, directions in phase space exist in which the norm of the phase-space vector is elongated by factors greater than one. Still, this does not contradict the fact that a unit spectral radius coincides with the transition to chaos for the non-driven case. The reason is that the global Lyapunov exponents are given by

    lim_{n→∞} (1/(2n)) ln λ_i[(Ŵ_a^n)† Ŵ_a^n],    (14)

which eventually converge to ln|λ_i|, see Figure 17 in the supplementary material and Wernecke et al. (2019), where λ_i is the i-th eigenvalue of Ŵ_a. The largest singular value of the n-th power of a random matrix with a spectral radius R_a scales like R_a^n in the limit of large powers n. The global Lyapunov exponent goes to zero as a consequence when R_a → 1.

To this point, we presented results regarding the effectiveness of the introduced adaptation rules. However, we did not account for their effects on a given learning task. Therefore, we tested the performance of locally adapted networks under the delayed XOR task, which evaluates the memory capacity of the echo-state network in combination with a non-linear operation. For the task, the XOR operation is to be taken with respect to a delayed pair of two consecutive binary input signals, u(t−τ) and u(t−τ−1), where τ is a fixed time delay. The readout layer is given by a single unit, which has the task of reproducing

    f_τ(t) = XOR[u(t−τ), u(t−τ−1)],    τ = 1, 2, 3, ...,    (15)

where XOR[u, u′] is 0/1 if u and u′ are identical/not identical. The readout vector w_out is trained with respect to the mean squared output error,

    ||Ŷ w_out − f_τ||² + α ||w_out||²,    (16)

using ridge regression on a sample batch of T_batch = 10N time steps, here for N = 500, and a small regularization factor α.
The batch matrix Ŷ, of size T_batch × (N+1), holds the neural activities as well as one node with constant activity, serving as a potential bias. Similarly, the readout (column) vector w_out is of
size (N+1). The T_batch entries of f_τ are the f_τ(t), viz the target values of the XOR problem. Minimizing (16) leads to

    w_out = (Ŷ†Ŷ + α 1̂)⁻¹ Ŷ† f_τ.    (17)

The learning procedure was repeated independently for each time delay τ. We quantified the performance by the total memory capacity, MC_XOR, as

    MC_XOR = Σ_{k=1}^∞ MC_XOR,k,    (18)
    MC_XOR,k = Cov²[f_k(t), y_out(t)]_t / (Var[f_k(t)]_t Var[y_out(t)]_t).    (19)

This is a simple extension of the usual definition of short-term memory in the echo-state literature (Jaeger, 2002). The activity y_out = Σ_{i=1}^{N+1} w_out,i y_i of the readout unit is compared in (19) with the XOR prediction task, with the additional neuron, y_{N+1} = 1, corresponding to the bias of the readout unit. Depending on the mean level of the target signal, this offset might actually be unnecessary. However, since it is standard practice to use an intercept variable in linear regression models, we decided to include it in the readout variable y_out. The variance and covariance are calculated with respect to the batch size T_batch.

The results for flow control presented in Figure 3 correspond to two input protocols, heterogeneous Gaussian and binary inputs. Shown are sweeps over a range of σ_ext and R_t. The update rule (5) was applied to the network for each pair of parameters until the a_i values converged to a stable configuration. We then measured the task performance as described above. Note that in the case of Gaussian input, this protocol was only used during the adaptation phase. Due to the nature of the XOR task, binary inputs with the corresponding variances are to be used during performance testing. See Figure 18 in the supplementary material for a performance sweep using the homogeneous binary and Gaussian input protocols. Optimal performance was generally attained around the R_a ≈ 1 line.
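The readout training of Eqs. (15)-(19) can be sketched as follows. For a self-contained example, the reservoir activities are replaced by a memory-less stand-in, so only the mechanics of the ridge solution (17) and of a single capacity term (19) are demonstrated; the value of α is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(6)
T, N, alpha, tau = 5000, 500, 0.01, 1   # alpha and tau are illustrative

u = rng.choice([0.0, 1.0], T)           # binary input sequence
# stand-in reservoir activities; in the full model these follow Eq. (3)
Y = np.tanh(rng.normal(0.0, 1.0, (T, N)) + u[:, None])
Y = np.hstack([Y, np.ones((T, 1))])     # constant bias node, y_{N+1} = 1

# delayed XOR target, Eq. (15): f(t) = XOR[u(t-tau), u(t-tau-1)]
f = np.zeros(T)
f[tau + 1:] = np.logical_xor(u[1:T - tau], u[:T - tau - 1]).astype(float)

# ridge-regression readout, Eq. (17)
w_out = np.linalg.solve(Y.T @ Y + alpha * np.eye(N + 1), Y.T @ f)
y_out = Y @ w_out

# one term of the memory capacity, Eq. (19): squared correlation coefficient
mc_k = np.corrcoef(f, y_out)[0, 1] ** 2
```

Since the stand-in reservoir here carries no memory of past inputs, mc_k stays near zero; in the adapted network, the recurrent dynamics provide the required memory trace.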
A spectral radius R_a slightly smaller than unity was optimal when using Gaussian input, but not for binary input signals. In this case the measured spectral radius R_a deviated linearly from the target R_t with increasing strength of the input, as parameterized by the standard deviation σ_ext. Still, the locus of optimal performance was essentially independent of the input strength, with maximal performance attained at a roughly constant value of R_t. Note that the line R_a = 1 joins R_t = 1 in the limit σ_ext → 0.

Comparing these results to variance control, as shown in Figure 4, we found that variance control led to an overall lower performance. To our surprise, for external input with a large variance, Gaussian input caused stronger deviations from the desired spectral radius than binary input. Therefore, in a sense, it appeared to behave opposite to what we found for flow control. However, similar to flow control, the value of R_t giving optimal performance under a given σ_ext remained relatively stable over the range of external input strengths measured. On the other hand, using homogeneous input, see Figure 19 in the supplementary material, did cause substantial deviations from the target spectral radius when using binary input.

A crucial assumption leading to the proposed adaptation rules is the statistical independence of neural activity when describing the statistical properties of the bare recurrent contribution to the membrane potential, x_{bare,i} = Σ_j W_ij y_j. In particular, the variance σ_bare² of x_{bare,i} enters the mean-field approach described
Figure 3. XOR performance for flow control.
Color-coded performance sweeps for the XOR performance (18) after adaptation using flow control, averaged over five trials. The input has variance $\sigma^2_{\rm ext}$ and the target for the spectral radius is $R_{\rm t}$. Panels A/B are for the heterogeneous binary/Gaussian input protocols. Optimal performance for a given $\sigma_{\rm ext}$ was estimated as a trial average (yellow solid line) and found to be generally close to criticality, $R_{\rm a} = 1$, as measured (white dashed lines).

in Section 5.6. Assuming statistical independence across the population for the $y_i(t)$, it is simply given by $\sigma^2_{\rm bare} = \sigma^2_w\, \sigma^2_y$, where

$$\sigma^2_w \equiv \mathrm{Var}\Big(\sum_{j=1}^{N} W_{ij}\Big) \qquad (20)$$

is the variance of the sum of the bare afferent synaptic weights (see also Section 5.1). Being a crucial element of the proposed rules, deviations from the prediction of $\sigma^2_{\rm bare}$ would also negatively affect the precision of tuning the spectral radius. In Figure 5, a comparison of the deviations $|\sigma^2_{\rm bare} - \sigma^2_w \sigma^2_y|$ is presented for the four input protocols introduced in Section 5.3. For the Gaussian protocols, for which neurons receive statistically uncorrelated external signals, one observes that $\sigma^2_{\rm bare} \to \sigma^2_w \sigma^2_y$ in the thermodynamic limit $N \to \infty$ via a power law, which is to be expected when the presynaptic neural activities are decorrelated. On the other side, binary $\pm 1$ inputs act synchronously on all sites, either with site-dependent or site-independent strengths (heterogeneous/homogeneous). Corresponding activity correlations are induced, and a finite and only weakly size-dependent difference between $\sigma^2_{\rm bare}$ and $\sigma^2_w \sigma^2_y$ shows up. Substantial corrections to the analytic theory are to be expected in this case. To this end we measured the cross-correlation $C(y_i, y_j)$, defined via

$$\bar{C} = \frac{1}{N(N-1)} \sum_{i \neq j} |C(y_i, y_j)|, \qquad C(y_i, y_j) = \frac{\mathrm{Cov}(y_i, y_j)}{\sqrt{\mathrm{Cov}(y_i, y_i)\,\mathrm{Cov}(y_j, y_j)}}, \qquad (21)$$

with the covariance given by $\mathrm{Cov}(y_i, y_j) = \big\langle (y_i - \langle y_i \rangle_t)(y_j - \langle y_j \rangle_t) \big\rangle_t$.
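The averaged absolute cross-correlation of Eq. (21) is straightforward to estimate from a batch of recorded activities. A possible NumPy sketch (the function name is ours):

```python
import numpy as np

def mean_abs_cross_correlation(Y):
    """Eq. (21): average |Pearson correlation| over all pairs i != j.
    Y has shape (T, N): T time steps of N neural activities."""
    C = np.corrcoef(Y.T)                     # N x N correlation matrix
    N = C.shape[0]
    off_diag = np.abs(C[~np.eye(N, dtype=bool)])  # drop the diagonal C(y_i, y_i) = 1
    return off_diag.mean()
```

Uncorrelated activities give a small residual value of order $\sqrt{1/T}$, which is the finite-sampling analogue of the finite-size baseline discussed below.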
For a system of $N = 500$ neurons, the results for the averaged absolute correlation $\bar{C}$ are presented in Figure 6 (see Figure 20 in the supplementary material for the homogeneous input protocols). Autonomous echo-state layers are in chaotic states when supporting a finite activity level, which implies that correlations vanish in the thermodynamic
Figure 4. XOR performance for variance control.
Color-coded performance sweeps for the XOR performance (18) after adaptation using variance control, averaged over five trials. The input has variance $\sigma^2_{\rm ext}$ and the target for the spectral radius is $R_{\rm t}$. Panels A/B are for the heterogeneous binary/Gaussian input protocols. Optimal performance (yellow solid line) is in general close to criticality, $R_{\rm a} = 1$, as measured (white dashed lines).

limit $N \to \infty$. The case $\sigma_{\rm ext} = 0$, as included in Figure 6, consequently serves as a yardstick for the magnitude of correlations that are due to the finite number of neurons.

Input correlations were substantially above the autonomous case for correlated binary inputs, with the magnitude of $\bar{C}$ decreasing when the relative contribution of the recurrent activity increased, which was the case for increasing $R_{\rm a}$. The effect was opposite for the Gaussian protocol, for which the input did not induce correlations, but instead contributed to decorrelating neural activity. In this case, the mean absolute correlation $\bar{C}$ was suppressed when the internal activity became small in the limit $R_{\rm a} \to 0$. For larger $R_{\rm a}$, the recurrent input gained more impact on neural activity relative to the external drive and thus drove $\bar{C}$ towards the amount of correlation that would be expected in the autonomous case.

The mechanisms for tuning the spectral radius via a local homeostatic adaptation rule introduced in the present study require neurons to have the ability to distinguish and locally measure both external and recurrent input contributions. For flow control, neurons need to be able to compare the recurrent membrane potential with their own activity, as assumed in Section 2.2. Variance control, on the other hand, directly measures the variance of the external input and derives the activity target variance accordingly. The limiting factor to successful spectral radius control is the amount of cross-correlation induced by the statistics of the external drive.
As such, the functionality and validity of the proposed mechanisms depend on the ratio between external input, i.e. feed-forward or feedback connections, and recurrent, or lateral, connections. In general, it is not straightforward to directly connect experimental evidence regarding the ratio between recurrent and feed-forward contributions to the effects observed in the model. It is, however, worthwhile to note that the fraction of synapses associated with interlaminar loops and intralaminar lateral connections is estimated to be substantial (Binzegger et al., 2004). Relating this to our model, it implies that the significant interneural correlations that we observed when external input strengths were of the
Figure 5. Size dependence of correlation.
Comparison between the variance $\sigma^2_{\rm bare}$ of the bare recurrent input $x_{{\rm bare},i} = \sum_j W_{ij}\, y_j$ and $\sigma^2_w \sigma^2_y$. Equality is given when the presynaptic activities are statistically independent. This can be observed in the limit of large network sizes $N$ for uncorrelated input data streams (homogeneous and heterogeneous Gaussian input protocols), but not for correlated inputs (homogeneous and heterogeneous binary input protocols). Compare Section 5.3 for the input protocols. Parameters are $\sigma_{\rm ext} = 0.5$, $R_{\rm a} = 1$ and $\mu_{\rm t} = 0.05$.

same order of magnitude as the recurrent inputs cannot generally be considered an artifact of biologically implausible parameter choices. Synchronization (Echeveste and Gros, 2016) is in fact a widely observed phenomenon in the brain (Usrey and Reid, 1999), with possible relevance for information processing (Salinas and Sejnowski, 2001).

On the other hand, correlations due to shared input reduce the amount of information that can be stored in the neural ensemble (Bell and Sejnowski, 1995). Maximal information is achieved if neural activities or spike trains form an orthogonal ensemble (Földiák, 1990; Bell and Sejnowski, 1995; Tetzlaff et al., 2012). Furthermore, neural firing in cortical microcircuits was found to be decorrelated across neurons even if common external input was present (Ecker et al., 2010), that is, under a common orientation tuning. Therefore, the correlation we observed in our network due to shared input might be significantly reduced by possible modifications/extensions of our model: First, a strict separation between inhibitory and excitatory nodes according to Dale's law might help to actively decorrelate neural activity (Tetzlaff et al., 2012; Bernacchia and Wang, 2013).
Second, if higher-dimensional input was used, a combination of plasticity mechanisms in the recurrent and feed-forward connections could lead to the formation of an orthogonal representation of the input (Földiák, 1990; Bell and Sejnowski, 1995; Wick et al., 2010), resulting in richer, "higher dimensional" activity patterns, i.e. a less dominant largest principal component. Ultimately, if these measures helped to reduce neural cross-correlations in the model, we would expect them to also increase the accuracy of the presented adaptation mechanisms. We leave these modifications to possible future research.
Figure 6. Input induced activity correlations.
For heterogeneous binary and Gaussian inputs (A/B), the dependence of the mean activity cross-correlation $\bar{C}$, see Eq. (21), on the spectral radius $R_{\rm a}$. Results are obtained for $N = 500$ sites by averaging over five trials, with shadows indicating the standard error across trials. Correlations are due to finite-size effects for the autonomous case $\sigma_{\rm ext} = 0$.

Overall, we found flow control to be generally more robust than variance control in the sense that, while still being affected by the amount of correlations within the neural reservoir, the task performance was less prone to changes in the external input strength. Comparatively stable network performance could be observed, in spite of certain deviations from the desired spectral radius (see Figure 3). A possible explanation may be that flow control uses a distribution of samples from only a restricted part of phase space, that is, from the phase-space regions that are actually visited or "used" for a given input. Therefore, while a spectral radius of unity ensures, statistically speaking, the desired scaling properties in all phase-space directions, it seems to be enough to control the correct scaling for the subspace of activities that is actually used for a given set of input patterns. Variance control, on the other hand, relies more strictly on the assumption that neural activities are statistically independent. In consequence, the desired results could only be achieved under a rather narrow set of input statistics (independent Gaussian input with small variance). In addition, the approximate expression derived for the nonlinear transformation appearing in the mean-field approximation adds another potential source of systematic error to the control mechanism. This aspect also speaks in favor of flow control, since its rules are mathematically simpler. In contrast to variance control, the stationarity condition stated in Eq.
(12) is independent of the actual nonlinear activation function used and could easily be adopted in a modified neuron model. It should be noted, however, that the actual target $R_{\rm t}$ giving optimal performance might then also be affected.

Interestingly, flow control distinguishes itself from a conventional local activity-target perspective of synaptic homeostasis: there is no predefined set point in Eq. (5). This allows heterogeneities of the variances of neural activity to develop across the network, while retaining the average neural activity at a fixed predefined level.

We would like to point out that, for all the results presented here, only stationary processes were used for generating the input sequences. Therefore, it might be worth considering the potential effects of non-stationary, yet bounded, inputs on the results in future work. It should be noted, however, that the temporal domain enters both adaptation mechanisms only in the form of trailing averages of first and second moments. As a consequence, we expect the issue of non-stationarity of external inputs to present itself simply as a trade-off between slower adaptation, i.e. longer averaging time scales, and the mitigation of the effects of non-stationarities. Slow adaptation is, however, completely in line with experimental results on the dynamics of synaptic scaling, which takes place on the time scale of hours to days (Turrigiano et al., 1998; Turrigiano, 2008).
Apart from being relevant from a theoretical perspective, we propose that the separability of recurrent and external contributions to the membrane potential is an aspect that is potentially relevant for the understanding of local homeostasis in biological networks. While homeostasis in neural compartments has been a subject of experimental research (Chen et al., 2008), to our knowledge it has not yet been further investigated on a theoretical basis, although it has been hypothesized that the functional segregation within the dendritic structure might also affect, among other intraneural dynamical processes, homeostasis (Narayanan and Johnston, 2012). The neural network model used in this study lacks certain features characterizing biological neural networks, such as the strict positivity of neural firing rates or Dale's law, viz E-I balance (Trapp et al., 2018). Future research should therefore investigate whether the framework of local flow control presented here can be implemented within more realistic biological neural network models. A particular concern regarding our findings is that biological neurons are spiking. The concept of an underlying instantaneous firing rate is, strictly speaking, a theoretical construct, let alone the definition of higher moments, such as the "variance of neural activity". It is, however, acknowledged that the variability of neural activity is central for statistical inference (Echeveste et al., 2020). It is also important to note that real-world biological control mechanisms, e.g. of the activity, rely on physical quantities that serve as measurable correlates. A well-known example is the intracellular calcium concentration, which is essentially a linearly filtered version of the neural spike train (Turrigiano, 2008). On a theoretical level, Cannon and Miller showed that dual homeostasis can successfully control the mean and variance of this type of spike-averaging physical quantity (Cannon and Miller, 2017). An extension of flow control to
An extension of the flow control tofiltered spike trains of spiking neurons could be an interesting subject of further investigations. However,using spiking neuron models would have shifted the focus of our research towards the theory of liquid statemachines (Maass et al., 2002; Maass and Markram, 2004), exceeding the scope of this publication. Wetherefore leave the extension to more realistic network/neuron models to future work.
We implemented an echo-state network with $N$ neurons, receiving $D_{\rm in}$ inputs. Here $y_i \in [-1, 1]$ is the neural activity, $x_i$ the membrane potential, $u_i$ the input activities, $W_{ij}$ the internal synaptic weights and $I_i$ the external input received. The output layer will be specified later. The dynamics

$$x_i(t) = a_i \sum_{j=1}^{N} W_{ij}\, y_j(t-1) + I_i(t), \qquad y_i(t) = \tanh\!\big(x_i(t) - b_i\big) \qquad (22)$$

is discrete in time, where the input $I_i$ is treated instantaneously. A tanh-sigmoidal has been used as the nonlinear activation function.

The synaptic renormalization factor $a_i$ in (22) can be thought of as a synaptic scaling parameter that neurons use to regulate the overall strength of the recurrent inputs. The strength of the inputs $I_i$ is unaffected,
which is biologically plausible if external and recurrent signals arrive at separate branches of the dendritic tree (Spruston, 2008).

The $W_{ij}$ are the bare synaptic weights, with $a_i W_{ij}$ being the components of the effective weight matrix $\hat{W}_{\rm a}$. Key to our approach is that the propagation of activity is determined by $\hat{W}_{\rm a}$, which implies that the spectral radius of the effective, and not of the bare, weight matrix needs to be regulated.

The bare synaptic matrix $W_{ij}$ is sparse, with a connection probability $p_{\rm r} = 0.1$. The non-zero elements are drawn from a Gaussian with standard deviation

$$\sigma = \frac{\sigma_w}{\sqrt{N p_{\rm r}}} \qquad (23)$$

and vanishing mean. Here $N p_{\rm r}$ corresponds to the mean number of afferent internal synapses, with the scaling $\sim 1/\sqrt{N p_{\rm r}}$ enforcing size-consistent synaptic-weight variances. As discussed in the results section, we applied the following adaptation mechanisms:

$$b_i(t) = b_i(t-1) + \epsilon_b \left[ y_i(t) - \mu_{\rm t} \right] \qquad (24)$$

for the thresholds $b_i$.

• Adaptation of gains, using flow control:
$$a_i(t) = a_i(t-1)\left[ 1 + \epsilon_a\, \Delta R_i(t) \right], \qquad \Delta R_i(t) = R_{\rm t}^2\, y_i^2(t-1) - x_{{\rm r},i}^2(t). \qquad (25)$$

• Adaptation of gains, with variance control:
$$a_i(t) = a_i(t-1) + \epsilon_a \left[ \sigma^2_{{\rm t},i}(t) - \big( y_i(t) - \mu_{y,i}(t) \big)^2 \right], \qquad (26)$$
$$\sigma^2_{{\rm t},i}(t) = 1 - \frac{1}{\sqrt{1 + 2 R_{\rm t}^2\, \sigma^2_{y,i}(t) + 2\sigma^2_{{\rm ext},i}(t)}}, \qquad (27)$$
$$\mu_{y,i}(t) = \mu_{y,i}(t-1) + \epsilon_\mu \left[ y_i(t) - \mu_{y,i}(t-1) \right], \qquad (28)$$
$$\sigma^2_{{\rm ext},i}(t) = \sigma^2_{{\rm ext},i}(t-1) + \epsilon_\sigma \left[ \big( I_i(t) - \mu_{{\rm ext},i}(t) \big)^2 - \sigma^2_{{\rm ext},i}(t-1) \right], \qquad (29)$$
$$\mu_{{\rm ext},i}(t) = \mu_{{\rm ext},i}(t-1) + \epsilon_\mu \left[ I_i(t) - \mu_{{\rm ext},i}(t-1) \right]. \qquad (30)$$

Note that Eqs. (28)–(30) all have the same mathematical form, $\langle\cdot\rangle_{\rm trail}(t) = \langle\cdot\rangle_{\rm trail}(t-1) + \epsilon \left[ \langle\cdot\rangle_{\rm var}(t) - \langle\cdot\rangle_{\rm trail}(t-1) \right]$, since they only serve as trailing averages that enter the two main equations (26) and (27). For a summary of all model parameters, see Table 1. Table 1.
Standard values for model parameters
$N$ | $p_{\rm r}$ | $\sigma_w$ | $\mu_{\rm t}$ | $\epsilon_b$ | $\epsilon_a$ | $\epsilon_\mu$ | $\epsilon_\sigma$
500 | 0.1 | 1 | 0.05 | $10^{-3}$ | $10^{-3}$ | $10^{-3}$ | $10^{-3}$
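To make the update rules concrete, the following sketch implements one time step of the dynamics (22) together with the flow-control gain update (25). This is a minimal illustration under simplifying assumptions: the thresholds are held at zero (the adaptation (24) is omitted) and all names are ours, not the authors' code:

```python
import numpy as np

def esn_step(W, a, b, y, I):
    """One update of Eq. (22): x_i(t) = a_i * sum_j W_ij y_j(t-1) + I_i(t),
    y_i(t) = tanh(x_i(t) - b_i). Returns the new activity and the rescaled
    recurrent membrane potential x_r (the part flow control compares against)."""
    x_r = a * (W @ y)          # recurrent contribution only, rescaled by the gains
    y_new = np.tanh(x_r + I - b)
    return y_new, x_r

def flow_control_update(a, y_prev, x_r, R_t, eps_a=1e-3):
    """Flow control, Eq. (25): a_i(t) = a_i(t-1) * [1 + eps_a * dR_i(t)],
    with dR_i(t) = R_t^2 * y_i^2(t-1) - x_r,i^2(t)."""
    dR = R_t ** 2 * y_prev ** 2 - x_r ** 2
    return a * (1.0 + eps_a * dR)
```

Driving such a network with weak uncorrelated Gaussian input and iterating both functions lets the gains $a_i$ drift until the effective spectral radius of $a_i W_{ij}$ settles near $R_{\rm t}$, which is the implicit regulation described in the text.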
Figure 7. Convergence time with and without adaptation rate renormalization
Number of time steps $T_{\rm conv}$ needed for $|R_{\rm a}(t) - R_{\rm a}(t-1)|$ to fall below a fixed threshold. Shown are results using heterogeneous Gaussian input without and with, (A) and respectively (B), a renormalization of the learning rate, $\epsilon_a \to \epsilon_a/\bar{x}^2_{\rm r}$. Note that, due to computational complexity, the estimate of $R_{\rm a}$ given by (10) was used. An initial offset from the target $R_{\rm t}$ was used for all runs. Color coding of $R_{\rm t}$ is the same in both panels.

For small values of $R_{\rm t}$ and weak external input, the average squared activities and membrane potentials $y_i^2(t)$ and $x^2_{{\rm r},i}(t)$ can become very small. As a consequence, their difference entering $\Delta R_i(t)$ in (25) also becomes small in absolute value, slowing down the convergence process. To eliminate this effect, we decided to rescale the learning rate by a trailing average of the squared recurrent membrane potential, i.e. $\epsilon_a \to \epsilon_a/\bar{x}^2_{\rm r}$. The effect of this renormalization is shown in Fig. 7. Rescaling the learning rate effectively removes the significant rise of convergence times for small $\sigma_{\rm ext}$ and small $R_{\rm t}$.

Overall, we examined four distinct input protocols.

• Homogeneous Gaussian.
Nodes receive inputs $I_i(t)$ that are drawn individually from a Gaussian with vanishing mean and standard deviation $\sigma_{\rm ext}$. • Heterogeneous Gaussian.
Nodes receive stochastically independent inputs $I_i(t)$ that are drawn from Gaussian distributions with vanishing mean and node-specific standard deviations $\sigma_{i,{\rm ext}}$. The individual $\sigma_{i,{\rm ext}}$ are half-normal distributed, being drawn from the positive part of a Gaussian with mean zero and variance $\sigma^2_{\rm ext}$. • Homogeneous binary.
Sites receive identical inputs $I_i(t) = \sigma_{\rm ext}\, u(t)$, where $u(t) = \pm 1$ is a binary input sequence. • Heterogeneous binary.
With

$$I_i(t) = W_{u,i}\, u(t), \qquad u(t) = \pm 1, \qquad (31)$$

we define the afferent synaptic weight vector $W_{u,i}$, which connects the binary input sequence $u(t)$ to the network. All $W_{u,i}$ are drawn independently from a Gaussian with mean zero and standard deviation $\sigma_{\rm ext}$.
The Gaussian input variants simulate external noise. We used them in particular to test predictions of the theory developed in Section 5.6. The binary input protocols were employed to test the performance of the echo-state network on the delayed XOR task. A generalization of the protocols defined here to the case of higher-dimensional input signals would be straightforward.
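The four protocols can be generated as follows; a sketch under our own naming, with `rng` a NumPy random generator:

```python
import numpy as np

def make_input(protocol, N, T, sigma_ext, rng):
    """Generate external inputs I of shape (T, N) for the protocols of Section 5.3."""
    if protocol == "homogeneous_gaussian":
        # independent draws, one common standard deviation sigma_ext
        return sigma_ext * rng.standard_normal((T, N))
    if protocol == "heterogeneous_gaussian":
        # node-specific std drawn from the positive part of a Gaussian
        sig_i = sigma_ext * np.abs(rng.standard_normal(N))
        return sig_i * rng.standard_normal((T, N))
    u = rng.choice([-1.0, 1.0], size=(T, 1))   # shared binary sequence u(t)
    if protocol == "homogeneous_binary":
        return sigma_ext * u * np.ones(N)       # identical input on all sites
    if protocol == "heterogeneous_binary":
        w_u = sigma_ext * rng.standard_normal(N)  # afferent weight vector W_u, Eq. (31)
        return u * w_u
    raise ValueError(f"unknown protocol: {protocol}")
```

Note that both binary protocols drive all sites with the same sequence $u(t)$, which is precisely what induces the interneural correlations discussed above, whereas the Gaussian protocols deliver statistically independent streams.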
For an understanding of the spectral-radius adaptation dynamics of flow control, it is of interest to examine the effect of using the global adaptation constraint

$$\Delta R_i(t) = \frac{1}{N}\left[ R_{\rm t}^2\, ||\mathbf{y}(t-1)||^2 - ||\mathbf{x}_{\rm r}(t)||^2 \right] \qquad (32)$$

in (5). The spectral radius condition (12) is then enforced directly, with the consequence that (32) is stable and precise even in the presence of correlated neural activities (see Figure 1C). This rule, while not biologically plausible, provides an opportunity to examine the dynamical flow, besides the resulting state. There are two dynamic variables: $a = a_i\ \forall i$, where, for the sake of simplicity, we assumed that all $a_i$ are homogeneous, and the activity variance $\sigma_y^2 = ||\mathbf{y}||^2/N$. The evolution of $(a, \sigma_y)$ resulting from the global rule (6) is shown in Figure 8. For the flow, $\Delta a = a(t+1) - a(t)$ and $\Delta \sigma_y^2 = \sigma_y^2(t) - \sigma_y^2(t-1)$, the approximation

$$\Delta a = \epsilon_a\, a \left( R_{\rm t}^2 - a^2 \sigma_w^2 \right) \sigma_y^2, \qquad (33)$$

$$\Delta \sigma_y^2 = 1 - \sigma_y^2 - \frac{1}{\sqrt{1 + 2 a^2 \sigma_w^2 \sigma_y^2 + 2 \sigma_{\rm ext}^2}} \qquad (34)$$

is obtained. For the scaling factor $a$, this leads to a fixed point at $R_{\rm t}/\sigma_w$. We used the mean-field approximation for the neural variances that is derived in Section 5.6. The analytic flow compares well with numerics, as shown in Figure 8. For a subcritical rescaling factor $a$ and $\sigma_{\rm ext} = 0$, the system flows towards a line of fixpoints defined by a vanishing $\sigma_y$ and a finite $a \in [0, 1]$, see Figure 8A. When starting with $a > 1$, the fixpoint is instead $(a, \sigma_y) = (1, 0)$. The situation changes qualitatively for finite external inputs, viz when $\sigma_{\rm ext} > 0$, as shown in Figure 8B–D. The nullcline $\Delta \sigma_y^2 = 0$ is now continuous and the system flows to the fixed point, with the value of $\sigma_y$ being determined by the intersection of the two nullclines. In addition, we also varied the target spectral radius, see Figure 8B/C. This caused a slight mismatch between the flow of the simulated systems and the analytic flow.
It should be noted, however, that this is to be expected, since we used an approximation for the neural variances (see Section 5.6). This analysis shows that external input is necessary for a robust flow towards the desired spectral weight, the reason being that, when the isolated system starts in the subcritical regime, the dynamics dies out before the spectral weight can be adapted.

We would like to show that the stationarity condition in Eq. (12) results in the correct spectral radius in the special case of independently identically distributed neural activities with zero mean. We start with Eq. (12) as a stationarity condition for a given $R_{\rm t}$:

$$\left\langle ||\mathbf{x}_{\rm r}(t)||^2 \right\rangle_t \overset{!}{=} R_{\rm t}^2 \left\langle ||\mathbf{y}(t-1)||^2 \right\rangle_t. \qquad (35)$$
Figure 8. Spectral radius adaptation dynamics.
The dynamics of the synaptic rescaling factor $a$ and the squared activity $\sigma_y^2$ (orange), as given by (6), for $R_{\rm t} = 1$. Also shown is the analytic approximation of the flow (blue), see (33) and (34), and the respective nullclines $\Delta a = 0$ (green) and $\Delta \sigma_y^2 = 0$ (red). For the input, the heterogeneous binary protocol is used. Panels A to D correspond to different combinations of external input strengths and target spectral radii. The black dots show the steady-state configurations of the simulated systems.

We can express the left side of the equation as

$$\mathrm{E}\left[ \mathbf{y}^\dagger(t)\, \hat{W}_{\rm a}^\dagger \hat{W}_{\rm a}\, \mathbf{y}(t) \right]_t. \qquad (36)$$

We define $\hat{U}_{\rm a} \equiv \hat{W}_{\rm a}^\dagger \hat{W}_{\rm a}$, with $\{\sigma_k^2\}$ being the set of eigenvalues, which are also the squared singular values of $\hat{W}_{\rm a}$, and $\{\mathbf{u}_k\}$ the respective set of orthonormal (column) eigenvectors. We insert the identity
Homeostatic regulation of echo-state networks (cid:80) Nk =1 u k u † k and find E (cid:34) y † ( t ) (cid:98) U a N (cid:88) k =1 u k u † k y ( t ) (cid:35) t (37) =E (cid:34) N (cid:88) k =1 σ k y † ( t ) u k u † k y ( t ) (cid:35) t (38) = N (cid:88) k =1 σ k u † k E (cid:104) y ( t ) y † ( t ) (cid:105) t u k (39) = N (cid:88) k =1 σ k u † k (cid:98) C yy u k (40) =Tr (cid:16) (cid:98) D σ (cid:98) S † u (cid:98) C yy (cid:98) S u (cid:17) . (41)Given zero mean neural activity, (cid:98) C yy = E[ y ( t ) y † ( t )] t is the covariance matrix of neural activities. (cid:98) D σ is a diagonal matrix holding the { σ k } and (cid:98) S u is a unitary matrix whose columns are { u k } . (cid:98) S † u (cid:98) C yy (cid:98) S u isexpressing (cid:98) C yy in the diagonal basis of (cid:98) U a .Including the right hand side of (35), we get Tr (cid:16) (cid:98) D σ (cid:98) S † u (cid:98) C yy (cid:98) S u (cid:17) = R Tr (cid:16) (cid:98) C yy (cid:17) . (42)However, since the trace is invariant under a change of basis, we find Tr (cid:16) (cid:98) D σ (cid:98) S † u (cid:98) C yy (cid:98) S u (cid:17) = R Tr (cid:16) (cid:98) S † u (cid:98) C yy (cid:98) S u (cid:17) . (43)Defining (cid:98) C u ≡ = (cid:98) S † u (cid:98) C yy (cid:98) S u , we get N (cid:88) k =1 σ k C u kk = R N (cid:88) k =1 C u kk . (44)If we assume that the node activities are independently identically distributed with zero mean, we get ( (cid:98) C yy ) ij = ( (cid:98) C u ) ij = (cid:10) y (cid:11) t δ ij . In this case, which was also laid out in Section 2.6, the equation reduces to N (cid:88) k =1 σ k = R N . (45)The Frobenius norm of a square Matrix (cid:98) A is given by (cid:107) (cid:98) A (cid:107) ≡ = (cid:80) i,j (cid:98) A ij . Furthermore, the Frobeniusnorm is linked to the singular values via (cid:107) (cid:98) A (cid:107) = (cid:80) k σ k ( (cid:98) A ) (Sengupta and Mitra, 1999; Shen, 2001). 
This allows us to state

$$\sum_{i,j} \left( \hat{W}_{\rm a} \right)_{ij}^2 = R_{\rm t}^2\, N, \qquad (46)$$

which, by using (10), gives

$$\bar{R}_{\rm a}^2 = R_{\rm t}^2. \qquad (47)$$

A slightly less restrictive case is that of uncorrelated but inhomogeneous activity, that is $(\hat{C}_{yy})_{ij} = \langle y_i^2 \rangle_t\, \delta_{ij}$. The statistical properties of the diagonal elements $C^u_{kk}$ then determine to which degree one can still claim that Eq. (44) leads to Eq. (45). Figure 21 in the supplementary material shows an example of a randomly generated realization of $(\hat{C}_{yy})_{ii} = \langle y_i^2 \rangle_t$ and the resulting diagonal elements of $\hat{C}^u$, where the corresponding orthonormal basis $\hat{S}_u$ was generated from the SVD of a random Gaussian matrix. As one can see, the basis transformation has a strong smoothing effect on the diagonal entries, while the mean over the diagonal elements is preserved. Note that this effect was not disturbed by introducing random row-wise multiplications to the random matrix from which the orthonormal basis was derived. The smoothing of the diagonal entries allows us to state that $C^u_{kk} \simeq \langle y^2 \rangle$ is a very good approximation in the case considered, which therefore reduces (44) to the homogeneous case previously described. We can conclude that the adaptation mechanism also gives the desired spectral radius under uncorrelated inhomogeneous activity. In the most general case, we can still state that if the $C^u_{kk}$ and $\sigma_k^2$ are uncorrelated, then for large $N$, Eq. (44) will tend towards

$$N \left\langle \sigma_k^2 \right\rangle_k \left\langle C^u \right\rangle = N R_{\rm t}^2 \left\langle C^u \right\rangle, \qquad (48)$$

which would also lead to Eq. (45).
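The chain (45)–(47) can be checked numerically: for i.i.d. zero-mean activities, the time-averaged ratio $\langle||\hat{W}_{\rm a}\mathbf{y}||^2\rangle / \langle||\mathbf{y}||^2\rangle$ equals $||\hat{W}_{\rm a}||_{\rm F}^2/N$, the squared Frobenius-norm estimate of the spectral radius. A sketch (function names are ours):

```python
import numpy as np

def frobenius_radius_sq(W_a):
    """Frobenius-norm estimate of the squared spectral radius: ||W_a||_F^2 / N,
    which by the SVD identity equals (1/N) * sum_k sigma_k^2."""
    return np.sum(W_a ** 2) / W_a.shape[0]

def recurrent_power_ratio(W_a, Y):
    """Empirical <||W_a y||^2>_t / <||y||^2>_t over a batch of activity vectors Y (T x N).
    For i.i.d. zero-mean activities this equals frobenius_radius_sq(W_a), cf. Eq. (45)."""
    return np.mean(np.sum((Y @ W_a.T) ** 2, axis=1)) / np.mean(np.sum(Y ** 2, axis=1))
```

This is exactly the quantity that flow control drives towards $R_{\rm t}^2$, so the identity makes explicit why the local rule tunes the spectral radius whenever the independence assumption holds.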
However, we cannot generally guarantee statistical independence, since the recurrent contribution to the neural activities, and thus the resulting entries of $\hat{C}_{yy}$ and hence also the $C^u_{kk}$, are linked to $\hat{S}_u$ and the $\sigma_k^2$, which derive from the SVD of the recurrent weight matrix.

In the following, we deduce analytic expressions that allow us to examine the state of echo-state layers subject to a continuous timeline of inputs. Our approach is similar to the one presented by Massar and Massar (2013). The recurrent part of the input $x_i$ received by a neuron is a superposition of $N p_{\rm r}$ terms, which are assumed here to be uncorrelated. Given this assumption, the self-consistency equations

$$\sigma^2_{y,i} = \int_{-\infty}^{\infty} dx\, \tanh^2(x)\, N_{\mu_i,\sigma_i}(x) - \mu^2_{y,i}, \qquad (49)$$
$$\mu_{y,i} = \int_{-\infty}^{\infty} dx\, \tanh(x)\, N_{\mu_i,\sigma_i}(x), \qquad (50)$$
$$\sigma_i^2 = a_i^2\, \sigma_w^2 \left\langle \sigma^2_{y,j} \right\rangle_j + \sigma^2_{{\rm ext},i}, \qquad \mu_i = \mu_{{\rm ext},i} - b_i \qquad (51)$$

determine the properties of the stationary state. We recall that $\sigma_w$ parameterizes the distribution of bare synaptic weights via (23). The general expressions (49) and (50) hold for all neurons, with the site dependency entering exclusively via $a_i$, $b_i$, $\sigma_{{\rm ext},i}$ and $\mu_{{\rm ext},i}$, as in (51), with the latter two characterizing the standard deviation and the mean of the input. Here, $a_i^2 \sigma_w^2 \langle\sigma^2_{y,j}\rangle_j$ is the variance of the recurrent contribution to the membrane potential, $x_{{\rm r},i}$, and $\sigma_i^2$ the respective total variance. The membrane potential is Gaussian distributed, as $N_{\mu,\sigma}(x)$, with mean $\mu$ and variance $\sigma^2$, which are both to be determined self-consistently. Variances are additive for stochastically independent processes, which has been assumed in (51) to be the case for recurrent activities and external inputs. The average value of the mean neural activity is $\mu_{y,i}$.
Figure 9. Variance control for the spectral radius.
The spectral radius $R_{\rm a}$, given by the approximation $\bar{R}_{\rm a}^2 = \sum_i a_i^2/N$ (valid for $\sigma_w = 1$), for the four input protocols defined in Section 5.3. Lines show the numerical self-consistency solution of (49), symbols the full network simulations. Note the instability for small $\sigma_y$ and $\sigma_{\rm ext}$. A: Homogeneous independent Gaussian input. B: Homogeneous identical binary input. C: Heterogeneous independent Gaussian input. D: Heterogeneous identical binary input.

For a given set of $a_i$, $\sigma_{{\rm ext},i}$ and $b_i$, the means and variances of the neural activities, $\mu_{y,i}$ and $\sigma^2_{y,i}$, follow implicitly. We compared the numerically determined solutions of (49) and (50) against full network simulations using, as throughout this study, $N = 500$, $p_{\rm r} = 0.1$, $\sigma_w = 1$, $\mu_{\rm t} = 0.05$. In Figure 9, the spectral radius $R_{\rm a}$ is given for the four input protocols defined in Section 5.3. The identical ensemble of input standard deviations $\sigma_{{\rm ext},i}$ enters both theory and simulations. Theory and simulations are in good accordance for vanishing input. Here, the reason is that finite activity levels are sustained in an autonomous random neural network when the ongoing dynamics is chaotic and hence decorrelated. For reduced activity levels, viz for small variances $\sigma_y^2$, the convergence of the network dynamics is comparatively slow, which leads to a certain discrepancy with the analytic prediction (see Figure 9).
The integral occurring in the self-consistency condition (49) can be evaluated explicitly when a tractable approximation to the squared transfer function $\tanh^2()$ is available. A polynomial approximation would capture the leading behavior close to the origin, however without accounting for the fact that $\tanh^2()$ converges to unity for large absolute values of the membrane potential. Alternatively, an approximation incorporating both conditions, the correct second-order scaling for small, and the correct convergence for large arguments, is given by the Gaussian approximation

$$\tanh^2(x) \approx 1 - \exp\left( -x^2 \right). \qquad (52)$$

With this approximation the integral in (49) can be evaluated explicitly. The result is

$$1 - \sigma_y^2 - \mu_y^2 = \frac{1}{\sqrt{1 + 2\sigma^2}}\, \exp\!\left( -\frac{\mu^2}{1 + 2\sigma^2} \right), \qquad \sigma^2 = a^2 \sigma_w^2 \sigma_y^2 + \sigma_{\rm ext}^2. \qquad (53)$$

Assuming that $\mu \approx 0$ and $\mu_y \approx 0$, inverting the first equation yields a relatively simple analytic approximation for the variance self-consistency equation:

$$\sigma_y^2 = 1 - \frac{1}{\sqrt{1 + 2 a^2 \sigma_w^2 \sigma_y^2 + 2 \sigma_{\rm ext}^2}}. \qquad (54)$$

This equation was then used for the approximate update rule in (8) and (34). Alternatively, we can write (54) as a self-consistency equation between $\sigma_y^2$, $\sigma_{\rm ext}^2$ and $a\,\sigma_w = R_{\rm a}$, describing a phase transition at $R_{\rm a} = 1$:

$$2 R_{\rm a}^2\, \sigma_y^2 \left( 1 - \sigma_y^2 \right)^2 = 1 - \left( 1 + 2\sigma_{\rm ext}^2 \right) \left( 1 - \sigma_y^2 \right)^2. \qquad (55)$$

See Fig. 10 for solutions of (55) for different values of $\sigma_{\rm ext}^2$. Note that, for vanishing external driving and values of $R_{\rm a}$ above but close to the critical point, the standard deviation $\sigma_y$ scales as $\sigma_y \propto (R_{\rm a} - 1)^{1/2}$, which is the typical critical exponent for the order parameter in the classical Landau theory of second-order phase transitions (Gros, 2008, p. 169). If combined with a slow homeostatic process, flow or variance control in our case, this constitutes a system with an absorbing phase transition (Gros, 2008, p. 182-183), settling at the critical point $R_{\rm a} = 1$. This phase transition can also be observed in Fig. 9 for $\sigma_{\rm ext} = 0$ as a sharp onset in $\sigma_y$.
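The self-consistency equation (54) can be solved by straightforward fixed-point iteration; a sketch (the function name is ours), writing $R_{\rm a} = a\,\sigma_w$:

```python
import numpy as np

def sigma_y_sq(R_a, sigma_ext, n_iter=500):
    """Fixed-point iteration of Eq. (54):
    sigma_y^2 = 1 - 1/sqrt(1 + 2*R_a^2*sigma_y^2 + 2*sigma_ext^2),
    with R_a = a * sigma_w. The map is a contraction for the parameter
    ranges considered here, so plain iteration converges."""
    s = 0.5  # initial guess for sigma_y^2
    for _ in range(n_iter):
        s = 1.0 - 1.0 / np.sqrt(1.0 + 2.0 * R_a ** 2 * s + 2.0 * sigma_ext ** 2)
    return s
```

Without external drive the iteration reproduces the phase transition: the activity variance vanishes for $R_{\rm a} < 1$ and becomes finite above the critical point, while any finite $\sigma_{\rm ext}$ smooths the onset.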
CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
AUTHOR CONTRIBUTIONS
Both authors, F.S. and C.G., contributed equally to the writing and review of the manuscript. F.S. provided the code, ran the simulations, and prepared the figures.
Figure 10. Phase transition of activity variance
Shown are solutions of the analytical approximation given in (55), capturing the onset of activity (characterized by its variance σ_y²) at the critical point R_a = 1.

ACKNOWLEDGMENTS
The authors acknowledge the financial support of the German Research Foundation (DFG) and discussions with R. Echeveste. This manuscript was published as a preprint on bioRxiv (Schubert and Gros, 2020).
DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found at https://itp.uni-frankfurt.de/~fschubert/data_esn_frontiers/. Simulation and plotting code is available at https://github.com/FabianSchubert/ESN_Frontiers.

REFERENCES
Abbott, L. F. and LeMasson, G. (1993). Analysis of Neuron Models with Dynamically Regulated Conductances. Neural Computation 5, 823–842. doi:10.1162/neco.1993.5.6.823
Barral, J. and D'Reyes, A. (2016). Synaptic scaling rule preserves excitatory-inhibitory balance and salient neuronal network dynamics. Nature Neuroscience 19, 1690–1696. doi:10.1038/nn.4415
Bell, A. J. and Sejnowski, T. J. (1995). An information-maximisation approach to blind separation and blind deconvolution. Neural Computation 7, 1129–1159
[Dataset] Bernacchia, A. and Wang, X. J. (2013). Decorrelation by recurrent inhibition in heterogeneous neural circuits. doi:10.1162/NECO_a_00451
Binzegger, T., Douglas, R. J., and Martin, K. A. (2004). A quantitative map of the circuit of cat primary visual cortex. Journal of Neuroscience 24, 8441–8453. doi:10.1523/JNEUROSCI.1400-04.2004
Boedecker, J., Obst, O., Mayer, N. M., and Asada, M. (2009). Initialization and self-organized optimization of recurrent neural network connectivity. HFSP Journal 3, 340–349. doi:10.2976/1.3240502
Borde, M., Cazalets, J. R., and Buno, W. (1995). Activity-dependent response depression in rat hippocampal CA1 pyramidal neurons in vitro. Journal of Neurophysiology 74, 1714–1729. doi:10.1152/jn.1995.74.4.1714
Cannon, J. and Miller, P. (2017). Stable Control of Firing Rate Mean and Variance by Dual Homeostatic Mechanisms. Journal of Mathematical Neuroscience 7. doi:10.1186/s13408-017-0043-7
Chen, N., Chen, X., and Wang, J. H. (2008). Homeostasis established by coordination of subcellular compartment plasticity improves spike encoding. Journal of Cell Science
Nature Neuroscience, 1138–1149
Echeveste, R. and Gros, C. (2016). Drifting states and synchronization induced chaos in autonomous networks of excitable neurons. Frontiers in Computational Neuroscience 10, 98
Ecker, A. S., Berens, P., Keliris, G. A., Bethge, M., Logothetis, N. K., and Tolias, A. S. (2010). Decorrelated neuronal firing in cortical microcircuits. Science
Enel, P., Procyk, E., Quilodran, R., and Dominey, P. F. (2016). Reservoir Computing Properties of Neural Dynamics in Prefrontal Cortex. PLOS Computational Biology 12, e1004967. doi:10.1371/journal.pcbi.1004967
Földiák, P. (1990). Forming sparse representations by local anti-Hebbian learning. Biological Cybernetics
Franklin, J. L., Fickbohm, D. J., and Willard, A. L. (1992). Long-term regulation of neuronal calcium currents by prolonged changes of membrane potential. Journal of Neuroscience 12, 1726–1735. doi:10.1523/jneurosci.12-05-01726.1992
Gallicchio, C. and Micheli, A. (2017). Echo state property of deep reservoir computing networks. Cognitive Computation 9, 337–350
Gros, C. (2008). Complex and Adaptive Dynamical Systems, a Primer (Springer)
Hinaut, X., Lance, F., Droin, C., Petit, M., Pointeau, G., and Dominey, P. F. (2015). Corticostriatal response selection in sentence production: Insights from neural network simulation with reservoir computing. Brain and Language
Short Term Memory in Echo State Networks. Tech. rep., Fraunhofer Institute for Autonomous Intelligent Systems
Lazar, A., Pipa, G., and Triesch, J. (2009). SORN: a self-organizing recurrent neural network. Frontiers in Computational Neuroscience
IEEE Transactions on Neural Networks and Learning Systems 29, 706–717
Maass, W. and Markram, H. (2004). On the computational power of circuits of spiking neurons. Journal of Computer and System Sciences 69, 593–616. doi:10.1016/j.jcss.2004.04.001
Maass, W., Natschläger, T., and Markram, H. (2002). Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation 14, 2531–2560. doi:10.1162/089976602760407955
Marčenko, V. A. and Pastur, L. A. (1967). Distribution of Eigenvalues for some Sets of Random Matrices. Mathematics of the USSR-Sbornik 1, 457–483. doi:10.1070/sm1967v001n04abeh001994
Massar, M. and Massar, S. (2013). Mean-field theory of echo state networks. Physical Review E 87, 042809. doi:10.1103/PhysRevE.87.042809
Miner, D. and Triesch, J. (2016). Plasticity-Driven Self-Organization under Topological Constraints Accounts for Non-Random Features of Cortical Synaptic Wiring. PLoS Computational Biology
Narayanan, R. and Johnston, D. (2012). Functional maps within a single neuron. Journal of Neurophysiology
PLoS Biology 7, e1000260
Rajan, K. and Abbott, L. F. (2006). Eigenvalue Spectra of Random Matrices for Neural Networks. Physical Review Letters 97. doi:10.1103/physrevlett.97.188104
Remme, M. W. and Wadman, W. J. (2012). Homeostatic scaling of excitability in recurrent neural networks. PLoS Computational Biology 8, e1002494. doi:10.1371/journal.pcbi.1002494
[Dataset] Salinas, E. and Sejnowski, T. J. (2001). Correlated neuronal activity and the flow of neural information. doi:10.1038/35086012
Schrauwen, B., Wardermann, M., Verstraeten, D., Steil, J. J., and Stroobandt, D. (2008). Improving reservoirs using intrinsic plasticity. Neurocomputing 71, 1159–1171. doi:10.1016/j.neucom.2007.12.020
Schubert, F. and Gros, C. (2020). Local homeostatic regulation of the spectral radius of echo-state networks. bioRxiv doi:10.1101/2020.07.21.213660
Sengupta, A. M. and Mitra, P. P. (1999). Distributions of singular values for some random matrices. Physical Review E 60, 3389
Shen, J. (2001). On the singular values of Gaussian random matrices. Linear Algebra and its Applications
Nature Reviews Neuroscience 9, 206–221. doi:10.1038/nrn2286
Steil, J. J. (2007). Online reservoir adaptation by intrinsic plasticity for backpropagation-decorrelation and echo state learning. Neural Networks 20, 353–364. doi:10.1016/j.neunet.2007.04.011
Stellwagen, D. and Malenka, R. C. (2006). Synaptic scaling mediated by glial TNF-α. Nature
Communications in Contemporary Mathematics 10, 261–307
Tetzlaff, C. (2011). Synaptic scaling in combination with many generic plasticity mechanisms stabilizes circuit connectivity. Frontiers in Computational Neuroscience 5, 47. doi:10.3389/fncom.2011.00047
Tetzlaff, T., Helias, M., Einevoll, G. T., and Diesmann, M. (2012). Decorrelation of Neural-Network Activity by Inhibitory Feedback. PLoS Computational Biology 8, e1002596. doi:10.1371/journal.pcbi.1002596
Toyoizumi, T., Kaneko, M., Stryker, M. P., and Miller, K. D. (2014). Modeling the Dynamic Interaction of Hebbian and Homeostatic Plasticity. Neuron 84, 497–510. doi:10.1016/j.neuron.2014.09.036
Trapp, P., Echeveste, R., and Gros, C. (2018). EI balance emerges naturally from continuous Hebbian learning in autonomous neural networks. Scientific Reports 8, 1–12
Triesch, J. (2007). Synergies between intrinsic and synaptic plasticity mechanisms. Neural Computation 19, 885–909. doi:10.1162/neco.2007.19.4.885
[Dataset] Turrigiano, G. G. (2008). The Self-Tuning Neuron: Synaptic Scaling of Excitatory Synapses. doi:10.1016/j.cell.2008.10.008
Turrigiano, G. G., Leslie, K. R., Desai, N. S., Rutherford, L. C., and Nelson, S. B. (1998). Activity-dependent scaling of quantal amplitude in neocortical neurons. Nature
[Dataset] Turrigiano, G. G. and Nelson, S. B. (2000). Hebb and homeostasis in neuronal plasticity. doi:10.1016/S0959-4388(00)00091-X
Usrey, W. M. and Reid, R. C. (1999). Synchronous Activity in the Visual System. Annual Review of Physiology 61, 435–456. doi:10.1146/annurev.physiol.61.1.435
van Rossum, M. C. W., Bi, G. Q., and Turrigiano, G. G. (2000). Stable Hebbian Learning from Spike Timing-Dependent Plasticity. Journal of Neuroscience 20, 8812–8821
Van Vreeswijk, C. and Sompolinsky, H. (1998). Chaotic Balanced State in a Model of Cortical Circuits. Neural Computation 10, 1321–1371. doi:10.1162/089976698300017214
Wernecke, H., Sándor, B., and Gros, C. (2019). Chaos in time delay systems, an educational review. arXiv preprint arXiv:1901.04826
Wick, S. D., Wiechert, M. T., Friedrich, R. W., and Riecke, H. (2010). Pattern orthogonalization via channel decorrelation by adaptive networks. Journal of Computational Neuroscience 28, 29–45. doi:10.1007/s10827-009-0183-1
Zenke, F., Hennequin, G., and Gerstner, W. (2013). Synaptic Plasticity in Neural Networks Needs Homeostasis with a Fast Rate Detector. PLoS Computational Biology 9, e1003330. doi:10.1371/journal.pcbi.1003330
Figure 11. Adaptation dynamics, flow control, local
Panels A–D show the dynamics of the squared spectral radius R_a² and of the local estimates R_a,i² under local flow control for the different input protocols, as given in the panel titles.
Figure 12. Adaptation dynamics, flow control, global
Panels A–D show the dynamics of the squared spectral radius R_a² and of the local estimates R_a,i² under global flow control for the different input protocols, as given in the panel titles.
Figure 13. Adaptation dynamics, variance control, local
Panels A–D show the dynamics of the squared spectral radius R_a² and of the local estimates R_a,i² under local variance control for the different input protocols, as given in the panel titles.
Figure 14. Adaptation dynamics, variance control, global
Panels A–D show the dynamics of the squared spectral radius R_a² and of the local estimates R_a,i² under global variance control for the different input protocols, as given in the panel titles.
Figure 15. Difference of the spectral radius after adaptation and the target spectral radius, flow control
For different standard deviations σ_ext of the external input, the error R_a − R_t between the resulting spectral radius R_a and the target spectral radius R_t was determined. Heterogeneous/homogeneous binary input (A/C) led to positive deviations from the target spectral radius for stronger external input. Heterogeneous/homogeneous Gaussian input (B/D) yielded perfect alignment between R_a and R_t. Local adaptation was used in all panels.
Figure 16. Difference of the spectral radius after adaptation and the target spectral radius, variance control
For different standard deviations σ_ext of the external input, the error R_a − R_t between the resulting spectral radius R_a and the target spectral radius R_t was determined. Heterogeneous binary input (A) led to a good alignment. On the other hand, homogeneous binary input (C) caused strong deviations from the target for larger inputs. Heterogeneous/homogeneous Gaussian input (B/D) both resulted in positive deviations that increased with input strength. Local adaptation was used in all panels.
Figure 17. Convergence of Lyapunov Spectrum.
Convergence of the eigenvalues of ln((Ŵⁿ)†Ŵⁿ)/(2n) for different n, as discussed in Section 2.6. Ŵ is a random Gaussian matrix that was rescaled to a spectral radius of one. Colors from blue to orange encode increasing exponents n. Green dots show the theoretical limit ln|λᵢ|, where λᵢ is the i-th eigenvalue of Ŵ.
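The convergence illustrated in Figure 17 can be sketched in a few lines. The snippet below is an illustration, not the code used for the figure (N and the chosen values of n are arbitrary); it tracks only the largest exponent, ln s_max(Ŵⁿ)/n, which approaches ln ρ(Ŵ) = 0 as n grows, by Gelfand's formula:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100

# Random Gaussian matrix, rescaled to unit spectral radius.
W = rng.normal(0.0, 1.0, size=(N, N))
W /= np.max(np.abs(np.linalg.eigvals(W)))

# The largest eigenvalue of ln((W^n)^T W^n)/(2n) equals ln(s_max(W^n))/n,
# where s_max is the largest singular value; it converges to
# ln(rho(W)) = 0 for n -> infinity, with the remaining eigenvalues
# approaching ln|lambda_i|.
exponents = {}
for n in (1, 10, 100):
    Wn = np.linalg.matrix_power(W, n)
    s_max = np.linalg.norm(Wn, 2)  # largest singular value
    exponents[n] = np.log(s_max) / n

print(exponents)  # values decrease towards zero as n grows
```

For small n the estimate is dominated by the non-normality of Ŵ (s_max exceeds the spectral radius), which is why the figure shows slow convergence of the smaller exponents in particular.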
Figure 18. XOR performance for flow control, homogeneous input.
Numerical results for the network performance under a time-delayed XOR task, as defined in Section 2.7, using homogeneous binary/Gaussian input. Shown are color-coded performance sweeps for the XOR performance (18), averaged over five trials. The input has variance σ_ext² and the target for the spectral radius is R_t. Panels A/B are for the binary/Gaussian input protocols. Optimal performance for a given σ_ext is indicated by solid yellow lines, the measured value of R_a = 1 by dashed white lines.

Figure 19. XOR performance for variance control, homogeneous input.

Numerical results for the network performance under a time-delayed XOR task, as defined in Section 2.7, using homogeneous binary/Gaussian input. Shown are color-coded performance sweeps for the XOR performance (18), averaged over five trials. The input has variance σ_ext² and the target for the spectral radius is R_t. Panels A/B are for the binary/Gaussian input protocols. Optimal performance for a given σ_ext is indicated by solid yellow lines, the measured value of R_a = 1 by dashed white lines.
Figure 20. Input-induced activity correlations.