A new stochastic STDP rule in a neural network model
Pascal Helson ∗ Draft March 2, 2018
∗ [email protected]

Abstract

Thought to be responsible for memory, synaptic plasticity has been widely studied in the past few decades. One example of a plasticity model is the popular Spike-Timing Dependent Plasticity (STDP). There is a huge literature on STDP models; their analysis is mainly based on numerical work, and only a few have been studied mathematically. Unlike most models, we aim at proposing a new stochastic STDP rule with discrete synaptic weights. It brings a new framework in which probabilistic tools can be used for an analytical study of plasticity. A separation of time scales enables us to derive an equation for the weight dynamics in the limit where plasticity is infinitely slow compared to the neural network dynamics. Such an equation is then analysed in simple cases which show a counter-intuitive result: divergence of the weights even when the integral over the learning window is negative. Finally, without adding constraints to our STDP rule, such as bounds or metaplasticity, we are able to give a simple condition on the parameters for which the weight process remains ergodic. This model attempts to answer the need for understanding the interplay between the weight dynamics and the neuron dynamics.
A large number of studies have focused on neural network dynamics in order to reproduce biological phenomena observed in experiments. Thereby, there exist many different individual neuron models, from two-state neurons to the adaptive exponential integrate-and-fire model [17, 24]. Compared to this literature, plasticity in recurrent networks has been much less studied. One reason is that it adds a layer of complexity to existing models, despite being a candidate mechanism for memory formation, learning, etc. [6, 10].

In the beginning, plasticity models were based on firing rates [8]. Later on, as suggested by Hebb in 1949 [23], the crucial role of precise spike timings was proved experimentally and gave rise to Spike-Timing Dependent Plasticity (STDP) [7, 34, 36]. Following such a breakthrough, numerous STDP models emerged. They were associated with neural networks of either Poisson neurons [18, 29, 30] or continuous neuron models [1, 12, 40]. Here, we would like to present a new STDP rule which is implemented in the well-known stochastic Wilson-Cowan model of spiking neurons as presented in [5]. More precisely, because of the plasticity rule, our model is a piecewise deterministic Markov process [13, 14], whereas it is a pure point process in [5].

Our motivations for proposing such a new model are fourfold. First, although the mechanisms involved in plasticity are mainly stochastic, such as the activation of ion channels and proteins, the majority of studies on STDP use a deterministic description or an extrinsic noise source [12, 21, 38]. One exception is the stochastic STDP model proposed by Appleby and Elliott in [3, 4]. The stochasticity of their model lies in the learning window size. They analyse the dynamics of the weights of one target cell innervated by a few Poisson neurons. A fixed point analysis enabled them to show that their model is not relevant in the pair-based case and that multispike interactions are required to get stable competitive weight dynamics. Second, most studies are based on simulations and their analyses, so there is still a need to find a good mathematical framework, see [16, 33, 40]. We propose here a mathematical analysis based on probabilistic methods which leads to a control of the weights through the study of their dynamics on their slow time scale. Indeed, the long term plasticity timescale ranges from minutes to more than one hour, whereas a spike lasts for a few milliseconds [38]. Thus, third, there is a need to understand how to bridge this time scale gap between the synapse level and the network level [15, 45, 48]. Finally, the interplay between the weight dynamics and the neuron dynamics is not yet fully understood, and we think the study of recurrent networks is necessary to bring some basis to fully numerical studies.

Such motivations impose some constraints on our model. It has to be rich enough to reproduce biological phenomena, simple enough to be mathematically tractable, and easily simulated with thousands of neurons. Finally, it has to enable us to observe macroscopic effects out of microscopic events. The Wilson-Cowan model has been widely studied [5, 9, 33] and reproduces many biological features of a network, such as oscillations and bi-stability for example. On the other hand, based on experimental evidence [7, 44], we propose a new STDP rule with intrinsic noise and a fixed synaptic weight increment [41]. This allows us to control independently the synaptic weight increment and the probability of a plasticity event.
Indeed, several pairing protocols are required for the induction of plasticity [7, 36].

Thus, we can produce a mathematical analysis by studying the Markov process composed of the following three components: the synaptic weight matrix, the inter-spiking times and the neuron states. In the context of long term plasticity, the synaptic weight dynamics are much slower than the neural network ones. A timescale analysis enables us to remove the neuron dynamics from the equations. We can then derive an equation for the slow weight dynamics alone, in which the neuron dynamics are replaced by their stationary distributions. Thus, we don't need to simulate the dynamics of thousands of fast neurons, and we obtain a much easier equation to analyse. We then discuss the implications of such a derivation for learning and adaptation in neural networks.

A similar analysis has been done in a few papers with different mathematical tools and models [18, 19, 29, 30, 32, 40]. While the first two studied only one postsynaptic neuron, the others considered recurrent networks. Thanks to a separation of time scales, they derive an equation for the weights in which STDP appears as an integral of the STDP curve against the cross-correlation matrix. The main problem is the computation of such a matrix; they use Taylor expansions and Fourier analysis to derive estimations of it. Thanks to probabilistic methods, we do not need such an estimation for our analysis.
As in all models of neural networks with plastic connections, one can separate the neuron model and the plasticity one. Our neuron model is the well-known stochastic Wilson-Cowan model of spiking neurons presented in [5]. In such a model, neurons are binary, meaning they are either at rest, state 0, or spiking, state 1. This model has been widely studied in the case of fixed weights and presents realistic features such as oscillations or bistable phenomena, see [9]. However, there are only a few studies with plasticity, see for instance the Ising model in [42].

We implement plasticity in this model in a stochastic way. Indeed, our plasticity rule depends on the precise spike times and thus has the same form as STDP, see [35] for an overview, but it is not deterministic: when spikes are correlated, weights change or not according to a certain probability.

First, we are interested in excitatory neurons, as in most models inhibitory neurons are not plastic, so the synaptic weights will be positive. Also, we suppose the neurons are all-to-all connected, so this positivity will be strict. We will discuss these assumptions at the end. Therefore, we first give some global notations, then explain the neuron model and the plasticity rule, and finally we gather these dynamics in the generator of the process.

We are interested in analysing the time-continuous Markov process $(W_t, S_t, V_t)_{t \geq 0}$ where:
- $W_t \in \{\Delta w K,\ K \in \mathcal{E}\}$ is the synaptic weight matrix, with $\mathcal{E} = \{K \in \mathbb{N}^{N \times N},\ K^{ij} > 0\ \forall i \neq j \text{ and } K^{ii} = 0\ \forall i\}$ and $\Delta w \in \mathbb{R}_+^*$; $W_t^{ij}$ is the weight of the connection from neuron $i$ to neuron $j$ at time $t$.
- $S_t \in \mathbb{R}_+^N$ is the vector of times since the last spike of each neuron.
- $V_t \in I = \{0,1\}^N$ is the state of the neuron system.
As the weight dynamics and the neural network ones will be separated, we split the global state space $E$ into two spaces. Hence, in the following we denote $E_1 = \{\Delta w K,\ K \in \mathcal{E}\}$ and $E_2 = \mathbb{R}_+^N \times I$, such that $E = E_1 \times E_2$.

Neuron model
Let us define the dynamics of the process. It is a recurrent plastic neural network with Poisson neurons in interaction. Each neuron jumps with an inhomogeneous rate between two states, 0 and 1. This rate depends on the network state and on the weight matrix:
$$0 \;\underset{\beta}{\overset{\alpha_i(W_t, V_t)}{\rightleftharpoons}}\; 1$$
$\alpha_i$ is given by a function $\xi_i : \mathbb{R} \to \mathbb{R}_+^*$, bounded, positive and nondecreasing:
$$\alpha_i(W_t, V_t) = \xi_i\Big(\sum_{j=1}^N W_t^{ji} V_t^j\Big) \qquad (2)$$
As the neuron activity is never null, we will consider that for all $i$, $\inf_{x \in \mathbb{R}} \xi_i(x) \geq \alpha_m > 0$. Hence $\alpha_i$ is uniformly bounded in $w$ and $v$ for all $i$:
$$0 < \alpha_m = \min_i \Big(\inf_{x \in \mathbb{R}} \xi_i(x)\Big) \leq \alpha_i(w, v) \leq \alpha_M = \max_i \Big(\sup_{x \in \mathbb{R}} \xi_i(x)\Big)$$
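To make the rate map concrete, the following minimal Python sketch evaluates $\alpha_i(W,V)$ for every neuron. It assumes a particular sigmoidal shape for $\xi$ (the form used later in the simulations section) and purely illustrative parameter values `S`, `sigma`, `theta`, `alpha_m`, `beta`; none of these numbers are prescribed by the model itself.

```python
import numpy as np

def xi(x, S=1.0, sigma=0.5, theta=5.0, alpha_m=0.01):
    """Bounded, positive, nondecreasing gain function: a sigmoid plus the
    floor alpha_m.  Shape and parameter values are illustrative only."""
    return S / (1.0 + np.exp(-sigma * (x - theta))) + alpha_m

def jump_rates(W, V, beta=0.1):
    """Active transition rates of every neuron given weights W and states V.
    up[i]   = alpha_i(W, V) = xi(sum_j W[j, i] * V[j]), active while V[i] == 0,
    down[i] = beta, active while V[i] == 1."""
    drive = W.T @ V              # total presynaptic drive received by each neuron
    up = xi(drive) * (V == 0)    # 0 -> 1 (spiking) rates
    down = beta * (V == 1)       # 1 -> 0 (relaxation) rates
    return up, down
```

Because $\xi$ is bounded below by $\alpha_m$ and above by its supremum, the rates produced this way automatically satisfy the uniform bounds above.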
Plasticity rule

The basic idea of STDP is that of Hebb's law (1949):
"When an axon of cell A [...] repeatedly or persistently takes part in firing (a cell B), [...] A's efficiency, as one of the cells firing B, is increased" [23].

STDP is a bit more complex, as it completes this law with the possibility for weights to decrease when the spikes are decorrelated.

We expose our plasticity model through an example. First, weights can change only when a neuron spikes, which we define as the jump from 0 to 1 (we could have chosen the jump from 1 to 0). So suppose neuron $i$ spikes at time $t$. Then, the weights related to this neuron, that is to say $W_t^{ji}$ and $W_t^{ij}$ for all $j \neq i$, have a certain probability to jump. This differs from models found in the literature, for which weight jumps are systematic but small [1, 29, 38]. Here, the jump is not small but happens with a small probability: $W_t^{ji}$ increases with probability $p_+(S_t^j)$ and $W_t^{ij}$ decreases with probability $p_-(S_t^j)$. These probabilities depend on the inter-spiking times given by $S_t^j$.

Figure 1: Dynamics of neurons $i$ and $j$ over time, and the corresponding probability of jump for the weights.

As the classic STDP curve found by Bi & Poo [7] suggests, we take the following probability functions in our examples, with $0 < A_+, A_- \leq 1$ and $\tau_+, \tau_- > 0$:
$$p_+(s) = A_+ e^{-s/\tau_+} \quad \text{and} \quad p_-(s) = A_- e^{-s/\tau_-} \qquad (3)$$

Remark. By definition of $E_1$ and $\alpha_i$, we study excitatory neurons. We see at the end how to extend our results to inhibitory-excitatory networks. Also, we remark that $W_t^{ii}$ stays constant and, as $W_0^{ii} = 0$ for all $i$, $W_t^{ii} = 0$ for all $t$. We will discuss this assumption later on. Finally, $(S_t)_{t \geq 0}$ is crucial for our process to be Markovian.
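The rule above can be written as a short procedural sketch. In this illustration, the weight increment `dw`, the parameter values of `p_plus` and `p_minus`, and the lower reflection at `dw` (no depression of a weight already at its minimal value, as in the example treated later for recurrence) are assumptions of the sketch, not of the general model.

```python
import numpy as np

def p_plus(s, A_plus=1.0, tau_plus=17.0):
    """Potentiation probability p_+(s) = A_+ exp(-s / tau_+), eq. (3)."""
    return A_plus * np.exp(-s / tau_plus)

def p_minus(s, A_minus=0.5, tau_minus=34.0):
    """Depression probability p_-(s) = A_- exp(-s / tau_-), eq. (3)."""
    return A_minus * np.exp(-s / tau_minus)

def plasticity_event(W, S, i, dw=1.0, rng=None):
    """Stochastic STDP update triggered by a spike of neuron i (jump 0 -> 1).
    Each incoming weight W[j, i] increases by dw with probability p_+(S[j]);
    each outgoing weight W[i, j] decreases by dw with probability p_-(S[j]),
    here only while it stays strictly positive."""
    rng = rng or np.random.default_rng()
    N = W.shape[0]
    for j in range(N):
        if j == i:
            continue
        if rng.random() < p_plus(S[j]):
            W[j, i] += dw
        if W[i, j] > dw and rng.random() < p_minus(S[j]):
            W[i, j] -= dw
    return W
```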
Generator of the process

Now that we know how the process works, we can write its infinitesimal generator. To do so, we need the following notations. We denote by $G_i^w$ the set of all weights reachable after a spike of neuron $i$ while the current weight is $w \in E_1$:
$$G_i^w = \big\{\, w + \Delta w\,(Z_p + Z_d),\ (\vec{\zeta}_p, \vec{\zeta}_d) \in F_i^w \,\big\}$$
where
$$F_i^w = \Big\{ (\vec{\zeta}_p, \vec{\zeta}_d) :\ \vec{\zeta}_d = [\zeta_d^1, ..., \zeta_d^N],\ \vec{\zeta}_p = [\zeta_p^1, ..., \zeta_p^N]^T,\ \zeta_d^j, \zeta_p^j \in \{0,1\},\ \zeta_d^i = \zeta_p^i = 0,\ \zeta_d^j = 0 \text{ if } w^{ij} = \Delta w \Big\}$$
and we call $Z_p$ (respectively $Z_d$) the $N \times N$ matrix associated to the vector $\vec{\zeta}_p$ (respectively $\vec{\zeta}_d$): $Z_p$ carries $\vec{\zeta}_p$ on its $i$-th column, $Z_d$ carries $-\vec{\zeta}_d$ on its $i$-th row, and all other entries are zero. As each weight jumps independently whenever neuron $i$ spikes, we can decompose the probability of jumping to a given state as the product of the probabilities to jump or not for each weight. We want to compute $\phi_i(s, \tilde{w}, w)$, the probability of jumping to a given $\tilde{w} \in G_i^w$ knowing that neuron $i$ spikes. Let $\tilde{w} = w + \Delta w (Z_p + Z_d)$; the probability for $w^{ji}$ to increase ($\zeta_p^j = 1$) is $p_+(s^j)$, while the probability for it to stay the same ($\zeta_p^j = 0$) is $(1 - p_+(s^j))$, for all $j \neq i$. This appears as $\zeta_p^j p_+(s^j) + (1-\zeta_p^j)(1-p_+(s^j))$ in $\phi_i(s, \tilde{w}, w)$:
$$\phi_i(s, \tilde{w}, w) = \Phi_i(s, \vec{\zeta}_p, \vec{\zeta}_d) = \prod_{j \neq i} \big[\zeta_p^j p_+(s^j) + (1-\zeta_p^j)(1-p_+(s^j))\big]\,\big[\zeta_d^j p_-(s^j) + (1-\zeta_d^j)(1-p_-(s^j))\big] \qquad (4)$$
Therefore, we can write the generator $(\mathcal{C}, D(\mathcal{C}))$ of the whole process $(W_t, S_t, V_t)_{t \geq 0}$, where $D(\mathcal{C}) \subset C_b(E)$ and $\mathcal{C}$ is given, for all $f \in D(\mathcal{C})$, by
$$\mathcal{C} f(w,s,v) = \sum_i \delta_1(v^i)\,\beta\,[f(w,s,v-e_i) - f(w,s,v)] + \sum_i \alpha_i(w,v)\,\delta_0(v^i) \sum_{\tilde{w} \in G_i^w} \big(f(\tilde{w}, s - s^i e_i, v + e_i) - f(w,s,v)\big)\,\phi_i(s, \tilde{w}, w) + \sum_{i=1}^N \partial_{s^i} f(w,s,v)$$
or, equivalently,
$$\mathcal{C} f(w,s,v) = \underbrace{\sum_i \delta_1(v^i)\,\beta\,[f(w,s,v-e_i) - f(w,s,v)]}_{B^{\downarrow} f(w,s,v)} + \underbrace{\sum_i \phi_i(s,w,w)\,\alpha_i(w,v)\,\delta_0(v^i)\,\big(f(w, s - s^i e_i, v + e_i) - f(w,s,v)\big)}_{B^{\uparrow} f(w,s,v)}$$
$$\qquad + \underbrace{\sum_{i=1}^N \partial_{s^i} f(w,s,v)}_{B^{tr} f(w,s,v)} + \sum_i \alpha_i(w,v)\,\delta_0(v^i) \sum_{\tilde{w} \in G_i^w,\, \tilde{w} \neq w} \big(f(\tilde{w}, s - s^i e_i, v + e_i) - f(w,s,v)\big)\,\phi_i(s, \tilde{w}, w)$$
Written in this form, the generator shows two different but related dynamics: the weight dynamics and the network (inter-spiking time) dynamics. As we know that the synaptic weight dynamics are slow compared to the network dynamics ($(S_t, V_t)_{t>0}$ changes fast compared to $(W_t)_{t>0}$), this means that for all $i$:
$$\sum_{\tilde{w} \in G_i^w,\, \tilde{w} \neq w} \phi_i(s, \tilde{w}, w) \ll \phi_i(s, w, w)$$
Typically, $\sum_{\tilde{w} \in G_i^w,\, \tilde{w} \neq w} \phi_i(s, \tilde{w}, w) = O(\epsilon)$ and $\phi_i(s, w, w) = 1 - O(\epsilon)$. This time scale difference is studied in section 3.2, while the study of the fast part of the process is done in section 3.1. The fast process is given by the generator $B : D(B) \subset C_b(E_2) \to C_b(E_2)$:
$$B = B^{tr} + B^{\downarrow} + B^{\uparrow} \qquad (5)$$

3 Derivation of the weight equation
In this section, $W_t = W_0 = w \in E_1$ is fixed. We are interested in proving:

Theorem 3.1.
For all $w \in E_1$, the process $(S_t, V_t)_{t \geq 0}$ with generator $B^w$ mapping $D(B)$ into $C_b(E_2)$, defined for all $f \in D(B)$ as
$$B^w f(s,v) = \sum_i \delta_1(v^i)\,\beta\,[f(s, v - e_i) - f(s,v)] \qquad (6)$$
$$\qquad\qquad + \sum_i \alpha_i(w,v)\,\delta_0(v^i)\,\big(f(s - s^i e_i, v + e_i) - f(s,v)\big) \qquad (7)$$
$$\qquad\qquad + \sum_{i=1}^N \partial_{s^i} f(s,v) \qquad (8)$$
has a unique invariant measure.

This aim is part of a broader ambition to analyse the total process $(W_t, S_t, V_t)_{t \geq 0}$ on two different time scales. Indeed, in the limit where the plasticity is infinitely slow, the weight stays constant, so $\phi_i(s, w, w) = 1$, and then for all $f \in D(B^w)$, $B^w f(s,v) = B f(w,s,v)$. This analysis enables us to show in section 3.2 that, on the slow time scale of plasticity, $(W_t)_{t \geq 0}$ behaves simply against the invariant measure of $(S_t, V_t)^w_{t \geq 0}$. In the following, we omit the dependence on $w$ in the notation of the processes only, and we use $(S_t, V_t)_{t \geq 0}$ instead of $(S_t, V_t)^w_{t \geq 0}$.

In a first subsection we show the existence of an invariant measure of the process $(S_t, V_t)_{t \geq 0}$, and then its uniqueness in the next subsection. We start with some notations.

Notations
Let $X_t = (S_t, V_t)$ with $S_t \in \mathbb{R}_+^N$ and $V_t \in I = \{0,1\}^N$. The process is then the same as the one defined before with a fixed weight matrix $w$. Each $X_t^i = (S_t^i, V_t^i) \in \mathbb{R}_+ \times \{0,1\}$, for $i \in [\![1,N]\!]$, follows the same kind of dynamics: the discrete variable $V_t$ jumps with total rate $\sum_j \big(\alpha_j(w,v)\,\delta_0(v^j) + \beta\,\delta_1(v^j)\big)$ when $V_t = v$. Between these jumps, the continuous part $S_t$ grows linearly with slope 1 ($\frac{dS_t}{dt} = 1$), except that when $V_t^i$ jumps from 0 to 1 at time $t$, the corresponding continuous part restarts from 0, i.e. $S_t^i = 0$, see Figure 2. We denote by $(N_t)_{t \geq 0}$ the counting process corresponding to the number of jumps of the process $(V_t)_{t \geq 0}$. We can then write $N_t = \sum_{i=1}^N N_t^i$, where $(N_t^i)_{t \geq 0}$ counts the number of jumps of neuron $i$. By definition of $\alpha_i$, one has $N_t^i = Y_i\big(\int_0^t \alpha_i(w, V_s)\,ds\big)$, where the $Y_i$ are independent Poisson processes of intensity 1, as in [27].

Finally, we call $(P_t)_{t \geq 0}$ the transition probability of the process; $P_t$ maps $E_2 \times \mathcal{B}(E_2)$ into $\mathbb{R}_+$. Hence, for all $x \in E_2$ and $A \in \mathcal{B}(E_2)$ (the $\sigma$-algebra of Borel sets of $E_2$), $P_t(x, A)$ is the probability that $X_t \in A$ knowing $X_0 = x$, also written $\mathbb{P}_x(X_t \in A)$.

Figure 2: Graph representing the $i$-th coordinates of the processes $S_t$ and $V_t$.
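For intuition, here is a minimal Gillespie-style simulation of the fast process $(S_t, V_t)$ for a fixed weight matrix, reusing the `jump_rates` helper sketched earlier. It is only an illustration of the dynamics just described, with an arbitrary random seed; it plays no role in the analysis.

```python
import numpy as np

def simulate_fast_process(W, T, beta=0.1, rng=None):
    """Simulate (S_t, V_t) for fixed weights W up to time T.
    V[i] flips 0 -> 1 at rate alpha_i(W, V) and 1 -> 0 at rate beta;
    S grows with slope 1 between jumps, and S[i] is reset to 0 whenever
    neuron i spikes (jump 0 -> 1)."""
    rng = rng or np.random.default_rng(0)
    N = W.shape[0]
    V = np.zeros(N, dtype=int)
    S = np.zeros(N)
    t = 0.0
    while t < T:
        up, down = jump_rates(W, V, beta)
        rates = up + down                  # exactly one active rate per neuron
        total = rates.sum()
        dt = rng.exponential(1.0 / total)  # waiting time to the next jump
        t += dt
        S += dt                            # deterministic drift of the clocks
        i = rng.choice(N, p=rates / total) # which neuron jumps
        if V[i] == 0:                      # spike: reset its clock
            V[i], S[i] = 1, 0.0
        else:                              # relaxation back to rest
            V[i] = 0
    return S, V
```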
In this subsection, we aim at proving the following proposition:

Proposition 3.2.
The process $(S_t, V_t)_{t \geq 0}$ defined in Theorem 3.1 has at least one invariant probability measure.

To do so, we use the following theorem, which is classical in the theory of discrete-time Markov chains on a general state space:
Theorem 3.3.
If a transition probability $P$ is Feller and admits a Lyapunov function, then it also has an invariant probability measure.

Proof.
A nice proof of this result can be found in the course of Martin Hairer called
Ergodic Properties of Markov Processes. See Theorem 2 of [46]. One just needs to show that condition (F) there is equivalent to our Lyapunov condition.

After recalling the definitions of a Lyapunov function and of a Feller process, we exhibit such a Lyapunov function for our process.

Definition 3.4.
Let $\mathcal{X}$ be a complete separable metric space and let $P$ be a transition probability on $\mathcal{X}$. A Borel measurable function $V : \mathcal{X} \to \mathbb{R}_+ \cup \{\infty\}$ is called a Lyapunov function for $P$ if it satisfies the following conditions:
- $V^{-1}(\mathbb{R}_+) \neq \emptyset$; in other words, there are some values of $x$ for which $V(x)$ is finite.
- For every $c \in \mathbb{R}_+$, the sublevel set $V^{-1}(\{x : x \leq c\})$ is compact.
- There exist a positive constant $\gamma < 1$ and a constant $C$ such that, for every $x$ such that $V(x) \neq +\infty$:
$$\int_{\mathcal{X}} V(y)\, P(x, dy) \leq \gamma V(x) + C$$

Definition 3.5. We say that a homogeneous Markov process with transition operator $P$ is Feller if $Pf$ is continuous whenever $f$ is continuous and bounded. It is strong Feller if $Pf$ is continuous whenever $f$ is measurable and bounded.

We emphasize that the previous definitions and theorem are given for Markov chains and not for processes. The following proposition links them.
Proposition 3.6.
Let $(P_t)_{t \geq 0}$ be a Markov semigroup over $\mathcal{X}$ and let $P = P_T$ for some fixed $T > 0$. Then, if $\mu$ is invariant for $P$, the measure $\hat{\mu}$ defined by
$$\hat{\mu}(A) = \frac{1}{T} \int_0^T P_t \mu(A)\, dt, \quad \forall A \in \mathcal{B}(E_2)$$
is invariant for $(P_t)_{t \geq 0}$.

Proof.
$$P_t \hat{\mu} = P_t\Big(\frac{1}{T}\int_0^T P_s \mu\, ds\Big) = \frac{1}{T}\int_0^T P_{t+s}\mu\, ds = \frac{1}{T}\int_t^{T+t} P_s \mu\, ds = \frac{1}{T}\Big(\int_t^T P_s \mu\, ds + \int_T^{T+t} P_s \mu\, ds\Big)$$
$$= \frac{1}{T}\Big(\int_t^T P_s \mu\, ds + \int_0^t P_s P_T \mu\, ds\Big) = \frac{1}{T}\int_0^T P_s \mu\, ds = \hat{\mu}$$

Hence, we want to apply Theorem 3.3 to the transition probability $P_T$ extracted from $(P_t)_{t \geq 0}$ for some fixed $T > 0$. To do so, we show that for $T > 0$ the function $V$ defined by $V(x) = s^1 + s^2 + ... + s^N$, $\forall x = (s,v) \in E_2$, is a Lyapunov function for $P_T$. Then we use Theorem 27.6 of Davis' book [14] to prove that $P_T$ is Feller. We conclude on the existence of an invariant probability measure for $P_T$, and thus for $(P_t)_{t \geq 0}$, thanks to Proposition 3.6.

After these definitions and notations, let us prove that the process $(X_t)_{t \geq 0}$ has at least one invariant measure $\pi$, i.e. $X_0 \sim \pi \Rightarrow \forall t \geq 0,\ X_t \sim \pi$, or more formally, $\forall A \in \mathcal{B}(E_2)$:
$$\int_{E_2} P_t(x, A)\, \pi(dx) = \pi(A) \qquad (9)$$

Existence
Assumption 3.7. There exist $\alpha_m, \alpha_M \in \mathbb{R}_+$ such that $\forall v \in I,\ w \in E_1$:
$$0 < \alpha_m \leq \alpha_i(w,v),\ \beta \leq \alpha_M < \infty$$

Proposition 3.8.
With Assumption 3.7, for any $T > 0$, $V(x) = s^1 + ... + s^N$ is a Lyapunov function for $P_T$, with constants $C = NT$ and $\gamma = \mathbb{P}_x(\exists i : N_T^i < 2) < 1$, $\forall x \in E_2$.

Proof. The main idea is to use the fact that $S_t^i$ returns to 0 whenever neuron $i$ jumps from 0 to 1. Hence, as neurons have only two states, if $N_T^i \geq$
2, neuron i has jumped at least one timefrom 0 to 1 between 0 and T . Therefore, decomposing possible events we get: V ( X T ) ≤ ( V ( x ) + N T ) {∃ iN iT < } + N T {∀ iN iT ≥ } E x V ( X T ) ≤ N T + V ( x ) P x ( ∃ i : N iT < | {z } < Furthermore, one can show the process ( S t , V t ) is Feller thanks to Davis’ book [14]: Proposition 3.9. ( S t , V t ) is Feller.Proof. First, we define a distance ρ such that ( E , ρ ) is a metric space, locally compact. Such adistance is proposed in [14] page 58: ∀ x = ( s x , v x ) , y = ( s y , v y ) ∈ E : ρ ( x, y ) = ( v x = v y π max { ≤ i ≤ N } tan − ( | s ix − s iy | ) if v x = v y (10)We need this kind of norm because if we take for instance the euclidean distance ρ ( x, y ) = k s x − s y k ,we can have ρ ( x, y ) = 0 and x = y as soon as s x = s y and v x = v y .Then, we want to apply theorem 27.6 of [14]. We define t ∗ ( x ) as t ∗ ( x ) = { time to hit the boundary of E leaving from x and following the flow on s } t ∗ ( x ) = + ∞ as the only boundary is for x = (0 , v ) which is never reached because S t increasestoward infinity following the flow.Moreover, we define the total jump rate λ ( x ) = P j (cid:0) α j ( w, v ) δ ( v j ) + βδ ( v j ) (cid:1) = λ ( v ). Thus, as λ is bounded by assumption 3.7 and it only depends on v , as soon as ρ ( x, y ) < v x = v y so λ ( x ) = λ ( y ), hence λ ∈ C b ( E ).Finally, we define Q as Q (cid:0) { (( s − δ ( v i ) s i e i , v + e i ) } , ( s, v ) (cid:1) = α i ( w, v ) δ ( v i ) + βδ ( v i ) λ ( v )and show it is continuous for f ∈ D ( B w ). Indeed, let f ∈ D ( B w ), if ρ ( x, y ) ≤ η < | Qf ( x ) − Qf ( y ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i f ( s x , v + e i ) α i ( w, v ) δ ( v i ) + βδ ( v i ) λ ( v ) − X i f ( s y , v + e i ) α i ( w, v ) δ ( v i ) + βδ ( v i ) λ ( v ) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ N sup i | f ( s x , v + e i ) − f ( s y , v + e i ) | Then, choosing η such that sup v ∈ I | f ( s x , v ) − f ( s y , v ) | ≤ (cid:15)N (possible as f ∈ D ( B w ) ⊂ C b ( E ))we have for all (cid:15) > ∃ η > ρ ( x, y ) ≤ η ⇒ | Qf ( x ) − Qf ( y ) | ≤ (cid:15) Thus, x → Qf ( x ) is continuous for f ∈ D ( B w ). We can apply theorem 27.6 of Davis’ book [14]which ends the proof.We can now prove theorem 3.2: Proof.
Proposition 3.8 and Proposition 3.9 allow us to apply Theorem 3.3 and thus to conclude on the existence of an invariant probability measure for $(S_t, V_t)$.

In the following, we show that such a measure is unique.

3.1.2 Uniqueness through Laplace transform

We now want to show that this process has a unique invariant probability measure $\pi$. To do so, we find the possible Laplace transforms of the invariant measures of the process. We prove that such Laplace transforms satisfy an equation with a unique solution. By uniqueness of the Laplace transform of a measure, we deduce the desired result.

In the following, we use an equivalent definition of invariant measures which makes use of the generator $(B^w, D(B^w))$ of the process, see Proposition 34.7 in [14].

Proposition 3.10.
Let ( T t ) t ≥ be a semigroup on F , a Banach space, associated to a Markovprocess ( X t ) t ≥ . We note, ( B w , D ( B w )) its generator and we assume D ( B w ) is separating. Then, π is an invariant measure if and only if ∀ f ∈ D ( B w ) , Z E B w f dπ = 0 (11)We remind us what is a separating class of functions: Definition 3.11.
A class of functions
$D \subset B(E_2)$ (the measurable and bounded functions on $E_2$) is said to be separating if, for probability measures $\mu_1$ and $\mu_2$ on $E_2$, $\mu_1 = \mu_2$ whenever $\int_{E_2} f\, d\mu_1 = \int_{E_2} f\, d\mu_2$ for all $f \in D$.

In what follows, domains of generators will always be separating, as shown in Proposition 34.11 of [14].

Uniqueness
We invite the reader to have a look at Appendix A for a better view of the following computations.
Proposition 3.12.
Assume the process ( X t ) t ≥ in dimension N has at least one invariantmeasure of probability π w . Then it is unique.Proof. Let start with some notations: I = { , } N and E = R N + × I ∀ ( s, v ) ∈ E , s = ( s , ..., s N ) ∈ R N + and v = ( v , ..., v N ) ∈ Ie i = (0 , ..., , |{z} i , , ..., B I = ( ~v , ..., ~v N ) an enumeration of I s.t. k ≥ l ⇒ N X i =1 v ik ≥ N X i =1 v il | λ | = N X i =1 λ i (12)The jump process alone ( V t ) t ≥ has a unique invariant measure µ w = ( µ w , ..., µ w N ) ∈ R N + . Indeed,as each neuron is connected to each other, ( V t ) t ≥ is irreducible. As its state space is finite,the process is also positive recurrent so it has a unique invariant probability measure µ w bytheorem1.7.7 in [39]. Moreover, as each state is positive recurrent, µ wv > , ∀ v ∈ I . In particular,this measure satisfies P N k =1 B g ( ~v k ) µ wk = 0, where B is the generator of ( V t ) t ≥ and for functions g I -measurable: B g ( v ) = B w g ( s, v ) = N X i =1 βδ ( v i )[ g ( v − e i ) − g ( v )] + α i ( w, v ) δ ( v i ) [ g ( v + e i ) − g ( v )] (13)11ence, with g ( v ) = ~v j ( v ) we get ∀ j ∈ [[1 , N ]]: N X k =1 B g ( ~v k ) µ wk = N X k =1 µ wk N X i =1 ( βδ ( v ik )[ ~v j ( v k − e i ) − ~v j ( ~v k )] + α i ( ~v k ) δ ( v ik ) (cid:2) ~v j ( ~v k + e i ) − ~v j ( ~v k ) (cid:3) ) = 0 ⇔ N X k =1 ,k = j µ wk N X i =1 [ βδ ( v ik ) ~v j ( ~v k − e i ) + α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i )] = µ wj N X i =1 βδ ( v ij ) + α i ( ~v j ) δ ( v ij )(14)We can then write the system satisfied by Laplace transforms of invariant probability measuresof the process ( S t , V t ) t ≥ . We call π w one of them. First we can decompose π w as: π w ( ds, v ) = N X k =1 π w~v k ( ds ) µ wk ~v k ( v ) (15)In what follows, for the sake of simplicity, we note π wk for π w~v k .From proposition 3.10, ∀ f ∈ D ( B w ): N X k =1 Z s ∈ R N + B w f ( s, ~v k ) µ wk π wk ( ds ) = 0 (16)Where ( B w , D ( B w )) is the generator of the process ( X t ) t ≥ (6) . As we are interested in findingthe Laplace transform of π w we take f ( s, v ) = e − ~λ.~s g ( v ). First we compute B w f : B w f ( s, v ) = N X i =1 βδ ( v i )[ e − ~λ.~s g ( v − e i ) − e − ~λ.~s g ( v )]+ N X i =1 α i ( w, v ) δ ( v i ) h e − ~λ. ( ~s − s i ~e i ) g ( v + e i ) − e − ~λ.~s g ( v ) i − ( N X i =1 λ i ) | {z } | λ | e − ~λ.~s g ( v ) (17)So in (16) we get: N X k =1 Z s ∈ R N + B w f ( s, ~v k ) µ wk π wk ( ds )= N X k =1 " N X i =1 βδ ( v ik )[ g ( ~v k − e i ) − g ( ~v k )] − α i ( ~v k ) δ ( v ik ) g ( ~v k ) ! − | λ | g ( ~v k ) µ wk Z s e − ~λ.~s π wk ( ds ) | {z } L ( π wk )( λ ) + N X k =1 N X i =1 α i ( ~v k ) δ ( v ik ) g ( ~v k + e i ) Z s e − ~λ. ( ~s − s i ~e i ) π wk ( ds ) | {z } L ( π wk )( b λ i ) µ wk = 0 (18)12here b λ i = ( λ , ..., λ i − , , λ i +1 , ..., λ N ). We first show recursively that we can express L ( π wk )( λ )in function of linear combinations of L ( π wl )(ˇ λ l ) where ˇ λ l = (0 , ..., , λ l , , ..., D (ˇ λ l ) invertible such that: D (ˇ λ l ) L ( π w )(ˇ λ l )... L ( π w N )(ˇ λ l ) = Λ ( l ) , with Λ ( l ) ∈ R N a constant vector Where ˇ λ l = (0 , ..., , λ l , , ..., L ( π wk )( λ ) as a linear combination of L ( π wk )(ˇ λ l ). Step 1
First, we express the L ( π wk )( λ ) in function of the L ( π wl )( b λ i ). In particular, we find Γ( λ ) : R N + → M N ( R ) and Λ( λ ) : R N + → R N , for which Λ j ( λ ) depends only on linear combination of L ( π wl )( b λ i )where i ∈ [[1 , N ]] and l ∈ [[1 , N ]], such that:Γ( λ ) L ( π w )( λ )... L ( π w N )( λ ) = Λ( λ ) (19)To do so, we take g ( v ) = ~v j ( v ) in (18) and find Γ and Λ : N X k =1 L ( π wk )( λ ) " N X i =1 βδ ( v ik )[ ~v j ( ~v k ) − ~v j ( ~v k − e i )] + α i ( ~v k ) δ ( v ij ) ~v j ( ~v k ) ! + | λ | ~v j ( ~v k ) µ wk | {z } Γ jk ( λ ) = N X k =1 " N X i =1 α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) L ( π wk )( b λ i ) µ wk | {z } Λ j ( λ ) (20)We can remark from (14) that: Γ jk ( λ ) = 0 ∀ k < j Γ jj ( λ ) = h(cid:16)P Ni =1 βδ ( v ij ) + α i ( ~v j ) δ ( v ij ) (cid:17) + | λ | i µ wj > jk ( λ ) = − P Ni =1 βδ ( v ik ) ~v j ( ~v k − e i ) µ wk , ∀ k > j
13o Γ jj ( λ ) = µ wj N X i =1 βδ ( v ij ) + α i ( ~v j ) δ ( v ij ) + | λ | µ wj = N X k =1 ,k = j µ wk N X i =1 [ βδ ( v ik ) ~v j ( ~v k − e i ) + α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i )] + | λ | µ wj = N X k =1 ,k = j | Γ jk ( λ ) | + N X k =1 ,k = j µ wk N X i =1 α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) + | λ | µ wj (21)Thus Γ is invertible as a strictly dominant diagonal matrix as soon as | λ | ≥
0. We will use thesame idea in what follows to show there is a unique way to express each L ( π wm )( λ ), m ∈ I , as alinear combination of terms of the family (cid:16) L ( π wk )(ˇ λ l ) (cid:17) ≤ l ≤ N,m ∈ I .Second, take a sequence k , k , ... , k d ∈ [[1 , N ]], d ≤ N − b λ k ...k d whichchecks the conditions b λ k i k ...k d = 0. We have from (19):Γ( b λ k ...k d ) L ( π w )( b λ k ...k d )... L ( π w N )( b λ k ...k d ) = Λ( b λ k ...k d ) (22)Using (20) we get:Λ j ( b λ k ...k d ) = N X k =1 X i/ ∈{ k ,...,k d } α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) L ( π wk )( b λ k ...k d m ) + X i ∈{ k ,...,k d } α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) | {z } Γ jk L ( π wk )( b λ k ...k d ) µ wk Hence we can decompose Λ j ( b λ k ...k d ) as follows:Λ( b λ k ...k d ) = Λ ( k ...k d ) ( λ ) + Γ L ( π w )( b λ k ...k d )... L ( π w N )( b λ k ...k d ) Where Λ ( k ...k d ) j ( λ ) depends on λ only through (cid:16) L ( π wk )( b λ k ...k d m ) (cid:17) m/ ∈{ k ,...,k d } ,k ∈ I . Thus, equa-tion (22) can be rewritten as: h Γ( b λ k ...k d ) − Γ i| {z } Γ ( k ...kd ) ( b λ k ...kd ) L ( π w )( b λ k ...k d )... L ( π w N )( b λ k ...k d ) = Λ ( k ...k d ) ( λ ) (23)14ventually, we show Γ ( k ...k d ) ( b λ k ...k d ) is invertible as soon as | λ | ≥
0, denoting by K = { k , ..., k d } : N X k =1 , = j | Γ ( k ...k d ) jk ( b λ k ...k d ) | = X k
To end with a way to compute L ( π w )( λ ) we show how to find L ( π wm )(ˇ λ l ) and then we get a newsystem of the form: D (ˇ λ l ) L ( π w )(ˇ λ l )... L ( π w N )(ˇ λ l ) = Λ ( i ) , with Λ ( l ) ∈ R N a constant vector (24)The idea is the same as previously. We evaluate the expression (19) in all ˇ λ l which gives:Γ(ˇ λ l ) L ( π w )(ˇ λ l )... L ( π w N )(ˇ λ l ) = Λ(ˇ λ l ) (25)So at line j: Λ j (ˇ λ l ) = N X k =1 N X i =1 α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) L ( π wk )( b λ i ∩ ˇ λ l | {z } =(0 ,..., if i = l ) µ wk And as L ( π wk )(0 , ...,
0) = 1 we have:Λ j (ˇ λ l ) = N X k =1 µ wk N X i =1 ,i = l α i ( ~v k ) δ ( v ik ) ~v j ( ~v k + e i ) | {z } Λ ( l ) j = cst + N X k =1 α l ( ~v k ) δ ( v lk ) ~v j ( ~v k + e l ) µ wk | {z } D jk for k
We use the Theorem 2.1 of [31] twice. Once to link the occupation measure of the fastprocess to its invariant measure and then again to show (32).We denote by F nt the natural filtration of ( W nt , S nt , V nt ) t ≥ . I will enumerate and show theproperties we need in order to apply [31].1. ( W nt ) t ≥ satisfies the compact containment condition that is for each (cid:15) > T > K ⊂ E such that:inf n P ( W nt ∈ K, t ≤ T ) ≥ − (cid:15) Proof.
We denote K i = (cid:8) ˜ w ∈ E s.t. ∀ k, l ∈ [[1 , N ]] , | ˜ w kl − w kl | ≤ i ∆ w (cid:9) . Therefore, we want toshow that for each (cid:15) > T > ∃ i large enough to have ∀ n ∈ N : P ( W nt ∈ K i − , t ≤ T ) ≥ − (cid:15) (33)But: P ( W nt ∈ K i − , t ≤ T ) = P (cid:18)f W nt ∈ K i − , t ≤ T(cid:15) n (cid:19) = 1 − P (cid:18) ∃ t ≤ T(cid:15) n , f W nt / ∈ K i − (cid:19)
18o we major P (cid:16) ∃ t ≤ T(cid:15) n , f W nt / ∈ K i − (cid:17) in what follows. As lim n ∞ T(cid:15) n = + ∞ , the time on whichwe are looking at our process is becoming larger and larger with n so we need the probabilityof jumping to become smaller and smaller as it is the case for ( f W nt ) t ≥ . Indeed, when neuron i jumps from 0 to 1, w ij and w ji for j = i have probability to jump of order (cid:15) n .First, from (26) there exists c > i jumped from 0 to 1 is less than c (cid:15) n , so for all i, s and w : X ˜ w ∈ G wi , ˜ w = w φ in ( s, e w, w ) = P (cid:16)f W nt = f W nt − | e V n,it − e V n,it − = 1 (cid:17) ≤ c (cid:15) n < X nt as the particular case of the process e X nt for which neuronsare independent and fire at rate γ = max( β, α M ) and whenever a neuron i jumps (from 0 to 1 or1 to 0), W nt change with probability c (cid:15) n . We just impose that the size of weights jumps are asbefore: + / − ∆ w . Hence, in such a process weights jump more frequently. So denoting by N wt and N wt processes respectively counting the number of jumps of f W nt and W wt between 0 and t,and as previously, N t the counting process corresponding to the number of jump of the process( V t ) t ≥ . Thus: P (cid:18) ∃ t ≤ T(cid:15) n , f W nt / ∈ K i − (cid:19) = P (cid:18) ∃ k, l ∈ [[1 , N ]] , ∃ t ≤ T(cid:15) n , | ( f W nt ) kl − w kl | ≥ i ∆ w (cid:19) ≤ P (cid:16) N w T(cid:15)n ≥ i (cid:17) ≤ P (cid:16) N w T(cid:15)n ≥ i (cid:17) = + ∞ X k = i P ( N w T(cid:15)n = k )But P ( N w T(cid:15)n = k ) = + ∞ X m = k (cid:18) P ( N T(cid:15)n = m )( c(cid:15) n ) k (1 − c(cid:15) n ) m − k (cid:18) mk (cid:19)(cid:19) = + ∞ X m = k e − Nα M T(cid:15)n ( N α
M T(cid:15) n ) m m ! ( c(cid:15) n ) k (1 − c(cid:15) n ) m − k (cid:18) mk (cid:19)| {z } probability ( W nt ) t ≥ changed k times knowing ( V nt ) t ≥ jumped m times So for (cid:15) n small enough: P (cid:18) ∃ t ≤ T(cid:15) n , f W nt / ∈ K i − (cid:19) ≤ + ∞ X k = i + ∞ X m = k e − Nα M T(cid:15)n ( N α
M T(cid:15) n ) m m ! ( c(cid:15) n ) k (1 − c(cid:15) n ) m − k (cid:18) mk (cid:19)! ≤ + ∞ X k = i e − Nα M T(cid:15)n + ∞ X m = k (cid:18) ( N α M T ) m k !( m − k )! c k ( (cid:15) n ) m − k (1 − c(cid:15) n ) m − k (cid:19) ≤ + ∞ X k = i ( N α M T c ) k k ! e − Nα M T(cid:15)n + ∞ X m = k ( N α M T ) m − k ( (cid:15) n − c ) m − k ( m − k )! !| {z } e NαM ( 1 (cid:15)n − c ) T ≤ + ∞ X k = i ( N α M T c ) k k ! e − Nα M T ≤ + ∞ X k = i ( N α M T c ) k k ! −→ i → + ∞ ∃ i such that (33) is satisfied. 19. Moreover, define ∀ w ∈ E , h ∈ D ( B ) ⊂ C b ( E ) the operator B w by B w h ( s, v ) = B h ( w, s, s ).There exists a unique probability measure on E π w such that: Z E B w ( s, v ) π w ( ds, dv ) = 0 Proof.
See theorem 3.1.3. ∀ g ∈ D ( C ) ∩ C b ( E ) : g ( W nt ) − Z t A g ( W nu , S nu , V nu ) du + o (cid:15) n (1) Z t A r g ( W nu , S nu , V nu ) du (34)is a F nt martingale and ∀ ( w, s, v ) ∈ E lim n → + ∞ E ( w,s,v ) (cid:20) sup t ≤ T (cid:12)(cid:12)(cid:12)(cid:12) o (cid:15) n (1) Z t A r g ( W nu , S nu , V nu ) du (cid:12)(cid:12)(cid:12)(cid:12)(cid:21) = 0 (35) Proof. ∀ f ∈ D ( C ) = D ( C n ): f ( W nt , S nt , V nt ) − Z t C n f ( W nu , S nu , V nu ) du (36)is a F nt martingale and ∀ g ∈ D ( C ) ∩ C b ( E ) (cid:15) n C n g ( w, s, v ) = X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( g ( ˜ w ) − g ( w )) φ in ( s, ˜ w, w ) = (cid:15) n A g ( w, s, v ) + o ( (cid:15) n ) A r g ( w, s, v )So (34) is a F nt martingale.Moreover, as g ∈ D ( A ) ∩ C b ( E ) and max i ∈ I,w ∈ E ( G wi ) ∩ ≤ N − , ∃ M > |A r g ( w, s, v ) | = (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12)X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( g ( ˜ w ) − g ( w )) (cid:12)(cid:12)(cid:12)(cid:12)(cid:12)(cid:12) ≤ N α M (cid:18) max i ∈ I,w ∈ E G wi (cid:19) x ∈ E | g ( x ) | ≤ M Hence, ∀ ( w, s, v ) ∈ E , E ( w,s,v ) h sup t ≤ T (cid:12)(cid:12)(cid:12) o (cid:15) n (1) R t A r g ( W nu , S nu , V nu ) du (cid:12)(cid:12)(cid:12)i = o (cid:15) n (1), thus condi-tion (35) is satisfied.4. Similarly, ∀ h ∈ D ( C ) ∩ C b ( E ) h ( S nt , V nt ) − Z t (cid:15) n B h ( W nu , S nu , V nu ) du (37)is a F nt martingale 20 roof. (cid:15) n C n h ( w, s, v ) = N X i =1 ∂ s i h ( s, v ) + X i δ ( v i ) β [ h ( s, v − e i ) − h ( s, v )]+ X i α i ( w, v ) δ ( v i ) ( h ( s − s i e i , v + e i ) − h ( s, v )) φ in ( s, w, w )+ X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( h ( s − s i e i , v + e i ) − h ( s, v )) φ in ( s, ˜ w, w ) = N X i =1 ∂ s i h ( s, v ) + X i δ ( v i ) β [ h ( s, v − e i ) − h ( s, v )]+ X i α i ( w, v ) δ ( v i ) ( h ( s − s i e i , v + e i ) − h ( s, v )) φ in ( s, w, w ) + X ˜ w ∈ G wi , ˜ w = w φ in ( s, ˜ w, w ) | {z } =1 = B h ( w, s, v )So C n h = 1 (cid:15) n B h As (36)is a F nt martingale, (37) is a F nt martingale too.Thus, conditions of example 2.3 of [31] are satisfied and ( W nt , S nt , V nt ) t ≥ converges, when n → + ∞ , in law to ( W t , S t , V t ) t ≥ where ( S t , V t ) ∼ π W t and ( W t ) is the solution of the martingaleproblem associated to the operator C av : D ( C av ) ⊂ C b ( E ) → C b ( E ): C av f ( w ) = Z E A f ( w, s, v ) π w ( ds, dv )Indeed, we use theorem 2.1 of [31] twice. First the point 1., 2. and 4. enable us to use thetheorem to obtain that when n → + ∞ , ( W n , Γ n ) → ( W, Γ) such that there exists a filtration {G t } such that M t = Z t Z E B f ( W ( s ) , y )Γ( ds × dy )is a {G t } -martingale for each f ∈ D ( C | C b ( E ) ). But M t is continuous and of bounded variation,so it must be constant (see for instance Theorem 27 of [43]) and finally M t = 0 for all t >
0. Wethen write Γ( ds × dy ) = γ s ( dy ) ds and get Z t Z E B f ( W ( s ) , y ) γ s ( dy ) ds = 0And then Z E B f ( W ( s ) , y ) γ s ( dy ) = 0So we can take γ s ( dy ) = π W s ( dy ) is the unique invariant measure for B x such that B x f ( y ) = B f ( x, y ). We conclude using 1.,2. and 3. and the Theorem 2.1 of [31] which gives that Z t Z E A f ( W ( s ) , y )Γ( ds × dy )21 martingale and thus ( W t ) is the solution of the martingale problem associated to the operator C av : D ( C av ) ⊂ C b ( E ) → C b ( E ): C av f ( w ) = Z E A f ( w, s, v ) π w ( ds, dv )This time scale separation gives the infinitesimal generator of the weight process on the slowtime scale. However, we don’t know explicitly π w but its Laplace transform. Under some simpleassumptions, we can get explicitly the dynamic of the weights which is a Markov process on E with non-homogeneous jump rates depending on the Laplace transform of π w . Proposition 3.14.
Suppose that for all i ∃ Φ i ( ˜ w,w ) such that ϕ i ( s, ˜ w, w ) = L (cid:16) Φ i ( ˜ w,w ) (cid:17) ( s ) .Then, C av f ( w ) = X ˜ w ∈ G w , ˜ w = w ( f ( ˜ w ) − f ( w )) X v ∈ I µ wv X i s.t. ˜ w ∈ G wi α i ( w, v ) δ ( v i ) Z R N + Φ i ( ˜ w,w ) ( s ) L ( π wv )( s )( ds ) Where G w = { w ∈ E , P ( W = w | W = w ) > } and µ wv is the invariant measure of the processgenerated by B defined in (13) .Proof. If we develop the infinitesimal generator of the process ( W t ) t ≥ . Thanks to (32) and (28)we get: C av f ( w ) = Z E A f ( w, s, v ) π w ( ds, dv ) = X v ∈ I Z E A f ( w, s, v ) µ wv π wv ( ds )= X v ∈ I Z E X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( f ( ˜ w ) − f ( w )) ϕ i ( s, ˜ w, w ) µ wv π wv ( ds )= X v ∈ I µ wv X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( f ( ˜ w ) − f ( w )) Z E ϕ i ( s, ˜ w, w ) π wv ( ds ) With the assumption that for all i ∃ Φ i ( ˜ w,w ) such that ϕ i ( s, ˜ w, w ) = L (Φ ( ˜ w,w ) )( s ) we get: C av f ( w ) = X v ∈ I µ wv X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( f ( ˜ w ) − f ( w )) Z R N + L (Φ i ( ˜ w,w ) )( s ) π wv ( ds ) = X v ∈ I µ wv X i α i ( w, v ) δ ( v i ) X ˜ w ∈ G wi , ˜ w = w ( f ( ˜ w ) − f ( w )) Z R N + Φ i ( ˜ w,w ) ( s ) L ( π wv )( s )( ds ) = X ˜ w ∈ G, ˜ w = w ( f ( ˜ w ) − f ( w )) X v ∈ I µ wv X i s.t. ˜ w ∈ G wi α i ( w, v ) δ ( v i ) Z R N + Φ i ( ˜ w,w ) ( s ) L ( π wv )( s )( ds ) Sufficient conditions for recurrence and transience
Plasticity models have evolved in interaction with neuroscientists' discoveries. For instance, models based on STDP confirmed the need for homeostasis in order to regulate the evolution of the weights: preventing their divergence or extinction, and providing competition. Indeed, Hebbian learning suffers from a positive feedback instability and leads to all neurons wiring together [48]. Synaptic scaling and metaplasticity are the main homeostatic mechanisms used in models, through different implementations [47]. In our model we do not have such mechanisms, like hard or soft bounds, but we can show that the weights still stabilize under some conditions. We propose some general conditions which we manage to express as a simple condition on the parameters of our model.

In our case, we are faced with a Markov process which is non-homogeneous in space and homogeneous in time, on a space equivalent to $\mathbb{N}^N$. Few results exist for such processes. As the authors of the book [37] underline, Lyapunov techniques seem to be the most adapted to analyse them.

For the sake of simplicity, and as it does not change anything in what follows, we now consider $\Delta w = 1$, so that $E_1 = \mathbb{N}_*^N$. Also, we are interested in the case presented in the first example given in Remark 2. Therefore, the slow process comes from the fact that $p_{+/-}$ are multiplied by $\epsilon_n$, so:
$$\varphi_i(s, w' = w - E^{ij}, w) = \begin{cases} p_-(s^j) & \text{if } w^{ij} > 1 \\ 0 & \text{if } w^{ij} = 1 \end{cases}, \qquad \varphi_i(s, w' = w + E^{ji}, w) = p_+(s^j), \qquad \varphi_i(s, w', w) = 0 \ \text{ for all other } w'$$
If we develop the infinitesimal generator of the process $(W_t)_{t \geq 0}$, thanks to (32) and (28) we get:
$$\mathcal{C}^{av} f(w) = \int_{E_2} \mathcal{A} f(w,s,v)\, \pi^w(ds, dv) = \sum_{v \in I} \int_{E_2} \mathcal{A} f(w,s,v)\, \mu_v^w\, \pi_v^w(ds)$$
$$= \sum_{v \in I} \int_{E_2} \sum_i \alpha_i(w,v)\, \delta_0(v^i) \sum_{\tilde{w} \in G_i^w,\, \tilde{w} \neq w} (f(\tilde{w}) - f(w))\, \varphi_i(s, \tilde{w}, w)\, \mu_v^w\, \pi_v^w(ds)$$
$$= \sum_{i,j:\, i \neq j} (f(w + E^{ij}) - f(w)) \sum_{v:\, v^j = 0} \mu_v^w\, \alpha_j(w,v) \int_{E_2} p_+(s^i)\, \pi_v^w(ds) + \sum_{i,j:\, i \neq j} \mathbf{1}_{]1, +\infty[}(w^{ij})\, (f(w - E^{ij}) - f(w)) \sum_{v:\, v^i = 0} \mu_v^w\, \alpha_i(w,v) \int_{E_2} p_-(s^j)\, \pi_v^w(ds) \qquad (38)$$
Denoting the jump rates by $r^{+/-}_{ij}(w)$, we get:
$$\mathcal{C}^{av} f(w) = \sum_{i,j} (f(w + E^{ij}) - f(w))\, r^+_{ij}(w) + \mathbf{1}_{]1, +\infty[}(w^{ij})\, (f(w - E^{ij}) - f(w))\, r^-_{ij}(w) \qquad (39)$$

4.1 General conditions for positive recurrence and transience

Proposition 4.1.
Assume the following conditions:• ∃ I + / − m , I + / − M ∈ R ∗ + such that I + / − m ≤ r + / − ij ( w ) ≤ I + / − M for all w ,• I − m > I + M which leads to r + ij ( w ) − r − ij ( w ) ≤ I + M − I − m < for all w Then, the process ( W t ) t ≥ associated to the generator C av given in (39) is positive recurrent.Proof. We use proposition 1.3 from Hairer’s course [22]. In order to check assumptions of thisproposition, we need to find a function f : E → R + such that lim x → + ∞ f ( x ) = + ∞ and ∃ A ⊂ E finite such that for all w ∈ E \ A : C av f ( w ) ≤ − f : E → R + as: ∀ w ∈ E , f ( w ) = X i,j : i = j ( w ij ) = || w || So C av f ( w ) = X i,j : i = j ( || w + E ij || − || w || ) r + ij ( w ) + X i,j : i = j, w ij =1 ( || w − E ij || − || w || ) r − ij ( w )= X i,j : i = j (2 w ij + 1) r + ij ( w ) + X i,j : i = j, w ij =1 ( − w ij + 1) r − ij ( w )= X i,j : i = j, w ij =1 (2 w ij + 1) r + ij ( w ) + X i,j : i = j, w ij =1 ( − w ij + 1) r − ij ( w ) + X i,j : i = j, w ij =1 (2 w ij + 1) r + ij ( w )= X i,j : i = j, w ij =1 w ij ( r + ij ( w ) − r − ij ( w )) + X i,j : i = j, w ij =1 (cid:0) r − ij ( w ) + r + ij ( w ) (cid:1) + X i,j : i = j, w ij =1 (2 w ij + 1) r + ij ( w ) ≤ X i,j : i = j, w ij =1 w ij ( r + ij ( w ) − r − ij ( w )) + ( N − { w ij = 1 } ) r − ij ( w ) + N r + ij ( w ) + 2 { w ij = 1 } r + ij ( w ) | {z } ≤ N ( r + ij ( w )+ r − ij ( w )) ≤ N ( I + M + I − M ) As r + ij ( w ) − r − ij ( w ) ≤ I + M − I − m < w such that || w || > N (to enforce that at least one w kl > C av f ( w ) ≤ i,j : i = j ( w ij )( I + M − I − m ) + 3 N (( I + M + I − M )) −→ || w ||→ + ∞ −∞ Let w sep ∈ N ∗ + be such that w ≥ w sep ⇒ w ( I + M − I − m ) + 3 N ( I + M + I − M ) ≤ − i,j ( w ij ) ≥ || w || N , we define A = { w, || w || > N w sep } so: w ∈ A ⇒ max i,j ( w ij ) ≥ w sep ⇒ i,j ( w ij )( I + M − I − m ) + 3 N (( I + M + I − M )) ≤ − A = A c = { w, || w || ≤ N w sep } . A is finite and for all w ∈ E \ A : C av f ( w ) ≤ − W t ) t ≥ .24 orollary 4.2. If lim r → + ∞ sup w ∈ Σ , k w k≥ r ( r + ij ( w ) − r − ij ( w )) < Then, the process ( W t ) t ≥ associated to the generator C av given in (39) is positive recurrent.Proof. Exactly the same as the proof of proposition 4.1.
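As a sanity check of this drift criterion, one can simulate the averaged weight process of generator (39) with user-supplied rate functions and verify that trajectories stay bounded when $r^+_{ij}(w) - r^-_{ij}(w)$ is uniformly negative. The sketch below treats the rates as black boxes (in the model they are integrals of $p_{\pm}$ against the invariant law $\pi^w$, which are not computed here); the function names, the constant example rates and the horizon are illustrative assumptions.

```python
import numpy as np

def simulate_averaged_weights(w0, r_plus, r_minus, T, dw=1.0, rng=None):
    """Gillespie simulation of the averaged weight process of generator (39).
    r_plus(w, i, j) / r_minus(w, i, j) are the potentiation / depression rates
    of weight w[i, j]; depression is only proposed while w[i, j] > dw."""
    rng = rng or np.random.default_rng(1)
    w = np.array(w0, dtype=float)
    N = w.shape[0]
    t, path = 0.0, [(0.0, w.copy())]
    while t < T:
        moves, rates = [], []
        for i in range(N):
            for j in range(N):
                if i == j:
                    continue
                moves.append((i, j, +dw)); rates.append(r_plus(w, i, j))
                if w[i, j] > dw:
                    moves.append((i, j, -dw)); rates.append(r_minus(w, i, j))
        rates = np.array(rates)
        t += rng.exponential(1.0 / rates.sum())       # time to the next weight jump
        i, j, step = moves[rng.choice(len(moves), p=rates / rates.sum())]
        w[i, j] += step
        path.append((t, w.copy()))
    return path

# Usage: constant rates with depression dominating, as in Proposition 4.1.
path = simulate_averaged_weights(np.ones((3, 3)), lambda w, i, j: 0.1,
                                 lambda w, i, j: 0.2, T=500.0)
```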
Proposition 4.3. p + ( s ) − p − ( s ) > γ > for all s ∈ R ∗ + imply transience of ( W t ) t ≥ .Proof. Let define A = { w ∈ E s.t. min w ij ≤ w > } And f : E → R + such that: f ( x ) = N w if x ∈ A, P x ij if x ∈ A c Thus, inf A f = N w so for all w ∈ A c , f ( w ) < inf A f .Moreover, C av f ( w ) = X i,j ( f ( w + E ij ) − f ( w )) r + ij ( w ) + ]1 , + ∞ [ ( w ij )( f ( w − E ij ) − f ( w )) r − ij ( w )= X i,j − w ( w + 1) r + ij ( w ) + ]1 , + ∞ [ ( w ij ) 1 w ( w − r − ij ( w ) ≤ X i,j − w ( w + 1) r + ij ( w ) + 1 w ( w − r − ji ( w ) ≤ X i X v,v i =0 µ wv α i ( w, v ) X j = i w ( w + 1) Z E ( p − ( s j ) − p + ( s j )) π wv ( ds ) < − γ w ( w + 1) X i X v,v i =0 µ wv α i ( w, v ) ≤ p + ( s ) − p − ( s ) < − γ < s ∈ R ∗ + imply positive recurrenceof ( W t ) t ≥ as we showed in simulations. Remark . Denoting by η ( w ) the expectation of jumps of ( W t ) t ≥ , we easily get that η ij ( w ) =( r + ij ( w ) − r − ij ( w ))∆ w . Thus, conditions on ( r + ij ( w ) − r − ij ( w )) are equivalent to conditions on η ij ( w ).We now compute the constants I + / − m , I + / − M ∈ R ∗ + in order to derive a simple condition oftransience or recurrence depending on parameters. We want to bound the following quantities: r + / − ij ( w ) = X v,v j =0 µ wv α j ( w, v ) Z E p + / − ( s i ) π wv ( ds )The main idea is to use that 0 < α m ≤ α j ( w, v ) ≤ α M so α m Z E p + ( s i ) X v,v j =0 µ wv π wv ( ds ) ≤ X v,v j =0 µ wv α j ( w, v ) Z E p + ( s i ) π wv ( ds ) ≤ α M Z E p + ( s i ) X v,v j =0 µ wv π wv ( ds ) Z E p + ( s i ) X v,v j =0 µ wv π wv ( ds ) But for all differentiable p + , by Fubini: Z E p + ( s i ) π wv ( ds ) = Z E (cid:18)Z s i ( p + ) ( u ) du + p + (0) (cid:19) π wv ( ds )= p + (0) + Z E (cid:18)Z + ∞ ( p + ) ( u ) { u u | V t = v (cid:1) ( p + ) ( u ) du We are finally interested in bounding X v,v j =0 µ wv P π w (cid:0) S it > u | V t = v (cid:1) = X v,v j =0 P π w ( V t = v ) P π w (cid:0) S it > u | V t = v (cid:1) = P π w (cid:16) S it > u, V jt = 0 (cid:17) Proposition 4.4.
For all w ∈ E : α M e − βu − β e − α M u α M − β βα M + β ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ α m e − βu − β e − α m u α m − β βα m + β (41)Let ( V t , S t ) and ( V t , S t ) be the processes for which (cid:16) ( V it , S it ) t ≥ (cid:17) i (the same for (cid:16) ( V it , S it ) t ≥ (cid:17) i )are independent each other and neurons jump from 0 to 1 respectively with a rate α M and α m and from 1 to 0 with the rate β . We thus get for similar trajectories, for all t ≥ i : S it ≤ S it ≤ S it Thus, we can bound P π w (cid:0) S it > u | V t = v (cid:1) as follows: P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) So P π w (cid:16) S it > u (cid:17) P π w (cid:16) V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u (cid:17) P π w (cid:16) V jt = 0 (cid:17) (42)First, let bound P π w (cid:16) V jt = 0 (cid:17) = P v,v j =0 . Proposition 4.5.
For all i , w : βα M + β ≤ X v,v i =0 µ wv ≤ βα m + β (43)26 roof. Let recall from (13) the generator of the process of neurons ( V t ) only when w is fixedjump: B g ( v ) = N X i =1 βδ ( v i )[ g ( v − e i ) − g ( v )] + α i ( w, v ) δ ( v i ) [ g ( v + e i ) − g ( v )]Which gives for the invariant measure µ w = ( µ wv ) v ∈ I : X v ∈ I B g ( v ) µ wv = 0Thus, let i ∈ [[1 , N ]] with g i ( v ) = δ ( v i ), we get:0 = X v ∈ I B g i ( v ) µ wv = X v ∈ I µ wv N X j =1 βδ ( v j )[ g i ( v − e j ) − g i ( v )] + α j ( w, v ) δ ( v j ) [ g i ( v + e j ) − g i ( v )]= X v ∈ I,v i =0 µ wv N X j =1 βδ ( v j )[ g i ( v − e j ) − g i ( v )] + α j ( w, v ) δ ( v j ) [ g i ( v + e j ) − g i ( v )] X v ∈ I,v i =1 µ wv N X j =1 βδ ( v j )[ g i ( v − e j ) − g i ( v )] + α j ( w, v ) δ ( v j ) [ g i ( v + e j ) − g i ( v )]= X v ∈ I,v i =0 µ wv ( − α i ( w, v )) + X v ∈ I,v i =1 µ wv β Indeed, when v i = 0, for all j = i one has g i ( v − e j ) − g i ( v ) = g i ( v + e j ) − g i ( v ) = 1 − g i ( v + e i ) − g i ( v ) = 0 − − δ ( v i ) = 0. Doing the same reasoning with v i = 1 we get the lastline. Then, we also know that P v ∈ I,v i =0 µ wv + P v ∈ I,v i =1 µ wv = 1 so:0 = X v ∈ I,v i =0 µ wv ( − α i ( w, v )) + X v ∈ I,v i =1 µ wv β = X v ∈ I,v i =0 µ wv ( − α i ( w, v )) + (1 − X v ∈ I,v i =0 µ wv ) β = β − ( β X v ∈ I,v i =0 µ wv + X v ∈ I,v i =0 µ wv α i ( w, v ))Finally,( β + α m ) X v ∈ I,v i =0 µ wv ≤ ( β X v ∈ I,v i =0 µ wv + X v ∈ I,v i =0 µ wv α i ( w, v )) ≤ ( β + α M ) X v ∈ I,v i =0 µ wv We conclude that βα M + β ≤ X v,v i =0 µ wv ≤ βα m + β (44)We now focus our interest on computations of P π w (cid:16) S it > u (cid:17) and P π w (cid:16) S it > u (cid:17) .27t is interesting to note that the previous inequality holds for all t ≥
0. We already showedin theorem 3.1 that each of ( S it , V it ) and ( S it , V it ) possesses a unique invariant measure ( S ∞ , V ∞ )and ( S ∞ , V ∞ ). Therefore, as (42) is true for all t ≥
0, we get: P (cid:0) S ∞ > u (cid:1) P π w (cid:16) V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ P (cid:0) S ∞ > u (cid:1) P π w (cid:16) V jt = 0 (cid:17) (45)We turn on the computing of measures of ( S ∞ , V ∞ ) and ( S ∞ , V ∞ ) from their Laplace trans-forms. To do so, we study the process ( S t , V t ) ∈ R + ×{ , } with the following generator( A , D ( A )): A f ( s, v ) = βδ ( v ) ( f ( s, − f ( s, αδ ( v ) ( f (0 , − f ( s, ∂ s f ( s, v ) (46) Proposition 4.6.
The invariant probability measure π ( ds, v ) of ( S t , V t ) is: π ( ds, v ) = αβα − β ( e − βs − e − αs ) dsµ ( v ) + βe − βs dsµ ( v ) (47) Proof.
As in (15), π can be written as: π ( ds, v ) = π ( ds ) ( v ) µ + π ( ds ) ( v ) µ . In this case, µ = βα + β and µ = αα + β . Moreover, it is an invariant measure if and only if E π [ A f ] = 0 , ∀ f ∈ D ( A ). Thanks to functions f well-chosen we get equations on Laplace transforms of π and π .Denoting e iλ ( s, v ) = e − λs δ v ( i ) we get: βα + β R R + Ae λ ( s, π ( ds ) + αα + β R R + Ae λ ( s, π ( ds ) = 0 βα + β R R + Ae λ ( s, π ( ds ) + αα + β R R + Ae λ ( s, π ( ds ) = 0We remind us that: A f ( s, v ) = βδ ( v ) ( f ( s, − f ( s, αδ ( v ) ( f (0 , − f ( s, ∂ s f ( s, v )Thus: βα + β R R + ( αe − λs + λe − λs ) π ( ds ) = αα + β R R + βe − λs π ( ds ) β R R + απ ( ds ) = α R R + ( β + λ ) e − λs π ( ds )But R R + π ( ds ) = R R + π ( ds ) = 1.Therefore: R R + ( α + λ ) e − λs π ( ds ) = α R R + e − λs π ( ds ) R R + e − λs π ( ds ) = β ( β + λ ) ⇒ π ( s ) = βe − βs So R R + e − λs π ( ds ) = αβ ( α + λ )( β + λ ) ⇒ π ( s ) = αβα − β ( e − βs − e − αs ) π ( s ) = βe − βs
28e finally check the measure is invariant, that is to say: E π [ A f ] = βα + β Z R + A f ( s, π ( ds ) + αα + β Z R + f ( s, π ( ds )= αβ ( α + β )( α − β ) Z R + ( − αf ( s,
0) + ∂ s f ( s, e − βs − e − αs ) ds + αβα + β Z R + ( β ( f ( s, − f ( s, ∂ s f ( s, e − βs ds = αβ ( α + β )( α − β ) "Z R + ( − αf ( s,
0) + ∂ s f ( s, e − βs ds + Z R + ( αf ( s, − ∂ s f ( s, e − αs ds + αβα + β "Z R + βf ( s, e − βs ds + Z R + ( − βf ( s,
1) + ∂ s f ( s, e − βs ds = (cid:20) αβ ( β − α )( α + β )( α − β ) + αβ ( α + β ) (cid:21) Z R + f ( s, e − βs ds = 0Moreover, P v ∈{ , } R R + π ( ds, v ) = βα + β R R + π ( ds ) + αα + β R R + π ( ds ) = 1 completes the proof.We can now go on the proof of proposition 4.4. Proof.
We replace α by α M for S ∞ and by α m for S ∞ : P (cid:0) S ∞ > u (cid:1) = Z ∞ u ( π α M ( ds,
0) + π α M ( ds, Z ∞ u (cid:18) α M βα M − β ( e − βs − e − α M s ) βα M + β + βe − βs α M α M + β (cid:19) ds = α M e − βu − β e − α M u α M − β And P (cid:0) S ∞ > u (cid:1) = Z ∞ u ( π α m ( ds,
0) + π α m ( ds, α m e − βu − β e − α m u α m − β So from (45): α M e − βu − β e − α M u α M − β P π w (cid:16) V jt = 0 (cid:17) ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ α m e − βu − β e − α m u α m − β P π w (cid:16) V jt = 0 (cid:17) Hence with (44): α M e − βu − β e − α M u α M − β βα M + β ≤ P π w (cid:16) S it > u, V jt = 0 (cid:17) ≤ α m e − βu − β e − α m u α m − β βα m + β (48)29rom this proposition we deduce bounds on the rates r + / − ij ( w ) = P v,v i =0 µ wv α i ( w, v ) R E p + / − ( s j ) π wv ( ds )for all p + and p − differentiable monotone. If functions p + and p − are decreasing: (cid:18) p + / − (0) + Z + ∞ (cid:18) α m e − βu − β e − α m u α m − β (cid:19) ( p + / − ) ( u ) du (cid:19) α m X v,v j =0 µ wv ≤ r + / − ij ( w ) ≤ (cid:18) p + / − (0) + Z + ∞ (cid:18) α M e − βu − β e − α M u α M − β (cid:19) ( p + / − ) ( u ) du (cid:19) α M X v,v j =0 µ wv We finally conclude with p + ( s ) = A + e − sτ + and p − ( s ) = A − e − sτ − : r + / − ij ( w ) ≥ (cid:18) A + / − − Z + ∞ (cid:18) α m e − βu − β e − α m u α m − β (cid:19) A + / − τ + / − e − uτ + / − du (cid:19) α m X v,v j =0 µ wv ≥ A + / − α m − α m τ + / − β +1 − β τ + / − α m +1 α m − β X v,v j =0 µ wv ≥ A + / − α m βα M + β − α m τ + / − β +1 − β τ + / − α m +1 ( α m − β ) To get the last inequality, we used the fact that 1 − α mτ + / − β +1 − β τ + / − αm +1 ( α m − β ) ≥ r + / − ij ( w ): r + / − ij ( w ) ≤ A + / − α M βα m + β − α M τ + / − β +1 − β τ + / − α M +1 ( α M − β ) But we showed that if r + ij ( w ) < r − ij ( w ) for all w we get that the limit process W t is recurrentpositive so it is the case if: A + α M α m + β − α M τ + β +1 − β τ + α M +1 ( α M − β ) < A − α m α M + β − α m τ − β +1 − β τ − α m +1 ( α m − β ) Finally we get the following simple condition: α M A + τ + ( α M τ + + βτ + + 1)( τ − α m + 1)( τ − β + 1) α m A − τ − ( α m τ − + βτ − + 1)( τ + α M + 1)( τ + β + 1) < p + and p − are not monotone, we can get a similar condition separating intervals where theyare increasing or decreasing. 30inally, previous results show that in our model weights can diverge although rates are boundedand we can give simple explicit condition on parameters for which they don’t diverge. This isthe first time, to our knowledge, that such a condition can be given without any homeostaticmechanisms added. Some analytical studied previously needed to add some constraints in orderto bound weights and obtained results depending on the spike correlation matrix they were notable to control [18, 29, 40]. With such a condition, our model becomes ready to use being awareof criticizes we present in the sixth section. As shown in the appendix A, we can find the Laplace transform of π , the invariant measure of thefast process. However, inverting it analytically for a network of N neurons, N too large, needs tooheavy computations. Hence, we apply our results in a network of 2 neurons and then simulate abigger network. But first let remind us the parameters present in our model. Even if simple, our model depends on many parameters. First, let’s recall the probability tojump: p + ( s ) = A + e − sτ + and p − ( s ) = A − e − sτ − Then let’s detail the function ξ i we used in our simulations. We used the same ξ i = ξ for allneurons, σ > θ > ξ i ( x ) = ξ ( x ) = S e − σ ( x − θ ) + α m Our parameters are then: (cid:15), A + , A − , τ − , τ + , σ, θ, β, α m and α M . Time of influence of a spike10ms so β ∼ .
1. Firing rates of neurons are bounded by $\alpha_m \sim 0.01$ and $\alpha_M \sim 1$. The STDP parameters lie in the following ranges: $\tau_{+/-} \in [1,\ \cdot\,]$ ms, $A_{+/-} \in [0, 1]$, $S = \alpha_M$, $\sigma = 0.\,$, $\theta = \ln(\alpha_M/\alpha_m - 1)/\sigma$ and $\epsilon \leq \cdot$. The functions $p_+$ and $p_-$ enable us to be close to the biological experiments [7]:
Figure 3: Bi & Poo pairing experiment reproduced on our model, compared to the real one. Parameters used here are: $A_+ = 1$, $A_- = 0.\,$, $\tau_- = 2\tau_+ = 34$ ms, as in [20].

5.2 First applications of our results

In the simple case of (3) we get:
$$r^+_{ij}(w) = \sum_{v:\, v^j = 0} \mu_v^w\, \alpha_j(w,v) \int_{E_2} p_+(s^i)\, \pi_v^w(ds) = \sum_{v \in I:\, v^j = 0} \mu_v^w\, \alpha_j(w,v) \int_{E_2} A_+ e^{-s^i/\tau_+}\, \pi_v^w(ds) = \sum_{v \in I:\, v^j = 0} \mu_v^w\, \alpha_j(w,v)\, A_+\, \mathcal{L}\{\pi_v^w\}\big(0, ..., 0, \tfrac{1}{\tau_+}, 0, ..., 0\big)$$
where the entry $1/\tau_+$ is in position $i$, and similarly
$$r^-_{ij}(w) = \sum_{v \in I:\, v^i = 0} \mu_v^w\, \alpha_i(w,v)\, A_-\, \mathcal{L}\{\pi_v^w\}\big(0, ..., 0, \tfrac{1}{\tau_-}, 0, ..., 0\big)$$
with the entry $1/\tau_-$ in position $j$.

One weight free and 2 neurons:
In this example of one weight free and 2 neurons, we get a birth and death process with w fixed, w =( w , w ). We can find the explicit stationnary distribution of the weights in that case.From previous computations we have: w → w + ∆ w : r + ( w ) = A + (cid:20) µ w α ( w, L ( π w ) (cid:18) τ + , (cid:19) + µ w α ( w, L ( π w ) (cid:18) τ + , (cid:19)(cid:21) w → w − ∆ w : r − ( w ) = ]∆ w, + ∞ [ ( w ) A − (cid:20) µ w α ( w, L ( π w ) (cid:18) , τ − (cid:19) + µ w α ( w, L ( π w ) (cid:18) , τ − (cid:19)(cid:21) Hence, it is similar to a birth process on N with 0 reflecting. In order to study the conditions fortransience and recurrence, we use the following theorem which gather some results of the fourfirst sections of [28] with its notations. Theorem 5.1.
Suppose $X_t$ is a birth and death process on $\mathbb{N}$ with birth rates $\lambda_k > 0$ for all $k \in \mathbb{N}$ and death rates $\mu_k > 0$ for all $k \in \mathbb{N}^*$, with $\mu_0 = 0$. Then [28] gives the following classification:
(a) The process is ergodic if and only if $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\mu_j}{\lambda_j} = +\infty$ and $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\lambda_{j-1}}{\mu_j} < +\infty$. In this case, there exists a unique invariant measure $\theta$ given by:
$$\theta(i) = \theta(0) \prod_{j=1}^{i} \frac{\lambda_{j-1}}{\mu_j}, \quad \text{with} \quad \theta(0) = \frac{1}{1 + \sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\lambda_{j-1}}{\mu_j}}$$
(b) The process is null recurrent if and only if $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\mu_j}{\lambda_j} = +\infty$ and $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\lambda_{j-1}}{\mu_j} = +\infty$.
(c) The process is transient if and only if $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\mu_j}{\lambda_j} < +\infty$ and $\sum_{i=1}^{+\infty} \prod_{j=1}^{i} \frac{\lambda_{j-1}}{\mu_j} = +\infty$.
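This classification can be probed numerically by truncating the two series; here is a minimal sketch, where the truncation level and the example rates are arbitrary choices (products are accumulated in log space to avoid overflow).

```python
import numpy as np

def classify_birth_death(lam, mu, n_terms=500):
    """Evaluate truncations of the two series of Theorem 5.1 for a birth-death
    chain with birth rates lam(k), k >= 0, and death rates mu(k), k >= 1."""
    log_rec = np.cumsum([np.log(mu(j) / lam(j)) for j in range(1, n_terms + 1)])
    log_erg = np.cumsum([np.log(lam(j - 1) / mu(j)) for j in range(1, n_terms + 1)])
    return np.exp(log_rec).sum(), np.exp(log_erg).sum()

# Example: rates converging to lambda = 1 and mu = 2 (ergodic by Corollary 5.2):
# the first partial sum keeps growing while the second one stabilizes.
s_rec, s_erg = classify_birth_death(lambda k: 1.0 + 1.0 / (k + 1),
                                    lambda k: 2.0 - 1.0 / (k + 1))
print(s_rec, s_erg)
```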
In order to apply this theorem to our example, we prove the following corollary.
Corollary 5.2.
Suppose the assumptions of Theorem 5.1 hold. Suppose moreover that $\lambda_k$ and $\mu_k$ converge respectively to $\lambda$ and $\mu$ as $k \to +\infty$. Then $X_t$ is ergodic iff $0 < \lambda < \mu$, and transient if $\lambda > \mu > 0$.

Proof. Let us prove only the ergodic case, as the proof for the transient one is similar. Suppose that $0 < \lambda < \mu$. Then, for all $\epsilon > 0$, there exists $k_\epsilon \in \mathbb{N}$ such that for all $j > k_\epsilon$, $\frac{\lambda_{j-1}}{\mu_j} \leq \frac{\lambda}{\mu} + \epsilon = l_\epsilon$ and $\frac{\lambda_j}{\mu_j} \leq l_\epsilon$. Taking $\epsilon$ small enough that $l_\epsilon < 1$, the series $\sum_i \prod_{j=1}^i \frac{\lambda_{j-1}}{\mu_j}$ is dominated by a convergent geometric series, while $\sum_i \prod_{j=1}^i \frac{\mu_j}{\lambda_j}$ diverges; Theorem 5.1 then gives ergodicity.

Remark. The case $\lambda = \mu > 0$ is more complex, as it will depend on the way $(\lambda_k)$ and $(\mu_k)$ converge.

We come back to our example.
Proposition 5.3. r + ( w , w ) and r − ( w , w ) are strictly positive and converge respectivelyto R + ( α M , w ) > and R − ( α M , w ) > when w → ∞ .Proof. First, α ( w,
00) = α ( w,
00) = ξ (0) = α m and α ( w,
01) = ξ ( w ) don’t depend on w .Second, x
7→ L π wv (0 , x ), x
7→ L π wv ( x,
0) and µ wv depend on w only through α ( w,
10) = ξ ( w ).But lim w → + ∞ ξ ( w ) = α M so ~µ w converges to ~µ solution of (50) with α = α ( w,
01) = ξ ( w )and α = lim w → + ∞ α ( w,
10) = lim w → + ∞ ξ ( w ) = α M . Concerning x
7→ L π wv (0 , x ),we can fix x = x and call f v ( ξ ( w )) = L π wv (0 , x ). Computations of 54 show that for all v ∈ { , } , 0 < f v ( y ) < ∞ for all y ∈ [ α m , α M ] and is continuous as a positive boundedrational fraction. Hence, w f v ( ξ ( w )) is continuous by composition. We conclude thatlim w → + ∞ L π wv (0 , x ) = f v ( α M ) and:lim w →∞ r + ( w ) = lim w →∞ A + µ w α m L ( π w ) (cid:18) τ + , (cid:19)| {z } −→ w →∞ f ( α M ) + µ w α M L ( π w ) (cid:18) τ + , (cid:19)| {z } −→ w →∞ f ( α M ) = A + ( µ α m f ( α M ) + µ α M f ( α M ))= R + It is similar for x
7→ L π wv ( x, w →∞ r − ( w ) = R − Hence, by corollary 5.2, R + < R − ensures the process w t admits a unique invariant measure θ : θ ( i ∆ w ) = θ (∆ w ) i Y j =2 r + (( j − w ) r − ( j ∆ w )With θ (∆ w ) = 11 + P + ∞ i =1 Q ij =2 r + (( j − w ) r − ( j ∆ w )
33e then wonder when this condition holds and we did simulations with parameters in therange of biological ones. Practically, explosion of the weight reflects the fact that LTP winsover LTD. Some studies has tried to tackle question of the relationship between STDP curveparameters, τ + / − and A + / − , and the balance of LTP and LTD. They showed that when theintegral of the STDP window is enough biased toward depression the system is intrinsicallystable [25, 29, 30]. In our case, we can find examples for which the "enough" is important. Forinstance with the following parameters, we get an explosion of w when depression wins againstpotentiation: β = 0 . , α m = 0 . , α M = 1 , τ + = 17 ms, τ − = 34 ms, A − = 0 . , A + = 0 . , (cid:15) = 10 − We took p + / − (cid:15) = (cid:15)p + / − . When (cid:15) is small enough ( ≤ − ) simulations agrees with analyticalresults. That is to say w diverges when w <
25 and doesn’t diverge when w > + p - Figure 4: Plot of r + ( w ) − r − ( w ) (left) and plot of p + , p − on the same graph(right) time(ms) w time(ms) w Figure 5: Evolution of the weight w when w is fixed at (left) and 30 (right) and (cid:15) = 10 − Remark . We can even get divergence when p + ( s ) < p − ( s ) for all s ∈ R + xample with 2 excitatory neurons Let’s apply this result in a network of 2 excitatory neurons. First, we denote w = ( w , w ) sincethe diagonal elements are null. We are interested in the sign of the limit of sup k w k≥ r ( r ij + ( w ) − r ij − ( w ))which is equivalent to sup k w k≥ r ( η ( w )) ij (see 3), when r → ∞ , in order to use corollary 4.2 tostudy stability of weights. We first show this limit exists and then compute it to determineparameters for which we don’t have weights divergence.In order to show the existence of the limit, we first recall that w is only present in neurons’ rates.Thus, thanks to the sigmoid, these rates are bounded and when one of the components of w goes to ∞ , rates in which it plays a role tends to the upper bound of the sigmoid, α M , since allneurons are excitatory ones. For instance: α ( w,
01) = ξ ( w ) −→ w →∞ α M Therefore, we can separate the space R + × R + as following the intuition given by the graph of( η ( w )) for instance: �� �� �� ���������� ������������������������������ ��� � �� Figure 6: η ( w ) when A + = A − = 0 , and τ − = 2 τ + = 34 ms So the separation looks like this:As showed in the appendix A, we can compute the Laplace transforms L{ π wv } ( λ , λ ) forfixed w . If we introduce the dependence on w , it will be in rate terms such as α = α ( w, α = α ( w,
01) = ξ ( w ), α = α ( w,
10) = ξ ( w ), α = α = ξ (0) = α m and α = α = α = α = β . So we can rewrite η as a function of α ( w ) and α ( w ). Therefore, when r → ∞ , η ( α ( w ) , α ( w )) → η ( α M , α M ) on B . The sup of η becomes sup α m ≤ α ≤ α M η ( α M , α ) on A and sup α m ≤ α ≤ α M η ( α, α M ) on A . We conclude withlim r ∞ sup k w k≥ r η = max (cid:18) sup α m ≤ α ≤ α M η ( α, α M ) , sup α m ≤ α ≤ α M η ( α M , α ) (cid:19) We can compute numerically this limit in function of A − and τ − : t a u m o i n s - . - . . . . . . Figure 7: sup η when k w k → ∞ for A + = 0 , and τ + = 17 ms
36e note that we need a really small value of A + compared to the one of A − to satisfythe condition of positive recurrence. However, such a difference doesn’t seem to be needed insimulations. Indeed, we can have numerically positive recurrence for any parameters A + between0 and 1. Remark . The condition for null recurrence given in [37] result in η ij = 0 for all i, j in ourcase. Condition for transience leads to the exact opposite of the one of corollary 4.2: lim r → + ∞ sup w ∈ Σ , k w k≥ r ( η ( w )) ij ≥ , ∀ i, j And ∃ ( k, l ) , j = i s.t. lim r → + ∞ sup w ∈ Σ , k w k≥ r ( η ( w )) kl > It would be interesting to try to have a larger range of values of parameters for which we are inthe null recurrence case, and we need another plasticity rule to do so (with the condition of [37]).
10 neurons:
When depression is really higher than potentiation, weights seem to converge to a stationarydistribution and have such trajectories: 37owever, initial weights can play an important role. With parameters A + =0 . , A − =0 . , β =1 , α m =0 . , α M = 0 . (cid:15) =0 .
1, we have no divergence in short time with low initial weights and selection of oneweight from big initial ones, W i10 = :The selected weight is different from one trajectory to another. Remark . We have chosen 10 neurons for plotting constraints. Thousands of them are easilysimulated. This kind of phenomenon is called winner take all dynamics in [33] where they prevent themusing iSTDP. The reason to avoid them is that it prevents new assemblies to be formed.38
Discussion
Mathematical results
Based on a well known neural network model, we added plasticity in order to get insight onthe combined neurons - weights dynamics. We could analyse plasticity on the slow time scaleof weights dynamics compared to the neurons ones, thus producing a simplified model. Thislatter gives the weights dynamics under the stationary distribution of the fast process and isa continuous time Markov jump process on the state space of weights with non homogeneousin space jump rates. Such processes are hard to deal with and current results are given in [37].Moreover, even if we could prove existence and uniqueness of the invariant measure of the fastprocess, we were not able to express it explicitly. Thus, it is even harder to analyse the limitmodel. However, we can compute its Laplace transform in small networks, we didn’t try morethan 2 but it should not be too hard for more. The problem will nevertheless become quicklyharder as it consists in inverting a 2 N square matrix for a given w and as soon as w change, thiscomputation need to be done again. Here, making use of bounds on jump rates of neurons, weare able to give conditions of stability, but we emphasize it is only sufficient ones. To know if weneed additive terms, depending on weights for instance or just hard bounds, in order to avoiddivergence in the context of biological parameters is still under study. Simulation results
For small networks (2 neurons) and in the case of a STDP rule following the classical STDPcurve [7], we computed Laplace transform of the stationary distribution. We then gave explicitexpression of jump rates for the limit process which enabled us to study the weight dynamicsmore precisely. We even show that the divergence of weights is possible even when integral ofthe learning window is biased towards synaptic depression, even when depression curve is alwaysstronger than depression ( p + ( s ) < p − ( s ) for all s ). Such a result is not intuitive and led us to findconditions on parameters for which such a divergence doesn’t occur. Simulations with more thantwo neurons showed the winner take all phenomenon takes place. A calibration of parametersis needed to test more characteristics of the model: how does it respond to high frequence, lowfrequence? Does it enable bidirectional connections?... Limitations of our model and future work
We are aware our neuron model is far from the reality of neurons. It is really simple in order tomake the study of plasticity easier. Some questions raise when we try to match it with biology.For instance, what does β represents? Many things at the same time: the time one neuron willinfluence others, the time of a spike as it will not be able to spike again until the moment itcomes back to the state 0. Neurons are generally described through their membrane potentialwhich has no link to our model. Then, observations such as potential depolarisation is needed tolead to potentiation cannot be checked or modelled. Moreover, the way their rate of jump from 0to 1 depends on weights is not really clear and needs to be clarify, maybe there is a need to adddelay as it is done in other papers [32].While STDP seems good to keep in memory stimuli, even spontaneously after such inputs [33],it needs to forget somehow. This seems not be the case in our model. Such a phenomenonis possible for instance under homeostatic mechanisms [33, 45, 48, 49]. STDP plays the role ofadditive synaptic scaling as when a weight increases, let say w , then w decreases. It is not agood thing according to [45], as they observed multiplicative synaptic scaling in their experiments.This is understandable as it is too specific and seems not sufficient. It is not useless if you think39s information supported by w is the exact opposite of the one supported by w , it enablesneurons " to win time ". So there is a need to add homeostasis to our model. Metaplasticityor plastic inhibitory (iSTDP) neurons are the most used. Indeed, we studied only a network ofexcitatory neurons. Adding non plastic inhibitory neurons will just decrease the minimum offiring rates of neurons. However, plastic inhibitory neurons could prevent from divergence ofweights. Finally, w ii = 0 is imposed but it could be interesting to use it as an homeostatic factor,decreasing the firing rate when it is to high and increasing it when it is weak. Relation to previous work
Analysis using the separation of time scale between weights dynamics and the network one hasbeen done in many other articles [11, 16, 18, 29, 30, 32, 40]. They modelled neurons as Poisson,except for [40], and derived a similar equation for weights on their slow time scale. This equationmainly depends on the cross correlation matrix which is not easy to handle with. They use Taylorexpansion and Fourier transform to approximate it for their simulations. In our model, such amatrix is hidden in the invariant measure of the fast process. Concerning the stability of weights,a similar result was found in [30] where "a stable fixed point of the output rate is possible if theintegral over the learning window is sufficiently negative." As, in their model, rates are linear inweights, stability of rates is equivalent to weights stability. Even if it is not a necessary condition,we could give an idea of how much negative the integral over the learning window needs to be inorder to have stability.
Conclusion
We propose a new view on STDP models. In contrast with tiny deterministic jumps of weights,weights have some weak probability to make a "big" jump. Thus, instead of continuous, weightsare discrete [2, 44]. Associated to the inter arrival time of spikes and the network state, we get aMarkov process. We simplified it thanks to a separation of time scale and found simple conditionsof positive recurrence. This work opens a new framework of study for plasticity which we hope itwill give rise to more mathematical results on plasticity in the following.40 nnexesA Dimension 2 for uniqueness
After giving the generator ( B , D ( B )) in 2 dimensions, we then compute the equation satisfies bythe Laplace transform of a given stationary distribution for ( S t , V t ). GeneratorProposition A.1. D ( B ) = { f ∈ C ub ( E ) and ( ∂ s + ∂ s ) f ∈ C ub ( E ) } and ∀ f ∈ D ( B ) : B f ( s, (0 , α ( f (( s , , (0 , − f ( s, (0 , α ( f ((0 , s ) , (1 , − f ( s, (0 , P ∂ s i f ( s, (0 , B f ( s, (0 , α ( f ((0 , s ) , (1 , − f ( s, (0 , β ( f ( s, (0 , − f ( s, (0 , P ∂ s i f ( s, (0 , B f ( s, (1 , α ( f (( s , , (1 , − f ( s, (1 , β ( f ( s, (0 , − f ( s, (1 , P ∂ s i f ( s, (1 , B f ( s, (1 , β ( f ( s, (0 , − f ( s, (1 , β ( f ( s, (1 , − f ( s, (1 , P ∂ s i f ( s, (1 , Or in a shorter version: B f ( s, v ) = (( ∂ s + ∂ s ) f )( x )+ α (1 − v ,v ) v [ f (( s v , s ) , (1 − v , v )) − f ( x )]+ α ( v , − v ) v [ f (( s , s v ) , ( v , − v )) − f ( x )] Proof.
Let f ∈ D ( B ), then by definition lim t → E x ( f ( X t )) − f ( x ) t exists. Let’s compute it. We knowthat each element v ∈ I has only two neighbors (in the sens it can only reach two different states).We note α v v the rates to reach the neighbor v’. We do the computations for v = (0 , E ( s, (0 , ( f ( S t , V t )) = P ( s, (0 , ( V t = v ) f (( s + t, s + t ) , (0 , P ( s, (0 , ( V t = (1 , f ((0 , s + t ) , (1 , P ( s, (0 , ( V t = (0 , f (( s + t, s + t ) , (0 , o ( t )= (cid:16) − (cid:0) α + α (cid:1) t e − ( α + α ) t (cid:17) f (( s + t, s + t ) , (0 , α t e − α t f ((0 , s + t ) , (1 , α t e − α t f (( s + t, s + t ) , (0 , o ( t )= f (( s + t ) , (0 , α ( f ((0 , s + t ) , (1 , − f ( s + t, (0 , β ( f ( s + t, (0 , − f ( s + t, (0 , o ( t )Then we obtain: B f ( x ) = lim t → E ( s, (0 , ( f ( X t )) − f ( x ) t = α ( f ((0 , s ) , (1 , − f ( s, (0 , β ( f ( s, (0 , − f ( s, (0 , , . ∇ s f ( s, (0 , B f ( x ) as in the proposition ∀ x ∈ E , and D ( B ) ⊆ { f ∈ C ub ( E ) and ( ∂ s + ∂ s ) f ∈ C ub ( E ) } . In order to have the other inclusion, we take f ∈ { g, g ∈ C ub ( E ) and ( ∂ s + ∂ s ) g ∈ C ub ( E ) } , then we compute for x = ( s, (0 , ∈ E : r x ( t ) = (cid:12)(cid:12)(cid:12)(cid:12) E x ( f ( X t )) − f ( x ) t − (cid:2) α ( f ((0 , s ) , (1 , − f ( s, (0 , β ( f ( s, (0 , − f ( s, (0 , , . ∇ s f ( s, (0 , (cid:3)(cid:12)(cid:12)(cid:12)(cid:12) t → , . ∇ s f ∈ C ub ( E ) (cid:12)(cid:12)(cid:12)(cid:12) f ( s + t, (0 , − f ( s, (0 , t − (1 , . ∇ s f ( s, (0 , (cid:12)(cid:12)(cid:12)(cid:12) ≤ t Z t | (1 , . ∇ s f ( s + u, (0 , − (1 , . ∇ s f ( s, (0 , | du ≤ sup ≤ u ≤ t | (1 , . ∇ s f ( s + u, (0 , − (1 , . ∇ s f ( s, (0 , |≤ (cid:15) If t small enough.Hence, lim t → E ( s, (0 , ( f ( X t )) − f ( x ) t exists. As we can do exactly the same computations for all x ∈ E , we deduce that { f ∈ C ub ( E ) and ( ∂ s + ∂ s ) f ∈ C ub ( E ) } ⊆ D ( B ). Thus, we have theequality wanted.We can see here the need to chose C ub ( E ) instead of C b ( E ) for instance. Indeed, the uniformcontinuity enable us to conclude on the domain of B and on another hand it is the biggestsubspace of L ∞ ( E ) on which the derivative is the generator of a C -semigroup. If we had chosen C ( E ) = { f unctions vanishing at ∞} , we see immediately the semigroup associated to ourprocess will not map C ( E ) into itself. T t f has no reason to vanish at ∞ . C ub ( E ) seems tobe the space that suits. Moreover, thanks to the portmanteau lemma, the knowledge of thesemigroup on C ub ( E ) characterizes the law of the process. We can then use the definition 3.10to search the Laplace transforms of invariant measures. Laplace transform
First, we show we can write any invariant measure of the process in the form π ( s, v ) = P k ∈ I δ v k ( v ) µ wk π k ( s ) where ( µ w , ..., µ wN ) is the only invariant measure of the jump process ( V t )and π k is a measure on B ( R ). Then, we prove that if the process ( X t ) t ≥ has at least oneinvariant measure of probability π , then it is unique.It is interesting to look at the form of invariant measures for the following. Indeed, as ( V t )doesn’t depend on ( S t ), we can study its dynamic and deduce a nice decomposition of thestationary distribution of ( X t ). Proposition A.2.
The jump process alone ( V t ) t ≥ has a unique invariant probability measure ~µ = ( µ w , µ w , µ w , µ w ) T . Moreover, µ wv > , ∀ v ∈ I , and it satisfies: Q~µ = − α − α β β α − α − β βα − α − β β α α − β µ w µ w µ w µ w = 0 (50) Proof.
Indeed, as each neuron is connected to each other, ( V t ) t ≥ is irreducible. As its state spaceis finite, the process is also positive recurrent so has a unique invariant probability measure µ w by theorem1.7.7 in [39]. 42oreover, as each state is positive recurrent, µ wv > , ∀ v ∈ I .The matrix Q is the matrix of transition rates (Q-matrix) of ( V t ) t ≥ . With 1 = (0 , , , , , , , Q = ( q ij ) ≤ i,j ≤ we have Q has in the proposition. As µ w isinvariant, it belongs to the kernel of Q, which is (50), Theorem 3.5.5 in [39].From this result, we deduce that ∀ k ∈ I, R R π ( ds, k ) = µ wk . Therefore, we define π k as π k ( A ) = R A π ( ds,k ) µ wk , ∀ A ∈ B ( R ). Hence, π ( s, v ) = P k ∈ I δ v k ( v ) µ wk π k ( s ).Now, we previously showed the process ( X t ) t ≥ has at least one invariant probability measureon E , let π be one of them and let’s compute its Laplace transform to show the followingproposition: Proposition A.3.
Assume the process ( X t ) t ≥ has at least one invariant measure of probability π . Then it is unique.Proof. We will show that all invariant measure of probability has the same Laplace transform andas the later characterizes it, see for instance Theorem 4.3 in [26], there only exists one invariantmeasure of probability.We can write π as π ( A, v ) = P k ∈ I π k ( A ) ⊗ µ wk δ k ( v ) , ∀ A ∈ B ( R ), with π k ( A ) = π ( A, k )( µ wk ) − .To simplify computations, we will denote by L π be the vector of Laplace transforms of π k . So ∀ λ , λ ∈ R + : L π ( λ , λ ) = L π ( λ , λ ) L π ( λ , λ ) L π ( λ , λ ) L π ( λ , λ ) where ∀ v ∈ I, L π v ( λ , λ ) = Z R e − ( λ s + λ s ) π v ( ds )Just a remark, ∀ v ∈ I L π v (0 ,
0) = Z R π v ( ds ) = ( µ wv ) − Z R π ( ds, v ) = 1As we want to compute the Laplace transform of π which is in fact ( λ , λ ) P v ∈ I µ wv L π v ( λ , λ ),let’s use the following test functions, with λ = ( λ , λ ) and ∀ k ∈ I : e kλ ( s, v ) = e − ( λ s + λ s ) δ k ( v )By definition 3.10 of an invariant measure we get ∀ v ∈ I : X k ∈ I Z R +2 B e vλ ( s, k ) µ wk π k ( ds ) = 0 (51)43e then compute B e kλ ( s, v ): B e λ ( s, (0 , − α − α − ( λ + λ )) e − λ s − λ s B e λ ( s, (0 , βe − λ s − λ s B e λ ( s, (1 , βe − λ s − λ s B e λ ( s, (1 , B e λ ( s, (0 , α e − λ s B e λ ( s, (0 , − α − β − ( λ + λ )) e − λ s − λ s B e λ ( s, (1 , B e λ ( s, (1 , βe − λ s − λ s B e λ ( s, (0 , α e − λ s B e λ ( s, (0 , B e λ ( s, (1 , − α − β − ( λ + λ )) e − λ s − λ s B e λ ( s, (1 , βe − λ s − λ s B e λ ( s, (0 , B e λ ( s, (0 , α e − λ s B e λ ( s, (1 , α e − λ s B e λ ( s, (1 , − β − ( λ + λ )) e − λ s − λ s So with (51) and v = (0 ,
0) for instance: X k ∈ I Z R +2 B e λ ( s, k ) µ wk π k ( ds ) = 0 ⇔ (cid:0) − α − α − ( λ + λ ) (cid:1) µ w L π ( λ , λ ) + βµ w L π + βµ w L π = 0After computations for all v ∈ I we get: M ( λ , λ ) L π ( λ , λ ) L π ( λ , λ ) L π ( λ , λ ) L π ( λ , λ ) = − α L π ( λ , µ w1 µ w2 − α L π (0 , λ ) µ w1 µ w3 − α L π (0 , λ ) µ w2 µ w4 − α L π ( λ , µ w3 µ w4 (52)44ith: M ( λ , λ ) = − α − α − λ − λ β µ w2 µ w1 β µ w3 µ w1 − α − β − λ − λ β µ w4 µ w2 − α − β − λ − λ β µ w4 µ w3 − β − λ − λ As we have L π ( λ ,
0) = L π ( λ , L π ( λ , L π ( λ , L π ( λ , and L π (0 , λ ) = L π (0 , λ ) L π (0 , λ ) L π (0 , λ ) L π (0 , λ ) .Then we can get L π ( λ ,
0) and L π (0 , λ ) evaluating (52) in λ = 0 and λ = 0: M ( λ , L π ( λ ,
0) = − α L π ( λ , µ w1 µ w2 − α L π (0 , µ w1 µ w3 − α L π (0 , µ w2 µ w4 − α L π ( λ , µ w3 µ w4 As ∀ v ∈ I , L π v (0 ,
0) = R R +2 π v ( ds , ds ) = 1 so: M ( λ , L π ( λ ,
0) = − α µ w µ w − α µ w µ w L π ( λ ,
0) + − α µ w µ w − α µ w µ w And M (0 , λ ) L π (0 , λ ) = − α µ w µ w − α µ w µ w L π ( λ ,
0) + − α µ w µ w − α µ w µ w Putting terms in T i ( λ i ) in matrices marked M i ( λ i ) we get: M ( λ ) L π ( λ ,
0) = − α µ w1 µ w2 − α µ w3 µ w4 and M ( λ ) L π (0 , λ ) = − α µ w1 µ w3 − α µ w2 µ w4 (53)45ith: M ( λ ) = − α − α − λ β µ µ β µ µ α µ µ − α − β − λ β µ µ − α − β − λ β µ µ α µ µ − β − λ And M ( λ ) = − α − α − λ β µ µ β µ µ − α − β − λ β µ µ α µ µ − α − β − λ β µ µ α µ µ − β − λ As a triangular superior matrix with diagonal elements strictly positive, M is invertible. Moreover, M and M are invertible as diagonally dominant matrices whenever ( λ , λ ) ∈ R ∗ + × R ∗ + :First line of (50) gives: (cid:0) α + α (cid:1) µ w = βµ w + βµ w So ∀ λ ∈ R ∗ + : | M ( λ ) | − X j =2 | M j ( λ ) | = λ > ∀ λ ∈ R ∗ + : | M ( λ ) | − X j =2 | M j ( λ ) | = λ > | M ( λ ) | − X j =3 | M j ( λ ) | = α µ w + λ > | M ( λ ) | − X j =4 | M j ( λ ) | = α µ w + λ > M which shows M ( λ ) et M ( λ ) are diagonally dominant matricesso they are invertible ∀ ( λ , λ ) ∈ R ∗ + × R ∗ + . Hence, if π is an invariant measure for ( X t ) t ≥ , ∀ ( λ , λ ) ∈ R ∗ + × R ∗ + : L π (0 ,
0) =
By (53): L π ( λ ,
0) = ( M ( λ )) − − α µ w1 µ w2 − α µ w3 µ w4 and L π (0 , λ ) = ( M ( λ )) − − α µ w1 µ w3 − α µ w2 µ w4 (54)46y (52) L π ( λ , λ ) = ( M ( λ , λ )) − − α L π ( λ , µ w1 µ w2 − α L π (0 , λ ) µ w1 µ w3 − α L π (0 , λ ) µ w2 µ w4 − α L π ( λ , µ w3 µ w4 We conclude using the fact the Laplace transform of a law determines it, so π is unique.47 eferences [1] L. F. Abbott and S. B. Nelson. Synaptic plasticity: taming the beast. Nature neuroscience ,3:1178–1183, 2000.[2] D. J. Amit and S. Fusi. Learning in neural networks with material synapses.
NeuralComputation , 6(5):957–982, 1994.[3] P. A. Appleby and T. Elliott. Synaptic and temporal ensemble interpretation of spike-timing-dependent plasticity.
Neural computation , 17(11):2316–2336, 2005.[4] P. A. Appleby and T. Elliott. Stable competitive dynamics emerge from multispike interactionsin a stochastic model of spike-timing-dependent plasticity.
Neural computation , 18(10):2414–2464, 2006.[5] M. Benayoun, J. D. Cowan, W. van Drongelen, and E. Wallace. Avalanches in a StochasticModel of Spiking Neurons.
PLoS Computational Biology , 6(7):e1000846, July 2010.[6] M. K. Benna and S. Fusi. Computational principles of synaptic memory consolidation.
Nature Neuroscience , 19(12):1697–1706, Oct. 2016.[7] G.-q. Bi and M.-m. Poo. Synaptic modifications in cultured hippocampal neurons: dependenceon spike timing, synaptic strength, and postsynaptic cell type.
Journal of neuroscience ,18(24):10464–10472, 1998.[8] E. L. Bienenstock, L. N. Cooper, and P. W. Munro. Theory for the development of neuronselectivity: orientation specificity and binocular interaction in visual cortex. Technical report,DTIC Document, 1981.[9] P. C. Bressloff. Metastable states and quasicycles in a stochastic Wilson-Cowan model ofneuronal population dynamics.
Physical Review E , 82(5), Nov. 2010.[10] N. Brunel. Is cortical connectivity optimized for storing information?
Nature Neuroscience ,19(5):749–755, Apr. 2016.[11] A. N. Burkitt, H. Meffin, and D. B. Grayden. Spike-timing-dependent plasticity: therelationship to rate-based learning for models with weight dynamics determined by a stablefixed point.
Neural Computation , 16(5):885–940, 2004.[12] C. Clopath, L. Büsing, E. Vasilaki, and W. Gerstner. Connectivity reflects coding: A modelof voltage-based spike-timing-dependent-plasticity with homeostasis.
Nature , 2009.[13] M. H. Davis. Piecewise-deterministic Markov processes: A general class of non-diffusionstochastic models.
Journal of the Royal Statistical Society. Series B (Methodological) , pages353–388, 1984.[14] M. H. A. Davis.
Markov models and optimization . Monographs on statistics and appliedprobability. Chapman & Hall, London ; New York, 1st ed edition, 1993.[15] K. Fox and M. Stryker. Integrating Hebbian and homeostatic plasticity: introduction.
Philosophical Transactions of the Royal Society B: Biological Sciences , 372(1715):20160413,Mar. 2017. 4816] M. N. Galtier and G. Wainrib. A Biological Gradient Descent for Prediction Through aCombination of STDP and Homeostatic Plasticity.
Neural Computation , 25(11):2815–2832,Nov. 2013.[17] W. Gerstner and W. M. Kistler.
Spiking neuron models: single neurons, populations, plasticity .Cambridge University Press, Cambridge, U.K. ; New York, 2002.[18] M. Gilson, A. N. Burkitt, D. B. Grayden, D. A. Thomas, and J. L. van Hemmen. Emergence ofnetwork structure due to spike-timing-dependent plasticity in recurrent neuronal networks. I.Input selectivity–strengthening correlated input pathways.
Biological Cybernetics , 101(2):81–102, Aug. 2009.[19] M. Gilson, A. N. Burkitt, D. B. Grayden, D. A. Thomas, and J. L. van Hemmen. Emergenceof network structure due to spike-timing-dependent plasticity in recurrent neuronal networksV: self-organization schemes and weight dependence.
Biological Cybernetics , 103(5):365–386,Nov. 2010.[20] M. Gilson, T. Fukai, and A. N. Burkitt. Spectral Analysis of Input Spike Trains by Spike-Timing-Dependent Plasticity.
PLOS Computational Biology , 8(7):e1002584, 2012.[21] M. Graupner and N. Brunel. Calcium-based plasticity model explains sensitivity of synapticchanges to spike pattern, rate, and dendritic location.
PNAS , 109(52):21551–21552, 2012.[22] M. Hairer. Convergence of Markov processes. lecture notes , 2010.[23] D. Hebb.
The Organization of Behavior . Wiley & Sons. Wiley, New York, 1st ed edition,1949.[24] E. M. Izhikevich.
Dynamical systems in neuroscience: the geometry of excitability and bursting .Computational neuroscience. MIT Press, Cambridge, Mass, 2007. OCLC: ocm65400606.[25] E. M. Izhikevich and N. S. Desai. Relating stdp to bcm.
Neural computation , 15(7):1511–1523,2003.[26] O. Kallenberg.
Foundations of modern probability . Springer Science & Business Media, 2006.[27] H.-W. Kang and T. G. Kurtz. Separation of time-scales and model reduction for stochasticreaction networks.
The Annals of Applied Probability , 23(2):529–583, Apr. 2013.[28] S. Karlin and J. McGregor. The classification of birth and death processes.
Transactions ofthe American Mathematical Society , 86(2):366–400, 1957.[29] R. Kempter, W. Gerstner, and J. L. Van Hemmen. Hebbian learning and spiking neurons.
Physical Review E , 59(4):4498, 1999.[30] R. Kempter, W. Gerstner, and J. L. Van Hemmen. Intrinsic stabilization of output rates byspike-based Hebbian learning.
Neural computation , 13(12):2709–2741, 2001.[31] T. G. Kurtz. Averaging for martingale problems and stochastic approximation. In
AppliedStochastic Analysis , pages 186–209. Springer, 1992.[32] G. Lajoie, N. I. Krouchev, J. F. Kalaska, A. L. Fairhall, and E. E. Fetz. Correlation-basedmodel of artificially induced plasticity in motor cortex by a bidirectional brain-computerinterface.
PLOS Computational Biology , 13(2):e1005343, 2017.4933] A. Litwin-Kumar and B. Doiron. Formation and maintenance of neuronal assemblies throughsynaptic plasticity.
Nature communications , 5:5319, 2014.[34] H. Markram. A history of spike-timing-dependent plasticity.
Frontiers in Synaptic Neuro-science , 3, 2011.[35] H. Markram, W. Gerstner, and P. J. Sjöström. Spike-Timing-Dependent Plasticity: AComprehensive Overview.
Frontiers in Synaptic Neuroscience , 4, 2012.[36] H. Markram, J. Lübke, M. Frotscher, and B. Sakmann. Regulation of synaptic efficacy bycoincidence of postsynaptic aps and epsps.
Science , 275(5297):213–215, 1997.[37] M. Menshikov, S. Popov, and A. Wade.
Non-homogeneous Random Walks: LyapunovFunction Methods for Near-Critical Stochastic Systems , volume 209. Cambridge UniversityPress, 2016.[38] A. Morrison, M. Diesmann, and W. Gerstner. Phenomenological models of synaptic plasticitybased on spike timing.
Biological Cybernetics , 98(6):459–478, June 2008.[39] J. R. Norris.
Markov chains . Number 2. Cambridge university press, 1998.[40] G. K. Ocker, A. Litwin-Kumar, and B. Doiron. Self-organization of microcircuits in networksof spiking neurons with plastic synapses.
PLoS Comput Biol , 11(8):e1004458, 2015.[41] D. H. O’Connor, G. M. Wittenberg, and S. S.-H. Wang. Graded bidirectional synapticplasticity is composed of switch-like unitary events.
Proceedings of the National Academy ofSciences of the United States of America , 102(27):9679–9684, 2005.[42] E. Pechersky, G. Via, and A. Yambartsev. Stochastic Ising model with plastic interactions.
Statistics & Probability Letters , 123:100–106, Apr. 2017.[43] P. E. Protter.
Stochastic Integration and Differential Equations: Version 2.1 . Number 21 inStochastic Modelling and Applied Probability. Springer, Berlin, 2. ed. , corr. 3rd pr edition,2010. OCLC: 837782643.[44] C. Ribrault, K. Sekimoto, and A. Triller. From the stochasticity of molecular processes tothe variability of synaptic transmission.
Nature Reviews Neuroscience , 12(7):375–387, June2011.[45] G. G. Turrigiano. The dialectic of Hebb and homeostasis.
Philosophical Transactions of theRoyal Society B: Biological Sciences , 372(1715):20160258, Mar. 2017.[46] R. L. Tweedie. Invariant Measures for Markov Chains with no Irreducibility Assumptions.
Journal of Applied Probability , 25:275–285, 1988.[47] P. Yger and M. Gilson. Models of Metaplasticity: A Review of Concepts.
Frontiers inComputational Neuroscience , 9, Nov. 2015.[48] F. Zenke, W. Gerstner, and S. Ganguli. The temporal paradox of Hebbian learning andhomeostatic plasticity.
Current Opinion in Neurobiology , 43:166–176, Apr. 2017.[49] F. Zenke, G. Hennequin, and W. Gerstner. Synaptic Plasticity in Neural Networks NeedsHomeostasis with a Fast Rate Detector.